Yes, I suppose you could put in "Strip this, sucker," but then if it didn't strip it, readers who know nothing about the scanning and validating processes might really be confused. <big grin>

I just write out the title of the book, add a blank line, copy it to my clipboard and then paste it in whenever I come across a chapter heading. It's really not very much work. Some of the more techno-savvy on this list have suggested ways to improve the stripper, but that goes over my head pretty quickly, and at any rate, BS hasn't made changes to it as of yet. I still think it is best to manually strip the junk headers and protect the page numbers. Again, it's whatever you're comfortable doing.

BTW, I have the same problem when actually reading a book instead of just scrolling through for things to correct. I've been reading scanned text since the Easy Scan days (that's pre-OpenBook for you youngsters out there), and my brain is very good at transposing scan junk into real English. It's kind of scary, really. <smile>

Donna

_____

From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Evan Reese
Sent: Saturday, May 13, 2006 2:04 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

Thanks, Donna. I also do as much cleanup as I can. I read through "the entire text of the print edition" as APH puts it, and I fix everything I notice. I am not a professional proofreader, though, so some things get by me. Also, when I am enjoying a book, I sometimes don't notice minor typos. My brain seems to sometimes make corrections in what I read that it doesn't tell me about, in a manner of speaking. <smile> But I usually get most stuff.

After reading some of the discussion this morning, I decided to download another book I had submitted.
The book contains a novel and some short stories, and I was worried that the stripper might have removed the titles of the stories, as they appear in the middle of the page, with nothing above them. Luckily, the story titles were not removed, which is confusing, because such things as "Part One" and "Part Two" were removed, and they also appear in the middle of the page. The chapter headings were also removed, but since the chapters and parts all have titles, readers will at least know that they are starting a new section of some kind. Also, the page break right above the title is a pretty good indicator of a new chapter or part.

Why it removes some headers, like part headers, which are in the middle of the page, but not the story titles, which are also in the middle of the page, is mystifying to me. Removing consistent, repetitive text at the top of each page I can understand. But removing items dozens of pages apart just because they happen to be the first thing on the page, and not even doing that consistently, is just goofy.

I thought that by removing headers, I would be making life easier for everyone - especially since scanners - mine at least - can garble them and so make them not entirely consistent throughout. But it appears that by removing them, I have negatively affected the quality of my submissions, which I had hoped would be great. At least the text is great, still. I can be thankful for that at least.

But now, I have to do more work to get around something someone thought was a "bonus"? Oh, brother! If it removes consistent text at the top of each page, and it doesn't find any, why doesn't it just go to sleep? It appears that whether or not I remove headers, I gotta add them to the beginnings of chapters and parts or whatever stuff I want to make sure doesn't get munched. I will try to remember that in the future.
Can I put in any text I want at the top of the page to preserve the chapter headers, such as "Remove this, not the chapter header, you stupid software"? <lol>

----- Original Message -----
From: Donna Smith <mailto:donnafsmith@xxxxxxxxxxx>
To: bksvol-discuss@xxxxxxxxxxxxx
Sent: Saturday, May 13, 2006 10:32 AM
Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

Hi Evan and others following this thread.

I apologize for being the one to open this particular can of worms. Every time we've had this discussion on the list, it has brought up these same issues. There is no official requirement for this from BookShare. The goal is still to get as many books as possible in the collection, following the absolute guidelines regarding copyright laws as amended, and the concern of quality is next in line, with the understanding that it is possible to rate books as fair, good or excellent.

However, many of us who do the work have our own standards of quality for our submissions and so we've come up with tips for how to get the best quality possible. These standards, and they vary from volunteer to volunteer, are not required by anyone. It's just a self-imposed standard.

Personally, when I submit a book, I try to do as much clean-up as possible so that the validator has an easy job. Actually, I try to submit the book in a form I'd not mind reading. I knew from previous discussions on this list that the BS software has little quirks that we can accommodate by creating a particular format, though I had forgotten what that was specifically. I choose to follow this pattern of blank line, page number, blank line, text; or false header, blank line, chapter heading, blank line, text, because it will produce a better end result for all formats, but there is no requirement to do so. Volunteers can do as much as they are comfortable with doing both as submitters and validators, and the addition to the collection is appreciated by all.
Hope this helps to explain it all a bit. Keep on scanning and keep up the good work!

Peace and Hope,
Donna

_____

From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Evan Reese
Sent: Saturday, May 13, 2006 11:17 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

Ok, now I'm more confused than ever. If I strip the headings manually, then why does the stripper need something to strip? If I don't strip them, then I need to make them consistent so the stripper can strip them correctly? But if I do strip them, then I need to put a fake one at the beginning of each chapter so the chapter heading doesn't get stripped? Do I have this all correct?

Wouldn't it be a lot simpler just to not strip anything? Aren't we supposed - ideally anyhow - to be replicating the original book, in which there is in fact a heading at the top of each page, except on those which begin a chapter? This makes a lot of extra work and complexity and a reduction of quality for those who can't keep straight all the persnickety demands of the Bookshare software, or who don't have the time to go through and fix everything just right so that other readers can have a decent reading experience. I would say, just drop the whole stripping thing and make life simpler for everyone and then we won't have these quality problems.

----- Original Message -----
From: Gerald Hovas <mailto:GeraldHovas@xxxxxxxxxxx>
To: bksvol-discuss@xxxxxxxxxxxxx
Sent: Friday, May 12, 2006 4:18 PM
Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

Donna,

1. Yes, put a blank line before and after the page number when the page number is in the header as well as a blank line at the bottom of the page. I'm not sure if the blank line is necessary between the page number and the text, but I can say that putting one after the header does work well.
If you leave the blank lines at the top and bottom of the pages off, then some odd things will happen to the page numbers when the HTML file is created. The page numbers end up as part of a paragraph instead of on a separate line.

Example of how the page should look...

[Page Break]

95

. . . text . . .

[Page Break]

2. Yes, put a blank line between the chapter heading and the first line of text. Again, I don't know that this is necessary, but it does work well, and some people have said that it is necessary.

3. Yes, you need to give the Stripper something to strip other than the chapter heading or the chapter heading will be stripped. The easiest thing to do when page numbers are normally at the top of pages except on pages where a chapter begins is to move the page number from the bottom of the page to the top. If page numbers are always at the bottom of the page, then place a false header above the chapter heading. Using the title works well, and it won't look as odd if the Stripper fails to strip it for some odd reason.

It helps to strip all of the real headers manually since it's easier to strip them yourself than to ensure that they are consistent so that the Stripper will remove them all, but if you do leave them in, then just follow the pattern of the headers and put whatever header would normally appear on the top of that page above the chapter heading.

Examples of how pages should look...

[Page Break]

95

Chapter Seven

. . . text . . .

[Page Break]

Or, with the title as a false header and the page number at the bottom...

[Page Break]

The Firm

Chapter Seven

. . . text . . .

95

[Page Break]

Or if the page would normally contain the author's name instead of the title...

[Page Break]

John Grisham

Chapter Seven

. . . text . . .

[Page Break]

I can't say that these are the only ways to do it, but I've had excellent results with these approaches.
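For anyone who prefers to see the pattern spelled out mechanically, here is a minimal sketch in Python of the layouts described above. It is only an illustration: the function name `layout_page` and its parameters are made up for this example, the `[Page Break]` marker is assumed to be literal text, and nothing here reflects how the Bookshare stripper itself works.

```python
PAGE_BREAK = "[Page Break]"  # marker as written in the examples; assumed to be literal text


def layout_page(text, page_number=None, number_on_top=True,
                chapter_heading=None, false_header=None):
    """Lay out one page with a blank line between every element.

    Hypothetical helper: the first element on the page is either the page
    number (when it sits at the top) or a false header such as the book
    title, so that a chapter heading is never the first thing on the page.
    """
    parts = []
    if page_number is not None and number_on_top:
        parts.append(str(page_number))      # page number is the top element
    elif false_header:
        parts.append(false_header)          # sacrificial header for the stripper
    if chapter_heading:
        parts.append(chapter_heading)
    parts.append(text)
    if page_number is not None and not number_on_top:
        parts.append(str(page_number))      # page number stays at the bottom

    lines = [PAGE_BREAK, ""]                # blank line at the top of the page
    for part in parts:
        lines.extend([part, ""])            # blank line after each element
    lines.append(PAGE_BREAK)
    return "\n".join(lines)


if __name__ == "__main__":
    # Chapter page where page numbers live at the bottom: title as false header.
    print(layout_page(". . . text . . .", page_number=95, number_on_top=False,
                      chapter_heading="Chapter Seven", false_header="The Firm"))
```

When the page number is at the top, the sketch ignores any false header, on the assumption that the page number already gives the top of the page something other than the chapter heading.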
HTH

Gerald

_____

From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Donna Smith
Sent: Friday, May 12, 2006 5:41 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Protecting page numbers and chapter headings

Hi all.

I hesitate to ask this question, but I can't find the answer on any of the official or unofficial sites giving tips to scanners. Did we ever determine absolutely what should be done to protect page numbers and chapter headings? What I want to know is:

1. At the top of each page, do I need to put a blank line before and/or after the page number?

2. For chapter numbers/titles that appear at the top of the page, do I put a blank line before and/or after it?

3. Is it necessary to have something at the top of the page for the BookShare stripper to strip?

I spend a good bit of time cleaning up each scan regarding page numbers, chapter headings and stripping out unwanted headers. I'm trying to do it in such a way that it will result in the best book in the finished product. I really, really hope that an answer has been found to this question and that I'm not opening up the can of worms we've had before about the different perceptions of what might work. <smile>

Thanks.

Peace and Hope,
Donna