Hi Pratik,

In theory, this should be the case. However, I have seen several books that
I either submitted or validated that weren't treated according to these
guidelines. I wonder how often the word "chapter" plus a number would have
to appear on the first line of a page to trigger the automatic tool into
treating it as a page header. Would a book that has only 12 chapters be
less likely to get stripped than a book with 42 chapters? Or does it
perhaps depend on a percentage of the total number of pages?

-- Rob

----- Original Message -----
From: "Pratik Patel" <pratikp1@xxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Monday, August 30, 2004 3:44 AM
Subject: [bksvol-discuss] Re: Automatic Stripper problem

> Hello All,
>
> Here is what I suspect has been happening with the header situation.
>
> First, let me assure all of you that the automatic stripper only looks at
> the first and the last line on the page. We were assured of this fact when
> this discussion arose the last time. As a result, nothing that does not
> appear on the first or the last line of the page will be removed. To make
> sure that chapter headings such as beginnings of new chapters are preserved,
> they should be placed on the second line of the page. We are further
> assured that if all headers/footers are consistent, the chapter headings
> will not be removed as they do not fall into the typical header pattern.
> But, to save myself from the whims of this type of analysis, I generally
> make it a habit to put the chapter headings on the second line.
>
> In this case, I actually suspect that the validator may have removed the
> headings that were referred to.
>
> To allay Debra's concerns, you must make sure that the page number appears
> either on the first or the last line of the page. We are further assured by
> Bookshare that if the page number is placed this way, there is no need
> for it to be preceded by or followed by a space.
> The automatic tools will
> recognize it. If you have no additional header/footer info on a particular
> line, the page number is used by Bookshare's automated conversion tools to
> assign actual page numbers in the DAISY files.
>
> Pratik
>
> Pratik Patel
> Managing Director
> CUNY Assistive Technology Services
> The City University of New York
> ppatel@xxxxxx
>
> -----Original Message-----
> From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Deborah Kent Stein
> Sent: Sunday, August 29, 2004 3:50 PM
> To: bksvol-discuss@xxxxxxxxxxxxx
> Subject: [bksvol-discuss] Re: Automatic Stripper problem
>
> To clarify,
> The instructions say that page numbers should have a space on either side.
> I was under the impression that they should have a line feed on either side.
> Yikes! Have I been doing it wrong all these years?
> Debbie
>
> ----- Original Message -----
> From: "Jesse Fahnestock" <Jesse.F@xxxxxxxxxxxx>
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> Sent: Thursday, August 26, 2004 10:24 AM
> Subject: [bksvol-discuss] Re: Automatic Stripper problem
>
> > Hi all, as I see there is some new conversation about normalizing headers
> > and footers, I thought I would repost the guidelines for doing so. Please
> > remember that you are not required to do this. But this is how to do it
> > right if you want to take it on!
> >
> > ---Begin instructions for headers and footers--
> >
> > Volunteers can assist this tool by "normalizing" headers, footers, and
> > page numbers in submitted files where they do not appear consistent.
> > Normalizing such headers/footers helps, but it needs to be a
> > complete job, as normalizing just a few headers could skew the
> > probability of properly recognizing them throughout the book.
> > If you wish to undertake this task, please be sure to:
> >
> > 1) Check line position of text (the first paragraph on a given page
> > should be the header, the last should be the footer)
> > 2) Check that page numbers have a space on either side,
> > separating them from the header/footer text. If the page number is
> > the first character in a header it does not need a space before it; or if
> > it is the last character in a footer it does not need a space after it.
> > 3) Only change text in the header or footer in order to make it look
> > like all other headers/footers
> > 4) Perform 1-3 on every page.
> >
> > Remember that the automated tool is designed to be effective on most
> > scanned books, so you should undertake this "normalization"
> > process only if you are sure that the headers and footers in the book
> > you are validating are inconsistent and if you are able to normalize all
> > of them throughout the book.
> >
> > --end instructions--
> >
> > jesse.
> >
> > ________________________
> >
> > Jesse Fahnestock
> > Collection Development Coordinator, Bookshare.org
> > www.bookshare.org
> >
> > A Project of The Benetech Initiative - Technology Serving Humanity
> > 480 S. California Ave., Suite 201
> > Palo Alto, CA 94306-1609 USA
> > (650)475-5440 x133
> > (650) 475-1066 FAX
> > jesse@xxxxxxxxxxxx
> > www.benetech.org
> >
> > -----Original Message-----
> > From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> > [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of Jake
> > Sent: 26 August 2004 15:12
> > To: bksvol-discuss@xxxxxxxxxxxxx
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> > My guess is that is part of the issue. Many of the books I scan/plan to scan
> > have page numbers at the top of the page except on pages where a new chapter
> > begins; in that case the page numbers are located at the bottom of the page.
> > So my guess is that since the word Chapter is found first and on several
> > pages, the program thinks it is a heading and therefore throws it out the
> > window.
> > I'm not sure whether the program Bookshare is using was written by them.
> > If it was, they could add code to skip the word chapter as a heading, but
> > if not then I'd seriously recommend finding a new program that does what
> > we want, not what we don't.
> >
> > Jake
> > ----- Original Message -----
> > From: "Kyrath. (AKA Rob)" <kyrath@xxxxxxx>
> > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > Sent: Thursday, August 26, 2004 7:48 AM
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> > > Given the aggressive nature of the stripper, what I now intend to do is
> > > put in the actual page number 2 lines above the chapter heading, assuming
> > > that page numbers are on top. In theory, this should prevent the stripper
> > > from getting her greedy little hands on the chapter headings. *grin*
> > > However, I wonder how the stripper treats headings in books that have
> > > page numbers at the bottom of the page?
> > > -- Rob
> > >
> > > ----- Original Message -----
> > > From: "Jake" <jabrown@xxxxxxxxx>
> > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > Sent: Wednesday, August 25, 2004 11:02 PM
> > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > >
> > > > Yes, I recently discovered that the auto stripper pretty much destroyed
> > > > my first accepted submission.
> > > > I understand the reason for getting rid of the headers, but when it gets
> > > > rid of critical information like Chapter zzz or something, sometimes it
> > > > is hard to realize that you are in fact in a new chapter (I have also
> > > > noticed this with books I've downloaded).
> > > >
> > > > While going back and fixing the messed up titles would be a long,
> > > > tiresome, not to mention cumbersome process, I believe that we need to
> > > > get this problem resolved so that all new submissions are of a better
> > > > quality.
> > > >
> > > > So, would it be a good idea for me to strip the headers in books before
> > > > I submit them now?
> > > >
> > > > Thanks,
> > > > Jake Brownell
> > > > ----- Original Message -----
> > > > From: <socly@xxxxxxxxx>
> > > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > > Sent: Wednesday, August 25, 2004 9:23 PM
> > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > >
> > > > > I, too, strip my headers before submitting or uploading -- but what
> > > > > you say about the chapter headings worries me. Is this a new problem?
> > > > > I've been putting the page number on the first line, then skipping a
> > > > > couple of lines before the Chapter heading, be it Chapter and a number
> > > > > or an actual title. I hope they haven't been stripped. And everyone
> > > > > wants page numbers (if you can't read the book at one sitting, even
> > > > > when you're reading to children, how do you know where you left off?
> > > > > Or what if they want you to go back to a particular page?). I do hope
> > > > > what you found, Dilsia, was an aberration. Maybe Jesse can clear it up
> > > > > for us (and publish another list of books being worked on or awaiting
> > > > > approval).
> > > > >
> > > > > Cindy
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Pam Quinn <quinns@xxxxxxxxxxxxx>
> > > > > Date: Wed, 25 Aug 2004 21:11:49 -0500
> > > > > To: bksvol-discuss@xxxxxxxxxxxxx
> > > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > > >
> > > > > > I agree.
> > > > > > I manually strip my own headers now before submitting, and
> > > > > > even if everybody didn't do this, I'd rather see the headers left in
> > > > > > than lose information that the automatic stripper takes out. They
> > > > > > just don't work the way that they should. Oh boy; here we go,
> > > > > > talking about strippers again.
> > > > > >
> > > > > > Pam
> > > > > >
> > > > > > On Wed, 25 Aug 2004 19:55:51 -0400, you wrote:
> > > > > >
> > > > > > >Hi List:
> > > > > > >
> > > > > > >One of my books was accepted today. I downloaded the book to find
> > > > > > >out if the chapter headings were stripped. I had skipped a couple
> > > > > > >of blank lines before each chapter number. Sure enough, all chapter
> > > > > > >headings are gone, as well as other important headings. Apparently
> > > > > > >the trick of skipping a couple of lines before each chapter heading
> > > > > > >is not working any more, if it ever did. Does the automatic
> > > > > > >stripper always have to be applied? Personally I always strip
> > > > > > >headers of books that I submit or validate. In another book that I
> > > > > > >validated, all the numbers were stripped. The page numbers are
> > > > > > >important for this particular book because it's a choose your own
> > > > > > >adventure, which tells you to turn to certain pages at different
> > > > > > >points in the story. I find it very annoying that even the chapter
> > > > > > >headings are stripped. I can understand the titles being stripped.
> > > > > > >In my humble opinion, I'd rather have the page numbers be left in.
> > > > > > >It gives me an idea how far I am into the book. But at least the
> > > > > > >chapter headings should definitely be preserved. Any suggestions
> > > > > > >on how to preserve the chapter headings?
> > > > > > >
> > > > > > >*****
> > > > > > >Grace
> > > > > > >
> > > > > > >MSN: gcpires@xxxxxxxxxxx