[bksvol-discuss] Re: Automatic Stripper problem

  • From: Tracy Carcione <carcione@xxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Mon, 30 Aug 2004 11:33:42 -0400

Pratik,
I thought as you do about chapter headers, but then I downloaded a book
where I had put the chapter header down a couple lines when I validated it,
and it was gone.  Others have had similar experiences.
Now I am putting a dummy page number before chapter headers.
Tracy

At 03:44 AM 8/30/04 -0400, you wrote:
>Hello All,
>
>Here is what I suspect has been happening with the header situation.
>
>First, let me assure all of you that the automatic stripper only looks at
>the first and the last line on the page.  We were assured of this fact when
>this discussion arose the last time.  As a result, nothing that does not
>appear on the first or the last line of the page will be removed.  To make
>sure that chapter headings such as beginnings of new chapters are preserved,
>they should be placed on the second line of the page.  We are further
>assured that if all headers/footers are consistent, the chapter headings
>will not be removed as they do not fall into the typical header pattern.
>But, to save myself from the whims of this type of analysis, I generally
>make it a habit to put the chapter headings on the second line.
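
As a rough illustration of the behavior described above, here is a minimal Python sketch of a stripper that inspects only the first and last line of each page and drops a line when its pattern recurs across pages. It is a toy model, not Bookshare's actual tool: the function names, the digit-collapsing rule, and the 50% threshold are all assumptions made for illustration.

```python
import re
from collections import Counter


def shape(line):
    # Hypothetical rule: collapse digits so "Chapter 1" and "Chapter 2"
    # count as the same pattern.
    return re.sub(r"\d+", "#", line.strip())


def strip_repeated_edges(pages, min_share=0.5):
    """Toy model only: drop a page's first or last line when its shape
    recurs in that position on at least min_share of the pages."""
    result = [list(page) for page in pages]
    for idx in (0, -1):  # only the first and the last line are examined
        counts = Counter(shape(page[idx]) for page in pages if page)
        for page in result:
            if page and counts[shape(page[idx])] / len(pages) >= min_share:
                del page[idx]
    return result
```

In this model, a "Chapter N" heading that sits on the first line of many pages matches a repeated pattern and is stripped, while the same heading placed below a page-number line survives because only the page number is on the first line.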
>
>In this case, I actually suspect that the validator may have removed the
>headings that were referred to.
>
>To allay Debra's concerns, you must make sure that the page number appears
>either on the first or the last line of the page.  We are further assured by
>Bookshare that if the page number is placed this way, there is no need
>for it to be preceded by or followed by a space.  The automatic tools will
>recognize it.  If you have no additional header/footer info on a particular
>line, the page number is used by Bookshare's automated conversion tools to
>assign actual page numbers in the DAISY files.
>
>Pratik
>
>Pratik Patel
>Managing Director
>CUNY Assistive Technology Services
>The City University of New York
>     ppatel@xxxxxx
> 
>-----Original Message-----
>From: bksvol-discuss-bounce@xxxxxxxxxxxxx
>[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Deborah Kent Stein
>Sent: Sunday, August 29, 2004 3:50 PM
>To: bksvol-discuss@xxxxxxxxxxxxx
>Subject: [bksvol-discuss] Re: Automatic Stripper problem
>
>
>
>To clarify,
>The instructions say that page numbers should have a space on either side.
>I was under the impression that they should have a line feed on either side.
>Yikes!  Have I been doing it wrong all these years?
>Debbie
>
>----- Original Message -----
>From: "Jesse Fahnestock" <Jesse.F@xxxxxxxxxxxx>
>To: <bksvol-discuss@xxxxxxxxxxxxx>
>Sent: Thursday, August 26, 2004 10:24 AM
>Subject: [bksvol-discuss] Re: Automatic Stripper problem
>
>
>> Hi all, as I see there is some new conversation about normalizing headers
>> and footers, I thought I would repost the guidelines for doing so. Please
>> remember that you are not required to do this. But this is how to do it
>> right if you want to take it on!
>>
>> ---Begin instructions for headers and footers--
>>
>> Volunteers can assist this tool by "normalizing" headers, footers, and
>> page numbers in submitted files where they do not appear consistent.
>> Normalizing such headers/footers helps but it needs to be a
>> complete job, as normalizing just a few headers could skew the
>> probability of properly recognizing them throughout the book. If you
>> wish to undertake this task, please be sure to:
>>
>> 1) Check line position of text (the first paragraph on a given page
>> should be the header, the last should be the footer)
>> 2) Check that page numbers have a space on either side,
>> separating them from the header/footer text. If the page number is
>> the first character in a header it does not need a space before it; or if
>> it is the last character in a footer it does not need a space after it.
>> 3) Only change text in the header or footer in order to make it look
>> like all other headers/footers
>> 4) Perform 1-3 on every page.
>>
>> Remember that the automated tool is designed to be effective on most
>> scanned books so that you should undertake this "normalization"
>> process only if you are sure that the headers and footers in the book
>> you are validating are inconsistent and if you are able to normalize all
>> of them throughout the book.
>>
>> --end instructions--
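
Rule 2 in the instructions above can be checked mechanically. The following is a minimal Python sketch, assuming that any run of digits in a header or footer line is the page number; the function name is hypothetical and this is not part of any Bookshare tool.

```python
import re


def page_number_set_off(line):
    """Return True if every run of digits in the line is separated from
    neighboring text by whitespace, or sits at the start or end of the line.
    Assumption: any digit run in a header/footer line is a page number."""
    for match in re.finditer(r"\d+", line):
        # Treat the start and end of the line as if they were spaces.
        before = line[match.start() - 1] if match.start() > 0 else " "
        after = line[match.end()] if match.end() < len(line) else " "
        if not before.isspace() or not after.isspace():
            return False
    return True
```

A header such as "12 The Old Man" passes this check, while "The Old Man12" does not. Titles that themselves contain digits would need special handling, which this sketch leaves out.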
>>
>> jesse.
>>
>> ________________________
>>
>> Jesse Fahnestock
>> Collection Development Coordinator, Bookshare.org
>> www.bookshare.org
>>
>> A Project of The Benetech Initiative - Technology Serving Humanity
>> 480 S. California Ave., Suite 201
>> Palo Alto, CA 94306-1609  USA
>> (650)475-5440 x133
>> (650) 475-1066 FAX
>> jesse@xxxxxxxxxxxx
>> www.benetech.org
>>
>> -----Original Message-----
>> From: bksvol-discuss-bounce@xxxxxxxxxxxxx
>> [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of Jake
>> Sent: 26 August 2004 15:12
>> To: bksvol-discuss@xxxxxxxxxxxxx
>> Subject: [bksvol-discuss] Re: Automatic Stripper problem
>>
>>
>> My guess is that is part of the issue. Many of the books I scan/plan to
>> scan have page numbers at the top of the page except on pages where a new
>> chapter begins; in that case the page numbers are located at the bottom of
>> the page.
>> So my guess is since the word Chapter is found first and on several pages
>> that the program thinks it is a heading and therefore throws it out the
>> window.
>> I'm not sure if the program Bookshare is using was written by them. If it
>> was, they could add code to skip the word chapter as a heading, but if not
>> then I'd seriously recommend finding a new program that does what we want,
>> not what we don't.
>>
>> Jake
>> ----- Original Message -----
>> From: "Kyrath. (AKA Rob)" <kyrath@xxxxxxx>
>> To: <bksvol-discuss@xxxxxxxxxxxxx>
>> Sent: Thursday, August 26, 2004 7:48 AM
>> Subject: [bksvol-discuss] Re: Automatic Stripper problem
>>
>>
>> > Given the aggressive nature of the stripper, what I now intend to do is
>> > put in the actual page number 2 lines above the chapter heading, assuming
>> > that page numbers are on top.  In theory, this should prevent the stripper
>> > from getting her greedy little hands on the chapter headings.  *grin*
>> > However, I wonder how the stripper treats headings in books that have
>> > page numbers at the bottom of the page?
>> > -- Rob
>> >
>> > ----- Original Message -----
>> > From: "Jake" <jabrown@xxxxxxxxx>
>> > To: <bksvol-discuss@xxxxxxxxxxxxx>
>> > Sent: Wednesday, August 25, 2004 11:02 PM
>> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
>> >
>> >
>> > > Yes, I recently discovered that the auto stripper pretty much
>> > > destroyed my first accepted submission.
>> > > I understand the reason for getting rid of the headers, but when it
>> > > gets rid of critical information like Chapter zzz or something,
>> > > sometimes it is hard to realize that you are in fact in a new chapter
>> > > (I have also noticed this with books I've downloaded).
>> > >
>> > > While going back and fixing the messed up titles would be a long and
>> > > tiresome, not to mention cumbersome, process, I believe that we need to
>> > > get this problem resolved so that all new submissions are of a better
>> > > quality.
>> > >
>> > > So, would it be a good idea for me to strip the headers in books
>> > > before I submit them now?
>> > >
>> > > Thanks,
>> > > Jake Brownell
>> > > ----- Original Message -----
>> > > From: <socly@xxxxxxxxx>
>> > > To: <bksvol-discuss@xxxxxxxxxxxxx>
>> > > Sent: Wednesday, August 25, 2004 9:23 PM
>> > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
>> > >
>> > >
>> > > > I, too, strip my headers before submitting or uploading -- but what
>> > > > you say about the chapter headings worries me. Is this a new problem?
>> > > > I've been putting the page number on the first line, then skipping a
>> > > > couple of lines before the Chapter heading, be it Chapter and a number
>> > > > or an actual title.  I hope they haven't been stripped.  And everyone
>> > > > wants page numbers (if you can't read the book at one sitting, even
>> > > > when you're reading to children, how do you know where you left off?
>> > > > Or what if they want you to go back to a particular page?).  I do hope
>> > > > what you found, Dilsia, was an aberration. Maybe Jesse can clear it up
>> > > > for us (and publish another list of books being worked on or awaiting
>> > > > approval.)
>> > > >
>> > > > Cindy
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > ----- Original Message -----
>> > > > From: Pam Quinn <quinns@xxxxxxxxxxxxx>
>> > > > Date: Wed, 25 Aug 2004 21:11:49 -0500
>> > > > To: bksvol-discuss@xxxxxxxxxxxxx
>> > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
>> > > >
>> > > > > I agree. I manually strip my own headers now before submitting,
>> > > > > and even if everybody didn't do this, I'd rather see the headers
>> > > > > left in than to lose information that the automatic stripper takes
>> > > > > out. They just don't work the way that they should. Oh boy; here we
>> > > > > go, talking about strippers again.
>> > > > >
>> > > > > Pam
>> > > > >
>> > > > >
>> > > > > On Wed, 25 Aug 2004 19:55:51 -0400, you wrote:
>> > > > >
>> > > > > >Hi List:
>> > > > > >
>> > > > > >One of my books was accepted today. I downloaded the book to find
>> > > > > >out if the chapter headings were stripped. I had skipped a couple
>> > > > > >of blank lines before each chapter number. Sure enough, all chapter
>> > > > > >headings are gone as well as other important headings. Apparently
>> > > > > >the trick of skipping a couple lines before each chapter heading is
>> > > > > >not working any more, if it ever did. Does the automatic stripper
>> > > > > >always have to be applied? Personally I always strip headers of
>> > > > > >books that I submit or validate. In another book that I validated,
>> > > > > >all the page numbers were stripped. The page numbers are important
>> > > > > >for this particular book because it's a choose your own adventure
>> > > > > >which tells you to turn to certain pages at different points in the
>> > > > > >story. I find it very annoying that even the chapter headings are
>> > > > > >stripped. I can understand the titles being stripped.  In my humble
>> > > > > >opinion, I would rather have the page numbers left in. It gives me
>> > > > > >an idea how far I am into the book. But at least the chapter
>> > > > > >headings should definitely be preserved. Any suggestions on how to
>> > > > > >preserve the chapter headings?
>> > > > > >
>> > > > > >*****
>> > > > > >Grace
>> > > > > >
>> > > > > >MSN: gcpires@xxxxxxxxxxx
>> > > > >
>> > > > >

