[bksvol-discuss] Re: Automatic Stripper problem

  • From: "Pratik Patel" <pratikp1@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Mon, 30 Aug 2004 10:54:35 -0400

Jake,

The reason it's not listed in the guidelines is that the information
comes from me.  It's a technique I personally use to ensure that an
overzealous automated stripper does not remove the info that needs to
be there.
 
Pratik

Pratik Patel
Managing Director
CUNY Assistive Technology Services
The City University of New York
     ppatel@xxxxxx
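
A minimal sketch of the idea behind this technique, assuming (as the thread below states) that the automatic stripper inspects only the first and the last line of each page. The page representation (a list of text lines), the heading pattern, and the function names are hypothetical illustrations, not Bookshare's actual tool. It flags pages where a chapter heading sits on a line the stripper would look at.

```python
import re

# Hypothetical pattern for a chapter heading such as "Chapter 3" or "CHAPTER IV".
CHAPTER_RE = re.compile(r"^\s*chapter\b", re.IGNORECASE)


def inspected_lines(page_lines):
    """Return the lines the stripper is assumed to inspect: first and last."""
    if not page_lines:
        return []
    if len(page_lines) == 1:
        return [page_lines[0]]
    return [page_lines[0], page_lines[-1]]


def heading_at_risk(page_lines):
    """True if a chapter heading is on the first or last line of the page."""
    return any(CHAPTER_RE.match(line) for line in inspected_lines(page_lines))


if __name__ == "__main__":
    risky = ["Chapter 1", "It was a dark night.", "12"]
    safe = ["12", "Chapter 1", "It was a dark night."]
    print(heading_at_risk(risky), heading_at_risk(safe))
```

A heading on the second line, as in the `safe` page, falls outside the inspected lines under this assumption.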
 
-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Jake
Sent: Monday, August 30, 2004 10:34 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: Automatic Stripper problem

Hi there,
    My main concern was the stripping of chapter information that appeared
at the top of pages in a book I submitted. I know the validator didn't
remove them, because after it sat in the validation pool for quite a bit I
went ahead and gave it another look over and validated it.
    So if it is only necessary for the chapter headings to be placed on the
second line to avoid issues, why isn't that listed in the header/footer
guidelines?

Jake
----- Original Message ----- 
From: "Pratik Patel" <pratikp1@xxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Monday, August 30, 2004 2:44 AM
Subject: [bksvol-discuss] Re: Automatic Stripper problem


> Hello All,
>
> Here is what I suspect has been happening with the header situation.
>
> First, let me assure all of you that the automatic stripper only looks at
> the first and the last line on the page.  We were assured of this fact when
> this discussion arose the last time.  As a result, nothing that does not
> appear on the first or the last line of the page will be removed.  To make
> sure that chapter headings such as beginnings of new chapters are preserved,
> they should be placed on the second line of the page.  We are further
> assured that if all headers/footers are consistent, the chapter headings
> will not be removed as they do not fall into the typical header pattern.
> But, to save myself from the whims of this type of analysis, I generally
> make it a habit to put the chapter headings on the second line.
>
> In this case, I actually suspect that the validator may have removed the
> headings that were referred to.
>
> To allay Debra's concerns, you must make sure that the page number appears
> either on the first or the last line of the page.  We are further assured by
> Bookshare that if the page number is placed this way, there is no need
> for it to be preceded by or followed by a space.  The automatic tools will
> recognize it.  If you have no additional header/footer info on a particular
> line, the page number is used by Bookshare's automated conversion tools to
> assign actual page numbers in the DAISY files.
>
> Pratik
>
> Pratik Patel
> Managing Director
> CUNY Assistive Technology Services
> The City University of New York
>      ppatel@xxxxxx
>
> -----Original Message-----
> From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Deborah Kent Stein
> Sent: Sunday, August 29, 2004 3:50 PM
> To: bksvol-discuss@xxxxxxxxxxxxx
> Subject: [bksvol-discuss] Re: Automatic Stripper problem
>
>
>
> To clarify,
> The instructions say that page numbers should have a space on either side.
> I was under the impression that they should have a line feed on either side.
> Yikes!  Have I been doing it wrong all these years?
> Debbie
>
> ----- Original Message -----
> From: "Jesse Fahnestock" <Jesse.F@xxxxxxxxxxxx>
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> Sent: Thursday, August 26, 2004 10:24 AM
> Subject: [bksvol-discuss] Re: Automatic Stripper problem
>
>
> > Hi all, as I see there is some new conversation about normalizing headers
> > and footers, I thought I would repost the guidelines for doing so. Please
> > remember that you are not required to do this. But this is how to do it
> > right if you want to take it on!
> >
> > ---Begin instructions for headers and footers--
> >
> > Volunteers can assist this tool by "normalizing" headers, footers, and
> > page numbers in submitted files where they do not appear consistent.
> > Normalizing headers/footers helps, but it needs to be a
> > complete job, as normalizing just a few headers could skew the
> > probability of properly recognizing them throughout the book. If you
> > wish to undertake this task, please be sure to:
> >
> > 1) Check line position of text (the first paragraph on a given page
> > should be the header, the last should be the footer)
> > 2) Check that page numbers have a space on either side,
> > separating them from the header/footer text. If the page number is
> > the first character in a header it does not need a space before it; or if
> > it is the last character in a footer it does not need a space after it.
> > 3) Only change text in the header or footer in order to make it look
> > like all other headers/footers
> > 4) Perform 1-3 on every page.
> >
> > Remember that the automated tool is designed to be effective on most
> > scanned books so that you should undertake this "normalization"
> > process only if you are sure that the headers and footers in the book
> > you are validating are inconsistent and if you are able to normalize all
> > of them throughout the book.
> >
> > --end instructions--
> >
> > jesse.
> >
> > ________________________
> >
> > Jesse Fahnestock
> > Collection Development Coordinator, Bookshare.org
> > www.bookshare.org
> >
> > A Project of The Benetech Initiative - Technology Serving Humanity
> > 480 S. California Ave., Suite 201
> > Palo Alto, CA 94306-1609  USA
> > (650)475-5440 x133
> > (650) 475-1066 FAX
> > jesse@xxxxxxxxxxxx
> > www.benetech.org
> >
> > -----Original Message-----
> > From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> > [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of Jake
> > Sent: 26 August 2004 15:12
> > To: bksvol-discuss@xxxxxxxxxxxxx
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> >
> > My guess is that is part of the issue. Many of the books I scan/plan to
> > scan have page numbers at the top of the page except on pages where a new
> > chapter begins; in that case the page numbers are located at the bottom of
> > the page.
> > So my guess is that since the word Chapter is found first and on several
> > pages, the program thinks it is a heading and therefore throws it out the
> > window.
> > I'm not sure whether the program Bookshare is using was written by them. If
> > it was, they could add code to skip the word Chapter as a heading; if not,
> > I'd seriously recommend finding a new program that does what we want, not
> > what we don't.
> >
> > Jake
> > ----- Original Message -----
> > From: "Kyrath. (AKA Rob)" <kyrath@xxxxxxx>
> > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > Sent: Thursday, August 26, 2004 7:48 AM
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> >
> > > Given the aggressive nature of the stripper, what I now intend to do is
> > > put in the actual page number 2 lines above the chapter heading, assuming
> > > that page numbers are on top.  In theory, this should prevent the stripper
> > > from getting her greedy little hands on the chapter headings.  *grin*
> > > However, I wonder how the stripper treats headings in books that have
> > > page numbers at the bottom of the page?
> > > -- Rob
> > >
> > > ----- Original Message -----
> > > From: "Jake" <jabrown@xxxxxxxxx>
> > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > Sent: Wednesday, August 25, 2004 11:02 PM
> > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > >
> > >
> > > > Yes, I recently discovered that the auto stripper pretty much destroyed
> > > > my first accepted submission.
> > > > I understand the reason for getting rid of the headers, but when it gets
> > > > rid of critical information like Chapter zzz or something, sometimes it
> > > > is hard to realize that you are in fact in a new chapter (I have also
> > > > noticed this with books I've downloaded).
> > > >
> > > > While going back and fixing the messed up titles would be a long and
> > > > tiresome, not to mention cumbersome, process, I believe that we need to
> > > > get this problem resolved so that all new submissions are of a better
> > > > quality.
> > > >
> > > > So, would it be a good idea for me to strip the headers in books before
> > > > I submit them now?
> > > >
> > > > Thanks,
> > > > Jake Brownell
> > > > ----- Original Message -----
> > > > From: <socly@xxxxxxxxx>
> > > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > > Sent: Wednesday, August 25, 2004 9:23 PM
> > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > >
> > > >
> > > > > I, too, strip my headers before submitting or uploading -- but what
> > > > > you say about the chapter headings worries me. Is this a new problem?
> > > > > I've been putting the page number on the first line, then skipping a
> > > > > couple of lines before the Chapter heading, be it Chapter and a number
> > > > > or an actual title.  I hope they haven't been stripped.  And everyone
> > > > > wants page numbers (if you can't read the book at one sitting, even
> > > > > when you're reading to children, how do you know where you left off?
> > > > > Or what if they want you to go back to a particular page?).  I do hope
> > > > > what you found, Dilsia, was an aberration. Maybe Jesse can clear it up
> > > > > for us (and publish another list of books being worked on or awaiting
> > > > > approval).
> > > > >
> > > > > Cindy
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Pam Quinn <quinns@xxxxxxxxxxxxx>
> > > > > Date: Wed, 25 Aug 2004 21:11:49 -0500
> > > > > To: bksvol-discuss@xxxxxxxxxxxxx
> > > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > > >
> > > > > > I agree. I manually strip my own headers now before submitting, and
> > > > > > even if everybody didn't do this, I'd rather see the headers left in
> > > > > > than lose information that the automatic stripper takes out. They
> > > > > > just don't work the way that they should. Oh boy; here we go,
> > > > > > talking about strippers again.
> > > > > >
> > > > > > Pam
> > > > > >
> > > > > >
> > > > > > On Wed, 25 Aug 2004 19:55:51 -0400, you wrote:
> > > > > >
> > > > > > >Hi List:
> > > > > > >
> > > > > > > >One of my books was accepted today. I downloaded the book to find
> > > > > > > >out if the chapter headings were stripped. I had skipped a couple of
> > > > > > > >blank lines before each chapter number. Sure enough, all chapter
> > > > > > > >headings are gone as well as other important headings. Apparently the
> > > > > > > >trick of skipping a couple lines before each chapter heading is not
> > > > > > > >working any more, if it ever did. Does the automatic stripper always
> > > > > > > >have to be applied? Personally I always strip headers of books that I
> > > > > > > >submit or validate. In another book that I validated, all the page
> > > > > > > >numbers were stripped. The page numbers are important for this
> > > > > > > >particular book because it's a choose your own adventure which tells
> > > > > > > >you to turn to certain pages at different points in the story. I find
> > > > > > > >it very annoying that even the chapter headings are stripped. I can
> > > > > > > >understand the titles being stripped.  In my humble opinion, I'd
> > > > > > > >rather have the page numbers be left in. It gives me an idea how far
> > > > > > > >I am into the book. But at least the chapter headings should
> > > > > > > >definitely be preserved. Any suggestions on how to preserve the
> > > > > > > >chapter headings?
> > > > > > >
> > > > > > >*****
> > > > > > >Grace
> > > > > > >
> > > > > > >MSN: gcpires@xxxxxxxxxxx
> > > > > >
> > > > > >
> > > > > --
> > > > > _______________________________________________
> > > > > Find what you are looking for with the Lycos Yellow Pages
> > > > >
> > > >
> > >
> >
>
http://r.lycos.com/r/yp_emailfooter/http://yellowpages.lycos.com/default.asp
> ?SRC=lycos10
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
> >
>
>
>
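
The normalization guidelines quoted in this thread can be sketched as a simple check. This is only an illustration under stated assumptions: a book is a list of pages, each page is a list of text lines, and a page number counts as recognizable when it is a run of digits separated from other text by whitespace or by the start or end of the line (guideline 2). The names `standalone_numbers` and `pages_needing_attention` are hypothetical, and the thread itself notes that Bookshare's tools may be more lenient about spacing than this.

```python
import re

# A run of digits bounded by whitespace or by the start/end of the line.
STANDALONE_NUMBER = re.compile(r"(?<!\S)\d+(?!\S)")


def standalone_numbers(line):
    """Return every whitespace-delimited number found in the line."""
    return [int(m.group()) for m in STANDALONE_NUMBER.finditer(line)]


def pages_needing_attention(pages):
    """Return indexes of pages with no standalone number on the first or last line.

    Empty pages are flagged too, since they carry no page number at all.
    """
    flagged = []
    for index, page_lines in enumerate(pages):
        edges = [page_lines[0], page_lines[-1]] if page_lines else []
        if not any(standalone_numbers(line) for line in edges):
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    book = [
        ["12 The Great Book", "Some text."],
        ["The Great Book13", "More text."],
    ]
    print(pages_needing_attention(book))
```

Flagged pages are candidates for manual review only; per the guidelines, normalization is worthwhile only if it can be done on every page of the book.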


