[bksvol-discuss] Re: Empire Falls, and a validating practice question

  • From: "Gerald Hovas" <GeraldHovas@xxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Fri, 10 Mar 2006 12:20:17 -0600

Jill,

Yes, it does.  The Stripper is looking for the page number to be in either
the header or the footer.  By definition, a header is the first line on the
page, so putting text or asterisks (or other special symbols or combination
of special symbols) above the page number will cause the page number to be
left in the text rather than processed as a page number.  While some people
may think this is better because it causes the page numbers to show up in
the text when reading books using K-1000, it could affect navigation by
page in the DAISY book and the page numbering in the BRF and HTML
files.  If the book happens to number the first page of the front matter
as 1 rather than I, then not processing the page numbers properly shouldn't
have any effect, as long as the book jacket isn't placed before the book,
since the Stripper should default to numbering the pages by their position
in the file (i.e., the first page in the file would be numbered as page 1,
etc.).  However, many books begin numbering pages
with the Prologue or Chapter 1 (i.e. the first page of the Prologue or
Chapter 1 will be numbered as page 1), so preventing the Stripper from doing
its job by placing asterisks above the page number will cause two sets of
page numbers in the BRF and HTML files, and the page numbers added by
Bookshare will be off by the number of pages in the front of the book.
Jumping to a given page in the DAISY books will be off by the same
number of pages as well.
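
To make the off-by-N effect concrete, here is a rough Python sketch.  It
is NOT the Stripper's actual code, which I haven't seen.  It only assumes
what is described above: a page number counts only if it is the first or
last non-blank line on the page, and any page without one is numbered by
its position in the file.  The names find_page_number and number_pages
are made up for the illustration.

```python
import re

# Assumption: a page number is a line holding nothing but digits.
PAGE_NUMBER = re.compile(r"^\s*(\d+)\s*$")


def find_page_number(page_lines):
    """Return the printed page number if it is the first or last
    non-blank line of the page (header or footer), else None."""
    non_blank = [line for line in page_lines if line.strip()]
    if not non_blank:
        return None
    for candidate in (non_blank[0], non_blank[-1]):
        match = PAGE_NUMBER.match(candidate)
        if match:
            return int(match.group(1))
    return None


def number_pages(pages):
    """Use the printed number when one is found; otherwise fall back
    to the page's 1-based position in the file."""
    numbers = []
    for position, page_lines in enumerate(pages, start=1):
        printed = find_page_number(page_lines)
        numbers.append(printed if printed is not None else position)
    return numbers


# Two unnumbered front-matter pages, then Chapter 1 printed as page 1.
clean = [["Title page"], ["Copyright"], ["1", "Chapter 1 text"]]

# Same book, but with asterisks typed above the page number.
starred = [["Title page"], ["Copyright"], ["* * *", "1", "Chapter 1 text"]]

print(number_pages(clean))    # the printed 1 is recognized on page three
print(number_pages(starred))  # page three falls back to its position, 3
```

In the second case the asterisks become the first line, so the "1" is no
longer in the header and stays in the text.  The page gets numbered 3 by
position, which is off by the two front-matter pages.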

Placing a line above Chapter Headings is a different issue, however.  I've
heard that Bookshare originally made a half-hearted attempt to have the
Stripper recognize Chapter Headings, so it's possible that the missing
Chapter Headings are due to their being stripped for the purpose of tagging
them and that an Engineer didn't realize that they also need to be left in
the text, but I rather think it's a bug with the algorithm for recognizing
headers.  In either case, I recommend either moving the page number from the
bottom of the page to the top in books where the page numbers reside in the
headers except on pages with Chapter Headings, or placing the title of the
book above the Chapter Headings in books where the page numbers reside
totally in the footers.  Jake has said that there may be a need to follow
the pattern of the book in cases where the book alternates title and author
name (or some other text in the header like chapter name), but I haven't
seen any indication that the Stripper is that smart.  Unless you have some
evidence to the contrary, though, you may want to follow the pattern of the
book when adding a header to the top of the pages with Chapter Headings.
The reason for adding a header to the top of the page rather than asterisks
(or some other special symbol or combination of special symbols) is that the
Stripper will strip the header and leave the Chapter heading while it will
leave the asterisks in the text of the book, so using a header is cleaner,
especially when the author is using special symbols like asterisks to
separate sections in the book.

As I mentioned in my earlier message, one of the current job notices on
Benetech's website includes a task for upgrading the Stripper.  The job
notice only mentions upgrading it to support Chapter Headings, but I would
expect the task to include support for NIMAS, and not just support for
Chapter Headings.  I don't know exactly how that differs, but it was my
impression that NIMAS also includes support for recognizing Section Headings
at a minimum since it is a standard for K-12 textbooks.  Again, in any case,
now that Bookshare has some additional funding, they are looking for someone
to do some Engineering work which includes work on the Stripper.  I'll try
to see that the work includes fixing the problems that the volunteering
community has found in the Stripper, and hopefully Jennifer can insist on
better testing before the next revision of the Stripper is released.  Don't
expect the new revision to be released anytime soon because the job notice
wasn't there when they initially listed Jennifer's position, and the task
will require quite a bit of work even when the person is hired.  Also, it
was obvious from the job notice that the person will have additional tasks
as well, but I don't remember what they were off the top of my head.

HTH

Gerald

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Jill O'Connell
Sent: Friday, March 10, 2006 11:24 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: Empire Falls, and a validating practice
question

Gerald, Do I assume that this means no stars etc. on the line above the 
number or chapter either? I hand delete text headers and footers leaving 
just the number. 

 To unsubscribe from this list send a blank Email to
bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line.  To get a list of
available commands, put the word 'help' by itself in the subject line.

