[bksvol-discuss] Re: Protecting page numbers and chapter headings

That's just my point.  If it worked reliably, consistently, that would be one 
thing.  But from what I've seen, it strips some things, but not others which it 
should if it were being consistent.  So it appears, no matter which way you 
slice it, readers get junk at the top of their pages, or they get missing 
stuff, or both in the same book on different pages perhaps.  I say just dump 
the stripper.  Not the page numbering portion, of course, but the part that 
attempts to strip page headers.  Print books usually have headings on each 
page, and if we are supposed to try to replicate the original book - don't I 
hear that a lot on this list? - then why are they being stripped in the first 
place?  Besides, then if people leave them in, readers are getting what was in 
the book to begin with.  If submitters or validators remove them, then the 
chapter and other kinds of section headings will be in no danger of being 
removed by poorly written software.  Poorly written I say, because it shouldn't 
be removing text that appears only dozens of pages apart in a book for any 
reason.  But it does, so we have to cater to it so readers have a better 
reading experience.  Instead of having the software work for us, we have to 
work to satisfy it.  True, it is not an especially onerous task, as you 
describe the way you do it, but it shouldn't have to be done at all to get 
chapter headings to remain undeleted.  The emphasis should be on making sure 
the copyright and other publication information is correct, and then to make 
sure the book is complete and other validation stuff, then text quality for 
those who choose to do that.  We shouldn't have software working against us in 
at least the sense that the quality of the book is diminished if people don't 
take steps to mitigate it.


  ----- Original Message ----- 
  From: Donna Smith 
  To: bksvol-discuss@xxxxxxxxxxxxx 
  Sent: Saturday, May 13, 2006 11:23 AM
  Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings


  Yes, I suppose you could put in "Strip this, sucker," but then if it didn't 
strip it, readers who know nothing about the scanning and validating processes 
might really be confused.  <big grin>  I just write out the title of the book, 
add a blank line, copy it to my clipboard and then paste it in whenever I come 
across a chapter heading.  It's really not very much work.  Some of the more 
techno-savvy on this list have suggested ways to improve the stripper, but that 
goes over my head pretty quickly, and at any rate, BS hasn't made changes to it 
as of yet.  I still think it is best to manually strip the junk headers and 
protect the page numbers.  Again, it's whatever you're comfortable doing.

   

  BTW, I have the same problem when actually reading a book instead of just 
scrolling through for things to correct.  I've been reading scanned text since 
the Easy Scan days, (that's pre-OpenBook for you youngsters out there), and my 
brain is very good at transposing scan junk into real English.  It's kind of 
scary, really.  <smile>

   

  Donna

   


------------------------------------------------------------------------------

  From: bksvol-discuss-bounce@xxxxxxxxxxxxx 
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Evan Reese
  Sent: Saturday, May 13, 2006 2:04 PM
  To: bksvol-discuss@xxxxxxxxxxxxx
  Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

   

  Thanks, Donna.  I also do as much cleanup as I can.  I read through "the 
entire text of the print edition" as APH puts it, and I fix everything I 
notice.  I am not a professional proofreader, though, so some things get by me. 
 Also, when I am enjoying a book, I sometimes don't notice minor typos.  My 
brain seems to sometimes make corrections in what I read that it doesn't tell 
me about, in a manner of speaking. <smile>  But I usually get most stuff.

   

  After reading some of the discussion this morning, I decided to download 
another book I had submitted.  The book contains a novel and some short 
stories, and I was worried that the stripper might have removed the titles of 
the stories, as they appear in the middle of the page, with nothing above them. 
 Luckily, the story titles were not removed, which is confusing, because such 
things as "Part One" and "Part Two" were removed, and they also appear in the 
middle of the page.  The chapter headings were also removed, but since the 
chapter and parts all have titles, readers will at least know that they are 
starting a new section of some kind.  Also, the page break right above the 
title is a pretty good indicator of a new chapter or part.  Why it removes some 
headers like part headers which are in the middle of the page, but not the 
story titles which are also in the middle of the page is mystifying to me.  
Removing consistent, repetitive text at the top of each page I can understand.  
But removing items dozens of pages apart just because they happen to be the 
first thing on the page, and not even doing that consistently is just goofy.  I 
thought that by removing headers, I would be making life easier for everyone - 
especially since scanners - mine at least - can garble them and so make them 
not entirely consistent throughout.  But it appears that by removing them, I 
have negatively affected the quality of my submissions, which I had hoped would 
be great.  At least the text is great, still.  I can be thankful for that at 
least.  But now, I have to do more work to get around something someone thought 
was a "bonus"?  Oh, brother!  If it removes consistent text at the top of each 
page, and it doesn't find any, why doesn't it just go to sleep?  It appears 
that whether or not I remove headers, I gotta add them to the beginnings of 
chapters and parts or whatever stuff I want to make sure doesn't get munched.  
I will try to remember that in the future.  Can I put in any text I want at the 
top of the page to preserve the chapter headers, such as "Remove this, not the 
chapter header, you stupid software"?  <lol>
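
The behaviour asked for above (remove only text that really repeats at the top of many pages, and otherwise do nothing) can be sketched in a few lines. This is only an illustration, not how the Bookshare stripper works; its actual rules are not described anywhere in this thread. The function name `strip_running_headers`, the `min_share` parameter, and the representation of a book as a list of pages (each a list of lines) are all assumptions made for the example.

```python
from collections import Counter


def strip_running_headers(pages, min_share=0.3):
    """Remove a page's first text line only if that same text opens many pages.

    pages: list of pages, each page a list of text lines (hypothetical format).
    min_share: assumed fraction of pages a line must open to count as a header.
    """
    def first_text_index(page):
        # Index of the first non-blank line, or None if the page is empty.
        for i, line in enumerate(page):
            if line.strip():
                return i
        return None

    # Count how often each text appears as the first non-blank line of a page.
    counts = Counter()
    for page in pages:
        i = first_text_index(page)
        if i is not None:
            counts[page[i].strip()] += 1

    # A header must repeat at least twice and open at least min_share of pages.
    threshold = max(2, min_share * len(pages))
    repeated = {text for text, n in counts.items() if n >= threshold}

    result = []
    for page in pages:
        i = first_text_index(page)
        if i is not None and page[i].strip() in repeated:
            page = page[:i] + page[i + 1:]  # drop only the repeated header line
        result.append(list(page))
    return result
```

With a rule like this, a "Part One" or chapter heading that opens a single page is left alone, and a book whose headers were already removed by hand comes back unchanged.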

   

    ----- Original Message ----- 

    From: Donna Smith 

    To: bksvol-discuss@xxxxxxxxxxxxx 

    Sent: Saturday, May 13, 2006 10:32 AM

    Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

     

    Hi Evan and others following this thread.

     

    I apologize for being the one to open this particular can of worms.  Every 
time we've had this discussion on the list, it has brought up these same 
issues.  There is no official requirement for this from BookShare.  The goal is 
still to get as many books as possible in the collection following the absolute 
guidelines regarding copyright laws as amended, and the concern of quality is 
next in line with the understanding that it is possible to rate books as fair, 
good or excellent.  

     

    However, many of us who do the work have our own standards of quality for 
our submissions and so we've come up with tips for how to get the best quality 
possible.  These standards, and they vary from volunteer to volunteer, are not 
required by anyone.  It's just a self-imposed standard.  

     

    Personally, when I submit a book, I try to do as much clean-up as possible 
so that the validator has an easy job.  Actually, I try to submit the book in a 
form I'd not mind reading.  I knew from previous discussions on this list that 
the BS software has little quirks that we can accommodate by creating a 
particular format, though I had forgotten what that was specifically.  I choose 
to follow this pattern of blank line, page number, blank line, text, or false 
header, blank line, chapter heading, blank line, text, because it will produce a 
better end result for all formats, but there is no requirement to do so.  
Volunteers can do as much as they are comfortable with doing both as submitters 
and validators, and the addition to the collection is appreciated by all.  
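
For anyone who wants to check a scan against the pattern described above before submitting, here is a minimal sketch. It assumes each page has already been split into a list of lines with the page-break marker removed; the function name `check_page_top` and the wording of the messages are made up for the example and are not part of any Bookshare tool.

```python
def check_page_top(page_lines):
    """Return a list of problems found in the first three lines of a page.

    Expected pattern: blank line, page number or (false) header, blank line.
    """
    def is_blank(line):
        return line.strip() == ""

    if len(page_lines) < 3:
        return ["page has fewer than three lines"]

    problems = []
    if not is_blank(page_lines[0]):
        problems.append("no blank line at the top of the page")
    if is_blank(page_lines[1]):
        problems.append("no page number or header on the second line")
    if not is_blank(page_lines[2]):
        problems.append("no blank line after the page number or header")
    return problems
```

An empty list means the top of the page follows the pattern; it says nothing about whether the stripper will actually leave the chapter heading alone.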

     

    Hope this helps to explain it all a bit.  Keep on scanning and keep up the 
good work!

     

    Peace and Hope,

     

    Donna

     


----------------------------------------------------------------------------

    From: bksvol-discuss-bounce@xxxxxxxxxxxxx 
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Evan Reese
    Sent: Saturday, May 13, 2006 11:17 AM
    To: bksvol-discuss@xxxxxxxxxxxxx
    Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

     

    Ok, now I'm more confused than ever.  If I strip the headings manually, 
then why does the stripper need something to strip?  If I don't strip them, 
then I need to make them consistent so the stripper can strip them correctly?  
But if I do strip them, then I need to put a fake one at the beginning of each 
chapter so the chapter heading doesn't get stripped?  Do I have this all 
correct?  Wouldn't it be a lot simpler just to not strip anything?  Aren't we 
supposed - ideally anyhow - to be replicating the original book, in which there 
is in fact a heading at the top of each page, except on those which begin a 
chapter?  This makes a lot of extra work and complexity and a reduction of 
quality for those who can't keep straight all the persnickety demands of the 
Bookshare software, or who don't have the time to go through and fix everything 
just right so that other readers can have a decent reading experience.  I would 
say, just drop the whole stripping thing and make life simpler for everyone and 
then we won't have these quality problems.

     

      ----- Original Message ----- 

      From: Gerald Hovas 

      To: bksvol-discuss@xxxxxxxxxxxxx 

      Sent: Friday, May 12, 2006 4:18 PM

      Subject: [bksvol-discuss] Re: Protecting page numbers and chapter headings

       

      Donna,

       

      1. Yes, put a blank line before and after the page number when the 
page number is in the header as well as a blank line at the bottom of the page. 
 I'm not sure if the blank line is necessary between the page number and the 
text, but I can say that putting one after the header does work well.  If you 
leave the blank lines at the top and bottom of the pages off, then some odd 
things will happen to the page numbers when the HTML file is created.  The page 
numbers end up as part of a paragraph instead of on a separate line. 
       

      Example of how the page should look...

       

      [Page Break]

       

      95

       

      .

      .

      .

      text

      .

      .

      .

       

      [Page Break]

       

      2. Yes, put a blank line between the chapter heading and the first 
line of text.  Again, I don't know that this is necessary, but it does work 
well, and some people have said that it is necessary. 
       

      3. Yes, you need to give the Stripper something to strip other than the 
chapter heading or the chapter heading will be stripped.  The easiest thing to 
do when page numbers are normally at the top of pages except on pages where a 
chapter begins is to move the page number from the bottom of the page to the 
top.  If page numbers are always at the bottom of the page, then place a false 
header above the chapter heading.  Using the title works well, and it won't 
look as odd if the Stripper fails to strip it for some odd reason.  It helps to 
strip all of the real headers manually since it's easier to strip them yourself 
than to ensure that they are consistent so that the Stripper will remove them 
all, but if you do leave them in, then just follow the pattern of the headers 
and put whatever header would normally appear on the top of that page above the 
chapter heading.

       

      Examples of how pages should look...

       

      [Page Break]

       

      95

       

      Chapter Seven

       

      .

      .

      .

      text

      .

      .

      .

       

      [Page Break]

       

      [Page Break]

       

      The Firm

       

      Chapter Seven

       

      .

      .

      .

      text

      .

      .

      .

       

      95

       

      [Page Break]

       

      Or if the page would normally contain the author's name instead of the 
title...

       

      [Page Break]

       

      John Grisham

       

      Chapter Seven

       

      .

      .

      .

      text

      .

      .

      .

       

      [Page Break]

       

      I can't say that these are the only ways to do it, but I've had excellent 
results with these approaches.
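
The layouts in the examples above can also be produced by a small helper, sketched here under some assumptions: a page is a list of text lines, `[Page Break]` is written out literally as in the examples, and the function name `format_page` and its parameters are hypothetical. It reproduces only the layouts shown above and does not claim to know what the Stripper requires.

```python
PAGE_BREAK = "[Page Break]"


def format_page(body_lines, page_number=None, number_at_top=True,
                chapter_heading=None, false_header=None):
    """Lay out one page in the pattern shown in the examples above."""
    if false_header is not None and page_number is not None and number_at_top:
        # The examples use a false header only when the number is not at the top.
        raise ValueError("use either a top page number or a false header")

    lines = [PAGE_BREAK, ""]
    if false_header is not None:
        lines += [false_header, ""]        # something for the Stripper to strip
    elif page_number is not None and number_at_top:
        lines += [str(page_number), ""]    # page number with blank lines around it
    if chapter_heading is not None:
        lines += [chapter_heading, ""]     # blank line before the first text line
    lines += list(body_lines)
    lines.append("")                       # blank line at the bottom of the page
    if page_number is not None and not number_at_top:
        lines += [str(page_number), ""]
    lines.append(PAGE_BREAK)
    return lines
```

For instance, `format_page(["text"], page_number=95, number_at_top=False, chapter_heading="Chapter Seven", false_header="The Firm")` gives the same line order as the "The Firm" example: false header, chapter heading, text, then the page number at the bottom.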

       

      HTH

       

      Gerald

       


--------------------------------------------------------------------------

      From: bksvol-discuss-bounce@xxxxxxxxxxxxx 
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Donna Smith
      Sent: Friday, May 12, 2006 5:41 PM
      To: bksvol-discuss@xxxxxxxxxxxxx
      Subject: [bksvol-discuss] Protecting page numbers and chapter headings

       

      Hi all.

       

      I hesitate to ask this question, but I can't find the answer on any of 
the official or unofficial sites giving tips to scanners.

       

      Did we ever determine absolutely what should be done to protect page 
numbers and chapter headings?  What I want to know is:

       

        1. At the top of each page, do I need to put a blank line before 
and/or after the page number? 
        2. For chapter numbers/titles that appear at the top of the page, do I 
put a blank line before and/or after it? 
        3. Is it necessary to have something at the top of the page for the 
BookShare stripper to strip? 
       

      I spend a good bit of time cleaning up each scan regarding page numbers, 
chapter headings and stripping out unwanted headers.  I'm trying to do it in 
such a way that it will result in the best book in the finished product.

       

      I really, really hope that an answer has been found to this question and 
that I'm not opening up the can of worms we've had before about the different 
perceptions of what might work.  <smile>

       

      Thanks.

       

      Peace and Hope,

       

      Donna
