[bksvol-discuss] Re: What The New Converter Does and thanks to Marilyn!

  • From: "Bob" <rwiley@xxxxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Mon, 6 Apr 2009 14:38:45 -0500

For those who use Kurzweil to read their books, you can find the first page of
the book (past all the front stuff with roman numerals) and press alt+v to go to
the "navigation" menu, and arrow up until you hear "set user defined page
number" (or something like that). Hit enter and a dialog will open where you
can enter the page number you want to start with (in this case number 1). From
then on, Kurzweil will announce the correct page number. This is especially
helpful if the book has a table of contents but no page numbers. You can look 
up a chapter's page number in the contents, and by pressing ctrl+g go directly 
to that page.

Bob
  ----- Original Message ----- 
  From: Rogerbailey81@xxxxxxx 
  To: bksvol-discuss@xxxxxxxxxxxxx 
  Sent: Monday, April 06, 2009 2:17 PM
  Subject: [bksvol-discuss] Re: What The New Converter Does and thanks to 
Marilyn!


  In that case, the page numbers are not required so you can just ignore them 
if you want, but it would be appreciated by Bookshare and all future readers of 
the book if you would add them.

                  "Philosophers have merely interpreted the world in various 
ways; the point is to change it." Karl Marx    

  Subj: [bksvol-discuss] Re: What The New Converter Does and thanks to Marilyn!
  Date: 4/6/2009 2:34:47 PM Eastern Daylight Time
  From: regandlon@xxxxxxxxxxxxxxxx
  Reply-to: bksvol-discuss@xxxxxxxxxxxxx
  To: bksvol-discuss@xxxxxxxxxxxxx

  Oops just saw my mistake.  The books have page breaks but not numbers.  Same
  question as below.  Thanks.
  Reggie 

  -----Original Message-----
  From: bksvol-discuss-bounce@xxxxxxxxxxxxx
  [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Reggie & Brooks
  Sent: Monday, April 06, 2009 11:55 AM
  To: bksvol-discuss@xxxxxxxxxxxxx
  Subject: [bksvol-discuss] Re: What The New Converter Does and thanks to
  Marilyn!

  Thank you Pavi and Jake.  One more question.  I have been working on a
  couple of books that have no page breaks.  Will one be put in automatically,
  should I try to put them in, or just leave it alone?  

  By the way, thanks Marilyn for the help.  Sorry I did not see the large
  print designation before you originally got Coal Black Horse.  Continuing to
  read, and hope you had success getting the other edition.  Thanks so much.
  Reggie 

  -----Original Message-----
  From: bksvol-discuss-bounce@xxxxxxxxxxxxx
  [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Pavi Mehta
  Sent: Thursday, April 02, 2009 7:01 PM
  To: bksvol-discuss@xxxxxxxxxxxxx
  Subject: [bksvol-discuss] What The New Converter Does

  Hi Folks,

  Exciting news! At our request, Bookshare's Jake Brownell wrote up a detailed
  explanation of how the new converter works and how it interfaces with our
  volunteer work. I distilled his explanation into the three guidelines below
  (we will include these in the volunteer manual shortly and also let
  volunteers who are not on this list know). Jake's terrific write up is
  included in its entirety at the end of this mail (Thank you Jake!)

  Guidelines for Chapter Headings & Page Numbers

  1.       The new converter removes the unwanted running headers and
  footers (author name, book title, chapter title) at the top or bottom of the
  page for you. Caveat: The converter is an improvement on the old stripper but
  occasionally may leave a header in or remove a legitimate piece of text.
  If you come across an occurrence of "unintended stripping" please report it
  to us.

  2.       Ensuring that chapter headings are in a font size bigger than
  the rest of the text helps the new converter recognize them more easily.
  You no longer have to do anything beyond that to protect chapter headings.

  3.        You do not have to move page numbers from the bottom to the
  top. The converter places recognized page numbers at the top of each page
  for you.

  Jake's Explanation of the New Converter:

  We have a new RTF converter as part of the new platform we launched early
  this year. Along with the RTF converter is a new tool designed to process
  running headers, running footers, page numbers and chapters. The term
  "chapters" in this context really means any generic section of a book, but
  since most books use chapters, for the sake of discussion we'll use that
  term. The terms running headers and running footers refer to text in books
  that is repeated at the top or bottom of nearly all pages and is
  usually ignored during the reading of a standard print book. Examples of
  running headers and running footers are the book title, the author's name or
  a chapter title. Running headers are much more common in practice than
  running footers.

  The tool attempts to do several things. It attempts to identify and remove
  running headers and running footers from the text, so that this information
  is not repeated on every page by TTS engines, interrupting the flow of the
  book. The tool also attempts to identify page numbers on each page and
  handle them appropriately for each format, DAISY or BRF.
  (For DAISY this means placing the page number in the special pagenum tag
  that tells a DAISY player that the enclosed text is a page number. It's that
  tag that allows a DAISY player to skip to different pages. For BRF books
  this means placing the page number at the end of a line of dashes so that it
  can be easily located.)
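
  As a rough illustration of the two page-number treatments just described,
  here is a minimal Python sketch. It is not Bookshare's code. The function
  names, the 40-cell line width, and the "page" and "id" attributes on the
  pagenum tag are assumptions made for the example; the message itself only
  says that DAISY uses a pagenum tag and that BRF puts the number at the end
  of a line of dashes. Plain digits stand in for braille here.

```python
from xml.sax.saxutils import escape

BRF_LINE_WIDTH = 40  # assumed cells per braille line; not stated in the message


def daisy_pagenum(page_number: str) -> str:
    """Wrap a page number in a DAISY-style pagenum element (attributes assumed)."""
    safe = escape(page_number)
    return f'<pagenum page="normal" id="page_{safe}">{safe}</pagenum>'


def brf_page_line(page_number: str, width: int = BRF_LINE_WIDTH) -> str:
    """Build a line of dashes that ends with the page number."""
    dashes = "-" * max(width - len(page_number), 0)
    return dashes + page_number


print(daisy_pagenum("12"))
print(brf_page_line("12"))
```

  In both cases the number is kept apart from the running text, which is what
  lets a DAISY player or a braille reader find a page quickly.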

  Does all of this sound familiar? Veteran volunteers might recognize the
  above steps as those that our old, now defunct tool used to do. Our new tool
  does each of them more accurately, producing much better results.

  The new tool also attempts to locate chapters within a book. If the tool can
  reasonably identify some sort of consistent divisions throughout a book, it
  will make appropriate DAISY levels and headings. Note, don't confuse
  "headings" with "headers." Headings are similar to those found on web pages.
  This additional markup can help with navigation.

  What does all of this imply?

  The old tool was overzealous in the removal of text it considered to be a
  running header or running footer. The new tool is more conservative about
  what it should remove. For example, the old tool might have considered the
  text at the start of a chapter to be a running header, e.g. "Chapter 10" or
  "Chapter 15." Some volunteers elected to "protect"
  that text by placing a dummy header above it such as "***". This should no
  longer be necessary with the new tool. In fact, the new tool in the best of
  circumstances will recognize "Chapter 10" as a new chapter and mark it as
  such.

  Is it still more accurate to strip running headers and footers by hand?

  The best result comes from removing the running headers and running footers
  by hand, but this is a time-consuming process. It's also time-consuming
  to ensure the headers match exactly. The new tool will allow minor
  variations in a running header and footer, but since we wanted to err on the
  side of caution, some headers or footers might be left in the text.
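
  To make the idea of "minor variations" and "erring on the side of caution"
  concrete, here is a small Python sketch of one way a conservative header
  stripper could work. It is only an illustration under assumptions, not the
  actual converter: the names normalize and strip_running_headers, the 60%
  threshold, and the choice to ignore digits and letter case are all
  hypothetical.

```python
import re
from collections import Counter


def normalize(line: str) -> str:
    """Ignore digits, case and extra spaces so minor variations still match."""
    return re.sub(r"\s+", " ", re.sub(r"\d+", "", line)).strip().lower()


def strip_running_headers(pages: list[list[str]], min_share: float = 0.6) -> list[list[str]]:
    """Drop the first line of a page only when the same text tops most pages."""
    counts = Counter(normalize(p[0]) for p in pages if p)
    cleaned = []
    for page in pages:
        if page:
            key = normalize(page[0])
            # An empty key means the line was only digits (a bare page number).
            if key and counts[key] / len(pages) >= min_share:
                page = page[1:]
        cleaned.append(page)
    return cleaned


pages = [
    ["Sample Title 2", "text a"],
    ["Sample Title 3", "text b"],
    ["Chapter 2", "text c"],
    ["SAMPLE TITLE 5", "text d"],
]
for page in strip_running_headers(pages):
    print(page)
```

  In this sample the repeated title is removed from three pages, while
  "Chapter 2" stays because it tops only one page. A higher threshold leaves
  more headers in but is less likely to remove legitimate text.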

  How are chapters identified among all the text?

  We use a few different techniques and may add more in the future. Right now
  the easiest way to identify chapters is when the text of the heading is
  slightly larger than the rest of the text. For example, the normal text might
  be 12 pt while the chapter text is in 16 pt. Other factors can affect the
  identification, but that's an easy rule of thumb.
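
  The font-size rule of thumb can be sketched in a few lines of Python. This
  is a simplified illustration, not the converter's logic: it assumes each
  line already comes with its font size, treats the most common size as the
  body text, and the name find_chapter_headings is made up for the example.

```python
from collections import Counter


def find_chapter_headings(lines: list[tuple[str, float]]) -> list[str]:
    """Return the text of lines set larger than the most common (body) size."""
    sizes = Counter(size for text, size in lines if text.strip())
    if not sizes:
        return []
    body_size = sizes.most_common(1)[0][0]
    return [text for text, size in lines if text.strip() and size > body_size]


lines = [
    ("Chapter 1", 16.0),
    ("It was a dark night.", 12.0),
    ("The rain fell.", 12.0),
    ("Chapter 2", 16.0),
    ("Morning came.", 12.0),
]
print(find_chapter_headings(lines))
```

  With the sample lines, the 12 pt lines are taken as body text and the two
  16 pt lines come back as headings. The sketch counts lines rather than
  characters, so it would be fooled by a book with more heading lines than
  body lines.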

  Some books have page numbers at the top and others at the bottom; does it
  matter where they are in the scan?

  The easy answer is no, it does not. When processing a book we look at text
  between two page breaks. When a page number is located either at the top or
  bottom of the page, the text between the page breaks is associated with that
  number. When generating DAISY and BRF we place the associated page number in
  the correct spot, which for both formats is at the beginning of the page. So
  effectively if the page number is at the bottom of the page, we move it to
  the top.
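
  Here is a short Python sketch of the page-number association described
  above. It rests on simplifying assumptions: a form feed character stands in
  for the page break (the real input is RTF), a page number is a line that
  contains only digits, and the name pages_with_numbers is hypothetical.

```python
import re
from typing import Optional

PAGE_BREAK = "\f"  # assumed page-break marker; the real converter reads RTF


def pages_with_numbers(text: str) -> list[tuple[Optional[str], list[str]]]:
    """Pair each page's text with a number found on its first or last line."""
    result = []
    for chunk in text.split(PAGE_BREAK):
        lines = [ln for ln in chunk.splitlines() if ln.strip()]
        number = None
        if lines and re.fullmatch(r"\s*\d+\s*", lines[0]):
            number = lines.pop(0).strip()   # number was at the top
        elif lines and re.fullmatch(r"\s*\d+\s*", lines[-1]):
            number = lines.pop().strip()    # number was at the bottom
        result.append((number, lines))
    return result


text = "1\nFirst page text.\fSecond page text.\n2\fNo number here."
for number, lines in pages_with_numbers(text):
    print(number, lines)
```

  Because the number is stored separately from the page text, the output step
  can always write it at the start of the page, wherever it sat in the scan.
  A page with no recognizable number simply comes back with None.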

  All good things,

  Pavi Mehta

  Volunteer Coordinator, Bookshare

  Benetech 

  480 S. California Ave., Suite 201

  Palo Alto, CA 94306-1609 USA

  Phone:  +1 650 644-3459

  pavim@xxxxxxxxxxxx

  www.benetech.org

  The Benetech Initiative - Technology Serving Humanity 

  A Nonprofit Organization


  To unsubscribe from this list send a blank Email to
  bksvol-discuss-request@xxxxxxxxxxxxx
  put the word 'unsubscribe' by itself in the subject line.  To get a list of
  available commands, put the word 'help' by itself in the subject line.





 
