Jill,

Bookshare is currently working on preserving the original page numbers in the BRF files. They've had to ask Duxbury for some guidance and then do some experimenting to make sure they understand Duxbury's advice. They think they now have a better understanding of what they need to do and are working to update the tools to fix the problem as one of their engineering projects. The following message was sent out in September and says that they will probably need a few months to get the kinks out before we start seeing any change.

As for guidance on how to handle running headers and footers, it sounds like what you're doing is fine. Marissa, who left Bookshare back at the end of May, told me that manually stripping the text was fine since that's what the Stripper will attempt to do anyway. That makes it easier for the Stripper to recognize the page numbers. Just be sure the page number is the first line of text if it's in the header, or the last line if it's in the footer; otherwise the Stripper can't recognize it.

Keep in mind that the real reason the Stripper is part of Bookshare's set of tools is not to strip headers and footers but to process the page numbers, which makes books easier to navigate with DAISY. The list has brought up the issue of just how well that's working at the moment, since the page numbers are not showing up in K-1000 as expected. But from what Stephen Baum from Kurzweil has said, it sounds like it's just one of those growing pains that we'll have to endure while Bookshare's tools and the software packages that support DAISY mature. My guess is that it's a temporary problem that will be worked out in the near future.

There's also the issue of preventing chapter headings from being stripped.
When page numbers appear in running headers, moving the page number from the bottom to the top of a page that has a chapter heading will prevent this from happening. In the case where all of the page numbers are in the footers, placing a line with the title of the book above the chapter heading should solve the problem. This gives the Stripper a header to strip, and it leaves the chapter heading alone.

You'll see Dave mentioning NIMAS in the following message as he explains the issue of original page numbers in BRF. When Engineering upgrades the tools to support NIMAS, they'll attempt to address the issues volunteers are having with the Stripper. In the meantime, we'll just have to be patient.

That doesn't mean we can't ask for some documentation for handling the Stripper as it exists today, though, so I'll pass your request on to the staff. I'm sure you realize that this won't be the first time they have gotten this particular request. That doesn't mean that they wouldn't like to provide the help, just that they have their hands full with everything they have to do, and squeezing it in isn't easy. If nothing else, sometime in the next few weeks I'll try to pull together some of the e-mails that have been sent out over the list concerning how to handle the Stripper and add a tip to Jake's website.

HTH

Gerald

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of Janice Carter
Sent: Friday, September 02, 2005 7:17 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] BRF and page numbers - Bookshare.org Friday Update

Based on lots of discussion on the list regarding the problems of page numbers not appearing in Grade II BRF files downloaded from Bookshare.org, the Benetech engineering team has been working on some short-term as well as longer-term solutions. The following is a fairly detailed explanation from Dave Offen, Benetech's Director of Engineering.
"...Recently we have been in frequent contact with Duxbury, the folks who make our Grade II translator, to see if we can introduce into our books a special "new page" code that the Duxbury translator will output in Braille along with the original page number. The folks at Duxbury have told us which code we should use, and we have been experimenting with it. At first it didn't work at all, until we discovered that if the new page occurs within a paragraph (because the paragraph continues on the next page) the page number and new-page mark will be ignored by Duxbury. Now that we better understand the Duxbury requirements, we should be able to reformat any open HTML tags in the vicinity of new-page marks (such as the open paragraph tags) and get the Braille to properly output page numbers. In DAISY 3 book readers, you can always ask "where am I" and it will tell you your current page. With the above mentioned change to our Braille generation, people downloading BRF files will have access to the same page information that people downloading DAISY 3 books now have. It may take a few months before we've got all the kinks ironed out of this process, but we understand that lots of people are waiting for this kind of improvement. For the longer term, we are looking into ways of improving the page number identification in our books. This is especially important for textbook users. We're investigating if there are scanning or proofreading guidelines that can improve our ability to capture page numbers. This page number capturing takes place in the header/footer stripper. The header/footer stripper is needed to make the books flow smoothly when listening to them using TTS in a DAISY 3 reader. 
If the page header or footer is located in the first line before or after a page break in the OCR'd RTF file that gets uploaded to our collection, the stripper will usually be able to extract the page number information before it strips away the header/footer. This information is stored in our master XML file, from which both DAISY 3 and BRF books are generated.

As we begin to work with publishers producing NIMAS content under the new guidelines, these improvements to our BRF processing will carry forward to our new NIMAS books as well. We will be able to take NIMAS files and, using these same processes, feed them through Duxbury to produce BRF files with the original pages marked in Braille."

As we've mentioned in several other postings, changes to the Bookshare.org system are no longer small efforts. We will have 25,000 books very soon, and changes and upgrades that will help Bookshare.org grow are getting fully vetted by Engineering and Operations and Fundraising and Jim and you. (The "when will this happen?" is based on funding timing.)

Thanks again for keeping us focused on your needs. Stay safe this weekend.

Janice Carter
Director, Literacy Programs
Benetech
480 S. California Ave., Suite 201
Palo Alto, CA 94306-1609 USA
(650) 475-5440 x122
(650) 759-5828 cell
(650) 475-1066 fax
janice.c@xxxxxxxxxxxx
www.benetech.org
The Benetech Initiative - Technology Serving Humanity
A Nonprofit Organization

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of Jill O'Connell
Sent: Friday, November 04, 2005 3:23 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] leaving headers

I would like to know what those of you who are submitting books are doing about headers. I am a braille reader, so I don't know how they come across in DAISY format other than what I have read here on the list about tags.
They are certainly unpleasant to keep reading in braille, but the stripper seems so unreliable that I don't feel we can count on it to remove them. Since we keep begging Bookshare to give us guidance in this matter and they do not, I am wondering what most of you do and also how you think it affects the acceptance of your submissions. My present policy is to remove headers, preserving the print page number, but as others have pointed out, the numbers don't seem to be preserved in braille regardless.
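
For anyone who wants to check a scanned page against the rule mentioned in this thread (the page number has to be the first line of the page if it is in the header, or the last line if it is in the footer), here is a rough Python sketch. It is only an illustration of that rule, not Bookshare's actual Stripper. The function name `find_page_number`, the digits-only pattern, and the choice to skip blank lines are all assumptions made for the example.

```python
import re

# Assumption: a page number is a line containing only digits (plus optional
# surrounding whitespace). Roman numerals and "Page 12" forms are not handled.
PAGE_NUMBER = re.compile(r"^\s*(\d+)\s*$")


def find_page_number(page_lines):
    """Return (number, position) for one page of text, or (None, None).

    Hypothetical helper illustrating the rule from this thread; it is not
    the real Stripper. `page_lines` is the list of text lines on one page.
    """
    # Assumption: blank lines are ignored, so the "first" and "last" lines
    # are the first and last lines that actually contain text.
    lines = [line for line in page_lines if line.strip()]
    if not lines:
        return None, None

    # Header case: the page number must be the very first line of text.
    first = PAGE_NUMBER.match(lines[0])
    if first:
        return int(first.group(1)), "header"

    # Footer case: the page number must be the very last line of text.
    last = PAGE_NUMBER.match(lines[-1])
    if last:
        return int(last.group(1)), "footer"

    # A number buried on the second line (for example under a book title)
    # is not recognized, which matches the advice given above.
    return None, None
```

Calling `find_page_number(["THE BOOK TITLE", "42", "Some text."])` finds nothing because the number is on the second line, while moving `"42"` to the top of the list lets the sketch pick it up as a header page number.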