[bksvol-discuss] Re: solution for The Broker (sort of)

When I validate a book, I delete the headers, which
are usually garbled anyway, but I leave in the page
numbers. My understanding is that the stripper only
strips what it sees repeated, and, of course, since
page numbers aren't the same on each page, they
wouldn't be stripped. I do what Pratik or Guido
suggested some time ago: page number on the top line, a
blank line, and then the text. For numbers at the
bottom, it's text, a blank line, the page number, and
then the page break. If this is wrong or the page numbers are
getting stripped when I do this, please let me know
and I'll make more spaces below the preceding page
break.
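
To make that understanding concrete, here is a minimal Python sketch of a stripper that removes only what it sees repeated. It is an assumption about the general idea, not Bookshare's actual tool: the function name strip_repeated_headers, the min_fraction parameter, and the sample pages are all made up for illustration.

```python
from collections import Counter

def strip_repeated_headers(pages, min_fraction=0.5):
    """Drop a page's first non-blank line if the same text tops many pages.

    Hypothetical heuristic: a line counts as a header only when it appears
    as the top line on at least min_fraction of the pages (and at least 2).
    """
    def top_line(page):
        for line in page.splitlines():
            if line.strip():
                return line.strip()
        return None

    counts = Counter(t for t in map(top_line, pages) if t is not None)
    threshold = max(2, int(len(pages) * min_fraction))
    repeated = {text for text, n in counts.items() if n >= threshold}

    cleaned = []
    for page in pages:
        lines = page.splitlines()
        for i, line in enumerate(lines):
            if line.strip():
                # Only the first non-blank line is a header candidate.
                if line.strip() in repeated:
                    del lines[i]
                break
        cleaned.append("\n".join(lines))
    return cleaned

# A running header is identical on every page, so it is removed.
with_headers = [
    "THE BROKER\n\nFirst page text.",
    "THE BROKER\n\nSecond page text.",
    "THE BROKER\n\nThird page text.",
]
headers_result = strip_repeated_headers(with_headers)

# Page numbers differ on every page, so they are left alone.
with_numbers = [
    "1\n\nFirst page text.",
    "2\n\nSecond page text.",
    "3\n\nThird page text.",
]
numbers_result = strip_repeated_headers(with_numbers)
```

Under this assumption, a page number on the top line followed by a blank line survives because no two pages share the same top line, while a title repeated on every page is dropped.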

I've also come across a couple of books that have the
page numbers in the margins and in the middle of the
page. Those I put at the top of the page -- though now
that I think about it, maybe it would be safer to put them
at the bottom.

Cindy

-- Rui <rui@xxxxxxxx> wrote:

> Good afternoon my fellow booksharians:
> 
> We have now solved the problem for this one book, but it took about a
> dozen messages, not to mention that the book is being edited a second
> time instead of the original problem being corrected.
> If you think there is a backlog now, consider this: if every book went
> through 2 edits instead of being done right the first time, you would
> have a 1500-book step 1 page.
> 
> 
> For the few times I do read a book, as I am currently doing with a
> validation (good book, Brian), I read continuously and the page info
> does not bother me at all; in fact, it is useful. I know that I am on
> page 134 of the book: not maybe around page 134, but actually on page
> 134 of the original text, because the page # from the book is present.
> 
> Once the automated tool hacks at the book, you have a different story:
> no headings, no page #s, no chapter headings, etc.
> 
> I know that Bookshare is fully aware of this, and I hope this will come
> up at a subsequent meeting with their engineers.
> 
> You see, like I said in my bio, I'm not much of a reader, so I do not
> have a Bookshare membership. All I do is validate.
> But it is very disconcerting that books I submit come out in worse
> shape after I hit the submit button than my edited copy is here at
> home.
> 
> In my opinion, that material should remain in the book.
> Those who don't want it in the book can strip out the headers
> individually if they so choose.
> But those of us who want the complete text can't put the
> headers/page breaks back in once they're gone; we have no recourse.
> You can take material out of a book, but once it is gone you can't
> manufacture it and put it back in, short of having the print copy. And
> at that point, it would be faster to just rescan it yourself.
> 
> 
> In closing, when Pratik uploads the book again,
> what's to stop the same
> thing from happening again?
> 
> 
> ----- Original Message ----- 
> From: "Mike Pietruk" <pietruk@xxxxxxxxx>
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> Sent: Monday, January 17, 2005 9:54 AM
> Subject: [bksvol-discuss] Re: The Broker--strengths
> and weaknesses
> 
> 
> > Guido
> >
> > I think we have here as much a philosophical question as a technical one.
> > No matter what system is implemented or put in place, both on BookShare
> > and within our OCR software, someone will find it not to their liking.
> > On the one hand, having page numbers, sections, chapters, etc. kept in the
> > text is invaluable.
> > But then when too much of that info is announced, others object.
> > My personal preference is to have more rather than less kept, and hence a
> > lenient stripper;
> > but I already understand the objections, especially among those who do
> > automated continuous reading, convert to mp3 and all the rest.
> >
> > "The Broker" should be a case study in showing just how difficult all this
> > can be, especially when dealing with automated tools and rush scanning
> > without hand validating.
> > Unintentionally, and this could in no way have been prevented other than
> > through painstaking effort which would have delayed availability of the
> > book, valuable info was lost.
> > In the short run, having the book immediately available is more important
> > than having technical glitches dealt with.
> > Perhaps the best solution, in a case such as this, is to have the book
> > immediately made available with the originally scanned copy placed on the
> > step 1 validation page for someone, if they choose, to do the manual
> > fine-tuning.
> > Then, once validated, the improved copy would replace the original one.
> > That would be the best of both worlds -- quick access but also addressing
> > the real concerns expressed by Ken that the book isn't optimally labeled
> > internally.



                