[bksvol-discuss] Re: using machines for what they were meant to do

  • From: "groups Warford" <groups_warford@xxxxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Thu, 11 Sep 2008 10:59:06 -0400

Hi Jake and Listers,
It's been a long time since I've written or been able to keep up with the
list so I hope all of you are doing well.

Jake, I think I understand and as a validator I definitely wouldn't mind
going the extra mile.  I think the books that you can pull from the Stream
pilot site use tags such as <h2>, <h3> and so forth.  At the beginning of
the text comments are written saying what each "level" represents.  But I
can also see that they may be doing something else so as not to confuse the
HTML.  I tried this sort of method on a Bookshare book of short stories,
marking each story.  I was pretty sure that the tools wouldn't recognize
the tags, but also that the worst that could happen was that they would just
ignore my tags and you wouldn't be able to navigate to each story.  And, that is
what happened!  I think it would be wonderful if Bookshare came up with a
mark-up language that they would like for us to use, making it available to
both submitters and validators, but maybe it would be mostly the job of the
validator to put these in.  Personally I would love to be able to move from
chapter to chapter, section to section and so forth.  As well as making
written instructions available, it might be good to have a class too with
archives of course.  <smile>.
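
For anyone curious how a reading tool could use heading tags like these, here
is a minimal Python sketch. It is not what Bookshare or the Stream actually
runs. It assumes the book is plain HTML with <h1> through <h6> headings, and
the names HeadingOutline and outline_of are made up for illustration. It simply
lists each heading with its level, which is the information a player would need
in order to jump from story to story.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, title) pairs for every <h1>..<h6> in a document."""

    def __init__(self):
        super().__init__()
        self.outline = []    # finished (level, title) pairs, in document order
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text pieces seen inside the open heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H2" arrives here as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline_of(html):
    """Return the heading outline of an HTML string."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline
```

Calling outline_of on a book's HTML gives a list such as (2, "Story One"),
(3, "Part A"), and so on, which could then drive chapter-to-chapter navigation.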

Anyway, just wanted to throw out my opinion, and you're welcome to throw it
out, too!

Take care, thanks for the information and for Bookshare considering these
options.

By the way, I don't drink coffee, don't have a hand grinder but do have an
electric one for guests.  I'm not sure how this came into the story, but I
figured while I was writing... <smile>

My Best to All,
Cindy 4
"Always" and "never" are two words you should always try never to say.
--Coffee News Magazine
 
-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Jake Brownell
Sent: Thursday, September 11, 2008 12:57 AM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] Re: using machines for what they were meant to do

Hi Elizabeth,

I'll try and give you a bit more information. It's slightly techie. DAISY as
a markup language is very rich. DAISY draws many of its tags from HTML, but
adds many more of its own. A good example of a tag included in DAISY that is
not included in HTML is the sidebar tag. This tag is intended to enclose
information that represents a sidebar in the original. HTML has little use
for such a tag. If you were to convert a book from HTML to DAISY, how would
you know when it was appropriate to insert a sidebar tag? A human might be
able to decide what becomes a sidebar, but a computer may have a much more
difficult time. Keep this example in mind as I switch vectors to RTF.

RTF also has markup and, as with HTML and DAISY, the conversion is not a fully
two-way thing. It's also necessary to consider what RTF markup is generated
by OCR. The markup generated is much less than what's available, usually
because OCR is only so smart--its main goal is to get the text extracted.

So, the question becomes, how can we make a more meaningful DAISY book from 
RTF books that don't have a whole lot of markup after OCR?

There are different options available and we're considering which option(s) 
are best. We may be able to detect chapters and add appropriate markup by 
considering font size, or relative chunks of text, or by a code inserted by 
a volunteer....
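
To make the last option concrete, here is a rough Python sketch of how a code
inserted by a volunteer could be turned into chapter markup. Everything
specific in it is an assumption for illustration: the [[chapter]] marker, the
mark_up function, and the simplified output, which borrows the level1, h1 and
p tags but is not a complete or valid DAISY file. It does not describe
Bookshare's actual tools.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical marker: a volunteer types "[[chapter]]" in front of each title.
CHAPTER_MARK = re.compile(r"^\[\[chapter\]\]\s*(.+)$", re.IGNORECASE)


def mark_up(lines):
    """Wrap each marked chapter in <level1> with an <h1> title.

    Every other non-blank line becomes a <p>. Text before the first
    marker is kept as paragraphs outside any <level1>.
    """
    out = []
    in_chapter = False
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines carry no content of their own
        match = CHAPTER_MARK.match(text)
        if match:
            if in_chapter:
                out.append("</level1>")  # close the previous chapter
            out.append("<level1>")
            out.append(f"<h1>{escape(match.group(1))}</h1>")
            in_chapter = True
        else:
            out.append(f"<p>{escape(text)}</p>")
    if in_chapter:
        out.append("</level1>")
    return out
```

A volunteer who cannot or does not want to add markers loses nothing: text
without any marker simply comes out as plain paragraphs, as it does today.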

We know not every volunteer will be able to give us beautifully marked up 
books, but that's okay--we'd like the technology to be in place for those 
who choose to go the extra mile.

And hey, I do have a hand coffee grinder--though I don't drink coffee, 
smile.

Jake
> I know I can grind coffee beans by hand and save electricity but do I want
> to?
>
> E.

 To unsubscribe from this list send a blank Email to
bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line.  To get a list of
available commands, put the word 'help' by itself in the subject line.



