[bksvol-discuss] Re: paragraphs--indented or line skipped between?

  • From: "Jake Brownell" <jabrown@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Thu, 20 Mar 2008 22:29:28 -0500

Hi Everyone,

Mayrie is essentially correct. The tool that converts books to DAISY identifies paragraphs by a new line. It does not matter whether there are multiple new lines (blank lines) or an indentation; the text will still be marked as a paragraph.

For example, here's what might be shown in the DAISY file for the above paragraph:

<p id="exampleParagraph">Mayrie is essentially correct. The tool that converts books to DAISY identifies paragraphs by a new line. It does not matter whether there are multiple new lines (blank lines) or an indentation; the text will still be marked as a paragraph.</p>
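To make the idea concrete, here is a minimal Python sketch of the behavior described above. It is not the actual Bookshare conversion tool; the function name and the id scheme are made up for illustration. It only shows that blank lines and indentation make no difference once each non-blank line is treated as a paragraph.

```python
from xml.sax.saxutils import escape


def lines_to_paragraphs(text):
    """Hypothetical helper: mark every non-blank line as one paragraph."""
    paragraphs = []
    for line in text.splitlines():
        stripped = line.strip()  # drops indentation, tabs, and trailing spaces
        if not stripped:
            continue  # extra blank lines contribute nothing
        paragraphs.append(stripped)
    # The id values are invented here; a real tool may number them differently.
    return [
        '<p id="paragraph%d">%s</p>' % (number, escape(body))
        for number, body in enumerate(paragraphs, 1)
    ]


# The same two paragraphs, once separated by blank lines and once indented.
with_blank_lines = "First paragraph.\n\n\nSecond paragraph."
with_indentation = "\tFirst paragraph.\n\tSecond paragraph."
```

Calling lines_to_paragraphs on either sample gives the same two <p> elements, which is why neither formatting style matters to a converter that works this way.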

Then, the software that reads the DAISY file decides exactly how a paragraph should be rendered. For example, when you open a DAISY book in Internet Explorer and use a screen reader, there is a "blank line" between paragraphs.

The only potential problem arises when there are more newlines than there should be. For example, paragraphs often span multiple lines in printed books. OCR programs can be configured to retain the "exact" layout, which means that the program puts a newline at the end of every printed line rather than only at the end of paragraphs.

For example notice
that this text is on
multiple lines when it is
clearly text that is meant
to be together. When text
like this is converted, the extra
new lines cause lots of paragraphs.

Again, here's the above paragraph written as it might appear in DAISY.

<p id="paragraph1">For example notice</p>

<p id="paragraph2">that this text is on</p>

<p id="paragraph3">multiple lines when it is</p>

<p id="paragraph4">clearly text that is meant</p>

<p id="paragraph5">to be together. When text</p>

<p id="paragraph6">like this is converted, the extra</p>

<p id="paragraph7">new lines cause lots of paragraphs.</p>

This can give some TTS engines problems, causing jerky reading. As you might guess, books formatted like this would also be next to impossible to navigate by paragraph. Gerald Hovas contributed some tips to my unofficial website that provide automated procedures that can usually fix up a book that has this symptom.
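As a rough illustration of what such a cleanup can look like, here is a small Python sketch. It is not Gerald's procedure, and the function name is made up. It assumes that real paragraphs in the scanned text are still separated by at least one blank line, so any run of consecutive non-blank lines can be joined back into a single paragraph. It does not try to rejoin words hyphenated across line ends.

```python
def unwrap_hard_breaks(text):
    """Hypothetical helper: join hard-wrapped lines into one line per paragraph.

    Assumes blank lines separate the real paragraphs.
    """
    paragraphs = []
    current = []  # lines collected for the paragraph being built
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    # One paragraph per line, which is what a newline-based converter expects.
    return "\n".join(paragraphs)


sample = (
    "For example notice\n"
    "that this text is on\n"
    "multiple lines.\n"
    "\n"
    "A second paragraph\n"
    "starts here.\n"
)
```

Running unwrap_hard_breaks(sample) leaves two lines, one per paragraph. If the scanned text has no blank lines between paragraphs, this assumption fails and a different clue (such as indentation or sentence-ending punctuation) would be needed.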

I hope this gives some insight into our process and wasn't overly complex.

Best regards,
Jake

Jake Brownell
QA Engineer, Bookshare.org
jake.b@xxxxxxxxxxxx

----- Original Message ----- From: "Mayrie ReNae" <mrenae@xxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Thursday, March 20, 2008 6:20 PM
Subject: [bksvol-discuss] Re: paragraphs--indented or line skipped between?


Hi Rita,

        I believe that Bookshare's tools remove any extra blank lines.

I hope that Jake or Pratik will weigh in on this topic, but this is what I remember having read. It makes no difference whether you leave blank lines or indentations in your files. Here is the deal. Whether you leave an extra blank line or indent the paragraphs, the hard return at the end of a paragraph is the marker that the DAISY tools look for to know where one paragraph ends and the next begins. All extra tabs, spaces, and blank lines are stripped. So any extra work that you put into inserting these things is wasted, because only the hard returns that mark the end of one paragraph and the beginning of the next in an RTF document are used to tell the DAISY tools where the paragraphs are. All other formatting is lost. So don't knock yourself out with formatting involving extra blank lines or spaces. It will be stripped anyway.

Sorry to be the bearer of bad news, but I am almost positive that Jake and/or Pratik have said this before when the subject has come up.

Peace,
Mayrie
