[brailleblaster] Re: Compiled

  • From: "John J. Boyer" <john.boyer@xxxxxxxxxxxxxxxxx>
  • To: brailleblaster@xxxxxxxxxxxxx
  • Date: Fri, 20 Jul 2012 00:21:45 -0500

It looks to me as if the coordinate information in the <newline> tag in 
UTDML is not being used. The x coordinate specifies indenting in terms 
of a resolution of 20 dpi.
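
As a rough illustration of what using that coordinate could involve, here is a minimal sketch. It is not the liblouisutdml or BrailleBlaster API: the xy="x,y" attribute layout, the class name NewlineIndent, and both helper methods are assumptions made only for this example. The only fact taken from above is the resolution of 20 units per inch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NewlineIndent {
    // Resolution stated in the message above: 20 units per inch.
    static final int UNITS_PER_INCH = 20;

    // ASSUMPTION: the tag carries its coordinates as xy="x,y".
    // Adjust the pattern if the real attribute layout differs.
    private static final Pattern XY =
        Pattern.compile("<newline\\s+xy=['\"](\\d+),(\\d+)['\"]\\s*/>");

    // Extract the x coordinate (the indent) from a single <newline> tag.
    static int xOf(String tag) {
        Matcher m = XY.matcher(tag);
        if (!m.find()) {
            throw new IllegalArgumentException("not a <newline> tag: " + tag);
        }
        return Integer.parseInt(m.group(1));
    }

    // Convert an x coordinate in 20 dpi units to inches.
    static double toInches(int x) {
        return (double) x / UNITS_PER_INCH;
    }

    public static void main(String[] args) {
        String tag = "<newline xy=\"40,100\"/>"; // made-up sample tag
        int x = xOf(tag);
        System.out.println(x + " units = " + toInches(x) + " in");
    }
}
```

A display layer could then map the inch value to whatever unit the view uses (pixels, character columns) instead of dropping the indent.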

John B

On Fri, Jul 20, 2012 at 12:17:57AM -0400, Vic Beckley wrote:
> John G,
> 
> In my brief experiments with *.docx, *.rtf, and *.pdf, it seems the 
> structure is being preserved. The paragraphs don't show up in the imported 
> text as far as I can tell. I don't know if they are showing up visually or not. 
> The text in the XML view is not noticeably indented where it should be. If 
> you say no to the translation prompt for UTD, then the paragraphs are there 
> in the formatted Braille. If you say yes to UTD then the formatting is not 
> shown. It was this way before when I was working with UTD files. I think it 
> is just that this feature needs work on how they are displayed. These are 
> just my thoughts based on limited testing.
> 
> 
> Best regards from Ohio, U.S.A.,
> 
> Vic
> E-mail: vic.beckley3@xxxxxxxxx
> 
> 
> -----Original Message-----
> From: brailleblaster-bounce@xxxxxxxxxxxxx 
> [mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John Gardner
> Sent: Thursday, July 19, 2012 11:32 PM
> To: brailleblaster@xxxxxxxxxxxxx
> Subject: [brailleblaster] Re: Compiled
> 
> Hello all, we should distinguish between formatting and structure.  We need 
> to capture the structure - which includes paragraphs, headings, tables, etc.  
> But we don't need to capture the formatting of those things - whether the 
> paragraph is indented or double spaced, whether the heading is bold or 
> centered. Or...  From the on-going conversation, it isn't clear to me that we 
> are even capturing structure.  I hope I am misunderstanding.
> 
> Thanks.
> John G
> 
> 
> 
> -----Original Message-----
> From: brailleblaster-bounce@xxxxxxxxxxxxx 
> [mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of François Ouellette
> Sent: Thursday, July 19, 2012 3:25 PM
> To: brailleblaster@xxxxxxxxxxxxx
> Subject: [brailleblaster] Re: Compiled
> 
> I don't know what was there at the beginning, but it looks like it has 
> improved over time, judging by the notes from the consecutive releases.
> Again, it is not a content formatter, it is a content extractor! But we can 
> get XML or XHTML from a file through SAX classes and decide on the resulting 
> format. I am following up on this. Currently we only get unformatted text, 
> and that is a starting point.
> 
> François
> 
> On Thu, Jul 19, 2012 at 5:40 PM, Michael Whapples <mwhapples@xxxxxxx> wrote:
> > Hello,
> > I remember John Gardner mentioning Tika near the beginning of the 
> > Brailleblaster project, but at that time we concluded the formatting 
> > from it was not really good enough. Has it improved?
> >
> > Michael Whapples
> > On 19/07/2012 22:17, François Ouellette wrote:
> >>
> >> John: Exactly! Transformation should be a breeze with the sem 
> >> statements. I will be sure to follow up.
> >>
> >> François.
> >>
> >> On Thu, Jul 19, 2012 at 3:37 PM, John J. Boyer 
> >> <john.boyer@xxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>> Hi Francois,
> >>>
> >>> It is very desirable to get xml output from tika. liblouisutdml may 
> >>> already have a .sem file to handle it. If not, one can be created 
> >>> easily.
> >>>
> >>> John
> >>>
> >>> On Thu, Jul 19, 2012 at 03:03:41PM -0400, François Ouellette wrote:
> >>>>
> >>>> (follow-up on previous email)
> >>>> Vic: it seems like we can produce formatted XML or HTML from the 
> >>>> extraction, in which case we could retrieve the main formatting 
> >>>> elements and replicate them in BB. Let me check on this.
> >>>>
> >>>> François.
> >>>>
> >>>> On Thu, Jul 19, 2012 at 12:26 PM, Vic Beckley 
> >>>> <vic.beckley3@xxxxxxxxx>
> >>>> wrote:
> >>>>>
> >>>>> John and François,
> >>>>>
> >>>>> I got it to compile. I opened a Word 2010 document with it. It 
> >>>>> seemed the format of the text was missing. I don't think the 
> >>>>> paragraphs were still intact.
> >>>>>
> >>>>> I will do more testing later. I am a little under the weather 
> >>>>> today and I think I am going to go rest now. More later. Looks 
> >>>>> good so far.
> >>>>>
> >>>>>
> >>>>> Best regards from Ohio, U.S.A.,
> >>>>>
> >>>>> Vic
> >>>>> E-mail: vic.beckley3@xxxxxxxxx
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>> --
> >>> John J. Boyer; President, Chief Software Developer Abilitiessoft, 
> >>> Inc.
> >>> http://www.abilitiessoft.com
> >>> Madison, Wisconsin USA
> >>> Developing software for people with disabilities
> >>>
> >>>
> >
> >
> 
> 
> 

-- 
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities

