John:

Exactly! Transformation should be a breeze with the sem statements. I will sure follow up.

François.

On Thu, Jul 19, 2012 at 3:37 PM, John J. Boyer <john.boyer@xxxxxxxxxxxxxxxxx> wrote:
> Hi Francois,
>
> It is very desirable to get xml output from tika. liblouisutdml may
> already have a .sem file to handle it. If not, one can be created
> easily.
>
> John
>
> On Thu, Jul 19, 2012 at 03:03:41PM -0400, François Ouellette wrote:
>> (follow-up on previous email)
>> Vic: it seems like we can produce formatted XML or HTML from the
>> extraction, in which case we could retrieve the main formatting
>> elements and replicate them in BB. Let me check on this.
>>
>> François.
>>
>> On Thu, Jul 19, 2012 at 12:26 PM, Vic Beckley <vic.beckley3@xxxxxxxxx> wrote:
>> > John and François,
>> >
>> > I got it to compile. I opened a Word 2010 document with it. It seemed the
>> > format of the text was missing. I don't think the paragraphs were still
>> > intact.
>> >
>> > I will do more testing later. I am a little under the weather today and I
>> > think I am going to go rest now. More later. Looks good so far.
>> >
>> >
>> > Best regards from Ohio, U.S.A.,
>> >
>> > Vic
>> > E-mail: vic.beckley3@xxxxxxxxx
>> >
>
> --
> John J. Boyer; President, Chief Software Developer
> Abilitiessoft, Inc.
> http://www.abilitiessoft.com
> Madison, Wisconsin USA
> Developing software for people with disabilities