[brailleblaster] Re: Content importers

  • From: François Ouellette <braille@xxxxxxx>
  • To: brailleblaster@xxxxxxxxxxxxx
  • Date: Wed, 4 Jul 2012 13:51:10 -0400

Hi John,

For importing content, Tika seems to do a very good job: we can get the
text from almost any file format, either formatted or unformatted. This
would be ideal for bringing in external documents to be translated. It
recognizes the document types and performs the necessary extraction. We can
even extend its properties to include our own MIME types.

Exporting is another story. We do have to drive it more specifically if we
want to add or save translated content into a specific document structure
like DAISY or EPUB.

F.

On Wed, Jul 4, 2012 at 12:50 PM, John J. Boyer <john.boyer@xxxxxxxxxxxxxxxxx> wrote:

> Hi Francois,
>
> We looked at Tika two years ago near the start of the project. At that
> time its results just weren't good enough. It may have developed
> considerably since then. The main question is whether it will now
> produce XML files that meet our requirements.
>
> John
>
> On Wed, Jul 04, 2012 at 10:38:40AM -0400, Francois wrote:
> > Hi, and Happy Independence Day to our American friends!
> > I am looking at the various content types that we may want to import into
> > BrailleBlaster, and the Apache Tika toolkit facilitates the importing of
> > most known document types, including old and new Word formats, RTF,
> > OpenDocument, XML, and even compressed documents and email messages.
> >
> > Has anyone worked on this before? That would save us lots of development
> > time. I am having fun with this...
> >
> > Thanks.
> >
> > François.
>
> --
> John J. Boyer; President, Chief Software Developer
> Abilitiessoft, Inc.
> http://www.abilitiessoft.com
> Madison, Wisconsin USA
> Developing software for people with disabilities
