[brailleblaster] Re: BrailleBlaster download size

  • From: Keith Creasy <kcreasy@xxxxxxx>
  • To: "brailleblaster@xxxxxxxxxxxxx" <brailleblaster@xxxxxxxxxxxxx>
  • Date: Tue, 28 May 2013 14:55:19 +0000

Hi all.

We might want to consider using external tools such as the DAISY Pipeline to 
support some formats. We could even consider re-introducing Tika, but only as 
an ancillary tool. We dropped it because it was rolled into DocumentManager and 
has only limited usefulness as an XML parser. In most cases we are more 
interested in going from plain text, RTF, or some other text format to XML and 
not the other way around.

My experience is that converting from one document format to another is 
actually the easy part. Processing the converted documents to produce clean 
results is the more difficult part.



Keith Creasy
Software Developer
American Printing House for the Blind
KCreasy@xxxxxxx
Phone: 502.895.2405
Skype: keith537

-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx 
[mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John Gardner
Sent: Tuesday, May 28, 2013 10:44 AM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: BrailleBlaster download size

Michael, the original specs called for supporting such popular formats.  My 
thought is that it is very important to support the docx format initially but 
that non-xml formats could be postponed to version 2.  Or at least 1.1.


John G


John Gardner | President | 541.754.4002 x 200 | www.viewplus.com




-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx
[mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of Michael Whapples
Sent: Monday, May 27, 2013 11:24 PM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: BrailleBlaster download size

Last week I made an installer version for ViewPlus, and as the Tika dependency 
was not listed on the BrailleBlaster website or in the README files I did not 
include it. There were no compilation issues, so I take it that it is not 
used and thus can be safely removed.

However, this does raise a wider question: what formats do we wish to support? 
What I am really getting at is whether we wish to support non-XML formats such 
as the old binary Word document format (.doc) or PDF, another popular format. 
I feel it would be unwise to ignore these formats. Even if we convert them to 
an XML format behind the scenes, users might expect to be able to open some of 
these non-XML formats directly.

Michael Whapples
On 28/05/2013 06:00, John J. Boyer wrote:
> I think it should be removed. If we want to import docx files we can 
> use the content document which is xml and which liblouisutdml can handle.
> Making a semantic-action file for it would be easy.
>
> John
>
> On Tue, May 28, 2013 at 12:50:47AM -0400, Vic Beckley wrote:
>> Keith,
>>
>> This may not be possible but I was thinking that since we aren't 
>> using the tika-app-1.1.jar file now that it could be removed from the 
>> package before the beta is released. The entire installer for Windows 
>> is about 30 MB, with that one file taking up approximately 25 MB of 
>> that total size. Removing it would significantly reduce the size of 
>> BrailleBlaster. I don't know at this point if the decision has been 
>> made whether we are going to use this file later on or not. Any thoughts?
>>
>>
>> Best regards from Ohio,
>>
>> Vic
>>
>>
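
John Boyer's point in the thread, that a docx file carries its text in an XML
content document, can be illustrated with a minimal sketch. It assumes the
content document sits at the conventional path word/document.xml inside the
docx zip archive. The function names and the in-memory sample archive are
hypothetical, made up for illustration; the sample is not a complete docx that
Word would open, and nothing here is BrailleBlaster or liblouisutdml code.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in the content document.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumption: the main content part lives at its conventional location.
CONTENT_PART = "word/document.xml"


def read_content_document(docx_file):
    """Parse the content document of a docx (path or file object) and return its root."""
    with zipfile.ZipFile(docx_file) as archive:
        with archive.open(CONTENT_PART) as part:
            return ET.parse(part).getroot()


def paragraph_texts(root):
    """Yield the concatenated text runs of each w:p paragraph element."""
    for para in root.iter(f"{{{W_NS}}}p"):
        yield "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))


def make_sample_docx():
    """Build a tiny in-memory archive holding only a content document (illustration only)."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> braille</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr(CONTENT_PART, xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    root = read_content_document(make_sample_docx())
    for text in paragraph_texts(root):
        print(text)
```

Since the content document is ordinary XML, an approach like this needs only a
zip reader and an XML parser rather than a large conversion library. Mapping
the resulting elements to braille semantics is a separate step that this
sketch does not attempt.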




