Hi all.

We might want to consider using external tools such as the DAISY Pipeline to support some formats. We could even consider re-introducing Tika but only as an ancillary tool. We dropped it because it was rolled into DocumentManager and only has limited usefulness as an XML parser. In most cases we are more interested in going from text, RTF, or some other text format to XML and not the other way around.

My experience is that converting from one document file to another is actually the easy part. Doing something with them to produce clean results is the more difficult part.

Keith Creasy
Software Developer
American Printing House for the Blind
KCreasy@xxxxxxx
Phone: 502.895.2405
Skype: keith537

-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx [mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John Gardner
Sent: Tuesday, May 28, 2013 10:44 AM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: BrailleBlaster download size

Michael, the original specs called for supporting such popular formats. My thought is that it is very important to support the docx format initially but that non-XML formats could be postponed to version 2. Or at least 1.1.

John G

John Gardner | President | 541.754.4002 x 200 | www.viewplus.com

PRIVILEGED AND CONFIDENTIAL: This message and any files transmitted with it may be proprietary and are intended solely for the use of the individual to whom they are addressed. If you are not the intended recipient, any use, copying, disclosure, dissemination or distribution is strictly prohibited; please notify the sender and delete the message. ViewPlus Technologies, Inc. accepts no liability for damage of any kind resulting from this email.
-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx [mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of Michael Whapples
Sent: Monday, May 27, 2013 11:24 PM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: BrailleBlaster download size

Last week I made an installer version for ViewPlus and, as the Tika dependency was not listed on the BrailleBlaster website or in the README files, I did not include it. There were no compilation issues, so I take it that it is not used and can safely be removed.

However, this does raise a wider question: what formats do we wish to support? What I am really getting at is whether we wish to support non-XML formats such as the old binary Word document format (.doc) or PDF, another popular format. I feel it would be unwise to ignore these formats. Even if we convert them to an XML format behind the scenes, the user might expect to simply be able to open some of these non-XML formats.

Michael Whapples

On 28/05/2013 06:00, John J. Boyer wrote:
> I think it should be removed. If we want to import docx files we can
> use the content document which is xml and which liblouisutdml can handle.
> Making a semantic-action file for it would be easy.
>
> John
>
> On Tue, May 28, 2013 at 12:50:47AM -0400, Vic Beckley wrote:
>> Keith,
>>
>> This may not be possible but I was thinking that since we aren't
>> using the tika-app-1.1.jar file now that it could be removed from the
>> package before the beta is released. The entire installer for Windows
>> is about 30 MB, with that one file taking up approximately 25 MB of
>> that total size. Removing it would significantly reduce the size of
>> BrailleBlaster. I don't know at this point if the decision has been
>> made whether we are going to use this file later on or not. Any thoughts?
>>
>>
>> Best regards from Ohio,
>>
>> Vic