You might find this comparison of in-memory XML representations interesting: http://softwaredevscott.spaces.live.com/blog/cns!1A9E939F7373F3B7!443.entry
If you're really concerned about memory usage and don't want to use XML-to-Java bindings, you might consider implementing something like the SAXON TinyTree (http://www.stylusstudio.com/api/saxon8/net/sf/saxon/tinytree/package-summary.htm), which tokenizes strings and can significantly reduce memory footprint.
Cheers
Chris

On Nov 13, 2010, at 6:24 AM, John J. Boyer wrote:
After thinking some more, I agree with you. In our scripts which call the Java runtime we can simply specify minimum memory as 200 MB and maximum memory as 500 MB.

John

On Sat, Nov 13, 2010 at 08:54:46AM -0500, Sina Bahram wrote:

Memory compaction can actually happen for free depending on what technologies we use. Also, I wouldn't worry too much about this because these kinds of optimizations can be performed once the product is complete.

Don't forget that the entropy in this data is so low that it can almost be considered partially ordered, e.g. compression ratios of more than 50% on any arbitrary subsection.

Take care,
Sina

-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx [mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John J. Boyer
Sent: Saturday, November 13, 2010 2:11 AM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: Start of Editor

What I want to do is make the value of the index attribute more compact before it is incorporated into the DOM tree. Take the following example.

Suppose we have a paragraph for which the braille translation is 500 characters long. For each of these characters we have a decimal number followed by a comma. Taking only the characters above position 100, each will correspond to four characters in the attribute value. Each of these four characters will be translated into a 16-bit (two-byte) character in the string representing the value. This will result in 400 * 4 * 2 bytes needed to represent this attribute.

I've been entertaining the possibly wild idea of converting the decimal number at each position to a short and then putting all these shorts together into a new string which would replace the old. This would require only 800 bytes. Subsequent processing would also be easier, since the program would only need to convert the string into a char array and then index through it. Of course, such a string would have to be reconverted to the original on output.

I'm perhaps overly concerned about memory usage.
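The packing idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the project: the names IndexCompactor, compact and expand are hypothetical, the index value is assumed to be a comma-separated list of non-negative decimal numbers that each fit in 16 bits, and a Java char (an unsigned 16-bit value) stands in for the short mentioned in the message.

```java
// Hypothetical helper, not part of any existing BrailleBlaster or liblouis API.
final class IndexCompactor {

    // Packs "100,101,...,499," into a String holding one 16-bit char per number.
    static String compact(String index) {
        if (index.isEmpty()) {
            return "";
        }
        // split drops the empty token left by a trailing comma
        String[] parts = index.split(",");
        char[] packed = new char[parts.length];
        for (int i = 0; i < parts.length; i++) {
            int value = Integer.parseInt(parts[i].trim());
            // Assumption: every position fits in an unsigned 16-bit char
            if (value < 0 || value > Character.MAX_VALUE) {
                throw new IllegalArgumentException("position out of range: " + value);
            }
            packed[i] = (char) value;
        }
        return new String(packed);
    }

    // Rebuilds the comma-separated decimal form (without a trailing comma).
    static String expand(String packed) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < packed.length(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append((int) packed.charAt(i));
        }
        return sb.toString();
    }
}
```

For the 400 positions from 100 to 499, the attribute text is 1,600 chars and the packed form is 400 chars, which matches the 400 * 4 * 2 = 3,200 bytes versus 800 bytes in the example (character data only, ignoring object overhead). The packed string can contain characters that are not legal in XML, so it would have to be expanded again before output.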
My estimate is that an ordinary document could take over 100 MB of memory. Large ones might take twice that. Compaction of the index attribute would reduce this by maybe two-thirds. If we don't compact we should assign about 400 MB to the JVM.

John

On Fri, Nov 12, 2010 at 08:48:22PM -0500, Sina Bahram wrote:

Modifying that code is not how you'd want to go about doing this. You'd want to either hook in externally, through published and accepted means, or simply do the processing on the DOM, where such semantics are more clearly available anyway.

Take care,
Sina

-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx [mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John J. Boyer
Sent: Friday, November 12, 2010 8:24 PM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: Start of Editor

Thanks for the info. I'm still just a beginning Java programmer, so I have some wild ideas. What I was thinking of was using SAX to modify some things before JDOM builds the parse tree. For example, putting the value of the index attribute on the <brl> tag in a more compact and easily processed form.

John

On Fri, Nov 12, 2010 at 08:08:26AM -0800, Chris von See wrote:

I would be interested to hear more about why you want to modify JDOM. I've built many parse trees and other similar structures using the SAX callbacks documented in the "org.xml.sax" package in the JDK; if you're using Java 6 you can also use the stream-oriented StAX classes in the "javax.xml.stream" package, which may be a bit easier to deal with. In both cases you'd be using SAX or StAX to read the input tree and JDOM to build your parse tree instead of modifying the tree built by JDOM, which is a little bit more work but IMO much cleaner and easier to maintain.

Cheers
Chris

On Nov 12, 2010, at 12:27 AM, John J. Boyer wrote:

A few weeks ago I sent the eclipse.swt TextEditor as an attachment. With this message I am attaching SAXBuilderDemo.java from the JDOM samples directory.
I am also resending TextEditor.java for your convenience. I think that we can combine these two to get a start on our editor.

One thing I would like advice on is the copyright notice that should appear at the beginning of each class. I could copy the liblouis copyright notice with appropriate changes, but you will notice that the authors already have some conditions that they wish respected. We will be modifying their code, of course. We may also want to dig down to the SAX level so we can modify the parse tree as it is being built. This will mean modifying a few classes in JDOM itself. Eclipse may also have some conditions on the reuse of their code.

Please give your comments and suggestions regarding the copyright issue.

Thanks,
John B.

--
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities

<SAXBuilderDemo.java><TextEditor.java>

--
My websites:
GodTouches Digital Ministry, Inc. http://www.godtouches.org
Abilitiessoft, Inc. http://www.abilitiessoft.com
Location: Madison, WI, USA
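The approach discussed in this thread, changing the index attribute at the SAX level without touching JDOM's own classes, might look something like the sketch below. It uses only the standard org.xml.sax classes from the JDK. The class names IndexAttributeFilter and FilterDemo and the pack helper are hypothetical, and the downstream handler here is a stand-in for whatever builds the tree. JDOM's SAXBuilder has, as far as I know, a setXMLFilter method for plugging in such a filter, but treat that as an assumption and check the JDOM version in use.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: rewrites the "index" attribute of <brl> elements
// before the event reaches the downstream content handler.
final class IndexAttributeFilter extends XMLFilterImpl {

    IndexAttributeFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        int i = atts.getIndex("index");
        if ("brl".equals(qName) && i >= 0) {
            AttributesImpl copy = new AttributesImpl(atts);
            copy.setValue(i, pack(atts.getValue(i)));
            atts = copy;
        }
        super.startElement(uri, localName, qName, atts);
    }

    // Assumption: each comma-separated number is non-negative and fits in 16 bits.
    static String pack(String index) {
        StringBuilder sb = new StringBuilder();
        for (String part : index.split(",")) {
            String p = part.trim();
            if (!p.isEmpty()) {
                sb.append((char) Integer.parseInt(p));
            }
        }
        return sb.toString();
    }
}

final class FilterDemo {

    // Parses xml through the filter and returns the index value that the
    // downstream handler sees on the last <brl> element (null if none).
    static String indexSeenDownstream(String xml) {
        final String[] seen = new String[1];
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            IndexAttributeFilter filter = new IndexAttributeFilter(reader);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    if ("brl".equals(qName)) {
                        seen[0] = atts.getValue("index");
                    }
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

The filter leaves every other element and attribute untouched, so the tree builder downstream only ever sees the compact form. Because the packed value may contain characters that are not legal in XML, the reverse conversion would be needed before the document is written out again.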