[brailleblaster] Re: Start of Editor

  • From: Chris von See <chris@xxxxxxxxxxxxx>
  • To: brailleblaster@xxxxxxxxxxxxx
  • Date: Sat, 13 Nov 2010 10:14:05 -0800

You might find this comparison of in-memory XML representations interesting: http://softwaredevscott.spaces.live.com/blog/cns!1A9E939F7373F3B7!443.entry


If you're really concerned about memory usage and don't want to use XML-to-Java bindings, you might consider implementing something like the SAXON TinyTree (http://www.stylusstudio.com/api/saxon8/net/sf/saxon/tinytree/package-summary.htm ) that tokenizes strings and can significantly reduce memory footprint.


Cheers
Chris

On Nov 13, 2010, at 6:24 AM, John J. Boyer wrote:

After thinking some more, I agree with you. In our scripts which call
the Java runtime we can simply specify minimum memory as 200 MB and
maximum memory as 500 MB.
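
As a rough sketch of how those limits could be checked from inside the program: the standard launcher options -Xms and -Xmx set the initial and maximum heap size, and Runtime.maxMemory() reports the ceiling the JVM actually adopted. The class name HeapCheck is hypothetical and is not part of any existing script.

```java
// Launch with, for example:  java -Xms200m -Xmx500m HeapCheck
// -Xms sets the initial heap size and -Xmx the maximum heap size.
final class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in whole megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

The reported figure can be somewhat lower than the -Xmx value, depending on the garbage collector in use.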

John

On Sat, Nov 13, 2010 at 08:54:46AM -0500, Sina Bahram wrote:
Memory compaction can actually happen for free depending on what technologies we use. Also, I wouldn't worry too much about this because these kinds of optimizations can be performed once the product is complete.

Don't forget that the entropy in this data is so low that it could almost be considered partially ordered, e.g. compression ratios of more than 50% on any arbitrary subsection.
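
One way to measure this on real data is to run an index string through java.util.zip.Deflater and compare sizes. The sketch below is only a probe: the class name IndexCompressionProbe and the synthetic sampleIndex data (consecutive numbers, each followed by a comma) are assumptions for illustration, and real index values may compress better or worse.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class IndexCompressionProbe {

    /** Returns the number of bytes the text occupies after DEFLATE compression. */
    static int deflatedSize(String text) {
        byte[] input = text.getBytes(StandardCharsets.US_ASCII);
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Drain the compressor; only the byte count matters here.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** Hypothetical stand-in for an index attribute: "0,1,2,...". */
    static String sampleIndex(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = sampleIndex(500);
        System.out.println(sample.length() + " bytes raw, "
                + deflatedSize(sample) + " bytes deflated");
    }
}
```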

Take care,
Sina

-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx [mailto:brailleblaster-bounce@xxxxxxxxxxxxx ] On Behalf Of John J. Boyer
Sent: Saturday, November 13, 2010 2:11 AM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: Start of Editor

What I want to do is make the value of the index attribute more compact before it is incorporated into the DOM tree. Take the following example.
Suppose we have a paragraph for which the braille translation is 500 characters long. For each of these characters we have a decimal number followed by a comma. Taking only the 400 characters above position 100, each will correspond to four characters in the attribute value (three digits plus the comma). Each of these four characters will be translated into a 16-bit (two-byte) character in the string representing the value. This will result in 400 * 4 * 2 = 3200 bytes needed to represent this attribute. I've been entertaining the possibly wild idea of converting the decimal number at each position to a short and then putting all these shorts together into a new string which would replace the old. This would require only 800 bytes.
Subsequent processing would also be easier, since the program would only need to convert the string into a char array and then index through it.
Of course, such a string would have to be reconverted to the original on output.
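
A minimal sketch of that idea follows, assuming every value fits in one unsigned 16-bit Java char (0 to 65535) and that each number in the attribute is followed by a comma as described. The class and method names (IndexCodec, pack, unpack) are hypothetical.

```java
final class IndexCodec {

    /** Packs text such as "100,101,102," into one 16-bit char per number. */
    static String pack(String decimalList) {
        StringBuilder packed = new StringBuilder();
        for (String token : decimalList.split(",")) {
            if (token.isEmpty()) {
                continue; // an empty attribute yields one empty token
            }
            int value = Integer.parseInt(token.trim());
            if (value < 0 || value > Character.MAX_VALUE) {
                throw new IllegalArgumentException("Does not fit in 16 bits: " + value);
            }
            packed.append((char) value);
        }
        return packed.toString();
    }

    /** Restores the original comma-terminated decimal text for output. */
    static String unpack(String packed) {
        StringBuilder text = new StringBuilder();
        for (char c : packed.toCharArray()) {
            text.append((int) c).append(',');
        }
        return text.toString();
    }
}
```

For the 400 three-digit entries in the example, pack turns 1600 chars (3200 bytes) into 400 chars (800 bytes). One caveat: many of the resulting chars, such as control characters and unpaired surrogates, are not legal in XML text, so the packed form is only usable in memory and has to go through unpack before the document is written out. An XML library may also refuse such a string as an attribute value, in which case the packed data would need to be kept alongside the tree rather than in it.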

I'm perhaps overly concerned about memory usage. My estimate is that an ordinary document could take over 100 MB of memory. Large ones might take twice that. Compaction of the index attribute would reduce this by maybe two-thirds. If we don't compact, we should assign about 400 MB to the JVM.

John

On Fri, Nov 12, 2010 at 08:48:22PM -0500, Sina Bahram wrote:
Modifying that code is not how you'd want to go about doing this.

You'd want to either hook in externally, through published and
accepted means, or simply do the processing on the dom where such semantics are more clearly available anyways.

Take care,
Sina

-----Original Message-----
From: brailleblaster-bounce@xxxxxxxxxxxxx
[mailto:brailleblaster-bounce@xxxxxxxxxxxxx] On Behalf Of John J.
Boyer
Sent: Friday, November 12, 2010 8:24 PM
To: brailleblaster@xxxxxxxxxxxxx
Subject: [brailleblaster] Re: Start of Editor

Thanks for the info. I'm still just a beginning Java programmer, so I
have some wild ideas. What I was thinking of was using SAX to modify
some things before JDOM builds the parse tree. For example, putting the value of the index attribute on the <brl> tag in a more
compact and easily processed form.

John

On Fri, Nov 12, 2010 at 08:08:26AM -0800, Chris von See wrote:
I would be interested to hear more about why you want to modify JDOM.
I've built many parse trees and other similar structures using the
SAX callbacks documented in the "org.xml.sax" package in the JDK; if
you're using Java 6 you can also use the stream-oriented StAX
classes in the "javax.xml.stream" package, which may be a bit easier
to deal with.  In both cases you'd be using SAX or StAX to read the
input tree and JDOM to build your parse tree instead of modifying
the tree built by JDOM, which is a little bit more work but IMO much
cleaner and easier to maintain.
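
A minimal sketch of the StAX side of this approach, using only JDK classes: it walks the input with XMLStreamReader and collects the index attribute of each <brl> element. The class name BrlIndexReader is hypothetical, and the step that builds the JDOM tree from these events is left out so the example does not depend on JDOM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class BrlIndexReader {

    /** Returns the index attribute of every <brl> element, in document order. */
    static List<String> readIndexValues(String xml) {
        List<String> values = new ArrayList<String>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // next() advances the cursor and returns the event type.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "brl".equals(reader.getLocalName())) {
                    // A null namespace URI means the namespace is not checked.
                    String index = reader.getAttributeValue(null, "index");
                    if (index != null) {
                        values.add(index);
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        return values;
    }
}
```

This is the point at which an attribute value could be transformed before any tree node is created, which avoids changing JDOM's own classes.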

Cheers
Chris



On Nov 12, 2010, at 12:27 AM, John J. Boyer wrote:

A few weeks ago I sent the eclipse.swt TextEditor as an attachment.
On this message I am attaching SAXBuilderDemo.java from the JDOM
samples directory. I am also resending TextEditor.java for your
convenience. I think that we can combine these two to get a start on our editor.

One thing I would like advice on is the copyright notice that
should appear at the beginning of each class. I could copy the
liblouis copyright notice with appropriate changes, but you will
notice that the authors already have some conditions that they wish
respected. We will be modifying their code, of course. We may also
want to dig down to the SAX level so we can modify the parse tree
as it is being built. This will mean modifying a few classes in
JDOM itself. Eclipse may also have some conditions on the reuse of their code.

Please give your comments and suggestions regarding the copyright
issue.

Thanks,
John B.

--
John J. Boyer; President, Chief Software Developer Abilitiessoft,
Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities

<SAXBuilderDemo.java><TextEditor.java>



--
My websites:
GodTouches Digital Ministry, Inc. http://www.godtouches.org
Abilitiessoft, Inc. http://www.abilitiessoft.com
Location: Madison, WI, USA






