[cad-linux-dev] Re: testing: intro

From: "cr88192" <cr88192@xxxxxxxxxxx>
To: <cad-linux-dev@xxxxxxxxxxxxx>
Date: Fri, 5 Dec 2003 16:00:35 -0800

From: "Jeffrey McGrew" <JMcGrew@xxxxxxxxxxxxxx>
To: <cad-linux-dev@xxxxxxxxxxxxx>
Date: Fri, 5 Dec 2003 12:29:11 -0800
>> (though for decent performance something like this
>> would require a binary format...)
>
>Why?
>
this was taken out of context.
I meant an oodb-style format (ie: one where you perform queries and
insertions on a file, without loading/saving it all at once).

>Just thinking along the same lines. And after working
>with Linux & Radiance & Cygwin for a little while, I'm
>really starting to see the benefits of plaintext. Start
>looking at things like CVS and merging, and it starts
>looking like amazing things could be leveraged out of
>it.
>

>So, not being an experienced programmer, where does this
>basic assumption come from that binary files perform
>better than plaintext files in most cases?
>
I am annoyed that you are calling me a newbie. I apologize if that was not
the intended meaning (I am not fully able to disambiguate this...).

but anyways, a problem with plaintext is that records are variable sized, so
they can't easily be indexed or updated in place with some algos (eg:
b-trees). for a large pool of geometry and a text format, one would end up
loading and saving the whole file each time.
in the case of deltas, one still has to rewrite the file to apply the
delta, which again is not very fast for a large file.

if one uses a b-tree style encoding, one can incrementally rewrite bits of
the file, and not need an explicit "save" operation (making large files a
lot more practical). combined with a log, one can also have various kinds
of undo.
but a problem comes up here:
if a binary format is used (especially a b-tree+log) then the openness of
the format is largely reduced...
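
a rough sketch of the incremental-rewrite idea, assuming plain fixed-size
records (three doubles per point) rather than a real b-tree. the record
layout, the names read_point/write_point/undo_last, and the in-memory log
are all made up for illustration, and an io.BytesIO stands in for a file
opened in "r+b" mode:

```python
import io
import struct

# Hypothetical fixed-size record: one 3D point as three little-endian doubles.
POINT = struct.Struct("<3d")

def read_point(f, index):
    """Read record number `index` without touching the rest of the file."""
    f.seek(index * POINT.size)
    return POINT.unpack(f.read(POINT.size))

def write_point(f, index, xyz, log):
    """Overwrite one record in place, logging the old value for undo."""
    log.append((index, read_point(f, index)))
    f.seek(index * POINT.size)
    f.write(POINT.pack(*xyz))

def undo_last(f, log):
    """Restore the most recently overwritten record from the log."""
    index, old = log.pop()
    f.seek(index * POINT.size)
    f.write(POINT.pack(*old))

# In-memory stand-in for a geometry file holding 1000 points.
f = io.BytesIO(b"".join(POINT.pack(float(i), float(i), float(i))
                        for i in range(1000)))
log = []
write_point(f, 500, (1.5, 2.5, 3.5), log)
changed = read_point(f, 500)
undo_last(f, log)
restored = read_point(f, 500)
```

only the 24 bytes of record 500 are written each time; with variable-sized
text records the same edit could shift everything after it.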

again: I am primarily considering use of a text based format (s-expressions,
aka "lisp syntax").
my planned approach is that geometry will likely be split into multiple
files, though my cad is not taking this approach as of present (it is a bit
too early in the development...).

>I'm not challenging you, or disagreeing with you at all,
>I just see this stated a lot as fact and not being a
>computer programmer I don't know why this is the case,
>for it's not obvious to me.
>
>Jeffrey

various reasons.
you can do fancy stuff with binary formats (eg: straightforward incremental
rewrites).
parsing is very much simplified compared with text, and often can be done
faster.
eg: a tlv (tag/length/value) encoding can be used, where parsing often
consists of:
read the tag;
read the len;
read the data (len bytes).

the tag and data are then passed to a switch (or similar) construct, which
may then decompose the data more or do whatever with it.
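
a minimal sketch of the tlv loop described above. the layout (1-byte tag,
4-byte little-endian length) and the tag values TAG_POINT/TAG_NAME are
assumptions picked for the example, not any existing format; an if-chain
stands in for the switch:

```python
import struct

# Assumed header: 1-byte tag, 4-byte little-endian length (no padding).
HEADER = struct.Struct("<BI")
TAG_POINT = 1  # hypothetical tag values
TAG_NAME = 2

def encode(tag, data):
    """Build one tag/length/value record."""
    return HEADER.pack(tag, len(data)) + data

def parse_tlv(buf):
    """Yield (tag, data) pairs from a bytes object."""
    pos = 0
    while pos < len(buf):
        tag, length = HEADER.unpack_from(buf, pos)  # read the tag and len
        pos += HEADER.size
        data = buf[pos:pos + length]                # read len bytes of data
        if len(data) != length:
            raise ValueError("truncated record")
        pos += length
        yield tag, data

def decode(tag, data):
    """Dispatch on the tag and decompose the data further."""
    if tag == TAG_POINT:
        return ("point", struct.unpack("<3d", data))
    if tag == TAG_NAME:
        return ("name", data.decode("utf-8"))
    return ("unknown", data)  # unknown tags can be skipped, since len is known

stream = (encode(TAG_NAME, b"wall")
          + encode(TAG_POINT, struct.pack("<3d", 1.0, 2.0, 3.0)))
records = [decode(tag, data) for tag, data in parse_tlv(stream)]
```

note there is no scanning for delimiters: the length says exactly where the
next record starts.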

with text one has to do things, eg:
reading in a line (or otherwise managing buffering);
reading tokens (ie: stepping along until whitespace is found or such);
using string comparisons;
...

these have some cost as well (though minor compared to the cost of reading
the data to begin with).
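
for comparison, a small sketch of the text side, assuming a one-record-per-
line s-expression syntax like "(point 1.0 2.0 3.0)". the symbols "point" and
"name" and the function names are hypothetical; this is not the format my
cad uses:

```python
def tokenize(text):
    """Split s-expression text into "(", ")" and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Build nested lists from a token list, consuming it from the front."""
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    try:
        return float(tok)  # numeric atom
    except ValueError:
        return tok         # symbol atom

def decode_text(expr):
    """String comparisons on the head symbol, in place of a tag switch."""
    if expr[0] == "point":
        return ("point", tuple(expr[1:]))
    if expr[0] == "name":
        return ("name", expr[1])
    return ("unknown", expr)

lines = ["(name wall)", "(point 1.0 2.0 3.0)"]
text_records = [decode_text(parse(tokenize(line))) for line in lines]
```

every character gets looked at to find token boundaries, and numbers have to
be converted from text, which is where the extra cost comes from.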

misc:
I did not receive an email for this response.
can someone indicate how I am supposed to respond on this list (or if the
email is on some sort of delay...).

all for now.

--
cr88192 at hotmail dot com
http://sourceforge.net/projects/bgb-sys/
http://bgb-sys.sourceforge.net/

Follow-Ups:
- [cad-linux-dev] text <-> binary
  - From: Janek Kozicki
- [cad-linux-dev] Re: testing: intro
  - From: Eric Wilhelm
