[cad-linux-dev] Re: variably formatted text (now with references:)

  • From: Eric Wilhelm <ewilhelm@xxxxxxxxxxxxx>
  • To: cad-linux-dev@xxxxxxxxxxxxx
  • Date: Sat, 6 Dec 2003 18:14:59 -0600

> The following was supposedly scribed by
> cr88192
> on Saturday 06 December 2003 03:38 pm:

>this is giving me an idea:
>a line is defined as:
><control> <rest*>
>
>and header info is also defined in this form.
>control is a glob of characters, the first character tells about what it is
>and the remaining ones may be used as a code or such.
>
>eg:
<removing quote characters for clarity, see # comments>
# comment, #...
$ raw data, $...
( begin, (<tag> ...
) end,
! declared form, !<tag> <type> <fields*>
# "!" character is going to cause shell problems
@ declared type, @<tag> ...

# thus an example:
!0 cube minx miny minz maxx maxy maxz
# so when we have @0 below, the data following it is a cube?
!1 sphere orgx orgy orgz rad
!2 union name
# I think the unseparated "type field field" could be problematic
(2 cube-and-sphere
@0 -10 -10 -10 10 10 10
# if I can put a comment here, my comment is that I think the 
# "in-line" content of this cube is harder to read and more difficult
# to support in the program (not necessarily wrt parsing, but more
# difficult in terms of the scalability of the format into more complex
# programs.)
@1 0 0 20 10
)
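
To see how much work the quoted format asks of a reader, here is a minimal Python sketch of a parser for it. The names Node and parse are made up for illustration, values are kept as plain strings, and "$" raw-data lines are simply skipped, since the proposal does not say what they should carry.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str            # the code following the control character, e.g. "0"
    type_name: str      # the type declared for that tag, e.g. "cube"
    values: dict        # field name -> raw string value
    children: list = field(default_factory=list)


def parse(text):
    """Parse the quoted line format into a list of top-level Node objects."""
    forms = {}                        # tag -> (type name, field names)
    root = Node("", "root", {})
    stack = [root]                    # innermost open "(" block is last
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] == "#":
            continue                  # blank line or comment
        control, *rest = line.split()
        kind, tag = control[0], control[1:]
        if kind == "!":               # declared form: !<tag> <type> <fields*>
            forms[tag] = (rest[0], rest[1:])
        elif kind in "(@":            # "(" opens a block, "@" is a leaf
            type_name, names = forms[tag]
            node = Node(tag, type_name, dict(zip(names, rest)))
            stack[-1].children.append(node)
            if kind == "(":
                stack.append(node)
        elif kind == ")":             # close the innermost open block
            stack.pop()
        # "$" raw-data lines are ignored in this sketch
    return root.children
```

Feeding it the cube-and-sphere example gives one union node with a cube and a sphere as children.  Note that every "@" line is meaningless without the "!" declaration that precedes it, which is the readability cost mentioned in the comments above.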

I like your extension of this "declared tags" idea, but as you said, the 
readability suffers.   However, what if we look at your union as part of a 
larger model and broken out into the filesystem?

$tree -a model/
model/
|-- cubes
|   |-- .cubes
|   `-- anon.cube
|-- spheres
|   |-- .spheres
|   `-- anon.sphere
`-- unions
    |-- .unions
    `-- set0.union

Say that model/unions/.unions contains the format definitions for all unions 
in the "unions" directory (the directory name itself could be arbitrary; what 
identifies it is the "./.unions" definition file which it contains.)  Now 
let's say that "model/unions/set0.union" contains the following:

cube-and-sphere:$cubes/anon.cube#7,$spheres/anon.sphere#9

Here, the $cubes variable holds the value "model/cubes/" (yes, that needs 
work) and the file anon.cube contains a cube named "7".  That file looks like:

7:     -10,-10,-10   :   10,10,10
8:  "$points/anon.point#9"  : 15,15,15
10:  "$points/control.point#originA" :  (+) 9, -4, 3.442 : 0.1132,0,3.225

Sorry to run off there.  That's just me chasing after that parametric format 
which is capable of having named and anonymous references via child objects 
stored in other files.    The "double quoted" parts are intended to be a sort 
of variable interpolation.  After all, we did specify that the coordinates 
would be a list, and a reference is not a list unless we expand it into one 
somehow (though in memory you likely store the coordinates behind a reference 
(pointer) anyway, so maybe the interpolation is just an indicator that you 
should load that line and store a pointer to it.)  That third one 
(anon.cube#10) is using relative coordinates for its second point, where the 
normal vector for the plane on which the relative coordinates are based is 
defined by the third value set (as specified by the model/cubes/.cubes 
control-file, quoted below.)
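
As a rough illustration of the reference syntax used above ($variable/file#name), here is a Python sketch that splits a reference into its parts and turns it into a file path plus an object name.  The function names and the "variables" dictionary are assumptions made for the example; nothing here settles how $cubes actually gets its value.

```python
import re

# $<variable>/<file>#<object name>, e.g. $cubes/anon.cube#7
REF = re.compile(r"^\$(?P<var>\w+)/(?P<file>[^#]+)#(?P<name>.+)$")


def split_ref(ref):
    """Split a reference (optionally double-quoted) into (var, file, name)."""
    m = REF.match(ref.strip().strip('"'))
    if m is None:
        raise ValueError(f"not a reference: {ref!r}")
    return m["var"], m["file"], m["name"]


def resolve(ref, variables):
    """Return (path, object name) using a dict of variable values."""
    var, filename, name = split_ref(ref)
    return variables[var].rstrip("/") + "/" + filename, name


def parse_union(line):
    """Split 'name:ref,ref,...' into the union name and its member refs."""
    name, members = line.split(":", 1)
    return name.strip(), [split_ref(m) for m in members.split(",")]
```

With variables = {"cubes": "model/cubes/"}, resolve("$cubes/anon.cube#7", variables) points at the object named "7" in model/cubes/anon.cube.  Loading that line and keeping a pointer to it is left out of the sketch.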

# model/cubes/.cubes  -- control-file for cubes in this directory
suffix:    .cube
format:  $name:@min:@max:@basenormal [0,0,1]

I'm using the @ to indicate that the key is a list (and toying with the idea 
of specifying a default value within the control file in [brackets].)
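
To make the control-file idea concrete, here is a Python sketch that reads the "format:" value and uses it to parse a record line, filling in the [bracketed] default when a trailing field is missing.  It only handles literal number lists such as the cube named "7"; quoted references and the "(+)" relative marker are not handled, and the function names are invented for the example.

```python
def parse_format(spec):
    """Turn '$name:@min:@max:@basenormal [0,0,1]' into field descriptions.

    Each field becomes (key, is_list, default), where default is the raw
    text found in [brackets] or None.
    """
    fields = []
    for part in spec.split(":"):
        part = part.strip()
        default = None
        if part.endswith("]") and "[" in part:
            part, _, default = part[:-1].partition("[")
            part = part.strip()
        fields.append((part[1:], part[0] == "@", default))
    return fields


def parse_list(text):
    """Parse a comma-separated list of numbers."""
    return [float(x) for x in text.split(",")]


def parse_record(line, fields):
    """Parse one colon-separated record line according to the fields."""
    cells = [c.strip() for c in line.split(":")]
    record = {}
    for i, (key, is_list, default) in enumerate(fields):
        cell = cells[i] if i < len(cells) and cells[i] else default
        if cell is None:
            raise ValueError(f"missing value for {key!r}")
        record[key] = parse_list(cell) if is_list else cell
    return record
```

Because the field list lives in the control file, a one-off script only needs these few lines to read every cube in the directory.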

Though I haven't messed with it before, I think seeking by lines is fairly 
quick in most languages.  Correct me if I'm way off.  

Also, what is the possibility of constructing a B-tree as an index of which 
named object is on which line number?  (All objects are named, but the 
anonymous ones are really simply numbered.)  You don't have to index anything 
beyond the line number because the file name is part of the reference.  This 
also allows debugging to be done with:
        'grep "^object_origin" $points/control.point'  
(assuming that $points is a defined environment variable containing that 
directory name (something which could be done with a script or control-file 
contained in the toplevel of the model))
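
A sketch of the index idea in Python, with a plain dictionary standing in for the B-tree.  One assumption worth stating: a text file has no built-in table of line positions, so the sketch also records each line's byte offset, which lets a lookup seek straight to the line instead of counting newlines.  The names build_index and fetch are made up, and every non-comment line is assumed to be a record whose name is the text before the first colon.

```python
import io


def build_index(stream):
    """Map each object name to (line number, byte offset).

    The stream must be opened in binary mode so offsets are exact.
    """
    index = {}
    offset = 0
    for lineno, raw in enumerate(iter(stream.readline, b""), start=1):
        name = raw.split(b":", 1)[0].strip().decode()
        if name and not name.startswith("#"):
            index[name] = (lineno, offset)
        offset += len(raw)
    return index


def fetch(stream, index, name):
    """Seek to the indexed line and return it without the newline."""
    _lineno, offset = index[name]
    stream.seek(offset)
    return stream.readline().decode().rstrip("\n")


# Usage with an in-memory stand-in for model/cubes/anon.cube
SAMPLE = (
    b'7:     -10,-10,-10   :   10,10,10\n'
    b'8:  "$points/anon.point#9"  : 15,15,15\n'
)
stream = io.BytesIO(SAMPLE)
index = build_index(stream)
print(fetch(stream, index, "8"))
```

The index would have to be rebuilt or patched whenever the file is edited by hand, which is the same stale-data concern that applies to any cache.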

What I'm driving at here is that all information relevant to the entity is 
contained on one line and that control information is contained in as few 
places as possible (but as close to the content as is feasible without 
actually being IN the content file.)  I guess this is similar to the XML DTD 
setup?

I'm also looking for this to be compact enough to be _quickly_ human-readable 
while making it simple to parse, yet extendable:)  As if that were not enough 
to ask, I'd also like it to facilitate simplification of a parametric model 
with complex links and references.  Example:  you want to do a material 
take-off with a one-off script using a snapshot of a parametric model.  You 
don't want that script to have to worry itself about the intricacies of 
dereferencing and calculating the endpoints of a conduit run.  This leaves 
you with a couple of options.  You either have some sort of format-conversion 
(yuck (stale cache, etc.)) or you have a library with a multi-mode interface 
(2D/3D/parametric.)  I think I am leaning toward the library, since this 
gives a much better solution from the programmer's standpoint and you don't 
have to explain things like stale data to users.  

But, this is getting off-track, since the real point is that the format can be 
used in simple or complex ways with full upward (static->relational) 
compatibility and the backward (relational->static) compatibility provided 
through an interface / extraction / compiling of some sort.

--Eric
-- 
"It ain't those parts of the Bible that I can't understand that 
bother me, it's the parts that I do understand."
                                        --Mark Twain

