[haiku-development] Re: Proposal and questions: Support of BFS attributes

  • From: Siarzhuk Zharski <zharik@xxxxxx>
  • To: <haiku-development@xxxxxxxxxxxxx>
  • Date: Tue, 23 Oct 2012 20:37:03 +0200

Hello,

Alexander G. M. Smith wrote on 23.10.2012 16:44:
It was decided to use the tar format extensions introduced in POSIX.1-2001. Those extensions are presented as a list of strings in a special (so-called
PaxHeader) header before the standard header of every file. Every
string that corresponds to one BFS attribute has the following format and
should be UTF-8 encoded:

<length> <key>=<type> <data><CR>

http://www.mkssoftware.com/docs/man4/pax.4.asp
[...]
But doesn't each extended header have to fit into one 512 byte block?

Those attributes are not stored in the 512-byte pax header block but in the "pax Extended Header", which has no fixed structure. I already have draft support for BFS attributes in both gnutar and libarchive's bsdtar, and they have no problems handling PaxHeaders longer than 512 bytes. But I should check this more carefully now.
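
To illustrate the point about size, here is a rough Python sketch of how one such record could be built. It assumes the POSIX pax record layout, where the leading length counts the whole record including its own digits and the record ends with a newline. The key name "HAIKU.example" and the helper name pax_record are purely hypothetical and are not taken from my draft implementation.

```python
def pax_record(key: str, value: bytes) -> bytes:
    """Build one pax extended header record: b'<length> <key>=<value>\\n'."""
    body = b" " + key.encode("utf-8") + b"=" + value + b"\n"
    # The length field counts its own digits too, so repeat until it settles.
    length = len(body)
    while True:
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


# Hypothetical key and payload: a type code followed by 2000 hex characters.
record = pax_record("HAIKU.example", b"LONG " + b"0" * 2000)
declared = int(record.split(b" ", 1)[0])
print(declared == len(record), len(record) > 512)
```

Nothing in the record itself ties it to a 512-byte boundary; the archiver simply pads the extended header data up to the next full block.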

2) Endianness - [...] But many Haiku attributes are binary by their nature and can become invalid if simply extracted on a system with a
different byte order. [...]

Always store the archived bytes in network byte order
(http://en.wikipedia.org/wiki/Endianness#Endianness_in_networking
and RFC1700).  Then swap as needed when restoring the file.  Though

Looks like you are right here. Thanks.

you'll need a small database of the attribute types to figure out
which ones need swapping (and be open to adding new types when someone
invents one).  Keep in mind that some attributes can be considered to
be arrays of values.  For example, an attribute with 12 bytes of data
and a type of 'LONG' (which means a 4 byte integer) can be treated as
an array of three integers. I'm not sure what you'd do with a BMESSAGE
type of attribute.  Don't convert numbers to text and back, just save
the binary, since text doesn't work accurately for floating point numbers.

I can take care of the integer and floating-point types; even RECT and BPNT can be handled correctly, but nobody knows about complex data types like RAWT, MSGG and friends. So it looks like it is the responsibility of the end program to take care of those. :-( By the way, Tracker stores some directory attributes with an "_le" suffix.
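
The "small database of attribute types" idea could look roughly like the Python sketch below. It is only an illustration under assumptions: the table is limited to the types named in this thread, RECT and BPNT are treated as arrays of 32-bit floats, and the names ELEMENT_SIZE, swap_elements and to_network_order are made up for the example. Unknown types (RAWT, MSGG and so on) are passed through untouched.

```python
import sys

# Element width in bytes for each attribute type code (illustrative subset).
ELEMENT_SIZE = {
    "LONG": 4,  # array of 32-bit integers
    "RECT": 4,  # assumed: four 32-bit floats
    "BPNT": 4,  # assumed: two 32-bit floats
}


def swap_elements(data: bytes, width: int) -> bytes:
    """Reverse the byte order of every width-sized element in data."""
    if len(data) % width:
        raise ValueError("data length is not a multiple of the element width")
    return b"".join(data[i:i + width][::-1] for i in range(0, len(data), width))


def to_network_order(type_code: str, data: bytes,
                     host_order: str = sys.byteorder) -> bytes:
    """Return data in big-endian order if the type is known, else unchanged."""
    width = ELEMENT_SIZE.get(type_code)
    if width is None or host_order == "big":
        return data
    return swap_elements(data, width)


# A 12-byte 'LONG' attribute is handled as an array of three integers.
raw = (1).to_bytes(4, "little") + (2).to_bytes(4, "little") + (3).to_bytes(4, "little")
print(to_network_order("LONG", raw, host_order="little").hex())
```

Restoring on extraction would be the same swap applied in the other direction, again only when the host is little-endian.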

3) The size of attributes. It is obvious that bloating the archive by megabytes through conversion to HEX is bad practice. On the other hand, cluttering the archive with extra "BeOS attributes" special files, as the Mac version of libarchive does,
is not nice either. I think some limit should be defined that decides whether the
attribute is HEX-stored in the PaxHeader or stored as a special binary file. For
example 8 or 16 kilobytes. Attributes exceeding this limit should be
handled as a special "attributes" file.

It's already limited to 512 bytes, less with overhead.

See above.
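
The size limit proposed in point 3 could be sketched in Python as follows. This is only an assumption-laden illustration: the limit is taken as 8 KiB (one of the two example values), it is applied to the raw attribute size rather than to the HEX text, and the names INLINE_LIMIT and choose_storage and the two return labels are hypothetical.

```python
import binascii

INLINE_LIMIT = 8 * 1024  # assumed limit, measured on the raw attribute data


def choose_storage(data: bytes, limit: int = INLINE_LIMIT) -> str:
    """Decide where an attribute goes: inline as HEX or in a separate file."""
    return "pax-header-hex" if len(data) <= limit else "attributes-file"


small = bytes(64)
large = bytes(INLINE_LIMIT + 1)
print(choose_storage(small), choose_storage(large))
# HEX doubles the size of whatever is stored inline.
print(len(binascii.hexlify(small)))
```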

PS: And, yes, I have already checked that the "bin->HEX->gz/bz2/zip" way
has a better compression ratio than the "bin->base64->gz/bz2/zip" one. ;-) So, assuming tar files are typically used as a container for stream compression,
I do things the simpler way.

Both are annoying in that you can't read the original attribute text
(for string attributes) when looking at the archive file (useful for
debugging).

Please look at the sample in my original message. It is the _real_ result of my draft implementation. String types (CSTR and MIMS) are not encoded, of course. ;-)
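
For anyone who wants to repeat the HEX versus base64 comparison from the PS, a rough Python sketch is given below. It is not the test I actually ran: it only covers deflate (via zlib, as a stand-in for gz), the function name compressed_sizes is made up, and the result depends heavily on the input data, so try it with real attribute contents.

```python
import base64
import binascii
import zlib


def compressed_sizes(raw: bytes) -> dict:
    """Compress the HEX and base64 text forms of raw and report both sizes."""
    hex_text = binascii.hexlify(raw)
    b64_text = base64.b64encode(raw)
    return {
        "hex": len(zlib.compress(hex_text, 9)),
        "base64": len(zlib.compress(b64_text, 9)),
    }


# Example input: twelve 32-bit little-endian integers, as in a 'LONG' array.
sample = b"".join(n.to_bytes(4, "little") for n in range(12))
print(compressed_sizes(sample))
```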

--
Kind Regards,
   S.Zharski
