I have studied this format fairly extensively and even extracted the deflated data from it using Ruby. To see what the various options in PackageBuilder do to the PKG file, I made a base PKG file with nothing in it and all options off, then turned on each option in turn and added files so I could compare the results. I used a hex dumper and diff to see what changed.

I don't have my notes on my current machine (I'm at work), but I'll get them and post them to this list as soon as I can. I was planning on writing the PKG installer, but I don't have much time to work on that at the moment, and I have other pressing Haiku duties I want to finish first (like the article on the new GUI layout system). So I welcome anyone else to take a look. We are also submitting this as an idea for the Google Summer of Code, should our project be accepted.

From what I can remember, the format consists of various sections delimited by 4-character string codes. I've written down most of these codes and my "interpretation" of what they mean in the notes I will post here. The file contents are deflated and stored in one part of the file (pretty soon after the header), and the offsets and lengths of the deflated data are stored later in the file along with the filename and other file metadata. Filenames and other strings are stored in a length+string format, as I recall. The 4-character codes seem to be null-terminated and are always followed by some extra bytes (probably just padding for alignment).

Also, for some reason every PKG file (even an empty one) has the default BeOS directories (B_DESKTOP_DIRECTORY, etc.) stored in it, as the integer value of the code one would pass to find_directory. This might be safe to ignore.

I don't think reading the file format will be too hard; the hard part will be implementing all the various options the format supports in the new installer. But that is more tedious than hard, I suppose.
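For anyone who wants to repeat the base-package-versus-one-option comparison without a hex dumper and diff, here is a minimal sketch in Python. It is not the tooling I used; the function name `changed_regions` and the file names in the comment are made up for illustration. It relies on the standard `difflib` module, which copes with inserted bytes but gets slow on large packages.

```python
import difflib

def changed_regions(base, variant):
    """List the byte regions that differ between two package images.

    Each entry is (tag, base_offset, base_bytes, variant_offset, variant_bytes),
    where tag is 'replace', 'delete' or 'insert' as reported by difflib.
    """
    # autojunk=False stops difflib from ignoring very common byte values
    matcher = difflib.SequenceMatcher(None, base, variant, autojunk=False)
    regions = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            regions.append((tag, i1, base[i1:i2], j1, variant[j1:j2]))
    return regions

# Hypothetical usage (these file names are placeholders):
#   base = open("base.pkg", "rb").read()
#   variant = open("one_option_on.pkg", "rb").read()
#   for tag, off_a, old, off_b, new in changed_regions(base, variant):
#       print(tag, hex(off_a), old.hex(), hex(off_b), new.hex())
```

Toggling one option at a time keeps the list of regions short enough to interpret by eye.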
Ryan

On 3/7/07, Gustavo grieco <gustavo.grieco@xxxxxxxxx> wrote:
I have some free days, so I started to study the PKG format (http://dev.haiku-os.org/ticket/1040). First, I searched for some information about reverse engineering (I found this book: http://en.wikibooks.org/wiki/Reverse_Engineering) and the zlib specification:

+ http://www.gzip.org/zlib/rfc-zlib.html
+ http://www.gzip.org/zlib/rfc-deflate.html

Most importantly, after looking at some PKG files in a hex viewer I concluded that they are not so difficult to understand. It's easy to recognize compressed data because the MIME type appears just before the zlib stream, and this stream begins with the bytes 02 78 9C. At the end of the file there are the names of the files and some other information.

If anyone has more ideas, I'd like to hear them. Thanks!
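The observation about the stream's leading bytes can be turned into a quick scanner. What follows is only a sketch in Python under stated assumptions: it treats 78 9C as the zlib header (that pair is the usual header for default-level deflate) and assumes the preceding 02 belongs to the PKG container rather than to zlib. The name `find_zlib_streams` is made up for illustration.

```python
import zlib

ZLIB_MAGIC = b"\x78\x9c"  # zlib header: deflate, 32K window, default level

def find_zlib_streams(data):
    """Return (offset, compressed_length, payload) for each stream that inflates."""
    found = []
    pos = data.find(ZLIB_MAGIC)
    while pos != -1:
        inflater = zlib.decompressobj()
        try:
            payload = inflater.decompress(data[pos:])
        except zlib.error:
            # The two bytes appeared by chance; keep searching after them.
            pos = data.find(ZLIB_MAGIC, pos + 1)
            continue
        if not inflater.eof:
            # Looked like a stream but never reached its end marker.
            pos = data.find(ZLIB_MAGIC, pos + 1)
            continue
        # Whatever the inflater did not consume belongs to the rest of the file.
        consumed = len(data) - pos - len(inflater.unused_data)
        found.append((pos, consumed, payload))
        pos = data.find(ZLIB_MAGIC, pos + consumed)
    return found
```

The offsets and lengths it reports could then be checked against whatever values are stored near the end of the file next to the file names. A stream written at a different compression level would carry a different second header byte, so this scan would miss it.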