Sorry for the long email, but I wanted to share some thoughts, still somewhat unorganized. Please reply on any of this stuff. I wanted to contact the BFS list, but couldn't find a mail address.

A database-like filesystem and management of files, or BFS 2nd generation (or whatever you'd call it)
=====================================================================================================

This is a very general design document; please comment on any of the following thoughts. Thanks. To me it all seems to make sense, and it should be a simple design and therefore a fast one, but I don't know too much about filesystems.

Attributes are everything
-------------------------
A file is assigned an ID; name, path and almost everything else is an attribute. A file can have multiple filenames and paths (which represent a copied/moved file).

Folders are not files: there is a special ID for the file which contains all the folder data, and the folder list is virtual. There is also a special ID for trash. The trash can be deleted, but its ID and path are recreated whenever we delete a file: at delete time, the trash is recreated if it doesn't exist, and the path is removed from the file.

There should be a utility that reorganizes data according to the best block size for maximum speed. The best seats at the beginning of the drive (to avoid seek time) are reserved for frequently accessed files: sort by times read, and defragment in that order. A file is created with the appropriate block size, and files that need to move closer to or further from the start, according to how much they're accessed, move together with their block (the filesystem has many block sizes).

Find becomes extremely easy to perform and obvious to code, and should be very fast.

The basic filesystem should be able to have a different block size in every area of the filesystem: 64k, 1k, 1k, 4k, 16k, 64k, 64k, 64k, etc. At creation time it decides on and gets the right block size for the file. Everything is 64-bit.
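To make the "attributes are everything" idea concrete, here is a rough Python sketch. All names (`FOLDERS_ID`, `TRASH_ID`, the dict layout) are made up for illustration; a real on-disk format would of course look nothing like this. It also shows why find falls out almost for free: it's just a scan that matches attributes.

```python
# Toy model: a file is an ID plus a bag of attributes.
# name(x) and path(x) are ordinary attributes like everything else.

FOLDERS_ID = 0   # hypothetical reserved ID for the folder-data file
TRASH_ID = 1     # hypothetical reserved ID for the trash

files = {
    2543: {"size": 36, "type": "txt",
           "name(1)": "blah", "path(1)": "/boot"},
}

def find(files, **wanted):
    # find = scan every record and match on attribute values
    return [fid for fid, attrs in files.items()
            if all(attrs.get(k) == v for k, v in wanted.items())]

print(find(files, type="txt"))   # [2543]
```

The obvious cost of this simplicity is the one noted under "limitations" below: every query is a full scan over all file records.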
Examples:

copy
----
Works like so:

    ID    size  name                 type  path                  lastpath
    2543  36    name(1): blah        txt   path(1): /boot        ---
    ---   ---   name(2): kookoo.zip  ---   path(2): /boot/home   ---

(Notice that type/size/ID can't change.) The data located on the drive has exactly one ID, which represents the file. Copying this file to another place is simply creating a path(2) attribute. When listing a folder, we iterate through all the paths found, so we avoid copying the data. There's no need for links/shortcuts anymore either.

move
----
Simply manipulating one attribute, the path attribute (or path(2) or path(3), depending on which file we're moving).

delete
------
Changing the path attribute to nothing, and putting whatever was in path into the lastpath attribute. If there is only one path for this ID and the file is already in trash, the data can be deleted when we empty the trash or delete it from there. (Trash is a special folder, actually a hardcoded reserved ID.)

create
------
Assigning an ID to the data file, and "investigating" more attributes (MIME type, size, etc.).

folder change
-------------
Recorded in the special folders file.

--------------------
Limitations and downsides: if we have 2 million files, iterating through all of them for every open folder will take too much time. I haven't found anything elegant to avoid this problem, which should be pretty rare in any case. But it's true that with this design, as the drive gets larger and holds more files, operations (even listing a folder) get slower. Only downloading, getting files from other filesystems, or creating files actually adds data.

We can make the kernel load lower IDs first, so that important/system/boot files load first. Or, preferably, we can very easily maximize I/O throughput to the most frequently read/accessed files by simply adding another attribute counting how many times the file has been read, so we can put it at the beginning of the drive/fast section. This will be done by the reorganization tool described elsewhere.
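The copy/move/delete operations above can be sketched as pure attribute edits. This is toy Python with hypothetical helpers; only the attribute names (path(x), lastpath(x)) come from the example table.

```python
# copy/move/delete as attribute manipulation, never touching the data.

def copy(attrs, new_path):
    # copying = adding another path(n) attribute; no data is duplicated
    n = sum(1 for k in attrs if k.startswith("path(")) + 1
    attrs[f"path({n})"] = new_path

def move(attrs, which, new_path):
    attrs[f"path({which})"] = new_path      # rewrite exactly one attribute

def delete(attrs, which):
    # empty the path and remember where it was in lastpath(x);
    # an empty path is what marks the file as "in the trash"
    attrs[f"lastpath({which})"] = attrs[f"path({which})"]
    attrs[f"path({which})"] = ""

f = {"name(1)": "blah", "path(1)": "/boot"}
copy(f, "/boot/home")   # adds path(2) = /boot/home
move(f, 1, "/etc")      # path(1) becomes /etc
delete(f, 2)            # path(2) -> "", lastpath(2) -> /boot/home
```

Listing a folder is then the same scan as find: walk all records and collect those with a path(x) attribute equal to the folder's path.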
For the user, the Tracker app will have an API we provide for listing, done by simply iterating through the files, looking at the path attribute, and showing the matching files (like a sort by path). BFS will only be a read/write addon to this filesystem. This also means that the read/write functions, buffering, and all that low-level stuff will be the most developed and heavily optimized parts (maybe even in asm), since the rest of the filesystem is relatively simple to code.

We're journaled by design, but let's look at ext3's data journaling, because that's something we don't have.

Another piece of code would be an efficient bit-comparison method, to check whether two files are identical.

Added bonuses: a quick check for empty directories, and almost no fragmentation, since there's barely any movement.

The trash works by simply iterating through the files looking for an empty path attribute. Name can change, path can change, plus there are user-added attributes. If we delete the path, we assign the path(x) attribute to the lastpath(x) attribute.

The end. :) These are just general thoughts on how it could be done. Please enlighten me with your thoughts.
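The bit-comparison piece could be as simple as a chunked byte-for-byte compare; here's one possible sketch in Python (a real version would live inside the filesystem and compare blocks directly, not go through paths).

```python
import os

def files_identical(path_a, path_b, chunk=64 * 1024):
    # cheap reject first: different sizes can never be identical
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            ca, cb = a.read(chunk), b.read(chunk)
            if ca != cb:
                return False
            if not ca:          # both files ended at the same point
                return True
```

Comparing in fixed-size chunks keeps memory use constant regardless of file size, and the size check up front skips the read entirely in the common case.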