
|
[openbeos]
||
[Date Prev]
[02-2002 Date Index]
[Date Next]
||
[Thread Prev]
[02-2002 Thread Index]
[Thread Next]
[openbeos] new filesystem design
- From: Kobi Lurie <k_lurie@xxxxxxxxxxxxxx>
- To: openbeos@xxxxxxxxxxxxx
- Date: Tue, 05 Feb 2002 18:48:23 +0200
sorry for the long email, but wanted to share some thoughts, kind of
unorganized.
please reply on any of this stuff.. I wanted to contact the bfs list, but
couldn't find any mail
address.
a database-like filesystem and management of files
or bfs 2nd generation (or whatever u'd call it)
===================================================
very general design document. please comment on any of the following
thoughts.thanks.
to me it seems like all this makes sense, and should be a simple design and
thus fast,
but i don't know too much about filesystems.
files folder list (virtual)
attributes are everything
a file is assigned an ID.
name, path and almost everything else is an attribute.
a file can have multiple filenames and paths (which represent a copied/moved
file)
folders are not files.
there is a special ID for the file which contains all the folder data.
there is also a special id for trash, which can be deleted, but its id and path
is recreated
whenever we delete a file.
there should be a util that reorganizes data according to best blocksize for
max. speed.
though best seats at beginning of drive (to avoid drive seek time) are reserved
for
frequently accessed files.
(sort by times read, defrag by this order - file is created with the
appropriate block size.
files that need to move closer or further, according to how much they're
accessed, will
move with their block.
(the filesystem has many block sizes.)
at delete time trash is recreated if it doesn't exist, and the path is deleted
from the file.
find becomes extremely easy to perform/obvious to code, and should be very fast.
the basic filesystem should be able to have different block sizes in every area
of the
filesystem.
64k,1k,1k,4k,16k,64k,64k,64k, etc..
at creation time it decides and gets the right block size for the file.
64 bit.
examples for:
copy
----
works like so:
ID size name type path lastpath
2543 36 name(1):blah txt path(1): /boot
--- --- name(2):kookoo.zip --- path(2): /boot/home
(notice that type/size/id can't change.)
the data located on the drive has an ID (only one) which represents the file.
copying this file to another place is simply creating a path2 attribute.
when listing a folder, we'll iterate thru all paths found, so we avoid copying
the data.
there's no need for links/shortcuts anymore either.
move
------
is simply manipulating one attribute, the path attribute. (or path2 or path3,
depends on
which file we're moving)
delete
------
delete is changing the path attribute to nothing, and putting whatever was in
path into
lastpath attribute.
if there is only one path for this id, and the file is already in trash, the
data can be deleted,
when we empty trash or delete it from there.
(special folder (hardcoded reserved ID actually))
create
-------
assigning an ID to the data file, and "investigating" more attributes
(mimetype, size, etc.)
folder change
--------------
is in the special folders file.
--------------------
limitations and downsides:
if we have 2 million files, going thru them(iterating) for every open folder
will take too much
time.
i haven't find anything elegant to avoid this problem which should be pretty
rare in any
case.
but, it's true that according to this design as the drive is larger and has
more files,
operations (even listing of folder) get slower.
only downloading or getting files from other filesystems or creating files
actually adds data.
we can make the kernel load lower IDs first, so we can make
important/system/boot files
load first, or preferably, we can very easily maximize I/O throughput to most
frequently
read/accessed files, by simply adding another attribute of how many times the
file has
been read, so we can put it in the beginning of the drive/fast section.
this will be done by the reorganization tool described elsewhere.
for the user, the tracker app will have an api we'll provide for listing, which
would be done
by simply iterating thru the files, looking at the path attribute, and showing
the matching
files.
(like sort by path)
bfs will only be a read/write addon to this filesystem.
this also means that read/write functions, buffering and all that lowlevel
stuff,
will be most developed and get very optimized (maybe even asm) since the rest
of the
filesystem is relatively simple to code.
we're journaled by design, but let's check ext3 data journaling coz this we
don't have.
another piece of code would be an efficient bit comparison method, to see if
two files are
identical.
added bonus: quick check of empty directories.
almost no fragmentation as there's barely any movement.
The trash works by simply iterating thru the files looking at empty path
attribute.
name can change, path can change plus user added attributes.
if we del the path, we assign to the "lastpath(x)" attribute, from the path(x)
attribute.
the end. :)
just general thoughts and how it should be performed. please enlighten me with
your
thoughts.
|

|