
[openbeosstorage] Re: DiskDevice API 2.x, Kernelland Draft

  • From: "Axel Dörfler" <axeld@xxxxxxxxxxxxxxxx>
  • To: openbeosstorage@xxxxxxxxxxxxx
  • Date: Fri, 06 Jun 2003 16:05:08 +0200 CEST

"Ingo Weinhold" <bonefish@xxxxxxxxxxxxxxx> wrote:
> On Thu, 05 Jun 2003 12:56:21 +0200 CEST "Axel Dörfler" <axeld@pinc-
> software.de> wrote:
> > First of all, I haven't had a deep look at it yet, because I didn't 
> > find the time. But I somehow wanted to answer this one now :)
> Hehe. :-)
> You know, of course, that you can't sneak out of having a look at the 
> kernel stuff I proposed -- being the main kernel developer, not to 
> mention team lead. ;-)

Uh oh, okay, I'll try ;-)

> > As long as we can't change the nested structure, it would be pretty 
> > simple, because the partitions are easily identified by their ID - 
> > and there are only two methods, moving and resizing, which can 
> > easily be differentiated.
> If you mean by changing the nested structure, moving a partition
> within the hierarchy (e.g. make it child of another parent), that
> shouldn't be allowed, I think. At least it could turn out to be quite
> tricky. Otherwise changing the hierarchy, like creating and deleting
> partitions shouldn't be any problem.

I meant the former, and I think that should be simply forbidden :-)

> > Having those shadow partitions (although I don't like the name much,
> > something like "target partitions" would make clearer that the
> > current partitions should be changed
> As I said, I was lacking a better name... :-)
> I find `target partition' a bit general/vague, though.

At least it does point in a direction ;-) target_partition might be a 
bit vague, but I really wouldn't understand shadow_partition if I 
didn't already know about it.
But anyway, do I understand the procedure correctly?
First, you'd need to get all current partitions on a disk, but I 
somehow don't see how this would be possible using that API? Where do I 
get those partition IDs from?

Currently, the standard way to iterate over all disk devices is to 
iterate over all entries under the /dev/disk path. But that will return 
all disks, whether partitioned or not, and whether present or not (in 
the case of removable media). Once you have a device, you'd ask for its 
partitions using ioctl().
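
A minimal sketch of that iteration, written with the C++17 std::filesystem library rather than anything BeOS-specific. The function name list_device_entries is made up for illustration, and the per-device ioctl() step is deliberately left out, since its request code and result structure are not given here:

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: collect every non-directory entry below `root`
// (e.g. "/dev/disk"). Returns an empty list if `root` is not a directory.
// Note that this cannot tell partitioned from unpartitioned devices, nor
// whether a removable medium is actually present.
std::vector<std::string> list_device_entries(const fs::path& root)
{
    std::vector<std::string> entries;
    std::error_code error;
    if (!fs::is_directory(root, error))
        return entries;

    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, error);
    fs::recursive_directory_iterator end;
    for (; !error && it != end; it.increment(error)) {
        std::error_code statError;
        // Anything that is not a directory is treated as a device node.
        if (!it->is_directory(statError))
            entries.push_back(it->path().string());
    }

    std::sort(entries.begin(), entries.end());
    return entries;
}
```

Calling list_device_entries("/dev/disk") would give the raw device paths; getting from there to partition IDs is exactly the step that is unclear to me.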

So, assuming we somehow got to those partition IDs, would it be correct 
to do this:

prepare_disk_device_modifications(device);
        // this will lock the disk device API (or just for this device,
        // if possible)
        // (or even several for a software RAID)

defragment_partition(partition);
resize_partition(partition, 100*1024*1024);
        // this will add the jobs to the job list

commit_disk_device_modifications(device, ...);
        // this will finally trigger the modifications to be made
        // and unlock the API - or will it first process all jobs
        // and then unlock the API?
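
To make the question concrete, here is a small in-memory model of one possible reading: the modification calls only queue jobs, and commit runs them all before releasing the lock. The class DeviceSession and its methods are hypothetical and synchronous; they are not part of the proposed API, where jobs may well run asynchronously:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the prepare/queue/commit sequence.
class DeviceSession {
public:
    // Stands in for prepare_disk_device_modifications(): take the lock.
    bool Prepare()
    {
        if (fLocked)
            return false;   // somebody else is already modifying the device
        fLocked = true;
        return true;
    }

    // Stands in for calls like resize_partition(): only records a job.
    bool AddJob(std::string name, std::function<bool()> work)
    {
        if (!fLocked)
            return false;   // modifications require a prepared session
        fJobs.push_back({std::move(name), std::move(work)});
        return true;
    }

    // Stands in for commit_disk_device_modifications(): run the jobs in
    // order, stop at the first failure, and only then release the lock.
    bool Commit(std::vector<std::string>& log)
    {
        if (!fLocked)
            return false;
        bool ok = true;
        for (auto& job : fJobs) {
            log.push_back(job.name);
            if (!job.work()) {
                ok = false;
                break;
            }
        }
        fJobs.clear();
        fLocked = false;
        return ok;
    }

    bool IsLocked() const { return fLocked; }

private:
    struct Job {
        std::string name;
        std::function<bool()> work;
    };

    bool fLocked = false;
    std::vector<Job> fJobs;
};
```

In this reading nothing touches the disk until Commit() is called, and the device stays locked until the last job has finished.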

Where do we need the shadow partition name anyway? It doesn't seem to 
be part of the user API.
Also, the real user API is the C++ API, right? So the user will never 
come across all those user_ prefixes - because if he did, I would 
consider dumping them.

And if we have all these int32 IDs we could think about adding a new 
type for them, like partition_id.

> > > > initialize_partition() does perhaps need a bit more discussion,
> > > > since there exists the planned fs_initialize_volume() function
> > > > (<be/kernel/fs_volume.h>), which has largely intersecting
> > > > functionality (cf. userland_interface.h for some more thoughts).
> > > My vote is for ditching fs_initialize_volume() and adding support
> > > for registering files as disk devices. What would the arguments be
> > > for keeping fs_initialize_volume() (other than the regular file
> > > problem)?
> > No, actually, that would be kinda stupid IMO. A file system needs a
> > device (or file) to initialize its structure on. It starts at 0 and
> > has the length of the whole device (partition or file). Anything else
> > would make it complicated.
> > Now, why should we hide the direct method of initializing a file
> > system, and force the user to get the whole BPartition tree, then
> > search for the right partition, disabling the possibility of
> > creating file systems in regular files, etc.?
> > I would rather remove the initialize_partition() function, and have 
> > something like (I would guess it already exists):
> > status_t BDiskDeviceList::GetPartition(BPartition &partition, const 
> > char *deviceName);
> > (dunno if this class would be the right container, though)
> > 
> > status_t BPartition::GetDeviceName(char *deviceName);
> > 
> > and then just call fs_initialize_volume() using that deviceName.
> Since initialize_partition() is more general -- it also initializes 
> partitioning system, not only file systems -- it definitely cannot be 
> removed. It even has a quite different semantics, for it doesn't do 
> anything destructive immediately, but only operates on shadow 
> partitions.

Yeah, I noticed that now :-)

> Regarding fs_initialize_volume(), it would be at best a convenience 
> function, nothing more. To reply to your arguments: Your first 
> paragraph just doesn't apply. Both calls initialize_partition() and 
> fs_initialize_volume() end at the same FS hook, which gets a 
> (partition) device path.

What I meant was that the partition must exist at this point in time; 
if it does, then there is no problem.

> The latter method is not more direct than the former one. Well, more
> convenient, if you mean that, but not more direct with respect to the
> functions involved. The call cannot be directly passed to the FS, but
> has to go through the disk device manager, since it must be
> coordinated with other operations on the disk device. Now things get
> a bit difficult, for fs_initialize_volume() is synchronous, while the
> disk device jobs aren't. Moreover the disk device could be locked by a
> userland API user, so that the thread couldn't even get the job
> scheduled -- it could simply fail in this case, though.
> 
> As I mentioned, creating a file system in a file could be addressed by
> providing an API to register files as disk devices. A worthwhile thing
> to do, I think, since that would even allow initializing partitioning
> systems.
> 
> To sum it up, I would see fs_initialize_volume() mainly as a
> convenience function for one purpose, to create FSs in files. I would
> discourage application on partition devices (how did the caller get
> hold of the partition device path, anyway?).

Okay, you got me thinking :-) And I am also unsure about the 
justification of my previous rant :)
What I completely missed was the fact that it really makes sense to 
direct any calls to fs_initialize_volume() through the disk device 
manager. It wouldn't be necessary to do so, because as long as there is 
a device (in /dev/disk/...), *anybody* can write to it.
But of course, direct access could be dangerous in this case, so 
directing that call through the disk device manager would add some 
value.

What I also don't understand with this API is how to create a new 
partition. And if you want initialize_partition() to accept a file 
system and a partitioning system at the same place, they would have to 
share the same namespace, right?
Is there a way for a user application to differentiate between the two? 
I would be a little bit surprised if our general "mkfs" would create a 
partition on the disk.
We might also add a "name" field to initialize_partition() since almost 
all disk systems share this property (IIRC only Intel style 
partitioning doesn't have it). I think I would find it cleaner if there 
are two functions to do that, anyway, even if embedded in the disk 
device manager context. For example, resizing a partition needs *two* 
resize calls internally, one to the partition, and one to the file 
system which is on that partition.
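
A rough sketch of why the two calls need to be coordinated. The ordering rule used here (shrink the file system before the partition, grow the partition before the file system) is my assumption rather than something from the draft, and the Partition struct and resize_partition_and_fs() are made-up names:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a partition and the file system on it.
struct Partition {
    int64_t size;     // size recorded in the partitioning system
    int64_t fsSize;   // size the file system believes it has
};

// Performs the two internal resize calls and returns the order in which
// they were made, so the sequencing can be inspected.
std::vector<std::string> resize_partition_and_fs(Partition& p, int64_t newSize)
{
    std::vector<std::string> steps;
    if (newSize == p.size)
        return steps;   // nothing to do

    if (newSize < p.size) {
        // Shrinking: the file system goes first, so that it never
        // extends past the end of its partition.
        p.fsSize = newSize;
        steps.push_back("file system");
        p.size = newSize;
        steps.push_back("partition");
    } else {
        // Growing: the partition goes first, then the file system may
        // expand into the new space.
        p.size = newSize;
        steps.push_back("partition");
        p.fsSize = newSize;
        steps.push_back("file system");
    }

    // Invariant assumed in both cases.
    assert(p.fsSize <= p.size);
    return steps;
}
```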

OTOH I think it would be nice to have an add_partition() function - if 
we can do resize_partition(), why shouldn't we be able to do this?
What steps would be involved to create 4 partitions on a given (empty) 
disk? I would like something like this (it may not fit perfectly in the 
proposed API, though):

prepare_disk_device_modifications(device);
initialize_disk_system(device);
add_partition(device, "first", 100*1024*1024, "active=true");
add_partition(device, "second", 50*1024*1024, NULL);
...
commit_disk_device_modifications(device);
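
To illustrate what add_partition() could do on the shadow side before anything is committed, here is a small model that only tracks the planned layout. ShadowDevice, ShadowPartition and AddPartition() are hypothetical names; the model places partitions back to back, ignores the space a real partitioning system needs for its own tables, and drops the parameter string:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical planned partition, not yet written to disk.
struct ShadowPartition {
    std::string name;
    int64_t offset;
    int64_t size;
};

// Hypothetical shadow copy of a device's layout.
struct ShadowDevice {
    int64_t size;
    std::vector<ShadowPartition> partitions;

    // Appends a partition directly after the last one. Returns false
    // (and changes nothing) if it does not fit on the device.
    bool AddPartition(const std::string& name, int64_t partitionSize)
    {
        if (partitionSize <= 0)
            return false;
        int64_t offset = partitions.empty()
            ? 0
            : partitions.back().offset + partitions.back().size;
        if (partitionSize > size - offset)
            return false;
        partitions.push_back({name, offset, partitionSize});
        return true;
    }
};
```

Since only the shadow layout changes, a failed AddPartition() leaves nothing to clean up on the disk itself.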

As you mentioned, the only problem which we would still have is with 
the image files. Though I would really like us to be able to simply 
register/publish them as a device, it would also be nice to create a 
file system on them without having to register them.
For example, that would be the use for the fs_shell, but perhaps also 
other things, although I don't have a good idea right now.
Or to put it this way: I'd like to have no real differentiation between 
block devices and files - they are almost the same from the OS point of 
view, so why shouldn't we keep it that way? Registering a file as a 
device would certainly be a step in the right direction - but it would 
be even nicer if that weren't necessary at all.
We would also need to differentiate between file system images and disk 
images - but I guess we're doing this anyway.

Well, I think it would be okay to not consider the fs_shell case, but 
to have mkfs register files automatically - the one thing we should 
support, though, is continuing to be able to address this file by its 
standard path. I.e. something like:
$ mkfs test.image
$ makebootable test.image

This shouldn't fail even if "makebootable" needs a device to operate 
on. Dunno how we should do that right now, though. Maybe it's not even 
a top priority - but it'd be nice to have it.

> > We can cancel all jobs - there would pop up a requester which says 
> > "Canceling the operation will recreate the initial state - this can 
> > take a while", and just reverse the thing.
> > Shouldn't be too hard, at least not for moving a partition around.
> Yep, for that task it would work. OTOH, something like initializing a 
> partition is not so easy to be undone. ;-) Even more fun it will be, 
> when several jobs are in progress in parallel.

Sure :-)

> > How ugly it gets for resizing a partition would be the job of the 
> > file system to judge on. But since we already need to have logging 
> > for those jobs, reversing the operation at any point shouldn't be 
> > impossible (or too ugly) at all.
> We need logging for those jobs? I planned to gracefully leave that
> out for R1. :-P

Well, I would also say that we don't implement any logging for this 
stuff in R1. *But* the API shouldn't reflect this in any way, I guess.
It should just make sure that the user will have all the information he 
needs - like "Pressing cancel will destroy all data on disk" vs. 
"Pressing cancel will restore the initial state of the currently 
processed job" :-)

Tyler Dauwalder <tyler@xxxxxxxxxxxxx> wrote:
> Yes. I just think it's clearer for someone who's writing their first
> fs/partition add-on if there's an explicit function there for each
> operation rather than making them dig thru documentation or headers
> elsewhere to figure out what operations they can/should support, and
> how they should support them. One could multiplex the entire fs
> add-on suite thru a single function, but I think that's a much less
> attractive way to do it than the current setup.
> 
> So really, I actually prefer the no multiplexing route in both cases;
> I just think it's tolerable in the syscalls, since we're basically the
> only ones who'll be using them, and I can see the argument for not
> polluting the syscall namespace.

That would also be the only problem I see - but since *we* are creating 
all syscall names ourselves, and we have beautiful names such as 
"prepare_disk_device_modifications" I can hardly think of any other use 
for that name :-)
Also, having them as separate calls increases the possibilities to 
check their arguments.
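
A small illustration of that point. All names here (disk_device_op, the OP_ constants, resize_args, resize_partition_call) are invented for the comparison and are not proposed syscalls:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcodes for a multiplexed entry point.
enum {
    OP_RESIZE_PARTITION = 1,
    OP_MOVE_PARTITION = 2
};

struct resize_args {
    int32_t partition;
    int64_t size;
};

// Multiplexed variant: the compiler cannot verify that `args` really
// points to the structure that belongs to `op`; a mismatch only shows
// up at runtime, if at all.
int disk_device_op(int op, const void* args)
{
    if (args == nullptr)
        return -1;
    switch (op) {
        case OP_RESIZE_PARTITION: {
            const resize_args* a = static_cast<const resize_args*>(args);
            return (a->partition >= 0 && a->size >= 0) ? 0 : -1;
        }
        default:
            return -1;   // operation not handled in this sketch
    }
}

// Separate variant: every argument has a name and a type, so both the
// compiler and the implementation can check each one individually.
int resize_partition_call(int32_t partition, int64_t size)
{
    return (partition >= 0 && size >= 0) ? 0 : -1;
}
```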

> > As I said, I was lacking a better name... :-)
> > I find `target partition' a bit general/vague, though.
> I actually really like the name "shadow partitions". I think it does
> a pretty good job of capturing the idea of what's going on. :-)
> 
> Other ideas:
> 
> CreateWorkingPartition()
> CreateEditablePartition()
> CreateWorkingCopyPartition()
> CreateEditableCopyPartition()
> ...
> 
> I like shadow partitions just fine, though. :-)

Well, if you insist on it... :-)

> > with other operations on the disk device. Now things get a bit
> > difficult, for fs_initialize_volume() is synchronous, while the disk
> > device jobs aren't. Moreover the disk device could be locked by a
> > userland API user, so that the thread couldn't even get the job
> > scheduled -- it could simply fail in this case, though.
> Yes, this is the main concern I have. fs_initialize_volume() needs to
> play nice with the rest of the DiskDevice API if it's kept around.
> Besides, who's really going to use fs_initialize_volume() anyway?
> Isn't it only programs like mkbfs...? It seems to me that those should
> be rewritten to use the more native DiskDevice API anyway. It really
> wouldn't be that much more work, and one could write a nice, general
> command-line initialization app that would work with any partition/fs
> add-on on the system that supports initialize_partition() using the
> DiskDevice API.

That was the plan anyway, as you could derive from the existence of the 
fs_initialize_volume() function - it didn't exist in R5.

> > To sum it up, I would see fs_initialize_volume() mainly as a
> > convenience function for one purpose, to create FSs in files. I
> > would discourage application on partition devices (how did the
> > caller get hold of the partition device path, anyway?).
> I agree.

Well, the standard method to get a device is by looking in the /dev 
path - and this is perfectly legal; we can't do anything against this, 
nor should we.
OTOH we should make sure that at least our own APIs play nicely 
together.

> > We need logging for those jobs? I planned to gracefully leave that
> > out for R1. :-P
> Yes, let's get this clear. First I thought we were leaving it out for 
> R1, then I thought it was in, then out again, and now... :-) Which is 
> it?

The possibility of logging at the file system level should always be 
there; its implementation doesn't have to be, though (even in R2).
The partition resizing stuff itself doesn't have to be logged for R1, 
IIRC :)

> > > It could be nice to have this, and it would also be a bit faster.
> > > I am not sure if it's rendering the API inconsistent there, though.
> > > One could argue that the partition related functions (not
> > > initialize()) are a bit separated from the rest anyway.
> > Either way would work for me. So, if you have any preferences...
> I prefer passing the device. :-)

Okay, so let's do that, then :-)

Adios...
   Axel.





