From phillips at arcor.de Mon Mar 24 09:39:44 2003
From: phillips at arcor.de (Daniel Phillips)
Date: Sun, 23 Mar 2003 23:39:44 +0100
Subject: [Prophesy] Versioning filesystem
Message-ID: <20030323223627.6476B11373F@mx12.arcor-online.net>

Chances are I'm talking to myself, after not doing anything here for nine
months or so. That doesn't mean things haven't been happening.
Specifically, I've been introspecting.

The main subject of introspection has been how I'd go about implementing a
versioning filesystem, and even whether that seems like a good thing to do.

I basically beat my head against a wrong approach for most of the nine
months, pursuing the idea of hooking out file_operations from a vfs
path_walk. I thought that would be the most efficient way to hook the thing
up, because writes could be specially handled, whereas reads would just
follow the normal path. This never worked out cleanly. The vfs just isn't
set up that way, and would have required major surgery. Besides that, I
gradually realized that I did not always want to pass reads straight
through. This would limit my design options by not allowing me to generate
the read data on the fly.

Then I saw the light, by realizing that Martin Poole already had the right
idea with his newuserfs. Newuserfs is a forward port of Jeremy
Fitzhardinge's userfs, which works by passing vfs operations through a pipe
to user space. After thinking about this a short time, I realized that I
could start with ramfs, which implements full posix semantics, and just
bolt that onto a usermode daemon with a socket.

There are a number of right things about this approach, not least of which
is the fact that the stack never gets very deep for either the task calling
for file operations or the server implementing them. This is because the
kernel does a task switch to the server each time a complex low-level file
operation needs to be done, and the stack-hungry things happen in user
space. There's no recursive calling into the kernel.

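[Editor's sketch] The general shape of forwarding a file operation to a user-space server over a socket can be illustrated in Python. This is a toy under stated assumptions, not newuserfs or Stuf code: the length-prefixed JSON framing, the "read" and "shutdown" operations, and every function name below are made up for the example, and a thread stands in for the separate server task.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one message as a 4-byte big-endian length plus a JSON payload."""
    payload = json.dumps(obj).encode()
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer goes away."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one message framed by send_msg()."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode())


def serve(sock, files):
    """Hypothetical user-space server: answer requests until told to stop."""
    while True:
        req = recv_msg(sock)
        if req["op"] == "shutdown":
            break
        if req["op"] == "read":
            send_msg(sock, {"data": files.get(req["name"], "")})
        else:
            send_msg(sock, {"error": "unsupported"})


def demo():
    """Forward one read request from the 'kernel side' and return the reply."""
    kernel_side, server_side = socket.socketpair()
    server = threading.Thread(
        target=serve, args=(server_side, {"README": "stored in user space"}))
    server.start()
    send_msg(kernel_side, {"op": "read", "name": "README"})
    reply = recv_msg(kernel_side)
    send_msg(kernel_side, {"op": "shutdown"})
    server.join()
    kernel_side.close()
    server_side.close()
    return reply["data"]
```

The caller blocks on the reply while the server does the work on its own stack, which is the property the paragraph above points out; nothing here calls back into the requesting side.
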
Another right thing is the way caching works with this approach,
specifically the page cache and dcache. For both, the vfs only needs help
from the usermode daemon when some name or file data isn't in its cache.
So the usermode implementation can be quite slow and the cache will cover
that up. Not that I want to make the usermode part slow, but in theory it
could be, especially if there is database access and application of a chain
of file differences going on.

So I started implementing this about 10 days ago and have been occupied
with it since. Things are going pretty well, to the point I could think
about a code release in a week or two. The project has a name:

    Stuf - STackable Usermode Filesystem

which is actually not specific to versioning filesystems. A particular
filesystem is implemented by a usermode server daemon that implements
Stuf's socket protocol (which I call "beads"). The server I'm working on
now is called "simple" and just passes filesystem operations through to the
underlying filesystem. After that is working reasonably well, to the point
that you can, say, compile a kernel on the stacked filesystem, I'll move on
to a versioning server.

At this point I can mount a filesystem with the "stuff" command (Stuf
Frontend), fork the server, connect the pipe, generate and pass FDs for
both the mounted and underlying filesystem through the pipe. The server
can ioctl the virtual filesystem to take care of special needs that can't
be satisfied by (or would be too slow and racy with) posix operations. I
can now pass open(2) requests through the pipe, and am currently busy
implementing a new system call that can open a file, given a directory fd
and a name.

There's been a lot of work on SCM high level design considerations done on
the Arch mailing list, including what needs to be done to satisfy the
requirements of kernel developers.

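[Editor's sketch] Handing a directory descriptor to another party and then opening a file by name relative to it can be illustrated in Python. This is not Stuf's code. It assumes the descriptors travel as SCM_RIGHTS ancillary data over a Unix socket pair (the message above only says FDs are passed through the pipe), and it uses os.open() with dir_fd as a stand-in for the new system call being described. The helper names send_fd, recv_fd and open_in_dir are hypothetical.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, fd):
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                           array.array("i", [fd]))])


def recv_fd(sock):
    """Receive one file descriptor sent by send_fd()."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def open_in_dir(dir_fd, name):
    """Open `name` read-only, relative to an already-open directory fd."""
    return os.open(name, os.O_RDONLY, dir_fd=dir_fd)


def demo():
    """Pass a directory fd across a socket pair, then open a file through it."""
    frontend, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryDirectory() as underlying:
        with open(os.path.join(underlying, "hello.txt"), "w") as f:
            f.write("hello from the underlying filesystem")
        dir_fd = os.open(underlying, os.O_RDONLY)
        send_fd(frontend, dir_fd)    # the frontend hands the directory over
        os.close(dir_fd)             # the receiver gets its own descriptor
        received = recv_fd(server)
        file_fd = open_in_dir(received, "hello.txt")
        data = os.read(file_fd, 100).decode()
        os.close(file_fd)
        os.close(received)
    frontend.close()
    server.close()
    return data
```

Opening relative to a descriptor means the server never has to rebuild a path string for the directory it was handed, which avoids one kind of race if the directory is renamed in the meantime.
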
It seems to me, that much of what has been discussed is suitable for
implementation as a versioning filesystem, and so I have set out to do
that.

Regards,

Daniel

From hacker at gnu-designs.com Mon Mar 24 10:05:54 2003
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sun, 23 Mar 2003 18:05:54 -0500 (EST)
Subject: [Prophesy] Versioning filesystem
In-Reply-To: <20030323223627.6476B11373F@mx12.arcor-online.net>
References: <20030323223627.6476B11373F@mx12.arcor-online.net>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> Chances are I'm talking to myself, after not doing anything here for nine
> months or so. That doesn't mean things haven't been happening.
> Specifically, I've been introspecting.

I'm still here, also introspective (and still out of work).

> So I started implementing this about 10 days ago and have been occupied
> with it since. Things are going pretty well, to the point I could think
> about a code release in a week or two. The project has a name:

Is there an OLS paper in the works for this? Have you looked at things
like udev[1]?

> It seems to me, that much of what has been discussed is suitable for
> implementation as a versioning filesystem, and so I have set out to do
> that.

What's the target? Something akin to cvs? Or something more robust?
(No religious wars please re: Bk, svn, cvs, etc.)

[1] udev: http://www.linuxsymposium.org/2003/view_abstract.php?talk=94

d.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+fj3UkRQERnB1rkoRAh0SAJ4s6AA+RjhoawnPeAHOS2HHBviVLwCeNm6y
rOOZRd2d0TQF6sRW4mS21i0=
=kaT3
-----END PGP SIGNATURE-----

From phillips at arcor.de Mon Mar 24 10:38:32 2003
From: phillips at arcor.de (Daniel Phillips)
Date: Mon, 24 Mar 2003 00:38:32 +0100
Subject: [Prophesy] Versioning filesystem
In-Reply-To:
References: <20030323223627.6476B11373F@mx12.arcor-online.net>
Message-ID: <20030323233514.DB6AB119762@mx12.arcor-online.net>

On Mon 24 Mar 03 00:05, David A.
Desrosiers wrote:
> > So I started implementing this about 10 days ago and have been occupied
> > with it since. Things are going pretty well, to the point I could think
> > about a code release in a week or two. The project has a name:
>
> Is there an OLS paper in the works for this?

I'll get the code out there and do some benchmarks before thinking about a
paper.

> Have you looked at things like udev[1]?

Interesting. But this project does not appear to be aimed at implementing
filesystem semantics in user space.

> > It seems to me, that much of what has been discussed is suitable for
> > implementation as a versioning filesystem, and so I have set out to do
> > that.
>
> What's the target? Something akin to cvs? Or something more robust?

To start, I'm thinking in terms of something like an undo-redo chain at the
filesystem level, but with branching. The branches would be shown as a
tree, which is also a virtual filesystem.

The database schema I already said a lot about, archived on this list.
It's designed with a sophisticated SCM in mind, including communication
with remote repositories, but the initial target functionality is much more
modest.
The idea is to have a stable platform that does something useful, to build
on.

Regards,

Daniel

From hacker at gnu-designs.com Mon Mar 24 11:11:33 2003
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sun, 23 Mar 2003 19:11:33 -0500 (EST)
Subject: [Prophesy] Versioning filesystem
In-Reply-To: <20030323233514.DB6AB119762@mx12.arcor-online.net>
References: <20030323223627.6476B11373F@mx12.arcor-online.net>
 <20030323233514.DB6AB119762@mx12.arcor-online.net>
Message-ID:

> Interesting. But this project does not appear to be aimed at implementing
> filesystem semantics in user space.

I was thinking more about the notifications/fd parts, not filesystem
specific semantics.

> The idea is to have a stable platform that does something useful, to build
> on.

Understood.

d.

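[Editor's sketch] The "undo-redo chain, but with branching" idea from Daniel's message above can be illustrated with a small in-memory tree in Python. This is only one possible reading of that sentence, not the design from the list archive: the Version and History classes, their method names, and the choice to store whole contents rather than differences are all assumptions made for the example.

```python
class Version:
    """One node in the history tree: a snapshot plus links to its neighbours."""

    def __init__(self, content, parent=None):
        self.content = content
        self.parent = parent
        self.children = []


class History:
    """Undo/redo where writing after an undo starts a new branch."""

    def __init__(self, initial=""):
        self.root = Version(initial)
        self.current = self.root

    def write(self, content):
        """Record a new version as a child of the current one."""
        node = Version(content, self.current)
        self.current.children.append(node)
        self.current = node

    def undo(self):
        """Step back to the parent version, if there is one."""
        if self.current.parent is not None:
            self.current = self.current.parent

    def redo(self, branch=-1):
        """Step forward along one child branch (newest by default)."""
        if self.current.children:
            self.current = self.current.children[branch]

    def tree(self, node=None, depth=0):
        """Return the whole history as indented lines, one per version."""
        if node is None:
            node = self.root
        lines = ["  " * depth + node.content]
        for child in node.children:
            lines.extend(self.tree(child, depth + 1))
        return lines


def demo():
    """Build a small history with one branch point and return its tree."""
    h = History("v0")
    h.write("v1")
    h.write("v2")
    h.undo()         # back to v1
    h.write("v2b")   # second child of v1, so the chain becomes a tree
    return h.tree()
```

A plain undo stack would discard "v2" when "v2b" is written; keeping every child is what turns the chain into a tree that could be presented as a directory hierarchy.
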
From phillips at arcor.de Mon Mar 24 11:29:39 2003
From: phillips at arcor.de (Daniel Phillips)
Date: Mon, 24 Mar 2003 01:29:39 +0100
Subject: [Prophesy] Versioning filesystem
In-Reply-To:
References: <20030323223627.6476B11373F@mx12.arcor-online.net>
 <20030323233514.DB6AB119762@mx12.arcor-online.net>
Message-ID: <20030324002622.D3CC4119B12@mx12.arcor-online.net>

On Mon 24 Mar 03 01:11, David A. Desrosiers wrote:
> > Interesting. But this project does not appear to be aimed at
> > implementing filesystem semantics in user space.
>
> I was thinking more about the notifications/fd parts, not filesystem
> specific semantics.

I don't know how he did it. I'll take a look eventually, but I'm entirely
satisfied with how I did that part :-)

Regards,

Daniel