[Prophesy] Re: Improved string transformation

Daniel Phillips phillips at bonn-fries.net
Mon Jun 3 18:55:56 EST 2002


On Monday 03 June 2002 10:06, Rasmus Andersen wrote:
> On Mon, Jun 03, 2002 at 09:44:01AM +0200, Daniel Phillips wrote:
> > > On the other hand, if the SCM merely uses the FS operations to gather
> > > knowledge about changed objects[1], then the user would still have to
> > > do an explicit 'commit' to make a delta(?) and attach comments. Which
> > > isn't that far from what you would do anyway without the magic FS.
> > > 
> > > Or am I missing something?
> > 
> > No, good point, and it's one I've thought about, I just neglected to say 
> > anything about it.  My thinking is that the scm will normally save a 
> > transform against a file every time the file is written to for whatever 
> > reason, but when you commit, those transforms are collapsed into a single 
> > transform.  So until you do the commit, you have file-level undo, if you want 
> > it.  It's just easy to provide this, so why not?  We can also provide an 
> > option to leave those transforms un-composed in the database, which will eat 
> > a lot of space and probably be useless, but it might be interesting to 
> > somebody, and it's easy to do, so again, why not?
> 
> OK then. This would seem to be a reasonable middle ground. If it wasn't
> for you having some FS experience already, I would probably think the
> magic FS way too complex for what it is buying us/the user.
> 
> Can we do this without a kernel patch? A kernel patch may be a bit
> too much for many who just want to dip their toes.

By rights, a generic method for accomplishing such a thing should already
have been merged, but sadly that isn't the case, or perhaps fortunately, if
the official interface would have been less than ideal.  In any event, I'm
not shy about constructing such a thing if it's needed, and I can assure
you it will be elegant and efficient.  As far as applying a patch goes, I
think it will only be a module, and that module will be small, since most
of the work will be done in user land.  No, we absolutely can't do this
without involving the kernel, and no standard mechanism exists in Linux at
the moment for doing this.  Plan 9 has 9P, a network protocol, precisely for
such a purpose; however, I'd rather bypass the network and do a tight
little local interface.  If I decide the network interface is really the
right way to do it, or just want to be lazy, the uservfs project already
exists, and is being maintained, I believe.  It isn't in the kernel though,
and it depends on coda, which is another whole big piece, so I'm not that
enthusiastic about it.  I'd rather just define a nice interface that
exports the vfs securely and racelessly to user space via the various nice
methods we have available.  It doesn't have to be particularly general
either, to get us going.  I consider this a fairly easy project and a
chance to get some experience with some of the ipc mechanisms I haven't
done a lot with to date, such as signals.

There is another, simpler method, and the one I propose to use for initial
testing: simply issue all edit commands and other file manipulations, such
as rename, patch etc. from a python shell, which will take care of the
needed preserving of data and calls to the scm.  This gives us a quick
start so we don't have to get bogged down in the details of filesystem
exporting, and others who just want to take a test drive might find this
method useful as well.
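
A minimal sketch of what such a wrapper could look like, under loud
assumptions: ToyScm, scm_write, undo and commit are hypothetical names
invented for illustration, the "database" is just an in-memory dict, and a
transform is stored naively as an (old text, new text) pair rather than in
any real delta format.  It shows one transform being saved per write,
file-level undo before the commit, and the pending transforms being
collapsed into a single one at commit time.

```python
import difflib
import os


class ToyScm:
    """Hypothetical stand-in for the scm; keeps everything in memory."""

    def __init__(self):
        self.pending = {}    # path -> list of (old_text, new_text), one per write
        self.committed = {}  # path -> list of (comment, unified diff text)

    def record_write(self, path, old_text, new_text):
        """Save one transform against the file for this write."""
        self.pending.setdefault(path, []).append((old_text, new_text))

    def undo(self, path):
        """File-level undo: drop the latest uncommitted write, restore the file."""
        old_text, _ = self.pending[path].pop()
        with open(path, "w") as f:
            f.write(old_text)

    def commit(self, path, comment):
        """Collapse all pending transforms for path into a single diff."""
        writes = self.pending.pop(path, [])
        if not writes:
            return None
        first_old, last_new = writes[0][0], writes[-1][1]
        delta = "".join(difflib.unified_diff(
            first_old.splitlines(keepends=True),
            last_new.splitlines(keepends=True),
            fromfile=path, tofile=path))
        self.committed.setdefault(path, []).append((comment, delta))
        return delta


def scm_write(scm, path, new_text):
    """Write a file through the wrapper so the scm sees every change."""
    old_text = ""
    if os.path.exists(path):
        with open(path) as f:
            old_text = f.read()
    with open(path, "w") as f:
        f.write(new_text)
    scm.record_write(path, old_text, new_text)
```

From an interactive Python session one would call scm_write() instead of
writing files directly, undo() to back out the last write, and commit()
once the change is worth a comment.  Rename, patch and the rest would need
similar wrappers.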

There's no question in my mind that the magic filesystem is the best
interface.

> > > I could see some nice things coming
> > > from having smaller granularity than the file one, but since we
> > > are aiming at having 'loose' dependencies in the SCM I think we
> > > will get those anyway.
> > 
> [snip things about having files as basic versioning object]
> > 
> > A practical question is whether we're going to version directories.  I 
> > mentioned the idea that each file object would have an id (which is 
> > universally unique) and the name of the file would be metadata associated 
> > with the object (i.e., an attribute of the object).  However, we will need to 
> > look up files rapidly by name, for example, when a file is changed and a 
> > transform needs to be recorded against it in the database.  This can of 
> > course be handled efficiently by appropriate use of database indexing.
> > 
> > We may sometimes want to traverse the database in directory order, perhaps 
> > when producing a diff between two tree versions.  Does this mean we want to 
> > record directories as objects?  I don't know yet.  It may be enough just to 
> > compute the directories on the fly.
> 
> Another related thing is, how do we group changes to achieve logically
> connected changes, aka changesets in BK terminology? I guess that would
> be by explicit operations in the GUI/command line thingie operating on
> deltas?

Right, and I'd like to expose the full power of sql for this purpose, while
also supporting other methods of course, such as remembering the regions
affected by imported patch sets, or indeed, remembering enough information
to reconstruct each patch set exactly.  Let's call that information 'scope',
and we want to carry scope information in a precise way in the database.  In
general, the scopes of changes should not overlap, but when they do, we need
to record exactly how.  Overlapping scope results either in ordering
dependencies, or conflicts.  In either case, we need to record just what
those dependencies or conflicts are.
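
To make the scope idea a little more concrete, here is a rough sketch using
Python's sqlite3 module.  Everything about the schema is an assumption made
up for illustration: the table name scope, its columns, the file ids, and
the simplification that a scope is a line range within one file.  The query
only finds which changes overlap and where; deciding whether an overlap is
an ordering dependency or a conflict is left open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scope (
    change_id  INTEGER NOT NULL,  -- hypothetical id of a change or patch set
    file_id    TEXT    NOT NULL,  -- hypothetical unique id of the file object
    first_line INTEGER NOT NULL,
    last_line  INTEGER NOT NULL
);
CREATE INDEX scope_by_file ON scope (file_id, first_line);
""")

# Made-up sample data: changes 1 and 2 touch overlapping lines of f-001.
conn.executemany("INSERT INTO scope VALUES (?, ?, ?, ?)", [
    (1, "f-001", 10, 20),
    (2, "f-001", 18, 30),
    (3, "f-001", 40, 50),
    (4, "f-002", 10, 20),
])


def overlapping_scopes(conn):
    """Return (change_a, change_b, file_id, first, last) for each overlap."""
    return conn.execute("""
        SELECT a.change_id, b.change_id, a.file_id,
               MAX(a.first_line, b.first_line),
               MIN(a.last_line, b.last_line)
        FROM scope AS a
        JOIN scope AS b
          ON a.file_id = b.file_id
         AND a.change_id < b.change_id
         AND a.first_line <= b.last_line
         AND b.first_line <= a.last_line
        ORDER BY a.change_id, b.change_id
    """).fetchall()


print(overlapping_scopes(conn))  # [(1, 2, 'f-001', 18, 20)]
```

The rows returned by overlapping_scopes() are the overlaps that would have
to be recorded exactly, one way or another.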

> > Drifting further in that direction, the question arises of how much 
> > filesystem structure we want to support in the scm.  Do we want to support 
> > symlinks?  I think we do.  Hard links?  Good question.  Device nodes?  Hmm.
> > If we support all of the above, then what we have is more general than a 
> > source code versioning system, it's actually a versioning filesystem.  That's 
> > something to think about.  However, right now I'll be satisfied aiming at 
> > something with more modest goals.
> 
> Rik van Riel and Larry had some thoughts about using magic FS's to do the
> job a while back... <googling> Here we go. It's kinda sketchy but some
> stuff can be had:
> 
> http://search.luky.org/linux-kernel.2001/msg25061.html

Yes, there you go.  'Obviously right'.  Except I don't want to involve the
network, that just doesn't make any sense to me.

-- 
Daniel


