[Prophesy] Improved string transformation

Daniel Phillips phillips at bonn-fries.net
Fri May 31 18:00:02 EST 2002


Today I added support for the 'move' operation to the string transform, 
roughly doubling the size of the state network.  That led me to reflect on how 
much a slight irregularity in an encoding scheme can bloat an 
implementation.  Oh well, it's still reasonably tight and efficient, and it 
is not going to grow any more in the near future, except to add more 
error checking in the transinfo function.

The idea is that transform itself will have little or no error checking.  We 
will always have run transinfo on the operation string sometime before we run 
the transform.  If a transform is to be stored in the database, we will also 
store the lengths of the input and output strings, as calculated by transinfo 
and checked against the known lengths.  Yes, this is micro-optimizing, but I 
like to keep the low-level things light and tight; it makes me feel better.
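
To make the split between the checking pass and the unchecked pass concrete, 
here is a minimal sketch.  The struct-based op list, the function signatures 
and the info struct are all assumptions made for illustration; the real 
operation string encoding is not shown in this post, and the attached 
transform.c may look quite different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical op representation, for illustration only.  The real
   encoding is a compact operation string that this post does not show. */
enum op_kind { OP_SKIP, OP_TEXT, OP_COPY };

struct op {
    enum op_kind kind;
    size_t len;         /* bytes skipped, inserted or copied */
    const char *text;   /* literal bytes, used by OP_TEXT only */
};

struct info { size_t in_len, out_len; };

/* Checking pass: walks the ops once, rejects malformed ones and reports
   how much input is consumed and how much output is produced.
   Returns 0 on success, -1 on a malformed op. */
int transinfo(const struct op *ops, size_t n, struct info *info)
{
    size_t in = 0, out = 0;
    for (size_t i = 0; i < n; i++) {
        switch (ops[i].kind) {
        case OP_SKIP:
            in += ops[i].len;
            break;
        case OP_TEXT:
            if (ops[i].text == NULL)
                return -1;
            out += ops[i].len;
            break;
        case OP_COPY:
            in += ops[i].len;
            out += ops[i].len;
            break;
        default:
            return -1;
        }
    }
    info->in_len = in;
    info->out_len = out;
    return 0;
}

/* Unchecked pass: the caller promises that transinfo succeeded and that
   the buffers match the lengths it reported. */
void transform(const struct op *ops, size_t n, const char *in, char *out)
{
    for (size_t i = 0; i < n; i++) {
        switch (ops[i].kind) {
        case OP_SKIP:
            in += ops[i].len;
            break;
        case OP_TEXT:
            memcpy(out, ops[i].text, ops[i].len);
            out += ops[i].len;
            break;
        case OP_COPY:
            memcpy(out, in, ops[i].len);
            in += ops[i].len;
            out += ops[i].len;
            break;
        }
    }
}
```

A caller would compare info.in_len against the known input length and 
allocate info.out_len bytes before calling transform, which is where the 
stored lengths come in.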

Notice how the three primitive operations skip, text and copy map onto the 
diff codes '+', '-' and ' '.  This is no accident: these are in fact the same 
thing, just more loosely expressed, in human-readable form.  Which leads to 
the observation that we can start generating transform strings without a 
whole lot of effort by converting diff files.  This is indeed something we 
want to do, even after we have code for generating transform strings directly.
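
As a rough sketch of that conversion, the function below classifies the 
lines of one diff hunk body by their leading character and merges adjacent 
lines of the same kind into a single op.  The pairing used here (skip for 
'-', text for '+', copy for ' ') is inferred from the operation names, and 
the op struct and the name hunk_to_ops are hypothetical.  Only lengths are 
recorded; a real converter would also carry the literal bytes for text ops.

```c
#include <stddef.h>
#include <string.h>

enum op_kind { OP_SKIP, OP_TEXT, OP_COPY };

/* Hypothetical op record: only the kind and a byte count. */
struct op {
    enum op_kind kind;
    size_t len;     /* bytes covered, counting one newline per line */
};

/* Convert the body lines of one hunk (no file headers, no "@@" line,
   no trailing newlines) into ops.  Returns the number of ops written,
   or -1 on an unrecognized line or when ops[] is too small. */
int hunk_to_ops(const char *const *lines, size_t nlines,
                struct op *ops, size_t max_ops)
{
    size_t n = 0;
    for (size_t i = 0; i < nlines; i++) {
        enum op_kind kind;
        switch (lines[i][0]) {
        case '-': kind = OP_SKIP; break;   /* present only in the old text */
        case '+': kind = OP_TEXT; break;   /* present only in the new text */
        case ' ': kind = OP_COPY; break;   /* unchanged context */
        default:  return -1;
        }
        size_t len = strlen(lines[i] + 1) + 1;   /* payload plus newline */
        if (n > 0 && ops[n - 1].kind == kind) {
            ops[n - 1].len += len;               /* extend the current run */
        } else {
            if (n == max_ops)
                return -1;
            ops[n].kind = kind;
            ops[n].len = len;
            n++;
        }
    }
    return (int)n;
}
```

Splitting a diff file into hunks and handling its header lines is left 
out here; that is the part a generated parser would take care of.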

On the general theme of using the power tools available, I'm thinking about 
generating a Bison parser to parse diff files into transforms, and doing the 
job properly.

Now that move is done, the transform engine is pretty much complete.  Move
wasn't too hard to implement, but it will be a little tricky to generate code 
for.  That's ok, the transform generator can stick with the three simple 
operations as long as it likes, since the move operation is nothing more than 
a space-saving optimization.  This is on the theme of forward compatibility.  
In this case we can upgrade the transform generator at any time, and new 
databases will still be handled by earlier versions of the system.  This is a 
nice result when you can get it.
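
To illustrate why move is only an optimization, the sketch below relocates 
a block using nothing but steps that correspond to the three simple 
operations.  The semantics assumed for move (shift a block forward so that 
it lands just before a later offset) and the name move_forward are guesses 
for the sake of the example; the post does not spell out how move is 
encoded or what its operands are.

```c
#include <stddef.h>
#include <string.h>

/* Assumed semantics: relocate in[from .. from+len) so that it lands just
   before the byte originally at offset `to`.  Preconditions, which are
   not checked here: from + len <= to <= n, and out holds n bytes. */
void move_forward(const char *in, size_t n,
                  size_t from, size_t len, size_t to, char *out)
{
    const char *moved = in + from;      /* would be the text op's literal */
    size_t gap = to - (from + len);     /* bytes between block and target */

    memcpy(out, in, from);              /* copy `from` bytes */
    out += from;
                                        /* skip `len` bytes (no output) */
    memcpy(out, in + from + len, gap);  /* copy `gap` bytes */
    out += gap;
    memcpy(out, moved, len);            /* text: `len` literal bytes */
    out += len;
    memcpy(out, in + to, n - to);       /* copy the rest */
}
```

With only the simple operations, the moved bytes have to be stored as a 
literal inside the transform, whereas a move operation can refer to them 
in the input instead, and that is where the space saving comes from.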

The attached code demonstrates the move operation in action.

--
Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: transform.c
Type: text/x-c
Size: 2248 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/prophesy/attachments/20020531/df960684/attachment.bin>

