[Prophesy] Diff to transform converter

Daniel Phillips phillips at bonn-fries.net
Sun Jun 9 23:07:17 EST 2002


On Friday 07 June 2002 16:17, Daniel Phillips wrote:
> There needs to be more error checking.  As it stands, this code should 
> perform its job correctly, on the assumption that the diff text is always 
> correct.  It would of course be foolish to assume this.  Some of the 
> redundant information in the diff can be used for a crosscheck:
> 
>   - The number of copied and skipped lines in each chunk should match
>     the chunk's specified input line count
> 
>   - The number of copied and added lines in each chunk should match the
>     specified chunk's output line count.
> 
>   - The current output line should be tracked and checked against the
>     chunk's output line number
> 
>   - Copied and skipped text in the diff should be checked to ensure it
>     matches the corresponding input text
> 
>   - Added text could possibly be checked against the original target
>     text, but since the target text is not required for any other
>     purpose, it makes more sense just to test-apply the generated
>     transform to the input text, then ensure it matches the original
>     target text

A couple of items to round out the list:

    - The input line number of each chunk should be monotonically
      increasing, and the input chunks should not overlap

    - The output line number of each chunk should be monotonically
      increasing, and the output chunks should not overlap
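
The count, ordering and overlap checks listed above could be sketched roughly as follows. This is only an illustration in Python, not the converter's actual code: the names parse_chunks and check_chunks are hypothetical, and it assumes a single-file unified diff with the usual "@@ -start,count +start,count @@" chunk headers.

```python
import re

# Assumed header form: @@ -in_start[,in_count] +out_start[,out_count] @@
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_chunks(diff_lines):
    """Collect [in_start, in_count, out_start, out_count, ops] per chunk.

    ops is a string holding one of ' ', '-', '+' per body line
    (copy, skip, add).  Lines before the first chunk header are ignored.
    """
    chunks = []
    for line in diff_lines:
        m = HUNK_RE.match(line)
        if m:
            # An omitted count means 1 in the unified format.
            nums = [int(g) if g is not None else 1 for g in m.groups()]
            chunks.append(nums + [""])
        elif chunks and line[:1] in (" ", "-", "+"):
            chunks[-1][4] += line[0]
    return chunks

def check_chunks(chunks):
    """Return a list of error strings; an empty list means all checks passed."""
    errors = []
    in_end = out_end = 0   # last line covered by the previous chunk
    delta = 0              # output line minus input line so far
    for n, (in_start, in_count, out_start, out_count, ops) in enumerate(chunks, 1):
        copied, skipped, added = ops.count(" "), ops.count("-"), ops.count("+")
        if copied + skipped != in_count:
            errors.append(f"chunk {n}: input line count mismatch")
        if copied + added != out_count:
            errors.append(f"chunk {n}: output line count mismatch")
        # With a count of 0 the header names the line *before* the chunk.
        in_first = in_start if in_count else in_start + 1
        out_first = out_start if out_count else out_start + 1
        if in_first <= in_end:
            errors.append(f"chunk {n}: input range overlaps or goes backwards")
        if out_first <= out_end:
            errors.append(f"chunk {n}: output range overlaps or goes backwards")
        if out_first != in_first + delta:
            errors.append(f"chunk {n}: output line number does not track input")
        in_end = in_first + in_count - 1
        out_end = out_first + out_count - 1
        delta += added - skipped
    return errors
```

Checking the copied and skipped text against the input file is not covered here; that needs the input lines as well as the diff.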

Under the category of 'further work', special attention needs to be paid to 
the possibility that the final line is not terminated by an end-of-line 
character, in any combination of:

  - the input file
  - the output file
  - the diff file

Diff uses some bizarre syntax for indicating the absence of end-of-line in 
certain circumstances.  It doesn't seem to be documented (the unified diff 
format itself is only loosely documented, in a BSD man page) and I have not 
taken the time to reverse engineer it.  It seems to involve a line beginning 
with a \ character just after a + line, with a comment to the effect that an 
end of line is missing in one of the files.  Yuck.

If an end-of-line is missing in a diff file, it's probably fair to treat it 
as a syntax error.  If it is missing in the input or output file, then we have 
to watch out for the (crude) diff syntax that indicates this and process it to 
produce the correct transform.  I believe this only affects the final 
operation, and then only certain of the three basic operations.
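
One possible way to process the marker, sketched in Python. It assumes the behaviour seen in GNU diff output, where a line beginning with a backslash (typically "\ No newline at end of file") directly follows the copied, skipped or added line that lacks its end-of-line. The function name chunk_ops and the (op, text) representation are hypothetical, chosen only for illustration.

```python
def chunk_ops(body_lines):
    """Turn the body lines of one chunk into (op, text) pairs.

    op is ' ' (copy), '-' (skip) or '+' (add).  text carries a trailing
    newline unless a backslash marker line follows it in the diff.
    """
    ops = []
    for line in body_lines:
        line = line.rstrip("\n")
        if line.startswith("\\"):
            # Marker: the previous operation's text has no end-of-line.
            if not ops or not ops[-1][1].endswith("\n"):
                raise ValueError("misplaced no-newline marker")
            op, text = ops[-1]
            ops[-1] = (op, text[:-1])
        elif line[:1] in (" ", "-", "+"):
            ops.append((line[0], line[1:] + "\n"))
        else:
            raise ValueError(f"unexpected diff line: {line!r}")
    return ops
```

Under that assumption, a marker after a skipped line refers to the input file, after an added line to the output file, and after a copied line to both.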

-- 
Daniel


