[PATCH] pwclient: fix handling of UTF-8 char in submitter name
Jeremy Kerr
jk at ozlabs.org
Mon Dec 13 11:58:19 EST 2010
Hi Mauro,
> I have never used it, nor am I a python expert, but it seems that django
> defines a class of lazy utf decoders that won't cause python to crash on a
> string that does not follow the proper encoding:
> http://docs.djangoproject.com/en/dev/ref/unicode/
>
> I had one interesting case of a patch with a driver from staging being
> changed/moved to another place, with a string inside using a non-UTF-8
> encoding. Patchwork simply discarded this patch. I only noticed it because
> this was patch 6 of a sequence of patches, so I went to the ML to double
> check what was missing.
The parser (and pwclient) need to be fairly independent of django, as they're
both intended to be run on machines with a fairly minimal python environment.
However, the unicode decoder has a 'replace' mode, where invalid byte
sequences are replaced with U+FFFD REPLACEMENT CHARACTER:
  '\x80'.decode('utf-8', 'replace') == u'\ufffd'
The reason that I don't do this currently is that patchwork would now be
altering your patches to something that the author didn't write. If you were
to apply the resulting patch, you would be introducing the U+FFFD character to
your source tree.
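A minimal Python 3 sketch of that trade-off (the sample bytes are made up
for illustration and are not taken from the patch in question): strict
decoding rejects the input, 'replace' mode accepts it, but the re-encoded
result no longer matches what the author sent.

```python
# Hypothetical patch line containing a Latin-1 encoded e-acute (0xE9),
# which is not a valid UTF-8 sequence on its own.
raw = b'printk("caf\xe9");\n'

# Strict decoding (the default) raises on the invalid byte.
try:
    raw.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# 'replace' mode substitutes U+FFFD for the invalid byte instead.
text = raw.decode('utf-8', 'replace')

# Re-encoding writes U+FFFD as three bytes (EF BF BD), so the content
# differs from the original patch.
roundtrip = text.encode('utf-8')
```

Here roundtrip != raw, which is the alteration described above.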
However, dropping patches isn't a great solution either, so other alternatives
are welcome :)
Cheers,
Jeremy