[PATCH] pwclient: fix handling of UTF-8 char in submitter name

Mauro Carvalho Chehab mchehab at redhat.com
Fri Dec 10 21:16:50 EST 2010


On 10-12-2010 05:10, Jeremy Kerr wrote:
> Hi Wolfgang,
> 
>>> I think a number of messages have special characters in iso8859-1 and
>>> iso8859-9.
>>>
>>> An option would not really help. I'm running into this when
>>> auto-updating the status of some patches using a script similar to
>>> what you just posted.
>>
>> OK, sounds like we need the parser to be able to take an mbox and read the
>> encoding from the headers then.
> 
> Hm, but these aren't mbox messages, right? (they'd be git commits).
> 
> I think in this case, we have to assume some encoding, as it isn't specified 
> in any metadata. Autodetection is just going to cause pain.

I have never used it, and I am not a Python expert, but it seems that Django defines a class
of lazy UTF decoders that won't cause Python to crash on a string that does
not follow the expected encoding:
	http://docs.djangoproject.com/en/dev/ref/unicode/
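
As a rough illustration of what I mean by a decoder that does not crash, here
is a minimal sketch using only the Python standard library. It is not Django's
helper and not Patchwork code; decode_tolerant is a hypothetical name, and
UTF-8 as the assumed encoding is my assumption:

```python
def decode_tolerant(raw, encoding="utf-8"):
    """Decode bytes without raising on malformed input.

    Hypothetical helper for illustration; not part of Patchwork or Django.
    """
    try:
        # Fast path: the bytes really are in the assumed encoding.
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Malformed bytes become U+FFFD instead of aborting the import.
        return raw.decode(encoding, errors="replace")


# A name encoded as ISO 8859-1 is not valid UTF-8, but decoding still succeeds.
name = "Jos\u00e9".encode("iso8859-1")
print(decode_tolerant(name))
```

The offending character is lost, but the rest of the string survives, which
seems better than dropping the whole patch.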

I had one interesting case of a patch in which a driver from staging was changed and moved
to another place, with a string inside using a non-UTF-8 encoding. Patchwork simply discarded
this patch. I only noticed because it was patch 6 of a series of patches,
so I went to the ML to double-check what was missing.

Patchwork should be reliable enough to just import a patch, even if python dislikes
the charset.

Cheers,
Mauro

