[PATCH] pwclient: fix handling of UTF-8 char in submitter name

Fri May 27 00:02:15 EST 2011

hey there

i am top-posting because this thread is very old and my problem is "only"
strongly related to this discussion.

i tried to "pwclient apply <id>" this patch:
http://patchwork.coreboot.org/patch/2997/

pwclient bails with:
> pwclient apply 2997
> Applying patch #2997 to current directory
> Description: ichspi: fix unused FREG detection
> Traceback (most recent call last):
>   File "/home/ameno/bin/pwclient", line 463, in <module>
>     main()
>   File "/home/ameno/bin/pwclient", line 446, in main
>     action_apply(rpc, patch_id)
>   File "/home/ameno/bin/pwclient", line 263, in action_apply
>     proc.communicate(s)
>   File "/usr/lib/python2.6/subprocess.py", line 680, in communicate
>     self.stdin.write(input)

obviously it does not like the '•' inside the mail... which is
even more unfortunate than the original problem described in this
thread because it is not even part of the patch itself.

changing line 263 by adding
'.encode("utf-8")'
resulting in
'        proc.communicate(s.encode("utf-8"))'
fixes this problem, but probably with the side effects mentioned in this thread.

appending the encode call to line 259 would probably "solve" the
problem for the patch name/subject btw.
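
for reference, a minimal sketch of the workaround described above, written
for Python 3 rather than the Python 2.6 from the traceback. the helper name
pipe_text and the echoing child process (a stand-in for the real patch
command) are made up for illustration and are not part of pwclient; the only
idea carried over is encoding the text to bytes before handing it to
communicate().

```python
import subprocess
import sys


def pipe_text(cmd, text, encoding="utf-8"):
    """Feed text to cmd's stdin as encoded bytes.

    Returns (returncode, stdout_bytes). Hypothetical helper, not pwclient code.
    """
    # A pipe carries bytes, so text has to be encoded explicitly;
    # data that is already bytes is passed through untouched.
    if not isinstance(text, bytes):
        text = text.encode(encoding)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(text)
    return proc.returncode, out


# Stand-in for the patch command: a child that copies stdin to stdout.
ECHO = [sys.executable, "-c",
        "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]

if __name__ == "__main__":
    rc, out = pipe_text(ECHO, "mail body with a bullet: \u2022\n")
    print(rc, out)
```

note that this only moves the bytes through the pipe unchanged when the mail
really was UTF-8 to begin with; it does nothing about the question of what
charset the patch was written in.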

i have not and will not follow the development of pwclient, but would
be happy to receive any replies via cc, thanks.

> Em 12-12-2010 22:58, Jeremy Kerr escreveu:
> > Hi Mauro,
> > 
> >> I never used it, nor am I a python expert, but it seems that django defines
> >> a class of lazy utf decoders that won't cause python to crash due to a
> >> string that does not follow the proper encoding:
> >> 	http://docs.djangoproject.com/en/dev/ref/unicode/
> >>
> >> I had one interesting case of a patch with a driver from staging being
> >> changed/moved to another place, with a string inside using a non-utf8.
> >> Patchwork simply discarded this patch. I only noticed it because this was
> >> patch 6 of a sequence of patches, so I went to the ML to double check what
> >> was missing.
> > 
> > The parser (and pwclient) need to be fairly independent of django, as they're 
> > both intended to be run on machines with a fairly minimal python environment.
> 
> I don't see much problem for the parser, as it runs on a server, but I agree
> that a lighter environment at the client side is interesting. Yet, it is better to
> install some additional python packages locally than to lose patches.
>  
> > 
> > However, the unicode decoder has a 'replace'-mode, where invalid byte 
> > sequences are replaced with U+FFFD REPLACEMENT CHARACTER:
> > 
> >   '\x80'.decode('utf-8', 'replace') == u'\ufffd'
> 
> Interesting.
>  
> > The reason that I don't do this currently is that patchwork would now be 
> > altering your patches to something that the author didn't write. If you were 
> > to apply the resulting patch, you would be introducing the U+FFFD character to 
> > your source tree.
> > 
> > However, dropping patches isn't a great solution either, so other alternatives 
> > welcome :)
> 
> Would it be possible to handle the error at decode with "try"? If so, maybe you could
> add some logic there to try to decode first with the email charset. Then, try utf-8.
> If both fail, try to decode with some other charsets, like iso8859-11. This will
> likely catch 99% of the issues. If everything fails, it is better to use the
> replacement character than to lose the patch.
> 
> I would also add a meta-tag to indicate the cases where patchwork is guessing a
> type (or using a replacement character). This way, the maintainer may manually 
> take care of the fixes.
> 
> Cheers,
> Mauro
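
the fallback chain suggested in the quoted mail could look roughly like the
sketch below (Python 3). the function name decode_with_fallback, the default
fallback list and the "guessed:"/"replaced" note strings are assumptions made
for illustration; none of this is taken from the patchwork parser.

```python
def decode_with_fallback(raw, declared_charset=None, fallbacks=("utf-8",)):
    """Decode raw bytes, trying the declared charset first.

    Returns (text, note). note is None when the declared charset worked,
    "guessed:<charset>" when a fallback was used, and "replaced" when
    nothing fit and U+FFFD was substituted for the invalid bytes.
    """
    candidates = []
    if declared_charset:
        candidates.append(declared_charset)
    candidates.extend(c for c in fallbacks if c not in candidates)

    for charset in candidates:
        try:
            text = raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes do not fit this charset.
            # LookupError: the charset name is unknown to Python.
            continue
        note = None if charset == declared_charset else "guessed:" + charset
        return text, note

    # Last resort: keep the patch, but mark that it was altered.
    return raw.decode("utf-8", "replace"), "replaced"


if __name__ == "__main__":
    print(decode_with_fallback("Gr\u00fc\u00dfe".encode("utf-8"), "ascii"))
    print(decode_with_fallback(b"\x80", "ascii"))
```

one caveat: single-byte charsets such as iso-8859-1 accept any byte sequence,
so once one of them is in the fallback list the replacement branch is never
reached and a wrong guess goes unnoticed apart from the note.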

-- 
Kind regards/Mit freundlichen Grüßen, Stefan Tauner