[RFC PATCH] pwclient: Force xmlrpc client to return unicode strings

Stephen Finucane stephen at that.guru
Tue May 23 20:08:43 AEST 2017


On Mon, 2017-05-22 at 11:37 +0200, Robin Jarry wrote:
> On python 2, the reference implementation of the XML-RPC unmarshaller
> decodes strings to unicode with the selected encoding (utf-8 by
> default) but it tries to re-encode the unicode strings to ascii bytes
> before returning the values. If it fails, it leaves the value as
> unicode.
> 
> See these links for more details:
> 
>     https://hg.python.org/cpython/file/2.7/Lib/xmlrpclib.py#l878
>     https://hg.python.org/cpython/file/2.7/Lib/xmlrpclib.py#l180
> 
>     https://hg.python.org/cpython/file/3.6/Lib/xmlrpc/client.py#l753
> 
> Override the Transport.getparser() method to return a subclass of
> unmarshaller that does not re-encode to ascii but preserves the
> unicode strings intact. This gives similar behaviour in both
> python 2 and python 3.
> 
> Signed-off-by: Robin Jarry <robin.jarry at 6wind.com>
> ---
>  patchwork/bin/pwclient | 42 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/patchwork/bin/pwclient b/patchwork/bin/pwclient
> index 5fcb0844b923..d3e3d9cde134 100755
> --- a/patchwork/bin/pwclient
> +++ b/patchwork/bin/pwclient
> @@ -97,6 +97,28 @@ class Filter(object):
>          return str(self.d)
>  
>  
> +if xmlrpclib.FastUnmarshaller is not None:
> +    Unmarshaller = xmlrpclib.FastUnmarshaller
> +else:
> +    Unmarshaller = xmlrpclib.Unmarshaller
> +
> +class UnicodeUnmarshaller(Unmarshaller):
> +
> +    dispatch = Unmarshaller.dispatch.copy()
> +
> +    def end_string(self, data):
> +        if self._encoding:
> +            data = xmlrpclib._decode(data, self._encoding)
> +        # the python 2.7 reference implementation tries to re-encode to
> +        # ascii bytes here but leaves unicode if it fails, do not try to
> +        # re-encode to ascii byte string

Thanks for doing this, Robin :) The patch looks pretty good as-is.
However, based on what you've said, all this hassle traces back to the
'_stringify' function [1]. I wonder if we could simplify things by
merely monkey patching that function instead? Something like the below
_could_ do the job, I'd imagine?

    if sys.version_info[0] < 3:
        def _stringify(string):
            return string

        xmlrpclib._stringify = _stringify
 
Other than the fact that we're messing with private methods (we're
consenting adults here), does this look sensible? Any thoughts on how
this compares?
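
To make the difference concrete, here is a rough, self-contained sketch
(not the actual xmlrpclib code). 'py2_stringify' is a hypothetical
stand-in that mimics the re-encode-to-ascii behaviour described above,
and 'identity_stringify' mirrors the proposed replacement. Both names
are made up for this illustration.

```python
# Hypothetical stand-in for the Python 2.7 behaviour described above:
# try to re-encode to ASCII bytes, fall back to the unicode string.
def py2_stringify(text):
    try:
        return text.encode("ascii")
    except UnicodeError:
        return text


# Mirrors the proposed monkey patch: hand the decoded string back as-is.
def identity_stringify(text):
    return text


values = [u"patch", u"caf\xe9"]

# With the re-encoding step, the result type depends on the content.
mixed = [type(py2_stringify(v)).__name__ for v in values]

# With the identity version, every value keeps the same (unicode) type.
uniform = [type(identity_stringify(v)).__name__ for v in values]

print(mixed)
print(uniform)
```

The point is only that the first variant yields two different types for
the two inputs, whereas the second always yields one, which is what the
client code would prefer to rely on.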

Cheers,
Stephen

[1] https://github.com/python/cpython/blob/2.7/Lib/xmlrpclib.py#L181-L186

PS: I noticed this didn't get picked up by Patchwork. I wonder why?

> +        self.append(data)
> +        self._value = 0
> +
> +    dispatch['string'] = end_string
> +    dispatch['name'] = end_string
> +
> +
>  class Transport(xmlrpclib.SafeTransport):
>  
>      def __init__(self, url):
> @@ -132,6 +154,26 @@ class Transport(xmlrpclib.SafeTransport):
>              handler = '%s://%s%s' % (self.scheme, self.host, handler)
>              xmlrpclib.Transport.send_request(self, connection, handler,
>                                               request_body)
> +        def getparser(self):
> +            # copied from Python 2.7 Lib/xmlrpclib.py to support our custom
> +            # UnicodeUnmarshaller
> +            if xmlrpclib.FastParser and xmlrpclib.FastUnmarshaller:
> +                if self._use_datetime:
> +                    mkdatetime = xmlrpclib._datetime_type
> +                else:
> +                    mkdatetime = xmlrpclib._datetime
> +                target = UnicodeUnmarshaller(True, False, xmlrpclib._binary,
> +                                             mkdatetime, xmlrpclib.Fault)
> +                parser = xmlrpclib.FastParser(target)
> +            else:
> +                target = UnicodeUnmarshaller(use_datetime=self._use_datetime)
> +                if xmlrpclib.FastParser:
> +                    parser = xmlrpclib.FastParser(target)
> +                elif xmlrpclib.ExpatParser:
> +                    parser = xmlrpclib.ExpatParser(target)
> +                else:
> +                    parser = xmlrpclib.SlowParser(target)
> +            return parser, target
>      else:  # Python 3
>          def send_request(self, host, handler, request_body, debug):
>              handler = '%s://%s%s' % (self.scheme, host, handler)


