[PATCH v2] pwclient: Fix silent crash on python 2

Stephen Finucane stephen at that.guru
Sat Apr 29 07:46:03 AEST 2017


On Wed, 2017-04-05 at 14:46 +0200, Robin Jarry wrote:
> Replacing sys.stdout and sys.stderr can cause obscure crashes when
> trying to write non unicode data. The interpreter is terminated with
> SIGINT without any specific error writen on the console.
> 
>   rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f964820e8d0},
>   {0x559f50, [], SA_RESTORER, 0x7f964820e8d0}, 8) = 0
> 
> This happens easily when there is an untrapped exception which should
> lead to printing a traceback on stderr.
> 
> The only way to prevent UnicodeEncodeErrors is to make sure that
> PYTHONIOENCODING is set with the ':replace' suffix and this can only
> be
> done *before* starting the interpreter as the initialization is made
> very early on and the encoding cannot be set or modified after.
> 
> From the official documentation:
> 
>   PYTHONIOENCODING
> 
>   Overrides the encoding used for stdin/stdout/stderr, in the syntax
>   encodingname:errorhandler. The :errorhandler part is optional and
>   has the same meaning as in str.encode().
> 
>   https://docs.python.org/2/using/cmdline.html
>   https://docs.python.org/3/using/cmdline.html
> 
> Of course, for proper encoding of unicode characters, one of the
> locale-related environment variables (LC_ALL, LANG, LANGUAGE, etc.)
> must
> be set. Python will use the correct encoding accordingly and
> PYTHONIOENCODING will be set to "encoding:replace".
> 
> Examples:
> 
>   $ grep utf8 ~/.Xresources
>   xterm*utf8: 2
> 
>   $ env - PYTHONIOENCODING=utf-8:replace python2 -c "print
> u's\u00e9duisante'"
>   séduisante
>   $ env - PYTHONIOENCODING=utf-8:replace python3 -c
> "print('s\u00e9duisante')"
>   séduisante
> 
>   $ env - PYTHONIOENCODING=ascii:replace python2 -c "print
> u's\u00e9duisante'"
>   s?duisante
>   $ env - PYTHONIOENCODING=ascii:replace python3 -c
> "print('s\u00e9duisante')"
>   s?duisante
> 
>   $ env - PYTHONIOENCODING=ISO-8859-1:replace python2 -c "print
> u's\u00e9duisante'"
>   s�duisante
>   $ env - PYTHONIOENCODING=ISO-8859-1:replace python3 -c
> "print('s\u00e9duisante')"
>   s�duisante
> 
> Fixes: 046419a3bf8f ("pwclient: Fix encoding problems")
> Signed-off-by: Robin Jarry <robin.jarry at 6wind.com>

I understand the issue now - thanks for clarifying :) Sorry for the
delay in replying also.

Now, while I understand what you're doing here, I'm wondering if we
could simply revert the original patch instead. I clearly merged that
without fully understanding the issue and not only does it cause the
issues you've trying to resolve here but it seems it's not necessary
for a lot of people (including me. Fedora 24). For those users that are
affected, there are less hacky workarounds. For example, a simple
script like so would so the job, I imagine.

    $ cat pwclient2
    #!/usr/bin/env bash
    PYTHONIOENCODING=UTF-8 pwclient

I'd be happy to include an epilog detailing this issue that would
appear when you'd run '--help'. I realize that were still be an extra
step for affected users, but the vast majority aren't and wouldn't be
affected.

Would this be acceptable?

Stephen

PS: What kind of environment are you encountering this issue in?

> ---
> v2:
> 
> - Always set PYTHONIOENCODING=<enc>:replace (on python 2 and 3) to
> prevent from
>   UnicodeEncodeErrors
> 
>  patchwork/bin/pwclient | 35 ++++++++++++++++++++++++++---------
>  1 file changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/patchwork/bin/pwclient b/patchwork/bin/pwclient
> index ed0351bf5288..5a7b6723afe3 100755
> --- a/patchwork/bin/pwclient
> +++ b/patchwork/bin/pwclient
> @@ -41,16 +41,7 @@ except ImportError:
>  import shutil
>  import re
>  import io
> -import locale
>  
> -if sys.version_info.major == 2:
> -    # hack to make writing unicode to standard output/error work on
> Python 2
> -    OUT_ENCODING = (sys.stdout.encoding or
> locale.getpreferredencoding() or
> -                    os.getenv('PYTHONIOENCODING', 'utf-8'))
> -    sys.stdout = io.open(sys.stdout.fileno(), mode='w',
> -                         encoding=OUT_ENCODING, errors='replace')
> -    sys.stderr = io.open(sys.stderr.fileno(), mode='w',
> -                         encoding=OUT_ENCODING, errors='replace')
>  
>  # Default Patchwork remote XML-RPC server URL
>  # This script will check the PW_XMLRPC_URL environment variable
> @@ -821,5 +812,31 @@ def main():
>          sys.exit(1)
>  
>  
> +def force_io_encoding():
> +    """
> +    Force PYTHONIOENCODING ":errorhandler" to avoid
> UnicodeEncodeErrors. The
> +    only way to do it is to set the environment variable *before*
> starting the
> +    interpreter. From the python docs:
> +
> +      PYTHONIOENCODING
> +
> +      Overrides the encoding used for stdin/stdout/stderr, in the
> syntax
> +      encodingname:errorhandler. The :errorhandler part is optional
> and has the
> +      same meaning as in str.encode().
> +
> +    Note that this only prevents interpreter crashes, it does not
> exempt from
> +    correctly setting the LANG or LC_ALL variables in order to have
> valid
> +    output.
> +    """
> +    if 'PYTHONIOENCODING' in os.environ:
> +        return
> +
> +    encoding = sys.stdout.encoding or 'utf-8'
> +
> +    os.environ['PYTHONIOENCODING'] = encoding + ':replace'
> +    os.execvp(sys.executable, [sys.executable] + sys.argv)  # no
> return
> +
> +
>  if __name__ == "__main__":
> +    force_io_encoding()
>      main()



More information about the Patchwork mailing list