[PATCH v2] pwclient: Fix silent crash on python 2
Stephen Finucane
stephen at that.guru
Sat Apr 29 07:46:03 AEST 2017
On Wed, 2017-04-05 at 14:46 +0200, Robin Jarry wrote:
> Replacing sys.stdout and sys.stderr can cause obscure crashes when
> trying to write non unicode data. The interpreter is terminated with
> SIGINT without any specific error writen on the console.
>
> rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f964820e8d0},
> {0x559f50, [], SA_RESTORER, 0x7f964820e8d0}, 8) = 0
>
> This happens easily when there is an untrapped exception which should
> lead to printing a traceback on stderr.
>
> The only way to prevent UnicodeEncodeErrors is to make sure that
> PYTHONIOENCODING is set with the ':replace' suffix and this can only
> be
> done *before* starting the interpreter as the initialization is made
> very early on and the encoding cannot be set or modified after.
>
> From the official documentation:
>
> PYTHONIOENCODING
>
> Overrides the encoding used for stdin/stdout/stderr, in the syntax
> encodingname:errorhandler. The :errorhandler part is optional and
> has the same meaning as in str.encode().
>
> https://docs.python.org/2/using/cmdline.html
> https://docs.python.org/3/using/cmdline.html
>
> Of course, for proper encoding of unicode characters, one of the
> locale-related environment variables (LC_ALL, LANG, LANGUAGE, etc.)
> must
> be set. Python will use the correct encoding accordingly and
> PYTHONIOENCODING will be set to "encoding:replace".
>
> Examples:
>
> $ grep utf8 ~/.Xresources
> xterm*utf8: 2
>
> $ env - PYTHONIOENCODING=utf-8:replace python2 -c "print
> u's\u00e9duisante'"
> séduisante
> $ env - PYTHONIOENCODING=utf-8:replace python3 -c
> "print('s\u00e9duisante')"
> séduisante
>
> $ env - PYTHONIOENCODING=ascii:replace python2 -c "print
> u's\u00e9duisante'"
> s?duisante
> $ env - PYTHONIOENCODING=ascii:replace python3 -c
> "print('s\u00e9duisante')"
> s?duisante
>
> $ env - PYTHONIOENCODING=ISO-8859-1:replace python2 -c "print
> u's\u00e9duisante'"
> s�duisante
> $ env - PYTHONIOENCODING=ISO-8859-1:replace python3 -c
> "print('s\u00e9duisante')"
> s�duisante
>
> Fixes: 046419a3bf8f ("pwclient: Fix encoding problems")
> Signed-off-by: Robin Jarry <robin.jarry at 6wind.com>
I understand the issue now - thanks for clarifying :) Sorry for the
delay in replying also.
Now, while I understand what you're doing here, I'm wondering if we
could simply revert the original patch instead. I clearly merged that
without fully understanding the issue and not only does it cause the
issues you've trying to resolve here but it seems it's not necessary
for a lot of people (including me. Fedora 24). For those users that are
affected, there are less hacky workarounds. For example, a simple
script like so would so the job, I imagine.
$ cat pwclient2
#!/usr/bin/env bash
PYTHONIOENCODING=UTF-8 pwclient
I'd be happy to include an epilog detailing this issue that would
appear when you'd run '--help'. I realize that were still be an extra
step for affected users, but the vast majority aren't and wouldn't be
affected.
Would this be acceptable?
Stephen
PS: What kind of environment are you encountering this issue in?
> ---
> v2:
>
> - Always set PYTHONIOENCODING=<enc>:replace (on python 2 and 3) to
> prevent from
> UnicodeEncodeErrors
>
> patchwork/bin/pwclient | 35 ++++++++++++++++++++++++++---------
> 1 file changed, 26 insertions(+), 9 deletions(-)
>
> diff --git a/patchwork/bin/pwclient b/patchwork/bin/pwclient
> index ed0351bf5288..5a7b6723afe3 100755
> --- a/patchwork/bin/pwclient
> +++ b/patchwork/bin/pwclient
> @@ -41,16 +41,7 @@ except ImportError:
> import shutil
> import re
> import io
> -import locale
>
> -if sys.version_info.major == 2:
> - # hack to make writing unicode to standard output/error work on
> Python 2
> - OUT_ENCODING = (sys.stdout.encoding or
> locale.getpreferredencoding() or
> - os.getenv('PYTHONIOENCODING', 'utf-8'))
> - sys.stdout = io.open(sys.stdout.fileno(), mode='w',
> - encoding=OUT_ENCODING, errors='replace')
> - sys.stderr = io.open(sys.stderr.fileno(), mode='w',
> - encoding=OUT_ENCODING, errors='replace')
>
> # Default Patchwork remote XML-RPC server URL
> # This script will check the PW_XMLRPC_URL environment variable
> @@ -821,5 +812,31 @@ def main():
> sys.exit(1)
>
>
> +def force_io_encoding():
> + """
> + Force PYTHONIOENCODING ":errorhandler" to avoid
> UnicodeEncodeErrors. The
> + only way to do it is to set the environment variable *before*
> starting the
> + interpreter. From the python docs:
> +
> + PYTHONIOENCODING
> +
> + Overrides the encoding used for stdin/stdout/stderr, in the
> syntax
> + encodingname:errorhandler. The :errorhandler part is optional
> and has the
> + same meaning as in str.encode().
> +
> + Note that this only prevents interpreter crashes, it does not
> exempt from
> + correctly setting the LANG or LC_ALL variables in order to have
> valid
> + output.
> + """
> + if 'PYTHONIOENCODING' in os.environ:
> + return
> +
> + encoding = sys.stdout.encoding or 'utf-8'
> +
> + os.environ['PYTHONIOENCODING'] = encoding + ':replace'
> + os.execvp(sys.executable, [sys.executable] + sys.argv) # no
> return
> +
> +
> if __name__ == "__main__":
> + force_io_encoding()
> main()
More information about the Patchwork
mailing list