[PATCH v2] pwclient: Fix silent crash on python 2

Robin Jarry robin.jarry at 6wind.com
Wed Apr 5 22:46:04 AEST 2017


Replacing sys.stdout and sys.stderr can cause obscure crashes when
trying to write non-unicode data. The interpreter is terminated with
SIGINT without any specific error written on the console.

  rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f964820e8d0},
  {0x559f50, [], SA_RESTORER, 0x7f964820e8d0}, 8) = 0

This happens easily when there is an untrapped exception which should
lead to printing a traceback on stderr.

The only way to prevent UnicodeEncodeErrors is to make sure that
PYTHONIOENCODING is set with the ':replace' suffix. This can only be
done *before* starting the interpreter: the standard streams are
initialized very early on and their encoding cannot be changed after.
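
A minimal sketch of why the variable has to be in place before startup,
assuming Python 3 and that sys.executable points at a usable interpreter.
Changing os.environ in the running process leaves the error handler of
sys.stdout untouched, while a freshly started interpreter picks it up.
The helper name child_stdout_errors is hypothetical and not part of
pwclient.

```python
import os
import subprocess
import sys


def child_stdout_errors(io_encoding):
    """Start a fresh interpreter with PYTHONIOENCODING set and return
    the error handler that its sys.stdout ended up with."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.errors)"],
        env=env,
    )
    return out.decode("ascii").strip()


# Setting the variable in the running process is too late: the standard
# streams were already configured during interpreter startup.
saved = os.environ.get("PYTHONIOENCODING")
before = getattr(sys.stdout, "errors", None)
os.environ["PYTHONIOENCODING"] = "ascii:replace"
after = getattr(sys.stdout, "errors", None)

# Put the environment back the way it was found.
if saved is None:
    del os.environ["PYTHONIOENCODING"]
else:
    os.environ["PYTHONIOENCODING"] = saved

# A new interpreter started with the variable does honour it.
in_child = child_stdout_errors("ascii:replace")
```

Here before and after are identical, whereas in_child holds the handler
requested through the environment.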

From the official documentation:

  PYTHONIOENCODING

  Overrides the encoding used for stdin/stdout/stderr, in the syntax
  encodingname:errorhandler. The :errorhandler part is optional and
  has the same meaning as in str.encode().

  https://docs.python.org/2/using/cmdline.html
  https://docs.python.org/3/using/cmdline.html
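
To illustrate the :errorhandler part, here is a small sketch (Python 3,
standard library only) comparing the default 'strict' handler with
'replace', first through str.encode() and then through a text stream
set up similarly to the ones the interpreter creates for stdout and
stderr. The write_through helper is made up for the example.

```python
import io

text = "s\u00e9duisante"

# str.encode() with 'replace' substitutes '?' for unencodable characters.
replaced = text.encode("ascii", "replace")

# With the default 'strict' handler the same call raises.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True


def write_through(encoding, errors):
    """Write `text` to an in-memory byte buffer through a text wrapper
    using the given encoding and error handler."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    stream.detach()  # keep `raw` open once the wrapper goes away
    return raw.getvalue()
```

With ("ascii", "replace") the stream receives b"s?duisante", while
("ascii", "strict") raises UnicodeEncodeError on write or flush.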

Of course, for proper encoding of unicode characters, one of the
locale-related environment variables (LC_ALL, LANG, LANGUAGE, etc.) must
be set. Python then picks the matching encoding and pwclient sets
PYTHONIOENCODING to "<encoding>:replace".

Examples:

  $ grep utf8 ~/.Xresources
  xterm*utf8: 2

  $ env - PYTHONIOENCODING=utf-8:replace python2 -c "print u's\u00e9duisante'"
  séduisante
  $ env - PYTHONIOENCODING=utf-8:replace python3 -c "print('s\u00e9duisante')"
  séduisante

  $ env - PYTHONIOENCODING=ascii:replace python2 -c "print u's\u00e9duisante'"
  s?duisante
  $ env - PYTHONIOENCODING=ascii:replace python3 -c "print('s\u00e9duisante')"
  s?duisante

  $ env - PYTHONIOENCODING=ISO-8859-1:replace python2 -c "print u's\u00e9duisante'"
  s�duisante
  $ env - PYTHONIOENCODING=ISO-8859-1:replace python3 -c "print('s\u00e9duisante')"
  s�duisante

Fixes: 046419a3bf8f ("pwclient: Fix encoding problems")
Signed-off-by: Robin Jarry <robin.jarry at 6wind.com>
---
v2:

- Always set PYTHONIOENCODING=<enc>:replace (on python 2 and 3) to prevent
  UnicodeEncodeErrors

 patchwork/bin/pwclient | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/patchwork/bin/pwclient b/patchwork/bin/pwclient
index ed0351bf5288..5a7b6723afe3 100755
--- a/patchwork/bin/pwclient
+++ b/patchwork/bin/pwclient
@@ -41,16 +41,7 @@ except ImportError:
 import shutil
 import re
 import io
-import locale
 
-if sys.version_info.major == 2:
-    # hack to make writing unicode to standard output/error work on Python 2
-    OUT_ENCODING = (sys.stdout.encoding or locale.getpreferredencoding() or
-                    os.getenv('PYTHONIOENCODING', 'utf-8'))
-    sys.stdout = io.open(sys.stdout.fileno(), mode='w',
-                         encoding=OUT_ENCODING, errors='replace')
-    sys.stderr = io.open(sys.stderr.fileno(), mode='w',
-                         encoding=OUT_ENCODING, errors='replace')
 
 # Default Patchwork remote XML-RPC server URL
 # This script will check the PW_XMLRPC_URL environment variable
@@ -821,5 +812,31 @@ def main():
         sys.exit(1)
 
 
+def force_io_encoding():
+    """
+    Force PYTHONIOENCODING ":errorhandler" to avoid UnicodeEncodeErrors. The
+    only way to do it is to set the environment variable *before* starting the
+    interpreter. From the python docs:
+
+      PYTHONIOENCODING
+
+      Overrides the encoding used for stdin/stdout/stderr, in the syntax
+      encodingname:errorhandler. The :errorhandler part is optional and has the
+      same meaning as in str.encode().
+
+    Note that this only prevents interpreter crashes. The LANG or LC_ALL
+    variables must still be set correctly in order to have valid
+    output.
+    """
+    if 'PYTHONIOENCODING' in os.environ:
+        return
+
+    encoding = sys.stdout.encoding or 'utf-8'
+
+    os.environ['PYTHONIOENCODING'] = encoding + ':replace'
+    os.execvp(sys.executable, [sys.executable] + sys.argv)  # no return
+
+
 if __name__ == "__main__":
+    force_io_encoding()
     main()
-- 
2.11.0.193.g1d1bdafd6426


