[PATCH 06/10] parser: better date parsing

Daniel Axtens dja at axtens.net
Wed Jun 28 17:48:48 AEST 2017


It turns out that there is a lot that can go wrong in parsing a
date. OverflowError, ValueError and OSError have all been observed.

If any of these exceptions is raised, substitute the current UTC datetime.

Signed-off-by: Daniel Axtens <dja at axtens.net>
---
 patchwork/parser.py | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)
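
The fallback described in the commit message can be sketched as a standalone
function. This is only an illustration, not the code in the patch: the name
parse_date_or_now is hypothetical, the three handlers are collapsed into a
single except clause because they all do the same thing, and the
timezone-aware datetime calls stand in for utcnow()/utcfromtimestamp(), an
assumption made to avoid the deprecation warnings newer Python 3 releases
emit. The result is still a naive UTC datetime, as in find_date().

```python
import datetime
from email.utils import mktime_tz, parsedate_tz


def parse_date_or_now(header):
    """Return a naive UTC datetime for a Date header value.

    Falls back to the current UTC time when the value is empty,
    cannot be parsed, or is out of range for the platform.
    """
    utc = datetime.timezone.utc
    now = datetime.datetime.now(utc).replace(tzinfo=None)

    if not header:
        return now

    t = parsedate_tz(header)
    if not t:
        # parsedate_tz() returns None for values it cannot parse.
        return now

    try:
        stamp = mktime_tz(t)
        return datetime.datetime.fromtimestamp(stamp, utc).replace(tzinfo=None)
    except (OverflowError, ValueError, OSError):
        # Absurd years, seconds or hours end up here.
        return now


# A well-formed header is converted to UTC.
print(parse_date_or_now("Wed, 28 Jun 2017 17:48:48 +1000"))  # 2017-06-28 07:48:48
# The oversized year from the patch comments falls back to "now".
print(parse_date_or_now("Wed, 4 Jun 207777777777777777777714 17:50:46 0"))
```

Keeping the handlers separate, as the patch does, leaves room to document each
failure next to the input that triggered it; the single clause is just the
shorter form with the same behaviour.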

diff --git a/patchwork/parser.py b/patchwork/parser.py
index 203e11584504..80450c2e4860 100644
--- a/patchwork/parser.py
+++ b/patchwork/parser.py
@@ -344,10 +344,33 @@ def find_date(mail):
     h = clean_header(mail.get('Date', ''))
     if not h:
         return datetime.datetime.utcnow()
+
     t = parsedate_tz(h)
     if not t:
         return datetime.datetime.utcnow()
-    return datetime.datetime.utcfromtimestamp(mktime_tz(t))
+
+    try:
+        d = datetime.datetime.utcfromtimestamp(mktime_tz(t))
+    except OverflowError:
+        # If you have a date like:
+        # Date: Wed, 4 Jun 207777777777777777777714 17:50:46 0
+        # then you can end up with:
+        # OverflowError: Python int too large to convert to C long
+        d = datetime.datetime.utcnow()
+    except ValueError:
+        # If you have a date like:
+        # Date:, 11 Sep 2016 23:22:904070804030804 +0100
+        # then you can end up with:
+        # ValueError: year is out of range
+        d = datetime.datetime.utcnow()
+    except OSError:
+        # If you have a date like:
+        # Date:, 11 Sep 2016 407080403080105:04 +0100
+        # then you can end up with (in py3)
+        # OSError: [Errno 75] Value too large for defined data type
+        d = datetime.datetime.utcnow()
+
+    return d
 
 
 def find_headers(mail):
-- 
2.11.0


