[RFC] parsemail.py: Don't search for patches in replies

Markus Mayer markus.mayer at linaro.org
Fri Feb 7 09:49:21 EST 2014


Make sure we don't attempt to search for a patch in a reply e-mail.
Some MUAs leave the quoted e-mail intact without prepending quote
characters such as ">" to each line.

When that happens, parse_patch() thinks the quoted patch is new. The
result is multiple database entries containing the same patch (one for
each such reply), when one would really expect a consolidated thread
containing the entire discussion and only one copy of the patch.

Signed-off-by: Markus Mayer <markus.mayer at linaro.org>
---
This problem mainly occurs when replies to patches are sent using
Outlook.

The approach below seems to work, although it has the downside of
relying on English MUA settings. If a mail client translates "Re:" to
some other string, such as "AW:", the proposed code will not detect
that the e-mail in question is a reply (although the result would be
no worse than it is now).
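
For illustration only, a locale-tolerant subject check might look roughly
like the sketch below. The helper name is_reply_subject() and the prefix
list are assumptions made for this example and are not part of the patch;
the list is hand-maintained and certainly not exhaustive.

```python
import re

# Hypothetical, hand-maintained list of reply prefixes (assumption, not
# exhaustive): "Re" (English), "AW" (German), "SV" (Scandinavian),
# "Antw" (Dutch).
REPLY_PREFIXES = ('re', 'aw', 'sv', 'antw')

# Match one of the prefixes followed by a colon at the very start of the
# subject, ignoring case and leading whitespace.
_reply_re = re.compile(r'^\s*(?:%s)\s*:' % '|'.join(REPLY_PREFIXES),
                       re.IGNORECASE)

def is_reply_subject(subject):
    """Return True if the subject starts with a known reply prefix."""
    if subject is None:
        # A message without a Subject header is not treated as a reply.
        return False
    return _reply_re.match(subject) is not None
```

Any prefix missing from the list would still go undetected, so this only
narrows the gap rather than closing it.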

To avoid this, I tried using the presence of "In-Reply-To:" and
"References:" headers to detect a reply, but "git send-email" inserts
references into patches that aren't replies (e.g. v2 of a patch
referencing v1), which then leads to the opposite problem: mails being
categorized as replies when they are not.
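
A small sketch of that false positive, using only the standard email
module. The message below is made up for this example (addresses and
message IDs are placeholders), and has_reply_headers() is a hypothetical
helper, not something in parsemail.py.

```python
from email import message_from_string

def has_reply_headers(mail):
    """Return True if the message carries In-Reply-To or References."""
    return (mail.get('In-Reply-To') is not None or
            mail.get('References') is not None)

# Made-up v2 patch that was threaded under v1 when it was sent.
# It is a new patch, not a reply, yet it carries both headers.
V2_PATCH = """\
From: dev@example.com
Subject: [PATCH v2] parsemail: example change
Message-Id: <v2@example.com>
In-Reply-To: <v1@example.com>
References: <v1@example.com>

patch body would go here
"""

mail = message_from_string(V2_PATCH)

# The header test flags this patch as a reply...
header_says_reply = has_reply_headers(mail)
# ...while the subject test correctly does not.
subject_says_reply = mail.get('Subject').lower().startswith('re:')
```

Here header_says_reply is true while subject_says_reply is false, which
is the misclassification described above.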

So, checking for "Re:" still seems to be the better option. Please let
me know your thoughts.

 apps/patchwork/bin/parsemail.py |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/apps/patchwork/bin/parsemail.py b/apps/patchwork/bin/parsemail.py
index b6eb97a..405fb69 100755
--- a/apps/patchwork/bin/parsemail.py
+++ b/apps/patchwork/bin/parsemail.py
@@ -151,6 +151,8 @@ def find_content(project, mail):
     patchbuf = None
     commentbuf = ''
     pullurl = None
+    subject = mail.get('Subject') or ''
+    is_reply = subject.lower().startswith('re:')
 
     for part in mail.walk():
         if part.get_content_maintype() != 'text':
@@ -185,8 +187,8 @@ def find_content(project, mail):
     patch = None
     comment = None
 
-    if pullurl or patchbuf:
-        name = clean_subject(mail.get('Subject'), [project.linkname])
+    if not is_reply and (pullurl or patchbuf):
+        name = clean_subject(subject, [project.linkname])
         patch = Patch(name = name, pull_url = pullurl, content = patchbuf,
                     date = mail_date(mail), headers = mail_headers(mail))
 
-- 
1.7.9.5


