[PATCH] parser: Fix parsing of pull request emails with CRLF line endings on Python 2

Andrew Donnellan andrew.donnellan at au1.ibm.com
Fri Jan 19 10:38:57 AEDT 2018


On 19/01/18 00:24, Stephen Finucane wrote:
> On Tue, 2018-01-09 at 23:56 +0000, Stephen Finucane wrote:
>> On Tue, 2018-01-09 at 12:01 +1100, Andrew Donnellan wrote:
>>> On 09/01/18 11:56, Daniel Axtens wrote:
>>>>> diff --git a/patchwork/parser.py b/patchwork/parser.py
>>>>> index 1568bc4..7c677db 100644
>>>>> --- a/patchwork/parser.py
>>>>> +++ b/patchwork/parser.py
>>>>> @@ -666,9 +666,13 @@ def clean_content(content):
>>>>>        """Remove cruft from the email message.
>>>>>    
>>>>>        Catch signature (-- ) and list footer (_____) cruft.
>>>>> +
>>>>> +    Change to Unix line endings (the Python 3 email module does this for us,
>>>>> +    but not Python 2).
>>>>>        """
>>>>>        sig_re = re.compile(r'^(-- |_+)\n.*', re.S | re.M)
>>>>>        content = sig_re.sub('', content)
>>>>> +    content = content.replace('\r\n', '\n')
>>>>
>>>> Shouldn't this go before the removal of signatures?
>>>
>>> Good point
>>
>> Pending this change, this looks good to me. I'll leave the actual
>> applying to Daniel though, in case he has more comments.
>>
>> Reviewed-by: Stephen Finucane <stephen at that.guru>
> 
> As an aside, we could also just open files with universal newlines [1]
> in the parse_mail/parse_archive commands. Not sure if there are any
> advantages to doing this (would you ever have reason to mix CRLF and
> LF?).

Yeah, I did think about that, but on Py3 we currently open the files in 
binary mode and I wasn't sure whether that was going to break anything.
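
To make the binary-versus-text distinction concrete, here is a minimal sketch of what each open mode hands back for a file with mixed line endings. The helper name read_both_ways and the sample bytes are made up for illustration; this is not what parse_mail/parse_archive actually do, only the general behaviour of io.open with its default newline handling.

```python
import io
import os
import tempfile


def read_both_ways(raw_bytes):
    """Hypothetical helper: write raw_bytes to a temp file, read it back twice."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(raw_bytes)
        # Binary mode: bytes come back untouched, CRLF included.
        with io.open(path, 'rb') as f:
            as_binary = f.read()
        # Text mode with the default newline=None: universal newlines,
        # so '\r\n' is translated to '\n' on read.
        with io.open(path, 'r', encoding='utf-8') as f:
            as_text = f.read()
    finally:
        os.remove(path)
    return as_binary, as_text


# One CRLF line followed by one LF line (assumed sample input).
binary, text = read_both_ways(b'line one\r\nline two\n')
```

So switching to text mode would normalise mixed CRLF/LF input at read time, but anything downstream that expects bytes would then get str instead, which is the part I'm not sure about.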

Let me see...
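
Separately, on Daniel's point about ordering: a rough sketch of why the replace probably needs to come before the signature removal. The regex is the one from the diff; the two function names and the sample message are hypothetical and only cover the lines shown in the hunk, not the rest of clean_content.

```python
import re

# Same pattern as in the quoted hunk: it expects '\n' right after '-- '.
SIG_RE = re.compile(r'^(-- |_+)\n.*', re.S | re.M)


def clean_sig_first(content):
    """Order as posted in the patch: strip signature, then fix line endings."""
    content = SIG_RE.sub('', content)
    return content.replace('\r\n', '\n')


def clean_newlines_first(content):
    """Order Daniel suggested: fix line endings, then strip signature."""
    content = content.replace('\r\n', '\n')
    return SIG_RE.sub('', content)


# Assumed sample body with CRLF line endings and a '-- ' signature marker.
crlf_mail = 'Please pull.\r\n-- \r\nAndrew\r\n'

sig_first = clean_sig_first(crlf_mail)
newlines_first = clean_newlines_first(crlf_mail)
```

With CRLF input, '-- ' is followed by '\r' rather than '\n', so in the first ordering the pattern never matches and the signature survives; in the second it is removed as intended.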

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan at au1.ibm.com  IBM Australia Limited


