[PATCH v6 3/8] parser: Add series parsing
Daniel Axtens
dja at axtens.net
Thu Oct 27 10:19:01 AEDT 2016
Hi Stephen,
>> Hopefully I've understood what's going on here:
>> - first, check for a Series that directly matches this message's
>> message id (hence the True to find_references)
>> - then, check for a series that matches an In-Reply-To
>> - then, check for a series that matches the References, from most
>> recent to least recent (hopefully!)
>>
>>
>> Is that right?
>
> Sure is. We prefer locality to the current email where possible. I've
> added comments to clarify this.
Thanks.
>> I think I would prefer pulling out the check for the Message ID into a
>> different test, rather than the extra flag to find_references - I'm not
>> sure a message ID is a reference in the sense in the usual sense. But
>> I'm not particularly fussy - if you've considered this approach and
>> rejected it for some reason that's fine.
>
> Nope - it was this way in previous revisions and this seemed cleaner.
> Reverted to the older approach.
Thanks!
>>> +def _parse_prefixes(subject_prefixes, regex):
>>> + for prefix in subject_prefixes:
>>> + m = regex.match(prefix)
>>> + if m:
>>> + return m
>>> +
>>> +
>> _first_prefix_match or _first_matching_prefix perhaps?
>
> Done.
Thanks.
>>> + regex = re.compile('^[vV](\d+)$')
>>> + m = _parse_prefixes(subject_prefixes, regex)
>>> + if m:
>>> + return int(m.group(1))
>>> +
>>> + m = re.search(r'\([vV](\d+)\)', subject)
>> Should this regex be precompiled at the head of the file? I also notice
>> you've attempted to compile it at the head of the function, you're just
>> not using the compiled version here...
>
> Is seems Python caches this for us anyway [1], so I see no real reason.
> We compile in the first case simply for readability's sake, however, the
> regexes are not the same so we can't reuse it for the second case (note
> the additional brackets).
>
> [1] http://stackoverflow.com/a/452143/613428
Ah - you're right, they are different - sorry I missed that. It's fine
as is then, thanks for clarifying.
>> I must admit I don't quite understand the benefit of creating all the
>> series references straight away. What is the advantage of creating them
>> here as opposed to when the subsequent patches are received? (I could
>> be
>> completely misunderstanding some key part of the parsing here -
>> apologies if so!)
>
> The key reason is to handle patches received out-of-order. Without
> storing references, we have no way to identify a relationship between
> the two patches as the RFC822 headers (References, Message-Id,
> In-Reply-To) only refer to older, prior mails. This fixes that issue.
>
> FWIW, another idea was to store the root message ID against the series
> (I think the freedesktop fork does this), but doing so would require
> some magic for deeply threaded series (where each patch in sent in reply
> to the previous patch). In addition, I don't think any amount of magic
> would work for series sent in reply to a previous series.
>
Ah, OK. I'm convinced then :)
>>> + try:
>>> + series =
>>> SeriesReference.objects.get(msgid=msgid).series
>>> + except SeriesReference.DoesNotExist:
>>> + series = None
>> django shortcuts has a get_object_or_None - I'll leave it to you to
>> weigh whether or not the brevity is worth the additional dependency.
>
> I don't think it is. TBH, if we could get rid of the non-Django
> dependencies we do have, I'm sure we'd make some kernel.org sysadmins
> very happy. Alas, django-rest-framework is far too good to ignore.
That's a completely fair call.
For your next revision:
Reviewed-by: Daniel Axtens <dja at axtens.net>
Regards,
Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 859 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/patchwork/attachments/20161027/c3d9f004/attachment-0001.sig>
More information about the Patchwork
mailing list