[PATCH v2 9/9] parser: don't fail on multiple SeriesReferences

Stephen Finucane stephen at that.guru
Tue Feb 27 22:21:48 AEDT 2018


On Mon, 2018-02-26 at 15:55 +1100, Daniel Axtens wrote:
> Stephen Finucane <stephen at that.guru> writes:
> 
> > On Sun, 2018-02-25 at 01:50 +1100, Daniel Axtens wrote:
> > > Parallel parsing would occasonally fail with:
> > > 
> > > patchwork.models.MultipleObjectsReturned: get() returned more than one SeriesReference -- it returned 2!
> > > 
> > > I think these are happening if you have different processes parsing
> > > e.g. 1/3 and 2/3 simultaneously: both will have a reference to 1/3,
> > > in the case of 1 it will be the msgid, in the case of 2 it will be
> > > in References. So when we come to parse 3/3, .get() finds 2 and
> > > throws the exception.
> > > 
> > > This does not fix the creation of multiple series references; it
> > > just causes them to be ignored. We still have serious race conditions
> > > with series creation, but I don't yet have clear answers for them.
> > > With this patch, they will at least not stop patches from being
> > > processed - they'll just lead to wonky series, which we already have.
> > > 
> > > Reviewed-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
> > > Signed-off-by: Daniel Axtens <dja at axtens.net>
> > 
> > Correct me if I'm wrong, but this seems to highlight two issues:
> > 
> >  * The race condition between searching for a series reference and
> >    using it
> >  * The fact we expect a series to be unique for a given msgid and
> >    project, yet the 'unique_together' is for a given msgid and
> >    *series*.
> > 
> > The first of these is caught here (yet not fixed, as it's a non-trivial 
> > thing to resolve). The second should still be fixed though. Should we
> > do this here, as part of this series?
> 
> My thinking is no, for these reasons:
> 
>  - I want a patch I can backport to stable without requiring a database
>    migration, because:
> 
>     * that seems like a very unusual thing for a stable update
> 
>     * I don't trust my ability to get it right for a stable release
> 
>     * relatedly, it will require a lot of testing for me to trust it,
>       and I want something sooner rather than later so that the
>       Buildroot people can get on with their lives.
> 
>     * The different numbering of migrations will make the different
>       upgrade paths a nightmare to describe, yet alone test

Yup, I had no considered the backportability aspects of this. That's a
very valid point. Let's do this separately.

>  - I have a long-term ongoing effort to move 'project' up into the
>    various tables that use it so as to avoid JOINs and this would be a
>    good thing to do in that effort. (It's the next big push on my list,
>    it's just a nightmare to get the compatibility and migration code
>    going.)

We should probably discuss this separately, but is this a significant
part of the issue? Far as I could see, there were two different issues
with the DB causing our pain:

 * The number of JOINs on Event, which I have an RFC waiting on reviews
   for.
 * The need to JOIN for Patch and CoverLetter, which could probably be
   resolved by flattening this back down into Submission.

Have I missed something?
Stephen

> Regards,
> Daniel
> 
> > 
> > Stephen
> > 
> > > ---
> > >  patchwork/parser.py | 23 +++++++++++++++++++++--
> > >  1 file changed, 21 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/patchwork/parser.py b/patchwork/parser.py
> > > index 56dc7006c811..5a7344cee93c 100644
> > > --- a/patchwork/parser.py
> > > +++ b/patchwork/parser.py
> > > @@ -240,6 +240,13 @@ def _find_series_by_references(project, mail):
> > >                  msgid=ref[:255], series__project=project).series
> > >          except SeriesReference.DoesNotExist:
> > >              continue
> > > +        except SeriesReference.MultipleObjectsReturned:
> > > +            # FIXME: Open bug: this can happen when we're processing
> > > +            # messages in parallel. Pick the first and log.
> > > +            logger.error("Multiple SeriesReferences for %s in project %s!" %
> > > +                         (ref[:255], project.name))
> > > +            return SeriesReference.objects.filter(
> > > +                msgid=ref[:255], series__project=project).first().series
> > >  
> > >  
> > >  def _find_series_by_markers(project, mail, author):
> > > @@ -1037,6 +1044,9 @@ def parse_mail(mail, list_id=None):
> > >                                                  series__project=project)
> > >                  except SeriesReference.DoesNotExist:
> > >                      SeriesReference.objects.create(series=series, msgid=ref)
> > > +                except SeriesReference.MultipleObjectsReturned:
> > > +                    logger.error("Multiple SeriesReferences for %s"
> > > +                                 " in project %s!" % (ref, project.name))
> > >  
> > >          # add to a series if we have found one, and we have a numbered
> > >          # patch. Don't add unnumbered patches (for example diffs sent
> > > @@ -1075,6 +1085,11 @@ def parse_mail(mail, list_id=None):
> > >                      msgid=msgid, series__project=project).series
> > >              except SeriesReference.DoesNotExist:
> > >                  series = None
> > > +            except SeriesReference.MultipleObjectsReturned:
> > > +                logger.error("Multiple SeriesReferences for %s"
> > > +                             " in project %s!" % (msgid, project.name))
> > > +                series = SeriesReference.objects.filter(
> > > +                    msgid=msgid, series__project=project).first().series
> > >  
> > >              if not series:
> > >                  series = Series(project=project,
> > > @@ -1087,8 +1102,12 @@ def parse_mail(mail, list_id=None):
> > >                  # we don't save the in-reply-to or references fields
> > >                  # for a cover letter, as they can't refer to the same
> > >                  # series
> > > -                SeriesReference.objects.get_or_create(series=series,
> > > -                                                      msgid=msgid)
> > > +                try:
> > > +                    SeriesReference.objects.get_or_create(series=series,
> > > +                                                          msgid=msgid)
> > > +                except SeriesReference.MultipleObjectsReturned:
> > > +                    logger.error("Multiple SeriesReferences for %s"
> > > +                                 " in project %s!" % (msgid, project.name))
> > >  
> > >              cover_letter = CoverLetter(
> > >                  msgid=msgid,



More information about the Patchwork mailing list