[RFC 0/2 REBASE] Rework tagging infrastructure

Veronika Kabatova vkabatov at redhat.com
Wed Apr 18 02:09:18 AEST 2018


----- Original Message -----
> From: "Stephen Finucane" <stephen at that.guru>
> To: "Veronika Kabatova" <vkabatov at redhat.com>
> Cc: patchwork at lists.ozlabs.org
> Sent: Tuesday, April 17, 2018 5:46:12 PM
> Subject: Re: [RFC 0/2 REBASE] Rework tagging infrastructure
> 
> On Tue, 2018-04-17 at 11:35 -0400, Veronika Kabatova wrote:
> > > > Unless I'm overlooking something, we'd need to have the link from Tag
> > > > to
> > > > both Patch and CoverLetter. This should still have much better
> > > > performance
> > > > than my original solution (and will get rid of the duplication of
> > > > yours).
> > > > 
> > > > Does this proposal make sense, or am I missing something?
> > > 
> > > That mostly makes sense. My main concern is what happens when you want
> > > to show tags for a patch when those tags were created against the cover
> > > letter. If that's the case, are we going to have to query on
> > > 'patch.series.cover_letter.tags'? I imagine that's going to be slow
> > > (lots of JOINs). We could store it on the series instead, but I'm not
> > > sure how much that would improve things. Any ideas how to work around
> > > this?
> > > 
> > 
> > I was thinking about filtering on the SubmissionTag (or whatever the
> > intermediate model will be named) based on submission IDs of the patch
> > and cover (or comment IDs in case of comments API), instead of going
> > through the relations. That said, my database knowledge is very...
> > abstract... so I have no idea how much it helps with the underlying
> > queries.
> > 
> > If you (or whoever else) can offer any insight that would be great!
> 
> We'd still need to get information about the cover letter though, and
> that requires going through the series (one join). Maybe we already
> have that JOIN though, so this warrants some validation.
> 

We already have the series in the API. For the view, we are prefetching
them, but only after annotation with tag counts. Will it help to change
the order there?

> Another idea I've had is to store a series attribute in addition to the
> cover letter, comment and patch attributes. That way we could do
> something like this for patches:
> 
>    tags = Tag.objects.filter(Q(patch=patch) | Q(patch=None),
>                              series=patch.series)
> 
> i.e. if the tag is part of our series and doesn't belong to _another_
> patch, it must be a series-wide tag. You'd need to do additional
> filtering on this for duplicates, of course, but I imagine that's easy
> enough. You'd also want to make liberal use of the 'only' and 'defer'
> functions to make sure we avoid as many joins as possible. However, I
> don't think this would require a join on the 'patchwork_series' table,
> as we only use the ID column (which we'd have).
> 

I find having separate tables for cover letter and patch tags easier to
wrap my head around (and it avoids most of the duplication), but if you
think the above won't help with performance, I'll go this way and see how
it works out. Yeah, getting only the distinct values is the easiest part :)
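
To make the lookup concrete, here is a rough pure-Python sketch of what
the quoted filter would select. It is only an illustration, not Patchwork
code: TagRow, its fields and tags_for_patch are hypothetical names, and
plain lists stand in for the ORM. It assumes a tag row carries the series
ID and an optional patch ID (None meaning the tag came from the cover
letter), and it drops duplicates by name and value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TagRow:
    """Hypothetical stand-in for one row of the proposed tag table."""
    name: str
    value: str
    series_id: int
    patch_id: Optional[int]  # None: tag was found on the cover letter


def tags_for_patch(rows: List[TagRow], series_id: int,
                   patch_id: int) -> List[TagRow]:
    """Mimic filter(Q(patch=patch) | Q(patch=None), series=...)."""
    seen = set()
    result = []
    for row in rows:
        if row.series_id != series_id:
            continue  # tag belongs to a different series
        if row.patch_id not in (patch_id, None):
            continue  # tag belongs to _another_ patch of this series
        key = (row.name, row.value)
        if key in seen:
            continue  # same tag given on both cover letter and patch
        seen.add(key)
        result.append(row)
    return result


# Made-up sample data: series 1 has patches 10 and 11.
SAMPLE_ROWS = [
    TagRow("Acked-by", "alice", 1, None),
    TagRow("Acked-by", "alice", 1, 10),
    TagRow("Reviewed-by", "bob", 1, 11),
    TagRow("Tested-by", "carol", 2, None),
]

if __name__ == "__main__":
    for tag in tags_for_patch(SAMPLE_ROWS, series_id=1, patch_id=11):
        print(tag.name, tag.value)
```

Only the series ID is compared here, which matches the point above that
no join on the series table should be needed for this part.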

Veronika

> Thoughts?
> Stephen
> 

