[PATCH RESEND 0/2] Rework tagging infrastructure

Wed Sep 12 07:30:13 AEST 2018

On Tue, 2018-09-11 at 12:03 -0600, Stephen Finucane wrote:
> On Mon, 2018-07-09 at 18:10 +0200, vkabatov at redhat.com wrote:
> > From: Veronika Kabatova <vkabatov at redhat.com>
> > 
> > I was trying to send the full patch for a while now, but never saw it arrive to
> > the mailing list. Using a different email (from entirely different domain)
> > didn't help, so I'm thinking it's because of the size of the message. Hence I'm
> > now sending it again, this time split into two parts, in hopes it will arrive
> > (feel free to merge the patches when applying).
> 
> About time we got going on this :) I've left my review comments but I
> wasn't able to run this locally and inspect the impact on the DB due to
> merge conflicts. Could I ask that you address what you can and rebase
> this onto the 'Convert Series-Patch relationship to 1:N' series I
> posted recently, as that will impact this series significantly. If you
> can, I'll apply locally and help you work through the DB issues that
> I'm pretty sure we're going to see with the patch as-is right now.

I've actually gone and rebased this + reworked per the aforementioned
series. You can find the end result here:

  https://github.com/stephenfin/patchwork/tree/review/rework-tagging-infrastructure

This should be a good starting point, assuming you want to avoid having
to rebase months old code yourself :) However, as expected, the current
implementation hammers the DB when listing patches, so I think we need
to rework this further. I'm still thinking about how to do this and
would welcome input.

Stephen

> Cheers,
> Stephen
> 
> > The original cover letter for context follows.
> > 
> > This patch attempts to solve GitHub issues #57 [1] and #113 [2] as well as some
> > other things we encountered.
> > 
> > Right now, tags are extracted from all the comments on the patch and the patch
> > itself, and they are reextracted from all the sources every time a comment is
> > added or removed. It makes saving slower, and might contribute to races with
> > writes to database when we are parsing multiple emails at the same time. This
> > gets even more prevalent if we want to solve the issue #113 (tags on cover
> > letter should increment counters on every patch in series) -- for each added
> > comment on the cover letter, we would reextract tags from all the other sources,
> > for each patch in series; and for a change on comments related to the patch
> > directly, we would need to take the tags on the cover letter and it's comments
> > into account as well (I implemented this solution in my fork but I really
> > don't like it).
> > 
> > The current approach has several other issues as well, some of which are
> > mentioned in the issue #57, eg duplicate tags are counted more times. Taking
> > into account the tags on cover letters would also be easier if we could store
> > tags against them and just query them on demand, instead of reextracting
> > everything all over again.
> > 
> > Solutions for some other things we found missing solve the issues mentioned
> > above too. If we want to determine if the tag is duplicate, we need to save the
> > associated value. Having the value would help us to use arbitrary strings as
> > tags (for example links to issue trackers, like `Bugzilla: <link>` if the patch
> > solves a known bug). The key-values approach to storing tags is mentioned [3],
> > this email additionally mentions a comments REST API (currently worked on). For
> > the comments API we would also find it very useful to have the tags extracted
> > from the comments available directly so we can query for them, which means we
> > would either need to reextract the tags on every API call, or we could store
> > the comment data with the tags as they are extracted and only query them as
> > needed. Altogether, we would get rid of the `patch_responses` property used
> > when converting comments to mbox (we finally get all the custom tags there
> > instead of only the few hardcoded ones too).
> > 
> > TL;DR:
> > Our goals:
> > - Avoid tag reextraction with each added comment
> > - Fix issues #57 and #113
> > - Prepare tags addition to comments in the API
> > - Add tags to patch (currently returns {}) and cover letter APIs
> > 
> > 
> > [1] https://github.com/getpatchwork/patchwork/issues/57
> > [2] https://github.com/getpatchwork/patchwork/issues/113
> > [3] https://lists.ozlabs.org/pipermail/patchwork/2018-January/004741.html
> > 
> 
> 
> _______________________________________________
> Patchwork mailing list
> Patchwork at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/patchwork