[PATCH RESEND 2/2] Rework tagging infrastructure
Veronika Kabatova
vkabatov at redhat.com
Wed Sep 12 19:04:27 AEST 2018
----- Original Message -----
> From: "Stephen Finucane" <stephen at that.guru>
> To: vkabatov at redhat.com, patchwork at lists.ozlabs.org
> Sent: Tuesday, September 11, 2018 7:59:57 PM
> Subject: Re: [PATCH RESEND 2/2] Rework tagging infrastructure
>
> On Mon, 2018-07-09 at 18:10 +0200, vkabatov at redhat.com wrote:
> > From: Veronika Kabatova <vkabatov at redhat.com>
> >
> > Solve #113 and #57 GitHub issues, fix up returning tags in the API,
> > keep track of tag origin to be able to add tags to comments in the API.
> >
> > Use relations Tag-Patch and Tag-CoverLetter to avoid duplication of
> > tags for each patch in series, and use `series` attribute of
> > SubmissionTag as a notion of a tag which is related to each patch in the
> > series (because it comes from cover letter or it's comments)
> >
> > Signed-off-by: Veronika Kabatova <vkabatov at redhat.com>
> > ---
> > docs/usage/overview.rst | 9 +-
> > patchwork/management/commands/retag.py | 16 ++-
> > patchwork/migrations/0027_rework_tagging.py | 146
> > +++++++++++++++++++++
> > patchwork/tests/api/test_patch.py | 2 +-
> > patchwork/tests/test_mboxviews.py | 18 ++-
> > patchwork/tests/test_parser.py | 23 ++--
> > patchwork/tests/test_tags.py | 64 ++++-----
> > .../notes/tagging-rework-9907e9dc3f835566.yaml | 18 +++
> > 8 files changed, 233 insertions(+), 63 deletions(-)
> > create mode 100644 patchwork/migrations/0027_rework_tagging.py
> > create mode 100644 releasenotes/notes/tagging-rework-9907e9dc3f835566.yaml
> >
> > diff --git a/docs/usage/overview.rst b/docs/usage/overview.rst
> > index e84e13d..d3ad2f6 100644
> > --- a/docs/usage/overview.rst
> > +++ b/docs/usage/overview.rst
> > @@ -119,10 +119,11 @@ one delegate can be assigned to a patch.
> > Tags
> > ~~~~
> >
> > -Tags are specially formatted metadata appended to the foot the body of a
> > patch
> > -or a comment on a patch. Patchwork extracts these tags at parse time and
> > -associates them with the patch. You add extra tags to an email by replying
> > to
> > -the email. The following tags are available on a standard Patchwork
> > install:
> > +Tags are specially formatted metadata appended to the foot the body of a
> > patch,
> > +cover letter or a comment related to them. Patchwork extracts these tags
> > at
> > +parse time and associates them with patches. You add extra tags to an
> > email by
> > +replying to the email. The following tags are available on a standard
> > Patchwork
> > +install:
>
> This can be a separate patch.
>
Ack
> > ``Acked-by:``
> > For example::
> > diff --git a/patchwork/management/commands/retag.py
> > b/patchwork/management/commands/retag.py
> > index 8617ff4..150dbf1 100644
> > --- a/patchwork/management/commands/retag.py
> > +++ b/patchwork/management/commands/retag.py
> > @@ -19,11 +19,13 @@
> >
> > from django.core.management.base import BaseCommand
> >
> > +from patchwork.models import Cover
> > from patchwork.models import Patch
> > +from patchwork.models import SeriesPatch
> >
> >
> > class Command(BaseCommand):
> > - help = 'Update the tag (Ack/Review/Test) counts on existing patches'
> > + help = 'Update tags on existing patches'
>
> s/patches/submissions/ ?
Thanks!
>
> > args = '[<patch_id>...]'
> >
> > def handle(self, *args, **options):
> > @@ -37,7 +39,17 @@ class Command(BaseCommand):
> > count = query.count()
> >
> > for i, patch in enumerate(query.iterator()):
> > - patch.refresh_tag_counts()
> > + patch.refresh_tags()
> > + for comment in patch.comments.all():
> > + comment.refresh_tags()
> > +
> > + series_patches = SeriesPatch.objects.filter(patch_id=patch.id)
> > + for series_patch in series_patches:
> > + cover = series_patch.series.cover_letter
> > + cover.refresh_tags()
> > + for comment in cover.comments.all():
> > + comment.refresh_tags()
> > +
> > if (i % 10) == 0:
> > self.stdout.write('%06d/%06d\r' % (i, count), ending='')
> > self.stdout.flush()
> > diff --git a/patchwork/migrations/0027_rework_tagging.py
> > b/patchwork/migrations/0027_rework_tagging.py
> > new file mode 100644
> > index 0000000..009546f
> > --- /dev/null
> > +++ b/patchwork/migrations/0027_rework_tagging.py
> > @@ -0,0 +1,146 @@
> > +# -*- coding: utf-8 -*-
> > +from __future__ import unicode_literals
> > +
> > +from django.db import migrations, models
> > +import django.db.models.deletion
> > +
> > +import re
> > +
> > +
> > +# Django migrations don't allow us to call models' methods because the
> > +# migration will break if the methods change. Therefore we need to use an
> > +# altered copy of all the code needed.
> > +def extract_tags(extract_from, tags):
> > + found_tags = {}
> > +
> > + if not extract_from.content:
> > + return found_tags
> > +
> > + for tag in tags:
> > + regex = re.compile(tag.pattern + r'\s(.*)', re.M | re.I)
> > + found_tags[tag] = regex.findall(extract_from.content)
> > +
> > + return found_tags
>
> This doesn't belong in a migration as the command is going to take
> _forever_ to run on a large instance and will block all other
> migrations. I'd personally note in the release notes that the 'retag'
> command must be run post-update. Would make this migration way simpler
> too.
>
I agree that the migration takes a very long time. The idea of only
doing the basics in the migration and running the retag command to
create the actual data is a genius one, thank you!
> > +
> > +
> > +def add_tags(apps, submission, tag, values, comment=None):
> > + if not values:
> > + # We don't need to delete tags since none exist yet and we can't
> > + # delete comments etc. during the migration
> > + return
> > +
> > + if hasattr(submission, 'patch'):
> > + series = None
> > + else:
> > + series = submission.coverletter.series.first()
> > +
> > + SubmissionTag = apps.get_model('patchwork', 'SubmissionTag')
> > + current_objs = SubmissionTag.objects.filter(submission=self,
> > + comment=comment,
> > + tag=tag,
> > + series=series)
> > +
> > + # Only create nonexistent tags
> > + values_to_add = set(values) - set(current_objs.values_list('value',
> > + flat=True))
> > + SubmissionTag.objects.bulk_create([SubmissionTag(
> > + submission=submission,
> > + tag=tag,
> > + value=value,
> > + comment=comment,
> > + series=series
> > + ) for value in values_to_add])
> > +
> > +
> > +def create_all(apps, schema_editor):
> > + Tag = apps.get_model('patchwork', 'Tag')
> > + tags = Tag.objects.all()
> > +
> > + Submission = apps.get_model('patchwork', 'Submission')
> > + for submission in Submission.objects.all():
> > + extracted = extract_tags(submission, tags)
> > + for tag in extracted:
> > + add_tags(apps, submission, tag, extracted[tag])
> > +
> > + Comment = apps.get_model('patchwork', 'Comment')
> > + for comment in Comment.objects.all():
> > + extracted = extract_tags(comment, tags)
> > + for tag in extracted:
> > + add_tags(apps,
> > + comment.submission,
> > + tag,
> > + extracted[tag],
> > + comment=comment)
> > +
> > +
> > +class Migration(migrations.Migration):
> > +
> > + dependencies = [
> > + ('patchwork', '0026_add_user_bundles_backref'),
> > + ]
> > +
> > + operations = [
> > + migrations.CreateModel(
> > + name='SubmissionTag',
> > + fields=[
> > + ('id', models.AutoField(auto_created=True,
> > + primary_key=True,
> > + serialize=False,
> > + verbose_name='ID')),
> > + ('value', models.CharField(max_length=255)),
> > + ('comment', models.ForeignKey(
> > + on_delete=django.db.models.deletion.CASCADE,
> > + to='patchwork.Comment',
> > + null=True
> > + )),
> > + ('submission', models.ForeignKey(
> > + on_delete=django.db.models.deletion.CASCADE,
> > + to='patchwork.Submission'
> > + )),
> > + ('tag', models.ForeignKey(
> > + on_delete=django.db.models.deletion.CASCADE,
> > + to='patchwork.Tag'
> > + )),
> > + ('series', models.ForeignKey(
> > + on_delete=django.db.models.deletion.CASCADE,
> > + to='patchwork.Series',
> > + null=True
> > + )),
> > + ],
> > + ),
> > + migrations.AlterUniqueTogether(
> > + name='patchtag',
> > + unique_together=set([]),
> > + ),
> > + migrations.RemoveField(
> > + model_name='patchtag',
> > + name='patch',
> > + ),
> > + migrations.RemoveField(
> > + model_name='patchtag',
> > + name='tag',
> > + ),
> > + migrations.RemoveField(
> > + model_name='patch',
> > + name='tags',
> > + ),
> > + migrations.DeleteModel(
> > + name='PatchTag',
> > + ),
> > + migrations.AddField(
> > + model_name='submission',
> > + name='related_tags',
> > + field=models.ManyToManyField(
> > + through='patchwork.SubmissionTag',
> > + to='patchwork.Tag'
> > + ),
> > + ),
> > + migrations.AlterUniqueTogether(
> > + name='submissiontag',
> > + unique_together=set([('submission',
> > + 'tag',
> > + 'value',
> > + 'comment')]),
> > + ),
> > + migrations.RunPython(create_all, atomic=False),
> > + ]
> > diff --git a/patchwork/tests/api/test_patch.py
> > b/patchwork/tests/api/test_patch.py
> > index 27b9924..55b8efd 100644
> > --- a/patchwork/tests/api/test_patch.py
> > +++ b/patchwork/tests/api/test_patch.py
> > @@ -166,7 +166,7 @@ class TestPatchAPI(APITestCase):
> >
> > self.assertEqual(patch.content, resp.data['content'])
> > self.assertEqual(patch.diff, resp.data['diff'])
> > - self.assertEqual(0, len(resp.data['tags']))
> > + self.assertEqual(1, len(resp.data['tags']))
> >
> > def test_detail_version_1_0(self):
> > patch = create_patch()
> > diff --git a/patchwork/tests/test_mboxviews.py
> > b/patchwork/tests/test_mboxviews.py
> > index 8eb3581..16b5b0c 100644
> > --- a/patchwork/tests/test_mboxviews.py
> > +++ b/patchwork/tests/test_mboxviews.py
> > @@ -39,6 +39,8 @@ class MboxPatchResponseTest(TestCase):
> >
> > """Test that the mbox view appends the Acked-by from a patch
> > comment."""
> >
> > + fixtures = ['default_tags']
> > +
> > def setUp(self):
> > self.project = create_project()
> > self.person = create_person()
> > @@ -53,7 +55,9 @@ class MboxPatchResponseTest(TestCase):
> > submitter=self.person,
> > content='comment 2 text\nAcked-by: 2\n')
> > response = self.client.get(reverse('patch-mbox', args=[patch.id]))
> > - self.assertContains(response, 'Acked-by: 1\nAcked-by: 2\n')
> > + # Can't guarantee the order in which the tags are returned
> > + self.assertContains(response, 'Acked-by: 1\n')
> > + self.assertContains(response, 'Acked-by: 2\n')
> >
> > def test_patch_utf8_nbsp(self):
> > patch = create_patch(
> > @@ -73,6 +77,8 @@ class MboxPatchSplitResponseTest(TestCase):
> > """Test that the mbox view appends the Acked-by from a patch comment,
> > and places it before an '---' update line."""
> >
> > + fixtures = ['default_tags']
> > +
> > def setUp(self):
> > project = create_project()
> > self.person = create_person()
> > @@ -88,7 +94,15 @@ class MboxPatchSplitResponseTest(TestCase):
> >
> > def test_patch_response(self):
> > response = self.client.get(reverse('patch-mbox',
> > args=[self.patch.id]))
> > - self.assertContains(response, 'Acked-by: 1\nAcked-by: 2\n')
> > + # Can't guarantee the order in which the tags are returned
> > + self.assertContains(response, 'Acked-by: 1\n')
> > + self.assertContains(response, 'Acked-by: 2\n')
> > + # We need to check for 3 Acked-by strings, one comes from the body
> > of
> > + # the patch and the other two are the tags themselves.
> > + self.assertRegex(
> > + response.content.decode(),
> > + '(?s).*Acked-by: 1\n.*Acked-by.*Acked-by.*---\nupdate.*'
> > + )
> >
> >
> > class MboxHeaderTest(TestCase):
> > diff --git a/patchwork/tests/test_parser.py
> > b/patchwork/tests/test_parser.py
> > index e99cf21..a29066f 100644
> > --- a/patchwork/tests/test_parser.py
> > +++ b/patchwork/tests/test_parser.py
> > @@ -802,12 +802,12 @@ class ParseInitialTagsTest(PatchTest):
> > def test_tags(self):
> > self.assertEqual(Patch.objects.count(), 1)
> > patch = Patch.objects.all()[0]
> > - self.assertEqual(patch.patchtag_set.filter(
> > - tag__name='Acked-by').count(), 0)
> > - self.assertEqual(patch.patchtag_set.get(
> > - tag__name='Reviewed-by').count, 1)
> > - self.assertEqual(patch.patchtag_set.get(
> > - tag__name='Tested-by').count, 1)
> > +
> > self.assertEqual(patch.related_tags.filter(name='Acked-by').count(),
> > + 0)
> > +
> > self.assertEqual(patch.related_tags.filter(name='Reviewed-by').count(),
> > + 1)
> > +
> > self.assertEqual(patch.related_tags.filter(name='Tested-by').count(),
> > + 1)
> >
> >
> > class ParseCommentTagsTest(PatchTest):
> > @@ -830,12 +830,11 @@ class ParseCommentTagsTest(PatchTest):
> > def test_tags(self):
> > self.assertEqual(Patch.objects.count(), 1)
> > patch = Patch.objects.all()[0]
> > - self.assertEqual(patch.patchtag_set.filter(
> > - tag__name='Acked-by').count(), 0)
> > - self.assertEqual(patch.patchtag_set.get(
> > - tag__name='Reviewed-by').count, 1)
> > - self.assertEqual(patch.patchtag_set.get(
> > - tag__name='Tested-by').count, 1)
> > +
> > self.assertEqual(patch.related_tags.filter(name='Acked-by').count(),
> > 0)
> > +
> > self.assertEqual(patch.related_tags.filter(name='Reviewed-by')count(),
> > + 1)
> > +
> > self.assertEqual(patch.related_tags.filter(name='Tested-by').count(),
> > + 1)
> >
> >
> > class SubjectTest(TestCase):
> > diff --git a/patchwork/tests/test_tags.py b/patchwork/tests/test_tags.py
> > index 4fd1bf2..f1df99f 100644
> > --- a/patchwork/tests/test_tags.py
> > +++ b/patchwork/tests/test_tags.py
> > @@ -21,7 +21,7 @@ from django.test import TestCase
> > from django.test import TransactionTestCase
> >
> > from patchwork.models import Patch
> > -from patchwork.models import PatchTag
> > +from patchwork.models import SubmissionTag
> > from patchwork.models import Tag
> > from patchwork.tests.utils import create_comment
> > from patchwork.tests.utils import create_patch
> > @@ -34,11 +34,14 @@ class ExtractTagsTest(TestCase):
> > name_email = 'test name <' + email + '>'
> >
> > def assertTagsEqual(self, str, acks, reviews, tests): # noqa
> > - counts = Patch.extract_tags(str, Tag.objects.all())
> > - self.assertEqual((acks, reviews, tests),
> > - (counts[Tag.objects.get(name='Acked-by')],
> > - counts[Tag.objects.get(name='Reviewed-by')],
> > - counts[Tag.objects.get(name='Tested-by')]))
> > + patch = create_patch(content=str)
> > + extracted = patch._extract_tags(Tag.objects.all())
> > + self.assertEqual(
> > + (acks, reviews, tests),
> > + (len(extracted.get(Tag.objects.get(name='Acked-by'), [])),
> > + len(extracted.get(Tag.objects.get(name='Reviewed-by'), [])),
> > + len(extracted.get(Tag.objects.get(name='Tested-by'), [])))
> > + )
> >
> > def test_empty(self):
> > self.assertTagsEqual('', 0, 0, 0)
> > @@ -80,7 +83,7 @@ class ExtractTagsTest(TestCase):
> > self.assertTagsEqual('> Acked-by: %s\n' % self.name_email, 0, 0,
> > 0)
> >
> >
> > -class PatchTagsTest(TransactionTestCase):
> > +class SubmissionTagsTest(TransactionTestCase):
> >
> > fixtures = ['default_tags']
> > ACK = 1
> > @@ -95,16 +98,14 @@ class PatchTagsTest(TransactionTestCase):
> > def assertTagsEqual(self, patch, acks, reviews, tests): # noqa
> > patch = Patch.objects.get(pk=patch.pk)
> >
> > - def count(name):
> > - try:
> > - return patch.patchtag_set.get(tag__name=name).count
> > - except PatchTag.DoesNotExist:
> > - return 0
> > + def count(submission, name):
> > + return SubmissionTag.objects.filter(submission=patch,
> > + tag__name=name).count()
> >
> > counts = (
> > - count(name='Acked-by'),
> > - count(name='Reviewed-by'),
> > - count(name='Tested-by'),
> > + count(patch, 'Acked-by'),
> > + count(patch, 'Reviewed-by'),
> > + count(patch, 'Tested-by'),
> > )
> >
> > self.assertEqual(counts, (acks, reviews, tests))
> > @@ -118,7 +119,12 @@ class PatchTagsTest(TransactionTestCase):
> > if tagtype not in tags:
> > return ''
> >
> > - return '%s-by: Test Tagger <tagger at example.com>\n' % tags[tagtype]
> > + index = SubmissionTag.objects.filter(
> > + tag__name=tags[tagtype] + '-by'
> > + ).count()
> > + return '%s-by: Test Taggeri%d <tagger at example.com>\n' % (
> > + tags[tagtype], index + 1
> > + )
> >
> > def create_tag_comment(self, patch, tagtype=None):
> > comment = create_comment(
> > @@ -179,29 +185,3 @@ class PatchTagsTest(TransactionTestCase):
> > c1.content += self.create_tag(self.REVIEW)
> > c1.save()
> > self.assertTagsEqual(self.patch, 1, 1, 0)
> > -
> > -
> > -class PatchTagManagerTest(PatchTagsTest):
> > -
> > - def assertTagsEqual(self, patch, acks, reviews, tests): # noqa
> > - tagattrs = {}
> > - for tag in Tag.objects.all():
> > - tagattrs[tag.name] = tag.attr_name
> > -
> > - # force project.tags to be queried outside of the assertNumQueries
> > - patch.project.tags
> > -
> > - # we should be able to do this with two queries: one for
> > - # the patch table lookup, and the prefetch_related for the
> > - # projects table.
> > - with self.assertNumQueries(2):
> > - patch = Patch.objects.with_tag_counts(project=patch.project) \
> > - .get(pk=patch.pk)
> > -
> > - counts = (
> > - getattr(patch, tagattrs['Acked-by']),
> > - getattr(patch, tagattrs['Reviewed-by']),
> > - getattr(patch, tagattrs['Tested-by']),
> > - )
> > -
> > - self.assertEqual(counts, (acks, reviews, tests))
> > diff --git a/releasenotes/notes/tagging-rework-9907e9dc3f835566.yaml
> > b/releasenotes/notes/tagging-rework-9907e9dc3f835566.yaml
> > new file mode 100644
> > index 0000000..1151be9
> > --- /dev/null
> > +++ b/releasenotes/notes/tagging-rework-9907e9dc3f835566.yaml
> > @@ -0,0 +1,18 @@
> > +---
> > +features:
> > + - |
> > + Tagging is completely reworked. Instead of counts, real values are
> > + extracted. This fixes wrong counts when for example the same person
> > + accidentally sent the Acked-by email twice, since only a single same
> > pair
> > + tagname-value can be assigned to a patch. Tags from cover letters are
> > now
> > + counted towards each patch in the series and added to mbox files and
> > patch
> > + API as well. Sources of tags are tracked and `tags` field is populated
> > in
> > + the API for cover letters and comments so it's much easier to find
> > where
> > + a specific tag originated from. Three different filters are added for
> > patch
> > + and cover letter filtering in the REST API - `tagname=xxx` is a
> > + case-insensitive filter on tag name (eg. find patches that were
> > Tested-by
> > + someone), `tagvalue=xxx` checks for case-insensitive substrings in the
> > + value of the tags (is Johny slacking off and not acking/reviewing
> > + patches?), and `tag=name,value` combines the two and filters on the
> > + tagname-value pair (is there a patch that fixes the bug with this
> > issue
> > + link?).
>
> There should be an 'Upgrade' section here telling the user to run the
> 'retag' command, as noted above.
Will do, as I simplify the migration!
>
> I didn't see an update to the 'default_tags' fixture. I assume you need
> to update this?
>
I believe the fixture worked fine, since it only sets up the "tag" which
is unchanged.
> Stephen
>
>
Thanks for the comments! I'll dig into them and send v2 hopefully soon.
Veronika
More information about the Patchwork
mailing list