[PATCH] Fix slow Patch counting query

Daniel Axtens dja at axtens.net
Wed Mar 21 10:04:23 AEDT 2018


I have fixed up a small error where I didn't update a test and applied
this to master. I'm just waiting for Travis CI to run on my repository,
and then I will push it to the main repo.

As this is very important to OzLabs, I will spin a 2.1-rc1 soon.

Regards,
Daniel

Daniel Axtens <dja at axtens.net> writes:

> Stephen Rothwell noticed (way back in September - sorry Stephen!) that
> the following query is really slow on OzLabs:
>
> SELECT COUNT(*) AS "__count" FROM "patchwork_patch"
>     INNER JOIN "patchwork_submission" ON
>         ("patchwork_patch"."submission_ptr_id" = "patchwork_submission"."id")
>     WHERE ("patchwork_submission"."project_id" = 14 AND
>            "patchwork_patch"."state_id" IN
> 	       (SELECT U0."id" AS Col1 FROM "patchwork_state" U0
>                 WHERE U0."action_required" = true
> 		ORDER BY U0."ordering" ASC));
>
> I think this is really slow because we have to join the patch and
> submission table to get the project id, which we need to filter the
> patches.
>
> Duplicate the project id in the patch table itself, which allows us to
> avoid the JOIN.
>
> The new query reads as:
> SELECT COUNT(*) AS "__count" FROM "patchwork_patch"
>     WHERE ("patchwork_patch"."patch_project_id" = 1 AND
>            "patchwork_patch"."state_id" IN
> 	       (SELECT U0."id" AS Col1 FROM "patchwork_state" U0
> 	        WHERE U0."action_required" = true
> 		ORDER BY U0."ordering" ASC)); 
>
> Very simple testing on a small, artifical Postgres instance (3
> projects, 102711 patches), shows speed gains of ~1.5-5x for this
> query. Looking at Postgres' cost estimates (EXPLAIN) of the first
> query vs the second query, we see a ~1.75x improvement there too.
>
> I suspect the gains will be bigger on OzLabs.
>
> (It turns out all of this is all for the "| NN patches" counter we
> added to the filter bar!!)
>
> Reported-by: Stephen Rothwell <sfr at canb.auug.org.au>
> Signed-off-by: Daniel Axtens <dja at axtens.net>
>
> ---
>
> This requires a migration, so I don't think we can feasibly do it as a
> stable update.
>
> I think we drop the patch counter for stable and try to get this and
> the event stuff merged to master promptly, and just tag 2.1. (To that
> end, I will re-read and finish reviewing the event stuff soon.)
> ---
>  patchwork/migrations/0024_patch_patch_project.py | 39 ++++++++++++++++++++++++
>  patchwork/models.py                              |  4 +++
>  patchwork/parser.py                              |  1 +
>  patchwork/views/__init__.py                      |  2 +-
>  4 files changed, 45 insertions(+), 1 deletion(-)
>  create mode 100644 patchwork/migrations/0024_patch_patch_project.py
>
> diff --git a/patchwork/migrations/0024_patch_patch_project.py b/patchwork/migrations/0024_patch_patch_project.py
> new file mode 100644
> index 000000000000..76d8f144c9dd
> --- /dev/null
> +++ b/patchwork/migrations/0024_patch_patch_project.py
> @@ -0,0 +1,39 @@
> +# -*- coding: utf-8 -*-
> +# Generated by Django 1.11.10 on 2018-03-08 01:51
> +from __future__ import unicode_literals
> +
> +from django.db import migrations, models
> +import django.db.models.deletion
> +
> +
> +class Migration(migrations.Migration):
> +    # per migration 16, but note this seems to be going away
> +    # in new PostgreSQLs (https://stackoverflow.com/questions/12838111/south-cannot-alter-table-because-it-has-pending-trigger-events#comment44629663_12838113)
> +    atomic = False
> +
> +    dependencies = [
> +        ('patchwork', '0023_timezone_unify'),
> +    ]
> +
> +    operations = [
> +        migrations.AddField(
> +            model_name='patch',
> +            name='patch_project',
> +            field=models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.CASCADE, to='patchwork.Project'),
> +            preserve_default=False,
> +        ),
> +
> +        # as with 10, this will break if you use non-default table names
> +        migrations.RunSQL('''UPDATE patchwork_patch SET patch_project_id =
> +                               (SELECT project_id FROM patchwork_submission
> +                                WHERE patchwork_submission.id =
> +                                        patchwork_patch.submission_ptr_id);'''
> +        ),
> +
> +        migrations.AlterField(
> +            model_name='patch',
> +            name='patch_project',
> +            field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='patchwork.Project'),
> +        ),
> +
> +    ]
> diff --git a/patchwork/models.py b/patchwork/models.py
> index b2491752f04a..3b905c4cd75b 100644
> --- a/patchwork/models.py
> +++ b/patchwork/models.py
> @@ -423,6 +423,10 @@ class Patch(SeriesMixin, Submission):
>      archived = models.BooleanField(default=False)
>      hash = HashField(null=True, blank=True)
>  
> +    # duplicate project from submission in subclass so we can count the
> +    # patches in a project without needing to do a JOIN.
> +    patch_project = models.ForeignKey(Project, on_delete=models.CASCADE)
> +
>      objects = PatchManager()
>  
>      @staticmethod
> diff --git a/patchwork/parser.py b/patchwork/parser.py
> index 803e98592fa8..805037c72d73 100644
> --- a/patchwork/parser.py
> +++ b/patchwork/parser.py
> @@ -1004,6 +1004,7 @@ def parse_mail(mail, list_id=None):
>          patch = Patch.objects.create(
>              msgid=msgid,
>              project=project,
> +            patch_project=project,
>              name=name[:255],
>              date=date,
>              headers=headers,
> diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py
> index 3baf2999a836..f8d23a388ac7 100644
> --- a/patchwork/views/__init__.py
> +++ b/patchwork/views/__init__.py
> @@ -270,7 +270,7 @@ def generic_list(request, project, view, view_args=None, filter_settings=None,
>              context['filters'].set_status(filterclass, setting)
>  
>      if patches is None:
> -        patches = Patch.objects.filter(project=project)
> +        patches = Patch.objects.filter(patch_project=project)
>  
>      # annotate with tag counts
>      patches = patches.with_tag_counts(project)
> -- 
> 2.14.1


More information about the Patchwork mailing list