[PATCH] Fix slow Patch counting query
Daniel Axtens
dja at axtens.net
Wed Mar 28 09:42:33 AEDT 2018
Stephen Finucane <stephen at that.guru> writes:
> On Fri, 2018-03-09 at 02:54 +1100, Daniel Axtens wrote:
>> Stephen Rothwell noticed (way back in September - sorry Stephen!) that
>> the following query is really slow on OzLabs:
>>
>> SELECT COUNT(*) AS "__count" FROM "patchwork_patch"
>> INNER JOIN "patchwork_submission" ON
>> ("patchwork_patch"."submission_ptr_id" = "patchwork_submission"."id")
>> WHERE ("patchwork_submission"."project_id" = 14 AND
>> "patchwork_patch"."state_id" IN
>> (SELECT U0."id" AS Col1 FROM "patchwork_state" U0
>> WHERE U0."action_required" = true
>> ORDER BY U0."ordering" ASC));
>>
>> I think this is really slow because we have to join the patch and
>> submission table to get the project id, which we need to filter the
>> patches.
>>
>> Duplicate the project id in the patch table itself, which allows us to
>> avoid the JOIN.
>>
>> The new query reads as:
>> SELECT COUNT(*) AS "__count" FROM "patchwork_patch"
>> WHERE ("patchwork_patch"."patch_project_id" = 1 AND
>> "patchwork_patch"."state_id" IN
>> (SELECT U0."id" AS Col1 FROM "patchwork_state" U0
>> WHERE U0."action_required" = true
>> ORDER BY U0."ordering" ASC));
>>
>> Very simple testing on a small, artifical Postgres instance (3
>> projects, 102711 patches), shows speed gains of ~1.5-5x for this
>> query. Looking at Postgres' cost estimates (EXPLAIN) of the first
>> query vs the second query, we see a ~1.75x improvement there too.
>>
>> I suspect the gains will be bigger on OzLabs.
>>
>> (It turns out all of this is all for the "| NN patches" counter we
>> added to the filter bar!!)
>>
>> Reported-by: Stephen Rothwell <sfr at canb.auug.org.au>
>> Signed-off-by: Daniel Axtens <dja at axtens.net>
>
> It's unfortunate that this has already merged. While it works, it
> defeats the whole point of the multi-table inheritance introduced in
> commit 86172ccc1 (normalization). To be honest, given the performance
> impacts that particular change introduced (which we're only seeing at
> scale), I'd rather denormalize the whole thing and fold the 'Patch' and
> 'CoverLetter' models back into 'Submission' and just use a 'type' field
> (or similar) to control behavior. Is there any reason not to do this?
I agree that would be a better conceptual solution. The reason I didn't
do it is because I tried a couple of times and couldn't get it all to
work, and I haven't had the time to really sit down and re-engineer it.
If you can get it to work I am happy to revert this and apply that; the
changes to the database schema aren't at all difficult to undo so any
brave souls running master won't be completely hosed.
Regards,
Daniel
> If not, I'd like to wait on 2.1 until we've done both this and the
> event fixes.
>
> Stephen
>
>> ---
>>
>> This requires a migration, so I don't think we can feasibly do it as a
>> stable update.
>>
>> I think we drop the patch counter for stable and try to get this and
>> the event stuff merged to master promptly, and just tag 2.1. (To that
>> end, I will re-read and finish reviewing the event stuff soon.)
>> ---
>> patchwork/migrations/0024_patch_patch_project.py | 39 ++++++++++++++++++++++++
>> patchwork/models.py | 4 +++
>> patchwork/parser.py | 1 +
>> patchwork/views/__init__.py | 2 +-
>> 4 files changed, 45 insertions(+), 1 deletion(-)
>> create mode 100644 patchwork/migrations/0024_patch_patch_project.py
>>
>> diff --git a/patchwork/migrations/0024_patch_patch_project.py b/patchwork/migrations/0024_patch_patch_project.py
>> new file mode 100644
>> index 000000000000..76d8f144c9dd
>> --- /dev/null
>> +++ b/patchwork/migrations/0024_patch_patch_project.py
>> @@ -0,0 +1,39 @@
>> +# -*- coding: utf-8 -*-
>> +# Generated by Django 1.11.10 on 2018-03-08 01:51
>> +from __future__ import unicode_literals
>> +
>> +from django.db import migrations, models
>> +import django.db.models.deletion
>> +
>> +
>> +class Migration(migrations.Migration):
>> + # per migration 16, but note this seems to be going away
>> + # in new PostgreSQLs (https://stackoverflow.com/questions/12838111/south-cannot-alter-table-because-it-has-pending-trigger-events#comment44629663_12838113)
>> + atomic = False
>> +
>> + dependencies = [
>> + ('patchwork', '0023_timezone_unify'),
>> + ]
>> +
>> + operations = [
>> + migrations.AddField(
>> + model_name='patch',
>> + name='patch_project',
>> + field=models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.CASCADE, to='patchwork.Project'),
>> + preserve_default=False,
>> + ),
>> +
>> + # as with 10, this will break if you use non-default table names
>> + migrations.RunSQL('''UPDATE patchwork_patch SET patch_project_id =
>> + (SELECT project_id FROM patchwork_submission
>> + WHERE patchwork_submission.id =
>> + patchwork_patch.submission_ptr_id);'''
>> + ),
>> +
>> + migrations.AlterField(
>> + model_name='patch',
>> + name='patch_project',
>> + field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='patchwork.Project'),
>> + ),
>> +
>> + ]
>> diff --git a/patchwork/models.py b/patchwork/models.py
>> index b2491752f04a..3b905c4cd75b 100644
>> --- a/patchwork/models.py
>> +++ b/patchwork/models.py
>> @@ -423,6 +423,10 @@ class Patch(SeriesMixin, Submission):
>> archived = models.BooleanField(default=False)
>> hash = HashField(null=True, blank=True)
>>
>> + # duplicate project from submission in subclass so we can count the
>> + # patches in a project without needing to do a JOIN.
>> + patch_project = models.ForeignKey(Project, on_delete=models.CASCADE)
>> +
>> objects = PatchManager()
>>
>> @staticmethod
>> diff --git a/patchwork/parser.py b/patchwork/parser.py
>> index 803e98592fa8..805037c72d73 100644
>> --- a/patchwork/parser.py
>> +++ b/patchwork/parser.py
>> @@ -1004,6 +1004,7 @@ def parse_mail(mail, list_id=None):
>> patch = Patch.objects.create(
>> msgid=msgid,
>> project=project,
>> + patch_project=project,
>> name=name[:255],
>> date=date,
>> headers=headers,
>> diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py
>> index 3baf2999a836..f8d23a388ac7 100644
>> --- a/patchwork/views/__init__.py
>> +++ b/patchwork/views/__init__.py
>> @@ -270,7 +270,7 @@ def generic_list(request, project, view, view_args=None, filter_settings=None,
>> context['filters'].set_status(filterclass, setting)
>>
>> if patches is None:
>> - patches = Patch.objects.filter(project=project)
>> + patches = Patch.objects.filter(patch_project=project)
>>
>> # annotate with tag counts
>> patches = patches.with_tag_counts(project)
More information about the Patchwork
mailing list