[PATCH] Fix slow Patch counting query

Daniel Axtens dja at axtens.net
Fri Mar 9 02:54:37 AEDT 2018


Stephen Rothwell noticed (way back in September - sorry Stephen!) that
the following query is really slow on OzLabs:

SELECT COUNT(*) AS "__count" FROM "patchwork_patch"
    INNER JOIN "patchwork_submission" ON
        ("patchwork_patch"."submission_ptr_id" = "patchwork_submission"."id")
    WHERE ("patchwork_submission"."project_id" = 14 AND
           "patchwork_patch"."state_id" IN
	       (SELECT U0."id" AS Col1 FROM "patchwork_state" U0
                WHERE U0."action_required" = true
		ORDER BY U0."ordering" ASC));

I think this is really slow because we have to join the patch and
submission table to get the project id, which we need to filter the
patches.

Duplicate the project id in the patch table itself, which allows us to
avoid the JOIN.

The new query reads as:
SELECT COUNT(*) AS "__count" FROM "patchwork_patch"
    WHERE ("patchwork_patch"."patch_project_id" = 1 AND
           "patchwork_patch"."state_id" IN
	       (SELECT U0."id" AS Col1 FROM "patchwork_state" U0
	        WHERE U0."action_required" = true
		ORDER BY U0."ordering" ASC));

Very simple testing on a small, artifical Postgres instance (3
projects, 102711 patches), shows speed gains of ~1.5-5x for this
query. Looking at Postgres' cost estimates (EXPLAIN) of the first
query vs the second query, we see a ~1.75x improvement there too.

I suspect the gains will be bigger on OzLabs.

(It turns out all of this is all for the "| NN patches" counter we
added to the filter bar!!)

Reported-by: Stephen Rothwell <sfr at canb.auug.org.au>
Signed-off-by: Daniel Axtens <dja at axtens.net>

---

This requires a migration, so I don't think we can feasibly do it as a
stable update.

I think we drop the patch counter for stable and try to get this and
the event stuff merged to master promptly, and just tag 2.1. (To that
end, I will re-read and finish reviewing the event stuff soon.)
---
 patchwork/migrations/0024_patch_patch_project.py | 39 ++++++++++++++++++++++++
 patchwork/models.py                              |  4 +++
 patchwork/parser.py                              |  1 +
 patchwork/views/__init__.py                      |  2 +-
 4 files changed, 45 insertions(+), 1 deletion(-)
 create mode 100644 patchwork/migrations/0024_patch_patch_project.py

diff --git a/patchwork/migrations/0024_patch_patch_project.py b/patchwork/migrations/0024_patch_patch_project.py
new file mode 100644
index 000000000000..76d8f144c9dd
--- /dev/null
+++ b/patchwork/migrations/0024_patch_patch_project.py
@@ -0,0 +1,39 @@
+# -*- coding: utf-8 -*-
+# Generated by Django 1.11.10 on 2018-03-08 01:51
+from __future__ import unicode_literals
+
+from django.db import migrations, models
+import django.db.models.deletion
+
+
+class Migration(migrations.Migration):
+    # per migration 16, but note this seems to be going away
+    # in new PostgreSQLs (https://stackoverflow.com/questions/12838111/south-cannot-alter-table-because-it-has-pending-trigger-events#comment44629663_12838113)
+    atomic = False
+
+    dependencies = [
+        ('patchwork', '0023_timezone_unify'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='patch',
+            name='patch_project',
+            field=models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.CASCADE, to='patchwork.Project'),
+            preserve_default=False,
+        ),
+
+        # as with 10, this will break if you use non-default table names
+        migrations.RunSQL('''UPDATE patchwork_patch SET patch_project_id =
+                               (SELECT project_id FROM patchwork_submission
+                                WHERE patchwork_submission.id =
+                                        patchwork_patch.submission_ptr_id);'''
+        ),
+
+        migrations.AlterField(
+            model_name='patch',
+            name='patch_project',
+            field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='patchwork.Project'),
+        ),
+
+    ]
diff --git a/patchwork/models.py b/patchwork/models.py
index b2491752f04a..3b905c4cd75b 100644
--- a/patchwork/models.py
+++ b/patchwork/models.py
@@ -423,6 +423,10 @@ class Patch(SeriesMixin, Submission):
     archived = models.BooleanField(default=False)
     hash = HashField(null=True, blank=True)
 
+    # duplicate project from submission in subclass so we can count the
+    # patches in a project without needing to do a JOIN.
+    patch_project = models.ForeignKey(Project, on_delete=models.CASCADE)
+
     objects = PatchManager()
 
     @staticmethod
diff --git a/patchwork/parser.py b/patchwork/parser.py
index 803e98592fa8..805037c72d73 100644
--- a/patchwork/parser.py
+++ b/patchwork/parser.py
@@ -1004,6 +1004,7 @@ def parse_mail(mail, list_id=None):
         patch = Patch.objects.create(
             msgid=msgid,
             project=project,
+            patch_project=project,
             name=name[:255],
             date=date,
             headers=headers,
diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py
index 3baf2999a836..f8d23a388ac7 100644
--- a/patchwork/views/__init__.py
+++ b/patchwork/views/__init__.py
@@ -270,7 +270,7 @@ def generic_list(request, project, view, view_args=None, filter_settings=None,
             context['filters'].set_status(filterclass, setting)
 
     if patches is None:
-        patches = Patch.objects.filter(project=project)
+        patches = Patch.objects.filter(patch_project=project)
 
     # annotate with tag counts
     patches = patches.with_tag_counts(project)
-- 
2.14.1



More information about the Patchwork mailing list