[PATCH] Optimise querying of checks in patch list view

Thu Sep 15 18:10:29 AEST 2016

tl;dr: with about 300 mails from the patchwork list, according to
django-debug-toolbar, to render
http://localhost:8000/project/patchwork/list/

Without this patch:
 - ~1.35 seconds of CPU time
 - 110 SQL queries, taking ~70ms

With this patch:
 - < 0.3 seconds of CPU time
 - 10 SQL queries, taking <20ms

How? Replace an .exclude() on a QuerySet with a list comprehension.
Yes, that's normally a pessimisation.
Surprisingly, it's an optimisation here.
Why? Where we're looking at patches in anything that uses a generic_list()
in the view, we do a prefetch_related('check_set'). But, if we then do a
.filter or a .exclude, that throws out the existing, cached information,
and does another query. (See the Django docs on prefetch_related.)

So, do it 'by hand' in Python instead.

Signed-off-by: Daniel Axtens <dja at axtens.net>
---
 patchwork/models.py | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/patchwork/models.py b/patchwork/models.py
index 5043c9a031a8..e7538703ff86 100644
--- a/patchwork/models.py
+++ b/patchwork/models.py
@@ -502,7 +502,17 @@ class Patch(Submission):
             unique[user][ctx] = check
 
         # filter out the "duplicates" or older, now-invalid results
-        return self.check_set.all().exclude(id__in=duplicates)
+
+        # Why don't we use filter or exclude here? Surprisingly, it's
+        # an optimisation in the common case. Where we're looking at
+        # checks in anything that uses a generic_list() in the view,
+        # we do a prefetch_related('check_set'). But, if we then do a
+        # .filter or a .exclude, that throws out the existing, cached
+        # information, and does another query. (See the Django docs on
+        # prefetch_related.) So, do it 'by hand' in Python. We can
+        # also be confident that this won't be worse, seeing as we've
+        # just iterated over self.check_set.all() *anyway*.
+        return [c for c in self.check_set.all() if c.id not in duplicates]
 
     @property
     def check_count(self):
-- 
2.7.4