[PATCH 2/4] REST: fix patch listing query

Daniel Axtens dja at axtens.net
Wed Mar 18 00:59:14 AEDT 2020


The patch listing query is punishingly slow under even very simple
filters.

The new data model in 3.0 will help _a lot_, so this is a simple fix: I
did try indexes but haven't got really deeply into the weeds of what we can do
with them.

Move a number of things from select_related to prefetch_related: we trade off
one big, inefficient query for a slightly larger number of significantly more
efficient queries.

On my laptop with 2 copies of the canonical kernel team list loaded into the
database, and considering only the API view (the JSON view is always faster)
with warm caches and considering the entire set of SQL queries:

 - /api/patches/?project=1
    ~1.4-1.5s -> <100ms, something like 14x better

 - /api/patches/?project=1&since=2010-11-01T00:00:00
    ~1.7-1.8s -> <560ms, something like 3x better (now dominated by the
                         counting query only invoked on the HTML API view,
                         not the pure JSON API view.)

The things I moved:

 * project: this was generating SQL that looked like:

   INNER JOIN `patchwork_project` T5
    ON (`patchwork_submission`.`project_id` = T5.`id`)

   This is correct but we've already had to join the patchwork_submission
   table and perhaps as a result it seems to be inefficient.

 * series__project: Likewise we've already had to join the series table,
   doing another join is possibly why it is inefficient.

 * delegate: I do not know why this was tanking performance. I think it
   might relate to the strategy mysql was using.

Reported-by: Konstantin Ryabitsev <konstantin at linuxfoundation.org>
Signed-off-by: Daniel Axtens <dja at axtens.net>
---
 patchwork/api/patch.py            | 9 ++++++---
 patchwork/tests/api/test_patch.py | 4 ++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/patchwork/api/patch.py b/patchwork/api/patch.py
index efa38e1f0788..000100cec6bd 100644
--- a/patchwork/api/patch.py
+++ b/patchwork/api/patch.py
@@ -288,10 +288,13 @@ class PatchList(ListAPIView):
     ordering = 'id'
 
     def get_queryset(self):
+        # TODO(dja): we need to revisit this after the patch migration, paying
+        # particular attention to cases with filtering
         return Patch.objects.all()\
-            .prefetch_related('check_set', 'related__patches__project')\
-            .select_related('project', 'state', 'submitter', 'delegate',
-                            'series__project')\
+            .prefetch_related('check_set', 'delegate', 'project',
+                              'series__project',
+                              'related__patches__project',)\
+            .select_related('state', 'submitter', 'series')\
             .defer('content', 'diff', 'headers')
 
 
diff --git a/patchwork/tests/api/test_patch.py b/patchwork/tests/api/test_patch.py
index aba92b92082e..b24c5ab28947 100644
--- a/patchwork/tests/api/test_patch.py
+++ b/patchwork/tests/api/test_patch.py
@@ -211,11 +211,11 @@ class TestPatchAPI(utils.APITestCase):
         self.assertNotIn('web_url', resp.data[0])
 
     def test_list_bug_335(self):
-        """Ensure we retrieve the embedded series project once."""
+        """Ensure we retrieve the embedded series project in O(1)."""
         series = create_series()
         create_patches(5, series=series)
 
-        with self.assertNumQueries(4):
+        with self.assertNumQueries(7):
             self.client.get(self.api_url())
 
     @utils.store_samples('patch-detail')
-- 
2.20.1



More information about the Patchwork mailing list