[PATCH 10/11] Be sensible computing project patch counts
Stewart Smith
stewart at linux.ibm.com
Mon Sep 10 12:56:16 AEST 2018
Stephen Finucane <stephen at that.guru> writes:
> On Fri, 2018-08-31 at 15:16 +0100, Stephen Finucane wrote:
>> On Fri, 2018-08-10 at 18:01 +1000, Stewart Smith wrote:
>> > Django actively fights constructing a query that isn't insane.
>> >
>> > So, let's go and just execute a raw one. This is all very standard
>> > SQL so should execute everywhere without a problem.
>> >
>> > With the dataset of patchwork.ozlabs.org, looking at the /project/
>> > page for qemu-devel would take 13 queries and 1500ms,
>> > with this patch it's down to 11 queries in ~250ms.
>> > For the dataset of the netdev list, it's down to 440ms from 1500ms.
>> >
>> > Signed-off-by: Stewart Smith <stewart at linux.ibm.com>
>>
>> I'm generally reluctant to dive into SQL unless it's absolutely
>> required. This is simple enough and the gains are significant so
>>
>> Reviewed-by: Stephen Finucane <stephen at that.guru>
>
> Actually, there's a potential issue here which I spotted while testing
> with a smaller (odd, I know) set of patches. I can fix at merge time,
> assuming a respin isn't needed.
>
>> Is this a potential bug report for Django?
>>
>> Stephen
>>
>> > ---
>> > patchwork/views/project.py | 23 ++++++++++++++++++++---
>> > 1 file changed, 20 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/patchwork/views/project.py b/patchwork/views/project.py
>> > index 484455c02d9d..2a75242a06af 100644
>> > --- a/patchwork/views/project.py
>> > +++ b/patchwork/views/project.py
>> > @@ -27,6 +27,8 @@ from patchwork.compat import reverse
>> > from patchwork.models import Patch
>> > from patchwork.models import Project
>> >
>> > +from django.db import connection
>> > +
>> >
>> > def project_list(request):
>> > projects = Project.objects.all()
>> > @@ -44,14 +46,29 @@ def project_list(request):
>> >
>> > def project_detail(request, project_id):
>> > project = get_object_or_404(Project, linkname=project_id)
>> > - patches = Patch.objects.filter(project=project)
>> > +
>> > + # So, we revert to raw sql because if we do what you'd think would
>> > + # be the correct thing in Django-ese, it ends up doing a *pointless*
>> > + # join with patchwork_submissions that ends up ruining the query.
>> > + # So, we do not do this, as this is wrong:
>> > + #patches = Patch.objects.filter(patch_project_id=project.id).only('archived')
>> > + #patches = patches.annotate(c=Count('archived'))
>> > + # and instead do this, because it's simple and fast
>> > +
>> > + n_patches = {}
>> > + with connection.cursor() as cursor:
>> > + c = cursor.execute('SELECT archived,COUNT(submission_ptr_id) as c FROM patchwork_patch WHERE patch_project_id=%s GROUP BY archived',
>> > + [project.id])
>> > +
>> > + for r in cursor:
>> > + n_patches[r[0]] = r[1]
>> >
>> > context = {
>> > 'project': project,
>> > 'maintainers': User.objects.filter(
>> > profile__maintainer_projects=project),
>> > - 'n_patches': patches.filter(archived=False).count(),
>> > - 'n_archived_patches': patches.filter(archived=True).count(),
>> > + 'n_patches': n_patches[False],
>> > + 'n_archived_patches': n_patches[True],
>
> It's possible that there are (a) no patches, (b) all patches are
> archived, or (c) no patches are archived. If any of these are true,
> you'll get a key error. We can fix this very simply like so:
>
> 'n_patches': n_patches[False] if False in n_patches else 0,
> 'n_archived_patches': n_patches[True] if True in n_patches else 0,
>
> As noted above, I'll fix this at merge time and add a test to ensure we
> don't regress.
Oh, neat! Yeah, I likely didn't test those scenarios at all, I was
focused on large amounts of data rather than no data.
--
Stewart Smith
OPAL Architect, IBM.
More information about the Patchwork
mailing list