[PATCH 05/11] Add covering index for /list/ query

Stewart Smith stewart at linux.ibm.com
Mon Sep 10 12:50:56 AEST 2018

Stephen Finucane <stephen at that.guru> writes:
> On Fri, 2018-08-10 at 18:01 +1000, Stewart Smith wrote:
>> In constructing the list of patches for a project, there are two
>> main queries that are executed:
>> 1) get a count() of how many patches there are
>> 2) Get the page of results being displayed
>> In a test dataset of ~11500 LKML patches and ~4000 others, the existing
>> code would take around 585ms and 858ms with a cold cache and 28ms and
>> 198ms for a warm cache.
>> By adding a covering index, we get down to 4ms and 255ms for a cold
>> cache, and 4ms and 143ms for a warm cache!
>> Additionally, when there's a lot of archived or accepted patches
>> (I used ~11000 archived out of the 15000 total in my test set)
>> the query time goes from 28ms and 72ms down to 2ms and 33-40ms!
>> Signed-off-by: Stewart Smith <stewart at linux.ibm.com>
> As before, I'm trusting your leet skillz here in all DB-related things.
> I don't have a large enough dataset to validate this properly but I am
> seeing a performance improvement in the smaller set I do have.

Simply put, an index is just a tree with a key of
something like this: |IndexField1|IndexField|PrimaryKeyOrPointerToHowtoGetFullRow|
So, if you have all the fields you need to answer a query in the index,
the database engine doesn't *need* to go and find the real row, it can
answer the query just from the index - saving potentially *lot* of disk
seeks as well as being a lot more cache friendly - especially as this
will group things on disk based on the order of the index, so if we
include things the right way for us, we get a cache friendly structure
that groups information on patches in a project close together on disk
and in an order that we're likely to request.

Stewart Smith
OPAL Architect, IBM.

More information about the Patchwork mailing list