[PATCH] powerpc/drmem: cache LMBs in xarray to accelerate lookup

Tue Feb 4 07:13:46 AEDT 2020

On Thu, Jan 30, 2020 at 10:09:32AM -0600, Fontenot, Nathan wrote:
> On 1/29/2020 12:10 PM, Scott Cheloha wrote:
> > On Tue, Jan 28, 2020 at 05:56:55PM -0600, Nathan Lynch wrote:
> >> Scott Cheloha <cheloha at linux.ibm.com> writes:
> >>> LMB lookup is currently an O(n) linear search.  This scales poorly when
> >>> there are many LMBs.
> >>>
> >>> If we cache each LMB by both its base address and its DRC index
> >>> in an xarray we can cut lookups to O(log n), greatly accelerating
> >>> drmem initialization and memory hotplug.
> >>>
> >>> This patch introduces two xarrays of of LMBs and fills them during
> >>> drmem initialization.  The patch also adds two interfaces for LMB
> >>> lookup.
> >>
> >> Good but can you replace the array of LMBs altogether
> >> (drmem_info->lmbs)? xarray allows iteration over the members if needed.
> > 
> > I don't think we can without potentially changing the current behavior.
> > 
> > The current behavior in dlpar_memory_{add,remove}_by_ic() is to advance
> > linearly through the array from the LMB with the matching DRC index.
> > 
> > Iteration through the xarray via xa_for_each_start() will return LMBs
> > indexed with monotonically increasing DRC indices.> 
> > Are they equivalent?  Or can we have an LMB with a smaller DRC index
> > appear at a greater offset in the array?
> > 
> > If the following condition is possible:
> > 
> > 	drmem_info->lmbs[i].drc_index > drmem_info->lmbs[j].drc_index
> > 
> > where i < j, then we have a possible behavior change because
> > xa_for_each_start() may not return a contiguous array slice.  It might
> > "leap backwards" in the array.  Or it might skip over a chunk of LMBs.
> > 
> 
> The LMB array should have each LMB in monotonically increasing DRC Index
> value. Note that this is set up based on the DT property but I don't recall
> ever seeing the DT specify LMBs out of order or not being contiguous.

Is that ordering guaranteed by the PAPR or some other spec or is that
just a convention?

Code like drmem_update_dt_v1() makes me very nervous:

static int drmem_update_dt_v1(struct device_node *memory,
                              struct property *prop)
{
        struct property *new_prop;
        struct of_drconf_cell_v1 *dr_cell;
        struct drmem_lmb *lmb;
        u32 *p;

        new_prop = clone_property(prop, prop->length);
        if (!new_prop)
                return -1;

        p = new_prop->value;
        *p++ = cpu_to_be32(drmem_info->n_lmbs);

        dr_cell = (struct of_drconf_cell_v1 *)p;

        for_each_drmem_lmb(lmb) {
                dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
                dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
                dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
                dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));

                dr_cell++;
        }

        of_update_property(memory, new_prop);
        return 0;
}

If for whatever reason the firmware has a DRC that isn't monotonically
increasing and we update a firmware property at the wrong offset I have
no idea what would happen.

With the array we preserve the order.  Without it we might violate
some assumption the firmware has made.