[Cbe-oss-dev] [PATCH v2 04/17] powerpc/msi: Improve IRQ bitmap allocator

Thu Oct 2 12:01:52 EST 2014

On Wed, 2014-10-01 at 17:13 +1000, Michael Ellerman wrote:
> On Tue, 2014-30-09 at 10:34:53 UTC, Michael Neuling wrote:
> > From: Ian Munsie <imunsie at au1.ibm.com>
> > 
> > Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
>                                                                        request
> > to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
>                              ^ one space after a period, or die!
> 
> > lot of IRQs which can be a scarce resource.
> > 
> > For cxl we can require multiple IRQs for every contexts that is attached to the
>                                                  context
> > accelerator.  For AFU directed accelerators, there may be 1000s of contexts
> 
> What is an AFU directed accelerator?

From the documentation in the last patch:

AFU Models
==========

    There are two programming models supported by the AFU.  Dedicated
    and AFU directed.  AFU may support one or both models.

    In dedicated model only one MMU context is supported.  In this
    model, only one userspace process can use the accelerator at time.

    In AFU directed model, up to 16K simultaneous contexts can be
    supported.  This means up to 16K simultaneous userspace
    applications may use the accelerator (although specific AFUs may
    support less).  In this mode, the AFU sends a 16 bit context ID
    with each of its requests.  This tells the PSL which context is
    associated with this operation.  If the PSL can't translate a
    request, the ID can also be accessed by the kernel so it can
    determine the associated userspace context to service this
    translation with.

>                    
> > attached, hence we can easily run out of IRQs, especially if we are needlessly
> > wasting them.
> > 
> > This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
>                 x
> > of IRQs, hence avoiding this wastage.
> 
> The crucial detail you failed to mention is that you maintain the behaviour that
> allocations are naturally aligned.

ok, I'll add that.

> Can you add a check in the test code at the bottom of the file to confirm that
> please?

Yep
> 
> > diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
> > index 2ff6302..961a358 100644
> > --- a/arch/powerpc/sysdev/msi_bitmap.c
> > +++ b/arch/powerpc/sysdev/msi_bitmap.c
> > @@ -20,32 +20,37 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num)
> >  	int offset, order = get_count_order(num);
> >  
> >  	spin_lock_irqsave(&bmp->lock, flags);
> > -	/*
> > -	 * This is fast, but stricter than we need. We might want to add
> > -	 * a fallback routine which does a linear search with no alignment.
> > -	 */
> > -	offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order);
> > +
> > +	offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0,
> > +					    num, (1 << order) - 1);
> > +	if (offset > bmp->irq_count)
> > +		goto err;
> 
> Can we get a newline here :)

Ok.

> 
> > +	bitmap_set(bmp->bitmap, offset, num);
> >  	spin_unlock_irqrestore(&bmp->lock, flags);
> >  
> >  	pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n",
> >  		 num, order, offset);
> 
> This print out is a bit confusing now, should probably just drop the order.

Arrh, yep.

Thanks,
Mikey