[PATCH v8 04/14] powerpc/vas: Alloc and setup IRQ and trigger port address

Wed Mar 25 00:27:12 AEDT 2020

On 3/24/20 3:26 AM, Oliver O'Halloran wrote:
> On Mon, Mar 23, 2020 at 8:28 PM Cédric Le Goater <clg at kaod.org> wrote:
>>
>> On 3/23/20 10:06 AM, Cédric Le Goater wrote:
>>> On 3/19/20 7:14 AM, Haren Myneni wrote:
>>>>
>>>> Alloc IRQ and get trigger port address for each VAS instance. The kernel
>>>> registers this IRQ per VAS instance and sets this port for each send
>>>> window. NX interrupts the kernel when it sees a page fault.
>>>
>>> I don't understand why this is not done by the OPAL driver for each VAS
>>> of the system. Is the VAS unit very different from OpenCAPI regarding
>>> the fault ?
>>
>> I checked the previous patchsets and I see that v3 was more like I expected
>> it: one interrupt for faults allocated by the skiboot driver and exposed
>> in the DT.
>>
>> What made you change your mind ?
> 
> From init_vas_inst() in arch/powerpc/platforms/powernv/vas.c:
> 
>         if (pdev->num_resources != 4) {
>                 pr_err("Unexpected DT configuration for [%s, %d]\n",
>                                 pdev->name, vasid);
>                 return -ENODEV;
>         }
> 
> This code should never have been written, but here we are. Due to the
> above, adding an interrupt in the DT makes the driver unable to bind on
> older kernels. In an older version of the patches (don't think it was
> posted) Haren was using a non-standard interrupt property and we could
> work around the problem by going back to that.
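
The binding problem described above can be sketched in plain user-space C.
This is only an illustration, not the kernel code: struct fake_pdev,
strict_check() and tolerant_check() are hypothetical names, and the sketch
assumes that an interrupt added to the DT node shows up as one extra
platform resource. The tolerant variant is one possible alternative, not
something proposed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct platform_device;
 * only the fields the quoted check looks at are modelled. */
struct fake_pdev {
	const char *name;
	unsigned int num_resources;
};

/* The number of resources the quoted init_vas_inst() check expects. */
#define VAS_EXPECTED_RESOURCES 4

/* Mirrors the quoted check: any count other than 4 is rejected, so a
 * DT node that gains one more resource can no longer be bound. */
int strict_check(const struct fake_pdev *pdev)
{
	assert(pdev != NULL);
	if (pdev->num_resources != VAS_EXPECTED_RESOURCES)
		return -ENODEV;
	return 0;
}

/* Assumed alternative: fail only when required resources are missing,
 * and ignore any additional ones. */
int tolerant_check(const struct fake_pdev *pdev)
{
	assert(pdev != NULL);
	if (pdev->num_resources < VAS_EXPECTED_RESOURCES)
		return -ENODEV;
	return 0;
}
```

With a fake device carrying five resources, strict_check() returns -ENODEV
while tolerant_check() returns 0; both accept exactly four resources.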

ok ... :/ I didn't know. Don't we have a rule on LinuxPPC for such
things? Such as, the culprit should send a croissant to everyone
involved.

> However, we already have the OPAL calls for allocating / freeing
> hardware interrupt numbers so why not do that? 

It's a good way to work around the problem but we are bypassing the
irqchip which does other things for the driver.

> If we ever want to take
> advantage of the job completion interrupts we'd want to have the
> ability to allocate them since the completion interrupts are
> per-window rather than per-VAS.

Yes. That's what I thought it was about to begin with. OCXL has a
first implementation of such interrupts.

>> This version is hijacking the lowlevel routines of the XIVE irqchip which
>> is not the best approach. OCXL is doing that because it needs to allocate
>> interrupts for the user space processes using the AFU and we should rework
>> that part.
> 
> What'd you have in mind for reworking the ocxl interrupt allocation?
> I didn't find it that objectionable since it's more or less the same as
> what happens when allocating IPIs.

I think we need to work a bit more on the concepts, on the interfaces,
both internal at the platform kernel level and at the user space level,
and on the configuration, with chip affinity in mind. There is a bunch of
information about the interrupt sources, retrieved from the firmware or
hypervisor, that we care about. An irqchip might be the best option
for the moment.

At the same time, it would be good to keep in mind user interrupts. 

C.

> 
>> However, the translation fault interrupt is allocated by skiboot.
>>
>> Sorry for the noise, I would like to understand more how this works. I also
>> have passthrough in mind.
>>
>> C.