[Skiboot-stable] [PATCH] xive/p9: Remove assert from xive_eq_for_target()

Cédric Le Goater clg at kaod.org
Fri Nov 27 20:44:01 AEDT 2020


On 11/27/20 10:23 AM, Greg Kurz wrote:
> On Fri, 27 Nov 2020 09:32:43 +0100
> Cédric Le Goater <clg at kaod.org> wrote:
> 
>> On 11/27/20 9:08 AM, Greg Kurz wrote:
>>> On Fri, 27 Nov 2020 08:32:12 +0100
>>> Cédric Le Goater <clg at kaod.org> wrote:
>>>
>>>> XIVE VPs are structures describing the vCPUs of guests. When starting
>>>> a guest, these are allocated and enabled and some checks are done on
>>>> the location of the associated ENDs, which describe the event
>>>> queues. If the block of the VP and the block of the ENDs do not match,
>>>> the XIVE driver asserts.
>>>>
>>>> Unfortunately, there is no way to check that a VP identifier is part
>>>> of a VP block that was previously allocated and it is relatively easy
>>>> to crash the host with a bogus VP id. That can be done with a QEMU
>>>> hack on a machine using vsmt.
>>>>
>>>
>>> Reported-by: Greg Kurz <groug at kaod.org>
>>>
>>> :)
>>>
>>>> Simply remove the assert, the OS should gracefully handle the error.
>>>>
>>>
>>> This seems to be reasonable when xive_eq_for_target() is called from
>>> an OPAL call since they'd all return OPAL_PARAMETER to the OS in this
>>> case.
>>
>> Linux behaves correctly. There are errors but the rollback is correct.
>>
> 
> Cool.
> 
>>  
>>> Some other paths maybe need more care though, eg:
>>>
>>> xive_ipi_init()
>>>     __xive_set_irq_config()
>>>         xive_set_irq_targetting()
>>>             xive_eq_for_target()
>>>
>>> If xive_eq_for_target() fails to map the target to a valid EQ,
>>> this ends up being ignored in xive_ipi_init() with this patch.
>>> Is it okay ?
>>
>> This is the P8 emulation layer fully handled by OPAL. It the VP
>> is bogus, then it means that OPAL just screwed up. 
>>
>> And this seems to be the case because it's using cpu->pir and not 
>> a vp identifier ... If this is working, it's pure luck. Another 
>> good reason to drop support for this emulation which is a source 
>> of complexity. It's dropped for P10. 
>>
>> See :
>>
>>   patchwork.ozlabs.org/patch/1157193  xive/p9: remove XICS emulation
>>   patchwork.ozlabs.org/patch/1157182  xive/p9: remove percpu logging
>>   patchwork.ozlabs.org/patch/1157172  xive/p9: remove XICS emulation OPAL calls
>>
>> I would be very happy to push these again and get rid of 1100 
>> lines of useless bringup complexity. 
>>
> 
> Hmm... some existing setups might still be using power8 compat mode
> for legitimate reasons, eg. workload only certified on POWER8. 

To be clear, I am talking of XICS emulation in OPAL/BML and not in KVM
or PowerVM. 

If I am correct, the certification of an application is tied to an OS 
which is tied to a set of HW. RH7 is certified to run on POWER8 but 
not on POWER9. So, if it might work to run a RH7 on a POWER9 but 
I doubt Redhat would guarantee any POWER8 workload there. 

C. 


More information about the Skiboot-stable mailing list