[PATCH] powerpc/pseries: Avoid context switch in EEH reset if required

Brian King brking at linux.vnet.ibm.com
Tue Jan 27 10:36:19 AEDT 2015


On 01/24/2015 03:57 AM, Benjamin Herrenschmidt wrote:
> On Fri, 2015-01-23 at 14:50 +1100, Gavin Shan wrote:
> 
>> Messages from Brian for reference:
>>
>> | The API has changed. I wrote the pci_set_pcie_reset_state API originally.
>> | When this API was put in place initially, it was perfectly legal to call it
>> | from an atomic context. Can you clarify why we have to have the delay in the
>> | pci_set_pcie_reset_state function? Shouldn't it be the responsibility of the
>> | callers to ensure a proper delay is used? This was always the case until
>> | recently.
>>
>> So please ignore this patch and I'll send another one, which is implemented in
>> above approach.
> 
> I still think it's not a great idea to allow that API to be called in
> atomic context.

That was the entire purpose of the API though. If a driver doesn't need to
do the reset in atomic context, why bother having separate assert / deassert
APIs and just have an API that does the reset, delay, and deassert?


> Brian, the reset API for PCIe involves FW calls which might have to do
> a bunch of stuff under the hood, including potentially significant
> delays.
> 
> For example, under OPAL (and I suppose PowerVM), if doing a PERST, the
> API calls will loop until the link is back up, at least when "releasing"
> the reset line.

Under PowerVM, this maps to the ibm,set-slot-reset. According to PAPR+ V2.7,
R1-7.2.4-5:

For RTAS calls which do not allow the Status of -2 (Busy), there may be “rare” instances where an
anomaly may occur and the call may take longer than a “very short period of time.” In these cases, the call
must complete within 250 microseconds. Otherwise, a hardware error response must be given.


> I wouldn't be surprised if on x86, similar kinds of ACPI calls are
> needed which may not be the best thing to do in atomic context.

x86 hasn't wired up the function yet, so we don't have any code
to look at here. In fact, Power is the only architecture that has
wired up this API at all, all other architectures will return -EINVAL
if it is called.


> I don't see any specific performance issues with issuing resets, so I
> would strongly advocate for changing the API requirements instead so
> that it's called from a task context.

To set some context, this function is only used by ipr for some old
broken adapters. These are adapters that are not supported on p8,
so will never show up under OPAL, only PowerVM. I'm fine with looking
at alternatives for the future, but I can't say I'm too excited about
changing the calling requirements for an API that has been around
for many years. Particularly given that this code is only needed for
these old adapters. If its difficult to implement this for OPAL without
noticeable delays, could we just return -EINVAL for this function on OPAL?,
since its not needed there today anyway.

Thanks,

Brian

> 
> Cheers,
> Ben.
> 
> 
> 
>> Thanks,
>> Gavin
>>
>>>>>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>>>>> Tested-by: Wen Xiong<wenxiong at linux.vnet.ibm.com>
>>>>>> ---
>>>>>>  arch/powerpc/platforms/pseries/eeh_pseries.c | 12 ++++++++----
>>>>>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
>>>>>> index a6c7e19..67623a3 100644
>>>>>> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
>>>>>> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
>>>>>> @@ -503,8 +503,7 @@ static int pseries_eeh_get_state(struct eeh_pe *pe, int *state)
>>>>>>   */
>>>>>>  static int pseries_eeh_reset(struct eeh_pe *pe, int option)
>>>>>>  {
>>>>>> -	int config_addr;
>>>>>> -	int ret;
>>>>>> +	int config_addr, delay, ret;
>>>>>>  
>>>>>>  	/* Figure out PE address */
>>>>>>  	config_addr = pe->config_addr;
>>>>>> @@ -528,9 +527,14 @@ static int pseries_eeh_reset(struct eeh_pe *pe, int option)
>>>>>>  	/* We need reset hold or settlement delay */
>>>>>>  	if (option == EEH_RESET_FUNDAMENTAL ||
>>>>>>  	    option == EEH_RESET_HOT)
>>>>>> -		msleep(EEH_PE_RST_HOLD_TIME);
>>>>>> +		delay = EEH_PE_RST_HOLD_TIME;
>>>>>> +	else
>>>>>> +		delay = EEH_PE_RST_SETTLE_TIME;
>>>>>> +
>>>>>> +	if (in_atomic())
>>>>>> +		udelay(delay * 1000);
>>>>>>  	else
>>>>>> -		msleep(EEH_PE_RST_SETTLE_TIME);
>>>>>> +		msleep(delay);
>>>>>>  
>>>>>>  	return ret;
>>>>>>  }
>>>>>
>>>>>
> 
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center




More information about the Linuxppc-dev mailing list