[PATCH] powerpc/pseries: Avoid context switch in EEH reset if required
gwshan at linux.vnet.ibm.com
Fri Jan 23 14:50:29 AEDT 2015
On Wed, Jan 21, 2015 at 10:53:38AM +1100, Gavin Shan wrote:
>On Wed, Jan 21, 2015 at 09:56:07AM +1100, Gavin Shan wrote:
>>On Tue, Jan 20, 2015 at 10:28:16AM +0100, Benjamin Herrenschmidt wrote:
>>>On Mon, 2015-01-19 at 09:47 +1100, Gavin Shan wrote:
>>>> On pseries platform, the EEH reset backend pseries_eeh_reset() can
>>>> be called in atomic context as follows. For this case, we should
>>>> call udelay() instead of msleep() to avoid context switching.
>>>It's not acceptable to introduce multi-millisecond delays at interrupt
>>>time. In fact, we should generally not use udelay in such context.
>>>I understand that this is an exceptional error handling case but it's
>>>still not right.
>>Yes, I agree it's unsafe to udelay for multiple milliseconds, as work queued
>>in atomic context is expected to complete as soon as possible.
>>>Are there many other users of pci_set_pcie_reset_state() at interrupt
>>>time ? Can we have a discussion with the PCI folks as to whether that
>>>should be legal or not ?
>>>I'm tempted to require that it's made illegal.
>>Currently, there are 2 drivers calling this function: IPR and misc/genwqe.
>>Also, VFIO would call this function for IBM and Mellanox adapters in PowerKVM
>>repository. For now, the IPR driver is the only one that calls this function in atomic context.
>>Sure, I'll send one email to confirm with PCI folks. I guess it's illegal
>>to call pci_set_pcie_reset_state() in atomic context. If it's the case,
>>I'm afraid Wendy has to change IPR driver to replace the reset timer with
>>something else (e.g. workqueue).
>Another way is to drop the hold/settle delays from pcibios_set_pcie_reset_state()
>and have IPR rely on its timer intervals to cover them. Wendy, could you please
>let me know if it would work for you or not?
> Start reset timer;
> Timer expires, assert the reset. Restart the timer with assert delay;
> Timer expires, deassert the reset. Restart the timer with settle delay;
> Timer expires, ready for subsequent work;
Brian King is the author of pci_set_pcie_reset_state() and, as he said, it was
expected to be callable from atomic context. So I'm going to come up with another
approach along the lines above: callers of pci_set_pcie_reset_state() should apply
the corresponding delays themselves, as the IPR driver does.
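For illustration only, the timer sequence quoted above could be modelled roughly
as the small state machine below. It is a userspace sketch, not driver code: the
names reset_phase, reset_ctx and reset_timer_expired() are hypothetical, and the
step where the reset line changes is where a real caller would presumably invoke
pci_set_pcie_reset_state(). Each step does its work without blocking and returns
the interval after which the timer should be restarted.

```c
/* Hypothetical userspace model of a timer-driven reset sequence. */
enum reset_phase {
	PHASE_IDLE,		/* reset timer started, nothing done yet */
	PHASE_ASSERTED,		/* reset asserted, waiting out the hold delay */
	PHASE_DEASSERTED,	/* reset deasserted, waiting out the settle delay */
	PHASE_READY		/* device may be used again */
};

struct reset_ctx {
	enum reset_phase phase;
	int reset_asserted;	/* models the state of the reset line */
	unsigned int hold_ms;	/* assert (hold) delay, caller-supplied */
	unsigned int settle_ms;	/* settle delay, caller-supplied */
};

/*
 * Models one timer expiry. Performs a single non-blocking step and
 * returns the number of milliseconds after which the timer should be
 * restarted, or 0 once the sequence has finished.
 */
static unsigned int reset_timer_expired(struct reset_ctx *ctx)
{
	switch (ctx->phase) {
	case PHASE_IDLE:
		/* A real caller would assert the reset here. */
		ctx->reset_asserted = 1;
		ctx->phase = PHASE_ASSERTED;
		return ctx->hold_ms;
	case PHASE_ASSERTED:
		/* A real caller would deassert the reset here. */
		ctx->reset_asserted = 0;
		ctx->phase = PHASE_DEASSERTED;
		return ctx->settle_ms;
	case PHASE_DEASSERTED:
		/* Settle delay has elapsed; ready for subsequent work. */
		ctx->phase = PHASE_READY;
		return 0;
	default:
		return 0;
	}
}
```

The point of this shape is that no step ever sleeps or busy-waits: the hold and
settle delays are covered entirely by the interval between timer expiries, so
each step stays safe to run in atomic context.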
Messages from Brian for reference:
| The API has changed. I wrote the pci_set_pcie_reset_state API originally.
| When this API was put in place initially, it was perfectly legal to call it
| from an atomic context. Can you clarify why we have to have the delay in the
| pci_set_pcie_reset_state function? Shouldn't it be the responsibility of the
| callers to ensure a proper delay is used? This was always the case until
So please ignore this patch and I'll send another one, which is implemented in
>>>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>>> Tested-by: Wen Xiong <wenxiong at linux.vnet.ibm.com>
>>>> arch/powerpc/platforms/pseries/eeh_pseries.c | 12 ++++++++----
>>>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>>> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
>>>> index a6c7e19..67623a3 100644
>>>> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
>>>> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
>>>> @@ -503,8 +503,7 @@ static int pseries_eeh_get_state(struct eeh_pe *pe, int *state)
>>>> static int pseries_eeh_reset(struct eeh_pe *pe, int option)
>>>> - int config_addr;
>>>> - int ret;
>>>> + int config_addr, delay, ret;
>>>> /* Figure out PE address */
>>>> config_addr = pe->config_addr;
>>>> @@ -528,9 +527,14 @@ static int pseries_eeh_reset(struct eeh_pe *pe, int option)
>>>> /* We need reset hold or settlement delay */
>>>> if (option == EEH_RESET_FUNDAMENTAL ||
>>>> option == EEH_RESET_HOT)
>>>> - msleep(EEH_PE_RST_HOLD_TIME);
>>>> + delay = EEH_PE_RST_HOLD_TIME;
>>>> + else
>>>> + delay = EEH_PE_RST_SETTLE_TIME;
>>>> + if (in_atomic())
>>>> + udelay(delay * 1000);
>>>> else
>>>> - msleep(EEH_PE_RST_SETTLE_TIME);
>>>> + msleep(delay);
>>>> return ret;