[PATCH v2 2/2] powerpc/pseries: Wait for completion of hotplug events during PRRN handling

Michael Ellerman mpe at ellerman.id.au
Wed Aug 8 23:38:52 AEST 2018


John Allen <jallen at linux.ibm.com> writes:
> On Wed, Aug 01, 2018 at 11:16:22PM +1000, Michael Ellerman wrote:
>>John Allen <jallen at linux.ibm.com> writes:
>>> On Mon, Jul 23, 2018 at 11:41:24PM +1000, Michael Ellerman wrote:
>>>>John Allen <jallen at linux.ibm.com> writes:
>>>>
>>>>> While handling PRRN events, the time to handle the actual hotplug events
>>>>> dwarfs the time it takes to perform the device tree updates and queue the
>>>>> hotplug events. In the case that PRRN events are being queued continuously,
>>>>> hotplug events have been observed to be queued faster than the kernel can
>>>>> actually handle them. This patch avoids the problem by waiting for a
>>>>> hotplug request to complete before queueing more hotplug events.
>>
>>Have you tested this patch in isolation, ie. not with patch 1?
>
> While I was away on vacation, I believe a build was tested with just
> this patch and not the first, and it has been running with no problems.
> However, I think they've had trouble reproducing the problem in
> general, so it may just be that the environment is not set up properly
> to trigger the issue.

Yes I asked Haren to test it :)

From memory there were still some warnings about the work queue being
blocked for long periods, but they weren't fatal.

>>>>So do we need the hotplug work queue at all? Can we just call
>>>>handle_dlpar_errorlog() directly?
>>>>
>>>>Or are we using the work queue to serialise things? And if so would a
>>>>mutex be better?
>>>
>>> Right, the workqueue is meant to serialize all hotplug events, and it
>>> gets used for more than just PRRN events. I believe the motivation for
>>> using the workqueue over a mutex is that KVM guests initiate hotplug
>>> events through the hotplug interrupt and can queue fairly large
>>> requests, meaning that in this scenario waiting for a lock would block
>>> interrupts for a while.
>>
>>OK, but that just means that path needs to schedule work to run later.
>>
>>> Using the workqueue allows us to serialize hotplug events
>>> from different sources in the same way without worrying about the
>>> context in which the event is generated.
>>
>>A lock would be so much simpler.
>>
>>It looks like we have three callers of queue_hotplug_event(): the dlpar
>>code, the mobility code and the ras interrupt.
>>
>>The dlpar code already waits synchronously:
>>
>>  init_completion(&hotplug_done);
>>  queue_hotplug_event(hp_elog, &hotplug_done, &rc);
>>  wait_for_completion(&hotplug_done);
>>
>>You're changing mobility to do the same (this patch), leaving the ras
>>interrupt as the only caller that actually queues work and returns.
>>
>>
>>So it really seems like a mutex would do the trick, and the ras
>>interrupt would be the only case that needs to schedule work for later.
>
> I think you may be right, but I would need some feedback from Nathan 
> Fontenot before I redesign the queue. He's been thinking about that 
> design for longer than I have and may know something that I don't 
> regarding the reason we're using a workqueue rather than a mutex.

> Given that the bug this is meant to address is pretty high priority, 
> would you consider the wait_for_completion an acceptable stopgap while a 
> more substantial redesign of this code is discussed?

Yeah that would be OK.
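
To make the mutex idea concrete, below is the rough shape I'm imagining
for the eventual redesign. It's only a sketch - hotplug_mutex,
handle_hotplug_event(), struct hotplug_work and hotplug_work_fn() are
made-up names for illustration; only handle_dlpar_errorlog() and
struct pseries_hp_errorlog exist in the tree today:

  /* Sketch only: serialise all hotplug events on a mutex instead of
   * funnelling everything through the work queue.
   */
  static DEFINE_MUTEX(hotplug_mutex);

  static int handle_hotplug_event(struct pseries_hp_errorlog *hp_elog)
  {
          int rc;

          mutex_lock(&hotplug_mutex);
          rc = handle_dlpar_errorlog(hp_elog);
          mutex_unlock(&hotplug_mutex);

          return rc;
  }

  /* The ras interrupt can't sleep on the mutex, so it stays the one
   * path that defers to process context via a work item:
   */
  struct hotplug_work {
          struct work_struct work;
          struct pseries_hp_errorlog *errlog;
  };

  static void hotplug_work_fn(struct work_struct *work)
  {
          struct hotplug_work *hpw = container_of(work, struct hotplug_work, work);

          handle_hotplug_event(hpw->errlog);
          kfree(hpw->errlog);
          kfree(hpw);
  }

The dlpar and mobility code would then call handle_hotplug_event()
directly and block until it's done, which is effectively what they
already do today via the completion.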

cheers

