[PATCH v2 2/2] powerpc/pseries: Wait for completion of hotplug events during PRRN handling
Michael Ellerman
mpe at ellerman.id.au
Wed Aug 1 23:16:22 AEST 2018
John Allen <jallen at linux.ibm.com> writes:
> On Mon, Jul 23, 2018 at 11:41:24PM +1000, Michael Ellerman wrote:
>>John Allen <jallen at linux.ibm.com> writes:
>>
>>> While handling PRRN events, the time to handle the actual hotplug events
>>> dwarfs the time it takes to perform the device tree updates and queue the
>>> hotplug events. In the case that PRRN events are being queued continuously,
>>> hotplug events have been observed to be queued faster than the kernel can
>>> actually handle them. This patch avoids the problem by waiting for a
>>> hotplug request to complete before queueing more hotplug events.
Have you tested this patch in isolation, ie. not with patch 1?
>>So do we need the hotplug work queue at all? Can we just call
>>handle_dlpar_errorlog() directly?
>>
>>Or are we using the work queue to serialise things? And if so would a
>>mutex be better?
>
> Right, the workqueue is meant to serialize all hotplug events and it
> gets used for more than just PRRN events. I believe the motivation for
> using the workqueue over a mutex is that KVM guests initiate hotplug
> events through the hotplug interrupt and can queue fairly large requests
> meaning that in this scenario, waiting for a lock would block interrupts
> for a while.
OK, but that just means that path needs to schedule work to run later.
> Using the workqueue allows us to serialize hotplug events
> from different sources in the same way without worrying about the
> context in which the event is generated.
A lock would be so much simpler.
It looks like we have three callers of queue_hotplug_event(), the dlpar
code, the mobility code and the ras interrupt.
The dlpar code already waits synchronously:
init_completion(&hotplug_done);
queue_hotplug_event(hp_elog, &hotplug_done, &rc);
wait_for_completion(&hotplug_done);
You're changing mobility to do the same (this patch), leaving only the
ras interrupt that actually queues work and returns.
So it really seems like a mutex would do the trick, and the ras
interrupt would be the only case that needs to schedule work for later.
cheers
More information about the Linuxppc-dev
mailing list