Question: handling early hotplug interrupts

Wed Aug 30 15:59:30 AEST 2017

Daniel Henrique Barboza <danielhb at linux.vnet.ibm.com> writes:

> Hi,
>
> This is a scenario I've been facing when working on early device
> hotplug in QEMU. When a device is added, an IRQ pulse is fired to notify
> the guest of the event, then the kernel fetches it by calling
> 'check_exception' and handles it. If the hotplug is done too early
> (before SLOF, for example), the pulse is ignored and the hotplug event
> is left unchecked in the events queue.
>
> One solution would be to pulse the hotplug queue interrupt after CAS, 
> when we are sure that the hotplug queue is negotiated. However, this 
> panics the kernel with sig 11 kernel access of bad area, which suggests 
> that the kernel wasn't quite ready to handle it.
>
> In my experiments using upstream 4.13 I saw that there is a 'safe time' 
> to pulse the queue, sometime after CAS and before mounting the root fs, 
> but I wasn't able to pinpoint it. From QEMU's perspective, the last hcall
> done (an h_set_mode) is still too early to pulse it and the kernel 
> panics. Looking at the kernel source I saw that the IRQ handling is 
> initiated quite early in the init process.
>
> So my question (ok, actually 2 questions):
>
> - Is my analysis correct? Is there an unsafe time to fire an IRQ pulse
> before CAS that can break the kernel or am I overlooking/doing something
> wrong?
> - Is there a reliable way to know when the kernel can safely handle the
> hotplug interrupt?

In addition to Ben's comments, you need to think about this differently.

The operating system you're booting may not be Linux.

Whatever QEMU does needs to make sense without reference to the exact
details or ordering of the Linux code. QEMU needs to provide a mechanism
that any operating system could use, and then we can make it work with
Linux.
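
As a rough illustration only, one shape such a mechanism could take is
for the host to keep early events queued and deliver no interrupt until
the guest explicitly says it is ready, at which point a single catch-up
pulse covers anything already pending. The sketch below is a toy model
of that idea, not QEMU or PAPR code: the struct, the function names and
the queue depth are all hypothetical, and how a real guest would signal
readiness is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVQ_CAP 8 /* hypothetical queue depth */

/* Hypothetical model of a hotplug event source (not QEMU or PAPR code). */
struct evq {
    int events[EVQ_CAP]; /* pending event ids, oldest first */
    size_t count;        /* number of pending events */
    bool guest_ready;    /* set only by an explicit guest request */
    unsigned pulses;     /* interrupts actually delivered to the guest */
};

void evq_init(struct evq *q)
{
    q->count = 0;
    q->guest_ready = false;
    q->pulses = 0;
}

/* Host side: queue an event; interrupt only if the guest has opted in. */
bool evq_post(struct evq *q, int event)
{
    assert(q->count <= EVQ_CAP);
    if (q->count == EVQ_CAP)
        return false; /* queue full, event not accepted */
    q->events[q->count++] = event;
    if (q->guest_ready)
        q->pulses++;
    return true;
}

/*
 * Guest side: declare that an interrupt handler is installed. Events
 * queued before this point get one catch-up pulse, so nothing posted
 * early is stranded in the queue.
 */
void evq_guest_enable(struct evq *q)
{
    q->guest_ready = true;
    if (q->count > 0)
        q->pulses++;
}

/* Guest side: fetch the oldest pending event, or -1 if there is none. */
int evq_fetch(struct evq *q)
{
    if (q->count == 0)
        return -1;
    int ev = q->events[0];
    for (size_t i = 1; i < q->count; i++)
        q->events[i - 1] = q->events[i];
    q->count--;
    return ev;
}
```

The point of the model is that the host never has to guess where the
guest is in its boot sequence; the only ordering that matters is one
the guest itself announces.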

cheers