[RFC/PATCH] powerpc: Fix kdump EOI bug (often exhibits as dead console)

Milton Miller miltonm at bga.com
Tue Mar 14 02:59:15 EST 2006


On Mar 13, 2006, at 2:44 AM, Benjamin Herrenschmidt wrote:

> On Mon, 2006-03-13 at 19:16 +1100, Michael Ellerman wrote:
>> If we take an interrupt, and while processing it, decide to kdump we never
>> EOI that interrupt. This can happen for any interrupt, but most commonly it's
>> the console interrupt from a user hitting 'sysrq-c', which prevents the
>> console from working in the second kernel.
>>
>> We're panicking so we don't want to do any more than we need to in the first
>> kernel, so leave things alone there. When we come back up iff we reenable the
>> interrupt in question, do an EOI then. This fixes the bug for me, and appears
>> to cause no issue for other interrupts.
>
> You may want to do the same for mpic.c ...
>
> Ben.

Is that possible?  My memory says that, at least for the distributed 
PIC in Power3 boxes, part of the information needed to do the EOI was 
remembered in a stack in the interrupt controller.  This means (1) the 
EOI must be issued from the processor server that took the interrupt, 
(2) there are a limited number of interrupts that can be presented 
before they are EOI'd, and (3) they must be EOI'd in reverse order.  
(4) I don't know what happens if we issue an EOI with no hardware.  
However, a write to the reset register sent a packet to all PICs to 
reset them.
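
To make constraints (1) to (3) concrete, here is a toy software model of 
a per-processor in-service stack.  It is only a sketch of the behaviour 
described above, not the real Power3 PIC or the mpic.c code: the names 
pic_ack and pic_eoi, the depth MAX_NESTED and the count NR_CPUS are all 
made up for illustration, and rejecting an EOI on an empty stack is a 
choice of the model, since what the hardware does there is unknown.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS    2   /* hypothetical number of processor servers */
#define MAX_NESTED 4   /* hypothetical depth of the in-service stack */

/* Per-processor record of interrupts presented but not yet EOI'd. */
struct cpu_pic_state {
	int in_service[MAX_NESTED];
	int depth;
};

static struct cpu_pic_state pic[NR_CPUS];

/*
 * Processor `cpu` takes interrupt `irq`.  Fails when the stack is full,
 * modelling the limit on interrupts outstanding before an EOI (point 2).
 */
static bool pic_ack(int cpu, int irq)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	struct cpu_pic_state *s = &pic[cpu];

	if (s->depth == MAX_NESTED)
		return false;
	s->in_service[s->depth++] = irq;
	return true;
}

/*
 * Processor `cpu` issues an EOI for `irq`.  The model accepts it only if
 * `irq` is on top of that same processor's stack, which covers both
 * "same processor server" (point 1) and "reverse order" (point 3).
 * An EOI with nothing in service is rejected here; that is an
 * assumption of the model, not known hardware behaviour.
 */
static bool pic_eoi(int cpu, int irq)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	struct cpu_pic_state *s = &pic[cpu];

	if (s->depth == 0)
		return false;
	if (s->in_service[s->depth - 1] != irq)
		return false;
	s->depth--;
	return true;
}
```

In this model, an EOI issued by a different processor from the one that 
took the interrupt, which is what the patch amounts to after a kdump, is 
exactly the case that gets refused.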

What we are doing here is a possibly extraneous, third-party EOI 
(device X interrupted cpu Y, and cpu Z is issuing the EOI to allow the 
device to reissue the interrupt).  For real XICS I know that is both 
possible and safe as long as the interrupt X exists; I am familiar with 
the hardware implementation.


milton



