[PATCH] ehea: Add kdump support

Michael Ellerman michael at ellerman.id.au
Mon Nov 26 19:16:28 EST 2007


On Fri, 2007-11-09 at 14:33 +0100, Thomas Klein wrote:
> To support ehea driver reloading in a kdump kernel the driver has to perform
> firmware handle deregistrations when the original kernel crashes. As there's
> currently no notifier chain for machine crashes this patch enables kdump support
> in the ehea driver by bending the ppc_md.machine_crash_shutdown hook to its own
> machine crash handler. The original machine_crash_shutdown() fn is called
> afterwards. This works fine as long as the ehea driver is the only one which
> does so. Problems may occur if other drivers do the same and unload regularly.
> This patch enables 2.6.24-rc2 to use kdump with ehea and only puts a very
> low risk on base kernel. In 2.6.24 we know ehea is the only user of this
> mechanism. The next step for 2.6.25 would be to add a proper notifier chain.
> The full solution might be that register_reboot_notifier() provides sth
> like a SYS_CRASH action. Please apply.
> 
> Signed-off-by: Thomas Klein <tklein at de.ibm.com>
> 
> ---
>  drivers/net/ehea/ehea.h      |    2 +-
>  drivers/net/ehea/ehea_main.c |   28 ++++++++++++++++++++++++++++
>  2 files changed, 29 insertions(+), 1 deletions(-)
> 

Hi Thomas,

I'm sorry, but this patch is all wrong IMHO.

For kdump we have to assume that the kernel is fundamentally broken,
we've panicked, so something bad has happened - every line of kernel
code that is run decreases the chance that we'll successfully make it
into the kdump kernel.

So just calling unregister_driver() is no good, that's going to call
lots of code, try to take lots of locks, rely on lots of data structures
being uncorrupted etc. etc.

And the hijacking of machine_crash_shutdown() is IMO not an acceptable
solution, as you say it only works if EHEA is the only driver to do it.
And as soon as EHEA does it other drivers will want to do it.

What we need is the _minimal_ set of actions that can happen to make
EHEA work in the kdump kernel.

Solutions that might be better:

 a) if there are a finite number of handles and we can predict their 
    values, just delete them all in the kdump kernel before the driver
    loads.
 b) if there are a small & finite number of handles, save their values 
    in a device tree property and have the kdump kernel read them and 
    delete them before the driver loads.
 c) if neither of those work, provide a minimal routine that _only_
    deletes the handles in the crashed kernel.
 d) <something else>

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20071126/5cc72219/attachment.pgp>


More information about the Linuxppc-dev mailing list