[OpenPower-Firmware] Generate a dump of the Linux kernel on host OS (P8)

Nicholas Piggin npiggin at gmail.com
Wed Feb 20 23:19:18 AEDT 2019


Artem Senichev wrote on February 20, 2019 9:02 pm:
> On Tue, Feb 19, 2019 at 11:47:43PM +1000, Nicholas Piggin wrote:
>> Artem Senichev wrote on February 19, 2019 9:22 pm:
>> > On Fri, Apr 13, 2018 at 01:56:17PM +1000, Nicholas Piggin wrote:
>> >> > Artem Senichev <artemsen at gmail.com> writes:
>> >> > > I need the ability to generate a dump of the Linux kernel on host OS
>> >> > > using a command from BMC.
>> >> 
>> >> The dump will be initiated when we get a crash or sreset. We can kick
>> >> off a dump without using sreset. The benefit of sreset is that it can
>> >> be generated from the BMC, and that the host CPUs can't block it if they
>> >> have crashed with interrupts off.
>> >> 
>> >> My thought is that we could use libpdbg to send the sreset to the host.
>> >> If we could get ipmi wired up to use that for the nmi command, it should
>> >> work.
>> >> 
>> >> We have just been talking about this a bit more. Ramming is a bit
>> >> complex and has some restrictions. On P8 we can actually send a sreset,
>> >> but the SRR1 register may end up being incorrect. This means we can not
>> >> return from the interrupt and continue, but we should be able to go on
>> >> to take a crash dump and restart the machine.
>> >> 
>> >> Most of the P8 code is already there in skiboot to do this for fast
>> >> reboot as an IPI with OPAL_SIGNAL_SYSTEM_RESET (core/direct-controls.c),
>> >> and pdbg on the BMC has the sreset command.
>> > 
>> > Yes, in fact we don't need any patches for skiboot to get the NMI/SRESET
>> > functionality. Existing code works fine in most cases and handles
>> > SRESET signal correctly.
>> > 
>> > The entire solution includes only one patch for PDBG, that allows us to
>> > send SRESET signal from OpenBMC console:
>> > http://patchwork.ozlabs.org/patch/1038525/
>> > 
>> > The only problem I have is the case when I load the CPU thread that should
>> > handle the SRESET signal. If I understand correctly, we should send SRESET
>> > to only one thread on the host's CPU.
>> 
>> Linux can deal with one or more threads taking sreset. You should sreset
>> all, because if Linux does not see all threads getting sreset, it will 
>> use IPIs to bring the remaining threads in. If you are going to use P8
>> with no skiboot patch, then Linux will have no NMI IPI.
>> 
> 
> I tried to send SRESET to all threads (with '-a' option of pdbg),
> in this case I get a lot of kernel messages about system reset, one message
> per logical CPU:
> 
> cpu 0x47: Vector: 100 (System Reset) at [c000003fcac4fbd8]
> ...
> 
> but it stops working after that; the kernel just hangs. Also, the last
> message says that the last CPU that received sreset is 71 (0x47),
> but I have 256 logical CPUs in the system.

Okay. It's not supposed to hang, of course; guest kernels under a hypervisor
(PowerVM or KVM) get a 0x100 interrupt on every CPU when the HV sends a
crash or NMI signal.

Is this happening with an upstream kernel? Not running KVM?
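
To see how many CPUs actually reported the 0x100 interrupt before the hang,
something like the following could be run over a captured console log. It is
only a sketch: it assumes every CPU prints a line in the format quoted above,
and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: count the distinct CPUs that logged a 0x100 System Reset.
# Reads console output on stdin. Assumes lines of the form
#   cpu 0x47: Vector: 100 (System Reset) at [...]
# count_sreset_cpus is a hypothetical helper name, not an existing tool.

count_sreset_cpus() {
    grep 'Vector: 100 (System Reset)' |
        sed -n 's/.*cpu \(0x[0-9a-fA-F]*\):.*/\1/p' |   # keep only the CPU id
        sort -u |                                        # one entry per CPU
        wc -l | tr -d ' '                                # count, strip padding
}
```

For example, `count_sreset_cpus < host-console.log` (the file name is
hypothetical). The result can then be compared with the number of logical
CPUs in the system.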

> 
>> > Steps to reproduce:
>> > 1. On the host side, run `stress` pinned to the first thread of CPU0:
>> >    # taskset 01 stress -c 1
>> > 2. From OpenBMC, send the SRESET signal to the host's first thread:
>> >    # pdbg --backend=i2c --device=/dev/i2c-4 -p 0 -c 1 -t 0 sreset
>> > In this scenario the SRESET signal is ignored, and there are no messages
>> > in the OPAL or kernel logs. I can stop `stress` with Ctrl-C and the system
>> > continues to work as usual. After that, I can resend SRESET and everything
>> > works as expected: the kernel runs its 'System Reset' signal handler and
>> > initiates a kernel reload to create the memory dump.
>> 
>> You may need to stop the thread first with pdbg. I think P9 requires
>> that. Some documentation indicates it works without stopping first,
>> but I don't think that's the case. P8 may be similar.
>> 
>> The stop sequence in pdbg for P8 does not exactly match the workbook
>> either, by the looks of it. It doesn't check for maint mode, it does some
>> funny thing for RAM mode at the end, etc. If it does not work
>> properly for sreset then it would be worth experimenting with that
>> (I would try taking out the last bit of code from p8_thread_stop() that
>> sets the thread active).
>> 
> 
> Nick, what do you mean by "stop the thread"?
> Is it something like what Alister suggested in the patch
> "core/fast-reboot.c: Add sreset opal call":
> https://patchwork.ozlabs.org/patch/694794/
> That is, by ramming an instruction sequence into an active thread?

No, I meant stop with pdbg.
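
As a rough sketch, the sequence I have in mind looks like this. The backend,
device and thread address are copied from the reproduction steps quoted
earlier; the `run` helper and the PDBG_DRY_RUN variable are made up for this
example and are not part of pdbg. By default it only prints the commands.
Whether sreset is honoured on P8 after pdbg's stop is the open question, so
treat this as an experiment, not a recipe.

```shell
#!/bin/sh
# Sketch only: stop a host thread with pdbg, then send it an sreset.
# PDBG_DRY_RUN and run() are hypothetical helpers for this example.
# With PDBG_DRY_RUN unset or 1, commands are printed instead of executed.

PDBG="pdbg --backend=i2c --device=/dev/i2c-4"   # from the steps in this thread
TARGET="-p 0 -c 1 -t 0"                         # processor 0, core 1, thread 0

run() {
    if [ "${PDBG_DRY_RUN:-1}" = "1" ]; then
        echo "$*"       # dry run: show what would be executed
    else
        "$@"
    fi
}

stop_then_sreset() {
    # $PDBG and $TARGET are left unquoted on purpose so they split into words.
    run $PDBG $TARGET threadstatus   # record the thread state beforehand
    run $PDBG $TARGET stop           # quiesce the thread
    run $PDBG $TARGET sreset         # then deliver the system reset
}

stop_then_sreset
```

Set PDBG_DRY_RUN=0 on the BMC to run the commands for real.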

> Because if I stop the thread from pdbg (with the 'stop' command), the SRESET
> signal isn't handled by the host; it has the same effect as using `stress`.

I'm not sure what stress is. Does nothing appear to happen? It could be
due to the ramming step in the P8 stop sequence in pdbg.

Thanks,
Nick

