// a kdump hang caused by PPC pci patch series

Cédric Le Goater clg at kaod.org
Mon Nov 21 23:57:16 AEDT 2022


On 11/21/22 12:57, Pingfan Liu wrote:
> Sorry, I forgot to add a subject.
> 
> On Mon, Nov 21, 2022 at 7:54 PM Pingfan Liu <kernelfans at gmail.com> wrote:
>>
>> Hello Powerpc folks,
>>
>> I encountered a kdump bug, which I bisected to commit 174db9e7f775
>> ("powerpc/pseries/pci: Add support of MSI domains to PHB hotplug").
>> With Fedora 36 as the host, the mentioned commit as the guest
>> kernel, and a virtio-block disk, the kdump kernel hangs:

The host kernel should be using the PowerNV platform and not pseries.
Or are you running a nested L2 guest on a KVM/pseries L1?

As far as I remember, the patch above only impacts the IBM PowerVM
hypervisor, not KVM, and only PHB hotplug, unless kdump induces some
hot-plugging I am not aware of.

Also, if this is indeed an L2 guest, the XIVE interrupt controller is
emulated in QEMU, and "info pic" should return:

   ...
   irqchip: emulated

>>
>> [    0.000000] Kernel command line: elfcorehdr=0x22c00000
>> no_timer_check net.ifnames=0 console=tty0 console=hvc0,115200n8
>> irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory
>>       numa=off udev.children-max=2 ehea.use_mcs=0 panic=10
>> kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd
>> hugetlb_cma=0
>>      ...
>>      [    7.763260] virtio_blk virtio2: 32/0/0 default/read/poll queues
>>      [    7.771391] virtio_blk virtio2: [vda] 20971520 512-byte logical
>> blocks (10.7 GB/10.0 GiB)
>>      [   68.398234] systemd-udevd[187]: virtio2: Worker [190]
>> processing SEQNUM=1193 is taking a long time
>>      [  188.398258] systemd-udevd[187]: virtio2: Worker [190]
>> processing SEQNUM=1193 killed
>>
>>
>> During my tests, I found that in very rare cases kdump can succeed
>> (I guess it may be due to the cpu id).  With either maxcpus=2 or a
>> scsi-disk, kdump also succeeds.  Before the mentioned commit, kdump
>> also succeeds.
>>
>> The attachment contains the xml to reproduce that bug.
>>
>> Do you have any ideas?

This is most certainly an interrupt not being delivered. You can check
the status on the host with:

   virsh qemu-monitor-command --hmp <domain>  "info pic"
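
The check above can be scripted. What follows is only a minimal sketch:
the function names and the VIRSH override variable are hypothetical
helpers, not part of virsh or QEMU. It looks for the "irqchip: emulated"
line and compares two "info pic" snapshots, on the assumption that
identical snapshots while the guest is stuck suggest no interrupt state
is moving.

```shell
#!/bin/sh
# VIRSH is a hypothetical override hook so the helpers can be exercised
# without libvirt; by default the real virsh binary is used.

# Print the "info pic" output for one libvirt domain ($1).
info_pic() {
    "${VIRSH:-virsh}" qemu-monitor-command --hmp "$1" "info pic"
}

# Succeed if the output reports an emulated irqchip.
irqchip_is_emulated() {
    info_pic "$1" | grep -q 'irqchip: emulated'
}

# Succeed if two snapshots taken $2 seconds apart (default 5) differ.
# Returns 2 if the monitor command itself fails.
pic_state_changes() {
    before=$(info_pic "$1") || return 2
    sleep "${2:-5}"
    after=$(info_pic "$1") || return 2
    [ "$before" != "$after" ]
}
```

For example, "pic_state_changes <domain> 10 || echo stuck" on the host
while the kdump kernel is waiting on virtio2 would flag a controller
state that did not change over ten seconds.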



Thanks,

C.

