Linux kernel: powerpc: KVM guest can trigger host crash on Power8

Nicholas Piggin npiggin at gmail.com
Fri Oct 29 11:41:57 AEDT 2021


Excerpts from John Paul Adrian Glaubitz's message of October 29, 2021 12:05 am:
> Hi Michael!
> 
> On 10/28/21 13:20, John Paul Adrian Glaubitz wrote:
>> It seems I also can no longer reproduce the issue, even when building the most problematic
>> packages and I think we should consider it fixed for now. I will keep monitoring the server,
>> of course, and will let you know in case the problem shows again.
> 
> The host machine is stuck again but I'm not 100% sure what triggered the problem:
> 
> [194817.984249] watchdog: BUG: soft lockup - CPU#80 stuck for 246s! [CPU 2/KVM:1836]
> [194818.012248] watchdog: BUG: soft lockup - CPU#152 stuck for 246s! [CPU 3/KVM:1837]
> [194825.960164] watchdog: BUG: soft lockup - CPU#24 stuck for 246s! [khugepaged:318]
> [194841.983991] watchdog: BUG: soft lockup - CPU#80 stuck for 268s! [CPU 2/KVM:1836]
> [194842.011991] watchdog: BUG: soft lockup - CPU#152 stuck for 268s! [CPU 3/KVM:1837]
> [194849.959906] watchdog: BUG: soft lockup - CPU#24 stuck for 269s! [khugepaged:318]
> [194865.983733] watchdog: BUG: soft lockup - CPU#80 stuck for 291s! [CPU 2/KVM:1836]
> [194866.011733] watchdog: BUG: soft lockup - CPU#152 stuck for 291s! [CPU 3/KVM:1837]
> [194873.959648] watchdog: BUG: soft lockup - CPU#24 stuck for 291s! [khugepaged:318]
> [194889.983475] watchdog: BUG: soft lockup - CPU#80 stuck for 313s! [CPU 2/KVM:1836]
> [194890.011475] watchdog: BUG: soft lockup - CPU#152 stuck for 313s! [CPU 3/KVM:1837]
> [194897.959390] watchdog: BUG: soft lockup - CPU#24 stuck for 313s! [khugepaged:318]
> [194913.983218] watchdog: BUG: soft lockup - CPU#80 stuck for 335s! [CPU 2/KVM:1836]
> [194914.011217] watchdog: BUG: soft lockup - CPU#152 stuck for 335s! [CPU 3/KVM:1837]
> [194921.959133] watchdog: BUG: soft lockup - CPU#24 stuck for 336s! [khugepaged:318]

A soft lockup should mean the CPU is still taking timer interrupts, just
not scheduling. Do you have the hard lockup detector enabled as well? Is
there anything stuck spinning on another CPU?
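
To see at a glance whether the same CPUs and tasks stay stuck from one
report to the next, the watchdog lines can be grouped by CPU. A minimal
Python sketch, assuming the log format quoted above; the names
summarize_soft_lockups and SAMPLE are illustrative only and not part of
any kernel tooling:

```python
import re
from collections import defaultdict

# Matches lines such as:
# [194817.984249] watchdog: BUG: soft lockup - CPU#80 stuck for 246s! [CPU 2/KVM:1836]
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! \[(?P<task>.+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(log_text):
    """Group soft lockup reports by CPU number.

    Returns {cpu: {"task": str, "pid": int, "stuck_secs": [int, ...]}}.
    Lines that do not match the soft lockup format are ignored.
    """
    per_cpu = defaultdict(lambda: {"task": None, "pid": None, "stuck_secs": []})
    for line in log_text.splitlines():
        m = LOCKUP_RE.search(line)
        if not m:
            continue
        entry = per_cpu[int(m.group("cpu"))]
        entry["task"] = m.group("task")
        entry["pid"] = int(m.group("pid"))
        entry["stuck_secs"].append(int(m.group("secs")))
    return dict(per_cpu)


# A few lines copied from the report, used only as sample input.
SAMPLE = """\
[194817.984249] watchdog: BUG: soft lockup - CPU#80 stuck for 246s! [CPU 2/KVM:1836]
[194818.012248] watchdog: BUG: soft lockup - CPU#152 stuck for 246s! [CPU 3/KVM:1837]
[194825.960164] watchdog: BUG: soft lockup - CPU#24 stuck for 246s! [khugepaged:318]
[194841.983991] watchdog: BUG: soft lockup - CPU#80 stuck for 268s! [CPU 2/KVM:1836]
"""

if __name__ == "__main__":
    for cpu, info in sorted(summarize_soft_lockups(SAMPLE).items()):
        print(cpu, info["task"], info["pid"], info["stuck_secs"])
```

A stuck duration that keeps growing for the same CPU and task, as it does
in the quoted log, suggests one continuous lockup rather than repeated
short ones.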

Do you have the full dmesg / kernel log for this boot?

Could you try a sysrq+w to get a trace of blocked tasks?
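
If no console keyboard is handy, sysrq+w can also be requested by writing
'w' to /proc/sysrq-trigger as root; the blocked-task traces then appear in
the kernel log. A rough Python sketch, assuming dmesg is installed on the
host; the helper names below are hypothetical:

```python
import subprocess

SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def request_blocked_task_dump(trigger_path=SYSRQ_TRIGGER):
    """Write 'w' to the sysrq trigger file.

    On a real host this needs root and asks the kernel to dump tasks in
    uninterruptible (blocked) state. trigger_path is a parameter only so
    the function can be tried against an ordinary file.
    """
    with open(trigger_path, "w") as f:
        f.write("w")


def read_kernel_log():
    """Return the current kernel log as text by running dmesg."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Intended use on the affected host, as root:
#   request_blocked_task_dump()
#   print(read_kernel_log())
```

The same result can be had from a shell, so this is only useful if the
output is going to be collected by a script anyway.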

Are you able to shut down the guests and exit qemu normally?

Thanks,
Nick


