loop nesting in alignment exception and machine check

Wangshaobo (bobo) bobo.shaobowang at huawei.com
Sat Oct 26 18:23:22 AEDT 2019


Hi,
I have run into a problem where an alignment exception raised inside the machine check handler leads to nested, looping exceptions. The background is as follows:

problem:
machine check or critical interrupt -> ... -> kbox_write [records the last words] -> memcpy(ioremap_addr, src, size): _GLOBAL(memcpy) ...
On entering memcpy, the instruction 'dcbz r11,r6' causes an alignment exception. Here r11 holds the ioremap address, which is what leads to the alignment exception.
The instruction then cannot complete successfully because we are still in machine check context. In the end, a new machine check is triggered inside the exception handler function, and the nested loop begins.
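
As an illustration of a store pattern that avoids dcbz, here is a minimal user-space sketch of a copy routine that uses only ordinary byte and word stores. The name copy_to_uncached is hypothetical and this is not the kbox code. It assumes the destination is ioremap'ed memory mapped caching-inhibited, which is typically why dcbz raises an alignment exception there. In the kernel, memcpy_toio() is the existing interface intended for such destinations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not kbox code):
 * copy n bytes to a destination that may be caching-inhibited, using only
 * plain stores. The volatile accesses keep the compiler from turning the
 * store loops into a call to the optimized memcpy(), which is where a
 * cache-block-zero instruction such as dcbz would come from.
 */
static void copy_to_uncached(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    assert(dst != NULL && src != NULL);

    /* Byte stores until the destination is 4-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Aligned 32-bit stores for the bulk of the data. */
    while (n >= 4) {
        uint32_t w;
        memcpy(&w, s, sizeof(w));   /* the source may be unaligned */
        *(volatile uint32_t *)d = w;
        d += 4;
        s += 4;
        n -= 4;
    }

    /* Remaining tail bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

A caller would use it in place of memcpy(), for example copy_to_uncached(ioremap_addr, src, size). It is slower than the optimized memcpy, which is usually acceptable on a last-words path.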

analysis:
We have analysed this at length but still cannot come up with a reasonable explanation. Normally, an alignment exception triggered in machine check context can still be recorded in the kbox
after the alignment exception has been dealt with by its handler function. But how can a machine check be triggered inside the handler function at all? We printed the relevant registers
on first entering the machine check handler and the alignment exception handler, as follows:
                        machine check    alignment exception
         MSR:           0x2              0x0
         SRR1:          0x2              0x21002
         However, the manual says SRR1 should be set to the MSR value (0x2). Why did this happen?
         [attached image: image001.jpg]
         Then a code path in the handler function copies SRR1 to MSR, which enables MSR[ME] and MSR[CE], and the system crashes.
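
To make the effect of that copy concrete, here is a small sketch that works out which of the two enable bits get switched back on when a saved SRR1 value is written to MSR. The helper name bits_reenabled is hypothetical. The masks are an assumption: they use the common 32-bit layout in which CE is 0x20000 and ME is 0x1000, so they should be checked against the manual for the actual core.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit masks; verify against the core's reference manual. */
#define MSR_CE 0x00020000u  /* critical interrupt enable */
#define MSR_ME 0x00001000u  /* machine check enable */

static_assert((MSR_CE & MSR_ME) == 0, "CE and ME masks must not overlap");

/*
 * Hypothetical helper: return the CE/ME bits that are clear in the
 * current MSR but would become set if saved_srr1 were copied into MSR.
 */
static uint32_t bits_reenabled(uint32_t current_msr, uint32_t saved_srr1)
{
    return saved_srr1 & ~current_msr & (MSR_CE | MSR_ME);
}
```

With the values printed above, bits_reenabled(0x0, 0x21002) gives 0x21000, that is both CE and ME, whereas the expected SRR1 value of 0x2 would re-enable neither.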

Questions:
         1)  Why can the alignment exception not be handled while in machine check context?
         2)  Besides memcpy, can any other function cause the alignment exception?

We can still reproduce it. The call path is as follows:
         CPU deadlock -> watchdog -> trigger FIQ -> kbox_write -> memcpy -> alignment exception -> print last words.
         But in this case, as shown below, what the kbox recorded is empty.
------------------kbox restart:[   10.147594]----------------
kbox verify fs magic fail
kbox mem mabye destroyed, format it
kbox: load OK
lock-task: major[249] minor[0]
-----start show_destroyed_kbox_mem_head----
00000000: 00000000 00000000 00000000 00000000  ................
00000010: 00000000 00000000 00000000 00000000  ................
00000020: 00000000 00000000 00000000 00000000  ................
00000030: 00000000 00000000 00000000 00000000  ................
00000040: 00000000 00000000 00000000 00000000  ................
00000050: 00000000 00000000 00000000 00000000  ................
00000060: 00000000 00000000 00000000 00000000  ................
00000070: 00000000 00000000 00000000 00000000  ................
00000080: 00000000 00000000 00000000 00000000  ................
00000090: 00000000 00000000 00000000 00000000  ................

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 11935 bytes
Desc: image001.jpg
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20191026/f414e61e/attachment-0001.jpg>

