ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
Eran Liberty
liberty at extricom.com
Thu Aug 21 00:02:28 EST 2008
Steven Rostedt wrote:
> On Wed, 20 Aug 2008, Eran Liberty wrote:
>
>
>> Steven Rostedt wrote:
>>
>>> On Wed, 20 Aug 2008, Steven Rostedt wrote:
>>>
>>>
>>>
>>>> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
>>>>
>>>>
>>>>
>>>>> Found the problem (or at least -a- problem), it's a gcc bug.
>>>>>
>>>>> Well, first I must say the code generated by -pg is just plain
>>>>> horrible :-)
>>>>>
>>>>> Apart from that, look at the exit of, for example, __d_lookup, as
>>>>> generated by gcc when ftrace is enabled:
>>>>>
>>>>> c00c0498: 38 60 00 00 li r3,0
>>>>> c00c049c: 81 61 00 00 lwz r11,0(r1)
>>>>> c00c04a0: 80 0b 00 04 lwz r0,4(r11)
>>>>> c00c04a4: 7d 61 5b 78 mr r1,r11
>>>>> c00c04a8: bb 0b ff e0 lmw r24,-32(r11)
>>>>> c00c04ac: 7c 08 03 a6 mtlr r0
>>>>> c00c04b0: 4e 80 00 20 blr
>>>>>
>>>>> As you can see, it restores r1 -before- it pops r24..r31 off
>>>>> the stack! I'll let you imagine what happens if an interrupt arrives
>>>>> just in between those two instructions (mr and lmw). We don't do
>>>>> redzones on our ABI, so basically, the saved registers get overwritten
>>>>> by the interrupt before they are reloaded.
>>>>>
>>>>>
>>>> Ouch! You've disassembled this without -pg too, and it does not have this
>>>> bug? What version of gcc do you have?
>>>>
>>>>
>>>>
>>> I have:
>>> gcc (Debian 4.3.1-2) 4.3.1
>>>
>>> c00c64c8: 81 61 00 00 lwz r11,0(r1)
>>> c00c64cc: 7f 83 e3 78 mr r3,r28
>>> c00c64d0: 80 0b 00 04 lwz r0,4(r11)
>>> c00c64d4: ba eb ff dc lmw r23,-36(r11)
>>> c00c64d8: 7d 61 5b 78 mr r1,r11
>>> c00c64dc: 7c 08 03 a6 mtlr r0
>>> c00c64e0: 4e 80 00 20 blr
>>>
>>>
>>> My version looks fine. I'm thinking that this is a separate issue from what
>>> Eran is seeing.
>>>
>>> Eran, can you do an "objdump -dr vmlinux", search for __d_lookup, and
>>> print out the end of the function dump?
>>>
>>> Thanks,
>>>
>>> -- Steve
>>>
>>>
>>>
>>>
>>>
>> powerpc-linux-gnu-objdump -dr --start-address=0xc00bb584 vmlinux | head -n 100
>>
>> vmlinux: file format elf32-powerpc
>>
>> Disassembly of section .text:
>>
>> c00bb584 <__d_lookup>:
>>
>
> [...]
>
>
>> c00bb670: 41 9e 00 50 beq- cr7,c00bb6c0 <__d_lookup+0x13c>
>> c00bb674: 83 de 00 00 lwz r30,0(r30)
>> c00bb678: 2f 9e 00 00 cmpwi cr7,r30,0
>> c00bb67c: 40 9e ff 98 bne+ cr7,c00bb614 <__d_lookup+0x90>
>> c00bb680: 38 60 00 00 li r3,0
>> c00bb684: 81 61 00 00 lwz r11,0(r1)
>> c00bb688: 80 0b 00 04 lwz r0,4(r11)
>> c00bb68c: 7d 61 5b 78 mr r1,r11
>>
>
> [ BUG HERE IF INTERRUPT HAPPENS ]
>
>
>> c00bb690: bb 0b ff e0 lmw r24,-32(r11)
>> c00bb694: 7c 08 03 a6 mtlr r0
>> c00bb698: 4e 80 00 20 blr
>>
>
> Yep, you have the same bug in your compiler.
>
> -- Steve
>
Hmm... so what now?
Is there a way to prove this scenario is indeed the one that caused the
oops?
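
The epilogue pattern marked above can at least be searched for
mechanically. What follows is a minimal sketch in Python, assuming the
disassembly comes from "objdump -dr vmlinux" in the layout quoted in this
thread. The names parse and find_early_sp_restore are hypothetical, and
the check is only a heuristic: it shows whether the compiler emitted the
risky sequence (r1 restored, then a load from below the new r1), not that
this sequence caused a particular oops.

```python
import re

# One objdump line: address, four opcode bytes, mnemonic, operands.
INSN = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s+){4})(\S+)\s*(.*)$")
# A load with a negative displacement, e.g. "r24,-32(r11)".
NEG_LOAD = re.compile(r"r\d+,-\d+\((r\d+)\)")
LOADS = ("lmw", "lwz")


def parse(text):
    """Return a list of (address, mnemonic, operands) tuples."""
    insns = []
    for line in text.splitlines():
        m = INSN.match(line)
        if m:
            insns.append((int(m.group(1), 16), m.group(3), m.group(4).strip()))
    return insns


def find_early_sp_restore(insns):
    """Find 'mr r1,rX' followed by a load from below rX (or r1).

    Heuristic assumption: the scan stops at the first mnemonic starting
    with 'b', which is treated as a branch that ends the epilogue.
    """
    hits = []
    for i, (addr, op, args) in enumerate(insns):
        m = re.fullmatch(r"r1,(r\d+)", args) if op == "mr" else None
        if not m:
            continue
        base = m.group(1)
        for addr2, op2, args2 in insns[i + 1:]:
            if op2.startswith("b"):
                break
            lm = NEG_LOAD.fullmatch(args2)
            if op2 in LOADS and lm and lm.group(1) in (base, "r1"):
                hits.append((addr, addr2))
                break
    return hits


# Epilogue of __d_lookup as quoted in this thread (gcc showing the bug).
BUGGY = """
c00bb680: 38 60 00 00 li r3,0
c00bb684: 81 61 00 00 lwz r11,0(r1)
c00bb688: 80 0b 00 04 lwz r0,4(r11)
c00bb68c: 7d 61 5b 78 mr r1,r11
c00bb690: bb 0b ff e0 lmw r24,-32(r11)
c00bb694: 7c 08 03 a6 mtlr r0
c00bb698: 4e 80 00 20 blr
"""

# Epilogue quoted from the gcc 4.3.1 build, which restores registers first.
GOOD = """
c00c64c8: 81 61 00 00 lwz r11,0(r1)
c00c64cc: 7f 83 e3 78 mr r3,r28
c00c64d0: 80 0b 00 04 lwz r0,4(r11)
c00c64d4: ba eb ff dc lmw r23,-36(r11)
c00c64d8: 7d 61 5b 78 mr r1,r11
c00c64dc: 7c 08 03 a6 mtlr r0
c00c64e0: 4e 80 00 20 blr
"""

for name, dump in (("buggy", BUGGY), ("good", GOOD)):
    pairs = find_early_sp_restore(parse(dump))
    print(name, [(hex(a), hex(b)) for a, b in pairs])
```

Run over a full vmlinux dump, a non-empty result would list every
function whose epilogue has the same window between mr and lmw.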
-- Liberty