Kernel bug in 2.6.23...was: RE: How to debug a hung multi-core system....
Morrison, Tom
tmorrison at empirix.com
Fri May 29 04:46:29 EST 2009
Kumar,
To follow up on our postings from late last week...
(to which I was expecting a response from you, but never got one)...
-----
We (well, mostly a very bright engineer who was very persistent)
have found how the kernel TLB got corrupted.
We tracked down the problem to a programming bug in the DataStorage
exception handler for our kernel (2.6.23). We have looked at newer
kernels, and have noticed that this piece of processing has changed,
but let me explain to you what happened (and the conditions that
caused the problem on our MPC8572E, running SMP)...
If you follow the logic in this version of the kernel, it reads
the SPRN_DEAR into register R10, and then does some operations
(including a tlbsx operation (which uses R10)), and then attempts
to update the associated PTE entry.
Well, if you have REALLY bad luck, sometime between the time you
took this exception and try to update the PTE for this page, the
other core has decided to invalidate this page's PTE. The good
part is the kernel recognizes this unlucky case.
Unfortunately, this 'bad luck' path is where the kernel bug
lives. The kernel reuses R10 for some processing (it puts
the physical address associated with this virtual page there) and
then branches up 'above' the tlbsx operation to try again
...without restoring R10 to the SPRN_DEAR value required by the tlbsx
operation...
This means that even though the kernel recognized this exceptional
case, it NEVER did the right thing. Instead, the kernel would
attempt to modify the TLB entry for whatever unlucky virtual address
happens to match the physical address of the page from the original
DataStorage exception, which is what R10 holds at that point.
The only way we caught this is that we also had a second piece of
'bad luck': that physical address matched the virtual address
of the kernel (0xC0000000). Thus, when the handler loops back to try
again, it gets the kernel page(s) from the tlbsx operation and modifies
the permissions on the kernel pages, causing an InstructionStorage
exception (forever).
We fixed this in our kernel by just restoring R10 to the SPRN_DEAR value
just before it loops back, something like this:
================================
....
mtspr SPRN_MAS1, r13
tlbwe
/* because we did NOT find in PTE */
/* r10 was changed - so we need */
/* to re-load it here to work */
mfspr r10, SPRN_DEAR /* restore the faulting address */
b 5b /* Try again */
....
================================
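For anyone who wants to see the failure without reading the assembly, here is a toy model in Python. It is only a sketch under assumptions: the TLB is a list of dictionaries, `tlbsx` and `handle_fault` are hypothetical helper names, the addresses are made up, and the retry is assumed to find a valid PTE on the second pass. It is not the kernel's actual handler.

```python
# Toy model only: names and values here are hypothetical, not kernel code.

def tlbsx(tlb, ea):
    """Return the index of the TLB entry covering effective address ea, or None."""
    for index, entry in enumerate(tlb):
        if entry["epn"] <= ea < entry["epn"] + entry["size"]:
            return index
    return None


def handle_fault(tlb, dear, clobbered_r10, restore_r10):
    """Walk the 'bad luck' path once and return the TLB index that gets rewritten.

    The first pass finds the PTE invalidated by the other core; the retry
    is assumed to find a valid PTE and update the entry tlbsx selects.
    """
    r10 = dear                      # mfspr r10, SPRN_DEAR at handler entry
    first_pass = True
    while True:
        index = tlbsx(tlb, r10)     # label 5: the search is keyed on r10
        if not first_pass:
            return index            # this entry's permissions get modified
        r10 = clobbered_r10         # r10 reused for the physical address
        if restore_r10:
            r10 = dear              # the fix: reload DEAR before 'b 5b'
        first_pass = False


tlb = [
    {"name": "kernel", "epn": 0xC0000000, "size": 0x10000000},
    {"name": "user page", "epn": 0x10000000, "size": 0x1000},
]
dear = 0x10000ABC            # hypothetical faulting user address
clobbered = 0xC0001000       # hypothetical value that lands in the kernel range

print(tlb[handle_fault(tlb, dear, clobbered, restore_r10=False)]["name"])
print(tlb[handle_fault(tlb, dear, clobbered, restore_r10=True)]["name"])
```

Without the restore, the retry search is keyed on the clobbered value and selects the kernel entry; with the restore, it selects the page that actually faulted.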
That's the short and long of it...and 4 weeks of very stressful
problems...
I am wondering why nobody has found this problem before - are we the
first to be this unlucky? I am not sure that is a good thing!
Comments? Suggestions? What else should I be doing with this
information?
Tom Morrison
Principal Software Engineer
EMPIRIX
20 Crosby Drive - Bedford, MA 01730
p: 781.266.3567 f: 781.266.3670
email: tmorrison at empirix.com
www.empirix.com
>> -----Original Message-----
>> From: Morrison, Tom
>> Sent: Thursday, May 21, 2009 11:24 AM
>> To: Morrison, Tom; Kumar Gala
>> Cc: linuxppc-dev at ozlabs.org; Young, Andrew; Brown, Jeff; Geary Sean-R60898
>> Subject: RE: How to debug a hung multi-core system....
>>
>> Just had a little conference with several co-workers...to go over
>> results
>>
>> We think that LT0 (the one that maps the kernel) has been corrupted:
>>
>> Entry EPN RPN TID TMASK WIMGE TSIZ U0:3 X0:1
>>
>> ---------------------------------------------------------------
>> LT0 C0000000 00000000 00 0FF 04 9 0 0
>>
>> PID TS PROT SHEN UR UW UX SR SW SX TIDZ VAL
>>
>> ---------------------------------------------------------------
>> 0 0 P P E E D E E D D V
>>
>> Is absolutely wrong - this is the TLB entry for the kernel - and as you
>> can see...it does NOT have execute privileges (and in fact the user
>> space HAS execute privileges for this area, the complete opposite of
>> what it should be)...
>>
>> This is why it is stuck AT that instruction (can't even single step
>> from that location)..
>>
>> (one of) The first problem(s) is how can/when did this TLB get
>> corrupted!
>>
>> Tom