[PATCH] powerpc/32e: Ignore ESR in instruction storage interrupt handler
Nicholas Piggin
npiggin at gmail.com
Fri Oct 29 11:50:00 AEDT 2021
Excerpts from Daniel Axtens's message of October 29, 2021 8:13 am:
> Hi Nick,
>
>> A e5500 machine running a 32-bit kernel sometimes hangs at boot,
>> seemingly going into an infinite loop of instruction storage interrupts.
>> The ESR SPR has a value of 0x800000 (store) when this happens, which is
>> likely set by a previous store. An instruction TLB miss interrupt would
>> then leave ESR unchanged, and if no PTE exists it calls directly to the
>> instruction storage interrupt handler without changing ESR.
>
> I hadn't heard of the ESR before and your patch just uses the acronym:
> apparently it is the Exception Syndrome Register. In ISA 2.07 it's in
> Section 7.2.13 of Book III-E. (e5500 implements 2.06 but I assume it's
> roughly the same there.)
>
> The ISA says that ESR_ST is set for the following types of exception:
> - Alignment
> - Data Storage
> - Data TLB
> - LRAT Error
>
> So it makes sense to assume that it was not set by the instruction
> storage exception. (I see you have a discussion as to how that might
> occur in
> https://lore.kernel.org/linuxppc-dev/1635413197.83rhc4s3fc.astroid@bobo.none/#t
> and you concluded that in the comment you add that we arrive here via
> the TLB handler jumping to the ISI.)
I think that's most likely. I don't know book E arch very well, *maybe*
you could have an ISI which does not change ESR without violating the
letter of the architecture, but not sure. Either way, this fix should
take care of both.
>
>> access_error() does not cause a segfault due to a store to a read-only
>> vma because is_exec is true. Most subsequent fault handling does not
>> check for a write fault on a read-only vma, and might do strange things
>> like create a writeable PTE or call page_mkwrite on a read only vma or
>> file. It's not clear what happens here to cause the infinite faulting in
>> this case, a fault handler failure or low level PTE or TLB handling.
>
> I'm just trying to unpick this:
>
> - INSTRUCTION_STORAGE_EXCEPTION ends up branching to do_page_fault ->
> __do_page_fault -> ___do_page_fault
>
> - ___do_page_fault has is_exec = true because the exception is a
> instruction storage interrupt
>
> - ___do_page_fault also decides that is_write = true because the
> ESR_DST bit is set.
>
> This puts us in a bad position because we're taking some information
> from the current exception and some information from a previous
> exception, so it makes sense that things would go wrong from here!
Yeah, we miss the store into read-only vma check because is_exec which
causes us to go into write path of the core handle_mm_fault. Although
even if we caught it and segfaulted obviously that's still not the right
thing to do.
>
>> In any case this can be fixed by having the instruction storage
>> interrupt zero regs->dsisr rather than storing the ESR value to it.
>
> Is it valid to just ignore the contents of ESR for an Instruction
> Storage exception?
>
> The 2.07 ISA at least says that the following ESR bits can be set by an
> Instruction Storage exception:
> - Byte Ordering
> - TLB Inelligible
> - Page Table
>
> It sounds from
>
>> + * In any case, do_page_fault for BOOK3E does not use ESR and always expects
>> + * dsisr to be 0. ESR_DST from a prior store in particular would confuse fault
>> + * handling.
>
> that we don't actually ever check any of those three bits in the
> do_page_fault path. reg_booke.h defines ESR_BO but the definition is
> never used, and there is no ESR_TLBI or ESR_PT constant defined.
Yeah that's right. I don't quite know what we would do with the other
exception types. Possibly just check them in page_fault_is_bad and cause
a sigbus. Maybe there is some low level fixup that would be possible.
But nothing is implemented.
> So much as it seems a bit dodgy to me, I guess it is right to say that
> we're not changing behaviour by setting dsisr to 0 and just ignoring
> those 3 bits? Should we document that in the comment?
The comment does say do_page_fault expects it to be 0. Did you have more
in mind? I can tweak the comment.
> I probably would have masked ESR_DST but I guess extra bit-twiddling in
> an interrupt path when zeroing would do is probably not worth it for
> theoretical correctness?
Well the ESR is completely un-set so none of the bits make sense at this
point. Reading it is jsut overhead.
If we wanted to do it really "cleanly", the TLB error interrupt handler
might set ESR before calling here, or it might call a few instructions
later with the register already set to a sane ESR value. But that would
really only matter if do_page_fault started doing anything useful with
the other bits.
Thanks,
Nick
>
> Kind regards,
> Daniel
>
>> Link: https://lore.kernel.org/linuxppc-dev/1635306738.0z8wt7619v.astroid@bobo.none/
>> Fixes: a01a3f2ddbcd ("powerpc: remove arguments from fault handler functions")
>> Reported-by: Jacques de Laval <jacques.delaval at protonmail.com>
>> Tested-by: Jacques de Laval <jacques.delaval at protonmail.com>
>> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
>> ---
>> arch/powerpc/kernel/head_booke.h | 15 ++++++++++++---
>> 1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
>> index e5503420b6c6..ef8d1b1c234e 100644
>> --- a/arch/powerpc/kernel/head_booke.h
>> +++ b/arch/powerpc/kernel/head_booke.h
>> @@ -465,12 +465,21 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
>> bl do_page_fault; \
>> b interrupt_return
>>
>> +/*
>> + * Instruction TLB Error interrupt handlers may call InstructionStorage
>> + * directly without clearing ESR, so the ESR at this point may be left over
>> + * from a prior interrupt.
>> + *
>> + * In any case, do_page_fault for BOOK3E does not use ESR and always expects
>> + * dsisr to be 0. ESR_DST from a prior store in particular would confuse fault
>> + * handling.
>> + */
>> #define INSTRUCTION_STORAGE_EXCEPTION \
>> START_EXCEPTION(InstructionStorage) \
>> - NORMAL_EXCEPTION_PROLOG(0x400, INST_STORAGE); \
>> - mfspr r5,SPRN_ESR; /* Grab the ESR and save it */ \
>> + NORMAL_EXCEPTION_PROLOG(0x400, INST_STORAGE); \
>> + li r5,0; /* Store 0 in regs->esr (dsisr) */ \
>> stw r5,_ESR(r11); \
>> - stw r12, _DEAR(r11); /* Pass SRR0 as arg2 */ \
>> + stw r12, _DEAR(r11); /* Set regs->dear (dar) to SRR0 */ \
>> prepare_transfer_to_handler; \
>> bl do_page_fault; \
>> b interrupt_return
>> --
>> 2.23.0
>
More information about the Linuxppc-dev
mailing list