[PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
Alexey Kardashevskiy
aik at ozlabs.ru
Fri Apr 7 13:28:49 AEST 2017
On 06/03/17 12:54, Alexey Kardashevskiy wrote:
> On 06/03/17 10:22, Gavin Shan wrote:
>> On Fri, Mar 03, 2017 at 04:59:11PM +1100, Alexey Kardashevskiy wrote:
>>> On 03/03/17 15:47, Russell Currey wrote:
>>>> eeh_handle_special_event() is called when an EEH event is detected but
>>>> can't be narrowed down to a specific PE. This function looks through
>>>> every PE to find one in an erroneous state, then calls the regular event
>>>> handler eeh_handle_normal_event() once it knows which PE has an error.
>>>>
>>>> However, if eeh_handle_normal_event() found that the PE cannot possibly
>>>> be recovered, it will remove the PE and associated devices. This leads
>>>> to a use after free in eeh_handle_special_event() as it attempts to clear
>>>> the "recovering" state on the PE after eeh_handle_normal_event() returns.
>>>>
>>>> Thus, make sure the PE is valid when attempting to clear state in
>>>> eeh_handle_special_event().
>>>>
>>>> Cc: <stable at vger.kernel.org> #3.10+
>>>> Reported-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>>>> Signed-off-by: Russell Currey <ruscur at russell.cc>
>>>> ---
>>>> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
>>>> 1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>>>> index b94887165a10..492397298a2a 100644
>>>> --- a/arch/powerpc/kernel/eeh_driver.c
>>>> +++ b/arch/powerpc/kernel/eeh_driver.c
>>>> @@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
>>>> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
>>>> rc == EEH_NEXT_ERR_FENCED_PHB) {
>>>> eeh_handle_normal_event(pe);
>>>> +
>>>> + /*
>>>> + * eeh_handle_normal_event() can free the PE if it
>>>> + * determines that the PE cannot possibly be recovered.
>>>> + * Make sure the PE still exists before changing its
>>>> + * state.
>>>> + */
>>>> + if (!pe || (pe->type & EEH_PE_INVALID)
>>>> + || (pe->state & EEH_PE_REMOVED)) {
>>>
>>>
>>> The bug is that pe becomes stale after eeh_handle_normal_event() returns,
>>> so dereferencing it afterwards is broken.
>>>
>>
>> Correct, it won't cause a kernel crash, as dereferencing @pe accesses the
>> linear mapped area, whose address is always valid.
>
> Dereferencing pe would not crash but dereferencing any pointer from the
> pnv_ioda_pe struct would (as it would contain random stuff or poison).
>
>
>> I think the proper fix would be to have eeh_handle_normal_event()
>> indicate that the @pe has been released, and not to access it any more
>> after that.
>
> Correct. The problem is that the callstack from my other reply is a bit too
> long to make a trivial patch :)
Any update on this?
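
For reference, the approach suggested above (eeh_handle_normal_event()
telling its caller that the PE is gone) could look roughly like the userspace
sketch below. This is only an illustration under assumptions, not the actual
kernel change: the bool return value, the "recoverable" field, and the
handle_one_pe() helper are all hypothetical, and struct eeh_pe is reduced to
a stand-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for the kernel flag; the value here is only illustrative. */
#define EEH_PE_RECOVERING (1 << 1)

/* Reduced stand-in for the kernel's struct eeh_pe. */
struct eeh_pe {
	int state;
	bool recoverable;	/* hypothetical: drives the mock below */
};

/*
 * Hypothetical variant of eeh_handle_normal_event(): returns true when
 * the PE was released, so the caller knows the pointer is stale.
 */
static bool eeh_handle_normal_event(struct eeh_pe *pe)
{
	assert(pe != NULL);

	if (!pe->recoverable) {
		free(pe);	/* models removal of the PE */
		return true;
	}
	return false;
}

static void eeh_pe_state_clear(struct eeh_pe *pe, int state)
{
	pe->state &= ~state;
}

/*
 * Hypothetical helper modelling the loop body in
 * eeh_handle_special_event(): the PE is touched again only if the
 * handler reports that it still exists. Returns true in that case.
 */
static bool handle_one_pe(struct eeh_pe *pe)
{
	if (eeh_handle_normal_event(pe))
		return false;	/* pe is stale, do not dereference it */

	eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
	return true;
}
```

The point of this shape is that the caller never inspects pe->type or
pe->state to decide whether the PE is still alive, since that check would
itself read freed memory.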
>
>
>
>>>
>>>
>>>> + pr_warn("EEH: not clearing state on bad PE\n");
>>
>> A message like this isn't meaningful, so there is no need to have it.
>> Messages with the "EEH:" prefix are informative messages, and we definitely
>> don't need one here. In any case, the message might not be needed in the
>> next revision.
>>
>>>> + continue;
>>>> + }
>>>> +
>>>> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>>>> } else {
>>>> pci_lock_rescan_remove();
>>>>
>>
>> Thanks,
>> Gavin
>>
>
>
--
Alexey