[PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C

Christophe Leroy christophe.leroy at csgroup.eu
Fri Jan 15 00:28:13 AEDT 2021



On 14/01/2021 at 14:17, Nicholas Piggin wrote:
> Excerpts from Christophe Leroy's message of January 14, 2021 10:25 pm:
>>
>>
>> On 14/01/2021 at 13:09, Nicholas Piggin wrote:
>>> Excerpts from Nicholas Piggin's message of January 14, 2021 1:24 pm:
>>>> Excerpts from Christophe Leroy's message of January 14, 2021 12:12 am:
>>>>>
>>>>>
>>>>> On 13/01/2021 at 08:31, Nicholas Piggin wrote:
>>>>>> The page fault handling still has some complex logic particularly around
>>>>>> hash table handling, in asm. Implement this in C instead.
>>>>>>
>>>>>> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
>>>>>> ---
>>>>>>     arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>>>>>>     arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
>>>>>>     arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
>>>>>>     arch/powerpc/mm/fault.c                       |  46 ++++--
>>>>>>     4 files changed, 107 insertions(+), 148 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>>> index 066b1d34c7bc..60a669379aa0 100644
>>>>>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>>> @@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>>>>>>     #define HPTE_NOHPTE_UPDATE	0x2
>>>>>>     #define HPTE_USE_KERNEL_KEY	0x4
>>>>>>     
>>>>>> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
>>>>>>     extern int __hash_page_4K(unsigned long ea, unsigned long access,
>>>>>>     			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>>>>>>     			  unsigned long flags, int ssize, int subpage_prot);
>>>>>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>>>>>> index 6e53f7638737..bcb5e81d2088 100644
>>>>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>>>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>>>>> @@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>>>>>>      *
>>>>>>      * Handling:
>>>>>>      * - Hash MMU
>>>>>> - *   Go to do_hash_page first to see if the HPT can be filled from an entry in
>>>>>> - *   the Linux page table. Hash faults can hit in kernel mode in a fairly
>>>>>> + *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
>>>>>> + *   Linux page table. Hash faults can hit in kernel mode in a fairly
>>>>>>      *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
>>>>>>      *   "non-bolted" regions, e.g., vmalloc space. However these should always be
>>>>>> - *   backed by Linux page tables.
>>>>>> + *   backed by Linux page table entries.
>>>>>>      *
>>>>>> - *   If none is found, do a Linux page fault. Linux page faults can happen in
>>>>>> - *   kernel mode due to user copy operations of course.
>>>>>> + *   If no entry is found the Linux page fault handler is invoked (by
>>>>>> + *   do_hash_fault). Linux page faults can happen in kernel mode due to user
>>>>>> + *   copy operations of course.
>>>>>>      *
>>>>>>      *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
>>>>>>      *   MMU context, which may cause a DSI in the host, which must go to the
>>>>>> @@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
>>>>>>     	GEN_COMMON data_access
>>>>>>     	ld	r4,_DAR(r1)
>>>>>>     	ld	r5,_DSISR(r1)
>>>>>
>>>>> We have DSISR here. I think the dispatch between the page fault handler and do_break() should be done here:
>>>>> - It would be more similar to other arches
>>>>
>>>> Other sub-archs?
>>>>
>>>>> - Would avoid doing it also in instruction fault
>>>>
>>>> True but it's hidden under an unlikely branch so won't really help
>>>> instruction fault.
>>>>
>>>>> - Would avoid that -1 return which looks more like a hack.
>>>>
>>>> I don't really see it as a hack; we return a code to the asm caller to
>>>> direct whether to restore registers or not, and we already have this
>>>> pattern.
>>>>
>>>> (I'm hoping all that might go away one day by controlling NV
>>>> regs from C if we can get good code generation, but even if not we
>>>> still have it in the interrupt returns).
>>>>
>>>> That said I will give it a try here. At very least it might be a
>>>> better intermediate step.
>>>
>>> Ah yes, this way doesn't work well for later patches because you end
>>> up, e.g., with the do_break call having to call the interrupt handler
>>> wrappers again when they actually expect to be in the asm entry state
>>> (e.g., irq soft-mask state) when called, and return via interrupt_return
>>> after the exit wrapper runs (which 64s uses to implement better context
>>> tracking for example).
>>>
>>> That could possibly be hacked up to deal with multiple interrupt
>>> wrappers per interrupt, but I'd rather not go backwards.
>>>
>>> That does leave the other sub-archs as having this issue, but they don't
>>> do so much in their handlers. 32 doesn't have soft-mask or context
>>> tracking to deal with for example. We will need to fix this up though
>>> and unify things more.
>>>
>>
>> Not sure I understand what you mean exactly.
>>
>> On the 8xx, do_break() is called by totally different exceptions:
>> - Exception 0x1c00 Data breakpoint ==> do_break()
>> - Exception 0x1300 Instruction TLB error ==> handle_page_fault()
>> - Exception 0x1400 Data TLB error ==> handle_page_fault()
>>
>> On book3s/32, we now (after my patch, i.e. patch 1 in your series) have either do_break() or
>> handle_page_fault() being called from very early in ASM.
>>
>> If you do the same in book3s/64, then there is no issue with interrupt wrappers being called twice,
>> is there?
> 
> bad_page_fault is the problem, it has to go afterwards.
> 
> Once we have the changed 64s behaviour of do_page_fault, I don't know if
> there is any point leaving do_break in asm is there? I guess it is neat
> to treat it quite separately, I might need to count fast path branches...
> I have done the split anyway already, so I will post your way first.
> 

As far as I understand, not-taken unlikely branches are essentially free (at least on book3s/32), so 
you would only pay the cost of the logical 'and.' on the value of DSISR that you already have in a 
register. That should be in the noise.

bad_page_fault() is no longer in the fast path, since we now handle the exception fixup at the end 
of do_page_fault(). So I think calling the wrapper again for bad_page_fault() shouldn't be a concern.

Christophe
