[RFC PATCH 3/3] powerpc/lib: Use a temporary mm for code patching
Christophe Leroy
christophe.leroy at c-s.fr
Wed Apr 15 18:45:37 AEST 2020
On 15/04/2020 at 07:11, Christopher M Riedl wrote:
>> On March 24, 2020 11:25 AM Christophe Leroy <christophe.leroy at c-s.fr> wrote:
>>
>>
>> On 23/03/2020 at 05:52, Christopher M. Riedl wrote:
>>> Currently, code patching with STRICT_KERNEL_RWX exposes the temporary
>>> mappings to other CPUs. These mappings should be kept local to the CPU
>>> doing the patching. Use the pre-initialized temporary mm and patching
>>> address for this purpose. Also add a check after patching to ensure the
>>> patch succeeded.
>>>
>>> Based on x86 implementation:
>>>
>>> commit b3fd8e83ada0
>>> ("x86/alternatives: Use temporary mm for text poking")
>>>
>>> Signed-off-by: Christopher M. Riedl <cmr at informatik.wtf>
>>> ---
>>> arch/powerpc/lib/code-patching.c | 128 ++++++++++++++-----------------
>>> 1 file changed, 57 insertions(+), 71 deletions(-)
>>>
>>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
>>> index 18b88ecfc5a8..f156132e8975 100644
>>> --- a/arch/powerpc/lib/code-patching.c
>>> +++ b/arch/powerpc/lib/code-patching.c
>>> @@ -19,6 +19,7 @@
>>> #include <asm/page.h>
>>> #include <asm/code-patching.h>
>>> #include <asm/setup.h>
>>> +#include <asm/mmu_context.h>
>>>
>>> static int __patch_instruction(unsigned int *exec_addr, unsigned int instr,
>>> unsigned int *patch_addr)
>>> @@ -65,99 +66,79 @@ void __init poking_init(void)
>>> pte_unmap_unlock(ptep, ptl);
>>> }
>>>
>>> -static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>>> -
>>> -static int text_area_cpu_up(unsigned int cpu)
>>> -{
>>> - struct vm_struct *area;
>>> -
>>> - area = get_vm_area(PAGE_SIZE, VM_ALLOC);
>>> - if (!area) {
>>> - WARN_ONCE(1, "Failed to create text area for cpu %d\n",
>>> - cpu);
>>> - return -1;
>>> - }
>>> - this_cpu_write(text_poke_area, area);
>>> -
>>> - return 0;
>>> -}
>>> -
>>> -static int text_area_cpu_down(unsigned int cpu)
>>> -{
>>> - free_vm_area(this_cpu_read(text_poke_area));
>>> - return 0;
>>> -}
>>> -
>>> -/*
>>> - * Run as a late init call. This allows all the boot time patching to be done
>>> - * simply by patching the code, and then we're called here prior to
>>> - * mark_rodata_ro(), which happens after all init calls are run. Although
>>> - * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
>>> - * it as being preferable to a kernel that will crash later when someone tries
>>> - * to use patch_instruction().
>>> - */
>>> -static int __init setup_text_poke_area(void)
>>> -{
>>> - BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>>> - "powerpc/text_poke:online", text_area_cpu_up,
>>> - text_area_cpu_down));
>>> -
>>> - return 0;
>>> -}
>>> -late_initcall(setup_text_poke_area);
>>> +struct patch_mapping {
>>> + spinlock_t *ptl; /* for protecting pte table */
>>> + struct temp_mm temp_mm;
>>> +};
>>>
>>> /*
>>> * This can be called for kernel text or a module.
>>> */
>>> -static int map_patch_area(void *addr, unsigned long text_poke_addr)
>>> +static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
>>
>> Why change the name?
>>
>
> It's not really an "area" anymore.
>
>>> {
>>> - unsigned long pfn;
>>> - int err;
>>> + struct page *page;
>>> + pte_t pte, *ptep;
>>> + pgprot_t pgprot;
>>>
>>> if (is_vmalloc_addr(addr))
>>> - pfn = vmalloc_to_pfn(addr);
>>> + page = vmalloc_to_page(addr);
>>> else
>>> - pfn = __pa_symbol(addr) >> PAGE_SHIFT;
>>> + page = virt_to_page(addr);
>>>
>>> - err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
>>> + if (radix_enabled())
>>> + pgprot = __pgprot(pgprot_val(PAGE_KERNEL));
>>> + else
>>> + pgprot = PAGE_SHARED;
>>
>> Can you explain the difference between radix and non-radix?
>>
>> Why PAGE_KERNEL for a page that is mapped in userspace?
>>
>> Why do you need to do __pgprot(pgprot_val(PAGE_KERNEL)) instead of just
>> using PAGE_KERNEL?
>>
>
> On hash, there is a manual check in __hash_page which prevents setting
> _PAGE_PRIVILEGED for kernel-to-userspace access - hence we cannot access
> the mapping if the page is mapped PAGE_KERNEL on hash. However, I would
> like to use PAGE_KERNEL here as well and am working on understanding why
> hash performs this check and whether it can change. On radix this works
> just fine.
>
> The page is mapped PAGE_KERNEL because the address is technically a userspace
> address - but only to keep the mapping local to this CPU doing the patching.
> PAGE_KERNEL makes it clear both in intent and protection that this is a kernel
> mapping.
>
> I think the correct way is pgprot_val(PAGE_KERNEL) since PAGE_KERNEL is defined
> as:
>
> #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
>
> and __pgprot() is defined as:
>
> typedef struct { unsigned long pgprot; } pgprot_t;
> #define pgprot_val(x) ((x).pgprot)
> #define __pgprot(x) ((pgprot_t) { (x) })
Yes, so:
pgprot_val(__pgprot(x)) == x
You do:
pgprot = __pgprot(pgprot_val(PAGE_KERNEL));
Which is:
pgprot = __pgprot(pgprot_val(__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)));
Which is equivalent to:
pgprot = __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW);
So at the end it should simply be:
pgprot = PAGE_KERNEL;
Christophe