[RFC PATCH 3/3] powerpc/lib: Use a temporary mm for code patching

Christophe Leroy christophe.leroy at c-s.fr
Wed Apr 15 18:45:37 AEST 2020



On 15/04/2020 at 07:11, Christopher M Riedl wrote:
>> On March 24, 2020 11:25 AM Christophe Leroy <christophe.leroy at c-s.fr> wrote:
>>
>>   
>> On 23/03/2020 at 05:52, Christopher M. Riedl wrote:
>>> Currently, code patching a STRICT_KERNEL_RWX exposes the temporary
>>> mappings to other CPUs. These mappings should be kept local to the CPU
>>> doing the patching. Use the pre-initialized temporary mm and patching
>>> address for this purpose. Also add a check after patching to ensure the
>>> patch succeeded.
>>>
>>> Based on x86 implementation:
>>>
>>> commit b3fd8e83ada0
>>> ("x86/alternatives: Use temporary mm for text poking")
>>>
>>> Signed-off-by: Christopher M. Riedl <cmr at informatik.wtf>
>>> ---
>>>    arch/powerpc/lib/code-patching.c | 128 ++++++++++++++-----------------
>>>    1 file changed, 57 insertions(+), 71 deletions(-)
>>>
>>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
>>> index 18b88ecfc5a8..f156132e8975 100644
>>> --- a/arch/powerpc/lib/code-patching.c
>>> +++ b/arch/powerpc/lib/code-patching.c
>>> @@ -19,6 +19,7 @@
>>>    #include <asm/page.h>
>>>    #include <asm/code-patching.h>
>>>    #include <asm/setup.h>
>>> +#include <asm/mmu_context.h>
>>>    
>>>    static int __patch_instruction(unsigned int *exec_addr, unsigned int instr,
>>>    			       unsigned int *patch_addr)
>>> @@ -65,99 +66,79 @@ void __init poking_init(void)
>>>    	pte_unmap_unlock(ptep, ptl);
>>>    }
>>>    
>>> -static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>>> -
>>> -static int text_area_cpu_up(unsigned int cpu)
>>> -{
>>> -	struct vm_struct *area;
>>> -
>>> -	area = get_vm_area(PAGE_SIZE, VM_ALLOC);
>>> -	if (!area) {
>>> -		WARN_ONCE(1, "Failed to create text area for cpu %d\n",
>>> -			cpu);
>>> -		return -1;
>>> -	}
>>> -	this_cpu_write(text_poke_area, area);
>>> -
>>> -	return 0;
>>> -}
>>> -
>>> -static int text_area_cpu_down(unsigned int cpu)
>>> -{
>>> -	free_vm_area(this_cpu_read(text_poke_area));
>>> -	return 0;
>>> -}
>>> -
>>> -/*
>>> - * Run as a late init call. This allows all the boot time patching to be done
>>> - * simply by patching the code, and then we're called here prior to
>>> - * mark_rodata_ro(), which happens after all init calls are run. Although
>>> - * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
>>> - * it as being preferable to a kernel that will crash later when someone tries
>>> - * to use patch_instruction().
>>> - */
>>> -static int __init setup_text_poke_area(void)
>>> -{
>>> -	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>>> -		"powerpc/text_poke:online", text_area_cpu_up,
>>> -		text_area_cpu_down));
>>> -
>>> -	return 0;
>>> -}
>>> -late_initcall(setup_text_poke_area);
>>> +struct patch_mapping {
>>> +	spinlock_t *ptl; /* for protecting pte table */
>>> +	struct temp_mm temp_mm;
>>> +};
>>>    
>>>    /*
>>>     * This can be called for kernel text or a module.
>>>     */
>>> -static int map_patch_area(void *addr, unsigned long text_poke_addr)
>>> +static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
>>
>> Why change the name ?
>>
> 
> It's not really an "area" anymore.
> 
>>>    {
>>> -	unsigned long pfn;
>>> -	int err;
>>> +	struct page *page;
>>> +	pte_t pte, *ptep;
>>> +	pgprot_t pgprot;
>>>    
>>>    	if (is_vmalloc_addr(addr))
>>> -		pfn = vmalloc_to_pfn(addr);
>>> +		page = vmalloc_to_page(addr);
>>>    	else
>>> -		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
>>> +		page = virt_to_page(addr);
>>>    
>>> -	err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
>>> +	if (radix_enabled())
>>> +		pgprot = __pgprot(pgprot_val(PAGE_KERNEL));
>>> +	else
>>> +		pgprot = PAGE_SHARED;
>>
>> Can you explain the difference between radix and non radix ?
>>
>> Why PAGE_KERNEL for a page that is mapped in userspace ?
>>
>> Why do you need to do __pgprot(pgprot_val(PAGE_KERNEL)) instead of just
>> using PAGE_KERNEL ?
>>
> 
> On hash there is a manual check in __hash_page which prevents setting
> _PAGE_PRIVILEGED for kernel-to-userspace accesses - hence we cannot access
> the mapping if the page is mapped PAGE_KERNEL on hash. However, I would
> like to use PAGE_KERNEL here as well, and I am working on understanding
> why hash performs this check and whether it can change. On radix this
> works just fine.
> 
> The page is mapped PAGE_KERNEL because, although the address is technically
> a userspace address, it is used only to keep the mapping local to the CPU
> doing the patching. PAGE_KERNEL makes it clear, both in intent and in
> protection bits, that this is a kernel mapping.
> 
> I think the correct way is pgprot_val(PAGE_KERNEL) since PAGE_KERNEL is defined
> as:
> 
> #define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
> 
> and __pgprot() is defined as:
> 
> typedef struct { unsigned long pgprot; } pgprot_t;
> #define pgprot_val(x)   ((x).pgprot)
> #define __pgprot(x)     ((pgprot_t) { (x) })


Yes, so:
	pgprot_val(__pgprot(x)) == x


You do:

	pgprot = __pgprot(pgprot_val(PAGE_KERNEL));

Which is:

	pgprot = __pgprot(pgprot_val(__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)));

Which is equivalent to:

	pgprot = __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW);

So at the end it should simply be:

	pgprot = PAGE_KERNEL;




Christophe
