[RFC PATCH 2/3] powerpc/lib: Initialize a temporary mm for code patching

Christophe Leroy christophe.leroy at c-s.fr
Wed Apr 8 21:01:02 AEST 2020



On 31/03/2020 at 05:19, Christopher M Riedl wrote:
>> On March 24, 2020 11:10 AM Christophe Leroy <christophe.leroy at c-s.fr> wrote:
>>
>>   
>> On 23/03/2020 at 05:52, Christopher M. Riedl wrote:
>>> When code patching a STRICT_KERNEL_RWX kernel the page containing the
>>> address to be patched is temporarily mapped with permissive memory
>>> protections. Currently, a per-cpu vmalloc patch area is used for this
>>> purpose. While the patch area is per-cpu, the temporary page mapping is
>>> inserted into the kernel page tables for the duration of the patching.
>>> The mapping is exposed to CPUs other than the patching CPU - this is
>>> undesirable from a hardening perspective.
>>>
>>> Use the `poking_init` init hook to prepare a temporary mm and patching
>>> address. Initialize the temporary mm by copying the init mm. Choose a
>>> randomized patching address inside the temporary mm userspace address
>>> portion. The next patch uses the temporary mm and patching address for
>>> code patching.
>>>
>>> Based on x86 implementation:
>>>
>>> commit 4fc19708b165
>>> ("x86/alternatives: Initialize temporary mm for patching")
>>>
>>> Signed-off-by: Christopher M. Riedl <cmr at informatik.wtf>
>>> ---
>>>    arch/powerpc/lib/code-patching.c | 26 ++++++++++++++++++++++++++
>>>    1 file changed, 26 insertions(+)
>>>
>>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
>>> index 3345f039a876..18b88ecfc5a8 100644
>>> --- a/arch/powerpc/lib/code-patching.c
>>> +++ b/arch/powerpc/lib/code-patching.c
>>> @@ -11,6 +11,8 @@
>>>    #include <linux/cpuhotplug.h>
>>>    #include <linux/slab.h>
>>>    #include <linux/uaccess.h>
>>> +#include <linux/sched/task.h>
>>> +#include <linux/random.h>
>>>    
>>>    #include <asm/pgtable.h>
>>>    #include <asm/tlbflush.h>
>>> @@ -39,6 +41,30 @@ int raw_patch_instruction(unsigned int *addr, unsigned int instr)
>>>    }
>>>    
>>>    #ifdef CONFIG_STRICT_KERNEL_RWX
>>> +
>>> +__ro_after_init struct mm_struct *patching_mm;
>>> +__ro_after_init unsigned long patching_addr;
>>
>> Can we make those static?
>>
> 
> Yes, makes sense to me.
> 
>>> +
>>> +void __init poking_init(void)
>>> +{
>>> +	spinlock_t *ptl; /* for protecting pte table */
>>> +	pte_t *ptep;
>>> +
>>> +	patching_mm = copy_init_mm();
>>> +	BUG_ON(!patching_mm);
>>
>> Does it need to be a BUG_ON()? Can't we fail gracefully with just a
>> WARN_ON()?
>>
> 
> I'm not sure what failing gracefully means here? The main reason this could
> fail is if there is not enough memory to allocate the patching_mm. The
> previous implementation had this justification for BUG_ON():

But the system can continue running just fine after this failure.
Only the things that make use of code patching will fail (ftrace, kgdb, ...).

Checkpatch says: "Avoid crashing the kernel - try using WARN_ON & 
recovery code rather than BUG() or BUG_ON()"

All vital code patching has already been done previously, so I think a 
WARN_ON() should be enough, plus returning non-zero to indicate that the 
late_initcall failed.


> 
> /*
>   * Run as a late init call. This allows all the boot time patching to be done
>   * simply by patching the code, and then we're called here prior to
>   * mark_rodata_ro(), which happens after all init calls are run. Although
>   * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
>   * it as being preferable to a kernel that will crash later when someone tries
>   * to use patch_instruction().
>   */
> static int __init setup_text_poke_area(void)
> {
>          BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>                  "powerpc/text_poke:online", text_area_cpu_up,
>                  text_area_cpu_down));
> 
>          return 0;
> }
> late_initcall(setup_text_poke_area);
> 
> I think the BUG_ON() is appropriate even if only to adhere to the previous
> judgement call. I can add a similar comment explaining the reasoning if
> that helps.
> 
>>> +
>>> +	/*
>>> +	 * In hash we cannot go above DEFAULT_MAP_WINDOW easily.
>>> +	 * XXX: Do we want additional bits of entropy for radix?
>>> +	 */
>>> +	patching_addr = (get_random_long() & PAGE_MASK) %
>>> +		(DEFAULT_MAP_WINDOW - PAGE_SIZE);
>>> +
>>> +	ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
>>> +	BUG_ON(!ptep);
>>
>> Same here, can we fail gracefully instead?
>>
> 
> Same reasoning as above.

Here as well, a WARN_ON() should be enough; the system will continue 
running after that.
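
Something like the following (completely untested sketch; whether the 
non-zero return is actually consumed depends on how poking_init() ends 
up being hooked, so treat the int return type as an assumption):

```c
/*
 * Untested sketch: warn and bail out instead of BUG_ON(), so a failed
 * allocation only disables the users of code patching (ftrace, kgdb, ...)
 * rather than crashing the kernel.
 */
int __init poking_init(void)
{
	spinlock_t *ptl; /* for protecting pte table */
	pte_t *ptep;

	patching_mm = copy_init_mm();
	if (WARN_ON(!patching_mm))
		return -ENOMEM;

	/* In hash we cannot go above DEFAULT_MAP_WINDOW easily. */
	patching_addr = (get_random_long() & PAGE_MASK) %
		(DEFAULT_MAP_WINDOW - PAGE_SIZE);

	ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
	if (WARN_ON(!ptep))
		return -ENOMEM;

	pte_unmap_unlock(ptep, ptl);
	return 0;
}
```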

> 
>>> +	pte_unmap_unlock(ptep, ptl);
>>> +}
>>> +
>>>    static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>>>    
>>>    static int text_area_cpu_up(unsigned int cpu)
>>>
>>
>> Christophe

Christophe

