[PATCH 6/6] powerpc/rtas: constrain user region allocation to RMA

Alexey Kardashevskiy aik at ozlabs.ru
Wed Jan 20 15:49:33 AEDT 2021



On 20/01/2021 11:39, Nathan Lynch wrote:
> Alexey Kardashevskiy <aik at ozlabs.ru> writes:
>> On 16/01/2021 02:38, Nathan Lynch wrote:
>>> Alexey Kardashevskiy <aik at ozlabs.ru> writes:
>>>> On 15/01/2021 09:00, Nathan Lynch wrote:
>>>>> Memory locations passed as arguments from the OS to RTAS usually need
>>>>> to be addressable in 32-bit mode and must reside in the Real Mode
>>>>> Area. On PAPR guests, the RMA starts at logical address 0 and is the
>>>>> first logical memory block reported in the LPAR’s device tree.
>>>>>
>>>>> On powerpc targets with RTAS, Linux makes available to user space a
>>>>> region of memory suitable for arguments to be passed to RTAS via
>>>>> sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
>>>>> API during boot in order to ensure that it satisfies the requirements
>>>>> described above.
>>>>>
>>>>> With radix MMU, the upper limit supplied to the memblock allocation
>>>>> can exceed the bounds of the first logical memory block, since
>>>>> ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB. (512MB is
>>>>> a common size of the first memory block according to a small sample of
>>>>> LPARs I have checked.) This leads to failures when user space invokes
>>>>> an RTAS function that uses a work area, such as
>>>>> ibm,configure-connector.
>>>>>
>>>>> Alter the determination of the upper limit for rtas_rmo_buf's
>>>>> allocation to consult the device tree directly, ensuring placement
>>>>> within the RMA regardless of the MMU in use.
>>>>
>>>> Can we tie this with RTAS (which also needs to be in RMA) and simply add
>>>> extra 64K in prom_instantiate_rtas() and advertise this address
>>>> (ALIGH_UP(rtas-base + rtas-size, PAGE_SIZE)) to the user space? We do
>>>> not need this RMO area before that point.
>>>
>>> Can you explain more about what advantage that would bring? I'm not
>>> seeing it. It's a more significant change than what I've written
>>> here.
>>
>>
>> We already allocate space for RTAS and (like RMO) it needs to be in RMA,
>> and RMO is useless without RTAS. We can reuse RTAS allocation code for
>> RMO like this:
> 
> When you say RMO I assume you are referring to rtas_rmo_buf? (I don't
> think it is well-named.)
> 
> 
>> ===
>> diff --git a/arch/powerpc/kernel/prom_init.c
>> b/arch/powerpc/kernel/prom_init.c
>> index e9d4eb6144e1..d9527d3e01d2 100644
>> --- a/arch/powerpc/kernel/prom_init.c
>> +++ b/arch/powerpc/kernel/prom_init.c
>> @@ -1821,7 +1821,8 @@ static void __init prom_instantiate_rtas(void)
>>           if (size == 0)
>>                   return;
>>
>> -       base = alloc_down(size, PAGE_SIZE, 0);
>> +       /* One page for RTAS, one for RMO */
> 
> One page for RTAS? RTAS is ~20MB on LPARs I've checked:
> 
> # lsprop /proc/device-tree/rtas/{rtas-size,linux,rtas-base}
> /proc/device-tree/rtas/rtas-size
> 		 01370000 (20381696)

You are right, I did not sleep well when replied, sorry about that :) I 
tried it with KVM where RTAS is just a few KBs (20 constant bytes + MCE 
log, depends on cpu number) so it worked for me.


> 
>> +       base = alloc_down(size, PAGE_SIZE + PAGE_SIZE, 0);
> 
> This changes the alignment but not the size of the allocation.


Should be:

base = alloc_down(ALIGN_UP(size, PAGE_SIZE) + PAGE_SIZE, PAGE_SIZE, 0);

> 
> 
>>           if (base == 0)
>>                   prom_panic("Could not allocate memory for RTAS\n");
>>
>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
>> index d126d71ea5bd..885d95cf4ed3 100644
>> --- a/arch/powerpc/kernel/rtas.c
>> +++ b/arch/powerpc/kernel/rtas.c
>> @@ -1186,6 +1186,7 @@ void __init rtas_initialize(void)
>>           rtas.size = size;
>>           no_entry = of_property_read_u32(rtas.dev, "linux,rtas-entry",
>> &entry);
>>           rtas.entry = no_entry ? rtas.base : entry;
>> +       rtas_rmo_buf = rtas.base + PAGE_SIZE;
> 
> I think this would overlay the user region on top of the RTAS private
> data area, allowing user space to corrupt it.


Right, my bad. Should be:

rtas_rmo_buf = ALIGN_UP(rtas.base + rtas.size, PAGE_SIZE);


> 
>>
>>           /* If RTAS was found, allocate the RMO buffer for it and look for
>>            * the stop-self token if any
>> @@ -1196,11 +1197,6 @@ void __init rtas_initialize(void)
>>                   ibm_suspend_me_token = rtas_token("ibm,suspend-me");
>>           }
>>    #endif
>> -       rtas_rmo_buf = memblock_phys_alloc_range(RTAS_RMOBUF_MAX, PAGE_SIZE,
>> -                                                0, rtas_region);
>> -       if (!rtas_rmo_buf)
>> -               panic("ERROR: RTAS: Failed to allocate %lx bytes below
>> %pa\n",
>> -                     PAGE_SIZE, &rtas_region);
>> ===
>>
>> May be store in the FDT as "linux,rmo-base" next to "linux,rtas-base",
>> for clarity, as sharing symbols between prom and main kernel is a bit
>> tricky.
>>
>> The benefit is that we do not do the same thing   (== find 64K in RMA)
>> in 2 different ways and if the RMO allocated my way is broken - we'll
>> know it much sooner as RTAS itself will break too.
> 
> Implementation details aside... I'll grant that combining the
> allocations into one in prom_init reduces some duplication in the sense
> that both are subject to the same constraints (mostly - the RTAS data
> area must not cross a 256MB boundary, while the user region may). But
> they really are distinct concerns. The RTAS private data area is
> specified in the platform architecture, the OS is obligated to allocate
> it and pass it to instantiate-rtas, etc etc. However the user region
> (rtas_rmo_buf) is purely a Linux construct which is there to support
> sys_rtas.

Not purely - it should be an address which RTAS accepts. Cannot argue 
with the rest though, it all sounds correct.

> Now, there are multiple sites in the kernel proper that must allocate
> memory suitable for passing to RTAS. Obviously there is value in
> consolidating the logic for that purpose in one place, so I'll work on
> adding that in v2. OK?

Sure.


-- 
Alexey


More information about the Linuxppc-dev mailing list