[PATCH 4/6] powerpc/mm/64s/hash: Factor out change_memory_range()

Michael Ellerman mpe at ellerman.id.au
Tue Mar 16 17:30:24 AEDT 2021


Daniel Axtens <dja at axtens.net> writes:
> Michael Ellerman <mpe at ellerman.id.au> writes:
>
>> Pull the loop calling hpte_updateboltedpp() out of
>> hash__change_memory_range() into a helper function. We need it to be a
>> separate function for the next patch.
>>
>> Signed-off-by: Michael Ellerman <mpe at ellerman.id.au>
>> ---
>>  arch/powerpc/mm/book3s64/hash_pgtable.c | 23 +++++++++++++++--------
>>  1 file changed, 15 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
>> index 03819c259f0a..3663d3cdffac 100644
>> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
>> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
>> @@ -400,10 +400,23 @@ EXPORT_SYMBOL_GPL(hash__has_transparent_hugepage);
>>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>>  
>>  #ifdef CONFIG_STRICT_KERNEL_RWX
>> +static void change_memory_range(unsigned long start, unsigned long end,
>> +				unsigned int step, unsigned long newpp)
>
> Looking at the call paths, this only gets called in the virtualised
> case, under stop_machine(), not on bare metal: should the name reflect that?

It's also called on bare metal:

static bool hash__change_memory_range(unsigned long start, unsigned long end,
				      unsigned long newpp)
{
	...
	if (firmware_has_feature(FW_FEATURE_LPAR)) {
	        ...
		stop_machine_cpuslocked(change_memory_range_fn, &chmem_parms,
					cpu_online_mask);
	        ...
	} else
		change_memory_range(start, end, step, newpp);
                ^^^^^^^^^^^^^^^^^^^
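
For reference, the LPAR branch gets there too - change_memory_range_fn() is
the stop_machine() callback added in the next patch, and it ends up calling
the same helper. Very roughly (this is a simplified sketch, not the actual
patch code, which also has to park the other CPUs while we're in real mode;
field names assumed to mirror change_memory_range()'s arguments):

static int change_memory_range_fn(void *data)
{
	struct change_memory_parms *parms = data;

	/* Sketch only: one CPU does the update under stop_machine() */
	change_memory_range(parms->start, parms->end, parms->step, parms->newpp);

	return 0;
}

So the helper runs on both LPAR and bare metal, hence the generic name.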


>> +{
>> +	unsigned long idx;
>> +
>> +	pr_debug("Changing page protection on range 0x%lx-0x%lx, to 0x%lx, step 0x%x\n",
>> +		 start, end, newpp, step);
>> +
>> +	for (idx = start; idx < end; idx += step)
>> +		/* Not sure if we can do much with the return value */
>
> Hmm, I realise this comment isn't changed, but it did make me wonder
> what the return value was!
>
> It turns out that the function doesn't actually return anything.
>
> Tracking back the history of hpte_updateboltedpp, it looks like it has
> not had a return value since the start of git history:
>
> ^1da177e4c3f4 include/asm-ppc64/machdep.h    void            (*hpte_updateboltedpp)(unsigned long newpp, 
> 3c726f8dee6f5 include/asm-powerpc/machdep.h                                         unsigned long ea,
> 1189be6508d45 include/asm-powerpc/machdep.h                                        int psize, int ssize);
>
> The comment comes from commit cd65d6971334 ("powerpc/mm/hash: Implement
> mark_rodata_ro() for hash") where Balbir added the comment, but again I
> can't figure out what sort of return value there would be to ignore.

I suspect he just assumed there was a return value, and the comment is
saying we aren't really allowed to fail here, so what could we do?

In general these routines changing the kernel map permissions aren't
allowed to fail, because the callers don't cope with a failure, and at
least in some cases eg. RW -> RX the permission change is not optional.

> Should we drop the comment? (or return something from hpte_updateboltedpp)

I'll leave the comment for now, but we could probably drop it.

It would be good if hpte_updateboltedpp() could fail and return an
error. Currently pSeries_lpar_hpte_updateboltedpp() BUGs if the hcall
fails, because the only error cases are due to bad input on our part.
And similarly native_hpte_updateboltedpp() panics if we give it bad
input.

We may still need to panic() at a higher level, eg. adding execute to a
mapping is not optional. But possibly for some changes, like RW->RO, we
could WARN and continue.

And I guess for modules we could eventually plumb the error all the way
out and fail the module load.
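
Just to sketch where that could go (purely illustrative, none of this exists
today - hpte_updateboltedpp() still returns void), if the op returned an int
the loop could decide how loud to be about a failure:

/*
 * Illustrative sketch only: assumes a hypothetical int-returning
 * hpte_updateboltedpp(). The loop could then decide per-change how fatal
 * a failure is, eg. WARN for an RW->RO downgrade, panic when adding
 * execute permission.
 */
static void change_memory_range(unsigned long start, unsigned long end,
				unsigned int step, unsigned long newpp)
{
	unsigned long idx;
	int rc;

	for (idx = start; idx < end; idx += step) {
		rc = mmu_hash_ops.hpte_updateboltedpp(newpp, idx,
						      mmu_linear_psize,
						      mmu_kernel_ssize);
		if (rc)
			WARN_ONCE(1, "Unable to update page protection at 0x%lx\n", idx);
	}
}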

>> +		mmu_hash_ops.hpte_updateboltedpp(newpp, idx, mmu_linear_psize,
>> +							mmu_kernel_ssize);
>> +}
>> +
>>  static bool hash__change_memory_range(unsigned long start, unsigned long end,
>>  				      unsigned long newpp)
>>  {
>> -	unsigned long idx;
>>  	unsigned int step, shift;
>>  
>>  	shift = mmu_psize_defs[mmu_linear_psize].shift;
>> @@ -415,13 +428,7 @@ static bool hash__change_memory_range(unsigned long start, unsigned long end,
>>  	if (start >= end)
>>  		return false;
>>  
>> -	pr_debug("Changing page protection on range 0x%lx-0x%lx, to 0x%lx, step 0x%x\n",
>> -		 start, end, newpp, step);
>> -
>> -	for (idx = start; idx < end; idx += step)
>> -		/* Not sure if we can do much with the return value */
>> -		mmu_hash_ops.hpte_updateboltedpp(newpp, idx, mmu_linear_psize,
>> -							mmu_kernel_ssize);
>> +	change_memory_range(start, end, step, newpp);
>
> Looking at how change_memory_range is called, step is derived by:
>
> 	shift = mmu_psize_defs[mmu_linear_psize].shift;
> 	step = 1 << shift;
>
> We probably therefore don't really need to pass step in to
> change_memory_range. Having said that, I'm not sure it would really be that
> much tidier to compute step in change_memory_range, especially since we
> also need step for the other branch in hash__change_memory_range.

Hmm yeah, swings and roundabouts. I think I'll leave it as is, so that
we're only calculating step in one place.
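
For reference, the context at the top of hash__change_memory_range() is
roughly:

	shift = mmu_psize_defs[mmu_linear_psize].shift;
	step = 1 << shift;

	start = ALIGN_DOWN(start, step);
	end = ALIGN(end, step);

	if (start >= end)
		return false;

So both branches see the same step, and the range is already step-aligned by
the time change_memory_range() runs.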

> Beyond that it all looks reasonable to me!
>
> I also checked that the loop operations make sense, and I think they do - we
> cover from start inclusive to end exclusive, and the alignment is done
> before we call into change_memory_range.

Thanks.

cheers

