[PATCH v2 2/2] powerpc/mm: Add memory_block_size as a kernel parameter

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Tue Jun 20 02:17:27 AEST 2023

David Hildenbrand <david at redhat.com> writes:

> On 09.06.23 08:08, Aneesh Kumar K.V wrote:
>> Certain devices can possess non-standard memory capacities, not constrained
>> to multiples of 1GB. Provide a kernel parameter so that we can map the
>> device memory completely on memory hotplug.
>
> So, the unfortunate thing is that these devices would have worked out of
> the box before the memory block size was increased from 256 MiB to 1 GiB
> in these setups. Now, one has to fine-tune the memory block size. The
> only other arch that I know, which supports setting the memory block
> size, is x86 for special (large) UV systems -- and at least in the past
> 128 MiB vs. 2 GiB memory blocks made a performance difference during
> boot (maybe no longer today, who knows).
>
> Obviously, less tunable and getting stuff simply working out of the box
> is preferable.
>
> Two questions:
>
> 1) Isn't there a way to improve auto-detection to fallback to 256 MiB in
> these setups, to avoid specifying these parameters?

The patch does try to detect as much as possible by looking at device tree
nodes and the aperture window size. But there are still cases where we find
a memory aperture of size X GB and the device driver hotplugs X.Y GB of memory.
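
To illustrate the X.Y GB case, here is a minimal arithmetic sketch. It is
not code from the patch: the helper names and the 2304 MiB (2.25 GiB) size
used in the note below are hypothetical, and it assumes the hotplugged range
starts on a memory block boundary.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)
#define GiB (1024ULL * MiB)

/*
 * Bytes of a hotplugged range that can be added as whole memory blocks.
 * Assumes the range starts on a block boundary (illustrative only).
 */
static uint64_t coverable_bytes(uint64_t range_size, uint64_t block_size)
{
	assert(block_size != 0);
	return range_size - (range_size % block_size);
}

/* Bytes at the tail of the range that do not fill a whole memory block. */
static uint64_t leftover_bytes(uint64_t range_size, uint64_t block_size)
{
	assert(block_size != 0);
	return range_size % block_size;
}
```

With a hypothetical 2304 MiB device, leftover_bytes(2304 * MiB, 1 * GiB) is
256 MiB that cannot be mapped with 1 GiB memory blocks, while
leftover_bytes(2304 * MiB, 256 * MiB) is 0.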

> 2) Is the 256 MiB -> 1 GiB memory block size switch really worth it? On 
> x86-64, experiments (with direct map fragmentation) showed that the 
> effective performance boost is pretty insignificant, so I wonder how big 
> the 1 GiB direct map performance improvement is.

Tarun is running some tests to evaluate the impact. We used to always use
a 1 GiB mapping. This was later switched to use the memory block size to fix
issues with memory unplug. Commit af9d00e93a4f ("powerpc/mm/radix: Create
separate mappings for hot-plugged memory") explains some details related
to that change.

> I guess the only real issue with 256 MiB memory blocks and 1 GiB direct 
> mapping is memory unplug of boot memory: when unplugging a 256 MiB 
> block, one would have to remap the 1 GiB range using 2 MiB ranges.

> ... I was wondering what would happen if you simply leave the direct 
> mapping in this corner case in place instead of doing this remapping. 
> IOW, remove the memory but keep the direct map pointing at the removed 
> memory. Nobody should be touching it, or are there any cases where that 
> could hurt?
>
> Or is there any other reason why we really want 1 GiB memory blocks
> instead of defaulting to 256 MiB the way it used to be?

The idea we are working towards is to keep the memory block size small
but map the boot memory using 1G. An unplug request can split that 1G
mapping later. We could look at the possibility of leaving that mapping
without splitting. But I am not sure why we would want to do that if we can
correctly split things. Right now there is no splitting support in powerpc.
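
As a rough idea of what such a split would involve, here is an arithmetic
sketch only. It is not an implementation of splitting (which, as noted, does
not exist in powerpc today); the helper name is hypothetical, and the 2 MiB
small mapping size is taken from the example quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)
#define GiB (1024ULL * MiB)

/*
 * Number of small mappings needed to keep the rest of one large mapping
 * mapped after unplug_size bytes have been removed from it.
 * All sizes must be multiples of small_size (illustrative only).
 */
static uint64_t small_mappings_after_unplug(uint64_t large_size,
					    uint64_t small_size,
					    uint64_t unplug_size)
{
	assert(small_size != 0);
	assert(large_size % small_size == 0);
	assert(unplug_size % small_size == 0);
	assert(unplug_size <= large_size);
	return (large_size - unplug_size) / small_size;
}
```

Unplugging one 256 MiB block from a 1 GiB mapping would leave 768 MiB to be
remapped, i.e. small_mappings_after_unplug(1 * GiB, 2 * MiB, 256 * MiB)
gives 384 mappings of 2 MiB each.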

