[PATCH V2 1/2] mm/page_alloc: Replace set_dma_reserve to set_memory_reserve
Vlastimil Babka
vbabka at suse.cz
Fri Aug 5 19:09:15 AEST 2016
On 08/05/2016 09:24 AM, Srikar Dronamraju wrote:
> * Vlastimil Babka <vbabka at suse.cz> [2016-08-05 08:45:03]:
>
>>> @@ -5493,10 +5493,10 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>>> }
>>>
>>> /* Account for reserved pages */
>>> - if (j == 0 && freesize > dma_reserve) {
>>> - freesize -= dma_reserve;
>>> + if (j == 0 && freesize > nr_memory_reserve) {
>>
>> Will this really work (together with patch 2) as intended?
>> This j == 0 means that we are doing this only for the first zone, which is
>> ZONE_DMA (or ZONE_DMA32) on node 0 on many systems. I.e. I don't think it's
>> really true that "dma_reserve has nothing to do with DMA or ZONE_DMA".
>>
>> This zone will have limited amount of memory, so the "freesize >
>> nr_memory_reserve" will easily be false once you set this to many gigabytes,
>> so in fact nothing will get subtracted.
>>
>> On the other hand if the kernel has both CONFIG_ZONE_DMA and
>> CONFIG_ZONE_DMA32 disabled, then j == 0 will be true for ZONE_NORMAL. This
>> zone might be present on multiple nodes (unless they are configured as
>> movable) and then the value intended to be global will be subtracted from
>> several nodes.
>>
>> I don't know what's the exact ppc64 situation here, perhaps there are indeed
>> no DMA/DMA32 zones, and the fadump kernel only uses one node, so it works in
>> the end, but it doesn't seem much robust to me?
>>
>
> At the page initialization time, powerpc seems to have just one zone
> spread across the 16 nodes.
>
> From the dmesg.
>
> [ 0.000000] Memory hole size: 0MB
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x0000000000000000-0x00001f5c8fffffff]
> [ 0.000000] DMA32 empty
> [ 0.000000] Normal empty
> [ 0.000000] Movable zone start for each node
> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x0000000000000000-0x000001fb4fffffff]
> [ 0.000000] node 1: [mem 0x000001fb50000000-0x000003fa8fffffff]
> [ 0.000000] node 2: [mem 0x000003fa90000000-0x000005f9cfffffff]
> [ 0.000000] node 3: [mem 0x000005f9d0000000-0x000007f8efffffff]
> [ 0.000000] node 4: [mem 0x000007f8f0000000-0x000009f81fffffff]
> [ 0.000000] node 5: [mem 0x000009f820000000-0x00000bf77fffffff]
> [ 0.000000] node 6: [mem 0x00000bf780000000-0x00000df6dfffffff]
> [ 0.000000] node 7: [mem 0x00000df6e0000000-0x00000ff63fffffff]
> [ 0.000000] node 8: [mem 0x00000ff640000000-0x000011f58fffffff]
> [ 0.000000] node 9: [mem 0x000011f590000000-0x000013644fffffff]
> [ 0.000000] node 10: [mem 0x0000136450000000-0x00001563afffffff]
> [ 0.000000] node 11: [mem 0x00001563b0000000-0x000017630fffffff]
> [ 0.000000] node 12: [mem 0x0000176310000000-0x000019625fffffff]
> [ 0.000000] node 13: [mem 0x0000196260000000-0x00001b5dcfffffff]
> [ 0.000000] node 14: [mem 0x00001b5dd0000000-0x00001d5d2fffffff]
> [ 0.000000] node 15: [mem 0x00001d5d30000000-0x00001f5c8fffffff]
Hmm so it will work for ppc64 and its fadump, but I'm not happy that we
made the function name sound like it's generic (unlike when the name
contained "dma"), while it only works as intended in specific corner
cases. The next user might be surprised...
More information about the Linuxppc-dev
mailing list