[PATCH kernel v3 10/22] powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation
Alexey Kardashevskiy
aik at ozlabs.ru
Mon Nov 19 18:43:36 AEDT 2018
On 16/11/2018 16:23, David Gibson wrote:
> On Tue, Nov 13, 2018 at 07:28:11PM +1100, Alexey Kardashevskiy wrote:
>> We might have memory@ nodes with "linux,usable-memory" set to zero
>> (for example, to replicate powernv's behaviour for GPU coherent
>> memory). Such memory needs extra initialization before it can be
>> used, but since it is usable afterwards, the pseries platform will
>> try mapping it for DMA, so the DMA window needs to cover those
>> memory regions too.
>>
>> This walks through the memory@ nodes to find the highest RAM address
>> so that a huge DMA window can cover it too, in case this memory gets
>> onlined later.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>> ---
>> arch/powerpc/platforms/pseries/iommu.c | 43 +++++++++++++++++++++++++-
>> 1 file changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>> index 78473ac..f818737 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -967,6 +967,47 @@ struct failed_ddw_pdn {
>>
>> static LIST_HEAD(failed_ddw_pdn_list);
>>
>> +static unsigned long read_n_cells(int n, const __be32 **buf)
>> +{
>> + unsigned long result = 0;
>> +
>> + while (n--) {
>> + result = (result << 32) | of_read_number(*buf, 1);
>> + (*buf)++;
>> + }
>> + return result;
>> +}
>
> Um.. this appears to be re-implementing of_read_number() in terms of
> of_read_number(). Wat!?
This is a cut-n-paste from arch/powerpc/mm/numa.c :) My bad, I did not
think much when I did this.
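
For what it is worth, since of_read_number() already takes the number of
cells to read, the helper above could collapse to something like this
(untested sketch, just to illustrate the point):

static unsigned long read_n_cells(int n, const __be32 **buf)
{
	unsigned long result = of_read_number(*buf, n);

	*buf += n;
	return result;
}

or the two call sites in ddw_memory_hotplug_max() could call
of_read_number() directly and advance memcell_buf themselves, dropping
the helper entirely.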
>
>> +static phys_addr_t ddw_memory_hotplug_max(void)
>> +{
>> + phys_addr_t max_addr = memory_hotplug_max();
>> + struct device_node *memory;
>> +
>> + for_each_node_by_type(memory, "memory") {
>> + unsigned long start, size;
>> + int ranges, n_mem_addr_cells, n_mem_size_cells, len;
>> + const __be32 *memcell_buf;
>> +
>> + memcell_buf = of_get_property(memory, "reg", &len);
>> + if (!memcell_buf || len <= 0)
>> + continue;
>> +
>> + n_mem_addr_cells = of_n_addr_cells(memory);
>> + n_mem_size_cells = of_n_size_cells(memory);
>> +
>> + /* ranges in cell */
>> + ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells);
>> +
>> + /* these are order-sensitive, and modify the buffer pointer */
>> + start = read_n_cells(n_mem_addr_cells, &memcell_buf);
>> + size = read_n_cells(n_mem_size_cells, &memcell_buf);
>> +
>> + max_addr = max_t(phys_addr_t, max_addr, start + size);
>> + }
>> +
>> + return max_addr;
>> +}
>
> Is there really no existing place we keep track of maximum possible
> memory address?
There are:
1. memblocks from mm/memblock.c - populated at boot time from the
"usable" memory@ nodes, and mine are not "usable";
2. drmem from arch/powerpc/mm/drmem.c - populated from
ibm,dynamic-memory-v2; these do not support sparse regions, so when I
tried them with a GPU RAM region mapped at 0x244000000000, the device
tree quickly grew over 1MB and then QEMU crashed. I did not debug any
further as this memory is not hotpluggable anyway from the rtas/QEMU
perspective, in other words it is not something the user can hotplug or
unplug.
And that is it afaict.
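
For reference, memory_hotplug_max() itself is (quoting
arch/powerpc/mm/numa.c from memory, so the exact shape may differ
slightly):

u64 memory_hotplug_max(void)
{
	return max(hot_add_drconf_memory_max(), memblock_end_of_DRAM());
}

i.e. it only looks at the drmem hot-add ceiling and at memblock, and
neither of those sees the not-yet-usable memory@ nodes, which is why
ddw_memory_hotplug_max() walks the nodes itself on top of it.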
>
>> /*
>> * If the PE supports dynamic dma windows, and there is space for a table
>> * that can map all pages in a linear offset, then setup such a table,
>> @@ -1067,7 +1108,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn,
>> }
>> /* verify the window * number of ptes will map the partition */
>> /* check largest block * page size > max memory hotplug addr */
>> - max_addr = memory_hotplug_max();
>> + max_addr = ddw_memory_hotplug_max();
>> if (query.largest_available_block < (max_addr >> page_shift)) {
>> dev_dbg(&dev->dev, "can't map partition max 0x%llx with %u "
>> "%llu-sized pages\n", max_addr, query.largest_available_block,
>
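
To put numbers on the check above (hypothetical values): with 64K IOMMU
pages (page_shift == 16) and max_addr pushed up to, say, 0x248000000000
by a GPU region, the window has to provide max_addr >> page_shift ==
0x24800000 TCEs, so query.largest_available_block must be at least that
many entries or enable_ddw() gives up on the huge window.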
--
Alexey