[PATCH kernel] powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested

Alexey Kardashevskiy aik at ozlabs.ru
Mon Mar 6 12:28:19 AEDT 2017


On 06/03/17 10:03, Benjamin Herrenschmidt wrote:
> On Mon, 2017-02-27 at 22:00 +1100, Michael Ellerman wrote:
>>> The alternative would be allocating TCE tables as big as PAGE_SIZE but
>>> only using part of each; this would somewhat complicate the code
>>> responsible for the overall amount of memory used for the TCE table.
>>>
>>> Or kmem_cache_create() could be used to allocate TCE table levels only
>>> as big as we really need, but that API does not seem to support NUMA
>>> nodes.
>>
>> kmem_cache_alloc_node() ?
> 
> Is that 55 bits of address space (i.e., 3 indirect levels + 64k pages)?
> Or only 39 (2 indirect levels + 64k pages)?

39, yes.

> In the former case, I'm happy to limit the levels to 3 for 64K pages,
> 55 bits of TCE space is more than enough. 39 isn't however.

8192*8192*8192*65536>>40 = 32768TB of addressable memory (but there is no
good reason not to use huge pages);
8192*8192*8192*4096>>40 = 2048TB of addressable memory (even with 2
indirect levels, but we can have all 5 levels with 4K IOMMU pages).
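
For anyone who wants to check the numbers, here is a minimal sketch of the
arithmetic above. It assumes 8-byte TCEs and one 64K page per table level
(hence 8192 entries per level); the function names are made up for
illustration and are not kernel APIs.

```python
# Assumptions (not taken from the kernel source): 8-byte TCE entries and
# one 64K page allocated per table level, which gives 8192 entries/level.
TCE_ENTRY_SIZE = 8
LEVEL_SIZE = 64 * 1024


def entries_per_level(level_size=LEVEL_SIZE):
    """Number of TCE entries that fit in one table level."""
    return level_size // TCE_ENTRY_SIZE


def addr_bits(levels, page_shift, level_size=LEVEL_SIZE):
    """Bits of address space covered by `levels` levels of tables
    mapping IOMMU pages of 2**page_shift bytes."""
    # entries_per_level() is a power of two, so this is an exact log2.
    bits_per_level = entries_per_level(level_size).bit_length() - 1
    return levels * bits_per_level + page_shift


def addressable_tb(levels, page_shift, level_size=LEVEL_SIZE):
    """Addressable memory in TB (1 TB = 2**40 bytes)."""
    return (1 << addr_bits(levels, page_shift, level_size)) >> 40


if __name__ == "__main__":
    for page_shift in (16, 12):  # 64K and 4K IOMMU pages
        print(page_shift, addr_bits(3, page_shift), addressable_tb(3, page_shift))
```

With three 8192-entry levels this gives 55 bits for 64K IOMMU pages and
51 bits for 4K IOMMU pages, matching the two products above.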

Looks enough to me...

And in this particular patch I am not limiting anything, I am just replacing
the already existing EEH condition with -EINVAL. If it is this important to
have all 5 levels, then we can switch from alloc_pages_node() to
kmem_cache_alloc_node(), in a separate patch.


-- 
Alexey

