[PATCH kernel] powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
aik at ozlabs.ru
Tue Feb 28 11:54:36 AEDT 2017
On 27/02/17 22:00, Michael Ellerman wrote:
> Alexey Kardashevskiy <aik at ozlabs.ru> writes:
>> The IODA2 specification says that a 64-bit DMA address cannot use the top
>> 4 bits (3 are reserved and one is a "TVE select"); the bottom page_shift
>> bits cannot be used for multilevel table addressing either.
>> The existing IODA2 table allocation code aligns the minimum TCE table
>> size to PAGE_SIZE, so in the case of 64K system pages and 4K IOMMU pages
>> we have 64-4-12=48 bits. Since a 64K page stores 8192 TCEs, i.e. needs
>> 13 bits per level, the maximum number of levels is 48/13 = 3 (rounded
>> down), so we physically cannot address more and EEH happens on DMA
>> accesses.
>> This adds a check for too many levels being requested.
>> It is still possible to have 5 levels in the case of 4K system page size.
>> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>> The alternative would be allocating TCE tables as big as PAGE_SIZE but
>> only using part of each, but this would somewhat complicate the code
>> that accounts for the overall amount of memory used for TCE tables.
>> Or kmem_cache_create() could be used to allocate TCE table levels only
>> as big as we really need, but that API does not seem to support NUMA nodes.
> kmem_cache_alloc_node() ?
Yeah, discovered this later. Still, if a single level is used, then the
table is 4MB and kmem_cache_alloc_node() does not seem the right tool here
(although I cannot find any enforced upper limit).
So to keep things simpler, I decided to stick to alloc_pages_node() and
avoid mixing memory allocation APIs.
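
For reference, the level arithmetic from the commit message can be sketched
as a small userspace C function. This is only an illustration and not the
code from the patch: max_tce_levels() and its parameter names are
hypothetical, and it assumes 8-byte TCEs (hence 8192 entries per 64K page)
and that each table level occupies exactly one system page.

```c
/*
 * Hypothetical helper (not from the patch): the largest number of TCE
 * table levels that can be addressed, following the commit message.
 *
 * sys_page_shift   - log2 of the system page size (16 for 64K, 12 for 4K)
 * iommu_page_shift - log2 of the IOMMU page size (12 for 4K)
 */
static unsigned int max_tce_levels(unsigned int sys_page_shift,
				   unsigned int iommu_page_shift)
{
	/* Top 4 bits are unusable: 3 reserved plus one "TVE select". */
	unsigned int usable_bits = 64 - 4 - iommu_page_shift;

	/* Assumption: a TCE is 8 bytes, so a page holds 2^(shift - 3) TCEs. */
	unsigned int bits_per_level = sys_page_shift - 3;

	/* Integer division rounds down; a partial level cannot be used. */
	return usable_bits / bits_per_level;
}
```

With 64K system pages and 4K IOMMU pages (shifts 16 and 12) this yields 3,
and with 4K system pages (shifts 12 and 12) it yields 5, matching the
numbers quoted above.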