[PATCH kernel] powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
aik at ozlabs.ru
Wed Feb 22 15:43:59 AEDT 2017
The IODA2 specification says that a 64-bit DMA address cannot use the top
4 bits (3 are reserved and one is a "TVE select"); the bottom page_shift
bits cannot be used for multilevel table addressing either.
The existing IODA2 table allocation code aligns the minimum TCE table
size to PAGE_SIZE, so in the case of 64K system pages and 4K IOMMU pages
we have 64-4-12=48 bits. Since a 64K page stores 8192 TCEs, i.e. needs
13 bits, the maximum number of levels is 48/13 = 3; we physically
cannot address more, and EEH happens on DMA accesses.
This adds a check that fails the allocation with -EINVAL if too many
levels were requested.
It is still possible to have 5 levels in the case of 4K system page size.
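
For illustration only, the arithmetic above can be sketched as a standalone
userspace program fragment. The helper names (tce_level_shift,
tce_check_levels) and the two macros are hypothetical and are not part of
the kernel; only the condition itself mirrors the check added by this patch.

```c
#include <errno.h>

/* Assumed constants for this sketch (names are not from the kernel). */
#define TCE_ENTRY_SHIFT 3   /* each TCE is 8 bytes, i.e. 2^3 */
#define DMA_USABLE_BITS 60  /* 64 bits minus the 4 unusable top bits */

/*
 * Size (as a shift) of one table level: large enough for
 * 2^entries_shift TCEs, but never smaller than a system page.
 */
static unsigned tce_level_shift(unsigned entries_shift,
                                unsigned sys_page_shift)
{
    unsigned level_shift = entries_shift + TCE_ENTRY_SHIFT;

    return level_shift > sys_page_shift ? level_shift : sys_page_shift;
}

/*
 * Each level consumes (level_shift - 3) address bits and the IOMMU page
 * offset consumes iommu_page_shift bits; reject the request if the total
 * reaches the 60 usable bits.
 */
static long tce_check_levels(unsigned level_shift, unsigned levels,
                             unsigned iommu_page_shift)
{
    if ((level_shift - TCE_ENTRY_SHIFT) * levels + iommu_page_shift >=
        DMA_USABLE_BITS)
        return -EINVAL;

    return 0;
}
```

With 64K system pages (level_shift of 16) and 4K IOMMU pages, 3 levels pass
(13*3+12 = 51) while 4 levels are rejected (13*4+12 = 64). With 4K system
pages (level_shift of 12), 5 levels still pass (9*5+12 = 57).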
Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
The alternative would be allocating TCE tables as big as PAGE_SIZE but
only using part of each; however, this would somewhat complicate the code
responsible for the overall amount of memory used for TCE tables.
Or kmem_cache_create() could be used to allocate as big TCE table levels
as we really need but that API does not seem to support NUMA nodes.
In reality, even 3 levels give us way too much addressable memory.
arch/powerpc/platforms/powernv/pci-ioda.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 24fa2de2a0af..1e92ec954321 100644
@@ -2631,6 +2631,9 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
level_shift = entries_shift + 3;
level_shift = max_t(unsigned, level_shift, PAGE_SHIFT);
+ if ((level_shift - 3) * levels + page_shift >= 60)
+ return -EINVAL;
/* Allocate TCE table */
addr = pnv_pci_ioda2_table_do_alloc_pages(nid, level_shift,
levels, tce_table_size, &offset, &total_allocated);