[Linuxppc-users] DMA fails to reserved memory

Benjamin Herrenschmidt benh at au1.ibm.com
Tue Jun 5 09:15:06 AEST 2018


On Mon, 2018-06-04 at 17:04 -0600, Brian Varney wrote:
> Okay, so in my haste to look up the dma api, it looks like I ended up
> using the wrong function.
> 
> The dma_map_single and dma_map_page take a kernel virtual pointer and
> page * respectively.  I don't have either one of those.  All I have
> is a physical address.  Since I am modifying the devicetree to hide
> this memory before booting the kernel, the kernel doesn't really know
> this memory exists.
> 
> What I'm trying to do is reserve a huge chunk of contiguous memory
> (i.e. several GBs).  I want this memory to be DMA-able from my PCIE
> adapter and be able to access it in userspace.  Is there a better way to
> accomplish this -- probably, but I'm trying to port a big code base
> that we had working from x86 platform to PPC without changing too
> much.

Rather than completely "remove" it from the /memory node of the
device tree, put it in the reserved regions instead. That way it will
still be mapped by the kernel, just not used.

That way you can get a kernel virtual address by just doing __va() on
the physical address. A bit hackish but should work.

> Reserving that memory by modifying the devicetree passed into the
> kernel works great.  I can then access that memory by
> calling remap_pfn_range on it.  The only thing I am missing is being
> able to DMA to that memory.
> 
> Thanks,
> Brian V.
> 
> 
> 
> 
> 
> On Mon, Jun 4, 2018 at 3:53 PM, Benjamin Herrenschmidt <benh at au1.ibm.com> wrote:
> > On Mon, 2018-06-04 at 13:08 -0600, Brian Varney wrote:
> > > Thanks for responding.
> > > 
> > > I'm not opposed to sharing snippets of code, but not sure what you
> > > would need to see at this point.
> > > 
> > > I'm not using the kernel's DMA API.  It wasn't necessary with X86
> > > architecture where the code is working fine.  Note that for x86
> > > architecture, I am using the "memmap=" kernel parameter to reserve
> > > memory instead of removing memory from the devicetree passed into the
> > > kernel.
> > > 
> > > After your response, I did try running using the api's
> > > "dma_map_resource" function but it returned the same address I sent
> > > it.  So no translation is necessary I guess.  It still failed the
> > > same way.
> > 
> > dma_map_resource() seems to be a new call that was added to the API
> > recently that we don't yet support or implement on powerpc. It also
> > looks rather ... broken as it doesn't fail if the backend doesn't
> > support it.
> > 
> > It also doesn't work for normal memory.
> > 
> > What are you trying to do ?
> > 
> > If you are trying to write to memory, use dma_map_single/sg or
> > dma_alloc_coherent, and DMA to that.
> > 
> > > I did get a PCIE analyzer hooked up now and I was able to get a pcie
> > > trace of it:
> > > https://ent.box.com/s/vx8nex5zo83xs562usd6bfeckoxbn3c8
> > > 
> > > The last transaction shown is my PCIE adapter writing to 0x81166000
> > > -- within the memory I reserved.  The transaction gets ACK'ed but
> > > then no other traffic is shown.  It seems the system has cut itself
> > > off from this PCIE adapter at this point.
> > > 
> > > Any ideas?
> > > 
> > > Thanks,
> > > -Brian V.
> > > 
> > > 
> > > 
> > > 
> > > 
> > > On Fri, Jun 1, 2018 at 1:50 PM, Brian King <brking at linux.vnet.ibm.com> wrote:
> > > > Any code you can share? On an LC922 system running bare metal, as long as your
> > > > adapter is capable of 64 bits of DMA address space, and not all adapters are,
> > > > then you would not be using the IOMMU. However, that does not mean that the
> > > > physical address the host sees equals the address that is used on the PCIe
> > > > link by the adapter. You need to make sure you are using the DMA-API
> > > > as defined in the kernel documentation to allocate DMA'able memory to be given
> > > > to the adapter. This API will give you a virtual address you can use to access
> > > > the memory in the kernel as well as a dma_addr_t which is the token you give
> > > > to the adapter as the DMA address. In cases where an IOMMU is in use, this would
> > > > set up the translation control entry (TCE) in the IOMMU. In your case, where you
> > > > are not using an IOMMU, it will do a simple translation to an address that
> > > > can be used on the PCIe link.
> > > > 
> > > > What you are seeing in the log below is an EEH error, which is an error correction
> > > > feature of the Power PCI host bridge, which allows the platform to recover from
> > > > various PCIe errors. In this case, it's a DMA write to an invalid address. In your
> > > > case the invalid address is 81166000, which is not a valid DMA address on an
> > > > LC922. 
> > > > 
> > > > Thanks,
> > > > 
> > > > Brian
> > > > 
> > > > On 06/01/2018 11:56 AM, Brian Varney wrote:
> > > > > Hello all,
> > > > > 
> > > > > I have a LC922 system running Fedora 28 (4.16.10-300.fc28.ppc64le) and I am reserving memory by modifying the device tree passed in to the kernel as described by this forum entry: https://lists.ozlabs.org/pipermail/linuxppc-users/2017-September/000112.html
> > > > > 
> > > > > I have a PCIE adapter plugged into the system that I am testing.  When the adapter performs a DMA operation to this reserved memory, things start to go south.  All reads with the adapter's BAR space start returning all FF's.  I suspect the reads aren't actually making it to the adapter but I don't have a PCIE analyzer on there to verify.  Then I get the following in dmesg:
> > > > > 
> > > > > [  340.316599] EEH: Frozen PHB#34-PE#0 detected
> > > > > [  340.316645] EEH: PE location: WIO Slot2, PHB location: N/A
> > > > > [  340.316675] CPU: 133 PID: 5380 Comm: mr Tainted: P           OE    4.16.10-300.fc28.ppc64le #1
> > > > > [  340.316676] Call Trace:
> > > > > [  340.316682] [c0002004287b7a40] [c000000000bec5d0] dump_stack+0xb4/0x104 (unreliable)
> > > > > [  340.316686] [c0002004287b7a80] [c00000000003f9d0] eeh_dev_check_failure+0x4b0/0x5b0
> > > > > [  340.316689] [c0002004287b7b20] [c0000000000b3ae8] pnv_pci_read_config+0x138/0x170
> > > > > [  340.316692] [c0002004287b7b70] [c0000000006c7e14] pci_user_read_config_byte+0x84/0x160
> > > > > [  340.316693] [c0002004287b7bc0] [c0000000006df1fc] pci_read_config+0x12c/0x2d0
> > > > > [  340.316696] [c0002004287b7c50] [c0000000004bceb4] sysfs_kf_bin_read+0x94/0xf0
> > > > > [  340.316698] [c0002004287b7c90] [c0000000004bbd30] kernfs_fop_read+0x130/0x2a0
> > > > > [  340.316699] [c0002004287b7ce0] [c0000000003e721c] __vfs_read+0x6c/0x1e0
> > > > > [  340.316701] [c0002004287b7d80] [c0000000003e744c] vfs_read+0xbc/0x1b0
> > > > > [  340.316703] [c0002004287b7dd0] [c0000000003e7ed4] SyS_pread64+0xc4/0x120
> > > > > [  340.316705] [c0002004287b7e30] [c00000000000b8e0] system_call+0x58/0x6c
> > > > > [  340.316734] EEH: Detected PCI bus error on PHB#34-PE#0
> > > > > [  340.316739] EEH: This PCI device has failed 1 times in the last hour
> > > > > [  340.316739] EEH: Notify device drivers to shutdown
> > > > > [  340.316746] EEH: Collect temporary log
> > > > > [  340.316780] EEH: of node=0034:01:00.0
> > > > > [  340.316783] EEH: PCI device/vendor: 00d11000
> > > > > [  340.316786] EEH: PCI cmd/status register: 00100146
> > > > > [  340.316787] EEH: PCI-E capabilities and status follow:
> > > > > [  340.316802] EEH: PCI-E 00: 0002b010 112c8023 00002950 00437d03
> > > > > [  340.316813] EEH: PCI-E 10: 10830000 00000000 00000000 00000000
> > > > > [  340.316815] EEH: PCI-E 20: 00000000
> > > > > [  340.316816] EEH: PCI-E AER capability register set follows:
> > > > > [  340.316828] EEH: PCI-E AER 00: 14820001 00000000 00400000 00462030
> > > > > [  340.316839] EEH: PCI-E AER 10: 00000000 0000e000 000001e0 00000000
> > > > > [  340.316849] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
> > > > > [  340.316853] EEH: PCI-E AER 30: 00000000 00000000
> > > > > [  340.316855] PHB4 PHB#52 Diag-data (Version: 1)
> > > > > [  340.316855] brdgCtl:    00000002
> > > > > [  340.316857] RootSts:    00000040 00402000 a0830008 00100107 00000000
> > > > > [  340.316859] PhbSts:     0000001c00000000 0000001c00000000
> > > > > [  340.316860] Lem:        0000000010000000 0000000000000000 0000000010000000
> > > > > [  340.316862] PhbErr:     0000080000000000 0000080000000000 2148000098000240 a008400000000000
> > > > > [  340.316864] RxeArbErr:  0000000800000000 0000000800000000 7f1a01000000001b 0000000081166000
> > > > > [  340.316865] RegbErr:    0040000000000000 0000000000000000 a2000a4018000000 1800000000000000
> > > > > [  340.316868] PE[000] A/B: 8000802301000000 8000000081166000
> > > > > [  340.316871] PE[100] A/B: 80000000ff275c00 80000000300d088b
> > > > > [  340.316872] EEH: Reset with hotplug activity
> > > > > [  340.316898] iommu: Removing device 0034:01:00.0 from group 4
> > > > > 
> > > > > 
> > > > > Any idea why this is happening?  My suspicion is that there is iommu hardware that is not allowing my adapter to access this memory, but I am not familiar with the power9 architecture.  Is there a way to disable the iommu completely or kernel functions to call to give my adapter "permission" to DMA to a memory range?
> > > > > 
> > > > > Thanks,
> > > > > Brian V.
> > > > > 
> > > > > 
> > > > > 
> > > > > _______________________________________________
> > > > > Linuxppc-users mailing list
> > > > > Linuxppc-users at lists.ozlabs.org
> > > > > https://lists.ozlabs.org/listinfo/linuxppc-users
> > > > > 
> > > > 
> > > > 
> > 
> 
> 


