[PATCH] powerpc/dma: Fix invalid DMA mmap behavior

Michael Ellerman mpe at ellerman.id.au
Tue Jul 23 09:09:04 AEST 2019


Shawn Anastasio <shawn at anastas.io> writes:
> On 7/22/19 7:16 AM, Michael Ellerman wrote:
>> Arnd Bergmann <arnd at arndb.de> writes:
>>> On Thu, Jul 18, 2019 at 11:52 AM Christoph Hellwig <hch at lst.de> wrote:
>>>> On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:
>>>>> On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:
>>>>>>> Other than m68k, mips, and arm64, everybody else that doesn't have
>>>>>>> ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
>>>>>>> I assume this behavior is acceptable on those architectures.
>>>>>>
>>>>>> It might be acceptable, but there's no reason to use pgport_noncached
>>>>>> if the platform supports cache-coherent DMA.
>>>>>>
>>>>>> Christoph (+cc) made the change so maybe he saw something we're missing.
>>>>>
>>>>> I always found the forcing of noncached access even for coherent
>>>>> devices a little odd, but this was inherited from the previous
>>>>> implementation, which surprised me a bit as the different attributes
>>>>> are usually problematic even on x86.  Let me dig into the history a
>>>>> bit more, but I suspect the righ fix is to default to cached mappings
>>>>> for coherent devices.
>>>>
>>>> Ok, some history:
>>>>
>>>> The generic dma mmap implementation, which we are effectively still
>>>> using today was added by:
>>>>
>>>> commit 64ccc9c033c6089b2d426dad3c56477ab066c999
>>>> Author: Marek Szyprowski <m.szyprowski at samsung.com>
>>>> Date:   Thu Jun 14 13:03:04 2012 +0200
>>>>
>>>>      common: dma-mapping: add support for generic dma_mmap_* calls
>>>>
>>>> and unconditionally uses pgprot_noncached in dma_common_mmap, which is
>>>> then used as the fallback by dma_mmap_attrs if no ->mmap method is
>>>> present.  At that point we already had the powerpc implementation
>>>> that only uses pgprot_noncached for non-coherent mappings, and
>>>> the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
>>>> is set and otherwise pgprot_dmacoherent, which seems to be uncached.
>>>> Arm did support coherent platforms at that time, but they might have
>>>> been an afterthought and not handled properly.
>>>
>>> Cache-coherent devices are still very rare on 32-bit ARM.
>>>
>>> Among the callers of dma_mmap_coherent(), almost all are in platform
>>> specific device drivers that only ever run on noncoherent ARM SoCs,
>>> which explains why nobody would have noticed problems.
>>>
>>> There is also a difference in behavior between ARM and PowerPC
>>> when dealing with mismatched cacheability attributes: If the same
>>> page is mapped as both cached and uncached to, this may
>>> cause silent undefined behavior on ARM, while PowerPC should
>>> enter a checkstop as soon as it notices.
>> 
>> On newer Power CPUs it's actually more like the ARM behaviour.
>> 
>> I don't know for sure that it will *never* checkstop but there are at
>> least cases where it won't. There's some (not much) detail in the
>> Power8/9 user manuals.
>
> The issue was discovered due to sporadic checkstops on P9, so it
> seems like it will happen at least sometimes.

Yeah true. I wasn't sure if that checkstop was actually caused by a
cached/uncached mismatch or something else, but looks like it was, from
the hostboot issue (https://github.com/open-power/hostboot/issues/180):

  12.47015|  Signature Description    : pu.ex:k0:n0:s0:p00:c0 (L2FIR[16]) Cache line inhibited hit cacheable space


So I'm not really sure how to square that with the documentation in the
user manual:

  If a caching-inhibited load instruction hits in the L1 data cache, the
  load data is serviced from the L1 data cache and no request is sent to
  the NCU.
  If a caching-inhibited store instruction hits in the L1 data cache,
  the store data is written to the L1 data cache and sent to the NCU.
  Note that the L1 data cache and L2 cache are no longer coherent.

I guess I'm either misinterpreting that section or there's some *other*
documentation somewhere I haven't found that says that it will also
checkstop.

cheers


More information about the Linuxppc-dev mailing list