AW: PowerPC PCI DMA issues (prefetch/coherency?)

Mikhail Zolotaryov lebon at lebon.org.ua
Thu Sep 10 00:12:58 EST 2009


Hi Tom,

A possible solution could be to use a tasklet to perform the DMA-related
work (this makes sense, as in most cases the DMA transfer is interrupt driven).
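
A rough userspace sketch of that split, in case it helps. This is a model and
not kernel code: struct dma_job, struct deferred_work and all function names
below are hypothetical, and the "sync" step is only a placeholder flag. In a
real driver the kernel's own tasklet_init() / tasklet_schedule() would take
the place of the hand-rolled pending flag.

```c
/* Hypothetical per-transfer state; not taken from any real driver. */
struct dma_job {
    int irq_seen;   /* set by the "interrupt handler" */
    int synced;     /* set once the cache sync step has run */
    int parsed;     /* set once the data has been consumed */
};

/* Minimal stand-in for a tasklet: a function, its argument, a pending flag. */
struct deferred_work {
    void (*fn)(struct dma_job *job);
    struct dma_job *job;
    int pending;
};

/* Bottom half: the DMA-related work happens here, outside the IRQ path. */
static void dma_bottom_half(struct dma_job *job)
{
    job->synced = 1;            /* placeholder for the DMA sync call */
    job->parsed = job->synced;  /* parse only after the sync has run */
}

/* Top half: note the interrupt and schedule the bottom half, nothing else. */
static void dma_irq_handler(struct deferred_work *work)
{
    work->job->irq_seen = 1;
    work->pending = 1;
}

/* Run the deferred function if it was scheduled; returns 1 if it ran. */
static int run_pending(struct deferred_work *work)
{
    if (!work->pending)
        return 0;
    work->pending = 0;
    work->fn(work->job);
    return 1;
}
```

The point of the model is only the ordering: the handler does no buffer work
itself, and the buffer is synced and parsed later in one place.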


Tom Burns wrote:
> Hi,
>
> With the default config for the Sequoia board on 2.6.24, calling 
> pci_dma_sync_sg_for_cpu() results in __dma_sync() executing
> invalidate_dcache_range() in arch/ppc/kernel/misc.S.  This OOPses on
> PPC440 since it directly executes the dcbi assembly instruction, which
> can only be executed in supervisor mode.  We tried that before
> resorting to manual cache line management with usermode-safe assembly
> calls.
>
> Regards,
> Tom Burns
> International Datacasting Corporation
>
> Mikhail Zolotaryov wrote:
>> Hi,
>>
>> Why manage cache lines manually if the appropriate code is already part
>> of __dma_sync / dma_sync_single_for_device in the DMA API? (This implies
>> CONFIG_NOT_COHERENT_CACHE is enabled, as is the default for the Sequoia
>> board.)
>>
>> Prodyut Hazarika wrote:
>>> Hi Adam,
>>>
>>>  
>>>> Yes, I am using the 440EPx (same as the sequoia board). Our
>>>> ideDriver is DMA'ing blocks of 192-byte data over the PCI bus (using
>>>> the Sil0680A PCI-IDE bridge). Most of the DMA's (depending on timing)
>>>> end up being partially corrupted when we try to parse the data in the
>>>> virtual page. We have confirmed the data is good before the PCI-IDE
>>>> bridge. We are creating two 8K pages and map them to physical DMA
>>>> memory using single-entry scatter/gather structs. When a DMA block is
>>>> corrupted, we see a random portion of it (always a multiple of 16byte
>>>> cache lines) is overwritten with old data from the last time the
>>>> buffer was used.
>>>
>>> This looks like a cache coherency problem.
>>> Can you ensure that the TLB entries corresponding to the DMA region have
>>> the CacheInhibit bit set?
>>> You will need a BDI connected to your system.
>>>
>>> Also, you will need to invalidate and flush the lines appropriately,
>>> since on 440 cores L1 cache coherency is managed entirely by software.
>>> Please look at drivers/net/ibm_newemac/mal.c and core.c for an example
>>> of how to do it.
>>>
>>> Thanks
>>> Prodyut
>>>
>>> On Thu, 2009-09-03 at 13:27 -0700, Prodyut Hazarika wrote:
>>>  
>>>> Hi Adam,
>>>>
>>>>   
>>>>> Are you sure there is L2 cache on the 440?
>>>>>       
>>>> It depends on the SoC you are using. SoCs like the 460EX (Canyonlands
>>>> board) have L2Cache.
>>>> It seems you are using a Sequoia board, which has a 440EPx SoC. 440EPx
>>>> has a 440 cpu core, but no L2Cache.
>>>> Could you please tell me which SoC you are using?
>>>> You can also refer to the appropriate dts file to see if there is L2C.
>>>> For example, in canyonlands.dts (460EX based board), we have the L2C
>>>> entry.
>>>>         L2C0: l2c {
>>>>               ...
>>>>         }
>>>>
>>>>   
>>>>> I am seeing this problem with our custom IDE driver which is based on
>>>>> pretty old code. Our driver uses pci_alloc_consistent() to allocate
>>>>> the physical DMA memory and alloc_pages() to allocate a virtual page.
>>>>> It then uses pci_map_sg() to map to a scatter/gather buffer.
>>>>> Perhaps I should convert these to the DMA API calls as you suggest.
>>>>>       
>>>> Could you give more details on the consistency problem? It is a good
>>>> idea to change to the new DMA APIs, but pci_alloc_consistent() should
>>>> work too.
>>>>
>>>> Thanks
>>>> Prodyut  
>>>> On Thu, 2009-09-03 at 19:57 +1000, Benjamin Herrenschmidt wrote:
>>>>   
>>>>> On Thu, 2009-09-03 at 09:05 +0100, Chris Pringle wrote:
>>>>>     
>>>>>> Hi Adam,
>>>>>>
>>>>>> If you have a look in include/asm-ppc/pgtable.h for the following
>>>>>> section:
>>>>>>
>>>>>> #ifdef CONFIG_44x
>>>>>> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_GUARDED)
>>>>>> #else
>>>>>> #define _PAGE_BASE    (_PAGE_PRESENT | _PAGE_ACCESSED)
>>>>>> #endif
>>>>>>
>>>>>> Try adding _PAGE_COHERENT to the appropriate line above and see if
>>>>>> that fixes your issue - this causes the 'M' bit to be set on the
>>>>>> page, which should enforce cache coherency. If it doesn't, you'll
>>>>>> need to check that the 'M' bit isn't being masked out in head_44x.S
>>>>>> (it was originally masked out on arch/powerpc, but was fixed in later
>>>>>> kernels when the cache coherency issues with non-SMP systems were
>>>>>> resolved).
>>>>>>         
>>>>> I have some doubts about the usefulness of doing that for 4xx. AFAIK,
>>>>> the 440 core just ignores M.
>>>>>
>>>>> The problem lies probably elsewhere. Maybe the L2 cache coherency
>>>>> isn't enabled or not working ?
>>>>>
>>>>> The L1 cache on 440 is simply not coherent, so drivers have to make
>>>>> sure they use the appropriate DMA APIs which will do cache flushing
>>>>> when needed.
>>>>>
>>>>> Adam, what driver is causing you that sort of problems ?
>>>>>
>>>>> Cheers,
>>>>> Ben.
>>>>>
>>>>>
>>>>>       
>>
>>
>
>
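
For anyone following the quoted thread, the stale-data symptom (whole cache
lines showing old contents after the buffer is reused) can be reproduced in a
small userspace model of a cache that does not snoop device writes. This is
only an illustration under stated assumptions: every name below is
hypothetical, LINE_SIZE is a model parameter rather than a statement about the
440, and invalidate_range() merely stands in for what the DMA sync calls are
expected to do on a non-coherent platform.

```c
#include <stddef.h>
#include <string.h>

#define LINE_SIZE 32                        /* model parameter (assumption) */
#define NUM_LINES 6
#define BUF_SIZE  (LINE_SIZE * NUM_LINES)   /* 192 bytes, as in the thread */

static unsigned char ram[BUF_SIZE];     /* memory the device writes into */
static unsigned char cache[BUF_SIZE];   /* CPU-side copies of cached lines */
static int line_valid[NUM_LINES];       /* 1 if the line is held in the cache */

/* CPU read: a hit returns the cached copy, a miss fills the line from RAM. */
static unsigned char cpu_read(size_t off)
{
    size_t line = off / LINE_SIZE;

    if (!line_valid[line]) {
        memcpy(&cache[line * LINE_SIZE], &ram[line * LINE_SIZE], LINE_SIZE);
        line_valid[line] = 1;
    }
    return cache[off];
}

/* Device write: goes straight to RAM, the cache is not updated. */
static void dma_write(unsigned char fill)
{
    memset(ram, fill, BUF_SIZE);
}

/* Software coherency step: drop every cached line covering the range. */
static void invalidate_range(size_t off, size_t len)
{
    if (len == 0)
        return;
    for (size_t line = off / LINE_SIZE; line <= (off + len - 1) / LINE_SIZE; line++)
        line_valid[line] = 0;
}

/* Number of bytes the CPU sees that differ from what the device wrote. */
static size_t count_stale(unsigned char expected)
{
    size_t n = 0;

    for (size_t i = 0; i < BUF_SIZE; i++)
        if (cpu_read(i) != expected)
            n++;
    return n;
}

/* Reuse the buffer for two transfers, with or without the invalidate. */
static size_t run_demo(int do_invalidate)
{
    memset(line_valid, 0, sizeof line_valid);
    dma_write(0xAA);                    /* first transfer */
    cpu_read(1 * LINE_SIZE);            /* CPU touches line 1 ... */
    cpu_read(4 * LINE_SIZE);            /* ... and line 4 */
    dma_write(0xBB);                    /* second transfer, same buffer */
    if (do_invalidate)
        invalidate_range(0, BUF_SIZE);
    return count_stale(0xBB);
}
```

In this model run_demo(0) reports two full lines of stale bytes, and
run_demo(1) reports none, which matches the pattern of corruption in whole
cache-line multiples described in the thread.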


More information about the Linuxppc-dev mailing list