[PATCH v5 21/23] powerpc: Simplify test in __dma_sync()

Christophe Leroy christophe.leroy at c-s.fr
Fri Feb 5 18:56:51 AEDT 2016



On 05/02/2016 08:52, Denis Kirjanov wrote:
> On 2/4/16, Christophe Leroy <christophe.leroy at c-s.fr> wrote:
>>
>> On 04/02/2016 12:37, Denis Kirjanov wrote:
>>> On 2/4/16, Christophe Leroy <christophe.leroy at c-s.fr> wrote:
>>>> This simplification helps the compiler. We now have only one test
>>>> instead of two, so it reduces the number of branches.
>>>>
>>>> Signed-off-by: Christophe Leroy <christophe.leroy at c-s.fr>
>>>> ---
>>>> v2: new
>>>> v3: no change
>>>> v4: no change
>>>> v5: no change
>>>>
>>>>    arch/powerpc/mm/dma-noncoherent.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
>>>> index 169aba4..2dc74e5 100644
>>>> --- a/arch/powerpc/mm/dma-noncoherent.c
>>>> +++ b/arch/powerpc/mm/dma-noncoherent.c
>>>> @@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
>>>>    		 * invalidate only when cache-line aligned otherwise there is
>>>>    		 * the potential for discarding uncommitted data from the cache
>>>>    		 */
>>>> -		if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
>>>> +		if ((start | end) & (L1_CACHE_BYTES - 1))
>>>>    			flush_dcache_range(start, end);
>>>>    		else
>>>>    			invalidate_dcache_range(start, end);
>>> The previous version of the cache-line alignment check reads perfectly
>>> fine.
>>> What's the benefit of this micro-optimization?
>> With this optimisation we avoid one unnecessary test and the two
>> associated jumps. Given that __dma_sync() is one of the top ten CPU
>> consumers, I believe it is worth it:
>>
>>
> Yeah, looks better. Did you compile the kernel with default compiler flags?
>
> Thanks!
Yes, I did.

Christophe
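
For readers checking that the single test is equivalent to the two it replaces, here is a minimal user-space sketch. It assumes that end is computed as start + size inside __dma_sync(), which the quoted hunk does not show, and it uses an illustrative L1_CACHE_BYTES of 32 (the real value depends on the platform). The helper names needs_flush_old(), needs_flush_new() and count_mismatches() are hypothetical and exist only for this comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration only; the kernel defines it per platform. */
#define L1_CACHE_BYTES 32UL

/* Original test: start or size is not a multiple of the cache line size. */
static bool needs_flush_old(unsigned long start, size_t size)
{
	return (start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1));
}

/* Patched test: a single mask over start | end, assuming end = start + size. */
static bool needs_flush_new(unsigned long start, size_t size)
{
	unsigned long end = start + size;

	return ((start | end) & (L1_CACHE_BYTES - 1)) != 0;
}

/* Count the (start, size) pairs for which the two tests disagree. */
static unsigned long count_mismatches(unsigned long max_start, size_t max_size)
{
	unsigned long mismatches = 0;

	for (unsigned long start = 0; start < max_start; start++)
		for (size_t size = 0; size < max_size; size++)
			if (needs_flush_old(start, size) != needs_flush_new(start, size))
				mismatches++;

	return mismatches;
}
```

The two forms agree because an unaligned start makes both tests true, and when start is aligned the low bits of end equal the low bits of size. Calling count_mismatches() over a range spanning several cache lines should therefore return 0 under the stated assumption.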


