[PATCH] crypto: caam/jr - Remove extra memory barrier during job ring dequeue

Horia Geanta horia.geanta at nxp.com
Thu May 2 21:08:55 AEST 2019


On 5/1/2019 8:49 AM, Michael Ellerman wrote:
> Vakul Garg wrote:
>> In function caam_jr_dequeue(), a full memory barrier is used before
>> writing response job ring's register to signal removal of the completed
>> job. Therefore for writing the register, we do not need another write
>> memory barrier. Hence it is removed by replacing the call to wr_reg32()
>> with a newly defined function wr_reg32_relaxed().
>>
>> Signed-off-by: Vakul Garg <vakul.garg at nxp.com>
>> ---
>>  drivers/crypto/caam/jr.c   | 2 +-
>>  drivers/crypto/caam/regs.h | 8 ++++++++
>>  2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
>> index 4e9b3fca5627..2ce6d7d2ad72 100644
>> --- a/drivers/crypto/caam/jr.c
>> +++ b/drivers/crypto/caam/jr.c
>> @@ -266,7 +266,7 @@ static void caam_jr_dequeue(unsigned long devarg)
>>  		mb();
>>  
>>  		/* set done */
>> -		wr_reg32(&jrp->rregs->outring_rmvd, 1);
>> +		wr_reg32_relaxed(&jrp->rregs->outring_rmvd, 1);
>>  
>>  		jrp->out_ring_read_index = (jrp->out_ring_read_index + 1) &
>>  					   (JOBR_DEPTH - 1);
>> diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
>> index 3cd0822ea819..9e912c722e33 100644
>> --- a/drivers/crypto/caam/regs.h
>> +++ b/drivers/crypto/caam/regs.h
>> @@ -96,6 +96,14 @@ cpu_to_caam(16)
>>  cpu_to_caam(32)
>>  cpu_to_caam(64)
>>  
>> +static inline void wr_reg32_relaxed(void __iomem *reg, u32 data)
>> +{
>> +	if (caam_little_end)
>> +		writel_relaxed(data, reg);
>> +	else
>> +		writel_relaxed(cpu_to_be32(data), reg);
>> +}
When both core (PPC) and crypto engine (caam) are big endian, data ends up being
swapped - which is incorrect:
writel_relaxed -> writel -> __do_writel -> out_le32 -> swap
cpu_to_be32(data) -> data

>> +
>>  static inline void wr_reg32(void __iomem *reg, u32 data)
>>  {
>>  	if (caam_little_end)
> 
> This crashes on my p5020ds. Did you test on powerpc?
> 
> # first bad commit: [bbfcac5ff5f26aafa51935a62eb86b6eacfe8a49] crypto: caam/jr - Remove extra memory barrier during job ring dequeue

Thanks for the report Michael.

Any hint what would be the proper approach here - to have relaxed I/O accessors
that would work both for ARM and PPC, and avoid ifdeffery etc.?

For non-relaxed version, we used iowriteXX and iowriteXXbe - which work fine on
ARM and PPC, covering all the endianness combinations (core + crypto engine):

static inline void wr_reg32(void __iomem *reg, u32 data)
{
        if (caam_little_end)
                iowrite32(data, reg);
        else
                iowrite32be(data, reg);
}

Thanks,
Horia


More information about the Linuxppc-dev mailing list