[PATCH] crypto: caam/jr - Remove extra memory barrier during job ring dequeue
Horia Geanta
horia.geanta at nxp.com
Thu May 2 21:08:55 AEST 2019
On 5/1/2019 8:49 AM, Michael Ellerman wrote:
> Vakul Garg wrote:
>> In function caam_jr_dequeue(), a full memory barrier is used before
>> writing response job ring's register to signal removal of the completed
>> job. Therefore for writing the register, we do not need another write
>> memory barrier. Hence it is removed by replacing the call to wr_reg32()
>> with a newly defined function wr_reg32_relaxed().
>>
>> Signed-off-by: Vakul Garg <vakul.garg at nxp.com>
>> ---
>> drivers/crypto/caam/jr.c | 2 +-
>> drivers/crypto/caam/regs.h | 8 ++++++++
>> 2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
>> index 4e9b3fca5627..2ce6d7d2ad72 100644
>> --- a/drivers/crypto/caam/jr.c
>> +++ b/drivers/crypto/caam/jr.c
>> @@ -266,7 +266,7 @@ static void caam_jr_dequeue(unsigned long devarg)
>> mb();
>>
>> /* set done */
>> - wr_reg32(&jrp->rregs->outring_rmvd, 1);
>> + wr_reg32_relaxed(&jrp->rregs->outring_rmvd, 1);
>>
>> jrp->out_ring_read_index = (jrp->out_ring_read_index + 1) &
>> (JOBR_DEPTH - 1);
>> diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
>> index 3cd0822ea819..9e912c722e33 100644
>> --- a/drivers/crypto/caam/regs.h
>> +++ b/drivers/crypto/caam/regs.h
>> @@ -96,6 +96,14 @@ cpu_to_caam(16)
>> cpu_to_caam(32)
>> cpu_to_caam(64)
>>
>> +static inline void wr_reg32_relaxed(void __iomem *reg, u32 data)
>> +{
>> + if (caam_little_end)
>> + writel_relaxed(data, reg);
>> + else
>> + writel_relaxed(cpu_to_be32(data), reg);
>> +}
When both the core (PPC) and the crypto engine (CAAM) are big endian, data ends
up being byte-swapped, which is incorrect:
	writel_relaxed -> writel -> __do_writel -> out_le32 -> swap
	cpu_to_be32(data) -> data (no swap on a big-endian core)
>> +
>> static inline void wr_reg32(void __iomem *reg, u32 data)
>> {
>> if (caam_little_end)
>
> This crashes on my p5020ds. Did you test on powerpc?
>
> # first bad commit: [bbfcac5ff5f26aafa51935a62eb86b6eacfe8a49] crypto: caam/jr - Remove extra memory barrier during job ring dequeue
Thanks for the report, Michael.
Any hint on what the proper approach would be here, i.e. relaxed I/O accessors
that work on both ARM and PPC without ifdeffery etc.?
For the non-relaxed version, we used iowriteXX and iowriteXXbe, which work fine
on ARM and PPC and cover all the endianness combinations (core + crypto engine):
static inline void wr_reg32(void __iomem *reg, u32 data)
{
	if (caam_little_end)
		iowrite32(data, reg);
	else
		iowrite32be(data, reg);
}
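To make the difference concrete, here is a minimal userspace sketch of the
big-endian core + big-endian CAAM case. It is only a model, not kernel code:
every model_* name is hypothetical, the register is represented as four bus
bytes, and the sketch assumes writel_relaxed() on PPC ends up as a
little-endian store (out_le32), as in the call chain noted in the patch above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 32-bit device register as four bytes on the bus. */
struct model_reg {
	uint8_t b[4];
};

/* Models out_le32(): least significant byte goes to the lowest address. */
static inline void model_store_le32(struct model_reg *r, uint32_t v)
{
	r->b[0] = v & 0xff;
	r->b[1] = (v >> 8) & 0xff;
	r->b[2] = (v >> 16) & 0xff;
	r->b[3] = (v >> 24) & 0xff;
}

/* Models iowrite32be(): most significant byte goes to the lowest address. */
static inline void model_store_be32(struct model_reg *r, uint32_t v)
{
	r->b[0] = (v >> 24) & 0xff;
	r->b[1] = (v >> 16) & 0xff;
	r->b[2] = (v >> 8) & 0xff;
	r->b[3] = v & 0xff;
}

/* A big-endian device interprets the four bus bytes as a BE value. */
static inline uint32_t model_device_read_be32(const struct model_reg *r)
{
	return ((uint32_t)r->b[0] << 24) | ((uint32_t)r->b[1] << 16) |
	       ((uint32_t)r->b[2] << 8) | (uint32_t)r->b[3];
}

/* On a big-endian core, cpu_to_be32() does not swap anything. */
static inline uint32_t model_cpu_to_be32_on_be_core(uint32_t v)
{
	return v;
}

/* Models the patched path: writel_relaxed(cpu_to_be32(data), reg). */
static inline uint32_t model_wr_reg32_relaxed(uint32_t data)
{
	struct model_reg r;

	model_store_le32(&r, model_cpu_to_be32_on_be_core(data));
	return model_device_read_be32(&r);
}

/* Models the existing path: iowrite32be(data, reg). */
static inline uint32_t model_wr_reg32(uint32_t data)
{
	struct model_reg r;

	model_store_be32(&r, data);
	return model_device_read_be32(&r);
}

/* Writing 1 to outring_rmvd: what the big-endian device would see. */
static inline void model_self_check(void)
{
	assert(model_wr_reg32(1) == 1u);                  /* intended value */
	assert(model_wr_reg32_relaxed(1) == 0x01000000u); /* byte-swapped */
}
```

In this model the device sees 0x01000000 instead of 1 for the "jobs removed"
count, which would be consistent with the breakage reported on p5020ds,
although the sketch does not prove that this is the only cause.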
Thanks,
Horia