[PATCH] powerpc: Fix data corruption on IPI

Timothy Pearson tpearson at raptorengineering.com
Tue Nov 14 19:03:40 AEDT 2023



----- Original Message -----
> From: "Salvatore Bonaccorso" <carnil at debian.org>
> To: "Timothy Pearson" <tpearson at raptorengineering.com>
> Cc: "Linuxppc-dev" <linuxppc-dev-bounces+tpearson=raptorengineering.com at lists.ozlabs.org>, "Jens Axboe"
> <axboe at kernel.dk>, "regressions" <regressions at lists.linux.dev>, "Michael Ellerman" <mpe at ellerman.id.au>, "npiggin"
> <npiggin at gmail.com>, "christophe leroy" <christophe.leroy at csgroup.eu>
> Sent: Tuesday, November 14, 2023 1:59:14 AM
> Subject: Re: [PATCH] powerpc: Fix data corruption on IPI

> On Mon, Nov 13, 2023 at 10:56:09PM -0600, Timothy Pearson wrote:
>> >From 0b2678b7cdada1a3d9aec8626f31a988d81373fa Mon Sep 17 00:00:00 2001
>> From: Timothy Pearson <tpearson at raptorengineering.com>
>> Date: Mon, 13 Nov 2023 22:42:58 -0600
>> Subject: [PATCH] powerpc: Fix data corruption on IPI
>> 
>> On multithreaded SMP workloads such as those using io_uring, it is possible for
>> multiple threads to hold an inconsistent view of system memory when an IPI is
>> issued.  This in turn leads to userspace memory corruption with varying degrees
>> of probability based on workload and inter-thread timing.
>> 
>> io_uring provokes this bug by its use of TWA_SIGNAL during thread creation,
>> which is especially noticeable as significant userspace data corruption with
>> certain workloads such as MariaDB (bug MDEV-30728).  While using
>> TWA_SIGNAL_NO_IPI works around the corruption, no other architecture requires
>> this workaround.
>> 
>> Issue an lwsync barrier instruction prior to sending the IPI.  This ensures
>> the receiving CPU has a consistent view of system memory, in line with other
>> architectures.
>> 
>> Tested under QEMU in kvm mode, running on a Talos II workstation with dual
>> POWER9 DD2.2 CPUs.
>> 
>> Tested-by: Timothy Pearson <tpearson at raptorengineering.com>
>> Signed-off-by: Timothy Pearson <tpearson at raptorengineering.com>
>> ---
>>  arch/powerpc/kernel/smp.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
>> index ab691c89d787..ba42238de518 100644
>> --- a/arch/powerpc/kernel/smp.c
>> +++ b/arch/powerpc/kernel/smp.c
>> @@ -369,8 +369,10 @@ static inline void do_message_pass(int cpu, int msg)
>>  
>>  void arch_smp_send_reschedule(int cpu)
>>  {
>> -	if (likely(smp_ops))
>> +	if (likely(smp_ops)) {
>> +		__smp_lwsync();
>>  		do_message_pass(cpu, PPC_MSG_RESCHEDULE);
>> +	}
>>  }
>>  EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
> 
> Once this is accepted in mainline, can you ensure that it get
> backported to the needed relevant stable series? (Should it be CC'ed
> as well for stable@?).

Absolutely!  We've been blocked on kernel upgrades for production database systems for a while due to this particular bug.

> For context, and maybe worth adding a Link: reference as well this is
> hit in Debian in https://bugs.debian.org/1032104

Sounds good.  If anyone here needs a v2 with that line added just let me know.

Thanks!


More information about the Linuxppc-dev mailing list