[PATCH] powerpc: Fix data corruption on IPI
Timothy Pearson
tpearson at raptorengineering.com
Tue Nov 14 19:03:40 AEDT 2023
----- Original Message -----
> From: "Salvatore Bonaccorso" <carnil at debian.org>
> To: "Timothy Pearson" <tpearson at raptorengineering.com>
> Cc: "Linuxppc-dev" <linuxppc-dev-bounces+tpearson=raptorengineering.com at lists.ozlabs.org>, "Jens Axboe"
> <axboe at kernel.dk>, "regressions" <regressions at lists.linux.dev>, "Michael Ellerman" <mpe at ellerman.id.au>, "npiggin"
> <npiggin at gmail.com>, "christophe leroy" <christophe.leroy at csgroup.eu>
> Sent: Tuesday, November 14, 2023 1:59:14 AM
> Subject: Re: [PATCH] powerpc: Fix data corruption on IPI
> On Mon, Nov 13, 2023 at 10:56:09PM -0600, Timothy Pearson wrote:
>> >From 0b2678b7cdada1a3d9aec8626f31a988d81373fa Mon Sep 17 00:00:00 2001
>> From: Timothy Pearson <tpearson at raptorengineering.com>
>> Date: Mon, 13 Nov 2023 22:42:58 -0600
>> Subject: [PATCH] powerpc: Fix data corruption on IPI
>>
>> On multithreaded SMP workloads such as those using io_uring, it is possible for
>> multiple threads to hold an inconsistent view of system memory when an IPI is
>> issued. This in turn leads to userspace memory corruption with varying degrees
>> of probability based on workload and inter-thread timing.
>>
>> io_uring provokes this bug by its use of TWA_SIGNAL during thread creation,
>> which is especially noticeable as significant userspace data corruption with
>> certain workloads such as MariaDB (bug MDEV-30728). While using
>> TWA_SIGNAL_NO_IPI works around the corruption, no other architecture requires
>> this workaround.
>>
>> Issue an lwsync barrier instruction prior to sending the IPI. This ensures
>> the receiving CPU has a consistent view of system memory, in line with other
>> architectures.
>>
>> Tested under QEMU in kvm mode, running on a Talos II workstation with dual
>> POWER9 DD2.2 CPUs.
>>
>> Tested-by: Timothy Pearson <tpearson at raptorengineering.com>
>> Signed-off-by: Timothy Pearson <tpearson at raptorengineering.com>
>> ---
>> arch/powerpc/kernel/smp.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
>> index ab691c89d787..ba42238de518 100644
>> --- a/arch/powerpc/kernel/smp.c
>> +++ b/arch/powerpc/kernel/smp.c
>> @@ -369,8 +369,10 @@ static inline void do_message_pass(int cpu, int msg)
>>
>> void arch_smp_send_reschedule(int cpu)
>> {
>> - if (likely(smp_ops))
>> + if (likely(smp_ops)) {
>> + __smp_lwsync();
>> do_message_pass(cpu, PPC_MSG_RESCHEDULE);
>> + }
>> }
>> EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
>
> Once this is accepted in mainline, can you ensure that it get
> backported to the needed relevant stable series? (Should it be CC'ed
> as well for stable@?).
Absolutely! We've been blocked on kernel upgrades for production database systems for a while due to this particular bug.
> For context, and maybe worth adding a Link: reference as well this is
> hit in Debian in https://bugs.debian.org/1032104
Sounds good. If anyone here needs a v2 with that line added just let me know.
Thanks!
More information about the Linuxppc-dev
mailing list