[PATCH] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()

Sat Sep 25 10:10:03 AEST 2021

Michael Ellerman <mpe at ellerman.id.au> writes:
> Srikar Dronamraju <srikar at linux.vnet.ibm.com> writes:
>> * Michael Ellerman <mpe at ellerman.id.au> [2021-09-23 17:29:32]:
>>
>>> Nathan Lynch <nathanl at linux.ibm.com> writes:
>>> > Srikar Dronamraju <srikar at linux.vnet.ibm.com> writes:
>>> >
>>> >> * Nathan Lynch <nathanl at linux.ibm.com> [2021-09-22 11:01:12]:
>>> >>
>>> >>> Srikar Dronamraju <srikar at linux.vnet.ibm.com> writes:
> ...
>>> >> Or can I understand how debug_smp_processor_id() is useful if
>>> >> __smp_processor_id() is defined as raw_smp_processor_id()?
>>> 
>>> debug_smp_processor_id() is useful on powerpc, as well as other arches,
>>> because it checks that we're in a context where the processor id won't
>>> change out from under us.
>>> 
>>> eg. something like this is unsafe:
>>> 
>>>   int counts[NR_CPUS];
>>>   int tmp, cpu;
>>>   
>>>   cpu = smp_processor_id();
>>>   tmp = counts[cpu];
>>>   				<- preempted here and migrated to another CPU
>>>   counts[cpu] = tmp + 1;
>>> 
>>
>> If lets say the above call was replaced by raw_smp_processor_id(), how would
>> it avoid the preemption / migration to another CPU?
>>
>> Replacing it with raw_smp_processor_id() may avoid, the debug splat but the
>> underlying issue would still remain as is. No?
>
> Correct.
>
> Using raw_smp_processor_id() is generally the wrong answer. For this
> example the correct fix is to disable preemption around the code, eg:
>
>    int counts[NR_CPUS];
>    int tmp, cpu;
>    
>    preempt_disable()
>
>    cpu = smp_processor_id();
>    tmp = counts[cpu];
>    counts[cpu] = tmp + 1;
>
>    preempt_enable();
>
>
> For the original issue I think it is OK to use raw_smp_processor_id(),
> because we're already doing a racy check of whether another CPU has been
> preempted by the hypervisor.
>
>         if (!is_kvm_guest()) {
>                 int first_cpu = cpu_first_thread_sibling(smp_processor_id());
>
>                 if (cpu_first_thread_sibling(cpu) == first_cpu)
>                         return false;
>         }
>
> We could disable preemption around that, eg:
>
>         if (!is_kvm_guest()) {
>                 int first_cpu;
>                 bool is_sibling;
>
>                 preempt_disable();
>                 first_cpu = cpu_first_thread_sibling(smp_processor_id());
>                 is_sibling = (cpu_first_thread_sibling(cpu) == first_cpu)
>                 preempt_enable();
>
>                 // Can be preempted here
>
>                 if (is_sibling)
>                     return false;
>         }
>
> But before we return we could be preempted, and then is_sibling is no
> longer necessarily correct. So that doesn't really gain us anything.
>
> The only way to make that result stable is to disable preemption in the
> caller, but most callers don't want to AFAICS, because they know they're
> doing a racy check to begin with.

I'll add that one way I think about this is that when I choose
smp_processor_id(), I am making a claim about the context in which it is
used, and when I use raw_smp_processor_id() I am making a different
claim.

smp_processor_id() => I am sampling the CPU index in a critical section
where the result equals the CPU that executes the entire critical
section, and I am relying on that property for the section's
correctness. This claim is verified by debug_smp_processor_id() when
DEBUG_PREEMPT=y.

raw_smp_processor_id() => I am sampling the CPU index and using the
result in a way that is safe even if it differs from the CPU(s) on which
the surrounding code actually executes.

This framing doesn't cover all situations, but I've found it to be
generally useful.