[PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries

Darren Hart dvhltc at us.ibm.com
Thu Sep 2 01:10:04 EST 2010

On 08/31/2010 10:54 PM, Michael Ellerman wrote:
> On Tue, 2010-08-31 at 00:12 -0700, Darren Hart wrote:
> ..
>> When running with the function plugin I had to stop the trace
>> immediately before entering start_secondary after an online or my traces
>> would not include the pseries_mach_cpu_die function, nor the tracing I
>> added there (possibly buffer size, I am using 2048). The following trace
>> was collected using "trace-cmd record -p function -e irq -e sched" and
>> has been filtered to only show CPU [001] (the CPU undergoing the
>> offline/online test, and the one seeing preempt_count (pcnt) go to
>> ffffffff after cede). The function tracer does not indicate anything
>> running on the CPU other than the HCALL - unless the __trace_hcall*
>> commands might be to blame. 
> It's not impossible. Though normally they're disabled right, so the only
> reason they're running is because you're tracing. So if they are causing
> the bug then that doesn't explain why you see it normally.
> Still, might be worth disabling just the hcall tracepoints just to be
> 100% sure.

A couple of updates. I was working from tip/rt/head, which has been
stale for some months now. I switched to tip/rt/2.6.33 and the
preempt_count() change over cede went away. I now see the live hang that
Will has reported.

Before I dive into the live hang, I want to understand what fixed the
preempt_count() change. That might start pointing us in the right
direction for the live hang.

I did an inverted git bisect (hunting for the commit that fixed the
problem rather than the one that introduced it) between tip/rt/head and
tip/rt/2.6.33. I've narrowed it down to the merge:

# git show 7e1af1172bbd4109d09ac515c5d376f633da7cff
commit 7e1af1172bbd4109d09ac515c5d376f633da7cff
Merge: d8e94db 9666790
Author: Thomas Gleixner <tglx at linutronix.de>
Date:   Tue Jul 13 16:01:16 2010 +0200



    Signed-off-by: Thomas Gleixner <tglx at linutronix.de>
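
For anyone wanting to repeat this kind of search, an inverted bisect can
be sketched roughly as below. This is an illustration only, not the exact
commands used for the result above. It assumes git 2.7 or later for the
--term-old/--term-new options (with an older git, the same effect comes
from marking the fixed tree "bad" and the broken tree "good" by hand).
The helper name inverted_bisect and the test command passed to it are
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first commit at which a bug is FIXED.
#   $1   revision known to show the bug
#   $2   revision known not to show the bug
#   $3.. test command; it must exit 0 while the bug is still present
#        and exit 1 (or another code in 1..127 except 125) once it is gone
inverted_bisect() {
    broken_rev=$1
    fixed_rev=$2
    shift 2

    # Custom terms keep the inverted meaning readable:
    # "broken" plays the role of good/old, "fixed" the role of bad/new.
    git bisect start --term-old=broken --term-new=fixed >/dev/null || return 1
    git bisect broken "$broken_rev" >/dev/null || return 1
    git bisect fixed "$fixed_rev" >/dev/null || return 1

    # Let git drive the search using the test command's exit status.
    git bisect run "$@" >/dev/null
    status=$?

    # refs/bisect/fixed now names the earliest commit marked "fixed".
    [ "$status" -eq 0 ] && git rev-parse refs/bisect/fixed

    # Return to the revision that was checked out before the bisect.
    git bisect reset >/dev/null 2>&1
    return "$status"
}
```

A call would look something like
"inverted_bisect <broken-rev> <fixed-rev> ./check-for-bug.sh", where
check-for-bug.sh is a placeholder for whatever reproduces the problem.
When the answer is a merge, as it is here, the individual patches it
pulled in still have to be inspected by hand.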

Visual inspection yields two patches of interest:

powerpc/pseries: Make query-cpu-stopped callable outside hotplug cpu

powerpc/pseries: Only call start-cpu when a CPU is stopped

I'm going to try reverting these today to see whether the
preempt_count() change comes back, which would tell us whether they
addressed the issue indirectly.

Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
