trace_hardirqs_on/off vs. extra stack frames
Benjamin Herrenschmidt
benh at kernel.crashing.org
Sat Dec 22 10:52:02 AEDT 2018
On Thu, 2018-12-20 at 21:02 -0500, Steven Rostedt wrote:
> On Fri, 21 Dec 2018 12:11:35 +1100
> Benjamin Herrenschmidt <benh at kernel.crashing.org> wrote:
>
> > Hi Steven !
> >
> > I'm trying to untangle something, and I need your help :-)
> >
> > In commit 3cb5f1a3e58c0bd70d47d9907cc5c65192281dee, you added a dummy
> > stack frame around the assembly calls to trace_hardirqs_on/off on the
> > grounds that when using the latency tracer (irqsoff), you might poke at
> > CALLER_ADDR1 and that could blow up if there's only one frame at hand.
> >
> > However, I can't see where it would be doing that. lockdep.c only uses
> > CALLER_ADDR0 and irqsoff uses the values passed by it. In fact, that
> > was already the case when the above commit was merged.
> >
> > I tried on a 32-bit kernel to remove the dummy stack frame with no
> > issue so far .... (though I do get stupid values reported with or
> > without a stack frame, but I think that's normal, looking into it).
>
> BTW, I only had a 64-bit PPC working, so I would have been testing that.
>
> > The reason I'm asking is that we have other code paths, on return
> > from interrupts for example, at least on 32-bit, where we call the
> > tracing without the extra stack frame, and I have yet to see it crash.
>
> Have you tried enabling the irqsoff tracer and running it for a while?
>
> echo irqsoff > /sys/kernel/debug/tracing/current_tracer
>
> The problem is that when we come from user space, and we disable
> interrupts in the entry code, it calls into the irqsoff tracer:
>
> [ in userspace ]
> <interrupt>
> [ in kernel ]
> bl .trace_hardirqs_off
>
> kernel/trace/trace_preemptirq.c:
>
> trace_hardirqs_off(CALLER_ADDR0, CALLER_ADDR1)
>
> IIRC, without the stack frame, that CALLER_ADDR1 can end up having the
> kernel read garbage.

You're right, I was looking at a tree that is too old, one where
trace_hardirqs_* is implemented in kernel/locking/lockdep.c and only
uses CALLER_ADDR0.
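
To illustrate why the extra frame matters, here is a minimal model in C of how a CALLER_ADDRn-style lookup walks the stack. It is only a sketch: `struct frame` and `model_caller_addr()` are hypothetical names invented for this example, the layout is not the real PPC ABI frame, and the model returns 0 at the point where real code would read whatever happens to be in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a stack frame: a back chain and a saved return address. */
struct frame {
    const struct frame *back;  /* caller's frame, NULL if no valid frame exists */
    uintptr_t ret;             /* return address recorded for this frame */
};

/*
 * Model of a CALLER_ADDRn lookup: follow `level` back-chain links, then
 * read the saved return address. Returns 0 when the chain runs out, which
 * stands in for the garbage read that the real lookup would perform.
 */
static uintptr_t model_caller_addr(const struct frame *fp, unsigned int level)
{
    assert(fp != NULL);        /* the starting frame must exist */

    while (fp && level--)
        fp = fp->back;

    return fp ? fp->ret : 0;
}
```

In this model, a lone entry frame answers a level 0 lookup but has nothing valid behind it for level 1. Chaining a dummy frame in front of it gives the level 1 lookup a real frame to land on, which is the effect the dummy frame in the assembly is meant to have.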
>
> -- Steve
>
>
> > I wonder if the commit and bug fix above relate to some older code
> > that no longer existed even at the point where the commit was
> > merged...