[PATCH v3 4/6] powerpc/64s: Un-blacklist system_call() from kprobes
Naveen N. Rao
naveen.n.rao at linux.vnet.ibm.com
Fri Jun 23 01:43:36 AEST 2017
On 2017/06/22 11:14PM, Nicholas Piggin wrote:
> On Thu, 22 Jun 2017 21:14:21 +1000
> Michael Ellerman <mpe at ellerman.id.au> wrote:
>
> > Nicholas Piggin <npiggin at gmail.com> writes:
> >
> > > On Thu, 22 Jun 2017 00:08:40 +0530
> > > "Naveen N. Rao" <naveen.n.rao at linux.vnet.ibm.com> wrote:
> > >
> > >> It is actually safe to probe system_call() in entry_64.S, but only till
> > >> we unset MSR_RI. To allow this, add a new symbol system_call_exit()
> > >> after the mtmsrd and blacklist that. Though the mtmsrd instruction
> > >> itself is now whitelisted, we won't be allowed to probe on it as we
> > >> don't allow probing on rfi and mtmsr instructions (checked for in
> > >> arch_prepare_kprobe()).
> > >
> > > Can you add a little comment to say probes aren't allowed, and it's
> > > located after the mtmsr in order to avoid contaminating traces?
Hmm.. it's actually after the mtmsrd since we can probe until that
point. In v2, I converted .Lsyscall_exit() into a different label and
blacklisted all of it. But Michael prefers to only blacklist what we
must. It is helpful if we ever want to probe after returning from a
syscall. That said, please see further below (*)
> > >
> > > Also I wonder if a slightly different name would be more instructive?
> > > I don't normally care, but the system_call_common code isn't trivial
> > > to follow. system_call_exit might give the impression that it is the
> > > entire exit path (which would pair with system_call for entry).
> >
> > It is the entire path in the happy case isn't it? I'm not sure I know
> > what you mean.
>
> Oh, yes you're right, I thought it was moved down further.
>
> > If you're tracing etc. then you'll be in syscall_exit_work, isn't
> > that sufficient to differentiate the two?
Not sure how much this matters, but the new symbol (system_call_exit) is
actually a few instructions from after the syscall return
(.Lsyscall_exit). And I have converted syscall_exit_work to a private
symbol as well.
In general, from kprobes standpoint, there are places there where we can
probe safely but doing so may require new symbols to be introduced,
which could make traces look weird, as well as distribute samples among
different symbols.
I'm not sure how best to proceed. In this current series, I pretty much
blacklist all of system_call_common() through to syscall_exit() except
for the small part in between in system_call(). All other symbols have
been made private, so nothing else apart from system_call_common(),
system_call() and system_call_exit() show up in traces.
We could probably retain syscall_dotrace(), syscall_enosys() and parts
of syscall_exit_work() after RI is restored, which will allow those to
be probed, but end up showing these symbols in traces.
What would be preferable?
-----
(*)
The other suggestion Michael had was to just put a nop after the bctrl
and to again make .Lsyscall_exit public and blacklist that (though he
wasn't all for it). I suppose it does solve this issue in a nice way -
the call traces when in a system call show the proper symbol and we do
have a 'nop' instruction on which we can probe to catch returns from
system calls. Is that something we can re-consider? Something like this
(along with converting other .Lsyscall_exit references):
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -175,8 +175,9 @@ system_call: /* label this so stack traces look sane */
ldx r12,r11,r0 /* Fetch system call handler [ptr] */
mtctr r12
bctrl /* Call handler */
+ nop
-.Lsyscall_exit:
+system_call_exit:
std r3,RESULT(r1)
CURRENT_THREAD_INFO(r12, r1)
Thanks,
Naveen
More information about the Linuxppc-dev
mailing list