[RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

Naveen N. Rao naveen.n.rao at linux.vnet.ibm.com
Fri Mar 9 19:15:45 AEDT 2018

Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n.rao at linux.vnet.ibm.com> writes:
>> We can't take a trap in most parts of real mode code. Instead of adding
>> the 'notrace' annotation to all C functions that can be invoked from
>> real mode, detect that we are in real mode on ftrace entry and return
>> early.
>> Signed-off-by: Naveen N. Rao <naveen.n.rao at linux.vnet.ibm.com>
>> ---
>> This RFC only handles -mprofile-kernel to demonstrate the approach being
>> considered. We will need to handle the other ftrace entry points if we
>> decide to continue down this path.
> Paul and I were talking about having a paca flag for this, ie.
> paca->safe_to_ftrace (or whatever). I'm not sure if you've talked to
> him and decided this is a better approach.
> I guess I'm 50/50 on which is better, they both have pluses and minuses.

Thanks, I hadn't spoken to Paul, but I now think that this is probably 
the better approach to take.

My earlier assumption was that we have other scenarios where we are in 
real mode (specifically with MSR_RI unset) and won't be able to recover 
from a trap during function tracing (*). I ran a set of experiments 
yesterday to verify that, but my brief testing did not uncover any such 
scenario. So, we seem to be functioning just fine while tracing real 
mode C code, except for KVM.

As such, rather than blacklisting all realmode code, I think it is 
better to be selective and just disable the tracer for KVM since we know 
we can't take a trap there. We will be able to use the same approach if 
we uncover additional scenarios where we can't use function tracing. I 
will look at implementing a paca field for this purpose.

I also noticed that even with an unexpected timebase, we still seem to 
recover just fine with a simple change:

--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2629,8 +2629,8 @@ static noinline void
 rb_handle_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
                    struct rb_event_info *info)
 {
-       WARN_ONCE(info->delta > (1ULL << 59),
-                 KERN_WARNING "Delta way too big! %llu ts=%llu write stamp = %llu\n%s",
+       if (info->delta > (1ULL << 59))
+               pr_warn_once("Delta way too big! %llu ts=%llu write stamp = %llu\n%s",
                  (unsigned long long)info->delta,
                  (unsigned long long)info->ts,
                  (unsigned long long)cpu_buffer->write_stamp,

This allowed the virtual machine to boot, and we were able to trace the 
rest of the KVM C code. (On powerpc, WARN_ONCE() expands to a trap 
instruction, which is exactly what we can't take here, while 
pr_warn_once() is a plain printk.) I have only done a boot test so far, 
so I'm not sure if there are other scenarios where things can go wrong.

Would you be willing to accept a patch like the above? Since we seem to 
handle the larger delta just fine, I think the above change should be 
acceptable on its own.
I will still work on excluding KVM C code from being traced, but the 
advantage with the above patch is that we will be able to trace KVM C 
code with a small change if necessary.

- Naveen

(*) putting on my kprobe hat
