[PATCH v4 01/15] stacktrace/x86: add function for detecting reliable stack traces

Petr Mladek pmladek at suse.com
Fri Jan 27 00:56:03 AEDT 2017


On Thu 2017-01-19 09:46:09, Josh Poimboeuf wrote:
> For live patching and possibly other use cases, a stack trace is only
> useful if it can be assured that it's completely reliable.  Add a new
> save_stack_trace_tsk_reliable() function to achieve that.

> diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
> index 0653788..fc36842 100644
> --- a/arch/x86/kernel/stacktrace.c
> +++ b/arch/x86/kernel/stacktrace.c
> @@ -74,6 +74,90 @@ void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
>  }
>  EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
>  
> +#ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
> +static int __save_stack_trace_reliable(struct stack_trace *trace,
> +				       struct task_struct *task)
> +{
> +	struct unwind_state state;
> +	struct pt_regs *regs;
> +	unsigned long addr;
> +
> +	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
> +	     unwind_next_frame(&state)) {
> +
> +		regs = unwind_get_entry_regs(&state);
> +		if (regs) {
> +			/*
> +			 * Kernel mode registers on the stack indicate an
> +			 * in-kernel interrupt or exception (e.g., preemption
> +			 * or a page fault), which can make frame pointers
> +			 * unreliable.
> +			 */
> +			if (!user_mode(regs))
> +				return -1;
> +
> +			/*
> +			 * The last frame contains the user mode syscall
> +			 * pt_regs.  Skip it and finish the unwind.
> +			 */
> +			unwind_next_frame(&state);
> +			if (WARN_ON_ONCE(!unwind_done(&state))) {
> +				show_stack(task, NULL);

We should make sure that show_stack() is called only once as well.
Otherwise, it would fill logbuffer with random stacktraces without
any context. It might easily cause flood of messages and the first
useful one might get lost in the ring buffer.

> +				return -1;
> +			}
> +			break;
> +		}
> +
> +		addr = unwind_get_return_address(&state);
> +
> +		/*
> +		 * A NULL or invalid return address probably means there's some
> +		 * generated code which __kernel_text_address() doesn't know
> +		 * about.
> +		 */
> +		if (WARN_ON_ONCE(!addr)) {
> +			show_stack(task, NULL);

Same here.

> +			return -1;
> +		}
> +
> +		if (save_stack_address(trace, addr, false))
> +			return -1;
> +	}
> +
> +	/* Check for stack corruption */
> +	if (WARN_ON_ONCE(unwind_error(&state))) {
> +		show_stack(task, NULL);

And here.

> +		return -1;
> +	}
> +
> +	if (trace->nr_entries < trace->max_entries)
> +		trace->entries[trace->nr_entries++] = ULONG_MAX;
> +
> +	return 0;
> +}
> +
> +/*
> + * This function returns an error if it detects any unreliable features of the
> + * stack.  Otherwise it guarantees that the stack trace is reliable.
> + *
> + * If the task is not 'current', the caller *must* ensure the task is inactive.
> + */
> +int save_stack_trace_tsk_reliable(struct task_struct *tsk,
> +				  struct stack_trace *trace)
> +{
> +	int ret;
> +
> +	if (!try_get_task_stack(tsk))
> +		return -EINVAL;
> +
> +	ret = __save_stack_trace_reliable(trace, tsk);

__save_stack_trace_reliable() returns -1 in case of problems.
But this function returns a meaningful error codes, line -EINVAL,
-ENOSYS, otherwise.

We should either transform the error code here to something
"meaningful", probably -EINVAL. Or we should update
__save_stack_trace_reliable() to return meaningful error codes.


> +	put_task_stack(tsk);
> +
> +	return ret;
> +}
> +#endif /* CONFIG_HAVE_RELIABLE_STACKTRACE */
> +
>  /* Userspace stacktrace - based on kernel/trace/trace_sysprof.c */
>  
>  struct stack_frame_user {

Otherwise, all the logic looks fine to me. Great work!

Best Regards,
Petr


More information about the Linuxppc-dev mailing list