[Skiboot] [RFC PATCH] core/lock: Add debug options to store backtrace of where lock was taken

Stewart Smith stewart at linux.ibm.com
Mon Feb 25 17:11:04 AEDT 2019


Andrew Donnellan <andrew.donnellan at au1.ibm.com> writes:
> Contrary to popular belief, skiboot developers are imperfect and

Say it isn't so!

> occasionally write locking bugs. When we exit skiboot, we check if we're
> still holding any locks, and if so, we print an error with a list of the
> locks currently held and the locations where they were taken.
>
> However, this only tells us the location where lock() was called, which may
> not be enough to work out what's going on. To give us more to go on with,
> we can store backtrace data in the lock and print that out when we
> unexpectedly still hold locks.
>
> Because the backtrace data is rather big, we only enable this if
> DEBUG_LOCKS_BACKTRACE is defined.

Okay, so, I like this as a feature. I'm tempted to say it should be
enabled always when we have DEBUG=1 yet have the option to *just* enable
it. Which I think is what you've done here?

> Signed-off-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
>
> ---
>
> RFC. IDK whether anyone else might find this useful, it helped me work out
> what was going on when I was hitting a SIGNAL_SYSTEM_RESET and it was
> complaining I was still holding a lock.
>
> This uses the backtrace functions which have more leading underscores than
> I'm comfortable with, and the printout is ugly.

It's better than nothing, and whenever I've hit something like this I've
immediately gone and put in backtrace()s around the place to find wtf I
got wrong.

> I've hidden this behind a new config ifdef for now, so that I can switch it
> on without activating all of DEBUG which occasionally causes other weird
> problems elsewhere. Maybe not the best way?
> ---
>  core/lock.c      | 25 +++++++++++++++++++++++--
>  include/config.h |  3 +++
>  include/lock.h   | 12 ++++++++++++
>  3 files changed, 38 insertions(+), 2 deletions(-)
>
> diff --git a/core/lock.c b/core/lock.c
> index fbf0b21ed5c6..38545784e916 100644
> --- a/core/lock.c
> +++ b/core/lock.c
> @@ -208,6 +208,15 @@ bool try_lock_caller(struct lock *l, const char *owner)
>  		cpu->con_suspend++;
>  	if (__try_lock(cpu, l)) {
>  		l->owner = owner;
> +
> +#ifdef DEBUG_LOCKS_BACKTRACE
> +		l->bt_ents = 60;
> +		___backtrace(l->bt_buf, &l->bt_ents,
> +			     (unsigned long)__builtin_frame_address(0),
> +			     &l->bt_token, &l->bt_r1_caller);
> +		l->bt_pir = mfspr(SPR_PIR);
> +#endif
> +
>  		list_add(&cpu->locks_held, &l->list);
>  		return true;
>  	}
> @@ -312,8 +321,14 @@ void dump_locks_list(void)
>  	struct lock *l;
>  
>  	prlog(PR_ERR, "Locks held:\n");
> -	list_for_each(&this_cpu()->locks_held, l, list)
> +	list_for_each(&this_cpu()->locks_held, l, list) {
>  		prlog(PR_ERR, "  %s\n", l->owner);
> +#ifdef DEBUG_LOCKS_BACKTRACE
> +		___print_backtrace(l->bt_pir, l->bt_buf, l->bt_ents,
> +				   l->bt_token, l->bt_r1_caller,
> +				   NULL, NULL, true);
> +#endif
> +	}
>  }
>  
>  void drop_my_locks(bool warn)
> @@ -322,8 +337,14 @@ void drop_my_locks(bool warn)
>  
>  	disable_fast_reboot("Lock corruption");
>  	while((l = list_top(&this_cpu()->locks_held, struct lock, list)) != NULL) {
> -		if (warn)
> +		if (warn) {
>  			prlog(PR_ERR, "  %s\n", l->owner);
> +#ifdef DEBUG_LOCKS_BACKTRACE
> +			___print_backtrace(l->bt_pir, l->bt_buf, l->bt_ents,
> +					   l->bt_token, l->bt_r1_caller,
> +					   NULL, NULL, true);
> +#endif
> +		}
>  		unlock(l);
>  	}
>  }
> diff --git a/include/config.h b/include/config.h
> index 6b36590c39f0..c285d549ccab 100644
> --- a/include/config.h
> +++ b/include/config.h
> @@ -39,6 +39,9 @@
>  /* Enable lock debugging */
>  #define DEBUG_LOCKS		1
>  
> +/* Enable printing of backtraces when locks not released */
> +#define DEBUG_LOCKS_BACKTRACE  1

this is behind a #if for DEBUG=1, right?

> +
>  /* Enable lock dependency checker */
>  #define DEADLOCK_CHECKER	1
>  
> diff --git a/include/lock.h b/include/lock.h
> index b18757394dbb..b7d47be83ba9 100644
> --- a/include/lock.h
> +++ b/include/lock.h
> @@ -23,6 +23,10 @@
>  #include <ccan/list/list.h>
>  #include <ccan/str/str.h>
>  
> +#ifdef DEBUG_LOCKS_BACKTRACE
> +#include <stack.h>
> +#endif
> +
>  struct lock {
>  	/* Lock value has bit 63 as lock bit and the PIR of the owner
>  	 * in the top 32-bit
> @@ -38,6 +42,14 @@ struct lock {
>  	/* file/line of lock owner */
>  	const char *owner;
>  
> +#ifdef DEBUG_LOCKS_BACKTRACE
> +	struct bt_entry bt_buf[60];
> +	unsigned int bt_ents;
> +	unsigned long bt_token;
> +	unsigned long bt_r1_caller;
> +	unsigned long bt_pir;
> +#endif

almost feels like it should be a struct.

> +
>  	/* linkage in per-cpu list of owned locks */
>  	struct list_node list;
>  };
> -- 
> 2.11.0
>
> _______________________________________________
> Skiboot mailing list
> Skiboot at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/skiboot
>

-- 
Stewart Smith
OPAL Architect, IBM.



More information about the Skiboot mailing list