[Skiboot] FSP code calling pollers with locks held

Wed Feb 18 17:46:41 AEDT 2015

Benjamin Herrenschmidt <benh at kernel.crashing.org> writes:
> So we can start fixing those cases
>
> Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>
> ---
>  core/opal.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/core/opal.c b/core/opal.c
> index fc18db3..a49d022 100644
> --- a/core/opal.c
> +++ b/core/opal.c
> @@ -284,6 +284,11 @@ void opal_run_pollers(void)
>  {
>  	struct opal_poll_entry *poll_ent;
>  
> +	if (this_cpu()->lock_depth) {
> +		prlog(PR_ERR, "Running pollers with lock held !\n");
> +		backtrace();
> +	}
> +

Merged and then added a limit, because otherwise we could get stuck in a
loop printing backtraces for approximately forever on some FSP machines.
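
As a rough sketch of what such a limit could look like (this is not the
code that went into the tree; the cap value, the counter, the helper name
and the stand-in this_cpu() are all made up so the example builds on its
own):

```c
#include <stdio.h>

/* Hypothetical stand-ins for skiboot internals, so this compiles alone. */
struct cpu_thread {
	unsigned int lock_depth;	/* number of locks currently held */
};
static struct cpu_thread fake_cpu;
static struct cpu_thread *this_cpu(void)
{
	return &fake_cpu;
}

/* Assumed cap, chosen for illustration only. */
#define MAX_LOCK_WARNINGS	8

/* Number of warnings printed so far. */
static unsigned int lock_warnings;

/*
 * Warn when pollers run with a lock held, but only up to the cap.
 * Returns 1 if a warning was printed, 0 otherwise.
 */
static int check_pollers_lock(void)
{
	if (!this_cpu()->lock_depth)
		return 0;
	if (lock_warnings >= MAX_LOCK_WARNINGS)
		return 0;
	lock_warnings++;
	fprintf(stderr, "Running pollers with lock held !\n");
	/* The real code would call backtrace() here. */
	return 1;
}
```

With something like this, a caller that hits the check in a tight loop
prints at most MAX_LOCK_WARNINGS warnings and then goes quiet.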

Basically, the FSP code calls pollers with locks held in a lot of places.

 S: 0000000031a03b40 R: 0000000030017eb4   .opal_run_pollers+0x44.
 S: 0000000031a03bc0 R: 00000000300449ec   .fsp_sync_msg+0x7c.
 S: 0000000031a03c50 R: 000000003004f6e0   .op_display+0x110.
 S: 0000000031a03d00 R: 000000003004797c   .fsp_console_preinit+0xe4.
 S: 0000000031a03d80 R: 000000003005650c   .ibm_fsp_init+0xac.

I've fixed this one, but we're going to have to chase these a bit.

I like the warning though: it shows us places where we're likely to have
bugs... and we should certainly fix them.

Interestingly enough, I modified the above code path back in January due
to a bug when it was called from a poller.