[Skiboot] [PATCH] FSP: Fix poller lock issue

Benjamin Herrenschmidt benh at kernel.crashing.org
Mon Mar 9 18:30:07 AEDT 2015


On Mon, 2015-03-09 at 12:06 +0530, Ananth N Mavinakayanahalli wrote:
> On Sun, Mar 08, 2015 at 03:19:41PM +0530, Vasant Hegde wrote:
> > This patch attempts to fix "poller lock issue" by removing
> > opal_run_pollers function call from fsp_sync_msg.
> >

I disagree with generically changing the call to __fsp_poll(). That
means for example that during such a potentially long call, we will
not monitor the PSI link.

While a full queued d/l request mechanism is better, we could get away
with a simpler hack provided we are certain that we never will call
fsp_fetch_data() from a poller callback.

It's a bit gross but will provide a quick fix while you work on a better
solution, using a busy flag:

for (;;) {
	lock(...)
	test and set busy flag
	unlock(...)
	if (flag was clear)
		break;
	cpu_relax();
}

 .. do load

clear flag

	
 
> > sample log:
> > ----------
> > [4130907342,3] Running pollers with lock held !
> > CPU 08b0 Backtrace:
> >  S: 0000000033cc3990 R: 0000000030013520   .backtrace+0x2c
> >  S: 0000000033cc3a20 R: 0000000030017d9c   .opal_run_pollers+0x44
> >  S: 0000000033cc3aa0 R: 0000000030045f04   .fsp_sync_msg+0x40
> >  S: 0000000033cc3b30 R: 0000000030047840   .fsp_fetch_data+0x138
> >  S: 0000000033cc3c20 R: 0000000030021204   .hservices_lid_preload+0x148
> >  S: 0000000033cc3d10 R: 0000000030058038   .ibm_fsp_init+0xe4
> >  S: 0000000033cc3dc0 R: 0000000030058df8   .firenze_init+0x18
> >  S: 0000000033cc3e30 R: 00000000300141a4   .main_cpu_entry+0x3e0
> >  S: 0000000033cc3f00 R: 0000000030002534   boot_entry+0x18c
> > 
> > Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
> > 
> > ---
> > Ben, Stewart,
> >   Presently we have two places (fsp_fetch_data & capp_lid_download) where
> >   we held lock and make opal_run_pollers call (via fsp_sync_msg).
> > 
> >   Earlier we had replaced __fsp_poll with opal_run_pollers (commit 488dc8a3).
> >   But I think its fine just to call __fsp_poll here. This seems to be working.
> >   But needs to be reviewed before merging.
> > 
> >   I tried other alternative approach we discussed earlier. (introduce queue in
> >   fsp_fetch_data). This requies quite a bit of work in callback function.
> 
> This explanation needs to go in the commit message.
> 
> > ---
> >  hw/fsp/fsp.c |    5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/fsp/fsp.c b/hw/fsp/fsp.c
> > index e57c611..5fbc33f 100644
> > --- a/hw/fsp/fsp.c
> > +++ b/hw/fsp/fsp.c
> > @@ -1663,7 +1663,10 @@ int fsp_sync_msg(struct fsp_msg *msg, bool autofree)
> >  
> >  	while(fsp_msg_busy(msg)) {
> >  		cpu_relax();
> > -		opal_run_pollers();
> > +		lock(&fsp_lock);
> > +		__fsp_poll(false);
> > +		unlock(&fsp_lock);
> 
> You could just use fsp_opal_poll() instead.
> 
> Ananth
> 
> _______________________________________________
> Skiboot mailing list
> Skiboot at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/skiboot




More information about the Skiboot mailing list