[Skiboot] [PATCH] FSP: Fix poller lock issue
Benjamin Herrenschmidt
benh at kernel.crashing.org
Mon Mar 9 18:30:07 AEDT 2015
On Mon, 2015-03-09 at 12:06 +0530, Ananth N Mavinakayanahalli wrote:
> On Sun, Mar 08, 2015 at 03:19:41PM +0530, Vasant Hegde wrote:
> > This patch attempts to fix "poller lock issue" by removing
> > opal_run_pollers function call from fsp_sync_msg.
> >
I disagree with generically changing the call to __fsp_poll(). That
means for example that during such a potentially long call, we will
not monitor the PSI link.
While a full queued d/l request mechanism is better, we could get away
with a simpler hack provided we are certain that we never will call
fsp_fetch_data() from a poller callback.
It's a bit gross but will provide a quick fix while you work on a better
solution, using a busy flag:
for (;;) {
lock(...)
test and set busy flag
unlock(...)
if (flag was clear)
break;
cpu_relax();
}
.. do load
clear flag
> > sample log:
> > ----------
> > [4130907342,3] Running pollers with lock held !
> > CPU 08b0 Backtrace:
> > S: 0000000033cc3990 R: 0000000030013520 .backtrace+0x2c
> > S: 0000000033cc3a20 R: 0000000030017d9c .opal_run_pollers+0x44
> > S: 0000000033cc3aa0 R: 0000000030045f04 .fsp_sync_msg+0x40
> > S: 0000000033cc3b30 R: 0000000030047840 .fsp_fetch_data+0x138
> > S: 0000000033cc3c20 R: 0000000030021204 .hservices_lid_preload+0x148
> > S: 0000000033cc3d10 R: 0000000030058038 .ibm_fsp_init+0xe4
> > S: 0000000033cc3dc0 R: 0000000030058df8 .firenze_init+0x18
> > S: 0000000033cc3e30 R: 00000000300141a4 .main_cpu_entry+0x3e0
> > S: 0000000033cc3f00 R: 0000000030002534 boot_entry+0x18c
> >
> > Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
> >
> > ---
> > Ben, Stewart,
> > Presently we have two places (fsp_fetch_data & capp_lid_download) where
> > we held lock and make opal_run_pollers call (via fsp_sync_msg).
> >
> > Earlier we had replaced __fsp_poll with opal_run_pollers (commit 488dc8a3).
> > But I think its fine just to call __fsp_poll here. This seems to be working.
> > But needs to be reviewed before merging.
> >
> > I tried other alternative approach we discussed earlier. (introduce queue in
> > fsp_fetch_data). This requies quite a bit of work in callback function.
>
> This explanation needs to go in the commit message.
>
> > ---
> > hw/fsp/fsp.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/fsp/fsp.c b/hw/fsp/fsp.c
> > index e57c611..5fbc33f 100644
> > --- a/hw/fsp/fsp.c
> > +++ b/hw/fsp/fsp.c
> > @@ -1663,7 +1663,10 @@ int fsp_sync_msg(struct fsp_msg *msg, bool autofree)
> >
> > while(fsp_msg_busy(msg)) {
> > cpu_relax();
> > - opal_run_pollers();
> > + lock(&fsp_lock);
> > + __fsp_poll(false);
> > + unlock(&fsp_lock);
>
> You could just use fsp_opal_poll() instead.
>
> Ananth
>
> _______________________________________________
> Skiboot mailing list
> Skiboot at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/skiboot
More information about the Skiboot
mailing list