[Skiboot] [PATCH] FSP: Fix poller lock issue

Vaidyanathan Srinivasan svaidy at linux.vnet.ibm.com
Mon Mar 9 04:11:41 AEDT 2015


* Vasant Hegde <hegdevasant at linux.vnet.ibm.com> [2015-03-08 16:56:31]:

> On 03/08/2015 03:19 PM, Vasant Hegde wrote:
> > This patch attempts to fix "poller lock issue" by removing
> > opal_run_pollers function call from fsp_sync_msg.
> > 
> > sample log:
> > ----------
> > [4130907342,3] Running pollers with lock held !
> > CPU 08b0 Backtrace:
> >  S: 0000000033cc3990 R: 0000000030013520   .backtrace+0x2c
> >  S: 0000000033cc3a20 R: 0000000030017d9c   .opal_run_pollers+0x44
> >  S: 0000000033cc3aa0 R: 0000000030045f04   .fsp_sync_msg+0x40
> >  S: 0000000033cc3b30 R: 0000000030047840   .fsp_fetch_data+0x138
> >  S: 0000000033cc3c20 R: 0000000030021204   .hservices_lid_preload+0x148
> >  S: 0000000033cc3d10 R: 0000000030058038   .ibm_fsp_init+0xe4
> >  S: 0000000033cc3dc0 R: 0000000030058df8   .firenze_init+0x18
> >  S: 0000000033cc3e30 R: 00000000300141a4   .main_cpu_entry+0x3e0
> >  S: 0000000033cc3f00 R: 0000000030002534   boot_entry+0x18c
> > 
> > Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
> > 
> > ---
> > Ben, Stewart,
> >   Presently we have two places (fsp_fetch_data & capp_lid_download) where
> >   we held lock and make opal_run_pollers call (via fsp_sync_msg).
> > 
> >   Earlier we had replaced __fsp_poll with opal_run_pollers (commit 488dc8a3).
> >   But I think its fine just to call __fsp_poll here. This seems to be working.
> >   But needs to be reviewed before merging.
> > 
> >   I tried other alternative approach we discussed earlier. (introduce queue in
> >   fsp_fetch_data). This requies quite a bit of work in callback function.
> 
> Hmmm.. This patch fixes poller lock issue.. But this *doesn't* seems to be
> complete.. We still hit lock issue in fsp_fetch_data, if we get OCC load request
> while fetching skiroot :-(
> 
> Another possible approach:
>   Presently we use same DMA mapping for all the fetch data.. Instead if we use
> separate DMA mapping for OCC load requests then we can simply queue that request
> using fsp_fetch_data_queue call..That way we will not hit lock issue in
> fsp_fetch_data.. Looking into code base I don't see any other places we get such
> notification which will result in fsp_fetch_data call.

Hi Vasanth, 

The OCC load request can be made to wait.  I cam make it work with
a callback.  When fsp sends a OCC load request, I can issue a pre-load
of OCC lid and queue it to your functions.  The callback after
completion will do the actual OCC load request processing by calling
hostboot.  

Since we have preloaded the OCC lid, when we callback from hostboot
for OCC lid, we will already have it and not come to
fsp_fetch_data_queue at all.

What I still need to understand is the CAPI ucode loading request and
response sequence.  At what point will we be able to hold the request
and continue after the callback.

--Vaidy



More information about the Skiboot mailing list