[Skiboot] [PATCH] nvram: Fix 'missing' nvram on FSP systems.

Fri Dec 1 11:10:49 AEDT 2017

On Thu, 2017-11-30 at 15:50 +1100, Stewart Smith wrote:
> Cyril Bur <cyril.bur at au1.ibm.com> writes:
> > On Tue, 2017-11-14 at 11:39 +1100, Oliver wrote:
> > > On Tue, Nov 7, 2017 at 5:07 PM, Cyril Bur <cyril.bur at au1.ibm.com> wrote:
> > > > commit ba4d46fdd9eb ("console: Set log level from nvram") wants to read
> > > > from NVRAM rather early. This works fine on BMC based systems as
> > > > nvram_init() is actually synchronous. This is not true for FSP systems
> > > > and it turns out that the query for the console log level simply
> > > > queries blank nvram.
> > > > 
> > > > The simple fix is to wait for the NVRAM read to complete before
> > > > performing any query. Unfortunately it turns out that the fsp-nvram
> > > > code does not inform the generic NVRAM layer when the read is complete,
> > > > rather, it must be prompted to do so.
> > > > 
> > > > This patch addresses both these problems. This patch adds a check before
> > > > the first read of the NVRAM (for the console log level) that the read
> > > > has completed. The fsp-nvram code has been updated to inform the generic
> > > > layer as soon as the read completes.
> > > > 
> > > > The old prompt to the fsp-nvram code has been removed but a check to
> > > > ensure that the NVRAM has been loaded remains. It is conservative but
> > > > if the NVRAM is not done loading before the host is booted it will not
> > > > have an nvram device-tree node which means it won't be able to access
> > > > the NVRAM at all, ever, even after the NVRAM has loaded.
> > > > 
> > > > Signed-off-by: Cyril Bur <cyril.bur at au1.ibm.com>
> > > > ---
> > > >  core/init.c                  | 11 +++++++++--
> > > >  core/nvram-format.c          |  5 +++++
> > > >  core/nvram.c                 | 35 +++++++++++++++++++++++++++++++++++
> > > >  core/test/run-nvram-format.c |  5 +++++
> > > >  hw/fsp/fsp-nvram.c           | 28 ++++------------------------
> > > >  include/fsp.h                |  1 -
> > > >  include/nvram.h              |  2 ++
> > > >  7 files changed, 60 insertions(+), 27 deletions(-)
> > > > 
> > > > diff --git a/core/init.c b/core/init.c
> > > > index 8951e17b..17de9851 100644
> > > > --- a/core/init.c
> > > > +++ b/core/init.c
> > > > @@ -497,7 +497,7 @@ void __noreturn load_and_boot_kernel(bool is_reboot)
> > > >                 /* We wait for the nvram read to complete here so we can
> > > >                  * grab stuff from there such as the kernel arguments
> > > >                  */
> > > > -               fsp_nvram_wait_open();
> > > > +               nvram_wait_for_load();
> > > > 
> > > >                 /* Wait for FW VPD data read to complete */
> > > >                 fsp_code_update_wait_vpd(true);
> > > > @@ -987,7 +987,14 @@ void __noreturn __nomcount main_cpu_entry(const void *fdt)
> > > >         /* preload the IMC catalog dtb */
> > > >         imc_catalog_preload();
> > > > 
> > > > -       /* Set the console level */
> > > > +       /*
> > > > +        * Set the console level
> > > > +        * We need NVRAM to have been setup by now otherwise we'll
> > > > +        * just read a zeroed buffer.
> > > > +        * Some FSP systems can actually be quite slow to return the
> > > > +        * NVRAM.
> > > > +        */
> > > > +       nvram_wait_for_load();
> > > >         console_log_level();
> > > > 
> > > >         /* Secure/Trusted Boot init. We look for /ibm,secureboot in DT */
> > > > diff --git a/core/nvram-format.c b/core/nvram-format.c
> > > > index 923098af..655cb8ee 100644
> > > > --- a/core/nvram-format.c
> > > > +++ b/core/nvram-format.c
> > > > @@ -216,6 +216,11 @@ const char *nvram_query(const char *key)
> > > >         const char *part_end, *start;
> > > >         int key_len = strlen(key);
> > > > 
> > > > +       if (!nvram_has_loaded()) {
> > > > +               prlog(PR_WARNING, "NVRAM: Query before is done loading\n");
> > > > +               return NULL;
> > > > +       }
> > > 
> > > I think this should just call nvram_wait_for_load() rather than
> > > failing here. The problem
> > > with this behaviour is that can end up with nvram_query() failing
> > > intermittently on FSP
> > > systems since loading the nvram is asynchronous there.
> > > 
> > > Given that we use nvram_query() for switching debug stuff on and off
> > > I'd rather it was
> > > as reliable as possible even if it slows booting slightly.
> > > 
> > 
> > I think this goes back to - when do we call nvram_wait_for_load()...
> > and great... looks like my entire summary email never made it to the
> > list... nor can I find it anywhere...
> > 
> > It was a rather long email I don't feel like retyping...
> > 
> > Basically, either we do what Oliver suggests but this hides the (now)
> > blocking nature of nvram_query() which I was hesitant to do. I feel
> > like in skiboot it should be fairly obvious that something might block.
> > 
> > Another options is we leave what I have, which makes the blocking quite
> > obvious but does require the function to be explicitly called before
> > nvram_query() and more importantly before launching the bootkernel.
> > 
> > Finally, we put nvram_wait_for_load() at the end of nvram_init(), which
> > is a nicer place to have blocking since its obviously an init function.
> > The problem being that we might be able to nvram_init() long enough
> > before we actually need the nvram and we lose the benefit of the
> > asynchronous load. Having said that, as Oliver points out - we use
> > nvram_query() to switch debug stuff. We want to do that as early as
> > possible, so if there is going to be a blocking nvram_query() right
> > after nvram_init(), then it might as well be in nvram_init().
> > 
> > All this only applies to FSP, BMC nvram_init() is actually synchronous
> > at the moment (although it doesn't have to be).
> > 
> > 
> > Stewart, do you have any strong feelings?
> 
> I've gone back and forth a bit.... I think the best way would be to
> print the warning and then wait for it to be loaded anyway. THat way,
> things still work if somebody has to put in early debug and not sort
> through the how-we-load-nvram-mess, but we'll know that we need to fix
> something before shipping if we do.
> 

So the in nvram_query() wait with a warning. Probably easier to speak
in code - I'll send a patch.

Cyril

>