[OpenPower-Firmware] [POWER8] OCC Firdata over IPMI

Andrew Jeffery andrew at aj.id.au
Fri May 24 10:40:52 AEST 2019



On Thu, 23 May 2019, at 18:58, Artem Senichev wrote:
> On Thu, May 23, 2019 at 10:10:38AM +0930, Andrew Jeffery wrote:
> > Hi Artem,
> > 
> > Snipping somewhat, I'll let Doug dig into the specifics of the OCC issue as he
> > implemented the IPMI HIOMAP support for it.
> > 
> > On Wed, 22 May 2019, at 17:13, Artem Senichev wrote:
> > 
> > > We have HIOMAP support for our bundle (OpenBMC+OpenPOWER), it looks like
> > > the entire solution works fine except the OCC project, it is the last
> > > component that directly writes data to the PNOR flash.
> > > 
> > 
> > Great to hear the rest works for you. Just for your awareness you may need to
> > pick up a couple of patches depending on what firmwares you're running: [1]
> > to fix a data write corruption issue in skiboot and [2] to resolve a memory leak
> > in OpenBMC. These are the critical issues, but there are further sets of skiboot
> > and OpenBMC patches that I recommend to avoid lockups. I can point you to
> > them if necessary.
> 
> Yes, please. Maybe these patches will solve my second problem :)

Before I do, can you please push your trees somewhere I can look over them?
That way I can minimise the set of patches I need to look over / recommend.

> We have a (dead?) lock situation in skiboot. I'm not sure whether
> this affects other skiboot code, but I can't send the PCI device list
> from skiboot to OpenBMC - the system freezes at ipmi_queue_msg_sync():
> 
> https://github.com/open-power/skiboot/blob/76f7316bc8fc8a18fdbfcbc0e1fe1bb992d2a7d7/core/ipmi.c#L177
> 
> Usually, the system hangs after sending 3-5 of my messages. I have added
> tracing to core/ipmi.c - it ends up with two calls to 'lock(&sync_lock)',
> one for my message and another one for HIOMAP.
> As a workaround, I use asynchronous sending with an increased IPMI message
> queue size, but it's not a good solution.

Generally you shouldn't be using ipmi_queue_msg_sync(), and I'm unclear why
async is a bad solution here. What are your requirements that drive the use of
_sync()?
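
To make the async pattern concrete, here is a minimal sketch of "queue the
message, return immediately, and let a completion callback run later". It is
a toy model and not skiboot code: every name in it (toy_queue_msg(),
toy_poll(), toy_demo() and the structs) is hypothetical, and it models
neither the real IPMI backend nor its locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are skiboot APIs. */
#define TOY_QUEUE_DEPTH 8

struct toy_msg {
	int id;
	/* Called once the (simulated) transport has finished with msg. */
	void (*complete)(struct toy_msg *msg, void *ctx);
	void *ctx;
};

struct toy_queue {
	struct toy_msg *slots[TOY_QUEUE_DEPTH];
	size_t head, count;
};

/* Queue a message and return at once; false means the queue is full. */
static bool toy_queue_msg(struct toy_queue *q, struct toy_msg *msg)
{
	if (q->count == TOY_QUEUE_DEPTH)
		return false;
	q->slots[(q->head + q->count) % TOY_QUEUE_DEPTH] = msg;
	q->count++;
	return true;
}

/* Stand-in for a poller: complete the oldest queued message, if any. */
static bool toy_poll(struct toy_queue *q)
{
	struct toy_msg *msg;

	if (q->count == 0)
		return false;
	msg = q->slots[q->head];
	q->head = (q->head + 1) % TOY_QUEUE_DEPTH;
	q->count--;
	if (msg->complete)
		msg->complete(msg, msg->ctx);
	return true;
}

/* Completion callback: count finished messages via the context pointer. */
static void toy_count_done(struct toy_msg *msg, void *ctx)
{
	(void)msg;
	(*(int *)ctx)++;
}

/*
 * Queue up to n messages without waiting on any of them, then drain.
 * Returns the number completed, or -1 if that differs from the number queued.
 */
static int toy_demo(int n)
{
	struct toy_queue q = {0};
	struct toy_msg msgs[TOY_QUEUE_DEPTH];
	int done = 0, queued = 0;

	assert(n >= 0);
	for (int i = 0; i < n && i < TOY_QUEUE_DEPTH; i++) {
		msgs[i] = (struct toy_msg){
			.id = i, .complete = toy_count_done, .ctx = &done,
		};
		if (toy_queue_msg(&q, &msgs[i]))
			queued++;
	}
	while (toy_poll(&q))
		;
	return done == queued ? done : -1;
}
```

Calling toy_demo() from a main() is enough to try it. The point of the shape
is that the sender never blocks waiting for a reply, so it cannot end up
holding or waiting on a lock while another sender is in the same position.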

I don't have any immediate solutions for you. How are you sending down the PCI
data? Are you writing it to the PNOR or sending the PCI devices through as FRUs?

> 
> > 
> > On the skiboot side, master contains all the fixes we have needed so far, and
> > everything except for [3] is contained in the v6.3 tag. OpenBMC master has
> > all its respective fixes included.
> > 
> > Bit of a tangent, but I hope it helps.
> > 
> > Andrew
> > 
> > [1] https://github.com/open-power/skiboot/commit/7f291166283f
> > [2] https://github.com/openbmc/hiomapd/commit/fac3689e77d3
> > [3] https://github.com/open-power/skiboot/commit/f01cd777adb1
> 
> We already have these patches in our build, thanks.

Out of interest, do you have this patch included?

https://github.com/openbmc/btbridge/commit/aa5511d28ff9acee4a404c6397d09f5187812ed8

That might eliminate some other spurious and intermittent problems.

Again it would be great if you can point me to your trees so I can check for
myself whether you've included certain patches.

Andrew

