[Skiboot] [PATCH 0/4] next round of OPAL debugging improvements
npiggin at gmail.com
Thu Apr 19 16:57:02 AEST 2018
On Thu, 19 Apr 2018 16:24:06 +1000
Stewart Smith <stewart at linux.ibm.com> wrote:
> Nicholas Piggin <npiggin at gmail.com> writes:
> > The r13 clobber bug has been fixed in Linux upstream now, the next
> > bit is the skiboot stack corruption caused by re-entry from the OS.
> > Patches 1-3 have been posted here before in various RFCs. They give
> > nicer back traces, make quiescing a bit more robust, and stop the
> > stack from being destroyed, and make quiescing a bit more robust.
> > Patch 4 is one I've found useful when testing with sreset injection
> > into various parts of skiboot code.
> > We can in general interrupt and resume OPAL after this series.
> > Many specific cases are unlikely to work for other reasons, but
> > we have a cut down sub-set of OPAL API that we allow through, so
> > the task of making those reasonably robust should be manageable.
> I haven't hammered the series with sreset and fun like that, but we
> should probably add something along those lines to op-test (hit it a lot
> more when we had more bugs :)
Yeah, I've mostly been injecting in mambo, so far. We should be able
to add some tests. I'll think about it too.
> When I build with STACK_CHECK=1, it seems that we never get below 10k of
> free stack, but I'm not quite bold enough to suggest "8k is enough for
> everyone", and just eating the extra memory for stacks is probably okay.
Yeah I don't see it's a really big problem. Worth the added robustness.
> Series merged to master as of 87f55507195a166fb12ed1240ed3221964f11f40
Cool. I think this brings us to a *reasonably* good place in terms of
coping with NMIs and re-entry from Linux. Maybe a few minor tweaks, but
we should be getting more solid.
Next chapter is some kind of OPAL_DEBUG API we've talked about, but
that'll be for another day. Concentrate on getting all this well tested
and stable first.
More information about the Skiboot