Redfish requests for dbus-sensors data are needlessly slow - analysis

Paul Fertser fercerpav at gmail.com
Fri Aug 6 04:57:04 AEST 2021


Hello,

On Wed, Aug 04, 2021 at 11:51:01AM -0700, Ed Tanous wrote:
> One thing to keep in mind is if you use the work in progress mode
> within gerrit, non-admins can't see the change.  Generally I just
> either mark them WIP in a topic, or put "work in progress" in the
> commit message somewhere.

Noted, thank you.

> Because the mapper interfaces are non-ideal for building Redfish
> resources, and we have to work around them a lot.

Still, the mapper is aware of all the devices appearing on and
disappearing from the bus, and should be emitting signals accordingly.
In what case would invalidating all of the (hypothetical) bmcweb
connections/mappings/associations cache on receiving such a signal not
be enough?
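
Something like this minimal sketch is what I have in mind (plain
sdbusplus for brevity; the Cache type and its invalidate() are
hypothetical, and bmcweb would of course use its asio connection
instead of a raw loop):

  #include <sdbusplus/bus.hpp>
  #include <sdbusplus/bus/match.hpp>

  // Hypothetical cache of connections/mappings/associations.
  struct Cache
  {
      bool valid = true;
      void invalidate() { valid = false; }
  };

  int main()
  {
      auto bus = sdbusplus::bus::new_default();
      Cache cache;

      // Drop the whole cache whenever any object appears on or
      // disappears from the bus.
      auto onChange = [&cache](sdbusplus::message_t&) {
          cache.invalidate();
      };

      sdbusplus::bus::match_t added(
          bus,
          "type='signal',interface='org.freedesktop.DBus.ObjectManager',"
          "member='InterfacesAdded'",
          onChange);
      sdbusplus::bus::match_t removed(
          bus,
          "type='signal',interface='org.freedesktop.DBus.ObjectManager',"
          "member='InterfacesRemoved'",
          onChange);

      while (true)
      {
          bus.process_discard();
          bus.wait();
      }
  }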

> > PSUSensor backend is caching hwmon values so it's reasonably fast.
> 
> ..... This is the way it _should_ be.  Unfortunately PSUsensor has
> some really bad behaviors in terms of blocking dbus transactions while
> pmbus reads are happening.  There's been a couple proposed fixes to
> this, but nothing has landed yet.

Haven't noticed that in my tests. I had the impression that all
reading from hwmon files is performed in a fully asynchronous manner,
with Boost::ASIO never blocking on anything and letting the other
handlers run.
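
The pattern I assumed was in use is roughly the following (a minimal
sketch, not the actual PSUSensor code; the hwmon path is just an
example):

  #include <boost/asio.hpp>

  #include <fcntl.h>

  #include <array>
  #include <iostream>
  #include <memory>

  int main()
  {
      boost::asio::io_context io;

      // Example hwmon attribute; any sensor input file would do.
      int fd = open("/sys/class/hwmon/hwmon0/temp1_input",
                    O_RDONLY | O_NONBLOCK);
      if (fd < 0)
          return 1;

      boost::asio::posix::stream_descriptor input(io, fd);
      auto buffer = std::make_shared<std::array<char, 64>>();

      input.async_read_some(
          boost::asio::buffer(*buffer),
          [buffer](const boost::system::error_code& ec, std::size_t n) {
              if (!ec)
                  std::cout << std::string(buffer->data(), n);
          });

      // Other handlers keep running while the read is pending.  The
      // caveat you point at, as I understand it: for sysfs files the
      // read(2) itself can still block inside the kernel driver (e.g.
      // a slow pmbus transfer), O_NONBLOCK notwithstanding.
      io.run();
  }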

> > > Yes, we should come at it from both angles, because a cache will
> > > still have to invalidate quite a bit, and nobody likes a bimodal
> > > distribution of response times.
> >
> > This sounds cool but I'm not sure I understand the point. We face
> > caching everywhere, and it's usually multi-layered, so when
> > appropriately benchmarked we often see tri- and more modal
> > distributions. And when not building a hard real-time system the
> > median is much more important than the modality I guess?
> 
> I'm not sure I'd say we have "caching everywhere".  About the only
> thing we really cache today is the mapper, and that only really
> invalidates on startup.  My point is really that if we implement a
> cache in bmcweb, we'll see one response that takes 200ms, then some
> responses will take 4 seconds when the cache isn't valid.

I thought "nobody likes a bimodal distribution of response times"
referred to a general notion, so I thought about all the caching in
computer systems on all the levels (CPU, RAM, disks, Ethernet etc
etc). One particular example of caching we really dislike would be the
one resulting in network bufferbloat, but that's not because of the
modality.

And regarding this specific question, I do not see anything
particularly bad about having to wait 4 seconds a few times a day if
thousands of requests are processed in 200 ms over the course of the
same day. That is better than having to wait 2 seconds for each of
them every time.
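
To put hypothetical numbers on it: 10000 requests at 200 ms plus ten
cache misses at 4 s come to roughly 34 minutes of total latency per
day, whereas 10000 requests at a flat 2 s come to more than 5.5
hours.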

> > > This might clean up a couple of the extra calls in the
> > > sensor code, but the primary target was inventory initially.
> >
> > I'm still rather puzzled by the relationship between different
> > sensors/entities/inventory/leds/etc. Is there a document describing
> > everything that's currently essential?
> 
> One systems "essential" is another systems "optional".  This is the
> short way of saying, no, I don't think we make any distinction between
> essential and non-essential relationships on dbus.  The best way to
> understand is to go look at the code like you've done.

TBH, looking through the Boost::ASIO async code was painful and, most
importantly, it often doesn't provide the full rationale. I can
eventually understand what is being done, but my ideas about why it is
needed are speculations prone to error; that's why, in the absence of
a design rationale document, I was hoping to get answers from the
authors of the code in question.

> > > > Does bmcweb really _need_ to do all the calls mentioned? I think the
> > > > fact it performs two identical D-Bus calls in a row gives a clear "no"
> > > > answer.
> > >
> > > You are almost certainly right, two identical calls is wasteful, and
> > > we should clean that up if we can.  I'm certain there's probably
> > > several speedups to be gained by someone understanding the sensor code
> > > and optimizing it a little.
> >
> > I'm happy I managed to get my point through finally.
> 
> I'm looking forward to the patch.

I wonder what kind of testing would be needed to make sure it doesn't
break all the systems I have no idea about: those that have
sub-sensors, complex associations and related LEDs, and probably those
that are using phosphor-hwmon rather than dbus-sensors...

> > Since I do not understand the role of /inventory/ paths, and I do not
> > understand how LEDs get involved (apparently we do not have any LEDs
> > like that in our Tioga Pass configuration) and I didn't even try to
> > dig the "power" nodes I'm certainly not in a position to offer a
> > worthy code contribution in this area :( (other than fixing the
> > duplicated call which can trivially be done by passing an extra
> > argument; but that alone isn't fruitful much)
> 
> Reading and understanding the code is probably the best way to become
> "worthy".  It would be great to have even a simple change land on
> master as the result of this discussion that can start to work on your
> concerns.

Thank you for the prompt review of my simple change; I'll be working
on fixing the issues you raised.
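
For reference, the core of the idea is just to reuse the result of the
first GetSubTree call instead of issuing the identical call again; a
minimal sketch with sdbusplus::asio (buildSensorResponses is a made-up
name, not the actual bmcweb code):

  #include <boost/asio/io_context.hpp>
  #include <sdbusplus/asio/connection.hpp>

  #include <iostream>
  #include <memory>
  #include <string>
  #include <utility>
  #include <vector>

  // The usual mapper GetSubTree reply shape:
  // object path -> (service -> implemented interfaces).
  using GetSubTreeType = std::vector<std::pair<
      std::string,
      std::vector<std::pair<std::string, std::vector<std::string>>>>>;

  // Made-up consumer: takes the already-fetched tree as an argument
  // instead of re-querying the mapper.
  void buildSensorResponses(const GetSubTreeType& tree)
  {
      for (const auto& entry : tree)
          std::cout << entry.first << '\n';
  }

  int main()
  {
      boost::asio::io_context io;
      auto conn = std::make_shared<sdbusplus::asio::connection>(io);

      conn->async_method_call(
          [](const boost::system::error_code& ec,
             const GetSubTreeType& tree) {
              if (ec)
                  return;
              // Pass the result along instead of issuing the same
              // GetSubTree call a second time.
              buildSensorResponses(tree);
          },
          "xyz.openbmc_project.ObjectMapper",
          "/xyz/openbmc_project/object_mapper",
          "xyz.openbmc_project.ObjectMapper", "GetSubTree",
          "/xyz/openbmc_project/sensors", 2, std::vector<std::string>{});

      io.run();
  }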

> > I very much hoped for more involvement from the bmcweb maintainers, as
> > solving this issue properly, in a way that doesn't break any usecases
> > obviously requires plenty of domain-specific knowledge (domain here
> > being the extensive OpenBMC D-Bus API, specifically the
> > sensors/inventory part of it).
> 
> I'm not following here.... I (a bmcweb maintainer) have given you
> several pretty lengthy replies about things that could be worked on.
> Is there some other involvement you were hoping for that doesn't
> involve the maintainers just writing up and testing the patches?

Yes, and I appreciate it a lot.

What I hoped for is for somebody who has all the knowledge about the
bmcweb Redfish implementation and the OpenBMC architecture at her or
his fingertips to chime in and say something like: "we need steps M
and N for X and Y reasons, and step K was needed for Z but is
redundant by now, and L should stay for a while until <some> cleanup
is merged." Or "you're barking up the wrong tree, do /this/ and /that/
instead to monitor the health of your servers."

I can keep reading
https://github.com/openbmc/docs/blob/master/architecture/object-mapper.md
and
https://github.com/openbmc/docs/blob/master/architecture/sensor-architecture.md
and
https://github.com/openbmc/docs/blob/master/architecture/LED-architecture.md
for days, along with the relevant source code, and keep experimenting
on Tioga Pass, digging through all your hints, and reading all the
conflicting changes that Gerrit nicely shows for my caching patch, but
that won't ever give me the breadth of insight into all the possible
cases (especially the edge cases) that those who developed and evolved
this architecture have.

Thank you for listening.

-- 
Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software!
mailto:fercerpav at gmail.com