Inconsistent performance of dbus call GetManagedObjects to PSUSensor in dbus-sensors

Sat Aug 8 09:17:20 AEST 2020

This is great!  Thank you for taking the time to type this up.

On Fri, Aug 7, 2020 at 3:42 PM Alex Qiu <xqiu at google.com> wrote:
>
> The setup has a total of 249 IPMI sensors, and among these, dbus-sensors reports 59 objects from HwmonTempSensor and 195 objects from PSUSensor, and we've already decreased the polling rate of PSUSensor to every 10 seconds to mitigate the issue. As the intel-ipmi-oem does, we measure the time of calling GetManagedObjects with commands:

This isn't the biggest sensor usage I've ever seen, but it certainly
is the biggest usage of PSUsensor I've seen sofar.  It's not
surprising you're finding performance issues other people haven't.
PSUSensor was originally supposed to be for physical pmbus power
supplies, but got abstracted a little at some point to be more
generic.

>
> time busctl call xyz.openbmc_project.HwmonTempSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects
> time busctl call xyz.openbmc_project.PSUSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects
>
> The first command for HwmonTempSensor constantly finishes in about 60 ms. However, the run time of the second command for PSUSensor is very inconsistent. Out of 50 continuous runs, most of them finish in about 150 ms, but 2 or 3 of them will take as long as about 6 seconds to return. This results in long time to scan the SDR and inconsistent performance polling IPMI sensors.
>

I don't have a system handy that uses PSUSensor, but based on what
you're saying, I'm going to guess that there's a blocking
io/wait/sleep call that snuck in somewhere in the PSUsensor, and it's
stopping the main reactor for some amount of time.  This is probably
exacerbated by how loaded your system is, which is causing the really
bad tail latencies.

If I were in your shoes, the first thing I would do is to recompile
PSUSensor with IO handler tracking enabled:
https://www.boost.org/doc/libs/1_73_0/doc/html/boost_asio/overview/core/handler_tracking.html

to do that, go here:
https://github.com/openbmc/dbus-sensors/blob/master/CMakeLists.txt#L194

add add a line line like
target_compile_definitions(psusensor PUBLIC
-DBOOST_ASIO_ENABLE_HANDLER_TRACKING)

and recompile.

That's going to print loads of debug info to the console when it runs.
Be prepared.  Rerun your test with the flag enabled.  When your
getmanagedobjects command gets stuck, dump the log and try to find the
spot where io seems to stop for a bit.  Hopefully you'll find one
async operation is taking a looooong time to run.  Most operations
should be in the order of micro/milliseconds for runtime.  Once you
know what the spot is, we can probably triage further.  Each
individual callback is pretty simple, and only does a couple things,
so it should be pretty easy to sort out what's blocking within a given
callback.

My second theory is that because of the async nature of psusensor, if
you get unlucky, 195 concurrent IO completion operations are getting
scheduled right ahead of your GetManagedObjects call.  Right now the
IO scheduling is pretty dumb, and doesn't attempt to add jitter to
randomize the call starts, under the assumption that the reactor will
never be more than 10 or so handles at a given time.  Given the number
of sensors you've got, we might want to rethink that, and try to
spread them out in time a little.  If we wanted to verify this, we
could instrument io_context with a little run_for() magic that breaks
every N milliseconds and prints the size of the queue.  That could
verify that we're running it too large.

Technically I think this is the embedded version of the thundering
herd problem.  There are ways to solve it that should be relatively
easy (if that's what it turns out to be).