Inconsistent performance of dbus call GetManagedObjects to PSUSensor in dbus-sensors

Sui Chen suichen at google.com
Thu Aug 13 10:30:29 AEST 2020


Hello,

Because machine configurations may change during development, we used a
microbenchmark to isolate the cause and reproduce the long DBus latencies
reliably, and a second microbenchmark to demonstrate an approach we tried
that appeared to alleviate, but not completely eliminate, this DBus latency
problem.

The first microbenchmark, dbus-asio-bmk (
https://gerrit.openbmc-project.xyz/c/openbmc/openbmc-tools/+/35576)
mimics our patched psusensor: an ASIO worker that reads sensors at some
fixed timer interval; the ASIO worker is also used by sdbusplus for
handling DBus work. We continuously run "busctl tree" against the DBus
interface created by the microbenchmark binary.
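
The single-worker arrangement described above can be sketched as a small
deterministic model. This is not the dbus-asio-bmk code; it is a hypothetical
illustration in which one worker serves handlers in FIFO order, and every
name and number (Item, run_single_worker, the 400 ms read cost, the 250 ms
call spacing) is an assumption made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    arrival: float  # time the handler is posted to the worker (ms)
    cost: float     # time the handler occupies the worker (ms)
    kind: str       # "sensor" or "dbus"


def run_single_worker(items):
    """Serve items in arrival order on one worker.

    Returns a list of (kind, latency_ms), where latency is measured from
    the moment the item was posted to the moment it finished.
    """
    clock = 0.0
    results = []
    for item in sorted(items, key=lambda i: i.arrival):
        start = max(clock, item.arrival)  # wait if the worker is still busy
        clock = start + item.cost
        results.append((item.kind, clock - item.arrival))
    return results


# Hypothetical workload: a 400 ms sensor read every 1000 ms, plus a 5 ms
# DBus call every 250 ms starting at t = 100 ms.
items = [Item(0, 400, "sensor"), Item(1000, 400, "sensor")]
items += [Item(t, 5, "dbus") for t in (100, 350, 600, 850)]

dbus_latencies = [lat for kind, lat in run_single_worker(items) if kind == "dbus"]
print(dbus_latencies)  # [305.0, 60.0, 5.0, 5.0]
```

In this model the two calls that arrive while the sensor read holds the
worker see latencies of 305 ms and 60 ms, while the calls that arrive when
the worker is idle complete in 5 ms.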

By importing the resultant DBus traffic dump and ASIO handler activity log
into the timeline view of dbus-vis, we can clearly see that the "sensor
reading" ASIO work items block the DBus work items, which in turn causes
very long DBus latencies to show up:

[image: busctlubmk.png]

Although this benchmark was run on an x86 workstation instead of the BMC
(due to a run-time error in its dependencies), the results above show that
this "thundering herd" problem appears to occur on a desktop platform as
well.

As we varied the experimental parameters, it turned out that the more time
is occupied by non-DBus ASIO work, the more likely long DBus latencies are
to happen, since there is a higher chance that a DBus call clashes with the
"fake sensor reading". We therefore hypothesized that if we reduce the time
the single ASIO worker spends in non-DBus ASIO work, DBus latencies will be
reduced.
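
A back-of-the-envelope model of this relationship is sketched below. It
rests on assumptions that are not part of the benchmark: the non-DBus work
occupies one contiguous window of busy_ms in every period of period_ms, DBus
calls arrive uniformly at random, and the calls' own service time is
ignored. Under those assumptions a call clashes with probability
busy/period, and a clashing call waits busy/2 on average. The function name
clash_model is hypothetical.

```python
def clash_model(busy_ms, period_ms):
    """Return (probability of a clash, mean added wait in ms).

    Assumes one contiguous busy window per period and uniformly random
    arrival times; both are simplifying assumptions, not measurements.
    """
    if not 0 <= busy_ms <= period_ms:
        raise ValueError("busy_ms must be between 0 and period_ms")
    p_clash = busy_ms / period_ms
    # P(clash) * E[wait | clash] = (busy/period) * (busy/2)
    mean_wait_ms = busy_ms ** 2 / (2 * period_ms)
    return p_clash, mean_wait_ms


for busy in (100, 200, 400, 800):
    p, wait = clash_model(busy, 1000)
    print(f"busy={busy} ms  P(clash)={p:.2f}  mean added wait={wait:.1f} ms")
```

In this simplified model the mean added wait grows with the square of the
busy time, so shortening the sensor-read window reduces both how often and
how long DBus calls are delayed, but it never reaches zero while any
non-DBus work remains on the worker.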

Based on this assumption, we attempted a few methods to reduce the time it
takes psusensor to read the sensors. The second benchmark (
https://gerrit.openbmc-project.xyz/c/openbmc/openbmc-tools/+/35387)
explains the methods we experimented with. We were able to reduce both the
sensor reading time and the chance of long DBus method calls happening, but
the inconsistent DBus call times did not completely go away. This is
probably because psusensor is much more complex than the two benchmarks,
with a lot of other work contending for the ASIO worker's time.

To summarize, the points of this reply are:
1) We attempted the ASIO handler dump as suggested, and a method for
analyzing DBus+ASIO performance has been embodied in dbus-vis.
2) We are interested to know if someone else is looking at similar problems.
3) We will examine GetManagedObjects again when we get a chance.

Thanks,
Sui
-------------- next part --------------
A non-text attachment was scrubbed...
Name: busctlubmk.png
Type: image/png
Size: 157653 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/openbmc/attachments/20200812/c1cdad44/attachment-0001.png>