phosphor-hwmon bottleneck potential
Patrick Venture
venture at google.com
Fri May 5 02:11:49 AEST 2017
The system upon which I'm presently running openbmc uses phosphor-hwmon to
expose reading of fans and thermal sensors. And its works great for
precisely this function, however, when another program attempts to get the
sensor values over dbus from phosphor-hwmon I run into timing issues that
can likely be addressed.
The driver we're using for the fans requires a time period to measure its
RPMs. I don't know if this is the only driver on earth that does so, but
if it is --then this is only a problem I'll run into. The time period is
about 1 second.
As it stands now, there is one expected problem and one bizarre one that I
haven't traced down.
Firstly, if I try to read a specific fan speed it takes 8s to get a
response from phosphor-hwmon over dbus. This may be that it's reading all
the fan speeds before returning; but I haven't verified it. There are
eight fans on the system being handled by that instance of phosphor-hwmon,
but the 8 seconds could be a coincidence. I'd need to really look at how
it handles dbus requests to determine that that's the issue.
To avoid this, I registered for listen for Sensor.Value updates over dbus,
however, from looking at the round robin sensor update loop, my hypothesis
is that I would receive an update from each fan once every 8 seconds or so
given its first time to run. This was effectively what I saw, except of
course it was an update to the sensor every 9 seconds, because my
hypothesis forgot to account for the one second pause between each loop
iteration.
The solution that comes to mind would be to simply parallelize sensor
updates, such that phosphor-hwmon uses threads to update the sensor values
independently of each other. There are certainly downsides to this
implementation and it won't be 100% optimized because there are switching
overheads with threads, but since my driver read appears to be a
bottleneck, each thread will take their 1 second reading different files
and update the information more rapidly. This also prevents other sensors
under the same configuration from interfering with general updates to the
dbus sensor namespace. My system runs four instances of phosphor-hwmon
because of the layout, but others may run different variations and
experience other problems that could be alleviated by multi-threading.
The other suggestion I have to improve involves adding a timestamp to
phosphos-dbus-interfaces for Sensor.Value to let the recipient of the
information know when it was last updated. Certainly if someone listens
for property updates, they can produce this information themselves, but it
doesn't hurt to provide it to any reader regardless of mechanism and as it
stands now, it feels missing. It's true that some drivers cache the
information, and phosphor-hwmon itself waits one second before looping
around to update, but a way to reduce some amount of ambiguity would be to
provide a timestamp on the information. By adding a timestamp the
information would also come across to the reader as more of a time series,
which is more or less is.
Without multi-threading to remove the timing bottleneck on my platform, and
the inability to provide other more direct interfaces for controlling fans,
I'll be unable to use phosphor-hwmon as a primary source for
reading/writing sensor data and will be unable to upstream any effort i
make to write a pluggable thermal controller.
I felt it was important to thoroughly explain my reasoning as other
engineering teams attempting to use openbmc for their platforms could run
into similar problems and that the decision to develop independently was
data-driven and not arbitrary.
Patrick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ozlabs.org/pipermail/openbmc/attachments/20170504/19381417/attachment.html>
More information about the openbmc
mailing list