Inconsistent performance of dbus call GetManagedObjects to PSUSensor in dbus-sensors

Josh Lehan krellan at google.com
Tue Sep 1 07:35:22 AEST 2020


On Sat, Aug 15, 2020 at 7:05 AM Ed Tanous <ed at tanous.net> wrote:
> On Thu, Aug 13, 2020 at 4:33 PM Sui Chen <suichen at google.com> wrote:
> > > What patches did you apply?
> > This one: https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/30348
>
> Looking at this patch, I get the feeling this is specifically the
> cause of your problems.  Have you tried your things on master yet,
> without that patch, to see if it exhibits the same behavior?  There's
> a good reason each sensor has an independent timer instance;  The
> intent is that when the sensor system gets overloaded, the readings
> get delayed, and each individual timer introduces its own
> load-specific delays as it's started after the read completes.  Said
> another way, technically, a sensor scan rate isn't exactly 1 second,
> it's 1 second + queuing time + the time it takes to read the sensor,
> so if the sensor takes a long time to read (because the system is
> overloaded) the sensor scans don't collapse into one another and cause
> problems like you're seeing, and you still (hopefully) have
> significant idle time in the io_context to handle other things, like
> managing dbus connections.  (as a side note, for i2c based sensors,
> this keeps the latency of other operations on the bus down as well.

That's a good point.

With each sensor having its own free-running timer, which auto-extends
time if the system is delayed as you mentioned, that does provide a
way for the timers to automatically slow down when the system is under
heavy load and can't respond as quickly.

I wonder if it would be worth adding some timing to detect this, and
perhaps try to back off the timer duration somewhat? For example, if
1-second timers are bogging down too much, back the timer off to
2-second timers, and so on, until they have been slowed down enough to
give the system enough time to avoid falling behind.

> Unfortunately, that patch kinda puts all that design on its head.
> "Avoid timer skew" is in the commit message.  That skew is there on
> purpose, to prevent the problem you're seeing.  With that said, I'm
> sure Josh was trying to fix something there, but in the commit message
> he doesn't actually outline what the problem is.  Do you know?  My
> best guess is maybe he was trying to get more accurate scan rates?  We
> could think up designs that have some logic that, in the base case
> handles the loop more accurately but spreading out the sensor scans is
> a good thing in general and reduces the peak work queue size (as
> you've already found out).
> Maybe there were some different performance issues that putting all
> the sensor scans together solved?

The sensor reading was causing the rest of the system to become
unusably slow, even before making any changes. I had 3 goals in mind:

1) The pmbus driver seems to ask the hardware for new data if it has
been long enough (HZ, roughly one second) since the last read, by
comparing the system jiffies clock. As all sensors would have their
own free-running timers, the driver readings would be smeared
throughout time, causing this interval to be quickly and repeatedly
reached. We suspected a driver was being slow, perhaps because of I2C,
so we were seeing a system slowdown every second. By reading more
sensors simultaneously, and then pausing for a while, I was hoping to
avoid triggering this interval as often, thus taking advantage of some
caching. The pmbus driver does a complete read of the chip every time
this interval is triggered, thus if I can satisfy more sensors at a
single interval, I don't need to do as many complete readings.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/hwmon/pmbus/pmbus_core.c#n584

2) In general, it's good practice to read sensors at fixed intervals,
to avoid skew between sensor readings. An example is voltage and
amperage. Ideally, you would sample them simultaneously, so that
computations such as obtaining wattage (volts * amps) would be
correct. If the sensor readings were staggered, and you were reading
something that was changing, your calculated wattage would be
incorrect. Many embedded sensor networks have a command that
broadcasts out to the worker nodes, to "lock" a sensor reading into
local sensor memory, then, as time permits, the leader node can issue
commands individually to collect data from each worker, knowing that
the data was earlier all captured simultaneously. This technique
improves the quality of collected data, and is often used in industry.
I was hoping to bring PSUSensor a little closer to this ideal.

3) By synchronizing to one master timer, it becomes possible to slow
down the sensor collection interval, and introduce some resting time
between sensor data collection. The system became almost unusably slow
during sensor data collection. By introducing a rest time between the
readings, I was able to "PWM" the system load, and thus dial it down
to make the system usable again, by changing the duty cycle to
something like 70% instead of 100%.

I wonder if it would be worth it, instead of having each sensor have
its own free-running timer, to still have a single master time for all
sensor collection, but stagger it? For example, let's say you need to
read 200 sensors, and you need to do a complete reading once every 5
seconds. Have the timer fire at intervals of 15 milliseconds. Collect
one sensor reading each tick. This completes 200 sensors in 3000
milliseconds. The system can then be quiet for the remaining 2000
milliseconds of the interval, giving it a chance to catch up and
remain responsive.

Josh


More information about the openbmc mailing list