Thoughts on performance profiling and tools for OpenBMC
Andrew Jeffery
andrew at aj.id.au
Mon Apr 12 13:12:01 AEST 2021
On Thu, 25 Mar 2021, at 10:58, Andrew Geissler wrote:
>
>
> > On Mar 22, 2021, at 5:05 PM, Sui Chen <suichen at google.com> wrote:
> >
> <snip>
> >
> > [ Proposed Design ]
> >
> > 1. Continue the previous effort [7] on a sensor-reading performance
> > benchmark for the BMC. This will naturally lead to investigation into
> > the lower levels such as I2C and async processing.
> >
> > 2. Try the community’s ideas on performance optimization in benchmarks
> > and measure performance difference. If an optimization generates
> > performance gain, attempt to land it in OpenBMC code.
> >
> > 3. Distill ideas and observations into performance tools. For example,
> > enhance or expand the existing DBus visualizer tool [8].
> >
> > 4. Repeat the process in other areas of BMC performance, such as web
> > request processing.
>
> I had to workaround a lot of performance issues in our first AST2500
> based systems. A lot of the issues were early in the boot of the BMC
> when systemd was starting up all of the different services in parallel
> and things like mapper were introspecting all new D-Bus objects
> showing up on the bus.
>
> Moving from python to c++ applications helped a lot. Changing
> application nice levels was not helpful (too many d-bus commands
> between apps so if one had a higher priority like mapper it would
> timeout waiting for lower priority apps).
>
> AndrewJ and I tried to track some of the issues and tools out on
> this wiki:
> https://github.com/openbmc/openbmc/wiki/Performance-Profiling-in-OpenBMC
Some rambling thoughts:
The wiki page makes a start on this, but I suspect what could be helpful
is a list of tools for capturing and inspecting behaviour at different
levels of the stack. Cribbing from the wiki page a bit:
# Application- and Kernel- Level behaviour
* `strace`
* `perf probe` / `perf record -e ...` (tracepoints, kprobes, uprobes)
* `perf record`: Hot-spot analysis
* Flamegraphs[1]: More hot-spot analysis
[1] http://www.brendangregg.com/flamegraphs.html
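To make the hot-spot workflow concrete, here is a rough sketch of a typical capture-and-render pipeline. It assumes perf is present in the BMC image (e.g. via a tools/profiling image feature) and that the FlameGraph scripts [1] have been cloned on the host; the sample rate, duration, and file names are arbitrary choices, not requirements.

```shell
# Sketch: system-wide hot-spot capture on the BMC.
perf record -F 99 -a -g -- sleep 30   # sample all CPUs at 99 Hz for 30s
perf report --stdio                   # text-mode hot-spot summary

# Flamegraph rendering, typically on the host after copying perf.data
# over; the script paths assume a checkout of the FlameGraph repo [1].
perf script > out.perf
./stackcollapse-perf.pl out.perf | ./flamegraph.pl > bmc-cpu.svg
```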
# Scheduler behaviour
* `perf sched record`
* `perf timechart`
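As a sketch of how the scheduler tools above fit together (durations and output names are arbitrary; rendering the SVG is usually done off-target):

```shell
# Trace scheduler events for 10s, then summarise per-task latency.
perf sched record -- sleep 10
perf sched latency                # wakeup/scheduling latency per task

# Record and render a timechart of CPU/task activity over time.
perf timechart record -- sleep 10
perf timechart -o timechart.svg
```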
# Service behaviour
* `systemd-analyze`
* `systemd-bootchart`
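For service-level boot analysis, the usual `systemd-analyze` invocations look something like this (run on the BMC itself; `plot` output is best viewed on the host):

```shell
systemd-analyze                   # total kernel/userspace boot time
systemd-analyze blame             # per-unit init time, slowest first
systemd-analyze critical-chain    # the time-critical chain of units
systemd-analyze plot > boot.svg   # SVG timeline of unit startup
```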
# D-Bus behaviour
* `busctl capture`
* `wireshark`
* `dbus-pcap`
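A sketch of how these three tools chain together: `busctl capture` emits a pcap stream that both wireshark and dbus-pcap (from the openbmc/openbmc-tools repo) can decode. File names here are arbitrary.

```shell
# On the BMC: capture D-Bus traffic to a pcap file (Ctrl-C to stop).
busctl capture > dbus.pcap

# On the host: decode the capture with dbus-pcap, or open it in wireshark.
./dbus-pcap dbus.pcap
```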
`perf timechart` is a great place to start when you fail to meet timing
requirements in a complex system state.
I'm not sure much of this could be integrated into e.g. the visualiser
tool, but I think making OpenBMC easy to instrument is a step in the
right direction.
Andrew