BMC Performance Profiler

Sui Chen suichen at google.com
Sat Oct 10 05:53:30 AEDT 2020


On Fri, Oct 9, 2020 at 7:38 AM Andrew Geissler <geissonator at gmail.com> wrote:
>
>
>
> > On Oct 8, 2020, at 4:44 PM, Pasha Ghabussi <pashag at google.com> wrote:
> >
> > Hello all,
> >
> > We would really appreciate it if you can take a few minutes to read the proposal sent earlier and let us know your thoughts and suggestions.
> >
> > Thank you
> >
> > On Mon, Oct 5, 2020 at 1:57 PM Pasha Ghabussi <pashag at google.com> wrote:
> > Hello all,
> > We would really appreciate it if you can take a few minutes to read the following proposal and let us know your thoughts and suggestions.
> > We are developing a tool to investigate performance problems by looking at DBus traffic dumps.
>
> I definitely think this could be a very useful tool. Performance issues have hindered us from day 1 with OpenBMC and countless hours have gone into trying to identify the different issues. One area we’ve seen a lot of issues with is on BMC startup, especially after a firmware update. If you could provide a way to enable the needed profiling debug, and then reboot the BMC and capture the data for analysis, it would be appreciated.
>
> > Current DBus inspection and visualization tools do not represent the DBus events similar to a typical performance profiler. Additionally, these tools do not address typical BMC workloads such as IPMI and ASIO. Hence, identifying potential performance problems requires inspecting the raw BMC DBus traffic, which can become a long and complex process. We want to add a graphical interface to webui-vue to visualize the DBus traffic to address the abovementioned problem.
>
> Will you be using something like "busctl capture” to capture the data? I hope you don’t have to write a new tool to get the data?
>

We will use "busctl capture" to capture the data rather than writing a
new capture tool, just like Pasha mentioned.

>
> >
> > There have been DBus and IPMI performance-related discussions in the OpenBMC community, both of which can be helped by this work: IPMI-related issues started to appear as early as in 2017. One issue (#2630) describes a problem related to large numbers of sensors. Its follow-up (#3098) mentions “hostboot crashes due to poor IPMI performance”. Another issue (#2519) describes a commonly-seen problem of IPMI taking very long to respond (> 5s).
> > There are also discussions on Redfish performance on the mailing list; a patch optimized DBus performance by introducing a cache for name translation.
> > All the performance investigations listed above involve DBus and may be helped by this work.
>
> Agreed
>
> >
> > We are planning to use the BMCweb file hosting functionality to access the DBus event dumps and visualize the events in the web UI. The available profiling tools such as dbus-pcap, Wireshark, Bustle, Snyh, or DFeet do not provide the exact functionality we are looking for. Our goal is to develop functionalities similar to other widely used profilers such as GPUView or VTune Profiler.
> >
>
> For the analysis and visualization side, I’m never a big fan of writing something from scratch. Have you looked into enhancing some of the existing tools out there vs. writing your own?
>

One existing tool on the visualization side that resembles what we are
looking for is the ChromeDevTools performance profiler UI
(https://github.com/ChromeDevTools/devtools-frontend/tree/master/front_end/perf_ui).
It can show thousands of events interactively, allowing the user to
pan/scale the timeline and inspect individual events. If we plug this
UI into existing DBus-related tools, we basically get something similar
to the prototype
(https://gerrit.openbmc-project.xyz/c/openbmc/openbmc-tools/+/34263)
but a lot more polished.

The Perf UI mentioned above seems to have many dependencies and is
tightly integrated into Chrome, so we feel it might take less effort to
write a basic implementation from scratch (covering basic
functionality such as timelines and histograms) than to integrate it
into existing tools. In fact, both the Perf UI and the rendering
routine in the prototype use the HTML canvas element for
visualization, which is typically hardware-accelerated and can render
many thousands of objects (basic shapes, images, text, etc.) at
interactive frame rates. The visualization runs in the user's browser
and does not consume the BMC's processing power.
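
As a rough illustration of the pan/scale behavior described above (not
the prototype's actual code), the mapping from event timestamps to
on-screen rectangles could be sketched in Python as follows. The names
Event, Viewport and visible_rects are made up for this example, and the
event record shape is an assumption; the real prototype draws on an HTML
canvas in the browser.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One DBus message interval in seconds (hypothetical record shape)."""
    label: str
    start: float
    end: float


@dataclass
class Viewport:
    """Visible time window mapped onto a canvas that is width_px wide."""
    t_min: float
    t_max: float
    width_px: int

    def to_x(self, t: float) -> float:
        """Map a timestamp to a horizontal pixel coordinate."""
        return (t - self.t_min) / (self.t_max - self.t_min) * self.width_px

    def pan(self, dt: float) -> "Viewport":
        """Shift the visible window by dt seconds."""
        return Viewport(self.t_min + dt, self.t_max + dt, self.width_px)

    def zoom(self, factor: float, center: float) -> "Viewport":
        """Scale the window around `center`; factor > 1 zooms in."""
        return Viewport(center - (center - self.t_min) / factor,
                        center + (self.t_max - center) / factor,
                        self.width_px)


def visible_rects(events, view):
    """Return (label, x, width) for each event overlapping the viewport."""
    rects = []
    for ev in events:
        if ev.end < view.t_min or ev.start > view.t_max:
            continue  # entirely off screen, so nothing to draw
        x0 = max(view.to_x(ev.start), 0.0)
        x1 = min(view.to_x(ev.end), float(view.width_px))
        rects.append((ev.label, x0, max(x1 - x0, 1.0)))  # at least 1 px wide
    return rects
```

Skipping off-screen events before drawing is one simple way to keep
redraws cheap when a dump holds many more events than fit on screen.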

> Although having in the web UI could be useful, I don’t really see it as a requirement. Could your tool be simpler to write or be made more generic for others to use if it was not tied to the web UI?
>

WebUI was considered for the following reasons: 1) web technologies,
in particular the hardware-accelerated HTML canvas, are convenient and
performant enough for the visualization we are looking for, and they
make it very easy to reuse the code in the prototype (which was also
HTML-based); 2) accessing the BMC through WebUI saves the user the
trouble of having to manually start DBus capture and transfer the dump
file back to the host for visualization; 3) it might make it
easier to integrate this tool with future technologies such as
Redfish.

To untie the visualizer from the WebUI, there could be a few alternatives:
1) visualize the data using a text-based UI. In that case, the tool
would function similarly to tools like "top".
2) generate the visualization in SVG or HTML format similarly to FlameGraph.
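
To make alternative 2) a bit more concrete, a standalone SVG timeline
could be produced along the lines of the Python sketch below. This is
only an assumption about how such an export might look; the function
name timeline_svg and the (label, start, end) tuple layout are invented
for the example and are not part of the prototype or of dbus-pcap.

```python
from xml.sax.saxutils import escape


def timeline_svg(events, width=800, row_height=20):
    """Render (label, start, end) tuples as an SVG with one row per event."""
    if not events:
        return '<svg xmlns="http://www.w3.org/2000/svg" width="0" height="0"></svg>'
    t_min = min(start for _, start, _ in events)
    t_max = max(end for _, _, end in events)
    span = (t_max - t_min) or 1.0  # avoid dividing by zero for a single instant
    height = row_height * len(events)
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{width}" height="{height}">']
    for row, (label, start, end) in enumerate(events):
        x = (start - t_min) / span * width
        w = max((end - start) / span * width, 1.0)  # keep short events visible
        y = row * row_height
        # <title> gives a hover tooltip in most SVG viewers
        parts.append(
            f'<rect x="{x:.1f}" y="{y}" width="{w:.1f}" '
            f'height="{row_height - 2}" fill="steelblue">'
            f'<title>{escape(label)}</title></rect>')
    parts.append('</svg>')
    return "\n".join(parts)
```

The resulting string can be written to a .svg file and opened in any
browser, so nothing beyond the dump file would be needed from the BMC.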

In any case, the visualization part and the integrated performance
profiling experience would be our main contribution. They are the extra
step we are taking on top of existing text-based tools like dbus-pcap
(which already parses DBus dumps and which the prototype depends on).

> > One alternative solution considered was to stream DBus requests over websocket, but the existing websocket endpoints available on BMC webserver do not provide the exact information we need.
> >
> > Requirements and Scalability:
> >       • Should provide the adequate functionalities to filter, visualize the events timeline, and group the DBus traffic based on multiple criteria such as type, source, destination, path, interface, demon signatures, and more.
> >       • Should support capture of DBus messages using as little resources as possible.
> >       • Should be able to show many (~thousands of) entries on screen simultaneously
> >       • Integration with webui-vue
> >
> > Thank you
>
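
For the filtering and grouping requirement quoted above, a minimal
Python sketch of the idea might look like this. The Message fields and
the helper names filter_messages and group_by are hypothetical and only
meant to illustrate the criteria listed (type, source, destination,
path, interface); they do not reflect the output format of any existing
tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical parsed DBus message; field names are illustrative."""
    msg_type: str      # e.g. "method_call" or "signal"
    sender: str
    destination: str
    path: str
    interface: str
    member: str


def filter_messages(messages, **criteria):
    """Keep messages whose fields equal every given criterion."""
    return [m for m in messages
            if all(getattr(m, key) == value for key, value in criteria.items())]


def group_by(messages, field):
    """Group messages by one field, keeping capture order within each group."""
    groups = defaultdict(list)
    for m in messages:
        groups[getattr(m, field)].append(m)
    return dict(groups)
```

For example, filter_messages(msgs, msg_type="signal") followed by
group_by(result, "interface") would give one timeline row per interface
that emitted signals.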

