BMC Performance Profiler

Andrew Geissler geissonator at gmail.com
Sat Oct 10 01:37:21 AEDT 2020



> On Oct 8, 2020, at 4:44 PM, Pasha Ghabussi <pashag at google.com> wrote:
> 
> Hello all,
> 
> We would really appreciate it if you can take a few minutes to read the proposal sent earlier and let us know your thoughts and suggestions.
> 
> Thank you
> 
> On Mon, Oct 5, 2020 at 1:57 PM Pasha Ghabussi <pashag at google.com> wrote:
> Hello all,
> We would really appreciate it if you can take a few minutes to read the following proposal and let us know your thoughts and suggestions.
> We are developing a tool to investigate performance problems by looking at DBus traffic dumps.

I definitely think this could be a very useful tool. Performance issues have hindered us from day 1 with OpenBMC, and countless hours have gone into trying to track them down. One area where we’ve seen a lot of issues is BMC startup, especially after a firmware update. If you could provide a way to enable the needed profiling, then reboot the BMC and capture the data for analysis, that would be much appreciated.

> Current DBus inspection and visualization tools do not represent the DBus events similar to a typical performance profiler. Additionally, these tools do not address typical BMC workloads such as IPMI and ASIO. Hence, identifying potential performance problems requires inspecting the raw BMC DBus traffic, which can become a long and complex process. We want to add a graphical interface to webui-vue to visualize the DBus traffic to address the abovementioned problem.

Will you be using something like "busctl capture" to capture the data? I hope you don’t have to write a new tool to get the data.


> 
> There have been DBus and IPMI performance-related discussions in the OpenBMC community, both of which can be helped by this work: IPMI-related issues started to appear as early as 2017. One issue (#2630) describes a problem related to large numbers of sensors. Its follow-up (#3098) mentions “hostboot crashes due to poor IPMI performance”. Another issue (#2519) describes a commonly-seen problem of IPMI taking very long to respond (> 5s).
> There are also discussions on Redfish performance on the mailing list; a patch optimized DBus performance by introducing a cache for name translation.
> All the performance investigations listed above involve DBus and may be helped by this work.

Agreed

> 
> We are planning to use the bmcweb file hosting functionality to access the DBus event dumps and visualize the events in the web UI. The available profiling tools such as dbus-pcap, Wireshark, Bustle, Snyh, or D-Feet do not provide the exact functionality we are looking for. Our goal is to develop functionality similar to other widely used profilers such as GPUView or VTune Profiler.
> 

For the analysis and visualization side, I’ve never been a big fan of writing something from scratch. Have you looked into enhancing some of the existing tools out there vs. writing your own?

Although having it in the web UI could be useful, I don’t really see it as a requirement. Could your tool be simpler to write, or be made more generic for others to use, if it were not tied to the web UI?

> One alternative solution considered was to stream DBus requests over websocket, but the existing websocket endpoints available on BMC webserver do not provide the exact information we need.
> 
> Requirements and Scalability:
> 	• Should provide the adequate functionalities to filter, visualize the events timeline, and group the DBus traffic based on multiple criteria such as type, source, destination, path, interface, demon signatures, and more.
> 	• Should support capture of DBus messages using as few resources as possible.
> 	• Should be able to show many (~thousands of) entries on screen simultaneously
> 	• Integration with webui-vue
> 
> Thank you
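
As a rough illustration of the filter/group requirement above, here is a minimal Python sketch that does not depend on the web UI. Everything in it is an assumption made for illustration: the BusMessage record, its field names, the helper functions, and the sample traffic are all hypothetical, and a real tool would first have to parse a capture into records like these.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BusMessage:
    """One captured DBus message (hypothetical, simplified record)."""
    timestamp: float   # seconds since the start of the capture
    msg_type: str      # "method_call", "method_return", "signal" or "error"
    sender: str
    destination: str
    path: str
    interface: str
    member: str


def filter_messages(messages, **criteria):
    """Keep only messages whose fields equal every given criterion."""
    return [
        m for m in messages
        if all(getattr(m, field) == value for field, value in criteria.items())
    ]


def group_by(messages, field):
    """Group messages by one field, keeping capture order inside each group."""
    groups = defaultdict(list)
    for m in messages:
        groups[getattr(m, field)].append(m)
    return dict(groups)


# Made-up sample traffic, not taken from a real BMC.
SAMPLE = [
    BusMessage(0.000, "method_call", ":1.12", ":1.5",
               "/com/example/sensors/temp0",
               "org.freedesktop.DBus.Properties", "Get"),
    BusMessage(0.004, "method_return", ":1.5", ":1.12", "", "", ""),
    BusMessage(0.010, "signal", ":1.5", "",
               "/com/example/sensors/temp0",
               "org.freedesktop.DBus.Properties", "PropertiesChanged"),
    BusMessage(0.020, "method_call", ":1.12", ":1.5",
               "/com/example/sensors/fan0",
               "org.freedesktop.DBus.Properties", "Get"),
]


if __name__ == "__main__":
    # Count messages per type, then list the Get calls sent by ":1.12".
    for msg_type, msgs in group_by(SAMPLE, "msg_type").items():
        print(msg_type, len(msgs))
    for m in filter_messages(SAMPLE, sender=":1.12", member="Get"):
        print(m.timestamp, m.path)
```

Keeping the filtering and grouping in plain functions like this would let the same logic sit behind a command-line tool or a web front end.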


