netipmid consumes much CPU when obmc-console socket is shutdown
Ed Tanous
ed at tanous.net
Sat Jan 29 05:33:52 AEDT 2022
On Fri, Jan 14, 2022 at 6:08 AM Heyi Guo <guoheyi at linux.alibaba.com> wrote:
>
> Hi Ed,
>
> Thanks for your advice. I'll give it a try later. But I'm still curious why
> Boost's read_some() function returns 0 bytes of data and no error code,
> which clearly seems to contradict the reference documentation.
Like I said before, my guess is it's related to the fact that you're
combining an async_wait with a read_some in a way that asio didn't
intend in an evented system.
>
> Thanks,
>
> Heyi
>
> On 2022/1/6 at 12:45 PM, Ed Tanous wrote:
> > On Tue, Jan 4, 2022 at 6:31 PM Heyi Guo <guoheyi at linux.alibaba.com> wrote:
> >> Hi all,
> >>
> >> We found that netipmid consumes a lot of CPU when SOL is activated but
> >> the obmc-console socket is shut down for some reason (obmc-console can
> >> simply be shut down with systemctl stop ....).
> >>
> >> After the obmc-console socket is closed, the async_wait() in
> >> startHostConsole() is triggered continuously, and consoleInputHandler()
> >> reads empty data (readSize == 0 and readDataLen == 0), but none of the
> >> ec condition checks are hit!
> >>
> >> According to the Boost reference, the read_some() function behaves as
> >> follows:
> >>
> >> The function call will block until one or more bytes of data has been
> >> read successfully, or until an error occurs.
> >>
> >> Is this a bug in Boost? Or is there something wrong in ipmi-net? And how
> >> can we make netipmid more robust against an obmc-console socket shutdown?
> >>
> > I don't have much knowledge of IPMI, but coming from a lot of experience
> > with boost and asio, that usage looks odd. Instead of the
> > consoleSocket.async_wait done here:
> > https://github.com/openbmc/phosphor-net-ipmid/blob/12d199b27764496bfff8a45661239b1e509c336f/sol/sol_manager.cpp#L92
> > which then calls into a blocking read_some on the socket, I would've
> > expected a consoleSocket.async_read_some with a given buffer, to reduce
> > the number of system calls and to read out partial data as it becomes
> > available. I can't say whether it would behave differently in this
> > case, but doing things the expected way and letting asio handle the
> > I/O has netted us good results in other applications in the past.
> >
> > Another interesting thing is the use of std::deque for the console
> > buffer type here.
> > https://github.com/openbmc/phosphor-net-ipmid/blob/d4a4bed525f79c39705fa526b20ab663bb2c2069/sol/console_buffer.hpp#L12
> >
> > I would've expected to see one of the streaming buffer types like
> > flat_buffer (https://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/ref/boost__beast__flat_buffer.html)
> > or multi_buffer
> > (https://www.boost.org/doc/libs/1_78_0/libs/beast/doc/html/beast/ref/boost__beast__multi_buffer.html),
> > which are designed for exactly what's being done here: streaming
> > variable-length data in and out of a pipe. They can be read into and
> > written out of directly, without the extra copy. Additionally,
> > deque<uint8_t> is going to have a lot of memory overhead compared to a
> > flat buffer type.
> >
> > Not sure if any of the above is helpful to you or not, but it might
> > give you some things to try.
> >
> >> Thanks,
> >>
> >> Heyi
> >>
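
To illustrate the buffer point in the quoted message above, here is a rough
sketch of the prepare/commit/consume idea behind a contiguous stream buffer.
FlatBuffer and its members are hypothetical names; this is a simplified
stand-in written with the standard library only, not Beast's flat_buffer
implementation or its exact API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical minimal contiguous stream buffer (illustration only).
class FlatBuffer
{
  public:
    // Reserve n writable bytes directly after the readable region and
    // return a pointer to them, so a socket read can fill them in place.
    std::uint8_t* prepare(std::size_t n)
    {
        if (readPos > 0)
        {
            // Drop bytes that were already consumed so storage stays compact.
            storage.erase(storage.begin(),
                          storage.begin() +
                              static_cast<std::ptrdiff_t>(readPos));
            writePos -= readPos;
            readPos = 0;
        }
        storage.resize(writePos + n);
        return storage.data() + writePos;
    }

    // Mark the first n prepared bytes as readable.
    void commit(std::size_t n)
    {
        assert(writePos + n <= storage.size());
        writePos += n;
    }

    // Contiguous view of the readable bytes.
    const std::uint8_t* data() const { return storage.data() + readPos; }
    std::size_t size() const { return writePos - readPos; }

    // Discard up to n readable bytes from the front.
    void consume(std::size_t n) { readPos += std::min(n, size()); }

  private:
    std::vector<std::uint8_t> storage;
    std::size_t readPos = 0;
    std::size_t writePos = 0;
};
```

A reader would call prepare() to get writable space, let the socket fill it,
commit() the number of bytes actually read, and later consume() what has been
sent on. The readable bytes stay in one contiguous block, which is the
property that lets data move in and out without an intermediate copy.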