netipmid consumes much CPU when the obmc-console socket is shut down

Heyi Guo guoheyi at linux.alibaba.com
Tue Feb 1 14:06:01 AEDT 2022


On 2022/1/29 2:33 AM, Ed Tanous wrote:
> On Fri, Jan 14, 2022 at 6:08 AM Heyi Guo <guoheyi at linux.alibaba.com> wrote:
>> Hi Ed,
>>
>> Thanks for your advice. I'll give it a try later. But I'm still curious
>> why the boost read_some() function returns 0 bytes of data and no error
>> code, which clearly seems to contradict the reference documentation.
> Like I said before, my guess is it's related to the fact that you're
> combining an async_wait with a read_some in a way that asio didn't
> intend in an evented system.

Thanks, I'll give this a try.

Heyi


>
>> Thanks,
>>
>> Heyi
>>
>> On 2022/1/6 12:45 PM, Ed Tanous wrote:
>>> On Tue, Jan 4, 2022 at 6:31 PM Heyi Guo <guoheyi at linux.alibaba.com> wrote:
>>>> Hi all,
>>>>
>>>> We found that netipmid consumes a lot of CPU when SOL is activated but
>>>> the obmc-console socket has been shut down for some reason (this can be
>>>> reproduced by stopping obmc-console with systemctl stop ....).
>>>>
>>>> After the obmc-console socket is closed, the async_wait() in
>>>> startHostConsole() is triggered continuously, and consoleInputHandler()
>>>> reads empty data (readSize == 0 and readDataLen == 0), yet none of the
>>>> ec error checks is hit!
>>>>
>>>> According to the boost reference, the function read_some() behaves as
>>>> follows:
>>>>
>>>> The function call will block until one or more bytes of data has been
>>>> read successfully, or until an error occurs.
>>>>
>>>> Is this a bug in boost? Or is something wrong in ipmi-net? And how
>>>> can we make netipmid more robust against obmc-console socket shutdown?
>>>>
>>> I don't know much about IPMI, but I do know boost and asio well, and
>>> that usage looks odd.  The code does a consoleSocket.async_wait here:
>>> https://github.com/openbmc/phosphor-net-ipmid/blob/12d199b27764496bfff8a45661239b1e509c336f/sol/sol_manager.cpp#L92
>>> and the handler then calls a blocking read_some on the socket.  I
>>> would've expected a consoleSocket.async_read_some with a given buffer
>>> instead, to reduce the number of system calls and to read out partial
>>> data as it becomes available.  Whether that would behave differently in
>>> this case, I can't say, but doing things the expected way and letting
>>> asio handle it has netted us good results in other applications.
>>>
>>> Another interesting thing is the use of std::deque for the console
>>> buffer type here.
>>> https://github.com/openbmc/phosphor-net-ipmid/blob/d4a4bed525f79c39705fa526b20ab663bb2c2069/sol/console_buffer.hpp#L12
>>>
>>> I would've expected to see one of the streaming buffer types like
>>> flat_buffer (https://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/ref/boost__beast__flat_buffer.html)
>>> or multi_buffer
>>> (https://www.boost.org/doc/libs/1_78_0/libs/beast/doc/html/beast/ref/boost__beast__multi_buffer.html),
>>> which are designed for exactly what's being done here, streaming data
>>> in and out of a pipe of variable lengths, and can be streamed into and
>>> out of directly without having the extra copy.  Additionally,
>>> deque<uint8_t> is going to have a lot of memory overhead compared to a
>>> flat buffer type.
>>>
>>> Not sure if any of the above is helpful to you or not, but it might
>>> give you some things to try.
>>>
>>>> Thanks,
>>>>
>>>> Heyi
>>>>
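
The symptom described in the thread (repeated wake-ups, readSize == 0,
readDataLen == 0, no error) matches how a stream socket behaves at the OS
level after its peer closes: the descriptor stays readable, the
available-byte count is 0, and read() returns 0 to signal end-of-file
rather than failing. The sketch below shows this with plain POSIX calls
on a socketpair. It is not the phosphor-net-ipmid or asio code;
WakeupResult, observeOnce and peerCloseGivesEndlessEmptyReads are
hypothetical names made up for this illustration, and the assumption that
netipmid's handler sees the same condition is only a guess based on the
values reported above.

```cpp
// Sketch only: plain POSIX calls, not the netipmid or asio implementation.
#include <poll.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>

// What one wake-up of a readiness-driven handler observes on a socket.
struct WakeupResult
{
    bool readable;      // poll() reported POLLIN
    int bytesAvailable; // result of the FIONREAD query
    ssize_t readResult; // return value of read()
};

// Hypothetical helper: wait for readiness once, then query and read.
WakeupResult observeOnce(int fd)
{
    WakeupResult r{false, -1, -1};

    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    if (poll(&p, 1, 1000) > 0 && (p.revents & POLLIN))
    {
        r.readable = true;
    }

    // Number of bytes that could be read without blocking.
    ioctl(fd, FIONREAD, &r.bytesAvailable);

    char buf[64];
    r.readResult = read(fd, buf, sizeof(buf));
    return r;
}

// Hypothetical demo: close one end of a connected pair and observe the
// other end twice. Returns true if both wake-ups look like "readable,
// 0 bytes available, read() == 0", i.e. the condition never clears.
bool peerCloseGivesEndlessEmptyReads()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    {
        return false;
    }
    close(fds[1]); // stands in for obmc-console going away

    WakeupResult first = observeOnce(fds[0]);
    WakeupResult second = observeOnce(fds[0]);
    close(fds[0]);

    auto isEof = [](const WakeupResult& w) {
        return w.readable && w.bytesAvailable == 0 && w.readResult == 0;
    };
    return isEof(first) && isEof(second);
}
```

Because the readable condition never clears, a handler that re-arms its
wait without treating a zero-length result as end-of-file will be invoked
in a tight loop. One possible mitigation, under the same assumptions, is
to treat "woken up with nothing to read" as a closed connection and stop
waiting on (or reconnect) the socket.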

