[entity-manager] Issue about entity-manager getting stuck

Ed Tanous edtanous at google.com
Thu Feb 4 03:11:02 AEDT 2021


On Tue, Feb 2, 2021 at 11:22 PM Scron Chang (張仲延)
<Scron.Chang at quantatw.com> wrote:
>
> Hi Ed,
> Thanks for your reply.
>
> In my case, I have a script that uses the following command to check the host status and then resets the PECI module based on the result.
> busctl get-property xyz.openbmc_project.State.Chassis /xyz/openbmc_project/state/os xyz.openbmc_project.State.OperatingSystem.Status OperatingSystemState

This sounds like something that should be using a match expression
rather than polling every second.  If you did that, your problem would
likely go away (and your system would be better off as a whole for
it).
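
As a rough illustration of what such a match expression could look like,
the sketch below builds a match rule for PropertiesChanged on the object
path and interface from your busctl command. The helper name
properties_changed_rule is made up for this example, and it assumes the
property change is announced through the standard
org.freedesktop.DBus.Properties interface.

```python
def properties_changed_rule(path: str, interface: str) -> str:
    """Build a D-Bus match rule for PropertiesChanged on one object and interface.

    Hypothetical helper for illustration only; not part of any OpenBMC code.
    """
    parts = {
        "type": "signal",
        "interface": "org.freedesktop.DBus.Properties",
        "member": "PropertiesChanged",
        "path": path,
        # The first argument of PropertiesChanged names the interface
        # whose properties changed, so arg0 narrows the match to it.
        "arg0": interface,
    }
    return ",".join(f"{key}='{value}'" for key, value in parts.items())


rule = properties_changed_rule(
    "/xyz/openbmc_project/state/os",
    "xyz.openbmc_project.State.OperatingSystem.Status",
)
print(rule)
```

A rule like this can be handed to whatever D-Bus client the script uses
(dbus-monitor, for instance, accepts match rules as arguments), so the
script wakes up only when the property actually changes instead of
opening a new bus connection every second.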

>
> Now I understand why entity-manager catches the nameOwnerChanged signal.
> However, please allow me to ask one further question. How does entity-manager decide how long to wait for the system to become ready? According to the source code, the current waiting time is 5 seconds.
> (Please refer to this line:
> https://github.com/openbmc/entity-manager/blob/f094125cd3bdbc8737dc8035a6e9ac252f6e8840/src/EntityManager.cpp#L1687)
>
> If the waiting time were changed to 1 second, entity-manager would respond faster and would rarely get stuck. I found that entity-manager did use 1 second before this PR.
> (Please refer to this PR:
> https://gerrit.openbmc-project.xyz/c/openbmc/entity-manager/+/25193)
>
> The PR itself does not say much. May I ask why the waiting time was changed, and what the concerns would be if entity-manager used a shorter waiting time?

All the properties-changed events can take more than a second to
process.  5 seconds is on the safe side.
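
To make the trade-off concrete, here is a small simulation of a
restartable timer, which is roughly how the wait behaves as described in
this thread: every event aborts the pending wait and starts a new one.
This is only a sketch. The function name debounce_fire_time and the 1.2
second event spacing (standing in for "busctl; sleep 1") are assumptions
for illustration, not taken from the entity-manager source.

```python
from typing import List, Optional


def debounce_fire_time(event_times: List[float], wait: float) -> Optional[float]:
    """Return when a restartable timer expires, or None if there were no events.

    Each event cancels the pending timer and starts a new one, so the timer
    can only expire once `wait` seconds pass without another event.
    """
    deadline = None
    for t in sorted(event_times):
        if deadline is not None and t >= deadline:
            # The previous timer expired before this event arrived.
            return deadline
        # Restart: the old wait is aborted and a new one begins.
        deadline = t + wait
    return deadline


# Hypothetical load: a new bus connection roughly every 1.2 seconds.
events = [i * 1.2 for i in range(50)]

# With a 5 second wait the timer cannot expire until the events stop.
print(debounce_fire_time(events, wait=5.0))

# With a 1 second wait it expires in the gap between the first two events.
print(debounce_fire_time(events, wait=1.0))
```

So a shorter wait does let the timer expire between polls, but it also
means the configuration logic may run before the burst of
properties-changed events has been processed, which is the reason for
the larger value.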


PS, please don't top post.   This mailing list prefers inline replies.

>
> Scron Chang
> E-Mail  Scron.Chang at quantatw.com
> Ext.    11936
>
>
> -----Original Message-----
> From: Ed Tanous <edtanous at google.com>
> Sent: Thursday, January 28, 2021 1:07 AM
> To: Scron Chang (張仲延) <Scron.Chang at quantatw.com>
> Cc: openbmc at lists.ozlabs.org
> Subject: Re: [entity-manager] Issue about entity-manager getting stuck
>
> On Tue, Jan 26, 2021 at 10:34 PM Scron Chang (張仲延)
> <Scron.Chang at quantatw.com> wrote:
> >
> > Hi all,
> >
> > I am using openbmc/entity-manager at this version: "f094125cd3bdbc8737dc8035a6e9ac252f6e8840" and I found that calling D-Bus makes entity-manager get stuck.
> >
> > Reproduce this by following steps:
> > 1. systemctl stop xyz.openbmc_project.EntityManager
> > 2. open another terminal and do this while-loop:
> >    "while true; do busctl ; sleep 1; done"
> > 3. systemctl start xyz.openbmc_project.EntityManager
> >
> > I think the root cause is this function: "nameOwnerChangedMatch."
> > (Please refer to this line:
> > https://github.com/openbmc/entity-manager/blob/f094125cd3bdbc8737dc8035a6e9ac252f6e8840/src/EntityManager.cpp#L1859)
>
> My first thought is: Don't run an empty busctl in a loop then, but I'm guessing that's not what you're really trying to do.  If we had more ideas about what you were really hoping to accomplish, we might have some better advice for how to proceed.
>
> The intent of that code is to reconfigure entity-manager when interfaces are changed, so if you're constantly attaching and detaching to dbus, entity-manager (and object manager) never sees the system as "up" and keeps waiting for the system to finish stabilizing before it runs the config logic.
>
> In your specific case above, the code could be a little smarter, and ignore unique names in that check, only caring about newly-defined well known names, but without knowing your real use case, it's hard to know if that would help.
>
> >
> > Calling D-Bus manually or from a script emits a NameOwnerChanged signal and thus triggers the function "propertiesChangedCallback" repeatedly. Meanwhile, the async_wait in propertiesChangedCallback returns with operation_aborted.
>
> Personal opinion: Don't call busctl continuously in a script.  It's inefficient, and causes problems like this.
>
> > So here is the conclusion:
> > Calling D-Bus at intervals shorter than 5 seconds makes entity-manager keep starting a new async_wait and aborting the old one, so the async_wait never completes.
> >
> > Is this a bug in entity-manager, or am I getting something wrong? Please help me with this.
>
> IMO, entity-manager is working as intended, but let's try to figure out what you're really trying to do, and see if we can find you a solution.
>
> >
> > Scron Chang
> > E-Mail  Scron.Chang at quantatw.com
> >

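On the idea quoted above of ignoring unique names and only caring about
newly-defined well-known names: a minimal sketch of such a filter might
look like the following. The function names and the policy itself are
hypothetical and are not what entity-manager does today. The sketch
relies only on the D-Bus convention that unique connection names begin
with ':' and that NameOwnerChanged carries (name, old owner, new owner).

```python
def is_well_known_name(bus_name: str) -> bool:
    """Unique connection names start with ':' (e.g. ':1.42'); well-known names do not."""
    return not bus_name.startswith(":")


def should_rescan(name: str, old_owner: str, new_owner: str) -> bool:
    """Decide whether a NameOwnerChanged signal should restart the settle timer.

    Hypothetical policy: react only when a well-known name gains an owner,
    and ignore short-lived clients that only ever hold a unique name.
    """
    return is_well_known_name(name) and new_owner != ""


# A transient client such as a bare busctl call only has a unique name.
print(should_rescan(":1.57", "", ":1.57"))

# A daemon claiming its well-known name.
print(should_rescan("xyz.openbmc_project.EntityManager", "", ":1.60"))
```

With a filter along these lines, a loop of short-lived clients would no
longer keep restarting the wait, while services that claim a well-known
name would still trigger it.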