openbmc Digest, Vol 60, Issue 27

Manikandan manikandan.hcl.ers.epl at gmail.com
Sat Aug 8 19:29:46 AEST 2020


On Sat, Aug 08, 2020 at 12:00:04PM +1000, openbmc-request at lists.ozlabs.org wrote:
> 
> Today's Topics:
> 
>    1. Re: Inconsistent performance of dbus call GetManagedObjects
>       to PSUSensor in dbus-sensors (Ed Tanous)
>    2. RE: system power control (Zhao Kun)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Fri, 7 Aug 2020 16:17:20 -0700
> From: Ed Tanous <ed at tanous.net>
> To: Alex Qiu <xqiu at google.com>
> Cc: OpenBMC Maillist <openbmc at lists.ozlabs.org>, James Feist
> 	<james.feist at linux.intel.com>,  Peter Lundgren
> 	<peterlundgren at google.com>, Josh Lehan <krellan at google.com>,  Jason
> 	Ling <jasonling at google.com>, Sui Chen <suichen at google.com>, Jie Yang
> 	<jjy at google.com>, Drew Macrae <drewmacrae at google.com>
> Subject: Re: Inconsistent performance of dbus call GetManagedObjects
> 	to PSUSensor in dbus-sensors
> Message-ID:
> 	<CACWQX82Or8bnTA8WDqrogpp16vEff7PoEB4ZK4b3tFwYKWSQZQ at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
> 
> This is great!  Thank you for taking the time to type this up.
> 
> On Fri, Aug 7, 2020 at 3:42 PM Alex Qiu <xqiu at google.com> wrote:
> >
> > The setup has a total of 249 IPMI sensors; among these, dbus-sensors reports 59 objects from HwmonTempSensor and 195 objects from PSUSensor, and we've already decreased the polling rate of PSUSensor to every 10 seconds to mitigate the issue. Following what intel-ipmi-oem does, we measure the time of calling GetManagedObjects with these commands:
> 
> This isn't the biggest sensor usage I've ever seen, but it certainly
> is the biggest usage of PSUSensor I've seen so far.  It's not
> surprising you're finding performance issues other people haven't.
> PSUSensor was originally supposed to be for physical pmbus power
> supplies, but got abstracted a little at some point to be more
> generic.
> 
> >
> > time busctl call xyz.openbmc_project.HwmonTempSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects
> > time busctl call xyz.openbmc_project.PSUSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects
> >
> > The first command for HwmonTempSensor consistently finishes in about 60 ms. However, the run time of the second command for PSUSensor is very inconsistent. Out of 50 consecutive runs, most finish in about 150 ms, but 2 or 3 of them take as long as about 6 seconds to return. This results in a long time to scan the SDR and inconsistent performance when polling IPMI sensors.
> >
> 
> I don't have a system handy that uses PSUSensor, but based on what
> you're saying, I'm going to guess that there's a blocking
> io/wait/sleep call that snuck in somewhere in PSUSensor, and it's
> stopping the main reactor for some amount of time.  This is probably
> exacerbated by how loaded your system is, which is causing the really
> bad tail latencies.
> 
> If I were in your shoes, the first thing I would do is to recompile
> PSUSensor with IO handler tracking enabled:
> https://www.boost.org/doc/libs/1_73_0/doc/html/boost_asio/overview/core/handler_tracking.html
> 
> To do that, go here:
> https://github.com/openbmc/dbus-sensors/blob/master/CMakeLists.txt#L194
> 
> add a line like
> 
> target_compile_definitions(psusensor PUBLIC
>     -DBOOST_ASIO_ENABLE_HANDLER_TRACKING)
> 
> and recompile.
> 
> That's going to print loads of debug info to the console when it runs,
> so be prepared.  Rerun your test with the flag enabled.  When your
> GetManagedObjects command gets stuck, dump the log and try to find the
> spot where io seems to stop for a bit.  Hopefully you'll find one
> async operation that is taking a very long time to run.  Most
> operations should be on the order of micro/milliseconds of runtime.
> Once you know where the spot is, we can probably triage further.  Each
> individual callback is pretty simple, and only does a couple things,
> so it should be pretty easy to sort out what's blocking within a given
> callback.
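> 
> To make that concrete, here's a minimal standalone sketch (not code
> taken from PSUSensor; the timers and delays are made up purely for
> illustration) of how a single blocking call inside one handler stalls
> the entire io_context, which is exactly the kind of gap that handler
> tracking makes visible between a handler's enter and exit records:
> 
>     // Define the macro before including Asio so every handler's
>     // lifetime is logged to stderr.
>     #define BOOST_ASIO_ENABLE_HANDLER_TRACKING
>     #include <boost/asio.hpp>
>     #include <chrono>
>     #include <thread>
> 
>     int main()
>     {
>         boost::asio::io_context io;
> 
>         boost::asio::steady_timer blocking(io, std::chrono::milliseconds(10));
>         blocking.async_wait([](const boost::system::error_code&) {
>             // Simulates a sensor read that blocks (say, a slow sysfs or
>             // i2c access).  While this sleeps, nothing else on the
>             // io_context, including a D-Bus method reply, can run.
>             std::this_thread::sleep_for(std::chrono::seconds(2));
>         });
> 
>         boost::asio::steady_timer waiting(io, std::chrono::milliseconds(20));
>         waiting.async_wait([](const boost::system::error_code&) {
>             // Ready at t=20ms, but the tracking output will show it
>             // doesn't actually run until the handler above returns at ~2s.
>         });
> 
>         io.run();
>         return 0;
>     }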
> 
> 
> My second theory is that, because of the async nature of PSUSensor, if
> you get unlucky, 195 concurrent IO completion operations are getting
> scheduled right ahead of your GetManagedObjects call.  Right now the
> IO scheduling is pretty dumb, and doesn't attempt to add jitter to
> randomize the call starts, under the assumption that the reactor will
> never have more than 10 or so handlers pending at a given time.  Given
> the number of sensors you've got, we might want to rethink that, and
> try to spread them out in time a little.  If we wanted to verify this,
> we could instrument the io_context with a little run_for() magic that
> breaks every N milliseconds and prints the size of the queue.  That
> would verify whether we're letting the queue grow too large.
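> 
> A rough sketch of what that instrumentation could look like follows.
> This is hypothetical code, not something that exists in dbus-sensors
> today, and since Asio doesn't expose the io_context queue length
> directly, it logs the number of handlers run per slice and flags
> slices that badly overrun their budget instead:
> 
>     // Hypothetical replacement for a plain io.run(): run the reactor in
>     // fixed slices and report slices that take far longer than requested,
>     // which points at either a blocked handler or an overloaded queue.
>     #include <boost/asio.hpp>
>     #include <chrono>
>     #include <cstddef>
>     #include <iostream>
> 
>     void runInstrumented(boost::asio::io_context& io)
>     {
>         using clock = std::chrono::steady_clock;
>         constexpr auto slice = std::chrono::milliseconds(100);
> 
>         while (!io.stopped())
>         {
>             auto start = clock::now();
>             std::size_t handlersRun = io.run_for(slice);
>             auto elapsed =
>                 std::chrono::duration_cast<std::chrono::milliseconds>(
>                     clock::now() - start);
> 
>             if (elapsed > 2 * slice)
>             {
>                 std::cerr << "reactor slice overran: " << elapsed.count()
>                           << " ms, handlers run: " << handlersRun << "\n";
>             }
>         }
>     }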
> 
> Technically I think this is the embedded version of the thundering
> herd problem.  There are ways to solve it that should be relatively
> easy (if that's what it turns out to be).
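> 
> One relatively easy mitigation, sketched below with made-up names
> (startPolling and its parameters are not actual PSUSensor interfaces),
> is to give each sensor's polling timer a small random initial offset
> so the reads spread out in time instead of all completing in one burst
> ahead of a GetManagedObjects call:
> 
>     // Hypothetical jittered start for a sensor's polling timer.
>     #include <boost/asio.hpp>
>     #include <chrono>
>     #include <random>
> 
>     void startPolling(boost::asio::steady_timer& timer,
>                       std::chrono::seconds pollInterval)
>     {
>         static std::mt19937 gen{std::random_device{}()};
>         std::uniform_int_distribution<int> jitterMs(0, 1000);
> 
>         // First expiry: the interval plus up to one second of jitter.
>         // Rearming with the plain interval afterwards keeps the sensors
>         // offset from each other.
>         timer.expires_after(pollInterval +
>                             std::chrono::milliseconds(jitterMs(gen)));
>         timer.async_wait([](const boost::system::error_code& ec) {
>             if (ec)
>             {
>                 return;
>             }
>             // ... read the sensor here, then rearm with the plain
>             // interval ...
>         });
>     }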
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Sat, 8 Aug 2020 01:19:03 +0000
> From: Zhao Kun <zkxz at hotmail.com>
> To: "Bills, Jason M" <jason.m.bills at linux.intel.com>,
> 	"openbmc at lists.ozlabs.org" <openbmc at lists.ozlabs.org>
> Subject: RE: system power control
> Message-ID:
> 	<BYAPR14MB23424B7B0D6A450C52235EC2CF490 at BYAPR14MB2342.namprd14.prod.outlook.com>
> 	
> Content-Type: text/plain; charset="windows-1252"
> 
> Thank you, Jason. Could you share an example of defining those GPIOs in the device tree for x86-power-control? I can't find any in aspeed-bmc-intel-s2600wf.dts.
>
   Please refer to the Facebook Tiogapass GPIO device tree entry below for x86-power-control:
     https://github.com/openbmc/linux/blob/dev-5.7/arch/arm/boot/dts/aspeed-bmc-facebook-tiogapass.dts#L135
> 
> 
> Thanks.
> 
> Best regards,
> 
> Kun Zhao
> /*
>   zkxz at hotmail.com
> */
> 
> From: Bills, Jason M <jason.m.bills at linux.intel.com>
> Sent: Friday, August 7, 2020 10:12 AM
> To: openbmc at lists.ozlabs.org
> Subject: Re: system power control
> 
> 
> 
> On 8/6/2020 11:43 PM, Zhao Kun wrote:
> > Hi,
> >
> > I'm new to OpenBMC and am learning how to make it work on an x86-based
> > system. Currently I'm stuck on mapping the GPIOs for power
> > on/off/reset/status into OpenBMC logic. I understand that when a user issues a
> > power-on request through any user interface like RESTful, IPMI, etc.,
> > some service (phosphor-state-manager?) is triggered to check the
> > current status and start the corresponding systemd services to do the
> > job. (Please correct me if I'm wrong.)
> >
> > But I'm just confused about how those services actually toggle or check the
> > GPIOs; there seem to be many choices:
> >
> >  1. Device tree?
> >  2. Using Workbook gpio_defs.json?
> >  3. Create some services calling platform specific scripts to operate
> >     GPIO or I2C devices?
> >  4. Using x86-power-control?
> >
> > So what's the most recommended way to do it? I'd really appreciate it if
> > anyone could shed some light on this.
> On Intel reference platforms, we use x86-power-control and configure the
> GPIO names using device tree.
> 
> >
> > I thought there must be a mechanism to consume some kind of
> > configuration file as the hardware abstraction layer. So I guess it
> > might be gpio_defs.json or device tree.
> >
> > Thanks.
> >
> > Best regards,
> >
> > Kun Zhao
> >
> > /*
> >
> > zkxz at hotmail.com
> >
> > */
> >
> 
> 
> End of openbmc Digest, Vol 60, Issue 27
> ***************************************

