ECC memory of BMC

Andrew Jeffery andrew at aj.id.au
Fri Feb 22 16:55:10 AEDT 2019


On Fri, 22 Feb 2019, at 16:22, Stefan Schaeckeler (sschaeck) wrote:
> Hi Will,
> 
> On 2/21/19, 6:00 PM, "Will Liang (梁永鉉)" <Will.Liang at quantatw.com> wrote:
> 
> > > > What we want to do is to record the ECC events to SEL.
> > > >
> > > > We are considering creating a new D-Bus object and a service.
> > > 
> > > Right; I think you need to create a new service that polls the sysfs interface for
> > > the EDAC device, and then use phosphor-logging to create error logs. 
> >
> > We consider creating the following objects for D-Bus:
> > -bus name : /xyz/openbmc_project/ECC
> > -object path : /xyz/openbmc_project/ECC/status
> > -interface : xyz.openbmc_project.Memory.MemoryECC
> >
> > and error types for xyz::openbmc_project::Memory::Ecc::Error::ceCount and "ueCount"
> > and "isLoggingLimitReached" for phosphor-logging error message.
> 
> 
> Note, the driver also logs the addresses of the recoverable and un-recoverable
> errors. Perhaps you want to expose them, too?
> 
> The edac framework is unfortunately not exposing them through sysfs. They get
> printed through "edac_mc_handle_error()" as printk(KERN_WARNING, ...) and look
> like
> 
> root@aspeed-arm:# dmesg | grep EDAC
> [ 1718.900000] EDAC MC0: 1 CE address(es) not available on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:0 syndrome:0x0)
> [ 1718.900000] EDAC MC0: 1 CE on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x80000 offset:0x0 grain:0 syndrome:0x0)
> 
> 
> I'm not sure if there is an elegant way for userspace to retrieve messages from
> the kernel ring buffer.
> 

Let's not start scraping dmesg. It's not considered part of the kernel ABI.
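
For reference, the sysfs polling approach suggested earlier in the thread
could look roughly like the sketch below. It is only an illustration under
some assumptions: the controller is assumed to show up as mc0 under
/sys/devices/system/edac/mc, the poll interval is arbitrary, and `report` is
a hypothetical hook standing in for whatever creates the phosphor-logging
entry. A counter reset between polls is not handled.

```python
import time
from pathlib import Path

# Assumed location of the first memory controller's EDAC counters.
EDAC_MC = Path("/sys/devices/system/edac/mc/mc0")


def read_counts(mc_dir):
    """Return (ce_count, ue_count) read from an EDAC memory-controller directory."""
    mc_dir = Path(mc_dir)
    ce = int((mc_dir / "ce_count").read_text().strip())
    ue = int((mc_dir / "ue_count").read_text().strip())
    return ce, ue


def count_deltas(previous, current):
    """Return how many new CEs and UEs appeared between two readings."""
    return current[0] - previous[0], current[1] - previous[1]


def poll(mc_dir=EDAC_MC, interval=5.0, report=print):
    """Poll the counters forever and report each increase.

    `report` is a hypothetical hook; a real service would create a
    phosphor-logging entry here instead of printing.
    """
    previous = read_counts(mc_dir)
    while True:
        time.sleep(interval)
        current = read_counts(mc_dir)
        new_ce, new_ue = count_deltas(previous, current)
        if new_ce > 0 or new_ue > 0:
            report(f"ECC: {new_ce} new correctable, {new_ue} new uncorrectable")
        previous = current
```

Calling poll() with no arguments would watch mc0 and print each increase.
This only yields counts, not the error addresses mentioned above, since
those are not exposed through sysfs.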

