[PATCHv2 pci-next 2/2] PCI/AER: Rate limit the reporting of the correctable errors

Sat Apr 8 05:46:45 AEST 2023

On Fri, Apr 07, 2023 at 11:53:27AM -0700, Grant Grundler wrote:
> On Thu, Apr 6, 2023 at 12:50 PM Bjorn Helgaas <helgaas at kernel.org> wrote:
> > On Fri, Mar 17, 2023 at 10:51:09AM -0700, Grant Grundler wrote:
> > > From: Rajat Khandelwal <rajat.khandelwal at linux.intel.com>
> > >
> > > There are many instances where correctable errors tend to inundate
> > > the message buffer. We observe such instances during thunderbolt PCIe
> > > tunneling.
> ...

> > >               if (info->severity == AER_CORRECTABLE)
> > > -                     pci_info(dev, "   [%2d] %-22s%s\n", i, errmsg,
> > > -                             info->first_error == i ? " (First)" : "");
> > > +                     pci_info_ratelimited(dev, "   [%2d] %-22s%s\n", i, errmsg,
> > > +                                          info->first_error == i ? " (First)" : "");
> >
> > I don't think this is going to reliably work the way we want.  We have
> > a bunch of pci_info_ratelimited() calls, and each caller has its own
> > ratelimit_state data.  Unless we call pci_info_ratelimited() exactly
> > the same number of times for each error, the ratelimit counters will
> > get out of sync and we'll end up printing fragments from error A mixed
> > with fragments from error B.
> 
> Ok - what I'm reading between the lines here is the output should be
> emitted in one step, not multiple pci_info_ratelimited() calls. If the
> code built an output string (using snprintf()), and then called
> pci_info_ratelimited() exactly once at the bottom, would that be
> sufficient?
>
> > I think we need to explicitly manage the ratelimiting ourselves,
> > similar to print_hmi_event_info() or print_extlog_rcd().  Then we can
> > have a *single* ratelimit_state, and we can check it once to determine
> > whether to log this correctable error.
> 
> Is the rate limiting per call location or per device? From above, I
> understood rate limiting is "per call location".  If the code only
> has one call location, it should achieve the same goal, right?

Rate-limiting is per call location, so yes, if we only have one call
location, that would solve it.  It would also have the nice property
that all the output would be atomic so it wouldn't get mixed with
other stuff, and it might encourage us to be a little less wordy in
the output.
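
To make the per-call-location problem concrete, here is a rough
userspace model (not kernel code).  struct rl_model, rl_allow() and
count_torn_reports() are made-up names for illustration, and the
window/burst logic is a simplification of what the kernel's ratelimit
code does.  It gives the header line and the per-bit detail lines
separate limiter state, as separate *_ratelimited() call sites would
have, and counts reports whose header got through while some detail
lines were dropped:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model of a rate limiter (hypothetical, not the
 * kernel's struct ratelimit_state): at most `burst` messages are let
 * through per window of `interval` ticks. */
struct rl_model {
    long interval;  /* window length in ticks */
    int burst;      /* messages allowed per window */
    long begin;     /* tick at which the current window started */
    int printed;    /* messages let through in the current window */
};

static bool rl_allow(struct rl_model *rs, long now)
{
    assert(rs != NULL);
    if (now - rs->begin >= rs->interval) {
        /* Window expired: start a new one. */
        rs->begin = now;
        rs->printed = 0;
    }
    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }
    return false;
}

/* Log `nerrors` errors, all at tick 0.  Each error has one header line
 * and `bits` detail lines, and each of the two call sites has its own
 * limiter.  Returns the number of errors whose header was printed but
 * which lost at least one detail line. */
static int count_torn_reports(int nerrors, int bits, int burst)
{
    struct rl_model hdr = { .interval = 5, .burst = burst };
    struct rl_model det = { .interval = 5, .burst = burst };
    int torn = 0;

    for (int e = 0; e < nerrors; e++) {
        bool header_printed = rl_allow(&hdr, 0);
        int details_printed = 0;

        for (int i = 0; i < bits; i++)
            details_printed += rl_allow(&det, 0);

        if (header_printed && details_printed < bits)
            torn++;
    }
    return torn;
}
```

In this model the two limiters only stay in step when every error
produces exactly one detail line; as soon as an error has several
status bits set, the detail limiter runs out before the header limiter
does.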

But I don't think we need output in a single step; we just need a
single instance of ratelimit_state (or one for the CPER path and
another for the native AER path), and that can control all the output
for a single error.  E.g., print_hmi_event_info() looks like this:

  static void print_hmi_event_info(...)
  {
    static DEFINE_RATELIMIT_STATE(rs, ...);

    if (__ratelimit(&rs)) {
      printk("%s%s Hypervisor Maintenance interrupt ...");
      printk("%s Error detail: %s\n", ...);
      printk("%s      HMER: %016llx\n", ...);
    }
  }

I think it's nice that the struct ratelimit_state is explicit and
there's no danger of breaking it when adding another printk later.
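
The same shape applied to a correctable-error report might look
roughly like the userspace sketch below.  This is only a model, not a
proposed patch: struct rl_model, rl_allow() and report_error() are
hypothetical names, the limiter is a simplified stand-in for the
kernel's ratelimit_state, and the output format is invented.  The
point is that the limiter is checked once per error, so the header and
all per-bit lines are printed or suppressed together:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified userspace model of a rate limiter (hypothetical, not the
 * kernel's struct ratelimit_state). */
struct rl_model {
    long interval;  /* window length in ticks */
    int burst;      /* reports allowed per window */
    long begin;     /* tick at which the current window started */
    int printed;    /* reports let through in the current window */
};

static bool rl_allow(struct rl_model *rs, long now)
{
    assert(rs != NULL);
    if (now - rs->begin >= rs->interval) {
        /* Window expired: start a new one. */
        rs->begin = now;
        rs->printed = 0;
    }
    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }
    return false;
}

/* Report one error: a header line plus one line per set status bit.
 * The single limiter is consulted exactly once, so the report is
 * all-or-nothing.  Returns the number of lines written. */
static int report_error(struct rl_model *rs, long now,
                        unsigned int status, FILE *out)
{
    int lines = 0;

    if (!rl_allow(rs, now))
        return 0;

    fprintf(out, "correctable error, status=0x%08x\n", status);
    lines++;

    for (int i = 0; i < 32; i++) {
        if (status & (1u << i)) {
            fprintf(out, "   [%2d] bit set\n", i);
            lines++;
        }
    }
    return lines;
}
```

Adding another fprintf() inside report_error() later does not change
how many times the limiter is consulted, which is the property I like
about the explicit form.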

It *could* be per pci_dev, too, but I suspect it's not worth spending
40ish bytes per device for the ratelimit data.

Bjorn