[PATCH] powerpc/powernv: Read opal error log and export it through sysfs interface.
Stewart Smith
stewart at linux.vnet.ibm.com
Wed Feb 26 10:19:46 EST 2014
Mahesh Jagannath Salgaonkar <mahesh at linux.vnet.ibm.com> writes:
>> I think we could provide a better interface with instead having a file
>> per log message appear in sysfs. We're never going to have more than 128
>> of these at any one time on the Linux side, so it's not going to be too
>> many files.
>
> It is not just about 128 files; we may be adding/removing a sysfs node
> for every new log id that gets reported to the kernel and ack-ed. In the
> worst case, when we have a flood of elog errors with a user daemon
> consuming them and ack-ing back to get ready for the next log in a tight
> poll, we may continuously add/remove the sysfs node for each new <id>.
Do we ever get a storm of hundreds/thousands of them though? If many
come in at once, userspace may just be woken up one or two times, as it
would just select() and wait for events.
>> I've seen some conflicting things on this - is it 2kb or 16kb?
>
> We chose 16kb because we want to pull all the log data, not just a
> partial read.
So the max log size for any one entry is in fact 16kb?
>> This means we constantly use 128 * sizeof(struct opal_err_log) which
>> equates to somewhere north of 2MB of memory (due to list overhead).
>>
>> I don't think we need to statically allocate this, we can probably just
>> allocate on-demand as in a typical system you're probably quite
>> unlikely to have too many of these sitting around (besides, if for
>> whatever reason we cannot allocate memory at some point, that's okay
>> because we can read it again later).
>
> The reason we chose static allocation is that we cannot afford to drop
> or delay a critical error log due to a memory allocation failure. Or we
> could keep static allocations for critical errors and use dynamic
> allocation for informative error logs. What do you say?
Userspace is probably going to have to do IO to get the log and ack it,
so it's probably not a huge problem - if we can't allocate a few kb in a
couple of attempts then we likely have bigger problems.
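The "couple of attempts" idea could be sketched as below. This is a userspace-style illustration only (in the kernel it would be kmalloc with appropriate GFP flags); the function name, retry count, and back-off are all illustrative.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ELOG_MAX_SIZE (16 * 1024)	/* max size of one log entry */

/* Try to allocate a buffer for one log entry, retrying a few times
 * with a short back-off.  If every attempt fails, return NULL and
 * simply leave the entry in firmware: it can be read again on a
 * later pass, so nothing is lost -- matching the "read it again
 * later" argument above. */
static void *elog_buf_alloc(int attempts)
{
	while (attempts-- > 0) {
		void *buf = malloc(ELOG_MAX_SIZE);

		if (buf) {
			memset(buf, 0, ELOG_MAX_SIZE);
			return buf;
		}
		usleep(1000);	/* brief back-off before retrying */
	}
	return NULL;	/* caller re-fetches the entry from firmware later */
}
```

The key design point is that allocation failure is recoverable here, which is what makes on-demand allocation safe compared with pinning ~2MB (128 entries * 16kb) statically.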
If we were going to have a sustained amount of hundreds/thousands of
these per second then perhaps we'd have other issues, but from what I
understand we're probably only going to have a handful per year on a
typical system? (I am, of course, not talking about our dev systems,
which are rather atypical :)
I'll likely have a patch today that shows kind of what I mean.