[PATCH 2.6.13-rc1 01/10] IOCHK interface for I/O error handling/detecting

Hidetoshi Seto seto.hidetoshi at jp.fujitsu.com
Fri Jul 8 22:22:17 EST 2005


Benjamin Herrenschmidt wrote:
> On Thu, 2005-07-07 at 11:41 -0700, Greg KH wrote:
>>How about the issue of tying this into the other pci error reporting
>>infrastructure that is being worked on?
> 
> The other infrastructure is for asynchronous reporting and recovery.
> We still need synchronous detection & reporting. So this is a bit
> different.

The interesting point is that it seems that both could use for reporting.

> However, it would be nice if Hidetoshi's work could be adapted a bit so
> that 1) naming is a bit more consistent with the other stuff (pcierr_*
> maybe) and 2) the error "token" is the same. The later is especially
> important if we start adding ways to query the error token to know what
> the error precisely was etc... There is no reason to have 2 different
> ways of representing error details.

The naming doesn't really matter. iochk_* is just my preference.
However it would be worth a try to move generic codes from iomap.*
(historical home) to pci.* and rename iochk_* to pcierr_*.

Well, I'd like to use this opportunity to sort out my thoughts...

Now iochk_read() returns a boolean value, whether there was a error
or not. Of course I agree that it would be more useful if it can return
the detail of the error.

A quick solution is return "token" instead of boolean value 1.
-extern int iochk_read(iocookie *cookie);
+extern token iochk_read(iocookie *cookie);

The token should be a pointer or a bitmask having proper flags, not 0.
So this still work:
   if(iochk_read(cookie)) return -EIO;
and now this will also work:
   if((token=iochk_read(cookie))!=0){
      switch(severity(token)) {
        ...
   }}

The error "token", tentatively named "pci_error_token" in document
you posted, is now temporarily defined in recent Linas's patch using
other alias:
  enum pci_channel_state {
	pci_channel_io_normal = 0, /* I/O channel is in normal state */
	pci_channel_io_frozen = 1, /* I/O to channel is blocked */
	pci_channel_io_perm_failure, /* pci card is dead */
  };
Of course this will be not enough in near future.

I have already agree with what few month ago you said:
> The token should be an opaque type with accessors. You could define a
> pci_error_get_severity(token) to return the severity. The idea is to
> define accessors which return an error when the data requested isn't
> present in the error info. The actual content of the token is to be
> defined. I was thinking about a type plus a union. I was hoping Seto
> could provide something here ...

For example:
  /* offsets */
  enum {
	pcierr_io_frozen = 0, 	/* I/O to channel is blocked */
	pcierr_io_perm_failure, /* pci card is dead */
	pcierr_severity_valid,  /* 1:valid */
	pcierr_severity,     	/* 0:non-fatal 1:fatal */
  };

  #define pcierr_perm_failure(token) \
	test_bit(pcierr_io_perm_failure, token)

  int pcierr_get_severity(token)
  {
	if(test_bit(pcierr_severity_valid, token)) {
	  return test_bit(pcierr_severity, token);
	}
	return -1;
  }

Objections?
Are there any better place to put a group of codes like above than
include/linux/pci.h?

By the way, if token says frozen but not perm_failure, is it mean
"recovery processing now"? Are there any special state which
synchronous detection have to report?

Thanks,
H.Seto




More information about the Linuxppc64-dev mailing list