[PATCH] rpaphp broken in ameslab

Thu Jul 1 06:56:28 EST 2004

On Wed, Jun 30, 2004 at 12:46:34PM -0700, Greg KH wrote:
> On Wed, Jun 30, 2004 at 02:14:33PM -0500, linas at austin.ibm.com wrote:
> > On Wed, Jun 30, 2004 at 02:03:32PM -0500, Linda Xie wrote:
> > > Paul Mackerras wrote:
> > > >
> > > >By the way, I notice that upstream rpaphp_core.c now has the call to
> > > >eeh_register_disable_func(), although the actual function isn't
> > > >present in arch/ppc64/kernel/eeh.c.
> >
> > Paul,
> >
> > You and Anton are responsible for keeping the arch/ppc64 directories
> > in sync between sles9, ameslab, and akpm.  You are, after all, the
> > one true official, designated maintainer ... if the code hasn't been
> > migrated to akpm ... uuh ... what am I missing?
> >
> > From where I sit, the sles9 code is really the latest, greatest, most
> > tested and debugged arch/ppc64 code that there is.  This is the tree
> > that the developers get their code/patches into.
>
> And that is the big problem.
>
> Those patches/fixes should go to mainline, not directly to suse.  How
> are they going to get back into mainline?

My understanding is that Paul Mackerras and Anton Blanchard are the
designated maintainers of the arch/ppc64 tree.  They are responsible
for sending the patches upstream, getting them into mainline.

> Which is an ugly hack in and of itself.  I only oked it for now.
>
> Actually, since everyone agrees that this isn't the way to go, I'll go
> remove it :)

Ack, I wish you wouldn't, it will break things.

What do you suggest as the 'right way' to accomplish this?

> > > > In fact I think the
> > > >separation is bogus; the EEH code and the rpaphp code are both part of
> > > >the driver for the RPA PCI subsystem.
> >
> > prolly.  But note that the generic hotplug APIs need to be extended
> > to give device drivers a mechanism to ask RPA PHP / EEH if a disconnect
> > event occurred.  Last I talked to Greg, he wasn't willing to accept
> > something like that yet, so it's a bit up in the air.
>
> I wasn't willing to accept that, as that was the wrong way to do this.

Yes, well, a few months ago, Torvalds made me promise that this would
be the way that it would be done eventually.  You were cc'ed on that
chain of notes.  Why didn't you take that up with him?

> It should be done from userspace with hotplug events like we mentioned.

I think you're confusing this with a different issue.  The issue here
is "the device driver thinks that the PCI bus is whacked. The device
driver wants to be able to make a call to find out if the PCI bus is
whacked."  I don't think you want to bounce that up to userspace and
back down again.

> And none of this "what happens about the root device" crud either, I've
> seen your code in the kernel that checks for this today.  bah.

? Well, if you're implying I wrote that code, I didn't.  My goal is to
get it to do 'the right thing', once I stop getting distracted by
other issues.

The 'root device' issue is a real issue.  You can't execute /sbin/hotplug
if the root fs is not reachable.  If the scsi device driver suspects
that the reason it's unable to get a response from the scsi controller
is that the PCI bus is down, then the scsi device driver must be given
a mechanism for rebooting the PCI bus without having to go to user-space
to do it.
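
To make the shape of that mechanism concrete, here is a minimal user-space
sketch in C. It is only an illustration under stated assumptions: fake_slot,
slot_is_isolated(), slot_reset(), read_status() and driver_poll() are all
hypothetical names invented for this example, not the existing rpaphp, EEH,
or hotplug interfaces. The point is just that both the query ("is the bus
whacked?") and the recovery are direct calls, with no trip through
/sbin/hotplug.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a PCI slot; not a real kernel structure. */
struct fake_slot {
    bool isolated;     /* true while the platform has fenced the slot off */
    int  reset_count;  /* number of resets performed so far */
};

/* Hypothetical query a driver could call: is the bus down? */
static bool slot_is_isolated(const struct fake_slot *slot)
{
    return slot->isolated;
}

/* Hypothetical recovery call: reset the slot without any user-space helper. */
static void slot_reset(struct fake_slot *slot)
{
    slot->isolated = false;
    slot->reset_count++;
}

/* Simulated register read: fails for as long as the slot is isolated. */
static bool read_status(const struct fake_slot *slot, uint32_t *val)
{
    if (slot->isolated)
        return false;
    *val = 0x1;  /* arbitrary "device is alive" value for the simulation */
    return true;
}

/*
 * Driver-side logic: returns 0 if the device answered (possibly after a
 * reset), -1 if the bus is fine and the device itself is at fault, or if
 * the reset did not help.
 */
static int driver_poll(struct fake_slot *slot, uint32_t *val)
{
    if (read_status(slot, val))
        return 0;
    if (!slot_is_isolated(slot))
        return -1;  /* bus is healthy; the problem is elsewhere */
    slot_reset(slot);
    return read_status(slot, val) ? 0 : -1;
}
```

In this toy model, calling driver_poll() on a slot that starts out isolated
performs exactly one reset and then reads the status successfully. Nothing
in that path needs the root filesystem to be reachable.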

--linas

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/