[PATCH] powerpc/pseries: Add support for IO Event interrupt drivers
Mark Nelson
markn at au1.ibm.com
Wed May 19 20:24:02 EST 2010
On Tuesday 18 May 2010 23:37:31 Michael Ellerman wrote:
> On Mon, 2010-05-17 at 22:53 +1000, Mark Nelson wrote:
> > This patch adds support for handling IO Event interrupts which come
> > through at the /event-sources/ibm,io-events device tree node.
>
> Hi Mark,
>
> You'll have to explain to me offline sometime how it is we ran out of
> interrupts and started needing to multiplex them ..
>
> > There is one ibm,io-events interrupt, but this interrupt might be used
> > for multiple I/O devices, each with their own separate driver. So, we
> > create a platform interrupt handler that will do the RTAS check-exception
> > call and then call the appropriate driver's interrupt handler (the one(s)
> > that registered with a scope that matches the scope of the incoming
> > interrupt).
> >
> > So, a driver for a device that uses IO Event interrupts will register
> > it's interrupt service routine (or interrupt handler) with the platform
> > code using ioei_register_isr(). This register function takes a function
> > pointer to the driver's handler and the scope that the driver is
> > interested in (scopes defined in arch/powerpc/include/asm/io_events.h).
> > The driver's handler must take a pointer to a struct io_events_section
> > and must not return anything.
> >
> > The platform code registers io_event_interrupt() as the interrupt handler
> > for the ibm,io-events interrupt. Upon receiving an IO Event interrupt, it
> > checks the scope of the incoming interrupt and only calls those drivers'
> > handlers that have registered as being interested in that scope.
>
> The "checks the scope" requires an RTAS call, which takes a global lock
> (and you add another) - these aren't going to be used for anything
> performance critical I hope?
I've been told they're not performance critical but I'll have to go
back and have a closer look at the locking again - I probably am being
a bit overzealous.
>
> > It is possible for a single driver to register the same function pointer
> > more than once with different scopes if it is interested in more than one
> > type of IO Event interrupt. If a handler wants to be notified of all
> > incoming IO Event interrupts it can register with IOEI_SCOPE_ANY.
> >
> > A driver can unregister to stop receiving the IO Event interrupts using
> > ioei_unregister_isr(), passing it the same function pointer to the
> > driver's handler and the scope the driver was registered with.
> >
> > Signed-off-by: Mark Nelson <markn at au1.ibm.com>
> > ---
> > arch/powerpc/include/asm/io_events.h | 40 +++++
> > arch/powerpc/platforms/pseries/Makefile | 2
> > arch/powerpc/platforms/pseries/io_events.c | 205 +++++++++++++++++++++++++++++
> > 3 files changed, 246 insertions(+), 1 deletion(-)
> >
> > Index: upstream/arch/powerpc/include/asm/io_events.h
> > ===================================================================
> > --- /dev/null
> > +++ upstream/arch/powerpc/include/asm/io_events.h
> > @@ -0,0 +1,40 @@
> > +/*
> > + * Copyright 2010 IBM Corporation, Mark Nelson
>
> I usually have name, corp, but I'm not sure if it matters.
I'll find out what the officially sanctioned way is :)
>
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> > +
> > +#ifndef _ASM_POWERPC_IOEVENTS_H
> > +#define _ASM_POWERPC_IOEVENTS_H
> > +
> > +struct io_events_section {
> > + u16 id; // Unique section identifier x00-x01
> > + u16 length; // Section length (bytes) x02-x03
> > + u8 version; // Section Version x04-x04
> > + u8 sec_subtype; // Section subtype x05-x05
> > + u16 creator_id; // Creator Component ID x06-x07
> > + u8 event_type; // IO-Event Type x08-x08
> > + u8 rpc_field_len; // PRC Field Length x09-x09
> > + u8 scope; // Error/Event Scope x0A-x0A
> > + u8 event_subtype; // I/O-Event Sub-Type x0B-x0B
> > + u32 drc_index; // DRC Index x0C-x0F
> > + u32 rpc_data[]; // RPC Data (optional) x10-...
> > +};
>
> I know it's idiomatic in that sort of code, but C++ comments?
Yeah, I'm not too phased - I was just copying what lppaca.h did...
>
> > +#define IOEI_SCOPE_NOT_APP 0x00
> > +#define IOEI_SCOPE_RIO_HUB 0x36
> > +#define IOEI_SCOPE_RIO_BRIDGE 0x37
> > +#define IOEI_SCOPE_PHB 0x38
> > +#define IOEI_SCOPE_EADS_GLOBAL 0x39
> > +#define IOEI_SCOPE_EADS_SLOT 0x3A
> > +#define IOEI_SCOPE_TORRENT_HUB 0x3B
> > +#define IOEI_SCOPE_SERVICE_PROC 0x51
> > +#define IOEI_SCOPE_ANY -1
> > +
> > +int ioei_register_isr(void (*isr)(struct io_events_section *), int scope);
> > +int ioei_unregister_isr(void (*isr)(struct io_events_section *), int scope);
>
> Given these are exported to the whole kernel I'd vote for
> pseries_ioei_register_isr(), if I get to vote that is.
That's a very good point - I'll change that.
>
> > Index: upstream/arch/powerpc/platforms/pseries/io_events.c
> > ===================================================================
> > --- /dev/null
> > +++ upstream/arch/powerpc/platforms/pseries/io_events.c
> > @@ -0,0 +1,205 @@
> > +/*
> > + * Copyright 2010 IBM Corporation, Mark Nelson
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> > +
> > +#include <linux/errno.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/slab.h>
> > +#include <linux/module.h>
> > +#include <linux/irq.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/of.h>
> > +
> > +#include <asm/io_events.h>
> > +#include <asm/rtas.h>
> > +#include <asm/irq.h>
> > +
> > +#include "event_sources.h"
> > +
> > +struct ioei_consumer {
> > + void (*ioei_isr)(struct io_events_section *);
>
> Function pointers is one case where a typedef can make the code easier
> to read, if you like.
>
> > + int scope;
> > + struct ioei_consumer *next;
> > +};
>
> You forgot the fourth commandment:
> Thou shalt not write thine own linked list implementation when
> there already exist several perfectly good ones in the kernel!
>
> Or is there some good reason for it I'm missing? :)
No, there was no good reason at all here and I do feel stupid as in a
previous patch I fought to use the in kernel implementation. I'll use
a struct list_head for the next version.
>
> You could even use a hlist to save a void * in your list head.
>
> > +static int ioei_check_exception_token;
> > +
> > +static unsigned char ioei_log_buf[RTAS_ERROR_LOG_MAX];
> > +static DEFINE_SPINLOCK(ioei_log_buf_lock);
> > +
> > +static struct ioei_consumer *ioei_isr_list;
> > +static DEFINE_SPINLOCK(ioei_isr_list_lock);
> > +
> > +int ioei_register_isr(void (*isr)(struct io_events_section *), int scope)
> > +{
> > + struct ioei_consumer *iter;
> > + struct ioei_consumer *cons;
> > + int ret = 0;
> > +
> > + spin_lock(&ioei_isr_list_lock);
>
> Here and unregister you should be using spin_lock_irq(), because you
> need to protect against an interrupt and you know you're not being
> called from an irq handler (so you don't need save/restore - though you
> could use it anyway to be lazy :D).
I actually had that right in an earlier version but I can't remember
what possessed me to change it... I'll fix it.
>
> > + /* check to see if we've already registered this function with
> > + * this scope. If we have, don't register it again
> > + */
> > + iter = ioei_isr_list;
> > + while (iter) {
> > + if (iter->ioei_isr == isr && iter->scope == scope)
> > + break;
> > + iter = iter->next;
> > + }
> > +
> > + if (iter) {
> > + ret = -EEXIST;
> > + goto out;
> > + }
> > +
> > + cons = kmalloc(sizeof(struct ioei_consumer), GFP_KERNEL);
>
> But you don't want to kmalloc while holding the lock and with interrupts
> off.
I could allocate above, before taking the lock, and then if we get the
case where it already exists in the list I could just free it before
returning. Would that work?
>
> > + if (!cons) {
> > + ret = -ENOMEM;
> > + goto out;
> > + }
> > +
> > + cons->ioei_isr = isr;
> > + cons->scope = scope;
> > +
> > + cons->next = ioei_isr_list;
> > + ioei_isr_list = cons;
> > +
> > +out:
> > + spin_unlock(&ioei_isr_list_lock);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(ioei_register_isr);
> > +
> > +int ioei_unregister_isr(void (*isr)(struct io_events_section *), int scope)
> > +{
> > + struct ioei_consumer *iter;
> > + struct ioei_consumer *prev = NULL;
> > + int ret = 0;
> > +
> > + spin_lock(&ioei_isr_list_lock);
> > + iter = ioei_isr_list;
> > + while (iter) {
> > + if (iter->ioei_isr == isr && iter->scope == scope)
> > + break;
> > + prev = iter;
> > + iter = iter->next;
> > + }
> > +
> > + if (!iter) {
> > + ret = -ENOENT;
> > + goto out;
> > + }
> > +
> > + if (prev)
> > + prev->next = iter->next;
> > + else
> > + ioei_isr_list = iter->next;
> > +
> > + kfree(iter);
> > +
> > +out:
> > + spin_unlock(&ioei_isr_list_lock);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(ioei_unregister_isr);
> > +
> > +static void ioei_call_consumers(int scope, struct io_events_section *sec)
> > +{
> > + struct ioei_consumer *iter;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioei_isr_list_lock, flags);
>
> You don't need to disable interrupts, because you're being called from
> the interrupt handler, and it won't be called again until you're
> finished.
You're right. Sorry, I think I mixed up the irq saving between the
register/unregister functions and this function. I'll fix it.
>
> > + iter = ioei_isr_list;
> > + while (iter) {
> > + if (iter->scope == scope || iter->scope == IOEI_SCOPE_ANY)
> > + iter->ioei_isr(sec);
> > + iter = iter->next;
> > + }
> > +
> > + spin_unlock_irqrestore(&ioei_isr_list_lock, flags);
> > +}
> > +
> > +#define EXT_INT_VECTOR_OFFSET 0x500
> > +#define RTAS_TYPE_IO_EVENT 0xE1
> > +
> > +static irqreturn_t io_event_interrupt(int irq, void *dev_id)
> > +{
> > + struct rtas_error_log *rtas_elog;
> > + struct io_events_section *ioei_sec;
> > + char *ch_ptr;
> > + int status;
> > + u16 *sec_len;
> > +
> > + spin_lock(&ioei_log_buf_lock);
> > +
> > + status = rtas_call(ioei_check_exception_token, 6, 1, NULL,
> > + EXT_INT_VECTOR_OFFSET,
> > + irq_map[irq].hwirq,
>
> This is going to be slow anyway, you may as well use virq_to_hw().
Oh yeah, good idea
>
> > + RTAS_IO_EVENTS, 1 /*Time Critical */,
>
> Missing space before the T ^
Nice catch!
>
> > + __pa(&ioei_log_buf),
>
> Does the buffer need to be aligned, and/or inside the RMO? I'd guess
> yes.
Good question, I'll look into it
>
> > + rtas_get_error_log_max());
> > +
> > + rtas_elog = (struct rtas_error_log *)ioei_log_buf;
> > +
> > + if (status != 0)
> > + goto out;
> > +
> > + /* We assume that we will only ever get called for io-event
> > + * interrupts. But if we get called with something else
> > + * make some noise about it.
> > + */
>
> That would mean we'd requested the wrong interrupt wouldn't it? I'd
> almost BUG, but benh always bitches that I do that too often so .. :)
Yeah, I don't think this would ever happen so I'm not sure exactly
what I'm protecting against here...
>
> > + if (rtas_elog->type != RTAS_TYPE_IO_EVENT) {
> > + pr_warning("IO Events: We got called with an event type of %d"
> > + " rather than %d!\n", rtas_elog->type,
> > + RTAS_TYPE_IO_EVENT);
> > + WARN_ON(1);
> > + goto out;
> > + }
> > +
> > + /* there are 24 bytes of event log data before the first section
> > + * (Main-A) begins
> > + */
> > + ch_ptr = (char *)ioei_log_buf + 24;
>
> Any reason you're casting from unsigned char to char?
None that I can fathom now... Good catch :)
>
> > + /* loop through all the sections until we get to the IO Events
> > + * Section, with section ID "IE"
> > + */
> > + while (*ch_ptr != 'I' && *(ch_ptr + 1) != 'E') {
> > + sec_len = (u16 *)(ch_ptr + 2);
> > + ch_ptr += *sec_len;
> > + }
>
> This would be neater if you cast to io_events_section and used the
> fields I think.
That's a good idea and would be a lot nicer than the code above.
>
> And even better if you know the length will be a multiple of the
> section_header size, you can do the arithmetic in those terms.
I'll read the docs again and see if we can do that.
>
> > + ioei_sec = (struct io_events_section *)ch_ptr;
> > +
> > + ioei_call_consumers(ioei_sec->scope, ioei_sec);
>
> Guaranteed to be only one section returned to us per call?
My understanding is that there's only ever one, but I'll double check.
>
> We /could/ copy the ioei_sec and drop the buf lock, which would allow
> another interrupt to come in and start doing the RTAS call (on another
> cpu, and iff there are actually multiple interrupts). But we probably
> don't care.
Good point - I'll update it so that we do the copy.
>
> > +out:
> > + spin_unlock(&ioei_log_buf_lock);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static int __init init_ioei_IRQ(void)
>
> Never understood why IRQ always (sometimes) gets caps ..
Hmmm... Just following the pattern from the RAS code...
>
> > +{
> > + struct device_node *np;
> > +
> > + ioei_check_exception_token = rtas_token("check-exception");
>
> Meep, need to check if it actually exists.
Good catch!
>
> > + np = of_find_node_by_path("/event-sources/ibm,io-events");
> > + if (np != NULL) {
>
> if (np) would usually do it
Definitely. I'll update.
>
> > + request_event_sources_irqs(np, io_event_interrupt, "IO_EVENT");
> > + of_node_put(np);
> > + }
> > +
> > + return 0;
> > +}
> > +device_initcall(init_ioei_IRQ);
>
> Should probably be a machine_initcall().
I'll change it to machine_device_initcall?
>
> > Index: upstream/arch/powerpc/platforms/pseries/Makefile
> > ===================================================================
> > --- upstream.orig/arch/powerpc/platforms/pseries/Makefile
> > +++ upstream/arch/powerpc/platforms/pseries/Makefile
> > @@ -8,7 +8,7 @@ endif
> >
> > obj-y := lpar.o hvCall.o nvram.o reconfig.o \
> > setup.o iommu.o event_sources.o ras.o \
> > - firmware.o power.o dlpar.o
> > + firmware.o power.o dlpar.o io_events.o
>
> The BML guys might appreciate an option to turn it off?
I initially had an option that gets selected by PPC_PSERIES, how about
that?
Thanks for going through this!
Mark
More information about the Linuxppc-dev
mailing list