From moilanen at austin.ibm.com  Wed Jun  1 01:12:25 2005
From: moilanen at austin.ibm.com (Jake Moilanen)
Date: Tue, 31 May 2005 10:12:25 -0500
Subject: [PATCH] PCI device-node failure detection
Message-ID: <20050531101225.510efbf7.moilanen@austin.ibm.com>

OpenFirmware marks devices as failed in the device-tree when a hardware
problem is detected.  The kernel needs to fail config reads/writes to
prevent a kernel crash when incorrect data is read.

This patch validates that the device-node is not marked "fail" when
config space reads/writes are attempted.

Signed-off-by: Jake Moilanen

Index: 2.6.12/arch/ppc64/kernel/prom.c
===================================================================
--- 2.6.12.orig/arch/ppc64/kernel/prom.c	2005-03-02 01:38:13.000000000 -0600
+++ 2.6.12/arch/ppc64/kernel/prom.c	2005-05-27 18:44:33.172559207 -0500
@@ -1887,6 +1887,19 @@
 	*next = prop;
 }
 
+int
+dn_failed(struct device_node * dn)
+{
+	char * status;
+
+	status = get_property(dn, "status", NULL);
+
+	if (status && !strcmp(status, "fail"))
+		return 1;
+
+	return 0;
+}
+
 #if 0
 void
 print_properties(struct device_node *np)
Index: 2.6.12/arch/ppc64/kernel/pSeries_pci.c
===================================================================
--- 2.6.12.orig/arch/ppc64/kernel/pSeries_pci.c	2005-03-02 01:38:34.000000000 -0600
+++ 2.6.12/arch/ppc64/kernel/pSeries_pci.c	2005-05-27 18:44:33.183563164 -0500
@@ -96,7 +96,7 @@
 
 	/* Search only direct children of the bus */
 	for (dn = busdn->child; dn; dn = dn->sibling)
-		if (dn->devfn == devfn)
+		if (dn->devfn == devfn && !dn_failed(dn))
 			return rtas_read_config(dn, where, size, val);
 	return PCIBIOS_DEVICE_NOT_FOUND;
 }
@@ -138,7 +138,7 @@
 
 	/* Search only direct children of the bus */
 	for (dn = busdn->child; dn; dn = dn->sibling)
-		if (dn->devfn == devfn)
+		if (dn->devfn == devfn && !dn_failed(dn))
 			return rtas_write_config(dn, where, size, val);
 	return PCIBIOS_DEVICE_NOT_FOUND;
 }
Index: 2.6.12/include/asm-ppc64/prom.h
===================================================================
--- 2.6.12.orig/include/asm-ppc64/prom.h	2005-03-02 01:38:33.000000000 -0600
+++ 2.6.12/include/asm-ppc64/prom.h	2005-05-27 18:44:33.192566401 -0500
@@ -225,5 +225,6 @@
 extern int prom_n_intr_cells(struct device_node* np);
 extern void prom_get_irq_senses(unsigned char *senses, int off, int max);
 extern void prom_add_property(struct device_node* np, struct property* prop);
+extern int dn_failed(struct device_node * dn);
 
 #endif /* _PPC64_PROM_H */

From linas at austin.ibm.com  Wed Jun  1 02:06:28 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 31 May 2005 11:06:28 -0500
Subject: panic reboot stuck in rtas_os_term
In-Reply-To: <20050525134617.24a2c940.moilanen@austin.ibm.com>
References: <20050508083331.GA30329@suse.de>
	<20050508091533.GA30450@suse.de> <20050525065410.GA9430@suse.de>
	<20050525125352.2d08dc52.moilanen@austin.ibm.com>
	<20050525182413.GA18053@suse.de>
	<20050525134617.24a2c940.moilanen@austin.ibm.com>
Message-ID: <20050531160628.GC31199@austin.ibm.com>

On Wed, May 25, 2005 at 01:46:17PM -0500, Jake Moilanen was heard to remark:
> >
> > But the rtas call does not return, so the kernel cant do anything.
>
> There was a firmware problem w/ os-term awhile ago (returned hardware
> error).  It may be related.  How old is your firmware?

Is this GA level firmware?  If some customer out in the field
has GA level h/w running SLES9 successfully today, will upgrading
to SP1 (or SP2) "break" the h/w? (i.e. force the customer to upgrade
firmware?)

I'm wondering if that's why Olaf is complaining.

Should we have some RAS daemon that nags the user: "this kernel
version requires at least firmware level XXX to operate properly"?

--linas

From nathanl at austin.ibm.com  Wed Jun  1 02:12:40 2005
From: nathanl at austin.ibm.com (Nathan Lynch)
Date: Tue, 31 May 2005 11:12:40 -0500
Subject: [PATCH] ppc64: set/clear SMT capable bit at boot
In-Reply-To: <20050529234154.GG11066@krispykreme>
References: <20050529234154.GG11066@krispykreme>
Message-ID: <1117555960.10583.6.camel@pants.austin.ibm.com>

On Mon, 2005-05-30 at 09:41 +1000, Anton Blanchard wrote:
> Paul/Nathan, does this look OK to you?

Looks fine to me.

Nathan

From olh at suse.de  Wed Jun  1 04:38:41 2005
From: olh at suse.de (Olaf Hering)
Date: Tue, 31 May 2005 20:38:41 +0200
Subject: panic reboot stuck in rtas_os_term
In-Reply-To: <20050531160628.GC31199@austin.ibm.com>
References: <20050508083331.GA30329@suse.de>
	<20050508091533.GA30450@suse.de> <20050525065410.GA9430@suse.de>
	<20050525125352.2d08dc52.moilanen@austin.ibm.com>
	<20050525182413.GA18053@suse.de>
	<20050525134617.24a2c940.moilanen@austin.ibm.com>
	<20050531160628.GC31199@austin.ibm.com>
Message-ID: <20050531183841.GA13988@suse.de>

On Tue, May 31, Linas Vepstas wrote:

> I'm wondering if that's why Olaf is complaining.

No, I just expected a reboot on panic, like in the good old days.
But even a p640 just activates the service processor, instead of rebooting.
SLES8 kernels do not have the "ibm,os-term" call, they are new in 2.6.

The current way is ok, one just has to know how to deal with it.

From ntl at pobox.com  Wed Jun  1 05:51:27 2005
From: ntl at pobox.com (Nathan Lynch)
Date: Tue, 31 May 2005 14:51:27 -0500
Subject: [PATCH] PCI device-node failure detection
In-Reply-To: <20050531101225.510efbf7.moilanen@austin.ibm.com>
References: <20050531101225.510efbf7.moilanen@austin.ibm.com>
Message-ID: <20050531195127.GD3723@otto>

Jake Moilanen wrote:
> OpenFirmware marks devices as failed in the device-tree when a hardware
> problem is detected.  The kernel needs to fail config reads/writes to
> prevent a kernel crash when incorrect data is read.
>
> This patch validates that the device-node is not marked "fail" when
> config space reads/writes are attempted.
>
> Signed-off-by: Jake Moilanen
>
> Index: 2.6.12/arch/ppc64/kernel/prom.c
> ===================================================================
> --- 2.6.12.orig/arch/ppc64/kernel/prom.c	2005-03-02 01:38:13.000000000 -0600
> +++ 2.6.12/arch/ppc64/kernel/prom.c	2005-05-27 18:44:33.172559207 -0500
> @@ -1887,6 +1887,19 @@
>  	*next = prop;
>  }
>
> +int
> +dn_failed(struct device_node * dn)
> +{
> +	char * status;
> +
> +	status = get_property(dn, "status", NULL);
> +
> +	if (status && !strcmp(status, "fail"))
> +		return 1;
> +
> +	return 0;
> +}
> +

"fail" is not the only possible state that would indicate that the
device is unusable, if I'm reading IEEE 1275 right.  There is also
"disabled" and "fail-xxx" where xxx is additional info about the
fault.  I've never seen "fail-xxx" myself but I've seen "disabled" on
devices that were deconfigured by firmware (failing cpu iirc).

Unless we want to treat "disabled" and "fail[-xxx]" differently, I
think we should be checking that the status property, if present, says
"okay".  Something like:

int dn_failed(struct device_node *dn)
{
	char *status = get_property(dn, "status", NULL);

	if (!status)
		return 0;

	if (!strcmp(status, "okay"))
		return 0;

	return 1;
}

Also, I think this function could be made a static helper in
pSeries_pci.c until something outside of that file needs it.

Nathan

From olh at suse.de  Wed Jun  1 06:29:31 2005
From: olh at suse.de (Olaf Hering)
Date: Tue, 31 May 2005 22:29:31 +0200
Subject: [PATCH] allow xmon=bt to print a backtrace by default
Message-ID: <20050531202931.GA14769@suse.de>


xmon does not print a backtrace per default. This is bad on systems with
USB keyboard, the most needed info about the crash is lost.
Booting with xmon=bt enables the autobacktrace functionality.
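
A rough userspace model of the resulting "xmon=" option handling, for
illustration only. `xmon_option_actions` and the `XMON_*` flags are made-up
names that do not appear in the patch, and the no-value case assumes the
handler is passed a NULL pointer when "xmon" is given without "=".

```c
#include <string.h>

/* Hypothetical flags describing what the boot code ends up doing. */
enum xmon_boot_action {
	XMON_ENABLE    = 1 << 0,	/* debugger hooks installed (xmon_init) */
	XMON_AUTO_BT   = 1 << 1,	/* backtrace printed on every xmon entry */
	XMON_ENTER_NOW = 1 << 2,	/* debugger entered during early boot */
};

/* Map the text after "xmon=" to the actions taken by early_xmon(). */
static int xmon_option_actions(const char *p)
{
	if (p == NULL)			/* no value: enable and enter */
		return XMON_ENABLE | XMON_ENTER_NOW;
	if (strncmp(p, "bt", 2) == 0)	/* new in this patch */
		return XMON_ENABLE | XMON_AUTO_BT;
	if (strncmp(p, "on", 2) == 0)
		return XMON_ENABLE;
	if (strncmp(p, "early", 5) == 0)
		return XMON_ENABLE | XMON_ENTER_NOW;
	return 0;			/* anything else leaves xmon alone */
}
```

In this model `xmon_option_actions("bt")` yields `XMON_ENABLE | XMON_AUTO_BT`:
the debugger is armed with automatic backtraces but is not entered during boot.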

Signed-off-by: Olaf Hering

Index: linux-2.6.11/arch/ppc64/kernel/setup.c
===================================================================
--- linux-2.6.11.orig/arch/ppc64/kernel/setup.c
+++ linux-2.6.11/arch/ppc64/kernel/setup.c
@@ -633,7 +633,7 @@ void __init setup_system(void)
 	 * Initialize xmon
 	 */
 #ifdef CONFIG_XMON_DEFAULT
-	xmon_init();
+	xmon_init(0);
 #endif
 	/*
 	 * Register early console
@@ -1345,12 +1345,14 @@ static int __init early_xmon(char *p)
 {
 	/* ensure xmon is enabled */
 	if (p) {
+		if (strncmp(p, "bt", 2) == 0)
+			xmon_init(1);
 		if (strncmp(p, "on", 2) == 0)
-			xmon_init();
+			xmon_init(0);
 		if (strncmp(p, "early", 5) != 0)
 			return 0;
 	}
-	xmon_init();
+	xmon_init(0);
 	debugger(NULL);
 
 	return 0;
Index: linux-2.6.11/arch/ppc64/xmon/start.c
===================================================================
--- linux-2.6.11.orig/arch/ppc64/xmon/start.c
+++ linux-2.6.11/arch/ppc64/xmon/start.c
@@ -27,7 +27,7 @@ static void sysrq_handle_xmon(int key, s
 			      struct tty_struct *tty)
 {
 	/* ensure xmon is enabled */
-	xmon_init();
+	xmon_init(0);
 	debugger(pt_regs);
 }
 
Index: linux-2.6.11/arch/ppc64/xmon/xmon.c
===================================================================
--- linux-2.6.11.orig/arch/ppc64/xmon/xmon.c
+++ linux-2.6.11/arch/ppc64/xmon/xmon.c
@@ -47,6 +47,7 @@ static int xmon_gate;
 #endif /* CONFIG_SMP */
 
 static unsigned long in_xmon = 0;
+static unsigned long xmon_auto_backtrace;
 
 static unsigned long adrs;
 static int size = 1;
@@ -131,6 +132,8 @@ static void csum(void);
 static void bootcmds(void);
 void dump_segments(void);
 static void symbol_lookup(void);
+static void xmon_show_stack(unsigned long sp, unsigned long lr,
+			    unsigned long pc);
 static void xmon_print_symbol(unsigned long address, const char *mid,
 			      const char *after);
 static const char *getvecname(unsigned long vec);
@@ -767,6 +770,9 @@ cmds(struct pt_regs *excp)
 	last_cmd = NULL;
 	xmon_regs = excp;
 
+	if (xmon_auto_backtrace)
+		xmon_show_stack(excp->gpr[1], excp->link, excp->nip);
+
 	for(;;) {
 #ifdef CONFIG_SMP
 		printf("%x:", smp_processor_id());
@@ -2485,8 +2491,10 @@ static void dump_stab(void)
 	}
 }
 
-void xmon_init(void)
+void xmon_init(int bt)
 {
+	if (bt)
+		xmon_auto_backtrace = 1;
 	__debugger = xmon;
 	__debugger_ipi = xmon_ipi;
 	__debugger_bpt = xmon_bpt;
Index: linux-2.6.11/include/asm-ppc64/system.h
===================================================================
--- linux-2.6.11.orig/include/asm-ppc64/system.h
+++ linux-2.6.11/include/asm-ppc64/system.h
@@ -88,7 +88,7 @@ DEBUGGER_BOILERPLATE(debugger_dabr_match
 DEBUGGER_BOILERPLATE(debugger_fault_handler)
 
 #ifdef CONFIG_XMON
-extern void xmon_init(void);
+extern void xmon_init(int bt);
 #endif
 
 #else

From linas at austin.ibm.com  Wed Jun  1 06:30:28 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 31 May 2005 15:30:28 -0500
Subject: [PATCH]: PCI Error Recovery Implementation
Message-ID: <20050531203028.GD31199@austin.ibm.com>


Hi,

Attached is the latest and greatest PCI error recovery patch.
It's posted here as one giant patch, but logically consists of
a number of different pieces:

1) generic modifications to include/linux/pci.h, as per emails
   in last round of discussion.

2) Documentation/pci-error-recovery.txt describing the API.
   This is a cut-n-paste-modified copy of BenH's email.  I changed
   the names of a few routines, and added notes about the current
   ppc64 implementation.

3) working patches to the SCSI ipr and symbios device drivers
   to use this API to recover from PCI errors.  These actually
   work.  I plan to have a patch for e1000 "real soon now"(TM).

4) ppc64-specific patches that use the API to notify the device
   of PCI errors.

Please review. I want to get this submitted into mainline ASAP.
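
The callback ordering that pieces 2) and 4) rely on can be sketched roughly
as follows. This is a simplified userspace model under stated assumptions,
not the attached ppc64 code: every `sketch_*` and `demo_*` name is
hypothetical, the callbacks take no struct pci_dev, link_reset() and the
retry with a hard reset are left out, and a missing callback is assumed to
mean that a slot reset is needed.

```c
#include <stddef.h>

/* Result codes, mirroring enum pcierr_result from the pci.h changes. */
enum pcierr_result {
	PCIERR_RESULT_NONE = 0,
	PCIERR_RESULT_CAN_RECOVER,
	PCIERR_RESULT_NEED_RESET,
	PCIERR_RESULT_DISCONNECT,
	PCIERR_RESULT_RECOVERED,
};

/* Hypothetical stand-in for one driver's struct pci_error_handlers. */
struct sketch_handlers {
	int  (*error_detected)(void);
	int  (*mmio_enabled)(void);
	int  (*slot_reset)(void);
	void (*resume)(void);
};

enum sketch_step { STEP_DETECTED, STEP_MMIO, STEP_RESET };

/* Ask every driver on the segment; the most pessimistic answer wins. */
static int sketch_poll(const struct sketch_handlers *h, size_t n,
		       enum sketch_step step, int want)
{
	int worst = want;
	size_t i;

	for (i = 0; i < n; i++) {
		int (*cb)(void) = step == STEP_DETECTED ? h[i].error_detected :
				  step == STEP_MMIO ? h[i].mmio_enabled :
				  h[i].slot_reset;
		/* Assumption: no callback means "cannot recover directly". */
		int res = cb ? cb() : PCIERR_RESULT_NEED_RESET;

		if (res == PCIERR_RESULT_DISCONNECT)
			worst = PCIERR_RESULT_DISCONNECT;
		else if (res != want && worst != PCIERR_RESULT_DISCONNECT)
			worst = PCIERR_RESULT_NEED_RESET;
	}
	return worst;
}

/* error_detected -> (mmio_enabled | slot_reset) -> resume, or give up. */
static int sketch_recover(const struct sketch_handlers *h, size_t n,
			  int platform_can_reset)
{
	size_t i;
	int res = sketch_poll(h, n, STEP_DETECTED, PCIERR_RESULT_CAN_RECOVER);

	if (res == PCIERR_RESULT_CAN_RECOVER)
		res = sketch_poll(h, n, STEP_MMIO, PCIERR_RESULT_RECOVERED);

	if (res == PCIERR_RESULT_NEED_RESET && platform_can_reset) {
		/* A real platform would reset the slot at this point. */
		res = sketch_poll(h, n, STEP_RESET, PCIERR_RESULT_RECOVERED);
	}

	if (res != PCIERR_RESULT_RECOVERED)
		return PCIERR_RESULT_DISCONNECT;	/* dead slot */

	for (i = 0; i < n; i++)
		if (h[i].resume)
			h[i].resume();
	return PCIERR_RESULT_RECOVERED;
}

/* Example driver loosely modelled on the ipr changes: always wants a reset. */
static int demo_resumed;
static int demo_error_detected(void) { return PCIERR_RESULT_NEED_RESET; }
static int demo_slot_reset(void) { return PCIERR_RESULT_RECOVERED; }
static void demo_resume(void) { demo_resumed++; }

static const struct sketch_handlers demo_driver = {
	.error_detected = demo_error_detected,
	.slot_reset     = demo_slot_reset,
	.resume         = demo_resume,
};
```

With the example driver, `sketch_recover(&demo_driver, 1, 1)` goes through
error_detected(), slot_reset() and resume(), while passing 0 as the last
argument models a platform that cannot reset slots and ends in
PCIERR_RESULT_DISCONNECT.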

--linas

Signed-off-by: Linas Vepstas
-------------- next part --------------
--- include/linux/pci.h.linas-orig	2005-04-29 20:27:22.000000000 -0500
+++ include/linux/pci.h	2005-05-31 13:47:46.000000000 -0500
@@ -659,6 +659,81 @@ struct pci_dynids {
 	unsigned int use_driver_data:1; /* pci_driver->driver_data is used */
 };
 
+/* ---------------------------------------------------------------- */
+/** PCI error recovery infrastructure.  If a PCI device driver provides
+ * a set of callbacks in struct pci_error_handlers, then that device driver
+ * will be notified of PCI bus errors, and can be driven to recovery.
+ */
+
+enum pci_channel_state {
+	pci_channel_io_normal = 0, /* I/O channel is in normal state */
+	pci_channel_io_frozen = 1, /* I/O to channel is blocked */
+	pci_channel_io_perm_failure, /* pci card is dead */
+};
+
+enum pcierr_result {
+	PCIERR_RESULT_NONE=0, /* no result/none/not supported in device driver */
+	PCIERR_RESULT_CAN_RECOVER=1, /* Device driver can recover without slot reset */
+	PCIERR_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */
+	PCIERR_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */
+	PCIERR_RESULT_RECOVERED, /* Device driver is fully recovered and operational */
+};
+
+/* PCI bus error event callbacks */
+struct pci_error_handlers
+{
+	int (*error_detected)(struct pci_dev *dev, enum pci_channel_state error);
+	int (*mmio_enabled)(struct pci_dev *dev); /* MMIO has been re-enabled, but not DMA */
+	int (*link_reset)(struct pci_dev *dev); /* PCI Express link has been reset */
+	int (*slot_reset)(struct pci_dev *dev); /* PCI slot has been reset */
+	void (*resume)(struct pci_dev *dev); /* Device driver may resume normal operations */
+};
+
+/**
+ * PCI Error notifier event flags.
+ */
+#define PEH_NOTIFY_ERROR 1
+
+/** PEH event -- structure holding pci controller data that describes
+ * a change in the isolation status of a PCI slot.
+ * A pointer to this struct is passed as the data pointer in a notify callback.
+ */
+struct peh_event {
+	struct list_head list;
+	struct pci_dev *dev; /* affected device */
+	enum pci_channel_state state; /* PCI bus state for the affected device */
+	int time_unavail; /* milliseconds until device might be available */
+};
+
+/**
+ * peh_send_failure_event - generate a PCI error event
+ * @dev pci device
+ *
+ * This routine builds a PCI error event which will be delivered
+ * to all listeners on the peh_notifier_chain.
+ *
+ * This routine can be called within an interrupt context;
+ * the actual event will be delivered in a normal context
+ * (from a workqueue).
+ */
+int peh_send_failure_event (struct pci_dev *dev,
+                            enum pci_channel_state state,
+                            int time_unavail);
+
+/**
+ * peh_register_notifier - Register to find out about EEH events.
+ * @nb: notifier block to callback on events
+ */
+int peh_register_notifier(struct notifier_block *nb);
+
+/**
+ * peh_unregister_notifier - Unregister to an EEH event notifier.
+ * @nb: notifier block to callback on events
+ */
+int peh_unregister_notifier(struct notifier_block *nb);
+
+/* ---------------------------------------------------------------- */
+
 struct module;
 struct pci_driver {
 	struct list_head node;
@@ -671,6 +746,7 @@ struct pci_driver {
 	int (*resume) (struct pci_dev *dev); /* Device woken up */
 	int (*enable_wake) (struct pci_dev *dev, u32 state, int enable); /* Enable wake event */
+	struct pci_error_handlers err_handler;
 
 	struct device_driver driver;
 	struct pci_dynids dynids;
 };
--- Documentation/pci-error-recovery.txt.linas-orig	2005-05-06 17:44:41.000000000 -0500
+++ Documentation/pci-error-recovery.txt	2005-05-31 15:08:56.000000000 -0500
@@ -0,0 +1,232 @@
+
+                       PCI Error Recovery
+                       ------------------
+                         May 31, 2005
+
+
+Some PCI bus controllers are able to detect certain "hard" PCI errors
+on the bus, such as parity errors on the data and address busses, as
+well as SERR and PERR errors.
+These chipsets are then able to disable I/O to/from the affected device,
+so that, for example, a bad DMA address doesn't end up corrupting system
+memory.  These same chipsets are also able to reset the affected PCI
+device, and return it to working condition.  This document describes a
+generic API for performing error recovery.
+
+The core idea is that after a PCI error has been detected, there must
+be a way for the kernel to coordinate with all affected device drivers
+so that the pci card can be made operational again, possibly after
+performing a full electrical #RST of the PCI card.  The API below
+provides a generic API for device drivers to be notified of PCI
+errors, and to be notified of, and respond to, a reset sequence.
+
+Preliminary sketch of API, cut-n-pasted-n-modified email from
+Ben Herrenschmidt, circa 5 april 2005
+
+The error recovery API support is exposed to the driver in the form of
+a structure of function pointers pointed to by a new field in struct
+pci_driver.  The absence of this pointer in pci_driver denotes a
+"non-aware" driver, behaviour on these is platform dependent.
+Platforms like ppc64 can try to simulate pci hotplug remove/add.
+
+The definition of "pci_error_token" is not covered here. It is based on
+Seto's work on the synchronous error detection. We still need to define
+functions for extracting information out of an opaque error token. This is
+separate from this API.
+
+This structure has the form:
+
+struct pci_error_handlers
+{
+	int (*error_detected)(struct pci_dev *dev, pci_error_token error);
+	int (*mmio_enabled)(struct pci_dev *dev);
+	int (*resume)(struct pci_dev *dev);
+	int (*link_reset)(struct pci_dev *dev);
+	int (*slot_reset)(struct pci_dev *dev);
+};
+
+A driver doesn't have to implement all of these callbacks. The
+only mandatory one is error_detected(). If a callback is not
+implemented, the corresponding feature is considered unsupported.
+For example, if mmio_enabled() and resume() aren't there, then the
+driver is assumed as not doing any direct recovery and requires
+a reset. If link_reset() is not implemented, the card is assumed as
+not caring about link resets, in which case, if recovery is supported,
+the core can try to recover (but not slot_reset() unless it really did
+reset the slot). If slot_reset() is not supported, link_reset() can
+be called instead on a slot reset.
+
+At first, the call will always be :
+
+	1) error_detected()
+
+	Error detected. This is sent once after an error has been detected. At
+this point, the device might not be accessible anymore depending on the
+platform (the slot will be isolated on ppc64). The driver may already
+have "noticed" the error because of a failing IO, but this is the proper
+"synchronisation point", that is, it gives a chance to the driver to
+cleanup, waiting for pending stuff (timers, whatever, etc...) to
+complete; it can take semaphores, schedule, etc... everything but touch
+the device. Within this function and after it returns, the driver
+shouldn't do any new IOs. Called in task context. This is sort of a
+"quiesce" point. See note about interrupts at the end of this doc.
+
+	Result codes:
+		- PCIERR_RESULT_CAN_RECOVER:
+		  Driver returns this if it thinks it might be able to recover
+		  the HW by just banging IOs or if it wants to be given
+		  a chance to extract some diagnostic information (see
+		  below).
+		- PCIERR_RESULT_NEED_RESET:
+		  Driver returns this if it thinks it can't recover unless the
+		  slot is reset.
+		- PCIERR_RESULT_DISCONNECT:
+		  Return this if driver thinks it won't recover at all,
+		  (this will detach the driver ? or just leave it
+		  dangling ? to be decided)
+
+So at this point, we have called error_detected() for all drivers
+on the segment that had the error. On ppc64, the slot is isolated. What
+happens now typically depends on the result from the drivers.
+If all drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we
+would re-enable IOs on the slot (or do nothing special if the platform
+doesn't isolate slots) and call 2). If not and we can reset slots, we go
+to 4), if neither, we have a dead slot. If it's a hotplug slot, we might
+"simulate" reset by triggering HW unplug/replug though.
+
+>>> Current ppc64 implementation assumes that a device driver will
+>>> *not* schedule or semaphore in this routine; the current ppc64
+>>> implementation uses one kernel thread to notify all devices;
+>>> thus, if one device sleeps/schedules, all devices are affected.
+>>> Doing better requires complex multi-threaded logic in the error
+>>> recovery implementation (e.g. waiting for all notification threads
+>>> to "join" before proceeding with recovery.) This seems excessively
+>>> complex and not worth implementing.
+
+	2) mmio_enabled()
+
+	This is the "early recovery" call. IOs are allowed again, but DMA is
+not (hrm... to be discussed, I prefer not), with some restrictions. This
+is NOT a callback for the driver to start operations again, only to
+peek/poke at the device, extract diagnostic information, if any, and
+eventually do things like trigger a device local reset or some such,
+but not restart operations. This is sent if all drivers on a segment
+agree that they can try to recover and no automatic link reset was
+performed by the HW. If the platform can't just re-enable IOs without
+a slot reset or a link reset, it doesn't call this callback and goes
+directly to 3) or 4). All IOs should be done _synchronously_ from
+within this callback, errors triggered by them will be returned via
+the normal pci_check_whatever() api, no new error_detected() callback
+will be issued due to an error happening here.
+However, such an error might cause IOs to be re-blocked for the whole
+segment, and thus invalidate the recovery that other devices on the same
+segment might have done, forcing the whole segment into one of the next
+states, that is link reset or slot reset.
+
+	Result codes:
+		- PCIERR_RESULT_RECOVERED
+		  Driver returns this if it thinks the device is fully
+		  functional and thinks it is ready to start
+		  normal driver operations again. There is no
+		  guarantee that the driver will actually be
+		  allowed to proceed, as another driver on the
+		  same segment might have failed and thus triggered a
+		  slot reset on platforms that support it.
+
+		- PCIERR_RESULT_NEED_RESET
+		  Driver returns this if it thinks the device is not
+		  recoverable in its current state and it needs a slot
+		  reset to proceed.
+
+		- PCIERR_RESULT_DISCONNECT
+		  Same as above. Total failure, no recovery even after
+		  reset driver dead. (To be defined more precisely)
+
+>>> The current ppc64 implementation does not implement this callback.
+
+	3) link_reset()
+
+	This is called after the link has been reset. This is typically
+a PCI Express specific state at this point and is done whenever a
+non-fatal error has been detected that can be "solved" by resetting
+the link. This call informs the driver of the reset and the driver
+should check if the device appears to be in working condition.
+This function acts a bit like 2) mmio_enabled(), in that the driver
+is not supposed to restart normal driver I/O operations right away.
+Instead, it should just "probe" the device to check its recoverability
+status. If all is right, then the core will call resume() once all
+drivers have ack'd link_reset().
+
+	Result codes:
+		(identical to mmio_enabled)
+
+>>> The current ppc64 implementation does not implement this callback.
+
+	4) slot_reset()
+
+	This is called after the slot has been soft or hard reset by the
+platform.
+A soft reset consists of asserting the adapter #RST line and then restoring
+the PCI BARs and PCI configuration header. If the platform supports PCI
+hotplug, then it might instead perform a hard reset by toggling power on
+the slot off/on. This call gives drivers the chance to re-initialize the
+hardware (re-download firmware, etc.), but drivers shouldn't restart normal
+I/O processing operations at this point. (See note about interrupts;
+interrupts aren't guaranteed to be delivered until the resume() callback
+has been called). If all device drivers report success on this callback,
+the platform will call resume() to complete the error handling and let
+the driver restart normal I/O processing.
+
+A driver can still return a critical failure for this function if
+it can't get the device operational after reset. If the platform
+previously tried a soft reset, it might now try a hard reset (power
+cycle) and then call slot_reset() again. If the device still can't
+be recovered, there is nothing more that can be done; the platform
+will typically report a "permanent failure" in such a case. The
+device will be considered "dead" in this case.
+
+	Result codes:
+		- PCIERR_RESULT_DISCONNECT
+		  Same as above.
+
+	5) resume()
+
+	This is called if all drivers on the segment have returned
+PCIERR_RESULT_RECOVERED from one of the 3 previous callbacks.
+That basically tells the driver to restart activity, that everything
+is back and running. No result code is taken into account here. If
+a new error happens, it will restart a new error handling process.
+
+That's it. I think this covers all the possibilities. The way those
+callbacks are called is platform policy. A platform with no slot reset
+capability for example may want to just "ignore" drivers that can't
+recover (disconnect them) and try to let other cards on the same segment
+recover. Keep in mind that in most real life cases, though, there will
+be only one driver per segment.
+
+Now, there is a note about interrupts. If you get an interrupt and your
+device is dead or has been isolated, there is a problem :)
+
+After much thinking, I decided to leave that to the platform. That is,
+the recovery API only specifies that:
+
+ - There is no guarantee that interrupt delivery can proceed from any
+device on the segment starting from the error detection and until the
+restart callback is sent, at which point interrupts are expected to be
+fully operational.
+
+ - There is no guarantee that interrupt delivery is stopped, that is, a
+driver that gets an interrupt after detecting an error, or that detects
+an error within the interrupt handler such that it prevents proper
+ack'ing of the interrupt (and thus removal of the source) should just
+return IRQ_NOTHANDLED. It's up to the platform to deal with that
+condition, typically by masking the irq source during the duration of
+the error handling. It is expected that the platform "knows" which
+interrupts are routed to error-management capable slots and can deal
+with temporarily disabling that irq number during error processing (this
+isn't terribly complex). That means some IRQ latency for other devices
+sharing the interrupt, but there is simply no other way.
+High end platforms aren't supposed to share interrupts between many
+devices anyway :)
+
+
--- drivers/pci/Makefile.linas-orig	2005-04-29 20:31:33.000000000 -0500
+++ drivers/pci/Makefile	2005-05-06 12:28:43.000000000 -0500
@@ -3,7 +3,7 @@
 #
 
 obj-y		+= access.o bus.o probe.o remove.o pci.o quirks.o \
-			names.o pci-driver.o search.o pci-sysfs.o \
+			names.o pci-driver.o pci-error.o search.o pci-sysfs.o \
 			rom.o
 obj-$(CONFIG_PROC_FS) += proc.o
 
--- drivers/pci/pci-error.c.linas-orig	2005-05-06 17:44:47.000000000 -0500
+++ drivers/pci/pci-error.c	2005-05-31 13:49:34.000000000 -0500
@@ -0,0 +1,152 @@
+/*
+ * pci-error.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include
+#include
+#include
+
+#undef DEBUG
+
+/** Overview:
+ * PEH, or "PCI Error Handling" is a PCI bridge technology for
+ * dealing with PCI bus errors that can't be dealt with within the
+ * usual PCI framework, except by check-stopping the CPU. Systems
+ * that are designed for high-availability/reliability cannot afford
+ * to crash due to a "mere" PCI error, thus the need for PEH.
+ * A PEH-capable bridge operates by converting a detected error
+ * into a "slot freeze", taking the PCI adapter off-line, making
+ * the slot behave, from the OS'es point of view, as if the slot
+ * were "empty": all reads return 0xff's and all writes are silently
+ * ignored. PEH slot isolation events can be triggered by parity
+ * errors on the address or data busses (e.g. during posted writes),
+ * which in turn might be caused by low voltage on the bus, dust,
+ * vibration, humidity, radioactivity or plain-old failed hardware.
+ *
+ * Note, however, that one of the leading causes of PEH slot
+ * freeze events are buggy device drivers, buggy device microcode,
+ * or buggy device hardware. This is because any attempt by the
+ * device to bus-master data to a memory address that is not
+ * assigned to the device will trigger a slot freeze. (The idea
+ * is to prevent devices-gone-wild from corrupting system memory).
+ * Buggy hardware/drivers will have a miserable time co-existing
+ * with PEH.
+ */
+
+/* PEH event workqueue setup. */
+static spinlock_t peh_eventlist_lock = SPIN_LOCK_UNLOCKED;
+LIST_HEAD(peh_eventlist);
+static void peh_event_handler(void *);
+DECLARE_WORK(peh_event_wq, peh_event_handler, NULL);
+
+static struct notifier_block *peh_notifier_chain;
+
+/**
+ * peh_event_handler - dispatch PEH events. The detection of a frozen
+ * slot can occur inside an interrupt, where it can be hard to do
+ * anything about it. The goal of this routine is to pull these
+ * detection events out of the context of the interrupt handler, and
+ * re-dispatch them for processing at a later time in a normal context.
+ *
+ * @dummy - unused
+ */
+static void peh_event_handler(void *dummy)
+{
+	unsigned long flags;
+	struct peh_event *event;
+
+	while (1) {
+		spin_lock_irqsave(&peh_eventlist_lock, flags);
+		event = NULL;
+		if (!list_empty(&peh_eventlist)) {
+			event = list_entry(peh_eventlist.next, struct peh_event, list);
+			list_del(&event->list);
+		}
+		spin_unlock_irqrestore(&peh_eventlist_lock, flags);
+		if (event == NULL)
+			break;
+
+		printk(KERN_INFO "PEH: Detected PCI bus error on device "
+		       "%s %s\n",
+		       pci_name(event->dev), pci_pretty_name(event->dev));
+
+		notifier_call_chain (&peh_notifier_chain,
+		                     PEH_NOTIFY_ERROR, event);
+
+		pci_dev_put(event->dev);
+		kfree(event);
+	}
+}
+
+
+/**
+ * peh_send_failure_event - generate a PCI error event
+ * @dev pci device
+ *
+ * This routine builds a PCI error event which will be delivered
+ * to all listeners on the peh_notifier_chain.
+ *
+ * This routine can be called within an interrupt context;
+ * the actual event will be delivered in a normal context
+ * (from a workqueue).
+ */
+int peh_send_failure_event (struct pci_dev *dev,
+                            enum pci_channel_state state,
+                            int time_unavail)
+{
+	unsigned long flags;
+	struct peh_event *event;
+
+	event = kmalloc(sizeof(*event), GFP_ATOMIC);
+	if (event == NULL) {
+		printk (KERN_ERR "PEH: out of memory, event not handled\n");
+		return 1;
+	}
+
+	event->dev = dev;
+	event->state = state;
+	event->time_unavail = time_unavail;
+
+	/* We may or may not be called in an interrupt context */
+	spin_lock_irqsave(&peh_eventlist_lock, flags);
+	list_add(&event->list, &peh_eventlist);
+	spin_unlock_irqrestore(&peh_eventlist_lock, flags);
+
+	schedule_work(&peh_event_wq);
+
+	return 0;
+}
+
+/**
+ * peh_register_notifier - Register to find out about EEH events.
+ * @nb: notifier block to callback on events
+ */
+int peh_register_notifier(struct notifier_block *nb)
+{
+	return notifier_chain_register(&peh_notifier_chain, nb);
+}
+
+/**
+ * peh_unregister_notifier - Unregister to an EEH event notifier.
+ * @nb: notifier block to callback on events
+ */
+int peh_unregister_notifier(struct notifier_block *nb)
+{
+	return notifier_chain_unregister(&peh_notifier_chain, nb);
+}
+
+/********************** END OF FILE ******************************/
--- drivers/scsi/ipr.c.linas-orig	2005-04-29 20:33:36.000000000 -0500
+++ drivers/scsi/ipr.c	2005-05-31 15:12:08.000000000 -0500
@@ -5306,6 +5306,85 @@ static void ipr_initiate_ioa_reset(struc
 				shutdown_type);
 }
 
+#ifdef CONFIG_SCSI_IPR_EEH_RECOVERY
+
+/** If the PCI slot is frozen, hold off all i/o
+ * activity; then, as soon as the slot is available again,
+ * initiate an adapter reset.
+ */
+static int ipr_reset_freeze(struct ipr_cmnd *ipr_cmd)
+{
+	list_add_tail(&ipr_cmd->queue, &ipr_cmd->ioa_cfg->pending_q);
+	ipr_cmd->done = ipr_reset_ioa_job;
+	return IPR_RC_JOB_RETURN;
+}
+
+/** ipr_eeh_frozen -- called when slot has experienced a PCI bus error.
+ * This routine is called to tell us that the PCI bus is down.
+ * Can't do anything here, except put the device driver into a
+ * holding pattern, waiting for the PCI bus to come back.
+ */
+static void ipr_eeh_frozen (struct pci_dev *pdev)
+{
+	unsigned long flags = 0;
+	struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev);
+
+	spin_lock_irqsave(ioa_cfg->host->host_lock, flags);
+	_ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_freeze, IPR_SHUTDOWN_NONE);
+	spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags);
+}
+
+/** ipr_eeh_slot_reset - called when pci slot has been reset.
+ *
+ * This routine is called by the pci error recovery
+ * code after the PCI slot has been reset, just before we
+ * should resume normal operations.
+ */ +static int ipr_eeh_slot_reset (struct pci_dev *pdev) +{ + unsigned long flags = 0; + struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev); + + spin_lock_irqsave(ioa_cfg->host->host_lock, flags); + _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_restore_cfg_space, + IPR_SHUTDOWN_NONE); + spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags); + + return PCIERR_RESULT_RECOVERED; +} + +/** This routine is called when the PCI bus has permanently + * failed. This routine should purge all pending I/O and + * shut down the device driver (close and unload). + * XXX Needs to be implemented. + */ +static void ipr_eeh_perm_failure (struct pci_dev *pdev) +{ +#if 0 // XXXXXXXXXXXXXXXXXXXXXXX + ipr_cmd->job_step = ipr_reset_shutdown_ioa; + rc = IPR_RC_JOB_CONTINUE; +#endif +} + +static int ipr_eeh_error_detected (struct pci_dev *pdev, + enum pci_channel_state state) +{ + switch (state) { + case pci_channel_io_frozen: + ipr_eeh_frozen (pdev); + return PCIERR_RESULT_NEED_RESET; + + case pci_channel_io_perm_failure: + ipr_eeh_perm_failure (pdev); + return PCIERR_RESULT_DISCONNECT; + break; + default: + break; + } + return PCIERR_RESULT_NEED_RESET; +} +#endif + /** * ipr_probe_ioa_part2 - Initializes IOAs found in ipr_probe_ioa(..) 
* @ioa_cfg: ioa cfg struct @@ -6015,6 +6094,10 @@ static struct pci_driver ipr_driver = { .id_table = ipr_pci_table, .probe = ipr_probe, .remove = ipr_remove, + .err_handler = { + .error_detected = ipr_eeh_error_detected, + .slot_reset = ipr_eeh_slot_reset, + }, .driver = { .shutdown = ipr_shutdown, }, --- drivers/scsi/sym53c8xx_2/sym_glue.c.linas-orig 2005-04-29 20:33:12.000000000 -0500 +++ drivers/scsi/sym53c8xx_2/sym_glue.c 2005-05-31 13:52:55.000000000 -0500 @@ -770,6 +770,10 @@ static irqreturn_t sym53c8xx_intr(int ir struct sym_hcb *np = (struct sym_hcb *)dev_id; if (DEBUG_FLAGS & DEBUG_TINY) printf_debug ("["); +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + if (np->s.io_state != pci_channel_io_normal) + return IRQ_HANDLED; +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ spin_lock_irqsave(np->s.host->host_lock, flags); sym_interrupt(np); @@ -844,6 +848,27 @@ static void sym_eh_done(struct scsi_cmnd */ static void sym_eh_timeout(u_long p) { __sym_eh_done((struct scsi_cmnd *)p, 1); } +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY +static void sym_eeh_timeout(u_long p) +{ + struct sym_eh_wait *ep = (struct sym_eh_wait *) p; + if (!ep) + return; + complete(&ep->done); +} + +static void sym_eeh_done(struct sym_eh_wait *ep) +{ + if (!ep) + return; + ep->timed_out = 0; + if (!del_timer(&ep->timer)) + return; + + complete(&ep->done); +} +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + /* * Generic method for our eh processing. * The 'op' argument tells what we have to do. @@ -893,6 +918,37 @@ prepare: /* Try to proceed the operation we have been asked for */ sts = -1; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + + /* We may be in an error condition because the PCI bus + * went down. In this case, we need to wait until the + * PCI bus is reset, the card is reset, and only then + * proceed with the scsi error recovery. We'll wait + * for 15 seconds for this to happen. 
+ */ +#define WAIT_FOR_PCI_RECOVERY 15 + if (np->s.io_state != pci_channel_io_normal) { + struct sym_eh_wait eeh, *eep = &eeh; + np->s.io_reset_wait = eep; + init_completion(&eep->done); + init_timer(&eep->timer); + eep->to_do = SYM_EH_DO_WAIT; + eep->timer.expires = jiffies + (WAIT_FOR_PCI_RECOVERY*HZ); + eep->timer.function = sym_eeh_timeout; + eep->timer.data = (u_long)eep; + eep->timed_out = 1; /* Be pessimistic for once :) */ + add_timer(&eep->timer); + spin_unlock_irq(np->s.host->host_lock); + wait_for_completion(&eep->done); + spin_lock_irq(np->s.host->host_lock); + if (eep->timed_out) { + printk (KERN_ERR "%s: Timed out waiting for PCI reset\n", + sym_name(np)); + } + np->s.io_reset_wait = NULL; + } +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + switch(op) { case SYM_EH_ABORT: sts = sym_abort_scsiio(np, cmd, 1); @@ -1625,6 +1681,8 @@ static struct Scsi_Host * __devinit sym_ if (!np) goto attach_failed; np->s.device = dev->pdev; + np->s.io_state = pci_channel_io_normal; + np->s.io_reset_wait = NULL; np->bus_dmat = dev->pdev; /* Result in 1 DMA pool per HBA */ host_data->ncb = np; np->s.host = instance; @@ -2048,6 +2106,59 @@ static int sym_detach(struct sym_hcb *np return 1; } +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY +/** sym2_io_error_detected() is called when a PCI error is detected */ +int sym2_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + np->s.io_state = state; + // XXX If slot is permanently frozen, then what? + // Should we scsi_remove_host() maybe ?? + + /* Request a slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** sym2_io_slot_reset is called when the pci bus has been reset. + * Restart the card from scratch.
*/ +int sym2_io_slot_reset (struct pci_dev *pdev) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + msleep (500); // pure paranoia -- wait for device to settle + printk (KERN_INFO "%s: recovering from a PCI slot reset\n", + sym_name(np)); + + if (pci_enable_device(pdev)) + printk (KERN_ERR "%s: device setup failed most egregiously\n", + sym_name(np)); + + pci_set_master(pdev); + + /* Perform host reset only on one instance of the card */ + if (0 == PCI_FUNC (pdev->devfn)) + sym_reset_scsi_bus(np, 0); + + return PCIERR_RESULT_RECOVERED; +} + +/** sym2_io_resume is called when the error recovery driver + * tells us that it's OK to resume normal operation. + */ +void sym2_io_resume (struct pci_dev *pdev) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + /* Perform device startup only once for this card. */ + if (0 == PCI_FUNC (pdev->devfn)) + sym_start_up (np, 1); + + np->s.io_state = pci_channel_io_normal; + sym_eeh_done (np->s.io_reset_wait); +} +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + /* * Driver host template.
*/ @@ -2359,6 +2470,11 @@ static struct pci_driver sym2_driver = { .id_table = sym2_id_table, .probe = sym2_probe, .remove = __devexit_p(sym2_remove), + .err_handler = { + .error_detected = sym2_io_error_detected, + .slot_reset = sym2_io_slot_reset, + .resume = sym2_io_resume, + }, }; static int __init sym2_init(void) --- drivers/scsi/sym53c8xx_2/sym_glue.h.linas-orig 2005-04-29 20:32:45.000000000 -0500 +++ drivers/scsi/sym53c8xx_2/sym_glue.h 2005-05-06 16:29:39.000000000 -0500 @@ -358,6 +358,10 @@ struct sym_shcb { char chip_name[8]; struct pci_dev *device; + /* pci bus i/o state; waiter for clearing of i/o state */ + enum pci_channel_state io_state; + struct sym_eh_wait *io_reset_wait; + struct Scsi_Host *host; void __iomem * mmio_va; /* MMIO kernel virtual address */ --- drivers/scsi/sym53c8xx_2/sym_hipd.c.linas-orig 2005-04-29 20:22:45.000000000 -0500 +++ drivers/scsi/sym53c8xx_2/sym_hipd.c 2005-05-20 15:40:43.000000000 -0500 @@ -2836,6 +2836,7 @@ void sym_interrupt (struct sym_hcb *np) u_char istat, istatc; u_char dstat; u_short sist; + u_int icnt; /* * interrupt on the fly ? @@ -2877,6 +2878,7 @@ void sym_interrupt (struct sym_hcb *np) sist = 0; dstat = 0; istatc = istat; + icnt = 0; do { if (istatc & SIP) sist |= INW (nc_sist); @@ -2884,6 +2886,14 @@ void sym_interrupt (struct sym_hcb *np) dstat |= INB (nc_dstat); istatc = INB (nc_istat); istat |= istatc; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + /* Prevent deadlock waiting on a condition that may never clear. */ + icnt ++; + if (100 < icnt) { + if (eeh_slot_is_isolated(np->s.device)) + return; + } +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ } while (istatc & (SIP|DIP)); if (DEBUG_FLAGS & DEBUG_TINY) --- drivers/scsi/Kconfig.linas-orig 2005-04-29 20:31:30.000000000 -0500 +++ drivers/scsi/Kconfig 2005-05-24 11:17:40.000000000 -0500 @@ -1032,6 +1032,14 @@ config SCSI_SYM53C8XX_IOMAPPED the card. This is significantly slower then using memory mapped IO. Most people should answer N. 
+config SCSI_SYM53C8XX_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on SCSI_SYM53C8XX_2 && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config SCSI_IPR tristate "IBM Power Linux RAID adapter support" depends on PCI && SCSI @@ -1057,6 +1065,14 @@ config SCSI_IPR_DUMP If you enable this support, the iprdump daemon can be used to capture adapter failure analysis information. +config SCSI_IPR_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on SCSI_IPR && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config SCSI_ZALON tristate "Zalon SCSI support" depends on GSC && SCSI --- arch/ppc64/defconfig.linas-orig 2005-05-20 12:16:19.000000000 -0500 +++ arch/ppc64/defconfig 2005-05-20 12:16:58.000000000 -0500 @@ -255,6 +255,7 @@ CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MOD CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set +CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY=y # CONFIG_SCSI_QLOGIC_ISP is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set --- arch/ppc64/configs/pSeries_defconfig.linas-orig 2005-04-29 20:34:04.000000000 -0500 +++ arch/ppc64/configs/pSeries_defconfig 2005-05-24 11:18:45.000000000 -0500 @@ -275,9 +275,11 @@ CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MOD CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set +CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY=y CONFIG_SCSI_IPR=y # CONFIG_SCSI_IPR_TRACE is not set # CONFIG_SCSI_IPR_DUMP is not set +CONFIG_SCSI_IPR_EEH_RECOVERY=y # CONFIG_SCSI_QLOGIC_ISP is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set --- include/asm-ppc64/eeh.h.linas-orig 2005-04-29 20:34:03.000000000 -0500 +++ 
include/asm-ppc64/eeh.h 2005-05-31 13:55:18.000000000 -0500 @@ -1,4 +1,4 @@ -/* +/* * eeh.h * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation. * @@ -6,12 +6,12 @@ * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -23,6 +23,7 @@ #include #include #include +#include #include struct pci_dev; @@ -36,6 +37,11 @@ struct notifier_block; #define EEH_MODE_SUPPORTED (1<<0) #define EEH_MODE_NOCHECK (1<<1) #define EEH_MODE_ISOLATED (1<<2) +#define EEH_MODE_RECOVERING (1<<3) + +/* Max number of EEH freezes allowed before we consider the device + * to be permanently disabled. */ +#define EEH_MAX_ALLOWED_FREEZES 5 void __init eeh_init(void); unsigned long eeh_check_failure(const volatile void __iomem *token, @@ -59,35 +65,82 @@ void eeh_add_device_late(struct pci_dev * eeh_remove_device - undo EEH setup for the indicated pci device * @dev: pci device to be removed * - * This routine should be when a device is removed from a running - * system (e.g. by hotplug or dlpar). + * This routine should be called when a device is removed from + * a running system (e.g. by hotplug or dlpar). It unregisters + * the PCI device from the EEH subsystem. I/O errors affecting + * this device will no longer be detected after this call; thus, + * i/o errors affecting this slot may leave this device unusable. 
*/ void eeh_remove_device(struct pci_dev *); -#define EEH_DISABLE 0 -#define EEH_ENABLE 1 -#define EEH_RELEASE_LOADSTORE 2 -#define EEH_RELEASE_DMA 3 +/** + * eeh_slot_is_isolated -- return non-zero value if slot is frozen + */ +int eeh_slot_is_isolated (struct pci_dev *dev); /** - * Notifier event flags. + * eeh_ioaddr_is_isolated -- return non-zero value if device at + * io address is frozen. */ -#define EEH_NOTIFY_FREEZE 1 +int eeh_ioaddr_is_isolated(const volatile void __iomem *token); -/** EEH event -- structure holding pci slot data that describes - * a change in the isolation status of a PCI slot. A pointer - * to this struct is passed as the data pointer in a notify callback. - */ -struct eeh_event { - struct list_head list; - struct pci_dev *dev; - struct device_node *dn; - int reset_state; -}; - -/** Register to find out about EEH events. */ -int eeh_register_notifier(struct notifier_block *nb); -int eeh_unregister_notifier(struct notifier_block *nb); +/** + * eeh_slot_error_detail -- record an EEH error condition to the log + * @severity: 1 if temporary, 2 if permanent failure. + * + * Obtains the EEH error details from the RTAS subsystem, + * and then logs these details with the RTAS error log system. + */ +void eeh_slot_error_detail (struct device_node *dn, int severity); + +/** + * rtas_set_slot_reset -- unfreeze a frozen slot + * + * Clear the EEH-frozen condition on a slot. This routine + * does this by asserting the PCI #RST line for 1/4 of + * a second; this routine will sleep while the adapter is + * being reset. + */ +void rtas_set_slot_reset (struct device_node *dn); + +/** rtas_pci_slot_reset raises/lowers the pci #RST line + * state: 1/0 to raise/lower the #RST + * + * Clear the EEH-frozen condition on a slot. This routine + * asserts the PCI #RST line if the 'state' argument is '1', + * and drops the #RST line if 'state' is '0'. This routine is + * safe to call in an interrupt context.
+ * + */ +void rtas_pci_slot_reset(struct device_node *dn, int state); +void eeh_pci_slot_reset(struct pci_dev *dev, int state); + +/** eeh_pci_slot_availability -- Indicates whether a PCI + * slot is ready to be used. After a PCI reset, it may take a while + * for the PCI fabric to fully reset the communications path to the + * given PCI card. This routine can be used to determine how long + * to wait before a PCI slot might become usable. + * + * This routine returns how long to wait (in milliseconds) before + * the slot is expected to be usable. A value of zero means the + * slot is immediately usable. A negative value means that the + * slot is permanently disabled. + */ +int eeh_pci_slot_availability(struct pci_dev *dev); + +/** Restore device configuration info across device resets. + */ +void eeh_restore_bars(struct device_node *); +void eeh_pci_restore_bars(struct pci_dev *dev); + +/** + * rtas_configure_bridge -- firmware initialization of pci bridge + * + * Ask the firmware to configure any PCI bridge devices + * located behind the indicated node. Required after a + * pci device reset. + */ +void rtas_configure_bridge(struct device_node *dn); /** * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure. @@ -116,7 +169,7 @@ int eeh_unregister_notifier(struct notif #define EEH_IO_ERROR_VALUE(size) (-1UL) #endif -/* +/* * MMIO read/write operations with EEH support.
*/ static inline u8 eeh_readb(const volatile void __iomem *addr) @@ -238,21 +291,21 @@ static inline void eeh_memcpy_fromio(voi *((u8 *)dest) = *((volatile u8 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 1); - dest = (void *)((unsigned long)dest + 1); + dest = (void *)((unsigned long)dest + 1); n--; } while(n > 4) { *((u32 *)dest) = *((volatile u32 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 4); - dest = (void *)((unsigned long)dest + 4); + dest = (void *)((unsigned long)dest + 4); n -= 4; } while(n) { *((u8 *)dest) = *((volatile u8 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 1); - dest = (void *)((unsigned long)dest + 1); + dest = (void *)((unsigned long)dest + 1); n--; } __asm__ __volatile__ ("sync" : : : "memory"); @@ -274,19 +327,19 @@ static inline void eeh_memcpy_toio(volat while(n && (!EEH_CHECK_ALIGN(vdest, 4) || !EEH_CHECK_ALIGN(src, 4))) { *((volatile u8 *)vdest) = *((u8 *)src); src = (void *)((unsigned long)src + 1); - vdest = (void *)((unsigned long)vdest + 1); + vdest = (void *)((unsigned long)vdest + 1); n--; } while(n > 4) { *((volatile u32 *)vdest) = *((volatile u32 *)src); src = (void *)((unsigned long)src + 4); - vdest = (void *)((unsigned long)vdest + 4); + vdest = (void *)((unsigned long)vdest + 4); n-=4; } while(n) { *((volatile u8 *)vdest) = *((u8 *)src); src = (void *)((unsigned long)src + 1); - vdest = (void *)((unsigned long)vdest + 1); + vdest = (void *)((unsigned long)vdest + 1); n--; } __asm__ __volatile__ ("sync" : : : "memory"); --- include/asm-ppc64/prom.h.linas-orig 2005-04-29 20:32:46.000000000 -0500 +++ include/asm-ppc64/prom.h 2005-05-06 12:28:43.000000000 -0500 @@ -119,6 +119,7 @@ struct property { */ struct pci_controller; struct iommu_table; +struct eeh_recovery_ops; struct device_node { char *name; @@ -137,8 +138,12 @@ struct device_node { int devfn; /* for pci devices */ int 
eeh_mode; /* See eeh.h for possible EEH_MODEs */ int eeh_config_addr; + int eeh_check_count; /* number of times device driver ignored error */ + int eeh_freeze_count; /* number of times this device froze up. */ + int eeh_is_bridge; /* device is pci-to-pci bridge */ struct pci_controller *phb; /* for pci devices */ struct iommu_table *iommu_table; /* for phb's or bridges */ + u32 config_space[16]; /* saved PCI config space */ struct property *properties; struct device_node *parent; --- include/asm-ppc64/rtas.h.linas-orig 2005-04-29 20:32:32.000000000 -0500 +++ include/asm-ppc64/rtas.h 2005-05-06 12:28:43.000000000 -0500 @@ -243,4 +243,6 @@ extern unsigned long rtas_rmo_buf; #define GLOBAL_INTERRUPT_QUEUE 9005 +extern int rtas_write_config(struct device_node *dn, int where, int size, u32 val); + #endif /* _PPC64_RTAS_H */ --- arch/ppc64/kernel/eeh.c.linas-orig 2005-04-29 20:29:19.000000000 -0500 +++ arch/ppc64/kernel/eeh.c 2005-05-31 15:13:51.000000000 -0500 @@ -1,32 +1,33 @@ /* * eeh.c * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#include +#include #include +#include #include -#include #include #include #include #include #include #include +#include #include #include #include @@ -49,8 +50,8 @@ * were "empty": all reads return 0xff's and all writes are silently * ignored. EEH slot isolation events can be triggered by parity * errors on the address or data busses (e.g. during posted writes), - * which in turn might be caused by dust, vibration, humidity, - * radioactivity or plain-old failed hardware. + * which in turn might be caused by low voltage on the bus, dust, + * vibration, humidity, radioactivity or plain-old failed hardware. * * Note, however, that one of the leading causes of EEH slot * freeze events are buggy device drivers, buggy device microcode, @@ -75,22 +76,13 @@ #define BUID_HI(buid) ((buid) >> 32) #define BUID_LO(buid) ((buid) & 0xffffffff) -/* EEH event workqueue setup. */ -static DEFINE_SPINLOCK(eeh_eventlist_lock); -LIST_HEAD(eeh_eventlist); -static void eeh_event_handler(void *); -DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL); - -static struct notifier_block *eeh_notifier_chain; - /* * If a device driver keeps reading an MMIO register in an interrupt * handler after a slot isolation event has occurred, we assume it * is broken and panic. This sets the threshold for how many read * attempts we allow before panicking. 
*/ -#define EEH_MAX_FAILS 1000 -static atomic_t eeh_fail_count; +#define EEH_MAX_FAILS 100000 /* RTAS tokens */ static int ibm_set_eeh_option; @@ -107,6 +99,10 @@ static DEFINE_SPINLOCK(slot_errbuf_lock) static int eeh_error_buf_size; /* System monitoring statistics */ +static DEFINE_PER_CPU(unsigned long, no_device); +static DEFINE_PER_CPU(unsigned long, no_dn); +static DEFINE_PER_CPU(unsigned long, no_cfg_addr); +static DEFINE_PER_CPU(unsigned long, ignored_check); static DEFINE_PER_CPU(unsigned long, total_mmio_ffs); static DEFINE_PER_CPU(unsigned long, false_positives); static DEFINE_PER_CPU(unsigned long, ignored_failures); @@ -225,9 +221,9 @@ pci_addr_cache_insert(struct pci_dev *de while (*p) { parent = *p; piar = rb_entry(parent, struct pci_io_addr_range, rb_node); - if (alo < piar->addr_lo) { + if (ahi < piar->addr_lo) { p = &parent->rb_left; - } else if (ahi > piar->addr_hi) { + } else if (alo > piar->addr_hi) { p = &parent->rb_right; } else { if (dev != piar->pcidev || @@ -246,6 +242,11 @@ pci_addr_cache_insert(struct pci_dev *de piar->pcidev = dev; piar->flags = flags; +#ifdef DEBUG + printk (KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n", + alo, ahi, pci_name (dev)); +#endif + rb_link_node(&piar->rb_node, parent, p); rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root); @@ -268,9 +269,10 @@ static void __pci_addr_cache_insert_devi /* Skip any devices for which EEH is not enabled. 
*/ if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || dn->eeh_mode & EEH_MODE_NOCHECK) { -#ifdef DEBUG - printk(KERN_INFO "PCI: skip building address cache for=%s %s\n", - pci_name(dev), pci_pretty_name(dev)); +// #ifdef DEBUG +#if 1 + printk(KERN_INFO "PCI: skip building address cache for=%s %s %s\n", + pci_name(dev), pci_pretty_name(dev), dn->type); #endif return; } @@ -369,8 +371,12 @@ void pci_addr_cache_remove_device(struct */ void __init pci_addr_cache_build(void) { + struct device_node *dn; struct pci_dev *dev = NULL; + if (!eeh_subsystem_enabled) + return; + spin_lock_init(&pci_io_addr_cache_root.piar_lock); while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { @@ -379,6 +385,17 @@ void __init pci_addr_cache_build(void) continue; } pci_addr_cache_insert_device(dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + if (dn) { + int i; + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); + + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + dn->eeh_is_bridge = 1; + } } #ifdef DEBUG @@ -390,24 +407,32 @@ void __init pci_addr_cache_build(void) /* --------------------------------------------------------------- */ /* Above lies the PCI Address Cache. Below lies the EEH event infrastructure */ -/** - * eeh_register_notifier - Register to find out about EEH events. - * @nb: notifier block to callback on events - */ -int eeh_register_notifier(struct notifier_block *nb) +void eeh_slot_error_detail (struct device_node *dn, int severity) { - return notifier_chain_register(&eeh_notifier_chain, nb); -} + unsigned long flags; + int rc; -/** - * eeh_unregister_notifier - Unregister to an EEH event notifier. 
- * @nb: notifier block to callback on events - */ -int eeh_unregister_notifier(struct notifier_block *nb) -{ - return notifier_chain_unregister(&eeh_notifier_chain, nb); + if (!dn) return; + + /* Log the error with the rtas logger */ + spin_lock_irqsave(&slot_errbuf_lock, flags); + memset(slot_errbuf, 0, eeh_error_buf_size); + + rc = rtas_call(ibm_slot_error_detail, + 8, 1, NULL, dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), NULL, 0, + virt_to_phys(slot_errbuf), + eeh_error_buf_size, + severity); + + if (rc == 0) + log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); + spin_unlock_irqrestore(&slot_errbuf_lock, flags); } +EXPORT_SYMBOL(eeh_slot_error_detail); + /** * read_slot_reset_state - Read the reset state of a device node's slot * @dn: device node to read @@ -422,6 +447,7 @@ static int read_slot_reset_state(struct outputs = 4; } else { token = ibm_read_slot_reset_state; + rets[2] = 0; /* fake PE Unavailable info */ outputs = 3; } @@ -430,75 +456,8 @@ static int read_slot_reset_state(struct } /** - * eeh_panic - call panic() for an eeh event that cannot be handled. - * The philosophy of this routine is that it is better to panic and - * halt the OS than it is to risk possible data corruption by - * oblivious device drivers that don't know better. - * - * @dev pci device that had an eeh event - * @reset_state current reset state of the device slot - */ -static void eeh_panic(struct pci_dev *dev, int reset_state) -{ - /* - * XXX We should create a separate sysctl for this. - * - * Since the panic_on_oops sysctl is used to halt the system - * in light of potential corruption, we can use it here. 
- */ - if (panic_on_oops) - panic("EEH: MMIO failure (%d) on device:%s %s\n", reset_state, - pci_name(dev), pci_pretty_name(dev)); - else { - __get_cpu_var(ignored_failures)++; - printk(KERN_INFO "EEH: Ignored MMIO failure (%d) on device:%s %s\n", - reset_state, pci_name(dev), pci_pretty_name(dev)); - } -} - -/** - * eeh_event_handler - dispatch EEH events. The detection of a frozen - * slot can occur inside an interrupt, where it can be hard to do - * anything about it. The goal of this routine is to pull these - * detection events out of the context of the interrupt handler, and - * re-dispatch them for processing at a later time in a normal context. - * - * @dummy - unused - */ -static void eeh_event_handler(void *dummy) -{ - unsigned long flags; - struct eeh_event *event; - - while (1) { - spin_lock_irqsave(&eeh_eventlist_lock, flags); - event = NULL; - if (!list_empty(&eeh_eventlist)) { - event = list_entry(eeh_eventlist.next, struct eeh_event, list); - list_del(&event->list); - } - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); - if (event == NULL) - break; - - printk(KERN_INFO "EEH: MMIO failure (%d), notifiying device " - "%s %s\n", event->reset_state, - pci_name(event->dev), pci_pretty_name(event->dev)); - - atomic_set(&eeh_fail_count, 0); - notifier_call_chain (&eeh_notifier_chain, - EEH_NOTIFY_FREEZE, event); - - __get_cpu_var(slot_resets)++; - - pci_dev_put(event->dev); - kfree(event); - } -} - -/** - * eeh_token_to_phys - convert EEH address token to phys address - * @token i/o token, should be address in the form 0xE.... + * eeh_token_to_phys - convert I/O address to phys address + * @token i/o address, should be address in the form 0xA.... 
*/ static inline unsigned long eeh_token_to_phys(unsigned long token) { @@ -513,6 +472,18 @@ static inline unsigned long eeh_token_to return pa | (token & (PAGE_SIZE-1)); } + +static inline struct pci_dev * eeh_find_pci_dev(struct device_node *dn) +{ + struct pci_dev *dev = NULL; + for_each_pci_dev(dev) { + if (pci_device_to_OF_node(dev) == dn) + return dev; + } + return NULL; +} + + /** * eeh_dn_check_failure - check if all 1's data is due to EEH slot freeze * @dn device node @@ -528,29 +499,37 @@ static inline unsigned long eeh_token_to * * It is safe to call this routine in an interrupt context. */ +extern void disable_irq_nosync(unsigned int); + int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) { int ret; int rets[3]; - unsigned long flags; - int rc, reset_state; - struct eeh_event *event; + enum pci_channel_state state; __get_cpu_var(total_mmio_ffs)++; if (!eeh_subsystem_enabled) return 0; - if (!dn) + if (!dn) { + __get_cpu_var(no_dn)++; return 0; + } /* Access to IO BARs might get this far and still not want checking. */ if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || dn->eeh_mode & EEH_MODE_NOCHECK) { + __get_cpu_var(ignored_check)++; +#ifdef DEBUG + printk ("EEH:ignored check for %s %s\n", + pci_pretty_name (dev), dn->full_name); +#endif return 0; } if (!dn->eeh_config_addr) { + __get_cpu_var(no_cfg_addr)++; return 0; } @@ -559,12 +538,18 @@ int eeh_dn_check_failure(struct device_n * slot, we know it's bad already, we don't need to check... 
*/ if (dn->eeh_mode & EEH_MODE_ISOLATED) { - atomic_inc(&eeh_fail_count); - if (atomic_read(&eeh_fail_count) >= EEH_MAX_FAILS) { + dn->eeh_check_count ++; + if (dn->eeh_check_count >= EEH_MAX_FAILS) { + printk (KERN_ERR "EEH: Device driver ignored %d bad reads, panicing\n", + dn->eeh_check_count); + dump_stack(); /* re-read the slot reset state */ if (read_slot_reset_state(dn, rets) != 0) rets[0] = -1; /* reset state unknown */ - eeh_panic(dev, rets[0]); + + /* If we are here, then we hit an infinite loop. Stop. */ + panic("EEH: MMIO halt (%d) on device:%s %s\n", rets[0], + pci_name(dev), pci_pretty_name(dev)); } return 0; } @@ -577,53 +562,41 @@ int eeh_dn_check_failure(struct device_n * In any case they must share a common PHB. */ ret = read_slot_reset_state(dn, rets); - if (!(ret == 0 && rets[1] == 1 && (rets[0] == 2 || rets[0] == 4))) { + if (!(ret == 0 && ((rets[1] == 1 && (rets[0] == 2 || rets[0] >= 4)) + || (rets[0] == 5)))) { __get_cpu_var(false_positives)++; return 0; } - /* prevent repeated reports of this failure */ - dn->eeh_mode |= EEH_MODE_ISOLATED; - - reset_state = rets[0]; + /* Note that empty slots will fail; empty slots don't have children... 
*/ + if ((rets[0] == 5) && (dn->child == NULL)) { + __get_cpu_var(false_positives)++; + return 0; + } - spin_lock_irqsave(&slot_errbuf_lock, flags); - memset(slot_errbuf, 0, eeh_error_buf_size); + /* Prevent repeated reports of this failure */ + dn->eeh_mode |= EEH_MODE_ISOLATED; + __get_cpu_var(slot_resets)++; - rc = rtas_call(ibm_slot_error_detail, - 8, 1, NULL, dn->eeh_config_addr, - BUID_HI(dn->phb->buid), - BUID_LO(dn->phb->buid), NULL, 0, - virt_to_phys(slot_errbuf), - eeh_error_buf_size, - 1 /* Temporary Error */); + if (!dev) + dev = eeh_find_pci_dev (dn); - if (rc == 0) - log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); - spin_unlock_irqrestore(&slot_errbuf_lock, flags); + /* Some devices go crazy if irq's are not ack'ed; disable irq now */ + if (dev) + disable_irq_nosync (dev->irq); + + state = pci_channel_io_normal; + if ((rets[0] == 2) || (rets[0] == 4)) + state = pci_channel_io_frozen; + if (rets[0] == 5) + state = pci_channel_io_perm_failure; - printk(KERN_INFO "EEH: MMIO failure (%d) on device: %s %s\n", - rets[0], dn->name, dn->full_name); - event = kmalloc(sizeof(*event), GFP_ATOMIC); - if (event == NULL) { - eeh_panic(dev, reset_state); - return 1; - } - - event->dev = dev; - event->dn = dn; - event->reset_state = reset_state; - - /* We may or may not be called in an interrupt context */ - spin_lock_irqsave(&eeh_eventlist_lock, flags); - list_add(&event->list, &eeh_eventlist); - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); + peh_send_failure_event (dev, state, rets[2]); /* Most EEH events are due to device driver bugs. Having * a stack trace will help the device-driver authors figure * out what happened. So print that out. */ - dump_stack(); - schedule_work(&eeh_event_wq); + if (rets[0] != 5) dump_stack(); return 0; } @@ -635,7 +608,6 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * @token i/o token, should be address in the form 0xA.... * @val value, should be all 1's (XXX why do we need this arg??) 
* - * Check for an eeh failure at the given token address. * Check for an EEH failure at the given token address. Call this * routine if the result of a read was all 0xff's and you want to * find out if this is due to an EEH slot freeze event. This routine @@ -643,6 +615,7 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * * Note this routine is safe to call in an interrupt context. */ + unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val) { unsigned long addr; @@ -652,8 +625,10 @@ unsigned long eeh_check_failure(const vo /* Finding the phys addr + pci device; this is pretty quick. */ addr = eeh_token_to_phys((unsigned long __force) token); dev = pci_get_device_by_addr(addr); - if (!dev) + if (!dev) { + __get_cpu_var(no_device)++; return val; + } dn = pci_device_to_OF_node(dev); eeh_dn_check_failure (dn, dev); @@ -664,6 +639,234 @@ unsigned long eeh_check_failure(const vo EXPORT_SYMBOL(eeh_check_failure); +/* ------------------------------------------------------------- */ +/* The code below deals with error recovery */ + +int +eeh_slot_is_isolated(struct pci_dev *dev) +{ + struct device_node *dn; + dn = pci_device_to_OF_node(dev); + return (dn->eeh_mode & EEH_MODE_ISOLATED); +} +EXPORT_SYMBOL(eeh_slot_is_isolated); + +int +eeh_ioaddr_is_isolated(const volatile void __iomem *token) +{ + unsigned long addr; + struct pci_dev *dev; + int rc; + + addr = eeh_token_to_phys((unsigned long __force) token); + dev = pci_get_device_by_addr(addr); + if (!dev) + return 0; + rc = eeh_slot_is_isolated(dev); + pci_dev_put(dev); + return rc; +} + +/** eeh_pci_slot_reset -- raises/lowers the pci #RST line + * state: 1/0 to raise/lower the #RST + */ +void +eeh_pci_slot_reset(struct pci_dev *dev, int state) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + rtas_pci_slot_reset (dn, state); +} + +/** Return negative value if a permanent error, else return + * a number of milliseconds to wait until the PCI slot is + * ready to be used. 
+ */ +static int +eeh_slot_availability(struct device_node *dn) +{ + int rc; + int rets[3]; + + rc = read_slot_reset_state(dn, rets); + + if (rc) return rc; + + if (rets[1] == 0) return -1; /* EEH is not supported */ + if (rets[0] == 0) return 0; /* Oll Korrect */ + if (rets[0] == 5) { + if (rets[2] == 0) return -1; /* permanently unavailable */ + return rets[2]; /* number of millisecs to wait */ + } + return -1; +} + +int +eeh_pci_slot_availability(struct pci_dev *dev) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + if (!dn) return -1; + + BUG_ON (dn->phb==NULL); + if (dn->phb==NULL) { + printk (KERN_ERR "EEH, checking on slot with no phb dn=%s dev=%s:%s\n", + dn->full_name, pci_name(dev), pci_pretty_name (dev)); + return -1; + } + return eeh_slot_availability (dn); +} + +void +rtas_pci_slot_reset(struct device_node *dn, int state) +{ + int rc; + + if (!dn) + return; + if (!dn->phb) { + printk (KERN_WARNING "EEH: in slot reset, device node %s has no phb\n", dn->full_name); + return; + } + + dn->eeh_mode |= EEH_MODE_RECOVERING; + rc = rtas_call(ibm_set_slot_reset,4,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), + state); + if (rc) { + printk (KERN_WARNING "EEH: Unable to reset the failed slot, (%d) #RST=%d\n", rc, state); + return; + } + + if (state == 0) + dn->eeh_mode &= ~(EEH_MODE_RECOVERING|EEH_MODE_ISOLATED); +} + +/** rtas_set_slot_reset -- assert the pci #RST line for 1/4 second + * dn -- device node to be reset. + */ + +void +rtas_set_slot_reset(struct device_node *dn) +{ + int i, rc; + + rtas_pci_slot_reset (dn, 1); + + /* The PCI bus requires that the reset be held high for at least + * a 100 milliseconds. We wait a bit longer 'just in case'. */ + +#define PCI_BUS_RST_HOLD_TIME_MSEC 250 + msleep (PCI_BUS_RST_HOLD_TIME_MSEC); + rtas_pci_slot_reset (dn, 0); + + /* After a PCI slot has been reset, the PCI Express spec requires + * a 1.5 second idle time for the bus to stabilize, before starting + * up traffic. 
*/ +#define PCI_BUS_SETTLE_TIME_MSEC 1800 + msleep (PCI_BUS_SETTLE_TIME_MSEC); + + /* Now double check with the firmware to make sure the device is + * ready to be used; if not, wait for recovery. */ + for (i=0; i<10; i++) { + rc = eeh_slot_availability (dn); + if (rc <= 0) break; + + msleep (rc+100); + } +} + +EXPORT_SYMBOL(rtas_set_slot_reset); + +void +rtas_configure_bridge(struct device_node *dn) +{ + int token = rtas_token ("ibm,configure-bridge"); + int rc; + + if (token == RTAS_UNKNOWN_SERVICE) + return; + rc = rtas_call(token,3,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid)); + if (rc) { + printk (KERN_WARNING "EEH: Unable to configure device bridge (%d) for %s\n", + rc, dn->full_name); + } +} + +EXPORT_SYMBOL(rtas_configure_bridge); + +/* ------------------------------------------------------- */ +/** Save and restore of PCI BARs + * + * Although firmware will set up BARs during boot, it doesn't + * set up device BAR's after a device reset, although it will, + * if requested, set up bridge configuration. Thus, we need to + * configure the PCI devices ourselves. Config-space setup is + * stored in the PCI structures which are normally deleted during + * device removal. Thus, the "save" routine references the + * structures so that they aren't deleted. + */ + +/** + * __restore_bars - Restore the Base Address Registers + * Loads the PCI configuration space base address registers, + * the expansion ROM base address, the latency timer, and etc. + * from the saved values in the device node. 
+ */ +static inline void __restore_bars (struct device_node *dn) +{ + int i; + + if (NULL==dn->phb) return; + for (i=4; i<10; i++) { + rtas_write_config(dn, i*4, 4, dn->config_space[i]); + } + + /* 12 == Expansion ROM Address */ + rtas_write_config(dn, 12*4, 4, dn->config_space[12]); + +#define BYTE_SWAP(OFF) (8*((OFF)/4)+3-(OFF)) +#define SAVED_BYTE(OFF) (((u8 *)(dn->config_space))[BYTE_SWAP(OFF)]) + + rtas_write_config (dn, PCI_CACHE_LINE_SIZE, 1, + SAVED_BYTE(PCI_CACHE_LINE_SIZE)); + + rtas_write_config (dn, PCI_LATENCY_TIMER, 1, + SAVED_BYTE(PCI_LATENCY_TIMER)); + + /* max latency, min grant, interrupt pin and line */ + rtas_write_config(dn, 15*4, 4, dn->config_space[15]); +} + +/** + * eeh_restore_bars - restore the PCI config space info + */ +void eeh_restore_bars(struct device_node *dn) +{ + if (! dn->eeh_is_bridge) + __restore_bars (dn); + + if (dn->child) + eeh_restore_bars (dn->child); +} + +void eeh_pci_restore_bars(struct pci_dev *dev) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + eeh_restore_bars (dn); +} + +/* ------------------------------------------------------------- */ +/* The code below deals with enabling EEH for devices during the + * early boot sequence. EEH must be enabled before any PCI probing + * can be done. 
+ */ + +#define EEH_ENABLE 1 + struct eeh_early_enable_info { unsigned int buid_hi; unsigned int buid_lo; @@ -682,6 +885,8 @@ static void *early_enable_eeh(struct dev int enable; dn->eeh_mode = 0; + dn->eeh_check_count = 0; + dn->eeh_freeze_count = 0; if (status && strcmp(status, "ok") != 0) return NULL; /* ignore devices with bad status */ @@ -743,7 +948,7 @@ static void *early_enable_eeh(struct dev dn->full_name); } - return NULL; + return NULL; } /* @@ -824,11 +1029,13 @@ void eeh_add_device_early(struct device_ struct pci_controller *phb; struct eeh_early_enable_info info; - if (!dn || !eeh_subsystem_enabled) + if (!dn) return; phb = dn->phb; if (NULL == phb || 0 == phb->buid) { - printk(KERN_WARNING "EEH: Expected buid but found none\n"); + printk(KERN_WARNING "EEH: Expected buid but found none for %s\n", + dn->full_name); + dump_stack(); return; } @@ -847,6 +1054,9 @@ EXPORT_SYMBOL(eeh_add_device_early); */ void eeh_add_device_late(struct pci_dev *dev) { + int i; + struct device_node *dn; + if (!dev || !eeh_subsystem_enabled) return; @@ -856,6 +1066,14 @@ void eeh_add_device_late(struct pci_dev #endif pci_addr_cache_insert_device (dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); + + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + dn->eeh_is_bridge = 1; } EXPORT_SYMBOL(eeh_add_device_late); @@ -885,12 +1103,17 @@ static int proc_eeh_show(struct seq_file unsigned int cpu; unsigned long ffs = 0, positives = 0, failures = 0; unsigned long resets = 0; + unsigned long no_dev = 0, no_dn = 0, no_cfg = 0, no_check = 0; for_each_cpu(cpu) { ffs += per_cpu(total_mmio_ffs, cpu); positives += per_cpu(false_positives, cpu); failures += per_cpu(ignored_failures, cpu); resets += per_cpu(slot_resets, cpu); + no_dev += per_cpu(no_device, cpu); + no_dn += per_cpu(no_dn, cpu); + no_cfg += per_cpu(no_cfg_addr, cpu); + no_check += 
per_cpu(ignored_check, cpu); } if (0 == eeh_subsystem_enabled) { @@ -898,13 +1121,17 @@ static int proc_eeh_show(struct seq_file seq_printf(m, "eeh_total_mmio_ffs=%ld\n", ffs); } else { seq_printf(m, "EEH Subsystem is enabled\n"); - seq_printf(m, "eeh_total_mmio_ffs=%ld\n" + seq_printf(m, + "no device=%ld\n" + "no device node=%ld\n" + "no config address=%ld\n" + "check not wanted=%ld\n" + "eeh_total_mmio_ffs=%ld\n" "eeh_false_positives=%ld\n" "eeh_ignored_failures=%ld\n" - "eeh_slot_resets=%ld\n" - "eeh_fail_count=%d\n", - ffs, positives, failures, resets, - eeh_fail_count.counter); + "eeh_slot_resets=%ld\n", + no_dev, no_dn, no_cfg, no_check, + ffs, positives, failures, resets); } return 0; --- arch/ppc64/kernel/pSeries_pci.c.linas-orig 2005-04-29 20:33:03.000000000 -0500 +++ arch/ppc64/kernel/pSeries_pci.c 2005-05-06 12:28:43.000000000 -0500 @@ -52,7 +52,7 @@ static int s7a_workaround; extern struct mpic *pSeries_mpic; -static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) +int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) { int returnval = -1; unsigned long buid, addr; @@ -101,7 +101,7 @@ static int rtas_pci_read_config(struct p return PCIBIOS_DEVICE_NOT_FOUND; } -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +int rtas_write_config(struct device_node *dn, int where, int size, u32 val) { unsigned long buid, addr; int ret; --- drivers/pci/hotplug/rpaphp.h.linas-orig 2005-04-29 20:26:21.000000000 -0500 +++ drivers/pci/hotplug/rpaphp.h 2005-05-06 12:28:43.000000000 -0500 @@ -118,7 +118,8 @@ extern int rpaphp_enable_pci_slot(struct extern int register_pci_slot(struct slot *slot); extern int rpaphp_unconfig_pci_adapter(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); -extern struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev); +extern void init_eeh_handler (void); +extern void exit_eeh_handler (void); /* 
rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); --- drivers/pci/hotplug/rpaphp_core.c.linas-orig 2005-04-29 20:32:16.000000000 -0500 +++ drivers/pci/hotplug/rpaphp_core.c 2005-05-06 12:28:43.000000000 -0500 @@ -460,12 +460,18 @@ static int __init rpaphp_init(void) { info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); + /* Get set to handle EEH events. */ + init_eeh_handler(); + /* read all the PRA info from the system */ return init_rpa(); } static void __exit rpaphp_exit(void) { + /* Let EEH know we are going away. */ + exit_eeh_handler(); + cleanup_slots(); } --- drivers/pci/hotplug/rpaphp_pci.c.linas-orig 2005-04-29 20:22:38.000000000 -0500 +++ drivers/pci/hotplug/rpaphp_pci.c 2005-05-16 11:59:30.000000000 -0500 @@ -24,6 +24,7 @@ */ #include #include +#include #include #include #include "../pci.h" /* for pci_add_new_bus */ @@ -63,6 +64,7 @@ int rpaphp_claim_resource(struct pci_dev root ? "Address space collision on" : "No parent found for", resource, dtype, pci_name(dev), res->start, res->end); + dump_stack(); } return err; } @@ -188,6 +190,19 @@ rpaphp_fixup_new_pci_devices(struct pci_ static int rpaphp_pci_config_bridge(struct pci_dev *dev); +static void rpaphp_eeh_add_bus_device(struct pci_bus *bus) +{ + struct pci_dev *dev; + list_for_each_entry(dev, &bus->devices, bus_list) { + eeh_add_device_late(dev); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + struct pci_bus *subbus = dev->subordinate; + if (subbus) + rpaphp_eeh_add_bus_device (subbus); + } + } +} + /***************************************************************************** rpaphp_pci_config_slot() will configure all devices under the given slot->dn and return the the first pci_dev.
@@ -215,6 +230,8 @@ rpaphp_pci_config_slot(struct device_nod } if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) rpaphp_pci_config_bridge(dev); + + rpaphp_eeh_add_bus_device(bus); } return dev; } @@ -223,7 +240,6 @@ static int rpaphp_pci_config_bridge(stru { u8 sec_busno; struct pci_bus *child_bus; - struct pci_dev *child_dev; dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev)); @@ -240,11 +256,7 @@ static int rpaphp_pci_config_bridge(stru /* do pci_scan_child_bus */ pci_scan_child_bus(child_bus); - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - - /* fixup new pci devices without touching bus struct */ + /* Fixup new pci devices without touching bus struct */ rpaphp_fixup_new_pci_devices(child_bus, 0); /* Make the discovered devices available */ @@ -282,7 +294,7 @@ static void print_slot_pci_funcs(struct return; } #else -static void print_slot_pci_funcs(struct slot *slot) +static inline void print_slot_pci_funcs(struct slot *slot) { return; } @@ -364,7 +376,6 @@ static void rpaphp_eeh_remove_bus_device if (pdev) rpaphp_eeh_remove_bus_device(pdev); } - } return; } @@ -566,36 +577,3 @@ exit: return retval; } -struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev) -{ - struct list_head *tmp, *n; - struct slot *slot; - - list_for_each_safe(tmp, n, &rpaphp_slot_head) { - struct pci_bus *bus; - struct list_head *ln; - - slot = list_entry(tmp, struct slot, rpaphp_slot_list); - if (slot->bridge == NULL) { - if (slot->dev_type == PCI_DEV) { - printk(KERN_WARNING "PCI slot missing bridge %s %s \n", - slot->name, slot->location); - } - continue; - } - - bus = slot->bridge->subordinate; - if (!bus) { - continue; /* should never happen? 
*/ - } - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev == dev) - return slot->hotplug_slot; - } - } - - return NULL; -} - -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); --- drivers/pci/hotplug/rpaphp_eeh.c.linas-orig 2005-05-16 11:52:15.000000000 -0500 +++ drivers/pci/hotplug/rpaphp_eeh.c 2005-05-31 11:20:06.000000000 -0500 @@ -0,0 +1,354 @@ +/* + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform. + * Copyright (C) 2004, 2005 Linas Vepstas + * + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or (at + * your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + * + * Send feedback to + * + */ +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../pci.h" +#include "rpaphp.h" + +/** + * pci_search_bus_for_dev - return 1 if device is under this bus, else 0 + * @bus: the bus to search for this device. + * @dev: the pci device we are looking for. + * + * XXX should this be moved to drivers/pci/search.c ? 
+ */ +static int pci_search_bus_for_dev (struct pci_bus *bus, struct pci_dev *dev) +{ + struct list_head *ln; + + if (!bus) return 0; + + for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { + struct pci_dev *pdev = pci_dev_b(ln); + if (pdev == dev) + return 1; + if (pdev->subordinate) { + int rc; + rc = pci_search_bus_for_dev (pdev->subordinate, dev); + if (rc) + return 1; + } + } + return 0; +} + +/** pci_walk_bus - walk bus under this device, calling callback. + * @top device whose peers should be walked + * @cb callback to be called for each device found + * @userdata arbitrary pointer to be passed to callback. + * + * Walk the bus on which this device sits, including any + * bridged devices on busses under this bus. Call the provided + * callback on each device found. + */ +typedef void (*pci_buswalk_cb)(struct pci_dev *, void *); + +static void +pci_walk_bus (struct pci_dev *top, pci_buswalk_cb cb, void *userdata) +{ + struct pci_dev *dev, *tmp; + + spin_lock(&pci_bus_lock); + list_for_each_entry_safe (dev, tmp, &top->bus->devices, bus_list) { + pci_dev_get(dev); + spin_unlock(&pci_bus_lock); + + /* run device routines with the bus unlocked */ + cb (dev, userdata); + if (dev->subordinate) { + pci_walk_bus (pci_dev_b(&dev->subordinate->devices), cb, userdata); + } + spin_lock(&pci_bus_lock); + pci_dev_put(dev); + } + spin_unlock(&pci_bus_lock); +} + +/** + * rpaphp_find_slot - find and return the slot holding the device + * @dev: pci device for which we want the slot structure. + */ +static struct slot *rpaphp_find_slot(struct pci_dev *dev) +{ + struct list_head *tmp, *n; + struct slot *slot; + + list_for_each_safe(tmp, n, &rpaphp_slot_head) { + struct pci_bus *bus; + + slot = list_entry(tmp, struct slot, rpaphp_slot_list); + + /* PHB's don't have bridges. */ + if (slot->bridge == NULL) + continue; + + /* The PCI device could be the slot itself. 
*/ + if (slot->bridge == dev) + return slot; + + bus = slot->bridge->subordinate; + if (!bus) { + printk (KERN_WARNING "PCI bridge is missing bus: %s %s\n", + pci_name (slot->bridge), pci_pretty_name (slot->bridge)); + continue; /* should never happen? */ + } + + if (pci_search_bus_for_dev (bus, dev)) + return slot; + } + return NULL; +} + +/* ------------------------------------------------------- */ +/** eeh_report_error - report an EEH error to each device, + * collect up and merge the device responses. + */ + +static void eeh_report_error(struct pci_dev *dev, void *userdata) +{ + enum pcierr_result rc, *res = userdata; + + if (dev->driver->err_handler.error_detected) { + rc = dev->driver->err_handler.error_detected (dev, pci_channel_io_frozen); + if (*res == PCIERR_RESULT_NONE) *res = rc; + if (*res == PCIERR_RESULT_NEED_RESET) return; + if (*res == PCIERR_RESULT_DISCONNECT && + rc == PCIERR_RESULT_NEED_RESET) *res = rc; + } +} + +/** eeh_report_reset -- tell this device that the pci slot + * has been reset. + */ + +static void eeh_report_reset(struct pci_dev *dev, void *userdata) +{ + if (dev->driver->err_handler.slot_reset) + dev->driver->err_handler.slot_reset (dev); +} + +static void eeh_report_resume(struct pci_dev *dev, void *userdata) +{ + if (dev->driver->err_handler.resume) + dev->driver->err_handler.resume (dev); +} + +static void eeh_report_failure(struct pci_dev *dev, void *userdata) +{ + if (dev->driver->err_handler.error_detected) + dev->driver->err_handler.error_detected (dev, pci_channel_io_perm_failure); +} + +/* ------------------------------------------------------- */ +/** + * handle_eeh_events -- reset a PCI device after hard lockup. + * + * pSeries systems will isolate a PCI slot if the PCI-Host + * bridge detects address or data parity errors, DMA's + * occurring to wild addresses (which usually happen due to + * bugs in device drivers or in PCI adapter firmware).
+ * Slot isolations also occur if #SERR, #PERR or other misc + * PCI-related errors are detected. + * + * Recovery process consists of unplugging the device driver + * (which generated hotplug events to userspace), then issuing + * a PCI #RST to the device, then reconfiguring the PCI config + * space for all bridges & devices under this slot, and then + * finally restarting the device drivers (which cause a second + * set of hotplug events to go out to userspace). + */ + +int eeh_reset_device (struct pci_dev *dev, struct device_node *dn, int reconfig) +{ + struct slot *frozen_slot= NULL; + + if (!dev) + return 1; + + if (reconfig) + frozen_slot = rpaphp_find_slot(dev); + + if (reconfig && frozen_slot) rpaphp_unconfig_pci_adapter (frozen_slot); + + /* Reset the pci controller. (Asserts RST#; resets config space). + * Reconfigure bridges and devices */ + rtas_set_slot_reset (dn->child); + rtas_configure_bridge(dn); + eeh_restore_bars(dn->child); + + enable_irq (dev->irq); + + /* Give the system 5 seconds to finish running the user-space + * hotplug scripts, e.g. ifdown for ethernet. Yes, this is a hack, + * but if we don't do this, weird things happen. + */ + if (reconfig && frozen_slot) { + ssleep (5); + rpaphp_enable_pci_slot (frozen_slot); + } + return 0; +} + +/* The longest amount of time to wait for a pci device + * to come back on line, in seconds. 
+ */ +#define MAX_WAIT_FOR_RECOVERY 15 + +int handle_eeh_events (struct notifier_block *self, + unsigned long reason, void *ev) +{ + int freeze_count=0; + struct device_node *frozen_device; + struct peh_event *event = ev; + struct pci_dev *dev = event->dev; + int perm_failure = 0; + + if (!dev) + { + printk ("EEH: EEH error caught, but no PCI device specified!\n"); + return 1; + } + + frozen_device = pci_bus_to_OF_node(dev->bus); + if (!frozen_device) + { + printk (KERN_ERR "EEH: Cannot find PCI controller for %s %s\n", + pci_name(dev), pci_pretty_name (dev)); + + return 1; + } + BUG_ON (frozen_device->phb==NULL); + + /* We get "permanent failure" messages on empty slots. + * These are false alarms. Empty slots have no child dn. */ + if ((event->state == pci_channel_io_perm_failure) && (frozen_device == NULL)) + return 0; + + if (frozen_device) + freeze_count = frozen_device->eeh_freeze_count; + freeze_count ++; + if (freeze_count > EEH_MAX_ALLOWED_FREEZES) + perm_failure = 1; + + /* If the reset state is a '5' and the time to reset is 0 (infinity) + * or is more than 15 seconds, then mark this as a permanent failure. + */ + if ((event->state == pci_channel_io_perm_failure) && + ((event->time_unavail <= 0) || + (event->time_unavail > MAX_WAIT_FOR_RECOVERY*1000))) + perm_failure = 1; + + /* Log the error with the rtas logger. */ + if (perm_failure) { + /* + * About 90% of all real-life EEH failures in the field + * are due to poorly seated PCI cards. Only 10% or so are + * due to actual, failed cards. + */ + printk (KERN_ERR + "EEH: device %s:%s has failed %d times \n" + "and has been permanently disabled. Please try reseating\n" + "this device or replacing it.\n", + pci_name (dev), + pci_pretty_name (dev), + freeze_count); + + eeh_slot_error_detail (frozen_device, 2 /* Permanent Error */); + + /* Notify all devices that they're about to go down.
*/ + pci_walk_bus (dev, eeh_report_failure, 0); + + /* If there's a hotplug slot, unconfigure it */ + // XXX we need alternate way to deconfigure non-hotplug slots. + struct slot * frozen_slot = rpaphp_find_slot(dev); + if (frozen_slot) + rpaphp_unconfig_pci_adapter (frozen_slot); + return 1; + } else { + eeh_slot_error_detail (frozen_device, 1 /* Temporary Error */); + } + + printk (KERN_WARNING + "EEH: This device has failed %d times since last reboot: %s:%s\n", + freeze_count, + pci_name (dev), + pci_pretty_name (dev)); + + /* Walk the various device drivers attached to this slot, + * letting each know about the EEH bug. + */ + enum pcierr_result result = PCIERR_RESULT_NONE; + pci_walk_bus (dev, eeh_report_error, &result); + + /* If all device drivers were EEH-unaware, then pci hotplug + * the device, and hope that clears the error. */ + if (result == PCIERR_RESULT_NONE) { + eeh_reset_device (dev, frozen_device, 1); + } + + /* If any device called out for a reset, then reset the slot */ + if (result == PCIERR_RESULT_NEED_RESET) { + eeh_reset_device (dev, frozen_device, 0); + pci_walk_bus (dev, eeh_report_reset, 0); + } + + /* If all devices reported they can proceed, then re-enable PIO */ + if (result == PCIERR_RESULT_CAN_RECOVER) { + /* XXX Not supported; we brute-force reset the device */ + eeh_reset_device (dev, frozen_device, 0); + pci_walk_bus (dev, eeh_report_reset, 0); + } + + /* Tell all device drivers that they can resume operations */ + pci_walk_bus (dev, eeh_report_resume, 0); + + /* Store the freeze count with the pci adapter, and not the slot. + * This way, if the device is replaced, the count is cleared.
+ */ + frozen_device->eeh_freeze_count = freeze_count; + + return 1; +} + +static struct notifier_block eeh_block; + +void __init init_eeh_handler (void) +{ + eeh_block.notifier_call = handle_eeh_events; + peh_register_notifier (&eeh_block); +} + +void __exit exit_eeh_handler (void) +{ + peh_unregister_notifier (&eeh_block); +} + --- drivers/pci/hotplug/Makefile.linas-orig 2005-04-29 20:29:50.000000000 -0500 +++ drivers/pci/hotplug/Makefile 2005-05-16 11:53:52.000000000 -0500 @@ -41,6 +41,7 @@ acpiphp-objs := acpiphp_core.o \ acpiphp_res.o rpaphp-objs := rpaphp_core.o \ + rpaphp_eeh.o \ rpaphp_pci.o \ rpaphp_slot.o \ rpaphp_vio.o From johnrose at austin.ibm.com Wed Jun 1 06:52:07 2005 From: johnrose at austin.ibm.com (John Rose) Date: Tue, 31 May 2005 15:52:07 -0500 Subject: [PATCH]: PCI Error Recovery Implementation In-Reply-To: <20050531203028.GD31199@austin.ibm.com> References: <20050531203028.GD31199@austin.ibm.com> Message-ID: <1117572727.7775.11.camel@sinatra.austin.ibm.com> Hi Linas/Greg/Everyone- +int handle_eeh_events (struct notifier_block *self, + unsigned long reason, void *ev) At the risk of sounding like a broken record, I don't think that this belongs in the RPA PCI Hotplug driver. This bit of code _uses_ PCI Hotplug rather than implementing it, and thus stands out from the rest of the module. Not to mention that it uses EEH-specific stuff that doesn't belong here. I'm in the midst of reducing the codebase of this module significantly, and this adds more unrelated stuff to an already cluttered module. I think the PCI hotplug driver could register enable/disable functions with eeh.c, and that the handle_events() code should reside there. If someone has a good technical explanation of why this is bad, please chime in. 
Thanks- John From raffi at raffi.at Wed Jun 1 07:26:38 2005 From: raffi at raffi.at (Raffael Himmelreich) Date: Tue, 31 May 2005 23:26:38 +0200 Subject: SCSI timeouts (was: Re: RS/6000 7017-S7A hangs on boot) In-Reply-To: <20050522205427.GE20174@krispykreme> References: <20050522201816.GA8254@exception.at> <20050522205427.GE20174@krispykreme> Message-ID: <20050531212638.GA11249@exception.at> Hi, thanks for your reply and sorry for my delay, but my access to the machine is rather limited. Anton Blanchard wrote: > Is xmon on? You might get some > more info on the oops if xmon is turned off (it looks like it hung). Heya. Seems like this was the point. But now I am facing strange SCSI timeouts. Any ideas? (these SCSI devices work well under AIX) > > cpu 0x1: Vector: 300 (Data Access) at [c00000003ff87b70] > > pc: c00000000002de3c > Can you look up the pc in your System.map? Hum, I recompiled the kernel as you suggested and didn't backup the System.map. But I can remember that this address didn't point to any beginning symbol address. If you are interested in this issue I will rebuild the kernel. best regards, raffi Boot log goes here: zImage starting: loaded at 0x400000 Allocating 0x80a000 bytes for kernel ... gunzipping (0x1c00000 <- 0x407000:0x695dc7)...done 0x6b5bd8 bytes 0xde98 bytes of heap consumed, max in use 0xa294 OF stdout device is: /pci@f8400000/isa@f/serial@i3f8 command line: root=/dev/fd0 memory layout at init: memory_limit : 0000000000000000 (16 MB aligned) alloc_bottom : 000000000231e000 alloc_top : 0000000040000000 alloc_top_hi : 0000000140000000 rmo_top : 0000000040000000 ram_top : 0000000140000000 Looking for displays opening PHB /pci@f8400000... done opening PHB /pci@f8500000... done opening PHB /pci@f8600000... done opening PHB /pci@f8700000... done instantiating rtas at 0x000000003ffd7000... done 0000000000000001 : starting cpu hw idx 0000000000000001... done 0000000000000002 : starting cpu hw idx 0000000000000002...
done 0000000000000003 : starting cpu hw idx 0000000000000003... done WARNING: maximum CPUs (1) exceeded: ignoring extras copying OF device tree ... Building dt strings... Building dt structure... Device tree strings 0x000000000241f000 -> 0x000000000241fd7c Device tree struct 0x0000000002420000 -> 0x0000000002429000 Calling quiesce ... returning from prom_init firmware_features = 0x0 Starting Linux PPC64 2.6.12-rc5 ----------------------------------------------------- ppc64_pft_size = 0x1a ppc64_debug_switch = 0x0 ppc64_interrupt_controller = 0x1 systemcfg = 0xc0000000004d0000 systemcfg->platform = 0x100 systemcfg->processorCount = 0x0 systemcfg->physicalMemorySize = 0xc0000000 ppc64_caches.dcache_line_size = 0x80 ppc64_caches.icache_line_size = 0x80 htab_address = 0xc000000138000000 htab_hash_mask = 0x7ffff ----------------------------------------------------- [boot]0100 MM Init IO Hole assumed to be 80000000 -> ffffffff [boot]0100 MM Init Done Linux version 2.6.12-rc5 (root@localhost.localdomain) (gcc version 3.4.3) #2 Wed May 25 14:20:01 CEST 2005 [boot]0012 Setup Arch Syscall map setup, 236 32 bits and 212 64 bits syscalls mpic: Setting up MPIC " MPIC " version at ffc00000, max 1 CPUs mpic: ISU size: 16, shift: 4, mask: f No ramdisk, default root is /dev/sda2 Python workaround: reg0: 218e3b88 Python workaround: reg0: 218e3b88 Python workaround: reg0: 218e3b88 Python workaround: reg0: 218e3b88 PPC64 nvram contains 122880 bytes Using default idle loop Top of RAM: 0x140000000, Total RAM: 0xc0000000 Memory hole size: 2048MB [boot]0015 Setup Done Built 1 zonelists Kernel command line: root=/dev/fd0 mpic: Initializing for 64 sources PID hash table entries: 4096 (order: 12, 131072 bytes) time_init: decrementer frequency = 262.731521 MHz time_init: processor frequency = 251.781200 MHz firmware_features = 0x0 Starting Linux PPC64 2.6.12-rc5 ----------------------------------------------------- ppc64_pft_size = 0x1a ppc64_debug_switch = 0x0 ppc64_interrupt_controller =
0x1 systemcfg = 0xc0000000004d0000 systemcfg->platform = 0x100 systemcfg->processorCount = 0x0 systemcfg->physicalMemorySize = 0xc0000000 ppc64_caches.dcache_line_size = 0x80 ppc64_caches.icache_line_size = 0x80 htab_address = 0xc000000138000000 htab_hash_mask = 0x7ffff ----------------------------------------------------- [boot]0100 MM Init IO Hole assumed to be 80000000 -> ffffffff [boot]0100 MM Init Done Linux version 2.6.12-rc5 (root@localhost.localdomain) (gcc version 3.4.3) #2 Wed May 25 14:20:01 CEST 2005 [boot]0012 Setup Arch Syscall map setup, 236 32 bits and 212 64 bits syscalls mpic: Setting up MPIC " MPIC " version at ffc00000, max 1 CPUs mpic: ISU size: 16, shift: 4, mask: f No ramdisk, default root is /dev/sda2 Python workaround: reg0: 218e3b88 Python workaround: reg0: 218e3b88 Python workaround: reg0: 218e3b88 Python workaround: reg0: 218e3b88 PPC64 nvram contains 122880 bytes Using default idle loop Top of RAM: 0x140000000, Total RAM: 0xc0000000 Memory hole size: 2048MB [boot]0015 Setup Done Built 1 zonelists Kernel command line: root=/dev/fd0 mpic: Initializing for 64 sources PID hash table entries: 4096 (order: 12, 131072 bytes) time_init: decrementer frequency = 262.731521 MHz time_init: processor frequency = 251.781200 MHz Console: colour dummy device 80x25 Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes) Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes) Memory: 2961152k/5242880k available (4040k kernel code, 2281068k reserved, 1880k data, 415k bss, 316k init) Mount-cache hash table entries: 256 NET: Registered protocol family 16 PCI: Probing PCI hardware IOMMU table initialized, virtual merging enabled mapping IO e0000000 -> e000000000000000, size: 2000000 mapping IO e2000000 -> e000000002000000, size: 800000 mapping IO e2800000 -> e000000002800000, size: 800000 mapping IO e3000000 -> e000000003000000, size: 800000 ISA bridge at 0000:00:0f.0 PCI: Probing PCI hardware done SCSI subsystem initialized usbcore:
registered new driver usbfs usbcore: registered new driver hub i/pSeries Real Time Clock Driver v1.1 RTAS daemon started audit: initializing netlink socket (disabled) audit(343277964121.727:0): initialized Total HugeTLB memory allocated, 0 Initializing Cryptographic API HVSI: registered 0 devices Initializing IBM hvcs (Hypervisor Virtual Console Server) Driver HVCS: driver module inserted. serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered Floppy drive(s): fd0 is 2.88M FDC 0 is a National Semiconductor PC87306 RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize loop: loaded (max 8 devices) Intel(R) PRO/1000 Network Driver - version 5.7.6-k2 Copyright (c) 1999-2004 Intel Corporation. pcnet32.c:v1.30i 06.28.2004 tsbogend@alpha.franken.de pcnet32: PCnet/PCI II 79C970A at 0x1fff400, warning: CSR address invalid, using instead PROM address of 02 07 01 23 fc e8 assigned IRQ 22. eth0: registered as PCnet/PCI II 79C970A pcnet32: PCnet/FAST 79C971 at 0x27ff400, 00 20 35 35 74 a5 tx_start_pt(0x0c00):~220 bytes, BCR18(68e2):BurstWrEn BurstRdEn DWordIO NoUFlow SRAMSIZE=0x7f00, SRAM_BND=0x4000, assigned IRQ 38. eth1: registered as PCnet/FAST 79C971 pcnet32: PCnet/FAST 79C971 at 0x2ffec00, warning: CSR address invalid, using instead PROM address of 00 04 ac de 92 54 tx_start_pt(0x0c00):~220 bytes, BCR18(6861):BurstWrEn BurstRdEn NoUFlow SRAMSIZE=0x7f00, SRAM_BND=0x4000, assigned IRQ 54. eth2: registered as PCnet/FAST 79C971 pcnet32: 3 cards_found.
e100: Intel(R) PRO/100 Network Driver, 3.3.6-k2-NAPI e100: Copyright(c) 1999-2004 Intel Corporation drivers/net/ibmveth.c: ibmveth: IBM i/pSeries Virtual Ethernet Driver 1.03 netconsole: not configured, aborting Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx sym0: <825a> rev 0x13 at pci 0000:00:0b.0 irq 25 sym0: No NVRAM, ID 7, Fast-10, SE, parity checking sym0: SCSI BUS has been reset. scsi0 : sym-2.2.0 0:0:0:0: ABORT operation started. 0:0:0:0: ABORT operation timed-out. 0:0:0:0: DEVICE RESET operation started. 0:0:0:0: DEVICE RESET operation timed-out. 0:0:0:0: BUS RESET operation started. 0:0:0:0: BUS RESET operation timed-out. 0:0:0:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:0:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0 0:0:1:0: ABORT operation started. 0:0:1:0: ABORT operation timed-out. 0:0:1:0: DEVICE RESET operation started. 0:0:1:0: DEVICE RESET operation timed-out. 0:0:1:0: BUS RESET operation started. 0:0:1:0: BUS RESET operation timed-out. 0:0:1:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:1:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 1 lun 0 0:0:2:0: ABORT operation started. 0:0:2:0: ABORT operation timed-out. 0:0:2:0: DEVICE RESET operation started. 0:0:2:0: DEVICE RESET operation timed-out. 0:0:2:0: BUS RESET operation started. 0:0:2:0: BUS RESET operation timed-out. 0:0:2:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:2:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 2 lun 0 0:0:3:0: ABORT operation started. 0:0:3:0: ABORT operation timed-out. 0:0:3:0: DEVICE RESET operation started. 0:0:3:0: DEVICE RESET operation timed-out. 0:0:3:0: BUS RESET operation started. 
0:0:3:0: BUS RESET operation timed-out. 0:0:3:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:3:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 3 lun 0 0:0:4:0: ABORT operation started. 0:0:4:0: ABORT operation timed-out. 0:0:4:0: DEVICE RESET operation started. 0:0:4:0: DEVICE RESET operation timed-out. 0:0:4:0: BUS RESET operation started. 0:0:4:0: BUS RESET operation timed-out. 0:0:4:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:4:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 4 lun 0 0:0:5:0: ABORT operation started. 0:0:5:0: ABORT operation timed-out. 0:0:5:0: DEVICE RESET operation started. 0:0:5:0: DEVICE RESET operation timed-out. 0:0:5:0: BUS RESET operation started. 0:0:5:0: BUS RESET operation timed-out. 0:0:5:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:5:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 5 lun 0 0:0:6:0: ABORT operation started. 0:0:6:0: ABORT operation timed-out. 0:0:6:0: DEVICE RESET operation started. 0:0:6:0: DEVICE RESET operation timed-out. 0:0:6:0: BUS RESET operation started. 0:0:6:0: BUS RESET operation timed-out. 0:0:6:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:6:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 6 lun 0 0:0:8:0: ABORT operation started. 0:0:8:0: ABORT operation timed-out. 0:0:8:0: DEVICE RESET operation started. 0:0:8:0: DEVICE RESET operation timed-out. 0:0:8:0: BUS RESET operation started. 0:0:8:0: BUS RESET operation timed-out. 0:0:8:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:8:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 8 lun 0 0:0:9:0: ABORT operation started. 
0:0:9:0: ABORT operation timed-out. 0:0:9:0: DEVICE RESET operation started. 0:0:9:0: DEVICE RESET operation timed-out. 0:0:9:0: BUS RESET operation started. 0:0:9:0: BUS RESET operation timed-out. 0:0:9:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:9:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 9 lun 0 0:0:10:0: ABORT operation started. 0:0:10:0: ABORT operation timed-out. 0:0:10:0: DEVICE RESET operation started. 0:0:10:0: DEVICE RESET operation timed-out. 0:0:10:0: BUS RESET operation started. 0:0:10:0: BUS RESET operation timed-out. 0:0:10:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:10:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 10 lun 0 0:0:11:0: ABORT operation started. 0:0:11:0: ABORT operation timed-out. 0:0:11:0: DEVICE RESET operation started. 0:0:11:0: DEVICE RESET operation timed-out. 0:0:11:0: BUS RESET operation started. 0:0:11:0: BUS RESET operation timed-out. 0:0:11:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:11:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 11 lun 0 0:0:12:0: ABORT operation started. 0:0:12:0: ABORT operation timed-out. 0:0:12:0: DEVICE RESET operation started. 0:0:12:0: DEVICE RESET operation timed-out. 0:0:12:0: BUS RESET operation started. 0:0:12:0: BUS RESET operation timed-out. 0:0:12:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:12:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 12 lun 0 0:0:13:0: ABORT operation started. 0:0:13:0: ABORT operation timed-out. 0:0:13:0: DEVICE RESET operation started. 0:0:13:0: DEVICE RESET operation timed-out. 0:0:13:0: BUS RESET operation started. 0:0:13:0: BUS RESET operation timed-out. 0:0:13:0: HOST RESET operation started. 
sym0: SCSI BUS has been reset. 0:0:13:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 13 lun 0 0:0:14:0: ABORT operation started. 0:0:14:0: ABORT operation timed-out. 0:0:14:0: DEVICE RESET operation started. 0:0:14:0: DEVICE RESET operation timed-out. 0:0:14:0: BUS RESET operation started. 0:0:14:0: BUS RESET operation timed-out. 0:0:14:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:14:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 14 lun 0 0:0:15:0: ABORT operation started. 0:0:15:0: ABORT operation timed-out. 0:0:15:0: DEVICE RESET operation started. 0:0:15:0: DEVICE RESET operation timed-out. 0:0:15:0: BUS RESET operation started. 0:0:15:0: BUS RESET operation timed-out. 0:0:15:0: HOST RESET operation started. sym0: SCSI BUS has been reset. 0:0:15:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 15 lun 0 sym1: <875> rev 0x3 at pci 0000:00:0d.0 irq 23 sym1: No NVRAM, ID 7, Fast-20, SE, parity checking sym1: SCSI BUS has been reset. scsi1 : sym-2.2.0 1:0:0:0: ABORT operation started. 1:0:0:0: ABORT operation timed-out. 1:0:0:0: DEVICE RESET operation started. 1:0:0:0: DEVICE RESET operation timed-out. 1:0:0:0: BUS RESET operation started. 1:0:0:0: BUS RESET operation timed-out. 1:0:0:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:0:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 0 lun 0 1:0:1:0: ABORT operation started. 1:0:1:0: ABORT operation timed-out. 1:0:1:0: DEVICE RESET operation started. 1:0:1:0: DEVICE RESET operation timed-out. 1:0:1:0: BUS RESET operation started. 1:0:1:0: BUS RESET operation timed-out. 1:0:1:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:1:0: HOST RESET operation timed-out. 
scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 1 lun 0 1:0:2:0: ABORT operation started. 1:0:2:0: ABORT operation timed-out. 1:0:2:0: DEVICE RESET operation started. 1:0:2:0: DEVICE RESET operation timed-out. 1:0:2:0: BUS RESET operation started. 1:0:2:0: BUS RESET operation timed-out. 1:0:2:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:2:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 2 lun 0 1:0:3:0: ABORT operation started. 1:0:3:0: ABORT operation timed-out. 1:0:3:0: DEVICE RESET operation started. 1:0:3:0: DEVICE RESET operation timed-out. 1:0:3:0: BUS RESET operation started. 1:0:3:0: BUS RESET operation timed-out. 1:0:3:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:3:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 3 lun 0 1:0:4:0: ABORT operation started. 1:0:4:0: ABORT operation timed-out. 1:0:4:0: DEVICE RESET operation started. 1:0:4:0: DEVICE RESET operation timed-out. 1:0:4:0: BUS RESET operation started. 1:0:4:0: BUS RESET operation timed-out. 1:0:4:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:4:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 4 lun 0 1:0:5:0: ABORT operation started. 1:0:5:0: ABORT operation timed-out. 1:0:5:0: DEVICE RESET operation started. 1:0:5:0: DEVICE RESET operation timed-out. 1:0:5:0: BUS RESET operation started. 1:0:5:0: BUS RESET operation timed-out. 1:0:5:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:5:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 5 lun 0 1:0:6:0: ABORT operation started. 1:0:6:0: ABORT operation timed-out. 1:0:6:0: DEVICE RESET operation started. 1:0:6:0: DEVICE RESET operation timed-out. 1:0:6:0: BUS RESET operation started. 
1:0:6:0: BUS RESET operation timed-out. 1:0:6:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:6:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 6 lun 0 1:0:8:0: ABORT operation started. 1:0:8:0: ABORT operation timed-out. 1:0:8:0: DEVICE RESET operation started. 1:0:8:0: DEVICE RESET operation timed-out. 1:0:8:0: BUS RESET operation started. 1:0:8:0: BUS RESET operation timed-out. 1:0:8:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:8:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 8 lun 0 1:0:9:0: ABORT operation started. 1:0:9:0: ABORT operation timed-out. 1:0:9:0: DEVICE RESET operation started. 1:0:9:0: DEVICE RESET operation timed-out. 1:0:9:0: BUS RESET operation started. 1:0:9:0: BUS RESET operation timed-out. 1:0:9:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:9:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 9 lun 0 1:0:10:0: ABORT operation started. 1:0:10:0: ABORT operation timed-out. 1:0:10:0: DEVICE RESET operation started. 1:0:10:0: DEVICE RESET operation timed-out. 1:0:10:0: BUS RESET operation started. 1:0:10:0: BUS RESET operation timed-out. 1:0:10:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:10:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 10 lun 0 1:0:11:0: ABORT operation started. 1:0:11:0: ABORT operation timed-out. 1:0:11:0: DEVICE RESET operation started. 1:0:11:0: DEVICE RESET operation timed-out. 1:0:11:0: BUS RESET operation started. 1:0:11:0: BUS RESET operation timed-out. 1:0:11:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:11:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 11 lun 0 1:0:12:0: ABORT operation started. 
1:0:12:0: ABORT operation timed-out. 1:0:12:0: DEVICE RESET operation started. 1:0:12:0: DEVICE RESET operation timed-out. 1:0:12:0: BUS RESET operation started. 1:0:12:0: BUS RESET operation timed-out. 1:0:12:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:12:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 12 lun 0 1:0:13:0: ABORT operation started. 1:0:13:0: ABORT operation timed-out. 1:0:13:0: DEVICE RESET operation started. 1:0:13:0: DEVICE RESET operation timed-out. 1:0:13:0: BUS RESET operation started. 1:0:13:0: BUS RESET operation timed-out. 1:0:13:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:13:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 13 lun 0 1:0:14:0: ABORT operation started. 1:0:14:0: ABORT operation timed-out. 1:0:14:0: DEVICE RESET operation started. 1:0:14:0: DEVICE RESET operation timed-out. 1:0:14:0: BUS RESET operation started. 1:0:14:0: BUS RESET operation timed-out. 1:0:14:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:14:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 14 lun 0 1:0:15:0: ABORT operation started. 1:0:15:0: ABORT operation timed-out. 1:0:15:0: DEVICE RESET operation started. 1:0:15:0: DEVICE RESET operation timed-out. 1:0:15:0: BUS RESET operation started. 1:0:15:0: BUS RESET operation timed-out. 1:0:15:0: HOST RESET operation started. sym1: SCSI BUS has been reset. 1:0:15:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 1 channel 0 id 15 lun 0 sym2: <825a> rev 0x13 at pci 0001:10:0d.0 irq 39 sym2: No NVRAM, ID 7, Fast-10, SE, parity checking sym2: SCSI BUS has been reset. scsi2 : sym-2.2.0 2:0:0:0: ABORT operation started. 2:0:0:0: ABORT operation timed-out. 2:0:0:0: DEVICE RESET operation started. 
2:0:0:0: DEVICE RESET operation timed-out. 2:0:0:0: BUS RESET operation started. 2:0:0:0: BUS RESET operation timed-out. 2:0:0:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:0:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 0 lun 0 2:0:1:0: ABORT operation started. 2:0:1:0: ABORT operation timed-out. 2:0:1:0: DEVICE RESET operation started. 2:0:1:0: DEVICE RESET operation timed-out. 2:0:1:0: BUS RESET operation started. 2:0:1:0: BUS RESET operation timed-out. 2:0:1:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:1:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 1 lun 0 2:0:2:0: ABORT operation started. 2:0:2:0: ABORT operation timed-out. 2:0:2:0: DEVICE RESET operation started. 2:0:2:0: DEVICE RESET operation timed-out. 2:0:2:0: BUS RESET operation started. 2:0:2:0: BUS RESET operation timed-out. 2:0:2:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:2:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 2 lun 0 2:0:3:0: ABORT operation started. 2:0:3:0: ABORT operation timed-out. 2:0:3:0: DEVICE RESET operation started. 2:0:3:0: DEVICE RESET operation timed-out. 2:0:3:0: BUS RESET operation started. 2:0:3:0: BUS RESET operation timed-out. 2:0:3:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:3:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 3 lun 0 2:0:4:0: ABORT operation started. 2:0:4:0: ABORT operation timed-out. 2:0:4:0: DEVICE RESET operation started. 2:0:4:0: DEVICE RESET operation timed-out. 2:0:4:0: BUS RESET operation started. 2:0:4:0: BUS RESET operation timed-out. 2:0:4:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:4:0: HOST RESET operation timed-out. 
scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 4 lun 0 2:0:5:0: ABORT operation started. 2:0:5:0: ABORT operation timed-out. 2:0:5:0: DEVICE RESET operation started. 2:0:5:0: DEVICE RESET operation timed-out. 2:0:5:0: BUS RESET operation started. 2:0:5:0: BUS RESET operation timed-out. 2:0:5:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:5:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 5 lun 0 2:0:6:0: ABORT operation started. 2:0:6:0: ABORT operation timed-out. 2:0:6:0: DEVICE RESET operation started. 2:0:6:0: DEVICE RESET operation timed-out. 2:0:6:0: BUS RESET operation started. 2:0:6:0: BUS RESET operation timed-out. 2:0:6:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:6:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 6 lun 0 2:0:8:0: ABORT operation started. 2:0:8:0: ABORT operation timed-out. 2:0:8:0: DEVICE RESET operation started. 2:0:8:0: DEVICE RESET operation timed-out. 2:0:8:0: BUS RESET operation started. 2:0:8:0: BUS RESET operation timed-out. 2:0:8:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:8:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 8 lun 0 2:0:9:0: ABORT operation started. 2:0:9:0: ABORT operation timed-out. 2:0:9:0: DEVICE RESET operation started. 2:0:9:0: DEVICE RESET operation timed-out. 2:0:9:0: BUS RESET operation started. 2:0:9:0: BUS RESET operation timed-out. 2:0:9:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:9:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 9 lun 0 2:0:10:0: ABORT operation started. 2:0:10:0: ABORT operation timed-out. 2:0:10:0: DEVICE RESET operation started. 2:0:10:0: DEVICE RESET operation timed-out. 2:0:10:0: BUS RESET operation started. 
2:0:10:0: BUS RESET operation timed-out. 2:0:10:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:10:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 10 lun 0 2:0:11:0: ABORT operation started. 2:0:11:0: ABORT operation timed-out. 2:0:11:0: DEVICE RESET operation started. 2:0:11:0: DEVICE RESET operation timed-out. 2:0:11:0: BUS RESET operation started. 2:0:11:0: BUS RESET operation timed-out. 2:0:11:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:11:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 11 lun 0 2:0:12:0: ABORT operation started. 2:0:12:0: ABORT operation timed-out. 2:0:12:0: DEVICE RESET operation started. 2:0:12:0: DEVICE RESET operation timed-out. 2:0:12:0: BUS RESET operation started. 2:0:12:0: BUS RESET operation timed-out. 2:0:12:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:12:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 12 lun 0 2:0:13:0: ABORT operation started. 2:0:13:0: ABORT operation timed-out. 2:0:13:0: DEVICE RESET operation started. 2:0:13:0: DEVICE RESET operation timed-out. 2:0:13:0: BUS RESET operation started. 2:0:13:0: BUS RESET operation timed-out. 2:0:13:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:13:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 13 lun 0 2:0:14:0: ABORT operation started. 2:0:14:0: ABORT operation timed-out. 2:0:14:0: DEVICE RESET operation started. 2:0:14:0: DEVICE RESET operation timed-out. 2:0:14:0: BUS RESET operation started. 2:0:14:0: BUS RESET operation timed-out. 2:0:14:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:14:0: HOST RESET operation timed-out. 
scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 14 lun 0 2:0:15:0: ABORT operation started. 2:0:15:0: ABORT operation timed-out. 2:0:15:0: DEVICE RESET operation started. 2:0:15:0: DEVICE RESET operation timed-out. 2:0:15:0: BUS RESET operation started. 2:0:15:0: BUS RESET operation timed-out. 2:0:15:0: HOST RESET operation started. sym2: SCSI BUS has been reset. 2:0:15:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 2 channel 0 id 15 lun 0 sym3: <825a> rev 0x13 at pci 0001:10:0e.0 irq 41 sym3: No NVRAM, ID 7, Fast-10, SE, parity checking sym3: SCSI BUS has been reset. scsi3 : sym-2.2.0 3:0:0:0: ABORT operation started. 3:0:0:0: ABORT operation timed-out. 3:0:0:0: DEVICE RESET operation started. 3:0:0:0: DEVICE RESET operation timed-out. 3:0:0:0: BUS RESET operation started. 3:0:0:0: BUS RESET operation timed-out. 3:0:0:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:0:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 0 lun 0 3:0:1:0: ABORT operation started. 3:0:1:0: ABORT operation timed-out. 3:0:1:0: DEVICE RESET operation started. 3:0:1:0: DEVICE RESET operation timed-out. 3:0:1:0: BUS RESET operation started. 3:0:1:0: BUS RESET operation timed-out. 3:0:1:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:1:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 1 lun 0 3:0:2:0: ABORT operation started. 3:0:2:0: ABORT operation timed-out. 3:0:2:0: DEVICE RESET operation started. 3:0:2:0: DEVICE RESET operation timed-out. 3:0:2:0: BUS RESET operation started. 3:0:2:0: BUS RESET operation timed-out. 3:0:2:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:2:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 2 lun 0 3:0:3:0: ABORT operation started. 
3:0:3:0: ABORT operation timed-out. 3:0:3:0: DEVICE RESET operation started. 3:0:3:0: DEVICE RESET operation timed-out. 3:0:3:0: BUS RESET operation started. 3:0:3:0: BUS RESET operation timed-out. 3:0:3:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:3:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 3 lun 0 3:0:4:0: ABORT operation started. 3:0:4:0: ABORT operation timed-out. 3:0:4:0: DEVICE RESET operation started. 3:0:4:0: DEVICE RESET operation timed-out. 3:0:4:0: BUS RESET operation started. 3:0:4:0: BUS RESET operation timed-out. 3:0:4:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:4:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 4 lun 0 3:0:5:0: ABORT operation started. 3:0:5:0: ABORT operation timed-out. 3:0:5:0: DEVICE RESET operation started. 3:0:5:0: DEVICE RESET operation timed-out. 3:0:5:0: BUS RESET operation started. 3:0:5:0: BUS RESET operation timed-out. 3:0:5:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:5:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 5 lun 0 3:0:6:0: ABORT operation started. 3:0:6:0: ABORT operation timed-out. 3:0:6:0: DEVICE RESET operation started. 3:0:6:0: DEVICE RESET operation timed-out. 3:0:6:0: BUS RESET operation started. 3:0:6:0: BUS RESET operation timed-out. 3:0:6:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:6:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 6 lun 0 3:0:8:0: ABORT operation started. 3:0:8:0: ABORT operation timed-out. 3:0:8:0: DEVICE RESET operation started. 3:0:8:0: DEVICE RESET operation timed-out. 3:0:8:0: BUS RESET operation started. 3:0:8:0: BUS RESET operation timed-out. 3:0:8:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 
3:0:8:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 8 lun 0 3:0:9:0: ABORT operation started. 3:0:9:0: ABORT operation timed-out. 3:0:9:0: DEVICE RESET operation started. 3:0:9:0: DEVICE RESET operation timed-out. 3:0:9:0: BUS RESET operation started. 3:0:9:0: BUS RESET operation timed-out. 3:0:9:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:9:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 9 lun 0 3:0:10:0: ABORT operation started. 3:0:10:0: ABORT operation timed-out. 3:0:10:0: DEVICE RESET operation started. 3:0:10:0: DEVICE RESET operation timed-out. 3:0:10:0: BUS RESET operation started. 3:0:10:0: BUS RESET operation timed-out. 3:0:10:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:10:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 10 lun 0 3:0:11:0: ABORT operation started. 3:0:11:0: ABORT operation timed-out. 3:0:11:0: DEVICE RESET operation started. 3:0:11:0: DEVICE RESET operation timed-out. 3:0:11:0: BUS RESET operation started. 3:0:11:0: BUS RESET operation timed-out. 3:0:11:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:11:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 11 lun 0 3:0:12:0: ABORT operation started. 3:0:12:0: ABORT operation timed-out. 3:0:12:0: DEVICE RESET operation started. 3:0:12:0: DEVICE RESET operation timed-out. 3:0:12:0: BUS RESET operation started. 3:0:12:0: BUS RESET operation timed-out. 3:0:12:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:12:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 12 lun 0 3:0:13:0: ABORT operation started. 3:0:13:0: ABORT operation timed-out. 3:0:13:0: DEVICE RESET operation started. 
3:0:13:0: DEVICE RESET operation timed-out. 3:0:13:0: BUS RESET operation started. 3:0:13:0: BUS RESET operation timed-out. 3:0:13:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:13:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 13 lun 0 3:0:14:0: ABORT operation started. 3:0:14:0: ABORT operation timed-out. 3:0:14:0: DEVICE RESET operation started. 3:0:14:0: DEVICE RESET operation timed-out. 3:0:14:0: BUS RESET operation started. 3:0:14:0: BUS RESET operation timed-out. 3:0:14:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:14:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 14 lun 0 3:0:15:0: ABORT operation started. 3:0:15:0: ABORT operation timed-out. 3:0:15:0: DEVICE RESET operation started. 3:0:15:0: DEVICE RESET operation timed-out. 3:0:15:0: BUS RESET operation started. 3:0:15:0: BUS RESET operation timed-out. 3:0:15:0: HOST RESET operation started. sym3: SCSI BUS has been reset. 3:0:15:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 3 channel 0 id 15 lun 0 sym4: <825a> rev 0x13 at pci 0002:20:0b.0 irq 51 sym4: No NVRAM, ID 7, Fast-10, SE, parity checking sym4: SCSI BUS has been reset. scsi4 : sym-2.2.0 4:0:0:0: ABORT operation started. 4:0:0:0: ABORT operation timed-out. 4:0:0:0: DEVICE RESET operation started. 4:0:0:0: DEVICE RESET operation timed-out. 4:0:0:0: BUS RESET operation started. 4:0:0:0: BUS RESET operation timed-out. 4:0:0:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:0:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 0 lun 0 4:0:1:0: ABORT operation started. 4:0:1:0: ABORT operation timed-out. 4:0:1:0: DEVICE RESET operation started. 4:0:1:0: DEVICE RESET operation timed-out. 4:0:1:0: BUS RESET operation started. 
4:0:1:0: BUS RESET operation timed-out. 4:0:1:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:1:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 1 lun 0 4:0:2:0: ABORT operation started. 4:0:2:0: ABORT operation timed-out. 4:0:2:0: DEVICE RESET operation started. 4:0:2:0: DEVICE RESET operation timed-out. 4:0:2:0: BUS RESET operation started. 4:0:2:0: BUS RESET operation timed-out. 4:0:2:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:2:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 2 lun 0 4:0:3:0: ABORT operation started. 4:0:3:0: ABORT operation timed-out. 4:0:3:0: DEVICE RESET operation started. 4:0:3:0: DEVICE RESET operation timed-out. 4:0:3:0: BUS RESET operation started. 4:0:3:0: BUS RESET operation timed-out. 4:0:3:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:3:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 3 lun 0 4:0:4:0: ABORT operation started. 4:0:4:0: ABORT operation timed-out. 4:0:4:0: DEVICE RESET operation started. 4:0:4:0: DEVICE RESET operation timed-out. 4:0:4:0: BUS RESET operation started. 4:0:4:0: BUS RESET operation timed-out. 4:0:4:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:4:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 4 lun 0 4:0:5:0: ABORT operation started. 4:0:5:0: ABORT operation timed-out. 4:0:5:0: DEVICE RESET operation started. 4:0:5:0: DEVICE RESET operation timed-out. 4:0:5:0: BUS RESET operation started. 4:0:5:0: BUS RESET operation timed-out. 4:0:5:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:5:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 5 lun 0 4:0:6:0: ABORT operation started. 
4:0:6:0: ABORT operation timed-out. 4:0:6:0: DEVICE RESET operation started. 4:0:6:0: DEVICE RESET operation timed-out. 4:0:6:0: BUS RESET operation started. 4:0:6:0: BUS RESET operation timed-out. 4:0:6:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:6:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 6 lun 0 4:0:8:0: ABORT operation started. 4:0:8:0: ABORT operation timed-out. 4:0:8:0: DEVICE RESET operation started. 4:0:8:0: DEVICE RESET operation timed-out. 4:0:8:0: BUS RESET operation started. 4:0:8:0: BUS RESET operation timed-out. 4:0:8:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:8:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 8 lun 0 4:0:9:0: ABORT operation started. 4:0:9:0: ABORT operation timed-out. 4:0:9:0: DEVICE RESET operation started. 4:0:9:0: DEVICE RESET operation timed-out. 4:0:9:0: BUS RESET operation started. 4:0:9:0: BUS RESET operation timed-out. 4:0:9:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:9:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 9 lun 0 4:0:10:0: ABORT operation started. 4:0:10:0: ABORT operation timed-out. 4:0:10:0: DEVICE RESET operation started. 4:0:10:0: DEVICE RESET operation timed-out. 4:0:10:0: BUS RESET operation started. 4:0:10:0: BUS RESET operation timed-out. 4:0:10:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:10:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 10 lun 0 4:0:11:0: ABORT operation started. 4:0:11:0: ABORT operation timed-out. 4:0:11:0: DEVICE RESET operation started. 4:0:11:0: DEVICE RESET operation timed-out. 4:0:11:0: BUS RESET operation started. 4:0:11:0: BUS RESET operation timed-out. 4:0:11:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 
4:0:11:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 11 lun 0 4:0:12:0: ABORT operation started. 4:0:12:0: ABORT operation timed-out. 4:0:12:0: DEVICE RESET operation started. 4:0:12:0: DEVICE RESET operation timed-out. 4:0:12:0: BUS RESET operation started. 4:0:12:0: BUS RESET operation timed-out. 4:0:12:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:12:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 12 lun 0 4:0:13:0: ABORT operation started. 4:0:13:0: ABORT operation timed-out. 4:0:13:0: DEVICE RESET operation started. 4:0:13:0: DEVICE RESET operation timed-out. 4:0:13:0: BUS RESET operation started. 4:0:13:0: BUS RESET operation timed-out. 4:0:13:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:13:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 13 lun 0 4:0:14:0: ABORT operation started. 4:0:14:0: ABORT operation timed-out. 4:0:14:0: DEVICE RESET operation started. 4:0:14:0: DEVICE RESET operation timed-out. 4:0:14:0: BUS RESET operation started. 4:0:14:0: BUS RESET operation timed-out. 4:0:14:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:14:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 14 lun 0 4:0:15:0: ABORT operation started. 4:0:15:0: ABORT operation timed-out. 4:0:15:0: DEVICE RESET operation started. 4:0:15:0: DEVICE RESET operation timed-out. 4:0:15:0: BUS RESET operation started. 4:0:15:0: BUS RESET operation timed-out. 4:0:15:0: HOST RESET operation started. sym4: SCSI BUS has been reset. 4:0:15:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 4 channel 0 id 15 lun 0 sym5: <875> rev 0x3 at pci 0002:20:0d.0 irq 55 sym5: No NVRAM, ID 7, Fast-20, SE, parity checking sym5: SCSI BUS has been reset. 
scsi5 : sym-2.2.0 5:0:0:0: ABORT operation started. 5:0:0:0: ABORT operation timed-out. 5:0:0:0: DEVICE RESET operation started. 5:0:0:0: DEVICE RESET operation timed-out. 5:0:0:0: BUS RESET operation started. 5:0:0:0: BUS RESET operation timed-out. 5:0:0:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:0:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0 5:0:1:0: ABORT operation started. 5:0:1:0: ABORT operation timed-out. 5:0:1:0: DEVICE RESET operation started. 5:0:1:0: DEVICE RESET operation timed-out. 5:0:1:0: BUS RESET operation started. 5:0:1:0: BUS RESET operation timed-out. 5:0:1:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:1:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 1 lun 0 5:0:2:0: ABORT operation started. 5:0:2:0: ABORT operation timed-out. 5:0:2:0: DEVICE RESET operation started. 5:0:2:0: DEVICE RESET operation timed-out. 5:0:2:0: BUS RESET operation started. 5:0:2:0: BUS RESET operation timed-out. 5:0:2:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:2:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 2 lun 0 5:0:3:0: ABORT operation started. 5:0:3:0: ABORT operation timed-out. 5:0:3:0: DEVICE RESET operation started. 5:0:3:0: DEVICE RESET operation timed-out. 5:0:3:0: BUS RESET operation started. 5:0:3:0: BUS RESET operation timed-out. 5:0:3:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:3:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 3 lun 0 5:0:4:0: ABORT operation started. 5:0:4:0: ABORT operation timed-out. 5:0:4:0: DEVICE RESET operation started. 5:0:4:0: DEVICE RESET operation timed-out. 5:0:4:0: BUS RESET operation started. 5:0:4:0: BUS RESET operation timed-out. 5:0:4:0: HOST RESET operation started. 
sym5: SCSI BUS has been reset. 5:0:4:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 4 lun 0 5:0:5:0: ABORT operation started. 5:0:5:0: ABORT operation timed-out. 5:0:5:0: DEVICE RESET operation started. 5:0:5:0: DEVICE RESET operation timed-out. 5:0:5:0: BUS RESET operation started. 5:0:5:0: BUS RESET operation timed-out. 5:0:5:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:5:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 5 lun 0 5:0:6:0: ABORT operation started. 5:0:6:0: ABORT operation timed-out. 5:0:6:0: DEVICE RESET operation started. 5:0:6:0: DEVICE RESET operation timed-out. 5:0:6:0: BUS RESET operation started. 5:0:6:0: BUS RESET operation timed-out. 5:0:6:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:6:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 6 lun 0 5:0:8:0: ABORT operation started. 5:0:8:0: ABORT operation timed-out. 5:0:8:0: DEVICE RESET operation started. 5:0:8:0: DEVICE RESET operation timed-out. 5:0:8:0: BUS RESET operation started. 5:0:8:0: BUS RESET operation timed-out. 5:0:8:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:8:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 8 lun 0 5:0:9:0: ABORT operation started. 5:0:9:0: ABORT operation timed-out. 5:0:9:0: DEVICE RESET operation started. 5:0:9:0: DEVICE RESET operation timed-out. 5:0:9:0: BUS RESET operation started. 5:0:9:0: BUS RESET operation timed-out. 5:0:9:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:9:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 9 lun 0 5:0:10:0: ABORT operation started. 5:0:10:0: ABORT operation timed-out. 5:0:10:0: DEVICE RESET operation started. 
5:0:10:0: DEVICE RESET operation timed-out. 5:0:10:0: BUS RESET operation started. 5:0:10:0: BUS RESET operation timed-out. 5:0:10:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:10:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 10 lun 0 5:0:11:0: ABORT operation started. 5:0:11:0: ABORT operation timed-out. 5:0:11:0: DEVICE RESET operation started. 5:0:11:0: DEVICE RESET operation timed-out. 5:0:11:0: BUS RESET operation started. 5:0:11:0: BUS RESET operation timed-out. 5:0:11:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:11:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 11 lun 0 5:0:12:0: ABORT operation started. 5:0:12:0: ABORT operation timed-out. 5:0:12:0: DEVICE RESET operation started. 5:0:12:0: DEVICE RESET operation timed-out. 5:0:12:0: BUS RESET operation started. 5:0:12:0: BUS RESET operation timed-out. 5:0:12:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:12:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 12 lun 0 5:0:13:0: ABORT operation started. 5:0:13:0: ABORT operation timed-out. 5:0:13:0: DEVICE RESET operation started. 5:0:13:0: DEVICE RESET operation timed-out. 5:0:13:0: BUS RESET operation started. 5:0:13:0: BUS RESET operation timed-out. 5:0:13:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:13:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 13 lun 0 5:0:14:0: ABORT operation started. 5:0:14:0: ABORT operation timed-out. 5:0:14:0: DEVICE RESET operation started. 5:0:14:0: DEVICE RESET operation timed-out. 5:0:14:0: BUS RESET operation started. 5:0:14:0: BUS RESET operation timed-out. 5:0:14:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:14:0: HOST RESET operation timed-out. 
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 14 lun 0 5:0:15:0: ABORT operation started. 5:0:15:0: ABORT operation timed-out. 5:0:15:0: DEVICE RESET operation started. 5:0:15:0: DEVICE RESET operation timed-out. 5:0:15:0: BUS RESET operation started. 5:0:15:0: BUS RESET operation timed-out. 5:0:15:0: HOST RESET operation started. sym5: SCSI BUS has been reset. 5:0:15:0: HOST RESET operation timed-out. scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 15 lun 0 ipr: IBM Power RAID SCSI Device Driver version: 2.0.13 (February 21, 2005) st: Version 20050312, fixed bufsize 32768, s/g segs 256 Initializing USB Mass Storage driver... usbcore: registered new driver usb-storage USB Mass Storage support registered. usbcore: registered new driver hiddev usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.01:USB HID core driver mice: PS/2 mouse device common for all mice md: linear personality registered as nr 1 md: raid0 personality registered as nr 2 md: raid1 personality registered as nr 3 md: raid5 personality registered as nr 4 raid5: measuring checksumming speed 8regs : 428.000 MB/sec 8regs_prefetch: 388.000 MB/sec 32regs : 552.000 MB/sec 32regs_prefetch: 472.000 MB/sec raid5: using function: 32regs (552.000 MB/sec) md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27 device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel at redhat.com oprofile: using ppc64/rs64 performance monitoring. NET: Registered protocol family 2 atkbd.c: keyboard reset failed on isa0060/serio1 IP: routing cache hash table of 65536 buckets, 512Kbytes TCP established hash table entries: 524288 (order: 10, 4194304 bytes) TCP bind hash table entries: 65536 (order: 7, 524288 bytes) TCP: Hash tables configured (established 524288 bind 65536) IPv4 over IPv4 tunneling driver NET: Registered protocol family 1 NET: Registered protocol family 17 md: Autodetecting RAID arrays. md: autorun ... md: ... autorun DONE. 
VFS: Insert root floppy and press ENTER From moilanen at austin.ibm.com Wed Jun 1 07:55:07 2005 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Tue, 31 May 2005 16:55:07 -0500 Subject: [PATCH] PCI device-node failure detection In-Reply-To: <20050531195127.GD3723@otto> References: <20050531101225.510efbf7.moilanen@austin.ibm.com> <20050531195127.GD3723@otto> Message-ID: <20050531165507.44dba54a.moilanen@austin.ibm.com> > "fail" is not the only possible state that would indicate that the > device is unusable, if I'm reading IEEE 1275 right. There is also > "disabled" and "fail-xxx" where xxx is additional info about the > fault. I've never seen "fail-xxx" myself but I've seen "disabled" on > devices that were deconfigured by firmware (failing cpu iirc). I haven't seen this on an adapter, but better to follow the spec. > Unless we want to treat "disabled" and "fail[-xxx]" differently, I > think we should be checking that the status property, if present, > says "okay". > > Something like: > > int dn_failed(struct device_node *dn) > { > char *status = get_property(dn, "status", NULL); > > if (!status) > return 0; > > if (!strcmp(status, "okay")) > return 0; > > return 1; > } I like that much more. > Also, I think this function could be made a static helper in > pSeries_pci.c until something outside of that file needs it. Hmm...I thought we were trying to keep all device-tree specific items in prom.c. Here's a fixed up version of the dn_failed(). 
Signed-off-by: Jake Moilanen Index: 2.6.12/arch/ppc64/kernel/prom.c =================================================================== --- 2.6.12.orig/arch/ppc64/kernel/prom.c 2005-03-02 01:38:13.000000000 -0600 +++ 2.6.12/arch/ppc64/kernel/prom.c 2005-05-31 21:05:58.674114874 -0500 @@ -1887,6 +1887,22 @@ *next = prop; } +int +dn_failed(struct device_node * dn) +{ + char * status; + + status = get_property(dn, "status", NULL); + + if (!status) + return 0; + + if (!strcmp(status, "okay")) + return 0; + + return 1; +} + #if 0 void print_properties(struct device_node *np) Index: 2.6.12/arch/ppc64/kernel/pSeries_pci.c =================================================================== --- 2.6.12.orig/arch/ppc64/kernel/pSeries_pci.c 2005-03-02 01:38:34.000000000 -0600 +++ 2.6.12/arch/ppc64/kernel/pSeries_pci.c 2005-05-27 18:44:33.000000000 -0500 @@ -96,7 +96,7 @@ /* Search only direct children of the bus */ for (dn = busdn->child; dn; dn = dn->sibling) - if (dn->devfn == devfn) + if (dn->devfn == devfn && !dn_failed(dn)) return rtas_read_config(dn, where, size, val); return PCIBIOS_DEVICE_NOT_FOUND; } @@ -138,7 +138,7 @@ /* Search only direct children of the bus */ for (dn = busdn->child; dn; dn = dn->sibling) - if (dn->devfn == devfn) + if (dn->devfn == devfn && !dn_failed(dn)) return rtas_write_config(dn, where, size, val); return PCIBIOS_DEVICE_NOT_FOUND; } Index: 2.6.12/include/asm-ppc64/prom.h =================================================================== --- 2.6.12.orig/include/asm-ppc64/prom.h 2005-03-02 01:38:33.000000000 -0600 +++ 2.6.12/include/asm-ppc64/prom.h 2005-05-27 18:44:33.000000000 -0500 @@ -225,5 +225,6 @@ extern int prom_n_intr_cells(struct device_node* np); extern void prom_get_irq_senses(unsigned char *senses, int off, int max); extern void prom_add_property(struct device_node* np, struct property* prop); +extern int dn_failed(struct device_node * dn); #endif /* _PPC64_PROM_H */ From linas at austin.ibm.com Wed Jun 1 07:56:28 2005 From: 
linas at austin.ibm.com (Linas Vepstas) Date: Tue, 31 May 2005 16:56:28 -0500 Subject: [PATCH]: PCI Error Recovery Implementation In-Reply-To: <1117572727.7775.11.camel@sinatra.austin.ibm.com> References: <20050531203028.GD31199@austin.ibm.com> <1117572727.7775.11.camel@sinatra.austin.ibm.com> Message-ID: <20050531215628.GE31199@austin.ibm.com> On Tue, May 31, 2005 at 03:52:07PM -0500, John Rose was heard to remark: > Hi Linas/Greg/Everyone- > > +int handle_eeh_events (struct notifier_block *self, > + unsigned long reason, void *ev) > > At the risk of sounding like a broken record, I don't think that this > belongs in the RPA PCI Hotplug driver. This bit of code _uses_ PCI > Hotplug rather than implementing it, and thus stands out from the rest > of the module. Not to mention that it uses EEH-specific stuff that > doesn't belong here. I'm in the midst of reducing the codebase of this > module significantly, and this adds more unrelated stuff to an already > cluttered module. > > I think the PCI hotplug driver could register enable/disable functions > with eeh.c, and that the handle_events() code should reside there. If > someone has a good technical explanation of why this is bad, please > chime in. What would be the correct way of forcing the rpaphp.o module to get loaded, and what would be the correct way of invoking functions in that module? --linas From johnrose at austin.ibm.com Wed Jun 1 08:13:11 2005 From: johnrose at austin.ibm.com (John Rose) Date: Tue, 31 May 2005 17:13:11 -0500 Subject: [PATCH]: PCI Error Recovery Implementation In-Reply-To: <20050531215628.GE31199@austin.ibm.com> References: <20050531203028.GD31199@austin.ibm.com> <1117572727.7775.11.camel@sinatra.austin.ibm.com> <20050531215628.GE31199@austin.ibm.com> Message-ID: <1117577591.7775.30.camel@sinatra.austin.ibm.com> Hi Linas- > What would be the correct way of forcing the rpaphp.o module to get > loaded, The burden is on the distro/user to either compile-in or load the PCI Hotplug module. 
This is the case with or without your patch, and regardless of the ultimate solution to the EEH problem. > and what would be the correct way of invoking functions in that > module? Arch/ppc64/kernel/eeh.c could have function pointers for enable/disable slot. Something like: int (*hp_disable_slot)(struct pci_bus *bus) = NULL; int (*hp_enable_slot)(struct pci_bus *bus) = NULL; These could either be exported, or be static and accompanied by small accessor functions. The RPA hotplug module could set these pointers at module init, either directly or through accessors. If these aren't set at runtime, the module isn't loaded. This puts the eeh footprint on rpaphp at 4 lines, and leaves the EEH implementation in an eeh file. Thoughts? John From paulus at samba.org Wed Jun 1 08:19:00 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 1 Jun 2005 08:19:00 +1000 Subject: [PATCH] correct printing to op panel In-Reply-To: <429CDC42.30905@austin.ibm.com> References: <429CDC42.30905@austin.ibm.com> Message-ID: <17052.58068.511994.89604@cargo.ozlabs.ibm.com> Mike Strosaker writes: > This patch corrects the printing of progress indicators to the op panel on > ppc64 systems. Each discrete reference code should begin with a form feed > char to clear the op panel, and the first and second lines should be separated > with a CR/LF sequence. Padding with spaces is not necessary. I want to think about this one a bit more. The ppc_md.progress() calls aren't only used for the i/pSeries op panels, and I need to think about the effect of \f on the other progress implementations. It would be best, I think, if the outputting of \f could be done by the pSeries progress function rather than the caller, if possible. Paul. 
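A minimal sketch of the panel framing discussed above, assuming the form feed and CR/LF are added in one platform-side helper so that callers only pass plain text. The helper name format_panel_code() is hypothetical and not part of either patch; only the "\f%08X\r\n" framing and the 32-byte buffer size come from Mike's diff.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in the posted patch): build one complete
 * op panel update in buf.
 *   \f      clears the panel
 *   %08X    reference code on line 1, upper-case hex
 *   \r\n    moves to line 2
 *   %s      free-form message on line 2, no space padding
 * Returns what snprintf() returns; output is truncated to len bytes.
 */
static int format_panel_code(char *buf, size_t len,
			     unsigned int src, const char *msg)
{
	return snprintf(buf, len, "\f%08X\r\n%s", src, msg);
}
```

Keeping the control characters in one place like this is one way to address the concern that other ppc_md.progress() implementations may not expect a leading \f from their callers.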
From linas at austin.ibm.com Wed Jun 1 08:38:01 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 31 May 2005 17:38:01 -0500 Subject: [PATCH]: PCI Error Recovery Implementation In-Reply-To: <1117577591.7775.30.camel@sinatra.austin.ibm.com> References: <20050531203028.GD31199@austin.ibm.com> <1117572727.7775.11.camel@sinatra.austin.ibm.com> <20050531215628.GE31199@austin.ibm.com> <1117577591.7775.30.camel@sinatra.austin.ibm.com> Message-ID: <20050531223801.GF31199@austin.ibm.com> On Tue, May 31, 2005 at 05:13:11PM -0500, John Rose was heard to remark: > > and what would be the correct way of invoking functions in that > > module? > > Arch/ppc64/kernel/eeh.c could have function pointers for enable/disable > slot. Something like: > int (*hp_disable_slot)(struct pci_bus *bus) = NULL; > int (*hp_enable_slot)(struct pci_bus *bus) = NULL; > > These could either be exported, or be static and accompanied by small > accessor functions. The RPA hotplug module could set these pointers at > module init, either directly or through accessors. If these aren't set > at runtime, the module isn't loaded. > > This puts the eeh footprint on rpaphp at 4 lines, and leaves the EEH > implementation in an eeh file. That was the original implementation about a year ago. Lengthy discussion on the mailing lists suggested that this was the wrong way to do it. I'm not sure, but I vaguely remember Paul Mackerras as being (one of) the drivers urging the current implementation. Paul, could you please comment on this issue? Personally, I don't much care; either mechanism has the same net result.
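For reference, the function-pointer scheme quoted above could look roughly like the sketch below. Only hp_disable_slot and hp_enable_slot come from John's mail; the accessor eeh_register_hp_ops(), the caller eeh_reset_slot_sketch() and the demo_* stubs are hypothetical names used for illustration, and struct pci_bus is left opaque so the fragment builds outside the kernel.

```c
#include <stddef.h>

struct pci_bus;	/* opaque here; the real definition lives in the PCI core */

/* Hooks as proposed in the quoted mail; NULL means "hotplug module not loaded". */
static int (*hp_disable_slot)(struct pci_bus *bus);
static int (*hp_enable_slot)(struct pci_bus *bus);

/* Hypothetical accessor the hotplug module would call at module init. */
void eeh_register_hp_ops(int (*disable)(struct pci_bus *),
			 int (*enable)(struct pci_bus *))
{
	hp_disable_slot = disable;
	hp_enable_slot = enable;
}

/*
 * Hypothetical EEH-side caller: cycle a slot through disable/enable.
 * Returns -1 when no hotplug driver has registered its hooks.
 */
int eeh_reset_slot_sketch(struct pci_bus *bus)
{
	int rc;

	if (!hp_disable_slot || !hp_enable_slot)
		return -1;
	rc = hp_disable_slot(bus);
	if (rc)
		return rc;
	return hp_enable_slot(bus);
}

/* Stand-ins for the hotplug module's real slot operations (assumed). */
static int demo_calls;

static int demo_disable_slot(struct pci_bus *bus)
{
	(void)bus;
	demo_calls++;
	return 0;
}

static int demo_enable_slot(struct pci_bus *bus)
{
	(void)bus;
	demo_calls++;
	return 0;
}
```

In this arrangement the hotplug module would call eeh_register_hp_ops(demo_disable_slot, demo_enable_slot) with its real functions at init and clear the hooks again at exit. Whether this is preferable to the notifier-based handle_eeh_events() in the posted patch is exactly the question left open in this thread.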
--linas From paulus at samba.org Wed Jun 1 09:05:04 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 1 Jun 2005 09:05:04 +1000 Subject: book e idea In-Reply-To: <429C5648.2040501@blach.dnsalias.org> References: <429C5648.2040501@blach.dnsalias.org> Message-ID: <17052.60832.525453.668005@cargo.ozlabs.ibm.com> Ralph Blach writes: > could the OS be run in translation state 0, and and the users in > translation state 1? Yes, it is possible. The main problem that it would introduce is that copy_to_user, copy_from_user, put_user, get_user etc. would no longer be able to access user space directly but would instead have to set up page mappings, and would therefore become bigger and slower. > The advantage to this would be is that user task could have true 4 gb > addressing on a and so could the kernel. Yes. There doesn't seem to be a lot of pressure for giving userspace more than 2GB in the embedded world though. Paul. From strosake at austin.ibm.com Wed Jun 1 07:50:58 2005 From: strosake at austin.ibm.com (Mike Strosaker) Date: Tue, 31 May 2005 16:50:58 -0500 Subject: [PATCH] correct printing to op panel Message-ID: <429CDC42.30905@austin.ibm.com> This patch corrects the printing of progress indicators to the op panel on ppc64 systems. Each discrete reference code should begin with a form feed char to clear the op panel, and the first and second lines should be separated with a CR/LF sequence. Padding with spaces is not necessary. Also, capitalize the hex value printed on the first line, to be consistent with the values printed by firmware, service processor, etc. 
Signed-off-by: Mike Strosaker diff -Nru linux-2.6.12-rc5.orig/arch/ppc64/kernel/pSeries_setup.c linux-2.6.12-rc5/arch/ppc64/kernel/pSeries_setup.c --- linux-2.6.12-rc5.orig/arch/ppc64/kernel/pSeries_setup.c 2005-05-31 15:35:37.000000000 -0500 +++ linux-2.6.12-rc5/arch/ppc64/kernel/pSeries_setup.c 2005-05-31 15:16:49.000000000 -0500 @@ -240,7 +240,7 @@ static int __init pSeries_init_panel(void) { /* Manually leave the kernel version on the panel. */ - ppc_md.progress("Linux ppc64\n", 0); + ppc_md.progress("\fLinux ppc64\r\n", 0); ppc_md.progress(UTS_RELEASE, 0); return 0; diff -Nru linux-2.6.12-rc5.orig/arch/ppc64/kernel/setup.c linux-2.6.12-rc5/arch/ppc64/kernel/setup.c --- linux-2.6.12-rc5.orig/arch/ppc64/kernel/setup.c 2005-05-31 15:35:37.000000000 -0500 +++ linux-2.6.12-rc5/arch/ppc64/kernel/setup.c 2005-05-31 15:17:10.000000000 -0500 @@ -1085,9 +1085,9 @@ if (ppc_md.progress) { char buf[32]; - sprintf(buf, "%08x \n", src); + sprintf(buf, "\f%08X\r\n", src); ppc_md.progress(buf, 0); - sprintf(buf, "%-16s", msg); + snprintf(buf, 32, "%s", msg); ppc_md.progress(buf, 0); } } From benh at kernel.crashing.org Wed Jun 1 09:34:46 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 01 Jun 2005 09:34:46 +1000 Subject: [PATCH] PCI device-node failure detection In-Reply-To: <20050531101225.510efbf7.moilanen@austin.ibm.com> References: <20050531101225.510efbf7.moilanen@austin.ibm.com> Message-ID: <1117582486.5826.62.camel@gaston> On Tue, 2005-05-31 at 10:12 -0500, Jake Moilanen wrote: > OpenFirmware marks devices as failed in the device-tree when a hardware > problem is detected. The kernel needs to fail config reads/writes to > prevent a kernel crash when incorrect data is read. > > This patch validates that the device-node is not marked "fail" when > config space reads/writes are attempted. 
> > Signed-off-by: Jake Moilanen > > Index: 2.6.12/arch/ppc64/kernel/prom.c > =================================================================== > --- 2.6.12.orig/arch/ppc64/kernel/prom.c 2005-03-02 01:38:13.000000000 -0600 > +++ 2.6.12/arch/ppc64/kernel/prom.c 2005-05-27 18:44:33.172559207 -0500 > @@ -1887,6 +1887,19 @@ > *next = prop; > } > > +int > +dn_failed(struct device_node * dn) > +{ > + char * status; > + > + status = get_property(dn, "status", NULL); > + > + if (status && !strcmp(status, "fail")) > + return 1; > + > + return 0; > +} > + Please, keep that out of prom.c (and I don't like the function name :) Ben. From benh at kernel.crashing.org Wed Jun 1 09:37:55 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 01 Jun 2005 09:37:55 +1000 Subject: [PATCH] PCI device-node failure detection In-Reply-To: <20050531165507.44dba54a.moilanen@austin.ibm.com> References: <20050531101225.510efbf7.moilanen@austin.ibm.com> <20050531195127.GD3723@otto> <20050531165507.44dba54a.moilanen@austin.ibm.com> Message-ID: <1117582676.5826.66.camel@gaston> > Hmm...I thought we were trying to keep all device-tree specific items in > prom.c. Not really. In fact, I want only "generic" device-tree stuff in there, and that isn't quite it. I'm about to remove the bunch of interpret_xxxx_props() for example, ultimately, I want prom.c to only contain the structure accessors, flatten/unflatten code, etc.. Besides, if you really want to export it, considering that it's "standard" enough to be in generic code, then it should be rather called something like. int of_device_failed(...) And finally, i'd rather have it backward, that is something like of_device_available(). Ben. From benh at kernel.crashing.org Wed Jun 1 14:54:25 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 01 Jun 2005 14:54:25 +1000 Subject: [PATCH] ppc64: Fix a device-tree bug on Apple's Message-ID: <1117601666.5826.85.camel@gaston> Hi ! 
Apple's Open Firmware has a funny bug when creating the /cpus nodes where it leaves a dangling '\0' character in the CPU name which ends up appearing in the full path of the node. This is bogus and confuses /proc/device-tree badly. This patch strips those bogus zeros from the node full path when reading the device-tree from Open Firmware. The "name" property is not modified and still contains the spurious 0 (it basically contains 0 trailing 0 instead of one) but that shouldn't be a problem. An equivalent patch for ppc32 will follow shortly. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom_init.c 2005-06-01 14:38:21.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom_init.c 2005-06-01 14:44:30.000000000 +1000 @@ -1566,7 +1566,7 @@ { int l, align; phandle child; - char *namep, *prev_name, *sstart; + char *namep, *prev_name, *sstart, *p, *ep; unsigned long soff; unsigned char *valp; unsigned long offset = reloc_offset(); @@ -1588,6 +1588,14 @@ call_prom("package-to-path", 3, 1, node, namep, l); } namep[l] = '\0'; + /* Fixup an Apple bug where they have bogus \0 chars in the + * middle of the path in some properties + */ + for (p = namep, ep = namep + l; p < ep; p++) + if (*p == '\0') { + memmove(p, p+1, ep - p); + ep--; l--; + } *mem_start = _ALIGN(((unsigned long) namep) + strlen(namep) + 1, 4); } From benh at kernel.crashing.org Wed Jun 1 17:07:27 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 01 Jun 2005 17:07:27 +1000 Subject: [PATCH] ppc32/ppc64: cleanup /proc/device-tree Message-ID: <1117609647.5409.8.camel@gaston> Hi ! This patch cleans up the /proc/device-tree representation of the Open Firmware device-tree on ppc and ppc64. It does the following things: - Work around an issue in some Apple device-trees where a property may exist with the same name as a child node of the parent.
We now simply "drop" the property instead of creating duplicate entries in /proc with random results... - Do not try to chop off the "@0" at the end of a node name whose unit address is 0. This is not useful, inconsistent, and the code was buggy and didn't always work anyway. - Do not create symlinks for the short name and unit address parts of a node. These were never really used, and bloated the memory footprint of the device-tree with useless struct proc_dir_entry and their matching dentry and inode cache bloat. This results in smaller code, smaller memory footprint, and a more accurate view of the tree presented to userland. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/fs/proc/proc_devtree.c =================================================================== --- linux-work.orig/fs/proc/proc_devtree.c 2005-06-01 16:18:51.000000000 +1000 +++ linux-work/fs/proc/proc_devtree.c 2005-06-01 16:47:35.000000000 +1000 @@ -12,15 +12,8 @@ #include #ifndef HAVE_ARCH_DEVTREE_FIXUPS -static inline void set_node_proc_entry(struct device_node *np, struct proc_dir_entry *de) -{ -} - -static void inline set_node_name_link(struct device_node *np, struct proc_dir_entry *de) -{ -} - -static void inline set_node_addr_link(struct device_node *np, struct proc_dir_entry *de) +static inline void set_node_proc_entry(struct device_node *np, + struct proc_dir_entry *de) { } #endif @@ -58,89 +51,67 @@ /* * Process a node, adding entries for its children and its properties.
*/ -void proc_device_tree_add_node(struct device_node *np, struct proc_dir_entry *de) +void proc_device_tree_add_node(struct device_node *np, + struct proc_dir_entry *de) { struct property *pp; struct proc_dir_entry *ent; - struct device_node *child, *sib; - const char *p, *at; - int l; - struct proc_dir_entry *list, **lastp, *al; + struct device_node *child; + struct proc_dir_entry *list = NULL, **lastp; + const char *p; set_node_proc_entry(np, de); lastp = &list; - for (pp = np->properties; pp != 0; pp = pp->next) { - /* - * Unfortunately proc_register puts each new entry - * at the beginning of the list. So we rearrange them. - */ - ent = create_proc_read_entry(pp->name, strncmp(pp->name, "security-", 9) ? - S_IRUGO : S_IRUSR, de, property_read_proc, pp); - if (ent == 0) - break; - if (!strncmp(pp->name, "security-", 9)) - ent->size = 0; /* don't leak number of password chars */ - else - ent->size = pp->length; - *lastp = ent; - lastp = &ent->next; - } - child = NULL; - while ((child = of_get_next_child(np, child))) { + for (child = NULL; (child = of_get_next_child(np, child));) { p = strrchr(child->full_name, '/'); if (!p) p = child->full_name; else ++p; - /* chop off '@0' if the name ends with that */ - l = strlen(p); - if (l > 2 && p[l-2] == '@' && p[l-1] == '0') - l -= 2; ent = proc_mkdir(p, de); if (ent == 0) break; *lastp = ent; + ent->next = NULL; lastp = &ent->next; proc_device_tree_add_node(child, ent); - - /* - * If we left the address part on the name, consider - * adding symlinks from the name and address parts. - */ - if (p[l] != 0 || (at = strchr(p, '@')) == 0) - continue; - + } + of_node_put(child); + for (pp = np->properties; pp != 0; pp = pp->next) { /* - * If this is the first node with a given name property, - * add a symlink with the name property as its name. + * Yet another Apple device-tree bogosity: on some machines, + * they have properties & nodes with the same name. 
Those + * properties are quite unimportant for us though, thus we + * simply "skip" them here, but we do have to check. */ - sib = NULL; - while ((sib = of_get_next_child(np, sib)) && sib != child) - if (sib->name && strcmp(sib->name, child->name) == 0) - break; - if (sib == child && strncmp(p, child->name, l) != 0) { - al = proc_symlink(child->name, de, ent->name); - if (al == 0) { - of_node_put(sib); + for (ent = list; ent != NULL; ent = ent->next) + if (!strcmp(ent->name, pp->name)) break; - } - set_node_name_link(child, al); - *lastp = al; - lastp = &al->next; + if (ent != NULL) { + printk(KERN_WARNING "device-tree: property \"%s\" name" + " conflicts with node in %s\n", pp->name, + np->full_name); + continue; } - of_node_put(sib); + /* - * Add another directory with the @address part as its name. + * Unfortunately proc_register puts each new entry + * at the beginning of the list. So we rearrange them. */ - al = proc_symlink(at, de, ent->name); - if (al == 0) + ent = create_proc_read_entry(pp->name, + strncmp(pp->name, "security-", 9) + ? 
S_IRUGO : S_IRUSR, de, + property_read_proc, pp); + if (ent == 0) break; - set_node_addr_link(child, al); - *lastp = al; - lastp = &al->next; + if (!strncmp(pp->name, "security-", 9)) + ent->size = 0; /* don't leak number of password chars */ + else + ent->size = pp->length; + ent->next = NULL; + *lastp = ent; + lastp = &ent->next; } - of_node_put(child); - *lastp = NULL; de->subdir = list; } Index: linux-work/include/asm-ppc64/prom.h =================================================================== --- linux-work.orig/include/asm-ppc64/prom.h 2005-06-01 16:18:51.000000000 +1000 +++ linux-work/include/asm-ppc64/prom.h 2005-06-01 16:30:30.000000000 +1000 @@ -147,9 +147,7 @@ struct device_node *sibling; struct device_node *next; /* next device of same type */ struct device_node *allnext; /* next in list of all nodes */ - struct proc_dir_entry *pde; /* this node's proc directory */ - struct proc_dir_entry *name_link; /* name symlink */ - struct proc_dir_entry *addr_link; /* addr symlink */ + struct proc_dir_entry *pde; /* this node's proc directory */ struct kref kref; unsigned long _flags; }; @@ -174,15 +172,6 @@ dn->pde = de; } -static void inline set_node_name_link(struct device_node *dn, struct proc_dir_entry *de) -{ - dn->name_link = de; -} - -static void inline set_node_addr_link(struct device_node *dn, struct proc_dir_entry *de) -{ - dn->addr_link = de; -} /* OBSOLETE: Old stlye node lookup */ extern struct device_node *find_devices(const char *name); Index: linux-work/arch/ppc64/kernel/pSeries_reconfig.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/pSeries_reconfig.c 2005-06-01 16:18:51.000000000 +1000 +++ linux-work/arch/ppc64/kernel/pSeries_reconfig.c 2005-06-01 16:30:30.000000000 +1000 @@ -47,14 +47,6 @@ remove_proc_entry(pp->name, np->pde); pp = pp->next; } - - /* Assuming that symlinks have the same parent directory as - * np->pde. 
- */ - if (np->name_link) - remove_proc_entry(np->name_link->name, parent->pde); - if (np->addr_link) - remove_proc_entry(np->addr_link->name, parent->pde); if (np->pde) remove_proc_entry(np->pde->name, parent->pde); } From sfr at canb.auug.org.au Wed Jun 1 18:14:55 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 1 Jun 2005 18:14:55 +1000 Subject: [PATCH] iSeries: start cleanup of headers Message-ID: <20050601181455.2529c0fd.sfr@canb.auug.org.au> Hi all, I am endeavouring to make the iSeries header files look more "Linux like". This first patch just does white space cleanups, comment cleanups and some simple code reformatting. There are no semantic changes in here. I also remove one header that was only declaring a function that no longer exists. Comments? If none, I will send this to Andrew soon and will base more patches on this beginning. (Yes, that means more StudleyCaps removal :-)) The patch is too large for the list, so you can get it here: http://ozlabs.org/~sfr/iSeries_include.1.diff arch/ppc64/kernel/iSeries_proc.c | 1 arch/ppc64/kernel/iSeries_setup.c | 1 arch/ppc64/kernel/viopath.c | 1 include/asm-ppc64/iSeries/HvCall.h | 100 ++--- include/asm-ppc64/iSeries/HvCallCfg.h | 177 +++------ include/asm-ppc64/iSeries/HvCallEvent.h | 94 +---- include/asm-ppc64/iSeries/HvCallHpt.h | 128 ++---- include/asm-ppc64/iSeries/HvCallPci.h | 486 +++++++++----------------- include/asm-ppc64/iSeries/HvCallSc.h | 38 +- include/asm-ppc64/iSeries/HvCallSm.h | 36 - include/asm-ppc64/iSeries/HvCallXm.h | 114 ++---- include/asm-ppc64/iSeries/HvLpConfig.h | 294 ++++++++------- include/asm-ppc64/iSeries/HvLpEvent.h | 122 +++--- include/asm-ppc64/iSeries/HvReleaseData.h | 76 +--- include/asm-ppc64/iSeries/HvTypes.h | 112 ++--- include/asm-ppc64/iSeries/IoHriMainStore.h | 33 - include/asm-ppc64/iSeries/IoHriProcessorVpd.h | 30 - include/asm-ppc64/iSeries/ItExtVpdPanel.h | 52 +- include/asm-ppc64/iSeries/ItIplParmsReal.h | 97 ++--- 
include/asm-ppc64/iSeries/ItLpNaca.h | 44 -- include/asm-ppc64/iSeries/ItLpQueue.h | 74 +-- include/asm-ppc64/iSeries/ItLpRegSave.h | 41 +- include/asm-ppc64/iSeries/ItSpCommArea.h | 10 include/asm-ppc64/iSeries/ItVpdAreas.h | 121 +++--- include/asm-ppc64/iSeries/LparData.h | 27 - include/asm-ppc64/iSeries/LparMap.h | 42 +- include/asm-ppc64/iSeries/XmPciLpEvent.h | 15 include/asm-ppc64/iSeries/iSeries_io.h | 59 +-- include/asm-ppc64/iSeries/iSeries_irq.h | 18 include/asm-ppc64/iSeries/iSeries_pci.h | 161 ++++---- include/asm-ppc64/iSeries/iSeries_proc.h | 24 - include/asm-ppc64/iSeries/mf.h | 5 include/asm-ppc64/iSeries/vio.h | 57 +-- 33 files changed, 1153 insertions(+), 1537 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From benh at kernel.crashing.org Wed Jun 1 18:26:30 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 01 Jun 2005 18:26:30 +1000 Subject: Booting the linux-ppc64 kernel & flattened device tree v0.4 Message-ID: <1117614390.19020.24.camel@gaston> DO NOT REPLY TO ALL LISTS PLEASE ! (and CC me on replies). Here's the fourth version of my document along with new kernel patches for the new improved flattened format, and the first release of the device-tree "compiler" tool. The patches will be posted as a reply to this email. The compiler, dtc, can be downloaded, the URL is in the document. --- Booting the Linux/ppc64 kernel without Open Firmware ---------------------------------------------------- (c) 2005 Benjamin Herrenschmidt , IBM Corp. May 18, 2005: Rev 0.1 - Initial draft, no chapter III yet. May 19, 2005: Rev 0.2 - Add chapter III and bits & pieces here or clarifies the fact that a lot of things are optional, the kernel only requires a very small device tree, though it is encouraged to provide an as complete one as possible. 
May 24, 2005: Rev 0.3 - Specify that the DT block has to be in RAM
                      - Misc fixes
                      - Define version 3 and new format version 16
                        for the DT block (version 16 needs kernel
                        patches, will be fwd separately).
                        String block now has a size, and full path
                        is replaced by unit name for more
                        compactness.
                        linux,phandle is made optional, only nodes
                        that are referenced by other nodes need it.
                        "name" property is now automatically
                        deduced from the unit name

June 1, 2005: Rev 0.4 - Correct confusion between OF_DT_END and
                        OF_DT_END_NODE in structure definition.
                      - Change version 16 format to always align
                        property data to 4 bytes. Since tokens are
                        already aligned, that means no specific
                        required alignment between property size
                        and property data. The old style variable
                        alignment would make it impossible to do
                        "simple" insertion of properties using
                        memmove (thanks Milton for noticing).
                        Updated kernel patch as well
                      - Correct a few more alignment constraints
                      - Add a chapter about the device-tree
                        compiler and the textual representation of
                        the tree that can be "compiled" by dtc.

ToDo:
        - Add some definitions of interrupt tree (simple/complex)
        - Add some definitions for pci host bridges

I - Introduction
================

During the recent developments of the Linux/ppc64 kernel, and more
specifically, the addition of new platform types outside of the old
IBM pSeries/iSeries pair, it was decided to enforce some strict rules
regarding the kernel entry and bootloader <-> kernel interfaces, in
order to avoid the degeneration that the ppc32 kernel entry point has
become, along with the way a new platform has to be added to that
kernel. The legacy iSeries platform breaks those rules as it predates
this scheme, but no new board support will be accepted in the main
tree that doesn't follow them properly.

The main requirement, which will be defined in more detail below, is
the presence of a device-tree whose format is defined after the Open
Firmware specification.
However, in order to make life easier for embedded board vendors, the
kernel doesn't require the device-tree to represent every device in
the system and only requires some nodes and properties to be present.
This will be described in detail in section III, but, for example,
the kernel does not require you to create a node for every PCI device
in the system. It is a requirement to have a node for PCI host
bridges in order to provide interrupt routing information and
memory/IO ranges, among others. It is also recommended to define
nodes for on-chip devices and other busses that don't specifically
fit in an existing OF specification. This creates great flexibility
in the way the kernel can then probe those and match drivers to
devices, without having to hard code all sorts of tables. It also
makes it easier for board vendors to do minor hardware upgrades
without significantly impacting the kernel code or cluttering it with
special cases.

1) Entry point
--------------

   There is one and only one entry point to the kernel, at the start
   of the kernel image. That entry point supports two calling
   conventions:

        a) Boot from Open Firmware. If your firmware is compatible
        with Open Firmware (IEEE 1275) or provides an OF compatible
        client interface API (support for the "interpret" callback of
        forth words isn't required), you can enter the kernel with:

              r5 : OF callback pointer as defined by IEEE 1275
              bindings to powerpc. Only the 32 bits client interface
              is currently supported

              r3, r4 : address & length of an initrd if any, or 0

              MMU is either on or off. The kernel will run the
              trampoline located in arch/ppc64/kernel/prom_init.c to
              extract the device-tree and other information from Open
              Firmware and build a flattened device-tree as described
              in b). prom_init() will then re-enter the kernel using
              the second method. This trampoline code runs in the
              context of the firmware, which is supposed to handle
              all exceptions during that time.
        b) Direct entry with a flattened device-tree block. This
        entry point is called by a) after the OF trampoline and can
        also be called directly by a bootloader that does not support
        the Open Firmware client interface. It is also used by
        "kexec" to implement "hot" booting of a new kernel from a
        previous running one. This method is what I will describe in
        more detail in this document, as method a) is simply standard
        Open Firmware, and thus should be implemented according to
        the various standard documents defining it and its binding to
        the PowerPC platform. The entry point definition then
        becomes:

                r3 : physical pointer to the device-tree block
                (defined in chapter II) in RAM

                r4 : physical pointer to the kernel itself. This is
                used by the assembly code to properly disable the MMU
                in case you are entering the kernel with MMU enabled
                and a non-1:1 mapping.

                r5 : NULL (so as to differentiate from method a)

        Note about SMP entry: Either your firmware puts your other
        CPUs in some sleep loop or spin loop in ROM where you can get
        them out via a soft reset or some other means, in which case
        you don't need to care, or you'll have to enter the kernel
        with all CPUs. The way to do that with method b) will be
        described in a later revision of this document.

2) Board support
----------------

   Board supports (platforms) are not exclusive config options. An
   arbitrary set of board supports can be built in a single kernel
   image. The kernel will know what set of functions to use for a
   given platform based on the content of the device-tree. Thus, you
   should:

        a) add your platform support as a _boolean_ option in
        arch/ppc64/Kconfig, following the example of PPC_PSERIES,
        PPC_PMAC and PPC_MAPLE. The latter is probably a good
        example of a board support to start from.

        b) create your main platform file as
        "arch/ppc64/kernel/myboard_setup.c" and add it to the
        Makefile under the condition of your CONFIG_ option.
        This file will define a structure of type "ppc_md" containing
        the various callbacks that the generic code will use to get
        to your platform specific code

        c) add a reference to your "ppc_md" structure in the
        "machines" table in arch/ppc64/kernel/setup.c

        d) request and get assigned a platform number (see PLATFORM_*
        constants in include/asm-ppc64/processor.h)

   I will describe later the boot process and various callbacks that
   your platform should implement.


II - The DT block format
========================

This chapter defines the actual format of the flattened device-tree
passed to the kernel. The actual content of it and kernel
requirements are described later. You can find examples of code
manipulating that format in various places, including
arch/ppc64/kernel/prom_init.c which will generate a flattened
device-tree from the Open Firmware representation, or the fs2dt
utility which is part of the kexec tools and which will generate one
from a filesystem representation. It is expected that a bootloader
like uboot provides a bit more support, which will be discussed later
as well.

Note: The block has to be in main memory. It has to be accessible in
both real mode and virtual mode with no other mapping than main
memory. If you are writing a simple flash bootloader, it should copy
the block to RAM before passing it to the kernel.
1) Header
---------

   The kernel is entered with r3 pointing to an area of memory that
   is roughly described in include/asm-ppc64/prom.h by the structure
   boot_param_header:

struct boot_param_header
{
        u32     magic;                  /* magic word OF_DT_HEADER */
        u32     totalsize;              /* total size of DT block */
        u32     off_dt_struct;          /* offset to structure */
        u32     off_dt_strings;         /* offset to strings */
        u32     off_mem_rsvmap;         /* offset to memory reserve map */
        u32     version;                /* format version */
        u32     last_comp_version;      /* last compatible version */

        /* version 2 fields below */
        u32     boot_cpuid_phys;        /* Which physical CPU id we're
                                           booting on */
        /* version 3 fields below */
        u32     size_dt_strings;        /* size of the strings block */
};

   Along with the constants:

/* Definitions used by the flattened device tree */
#define OF_DT_HEADER            0xd00dfeed      /* 4: version, 4: total size */
#define OF_DT_BEGIN_NODE        0x1             /* Start node: full name */
#define OF_DT_END_NODE          0x2             /* End node */
#define OF_DT_PROP              0x3             /* Property: name off, size, content */
#define OF_DT_END               0x9

   All values in this header are in big endian format. The various
   fields in this header are defined more precisely below. All
   "offset" values are in bytes from the start of the header, that
   is from the value of r3.

   - magic

     This is a magic value that "marks" the beginning of the
     device-tree block header. It contains the value 0xd00dfeed and
     is defined by the constant OF_DT_HEADER

   - totalsize

     This is the total size of the DT block including the header. The
     "DT" block should enclose all data structures defined in this
     chapter (which are pointed to by offsets in this header). That
     is, the device-tree structure, strings, and the memory reserve
     map.

   - off_dt_struct

     This is an offset from the beginning of the header to the start
     of the "structure" part of the device tree.
     (see 2) device tree)

   - off_dt_strings

     This is an offset from the beginning of the header to the start
     of the "strings" part of the device-tree

   - off_mem_rsvmap

     This is an offset from the beginning of the header to the start
     of the reserved memory map. This map is a list of pairs of 64
     bits integers. Each pair is a physical address and a size. The
     list is terminated by an entry of size 0. This map provides the
     kernel with a list of physical memory areas that are "reserved"
     and thus not to be used for memory allocations, especially
     during early initialisation. The kernel needs to allocate memory
     during boot for things like un-flattening the device-tree,
     allocating an MMU hash table, etc... Those allocations must be
     done in such a way as to avoid overwriting critical things like,
     on Open Firmware capable machines, the RTAS instance, or on some
     pSeries, the TCE tables used for the iommu. Typically, the
     reserve map should contain _at least_ this DT block itself
     (header, totalsize). If you are passing an initrd to the kernel,
     you should reserve it as well. You do not need to reserve the
     kernel image itself. The map should be 64 bits aligned.

   - version

     This is the version of this structure. Version 1 stops here.
     Version 2 adds an additional field boot_cpuid_phys. Version 3
     adds the size of the strings block, allowing the kernel to
     reallocate it easily at boot and free up the unused flattened
     structure after expansion. Version 16 introduces a new, more
     "compact" format for the tree itself that is however not
     backward compatible. You should always generate a structure of
     the highest version defined at the time of your implementation.
     Currently that is version 16, unless you explicitly aim at
     being backward compatible

   - last_comp_version

     Last compatible version. This indicates down to what version of
     the DT block you are backward compatible.
     For example, version 2 is backward compatible with version 1
     (that is, a kernel built for version 1 will be able to boot
     with a version 2 format). You should put a 1 in this field if
     you generate a device tree of version 1 to 3, or 0x10 if you
     generate a tree of version 0x10 using the new unit name format.

   - boot_cpuid_phys

     This field only exists on version 2 (and later) headers. It
     indicates which physical CPU ID is calling the kernel entry
     point. This is used, among others, by kexec. If you are on an
     SMP system, this value should match the content of the "reg"
     property of the CPU node in the device-tree corresponding to
     the CPU calling the kernel entry point (see further chapters
     for more information on the required device-tree contents)

   So the typical layout of a DT block (though the various parts
   don't need to be in that order) looks like this (addresses go
   from top to bottom):

             ------------------------------
       r3 -> |  struct boot_param_header  |
             ------------------------------
             |      (alignment gap) (*)   |
             ------------------------------
             |      memory reserve map    |
             ------------------------------
             |      (alignment gap)       |
             ------------------------------
             |                            |
             |    device-tree structure   |
             |                            |
             ------------------------------
             |      (alignment gap)       |
             ------------------------------
             |                            |
             |     device-tree strings    |
             |                            |
      -----> ------------------------------
      |
      |
      --- (r3 + totalsize)

  (*) The alignment gaps are not necessarily present; their presence
      and size are dependent on the various alignment requirements
      of the individual data blocks.

2) Device tree generalities
---------------------------

This device-tree itself is separated into two different blocks, a
structure block and a strings block. Both need to be aligned to a 4
bytes boundary.

First, let's quickly describe the device-tree concept before
detailing the storage format. This chapter does _not_ describe the
details of the required types of nodes & properties for the kernel;
this is done later in chapter III.
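
As an aside, the header fields of section 1) and the alignment rules
stated so far (4 bytes for the structure and strings blocks, 64 bits
for the reserve map) can be sketched in C as below. This is only an
illustration under stated assumptions, not code taken from the kernel
or from any bootloader: the names dt_header_info, be32_at() and
dt_check_header(), as well as the negative return codes, are made up
for this example. It only looks at the seven fields that every header
version has.

```c
#include <stddef.h>
#include <stdint.h>

#define OF_DT_HEADER 0xd00dfeedu

/* Host-endian copy of the version 1 header fields (hypothetical type). */
struct dt_header_info {
	uint32_t totalsize;
	uint32_t off_dt_struct;
	uint32_t off_dt_strings;
	uint32_t off_mem_rsvmap;
	uint32_t version;
	uint32_t last_comp_version;
};

/* Read one big-endian 32 bits value, whatever the host endianness is. */
static uint32_t be32_at(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: decode and sanity-check a DT block header.
 * Returns 0 on success, or a negative value chosen for this sketch:
 *   -1 bad magic or block too short, -2 totalsize larger than buffer,
 *   -3 an offset points outside the block, -4 bad alignment.
 */
static int dt_check_header(const unsigned char *blob, size_t len,
			   struct dt_header_info *out)
{
	/* The version 1 header is 7 words, that is 28 bytes. */
	if (len < 28 || be32_at(blob) != OF_DT_HEADER)
		return -1;

	out->totalsize         = be32_at(blob + 4);
	out->off_dt_struct     = be32_at(blob + 8);
	out->off_dt_strings    = be32_at(blob + 12);
	out->off_mem_rsvmap    = be32_at(blob + 16);
	out->version           = be32_at(blob + 20);
	out->last_comp_version = be32_at(blob + 24);

	if (out->totalsize > len)
		return -2;
	if (out->off_dt_struct >= out->totalsize ||
	    out->off_dt_strings >= out->totalsize ||
	    out->off_mem_rsvmap >= out->totalsize)
		return -3;
	/* structure and strings: 4 bytes aligned, reserve map: 8 bytes */
	if (out->off_dt_struct % 4 || out->off_dt_strings % 4 ||
	    out->off_mem_rsvmap % 8)
		return -4;
	return 0;
}
```

A bootloader could run such a check right before jumping to the
kernel. The offsets are tested relative to the start of the header,
which assumes that the header itself sits at a suitably aligned
address.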
The device-tree layout is strongly inherited from the definition of
the Open Firmware IEEE 1275 device-tree. It's basically a tree of
nodes, each node having two or more named properties. A property can
have a value or not.

It is a tree, so each node has one and only one parent except for
the root node which has no parent.

A node has 2 names. The actual node name is generally contained in a
property of type "name" in the node property list whose value is a
zero terminated string and is mandatory for version 1 to 3 of the
format definition (as it is in Open Firmware). Version 0x10 makes it
optional as it can generate it from the unit name defined below.

There is also a "unit name" that is used to differentiate nodes with
the same name at the same level. It is usually made of the node
name, the "@" sign, and a "unit address", whose definition is
specific to the bus type the node sits on.

The unit name doesn't exist as a property per se but is included in
the device-tree structure. It is typically used to represent a
"path" in the device-tree. More details about the actual format of
these will be given below.

The kernel ppc64 generic code does not make any formal use of the
unit address (though some board support code may do) so the only
real requirement here for the unit address is to ensure uniqueness
of the node unit name at a given level of the tree. Nodes with no
notion of address and no possible sibling of the same name (like
/memory or /cpus) may omit the unit address in the context of this
specification, or use the "@0" default unit address. The unit name
is used to define a node "full path", which is the concatenation of
all parent nodes' unit names separated with "/".

The root node doesn't have a defined name, and isn't required to
have a name property either if you are using version 3 or earlier of
the format. It also has no unit address (no @ symbol followed by a
unit address). The root node unit name is thus an empty string.
The full path to the root node is "/".

Every node which represents an actual device (that is, which isn't
only a virtual "container" for more nodes, like "/cpus" is) is also
required to have a "device_type" property indicating the type of
node.

Finally, every node that can be referenced from a property in
another node is required to have a "linux,phandle" property. Real
Open Firmware implementations do provide a unique "phandle" value
for every node that the "prom_init()" trampoline code turns into
"linux,phandle" properties. However, this is made optional if the
flattened device tree is used directly. An example of a node
referencing another node via "phandle" is when laying out the
interrupt tree, which will be described in a further version of this
document.

This property is a 32 bits value that uniquely identifies a node.
You are free to use whatever values or system of values, internal
pointers, or whatever to generate these, the only requirement is
that every node for which you provide that property has a unique
value for it.

Here is an example of a simple device-tree. In this example, an "o"
designates a node followed by the node unit name. Properties are
presented with their name followed by their content. "content"
represents an ASCII string (zero terminated) value, while <content>
represents a 32 bits hexadecimal value. The various nodes in this
example will be discussed in a later chapter. At this point, it is
only meant to give you an idea of what a device-tree looks like. I
have purposefully kept the "name" and "linux,phandle" properties,
which aren't necessary, in order to give you a better idea of what
the tree looks like in practice.
  / o device-tree
      |- name = "device-tree"
      |- model = "MyBoardName"
      |- compatible = "MyBoardFamilyName"
      |- #address-cells = <2>
      |- #size-cells = <2>
      |- linux,phandle = <0>
      |
      o cpus
      | | - name = "cpus"
      | | - linux,phandle = <1>
      | | - #address-cells = <1>
      | | - #size-cells = <0>
      | |
      | o PowerPC,970@0
      |   |- name = "PowerPC,970"
      |   |- device_type = "cpu"
      |   |- reg = <0>
      |   |- clock-frequency = <5f5e1000>
      |   |- linux,boot-cpu
      |   |- linux,phandle = <2>
      |
      o memory@0
      | |- name = "memory"
      | |- device_type = "memory"
      | |- reg = <00000000 00000000 00000000 20000000>
      | |- linux,phandle = <3>
      |
      o chosen
        |- name = "chosen"
        |- bootargs = "root=/dev/sda2"
        |- linux,platform = <00000600>
        |- linux,phandle = <4>

This tree is almost a minimal tree. It pretty much contains the
minimal set of required nodes and properties to boot a linux kernel,
that is some basic model information at the root, the CPUs, the
physical memory layout, and misc information passed through /chosen
like, in this example, the platform type (mandatory) and the kernel
command line arguments (optional).

The /cpus/PowerPC,970@0/linux,boot-cpu property is an example of a
property without a value. All other properties have a value. The
meaning of the #address-cells and #size-cells properties will be
explained in chapter III, which defines precisely the required nodes
and properties and their content.

3) Device tree "structure" block

The structure of the device tree is a linearized tree structure. The
"OF_DT_BEGIN_NODE" token starts a new node, and the "OF_DT_END_NODE"
token ends that node definition. Child nodes are simply defined
before "OF_DT_END_NODE" (that is, nodes within the node). A 'token'
is a 32 bits value. The tree has to be "finished" with an OF_DT_END
token.

Here's the basic structure of a single node:

     * token OF_DT_BEGIN_NODE (that is 0x00000001)
     * for version 1 to 3, this is the node full path as a zero
       terminated string, starting with "/".
       For version 16 and later, this is the node unit name only
       (or an empty string for the root node)
     * [align gap to next 4 bytes boundary]
     * for each property:
        * token OF_DT_PROP (that is 0x00000003)
        * 32 bits value of property value size in bytes (or 0 if no
          value)
        * 32 bits value of offset in string block of property name
        * property value data if any
        * [align gap to next 4 bytes boundary]
     * [child nodes if any]
     * token OF_DT_END_NODE (that is 0x00000002)

So the node content can be summarised as a start token, a full path
(or a unit name for version 16), a list of properties, a list of
child nodes and an end token. Every child node is a full node
structure itself as defined above.

4) Device tree "strings" block

In order to save space, property names, which are generally
redundant, are stored separately in the "strings" block. This block
is simply the whole bunch of zero terminated strings for all
property names concatenated together. The device-tree property
definitions in the structure block will contain offset values from
the beginning of the strings block.


III - Required content of the device tree
=========================================

WARNING: All "linux,*" properties defined in this document apply
only to a flattened device-tree. If your platform uses a real
implementation of Open Firmware or an implementation compatible with
the Open Firmware client interface, those properties will be created
by the trampoline code in the kernel's prom_init() file. For
example, that's where you'll have to add code to detect your board
model and set the platform number. However, when using the flattened
device-tree entry point, there is no prom_init() pass, and thus you
have to provide those properties yourself.


1) Note about cells and address representation
----------------------------------------------

The general rule is documented in the various Open Firmware
documents. If you choose to describe a bus with the device-tree and
there exists an OF bus binding, then you should follow the
specification.
However, the kernel does not require every single device or bus to
be described by the device tree.

In general, the format of an address for a device is defined by the
parent bus type, based on the #address-cells and #size-cells
properties. In the absence of such a property, the parent's parent
values are used, etc... The kernel requires the root node to have
those properties defining the address format for devices directly
mapped on the processor bus.

Those 2 properties define 'cells' for representing an address and a
size. A "cell" is a 32 bits number. For example, if both contain 2
like in the example tree given above, then an address and a size are
both composed of 2 cells, that is a 64 bits number (cells are
concatenated and expected to be in big endian format). Another
example is the way Apple firmware defines them, that is 2 cells for
an address and one cell for a size.

A device's IO or MMIO areas on the bus are defined in the "reg"
property. The format of this property depends on the bus the device
is sitting on. Standard bus types define their "reg" property format
in the various OF bindings for those bus types. You are free to
define your own "reg" format for proprietary busses or virtual
busses enclosing on-chip devices, though it is recommended that the
parts of the "reg" property containing addresses and sizes do
respect the defined #address-cells and #size-cells when those make
sense.

Later, I will define more precisely some common address formats.

For a new ppc64 board, I recommend using either the 2/2 format or
Apple's 2/1 format, which is slightly more compact since sizes
usually fit in a single 32 bits word.


2) Note about "compatible" properties
-------------------------------------

Those properties are optional, but recommended in devices and the
root node. The format of a "compatible" property is a list of
concatenated zero terminated strings.
They allow a device to express its compatibility with a family of
similar devices, in some cases allowing a single driver to match
against several devices regardless of their actual names.

3) Note about "name" properties
-------------------------------

While earlier users of Open Firmware like OldWorld macintoshes
tended to use the actual device name for the "name" property, it's
nowadays considered good practice to use a name that is closer to
the device class (often equal to device_type). For example,
nowadays, ethernet controllers are named "ethernet", with an
additional "model" property defining precisely the chip type/model,
and a "compatible" property defining the family in case a single
driver can drive more than one of these chips. The kernel however
doesn't generally put any restriction on the "name" property; it is
simply considered good practice to follow the standard and its
evolutions as closely as possible.

Note also that the new format version 16 makes the "name" property
optional. If it's absent for a node, then the node's unit name is
used to reconstruct the name. That is, the part of the unit name
before the "@" sign is used (or the entire unit name if no "@" sign
is present).

4) Note about node and property names and character set
-------------------------------------------------------

While Open Firmware provides more flexible usage of 8859-1, this
specification enforces stricter rules. Node and property names
should be comprised only of the ASCII characters 'a' to 'z', '0' to
'9', ',', '.', '_', '+', '#', '?', and '-'. Node names additionally
allow uppercase characters 'A' to 'Z' (property names should be
lowercase; the fact that vendors like Apple don't respect this rule
is irrelevant here). Additionally, node and property names should
always begin with a character in the range 'a' to 'z' (or 'A' to 'Z'
for node names).

The maximum number of characters for both node and property names is
31.
In the case of node names, this is only the leftmost part of a unit
name (the pure "name" property); it doesn't include the unit
address, which can extend beyond that limit.


5) Required nodes and properties
--------------------------------

  a) The root node

  The root node requires some properties to be present:

    - model : this is your board name/model
    - #address-cells : address representation for "root" devices
    - #size-cells : the size representation for "root" devices

  Additionally, some recommended properties are:

    - compatible : the board "family" generally finds its way here,
      for example, if you have 2 board models with a similar layout,
      that typically get driven by the same platform code in the
      kernel, you would use a different "model" property but put a
      value in "compatible". The kernel doesn't directly use that
      value (see /chosen/linux,platform for how the kernel chooses a
      platform type) but it is generally useful.

  It's also generally where you add additional properties specific
  to your board like the serial number if any, that sort of thing.
  It is recommended that if you add any "custom" property whose name
  may clash with standard defined ones, you prefix it with your
  vendor name and a comma.

  b) The /cpus node

  This node is the parent of all individual CPU nodes. It doesn't
  have any specific requirements, though it's generally good
  practice to have at least:

               #address-cells = <00000001>
               #size-cells    = <00000000>

  This defines that the "address" for a CPU is a single cell, and
  has no meaningful size. This is not necessary but the kernel will
  assume that format when reading the "reg" properties of a CPU
  node, see below

  c) The /cpus/* nodes

  So under /cpus, you are supposed to create a node for every CPU on
  the machine. There is no specific restriction on the name of the
  CPU, though it's common practice to call it PowerPC,<name>; for
  example, Apple uses PowerPC,G5 while IBM uses PowerPC,970FX.
  Required properties:

    - device_type : has to be "cpu"
    - reg : This is the physical cpu number, a single 32 bits cell.
      It is also used as-is as the unit number for constructing the
      unit name in the full path. For example, with 2 CPUs, you
      would have the full paths:
        /cpus/PowerPC,970FX@0
        /cpus/PowerPC,970FX@1
      (unit addresses are not required to have leading zeros)
    - d-cache-line-size : one cell, L1 data cache line size in bytes
    - i-cache-line-size : one cell, L1 instruction cache line size
      in bytes
    - d-cache-size : one cell, size of L1 data cache in bytes
    - i-cache-size : one cell, size of L1 instruction cache in bytes

  Recommended properties:

    - timebase-frequency : a cell indicating the frequency of the
      timebase in Hz. This is not directly used by the generic code,
      but you are welcome to copy/paste the pSeries code for setting
      the kernel timebase/decrementer calibration based on this
      value.
    - clock-frequency : a cell indicating the CPU core clock
      frequency in Hz. A new property will be defined for 64 bits
      values, but if your frequency is < 4GHz, one cell is enough.
      Here as well as for the above, the common code doesn't use
      that property, but you are welcome to re-use the pSeries or
      Maple one. A future kernel version might provide a common
      function for this.

  You are welcome to add any property you find relevant to your
  board, like some information about the mechanism used to
  soft-reset the CPUs. For example, Apple puts the GPIO number for
  CPU soft reset lines in there as a "soft-reset" property, as they
  start secondary CPUs by soft-resetting them.

  d) the /memory node(s)

  To define the physical memory layout of your board, you should
  create one or more memory node(s). You can either create a single
  node with all memory ranges in its reg property, or you can create
  several nodes, as you wish. The unit address (@ part) used for the
  full path is the address of the first range of memory defined by a
  given node.
  If you use a single memory node, this will typically be @0.

  Required properties:

    - device_type : has to be "memory"
    - reg : This property contains all the physical memory ranges of
      your board. It's a list of addresses/sizes concatenated
      together, the number of cells of those being defined by the
      #address-cells and #size-cells of the root node. For example,
      with both of these properties being 2 like in the example
      given earlier, a 970 based machine with 6GB of RAM could
      typically have a "reg" property here that looks like:

      00000000 00000000 00000000 80000000
      00000001 00000000 00000001 00000000

      That is a range starting at 0 of 0x80000000 bytes and a range
      starting at 0x100000000 and of 0x100000000 bytes. You can see
      that there is no memory covering the IO hole between 2GB and
      4GB. Some vendors prefer splitting those ranges into smaller
      segments; the kernel doesn't care.

  e) The /chosen node

  This node is a bit "special". Normally, that's where Open Firmware
  puts some variable environment information, like the arguments, or
  phandle pointers to nodes like the main interrupt controller, or
  the default input/output devices.

  This specification makes a few of these mandatory, but also
  defines some linux specific properties that would normally be
  constructed by the prom_init() trampoline when booting with an OF
  client interface, but that you have to provide yourself when using
  the flattened format.

  Required properties:

    - linux,platform : This is your platform number as assigned by
      the architecture maintainers

  Recommended properties:

    - bootargs : This zero terminated string is passed as the kernel
      command line
    - linux,stdout-path : This is the full path to your standard
      console device if any. Typically, if you have serial devices
      on your board, you may want to put the full path to the one
      set as the default console in the firmware here, for the
      kernel to pick it up as its own default console.
      If you look at the function set_preferred_console() in
      arch/ppc64/kernel/setup.c, you'll see that the kernel tries to
      find out the default console and has knowledge of various
      types like 8250 serial ports. You may want to extend this
      function to add your own.
    - interrupt-controller : This is one cell containing a phandle
      value that matches the "linux,phandle" property of your main
      interrupt controller node. May be used for interrupt routing.

  This is all that is currently required. However, it is strongly
  recommended that you expose PCI host bridges as documented in the
  PCI binding to Open Firmware, and your interrupt tree as
  documented in the OF interrupt tree specification.


IV - "dtc", the device tree compiler
====================================

dtc source code can be found at

WARNING: This version is still in an early development stage, the
resulting device-tree "blobs" have not yet been validated with the
kernel. The currently generated block lacks a useful reserve map (it
will be fixed to generate an empty one, it's up to the bootloader to
fill it up) among others. The error handling needs work, bugs are
lurking, etc...

dtc basically takes a device-tree in a given format and outputs a
device-tree in another format. The currently supported formats are:

  Input formats:
  -------------

     - "dtb": "blob" format, that is a flattened device-tree block
       with header all in a binary blob.
     - "dts": "source" format. This is a text file containing a
       "source" for a device-tree. The format is defined later in
       this chapter.
     - "fs" format. This is a representation equivalent to the
       output of /proc/device-tree, that is nodes are directories
       and properties are files

  Output formats:
  ---------------

     - "dtb": "blob" format
     - "dts": "source" format
     - "asm": assembly language file. This is a file that can be
       sourced by gas to generate a device-tree "blob". That file
       can then simply be added to your Makefile.
    Additionally, the assembly file exports some symbols that can be
    used.

The syntax of the dtc tool is

    dtc [-I <input-format>] [-O <output-format>]
        [-o output-filename] [-V output_version] input_filename

The "output_version" defines what version of the "blob" format will
be generated. Supported versions are 1, 2, 3 and 16. The default is
currently version 3 but that may change in the future to version 16.

Additionally, dtc performs various sanity checks on the tree, like
the uniqueness of linux,phandle properties, validity of strings,
etc...

The format of the .dts "source" file is "C" like and supports C and
C++ style comments.

/ {
}

The above is the "device-tree" definition. It's the only statement
supported currently at the toplevel.

/ {
  property1 = "string_value";   /* define a property containing a 0
                                 * terminated string
                                 */

  property2 = <1234abcd>;       /* define a property containing a
                                 * numerical 32-bit value (hexadecimal)
                                 */

  property3 = <12345678 12345678 deadbeef>;
                                /* define a property containing 3
                                 * numerical 32-bit values (cells) in
                                 * hexadecimal
                                 */

  property4 = [0a 0b 0c 0d de ea ad be ef];
                                /* define a property whose content is
                                 * an arbitrary array of bytes
                                 */

  childnode@address {           /* define a child node named "childnode"
                                 * whose unit name is "childnode at address"
                                 */

    childprop = "hello\n";      /* define a property "childprop" of
                                 * childnode (in this case, a string)
                                 */
  };
};

Nodes can contain other nodes etc... thus defining the hierarchical
structure of the tree.

Strings support common escape sequences from C: "\n", "\t", "\r",
"\(octal value)", "\x(hex value)".

It is also suggested that you pipe your source file through cpp (gcc
preprocessor) so you can use #include's, #define for constants,
etc...
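To make the cell syntax above a bit more concrete, here is a minimal
C sketch (it is not dtc code) that turns the text of a cell list into
the bytes a blob would carry, and reads them back as multi-cell
numbers. It assumes cells are stored big-endian, as they are in the
flattened format on ppc64; the helper names cells_to_bytes() and
read_cells() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of dtc: convert the text found between
 * '<' and '>' in a .dts cell list into blob bytes. Each cell is one
 * 32-bit hexadecimal value; this sketch assumes big-endian storage.
 * Returns the number of bytes written, or -1 on a parse error or if
 * the output buffer is too small.
 */
static long cells_to_bytes(const char *src, unsigned char *out, size_t cap)
{
	size_t n = 0;

	assert(src != NULL && out != NULL);
	for (;;) {
		char *end;
		unsigned long v;

		/* skip separators between cells */
		while (*src == ' ' || *src == '\t' || *src == '\n')
			src++;
		if (*src == '\0')
			break;

		v = strtoul(src, &end, 16);
		if (end == src || v > 0xfffffffful || n + 4 > cap)
			return -1;

		/* most significant byte first */
		out[n++] = (unsigned char)(v >> 24);
		out[n++] = (unsigned char)(v >> 16);
		out[n++] = (unsigned char)(v >> 8);
		out[n++] = (unsigned char)v;
		src = end;
	}
	return (long)n;
}

/* Read ncells consecutive big-endian cells as one number (ncells <= 2),
 * the way an address or size is read when #address-cells or
 * #size-cells is 2.
 */
static uint64_t read_cells(const unsigned char *p, int ncells)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < ncells * 4; i++)
		v = (v << 8) | p[i];
	return v;
}
```

Fed the eight cells of the "reg" example given for the memory node,
cells_to_bytes() produces 32 bytes, and read_cells() with 2 cells per
value returns 0, 0x80000000, 0x100000000 and 0x100000000 for the two
address/size pairs.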
Finally, various options are planned but not yet implemented, like
automatic generation of phandles, labels (exported to the asm file so
you can point to a property content and change it easily from
whatever you link the device-tree with), label or path instead of
numeric value in some cells to "point" to a node (replaced by a
phandle at compile time), export of reserve map address to the asm
file, ability to specify reserve map content at compile time, etc...

We may provide a .h include file with common definitions if that
proves useful for some properties (like building PCI properties or
interrupt maps), though it may be better to add a notion of struct
definitions to the compiler...

V - Recommendation for a bootloader
===================================

Here are some various ideas/recommendations that have been proposed
while all this has been defined and implemented.

  - The bootloader may want to be able to use the device-tree itself
    and may want to manipulate it (to add/edit some properties, like
    physical memory size or kernel arguments). At this point, 2
    choices can be made. Either the bootloader works directly on the
    flattened format, or the bootloader has its own internal tree
    representation with pointers (similar to the kernel one) and
    re-flattens the tree when booting the kernel. The former is a bit
    more difficult to edit/modify, the latter probably requires a bit
    more code to handle the tree structure. Note that the structure
    format has been designed so it's relatively easy to "insert"
    properties or nodes or delete them by just memmove'ing things
    around. It contains no internal offsets or pointers for this
    purpose.

  - An example of code for iterating nodes & retrieving properties
    directly from the flattened tree format can be found in the
    kernel file arch/ppc64/kernel/prom.c: look at the scan_flat_dt()
    function, its usage in early_init_devtree(), and the
    corresponding various early_init_dt_scan_*() callbacks.
    That code can be re-used in a GPL bootloader, and as the author
    of that code, I would be happy to discuss possible free licensing
    to any vendor who wishes to integrate all or part of this code
    into a non-GPL bootloader.

From benh at kernel.crashing.org Wed Jun 1 18:28:03 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 01 Jun 2005 18:28:03 +1000
Subject: Booting the linux-ppc64 kernel & flattened device tree v0.4
In-Reply-To: <1117614390.19020.24.camel@gaston>
References: <1117614390.19020.24.camel@gaston>
Message-ID: <1117614484.19020.27.camel@gaston>

Here is the kernel patch. It applies on top of the various
prom_init.c bug fixes that I already posted today on the linuxppc-dev
& linuxppc64-dev lists (those will be in the next -mm and maybe in
2.6.12). This patch is intended to hit upstream by 2.6.13

Index: linux-work/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom_init.c 2005-06-01 16:02:28.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom_init.c 2005-06-01 16:07:21.000000000 +1000 @@ -1514,7 +1514,14 @@ return 0; } -static void __init scan_dt_build_strings(phandle node, unsigned long *mem_start, +/* + * The Open Firmware 1275 specification states properties must be 31 bytes or + * less, however not all firmwares obey this. Make it 64 bytes to be safe. + */ +#define MAX_PROPERTY_NAME 64 + +static void __init scan_dt_build_strings(phandle node, + unsigned long *mem_start, unsigned long *mem_end) { unsigned long offset = reloc_offset(); @@ -1527,14 +1534,19 @@ /* get and store all property names */ prev_name = RELOC(""); for (;;) { - - /* 32 is max len of name including nul.
*/ - namep = make_room(mem_start, mem_end, 32, 1); + namep = make_room(mem_start, mem_end, MAX_PROPERTY_NAME, 1); if (call_prom("nextprop", 3, 1, node, prev_name, namep) <= 0) { /* No more nodes: unwind alloc */ *mem_start = (unsigned long)namep; break; } + + /* skip "name" */ + if (strcmp(namep, RELOC("name")) == 0) { + *mem_start = (unsigned long)namep; + prev_name = RELOC("name"); + continue; + } soff = dt_find_string(namep); if (soff != 0) { *mem_start = (unsigned long)namep; @@ -1555,72 +1567,83 @@ } } -/* - * The Open Firmware 1275 specification states properties must be 31 bytes or - * less, however not all firmwares obey this. Make it 64 bytes to be safe. - */ -#define MAX_PROPERTY_NAME 64 - static void __init scan_dt_build_struct(phandle node, unsigned long *mem_start, unsigned long *mem_end) { - int l, align; phandle child; - char *namep, *prev_name, *sstart, *p, *ep; + char *namep, *prev_name, *sstart, *p, *ep, *lp, *path; unsigned long soff; unsigned char *valp; unsigned long offset = reloc_offset(); - char pname[MAX_PROPERTY_NAME]; - char *path; - - path = RELOC(prom_scratch); + static char pname[MAX_PROPERTY_NAME] __initdata; + int l; dt_push_token(OF_DT_BEGIN_NODE, mem_start, mem_end); - /* get the node's full name */ + /* get the node's full name for debugging */ + path = RELOC(prom_scratch); + memset(path, 0, PROM_SCRATCH_SIZE); + call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-1); + prom_debug(" %s\n", path); + + /* get the node's full name for actual use */ namep = (char *)*mem_start; l = call_prom("package-to-path", 3, 1, node, namep, *mem_end - *mem_start); if (l >= 0) { + int had_fixup = 0; + /* Didn't fit? Get more room. 
*/ if (l+1 > *mem_end - *mem_start) { namep = make_room(mem_start, mem_end, l+1, 1); call_prom("package-to-path", 3, 1, node, namep, l); } - namep[l] = '\0'; - /* Fixup an Apple bug where they have bogus \0 chars in the - * middle of the path in some properties - */ - for (p = namep, ep = namep + l; p < ep; p++) - if (*p == '\0') { - memmove(p, p+1, ep - p); - ep--; l--; - } - *mem_start = _ALIGN(((unsigned long) namep) + strlen(namep) + 1, 4); + ep = namep + l; + *ep = '\0'; + /* now try to find the unit name in that mess */ + for (p = namep, lp = NULL; p < ep; p++) { + if (*p == '/') + lp = p + 1; + /* bug fix: apple's OF has a funny bug where they have + * a '\0' in the name/path string of some nodes. + * We fix that up here + */ + if (*p == '\0') { + memmove(p, p+1, ep - p); + ep--; l--; + had_fixup = 1; + } + } + if (had_fixup) + prom_printf("fixed up bogus name for %s\n", namep); + if (lp != NULL) + memmove(namep, lp, strlen(lp) + 1); + *mem_start = _ALIGN(((unsigned long) namep) + + strlen(namep) + 1, 4); } - /* get it again for debugging */ - memset(path, 0, PROM_SCRATCH_SIZE); - call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-1); - /* get and store all properties */ prev_name = RELOC(""); sstart = (char *)RELOC(dt_string_start); for (;;) { - if (call_prom("nextprop", 3, 1, node, prev_name, pname) <= 0) - break; - - /* find string offset */ - soff = dt_find_string(pname); + if (call_prom("nextprop", 3, 1, node, prev_name, + RELOC(pname)) <= 0) + break; + if (strcmp(RELOC(pname), RELOC("name")) == 0) { + prev_name = RELOC("name"); + continue; + } + /* find string offset */ + soff = dt_find_string(RELOC(pname)); if (soff == 0) { - prom_printf("WARNING: Can't find string index for <%s>, node %s\n", - pname, path); + prom_printf("WARNING: Can't find string index " + "for <%s>, node %s\n", RELOC(pname), path); break; } prev_name = sstart + soff; /* get length */ - l = call_prom("getproplen", 2, 1, node, pname); + l = call_prom("getproplen", 2, 
1, node, RELOC(pname)); /* sanity checks */ if (l < 0) @@ -1629,7 +1652,7 @@ prom_printf("WARNING: ignoring large property "); /* It seems OF doesn't null-terminate the path :-( */ prom_printf("[%s] ", path); - prom_printf("%s length 0x%x\n", pname, l); + prom_printf("%s length 0x%x\n", RELOC(pname), l); continue; } @@ -1639,17 +1662,16 @@ dt_push_token(soff, mem_start, mem_end); /* push property content */ - align = (l >= 8) ? 8 : 4; - valp = make_room(mem_start, mem_end, l, align); - call_prom("getprop", 4, 1, node, pname, valp, l); + valp = make_room(mem_start, mem_end, l, 4); + call_prom("getprop", 4, 1, node, RELOC(pname), valp, l); *mem_start = _ALIGN(*mem_start, 4); } /* Add a "linux,phandle" property. */ soff = dt_find_string(RELOC("linux,phandle")); if (soff == 0) - prom_printf("WARNING: Can't find string index for " - " node %s\n", path); + prom_printf("WARNING: Can't find string index for" + " node %s\n", path); else { dt_push_token(OF_DT_PROP, mem_start, mem_end); dt_push_token(4, mem_start, mem_end); @@ -1699,7 +1721,8 @@ /* Build header and make room for mem rsv map */ mem_start = _ALIGN(mem_start, 4); - hdr = make_room(&mem_start, &mem_end, sizeof(struct boot_param_header), 4); + hdr = make_room(&mem_start, &mem_end, + sizeof(struct boot_param_header), 4); RELOC(dt_header_start) = (unsigned long)hdr; rsvmap = make_room(&mem_start, &mem_end, sizeof(mem_reserve_map), 8); @@ -1712,11 +1735,11 @@ namep = make_room(&mem_start, &mem_end, 16, 1); strcpy(namep, RELOC("linux,phandle")); mem_start = (unsigned long)namep + strlen(namep) + 1; - RELOC(dt_string_end) = mem_start; /* Build string array */ prom_printf("Building dt strings...\n"); scan_dt_build_strings(root, &mem_start, &mem_end); + RELOC(dt_string_end) = mem_start; /* Build structure */ mem_start = PAGE_ALIGN(mem_start); @@ -1731,9 +1754,11 @@ hdr->totalsize = RELOC(dt_struct_end) - RELOC(dt_header_start); hdr->off_dt_struct = RELOC(dt_struct_start) - RELOC(dt_header_start); hdr->off_dt_strings = 
RELOC(dt_string_start) - RELOC(dt_header_start); + hdr->dt_strings_size = RELOC(dt_string_end) - RELOC(dt_string_start); hdr->off_mem_rsvmap = ((unsigned long)rsvmap) - RELOC(dt_header_start); hdr->version = OF_DT_VERSION; - hdr->last_comp_version = 1; + /* Version 16 is not backward compatible */ + hdr->last_comp_version = 0x10; /* Reserve the whole thing and copy the reserve map in, we * also bump mem_reserve_cnt to cause further reservations to @@ -1788,6 +1813,9 @@ /* does it need fixup ? */ if (prom_getproplen(i2c, "interrupts") > 0) return; + + prom_printf("fixing up bogus interrupts for u3 i2c...\n"); + /* interrupt on this revision of u3 is number 0 and level */ interrupts[0] = 0; interrupts[1] = 1; Index: linux-work/arch/ppc64/kernel/setup.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/setup.c 2005-06-01 16:02:28.000000000 +1000 +++ linux-work/arch/ppc64/kernel/setup.c 2005-06-01 16:07:21.000000000 +1000 @@ -10,7 +10,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-work/arch/ppc64/kernel/prom.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom.c 2005-06-01 16:02:28.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom.c 2005-06-01 16:07:21.000000000 +1000 @@ -15,7 +15,7 @@ * 2 of the License, or (at your option) any later version. 
*/ -#undef DEBUG +#define DEBUG #include #include @@ -635,26 +635,32 @@ * unflatten the tree */ static int __init scan_flat_dt(int (*it)(unsigned long node, - const char *full_path, void *data), + const char *uname, int depth, void *data), void *data) { unsigned long p = ((unsigned long)initial_boot_params) + initial_boot_params->off_dt_struct; int rc = 0; + int depth = -1; do { u32 tag = *((u32 *)p); char *pathp; p += 4; - if (tag == OF_DT_END_NODE) + if (tag == OF_DT_END_NODE) { + depth --; + continue; + } + if (tag == OF_DT_NOP) continue; if (tag == OF_DT_END) break; if (tag == OF_DT_PROP) { u32 sz = *((u32 *)p); p += 8; - p = _ALIGN(p, sz >= 8 ? 8 : 4); + if (initial_boot_params->version < 0x10) + p = _ALIGN(p, sz >= 8 ? 8 : 4); p += sz; p = _ALIGN(p, 4); continue; @@ -664,9 +670,18 @@ " device tree !\n", tag); return -EINVAL; } + depth++; pathp = (char *)p; p = _ALIGN(p + strlen(pathp) + 1, 4); - rc = it(p, pathp, data); + if ((*pathp) == '/') { + char *lp, *np; + for (lp = NULL, np = pathp; *np; np++) + if ((*np) == '/') + lp = np+1; + if (lp != NULL) + pathp = lp; + } + rc = it(p, pathp, depth, data); if (rc != 0) break; } while(1); @@ -689,13 +704,16 @@ const char *nstr; p += 4; + if (tag == OF_DT_NOP) + continue; if (tag != OF_DT_PROP) return NULL; sz = *((u32 *)p); noff = *((u32 *)(p + 4)); p += 8; - p = _ALIGN(p, sz >= 8 ? 8 : 4); + if (initial_boot_params->version < 0x10) + p = _ALIGN(p, sz >= 8 ? 
8 : 4); nstr = find_flat_dt_string(noff); if (nstr == NULL) { @@ -713,7 +731,7 @@ } static void *__init unflatten_dt_alloc(unsigned long *mem, unsigned long size, - unsigned long align) + unsigned long align) { void *res; @@ -727,13 +745,16 @@ static unsigned long __init unflatten_dt_node(unsigned long mem, unsigned long *p, struct device_node *dad, - struct device_node ***allnextpp) + struct device_node ***allnextpp, + unsigned long fpsize) { struct device_node *np; struct property *pp, **prev_pp = NULL; char *pathp; u32 tag; - unsigned int l; + unsigned int l, allocl; + int has_name = 0; + int new_format = 0; tag = *((u32 *)(*p)); if (tag != OF_DT_BEGIN_NODE) { @@ -742,21 +763,62 @@ } *p += 4; pathp = (char *)*p; - l = strlen(pathp) + 1; + l = allocl = strlen(pathp) + 1; *p = _ALIGN(*p + l, 4); - np = unflatten_dt_alloc(&mem, sizeof(struct device_node) + l, + /* version 0x10 has a more compact unit name here instead of the full + * path. we accumulate the full path size using "fpsize", we'll rebuild + * it later. We detect this because the first character of the name is + * not '/'. + */ + if ((*pathp) != '/') { + new_format = 1; + if (fpsize == 0) { + /* root node: special case. fpsize accounts for path + * plus terminating zero. 
root node only has '/', so + * fpsize should be 2, but we want to avoid the first + * level nodes to have two '/' so we use fpsize 1 here + */ + fpsize = 1; + allocl = 2; + } else { + /* account for '/' and path size minus terminal 0 + * already in 'l' + */ + fpsize += l; + allocl = fpsize; + } + } + + + np = unflatten_dt_alloc(&mem, sizeof(struct device_node) + allocl, __alignof__(struct device_node)); if (allnextpp) { memset(np, 0, sizeof(*np)); np->full_name = ((char*)np) + sizeof(struct device_node); - memcpy(np->full_name, pathp, l); + if (new_format) { + char *p = np->full_name; + /* rebuild full path for new format */ + if (dad && dad->parent) { + strcpy(p, dad->full_name); +#ifdef DEBUG + if ((strlen(p) + l + 1) != allocl) { + DBG("%s: p: %d, l: %d, a: %d\n", + pathp, strlen(p), l, allocl); + } +#endif + p += strlen(p); + } + *(p++) = '/'; + memcpy(p, pathp, l); + } else + memcpy(np->full_name, pathp, l); prev_pp = &np->properties; **allnextpp = np; *allnextpp = &np->allnext; if (dad != NULL) { np->parent = dad; - /* we temporarily use the `next' field as `last_child'. */ + /* we temporarily use the next field as `last_child'*/ if (dad->next == 0) dad->child = np; else @@ -770,18 +832,26 @@ char *pname; tag = *((u32 *)(*p)); + if (tag == OF_DT_NOP) { + *p += 4; + continue; + } if (tag != OF_DT_PROP) break; *p += 4; sz = *((u32 *)(*p)); noff = *((u32 *)((*p) + 4)); - *p = _ALIGN((*p) + 8, sz >= 8 ? 8 : 4); + *p += 8; + if (initial_boot_params->version < 0x10) + *p = _ALIGN(*p, sz >= 8 ? 
8 : 4); pname = find_flat_dt_string(noff); if (pname == NULL) { printk("Can't find property name in list !\n"); break; } + if (strcmp(pname, "name") == 0) + has_name = 1; l = strlen(pname) + 1; pp = unflatten_dt_alloc(&mem, sizeof(struct property), __alignof__(struct property)); @@ -801,6 +871,28 @@ } *p = _ALIGN((*p) + sz, 4); } + /* with version 0x10 we may not have the name property, recreate + * it here from the unit name if absent + */ + if (!has_name) { + char *pa = pathp; + int sz; + + while (*pa && (*pa) != '@') + pa++; + sz = (pa - pathp) + 1; + pp = unflatten_dt_alloc(&mem, sizeof(struct property) + sz, + __alignof__(struct property)); + if (allnextpp) { + pp->name = "name"; + pp->length = sz; + pp->value = (unsigned char *)(pp + 1); + *prev_pp = pp; + prev_pp = &pp->next; + memcpy(pp->value, pathp, sz - 1); + ((char *)pp->value)[sz - 1] = 0; + } + } if (allnextpp) { *prev_pp = NULL; np->name = get_property(np, "name", NULL); @@ -812,7 +904,7 @@ np->type = ""; } while (tag == OF_DT_BEGIN_NODE) { - mem = unflatten_dt_node(mem, p, np, allnextpp); + mem = unflatten_dt_node(mem, p, np, allnextpp, fpsize); tag = *((u32 *)(*p)); } if (tag != OF_DT_END_NODE) { @@ -842,7 +934,7 @@ /* First pass, scan for size */ start = ((unsigned long)initial_boot_params) + initial_boot_params->off_dt_struct; - size = unflatten_dt_node(0, &start, NULL, NULL); + size = unflatten_dt_node(0, &start, NULL, NULL, 0); DBG(" size is %lx, allocating...\n", size); @@ -854,7 +946,7 @@ /* Second pass, do actual unflattening */ start = ((unsigned long)initial_boot_params) + initial_boot_params->off_dt_struct; - unflatten_dt_node(mem, &start, NULL, &allnextp); + unflatten_dt_node(mem, &start, NULL, &allnextp, 0); if (*((u32 *)start) != OF_DT_END) printk(KERN_WARNING "Weird tag at end of tree: %x\n", *((u32 *)start)); *allnextp = NULL; @@ -880,7 +972,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void 
*data) { char *type = get_flat_dt_prop(node, "device_type", NULL); u32 *prop; @@ -933,13 +1025,15 @@ } static int __init early_init_dt_scan_chosen(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void *data) { u32 *prop; u64 *prop64; extern unsigned long memory_limit, tce_alloc_start, tce_alloc_end; - if (strcmp(full_path, "/chosen") != 0) + DBG("search \"chosen\", depth: %d, uname: %s\n", depth, uname); + + if (depth != 1 || strcmp(uname, "chosen") != 0) return 0; /* get platform type */ @@ -989,18 +1083,20 @@ } static int __init early_init_dt_scan_root(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void *data) { u32 *prop; - if (strcmp(full_path, "/") != 0) + if (depth != 0) return 0; prop = (u32 *)get_flat_dt_prop(node, "#size-cells", NULL); dt_root_size_cells = (prop == NULL) ? 1 : *prop; - + DBG("dt_root_size_cells = %x\n", dt_root_size_cells); + prop = (u32 *)get_flat_dt_prop(node, "#address-cells", NULL); dt_root_addr_cells = (prop == NULL) ? 
2 : *prop; + DBG("dt_root_addr_cells = %x\n", dt_root_addr_cells); /* break now */ return 1; @@ -1028,7 +1124,7 @@ static int __init early_init_dt_scan_memory(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void *data) { char *type = get_flat_dt_prop(node, "device_type", NULL); cell_t *reg, *endp; @@ -1044,7 +1140,9 @@ endp = reg + (l / sizeof(cell_t)); - DBG("memory scan node %s ...\n", full_path); + DBG("memory scan node %s ..., reg size %ld, data: %x %x %x %x, ...\n", + uname, l, reg[0], reg[1], reg[2], reg[3]); + while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) { unsigned long base, size; @@ -1455,10 +1553,11 @@ struct device_node *np = allnodes; read_lock(&devtree_lock); - for (; np != 0; np = np->allnext) + for (; np != 0; np = np->allnext) { if (np->full_name != 0 && strcasecmp(np->full_name, path) == 0 && of_node_get(np)) break; + } read_unlock(&devtree_lock); return np; } Index: linux-work/include/asm-ppc64/prom.h =================================================================== --- linux-work.orig/include/asm-ppc64/prom.h 2005-06-01 16:07:18.000000000 +1000 +++ linux-work/include/asm-ppc64/prom.h 2005-06-01 16:07:21.000000000 +1000 @@ -22,13 +22,15 @@ #define RELOC(x) (*PTRRELOC(&(x))) /* Definitions used by the flattened device tree */ -#define OF_DT_HEADER 0xd00dfeed /* 4: version, 4: total size */ -#define OF_DT_BEGIN_NODE 0x1 /* Start node: full name */ +#define OF_DT_HEADER 0xd00dfeed /* marker */ +#define OF_DT_BEGIN_NODE 0x1 /* Start of node, full name */ #define OF_DT_END_NODE 0x2 /* End node */ -#define OF_DT_PROP 0x3 /* Property: name off, size, content */ +#define OF_DT_PROP 0x3 /* Property: name off, size, + * content */ +#define OF_DT_NOP 0x4 /* nop */ #define OF_DT_END 0x9 -#define OF_DT_VERSION 1 +#define OF_DT_VERSION 0x10 /* * This is what gets passed to the kernel by prom_init or kexec @@ -54,7 +56,9 @@ u32 version; /* format version */ u32 last_comp_version; /* last compatible 
version */ /* version 2 fields below */ - u32 boot_cpuid_phys; /* Which physical CPU id we're booting on */ + u32 boot_cpuid_phys; /* Physical CPU id we're booting on */ + /* version 3 fields below */ + u32 dt_strings_size; /* size of the DT strings block */ }; From arnd at arndb.de Wed Jun 1 19:07:31 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Wed, 1 Jun 2005 11:07:31 +0200 Subject: [PATCH] ppc64: fix fixup_device_tree Message-ID: <200506011107.31751.arnd@arndb.de> The new fixup_device_tree function breaks on some open firmware implementations that can't deal with invalid phandle values passed to prom_getprop(). The current code attempts to check the validity of the phandle returned from finddevice but fails to do that correctly, because (0x00000000fffffffful <= 0) is false. I suggest comparing the returned phandle to the expected value directly. Signed-off-by: Arnd Bergmann References: <200506011107.31751.arnd@arndb.de> Message-ID: <17053.36293.360808.42984@cargo.ozlabs.ibm.com> Arnd Bergmann writes: > - if ((long)u3 <= 0) > + if (u3 == -1u) Yes, OK, but I think I would prefer either "-1" or "~0U" to "-1u". Paul. From benh at kernel.crashing.org Wed Jun 1 20:42:40 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 01 Jun 2005 20:42:40 +1000 Subject: [PATCH] ppc64: fix fixup_device_tree In-Reply-To: <200506011107.31751.arnd@arndb.de> References: <200506011107.31751.arnd@arndb.de> Message-ID: <1117622561.19020.38.camel@gaston> On Wed, 2005-06-01 at 11:07 +0200, Arnd Bergmann wrote: > The new fixup_device_tree function breaks on some open firmware > implementations that can't deal with invalid phandle values passed > to prom_getprop(). > The current code attempts to check the validity of the phandle > returned from finddevice but fails to do that correctly, because > (0x00000000fffffffful <= 0) is false. > I suggest comparing the returned phandle to the expected > value directly. Ah indeed. 
There are a couple of other cases that get it wrong btw. I'll send a patch tomorrow. Ben. From brking at us.ibm.com Thu Jun 2 00:09:25 2005 From: brking at us.ibm.com (Brian King) Date: Wed, 01 Jun 2005 09:09:25 -0500 Subject: [PATCH]: PCI Error Recovery Implementation In-Reply-To: <20050531203028.GD31199@austin.ibm.com> References: <20050531203028.GD31199@austin.ibm.com> Message-ID: <429DC195.1010801@us.ibm.com> What tree is this patch diffed from? It doesn't apply to the current 2.6.12-rc5-git6 snapshot on kernel.org. Also, when you re-diff, can you diff in patch -p1 format so that akpm's patch scripts work on it? Thanks -Brian Linas Vepstas wrote: > > Hi, > > Attached is the latest and greatest greatest PCI error recovery > patch. Its posted here as one giant patch, but logically consists > of a number of different pieces: > > 1) generic modifications to include/linux/pci.h, as per emails > in last round of discussion. > > 2) Documentation/pci-error-recovery.txt describing the API. > This is a cut-n-paste-modified copy of BenH's email. > I changed the names of a few routines, and added notes > about the current ppc64 implementation. > > 3) working patches to the SCSI ipr and symbios device drivers > to use this API to recover from PCI errors. These actually work. > I plan to have a patch for e1000 "real soon now"(TM). > > 4) ppc64-specific patches that use the API to notify the device > of PCI errors. > > Please review. I want to get this submitted into mainline ASAP. > > --linas > > Signed-off-by: Linas Vepstas > > > ------------------------------------------------------------------------ > > --- include/linux/pci.h.linas-orig 2005-04-29 20:27:22.000000000 -0500 > +++ include/linux/pci.h 2005-05-31 13:47:46.000000000 -0500 > @@ -659,6 +659,81 @@ struct pci_dynids { > unsigned int use_driver_data:1; /* pci_driver->driver_data is used */ > }; > > +/* ---------------------------------------------------------------- */ > +/** PCI error recovery infrastructure. 
If a PCI device driver provides > + * a set fof callbacks in struct pci_error_handlers, then that device driver > + * will be notified of PCI bus errors, and can be driven to recovery. > + */ > + > +enum pci_channel_state { > + pci_channel_io_normal = 0, /* I/O channel is in normal state */ > + pci_channel_io_frozen = 1, /* I/O to channel is blocked */ > + pci_channel_io_perm_failure, /* pci card is dead */ > +}; > + > +enum pcierr_result { > + PCIERR_RESULT_NONE=0, /* no result/none/not supported in device driver */ > + PCIERR_RESULT_CAN_RECOVER=1, /* Device driver can recover without slot reset */ > + PCIERR_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */ > + PCIERR_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */ > + PCIERR_RESULT_RECOVERED, /* Device driver is fully recovered and operational */ > +}; > + > +/* PCI bus error event callbacks */ > +struct pci_error_handlers > +{ > + int (*error_detected)(struct pci_dev *dev, enum pci_channel_state error); > + int (*mmio_enabled)(struct pci_dev *dev); /* MMIO has been reanbled, but not DMA */ > + int (*link_reset)(struct pci_dev *dev); /* PCI Express link has been reset */ > + int (*slot_reset)(struct pci_dev *dev); /* PCI slot has been reset */ > + void (*resume)(struct pci_dev *dev); /* Device driver may resume normal operations */ > +}; > + > +/** > + * PCI Error notifier event flags. > + */ > +#define PEH_NOTIFY_ERROR 1 > + > +/** PEH event -- structure holding pci controller data that describes > + * a change in the isolation status of a PCI slot. A pointer > + * to this struct is passed as the data pointer in a notify callback. 
> + */ > +struct peh_event { > + struct list_head list; > + struct pci_dev *dev; /* affected device */ > + enum pci_channel_state state; /* PCI bus state for the affected device */ > + int time_unavail; /* milliseconds until device might be available */ > +}; > + > +/** > + * peh_send_failure_event - generate a PCI error event > + * @dev pci device > + * > + * This routine builds a PCI error event which will be delivered > + * to all listeners on the peh_notifier_chain. > + * > + * This routine can be called within an interrupt context; > + * the actual event will be delivered in a normal context > + * (from a workqueue). > + */ > +int peh_send_failure_event (struct pci_dev *dev, > + enum pci_channel_state state, > + int time_unavail); > + > +/** > + * peh_register_notifier - Register to find out about EEH events. > + * @nb: notifier block to callback on events > + */ > +int peh_register_notifier(struct notifier_block *nb); > + > +/** > + * peh_unregister_notifier - Unregister to an EEH event notifier. 
> + * @nb: notifier block to callback on events > + */ > +int peh_unregister_notifier(struct notifier_block *nb); > + > +/* ---------------------------------------------------------------- */ > + > struct module; > struct pci_driver { > struct list_head node; > @@ -671,6 +746,7 @@ struct pci_driver { > int (*resume) (struct pci_dev *dev); /* Device woken up */ > int (*enable_wake) (struct pci_dev *dev, u32 state, int enable); /* Enable wake event */ > > + struct pci_error_handlers err_handler; > struct device_driver driver; > struct pci_dynids dynids; > }; > --- Documentation/pci-error-recovery.txt.linas-orig 2005-05-06 17:44:41.000000000 -0500 > +++ Documentation/pci-error-recovery.txt 2005-05-31 15:08:56.000000000 -0500 > @@ -0,0 +1,232 @@ > + > + PCI Error Recovery > + ------------------ > + May 31, 2005 > + > + > +Some PCI bus controllers are able to detect certain "hard" PCI errors > +on the bus, such as parity errors on the data and address busses, as > +well as SERR and PERR errors. These chipsets are then able to disable > +I/O to/from the affected device, so that, for example, a bad DMA > +address doesn't end up corrupting system memory. These same chipsets > +are also able to reset the affected PCI device, and return it to > +working condition. This document describes a generic API form > +performing error recovery. > + > +The core idea is that after a PCI error has been detected, there must > +be a way for the kernel to coordinate with all affected device drivers > +so that the pci card can be made operational again, possibly after > +performing a full electrical #RST of the PCI card. The API below > +provides a generic API for device drivers to be notified of PCI > +errors, and to be notified of, and respond to, a reset sequence. 
> + > +Preliminary sketch of API, cut-n-pasted-n-modified email from > +Ben Herrenschmidt, circa 5 april 2005 > + > +The error recovery API support is exposed to the driver in the form of > +a structure of function pointers pointed to by a new field in struct > +pci_driver. The absence of this pointer in pci_driver denotes an > +"non-aware" driver, behaviour on these is platform dependant. > +Platforms like ppc64 can try to simulate pci hotplug remove/add. > + > +The definition of "pci_error_token" is not covered here. It is based on > +Seto's work on the synchronous error detection. We still need to define > +functions for extracting infos out of an opaque error token. This is > +separate from this API. > + > +This structure has the form: > + > +struct pci_error_handlers > +{ > + int (*error_detected)(struct pci_dev *dev, pci_error_token error); > + int (*mmio_enabled)(struct pci_dev *dev); > + int (*resume)(struct pci_dev *dev); > + int (*link_reset)(struct pci_dev *dev); > + int (*slot_reset)(struct pci_dev *dev); > +}; > + > +A driver doesn't have to implement all of these callbacks. The > +only mandatory one is error_detected(). If a callback is not > +implemented, the corresponding feature is considered unsupported. > +For example, if mmio_enabled() and resume() aren't there, then the > +driver is assumed as not doing any direct recovery and requires > +a reset. If link_reset() is not implemented, the card is assumed as > +not caring about link resets, in which case, if recover is supported, > +the core can try recover (but not slot_reset() unless it really did > +reset the slot). If slot_reset() is not supported, link_reset() can > +be called instead on a slot reset. > + > +At first, the call will always be : > + > + 1) error_detected() > + > + Error detected. This is sent once after an error has been detected. At > +this point, the device might not be accessible anymore depending on the > +platform (the slot will be isolated on ppc64). 
The driver may already > +have "noticed" the error because of a failing IO, but this is the proper > +"synchronisation point", that is, it gives a chance to the driver to > +cleanup, waiting for pending stuff (timers, whatever, etc...) to > +complete; it can take semaphores, schedule, etc... everything but touch > +the device. Within this function and after it returns, the driver > +shouldn't do any new IOs. Called in task context. This is sort of a > +"quiesce" point. See note about interrupts at the end of this doc. > + > + Result codes: > + - PCIERR_RESULT_CAN_RECOVER: > + Driver returns this if it thinks it might be able to recover > + the HW by just banging IOs or if it wants to be given > + a chance to extract some diagnostic information (see > + below). > + - PCIERR_RESULT_NEED_RESET: > + Driver returns this if it thinks it can't recover unless the > + slot is reset. > + - PCIERR_RESULT_DISCONNECT: > + Return this if driver thinks it won't recover at all, > + (this will detach the driver ? or just leave it > + dangling ? to be decided) > + > +So at this point, we have called error_detected() for all drivers > +on the segment that had the error. On ppc64, the slot is isolated. What > +happens now typically depends on the result from the drivers. If all > +drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would > +re-enable IOs on the slot (or do nothing special if the platform doesn't > +isolate slots) and call 2). If not and we can reset slots, we go to 4), > +if neither, we have a dead slot. If it's a hotplug slot, we might > +"simulate" reset by triggering HW unplug/replug though. > + > +>>> Current ppc64 implementation assumes that a device driver will > +>>> *not* schedule or semaphore in this routine; the current ppc64 > +>>> implementation uses one kernel thread to notify all devices; > +>>> thus, if one device sleeps/schedules, all devices are affected. 
> +>>> Doing better requires complex multi-threaded logic in the error > +>>> recovery implementation (e.g. waiting for all notification threads > +>>> to "join" before proceeding with recovery.) This seems excessively > +>>> complex and not worth implementing. > + > + 2) mmio_enabled() > + > + This is the "early recovery" call. IOs are allowed again, but DMA is > +not (hrm... to be discussed, I prefer not), with some restrictions. This > +is NOT a callback for the driver to start operations again, only to > +peek/poke at the device, extract diagnostic information, if any, and > +eventually do things like trigger a device local reset or some such, > +but not restart operations. This is sent if all drivers on a segment > +agree that they can try to recover and no automatic link reset was > +performed by the HW. If the platform can't just re-enable IOs without > +a slot reset or a link reset, it doesn't call this callback and goes > +directly to 3) or 4). All IOs should be done _synchronously_ from > +within this callback, errors triggered by them will be returned via > +the normal pci_check_whatever() api, no new error_detected() callback > +will be issued due to an error happening here. However, such an error > +might cause IOs to be re-blocked for the whole segment, and thus > +invalidate the recovery that other devices on the same segment might > +have done, forcing the whole segment into one of the next states, > +that is link reset or slot reset. > + > + Result codes: > + - PCIERR_RESULT_RECOVERED > + Driver returns this if it thinks the device is fully > + functional and thinks it is ready to start > + normal driver operations again. There is no > + guarantee that the driver will actually be > + allowed to proceed, as another driver on the > + same segment might have failed and thus triggered a > + slot reset on platforms that support it. 
> + > + - PCIERR_RESULT_NEED_RESET > + Driver returns this if it thinks the device is not > + recoverable in its current state and it needs a slot > + reset to proceed. > + > + - PCIERR_RESULT_DISCONNECT > + Same as above. Total failure, no recovery even after > + reset, driver dead. (To be defined more precisely) > + > +>>> The current ppc64 implementation does not implement this callback. > + > + 3) link_reset() > + > + This is called after the link has been reset. This is typically > +a PCI Express specific state at this point and is done whenever a > +non-fatal error has been detected that can be "solved" by resetting > +the link. This call informs the driver of the reset and the driver > +should check if the device appears to be in working condition. > +This function acts a bit like 2) mmio_enabled(), in that the driver > +is not supposed to restart normal driver I/O operations right away. > +Instead, it should just "probe" the device to check its recoverability > +status. If all is right, then the core will call resume() once all > +drivers have ack'd link_reset(). > + > + Result codes: > + (identical to mmio_enabled) > + > +>>> The current ppc64 implementation does not implement this callback. > + > + 4) slot_reset() > + > + This is called after the slot has been soft or hard reset by the > +platform. A soft reset consists of asserting the adapter #RST line > +and then restoring the PCI BARs and PCI configuration header. If the > +platform supports PCI hotplug, then it might instead perform a hard > +reset by toggling power on the slot off/on. This call gives drivers > +the chance to re-initialize the hardware (re-download firmware, etc.), > +but drivers shouldn't restart normal I/O processing operations at > +this point. (See note about interrupts; interrupts aren't guaranteed > +to be delivered until the resume() callback has been called). 
If all > +device drivers report success on this callback, the platform will call > +resume() to complete the error handling and let the driver restart > +normal I/O processing. > + > +A driver can still return a critical failure for this function if > +it can't get the device operational after reset. If the platform > +previously tried a soft reset, it might now try a hard reset (power > +cycle) and then call slot_reset() again. If the device still can't > +be recovered, there is nothing more that can be done; the platform > +will typically report a "permanent failure" in such a case. The > +device will be considered "dead" in this case. > + > + Result codes: > + - PCIERR_RESULT_DISCONNECT > + Same as above. > + > + 5) resume() > + > + This is called if all drivers on the segment have returned > +PCIERR_RESULT_RECOVERED from one of the 3 previous callbacks. > +That basically tells the driver to restart activity, that everything > +is back and running. No result code is taken into account here. If > +a new error happens, it will restart a new error handling process. > + > +That's it. I think this covers all the possibilities. The way those > +callbacks are called is platform policy. A platform with no slot reset > +capability for example may want to just "ignore" drivers that can't > +recover (disconnect them) and try to let other cards on the same segment > +recover. Keep in mind that in most real life cases, though, there will > +be only one driver per segment. > + > +Now, there is a note about interrupts. If you get an interrupt and your > +device is dead or has been isolated, there is a problem :) > + > +After much thinking, I decided to leave that to the platform. 
That is, > +the recovery API only specifies that: > + > + - There is no guarantee that interrupt delivery can proceed from any > +device on the segment starting from the error detection and until the > +restart callback is sent, at which point interrupts are expected to be > +fully operational. 
> + > + - There is no guarantee that interrupt delivery is stopped, that is, a > +driver that gets an interrupt after detecting an error, or that detects > +an error within the interrupt handler such that it prevents proper > +ack'ing of the interrupt (and thus removal of the source) should just > +return IRQ_NOTHANDLED. It's up to the platform to deal with that > +condition, typically by masking the irq source during the duration of > +the error handling. It is expected that the platform "knows" which > +interrupts are routed to error-management capable slots and can deal > +with temporarily disabling that irq number during error processing (this > +isn't terribly complex). That means some IRQ latency for other devices > +sharing the interrupt, but there is simply no other way. High end > +platforms aren't supposed to share interrupts between many devices > +anyway :) > + > + > --- drivers/pci/Makefile.linas-orig 2005-04-29 20:31:33.000000000 -0500 > +++ drivers/pci/Makefile 2005-05-06 12:28:43.000000000 -0500 > @@ -3,7 +3,7 @@ > # > > obj-y += access.o bus.o probe.o remove.o pci.o quirks.o \ > - names.o pci-driver.o search.o pci-sysfs.o \ > + names.o pci-driver.o pci-error.o search.o pci-sysfs.o \ > rom.o > obj-$(CONFIG_PROC_FS) += proc.o > > --- drivers/pci/pci-error.c.linas-orig 2005-05-06 17:44:47.000000000 -0500 > +++ drivers/pci/pci-error.c 2005-05-31 13:49:34.000000000 -0500 > @@ -0,0 +1,152 @@ > +/* > + * pci-error.c > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. 
> + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > + */ > + > +#include > +#include > +#include > + > +#undef DEBUG > + > +/** Overview: > + * PEH, or "PCI Error Handling" is a PCI bridge technology for > + * dealing with PCI bus errors that can't be dealt with within the > + * usual PCI framework, except by check-stopping the CPU. Systems > + * that are designed for high-availability/reliability cannot afford > + * to crash due to a "mere" PCI error, thus the need for PEH. > + * An PEH-capable bridge operates by converting a detected error > + * into a "slot freeze", taking the PCI adapter off-line, making > + * the slot behave, from the OS'es point of view, as if the slot > + * were "empty": all reads return 0xff's and all writes are silently > + * ignored. PEH slot isolation events can be triggered by parity > + * errors on the address or data busses (e.g. during posted writes), > + * which in turn might be caused by low voltage on the bus, dust, > + * vibration, humidity, radioactivity or plain-old failed hardware. > + * > + * Note, however, that one of the leading causes of PEH slot > + * freeze events are buggy device drivers, buggy device microcode, > + * or buggy device hardware. This is because any attempt by the > + * device to bus-master data to a memory address that is not > + * assigned to the device will trigger a slot freeze. (The idea > + * is to prevent devices-gone-wild from corrupting system memory). > + * Buggy hardware/drivers will have a miserable time co-existing > + * with PEH. > + */ > + > +/* PEH event workqueue setup. 
*/ > +static spinlock_t peh_eventlist_lock = SPIN_LOCK_UNLOCKED; > +LIST_HEAD(peh_eventlist); > +static void peh_event_handler(void *); > +DECLARE_WORK(peh_event_wq, peh_event_handler, NULL); > + > +static struct notifier_block *peh_notifier_chain; > + > +/** > + * peh_event_handler - dispatch PEH events. The detection of a frozen > + * slot can occur inside an interrupt, where it can be hard to do > + * anything about it. The goal of this routine is to pull these > + * detection events out of the context of the interrupt handler, and > + * re-dispatch them for processing at a later time in a normal context. > + * > + * @dummy - unused > + */ > +static void peh_event_handler(void *dummy) > +{ > + unsigned long flags; > + struct peh_event *event; > + > + while (1) { > + spin_lock_irqsave(&peh_eventlist_lock, flags); > + event = NULL; > + if (!list_empty(&peh_eventlist)) { > + event = list_entry(peh_eventlist.next, struct peh_event, list); > + list_del(&event->list); > + } > + spin_unlock_irqrestore(&peh_eventlist_lock, flags); > + if (event == NULL) > + break; > + > + printk(KERN_INFO "PEH: Detected PCI bus error on device " > + "%s %s\n", > + pci_name(event->dev), pci_pretty_name(event->dev)); > + > + notifier_call_chain (&peh_notifier_chain, > + PEH_NOTIFY_ERROR, event); > + > + pci_dev_put(event->dev); > + kfree(event); > + } > +} > + > + > +/** > + * peh_send_failure_event - generate a PCI error event > + * @dev pci device > + * > + * This routine builds a PCI error event which will be delivered > + * to all listeners on the peh_notifier_chain. > + * > + * This routine can be called within an interrupt context; > + * the actual event will be delivered in a normal context > + * (from a workqueue). 
> + */ > +int peh_send_failure_event (struct pci_dev *dev, > + enum pci_channel_state state, > + int time_unavail) > +{ > + unsigned long flags; > + struct peh_event *event; > + > + event = kmalloc(sizeof(*event), GFP_ATOMIC); > + if (event == NULL) { > + printk (KERN_ERR "PEH: out of memory, event not handled\n"); > + return 1; > + } > + > + event->dev = dev; > + event->state = state; > + event->time_unavail = time_unavail; > + > + /* We may or may not be called in an interrupt context */ > + spin_lock_irqsave(&peh_eventlist_lock, flags); > + list_add(&event->list, &peh_eventlist); > + spin_unlock_irqrestore(&peh_eventlist_lock, flags); > + > + schedule_work(&peh_event_wq); > + > + return 0; > +} > + > +/** > + * peh_register_notifier - Register to find out about EEH events. > + * @nb: notifier block to callback on events > + */ > +int peh_register_notifier(struct notifier_block *nb) > +{ > + return notifier_chain_register(&peh_notifier_chain, nb); > +} > + > +/** > + * peh_unregister_notifier - Unregister to an EEH event notifier. > + * @nb: notifier block to callback on events > + */ > +int peh_unregister_notifier(struct notifier_block *nb) > +{ > + return notifier_chain_unregister(&peh_notifier_chain, nb); > +} > + > +/********************** END OF FILE ******************************/ > --- drivers/scsi/ipr.c.linas-orig 2005-04-29 20:33:36.000000000 -0500 > +++ drivers/scsi/ipr.c 2005-05-31 15:12:08.000000000 -0500 > @@ -5306,6 +5306,85 @@ static void ipr_initiate_ioa_reset(struc > shutdown_type); > } > > +#ifdef CONFIG_SCSI_IPR_EEH_RECOVERY > + > +/** If the PCI slot is frozen, hold off all i/o > + * activity; then, as soon as the slot is available again, > + * initiate an adapter reset. 
> + */ > +static int ipr_reset_freeze(struct ipr_cmnd *ipr_cmd) > +{ > + list_add_tail(&ipr_cmd->queue, &ipr_cmd->ioa_cfg->pending_q); > + ipr_cmd->done = ipr_reset_ioa_job; > + return IPR_RC_JOB_RETURN; > +} > + > +/** ipr_eeh_frozen -- called when slot has experience PCI bus error. > + * This routine is called to tell us that the PCI bus is down. > + * Can't do anything here, except put the device driver into a > + * holding pattern, waiting for the PCI bus to come back. > + */ > +static void ipr_eeh_frozen (struct pci_dev *pdev) > +{ > + unsigned long flags = 0; > + struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev); > + > + spin_lock_irqsave(ioa_cfg->host->host_lock, flags); > + _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_freeze, IPR_SHUTDOWN_NONE); > + spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags); > +} > + > +/** ipr_eeh_slot_reset - called when pci slot has been reset. > + * > + * This routine is called by the pci error recovery recovery > + * code after the PCI slot has been reset, just before we > + * should resume normal operations. > + */ > +static int ipr_eeh_slot_reset (struct pci_dev *pdev) > +{ > + unsigned long flags = 0; > + struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev); > + > + spin_lock_irqsave(ioa_cfg->host->host_lock, flags); > + _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_restore_cfg_space, > + IPR_SHUTDOWN_NONE); > + spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags); > + > + return PCIERR_RESULT_RECOVERED; > +} > + > +/** This routine is called when the PCI bus has permanently > + * failed. This routine should purge all pending I/O and > + * shut down the device driver (close and unload). > + * XXX Needs to be implemented. 
> + */ > +static void ipr_eeh_perm_failure (struct pci_dev *pdev) > +{ > +#if 0 // XXXXXXXXXXXXXXXXXXXXXXX > + ipr_cmd->job_step = ipr_reset_shutdown_ioa; > + rc = IPR_RC_JOB_CONTINUE; > +#endif > +} > + > +static int ipr_eeh_error_detected (struct pci_dev *pdev, > + enum pci_channel_state state) > +{ > + switch (state) { > + case pci_channel_io_frozen: > + ipr_eeh_frozen (pdev); > + return PCIERR_RESULT_NEED_RESET; > + > + case pci_channel_io_perm_failure: > + ipr_eeh_perm_failure (pdev); > + return PCIERR_RESULT_DISCONNECT; > + break; > + default: > + break; > + } > + return PCIERR_RESULT_NEED_RESET; > +} > +#endif > + > /** > * ipr_probe_ioa_part2 - Initializes IOAs found in ipr_probe_ioa(..) > * @ioa_cfg: ioa cfg struct > @@ -6015,6 +6094,10 @@ static struct pci_driver ipr_driver = { > .id_table = ipr_pci_table, > .probe = ipr_probe, > .remove = ipr_remove, > + .err_handler = { > + .error_detected = ipr_eeh_error_detected, > + .slot_reset = ipr_eeh_slot_reset, > + }, > .driver = { > .shutdown = ipr_shutdown, > }, > --- drivers/scsi/sym53c8xx_2/sym_glue.c.linas-orig 2005-04-29 20:33:12.000000000 -0500 > +++ drivers/scsi/sym53c8xx_2/sym_glue.c 2005-05-31 13:52:55.000000000 -0500 > @@ -770,6 +770,10 @@ static irqreturn_t sym53c8xx_intr(int ir > struct sym_hcb *np = (struct sym_hcb *)dev_id; > > if (DEBUG_FLAGS & DEBUG_TINY) printf_debug ("["); > +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY > + if (np->s.io_state != pci_channel_io_normal) > + return IRQ_HANDLED; > +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ > > spin_lock_irqsave(np->s.host->host_lock, flags); > sym_interrupt(np); > @@ -844,6 +848,27 @@ static void sym_eh_done(struct scsi_cmnd > */ > static void sym_eh_timeout(u_long p) { __sym_eh_done((struct scsi_cmnd *)p, 1); } > > +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY > +static void sym_eeh_timeout(u_long p) > +{ > + struct sym_eh_wait *ep = (struct sym_eh_wait *) p; > + if (!ep) > + return; > + complete(&ep->done); > +} > + > +static void 
sym_eeh_done(struct sym_eh_wait *ep) > +{ > + if (!ep) > + return; > + ep->timed_out = 0; > + if (!del_timer(&ep->timer)) > + return; > + > + complete(&ep->done); > +} > +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ > + > /* > * Generic method for our eh processing. > * The 'op' argument tells what we have to do. > @@ -893,6 +918,37 @@ prepare: > > /* Try to proceed the operation we have been asked for */ > sts = -1; > +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY > + > + /* We may be in an error condition because the PCI bus > + * went down. In this case, we need to wait until the > + * PCI bus is reset, the card is reset, and only then > + * proceed with the scsi error recovery. We'll wait > + * for 15 seconds for this to happen. > + */ > +#define WAIT_FOR_PCI_RECOVERY 15 > + if (np->s.io_state != pci_channel_io_normal) { > + struct sym_eh_wait eeh, *eep = &eeh; > + np->s.io_reset_wait = eep; > + init_completion(&eep->done); > + init_timer(&eep->timer); > + eep->to_do = SYM_EH_DO_WAIT; > + eep->timer.expires = jiffies + (WAIT_FOR_PCI_RECOVERY*HZ); > + eep->timer.function = sym_eeh_timeout; > + eep->timer.data = (u_long)eep; > + eep->timed_out = 1; /* Be pessimistic for once :) */ > + add_timer(&eep->timer); > + spin_unlock_irq(np->s.host->host_lock); > + wait_for_completion(&eep->done); > + spin_lock_irq(np->s.host->host_lock); > + if (eep->timed_out) { > + printk (KERN_ERR "%s: Timed out waiting for PCI reset\n", > + sym_name(np)); > + } > + np->s.io_reset_wait = NULL; > + } > +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ > + > switch(op) { > case SYM_EH_ABORT: > sts = sym_abort_scsiio(np, cmd, 1); > @@ -1625,6 +1681,8 @@ static struct Scsi_Host * __devinit sym_ > if (!np) > goto attach_failed; > np->s.device = dev->pdev; > + np->s.io_state = pci_channel_io_normal; > + np->s.io_reset_wait = NULL; > np->bus_dmat = dev->pdev; /* Result in 1 DMA pool per HBA */ > host_data->ncb = np; > np->s.host = instance; > @@ -2048,6 +2106,59 @@ static int sym_detach(struct 
sym_hcb *np > return 1; > } > > +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY > +/** sym2_io_error_detected() is called when PCI error is detected */ > +int sym2_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) > +{ > + struct sym_hcb *np = pci_get_drvdata(pdev); > + > + np->s.io_state = state; > + // XXX If slot is permanently frozen, then what? > + // Should we scsi_remove_host() maybe ?? > + > + /* Request a slot slot reset. */ > + return PCIERR_RESULT_NEED_RESET; > +} > + > +/** sym2_io_slot_reset is called when the pci bus has been reset. > + * Restart the card from scratch. */ > +int sym2_io_slot_reset (struct pci_dev *pdev) > +{ > + struct sym_hcb *np = pci_get_drvdata(pdev); > + > + msleep (500); // pure paranoia -- wait for device to settle > + printk (KERN_INFO "%s: recovering from a PCI slot reset\n", > + sym_name(np)); > + > + if (pci_enable_device(pdev)) > + printk (KERN_ERR "%s: device setup failed most egregiously\n", > + sym_name(np)); > + > + pci_set_master(pdev); > + > + /* Perform host reset only on one instance of the card */ > + if (0 == PCI_FUNC (pdev->devfn)) > + sym_reset_scsi_bus(np, 0); > + > + return PCIERR_RESULT_RECOVERED; > +} > + > +/** sym2_io_resume is called when the error recovery driver > + * tells us that its OK to resume normal operation. > + */ > +void sym2_io_resume (struct pci_dev *pdev) > +{ > + struct sym_hcb *np = pci_get_drvdata(pdev); > + > + /* Perform device startup only once for this card. */ > + if (0 == PCI_FUNC (pdev->devfn)) > + sym_start_up (np, 1); > + > + np->s.io_state = pci_channel_io_normal; > + sym_eeh_done (np->s.io_reset_wait); > +} > +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ > + > /* > * Driver host template. 
> */ > @@ -2359,6 +2470,11 @@ static struct pci_driver sym2_driver = { > .id_table = sym2_id_table, > .probe = sym2_probe, > .remove = __devexit_p(sym2_remove), > + .err_handler = { > + .error_detected = sym2_io_error_detected, > + .slot_reset = sym2_io_slot_reset, > + .resume = sym2_io_resume, > + }, > }; > > static int __init sym2_init(void) > --- drivers/scsi/sym53c8xx_2/sym_glue.h.linas-orig 2005-04-29 20:32:45.000000000 -0500 > +++ drivers/scsi/sym53c8xx_2/sym_glue.h 2005-05-06 16:29:39.000000000 -0500 > @@ -358,6 +358,10 @@ struct sym_shcb { > char chip_name[8]; > struct pci_dev *device; > > + /* pci bus i/o state; waiter for clearing of i/o state */ > + enum pci_channel_state io_state; > + struct sym_eh_wait *io_reset_wait; > + > struct Scsi_Host *host; > > void __iomem * mmio_va; /* MMIO kernel virtual address */ > --- drivers/scsi/sym53c8xx_2/sym_hipd.c.linas-orig 2005-04-29 20:22:45.000000000 -0500 > +++ drivers/scsi/sym53c8xx_2/sym_hipd.c 2005-05-20 15:40:43.000000000 -0500 > @@ -2836,6 +2836,7 @@ void sym_interrupt (struct sym_hcb *np) > u_char istat, istatc; > u_char dstat; > u_short sist; > + u_int icnt; > > /* > * interrupt on the fly ? > @@ -2877,6 +2878,7 @@ void sym_interrupt (struct sym_hcb *np) > sist = 0; > dstat = 0; > istatc = istat; > + icnt = 0; > do { > if (istatc & SIP) > sist |= INW (nc_sist); > @@ -2884,6 +2886,14 @@ void sym_interrupt (struct sym_hcb *np) > dstat |= INB (nc_dstat); > istatc = INB (nc_istat); > istat |= istatc; > +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY > + /* Prevent deadlock waiting on a condition that may never clear. 
*/ > + icnt ++; > + if (100 < icnt) { > + if (eeh_slot_is_isolated(np->s.device)) > + return; > + } > +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ > } while (istatc & (SIP|DIP)); > > if (DEBUG_FLAGS & DEBUG_TINY) > --- drivers/scsi/Kconfig.linas-orig 2005-04-29 20:31:30.000000000 -0500 > +++ drivers/scsi/Kconfig 2005-05-24 11:17:40.000000000 -0500 > @@ -1032,6 +1032,14 @@ config SCSI_SYM53C8XX_IOMAPPED > the card. This is significantly slower then using memory > mapped IO. Most people should answer N. > > +config SCSI_SYM53C8XX_EEH_RECOVERY > + bool "Enable PCI bus error recovery" > + depends on SCSI_SYM53C8XX_2 && PPC_PSERIES > + help > + If you say Y here, the driver will be able to recover from > + PCI bus errors on many PowerPC platforms. IBM pSeries users > + should answer Y. > + > config SCSI_IPR > tristate "IBM Power Linux RAID adapter support" > depends on PCI && SCSI > @@ -1057,6 +1065,14 @@ config SCSI_IPR_DUMP > If you enable this support, the iprdump daemon can be used > to capture adapter failure analysis information. > > +config SCSI_IPR_EEH_RECOVERY > + bool "Enable PCI bus error recovery" > + depends on SCSI_IPR && PPC_PSERIES > + help > + If you say Y here, the driver will be able to recover from > + PCI bus errors on many PowerPC platforms. IBM pSeries users > + should answer Y. 
> + > config SCSI_ZALON > tristate "Zalon SCSI support" > depends on GSC && SCSI > --- arch/ppc64/defconfig.linas-orig 2005-05-20 12:16:19.000000000 -0500 > +++ arch/ppc64/defconfig 2005-05-20 12:16:58.000000000 -0500 > @@ -255,6 +255,7 @@ CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MOD > CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 > CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 > # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set > +CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY=y > # CONFIG_SCSI_QLOGIC_ISP is not set > # CONFIG_SCSI_QLOGIC_FC is not set > # CONFIG_SCSI_QLOGIC_1280 is not set > --- arch/ppc64/configs/pSeries_defconfig.linas-orig 2005-04-29 20:34:04.000000000 -0500 > +++ arch/ppc64/configs/pSeries_defconfig 2005-05-24 11:18:45.000000000 -0500 > @@ -275,9 +275,11 @@ CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MOD > CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 > CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 > # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set > +CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY=y > CONFIG_SCSI_IPR=y > # CONFIG_SCSI_IPR_TRACE is not set > # CONFIG_SCSI_IPR_DUMP is not set > +CONFIG_SCSI_IPR_EEH_RECOVERY=y > # CONFIG_SCSI_QLOGIC_ISP is not set > # CONFIG_SCSI_QLOGIC_FC is not set > # CONFIG_SCSI_QLOGIC_1280 is not set > --- include/asm-ppc64/eeh.h.linas-orig 2005-04-29 20:34:03.000000000 -0500 > +++ include/asm-ppc64/eeh.h 2005-05-31 13:55:18.000000000 -0500 > @@ -1,4 +1,4 @@ > -/* > +/* > * eeh.h > * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation. > * > @@ -6,12 +6,12 @@ > * it under the terms of the GNU General Public License as published by > * the Free Software Foundation; either version 2 of the License, or > * (at your option) any later version. > - * > + * > * This program is distributed in the hope that it will be useful, > * but WITHOUT ANY WARRANTY; without even the implied warranty of > * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > * GNU General Public License for more details. 
> - * > + * > * You should have received a copy of the GNU General Public License > * along with this program; if not, write to the Free Software > * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > @@ -23,6 +23,7 @@ > #include > #include > #include > +#include > #include > > struct pci_dev; > @@ -36,6 +37,11 @@ struct notifier_block; > #define EEH_MODE_SUPPORTED (1<<0) > #define EEH_MODE_NOCHECK (1<<1) > #define EEH_MODE_ISOLATED (1<<2) > +#define EEH_MODE_RECOVERING (1<<3) > + > +/* Max number of EEH freezes allowed before we consider the device > + * to be permanently disabled. */ > +#define EEH_MAX_ALLOWED_FREEZES 5 > > void __init eeh_init(void); > unsigned long eeh_check_failure(const volatile void __iomem *token, > @@ -59,35 +65,82 @@ void eeh_add_device_late(struct pci_dev > * eeh_remove_device - undo EEH setup for the indicated pci device > * @dev: pci device to be removed > * > - * This routine should be when a device is removed from a running > - * system (e.g. by hotplug or dlpar). > + * This routine should be called when a device is removed from > + * a running system (e.g. by hotplug or dlpar). It unregisters > + * the PCI device from the EEH subsystem. I/O errors affecting > + * this device will no longer be detected after this call; thus, > + * i/o errors affecting this slot may leave this device unusable. > */ > void eeh_remove_device(struct pci_dev *); > > -#define EEH_DISABLE 0 > -#define EEH_ENABLE 1 > -#define EEH_RELEASE_LOADSTORE 2 > -#define EEH_RELEASE_DMA 3 > +/** > + * eeh_slot_is_isolated -- return non-zero value if slot is frozen > + */ > +int eeh_slot_is_isolated (struct pci_dev *dev); > > /** > - * Notifier event flags. > + * eeh_ioaddr_is_isolated -- return non-zero value if device at > + * io address is frozen. 
> */ > -#define EEH_NOTIFY_FREEZE 1 > +int eeh_ioaddr_is_isolated(const volatile void __iomem *token); > > -/** EEH event -- structure holding pci slot data that describes > - * a change in the isolation status of a PCI slot. A pointer > - * to this struct is passed as the data pointer in a notify callback. > - */ > -struct eeh_event { > - struct list_head list; > - struct pci_dev *dev; > - struct device_node *dn; > - int reset_state; > -}; > - > -/** Register to find out about EEH events. */ > -int eeh_register_notifier(struct notifier_block *nb); > -int eeh_unregister_notifier(struct notifier_block *nb); > +/** > + * eeh_slot_error_detail -- record and EEH error condition to the log > + * @severity: 1 if temporary, 2 if permanent failure. > + * > + * Obtains the the EEH error details from the RTAS subsystem, > + * and then logs these details with the RTAS error log system. > + */ > +void eeh_slot_error_detail (struct device_node *dn, int severity); > + > +/** > + * rtas_set_slot_reset -- unfreeze a frozen slot > + * > + * Clear the EEH-frozen condition on a slot. This routine > + * does this by asserting the PCI #RST line for 1/8th of > + * a second; this routine will sleep while the adapter is > + * being reset. > + */ > +void rtas_set_slot_reset (struct device_node *dn); > + > +/** rtas_pci_slot_reset raises/lowers the pci #RST line > + * state: 1/0 to raise/lower the #RST > + * > + * Clear the EEH-frozen condition on a slot. This routine > + * asserts the PCI #RST line if the 'state' argument is '1', > + * and drops the #RST line if 'state is '0'. This routine is > + * safe to call in an interrupt context. > + * > + */ > +void rtas_pci_slot_reset(struct device_node *dn, int state); > +void eeh_pci_slot_reset(struct pci_dev *dev, int state); > + > +/** eeh_pci_slot_availability -- Indicates whether a PCI > + * slot is ready to be used. 
After a PCI reset, it may take a while > + * for the PCI fabric to fully reset the communications path to the > + * given PCI card. This routine can be used to determine how long > + * to wait before a PCI slot might become usable. > + * > + * This routine returns how long to wait (in milliseconds) before > + * the slot is expected to be usable. A value of zero means the > + * slot is immediately usable. A negative value means that the > + * slot is permanently disabled. > + */ > +int eeh_pci_slot_availability(struct pci_dev *dev); > + > +/** Restore device configuration info across device resets. > + */ > +void eeh_restore_bars(struct device_node *); > +void eeh_pci_restore_bars(struct pci_dev *dev); > + > +/** > + * rtas_configure_bridge -- firmware initialization of pci bridge > + * > + * Ask the firmware to configure any PCI bridge devices > + * located behind the indicated node. Required after a > + * pci device reset. > + */ > +void rtas_configure_bridge(struct device_node *dn); > + > /** > * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure. > @@ -116,7 +169,7 @@ int eeh_unregister_notifier(struct notif > #define EEH_IO_ERROR_VALUE(size) (-1UL) > #endif > > -/* > +/* > * MMIO read/write operations with EEH support. 
> */ > static inline u8 eeh_readb(const volatile void __iomem *addr) > @@ -238,21 +291,21 @@ static inline void eeh_memcpy_fromio(voi > *((u8 *)dest) = *((volatile u8 *)vsrc); > __asm__ __volatile__ ("eieio" : : : "memory"); > vsrc = (void *)((unsigned long)vsrc + 1); > - dest = (void *)((unsigned long)dest + 1); > + dest = (void *)((unsigned long)dest + 1); > n--; > } > while(n > 4) { > *((u32 *)dest) = *((volatile u32 *)vsrc); > __asm__ __volatile__ ("eieio" : : : "memory"); > vsrc = (void *)((unsigned long)vsrc + 4); > - dest = (void *)((unsigned long)dest + 4); > + dest = (void *)((unsigned long)dest + 4); > n -= 4; > } > while(n) { > *((u8 *)dest) = *((volatile u8 *)vsrc); > __asm__ __volatile__ ("eieio" : : : "memory"); > vsrc = (void *)((unsigned long)vsrc + 1); > - dest = (void *)((unsigned long)dest + 1); > + dest = (void *)((unsigned long)dest + 1); > n--; > } > __asm__ __volatile__ ("sync" : : : "memory"); > @@ -274,19 +327,19 @@ static inline void eeh_memcpy_toio(volat > while(n && (!EEH_CHECK_ALIGN(vdest, 4) || !EEH_CHECK_ALIGN(src, 4))) { > *((volatile u8 *)vdest) = *((u8 *)src); > src = (void *)((unsigned long)src + 1); > - vdest = (void *)((unsigned long)vdest + 1); > + vdest = (void *)((unsigned long)vdest + 1); > n--; > } > while(n > 4) { > *((volatile u32 *)vdest) = *((volatile u32 *)src); > src = (void *)((unsigned long)src + 4); > - vdest = (void *)((unsigned long)vdest + 4); > + vdest = (void *)((unsigned long)vdest + 4); > n-=4; > } > while(n) { > *((volatile u8 *)vdest) = *((u8 *)src); > src = (void *)((unsigned long)src + 1); > - vdest = (void *)((unsigned long)vdest + 1); > + vdest = (void *)((unsigned long)vdest + 1); > n--; > } > __asm__ __volatile__ ("sync" : : : "memory"); > --- include/asm-ppc64/prom.h.linas-orig 2005-04-29 20:32:46.000000000 -0500 > +++ include/asm-ppc64/prom.h 2005-05-06 12:28:43.000000000 -0500 > @@ -119,6 +119,7 @@ struct property { > */ > struct pci_controller; > struct iommu_table; > +struct eeh_recovery_ops; > 
> struct device_node { > char *name; > @@ -137,8 +138,12 @@ struct device_node { > int devfn; /* for pci devices */ > int eeh_mode; /* See eeh.h for possible EEH_MODEs */ > int eeh_config_addr; > + int eeh_check_count; /* number of times device driver ignored error */ > + int eeh_freeze_count; /* number of times this device froze up. */ > + int eeh_is_bridge; /* device is pci-to-pci bridge */ > struct pci_controller *phb; /* for pci devices */ > struct iommu_table *iommu_table; /* for phb's or bridges */ > + u32 config_space[16]; /* saved PCI config space */ > > struct property *properties; > struct device_node *parent; > --- include/asm-ppc64/rtas.h.linas-orig 2005-04-29 20:32:32.000000000 -0500 > +++ include/asm-ppc64/rtas.h 2005-05-06 12:28:43.000000000 -0500 > @@ -243,4 +243,6 @@ extern unsigned long rtas_rmo_buf; > > #define GLOBAL_INTERRUPT_QUEUE 9005 > > +extern int rtas_write_config(struct device_node *dn, int where, int size, u32 val); > + > #endif /* _PPC64_RTAS_H */ > --- arch/ppc64/kernel/eeh.c.linas-orig 2005-04-29 20:29:19.000000000 -0500 > +++ arch/ppc64/kernel/eeh.c 2005-05-31 15:13:51.000000000 -0500 > @@ -1,32 +1,33 @@ > /* > * eeh.c > * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation > - * > + * > * This program is free software; you can redistribute it and/or modify > * it under the terms of the GNU General Public License as published by > * the Free Software Foundation; either version 2 of the License, or > * (at your option) any later version. > - * > + * > * This program is distributed in the hope that it will be useful, > * but WITHOUT ANY WARRANTY; without even the implied warranty of > * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > * GNU General Public License for more details. 
> - * > + * > * You should have received a copy of the GNU General Public License > * along with this program; if not, write to the Free Software > * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > */ > > -#include > +#include > #include > +#include > #include > -#include > #include > #include > #include > #include > #include > #include > +#include > #include > #include > #include > @@ -49,8 +50,8 @@ > * were "empty": all reads return 0xff's and all writes are silently > * ignored. EEH slot isolation events can be triggered by parity > * errors on the address or data busses (e.g. during posted writes), > - * which in turn might be caused by dust, vibration, humidity, > - * radioactivity or plain-old failed hardware. > + * which in turn might be caused by low voltage on the bus, dust, > + * vibration, humidity, radioactivity or plain-old failed hardware. > * > * Note, however, that one of the leading causes of EEH slot > * freeze events are buggy device drivers, buggy device microcode, > @@ -75,22 +76,13 @@ > #define BUID_HI(buid) ((buid) >> 32) > #define BUID_LO(buid) ((buid) & 0xffffffff) > > -/* EEH event workqueue setup. */ > -static DEFINE_SPINLOCK(eeh_eventlist_lock); > -LIST_HEAD(eeh_eventlist); > -static void eeh_event_handler(void *); > -DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL); > - > -static struct notifier_block *eeh_notifier_chain; > - > /* > * If a device driver keeps reading an MMIO register in an interrupt > * handler after a slot isolation event has occurred, we assume it > * is broken and panic. This sets the threshold for how many read > * attempts we allow before panicking. 
> */ > -#define EEH_MAX_FAILS 1000 > -static atomic_t eeh_fail_count; > +#define EEH_MAX_FAILS 100000 > > /* RTAS tokens */ > static int ibm_set_eeh_option; > @@ -107,6 +99,10 @@ static DEFINE_SPINLOCK(slot_errbuf_lock) > static int eeh_error_buf_size; > > /* System monitoring statistics */ > +static DEFINE_PER_CPU(unsigned long, no_device); > +static DEFINE_PER_CPU(unsigned long, no_dn); > +static DEFINE_PER_CPU(unsigned long, no_cfg_addr); > +static DEFINE_PER_CPU(unsigned long, ignored_check); > static DEFINE_PER_CPU(unsigned long, total_mmio_ffs); > static DEFINE_PER_CPU(unsigned long, false_positives); > static DEFINE_PER_CPU(unsigned long, ignored_failures); > @@ -225,9 +221,9 @@ pci_addr_cache_insert(struct pci_dev *de > while (*p) { > parent = *p; > piar = rb_entry(parent, struct pci_io_addr_range, rb_node); > - if (alo < piar->addr_lo) { > + if (ahi < piar->addr_lo) { > p = &parent->rb_left; > - } else if (ahi > piar->addr_hi) { > + } else if (alo > piar->addr_hi) { > p = &parent->rb_right; > } else { > if (dev != piar->pcidev || > @@ -246,6 +242,11 @@ pci_addr_cache_insert(struct pci_dev *de > piar->pcidev = dev; > piar->flags = flags; > > +#ifdef DEBUG > + printk (KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n", > + alo, ahi, pci_name (dev)); > +#endif > + > rb_link_node(&piar->rb_node, parent, p); > rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root); > > @@ -268,9 +269,10 @@ static void __pci_addr_cache_insert_devi > /* Skip any devices for which EEH is not enabled. 
*/ > if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || > dn->eeh_mode & EEH_MODE_NOCHECK) { > -#ifdef DEBUG > - printk(KERN_INFO "PCI: skip building address cache for=%s %s\n", > - pci_name(dev), pci_pretty_name(dev)); > +// #ifdef DEBUG > +#if 1 > + printk(KERN_INFO "PCI: skip building address cache for=%s %s %s\n", > + pci_name(dev), pci_pretty_name(dev), dn->type); > #endif > return; > } > @@ -369,8 +371,12 @@ void pci_addr_cache_remove_device(struct > */ > void __init pci_addr_cache_build(void) > { > + struct device_node *dn; > struct pci_dev *dev = NULL; > > + if (!eeh_subsystem_enabled) > + return; > + > spin_lock_init(&pci_io_addr_cache_root.piar_lock); > > while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { > @@ -379,6 +385,17 @@ void __init pci_addr_cache_build(void) > continue; > } > pci_addr_cache_insert_device(dev); > + > + /* Save the BAR's; firmware doesn't restore these after EEH reset */ > + dn = pci_device_to_OF_node(dev); > + if (dn) { > + int i; > + for (i = 0; i < 16; i++) > + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); > + > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) > + dn->eeh_is_bridge = 1; > + } > } > > #ifdef DEBUG > @@ -390,24 +407,32 @@ void __init pci_addr_cache_build(void) > /* --------------------------------------------------------------- */ > /* Above lies the PCI Address Cache. Below lies the EEH event infrastructure */ > > -/** > - * eeh_register_notifier - Register to find out about EEH events. > - * @nb: notifier block to callback on events > - */ > -int eeh_register_notifier(struct notifier_block *nb) > +void eeh_slot_error_detail (struct device_node *dn, int severity) > { > - return notifier_chain_register(&eeh_notifier_chain, nb); > -} > + unsigned long flags; > + int rc; > > -/** > - * eeh_unregister_notifier - Unregister to an EEH event notifier. 
> - * @nb: notifier block to callback on events > - */ > -int eeh_unregister_notifier(struct notifier_block *nb) > -{ > - return notifier_chain_unregister(&eeh_notifier_chain, nb); > + if (!dn) return; > + > + /* Log the error with the rtas logger */ > + spin_lock_irqsave(&slot_errbuf_lock, flags); > + memset(slot_errbuf, 0, eeh_error_buf_size); > + > + rc = rtas_call(ibm_slot_error_detail, > + 8, 1, NULL, dn->eeh_config_addr, > + BUID_HI(dn->phb->buid), > + BUID_LO(dn->phb->buid), NULL, 0, > + virt_to_phys(slot_errbuf), > + eeh_error_buf_size, > + severity); > + > + if (rc == 0) > + log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); > + spin_unlock_irqrestore(&slot_errbuf_lock, flags); > } > > +EXPORT_SYMBOL(eeh_slot_error_detail); > + > /** > * read_slot_reset_state - Read the reset state of a device node's slot > * @dn: device node to read > @@ -422,6 +447,7 @@ static int read_slot_reset_state(struct > outputs = 4; > } else { > token = ibm_read_slot_reset_state; > + rets[2] = 0; /* fake PE Unavailable info */ > outputs = 3; > } > > @@ -430,75 +456,8 @@ static int read_slot_reset_state(struct > } > > /** > - * eeh_panic - call panic() for an eeh event that cannot be handled. > - * The philosophy of this routine is that it is better to panic and > - * halt the OS than it is to risk possible data corruption by > - * oblivious device drivers that don't know better. > - * > - * @dev pci device that had an eeh event > - * @reset_state current reset state of the device slot > - */ > -static void eeh_panic(struct pci_dev *dev, int reset_state) > -{ > - /* > - * XXX We should create a separate sysctl for this. > - * > - * Since the panic_on_oops sysctl is used to halt the system > - * in light of potential corruption, we can use it here. 
> - */ > - if (panic_on_oops) > - panic("EEH: MMIO failure (%d) on device:%s %s\n", reset_state, > - pci_name(dev), pci_pretty_name(dev)); > - else { > - __get_cpu_var(ignored_failures)++; > - printk(KERN_INFO "EEH: Ignored MMIO failure (%d) on device:%s %s\n", > - reset_state, pci_name(dev), pci_pretty_name(dev)); > - } > -} > - > -/** > - * eeh_event_handler - dispatch EEH events. The detection of a frozen > - * slot can occur inside an interrupt, where it can be hard to do > - * anything about it. The goal of this routine is to pull these > - * detection events out of the context of the interrupt handler, and > - * re-dispatch them for processing at a later time in a normal context. > - * > - * @dummy - unused > - */ > -static void eeh_event_handler(void *dummy) > -{ > - unsigned long flags; > - struct eeh_event *event; > - > - while (1) { > - spin_lock_irqsave(&eeh_eventlist_lock, flags); > - event = NULL; > - if (!list_empty(&eeh_eventlist)) { > - event = list_entry(eeh_eventlist.next, struct eeh_event, list); > - list_del(&event->list); > - } > - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); > - if (event == NULL) > - break; > - > - printk(KERN_INFO "EEH: MMIO failure (%d), notifiying device " > - "%s %s\n", event->reset_state, > - pci_name(event->dev), pci_pretty_name(event->dev)); > - > - atomic_set(&eeh_fail_count, 0); > - notifier_call_chain (&eeh_notifier_chain, > - EEH_NOTIFY_FREEZE, event); > - > - __get_cpu_var(slot_resets)++; > - > - pci_dev_put(event->dev); > - kfree(event); > - } > -} > - > -/** > - * eeh_token_to_phys - convert EEH address token to phys address > - * @token i/o token, should be address in the form 0xE.... > + * eeh_token_to_phys - convert I/O address to phys address > + * @token i/o address, should be address in the form 0xA.... 
> */ > static inline unsigned long eeh_token_to_phys(unsigned long token) > { > @@ -513,6 +472,18 @@ static inline unsigned long eeh_token_to > return pa | (token & (PAGE_SIZE-1)); > } > > + > +static inline struct pci_dev * eeh_find_pci_dev(struct device_node *dn) > +{ > + struct pci_dev *dev = NULL; > + for_each_pci_dev(dev) { > + if (pci_device_to_OF_node(dev) == dn) > + return dev; > + } > + return NULL; > +} > + > + > /** > * eeh_dn_check_failure - check if all 1's data is due to EEH slot freeze > * @dn device node > @@ -528,29 +499,37 @@ static inline unsigned long eeh_token_to > * > * It is safe to call this routine in an interrupt context. > */ > +extern void disable_irq_nosync(unsigned int); > + > int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) > { > int ret; > int rets[3]; > - unsigned long flags; > - int rc, reset_state; > - struct eeh_event *event; > + enum pci_channel_state state; > > __get_cpu_var(total_mmio_ffs)++; > > if (!eeh_subsystem_enabled) > return 0; > > - if (!dn) > + if (!dn) { > + __get_cpu_var(no_dn)++; > return 0; > + } > > /* Access to IO BARs might get this far and still not want checking. */ > if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || > dn->eeh_mode & EEH_MODE_NOCHECK) { > + __get_cpu_var(ignored_check)++; > +#ifdef DEBUG > + printk ("EEH:ignored check for %s %s\n", > + pci_pretty_name (dev), dn->full_name); > +#endif > return 0; > } > > if (!dn->eeh_config_addr) { > + __get_cpu_var(no_cfg_addr)++; > return 0; > } > > @@ -559,12 +538,18 @@ int eeh_dn_check_failure(struct device_n > * slot, we know it's bad already, we don't need to check... 
> */ > if (dn->eeh_mode & EEH_MODE_ISOLATED) { > - atomic_inc(&eeh_fail_count); > - if (atomic_read(&eeh_fail_count) >= EEH_MAX_FAILS) { > + dn->eeh_check_count ++; > + if (dn->eeh_check_count >= EEH_MAX_FAILS) { > + printk (KERN_ERR "EEH: Device driver ignored %d bad reads, panicking\n", > + dn->eeh_check_count); > + dump_stack(); > /* re-read the slot reset state */ > if (read_slot_reset_state(dn, rets) != 0) > rets[0] = -1; /* reset state unknown */ > - eeh_panic(dev, rets[0]); > + > + /* If we are here, then we hit an infinite loop. Stop. */ > + panic("EEH: MMIO halt (%d) on device:%s %s\n", rets[0], > + pci_name(dev), pci_pretty_name(dev)); > } > return 0; > } > @@ -577,53 +562,41 @@ int eeh_dn_check_failure(struct device_n > * In any case they must share a common PHB. > */ > ret = read_slot_reset_state(dn, rets); > - if (!(ret == 0 && rets[1] == 1 && (rets[0] == 2 || rets[0] == 4))) { > + if (!(ret == 0 && ((rets[1] == 1 && (rets[0] == 2 || rets[0] >= 4)) > + || (rets[0] == 5)))) { > __get_cpu_var(false_positives)++; > return 0; > } > > - /* prevent repeated reports of this failure */ > - dn->eeh_mode |= EEH_MODE_ISOLATED; > - > - reset_state = rets[0]; > + /* Note that empty slots will fail; empty slots don't have children... 
*/ > + if ((rets[0] == 5) && (dn->child == NULL)) { > + __get_cpu_var(false_positives)++; > + return 0; > + } > > - spin_lock_irqsave(&slot_errbuf_lock, flags); > - memset(slot_errbuf, 0, eeh_error_buf_size); > + /* Prevent repeated reports of this failure */ > + dn->eeh_mode |= EEH_MODE_ISOLATED; > + __get_cpu_var(slot_resets)++; > > - rc = rtas_call(ibm_slot_error_detail, > - 8, 1, NULL, dn->eeh_config_addr, > - BUID_HI(dn->phb->buid), > - BUID_LO(dn->phb->buid), NULL, 0, > - virt_to_phys(slot_errbuf), > - eeh_error_buf_size, > - 1 /* Temporary Error */); > + if (!dev) > + dev = eeh_find_pci_dev (dn); > > - if (rc == 0) > - log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); > - spin_unlock_irqrestore(&slot_errbuf_lock, flags); > + /* Some devices go crazy if irq's are not ack'ed; disable irq now */ > + if (dev) > + disable_irq_nosync (dev->irq); > + > + state = pci_channel_io_normal; > + if ((rets[0] == 2) || (rets[0] == 4)) > + state = pci_channel_io_frozen; > + if (rets[0] == 5) > + state = pci_channel_io_perm_failure; > > - printk(KERN_INFO "EEH: MMIO failure (%d) on device: %s %s\n", > - rets[0], dn->name, dn->full_name); > - event = kmalloc(sizeof(*event), GFP_ATOMIC); > - if (event == NULL) { > - eeh_panic(dev, reset_state); > - return 1; > - } > - > - event->dev = dev; > - event->dn = dn; > - event->reset_state = reset_state; > - > - /* We may or may not be called in an interrupt context */ > - spin_lock_irqsave(&eeh_eventlist_lock, flags); > - list_add(&event->list, &eeh_eventlist); > - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); > + peh_send_failure_event (dev, state, rets[2]); > > /* Most EEH events are due to device driver bugs. Having > * a stack trace will help the device-driver authors figure > * out what happened. So print that out. 
*/ > - dump_stack(); > - schedule_work(&eeh_event_wq); > + if (rets[0] != 5) dump_stack(); > > return 0; > } > @@ -635,7 +608,6 @@ EXPORT_SYMBOL(eeh_dn_check_failure); > * @token i/o token, should be address in the form 0xA.... > * @val value, should be all 1's (XXX why do we need this arg??) > * > - * Check for an eeh failure at the given token address. > * Check for an EEH failure at the given token address. Call this > * routine if the result of a read was all 0xff's and you want to > * find out if this is due to an EEH slot freeze event. This routine > @@ -643,6 +615,7 @@ EXPORT_SYMBOL(eeh_dn_check_failure); > * > * Note this routine is safe to call in an interrupt context. > */ > + > unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val) > { > unsigned long addr; > @@ -652,8 +625,10 @@ unsigned long eeh_check_failure(const vo > /* Finding the phys addr + pci device; this is pretty quick. */ > addr = eeh_token_to_phys((unsigned long __force) token); > dev = pci_get_device_by_addr(addr); > - if (!dev) > + if (!dev) { > + __get_cpu_var(no_device)++; > return val; > + } > > dn = pci_device_to_OF_node(dev); > eeh_dn_check_failure (dn, dev); > @@ -664,6 +639,234 @@ unsigned long eeh_check_failure(const vo > > EXPORT_SYMBOL(eeh_check_failure); > > +/* ------------------------------------------------------------- */ > +/* The code below deals with error recovery */ > + > +int > +eeh_slot_is_isolated(struct pci_dev *dev) > +{ > + struct device_node *dn; > + dn = pci_device_to_OF_node(dev); > + return (dn->eeh_mode & EEH_MODE_ISOLATED); > +} > +EXPORT_SYMBOL(eeh_slot_is_isolated); > + > +int > +eeh_ioaddr_is_isolated(const volatile void __iomem *token) > +{ > + unsigned long addr; > + struct pci_dev *dev; > + int rc; > + > + addr = eeh_token_to_phys((unsigned long __force) token); > + dev = pci_get_device_by_addr(addr); > + if (!dev) > + return 0; > + rc = eeh_slot_is_isolated(dev); > + pci_dev_put(dev); > + return rc; > +} > + > +/** 
eeh_pci_slot_reset -- raises/lowers the pci #RST line > + * state: 1/0 to raise/lower the #RST > + */ > +void > +eeh_pci_slot_reset(struct pci_dev *dev, int state) > +{ > + struct device_node *dn = pci_device_to_OF_node(dev); > + rtas_pci_slot_reset (dn, state); > +} > + > +/** Return negative value if a permanent error, else return > + * a number of milliseconds to wait until the PCI slot is > + * ready to be used. > + */ > +static int > +eeh_slot_availability(struct device_node *dn) > +{ > + int rc; > + int rets[3]; > + > + rc = read_slot_reset_state(dn, rets); > + > + if (rc) return rc; > + > + if (rets[1] == 0) return -1; /* EEH is not supported */ > + if (rets[0] == 0) return 0; /* Oll Korrect */ > + if (rets[0] == 5) { > + if (rets[2] == 0) return -1; /* permanently unavailable */ > + return rets[2]; /* number of millisecs to wait */ > + } > + return -1; > +} > + > +int > +eeh_pci_slot_availability(struct pci_dev *dev) > +{ > + struct device_node *dn = pci_device_to_OF_node(dev); > + if (!dn) return -1; > + > + BUG_ON (dn->phb==NULL); > + if (dn->phb==NULL) { > + printk (KERN_ERR "EEH, checking on slot with no phb dn=%s dev=%s:%s\n", > + dn->full_name, pci_name(dev), pci_pretty_name (dev)); > + return -1; > + } > + return eeh_slot_availability (dn); > +} > + > +void > +rtas_pci_slot_reset(struct device_node *dn, int state) > +{ > + int rc; > + > + if (!dn) > + return; > + if (!dn->phb) { > + printk (KERN_WARNING "EEH: in slot reset, device node %s has no phb\n", dn->full_name); > + return; > + } > + > + dn->eeh_mode |= EEH_MODE_RECOVERING; > + rc = rtas_call(ibm_set_slot_reset,4,1, NULL, > + dn->eeh_config_addr, > + BUID_HI(dn->phb->buid), > + BUID_LO(dn->phb->buid), > + state); > + if (rc) { > + printk (KERN_WARNING "EEH: Unable to reset the failed slot, (%d) #RST=%d\n", rc, state); > + return; > + } > + > + if (state == 0) > + dn->eeh_mode &= ~(EEH_MODE_RECOVERING|EEH_MODE_ISOLATED); > +} > + > +/** rtas_set_slot_reset -- assert the pci #RST line for 1/4 
second > + * dn -- device node to be reset. > + */ > + > +void > +rtas_set_slot_reset(struct device_node *dn) > +{ > + int i, rc; > + > + rtas_pci_slot_reset (dn, 1); > + > + /* The PCI bus requires that the reset be held high for at least > + * 100 milliseconds. We wait a bit longer 'just in case'. */ > + > +#define PCI_BUS_RST_HOLD_TIME_MSEC 250 > + msleep (PCI_BUS_RST_HOLD_TIME_MSEC); > + rtas_pci_slot_reset (dn, 0); > + > + /* After a PCI slot has been reset, the PCI Express spec requires > + * a 1.5 second idle time for the bus to stabilize, before starting > + * up traffic. */ > +#define PCI_BUS_SETTLE_TIME_MSEC 1800 > + msleep (PCI_BUS_SETTLE_TIME_MSEC); > + > + /* Now double check with the firmware to make sure the device is > + * ready to be used; if not, wait for recovery. */ > + for (i=0; i<10; i++) { > + rc = eeh_slot_availability (dn); > + if (rc <= 0) break; > + > + msleep (rc+100); > + } > +} > + > +EXPORT_SYMBOL(rtas_set_slot_reset); > + > +void > +rtas_configure_bridge(struct device_node *dn) > +{ > + int token = rtas_token ("ibm,configure-bridge"); > + int rc; > + > + if (token == RTAS_UNKNOWN_SERVICE) > + return; > + rc = rtas_call(token,3,1, NULL, > + dn->eeh_config_addr, > + BUID_HI(dn->phb->buid), > + BUID_LO(dn->phb->buid)); > + if (rc) { > + printk (KERN_WARNING "EEH: Unable to configure device bridge (%d) for %s\n", > + rc, dn->full_name); > + } > +} > + > +EXPORT_SYMBOL(rtas_configure_bridge); > + > +/* ------------------------------------------------------- */ > +/** Save and restore of PCI BARs > + * > + * Although firmware will set up BARs during boot, it doesn't > + * set up device BARs after a device reset, although it will, > + * if requested, set up bridge configuration. Thus, we need to > + * configure the PCI devices ourselves. Config-space setup is > + * stored in the PCI structures which are normally deleted during > + * device removal. Thus, the "save" routine references the > + * structures so that they aren't deleted. 
> + */ > + > +/** > + * __restore_bars - Restore the Base Address Registers > + * Loads the PCI configuration space base address registers, > + * the expansion ROM base address, the latency timer, and etc. > + * from the saved values in the device node. > + */ > +static inline void __restore_bars (struct device_node *dn) > +{ > + int i; > + > + if (NULL==dn->phb) return; > + for (i=4; i<10; i++) { > + rtas_write_config(dn, i*4, 4, dn->config_space[i]); > + } > + > + /* 12 == Expansion ROM Address */ > + rtas_write_config(dn, 12*4, 4, dn->config_space[12]); > + > +#define BYTE_SWAP(OFF) (8*((OFF)/4)+3-(OFF)) > +#define SAVED_BYTE(OFF) (((u8 *)(dn->config_space))[BYTE_SWAP(OFF)]) > + > + rtas_write_config (dn, PCI_CACHE_LINE_SIZE, 1, > + SAVED_BYTE(PCI_CACHE_LINE_SIZE)); > + > + rtas_write_config (dn, PCI_LATENCY_TIMER, 1, > + SAVED_BYTE(PCI_LATENCY_TIMER)); > + > + /* max latency, min grant, interrupt pin and line */ > + rtas_write_config(dn, 15*4, 4, dn->config_space[15]); > +} > + > +/** > + * eeh_restore_bars - restore the PCI config space info > + */ > +void eeh_restore_bars(struct device_node *dn) > +{ > + if (! dn->eeh_is_bridge) > + __restore_bars (dn); > + > + if (dn->child) > + eeh_restore_bars (dn->child); > +} > + > +void eeh_pci_restore_bars(struct pci_dev *dev) > +{ > + struct device_node *dn = pci_device_to_OF_node(dev); > + eeh_restore_bars (dn); > +} > + > +/* ------------------------------------------------------------- */ > +/* The code below deals with enabling EEH for devices during the > + * early boot sequence. EEH must be enabled before any PCI probing > + * can be done. 
> + */ > + > +#define EEH_ENABLE 1 > + > struct eeh_early_enable_info { > unsigned int buid_hi; > unsigned int buid_lo; > @@ -682,6 +885,8 @@ static void *early_enable_eeh(struct dev > int enable; > > dn->eeh_mode = 0; > + dn->eeh_check_count = 0; > + dn->eeh_freeze_count = 0; > > if (status && strcmp(status, "ok") != 0) > return NULL; /* ignore devices with bad status */ > @@ -743,7 +948,7 @@ static void *early_enable_eeh(struct dev > dn->full_name); > } > > - return NULL; > + return NULL; > } > > /* > @@ -824,11 +1029,13 @@ void eeh_add_device_early(struct device_ > struct pci_controller *phb; > struct eeh_early_enable_info info; > > - if (!dn || !eeh_subsystem_enabled) > + if (!dn) > return; > phb = dn->phb; > if (NULL == phb || 0 == phb->buid) { > - printk(KERN_WARNING "EEH: Expected buid but found none\n"); > + printk(KERN_WARNING "EEH: Expected buid but found none for %s\n", > + dn->full_name); > + dump_stack(); > return; > } > > @@ -847,6 +1054,9 @@ EXPORT_SYMBOL(eeh_add_device_early); > */ > void eeh_add_device_late(struct pci_dev *dev) > { > + int i; > + struct device_node *dn; > + > if (!dev || !eeh_subsystem_enabled) > return; > > @@ -856,6 +1066,14 @@ void eeh_add_device_late(struct pci_dev > #endif > > pci_addr_cache_insert_device (dev); > + > + /* Save the BAR's; firmware doesn't restore these after EEH reset */ > + dn = pci_device_to_OF_node(dev); > + for (i = 0; i < 16; i++) > + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); > + > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) > + dn->eeh_is_bridge = 1; > } > EXPORT_SYMBOL(eeh_add_device_late); > > @@ -885,12 +1103,17 @@ static int proc_eeh_show(struct seq_file > unsigned int cpu; > unsigned long ffs = 0, positives = 0, failures = 0; > unsigned long resets = 0; > + unsigned long no_dev = 0, no_dn = 0, no_cfg = 0, no_check = 0; > > for_each_cpu(cpu) { > ffs += per_cpu(total_mmio_ffs, cpu); > positives += per_cpu(false_positives, cpu); > failures += per_cpu(ignored_failures, cpu); > resets 
+= per_cpu(slot_resets, cpu); > + no_dev += per_cpu(no_device, cpu); > + no_dn += per_cpu(no_dn, cpu); > + no_cfg += per_cpu(no_cfg_addr, cpu); > + no_check += per_cpu(ignored_check, cpu); > } > > if (0 == eeh_subsystem_enabled) { > @@ -898,13 +1121,17 @@ static int proc_eeh_show(struct seq_file > seq_printf(m, "eeh_total_mmio_ffs=%ld\n", ffs); > } else { > seq_printf(m, "EEH Subsystem is enabled\n"); > - seq_printf(m, "eeh_total_mmio_ffs=%ld\n" > + seq_printf(m, > + "no device=%ld\n" > + "no device node=%ld\n" > + "no config address=%ld\n" > + "check not wanted=%ld\n" > + "eeh_total_mmio_ffs=%ld\n" > "eeh_false_positives=%ld\n" > "eeh_ignored_failures=%ld\n" > - "eeh_slot_resets=%ld\n" > - "eeh_fail_count=%d\n", > - ffs, positives, failures, resets, > - eeh_fail_count.counter); > + "eeh_slot_resets=%ld\n", > + no_dev, no_dn, no_cfg, no_check, > + ffs, positives, failures, resets); > } > > return 0; > --- arch/ppc64/kernel/pSeries_pci.c.linas-orig 2005-04-29 20:33:03.000000000 -0500 > +++ arch/ppc64/kernel/pSeries_pci.c 2005-05-06 12:28:43.000000000 -0500 > @@ -52,7 +52,7 @@ static int s7a_workaround; > > extern struct mpic *pSeries_mpic; > > -static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) > +int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) > { > int returnval = -1; > unsigned long buid, addr; > @@ -101,7 +101,7 @@ static int rtas_pci_read_config(struct p > return PCIBIOS_DEVICE_NOT_FOUND; > } > > -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) > +int rtas_write_config(struct device_node *dn, int where, int size, u32 val) > { > unsigned long buid, addr; > int ret; > --- drivers/pci/hotplug/rpaphp.h.linas-orig 2005-04-29 20:26:21.000000000 -0500 > +++ drivers/pci/hotplug/rpaphp.h 2005-05-06 12:28:43.000000000 -0500 > @@ -118,7 +118,8 @@ extern int rpaphp_enable_pci_slot(struct > extern int register_pci_slot(struct slot *slot); > extern int 
rpaphp_unconfig_pci_adapter(struct slot *slot); > extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); > -extern struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev); > +extern void init_eeh_handler (void); > +extern void exit_eeh_handler (void); > > /* rpaphp_core.c */ > extern int rpaphp_add_slot(struct device_node *dn); > --- drivers/pci/hotplug/rpaphp_core.c.linas-orig 2005-04-29 20:32:16.000000000 -0500 > +++ drivers/pci/hotplug/rpaphp_core.c 2005-05-06 12:28:43.000000000 -0500 > @@ -460,12 +460,18 @@ static int __init rpaphp_init(void) > { > info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); > > + /* Get set to handle EEH events. */ > + init_eeh_handler(); > + > /* read all the PRA info from the system */ > return init_rpa(); > } > > static void __exit rpaphp_exit(void) > { > + /* Let EEH know we are going away. */ > + exit_eeh_handler(); > + > cleanup_slots(); > } > > --- drivers/pci/hotplug/rpaphp_pci.c.linas-orig 2005-04-29 20:22:38.000000000 -0500 > +++ drivers/pci/hotplug/rpaphp_pci.c 2005-05-16 11:59:30.000000000 -0500 > @@ -24,6 +24,7 @@ > */ > #include > #include > +#include > #include > #include > #include "../pci.h" /* for pci_add_new_bus */ > @@ -63,6 +64,7 @@ int rpaphp_claim_resource(struct pci_dev > root ? 
"Address space collision on" : > "No parent found for", > resource, dtype, pci_name(dev), res->start, res->end); > + dump_stack(); > } > return err; > } > @@ -188,6 +190,19 @@ rpaphp_fixup_new_pci_devices(struct pci_ > > static int rpaphp_pci_config_bridge(struct pci_dev *dev); > > +static void rpaphp_eeh_add_bus_device(struct pci_bus *bus) > +{ > + struct pci_dev *dev; > + list_for_each_entry(dev, &bus->devices, bus_list) { > + eeh_add_device_late(dev); > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { > + struct pci_bus *subbus = dev->subordinate; > + if (subbus) > + rpaphp_eeh_add_bus_device (subbus); > + } > + } > +} > + > /***************************************************************************** > rpaphp_pci_config_slot() will configure all devices under the > given slot->dn and return the the first pci_dev. > @@ -215,6 +230,8 @@ rpaphp_pci_config_slot(struct device_nod > } > if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) > rpaphp_pci_config_bridge(dev); > + > + rpaphp_eeh_add_bus_device(bus); > } > return dev; > } > @@ -223,7 +240,6 @@ static int rpaphp_pci_config_bridge(stru > { > u8 sec_busno; > struct pci_bus *child_bus; > - struct pci_dev *child_dev; > > dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev)); > > @@ -240,11 +256,7 @@ static int rpaphp_pci_config_bridge(stru > /* do pci_scan_child_bus */ > pci_scan_child_bus(child_bus); > > - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { > - eeh_add_device_late(child_dev); > - } > - > - /* fixup new pci devices without touching bus struct */ > + /* Fixup new pci devices without touching bus struct */ > rpaphp_fixup_new_pci_devices(child_bus, 0); > > /* Make the discovered devices available */ > @@ -282,7 +294,7 @@ static void print_slot_pci_funcs(struct > return; > } > #else > -static void print_slot_pci_funcs(struct slot *slot) > +static inline void print_slot_pci_funcs(struct slot *slot) > { > return; > } > @@ -364,7 +376,6 @@ static void rpaphp_eeh_remove_bus_device > if (pdev) 
> rpaphp_eeh_remove_bus_device(pdev); > } > - > } > return; > } > @@ -566,36 +577,3 @@ exit: > return retval; > } > > -struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev) > -{ > - struct list_head *tmp, *n; > - struct slot *slot; > - > - list_for_each_safe(tmp, n, &rpaphp_slot_head) { > - struct pci_bus *bus; > - struct list_head *ln; > - > - slot = list_entry(tmp, struct slot, rpaphp_slot_list); > - if (slot->bridge == NULL) { > - if (slot->dev_type == PCI_DEV) { > - printk(KERN_WARNING "PCI slot missing bridge %s %s \n", > - slot->name, slot->location); > - } > - continue; > - } > - > - bus = slot->bridge->subordinate; > - if (!bus) { > - continue; /* should never happen? */ > - } > - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { > - struct pci_dev *pdev = pci_dev_b(ln); > - if (pdev == dev) > - return slot->hotplug_slot; > - } > - } > - > - return NULL; > -} > - > -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); > --- drivers/pci/hotplug/rpaphp_eeh.c.linas-orig 2005-05-16 11:52:15.000000000 -0500 > +++ drivers/pci/hotplug/rpaphp_eeh.c 2005-05-31 11:20:06.000000000 -0500 > @@ -0,0 +1,354 @@ > +/* > + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform. > + * Copyright (C) 2004, 2005 Linas Vepstas > + * > + * All rights reserved. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or (at > + * your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > + * NON INFRINGEMENT. See the GNU General Public License for more > + * details. 
> + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. > + * > + * Send feedback to > + * > + */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "../pci.h" > +#include "rpaphp.h" > + > +/** > + * pci_search_bus_for_dev - return 1 if device is under this bus, else 0 > + * @bus: the bus to search for this device. > + * @dev: the pci device we are looking for. > + * > + * XXX should this be moved to drivers/pci/search.c ? > + */ > +static int pci_search_bus_for_dev (struct pci_bus *bus, struct pci_dev *dev) > +{ > + struct list_head *ln; > + > + if (!bus) return 0; > + > + for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { > + struct pci_dev *pdev = pci_dev_b(ln); > + if (pdev == dev) > + return 1; > + if (pdev->subordinate) { > + int rc; > + rc = pci_search_bus_for_dev (pdev->subordinate, dev); > + if (rc) > + return 1; > + } > + } > + return 0; > +} > + > +/** pci_walk_bus - walk bus under this device, calling callback. > + * @top device whose peers should be walked > + * @cb callback to be called for each device found > + * @userdata arbitrary pointer to be passed to callback. > + * > + * Walk the bus on which this device sits, including any > + * bridged devices on busses under this bus. Call the provided > + * callback on each device found. 
> + */ > +typedef void (*pci_buswalk_cb)(struct pci_dev *, void *); > + > +static void > +pci_walk_bus (struct pci_dev *top, pci_buswalk_cb cb, void *userdata) > +{ > + struct pci_dev *dev, *tmp; > + > + spin_lock(&pci_bus_lock); > + list_for_each_entry_safe (dev, tmp, &top->bus->devices, bus_list) { > + pci_dev_get(dev); > + spin_unlock(&pci_bus_lock); > + > + /* run device routines with the bus unlocked */ > + cb (dev, userdata); > + if (dev->subordinate) { > + pci_walk_bus (pci_dev_b(&dev->subordinate->devices), cb, userdata); > + } > + spin_lock(&pci_bus_lock); > + pci_dev_put(dev); > + } > + spin_unlock(&pci_bus_lock); > +} > + > +/** > + * rpaphp_find_slot - find and return the slot holding the device > + * @dev: pci device for which we want the slot structure. > + */ > +static struct slot *rpaphp_find_slot(struct pci_dev *dev) > +{ > + struct list_head *tmp, *n; > + struct slot *slot; > + > + list_for_each_safe(tmp, n, &rpaphp_slot_head) { > + struct pci_bus *bus; > + > + slot = list_entry(tmp, struct slot, rpaphp_slot_list); > + > + /* PHB's don't have bridges. */ > + if (slot->bridge == NULL) > + continue; > + > + /* The PCI device could be the slot itself. */ > + if (slot->bridge == dev) > + return slot; > + > + bus = slot->bridge->subordinate; > + if (!bus) { > + printk (KERN_WARNING "PCI bridge is missing bus: %s %s\n", > + pci_name (slot->bridge), pci_pretty_name (slot->bridge)); > + continue; /* should never happen? */ > + } > + > + if (pci_search_bus_for_dev (bus, dev)) > + return slot; > + } > + return NULL; > +} > + > +/* ------------------------------------------------------- */ > +/** eeh_report_error - report an EEH error to each device, > + * collect up and merge the device responses. 
> + */ > + > +static void eeh_report_error(struct pci_dev *dev, void *userdata) > +{ > + enum pcierr_result rc, *res = userdata; > + > + if (dev->driver->err_handler.error_detected) { > + rc = dev->driver->err_handler.error_detected (dev, pci_channel_io_frozen); > + if (*res == PCIERR_RESULT_NONE) *res = rc; > + if (*res == PCIERR_RESULT_NEED_RESET) return; > + if (*res == PCIERR_RESULT_DISCONNECT && > + rc == PCIERR_RESULT_NEED_RESET) *res = rc; > + } > +} > + > +/** eeh_report_reset -- tell this device that the pci slot > + * has been reset. > + */ > + > +static void eeh_report_reset(struct pci_dev *dev, void *userdata) > +{ > + if (dev->driver->err_handler.slot_reset) > + dev->driver->err_handler.slot_reset (dev); > +} > + > +static void eeh_report_resume(struct pci_dev *dev, void *userdata) > +{ > + if (dev->driver->err_handler.resume) > + dev->driver->err_handler.resume (dev); > +} > + > +static void eeh_report_failure(struct pci_dev *dev, void *userdata) > +{ > + if (dev->driver->err_handler.error_detected) > + dev->driver->err_handler.error_detected (dev, pci_channel_io_perm_failure); > +} > + > +/* ------------------------------------------------------- */ > +/** > + * handle_eeh_events -- reset a PCI device after hard lockup. > + * > + * pSeries systems will isolate a PCI slot if the PCI-Host > + * bridge detects address or data parity errors, DMA's > + * occuring to wild addresses (which usually happen due to > + * bugs in device drivers or in PCI adapter firmware). > + * Slot isolations also occur if #SERR, #PERR or other misc > + * PCI-related errors are detected. > + * > + * Recovery process consists of unplugging the device driver > + * (which generated hotplug events to userspace), then issuing > + * a PCI #RST to the device, then reconfiguring the PCI config > + * space for all bridges & devices under this slot, and then > + * finally restarting the device drivers (which cause a second > + * set of hotplug events to go out to userspace). 
> + */ > + > +int eeh_reset_device (struct pci_dev *dev, struct device_node *dn, int reconfig) > +{ > + struct slot *frozen_slot= NULL; > + > + if (!dev) > + return 1; > + > + if (reconfig) > + frozen_slot = rpaphp_find_slot(dev); > + > + if (reconfig && frozen_slot) rpaphp_unconfig_pci_adapter (frozen_slot); > + > + /* Reset the pci controller. (Asserts RST#; resets config space). > + * Reconfigure bridges and devices */ > + rtas_set_slot_reset (dn->child); > + rtas_configure_bridge(dn); > + eeh_restore_bars(dn->child); > + > + enable_irq (dev->irq); > + > + /* Give the system 5 seconds to finish running the user-space > + * hotplug scripts, e.g. ifdown for ethernet. Yes, this is a hack, > + * but if we don't do this, weird things happen. > + */ > + if (reconfig && frozen_slot) { > + ssleep (5); > + rpaphp_enable_pci_slot (frozen_slot); > + } > + return 0; > +} > + > +/* The longest amount of time to wait for a pci device > + * to come back on line, in seconds. > + */ > +#define MAX_WAIT_FOR_RECOVERY 15 > + > +int handle_eeh_events (struct notifier_block *self, > + unsigned long reason, void *ev) > +{ > + int freeze_count=0; > + struct device_node *frozen_device; > + struct peh_event *event = ev; > + struct pci_dev *dev = event->dev; > + int perm_failure = 0; > + > + if (!dev) > + { > + printk ("EEH: EEH error caught, but no PCI device specified!\n"); > + return 1; > + } > + > + frozen_device = pci_bus_to_OF_node(dev->bus); > + if (!frozen_device) > + { > + printk (KERN_ERR "EEH: Cannot find PCI controller for %s %s\n", > + pci_name(dev), pci_pretty_name (dev)); > + > + return 1; > + } > + BUG_ON (frozen_device->phb==NULL); > + > + /* We get "permanent failure" messages on empty slots. > + * These are false alarms. Empty slots have no child dn. 
*/ > + if ((event->state == pci_channel_io_perm_failure) && (frozen_device == NULL)) > + return 0; > + > + if (frozen_device) > + freeze_count = frozen_device->eeh_freeze_count; > + freeze_count ++; > + if (freeze_count > EEH_MAX_ALLOWED_FREEZES) > + perm_failure = 1; > + > + /* If the reset state is a '5' and the time to reset is 0 (infinity) > + * or is more then 15 seconds, then mark this as a permanent failure. > + */ > + if ((event->state == pci_channel_io_perm_failure) && > + ((event->time_unavail <= 0) || > + (event->time_unavail > MAX_WAIT_FOR_RECOVERY*1000))) > + perm_failure = 1; > + > + /* Log the error with the rtas logger. */ > + if (perm_failure) { > + /* > + * About 90% of all real-life EEH failures in the field > + * are due to poorly seated PCI cards. Only 10% or so are > + * due to actual, failed cards. > + */ > + printk (KERN_ERR > + "EEH: device %s:%s has failed %d times \n" > + "and has been permanently disabled. Please try reseating\n" > + "this device or replacing it.\n", > + pci_name (dev), > + pci_pretty_name (dev), > + freeze_count); > + > + eeh_slot_error_detail (frozen_device, 2 /* Permanent Error */); > + > + /* Notify all devices that they're about to go down. */ > + pci_walk_bus (dev, eeh_report_failure, 0); > + > + /* If there's a hotplug slot, unconfigure it */ > + // XXX we need alternate way to deconfigure non-hotplug slots. > + struct slot * frozen_slot = rpaphp_find_slot(dev); > + if (frozen_slot) > + rpaphp_unconfig_pci_adapter (frozen_slot); > + return 1; > + } else { > + eeh_slot_error_detail (frozen_device, 1 /* Temporary Error */); > + } > + > + printk (KERN_WARNING > + "EEH: This device has failed %d times since last reboot: %s:%s\n", > + freeze_count, > + pci_name (dev), > + pci_pretty_name (dev)); > + > + /* Walk the various device drivers attached to this slot, > + * letting each know about the EEH bug. 
> + */ > + enum pcierr_result result = PCIERR_RESULT_NONE; > + pci_walk_bus (dev, eeh_report_error, &result); > + > + /* If all device drivers were EEH-unaware, then pci hotplug > + * the device, and hope that clears the error. */ > + if (result == PCIERR_RESULT_NONE) { > + eeh_reset_device (dev, frozen_device, 1); > + } > + > + /* If any device called out for a reset, then reset the slot */ > + if (result == PCIERR_RESULT_NEED_RESET) { > + eeh_reset_device (dev, frozen_device, 0); > + pci_walk_bus (dev, eeh_report_reset, 0); > + } > + > + /* If all devices reported they can proceed, the re-enable PIO */ > + if (result == PCIERR_RESULT_CAN_RECOVER) { > + /* XXX Not supported; we brute-force reset the device */ > + eeh_reset_device (dev, frozen_device, 0); > + pci_walk_bus (dev, eeh_report_reset, 0); > + } > + > + /* Tell all device drivers that they can resume operations */ > + pci_walk_bus (dev, eeh_report_resume, 0); > + > + /* Store the freeze count with the pci adapter, and not the slot. > + * This way, if the device is replaced, the count is cleared. 
> + */ > + frozen_device->eeh_freeze_count = freeze_count; > + > + return 1; > +} > + > +static struct notifier_block eeh_block; > + > +void __init init_eeh_handler (void) > +{ > + eeh_block.notifier_call = handle_eeh_events; > + peh_register_notifier (&eeh_block); > +} > + > +void __exit exit_eeh_handler (void) > +{ > + peh_unregister_notifier (&eeh_block); > +} > + > --- drivers/pci/hotplug/Makefile.linas-orig 2005-04-29 20:29:50.000000000 -0500 > +++ drivers/pci/hotplug/Makefile 2005-05-16 11:53:52.000000000 -0500 > @@ -41,6 +41,7 @@ acpiphp-objs := acpiphp_core.o \ > acpiphp_res.o > > rpaphp-objs := rpaphp_core.o \ > + rpaphp_eeh.o \ > rpaphp_pci.o \ > rpaphp_slot.o \ > rpaphp_vio.o > > > ------------------------------------------------------------------------ > > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/cgi-bin/mailman/listinfo/linuxppc64-dev -- Brian King eServer Storage I/O IBM Linux Technology Center From johnrose at austin.ibm.com Thu Jun 2 04:10:36 2005 From: johnrose at austin.ibm.com (John Rose) Date: Wed, 01 Jun 2005 13:10:36 -0500 Subject: [PATCH] initialize TCE tables Message-ID: <1117649436.28482.3.camel@sinatra.austin.ibm.com> A fairly recent platform requirement states that the OS must clear the whole TCE table at setup time, in case firmware left any active mappings in it. Without this initialization, dynamic bus removes can fail. Firmware rejects these requests if active mappings still exist for a slot that has been deallocated by the OS. If there are no objections, I'll forward this to Andrew Morton. 
Thanks- John Signed-off-by: Olof Johansson Signed-off-by: John Rose diff -puN arch/ppc64/kernel/iommu.c~initialize_tces arch/ppc64/kernel/iommu.c --- 2_6_linus_2/arch/ppc64/kernel/iommu.c~initialize_tces 2005-06-01 12:17:53.000000000 -0500 +++ 2_6_linus_2-johnrose/arch/ppc64/kernel/iommu.c 2005-06-01 12:19:56.000000000 -0500 @@ -423,6 +423,9 @@ struct iommu_table *iommu_init_table(str tbl->it_largehint = tbl->it_halfpoint; spin_lock_init(&tbl->it_lock); + /* Clear the hardware table in case firmware left allocations in it */ + ppc_md.tce_free(tbl, 0, tbl->it_size); + if (!welcomed) { printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n", novmerge ? "disabled" : "enabled"); _ From benh at kernel.crashing.org Thu Jun 2 08:22:02 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 02 Jun 2005 08:22:02 +1000 Subject: [PATCH] ppc64: fix call_prom return checks Message-ID: <1117664522.19020.75.camel@gaston> Hi Arnd, Paul ! What about this patch ? (untested at the moment). I'm never very comfortable with those sign non-extension issues in prom_init ... It also fixes another 32 vs. 64 bytes issue on properties. Ben. Index: linux-work/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom_init.c 2005-06-02 08:11:37.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom_init.c 2005-06-02 08:19:27.000000000 +1000 @@ -1062,7 +1062,7 @@ prom_printf("opening PHB %s", path); phb_node = call_prom("open", 1, 1, path); - if ( (long)phb_node <= 0) + if (phb_node == ~0u) prom_printf("... failed\n"); else prom_printf("... 
done\n"); @@ -1279,12 +1279,12 @@ /* get a handle for the stdout device */ _prom->chosen = call_prom("finddevice", 1, 1, ADDR("/chosen")); - if ((long)_prom->chosen <= 0) + if ((long)_prom->chosen == ~0u) prom_panic("cannot find chosen"); /* msg won't be printed :( */ /* get device tree root */ _prom->root = call_prom("finddevice", 1, 1, ADDR("/")); - if ((long)_prom->root <= 0) + if ((long)_prom->root == ~0u) prom_panic("cannot find device tree root"); /* msg won't be printed :( */ } @@ -1426,7 +1426,7 @@ * leave some room at the end of the path for appending extra * arguments */ - if (call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-10) < 0) + if (call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-10) == ~0u) continue; prom_printf("found display : %s, opening ... ", path); @@ -1514,6 +1514,12 @@ return 0; } +/* + * The Open Firmware 1275 specification states properties must be 31 bytes or + * less, however not all firmwares obey this. Make it 64 bytes to be safe. + */ +#define MAX_PROPERTY_NAME 64 + static void __init scan_dt_build_strings(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1528,9 +1534,9 @@ prev_name = RELOC(""); for (;;) { - /* 32 is max len of name including nul. */ - namep = make_room(mem_start, mem_end, 32, 1); - if (call_prom("nextprop", 3, 1, node, prev_name, namep) <= 0) { + /* 64 is max len of name including nul. */ + namep = make_room(mem_start, mem_end, MAX_PROPERTY_NAME, 1); + if (call_prom("nextprop", 3, 1, node, prev_name, namep) == ~0u) { /* No more nodes: unwind alloc */ *mem_start = (unsigned long)namep; break; @@ -1555,12 +1561,6 @@ } } -/* - * The Open Firmware 1275 specification states properties must be 31 bytes or - * less, however not all firmwares obey this. Make it 64 bytes to be safe. 
- */ -#define MAX_PROPERTY_NAME 64 - static void __init scan_dt_build_struct(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1607,7 +1607,7 @@ prev_name = RELOC(""); sstart = (char *)RELOC(dt_string_start); for (;;) { - if (call_prom("nextprop", 3, 1, node, prev_name, pname) <= 0) + if (call_prom("nextprop", 3, 1, node, prev_name, pname) == ~0u) break; /* find string offset */ @@ -1623,7 +1623,7 @@ l = call_prom("getproplen", 2, 1, node, pname); /* sanity checks */ - if (l < 0) + if (l == ~0u) continue; if (l > MAX_PROPERTY_LENGTH) { prom_printf("WARNING: ignoring large property "); @@ -1771,13 +1771,13 @@ /* Some G5s have a missing interrupt definition, fix it up here */ u3 = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000")); - if ((long)u3 <= 0) + if (u3 == ~0u) return; i2c = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000/i2c at f8001000")); - if ((long)i2c <= 0) + if (i2c <= ~0u) return; mpic = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000/mpic at f8040000")); - if ((long)mpic <= 0) + if (mpic <= ~0u) return; /* check if proper rev of u3 */ From benh at kernel.crashing.org Thu Jun 2 08:25:00 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 02 Jun 2005 08:25:00 +1000 Subject: [PATCH] ppc64: set/clear SMT capable bit at boot In-Reply-To: <429BF93F.1030301@austin.ibm.com> References: <20050529234154.GG11066@krispykreme> <429BF93F.1030301@austin.ibm.com> Message-ID: <1117664701.19020.79.camel@gaston> On Tue, 2005-05-31 at 00:42 -0500, Joel Schopp wrote: > > + if (prop && ((size / sizeof(u32)) > 1)) > > If this hasn't gone out yet it might be nice to add a BUG_ON((size / > sizeof(u32)) > 2). I don't know of any processors that do more than 2 > way SMT, but if one comes out several years from now it would be nice to > catch it early. Mostly I'd like the extra BUG_ON() in order to have one > more check against buggy firmware. 
> > Patch looks fine with or without the extra BUG_ON() Rather than BUG_ON, just keep SMT feature cleared and dump something in dmesg, let's be kind to users, we are very early during boot, it would be a bit difficult to track down that BUG (no console yet). Ben. From arnd at arndb.de Thu Jun 2 08:20:16 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Thu, 2 Jun 2005 00:20:16 +0200 Subject: [PATCH] ppc64: fix call_prom return checks In-Reply-To: <1117664522.19020.75.camel@gaston> References: <1117664522.19020.75.camel@gaston> Message-ID: <200506020020.16663.arnd@arndb.de> On Dunnersdag 02 Juni 2005 00:22, Benjamin Herrenschmidt wrote: > Hi Arnd, Paul ! > > What about this patch ? (untested at the moment). I'm never very > comfortable with those sign non-extension issues in prom_init ... Why do you cast the values to 'long' in the first place? Comparing two values of the same type would seem more intuitive. Aside from that, the patch looks good to me, I'll give it a test run on our firmware tomorrow. Arnd <>< From benh at kernel.crashing.org Thu Jun 2 08:48:06 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 02 Jun 2005 08:48:06 +1000 Subject: [PATCH] ppc64: fix call_prom return checks In-Reply-To: <200506020020.16663.arnd@arndb.de> References: <1117664522.19020.75.camel@gaston> <200506020020.16663.arnd@arndb.de> Message-ID: <1117666087.19020.96.camel@gaston> On Thu, 2005-06-02 at 00:20 +0200, Arnd Bergmann wrote: > On Dunnersdag 02 Juni 2005 00:22, Benjamin Herrenschmidt wrote: > > Hi Arnd, Paul ! > > > > What about this patch ? (untested at the moment). I'm never very > > comfortable with those sign non-extension issues in prom_init ... > > Why do you cast the values to 'long' in the first place? My patch is still doing that ? I thought I was removing those casts ... > Comparing two values of the same type would seem more intuitive. > Aside from that, the patch looks good to me, I'll give it a test > run on our firmware tomorrow.
From arnd at arndb.de Thu Jun 2 08:54:23 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Thu, 2 Jun 2005 00:54:23 +0200 Subject: [PATCH] ppc64: fix call_prom return checks In-Reply-To: <1117666087.19020.96.camel@gaston> References: <1117664522.19020.75.camel@gaston> <200506020020.16663.arnd@arndb.de> <1117666087.19020.96.camel@gaston> Message-ID: <200506020054.24224.arnd@arndb.de> On Dunnersdag 02 Juni 2005 00:48, Benjamin Herrenschmidt wrote: > My patch is still doing that ? I though I was removing those casts ... You removed most of them, but two are still left. Arnd <>< From benh at kernel.crashing.org Thu Jun 2 11:24:22 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 02 Jun 2005 11:24:22 +1000 Subject: [PATCH] ppc64: fix call_prom return checks In-Reply-To: <200506020054.24224.arnd@arndb.de> References: <1117664522.19020.75.camel@gaston> <200506020020.16663.arnd@arndb.de> <1117666087.19020.96.camel@gaston> <200506020054.24224.arnd@arndb.de> Message-ID: <1117675462.19020.106.camel@gaston> On Thu, 2005-06-02 at 00:54 +0200, Arnd Bergmann wrote: > On Dunnersdag 02 Juni 2005 00:48, Benjamin Herrenschmidt wrote: > > My patch is still doing that ? I though I was removing those casts ... > > You removed most of them, but two are still left. Ok, let's be clean. If I define PROM_ERROR to be (-1u) instead of (-1), it actually works in all cases without needing ugly casts. What about this patch ? Index: linux-work/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom_init.c 2005-06-02 08:11:37.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom_init.c 2005-06-02 11:24:05.000000000 +1000 @@ -216,7 +216,7 @@ * mode when we do. We switch back to 64b mode upon return. */ -#define PROM_ERROR (-1) +#define PROM_ERROR (-1u) static int __init call_prom(const char *service, int nargs, int nret, ...) 
{ @@ -587,14 +587,13 @@ { unsigned long offset = reloc_offset(); ihandle elfloader; - int ret; elfloader = call_prom("open", 1, 1, ADDR("/packages/elf-loader")); - if (elfloader == 0) { + if (elfloader == PROM_ERROR || elfloader == 0) { prom_printf("couldn't open /packages/elf-loader\n"); return; } - ret = call_prom("call-method", 3, 1, ADDR("process-elf-header"), + call_prom("call-method", 3, 1, ADDR("process-elf-header"), elfloader, ADDR(&fake_elf)); call_prom("close", 1, 0, elfloader); } @@ -646,7 +645,7 @@ base = _ALIGN_UP(base + 0x100000, align)) { prom_debug(" trying: 0x%x\n\r", base); addr = (unsigned long)prom_claim(base, size, 0); - if ((int)addr != PROM_ERROR) + if (addr != PROM_ERROR) break; addr = 0; if (align == 0) @@ -708,7 +707,7 @@ for(; base > RELOC(alloc_bottom); base = _ALIGN_DOWN(base - 0x100000, align)) { prom_debug(" trying: 0x%x\n\r", base); addr = (unsigned long)prom_claim(base, size, 0); - if ((int)addr != PROM_ERROR) + if (addr != PROM_ERROR) break; addr = 0; } @@ -910,7 +909,7 @@ prom_rtas = call_prom("finddevice", 1, 1, ADDR("/rtas")); prom_debug("prom_rtas: %x\n", prom_rtas); - if (prom_rtas == (phandle) -1) + if (prom_rtas == PROM_ERROR) return; prom_getprop(prom_rtas, "rtas-size", &size, sizeof(size)); @@ -1062,7 +1061,7 @@ prom_printf("opening PHB %s", path); phb_node = call_prom("open", 1, 1, path); - if ( (long)phb_node <= 0) + if (phb_node == PROM_ERROR) prom_printf("... failed\n"); else prom_printf("... 
done\n"); @@ -1279,12 +1278,12 @@ /* get a handle for the stdout device */ _prom->chosen = call_prom("finddevice", 1, 1, ADDR("/chosen")); - if ((long)_prom->chosen <= 0) + if (_prom->chosen == PROM_ERROR) prom_panic("cannot find chosen"); /* msg won't be printed :( */ /* get device tree root */ _prom->root = call_prom("finddevice", 1, 1, ADDR("/")); - if ((long)_prom->root <= 0) + if (_prom->root == PROM_ERROR) prom_panic("cannot find device tree root"); /* msg won't be printed :( */ } @@ -1356,9 +1355,8 @@ } /* Default to pSeries. We need to know if we are running LPAR */ rtas = call_prom("finddevice", 1, 1, ADDR("/rtas")); - if (rtas != (phandle) -1) { - unsigned long x; - x = prom_getproplen(rtas, "ibm,hypertas-functions"); + if (rtas != PROM_ERROR) { + int x = prom_getproplen(rtas, "ibm,hypertas-functions"); if (x != PROM_ERROR) { prom_printf("Hypertas detected, assuming LPAR !\n"); return PLATFORM_PSERIES_LPAR; @@ -1426,12 +1424,13 @@ * leave some room at the end of the path for appending extra * arguments */ - if (call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-10) < 0) + if (call_prom("package-to-path", 3, 1, node, path, + PROM_SCRATCH_SIZE-10) == PRROM_ERROR) continue; prom_printf("found display : %s, opening ... ", path); ih = call_prom("open", 1, 1, path); - if (ih == (ihandle)0 || ih == (ihandle)-1) { + if (ih == 0 || ih == PROM_ERROR) { prom_printf("failed\n"); continue; } @@ -1514,6 +1513,12 @@ return 0; } +/* + * The Open Firmware 1275 specification states properties must be 31 bytes or + * less, however not all firmwares obey this. Make it 64 bytes to be safe. + */ +#define MAX_PROPERTY_NAME 64 + static void __init scan_dt_build_strings(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1528,9 +1533,10 @@ prev_name = RELOC(""); for (;;) { - /* 32 is max len of name including nul. 
*/ - namep = make_room(mem_start, mem_end, 32, 1); - if (call_prom("nextprop", 3, 1, node, prev_name, namep) <= 0) { + /* 64 is max len of name including nul. */ + namep = make_room(mem_start, mem_end, MAX_PROPERTY_NAME, 1); + if (call_prom("nextprop", 3, 1, node, prev_name, namep) + == PROM_ERROR) { /* No more nodes: unwind alloc */ *mem_start = (unsigned long)namep; break; @@ -1555,12 +1561,6 @@ } } -/* - * The Open Firmware 1275 specification states properties must be 31 bytes or - * less, however not all firmwares obey this. Make it 64 bytes to be safe. - */ -#define MAX_PROPERTY_NAME 64 - static void __init scan_dt_build_struct(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1607,7 +1607,8 @@ prev_name = RELOC(""); sstart = (char *)RELOC(dt_string_start); for (;;) { - if (call_prom("nextprop", 3, 1, node, prev_name, pname) <= 0) + if (call_prom("nextprop", 3, 1, node, prev_name, pname) + == PROM_ERROR) break; /* find string offset */ @@ -1623,7 +1624,7 @@ l = call_prom("getproplen", 2, 1, node, pname); /* sanity checks */ - if (l < 0) + if (l == PROM_ERROR) continue; if (l > MAX_PROPERTY_LENGTH) { prom_printf("WARNING: ignoring large property "); @@ -1771,17 +1772,18 @@ /* Some G5s have a missing interrupt definition, fix it up here */ u3 = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000")); - if ((long)u3 <= 0) + if (u3 == PROM_ERROR) return; i2c = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000/i2c at f8001000")); - if ((long)i2c <= 0) + if (i2c <= PROM_ERROR) return; mpic = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000/mpic at f8040000")); - if ((long)mpic <= 0) + if (mpic <= PROM_ERROR) return; /* check if proper rev of u3 */ - if (prom_getprop(u3, "device-rev", &u3_rev, sizeof(u3_rev)) <= 0) + if (prom_getprop(u3, "device-rev", &u3_rev, sizeof(u3_rev)) + == PROM_ERROR) return; if (u3_rev != 0x35) return; From benh at kernel.crashing.org Thu Jun 2 11:35:05 2005 From: benh at kernel.crashing.org (Benjamin 
Herrenschmidt) Date: Thu, 02 Jun 2005 11:35:05 +1000 Subject: [PATCH] ppc64: fix call_prom return checks In-Reply-To: <1117675462.19020.106.camel@gaston> References: <1117664522.19020.75.camel@gaston> <200506020020.16663.arnd@arndb.de> <1117666087.19020.96.camel@gaston> <200506020054.24224.arnd@arndb.de> <1117675462.19020.106.camel@gaston> Message-ID: <1117676106.19020.111.camel@gaston> On Thu, 2005-06-02 at 11:24 +1000, Benjamin Herrenschmidt wrote: > On Thu, 2005-06-02 at 00:54 +0200, Arnd Bergmann wrote: > > On Dunnersdag 02 Juni 2005 00:48, Benjamin Herrenschmidt wrote: > > > My patch is still doing that ? I though I was removing those casts ... > > > > You removed most of them, but two are still left. > > Ok, let's be clean. If I define PROM_ERROR to be (-1u) instead of (-1), > it actually works in all cases without needing ugly casts. What about > this patch ? Or better, the one that actually builds :) Index: linux-work/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom_init.c 2005-06-02 08:11:37.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom_init.c 2005-06-02 11:33:21.000000000 +1000 @@ -216,7 +216,7 @@ * mode when we do. We switch back to 64b mode upon return. */ -#define PROM_ERROR (-1) +#define PROM_ERROR (-1u) static int __init call_prom(const char *service, int nargs, int nret, ...) 
{ @@ -587,14 +587,13 @@ { unsigned long offset = reloc_offset(); ihandle elfloader; - int ret; elfloader = call_prom("open", 1, 1, ADDR("/packages/elf-loader")); - if (elfloader == 0) { + if (elfloader == PROM_ERROR || elfloader == 0) { prom_printf("couldn't open /packages/elf-loader\n"); return; } - ret = call_prom("call-method", 3, 1, ADDR("process-elf-header"), + call_prom("call-method", 3, 1, ADDR("process-elf-header"), elfloader, ADDR(&fake_elf)); call_prom("close", 1, 0, elfloader); } @@ -646,7 +645,7 @@ base = _ALIGN_UP(base + 0x100000, align)) { prom_debug(" trying: 0x%x\n\r", base); addr = (unsigned long)prom_claim(base, size, 0); - if ((int)addr != PROM_ERROR) + if (addr != PROM_ERROR) break; addr = 0; if (align == 0) @@ -708,7 +707,7 @@ for(; base > RELOC(alloc_bottom); base = _ALIGN_DOWN(base - 0x100000, align)) { prom_debug(" trying: 0x%x\n\r", base); addr = (unsigned long)prom_claim(base, size, 0); - if ((int)addr != PROM_ERROR) + if (addr != PROM_ERROR) break; addr = 0; } @@ -910,7 +909,7 @@ prom_rtas = call_prom("finddevice", 1, 1, ADDR("/rtas")); prom_debug("prom_rtas: %x\n", prom_rtas); - if (prom_rtas == (phandle) -1) + if (prom_rtas == PROM_ERROR) return; prom_getprop(prom_rtas, "rtas-size", &size, sizeof(size)); @@ -1062,7 +1061,7 @@ prom_printf("opening PHB %s", path); phb_node = call_prom("open", 1, 1, path); - if ( (long)phb_node <= 0) + if (phb_node == PROM_ERROR) prom_printf("... failed\n"); else prom_printf("... 
done\n"); @@ -1279,12 +1278,12 @@ /* get a handle for the stdout device */ _prom->chosen = call_prom("finddevice", 1, 1, ADDR("/chosen")); - if ((long)_prom->chosen <= 0) + if (_prom->chosen == PROM_ERROR) prom_panic("cannot find chosen"); /* msg won't be printed :( */ /* get device tree root */ _prom->root = call_prom("finddevice", 1, 1, ADDR("/")); - if ((long)_prom->root <= 0) + if (_prom->root == PROM_ERROR) prom_panic("cannot find device tree root"); /* msg won't be printed :( */ } @@ -1356,9 +1355,8 @@ } /* Default to pSeries. We need to know if we are running LPAR */ rtas = call_prom("finddevice", 1, 1, ADDR("/rtas")); - if (rtas != (phandle) -1) { - unsigned long x; - x = prom_getproplen(rtas, "ibm,hypertas-functions"); + if (rtas != PROM_ERROR) { + int x = prom_getproplen(rtas, "ibm,hypertas-functions"); if (x != PROM_ERROR) { prom_printf("Hypertas detected, assuming LPAR !\n"); return PLATFORM_PSERIES_LPAR; @@ -1426,12 +1424,13 @@ * leave some room at the end of the path for appending extra * arguments */ - if (call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-10) < 0) + if (call_prom("package-to-path", 3, 1, node, path, + PROM_SCRATCH_SIZE-10) == PROM_ERROR) continue; prom_printf("found display : %s, opening ... ", path); ih = call_prom("open", 1, 1, path); - if (ih == (ihandle)0 || ih == (ihandle)-1) { + if (ih == 0 || ih == PROM_ERROR) { prom_printf("failed\n"); continue; } @@ -1514,6 +1513,12 @@ return 0; } +/* + * The Open Firmware 1275 specification states properties must be 31 bytes or + * less, however not all firmwares obey this. Make it 64 bytes to be safe. + */ +#define MAX_PROPERTY_NAME 64 + static void __init scan_dt_build_strings(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1528,9 +1533,10 @@ prev_name = RELOC(""); for (;;) { - /* 32 is max len of name including nul. 
*/ - namep = make_room(mem_start, mem_end, 32, 1); - if (call_prom("nextprop", 3, 1, node, prev_name, namep) <= 0) { + /* 64 is max len of name including nul. */ + namep = make_room(mem_start, mem_end, MAX_PROPERTY_NAME, 1); + if (call_prom("nextprop", 3, 1, node, prev_name, namep) + == PROM_ERROR) { /* No more nodes: unwind alloc */ *mem_start = (unsigned long)namep; break; @@ -1555,12 +1561,6 @@ } } -/* - * The Open Firmware 1275 specification states properties must be 31 bytes or - * less, however not all firmwares obey this. Make it 64 bytes to be safe. - */ -#define MAX_PROPERTY_NAME 64 - static void __init scan_dt_build_struct(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1607,7 +1607,8 @@ prev_name = RELOC(""); sstart = (char *)RELOC(dt_string_start); for (;;) { - if (call_prom("nextprop", 3, 1, node, prev_name, pname) <= 0) + if (call_prom("nextprop", 3, 1, node, prev_name, pname) + == PROM_ERROR) break; /* find string offset */ @@ -1623,7 +1624,7 @@ l = call_prom("getproplen", 2, 1, node, pname); /* sanity checks */ - if (l < 0) + if (l == PROM_ERROR) continue; if (l > MAX_PROPERTY_LENGTH) { prom_printf("WARNING: ignoring large property "); @@ -1771,17 +1772,18 @@ /* Some G5s have a missing interrupt definition, fix it up here */ u3 = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000")); - if ((long)u3 <= 0) + if (u3 == PROM_ERROR) return; i2c = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000/i2c at f8001000")); - if ((long)i2c <= 0) + if (i2c <= PROM_ERROR) return; mpic = call_prom("finddevice", 1, 1, ADDR("/u3 at 0,f8000000/mpic at f8040000")); - if ((long)mpic <= 0) + if (mpic <= PROM_ERROR) return; /* check if proper rev of u3 */ - if (prom_getprop(u3, "device-rev", &u3_rev, sizeof(u3_rev)) <= 0) + if (prom_getprop(u3, "device-rev", &u3_rev, sizeof(u3_rev)) + == PROM_ERROR) return; if (u3_rev != 0x35) return; From benh at kernel.crashing.org Thu Jun 2 14:11:37 2005 From: benh at kernel.crashing.org (Benjamin 
Herrenschmidt) Date: Thu, 02 Jun 2005 14:11:37 +1000 Subject: [PATCH] ppc64: Fix result code handling in prom_init Message-ID: <1117685497.31082.27.camel@gaston> Hi ! prom_init(), the trampoline code that "talks" to Open Firmware during early boot, has various issues with managing OF result codes. Some of my recent fixups in fact made the problem worse on some platforms. This patch reworks it all. Tested on g5, Maple, POWER3 and POWER5. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom_init.c 2005-06-02 08:11:37.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom_init.c 2005-06-02 14:03:54.000000000 +1000 @@ -211,13 +211,23 @@ */ #define ADDR(x) (u32) ((unsigned long)(x) - offset) +/* + * Error results ... some OF calls will return "-1" on error, some + * will return 0, some will return either. To simplify, here are + * macros to use with any ihandle or phandle return value to check if + * it is valid + */ + +#define PROM_ERROR (-1u) +#define PHANDLE_VALID(p) ((p) != 0 && (p) != PROM_ERROR) +#define IHANDLE_VALID(i) ((i) != 0 && (i) != PROM_ERROR) + + /* This is the one and *ONLY* place where we actually call open * firmware from, since we need to make sure we're running in 32b * mode when we do. We switch back to 64b mode upon return. */ -#define PROM_ERROR (-1) - static int __init call_prom(const char *service, int nargs, int nret, ...) 
{ int i; @@ -587,14 +597,13 @@ { unsigned long offset = reloc_offset(); ihandle elfloader; - int ret; elfloader = call_prom("open", 1, 1, ADDR("/packages/elf-loader")); if (elfloader == 0) { prom_printf("couldn't open /packages/elf-loader\n"); return; } - ret = call_prom("call-method", 3, 1, ADDR("process-elf-header"), + call_prom("call-method", 3, 1, ADDR("process-elf-header"), elfloader, ADDR(&fake_elf)); call_prom("close", 1, 0, elfloader); } @@ -646,7 +655,7 @@ base = _ALIGN_UP(base + 0x100000, align)) { prom_debug(" trying: 0x%x\n\r", base); addr = (unsigned long)prom_claim(base, size, 0); - if ((int)addr != PROM_ERROR) + if (addr != PROM_ERROR) break; addr = 0; if (align == 0) @@ -708,7 +717,7 @@ for(; base > RELOC(alloc_bottom); base = _ALIGN_DOWN(base - 0x100000, align)) { prom_debug(" trying: 0x%x\n\r", base); addr = (unsigned long)prom_claim(base, size, 0); - if ((int)addr != PROM_ERROR) + if (addr != PROM_ERROR) break; addr = 0; } @@ -902,18 +911,19 @@ { unsigned long offset = reloc_offset(); struct prom_t *_prom = PTRRELOC(&prom); - phandle prom_rtas, rtas_node; + phandle rtas_node; + ihandle rtas_inst; u32 base, entry = 0; u32 size = 0; prom_debug("prom_instantiate_rtas: start...\n"); - prom_rtas = call_prom("finddevice", 1, 1, ADDR("/rtas")); - prom_debug("prom_rtas: %x\n", prom_rtas); - if (prom_rtas == (phandle) -1) + rtas_node = call_prom("finddevice", 1, 1, ADDR("/rtas")); + prom_debug("rtas_node: %x\n", rtas_node); + if (!PHANDLE_VALID(rtas_node)) return; - prom_getprop(prom_rtas, "rtas-size", &size, sizeof(size)); + prom_getprop(rtas_node, "rtas-size", &size, sizeof(size)); if (size == 0) return; @@ -922,14 +932,18 @@ prom_printf("RTAS allocation failed !\n"); return; } - prom_printf("instantiating rtas at 0x%x", base); - rtas_node = call_prom("open", 1, 1, ADDR("/rtas")); - prom_printf("..."); + rtas_inst = call_prom("open", 1, 1, ADDR("/rtas")); + if (!IHANDLE_VALID(rtas_inst)) { + prom_printf("opening rtas package failed"); + return; + } + + 
prom_printf("instantiating rtas at 0x%x ...", base); if (call_prom("call-method", 3, 2, ADDR("instantiate-rtas"), - rtas_node, base) != PROM_ERROR) { + rtas_inst, base) != PROM_ERROR) { entry = (long)_prom->args.rets[1]; } if (entry == 0) { @@ -940,8 +954,8 @@ reserve_mem(base, size); - prom_setprop(prom_rtas, "linux,rtas-base", &base, sizeof(base)); - prom_setprop(prom_rtas, "linux,rtas-entry", &entry, sizeof(entry)); + prom_setprop(rtas_node, "linux,rtas-base", &base, sizeof(base)); + prom_setprop(rtas_node, "linux,rtas-entry", &entry, sizeof(entry)); prom_debug("rtas base = 0x%x\n", base); prom_debug("rtas entry = 0x%x\n", entry); @@ -1062,7 +1076,7 @@ prom_printf("opening PHB %s", path); phb_node = call_prom("open", 1, 1, path); - if ( (long)phb_node <= 0) + if (phb_node == 0) prom_printf("... failed\n"); else prom_printf("... done\n"); @@ -1279,12 +1293,12 @@ /* get a handle for the stdout device */ _prom->chosen = call_prom("finddevice", 1, 1, ADDR("/chosen")); - if ((long)_prom->chosen <= 0) + if (!PHANDLE_VALID(_prom->chosen)) prom_panic("cannot find chosen"); /* msg won't be printed :( */ /* get device tree root */ _prom->root = call_prom("finddevice", 1, 1, ADDR("/")); - if ((long)_prom->root <= 0) + if (!PHANDLE_VALID(_prom->root)) prom_panic("cannot find device tree root"); /* msg won't be printed :( */ } @@ -1356,9 +1370,8 @@ } /* Default to pSeries. 
We need to know if we are running LPAR */ rtas = call_prom("finddevice", 1, 1, ADDR("/rtas")); - if (rtas != (phandle) -1) { - unsigned long x; - x = prom_getproplen(rtas, "ibm,hypertas-functions"); + if (PHANDLE_VALID(rtas)) { + int x = prom_getproplen(rtas, "ibm,hypertas-functions"); if (x != PROM_ERROR) { prom_printf("Hypertas detected, assuming LPAR !\n"); return PLATFORM_PSERIES_LPAR; @@ -1426,12 +1439,13 @@ * leave some room at the end of the path for appending extra * arguments */ - if (call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-10) < 0) + if (call_prom("package-to-path", 3, 1, node, path, + PROM_SCRATCH_SIZE-10) == PROM_ERROR) continue; prom_printf("found display : %s, opening ... ", path); ih = call_prom("open", 1, 1, path); - if (ih == (ihandle)0 || ih == (ihandle)-1) { + if (ih == 0) { prom_printf("failed\n"); continue; } @@ -1514,6 +1528,12 @@ return 0; } +/* + * The Open Firmware 1275 specification states properties must be 31 bytes or + * less, however not all firmwares obey this. Make it 64 bytes to be safe. + */ +#define MAX_PROPERTY_NAME 64 + static void __init scan_dt_build_strings(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1527,10 +1547,12 @@ /* get and store all property names */ prev_name = RELOC(""); for (;;) { - - /* 32 is max len of name including nul. */ - namep = make_room(mem_start, mem_end, 32, 1); - if (call_prom("nextprop", 3, 1, node, prev_name, namep) <= 0) { + int rc; + + /* 64 is max len of name including nul. */ + namep = make_room(mem_start, mem_end, MAX_PROPERTY_NAME, 1); + rc = call_prom("nextprop", 3, 1, node, prev_name, namep); + if (rc != 1) { /* No more nodes: unwind alloc */ *mem_start = (unsigned long)namep; break; @@ -1555,12 +1577,6 @@ } } -/* - * The Open Firmware 1275 specification states properties must be 31 bytes or - * less, however not all firmwares obey this. 
- */ -#define MAX_PROPERTY_NAME 64 - static void __init scan_dt_build_struct(phandle node, unsigned long *mem_start, unsigned long *mem_end) { @@ -1607,7 +1623,10 @@ prev_name = RELOC(""); sstart = (char *)RELOC(dt_string_start); for (;;) { - if (call_prom("nextprop", 3, 1, node, prev_name, pname) <= 0) + int rc; + + rc = call_prom("nextprop", 3, 1, node, prev_name, pname); + if (rc != 1) break; /* find string offset */ @@ -1623,7 +1642,7 @@ l = call_prom("getproplen", 2, 1, node, pname); /* sanity checks */ - if (l < 0) + if (l == PROM_ERROR) continue; if (l > MAX_PROPERTY_LENGTH) { prom_printf("WARNING: ignoring large property "); @@ -1771,17 +1790,18 @@ /* Some G5s have a missing interrupt definition, fix it up here */ u3 = call_prom("finddevice", 1, 1, ADDR("/u3@0,f8000000")); - if ((long)u3 <= 0) + if (!PHANDLE_VALID(u3)) return; i2c = call_prom("finddevice", 1, 1, ADDR("/u3@0,f8000000/i2c@f8001000")); - if ((long)i2c <= 0) + if (!PHANDLE_VALID(i2c)) return; mpic = call_prom("finddevice", 1, 1, ADDR("/u3@0,f8000000/mpic@f8040000")); - if ((long)mpic <= 0) + if (!PHANDLE_VALID(mpic)) return; /* check if proper rev of u3 */ - if (prom_getprop(u3, "device-rev", &u3_rev, sizeof(u3_rev)) <= 0) + if (prom_getprop(u3, "device-rev", &u3_rev, sizeof(u3_rev)) + == PROM_ERROR) return; if (u3_rev != 0x35) return; From sfr at canb.auug.org.au Thu Jun 2 16:41:49 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 2 Jun 2005 16:41:49 +1000 Subject: [PATCH] iSeries: remove include/asm-ppc64/iSeries/LparData.h Message-ID: <20050602164149.1cb5902e.sfr@canb.auug.org.au> Hi all, include/asm-ppc64/iSeries/LparData.h just included a whole lot of other files to declare variables that would be better declared in those other files. So, remove it. This should reduce the number of things needed to be included in most cases to access the relevant variables. 
arch/ppc64/kernel/HvLpEvent.c | 2 - arch/ppc64/kernel/ItLpQueue.c | 1 arch/ppc64/kernel/iSeries_VpdInfo.c | 1 arch/ppc64/kernel/iSeries_pci.c | 1 arch/ppc64/kernel/iSeries_proc.c | 2 - arch/ppc64/kernel/iSeries_setup.c | 5 ++ arch/ppc64/kernel/iSeries_smp.c | 1 arch/ppc64/kernel/irq.c | 2 - arch/ppc64/kernel/lparcfg.c | 2 - arch/ppc64/kernel/ras.c | 1 arch/ppc64/kernel/rtc.c | 1 arch/ppc64/kernel/setup.c | 3 + arch/ppc64/kernel/viopath.c | 2 - include/asm-ppc64/iSeries/HvLpConfig.h | 1 include/asm-ppc64/iSeries/HvReleaseData.h | 2 + include/asm-ppc64/iSeries/IoHriMainStore.h | 2 + include/asm-ppc64/iSeries/IoHriProcessorVpd.h | 2 + include/asm-ppc64/iSeries/ItExtVpdPanel.h | 2 + include/asm-ppc64/iSeries/ItIplParmsReal.h | 2 + include/asm-ppc64/iSeries/ItLpNaca.h | 4 ++ include/asm-ppc64/iSeries/ItVpdAreas.h | 2 + include/asm-ppc64/iSeries/LparData.h | 48 -------------------------- include/asm-ppc64/iSeries/LparMap.h | 2 + 23 files changed, 29 insertions(+), 62 deletions(-) This is on top of my previous iSeries header cleanup. 
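As a rough illustration of the pattern this patch moves to (each extern declaration sits in the header that defines its type, rather than in a catch-all header), here is a minimal single-file sketch. It is not taken from the patch: the struct is cut down to the one field visible in the LparMap.h hunk, the initial value is just the number quoted in that field's comment, unsigned long long stands in for the kernel's u64, and example_get_vpn() is a hypothetical helper added only so the sketch has something to call.

```c
/*
 * Sketch only. In a real tree the two parts below would be separate
 * files; they are shown together so the example compiles on its own.
 */

/* ---- would live in include/asm-ppc64/iSeries/LparMap.h ---- */
#ifndef _LPARMAP_H
#define _LPARMAP_H

struct LparMap {
	unsigned long long xVPN;	/* other fields omitted in this sketch */
};

/*
 * Declared next to the type: a user that includes this header gets
 * both the struct definition and the variable declaration.
 */
extern struct LparMap xLparMap;

#endif /* _LPARMAP_H */

/* ---- would live in the one .c file that defines the variable ---- */
struct LparMap xLparMap = { .xVPN = 0x000C000000000000ULL };

/* Hypothetical accessor, for illustration only (not in the patch). */
unsigned long long example_get_vpn(void)
{
	return xLparMap.xVPN;
}
```

With this layout a caller that needs xLparMap includes only LparMap.h, which is presumably why most of the files in the diffstat above simply lose an include line.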
-- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/HvLpEvent.c linus-iSeries-headers.2/arch/ppc64/kernel/HvLpEvent.c --- linus-iSeries-headers.1/arch/ppc64/kernel/HvLpEvent.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/HvLpEvent.c 2005-06-02 15:17:50.000000000 +1000 @@ -12,7 +12,7 @@ #include #include #include -#include +#include /* Array of LpEvent handler functions */ LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/ItLpQueue.c linus-iSeries-headers.2/arch/ppc64/kernel/ItLpQueue.c --- linus-iSeries-headers.1/arch/ppc64/kernel/ItLpQueue.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/ItLpQueue.c 2005-06-02 16:07:50.000000000 +1000 @@ -16,7 +16,6 @@ #include #include #include -#include static __inline__ int set_inUse( struct ItLpQueue * lpQueue ) { diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_VpdInfo.c linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_VpdInfo.c --- linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-06-02 16:08:05.000000000 +1000 @@ -35,7 +35,6 @@ #include #include #include -#include #include #include "pci.h" diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_pci.c linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_pci.c --- linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_pci.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_pci.c 2005-06-02 16:16:36.000000000 +1000 @@ -40,7 +40,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_proc.c linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_proc.c --- linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_proc.c 2005-06-01 
17:53:28.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_proc.c 2005-06-02 16:08:14.000000000 +1000 @@ -28,7 +28,7 @@ #include #include #include -#include +#include static int __init iseries_proc_create(void) { diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_setup.c linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_setup.c --- linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_setup.c 2005-06-01 18:18:38.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_setup.c 2005-06-02 16:28:19.000000000 +1000 @@ -47,7 +47,7 @@ #include #include #include -#include +#include #include #include #include @@ -58,6 +58,9 @@ #include #include #include +#include +#include +#include extern void hvlog(char *fmt, ...); diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_smp.c linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_smp.c --- linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_smp.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_smp.c 2005-06-02 16:18:02.000000000 +1000 @@ -38,7 +38,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/irq.c linus-iSeries-headers.2/arch/ppc64/kernel/irq.c --- linus-iSeries-headers.1/arch/ppc64/kernel/irq.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/irq.c 2005-06-02 15:48:09.000000000 +1000 @@ -52,7 +52,7 @@ #include #include #include -#include +#include #include #include diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/lparcfg.c linus-iSeries-headers.2/arch/ppc64/kernel/lparcfg.c --- linus-iSeries-headers.1/arch/ppc64/kernel/lparcfg.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/lparcfg.c 2005-06-02 16:08:22.000000000 +1000 @@ -28,12 +28,12 @@ #include #include #include -#include #include #include #include #include #include +#include #define MODULE_VERS "1.6" #define MODULE_NAME "lparcfg" diff -ruNp 
linus-iSeries-headers.1/arch/ppc64/kernel/ras.c linus-iSeries-headers.2/arch/ppc64/kernel/ras.c --- linus-iSeries-headers.1/arch/ppc64/kernel/ras.c 2005-05-20 09:03:14.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/ras.c 2005-06-02 16:07:59.000000000 +1000 @@ -47,7 +47,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/rtc.c linus-iSeries-headers.2/arch/ppc64/kernel/rtc.c --- linus-iSeries-headers.1/arch/ppc64/kernel/rtc.c 2005-05-26 10:44:08.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/rtc.c 2005-06-02 16:18:54.000000000 +1000 @@ -42,7 +42,6 @@ #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/setup.c linus-iSeries-headers.2/arch/ppc64/kernel/setup.c --- linus-iSeries-headers.1/arch/ppc64/kernel/setup.c 2005-05-20 09:03:14.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/setup.c 2005-06-02 16:07:41.000000000 +1000 @@ -41,7 +41,6 @@ #include #include #include -#include #include #include #include @@ -57,6 +56,8 @@ #include #include #include +#include +#include #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) diff -ruNp linus-iSeries-headers.1/arch/ppc64/kernel/viopath.c linus-iSeries-headers.2/arch/ppc64/kernel/viopath.c --- linus-iSeries-headers.1/arch/ppc64/kernel/viopath.c 2005-06-01 17:54:00.000000000 +1000 +++ linus-iSeries-headers.2/arch/ppc64/kernel/viopath.c 2005-06-02 16:19:39.000000000 +1000 @@ -43,7 +43,7 @@ #include #include #include -#include +#include #include #include #include diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-01 16:08:25.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:21:09.000000000 +1000 @@ -27,7 +27,6 @@ #include #include #include -#include extern HvLpIndex HvLpConfig_getLpIndex_outline(void); diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvReleaseData.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvReleaseData.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvReleaseData.h 2005-06-01 16:39:29.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvReleaseData.h 2005-06-02 15:07:40.000000000 +1000 @@ -58,4 +58,6 @@ struct HvReleaseData { char xRsvd3[20]; /* Reserved x2C-x3F */ }; +extern struct HvReleaseData hvReleaseData; + #endif /* _HVRELEASEDATA_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/IoHriMainStore.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriMainStore.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/IoHriMainStore.h 2005-06-01 16:47:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriMainStore.h 2005-06-02 16:06:25.000000000 +1000 @@ -161,4 +161,6 @@ struct IoHriMainStoreSegment5 { u64 reserved3; }; +extern u64 xMsVpd[]; + #endif /* _IOHRIMAINSTORE_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 
linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriProcessorVpd.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 2005-06-01 16:50:11.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 2005-06-02 15:27:50.000000000 +1000 @@ -81,4 +81,6 @@ struct IoHriProcessorVpd { char xProcSrc[72]; // CSP format SRC xB8-xFF }; +extern struct IoHriProcessorVpd xIoHriProcessorVpd[]; + #endif /* _IOHRIPROCESSORVPD_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItExtVpdPanel.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItExtVpdPanel.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItExtVpdPanel.h 2005-06-01 16:51:48.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItExtVpdPanel.h 2005-06-02 15:22:18.000000000 +1000 @@ -47,4 +47,6 @@ struct ItExtVpdPanel { u8 xRsvd2[48]; }; +extern struct ItExtVpdPanel xItExtVpdPanel; + #endif /* _ITEXTVPDPANEL_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItIplParmsReal.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItIplParmsReal.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItIplParmsReal.h 2005-06-01 16:53:52.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItIplParmsReal.h 2005-06-02 15:05:43.000000000 +1000 @@ -66,4 +66,6 @@ struct ItIplParmsReal { u64 xRsvd13; // Reserved x38-x3F }; +extern struct ItIplParmsReal xItIplParmsReal; + #endif /* _ITIPLPARMSREAL_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItLpNaca.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpNaca.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItLpNaca.h 2005-06-01 16:58:28.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpNaca.h 2005-06-02 15:10:27.000000000 +1000 @@ -19,6 +19,8 @@ #ifndef _ITLPNACA_H #define _ITLPNACA_H +#include + /* * This control block contains the data that is shared between the * hypervisor (PLIC) and the OS. 
@@ -73,4 +75,6 @@ struct ItLpNaca { u64 xInterruptHdlr[32]; // Interrupt handlers 300-x3FF }; +extern struct ItLpNaca itLpNaca; + #endif /* _ITLPNACA_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItVpdAreas.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItVpdAreas.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-01 17:11:03.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-02 16:22:09.000000000 +1000 @@ -90,4 +90,6 @@ struct ItVpdAreas { void *xSlicVpdAdrs[ItVpdMaxEntries];// Array of VPD buffers 130-1EF }; +extern struct ItVpdAreas itVpdAreas; + #endif /* _ITVPDAREAS_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/LparData.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparData.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/LparData.h 2005-06-01 17:12:42.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparData.h 1970-01-01 10:00:00.000000000 +1000 @@ -1,48 +0,0 @@ -/* - * LparData.h - * Copyright (C) 2001 Mike Corrigan IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
- * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ - -#ifndef _LPARDATA_H -#define _LPARDATA_H - -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -extern struct LparMap xLparMap; -extern struct HvReleaseData hvReleaseData; -extern struct ItLpNaca itLpNaca; -extern struct ItIplParmsReal xItIplParmsReal; -extern struct ItExtVpdPanel xItExtVpdPanel; -extern struct IoHriProcessorVpd xIoHriProcessorVpd[]; -extern struct ItLpQueue xItLpQueue; -extern struct ItVpdAreas itVpdAreas; -extern u64 xMsVpd[]; -extern struct msChunks msChunks; - -#endif /* _LPARDATA_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/LparMap.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparMap.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/LparMap.h 2005-06-01 17:14:45.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparMap.h 2005-06-02 15:21:09.000000000 +1000 @@ -64,4 +64,6 @@ struct LparMap { u64 xVPN; // Virtual Page Number (0x000C000000000000) }; +extern struct LparMap xLparMap; + #endif /* _LPARMAP_H */ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/20184c0f/attachment.pgp From david at gibson.dropbear.id.au Thu Jun 2 17:09:18 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 2 Jun 2005 17:09:18 +1000 Subject: Booting the linux-ppc64 kernel & flattened device tree v0.4 In-Reply-To: <1117614390.19020.24.camel@gaston> References: <1117614390.19020.24.camel@gaston> Message-ID: <20050602070918.GI4748@localhost.localdomain> On Wed, Jun 01, 2005 at 06:26:30PM +1000, Benjamin Herrenschmidt wrote: > DO NOT REPLY TO ALL LISTS PLEASE ! (and CC me on replies). > > Here's the fourth version of my document along with new kernel patches > for the new improved flattened format, and the first release of the > device-tree "compiler" tool. The patches will be posted as a reply to > this email. The compiler, dtc, can be downloaded, the URL is in the > document. [snip] > IV - "dtc", the device tree compiler > ==================================== > > dtc source code can be found at > I've just updated the dtc tarball with a new version. Notable changes: - Corrected comment parsing - Corrected handling of #address-cells, #size-cells properties - Input from device tree blobs should actually work now - Corrected autogeneration of "name" properties in blob/asm output version < 0x10 - Added a TODO list -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From sfr at canb.auug.org.au Thu Jun 2 17:19:54 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 2 Jun 2005 17:19:54 +1000 Subject: [PATCH] iSeries: eliminate some unused inlines Message-ID: <20050602171954.010339aa.sfr@canb.auug.org.au> Hi all, This patch removes a large number of inline functions that are not used. 
It also changes the only caller of a HvCallCfg function that is outside HvLpConfig.h to its equivalent HvLpConfig function and no longer includes HvCallCfg.h where it is not needed. arch/ppc64/kernel/iSeries_smp.c | 1 arch/ppc64/kernel/viopath.c | 3 include/asm-ppc64/iSeries/HvCallCfg.h | 53 --------- include/asm-ppc64/iSeries/HvLpConfig.h | 188 --------------------------------- 4 files changed, 3 insertions(+), 242 deletions(-) Relative to my previous two iSeries header cleanups. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_smp.c linus-iSeries-headers.3/arch/ppc64/kernel/iSeries_smp.c --- linus-iSeries-headers.2/arch/ppc64/kernel/iSeries_smp.c 2005-06-02 16:18:02.000000000 +1000 +++ linus-iSeries-headers.3/arch/ppc64/kernel/iSeries_smp.c 2005-06-02 17:11:29.000000000 +1000 @@ -39,7 +39,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.2/arch/ppc64/kernel/viopath.c linus-iSeries-headers.3/arch/ppc64/kernel/viopath.c --- linus-iSeries-headers.2/arch/ppc64/kernel/viopath.c 2005-06-02 16:19:39.000000000 +1000 +++ linus-iSeries-headers.3/arch/ppc64/kernel/viopath.c 2005-06-02 17:06:58.000000000 +1000 @@ -46,7 +46,6 @@ #include #include #include -#include #include #include @@ -364,7 +363,7 @@ void vio_set_hostlp(void) * while we're active */ viopath_ourLp = HvLpConfig_getLpIndex(); - viopath_hostLp = HvCallCfg_getHostingLpIndex(viopath_ourLp); + viopath_hostLp = HvLpConfig_getHostingLpIndex(viopath_ourLp); if (viopath_hostLp != HvLpIndexInvalid) vio_setHandler(viomajorsubtype_config, handleConfig); diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallCfg.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallCfg.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-01 15:04:06.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-02 
17:08:25.000000000 +1000 @@ -67,31 +67,11 @@ enum HvCallCfg_ReqQual { #define HvCallCfgGetLpExecutionMode HvCallCfg + 31 #define HvCallCfgGetHostingLpIndex HvCallCfg + 32 -static inline HvLpIndex HvCallCfg_getLps(void) -{ - return HvCall0(HvCallCfgGetLps); -} - -static inline int HvCallCfg_isBusDedicated(u64 busIndex) -{ - return HvCall1(HvCallCfgIsBusDedicated, busIndex); -} - static inline HvLpIndex HvCallCfg_getBusOwner(u64 busIndex) { return HvCall1(HvCallCfgGetBusOwner, busIndex); } -static inline HvLpIndexMap HvCallCfg_getBusAllocation(u64 busIndex) -{ - return HvCall1(HvCallCfgGetBusAllocation, busIndex); -} - -static inline HvLpIndexMap HvCallCfg_getActiveLpMap(void) -{ - return HvCall0(HvCallCfgGetActiveLpMap); -} - static inline HvLpVirtualLanIndexMap HvCallCfg_getVirtualLanIndexMap( HvLpIndex lp) { @@ -105,31 +85,12 @@ static inline HvLpVirtualLanIndexMap HvC return retVal; } -static inline u64 HvCallCfg_getSystemMsChunks(void) -{ - return HvCall0(HvCallCfgGetSystemMsChunks); -} - static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp, enum HvCallCfg_ReqQual qual) { return HvCall2(HvCallCfgGetMsChunks, lp, qual); } -static inline u64 HvCallCfg_getMinRuntimeMsChunks(HvLpIndex lp) -{ - /* - * NOTE: This function was added in v5r1 so older hypervisors - * will return a -1 value - */ - return HvCall1(HvCallCfgGetMinRuntimeMsChunks, lp); -} - -static inline u64 HvCallCfg_setMinRuntimeMsChunks(u64 chunks) -{ - return HvCall1(HvCallCfgSetMinRuntimeMsChunks, chunks); -} - static inline u64 HvCallCfg_getSystemPhysicalProcessors(void) { return HvCall0(HvCallCfgGetSystemPhysicalProcessors); @@ -141,14 +102,6 @@ static inline u64 HvCallCfg_getPhysicalP return HvCall2(HvCallCfgGetPhysicalProcessors, lp, qual); } -static inline u64 HvCallCfg_getConfiguredBusUnitsForInterruptProc(HvLpIndex lp, - u16 hvLogicalProcIndex) -{ - return HvCall2(HvCallCfgGetConfiguredBusUnitsForIntProc, lp, - hvLogicalProcIndex); - -} - static inline HvLpSharedPoolIndex 
HvCallCfg_getSharedPoolIndex(HvLpIndex lp) { return HvCall1(HvCallCfgGetSharedPoolIndex, lp); @@ -164,15 +117,13 @@ static inline u64 HvCallCfg_getSharedPro static inline u64 HvCallCfg_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { - u16 retVal = HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); - return retVal; + return (u16)HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); } static inline HvLpIndex HvCallCfg_getHostingLpIndex(HvLpIndex lp) { - u64 retVal = HvCall1(HvCallCfgGetHostingLpIndex, lp); - return retVal; + return HvCall1(HvCallCfgGetHostingLpIndex, lp); } #endif /* _HVCALLCFG_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:21:09.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:59:16.000000000 +1000 @@ -40,127 +40,16 @@ static inline HvLpIndex HvLpConfig_getPr return itLpNaca.xPrimaryLpIndex; } -static inline HvLpIndex HvLpConfig_getLps(void) -{ - return HvCallCfg_getLps(); -} - -static inline HvLpIndexMap HvLpConfig_getActiveLpMap(void) -{ - return HvCallCfg_getActiveLpMap(); -} - -static inline u64 HvLpConfig_getSystemMsMegs(void) -{ - return HvCallCfg_getSystemMsChunks() / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getSystemMsChunks(void) -{ - return HvCallCfg_getSystemMsChunks(); -} - -static inline u64 HvLpConfig_getSystemMsPages(void) -{ - return HvCallCfg_getSystemMsChunks() * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur) - / HVCHUNKSPERMEG; -} - static inline u64 HvLpConfig_getMsChunks(void) { return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur); } -static inline u64 HvLpConfig_getMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur) - * HVPAGESPERCHUNK; -} - -static inline u64 
HvLpConfig_getMinMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getMinMsChunks(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min); -} - -static inline u64 HvLpConfig_getMinMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min) - * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getMinRuntimeMsMegs(void) -{ - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getMinRuntimeMsChunks(void) -{ - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()); -} - -static inline u64 HvLpConfig_getMinRuntimeMsPages(void) -{ - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) - * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getMaxMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getMaxMsChunks(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max); -} - -static inline u64 HvLpConfig_getMaxMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max) - * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getInitMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getInitMsChunks(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init); -} - -static inline u64 HvLpConfig_getInitMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init) - * HVPAGESPERCHUNK; -} - static inline u64 HvLpConfig_getSystemPhysicalProcessors(void) { return HvCallCfg_getSystemPhysicalProcessors(); } -static inline u64 HvLpConfig_getSystemLogicalProcessors(void) -{ - return HvCallCfg_getSystemPhysicalProcessors() - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - 
static inline u64 HvLpConfig_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { return HvCallCfg_getNumProcsInSharedPool(sPI); @@ -172,13 +61,6 @@ static inline u64 HvLpConfig_getPhysical HvCallCfg_Cur); } -static inline u64 HvLpConfig_getLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Cur) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - static inline HvLpSharedPoolIndex HvLpConfig_getSharedPoolIndex(void) { return HvCallCfg_getSharedPoolIndex(HvLpConfig_getLpIndex()); @@ -190,57 +72,18 @@ static inline u64 HvLpConfig_getSharedPr HvCallCfg_Cur); } -static inline u64 HvLpConfig_getMinSharedProcUnits(void) -{ - return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), - HvCallCfg_Min); -} - static inline u64 HvLpConfig_getMaxSharedProcUnits(void) { return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), HvCallCfg_Max); } -static inline u64 HvLpConfig_getMinPhysicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Min); -} - -static inline u64 HvLpConfig_getMinLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Min) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - static inline u64 HvLpConfig_getMaxPhysicalProcessors(void) { return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), HvCallCfg_Max); } -static inline u64 HvLpConfig_getMaxLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Max) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - -static inline u64 HvLpConfig_getInitPhysicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Init); -} - -static inline u64 HvLpConfig_getInitLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Init) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - static inline 
HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) { return HvCallCfg_getVirtualLanIndexMap(HvLpConfig_getLpIndex_outline()); @@ -252,37 +95,6 @@ static inline HvLpVirtualLanIndexMap HvL return HvCallCfg_getVirtualLanIndexMap(lp); } -static inline HvLpIndex HvLpConfig_getBusOwner(HvBusNumber busNumber) -{ - return HvCallCfg_getBusOwner(busNumber); -} - -static inline int HvLpConfig_isBusDedicated(HvBusNumber busNumber) -{ - return HvCallCfg_isBusDedicated(busNumber); -} - -static inline HvLpIndexMap HvLpConfig_getBusAllocation(HvBusNumber busNumber) -{ - return HvCallCfg_getBusAllocation(busNumber); -} - -/* returns the absolute real address of the load area */ -static inline u64 HvLpConfig_getLoadAddress(void) -{ - return itLpNaca.xLoadAreaAddr & 0x7fffffffffffffff; -} - -static inline u64 HvLpConfig_getLoadPages(void) -{ - return itLpNaca.xLoadAreaChunks * HVPAGESPERCHUNK; -} - -static inline int HvLpConfig_isBusOwnedByThisLp(HvBusNumber busNumber) -{ - return (HvLpConfig_getBusOwner(busNumber) == HvLpConfig_getLpIndex()); -} - static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1, HvLpIndex lp2) { -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/a3726c97/attachment.pgp From sfr at canb.auug.org.au Thu Jun 2 17:43:36 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 2 Jun 2005 17:43:36 +1000 Subject: [PATCH] iSeries: remove HvCallCfg.h Message-ID: <20050602174336.536b7fbf.sfr@canb.auug.org.au> Hi all, Now that the only users of things in HvCallCfg.h are in HvLpConfig.h, merge in the bit we need and remove HvCallCfg.h. 
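As a rough illustration of the merge described above (this is not part of the patch): each HvCallCfg_* inline was a thin wrapper around one of the HvCallN entry points, so folding it into its HvLpConfig_* caller should leave the hypervisor call itself unchanged. The sketch below assumes a stub HvCall2() that only records its opcode, a placeholder value for HvCallCfg, and a fixed partition index. The old_/new_ prefixes are hypothetical and exist only so both forms can sit side by side.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;
typedef uint8_t HvLpIndex;

enum {
	HvCallCfg_Cur = 0,
	HvCallCfg_Init = 1,
	HvCallCfg_Max = 2,
	HvCallCfg_Min = 3
};

/* Placeholder base value for this sketch only, not the real opcode base. */
#define HvCallCfg		0x1000ULL
#define HvCallCfgGetMsChunks	(HvCallCfg + 9)

/* Stub hypervisor call: remembers the opcode and folds the two arguments. */
static u64 last_opcode;

static u64 HvCall2(u64 opcode, u64 a, u64 b)
{
	last_opcode = opcode;
	return a * 100 + b;
}

/* Stub: pretend this code runs in partition 3. */
static HvLpIndex HvLpConfig_getLpIndex(void)
{
	return 3;
}

/* Before: HvLpConfig.h calls a wrapper that lived in HvCallCfg.h. */
static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp, int qual)
{
	return HvCall2(HvCallCfgGetMsChunks, lp, qual);
}

static inline u64 old_HvLpConfig_getMsChunks(void)
{
	return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur);
}

/* After: the wrapper is folded in and HvCall2 is invoked directly. */
static inline u64 new_HvLpConfig_getMsChunks(void)
{
	return HvCall2(HvCallCfgGetMsChunks, HvLpConfig_getLpIndex(),
		       HvCallCfg_Cur);
}
```

Both variants issue the same opcode with the same arguments, which is why the intermediate header can go away once HvLpConfig.h is its only user.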
HvCallCfg.h | 129 ----------------------------------------------------------- HvLpConfig.h | 59 +++++++++++++++++++------- 2 files changed, 42 insertions(+), 146 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallCfg.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallCfg.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-02 17:08:25.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallCfg.h 1970-01-01 10:00:00.000000000 +1000 @@ -1,129 +0,0 @@ -/* - * HvCallCfg.h - * Copyright (C) 2001 Mike Corrigan IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ -/* - * This file contains the "hypervisor call" interface which is used to - * drive the hypervisor from the OS. 
- */ -#ifndef _HVCALLCFG_H -#define _HVCALLCFG_H - -#include -#include - -enum HvCallCfg_ReqQual { - HvCallCfg_Cur = 0, - HvCallCfg_Init = 1, - HvCallCfg_Max = 2, - HvCallCfg_Min = 3 -}; - -#define HvCallCfgGetLps HvCallCfg + 0 -#define HvCallCfgGetActiveLpMap HvCallCfg + 1 -#define HvCallCfgGetLpVrmIndex HvCallCfg + 2 -#define HvCallCfgGetLpMinSupportedPlicVrmIndex HvCallCfg + 3 -#define HvCallCfgGetLpMinCompatablePlicVrmIndex HvCallCfg + 4 -#define HvCallCfgGetLpVrmName HvCallCfg + 5 -#define HvCallCfgGetSystemPhysicalProcessors HvCallCfg + 6 -#define HvCallCfgGetPhysicalProcessors HvCallCfg + 7 -#define HvCallCfgGetSystemMsChunks HvCallCfg + 8 -#define HvCallCfgGetMsChunks HvCallCfg + 9 -#define HvCallCfgGetInteractivePercentage HvCallCfg + 10 -#define HvCallCfgIsBusDedicated HvCallCfg + 11 -#define HvCallCfgGetBusOwner HvCallCfg + 12 -#define HvCallCfgGetBusAllocation HvCallCfg + 13 -#define HvCallCfgGetBusUnitOwner HvCallCfg + 14 -#define HvCallCfgGetBusUnitAllocation HvCallCfg + 15 -#define HvCallCfgGetVirtualBusPool HvCallCfg + 16 -#define HvCallCfgGetBusUnitInterruptProc HvCallCfg + 17 -#define HvCallCfgGetConfiguredBusUnitsForIntProc HvCallCfg + 18 -#define HvCallCfgGetRioSanBusPool HvCallCfg + 19 -#define HvCallCfgGetSharedPoolIndex HvCallCfg + 20 -#define HvCallCfgGetSharedProcUnits HvCallCfg + 21 -#define HvCallCfgGetNumProcsInSharedPool HvCallCfg + 22 -#define HvCallCfgRouter23 HvCallCfg + 23 -#define HvCallCfgRouter24 HvCallCfg + 24 -#define HvCallCfgRouter25 HvCallCfg + 25 -#define HvCallCfgRouter26 HvCallCfg + 26 -#define HvCallCfgRouter27 HvCallCfg + 27 -#define HvCallCfgGetMinRuntimeMsChunks HvCallCfg + 28 -#define HvCallCfgSetMinRuntimeMsChunks HvCallCfg + 29 -#define HvCallCfgGetVirtualLanIndexMap HvCallCfg + 30 -#define HvCallCfgGetLpExecutionMode HvCallCfg + 31 -#define HvCallCfgGetHostingLpIndex HvCallCfg + 32 - -static inline HvLpIndex HvCallCfg_getBusOwner(u64 busIndex) -{ - return HvCall1(HvCallCfgGetBusOwner, busIndex); -} - -static 
inline HvLpVirtualLanIndexMap HvCallCfg_getVirtualLanIndexMap( - HvLpIndex lp) -{ - /* - * This is a new function in V5R1 so calls to this on older - * hypervisors will return -1 - */ - u64 retVal = HvCall1(HvCallCfgGetVirtualLanIndexMap, lp); - if (retVal == -1) - retVal = 0; - return retVal; -} - -static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp, - enum HvCallCfg_ReqQual qual) -{ - return HvCall2(HvCallCfgGetMsChunks, lp, qual); -} - -static inline u64 HvCallCfg_getSystemPhysicalProcessors(void) -{ - return HvCall0(HvCallCfgGetSystemPhysicalProcessors); -} - -static inline u64 HvCallCfg_getPhysicalProcessors(HvLpIndex lp, - enum HvCallCfg_ReqQual qual) -{ - return HvCall2(HvCallCfgGetPhysicalProcessors, lp, qual); -} - -static inline HvLpSharedPoolIndex HvCallCfg_getSharedPoolIndex(HvLpIndex lp) -{ - return HvCall1(HvCallCfgGetSharedPoolIndex, lp); - -} - -static inline u64 HvCallCfg_getSharedProcUnits(HvLpIndex lp, - enum HvCallCfg_ReqQual qual) -{ - return HvCall2(HvCallCfgGetSharedProcUnits, lp, qual); - -} - -static inline u64 HvCallCfg_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) -{ - return (u16)HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); - -} - -static inline HvLpIndex HvCallCfg_getHostingLpIndex(HvLpIndex lp) -{ - return HvCall1(HvCallCfgGetHostingLpIndex, lp); -} - -#endif /* _HVCALLCFG_H */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:59:16.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 17:39:37.000000000 +1000 @@ -24,10 +24,26 @@ * to determine which resources should be allocated to each partition. 
*/ -#include +#include #include #include +enum { + HvCallCfg_Cur = 0, + HvCallCfg_Init = 1, + HvCallCfg_Max = 2, + HvCallCfg_Min = 3 +}; + +#define HvCallCfgGetSystemPhysicalProcessors HvCallCfg + 6 +#define HvCallCfgGetPhysicalProcessors HvCallCfg + 7 +#define HvCallCfgGetMsChunks HvCallCfg + 9 +#define HvCallCfgGetSharedPoolIndex HvCallCfg + 20 +#define HvCallCfgGetSharedProcUnits HvCallCfg + 21 +#define HvCallCfgGetNumProcsInSharedPool HvCallCfg + 22 +#define HvCallCfgGetVirtualLanIndexMap HvCallCfg + 30 +#define HvCallCfgGetHostingLpIndex HvCallCfg + 32 + extern HvLpIndex HvLpConfig_getLpIndex_outline(void); static inline HvLpIndex HvLpConfig_getLpIndex(void) @@ -42,72 +58,81 @@ static inline HvLpIndex HvLpConfig_getPr static inline u64 HvLpConfig_getMsChunks(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur); + return HvCall2(HvCallCfgGetMsChunks, HvLpConfig_getLpIndex(), + HvCallCfg_Cur); } static inline u64 HvLpConfig_getSystemPhysicalProcessors(void) { - return HvCallCfg_getSystemPhysicalProcessors(); + return HvCall0(HvCallCfgGetSystemPhysicalProcessors); } static inline u64 HvLpConfig_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { - return HvCallCfg_getNumProcsInSharedPool(sPI); + return (u16)HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); } static inline u64 HvLpConfig_getPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetPhysicalProcessors, HvLpConfig_getLpIndex(), HvCallCfg_Cur); } static inline HvLpSharedPoolIndex HvLpConfig_getSharedPoolIndex(void) { - return HvCallCfg_getSharedPoolIndex(HvLpConfig_getLpIndex()); + return HvCall1(HvCallCfgGetSharedPoolIndex, HvLpConfig_getLpIndex()); } static inline u64 HvLpConfig_getSharedProcUnits(void) { - return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetSharedProcUnits, HvLpConfig_getLpIndex(), HvCallCfg_Cur); } static inline u64 HvLpConfig_getMaxSharedProcUnits(void) { - 
return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetSharedProcUnits, HvLpConfig_getLpIndex(), HvCallCfg_Max); } static inline u64 HvLpConfig_getMaxPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetPhysicalProcessors, HvLpConfig_getLpIndex(), HvCallCfg_Max); } -static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) +static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMapForLp( + HvLpIndex lp) { - return HvCallCfg_getVirtualLanIndexMap(HvLpConfig_getLpIndex_outline()); + /* + * This is a new function in V5R1 so calls to this on older + * hypervisors will return -1 + */ + u64 retVal = HvCall1(HvCallCfgGetVirtualLanIndexMap, lp); + if (retVal == -1) + retVal = 0; + return retVal; } -static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMapForLp( - HvLpIndex lp) +static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) { - return HvCallCfg_getVirtualLanIndexMap(lp); + return HvLpConfig_getVirtualLanIndexMapForLp( + HvLpConfig_getLpIndex_outline()); } static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1, HvLpIndex lp2) { HvLpVirtualLanIndexMap virtualLanIndexMap1 = - HvCallCfg_getVirtualLanIndexMap(lp1); + HvLpConfig_getVirtualLanIndexMapForLp(lp1); HvLpVirtualLanIndexMap virtualLanIndexMap2 = - HvCallCfg_getVirtualLanIndexMap(lp2); + HvLpConfig_getVirtualLanIndexMapForLp(lp2); return ((virtualLanIndexMap1 & virtualLanIndexMap2) != 0); } static inline HvLpIndex HvLpConfig_getHostingLpIndex(HvLpIndex lp) { - return HvCallCfg_getHostingLpIndex(lp); + return HvCall1(HvCallCfgGetHostingLpIndex, lp); } #endif /* _HVLPCONFIG_H */ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/e01487ba/attachment.pgp From sfr at canb.auug.org.au Thu Jun 2 18:06:11 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 2 Jun 2005 18:06:11 +1000 Subject: [PATCH] iSeries: cleanup ItLpQueue.h a bit Message-ID: <20050602180611.251140c8.sfr@canb.auug.org.au> Hi all, Just white space cleanups and move process_iSeries_events into its only caller. arch/ppc64/kernel/idle.c | 5 +++++ include/asm-ppc64/iSeries/ItLpQueue.h | 16 ++++------------ 2 files changed, 9 insertions(+), 12 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/idle.c linus-iSeries-headers.5/arch/ppc64/kernel/idle.c --- linus-iSeries-headers.4/arch/ppc64/kernel/idle.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/idle.c 2005-06-02 18:03:04.000000000 +1000 @@ -42,6 +42,11 @@ static int (*idle_loop)(void); static unsigned long maxYieldTime = 0; static unsigned long minYieldTime = 0xffffffffffffffffUL; +static inline void process_iSeries_events(void) +{ + asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); +} + static void yield_shared_processor(void) { unsigned long tb; diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItLpQueue.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItLpQueue.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItLpQueue.h 2005-06-01 17:05:16.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItLpQueue.h 2005-06-02 14:55:45.000000000 +1000 @@ -64,9 +64,9 @@ struct ItLpQueue { u8 xPlicStatus; // 0x01 DedicatedIo or DedicatedLp or NotUsed u16 xSlicLogicalProcIndex; // 0x02 Logical Proc Index for correlation u8 xPlicRsvd[12]; // 0x04 - char* xSlicCurEventPtr; // 0x10 - char* xSlicLastValidEventPtr; // 0x18 - char* xSlicEventStackPtr; // 0x20
+ char *xSlicCurEventPtr; // 0x10 + char *xSlicLastValidEventPtr; // 0x18 + char *xSlicEventStackPtr; // 0x20 u8 xIndex; // 0x28 unique sequential index. u8 xSlicRsvd[3]; // 0x29-2b u32 xInUseWord; // 0x2C @@ -76,17 +76,9 @@ struct ItLpQueue { extern struct ItLpQueue xItLpQueue; -extern struct HvLpEvent * ItLpQueue_getNextLpEvent(struct ItLpQueue *); +extern struct HvLpEvent *ItLpQueue_getNextLpEvent(struct ItLpQueue *); extern int ItLpQueue_isLpIntPending(struct ItLpQueue *); extern unsigned ItLpQueue_process(struct ItLpQueue *, struct pt_regs *); extern void ItLpQueue_clearValid(struct HvLpEvent *); -static __inline__ void process_iSeries_events(void) -{ - __asm__ __volatile__ ( - " li 0,0x5555 \n\ - sc" - : : : "r0", "r3"); -} - #endif /* _ITLPQUEUE_H */ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/2527f204/attachment.pgp From sfr at canb.auug.org.au Thu Jun 2 18:18:10 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 2 Jun 2005 18:18:10 +1000 Subject: [PATCH] iSeries: remove some unused bits Message-ID: <20050602181810.10935e25.sfr@canb.auug.org.au> Hi all, This patch removes some unused bits from HvCall.h and some #includes from other files. Also includes ItLpQueue.h in paca.h in preference to a stub declaration of struct ItLpQueue. 
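The last point, including ItLpQueue.h in paca.h rather than keeping a stub declaration, can be illustrated with a small sketch. A stub (forward) declaration is enough to declare pointers to the structure, but any code that looks inside it needs the full definition. The struct below is an abridged, hypothetical subset of the real layout, and paca_sketch, lpqueue_ptr and queue_index() are names invented for illustration only.

```c
#include <assert.h>

/* Stub declaration: the type is incomplete, so only pointers to it work. */
struct ItLpQueue;

/* A structure may still carry a pointer to the incomplete type. */
struct paca_sketch {
	struct ItLpQueue *lpqueue_ptr;	/* hypothetical field name */
};

/* Full definition, as including the header would provide (abridged). */
struct ItLpQueue {
	char *xSlicCurEventPtr;
	unsigned char xIndex;
};

/* Member access compiles only once the full definition is visible. */
static unsigned char queue_index(const struct paca_sketch *paca)
{
	return paca->lpqueue_ptr->xIndex;
}
```

With only the stub visible, queue_index() would fail to compile, so a header that includes the real definition spares its users from having to pull it in themselves.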
arch/ppc64/kernel/asm-offsets.c | 1 arch/ppc64/kernel/iSeries_pci.c | 1 arch/ppc64/kernel/mf.c | 1 arch/ppc64/kernel/rtc.c | 1 include/asm-ppc64/iSeries/HvCall.h | 84 ------------------------------------- include/asm-ppc64/paca.h | 2 6 files changed, 3 insertions(+), 87 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.5/arch/ppc64/kernel/asm-offsets.c linus-iSeries-headers.6/arch/ppc64/kernel/asm-offsets.c --- linus-iSeries-headers.5/arch/ppc64/kernel/asm-offsets.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.6/arch/ppc64/kernel/asm-offsets.c 2005-06-02 15:50:28.000000000 +1000 @@ -31,7 +31,6 @@ #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_pci.c linus-iSeries-headers.6/arch/ppc64/kernel/iSeries_pci.c --- linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_pci.c 2005-06-02 16:16:36.000000000 +1000 +++ linus-iSeries-headers.6/arch/ppc64/kernel/iSeries_pci.c 2005-06-02 16:48:16.000000000 +1000 @@ -38,7 +38,6 @@ #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.5/arch/ppc64/kernel/mf.c linus-iSeries-headers.6/arch/ppc64/kernel/mf.c --- linus-iSeries-headers.5/arch/ppc64/kernel/mf.c 2005-05-26 10:44:08.000000000 +1000 +++ linus-iSeries-headers.6/arch/ppc64/kernel/mf.c 2005-06-02 14:59:25.000000000 +1000 @@ -40,7 +40,6 @@ #include #include #include -#include #include /* diff -ruNp linus-iSeries-headers.5/arch/ppc64/kernel/rtc.c linus-iSeries-headers.6/arch/ppc64/kernel/rtc.c --- linus-iSeries-headers.5/arch/ppc64/kernel/rtc.c 2005-06-02 16:18:54.000000000 +1000 +++ linus-iSeries-headers.6/arch/ppc64/kernel/rtc.c 2005-06-02 16:48:16.000000000 +1000 @@ -44,7 +44,6 @@ #include #include -#include extern int piranha_simulator; diff -ruNp linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvCall.h linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCall.h --- 
linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvCall.h 2005-06-01 14:51:07.000000000 +1000 +++ linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCall.h 2005-06-02 13:29:59.000000000 +1000 @@ -27,48 +27,6 @@ #include #include -/* -enum HvCall_ReturnCode -{ - HvCall_Good = 0, - HvCall_Partial = 1, - HvCall_NotOwned = 2, - HvCall_NotFreed = 3, - HvCall_UnspecifiedError = 4 -}; - -enum HvCall_TypeOfSIT -{ - HvCall_ReduceOnly = 0, - HvCall_Unconditional = 1 -}; - -enum HvCall_TypeOfYield -{ - HvCall_YieldTimed = 0, // Yield until specified time - HvCall_YieldToActive = 1, // Yield until all active procs have run - HvCall_YieldToProc = 2 // Yield until the specified processor has run -}; - -enum HvCall_InterruptMasks -{ - HvCall_MaskIPI = 0x00000001, - HvCall_MaskLpEvent = 0x00000002, - HvCall_MaskLpProd = 0x00000004, - HvCall_MaskTimeout = 0x00000008 -}; - -enum HvCall_VaryOffChunkRc -{ - HvCall_VaryOffSucceeded = 0, - HvCall_VaryOffWithdrawn = 1, - HvCall_ChunkInLoadArea = 2, - HvCall_ChunkInHPT = 3, - HvCall_ChunkNotAccessible = 4, - HvCall_ChunkInUse = 5 -}; -*/ - /* Type of yield for HvCallBaseYieldProcessor */ #define HvCall_YieldTimed 0 /* Yield until specified time (tb) */ #define HvCall_YieldToActive 1 /* Yield until all active procs have run */ @@ -139,35 +97,12 @@ static inline void HvCall_setEnabledInte HvCall1(HvCallBaseSetEnabledInterrupts, enabledInterrupts); } -static inline void HvCall_clearLogBuffer(HvLpIndex lpindex) -{ - HvCall1(HvCallBaseClearLogBuffer, lpindex); -} - -static inline u32 HvCall_getLogBufferCodePage(HvLpIndex lpindex) -{ - u32 retVal = HvCall1(HvCallBaseGetLogBufferCodePage, lpindex); - return retVal; -} - -static inline int HvCall_getLogBufferFormat(HvLpIndex lpindex) -{ - int retVal = HvCall1(HvCallBaseGetLogBufferFormat, lpindex); - return retVal; -} - -static inline u32 HvCall_getLogBufferLength(HvLpIndex lpindex) -{ - u32 retVal = HvCall1(HvCallBaseGetLogBufferLength, lpindex); - return retVal; -} - -static inline void 
HvCall_setLogBufferFormatAndCodepage(int format, u32 codePage) +static inline void HvCall_setLogBufferFormatAndCodepage(int format, + u32 codePage) { HvCall2(HvCallBaseSetLogBufferFormatAndCodePage, format, codePage); } -extern int HvCall_readLogBuffer(HvLpIndex lpindex, void *buffer, u64 bufLen); extern void HvCall_writeLogBuffer(const void *buffer, u64 bufLen); static inline void HvCall_sendIPI(struct paca_struct *targetPaca) @@ -175,19 +110,4 @@ static inline void HvCall_sendIPI(struct HvCall1(HvCallBaseSendIPI, targetPaca->paca_index); } -static inline void HvCall_terminateMachineSrc(void) -{ - HvCall0(HvCallBaseTerminateMachineSrc); -} - -static inline void HvCall_setDABR(unsigned long val) -{ - HvCall1(HvCallCcSetDABR, val); -} - -static inline void HvCall_setDebugBus(unsigned long val) -{ - HvCall1(HvCallBaseSetDebugBus, val); -} - #endif /* _HVCALL_H */ diff -ruNp linus-iSeries-headers.5/include/asm-ppc64/paca.h linus-iSeries-headers.6/include/asm-ppc64/paca.h --- linus-iSeries-headers.5/include/asm-ppc64/paca.h 2005-05-20 09:05:56.000000000 +1000 +++ linus-iSeries-headers.6/include/asm-ppc64/paca.h 2005-06-02 15:52:17.000000000 +1000 @@ -20,13 +20,13 @@ #include #include #include +#include #include register struct paca_struct *local_paca asm("r13"); #define get_paca() local_paca struct task_struct; -struct ItLpQueue; /* * Defines the layout of the paca. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/ddbfdacb/attachment.pgp From sfr at canb.auug.org.au Thu Jun 2 18:27:14 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 2 Jun 2005 18:27:14 +1000 Subject: [PATCH] iSeries: Message-ID: <20050602182714.76b28e7a.sfr@canb.auug.org.au> Hi all, Last of the cleanups for today :-) - don't have two defines for the same thing (HvMaxArchitectedLps and HvMaxArchitectedVirtualLans) - HvCallSc.h only needs linux/types.h - remove unused struct definition arch/ppc64/kernel/viopath.c | 4 +-- include/asm-ppc64/iSeries/HvCallSc.h | 34 ++++++++++++++++----------------- include/asm-ppc64/iSeries/HvTypes.h | 8 ++----- include/asm-ppc64/iSeries/ItVpdAreas.h | 6 ----- 4 files changed, 22 insertions(+), 30 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.6/arch/ppc64/kernel/viopath.c linus-iSeries-headers.7/arch/ppc64/kernel/viopath.c --- linus-iSeries-headers.6/arch/ppc64/kernel/viopath.c 2005-06-02 17:06:58.000000000 +1000 +++ linus-iSeries-headers.7/arch/ppc64/kernel/viopath.c 2005-06-02 18:09:32.000000000 +1000 @@ -485,7 +485,7 @@ int viopath_open(HvLpIndex remoteLp, int unsigned long flags; int tempNumAllocated; - if ((remoteLp >= HvMaxArchitectedLps) || (remoteLp == HvLpIndexInvalid)) + if ((remoteLp >= HVMAXARCHITECTEDLPS) || (remoteLp == HvLpIndexInvalid)) return -EINVAL; subtype = subtype >> VIOMAJOR_SUBTYPE_SHIFT; @@ -556,7 +556,7 @@ int viopath_close(HvLpIndex remoteLp, in int numOpen; struct alloc_parms parms; - if ((remoteLp >= HvMaxArchitectedLps) || (remoteLp == HvLpIndexInvalid)) + if ((remoteLp >= HVMAXARCHITECTEDLPS) || (remoteLp == HvLpIndexInvalid)) return -EINVAL; subtype = subtype >> VIOMAJOR_SUBTYPE_SHIFT; diff -ruNp linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCallSc.h 
linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvCallSc.h --- linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCallSc.h 2005-06-01 15:46:19.000000000 +1000 +++ linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvCallSc.h 2005-06-02 14:00:14.000000000 +1000 @@ -19,7 +19,7 @@ #ifndef _HVCALLSC_H #define _HVCALLSC_H -#include +#include #define HvCallBase 0x8000000000000000ul #define HvCallCc 0x8001000000000000ul @@ -30,22 +30,22 @@ #define HvCallSm 0x8007000000000000ul #define HvCallXm 0x8009000000000000ul -u64 HvCall0(u64); -u64 HvCall1(u64, u64); -u64 HvCall2(u64, u64, u64); -u64 HvCall3(u64, u64, u64, u64); -u64 HvCall4(u64, u64, u64, u64, u64); -u64 HvCall5(u64, u64, u64, u64, u64, u64); -u64 HvCall6(u64, u64, u64, u64, u64, u64, u64); -u64 HvCall7(u64, u64, u64, u64, u64, u64, u64, u64); +extern u64 HvCall0(u64); +extern u64 HvCall1(u64, u64); +extern u64 HvCall2(u64, u64, u64); +extern u64 HvCall3(u64, u64, u64, u64); +extern u64 HvCall4(u64, u64, u64, u64, u64); +extern u64 HvCall5(u64, u64, u64, u64, u64, u64); +extern u64 HvCall6(u64, u64, u64, u64, u64, u64, u64); +extern u64 HvCall7(u64, u64, u64, u64, u64, u64, u64, u64); -u64 HvCall0Ret16(u64, void *); -u64 HvCall1Ret16(u64, void *, u64); -u64 HvCall2Ret16(u64, void *, u64, u64); -u64 HvCall3Ret16(u64, void *, u64, u64, u64); -u64 HvCall4Ret16(u64, void *, u64, u64, u64, u64); -u64 HvCall5Ret16(u64, void *, u64, u64, u64, u64, u64); -u64 HvCall6Ret16(u64, void *, u64, u64, u64, u64, u64, u64); -u64 HvCall7Ret16(u64, void *, u64, u64 ,u64 ,u64 ,u64 ,u64 ,u64); +extern u64 HvCall0Ret16(u64, void *); +extern u64 HvCall1Ret16(u64, void *, u64); +extern u64 HvCall2Ret16(u64, void *, u64, u64); +extern u64 HvCall3Ret16(u64, void *, u64, u64, u64); +extern u64 HvCall4Ret16(u64, void *, u64, u64, u64, u64); +extern u64 HvCall5Ret16(u64, void *, u64, u64, u64, u64, u64); +extern u64 HvCall6Ret16(u64, void *, u64, u64, u64, u64, u64, u64); +extern u64 HvCall7Ret16(u64, void *, u64, u64 ,u64 ,u64 ,u64 ,u64 
,u64); #endif /* _HVCALLSC_H */ diff -ruNp linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvTypes.h linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvTypes.h --- linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvTypes.h 2005-06-01 16:45:03.000000000 +1000 +++ linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvTypes.h 2005-06-02 14:15:29.000000000 +1000 @@ -40,14 +40,14 @@ typedef u64 HvIoToken; typedef u8 HvLpName[8]; typedef u32 HvIoId; typedef u64 HvRealMemoryIndex; -typedef u32 HvLpIndexMap; /* Must hold HvMaxArchitectedLps bits!!! */ +typedef u32 HvLpIndexMap; /* Must hold HVMAXARCHITECTEDLPS bits!!! */ typedef u16 HvLpVrmIndex; typedef u32 HvXmGenerationId; typedef u8 HvLpBusPool; typedef u8 HvLpSharedPoolIndex; typedef u16 HvLpSharedProcUnitsX100; typedef u8 HvLpVirtualLanIndex; -typedef u16 HvLpVirtualLanIndexMap; /* Must hold HvMaxArchitectedVirtualLans bits!!! */ +typedef u16 HvLpVirtualLanIndexMap; /* Must hold HVMAXARCHITECTEDVIRTUALLANS bits!!! */ typedef u16 HvBusNumber; /* Hypervisor Bus Number */ typedef u8 HvSubBusNumber; /* Hypervisor SubBus Number */ typedef u8 HvAgentId; /* Hypervisor DevFn */ @@ -66,15 +66,13 @@ typedef u8 HvAgentId; /* Hypervisor DevF #define HVPAGESPERMEG 256 #define HVPAGESPERCHUNK 64 -#define HvMaxArchitectedLps ((HvLpIndex)HVMAXARCHITECTEDLPS) -#define HvMaxArchitectedVirtualLans ((HvLpVirtualLanIndex)16) #define HvLpIndexInvalid ((HvLpIndex)0xff) /* * Enums for the sub-components under PLIC * Used in HvCall and HvPrimaryCall */ -enum HvCallCompIds { +enum { HvCallCompId = 0, HvCallCpuCtlsCompId = 1, HvCallCfgCompId = 2, diff -ruNp linus-iSeries-headers.6/include/asm-ppc64/iSeries/ItVpdAreas.h linus-iSeries-headers.7/include/asm-ppc64/iSeries/ItVpdAreas.h --- linus-iSeries-headers.6/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-02 16:22:09.000000000 +1000 +++ linus-iSeries-headers.7/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-02 16:48:16.000000000 +1000 @@ -61,12 +61,6 @@ #define ItVpdAreasMaxSlotLabels 192 
-struct SlicVpdAdrs { - u32 pad1; - void *vpdAddr; -}; - - struct ItVpdAreas { u32 xSlicDesc; // Descriptor 000-003 u16 xSlicSize; // Size of this control block 004-005 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/8781e4e4/attachment.pgp From arnd at arndb.de Fri Jun 3 01:52:05 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Thu, 2 Jun 2005 17:52:05 +0200 Subject: [PATCH] ppc64: Fix result code handling in prom_init In-Reply-To: <1117685497.31082.27.camel@gaston> References: <1117685497.31082.27.camel@gaston> Message-ID: <200506021752.06035.arnd@arndb.de> On Thursday 02 June 2005 06:11, Benjamin Herrenschmidt wrote: > This patch reworks it all. Tested on g5, Maple, POWER3 and POWER5. Works on the Cell firmware as well, as expected. Thanks! Arnd <>< From xma at us.ibm.com Fri Jun 3 02:20:02 2005 From: xma at us.ibm.com (Shirley Ma) Date: Thu, 2 Jun 2005 09:20:02 -0700 Subject: get_cyles() In-Reply-To: Message-ID: Doesn't ppc64 support get_cycles() correctly? I used the function below for get_cycles(), but it does not seem right. t1 = get_cycles(); t2 = get_cycles(); t2-t1 is huge. static inline cycles_t get_cycles(void) { cycles_t ret; __asm__ __volatile__("mftb %0" : "=r" (ret) : ); return ret; } Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax): (503) 578-7638 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/382a94ab/attachment.htm From zwane at arm.linux.org.uk Fri Jun 3 05:42:44 2005 From: zwane at arm.linux.org.uk (Zwane Mwaikambo) Date: Thu, 2 Jun 2005 13:42:44 -0600 (MDT) Subject: get_cyles() In-Reply-To: References: Message-ID: On Thu, 2 Jun 2005, Shirley Ma wrote: > Doesn't ppc64 support get_cycles() correctly?
> I used the function below for get_cycles(), but it does not seem right. > > t1 = get_cycles(); > t2 = get_cycles(); > > t2-t1 is huge. > > static inline cycles_t get_cycles(void) > { > cycles_t ret; > > __asm__ __volatile__("mftb %0" : "=r" (ret) : ); > return ret; > } Conjecture: the timebase runs at 33MHz (so it is slow compared to the rate at which the processor retires instructions), the gap between sampling t1 and t2 is very small, and the delta may even go backwards(?). What are the actual numbers? Zwane From xma at us.ibm.com Fri Jun 3 07:18:27 2005 From: xma at us.ibm.com (Shirley Ma) Date: Thu, 2 Jun 2005 14:18:27 -0700 Subject: get_cyles() In-Reply-To: Message-ID: I found the reason. The application (32-bit) printed the value as unsigned long long, but get_cycles() returns unsigned long. Thanks Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax): (503) 578-7638 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/1e080e66/attachment.htm From gerald.vanbaren at smiths-aerospace.com Fri Jun 3 06:08:09 2005 From: gerald.vanbaren at smiths-aerospace.com (Jerry Van Baren) Date: Thu, 2 Jun 2005 16:08:09 -0400 Subject: get_cyles() In-Reply-To: References: Message-ID: <429F6729.5070202@smiths-aerospace.com> Shirley Ma wrote: > > Doesn't ppc64 support get_cycles() correctly? > I used the function below for get_cycles(), but it does not seem right. > > t1 = get_cycles(); > t2 = get_cycles(); > > t2-t1 is huge. > > static inline cycles_t get_cycles(void) > { > cycles_t ret; > > __asm__ __volatile__("mftb %0" : "=r" (ret) : ); > return ret; > } > > Shirley Ma > IBM Linux Technology Center > 15300 SW Koll Parkway > Beaverton, OR 97006-6063 > Phone(Fax): (503) 578-7638 Is (t2-t1) (a) _always_ huge or (b) only sometimes huge? If (a), are you sure t1 and t2 are executing in the written order? The compiler may be re-ordering the instructions on you.
Use objdump -S to disassemble the object file (or executable). If (b), you are getting burned by a carry out of the lower time base register. Every 2^32 counts, the timebase (lower) register will wrap around from 0xFFFFFFFF to 0x00000000. If the wrap occurs between reading t1 and t2, you will get a very large jump. Another possibility for (b) is that you are getting an interrupt between reading t1 and t2, so your delta will include the interrupt handling time and potentially more... if your task is pre-empted due to the interrupt, the time delta will be (relatively speaking) very large. This is actually the most likely scenario in my mind. gvb From ntl at pobox.com Fri Jun 3 08:15:09 2005 From: ntl at pobox.com (Nathan Lynch) Date: Thu, 2 Jun 2005 17:15:09 -0500 Subject: [PATCH] fix slab corruption during ipr probe Message-ID: <20050602221509.GA11355@otto> Hi- With CONFIG_DEBUG_SLAB=y I see slab corruption messages during boot on pSeries machines with IPR adapters with any 2.6.12-rc kernel. The change which seems to have introduced the problem is "SCSI: revamp target scanning routines" and may be found at: http://marc.theaimsgroup.com/?l=bk-commits-head&m=111093946426333&w=2 In order to revert that in a 2.6.12-rc1 tree, I had to revert "target code updates to support scanned targets" first: http://marc.theaimsgroup.com/?l=bk-commits-head&m=111094132524649&w=2 With both patches reverted, the corruption messages go away. ipr: IBM Power RAID SCSI Device Driver version: 2.0.13 (February 21, 2005) ipr 0001:d0:01.0: Found IOA with IRQ: 167 ipr 0001:d0:01.0: Starting IOA initialization sequence. ipr 0001:d0:01.0: Adapter firmware version: 020A005C ipr 0001:d0:01.0: IOA initialized.
scsi0 : IBM 570B Storage Adapter Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770 Type: Enclosure ANSI SCSI revision: 02 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770 Type: Enclosure ANSI SCSI revision: 02 Slab corruption: start=c0000001e8de5268, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [](.scsi_target_dev_release+0x28/0x50) 080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a Prev obj: start=c0000001e8de5050, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [<0000000000000000>](0x0) 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Next obj: start=c0000001e8de5480, len=512 Redzone: 0x170fc2a5/0x170fc2a5. Last user: [](.as_init_queue+0x5c/0x228) 000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00 010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98 Slab corruption: start=c0000001e8de5268, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [](.scsi_target_dev_release+0x28/0x50) 080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a Prev obj: start=c0000001e8de5050, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [<0000000000000000>](0x0) 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Next obj: start=c0000001e8de5480, len=512 Redzone: 0x170fc2a5/0x170fc2a5. Last user: [](.as_init_queue+0x5c/0x228) 000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00 010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98 ... I did some digging and the problem seems to be a refcounting issue in __scsi_add_device. 
The target gets freed in scsi_target_reap, and then
__scsi_add_device tries to do another put_device on it.  I'm not sure
whether other users of __scsi_add_device are affected by this.

This patch works for me (tm); is it correct?

Signed-off-by: Nathan Lynch

Index: linux-2.6.12-rc5/drivers/scsi/scsi_scan.c
===================================================================
--- linux-2.6.12-rc5.orig/drivers/scsi/scsi_scan.c
+++ linux-2.6.12-rc5/drivers/scsi/scsi_scan.c
@@ -1197,6 +1197,7 @@ struct scsi_device *__scsi_add_device(st
 	if (!starget)
 		return ERR_PTR(-ENOMEM);
 
+	get_device(&starget->dev);
 	down(&shost->scan_mutex);
 	res = scsi_probe_and_add_lun(starget, lun, NULL, &sdev, 1, hostdata);
 	if (res != SCSI_SCAN_LUN_PRESENT)

From utz.bacher at de.ibm.com Fri Jun 3 09:33:57 2005
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Fri, 3 Jun 2005 01:33:57 +0200 (CEST)
Subject: [PATCH 3/8] ppc64: add a watchdog driver for rtas
Message-ID:

Nathan, Arnd,

Nathan Lynch wrote:
> Arnd Bergmann wrote:
> > +static volatile int wdrtas_miscdev_open = 0;
> ...
> > +static int
> > +wdrtas_open(struct inode *inode, struct file *file)
> > +{
> > +	/* only open once */
> > +	if (xchg(&wdrtas_miscdev_open,1))
> > +		return -EBUSY;
>
> The volatile and xchg strike me as an obscure method for ensuring only
> one process at a time can open this file.  Any reason a semaphore
> couldn't be used?
[...]
> > +	printk("wdrtas: got unexpected close. Watchdog "
> > +	       "not stopped.\n");
>
> printk's need a valid log level specified.  There are several in this
> file that lack them.

The following patch fixes the issues the two of you had: printks now
have a proper log level specified, and atomic variables are used to
prevent concurrent access to the watchdog file.

Thanks,
Utz

Add a watchdog using the RTAS OS surveillance service.  This is provided
as a simpler alternative to rtasd.  The added value is that it works with
standard watchdog client programs and can therefore also do user space
monitoring.
On BPA, rtasd is not really useful because the hardware does not have much to report with event-scan. The driver should also work on other platforms that support the OS surveillance RTAS calls. Signed-off-by: Utz Bacher diff -ruN linux-2.6.12-rc5.orig/drivers/char/watchdog/Kconfig linux-2.6.12-rc5/drivers/char/watchdog/Kconfig --- linux-2.6.12-rc5.orig/drivers/char/watchdog/Kconfig 2005-06-03 00:33:32.679342588 +0200 +++ linux-2.6.12-rc5/drivers/char/watchdog/Kconfig 2005-06-03 00:34:20.193419032 +0200 @@ -414,6 +414,16 @@ machines. The watchdog timeout period is normally one minute but can be changed with a boot-time parameter. +# ppc64 RTAS watchdog +config WATCHDOG_RTAS + tristate "RTAS watchdog" + depends on WATCHDOG && PPC_RTAS + help + This driver adds watchdog support for the RTAS watchdog. + + To compile this driver as a module, choose M here. The module + will be called wdrtas. + # # ISA-based Watchdog Cards # diff -ruN linux-2.6.12-rc5.orig/drivers/char/watchdog/Makefile linux-2.6.12-rc5/drivers/char/watchdog/Makefile --- linux-2.6.12-rc5.orig/drivers/char/watchdog/Makefile 2005-06-03 00:33:32.679342588 +0200 +++ linux-2.6.12-rc5/drivers/char/watchdog/Makefile 2005-06-03 00:34:20.194419404 +0200 @@ -33,6 +33,7 @@ obj-$(CONFIG_IXP4XX_WATCHDOG) += ixp4xx_wdt.o obj-$(CONFIG_IXP2000_WATCHDOG) += ixp2000_wdt.o obj-$(CONFIG_8xx_WDT) += mpc8xx_wdt.o +obj-$(CONFIG_WATCHDOG_RTAS) += wdrtas.o # Only one watchdog can succeed. We probe the hardware watchdog # drivers first, then the softdog driver. 
This means if your hardware diff -ruN linux-2.6.12-rc5.orig/drivers/char/watchdog/wdrtas.c linux-2.6.12-rc5/drivers/char/watchdog/wdrtas.c --- linux-2.6.12-rc5.orig/drivers/char/watchdog/wdrtas.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.12-rc5/drivers/char/watchdog/wdrtas.c 2005-06-03 00:34:31.886208108 +0200 @@ -0,0 +1,696 @@ +/* + * FIXME: add wdrtas_get_status and wdrtas_get_boot_status as soon as + * RTAS calls are available + */ + +/* + * RTAS watchdog driver + * + * (C) Copyright IBM Corp. 2005 + * device driver to exploit watchdog RTAS functions + * + * Authors : Utz Bacher + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#define WDRTAS_MAGIC_CHAR 42 +#define WDRTAS_SUPPORTED_MASK (WDIOF_SETTIMEOUT | \ + WDIOF_MAGICCLOSE) + +MODULE_AUTHOR("Utz Bacher "); +MODULE_DESCRIPTION("RTAS watchdog driver"); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_MISCDEV(WATCHDOG_MINOR); +MODULE_ALIAS_MISCDEV(TEMP_MINOR); + +#ifdef CONFIG_WATCHDOG_NOWAYOUT +static int wdrtas_nowayout = 1; +#else +static int wdrtas_nowayout = 0; +#endif + +static atomic_t wdrtas_miscdev_open = ATOMIC_INIT(0); +static char wdrtas_expect_close = 0; + +static int wdrtas_interval; + +#define WDRTAS_THERMAL_SENSOR 3 +static int wdrtas_token_get_sensor_state; +#define WDRTAS_SURVEILLANCE_IND 9000 +static int wdrtas_token_set_indicator; +#define WDRTAS_SP_SPI 28 +static int wdrtas_token_get_sp; +static int wdrtas_token_event_scan; + +#define WDRTAS_DEFAULT_INTERVAL 300 + +#define WDRTAS_LOGBUFFER_LEN 128 +static char wdrtas_logbuffer[WDRTAS_LOGBUFFER_LEN]; + + +/*** watchdog access functions */ + +/** + * wdrtas_set_interval - sets the watchdog interval + * @interval: new interval + * + * returns 0 on success, <0 on failures + * + * wdrtas_set_interval sets the watchdog keepalive interval by calling the + * RTAS function set-indicator (surveillance). The unit of interval is + * seconds. 
+ */ +static int +wdrtas_set_interval(int interval) +{ + long result; + static int print_msg = 10; + + /* rtas uses minutes */ + interval = (interval + 59) / 60; + + result = rtas_call(wdrtas_token_set_indicator, 3, 1, NULL, + WDRTAS_SURVEILLANCE_IND, 0, interval); + if ( (result < 0) && (print_msg) ) { + printk(KERN_ERR "wdrtas: setting the watchdog to %i " + "timeout failed: %li\n", interval, result); + print_msg--; + } + + return result; +} + +/** + * wdrtas_get_interval - returns the current watchdog interval + * @fallback_value: value (in seconds) to use, if the RTAS call fails + * + * returns the interval + * + * wdrtas_get_interval returns the current watchdog keepalive interval + * as reported by the RTAS function ibm,get-system-parameter. The unit + * of the return value is seconds. + */ +static int +wdrtas_get_interval(int fallback_value) +{ + long result; + char value[4]; + + result = rtas_call(wdrtas_token_get_sp, 3, 1, NULL, + WDRTAS_SP_SPI, (void *)__pa(&value), 4); + if ( (value[0] != 0) || (value[1] != 2) || (value[3] != 0) || + (result < 0) ) { + printk(KERN_WARNING "wdrtas: could not get sp_spi watchdog " + "timeout (%li). 
Continuing\n", result); + return fallback_value; + } + + /* rtas uses minutes */ + return ((int)value[2]) * 60; +} + +/** + * wdrtas_timer_start - starts watchdog + * + * wdrtas_timer_start starts the watchdog by calling the RTAS function + * set-indicator (surveillance) + */ +static void +wdrtas_timer_start(void) +{ + wdrtas_set_interval(wdrtas_interval); +} + +/** + * wdrtas_timer_stop - stops watchdog + * + * wdrtas_timer_stop stops the watchdog timer by calling the RTAS function + * set-indicator (surveillance) + */ +static void +wdrtas_timer_stop(void) +{ + wdrtas_set_interval(0); +} + +/** + * wdrtas_log_scanned_event - logs an event we received during keepalive + * + * wdrtas_log_scanned_event prints a message to the log buffer dumping + * the results of the last event-scan call + */ +static void +wdrtas_log_scanned_event(void) +{ + int i; + + for (i = 0; i < WDRTAS_LOGBUFFER_LEN; i += 16) + printk(KERN_INFO "wdrtas: dumping event (line %i/%i), data = " + "%02x %02x %02x %02x %02x %02x %02x %02x " + "%02x %02x %02x %02x %02x %02x %02x %02x\n", + (i / 16) + 1, (WDRTAS_LOGBUFFER_LEN / 16), + wdrtas_logbuffer[i + 0], wdrtas_logbuffer[i + 1], + wdrtas_logbuffer[i + 2], wdrtas_logbuffer[i + 3], + wdrtas_logbuffer[i + 4], wdrtas_logbuffer[i + 5], + wdrtas_logbuffer[i + 6], wdrtas_logbuffer[i + 7], + wdrtas_logbuffer[i + 8], wdrtas_logbuffer[i + 9], + wdrtas_logbuffer[i + 10], wdrtas_logbuffer[i + 11], + wdrtas_logbuffer[i + 12], wdrtas_logbuffer[i + 13], + wdrtas_logbuffer[i + 14], wdrtas_logbuffer[i + 15]); +} + +/** + * wdrtas_timer_keepalive - resets watchdog timer to keep system alive + * + * wdrtas_timer_keepalive restarts the watchdog timer by calling the + * RTAS function event-scan and repeats these calls as long as there are + * events available. All events will be dumped.
+ */ +static void +wdrtas_timer_keepalive(void) +{ + long result; + + do { + result = rtas_call(wdrtas_token_event_scan, 4, 1, NULL, + RTAS_EVENT_SCAN_ALL_EVENTS, 0, + (void *)__pa(wdrtas_logbuffer), + WDRTAS_LOGBUFFER_LEN); + if (result < 0) + printk(KERN_ERR "wdrtas: event-scan failed: %li\n", + result); + if (result == 0) + wdrtas_log_scanned_event(); + } while (result == 0); +} + +/** + * wdrtas_get_temperature - returns current temperature + * + * returns temperature or <0 on failures + * + * wdrtas_get_temperature returns the current temperature in Fahrenheit. It + * uses the RTAS call get-sensor-state, token 3 to do so + */ +static int +wdrtas_get_temperature(void) +{ + long result; + int temperature = 0; + + result = rtas_call(wdrtas_token_get_sensor_state, 2, 2, + (void *)__pa(&temperature), + WDRTAS_THERMAL_SENSOR, 0); + + if (result < 0) + printk(KERN_WARNING "wdrtas: reading the thermal sensor " + "failed: %li\n", result); + else + temperature = ((temperature * 9) / 5) + 32; /* fahrenheit */ + + return temperature; +} + +/** + * wdrtas_get_status - returns the status of the watchdog + * + * returns a bitmask of defines WDIOF_... as defined in + * include/linux/watchdog.h + */ +static int +wdrtas_get_status(void) +{ + return 0; /* TODO */ +} + +/** + * wdrtas_get_boot_status - returns the reason for the last boot + * + * returns a bitmask of defines WDIOF_...
as defined in + * include/linux/watchdog.h, indicating why the watchdog rebooted the system + */ +static int +wdrtas_get_boot_status(void) +{ + return 0; /* TODO */ +} + +/*** watchdog API and operations stuff */ + +/* wdrtas_write - called when watchdog device is written to + * @file: file structure + * @buf: user buffer with data + * @len: amount of data written + * @ppos: position in file + * + * returns the number of successfully processed characters, which is always + * the number of bytes passed to this function + * + * wdrtas_write processes all the data given to it and looks for the magic + * character 'V'. This character allows the watchdog device to be closed + * properly. + */ +static ssize_t +wdrtas_write(struct file *file, const char __user *buf, + size_t len, loff_t *ppos) +{ + int i; + char c; + + if (!len) + goto out; + + if (!wdrtas_nowayout) { + wdrtas_expect_close = 0; + /* look for 'V' */ + for (i = 0; i < len; i++) { + if (get_user(c, buf + i)) + return -EFAULT; + /* allow to close device */ + if (c == 'V') + wdrtas_expect_close = WDRTAS_MAGIC_CHAR; + } + } + + wdrtas_timer_keepalive(); + +out: + return len; +} + +/** + * wdrtas_ioctl - ioctl function for the watchdog device + * @inode: inode structure + * @file: file structure + * @cmd: command for ioctl + * @arg: argument pointer + * + * returns 0 on success, <0 on failure + * + * wdrtas_ioctl implements the watchdog API ioctls + */ +static int +wdrtas_ioctl(struct inode *inode, struct file *file, + unsigned int cmd, unsigned long arg) +{ + int __user *argp = (void *)arg; + int i; + static struct watchdog_info wdinfo = { + .options = WDRTAS_SUPPORTED_MASK, + .firmware_version = 0, + .identity = "wdrtas" + }; + + switch (cmd) { + case WDIOC_GETSUPPORT: + if (copy_to_user(argp, &wdinfo, sizeof(wdinfo))) + return -EFAULT; + return 0; + + case WDIOC_GETSTATUS: + i = wdrtas_get_status(); + return put_user(i, argp); + + case WDIOC_GETBOOTSTATUS: + i = wdrtas_get_boot_status(); + return put_user(i,
argp); + + case WDIOC_GETTEMP: + if (wdrtas_token_get_sensor_state == RTAS_UNKNOWN_SERVICE) + return -EOPNOTSUPP; + + i = wdrtas_get_temperature(); + return put_user(i, argp); + + case WDIOC_SETOPTIONS: + if (get_user(i, argp)) + return -EFAULT; + if (i & WDIOS_DISABLECARD) + wdrtas_timer_stop(); + if (i & WDIOS_ENABLECARD) { + wdrtas_timer_keepalive(); + wdrtas_timer_start(); + } + if (i & WDIOS_TEMPPANIC) { + /* not implemented. Done by H8 */ + } + return 0; + + case WDIOC_KEEPALIVE: + wdrtas_timer_keepalive(); + return 0; + + case WDIOC_SETTIMEOUT: + if (get_user(i, argp)) + return -EFAULT; + + if (wdrtas_set_interval(i)) + return -EINVAL; + + wdrtas_timer_keepalive(); + + if (wdrtas_token_get_sp == RTAS_UNKNOWN_SERVICE) + wdrtas_interval = i; + else + wdrtas_interval = wdrtas_get_interval(i); + /* fallthrough */ + + case WDIOC_GETTIMEOUT: + return put_user(wdrtas_interval, argp); + + default: + return -ENOIOCTLCMD; + } +} + +/** + * wdrtas_open - open function of watchdog device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success, -EBUSY if the file has been opened already, <0 on + * other failures + * + * function called when watchdog device is opened + */ +static int +wdrtas_open(struct inode *inode, struct file *file) +{ + /* only open once */ + if (atomic_inc_return(&wdrtas_miscdev_open) > 1) { + atomic_dec(&wdrtas_miscdev_open); + return -EBUSY; + } + + wdrtas_timer_start(); + wdrtas_timer_keepalive(); + + return nonseekable_open(inode, file); +} + +/** + * wdrtas_close - close function of watchdog device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success + * + * close function. Always succeeds + */ +static int +wdrtas_close(struct inode *inode, struct file *file) +{ + /* only stop watchdog, if this was announced using 'V' before */ + if (wdrtas_expect_close == WDRTAS_MAGIC_CHAR) + wdrtas_timer_stop(); + else { + printk(KERN_WARNING "wdrtas: unexpected close. 
Watchdog " + "not stopped.\n"); + wdrtas_timer_keepalive(); + } + + wdrtas_expect_close = 0; + atomic_dec(&wdrtas_miscdev_open); + return 0; +} + +/** + * wdrtas_temp_read - gives back the temperature in fahrenheit + * @file: file structure + * @buf: user buffer + * @count: number of bytes to be read + * @ppos: position in file + * + * returns always 1 or -EFAULT in case of user space copy failures, <0 on + * other failures + * + * wdrtas_temp_read gives the temperature to the users by copying this + * value as one byte into the user space buffer. The unit is Fahrenheit... + */ +static ssize_t +wdrtas_temp_read(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + int temperature = 0; + + temperature = wdrtas_get_temperature(); + if (temperature < 0) + return temperature; + + if (copy_to_user(buf, &temperature, 1)) + return -EFAULT; + + return 1; +} + +/** + * wdrtas_temp_open - open function of temperature device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success, <0 on failure + * + * function called when temperature device is opened + */ +static int +wdrtas_temp_open(struct inode *inode, struct file *file) +{ + return nonseekable_open(inode, file); +} + +/** + * wdrtas_temp_close - close function of temperature device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success + * + * close function. 
Always succeeds + */ +static int +wdrtas_temp_close(struct inode *inode, struct file *file) +{ + return 0; +} + +/** + * wdrtas_reboot - reboot notifier function + * @nb: notifier block structure + * @code: reboot code + * @ptr: unused + * + * returns NOTIFY_DONE + * + * wdrtas_reboot stops the watchdog in case of a reboot + */ +static int +wdrtas_reboot(struct notifier_block *this, unsigned long code, void *ptr) +{ + if ( (code==SYS_DOWN) || (code==SYS_HALT) ) + wdrtas_timer_stop(); + + return NOTIFY_DONE; +} + +/*** initialization stuff */ + +static struct file_operations wdrtas_fops = { + .owner = THIS_MODULE, + .llseek = no_llseek, + .write = wdrtas_write, + .ioctl = wdrtas_ioctl, + .open = wdrtas_open, + .release = wdrtas_close, +}; + +static struct miscdevice wdrtas_miscdev = { + .minor = WATCHDOG_MINOR, + .name = "watchdog", + .fops = &wdrtas_fops, +}; + +static struct file_operations wdrtas_temp_fops = { + .owner = THIS_MODULE, + .llseek = no_llseek, + .read = wdrtas_temp_read, + .open = wdrtas_temp_open, + .release = wdrtas_temp_close, +}; + +static struct miscdevice wdrtas_tempdev = { + .minor = TEMP_MINOR, + .name = "temperature", + .fops = &wdrtas_temp_fops, +}; + +static struct notifier_block wdrtas_notifier = { + .notifier_call = wdrtas_reboot, +}; + +/** + * wdrtas_get_tokens - reads in RTAS tokens + * + * returns 0 on success, <0 on failure + * + * wdrtas_get_tokens reads in the tokens for the RTAS calls used in + * this watchdog driver. It tolerates, if "get-sensor-state" and + * "ibm,get-system-parameter" are not available. + */ +static int +wdrtas_get_tokens(void) +{ + wdrtas_token_get_sensor_state = rtas_token("get-sensor-state"); + if (wdrtas_token_get_sensor_state == RTAS_UNKNOWN_SERVICE) { + printk(KERN_WARNING "wdrtas: couldn't get token for " + "get-sensor-state.
Trying to continue without " + "temperature support.\n"); + } + + wdrtas_token_get_sp = rtas_token("ibm,get-system-parameter"); + if (wdrtas_token_get_sp == RTAS_UNKNOWN_SERVICE) { + printk(KERN_WARNING "wdrtas: couldn't get token for " + "ibm,get-system-parameter. Trying to continue with " + "a default timeout value of %i seconds.\n", + WDRTAS_DEFAULT_INTERVAL); + } + + wdrtas_token_set_indicator = rtas_token("set-indicator"); + if (wdrtas_token_set_indicator == RTAS_UNKNOWN_SERVICE) { + printk(KERN_ERR "wdrtas: couldn't get token for " + "set-indicator. Terminating watchdog code.\n"); + return -EIO; + } + + wdrtas_token_event_scan = rtas_token("event-scan"); + if (wdrtas_token_event_scan == RTAS_UNKNOWN_SERVICE) { + printk(KERN_ERR "wdrtas: couldn't get token for event-scan. " + "Terminating watchdog code.\n"); + return -EIO; + } + + return 0; +} + +/** + * wdrtas_unregister_devs - unregisters the misc dev handlers + * + * wdrtas_unregister_devs unregisters the watchdog and temperature watchdog + * misc devs + */ +static void +wdrtas_unregister_devs(void) +{ + misc_deregister(&wdrtas_miscdev); + if (wdrtas_token_get_sensor_state != RTAS_UNKNOWN_SERVICE) + misc_deregister(&wdrtas_tempdev); +} + +/** + * wdrtas_register_devs - registers the misc dev handlers + * + * returns 0 on success, <0 on failure + * + * wdrtas_register_devs registers the watchdog and temperature watchdog + * misc devs + */ +static int +wdrtas_register_devs(void) +{ + int result; + + result = misc_register(&wdrtas_miscdev); + if (result) { + printk(KERN_ERR "wdrtas: couldn't register watchdog misc " + "device. Terminating watchdog code.\n"); + return result; + } + + if (wdrtas_token_get_sensor_state != RTAS_UNKNOWN_SERVICE) { + result = misc_register(&wdrtas_tempdev); + if (result) { + printk(KERN_WARNING "wdrtas: couldn't register " + "watchdog temperature misc device.
Continuing " + "without temperature support.\n"); + wdrtas_token_get_sensor_state = RTAS_UNKNOWN_SERVICE; + } + } + + return 0; +} + +/** + * wdrtas_init - init function of the watchdog driver + * + * returns 0 on success, <0 on failure + * + * registers the file handlers and the reboot notifier + */ +static int __init +wdrtas_init(void) +{ + if (wdrtas_get_tokens()) + return -ENODEV; + + if (wdrtas_register_devs()) + return -ENODEV; + + if (register_reboot_notifier(&wdrtas_notifier)) { + printk(KERN_ERR "wdrtas: could not register reboot notifier. " + "Terminating watchdog code.\n"); + wdrtas_unregister_devs(); + return -ENODEV; + } + + if (wdrtas_token_get_sp == RTAS_UNKNOWN_SERVICE) + wdrtas_interval = WDRTAS_DEFAULT_INTERVAL; + else + wdrtas_interval = wdrtas_get_interval(WDRTAS_DEFAULT_INTERVAL); + + return 0; +} + +/** + * wdrtas_exit - exit function of the watchdog driver + * + * unregisters the file handlers and the reboot notifier + */ +static void __exit +wdrtas_exit(void) +{ + if (!wdrtas_nowayout) + wdrtas_timer_stop(); + + wdrtas_unregister_devs(); + + unregister_reboot_notifier(&wdrtas_notifier); +} + +module_init(wdrtas_init); +module_exit(wdrtas_exit); From benh at kernel.crashing.org Fri Jun 3 17:18:24 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 03 Jun 2005 17:18:24 +1000 Subject: Booting the linux-ppc64 kernel & flattened device tree v0.4 In-Reply-To: <1117614484.19020.27.camel@gaston> References: <1117614390.19020.24.camel@gaston> <1117614484.19020.27.camel@gaston> Message-ID: <1117783104.31082.151.camel@gaston> On Wed, 2005-06-01 at 18:28 +1000, Benjamin Herrenschmidt wrote: > Here is the kernel patch. It applies on top of the various prom_init.c > bug fixes that I already posted today on the linuxppc-dev & > linuxppc64-dev lists (those will be in the next -mm and maybe in > 2.6.12).
> > This patch is intended to hit upstream by 2.6.13 Ok, the patch I posted with version 0.4 implementing version 0x10 of the format was broken, here is a fixed version against current linus "git" as of today: Index: linux-work/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom_init.c 2005-06-03 16:52:28.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom_init.c 2005-06-03 16:58:01.000000000 +1000 @@ -1534,7 +1534,8 @@ */ #define MAX_PROPERTY_NAME 64 -static void __init scan_dt_build_strings(phandle node, unsigned long *mem_start, +static void __init scan_dt_build_strings(phandle node, + unsigned long *mem_start, unsigned long *mem_end) { unsigned long offset = reloc_offset(); @@ -1547,16 +1548,21 @@ /* get and store all property names */ prev_name = RELOC(""); for (;;) { - int rc; - /* 64 is max len of name including nul. */ namep = make_room(mem_start, mem_end, MAX_PROPERTY_NAME, 1); - rc = call_prom("nextprop", 3, 1, node, prev_name, namep); - if (rc != 1) { + if (call_prom("nextprop", 3, 1, node, prev_name, namep) != 1) { /* No more nodes: unwind alloc */ *mem_start = (unsigned long)namep; break; } + + /* skip "name" */ + if (strcmp(namep, RELOC("name")) == 0) { + *mem_start = (unsigned long)namep; + prev_name = RELOC("name"); + continue; + } + /* get/create string entry */ soff = dt_find_string(namep); if (soff != 0) { *mem_start = (unsigned long)namep; @@ -1571,7 +1577,7 @@ /* do all our children */ child = call_prom("child", 1, 1, node); - while (child != (phandle)0) { + while (child != 0) { scan_dt_build_strings(child, mem_start, mem_end); child = call_prom("peer", 1, 1, child); } @@ -1580,16 +1586,13 @@ static void __init scan_dt_build_struct(phandle node, unsigned long *mem_start, unsigned long *mem_end) { - int l, align; phandle child; - char *namep, *prev_name, *sstart, *p, *ep; + char *namep, *prev_name, *sstart, *p, *ep, *lp, *path; unsigned long soff; unsigned 
char *valp; unsigned long offset = reloc_offset(); - char pname[MAX_PROPERTY_NAME]; - char *path; - - path = RELOC(prom_scratch); + static char pname[MAX_PROPERTY_NAME]; + int l; dt_push_token(OF_DT_BEGIN_NODE, mem_start, mem_end); @@ -1599,23 +1602,33 @@ namep, *mem_end - *mem_start); if (l >= 0) { /* Didn't fit? Get more room. */ - if (l+1 > *mem_end - *mem_start) { + if ((l+1) > (*mem_end - *mem_start)) { namep = make_room(mem_start, mem_end, l+1, 1); call_prom("package-to-path", 3, 1, node, namep, l); } namep[l] = '\0'; + /* Fixup an Apple bug where they have bogus \0 chars in the * middle of the path in some properties */ for (p = namep, ep = namep + l; p < ep; p++) if (*p == '\0') { memmove(p, p+1, ep - p); - ep--; l--; + ep--; l--; p--; } - *mem_start = _ALIGN(((unsigned long) namep) + strlen(namep) + 1, 4); + + /* now try to extract the unit name in that mess */ + for (p = namep, lp = NULL; *p; p++) + if (*p == '/') + lp = p + 1; + if (lp != NULL) + memmove(namep, lp, strlen(lp) + 1); + *mem_start = _ALIGN(((unsigned long) namep) + + strlen(namep) + 1, 4); } /* get it again for debugging */ + path = RELOC(prom_scratch); memset(path, 0, PROM_SCRATCH_SIZE); call_prom("package-to-path", 3, 1, node, path, PROM_SCRATCH_SIZE-1); @@ -1623,23 +1636,27 @@ prev_name = RELOC(""); sstart = (char *)RELOC(dt_string_start); for (;;) { - int rc; - - rc = call_prom("nextprop", 3, 1, node, prev_name, pname); - if (rc != 1) + if (call_prom("nextprop", 3, 1, node, prev_name, + RELOC(pname)) != 1) break; + /* skip "name" */ + if (strcmp(RELOC(pname), RELOC("name")) == 0) { + prev_name = RELOC("name"); + continue; + } + /* find string offset */ - soff = dt_find_string(pname); + soff = dt_find_string(RELOC(pname)); if (soff == 0) { - prom_printf("WARNING: Can't find string index for <%s>, node %s\n", - pname, path); + prom_printf("WARNING: Can't find string index for" + " <%s>, node %s\n", RELOC(pname), path); break; } prev_name = sstart + soff; /* get length */ - l = 
call_prom("getproplen", 2, 1, node, pname); + l = call_prom("getproplen", 2, 1, node, RELOC(pname)); /* sanity checks */ if (l == PROM_ERROR) @@ -1648,7 +1665,7 @@ prom_printf("WARNING: ignoring large property "); /* It seems OF doesn't null-terminate the path :-( */ prom_printf("[%s] ", path); - prom_printf("%s length 0x%x\n", pname, l); + prom_printf("%s length 0x%x\n", RELOC(pname), l); continue; } @@ -1658,17 +1675,16 @@ dt_push_token(soff, mem_start, mem_end); /* push property content */ - align = (l >= 8) ? 8 : 4; - valp = make_room(mem_start, mem_end, l, align); - call_prom("getprop", 4, 1, node, pname, valp, l); + valp = make_room(mem_start, mem_end, l, 4); + call_prom("getprop", 4, 1, node, RELOC(pname), valp, l); *mem_start = _ALIGN(*mem_start, 4); } /* Add a "linux,phandle" property. */ soff = dt_find_string(RELOC("linux,phandle")); if (soff == 0) - prom_printf("WARNING: Can't find string index for " - " node %s\n", path); + prom_printf("WARNING: Can't find string index for" + " node %s\n", path); else { dt_push_token(OF_DT_PROP, mem_start, mem_end); dt_push_token(4, mem_start, mem_end); @@ -1679,7 +1695,7 @@ /* do all our children */ child = call_prom("child", 1, 1, node); - while (child != (phandle)0) { + while (child != 0) { scan_dt_build_struct(child, mem_start, mem_end); child = call_prom("peer", 1, 1, child); } @@ -1718,7 +1734,8 @@ /* Build header and make room for mem rsv map */ mem_start = _ALIGN(mem_start, 4); - hdr = make_room(&mem_start, &mem_end, sizeof(struct boot_param_header), 4); + hdr = make_room(&mem_start, &mem_end, + sizeof(struct boot_param_header), 4); RELOC(dt_header_start) = (unsigned long)hdr; rsvmap = make_room(&mem_start, &mem_end, sizeof(mem_reserve_map), 8); @@ -1731,11 +1748,11 @@ namep = make_room(&mem_start, &mem_end, 16, 1); strcpy(namep, RELOC("linux,phandle")); mem_start = (unsigned long)namep + strlen(namep) + 1; - RELOC(dt_string_end) = mem_start; /* Build string array */ prom_printf("Building dt strings...\n"); 
scan_dt_build_strings(root, &mem_start, &mem_end); + RELOC(dt_string_end) = mem_start; /* Build structure */ mem_start = PAGE_ALIGN(mem_start); @@ -1750,9 +1767,11 @@ hdr->totalsize = RELOC(dt_struct_end) - RELOC(dt_header_start); hdr->off_dt_struct = RELOC(dt_struct_start) - RELOC(dt_header_start); hdr->off_dt_strings = RELOC(dt_string_start) - RELOC(dt_header_start); + hdr->dt_strings_size = RELOC(dt_string_end) - RELOC(dt_string_start); hdr->off_mem_rsvmap = ((unsigned long)rsvmap) - RELOC(dt_header_start); hdr->version = OF_DT_VERSION; - hdr->last_comp_version = 1; + /* Version 16 is not backward compatible */ + hdr->last_comp_version = 0x10; /* Reserve the whole thing and copy the reserve map in, we * also bump mem_reserve_cnt to cause further reservations to @@ -1808,6 +1827,9 @@ /* does it need fixup ? */ if (prom_getproplen(i2c, "interrupts") > 0) return; + + prom_printf("fixing up bogus interrupts for u3 i2c...\n"); + /* interrupt on this revision of u3 is number 0 and level */ interrupts[0] = 0; interrupts[1] = 1; Index: linux-work/arch/ppc64/kernel/prom.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/prom.c 2005-05-09 14:44:32.000000000 +1000 +++ linux-work/arch/ppc64/kernel/prom.c 2005-06-03 16:58:01.000000000 +1000 @@ -625,8 +625,8 @@ static inline char *find_flat_dt_string(u32 offset) { - return ((char *)initial_boot_params) + initial_boot_params->off_dt_strings - + offset; + return ((char *)initial_boot_params) + + initial_boot_params->off_dt_strings + offset; } /** @@ -635,26 +635,33 @@ * unflatten the tree */ static int __init scan_flat_dt(int (*it)(unsigned long node, - const char *full_path, void *data), + const char *uname, int depth, + void *data), void *data) { unsigned long p = ((unsigned long)initial_boot_params) + initial_boot_params->off_dt_struct; int rc = 0; + int depth = -1; do { u32 tag = *((u32 *)p); char *pathp; p += 4; - if (tag == OF_DT_END_NODE) + if (tag == 
OF_DT_END_NODE) { + depth --; + continue; + } + if (tag == OF_DT_NOP) continue; if (tag == OF_DT_END) break; if (tag == OF_DT_PROP) { u32 sz = *((u32 *)p); p += 8; - p = _ALIGN(p, sz >= 8 ? 8 : 4); + if (initial_boot_params->version < 0x10) + p = _ALIGN(p, sz >= 8 ? 8 : 4); p += sz; p = _ALIGN(p, 4); continue; @@ -664,9 +671,18 @@ " device tree !\n", tag); return -EINVAL; } + depth++; pathp = (char *)p; p = _ALIGN(p + strlen(pathp) + 1, 4); - rc = it(p, pathp, data); + if ((*pathp) == '/') { + char *lp, *np; + for (lp = NULL, np = pathp; *np; np++) + if ((*np) == '/') + lp = np+1; + if (lp != NULL) + pathp = lp; + } + rc = it(p, pathp, depth, data); if (rc != 0) break; } while(1); @@ -689,17 +705,21 @@ const char *nstr; p += 4; + if (tag == OF_DT_NOP) + continue; if (tag != OF_DT_PROP) return NULL; sz = *((u32 *)p); noff = *((u32 *)(p + 4)); p += 8; - p = _ALIGN(p, sz >= 8 ? 8 : 4); + if (initial_boot_params->version < 0x10) + p = _ALIGN(p, sz >= 8 ? 8 : 4); nstr = find_flat_dt_string(noff); if (nstr == NULL) { - printk(KERN_WARNING "Can't find property index name !\n"); + printk(KERN_WARNING "Can't find property index" + " name !\n"); return NULL; } if (strcmp(name, nstr) == 0) { @@ -713,7 +733,7 @@ } static void *__init unflatten_dt_alloc(unsigned long *mem, unsigned long size, - unsigned long align) + unsigned long align) { void *res; @@ -727,13 +747,16 @@ static unsigned long __init unflatten_dt_node(unsigned long mem, unsigned long *p, struct device_node *dad, - struct device_node ***allnextpp) + struct device_node ***allnextpp, + unsigned long fpsize) { struct device_node *np; struct property *pp, **prev_pp = NULL; char *pathp; u32 tag; - unsigned int l; + unsigned int l, allocl; + int has_name = 0; + int new_format = 0; tag = *((u32 *)(*p)); if (tag != OF_DT_BEGIN_NODE) { @@ -742,21 +765,62 @@ } *p += 4; pathp = (char *)*p; - l = strlen(pathp) + 1; + l = allocl = strlen(pathp) + 1; *p = _ALIGN(*p + l, 4); - np = unflatten_dt_alloc(&mem, sizeof(struct 
device_node) + l, + /* version 0x10 has a more compact unit name here instead of the full + * path. we accumulate the full path size using "fpsize", we'll rebuild + * it later. We detect this because the first character of the name is + * not '/'. + */ + if ((*pathp) != '/') { + new_format = 1; + if (fpsize == 0) { + /* root node: special case. fpsize accounts for path + * plus terminating zero. root node only has '/', so + * fpsize should be 2, but we want to avoid the first + * level nodes to have two '/' so we use fpsize 1 here + */ + fpsize = 1; + allocl = 2; + } else { + /* account for '/' and path size minus terminal 0 + * already in 'l' + */ + fpsize += l; + allocl = fpsize; + } + } + + + np = unflatten_dt_alloc(&mem, sizeof(struct device_node) + allocl, __alignof__(struct device_node)); if (allnextpp) { memset(np, 0, sizeof(*np)); np->full_name = ((char*)np) + sizeof(struct device_node); - memcpy(np->full_name, pathp, l); + if (new_format) { + char *p = np->full_name; + /* rebuild full path for new format */ + if (dad && dad->parent) { + strcpy(p, dad->full_name); +#ifdef DEBUG + if ((strlen(p) + l + 1) != allocl) { + DBG("%s: p: %d, l: %d, a: %d\n", + pathp, strlen(p), l, allocl); + } +#endif + p += strlen(p); + } + *(p++) = '/'; + memcpy(p, pathp, l); + } else + memcpy(np->full_name, pathp, l); prev_pp = &np->properties; **allnextpp = np; *allnextpp = &np->allnext; if (dad != NULL) { np->parent = dad; - /* we temporarily use the `next' field as `last_child'. */ + /* we temporarily use the next field as `last_child'*/ if (dad->next == 0) dad->child = np; else @@ -770,18 +834,26 @@ char *pname; tag = *((u32 *)(*p)); + if (tag == OF_DT_NOP) { + *p += 4; + continue; + } if (tag != OF_DT_PROP) break; *p += 4; sz = *((u32 *)(*p)); noff = *((u32 *)((*p) + 4)); - *p = _ALIGN((*p) + 8, sz >= 8 ? 8 : 4); + *p += 8; + if (initial_boot_params->version < 0x10) + *p = _ALIGN(*p, sz >= 8 ? 
8 : 4); pname = find_flat_dt_string(noff); if (pname == NULL) { printk("Can't find property name in list !\n"); break; } + if (strcmp(pname, "name") == 0) + has_name = 1; l = strlen(pname) + 1; pp = unflatten_dt_alloc(&mem, sizeof(struct property), __alignof__(struct property)); @@ -801,6 +873,36 @@ } *p = _ALIGN((*p) + sz, 4); } + /* with version 0x10 we may not have the name property, recreate + * it here from the unit name if absent + */ + if (!has_name) { + char *p = pathp, *ps = pathp, *pa = NULL; + int sz; + + while (*p) { + if ((*p) == '@') + pa = p; + if ((*p) == '/') + ps = p + 1; + p++; + } + if (pa < ps) + pa = p; + sz = (pa - ps) + 1; + pp = unflatten_dt_alloc(&mem, sizeof(struct property) + sz, + __alignof__(struct property)); + if (allnextpp) { + pp->name = "name"; + pp->length = sz; + pp->value = (unsigned char *)(pp + 1); + *prev_pp = pp; + prev_pp = &pp->next; + memcpy(pp->value, ps, sz - 1); + ((char *)pp->value)[sz - 1] = 0; + DBG("fixed up name for %s -> %s\n", pathp, pp->value); + } + } if (allnextpp) { *prev_pp = NULL; np->name = get_property(np, "name", NULL); @@ -812,7 +914,7 @@ np->type = ""; } while (tag == OF_DT_BEGIN_NODE) { - mem = unflatten_dt_node(mem, p, np, allnextpp); + mem = unflatten_dt_node(mem, p, np, allnextpp, fpsize); tag = *((u32 *)(*p)); } if (tag != OF_DT_END_NODE) { @@ -842,21 +944,27 @@ /* First pass, scan for size */ start = ((unsigned long)initial_boot_params) + initial_boot_params->off_dt_struct; - size = unflatten_dt_node(0, &start, NULL, NULL); + size = unflatten_dt_node(0, &start, NULL, NULL, 0); + size = (size | 3) + 1; DBG(" size is %lx, allocating...\n", size); /* Allocate memory for the expanded device tree */ - mem = (unsigned long)abs_to_virt(lmb_alloc(size, + mem = (unsigned long)abs_to_virt(lmb_alloc(size + 4, __alignof__(struct device_node))); + ((u32 *)mem)[size / 4] = 0xdeadbeef; + DBG(" unflattening...\n", mem); /* Second pass, do actual unflattening */ start = ((unsigned long)initial_boot_params) + 
initial_boot_params->off_dt_struct; - unflatten_dt_node(mem, &start, NULL, &allnextp); + unflatten_dt_node(mem, &start, NULL, &allnextp, 0); if (*((u32 *)start) != OF_DT_END) - printk(KERN_WARNING "Weird tag at end of tree: %x\n", *((u32 *)start)); + printk(KERN_WARNING "Weird tag at end of tree: %08x\n", *((u32 *)start)); + if (((u32 *)mem)[size / 4] != 0xdeadbeef) + printk(KERN_WARNING "End of tree marker overwritten: %08x\n", + ((u32 *)mem)[size / 4] ); *allnextp = NULL; /* Get pointer to OF "/chosen" node for use everywhere */ @@ -880,7 +988,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void *data) { char *type = get_flat_dt_prop(node, "device_type", NULL); u32 *prop; @@ -933,13 +1041,15 @@ } static int __init early_init_dt_scan_chosen(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void *data) { u32 *prop; u64 *prop64; extern unsigned long memory_limit, tce_alloc_start, tce_alloc_end; - if (strcmp(full_path, "/chosen") != 0) + DBG("search \"chosen\", depth: %d, uname: %s\n", depth, uname); + + if (depth != 1 || strcmp(uname, "chosen") != 0) return 0; /* get platform type */ @@ -989,18 +1099,20 @@ } static int __init early_init_dt_scan_root(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void *data) { u32 *prop; - if (strcmp(full_path, "/") != 0) + if (depth != 0) return 0; prop = (u32 *)get_flat_dt_prop(node, "#size-cells", NULL); dt_root_size_cells = (prop == NULL) ? 1 : *prop; - + DBG("dt_root_size_cells = %x\n", dt_root_size_cells); + prop = (u32 *)get_flat_dt_prop(node, "#address-cells", NULL); dt_root_addr_cells = (prop == NULL) ? 
2 : *prop; + DBG("dt_root_addr_cells = %x\n", dt_root_addr_cells); /* break now */ return 1; @@ -1028,7 +1140,7 @@ static int __init early_init_dt_scan_memory(unsigned long node, - const char *full_path, void *data) + const char *uname, int depth, void *data) { char *type = get_flat_dt_prop(node, "device_type", NULL); cell_t *reg, *endp; @@ -1044,7 +1156,9 @@ endp = reg + (l / sizeof(cell_t)); - DBG("memory scan node %s ...\n", full_path); + DBG("memory scan node %s ..., reg size %ld, data: %x %x %x %x, ...\n", + uname, l, reg[0], reg[1], reg[2], reg[3]); + while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) { unsigned long base, size; @@ -1455,10 +1569,11 @@ struct device_node *np = allnodes; read_lock(&devtree_lock); - for (; np != 0; np = np->allnext) + for (; np != 0; np = np->allnext) { if (np->full_name != 0 && strcasecmp(np->full_name, path) == 0 && of_node_get(np)) break; + } read_unlock(&devtree_lock); return np; } Index: linux-work/include/asm-ppc64/prom.h =================================================================== --- linux-work.orig/include/asm-ppc64/prom.h 2005-06-02 08:05:58.000000000 +1000 +++ linux-work/include/asm-ppc64/prom.h 2005-06-03 16:58:01.000000000 +1000 @@ -22,13 +22,15 @@ #define RELOC(x) (*PTRRELOC(&(x))) /* Definitions used by the flattened device tree */ -#define OF_DT_HEADER 0xd00dfeed /* 4: version, 4: total size */ -#define OF_DT_BEGIN_NODE 0x1 /* Start node: full name */ +#define OF_DT_HEADER 0xd00dfeed /* marker */ +#define OF_DT_BEGIN_NODE 0x1 /* Start of node, full name */ #define OF_DT_END_NODE 0x2 /* End node */ -#define OF_DT_PROP 0x3 /* Property: name off, size, content */ +#define OF_DT_PROP 0x3 /* Property: name off, size, + * content */ +#define OF_DT_NOP 0x4 /* nop */ #define OF_DT_END 0x9 -#define OF_DT_VERSION 1 +#define OF_DT_VERSION 0x10 /* * This is what gets passed to the kernel by prom_init or kexec @@ -54,7 +56,9 @@ u32 version; /* format version */ u32 last_comp_version; /* last compatible 
version */ /* version 2 fields below */ - u32 boot_cpuid_phys; /* Which physical CPU id we're booting on */ + u32 boot_cpuid_phys; /* Physical CPU id we're booting on */ + /* version 3 fields below */ + u32 dt_strings_size; /* size of the DT strings block */ }; Index: linux-work/arch/ppc64/kernel/smp.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/smp.c 2005-06-03 16:52:28.000000000 +1000 +++ linux-work/arch/ppc64/kernel/smp.c 2005-06-03 16:58:01.000000000 +1000 @@ -15,7 +15,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include From strosake at austin.ibm.com Fri Jun 3 13:44:48 2005 From: strosake at austin.ibm.com (Mike Strosaker) Date: Thu, 02 Jun 2005 22:44:48 -0500 Subject: [PATCH] correct printing to op panel In-Reply-To: <17052.58068.511994.89604@cargo.ozlabs.ibm.com> References: <429CDC42.30905@austin.ibm.com> <17052.58068.511994.89604@cargo.ozlabs.ibm.com> Message-ID: <429FD230.9040303@austin.ibm.com> Hi, Paul: Paul Mackerras wrote: > I want to think about this one a bit more. The ppc_md.progress() > calls aren't only used for the i/pSeries op panels, and I need to > think about the effect of \f on the other progress implementations. > It would be best, I think, if the outputting of \f could be done by > the pSeries progress function rather than the caller, if possible. It turns out that there's an ibm,form-feed property; this modified patch uses it in the pSeries-specific progress routine. This patch also checks the number of rows and the specific width of each row (the second row on power5 systems can actually hold 80 characters). If the displayed text is too wide for the physical display, it can be viewed in the ASM menus, or by selecting option 14 on the op panel. Signed-off-by: Mike Strosaker -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: oppanel.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050602/3f8a6213/attachment.txt From sfr at canb.auug.org.au Fri Jun 3 17:58:19 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 17:58:19 +1000 Subject: [PATCH 0/10] ppc64 iSeries: header file cleanups Message-ID: <20050603175819.3d143a07.sfr@canb.auug.org.au> Hi Andrew, This set of patches does some long needed cleanup on the legacy iSeries header files. A lot of it is white space fixes and comment reformatting. Some is just simplification of code in inline functions and we actually get rid of 3 files. There are no StudleyCaps changes in here, that can wait until I have reduced the actual files to a minimum. arch/ppc64/kernel/HvLpEvent.c | 2 arch/ppc64/kernel/ItLpQueue.c | 1 arch/ppc64/kernel/asm-offsets.c | 1 arch/ppc64/kernel/iSeries_VpdInfo.c | 1 arch/ppc64/kernel/iSeries_pci.c | 2 arch/ppc64/kernel/iSeries_proc.c | 3 arch/ppc64/kernel/iSeries_setup.c | 6 arch/ppc64/kernel/iSeries_smp.c | 2 arch/ppc64/kernel/idle.c | 5 arch/ppc64/kernel/irq.c | 2 arch/ppc64/kernel/lparcfg.c | 2 arch/ppc64/kernel/mf.c | 1 arch/ppc64/kernel/ras.c | 1 arch/ppc64/kernel/rtc.c | 2 arch/ppc64/kernel/setup.c | 3 arch/ppc64/kernel/viopath.c | 10 include/asm-ppc64/iSeries/HvCall.h | 156 +------- include/asm-ppc64/iSeries/HvCallCfg.h | 213 ----------- include/asm-ppc64/iSeries/HvCallEvent.h | 94 +---- include/asm-ppc64/iSeries/HvCallHpt.h | 128 ++---- include/asm-ppc64/iSeries/HvCallPci.h | 486 +++++++++----------------- include/asm-ppc64/iSeries/HvCallSc.h | 40 +- include/asm-ppc64/iSeries/HvCallSm.h | 36 - include/asm-ppc64/iSeries/HvCallXm.h | 114 ++---- include/asm-ppc64/iSeries/HvLpConfig.h | 298 ++++----------- include/asm-ppc64/iSeries/HvLpEvent.h | 122 +++--- include/asm-ppc64/iSeries/HvReleaseData.h | 78 ++-- include/asm-ppc64/iSeries/HvTypes.h | 112 ++--- include/asm-ppc64/iSeries/IoHriMainStore.h | 33 - include/asm-ppc64/iSeries/IoHriProcessorVpd.h | 32 - 
include/asm-ppc64/iSeries/ItExtVpdPanel.h | 54 +- include/asm-ppc64/iSeries/ItIplParmsReal.h | 99 ++--- include/asm-ppc64/iSeries/ItLpNaca.h | 46 +- include/asm-ppc64/iSeries/ItLpQueue.h | 86 ++-- include/asm-ppc64/iSeries/ItLpRegSave.h | 41 +- include/asm-ppc64/iSeries/ItSpCommArea.h | 10 include/asm-ppc64/iSeries/ItVpdAreas.h | 135 +++---- include/asm-ppc64/iSeries/LparData.h | 49 -- include/asm-ppc64/iSeries/LparMap.h | 44 +- include/asm-ppc64/iSeries/XmPciLpEvent.h | 15 include/asm-ppc64/iSeries/iSeries_io.h | 59 +-- include/asm-ppc64/iSeries/iSeries_irq.h | 18 include/asm-ppc64/iSeries/iSeries_pci.h | 161 ++++---- include/asm-ppc64/iSeries/iSeries_proc.h | 24 - include/asm-ppc64/iSeries/mf.h | 5 include/asm-ppc64/iSeries/vio.h | 57 +-- include/asm-ppc64/paca.h | 2 47 files changed, 1018 insertions(+), 1873 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From sfr at canb.auug.org.au Fri Jun 3 18:04:17 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:04:17 +1000 Subject: [PATCH 2/10] ppc64 iSeries: header file white space cleanups In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: <20050603180417.4cfb2416.sfr@canb.auug.org.au> Hi Andrew, This patch just contains white space and comment cleanups in the iSeries headers files. There are no semantic changes. 
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCall.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCall.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCall.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCall.h 2005-06-01 14:51:07.000000000 +1000 @@ -1,34 +1,28 @@ /* * HvCall.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ - -//=========================================================================== -// -// This file contains the "hypervisor call" interface which is used to -// drive the hypervisor from the OS. -// -//=========================================================================== +/* + * This file contains the "hypervisor call" interface which is used to + * drive the hypervisor from the OS. 
+ */ #ifndef _HVCALL_H #define _HVCALL_H -//------------------------------------------------------------------- -// Standard Includes -//------------------------------------------------------------------- #include #include #include @@ -76,9 +70,9 @@ enum HvCall_VaryOffChunkRc */ /* Type of yield for HvCallBaseYieldProcessor */ -#define HvCall_YieldTimed 0 // Yield until specified time (tb) -#define HvCall_YieldToActive 1 // Yield until all active procs have run -#define HvCall_YieldToProc 2 // Yield until the specified processor has run +#define HvCall_YieldTimed 0 /* Yield until specified time (tb) */ +#define HvCall_YieldToActive 1 /* Yield until all active procs have run */ +#define HvCall_YieldToProc 2 /* Yield until the specified processor has run */ /* interrupt masks for setEnabledInterrupts */ #define HvCall_MaskIPI 0x00000001 @@ -86,7 +80,7 @@ enum HvCall_VaryOffChunkRc #define HvCall_MaskLpProd 0x00000004 #define HvCall_MaskTimeout 0x00000008 -/* Log buffer formats */ +/* Log buffer formats */ #define HvCall_LogBuffer_ASCII 0 #define HvCall_LogBuffer_EBCDIC 1 @@ -95,7 +89,7 @@ enum HvCall_VaryOffChunkRc #define HvCallBaseGetHwPatch HvCallBase + 2 #define HvCallBaseReIplSpAttn HvCallBase + 3 #define HvCallBaseSetASR HvCallBase + 4 -#define HvCallBaseSetASRAndRfi HvCallBase + 5 +#define HvCallBaseSetASRAndRfi HvCallBase + 5 #define HvCallBaseSetIMR HvCallBase + 6 #define HvCallBaseSendIPI HvCallBase + 7 #define HvCallBaseTerminateMachine HvCallBase + 8 @@ -115,81 +109,75 @@ enum HvCall_VaryOffChunkRc #define HvCallBaseGetLogBufferCodePage HvCallBase + 22 #define HvCallBaseGetLogBufferFormat HvCallBase + 23 #define HvCallBaseGetLogBufferLength HvCallBase + 24 -#define HvCallBaseReadLogBuffer HvCallBase + 25 +#define HvCallBaseReadLogBuffer HvCallBase + 25 #define HvCallBaseSetLogBufferFormatAndCodePage HvCallBase + 26 -#define HvCallBaseWriteLogBuffer HvCallBase + 27 +#define HvCallBaseWriteLogBuffer HvCallBase + 27 #define HvCallBaseRouter28 HvCallBase + 28 
#define HvCallBaseRouter29 HvCallBase + 29 #define HvCallBaseRouter30 HvCallBase + 30 -#define HvCallBaseSetDebugBus HvCallBase + 31 +#define HvCallBaseSetDebugBus HvCallBase + 31 -#define HvCallCcSetDABR HvCallCc + 7 +#define HvCallCcSetDABR HvCallCc + 7 -//===================================================================================== -static inline void HvCall_setVirtualDecr(void) +static inline void HvCall_setVirtualDecr(void) { - /* Ignore any error return codes - most likely means that the target value for the - * LP has been increased and this vary off would bring us below the new target. */ + /* + * Ignore any error return codes - most likely means that the + * target value for the LP has been increased and this vary off + * would bring us below the new target. + */ HvCall0(HvCallBaseSetVirtualDecr); } -//===================================================================== -static inline void HvCall_yieldProcessor(unsigned typeOfYield, u64 yieldParm) + +static inline void HvCall_yieldProcessor(unsigned typeOfYield, u64 yieldParm) { - HvCall2( HvCallBaseYieldProcessor, typeOfYield, yieldParm ); + HvCall2(HvCallBaseYieldProcessor, typeOfYield, yieldParm); } -//===================================================================== -static inline void HvCall_setEnabledInterrupts(u64 enabledInterrupts) + +static inline void HvCall_setEnabledInterrupts(u64 enabledInterrupts) { - HvCall1(HvCallBaseSetEnabledInterrupts,enabledInterrupts); + HvCall1(HvCallBaseSetEnabledInterrupts, enabledInterrupts); } -//===================================================================== -static inline void HvCall_clearLogBuffer(HvLpIndex lpindex) +static inline void HvCall_clearLogBuffer(HvLpIndex lpindex) { - HvCall1(HvCallBaseClearLogBuffer,lpindex); + HvCall1(HvCallBaseClearLogBuffer, lpindex); } -//===================================================================== -static inline u32 HvCall_getLogBufferCodePage(HvLpIndex lpindex) +static inline u32 
HvCall_getLogBufferCodePage(HvLpIndex lpindex) { - u32 retVal = HvCall1(HvCallBaseGetLogBufferCodePage,lpindex); + u32 retVal = HvCall1(HvCallBaseGetLogBufferCodePage, lpindex); return retVal; } -//===================================================================== -static inline int HvCall_getLogBufferFormat(HvLpIndex lpindex) +static inline int HvCall_getLogBufferFormat(HvLpIndex lpindex) { - int retVal = HvCall1(HvCallBaseGetLogBufferFormat,lpindex); + int retVal = HvCall1(HvCallBaseGetLogBufferFormat, lpindex); return retVal; } -//===================================================================== -static inline u32 HvCall_getLogBufferLength(HvLpIndex lpindex) +static inline u32 HvCall_getLogBufferLength(HvLpIndex lpindex) { - u32 retVal = HvCall1(HvCallBaseGetLogBufferLength,lpindex); + u32 retVal = HvCall1(HvCallBaseGetLogBufferLength, lpindex); return retVal; } -//===================================================================== -static inline void HvCall_setLogBufferFormatAndCodepage(int format, u32 codePage) +static inline void HvCall_setLogBufferFormatAndCodepage(int format, u32 codePage) { - HvCall2(HvCallBaseSetLogBufferFormatAndCodePage,format, codePage); + HvCall2(HvCallBaseSetLogBufferFormatAndCodePage, format, codePage); } -//===================================================================== -int HvCall_readLogBuffer(HvLpIndex lpindex, void *buffer, u64 bufLen); -void HvCall_writeLogBuffer(const void *buffer, u64 bufLen); +extern int HvCall_readLogBuffer(HvLpIndex lpindex, void *buffer, u64 bufLen); +extern void HvCall_writeLogBuffer(const void *buffer, u64 bufLen); -//===================================================================== -static inline void HvCall_sendIPI(struct paca_struct * targetPaca) +static inline void HvCall_sendIPI(struct paca_struct *targetPaca) { - HvCall1( HvCallBaseSendIPI, targetPaca->paca_index ); + HvCall1(HvCallBaseSendIPI, targetPaca->paca_index); } 
-//===================================================================== -static inline void HvCall_terminateMachineSrc(void) +static inline void HvCall_terminateMachineSrc(void) { - HvCall0( HvCallBaseTerminateMachineSrc ); + HvCall0(HvCallBaseTerminateMachineSrc); } static inline void HvCall_setDABR(unsigned long val) diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallCfg.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallCfg.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallCfg.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-03 14:03:07.000000000 +1000 @@ -1,43 +1,32 @@ /* * HvCallCfg.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ - -//===================================================================================== -// -// This file contains the "hypervisor call" interface which is used to -// drive the hypervisor from the OS. -// -//===================================================================================== +/* + * This file contains the "hypervisor call" interface which is used to + * drive the hypervisor from the OS. 
+ */ #ifndef _HVCALLCFG_H #define _HVCALLCFG_H -//------------------------------------------------------------------- -// Standard Includes -//------------------------------------------------------------------- #include #include -//------------------------------------------------------------------------------------- -// Constants -//------------------------------------------------------------------------------------- - -enum HvCallCfg_ReqQual -{ +enum HvCallCfg_ReqQual { HvCallCfg_Cur = 0, HvCallCfg_Init = 1, HvCallCfg_Max = 2, @@ -49,7 +38,7 @@ enum HvCallCfg_ReqQual #define HvCallCfgGetLpVrmIndex HvCallCfg + 2 #define HvCallCfgGetLpMinSupportedPlicVrmIndex HvCallCfg + 3 #define HvCallCfgGetLpMinCompatablePlicVrmIndex HvCallCfg + 4 -#define HvCallCfgGetLpVrmName HvCallCfg + 5 +#define HvCallCfgGetLpVrmName HvCallCfg + 5 #define HvCallCfgGetSystemPhysicalProcessors HvCallCfg + 6 #define HvCallCfgGetPhysicalProcessors HvCallCfg + 7 #define HvCallCfgGetSystemMsChunks HvCallCfg + 8 @@ -76,108 +65,113 @@ enum HvCallCfg_ReqQual #define HvCallCfgSetMinRuntimeMsChunks HvCallCfg + 29 #define HvCallCfgGetVirtualLanIndexMap HvCallCfg + 30 #define HvCallCfgGetLpExecutionMode HvCallCfg + 31 -#define HvCallCfgGetHostingLpIndex HvCallCfg + 32 +#define HvCallCfgGetHostingLpIndex HvCallCfg + 32 -//==================================================================== static inline HvLpIndex HvCallCfg_getLps(void) { HvLpIndex retVal = HvCall0(HvCallCfgGetLps); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//==================================================================== -static inline int HvCallCfg_isBusDedicated(u64 busIndex) + +static inline int HvCallCfg_isBusDedicated(u64 busIndex) { int retVal = HvCall1(HvCallCfgIsBusDedicated,busIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//==================================================================== + static inline HvLpIndex HvCallCfg_getBusOwner(u64 busIndex) { HvLpIndex retVal = 
HvCall1(HvCallCfgGetBusOwner,busIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//==================================================================== -static inline HvLpIndexMap HvCallCfg_getBusAllocation(u64 busIndex) + +static inline HvLpIndexMap HvCallCfg_getBusAllocation(u64 busIndex) { HvLpIndexMap retVal = HvCall1(HvCallCfgGetBusAllocation,busIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//==================================================================== -static inline HvLpIndexMap HvCallCfg_getActiveLpMap(void) + +static inline HvLpIndexMap HvCallCfg_getActiveLpMap(void) { HvLpIndexMap retVal = HvCall0(HvCallCfgGetActiveLpMap); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//==================================================================== -static inline HvLpVirtualLanIndexMap HvCallCfg_getVirtualLanIndexMap(HvLpIndex lp) + +static inline HvLpVirtualLanIndexMap HvCallCfg_getVirtualLanIndexMap( + HvLpIndex lp) { - // This is a new function in V5R1 so calls to this on older - // hypervisors will return -1 + /* + * This is a new function in V5R1 so calls to this on older + * hypervisors will return -1 + */ u64 retVal = HvCall1(HvCallCfgGetVirtualLanIndexMap, lp); - if(retVal == -1) + if (retVal == -1) retVal = 0; // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//=================================================================== -static inline u64 HvCallCfg_getSystemMsChunks(void) + +static inline u64 HvCallCfg_getSystemMsChunks(void) { u64 retVal = HvCall0(HvCallCfgGetSystemMsChunks); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//=================================================================== -static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp,enum HvCallCfg_ReqQual qual) + +static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp, + enum HvCallCfg_ReqQual qual) { u64 retVal = HvCall2(HvCallCfgGetMsChunks,lp,qual); // 
getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//=================================================================== -static inline u64 HvCallCfg_getMinRuntimeMsChunks(HvLpIndex lp) + +static inline u64 HvCallCfg_getMinRuntimeMsChunks(HvLpIndex lp) { - // NOTE: This function was added in v5r1 so older hypervisors will return a -1 value - u64 retVal = HvCall1(HvCallCfgGetMinRuntimeMsChunks,lp); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + /* + * NOTE: This function was added in v5r1 so older hypervisors + * will return a -1 value + */ + return HvCall1(HvCallCfgGetMinRuntimeMsChunks, lp); } -//=================================================================== -static inline u64 HvCallCfg_setMinRuntimeMsChunks(u64 chunks) + +static inline u64 HvCallCfg_setMinRuntimeMsChunks(u64 chunks) { u64 retVal = HvCall1(HvCallCfgSetMinRuntimeMsChunks,chunks); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//=================================================================== -static inline u64 HvCallCfg_getSystemPhysicalProcessors(void) + +static inline u64 HvCallCfg_getSystemPhysicalProcessors(void) { u64 retVal = HvCall0(HvCallCfgGetSystemPhysicalProcessors); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//=================================================================== -static inline u64 HvCallCfg_getPhysicalProcessors(HvLpIndex lp,enum HvCallCfg_ReqQual qual) + +static inline u64 HvCallCfg_getPhysicalProcessors(HvLpIndex lp, + enum HvCallCfg_ReqQual qual) { u64 retVal = HvCall2(HvCallCfgGetPhysicalProcessors,lp,qual); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//=================================================================== -static inline u64 HvCallCfg_getConfiguredBusUnitsForInterruptProc(HvLpIndex lp, - u16 hvLogicalProcIndex) + +static inline u64 HvCallCfg_getConfiguredBusUnitsForInterruptProc(HvLpIndex lp, + u16 hvLogicalProcIndex) { u64 retVal = 
HvCall2(HvCallCfgGetConfiguredBusUnitsForIntProc,lp,hvLogicalProcIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//================================================================== -static inline HvLpSharedPoolIndex HvCallCfg_getSharedPoolIndex(HvLpIndex lp) + +static inline HvLpSharedPoolIndex HvCallCfg_getSharedPoolIndex(HvLpIndex lp) { HvLpSharedPoolIndex retVal = HvCall1(HvCallCfgGetSharedPoolIndex,lp); @@ -185,29 +179,29 @@ static inline HvLpSharedPoolIndex HvCall return retVal; } -//================================================================== -static inline u64 HvCallCfg_getSharedProcUnits(HvLpIndex lp,enum HvCallCfg_ReqQual qual) + +static inline u64 HvCallCfg_getSharedProcUnits(HvLpIndex lp, + enum HvCallCfg_ReqQual qual) { u64 retVal = HvCall2(HvCallCfgGetSharedProcUnits,lp,qual); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//================================================================== -static inline u64 HvCallCfg_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) + +static inline u64 HvCallCfg_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { u16 retVal = HvCall1(HvCallCfgGetNumProcsInSharedPool,sPI); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//================================================================== + static inline HvLpIndex HvCallCfg_getHostingLpIndex(HvLpIndex lp) { u64 retVal = HvCall1(HvCallCfgGetHostingLpIndex,lp); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; - } #endif /* _HVCALLCFG_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallEvent.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallEvent.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallEvent.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallEvent.h 2005-06-03 14:04:46.000000000 +1000 @@ -1,32 +1,28 @@ /* * HvCallEvent.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is 
free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ - /* - * This file contains the "hypervisor call" interface which is used to - * drive the hypervisor from the OS. + * This file contains the "hypervisor call" interface which is used to + * drive the hypervisor from the OS. */ #ifndef _HVCALLEVENT_H #define _HVCALLEVENT_H -/* - * Standard Includes - */ #include #include #include @@ -71,7 +67,7 @@ typedef u64 HvLpDma_Rc; #define HvCallEventCloseLpEventPath HvCallEvent + 2 #define HvCallEventDmaBufList HvCallEvent + 3 #define HvCallEventDmaSingle HvCallEvent + 4 -#define HvCallEventDmaToSp HvCallEvent + 5 +#define HvCallEventDmaToSp HvCallEvent + 5 #define HvCallEventGetOverflowLpEvents HvCallEvent + 6 #define HvCallEventGetSourceLpInstanceId HvCallEvent + 7 #define HvCallEventGetTargetLpInstanceId HvCallEvent + 8 @@ -85,13 +81,13 @@ typedef u64 HvLpDma_Rc; static inline void HvCallEvent_getOverflowLpEvents(u8 queueIndex) { - HvCall1(HvCallEventGetOverflowLpEvents,queueIndex); + HvCall1(HvCallEventGetOverflowLpEvents, queueIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallEvent_setInterLpQueueIndex(u8 queueIndex) { - HvCall1(HvCallEventSetInterLpQueueIndex,queueIndex); + HvCall1(HvCallEventSetInterLpQueueIndex, queueIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } @@ -138,7 +134,7 @@ static 
inline HvLpEvent_Rc HvCallEvent_s { HvLpEvent_Rc retVal; - // Pack the misc bits into a single Dword to pass to PLIC + /* Pack the misc bits into a single Dword to pass to PLIC */ union { struct HvCallEvent_PackedParms parms; u64 dword; @@ -225,7 +221,7 @@ static inline HvLpDma_Rc HvCallEvent_dma u64 localBufList, u64 remoteBufList, u32 transferLength) { HvLpDma_Rc retVal; - // Pack the misc bits into a single Dword to pass to PLIC + /* Pack the misc bits into a single Dword to pass to PLIC */ union { struct HvCallEvent_PackedDmaParms parms; u64 dword; @@ -257,7 +253,7 @@ static inline HvLpDma_Rc HvCallEvent_dma u64 localAddrOrTce, u64 remoteAddrOrTce, u32 transferLength) { HvLpDma_Rc retVal; - // Pack the misc bits into a single Dword to pass to PLIC + /* Pack the misc bits into a single Dword to pass to PLIC */ union { struct HvCallEvent_PackedDmaParms parms; u64 dword; @@ -280,7 +276,7 @@ static inline HvLpDma_Rc HvCallEvent_dma return retVal; } -static inline HvLpDma_Rc HvCallEvent_dmaToSp(void* local, u32 remote, +static inline HvLpDma_Rc HvCallEvent_dmaToSp(void *local, u32 remote, u32 length, HvLpDma_Direction dir) { u64 abs_addr; @@ -293,5 +289,4 @@ static inline HvLpDma_Rc HvCallEvent_dma return retVal; } - #endif /* _HVCALLEVENT_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallHpt.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallHpt.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallHpt.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallHpt.h 2005-06-03 14:06:15.000000000 +1000 @@ -1,17 +1,17 @@ /* * HvCallHpt.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. 
- * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,21 +19,15 @@ #ifndef _HVCALLHPT_H #define _HVCALLHPT_H -//============================================================================ -// -// This file contains the "hypervisor call" interface which is used to -// drive the hypervisor from the OS. -// -//============================================================================ +/* + * This file contains the "hypervisor call" interface which is used to + * drive the hypervisor from the OS. + */ #include #include #include -//----------------------------------------------------------------------------- -// Constants -//----------------------------------------------------------------------------- - #define HvCallHptGetHptAddress HvCallHpt + 0 #define HvCallHptGetHptPages HvCallHpt + 1 #define HvCallHptSetPp HvCallHpt + 5 @@ -47,81 +41,76 @@ #define HvCallHptInvalidateSetSwBitsGet HvCallHpt + 18 -//============================================================================ -static inline u64 HvCallHpt_getHptAddress(void) +static inline u64 HvCallHpt_getHptAddress(void) { u64 retval = HvCall0(HvCallHptGetHptAddress); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retval; } -//============================================================================ -static inline u64 HvCallHpt_getHptPages(void) -{ + +static inline u64 HvCallHpt_getHptPages(void) +{ u64 retval = HvCall0(HvCallHptGetHptPages); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retval; } -//============================================================================= -static 
inline void HvCallHpt_setPp(u32 hpteIndex, u8 value) + +static inline void HvCallHpt_setPp(u32 hpteIndex, u8 value) { - HvCall2( HvCallHptSetPp, hpteIndex, value ); + HvCall2(HvCallHptSetPp, hpteIndex, value); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } -//============================================================================= -static inline void HvCallHpt_setSwBits(u32 hpteIndex, u8 bitson, u8 bitsoff ) + +static inline void HvCallHpt_setSwBits(u32 hpteIndex, u8 bitson, u8 bitsoff) { - HvCall3( HvCallHptSetSwBits, hpteIndex, bitson, bitsoff ); + HvCall3(HvCallHptSetSwBits, hpteIndex, bitson, bitsoff); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } -//============================================================================= -static inline void HvCallHpt_invalidateNoSyncICache(u32 hpteIndex) - + +static inline void HvCallHpt_invalidateNoSyncICache(u32 hpteIndex) { - HvCall1( HvCallHptInvalidateNoSyncICache, hpteIndex ); + HvCall1(HvCallHptInvalidateNoSyncICache, hpteIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } -//============================================================================= -static inline u64 HvCallHpt_invalidateSetSwBitsGet(u32 hpteIndex, u8 bitson, u8 bitsoff ) - + +static inline u64 HvCallHpt_invalidateSetSwBitsGet(u32 hpteIndex, u8 bitson, + u8 bitsoff) { u64 compressedStatus; - compressedStatus = HvCall4( HvCallHptInvalidateSetSwBitsGet, hpteIndex, bitson, bitsoff, 1 ); - HvCall1( HvCallHptInvalidateNoSyncICache, hpteIndex ); + + compressedStatus = HvCall4(HvCallHptInvalidateSetSwBitsGet, + hpteIndex, bitson, bitsoff, 1); + HvCall1(HvCallHptInvalidateNoSyncICache, hpteIndex); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return compressedStatus; } -//============================================================================= -static inline u64 HvCallHpt_findValid( HPTE *hpte, u64 vpn ) + +static inline u64 HvCallHpt_findValid(HPTE *hpte, u64 vpn) { u64 retIndex = HvCall3Ret16( HvCallHptFindValid, hpte, vpn, 0, 0 
); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retIndex; } -//============================================================================= -static inline u64 HvCallHpt_findNextValid( HPTE *hpte, u32 hpteIndex, u8 bitson, u8 bitsoff ) + +static inline u64 HvCallHpt_findNextValid(HPTE *hpte, u32 hpteIndex, + u8 bitson, u8 bitsoff) { u64 retIndex = HvCall3Ret16( HvCallHptFindNextValid, hpte, hpteIndex, bitson, bitsoff ); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retIndex; } -//============================================================================= -static inline void HvCallHpt_get( HPTE *hpte, u32 hpteIndex ) + +static inline void HvCallHpt_get(HPTE *hpte, u32 hpteIndex) { - HvCall2Ret16( HvCallHptGet, hpte, hpteIndex, 0 ); + HvCall2Ret16(HvCallHptGet, hpte, hpteIndex, 0); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } -//============================================================================ -static inline void HvCallHpt_addValidate( u32 hpteIndex, - u32 hBit, - HPTE *hpte ) - + +static inline void HvCallHpt_addValidate(u32 hpteIndex, u32 hBit, HPTE *hpte) { - HvCall4( HvCallHptAddValidate, hpteIndex, - hBit, (*((u64 *)hpte)), (*(((u64 *)hpte)+1)) ); + HvCall4(HvCallHptAddValidate, hpteIndex, hBit, (*((u64 *)hpte)), + (*(((u64 *)hpte)+1))); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } - -//============================================================================= - #endif /* _HVCALLHPT_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallPci.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallPci.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallPci.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallPci.h 2005-06-03 14:10:49.000000000 +1000 @@ -1,26 +1,26 @@ -/************************************************************************/ -/* Provides the Hypervisor PCI calls for iSeries Linux Parition. 
*/ -/* Copyright (C) 2001 */ -/* */ -/* This program is free software; you can redistribute it and/or modify */ -/* it under the terms of the GNU General Public License as published by */ -/* the Free Software Foundation; either version 2 of the License, or */ -/* (at your option) any later version. */ -/* */ -/* This program is distributed in the hope that it will be useful, */ -/* but WITHOUT ANY WARRANTY; without even the implied warranty of */ -/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */ -/* GNU General Public License for more details. */ -/* */ -/* You should have received a copy of the GNU General Public License */ -/* along with this program; if not, write to the: */ -/* Free Software Foundation, Inc., */ -/* 59 Temple Place, Suite 330, */ -/* Boston, MA 02111-1307 USA */ -/************************************************************************/ -/* Change Activity: */ -/* Created, Jan 9, 2001 */ -/************************************************************************/ +/* + * Provides the Hypervisor PCI calls for iSeries Linux Parition. + * Copyright (C) 2001 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the: + * Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, + * Boston, MA 02111-1307 USA + * + * Change Activity: + * Created, Jan 9, 2001 + */ #ifndef _HVCALLPCI_H #define _HVCALLPCI_H @@ -34,8 +34,8 @@ */ struct HvCallPci_DsaAddr { u16 busNumber; /* PHB index? */ - u8 subBusNumber; /* PCI bus number? */ - u8 deviceId; /* device and function? */ + u8 subBusNumber; /* PCI bus number? */ + u8 deviceId; /* device and function? */ u8 barNumber; u8 reserved[3]; }; @@ -52,34 +52,37 @@ struct HvCallPci_LoadReturn { enum HvCallPci_DeviceType { HvCallPci_NodeDevice = 1, - HvCallPci_SpDevice = 2, - HvCallPci_IopDevice = 3, - HvCallPci_BridgeDevice = 4, - HvCallPci_MultiFunctionDevice = 5, - HvCallPci_IoaDevice = 6 + HvCallPci_SpDevice = 2, + HvCallPci_IopDevice = 3, + HvCallPci_BridgeDevice = 4, + HvCallPci_MultiFunctionDevice = 5, + HvCallPci_IoaDevice = 6 }; struct HvCallPci_DeviceInfo { - u32 deviceType; // See DeviceType enum for values + u32 deviceType; /* See DeviceType enum for values */ }; - + struct HvCallPci_BusUnitInfo { - u32 sizeReturned; // length of data returned - u32 deviceType; // see DeviceType enum for values + u32 sizeReturned; /* length of data returned */ + u32 deviceType; /* see DeviceType enum for values */ }; struct HvCallPci_BridgeInfo { - struct HvCallPci_BusUnitInfo busUnitInfo; // Generic bus unit info - u8 subBusNumber; // Bus number of secondary bus - u8 maxAgents; // Max idsels on secondary bus - u8 maxSubBusNumber; // Max Sub Bus - u8 logicalSlotNumber; // Logical Slot Number for IOA + struct HvCallPci_BusUnitInfo busUnitInfo; /* Generic bus unit info */ + u8 subBusNumber; /* Bus number of secondary bus */ + u8 maxAgents; /* Max idsels on secondary bus */ + u8 maxSubBusNumber; /* Max Sub Bus */ + u8 logicalSlotNumber; /* Logical Slot Number for IOA */ }; - -// Maximum BusUnitInfo buffer size. 
Provided for clients so they can allocate -// a buffer big enough for any type of bus unit. Increase as needed. + +/* + * Maximum BusUnitInfo buffer size. Provided for clients so + * they can allocate a buffer big enough for any type of bus + * unit. Increase as needed. + */ enum {HvCallPci_MaxBusUnitInfoSize = 128}; struct HvCallPci_BarParms { @@ -89,12 +92,12 @@ struct HvCallPci_BarParms { u64 protectStart; u64 protectEnd; u64 relocationOffset; - u64 pciAddress; + u64 pciAddress; u64 reserved[3]; -}; +}; enum HvCallPci_VpdType { - HvCallPci_BusVpd = 1, + HvCallPci_BusVpd = 1, HvCallPci_BusAdapterVpd = 2 }; @@ -123,15 +126,13 @@ enum HvCallPci_VpdType { #define HvCallPciUnmaskInterrupts HvCallPci + 49 #define HvCallPciGetBusUnitInfo HvCallPci + 50 -//============================================================================ static inline u64 HvCallPci_configLoad8(u16 busNumber, u8 subBusNumber, - u8 deviceId, u32 offset, - u8 *value) + u8 deviceId, u32 offset, u8 *value) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumber; dsa.subBusNumber = subBusNumber; @@ -145,15 +146,14 @@ static inline u64 HvCallPci_configLoad8( return retVal.rc; } -//============================================================================ + static inline u64 HvCallPci_configLoad16(u16 busNumber, u8 subBusNumber, - u8 deviceId, u32 offset, - u16 *value) + u8 deviceId, u32 offset, u16 *value) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumber; dsa.subBusNumber = subBusNumber; @@ -167,15 +167,14 @@ static inline u64 HvCallPci_configLoad16 return retVal.rc; } -//============================================================================ -static inline u64 HvCallPci_configLoad32(u16 busNumber, u8 subBusNumber, - u8 deviceId, u32 offset, - u32 *value) + +static inline u64 HvCallPci_configLoad32(u16 busNumber, u8 
subBusNumber, + u8 deviceId, u32 offset, u32 *value) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumber; dsa.subBusNumber = subBusNumber; @@ -189,15 +188,14 @@ static inline u64 HvCallPci_configLoad32 return retVal.rc; } -//============================================================================ -static inline u64 HvCallPci_configStore8(u16 busNumber, u8 subBusNumber, - u8 deviceId, u32 offset, - u8 value) + +static inline u64 HvCallPci_configStore8(u16 busNumber, u8 subBusNumber, + u8 deviceId, u32 offset, u8 value) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumber; dsa.subBusNumber = subBusNumber; @@ -209,15 +207,14 @@ static inline u64 HvCallPci_configStore8 return retVal; } -//============================================================================ -static inline u64 HvCallPci_configStore16(u16 busNumber, u8 subBusNumber, - u8 deviceId, u32 offset, - u16 value) + +static inline u64 HvCallPci_configStore16(u16 busNumber, u8 subBusNumber, + u8 deviceId, u32 offset, u16 value) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumber; dsa.subBusNumber = subBusNumber; @@ -229,15 +226,14 @@ static inline u64 HvCallPci_configStore1 return retVal; } -//============================================================================ -static inline u64 HvCallPci_configStore32(u16 busNumber, u8 subBusNumber, - u8 deviceId, u32 offset, - u32 value) + +static inline u64 HvCallPci_configStore32(u16 busNumber, u8 subBusNumber, + u8 deviceId, u32 offset, u32 value) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumber; dsa.subBusNumber = subBusNumber; @@ -249,18 +245,15 @@ static inline u64 HvCallPci_configStore3 return retVal; } -//============================================================================ -static 
inline u64 HvCallPci_barLoad8(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 offsetParm, - u8* valueParm) + +static inline u64 HvCallPci_barLoad8(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 offsetParm, + u8 *valueParm) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -275,18 +268,15 @@ static inline u64 HvCallPci_barLoad8(u16 return retVal.rc; } -//============================================================================ -static inline u64 HvCallPci_barLoad16(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 offsetParm, - u16* valueParm) + +static inline u64 HvCallPci_barLoad16(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 offsetParm, + u16 *valueParm) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -301,18 +291,15 @@ static inline u64 HvCallPci_barLoad16(u1 return retVal.rc; } -//============================================================================ -static inline u64 HvCallPci_barLoad32(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 offsetParm, - u32* valueParm) + +static inline u64 HvCallPci_barLoad32(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 offsetParm, + u32 *valueParm) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -327,18 +314,15 @@ static inline u64 HvCallPci_barLoad32(u1 return retVal.rc; } -//============================================================================ -static inline u64 HvCallPci_barLoad64(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - 
u64 offsetParm, - u64* valueParm) + +static inline u64 HvCallPci_barLoad64(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 offsetParm, + u64 *valueParm) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -353,19 +337,16 @@ static inline u64 HvCallPci_barLoad64(u1 return retVal.rc; } -//============================================================================ -static inline u64 HvCallPci_barStore8(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 offsetParm, - u8 valueParm) + +static inline u64 HvCallPci_barStore8(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 offsetParm, + u8 valueParm) { struct HvCallPci_DsaAddr dsa; u64 retVal; *((u64*)&dsa) = 0; - + dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; @@ -377,19 +358,16 @@ static inline u64 HvCallPci_barStore8(u1 return retVal; } -//============================================================================ -static inline u64 HvCallPci_barStore16(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 offsetParm, - u16 valueParm) + +static inline u64 HvCallPci_barStore16(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 offsetParm, + u16 valueParm) { struct HvCallPci_DsaAddr dsa; u64 retVal; *((u64*)&dsa) = 0; - + dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; @@ -401,19 +379,16 @@ static inline u64 HvCallPci_barStore16(u return retVal; } -//============================================================================ -static inline u64 HvCallPci_barStore32(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 offsetParm, - u32 valueParm) + +static inline u64 HvCallPci_barStore32(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 
barNumberParm, u64 offsetParm, + u32 valueParm) { struct HvCallPci_DsaAddr dsa; u64 retVal; *((u64*)&dsa) = 0; - + dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; @@ -425,19 +400,16 @@ static inline u64 HvCallPci_barStore32(u return retVal; } -//============================================================================ -static inline u64 HvCallPci_barStore64(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 offsetParm, - u64 valueParm) + +static inline u64 HvCallPci_barStore64(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 offsetParm, + u64 valueParm) { struct HvCallPci_DsaAddr dsa; u64 retVal; *((u64*)&dsa) = 0; - + dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; @@ -449,10 +421,9 @@ static inline u64 HvCallPci_barStore64(u return retVal; } -//============================================================================ -static inline u64 HvCallPci_eoi(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm) + +static inline u64 HvCallPci_eoi(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm) { struct HvCallPci_DsaAddr dsa; struct HvCallPci_LoadReturn retVal; @@ -469,13 +440,9 @@ static inline u64 HvCallPci_eoi(u16 busN return retVal.rc; } -//============================================================================ -static inline u64 HvCallPci_getBarParms(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u8 barNumberParm, - u64 parms, - u32 sizeofParms) + +static inline u64 HvCallPci_getBarParms(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u8 barNumberParm, u64 parms, u32 sizeofParms) { struct HvCallPci_DsaAddr dsa; u64 retVal; @@ -493,16 +460,14 @@ static inline u64 HvCallPci_getBarParms( return retVal; } -//============================================================================ -static inline u64 HvCallPci_maskFisr(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u64 fisrMask) + +static 
inline u64 HvCallPci_maskFisr(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u64 fisrMask) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -514,16 +479,14 @@ static inline u64 HvCallPci_maskFisr(u16 return retVal; } -//============================================================================ -static inline u64 HvCallPci_unmaskFisr(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u64 fisrMask) + +static inline u64 HvCallPci_unmaskFisr(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u64 fisrMask) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -535,11 +498,9 @@ static inline u64 HvCallPci_unmaskFisr(u return retVal; } -//============================================================================ -static inline u64 HvCallPci_setSlotReset(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u64 onNotOff) + +static inline u64 HvCallPci_setSlotReset(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u64 onNotOff) { struct HvCallPci_DsaAddr dsa; u64 retVal; @@ -556,12 +517,9 @@ static inline u64 HvCallPci_setSlotReset return retVal; } -//============================================================================ -static inline u64 HvCallPci_getDeviceInfo(u16 busNumberParm, - u8 subBusParm, - u8 deviceNumberParm, - u64 parms, - u32 sizeofParms) + +static inline u64 HvCallPci_getDeviceInfo(u16 busNumberParm, u8 subBusParm, + u8 deviceNumberParm, u64 parms, u32 sizeofParms) { struct HvCallPci_DsaAddr dsa; u64 retVal; @@ -578,16 +536,14 @@ static inline u64 HvCallPci_getDeviceInf return retVal; } -//============================================================================ -static inline u64 HvCallPci_maskInterrupts(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u64 interruptMask) + +static inline u64 HvCallPci_maskInterrupts(u16 
busNumberParm, u8 subBusParm, + u8 deviceIdParm, u64 interruptMask) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -599,16 +555,14 @@ static inline u64 HvCallPci_maskInterrup return retVal; } -//============================================================================ -static inline u64 HvCallPci_unmaskInterrupts(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u64 interruptMask) + +static inline u64 HvCallPci_unmaskInterrupts(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u64 interruptMask) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -620,18 +574,14 @@ static inline u64 HvCallPci_unmaskInterr return retVal; } -//============================================================================ -static inline u64 HvCallPci_getBusUnitInfo(u16 busNumberParm, - u8 subBusParm, - u8 deviceIdParm, - u64 parms, - u32 sizeofParms) +static inline u64 HvCallPci_getBusUnitInfo(u16 busNumberParm, u8 subBusParm, + u8 deviceIdParm, u64 parms, u32 sizeofParms) { struct HvCallPci_DsaAddr dsa; u64 retVal; - *((u64*)&dsa) = 0; + *((u64*)&dsa) = 0; dsa.busNumber = busNumberParm; dsa.subBusNumber = subBusParm; @@ -643,9 +593,9 @@ static inline u64 HvCallPci_getBusUnitIn return retVal; } -//============================================================================ -static inline int HvCallPci_getBusVpd(u16 busNumParm, u64 destParm, u16 sizeParm) +static inline int HvCallPci_getBusVpd(u16 busNumParm, u64 destParm, + u16 sizeParm) { int xRetSize; u64 xRc = HvCall4(HvCallPciGetCardVpd, busNumParm, destParm, sizeParm, HvCallPci_BusVpd); @@ -656,9 +606,9 @@ static inline int HvCallPci_getBusVpd(u1 xRetSize = xRc & 0xFFFF; return xRetSize; } -//============================================================================ -static inline int HvCallPci_getBusAdapterVpd(u16 
busNumParm, u64 destParm, u16 sizeParm) +static inline int HvCallPci_getBusAdapterVpd(u16 busNumParm, u64 destParm, + u16 sizeParm) { int xRetSize; u64 xRc = HvCall4(HvCallPciGetCardVpd, busNumParm, destParm, sizeParm, HvCallPci_BusAdapterVpd); @@ -669,5 +619,5 @@ static inline int HvCallPci_getBusAdapte xRetSize = xRc & 0xFFFF; return xRetSize; } -//============================================================================ + #endif /* _HVCALLPCI_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallSc.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallSc.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallSc.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallSc.h 2005-06-01 15:46:19.000000000 +1000 @@ -1,17 +1,17 @@ /* * HvCallSc.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -30,22 +30,22 @@ #define HvCallSm 0x8007000000000000ul #define HvCallXm 0x8009000000000000ul -u64 HvCall0( u64 ); -u64 HvCall1( u64, u64 ); -u64 HvCall2( u64, u64, u64 ); -u64 HvCall3( u64, u64, u64, u64 ); -u64 HvCall4( u64, u64, u64, u64, u64 ); -u64 HvCall5( u64, u64, u64, u64, u64, u64 ); -u64 HvCall6( u64, u64, u64, u64, u64, u64, u64 ); -u64 HvCall7( u64, u64, u64, u64, u64, u64, u64, u64 ); +u64 HvCall0(u64); +u64 HvCall1(u64, u64); +u64 HvCall2(u64, u64, u64); +u64 HvCall3(u64, u64, u64, u64); +u64 HvCall4(u64, u64, u64, u64, u64); +u64 HvCall5(u64, u64, u64, u64, u64, u64); +u64 HvCall6(u64, u64, u64, u64, u64, u64, u64); +u64 HvCall7(u64, u64, u64, u64, u64, u64, u64, u64); -u64 HvCall0Ret16( u64, void * ); -u64 HvCall1Ret16( u64, void *, u64 ); -u64 HvCall2Ret16( u64, void *, u64, u64 ); -u64 HvCall3Ret16( u64, void *, u64, u64, u64 ); -u64 HvCall4Ret16( u64, void *, u64, u64, u64, u64 ); -u64 HvCall5Ret16( u64, void *, u64, u64, u64, u64, u64 ); -u64 HvCall6Ret16( u64, void *, u64, u64, u64, u64, u64, u64 ); -u64 HvCall7Ret16( u64, void *, u64, u64 ,u64 ,u64 ,u64 ,u64 ,u64 ); +u64 HvCall0Ret16(u64, void *); +u64 HvCall1Ret16(u64, void *, u64); +u64 HvCall2Ret16(u64, void *, u64, u64); +u64 HvCall3Ret16(u64, void *, u64, u64, u64); +u64 HvCall4Ret16(u64, void *, u64, u64, u64, u64); +u64 HvCall5Ret16(u64, void *, u64, u64, u64, u64, u64); +u64 HvCall6Ret16(u64, void *, u64, u64, u64, u64, u64, u64); +u64 HvCall7Ret16(u64, void *, u64, u64 ,u64 ,u64 ,u64 ,u64 ,u64); #endif /* _HVCALLSC_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallSm.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallSm.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallSm.h 2005-05-20 09:05:55.000000000 +1000 +++ 
linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallSm.h 2005-06-03 14:12:36.000000000 +1000 @@ -1,17 +1,17 @@ /* * HvCallSm.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,34 +19,23 @@ #ifndef _HVCALLSM_H #define _HVCALLSM_H -//============================================================================ -// -// This file contains the "hypervisor call" interface which is used to -// drive the hypervisor from the OS. -// -//============================================================================ +/* + * This file contains the "hypervisor call" interface which is used to + * drive the hypervisor from the OS. 
+ */ -//------------------------------------------------------------------- -// Standard Includes -//------------------------------------------------------------------- #include #include -//----------------------------------------------------------------------------- -// Constants -//----------------------------------------------------------------------------- - #define HvCallSmGet64BitsOfAccessMap HvCallSm + 11 - -//============================================================================ -static inline u64 HvCallSm_get64BitsOfAccessMap( - HvLpIndex lpIndex, u64 indexIntoBitMap ) +static inline u64 HvCallSm_get64BitsOfAccessMap(HvLpIndex lpIndex, + u64 indexIntoBitMap) { u64 retval = HvCall2(HvCallSmGet64BitsOfAccessMap, lpIndex, indexIntoBitMap ); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retval; } -//============================================================================ + #endif /* _HVCALLSM_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallXm.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallXm.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvCallXm.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvCallXm.h 2005-06-03 14:17:07.000000000 +1000 @@ -1,30 +1,13 @@ -//============================================================================ -// Header File Id -// Name______________: HvCallXm.H -// -// Description_______: -// -// This file contains the "hypervisor call" interface which is used to -// drive the hypervisor from SLIC. -// -//============================================================================ +/* + * This file contains the "hypervisor call" interface which is used to + * drive the hypervisor from SLIC. 
+ */ #ifndef _HVCALLXM_H #define _HVCALLXM_H -//------------------------------------------------------------------- -// Forward declarations -//------------------------------------------------------------------- - -//------------------------------------------------------------------- -// Standard Includes -//------------------------------------------------------------------- #include #include -//----------------------------------------------------------------------------- -// Constants -//----------------------------------------------------------------------------- - #define HvCallXmGetTceTableParms HvCallXm + 0 #define HvCallXmTestBus HvCallXm + 1 #define HvCallXmConnectBusUnit HvCallXm + 2 @@ -33,47 +16,46 @@ #define HvCallXmSetTce HvCallXm + 11 #define HvCallXmSetTces HvCallXm + 13 - - -//============================================================================ -static inline void HvCallXm_getTceTableParms(u64 cb) +static inline void HvCallXm_getTceTableParms(u64 cb) { HvCall1(HvCallXmGetTceTableParms, cb); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } -//============================================================================ -static inline u64 HvCallXm_setTce(u64 tceTableToken, u64 tceOffset, u64 tce) -{ + +static inline u64 HvCallXm_setTce(u64 tceTableToken, u64 tceOffset, u64 tce) +{ u64 retval = HvCall3(HvCallXmSetTce, tceTableToken, tceOffset, tce ); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retval; } -//============================================================================ -static inline u64 HvCallXm_setTces(u64 tceTableToken, u64 tceOffset, u64 numTces, u64 tce1, u64 tce2, u64 tce3, u64 tce4) -{ + +static inline u64 HvCallXm_setTces(u64 tceTableToken, u64 tceOffset, + u64 numTces, u64 tce1, u64 tce2, u64 tce3, u64 tce4) +{ u64 retval = HvCall7(HvCallXmSetTces, tceTableToken, tceOffset, numTces, tce1, tce2, tce3, tce4 ); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retval; } 
-//============================================================================= -static inline u64 HvCallXm_testBus(u16 busNumber) + +static inline u64 HvCallXm_testBus(u16 busNumber) { u64 retVal = HvCall1(HvCallXmTestBus, busNumber); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//===================================================================================== -static inline u64 HvCallXm_testBusUnit(u16 busNumber, u8 subBusNumber, u8 deviceId) + +static inline u64 HvCallXm_testBusUnit(u16 busNumber, u8 subBusNumber, + u8 deviceId) { u64 busUnitNumber = (subBusNumber << 8) | deviceId; u64 retVal = HvCall2(HvCallXmTestBusUnit, busNumber, busUnitNumber); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//===================================================================================== -static inline u64 HvCallXm_connectBusUnit(u16 busNumber, u8 subBusNumber, u8 deviceId, - u64 interruptToken) + +static inline u64 HvCallXm_connectBusUnit(u16 busNumber, u8 subBusNumber, + u8 deviceId, u64 interruptToken) { u64 busUnitNumber = (subBusNumber << 8) | deviceId; u64 queueIndex = 0; // HvLpConfig::mapDsaToQueueIndex(HvLpDSA(busNumber, xBoard, xCard)); @@ -83,13 +65,12 @@ static inline u64 HvCallXm_connectBusUni // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//===================================================================================== -static inline u64 HvCallXm_loadTod(void) + +static inline u64 HvCallXm_loadTod(void) { u64 retVal = HvCall0(HvCallXmLoadTod); // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } -//===================================================================================== #endif /* _HVCALLXM_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvLpConfig.h 2005-05-20 09:05:55.000000000 +1000 +++ 
linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-03 14:21:41.000000000 +1000 @@ -1,17 +1,17 @@ /* * HvLpConfig.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,262 +19,285 @@ #ifndef _HVLPCONFIG_H #define _HVLPCONFIG_H -//=========================================================================== -// -// This file contains the interface to the LPAR configuration data -// to determine which resources should be allocated to each partition. -// -//=========================================================================== +/* + * This file contains the interface to the LPAR configuration data + * to determine which resources should be allocated to each partition. 
+ */ #include #include #include #include -//------------------------------------------------------------------- -// Constants -//------------------------------------------------------------------- - extern HvLpIndex HvLpConfig_getLpIndex_outline(void); -//=================================================================== static inline HvLpIndex HvLpConfig_getLpIndex(void) { return itLpNaca.xLpIndex; } -//=================================================================== + static inline HvLpIndex HvLpConfig_getPrimaryLpIndex(void) { return itLpNaca.xPrimaryLpIndex; } -//================================================================= + static inline HvLpIndex HvLpConfig_getLps(void) { return HvCallCfg_getLps(); } -//================================================================= -static inline HvLpIndexMap HvLpConfig_getActiveLpMap(void) + +static inline HvLpIndexMap HvLpConfig_getActiveLpMap(void) { return HvCallCfg_getActiveLpMap(); } -//================================================================= -static inline u64 HvLpConfig_getSystemMsMegs(void) + +static inline u64 HvLpConfig_getSystemMsMegs(void) { return HvCallCfg_getSystemMsChunks() / HVCHUNKSPERMEG; } -//================================================================= -static inline u64 HvLpConfig_getSystemMsChunks(void) + +static inline u64 HvLpConfig_getSystemMsChunks(void) { return HvCallCfg_getSystemMsChunks(); } -//================================================================= -static inline u64 HvLpConfig_getSystemMsPages(void) + +static inline u64 HvLpConfig_getSystemMsPages(void) { return HvCallCfg_getSystemMsChunks() * HVPAGESPERCHUNK; } -//================================================================ -static inline u64 HvLpConfig_getMsMegs(void) + +static inline u64 HvLpConfig_getMsMegs(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Cur) / HVCHUNKSPERMEG; + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur) + / HVCHUNKSPERMEG; } 
-//================================================================ -static inline u64 HvLpConfig_getMsChunks(void) + +static inline u64 HvLpConfig_getMsChunks(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Cur); + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur); } -//================================================================ -static inline u64 HvLpConfig_getMsPages(void) + +static inline u64 HvLpConfig_getMsPages(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Cur) * HVPAGESPERCHUNK; + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur) + * HVPAGESPERCHUNK; } -//================================================================ -static inline u64 HvLpConfig_getMinMsMegs(void) + +static inline u64 HvLpConfig_getMinMsMegs(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Min) / HVCHUNKSPERMEG; + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min) + / HVCHUNKSPERMEG; } -//================================================================ -static inline u64 HvLpConfig_getMinMsChunks(void) + +static inline u64 HvLpConfig_getMinMsChunks(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Min); + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min); } -//================================================================ -static inline u64 HvLpConfig_getMinMsPages(void) + +static inline u64 HvLpConfig_getMinMsPages(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Min) * HVPAGESPERCHUNK; + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min) + * HVPAGESPERCHUNK; } -//================================================================ -static inline u64 HvLpConfig_getMinRuntimeMsMegs(void) + +static inline u64 HvLpConfig_getMinRuntimeMsMegs(void) { - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) / HVCHUNKSPERMEG; + return 
HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) + / HVCHUNKSPERMEG; } -//=============================================================== -static inline u64 HvLpConfig_getMinRuntimeMsChunks(void) + +static inline u64 HvLpConfig_getMinRuntimeMsChunks(void) { return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()); } -//=============================================================== -static inline u64 HvLpConfig_getMinRuntimeMsPages(void) + +static inline u64 HvLpConfig_getMinRuntimeMsPages(void) { - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) * HVPAGESPERCHUNK; + return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) + * HVPAGESPERCHUNK; } -//=============================================================== -static inline u64 HvLpConfig_getMaxMsMegs(void) + +static inline u64 HvLpConfig_getMaxMsMegs(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Max) / HVCHUNKSPERMEG; + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max) + / HVCHUNKSPERMEG; } -//=============================================================== -static inline u64 HvLpConfig_getMaxMsChunks(void) + +static inline u64 HvLpConfig_getMaxMsChunks(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Max); + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max); } -//=============================================================== -static inline u64 HvLpConfig_getMaxMsPages(void) + +static inline u64 HvLpConfig_getMaxMsPages(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Max) * HVPAGESPERCHUNK; + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max) + * HVPAGESPERCHUNK; } -//=============================================================== -static inline u64 HvLpConfig_getInitMsMegs(void) + +static inline u64 HvLpConfig_getInitMsMegs(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Init) / HVCHUNKSPERMEG; + return 
HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init) + / HVCHUNKSPERMEG; } -//=============================================================== -static inline u64 HvLpConfig_getInitMsChunks(void) + +static inline u64 HvLpConfig_getInitMsChunks(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Init); + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init); } -//=============================================================== -static inline u64 HvLpConfig_getInitMsPages(void) -{ return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(),HvCallCfg_Init) * HVPAGESPERCHUNK; + +static inline u64 HvLpConfig_getInitMsPages(void) +{ + return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init) + * HVPAGESPERCHUNK; } -//=============================================================== -static inline u64 HvLpConfig_getSystemPhysicalProcessors(void) + +static inline u64 HvLpConfig_getSystemPhysicalProcessors(void) { return HvCallCfg_getSystemPhysicalProcessors(); } -//=============================================================== -static inline u64 HvLpConfig_getSystemLogicalProcessors(void) + +static inline u64 HvLpConfig_getSystemLogicalProcessors(void) { - return HvCallCfg_getSystemPhysicalProcessors() * (/*getPaca()->getSecondaryThreadCount() +*/ 1); + return HvCallCfg_getSystemPhysicalProcessors() + * (/*getPaca()->getSecondaryThreadCount() +*/ 1); } -//=============================================================== -static inline u64 HvLpConfig_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) + +static inline u64 HvLpConfig_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { return HvCallCfg_getNumProcsInSharedPool(sPI); } -//=============================================================== -static inline u64 HvLpConfig_getPhysicalProcessors(void) + +static inline u64 HvLpConfig_getPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Cur); + return 
HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Cur); } -//=============================================================== -static inline u64 HvLpConfig_getLogicalProcessors(void) + +static inline u64 HvLpConfig_getLogicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Cur) * (/*getPaca()->getSecondaryThreadCount() +*/ 1); + return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Cur) + * (/*getPaca()->getSecondaryThreadCount() +*/ 1); } -//=============================================================== -static inline HvLpSharedPoolIndex HvLpConfig_getSharedPoolIndex(void) + +static inline HvLpSharedPoolIndex HvLpConfig_getSharedPoolIndex(void) { return HvCallCfg_getSharedPoolIndex(HvLpConfig_getLpIndex()); } -//=============================================================== -static inline u64 HvLpConfig_getSharedProcUnits(void) + +static inline u64 HvLpConfig_getSharedProcUnits(void) { - return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(),HvCallCfg_Cur); + return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), + HvCallCfg_Cur); } -//=============================================================== -static inline u64 HvLpConfig_getMinSharedProcUnits(void) + +static inline u64 HvLpConfig_getMinSharedProcUnits(void) { - return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(),HvCallCfg_Min); + return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), + HvCallCfg_Min); } -//=============================================================== -static inline u64 HvLpConfig_getMaxSharedProcUnits(void) + +static inline u64 HvLpConfig_getMaxSharedProcUnits(void) { - return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(),HvCallCfg_Max); + return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), + HvCallCfg_Max); } -//=============================================================== -static inline u64 HvLpConfig_getMinPhysicalProcessors(void) + +static inline u64 
HvLpConfig_getMinPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Min); + return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Min); } -//=============================================================== -static inline u64 HvLpConfig_getMinLogicalProcessors(void) + +static inline u64 HvLpConfig_getMinLogicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Min) * (/*getPaca()->getSecondaryThreadCount() +*/ 1); + return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Min) + * (/*getPaca()->getSecondaryThreadCount() +*/ 1); } -//=============================================================== -static inline u64 HvLpConfig_getMaxPhysicalProcessors(void) + +static inline u64 HvLpConfig_getMaxPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Max); + return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Max); } -//=============================================================== -static inline u64 HvLpConfig_getMaxLogicalProcessors(void) + +static inline u64 HvLpConfig_getMaxLogicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Max) * (/*getPaca()->getSecondaryThreadCount() +*/ 1); + return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Max) + * (/*getPaca()->getSecondaryThreadCount() +*/ 1); } -//=============================================================== -static inline u64 HvLpConfig_getInitPhysicalProcessors(void) + +static inline u64 HvLpConfig_getInitPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Init); + return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Init); } -//=============================================================== -static inline u64 HvLpConfig_getInitLogicalProcessors(void) + +static inline u64 
HvLpConfig_getInitLogicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(),HvCallCfg_Init) * (/*getPaca()->getSecondaryThreadCount() +*/ 1); + return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + HvCallCfg_Init) + * (/*getPaca()->getSecondaryThreadCount() +*/ 1); } -//================================================================ -static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) + +static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) { return HvCallCfg_getVirtualLanIndexMap(HvLpConfig_getLpIndex_outline()); } -//=============================================================== -static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMapForLp(HvLpIndex lp) + +static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMapForLp( + HvLpIndex lp) { return HvCallCfg_getVirtualLanIndexMap(lp); } -//================================================================ -static inline HvLpIndex HvLpConfig_getBusOwner(HvBusNumber busNumber) + +static inline HvLpIndex HvLpConfig_getBusOwner(HvBusNumber busNumber) { return HvCallCfg_getBusOwner(busNumber); } -//=============================================================== -static inline int HvLpConfig_isBusDedicated(HvBusNumber busNumber) + +static inline int HvLpConfig_isBusDedicated(HvBusNumber busNumber) { return HvCallCfg_isBusDedicated(busNumber); } -//================================================================ -static inline HvLpIndexMap HvLpConfig_getBusAllocation(HvBusNumber busNumber) + +static inline HvLpIndexMap HvLpConfig_getBusAllocation(HvBusNumber busNumber) { return HvCallCfg_getBusAllocation(busNumber); } -//================================================================ -// returns the absolute real address of the load area -static inline u64 HvLpConfig_getLoadAddress(void) + +/* returns the absolute real address of the load area */ +static inline u64 HvLpConfig_getLoadAddress(void) { return 
itLpNaca.xLoadAreaAddr & 0x7fffffffffffffff; } -//================================================================ -static inline u64 HvLpConfig_getLoadPages(void) + +static inline u64 HvLpConfig_getLoadPages(void) { return itLpNaca.xLoadAreaChunks * HVPAGESPERCHUNK; } -//================================================================ -static inline int HvLpConfig_isBusOwnedByThisLp(HvBusNumber busNumber) + +static inline int HvLpConfig_isBusOwnedByThisLp(HvBusNumber busNumber) { HvLpIndex busOwner = HvLpConfig_getBusOwner(busNumber); return (busOwner == HvLpConfig_getLpIndex()); } -//================================================================ -static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1, HvLpIndex lp2) + +static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1, + HvLpIndex lp2) { - HvLpVirtualLanIndexMap virtualLanIndexMap1 = HvCallCfg_getVirtualLanIndexMap( lp1 ); - HvLpVirtualLanIndexMap virtualLanIndexMap2 = HvCallCfg_getVirtualLanIndexMap( lp2 ); + HvLpVirtualLanIndexMap virtualLanIndexMap1 = + HvCallCfg_getVirtualLanIndexMap(lp1); + HvLpVirtualLanIndexMap virtualLanIndexMap2 = + HvCallCfg_getVirtualLanIndexMap(lp2); return ((virtualLanIndexMap1 & virtualLanIndexMap2) != 0); } -//================================================================ -static inline HvLpIndex HvLpConfig_getHostingLpIndex(HvLpIndex lp) + +static inline HvLpIndex HvLpConfig_getHostingLpIndex(HvLpIndex lp) { return HvCallCfg_getHostingLpIndex(lp); } -//================================================================ #endif /* _HVLPCONFIG_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvLpEvent.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpEvent.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvLpEvent.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvLpEvent.h 2005-06-01 16:15:18.000000000 +1000 @@ -1,27 +1,24 @@ /* * HvLpEvent.h * Copyright (C) 2001 Mike 
Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -//====================================================================== -// -// This file contains the class for HV events in the system. -// -//===================================================================== +/* This file contains the class for HV events in the system. */ + #ifndef _HVLPEVENT_H #define _HVLPEVENT_H @@ -30,69 +27,70 @@ #include #include -//===================================================================== -// -// HvLpEvent is the structure for Lp Event messages passed between -// partitions through PLIC. -// -//===================================================================== - -struct HvEventFlags -{ - u8 xValid:1; // Indicates a valid request x00-x00 - u8 xRsvd1:4; // Reserved ... - u8 xAckType:1; // Immediate or deferred ... - u8 xAckInd:1; // Indicates if ACK required ... - u8 xFunction:1; // Interrupt or Acknowledge ... +/* + * HvLpEvent is the structure for Lp Event messages passed between + * partitions through PLIC. + */ + +struct HvEventFlags { + u8 xValid:1; /* Indicates a valid request x00-x00 */ + u8 xRsvd1:4; /* Reserved ... */ + u8 xAckType:1; /* Immediate or deferred ... */ + u8 xAckInd:1; /* Indicates if ACK required ... */ + u8 xFunction:1; /* Interrupt or Acknowledge ... 
*/ }; -struct HvLpEvent -{ - struct HvEventFlags xFlags; // Event flags x00-x00 - u8 xType; // Type of message x01-x01 - u16 xSubtype; // Subtype for event x02-x03 - u8 xSourceLp; // Source LP x04-x04 - u8 xTargetLp; // Target LP x05-x05 - u8 xSizeMinus1; // Size of Derived class - 1 x06-x06 - u8 xRc; // RC for Ack flows x07-x07 - u16 xSourceInstanceId; // Source sides instance id x08-x09 - u16 xTargetInstanceId; // Target sides instance id x0A-x0B +struct HvLpEvent { + struct HvEventFlags xFlags; /* Event flags x00-x00 */ + u8 xType; /* Type of message x01-x01 */ + u16 xSubtype; /* Subtype for event x02-x03 */ + u8 xSourceLp; /* Source LP x04-x04 */ + u8 xTargetLp; /* Target LP x05-x05 */ + u8 xSizeMinus1; /* Size of Derived class - 1 x06-x06 */ + u8 xRc; /* RC for Ack flows x07-x07 */ + u16 xSourceInstanceId; /* Source sides instance id x08-x09 */ + u16 xTargetInstanceId; /* Target sides instance id x0A-x0B */ union { - u32 xSubtypeData; // Data usable by the subtype x0C-x0F - u16 xSubtypeDataShort[2]; // Data as 2 shorts - u8 xSubtypeDataChar[4]; // Data as 4 chars + u32 xSubtypeData; /* Data usable by the subtype x0C-x0F */ + u16 xSubtypeDataShort[2]; /* Data as 2 shorts */ + u8 xSubtypeDataChar[4]; /* Data as 4 chars */ } x; - u64 xCorrelationToken; // Unique value for source/type x10-x17 + u64 xCorrelationToken; /* Unique value for source/type x10-x17 */ }; -// Lp Event handler function typedef void (*LpEventHandler)(struct HvLpEvent *, struct pt_regs *); -// Register a handler for an event type -// returns 0 on success -extern int HvLpEvent_registerHandler( HvLpEvent_Type eventType, LpEventHandler hdlr); - -// Unregister a handler for an event type -// This call will sleep until the handler being removed is guaranteed to -// be no longer executing on any CPU. Do not call with locks held. 
-// -// returns 0 on success -// Unregister will fail if there are any paths open for the type -extern int HvLpEvent_unregisterHandler( HvLpEvent_Type eventType ); - -// Open an Lp Event Path for an event type -// returns 0 on success -// openPath will fail if there is no handler registered for the event type. -// The lpIndex specified is the partition index for the target partition -// (for VirtualIo, VirtualLan and SessionMgr) other types specify zero) -extern int HvLpEvent_openPath( HvLpEvent_Type eventType, HvLpIndex lpIndex ); - - -// Close an Lp Event Path for a type and partition -// returns 0 on sucess -extern int HvLpEvent_closePath( HvLpEvent_Type eventType, HvLpIndex lpIndex ); +/* Register a handler for an event type - returns 0 on success */ +extern int HvLpEvent_registerHandler(HvLpEvent_Type eventType, + LpEventHandler hdlr); + +/* + * Unregister a handler for an event type + * + * This call will sleep until the handler being removed is guaranteed to + * be no longer executing on any CPU. Do not call with locks held. + * + * returns 0 on success + * Unregister will fail if there are any paths open for the type + */ +extern int HvLpEvent_unregisterHandler(HvLpEvent_Type eventType); + +/* + * Open an Lp Event Path for an event type + * returns 0 on success + * openPath will fail if there is no handler registered for the event type. 
+ * The lpIndex specified is the partition index for the target partition + * (for VirtualIo, VirtualLan and SessionMgr) other types specify zero) + */ +extern int HvLpEvent_openPath(HvLpEvent_Type eventType, HvLpIndex lpIndex); + +/* + * Close an Lp Event Path for a type and partition + * returns 0 on sucess + */ +extern int HvLpEvent_closePath(HvLpEvent_Type eventType, HvLpIndex lpIndex); #define HvLpEvent_Type_Hypervisor 0 #define HvLpEvent_Type_MachineFac 1 @@ -141,4 +139,4 @@ extern int HvLpEvent_closePath( HvLpEven #define HvLpDma_Rc_InvalidAddress 4 #define HvLpDma_Rc_InvalidLength 5 -#endif // _HVLPEVENT_H +#endif /* _HVLPEVENT_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvReleaseData.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvReleaseData.h --- linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvReleaseData.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvReleaseData.h 2005-06-01 16:39:29.000000000 +1000 @@ -1,17 +1,17 @@ /* * HvReleaseData.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,47 +19,43 @@ #ifndef _HVRELEASEDATA_H #define _HVRELEASEDATA_H -//============================================================================= -// -// This control block contains the critical information about the -// release so that it can be changed in the future (ie, the virtual -// address of the OS's NACA). -// +/* + * This control block contains the critical information about the + * release so that it can be changed in the future (ie, the virtual + * address of the OS's NACA). + */ #include #include -//============================================================================= -// -// When we IPL a secondary partition, we will check if if the -// secondary xMinPlicVrmIndex > the primary xVrmIndex. -// If it is then this tells PLIC that this secondary is not -// supported running on this "old" of a level of PLIC. -// -// Likewise, we will compare the primary xMinSlicVrmIndex to -// the secondary xVrmIndex. -// If the primary xMinSlicVrmDelta > secondary xVrmDelta then we -// know that this PLIC does not support running an OS "that old". -// -//============================================================================= +/* + * When we IPL a secondary partition, we will check if if the + * secondary xMinPlicVrmIndex > the primary xVrmIndex. + * If it is then this tells PLIC that this secondary is not + * supported running on this "old" of a level of PLIC. + * + * Likewise, we will compare the primary xMinSlicVrmIndex to + * the secondary xVrmIndex. + * If the primary xMinSlicVrmDelta > secondary xVrmDelta then we + * know that this PLIC does not support running an OS "that old". 
+ */ -struct HvReleaseData -{ - u32 xDesc; // Descriptor "HvRD" ebcdic x00-x03 - u16 xSize; // Size of this control block x04-x05 - u16 xVpdAreasPtrOffset; // Offset in NACA of ItVpdAreas x06-x07 - struct naca_struct * xSlicNacaAddr; // Virt addr of SLIC NACA x08-x0F - u32 xMsNucDataOffset; // Offset of Linux Mapping Data x10-x13 - u32 xRsvd1; // Reserved x14-x17 - u16 xTagsMode:1; // 0 == tags active, 1 == tags inactive - u16 xAddressSize:1; // 0 == 64-bit, 1 == 32-bit - u16 xNoSharedProcs:1; // 0 == shared procs, 1 == no shared - u16 xNoHMT:1; // 0 == allow HMT, 1 == no HMT - u16 xRsvd2:12; // Reserved x18-x19 - u16 xVrmIndex; // VRM Index of OS image x1A-x1B - u16 xMinSupportedPlicVrmIndex;// Min PLIC level (soft) x1C-x1D - u16 xMinCompatablePlicVrmIndex;// Min PLIC levelP (hard) x1E-x1F - char xVrmName[12]; // Displayable name x20-x2B - char xRsvd3[20]; // Reserved x2C-x3F +struct HvReleaseData { + u32 xDesc; /* Descriptor "HvRD" ebcdic x00-x03 */ + u16 xSize; /* Size of this control block x04-x05 */ + u16 xVpdAreasPtrOffset; /* Offset in NACA of ItVpdAreas x06-x07 */ + struct naca_struct *xSlicNacaAddr; /* Virt addr of SLIC NACA x08-x0F */ + u32 xMsNucDataOffset; /* Offset of Linux Mapping Data x10-x13 */ + u32 xRsvd1; /* Reserved x14-x17 */ + u16 xTagsMode:1; /* 0 == tags active, 1 == tags inactive */ + u16 xAddressSize:1; /* 0 == 64-bit, 1 == 32-bit */ + u16 xNoSharedProcs:1; /* 0 == shared procs, 1 == no shared */ + u16 xNoHMT:1; /* 0 == allow HMT, 1 == no HMT */ + u16 xRsvd2:12; /* Reserved x18-x19 */ + u16 xVrmIndex; /* VRM Index of OS image x1A-x1B */ + u16 xMinSupportedPlicVrmIndex; /* Min PLIC level (soft) x1C-x1D */ + u16 xMinCompatablePlicVrmIndex; /* Min PLIC levelP (hard) x1E-x1F */ + char xVrmName[12]; /* Displayable name x20-x2B */ + char xRsvd3[20]; /* Reserved x2C-x3F */ }; #endif /* _HVRELEASEDATA_H */ diff -ruNp linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvTypes.h linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvTypes.h --- 
linus-iSeries-headers.1/include/asm-ppc64/iSeries/HvTypes.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.2/include/asm-ppc64/iSeries/HvTypes.h 2005-06-01 16:45:03.000000000 +1000 @@ -1,17 +1,17 @@ /* * HvTypes.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,74 +19,62 @@ #ifndef _HVTYPES_H #define _HVTYPES_H -//=========================================================================== -// Header File Id -// Name______________: HvTypes.H -// -// Description_______: -// -// General typedefs for the hypervisor. -// -// Declared Class(es): -// -//=========================================================================== +/* + * General typedefs for the hypervisor. 
+ */ #include -//------------------------------------------------------------------- -// Typedefs -//------------------------------------------------------------------- typedef u8 HvLpIndex; typedef u16 HvLpInstanceId; -typedef u64 HvLpTOD; -typedef u64 HvLpSystemSerialNum; -typedef u8 HvLpDeviceSerialNum[12]; -typedef u16 HvLpSanHwSet; -typedef u16 HvLpBus; -typedef u16 HvLpBoard; -typedef u16 HvLpCard; -typedef u8 HvLpDeviceType[4]; -typedef u8 HvLpDeviceModel[3]; -typedef u64 HvIoToken; -typedef u8 HvLpName[8]; +typedef u64 HvLpTOD; +typedef u64 HvLpSystemSerialNum; +typedef u8 HvLpDeviceSerialNum[12]; +typedef u16 HvLpSanHwSet; +typedef u16 HvLpBus; +typedef u16 HvLpBoard; +typedef u16 HvLpCard; +typedef u8 HvLpDeviceType[4]; +typedef u8 HvLpDeviceModel[3]; +typedef u64 HvIoToken; +typedef u8 HvLpName[8]; typedef u32 HvIoId; typedef u64 HvRealMemoryIndex; -typedef u32 HvLpIndexMap; // Must hold HvMaxArchitectedLps bits!!! +typedef u32 HvLpIndexMap; /* Must hold HvMaxArchitectedLps bits!!! */ typedef u16 HvLpVrmIndex; typedef u32 HvXmGenerationId; -typedef u8 HvLpBusPool; -typedef u8 HvLpSharedPoolIndex; +typedef u8 HvLpBusPool; +typedef u8 HvLpSharedPoolIndex; typedef u16 HvLpSharedProcUnitsX100; typedef u8 HvLpVirtualLanIndex; -typedef u16 HvLpVirtualLanIndexMap; // Must hold HvMaxArchitectedVirtualLans bits!!! 
-typedef u16 HvBusNumber; // Hypervisor Bus Number -typedef u8 HvSubBusNumber; // Hypervisor SubBus Number -typedef u8 HvAgentId; // Hypervisor DevFn - - -#define HVMAXARCHITECTEDLPS 32 -#define HVMAXARCHITECTEDVIRTUALLANS 16 -#define HVMAXARCHITECTEDVIRTUALDISKS 32 -#define HVMAXARCHITECTEDVIRTUALCDROMS 8 -#define HVMAXARCHITECTEDVIRTUALTAPES 8 -#define HVCHUNKSIZE 256 * 1024 -#define HVPAGESIZE 4 * 1024 -#define HVLPMINMEGSPRIMARY 256 -#define HVLPMINMEGSSECONDARY 64 -#define HVCHUNKSPERMEG 4 -#define HVPAGESPERMEG 256 -#define HVPAGESPERCHUNK 64 - -#define HvMaxArchitectedLps ((HvLpIndex)HVMAXARCHITECTEDLPS) +typedef u16 HvLpVirtualLanIndexMap; /* Must hold HvMaxArchitectedVirtualLans bits!!! */ +typedef u16 HvBusNumber; /* Hypervisor Bus Number */ +typedef u8 HvSubBusNumber; /* Hypervisor SubBus Number */ +typedef u8 HvAgentId; /* Hypervisor DevFn */ + + +#define HVMAXARCHITECTEDLPS 32 +#define HVMAXARCHITECTEDVIRTUALLANS 16 +#define HVMAXARCHITECTEDVIRTUALDISKS 32 +#define HVMAXARCHITECTEDVIRTUALCDROMS 8 +#define HVMAXARCHITECTEDVIRTUALTAPES 8 +#define HVCHUNKSIZE (256 * 1024) +#define HVPAGESIZE (4 * 1024) +#define HVLPMINMEGSPRIMARY 256 +#define HVLPMINMEGSSECONDARY 64 +#define HVCHUNKSPERMEG 4 +#define HVPAGESPERMEG 256 +#define HVPAGESPERCHUNK 64 + +#define HvMaxArchitectedLps ((HvLpIndex)HVMAXARCHITECTEDLPS) #define HvMaxArchitectedVirtualLans ((HvLpVirtualLanIndex)16) #define HvLpIndexInvalid ((HvLpIndex)0xff) -//-------------------------------------------------------------------- -// Enums for the sub-components under PLIC -// Used in HvCall and HvPrimaryCall -//-------------------------------------------------------------------- -enum HvCallCompIds -{ +/* + * Enums for the sub-components under PLIC + * Used in HvCall and HvPrimaryCall + */ +enum HvCallCompIds { HvCallCompId = 0, HvCallCpuCtlsCompId = 1, HvCallCfgCompId = 2, @@ -97,18 +85,18 @@ enum HvCallCompIds HvCallSmCompId = 7, HvCallSpdCompId = 8, HvCallXmCompId = 9, - HvCallRioCompId = 10, + 
HvCallRioCompId = 10, HvCallRsvd3CompId = 11, HvCallRsvd2CompId = 12, HvCallRsvd1CompId = 13, HvCallMaxCompId = 14, - HvPrimaryCallCompId = 0, + HvPrimaryCallCompId = 0, HvPrimaryCallCfgCompId = 1, - HvPrimaryCallPciCompId = 2, + HvPrimaryCallPciCompId = 2, HvPrimaryCallSmCompId = 3, HvPrimaryCallSpdCompId = 4, HvPrimaryCallXmCompId = 5, - HvPrimaryCallRioCompId = 6, + HvPrimaryCallRioCompId = 6, HvPrimaryCallRsvd7CompId = 7, HvPrimaryCallRsvd6CompId = 8, HvPrimaryCallRsvd5CompId = 9, @@ -116,7 +104,7 @@ enum HvCallCompIds HvPrimaryCallRsvd3CompId = 11, HvPrimaryCallRsvd2CompId = 12, HvPrimaryCallRsvd1CompId = 13, - HvPrimaryCallMaxCompId = HvCallMaxCompId + HvPrimaryCallMaxCompId = HvCallMaxCompId }; struct HvLpBufferList {

From sfr at canb.auug.org.au Fri Jun 3 18:06:42 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Fri, 3 Jun 2005 18:06:42 +1000
Subject: [PATCH 3/10] ppc64 iSeries: more header file white space cleanups
In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au>
Message-ID: <20050603180642.34f1cb31.sfr@canb.auug.org.au>

Hi Andrew,

This patch just contains white space and comment cleanups in the iSeries header files. There are no semantic changes.
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriMainStore.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/IoHriMainStore.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriMainStore.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/IoHriMainStore.h 2005-06-01 16:47:55.000000000 +1000 @@ -1,17 +1,17 @@ /* * IoHriMainStore.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -21,7 +21,7 @@ #define _IOHRIMAINSTORE_H /* Main Store Vpd for Condor,iStar,sStar */ -struct IoHriMainStoreSegment4 { +struct IoHriMainStoreSegment4 { u8 msArea0Exists:1; u8 msArea1Exists:1; u8 msArea2Exists:1; @@ -51,7 +51,7 @@ struct IoHriMainStoreSegment4 { u8 msArea1HasRiserVpd:1; u8 msArea2HasRiserVpd:1; u8 msArea3HasRiserVpd:1; - u8 reserved5:4; + u8 reserved5:4; u8 reserved6; u16 reserved7; @@ -82,8 +82,8 @@ struct IoHriMainStoreVpdFruData { }; struct IoHriMainStoreAdrRangeBlock { - void * blockStart __attribute((packed)); - void * blockEnd __attribute((packed)); + void *blockStart __attribute((packed)); + void *blockEnd __attribute((packed)); u32 blockProcChipId __attribute((packed)); }; @@ -102,7 +102,7 @@ struct IoHriMainStoreArea4 { u32 procNodeId __attribute((packed)); u32 numAdrRangeBlocks __attribute((packed)); - struct IoHriMainStoreAdrRangeBlock xAdrRangeBlock[MaxAreaAdrRangeBlocks] __attribute((packed)); + struct IoHriMainStoreAdrRangeBlock xAdrRangeBlock[MaxAreaAdrRangeBlocks] __attribute((packed)); struct IoHriMainStoreChipInfo1 chipInfo0 __attribute((packed)); struct IoHriMainStoreChipInfo1 chipInfo1 __attribute((packed)); @@ -113,17 +113,17 @@ struct IoHriMainStoreArea4 { struct IoHriMainStoreChipInfo1 chipInfo6 __attribute((packed)); struct IoHriMainStoreChipInfo1 chipInfo7 __attribute((packed)); - void * msRamAreaArray __attribute((packed)); + void *msRamAreaArray __attribute((packed)); u32 msRamAreaArrayNumEntries __attribute((packed)); u32 msRamAreaArrayEntrySize __attribute((packed)); u32 numaDimmExists __attribute((packed)); u32 numaDimmFunctional __attribute((packed)); - void * numaDimmArray __attribute((packed)); + void *numaDimmArray __attribute((packed)); u32 numaDimmArrayNumEntries __attribute((packed)); u32 
numaDimmArrayEntrySize __attribute((packed)); - struct IoHriMainStoreVpdIdData idData __attribute((packed)); + struct IoHriMainStoreVpdIdData idData __attribute((packed)); u64 powerData __attribute((packed)); u64 cardAssemblyPartNum __attribute((packed)); @@ -143,7 +143,7 @@ struct IoHriMainStoreArea4 { }; -struct IoHriMainStoreSegment5 { +struct IoHriMainStoreSegment5 { u16 reserved1; u8 reserved2; u8 msVpdFormat; @@ -151,17 +151,14 @@ struct IoHriMainStoreSegment5 { u32 totalMainStore; u64 maxConfiguredMsAdr; - struct IoHriMainStoreArea4* msAreaArray; + struct IoHriMainStoreArea4 *msAreaArray; u32 msAreaArrayNumEntries; u32 msAreaArrayEntrySize; - u32 msAreaExists; + u32 msAreaExists; u32 msAreaFunctional; u64 reserved3; }; - - -#endif // _IOHRIMAINSTORE_H - +#endif /* _IOHRIMAINSTORE_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriProcessorVpd.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/IoHriProcessorVpd.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 2005-06-01 16:50:11.000000000 +1000 @@ -1,17 +1,17 @@ /* * IoHriProcessorVpd.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,16 +19,12 @@ #ifndef _IOHRIPROCESSORVPD_H #define _IOHRIPROCESSORVPD_H -//=================================================================== -// -// This struct maps Processor Vpd that is DMAd to SLIC by CSP -// - #include -struct IoHriProcessorVpd -{ - +/* + * This struct maps Processor Vpd that is DMAd to SLIC by CSP + */ +struct IoHriProcessorVpd { u8 xFormat; // VPD format indicator x00-x00 u8 xProcStatus:8; // Processor State x01-x01 u8 xSecondaryThreadCount; // Secondary thread cnt x02-x02 @@ -40,12 +36,12 @@ struct IoHriProcessorVpd u16 xRsvd2; // Reserved x06-x07 u32 xHwNodeId; // Hardware node id x08-x0B u32 xHwProcId; // Hardware processor id x0C-x0F - + u32 xTypeNum; // Card Type/CCIN number x10-x13 u32 xModelNum; // Model/Feature number x14-x17 u64 xSerialNum; // Serial number x18-x1F - char xPartNum[12]; // Book Part or FPU number x20-x2B - char xMfgID[4]; // Manufacturing ID x2C-x2F + char xPartNum[12]; // Book Part or FPU number x20-x2B + char xMfgID[4]; // Manufacturing ID x2C-x2F u32 xProcFreq; // Processor Frequency x30-x33 u32 xTimeBaseFreq; // Time Base Frequency x34-x37 @@ -71,7 +67,7 @@ struct IoHriProcessorVpd u32 xDataL3CacheSizeKB; // L3 data cache size in KB x80-x83 u32 xDataL3CacheLineSize; // L3 data cache block size x84-x87 u64 xRsvd6; // Reserved x88-x8F - + u64 xFruLabel; // Card Location Label x90-x97 u8 xSlotsOnCard; // Slots on card (0=no slots) x98-x98 u8 xPartLocFlag; // Location flag (0-pluggable 1-imbedded) x99-x99 @@ -79,10 +75,10 @@ struct IoHriProcessorVpd u8 xSmartCardPortNo; // Smart card port number x9C-x9C u8 xRsvd7; // Reserved x9D-x9D u16 xFrameIdAndRackUnit; // Frame ID and rack unit adr x9E-x9F - + u8 xRsvd8[24]; // Reserved xA0-xB7 - char xProcSrc[72]; // CSP format SRC xB8-xFF + char xProcSrc[72]; 
// CSP format SRC xB8-xFF }; #endif /* _IOHRIPROCESSORVPD_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItExtVpdPanel.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItExtVpdPanel.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItExtVpdPanel.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItExtVpdPanel.h 2005-06-01 16:51:48.000000000 +1000 @@ -1,17 +1,17 @@ /* * ItExtVpdPanel.h * Copyright (C) 2002 Dave Boutcher IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -20,39 +20,31 @@ #define _ITEXTVPDPANEL_H /* - * - * This struct maps the panel information + * This struct maps the panel information * * Warning: * This data must match the architecture for the panel information - * */ - -/*------------------------------------------------------------------- - * Standard Includes - *------------------------------------------------------------------- -*/ #include -struct ItExtVpdPanel -{ - // Definition of the Extended Vpd On Panel Data Area - char systemSerial[8]; - char mfgID[4]; - char reserved1[24]; - char machineType[4]; - char systemID[6]; - char somUniqueCnt[4]; - char serialNumberCount; - char reserved2[7]; - u16 bbu3; - u16 bbu2; - u16 bbu1; - char xLocationLabel[8]; - u8 xRsvd1[6]; - u16 xFrameId; - u8 xRsvd2[48]; +struct ItExtVpdPanel { + /* Definition of the Extended Vpd On Panel Data Area */ + char systemSerial[8]; + char mfgID[4]; + char reserved1[24]; + char machineType[4]; + char systemID[6]; + char somUniqueCnt[4]; + char serialNumberCount; + char reserved2[7]; + u16 bbu3; + u16 bbu2; + u16 bbu1; + char xLocationLabel[8]; + u8 xRsvd1[6]; + u16 xFrameId; + u8 xRsvd2[48]; }; -#endif /* _ITEXTVPDPANEL_H */ +#endif /* _ITEXTVPDPANEL_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItIplParmsReal.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItIplParmsReal.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItIplParmsReal.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItIplParmsReal.h 2005-06-01 16:53:52.000000000 +1000 @@ -1,17 +1,17 @@ /* * ItIplParmsReal.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public 
License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,58 +19,51 @@ #ifndef _ITIPLPARMSREAL_H #define _ITIPLPARMSREAL_H -//============================================================================== -// -// This struct maps the IPL Parameters DMA'd from the SP. -// -// Warning: -// This data must map in exactly 64 bytes and match the architecture for -// the IPL parms -// -//============================================================================= - +/* + * This struct maps the IPL Parameters DMA'd from the SP. + * + * Warning: + * This data must map in exactly 64 bytes and match the architecture for + * the IPL parms + */ -//------------------------------------------------------------------- -// Standard Includes -//------------------------------------------------------------------- #include -struct ItIplParmsReal -{ - u8 xFormat; // Defines format of IplParms x00-x00 - u8 xRsvd01:6; // Reserved x01-x01 - u8 xAlternateSearch:1; // Alternate search indicator ... - u8 xUaSupplied:1; // UA Supplied on programmed IPL ... 
- u8 xLsUaFormat; // Format byte for UA x02-x02 - u8 xRsvd02; // Reserved x03-x03 - u32 xLsUa; // LS UA x04-x07 - u32 xUnusedLsLid; // First OS LID to load x08-x0B - u16 xLsBusNumber; // LS Bus Number x0C-x0D - u8 xLsCardAdr; // LS Card Address x0E-x0E - u8 xLsBoardAdr; // LS Board Address x0F-x0F - u32 xRsvd03; // Reserved x10-x13 - u8 xSpcnPresent:1; // SPCN present x14-x14 - u8 xCpmPresent:1; // CPM present ... - u8 xRsvd04:6; // Reserved ... - u8 xRsvd05:4; // Reserved x15-x15 - u8 xKeyLock:4; // Keylock setting ... - u8 xRsvd06:6; // Reserved x16-x16 - u8 xIplMode:2; // Ipl mode (A|B|C|D) ... - u8 xHwIplType; // Fast v slow v slow EC HW IPL x17-x17 - u16 xCpmEnabledIpl:1; // CPM in effect when IPL initiated x18-x19 - u16 xPowerOnResetIpl:1; // Indicate POR condition ... - u16 xMainStorePreserved:1; // Main Storage is preserved ... - u16 xRsvd07:13; // Reserved ... - u16 xIplSource:16; // Ipl source x1A-x1B - u8 xIplReason:8; // Reason for this IPL x1C-x1C - u8 xRsvd08; // Reserved x1D-x1D - u16 xRsvd09; // Reserved x1E-x1F - u16 xSysBoxType; // System Box Type x20-x21 - u16 xSysProcType; // System Processor Type x22-x23 - u32 xRsvd10; // Reserved x24-x27 - u64 xRsvd11; // Reserved x28-x2F - u64 xRsvd12; // Reserved x30-x37 - u64 xRsvd13; // Reserved x38-x3F +struct ItIplParmsReal { + u8 xFormat; // Defines format of IplParms x00-x00 + u8 xRsvd01:6; // Reserved x01-x01 + u8 xAlternateSearch:1; // Alternate search indicator ... + u8 xUaSupplied:1; // UA Supplied on programmed IPL... + u8 xLsUaFormat; // Format byte for UA x02-x02 + u8 xRsvd02; // Reserved x03-x03 + u32 xLsUa; // LS UA x04-x07 + u32 xUnusedLsLid; // First OS LID to load x08-x0B + u16 xLsBusNumber; // LS Bus Number x0C-x0D + u8 xLsCardAdr; // LS Card Address x0E-x0E + u8 xLsBoardAdr; // LS Board Address x0F-x0F + u32 xRsvd03; // Reserved x10-x13 + u8 xSpcnPresent:1; // SPCN present x14-x14 + u8 xCpmPresent:1; // CPM present ... + u8 xRsvd04:6; // Reserved ... 
+ u8 xRsvd05:4; // Reserved x15-x15 + u8 xKeyLock:4; // Keylock setting ... + u8 xRsvd06:6; // Reserved x16-x16 + u8 xIplMode:2; // Ipl mode (A|B|C|D) ... + u8 xHwIplType; // Fast v slow v slow EC HW IPL x17-x17 + u16 xCpmEnabledIpl:1; // CPM in effect when IPL initiatedx18-x19 + u16 xPowerOnResetIpl:1; // Indicate POR condition ... + u16 xMainStorePreserved:1; // Main Storage is preserved ... + u16 xRsvd07:13; // Reserved ... + u16 xIplSource:16; // Ipl source x1A-x1B + u8 xIplReason:8; // Reason for this IPL x1C-x1C + u8 xRsvd08; // Reserved x1D-x1D + u16 xRsvd09; // Reserved x1E-x1F + u16 xSysBoxType; // System Box Type x20-x21 + u16 xSysProcType; // System Processor Type x22-x23 + u32 xRsvd10; // Reserved x24-x27 + u64 xRsvd11; // Reserved x28-x2F + u64 xRsvd12; // Reserved x30-x37 + u64 xRsvd13; // Reserved x38-x3F }; #endif /* _ITIPLPARMSREAL_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpNaca.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItLpNaca.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpNaca.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItLpNaca.h 2005-06-01 16:58:28.000000000 +1000 @@ -1,17 +1,17 @@ /* * ItLpNaca.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,18 +19,13 @@ #ifndef _ITLPNACA_H #define _ITLPNACA_H -//============================================================================= -// -// This control block contains the data that is shared between the -// hypervisor (PLIC) and the OS. -// -//============================================================================= - -struct ItLpNaca -{ -//============================================================================= +/* + * This control block contains the data that is shared between the + * hypervisor (PLIC) and the OS. + */ + +struct ItLpNaca { // CACHE_LINE_1 0x0000 - 0x007F Contains read-only data -//============================================================================= u32 xDesc; // Eye catcher x00-x03 u16 xSize; // Size of this class x04-x05 u16 xIntHdlrOffset; // Offset to IntHdlr array x06-x07 @@ -59,30 +54,23 @@ struct ItLpNaca u64 xLoadAreaAddr; // ER address of load area x28-x2F u32 xLoadAreaChunks; // Chunks for the load area x30-x33 u32 xPaseSysCallCRMask; // Mask used to test CR before x34-x37 - // doing an ASR switch on PASE - // system call. - u64 xSlicSegmentTablePtr; // Pointer to Slic seg table. x38-x3f - u8 xRsvd1_4[64]; // x40-x7F - -//============================================================================= + // doing an ASR switch on PASE + // system call. + u64 xSlicSegmentTablePtr; // Pointer to Slic seg table. 
x38-x3f + u8 xRsvd1_4[64]; // x40-x7F + // CACHE_LINE_2 0x0080 - 0x00FF Contains local read-write data -//============================================================================= u8 xRsvd2_0[128]; // Reserved x00-x7F -//============================================================================= // CACHE_LINE_3-6 0x0100 - 0x02FF Contains LP Queue indicators -// NB: Padding required to keep xInterrruptHdlr at x300 which is required +// NB: Padding required to keep xInterrruptHdlr at x300 which is required // for v4r4 PLIC. -//============================================================================= u8 xOldLpQueue[128]; // LP Queue needed for v4r4 100-17F u8 xRsvd3_0[384]; // Reserved 180-2FF -//============================================================================= + // CACHE_LINE_7-8 0x0300 - 0x03FF Contains the address of the OS interrupt // handlers -//============================================================================= u64 xInterruptHdlr[32]; // Interrupt handlers 300-x3FF }; -//============================================================================= - #endif /* _ITLPNACA_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpQueue.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItLpQueue.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpQueue.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItLpQueue.h 2005-06-01 17:05:16.000000000 +1000 @@ -1,17 +1,17 @@ /* * ItLpQueue.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. 
- * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,47 +19,47 @@ #ifndef _ITLPQUEUE_H #define _ITLPQUEUE_H -//============================================================================= -// -// This control block defines the simple LP queue structure that is -// shared between the hypervisor (PLIC) and the OS in order to send -// events to an LP. -// +/* + * This control block defines the simple LP queue structure that is + * shared between the hypervisor (PLIC) and the OS in order to send + * events to an LP. + */ #include #include struct HvLpEvent; -#define ITMaxLpQueues 8 +#define ITMaxLpQueues 8 #define NotUsed 0 // Queue will not be used by PLIC #define DedicatedIo 1 // Queue dedicated to IO processor specified #define DedicatedLp 2 // Queue dedicated to LP specified #define Shared 3 // Queue shared for both IO and LP -#define LpEventStackSize 4096 -#define LpEventMaxSize 256 -#define LpEventAlign 64 +#define LpEventStackSize 4096 +#define LpEventMaxSize 256 +#define LpEventAlign 64 -struct ItLpQueue -{ -// -// The xSlicCurEventPtr is the pointer to the next event stack entry that will -// become valid. The OS must peek at this entry to determine if it is valid. -// PLIC will set the valid indicator as the very last store into that entry. -// -// When the OS has completed processing of the event then it will mark the event -// as invalid so that PLIC knows it can store into that event location again. 
-// -// If the event stack fills and there are overflow events, then PLIC will set -// the xPlicOverflowIntPending flag in which case the OS will have to fetch the -// additional LP events once they have drained the event stack. -// -// The first 16-bytes are known by both the OS and PLIC. The remainder of the -// cache line is for use by the OS. -// -//============================================================================= +struct ItLpQueue { +/* + * The xSlicCurEventPtr is the pointer to the next event stack entry + * that will become valid. The OS must peek at this entry to determine + * if it is valid. PLIC will set the valid indicator as the very last + * store into that entry. + * + * When the OS has completed processing of the event then it will mark + * the event as invalid so that PLIC knows it can store into that event + * location again. + * + * If the event stack fills and there are overflow events, then PLIC + * will set the xPlicOverflowIntPending flag in which case the OS will + * have to fetch the additional LP events once they have drained the + * event stack. + * + * The first 16-bytes are known by both the OS and PLIC. The remainder + * of the cache line is for use by the OS. 
+ */ u8 xPlicOverflowIntPending;// 0x00 Overflow events are pending u8 xPlicStatus; // 0x01 DedicatedIo or DedicatedLp or NotUsed u16 xSlicLogicalProcIndex; // 0x02 Logical Proc Index for correlation @@ -76,17 +76,17 @@ struct ItLpQueue extern struct ItLpQueue xItLpQueue; -extern struct HvLpEvent * ItLpQueue_getNextLpEvent( struct ItLpQueue * ); -extern int ItLpQueue_isLpIntPending( struct ItLpQueue * ); -extern unsigned ItLpQueue_process( struct ItLpQueue *, struct pt_regs * ); -extern void ItLpQueue_clearValid( struct HvLpEvent * ); +extern struct HvLpEvent * ItLpQueue_getNextLpEvent(struct ItLpQueue *); +extern int ItLpQueue_isLpIntPending(struct ItLpQueue *); +extern unsigned ItLpQueue_process(struct ItLpQueue *, struct pt_regs *); +extern void ItLpQueue_clearValid(struct HvLpEvent *); -static __inline__ void process_iSeries_events( void ) +static __inline__ void process_iSeries_events(void) { __asm__ __volatile__ ( " li 0,0x5555 \n\ sc" - : : : "r0", "r3" ); + : : : "r0", "r3"); } #endif /* _ITLPQUEUE_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpRegSave.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItLpRegSave.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItLpRegSave.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItLpRegSave.h 2005-06-01 17:06:41.000000000 +1000 @@ -1,17 +1,17 @@ /* * ItLpRegSave.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,33 +19,30 @@ #ifndef _ITLPREGSAVE_H #define _ITLPREGSAVE_H -//===================================================================================== -// -// This control block contains the data that is shared between PLIC -// and the OS -// -// +/* + * This control block contains the data that is shared between PLIC + * and the OS + */ -struct ItLpRegSave -{ +struct ItLpRegSave { u32 xDesc; // Eye catcher "LpRS" ebcdic 000-003 u16 xSize; // Size of this class 004-005 u8 xInUse; // Area is live 006-007 - u8 xRsvd1[9]; // Reserved 007-00F + u8 xRsvd1[9]; // Reserved 007-00F - u8 xFixedRegSave[352]; // Fixed Register Save Area 010-16F + u8 xFixedRegSave[352]; // Fixed Register Save Area 010-16F u32 xCTRL; // Control Register 170-173 - u32 xDEC; // Decrementer 174-177 + u32 xDEC; // Decrementer 174-177 u32 xFPSCR; // FP Status and Control Reg 178-17B u32 xPVR; // Processor Version Number 17C-17F - + u64 xMMCR0; // Monitor Mode Control Reg 0 180-187 u32 xPMC1; // Perf Monitor Counter 1 188-18B u32 xPMC2; // Perf Monitor Counter 2 18C-18F u32 xPMC3; // Perf Monitor Counter 3 190-193 u32 xPMC4; // Perf Monitor Counter 4 194-197 u32 xPIR; // Processor ID Reg 198-19B - + u32 xMMCR1; // Monitor Mode Control Reg 1 19C-19F u32 xMMCRA; // Monitor Mode Control Reg A 1A0-1A3 u32 xPMC5; // Perf Monitor Counter 5 1A4-1A7 @@ -57,17 +54,17 @@ struct ItLpRegSave u32 xRsvd; // Reserved 1BC-1BF u64 xACCR; // Address Compare Control Reg 1C0-1C7 - u64 xIMR; // Instruction Match Register 1C8-1CF - u64 xSDR1; // Storage Description Reg 1 1D0-1D7 + u64 xIMR; // Instruction Match Register 1C8-1CF + u64 xSDR1; // Storage Description Reg 1 1D0-1D7 u64 xSPRG0; // Special Purpose Reg General0 1D8-1DF u64 xSPRG1; // Special Purpose Reg General1 1E0-1E7 u64 xSPRG2; // Special Purpose 
Reg General2 1E8-1EF u64 xSPRG3; // Special Purpose Reg General3 1F0-1F7 u64 xTB; // Time Base Register 1F8-1FF - + u64 xFPR[32]; // Floating Point Registers 200-2FF - u64 xMSR; // Machine State Register 300-307 + u64 xMSR; // Machine State Register 300-307 u64 xNIA; // Next Instruction Address 308-30F u64 xDABR; // Data Address Breakpoint Reg 310-317 @@ -76,8 +73,8 @@ struct ItLpRegSave u64 xHID0; // HW Implementation Dependent0 320-327 u64 xHID4; // HW Implementation Dependent4 328-32F - u64 xSCOMd; // SCON Data Reg (SPRG4) 330-337 - u64 xSCOMc; // SCON Command Reg (SPRG5) 338-33F + u64 xSCOMd; // SCON Data Reg (SPRG4) 330-337 + u64 xSCOMc; // SCON Command Reg (SPRG5) 338-33F u64 xSDAR; // Sample Data Address Register 340-347 u64 xSIAR; // Sample Inst Address Register 348-34F diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItSpCommArea.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItSpCommArea.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItSpCommArea.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItSpCommArea.h 2005-06-01 17:07:22.000000000 +1000 @@ -1,29 +1,27 @@ /* * ItSpCommArea.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ - #ifndef _ITSPCOMMAREA_H #define _ITSPCOMMAREA_H -struct SpCommArea -{ +struct SpCommArea { u32 xDesc; // Descriptor (only in new formats) 000-003 u8 xFormat; // Format (only in new formats) 004-004 u8 xRsvd1[11]; // Reserved 005-00F diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItVpdAreas.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItVpdAreas.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-01 17:11:03.000000000 +1000 @@ -1,17 +1,17 @@ /* * ItVpdAreas.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -19,78 +19,75 @@ #ifndef _ITVPDAREAS_H #define _ITVPDAREAS_H -//===================================================================================== -// -// This file defines the address and length of all of the VPD area passed to -// the OS from PLIC (most of which start from the SP). 
-// +/* + * This file defines the address and length of all of the VPD area passed to + * the OS from PLIC (most of which start from the SP). + */ #include -// VPD Entry index is carved in stone - cannot be changed (easily). -#define ItVpdCecVpd 0 -#define ItVpdDynamicSpace 1 -#define ItVpdExtVpd 2 -#define ItVpdExtVpdOnPanel 3 -#define ItVpdFirstPaca 4 -#define ItVpdIoVpd 5 -#define ItVpdIplParms 6 -#define ItVpdMsVpd 7 -#define ItVpdPanelVpd 8 -#define ItVpdLpNaca 9 -#define ItVpdBackplaneAndMaybeClockCardVpd 10 -#define ItVpdRecoveryLogBuffer 11 -#define ItVpdSpCommArea 12 -#define ItVpdSpLogBuffer 13 -#define ItVpdSpLogBufferSave 14 -#define ItVpdSpCardVpd 15 -#define ItVpdFirstProcVpd 16 -#define ItVpdApModelVpd 17 -#define ItVpdClockCardVpd 18 -#define ItVpdBusExtCardVpd 19 -#define ItVpdProcCapacityVpd 20 -#define ItVpdInteractiveCapacityVpd 21 -#define ItVpdFirstSlotLabel 22 -#define ItVpdFirstLpQueue 23 -#define ItVpdFirstL3CacheVpd 24 -#define ItVpdFirstProcFruVpd 25 - -#define ItVpdMaxEntries 26 +/* VPD Entry index is carved in stone - cannot be changed (easily). 
*/ +#define ItVpdCecVpd 0 +#define ItVpdDynamicSpace 1 +#define ItVpdExtVpd 2 +#define ItVpdExtVpdOnPanel 3 +#define ItVpdFirstPaca 4 +#define ItVpdIoVpd 5 +#define ItVpdIplParms 6 +#define ItVpdMsVpd 7 +#define ItVpdPanelVpd 8 +#define ItVpdLpNaca 9 +#define ItVpdBackplaneAndMaybeClockCardVpd 10 +#define ItVpdRecoveryLogBuffer 11 +#define ItVpdSpCommArea 12 +#define ItVpdSpLogBuffer 13 +#define ItVpdSpLogBufferSave 14 +#define ItVpdSpCardVpd 15 +#define ItVpdFirstProcVpd 16 +#define ItVpdApModelVpd 17 +#define ItVpdClockCardVpd 18 +#define ItVpdBusExtCardVpd 19 +#define ItVpdProcCapacityVpd 20 +#define ItVpdInteractiveCapacityVpd 21 +#define ItVpdFirstSlotLabel 22 +#define ItVpdFirstLpQueue 23 +#define ItVpdFirstL3CacheVpd 24 +#define ItVpdFirstProcFruVpd 25 +#define ItVpdMaxEntries 26 -#define ItDmaMaxEntries 10 +#define ItDmaMaxEntries 10 -#define ItVpdAreasMaxSlotLabels 192 +#define ItVpdAreasMaxSlotLabels 192 struct SlicVpdAdrs { u32 pad1; - void * vpdAddr; + void *vpdAddr; }; -struct ItVpdAreas -{ - u32 xSlicDesc; // Descriptor 000-003 - u16 xSlicSize; // Size of this control block 004-005 - u16 xPlicAdjustVpdLens:1; // Flag to indicate new interface 006-007 - u16 xRsvd1:15; // Reserved bits ... 
- u16 xSlicVpdEntries; // Number of VPD entries 008-009 - u16 xSlicDmaEntries; // Number of DMA entries 00A-00B - u16 xSlicMaxLogicalProcs; // Maximum logical processors 00C-00D - u16 xSlicMaxPhysicalProcs; // Maximum physical processors 00E-00F - u16 xSlicDmaToksOffset; // Offset into this of array 010-011 - u16 xSlicVpdAdrsOffset; // Offset into this of array 012-013 - u16 xSlicDmaLensOffset; // Offset into this of array 014-015 - u16 xSlicVpdLensOffset; // Offset into this of array 016-017 - u16 xSlicMaxSlotLabels; // Maximum number of slot labels 018-019 - u16 xSlicMaxLpQueues; // Maximum number of LP Queues 01A-01B - u8 xRsvd2[4]; // Reserved 01C-01F - u64 xRsvd3[12]; // Reserved 020-07F - u32 xPlicDmaLens[ItDmaMaxEntries];// Array of DMA lengths 080-0A7 - u32 xPlicDmaToks[ItDmaMaxEntries];// Array of DMA tokens 0A8-0CF - u32 xSlicVpdLens[ItVpdMaxEntries];// Array of VPD lengths 0D0-12F - void * xSlicVpdAdrs[ItVpdMaxEntries];// Array of VPD buffers 130-1EF +struct ItVpdAreas { + u32 xSlicDesc; // Descriptor 000-003 + u16 xSlicSize; // Size of this control block 004-005 + u16 xPlicAdjustVpdLens:1; // Flag to indicate new interface006-007 + u16 xRsvd1:15; // Reserved bits ... 
+ u16 xSlicVpdEntries; // Number of VPD entries 008-009 + u16 xSlicDmaEntries; // Number of DMA entries 00A-00B + u16 xSlicMaxLogicalProcs; // Maximum logical processors 00C-00D + u16 xSlicMaxPhysicalProcs; // Maximum physical processors 00E-00F + u16 xSlicDmaToksOffset; // Offset into this of array 010-011 + u16 xSlicVpdAdrsOffset; // Offset into this of array 012-013 + u16 xSlicDmaLensOffset; // Offset into this of array 014-015 + u16 xSlicVpdLensOffset; // Offset into this of array 016-017 + u16 xSlicMaxSlotLabels; // Maximum number of slot labels018-019 + u16 xSlicMaxLpQueues; // Maximum number of LP Queues 01A-01B + u8 xRsvd2[4]; // Reserved 01C-01F + u64 xRsvd3[12]; // Reserved 020-07F + u32 xPlicDmaLens[ItDmaMaxEntries];// Array of DMA lengths 080-0A7 + u32 xPlicDmaToks[ItDmaMaxEntries];// Array of DMA tokens 0A8-0CF + u32 xSlicVpdLens[ItVpdMaxEntries];// Array of VPD lengths 0D0-12F + void *xSlicVpdAdrs[ItVpdMaxEntries];// Array of VPD buffers 130-1EF }; #endif /* _ITVPDAREAS_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparData.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/LparData.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparData.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/LparData.h 2005-06-01 17:12:42.000000000 +1000 @@ -1,17 +1,17 @@ /* * LparData.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -34,16 +34,15 @@ #include #include -extern struct LparMap xLparMap; -extern struct HvReleaseData hvReleaseData; -extern struct ItLpNaca itLpNaca; -extern struct ItIplParmsReal xItIplParmsReal; -extern struct ItExtVpdPanel xItExtVpdPanel; -extern struct IoHriProcessorVpd xIoHriProcessorVpd[]; -extern struct ItLpQueue xItLpQueue; -extern struct ItVpdAreas itVpdAreas; -extern u64 xMsVpd[]; -extern struct msChunks msChunks; - +extern struct LparMap xLparMap; +extern struct HvReleaseData hvReleaseData; +extern struct ItLpNaca itLpNaca; +extern struct ItIplParmsReal xItIplParmsReal; +extern struct ItExtVpdPanel xItExtVpdPanel; +extern struct IoHriProcessorVpd xIoHriProcessorVpd[]; +extern struct ItLpQueue xItLpQueue; +extern struct ItVpdAreas itVpdAreas; +extern u64 xMsVpd[]; +extern struct msChunks msChunks; #endif /* _LPARDATA_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparMap.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/LparMap.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/LparMap.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/LparMap.h 2005-06-01 17:14:45.000000000 +1000 @@ -1,17 +1,17 @@ /* * LparMap.h * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -21,13 +21,14 @@ #include -/* The iSeries hypervisor will set up mapping for one or more +/* + * The iSeries hypervisor will set up mapping for one or more * ESID/VSID pairs (in SLB/segment registers) and will set up * mappings of one or more ranges of pages to VAs. * We will have the hypervisor set up the ESID->VSID mapping * for the four kernel segments (C-F). With shared processors, * the hypervisor will clear all segment registers and reload - * these four whenever the processor is switched from one + * these four whenever the processor is switched from one * partition to another. */ @@ -38,30 +39,29 @@ * need to be located within the load area (if the total partition size * is 64 MB), but cannot be mapped. Typically, this should specify * to map half (32 MB) of the load area. - * - * The hypervisor will set up page table entries for the number of + * + * The hypervisor will set up page table entries for the number of * pages specified. * * In 32-bit mode, the hypervisor will load all four of the - * segment registers (identified by the low-order four bits of the + * segment registers (identified by the low-order four bits of the * Esid field. In 64-bit mode, the hypervisor will load one SLB * entry to map the Esid to the Vsid. 
*/ -// Hypervisor initially maps 32MB of the load area -#define HvPagesToMap 8192 +/* Hypervisor initially maps 32MB of the load area */ +#define HvPagesToMap 8192 -struct LparMap -{ - u64 xNumberEsids; // Number of ESID/VSID pairs (1) - u64 xNumberRanges; // Number of VA ranges to map (1) - u64 xSegmentTableOffs; // Page number within load area of seg table (0) - u64 xRsvd[5]; // Reserved (0) - u64 xKernelEsid; // Esid used to map kernel load (0x0C00000000) - u64 xKernelVsid; // Vsid used to map kernel load (0x0C00000000) - u64 xPages; // Number of pages to be mapped (8192) - u64 xOffset; // Offset from start of load area (0) - u64 xVPN; // Virtual Page Number (0x000C000000000000) +struct LparMap { + u64 xNumberEsids; // Number of ESID/VSID pairs (1) + u64 xNumberRanges; // Number of VA ranges to map (1) + u64 xSegmentTableOffs; // Page number within load area of seg table (0) + u64 xRsvd[5]; + u64 xKernelEsid; // Esid used to map kernel load (0x0C00000000) + u64 xKernelVsid; // Vsid used to map kernel load (0x0C00000000) + u64 xPages; // Number of pages to be mapped (8192) + u64 xOffset; // Offset from start of load area (0) + u64 xVPN; // Virtual Page Number (0x000C000000000000) }; #endif /* _LPARMAP_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/XmPciLpEvent.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/XmPciLpEvent.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/XmPciLpEvent.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/XmPciLpEvent.h 2005-06-03 14:33:41.000000000 +1000 @@ -1,8 +1,6 @@ - #ifndef __XMPCILPEVENT_H__ #define __XMPCILPEVENT_H__ - #ifdef __cplusplus extern "C" { #endif @@ -10,7 +8,6 @@ extern "C" { int XmPciLpEvent_init(void); void ppc_irq_dispatch_handler(struct pt_regs *regs, int irq); - #ifdef __cplusplus } #endif diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/iSeries_io.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_io.h --- 
linus-iSeries-headers.2/include/asm-ppc64/iSeries/iSeries_io.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_io.h 2005-06-01 17:17:53.000000000 +1000 @@ -5,32 +5,33 @@ #ifdef CONFIG_PPC_ISERIES #include -/************************************************************************/ -/* File iSeries_io.h created by Allan Trautman on Thu Dec 28 2000. */ -/************************************************************************/ -/* Remaps the io.h for the iSeries Io */ -/* Copyright (C) 20yy Allan H Trautman, IBM Corporation */ -/* */ -/* This program is free software; you can redistribute it and/or modify */ -/* it under the terms of the GNU General Public License as published by */ -/* the Free Software Foundation; either version 2 of the License, or */ -/* (at your option) any later version. */ -/* */ -/* This program is distributed in the hope that it will be useful, */ -/* but WITHOUT ANY WARRANTY; without even the implied warranty of */ -/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */ -/* GNU General Public License for more details. */ -/* */ -/* You should have received a copy of the GNU General Public License */ -/* along with this program; if not, write to the: */ -/* Free Software Foundation, Inc., */ -/* 59 Temple Place, Suite 330, */ -/* Boston, MA 02111-1307 USA */ -/************************************************************************/ -/* Change Activity: */ -/* Created December 28, 2000 */ -/* End Change Activity */ -/************************************************************************/ +/* + * File iSeries_io.h created by Allan Trautman on Thu Dec 28 2000. 
+ * + * Remaps the io.h for the iSeries Io + * Copyright (C) 2000 Allan H Trautman, IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the: + * Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, + * Boston, MA 02111-1307 USA + * + * Change Activity: + * Created December 28, 2000 + * End Change Activity + */ + extern u8 iSeries_Read_Byte(const volatile void __iomem * IoAddress); extern u16 iSeries_Read_Word(const volatile void __iomem * IoAddress); extern u32 iSeries_Read_Long(const volatile void __iomem * IoAddress); @@ -39,8 +40,10 @@ extern void iSeries_Write_Word(u16 IoDat extern void iSeries_Write_Long(u32 IoData, volatile void __iomem * IoAddress); extern void iSeries_memset_io(volatile void __iomem *dest, char x, size_t n); -extern void iSeries_memcpy_toio(volatile void __iomem *dest, void *source, size_t n); -extern void iSeries_memcpy_fromio(void *dest, const volatile void __iomem *source, size_t n); +extern void iSeries_memcpy_toio(volatile void __iomem *dest, void *source, + size_t n); +extern void iSeries_memcpy_fromio(void *dest, + const volatile void __iomem *source, size_t n); #endif /* CONFIG_PPC_ISERIES */ #endif /* _ISERIES_IO_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/iSeries_pci.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_pci.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/iSeries_pci.h 2005-05-20 
09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-03 14:36:47.000000000 +1000 @@ -1,112 +1,113 @@ #ifndef _ISERIES_64_PCI_H #define _ISERIES_64_PCI_H -/************************************************************************/ -/* File iSeries_pci.h created by Allan Trautman on Tue Feb 20, 2001. */ -/************************************************************************/ -/* Define some useful macros for the iSeries pci routines. */ -/* Copyright (C) 2001 Allan H Trautman, IBM Corporation */ -/* */ -/* This program is free software; you can redistribute it and/or modify */ -/* it under the terms of the GNU General Public License as published by */ -/* the Free Software Foundation; either version 2 of the License, or */ -/* (at your option) any later version. */ -/* */ -/* This program is distributed in the hope that it will be useful, */ -/* but WITHOUT ANY WARRANTY; without even the implied warranty of */ -/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */ -/* GNU General Public License for more details. */ -/* */ -/* You should have received a copy of the GNU General Public License */ -/* along with this program; if not, write to the: */ -/* Free Software Foundation, Inc., */ -/* 59 Temple Place, Suite 330, */ -/* Boston, MA 02111-1307 USA */ -/************************************************************************/ -/* Change Activity: */ -/* Created Feb 20, 2001 */ -/* Added device reset, March 22, 2001 */ -/* Ported to ppc64, May 25, 2001 */ -/* End Change Activity */ -/************************************************************************/ +/* + * File iSeries_pci.h created by Allan Trautman on Tue Feb 20, 2001. + * + * Define some useful macros for the iSeries pci routines. 
+ * Copyright (C) 2001 Allan H Trautman, IBM Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the: + * Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, + * Boston, MA 02111-1307 USA + * + * Change Activity: + * Created Feb 20, 2001 + * Added device reset, March 22, 2001 + * Ported to ppc64, May 25, 2001 + * End Change Activity + */ #include #include -struct pci_dev; /* For Forward Reference */ +struct pci_dev; /* For Forward Reference */ struct iSeries_Device_Node; -/************************************************************************/ -/* Gets iSeries Bus, SubBus, DevFn using iSeries_Device_Node structure */ -/************************************************************************/ +/* + * Gets iSeries Bus, SubBus, DevFn using iSeries_Device_Node structure + */ #define ISERIES_BUS(DevPtr) DevPtr->DsaAddr.Dsa.busNumber #define ISERIES_SUBBUS(DevPtr) DevPtr->DsaAddr.Dsa.subBusNumber #define ISERIES_DEVICE(DevPtr) DevPtr->DsaAddr.Dsa.deviceId #define ISERIES_DSA(DevPtr) DevPtr->DsaAddr.DsaAddr #define ISERIES_DEVFUN(DevPtr) DevPtr->DevFn -#define ISERIES_DEVNODE(PciDev) ((struct iSeries_Device_Node*)PciDev->sysdata) +#define ISERIES_DEVNODE(PciDev) ((struct iSeries_Device_Node*)PciDev->sysdata) #define EADsMaxAgents 7 -/************************************************************************/ -/* Decodes Linux DevFn to iSeries DevFn, bridge device, or function. 
*/ -/* For Linux, see PCI_SLOT and PCI_FUNC in include/linux/pci.h */ -/************************************************************************/ +/* + * Decodes Linux DevFn to iSeries DevFn, bridge device, or function. + * For Linux, see PCI_SLOT and PCI_FUNC in include/linux/pci.h + */ -#define ISERIES_PCI_AGENTID(idsel,func) ((idsel & 0x0F) << 4) | (func & 0x07) -#define ISERIES_ENCODE_DEVICE(agentid) ((0x10) | ((agentid&0x20)>>2) | (agentid&07)) +#define ISERIES_PCI_AGENTID(idsel, func) \ + ((idsel & 0x0F) << 4) | (func & 0x07) +#define ISERIES_ENCODE_DEVICE(agentid) \ + ((0x10) | ((agentid & 0x20) >> 2) | (agentid & 0x07)) -#define ISERIES_GET_DEVICE_FROM_SUBBUS(subbus) ((subbus >> 5) & 0x7) -#define ISERIES_GET_FUNCTION_FROM_SUBBUS(subbus) ((subbus >> 2) & 0x7) +#define ISERIES_GET_DEVICE_FROM_SUBBUS(subbus) ((subbus >> 5) & 0x7) +#define ISERIES_GET_FUNCTION_FROM_SUBBUS(subbus) ((subbus >> 2) & 0x7) /* * N.B. the ISERIES_DECODE_* macros are not used anywhere, and I think * the 0x71 (at least) must be wrong - 0x78 maybe? -- paulus. 
*/ -#define ISERIES_DECODE_DEVFN(linuxdevfn) (((linuxdevfn & 0x71) << 1) | (linuxdevfn & 0x07)) -#define ISERIES_DECODE_DEVICE(linuxdevfn) (((linuxdevfn & 0x38) >> 3) |(((linuxdevfn & 0x40) >> 2) + 0x10)) -#define ISERIES_DECODE_FUNCTION(linuxdevfn) (linuxdevfn & 0x07) - -/************************************************************************/ -/* Converts Virtual Address to Real Address for Hypervisor calls */ -/************************************************************************/ - -#define ISERIES_HV_ADDR(virtaddr) (0x8000000000000000 | virt_to_abs(virtaddr)) - -/************************************************************************/ -/* iSeries Device Information */ -/************************************************************************/ +#define ISERIES_DECODE_DEVFN(linuxdevfn) \ + (((linuxdevfn & 0x71) << 1) | (linuxdevfn & 0x07)) +#define ISERIES_DECODE_DEVICE(linuxdevfn) \ + (((linuxdevfn & 0x38) >> 3) | (((linuxdevfn & 0x40) >> 2) + 0x10)) +#define ISERIES_DECODE_FUNCTION(linuxdevfn) \ + (linuxdevfn & 0x07) +/* + * Converts Virtual Address to Real Address for Hypervisor calls + */ +#define ISERIES_HV_ADDR(virtaddr) \ + (0x8000000000000000 | virt_to_abs(virtaddr)) + +/* + * iSeries Device Information + */ struct iSeries_Device_Node { struct list_head Device_List; - struct pci_dev* PciDev; /* Pointer to pci_dev structure*/ - union HvDsaMap DsaAddr; /* Direct Select Address */ - /* busNumber,subBusNumber, */ - /* deviceId, barNumber */ - HvAgentId AgentId; /* Hypervisor DevFn */ - int DevFn; /* Linux devfn */ - int BarOffset; - int Irq; /* Assigned IRQ */ - int ReturnCode; /* Return Code Holder */ - int IoRetry; /* Current Retry Count */ - int Flags; /* Possible flags(disable/bist)*/ - u16 Vendor; /* Vendor ID */ - u8 LogicalSlot; /* Hv Slot Index for Tces */ - struct iommu_table* iommu_table;/* Device TCE Table */ - u8 PhbId; /* Phb Card is on. 
*/ - u16 Board; /* Board Number */ - u8 FrameId; /* iSeries spcn Frame Id */ - char CardLocation[4];/* Char format of planar vpd */ - char Location[20]; /* Frame 1, Card C10 */ + struct pci_dev *PciDev; + union HvDsaMap DsaAddr; /* Direct Select Address */ + /* busNumber, subBusNumber, */ + /* deviceId, barNumber */ + HvAgentId AgentId; /* Hypervisor DevFn */ + int DevFn; /* Linux devfn */ + int BarOffset; + int Irq; /* Assigned IRQ */ + int ReturnCode; /* Return Code Holder */ + int IoRetry; /* Current Retry Count */ + int Flags; /* Possible flags(disable/bist)*/ + u16 Vendor; /* Vendor ID */ + u8 LogicalSlot; /* Hv Slot Index for Tces */ + struct iommu_table *iommu_table;/* Device TCE Table */ + u8 PhbId; /* Phb Card is on. */ + u16 Board; /* Board Number */ + u8 FrameId; /* iSeries spcn Frame Id */ + char CardLocation[4];/* Char format of planar vpd */ + char Location[20]; /* Frame 1, Card C10 */ }; -/************************************************************************/ -/* Functions */ -/************************************************************************/ - -extern int iSeries_Device_Information(struct pci_dev*,char*, int); -extern void iSeries_Get_Location_Code(struct iSeries_Device_Node*); -extern int iSeries_Device_ToggleReset(struct pci_dev* PciDev, int AssertTime, int DelayTime); +extern int iSeries_Device_Information(struct pci_dev*, char*, int); +extern void iSeries_Get_Location_Code(struct iSeries_Device_Node*); +extern int iSeries_Device_ToggleReset(struct pci_dev* PciDev, + int AssertTime, int DelayTime); #endif /* _ISERIES_64_PCI_H */ diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/mf.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/mf.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/mf.h 2005-05-26 10:44:08.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/mf.h 2005-06-01 17:33:13.000000000 +1000 @@ -9,17 +9,16 @@ * all partitions in the iSeries. 
It also provides miscellaneous low-level * machine facility type operations. * - * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA diff -ruNp linus-iSeries-headers.2/include/asm-ppc64/iSeries/vio.h linus-iSeries-headers.3/include/asm-ppc64/iSeries/vio.h --- linus-iSeries-headers.2/include/asm-ppc64/iSeries/vio.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.3/include/asm-ppc64/iSeries/vio.h 2005-06-03 14:37:50.000000000 +1000 @@ -8,32 +8,32 @@ * Colin Devilbiss * * (C) Copyright 2000 IBM Corporation - * + * * This header file is used by the iSeries virtual I/O device * drivers. It defines the interfaces to the common functions * (implemented in drivers/char/viopath.h) as well as defining - * common functions and structures. Currently (at the time I + * common functions and structures. Currently (at the time I * wrote this comment) the iSeries virtual I/O device drivers - * that use this are - * drivers/block/viodasd.c + * that use this are + * drivers/block/viodasd.c * drivers/char/viocons.c * drivers/char/viotape.c * drivers/cdrom/viocd.c * * The iSeries virtual ethernet support (veth.c) uses a whole * different set of functions. 
- * + * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License as * published by the Free Software Foundation; either version 2 of the * License, or (at your option) anyu later version. * * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of + * WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. + * General Public License for more details. * - * You should have received a copy of the GNU General Public License + * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software Foundation, * Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * @@ -44,14 +44,16 @@ #include #include -/* iSeries virtual I/O events use the subtype field in +/* + * iSeries virtual I/O events use the subtype field in * HvLpEvent to figure out what kind of vio event is coming * in. We use a table to route these, and this defines * the maximum number of distinct subtypes */ #define VIO_MAX_SUBTYPES 8 -/* Each subtype can register a handler to process their events. +/* + * Each subtype can register a handler to process their events. * The handler must have this interface. 
*/ typedef void (vio_event_handler_t) (struct HvLpEvent * event); @@ -70,13 +72,13 @@ void vio_free_event_buffer(int subtype, extern HvLpIndex viopath_hostLp; extern HvLpIndex viopath_ourLp; -#define VIOCHAR_MAX_DATA 200 +#define VIOCHAR_MAX_DATA 200 -#define VIOMAJOR_SUBTYPE_MASK 0xff00 -#define VIOMINOR_SUBTYPE_MASK 0x00ff -#define VIOMAJOR_SUBTYPE_SHIFT 8 +#define VIOMAJOR_SUBTYPE_MASK 0xff00 +#define VIOMINOR_SUBTYPE_MASK 0x00ff +#define VIOMAJOR_SUBTYPE_SHIFT 8 -#define VIOVERSION 0x0101 +#define VIOVERSION 0x0101 /* * This is the general structure for VIO errors; each module should have @@ -89,8 +91,8 @@ struct vio_error_entry { int errno; const char *msg; }; -const struct vio_error_entry *vio_lookup_rc(const struct vio_error_entry - *local_table, u16 rc); +extern const struct vio_error_entry *vio_lookup_rc( + const struct vio_error_entry *local_table, u16 rc); enum viosubtypes { viomajorsubtype_monitor = 0x0100, @@ -102,7 +104,6 @@ enum viosubtypes { viomajorsubtype_scsi = 0x0700 }; - enum vioconfigsubtype { vioconfigget = 0x0001, }; From sfr at canb.auug.org.au Fri Jun 3 18:07:12 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:07:12 +1000 Subject: [PATCH 1/10] ppc64 iSeries: remove iSeries_proc.h In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: <20050603180712.79da4ff5.sfr@canb.auug.org.au> Hi Andrew, include/asm-ppc64/iSeries/iSeries_proc.h just contains a declaration of a function that no longer exists. Remove it. 
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus/arch/ppc64/kernel/iSeries_proc.c linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_proc.c --- linus/arch/ppc64/kernel/iSeries_proc.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_proc.c 2005-06-01 17:53:28.000000000 +1000 @@ -29,7 +29,6 @@ #include #include #include -#include static int __init iseries_proc_create(void) { diff -ruNp linus/arch/ppc64/kernel/iSeries_setup.c linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_setup.c --- linus/arch/ppc64/kernel/iSeries_setup.c 2005-06-03 09:03:05.000000000 +1000 +++ linus-iSeries-headers.1/arch/ppc64/kernel/iSeries_setup.c 2005-06-03 09:24:03.000000000 +1000 @@ -55,7 +55,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus/arch/ppc64/kernel/viopath.c linus-iSeries-headers.1/arch/ppc64/kernel/viopath.c --- linus/arch/ppc64/kernel/viopath.c 2005-05-20 09:03:15.000000000 +1000 +++ linus-iSeries-headers.1/arch/ppc64/kernel/viopath.c 2005-06-01 17:54:00.000000000 +1000 @@ -48,7 +48,6 @@ #include #include #include -#include #include /* Status of the path to each other partition in the system. diff -ruNp linus/include/asm-ppc64/iSeries/iSeries_proc.h linus-iSeries-headers.1/include/asm-ppc64/iSeries/iSeries_proc.h --- linus/include/asm-ppc64/iSeries/iSeries_proc.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.1/include/asm-ppc64/iSeries/iSeries_proc.h 1970-01-01 10:00:00.000000000 +1000 @@ -1,24 +0,0 @@ -/* - * iSeries_proc.h - * Copyright (C) 2001 Kyle A. Lucke IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. 
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
- */
-#ifndef _ISERIES_PROC_H
-#define _ISERIES_PROC_H
-
-extern void iSeries_proc_early_init(void);
-
-#endif /* _iSeries_PROC_H */

From sfr at canb.auug.org.au Fri Jun 3 18:13:48 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Fri, 3 Jun 2005 18:13:48 +1000
Subject: [PATCH 4/10] ppc64 iSeries: obvious code simplifications
In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au>
Message-ID: <20050603181348.7fd6cd8d.sfr@canb.auug.org.au>

Hi Andrew,

This patch does some obvious code cleanups in the iSeries header files.
- simplifies the bodies of lots of inline functions
- parenthesises a macro's result
- removes C++ wrapping
- adds "extern" to some function declarations

There are no semantic changes.
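
For readers wondering why parenthesising a macro's result matters, the following is a minimal sketch. The macro and function names are hypothetical and are not taken from the patch; they only borrow the general shape of a bit-packing macro. Without the outer parentheses, an operator applied to the macro "call" binds to just part of the expansion.

```c
#include <assert.h>

/* Hypothetical macros for illustration only; not the ones in the patch. */
#define AGENTID_BARE(idsel, func)  ((idsel & 0x0F) << 4) | (func & 0x07)
#define AGENTID_PAREN(idsel, func) (((idsel & 0x0F) << 4) | (func & 0x07))

/*
 * Intends to mask the packed value down to its low nibble.
 * With the bare macro this expands to
 *     ((idsel & 0x0F) << 4) | ((func & 0x07) & 0x0F)
 * because '&' binds tighter than '|', so the mask only reaches
 * the right-hand operand.
 */
static int low_nibble_bare(int idsel, int func)
{
	return AGENTID_BARE(idsel, func) & 0x0F;
}

/* Same expression, but the macro result is a single parenthesised unit. */
static int low_nibble_paren(int idsel, int func)
{
	return AGENTID_PAREN(idsel, func) & 0x0F;
}

/* Small self-check showing that the two forms disagree. */
static void check_macro_precedence(void)
{
	assert(low_nibble_bare(1, 2) == 0x12);  /* high nibble leaks through */
	assert(low_nibble_paren(1, 2) == 0x02); /* masked as intended */
}
```

A macro that is only ever used as a whole expression (for example as a lone function argument) behaves the same either way, which would be consistent with the "no semantic changes" note above; the parentheses guard against future callers.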
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallCfg.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallCfg.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-03 14:03:07.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-01 15:04:06.000000000 +1000 @@ -69,37 +69,27 @@ enum HvCallCfg_ReqQual { static inline HvLpIndex HvCallCfg_getLps(void) { - HvLpIndex retVal = HvCall0(HvCallCfgGetLps); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall0(HvCallCfgGetLps); } static inline int HvCallCfg_isBusDedicated(u64 busIndex) { - int retVal = HvCall1(HvCallCfgIsBusDedicated,busIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallCfgIsBusDedicated, busIndex); } static inline HvLpIndex HvCallCfg_getBusOwner(u64 busIndex) { - HvLpIndex retVal = HvCall1(HvCallCfgGetBusOwner,busIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallCfgGetBusOwner, busIndex); } static inline HvLpIndexMap HvCallCfg_getBusAllocation(u64 busIndex) { - HvLpIndexMap retVal = HvCall1(HvCallCfgGetBusAllocation,busIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallCfgGetBusAllocation, busIndex); } static inline HvLpIndexMap HvCallCfg_getActiveLpMap(void) { - HvLpIndexMap retVal = HvCall0(HvCallCfgGetActiveLpMap); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall0(HvCallCfgGetActiveLpMap); } static inline HvLpVirtualLanIndexMap HvCallCfg_getVirtualLanIndexMap( @@ -112,23 +102,18 @@ static inline HvLpVirtualLanIndexMap HvC u64 retVal = HvCall1(HvCallCfgGetVirtualLanIndexMap, lp); if (retVal == -1) retVal = 0; - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return retVal; } static inline u64 
HvCallCfg_getSystemMsChunks(void) { - u64 retVal = HvCall0(HvCallCfgGetSystemMsChunks); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall0(HvCallCfgGetSystemMsChunks); } static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp, enum HvCallCfg_ReqQual qual) { - u64 retVal = HvCall2(HvCallCfgGetMsChunks,lp,qual); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall2(HvCallCfgGetMsChunks, lp, qual); } static inline u64 HvCallCfg_getMinRuntimeMsChunks(HvLpIndex lp) @@ -142,65 +127,51 @@ static inline u64 HvCallCfg_getMinRuntim static inline u64 HvCallCfg_setMinRuntimeMsChunks(u64 chunks) { - u64 retVal = HvCall1(HvCallCfgSetMinRuntimeMsChunks,chunks); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallCfgSetMinRuntimeMsChunks, chunks); } static inline u64 HvCallCfg_getSystemPhysicalProcessors(void) { - u64 retVal = HvCall0(HvCallCfgGetSystemPhysicalProcessors); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall0(HvCallCfgGetSystemPhysicalProcessors); } static inline u64 HvCallCfg_getPhysicalProcessors(HvLpIndex lp, enum HvCallCfg_ReqQual qual) { - u64 retVal = HvCall2(HvCallCfgGetPhysicalProcessors,lp,qual); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall2(HvCallCfgGetPhysicalProcessors, lp, qual); } static inline u64 HvCallCfg_getConfiguredBusUnitsForInterruptProc(HvLpIndex lp, u16 hvLogicalProcIndex) { - u64 retVal = HvCall2(HvCallCfgGetConfiguredBusUnitsForIntProc,lp,hvLogicalProcIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall2(HvCallCfgGetConfiguredBusUnitsForIntProc, lp, + hvLogicalProcIndex); } static inline HvLpSharedPoolIndex HvCallCfg_getSharedPoolIndex(HvLpIndex lp) { - HvLpSharedPoolIndex retVal = - HvCall1(HvCallCfgGetSharedPoolIndex,lp); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallCfgGetSharedPoolIndex, lp); } 
static inline u64 HvCallCfg_getSharedProcUnits(HvLpIndex lp, enum HvCallCfg_ReqQual qual) { - u64 retVal = HvCall2(HvCallCfgGetSharedProcUnits,lp,qual); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall2(HvCallCfgGetSharedProcUnits, lp, qual); } static inline u64 HvCallCfg_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { - u16 retVal = HvCall1(HvCallCfgGetNumProcsInSharedPool,sPI); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); + u16 retVal = HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); return retVal; } static inline HvLpIndex HvCallCfg_getHostingLpIndex(HvLpIndex lp) { - u64 retVal = HvCall1(HvCallCfgGetHostingLpIndex,lp); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); + u64 retVal = HvCall1(HvCallCfgGetHostingLpIndex, lp); return retVal; } diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallEvent.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallEvent.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallEvent.h 2005-06-03 14:04:46.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallEvent.h 2005-06-01 15:11:03.000000000 +1000 @@ -82,13 +82,11 @@ typedef u64 HvLpDma_Rc; static inline void HvCallEvent_getOverflowLpEvents(u8 queueIndex) { HvCall1(HvCallEventGetOverflowLpEvents, queueIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallEvent_setInterLpQueueIndex(u8 queueIndex) { HvCall1(HvCallEventSetInterLpQueueIndex, queueIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallEvent_setLpEventStack(u8 queueIndex, @@ -99,7 +97,6 @@ static inline void HvCallEvent_setLpEven abs_addr = virt_to_abs(eventStackAddr); HvCall3(HvCallEventSetLpEventStack, queueIndex, abs_addr, eventStackSize); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallEvent_setLpEventQueueInterruptProc(u8 queueIndex, @@ -107,22 +104,18 @@ static inline void HvCallEvent_setLpEven { 
HvCall2(HvCallEventSetLpEventQueueInterruptProc, queueIndex, lpLogicalProcIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline HvLpEvent_Rc HvCallEvent_signalLpEvent(struct HvLpEvent *event) { u64 abs_addr; - HvLpEvent_Rc retVal; #ifdef DEBUG_SENDEVENT printk("HvCallEvent_signalLpEvent: *event = %016lx\n ", (unsigned long)event); #endif abs_addr = virt_to_abs(event); - retVal = (HvLpEvent_Rc)HvCall1(HvCallEventSignalLpEvent, abs_addr); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallEventSignalLpEvent, abs_addr); } static inline HvLpEvent_Rc HvCallEvent_signalLpEventFast(HvLpIndex targetLp, @@ -132,8 +125,6 @@ static inline HvLpEvent_Rc HvCallEvent_s u64 eventData1, u64 eventData2, u64 eventData3, u64 eventData4, u64 eventData5) { - HvLpEvent_Rc retVal; - /* Pack the misc bits into a single Dword to pass to PLIC */ union { struct HvCallEvent_PackedParms parms; @@ -148,67 +139,49 @@ static inline HvLpEvent_Rc HvCallEvent_s packed.parms.xSourceInstId = sourceInstanceId; packed.parms.xTargetInstId = targetInstanceId; - retVal = (HvLpEvent_Rc)HvCall7(HvCallEventSignalLpEventParms, - packed.dword, correlationToken, eventData1,eventData2, - eventData3,eventData4, eventData5); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall7(HvCallEventSignalLpEventParms, packed.dword, + correlationToken, eventData1, eventData2, + eventData3, eventData4, eventData5); } static inline HvLpEvent_Rc HvCallEvent_ackLpEvent(struct HvLpEvent *event) { u64 abs_addr; - HvLpEvent_Rc retVal; abs_addr = virt_to_abs(event); - retVal = (HvLpEvent_Rc)HvCall1(HvCallEventAckLpEvent, abs_addr); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallEventAckLpEvent, abs_addr); } static inline HvLpEvent_Rc HvCallEvent_cancelLpEvent(struct HvLpEvent *event) { u64 abs_addr; - HvLpEvent_Rc retVal; abs_addr = virt_to_abs(event); - retVal = 
(HvLpEvent_Rc)HvCall1(HvCallEventCancelLpEvent, abs_addr); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallEventCancelLpEvent, abs_addr); } static inline HvLpInstanceId HvCallEvent_getSourceLpInstanceId( HvLpIndex targetLp, HvLpEvent_Type type) { - HvLpInstanceId retVal; - - retVal = HvCall2(HvCallEventGetSourceLpInstanceId, targetLp, type); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall2(HvCallEventGetSourceLpInstanceId, targetLp, type); } static inline HvLpInstanceId HvCallEvent_getTargetLpInstanceId( HvLpIndex targetLp, HvLpEvent_Type type) { - HvLpInstanceId retVal; - - retVal = HvCall2(HvCallEventGetTargetLpInstanceId, targetLp, type); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall2(HvCallEventGetTargetLpInstanceId, targetLp, type); } static inline void HvCallEvent_openLpEventPath(HvLpIndex targetLp, HvLpEvent_Type type) { HvCall2(HvCallEventOpenLpEventPath, targetLp, type); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallEvent_closeLpEventPath(HvLpIndex targetLp, HvLpEvent_Type type) { HvCall2(HvCallEventCloseLpEventPath, targetLp, type); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline HvLpDma_Rc HvCallEvent_dmaBufList(HvLpEvent_Type type, @@ -220,7 +193,6 @@ static inline HvLpDma_Rc HvCallEvent_dma /* Do these need to be converted to absolute addresses? 
*/ u64 localBufList, u64 remoteBufList, u32 transferLength) { - HvLpDma_Rc retVal; /* Pack the misc bits into a single Dword to pass to PLIC */ union { struct HvCallEvent_PackedDmaParms parms; @@ -237,11 +209,8 @@ static inline HvLpDma_Rc HvCallEvent_dma packed.parms.xLocalInstId = localInstanceId; packed.parms.xRemoteInstId = remoteInstanceId; - retVal = (HvLpDma_Rc)HvCall4(HvCallEventDmaBufList, - packed.dword, localBufList, remoteBufList, - transferLength); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall4(HvCallEventDmaBufList, packed.dword, localBufList, + remoteBufList, transferLength); } static inline HvLpDma_Rc HvCallEvent_dmaSingle(HvLpEvent_Type type, @@ -252,7 +221,6 @@ static inline HvLpDma_Rc HvCallEvent_dma HvLpDma_AddressType remoteAddressType, u64 localAddrOrTce, u64 remoteAddrOrTce, u32 transferLength) { - HvLpDma_Rc retVal; /* Pack the misc bits into a single Dword to pass to PLIC */ union { struct HvCallEvent_PackedDmaParms parms; @@ -269,24 +237,17 @@ static inline HvLpDma_Rc HvCallEvent_dma packed.parms.xLocalInstId = localInstanceId; packed.parms.xRemoteInstId = remoteInstanceId; - retVal = (HvLpDma_Rc)HvCall4(HvCallEventDmaSingle, - packed.dword, localAddrOrTce, remoteAddrOrTce, - transferLength); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return (HvLpDma_Rc)HvCall4(HvCallEventDmaSingle, packed.dword, + localAddrOrTce, remoteAddrOrTce, transferLength); } static inline HvLpDma_Rc HvCallEvent_dmaToSp(void *local, u32 remote, u32 length, HvLpDma_Direction dir) { u64 abs_addr; - HvLpDma_Rc retVal; abs_addr = virt_to_abs(local); - retVal = (HvLpDma_Rc)HvCall4(HvCallEventDmaToSp, abs_addr, remote, - length, dir); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall4(HvCallEventDmaToSp, abs_addr, remote, length, dir); } #endif /* _HVCALLEVENT_H */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallHpt.h 
linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallHpt.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallHpt.h 2005-06-03 14:06:15.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallHpt.h 2005-06-01 15:16:46.000000000 +1000 @@ -43,34 +43,27 @@ static inline u64 HvCallHpt_getHptAddress(void) { - u64 retval = HvCall0(HvCallHptGetHptAddress); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retval; + return HvCall0(HvCallHptGetHptAddress); } static inline u64 HvCallHpt_getHptPages(void) { - u64 retval = HvCall0(HvCallHptGetHptPages); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retval; + return HvCall0(HvCallHptGetHptPages); } static inline void HvCallHpt_setPp(u32 hpteIndex, u8 value) { HvCall2(HvCallHptSetPp, hpteIndex, value); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallHpt_setSwBits(u32 hpteIndex, u8 bitson, u8 bitsoff) { HvCall3(HvCallHptSetSwBits, hpteIndex, bitson, bitsoff); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallHpt_invalidateNoSyncICache(u32 hpteIndex) { HvCall1(HvCallHptInvalidateNoSyncICache, hpteIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline u64 HvCallHpt_invalidateSetSwBitsGet(u32 hpteIndex, u8 bitson, @@ -81,36 +74,30 @@ static inline u64 HvCallHpt_invalidateSe compressedStatus = HvCall4(HvCallHptInvalidateSetSwBitsGet, hpteIndex, bitson, bitsoff, 1); HvCall1(HvCallHptInvalidateNoSyncICache, hpteIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); return compressedStatus; } static inline u64 HvCallHpt_findValid(HPTE *hpte, u64 vpn) { - u64 retIndex = HvCall3Ret16( HvCallHptFindValid, hpte, vpn, 0, 0 ); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retIndex; + return HvCall3Ret16(HvCallHptFindValid, hpte, vpn, 0, 0); } static inline u64 HvCallHpt_findNextValid(HPTE *hpte, u32 hpteIndex, u8 bitson, u8 bitsoff) { - u64 retIndex = HvCall3Ret16( HvCallHptFindNextValid, hpte, hpteIndex, 
bitson, bitsoff ); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retIndex; + return HvCall3Ret16(HvCallHptFindNextValid, hpte, hpteIndex, + bitson, bitsoff); } static inline void HvCallHpt_get(HPTE *hpte, u32 hpteIndex) { HvCall2Ret16(HvCallHptGet, hpte, hpteIndex, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline void HvCallHpt_addValidate(u32 hpteIndex, u32 hBit, HPTE *hpte) { HvCall4(HvCallHptAddValidate, hpteIndex, hBit, (*((u64 *)hpte)), (*(((u64 *)hpte)+1))); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } #endif /* _HVCALLHPT_H */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallPci.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallPci.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallPci.h 2005-06-03 14:10:49.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallPci.h 2005-06-01 15:45:18.000000000 +1000 @@ -140,8 +140,6 @@ static inline u64 HvCallPci_configLoad8( HvCall3Ret16(HvCallPciConfigLoad8, &retVal, *(u64 *)&dsa, offset, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - *value = retVal.value; return retVal.rc; @@ -161,8 +159,6 @@ static inline u64 HvCallPci_configLoad16 HvCall3Ret16(HvCallPciConfigLoad16, &retVal, *(u64 *)&dsa, offset, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - *value = retVal.value; return retVal.rc; @@ -182,8 +178,6 @@ static inline u64 HvCallPci_configLoad32 HvCall3Ret16(HvCallPciConfigLoad32, &retVal, *(u64 *)&dsa, offset, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - *value = retVal.value; return retVal.rc; @@ -193,7 +187,6 @@ static inline u64 HvCallPci_configStore8 u8 deviceId, u32 offset, u8 value) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -201,18 +194,13 @@ static inline u64 HvCallPci_configStore8 dsa.subBusNumber = subBusNumber; dsa.deviceId = deviceId; - retVal = HvCall4(HvCallPciConfigStore8, *(u64 *)&dsa, offset, value, 0); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - 
return retVal; + return HvCall4(HvCallPciConfigStore8, *(u64 *)&dsa, offset, value, 0); } static inline u64 HvCallPci_configStore16(u16 busNumber, u8 subBusNumber, u8 deviceId, u32 offset, u16 value) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -220,18 +208,13 @@ static inline u64 HvCallPci_configStore1 dsa.subBusNumber = subBusNumber; dsa.deviceId = deviceId; - retVal = HvCall4(HvCallPciConfigStore16, *(u64 *)&dsa, offset, value, 0); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall4(HvCallPciConfigStore16, *(u64 *)&dsa, offset, value, 0); } static inline u64 HvCallPci_configStore32(u16 busNumber, u8 subBusNumber, u8 deviceId, u32 offset, u32 value) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -239,11 +222,7 @@ static inline u64 HvCallPci_configStore3 dsa.subBusNumber = subBusNumber; dsa.deviceId = deviceId; - retVal = HvCall4(HvCallPciConfigStore32, *(u64 *)&dsa, offset, value, 0); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall4(HvCallPciConfigStore32, *(u64 *)&dsa, offset, value, 0); } static inline u64 HvCallPci_barLoad8(u16 busNumberParm, u8 subBusParm, @@ -262,8 +241,6 @@ static inline u64 HvCallPci_barLoad8(u16 HvCall3Ret16(HvCallPciBarLoad8, &retVal, *(u64 *)&dsa, offsetParm, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - *valueParm = retVal.value; return retVal.rc; @@ -285,8 +262,6 @@ static inline u64 HvCallPci_barLoad16(u1 HvCall3Ret16(HvCallPciBarLoad16, &retVal, *(u64 *)&dsa, offsetParm, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - *valueParm = retVal.value; return retVal.rc; @@ -308,8 +283,6 @@ static inline u64 HvCallPci_barLoad32(u1 HvCall3Ret16(HvCallPciBarLoad32, &retVal, *(u64 *)&dsa, offsetParm, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - *valueParm = retVal.value; return retVal.rc; @@ -331,8 +304,6 @@ static inline u64 HvCallPci_barLoad64(u1 HvCall3Ret16(HvCallPciBarLoad64, &retVal, *(u64 *)&dsa, 
offsetParm, 0); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - *valueParm = retVal.value; return retVal.rc; @@ -343,7 +314,6 @@ static inline u64 HvCallPci_barStore8(u1 u8 valueParm) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -352,11 +322,8 @@ static inline u64 HvCallPci_barStore8(u1 dsa.deviceId = deviceIdParm; dsa.barNumber = barNumberParm; - retVal = HvCall4(HvCallPciBarStore8, *(u64 *)&dsa, offsetParm, valueParm, 0); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall4(HvCallPciBarStore8, *(u64 *)&dsa, offsetParm, + valueParm, 0); } static inline u64 HvCallPci_barStore16(u16 busNumberParm, u8 subBusParm, @@ -364,7 +331,6 @@ static inline u64 HvCallPci_barStore16(u u16 valueParm) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -373,11 +339,8 @@ static inline u64 HvCallPci_barStore16(u dsa.deviceId = deviceIdParm; dsa.barNumber = barNumberParm; - retVal = HvCall4(HvCallPciBarStore16, *(u64 *)&dsa, offsetParm, valueParm, 0); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall4(HvCallPciBarStore16, *(u64 *)&dsa, offsetParm, + valueParm, 0); } static inline u64 HvCallPci_barStore32(u16 busNumberParm, u8 subBusParm, @@ -385,7 +348,6 @@ static inline u64 HvCallPci_barStore32(u u32 valueParm) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -394,11 +356,8 @@ static inline u64 HvCallPci_barStore32(u dsa.deviceId = deviceIdParm; dsa.barNumber = barNumberParm; - retVal = HvCall4(HvCallPciBarStore32, *(u64 *)&dsa, offsetParm, valueParm, 0); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall4(HvCallPciBarStore32, *(u64 *)&dsa, offsetParm, + valueParm, 0); } static inline u64 HvCallPci_barStore64(u16 busNumberParm, u8 subBusParm, @@ -406,7 +365,6 @@ static inline u64 HvCallPci_barStore64(u u64 valueParm) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -415,11 +373,8 @@ static inline u64 
HvCallPci_barStore64(u dsa.deviceId = deviceIdParm; dsa.barNumber = barNumberParm; - retVal = HvCall4(HvCallPciBarStore64, *(u64 *)&dsa, offsetParm, valueParm, 0); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall4(HvCallPciBarStore64, *(u64 *)&dsa, offsetParm, + valueParm, 0); } static inline u64 HvCallPci_eoi(u16 busNumberParm, u8 subBusParm, @@ -436,8 +391,6 @@ static inline u64 HvCallPci_eoi(u16 busN HvCall1Ret16(HvCallPciEoi, &retVal, *(u64*)&dsa); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal.rc; } @@ -445,7 +398,6 @@ static inline u64 HvCallPci_getBarParms( u8 deviceIdParm, u8 barNumberParm, u64 parms, u32 sizeofParms) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -454,18 +406,13 @@ static inline u64 HvCallPci_getBarParms( dsa.deviceId = deviceIdParm; dsa.barNumber = barNumberParm; - retVal = HvCall3(HvCallPciGetBarParms, *(u64*)&dsa, parms, sizeofParms); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall3(HvCallPciGetBarParms, *(u64*)&dsa, parms, sizeofParms); } static inline u64 HvCallPci_maskFisr(u16 busNumberParm, u8 subBusParm, u8 deviceIdParm, u64 fisrMask) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -473,18 +420,13 @@ static inline u64 HvCallPci_maskFisr(u16 dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; - retVal = HvCall2(HvCallPciMaskFisr, *(u64*)&dsa, fisrMask); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall2(HvCallPciMaskFisr, *(u64*)&dsa, fisrMask); } static inline u64 HvCallPci_unmaskFisr(u16 busNumberParm, u8 subBusParm, u8 deviceIdParm, u64 fisrMask) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -492,18 +434,13 @@ static inline u64 HvCallPci_unmaskFisr(u dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; - retVal = HvCall2(HvCallPciUnmaskFisr, *(u64*)&dsa, fisrMask); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return 
retVal; + return HvCall2(HvCallPciUnmaskFisr, *(u64*)&dsa, fisrMask); } static inline u64 HvCallPci_setSlotReset(u16 busNumberParm, u8 subBusParm, u8 deviceIdParm, u64 onNotOff) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -511,18 +448,13 @@ static inline u64 HvCallPci_setSlotReset dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; - retVal = HvCall2(HvCallPciSetSlotReset, *(u64*)&dsa, onNotOff); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall2(HvCallPciSetSlotReset, *(u64*)&dsa, onNotOff); } static inline u64 HvCallPci_getDeviceInfo(u16 busNumberParm, u8 subBusParm, u8 deviceNumberParm, u64 parms, u32 sizeofParms) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -530,18 +462,13 @@ static inline u64 HvCallPci_getDeviceInf dsa.subBusNumber = subBusParm; dsa.deviceId = deviceNumberParm << 4; - retVal = HvCall3(HvCallPciGetDeviceInfo, *(u64*)&dsa, parms, sizeofParms); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall3(HvCallPciGetDeviceInfo, *(u64*)&dsa, parms, sizeofParms); } static inline u64 HvCallPci_maskInterrupts(u16 busNumberParm, u8 subBusParm, u8 deviceIdParm, u64 interruptMask) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -549,18 +476,13 @@ static inline u64 HvCallPci_maskInterrup dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; - retVal = HvCall2(HvCallPciMaskInterrupts, *(u64*)&dsa, interruptMask); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall2(HvCallPciMaskInterrupts, *(u64*)&dsa, interruptMask); } static inline u64 HvCallPci_unmaskInterrupts(u16 busNumberParm, u8 subBusParm, u8 deviceIdParm, u64 interruptMask) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -568,18 +490,13 @@ static inline u64 HvCallPci_unmaskInterr dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; - retVal = HvCall2(HvCallPciUnmaskInterrupts, *(u64*)&dsa, 
interruptMask); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall2(HvCallPciUnmaskInterrupts, *(u64*)&dsa, interruptMask); } static inline u64 HvCallPci_getBusUnitInfo(u16 busNumberParm, u8 subBusParm, u8 deviceIdParm, u64 parms, u32 sizeofParms) { struct HvCallPci_DsaAddr dsa; - u64 retVal; *((u64*)&dsa) = 0; @@ -587,37 +504,30 @@ static inline u64 HvCallPci_getBusUnitIn dsa.subBusNumber = subBusParm; dsa.deviceId = deviceIdParm; - retVal = HvCall3(HvCallPciGetBusUnitInfo, *(u64*)&dsa, parms, sizeofParms); - - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - - return retVal; + return HvCall3(HvCallPciGetBusUnitInfo, *(u64*)&dsa, parms, + sizeofParms); } static inline int HvCallPci_getBusVpd(u16 busNumParm, u64 destParm, u16 sizeParm) { - int xRetSize; - u64 xRc = HvCall4(HvCallPciGetCardVpd, busNumParm, destParm, sizeParm, HvCallPci_BusVpd); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); + u64 xRc = HvCall4(HvCallPciGetCardVpd, busNumParm, destParm, + sizeParm, HvCallPci_BusVpd); if (xRc == -1) - xRetSize = -1; + return -1; else - xRetSize = xRc & 0xFFFF; - return xRetSize; + return xRc & 0xFFFF; } static inline int HvCallPci_getBusAdapterVpd(u16 busNumParm, u64 destParm, u16 sizeParm) { - int xRetSize; - u64 xRc = HvCall4(HvCallPciGetCardVpd, busNumParm, destParm, sizeParm, HvCallPci_BusAdapterVpd); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); + u64 xRc = HvCall4(HvCallPciGetCardVpd, busNumParm, destParm, + sizeParm, HvCallPci_BusAdapterVpd); if (xRc == -1) - xRetSize = -1; + return -1; else - xRetSize = xRc & 0xFFFF; - return xRetSize; + return xRc & 0xFFFF; } #endif /* _HVCALLPCI_H */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallSm.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallSm.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallSm.h 2005-06-03 14:12:36.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallSm.h 2005-06-01 15:47:52.000000000 +1000 @@ 
-32,10 +32,7 @@ static inline u64 HvCallSm_get64BitsOfAccessMap(HvLpIndex lpIndex, u64 indexIntoBitMap) { - u64 retval = HvCall2(HvCallSmGet64BitsOfAccessMap, lpIndex, - indexIntoBitMap ); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retval; + return HvCall2(HvCallSmGet64BitsOfAccessMap, lpIndex, indexIntoBitMap); } #endif /* _HVCALLSM_H */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallXm.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallXm.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvCallXm.h 2005-06-03 14:17:07.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvCallXm.h 2005-06-01 18:01:10.000000000 +1000 @@ -19,58 +19,43 @@ static inline void HvCallXm_getTceTableParms(u64 cb) { HvCall1(HvCallXmGetTceTableParms, cb); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); } static inline u64 HvCallXm_setTce(u64 tceTableToken, u64 tceOffset, u64 tce) { - u64 retval = HvCall3(HvCallXmSetTce, tceTableToken, tceOffset, tce ); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retval; + return HvCall3(HvCallXmSetTce, tceTableToken, tceOffset, tce); } static inline u64 HvCallXm_setTces(u64 tceTableToken, u64 tceOffset, u64 numTces, u64 tce1, u64 tce2, u64 tce3, u64 tce4) { - u64 retval = HvCall7(HvCallXmSetTces, tceTableToken, tceOffset, numTces, - tce1, tce2, tce3, tce4 ); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retval; + return HvCall7(HvCallXmSetTces, tceTableToken, tceOffset, numTces, + tce1, tce2, tce3, tce4); } static inline u64 HvCallXm_testBus(u16 busNumber) { - u64 retVal = HvCall1(HvCallXmTestBus, busNumber); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall1(HvCallXmTestBus, busNumber); } static inline u64 HvCallXm_testBusUnit(u16 busNumber, u8 subBusNumber, u8 deviceId) { - u64 busUnitNumber = (subBusNumber << 8) | deviceId; - u64 retVal = HvCall2(HvCallXmTestBusUnit, busNumber, busUnitNumber); - // 
getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall2(HvCallXmTestBusUnit, busNumber, + (subBusNumber << 8) | deviceId); } static inline u64 HvCallXm_connectBusUnit(u16 busNumber, u8 subBusNumber, u8 deviceId, u64 interruptToken) { - u64 busUnitNumber = (subBusNumber << 8) | deviceId; - u64 queueIndex = 0; // HvLpConfig::mapDsaToQueueIndex(HvLpDSA(busNumber, xBoard, xCard)); - - u64 retVal = HvCall5(HvCallXmConnectBusUnit, busNumber, busUnitNumber, - interruptToken, 0, queueIndex); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall5(HvCallXmConnectBusUnit, busNumber, + (subBusNumber << 8) | deviceId, interruptToken, 0, + 0 /* HvLpConfig::mapDsaToQueueIndex(HvLpDSA(busNumber, xBoard, xCard)) */); } static inline u64 HvCallXm_loadTod(void) { - u64 retVal = HvCall0(HvCallXmLoadTod); - // getPaca()->adjustHmtForNoOfSpinLocksHeld(); - return retVal; + return HvCall0(HvCallXmLoadTod); } #endif /* _HVCALLXM_H */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-03 14:21:41.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-01 16:08:25.000000000 +1000 @@ -281,8 +281,7 @@ static inline u64 HvLpConfig_getLoadPage static inline int HvLpConfig_isBusOwnedByThisLp(HvBusNumber busNumber) { - HvLpIndex busOwner = HvLpConfig_getBusOwner(busNumber); - return (busOwner == HvLpConfig_getLpIndex()); + return (HvLpConfig_getBusOwner(busNumber) == HvLpConfig_getLpIndex()); } static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1, diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/XmPciLpEvent.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/XmPciLpEvent.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/XmPciLpEvent.h 2005-06-03 14:33:41.000000000 +1000 +++ 
linus-iSeries-headers.4/include/asm-ppc64/iSeries/XmPciLpEvent.h 2005-06-01 17:15:22.000000000 +1000 @@ -1,15 +1,7 @@ #ifndef __XMPCILPEVENT_H__ #define __XMPCILPEVENT_H__ -#ifdef __cplusplus -extern "C" { -#endif - -int XmPciLpEvent_init(void); -void ppc_irq_dispatch_handler(struct pt_regs *regs, int irq); - -#ifdef __cplusplus -} -#endif +extern int XmPciLpEvent_init(void); +extern void ppc_irq_dispatch_handler(struct pt_regs *regs, int irq); #endif /* __XMPCILPEVENT_H__ */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_irq.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/iSeries_irq.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_irq.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/iSeries_irq.h 2005-06-01 18:05:36.000000000 +1000 @@ -1,19 +1,11 @@ #ifndef __ISERIES_IRQ_H__ #define __ISERIES_IRQ_H__ -#ifdef __cplusplus -extern "C" { -#endif +extern void iSeries_init_IRQ(void); +extern int iSeries_allocate_IRQ(HvBusNumber, HvSubBusNumber, HvAgentId); +extern int iSeries_assign_IRQ(int, HvBusNumber, HvSubBusNumber, HvAgentId); +extern void iSeries_activate_IRQs(void); -void iSeries_init_IRQ(void); -int iSeries_allocate_IRQ(HvBusNumber, HvSubBusNumber, HvAgentId); -int iSeries_assign_IRQ(int, HvBusNumber, HvSubBusNumber, HvAgentId); -void iSeries_activate_IRQs(void); - -int XmPciLpEvent_init(void); - -#ifdef __cplusplus -} -#endif +extern int XmPciLpEvent_init(void); #endif /* __ISERIES_IRQ_H__ */ diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_pci.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/iSeries_pci.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-03 14:36:47.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-01 17:31:39.000000000 +1000 @@ -55,7 +55,7 @@ struct iSeries_Device_Node; */ #define ISERIES_PCI_AGENTID(idsel, func) \ - ((idsel & 0x0F) << 4) | (func & 0x07) + (((idsel & 
0x0F) << 4) | (func & 0x07)) #define ISERIES_ENCODE_DEVICE(agentid) \ ((0x10) | ((agentid & 0x20) >> 2) | (agentid & 0x07)) diff -ruNp linus-iSeries-headers.3/include/asm-ppc64/iSeries/vio.h linus-iSeries-headers.4/include/asm-ppc64/iSeries/vio.h --- linus-iSeries-headers.3/include/asm-ppc64/iSeries/vio.h 2005-06-03 14:37:50.000000000 +1000 +++ linus-iSeries-headers.4/include/asm-ppc64/iSeries/vio.h 2005-06-01 18:06:34.000000000 +1000 @@ -58,16 +58,16 @@ */ typedef void (vio_event_handler_t) (struct HvLpEvent * event); -int viopath_open(HvLpIndex remoteLp, int subtype, int numReq); -int viopath_close(HvLpIndex remoteLp, int subtype, int numReq); -int vio_setHandler(int subtype, vio_event_handler_t * beh); -int vio_clearHandler(int subtype); -int viopath_isactive(HvLpIndex lp); -HvLpInstanceId viopath_sourceinst(HvLpIndex lp); -HvLpInstanceId viopath_targetinst(HvLpIndex lp); -void vio_set_hostlp(void); -void *vio_get_event_buffer(int subtype); -void vio_free_event_buffer(int subtype, void *buffer); +extern int viopath_open(HvLpIndex remoteLp, int subtype, int numReq); +extern int viopath_close(HvLpIndex remoteLp, int subtype, int numReq); +extern int vio_setHandler(int subtype, vio_event_handler_t * beh); +extern int vio_clearHandler(int subtype); +extern int viopath_isactive(HvLpIndex lp); +extern HvLpInstanceId viopath_sourceinst(HvLpIndex lp); +extern HvLpInstanceId viopath_targetinst(HvLpIndex lp); +extern void vio_set_hostlp(void); +extern void *vio_get_event_buffer(int subtype); +extern void vio_free_event_buffer(int subtype, void *buffer); extern HvLpIndex viopath_hostLp; extern HvLpIndex viopath_ourLp; From sfr at canb.auug.org.au Fri Jun 3 18:17:36 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:17:36 +1000 Subject: [PATCH 5/10] ppc64 iSeries: remove LparData.h In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: 
<20050603181736.3b02a7ae.sfr@canb.auug.org.au> Hi Andrew, include/asm-ppc64/iSeries/LparData.h just included a whole lot of other files to declare variables that would be better declared in those other files. So, remove it. This will reduce the number of things needed to be included in most cases to access the relevant variables. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/HvLpEvent.c linus-iSeries-headers.5/arch/ppc64/kernel/HvLpEvent.c --- linus-iSeries-headers.4/arch/ppc64/kernel/HvLpEvent.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/HvLpEvent.c 2005-06-02 15:17:50.000000000 +1000 @@ -12,7 +12,7 @@ #include #include #include -#include +#include /* Array of LpEvent handler functions */ LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/ItLpQueue.c linus-iSeries-headers.5/arch/ppc64/kernel/ItLpQueue.c --- linus-iSeries-headers.4/arch/ppc64/kernel/ItLpQueue.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/ItLpQueue.c 2005-06-02 16:07:50.000000000 +1000 @@ -16,7 +16,6 @@ #include #include #include -#include static __inline__ int set_inUse( struct ItLpQueue * lpQueue ) { diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_VpdInfo.c linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_VpdInfo.c --- linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-06-02 16:08:05.000000000 +1000 @@ -35,7 +35,6 @@ #include #include #include -#include #include #include "pci.h" diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_pci.c linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_pci.c --- linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_pci.c 2005-05-20 09:03:13.000000000 +1000 +++
linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_pci.c 2005-06-02 16:16:36.000000000 +1000 @@ -40,7 +40,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_proc.c linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_proc.c --- linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_proc.c 2005-06-01 17:53:28.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_proc.c 2005-06-02 16:08:14.000000000 +1000 @@ -28,7 +28,7 @@ #include #include #include -#include +#include static int __init iseries_proc_create(void) { diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_setup.c linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_setup.c --- linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_setup.c 2005-06-03 09:24:03.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_setup.c 2005-06-03 09:25:15.000000000 +1000 @@ -47,7 +47,7 @@ #include #include #include -#include +#include #include #include #include @@ -58,6 +58,9 @@ #include #include #include +#include +#include +#include extern void hvlog(char *fmt, ...); diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_smp.c linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_smp.c --- linus-iSeries-headers.4/arch/ppc64/kernel/iSeries_smp.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_smp.c 2005-06-02 16:18:02.000000000 +1000 @@ -38,7 +38,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/irq.c linus-iSeries-headers.5/arch/ppc64/kernel/irq.c --- linus-iSeries-headers.4/arch/ppc64/kernel/irq.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/irq.c 2005-06-02 15:48:09.000000000 +1000 @@ -52,7 +52,7 @@ #include #include #include -#include +#include #include #include diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/lparcfg.c 
linus-iSeries-headers.5/arch/ppc64/kernel/lparcfg.c --- linus-iSeries-headers.4/arch/ppc64/kernel/lparcfg.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/lparcfg.c 2005-06-02 16:08:22.000000000 +1000 @@ -28,12 +28,12 @@ #include #include #include -#include #include #include #include #include #include +#include #define MODULE_VERS "1.6" #define MODULE_NAME "lparcfg" diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/ras.c linus-iSeries-headers.5/arch/ppc64/kernel/ras.c --- linus-iSeries-headers.4/arch/ppc64/kernel/ras.c 2005-05-20 09:03:14.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/ras.c 2005-06-02 16:07:59.000000000 +1000 @@ -47,7 +47,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/rtc.c linus-iSeries-headers.5/arch/ppc64/kernel/rtc.c --- linus-iSeries-headers.4/arch/ppc64/kernel/rtc.c 2005-05-26 10:44:08.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/rtc.c 2005-06-02 16:18:54.000000000 +1000 @@ -42,7 +42,6 @@ #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/setup.c linus-iSeries-headers.5/arch/ppc64/kernel/setup.c --- linus-iSeries-headers.4/arch/ppc64/kernel/setup.c 2005-06-03 09:03:05.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/setup.c 2005-06-03 09:26:21.000000000 +1000 @@ -41,7 +41,6 @@ #include #include #include -#include #include #include #include @@ -57,6 +56,8 @@ #include #include #include +#include +#include #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) diff -ruNp linus-iSeries-headers.4/arch/ppc64/kernel/viopath.c linus-iSeries-headers.5/arch/ppc64/kernel/viopath.c --- linus-iSeries-headers.4/arch/ppc64/kernel/viopath.c 2005-06-01 17:54:00.000000000 +1000 +++ linus-iSeries-headers.5/arch/ppc64/kernel/viopath.c 2005-06-02 16:19:39.000000000 +1000 @@ -43,7 +43,7 @@ #include #include #include -#include +#include #include #include #include diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-01 16:08:25.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:21:09.000000000 +1000 @@ -27,7 +27,6 @@ #include #include #include -#include extern HvLpIndex HvLpConfig_getLpIndex_outline(void); diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvReleaseData.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvReleaseData.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/HvReleaseData.h 2005-06-01 16:39:29.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvReleaseData.h 2005-06-02 15:07:40.000000000 +1000 @@ -58,4 +58,6 @@ struct HvReleaseData { char xRsvd3[20]; /* Reserved x2C-x3F */ }; +extern struct HvReleaseData hvReleaseData; + #endif /* _HVRELEASEDATA_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/IoHriMainStore.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/IoHriMainStore.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/IoHriMainStore.h 2005-06-01 16:47:55.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/IoHriMainStore.h 2005-06-02 16:06:25.000000000 +1000 @@ -161,4 +161,6 @@ struct IoHriMainStoreSegment5 { u64 reserved3; }; +extern u64 xMsVpd[]; + #endif /* _IOHRIMAINSTORE_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 
linus-iSeries-headers.5/include/asm-ppc64/iSeries/IoHriProcessorVpd.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 2005-06-01 16:50:11.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/IoHriProcessorVpd.h 2005-06-02 15:27:50.000000000 +1000 @@ -81,4 +81,6 @@ struct IoHriProcessorVpd { char xProcSrc[72]; // CSP format SRC xB8-xFF }; +extern struct IoHriProcessorVpd xIoHriProcessorVpd[]; + #endif /* _IOHRIPROCESSORVPD_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItExtVpdPanel.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItExtVpdPanel.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItExtVpdPanel.h 2005-06-01 16:51:48.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItExtVpdPanel.h 2005-06-02 15:22:18.000000000 +1000 @@ -47,4 +47,6 @@ struct ItExtVpdPanel { u8 xRsvd2[48]; }; +extern struct ItExtVpdPanel xItExtVpdPanel; + #endif /* _ITEXTVPDPANEL_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItIplParmsReal.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItIplParmsReal.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItIplParmsReal.h 2005-06-01 16:53:52.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItIplParmsReal.h 2005-06-02 15:05:43.000000000 +1000 @@ -66,4 +66,6 @@ struct ItIplParmsReal { u64 xRsvd13; // Reserved x38-x3F }; +extern struct ItIplParmsReal xItIplParmsReal; + #endif /* _ITIPLPARMSREAL_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItLpNaca.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItLpNaca.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItLpNaca.h 2005-06-01 16:58:28.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItLpNaca.h 2005-06-02 15:10:27.000000000 +1000 @@ -19,6 +19,8 @@ #ifndef _ITLPNACA_H #define _ITLPNACA_H +#include + /* * This control block contains the data that is shared between the * hypervisor (PLIC) and the OS. 
@@ -73,4 +75,6 @@ struct ItLpNaca { u64 xInterruptHdlr[32]; // Interrupt handlers 300-x3FF }; +extern struct ItLpNaca itLpNaca; + #endif /* _ITLPNACA_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItVpdAreas.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItVpdAreas.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-01 17:11:03.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-02 16:22:09.000000000 +1000 @@ -90,4 +90,6 @@ struct ItVpdAreas { void *xSlicVpdAdrs[ItVpdMaxEntries];// Array of VPD buffers 130-1EF }; +extern struct ItVpdAreas itVpdAreas; + #endif /* _ITVPDAREAS_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/LparData.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/LparData.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/LparData.h 2005-06-01 17:12:42.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/LparData.h 1970-01-01 10:00:00.000000000 +1000 @@ -1,48 +0,0 @@ -/* - * LparData.h - * Copyright (C) 2001 Mike Corrigan IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
- * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ - -#ifndef _LPARDATA_H -#define _LPARDATA_H - -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -extern struct LparMap xLparMap; -extern struct HvReleaseData hvReleaseData; -extern struct ItLpNaca itLpNaca; -extern struct ItIplParmsReal xItIplParmsReal; -extern struct ItExtVpdPanel xItExtVpdPanel; -extern struct IoHriProcessorVpd xIoHriProcessorVpd[]; -extern struct ItLpQueue xItLpQueue; -extern struct ItVpdAreas itVpdAreas; -extern u64 xMsVpd[]; -extern struct msChunks msChunks; - -#endif /* _LPARDATA_H */ diff -ruNp linus-iSeries-headers.4/include/asm-ppc64/iSeries/LparMap.h linus-iSeries-headers.5/include/asm-ppc64/iSeries/LparMap.h --- linus-iSeries-headers.4/include/asm-ppc64/iSeries/LparMap.h 2005-06-01 17:14:45.000000000 +1000 +++ linus-iSeries-headers.5/include/asm-ppc64/iSeries/LparMap.h 2005-06-02 15:21:09.000000000 +1000 @@ -64,4 +64,6 @@ struct LparMap { u64 xVPN; // Virtual Page Number (0x000C000000000000) }; +extern struct LparMap xLparMap; + #endif /* _LPARMAP_H */ From sfr at canb.auug.org.au Fri Jun 3 18:22:54 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:22:54 +1000 Subject: [PATCH 7/10] ppc64 iSeries: remove HvCallCfg.h In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: <20050603182254.127cb816.sfr@canb.auug.org.au> Hi Andrew, Now that the only users of things in HvCallCfg.h are in HvLpConfig.h, merge in the bit we need and remove HvCallCfg.h. 
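To illustrate the merge, here is a minimal standalone sketch of how the HvLpConfig helpers look once they call HvCall1 directly instead of going through HvCallCfg.h. The HvCall1 stub, the call number value, and the typedef widths are assumptions made so the sketch compiles outside the kernel; only the shape of the two HvLpConfig functions follows the patch.

```c
#include <stdint.h>

typedef uint64_t u64;
typedef uint8_t  HvLpIndex;               /* assumed width, sketch only */
typedef uint16_t HvLpVirtualLanIndexMap;  /* assumed width, sketch only */

/* Assumed value; in the patch this is "HvCallCfg + 30". */
#define HvCallCfgGetVirtualLanIndexMap 30

/*
 * Hypothetical stand-in for the hypervisor call. Partitions 0..2 report
 * a virtual LAN bitmap; any other partition behaves like an older
 * hypervisor that does not know the call and returns -1.
 */
static u64 HvCall1(u64 call, u64 lp)
{
	if (call != HvCallCfgGetVirtualLanIndexMap)
		return (u64)-1;
	switch (lp) {
	case 0:  return 0x3;
	case 1:  return 0x2;
	case 2:  return 0x4;
	default: return (u64)-1;
	}
}

/* Same logic as the merged helper: map the -1 "unsupported" reply to 0. */
static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMapForLp(
		HvLpIndex lp)
{
	u64 retVal = HvCall1(HvCallCfgGetVirtualLanIndexMap, lp);

	if (retVal == (u64)-1)
		retVal = 0;
	return (HvLpVirtualLanIndexMap)retVal;
}

/* Two partitions can talk if their virtual LAN bitmaps share a bit. */
static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1,
		HvLpIndex lp2)
{
	HvLpVirtualLanIndexMap virtualLanIndexMap1 =
		HvLpConfig_getVirtualLanIndexMapForLp(lp1);
	HvLpVirtualLanIndexMap virtualLanIndexMap2 =
		HvLpConfig_getVirtualLanIndexMapForLp(lp2);

	return ((virtualLanIndexMap1 & virtualLanIndexMap2) != 0);
}
```

With this stub, partitions 0 and 1 share a LAN bit and partitions 0 and 2 do not. A partition whose call returns -1 ends up with an empty map, so it never appears to share a LAN with anyone.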
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCallCfg.h linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvCallCfg.h --- linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-02 17:08:25.000000000 +1000 +++ linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvCallCfg.h 1970-01-01 10:00:00.000000000 +1000 @@ -1,129 +0,0 @@ -/* - * HvCallCfg.h - * Copyright (C) 2001 Mike Corrigan IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ -/* - * This file contains the "hypervisor call" interface which is used to - * drive the hypervisor from the OS. 
- */ -#ifndef _HVCALLCFG_H -#define _HVCALLCFG_H - -#include -#include - -enum HvCallCfg_ReqQual { - HvCallCfg_Cur = 0, - HvCallCfg_Init = 1, - HvCallCfg_Max = 2, - HvCallCfg_Min = 3 -}; - -#define HvCallCfgGetLps HvCallCfg + 0 -#define HvCallCfgGetActiveLpMap HvCallCfg + 1 -#define HvCallCfgGetLpVrmIndex HvCallCfg + 2 -#define HvCallCfgGetLpMinSupportedPlicVrmIndex HvCallCfg + 3 -#define HvCallCfgGetLpMinCompatablePlicVrmIndex HvCallCfg + 4 -#define HvCallCfgGetLpVrmName HvCallCfg + 5 -#define HvCallCfgGetSystemPhysicalProcessors HvCallCfg + 6 -#define HvCallCfgGetPhysicalProcessors HvCallCfg + 7 -#define HvCallCfgGetSystemMsChunks HvCallCfg + 8 -#define HvCallCfgGetMsChunks HvCallCfg + 9 -#define HvCallCfgGetInteractivePercentage HvCallCfg + 10 -#define HvCallCfgIsBusDedicated HvCallCfg + 11 -#define HvCallCfgGetBusOwner HvCallCfg + 12 -#define HvCallCfgGetBusAllocation HvCallCfg + 13 -#define HvCallCfgGetBusUnitOwner HvCallCfg + 14 -#define HvCallCfgGetBusUnitAllocation HvCallCfg + 15 -#define HvCallCfgGetVirtualBusPool HvCallCfg + 16 -#define HvCallCfgGetBusUnitInterruptProc HvCallCfg + 17 -#define HvCallCfgGetConfiguredBusUnitsForIntProc HvCallCfg + 18 -#define HvCallCfgGetRioSanBusPool HvCallCfg + 19 -#define HvCallCfgGetSharedPoolIndex HvCallCfg + 20 -#define HvCallCfgGetSharedProcUnits HvCallCfg + 21 -#define HvCallCfgGetNumProcsInSharedPool HvCallCfg + 22 -#define HvCallCfgRouter23 HvCallCfg + 23 -#define HvCallCfgRouter24 HvCallCfg + 24 -#define HvCallCfgRouter25 HvCallCfg + 25 -#define HvCallCfgRouter26 HvCallCfg + 26 -#define HvCallCfgRouter27 HvCallCfg + 27 -#define HvCallCfgGetMinRuntimeMsChunks HvCallCfg + 28 -#define HvCallCfgSetMinRuntimeMsChunks HvCallCfg + 29 -#define HvCallCfgGetVirtualLanIndexMap HvCallCfg + 30 -#define HvCallCfgGetLpExecutionMode HvCallCfg + 31 -#define HvCallCfgGetHostingLpIndex HvCallCfg + 32 - -static inline HvLpIndex HvCallCfg_getBusOwner(u64 busIndex) -{ - return HvCall1(HvCallCfgGetBusOwner, busIndex); -} - -static 
inline HvLpVirtualLanIndexMap HvCallCfg_getVirtualLanIndexMap( - HvLpIndex lp) -{ - /* - * This is a new function in V5R1 so calls to this on older - * hypervisors will return -1 - */ - u64 retVal = HvCall1(HvCallCfgGetVirtualLanIndexMap, lp); - if (retVal == -1) - retVal = 0; - return retVal; -} - -static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp, - enum HvCallCfg_ReqQual qual) -{ - return HvCall2(HvCallCfgGetMsChunks, lp, qual); -} - -static inline u64 HvCallCfg_getSystemPhysicalProcessors(void) -{ - return HvCall0(HvCallCfgGetSystemPhysicalProcessors); -} - -static inline u64 HvCallCfg_getPhysicalProcessors(HvLpIndex lp, - enum HvCallCfg_ReqQual qual) -{ - return HvCall2(HvCallCfgGetPhysicalProcessors, lp, qual); -} - -static inline HvLpSharedPoolIndex HvCallCfg_getSharedPoolIndex(HvLpIndex lp) -{ - return HvCall1(HvCallCfgGetSharedPoolIndex, lp); - -} - -static inline u64 HvCallCfg_getSharedProcUnits(HvLpIndex lp, - enum HvCallCfg_ReqQual qual) -{ - return HvCall2(HvCallCfgGetSharedProcUnits, lp, qual); - -} - -static inline u64 HvCallCfg_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) -{ - return (u16)HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); - -} - -static inline HvLpIndex HvCallCfg_getHostingLpIndex(HvLpIndex lp) -{ - return HvCall1(HvCallCfgGetHostingLpIndex, lp); -} - -#endif /* _HVCALLCFG_H */ diff -ruNp linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:59:16.000000000 +1000 +++ linus-iSeries-headers.7/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 17:39:37.000000000 +1000 @@ -24,10 +24,26 @@ * to determine which resources should be allocated to each partition. 
*/ -#include +#include #include #include +enum { + HvCallCfg_Cur = 0, + HvCallCfg_Init = 1, + HvCallCfg_Max = 2, + HvCallCfg_Min = 3 +}; + +#define HvCallCfgGetSystemPhysicalProcessors HvCallCfg + 6 +#define HvCallCfgGetPhysicalProcessors HvCallCfg + 7 +#define HvCallCfgGetMsChunks HvCallCfg + 9 +#define HvCallCfgGetSharedPoolIndex HvCallCfg + 20 +#define HvCallCfgGetSharedProcUnits HvCallCfg + 21 +#define HvCallCfgGetNumProcsInSharedPool HvCallCfg + 22 +#define HvCallCfgGetVirtualLanIndexMap HvCallCfg + 30 +#define HvCallCfgGetHostingLpIndex HvCallCfg + 32 + extern HvLpIndex HvLpConfig_getLpIndex_outline(void); static inline HvLpIndex HvLpConfig_getLpIndex(void) @@ -42,72 +58,81 @@ static inline HvLpIndex HvLpConfig_getPr static inline u64 HvLpConfig_getMsChunks(void) { - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur); + return HvCall2(HvCallCfgGetMsChunks, HvLpConfig_getLpIndex(), + HvCallCfg_Cur); } static inline u64 HvLpConfig_getSystemPhysicalProcessors(void) { - return HvCallCfg_getSystemPhysicalProcessors(); + return HvCall0(HvCallCfgGetSystemPhysicalProcessors); } static inline u64 HvLpConfig_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { - return HvCallCfg_getNumProcsInSharedPool(sPI); + return (u16)HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); } static inline u64 HvLpConfig_getPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetPhysicalProcessors, HvLpConfig_getLpIndex(), HvCallCfg_Cur); } static inline HvLpSharedPoolIndex HvLpConfig_getSharedPoolIndex(void) { - return HvCallCfg_getSharedPoolIndex(HvLpConfig_getLpIndex()); + return HvCall1(HvCallCfgGetSharedPoolIndex, HvLpConfig_getLpIndex()); } static inline u64 HvLpConfig_getSharedProcUnits(void) { - return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetSharedProcUnits, HvLpConfig_getLpIndex(), HvCallCfg_Cur); } static inline u64 HvLpConfig_getMaxSharedProcUnits(void) { - 
return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetSharedProcUnits, HvLpConfig_getLpIndex(), HvCallCfg_Max); } static inline u64 HvLpConfig_getMaxPhysicalProcessors(void) { - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), + return HvCall2(HvCallCfgGetPhysicalProcessors, HvLpConfig_getLpIndex(), HvCallCfg_Max); } -static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) +static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMapForLp( + HvLpIndex lp) { - return HvCallCfg_getVirtualLanIndexMap(HvLpConfig_getLpIndex_outline()); + /* + * This is a new function in V5R1 so calls to this on older + * hypervisors will return -1 + */ + u64 retVal = HvCall1(HvCallCfgGetVirtualLanIndexMap, lp); + if (retVal == -1) + retVal = 0; + return retVal; } -static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMapForLp( - HvLpIndex lp) +static inline HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) { - return HvCallCfg_getVirtualLanIndexMap(lp); + return HvLpConfig_getVirtualLanIndexMapForLp( + HvLpConfig_getLpIndex_outline()); } static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1, HvLpIndex lp2) { HvLpVirtualLanIndexMap virtualLanIndexMap1 = - HvCallCfg_getVirtualLanIndexMap(lp1); + HvLpConfig_getVirtualLanIndexMapForLp(lp1); HvLpVirtualLanIndexMap virtualLanIndexMap2 = - HvCallCfg_getVirtualLanIndexMap(lp2); + HvLpConfig_getVirtualLanIndexMapForLp(lp2); return ((virtualLanIndexMap1 & virtualLanIndexMap2) != 0); } static inline HvLpIndex HvLpConfig_getHostingLpIndex(HvLpIndex lp) { - return HvCallCfg_getHostingLpIndex(lp); + return HvCall1(HvCallCfgGetHostingLpIndex, lp); } #endif /* _HVLPCONFIG_H */ From sfr at canb.auug.org.au Fri Jun 3 18:20:45 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:20:45 +1000 Subject: [PATCH 6/10] ppc64 iSeries: eliminate some unused inline functions In-Reply-To: 
<20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: <20050603182045.3216b0ab.sfr@canb.auug.org.au> Hi Andrew, This patch removes from the iSeries header files a large number of inline functions that are not used. It also changes the only caller of a HvCallCfg function that is outside HvLpConfig.h to its equivalent HvLpConfig function and no longer includes HvCallCfg.h where it is not needed. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_smp.c linus-iSeries-headers.6/arch/ppc64/kernel/iSeries_smp.c --- linus-iSeries-headers.5/arch/ppc64/kernel/iSeries_smp.c 2005-06-02 16:18:02.000000000 +1000 +++ linus-iSeries-headers.6/arch/ppc64/kernel/iSeries_smp.c 2005-06-02 17:11:29.000000000 +1000 @@ -39,7 +39,6 @@ #include #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.5/arch/ppc64/kernel/viopath.c linus-iSeries-headers.6/arch/ppc64/kernel/viopath.c --- linus-iSeries-headers.5/arch/ppc64/kernel/viopath.c 2005-06-02 16:19:39.000000000 +1000 +++ linus-iSeries-headers.6/arch/ppc64/kernel/viopath.c 2005-06-02 17:06:58.000000000 +1000 @@ -46,7 +46,6 @@ #include #include #include -#include #include #include @@ -364,7 +363,7 @@ void vio_set_hostlp(void) * while we're active */ viopath_ourLp = HvLpConfig_getLpIndex(); - viopath_hostLp = HvCallCfg_getHostingLpIndex(viopath_ourLp); + viopath_hostLp = HvLpConfig_getHostingLpIndex(viopath_ourLp); if (viopath_hostLp != HvLpIndexInvalid) vio_setHandler(viomajorsubtype_config, handleConfig); diff -ruNp linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvCallCfg.h linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCallCfg.h --- linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-01 15:04:06.000000000 +1000 +++ linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvCallCfg.h 2005-06-02 
17:08:25.000000000 +1000 @@ -67,31 +67,11 @@ enum HvCallCfg_ReqQual { #define HvCallCfgGetLpExecutionMode HvCallCfg + 31 #define HvCallCfgGetHostingLpIndex HvCallCfg + 32 -static inline HvLpIndex HvCallCfg_getLps(void) -{ - return HvCall0(HvCallCfgGetLps); -} - -static inline int HvCallCfg_isBusDedicated(u64 busIndex) -{ - return HvCall1(HvCallCfgIsBusDedicated, busIndex); -} - static inline HvLpIndex HvCallCfg_getBusOwner(u64 busIndex) { return HvCall1(HvCallCfgGetBusOwner, busIndex); } -static inline HvLpIndexMap HvCallCfg_getBusAllocation(u64 busIndex) -{ - return HvCall1(HvCallCfgGetBusAllocation, busIndex); -} - -static inline HvLpIndexMap HvCallCfg_getActiveLpMap(void) -{ - return HvCall0(HvCallCfgGetActiveLpMap); -} - static inline HvLpVirtualLanIndexMap HvCallCfg_getVirtualLanIndexMap( HvLpIndex lp) { @@ -105,31 +85,12 @@ static inline HvLpVirtualLanIndexMap HvC return retVal; } -static inline u64 HvCallCfg_getSystemMsChunks(void) -{ - return HvCall0(HvCallCfgGetSystemMsChunks); -} - static inline u64 HvCallCfg_getMsChunks(HvLpIndex lp, enum HvCallCfg_ReqQual qual) { return HvCall2(HvCallCfgGetMsChunks, lp, qual); } -static inline u64 HvCallCfg_getMinRuntimeMsChunks(HvLpIndex lp) -{ - /* - * NOTE: This function was added in v5r1 so older hypervisors - * will return a -1 value - */ - return HvCall1(HvCallCfgGetMinRuntimeMsChunks, lp); -} - -static inline u64 HvCallCfg_setMinRuntimeMsChunks(u64 chunks) -{ - return HvCall1(HvCallCfgSetMinRuntimeMsChunks, chunks); -} - static inline u64 HvCallCfg_getSystemPhysicalProcessors(void) { return HvCall0(HvCallCfgGetSystemPhysicalProcessors); @@ -141,14 +102,6 @@ static inline u64 HvCallCfg_getPhysicalP return HvCall2(HvCallCfgGetPhysicalProcessors, lp, qual); } -static inline u64 HvCallCfg_getConfiguredBusUnitsForInterruptProc(HvLpIndex lp, - u16 hvLogicalProcIndex) -{ - return HvCall2(HvCallCfgGetConfiguredBusUnitsForIntProc, lp, - hvLogicalProcIndex); - -} - static inline HvLpSharedPoolIndex 
HvCallCfg_getSharedPoolIndex(HvLpIndex lp) { return HvCall1(HvCallCfgGetSharedPoolIndex, lp); @@ -164,15 +117,13 @@ static inline u64 HvCallCfg_getSharedPro static inline u64 HvCallCfg_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { - u16 retVal = HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); - return retVal; + return (u16)HvCall1(HvCallCfgGetNumProcsInSharedPool, sPI); } static inline HvLpIndex HvCallCfg_getHostingLpIndex(HvLpIndex lp) { - u64 retVal = HvCall1(HvCallCfgGetHostingLpIndex, lp); - return retVal; + return HvCall1(HvCallCfgGetHostingLpIndex, lp); } #endif /* _HVCALLCFG_H */ diff -ruNp linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvLpConfig.h linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvLpConfig.h --- linus-iSeries-headers.5/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:21:09.000000000 +1000 +++ linus-iSeries-headers.6/include/asm-ppc64/iSeries/HvLpConfig.h 2005-06-02 16:59:16.000000000 +1000 @@ -40,127 +40,16 @@ static inline HvLpIndex HvLpConfig_getPr return itLpNaca.xPrimaryLpIndex; } -static inline HvLpIndex HvLpConfig_getLps(void) -{ - return HvCallCfg_getLps(); -} - -static inline HvLpIndexMap HvLpConfig_getActiveLpMap(void) -{ - return HvCallCfg_getActiveLpMap(); -} - -static inline u64 HvLpConfig_getSystemMsMegs(void) -{ - return HvCallCfg_getSystemMsChunks() / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getSystemMsChunks(void) -{ - return HvCallCfg_getSystemMsChunks(); -} - -static inline u64 HvLpConfig_getSystemMsPages(void) -{ - return HvCallCfg_getSystemMsChunks() * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur) - / HVCHUNKSPERMEG; -} - static inline u64 HvLpConfig_getMsChunks(void) { return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur); } -static inline u64 HvLpConfig_getMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Cur) - * HVPAGESPERCHUNK; -} - -static inline u64 
HvLpConfig_getMinMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getMinMsChunks(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min); -} - -static inline u64 HvLpConfig_getMinMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Min) - * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getMinRuntimeMsMegs(void) -{ - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getMinRuntimeMsChunks(void) -{ - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()); -} - -static inline u64 HvLpConfig_getMinRuntimeMsPages(void) -{ - return HvCallCfg_getMinRuntimeMsChunks(HvLpConfig_getLpIndex()) - * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getMaxMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getMaxMsChunks(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max); -} - -static inline u64 HvLpConfig_getMaxMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Max) - * HVPAGESPERCHUNK; -} - -static inline u64 HvLpConfig_getInitMsMegs(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init) - / HVCHUNKSPERMEG; -} - -static inline u64 HvLpConfig_getInitMsChunks(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init); -} - -static inline u64 HvLpConfig_getInitMsPages(void) -{ - return HvCallCfg_getMsChunks(HvLpConfig_getLpIndex(), HvCallCfg_Init) - * HVPAGESPERCHUNK; -} - static inline u64 HvLpConfig_getSystemPhysicalProcessors(void) { return HvCallCfg_getSystemPhysicalProcessors(); } -static inline u64 HvLpConfig_getSystemLogicalProcessors(void) -{ - return HvCallCfg_getSystemPhysicalProcessors() - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - 
static inline u64 HvLpConfig_getNumProcsInSharedPool(HvLpSharedPoolIndex sPI) { return HvCallCfg_getNumProcsInSharedPool(sPI); @@ -172,13 +61,6 @@ static inline u64 HvLpConfig_getPhysical HvCallCfg_Cur); } -static inline u64 HvLpConfig_getLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Cur) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - static inline HvLpSharedPoolIndex HvLpConfig_getSharedPoolIndex(void) { return HvCallCfg_getSharedPoolIndex(HvLpConfig_getLpIndex()); @@ -190,57 +72,18 @@ static inline u64 HvLpConfig_getSharedPr HvCallCfg_Cur); } -static inline u64 HvLpConfig_getMinSharedProcUnits(void) -{ - return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), - HvCallCfg_Min); -} - static inline u64 HvLpConfig_getMaxSharedProcUnits(void) { return HvCallCfg_getSharedProcUnits(HvLpConfig_getLpIndex(), HvCallCfg_Max); } -static inline u64 HvLpConfig_getMinPhysicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Min); -} - -static inline u64 HvLpConfig_getMinLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Min) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - static inline u64 HvLpConfig_getMaxPhysicalProcessors(void) { return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), HvCallCfg_Max); } -static inline u64 HvLpConfig_getMaxLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Max) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - -static inline u64 HvLpConfig_getInitPhysicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Init); -} - -static inline u64 HvLpConfig_getInitLogicalProcessors(void) -{ - return HvCallCfg_getPhysicalProcessors(HvLpConfig_getLpIndex(), - HvCallCfg_Init) - * (/*getPaca()->getSecondaryThreadCount() +*/ 1); -} - static inline 
HvLpVirtualLanIndexMap HvLpConfig_getVirtualLanIndexMap(void) { return HvCallCfg_getVirtualLanIndexMap(HvLpConfig_getLpIndex_outline()); @@ -252,37 +95,6 @@ static inline HvLpVirtualLanIndexMap HvL return HvCallCfg_getVirtualLanIndexMap(lp); } -static inline HvLpIndex HvLpConfig_getBusOwner(HvBusNumber busNumber) -{ - return HvCallCfg_getBusOwner(busNumber); -} - -static inline int HvLpConfig_isBusDedicated(HvBusNumber busNumber) -{ - return HvCallCfg_isBusDedicated(busNumber); -} - -static inline HvLpIndexMap HvLpConfig_getBusAllocation(HvBusNumber busNumber) -{ - return HvCallCfg_getBusAllocation(busNumber); -} - -/* returns the absolute real address of the load area */ -static inline u64 HvLpConfig_getLoadAddress(void) -{ - return itLpNaca.xLoadAreaAddr & 0x7fffffffffffffff; -} - -static inline u64 HvLpConfig_getLoadPages(void) -{ - return itLpNaca.xLoadAreaChunks * HVPAGESPERCHUNK; -} - -static inline int HvLpConfig_isBusOwnedByThisLp(HvBusNumber busNumber) -{ - return (HvLpConfig_getBusOwner(busNumber) == HvLpConfig_getLpIndex()); -} - static inline int HvLpConfig_doLpsCommunicateOnVirtualLan(HvLpIndex lp1, HvLpIndex lp2) { From sfr at canb.auug.org.au Fri Jun 3 18:25:06 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:25:06 +1000 Subject: [PATCH 8/10] ppc64 iSeries: cleanup ItLpQueue.h In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: <20050603182506.28909ea9.sfr@canb.auug.org.au> Hi Andrew, Just white space cleanups and move process_iSeries_events into its only caller.
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.7/arch/ppc64/kernel/idle.c linus-iSeries-headers.8/arch/ppc64/kernel/idle.c --- linus-iSeries-headers.7/arch/ppc64/kernel/idle.c 2005-06-03 09:03:05.000000000 +1000 +++ linus-iSeries-headers.8/arch/ppc64/kernel/idle.c 2005-06-03 09:31:16.000000000 +1000 @@ -42,6 +42,11 @@ static int (*idle_loop)(void); static unsigned long maxYieldTime = 0; static unsigned long minYieldTime = 0xffffffffffffffffUL; +static inline void process_iSeries_events(void) +{ + asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); +} + static void yield_shared_processor(void) { unsigned long tb; diff -ruNp linus-iSeries-headers.7/include/asm-ppc64/iSeries/ItLpQueue.h linus-iSeries-headers.8/include/asm-ppc64/iSeries/ItLpQueue.h --- linus-iSeries-headers.7/include/asm-ppc64/iSeries/ItLpQueue.h 2005-06-01 17:05:16.000000000 +1000 +++ linus-iSeries-headers.8/include/asm-ppc64/iSeries/ItLpQueue.h 2005-06-02 14:55:45.000000000 +1000 @@ -64,9 +64,9 @@ struct ItLpQueue { u8 xPlicStatus; // 0x01 DedicatedIo or DedicatedLp or NotUsed u16 xSlicLogicalProcIndex; // 0x02 Logical Proc Index for correlation u8 xPlicRsvd[12]; // 0x04 - char* xSlicCurEventPtr; // 0x10 - char* xSlicLastValidEventPtr; // 0x18 - char* xSlicEventStackPtr; // 0x20 + char *xSlicCurEventPtr; // 0x10 + char *xSlicLastValidEventPtr; // 0x18 + char *xSlicEventStackPtr; // 0x20 u8 xIndex; // 0x28 unique sequential index. 
u8 xSlicRsvd[3]; // 0x29-2b u32 xInUseWord; // 0x2C @@ -76,17 +76,9 @@ struct ItLpQueue { extern struct ItLpQueue xItLpQueue; -extern struct HvLpEvent * ItLpQueue_getNextLpEvent(struct ItLpQueue *); +extern struct HvLpEvent *ItLpQueue_getNextLpEvent(struct ItLpQueue *); extern int ItLpQueue_isLpIntPending(struct ItLpQueue *); extern unsigned ItLpQueue_process(struct ItLpQueue *, struct pt_regs *); extern void ItLpQueue_clearValid(struct HvLpEvent *); -static __inline__ void process_iSeries_events(void) -{ - __asm__ __volatile__ ( - " li 0,0x5555 \n\ - sc" - : : : "r0", "r3"); -} - #endif /* _ITLPQUEUE_H */ From sfr at canb.auug.org.au Fri Jun 3 18:29:29 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:29:29 +1000 Subject: [PATCH 9/10] ppc64 iSeries: tidy up some includes and HvCall.h In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: <20050603182929.6997bf8f.sfr@canb.auug.org.au> Hi Andrew, This patch removes some unused bits from HvCall.h and some unneeded #includes from other files. Also includes ItLpQueue.h in paca.h in preference to a stub declaration of struct ItLpQueue. 
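The difference between a stub declaration and the full header can be sketched roughly as follows. This is a standalone illustration, not kernel code: struct paca_sketch, the reduced struct ItLpQueue layout, and paca_lpqueue_index() are hypothetical, and the lpqueue_ptr member name is an assumption about paca.h.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint32_t u32;

/*
 * With only a stub declaration the type is incomplete. That is enough
 * to hold a pointer to it, which is all a structure such as the paca
 * strictly needs.
 */
struct ItLpQueue;

struct paca_sketch {			/* hypothetical, stands in for the paca */
	struct ItLpQueue *lpqueue_ptr;	/* assumed member name */
};

/*
 * Once the full definition is visible (as it would be after including
 * ItLpQueue.h), code can also dereference the pointer. Only two fields
 * are reproduced here; the real structure has many more.
 */
struct ItLpQueue {
	u8  xIndex;		/* unique sequential index */
	u32 xInUseWord;
};

/* Hypothetical accessor: this needs the complete type, not just the stub. */
static inline u8 paca_lpqueue_index(const struct paca_sketch *paca)
{
	return paca->lpqueue_ptr->xIndex;
}
```

Including the real header keeps a single authoritative declaration in one place, at the cost of pulling the full structure definition into every user of paca.h.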
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.8/arch/ppc64/kernel/asm-offsets.c linus-iSeries-headers.9/arch/ppc64/kernel/asm-offsets.c --- linus-iSeries-headers.8/arch/ppc64/kernel/asm-offsets.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.9/arch/ppc64/kernel/asm-offsets.c 2005-06-02 15:50:28.000000000 +1000 @@ -31,7 +31,6 @@ #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.8/arch/ppc64/kernel/iSeries_pci.c linus-iSeries-headers.9/arch/ppc64/kernel/iSeries_pci.c --- linus-iSeries-headers.8/arch/ppc64/kernel/iSeries_pci.c 2005-06-02 16:16:36.000000000 +1000 +++ linus-iSeries-headers.9/arch/ppc64/kernel/iSeries_pci.c 2005-06-02 16:48:16.000000000 +1000 @@ -38,7 +38,6 @@ #include #include -#include #include #include #include diff -ruNp linus-iSeries-headers.8/arch/ppc64/kernel/mf.c linus-iSeries-headers.9/arch/ppc64/kernel/mf.c --- linus-iSeries-headers.8/arch/ppc64/kernel/mf.c 2005-05-26 10:44:08.000000000 +1000 +++ linus-iSeries-headers.9/arch/ppc64/kernel/mf.c 2005-06-02 14:59:25.000000000 +1000 @@ -40,7 +40,6 @@ #include #include #include -#include #include /* diff -ruNp linus-iSeries-headers.8/arch/ppc64/kernel/rtc.c linus-iSeries-headers.9/arch/ppc64/kernel/rtc.c --- linus-iSeries-headers.8/arch/ppc64/kernel/rtc.c 2005-06-02 16:18:54.000000000 +1000 +++ linus-iSeries-headers.9/arch/ppc64/kernel/rtc.c 2005-06-02 16:48:16.000000000 +1000 @@ -44,7 +44,6 @@ #include #include -#include extern int piranha_simulator; diff -ruNp linus-iSeries-headers.8/include/asm-ppc64/iSeries/HvCall.h linus-iSeries-headers.9/include/asm-ppc64/iSeries/HvCall.h --- linus-iSeries-headers.8/include/asm-ppc64/iSeries/HvCall.h 2005-06-01 14:51:07.000000000 +1000 +++ linus-iSeries-headers.9/include/asm-ppc64/iSeries/HvCall.h 2005-06-02 13:29:59.000000000 +1000 @@ -27,48 +27,6 @@ #include #include -/* -enum HvCall_ReturnCode -{ - 
HvCall_Good = 0, - HvCall_Partial = 1, - HvCall_NotOwned = 2, - HvCall_NotFreed = 3, - HvCall_UnspecifiedError = 4 -}; - -enum HvCall_TypeOfSIT -{ - HvCall_ReduceOnly = 0, - HvCall_Unconditional = 1 -}; - -enum HvCall_TypeOfYield -{ - HvCall_YieldTimed = 0, // Yield until specified time - HvCall_YieldToActive = 1, // Yield until all active procs have run - HvCall_YieldToProc = 2 // Yield until the specified processor has run -}; - -enum HvCall_InterruptMasks -{ - HvCall_MaskIPI = 0x00000001, - HvCall_MaskLpEvent = 0x00000002, - HvCall_MaskLpProd = 0x00000004, - HvCall_MaskTimeout = 0x00000008 -}; - -enum HvCall_VaryOffChunkRc -{ - HvCall_VaryOffSucceeded = 0, - HvCall_VaryOffWithdrawn = 1, - HvCall_ChunkInLoadArea = 2, - HvCall_ChunkInHPT = 3, - HvCall_ChunkNotAccessible = 4, - HvCall_ChunkInUse = 5 -}; -*/ - /* Type of yield for HvCallBaseYieldProcessor */ #define HvCall_YieldTimed 0 /* Yield until specified time (tb) */ #define HvCall_YieldToActive 1 /* Yield until all active procs have run */ @@ -139,35 +97,12 @@ static inline void HvCall_setEnabledInte HvCall1(HvCallBaseSetEnabledInterrupts, enabledInterrupts); } -static inline void HvCall_clearLogBuffer(HvLpIndex lpindex) -{ - HvCall1(HvCallBaseClearLogBuffer, lpindex); -} - -static inline u32 HvCall_getLogBufferCodePage(HvLpIndex lpindex) -{ - u32 retVal = HvCall1(HvCallBaseGetLogBufferCodePage, lpindex); - return retVal; -} - -static inline int HvCall_getLogBufferFormat(HvLpIndex lpindex) -{ - int retVal = HvCall1(HvCallBaseGetLogBufferFormat, lpindex); - return retVal; -} - -static inline u32 HvCall_getLogBufferLength(HvLpIndex lpindex) -{ - u32 retVal = HvCall1(HvCallBaseGetLogBufferLength, lpindex); - return retVal; -} - -static inline void HvCall_setLogBufferFormatAndCodepage(int format, u32 codePage) +static inline void HvCall_setLogBufferFormatAndCodepage(int format, + u32 codePage) { HvCall2(HvCallBaseSetLogBufferFormatAndCodePage, format, codePage); } -extern int HvCall_readLogBuffer(HvLpIndex 
lpindex, void *buffer, u64 bufLen); extern void HvCall_writeLogBuffer(const void *buffer, u64 bufLen); static inline void HvCall_sendIPI(struct paca_struct *targetPaca) @@ -175,19 +110,4 @@ static inline void HvCall_sendIPI(struct HvCall1(HvCallBaseSendIPI, targetPaca->paca_index); } -static inline void HvCall_terminateMachineSrc(void) -{ - HvCall0(HvCallBaseTerminateMachineSrc); -} - -static inline void HvCall_setDABR(unsigned long val) -{ - HvCall1(HvCallCcSetDABR, val); -} - -static inline void HvCall_setDebugBus(unsigned long val) -{ - HvCall1(HvCallBaseSetDebugBus, val); -} - #endif /* _HVCALL_H */ diff -ruNp linus-iSeries-headers.8/include/asm-ppc64/paca.h linus-iSeries-headers.9/include/asm-ppc64/paca.h --- linus-iSeries-headers.8/include/asm-ppc64/paca.h 2005-05-20 09:05:56.000000000 +1000 +++ linus-iSeries-headers.9/include/asm-ppc64/paca.h 2005-06-02 15:52:17.000000000 +1000 @@ -20,13 +20,13 @@ #include #include #include +#include #include register struct paca_struct *local_paca asm("r13"); #define get_paca() local_paca struct task_struct; -struct ItLpQueue; /* * Defines the layout of the paca. 
From sfr at canb.auug.org.au Fri Jun 3 18:32:08 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 3 Jun 2005 18:32:08 +1000 Subject: [PATCH 10/10] ppc64 iSeries: misc header cleanups In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> Message-ID: <20050603183208.23e1da70.sfr@canb.auug.org.au> Hi Andrew, Last of this round of the iSeries header cleanups - don't have two defines for the same thing (HvMaxArchitectedLps and HvMaxArchitectedVirtualLans) - HvCallSc.h only needs linux/types.h - remove unused struct definition - add "extern" to some more function declarations Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.9/arch/ppc64/kernel/viopath.c linus-iSeries-headers.10/arch/ppc64/kernel/viopath.c --- linus-iSeries-headers.9/arch/ppc64/kernel/viopath.c 2005-06-02 17:06:58.000000000 +1000 +++ linus-iSeries-headers.10/arch/ppc64/kernel/viopath.c 2005-06-02 18:09:32.000000000 +1000 @@ -485,7 +485,7 @@ int viopath_open(HvLpIndex remoteLp, int unsigned long flags; int tempNumAllocated; - if ((remoteLp >= HvMaxArchitectedLps) || (remoteLp == HvLpIndexInvalid)) + if ((remoteLp >= HVMAXARCHITECTEDLPS) || (remoteLp == HvLpIndexInvalid)) return -EINVAL; subtype = subtype >> VIOMAJOR_SUBTYPE_SHIFT; @@ -556,7 +556,7 @@ int viopath_close(HvLpIndex remoteLp, in int numOpen; struct alloc_parms parms; - if ((remoteLp >= HvMaxArchitectedLps) || (remoteLp == HvLpIndexInvalid)) + if ((remoteLp >= HVMAXARCHITECTEDLPS) || (remoteLp == HvLpIndexInvalid)) return -EINVAL; subtype = subtype >> VIOMAJOR_SUBTYPE_SHIFT; diff -ruNp linus-iSeries-headers.9/include/asm-ppc64/iSeries/HvCallSc.h linus-iSeries-headers.10/include/asm-ppc64/iSeries/HvCallSc.h --- linus-iSeries-headers.9/include/asm-ppc64/iSeries/HvCallSc.h 2005-06-01 15:46:19.000000000 +1000 +++ 
linus-iSeries-headers.10/include/asm-ppc64/iSeries/HvCallSc.h 2005-06-02 14:00:14.000000000 +1000 @@ -19,7 +19,7 @@ #ifndef _HVCALLSC_H #define _HVCALLSC_H -#include +#include #define HvCallBase 0x8000000000000000ul #define HvCallCc 0x8001000000000000ul @@ -30,22 +30,22 @@ #define HvCallSm 0x8007000000000000ul #define HvCallXm 0x8009000000000000ul -u64 HvCall0(u64); -u64 HvCall1(u64, u64); -u64 HvCall2(u64, u64, u64); -u64 HvCall3(u64, u64, u64, u64); -u64 HvCall4(u64, u64, u64, u64, u64); -u64 HvCall5(u64, u64, u64, u64, u64, u64); -u64 HvCall6(u64, u64, u64, u64, u64, u64, u64); -u64 HvCall7(u64, u64, u64, u64, u64, u64, u64, u64); +extern u64 HvCall0(u64); +extern u64 HvCall1(u64, u64); +extern u64 HvCall2(u64, u64, u64); +extern u64 HvCall3(u64, u64, u64, u64); +extern u64 HvCall4(u64, u64, u64, u64, u64); +extern u64 HvCall5(u64, u64, u64, u64, u64, u64); +extern u64 HvCall6(u64, u64, u64, u64, u64, u64, u64); +extern u64 HvCall7(u64, u64, u64, u64, u64, u64, u64, u64); -u64 HvCall0Ret16(u64, void *); -u64 HvCall1Ret16(u64, void *, u64); -u64 HvCall2Ret16(u64, void *, u64, u64); -u64 HvCall3Ret16(u64, void *, u64, u64, u64); -u64 HvCall4Ret16(u64, void *, u64, u64, u64, u64); -u64 HvCall5Ret16(u64, void *, u64, u64, u64, u64, u64); -u64 HvCall6Ret16(u64, void *, u64, u64, u64, u64, u64, u64); -u64 HvCall7Ret16(u64, void *, u64, u64 ,u64 ,u64 ,u64 ,u64 ,u64); +extern u64 HvCall0Ret16(u64, void *); +extern u64 HvCall1Ret16(u64, void *, u64); +extern u64 HvCall2Ret16(u64, void *, u64, u64); +extern u64 HvCall3Ret16(u64, void *, u64, u64, u64); +extern u64 HvCall4Ret16(u64, void *, u64, u64, u64, u64); +extern u64 HvCall5Ret16(u64, void *, u64, u64, u64, u64, u64); +extern u64 HvCall6Ret16(u64, void *, u64, u64, u64, u64, u64, u64); +extern u64 HvCall7Ret16(u64, void *, u64, u64 ,u64 ,u64 ,u64 ,u64 ,u64); #endif /* _HVCALLSC_H */ diff -ruNp linus-iSeries-headers.9/include/asm-ppc64/iSeries/HvTypes.h linus-iSeries-headers.10/include/asm-ppc64/iSeries/HvTypes.h --- 
linus-iSeries-headers.9/include/asm-ppc64/iSeries/HvTypes.h 2005-06-01 16:45:03.000000000 +1000 +++ linus-iSeries-headers.10/include/asm-ppc64/iSeries/HvTypes.h 2005-06-02 14:15:29.000000000 +1000 @@ -40,14 +40,14 @@ typedef u64 HvIoToken; typedef u8 HvLpName[8]; typedef u32 HvIoId; typedef u64 HvRealMemoryIndex; -typedef u32 HvLpIndexMap; /* Must hold HvMaxArchitectedLps bits!!! */ +typedef u32 HvLpIndexMap; /* Must hold HVMAXARCHITECTEDLPS bits!!! */ typedef u16 HvLpVrmIndex; typedef u32 HvXmGenerationId; typedef u8 HvLpBusPool; typedef u8 HvLpSharedPoolIndex; typedef u16 HvLpSharedProcUnitsX100; typedef u8 HvLpVirtualLanIndex; -typedef u16 HvLpVirtualLanIndexMap; /* Must hold HvMaxArchitectedVirtualLans bits!!! */ +typedef u16 HvLpVirtualLanIndexMap; /* Must hold HVMAXARCHITECTEDVIRTUALLANS bits!!! */ typedef u16 HvBusNumber; /* Hypervisor Bus Number */ typedef u8 HvSubBusNumber; /* Hypervisor SubBus Number */ typedef u8 HvAgentId; /* Hypervisor DevFn */ @@ -66,15 +66,13 @@ typedef u8 HvAgentId; /* Hypervisor DevF #define HVPAGESPERMEG 256 #define HVPAGESPERCHUNK 64 -#define HvMaxArchitectedLps ((HvLpIndex)HVMAXARCHITECTEDLPS) -#define HvMaxArchitectedVirtualLans ((HvLpVirtualLanIndex)16) #define HvLpIndexInvalid ((HvLpIndex)0xff) /* * Enums for the sub-components under PLIC * Used in HvCall and HvPrimaryCall */ -enum HvCallCompIds { +enum { HvCallCompId = 0, HvCallCpuCtlsCompId = 1, HvCallCfgCompId = 2, diff -ruNp linus-iSeries-headers.9/include/asm-ppc64/iSeries/ItVpdAreas.h linus-iSeries-headers.10/include/asm-ppc64/iSeries/ItVpdAreas.h --- linus-iSeries-headers.9/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-02 16:22:09.000000000 +1000 +++ linus-iSeries-headers.10/include/asm-ppc64/iSeries/ItVpdAreas.h 2005-06-02 16:48:16.000000000 +1000 @@ -61,12 +61,6 @@ #define ItVpdAreasMaxSlotLabels 192 -struct SlicVpdAdrs { - u32 pad1; - void *vpdAddr; -}; - - struct ItVpdAreas { u32 xSlicDesc; // Descriptor 000-003 u16 xSlicSize; // Size of this control block 004-005 
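A note on the "Must hold ... bits!!!" comments in HvTypes.h above: a constraint like that can also be checked by the compiler instead of by a comment. What follows is only a minimal userspace sketch, not part of the patch. The value 16 for the virtual LAN limit is taken from the define removed above; the value 32 for HVMAXARCHITECTEDLPS is an assumption, since the patch does not show it. lp_map_set() is a hypothetical helper, and _Static_assert is C11 (kernel code of this era would more likely use something like BUILD_BUG_ON).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel typedefs (u32 and u16 in HvTypes.h). */
typedef uint32_t HvLpIndexMap;
typedef uint16_t HvLpVirtualLanIndexMap;

/* Assumed limits: 16 comes from the removed define, 32 is a guess. */
#define HVMAXARCHITECTEDLPS		32
#define HVMAXARCHITECTEDVIRTUALLANS	16

/* Number of bits a map type can hold. */
#define MAP_BITS(type)	(sizeof(type) * CHAR_BIT)

/* Fail the build if a map type is too narrow for its limit. */
_Static_assert(MAP_BITS(HvLpIndexMap) >= HVMAXARCHITECTEDLPS,
	       "HvLpIndexMap must hold HVMAXARCHITECTEDLPS bits");
_Static_assert(MAP_BITS(HvLpVirtualLanIndexMap) >= HVMAXARCHITECTEDVIRTUALLANS,
	       "HvLpVirtualLanIndexMap must hold HVMAXARCHITECTEDVIRTUALLANS bits");

/*
 * Hypothetical helper: set the bit for logical partition 'lp'.
 * Returns 1 on success, 0 if 'lp' is out of range (map left untouched).
 */
int lp_map_set(HvLpIndexMap *map, unsigned int lp)
{
	if (lp >= HVMAXARCHITECTEDLPS)
		return 0;
	*map |= (HvLpIndexMap)1 << lp;
	return 1;
}
```

With a check like this, raising the limit without widening the typedef would break the build rather than silently truncate the map.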
From david at gibson.dropbear.id.au Fri Jun 3 18:19:41 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 3 Jun 2005 18:19:41 +1000 Subject: Booting the linux-ppc64 kernel & flattened device tree v0.4 In-Reply-To: <20050602070918.GI4748@localhost.localdomain> References: <1117614390.19020.24.camel@gaston> <20050602070918.GI4748@localhost.localdomain> Message-ID: <20050603081941.GA1458@localhost.localdomain> On Thu, Jun 02, 2005 at 05:09:18PM +1000, David Gibson wrote: > On Wed, Jun 01, 2005 at 06:26:30PM +1000, Benjamin Herrenschmidt wrote: > > DO NOT REPLY TO ALL LISTS PLEASE ! (and CC me on replies). > > > > Here's the fourth version of my document along with new kernel patches > > for the new improved flattened format, and the first release of the > > device-tree "compiler" tool. The patches will be posted as a reply to > > this email. The compiler, dtc, can be downloaded, the URL is in the > > document. > > [snip] > > > IV - "dtc", the device tree compiler > > ==================================== > > > > > dtc source code can be found at > > > > I've just updated the dtc tarball with a new version. Notable > changes: > - Corrected comment parsing > - Corrected handling of #address-cells, #size-cells properties > - Input from device tree blobs should actually work now > - Corrected autogeneration of "name" properties in blob/asm > output version < 0x10 > - Added a TODO list And yet another. Notable changes: - Basic generation of the reserve map in blob output, use -R command line option to leave space for a number of reserve map entries to be filled in by bootloader. - Rewrite blob and assembler output to better share code, and produce more readable assembler output. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/people/dgibson From James.Bottomley at SteelEye.com Sat Jun 4 00:14:04 2005 From: James.Bottomley at SteelEye.com (James Bottomley) Date: Fri, 03 Jun 2005 09:14:04 -0500 Subject: [PATCH] fix slab corruption during ipr probe In-Reply-To: <20050602221509.GA11355@otto> References: <20050602221509.GA11355@otto> Message-ID: <1117808044.5030.11.camel@mulgrave> On Thu, 2005-06-02 at 17:15 -0500, Nathan Lynch wrote: > This patch works for me (tm); is it correct? > > Signed-off-by: Nathan Lynch Yes, it is. Apparently every other caller except IPR uses the standard scsi_scan_target() interface, which does have this extra get_device(), I can't think how it got left off the __scsi_add_device path ... well, except that it wouldn't show up during testing. I suppose someone should look at converting ipr to scsi_scan_target() and we can eliminate this API (the only difference is that scsi_add_device returns the actual device, but ipr never uses this...) In the meantime, I'll put this in rc fixes. Thanks, James From mbligh at mbligh.org Sat Jun 4 00:52:49 2005 From: mbligh at mbligh.org (Martin J. Bligh) Date: Fri, 03 Jun 2005 07:52:49 -0700 Subject: 2.6.12-rc5-git8 regression on PPC64 Message-ID: <374360000.1117810369@[10.10.2.4]> -git7 seems to boot fine, but -git8 is broken. 
See here: http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4547/debug/console.log for full boot log, but basically it does this: DEFAULT CATCH!, handler-entered=fff00300 Open Firmware exception handler entered from non-OF code Client's Fix Pt Regs: 00 0000000000000003 000000000291f800 00000000040acb88 0000000000000010 04 0000000024004042 0000000000000000 0000000000000000 0000000000000000 08 0000000000000000 000000077f800000 0000000000100000 0000000000000001 0c 2000000000000000 0000000000000000 0000000000000000 0000000000000000 10 0000000000000000 0000000000000000 0000000000000000 0000000000000000 14 0000000000230000 0000000780000000 0000000003a10000 0000000003ef2300 18 000000000291fa24 bffffffffc5f0000 000000077f800000 0000000003ef2478 1c bffffffffc5f0000 000000000291fa30 000000000291fab0 000000000291faf4 Special Regs: %IV: 00000300 %CR: 84004044 %XER: 00000000 %DSISR: 0a000000 %SRR0: 0000000003ec6644 %SRR1: 8000000000003000 %LR: 0000000003ec661c %CTR: 0000000000000000 %DAR: 000000077f800000 PID = 0 ofdbg Which doesn't give me anything useful, but perhaps it does for you ;-) M. From jdl at freescale.com Sat Jun 4 03:19:36 2005 From: jdl at freescale.com (Jon Loeliger) Date: Fri, 03 Jun 2005 12:19:36 -0500 Subject: Discuss: Adding OF Flat Dev Tree to ppc32 In-Reply-To: <1117783104.31082.151.camel@gaston> References: <1117614390.19020.24.camel@gaston> <1117614484.19020.27.camel@gaston> <1117783104.31082.151.camel@gaston> Message-ID: <1117819176.6517.290.camel@cashmere.sps.mot.com> Ben and Folks, I've read through ppc64/kernel/prom.c and done some minor call-chain analysis rooted at the two functions: early_init_devtree() unflatten_device_tree() as they are apparently the only two referenced in the initial early boot up process. My notion was to take the portion of prom.c rooted at these two functions and add them to the ppc32 line. First, what portions of pp64/kernel/prom.c are obsolete? Anything? 
You alluded to cleaning this up some, but I am not too familiar with it to know where that was headed. Second, there is already a fairly similar prom.c file hanging out over in ppc32 land. I _think_ it houses roughly the complementary code out of ppc64's prom.c that is NOT derived from the call chain derived from the above two functions. Which leads me to the questions: Is there, or should we create, a plan to factor the flat-dev-tree handling code into common or shared ppc code? I am reluctant to just outright clone and copy that code if it will ultimately "be the same" or even "mostly the same". It seems that the early_init_devtree() might then need to be refactored or duplicated for ppc32-land. Are you anticipating the same r3,r4,r5 interface outlined in your 0.4 rev of the ppc64 OF spec to be used by the ppc32 world as well? Seems like it just might... Naturally, I'm willing to jump in here, just looking for a bit of global-direction from you. :-) jdl From roland at topspin.com Sat Jun 4 03:42:46 2005 From: roland at topspin.com (Roland Dreier) Date: Fri, 03 Jun 2005 10:42:46 -0700 Subject: 2.6.12-rc5-git8 regression on PPC64 In-Reply-To: <374360000.1117810369@[10.10.2.4]> (Martin J. Bligh's message of "Fri, 03 Jun 2005 07:52:49 -0700") References: <374360000.1117810369@[10.10.2.4]> Message-ID: <52is0vwd49.fsf@topspin.com> I'm seeing something possibly related as well -- with an up-to-date git tree (HEAD == d8d088d25822b0199fdfb392085e1cf8a5914a97), I get a hang early in the boot on an OpenPower 710 (2 x POWER5). It seems to get just a little further: Please wait, loading kernel... Elf64 kernel loaded... Loading ramdisk...
ramdisk loaded at 02300000, size: 949 Kbytes OF stdout device is: /vdevice/vty at 30000000 command line: ro console=hvsi0 root=/dev/sdd5 memory layout at init: memory_limit : 0000000000000000 (16 MB aligned) alloc_bottom : 00000000023ee000 alloc_top : 0000000040000000 alloc_top_hi : 00000001e8000000 rmo_top : 0000000040000000 ram_top : 00000001e8000000 Looking for displays found display : /pci at 800000020000003/pci at 2,2/pci at 1/display at 0, opening ... done instantiating rtas at 0x00000000077ca000 ... done 0000000000000000 : boot cpu 0000000000000000 0000000000000002 : starting cpu hw idx 0000000000000002... done copying OF device tree ... Building dt strings... Building dt structure... Device tree strings 0x00000000027ef000 -> 0x00000000027f02f8 Device tree struct 0x00000000027f1000 -> 0x0000000002809000 and that's the last thing I see. - R. From ntl at pobox.com Sat Jun 4 04:27:25 2005 From: ntl at pobox.com (Nathan Lynch) Date: Fri, 3 Jun 2005 13:27:25 -0500 Subject: 2.6.12-rc5-git8 regression on PPC64 In-Reply-To: <52is0vwd49.fsf@topspin.com> References: <374360000.1117810369@[10.10.2.4]> <52is0vwd49.fsf@topspin.com> Message-ID: <20050603182725.GB11355@otto> Roland Dreier wrote: > I'm seeing something possibly related as well -- with an up-to-date > git tree (HEAD == d8d088d25822b0199fdfb392085e1cf8a5914a97), I get a > hang early in the boot on an OpenPower 710 (2 x POWER5). Backing this one out fixes it here on a Power5 570. http://marc.theaimsgroup.com/?l=bk-commits-head&m=111772917211270&q=raw Ben, any ideas? 
Nathan From roland at topspin.com Sat Jun 4 05:07:46 2005 From: roland at topspin.com (Roland Dreier) Date: Fri, 03 Jun 2005 12:07:46 -0700 Subject: 2.6.12-rc5-git8 regression on PPC64 In-Reply-To: <20050603182725.GB11355@otto> (Nathan Lynch's message of "Fri, 3 Jun 2005 13:27:25 -0500") References: <374360000.1117810369@[10.10.2.4]> <52is0vwd49.fsf@topspin.com> <20050603182725.GB11355@otto> Message-ID: <52vf4vuum5.fsf@topspin.com> Nathan> Backing this one out fixes it here on a Power5 570. Nathan> http://marc.theaimsgroup.com/?l=bk-commits-head&m=111772917211270&q=raw This fixes the boot on an OpenPower 710 for me as well. Thanks, Roland From ntl at pobox.com Sat Jun 4 05:25:25 2005 From: ntl at pobox.com (Nathan Lynch) Date: Fri, 3 Jun 2005 14:25:25 -0500 Subject: [PATCH] prom_find_machine_type typo breaks pSeries lpar boot In-Reply-To: <52vf4vuum5.fsf@topspin.com> References: <374360000.1117810369@[10.10.2.4]> <52is0vwd49.fsf@topspin.com> <20050603182725.GB11355@otto> <52vf4vuum5.fsf@topspin.com> Message-ID: <20050603192525.GC11355@otto> Typo in prom_find_machine_type from Ben's recent patch "ppc64: Fix result code handling in prom_init" prevents pSeries LPAR systems from booting. Tested on a pSeries 570 and OpenPower 720 (both Power5 LPAR). Signed-off-by: Nathan Lynch arch/ppc64/kernel/prom_init.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6.12-rc5-git8/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-2.6.12-rc5-git8.orig/arch/ppc64/kernel/prom_init.c +++ linux-2.6.12-rc5-git8/arch/ppc64/kernel/prom_init.c @@ -1370,7 +1370,7 @@ static int __init prom_find_machine_type } /* Default to pSeries. 
We need to know if we are running LPAR */ rtas = call_prom("finddevice", 1, 1, ADDR("/rtas")); - if (!PHANDLE_VALID(rtas)) { + if (PHANDLE_VALID(rtas)) { int x = prom_getproplen(rtas, "ibm,hypertas-functions"); if (x != PROM_ERROR) { prom_printf("Hypertas detected, assuming LPAR !\n"); From benh at kernel.crashing.org Sat Jun 4 08:56:23 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 04 Jun 2005 08:56:23 +1000 Subject: 2.6.12-rc5-git8 regression on PPC64 In-Reply-To: <20050603182725.GB11355@otto> References: <374360000.1117810369@[10.10.2.4]> <52is0vwd49.fsf@topspin.com> <20050603182725.GB11355@otto> Message-ID: <1117839384.31082.186.camel@gaston> On Fri, 2005-06-03 at 13:27 -0500, Nathan Lynch wrote: > Roland Dreier wrote: > > I'm seeing something possibly related as well -- with an up-to-date > > git tree (HEAD == d8d088d25822b0199fdfb392085e1cf8a5914a97), I get a > > hang early in the boot on an OpenPower 710 (2 x POWER5). > > Backing this one out fixes it here on a Power5 570. > > http://marc.theaimsgroup.com/?l=bk-commits-head&m=111772917211270&q=raw > > Ben, any ideas? Weird, works fine on our POWER5 here. What is the output ? Especially, what if you enable DEBUG_PROM in there ? Ben. From benh at kernel.crashing.org Sat Jun 4 08:58:20 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 04 Jun 2005 08:58:20 +1000 Subject: [PATCH] prom_find_machine_type typo breaks pSeries lpar boot In-Reply-To: <20050603192525.GC11355@otto> References: <374360000.1117810369@[10.10.2.4]> <52is0vwd49.fsf@topspin.com> <20050603182725.GB11355@otto> <52vf4vuum5.fsf@topspin.com> <20050603192525.GC11355@otto> Message-ID: <1117839500.31082.188.camel@gaston> On Fri, 2005-06-03 at 14:25 -0500, Nathan Lynch wrote: > Typo in prom_find_machine_type from Ben's recent patch "ppc64: Fix > result code handling in prom_init" prevents pSeries LPAR systems from > booting. > > Tested on a pSeries 570 and OpenPower 720 (both Power5 LPAR). Damn ! 
I'm certain I tested it on P5 ! I must have forgotten to "quilt ref" before sending the patch (as I did notice this typo and fixed it just before sending). Sorry! Ben. From frowand at mvista.com Sat Jun 4 09:34:41 2005 From: frowand at mvista.com (Frank Rowand) Date: Fri, 3 Jun 2005 16:34:41 -0700 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart Message-ID: <200506032334.j53NYfUm010224@localhost.localdomain> The Maple board currently returns from a request to halt, power off, or restart. This patch causes all processors to instead spin in these cases so that the system appears to have halted. This patch applies against linux-2.6.12-rc5. Signed-off-by: Frank Rowand Index: linux-2.6.10/arch/ppc64/kernel/maple_setup.c =================================================================== --- linux-2.6.10.orig/arch/ppc64/kernel/maple_setup.c +++ linux-2.6.10/arch/ppc64/kernel/maple_setup.c @@ -81,14 +81,32 @@ extern void add_kgdb_port(void); static void maple_restart(char *cmd) { +#ifdef CONFIG_SMP + smp_send_stop(); +#endif + printk(KERN_EMERG "System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) ; } static void maple_power_off(void) { +#ifdef CONFIG_SMP + smp_send_stop(); +#endif + printk(KERN_EMERG "System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) ; } static void maple_halt(void) { +#ifdef CONFIG_SMP + smp_send_stop(); +#endif + printk(KERN_EMERG "System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) ; } #ifdef CONFIG_SMP From domen at coderock.org Sun Jun 5 00:19:18 2005 From: domen at coderock.org (Domen Puncer) Date: Sat, 4 Jun 2005 16:19:18 +0200 Subject: prom_init: memset, wrong size Message-ID: <20050604141918.GA9687@nd47.coderock.org> Hi. I noticed this in arch/ppc64/kernel/prom_init.c:1041: memset(path, 0, sizeof(path)); path is defined as: char *path = RELOC(prom_scratch); So sizeof(*char) should probably be PROM_SCRATCH_SIZE, right? 
Domen From hollis at penguinppc.org Sun Jun 5 03:46:04 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Sat, 4 Jun 2005 12:46:04 -0500 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart In-Reply-To: <200506032334.j53NYfUm010224@localhost.localdomain> References: <200506032334.j53NYfUm010224@localhost.localdomain> Message-ID: <6628df334a76030b1b9e3e477f7cfc75@penguinppc.org> On Jun 3, 2005, at 6:34 PM, Frank Rowand wrote: > > The Maple board currently returns from a request to halt, power off, or > restart. This patch causes all processors to instead spin in these > cases > so that the system appears to have halted. Can't you write to magic locations in SuperIO NVRAM to have PIBS running on the 405 perform some of these actions for you? -Hollis From benh at kernel.crashing.org Sun Jun 5 08:31:12 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 05 Jun 2005 08:31:12 +1000 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart In-Reply-To: <6628df334a76030b1b9e3e477f7cfc75@penguinppc.org> References: <200506032334.j53NYfUm010224@localhost.localdomain> <6628df334a76030b1b9e3e477f7cfc75@penguinppc.org> Message-ID: <1117924272.31082.218.camel@gaston> On Sat, 2005-06-04 at 12:46 -0500, Hollis Blanchard wrote: > On Jun 3, 2005, at 6:34 PM, Frank Rowand wrote: > > > > The Maple board currently returns from a request to halt, power off, or > > restart. This patch causes all processors to instead spin in these > > cases > > so that the system appears to have halted. > > Can't you write to magic locations in SuperIO NVRAM to have PIBS > running on the 405 perform some of these actions for you? It's possible, though the Maple can't do the full power off properly due to some issue between the cpc925 and the amd8111 Ben. 
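On the prom_init memset thread (Domen's report above): the effect of memset(path, 0, sizeof(path)) when path is a char pointer can be shown with a small userspace sketch. This is not the kernel code. scratch and SCRATCH_SIZE are hypothetical stand-ins for prom_scratch and PROM_SCRATCH_SIZE, and the size of 256 is an assumption, since the real value does not appear in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for prom_scratch / PROM_SCRATCH_SIZE. */
#define SCRATCH_SIZE 256
static char scratch[SCRATCH_SIZE];

/*
 * Mirrors the reported call: the length passed to memset is the size
 * of the pointer, so only the first few bytes get cleared.
 * Returns the number of bytes cleared.
 */
size_t clear_with_pointer_size(void)
{
	char *path = scratch;
	size_t n = sizeof(path);	/* size of a pointer, not of the buffer */

	assert(n < SCRATCH_SIZE);
	memset(scratch, 'x', SCRATCH_SIZE);	/* pre-fill so stale bytes show */
	memset(path, 0, n);
	return n;
}

/*
 * Suggested form: pass the buffer size explicitly.
 * Returns the number of bytes cleared.
 */
size_t clear_with_buffer_size(void)
{
	char *path = scratch;

	memset(scratch, 'x', SCRATCH_SIZE);
	memset(path, 0, SCRATCH_SIZE);
	return SCRATCH_SIZE;
}

/* Count the bytes that still hold the 'x' fill after a clear. */
size_t stale_bytes(void)
{
	size_t i, n = 0;

	for (i = 0; i < SCRATCH_SIZE; i++)
		if (scratch[i] == 'x')
			n++;
	return n;
}
```

On a 64-bit build the first variant clears 8 bytes and leaves the rest of the buffer untouched. Whether that matters depends on how the caller uses the buffer afterwards.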
From benh at kernel.crashing.org Sun Jun 5 08:56:23 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 05 Jun 2005 08:56:23 +1000 Subject: prom_init: memset, wrong size In-Reply-To: <20050604141918.GA9687@nd47.coderock.org> References: <20050604141918.GA9687@nd47.coderock.org> Message-ID: <1117925783.31082.223.camel@gaston> On Sat, 2005-06-04 at 16:19 +0200, Domen Puncer wrote: > Hi. > > I noticed this in arch/ppc64/kernel/prom_init.c:1041: > memset(path, 0, sizeof(path)); > > path is defined as: > char *path = RELOC(prom_scratch); > > So sizeof(*char) should probably be PROM_SCRATCH_SIZE, right? True, though probably harmless too. I'll send a fix. Ben. From olh at suse.de Sun Jun 5 17:49:45 2005 From: olh at suse.de (Olaf Hering) Date: Sun, 5 Jun 2005 09:49:45 +0200 Subject: [PATCH] update ppc64 defconfig Message-ID: <20050605074945.GA16638@suse.de> enable cpusets enable new lpfc and jsm drivers enable new dm-multipath leave new agp disabled disable rivafb, it does not handle the cards in G5 models (FX5200 as example) the new nvidiafb doesnt work on bigendian, yet Signed-off-by: Olaf Hering arch/ppc64/defconfig | 104 ++++++++++++++++++++++++++++++++------------------- 1 files changed, 67 insertions(+), 37 deletions(-) Index: linux-2.6.12-rc5-git9/arch/ppc64/defconfig =================================================================== --- linux-2.6.12-rc5-git9.orig/arch/ppc64/defconfig +++ linux-2.6.12-rc5-git9/arch/ppc64/defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.11-rc3-bk6 -# Wed Feb 9 23:34:51 2005 +# Linux kernel version: 2.6.12-rc5-git9 +# Sun Jun 5 09:26:47 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -11,7 +11,7 @@ CONFIG_GENERIC_ISA_DMA=y CONFIG_HAVE_DEC_LOCK=y CONFIG_EARLY_PRINTK=y CONFIG_COMPAT=y -CONFIG_FRAME_POINTER=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_FORCE_MAX_ZONEORDER=13 # @@ -20,6 +20,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_EXPERIMENTAL=y 
CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y +CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup @@ -30,24 +31,28 @@ CONFIG_SYSVIPC=y CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y -CONFIG_LOG_BUF_SHIFT=17 +# CONFIG_AUDIT is not set CONFIG_HOTPLUG=y CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y +CONFIG_CPUSETS=y # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y -# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 # CONFIG_TINY_SHMEM is not set +CONFIG_BASE_SMALL=0 # # Loadable module support @@ -91,9 +96,12 @@ CONFIG_DISCONTIGMEM=y CONFIG_EEH=y CONFIG_GENERIC_HARDIRQS=y CONFIG_PPC_RTAS=y +CONFIG_RTAS_PROC=y CONFIG_RTAS_FLASH=m CONFIG_SCANLOG=m CONFIG_LPARCFG=y +CONFIG_SECCOMP=y +CONFIG_ISA_DMA_API=y # # General setup @@ -104,6 +112,7 @@ CONFIG_BINFMT_ELF=y CONFIG_BINFMT_MISC=m # CONFIG_PCI_LEGACY_PROC is not set # CONFIG_PCI_NAMES is not set +# CONFIG_PCI_DEBUG is not set CONFIG_HOTPLUG_CPU=y # @@ -112,10 +121,6 @@ CONFIG_HOTPLUG_CPU=y # CONFIG_PCCARD is not set # -# PC-card bridges -# - -# # PCI Hotplug Support # CONFIG_HOTPLUG_PCI=m @@ -149,11 +154,10 @@ CONFIG_FW_LOADER=y # CONFIG_PARPORT=m CONFIG_PARPORT_PC=m -CONFIG_PARPORT_PC_CML1=m # CONFIG_PARPORT_SERIAL is not set # CONFIG_PARPORT_PC_FIFO is not set # CONFIG_PARPORT_PC_SUPERIO is not set -# CONFIG_PARPORT_OTHER is not set +# CONFIG_PARPORT_GSC is not set # CONFIG_PARPORT_1284 is not set # @@ -301,6 +305,7 @@ CONFIG_SCSI_SATA_SVW=y # CONFIG_SCSI_ATA_PIIX is not set # CONFIG_SCSI_SATA_NV is not set # CONFIG_SCSI_SATA_PROMISE is not set +# CONFIG_SCSI_SATA_QSTOR is not set # CONFIG_SCSI_SATA_SX4 is not set # CONFIG_SCSI_SATA_SIL is not set # CONFIG_SCSI_SATA_SIS is not set @@ -310,7 +315,6 @@ CONFIG_SCSI_SATA_SVW=y # 
CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_EATA is not set -# CONFIG_SCSI_EATA_PIO is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set # CONFIG_SCSI_GDTH is not set # CONFIG_SCSI_IPS is not set @@ -327,7 +331,6 @@ CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 CONFIG_SCSI_IPR=y CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y -# CONFIG_SCSI_QLOGIC_ISP is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA2XXX=y @@ -336,6 +339,7 @@ CONFIG_SCSI_QLA22XX=m CONFIG_SCSI_QLA2300=m CONFIG_SCSI_QLA2322=m CONFIG_SCSI_QLA6312=m +CONFIG_SCSI_LPFC=m # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set CONFIG_SCSI_DEBUG=m @@ -358,6 +362,8 @@ CONFIG_DM_CRYPT=m CONFIG_DM_SNAPSHOT=m CONFIG_DM_MIRROR=m CONFIG_DM_ZERO=m +CONFIG_DM_MULTIPATH=m +CONFIG_DM_MULTIPATH_EMC=m # # Fusion MPT device support @@ -405,6 +411,7 @@ CONFIG_IEEE1394_AMDTP=m # CONFIG_ADB=y CONFIG_ADB_PMU=y +CONFIG_PMAC_SMU=y # CONFIG_PMAC_PBOOK is not set # CONFIG_PMAC_BACKLIGHT is not set # CONFIG_INPUT_ADBHID is not set @@ -420,7 +427,6 @@ CONFIG_NET=y # CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set -# CONFIG_NETLINK_DEV is not set CONFIG_UNIX=y CONFIG_NET_KEY=m CONFIG_INET=y @@ -588,7 +594,6 @@ CONFIG_PCNET32=y # CONFIG_DGRS is not set # CONFIG_EEPRO100 is not set CONFIG_E100=y -# CONFIG_E100_NAPI is not set # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set # CONFIG_NE2K_PCI is not set @@ -614,6 +619,8 @@ CONFIG_E1000=y # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y +# CONFIG_BNX2 is not set +# CONFIG_MV643XX_ETH is not set # # Ethernet (10000 Mbit) @@ -683,20 +690,6 @@ CONFIG_INPUT_EVDEV=m # CONFIG_INPUT_EVBUG is not set # -# Input I/O drivers -# -# CONFIG_GAMEPORT is not set -CONFIG_SOUND_GAMEPORT=y -CONFIG_SERIO=y -CONFIG_SERIO_I8042=y -# CONFIG_SERIO_SERPORT is not set -# CONFIG_SERIO_CT82C710 is not set -# CONFIG_SERIO_PARKBD is not set -# CONFIG_SERIO_PCIPS2 is not set -CONFIG_SERIO_LIBPS2=y -# 
CONFIG_SERIO_RAW is not set - -# # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y @@ -716,6 +709,18 @@ CONFIG_INPUT_PCSPKR=m # CONFIG_INPUT_UINPUT is not set # +# Hardware I/O ports +# +CONFIG_SERIO=y +CONFIG_SERIO_I8042=y +# CONFIG_SERIO_SERPORT is not set +# CONFIG_SERIO_PARKBD is not set +# CONFIG_SERIO_PCIPS2 is not set +CONFIG_SERIO_LIBPS2=y +# CONFIG_SERIO_RAW is not set +# CONFIG_GAMEPORT is not set + +# # Character devices # CONFIG_VT=y @@ -738,6 +743,7 @@ CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y # CONFIG_SERIAL_PMACZILOG is not set CONFIG_SERIAL_ICOM=m +CONFIG_SERIAL_JSM=m CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 @@ -766,9 +772,16 @@ CONFIG_HVCS=m # # Ftape, the floppy tape device driver # +# CONFIG_AGP is not set # CONFIG_DRM is not set CONFIG_RAW_DRIVER=y CONFIG_MAX_RAW_DEVS=256 +# CONFIG_HANGCHECK_TIMER is not set + +# +# TPM devices +# +# CONFIG_TCG_TPM is not set # # I2C support @@ -793,9 +806,9 @@ CONFIG_I2C_ALGOBIT=y CONFIG_I2C_AMD8111=y # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set +# CONFIG_I2C_PIIX4 is not set # CONFIG_I2C_ISA is not set CONFIG_I2C_KEYWEST=y -# CONFIG_I2C_MPC is not set # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT is not set # CONFIG_I2C_PARPORT_LIGHT is not set @@ -822,7 +835,9 @@ CONFIG_I2C_KEYWEST=y # CONFIG_SENSORS_ASB100 is not set # CONFIG_SENSORS_DS1621 is not set # CONFIG_SENSORS_FSCHER is not set +# CONFIG_SENSORS_FSCPOS is not set # CONFIG_SENSORS_GL518SM is not set +# CONFIG_SENSORS_GL520SM is not set # CONFIG_SENSORS_IT87 is not set # CONFIG_SENSORS_LM63 is not set # CONFIG_SENSORS_LM75 is not set @@ -833,9 +848,11 @@ CONFIG_I2C_KEYWEST=y # CONFIG_SENSORS_LM85 is not set # CONFIG_SENSORS_LM87 is not set # CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set # CONFIG_SENSORS_MAX1619 is not set # CONFIG_SENSORS_PC87360 is not set # CONFIG_SENSORS_SMSC47B397 is not set +# CONFIG_SENSORS_SIS5595 is not set # CONFIG_SENSORS_SMSC47M1 is not set # 
CONFIG_SENSORS_VIA686A is not set # CONFIG_SENSORS_W83781D is not set @@ -845,6 +862,7 @@ CONFIG_I2C_KEYWEST=y # # Other I2C Chip support # +# CONFIG_SENSORS_DS1337 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set # CONFIG_SENSORS_PCF8591 is not set @@ -877,6 +895,11 @@ CONFIG_I2C_KEYWEST=y # Graphics support # CONFIG_FB=y +CONFIG_FB_CFB_FILLRECT=y +CONFIG_FB_CFB_COPYAREA=y +CONFIG_FB_CFB_IMAGEBLIT=y +CONFIG_FB_SOFT_CURSOR=y +CONFIG_FB_MACMODES=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y # CONFIG_FB_CIRRUS is not set @@ -890,9 +913,8 @@ CONFIG_FB_OF=y # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set -CONFIG_FB_RIVA=y -CONFIG_FB_RIVA_I2C=y -# CONFIG_FB_RIVA_DEBUG is not set +# CONFIG_FB_NVIDIA is not set +# CONFIG_FB_RIVA is not set CONFIG_FB_MATROX=y CONFIG_FB_MATROX_MILLENIUM=y CONFIG_FB_MATROX_MYSTIQUE=y @@ -913,6 +935,7 @@ CONFIG_FB_RADEON_I2C=y # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set # CONFIG_FB_TRIDENT is not set +# CONFIG_FB_S1D13XXX is not set # CONFIG_FB_VIRTUAL is not set # @@ -946,6 +969,8 @@ CONFIG_LCD_DEVICE=y # # USB support # +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB=y # CONFIG_USB_DEBUG is not set @@ -956,8 +981,6 @@ CONFIG_USB_DEVICEFS=y # CONFIG_USB_BANDWIDTH is not set # CONFIG_USB_DYNAMIC_MINORS is not set # CONFIG_USB_OTG is not set -CONFIG_USB_ARCH_HAS_HCD=y -CONFIG_USB_ARCH_HAS_OHCI=y # # USB Host Controller Drivers @@ -966,6 +989,8 @@ CONFIG_USB_EHCI_HCD=y # CONFIG_USB_EHCI_SPLIT_ISO is not set # CONFIG_USB_EHCI_ROOT_HUB_TT is not set CONFIG_USB_OHCI_HCD=y +# CONFIG_USB_OHCI_BIG_ENDIAN is not set +CONFIG_USB_OHCI_LITTLE_ENDIAN=y # CONFIG_USB_UHCI_HCD is not set # CONFIG_USB_SL811_HCD is not set @@ -981,12 +1006,11 @@ CONFIG_USB_OHCI_HCD=y # CONFIG_USB_STORAGE=m # CONFIG_USB_STORAGE_DEBUG is not set -CONFIG_USB_STORAGE_RW_DETECT=y # CONFIG_USB_STORAGE_DATAFAB is not set # CONFIG_USB_STORAGE_FREECOM is not set # 
CONFIG_USB_STORAGE_ISD200 is not set # CONFIG_USB_STORAGE_DPCM is not set -# CONFIG_USB_STORAGE_HP8200e is not set +# CONFIG_USB_STORAGE_USBAT is not set # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set @@ -1030,6 +1054,7 @@ CONFIG_USB_HIDDEV=y CONFIG_USB_PEGASUS=y # CONFIG_USB_RTL8150 is not set # CONFIG_USB_USBNET is not set +# CONFIG_USB_MON is not set # # USB port drivers @@ -1055,6 +1080,7 @@ CONFIG_USB_PEGASUS=y # CONFIG_USB_PHIDGETKIT is not set # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set +# CONFIG_USB_SISUSBVGA is not set # CONFIG_USB_TEST is not set # @@ -1276,10 +1302,13 @@ CONFIG_OPROFILE=y # # Kernel hacking # +# CONFIG_PRINTK_TIME is not set CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_LOG_BUF_SHIFT=17 # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +# CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set @@ -1311,6 +1340,7 @@ CONFIG_CRYPTO_SHA1=m CONFIG_CRYPTO_SHA256=m CONFIG_CRYPTO_SHA512=m CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_DES=y CONFIG_CRYPTO_BLOWFISH=m CONFIG_CRYPTO_TWOFISH=m From anton at samba.org Mon Jun 6 09:02:53 2005 From: anton at samba.org (Anton Blanchard) Date: Mon, 6 Jun 2005 09:02:53 +1000 Subject: SCSI timeouts (was: Re: RS/6000 7017-S7A hangs on boot) In-Reply-To: <20050531212638.GA11249@exception.at> References: <20050522201816.GA8254@exception.at> <20050522205427.GE20174@krispykreme> <20050531212638.GA11249@exception.at> Message-ID: <20050605230253.GD28002@krispykreme> Hi, > Heya. Seems like this was the point. But now I am facing strange SCSI > timeouts. Any ideas? (these SCSI devices work well under AIX) ... > 0:0:0:0: ABORT operation started. > 0:0:0:0: ABORT operation timed-out. > 0:0:0:0: DEVICE RESET operation started. > 0:0:0:0: DEVICE RESET operation timed-out. > 0:0:0:0: BUS RESET operation started. 
> 0:0:0:0: BUS RESET operation timed-out.
> 0:0:0:0: HOST RESET operation started.
> sym0: SCSI BUS has been reset.
> 0:0:0:0: HOST RESET operation timed-out.
> scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0

Looks like the sym2 driver isn't getting interrupts. There is an awful hack
in arch/ppc64/kernel/pSeries_pci.c:check_s7a() to work around a bug in our
OF interrupt parsing.

Your dmesg prints:

> sym0: <825a> rev 0x13 at pci 0000:00:0b.0 irq 25

Our s7a prints:

sym0: <875> rev 0x3 at pci 0000:00:0b.0 irq 22

So it looks like check_s7a isn't triggering. You could fix this function,
or a far better option is to fix our interrupt parsing code :)

Anton

From cfriesen at nortel.com Mon Jun 6 14:04:39 2005
From: cfriesen at nortel.com (Chris Friesen)
Date: Sun, 05 Jun 2005 22:04:39 -0600
Subject: [PATCH] ppc64: do not return from maple halt, power off, restart
In-Reply-To: <1117924272.31082.218.camel@gaston>
References: <200506032334.j53NYfUm010224@localhost.localdomain> <6628df334a76030b1b9e3e477f7cfc75@penguinppc.org> <1117924272.31082.218.camel@gaston>
Message-ID: <42A3CB57.4060704@nortel.com>

Benjamin Herrenschmidt wrote:

> It's possible, though the Maple can't do the full power off properly due
> to some issue between the cpc925 and the amd8111

When I first heard that the Maple couldn't halt/reset itself I was
fairly surprised--it just seems so odd to not be able to do something as
fundamental as "reboot".
Chris

From benh at kernel.crashing.org Mon Jun 6 14:07:14 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 06 Jun 2005 14:07:14 +1000
Subject: [PATCH] ppc64: do not return from maple halt, power off, restart
In-Reply-To: <42A3CB57.4060704@nortel.com>
References: <200506032334.j53NYfUm010224@localhost.localdomain> <6628df334a76030b1b9e3e477f7cfc75@penguinppc.org> <1117924272.31082.218.camel@gaston> <42A3CB57.4060704@nortel.com>
Message-ID: <1118030834.31082.281.camel@gaston>

On Sun, 2005-06-05 at 22:04 -0600, Chris Friesen wrote:
> Benjamin Herrenschmidt wrote:
>
> > It's possible, though the Maple can't do the full power off properly due
> > to some issue between the cpc925 and the amd8111
>
> When I first heard that the Maple couldn't halt/reset itself I was
> fairly surprised--it just seems so odd to not be able to do something as
> fundamental as "reboot".

It can reset, I think; the problem is with the halt.

Ben.

From sfr at canb.auug.org.au Mon Jun 6 16:14:15 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 6 Jun 2005 16:14:15 +1000
Subject: [PATCH 0/6] ppc64 iSeries: PCI related cleanups
In-Reply-To: <20050603175819.3d143a07.sfr@canb.auug.org.au>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au>
Message-ID: <20050606161415.055dce39.sfr@canb.auug.org.au>

Hi Andrew,

This is another in my series of iSeries code cleanups. The following
patches depend on the previous series (finishing with the patch you
called ppc64-iseries-misc-header-cleanups.patch).

This series of patches starts to look at the iSeries PCI code.
 arch/ppc64/kernel/Makefile              |    5
 arch/ppc64/kernel/iSeries_VpdInfo.c     |  219 +++++++++++++++-----------------
 arch/ppc64/kernel/iSeries_pci.c         |   43 ++----
 arch/ppc64/kernel/iSeries_pci_reset.c   |  104 ---------------
 include/asm-ppc64/iSeries/HvCallXm.h    |   17 ++
 include/asm-ppc64/iSeries/iSeries_pci.h |   29 ----
 include/asm-ppc64/iommu.h               |   21 ---
 7 files changed, 147 insertions(+), 291 deletions(-)

--
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050606/8ee37a06/attachment.pgp

From sfr at canb.auug.org.au Mon Jun 6 16:22:11 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 6 Jun 2005 16:22:11 +1000
Subject: [PATCH 2/6] ppc64 iSeries: iommu.h cleanups
In-Reply-To: <20050606161415.055dce39.sfr@canb.auug.org.au>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au>
Message-ID: <20050606162211.5f2713c8.sfr@canb.auug.org.au>

Hi Andrew,

The iommu_table_cb structure is iSeries specific, so move it to the
header file that declares the function we pass it to.

vio_tce_table and iommu_setup_iSeries no longer exist, so remove their
declarations.
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.11/include/asm-ppc64/iSeries/HvCallXm.h linus-iSeries-headers.12/include/asm-ppc64/iSeries/HvCallXm.h --- linus-iSeries-headers.11/include/asm-ppc64/iSeries/HvCallXm.h 2005-06-01 18:01:10.000000000 +1000 +++ linus-iSeries-headers.12/include/asm-ppc64/iSeries/HvCallXm.h 2005-06-04 17:05:26.000000000 +1000 @@ -16,6 +16,23 @@ #define HvCallXmSetTce HvCallXm + 11 #define HvCallXmSetTces HvCallXm + 13 +/* + * Structure passed to HvCallXm_getTceTableParms + */ +struct iommu_table_cb { + unsigned long itc_busno; /* Bus number for this tce table */ + unsigned long itc_start; /* Will be NULL for secondary */ + unsigned long itc_totalsize; /* Size (in pages) of whole table */ + unsigned long itc_offset; /* Index into real tce table of the + start of our section */ + unsigned long itc_size; /* Size (in pages) of our section */ + unsigned long itc_index; /* Index of this tce table */ + unsigned short itc_maxtables; /* Max num of tables for partition */ + unsigned char itc_virtbus; /* Flag to indicate virtual bus */ + unsigned char itc_slotno; /* IOA Tce Slot Index */ + unsigned char itc_rsvd[4]; +}; + static inline void HvCallXm_getTceTableParms(u64 cb) { HvCall1(HvCallXmGetTceTableParms, cb); diff -ruNp linus-iSeries-headers.11/include/asm-ppc64/iommu.h linus-iSeries-headers.12/include/asm-ppc64/iommu.h --- linus-iSeries-headers.11/include/asm-ppc64/iommu.h 2005-05-20 09:05:55.000000000 +1000 +++ linus-iSeries-headers.12/include/asm-ppc64/iommu.h 2005-06-04 17:13:20.000000000 +1000 @@ -82,24 +82,6 @@ struct iommu_table { unsigned long *it_map; /* A simple allocation bitmap for now */ }; -#ifdef CONFIG_PPC_ISERIES -struct iommu_table_cb { - unsigned long itc_busno; /* Bus number for this tce table */ - unsigned long itc_start; /* Will be NULL for secondary */ - unsigned long itc_totalsize; /* Size (in pages) of whole table */ - 
unsigned long itc_offset; /* Index into real tce table of the - start of our section */ - unsigned long itc_size; /* Size (in pages) of our section */ - unsigned long itc_index; /* Index of this tce table */ - unsigned short itc_maxtables; /* Max num of tables for partition */ - unsigned char itc_virtbus; /* Flag to indicate virtual bus */ - unsigned char itc_slotno; /* IOA Tce Slot Index */ - unsigned char itc_rsvd[4]; -}; - -extern struct iommu_table vio_tce_table; /* Tce table for virtual bus */ -#endif /* CONFIG_PPC_ISERIES */ - struct scatterlist; #ifdef CONFIG_PPC_MULTIPLATFORM @@ -122,9 +104,6 @@ extern void iommu_devnode_init_pSeries(s #ifdef CONFIG_PPC_ISERIES -/* Walks all buses and creates iommu tables */ -extern void iommu_setup_iSeries(void); - /* Initializes tables for bio buses */ extern void __init iommu_vio_init(void);

From sfr at canb.auug.org.au Mon Jun 6 16:22:28 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 6 Jun 2005 16:22:28 +1000
Subject: [PATCH 1/6] ppc64 iSeries: remove iSeries_pci_reset.c
In-Reply-To: <20050606161415.055dce39.sfr@canb.auug.org.au>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au>
Message-ID: <20050606162228.5df06ac4.sfr@canb.auug.org.au>

Hi Andrew,

The file arch/ppc64/kernel/iSeries_pci_reset.c contains only one function
that is not used anywhere (any more). Remove it.

This function is the only user of the ReturnCode member of
iSeries_Device_Node, so remove that as well.
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.10/arch/ppc64/kernel/Makefile linus-iSeries-headers.11/arch/ppc64/kernel/Makefile --- linus-iSeries-headers.10/arch/ppc64/kernel/Makefile 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.11/arch/ppc64/kernel/Makefile 2005-06-06 14:53:29.000000000 +1000 @@ -16,7 +16,7 @@ obj-y += vdso32/ vdso64/ obj-$(CONFIG_PPC_OF) += of_device.o -pci-obj-$(CONFIG_PPC_ISERIES) += iSeries_pci.o iSeries_pci_reset.o +pci-obj-$(CONFIG_PPC_ISERIES) += iSeries_pci.o pci-obj-$(CONFIG_PPC_MULTIPLATFORM) += pci_dn.o pci_direct_iommu.o obj-$(CONFIG_PCI) += pci.o pci_iommu.o iomap.o $(pci-obj-y) diff -ruNp linus-iSeries-headers.10/arch/ppc64/kernel/iSeries_pci_reset.c linus-iSeries-headers.11/arch/ppc64/kernel/iSeries_pci_reset.c --- linus-iSeries-headers.10/arch/ppc64/kernel/iSeries_pci_reset.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.11/arch/ppc64/kernel/iSeries_pci_reset.c 1970-01-01 10:00:00.000000000 +1000 @@ -1,104 +0,0 @@ -#define PCIFR(...) -/************************************************************************/ -/* File iSeries_pci_reset.c created by Allan Trautman on Mar 21 2001. */ -/************************************************************************/ -/* This code supports the pci interface on the IBM iSeries systems. */ -/* Copyright (C) 20yy */ -/* */ -/* This program is free software; you can redistribute it and/or modify */ -/* it under the terms of the GNU General Public License as published by */ -/* the Free Software Foundation; either version 2 of the License, or */ -/* (at your option) any later version. */ -/* */ -/* This program is distributed in the hope that it will be useful, */ -/* but WITHOUT ANY WARRANTY; without even the implied warranty of */ -/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */ -/* GNU General Public License for more details. 
*/ -/* */ -/* You should have received a copy of the GNU General Public License */ -/* along with this program; if not, write to the: */ -/* Free Software Foundation, Inc., */ -/* 59 Temple Place, Suite 330, */ -/* Boston, MA 02111-1307 USA */ -/************************************************************************/ -/* Change Activity: */ -/* Created, March 20, 2001 */ -/* April 30, 2001, Added return codes on functions. */ -/* September 10, 2001, Ported to ppc64. */ -/* End Change Activity */ -/************************************************************************/ -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include - -#include -#include "pci.h" - -/* - * Interface to toggle the reset line - * Time is in .1 seconds, need for seconds. - */ -int iSeries_Device_ToggleReset(struct pci_dev *PciDev, int AssertTime, - int DelayTime) -{ - unsigned int AssertDelay, WaitDelay; - struct iSeries_Device_Node *DeviceNode = - (struct iSeries_Device_Node *)PciDev->sysdata; - - if (DeviceNode == NULL) { - printk("PCI: Pci Reset Failed, Device Node not found for pci_dev %p\n", - PciDev); - return -1; - } - /* - * Set defaults, Assert is .5 second, Wait is 3 seconds. 
- */ - if (AssertTime == 0) - AssertDelay = 500; - else - AssertDelay = AssertTime * 100; - - if (DelayTime == 0) - WaitDelay = 3000; - else - WaitDelay = DelayTime * 100; - - /* - * Assert reset - */ - DeviceNode->ReturnCode = HvCallPci_setSlotReset(ISERIES_BUS(DeviceNode), - 0x00, DeviceNode->AgentId, 1); - if (DeviceNode->ReturnCode == 0) { - msleep(AssertDelay); /* Sleep for the time */ - DeviceNode->ReturnCode = - HvCallPci_setSlotReset(ISERIES_BUS(DeviceNode), - 0x00, DeviceNode->AgentId, 0); - - /* - * Wait for device to reset - */ - msleep(WaitDelay); - } - if (DeviceNode->ReturnCode == 0) - PCIFR("Slot 0x%04X.%02 Reset\n", ISERIES_BUS(DeviceNode), - DeviceNode->AgentId); - else { - printk("PCI: Slot 0x%04X.%02X Reset Failed, RCode: %04X\n", - ISERIES_BUS(DeviceNode), DeviceNode->AgentId, - DeviceNode->ReturnCode); - PCIFR("Slot 0x%04X.%02X Reset Failed, RCode: %04X\n", - ISERIES_BUS(DeviceNode), DeviceNode->AgentId, - DeviceNode->ReturnCode); - } - return DeviceNode->ReturnCode; -} -EXPORT_SYMBOL(iSeries_Device_ToggleReset); diff -ruNp linus-iSeries-headers.10/include/asm-ppc64/iSeries/iSeries_pci.h linus-iSeries-headers.11/include/asm-ppc64/iSeries/iSeries_pci.h --- linus-iSeries-headers.10/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-01 17:31:39.000000000 +1000 +++ linus-iSeries-headers.11/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-06 15:39:26.000000000 +1000 @@ -92,7 +92,6 @@ struct iSeries_Device_Node { int DevFn; /* Linux devfn */ int BarOffset; int Irq; /* Assigned IRQ */ - int ReturnCode; /* Return Code Holder */ int IoRetry; /* Current Retry Count */ int Flags; /* Possible flags(disable/bist)*/ u16 Vendor; /* Vendor ID */ @@ -107,7 +106,5 @@ struct iSeries_Device_Node { extern int iSeries_Device_Information(struct pci_dev*, char*, int); extern void iSeries_Get_Location_Code(struct iSeries_Device_Node*); -extern int iSeries_Device_ToggleReset(struct pci_dev* PciDev, - int AssertTime, int DelayTime); #endif /* _ISERIES_64_PCI_H */ From sfr at 
canb.auug.org.au Mon Jun 6 16:30:40 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 6 Jun 2005 16:30:40 +1000
Subject: [PATCH 3/6] ppc64 iSeries: iSeries_VpdInfo.c cleanups
In-Reply-To: <20050606161415.055dce39.sfr@canb.auug.org.au>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au>
Message-ID: <20050606163040.49ddb001.sfr@canb.auug.org.au>

Hi Andrew,

Clean up iSeries_VpdInfo.c:
	- white space and comment fixes
	- make a function static
	- the functions here are only called from iSeries_pci.c, so
	  CONFIG_PCI will be set (so remove check)
	- only build when CONFIG_PCI is set
	- remove unneeded includes and cast

Signed-off-by: Stephen Rothwell
--
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

diff -ruNp linus-iSeries-headers.12/arch/ppc64/kernel/Makefile linus-iSeries-headers.13/arch/ppc64/kernel/Makefile --- linus-iSeries-headers.12/arch/ppc64/kernel/Makefile 2005-06-06 14:53:29.000000000 +1000 +++ linus-iSeries-headers.13/arch/ppc64/kernel/Makefile 2005-06-05 03:40:47.000000000 +1000 @@ -16,13 +16,12 @@ obj-y += vdso32/ vdso64/ obj-$(CONFIG_PPC_OF) += of_device.o -pci-obj-$(CONFIG_PPC_ISERIES) += iSeries_pci.o +pci-obj-$(CONFIG_PPC_ISERIES) += iSeries_pci.o iSeries_VpdInfo.o pci-obj-$(CONFIG_PPC_MULTIPLATFORM) += pci_dn.o pci_direct_iommu.o obj-$(CONFIG_PCI) += pci.o pci_iommu.o iomap.o $(pci-obj-y) -obj-$(CONFIG_PPC_ISERIES) += iSeries_irq.o \ - iSeries_VpdInfo.o XmPciLpEvent.o \ +obj-$(CONFIG_PPC_ISERIES) += iSeries_irq.o XmPciLpEvent.o \ HvCall.o HvLpConfig.o LparData.o \ iSeries_setup.o ItLpQueue.o hvCall.o \ mf.o HvLpEvent.o iSeries_proc.o iSeries_htab.o \ diff -ruNp linus-iSeries-headers.12/arch/ppc64/kernel/iSeries_VpdInfo.c linus-iSeries-headers.13/arch/ppc64/kernel/iSeries_VpdInfo.c --- linus-iSeries-headers.12/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-06-02 16:08:05.000000000 +1000 +++ linus-iSeries-headers.13/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-06-06
15:28:01.000000000 +1000 @@ -1,31 +1,31 @@ -/************************************************************************/ -/* File iSeries_vpdInfo.c created by Allan Trautman on Fri Feb 2 2001. */ -/************************************************************************/ -/* This code gets the card location of the hardware */ -/* Copyright (C) 20yy */ -/* */ -/* This program is free software; you can redistribute it and/or modify */ -/* it under the terms of the GNU General Public License as published by */ -/* the Free Software Foundation; either version 2 of the License, or */ -/* (at your option) any later version. */ -/* */ -/* This program is distributed in the hope that it will be useful, */ -/* but WITHOUT ANY WARRANTY; without even the implied warranty of */ -/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */ -/* GNU General Public License for more details. */ -/* */ -/* You should have received a copy of the GNU General Public License */ -/* along with this program; if not, write to the: */ -/* Free Software Foundation, Inc., */ -/* 59 Temple Place, Suite 330, */ -/* Boston, MA 02111-1307 USA */ -/************************************************************************/ -/* Change Activity: */ -/* Created, Feb 2, 2001 */ -/* Ported to ppc64, August 20, 2001 */ -/* End Change Activity */ -/************************************************************************/ -#include +/* + * File iSeries_vpdInfo.c created by Allan Trautman on Fri Feb 2 2001. + * + * This code gets the card location of the hardware + * Copyright (C) 2001 + * Copyright (C) 2005 Stephen Rothwel, IBM Corp + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the: + * Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, + * Boston, MA 02111-1307 USA + * + * Change Activity: + * Created, Feb 2, 2001 + * Ported to ppc64, August 20, 2001 + * End Change Activity + */ #include #include #include @@ -34,14 +34,13 @@ #include #include -#include #include -#include "pci.h" /* * Size of Bus VPD data */ #define BUS_VPDSIZE 1024 + /* * Bus Vpd Tags */ @@ -49,6 +48,7 @@ #define VpdEndOfAreaTag 0x79 #define VpdIdStringTag 0x82 #define VpdVendorAreaTag 0x84 + /* * Mfg Area Tags */ @@ -78,7 +78,7 @@ struct SlotMapStruct { char CardLocation[3]; char Parms[8]; char Reserved[2]; -}; +}; typedef struct SlotMapStruct SlotMap; #define SLOT_ENTRY_SIZE 16 @@ -111,28 +111,26 @@ int iSeries_Device_Information(struct pc PciDev->vendor); len += sprintf(buffer + len, "Frame%3d, Card %4s ", DevNode->FrameId, DevNode->CardLocation); -#ifdef CONFIG_PCI if (pci_class_name(PciDev->class >> 8) == 0) len += sprintf(buffer + len, "0x%04X ", (int)(PciDev->class >> 8)); else len += sprintf(buffer + len, "%s", pci_class_name(PciDev->class >> 8)); -#endif return len; } /* * Parse the Slot Area */ -void iSeries_Parse_SlotArea(SlotMap *MapPtr, int MapLen, +static void iSeries_Parse_SlotArea(SlotMap *MapPtr, int MapLen, struct iSeries_Device_Node *DevNode) { int SlotMapLen = MapLen; SlotMap *SlotMapPtr = MapPtr; /* - * Parse Slot label until we find the one requrested + * Parse Slot label until we find the one requested */ while (SlotMapLen > 0) { if (SlotMapPtr->AgentId == DevNode->AgentId ) { @@ -182,7 +180,7 @@ static void iSeries_Parse_MfgArea(u8 *Ar if (SlotMapFmt == 0x1004) 
SlotMapPtr = (SlotMap *)((char *)MfgAreaPtr + MFG_ENTRY_SIZE + 1); - else + else SlotMapPtr = (SlotMap *)((char *)MfgAreaPtr + MFG_ENTRY_SIZE); iSeries_Parse_SlotArea(SlotMapPtr, MfgTagLen, DevNode); @@ -193,8 +191,8 @@ static void iSeries_Parse_MfgArea(u8 *Ar */ MfgAreaPtr = (MfgArea *)((char *)MfgAreaPtr + MfgTagLen + MFG_ENTRY_SIZE); - MfgAreaLen -= (MfgTagLen + MFG_ENTRY_SIZE); - } + MfgAreaLen -= (MfgTagLen + MFG_ENTRY_SIZE); + } } /* @@ -205,7 +203,7 @@ static int iSeries_Parse_PhbId(u8 *AreaP { u8 *PhbPtr = AreaPtr; int DataLen = AreaLength; - char PhbId = 0xFF; + char PhbId = 0xFF; while (DataLen > 0) { if ((*PhbPtr == 'B') && (*(PhbPtr + 1) == 'U') @@ -215,7 +213,7 @@ static int iSeries_Parse_PhbId(u8 *AreaP ++PhbPtr; PhbId = (*PhbPtr & 0x0F); break; - } + } ++PhbPtr; --DataLen; } @@ -232,7 +230,7 @@ static void iSeries_Parse_Vpd(u8 *VpdDat int DataLen = VpdDataLen - 3; while ((*TagPtr != VpdEndOfAreaTag) && (DataLen > 0)) { - int AreaLen = *(TagPtr + 1) + (*(TagPtr + 2) * 256); + int AreaLen = *(TagPtr + 1) + (*(TagPtr + 2) * 256); u8 *AreaData = TagPtr + 3; if (*TagPtr == VpdIdStringTag) @@ -243,12 +241,12 @@ static void iSeries_Parse_Vpd(u8 *VpdDat TagPtr = AreaData + AreaLen; DataLen -= AreaLen; } -} +} void iSeries_Get_Location_Code(struct iSeries_Device_Node *DevNode) { int BusVpdLen = 0; - u8 *BusVpdPtr = (u8 *)kmalloc(BUS_VPDSIZE, GFP_KERNEL); + u8 *BusVpdPtr = kmalloc(BUS_VPDSIZE, GFP_KERNEL); if (BusVpdPtr == NULL) { printk("PCI: Bus VPD Buffer allocation failure.\n"); From sfr at canb.auug.org.au Mon Jun 6 16:33:24 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Mon, 6 Jun 2005 16:33:24 +1000 Subject: [PATCH 4/6] ppc64 iSeries: iSeries_pci.h cleanups In-Reply-To: <20050606161415.055dce39.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> Message-ID: <20050606163324.2d3ea08f.sfr@canb.auug.org.au> Hi Andrew, Remove no longer used things from iSeries_pci.h. 
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.13/arch/ppc64/kernel/iSeries_pci.c linus-iSeries-headers.14/arch/ppc64/kernel/iSeries_pci.c --- linus-iSeries-headers.13/arch/ppc64/kernel/iSeries_pci.c 2005-06-02 16:48:16.000000000 +1000 +++ linus-iSeries-headers.14/arch/ppc64/kernel/iSeries_pci.c 2005-06-06 15:36:15.000000000 +1000 @@ -497,7 +497,6 @@ static int scan_bridge_slot(HvBusNumber ++DeviceCount; node = build_device_node(Bus, SubBus, EADsIdSel, Function); - node->Vendor = VendorId; node->Irq = Irq; node->LogicalSlot = BridgeInfo->logicalSlotNumber; diff -ruNp linus-iSeries-headers.13/include/asm-ppc64/iSeries/iSeries_pci.h linus-iSeries-headers.14/include/asm-ppc64/iSeries/iSeries_pci.h --- linus-iSeries-headers.13/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-06 15:41:59.000000000 +1000 +++ linus-iSeries-headers.14/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-06 15:39:39.000000000 +1000 @@ -44,8 +44,7 @@ struct iSeries_Device_Node; #define ISERIES_SUBBUS(DevPtr) DevPtr->DsaAddr.Dsa.subBusNumber #define ISERIES_DEVICE(DevPtr) DevPtr->DsaAddr.Dsa.deviceId #define ISERIES_DSA(DevPtr) DevPtr->DsaAddr.DsaAddr -#define ISERIES_DEVFUN(DevPtr) DevPtr->DevFn -#define ISERIES_DEVNODE(PciDev) ((struct iSeries_Device_Node*)PciDev->sysdata) +#define ISERIES_DEVNODE(PciDev) ((struct iSeries_Device_Node *)PciDev->sysdata) #define EADsMaxAgents 7 @@ -63,17 +62,6 @@ struct iSeries_Device_Node; #define ISERIES_GET_FUNCTION_FROM_SUBBUS(subbus) ((subbus >> 2) & 0x7) /* - * N.B. the ISERIES_DECODE_* macros are not used anywhere, and I think - * the 0x71 (at least) must be wrong - 0x78 maybe? -- paulus. 
- */ -#define ISERIES_DECODE_DEVFN(linuxdevfn) \ - (((linuxdevfn & 0x71) << 1) | (linuxdevfn & 0x07)) -#define ISERIES_DECODE_DEVICE(linuxdevfn) \ - (((linuxdevfn & 0x38) >> 3) | (((linuxdevfn & 0x40) >> 2) + 0x10)) -#define ISERIES_DECODE_FUNCTION(linuxdevfn) \ - (linuxdevfn & 0x07) - -/* * Converts Virtual Address to Real Address for Hypervisor calls */ #define ISERIES_HV_ADDR(virtaddr) \ @@ -90,15 +78,12 @@ struct iSeries_Device_Node { /* deviceId, barNumber */ HvAgentId AgentId; /* Hypervisor DevFn */ int DevFn; /* Linux devfn */ - int BarOffset; int Irq; /* Assigned IRQ */ int IoRetry; /* Current Retry Count */ int Flags; /* Possible flags(disable/bist)*/ - u16 Vendor; /* Vendor ID */ u8 LogicalSlot; /* Hv Slot Index for Tces */ struct iommu_table *iommu_table;/* Device TCE Table */ u8 PhbId; /* Phb Card is on. */ - u16 Board; /* Board Number */ u8 FrameId; /* iSeries spcn Frame Id */ char CardLocation[4];/* Char format of planar vpd */ char Location[20]; /* Frame 1, Card C10 */

From sfr at canb.auug.org.au Mon Jun 6 16:37:32 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 6 Jun 2005 16:37:32 +1000
Subject: [PATCH 5/6] ppc64 iSeries: remove IoRetry from iSeries_Device_Node
In-Reply-To: <20050606161415.055dce39.sfr@canb.auug.org.au>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au>
Message-ID: <20050606163732.60f29021.sfr@canb.auug.org.au>

Hi Andrew,

The IoRetry member of iSeries_Device_Node is really only used locally,
so remove it and replace it with a local variable.
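
As a rough illustration of the change described above (this is not the
kernel code itself), the retry count can live on the caller's stack and
be handed to the checking routine by pointer. In the sketch below,
RETRY_MAX, fake_load() and read_with_retry() are made-up names for
illustration only, and the give-up path returns an error where the real
CheckReturnCode() panics.

```c
#include <assert.h>
#include <stdio.h>

#define RETRY_MAX 7	/* hypothetical limit, stands in for Pci_Retry_Max */

/* Hypothetical stand-in for a hypervisor load: fails while *failures_left > 0. */
static int fake_load(int *failures_left)
{
	if (*failures_left > 0) {
		(*failures_left)--;
		return 0x0102;	/* made-up non-zero error code */
	}
	return 0;
}

/*
 * Returns 0 on success, -1 if the caller should retry, and -2 once the
 * retry limit is exceeded.  The count belongs to the caller, so no
 * per-device state is needed.
 */
static int check_return_code(const char *hdr, int *retry, int rc)
{
	assert(retry != NULL);
	if (rc == 0)
		return 0;
	(*retry)++;
	fprintf(stderr, "PCI: %s: I/O Error(%2d): 0x%04X\n",
		hdr, *retry, (unsigned int)rc);
	return (*retry > RETRY_MAX) ? -2 : -1;
}

/* One read with retries; stores the number of retries in *retries_out. */
static int read_with_retry(int failures, int *retries_out)
{
	int retry = 0;	/* local, takes the place of the per-node IoRetry */
	int status;

	do {
		status = check_return_code("RDB", &retry, fake_load(&failures));
	} while (status == -1);

	*retries_out = retry;
	return (status == 0) ? 0 : -1;
}
```

Each call starts its count from zero, so nothing has to be reset on
success.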
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.14/arch/ppc64/kernel/iSeries_pci.c linus-iSeries-headers.15/arch/ppc64/kernel/iSeries_pci.c --- linus-iSeries-headers.14/arch/ppc64/kernel/iSeries_pci.c 2005-06-06 15:36:15.000000000 +1000 +++ linus-iSeries-headers.15/arch/ppc64/kernel/iSeries_pci.c 2005-06-06 15:44:57.000000000 +1000 @@ -225,7 +225,6 @@ static struct iSeries_Device_Node *build node->DsaAddr.Dsa.deviceId = 0x10; node->AgentId = AgentId; node->DevFn = PCI_DEVFN(ISERIES_ENCODE_DEVICE(AgentId), Function); - node->IoRetry = 0; iSeries_Get_Location_Code(node); return node; } @@ -658,38 +657,34 @@ static struct pci_ops iSeries_pci_ops = * Check Return Code * -> On Failure, print and log information. * Increment Retry Count, if exceeds max, panic partition. - * -> If in retry, print and log success * * PCI: Device 23.90 ReadL I/O Error( 0): 0x1234 * PCI: Device 23.90 ReadL Retry( 1) * PCI: Device 23.90 ReadL Retry Successful(1) */ static int CheckReturnCode(char *TextHdr, struct iSeries_Device_Node *DevNode, - u64 ret) + int *retry, u64 ret) { if (ret != 0) { ++Pci_Error_Count; - ++DevNode->IoRetry; + (*retry)++; printk("PCI: %s: Device 0x%04X:%02X I/O Error(%2d): 0x%04X\n", TextHdr, DevNode->DsaAddr.Dsa.busNumber, DevNode->DevFn, - DevNode->IoRetry, (int)ret); + *retry, (int)ret); /* * Bump the retry and check for retry count exceeded. * If, Exceeded, panic the system. 
*/ - if ((DevNode->IoRetry > Pci_Retry_Max) && + if (((*retry) > Pci_Retry_Max) && (Pci_Error_Flag > 0)) { mf_display_src(0xB6000103); - panic_timeout = 0; + panic_timeout = 0; panic("PCI: Hardware I/O Error, SRC B6000103, " "Automatic Reboot Disabled.\n"); } return -1; /* Retry Try */ } - /* If retry was in progress, log success and rest retry count */ - if (DevNode->IoRetry > 0) - DevNode->IoRetry = 0; - return 0; + return 0; } /* @@ -735,6 +730,7 @@ u8 iSeries_Read_Byte(const volatile void { u64 BarOffset; u64 dsa; + int retry = 0; struct HvCallPci_LoadReturn ret; struct iSeries_Device_Node *DevNode = xlate_iomm_address(IoAddress, &dsa, &BarOffset); @@ -754,7 +750,7 @@ u8 iSeries_Read_Byte(const volatile void do { ++Pci_Io_Read_Count; HvCall3Ret16(HvCallPciBarLoad8, &ret, dsa, BarOffset, 0); - } while (CheckReturnCode("RDB", DevNode, ret.rc) != 0); + } while (CheckReturnCode("RDB", DevNode, &retry, ret.rc) != 0); return (u8)ret.value; } @@ -764,6 +760,7 @@ u16 iSeries_Read_Word(const volatile voi { u64 BarOffset; u64 dsa; + int retry = 0; struct HvCallPci_LoadReturn ret; struct iSeries_Device_Node *DevNode = xlate_iomm_address(IoAddress, &dsa, &BarOffset); @@ -784,7 +781,7 @@ u16 iSeries_Read_Word(const volatile voi ++Pci_Io_Read_Count; HvCall3Ret16(HvCallPciBarLoad16, &ret, dsa, BarOffset, 0); - } while (CheckReturnCode("RDW", DevNode, ret.rc) != 0); + } while (CheckReturnCode("RDW", DevNode, &retry, ret.rc) != 0); return swab16((u16)ret.value); } @@ -794,6 +791,7 @@ u32 iSeries_Read_Long(const volatile voi { u64 BarOffset; u64 dsa; + int retry = 0; struct HvCallPci_LoadReturn ret; struct iSeries_Device_Node *DevNode = xlate_iomm_address(IoAddress, &dsa, &BarOffset); @@ -814,7 +812,7 @@ u32 iSeries_Read_Long(const volatile voi ++Pci_Io_Read_Count; HvCall3Ret16(HvCallPciBarLoad32, &ret, dsa, BarOffset, 0); - } while (CheckReturnCode("RDL", DevNode, ret.rc) != 0); + } while (CheckReturnCode("RDL", DevNode, &retry, ret.rc) != 0); return swab32((u32)ret.value); } 
@@ -831,6 +829,7 @@ void iSeries_Write_Byte(u8 data, volatil { u64 BarOffset; u64 dsa; + int retry = 0; u64 rc; struct iSeries_Device_Node *DevNode = xlate_iomm_address(IoAddress, &dsa, &BarOffset); @@ -850,7 +849,7 @@ void iSeries_Write_Byte(u8 data, volatil do { ++Pci_Io_Write_Count; rc = HvCall4(HvCallPciBarStore8, dsa, BarOffset, data, 0); - } while (CheckReturnCode("WWB", DevNode, rc) != 0); + } while (CheckReturnCode("WWB", DevNode, &retry, rc) != 0); } EXPORT_SYMBOL(iSeries_Write_Byte); @@ -858,6 +857,7 @@ void iSeries_Write_Word(u16 data, volati { u64 BarOffset; u64 dsa; + int retry = 0; u64 rc; struct iSeries_Device_Node *DevNode = xlate_iomm_address(IoAddress, &dsa, &BarOffset); @@ -877,7 +877,7 @@ void iSeries_Write_Word(u16 data, volati do { ++Pci_Io_Write_Count; rc = HvCall4(HvCallPciBarStore16, dsa, BarOffset, swab16(data), 0); - } while (CheckReturnCode("WWW", DevNode, rc) != 0); + } while (CheckReturnCode("WWW", DevNode, &retry, rc) != 0); } EXPORT_SYMBOL(iSeries_Write_Word); @@ -885,6 +885,7 @@ void iSeries_Write_Long(u32 data, volati { u64 BarOffset; u64 dsa; + int retry = 0; u64 rc; struct iSeries_Device_Node *DevNode = xlate_iomm_address(IoAddress, &dsa, &BarOffset); @@ -904,6 +905,6 @@ void iSeries_Write_Long(u32 data, volati do { ++Pci_Io_Write_Count; rc = HvCall4(HvCallPciBarStore32, dsa, BarOffset, swab32(data), 0); - } while (CheckReturnCode("WWL", DevNode, rc) != 0); + } while (CheckReturnCode("WWL", DevNode, &retry, rc) != 0); } EXPORT_SYMBOL(iSeries_Write_Long); diff -ruNp linus-iSeries-headers.14/include/asm-ppc64/iSeries/iSeries_pci.h linus-iSeries-headers.15/include/asm-ppc64/iSeries/iSeries_pci.h --- linus-iSeries-headers.14/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-06 15:39:39.000000000 +1000 +++ linus-iSeries-headers.15/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-06 15:44:06.000000000 +1000 @@ -79,7 +79,6 @@ struct iSeries_Device_Node { HvAgentId AgentId; /* Hypervisor DevFn */ int DevFn; /* Linux devfn */ int Irq; /* 
Assigned IRQ */ - int IoRetry; /* Current Retry Count */ int Flags; /* Possible flags(disable/bist)*/ u8 LogicalSlot; /* Hv Slot Index for Tces */ struct iommu_table *iommu_table;/* Device TCE Table */ From sfr at canb.auug.org.au Mon Jun 6 16:45:57 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Mon, 6 Jun 2005 16:45:57 +1000 Subject: [PATCH 6/6] ppc64 iSeries: remove some more members of iSeries_Device_Node In-Reply-To: <20050606161415.055dce39.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> Message-ID: <20050606164557.4b778222.sfr@canb.auug.org.au> Hi Andrew, The AgentId, PhbId, FrameId, CardLocation and Location members of iSeries_Device_Node are stored early in the boot process just so that a message about the device can be printed later in the boot process. Remove them and construct the message by doing the VPD parsing at the time the message is printed. Also remove a few unused defines in iSeries_VpdInfo.c. 
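For readers following the description above, the VPD walk that is now done at print time can be sketched roughly as follows. This is a minimal standalone sketch, not the kernel code: the tag values 0x79, 0x82 and 0x84 are the ones defined in iSeries_VpdInfo.c, and the layout (one tag byte, then a two-byte little-endian length, then the area data) is inferred from iSeries_Parse_Vpd() in the patch below. The helper name vpd_find_area() and its explicit bounds checks are hypothetical and added only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag values as defined in iSeries_VpdInfo.c */
#define VPD_END_OF_AREA_TAG 0x79
#define VPD_ID_STRING_TAG   0x82
#define VPD_VENDOR_AREA_TAG 0x84

/*
 * Hypothetical helper (not part of the patch): walk a bus VPD buffer and
 * return a pointer to the data of the first area carrying 'tag', storing
 * its length in *area_len. Returns NULL if the tag is not present or if
 * an area claims more bytes than the buffer holds.
 */
static const uint8_t *vpd_find_area(const uint8_t *vpd, size_t vpd_len,
				    uint8_t tag, size_t *area_len)
{
	size_t pos = 0;

	/* Each area is: 1 tag byte, 2 length bytes (low byte first), data */
	while (vpd_len - pos >= 3 && vpd[pos] != VPD_END_OF_AREA_TAG) {
		size_t len = (size_t)vpd[pos + 1] | ((size_t)vpd[pos + 2] << 8);

		if (len > vpd_len - pos - 3)
			return NULL;	/* truncated or corrupt area */
		if (vpd[pos] == tag) {
			*area_len = len;
			return vpd + pos + 3;
		}
		pos += 3 + len;		/* skip header and data */
	}
	return NULL;
}
```

A caller would look up VPD_ID_STRING_TAG first to get the PHB id and then VPD_VENDOR_AREA_TAG for the frame id and slot map, which is the order the patch handles them in.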
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-iSeries-headers.15/arch/ppc64/kernel/iSeries_VpdInfo.c linus-iSeries-headers.16/arch/ppc64/kernel/iSeries_VpdInfo.c --- linus-iSeries-headers.15/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-06-06 15:28:01.000000000 +1000 +++ linus-iSeries-headers.16/arch/ppc64/kernel/iSeries_VpdInfo.c 2005-06-06 12:59:23.000000000 +1000 @@ -44,7 +44,6 @@ /* * Bus Vpd Tags */ -#define VpdEndOfDataTag 0x78 #define VpdEndOfAreaTag 0x79 #define VpdIdStringTag 0x82 #define VpdVendorAreaTag 0x84 @@ -52,11 +51,8 @@ /* * Mfg Area Tags */ -#define VpdFruFlag 0x4647 // "FG" #define VpdFruFrameId 0x4649 // "FI" #define VpdSlotMapFormat 0x4D46 // "MF" -#define VpdAsmPartNumber 0x504E // "PN" -#define VpdFruSerial 0x534E // "SN" #define VpdSlotMap 0x534D // "SM" /* @@ -83,48 +79,10 @@ typedef struct SlotMapStruct SlotMap; #define SLOT_ENTRY_SIZE 16 /* - * Formats the device information. - * - Pass in pci_dev* pointer to the device. - * - Pass in buffer to place the data. Danger here is the buffer must - * be as big as the client says it is. Should be at least 128 bytes. - * Return will the length of the string data put in the buffer. 
- * Format: - * PCI: Bus 0, Device 26, Vendor 0x12AE Frame 1, Card C10 Ethernet - * controller - */ -int iSeries_Device_Information(struct pci_dev *PciDev, char *buffer, - int BufferSize) -{ - struct iSeries_Device_Node *DevNode = - (struct iSeries_Device_Node *)PciDev->sysdata; - int len; - - if (DevNode == NULL) - return sprintf(buffer, - "PCI: iSeries_Device_Information DevNode is NULL"); - - if (BufferSize < 128) - return 0; - - len = sprintf(buffer, "PCI: Bus%3d, Device%3d, Vendor %04X ", - ISERIES_BUS(DevNode), PCI_SLOT(PciDev->devfn), - PciDev->vendor); - len += sprintf(buffer + len, "Frame%3d, Card %4s ", - DevNode->FrameId, DevNode->CardLocation); - if (pci_class_name(PciDev->class >> 8) == 0) - len += sprintf(buffer + len, "0x%04X ", - (int)(PciDev->class >> 8)); - else - len += sprintf(buffer + len, "%s", - pci_class_name(PciDev->class >> 8)); - return len; -} - -/* * Parse the Slot Area */ -static void iSeries_Parse_SlotArea(SlotMap *MapPtr, int MapLen, - struct iSeries_Device_Node *DevNode) +static void __init iSeries_Parse_SlotArea(SlotMap *MapPtr, int MapLen, + HvAgentId agent, u8 *PhbId, char card[4]) { int SlotMapLen = MapLen; SlotMap *SlotMapPtr = MapPtr; @@ -133,17 +91,16 @@ static void iSeries_Parse_SlotArea(SlotM * Parse Slot label until we find the one requested */ while (SlotMapLen > 0) { - if (SlotMapPtr->AgentId == DevNode->AgentId ) { + if (SlotMapPtr->AgentId == agent) { /* * If Phb wasn't found, grab the entry first one found. */ - if (DevNode->PhbId == 0xff) - DevNode->PhbId = SlotMapPtr->PhbId; + if (*PhbId == 0xff) + *PhbId = SlotMapPtr->PhbId; /* Found it, extract the data. 
*/ - if (SlotMapPtr->PhbId == DevNode->PhbId ) { - memcpy(&DevNode->CardLocation, - &SlotMapPtr->CardLocation, 3); - DevNode->CardLocation[3] = 0; + if (SlotMapPtr->PhbId == *PhbId) { + memcpy(card, &SlotMapPtr->CardLocation, 3); + card[3] = 0; break; } } @@ -156,8 +113,9 @@ static void iSeries_Parse_SlotArea(SlotM /* * Parse the Mfg Area */ -static void iSeries_Parse_MfgArea(u8 *AreaData, int AreaLen, - struct iSeries_Device_Node *DevNode) +static void __init iSeries_Parse_MfgArea(u8 *AreaData, int AreaLen, + HvAgentId agent, u8 *PhbId, + u8 *frame, char card[4]) { MfgArea *MfgAreaPtr = (MfgArea *)AreaData; int MfgAreaLen = AreaLen; @@ -168,7 +126,7 @@ static void iSeries_Parse_MfgArea(u8 *Ar int MfgTagLen = MfgAreaPtr->TagLength; /* Frame ID (FI 4649020310 ) */ if (MfgAreaPtr->Tag == VpdFruFrameId) /* FI */ - DevNode->FrameId = MfgAreaPtr->AreaData1; + *frame = MfgAreaPtr->AreaData1; /* Slot Map Format (MF 4D46020004 ) */ else if (MfgAreaPtr->Tag == VpdSlotMapFormat) /* MF */ SlotMapFmt = (MfgAreaPtr->AreaData1 * 256) @@ -183,7 +141,8 @@ static void iSeries_Parse_MfgArea(u8 *Ar else SlotMapPtr = (SlotMap *)((char *)MfgAreaPtr + MFG_ENTRY_SIZE); - iSeries_Parse_SlotArea(SlotMapPtr, MfgTagLen, DevNode); + iSeries_Parse_SlotArea(SlotMapPtr, MfgTagLen, + agent, PhbId, card); } /* * Point to the next Mfg Area @@ -199,7 +158,7 @@ static void iSeries_Parse_MfgArea(u8 *Ar * Look for "BUS".. Data is not Null terminated. * PHBID of 0xFF indicates PHB was not found in VPD Data. 
*/ -static int iSeries_Parse_PhbId(u8 *AreaPtr, int AreaLength) +static int __init iSeries_Parse_PhbId(u8 *AreaPtr, int AreaLength) { u8 *PhbPtr = AreaPtr; int DataLen = AreaLength; @@ -223,27 +182,30 @@ static int iSeries_Parse_PhbId(u8 *AreaP /* * Parse out the VPD Areas */ -static void iSeries_Parse_Vpd(u8 *VpdData, int VpdDataLen, - struct iSeries_Device_Node *DevNode) +static void __init iSeries_Parse_Vpd(u8 *VpdData, int VpdDataLen, + HvAgentId agent, u8 *frame, char card[4]) { u8 *TagPtr = VpdData; int DataLen = VpdDataLen - 3; + u8 PhbId; while ((*TagPtr != VpdEndOfAreaTag) && (DataLen > 0)) { int AreaLen = *(TagPtr + 1) + (*(TagPtr + 2) * 256); u8 *AreaData = TagPtr + 3; if (*TagPtr == VpdIdStringTag) - DevNode->PhbId = iSeries_Parse_PhbId(AreaData, AreaLen); + PhbId = iSeries_Parse_PhbId(AreaData, AreaLen); else if (*TagPtr == VpdVendorAreaTag) - iSeries_Parse_MfgArea(AreaData, AreaLen, DevNode); + iSeries_Parse_MfgArea(AreaData, AreaLen, + agent, &PhbId, frame, card); /* Point to next Area. 
*/ TagPtr = AreaData + AreaLen; DataLen -= AreaLen; } } -void iSeries_Get_Location_Code(struct iSeries_Device_Node *DevNode) +static void __init iSeries_Get_Location_Code(u16 bus, HvAgentId agent, + u8 *frame, char card[4]) { int BusVpdLen = 0; u8 *BusVpdPtr = kmalloc(BUS_VPDSIZE, GFP_KERNEL); @@ -252,23 +214,58 @@ void iSeries_Get_Location_Code(struct iS printk("PCI: Bus VPD Buffer allocation failure.\n"); return; } - BusVpdLen = HvCallPci_getBusVpd(ISERIES_BUS(DevNode), - ISERIES_HV_ADDR(BusVpdPtr), + BusVpdLen = HvCallPci_getBusVpd(bus, ISERIES_HV_ADDR(BusVpdPtr), BUS_VPDSIZE); if (BusVpdLen == 0) { - kfree(BusVpdPtr); printk("PCI: Bus VPD Buffer zero length.\n"); - return; + goto out_free; } /* printk("PCI: BusVpdPtr: %p, %d\n",BusVpdPtr, BusVpdLen); */ /* Make sure this is what I think it is */ if (*BusVpdPtr != VpdIdStringTag) { /* 0x82 */ printk("PCI: Bus VPD Buffer missing starting tag.\n"); - kfree(BusVpdPtr); - return; + goto out_free; } - iSeries_Parse_Vpd(BusVpdPtr,BusVpdLen, DevNode); - sprintf(DevNode->Location, "Frame%3d, Card %-4s", DevNode->FrameId, - DevNode->CardLocation); + iSeries_Parse_Vpd(BusVpdPtr, BusVpdLen, agent, frame, card); +out_free: kfree(BusVpdPtr); } + +/* + * Prints the device information. + * - Pass in pci_dev* pointer to the device. + * - Pass in the device count + * + * Format: + * PCI: Bus 0, Device 26, Vendor 0x12AE Frame 1, Card C10 Ethernet + * controller + */ +void __init iSeries_Device_Information(struct pci_dev *PciDev, int count) +{ + struct iSeries_Device_Node *DevNode = PciDev->sysdata; + u16 bus; + u8 frame; + char card[4]; + HvSubBusNumber subbus; + HvAgentId agent; + + if (DevNode == NULL) { + printk("%d. 
PCI: iSeries_Device_Information DevNode is NULL\n", + count); + return; + } + + bus = ISERIES_BUS(DevNode); + subbus = ISERIES_SUBBUS(DevNode); + agent = ISERIES_PCI_AGENTID(ISERIES_GET_DEVICE_FROM_SUBBUS(subbus), + ISERIES_GET_FUNCTION_FROM_SUBBUS(subbus)); + iSeries_Get_Location_Code(bus, agent, &frame, card); + + printk("%d. PCI: Bus%3d, Device%3d, Vendor %04X Frame%3d, Card %4s ", + count, bus, PCI_SLOT(PciDev->devfn), PciDev->vendor, + frame, card); + if (pci_class_name(PciDev->class >> 8) == 0) + printk("0x%04X\n", (int)(PciDev->class >> 8)); + else + printk("%s\n", pci_class_name(PciDev->class >> 8)); +} diff -ruNp linus-iSeries-headers.15/arch/ppc64/kernel/iSeries_pci.c linus-iSeries-headers.16/arch/ppc64/kernel/iSeries_pci.c --- linus-iSeries-headers.15/arch/ppc64/kernel/iSeries_pci.c 2005-06-06 15:44:57.000000000 +1000 +++ linus-iSeries-headers.16/arch/ppc64/kernel/iSeries_pci.c 2005-06-06 00:52:14.000000000 +1000 @@ -223,9 +223,7 @@ static struct iSeries_Device_Node *build node->DsaAddr.Dsa.busNumber = Bus; node->DsaAddr.Dsa.subBusNumber = SubBus; node->DsaAddr.Dsa.deviceId = 0x10; - node->AgentId = AgentId; node->DevFn = PCI_DEVFN(ISERIES_ENCODE_DEVICE(AgentId), Function); - iSeries_Get_Location_Code(node); return node; } @@ -299,7 +297,6 @@ void __init iSeries_pci_final_fixup(void { struct pci_dev *pdev = NULL; struct iSeries_Device_Node *node; - char Buffer[256]; int DeviceCount = 0; PPCDBG(PPCDBG_BUSWALK, "iSeries_pcibios_fixup Entry.\n"); @@ -321,9 +318,7 @@ void __init iSeries_pci_final_fixup(void "pdev 0x%p <==> DevNode 0x%p\n", pdev, node); allocate_device_bars(pdev); - iSeries_Device_Information(pdev, Buffer, - sizeof(Buffer)); - printk("%d. 
%s\n", DeviceCount, Buffer); + iSeries_Device_Information(pdev, DeviceCount); iommu_devnode_init_iSeries(node); } else printk("PCI: Device Tree not found for 0x%016lX\n", diff -ruNp linus-iSeries-headers.15/include/asm-ppc64/iSeries/iSeries_pci.h linus-iSeries-headers.16/include/asm-ppc64/iSeries/iSeries_pci.h --- linus-iSeries-headers.15/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-06 15:44:06.000000000 +1000 +++ linus-iSeries-headers.16/include/asm-ppc64/iSeries/iSeries_pci.h 2005-06-06 01:28:01.000000000 +1000 @@ -76,19 +76,13 @@ struct iSeries_Device_Node { union HvDsaMap DsaAddr; /* Direct Select Address */ /* busNumber, subBusNumber, */ /* deviceId, barNumber */ - HvAgentId AgentId; /* Hypervisor DevFn */ int DevFn; /* Linux devfn */ int Irq; /* Assigned IRQ */ int Flags; /* Possible flags(disable/bist)*/ u8 LogicalSlot; /* Hv Slot Index for Tces */ struct iommu_table *iommu_table;/* Device TCE Table */ - u8 PhbId; /* Phb Card is on. */ - u8 FrameId; /* iSeries spcn Frame Id */ - char CardLocation[4];/* Char format of planar vpd */ - char Location[20]; /* Frame 1, Card C10 */ }; -extern int iSeries_Device_Information(struct pci_dev*, char*, int); -extern void iSeries_Get_Location_Code(struct iSeries_Device_Node*); +extern void iSeries_Device_Information(struct pci_dev*, int); #endif /* _ISERIES_64_PCI_H */
From david at gibson.dropbear.id.au Tue Jun 7 11:19:40 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Tue, 7 Jun 2005 11:19:40 +1000 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart In-Reply-To: <200506032334.j53NYfUm010224@localhost.localdomain> References: <200506032334.j53NYfUm010224@localhost.localdomain> Message-ID: <20050607011940.GD5142@localhost.localdomain> On Fri, Jun 03, 2005 at 04:34:41PM -0700, Frank Rowand wrote: > > The Maple board currently returns from a request to halt, power off, or > restart. This patch causes all processors to instead spin in these cases > so that the system appears to have halted. > > This patch applies against linux-2.6.12-rc5. This one is probably a better way to go, though. This came from Matt Dharm some time ago, but was never merged due to a miscommunication. It implements the actual board-level reboot and powerdown by communicating with the service processor. Andrew, please apply.
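As a rough illustration of the mechanism described above (an I/O port derived from the NVRAM base plus an offset, and a command byte, both read from service-processor properties), here is a minimal userspace sketch. It is not the kernel code and makes several assumptions: struct toy_node, toy_get_property(), toy_read_u32() and build_sp_command() are hypothetical stand-ins invented for this example, and only the property names "restart-addr" and "restart-value" are taken from the patch. The sketch checks for a missing or wrongly sized property before using it; the patch below dereferences the get_property() result directly, and whether firmware always supplies these properties is not established in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for a device-tree node; NOT the kernel's struct device_node */
struct toy_prop {
	const char *name;
	const void *value;
	int length;
};

struct toy_node {
	const struct toy_prop *props;
	size_t nprops;
};

/* Toy lookup shaped like the get_property(node, name, &len) calls in the patch */
static const void *toy_get_property(const struct toy_node *np,
				    const char *name, int *lenp)
{
	for (size_t i = 0; i < np->nprops; i++) {
		if (strcmp(np->props[i].name, name) == 0) {
			if (lenp)
				*lenp = np->props[i].length;
			return np->props[i].value;
		}
	}
	return NULL;
}

/* Read a 32-bit property; 0 on success, -1 if it is missing or mis-sized */
static int toy_read_u32(const struct toy_node *np, const char *name,
			uint32_t *out)
{
	int len = 0;
	const void *p = toy_get_property(np, name, &len);

	if (!p || len != (int)sizeof(uint32_t))
		return -1;
	memcpy(out, p, sizeof(*out));
	return 0;
}

/* What would be handed to outb_p(): the command byte and the target port */
struct sp_command {
	uint32_t port;
	uint8_t value;
};

/* Build the command from the two properties, failing cleanly if either is absent */
static int build_sp_command(const struct toy_node *sp, uint32_t nvram_base,
			    const char *addr_prop, const char *value_prop,
			    struct sp_command *cmd)
{
	uint32_t offset, value;

	if (toy_read_u32(sp, addr_prop, &offset) != 0 ||
	    toy_read_u32(sp, value_prop, &value) != 0)
		return -1;
	cmd->port = nvram_base + offset;
	cmd->value = (uint8_t)value;
	return 0;
}
```

On failure a real handler would still need to avoid returning to its caller, for example by falling through to a spin loop.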
Signed-off-by: David Gibson ===== maple_setup.c 1.5 vs edited ===== --- 1.5/arch/ppc64/kernel/maple_setup.c 2005-01-07 21:43:52 -08:00 +++ edited/maple_setup.c 2005-01-28 18:58:39 -08:00 @@ -78,13 +78,72 @@ extern void generic_find_legacy_serial_ports(u64 *physport, unsigned int *default_speed); - static void maple_restart(char *cmd) { + unsigned int maple_nvram_base; + unsigned int maple_nvram_offset; + unsigned int maple_nvram_command; + struct device_node *rtcs; + + /* find NVRAM device */ + rtcs = find_compatible_devices("nvram", "AMD8111"); + if (rtcs && rtcs->addrs) { + maple_nvram_base = rtcs->addrs[0].address; + } else { + printk(KERN_INFO "Maple: Unable to find NVRAM\n"); + printk(KERN_INFO "Maple: Manual Restart Required\n"); + return; + } + + /* find service processor device */ + rtcs = find_devices("service-processor"); + if (!rtcs) { + printk(KERN_INFO "Maple: Unable to find Service Processor\n"); + printk(KERN_INFO "Maple: Manual Restart Required\n"); + return; + } + maple_nvram_offset = *(unsigned int*) get_property(rtcs, + "restart-addr", NULL); + maple_nvram_command = *(unsigned int*) get_property(rtcs, + "restart-value", NULL); + + /* send command */ + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); + for (;;) ; } static void maple_power_off(void) { + unsigned int maple_nvram_base; + unsigned int maple_nvram_offset; + unsigned int maple_nvram_command; + struct device_node *rtcs; + + /* find NVRAM device */ + rtcs = find_compatible_devices("nvram", "AMD8111"); + if (rtcs && rtcs->addrs) { + maple_nvram_base = rtcs->addrs[0].address; + } else { + printk(KERN_INFO "Maple: Unable to find NVRAM\n"); + printk(KERN_INFO "Maple: Manual Power-Down Required\n"); + return; + } + + /* find service processor device */ + rtcs = find_devices("service-processor"); + if (!rtcs) { + printk(KERN_INFO "Maple: Unable to find Service Processor\n"); + printk(KERN_INFO "Maple: Manual Power-Down Required\n"); + return; + } + maple_nvram_offset = 
*(unsigned int*) get_property(rtcs, + "power-off-addr", NULL); + maple_nvram_command = *(unsigned int*) get_property(rtcs, + "power-off-value", NULL); + + /* send command */ + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); + for (;;) ; } static void maple_halt(void) -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From frowand at mvista.com Tue Jun 7 11:52:55 2005 From: frowand at mvista.com (Frank Rowand) Date: Mon, 06 Jun 2005 18:52:55 -0700 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart In-Reply-To: <20050607011940.GD5142@localhost.localdomain> References: <200506032334.j53NYfUm010224@localhost.localdomain> <20050607011940.GD5142@localhost.localdomain> Message-ID: <42A4FDF7.4090803@mvista.com> David Gibson wrote: > On Fri, Jun 03, 2005 at 04:34:41PM -0700, Frank Rowand wrote: > >>The Maple board currently returns from a request to halt, power off, or >>restart. This patch causes all processors to instead spin in these cases >>so that the system appears to have halted. >> >>This patch applies against linux-2.6.12-rc5. > > > This one is probably a better way to go, though. This came from Matt > Dharm some time ago, but was never merged due to a miscommunication. > It implements the actual board-level reboot and powerdown by > communicating with the service processor. David, I tried this patch on linux-2.6.12-rc5. The maple_restart() worked fine. The maple_halt() did not power off the board. The maintenance processor PIBS reports: PIBS $ Message received: Power off. Powering down 970s and the CPC925 ... Power down complete. but the board remains powered on. Some other comments: maple_halt() should do the same thing as maple_power_off(). (It could even just call maple_power_off().) maple_restart(), maple_power_off(), and maple_halt() should not ever return. 
The returns could be replaced with the code from my patch that started this thread. I don't know if smp_send_stop() is necessary (there seems to be a lack of consistency among the different architectures) but it seems like a good idea. Do you want me to rework the patch with these suggestions? (But I'm not sure how to resolve the failure to power off.) > > Andrew, please apply. > > Signed-off-by: David Gibson > > ===== maple_setup.c 1.5 vs edited ===== > --- 1.5/arch/ppc64/kernel/maple_setup.c 2005-01-07 21:43:52 -08:00 > +++ edited/maple_setup.c 2005-01-28 18:58:39 -08:00 > @@ -78,13 +78,72 @@ > extern void generic_find_legacy_serial_ports(u64 *physport, > unsigned int *default_speed); > > - > static void maple_restart(char *cmd) > { > + unsigned int maple_nvram_base; > + unsigned int maple_nvram_offset; > + unsigned int maple_nvram_command; > + struct device_node *rtcs; > + > + /* find NVRAM device */ > + rtcs = find_compatible_devices("nvram", "AMD8111"); > + if (rtcs && rtcs->addrs) { > + maple_nvram_base = rtcs->addrs[0].address; > + } else { > + printk(KERN_INFO "Maple: Unable to find NVRAM\n"); > + printk(KERN_INFO "Maple: Manual Restart Required\n"); > + return; > + } > + > + /* find service processor device */ > + rtcs = find_devices("service-processor"); > + if (!rtcs) { > + printk(KERN_INFO "Maple: Unable to find Service Processor\n"); > + printk(KERN_INFO "Maple: Manual Restart Required\n"); > + return; > + } > + maple_nvram_offset = *(unsigned int*) get_property(rtcs, > + "restart-addr", NULL); > + maple_nvram_command = *(unsigned int*) get_property(rtcs, > + "restart-value", NULL); > + > + /* send command */ > + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); > + for (;;) ; > } > > static void maple_power_off(void) > { > + unsigned int maple_nvram_base; > + unsigned int maple_nvram_offset; > + unsigned int maple_nvram_command; > + struct device_node *rtcs; > + > + /* find NVRAM device */ > + rtcs = find_compatible_devices("nvram",
"AMD8111"); > + if (rtcs && rtcs->addrs) { > + maple_nvram_base = rtcs->addrs[0].address; > + } else { > + printk(KERN_INFO "Maple: Unable to find NVRAM\n"); > + printk(KERN_INFO "Maple: Manual Power-Down Required\n"); > + return; > + } > + > + /* find service processor device */ > + rtcs = find_devices("service-processor"); > + if (!rtcs) { > + printk(KERN_INFO "Maple: Unable to find Service Processor\n"); > + printk(KERN_INFO "Maple: Manual Power-Down Required\n"); > + return; > + } > + maple_nvram_offset = *(unsigned int*) get_property(rtcs, > + "power-off-addr", NULL); > + maple_nvram_command = *(unsigned int*) get_property(rtcs, > + "power-off-value", NULL); > + > + /* send command */ > + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); > + for (;;) ; > } > > static void maple_halt(void) > > > -Frank -- Frank Rowand MontaVista Software, Inc From david at gibson.dropbear.id.au Tue Jun 7 12:09:04 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Tue, 7 Jun 2005 12:09:04 +1000 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart In-Reply-To: <42A4FDF7.4090803@mvista.com> References: <200506032334.j53NYfUm010224@localhost.localdomain> <20050607011940.GD5142@localhost.localdomain> <42A4FDF7.4090803@mvista.com> Message-ID: <20050607020904.GG5142@localhost.localdomain> On Mon, Jun 06, 2005 at 06:52:55PM -0700, Frank Rowand wrote: > David Gibson wrote: > >On Fri, Jun 03, 2005 at 04:34:41PM -0700, Frank Rowand wrote: > > > >>The Maple board currently returns from a request to halt, power off, or > >>restart. This patch causes all processors to instead spin in these cases > >>so that the system appears to have halted. > >> > >>This patch applies against linux-2.6.12-rc5. > > > > > >This one is probably a better way to go, though. This came from Matt > >Dharm some time ago, but was never merged due to a miscommunication. 
> >It implements the actual board-level reboot and powerdown by > >communicating with the service processor. > > David, > > I tried this patch on linux-2.6.12-rc5. The maple_restart() worked fine. > The maple_halt() did not power off the board. The maintenance processor > PIBS reports: > > PIBS $ Message received: Power off. > Powering down 970s and the CPC925 ... > Power down complete. > > but the board remains powered on. Yes, I believe this is the hardware/firmware bug that BenH mentioned. In fact I think it does turn off the 970s, but not the fans and other board-level power. > Some other comments: > > maple_halt() should do the same thing as maple_power_off(). (It could > even just call maple_power_off().) > > maple_restart(), maple_power_off(), and maple_halt() should not ever > return. The returns could be replaced with the code from my patch > that started this thread. Ah, yes, good point. > I don't know if smp_send_stop() is necesary (there seems to be a lack > of consistency among the different architectures) but it seems like a > good idea. > > Do you want me to rework the patch with these suggestions? (But I'm > not sure how to resolve the failure to power off.) Please do. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From benh at kernel.crashing.org Tue Jun 7 12:10:25 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 07 Jun 2005 12:10:25 +1000 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart In-Reply-To: <42A4FDF7.4090803@mvista.com> References: <200506032334.j53NYfUm010224@localhost.localdomain> <20050607011940.GD5142@localhost.localdomain> <42A4FDF7.4090803@mvista.com> Message-ID: <1118110226.6850.27.camel@gaston> > David, > > I tried this patch on linux-2.6.12-rc5. The maple_restart() worked fine. > The maple_halt() did not power off the board. 
The maintenance processor > PIBS reports: > > PIBS $ Message received: Power off. > Powering down 970s and the CPC925 ... > Power down complete. > > but the board remains powered on. This is the problem I've talked about. Full power off doesn't work on Maple. > Some other comments: > > maple_halt() should do the same thing as maple_power_off(). (It could > even just call maple_power_off().) That's debatable... lots of people claim that halt() should just ... halt the kernel and not power off the computer :) I would personally have it do power off, but since that doesn't work ... > maple_restart(), maple_power_off(), and maple_halt() should not ever > return. The returns could be replaced with the code from my patch > that started this thread. I think the "return" case should be handled at the toplevel function in setup.c that calls ppc_md. > I don't know if smp_send_stop() is necesary (there seems to be a lack > of consistency among the different architectures) but it seems like a > good idea. > > Do you want me to rework the patch with these suggestions? (But I'm > not sure how to resolve the failure to power off.) > > > > > Andrew, please apply. 
> > > > Signed-off-by: David Gibson > > > > ===== maple_setup.c 1.5 vs edited ===== > > --- 1.5/arch/ppc64/kernel/maple_setup.c 2005-01-07 21:43:52 -08:00 > > +++ edited/maple_setup.c 2005-01-28 18:58:39 -08:00 > > @@ -78,13 +78,72 @@ > > extern void generic_find_legacy_serial_ports(u64 *physport, > > unsigned int *default_speed); > > > > - > > static void maple_restart(char *cmd) > > { > > + unsigned int maple_nvram_base; > > + unsigned int maple_nvram_offset; > > + unsigned int maple_nvram_command; > > + struct device_node *rtcs; > > + > > + /* find NVRAM device */ > > + rtcs = find_compatible_devices("nvram", "AMD8111"); > > + if (rtcs && rtcs->addrs) { > > + maple_nvram_base = rtcs->addrs[0].address; > > + } else { > > + printk(KERN_INFO "Maple: Unable to find NVRAM\n"); > > + printk(KERN_INFO "Maple: Manual Restart Required\n"); > > + return; > > + } > > + > > + /* find service processor device */ > > + rtcs = find_devices("service-processor"); > > + if (!rtcs) { > > + printk(KERN_INFO "Maple: Unable to find Service Processor\n"); > > + printk(KERN_INFO "Maple: Manual Restart Required\n"); > > + return; > > + } > > + maple_nvram_offset = *(unsigned int*) get_property(rtcs, > > + "restart-addr", NULL); > > + maple_nvram_command = *(unsigned int*) get_property(rtcs, > > + "restart-value", NULL); > > + > > + /* send command */ > > + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); > > + for (;;) ; > > } > > > > static void maple_power_off(void) > > { > > + unsigned int maple_nvram_base; > > + unsigned int maple_nvram_offset; > > + unsigned int maple_nvram_command; > > + struct device_node *rtcs; > > + > > + /* find NVRAM device */ > > + rtcs = find_compatible_devices("nvram", "AMD8111"); > > + if (rtcs && rtcs->addrs) { > > + maple_nvram_base = rtcs->addrs[0].address; > > + } else { > > + printk(KERN_INFO "Maple: Unable to find NVRAM\n"); > > + printk(KERN_INFO "Maple: Manual Power-Down Required\n"); > > + return; > > + } > > + > > + /* find 
service processor device */ > > + rtcs = find_devices("service-processor"); > > + if (!rtcs) { > > + printk(KERN_INFO "Maple: Unable to find Service Processor\n"); > > + printk(KERN_INFO "Maple: Manual Power-Down Required\n"); > > + return; > > + } > > + maple_nvram_offset = *(unsigned int*) get_property(rtcs, > > + "power-off-addr", NULL); > > + maple_nvram_command = *(unsigned int*) get_property(rtcs, > > + "power-off-value", NULL); > > + > > + /* send command */ > > + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); > > + for (;;) ; > > } > > > > static void maple_halt(void) > > > > > > > > -Frank > -- > Frank Rowand > MontaVista Software, Inc From david at gibson.dropbear.id.au Tue Jun 7 15:15:05 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Tue, 7 Jun 2005 15:15:05 +1000 Subject: Dynamic hugepage for ppc64 Message-ID: <20050607051505.GA27201@localhost.localdomain> Presently, 64-bit applications on ppc64 may only use hugepages in the address region from 1TB to 1.5TB. Furthermore, if hugepages are enabled in the kernel config, they may only use hugepages and never normal pages in this area. This patch relaxes this restriction, allowing any address to be used with hugepages, but with a 1TB granularity. That is, if you map a hugepage anywhere in the region 1TB-2TB, that entire area will be reserved exclusively for hugepages for the remainder of the process's lifetime. This works analogously to hugepages in 32-bit applications, where hugepages can be mapped anywhere, but with 256MB (MMU segment) granularity. This patch must be applied on top of my 4-level pagetables patch.
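To make the 1TB granularity concrete, the bookkeeping can be pictured as a 16-bit mask with one bit per 1TB area, as in the minimal sketch below. This is an illustration under stated assumptions rather than the code in the patch: AREA_SHIFT of 40 is assumed from the 1TB figure above, the 16-area limit matches the u16 masks and the i < 16 loops in the diff, and the names high_area_mask() and range_is_hugepage_only() are hypothetical. The sketch also ignores the separate handling of addresses below 4GB, which the patch tracks per 256MB segment.

```c
#include <stdint.h>

#define AREA_SHIFT 40	/* assumed: 1TB areas, i.e. 2^40 bytes each */
#define NUM_AREAS  16	/* one bit per area in a 16-bit mask */

/*
 * Hypothetical helper: return a mask with one bit set for every 1TB area
 * touched by [addr, addr + len). Returns 0 for an empty range, a range
 * that wraps around, or one that extends past the last tracked area.
 */
static uint16_t high_area_mask(uint64_t addr, uint64_t len)
{
	uint64_t first, last, i;
	uint16_t mask = 0;

	if (len == 0 || addr + len < addr)
		return 0;
	first = addr >> AREA_SHIFT;
	last = (addr + len - 1) >> AREA_SHIFT;
	if (last >= NUM_AREAS)
		return 0;
	for (i = first; i <= last; i++)
		mask |= (uint16_t)(1u << i);
	return mask;
}

/* Nonzero if every area the range touches is already open for hugepages */
static int range_is_hugepage_only(uint16_t open_areas, uint64_t addr,
				  uint64_t len)
{
	uint16_t need = high_area_mask(addr, len);

	return need != 0 && (open_areas & need) == need;
}
```

With this layout a mapping that straddles the 2TB boundary needs both bit 1 and bit 2, so opening it would reserve the whole 1TB-3TB span for hugepages.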
Index: working-2.6/arch/ppc64/kernel/asm-offsets.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/asm-offsets.c 2005-05-24 14:12:22.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/asm-offsets.c 2005-06-07 13:34:59.000000000 +1000 @@ -95,7 +95,8 @@ DEFINE(PACASLBCACHEPTR, offsetof(struct paca_struct, slb_cache_ptr)); DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id)); #ifdef CONFIG_HUGETLB_PAGE - DEFINE(PACAHTLBSEGS, offsetof(struct paca_struct, context.htlb_segs)); + DEFINE(PACALOWHTLBAREAS, offsetof(struct paca_struct, context.low_htlb_areas)); + DEFINE(PACAHIGHHTLBAREAS, offsetof(struct paca_struct, context.high_htlb_areas)); #endif /* CONFIG_HUGETLB_PAGE */ DEFINE(PACADEFAULTDECR, offsetof(struct paca_struct, default_decr)); DEFINE(PACA_EXGEN, offsetof(struct paca_struct, exgen)); Index: working-2.6/arch/ppc64/mm/hugetlbpage.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hugetlbpage.c 2005-06-07 13:26:15.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hugetlbpage.c 2005-06-07 13:45:01.000000000 +1000 @@ -131,15 +131,15 @@ return 0; } -static void flush_segments(void *parm) +static void flush_low_segments(void *parm) { - u16 segs = (unsigned long) parm; + u16 areas = (unsigned long) parm; unsigned long i; asm volatile("isync" : : : "memory"); for (i = 0; i < 16; i++) { - if (! (segs & (1U << i))) + if (! (areas & (1U << i))) continue; asm volatile("slbie %0" : : "r" (i << SID_SHIFT)); } @@ -147,6 +147,24 @@ asm volatile("isync" : : : "memory"); } +static void flush_high_segments(void *parm) +{ + u16 areas = (unsigned long) parm; + unsigned long i, j; + + asm volatile("isync" : : : "memory"); + + for (i = 0; i < 16; i++) { + if (! 
(areas & (1U << i))) + continue; + for (j = 0; j < (1UL << (HTLB_AREA_SHIFT-SID_SHIFT)); j++) + asm volatile("slbie %0" + :: "r" ((i << HTLB_AREA_SHIFT) + (j << SID_SHIFT))); + } + + asm volatile("isync" : : : "memory"); +} + static int prepare_low_seg_for_htlb(struct mm_struct *mm, unsigned long seg) { unsigned long start = seg << SID_SHIFT; @@ -163,20 +181,36 @@ return 0; } -static int open_low_hpage_segs(struct mm_struct *mm, u16 newsegs) +static int prepare_high_zone_for_htlb(struct mm_struct *mm, unsigned long zone) +{ + unsigned long start = zone << HTLB_AREA_SHIFT; + unsigned long end = (zone+1) << HTLB_AREA_SHIFT; + struct vm_area_struct *vma; + + BUG_ON(zone >= 16); + + /* Check no VMAs are in the region */ + vma = find_vma(mm, start); + if (vma && (vma->vm_start < end)) + return -EBUSY; + + return 0; +} + +static int open_low_hpage_areas(struct mm_struct *mm, u16 newareas) { unsigned long i; - newsegs &= ~(mm->context.htlb_segs); - if (! newsegs) + newareas &= ~(mm->context.low_htlb_areas); + if (! newareas) return 0; /* The segments we want are already open */ for (i = 0; i < 16; i++) - if ((1 << i) & newsegs) + if ((1 << i) & newareas) if (prepare_low_seg_for_htlb(mm, i) != 0) return -EBUSY; - mm->context.htlb_segs |= newsegs; + mm->context.low_htlb_areas |= newareas; /* update the paca copy of the context struct */ get_paca()->context = mm->context; @@ -184,29 +218,58 @@ /* the context change must make it to memory before the flush, * so that further SLB misses do the right thing. */ mb(); - on_each_cpu(flush_segments, (void *)(unsigned long)newsegs, 0, 1); + on_each_cpu(flush_low_segments, (void *)(unsigned long)newareas, 0, 1); + + return 0; +} + +static int open_high_hpage_areas(struct mm_struct *mm, u16 newareas) +{ + unsigned long i; + + newareas &= ~(mm->context.high_htlb_areas); + if (! 
newareas) + return 0; /* The areas we want are already open */ + + for (i = 0; i < 16; i++) + if ((1 << i) & newareas) + if (prepare_high_zone_for_htlb(mm, i) != 0) + return -EBUSY; + + mm->context.high_htlb_areas |= newareas; + + /* update the paca copy of the context struct */ + get_paca()->context = mm->context; + + /* the context change must make it to memory before the flush, + * so that further SLB misses do the right thing. */ + mb(); + on_each_cpu(flush_high_segments, (void *)(unsigned long)newareas, 0, 1); return 0; } int prepare_hugepage_range(unsigned long addr, unsigned long len) { - if (within_hugepage_high_range(addr, len)) - return 0; - else if ((addr < 0x100000000UL) && ((addr+len) < 0x100000000UL)) { - int err; - /* Yes, we need both tests, in case addr+len overflows - * 64-bit arithmetic */ - err = open_low_hpage_segs(current->mm, + int err; + + if ( (addr+len) < addr ) + return -EINVAL; + + if ((addr + len) < 0x100000000UL) + err = open_low_hpage_areas(current->mm, LOW_ESID_MASK(addr, len)); - if (err) - printk(KERN_DEBUG "prepare_hugepage_range(%lx, %lx)" - " failed (segs: 0x%04hx)\n", addr, len, - LOW_ESID_MASK(addr, len)); + else + err = open_high_hpage_areas(current->mm, + HTLB_AREA_MASK(addr, len)); + if (err) { + printk(KERN_DEBUG "prepare_hugepage_range(%lx, %lx)" + " failed (areas: 0x%04hx, areas: 0x%04hx)\n", addr, len, + LOW_ESID_MASK(addr, len), HTLB_AREA_MASK(addr, len)); return err; } - return -EINVAL; + return 0; } struct page * @@ -273,8 +336,8 @@ vma = find_vma(mm, addr); continue; } - if (touches_hugepage_high_range(addr, len)) { - addr = TASK_HPAGE_END; + if (touches_hugepage_high_range(mm, addr, len)) { + addr = ALIGN(addr+1, 1UL<mm, addr); - for (vma = find_vma(current->mm, addr); - addr + len <= TASK_HPAGE_END; - vma = vma->vm_next) { + while (addr + len <= TASK_SIZE_USER64) { BUG_ON(vma && (addr >= vma->vm_end)); /* invariant */ - BUG_ON(! within_hugepage_high_range(addr, len)); + + if (! 
__within_hugepage_high_range(addr, len, zonemask)) { + addr = ALIGN(addr+1, 1UL<mm, addr); + continue; + } if (!vma || (addr + len) <= vma->vm_start) return addr; addr = ALIGN(vma->vm_end, HPAGE_SIZE); - /* Because we're in a hugepage region, this alignment - * should not skip us over any VMAs */ + /* Depending on segmask this might not be a confirmed + * hugepage region, so the ALIGN could have skipped + * some VMAs */ + vma = find_vma(current->mm, addr); } return -ENOMEM; @@ -460,11 +529,11 @@ if (test_thread_flag(TIF_32BIT)) { int lastshift = 0; - u16 segmask, cursegs = current->mm->context.htlb_segs; + u16 segmask, curareas = current->mm->context.low_htlb_areas; /* First see if we can do the mapping in the existing * low hpage segments */ - addr = htlb_get_low_area(len, cursegs); + addr = htlb_get_low_area(len, curareas); if (addr != -ENOMEM) return addr; @@ -473,16 +542,37 @@ if (segmask & 1) lastshift = 1; - addr = htlb_get_low_area(len, cursegs | segmask); + addr = htlb_get_low_area(len, curareas | segmask); if ((addr != -ENOMEM) - && open_low_hpage_segs(current->mm, segmask) == 0) + && open_low_hpage_areas(current->mm, segmask) == 0) return addr; } printk(KERN_DEBUG "hugetlb_get_unmapped_area() unable to open" " enough segments\n"); return -ENOMEM; } else { - return htlb_get_high_area(len); + int lastshift = 0; + u16 zonemask, curareas = current->mm->context.high_htlb_areas; + + /* First see if we can do the mapping in the existing + * low hpage segments */ + addr = htlb_get_high_area(len, curareas); + if (addr != -ENOMEM) + return addr; + + for (zonemask = HTLB_AREA_MASK(TASK_SIZE_USER64-len, len); + ! 
lastshift; zonemask >>=1) { + if (zonemask & 1) + lastshift = 1; + + addr = htlb_get_high_area(len, curareas | zonemask); + if ((addr != -ENOMEM) + && open_high_hpage_areas(current->mm, zonemask) == 0) + return addr; + } + printk(KERN_DEBUG "hugetlb_get_unmapped_area() unable to open" + " enough segments\n"); + return -ENOMEM; } } Index: working-2.6/include/asm-ppc64/mmu.h =================================================================== --- working-2.6.orig/include/asm-ppc64/mmu.h 2005-06-07 13:26:15.000000000 +1000 +++ working-2.6/include/asm-ppc64/mmu.h 2005-06-07 13:32:57.000000000 +1000 @@ -304,7 +304,7 @@ typedef struct { mm_context_id_t id; #ifdef CONFIG_HUGETLB_PAGE - u16 htlb_segs; /* bitmask */ + u16 low_htlb_areas, high_htlb_areas; #endif } mm_context_t; Index: working-2.6/include/asm-ppc64/page.h =================================================================== --- working-2.6.orig/include/asm-ppc64/page.h 2005-06-07 13:26:15.000000000 +1000 +++ working-2.6/include/asm-ppc64/page.h 2005-06-07 13:38:32.000000000 +1000 @@ -37,40 +37,45 @@ #define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT) -/* For 64-bit processes the hugepage range is 1T-1.5T */ -#define TASK_HPAGE_BASE ASM_CONST(0x0000010000000000) -#define TASK_HPAGE_END ASM_CONST(0x0000018000000000) +#define HTLB_AREA_SHIFT 40 +#define HTLB_AREA_SIZE (1UL << HTLB_AREA_SHIFT) +#define GET_HTLB_AREA(x) ((x) >> HTLB_AREA_SHIFT) #define LOW_ESID_MASK(addr, len) (((1U << (GET_ESID(addr+len-1)+1)) \ - (1U << GET_ESID(addr))) & 0xffff) +#define HTLB_AREA_MASK(addr, len) (((1U << (GET_HTLB_AREA(addr+len-1)+1)) \ + - (1U << GET_HTLB_AREA(addr))) & 0xffff) #define ARCH_HAS_HUGEPAGE_ONLY_RANGE #define ARCH_HAS_PREPARE_HUGEPAGE_RANGE #define ARCH_HAS_SETCLEAR_HUGE_PTE #define touches_hugepage_low_range(mm, addr, len) \ - (LOW_ESID_MASK((addr), (len)) & mm->context.htlb_segs) -#define touches_hugepage_high_range(addr, len) \ - (((addr) > (TASK_HPAGE_BASE-(len))) && ((addr) < TASK_HPAGE_END)) + 
(LOW_ESID_MASK((addr), (len)) & (mm)->context.low_htlb_areas) +#define touches_hugepage_high_range(mm, addr, len) \ + (HTLB_AREA_MASK((addr), (len)) & (mm)->context.high_htlb_areas) #define __within_hugepage_low_range(addr, len, segmask) \ ((LOW_ESID_MASK((addr), (len)) | (segmask)) == (segmask)) #define within_hugepage_low_range(addr, len) \ __within_hugepage_low_range((addr), (len), \ - current->mm->context.htlb_segs) -#define within_hugepage_high_range(addr, len) (((addr) >= TASK_HPAGE_BASE) \ - && ((addr)+(len) <= TASK_HPAGE_END) && ((addr)+(len) >= (addr))) + current->mm->context.low_htlb_areas) +#define __within_hugepage_high_range(addr, len, zonemask) \ + ((HTLB_AREA_MASK((addr), (len)) | (zonemask)) == (zonemask)) +#define within_hugepage_high_range(addr, len) \ + __within_hugepage_high_range((addr), (len), \ + current->mm->context.high_htlb_areas) #define is_hugepage_only_range(mm, addr, len) \ - (touches_hugepage_high_range((addr), (len)) || \ + (touches_hugepage_high_range((mm), (addr), (len)) || \ touches_hugepage_low_range((mm), (addr), (len))) #define HAVE_ARCH_HUGETLB_UNMAPPED_AREA #define in_hugepage_area(context, addr) \ (cpu_has_feature(CPU_FTR_16M_PAGE) && \ - ( (((addr) >= TASK_HPAGE_BASE) && ((addr) < TASK_HPAGE_END)) || \ + ( ((1 << GET_HTLB_AREA(addr)) & (context).high_htlb_areas) || \ ( ((addr) < 0x100000000L) && \ - ((1 << GET_ESID(addr)) & (context).htlb_segs) ) ) ) + ((1 << GET_ESID(addr)) & (context).low_htlb_areas) ) ) ) #else /* !CONFIG_HUGETLB_PAGE */ Index: working-2.6/arch/ppc64/mm/slb_low.S =================================================================== --- working-2.6.orig/arch/ppc64/mm/slb_low.S 2005-06-07 13:26:15.000000000 +1000 +++ working-2.6/arch/ppc64/mm/slb_low.S 2005-06-07 14:54:11.000000000 +1000 @@ -89,30 +89,34 @@ b 9f 0: /* user address: proto-VSID = context<<15 | ESID */ - li r11,SLB_VSID_USER - srdi. 
r9,r3,USER_ESID_BITS bne- 8f /* invalid ea bits set */ #ifdef CONFIG_HUGETLB_PAGE BEGIN_FTR_SECTION - /* check against the hugepage ranges */ - cmpldi r3,(TASK_HPAGE_END>>SID_SHIFT) - bge 6f /* >= TASK_HPAGE_END */ - cmpldi r3,(TASK_HPAGE_BASE>>SID_SHIFT) - bge 5f /* TASK_HPAGE_BASE..TASK_HPAGE_END */ + lhz r9,PACAHIGHHTLBAREAS(r13) + srdi r11,r3,(HTLB_AREA_SHIFT-SID_SHIFT) + srd r9,r9,r11 + andi. r9,r9,1 + bne 5f + + li r11,SLB_VSID_USER cmpldi r3,16 - bge 6f /* 4GB..TASK_HPAGE_BASE */ + blt 6f /* if we're below 4GB, we're done */ - lhz r9,PACAHTLBSEGS(r13) + lhz r9,PACALOWHTLBAREAS(r13) srd r9,r9,r3 andi. r9,r9,1 - beq 6f + bne 5f + + b 6f 5: /* this is a hugepage user address */ li r11,(SLB_VSID_USER|SLB_VSID_L) END_FTR_SECTION_IFSET(CPU_FTR_16M_PAGE) -#endif /* CONFIG_HUGETLB_PAGE */ +#else /* CONFIG_HUGETLB_PAGE */ + li r11,SLB_VSID_USER +#endif 6: ld r9,PACACONTEXTID(r13) rldimi r3,r9,USER_ESID_BITS,0 -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From paulus at samba.org Tue Jun 7 21:04:27 2005 From: paulus at samba.org (Paul Mackerras) Date: Tue, 7 Jun 2005 21:04:27 +1000 Subject: [PATCH] initialize TCE tables In-Reply-To: <1117649436.28482.3.camel@sinatra.austin.ibm.com> References: <1117649436.28482.3.camel@sinatra.austin.ibm.com> Message-ID: <17061.32571.867554.400383@cargo.ozlabs.ibm.com> John Rose writes: > A fairly recent platform requirement states that the OS must clear the > whole TCE table at setup time, in case firmware left any active > mappings in it. Without this initialization, dynamic bus removes can > fail. Firmware rejects these requests if active mappings still exist > for a slot that has been deallocated by the OS. > > If there are no objections, I'll forward this to Andrew Morton. Looks fine to me. Do we need to push for it to be in 2.6.12? Paul. 
From paulus at samba.org Tue Jun 7 21:39:03 2005 From: paulus at samba.org (Paul Mackerras) Date: Tue, 7 Jun 2005 21:39:03 +1000 Subject: [PATCH] update ppc64 defconfig In-Reply-To: <20050605074945.GA16638@suse.de> References: <20050605074945.GA16638@suse.de> Message-ID: <17061.34647.609440.548783@cargo.ozlabs.ibm.com> Olaf Hering writes: > enable cpusets > enable new lpfc and jsm drivers > enable new dm-multipath > leave new agp disabled > disable rivafb, it does not handle the cards in G5 models (FX5200 as example) > the new nvidiafb doesnt work on bigendian, yet > > Signed-off-by: Olaf Hering Acked-by: Paul Mackerras From johnrose at austin.ibm.com Wed Jun 8 00:59:03 2005 From: johnrose at austin.ibm.com (John Rose) Date: Tue, 07 Jun 2005 09:59:03 -0500 Subject: [PATCH] initialize TCE tables In-Reply-To: <17061.32571.867554.400383@cargo.ozlabs.ibm.com> References: <1117649436.28482.3.camel@sinatra.austin.ibm.com> <17061.32571.867554.400383@cargo.ozlabs.ibm.com> Message-ID: <1118156343.29412.3.camel@sinatra.austin.ibm.com> Hi Paul- > Looks fine to me. Do we need to push for it to be in 2.6.12? That would be helpful, since machines are already expecting this behavior. If not, it won't be the end of the world. Thanks- John From frowand at mvista.com Wed Jun 8 07:43:54 2005 From: frowand at mvista.com (Frank Rowand) Date: Tue, 7 Jun 2005 14:43:54 -0700 Subject: [PATCH] ppc64: do not return from maple halt, power off, restart Message-ID: <200506072143.j57LhsHO017122@localhost.localdomain> I updated David Gibson's patch to reflect the comments on the list. I tested this patch on linux-2.6.12-rc6.
Benjamin Herrenschmidt wrote: >>maple_halt() should do the same thing as maple_power_off(). (It could >>even just call maple_power_off().) > > > That's debatable... lots of people claim that halt() should just ... > halt the kernel and not power off the computer :) I would personally > have it do power off, but since that doesn't work ... I changed maple_halt() to call maple_power_off(), to be consistent with the other ppc64 halt functions. If someone changed it to just halt, it wouldn't bother me. >>maple_restart(), maple_power_off(), and maple_halt() should not ever >>return. The returns could be replaced with the code from my patch >>that started this thread. > > > I think the "return" case should be handled at the toplevel function in > setup.c that calls ppc_md. Good idea. I updated setup.c to catch the return case for restart, power down, and halt. I used "#ifdef CONFIG_SMP" instead of adding a null smp_send_stop() to include/smp.h. This #ifdef is already used in several other places in setup.c. I also changed the printk()s in David's patch from KERN_INFO to KERN_EMERG to match the shutdown messages in kernel/sys.c. How does this version of the patch look to everyone? 
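The "catch the return case at the top level" idea described above can be sketched in plain user-space C. This is only an illustrative model under stated assumptions, not the patch itself: `struct machdep_model`, `machine_halt_model()`, `final_stop` and the two stub hooks are hypothetical names, and the kernel's last-resort sequence (smp_send_stop(), a KERN_EMERG message, local_irq_disable() and an endless loop) is replaced by a callback so the control flow can be exercised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's ppc_md hook table. */
struct machdep_model {
	void (*nvram_sync)(void);	/* optional, may be NULL */
	void (*halt)(void);		/* platform hook, not supposed to return */
};

/* Counters and stub hooks used only to exercise the model. */
int hook_calls;
int stop_calls;

/* A platform hook that fails to halt and simply comes back. */
void returning_halt(void) { hook_calls++; }

/* Stand-in for the generic "stop everything and spin" fallback. */
void count_stop(void) { stop_calls++; }

/*
 * Model of the top-level wrapper: sync NVRAM if the platform provides
 * a hook, call the platform halt hook, and if that hook returns anyway,
 * fall through to the generic last-resort stop.
 */
void machine_halt_model(const struct machdep_model *md,
			void (*final_stop)(void))
{
	assert(md != NULL && final_stop != NULL);

	if (md->nvram_sync)
		md->nvram_sync();
	if (md->halt)
		md->halt();	/* may return on platforms that cannot halt */

	final_stop();		/* only reached when the hook came back */
}
```

With this shape, a platform hook that gives up early (for example because it cannot find the NVRAM or the service processor) does not need its own endless loop: the wrapper is what keeps the machine_* entry points from returning to their caller.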
-Frank Signed-off-by: Frank Rowand Signed-off-by: David Gibson ===== maple_setup.c 1.5 vs edited ===== Index: linux-2.6.12-rc6/arch/ppc64/kernel/maple_setup.c =================================================================== --- linux-2.6.12-rc6.orig/arch/ppc64/kernel/maple_setup.c +++ linux-2.6.12-rc6/arch/ppc64/kernel/maple_setup.c @@ -78,17 +78,77 @@ extern int maple_pci_get_legacy_ide_irq( extern void generic_find_legacy_serial_ports(u64 *physport, unsigned int *default_speed); - static void maple_restart(char *cmd) { + unsigned int maple_nvram_base; + unsigned int maple_nvram_offset; + unsigned int maple_nvram_command; + struct device_node *rtcs; + + /* find NVRAM device */ + rtcs = find_compatible_devices("nvram", "AMD8111"); + if (rtcs && rtcs->addrs) { + maple_nvram_base = rtcs->addrs[0].address; + } else { + printk(KERN_EMERG "Maple: Unable to find NVRAM\n"); + printk(KERN_EMERG "Maple: Manual Restart Required\n"); + return; + } + + /* find service processor device */ + rtcs = find_devices("service-processor"); + if (!rtcs) { + printk(KERN_EMERG "Maple: Unable to find Service Processor\n"); + printk(KERN_EMERG "Maple: Manual Restart Required\n"); + return; + } + maple_nvram_offset = *(unsigned int*) get_property(rtcs, + "restart-addr", NULL); + maple_nvram_command = *(unsigned int*) get_property(rtcs, + "restart-value", NULL); + + /* send command */ + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); + for (;;) ; } static void maple_power_off(void) { + unsigned int maple_nvram_base; + unsigned int maple_nvram_offset; + unsigned int maple_nvram_command; + struct device_node *rtcs; + + /* find NVRAM device */ + rtcs = find_compatible_devices("nvram", "AMD8111"); + if (rtcs && rtcs->addrs) { + maple_nvram_base = rtcs->addrs[0].address; + } else { + printk(KERN_EMERG "Maple: Unable to find NVRAM\n"); + printk(KERN_EMERG "Maple: Manual Power-Down Required\n"); + return; + } + + /* find service processor device */ + rtcs = 
find_devices("service-processor"); + if (!rtcs) { + printk(KERN_EMERG "Maple: Unable to find Service Processor\n"); + printk(KERN_EMERG "Maple: Manual Power-Down Required\n"); + return; + } + maple_nvram_offset = *(unsigned int*) get_property(rtcs, + "power-off-addr", NULL); + maple_nvram_command = *(unsigned int*) get_property(rtcs, + "power-off-value", NULL); + + /* send command */ + outb_p(maple_nvram_command, maple_nvram_base + maple_nvram_offset); + for (;;) ; } static void maple_halt(void) { + maple_power_off(); } #ifdef CONFIG_SMP Index: linux-2.6.12-rc6/arch/ppc64/kernel/setup.c =================================================================== --- linux-2.6.12-rc6.orig/arch/ppc64/kernel/setup.c +++ linux-2.6.12-rc6/arch/ppc64/kernel/setup.c @@ -678,6 +678,12 @@ void machine_restart(char *cmd) if (ppc_md.nvram_sync) ppc_md.nvram_sync(); ppc_md.restart(cmd); +#ifdef CONFIG_SMP + smp_send_stop(); +#endif + printk(KERN_EMERG "System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) ; } EXPORT_SYMBOL(machine_restart); @@ -687,6 +693,12 @@ void machine_power_off(void) if (ppc_md.nvram_sync) ppc_md.nvram_sync(); ppc_md.power_off(); +#ifdef CONFIG_SMP + smp_send_stop(); +#endif + printk(KERN_EMERG "System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) ; } EXPORT_SYMBOL(machine_power_off); @@ -696,6 +708,12 @@ void machine_halt(void) if (ppc_md.nvram_sync) ppc_md.nvram_sync(); ppc_md.halt(); +#ifdef CONFIG_SMP + smp_send_stop(); +#endif + printk(KERN_EMERG "System Halted, OK to turn off power\n"); + local_irq_disable(); + while (1) ; } EXPORT_SYMBOL(machine_halt); From benh at kernel.crashing.org Wed Jun 8 13:06:37 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 08 Jun 2005 13:06:37 +1000 Subject: Discuss: Adding OF Flat Dev Tree to ppc32 In-Reply-To: <1117819176.6517.290.camel@cashmere.sps.mot.com> References: <1117614390.19020.24.camel@gaston> <1117614484.19020.27.camel@gaston> 
<1117783104.31082.151.camel@gaston> <1117819176.6517.290.camel@cashmere.sps.mot.com> Message-ID: <1118199997.6850.106.camel@gaston> On Fri, 2005-06-03 at 12:19 -0500, Jon Loeliger wrote: > Ben and Folks, Hi Jon ! > I've read through ppc64/kernel/prom.c and done some minor > call-chain analysis rooted at the two functions: > early_init_devtree() > unflatten_device_tree() > as they are apparently the only two referenced in the > initial early boot up process. Yes. The first one is called very early on ppc64 with MMU still off (the ppc64 kernel can happily run C code without translation due to a "feature" where the top 2 bits of addresses are ignored in real mode, ppc32 will need to be a bit smarter here). It's basically used to extract some infos directly from the flattened tree in order to construct the LMB list (list of memory blocks, equivalent of ppc32's mem_pieces), get the platform number (though we are thinking about killing that platform number thing) and get a few other things that are used very early during boot. It uses functions that iterate the flattened device-tree directly that are currently not exported outside of prom.c but that may change, especially if we remove the platform number, we may let the ppc_md.probe() function use those to look at the device-tree model/compatible properties. The other function is called much later, with MMU enabled, to do the actual expansion of the tree into a pointer based tree that can be more easily walked & manipulated. > My notion was to take the portion of prom.c rooted at > these two functions and add them to the ppc32 line. > > First, what portions of pp64/kernel/prom.c are obsolete? > Anything? You alluded to cleaning this up some, but I > am not too familiar with it to know where that was headed. Well, I want to remove pretty much all of the stuff done in finish_device_tree(). That is the pre-parsed n_addrs/addrs and n_intrs/intrs fields of struct device_node, and all the functions involved in this parsing. 
I would then provide a better function to be called externally to "on-demand" map an interrupt, and eventually to process "reg" properties properly. I also want to move the other unrelated stuff from struct device_node like PCI gunk that we have on ppc64 to some separate structure that is only used by PCI or VIO devices. > Second, there is already a fairly similar prom.c file > hanging out over in ppc32 land. I _think_ it houses > roughly the complementary code out of ppc64's prom.c > that is NOT derived from the call chain derived from > the above two functions. Yes. > Which leads me to the questions: Is there, or should > we create, a plan to factor the flat-dev-tree handling > code into common or shared ppc code? Hrm... There is a lot of code duplication in ppc vs. ppc64 land (and in some cases, we do actually use the same headers, that is, include/asm-ppc64/something.h may just itself include the "asm-ppc" version). Stephen Rothwell here is working on patches to make that more systematic for things that really should be identical, like termios.h etc... Regarding code in arch/ppc*, I'm not sure what the right approach would be. I'd say first copy things around, and we'll see what we end up with. > I am reluctant > to just outright clone and copy that code if it will > ultimately "be the same" or even "mostly the same". > It seems that the early_init_devtree() might then need > to be refactored or duplicated for ppc32-land. Sort-of. We don't run in the same context though. We need to call this before MMU_init() on ppc32. Depending on the processor type that code runs mapped by BATs or bolted TLBs or whatever trick is used that early during boot and may not be able to access all of RAM. So you may need to add additional restrictions on the location of the device-tree in memory or run code in real mode if possible or whatever. We need the BookE folks for example to look at this closely. On 6xx & friends, I think only 16Mb.
On POWER4 running 32 bits, RAM is bolted in the hash table by prom_init.c but that's just a gross hack. etc... > Are you anticipating the same r3,r4,r5 interface outlined > in your 0.4 rev of the ppc4 OF spec to be used by the > ppc32 world as well? Seems like it just might... Yes. > Naturally, I'm willing to jump in here, just looking > for a bit of global-direction from you. :-) No problem :) We'll also want to refactor prom_init to be much more like the ppc64 version, that is cut all ties between it and the rest of the kernel (so it doesn't share any global) and so that it generates a flattened tree and passes that to the kernel. Ben. From sfr at canb.auug.org.au Wed Jun 8 17:27:12 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 8 Jun 2005 17:27:12 +1000 Subject: [PATCH 0/4] ppc64 iSeries: irq and pci cleanups In-Reply-To: <20050606161415.055dce39.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> Message-ID: <20050608172712.4b56ec5d.sfr@canb.auug.org.au> Hi Andrew, Here is today's set of patches. These are more iSeries cleanups and depend on the previous sets. arch/ppc64/Kconfig | 2 arch/ppc64/kernel/Makefile | 6 arch/ppc64/kernel/XmPciLpEvent.c | 190 ------------------ arch/ppc64/kernel/dma.c | 4 arch/ppc64/kernel/iSeries_iommu.c | 3 arch/ppc64/kernel/iSeries_irq.c | 312 ++++++++++++++++++++++--------- arch/ppc64/kernel/iSeries_setup.c | 8 arch/ppc64/kernel/sys_ppc32.c | 3 arch/ppc64/lib/Makefile | 2 drivers/char/mem.c | 8 drivers/serial/Kconfig | 2 include/asm-ppc64/dma.h | 3 include/asm-ppc64/iSeries/XmPciLpEvent.h | 7 include/asm-ppc64/iSeries/iSeries_irq.h | 3 include/asm-ppc64/iommu.h | 4 15 files changed, 261 insertions(+), 296 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/
From sfr at canb.auug.org.au Wed Jun 8 17:32:21 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 8 Jun 2005 17:32:21 +1000 Subject: [PATCH 1/4] ppc64 iSeries: irq simple cleanups In-Reply-To: <20050608172712.4b56ec5d.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> <20050608172712.4b56ec5d.sfr@canb.auug.org.au> Message-ID: <20050608173221.3d008bc2.sfr@canb.auug.org.au> Hi Andrew, This patch is just simple cleanups to the iSeries irq code. - whitespace and comments - rearrange some functions to avoid forward declarations - remove XmPciLpEvent.h as its functions were declared elsewhere - remove declaration of function that no longer exists No semantic changes. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-iSeries-headers.16/arch/ppc64/kernel/XmPciLpEvent.c linus-iSeries-headers.17/arch/ppc64/kernel/XmPciLpEvent.c --- linus-iSeries-headers.16/arch/ppc64/kernel/XmPciLpEvent.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.17/arch/ppc64/kernel/XmPciLpEvent.c 2005-06-08 12:10:56.000000000 +1000 @@ -1,9 +1,8 @@ /* - * File XmPciLpEvent.h created by Wayne Holm on Mon Jan 15 2001. + * File XmPciLpEvent.c created by Wayne Holm on Mon Jan 15 2001. * * This module handles PCI interrupt events sent by the iSeries Hypervisor. 
-*/ - + */ #include #include #include @@ -17,22 +16,22 @@ #include #include #include -#include +#include #include static long Pci_Interrupt_Count; static long Pci_Event_Count; enum XmPciLpEvent_Subtype { - XmPciLpEvent_BusCreated = 0, // PHB has been created - XmPciLpEvent_BusError = 1, // PHB has failed - XmPciLpEvent_BusFailed = 2, // Msg to Secondary, Primary failed bus - XmPciLpEvent_NodeFailed = 4, // Multi-adapter bridge has failed - XmPciLpEvent_NodeRecovered = 5, // Multi-adapter bridge has recovered - XmPciLpEvent_BusRecovered = 12, // PHB has been recovered - XmPciLpEvent_UnQuiesceBus = 18, // Secondary bus unqiescing - XmPciLpEvent_BridgeError = 21, // Bridge Error - XmPciLpEvent_SlotInterrupt = 22 // Slot interrupt + XmPciLpEvent_BusCreated = 0, // PHB has been created + XmPciLpEvent_BusError = 1, // PHB has failed + XmPciLpEvent_BusFailed = 2, // Msg to Secondary, Primary failed bus + XmPciLpEvent_NodeFailed = 4, // Multi-adapter bridge has failed + XmPciLpEvent_NodeRecovered = 5, // Multi-adapter bridge has recovered + XmPciLpEvent_BusRecovered = 12, // PHB has been recovered + XmPciLpEvent_UnQuiesceBus = 18, // Secondary bus unqiescing + XmPciLpEvent_BridgeError = 21, // Bridge Error + XmPciLpEvent_SlotInterrupt = 22 // Slot interrupt }; struct XmPciLpEvent_BusInterrupt { @@ -71,43 +70,6 @@ }; static void intReceived(struct XmPciLpEvent *eventParm, - struct pt_regs *regsParm); - -static void XmPciLpEvent_handler(struct HvLpEvent *eventParm, - struct pt_regs *regsParm) -{ -#ifdef CONFIG_PCI -#if 0 - PPCDBG(PPCDBG_BUSWALK, "XmPciLpEvent_handler, type 0x%x\n", - eventParm->xType); -#endif - ++Pci_Event_Count; - - if (eventParm && (eventParm->xType == HvLpEvent_Type_PciIo)) { - switch (eventParm->xFlags.xFunction) { - case HvLpEvent_Function_Int: - intReceived((struct XmPciLpEvent *)eventParm, regsParm); - break; - case HvLpEvent_Function_Ack: - printk(KERN_ERR - "XmPciLpEvent.c: unexpected ack received\n"); - break; - default: - printk(KERN_ERR - 
"XmPciLpEvent.c: unexpected event function %d\n", - (int)eventParm->xFlags.xFunction); - break; - } - } else if (eventParm) - printk(KERN_ERR - "XmPciLpEvent.c: Unrecognized PCI event type 0x%x\n", - (int)eventParm->xType); - else - printk(KERN_ERR "XmPciLpEvent.c: NULL event received\n"); -#endif -} - -static void intReceived(struct XmPciLpEvent *eventParm, struct pt_regs *regsParm) { int irq; @@ -164,6 +126,39 @@ } } +static void XmPciLpEvent_handler(struct HvLpEvent *eventParm, + struct pt_regs *regsParm) +{ +#ifdef CONFIG_PCI +#if 0 + PPCDBG(PPCDBG_BUSWALK, "XmPciLpEvent_handler, type 0x%x\n", + eventParm->xType); +#endif + ++Pci_Event_Count; + + if (eventParm && (eventParm->xType == HvLpEvent_Type_PciIo)) { + switch (eventParm->xFlags.xFunction) { + case HvLpEvent_Function_Int: + intReceived((struct XmPciLpEvent *)eventParm, regsParm); + break; + case HvLpEvent_Function_Ack: + printk(KERN_ERR + "XmPciLpEvent.c: unexpected ack received\n"); + break; + default: + printk(KERN_ERR + "XmPciLpEvent.c: unexpected event function %d\n", + (int)eventParm->xFlags.xFunction); + break; + } + } else if (eventParm) + printk(KERN_ERR + "XmPciLpEvent.c: Unrecognized PCI event type 0x%x\n", + (int)eventParm->xType); + else + printk(KERN_ERR "XmPciLpEvent.c: NULL event received\n"); +#endif +} /* This should be called sometime prior to buswalk (init_IRQ would be good) */ int XmPciLpEvent_init() @@ -179,12 +174,10 @@ if (xRc == 0) { xRc = HvLpEvent_openPath(HvLpEvent_Type_PciIo, 0); if (xRc != 0) - printk(KERN_ERR - "XmPciLpEvent.c: open event path failed with rc 0x%x\n", - xRc); + printk(KERN_ERR "XmPciLpEvent.c: open event path " + "failed with rc 0x%x\n", xRc); } else - printk(KERN_ERR - "XmPciLpEvent.c: register handler failed with rc 0x%x\n", - xRc); + printk(KERN_ERR "XmPciLpEvent.c: register handler " + "failed with rc 0x%x\n", xRc); return xRc; } diff -ruN linus-iSeries-headers.16/arch/ppc64/kernel/iSeries_irq.c linus-iSeries-headers.17/arch/ppc64/kernel/iSeries_irq.c --- 
linus-iSeries-headers.16/arch/ppc64/kernel/iSeries_irq.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.17/arch/ppc64/kernel/iSeries_irq.c 2005-06-08 12:05:30.000000000 +1000 @@ -1,27 +1,27 @@ -/************************************************************************/ -/* This module supports the iSeries PCI bus interrupt handling */ -/* Copyright (C) 20yy */ -/* */ -/* This program is free software; you can redistribute it and/or modify */ -/* it under the terms of the GNU General Public License as published by */ -/* the Free Software Foundation; either version 2 of the License, or */ -/* (at your option) any later version. */ -/* */ -/* This program is distributed in the hope that it will be useful, */ -/* but WITHOUT ANY WARRANTY; without even the implied warranty of */ -/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the */ -/* GNU General Public License for more details. */ -/* */ -/* You should have received a copy of the GNU General Public License */ -/* along with this program; if not, write to the: */ -/* Free Software Foundation, Inc., */ -/* 59 Temple Place, Suite 330, */ -/* Boston, MA 02111-1307 USA */ -/************************************************************************/ -/* Change Activity: */ -/* Created, December 13, 2000 by Wayne Holm */ -/* End Change Activity */ -/************************************************************************/ +/* + * This module supports the iSeries PCI bus interrupt handling + * Copyright (C) 20yy + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the: + * Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, + * Boston, MA 02111-1307 USA + * + * Change Activity: + * Created, December 13, 2000 by Wayne Holm + * End Change Activity + */ #include #include #include @@ -38,22 +38,6 @@ #include #include #include -#include - -static unsigned int iSeries_startup_IRQ(unsigned int irq); -static void iSeries_shutdown_IRQ(unsigned int irq); -static void iSeries_enable_IRQ(unsigned int irq); -static void iSeries_disable_IRQ(unsigned int irq); -static void iSeries_end_IRQ(unsigned int irq); - -static hw_irq_controller iSeries_IRQ_handler = { - .typename = "iSeries irq controller", - .startup = iSeries_startup_IRQ, - .shutdown = iSeries_shutdown_IRQ, - .enable = iSeries_enable_IRQ, - .disable = iSeries_disable_IRQ, - .end = iSeries_end_IRQ -}; /* This maps virtual irq numbers to real irqs */ unsigned int virt_irq_to_real_map[NR_IRQS]; @@ -69,30 +53,32 @@ XmPciLpEvent_init(); } +#define REAL_IRQ_TO_BUS(irq) ((((irq) >> 6) & 0xff) + 1) +#define REAL_IRQ_TO_IDSEL(irq) ((((irq) >> 3) & 7) + 1) +#define REAL_IRQ_TO_FUNC(irq) ((irq) & 7) + /* - * This is called out of iSeries_scan_slot to allocate an IRQ for an EADS slot - * It calculates the irq value for the slot. - * Note that subBusNumber is always 0 (at the moment at least). + * This will be called by device drivers (via enable_IRQ) + * to enable INTA in the bridge interrupt status register. 
*/ -int __init iSeries_allocate_IRQ(HvBusNumber busNumber, - HvSubBusNumber subBusNumber, HvAgentId deviceId) +static void iSeries_enable_IRQ(unsigned int irq) { - unsigned int realirq, virtirq; - u8 idsel = (deviceId >> 4); - u8 function = deviceId & 7; + u32 bus, deviceId, function, mask; + const u32 subBus = 0; + unsigned int rirq = virt_irq_to_real_map[irq]; - virtirq = next_virtual_irq++; - realirq = ((busNumber - 1) << 6) + ((idsel - 1) << 3) + function; - virt_irq_to_real_map[virtirq] = realirq; + /* The IRQ has already been locked by the caller */ + bus = REAL_IRQ_TO_BUS(rirq); + function = REAL_IRQ_TO_FUNC(rirq); + deviceId = (REAL_IRQ_TO_IDSEL(rirq) << 4) + function; - irq_desc[virtirq].handler = &iSeries_IRQ_handler; - return virtirq; + /* Unmask secondary INTA */ + mask = 0x80000000; + HvCallPci_unmaskInterrupts(bus, subBus, deviceId, mask); + PPCDBG(PPCDBG_BUSWALK, "iSeries_enable_IRQ 0x%02X.%02X.%02X 0x%04X\n", + bus, subBus, deviceId, irq); } -#define REAL_IRQ_TO_BUS(irq) ((((irq) >> 6) & 0xff) + 1) -#define REAL_IRQ_TO_IDSEL(irq) ((((irq) >> 3) & 7) + 1) -#define REAL_IRQ_TO_FUNC(irq) ((irq) & 7) - /* This is called by iSeries_activate_IRQs */ static unsigned int iSeries_startup_IRQ(unsigned int irq) { @@ -131,7 +117,7 @@ desc->handler->startup(irq); spin_unlock_irqrestore(&desc->lock, flags); } - } + } } /* this is not called anywhere currently */ @@ -173,29 +159,7 @@ mask = 0x80000000; HvCallPci_maskInterrupts(bus, subBus, deviceId, mask); PPCDBG(PPCDBG_BUSWALK, "iSeries_disable_IRQ 0x%02X.%02X.%02X 0x%04X\n", - bus, subBus, deviceId, irq); -} - -/* - * This will be called by device drivers (via enable_IRQ) - * to enable INTA in the bridge interrupt status register. 
- */ -static void iSeries_enable_IRQ(unsigned int irq) -{ - u32 bus, deviceId, function, mask; - const u32 subBus = 0; - unsigned int rirq = virt_irq_to_real_map[irq]; - - /* The IRQ has already been locked by the caller */ - bus = REAL_IRQ_TO_BUS(rirq); - function = REAL_IRQ_TO_FUNC(rirq); - deviceId = (REAL_IRQ_TO_IDSEL(rirq) << 4) + function; - - /* Unmask secondary INTA */ - mask = 0x80000000; - HvCallPci_unmaskInterrupts(bus, subBus, deviceId, mask); - PPCDBG(PPCDBG_BUSWALK, "iSeries_enable_IRQ 0x%02X.%02X.%02X 0x%04X\n", - bus, subBus, deviceId, irq); + bus, subBus, deviceId, irq); } /* @@ -207,3 +171,32 @@ static void iSeries_end_IRQ(unsigned int irq) { } + +static hw_irq_controller iSeries_IRQ_handler = { + .typename = "iSeries irq controller", + .startup = iSeries_startup_IRQ, + .shutdown = iSeries_shutdown_IRQ, + .enable = iSeries_enable_IRQ, + .disable = iSeries_disable_IRQ, + .end = iSeries_end_IRQ +}; + +/* + * This is called out of iSeries_scan_slot to allocate an IRQ for an EADS slot + * It calculates the irq value for the slot. + * Note that subBusNumber is always 0 (at the moment at least). 
+ */ +int __init iSeries_allocate_IRQ(HvBusNumber busNumber, + HvSubBusNumber subBusNumber, HvAgentId deviceId) +{ + unsigned int realirq, virtirq; + u8 idsel = (deviceId >> 4); + u8 function = deviceId & 7; + + virtirq = next_virtual_irq++; + realirq = ((busNumber - 1) << 6) + ((idsel - 1) << 3) + function; + virt_irq_to_real_map[virtirq] = realirq; + + irq_desc[virtirq].handler = &iSeries_IRQ_handler; + return virtirq; +} diff -ruN linus-iSeries-headers.16/include/asm-ppc64/iSeries/XmPciLpEvent.h linus-iSeries-headers.17/include/asm-ppc64/iSeries/XmPciLpEvent.h --- linus-iSeries-headers.16/include/asm-ppc64/iSeries/XmPciLpEvent.h 2005-06-01 17:15:22.000000000 +1000 +++ linus-iSeries-headers.17/include/asm-ppc64/iSeries/XmPciLpEvent.h 1970-01-01 10:00:00.000000000 +1000 @@ -1,7 +0,0 @@ -#ifndef __XMPCILPEVENT_H__ -#define __XMPCILPEVENT_H__ - -extern int XmPciLpEvent_init(void); -extern void ppc_irq_dispatch_handler(struct pt_regs *regs, int irq); - -#endif /* __XMPCILPEVENT_H__ */ diff -ruN linus-iSeries-headers.16/include/asm-ppc64/iSeries/iSeries_irq.h linus-iSeries-headers.17/include/asm-ppc64/iSeries/iSeries_irq.h --- linus-iSeries-headers.16/include/asm-ppc64/iSeries/iSeries_irq.h 2005-06-01 18:05:36.000000000 +1000 +++ linus-iSeries-headers.17/include/asm-ppc64/iSeries/iSeries_irq.h 2005-06-08 11:25:53.000000000 +1000 @@ -3,7 +3,6 @@ extern void iSeries_init_IRQ(void); extern int iSeries_allocate_IRQ(HvBusNumber, HvSubBusNumber, HvAgentId); -extern int iSeries_assign_IRQ(int, HvBusNumber, HvSubBusNumber, HvAgentId); extern void iSeries_activate_IRQs(void); extern int XmPciLpEvent_init(void); From sfr at canb.auug.org.au Wed Jun 8 17:36:28 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 8 Jun 2005 17:36:28 +1000 Subject: [PATCH 2/4] ppc64 iSeries: remove XmPciLpEvent.c In-Reply-To: <20050608172712.4b56ec5d.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> 
<20050608172712.4b56ec5d.sfr@canb.auug.org.au> Message-ID: <20050608173628.6d4262a0.sfr@canb.auug.org.au> Hi Andrew, This patch just merges XmPciLpEvent.c into iSeries_irq.c (the only caller of its only external function). XmPciLpEvent.c just contained the lowlevel iSeries irq code. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-iSeries-headers.17/arch/ppc64/kernel/Makefile linus-iSeries-headers.18/arch/ppc64/kernel/Makefile --- linus-iSeries-headers.17/arch/ppc64/kernel/Makefile 2005-06-05 03:40:47.000000000 +1000 +++ linus-iSeries-headers.18/arch/ppc64/kernel/Makefile 2005-06-08 12:15:34.000000000 +1000 @@ -21,7 +21,7 @@ obj-$(CONFIG_PCI) += pci.o pci_iommu.o iomap.o $(pci-obj-y) -obj-$(CONFIG_PPC_ISERIES) += iSeries_irq.o XmPciLpEvent.o \ +obj-$(CONFIG_PPC_ISERIES) += iSeries_irq.o \ HvCall.o HvLpConfig.o LparData.o \ iSeries_setup.o ItLpQueue.o hvCall.o \ mf.o HvLpEvent.o iSeries_proc.o iSeries_htab.o \ diff -ruN linus-iSeries-headers.17/arch/ppc64/kernel/XmPciLpEvent.c linus-iSeries-headers.18/arch/ppc64/kernel/XmPciLpEvent.c --- linus-iSeries-headers.17/arch/ppc64/kernel/XmPciLpEvent.c 2005-06-08 12:10:56.000000000 +1000 +++ linus-iSeries-headers.18/arch/ppc64/kernel/XmPciLpEvent.c 1970-01-01 10:00:00.000000000 +1000 @@ -1,183 +0,0 @@ -/* - * File XmPciLpEvent.c created by Wayne Holm on Mon Jan 15 2001. - * - * This module handles PCI interrupt events sent by the iSeries Hypervisor. 
- */ -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include - -static long Pci_Interrupt_Count; -static long Pci_Event_Count; - -enum XmPciLpEvent_Subtype { - XmPciLpEvent_BusCreated = 0, // PHB has been created - XmPciLpEvent_BusError = 1, // PHB has failed - XmPciLpEvent_BusFailed = 2, // Msg to Secondary, Primary failed bus - XmPciLpEvent_NodeFailed = 4, // Multi-adapter bridge has failed - XmPciLpEvent_NodeRecovered = 5, // Multi-adapter bridge has recovered - XmPciLpEvent_BusRecovered = 12, // PHB has been recovered - XmPciLpEvent_UnQuiesceBus = 18, // Secondary bus unqiescing - XmPciLpEvent_BridgeError = 21, // Bridge Error - XmPciLpEvent_SlotInterrupt = 22 // Slot interrupt -}; - -struct XmPciLpEvent_BusInterrupt { - HvBusNumber busNumber; - HvSubBusNumber subBusNumber; -}; - -struct XmPciLpEvent_NodeInterrupt { - HvBusNumber busNumber; - HvSubBusNumber subBusNumber; - HvAgentId deviceId; -}; - -struct XmPciLpEvent { - struct HvLpEvent hvLpEvent; - - union { - u64 alignData; // Align on an 8-byte boundary - - struct { - u32 fisr; - HvBusNumber busNumber; - HvSubBusNumber subBusNumber; - HvAgentId deviceId; - } slotInterrupt; - - struct XmPciLpEvent_BusInterrupt busFailed; - struct XmPciLpEvent_BusInterrupt busRecovered; - struct XmPciLpEvent_BusInterrupt busCreated; - - struct XmPciLpEvent_NodeInterrupt nodeFailed; - struct XmPciLpEvent_NodeInterrupt nodeRecovered; - - } eventData; - -}; - -static void intReceived(struct XmPciLpEvent *eventParm, - struct pt_regs *regsParm) -{ - int irq; - - ++Pci_Interrupt_Count; -#if 0 - PPCDBG(PPCDBG_BUSWALK, "PCI: XmPciLpEvent.c: intReceived\n"); -#endif - - switch (eventParm->hvLpEvent.xSubtype) { - case XmPciLpEvent_SlotInterrupt: - irq = eventParm->hvLpEvent.xCorrelationToken; - /* Dispatch the interrupt handlers for this irq */ - ppc_irq_dispatch_handler(regsParm, irq); - HvCallPci_eoi(eventParm->eventData.slotInterrupt.busNumber, - 
eventParm->eventData.slotInterrupt.subBusNumber, - eventParm->eventData.slotInterrupt.deviceId); - break; - /* Ignore error recovery events for now */ - case XmPciLpEvent_BusCreated: - printk(KERN_INFO "XmPciLpEvent.c: system bus %d created\n", - eventParm->eventData.busCreated.busNumber); - break; - case XmPciLpEvent_BusError: - case XmPciLpEvent_BusFailed: - printk(KERN_INFO "XmPciLpEvent.c: system bus %d failed\n", - eventParm->eventData.busFailed.busNumber); - break; - case XmPciLpEvent_BusRecovered: - case XmPciLpEvent_UnQuiesceBus: - printk(KERN_INFO "XmPciLpEvent.c: system bus %d recovered\n", - eventParm->eventData.busRecovered.busNumber); - break; - case XmPciLpEvent_NodeFailed: - case XmPciLpEvent_BridgeError: - printk(KERN_INFO - "XmPciLpEvent.c: multi-adapter bridge %d/%d/%d failed\n", - eventParm->eventData.nodeFailed.busNumber, - eventParm->eventData.nodeFailed.subBusNumber, - eventParm->eventData.nodeFailed.deviceId); - break; - case XmPciLpEvent_NodeRecovered: - printk(KERN_INFO - "XmPciLpEvent.c: multi-adapter bridge %d/%d/%d recovered\n", - eventParm->eventData.nodeRecovered.busNumber, - eventParm->eventData.nodeRecovered.subBusNumber, - eventParm->eventData.nodeRecovered.deviceId); - break; - default: - printk(KERN_ERR - "XmPciLpEvent.c: unrecognized event subtype 0x%x\n", - eventParm->hvLpEvent.xSubtype); - break; - } -} - -static void XmPciLpEvent_handler(struct HvLpEvent *eventParm, - struct pt_regs *regsParm) -{ -#ifdef CONFIG_PCI -#if 0 - PPCDBG(PPCDBG_BUSWALK, "XmPciLpEvent_handler, type 0x%x\n", - eventParm->xType); -#endif - ++Pci_Event_Count; - - if (eventParm && (eventParm->xType == HvLpEvent_Type_PciIo)) { - switch (eventParm->xFlags.xFunction) { - case HvLpEvent_Function_Int: - intReceived((struct XmPciLpEvent *)eventParm, regsParm); - break; - case HvLpEvent_Function_Ack: - printk(KERN_ERR - "XmPciLpEvent.c: unexpected ack received\n"); - break; - default: - printk(KERN_ERR - "XmPciLpEvent.c: unexpected event function %d\n", - 
(int)eventParm->xFlags.xFunction); - break; - } - } else if (eventParm) - printk(KERN_ERR - "XmPciLpEvent.c: Unrecognized PCI event type 0x%x\n", - (int)eventParm->xType); - else - printk(KERN_ERR "XmPciLpEvent.c: NULL event received\n"); -#endif -} - -/* This should be called sometime prior to buswalk (init_IRQ would be good) */ -int XmPciLpEvent_init() -{ - int xRc; - - PPCDBG(PPCDBG_BUSWALK, - "XmPciLpEvent_init, Register Event type 0x%04X\n", - HvLpEvent_Type_PciIo); - - xRc = HvLpEvent_registerHandler(HvLpEvent_Type_PciIo, - &XmPciLpEvent_handler); - if (xRc == 0) { - xRc = HvLpEvent_openPath(HvLpEvent_Type_PciIo, 0); - if (xRc != 0) - printk(KERN_ERR "XmPciLpEvent.c: open event path " - "failed with rc 0x%x\n", xRc); - } else - printk(KERN_ERR "XmPciLpEvent.c: register handler " - "failed with rc 0x%x\n", xRc); - return xRc; -} diff -ruN linus-iSeries-headers.17/arch/ppc64/kernel/iSeries_irq.c linus-iSeries-headers.18/arch/ppc64/kernel/iSeries_irq.c --- linus-iSeries-headers.17/arch/ppc64/kernel/iSeries_irq.c 2005-06-08 12:05:30.000000000 +1000 +++ linus-iSeries-headers.18/arch/ppc64/kernel/iSeries_irq.c 2005-06-08 12:15:02.000000000 +1000 @@ -1,6 +1,7 @@ /* * This module supports the iSeries PCI bus interrupt handling * Copyright (C) 20yy + * Copyright (C) 2004-2005 IBM Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -22,6 +23,7 @@ * Created, December 13, 2000 by Wayne Holm * End Change Activity */ +#include #include #include #include @@ -30,11 +32,12 @@ #include #include #include - #include #include -#include +#include +#include +#include #include #include #include @@ -46,6 +49,169 @@ /* Note: the pcnet32 driver assumes irq numbers < 2 aren't valid. 
:( */ static int next_virtual_irq = 2; +static long Pci_Interrupt_Count; +static long Pci_Event_Count; + +enum XmPciLpEvent_Subtype { + XmPciLpEvent_BusCreated = 0, // PHB has been created + XmPciLpEvent_BusError = 1, // PHB has failed + XmPciLpEvent_BusFailed = 2, // Msg to Secondary, Primary failed bus + XmPciLpEvent_NodeFailed = 4, // Multi-adapter bridge has failed + XmPciLpEvent_NodeRecovered = 5, // Multi-adapter bridge has recovered + XmPciLpEvent_BusRecovered = 12, // PHB has been recovered + XmPciLpEvent_UnQuiesceBus = 18, // Secondary bus unqiescing + XmPciLpEvent_BridgeError = 21, // Bridge Error + XmPciLpEvent_SlotInterrupt = 22 // Slot interrupt +}; + +struct XmPciLpEvent_BusInterrupt { + HvBusNumber busNumber; + HvSubBusNumber subBusNumber; +}; + +struct XmPciLpEvent_NodeInterrupt { + HvBusNumber busNumber; + HvSubBusNumber subBusNumber; + HvAgentId deviceId; +}; + +struct XmPciLpEvent { + struct HvLpEvent hvLpEvent; + + union { + u64 alignData; // Align on an 8-byte boundary + + struct { + u32 fisr; + HvBusNumber busNumber; + HvSubBusNumber subBusNumber; + HvAgentId deviceId; + } slotInterrupt; + + struct XmPciLpEvent_BusInterrupt busFailed; + struct XmPciLpEvent_BusInterrupt busRecovered; + struct XmPciLpEvent_BusInterrupt busCreated; + + struct XmPciLpEvent_NodeInterrupt nodeFailed; + struct XmPciLpEvent_NodeInterrupt nodeRecovered; + + } eventData; + +}; + +static void intReceived(struct XmPciLpEvent *eventParm, + struct pt_regs *regsParm) +{ + int irq; + + ++Pci_Interrupt_Count; +#if 0 + PPCDBG(PPCDBG_BUSWALK, "PCI: XmPciLpEvent.c: intReceived\n"); +#endif + + switch (eventParm->hvLpEvent.xSubtype) { + case XmPciLpEvent_SlotInterrupt: + irq = eventParm->hvLpEvent.xCorrelationToken; + /* Dispatch the interrupt handlers for this irq */ + ppc_irq_dispatch_handler(regsParm, irq); + HvCallPci_eoi(eventParm->eventData.slotInterrupt.busNumber, + eventParm->eventData.slotInterrupt.subBusNumber, + eventParm->eventData.slotInterrupt.deviceId); + break; + 
/* Ignore error recovery events for now */ + case XmPciLpEvent_BusCreated: + printk(KERN_INFO "XmPciLpEvent.c: system bus %d created\n", + eventParm->eventData.busCreated.busNumber); + break; + case XmPciLpEvent_BusError: + case XmPciLpEvent_BusFailed: + printk(KERN_INFO "XmPciLpEvent.c: system bus %d failed\n", + eventParm->eventData.busFailed.busNumber); + break; + case XmPciLpEvent_BusRecovered: + case XmPciLpEvent_UnQuiesceBus: + printk(KERN_INFO "XmPciLpEvent.c: system bus %d recovered\n", + eventParm->eventData.busRecovered.busNumber); + break; + case XmPciLpEvent_NodeFailed: + case XmPciLpEvent_BridgeError: + printk(KERN_INFO + "XmPciLpEvent.c: multi-adapter bridge %d/%d/%d failed\n", + eventParm->eventData.nodeFailed.busNumber, + eventParm->eventData.nodeFailed.subBusNumber, + eventParm->eventData.nodeFailed.deviceId); + break; + case XmPciLpEvent_NodeRecovered: + printk(KERN_INFO + "XmPciLpEvent.c: multi-adapter bridge %d/%d/%d recovered\n", + eventParm->eventData.nodeRecovered.busNumber, + eventParm->eventData.nodeRecovered.subBusNumber, + eventParm->eventData.nodeRecovered.deviceId); + break; + default: + printk(KERN_ERR + "XmPciLpEvent.c: unrecognized event subtype 0x%x\n", + eventParm->hvLpEvent.xSubtype); + break; + } +} + +static void XmPciLpEvent_handler(struct HvLpEvent *eventParm, + struct pt_regs *regsParm) +{ +#ifdef CONFIG_PCI +#if 0 + PPCDBG(PPCDBG_BUSWALK, "XmPciLpEvent_handler, type 0x%x\n", + eventParm->xType); +#endif + ++Pci_Event_Count; + + if (eventParm && (eventParm->xType == HvLpEvent_Type_PciIo)) { + switch (eventParm->xFlags.xFunction) { + case HvLpEvent_Function_Int: + intReceived((struct XmPciLpEvent *)eventParm, regsParm); + break; + case HvLpEvent_Function_Ack: + printk(KERN_ERR + "XmPciLpEvent.c: unexpected ack received\n"); + break; + default: + printk(KERN_ERR + "XmPciLpEvent.c: unexpected event function %d\n", + (int)eventParm->xFlags.xFunction); + break; + } + } else if (eventParm) + printk(KERN_ERR + "XmPciLpEvent.c: 
Unrecognized PCI event type 0x%x\n", + (int)eventParm->xType); + else + printk(KERN_ERR "XmPciLpEvent.c: NULL event received\n"); +#endif +} + +/* This should be called sometime prior to buswalk (init_IRQ would be good) */ +int XmPciLpEvent_init() +{ + int xRc; + + PPCDBG(PPCDBG_BUSWALK, + "XmPciLpEvent_init, Register Event type 0x%04X\n", + HvLpEvent_Type_PciIo); + + xRc = HvLpEvent_registerHandler(HvLpEvent_Type_PciIo, + &XmPciLpEvent_handler); + if (xRc == 0) { + xRc = HvLpEvent_openPath(HvLpEvent_Type_PciIo, 0); + if (xRc != 0) + printk(KERN_ERR "XmPciLpEvent.c: open event path " + "failed with rc 0x%x\n", xRc); + } else + printk(KERN_ERR "XmPciLpEvent.c: register handler " + "failed with rc 0x%x\n", xRc); + return xRc; +} + /* This is called by init_IRQ. set in ppc_md.init_IRQ by iSeries_setup.c */ void __init iSeries_init_IRQ(void) { From sfr at canb.auug.org.au Wed Jun 8 17:39:49 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 8 Jun 2005 17:39:49 +1000 Subject: [PATCH 3/4] ppc64 iSeries: tidy up irq code after merge In-Reply-To: <20050608172712.4b56ec5d.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> <20050608172712.4b56ec5d.sfr@canb.auug.org.au> Message-ID: <20050608173949.08b7e02c.sfr@canb.auug.org.au> Hi Andrew, This patch just removes some dead code, fixes messages that referred to the file this code used to be in and inserts XmPciLpEvent_init into its caller. 
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-iSeries-headers.18/arch/ppc64/kernel/iSeries_irq.c linus-iSeries-headers.19/arch/ppc64/kernel/iSeries_irq.c --- linus-iSeries-headers.18/arch/ppc64/kernel/iSeries_irq.c 2005-06-08 12:15:02.000000000 +1000 +++ linus-iSeries-headers.19/arch/ppc64/kernel/iSeries_irq.c 2005-06-08 13:14:42.000000000 +1000 @@ -105,9 +105,6 @@ int irq; ++Pci_Interrupt_Count; -#if 0 - PPCDBG(PPCDBG_BUSWALK, "PCI: XmPciLpEvent.c: intReceived\n"); -#endif switch (eventParm->hvLpEvent.xSubtype) { case XmPciLpEvent_SlotInterrupt: @@ -120,37 +117,37 @@ break; /* Ignore error recovery events for now */ case XmPciLpEvent_BusCreated: - printk(KERN_INFO "XmPciLpEvent.c: system bus %d created\n", + printk(KERN_INFO "intReceived: system bus %d created\n", eventParm->eventData.busCreated.busNumber); break; case XmPciLpEvent_BusError: case XmPciLpEvent_BusFailed: - printk(KERN_INFO "XmPciLpEvent.c: system bus %d failed\n", + printk(KERN_INFO "intReceived: system bus %d failed\n", eventParm->eventData.busFailed.busNumber); break; case XmPciLpEvent_BusRecovered: case XmPciLpEvent_UnQuiesceBus: - printk(KERN_INFO "XmPciLpEvent.c: system bus %d recovered\n", + printk(KERN_INFO "intReceived: system bus %d recovered\n", eventParm->eventData.busRecovered.busNumber); break; case XmPciLpEvent_NodeFailed: case XmPciLpEvent_BridgeError: printk(KERN_INFO - "XmPciLpEvent.c: multi-adapter bridge %d/%d/%d failed\n", + "intReceived: multi-adapter bridge %d/%d/%d failed\n", eventParm->eventData.nodeFailed.busNumber, eventParm->eventData.nodeFailed.subBusNumber, eventParm->eventData.nodeFailed.deviceId); break; case XmPciLpEvent_NodeRecovered: printk(KERN_INFO - "XmPciLpEvent.c: multi-adapter bridge %d/%d/%d recovered\n", + "intReceived: multi-adapter bridge %d/%d/%d recovered\n", eventParm->eventData.nodeRecovered.busNumber, eventParm->eventData.nodeRecovered.subBusNumber, 
eventParm->eventData.nodeRecovered.deviceId); break; default: printk(KERN_ERR - "XmPciLpEvent.c: unrecognized event subtype 0x%x\n", + "intReceived: unrecognized event subtype 0x%x\n", eventParm->hvLpEvent.xSubtype); break; } @@ -160,10 +157,6 @@ struct pt_regs *regsParm) { #ifdef CONFIG_PCI -#if 0 - PPCDBG(PPCDBG_BUSWALK, "XmPciLpEvent_handler, type 0x%x\n", - eventParm->xType); -#endif ++Pci_Event_Count; if (eventParm && (eventParm->xType == HvLpEvent_Type_PciIo)) { @@ -173,50 +166,42 @@ break; case HvLpEvent_Function_Ack: printk(KERN_ERR - "XmPciLpEvent.c: unexpected ack received\n"); + "XmPciLpEvent_handler: unexpected ack received\n"); break; default: printk(KERN_ERR - "XmPciLpEvent.c: unexpected event function %d\n", + "XmPciLpEvent_handler: unexpected event function %d\n", (int)eventParm->xFlags.xFunction); break; } } else if (eventParm) printk(KERN_ERR - "XmPciLpEvent.c: Unrecognized PCI event type 0x%x\n", + "XmPciLpEvent_handler: Unrecognized PCI event type 0x%x\n", (int)eventParm->xType); else - printk(KERN_ERR "XmPciLpEvent.c: NULL event received\n"); + printk(KERN_ERR "XmPciLpEvent_handler: NULL event received\n"); #endif } -/* This should be called sometime prior to buswalk (init_IRQ would be good) */ -int XmPciLpEvent_init() +/* + * This is called by init_IRQ. set in ppc_md.init_IRQ by iSeries_setup.c + * It must be called before the bus walk. 
+ */ +void __init iSeries_init_IRQ(void) { + /* Register PCI event handler and open an event path */ int xRc; - PPCDBG(PPCDBG_BUSWALK, - "XmPciLpEvent_init, Register Event type 0x%04X\n", - HvLpEvent_Type_PciIo); - xRc = HvLpEvent_registerHandler(HvLpEvent_Type_PciIo, &XmPciLpEvent_handler); if (xRc == 0) { xRc = HvLpEvent_openPath(HvLpEvent_Type_PciIo, 0); if (xRc != 0) - printk(KERN_ERR "XmPciLpEvent.c: open event path " + printk(KERN_ERR "iSeries_init_IRQ: open event path " "failed with rc 0x%x\n", xRc); } else - printk(KERN_ERR "XmPciLpEvent.c: register handler " + printk(KERN_ERR "iSeries_init_IRQ: register handler " "failed with rc 0x%x\n", xRc); - return xRc; -} - -/* This is called by init_IRQ. set in ppc_md.init_IRQ by iSeries_setup.c */ -void __init iSeries_init_IRQ(void) -{ - /* Register PCI event handler and open an event path */ - XmPciLpEvent_init(); } #define REAL_IRQ_TO_BUS(irq) ((((irq) >> 6) & 0xff) + 1) diff -ruN linus-iSeries-headers.18/include/asm-ppc64/iSeries/iSeries_irq.h linus-iSeries-headers.19/include/asm-ppc64/iSeries/iSeries_irq.h --- linus-iSeries-headers.18/include/asm-ppc64/iSeries/iSeries_irq.h 2005-06-08 11:25:53.000000000 +1000 +++ linus-iSeries-headers.19/include/asm-ppc64/iSeries/iSeries_irq.h 2005-06-08 13:15:49.000000000 +1000 @@ -5,6 +5,4 @@ extern int iSeries_allocate_IRQ(HvBusNumber, HvSubBusNumber, HvAgentId); extern void iSeries_activate_IRQs(void); -extern int XmPciLpEvent_init(void); - #endif /* __ISERIES_IRQ_H__ */ From sfr at canb.auug.org.au Wed Jun 8 17:43:04 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 8 Jun 2005 17:43:04 +1000 Subject: [PATCH 4/4] ppc64 iSeries: allow build with no PCI In-Reply-To: <20050608172712.4b56ec5d.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> <20050608172712.4b56ec5d.sfr@canb.auug.org.au> Message-ID: <20050608174304.0525b9de.sfr@canb.auug.org.au> Hi Andrew, This patch allows iSeries to 
build with CONFIG_PCI=n. This is useful for partitions that have only virtual I/O. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-iSeries-headers.19/arch/ppc64/Kconfig linus-iSeries-headers.20/arch/ppc64/Kconfig --- linus-iSeries-headers.19/arch/ppc64/Kconfig 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.20/arch/ppc64/Kconfig 2005-06-08 14:57:11.000000000 +1000 @@ -323,7 +323,7 @@ bool config PCI - bool + bool "support for PCI devices" if (EMBEDDED && PPC_ISERIES) default y help Find out whether your system includes a PCI bus. PCI is the name of diff -ruN linus-iSeries-headers.19/arch/ppc64/kernel/Makefile linus-iSeries-headers.20/arch/ppc64/kernel/Makefile --- linus-iSeries-headers.19/arch/ppc64/kernel/Makefile 2005-06-08 12:15:34.000000000 +1000 +++ linus-iSeries-headers.20/arch/ppc64/kernel/Makefile 2005-06-08 15:06:13.000000000 +1000 @@ -16,13 +16,13 @@ obj-$(CONFIG_PPC_OF) += of_device.o -pci-obj-$(CONFIG_PPC_ISERIES) += iSeries_pci.o iSeries_VpdInfo.o +pci-obj-$(CONFIG_PPC_ISERIES) += iSeries_pci.o iSeries_irq.o \ + iSeries_VpdInfo.o pci-obj-$(CONFIG_PPC_MULTIPLATFORM) += pci_dn.o pci_direct_iommu.o obj-$(CONFIG_PCI) += pci.o pci_iommu.o iomap.o $(pci-obj-y) -obj-$(CONFIG_PPC_ISERIES) += iSeries_irq.o \ - HvCall.o HvLpConfig.o LparData.o \ +obj-$(CONFIG_PPC_ISERIES) += HvCall.o HvLpConfig.o LparData.o \ iSeries_setup.o ItLpQueue.o hvCall.o \ mf.o HvLpEvent.o iSeries_proc.o iSeries_htab.o \ iSeries_iommu.o diff -ruN linus-iSeries-headers.19/arch/ppc64/kernel/dma.c linus-iSeries-headers.20/arch/ppc64/kernel/dma.c --- linus-iSeries-headers.19/arch/ppc64/kernel/dma.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.20/arch/ppc64/kernel/dma.c 2005-06-07 16:42:36.000000000 +1000 @@ -15,8 +15,10 @@ static struct dma_mapping_ops *get_dma_ops(struct device *dev) { +#ifdef CONFIG_PCI if (dev->bus == &pci_bus_type) return &pci_dma_ops; +#endif #ifdef 
CONFIG_IBMVIO if (dev->bus == &vio_bus_type) return &vio_dma_ops; @@ -37,8 +39,10 @@ int dma_set_mask(struct device *dev, u64 dma_mask) { +#ifdef CONFIG_PCI if (dev->bus == &pci_bus_type) return pci_set_dma_mask(to_pci_dev(dev), dma_mask); +#endif #ifdef CONFIG_IBMVIO if (dev->bus == &vio_bus_type) return -EIO; diff -ruN linus-iSeries-headers.19/arch/ppc64/kernel/iSeries_iommu.c linus-iSeries-headers.20/arch/ppc64/kernel/iSeries_iommu.c --- linus-iSeries-headers.19/arch/ppc64/kernel/iSeries_iommu.c 2005-05-20 09:03:13.000000000 +1000 +++ linus-iSeries-headers.20/arch/ppc64/kernel/iSeries_iommu.c 2005-06-08 15:18:06.000000000 +1000 @@ -83,7 +83,7 @@ } } - +#ifdef CONFIG_PCI /* * This function compares the known tables to find an iommu_table * that has already been built for hardware TCEs. @@ -159,6 +159,7 @@ else kfree(tbl); } +#endif static void iommu_dev_setup_iSeries(struct pci_dev *dev) { } static void iommu_bus_setup_iSeries(struct pci_bus *bus) { } diff -ruN linus-iSeries-headers.19/arch/ppc64/kernel/iSeries_setup.c linus-iSeries-headers.20/arch/ppc64/kernel/iSeries_setup.c --- linus-iSeries-headers.19/arch/ppc64/kernel/iSeries_setup.c 2005-06-03 09:34:53.000000000 +1000 +++ linus-iSeries-headers.20/arch/ppc64/kernel/iSeries_setup.c 2005-06-08 15:53:04.000000000 +1000 @@ -76,7 +76,11 @@ static void build_iSeries_Memory_Map(void); static void setup_iSeries_cache_sizes(void); static void iSeries_bolt_kernel(unsigned long saddr, unsigned long eaddr); +#ifdef CONFIG_PCI extern void iSeries_pci_final_fixup(void); +#else +static void iSeries_pci_final_fixup(void) { } +#endif /* Global Variables */ static unsigned long procFreqHz; @@ -876,6 +880,10 @@ } __setup("spread_lpevents=", set_spread_lpevents); +#ifndef CONFIG_PCI +void __init iSeries_init_IRQ(void) { } +#endif + void __init iSeries_early_setup(void) { iSeries_fixup_klimit(); diff -ruN linus-iSeries-headers.19/arch/ppc64/kernel/sys_ppc32.c linus-iSeries-headers.20/arch/ppc64/kernel/sys_ppc32.c --- 
linus-iSeries-headers.19/arch/ppc64/kernel/sys_ppc32.c 2005-05-20 09:03:14.000000000 +1000 +++ linus-iSeries-headers.20/arch/ppc64/kernel/sys_ppc32.c 2005-06-08 15:05:31.000000000 +1000 @@ -741,6 +741,7 @@ asmlinkage int sys32_pciconfig_iobase(u32 which, u32 in_bus, u32 in_devfn) { +#ifdef CONFIG_PCI struct pci_controller* hose; struct list_head *ln; struct pci_bus *bus = NULL; @@ -786,7 +787,7 @@ case IOBASE_ISA_MEM: return -EINVAL; } - +#endif /* CONFIG_PCI */ return -EOPNOTSUPP; } diff -ruN linus-iSeries-headers.19/arch/ppc64/lib/Makefile linus-iSeries-headers.20/arch/ppc64/lib/Makefile --- linus-iSeries-headers.19/arch/ppc64/lib/Makefile 2005-05-20 09:03:15.000000000 +1000 +++ linus-iSeries-headers.20/arch/ppc64/lib/Makefile 2005-06-08 15:09:37.000000000 +1000 @@ -12,7 +12,7 @@ # e2a provides EBCDIC to ASCII conversions. ifdef CONFIG_PPC_ISERIES -obj-$(CONFIG_PCI) += e2a.o +obj-y += e2a.o endif lib-$(CONFIG_DEBUG_KERNEL) += sstep.o diff -ruN linus-iSeries-headers.19/drivers/char/mem.c linus-iSeries-headers.20/drivers/char/mem.c --- linus-iSeries-headers.19/drivers/char/mem.c 2005-05-20 09:03:47.000000000 +1000 +++ linus-iSeries-headers.20/drivers/char/mem.c 2005-06-08 15:44:27.000000000 +1000 @@ -484,7 +484,7 @@ return virtr + wrote; } -#if defined(CONFIG_ISA) || !defined(__mc68000__) +#if (defined(CONFIG_ISA) || !defined(__mc68000__)) && (!defined(CONFIG_PPC_ISERIES) || defined(CONFIG_PCI)) static ssize_t read_port(struct file * file, char __user * buf, size_t count, loff_t *ppos) { @@ -744,7 +744,7 @@ .write = write_null, }; -#if defined(CONFIG_ISA) || !defined(__mc68000__) +#if (defined(CONFIG_ISA) || !defined(__mc68000__)) && (!defined(CONFIG_PPC_ISERIES) || defined(CONFIG_PCI)) static struct file_operations port_fops = { .llseek = memory_lseek, .read = read_port, @@ -804,7 +804,7 @@ case 3: filp->f_op = &null_fops; break; -#if defined(CONFIG_ISA) || !defined(__mc68000__) +#if (defined(CONFIG_ISA) || !defined(__mc68000__)) && (!defined(CONFIG_PPC_ISERIES) 
|| defined(CONFIG_PCI)) case 4: filp->f_op = &port_fops; break; @@ -846,7 +846,7 @@ {1, "mem", S_IRUSR | S_IWUSR | S_IRGRP, &mem_fops}, {2, "kmem", S_IRUSR | S_IWUSR | S_IRGRP, &kmem_fops}, {3, "null", S_IRUGO | S_IWUGO, &null_fops}, -#if defined(CONFIG_ISA) || !defined(__mc68000__) +#if (defined(CONFIG_ISA) || !defined(__mc68000__)) && (!defined(CONFIG_PPC_ISERIES) || defined(CONFIG_PCI)) {4, "port", S_IRUSR | S_IWUSR | S_IRGRP, &port_fops}, #endif {5, "zero", S_IRUGO | S_IWUGO, &zero_fops}, diff -ruN linus-iSeries-headers.19/drivers/serial/Kconfig linus-iSeries-headers.20/drivers/serial/Kconfig --- linus-iSeries-headers.19/drivers/serial/Kconfig 2005-05-20 09:04:37.000000000 +1000 +++ linus-iSeries-headers.20/drivers/serial/Kconfig 2005-06-07 16:42:36.000000000 +1000 @@ -753,7 +753,7 @@ config SERIAL_ICOM tristate "IBM Multiport Serial Adapter" - depends on PPC_ISERIES || PPC_PSERIES + depends on PCI && (PPC_ISERIES || PPC_PSERIES) select SERIAL_CORE help This driver is for a family of multiport serial adapters diff -ruN linus-iSeries-headers.19/include/asm-ppc64/dma.h linus-iSeries-headers.20/include/asm-ppc64/dma.h --- linus-iSeries-headers.19/include/asm-ppc64/dma.h 2005-05-20 09:05:54.000000000 +1000 +++ linus-iSeries-headers.20/include/asm-ppc64/dma.h 2005-06-08 15:51:09.000000000 +1000 @@ -27,6 +27,8 @@ /* Doesn't really apply... 
*/ #define MAX_DMA_ADDRESS (~0UL) +#if !defined(CONFIG_PPC_ISERIES) || defined(CONFIG_PCI) + #define dma_outb outb #define dma_inb inb @@ -323,4 +325,5 @@ #else #define isa_dma_bridge_buggy (0) #endif +#endif /* !defined(CONFIG_PPC_ISERIES) || defined(CONFIG_PCI) */ #endif /* _ASM_DMA_H */ diff -ruN linus-iSeries-headers.19/include/asm-ppc64/iommu.h linus-iSeries-headers.20/include/asm-ppc64/iommu.h --- linus-iSeries-headers.19/include/asm-ppc64/iommu.h 2005-06-04 17:13:20.000000000 +1000 +++ linus-iSeries-headers.20/include/asm-ppc64/iommu.h 2005-06-08 15:08:52.000000000 +1000 @@ -137,8 +137,12 @@ extern void iommu_init_early_iSeries(void); extern void iommu_init_early_u3(void); +#ifdef CONFIG_PCI extern void pci_iommu_init(void); extern void pci_direct_iommu_init(void); +#else +static inline void pci_iommu_init(void) { } +#endif extern void alloc_u3_dart_table(void); From hch at lst.de Wed Jun 8 19:06:05 2005 From: hch at lst.de (Christoph Hellwig) Date: Wed, 8 Jun 2005 11:06:05 +0200 Subject: [PATCH 4/4] ppc64 iSeries: allow build with no PCI In-Reply-To: <20050608174304.0525b9de.sfr@canb.auug.org.au> References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> <20050608172712.4b56ec5d.sfr@canb.auug.org.au> <20050608174304.0525b9de.sfr@canb.auug.org.au> Message-ID: <20050608090605.GA13860@lst.de> On Wed, Jun 08, 2005 at 05:43:04PM +1000, Stephen Rothwell wrote: > Hi Andrew, > > This patch allows iSeries to build with CONFIG_PCI=n. This is useful for partitions that have only virtual I/O. 
>
> Signed-off-by: Stephen Rothwell
> --
> Cheers,
> Stephen Rothwell sfr at canb.auug.org.au
> http://www.canb.auug.org.au/~sfr/
>
> diff -ruN linus-iSeries-headers.19/arch/ppc64/Kconfig linus-iSeries-headers.20/arch/ppc64/Kconfig
> --- linus-iSeries-headers.19/arch/ppc64/Kconfig 2005-05-20 09:03:13.000000000 +1000
> +++ linus-iSeries-headers.20/arch/ppc64/Kconfig 2005-06-08 14:57:11.000000000 +1000
> @@ -323,7 +323,7 @@
>  	bool
>
>  config PCI
> -	bool
> +	bool "support for PCI devices" if (EMBEDDED && PPC_ISERIES)

and what exactly makes an ISERIES an embedded device? Please make this
just if PPC_ISERIES and provide a sane helptext.

From paulus at samba.org Wed Jun 8 20:46:49 2005
From: paulus at samba.org (Paul Mackerras)
Date: Wed, 8 Jun 2005 20:46:49 +1000
Subject: [PATCH 4/4] ppc64 iSeries: allow build with no PCI
In-Reply-To: <20050608090605.GA13860@lst.de>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au> <20050606161415.055dce39.sfr@canb.auug.org.au> <20050608172712.4b56ec5d.sfr@canb.auug.org.au> <20050608174304.0525b9de.sfr@canb.auug.org.au> <20050608090605.GA13860@lst.de>
Message-ID: <17062.52377.194047.272117@cargo.ozlabs.ibm.com>

Christoph Hellwig writes:

> > config PCI
> > -	bool
> > +	bool "support for PCI devices" if (EMBEDDED && PPC_ISERIES)
>
> and what exactly makes an ISERIES an embedded device? Please make this
> just if PPC_ISERIES and provide a sane helptext.

This means you get the chance to turn *off* PCI if you turn on
EMBEDDED. That is consistent with other uses of EMBEDDED; for
example, if you want to build the kernel for your iSeries partition
without VT, you have to turn on EMBEDDED. In other words, EMBEDDED
isn't really about embedded systems. :) It would be nice to have a
better name, but for now Stephen's patch is reasonable (although we
could make it just "if EMBEDDED" instead).

Paul.
From hch at lst.de  Thu Jun  9 01:27:12 2005
From: hch at lst.de (Christoph Hellwig)
Date: Wed, 8 Jun 2005 17:27:12 +0200
Subject: [PATCH 4/4] ppc64 iSeries: allow build with no PCI
In-Reply-To: <17062.52377.194047.272117@cargo.ozlabs.ibm.com>
References: <20050603175819.3d143a07.sfr@canb.auug.org.au>
	<20050606161415.055dce39.sfr@canb.auug.org.au>
	<20050608172712.4b56ec5d.sfr@canb.auug.org.au>
	<20050608174304.0525b9de.sfr@canb.auug.org.au>
	<20050608090605.GA13860@lst.de>
	<17062.52377.194047.272117@cargo.ozlabs.ibm.com>
Message-ID: <20050608152712.GA21467@lst.de>

On Wed, Jun 08, 2005 at 08:46:49PM +1000, Paul Mackerras wrote:
> Christoph Hellwig writes:
>
> > > config PCI
> > > -	bool
> > > +	bool "support for PCI devices" if (EMBEDDED && PPC_ISERIES)
> >
> > and what exactly makes an ISERIES an embedded device?  Please make this
> > just if PPC_ISERIES and provide a sane helptext.
>
> This means you get the chance to turn *off* PCI if you turn on
> EMBEDDED.  That is consistent with other uses of EMBEDDED; for
> example, if you want to build the kernel for your iSeries partition
> without VT, you have to turn on EMBEDDED.  In other words, EMBEDDED
> isn't really about embedded systems. :)  It would be nice to have a
> better name, but for now Stephen's patch is reasonable (although we
> could make it just "if EMBEDDED" instead).

From reading the mail, it's a totally reasonable choice to turn off PCI
because there are normal iSeries configs that don't need it.  EMBEDDED is
only for really odd configs, whatever that means (a little more than usual
on x86, thanks to Linus).  Note that you can turn PCI off easily without
EMBEDDED on most platforms, including x86.

From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 22:48:15 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Thu, 09 Jun 2005 21:48:15 +0900 Subject: [PATCH 01/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <42A83A8F.9020503@jp.fujitsu.com> [This is 1 of 10 patches, "iochk-01-generic.patch"] - It defines: a pair of function : iochk_clear and iochk_read a function for init : iochk_init type of control var : iocookie and describe "no-ops" as its "generic" action. - HAVE_ARCH_IOMAP_CHECK allows us to change whole definition of these functions and type from generic one to specific one. See next patch (2 of 10). Signed-off-by: Hidetoshi Seto --- drivers/pci/pci.c | 2 ++ include/asm-generic/iomap.h | 16 ++++++++++++++++ lib/iomap.c | 26 ++++++++++++++++++++++++++ 3 files changed, 44 insertions(+) Index: linux-2.6.11.11/lib/iomap.c =================================================================== --- linux-2.6.11.11.orig/lib/iomap.c +++ linux-2.6.11.11/lib/iomap.c @@ -210,3 +210,29 @@ void pci_iounmap(struct pci_dev *dev, vo } EXPORT_SYMBOL(pci_iomap); EXPORT_SYMBOL(pci_iounmap); + +/* + * Clear/Read iocookie to check IO error while using iomap. + * + * Note that default iochk_clear-read pair interfaces don't have + * any effective error check, but some high-reliable platforms + * would provide useful information to you. + * And note that some action may be limited (ex. irq-unsafe) + * between the pair depend on the facility of the platform. 
+ */ +#ifndef HAVE_ARCH_IOMAP_CHECK +void iochk_init(void) { ; } + +void iochk_clear(iocookie *cookie, struct pci_dev *dev) +{ + /* no-ops */ +} + +int iochk_read(iocookie *cookie) +{ + /* no-ops */ + return 0; +} +EXPORT_SYMBOL(iochk_clear); +EXPORT_SYMBOL(iochk_read); +#endif /* HAVE_ARCH_IOMAP_CHECK */ Index: linux-2.6.11.11/include/asm-generic/iomap.h =================================================================== --- linux-2.6.11.11.orig/include/asm-generic/iomap.h +++ linux-2.6.11.11/include/asm-generic/iomap.h @@ -60,4 +60,20 @@ struct pci_dev; extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max); extern void pci_iounmap(struct pci_dev *dev, void __iomem *); +/* + * IOMAP_CHECK provides additional interfaces for drivers to detect + * some IO errors, supports drivers having ability to recover errors. + * + * All works around iomap-check depends on the design of "iocookie" + * structure. Every architecture owning its iomap-check is free to + * define the actual design of iocookie to fit its special style. 
+ */ +#ifndef HAVE_ARCH_IOMAP_CHECK +typedef unsigned long iocookie; +#endif + +extern void iochk_init(void); +extern void iochk_clear(iocookie *cookie, struct pci_dev *dev); +extern int iochk_read(iocookie *cookie); + #endif Index: linux-2.6.11.11/drivers/pci/pci.c =================================================================== --- linux-2.6.11.11.orig/drivers/pci/pci.c +++ linux-2.6.11.11/drivers/pci/pci.c @@ -782,6 +782,8 @@ static int __devinit pci_init(void) while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { pci_fixup_device(pci_fixup_final, dev); } + + iochk_init(); return 0; } From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 22:50:21 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Thu, 09 Jun 2005 21:50:21 +0900 Subject: [PATCH 02/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <42A83B0D.7040701@jp.fujitsu.com> [This is 2 of 10 patches, "iochk-02-ia64.patch"] - Add "config IOMAP_CHECK" to change definitions from generic to specific. - Defines ia64 version of: iochk_clear, iochk_read, iochk_init, and iocookie But they are no-ops yet. See next patch (3 of 10). Signed-off-by: Hidetoshi Seto --- arch/ia64/Kconfig | 13 +++++++++++++ arch/ia64/lib/Makefile | 1 + arch/ia64/lib/iomap_check.c | 30 ++++++++++++++++++++++++++++++ include/asm-ia64/io.h | 11 +++++++++++ 4 files changed, 55 insertions(+) Index: linux-2.6.11.11/arch/ia64/lib/Makefile =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/Makefile +++ linux-2.6.11.11/arch/ia64/lib/Makefile @@ -16,6 +16,7 @@ lib-$(CONFIG_MCKINLEY) += copy_page_mck. 
lib-$(CONFIG_PERFMON) += carta_random.o lib-$(CONFIG_MD_RAID5) += xor.o lib-$(CONFIG_HAVE_DEC_LOCK) += dec_and_lock.o +lib-$(CONFIG_IOMAP_CHECK) += iomap_check.o AFLAGS___divdi3.o = AFLAGS___udivdi3.o = -DUNSIGNED Index: linux-2.6.11.11/arch/ia64/Kconfig =================================================================== --- linux-2.6.11.11.orig/arch/ia64/Kconfig +++ linux-2.6.11.11/arch/ia64/Kconfig @@ -381,6 +381,19 @@ config PCI_DOMAINS bool default PCI +config IOMAP_CHECK + bool "Support iochk interfaces for IO error detection." + depends on PCI && EXPERIMENTAL + ---help--- + Saying Y provides iochk infrastructure for "RAS-aware" drivers + to detect and recover some IO errors, which strongly required by + some of very-high-reliable systems. + The implementation of this infrastructure is highly depend on arch, + bus system, chipset and so on. + Currentry, very few drivers on few arch actually implements this. + + If you don't know what to do here, say N. + source "drivers/pci/Kconfig" source "drivers/pci/hotplug/Kconfig" Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- /dev/null +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -0,0 +1,30 @@ +/* + * File: iomap_check.c + * Purpose: Implement the IA64 specific iomap recovery interfaces + */ + +#include + +void iochk_init(void); +void iochk_clear(iocookie *cookie, struct pci_dev *dev); +int iochk_read(iocookie *cookie); + +void iochk_init(void) +{ + /* setup */ +} + +void iochk_clear(iocookie *cookie, struct pci_dev *dev) +{ + /* register device etc. */ +} + +int iochk_read(iocookie *cookie) +{ + /* check error etc. 
 */
+
+	return 0;
+}
+
+EXPORT_SYMBOL(iochk_read);
+EXPORT_SYMBOL(iochk_clear);
Index: linux-2.6.11.11/include/asm-ia64/io.h
===================================================================
--- linux-2.6.11.11.orig/include/asm-ia64/io.h
+++ linux-2.6.11.11/include/asm-ia64/io.h
@@ -70,6 +70,17 @@ extern unsigned int num_io_spaces;
 #include
 #include
 #include
+
+#ifdef CONFIG_IOMAP_CHECK
+
+/* definition of ia64 iocookie */
+typedef unsigned long iocookie;
+
+/* enable ia64 iochk - See arch/ia64/lib/iomap_check.c */
+#define HAVE_ARCH_IOMAP_CHECK
+
+#endif /* CONFIG_IOMAP_CHECK */
+
 #include
 
 /*


From seto.hidetoshi at jp.fujitsu.com  Thu Jun  9 22:51:57 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Thu, 09 Jun 2005 21:51:57 +0900
Subject: [PATCH 03/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42A8386F.2060100@jp.fujitsu.com>
References: <42A8386F.2060100@jp.fujitsu.com>
Message-ID: <42A83B6D.8010703@jp.fujitsu.com>

[This is 3 of 10 patches, "iochk-03-register.patch"]

- Implement the ia64 version of the basic code:
    iochk_clear, iochk_read, iochk_init, and iocookie

  The direction is:

  - Have a "now in check" global list, "iochk_devices", for future use.
  - Take a lock, "iochk_lock", to protect the global list.
  - iochk_clear packs *dev into the iocookie and adds it to the global
    list.  After everything is prepared, it clears the error flag in
    the cookie to start the I/O critical session.
  - iochk_read checks the error flag and the device's status register.
    After removing the iocookie from the list, it returns the result.

  This is too simple.  We need more code...  See next (4 of 10).

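For readers who want to see the check-in/check-out flow in isolation, here is
a minimal user-space sketch in C.  It is not the patch code: struct fake_dev,
struct cookie and the model_* functions are made-up stand-ins, the device
status register is reduced to a single flag, and the locking that iochk_lock
provides in the kernel is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: one latched error flag. */
struct fake_dev {
	int status_error;	/* models the abort/parity bits in PCI_STATUS */
};

/* User-space model of the ia64 iocookie described above. */
struct cookie {
	struct cookie *prev, *next;	/* "now in check" list linkage */
	struct fake_dev *dev;		/* targeting device */
	unsigned long error;		/* error flag, may be set by others */
};

/* Circular list head; the kernel guards the real list with iochk_lock. */
static struct cookie check_list = { &check_list, &check_list, NULL, 0 };

/* Check in: remember the device, join the list, then clear the flag. */
static void model_iochk_clear(struct cookie *c, struct fake_dev *dev)
{
	assert(c != NULL && dev != NULL);
	c->dev = dev;
	c->next = check_list.next;
	c->prev = &check_list;
	check_list.next->prev = c;
	check_list.next = c;
	c->error = 0;		/* the I/O critical session starts here */
}

/* Check out: combine the flag with the device status, leave the list. */
static int model_iochk_read(struct cookie *c)
{
	int ret;

	assert(c != NULL);
	ret = (c->error || c->dev->status_error) ? 1 : 0;
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->next = c->prev = c;
	return ret;
}

/* Helper for the sketch only: true when nobody is checked in. */
static int model_list_is_empty(void)
{
	return check_list.next == &check_list;
}
```

A driver would bracket its I/O with the pair, model_iochk_clear() before and
model_iochk_read() after, and retry when the latter returns non-zero.
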
Signed-off-by: Hidetoshi Seto --- arch/ia64/lib/iomap_check.c | 54 ++++++++++++++++++++++++++++++++++++++++++-- include/asm-ia64/io.h | 8 +++++- 2 files changed, 59 insertions(+), 3 deletions(-) Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -4,24 +4,74 @@ */ #include +#include +#include void iochk_init(void); void iochk_clear(iocookie *cookie, struct pci_dev *dev); int iochk_read(iocookie *cookie); +struct list_head iochk_devices; +DEFINE_SPINLOCK(iochk_lock); /* all works are excluded on this lock */ + +static int have_error(struct pci_dev *dev); + void iochk_init(void) { /* setup */ + INIT_LIST_HEAD(&iochk_devices); } void iochk_clear(iocookie *cookie, struct pci_dev *dev) { - /* register device etc. */ + unsigned long flag; + + INIT_LIST_HEAD(&(cookie->list)); + + cookie->dev = dev; + + spin_lock_irqsave(&iochk_lock, flag); + list_add(&cookie->list, &iochk_devices); + spin_unlock_irqrestore(&iochk_lock, flag); + + cookie->error = 0; } int iochk_read(iocookie *cookie) { - /* check error etc. 
*/ + unsigned long flag; + int ret = 0; + + spin_lock_irqsave(&iochk_lock, flag); + if( cookie->error || have_error(cookie->dev) ) + ret = 1; + list_del(&cookie->list); + spin_unlock_irqrestore(&iochk_lock, flag); + + return ret; +} + +static int have_error(struct pci_dev *dev) +{ + u16 status; + + /* check status */ + switch (dev->hdr_type) { + case PCI_HEADER_TYPE_NORMAL: /* 0 */ + pci_read_config_word(dev, PCI_STATUS, &status); + break; + case PCI_HEADER_TYPE_BRIDGE: /* 1 */ + pci_read_config_word(dev, PCI_SEC_STATUS, &status); + break; + case PCI_HEADER_TYPE_CARDBUS: /* 2 */ + default: + BUG(); + } + + if ( (status & PCI_STATUS_REC_TARGET_ABORT) + || (status & PCI_STATUS_REC_MASTER_ABORT) + || (status & PCI_STATUS_DETECTED_PARITY) ) + return 1; return 0; } Index: linux-2.6.11.11/include/asm-ia64/io.h =================================================================== --- linux-2.6.11.11.orig/include/asm-ia64/io.h +++ linux-2.6.11.11/include/asm-ia64/io.h @@ -72,9 +72,15 @@ extern unsigned int num_io_spaces; #include #ifdef CONFIG_IOMAP_CHECK +#include /* definition of ia64 iocookie */ -typedef unsigned long iocookie; +struct __iocookie { + struct list_head list; + struct pci_dev *dev; /* targeting device */ + unsigned long error; /* error flag */ +}; +typedef struct __iocookie iocookie; /* enable ia64 iochk - See arch/ia64/lib/iomap_check.c */ #define HAVE_ARCH_IOMAP_CHECK From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 22:53:28 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Thu, 09 Jun 2005 21:53:28 +0900 Subject: [PATCH 04/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <42A83BC8.2010500@jp.fujitsu.com> [This is 4 of 10 patches, "iochk-04-register_bridge.patch"] - Since there could be a (PCI-)bus-error, some kind of error cannot detected on the device but on its hosting bridge. 
So, it is also required to check the bridge's register. In other words, to check a bus-error correctly, we need to check both end of the bus, device and its host bridge. OK, but often bridges are shared by multiple devices, right? So we need care to handle it... Yes, see next (5 of 10). Signed-off-by: Hidetoshi Seto --- arch/ia64/lib/iomap_check.c | 19 ++++++++++++++++++- include/asm-ia64/io.h | 1 + 2 files changed, 19 insertions(+), 1 deletion(-) Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -14,6 +14,7 @@ int iochk_read(iocookie *cookie); struct list_head iochk_devices; DEFINE_SPINLOCK(iochk_lock); /* all works are excluded on this lock */ +static struct pci_dev *search_host_bridge(struct pci_dev *dev); static int have_error(struct pci_dev *dev); void iochk_init(void) @@ -29,6 +30,7 @@ void iochk_clear(iocookie *cookie, struc INIT_LIST_HEAD(&(cookie->list)); cookie->dev = dev; + cookie->host = search_host_bridge(dev); spin_lock_irqsave(&iochk_lock, flag); list_add(&cookie->list, &iochk_devices); @@ -43,7 +45,8 @@ int iochk_read(iocookie *cookie) int ret = 0; spin_lock_irqsave(&iochk_lock, flag); - if( cookie->error || have_error(cookie->dev) ) + if( cookie->error || have_error(cookie->dev) + || (cookie->host && have_error(cookie->host)) ) ret = 1; list_del(&cookie->list); spin_unlock_irqrestore(&iochk_lock, flag); @@ -51,6 +54,20 @@ int iochk_read(iocookie *cookie) return ret; } +struct pci_dev *search_host_bridge(struct pci_dev *dev) +{ + struct pci_bus *pbus; + + /* there is no bridge */ + if (!dev->bus->self) return NULL; + + /* find root bus bridge */ + for (pbus = dev->bus; pbus->parent && pbus->parent->self; + pbus = pbus->parent); + + return pbus->self; +} + static int have_error(struct pci_dev *dev) { u16 status; Index: linux-2.6.11.11/include/asm-ia64/io.h 
===================================================================
--- linux-2.6.11.11.orig/include/asm-ia64/io.h
+++ linux-2.6.11.11/include/asm-ia64/io.h
@@ -78,6 +78,7 @@ extern unsigned int num_io_spaces;
 struct __iocookie {
 	struct list_head	list;
 	struct pci_dev		*dev;	/* targeting device */
+	struct pci_dev		*host;	/* hosting bridge */
 	unsigned long		error;	/* error flag */
 };
 typedef struct __iocookie iocookie;


From seto.hidetoshi at jp.fujitsu.com  Thu Jun  9 22:54:42 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Thu, 09 Jun 2005 21:54:42 +0900
Subject: [PATCH 05/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42A8386F.2060100@jp.fujitsu.com>
References: <42A8386F.2060100@jp.fujitsu.com>
Message-ID: <42A83C12.6010605@jp.fujitsu.com>

[This is 5 of 10 patches, "iochk-05-check_bridge.patch"]

- Consider three devices, A, B, and C, placed under the same host
  bridge H.  After A and B have checked in (i.e. passed iochk_clear and
  are doing some I/O, but have not called iochk_read yet), C is about
  to check in.  It has just entered iochk_clear and finds that H
  indicates an error.

  This means that A or B hit a bus error, but there is no data telling
  which one actually hit it.  So C should notify both A and B of the
  error, and clear H's status before starting its own I/O.

  With only two devices it becomes simpler: if one finds a bridge
  error while the other is checked in, the error can only belong to
  the other.

  Well, the work concerning registers (devices and bridges) is almost
  in shape.  So, from the next patch, I'll move on to the deeper phase
  of implementing more arch-specific code...  see next (6 of 10).

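The A/B/C scenario above can be sketched in user space as follows.  This is
only an illustration under assumptions: struct fake_bridge, struct bcookie,
the fixed-size table and the model_* names are invented for the sketch, the
bridge status is a single flag, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIVE 8	/* arbitrary table size for this sketch */

/* Hypothetical host bridge: one latched error flag. */
struct fake_bridge {
	int status_error;
};

/* Hypothetical cookie: which bridge hosts me, and was I told of an error? */
struct bcookie {
	struct fake_bridge *host;
	int error;
};

/* Stand-in for the global "now in check" list. */
static struct bcookie *active[MAX_ACTIVE];

/* Mark every checked-in cookie that sits under the given bridge. */
static void model_notify_bridge_error(struct fake_bridge *bridge)
{
	size_t i;

	for (i = 0; i < MAX_ACTIVE; i++)
		if (active[i] && active[i]->host == bridge)
			active[i]->error = 1;
}

/*
 * Check in.  If the bridge already shows an error, it belongs to someone
 * who checked in earlier: tell them, then clear the bridge status so the
 * newcomer starts with a clean slate.  Returns -1 if the table is full.
 */
static int model_check_in(struct bcookie *c, struct fake_bridge *host)
{
	size_t i;

	assert(c != NULL);
	c->host = host;
	if (host && host->status_error) {
		model_notify_bridge_error(host);
		host->status_error = 0;
	}
	c->error = 0;
	for (i = 0; i < MAX_ACTIVE; i++) {
		if (!active[i]) {
			active[i] = c;
			return 0;
		}
	}
	return -1;
}

/* Check out: report the own flag or a still-pending bridge error. */
static int model_check_out(struct bcookie *c)
{
	size_t i;
	int ret = c->error || (c->host && c->host->status_error);

	for (i = 0; i < MAX_ACTIVE; i++)
		if (active[i] == c)
			active[i] = NULL;
	return ret;
}
```

Note that the newcomer is added to the table only after the notification, so
it does not blame itself for an error that predates its session.
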
Signed-off-by: Hidetoshi Seto --- arch/ia64/lib/iomap_check.c | 45 ++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 45 insertions(+) Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -17,6 +17,9 @@ DEFINE_SPINLOCK(iochk_lock); /* all work static struct pci_dev *search_host_bridge(struct pci_dev *dev); static int have_error(struct pci_dev *dev); +void notify_bridge_error(struct pci_dev *bridge); +void clear_bridge_error(struct pci_dev *bridge); + void iochk_init(void) { /* setup */ @@ -33,6 +36,11 @@ void iochk_clear(iocookie *cookie, struc cookie->host = search_host_bridge(dev); spin_lock_irqsave(&iochk_lock, flag); + if(cookie->host && have_error(cookie->host)) { + /* someone under my bridge causes error... */ + notify_bridge_error(cookie->host); + clear_bridge_error(cookie->host); + } list_add(&cookie->list, &iochk_devices); spin_unlock_irqrestore(&iochk_lock, flag); @@ -93,5 +101,42 @@ static int have_error(struct pci_dev *de return 0; } +void notify_bridge_error(struct pci_dev *bridge) +{ + iocookie *cookie; + + if (list_empty(&iochk_devices)) + return; + + /* notify error to all transactions using this host bridge */ + if (bridge) { + /* local notify, ex. Parity, Abort etc. 
*/ + list_for_each_entry(cookie, &iochk_devices, list) { + if (cookie->host == bridge) + cookie->error = 1; + } + } +} + +void clear_bridge_error(struct pci_dev *bridge) +{ + u16 status = ( PCI_STATUS_REC_TARGET_ABORT + | PCI_STATUS_REC_MASTER_ABORT + | PCI_STATUS_DETECTED_PARITY ); + + /* clear bridge status */ + switch (bridge->hdr_type) { + case PCI_HEADER_TYPE_NORMAL: /* 0 */ + pci_write_config_word(bridge, PCI_STATUS, status); + break; + case PCI_HEADER_TYPE_BRIDGE: /* 1 */ + pci_write_config_word(bridge, PCI_SEC_STATUS, status); + break; + case PCI_HEADER_TYPE_CARDBUS: /* 2 */ + default: + BUG(); + } +} + EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 22:56:03 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Thu, 09 Jun 2005 21:56:03 +0900 Subject: [PATCH 06/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <42A83C63.5080604@jp.fujitsu.com> [This is 6 of 10 patches, "iochk-06-mcanotify.patch"] - This is a headache: When ia64 get a problem on hardware, OS could request SAL(System Abstraction Layer: ia64 firmware) to gather system status via calling SAL_GET_STATE_INFO procedure. However (depend on implementation of SAL for its platform, hopefully), on the way of gathering, SAL also checks every host bridges and its status, and after that, resets the state... So we should take care of this reset by SAL. Handling MCA(Machine Check Abort) is one of a situation should we take care. Originally MCA is designed as a critical interruption, so when MCA comes, without OS's order, SAL gathers system status before OS gets its control. So since states of bridges are already reset on entrance of MCA, OS should notify "lost of state" to all "check-in" contexts, by marking its error flag, iocookie->error. 
There would be better way if OS can know the bridge state from data which SAL gathered, but in the meanwhile, I just do simple way. PCI-parity error is one of MCA causes, is it OK? Next, "data poisoning" helps us... see next (7 of 10). Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca.c | 13 +++++++++++++ arch/ia64/lib/iomap_check.c | 7 ++++++- 2 files changed, 19 insertions(+), 1 deletion(-) Index: linux-2.6.11.11/arch/ia64/kernel/mca.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/kernel/mca.c +++ linux-2.6.11.11/arch/ia64/kernel/mca.c @@ -77,6 +77,11 @@ #include #include +#ifdef CONFIG_IOMAP_CHECK +#include +extern void notify_bridge_error(struct pci_dev *bridge); +#endif + #if defined(IA64_MCA_DEBUG_INFO) # define IA64_MCA_DEBUG(fmt...) printk(fmt) #else @@ -893,6 +898,14 @@ ia64_mca_ucmc_handler(void) sal_log_record_header_t *rh = IA64_LOG_CURR_BUFFER(SAL_INFO_TYPE_MCA); rh->severity = sal_log_severity_corrected; ia64_sal_clear_state_info(SAL_INFO_TYPE_MCA); + +#ifdef CONFIG_IOMAP_CHECK + /* + * SAL already reads and clears error bits on bridge registers, + * so we should have all running transactions to retry. + */ + notify_bridge_error(0); +#endif } /* * Wakeup all the processors which are spinning in the rendezvous Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -109,7 +109,12 @@ void notify_bridge_error(struct pci_dev return; /* notify error to all transactions using this host bridge */ - if (bridge) { + if (!bridge) { + /* global notify, ex. MCA */ + list_for_each_entry(cookie, &iochk_devices, list) { + cookie->error = 1; + } + } else { /* local notify, ex. Parity, Abort etc. 
*/ list_for_each_entry(cookie, &iochk_devices, list) { if (cookie->host == bridge) From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 22:58:26 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Thu, 09 Jun 2005 21:58:26 +0900 Subject: [PATCH 07/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <42A83CF2.90304@jp.fujitsu.com> [This is 7 of 10 patches, "iochk-07-poison.patch"] - When bus-error occur on write, write data is broken on the bus, so target device gets broken data. There are 2 way for such device to take: - send PERR(Parity Error) to host, expecting immediate panic. - mark status register as error, expecting its driver to read it and decide to retry. So it is not difficult for drivers to recover from error on write if it can take latter way, and if it don't worry about taking time to wait completion of write. - When bus-error occur on read, read data is broken on the bus, so host bridge gets broken data. There are 2 way for such bridge to take: - send BERR(Bus Error) to host, expecting immediate panic. - mark data as "poisoned" and throw it to destination, expecting panic if system touched it but cannot stop data pollution. Former is traditional way, latter is modern way, called "data poisoning". The important difference is whether OS can get a chance to recover from the error. Usually, sending BERR doesn't tell us "where it comes", "who it orders", so we cannot do anything except panic. In the other hand, poisoned data will reach its destination and will cause a error on there again. Yes, destination is "where who lives". Well, the idea is quite simple: "driver checks read data, and recover if it was poisoned." Checking all read at once (ex. take a memo of all read addresses touched after iochk_clear and check them all in iochk_read) does not make sense. Practical way is check each read, keep its result, and read it at end. 
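The "check each read, keep its result, read it at the end" idea can be
sketched in user space like this.  It is a simplified model, not the ia64
code: real poisoned data raises an MCA when it is consumed, whereas here a
made-up "poisoned" field in struct fake_reg stands in for that, and the
model_* names and struct pcookie are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device register; "poisoned" stands in for hardware tagging. */
struct fake_reg {
	unsigned int value;
	int poisoned;
};

/* Hypothetical per-session cookie holding the accumulated result. */
struct pcookie {
	int error;
};

/* Start of the session: forget earlier results. */
static void model_session_start(struct pcookie *c)
{
	assert(c != NULL);
	c->error = 0;
}

/*
 * Every read is checked on the spot and the outcome is kept in the
 * cookie.  The value is still returned; the caller must not trust it
 * until the session result has been read.
 */
static unsigned int model_readl(struct pcookie *c, const struct fake_reg *reg)
{
	assert(c != NULL && reg != NULL);
	if (reg->poisoned)
		c->error = 1;
	return reg->value;
}

/* End of the session: one look at the kept result covers all reads. */
static int model_session_end(const struct pcookie *c)
{
	return c->error;
}
```

The point of the design is that the driver does not need a log of every
address it touched; a single sticky flag per session is enough.
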
Touching poisoned data become a MCA, so now it directly means a system down. But since the MCA tells us "where it happens", we can recover it...? All right, let's see next (8 of 10). Signed-off-by: Hidetoshi Seto --- include/asm-ia64/io.h | 110 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 110 insertions(+) Index: linux-2.6.11.11/include/asm-ia64/io.h =================================================================== --- linux-2.6.11.11.orig/include/asm-ia64/io.h +++ linux-2.6.11.11/include/asm-ia64/io.h @@ -86,6 +86,20 @@ typedef struct __iocookie iocookie; /* enable ia64 iochk - See arch/ia64/lib/iomap_check.c */ #define HAVE_ARCH_IOMAP_CHECK +/* + * Some I/O bridges may poison the data read, instead of + * signaling a BERR. The consummation of poisoned data + * triggers a local, imprecise MCA. + * Note that the read operation by itself does not consume + * the bad data, you have to do something with it, e.g.: + * + * ld.8 r9=[r10];; // r10 == I/O address + * add.8 r8=r9,r9;; // fake operation + */ +#define ia64_poison_check(val) \ +{ register unsigned long gr8 asm("r8"); \ + asm volatile ("add %0=%1,r0" : "=r"(gr8) : "r"(val)); } + #endif /* CONFIG_IOMAP_CHECK */ #include @@ -190,6 +204,8 @@ __ia64_mk_io_addr (unsigned long port) * during optimization, which is why we use "volatile" pointers. 
*/ +#ifdef CONFIG_IOMAP_CHECK + static inline unsigned int ___ia64_inb (unsigned long port) { @@ -198,6 +214,8 @@ ___ia64_inb (unsigned long port) ret = *addr; __ia64_mf_a(); + ia64_poison_check(ret); + return ret; } @@ -209,6 +227,8 @@ ___ia64_inw (unsigned long port) ret = *addr; __ia64_mf_a(); + ia64_poison_check(ret); + return ret; } @@ -220,9 +240,48 @@ ___ia64_inl (unsigned long port) ret = *addr; __ia64_mf_a(); + ia64_poison_check(ret); + + return ret; +} + +#else /* CONFIG_IOMAP_CHECK */ + +static inline unsigned int +___ia64_inb (unsigned long port) +{ + volatile unsigned char *addr = __ia64_mk_io_addr(port); + unsigned char ret; + + ret = *addr; + __ia64_mf_a(); return ret; } +static inline unsigned int +___ia64_inw (unsigned long port) +{ + volatile unsigned short *addr = __ia64_mk_io_addr(port); + unsigned short ret; + + ret = *addr; + __ia64_mf_a(); + return ret; +} + +static inline unsigned int +___ia64_inl (unsigned long port) +{ + volatile unsigned int *addr = __ia64_mk_io_addr(port); + unsigned int ret; + + ret = *addr; + __ia64_mf_a(); + return ret; +} + +#endif /* CONFIG_IOMAP_CHECK */ + static inline void ___ia64_outb (unsigned char val, unsigned long port) { @@ -339,6 +398,55 @@ __outsl (unsigned long port, const void * a good idea). Writes are ok though for all existing ia64 platforms (and * hopefully it'll stay that way). 
*/ + +#ifdef CONFIG_IOMAP_CHECK + +static inline unsigned char +___ia64_readb (const volatile void __iomem *addr) +{ + unsigned char val; + + val = *(volatile unsigned char __force *)addr; + ia64_poison_check(val); + + return val; +} + +static inline unsigned short +___ia64_readw (const volatile void __iomem *addr) +{ + unsigned short val; + + val = *(volatile unsigned short __force *)addr; + ia64_poison_check(val); + + return val; +} + +static inline unsigned int +___ia64_readl (const volatile void __iomem *addr) +{ + unsigned int val; + + val = *(volatile unsigned int __force *) addr; + ia64_poison_check(val); + + return val; +} + +static inline unsigned long +___ia64_readq (const volatile void __iomem *addr) +{ + unsigned long val; + + val = *(volatile unsigned long __force *) addr; + ia64_poison_check(val); + + return val; +} + +#else /* CONFIG_IOMAP_CHECK */ + static inline unsigned char ___ia64_readb (const volatile void __iomem *addr) { @@ -363,6 +471,8 @@ ___ia64_readq (const volatile void __iom return *(volatile unsigned long __force *) addr; } +#endif /* CONFIG_IOMAP_CHECK */ + static inline void __writeb (unsigned char val, volatile void __iomem *addr) { From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 23:00:44 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Thu, 09 Jun 2005 22:00:44 +0900 Subject: [PATCH 08/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <42A83D7C.2010101@jp.fujitsu.com> [This is 8 of 10 patches, "iochk-08-mcadrv.patch"] - Touching poisoned data become a MCA, so now it assumed as a fatal error, directly will be a system down. But since the MCA tells us a physical address - "where it happens", we can do some action to survive. 
If the address is present in resource of "check-in" device, it is guaranteed that its driver will call iochk_read in the very near future, and that now the driver have a ability and responsibility of recovery from the error. So if it was "check-in" address, what OS should do is mark "check-in" devices and just restart usual works. Soon the driver will notice the error and operate it properly. Note: We can identify a affected device, but because of SAL behavior (mentioned at 6 of 10), we need to mark all "check-in" devices. Fix in future, if possible. Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca_drv.c | 85 ++++++++++++++++++++++++++++++++++++++++++++ arch/ia64/lib/iomap_check.c | 1 2 files changed, 86 insertions(+) Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -145,3 +145,4 @@ void clear_bridge_error(struct pci_dev * EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); +EXPORT_SYMBOL(iochk_devices); /* for MCA driver */ Index: linux-2.6.11.11/arch/ia64/kernel/mca_drv.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/kernel/mca_drv.c +++ linux-2.6.11.11/arch/ia64/kernel/mca_drv.c @@ -35,6 +35,13 @@ #include "mca_drv.h" +#ifdef CONFIG_IOMAP_CHECK +#include +#include +extern struct list_head iochk_devices; +#endif + + /* max size of SAL error record (default) */ static int sal_rec_max = 10000; @@ -378,6 +385,80 @@ is_mca_global(peidx_table_t *peidx, pal_ return MCA_IS_GLOBAL; } + +#ifdef CONFIG_IOMAP_CHECK + +/** + * get_target_identifier - get address of target_identifier + * @peidx: pointer of index of processor error section + * + * Return value: + * addr if valid / 0 if not valid + */ +static u64 get_target_identifier(peidx_table_t *peidx) +{ + sal_log_mod_error_info_t *smei; + + smei = peidx_bus_check(peidx, 0); + if 
(smei->valid.target_identifier) + return (smei->target_identifier); + return 0; +} + +/** + * offending_addr_in_check - Check if the addr is in checking resource. + * @addr: address offending this MCA + * + * Return value: + * 1 if in / 0 if out + */ +static int offending_addr_in_check(u64 addr) +{ + int i; + struct pci_dev *tdev; + iocookie *cookie; + + if (list_empty(&iochk_devices)) + return 0; + + list_for_each_entry(cookie, &iochk_devices, list) { + tdev = cookie->dev; + for (i = 0; i < PCI_ROM_RESOURCE; i++) { + if (tdev->resource[i].start <= addr + && addr <= tdev->resource[i].end) + return 1; + if ((tdev->resource[i].flags + & (PCI_BASE_ADDRESS_SPACE|PCI_BASE_ADDRESS_MEM_TYPE_MASK)) + == (PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64)) + i++; + } + } + return 0; +} + +/** + * pci_error_recovery - Check if MCA occur on transaction in iochk. + * @peidx: pointer of index of processor error section + * + * Return value: + * 1 if error could be cought in driver / 0 if not + */ +static int pci_error_recovery(peidx_table_t *peidx) +{ + u64 addr; + + addr = get_target_identifier(peidx); + if(!addr) return 0; + + if(offending_addr_in_check(addr)) + return 1; + + return 0; +} + +#endif /* CONFIG_IOMAP_CHECK */ + + /** * recover_from_read_error - Try to recover the errors which type are "read"s. 
  * @slidx: pointer of index of SAL error record
@@ -400,6 +481,10 @@ recover_from_read_error(slidx_table_t *s
 	if (!pbci->tv)
 		return 0;
 
+#ifdef CONFIG_IOMAP_CHECK
+	if ( pci_error_recovery(peidx) ) return 1;
+#endif
+
 	/*
 	 * cpu read or memory-mapped io read
 	 *


From seto.hidetoshi at jp.fujitsu.com  Thu Jun  9 23:02:14 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Thu, 09 Jun 2005 22:02:14 +0900
Subject: [PATCH 09/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42A8386F.2060100@jp.fujitsu.com>
References: <42A8386F.2060100@jp.fujitsu.com>
Message-ID: <42A83DD6.9030607@jp.fujitsu.com>

[This is 9 of 10 patches, "iochk-09-cpeh.patch"]

- The SAL behavior doesn't affect only MCA.  There are other occasions
  on which SAL_GET_STATE_INFO is called: when a CMC, CPE, or INIT
  happens.

  - CMC (Corrected Machine Check) is for non-fatal, processor-local
    errors.  Fortunately, calling SAL_GET_STATE_INFO for a CMC only
    collects data from the processor that issued it, without touching
    any bridge or its status.  So this is safe.

  - CPE (Corrected Platform Error) is for non-fatal, platform-related
    errors.  Even though it says "corrected", calling the SAL procedure
    for a CPE touches every bridge on the platform and "corrects" the
    bridge status, which is bad for iochk.

  - INIT is a kind of system reset request, as far as I know.  So
    restarting from INIT is out of design, and iochk after INIT is not
    required at this time.

  In short, only MCA and CPE have the problem of SAL behavior.

  One of the differences from MCA is that SAL will not gather data
  before the OS actually requests it.

  MCA:
    1) SAL gathers data and keeps it internally
    2) OS gets control
    3) if OS requests, SAL returns the data gathered at the beginning

  CPE:
    1) OS gets control
    2) OS sends a request to SAL
    3) SAL gathers data and returns it to OS

  Therefore, we can make the CPE handler take care of the bridge
  states by checking them before calling the SAL procedure.

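Why the ordering matters for CPE can be shown with a small user-space model.
Everything here is an assumption made for illustration: struct sbridge,
struct scookie and the model_* functions are invented, and
model_sal_get_state_info() only imitates the side effect described above
(every bridge status gets reset), not the real SAL procedure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bridge and cookie, reduced to what the ordering needs. */
struct sbridge {
	int status_error;
};

struct scookie {
	struct sbridge *host;	/* may be NULL when there is no bridge */
	int error;
};

/* Imitates the side effect only: all bridge status flags are reset. */
static void model_sal_get_state_info(struct sbridge *bridges, size_t nb)
{
	size_t i;

	for (i = 0; i < nb; i++)
		bridges[i].status_error = 0;
}

/* Copy pending bridge errors into the cookies that depend on them. */
static void model_save_bridge_error(struct scookie *cookies, size_t nc)
{
	size_t i;

	for (i = 0; i < nc; i++) {
		if (cookies[i].error)
			continue;
		if (cookies[i].host && cookies[i].host->status_error)
			cookies[i].error = 1;
	}
}

/* CPE path: save first, then let the firmware model wipe the status. */
static void model_cpe_handler(struct scookie *cookies, size_t nc,
			      struct sbridge *bridges, size_t nb)
{
	model_save_bridge_error(cookies, nc);
	model_sal_get_state_info(bridges, nb);
}
```

Calling the two steps in the opposite order would lose the error: by the time
the cookies are examined, the bridge flags have already been cleared.
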
Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca.c | 21 +++++++++++++++++++++ arch/ia64/lib/iomap_check.c | 17 +++++++++++++++++ 2 files changed, 38 insertions(+) Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -19,6 +19,7 @@ static int have_error(struct pci_dev *de void notify_bridge_error(struct pci_dev *bridge); void clear_bridge_error(struct pci_dev *bridge); +void save_bridge_error(void); void iochk_init(void) { @@ -143,6 +144,22 @@ void clear_bridge_error(struct pci_dev * } } +void save_bridge_error(void) +{ + iocookie *cookie; + + if (list_empty(&iochk_devices)) + return; + + /* mark devices if its root bus bridge have errors */ + list_for_each_entry(cookie, &iochk_devices, list) { + if (cookie->error) + continue; + if (have_error(cookie->host)) + notify_bridge_error(cookie->host); + } +} + EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); EXPORT_SYMBOL(iochk_devices); /* for MCA driver */ Index: linux-2.6.11.11/arch/ia64/kernel/mca.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/kernel/mca.c +++ linux-2.6.11.11/arch/ia64/kernel/mca.c @@ -80,6 +80,8 @@ #ifdef CONFIG_IOMAP_CHECK #include extern void notify_bridge_error(struct pci_dev *bridge); +extern void save_bridge_error(void); +extern spinlock_t iochk_lock; #endif #if defined(IA64_MCA_DEBUG_INFO) @@ -288,11 +290,30 @@ ia64_mca_cpe_int_handler (int cpe_irq, v IA64_MCA_DEBUG("%s: received interrupt vector = %#x on CPU %d\n", __FUNCTION__, cpe_irq, smp_processor_id()); +#ifndef CONFIG_IOMAP_CHECK + /* SAL spec states this should run w/ interrupts enabled */ local_irq_enable(); /* Get the CPE error record and log it */ ia64_mca_log_sal_error_record(SAL_INFO_TYPE_CPE); +#else + /* + * Because SAL_GET_STATE_INFO for CPE might clear bridge states + * in process of gathering 
error information from the system,
+	 * we should check the states before clearing it.
+	 * While OS and SAL are handling bridge status, we have to protect
+	 * the states from changing by any other I/Os running simultaneously,
+	 * so this should be handled w/ lock and interrupts disabled.
+	 */
+	spin_lock(&iochk_lock);
+	save_bridge_error();
+	ia64_mca_log_sal_error_record(SAL_INFO_TYPE_CPE);
+	spin_unlock(&iochk_lock);
+
+	/* Rests can go w/ interrupt enabled as usual */
+	local_irq_enable();
+#endif
 
 	spin_lock(&cpe_history_lock);
 	if (!cpe_poll_enabled && cpe_vector >= 0) {

From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 23:04:20 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Thu, 09 Jun 2005 22:04:20 +0900
Subject: [PATCH 10/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42A8386F.2060100@jp.fujitsu.com>
References: <42A8386F.2060100@jp.fujitsu.com>
Message-ID: <42A83E54.8030603@jp.fujitsu.com>

[This is 10 of 10 patches, "iochk-10-rwlock.patch"]

- If a read access (e.g. readX/inX) causes an error while SAL is
  gathering system data on another processor, a bridge error status
  could be set and then vanish in a blink.

  In the case of MCA, thanks to the rz_always flag, all MCAs are
  handled as global, so all processors except one are paused during
  handling.

  But a CPE, like any other interrupt, has to be handled alongside
  all other active processors.

  Therefore, to avoid losing the status this way, exclusive control
  between read accesses and SAL_GET_STATE_INFO is required.

  To realize this, I changed the control lock from a spinlock to a
  rwlock. There may be a better way; if so, this part should be
  replaced.
Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca.c | 6 +++--- arch/ia64/lib/iomap_check.c | 11 ++++++----- include/asm-ia64/io.h | 24 ++++++++++++++++++++++++ 3 files changed, 33 insertions(+), 8 deletions(-) Index: linux-2.6.11.11/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.11.11/arch/ia64/lib/iomap_check.c @@ -12,7 +12,7 @@ void iochk_clear(iocookie *cookie, struc int iochk_read(iocookie *cookie); struct list_head iochk_devices; -DEFINE_SPINLOCK(iochk_lock); /* all works are excluded on this lock */ +DEFINE_RWLOCK(iochk_lock); /* all works are excluded on this lock */ static struct pci_dev *search_host_bridge(struct pci_dev *dev); static int have_error(struct pci_dev *dev); @@ -36,14 +36,14 @@ void iochk_clear(iocookie *cookie, struc cookie->dev = dev; cookie->host = search_host_bridge(dev); - spin_lock_irqsave(&iochk_lock, flag); + write_lock_irqsave(&iochk_lock, flag); if(cookie->host && have_error(cookie->host)) { /* someone under my bridge causes error... 
*/ notify_bridge_error(cookie->host); clear_bridge_error(cookie->host); } list_add(&cookie->list, &iochk_devices); - spin_unlock_irqrestore(&iochk_lock, flag); + write_unlock_irqrestore(&iochk_lock, flag); cookie->error = 0; } @@ -53,12 +53,12 @@ int iochk_read(iocookie *cookie) unsigned long flag; int ret = 0; - spin_lock_irqsave(&iochk_lock, flag); + write_lock_irqsave(&iochk_lock, flag); if( cookie->error || have_error(cookie->dev) || (cookie->host && have_error(cookie->host)) ) ret = 1; list_del(&cookie->list); - spin_unlock_irqrestore(&iochk_lock, flag); + write_unlock_irqrestore(&iochk_lock, flag); return ret; } @@ -160,6 +160,7 @@ void save_bridge_error(void) } } +EXPORT_SYMBOL(iochk_lock); EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); EXPORT_SYMBOL(iochk_devices); /* for MCA driver */ Index: linux-2.6.11.11/include/asm-ia64/io.h =================================================================== --- linux-2.6.11.11.orig/include/asm-ia64/io.h +++ linux-2.6.11.11/include/asm-ia64/io.h @@ -73,6 +73,7 @@ extern unsigned int num_io_spaces; #ifdef CONFIG_IOMAP_CHECK #include +#include /* definition of ia64 iocookie */ struct __iocookie { @@ -83,6 +84,8 @@ struct __iocookie { }; typedef struct __iocookie iocookie; +extern rwlock_t iochk_lock; /* see arch/ia64/lib/iomap_check.c */ + /* enable ia64 iochk - See arch/ia64/lib/iomap_check.c */ #define HAVE_ARCH_IOMAP_CHECK @@ -211,10 +214,13 @@ ___ia64_inb (unsigned long port) { volatile unsigned char *addr = __ia64_mk_io_addr(port); unsigned char ret; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); ret = *addr; __ia64_mf_a(); ia64_poison_check(ret); + read_unlock_irqrestore(&iochk_lock,flags); return ret; } @@ -224,10 +230,13 @@ ___ia64_inw (unsigned long port) { volatile unsigned short *addr = __ia64_mk_io_addr(port); unsigned short ret; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); ret = *addr; __ia64_mf_a(); ia64_poison_check(ret); + 
read_unlock_irqrestore(&iochk_lock,flags); return ret; } @@ -237,10 +246,13 @@ ___ia64_inl (unsigned long port) { volatile unsigned int *addr = __ia64_mk_io_addr(port); unsigned int ret; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); ret = *addr; __ia64_mf_a(); ia64_poison_check(ret); + read_unlock_irqrestore(&iochk_lock,flags); return ret; } @@ -405,9 +417,12 @@ static inline unsigned char ___ia64_readb (const volatile void __iomem *addr) { unsigned char val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned char __force *)addr; ia64_poison_check(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } @@ -416,9 +431,12 @@ static inline unsigned short ___ia64_readw (const volatile void __iomem *addr) { unsigned short val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned short __force *)addr; ia64_poison_check(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } @@ -427,9 +445,12 @@ static inline unsigned int ___ia64_readl (const volatile void __iomem *addr) { unsigned int val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned int __force *) addr; ia64_poison_check(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } @@ -438,9 +459,12 @@ static inline unsigned long ___ia64_readq (const volatile void __iomem *addr) { unsigned long val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned long __force *) addr; ia64_poison_check(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } Index: linux-2.6.11.11/arch/ia64/kernel/mca.c =================================================================== --- linux-2.6.11.11.orig/arch/ia64/kernel/mca.c +++ linux-2.6.11.11/arch/ia64/kernel/mca.c @@ -81,7 +81,7 @@ #include extern void notify_bridge_error(struct pci_dev *bridge); extern void save_bridge_error(void); -extern spinlock_t iochk_lock; +extern rwlock_t 
iochk_lock;
 #endif
 
 #if defined(IA64_MCA_DEBUG_INFO)
@@ -306,10 +306,10 @@ ia64_mca_cpe_int_handler (int cpe_irq, v
 	 * the states from changing by any other I/Os running simultaneously,
 	 * so this should be handled w/ lock and interrupts disabled.
 	 */
-	spin_lock(&iochk_lock);
+	write_lock(&iochk_lock);
 	save_bridge_error();
 	ia64_mca_log_sal_error_record(SAL_INFO_TYPE_CPE);
-	spin_unlock(&iochk_lock);
+	write_unlock(&iochk_lock);
 
 	/* Rests can go w/ interrupt enabled as usual */
 	local_irq_enable();

From seto.hidetoshi at jp.fujitsu.com Thu Jun 9 22:39:11 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Thu, 09 Jun 2005 21:39:11 +0900
Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting
Message-ID: <42A8386F.2060100@jp.fujitsu.com>

Hi, long time no see :-D

This is a continuation of a post from quite a while ago:
"[PATCH/RFC] I/O-check interface for driver's error handling"

Reflecting all the comments, I brushed up the generic part of my patch.
So today I'll post it again, and also post the "ia64 part", which
implements the ia64-specific error checking. I think the latter
will serve as a sample of a basic implementation for other archs.

The patch is divided into 10 parts, as a patch series:
  iochk-01-generic.patch
  iochk-02-ia64.patch
  iochk-03-register.patch
  iochk-04-register_bridge.patch
  iochk-05-check_bridge.patch
  iochk-06-mcanotify.patch
  iochk-07-poison.patch
  iochk-08-mcadrv.patch
  iochk-09-cpeh.patch
  iochk-10-rwlock.patch

Only "01" is generic; the remaining 9 parts are for ia64.
Since parts 02 to 05 are used to construct basic PCI-based checking,
they are not very arch-specific even though they live in /arch/ia64.

I think the generic part is almost complete. It is fine if only
"01" is accepted, because that would be a good start for other
archs that are interested in I/O error recovery infrastructure.

Of course, I'd appreciate it if all of them could be accepted.
I would like comments, especially on the ia64 parts.
All comments are welcome.
Thanks,
H.Seto

-----
* ...the following is the "abstract" copied from my previous post.
  If you already know it, skip it:
-----

Currently, I/O error is not a leading cause of system failure.
However, since Linux nowadays is making great progress on its
scalability, and an ever larger number of PCI devices are being
connected to a single high-performance server, the risk of I/O
errors is increasing day by day.

For example, PCI parity error is one of the most common errors
in the hardware world. However, the major cause of parity errors
is not a hardware fault but a soft error (low voltage, humidity,
natural radiation, etc.). Even so, some platforms are sensitive
enough to parity errors to shut down the system immediately on
such an error.

So if device drivers can retry a transaction once it results in
an error, we can reduce the risk of I/O errors.

So I'd like to suggest new interfaces that enable drivers to
check (detect errors) and retry their I/O transactions easily.

Previously I had posted two prototypes to LKML:
1) readX_check() interface
   Added a new kind of basic readX(), which returns the result of
   the I/O. But it would not make sense for a device driver to have
   to check and react after each I/O.
2) clear/read_pci_errors() interface
   Added a new pair of interfaces to sandwich I/Os. It makes sense
   that a device driver can adjust the number of I/Os being checked
   and can react to all of them at once. However, this was not
   generalized, so I thought that a more expandable design would be
   required.

Today's patch is the 3rd one - the iochk_clear/read() interface.
- This also adds a pair of interfaces, but not only to sandwich
  readX(). Depending on the platform, ioreadX(), inX(), writeX()
  if possible... and so on could be targets of error checking.
- Additionally, it adds a special token - an abstract "iocookie"
  structure - to control/identify/manage I/Os, by passing it to
  the OS. The actual type of "iocookie" could be arch-specific.
  Device drivers could use the iocookie structure without knowing
  its details.
Expected usage(sample) is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include #include int sample_read_with_iochk(struct pci_dev *dev, u32 *buf, int words) { unsigned long ofs = pci_resource_start(dev, 0) + DATA_OFFSET; int i; /* Create magical cookie on the stack */ iocookie cookie; /* Critical section start */ iochk_clear(&dev, &cookie); { /* Get the whole packet of data */ for (i = 0; i < words; i++) *buf++ = ioread32(dev, ofs); } /* Critical section end. Did we have any trouble? */ if ( iochk_read(&cookie) ) return -1; /* OK, all system go. */ return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If arch doesn't(or cannot) have its io-checking strategy, these interfaces don't do anything, so driver maintainer can write their driver code with these interfaces for all arch, even where checking is not implemented. From moilanen at austin.ibm.com Fri Jun 10 00:31:12 2005 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Thu, 9 Jun 2005 09:31:12 -0500 Subject: [PATCH] PCI device-node failure detection In-Reply-To: <1117582676.5826.66.camel@gaston> References: <20050531101225.510efbf7.moilanen@austin.ibm.com> <20050531195127.GD3723@otto> <20050531165507.44dba54a.moilanen@austin.ibm.com> <1117582676.5826.66.camel@gaston> Message-ID: <20050609093112.21e9775c.moilanen@austin.ibm.com> > Besides, if you really want to export it, considering that it's > "standard" enough to be in generic code, then it should be rather called > something like. > > int of_device_failed(...) > > And finally, i'd rather have it backward, that is something like > of_device_available(). I finally got around to fixing this up. 
Signed-off-by: Jake Moilanen Index: 2.6.12-maui/arch/ppc64/kernel/pSeries_pci.c =================================================================== --- 2.6.12-maui.orig/arch/ppc64/kernel/pSeries_pci.c 2005-06-02 14:47:20.000000000 -0500 +++ 2.6.12-maui/arch/ppc64/kernel/pSeries_pci.c 2005-06-09 13:54:06.000000000 -0500 @@ -62,6 +62,21 @@ return 0; } +static int of_device_available(struct device_node * dn) +{ + char * status; + + status = get_property(dn, "status", NULL); + + if (!status) + return 1; + + if (!strcmp(status, "okay")) + return 1; + + return 0; +} + static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) { int returnval = -1; @@ -107,7 +122,7 @@ /* Search only direct children of the bus */ for (dn = busdn->child; dn; dn = dn->sibling) - if (dn->devfn == devfn) + if (dn->devfn == devfn && of_device_available(dn)) return rtas_read_config(dn, where, size, val); return PCIBIOS_DEVICE_NOT_FOUND; } @@ -150,7 +165,7 @@ /* Search only direct children of the bus */ for (dn = busdn->child; dn; dn = dn->sibling) - if (dn->devfn == devfn) + if (dn->devfn == devfn && of_device_available(dn)) return rtas_write_config(dn, where, size, val); return PCIBIOS_DEVICE_NOT_FOUND; } From greg at kroah.com Fri Jun 10 02:53:53 2005 From: greg at kroah.com (Greg KH) Date: Thu, 9 Jun 2005 09:53:53 -0700 Subject: [PATCH 01/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A83A8F.9020503@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> <42A83A8F.9020503@jp.fujitsu.com> Message-ID: <20050609165353.GB9597@kroah.com> On Thu, Jun 09, 2005 at 09:48:15PM +0900, Hidetoshi Seto wrote: > --- linux-2.6.11.11.orig/lib/iomap.c > +++ linux-2.6.11.11/lib/iomap.c > @@ -210,3 +210,29 @@ void pci_iounmap(struct pci_dev *dev, vo > } > EXPORT_SYMBOL(pci_iomap); > EXPORT_SYMBOL(pci_iounmap); > + > +/* > + * Clear/Read iocookie to check IO error while using iomap. 
> + * > + * Note that default iochk_clear-read pair interfaces don't have > + * any effective error check, but some high-reliable platforms > + * would provide useful information to you. > + * And note that some action may be limited (ex. irq-unsafe) > + * between the pair depend on the facility of the platform. > + */ > +#ifndef HAVE_ARCH_IOMAP_CHECK > +void iochk_init(void) { ; } > + > +void iochk_clear(iocookie *cookie, struct pci_dev *dev) > +{ > + /* no-ops */ > +} A bit of a coding style difference between the two functions, yet they do the same thing :) > + > +int iochk_read(iocookie *cookie) > +{ > + /* no-ops */ > + return 0; > +} Why not just return the cookie? Can this ever fail? Shouldn't these go into a .h file and be made "static inline" so they just compile away to nothing? > +EXPORT_SYMBOL(iochk_clear); > +EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL_GPL() perhaps? > +#endif /* HAVE_ARCH_IOMAP_CHECK */ > Index: linux-2.6.11.11/include/asm-generic/iomap.h > =================================================================== > --- linux-2.6.11.11.orig/include/asm-generic/iomap.h > +++ linux-2.6.11.11/include/asm-generic/iomap.h > @@ -60,4 +60,20 @@ struct pci_dev; > extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long > max); > extern void pci_iounmap(struct pci_dev *dev, void __iomem *); > > +/* > + * IOMAP_CHECK provides additional interfaces for drivers to detect > + * some IO errors, supports drivers having ability to recover errors. > + * > + * All works around iomap-check depends on the design of "iocookie" > + * structure. Every architecture owning its iomap-check is free to > + * define the actual design of iocookie to fit its special style. > + */ > +#ifndef HAVE_ARCH_IOMAP_CHECK > +typedef unsigned long iocookie; > +#endif Why typedef this if it isn't specified? 
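
[Illustration, not code from the thread] The "static inline" suggestion above could look roughly like the header fragment below. This is a sketch under assumptions: it reuses the iochk_init/iochk_clear/iochk_read names and the generic `unsigned long` iocookie from the quoted patch, and only forward-declares struct pci_dev so the fragment builds on its own outside the kernel.

```c
/* Sketch of a header-only fallback; the layout is assumed, not from the patch. */
struct pci_dev;				/* opaque: the stubs never dereference it */

#ifndef HAVE_ARCH_IOMAP_CHECK
typedef unsigned long iocookie;		/* generic definition from the quoted patch */

/* No-op stubs: as static inlines the calls compile away entirely. */
static inline void iochk_init(void)
{
}

static inline void iochk_clear(iocookie *cookie, struct pci_dev *dev)
{
	(void)cookie;
	(void)dev;
}

static inline int iochk_read(iocookie *cookie)
{
	(void)cookie;
	return 0;			/* no checking available: never report an error */
}
#endif /* HAVE_ARCH_IOMAP_CHECK */
```

With stubs like these there would be nothing left in lib/iomap.c to export for the generic case, which also sidesteps the EXPORT_SYMBOL question for that configuration.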
thanks,

greg k-h

From greg at kroah.com Fri Jun 10 02:57:19 2005
From: greg at kroah.com (Greg KH)
Date: Thu, 9 Jun 2005 09:57:19 -0700
Subject: [PATCH 03/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42A83B6D.8010703@jp.fujitsu.com>
References: <42A8386F.2060100@jp.fujitsu.com> <42A83B6D.8010703@jp.fujitsu.com>
Message-ID: <20050609165719.GC9597@kroah.com>

On Thu, Jun 09, 2005 at 09:51:57PM +0900, Hidetoshi Seto wrote:
> +	if( cookie->error || have_error(cookie->dev) )

This should be written as:
	if (cookie->error || have_error(cookie->dev))
instead (note the placement of spaces).

>  /* definition of ia64 iocookie */
> -typedef unsigned long iocookie;
> +struct __iocookie {
> +	struct list_head	list;
> +	struct pci_dev		*dev;	/* targeting device */
> +	unsigned long		error;	/* error flag */
> +};
> +typedef struct __iocookie iocookie;

Hm, why not just make the thing be a "struct iocookie" in the first
place, then we don't have to mess with a typedef at all. And then each
arch can define what the structure will look like in their private .c
files, ensuring that no user can ever try to touch the structure
themselves.

thanks,

greg k-h

From greg at kroah.com Fri Jun 10 02:57:58 2005
From: greg at kroah.com (Greg KH)
Date: Thu, 9 Jun 2005 09:57:58 -0700
Subject: [PATCH 04/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42A83BC8.2010500@jp.fujitsu.com>
References: <42A8386F.2060100@jp.fujitsu.com> <42A83BC8.2010500@jp.fujitsu.com>
Message-ID: <20050609165758.GD9597@kroah.com>

On Thu, Jun 09, 2005 at 09:53:28PM +0900, Hidetoshi Seto wrote:
> +	/* there is no bridge */
> +	if (!dev->bus->self) return NULL;

Put the "return NULL;" on its own line please.
thanks, greg k-h From greg at kroah.com Fri Jun 10 02:59:28 2005 From: greg at kroah.com (Greg KH) Date: Thu, 9 Jun 2005 09:59:28 -0700 Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <20050609165928.GE9597@kroah.com> On Thu, Jun 09, 2005 at 09:39:11PM +0900, Hidetoshi Seto wrote: > Hi, long time no see :-D > > This is a continuation of previous post quite a while ago: > "[PATCH/RFC] I/O-check interface for driver's error handling" > > Reflecting every comments, I brushed up my patch for generic part. > So today I'll post it again, and also post "ia64 part", which > surely implements ia64-arch specific error checking. I think > latter will be a sample of basic implement for other arch. Overall, the idea and implementation looks very nice. I just had a few comments on the code style and how you implemented the .h and .c files. Good job. thanks, greg k-h From matthew at wil.cx Fri Jun 10 03:13:32 2005 From: matthew at wil.cx (Matthew Wilcox) Date: Thu, 9 Jun 2005 18:13:32 +0100 Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <20050609171332.GC24611@parcelfarce.linux.theplanet.co.uk> On Thu, Jun 09, 2005 at 09:39:11PM +0900, Hidetoshi Seto wrote: > Previously I had post two prototypes to LKML: > 1) readX_check() interface > Added new kin of basic readX(), which returns its result of > I/O. But, it would not make sense that device driver have to > check and react after each of I/Os. > 2) clear/read_pci_errors() interface > Added new pair-interface to sandwich I/Os. It makes sense that > device driver can adjust the number of checking I/Os and can > react all of them at once. However, this was not generalized, > so I thought that more expandable design would be required. 
> > Today's patch is 3rd one - iochk_clear/read() interface. > - This also adds pair-interface, but not to sandwich only readX(). > Depends on platform, starting with ioreadX(), inX(), writeX() > if possible... and so on could be target of error checking. It makes sense to sandwich other kinds of device accesses. I don't think the previous clear/read_pci_errors() interface was intended *only* to sandwich readX(). > - Additionally adds special token - abstract "iocookie" structure > to control/identifies/manage I/Os, by passing it to OS. > Actual type of "iocookie" could be arch-specific. Device drivers > could use the iocookie structure without knowing its detail. I'm not sure we need this. Surely it can be deduced from the pci_dev or struct device? > Expected usage(sample) is: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > #include > #include > > int sample_read_with_iochk(struct pci_dev *dev, u32 *buf, int words) > { > unsigned long ofs = pci_resource_start(dev, 0) + DATA_OFFSET; > int i; > > /* Create magical cookie on the stack */ > iocookie cookie; > > /* Critical section start */ > iochk_clear(&dev, &cookie); > { > /* Get the whole packet of data */ > for (i = 0; i < words; i++) > *buf++ = ioread32(dev, ofs); You do know that ioread32() doesn't take a pci_dev, right? I hope you weren't counting on that for the rest of your implementation. > } > /* Critical section end. Did we have any trouble? */ > if ( iochk_read(&cookie) ) return -1; > > /* OK, all system go. */ > return 0; > } -- "Next the statesmen will invent cheap lies, putting the blame upon the nation that is attacked, and every man will be glad of those conscience-soothing falsities, and will diligently study them, and refuse to examine any refutations of them; and thus he will by and by convince himself that the war is just, and will thank God for the better sleep he enjoys after this process of grotesque self-deception." 
-- Mark Twain From matthew at wil.cx Fri Jun 10 03:20:35 2005 From: matthew at wil.cx (Matthew Wilcox) Date: Thu, 9 Jun 2005 18:20:35 +0100 Subject: [PATCH 03/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A83B6D.8010703@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> <42A83B6D.8010703@jp.fujitsu.com> Message-ID: <20050609172035.GD24611@parcelfarce.linux.theplanet.co.uk> On Thu, Jun 09, 2005 at 09:51:57PM +0900, Hidetoshi Seto wrote: > + switch (dev->hdr_type) { > + case PCI_HEADER_TYPE_NORMAL: /* 0 */ > + pci_read_config_word(dev, PCI_STATUS, &status); > + break; > + case PCI_HEADER_TYPE_BRIDGE: /* 1 */ > + pci_read_config_word(dev, PCI_SEC_STATUS, &status); > + break; > + case PCI_HEADER_TYPE_CARDBUS: /* 2 */ > + default: > + BUG(); If somebody plugs a cardbus card into an ia64 machine, we BUG()? Unacceptable. Just return 0 if you don't know what to do with a particular device. -- "Next the statesmen will invent cheap lies, putting the blame upon the nation that is attacked, and every man will be glad of those conscience-soothing falsities, and will diligently study them, and refuse to examine any refutations of them; and thus he will by and by convince himself that the war is just, and will thank God for the better sleep he enjoys after this process of grotesque self-deception." -- Mark Twain From matthew at wil.cx Fri Jun 10 03:34:33 2005 From: matthew at wil.cx (Matthew Wilcox) Date: Thu, 9 Jun 2005 18:34:33 +0100 Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A8386F.2060100@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> Message-ID: <20050609173433.GE24611@parcelfarce.linux.theplanet.co.uk> On Thu, Jun 09, 2005 at 09:39:11PM +0900, Hidetoshi Seto wrote: > Reflecting every comments, I brushed up my patch for generic part. > So today I'll post it again, and also post "ia64 part", which > surely implements ia64-arch specific error checking. 
I think > latter will be a sample of basic implement for other arch. I think this is the wrong way to go about it. For PCI Express, we have a defined cross-architecture standard which tells us exactly how all future PCIe devices will behave in the face of errors. For PCI and PCI-X, we have a lot of legacy systems, each of which implements error checking and recovery in a somewhat eclectic way. So, IMO, any implementation of PCI error recovery should start by implementing the PCI Express AER mechanisms and then each architecture can look at extending that scheme to fit their own legacy hardware systems. That way we have a clean implementation for the future rather than being tied to any one manufacturer or architecture's quirks. Also, we can evaluate it based on looking at what the standard says, rather than all trying to wrap our brains around the idiosyncracies of a given platform ;-) -- "Next the statesmen will invent cheap lies, putting the blame upon the nation that is attacked, and every man will be glad of those conscience-soothing falsities, and will diligently study them, and refuse to examine any refutations of them; and thus he will by and by convince himself that the war is just, and will thank God for the better sleep he enjoys after this process of grotesque self-deception." -- Mark Twain From davidm at napali.hpl.hp.com Fri Jun 10 03:40:56 2005 From: davidm at napali.hpl.hp.com (David Mosberger) Date: Thu, 9 Jun 2005 10:40:56 -0700 Subject: [PATCH 07/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A83CF2.90304@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> <42A83CF2.90304@jp.fujitsu.com> Message-ID: <17064.32552.507932.62892@napali.hpl.hp.com> Hidetoshi, >>>>> On Thu, 09 Jun 2005 21:58:26 +0900, Hidetoshi Seto said: Hidetoshi> +/* Hidetoshi> + * Some I/O bridges may poison the data read, instead of Hidetoshi> + * signaling a BERR. The consummation of poisoned data Hidetoshi> + * triggers a local, imprecise MCA. 
Hidetoshi> + * Note that the read operation by itself does not consume Hidetoshi> + * the bad data, you have to do something with it, e.g.: Hidetoshi> + * Hidetoshi> + * ld.8 r9=[r10];; // r10 == I/O address Hidetoshi> + * add.8 r8=r9,r9;; // fake operation Hidetoshi> + */ Hidetoshi> +#define ia64_poison_check(val) \ Hidetoshi> +{ register unsigned long gr8 asm("r8"); \ Hidetoshi> + asm volatile ("add %0=%1,r0" : "=r"(gr8) : "r"(val)); } Hidetoshi> + Hidetoshi> #endif /* CONFIG_IOMAP_CHECK */ I have only looked that this briefly and I didn't see off hand where you get the "r9=[r10]" sequence from --- I hope you're not relying on the compiler happening to generate this sequence! More importantly: please avoid inline "asm" and use the intrinsics defined by gcc_intrin.h instead (if you need something new, we can add that), but I think ia64_getreg() will do much of what you want already. Thanks, --david From benh at kernel.crashing.org Fri Jun 10 08:26:39 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 10 Jun 2005 08:26:39 +1000 Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <20050609171332.GC24611@parcelfarce.linux.theplanet.co.uk> References: <42A8386F.2060100@jp.fujitsu.com> <20050609171332.GC24611@parcelfarce.linux.theplanet.co.uk> Message-ID: <1118355999.6850.177.camel@gaston> > It makes sense to sandwich other kinds of device accesses. I don't > think the previous clear/read_pci_errors() interface was intended *only* > to sandwich readX(). On many platforms, only read() is guaranteed to reliably report errors though. > > - Additionally adds special token - abstract "iocookie" structure > > to control/identifies/manage I/Os, by passing it to OS. > > Actual type of "iocookie" could be arch-specific. Device drivers > > could use the iocookie structure without knowing its detail. > > I'm not sure we need this. Surely it can be deduced from the pci_dev or > struct device? 
Might be useful to know more though, wether it was PIO or MMIO or other things. Also, I'd like to carry around the possible error details as can be returned by the firmware in some platforms. In fact, Is there any reason this is not ioerr_cookie instead of iocookie ? :) > > Expected usage(sample) is: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > #include > > #include > > > > int sample_read_with_iochk(struct pci_dev *dev, u32 *buf, int words) > > { > > unsigned long ofs = pci_resource_start(dev, 0) + DATA_OFFSET; > > int i; > > > > /* Create magical cookie on the stack */ > > iocookie cookie; > > > > /* Critical section start */ > > iochk_clear(&dev, &cookie); > > { > > /* Get the whole packet of data */ > > for (i = 0; i < words; i++) > > *buf++ = ioread32(dev, ofs); > > You do know that ioread32() doesn't take a pci_dev, right? I hope you > weren't counting on that for the rest of your implementation. > > > } > > /* Critical section end. Did we have any trouble? */ > > if ( iochk_read(&cookie) ) return -1; > > > > /* OK, all system go. */ > > return 0; > > } > From paulus at samba.org Fri Jun 10 11:41:30 2005 From: paulus at samba.org (Paul Mackerras) Date: Fri, 10 Jun 2005 11:41:30 +1000 Subject: [PATCH] kprobes: fix single-step out of line In-Reply-To: <20050525170159.GA9364@in.ibm.com> References: <20050525170159.GA9364@in.ibm.com> Message-ID: <17064.61386.359039.438550@cargo.ozlabs.ibm.com> Ananth N Mavinakayanahalli writes: > On Power4 and above, single-step out of line when the instruction copy > is on a kmalloc'ed memory area, fails with an Instruction Access > exception. Here is a patch that fixes it. > +static kprobe_opcode_t stepped_insn; Hmmm... you are putting the instruction in a location in the data segment, which may not be mapped executable. You would get away with it if the kernel is mapped with large pages (which is the default) and the kernel text + data fits into 16MB (which I hope would be the case). 
But still, it's not a really clean solution. However, I'm not sure what would be better; you need some storage that is both writable and executable, which we try to avoid having. Paul. From paulus at samba.org Fri Jun 10 12:40:19 2005 From: paulus at samba.org (Paul Mackerras) Date: Fri, 10 Jun 2005 12:40:19 +1000 Subject: last call for ppc64 changes for 2.6.12 Message-ID: <17064.64915.395654.961413@cargo.ozlabs.ibm.com> Is there anything that anyone knows of that absolutely must go in 2.6.12 that isn't in Linus' tree yet? If so, let me know ASAP. Thanks, Paul. From greg at kroah.com Fri Jun 10 14:33:25 2005 From: greg at kroah.com (Greg KH) Date: Thu, 9 Jun 2005 21:33:25 -0700 Subject: [PATCH] libfs: add simple attribute files In-Reply-To: <200505191029.07970.arnd@arndb.de> References: <200505132117.37461.arnd@arndb.de> <200505181441.01495.arnd@arndb.de> <20050518202446.GA20041@kroah.com> <200505191029.07970.arnd@arndb.de> Message-ID: <20050610043325.GA15040@kroah.com> On Thu, May 19, 2005 at 10:29:06AM +0200, Arnd Bergmann wrote: > On Middeweken 18 Mai 2005 22:24, Greg KH wrote: > > > Thanks for the patch. I've cleaned it up a bit (drop the spufs > > comments, changed the access check, and made the val be u64, and > > exported the symbols and cleaned up the debugfs portion) and added it to > > my tree. It should show up in the next -mm release. I've included the > > patch below so you can see my > > changes. > > Great, thanks for cleaning up those mistakes. > > I noticed one small problem with the change from 'long' to 'u64', in > that you did not change it in all places. In particular, using "%lu" to > print a u64 value will always do the wrong thing on big-endian 32 bit > platforms and maybe on some others. > Since 'u64' is '%llu' on most platforms but '%lu' on some 64 bit > platforms, I'd either do explicit cast to unsigned long long in > the printf or use unsigned long long throughout the code. 
> > > void foo_set(void *data, long val); and > ^^ u64 > > long foo_get(void *data); > ^^ u64 > > > +#define DEFINE_SIMPLE_ATTRIBUTE(__fops, __get, __set, __fmt) \ > > +static int __fops ## _open(struct inode *inode, struct file *file) \ > > +{ \ > > + __simple_attr_check_format(__fmt, 0ul); \ > ^^^^ 0ull > > > + else /* first read */ > > + size = scnprintf(attr->get_buf, sizeof(attr->get_buf), > > + attr->fmt, attr->get(attr->data)); > ^^ (unsigned long long) > > > +DEFINE_SIMPLE_ATTRIBUTE(fops_u8, debugfs_u8_get, debugfs_u8_set, "%lu\n"); > > +DEFINE_SIMPLE_ATTRIBUTE(fops_u16, debugfs_u16_get, debugfs_u16_set, "%lu\n"); > > +DEFINE_SIMPLE_ATTRIBUTE(fops_u32, debugfs_u32_get, debugfs_u32_set, "%lu\n"); > %llu ^^^^ > > I also noticed that it is not possible to pass NULL operations to > DEFINE_SIMPLE_ATTRIBUTE() unless you change > > --- a/include/linux/fs.h 2005-05-19 10:17:53.000000000 +0200 > +++ b/include/linux/fs.h 2005-05-19 10:14:57.000000000 +0200 > @@ -1680,7 +1680,7 @@ > static int __fops ## _open(struct inode *inode, struct file *file) \ > { \ > __simple_attr_check_format(__fmt, 0ul); \ > - return simple_attr_open(inode, file, &__get, &__set, __fmt); \ > + return simple_attr_open(inode, file, __get, __set, __fmt); \ > } \ > static struct file_operations __fops = { \ > .owner = THIS_MODULE, \ > > I'm currently away from my test machine, so I think it's easier if you > just update your patch yourself, but I could also send you an update > patch later if you prefer. Thanks for the updates, I've made them by hand to the patch, and will show up in the next -mm release. thanks again, greg k-h From sfr at canb.auug.org.au Fri Jun 10 14:57:04 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 10 Jun 2005 14:57:04 +1000 Subject: [PATCH] ppc64: tidy up vio devices fake parent Message-ID: <20050610145704.64311e08.sfr@canb.auug.org.au> Hi Andrew, Currently we dynamically allocate the fake parent device for all devices on the vio bus. 
This patch statically allocates it. This also allows us to reuse it for the iSeries "generic" vio device (that is used for passing to dma routines when communicating with the hypervisor without a device involved). Also unexport vio_bus_type as it is never used in modules. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus/arch/ppc64/kernel/vio.c linus-dma_bypass.1/arch/ppc64/kernel/vio.c --- linus/arch/ppc64/kernel/vio.c 2005-05-20 09:03:14.000000000 +1000 +++ linus-dma_bypass.1/arch/ppc64/kernel/vio.c 2005-06-09 23:19:27.000000000 +1000 @@ -41,20 +41,25 @@ static const struct vio_device_id *vio_m static struct iommu_table *vio_build_iommu_table(struct vio_dev *); static int vio_num_address_cells; #endif -static struct vio_dev *vio_bus_device; /* fake "parent" device */ - #ifdef CONFIG_PPC_ISERIES -static struct vio_dev *__init vio_register_device_iseries(char *type, - uint32_t unit_num); - static struct iommu_table veth_iommu_table; static struct iommu_table vio_iommu_table; - -static struct vio_dev _vio_dev = { +#endif +static struct vio_dev vio_bus_device = { /* fake "parent" device */ + .name = vio_bus_device.dev.bus_id, + .type = "", +#ifdef CONFIG_PPC_ISERIES .iommu_table = &vio_iommu_table, - .dev.bus = &vio_bus_type +#endif + .dev.bus_id = "vio", + .dev.bus = &vio_bus_type, }; -struct device *iSeries_vio_dev = &_vio_dev.dev; + +#ifdef CONFIG_PPC_ISERIES +static struct vio_dev *__init vio_register_device_iseries(char *type, + uint32_t unit_num); + +struct device *iSeries_vio_dev = &vio_bus_device.dev; EXPORT_SYMBOL(iSeries_vio_dev); #define device_is_compatible(a, b) 1 @@ -260,18 +265,10 @@ static int __init vio_bus_init(void) } /* the fake parent of all vio devices, just to give us a nice directory */ - vio_bus_device = kmalloc(sizeof(struct vio_dev), GFP_KERNEL); - if (!vio_bus_device) { - return 1; - } - memset(vio_bus_device, 0, sizeof(struct vio_dev)); - 
strcpy(vio_bus_device->dev.bus_id, "vio"); - - err = device_register(&vio_bus_device->dev); + err = device_register(&vio_bus_device.dev); if (err) { printk(KERN_WARNING "%s: device_register returned %i\n", __FUNCTION__, err); - kfree(vio_bus_device); return err; } @@ -326,7 +323,7 @@ static struct vio_dev * __devinit vio_re viodev->unit_address = unit_address; viodev->iommu_table = iommu_table; /* init generic 'struct device' fields: */ - viodev->dev.parent = &vio_bus_device->dev; + viodev->dev.parent = &vio_bus_device.dev; viodev->dev.bus = &vio_bus_type; viodev->dev.release = vio_dev_release; @@ -636,5 +633,3 @@ struct bus_type vio_bus_type = { .name = "vio", .match = vio_bus_match, }; - -EXPORT_SYMBOL(vio_bus_type); From seto.hidetoshi at jp.fujitsu.com Fri Jun 10 20:29:05 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Fri, 10 Jun 2005 19:29:05 +0900 Subject: [PATCH 01/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <20050609165353.GB9597@kroah.com> References: <42A8386F.2060100@jp.fujitsu.com> <42A83A8F.9020503@jp.fujitsu.com> <20050609165353.GB9597@kroah.com> Message-ID: <42A96B71.3080006@jp.fujitsu.com> Hi Greg, Thank you for giving me many useful advices! Greg KH wrote: > On Thu, Jun 09, 2005 at 09:48:15PM +0900, Hidetoshi Seto wrote: > >>+void iochk_init(void) { ; } >>+ >>+void iochk_clear(iocookie *cookie, struct pci_dev *dev) >>+{ >>+ /* no-ops */ >>+} > > A bit of a coding style difference between the two functions, yet they > do the same thing :) I intended to emphasize the pair. I'll unify them if not needed. >>+int iochk_read(iocookie *cookie) >>+{ >>+ /* no-ops */ >>+ return 0; >>+} > > Why not just return the cookie? Can this ever fail? In this time, no one initializes the cookie, so I just ignored it. > Shouldn't these go into a .h file and be made "static inline" so they > just compile away to nothing? I'm not used to inlining... In case of generic definition above, absolutely it should be inlined. 
OK, I'll try. >>+EXPORT_SYMBOL(iochk_clear); >>+EXPORT_SYMBOL(iochk_read); > > EXPORT_SYMBOL_GPL() perhaps? Yea. >>+#ifndef HAVE_ARCH_IOMAP_CHECK >>+typedef unsigned long iocookie; >>+#endif > > Why typedef this if it isn't specified? Because I stuck to have short name alias, and wanted to hide even whether it is struct or not. Thanks, H.Seto From seto.hidetoshi at jp.fujitsu.com Fri Jun 10 20:30:24 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Fri, 10 Jun 2005 19:30:24 +0900 Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <20050609171332.GC24611@parcelfarce.linux.theplanet.co.uk> References: <42A8386F.2060100@jp.fujitsu.com> <20050609171332.GC24611@parcelfarce.linux.theplanet.co.uk> Message-ID: <42A96BC0.9000505@jp.fujitsu.com> Matthew Wilcox wrote: >>Today's patch is 3rd one - iochk_clear/read() interface. >>- This also adds pair-interface, but not to sandwich only readX(). >> Depends on platform, starting with ioreadX(), inX(), writeX() >> if possible... and so on could be target of error checking. > > It makes sense to sandwich other kinds of device accesses. I don't > think the previous clear/read_pci_errors() interface was intended *only* > to sandwich readX(). At least there was _me_ who actually intended that... :-p Thank you for being so understanding. >>- Additionally adds special token - abstract "iocookie" structure >> to control/identifies/manage I/Os, by passing it to OS. >> Actual type of "iocookie" could be arch-specific. Device drivers >> could use the iocookie structure without knowing its detail. > > I'm not sure we need this. Surely it can be deduced from the pci_dev or > struct device? Once I prepared a cookie per a device, added it into pci_dev. But one of our NIC driver folks pointed out that it was hard to handle because there could be many contexts/threads riding on one device at same time. So I reconsidered it and now come to "a cookie per a context" style. 
>>           *buf++ = ioread32(dev, ofs);
>
> You do know that ioread32() doesn't take a pci_dev, right? I hope you
> weren't counting on that for the rest of your implementation.

Oops. It's just my typo. Please ignore it.

Thanks,
H.Seto

From seto.hidetoshi at jp.fujitsu.com Fri Jun 10 20:31:14 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Fri, 10 Jun 2005 19:31:14 +0900
Subject: [PATCH 03/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <20050609165719.GC9597@kroah.com>
References: <42A8386F.2060100@jp.fujitsu.com> <42A83B6D.8010703@jp.fujitsu.com> <20050609165719.GC9597@kroah.com>
Message-ID: <42A96BF2.7070203@jp.fujitsu.com>

Greg KH wrote:
>> /* definition of ia64 iocookie */
>>-typedef unsigned long iocookie;
>>+struct __iocookie {
>>+	struct list_head list;
>>+	struct pci_dev *dev;	/* targeting device */
>>+	unsigned long error;	/* error flag */
>>+};
>>+typedef struct __iocookie iocookie;
>
> Hm, why not just make the thing be a "struct iocookie" in the first
> place, then we don't have to mess with a typedef at all. And then each
> arch can define how the structure will look like in their private .c
> files, ensuring that no user can ever try to touch the structure
> themselves.

Aha.., maybe I understand it just now.
I don't know why, but I just stuck to typedef...

Thanks,
H.Seto

From seto.hidetoshi at jp.fujitsu.com Fri Jun 10 20:31:21 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Fri, 10 Jun 2005 19:31:21 +0900
Subject: [PATCH 03/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <20050609172035.GD24611@parcelfarce.linux.theplanet.co.uk>
References: <42A8386F.2060100@jp.fujitsu.com> <42A83B6D.8010703@jp.fujitsu.com> <20050609172035.GD24611@parcelfarce.linux.theplanet.co.uk>
Message-ID: <42A96BF9.2050608@jp.fujitsu.com>

Matthew Wilcox wrote:
> On Thu, Jun 09, 2005 at 09:51:57PM +0900, Hidetoshi Seto wrote:
>
>>+	switch (dev->hdr_type) {
>>+		case PCI_HEADER_TYPE_NORMAL: /* 0 */
>>+			pci_read_config_word(dev, PCI_STATUS, &status);
>>+			break;
>>+		case PCI_HEADER_TYPE_BRIDGE: /* 1 */
>>+			pci_read_config_word(dev, PCI_SEC_STATUS, &status);
>>+			break;
>>+		case PCI_HEADER_TYPE_CARDBUS: /* 2 */
>>+		default:
>>+			BUG();
>
> If somebody plugs a cardbus card into an ia64 machine, we BUG()?
> Unacceptable. Just return 0 if you don't know what to do with a
> particular device.

Sure, you are right. I'll fix it.

Thanks,
H.Seto

From seto.hidetoshi at jp.fujitsu.com Fri Jun 10 20:31:33 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Fri, 10 Jun 2005 19:31:33 +0900
Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <1118355999.6850.177.camel@gaston>
References: <42A8386F.2060100@jp.fujitsu.com> <20050609171332.GC24611@parcelfarce.linux.theplanet.co.uk> <1118355999.6850.177.camel@gaston>
Message-ID: <42A96C05.4090301@jp.fujitsu.com>

Hi Ben,

Benjamin Herrenschmidt wrote:
>>>- Additionally adds special token - abstract "iocookie" structure
>>>  to control/identifies/manage I/Os, by passing it to OS.
>>>  Actual type of "iocookie" could be arch-specific. Device drivers
>>>  could use the iocookie structure without knowing its detail.
>>
>>I'm not sure we need this. Surely it can be deduced from the pci_dev or
>>struct device?
> > Might be useful to know more though, wether it was PIO or MMIO or other > things. Also, I'd like to carry around the possible error details as can > be returned by the firmware in some platforms. > > In fact, Is there any reason this is not ioerr_cookie instead of > iocookie ? :) To be honest, No :) Or is there any reason to limit use of this cookie only for errors? Thanks, H.Seto From seto.hidetoshi at jp.fujitsu.com Fri Jun 10 20:32:05 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Fri, 10 Jun 2005 19:32:05 +0900 Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <20050609173433.GE24611@parcelfarce.linux.theplanet.co.uk> References: <42A8386F.2060100@jp.fujitsu.com> <20050609173433.GE24611@parcelfarce.linux.theplanet.co.uk> Message-ID: <42A96C25.9050903@jp.fujitsu.com> Matthew Wilcox wrote: > On Thu, Jun 09, 2005 at 09:39:11PM +0900, Hidetoshi Seto wrote: > >>Reflecting every comments, I brushed up my patch for generic part. >>So today I'll post it again, and also post "ia64 part", which >>surely implements ia64-arch specific error checking. I think >>latter will be a sample of basic implement for other arch. > > I think this is the wrong way to go about it. For PCI Express, we > have a defined cross-architecture standard which tells us exactly how > all future PCIe devices will behave in the face of errors. For PCI and > PCI-X, we have a lot of legacy systems, each of which implements error > checking and recovery in a somewhat eclectic way. > > So, IMO, any implementation of PCI error recovery should start by > implementing the PCI Express AER mechanisms and then each architecture can > look at extending that scheme to fit their own legacy hardware systems. > That way we have a clean implementation for the future rather than being > tied to any one manufacturer or architecture's quirks. 
> > Also, we can evaluate it based on looking at what the standard says, > rather than all trying to wrap our brains around the idiosyncracies of > a given platform ;-) All right, please take it a example of approach from legacy-side. Already there are good working group, includes Linas, BenH, and Long. They are also implementing some PCI error recovery codes (currently setting home to ppc64), and I know their wonderful works are more PCI Express friendly than my mysterious ia64 works :-) However, I also know that it doesn't mean my works were useless. Since there is a notable difference between their asynchronous error recovery and my synchronous error detecting, both could live in coexistence with each other. How cooperate with is interesting coming agenda, I think. Thanks, H.Seto From seto.hidetoshi at jp.fujitsu.com Fri Jun 10 20:29:58 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Fri, 10 Jun 2005 19:29:58 +0900 Subject: [PATCH 07/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <17064.32552.507932.62892@napali.hpl.hp.com> References: <42A8386F.2060100@jp.fujitsu.com> <42A83CF2.90304@jp.fujitsu.com> <17064.32552.507932.62892@napali.hpl.hp.com> Message-ID: <42A96BA6.1070300@jp.fujitsu.com> Hi David, David Mosberger wrote: >>>>>>On Thu, 09 Jun 2005 21:58:26 +0900, Hidetoshi Seto said: > > Hidetoshi> +/* > Hidetoshi> + * Some I/O bridges may poison the data read, instead of > Hidetoshi> + * signaling a BERR. The consummation of poisoned data > Hidetoshi> + * triggers a local, imprecise MCA. 
> Hidetoshi> + * Note that the read operation by itself does not consume > Hidetoshi> + * the bad data, you have to do something with it, e.g.: > Hidetoshi> + * > Hidetoshi> + * ld.8 r9=[r10];; // r10 == I/O address > Hidetoshi> + * add.8 r8=r9,r9;; // fake operation > Hidetoshi> + */ > Hidetoshi> +#define ia64_poison_check(val) \ > Hidetoshi> +{ register unsigned long gr8 asm("r8"); \ > Hidetoshi> + asm volatile ("add %0=%1,r0" : "=r"(gr8) : "r"(val)); } > Hidetoshi> + > Hidetoshi> #endif /* CONFIG_IOMAP_CHECK */ > > I have only looked that this briefly and I didn't see off hand where you get > the "r9=[r10]" sequence from --- I hope you're not relying on the compiler > happening to generate this sequence! +static inline unsigned char +___ia64_readb (const volatile void __iomem *addr) +{ + unsigned char val; + + val = *(volatile unsigned char __force *)addr; + ia64_poison_check(val); + + return val; +} Assigning value from addr to variable val stands for "ld", is it right? What I want to do here is making sure that ld actually finishs loading data from memory or mmaped register or far place to general register, and make sure that the data is healthy enough to operate, not poisoned. > More importantly: please avoid inline "asm" and use the intrinsics > defined by gcc_intrin.h instead (if you need something new, we can add > that), but I think ia64_getreg() will do much of what you want already. Umm, I think I need something like ia64_setreg(ANYWHERE_DUMMY_REG, val). How do you think? Or don't you mind if I move the definition of ia64_poison_check above to gcc_intrin.h? 

Thanks,
H.Seto

From ananth at in.ibm.com Fri Jun 10 22:05:55 2005
From: ananth at in.ibm.com (Ananth N Mavinakayanahalli)
Date: Fri, 10 Jun 2005 08:05:55 -0400
Subject: [PATCH] kprobes: fix single-step out of line
In-Reply-To: <17064.61386.359039.438550@cargo.ozlabs.ibm.com>
References: <20050525170159.GA9364@in.ibm.com> <17064.61386.359039.438550@cargo.ozlabs.ibm.com>
Message-ID: <20050610120555.GA4891@in.ibm.com>

On Fri, Jun 10, 2005 at 11:41:30AM +1000, Paul Mackerras wrote:
> Ananth N Mavinakayanahalli writes:

Hi Paul,

> > On Power4 and above, single-step out of line when the instruction copy
> > is on a kmalloc'ed memory area, fails with an Instruction Access
> > exception. Here is a patch that fixes it.
>
> > +static kprobe_opcode_t stepped_insn;
>
> Hmmm... you are putting the instruction in a location in the data
> segment, which may not be mapped executable. You would get away with
> it if the kernel is mapped with large pages (which is the default) and
> the kernel text + data fits into 16MB (which I hope would be the
> case). But still, it's not a really clean solution. However, I'm not
> sure what would be better; you need some storage that is both writable
> and executable, which we try to avoid having.

One option could be to __vmalloc() a page with pgprot = PAGE_KERNEL_EXEC
and use that as a scratch area for stepping probed instructions - similar
to what x86_64 kprobes does currently (though it uses module_alloc() to
handle some special (RIP-relative) instructions).

Suggestions?

Ananth

From david at gibson.dropbear.id.au Fri Jun 10 22:27:26 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Fri, 10 Jun 2005 22:27:26 +1000
Subject: [PATCH] ppc64: do not return from maple halt, power off, restart
In-Reply-To: <200506072143.j57LhsHO017122@localhost.localdomain>
References: <200506072143.j57LhsHO017122@localhost.localdomain>
Message-ID: <20050610122726.GE8546@localhost.localdomain>

On Tue, Jun 07, 2005 at 02:43:54PM -0700, Frank Rowand wrote:
>
> I updated David Gibson's patch to reflect the comments on the list.
>
> I tested this patch on linux-2.6.12-rc6.
>
>
> Benjamin Herrenschmidt wrote:
>
> >>maple_halt() should do the same thing as maple_power_off(). (It could
> >>even just call maple_power_off().)
> >
> >
> > That's debatable... lots of people claim that halt() should just ...
> > halt the kernel and not power off the computer :) I would personally
> > have it do power off, but since that doesn't work ...
>
> I changed maple_halt() to call maple_power_off(), to be consistent
> with the other ppc64 halt functions. If someone changed it to
> just halt, it wouldn't bother me.
>
> >>maple_restart(), maple_power_off(), and maple_halt() should not ever
> >>return. The returns could be replaced with the code from my patch
> >>that started this thread.
> >
> >
> > I think the "return" case should be handled at the toplevel function in
> > setup.c that calls ppc_md.
>
> Good idea. I updated setup.c to catch the return case for restart,
> power down, and halt. I used "#ifdef CONFIG_SMP" instead of adding
> a null smp_send_stop() to include/smp.h. This #ifdef is already
> used in several other places in setup.c.
>
> I also changed the printk()s in David's patch from KERN_INFO to
> KERN_EMERG to match the shutdown messages in kernel/sys.c.
>
> How does this version of the patch look to everyone?

Looks ok to me.

Acked-by: David Gibson

> Signed-off-by: Frank Rowand
> Signed-off-by: David Gibson

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/people/dgibson

From davidm at napali.hpl.hp.com Sat Jun 11 03:25:46 2005
From: davidm at napali.hpl.hp.com (David Mosberger)
Date: Fri, 10 Jun 2005 10:25:46 -0700
Subject: [PATCH 07/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42A96BA6.1070300@jp.fujitsu.com>
References: <42A8386F.2060100@jp.fujitsu.com> <42A83CF2.90304@jp.fujitsu.com> <17064.32552.507932.62892@napali.hpl.hp.com> <42A96BA6.1070300@jp.fujitsu.com>
Message-ID: <17065.52506.707169.903319@napali.hpl.hp.com>

>>>>> On Fri, 10 Jun 2005 19:29:58 +0900, Hidetoshi Seto said:

  Hidetoshi> Hi David,
  Hidetoshi> David Mosberger wrote:
  >>>>>>> On Thu, 09 Jun 2005 21:58:26 +0900, Hidetoshi Seto said:
  >>
  Hidetoshi> +/*
  Hidetoshi> + * Some I/O bridges may poison the data read, instead of
  Hidetoshi> + * signaling a BERR. The consummation of poisoned data
  Hidetoshi> + * triggers a local, imprecise MCA.
  Hidetoshi> + * Note that the read operation by itself does not consume
  Hidetoshi> + * the bad data, you have to do something with it, e.g.:
  Hidetoshi> + *
  Hidetoshi> + *	ld.8	r9=[r10];;	// r10 == I/O address
  Hidetoshi> + *	add.8	r8=r9,r9;;	// fake operation
  Hidetoshi> + */
  Hidetoshi> +#define ia64_poison_check(val)					\
  Hidetoshi> +{	register unsigned long gr8 asm("r8");				\
  Hidetoshi> +	asm volatile ("add %0=%1,r0" : "=r"(gr8) : "r"(val)); }
  Hidetoshi> +
  Hidetoshi> #endif /* CONFIG_IOMAP_CHECK */

  >> I have only looked that this briefly and I didn't see off hand where you get
  >> the "r9=[r10]" sequence from --- I hope you're not relying on the compiler
  >> happening to generate this sequence!

Hidetoshi> +static inline unsigned char Hidetoshi> +___ia64_readb (const volatile void __iomem *addr) Hidetoshi> +{ Hidetoshi> + unsigned char val; Hidetoshi> + Hidetoshi> + val = *(volatile unsigned char __force *)addr; Hidetoshi> + ia64_poison_check(val); Hidetoshi> + Hidetoshi> + return val; Hidetoshi> +} Ah, I see now what you're trying to do. I think it's really a machine-check barrier that you want there. I'm doubtful whether this is the right approach, though: your ia64_poison_check() will cause _every single_ readX() operation to stall the CPU for 1,000+ cycles. Why not define an explicit iochk_barrier() instead? Then you could do things like this: a = readb(X); b = readb(Y); c = readb(Z); iochk_barrier(a + b + c); That is, if it's unimportant to know whether the read of X, Y, or Z caused the MCA, you can amortize the cost of iochk_barrier() over 3 reads. I'd probably make iochk_barrier() an out-of-line no-op assembly routine. The cost of two branches compared to stalling for hundreds of cycles is rather trivial. --david From johnrose at austin.ibm.com Sat Jun 11 06:06:56 2005 From: johnrose at austin.ibm.com (John Rose) Date: Fri, 10 Jun 2005 15:06:56 -0500 Subject: [PATCH] initialize TCE tables In-Reply-To: <1117649436.28482.3.camel@sinatra.austin.ibm.com> References: <1117649436.28482.3.camel@sinatra.austin.ibm.com> Message-ID: <1118434015.21098.34.camel@sinatra.austin.ibm.com> It turns out that the previous patch is incorrect. When running with this patch, dynamic slot adds result in intermittent kernel warnings about failed TCE hcalls. The patch fails to account for the firmware-defined offset of each iommu table when initializing the table's TCEs. Each iommu table has an offset communicated by the ibm,dma-window property, and this needs to be passed into tce_free(). With the patch as-is, dynamic adds result in redundant clearing of the TCEs for offset 0, regardless of the offset of the table in question. 
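
To make the offset problem described above concrete, here is a minimal userspace sketch. It is not kernel code: `hw_tce`, `mock_iommu_table`, `mock_tce_free()` and the helper names are hypothetical stand-ins, only the `it_offset` and `it_size` field names come from the patch, and it assumes the earlier patch effectively cleared from index 0 and that the offset counts table entries.

```c
#include <assert.h>
#include <stdint.h>

#define HW_TCE_ENTRIES 16

/* Pretend hardware TCE space shared by every table (hypothetical). */
static uint64_t hw_tce[HW_TCE_ENTRIES];

/* Only the two fields the patch uses; the real struct has more. */
struct mock_iommu_table {
	unsigned long it_offset;	/* first entry owned by this table */
	unsigned long it_size;		/* number of entries owned */
};

/* Stand-in for ppc_md.tce_free(): zero npages entries starting at index. */
static void mock_tce_free(unsigned long index, unsigned long npages)
{
	while (npages--)
		hw_tce[index++] = 0;
}

/* Simulate firmware leaving allocations behind in every entry. */
static void leave_stale_entries(void)
{
	unsigned long i;

	for (i = 0; i < HW_TCE_ENTRIES; i++)
		hw_tce[i] = 0xdead0000u + i;
}

/* Count entries inside the table's own window that are still set. */
static unsigned long stale_in_window(const struct mock_iommu_table *tbl)
{
	unsigned long i, n = 0;

	for (i = tbl->it_offset; i < tbl->it_offset + tbl->it_size; i++)
		if (hw_tce[i])
			n++;
	return n;
}

static void tce_example(void)
{
	struct mock_iommu_table tbl = { 8, 4 };

	/* Assumed behaviour of the earlier patch: offset ignored. */
	leave_stale_entries();
	mock_tce_free(0, tbl.it_size);
	assert(stale_in_window(&tbl) == 4);	/* own window untouched */

	/* Corrected call: start at the table's own offset. */
	leave_stale_entries();
	mock_tce_free(tbl.it_offset, tbl.it_size);
	assert(stale_in_window(&tbl) == 0);
	assert(hw_tce[0] != 0);			/* offset 0 left alone */
}
```

In this model the offset-less call clears entries 0..3 on every add and never touches the new table's own window, which is the redundant clearing the message describes.
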

Here's the correct patch, which I tested across multiple platforms and
multiple DLPAR operations.

Thanks-
John

Signed-off-by: John Rose

diff -puN arch/ppc64/kernel/iommu.c~initialize_tces arch/ppc64/kernel/iommu.c
--- 2_6_linus_2/arch/ppc64/kernel/iommu.c~initialize_tces	2005-06-01 12:17:53.000000000 -0500
+++ 2_6_linus_2-johnrose/arch/ppc64/kernel/iommu.c	2005-06-10 14:46:27.000000000 -0500
@@ -423,6 +423,9 @@ struct iommu_table *iommu_init_table(str
 	tbl->it_largehint = tbl->it_halfpoint;
 	spin_lock_init(&tbl->it_lock);
 
+	/* Clear the hardware table in case firmware left allocations in it */
+	ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
+
 	if (!welcomed) {
 		printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
 		       novmerge ? "disabled" : "enabled");
_

From olof at lixom.net Sat Jun 11 07:22:52 2005
From: olof at lixom.net (Olof Johansson)
Date: Fri, 10 Jun 2005 16:22:52 -0500
Subject: [PATCH] Fix PCI BAR size interpretation on 64-bit arches
Message-ID: <20050610212252.GA28655@austin.ibm.com>

Hi,

On 64-bit machines, PCI_BASE_ADDRESS_MEM_MASK and other mask constants
passed to pci_size() are 64-bit (for example ~0x0fUL). However, pci_size
does comparisons between the u32 arguments and the mask, which will fail
even though any result from pci_size is still just 32-bit.

Changing the mask argument to u32 seems the obvious thing to do, since
all arithmetic in the function is 32-bit and having a larger mask makes
no sense.

This triggered on a PPC64 system here where an adapter (VGA, as it
happened) had a memory region base of 0xfe000000 and a sz of the same,
matching the if (max == maxbase ...) test at the bottom of pci_size but
failing the mask comparison. Quite a corner case which I guess explains
why we haven't seen it until now.

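
The failing comparison can be reproduced in userspace. The sketch below is modelled on the description above rather than copied from drivers/pci/probe.c, so the function bodies and names are assumptions; `uint64_t` stands in for a 64-bit `unsigned long` so the effect shows up on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch shape: the mask is 64 bits wide, like unsigned long on ppc64. */
static uint32_t size_with_wide_mask(uint32_t base, uint32_t maxbase, uint64_t mask)
{
	uint32_t size = (uint32_t)(mask & maxbase);	/* significant bits */

	if (!size)
		return 0;
	size = (size & ~(size - 1)) - 1;	/* lowest set bit gives the extent */

	/* (base | size) is only 32 bits, so the mask's upper half never matches. */
	if (base == maxbase && ((base | size) & mask) != mask)
		return 0;
	return size;
}

/* Post-patch shape: identical logic, but the mask is a 32-bit value. */
static uint32_t size_with_u32_mask(uint32_t base, uint32_t maxbase, uint32_t mask)
{
	uint32_t size = mask & maxbase;

	if (!size)
		return 0;
	size = (size & ~(size - 1)) - 1;

	if (base == maxbase && ((base | size) & mask) != mask)
		return 0;
	return size;
}

static void pci_size_example(void)
{
	const uint32_t bar = 0xfe000000u;	/* base and sizing value from the report */

	/* Wide mask: the region is wrongly treated as not decoded. */
	assert(size_with_wide_mask(bar, bar, ~(uint64_t)0x0f) == 0);

	/* 32-bit mask: a 32MB extent (size - 1) is reported. */
	assert(size_with_u32_mask(bar, bar, (uint32_t)~0x0fu) == 0x01ffffffu);
}
```

With the wide mask, `(base | size) & mask` evaluates to 0x00000000fffffff0, which can never equal 0xfffffffffffffff0, so the BAR is reported as size 0.
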
Signed-off-by: Olof Johansson Index: 2.6/drivers/pci/probe.c =================================================================== --- 2.6.orig/drivers/pci/probe.c 2005-06-10 15:09:37.000000000 -0500 +++ 2.6/drivers/pci/probe.c 2005-06-10 15:43:36.000000000 -0500 @@ -125,7 +125,7 @@ static inline unsigned int pci_calc_reso /* * Find the extent of a PCI decode.. */ -static u32 pci_size(u32 base, u32 maxbase, unsigned long mask) +static u32 pci_size(u32 base, u32 maxbase, u32 mask) { u32 size = mask & maxbase; /* Find the significant bits */ if (!size) From benh at kernel.crashing.org Sat Jun 11 09:25:09 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 11 Jun 2005 09:25:09 +1000 Subject: [PATCH 00/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42A96C25.9050903@jp.fujitsu.com> References: <42A8386F.2060100@jp.fujitsu.com> <20050609173433.GE24611@parcelfarce.linux.theplanet.co.uk> <42A96C25.9050903@jp.fujitsu.com> Message-ID: <1118445909.5986.57.camel@gaston> > > I think this is the wrong way to go about it. For PCI Express, we > > have a defined cross-architecture standard which tells us exactly how > > all future PCIe devices will behave in the face of errors. For PCI and > > PCI-X, we have a lot of legacy systems, each of which implements error > > checking and recovery in a somewhat eclectic way. We already defined something for recovery that everybody seem to be fine with and that isn't tied around PCIe specifics. I do not think PCIe is the ultimate panacea that will replace everything. > > So, IMO, any implementation of PCI error recovery should start by > > implementing the PCI Express AER mechanisms and then each architecture can > > look at extending that scheme to fit their own legacy hardware systems. No, I strongly disagree. > > That way we have a clean implementation for the future rather than being > > tied to any one manufacturer or architecture's quirks. 
> >
> > Also, we can evaluate it based on looking at what the standard says,
> > rather than all trying to wrap our brains around the idiosyncracies of
> > a given platform ;-)

> All right, please take it a example of approach from legacy-side.
>
> Already there are good working group, includes Linas, BenH, and Long.
> They are also implementing some PCI error recovery codes (currently
> setting home to ppc64), and I know their wonderful works are more PCI
> Express friendly than my mysterious ia64 works :-)
>
> However, I also know that it doesn't mean my works were useless.
> Since there is a notable difference between their asynchronous error
> recovery and my synchronous error detecting, both could live in
> coexistence with each other.
>
> How cooperate with is interesting coming agenda, I think.

Well, our recovery mechanism is intended to be an addition to the
synchronous error detection. If you read my document carefully, for
example, I specify places where the driver should still do synchronous
detection, especially in some of the recovery phases themselves.

We agreed a long time ago that a good mechanism for synchronous
detection is to sandwich IOs that way. The actual implementation may
use AER, pSeries EEH mechanisms, PCI/PCI-X status error bits (that need
per-segment locks though) etc... depending on the architecture.

Since the actual error information can be very varied, the error
"cookie" was suggested as an opaque way to carry that information and
keep track of other platform specific things. We can then add specific
accessors to extract useful information from the error cookie (or dump
its details as ASCII for some logging facility).

Ben.

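
For readers following the thread, the sandwich pattern with a per-context cookie and an accessor can be sketched in userspace roughly as follows. Only the names `iochk_clear`, `iochk_read` and `iocookie` come from the patches under discussion; `mock_dev`, `mock_ioread32()` and `iochk_error_detail()` are hypothetical, and the error detection here is a simple flag rather than any real platform mechanism.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev: one register plus a fault flag. */
struct mock_dev {
	uint32_t reg;
	int faulted;		/* set when the "hardware" has seen an error */
};

/* Drivers are expected to touch the cookie only through the helpers below. */
struct iocookie {
	struct mock_dev *dev;	/* device being watched */
	unsigned long error;	/* platform-specific detail, 0 means none */
};

/* Start of the critical section: bind the cookie and forget old errors. */
static void iochk_clear(struct iocookie *cookie, struct mock_dev *dev)
{
	cookie->dev = dev;
	cookie->error = 0;
}

/* End of the critical section: nonzero if anything went wrong since clear. */
static int iochk_read(struct iocookie *cookie)
{
	/* A real arch would consult PCI status bits, EEH state, AER, ... */
	if (cookie->dev->faulted)
		cookie->error = 1;
	return cookie->error != 0;
}

/* Hypothetical accessor: exposes the detail without exposing the layout. */
static unsigned long iochk_error_detail(const struct iocookie *cookie)
{
	return cookie->error;
}

/* Mock MMIO read: a faulted device returns all ones. */
static uint32_t mock_ioread32(const struct mock_dev *dev)
{
	return dev->faulted ? 0xffffffffu : dev->reg;
}

static int sample_read_with_iochk(struct mock_dev *dev, uint32_t *buf, int words)
{
	struct iocookie cookie;	/* one cookie per calling context, on the stack */
	int i;

	iochk_clear(&cookie, dev);
	for (i = 0; i < words; i++)
		buf[i] = mock_ioread32(dev);
	if (iochk_read(&cookie))
		return -1;	/* contents of buf must not be trusted */
	return 0;
}

static void iochk_example(void)
{
	struct mock_dev good = { 0x1234u, 0 }, bad = { 0x1234u, 1 };
	struct iocookie cookie;
	uint32_t buf[2];

	assert(sample_read_with_iochk(&good, buf, 2) == 0 && buf[1] == 0x1234u);
	assert(sample_read_with_iochk(&bad, buf, 2) == -1);

	iochk_clear(&cookie, &bad);
	assert(iochk_read(&cookie) && iochk_error_detail(&cookie) == 1);
}
```

Because the cookie lives on the caller's stack, several contexts can check the same device at once without sharing state, which is the per-context style mentioned earlier in the thread.
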
From seto.hidetoshi at jp.fujitsu.com Mon Jun 13 16:54:28 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Mon, 13 Jun 2005 15:54:28 +0900 Subject: [PATCH 07/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <17065.52506.707169.903319@napali.hpl.hp.com> References: <42A8386F.2060100@jp.fujitsu.com> <42A83CF2.90304@jp.fujitsu.com> <17064.32552.507932.62892@napali.hpl.hp.com> <42A96BA6.1070300@jp.fujitsu.com> <17065.52506.707169.903319@napali.hpl.hp.com> Message-ID: <42AD2DA4.1070309@jp.fujitsu.com> David Mosberger wrote: >>>>>>On Fri, 10 Jun 2005 19:29:58 +0900, Hidetoshi Seto said: > > Hidetoshi> +#define ia64_poison_check(val) \ > Hidetoshi> +{ register unsigned long gr8 asm("r8"); \ > Hidetoshi> + asm volatile ("add %0=%1,r0" : "=r"(gr8) : "r"(val)); } > > >> I have only looked that this briefly and I didn't see off hand where you get > >> the "r9=[r10]" sequence from --- I hope you're not relying on the compiler > >> happening to generate this sequence! > > Hidetoshi> +static inline unsigned char > Hidetoshi> +___ia64_readb (const volatile void __iomem *addr) > Hidetoshi> +{ > Hidetoshi> + unsigned char val; > Hidetoshi> + > Hidetoshi> + val = *(volatile unsigned char __force *)addr; > Hidetoshi> + ia64_poison_check(val); > Hidetoshi> + > Hidetoshi> + return val; > Hidetoshi> +} > > Ah, I see now what you're trying to do. I think it's really a > machine-check barrier that you want there. Yes, thanks for your understanding. > I'm doubtful whether this is the right approach, though: your > ia64_poison_check() will cause _every single_ readX() operation to > stall the CPU for 1,000+ cycles. Why not define an explicit > iochk_barrier() instead? Then you could do things like this: > > a = readb(X); > b = readb(Y); > c = readb(Z); > iochk_barrier(a + b + c); > > That is, if it's unimportant to know whether the read of X, Y, or Z > caused the MCA, you can amortize the cost of iochk_barrier() over 3 > reads. 
I'm also doubtful, I know it too costs... but I don't have any other better idea. As far as I can figure out, using iochk_barrier() style has difficulty like that: - pain for driver maintainers. They should be careful to make exact argument for barrier. - arch-specific. It will go against the spirit of iochk, "generic" interface. - unenforceable. You could forget it. - it would be in form: { iocookie cookie; iochk_clear(cookie, dev); for(i=0;i I'd probably make iochk_barrier() an out-of-line no-op assembly > routine. The cost of two branches compared to stalling for hundreds > of cycles is rather trivial. Of course I agree to have such routine in proper header file, but it would not help us to save CPU cycles if we don't have any other idea... Or I'll just replace ia64_poison_check() to ia64_mca_barrier() or so. Thanks, H.Seto From ceagan at gmail.com Tue Jun 14 05:17:29 2005 From: ceagan at gmail.com (Chris Eagan) Date: Mon, 13 Jun 2005 15:17:29 -0400 Subject: Brand new iMac G5 Message-ID: We have a brand new iMac G5 and we are having trouble with the network card drivers. It seems that they load and find the device okay, but then setting an ip manually does not give access to the network. DHCP fails to find anything as well. We have been working on this problem at http://bugs.gentoo.org/show_bug.cgi?id=94263 but nothing has worked so far. I have included some information about the device here. If any other information is needed, we have a cd that boots to a prompt on the machine and would be happy to provide any needed information. Thanks! 1. find /proc/device-tree -type d -name ethernet-phy -> /proc/device-tree/ht at 0,f2000000/pci at 1/ethernet at f/ethernet-phy 2. move to the found directory. 3. cat compatible -> B5461 4. od -x phy-id -> 0000000 0000 60d3 0000004 -- Chris Eagan ceagan at gmail.com Want 2GB Google Mail? Just ask me! 
From amavin at redhat.com Tue Jun 14 05:48:15 2005
From: amavin at redhat.com (Ananth N Mavinakayanahalli)
Date: Mon, 13 Jun 2005 15:48:15 -0400
Subject: [patch 5/5] [kprobes] Tweak to the function return probe design
In-Reply-To: <20050613190323.672988000@tuna.jf.intel.com>
References: <20050613190207.954385000@tuna.jf.intel.com> <20050613190323.672988000@tuna.jf.intel.com>
Message-ID: <42ADE2FF.5020604@redhat.com>

rusty.lynch at intel.com wrote:

Hi Rusty,

Thanks for doing this. However...

> +
> +		orig_ret_address = (unsigned long)ri->ret_addr;
> +		recycle_rp_inst(ri);
> +
> +		if (orig_ret_address != (unsigned long) &kretprobe_trampoline)
> +			/*
> +			 * This is the real return address. Any other
> +			 * instances associated with this task are for
> +			 * other calls deeper on the call stack
> +			 */
> +			break;
> +	}
> +
> +	BUG_ON(!orig_ret_address);
> +	regs->nip = orig_ret_address;
> +
> +	unlock_kprobes();
> +	preempt_enable_no_resched();
	^^^^^^^
We don't need this here - on ppc64, we do a preempt_disable/enable in kprobe_exceptions_notify() and so this will cause a spurious preempt_enable().

Thanks,
Ananth

From rusty.lynch at intel.com Tue Jun 14 06:51:54 2005
From: rusty.lynch at intel.com (rusty.lynch at intel.com)
Date: Mon, 13 Jun 2005 13:51:54 -0700
Subject: [patch 1/5] [kprobes] Tweak to the function return probe design
References: <20050613205153.349171000@linux.jf.intel.com>
Message-ID: <20050613205234.799275000@linux.jf.intel.com>

An embedded and charset-unspecified text was scrubbed...
Name: kprobes-return-probes-redux-base.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050613/649cbe3c/attachment.txt

From rusty.lynch at intel.com Tue Jun 14 06:51:53 2005
From: rusty.lynch at intel.com (rusty.lynch at intel.com)
Date: Mon, 13 Jun 2005 13:51:53 -0700
Subject: [patch 0/5] [kprobes] Tweak to the function return probe design
Message-ID: <20050613205153.349171000@linux.jf.intel.com>

(The following is a resend from this morning.
The various kernel mailing lists did not seem to get my email, so I am resending the patch series from another machine.)

From my experiences with adding return probes to x86_64 and ia64, and the feedback on LKML to those patches, I think we can simplify the design for return probes.

The following patch tweaks the original design such that:

* Instead of storing the stack address in the return probe instance, the task pointer is stored. This gives us all we need in order to:
  - find the correct return probe instance when we enter the trampoline (even if we are recursing)
  - find all left-over return probe instances when the task is going away

  This has the side effect of simplifying the implementation, since more work can be done in kernel/kprobes.c now that architecture specific knowledge of the stack layout is no longer required. Specifically, we no longer have:
  - arch_get_kprobe_task()
  - arch_kprobe_flush_task()
  - get_rp_inst_tsk()
  - get_rp_inst()
  - trampoline_post_handler()

* Instead of splitting the return probe handling and cleanup logic across the pre and post trampoline handlers, all the work is pushed into the pre function (trampoline_probe_handler), and then we skip single stepping the original function. In this case the original instruction to be single stepped was just a NOP, and we can do without the extra interruption.

The new flow of events for having a return probe handler execute when a target function exits is:

* At system initialization time, a kprobe is inserted at the beginning of kretprobe_trampoline. kernel/kprobes.c used to handle this on its own, but ia64 needed to do this a little differently (i.e. a function pointer is really a pointer to a structure containing the instruction pointer and a global pointer), so I added the notion of arch_init(), so that kernel/kprobes.c:init_kprobes() now allows architecture specific initialization by calling arch_init() before exiting. Each architecture now registers a kprobe on its own trampoline function.

* register_kretprobe() will insert a kprobe at the beginning of the targeted function with the kprobe pre_handler set to arch_prepare_kretprobe (still no change)

* When the target function is entered, the kprobe is fired, calling arch_prepare_kretprobe (still no change)

* In arch_prepare_kretprobe() we try to get a free instance, and if one is available then we fill out the instance with a pointer to the return probe, the original return address, and a pointer to the task structure (instead of the stack address.) Just like before, we change the return address to the trampoline function and mark the instance as used.

  If multiple return probes are registered for a given target function, then arch_prepare_kretprobe() will get called multiple times for the same task (since our kprobe implementation is able to handle multiple kprobes at the same address.) Past the first call to arch_prepare_kretprobe, we end up with the original address stored in the return probe instance pointing to our trampoline function. (This is a significant difference from the original arch_prepare_kretprobe design.)

* Target function executes like normal and then returns to kretprobe_trampoline.

* kprobe inserted on the first instruction of kretprobe_trampoline is fired and calls trampoline_probe_handler() (no change here)

* trampoline_probe_handler() consumes each of the instances associated with the current task by calling the registered handler function and marking the instance as unused, until an instance is found that has a return address different from the trampoline function. (change similar to my previous ia64 RFC)

* If the task is killed with some left-over return probe instances (meaning that a target function was entered, but never returned), then we just free any instances associated with the task. (Not much different, other than that we can handle this without calling architecture specific functions.)
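The flow of events above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the code from the patch series: every sketch_* name, the fixed-size pool, the seq counter and the made-up trampoline address are hypothetical, and the real code keeps instances on kernel lists under the kprobes lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_INSTANCES 8
#define SKETCH_TRAMPOLINE ((uintptr_t)0x1000)   /* made-up trampoline address */

struct sketch_task { int pid; };                /* stand-in for the task structure */

struct sketch_rp_inst {
    unsigned long seq;                          /* 0 = free, else allocation order */
    struct sketch_task *task;                   /* task that entered the function */
    uintptr_t ret_addr;                         /* return address seen at entry */
    void (*handler)(struct sketch_rp_inst *);
};

struct sketch_rp_inst sketch_pool[SKETCH_MAX_INSTANCES];
unsigned long sketch_next_seq = 1;
int sketch_handler_calls;

void sketch_count_handler(struct sketch_rp_inst *ri)
{
    (void)ri;
    sketch_handler_calls++;
}

/* Most recently allocated in-use instance owned by this task, or NULL. */
struct sketch_rp_inst *sketch_newest(struct sketch_task *t)
{
    struct sketch_rp_inst *best = NULL;

    for (size_t i = 0; i < SKETCH_MAX_INSTANCES; i++) {
        struct sketch_rp_inst *ri = &sketch_pool[i];

        if (ri->seq != 0 && ri->task == t && (!best || ri->seq > best->seq))
            best = ri;
    }
    return best;
}

/* Function entry: save the return address and task, redirect to trampoline. */
int sketch_prepare(struct sketch_task *t, uintptr_t *ret_slot,
                   void (*handler)(struct sketch_rp_inst *))
{
    for (size_t i = 0; i < SKETCH_MAX_INSTANCES; i++) {
        struct sketch_rp_inst *ri = &sketch_pool[i];

        if (ri->seq != 0)
            continue;
        ri->seq = sketch_next_seq++;
        ri->task = t;
        ri->ret_addr = *ret_slot;   /* may already be the trampoline */
        ri->handler = handler;
        *ret_slot = SKETCH_TRAMPOLINE;
        return 0;
    }
    return -1;                      /* no free instance: the probe is missed */
}

/* Trampoline hit: consume instances, newest first, up to the real address. */
uintptr_t sketch_trampoline(struct sketch_task *t)
{
    uintptr_t orig = 0;
    struct sketch_rp_inst *ri;

    while ((ri = sketch_newest(t)) != NULL) {
        if (ri->handler)
            ri->handler(ri);
        orig = ri->ret_addr;
        ri->seq = 0;                /* mark the instance as unused */
        if (orig != SKETCH_TRAMPOLINE)
            break;                  /* real return address; the rest belong to outer calls */
    }
    assert(orig != 0);              /* plays the role of BUG_ON(!orig_ret_address) */
    return orig;
}

/* Task exit: free whatever the task left behind; returns how many. */
int sketch_flush_task(struct sketch_task *t)
{
    int freed = 0;

    for (size_t i = 0; i < SKETCH_MAX_INSTANCES; i++) {
        if (sketch_pool[i].seq != 0 && sketch_pool[i].task == t) {
            sketch_pool[i].seq = 0;
            freed++;
        }
    }
    return freed;
}
```

Calling sketch_prepare() twice for the same task and return slot models two return probes on one function: the second instance records the trampoline address, so sketch_trampoline() runs both handlers before it reaches the real return address.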
There is a known problem that this patch does not yet solve: registering a return probe on flush_old_exec or flush_thread will put us in a bad state. Most likely the best way to handle this is to not allow registering return probes on these two functions. (Significant change)

This patch series applies to the 2.6.12-rc6-mm1 kernel, and provides:
* kernel/kprobes.c changes
* i386 patch of existing return probes implementation
* x86_64 patch of existing return probe implementation
* ia64 implementation
* ppc64 implementation (provided by Ananth)

--rusty

From rusty.lynch at intel.com Tue Jun 14 06:51:57 2005
From: rusty.lynch at intel.com (rusty.lynch at intel.com)
Date: Mon, 13 Jun 2005 13:51:57 -0700
Subject: [patch 4/5] [kprobes] Tweak to the function return probe design
References: <20050613205153.349171000@linux.jf.intel.com>
Message-ID: <20050613205236.351353000@linux.jf.intel.com>

An embedded and charset-unspecified text was scrubbed...
Name: kprobes-return-probes-redux-ia64.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050613/daf75199/attachment.txt

From rusty.lynch at intel.com Tue Jun 14 06:51:56 2005
From: rusty.lynch at intel.com (rusty.lynch at intel.com)
Date: Mon, 13 Jun 2005 13:51:56 -0700
Subject: [patch 3/5] [kprobes] Tweak to the function return probe design
References: <20050613205153.349171000@linux.jf.intel.com>
Message-ID: <20050613205235.839201000@linux.jf.intel.com>

An embedded and charset-unspecified text was scrubbed...
Name: kprobes-return-probes-redux-x86_64.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050613/1448691e/attachment.txt From rusty.lynch at intel.com Tue Jun 14 06:51:58 2005 From: rusty.lynch at intel.com (rusty.lynch at intel.com) Date: Mon, 13 Jun 2005 13:51:58 -0700 Subject: [patch 5/5] [kprobes] Tweak to the function return probe design References: <20050613205153.349171000@linux.jf.intel.com> Message-ID: <20050613205236.873920000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... Name: kprobes-return-probes-redux-ppc64.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050613/915322ea/attachment.txt From rusty.lynch at intel.com Tue Jun 14 06:51:55 2005 From: rusty.lynch at intel.com (rusty.lynch at intel.com) Date: Mon, 13 Jun 2005 13:51:55 -0700 Subject: [patch 2/5] [kprobes] Tweak to the function return probe design References: <20050613205153.349171000@linux.jf.intel.com> Message-ID: <20050613205235.308105000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... Name: kprobes-return-probes-redux-i386.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050613/e75f08ff/attachment.txt From benh at kernel.crashing.org Tue Jun 14 10:36:44 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 14 Jun 2005 10:36:44 +1000 Subject: Brand new iMac G5 In-Reply-To: References: Message-ID: <1118709405.5986.103.camel@gaston> On Mon, 2005-06-13 at 15:17 -0400, Chris Eagan wrote: > We have a brand new iMac G5 and we are having trouble with the network > card drivers. It seems that they load and find the device okay, but > then setting an ip manually does not give access to the network. DHCP > fails to find anything as well. We have been working on this problem > at http://bugs.gentoo.org/show_bug.cgi?id=94263 but nothing has worked > so far. I have included some information about the device here. 
If any > other information is needed, we have a cd that boots to a prompt on > the machine and would be happy to provide any needed information. Hi ! Please try this patch and let me know asap: Index: linux-work/drivers/net/sungem.c =================================================================== --- linux-work.orig/drivers/net/sungem.c 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem.c 2005-06-14 10:17:38.000000000 +1000 @@ -3078,7 +3078,9 @@ gp->phy_mii.dev = dev; gp->phy_mii.mdio_read = _phy_read; gp->phy_mii.mdio_write = _phy_write; - +#ifdef CONFIG_PPC_PMAC + gp->phy_mii.platform_data = gp->of_node; +#endif /* By default, we start with autoneg */ gp->want_autoneg = 1; Index: linux-work/drivers/net/sungem_phy.c =================================================================== --- linux-work.orig/drivers/net/sungem_phy.c 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem_phy.c 2005-06-14 10:36:07.000000000 +1000 @@ -32,6 +32,10 @@ #include #include +#ifdef CONFIG_PPC_PMAC +#include +#endif + #include "sungem_phy.h" /* Link modes of the BCM5400 PHY */ @@ -281,10 +285,12 @@ static int bcm5421_init(struct mii_phy* phy) { u16 data; - int rev; + unsigned int id; - rev = phy_read(phy, MII_PHYSID2) & 0x000f; - if (rev == 0) { + id = (phy_read(phy, MII_PHYSID1) << 16 | phy_read(phy, MII_PHYSID2)); + + /* Revision 0 of 5421 needs some fixups */ + if (id == 0x002060e0) { /* This is borrowed from MacOS */ phy_write(phy, 0x18, 0x1007); @@ -297,21 +303,28 @@ data = phy_read(phy, 0x15); phy_write(phy, 0x15, data | 0x0200); } -#if 0 - /* This has to be verified before I enable it */ - /* Enable automatic low-power */ - phy_write(phy, 0x1c, 0x9002); - phy_write(phy, 0x1c, 0xa821); - phy_write(phy, 0x1c, 0x941d); -#endif - return 0; -} -static int bcm5421k2_init(struct mii_phy* phy) -{ - /* Init code borrowed from OF */ - phy_write(phy, 4, 0x01e1); - phy_write(phy, 9, 0x0300); + /* Pick up some init code from OF for K2 version */ + if 
((id & 0xfffffff0) == 0x002062e0) { + phy_write(phy, 4, 0x01e1); + phy_write(phy, 9, 0x0300); + } + + /* Check if we can enable automatic low power */ +#ifdef CONFIG_PPC_PMAC + if (phy->platform_data) { + struct device_node *np = of_get_parent(phy->platform_data); + int can_low_power = 1; + if (np == NULL || get_property(np, "no-autolowpower", NULL)) + can_low_power = 0; + if (can_low_power) { + /* Enable automatic low-power */ + phy_write(phy, 0x1c, 0x9002); + phy_write(phy, 0x1c, 0xa821); + phy_write(phy, 0x1c, 0x941d); + } + } +#endif /* CONFIG_PPC_PMAC */ return 0; } @@ -762,7 +775,7 @@ /* Broadcom BCM 5421 built-in K2 */ static struct mii_phy_ops bcm5421k2_phy_ops = { - .init = bcm5421k2_init, + .init = bcm5421_init, .suspend = bcm5411_suspend, .setup_aneg = bcm54xx_setup_aneg, .setup_forced = bcm54xx_setup_forced, @@ -779,6 +792,25 @@ .ops = &bcm5421k2_phy_ops }; +/* Broadcom BCM 5462 built-in Vesta */ +static struct mii_phy_ops bcm5462V_phy_ops = { + .init = bcm5421_init, + .suspend = bcm5411_suspend, + .setup_aneg = bcm54xx_setup_aneg, + .setup_forced = bcm54xx_setup_forced, + .poll_link = genmii_poll_link, + .read_link = bcm54xx_read_link, +}; + +static struct mii_phy_def bcm5462V_phy_def = { + .phy_id = 0x002062e0, + .phy_id_mask = 0xfffffff0, + .name = "BCM5462-Vesta", + .features = MII_GBIT_FEATURES, + .magic_aneg = 1, + .ops = &bcm5462V_phy_ops +}; + /* Marvell 88E1101 (Apple seem to deal with 2 different revs, * I masked out the 8 last bits to get both, but some specs * would be useful here) --BenH. 
@@ -824,6 +856,7 @@ &bcm5411_phy_def, &bcm5421_phy_def, &bcm5421k2_phy_def, + &bcm5462V_phy_def, &marvell_phy_def, &genmii_phy_def, NULL Index: linux-work/drivers/net/sungem_phy.h =================================================================== --- linux-work.orig/drivers/net/sungem_phy.h 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem_phy.h 2005-06-14 10:16:14.000000000 +1000 @@ -43,9 +43,10 @@ int pause; /* Provided by host chip */ - struct net_device* dev; + struct net_device *dev; int (*mdio_read) (struct net_device *dev, int mii_id, int reg); void (*mdio_write) (struct net_device *dev, int mii_id, int reg, int val); + void *platform_data; }; /* Pass in a struct mii_phy with dev, mdio_read and mdio_write From paulus at samba.org Tue Jun 14 22:19:24 2005 From: paulus at samba.org (Paul Mackerras) Date: Tue, 14 Jun 2005 22:19:24 +1000 Subject: [PATCH] ppc64: update example configs Message-ID: <17070.52044.167922.762976@cargo.ozlabs.ibm.com> Here is a patch to update the example configs in arch/ppc64/configs. I hope that this and Olaf Hering's arch/ppc64/defconfig update can go in for 2.6.12. 
Signed-off-by: Paul Mackerras --- g5_defconfig | 76 ++++++++++++++++++++++++---------------- iSeries_defconfig | 62 +++++++++++++++++++------------- maple_defconfig | 70 +++++++++++++++++++++++++------------ pSeries_defconfig | 102 ++++++++++++++++++++++++++++++++++-------------------- 4 files changed, 196 insertions(+), 114 deletions(-) diff -urN linux-2.6/arch/ppc64/configs/g5_defconfig test/arch/ppc64/configs/g5_defconfig --- linux-2.6/arch/ppc64/configs/g5_defconfig 2005-04-26 15:37:55.000000000 +1000 +++ test/arch/ppc64/configs/g5_defconfig 2005-06-14 16:59:31.000000000 +1000 @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.11 -# Thu Mar 10 16:47:04 2005 +# Linux kernel version: 2.6.12-rc6 +# Tue Jun 14 16:59:20 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -11,7 +11,7 @@ CONFIG_HAVE_DEC_LOCK=y CONFIG_EARLY_PRINTK=y CONFIG_COMPAT=y -CONFIG_FRAME_POINTER=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_FORCE_MAX_ZONEORDER=13 # @@ -20,6 +20,7 @@ CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y +CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup @@ -31,19 +32,20 @@ # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y # CONFIG_AUDIT is not set -CONFIG_LOG_BUF_SHIFT=17 CONFIG_HOTPLUG=y CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y +# CONFIG_CPUSETS is not set # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_PRINTK=y +CONFIG_BUG=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y -# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 @@ -87,6 +89,8 @@ # CONFIG_SCHED_SMT is not set # CONFIG_PREEMPT is not set CONFIG_GENERIC_HARDIRQS=y +CONFIG_SECCOMP=y +CONFIG_ISA_DMA_API=y # # General setup @@ -97,6 +101,7 @@ # CONFIG_BINFMT_MISC is not set CONFIG_PCI_LEGACY_PROC=y CONFIG_PCI_NAMES=y +# CONFIG_PCI_DEBUG is not set # CONFIG_HOTPLUG_CPU is not set # @@ -105,10 
+110,6 @@ # CONFIG_PCCARD is not set # -# PC-card bridges -# - -# # PCI Hotplug Support # # CONFIG_HOTPLUG_PCI is not set @@ -293,7 +294,6 @@ # CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_EATA is not set -# CONFIG_SCSI_EATA_PIO is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set # CONFIG_SCSI_GDTH is not set # CONFIG_SCSI_IPS is not set @@ -301,7 +301,6 @@ # CONFIG_SCSI_INIA100 is not set # CONFIG_SCSI_SYM53C8XX_2 is not set # CONFIG_SCSI_IPR is not set -# CONFIG_SCSI_QLOGIC_ISP is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA2XXX=y @@ -310,6 +309,7 @@ # CONFIG_SCSI_QLA2300 is not set # CONFIG_SCSI_QLA2322 is not set # CONFIG_SCSI_QLA6312 is not set +# CONFIG_SCSI_LPFC is not set # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_DEBUG is not set @@ -332,6 +332,7 @@ CONFIG_DM_SNAPSHOT=m CONFIG_DM_MIRROR=m CONFIG_DM_ZERO=m +# CONFIG_DM_MULTIPATH is not set # # Fusion MPT device support @@ -394,7 +395,6 @@ # CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set -# CONFIG_NETLINK_DEV is not set CONFIG_UNIX=y CONFIG_NET_KEY=m CONFIG_INET=y @@ -564,6 +564,8 @@ # CONFIG_R8169 is not set # CONFIG_SK98LIN is not set CONFIG_TIGON3=m +# CONFIG_BNX2 is not set +# CONFIG_MV643XX_ETH is not set # # Ethernet (10000 Mbit) @@ -631,18 +633,6 @@ # CONFIG_INPUT_EVBUG is not set # -# Input I/O drivers -# -# CONFIG_GAMEPORT is not set -CONFIG_SOUND_GAMEPORT=y -CONFIG_SERIO=y -# CONFIG_SERIO_I8042 is not set -# CONFIG_SERIO_SERPORT is not set -# CONFIG_SERIO_CT82C710 is not set -# CONFIG_SERIO_PCIPS2 is not set -# CONFIG_SERIO_RAW is not set - -# # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y @@ -660,6 +650,16 @@ # CONFIG_INPUT_MISC is not set # +# Hardware I/O ports +# +CONFIG_SERIO=y +# CONFIG_SERIO_I8042 is not set +# CONFIG_SERIO_SERPORT is not set +# CONFIG_SERIO_PCIPS2 is not set +# CONFIG_SERIO_RAW is not set +# CONFIG_GAMEPORT is not set + +# # Character devices # 
CONFIG_VT=y @@ -676,6 +676,7 @@ # Non-8250 serial port support # # CONFIG_SERIAL_PMACZILOG is not set +# CONFIG_SERIAL_JSM is not set CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 @@ -698,9 +699,12 @@ # # Ftape, the floppy tape device driver # +CONFIG_AGP=m +CONFIG_AGP_UNINORTH=m # CONFIG_DRM is not set CONFIG_RAW_DRIVER=y CONFIG_MAX_RAW_DEVS=256 +# CONFIG_HANGCHECK_TIMER is not set # # TPM devices @@ -730,12 +734,11 @@ # CONFIG_I2C_AMD8111 is not set # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set +# CONFIG_I2C_PIIX4 is not set # CONFIG_I2C_ISA is not set CONFIG_I2C_KEYWEST=y -# CONFIG_I2C_MPC is not set # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT_LIGHT is not set -# CONFIG_I2C_PIIX4 is not set # CONFIG_I2C_PROSAVAGE is not set # CONFIG_I2C_SAVAGE4 is not set # CONFIG_SCx200_ACB is not set @@ -772,6 +775,7 @@ # CONFIG_SENSORS_LM85 is not set # CONFIG_SENSORS_LM87 is not set # CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set # CONFIG_SENSORS_MAX1619 is not set # CONFIG_SENSORS_PC87360 is not set # CONFIG_SENSORS_SMSC47B397 is not set @@ -785,6 +789,7 @@ # # Other I2C Chip support # +# CONFIG_SENSORS_DS1337 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set # CONFIG_SENSORS_PCF8591 is not set @@ -817,6 +822,11 @@ # Graphics support # CONFIG_FB=y +CONFIG_FB_CFB_FILLRECT=y +CONFIG_FB_CFB_COPYAREA=y +CONFIG_FB_CFB_IMAGEBLIT=y +CONFIG_FB_SOFT_CURSOR=y +CONFIG_FB_MACMODES=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y # CONFIG_FB_CIRRUS is not set @@ -830,6 +840,7 @@ # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set +# CONFIG_FB_NVIDIA is not set CONFIG_FB_RIVA=y # CONFIG_FB_RIVA_I2C is not set # CONFIG_FB_RIVA_DEBUG is not set @@ -847,6 +858,7 @@ # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set # CONFIG_FB_TRIDENT is not set +# CONFIG_FB_S1D13XXX is not set # CONFIG_FB_VIRTUAL is not set # @@ -880,6 +892,8 @@ # # USB support # 
+CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB=y # CONFIG_USB_DEBUG is not set @@ -890,8 +904,6 @@ # CONFIG_USB_BANDWIDTH is not set # CONFIG_USB_DYNAMIC_MINORS is not set # CONFIG_USB_OTG is not set -CONFIG_USB_ARCH_HAS_HCD=y -CONFIG_USB_ARCH_HAS_OHCI=y # # USB Host Controller Drivers @@ -917,7 +929,6 @@ # CONFIG_USB_STORAGE=y # CONFIG_USB_STORAGE_DEBUG is not set -CONFIG_USB_STORAGE_RW_DETECT=y CONFIG_USB_STORAGE_DATAFAB=y CONFIG_USB_STORAGE_FREECOM=y CONFIG_USB_STORAGE_ISD200=y @@ -1004,8 +1015,10 @@ # CONFIG_USB_SERIAL=m CONFIG_USB_SERIAL_GENERIC=y +# CONFIG_USB_SERIAL_AIRPRIME is not set CONFIG_USB_SERIAL_BELKIN=m CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m +# CONFIG_USB_SERIAL_CP2101 is not set CONFIG_USB_SERIAL_CYPRESS_M8=m CONFIG_USB_SERIAL_EMPEG=m CONFIG_USB_SERIAL_FTDI_SIO=m @@ -1034,6 +1047,7 @@ CONFIG_USB_SERIAL_KOBIL_SCT=m CONFIG_USB_SERIAL_MCT_U232=m CONFIG_USB_SERIAL_PL2303=m +# CONFIG_USB_SERIAL_HP4X is not set CONFIG_USB_SERIAL_SAFE=m CONFIG_USB_SERIAL_SAFE_PADDED=y CONFIG_USB_SERIAL_TI=m @@ -1270,11 +1284,13 @@ # # Kernel hacking # +# CONFIG_PRINTK_TIME is not set CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y -# CONFIG_PRINTK_TIME is not set +CONFIG_LOG_BUF_SHIFT=17 # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +# CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set diff -urN linux-2.6/arch/ppc64/configs/iSeries_defconfig test/arch/ppc64/configs/iSeries_defconfig --- linux-2.6/arch/ppc64/configs/iSeries_defconfig 2005-04-26 15:37:55.000000000 +1000 +++ test/arch/ppc64/configs/iSeries_defconfig 2005-06-14 17:02:00.000000000 +1000 @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.11-rc3-bk6 -# Wed Feb 9 23:34:52 2005 +# Linux kernel version: 2.6.12-rc6 +# Tue Jun 14 17:01:28 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -11,7 +11,7 @@ CONFIG_HAVE_DEC_LOCK=y CONFIG_EARLY_PRINTK=y CONFIG_COMPAT=y 
-CONFIG_FRAME_POINTER=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_FORCE_MAX_ZONEORDER=13 # @@ -20,6 +20,7 @@ CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y +CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup @@ -30,24 +31,29 @@ CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y -CONFIG_LOG_BUF_SHIFT=17 +CONFIG_AUDIT=y +CONFIG_AUDITSYSCALL=y CONFIG_HOTPLUG=y CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y +# CONFIG_CPUSETS is not set # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y -# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 # CONFIG_TINY_SHMEM is not set +CONFIG_BASE_SMALL=0 # # Loadable module support @@ -79,6 +85,8 @@ CONFIG_GENERIC_HARDIRQS=y CONFIG_MSCHUNKS=y CONFIG_LPARCFG=y +CONFIG_SECCOMP=y +CONFIG_ISA_DMA_API=y # # General setup @@ -89,6 +97,7 @@ # CONFIG_BINFMT_MISC is not set CONFIG_PCI_LEGACY_PROC=y CONFIG_PCI_NAMES=y +# CONFIG_PCI_DEBUG is not set # # PCCARD (PCMCIA/CardBus) support @@ -96,10 +105,6 @@ # CONFIG_PCCARD is not set # -# PC-card bridges -# - -# # PCI Hotplug Support # # CONFIG_HOTPLUG_PCI is not set @@ -210,7 +215,6 @@ # CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_EATA is not set -# CONFIG_SCSI_EATA_PIO is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set # CONFIG_SCSI_GDTH is not set # CONFIG_SCSI_IPS is not set @@ -219,7 +223,6 @@ # CONFIG_SCSI_INIA100 is not set # CONFIG_SCSI_SYM53C8XX_2 is not set # CONFIG_SCSI_IPR is not set -# CONFIG_SCSI_QLOGIC_ISP is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA2XXX=y @@ -228,6 +231,7 @@ # CONFIG_SCSI_QLA2300 is not set # CONFIG_SCSI_QLA2322 is not set # CONFIG_SCSI_QLA6312 is not set +# 
CONFIG_SCSI_LPFC is not set # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_DEBUG is not set @@ -250,6 +254,7 @@ CONFIG_DM_SNAPSHOT=m CONFIG_DM_MIRROR=m CONFIG_DM_ZERO=m +# CONFIG_DM_MULTIPATH is not set # # Fusion MPT device support @@ -280,7 +285,6 @@ # CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set -# CONFIG_NETLINK_DEV is not set CONFIG_UNIX=y CONFIG_NET_KEY=m CONFIG_INET=y @@ -445,7 +449,6 @@ # CONFIG_DGRS is not set # CONFIG_EEPRO100 is not set CONFIG_E100=y -# CONFIG_E100_NAPI is not set # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set # CONFIG_NE2K_PCI is not set @@ -471,6 +474,7 @@ # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set # CONFIG_TIGON3 is not set +# CONFIG_BNX2 is not set # # Ethernet (10000 Mbit) @@ -539,14 +543,6 @@ # CONFIG_INPUT_EVBUG is not set # -# Input I/O drivers -# -# CONFIG_GAMEPORT is not set -CONFIG_SOUND_GAMEPORT=y -# CONFIG_SERIO is not set -# CONFIG_SERIO_I8042 is not set - -# # Input Device Drivers # # CONFIG_INPUT_KEYBOARD is not set @@ -556,6 +552,12 @@ # CONFIG_INPUT_MISC is not set # +# Hardware I/O ports +# +# CONFIG_SERIO is not set +# CONFIG_GAMEPORT is not set + +# # Character devices # # CONFIG_SERIAL_NONSTANDARD is not set @@ -570,6 +572,7 @@ # CONFIG_SERIAL_CORE=m CONFIG_SERIAL_ICOM=m +# CONFIG_SERIAL_JSM is not set CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 @@ -592,9 +595,16 @@ # # Ftape, the floppy tape device driver # +# CONFIG_AGP is not set # CONFIG_DRM is not set CONFIG_RAW_DRIVER=y CONFIG_MAX_RAW_DEVS=256 +# CONFIG_HANGCHECK_TIMER is not set + +# +# TPM devices +# +# CONFIG_TCG_TPM is not set # # I2C support @@ -633,13 +643,9 @@ # # USB support # -# CONFIG_USB is not set CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y - -# -# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support' may also be needed; see USB_STORAGE Help for more information -# +# CONFIG_USB is not set # # USB Gadget Support @@ -848,10 +854,13 @@ # # Kernel hacking 
# +# CONFIG_PRINTK_TIME is not set CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_LOG_BUF_SHIFT=17 # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +# CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set @@ -881,6 +890,7 @@ CONFIG_CRYPTO_SHA256=m CONFIG_CRYPTO_SHA512=m CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_DES=y CONFIG_CRYPTO_BLOWFISH=m CONFIG_CRYPTO_TWOFISH=m diff -urN linux-2.6/arch/ppc64/configs/maple_defconfig test/arch/ppc64/configs/maple_defconfig --- linux-2.6/arch/ppc64/configs/maple_defconfig 2005-04-26 15:37:55.000000000 +1000 +++ test/arch/ppc64/configs/maple_defconfig 2005-06-14 17:12:54.000000000 +1000 @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.11-rc3-bk6 -# Wed Feb 9 23:34:53 2005 +# Linux kernel version: 2.6.12-rc6 +# Tue Jun 14 17:12:48 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -11,7 +11,7 @@ CONFIG_HAVE_DEC_LOCK=y CONFIG_EARLY_PRINTK=y CONFIG_COMPAT=y -CONFIG_FRAME_POINTER=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_FORCE_MAX_ZONEORDER=13 # @@ -20,6 +20,7 @@ CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y +CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup @@ -30,24 +31,28 @@ CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y -CONFIG_LOG_BUF_SHIFT=17 +# CONFIG_AUDIT is not set # CONFIG_HOTPLUG is not set CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y +# CONFIG_CPUSETS is not set # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y -# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 # CONFIG_TINY_SHMEM is not set +CONFIG_BASE_SMALL=0 # # Loadable module support @@ -84,6 +89,8 @@ 
# CONFIG_SCHED_SMT is not set # CONFIG_PREEMPT is not set CONFIG_GENERIC_HARDIRQS=y +CONFIG_SECCOMP=y +CONFIG_ISA_DMA_API=y # # General setup @@ -94,6 +101,7 @@ # CONFIG_BINFMT_MISC is not set CONFIG_PCI_LEGACY_PROC=y CONFIG_PCI_NAMES=y +# CONFIG_PCI_DEBUG is not set # # PCCARD (PCMCIA/CardBus) support @@ -101,10 +109,6 @@ # CONFIG_PCCARD is not set # -# PC-card bridges -# - -# # PCI Hotplug Support # # CONFIG_HOTPLUG_PCI is not set @@ -261,7 +265,6 @@ # CONFIG_PACKET=y CONFIG_PACKET_MMAP=y -# CONFIG_NETLINK_DEV is not set CONFIG_UNIX=y # CONFIG_NET_KEY is not set CONFIG_INET=y @@ -376,6 +379,8 @@ # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set # CONFIG_TIGON3 is not set +# CONFIG_BNX2 is not set +# CONFIG_MV643XX_ETH is not set # # Ethernet (10000 Mbit) @@ -432,14 +437,6 @@ # CONFIG_INPUT_EVBUG is not set # -# Input I/O drivers -# -# CONFIG_GAMEPORT is not set -CONFIG_SOUND_GAMEPORT=y -# CONFIG_SERIO is not set -# CONFIG_SERIO_I8042 is not set - -# # Input Device Drivers # # CONFIG_INPUT_KEYBOARD is not set @@ -449,6 +446,12 @@ # CONFIG_INPUT_MISC is not set # +# Hardware I/O ports +# +# CONFIG_SERIO is not set +# CONFIG_GAMEPORT is not set + +# # Character devices # CONFIG_VT=y @@ -469,7 +472,7 @@ # CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y -# CONFIG_SERIAL_PMACZILOG is not set +# CONFIG_SERIAL_JSM is not set CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 @@ -492,8 +495,15 @@ # # Ftape, the floppy tape device driver # +# CONFIG_AGP is not set # CONFIG_DRM is not set # CONFIG_RAW_DRIVER is not set +# CONFIG_HANGCHECK_TIMER is not set + +# +# TPM devices +# +# CONFIG_TCG_TPM is not set # # I2C support @@ -518,8 +528,8 @@ CONFIG_I2C_AMD8111=y # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set +# CONFIG_I2C_PIIX4 is not set # CONFIG_I2C_ISA is not set -# CONFIG_I2C_MPC is not set # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT_LIGHT is not set # CONFIG_I2C_PROSAVAGE is not set @@ -545,7 +555,9 @@ # 
CONFIG_SENSORS_ASB100 is not set # CONFIG_SENSORS_DS1621 is not set # CONFIG_SENSORS_FSCHER is not set +# CONFIG_SENSORS_FSCPOS is not set # CONFIG_SENSORS_GL518SM is not set +# CONFIG_SENSORS_GL520SM is not set # CONFIG_SENSORS_IT87 is not set # CONFIG_SENSORS_LM63 is not set # CONFIG_SENSORS_LM75 is not set @@ -556,9 +568,11 @@ # CONFIG_SENSORS_LM85 is not set # CONFIG_SENSORS_LM87 is not set # CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set # CONFIG_SENSORS_MAX1619 is not set # CONFIG_SENSORS_PC87360 is not set # CONFIG_SENSORS_SMSC47B397 is not set +# CONFIG_SENSORS_SIS5595 is not set # CONFIG_SENSORS_SMSC47M1 is not set # CONFIG_SENSORS_VIA686A is not set # CONFIG_SENSORS_W83781D is not set @@ -568,6 +582,7 @@ # # Other I2C Chip support # +# CONFIG_SENSORS_DS1337 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set # CONFIG_SENSORS_PCF8591 is not set @@ -615,6 +630,8 @@ # # USB support # +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB=y # CONFIG_USB_DEBUG is not set @@ -625,8 +642,6 @@ # CONFIG_USB_BANDWIDTH is not set # CONFIG_USB_DYNAMIC_MINORS is not set # CONFIG_USB_OTG is not set -CONFIG_USB_ARCH_HAS_HCD=y -CONFIG_USB_ARCH_HAS_OHCI=y # # USB Host Controller Drivers @@ -635,6 +650,8 @@ CONFIG_USB_EHCI_SPLIT_ISO=y CONFIG_USB_EHCI_ROOT_HUB_TT=y CONFIG_USB_OHCI_HCD=y +# CONFIG_USB_OHCI_BIG_ENDIAN is not set +CONFIG_USB_OHCI_LITTLE_ENDIAN=y CONFIG_USB_UHCI_HCD=y # CONFIG_USB_SL811_HCD is not set @@ -688,6 +705,7 @@ CONFIG_USB_PEGASUS=y # CONFIG_USB_RTL8150 is not set # CONFIG_USB_USBNET is not set +CONFIG_USB_MON=y # # USB port drivers @@ -699,8 +717,10 @@ CONFIG_USB_SERIAL=y # CONFIG_USB_SERIAL_CONSOLE is not set CONFIG_USB_SERIAL_GENERIC=y +# CONFIG_USB_SERIAL_AIRPRIME is not set # CONFIG_USB_SERIAL_BELKIN is not set # CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set +# CONFIG_USB_SERIAL_CP2101 is not set CONFIG_USB_SERIAL_CYPRESS_M8=m # CONFIG_USB_SERIAL_EMPEG is not set # 
CONFIG_USB_SERIAL_FTDI_SIO is not set @@ -729,6 +749,7 @@ # CONFIG_USB_SERIAL_KOBIL_SCT is not set # CONFIG_USB_SERIAL_MCT_U232 is not set # CONFIG_USB_SERIAL_PL2303 is not set +# CONFIG_USB_SERIAL_HP4X is not set # CONFIG_USB_SERIAL_SAFE is not set CONFIG_USB_SERIAL_TI=m # CONFIG_USB_SERIAL_CYBERJACK is not set @@ -750,6 +771,7 @@ # CONFIG_USB_PHIDGETKIT is not set # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set +# CONFIG_USB_SISUSBVGA is not set # CONFIG_USB_TEST is not set # @@ -936,10 +958,13 @@ # # Kernel hacking # +# CONFIG_PRINTK_TIME is not set CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_LOG_BUF_SHIFT=17 # CONFIG_SCHEDSTATS is not set CONFIG_DEBUG_SLAB=y +# CONFIG_DEBUG_SPINLOCK is not set CONFIG_DEBUG_SPINLOCK_SLEEP=y # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set @@ -971,6 +996,7 @@ # CONFIG_CRYPTO_SHA256 is not set # CONFIG_CRYPTO_SHA512 is not set # CONFIG_CRYPTO_WP512 is not set +# CONFIG_CRYPTO_TGR192 is not set CONFIG_CRYPTO_DES=y # CONFIG_CRYPTO_BLOWFISH is not set # CONFIG_CRYPTO_TWOFISH is not set diff -urN linux-2.6/arch/ppc64/configs/pSeries_defconfig test/arch/ppc64/configs/pSeries_defconfig --- linux-2.6/arch/ppc64/configs/pSeries_defconfig 2005-04-26 15:37:55.000000000 +1000 +++ test/arch/ppc64/configs/pSeries_defconfig 2005-06-14 17:13:54.000000000 +1000 @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.11-rc3-bk6 -# Wed Feb 9 23:34:54 2005 +# Linux kernel version: 2.6.12-rc6 +# Tue Jun 14 17:13:47 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -11,7 +11,7 @@ CONFIG_HAVE_DEC_LOCK=y CONFIG_EARLY_PRINTK=y CONFIG_COMPAT=y -CONFIG_FRAME_POINTER=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_FORCE_MAX_ZONEORDER=13 # @@ -20,6 +20,7 @@ CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y +CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup @@ -30,24 +31,29 @@ CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y 
-CONFIG_LOG_BUF_SHIFT=17 +CONFIG_AUDIT=y +CONFIG_AUDITSYSCALL=y CONFIG_HOTPLUG=y CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y +CONFIG_CPUSETS=y # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y -# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 # CONFIG_TINY_SHMEM is not set +CONFIG_BASE_SMALL=0 # # Loadable module support @@ -89,9 +95,12 @@ CONFIG_EEH=y CONFIG_GENERIC_HARDIRQS=y CONFIG_PPC_RTAS=y +CONFIG_RTAS_PROC=y CONFIG_RTAS_FLASH=m CONFIG_SCANLOG=m CONFIG_LPARCFG=y +CONFIG_SECCOMP=y +CONFIG_ISA_DMA_API=y # # General setup @@ -102,6 +111,7 @@ # CONFIG_BINFMT_MISC is not set CONFIG_PCI_LEGACY_PROC=y CONFIG_PCI_NAMES=y +# CONFIG_PCI_DEBUG is not set CONFIG_HOTPLUG_CPU=y # @@ -110,10 +120,6 @@ # CONFIG_PCCARD is not set # -# PC-card bridges -# - -# # PCI Hotplug Support # CONFIG_HOTPLUG_PCI=m @@ -147,11 +153,10 @@ # CONFIG_PARPORT=m CONFIG_PARPORT_PC=m -CONFIG_PARPORT_PC_CML1=m # CONFIG_PARPORT_SERIAL is not set # CONFIG_PARPORT_PC_FIFO is not set # CONFIG_PARPORT_PC_SUPERIO is not set -# CONFIG_PARPORT_OTHER is not set +# CONFIG_PARPORT_GSC is not set # CONFIG_PARPORT_1284 is not set # @@ -293,7 +298,6 @@ # CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_EATA is not set -# CONFIG_SCSI_EATA_PIO is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set # CONFIG_SCSI_GDTH is not set # CONFIG_SCSI_IPS is not set @@ -310,7 +314,6 @@ CONFIG_SCSI_IPR=y CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y -# CONFIG_SCSI_QLOGIC_ISP is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA2XXX=y @@ -319,6 +322,7 @@ CONFIG_SCSI_QLA2300=m CONFIG_SCSI_QLA2322=m CONFIG_SCSI_QLA6312=m +CONFIG_SCSI_LPFC=m # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T 
is not set # CONFIG_SCSI_DEBUG is not set @@ -341,6 +345,8 @@ CONFIG_DM_SNAPSHOT=m CONFIG_DM_MIRROR=m CONFIG_DM_ZERO=m +CONFIG_DM_MULTIPATH=m +CONFIG_DM_MULTIPATH_EMC=m # # Fusion MPT device support @@ -371,7 +377,6 @@ # CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set -# CONFIG_NETLINK_DEV is not set CONFIG_UNIX=y CONFIG_NET_KEY=m CONFIG_INET=y @@ -539,7 +544,6 @@ # CONFIG_DGRS is not set # CONFIG_EEPRO100 is not set CONFIG_E100=y -# CONFIG_E100_NAPI is not set # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set # CONFIG_NE2K_PCI is not set @@ -565,6 +569,8 @@ # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y +# CONFIG_BNX2 is not set +# CONFIG_MV643XX_ETH is not set # # Ethernet (10000 Mbit) @@ -636,20 +642,6 @@ # CONFIG_INPUT_EVBUG is not set # -# Input I/O drivers -# -# CONFIG_GAMEPORT is not set -CONFIG_SOUND_GAMEPORT=y -CONFIG_SERIO=y -CONFIG_SERIO_I8042=y -# CONFIG_SERIO_SERPORT is not set -# CONFIG_SERIO_CT82C710 is not set -# CONFIG_SERIO_PARKBD is not set -# CONFIG_SERIO_PCIPS2 is not set -CONFIG_SERIO_LIBPS2=y -# CONFIG_SERIO_RAW is not set - -# # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y @@ -669,6 +661,18 @@ # CONFIG_INPUT_UINPUT is not set # +# Hardware I/O ports +# +CONFIG_SERIO=y +CONFIG_SERIO_I8042=y +# CONFIG_SERIO_SERPORT is not set +# CONFIG_SERIO_PARKBD is not set +# CONFIG_SERIO_PCIPS2 is not set +CONFIG_SERIO_LIBPS2=y +# CONFIG_SERIO_RAW is not set +# CONFIG_GAMEPORT is not set + +# # Character devices # CONFIG_VT=y @@ -689,8 +693,8 @@ # CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y -# CONFIG_SERIAL_PMACZILOG is not set CONFIG_SERIAL_ICOM=m +# CONFIG_SERIAL_JSM is not set CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 @@ -718,9 +722,16 @@ # # Ftape, the floppy tape device driver # +# CONFIG_AGP is not set # CONFIG_DRM is not set CONFIG_RAW_DRIVER=y CONFIG_MAX_RAW_DEVS=1024 +# CONFIG_HANGCHECK_TIMER is not set + +# +# TPM devices +# +# CONFIG_TCG_TPM is not set # # I2C support @@ 
-745,8 +756,8 @@ # CONFIG_I2C_AMD8111 is not set # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set +# CONFIG_I2C_PIIX4 is not set # CONFIG_I2C_ISA is not set -# CONFIG_I2C_MPC is not set # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT is not set # CONFIG_I2C_PARPORT_LIGHT is not set @@ -773,7 +784,9 @@ # CONFIG_SENSORS_ASB100 is not set # CONFIG_SENSORS_DS1621 is not set # CONFIG_SENSORS_FSCHER is not set +# CONFIG_SENSORS_FSCPOS is not set # CONFIG_SENSORS_GL518SM is not set +# CONFIG_SENSORS_GL520SM is not set # CONFIG_SENSORS_IT87 is not set # CONFIG_SENSORS_LM63 is not set # CONFIG_SENSORS_LM75 is not set @@ -784,9 +797,11 @@ # CONFIG_SENSORS_LM85 is not set # CONFIG_SENSORS_LM87 is not set # CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set # CONFIG_SENSORS_MAX1619 is not set # CONFIG_SENSORS_PC87360 is not set # CONFIG_SENSORS_SMSC47B397 is not set +# CONFIG_SENSORS_SIS5595 is not set # CONFIG_SENSORS_SMSC47M1 is not set # CONFIG_SENSORS_VIA686A is not set # CONFIG_SENSORS_W83781D is not set @@ -796,6 +811,7 @@ # # Other I2C Chip support # +# CONFIG_SENSORS_DS1337 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set # CONFIG_SENSORS_PCF8591 is not set @@ -828,8 +844,13 @@ # Graphics support # CONFIG_FB=y +CONFIG_FB_CFB_FILLRECT=y +CONFIG_FB_CFB_COPYAREA=y +CONFIG_FB_CFB_IMAGEBLIT=y +CONFIG_FB_SOFT_CURSOR=y +CONFIG_FB_MACMODES=y CONFIG_FB_MODE_HELPERS=y -# CONFIG_FB_TILEBLITTING is not set +CONFIG_FB_TILEBLITTING=y # CONFIG_FB_CIRRUS is not set # CONFIG_FB_PM2 is not set # CONFIG_FB_CYBER2000 is not set @@ -838,6 +859,7 @@ # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set +# CONFIG_FB_NVIDIA is not set # CONFIG_FB_RIVA is not set CONFIG_FB_MATROX=y CONFIG_FB_MATROX_MILLENIUM=y @@ -858,6 +880,7 @@ # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set # CONFIG_FB_TRIDENT is not set +# CONFIG_FB_S1D13XXX is not set # CONFIG_FB_VIRTUAL is not set # @@ -891,6 
+914,8 @@ # # USB support # +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB=y # CONFIG_USB_DEBUG is not set @@ -901,8 +926,6 @@ # CONFIG_USB_BANDWIDTH is not set # CONFIG_USB_DYNAMIC_MINORS is not set # CONFIG_USB_OTG is not set -CONFIG_USB_ARCH_HAS_HCD=y -CONFIG_USB_ARCH_HAS_OHCI=y # # USB Host Controller Drivers @@ -911,6 +934,8 @@ # CONFIG_USB_EHCI_SPLIT_ISO is not set # CONFIG_USB_EHCI_ROOT_HUB_TT is not set CONFIG_USB_OHCI_HCD=y +# CONFIG_USB_OHCI_BIG_ENDIAN is not set +CONFIG_USB_OHCI_LITTLE_ENDIAN=y # CONFIG_USB_UHCI_HCD is not set # CONFIG_USB_SL811_HCD is not set @@ -926,12 +951,11 @@ # CONFIG_USB_STORAGE=y # CONFIG_USB_STORAGE_DEBUG is not set -# CONFIG_USB_STORAGE_RW_DETECT is not set # CONFIG_USB_STORAGE_DATAFAB is not set # CONFIG_USB_STORAGE_FREECOM is not set # CONFIG_USB_STORAGE_ISD200 is not set # CONFIG_USB_STORAGE_DPCM is not set -# CONFIG_USB_STORAGE_HP8200e is not set +# CONFIG_USB_STORAGE_USBAT is not set # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set @@ -975,6 +999,7 @@ # CONFIG_USB_PEGASUS is not set # CONFIG_USB_RTL8150 is not set # CONFIG_USB_USBNET is not set +CONFIG_USB_MON=y # # USB port drivers @@ -1000,6 +1025,7 @@ # CONFIG_USB_PHIDGETKIT is not set # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set +# CONFIG_USB_SISUSBVGA is not set # CONFIG_USB_TEST is not set # @@ -1208,10 +1234,13 @@ # # Kernel hacking # +# CONFIG_PRINTK_TIME is not set CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_LOG_BUF_SHIFT=17 # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +# CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set @@ -1243,6 +1272,7 @@ CONFIG_CRYPTO_SHA256=m CONFIG_CRYPTO_SHA512=m CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_DES=y CONFIG_CRYPTO_BLOWFISH=m CONFIG_CRYPTO_TWOFISH=m From ceagan at gmail.com Wed Jun 15 
01:40:06 2005 From: ceagan at gmail.com (Chris Eagan) Date: Tue, 14 Jun 2005 11:40:06 -0400 Subject: Brand new iMac G5 In-Reply-To: <1118709405.5986.103.camel@gaston> References: <1118709405.5986.103.camel@gaston> Message-ID: Hey Ben, Thanks for the patch. Unfortunately it is still giving us the same problem where the network device is up but communication using it fails. Is there any other information I could get off the machine that would help? -Chris On 6/13/05, Benjamin Herrenschmidt wrote: > On Mon, 2005-06-13 at 15:17 -0400, Chris Eagan wrote: > > We have a brand new iMac G5 and we are having trouble with the network > > card drivers. It seems that they load and find the device okay, but > > then setting an ip manually does not give access to the network. DHCP > > fails to find anything as well. We have been working on this problem > > at http://bugs.gentoo.org/show_bug.cgi?id=94263 but nothing has worked > > so far. I have included some information about the device here. If any > > other information is needed, we have a cd that boots to a prompt on > > the machine and would be happy to provide any needed information. > > Hi ! 
> > Please try this patch and let me know asap: > > Index: linux-work/drivers/net/sungem.c > =================================================================== > --- linux-work.orig/drivers/net/sungem.c 2005-05-02 10:48:28.000000000 +1000 > +++ linux-work/drivers/net/sungem.c 2005-06-14 10:17:38.000000000 +1000 > @@ -3078,7 +3078,9 @@ > gp->phy_mii.dev = dev; > gp->phy_mii.mdio_read = _phy_read; > gp->phy_mii.mdio_write = _phy_write; > - > +#ifdef CONFIG_PPC_PMAC > + gp->phy_mii.platform_data = gp->of_node; > +#endif > /* By default, we start with autoneg */ > gp->want_autoneg = 1; > > Index: linux-work/drivers/net/sungem_phy.c > =================================================================== > --- linux-work.orig/drivers/net/sungem_phy.c 2005-05-02 10:48:28.000000000 +1000 > +++ linux-work/drivers/net/sungem_phy.c 2005-06-14 10:36:07.000000000 +1000 > @@ -32,6 +32,10 @@ > #include > #include > > +#ifdef CONFIG_PPC_PMAC > +#include > +#endif > + > #include "sungem_phy.h" > > /* Link modes of the BCM5400 PHY */ > @@ -281,10 +285,12 @@ > static int bcm5421_init(struct mii_phy* phy) > { > u16 data; > - int rev; > + unsigned int id; > > - rev = phy_read(phy, MII_PHYSID2) & 0x000f; > - if (rev == 0) { > + id = (phy_read(phy, MII_PHYSID1) << 16 | phy_read(phy, MII_PHYSID2)); > + > + /* Revision 0 of 5421 needs some fixups */ > + if (id == 0x002060e0) { > /* This is borrowed from MacOS > */ > phy_write(phy, 0x18, 0x1007); > @@ -297,21 +303,28 @@ > data = phy_read(phy, 0x15); > phy_write(phy, 0x15, data | 0x0200); > } > -#if 0 > - /* This has to be verified before I enable it */ > - /* Enable automatic low-power */ > - phy_write(phy, 0x1c, 0x9002); > - phy_write(phy, 0x1c, 0xa821); > - phy_write(phy, 0x1c, 0x941d); > -#endif > - return 0; > -} > > -static int bcm5421k2_init(struct mii_phy* phy) > -{ > - /* Init code borrowed from OF */ > - phy_write(phy, 4, 0x01e1); > - phy_write(phy, 9, 0x0300); > + /* Pick up some init code from OF for K2 version */ > + if ((id & 
0xfffffff0) == 0x002062e0) { > + phy_write(phy, 4, 0x01e1); > + phy_write(phy, 9, 0x0300); > + } > + > + /* Check if we can enable automatic low power */ > +#ifdef CONFIG_PPC_PMAC > + if (phy->platform_data) { > + struct device_node *np = of_get_parent(phy->platform_data); > + int can_low_power = 1; > + if (np == NULL || get_property(np, "no-autolowpower", NULL)) > + can_low_power = 0; > + if (can_low_power) { > + /* Enable automatic low-power */ > + phy_write(phy, 0x1c, 0x9002); > + phy_write(phy, 0x1c, 0xa821); > + phy_write(phy, 0x1c, 0x941d); > + } > + } > +#endif /* CONFIG_PPC_PMAC */ > > return 0; > } > @@ -762,7 +775,7 @@ > > /* Broadcom BCM 5421 built-in K2 */ > static struct mii_phy_ops bcm5421k2_phy_ops = { > - .init = bcm5421k2_init, > + .init = bcm5421_init, > .suspend = bcm5411_suspend, > .setup_aneg = bcm54xx_setup_aneg, > .setup_forced = bcm54xx_setup_forced, > @@ -779,6 +792,25 @@ > .ops = &bcm5421k2_phy_ops > }; > > +/* Broadcom BCM 5462 built-in Vesta */ > +static struct mii_phy_ops bcm5462V_phy_ops = { > + .init = bcm5421_init, > + .suspend = bcm5411_suspend, > + .setup_aneg = bcm54xx_setup_aneg, > + .setup_forced = bcm54xx_setup_forced, > + .poll_link = genmii_poll_link, > + .read_link = bcm54xx_read_link, > +}; > + > +static struct mii_phy_def bcm5462V_phy_def = { > + .phy_id = 0x002062e0, > + .phy_id_mask = 0xfffffff0, > + .name = "BCM5462-Vesta", > + .features = MII_GBIT_FEATURES, > + .magic_aneg = 1, > + .ops = &bcm5462V_phy_ops > +}; > + > /* Marvell 88E1101 (Apple seem to deal with 2 different revs, > * I masked out the 8 last bits to get both, but some specs > * would be useful here) --BenH. 
> @@ -824,6 +856,7 @@ > &bcm5411_phy_def, > &bcm5421_phy_def, > &bcm5421k2_phy_def, > + &bcm5462V_phy_def, > &marvell_phy_def, > &genmii_phy_def, > NULL > Index: linux-work/drivers/net/sungem_phy.h > =================================================================== > --- linux-work.orig/drivers/net/sungem_phy.h 2005-05-02 10:48:28.000000000 +1000 > +++ linux-work/drivers/net/sungem_phy.h 2005-06-14 10:16:14.000000000 +1000 > @@ -43,9 +43,10 @@ > int pause; > > /* Provided by host chip */ > - struct net_device* dev; > + struct net_device *dev; > int (*mdio_read) (struct net_device *dev, int mii_id, int reg); > void (*mdio_write) (struct net_device *dev, int mii_id, int reg, int val); > + void *platform_data; > }; > > /* Pass in a struct mii_phy with dev, mdio_read and mdio_write > > > -- Chris Eagan ceagan at gmail.com Want 2GB Google Mail? Just ask me! 
From benh at kernel.crashing.org Wed Jun 15 08:42:10 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 15 Jun 2005 08:42:10 +1000 Subject: Brand new iMac G5 In-Reply-To: References: <1118709405.5986.103.camel@gaston> Message-ID: <1118788931.5986.178.camel@gaston> On Tue, 2005-06-14 at 11:40 -0400, Chris Eagan wrote: > Hey Ben, > > Thanks for the patch. Unfortunately it is still giving us the same > problem where the network device is up but communication using it > fails. Is there any other information I could get off the machine that > would help? The dmesg output of the network driver, specifically, the messages about the PHY detection. Ben. From robl at mcs.anl.gov Wed Jun 15 09:20:54 2005 From: robl at mcs.anl.gov (Robert Latham) Date: Tue, 14 Jun 2005 18:20:54 -0500 Subject: aio problem: consume 100% cpu Message-ID: <20050614232054.GN25871@mcs.anl.gov> Hi I have an application that makes use of asynchronous IO. This code works on x86, x86-64, ppc32, and alpha, so I'm pretty sure we're calling the async IO routines correctly. On ppc64, however, after the async IO completes the application consumes all available CPU until it exits (and since the application is a server, that's indefinitely). A while back someone reported a similar problem with asynchronous IO: http://lists.suse.com/archive/suse-programming-e/2004-Sep/0103.html that url includes a link to "aiotest.c", which demonstrates the problem. 
I'm testing on a YDL-4.0 machine with glibc-2.3.3-18.ydl.4 Thanks ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B From paulus at samba.org Wed Jun 15 11:54:33 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 15 Jun 2005 11:54:33 +1000 Subject: aio problem: consume 100% cpu In-Reply-To: <20050614232054.GN25871@mcs.anl.gov> References: <20050614232054.GN25871@mcs.anl.gov> Message-ID: <17071.35417.494530.250000@cargo.ozlabs.ibm.com> Robert Latham writes: > On ppc64, however, after the async IO completes the application > consumes all available CPU until it exits (and since the application > is a server, that's indefinitely). Do you have a program that exhibits this behaviour that you can share with us? Paul. From robl at mcs.anl.gov Wed Jun 15 12:40:00 2005 From: robl at mcs.anl.gov (Robert Latham) Date: Tue, 14 Jun 2005 21:40:00 -0500 Subject: aio problem: consume 100% cpu In-Reply-To: <17071.35417.494530.250000@cargo.ozlabs.ibm.com> References: <20050614232054.GN25871@mcs.anl.gov> <17071.35417.494530.250000@cargo.ozlabs.ibm.com> Message-ID: <20050615024000.GP25871@mcs.anl.gov> On Wed, Jun 15, 2005 at 11:54:33AM +1000, Paul Mackerras wrote: > Do you have a program that exhibits this behaviour that you can share > with us? 
sorry if i didn't make it clear: Matthew Gregan wrote a simple test case which demonstrates the same behavior i'm seeing in my application: http://lists.suse.com/archive/suse-programming-e/2004-Sep/att-0103/aiotest.c ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B From paulus at samba.org Wed Jun 15 13:14:56 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 15 Jun 2005 13:14:56 +1000 Subject: aio problem: consume 100% cpu In-Reply-To: <20050615024000.GP25871@mcs.anl.gov> References: <20050614232054.GN25871@mcs.anl.gov> <17071.35417.494530.250000@cargo.ozlabs.ibm.com> <20050615024000.GP25871@mcs.anl.gov> Message-ID: <17071.40240.344319.166441@cargo.ozlabs.ibm.com> Robert Latham writes: > sorry if i didn't make it clear: Matthew Gregan wrote a simple test > case which demonstrates the same behavior i'm seeing in my > application: > > http://lists.suse.com/archive/suse-programming-e/2004-Sep/att-0103/aiotest.c What library do you link that against? I get aio_write and aio_error undefined, even if I link with -laio. Paul. From robl at mcs.anl.gov Wed Jun 15 14:33:50 2005 From: robl at mcs.anl.gov (Robert Latham) Date: Tue, 14 Jun 2005 23:33:50 -0500 Subject: aio problem: consume 100% cpu In-Reply-To: <17071.40240.344319.166441@cargo.ozlabs.ibm.com> References: <20050614232054.GN25871@mcs.anl.gov> <17071.35417.494530.250000@cargo.ozlabs.ibm.com> <20050615024000.GP25871@mcs.anl.gov> <17071.40240.344319.166441@cargo.ozlabs.ibm.com> Message-ID: <20050615043350.GQ25871@mcs.anl.gov> On Wed, Jun 15, 2005 at 01:14:56PM +1000, Paul Mackerras wrote: > What library do you link that against? I get aio_write and aio_error > undefined, even if I link with -laio. 
those routines are in librt (gcc -Wall -g aiotest.c -o aiotest -lrt) ==rob -- Rob Latham Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF Argonne National Labs, IL USA B29D F333 664A 4280 315B From ceagan at gmail.com Thu Jun 16 00:43:13 2005 From: ceagan at gmail.com (Chris Eagan) Date: Wed, 15 Jun 2005 10:43:13 -0400 Subject: Brand new iMac G5 In-Reply-To: <1118788931.5986.178.camel@gaston> References: <1118709405.5986.103.camel@gaston> <1118788931.5986.178.camel@gaston> Message-ID: The dmesg output of the driver: (unrelated stuff here) sungem.c: v0.98 8/24/03 David S. Miller (davem at redhat.com) PHY ID: 2060d3, addr: 0 PHY ID: 2060d3, addr: 0 eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:11:24:39:6d:4c eth0: Found Generic MII PHY (unrelated stuff here) eth0: Link is up at 100 Mpbs, full-duplex (unrelated stuff here) eth0: Link is up at 100 Mpbs, full-duplex eth0: Pause is disabled -Chris On 6/14/05, Benjamin Herrenschmidt wrote: > On Tue, 2005-06-14 at 11:40 -0400, Chris Eagan wrote: > > Hey Ben, > > > > Thanks for the patch. Unfortunately it is still giving us the same > > problem where the network device is up but communication using it > > fails. Is there any other information I could get off the machine that > > would help? > > The dmesg output of the network driver, specifically, the messages about > the PHY detection. > > Ben. > > > -- Chris Eagan ceagan at gmail.com Want 2GB Google Mail? Just ask me! 
From jdl at freescale.com Thu Jun 16 05:00:46 2005 From: jdl at freescale.com (Jon Loeliger) Date: Wed, 15 Jun 2005 14:00:46 -0500 Subject: Discuss: Adding OF Flat Dev Tree to ppc32 In-Reply-To: <1118199997.6850.106.camel@gaston> References: <1117614390.19020.24.camel@gaston> <1117614484.19020.27.camel@gaston> <1117783104.31082.151.camel@gaston> <1117819176.6517.290.camel@cashmere.sps.mot.com> <1118199997.6850.106.camel@gaston> Message-ID: <1118862046.25372.49.camel@cashmere.sps.mot.com> On Tue, 2005-06-07 at 22:06, Benjamin Herrenschmidt wrote: > It's basically used to extract some infos directly from the flattened > tree in order to construct the LMB list (list of memory blocks, > equivalent of ppc32's mem_pieces), OK. So the unflattening process requires a small amount of memory allocation which is currently implemented using the lmb mechanism in PPC64 land. As you indicate, there is also the mem_pieces implementation over in ppc32 land. I think it is currently only used by arch/ppc/platforms/pmac_setup.c. In porting this code over to PPC32 land, there are roughly three choices: 1) Copy the LMB implementation from ppc64 over to PPC32 land, 2) Change the unflattening code in PPC32 to use mem_pieces, or rewrite it to allow a configurable choice between LMB and mem_pieces, or 3) Make up something new, yet very similar to LMB and mem_pieces. Does anyone have suggestions or advice on route 1) or 2)? Anyone? Kumar? Ben? Bueller? Thanks, jdl From benh at kernel.crashing.org Thu Jun 16 07:39:31 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 16 Jun 2005 07:39:31 +1000 Subject: Brand new iMac G5 In-Reply-To: References: <1118709405.5986.103.camel@gaston> <1118788931.5986.178.camel@gaston> Message-ID: <1118871572.5986.231.camel@gaston> On Wed, 2005-06-15 at 10:43 -0400, Chris Eagan wrote: > The dmesg output of the driver: > > (unrelated stuff here) > sungem.c: v0.98 8/24/03 David S. 
Miller (davem at redhat.com) > PHY ID: 2060d3, addr: 0 > PHY ID: 2060d3, addr: 0 > eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:11:24:39:6d:4c > eth0: Found Generic MII PHY > (unrelated stuff here) > eth0: Link is up at 100 Mpbs, full-duplex > (unrelated stuff here) > eth0: Link is up at 100 Mpbs, full-duplex > eth0: Pause is disabled Ok, the patch was missing a bit, here's a fixed version Index: linux-work/drivers/net/sungem.c =================================================================== --- linux-work.orig/drivers/net/sungem.c 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem.c 2005-06-14 10:17:38.000000000 +1000 @@ -3078,7 +3078,9 @@ gp->phy_mii.dev = dev; gp->phy_mii.mdio_read = _phy_read; gp->phy_mii.mdio_write = _phy_write; - +#ifdef CONFIG_PPC_PMAC + gp->phy_mii.platform_data = gp->of_node; +#endif /* By default, we start with autoneg */ gp->want_autoneg = 1; Index: linux-work/drivers/net/sungem_phy.c =================================================================== --- linux-work.orig/drivers/net/sungem_phy.c 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem_phy.c 2005-06-16 07:38:37.000000000 +1000 @@ -32,6 +32,10 @@ #include #include +#ifdef CONFIG_PPC_PMAC +#include +#endif + #include "sungem_phy.h" /* Link modes of the BCM5400 PHY */ @@ -281,10 +285,12 @@ static int bcm5421_init(struct mii_phy* phy) { u16 data; - int rev; + unsigned int id; - rev = phy_read(phy, MII_PHYSID2) & 0x000f; - if (rev == 0) { + id = (phy_read(phy, MII_PHYSID1) << 16 | phy_read(phy, MII_PHYSID2)); + + /* Revision 0 of 5421 needs some fixups */ + if (id == 0x002060e0) { /* This is borrowed from MacOS */ phy_write(phy, 0x18, 0x1007); @@ -297,21 +303,28 @@ data = phy_read(phy, 0x15); phy_write(phy, 0x15, data | 0x0200); } -#if 0 - /* This has to be verified before I enable it */ - /* Enable automatic low-power */ - phy_write(phy, 0x1c, 0x9002); - phy_write(phy, 0x1c, 0xa821); - phy_write(phy, 0x1c, 0x941d); -#endif - return 0; -} 
-static int bcm5421k2_init(struct mii_phy* phy) -{ - /* Init code borrowed from OF */ - phy_write(phy, 4, 0x01e1); - phy_write(phy, 9, 0x0300); + /* Pick up some init code from OF for K2 version */ + if ((id & 0xfffffff0) == 0x002062e0) { + phy_write(phy, 4, 0x01e1); + phy_write(phy, 9, 0x0300); + } + + /* Check if we can enable automatic low power */ +#ifdef CONFIG_PPC_PMAC + if (phy->platform_data) { + struct device_node *np = of_get_parent(phy->platform_data); + int can_low_power = 1; + if (np == NULL || get_property(np, "no-autolowpower", NULL)) + can_low_power = 0; + if (can_low_power) { + /* Enable automatic low-power */ + phy_write(phy, 0x1c, 0x9002); + phy_write(phy, 0x1c, 0xa821); + phy_write(phy, 0x1c, 0x941d); + } + } +#endif /* CONFIG_PPC_PMAC */ return 0; } @@ -762,7 +775,7 @@ /* Broadcom BCM 5421 built-in K2 */ static struct mii_phy_ops bcm5421k2_phy_ops = { - .init = bcm5421k2_init, + .init = bcm5421_init, .suspend = bcm5411_suspend, .setup_aneg = bcm54xx_setup_aneg, .setup_forced = bcm54xx_setup_forced, @@ -779,6 +792,25 @@ .ops = &bcm5421k2_phy_ops }; +/* Broadcom BCM 5462 built-in Vesta */ +static struct mii_phy_ops bcm5462V_phy_ops = { + .init = bcm5421_init, + .suspend = bcm5411_suspend, + .setup_aneg = bcm54xx_setup_aneg, + .setup_forced = bcm54xx_setup_forced, + .poll_link = genmii_poll_link, + .read_link = bcm54xx_read_link, +}; + +static struct mii_phy_def bcm5462V_phy_def = { + .phy_id = 0x002060d0, + .phy_id_mask = 0xfffffff0, + .name = "BCM5462-Vesta", + .features = MII_GBIT_FEATURES, + .magic_aneg = 1, + .ops = &bcm5462V_phy_ops +}; + /* Marvell 88E1101 (Apple seem to deal with 2 different revs, * I masked out the 8 last bits to get both, but some specs * would be useful here) --BenH. 
@@ -824,6 +856,7 @@ &bcm5411_phy_def, &bcm5421_phy_def, &bcm5421k2_phy_def, + &bcm5462V_phy_def, &marvell_phy_def, &genmii_phy_def, NULL Index: linux-work/drivers/net/sungem_phy.h =================================================================== --- linux-work.orig/drivers/net/sungem_phy.h 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem_phy.h 2005-06-14 10:16:14.000000000 +1000 @@ -43,9 +43,10 @@ int pause; /* Provided by host chip */ - struct net_device* dev; + struct net_device *dev; int (*mdio_read) (struct net_device *dev, int mii_id, int reg); void (*mdio_write) (struct net_device *dev, int mii_id, int reg, int val); + void *platform_data; }; /* Pass in a struct mii_phy with dev, mdio_read and mdio_write From benh at kernel.crashing.org Thu Jun 16 07:52:53 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 16 Jun 2005 07:52:53 +1000 Subject: Discuss: Adding OF Flat Dev Tree to ppc32 In-Reply-To: <1118862046.25372.49.camel@cashmere.sps.mot.com> References: <1117614390.19020.24.camel@gaston> <1117614484.19020.27.camel@gaston> <1117783104.31082.151.camel@gaston> <1117819176.6517.290.camel@cashmere.sps.mot.com> <1118199997.6850.106.camel@gaston> <1118862046.25372.49.camel@cashmere.sps.mot.com> Message-ID: <1118872373.5986.242.camel@gaston> On Wed, 2005-06-15 at 14:00 -0500, Jon Loeliger wrote: > On Tue, 2005-06-07 at 22:06, Benjamin Herrenschmidt wrote: > > > It's basically used to extract some infos directly from the flattened > > tree in order to construct the LMB list (list of memory blocks, > > equivalent of ppc32's mem_pieces), > > > OK. So the unflattenting process requires a small amount > of memory allocation which is currently implemented using > the lmb mechanism in PPC64 land. Not only the unflattening process. The full MMU initialisation (ability to map things etc...) needs mem_pieces too. The ppc32 kernel runs with a very limited memory setup until that happens. 
> As you indicate, there is also the mem_pieces implementation > over in ppc32 land. I think it is currently only used by > arch/ppc/platforms/pmac_setup.c. mem_pieces is used in several places in the early mmu setup too in arch/ppc/mm/* > In porting this code over to PPC32 land, there are roughly > three choices: > > 1) Copy the LMB implementation from ppc64 over to PPC32 land, No, we can stick to mem_pieces > 2) Change the unflattening code in PPC32 to use mem_pieces, > or rewrite it to allow a configurable choice between > LMB and mem_pieces, mem_pieces is fine > 3) Make up something new, yet very similar to LMB and > mem_pieces. > > Does anyone have suggestions or advice on route 1) or 2)? > Anyone? Kumar? Ben? Bueller? The main issue is not really there. The problem is that we will have to scan the flattened tree at boot in order to set up the MMU and that will have to be done with a very limited access to memory, possibly only the low 32Mb of RAM for example (though we may manage something better running in real mode with some CPUs). That means maybe imposing some restrictions on where the flattened device-tree block can be when passed to the kernel (unless we can run that C code in real mode) and other niceties that will have to be dealt with per CPU family. Ben. From david at gibson.dropbear.id.au Thu Jun 16 17:08:16 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 16 Jun 2005 17:08:16 +1000 Subject: More dtc changes Message-ID: <20050616070816.GE31347@localhost.localdomain> I now have a git tree for the device tree compiler up at http://www.ozlabs.org/~dgibson/dtc/dtc.git Notable changes since the last tarball release: - Elementary support for labels. Labels can go on nodes or properties in the source tree and are exported to assembler output as usable symbols - phandle reference support. Cell data (< .. >) can include references of the form &/path/to/some/node and will be replaced in the output tree with the output node's phandle. 
A phandle will be generated for the target node if it doesn't already have one. There's a tarball release as well http://www.ozlabs.org/~dgibson/dtc/dtc.tar.gz -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From jdl at freescale.com Fri Jun 17 06:49:16 2005 From: jdl at freescale.com (Jon Loeliger) Date: Thu, 16 Jun 2005 15:49:16 -0500 Subject: More dtc changes In-Reply-To: <20050616070816.GE31347@localhost.localdomain> References: <20050616070816.GE31347@localhost.localdomain> Message-ID: <1118954956.25372.71.camel@cashmere.sps.mot.com> On Thu, 2005-06-16 at 02:08, David Gibson wrote: > I now have a git tree for the device tree compiler up at > http://www.ozlabs.org/~dgibson/dtc/dtc.git Very Cool! And, um, rats. So, I'm the victim of an aggressive and anti-social IT Firewall poo-poolicy.... Or maybe I am just being dumb and haven't Googled up the Right Magic. Is there a way to set an HTTP Proxy definition, flag, parameter, environment variable, or other doo-dad that will help me cg-pull/git-http-pull through my firewall? Alternatively, any chance we can convert it to, or add, an rsync tree? > There's a tarball release as well > http://www.ozlabs.org/~dgibson/dtc/dtc.tar.gz If not, I'll have to be doing the tarball thing... Thanks, jdl From jdl at freescale.com Fri Jun 17 07:36:56 2005 From: jdl at freescale.com (Jon Loeliger) Date: Thu, 16 Jun 2005 16:36:56 -0500 Subject: More dtc changes In-Reply-To: <42B1EBFC.4020105@smiths-aerospace.com> References: <20050616070816.GE31347@localhost.localdomain> <1118954956.25372.71.camel@cashmere.sps.mot.com> <42B1EBFC.4020105@smiths-aerospace.com> Message-ID: <1118957816.25372.103.camel@cashmere.sps.mot.com> On Thu, 2005-06-16 at 16:15, Jerry Van Baren wrote: > > And, um, rats. So, I'm the victim of an aggressive > > and anti-social IT Firewall poo-poolicy.... 
Or maybe I > > am just being dumb and haven't Googled up the Right Magic. Door number 3, I am dumb. > > Is there a way to set an HTTP Proxy definition, flag, > > parameter, environment variable, or other doo-dad that > > will help me cg-pull/git-http-pull through my firewall? > It ain't efficient and it is pull-only, but it works: Good enough.... > $ cg-clone http://www.ozlabs.org/~dgibson/dtc/dtc.git > defaulting to local storage area > 17:13:12 > URL:http://www.ozlabs.org/%7Edgibson/dtc/dtc.git/refs/heads/master > [41/41] -> "refs/heads/origin" [1] > progress: 40 objects, 59841 bytes Hmmm... So, I tested this mess again. This time, I did NOT use our stupid SOCKS proxy, but rather the built-in HTTP proxy foo sort of like this in an environment variable sort of like this: setenv http_proxy http://$user:$password@$proxy_machine:$proxy_port cashmere.sps.mot.com 894 % socksify cg-pull origin loeliger at 192.88.158.50.1090 sockspassword: cg-pull: objects pull failed cashmere.sps.mot.com 895 % cg-pull origin 16:24:36 URL:http://www.ozlabs.org/%7Edgibson/dtc/dtc.git/refs/heads/master [41/41] -> "refs/heads/origin" [1] FINISHED --16:24:48-- And then it just stupidly worked. Pay no attention to my questions, jdl From gerald.vanbaren at smiths-aerospace.com Fri Jun 17 07:15:40 2005 From: gerald.vanbaren at smiths-aerospace.com (Jerry Van Baren) Date: Thu, 16 Jun 2005 17:15:40 -0400 Subject: More dtc changes In-Reply-To: <1118954956.25372.71.camel@cashmere.sps.mot.com> References: <20050616070816.GE31347@localhost.localdomain> <1118954956.25372.71.camel@cashmere.sps.mot.com> Message-ID: <42B1EBFC.4020105@smiths-aerospace.com> Jon Loeliger wrote: > On Thu, 2005-06-16 at 02:08, David Gibson wrote: > >>I now have a git tree for the device tree compiler up at >>http://www.ozlabs.org/~dgibson/dtc/dtc.git > > > Very Cool! > > And, um, rats. So, I'm the victim of an aggressive > and anti-social IT Firewall poo-poolicy.... 
Or maybe I > am just being dumb and haven't Googled up the Right Magic. > > Is there a way to set an HTTP Proxy definition, flag, > parameter, environment variable, or other doo-dad that > will help me cg-pull/git-http-pull through my firewall? > > Alternatively, any chance we can convert it to, or add, > an rsync tree? > > >>There's a tarball release as well >>http://www.ozlabs.org/~dgibson/dtc/dtc.tar.gz > > > If not, I'll have to be doing the tarball thing... > > Thanks, > jdl It ain't efficient and it is pull-only, but it works: $ mkdir test $ cd test test$ ll total 0 $ cg-clone http://www.ozlabs.org/~dgibson/dtc/dtc.git defaulting to local storage area 17:13:12 URL:http://www.ozlabs.org/%7Edgibson/dtc/dtc.git/refs/heads/master [41/41] -> "refs/heads/origin" [1] progress: 40 objects, 59841 bytes FINISHED --17:13:25-- Downloaded: 951 bytes in 1 files New branch: 81f2e89c7551ef44a6203ab1cbb8228d09202572 Cloned (origin http://www.ozlabs.org/~dgibson/dtc/dtc.git available as branch "origin") Cloned to dtc/ (origin http://www.ozlabs.org/~dgibson/dtc/dtc.git available as branch "origin") $ ll dtc/ total 116 -rw-r--r-- 1 vanbaren cideas 446 2005-06-16 17:13 comment-test.dts -rw-r--r-- 1 vanbaren cideas 17992 2005-06-16 17:13 COPYING -rw-r--r-- 1 vanbaren cideas 4776 2005-06-16 17:13 data.c -rw-r--r-- 1 vanbaren cideas 4896 2005-06-16 17:13 dtc.c -rw-r--r-- 1 vanbaren cideas 4704 2005-06-16 17:13 dtc.h -rw-r--r-- 1 vanbaren cideas 2989 2005-06-16 17:13 dtc-lexer.l -rw-r--r-- 1 vanbaren cideas 2890 2005-06-16 17:13 dtc-parser.y -rw-r--r-- 1 vanbaren cideas 19400 2005-06-16 17:13 flattree.c -rw-r--r-- 1 vanbaren cideas 2324 2005-06-16 17:13 fstree.c -rw-r--r-- 1 vanbaren cideas 15506 2005-06-16 17:13 livetree.c -rw-r--r-- 1 vanbaren cideas 559 2005-06-16 17:13 Makefile -rw-r--r-- 1 vanbaren cideas 707 2005-06-16 17:13 test.dts drwxr-xr-x 2 vanbaren cideas 4096 2005-06-16 17:13 tests -rw-r--r-- 1 vanbaren cideas 566 2005-06-16 17:13 TODO -rw-r--r-- 1 vanbaren cideas 3000 
2005-06-16 17:13 treesource.c $ du -s * 1368 dtc gvb From rusty.lynch at intel.com Fri Jun 17 08:31:39 2005 From: rusty.lynch at intel.com (rusty.lynch at intel.com) Date: Thu, 16 Jun 2005 15:31:39 -0700 Subject: [patch 0/5] [kprobes] Tweak to the function return probe design - take 2 Message-ID: <20050616223139.444305000@linux.jf.intel.com> The following is the second version of the function return probe patches I sent out earlier this week. Changes since my last submission include: * Fix in ppc64 code removing an unneeded call to re-enable preemption * Fix a build problem in ia64 when kprobes was turned off * Added another BUG_ON check to each of the architecture trampoline handlers My initial patch description ==> From my experiences with adding return probes to x86_64 and ia64, and the feedback on LKML to those patches, I think we can simplify the design for return probes. The following patch tweaks the original design such that: * Instead of storing the stack address in the return probe instance, the task pointer is stored. This gives us all we need in order to: - find the correct return probe instance when we enter the trampoline (even if we are recursing) - find all left-over return probe instances when the task is going away This has the side effect of simplifying the implementation since more work can be done in kernel/kprobes.c since architecture specific knowledge of the stack layout is no longer required. Specifically, we no longer have: - arch_get_kprobe_task() - arch_kprobe_flush_task() - get_rp_inst_tsk() - get_rp_inst() - trampoline_post_handler() * Instead of splitting the return probe handling and cleanup logic across the pre and post trampoline handlers, all the work is pushed into the pre function (trampoline_probe_handler), and then we skip single stepping the original function. In this case the original instruction to be single stepped was just a NOP, and we can do without the extra interruption. 
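To make the task-keyed bookkeeping described above concrete, here is a minimal user-space sketch in C. It is not the code from this patch series: `struct task`, `struct rp_inst`, the fixed-size pool and all `model_*` functions are hypothetical stand-ins, and the "newest first" ordering is an assumption about how instances get consumed. It only illustrates why storing the task pointer is enough to find the right instance at the trampoline, even with recursion or several return probes on one function, and to clean up when a task goes away.

```c
#include <assert.h>
#include <stddef.h>

/*
 * User-space model only. Every name here (struct task, rp_inst,
 * model_*) is hypothetical and is not the kernel's API.
 */
#define MAX_INST 8

struct task { int pid; };               /* stand-in for the task structure */

struct rp_inst {
	struct task *task;              /* owning task, stored instead of a stack address */
	void *ret_addr;                 /* return address saved at function entry */
	unsigned long seq;              /* allocation order; larger means newer */
	int used;
};

static struct rp_inst pool[MAX_INST];
static unsigned long next_seq = 1;
static char trampoline_marker;          /* its address plays the trampoline's role */
#define TRAMPOLINE ((void *)&trampoline_marker)

/* Function entry: save the return address and divert the return. */
static void *model_prepare(struct task *tsk, void *ret_addr)
{
	assert(tsk != NULL);
	for (int i = 0; i < MAX_INST; i++) {
		if (!pool[i].used) {
			pool[i].used = 1;
			pool[i].task = tsk;
			pool[i].ret_addr = ret_addr;
			pool[i].seq = next_seq++;
			return TRAMPOLINE;
		}
	}
	return ret_addr;                /* no free instance: this return is not probed */
}

/*
 * Trampoline hit: consume this task's instances, newest first, until one
 * holds a real return address. *handlers counts the handlers "run".
 */
static void *model_trampoline(struct task *tsk, int *handlers)
{
	for (;;) {
		struct rp_inst *newest = NULL;

		for (int i = 0; i < MAX_INST; i++)
			if (pool[i].used && pool[i].task == tsk &&
			    (!newest || pool[i].seq > newest->seq))
				newest = &pool[i];
		if (!newest)
			return NULL;    /* nothing recorded for this task */
		newest->used = 0;
		(*handlers)++;
		if (newest->ret_addr != TRAMPOLINE)
			return newest->ret_addr;
	}
}

/* Task exit: release whatever the task left behind; returns the count. */
static int model_flush_task(struct task *tsk)
{
	int freed = 0;

	for (int i = 0; i < MAX_INST; i++) {
		if (pool[i].used && pool[i].task == tsk) {
			pool[i].used = 0;
			freed++;
		}
	}
	return freed;
}
```

In this model a second probe on the same function records the trampoline itself as its "return address", so the trampoline side keeps consuming instances until it reaches the one holding the real address. Nothing in the lookup depends on the stack layout, which is the property the design change is after.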
The new flow of events for having a return probe handler execute when a
target function exits is:

* At system initialization time, a kprobe is inserted at the beginning
  of kretprobe_trampoline.  kernel/kprobes.c used to handle this on its
  own, but ia64 needed to do this a little differently (i.e. a function
  pointer is really a pointer to a structure containing the instruction
  pointer and a global pointer), so I added the notion of arch_init(),
  so that kernel/kprobes.c:init_kprobes() now allows architecture
  specific initialization by calling arch_init() before exiting.  Each
  architecture now registers a kprobe on its own trampoline function.

* register_kretprobe() will insert a kprobe at the beginning of the
  targeted function with the kprobe pre_handler set to
  arch_prepare_kretprobe (still no change)

* When the target function is entered, the kprobe is fired, calling
  arch_prepare_kretprobe (still no change)

* In arch_prepare_kretprobe() we try to get a free instance and if one
  is available then we fill out the instance with a pointer to the
  return probe, the original return address, and a pointer to the task
  structure (instead of the stack address.)  Just like before we change
  the return address to the trampoline function and mark the instance
  as used.

  If multiple return probes are registered for a given target function,
  then arch_prepare_kretprobe() will get called multiple times for the
  same task (since our kprobe implementation is able to handle multiple
  kprobes at the same address.)  Past the first call to
  arch_prepare_kretprobe, we end up with the original address stored in
  the return probe instance pointing to our trampoline function.  (This
  is a significant difference from the original arch_prepare_kretprobe
  design.)

* Target function executes like normal and then returns to
  kretprobe_trampoline.
* kprobe inserted on the first instruction of kretprobe_trampoline is
  fired and calls trampoline_probe_handler() (no change here)

* trampoline_probe_handler() consumes each of the instances associated
  with the current task by calling the registered handler function and
  marking the instance as unused until an instance is found that has a
  return address different than the trampoline function.
  (change similar to my previous ia64 RFC)

* If the task is killed with some left-over return probe instances
  (meaning that a target function was entered, but never returned),
  then we just free any instances associated with the task.  (Not much
  different other than we can handle this without calling architecture
  specific functions.)

  There is a known problem that this patch does not yet solve where
  registering a return probe on flush_old_exec or flush_thread will put
  us in a bad state.  Most likely the best way to handle this is to not
  allow registering return probes on these two functions.

  (Significant change)

This patch series applies to the 2.6.12-rc6-mm1 kernel, and provides:
  * kernel/kprobes.c changes
  * i386 patch of existing return probes implementation
  * x86_64 patch of existing return probe implementation
  * ia64 implementation
  * ppc64 implementation (provided by Ananth)

    --rusty

From rusty.lynch at intel.com  Fri Jun 17 08:31:40 2005
From: rusty.lynch at intel.com (rusty.lynch at intel.com)
Date: Thu, 16 Jun 2005 15:31:40 -0700
Subject: [patch 1/5] [kprobes] Tweak to the function return probe design - take 2
References: <20050616223139.444305000@linux.jf.intel.com>
Message-ID: <20050616223254.746591000@linux.jf.intel.com>

An embedded and charset-unspecified text was scrubbed...
Name: kprobes-return-probes-redux-base.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050616/f66149ba/attachment.txt From rusty.lynch at intel.com Fri Jun 17 08:31:41 2005 From: rusty.lynch at intel.com (rusty.lynch at intel.com) Date: Thu, 16 Jun 2005 15:31:41 -0700 Subject: [patch 2/5] [kprobes] Tweak to the function return probe design - take 2 References: <20050616223139.444305000@linux.jf.intel.com> Message-ID: <20050616223306.401011000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... Name: kprobes-return-probes-redux-i386.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050616/ae1a73fd/attachment.txt From rusty.lynch at intel.com Fri Jun 17 08:31:42 2005 From: rusty.lynch at intel.com (rusty.lynch at intel.com) Date: Thu, 16 Jun 2005 15:31:42 -0700 Subject: [patch 3/5] [kprobes] Tweak to the function return probe design - take 2 References: <20050616223139.444305000@linux.jf.intel.com> Message-ID: <20050616223306.883491000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... Name: kprobes-return-probes-redux-x86_64.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050616/3651c74a/attachment.txt From rusty.lynch at intel.com Fri Jun 17 08:31:43 2005 From: rusty.lynch at intel.com (rusty.lynch at intel.com) Date: Thu, 16 Jun 2005 15:31:43 -0700 Subject: [patch 4/5] [kprobes] Tweak to the function return probe design - take 2 References: <20050616223139.444305000@linux.jf.intel.com> Message-ID: <20050616223307.395436000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... 
Name: kprobes-return-probes-redux-ia64.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050616/f9086e2d/attachment.txt From rusty.lynch at intel.com Fri Jun 17 08:31:44 2005 From: rusty.lynch at intel.com (rusty.lynch at intel.com) Date: Thu, 16 Jun 2005 15:31:44 -0700 Subject: [patch 5/5] [kprobes] Tweak to the function return probe design - take 2 References: <20050616223139.444305000@linux.jf.intel.com> Message-ID: <20050616223308.007672000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... Name: kprobes-return-probes-redux-ppc64.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050616/5c671c16/attachment.txt From david at gibson.dropbear.id.au Fri Jun 17 10:45:54 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 17 Jun 2005 10:45:54 +1000 Subject: More dtc changes In-Reply-To: <1118954956.25372.71.camel@cashmere.sps.mot.com> References: <20050616070816.GE31347@localhost.localdomain> <1118954956.25372.71.camel@cashmere.sps.mot.com> Message-ID: <20050617004554.GA31491@localhost.localdomain> On Thu, Jun 16, 2005 at 03:49:16PM -0500, Jon Loeliger wrote: > On Thu, 2005-06-16 at 02:08, David Gibson wrote: > > I now have a git tree for the device tree compiler up at > > http://www.ozlabs.org/~dgibson/dtc/dtc.git > > Very Cool! > > And, um, rats. So, I'm the victim of an aggressive > and anti-social IT Firewall poo-poolicy.... Or maybe I > am just being dumb and haven't Googled up the Right Magic. > > Is there a way to set an HTTP Proxy definition, flag, > parameter, environment variable, or other doo-dad that > will help me cg-pull/git-http-pull through my firewall? > > Alternatively, any chance we can convert it to, or add, > an rsync tree? Oh, oops, I meant to set up an rsync tree, but forgot. It's now done, and you can also pull it from: ozlabs.org::dtc/dtc.git -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. 
NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From arnd at arndb.de Sat Jun 18 00:31:18 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 17 Jun 2005 16:31:18 +0200 Subject: [PATCH] ppc64: ppc_md.progress should not be __init Message-ID: <200506171631.18520.arnd@arndb.de> I noticed on BPA that ppc_md.progress() is called from some places outside of the init call sequence, e.g. the rtas flash code. However, I copied the __init annotation from one of the other platform types, which caused the kernel to crash upon hitting the freed code segment. I have checked that iSeries, pmac and maple all have .progress marked as __init and can never incorrectly hit that function after boot, so there is no actual breakage in the current source. However, I find the behavior rather surprising and suggest removing this to have it the same way as it is required on pSeries and BPA. Signed-off-by: Arnd Bergmann --- linux-cg.orig/arch/ppc64/kernel/iSeries_setup.c 2005-06-06 15:08:43.000000000 -0400 +++ linux-cg/arch/ppc64/kernel/iSeries_setup.c 2005-06-17 16:25:14.050938184 -0400 @@ -811,7 +811,7 @@ static void __init iSeries_calibrate_dec setup_default_decr(); } -static void __init iSeries_progress(char * st, unsigned short code) +static void iSeries_progress(char * st, unsigned short code) { printk("Progress: [%04x] - %s\n", (unsigned)code, st); if (!piranha_simulator && mf_initialized) { --- linux-cg.orig/arch/ppc64/kernel/maple_setup.c 2005-05-11 11:32:10.000000000 -0400 +++ linux-cg/arch/ppc64/kernel/maple_setup.c 2005-06-17 16:25:21.348895104 -0400 @@ -197,7 +197,7 @@ static __init void maple_init_IRQ(void) DBG(" <- maple_init_IRQ\n"); } -static void __init maple_progress(char *s, unsigned short hex) +static void maple_progress(char *s, unsigned short hex) { printk("*** %04x : %s\n", hex, s ? 
s : ""); } --- linux-cg.orig/arch/ppc64/kernel/pmac_setup.c 2005-05-11 11:32:10.000000000 -0400 +++ linux-cg/arch/ppc64/kernel/pmac_setup.c 2005-06-17 16:25:31.320976880 -0400 @@ -419,7 +419,7 @@ static __init void pmac_init_IRQ(void) of_node_put(irqctrler2); } -static void __init pmac_progress(char *s, unsigned short hex) +static void pmac_progress(char *s, unsigned short hex) { if (sccdbg) { udbg_puts(s); From ananth at in.ibm.com Sat Jun 18 01:23:06 2005 From: ananth at in.ibm.com (Ananth N Mavinakayanahalli) Date: Fri, 17 Jun 2005 11:23:06 -0400 Subject: [PATCH] kprobes: fix single-step out of line - take2 Message-ID: <20050617152306.GA7913@in.ibm.com> Hi, Here is the second try to fix the single-step out of line issues for PPC64 now that the kernel has no-execute support. x86_64 also has no-execute support and kprobes on x86_64 solved this issue by allocating an executable kernel page and using it as a scratch area for instructions to be stepped out of line. Reuse that for PPC64 too. Thanks, Ananth Now that PPC64 has no-execute support, here is a second try to fix the single step out of line during kprobe execution. Kprobes on x86_64 already solved this problem by allocating an executable page and using it as the scratch area for stepping out of line. Reuse that. 
Patch against 2.6.12-rc6-git8 Signed-off-by: Ananth N Mavinakayanahalli arch/ppc64/kernel/kprobes.c | 26 ++++++++- arch/x86_64/kernel/kprobes.c | 112 ------------------------------------------- include/asm-ppc64/kprobes.h | 2 include/linux/kprobes.h | 2 kernel/kprobes.c | 101 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 126 insertions(+), 117 deletions(-) Index: linux-2.6.12-rc6/arch/ppc64/kernel/kprobes.c =================================================================== --- linux-2.6.12-rc6.orig/arch/ppc64/kernel/kprobes.c 2005-06-16 14:44:40.000000000 -0400 +++ linux-2.6.12-rc6/arch/ppc64/kernel/kprobes.c 2005-06-16 14:53:51.000000000 -0400 @@ -39,6 +39,8 @@ #define KPROBE_HIT_ACTIVE 0x00000001 #define KPROBE_HIT_SS 0x00000002 +static DECLARE_MUTEX(kprobe_mutex); + static struct kprobe *current_kprobe; static unsigned long kprobe_status, kprobe_saved_msr; static struct pt_regs jprobe_saved_regs; @@ -55,6 +57,15 @@ int arch_prepare_kprobe(struct kprobe *p printk("Cannot register a kprobe on rfid or mtmsrd\n"); ret = -EINVAL; } + + /* insn must be on a special executable page on ppc64 */ + if (!ret) { + up(&kprobe_mutex); + p->ainsn.insn = get_insn_slot(); + down(&kprobe_mutex); + if (!p->ainsn.insn) + ret = -ENOMEM; + } return ret; } @@ -65,6 +76,9 @@ void arch_copy_kprobe(struct kprobe *p) void arch_remove_kprobe(struct kprobe *p) { + up(&kprobe_mutex); + free_insn_slot(p->ainsn.insn); + down(&kprobe_mutex); } static inline void disarm_kprobe(struct kprobe *p, struct pt_regs *regs) @@ -75,12 +89,15 @@ static inline void disarm_kprobe(struct static inline void prepare_singlestep(struct kprobe *p, struct pt_regs *regs) { + kprobe_opcode_t insn = *p->ainsn.insn; + regs->msr |= MSR_SE; - /*single step inline if it a breakpoint instruction*/ - if (p->opcode == BREAKPOINT_INSTRUCTION) + + /* single step inline if it is a trap variant */ + if (IS_TW(insn) || IS_TD(insn) || IS_TWI(insn) || IS_TDI(insn)) regs->nip = (unsigned long)p->addr; else - regs->nip = 
(unsigned long)&p->ainsn.insn; + regs->nip = (unsigned long)p->ainsn.insn; } static inline int kprobe_handler(struct pt_regs *regs) @@ -172,9 +189,10 @@ no_kprobe: static void resume_execution(struct kprobe *p, struct pt_regs *regs) { int ret; + unsigned int insn = *p->ainsn.insn; regs->nip = (unsigned long)p->addr; - ret = emulate_step(regs, p->ainsn.insn[0]); + ret = emulate_step(regs, insn); if (ret == 0) regs->nip = (unsigned long)p->addr + 4; } Index: linux-2.6.12-rc6/arch/x86_64/kernel/kprobes.c =================================================================== --- linux-2.6.12-rc6.orig/arch/x86_64/kernel/kprobes.c 2005-06-06 11:22:29.000000000 -0400 +++ linux-2.6.12-rc6/arch/x86_64/kernel/kprobes.c 2005-06-16 14:58:38.000000000 -0400 @@ -36,7 +36,6 @@ #include #include #include -#include #include #include @@ -51,8 +50,6 @@ static struct kprobe *current_kprobe; static unsigned long kprobe_status, kprobe_old_rflags, kprobe_saved_rflags; static struct pt_regs jprobe_saved_regs; static long *jprobe_saved_rsp; -static kprobe_opcode_t *get_insn_slot(void); -static void free_insn_slot(kprobe_opcode_t *slot); void jprobe_return_end(void); /* copy of the kernel stack at the probe fire time */ @@ -527,112 +524,3 @@ int longjmp_break_handler(struct kprobe } return 0; } - -/* - * kprobe->ainsn.insn points to the copy of the instruction to be single-stepped. - * By default on x86_64, pages we get from kmalloc or vmalloc are not - * executable. Single-stepping an instruction on such a page yields an - * oops. So instead of storing the instruction copies in their respective - * kprobe objects, we allocate a page, map it executable, and store all the - * instruction copies there. (We can allocate additional pages if somebody - * inserts a huge number of probes.) Each page can hold up to INSNS_PER_PAGE - * instruction slots, each of which is MAX_INSN_SIZE*sizeof(kprobe_opcode_t) - * bytes. 
- */ -#define INSNS_PER_PAGE (PAGE_SIZE/(MAX_INSN_SIZE*sizeof(kprobe_opcode_t))) -struct kprobe_insn_page { - struct hlist_node hlist; - kprobe_opcode_t *insns; /* page of instruction slots */ - char slot_used[INSNS_PER_PAGE]; - int nused; -}; - -static struct hlist_head kprobe_insn_pages; - -/** - * get_insn_slot() - Find a slot on an executable page for an instruction. - * We allocate an executable page if there's no room on existing ones. - */ -static kprobe_opcode_t *get_insn_slot(void) -{ - struct kprobe_insn_page *kip; - struct hlist_node *pos; - - hlist_for_each(pos, &kprobe_insn_pages) { - kip = hlist_entry(pos, struct kprobe_insn_page, hlist); - if (kip->nused < INSNS_PER_PAGE) { - int i; - for (i = 0; i < INSNS_PER_PAGE; i++) { - if (!kip->slot_used[i]) { - kip->slot_used[i] = 1; - kip->nused++; - return kip->insns + (i*MAX_INSN_SIZE); - } - } - /* Surprise! No unused slots. Fix kip->nused. */ - kip->nused = INSNS_PER_PAGE; - } - } - - /* All out of space. Need to allocate a new page. Use slot 0.*/ - kip = kmalloc(sizeof(struct kprobe_insn_page), GFP_KERNEL); - if (!kip) { - return NULL; - } - - /* - * For the %rip-relative displacement fixups to be doable, we - * need our instruction copy to be within +/- 2GB of any data it - * might access via %rip. That is, within 2GB of where the - * kernel image and loaded module images reside. So we allocate - * a page in the module loading area. - */ - kip->insns = module_alloc(PAGE_SIZE); - if (!kip->insns) { - kfree(kip); - return NULL; - } - INIT_HLIST_NODE(&kip->hlist); - hlist_add_head(&kip->hlist, &kprobe_insn_pages); - memset(kip->slot_used, 0, INSNS_PER_PAGE); - kip->slot_used[0] = 1; - kip->nused = 1; - return kip->insns; -} - -/** - * free_insn_slot() - Free instruction slot obtained from get_insn_slot(). 
- */ -static void free_insn_slot(kprobe_opcode_t *slot) -{ - struct kprobe_insn_page *kip; - struct hlist_node *pos; - - hlist_for_each(pos, &kprobe_insn_pages) { - kip = hlist_entry(pos, struct kprobe_insn_page, hlist); - if (kip->insns <= slot - && slot < kip->insns+(INSNS_PER_PAGE*MAX_INSN_SIZE)) { - int i = (slot - kip->insns) / MAX_INSN_SIZE; - kip->slot_used[i] = 0; - kip->nused--; - if (kip->nused == 0) { - /* - * Page is no longer in use. Free it unless - * it's the last one. We keep the last one - * so as not to have to set it up again the - * next time somebody inserts a probe. - */ - hlist_del(&kip->hlist); - if (hlist_empty(&kprobe_insn_pages)) { - INIT_HLIST_NODE(&kip->hlist); - hlist_add_head(&kip->hlist, - &kprobe_insn_pages); - } else { - module_free(NULL, kip->insns); - kfree(kip); - } - } - return; - } - } -} Index: linux-2.6.12-rc6/include/linux/kprobes.h =================================================================== --- linux-2.6.12-rc6.orig/include/linux/kprobes.h 2005-06-06 11:22:29.000000000 -0400 +++ linux-2.6.12-rc6/include/linux/kprobes.h 2005-06-16 15:01:13.000000000 -0400 @@ -101,6 +101,8 @@ extern int arch_prepare_kprobe(struct kp extern void arch_copy_kprobe(struct kprobe *p); extern void arch_remove_kprobe(struct kprobe *p); extern void show_registers(struct pt_regs *regs); +extern kprobe_opcode_t *get_insn_slot(void); +extern void free_insn_slot(kprobe_opcode_t *slot); /* Get the kprobe at this addr (if any). 
Must have called lock_kprobes */ struct kprobe *get_kprobe(void *addr); Index: linux-2.6.12-rc6/kernel/kprobes.c =================================================================== --- linux-2.6.12-rc6.orig/kernel/kprobes.c 2005-06-06 11:22:29.000000000 -0400 +++ linux-2.6.12-rc6/kernel/kprobes.c 2005-06-17 08:50:47.000000000 -0400 @@ -33,6 +33,7 @@ #include #include #include +#include #include #include #include @@ -46,6 +47,106 @@ unsigned int kprobe_cpu = NR_CPUS; static DEFINE_SPINLOCK(kprobe_lock); static struct kprobe *curr_kprobe; +/* + * kprobe->ainsn.insn points to the copy of the instruction to be + * single-stepped. x86_64, POWER4 and above have no-exec support and + * stepping on the instruction on a vmalloced/kmalloced/data page + * is a recipe for disaster + */ +#define INSNS_PER_PAGE (PAGE_SIZE/(MAX_INSN_SIZE * sizeof(kprobe_opcode_t))) + +struct kprobe_insn_page { + struct hlist_node hlist; + kprobe_opcode_t *insns; /* Page of instruction slots */ + char slot_used[INSNS_PER_PAGE]; + int nused; +}; + +static struct hlist_head kprobe_insn_pages; + +/** + * get_insn_slot() - Find a slot on an executable page for an instruction. + * We allocate an executable page if there's no room on existing ones. + */ +kprobe_opcode_t *get_insn_slot(void) +{ + struct kprobe_insn_page *kip; + struct hlist_node *pos; + + hlist_for_each(pos, &kprobe_insn_pages) { + kip = hlist_entry(pos, struct kprobe_insn_page, hlist); + if (kip->nused < INSNS_PER_PAGE) { + int i; + for (i = 0; i < INSNS_PER_PAGE; i++) { + if (!kip->slot_used[i]) { + kip->slot_used[i] = 1; + kip->nused++; + return kip->insns + (i * MAX_INSN_SIZE); + } + } + /* Surprise! No unused slots. Fix kip->nused. */ + kip->nused = INSNS_PER_PAGE; + } + } + + /* All out of space. Need to allocate a new page. 
Use slot 0.*/ + kip = kmalloc(sizeof(struct kprobe_insn_page), GFP_KERNEL); + if (!kip) { + return NULL; + } + + /* + * Use module_alloc so this page is within +/- 2GB of where the + * kernel image and loaded module images reside. This is required + * so x86_64 can correctly handle the %rip-relative fixups. + */ + kip->insns = module_alloc(PAGE_SIZE); + if (!kip->insns) { + kfree(kip); + return NULL; + } + INIT_HLIST_NODE(&kip->hlist); + hlist_add_head(&kip->hlist, &kprobe_insn_pages); + memset(kip->slot_used, 0, INSNS_PER_PAGE); + kip->slot_used[0] = 1; + kip->nused = 1; + return kip->insns; +} + +void free_insn_slot(kprobe_opcode_t *slot) +{ + struct kprobe_insn_page *kip; + struct hlist_node *pos; + + hlist_for_each(pos, &kprobe_insn_pages) { + kip = hlist_entry(pos, struct kprobe_insn_page, hlist); + if (kip->insns <= slot && + slot < kip->insns + (INSNS_PER_PAGE * MAX_INSN_SIZE)) { + int i = (slot - kip->insns) / MAX_INSN_SIZE; + kip->slot_used[i] = 0; + kip->nused--; + if (kip->nused == 0) { + /* + * Page is no longer in use. Free it unless + * it's the last one. We keep the last one + * so as not to have to set it up again the + * next time somebody inserts a probe. 
+ */ + hlist_del(&kip->hlist); + if (hlist_empty(&kprobe_insn_pages)) { + INIT_HLIST_NODE(&kip->hlist); + hlist_add_head(&kip->hlist, + &kprobe_insn_pages); + } else { + module_free(NULL, kip->insns); + kfree(kip); + } + } + return; + } + } +} + /* Locks kprobe: irqs must be disabled */ void lock_kprobes(void) { Index: linux-2.6.12-rc6/include/asm-ppc64/kprobes.h =================================================================== --- linux-2.6.12-rc6.orig/include/asm-ppc64/kprobes.h 2005-06-17 09:32:24.000000000 -0400 +++ linux-2.6.12-rc6/include/asm-ppc64/kprobes.h 2005-06-17 09:32:49.000000000 -0400 @@ -45,7 +45,7 @@ typedef unsigned int kprobe_opcode_t; /* Architecture specific copy of original instruction */ struct arch_specific_insn { /* copy of original instruction */ - kprobe_opcode_t insn[MAX_INSN_SIZE]; + kprobe_opcode_t *insn; }; #ifdef CONFIG_KPROBES From johnrose at austin.ibm.com Sat Jun 18 07:59:19 2005 From: johnrose at austin.ibm.com (John Rose) Date: Fri, 17 Jun 2005 16:59:19 -0500 Subject: [PATCH] pSeries - read irqs dynamically Message-ID: <1119045559.12774.4.camel@sinatra.austin.ibm.com> For I/O DLPAR to work properly, the kernel needs to allow for dynamic assignment of the irq field of the pci_dev structure upon dynamic bus addition. This patch moves the assignment of that field from pSeries_final_fixup() to pcibios_fixup_bus(), which enables dynamic assignment for the children of a newly added bus. Currently, pci_devs receive their irq numbers in one of two ways. The irq line is either read at boot for all pci_devs, or read by the rpaphp module at slot enable time. The latter is no longer sufficient for DLPAR addition of slots that don't qualify as PCI-hotplug capable. This solution handles the cases of boot and dynamic add. Comments welcome. 
Thanks- John Signed-off-by: John Rose diff -puN arch/ppc64/kernel/pSeries_pci.c~irq_dev_setup arch/ppc64/kernel/pSeries_pci.c --- 2_6_linus/arch/ppc64/kernel/pSeries_pci.c~irq_dev_setup 2005-06-17 16:37:45.000000000 -0500 +++ 2_6_linus-johnrose/arch/ppc64/kernel/pSeries_pci.c 2005-06-17 16:37:45.000000000 -0500 @@ -48,8 +48,6 @@ static int write_pci_config; static int ibm_read_pci_config; static int ibm_write_pci_config; -static int s7a_workaround; - extern struct mpic *pSeries_mpic; static int config_access_valid(struct device_node *dn, int where) @@ -227,6 +225,39 @@ static void python_countermeasures(struc iounmap(chip_regs); } +static int is_model_s7a(void) +{ + struct device_node *root; + char *model; + int rc = 0; + + root = of_find_node_by_path("/"); + if (root) { + model = get_property(root, "model", NULL); + if (model && !strcmp(model, "IBM,7013-S7A")) + rc = 1; + of_node_put(root); + } + + return rc; +} + +void __devinit pSeries_irq_bus_setup(struct pci_bus *bus) +{ + struct pci_dev *dev; + + list_for_each_entry(dev, &bus->devices, bus_list) { + pci_read_irq_line(dev); + if (is_model_s7a()) { + if (dev->irq > 16) { + dev->irq -= 3; + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, + dev->irq); + } + } + } +} + void __init init_pci_config_tokens (void) { read_pci_config = rtas_token("read-pci-config"); @@ -414,6 +445,7 @@ unsigned long __init find_and_init_phbs( if (prop) pci_assign_all_buses = *prop; } + ppc_md.irq_bus_setup = pSeries_irq_bus_setup; return 0; } @@ -474,20 +506,6 @@ void pcibios_name_device(struct pci_dev DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pcibios_name_device); #endif -static void check_s7a(void) -{ - struct device_node *root; - char *model; - - root = of_find_node_by_path("/"); - if (root) { - model = get_property(root, "model", NULL); - if (model && !strcmp(model, "IBM,7013-S7A")) - s7a_workaround = 1; - of_node_put(root); - } -} - /* RPA-specific bits for removing PHBs */ int pcibios_remove_root_bus(struct pci_controller 
*phb) { @@ -553,20 +571,6 @@ static void __init pSeries_request_regio void __init pSeries_final_fixup(void) { - struct pci_dev *dev = NULL; - - check_s7a(); - - for_each_pci_dev(dev) { - pci_read_irq_line(dev); - if (s7a_workaround) { - if (dev->irq > 16) { - dev->irq -= 3; - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, dev->irq); - } - } - } - phbs_remap_io(); pSeries_request_regions(); diff -puN arch/ppc64/kernel/pci.c~irq_dev_setup arch/ppc64/kernel/pci.c --- 2_6_linus/arch/ppc64/kernel/pci.c~irq_dev_setup 2005-06-17 16:37:45.000000000 -0500 +++ 2_6_linus-johnrose/arch/ppc64/kernel/pci.c 2005-06-17 16:37:45.000000000 -0500 @@ -902,6 +902,9 @@ void __devinit pcibios_fixup_bus(struct list_for_each_entry(dev, &bus->devices, bus_list) ppc_md.iommu_dev_setup(dev); + if (ppc_md.irq_bus_setup) + ppc_md.irq_bus_setup(bus); + if (!pci_probe_only) return; diff -puN include/asm-ppc64/machdep.h~irq_dev_setup include/asm-ppc64/machdep.h --- 2_6_linus/include/asm-ppc64/machdep.h~irq_dev_setup 2005-06-17 16:37:45.000000000 -0500 +++ 2_6_linus-johnrose/include/asm-ppc64/machdep.h 2005-06-17 16:37:45.000000000 -0500 @@ -76,6 +76,7 @@ struct machdep_calls { void (*tce_flush)(struct iommu_table *tbl); void (*iommu_dev_setup)(struct pci_dev *dev); void (*iommu_bus_setup)(struct pci_bus *bus); + void (*irq_bus_setup)(struct pci_bus *bus); int (*probe)(int platform); void (*setup_arch)(void); _ From miltonm at bga.com Sat Jun 18 12:38:12 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 17 Jun 2005 21:38:12 -0500 Subject: [PATCH] pSeries - read irqs dynamically Message-ID: <2bf73ba753db91d84821014999f3dff1@bga.com> Surely the model won't be changing. How about leaving the s7a check to occur once at boot? 
milton > > -static int s7a_workaround; > - > extern struct mpic *pSeries_mpic; > > static int config_access_valid(struct device_node *dn, int where) > @@ -227,6 +225,39 @@ static void python_countermeasures(struc > iounmap(chip_regs); > } > > +static int is_model_s7a(void) > +{ > + struct device_node *root; > + char *model; > + int rc = 0; > + > + root = of_find_node_by_path("/"); > + if (root) { > + model = get_property(root, "model", NULL); > + if (model && !strcmp(model, "IBM,7013-S7A")) > + rc = 1; > + of_node_put(root); > + } > + > + return rc; > +} > + From paulus at samba.org Mon Jun 20 21:48:39 2005 From: paulus at samba.org (Paul Mackerras) Date: Mon, 20 Jun 2005 21:48:39 +1000 Subject: [PATCH] ppc64: ppc_md.progress should not be __init In-Reply-To: <200506171631.18520.arnd@arndb.de> References: <200506171631.18520.arnd@arndb.de> Message-ID: <17078.44311.795882.147213@cargo.ozlabs.ibm.com> Arnd Bergmann writes: > I noticed on BPA that ppc_md.progress() is called from some places outside > of the init call sequence, e.g. the rtas flash code. However, I copied the > __init annotation from one of the other platform types, which caused the > kernel to crash upon hitting the freed code segment. > > I have checked that iSeries, pmac and maple all have .progress marked as > __init and can never incorrectly hit that function after boot, so there > is no actual breakage in the current source. > > However, I find the behavior rather surprising and suggest removing this > to have it the same way as it is required on pSeries and BPA. What I did in the ppc32 code was to clear the ppc_md.progress pointer when freeing the init memory. I think that's a simpler and more reliable fix for this problem. If the rtas flash code wants to change the LCD display it could call the pSeries progress routine directly. Regards, Paul. 
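
The approach described above, clearing the hook when the init memory is freed, can be sketched in user space roughly as follows. This is an illustration, not the ppc32 code: `md_model`, `boot_progress` and the `model_*` functions are hypothetical names, and the sketch assumes every caller tests the pointer before calling through it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the machine-description hook table. */
struct machdep_model {
	void (*progress)(const char *s, unsigned short hex);
};

static struct machdep_model md_model;
static int progress_calls;              /* how often the hook actually ran */

/* Plays the role of an init-only progress routine. */
static void boot_progress(const char *s, unsigned short hex)
{
	assert(s != NULL);
	progress_calls++;
	printf("Progress: [%04x] - %s\n", (unsigned)hex, s);
}

/* Boot-time setup installs the hook. */
static void model_setup_arch(void)
{
	md_model.progress = boot_progress;
}

/*
 * Just before init text is discarded, drop the pointer so nobody can
 * call into memory that is about to be freed.
 */
static void model_free_initmem(void)
{
	md_model.progress = NULL;
}

/* Callers check the hook first, so a cleared hook is a silent no-op. */
static void model_report(const char *s, unsigned short hex)
{
	if (md_model.progress)
		md_model.progress(s, hex);
}
```

With this arrangement a late caller such as the flash code simply gets no progress output after boot instead of jumping into freed text; a platform that wants output after boot would need to call a routine that is not init-only.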
From johnrose at austin.ibm.com Tue Jun 21 02:00:38 2005 From: johnrose at austin.ibm.com (John Rose) Date: Mon, 20 Jun 2005 11:00:38 -0500 Subject: [PATCH] pSeries - read irqs dynamically In-Reply-To: <2bf73ba753db91d84821014999f3dff1@bga.com> References: <2bf73ba753db91d84821014999f3dff1@bga.com> Message-ID: <1119283238.2859.14.camel@sinatra.austin.ibm.com> On Fri, 2005-06-17 at 21:38, Milton Miller wrote: > Surely the model won't be changing. How about leaving the s7a check > to occur once at boot? Okay, how does this look? Signed-off-by: John Rose diff -puN arch/ppc64/kernel/pSeries_pci.c~irq_dev_setup arch/ppc64/kernel/pSeries_pci.c --- 2_6_linus/arch/ppc64/kernel/pSeries_pci.c~irq_dev_setup 2005-06-20 10:59:12.000000000 -0500 +++ 2_6_linus-johnrose/arch/ppc64/kernel/pSeries_pci.c 2005-06-20 10:59:12.000000000 -0500 @@ -227,6 +227,36 @@ static void python_countermeasures(struc iounmap(chip_regs); } +static void check_s7a(void) +{ + struct device_node *root; + char *model; + + root = of_find_node_by_path("/"); + if (root) { + model = get_property(root, "model", NULL); + if (model && !strcmp(model, "IBM,7013-S7A")) + s7a_workaround = 1; + of_node_put(root); + } +} + +void __devinit pSeries_irq_bus_setup(struct pci_bus *bus) +{ + struct pci_dev *dev; + + list_for_each_entry(dev, &bus->devices, bus_list) { + pci_read_irq_line(dev); + if (s7a_workaround) { + if (dev->irq > 16) { + dev->irq -= 3; + pci_write_config_byte(dev, PCI_INTERRUPT_LINE, + dev->irq); + } + } + } +} + void __init init_pci_config_tokens (void) { read_pci_config = rtas_token("read-pci-config"); @@ -414,6 +444,8 @@ unsigned long __init find_and_init_phbs( if (prop) pci_assign_all_buses = *prop; } + ppc_md.irq_bus_setup = pSeries_irq_bus_setup; + check_s7a(); return 0; } @@ -474,20 +506,6 @@ void pcibios_name_device(struct pci_dev DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pcibios_name_device); #endif -static void check_s7a(void) -{ - struct device_node *root; - char *model; - - root = 
of_find_node_by_path("/"); - if (root) { - model = get_property(root, "model", NULL); - if (model && !strcmp(model, "IBM,7013-S7A")) - s7a_workaround = 1; - of_node_put(root); - } -} - /* RPA-specific bits for removing PHBs */ int pcibios_remove_root_bus(struct pci_controller *phb) { @@ -553,20 +571,6 @@ static void __init pSeries_request_regio void __init pSeries_final_fixup(void) { - struct pci_dev *dev = NULL; - - check_s7a(); - - for_each_pci_dev(dev) { - pci_read_irq_line(dev); - if (s7a_workaround) { - if (dev->irq > 16) { - dev->irq -= 3; - pci_write_config_byte(dev, PCI_INTERRUPT_LINE, dev->irq); - } - } - } - phbs_remap_io(); pSeries_request_regions(); diff -puN arch/ppc64/kernel/pci.c~irq_dev_setup arch/ppc64/kernel/pci.c --- 2_6_linus/arch/ppc64/kernel/pci.c~irq_dev_setup 2005-06-20 10:59:12.000000000 -0500 +++ 2_6_linus-johnrose/arch/ppc64/kernel/pci.c 2005-06-20 10:59:12.000000000 -0500 @@ -902,6 +902,9 @@ void __devinit pcibios_fixup_bus(struct list_for_each_entry(dev, &bus->devices, bus_list) ppc_md.iommu_dev_setup(dev); + if (ppc_md.irq_bus_setup) + ppc_md.irq_bus_setup(bus); + if (!pci_probe_only) return; diff -puN include/asm-ppc64/machdep.h~irq_dev_setup include/asm-ppc64/machdep.h --- 2_6_linus/include/asm-ppc64/machdep.h~irq_dev_setup 2005-06-20 10:59:12.000000000 -0500 +++ 2_6_linus-johnrose/include/asm-ppc64/machdep.h 2005-06-20 10:59:12.000000000 -0500 @@ -76,6 +76,7 @@ struct machdep_calls { void (*tce_flush)(struct iommu_table *tbl); void (*iommu_dev_setup)(struct pci_dev *dev); void (*iommu_bus_setup)(struct pci_bus *bus); + void (*irq_bus_setup)(struct pci_bus *bus); int (*probe)(int platform); void (*setup_arch)(void); _ From linas at austin.ibm.com Tue Jun 21 03:43:29 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Mon, 20 Jun 2005 12:43:29 -0500 Subject: Linux on Power Wiki Message-ID: <20050620174329.GE16457@austin.ibm.com> I just found out about the Linux-on-Power Wiki; it has a very nice Wikipedia-like interface. 
I thought I'd announce it here, and encourage all developers to take a look and, of course, hack on it as needed. Based on the incredible success of Wikipedia, I suspect that this may in fact be the best possible way to organize and present information about Linux on Power. Although the title page does say "IBM Power", I'm thinking that it may be possible to change this policy to include e.g. Freescale embedded info, as well as Linux-on-Apple systems. Maybe. See http://oss.gonicus.de/openpower/index.php/Main_Page --linas From arnd at arndb.de Tue Jun 21 03:55:29 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 20 Jun 2005 19:55:29 +0200 Subject: [PATCH] ppc64: ppc_md.progress should not be __init In-Reply-To: <17078.44311.795882.147213@cargo.ozlabs.ibm.com> References: <200506171631.18520.arnd@arndb.de> <17078.44311.795882.147213@cargo.ozlabs.ibm.com> Message-ID: <200506201955.29850.arnd@arndb.de> On Monday 20 June 2005 13:48, Paul Mackerras wrote: > What I did in the ppc32 code was to clear the ppc_md.progress pointer > when freeing the init memory. I think that's a simpler and more > reliable fix for this problem. If the rtas flash code wants to change > the LCD display it could call the pSeries progress routine directly. Hmm, on BPA we don't have the pSeries progress routine, because the firmware has no sensible way of implementing the rtas call, but we want to use the same flash code. I can of course make a patch that moves pSeries_progress() from pSeries_setup.c to rtas_progress() in rtas.c and use that only for pSeries. AFAICS, the function doesn't do anything on SLOF, but is not harmful either. Arnd <>< From anton at samba.org Tue Jun 21 07:00:05 2005 From: anton at samba.org (Anton Blanchard) Date: Tue, 21 Jun 2005 07:00:05 +1000 Subject: [PATCH] ppc64: Mark kernel hptes dirty Message-ID: <20050620210005.GA5805@krispykreme> Hi, We don't use the hardware referenced and changed bits, and setting them early avoids a store to memory.
We already do this for userspace hptes but not kernel ones. Do it. Signed-off-by: Anton Blanchard diff -puN arch/ppc64/mm/hash_utils.c~set_kernel_rcbits_1 arch/ppc64/mm/hash_utils.c --- foobar2/arch/ppc64/mm/hash_utils.c~set_kernel_rcbits_1 2005-05-17 02:42:01.379761773 -0500 +++ foobar2-anton/arch/ppc64/mm/hash_utils.c 2005-05-17 02:42:01.389760188 -0500 @@ -195,7 +195,7 @@ void __init htab_initialize(void) memset((void *)table, 0, htab_size_bytes); } - mode_rw = _PAGE_ACCESSED | _PAGE_COHERENT | PP_RWXX; + mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX; /* On U3 based machines, we need to reserve the DART area and * _NOT_ map it to avoid cache paradoxes as it's remapped non _ From becky.bruce at freescale.com Wed Jun 22 05:02:11 2005 From: becky.bruce at freescale.com (Becky Bruce) Date: Tue, 21 Jun 2005 14:02:11 -0500 Subject: Proposal for reorg of kernel directory Message-ID: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> All, First off, apologies for a lengthy email.... We've recently begun work on a port of the 64-bit kernel to a Freescale part, and noticed that all the platform-specific code is currently in the kernel directory. With new 64-bit parts in the works, we expect the number of supported platforms to increase significantly and to include more embedded systems. To prevent bloating the kernel directory, I'd like to propose a reorganization of the ppc64 tree to look more like the ppc tree. This includes the creation of "platforms" and "syslib" directories that would contain platform-specific code and non-platform-specific system code, respectively. Using the Kconfigs and Makefiles, I've taken a whack at identifying files that appear to be platform-specific, and those that appear to be generic. Since I'm not an expert on the existing platforms, it's highly possible that I've gotten some of this wrong. The proposed directory structure is listed below. 
Kernel directory: -------------------------- align.c asm-offsets.c binfmt_elf32.c bitops.c cpu_setup_power4.S cputable.c dma.c entry.S head.S idle.c idle_power4.S init_task.c ioctl32.c iomap.c iommu.c irq.c kprobes.c lmb.c misc.S module.c pacaData.c pci.c pci.h ppc_ksyms.c process.c proc_ppc64.c ptrace32.c ptrace.c semaphore.c setup.c signal32.c signal.c smp.c smp-tbsync.c syscalls.c sysfs.c sys_ppc32.c time.c traps.c vecemu.c vector.S vmlinux.lds.S Should these be in kernel or syslib or somewhere else altogether? -------------------------------------------------------------------- lparcfg.c pci_iommu.c pci_direct_iommu.c pci_dn.c This next section lists out the files that appear to be specific to a particular platform. I've separated them by platform for the sake of ease of parsing this list, but I would imagine that these would all end up in the high level "platforms" directory: iSeries: ---------------------- HvCall.c hvCall.S HvLpConfig.c HvLpEvent.c iSeries_htab.c iSeries_iommu.c iSeries_irq.c iSeries_pci.c iSeries_pci_reset.c iSeries_proc.c iSeries_setup.c iSeries_setup.h iSeries_smp.c iSeries_VpdInfo.c ItLpQueue.c LparData.c mf.c XmPciLpEvent.c pSeries: --------------------- eeh.c hvconsole.c hvcserver.c pSeries_hvCall.S pSeries_iommu.c pSeries_lpar.c pSeries_nvram.c pSeries_pci.c pSeries_setup.c pSeries_smp.c ras.c rtas.c rtasd.c rtas_flash.c rtas-proc.c scanlog.c xics.c pmac: ------------------ pmac_feature.c pmac.h pmac_low_i2c.c pmac_nvram.c pmac_pci.c pmac_setup.c pmac_smp.c pmac_time.c maple: -------------------- maple_pci.c maple_setup.c maple_time.c And the files that I believe should go into syslib: syslib/ ----------- btext.c i8259.c i8259.h mpic.c mpic.h nvram.c of_device.c prom.c prom_init.c rtc.c u3_iommu.c udbg.c vio.c viopath.c (? - not clear on this one) Thoughts? 
-Becky -- Becky Bruce PowerPC Software Developer Freescale Semiconductor, Austin, TX From arnd at arndb.de Wed Jun 22 05:25:03 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 21:25:03 +0200 Subject: Proposal for reorg of kernel directory In-Reply-To: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> Message-ID: <200506212125.04138.arnd@arndb.de> On Tuesday 21 June 2005 21:02, Becky Bruce wrote: > We've recently begun work on a port of the 64-bit kernel to a > Freescale part, and noticed that all the platform-specific code is > currently in the kernel directory. With new 64-bit parts in the > works, we expect the number of supported platforms to increase > significantly and to include more embedded systems. Hmm, at least I'd hope not to need a new platform type for every piece of hardware, so there would not be too many of these. I would like to see platform types like 'everything with 64 bit Freescale CPUs running on SLOF' and maybe another platform type for the same CPU with a flat device tree if that differs a lot. > Kernel directory: > -------------------------- > binfmt_elf32.c > ioctl32.c > ptrace32.c > signal32.c > sys_ppc32.c If you start creating new subdirectories, these would be a natural choice for yet another directory. E.g. ia64 and x86_64 do that as well. > This next section lists out the files that appear to be specific > to a particular platform. I've separated them by platform for the > sake of ease of parsing this list, but I would imagine that these > would all end up in the high level "platforms" directory: > > > pSeries: > --------------------- > rtas.c > rtas_flash.c > rtas-proc.c > rtasd.c > scanlog.c Most of the rtas stuff is not really pSeries specific, so it should either stay in kernel/ or go to a subdirectory of it if you insist.
The BPA patches that I'm currently making also move some code around to have a better distinction between pSeries specific code and generic rtas code that is also used by BPA and possibly other platforms using SLOF. > And the files that I believe should go into syslib: > > syslib/ > ----------- > btext.c > i8259.c > i8259.h > mpic.c > mpic.h > nvram.c > of_device.c > prom.c > prom_init.c > rtc.c > u3_iommu.c > udbg.c > vio.c > viopath.c (? - not clear on this one) I don't really see the point in this directory. Why not just leave these files in kernel/? Arnd <>< From linas at austin.ibm.com Wed Jun 22 05:52:55 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 21 Jun 2005 14:52:55 -0500 Subject: Proposal for reorg of kernel directory In-Reply-To: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> Message-ID: <20050621195254.GA9995@austin.ibm.com> On Tue, Jun 21, 2005 at 02:02:11PM -0500, Becky Bruce was heard to remark: > > Kernel directory: [...] > syslib/ What's the conceptual difference between kernel and syslib? > u3_iommu.c Belongs in the pmac directory, I believe. --linas From arnd at arndb.de Wed Jun 22 06:02:21 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 22:02:21 +0200 Subject: Proposal for reorg of kernel directory In-Reply-To: <20050621195254.GA9995@austin.ibm.com> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <20050621195254.GA9995@austin.ibm.com> Message-ID: <200506212202.21819.arnd@arndb.de> On Tuesday 21 June 2005 21:52, Linas Vepstas wrote: > > u3_iommu.c > > Belongs in the pmac directory, I believe. No, maple needs this as well, including JS20 if you run SLOF.
Arnd <>< From rusty.lynch at intel.com Wed Jun 22 06:53:43 2005 From: rusty.lynch at intel.com (Rusty Lynch) Date: Tue, 21 Jun 2005 13:53:43 -0700 Subject: [patch 0/5] Return probe redesign: overall description Message-ID: <20050621205343.548977000@linux.jf.intel.com> From my experiences with adding return probes to x86_64 and ia64, and the feedback on LKML to those patches, I think we can simplify the design for return probes. The following patchset for 2.6.12-mm1 tweaks the original design such that: * Instead of storing the stack address in the return probe instance, the task pointer is stored. This gives us all we need in order to: - find the correct return probe instance when we enter the trampoline (even if we are recursing) - find all left-over return probe instances when the task is going away This has the side effect of simplifying the implementation: more work can be done in kernel/kprobes.c because architecture specific knowledge of the stack layout is no longer required. Specifically, we no longer have: - arch_get_kprobe_task() - arch_kprobe_flush_task() - get_rp_inst_tsk() - get_rp_inst() - trampoline_post_handler() * Instead of splitting the return probe handling and cleanup logic across the pre and post trampoline handlers, all the work is pushed into the pre function (trampoline_probe_handler), and then we skip single stepping the original function. In this case the original instruction to be single stepped was just a NOP, and we can do without the extra interruption. The new flow of events to having a return probe handler execute when a target function exits is: * At system initialization time, a kprobe is inserted at the beginning of kretprobe_trampoline. kernel/kprobes.c used to handle this on its own, but ia64 needed to do this a little differently (i.e.
a function pointer is really a pointer to a structure containing the instruction pointer and a global pointer), so I added the notion of arch_init(), so that kernel/kprobes.c:init_kprobes() now allows architecture specific initialization by calling arch_init() before exiting. Each architecture now registers a kprobe on its own trampoline function. * register_kretprobe() will insert a kprobe at the beginning of the targeted function with the kprobe pre_handler set to arch_prepare_kretprobe (still no change) * When the target function is entered, the kprobe is fired, calling arch_prepare_kretprobe (still no change) * In arch_prepare_kretprobe() we try to get a free instance and if one is available then we fill out the instance with a pointer to the return probe, the original return address, and a pointer to the task structure (instead of the stack address.) Just like before we change the return address to the trampoline function and mark the instance as used. If multiple return probes are registered for a given target function, then arch_prepare_kretprobe() will get called multiple times for the same task (since our kprobe implementation is able to handle multiple kprobes at the same address.) Past the first call to arch_prepare_kretprobe, we end up with the original address stored in the return probe instance pointing to our trampoline function. (This is a significant difference from the original arch_prepare_kretprobe design.) * Target function executes like normal and then returns to kretprobe_trampoline. * kprobe inserted on the first instruction of kretprobe_trampoline is fired and calls trampoline_probe_handler() (no change here) * trampoline_probe_handler() consumes each of the instances associated with the current task by calling the registered handler function and marking the instance as unused until an instance is found that has a return address different than the trampoline function.
(change similar to my previous ia64 RFC) * If the task is killed with some left-over return probe instances (meaning that a target function was entered, but never returned), then we just free any instances associated with the task. (Not much different other than that we can handle this without calling architecture specific functions.) There is a known problem that this patch does not yet solve where registering a return probe on flush_old_exec or flush_thread will put us in a bad state. Most likely the best way to handle this is to not allow registering return probes on these two functions. (Significant change) This patch series applies to the 2.6.12-rc6-mm1 kernel, and provides: * kernel/kprobes.c changes * i386 patch of existing return probes implementation * x86_64 patch of existing return probe implementation * ia64 implementation * ppc64 implementation --rusty From rusty.lynch at intel.com Wed Jun 22 06:53:46 2005 From: rusty.lynch at intel.com (Rusty Lynch) Date: Tue, 21 Jun 2005 13:53:46 -0700 Subject: [patch 3/5] Return probe redesign: x86_64 specific changes References: <20050621205343.548977000@linux.jf.intel.com> Message-ID: <20050621205406.001318000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed...
Name: kprobes-return-probes-redux-ia64.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050621/f2366164/attachment.txt From rusty.lynch at intel.com Wed Jun 22 06:53:45 2005 From: rusty.lynch at intel.com (Rusty Lynch) Date: Tue, 21 Jun 2005 13:53:45 -0700 Subject: [patch 2/5] Return probe redesign: i386 specific changes References: <20050621205343.548977000@linux.jf.intel.com> Message-ID: <20050621205405.455437000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... Name: kprobes-return-probes-redux-i386.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050621/1a911ff5/attachment.txt From rusty.lynch at intel.com Wed Jun 22 06:53:48 2005 From: rusty.lynch at intel.com (Rusty Lynch) Date: Tue, 21 Jun 2005 13:53:48 -0700 Subject: [patch 5/5] Return probe redesign: ppc64 specific implementation References: <20050621205343.548977000@linux.jf.intel.com> Message-ID: <20050621205407.571189000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... Name: kprobes-return-probes-redux-ppc64.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050621/e82be8e1/attachment.txt From rusty.lynch at intel.com Wed Jun 22 06:53:44 2005 From: rusty.lynch at intel.com (Rusty Lynch) Date: Tue, 21 Jun 2005 13:53:44 -0700 Subject: [patch 1/5] Return probe redesign: architecture independant changes References: <20050621205343.548977000@linux.jf.intel.com> Message-ID: <20050621205404.856940000@linux.jf.intel.com> An embedded and charset-unspecified text was scrubbed... 
Name: kprobes-return-probes-redux-base.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050621/d0b4346a/attachment.txt From arnd at arndb.de Wed Jun 22 07:13:11 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:13:11 +0200 Subject: [PATCH 2/11] ppc64: rename pSeries rtc functions into rtas_* In-Reply-To: <200506212311.36010.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212311.36010.arnd@arndb.de> Message-ID: <200506212313.12090.arnd@arndb.de> The rtc rtas functions are not pSeries specific but can also be used by BPA and other SLOF based platforms Signed-off-by: Arnd Bergmann -- arch/ppc64/kernel/pSeries_setup.c | 9 +++------ arch/ppc64/kernel/rtc.c | 6 +++--- include/asm-ppc64/rtas.h | 5 +++++ 3 files changed, 11 insertions(+), 9 deletions(-) --- linux-cg.orig/arch/ppc64/kernel/pSeries_setup.c 2005-06-21 03:15:26.961012552 -0400 +++ linux-cg/arch/ppc64/kernel/pSeries_setup.c 2005-06-21 03:15:27.004006016 -0400 @@ -73,9 +73,6 @@ extern void pSeries_final_fixup(void); -extern void pSeries_get_boot_time(struct rtc_time *rtc_time); -extern void pSeries_get_rtc_time(struct rtc_time *rtc_time); -extern int pSeries_set_rtc_time(struct rtc_time *rtc_time); extern void find_udbg_vterm(void); extern void system_reset_fwnmi(void); /* from head.S */ extern void machine_check_fwnmi(void); /* from head.S */ @@ -534,9 +531,9 @@ struct machdep_calls __initdata pSeries_ .halt = rtas_halt, .panic = rtas_os_term, .cpu_die = pSeries_mach_cpu_die, - .get_boot_time = pSeries_get_boot_time, - .get_rtc_time = pSeries_get_rtc_time, - .set_rtc_time = pSeries_set_rtc_time, + .get_boot_time = rtas_get_boot_time, + .get_rtc_time = rtas_get_rtc_time, + .set_rtc_time = rtas_set_rtc_time, .calibrate_decr = generic_calibrate_decr, .progress = pSeries_progress, .check_legacy_ioport = pSeries_check_legacy_ioport, --- linux-cg.orig/arch/ppc64/kernel/rtc.c 2005-06-21 03:15:21.762997888 -0400 +++ linux-cg/arch/ppc64/kernel/rtc.c 2005-06-21 
03:15:27.005005864 -0400 @@ -303,7 +303,7 @@ void iSeries_get_boot_time(struct rtc_ti #ifdef CONFIG_PPC_RTAS #define MAX_RTC_WAIT 5000 /* 5 sec */ #define RTAS_CLOCK_BUSY (-2) -void pSeries_get_boot_time(struct rtc_time *rtc_tm) +void rtas_get_boot_time(struct rtc_time *rtc_tm) { int ret[8]; int error, wait_time; @@ -338,7 +338,7 @@ void pSeries_get_boot_time(struct rtc_ti * and if a delay is needed to read the clock. In this case we just * silently return without updating rtc_tm. */ -void pSeries_get_rtc_time(struct rtc_time *rtc_tm) +void rtas_get_rtc_time(struct rtc_time *rtc_tm) { int ret[8]; int error, wait_time; @@ -373,7 +373,7 @@ void pSeries_get_rtc_time(struct rtc_tim rtc_tm->tm_year = ret[0] - 1900; } -int pSeries_set_rtc_time(struct rtc_time *tm) +int rtas_set_rtc_time(struct rtc_time *tm) { int error, wait_time; unsigned long max_wait_tb; --- linux-cg.orig/include/asm-ppc64/rtas.h 2005-06-21 03:15:24.090910336 -0400 +++ linux-cg/include/asm-ppc64/rtas.h 2005-06-21 03:15:44.352891944 -0400 @@ -188,6 +188,11 @@ extern int rtas_set_power_level(int powe extern int rtas_set_indicator(int indicator, int index, int new_value); extern void rtas_initialize(void); +struct rtc_time; +extern void rtas_get_boot_time(struct rtc_time *rtc_time); +extern void rtas_get_rtc_time(struct rtc_time *rtc_time); +extern int rtas_set_rtc_time(struct rtc_time *rtc_time); + /* Given an RTAS status code of 9900..9905 compute the hinted delay */ unsigned int rtas_extended_busy_delay_time(int status); static inline int rtas_is_extended_busy(int status) From arnd at arndb.de Wed Jun 22 07:10:53 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:10:53 +0200 Subject: [PATCH 0/11] ppc64: Introduce Cell/BPA platform, v3 Message-ID: <200506212310.54156.arnd@arndb.de> This series of patches add support for a fifth platform type in the ppc64 architecture tree. 
The Broadband Processor Architecture (BPA) is what machines using the Cell processor should be following and currently only prototype hardware exists for it. Most of the functionality is the same as in the previous version. The main updates are: - Fixes for the comments I got - Added more patches for moving rtas related stuff around from pSeries, so we can use it from BPA as well - Smaller bug fixes - Lots of changes on the SPU file system (see the patch comments) One thing that has happened is that the Cell Processor Based Blade has now been shown on E3 and the Power.org press summit and will also be on Linuxtag, so you can now see what kind of hardware this runs on. This series does not include the libspu files, as we are doing some changes to the library right now. I'm also not including the driver for our network driver yet. It's working well, but I'm waiting for a cleanup patch and plan to submit it after Linuxtag. Please forward these patches for inclusion in 2.6.13 if you are happy with them. The spufs code is still not ready for inclusion, but it could start a life in -mm to get a broader review at this point. Arnd <>< From arnd at arndb.de Wed Jun 22 07:18:15 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:18:15 +0200 Subject: [PATCH 4/11] ppc64: pSeries_progress -> rtas_progress In-Reply-To: <200506212317.13467.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212313.12090.arnd@arndb.de> <200506212317.13467.arnd@arndb.de> Message-ID: <200506212318.16573.arnd@arndb.de> The pSeries_progress function is called from some places in the rtas code, which may also be used by non-pSeries platforms. Though pSeries is currently the only platform type that implements display-character, the code is actually generic enough to be part of the rtas subsystem. I hit a bug here because the generic rtas code tried calling ppc_md.progress, which points to an __init function on most platforms. 
We could also clear the ppc_md.progress pointer when freeing the init memory to make it more explicit that ppc_md.progress must not be called after bootup. Signed-off-by: Arnd Bergmann --- arch/ppc64/kernel/pSeries_setup.c | 103 ------------------------------------ arch/ppc64/kernel/rtas.c | 106 +++++++++++++++++++++++++++++++++++++- include/asm-ppc64/rtas.h | 1 3 files changed, 106 insertions(+), 104 deletions(-) --- linux-cg.orig/arch/ppc64/kernel/pSeries_setup.c 2005-06-21 03:22:26.797955872 -0400 +++ linux-cg/arch/ppc64/kernel/pSeries_setup.c 2005-06-21 03:25:16.110912400 -0400 @@ -375,107 +375,6 @@ static void __init pSeries_init_early(vo } -static void pSeries_progress(char *s, unsigned short hex) -{ - struct device_node *root; - int width, *p; - char *os; - static int display_character, set_indicator; - static int max_width; - static DEFINE_SPINLOCK(progress_lock); - static int pending_newline = 0; /* did last write end with unprinted newline? */ - - if (!rtas.base) - return; - - if (max_width == 0) { - if ((root = find_path_device("/rtas")) && - (p = (unsigned int *)get_property(root, - "ibm,display-line-length", - NULL))) - max_width = *p; - else - max_width = 0x10; - display_character = rtas_token("display-character"); - set_indicator = rtas_token("set-indicator"); - } - - if (display_character == RTAS_UNKNOWN_SERVICE) { - /* use hex display if available */ - if (set_indicator != RTAS_UNKNOWN_SERVICE) - rtas_call(set_indicator, 3, 1, NULL, 6, 0, hex); - return; - } - - spin_lock(&progress_lock); - - /* - * Last write ended with newline, but we didn't print it since - * it would just clear the bottom line of output. Print it now - * instead. - * - * If no newline is pending, print a CR to start output at the - * beginning of the line. 
- */ - if (pending_newline) { - rtas_call(display_character, 1, 1, NULL, '\r'); - rtas_call(display_character, 1, 1, NULL, '\n'); - pending_newline = 0; - } else { - rtas_call(display_character, 1, 1, NULL, '\r'); - } - - width = max_width; - os = s; - while (*os) { - if (*os == '\n' || *os == '\r') { - /* Blank to end of line. */ - while (width-- > 0) - rtas_call(display_character, 1, 1, NULL, ' '); - - /* If newline is the last character, save it - * until next call to avoid bumping up the - * display output. - */ - if (*os == '\n' && !os[1]) { - pending_newline = 1; - spin_unlock(&progress_lock); - return; - } - - /* RTAS wants CR-LF, not just LF */ - - if (*os == '\n') { - rtas_call(display_character, 1, 1, NULL, '\r'); - rtas_call(display_character, 1, 1, NULL, '\n'); - } else { - /* CR might be used to re-draw a line, so we'll - * leave it alone and not add LF. - */ - rtas_call(display_character, 1, 1, NULL, *os); - } - - width = max_width; - } else { - width--; - rtas_call(display_character, 1, 1, NULL, *os); - } - - os++; - - /* if we overwrite the screen length */ - if (width <= 0) - while ((*os != 0) && (*os != '\n') && (*os != '\r')) - os++; - } - - /* Blank to end of line. 
*/ - while (width-- > 0) - rtas_call(display_character, 1, 1, NULL, ' '); - - spin_unlock(&progress_lock); -} - static int pSeries_check_legacy_ioport(unsigned int baseport) { struct device_node *np; @@ -535,7 +434,7 @@ struct machdep_calls __initdata pSeries_ .get_rtc_time = rtas_get_rtc_time, .set_rtc_time = rtas_set_rtc_time, .calibrate_decr = generic_calibrate_decr, - .progress = pSeries_progress, + .progress = rtas_progress, .check_legacy_ioport = pSeries_check_legacy_ioport, .system_reset_exception = pSeries_system_reset_exception, .machine_check_exception = pSeries_machine_check_exception, --- linux-cg.orig/arch/ppc64/kernel/rtas-proc.c 2005-06-21 20:21:27.735960616 -0400 +++ linux-cg/arch/ppc64/kernel/rtas-proc.c 2005-06-21 20:22:10.272883704 -0400 @@ -371,11 +371,11 @@ static ssize_t ppc_rtas_progress_write(s /* Lets see if the user passed hexdigits */ hex = simple_strtoul(progress_led, NULL, 10); - ppc_md.progress ((char *)progress_led, hex); + rtas_progress ((char *)progress_led, hex); return count; /* clear the line */ - /* ppc_md.progress(" ", 0xffff);*/ + /* rtas_progress(" ", 0xffff);*/ } /* ****************************************************************** */ static int ppc_rtas_progress_show(struct seq_file *m, void *v) --- linux-cg.orig/arch/ppc64/kernel/rtas.c 2005-06-21 20:20:19.484954016 -0400 +++ linux-cg/arch/ppc64/kernel/rtas.c 2005-06-21 20:21:52.832873152 -0400 @@ -91,6 +91,108 @@ call_rtas_display_status_delay(unsigned } } +void +rtas_progress(char *s, unsigned short hex) +{ + struct device_node *root; + int width, *p; + char *os; + static int display_character, set_indicator; + static int max_width; + static DEFINE_SPINLOCK(progress_lock); + static int pending_newline = 0; /* did last write end with unprinted newline? 
*/ + + if (!rtas.base) + return; + + if (max_width == 0) { + if ((root = find_path_device("/rtas")) && + (p = (unsigned int *)get_property(root, + "ibm,display-line-length", + NULL))) + max_width = *p; + else + max_width = 0x10; + display_character = rtas_token("display-character"); + set_indicator = rtas_token("set-indicator"); + } + + if (display_character == RTAS_UNKNOWN_SERVICE) { + /* use hex display if available */ + if (set_indicator != RTAS_UNKNOWN_SERVICE) + rtas_call(set_indicator, 3, 1, NULL, 6, 0, hex); + return; + } + + spin_lock(&progress_lock); + + /* + * Last write ended with newline, but we didn't print it since + * it would just clear the bottom line of output. Print it now + * instead. + * + * If no newline is pending, print a CR to start output at the + * beginning of the line. + */ + if (pending_newline) { + rtas_call(display_character, 1, 1, NULL, '\r'); + rtas_call(display_character, 1, 1, NULL, '\n'); + pending_newline = 0; + } else { + rtas_call(display_character, 1, 1, NULL, '\r'); + } + + width = max_width; + os = s; + while (*os) { + if (*os == '\n' || *os == '\r') { + /* Blank to end of line. */ + while (width-- > 0) + rtas_call(display_character, 1, 1, NULL, ' '); + + /* If newline is the last character, save it + * until next call to avoid bumping up the + * display output. + */ + if (*os == '\n' && !os[1]) { + pending_newline = 1; + spin_unlock(&progress_lock); + return; + } + + /* RTAS wants CR-LF, not just LF */ + + if (*os == '\n') { + rtas_call(display_character, 1, 1, NULL, '\r'); + rtas_call(display_character, 1, 1, NULL, '\n'); + } else { + /* CR might be used to re-draw a line, so we'll + * leave it alone and not add LF. 
+ */ + rtas_call(display_character, 1, 1, NULL, *os); + } + + width = max_width; + } else { + width--; + rtas_call(display_character, 1, 1, NULL, *os); + } + + os++; + + /* if we overwrite the screen length */ + if (width <= 0) + while ((*os != 0) && (*os != '\n') && (*os != '\r')) + os++; + } + + /* Blank to end of line. */ + while (width-- > 0) + rtas_call(display_character, 1, 1, NULL, ' '); + + spin_unlock(&progress_lock); +} + int rtas_token(const char *service) { @@ -425,8 +527,8 @@ rtas_flash_firmware(void) printk(KERN_ALERT "FLASH: flash image is %ld bytes\n", image_size); printk(KERN_ALERT "FLASH: performing flash and reboot\n"); - ppc_md.progress("Flashing \n", 0x0); - ppc_md.progress("Please Wait... ", 0x0); + rtas_progress("Flashing \n", 0x0); + rtas_progress("Please Wait... ", 0x0); printk(KERN_ALERT "FLASH: this will take several minutes. Do not power off!\n"); status = rtas_call(update_token, 1, 1, NULL, rtas_block_list); switch (status) { /* should only get "bad" status */ --- linux-cg.orig/include/asm-ppc64/rtas.h 2005-06-21 20:21:43.670935016 -0400 +++ linux-cg/include/asm-ppc64/rtas.h 2005-06-21 20:21:52.832873152 -0400 @@ -186,6 +186,7 @@ extern int rtas_get_sensor(int sensor, i extern int rtas_get_power_level(int powerdomain, int *level); extern int rtas_set_power_level(int powerdomain, int level, int *setlevel); extern int rtas_set_indicator(int indicator, int index, int new_value); +extern void rtas_progress(char *s, unsigned short hex); extern void rtas_initialize(void); struct rtc_time; From arnd at arndb.de Wed Jun 22 07:20:05 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:20:05 +0200 Subject: [PATCH 5/11] ppc64: add a minimal nvram driver In-Reply-To: <200506212318.16573.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212317.13467.arnd@arndb.de> <200506212318.16573.arnd@arndb.de> Message-ID: <200506212320.05799.arnd@arndb.de> The firmware provides the location and size of the nvram in the device 
tree, so it does not really contain any hardware specific bits and could be used on other machines as well. From: Utz Bacher Signed-off-by: Arnd Bergmann Index: linus-2.5/arch/ppc64/kernel/bpa_nvram.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linus-2.5/arch/ppc64/kernel/bpa_nvram.c 2005-04-20 01:55:36.000000000 +0200 @@ -0,0 +1,118 @@ +/* + * NVRAM for CPBW + * + * (C) Copyright IBM Corp. 2005 + * + * Authors : Utz Bacher + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#include +#include +#include +#include +#include + +#include +#include +#include + +static void __iomem *bpa_nvram_start; +static long bpa_nvram_len; +static spinlock_t bpa_nvram_lock = SPIN_LOCK_UNLOCKED; + +static ssize_t bpa_nvram_read(char *buf, size_t count, loff_t *index) +{ + unsigned long flags; + + if (*index >= bpa_nvram_len) + return 0; + if (*index + count > bpa_nvram_len) + count = bpa_nvram_len - *index; + + spin_lock_irqsave(&bpa_nvram_lock, flags); + + memcpy_fromio(buf, bpa_nvram_start + *index, count); + + spin_unlock_irqrestore(&bpa_nvram_lock, flags); + + *index += count; + return count; +} + +static ssize_t bpa_nvram_write(char *buf, size_t count, loff_t *index) +{ + unsigned long flags; + + if (*index >= bpa_nvram_len) + return 0; + if (*index + count > bpa_nvram_len) + count = bpa_nvram_len - *index; + + spin_lock_irqsave(&bpa_nvram_lock, flags); + + memcpy_toio(bpa_nvram_start + *index, buf, count); + + spin_unlock_irqrestore(&bpa_nvram_lock, flags); + + *index += count; + return count; +} + +static ssize_t bpa_nvram_get_size(void) +{ + return bpa_nvram_len; +} + +int __init bpa_nvram_init(void) +{ + struct device_node *nvram_node; + unsigned long *buffer; + int proplen; + unsigned long nvram_addr; + int ret; + + ret = -ENODEV; + nvram_node = of_find_node_by_type(NULL, "nvram"); + if (!nvram_node) + goto out; + + ret = -EIO; + buffer = (unsigned long *)get_property(nvram_node, "reg", &proplen); + if (proplen != 2*sizeof(unsigned long)) + goto out; + + ret = -ENODEV; + nvram_addr = buffer[0]; + bpa_nvram_len = buffer[1]; + if ( (!bpa_nvram_len) || (!nvram_addr) ) + goto out; + + bpa_nvram_start = ioremap(nvram_addr, bpa_nvram_len); + if (!bpa_nvram_start) + goto out; + + printk(KERN_INFO "BPA NVRAM, %luk mapped to %p\n", + bpa_nvram_len >> 10, bpa_nvram_start); + + ppc_md.nvram_read = bpa_nvram_read; + ppc_md.nvram_write = bpa_nvram_write; + ppc_md.nvram_size = bpa_nvram_get_size; + +out: + of_node_put(nvram_node); + return ret; +} 
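The read and write helpers in bpa_nvram.c above share one pattern: return 0 once the offset is at or past the end of the region, otherwise clamp the request to the bytes that remain, copy, and advance the offset. The following is a minimal user-space sketch of that pattern only, not the driver itself. The names sketch_nvram, SKETCH_NVRAM_LEN and sketch_nvram_read are hypothetical, a plain array stands in for the ioremap()ed window, memcpy() stands in for memcpy_fromio(), and there is no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NVRAM_LEN 16

/* Hypothetical stand-in for the ioremap()ed NVRAM window. */
static unsigned char sketch_nvram[SKETCH_NVRAM_LEN];

/*
 * Copy up to count bytes starting at *index into buf, clamping the
 * request to the end of the region, and advance *index by the number
 * of bytes copied. Returns 0 when *index is negative or at/past the end.
 */
static long sketch_nvram_read(char *buf, size_t count, long *index)
{
	assert(buf != NULL && index != NULL);

	if (*index < 0 || *index >= SKETCH_NVRAM_LEN)
		return 0;
	if (count > (size_t)(SKETCH_NVRAM_LEN - *index))
		count = (size_t)(SKETCH_NVRAM_LEN - *index);

	/* The driver uses memcpy_fromio() under a spinlock here. */
	memcpy(buf, sketch_nvram + *index, count);

	*index += (long)count;
	return (long)count;
}
```

A caller that asks for 8 bytes at offset 12 of this 16-byte region gets 4 bytes back and an offset of 16; the next call returns 0, which is how a sequential reader sees end-of-device.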
Index: linus-2.5/include/asm-ppc64/nvram.h =================================================================== --- linus-2.5.orig/include/asm-ppc64/nvram.h 2005-04-20 01:54:03.000000000 +0200 +++ linus-2.5/include/asm-ppc64/nvram.h 2005-04-20 01:55:36.000000000 +0200 @@ -70,6 +70,7 @@ extern int pSeries_nvram_init(void); extern int pmac_nvram_init(void); +extern int bpa_nvram_init(void); /* PowerMac specific nvram stuffs */ From arnd at arndb.de Wed Jun 22 07:17:12 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:17:12 +0200 Subject: [PATCH 3/11] ppc64: Split out generic rtas code from pSeries_pci.c. In-Reply-To: <200506212313.12090.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212311.36010.arnd@arndb.de> <200506212313.12090.arnd@arndb.de> Message-ID: <200506212317.13467.arnd@arndb.de> BPA is using rtas for PCI but should not be confused by pSeries code. This also avoids some #ifdefs. Other platforms that want to use rtas_pci.c could create their own platform_pci.c with platform specific fixups. 
Signed-off-by: Arnd Bergmann -- Makefile | 3 mpic.h | 3 pSeries_pci.c | 476 ------------------------------------------------------- rtas_pci.c | 495 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 506 insertions(+), 471 deletions(-) --- linux-cg.orig/arch/ppc64/kernel/Makefile 2005-05-13 14:56:19.016994560 -0400 +++ linux-cg/arch/ppc64/kernel/Makefile 2005-05-13 15:00:05.111971888 -0400 @@ -32,13 +32,14 @@ obj-$(CONFIG_PPC_MULTIPLATFORM) += nvram obj-$(CONFIG_PPC_PSERIES) += pSeries_pci.o pSeries_lpar.o pSeries_hvCall.o \ pSeries_nvram.o rtasd.o ras.o pSeries_reconfig.o \ - xics.o rtas.o pSeries_setup.o pSeries_iommu.o + xics.o pSeries_setup.o pSeries_iommu.o obj-$(CONFIG_EEH) += eeh.o obj-$(CONFIG_PROC_FS) += proc_ppc64.o obj-$(CONFIG_RTAS_FLASH) += rtas_flash.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_MODULES) += module.o ppc_ksyms.o +obj-$(CONFIG_PPC_RTAS) += rtas.o rtas_pci.o obj-$(CONFIG_RTAS_PROC) += rtas-proc.o obj-$(CONFIG_SCANLOG) += scanlog.o obj-$(CONFIG_VIOPATH) += viopath.o --- linux-cg.orig/arch/ppc64/kernel/mpic.h 2005-05-13 14:56:19.018994256 -0400 +++ linux-cg/arch/ppc64/kernel/mpic.h 2005-05-13 15:00:10.785908048 -0400 @@ -265,3 +265,6 @@ extern void mpic_send_ipi(unsigned int i extern int mpic_get_one_irq(struct mpic *mpic, struct pt_regs *regs); /* This one gets to the primary mpic */ extern int mpic_get_irq(struct pt_regs *regs); + +/* global mpic for pSeries */ +extern struct mpic *pSeries_mpic; --- linux-cg.orig/arch/ppc64/kernel/pSeries_pci.c 2005-05-13 14:57:09.556898776 -0400 +++ linux-cg/arch/ppc64/kernel/pSeries_pci.c 2005-05-13 15:00:10.786907896 -0400 @@ -1,13 +1,11 @@ /* - * pSeries_pci.c + * arch/ppc64/kernel/pSeries_pci.c * * Copyright (C) 2001 Dave Engebretsen, IBM Corporation * Copyright (C) 2003 Anton Blanchard , IBM * * pSeries specific routines for PCI. 
* - * Based on code from pci.c and chrp_pci.c - * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or @@ -23,430 +21,18 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ +#include +#include #include -#include #include #include -#include -#include -#include -#include -#include -#include -#include #include -#include -#include +#include -#include "mpic.h" #include "pci.h" -/* RTAS tokens */ -static int read_pci_config; -static int write_pci_config; -static int ibm_read_pci_config; -static int ibm_write_pci_config; - -static int s7a_workaround; - -extern struct mpic *pSeries_mpic; - -static int config_access_valid(struct device_node *dn, int where) -{ - if (where < 256) - return 1; - if (where < 4096 && dn->pci_ext_config_space) - return 1; - - return 0; -} - -static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) -{ - int returnval = -1; - unsigned long buid, addr; - int ret; - - if (!dn) - return PCIBIOS_DEVICE_NOT_FOUND; - if (!config_access_valid(dn, where)) - return PCIBIOS_BAD_REGISTER_NUMBER; - - addr = ((where & 0xf00) << 20) | (dn->busno << 16) | - (dn->devfn << 8) | (where & 0xff); - buid = dn->phb->buid; - if (buid) { - ret = rtas_call(ibm_read_pci_config, 4, 2, &returnval, - addr, buid >> 32, buid & 0xffffffff, size); - } else { - ret = rtas_call(read_pci_config, 2, 2, &returnval, addr, size); - } - *val = returnval; - - if (ret) - return PCIBIOS_DEVICE_NOT_FOUND; - - if (returnval == EEH_IO_ERROR_VALUE(size) - && eeh_dn_check_failure (dn, NULL)) - return PCIBIOS_DEVICE_NOT_FOUND; - - return PCIBIOS_SUCCESSFUL; -} - -static int rtas_pci_read_config(struct pci_bus *bus, - unsigned int devfn, - int where, int size, u32 *val) -{ - struct device_node *busdn, *dn; - - if (bus->self) - busdn = pci_device_to_OF_node(bus->self); - else - busdn = 
bus->sysdata; /* must be a phb */ - - /* Search only direct children of the bus */ - for (dn = busdn->child; dn; dn = dn->sibling) - if (dn->devfn == devfn) - return rtas_read_config(dn, where, size, val); - return PCIBIOS_DEVICE_NOT_FOUND; -} - -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) -{ - unsigned long buid, addr; - int ret; - - if (!dn) - return PCIBIOS_DEVICE_NOT_FOUND; - if (!config_access_valid(dn, where)) - return PCIBIOS_BAD_REGISTER_NUMBER; - - addr = ((where & 0xf00) << 20) | (dn->busno << 16) | - (dn->devfn << 8) | (where & 0xff); - buid = dn->phb->buid; - if (buid) { - ret = rtas_call(ibm_write_pci_config, 5, 1, NULL, addr, buid >> 32, buid & 0xffffffff, size, (ulong) val); - } else { - ret = rtas_call(write_pci_config, 3, 1, NULL, addr, size, (ulong)val); - } - - if (ret) - return PCIBIOS_DEVICE_NOT_FOUND; - - return PCIBIOS_SUCCESSFUL; -} - -static int rtas_pci_write_config(struct pci_bus *bus, - unsigned int devfn, - int where, int size, u32 val) -{ - struct device_node *busdn, *dn; - - if (bus->self) - busdn = pci_device_to_OF_node(bus->self); - else - busdn = bus->sysdata; /* must be a phb */ - - /* Search only direct children of the bus */ - for (dn = busdn->child; dn; dn = dn->sibling) - if (dn->devfn == devfn) - return rtas_write_config(dn, where, size, val); - return PCIBIOS_DEVICE_NOT_FOUND; -} - -struct pci_ops rtas_pci_ops = { - rtas_pci_read_config, - rtas_pci_write_config -}; - -int is_python(struct device_node *dev) -{ - char *model = (char *)get_property(dev, "model", NULL); - - if (model && strstr(model, "Python")) - return 1; - - return 0; -} - -static int get_phb_reg_prop(struct device_node *dev, - unsigned int addr_size_words, - struct reg_property64 *reg) -{ - unsigned int *ui_ptr = NULL, len; - - /* Found a PHB, now figure out where his registers are mapped. 
*/ - ui_ptr = (unsigned int *)get_property(dev, "reg", &len); - if (ui_ptr == NULL) - return 1; - - if (addr_size_words == 1) { - reg->address = ((struct reg_property32 *)ui_ptr)->address; - reg->size = ((struct reg_property32 *)ui_ptr)->size; - } else { - *reg = *((struct reg_property64 *)ui_ptr); - } - - return 0; -} - -static void python_countermeasures(struct device_node *dev, - unsigned int addr_size_words) -{ - struct reg_property64 reg_struct; - void __iomem *chip_regs; - volatile u32 val; - - if (get_phb_reg_prop(dev, addr_size_words, &reg_struct)) - return; - - /* Python's register file is 1 MB in size. */ - chip_regs = ioremap(reg_struct.address & ~(0xfffffUL), 0x100000); - - /* - * Firmware doesn't always clear this bit which is critical - * for good performance - Anton - */ - -#define PRG_CL_RESET_VALID 0x00010000 - - val = in_be32(chip_regs + 0xf6030); - if (val & PRG_CL_RESET_VALID) { - printk(KERN_INFO "Python workaround: "); - val &= ~PRG_CL_RESET_VALID; - out_be32(chip_regs + 0xf6030, val); - /* - * We must read it back for changes to - * take effect - */ - val = in_be32(chip_regs + 0xf6030); - printk("reg0: %x\n", val); - } - - iounmap(chip_regs); -} - -void __init init_pci_config_tokens (void) -{ - read_pci_config = rtas_token("read-pci-config"); - write_pci_config = rtas_token("write-pci-config"); - ibm_read_pci_config = rtas_token("ibm,read-pci-config"); - ibm_write_pci_config = rtas_token("ibm,write-pci-config"); -} - -unsigned long __devinit get_phb_buid (struct device_node *phb) -{ - int addr_cells; - unsigned int *buid_vals; - unsigned int len; - unsigned long buid; - - if (ibm_read_pci_config == -1) return 0; - - /* PHB's will always be children of the root node, - * or so it is promised by the current firmware.
*/ - if (phb->parent == NULL) - return 0; - if (phb->parent->parent) - return 0; - - buid_vals = (unsigned int *) get_property(phb, "reg", &len); - if (buid_vals == NULL) - return 0; - - addr_cells = prom_n_addr_cells(phb); - if (addr_cells == 1) { - buid = (unsigned long) buid_vals[0]; - } else { - buid = (((unsigned long)buid_vals[0]) << 32UL) | - (((unsigned long)buid_vals[1]) & 0xffffffff); - } - return buid; -} - -static int phb_set_bus_ranges(struct device_node *dev, - struct pci_controller *phb) -{ - int *bus_range; - unsigned int len; - - bus_range = (int *) get_property(dev, "bus-range", &len); - if (bus_range == NULL || len < 2 * sizeof(int)) { - return 1; - } - - phb->first_busno = bus_range[0]; - phb->last_busno = bus_range[1]; - - return 0; -} - -static int __devinit setup_phb(struct device_node *dev, - struct pci_controller *phb, - unsigned int addr_size_words) -{ - pci_setup_pci_controller(phb); - - if (is_python(dev)) - python_countermeasures(dev, addr_size_words); - - if (phb_set_bus_ranges(dev, phb)) - return 1; - - phb->arch_data = dev; - phb->ops = &rtas_pci_ops; - phb->buid = get_phb_buid(dev); - - return 0; -} - -static void __devinit add_linux_pci_domain(struct device_node *dev, - struct pci_controller *phb, - struct property *of_prop) -{ - memset(of_prop, 0, sizeof(struct property)); - of_prop->name = "linux,pci-domain"; - of_prop->length = sizeof(phb->global_number); - of_prop->value = (unsigned char *)&of_prop[1]; - memcpy(of_prop->value, &phb->global_number, sizeof(phb->global_number)); - prom_add_property(dev, of_prop); -} - -static struct pci_controller * __init alloc_phb(struct device_node *dev, - unsigned int addr_size_words) -{ - struct pci_controller *phb; - struct property *of_prop; - - phb = alloc_bootmem(sizeof(struct pci_controller)); - if (phb == NULL) - return NULL; - - of_prop = alloc_bootmem(sizeof(struct property) + - sizeof(phb->global_number)); - if (!of_prop) - return NULL; - - if (setup_phb(dev, phb, addr_size_words)) - 
return NULL; - - add_linux_pci_domain(dev, phb, of_prop); - - return phb; -} - -static struct pci_controller * __devinit alloc_phb_dynamic(struct device_node *dev, unsigned int addr_size_words) -{ - struct pci_controller *phb; - - phb = (struct pci_controller *)kmalloc(sizeof(struct pci_controller), - GFP_KERNEL); - if (phb == NULL) - return NULL; - - if (setup_phb(dev, phb, addr_size_words)) - return NULL; - - phb->is_dynamic = 1; - - /* TODO: linux,pci-domain? */ - - return phb; -} - -unsigned long __init find_and_init_phbs(void) -{ - struct device_node *node; - struct pci_controller *phb; - unsigned int root_size_cells = 0; - unsigned int index; - unsigned int *opprop = NULL; - struct device_node *root = of_find_node_by_path("/"); - - if (ppc64_interrupt_controller == IC_OPEN_PIC) { - opprop = (unsigned int *)get_property(root, - "platform-open-pic", NULL); - } - - root_size_cells = prom_n_size_cells(root); - - index = 0; - - for (node = of_get_next_child(root, NULL); - node != NULL; - node = of_get_next_child(root, node)) { - if (node->type == NULL || strcmp(node->type, "pci") != 0) - continue; - - phb = alloc_phb(node, root_size_cells); - if (!phb) - continue; - - pci_process_bridge_OF_ranges(phb, node); - pci_setup_phb_io(phb, index == 0); - - if (ppc64_interrupt_controller == IC_OPEN_PIC && pSeries_mpic) { - int addr = root_size_cells * (index + 2) - 1; - mpic_assign_isu(pSeries_mpic, index, opprop[addr]); - } - - index++; - } - - of_node_put(root); - pci_devs_phb_init(); - - /* - * pci_probe_only and pci_assign_all_buses can be set via properties - * in chosen. 
- */ - if (of_chosen) { - int *prop; - - prop = (int *)get_property(of_chosen, "linux,pci-probe-only", - NULL); - if (prop) - pci_probe_only = *prop; - - prop = (int *)get_property(of_chosen, - "linux,pci-assign-all-buses", NULL); - if (prop) - pci_assign_all_buses = *prop; - } - - return 0; -} - -struct pci_controller * __devinit init_phb_dynamic(struct device_node *dn) -{ - struct device_node *root = of_find_node_by_path("/"); - unsigned int root_size_cells = 0; - struct pci_controller *phb; - struct pci_bus *bus; - int primary; - - root_size_cells = prom_n_size_cells(root); - - primary = list_empty(&hose_list); - phb = alloc_phb_dynamic(dn, root_size_cells); - if (!phb) - return NULL; - - pci_process_bridge_OF_ranges(phb, dn); - - pci_setup_phb_io_dynamic(phb, primary); - of_node_put(root); - - pci_devs_phb_init_dynamic(phb); - phb->last_busno = 0xff; - bus = pci_scan_bus(phb->first_busno, phb->ops, phb->arch_data); - phb->bus = bus; - phb->last_busno = bus->subordinate; - - return phb; -} -EXPORT_SYMBOL(init_phb_dynamic); +static int __initdata s7a_workaround; #if 0 void pcibios_name_device(struct pci_dev *dev) @@ -474,7 +60,7 @@ void pcibios_name_device(struct pci_dev DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, pcibios_name_device); #endif -static void check_s7a(void) +static void __init check_s7a(void) { struct device_node *root; char *model; @@ -488,56 +74,6 @@ static void check_s7a(void) } } -/* RPA-specific bits for removing PHBs */ -int pcibios_remove_root_bus(struct pci_controller *phb) -{ - struct pci_bus *b = phb->bus; - struct resource *res; - int rc, i; - - res = b->resource[0]; - if (!res->flags) { - printk(KERN_ERR "%s: no IO resource for PHB %s\n", __FUNCTION__, - b->name); - return 1; - } - - rc = unmap_bus_range(b); - if (rc) { - printk(KERN_ERR "%s: failed to unmap IO on bus %s\n", - __FUNCTION__, b->name); - return 1; - } - - if (release_resource(res)) { - printk(KERN_ERR "%s: failed to release IO on bus %s\n", - __FUNCTION__, b->name); - 
return 1; - } - - for (i = 1; i < 3; ++i) { - res = b->resource[i]; - if (!res->flags && i == 0) { - printk(KERN_ERR "%s: no MEM resource for PHB %s\n", - __FUNCTION__, b->name); - return 1; - } - if (res->flags && release_resource(res)) { - printk(KERN_ERR - "%s: failed to release IO %d on bus %s\n", - __FUNCTION__, i, b->name); - return 1; - } - } - - list_del(&phb->list_node); - if (phb->is_dynamic) - kfree(phb); - - return 0; -} -EXPORT_SYMBOL(pcibios_remove_root_bus); - static void __init pSeries_request_regions(void) { if (!isa_io_base) --- linux-cg.orig/arch/ppc64/kernel/rtas_pci.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/arch/ppc64/kernel/rtas_pci.c 2005-05-13 15:00:10.788907592 -0400 @@ -0,0 +1,495 @@ +/* + * arch/ppc64/kernel/rtas_pci.c + * + * Copyright (C) 2001 Dave Engebretsen, IBM Corporation + * Copyright (C) 2003 Anton Blanchard , IBM + * + * RTAS specific routines for PCI. + * + * Based on code from pci.c, chrp_pci.c and pSeries_pci.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mpic.h" +#include "pci.h" + +/* RTAS tokens */ +static int read_pci_config; +static int write_pci_config; +static int ibm_read_pci_config; +static int ibm_write_pci_config; + +static int config_access_valid(struct device_node *dn, int where) +{ + if (where < 256) + return 1; + if (where < 4096 && dn->pci_ext_config_space) + return 1; + + return 0; +} + +static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) +{ + int returnval = -1; + unsigned long buid, addr; + int ret; + + if (!dn) + return PCIBIOS_DEVICE_NOT_FOUND; + if (!config_access_valid(dn, where)) + return PCIBIOS_BAD_REGISTER_NUMBER; + + addr = ((where & 0xf00) << 20) | (dn->busno << 16) | + (dn->devfn << 8) | (where & 0xff); + buid = dn->phb->buid; + if (buid) { + ret = rtas_call(ibm_read_pci_config, 4, 2, &returnval, + addr, buid >> 32, buid & 0xffffffff, size); + } else { + ret = rtas_call(read_pci_config, 2, 2, &returnval, addr, size); + } + *val = returnval; + + if (ret) + return PCIBIOS_DEVICE_NOT_FOUND; + + if (returnval == EEH_IO_ERROR_VALUE(size) + && eeh_dn_check_failure (dn, NULL)) + return PCIBIOS_DEVICE_NOT_FOUND; + + return PCIBIOS_SUCCESSFUL; +} + +static int rtas_pci_read_config(struct pci_bus *bus, + unsigned int devfn, + int where, int size, u32 *val) +{ + struct device_node *busdn, *dn; + + if (bus->self) + busdn = pci_device_to_OF_node(bus->self); + else + busdn = bus->sysdata; /* must be a phb */ + + /* Search only direct children of the bus */ + for (dn = busdn->child; dn; dn = dn->sibling) + if (dn->devfn == devfn) + return rtas_read_config(dn, where, size, val); + return 
PCIBIOS_DEVICE_NOT_FOUND; +} + +static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +{ + unsigned long buid, addr; + int ret; + + if (!dn) + return PCIBIOS_DEVICE_NOT_FOUND; + if (!config_access_valid(dn, where)) + return PCIBIOS_BAD_REGISTER_NUMBER; + + addr = ((where & 0xf00) << 20) | (dn->busno << 16) | + (dn->devfn << 8) | (where & 0xff); + buid = dn->phb->buid; + if (buid) { + ret = rtas_call(ibm_write_pci_config, 5, 1, NULL, addr, buid >> 32, buid & 0xffffffff, size, (ulong) val); + } else { + ret = rtas_call(write_pci_config, 3, 1, NULL, addr, size, (ulong)val); + } + + if (ret) + return PCIBIOS_DEVICE_NOT_FOUND; + + return PCIBIOS_SUCCESSFUL; +} + +static int rtas_pci_write_config(struct pci_bus *bus, + unsigned int devfn, + int where, int size, u32 val) +{ + struct device_node *busdn, *dn; + + if (bus->self) + busdn = pci_device_to_OF_node(bus->self); + else + busdn = bus->sysdata; /* must be a phb */ + + /* Search only direct children of the bus */ + for (dn = busdn->child; dn; dn = dn->sibling) + if (dn->devfn == devfn) + return rtas_write_config(dn, where, size, val); + return PCIBIOS_DEVICE_NOT_FOUND; +} + +struct pci_ops rtas_pci_ops = { + rtas_pci_read_config, + rtas_pci_write_config +}; + +int is_python(struct device_node *dev) +{ + char *model = (char *)get_property(dev, "model", NULL); + + if (model && strstr(model, "Python")) + return 1; + + return 0; +} + +static int get_phb_reg_prop(struct device_node *dev, + unsigned int addr_size_words, + struct reg_property64 *reg) +{ + unsigned int *ui_ptr = NULL, len; + + /* Found a PHB, now figure out where his registers are mapped. 
*/ + ui_ptr = (unsigned int *)get_property(dev, "reg", &len); + if (ui_ptr == NULL) + return 1; + + if (addr_size_words == 1) { + reg->address = ((struct reg_property32 *)ui_ptr)->address; + reg->size = ((struct reg_property32 *)ui_ptr)->size; + } else { + *reg = *((struct reg_property64 *)ui_ptr); + } + + return 0; +} + +static void python_countermeasures(struct device_node *dev, + unsigned int addr_size_words) +{ + struct reg_property64 reg_struct; + void __iomem *chip_regs; + volatile u32 val; + + if (get_phb_reg_prop(dev, addr_size_words, &reg_struct)) + return; + + /* Python's register file is 1 MB in size. */ + chip_regs = ioremap(reg_struct.address & ~(0xfffffUL), 0x100000); + + /* + * Firmware doesn't always clear this bit which is critical + * for good performance - Anton + */ + +#define PRG_CL_RESET_VALID 0x00010000 + + val = in_be32(chip_regs + 0xf6030); + if (val & PRG_CL_RESET_VALID) { + printk(KERN_INFO "Python workaround: "); + val &= ~PRG_CL_RESET_VALID; + out_be32(chip_regs + 0xf6030, val); + /* + * We must read it back for changes to + * take effect + */ + val = in_be32(chip_regs + 0xf6030); + printk("reg0: %x\n", val); + } + + iounmap(chip_regs); +} + +void __init init_pci_config_tokens (void) +{ + read_pci_config = rtas_token("read-pci-config"); + write_pci_config = rtas_token("write-pci-config"); + ibm_read_pci_config = rtas_token("ibm,read-pci-config"); + ibm_write_pci_config = rtas_token("ibm,write-pci-config"); +} + +unsigned long __devinit get_phb_buid (struct device_node *phb) +{ + int addr_cells; + unsigned int *buid_vals; + unsigned int len; + unsigned long buid; + + if (ibm_read_pci_config == -1) return 0; + + /* PHB's will always be children of the root node, + * or so it is promised by the current firmware.
*/ + if (phb->parent == NULL) + return 0; + if (phb->parent->parent) + return 0; + + buid_vals = (unsigned int *) get_property(phb, "reg", &len); + if (buid_vals == NULL) + return 0; + + addr_cells = prom_n_addr_cells(phb); + if (addr_cells == 1) { + buid = (unsigned long) buid_vals[0]; + } else { + buid = (((unsigned long)buid_vals[0]) << 32UL) | + (((unsigned long)buid_vals[1]) & 0xffffffff); + } + return buid; +} + +static int phb_set_bus_ranges(struct device_node *dev, + struct pci_controller *phb) +{ + int *bus_range; + unsigned int len; + + bus_range = (int *) get_property(dev, "bus-range", &len); + if (bus_range == NULL || len < 2 * sizeof(int)) { + return 1; + } + + phb->first_busno = bus_range[0]; + phb->last_busno = bus_range[1]; + + return 0; +} + +static int __devinit setup_phb(struct device_node *dev, + struct pci_controller *phb, + unsigned int addr_size_words) +{ + pci_setup_pci_controller(phb); + + if (is_python(dev)) + python_countermeasures(dev, addr_size_words); + + if (phb_set_bus_ranges(dev, phb)) + return 1; + + phb->arch_data = dev; + phb->ops = &rtas_pci_ops; + phb->buid = get_phb_buid(dev); + + return 0; +} + +static void __devinit add_linux_pci_domain(struct device_node *dev, + struct pci_controller *phb, + struct property *of_prop) +{ + memset(of_prop, 0, sizeof(struct property)); + of_prop->name = "linux,pci-domain"; + of_prop->length = sizeof(phb->global_number); + of_prop->value = (unsigned char *)&of_prop[1]; + memcpy(of_prop->value, &phb->global_number, sizeof(phb->global_number)); + prom_add_property(dev, of_prop); +} + +static struct pci_controller * __init alloc_phb(struct device_node *dev, + unsigned int addr_size_words) +{ + struct pci_controller *phb; + struct property *of_prop; + + phb = alloc_bootmem(sizeof(struct pci_controller)); + if (phb == NULL) + return NULL; + + of_prop = alloc_bootmem(sizeof(struct property) + + sizeof(phb->global_number)); + if (!of_prop) + return NULL; + + if (setup_phb(dev, phb, addr_size_words)) + 
return NULL; + + add_linux_pci_domain(dev, phb, of_prop); + + return phb; +} + +static struct pci_controller * __devinit alloc_phb_dynamic(struct device_node *dev, unsigned int addr_size_words) +{ + struct pci_controller *phb; + + phb = (struct pci_controller *)kmalloc(sizeof(struct pci_controller), + GFP_KERNEL); + if (phb == NULL) + return NULL; + + if (setup_phb(dev, phb, addr_size_words)) + return NULL; + + phb->is_dynamic = 1; + + /* TODO: linux,pci-domain? */ + + return phb; +} + +unsigned long __init find_and_init_phbs(void) +{ + struct device_node *node; + struct pci_controller *phb; + unsigned int root_size_cells = 0; + unsigned int index; + unsigned int *opprop = NULL; + struct device_node *root = of_find_node_by_path("/"); + + if (ppc64_interrupt_controller == IC_OPEN_PIC) { + opprop = (unsigned int *)get_property(root, + "platform-open-pic", NULL); + } + + root_size_cells = prom_n_size_cells(root); + + index = 0; + + for (node = of_get_next_child(root, NULL); + node != NULL; + node = of_get_next_child(root, node)) { + if (node->type == NULL || strcmp(node->type, "pci") != 0) + continue; + + phb = alloc_phb(node, root_size_cells); + if (!phb) + continue; + + pci_process_bridge_OF_ranges(phb, node); + pci_setup_phb_io(phb, index == 0); +#ifdef CONFIG_PPC_PSERIES + if (ppc64_interrupt_controller == IC_OPEN_PIC && pSeries_mpic) { + int addr = root_size_cells * (index + 2) - 1; + mpic_assign_isu(pSeries_mpic, index, opprop[addr]); + } +#endif + index++; + } + + of_node_put(root); + pci_devs_phb_init(); + + /* + * pci_probe_only and pci_assign_all_buses can be set via properties + * in chosen. 
+ */ + if (of_chosen) { + int *prop; + + prop = (int *)get_property(of_chosen, "linux,pci-probe-only", + NULL); + if (prop) + pci_probe_only = *prop; + + prop = (int *)get_property(of_chosen, + "linux,pci-assign-all-buses", NULL); + if (prop) + pci_assign_all_buses = *prop; + } + + return 0; +} + +struct pci_controller * __devinit init_phb_dynamic(struct device_node *dn) +{ + struct device_node *root = of_find_node_by_path("/"); + unsigned int root_size_cells = 0; + struct pci_controller *phb; + struct pci_bus *bus; + int primary; + + root_size_cells = prom_n_size_cells(root); + + primary = list_empty(&hose_list); + phb = alloc_phb_dynamic(dn, root_size_cells); + if (!phb) + return NULL; + + pci_process_bridge_OF_ranges(phb, dn); + + pci_setup_phb_io_dynamic(phb, primary); + of_node_put(root); + + pci_devs_phb_init_dynamic(phb); + phb->last_busno = 0xff; + bus = pci_scan_bus(phb->first_busno, phb->ops, phb->arch_data); + phb->bus = bus; + phb->last_busno = bus->subordinate; + + return phb; +} +EXPORT_SYMBOL(init_phb_dynamic); + +/* RPA-specific bits for removing PHBs */ +int pcibios_remove_root_bus(struct pci_controller *phb) +{ + struct pci_bus *b = phb->bus; + struct resource *res; + int rc, i; + + res = b->resource[0]; + if (!res->flags) { + printk(KERN_ERR "%s: no IO resource for PHB %s\n", __FUNCTION__, + b->name); + return 1; + } + + rc = unmap_bus_range(b); + if (rc) { + printk(KERN_ERR "%s: failed to unmap IO on bus %s\n", + __FUNCTION__, b->name); + return 1; + } + + if (release_resource(res)) { + printk(KERN_ERR "%s: failed to release IO on bus %s\n", + __FUNCTION__, b->name); + return 1; + } + + for (i = 1; i < 3; ++i) { + res = b->resource[i]; + if (!res->flags && i == 0) { + printk(KERN_ERR "%s: no MEM resource for PHB %s\n", + __FUNCTION__, b->name); + return 1; + } + if (res->flags && release_resource(res)) { + printk(KERN_ERR + "%s: failed to release IO %d on bus %s\n", + __FUNCTION__, i, b->name); + return 1; + } + } + + list_del(&phb->list_node); 
+ if (phb->is_dynamic) + kfree(phb); + + return 0; +} +EXPORT_SYMBOL(pcibios_remove_root_bus); From arnd at arndb.de Wed Jun 22 07:22:35 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:22:35 +0200 Subject: [PATCH 6/11] ppc64: add a watchdog driver for rtas In-Reply-To: <200506212320.05799.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212318.16573.arnd@arndb.de> <200506212320.05799.arnd@arndb.de> Message-ID: <200506212322.36453.arnd@arndb.de> Add a watchdog using the RTAS OS surveillance service. This is provided as a simpler alternative to rtasd. The added value is that it works with standard watchdog client programs and can therefore also do user space monitoring. On BPA, rtasd is not really useful because the hardware does not have much to report with event-scan. The driver should also work on other platforms that support the OS surveillance rtas calls. From: Utz Bacher Signed-off-by: Arnd Bergmann -- Kconfig | 10 Makefile | 1 wdrtas.c | 696 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 707 insertions(+) --- linux-cg.orig/drivers/char/watchdog/Kconfig 2005-06-21 02:51:30.460932768 -0400 +++ linux-cg/drivers/char/watchdog/Kconfig 2005-06-21 02:52:33.870015048 -0400 @@ -414,6 +414,16 @@ config WATCHDOG_RIO machines. The watchdog timeout period is normally one minute but can be changed with a boot-time parameter. +# ppc64 RTAS watchdog +config WATCHDOG_RTAS + tristate "RTAS watchdog" + depends on WATCHDOG && PPC_RTAS + help + This driver adds watchdog support for the RTAS watchdog. + + To compile this driver as a module, choose M here. The module + will be called wdrtas. + # # ISA-based Watchdog Cards # --- linux-cg.orig/drivers/char/watchdog/Makefile 2005-06-21 02:51:30.463932312 -0400 +++ linux-cg/drivers/char/watchdog/Makefile 2005-06-21 02:52:33.870015048 -0400 @@ -33,6 +33,7 @@ obj-$(CONFIG_USBPCWATCHDOG) += pcwd_usb. 
obj-$(CONFIG_IXP4XX_WATCHDOG) += ixp4xx_wdt.o obj-$(CONFIG_IXP2000_WATCHDOG) += ixp2000_wdt.o obj-$(CONFIG_8xx_WDT) += mpc8xx_wdt.o +obj-$(CONFIG_WATCHDOG_RTAS) += wdrtas.o # Only one watchdog can succeed. We probe the hardware watchdog # drivers first, then the softdog driver. This means if your hardware --- linux-cg.orig/drivers/char/watchdog/wdrtas.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/drivers/char/watchdog/wdrtas.c 2005-06-21 02:59:04.333885560 -0400 @@ -0,0 +1,696 @@ +/* + * FIXME: add wdrtas_get_status and wdrtas_get_boot_status as soon as + * RTAS calls are available + */ + +/* + * RTAS watchdog driver + * + * (C) Copyright IBM Corp. 2005 + * device driver to exploit watchdog RTAS functions + * + * Authors : Utz Bacher + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#define WDRTAS_MAGIC_CHAR 42 +#define WDRTAS_SUPPORTED_MASK (WDIOF_SETTIMEOUT | \ + WDIOF_MAGICCLOSE) + +MODULE_AUTHOR("Utz Bacher "); +MODULE_DESCRIPTION("RTAS watchdog driver"); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_MISCDEV(WATCHDOG_MINOR); +MODULE_ALIAS_MISCDEV(TEMP_MINOR); + +#ifdef CONFIG_WATCHDOG_NOWAYOUT +static int wdrtas_nowayout = 1; +#else +static int wdrtas_nowayout = 0; +#endif + +static atomic_t wdrtas_miscdev_open = ATOMIC_INIT(0); +static char wdrtas_expect_close = 0; + +static int wdrtas_interval; + +#define WDRTAS_THERMAL_SENSOR 3 +static int wdrtas_token_get_sensor_state; +#define WDRTAS_SURVEILLANCE_IND 9000 +static int wdrtas_token_set_indicator; +#define WDRTAS_SP_SPI 28 +static int wdrtas_token_get_sp; +static int wdrtas_token_event_scan; + +#define WDRTAS_DEFAULT_INTERVAL 300 + +#define WDRTAS_LOGBUFFER_LEN 128 +static char wdrtas_logbuffer[WDRTAS_LOGBUFFER_LEN]; + + +/*** watchdog access functions */ + +/** + * wdrtas_set_interval - sets the watchdog interval + * @interval: new interval + * + * returns 0 on success, <0 on failures + * + * wdrtas_set_interval sets the watchdog keepalive interval by calling the + * RTAS function set-indicator (surveillance). The unit of interval is + * seconds. 
+ */ +static int +wdrtas_set_interval(int interval) +{ + long result; + static int print_msg = 10; + + /* rtas uses minutes */ + interval = (interval + 59) / 60; + + result = rtas_call(wdrtas_token_set_indicator, 3, 1, NULL, + WDRTAS_SURVEILLANCE_IND, 0, interval); + if ( (result < 0) && (print_msg) ) { + printk(KERN_ERR "wdrtas: setting the watchdog to %i " + "timeout failed: %li\n", interval, result); + print_msg--; + } + + return result; +} + +/** + * wdrtas_get_interval - returns the current watchdog interval + * @fallback_value: value (in seconds) to use, if the RTAS call fails + * + * returns the interval + * + * wdrtas_get_interval returns the current watchdog keepalive interval + * as reported by the RTAS function ibm,get-system-parameter. The unit + * of the return value is seconds. + */ +static int +wdrtas_get_interval(int fallback_value) +{ + long result; + char value[4]; + + result = rtas_call(wdrtas_token_get_sp, 3, 1, NULL, + WDRTAS_SP_SPI, (void *)__pa(&value), 4); + if ( (value[0] != 0) || (value[1] != 2) || (value[3] != 0) || + (result < 0) ) { + printk(KERN_WARNING "wdrtas: could not get sp_spi watchdog " + "timeout (%li). 
Continuing\n", result); + return fallback_value; + } + + /* rtas uses minutes */ + return ((int)value[2]) * 60; +} + +/** + * wdrtas_timer_start - starts watchdog + * + * wdrtas_timer_start starts the watchdog by calling the RTAS function + * set-indicator (surveillance) + */ +static void +wdrtas_timer_start(void) +{ + wdrtas_set_interval(wdrtas_interval); +} + +/** + * wdrtas_timer_stop - stops watchdog + * + * wdrtas_timer_stop stops the watchdog timer by calling the RTAS function + * set-indicator (surveillance) + */ +static void +wdrtas_timer_stop(void) +{ + wdrtas_set_interval(0); +} + +/** + * wdrtas_log_scanned_event - logs an event we received during keepalive + * + * wdrtas_log_scanned_event prints a message to the log buffer dumping + * the results of the last event-scan call + */ +static void +wdrtas_log_scanned_event(void) +{ + int i; + + for (i = 0; i < WDRTAS_LOGBUFFER_LEN; i += 16) + printk(KERN_INFO "wdrtas: dumping event (line %i/%i), data = " + "%02x %02x %02x %02x %02x %02x %02x %02x " + "%02x %02x %02x %02x %02x %02x %02x %02x\n", + (i / 16) + 1, (WDRTAS_LOGBUFFER_LEN / 16), + wdrtas_logbuffer[i + 0], wdrtas_logbuffer[i + 1], + wdrtas_logbuffer[i + 2], wdrtas_logbuffer[i + 3], + wdrtas_logbuffer[i + 4], wdrtas_logbuffer[i + 5], + wdrtas_logbuffer[i + 6], wdrtas_logbuffer[i + 7], + wdrtas_logbuffer[i + 8], wdrtas_logbuffer[i + 9], + wdrtas_logbuffer[i + 10], wdrtas_logbuffer[i + 11], + wdrtas_logbuffer[i + 12], wdrtas_logbuffer[i + 13], + wdrtas_logbuffer[i + 14], wdrtas_logbuffer[i + 15]); +} + +/** + * wdrtas_timer_keepalive - resets watchdog timer to keep system alive + * + * wdrtas_timer_keepalive restarts the watchdog timer by calling the + * RTAS function event-scan and repeats these calls as long as there are + * events available. All events will be dumped.
+ */ +static void +wdrtas_timer_keepalive(void) +{ + long result; + + do { + result = rtas_call(wdrtas_token_event_scan, 4, 1, NULL, + RTAS_EVENT_SCAN_ALL_EVENTS, 0, + (void *)__pa(wdrtas_logbuffer), + WDRTAS_LOGBUFFER_LEN); + if (result < 0) + printk(KERN_ERR "wdrtas: event-scan failed: %li\n", + result); + if (result == 0) + wdrtas_log_scanned_event(); + } while (result == 0); +} + +/** + * wdrtas_get_temperature - returns current temperature + * + * returns temperature or <0 on failures + * + * wdrtas_get_temperature returns the current temperature in Fahrenheit. It + * uses the RTAS call get-sensor-state, token 3 to do so + */ +static int +wdrtas_get_temperature(void) +{ + long result; + int temperature = 0; + + result = rtas_call(wdrtas_token_get_sensor_state, 2, 2, + (void *)__pa(&temperature), + WDRTAS_THERMAL_SENSOR, 0); + + if (result < 0) + printk(KERN_WARNING "wdrtas: reading the thermal sensor " + "failed: %li\n", result); + else + temperature = ((temperature * 9) / 5) + 32; /* fahrenheit */ + + return temperature; +} + +/** + * wdrtas_get_status - returns the status of the watchdog + * + * returns a bitmask of defines WDIOF_... as defined in + * include/linux/watchdog.h + */ +static int +wdrtas_get_status(void) +{ + return 0; /* TODO */ +} + +/** + * wdrtas_get_boot_status - returns the reason for the last boot + * + * returns a bitmask of defines WDIOF_...
as defined in + * include/linux/watchdog.h, indicating why the watchdog rebooted the system + */ +static int +wdrtas_get_boot_status(void) +{ + return 0; /* TODO */ +} + +/*** watchdog API and operations stuff */ + +/* wdrtas_write - called when watchdog device is written to + * @file: file structure + * @buf: user buffer with data + * @len: amount of data written + * @ppos: position in file + * + * returns the number of successfully processed characters, which is always + * the number of bytes passed to this function + * + * wdrtas_write processes all the data given to it and looks for the magic + * character 'V'. This character allows the watchdog device to be closed + * properly. + */ +static ssize_t +wdrtas_write(struct file *file, const char __user *buf, + size_t len, loff_t *ppos) +{ + int i; + char c; + + if (!len) + goto out; + + if (!wdrtas_nowayout) { + wdrtas_expect_close = 0; + /* look for 'V' */ + for (i = 0; i < len; i++) { + if (get_user(c, buf + i)) + return -EFAULT; + /* allow to close device */ + if (c == 'V') + wdrtas_expect_close = WDRTAS_MAGIC_CHAR; + } + } + + wdrtas_timer_keepalive(); + +out: + return len; +} + +/** + * wdrtas_ioctl - ioctl function for the watchdog device + * @inode: inode structure + * @file: file structure + * @cmd: command for ioctl + * @arg: argument pointer + * + * returns 0 on success, <0 on failure + * + * wdrtas_ioctl implements the watchdog API ioctls + */ +static int +wdrtas_ioctl(struct inode *inode, struct file *file, + unsigned int cmd, unsigned long arg) +{ + int __user *argp = (void *)arg; + int i; + static struct watchdog_info wdinfo = { + .options = WDRTAS_SUPPORTED_MASK, + .firmware_version = 0, + .identity = "wdrtas" + }; + + switch (cmd) { + case WDIOC_GETSUPPORT: + if (copy_to_user(argp, &wdinfo, sizeof(wdinfo))) + return -EFAULT; + return 0; + + case WDIOC_GETSTATUS: + i = wdrtas_get_status(); + return put_user(i, argp); + + case WDIOC_GETBOOTSTATUS: + i = wdrtas_get_boot_status(); + return put_user(i,
argp); + + case WDIOC_GETTEMP: + if (wdrtas_token_get_sensor_state == RTAS_UNKNOWN_SERVICE) + return -EOPNOTSUPP; + + i = wdrtas_get_temperature(); + return put_user(i, argp); + + case WDIOC_SETOPTIONS: + if (get_user(i, argp)) + return -EFAULT; + if (i & WDIOS_DISABLECARD) + wdrtas_timer_stop(); + if (i & WDIOS_ENABLECARD) { + wdrtas_timer_keepalive(); + wdrtas_timer_start(); + } + if (i & WDIOS_TEMPPANIC) { + /* not implemented. Done by H8 */ + } + return 0; + + case WDIOC_KEEPALIVE: + wdrtas_timer_keepalive(); + return 0; + + case WDIOC_SETTIMEOUT: + if (get_user(i, argp)) + return -EFAULT; + + if (wdrtas_set_interval(i)) + return -EINVAL; + + wdrtas_timer_keepalive(); + + if (wdrtas_token_get_sp == RTAS_UNKNOWN_SERVICE) + wdrtas_interval = i; + else + wdrtas_interval = wdrtas_get_interval(i); + /* fallthrough */ + + case WDIOC_GETTIMEOUT: + return put_user(wdrtas_interval, argp); + + default: + return -ENOIOCTLCMD; + } +} + +/** + * wdrtas_open - open function of watchdog device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success, -EBUSY if the file has been opened already, <0 on + * other failures + * + * function called when watchdog device is opened + */ +static int +wdrtas_open(struct inode *inode, struct file *file) +{ + /* only open once */ + if (atomic_inc_return(&wdrtas_miscdev_open) > 1) { + atomic_dec(&wdrtas_miscdev_open); + return -EBUSY; + } + + wdrtas_timer_start(); + wdrtas_timer_keepalive(); + + return nonseekable_open(inode, file); +} + +/** + * wdrtas_close - close function of watchdog device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success + * + * close function. Always succeeds + */ +static int +wdrtas_close(struct inode *inode, struct file *file) +{ + /* only stop watchdog, if this was announced using 'V' before */ + if (wdrtas_expect_close == WDRTAS_MAGIC_CHAR) + wdrtas_timer_stop(); + else { + printk(KERN_WARNING "wdrtas: got unexpected close. 
Watchdog " + "not stopped.\n"); + wdrtas_timer_keepalive(); + } + + wdrtas_expect_close = 0; + atomic_dec(&wdrtas_miscdev_open); + return 0; +} + +/** + * wdrtas_temp_read - gives back the temperature in fahrenheit + * @file: file structure + * @buf: user buffer + * @count: number of bytes to be read + * @ppos: position in file + * + * returns always 1 or -EFAULT in case of user space copy failures, <0 on + * other failures + * + * wdrtas_temp_read gives the temperature to the users by copying this + * value as one byte into the user space buffer. The unit is Fahrenheit... + */ +static ssize_t +wdrtas_temp_read(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + int temperature = 0; + + temperature = wdrtas_get_temperature(); + if (temperature < 0) + return temperature; + + if (copy_to_user(buf, &temperature, 1)) + return -EFAULT; + + return 1; +} + +/** + * wdrtas_temp_open - open function of temperature device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success, <0 on failure + * + * function called when temperature device is opened + */ +static int +wdrtas_temp_open(struct inode *inode, struct file *file) +{ + return nonseekable_open(inode, file); +} + +/** + * wdrtas_temp_close - close function of temperature device + * @inode: inode structure + * @file: file structure + * + * returns 0 on success + * + * close function. 
Always succeeds + */ +static int +wdrtas_temp_close(struct inode *inode, struct file *file) +{ + return 0; +} + +/** + * wdrtas_reboot - reboot notifier function + * @this: notifier block structure + * @code: reboot code + * @ptr: unused + * + * returns NOTIFY_DONE + * + * wdrtas_reboot stops the watchdog in case of a reboot + */ +static int +wdrtas_reboot(struct notifier_block *this, unsigned long code, void *ptr) +{ + if ( (code==SYS_DOWN) || (code==SYS_HALT) ) + wdrtas_timer_stop(); + + return NOTIFY_DONE; +} + +/*** initialization stuff */ + +static struct file_operations wdrtas_fops = { + .owner = THIS_MODULE, + .llseek = no_llseek, + .write = wdrtas_write, + .ioctl = wdrtas_ioctl, + .open = wdrtas_open, + .release = wdrtas_close, +}; + +static struct miscdevice wdrtas_miscdev = { + .minor = WATCHDOG_MINOR, + .name = "watchdog", + .fops = &wdrtas_fops, +}; + +static struct file_operations wdrtas_temp_fops = { + .owner = THIS_MODULE, + .llseek = no_llseek, + .read = wdrtas_temp_read, + .open = wdrtas_temp_open, + .release = wdrtas_temp_close, +}; + +static struct miscdevice wdrtas_tempdev = { + .minor = TEMP_MINOR, + .name = "temperature", + .fops = &wdrtas_temp_fops, +}; + +static struct notifier_block wdrtas_notifier = { + .notifier_call = wdrtas_reboot, +}; + +/** + * wdrtas_get_tokens - reads in RTAS tokens + * + * returns 0 on success, <0 on failure + * + * wdrtas_get_tokens reads in the tokens for the RTAS calls used in + * this watchdog driver. It tolerates, if "get-sensor-state" and + * "ibm,get-system-parameter" are not available. + */ +static int +wdrtas_get_tokens(void) +{ + wdrtas_token_get_sensor_state = rtas_token("get-sensor-state"); + if (wdrtas_token_get_sensor_state == RTAS_UNKNOWN_SERVICE) { + printk(KERN_WARNING "wdrtas: couldn't get token for " + "get-sensor-state.
Trying to continue without " + "temperature support.\n"); + } + + wdrtas_token_get_sp = rtas_token("ibm,get-system-parameter"); + if (wdrtas_token_get_sp == RTAS_UNKNOWN_SERVICE) { + printk(KERN_WARNING "wdrtas: couldn't get token for " + "ibm,get-system-parameter. Trying to continue with " + "a default timeout value of %i seconds.\n", + WDRTAS_DEFAULT_INTERVAL); + } + + wdrtas_token_set_indicator = rtas_token("set-indicator"); + if (wdrtas_token_set_indicator == RTAS_UNKNOWN_SERVICE) { + printk(KERN_ERR "wdrtas: couldn't get token for " + "set-indicator. Terminating watchdog code.\n"); + return -EIO; + } + + wdrtas_token_event_scan = rtas_token("event-scan"); + if (wdrtas_token_event_scan == RTAS_UNKNOWN_SERVICE) { + printk(KERN_ERR "wdrtas: couldn't get token for event-scan. " + "Terminating watchdog code.\n"); + return -EIO; + } + + return 0; +} + +/** + * wdrtas_unregister_devs - unregisters the misc dev handlers + * + * wdrtas_unregister_devs unregisters the watchdog and temperature watchdog + * misc devs + */ +static void +wdrtas_unregister_devs(void) +{ + misc_deregister(&wdrtas_miscdev); + if (wdrtas_token_get_sensor_state != RTAS_UNKNOWN_SERVICE) + misc_deregister(&wdrtas_tempdev); +} + +/** + * wdrtas_register_devs - registers the misc dev handlers + * + * returns 0 on success, <0 on failure + * + * wdrtas_register_devs registers the watchdog and temperature watchdog + * misc devs + */ +static int +wdrtas_register_devs(void) +{ + int result; + + result = misc_register(&wdrtas_miscdev); + if (result) { + printk(KERN_ERR "wdrtas: couldn't register watchdog misc " + "device. Terminating watchdog code.\n"); + return result; + } + + if (wdrtas_token_get_sensor_state != RTAS_UNKNOWN_SERVICE) { + result = misc_register(&wdrtas_tempdev); + if (result) { + printk(KERN_WARNING "wdrtas: couldn't register " + "watchdog temperature misc device.
Continuing " + "without temperature support.\n"); + wdrtas_token_get_sensor_state = RTAS_UNKNOWN_SERVICE; + } + } + + return 0; +} + +/** + * wdrtas_init - init function of the watchdog driver + * + * returns 0 on success, <0 on failure + * + * registers the file handlers and the reboot notifier + */ +static int __init +wdrtas_init(void) +{ + if (wdrtas_get_tokens()) + return -ENODEV; + + if (wdrtas_register_devs()) + return -ENODEV; + + if (register_reboot_notifier(&wdrtas_notifier)) { + printk(KERN_ERR "wdrtas: could not register reboot notifier. " + "Terminating watchdog code.\n"); + wdrtas_unregister_devs(); + return -ENODEV; + } + + if (wdrtas_token_get_sp == RTAS_UNKNOWN_SERVICE) + wdrtas_interval = WDRTAS_DEFAULT_INTERVAL; + else + wdrtas_interval = wdrtas_get_interval(WDRTAS_DEFAULT_INTERVAL); + + return 0; +} + +/** + * wdrtas_exit - exit function of the watchdog driver + * + * unregisters the file handlers and the reboot notifier + */ +static void __exit +wdrtas_exit(void) +{ + if (!wdrtas_nowayout) + wdrtas_timer_stop(); + + wdrtas_unregister_devs(); + + unregister_reboot_notifier(&wdrtas_notifier); +} + +module_init(wdrtas_init); +module_exit(wdrtas_exit); From arnd at arndb.de Wed Jun 22 07:24:19 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:24:19 +0200 Subject: [PATCH 7/11] ppc64: add BPA platform type In-Reply-To: <200506212322.36453.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212320.05799.arnd@arndb.de> <200506212322.36453.arnd@arndb.de> Message-ID: <200506212324.19713.arnd@arndb.de> This adds the basic support for running on BPA machines. So far, this is only the IBM workstation, and it will not run on others without a little more generalization. It should be possible to configure a kernel for any combination of CONFIG_PPC_BPA with any of the other multiplatform targets.
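For readers unfamiliar with how a multiplatform kernel picks its machine at boot, here is a rough user-space sketch of the idea: a NULL-terminated table of machine descriptors is walked, and the first entry whose probe hook accepts the detected platform ID wins. This is only an illustration under simplifying assumptions, not the kernel code. `struct machdep`, `select_machine()` and the two probe helpers are hypothetical stand-ins for `struct machdep_calls` and the per-platform `probe` callbacks; only the two `PLATFORM_*` values are taken from this patch series.

```c
#include <stddef.h>

/* Platform IDs as they appear in include/asm-ppc64/processor.h in this patch. */
#define PLATFORM_PSERIES 0x0100
#define PLATFORM_BPA     0x1000

/* Hypothetical, heavily reduced stand-in for struct machdep_calls. */
struct machdep {
	const char *name;
	int (*probe)(int platform);	/* returns 1 if this entry handles the platform */
};

/* Simplified probe hooks; the real ones may check more than one ID. */
static int pseries_probe(int platform)
{
	return platform == PLATFORM_PSERIES;
}

static int bpa_probe(int platform)
{
	return platform == PLATFORM_BPA;
}

/*
 * NULL-terminated table of candidate machines. In a real multiplatform
 * build each entry would be wrapped in its own #ifdef CONFIG_PPC_xxx,
 * which is what lets any combination of targets be compiled in.
 */
static const struct machdep machines[] = {
	{ "pSeries", pseries_probe },
	{ "BPA",     bpa_probe },
	{ NULL,      NULL },
};

/* Return the first machine whose probe accepts the platform, or NULL. */
static const struct machdep *select_machine(int platform)
{
	const struct machdep *m;

	for (m = machines; m->probe; m++)
		if (m->probe(platform))
			return m;
	return NULL;
}
```

With this table, `select_machine(PLATFORM_BPA)` returns the "BPA" entry, and an ID that no compiled-in entry claims yields NULL, which is the case a caller would have to treat as an unsupported machine.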
Signed-off-by: Arnd Bergmann -- MAINTAINERS | 7 + arch/ppc64/Kconfig | 6 + arch/ppc64/Makefile | 2 arch/ppc64/kernel/Makefile | 3 arch/ppc64/kernel/bpa_setup.c | 135 +++++++++++++++++++++++++++++++++++ arch/ppc64/kernel/cpu_setup_power4.S | 16 +++- arch/ppc64/kernel/cputable.c | 11 ++ arch/ppc64/kernel/irq.c | 3 arch/ppc64/kernel/proc_ppc64.c | 2 arch/ppc64/kernel/prom_init.c | 4 - arch/ppc64/kernel/setup.c | 4 + arch/ppc64/kernel/traps.c | 4 + include/asm-ppc64/mmu.h | 5 - include/asm-ppc64/processor.h | 15 +++ include/asm-ppc64/smp.h | 8 ++ 15 files changed, 216 insertions(+), 9 deletions(-) --- linux-cg.orig/MAINTAINERS 2005-06-21 02:58:35.570002888 -0400 +++ linux-cg/MAINTAINERS 2005-06-21 03:01:35.220979888 -0400 @@ -499,6 +499,13 @@ L: bonding-devel at lists.sourceforge.net W: http://sourceforge.net/projects/bonding/ S: Supported +BROADBAND PROCESSOR ARCHITECTURE +P: Arnd Bergmann +M: arnd at arndb.de +L: linuxppc64-dev at ozlabs.org +W: http://linuxppc64.org +S: Supported + BTTV VIDEO4LINUX DRIVER P: Gerd Knorr M: kraxel at bytesex.org --- linux-cg.orig/arch/ppc64/Kconfig 2005-06-21 02:58:35.572002584 -0400 +++ linux-cg/arch/ppc64/Kconfig 2005-06-21 03:01:35.221979736 -0400 @@ -77,6 +77,10 @@ config PPC_PSERIES bool " IBM pSeries & new iSeries" default y +config PPC_BPA + bool " Broadband Processor Architecture" + depends on PPC_MULTIPLATFORM + config PPC_PMAC depends on PPC_MULTIPLATFORM bool " Apple G5 based machines" @@ -256,7 +260,7 @@ config MSCHUNKS config PPC_RTAS bool - depends on PPC_PSERIES + depends on PPC_PSERIES || PPC_BPA default y config RTAS_PROC --- linux-cg.orig/arch/ppc64/Makefile 2005-06-21 02:58:35.574002280 -0400 +++ linux-cg/arch/ppc64/Makefile 2005-06-21 03:01:35.222979584 -0400 @@ -90,12 +90,14 @@ boot := arch/ppc64/boot boottarget-$(CONFIG_PPC_PSERIES) := zImage zImage.initrd boottarget-$(CONFIG_PPC_MAPLE) := zImage zImage.initrd boottarget-$(CONFIG_PPC_ISERIES) := vmlinux.sminitrd vmlinux.initrd vmlinux.sm 
+boottarget-$(CONFIG_PPC_BPA) := zImage zImage.initrd $(boottarget-y): vmlinux $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@ bootimage-$(CONFIG_PPC_PSERIES) := $(boot)/zImage bootimage-$(CONFIG_PPC_PMAC) := vmlinux bootimage-$(CONFIG_PPC_MAPLE) := $(boot)/zImage +bootimage-$(CONFIG_PPC_BPA) := zImage bootimage-$(CONFIG_PPC_ISERIES) := vmlinux BOOTIMAGE := $(bootimage-y) install: vmlinux --- linux-cg.orig/arch/ppc64/kernel/Makefile 2005-06-21 02:58:35.577001824 -0400 +++ linux-cg/arch/ppc64/kernel/Makefile 2005-06-21 03:01:35.222979584 -0400 @@ -34,6 +34,8 @@ obj-$(CONFIG_PPC_PSERIES) += pSeries_pci pSeries_nvram.o rtasd.o ras.o pSeries_reconfig.o \ xics.o pSeries_setup.o pSeries_iommu.o +obj-$(CONFIG_PPC_BPA) += bpa_setup.o bpa_nvram.o + obj-$(CONFIG_EEH) += eeh.o obj-$(CONFIG_PROC_FS) += proc_ppc64.o obj-$(CONFIG_RTAS_FLASH) += rtas_flash.o @@ -60,6 +62,7 @@ ifdef CONFIG_SMP obj-$(CONFIG_PPC_PMAC) += pmac_smp.o smp-tbsync.o obj-$(CONFIG_PPC_ISERIES) += iSeries_smp.o obj-$(CONFIG_PPC_PSERIES) += pSeries_smp.o +obj-$(CONFIG_PPC_BPA) += pSeries_smp.o obj-$(CONFIG_PPC_MAPLE) += smp-tbsync.o endif --- linux-cg.orig/arch/ppc64/kernel/bpa_setup.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/arch/ppc64/kernel/bpa_setup.c 2005-06-21 03:01:52.874957760 -0400 @@ -0,0 +1,135 @@ +/* + * linux/arch/ppc/kernel/bpa_setup.c + * + * Copyright (C) 1995 Linus Torvalds + * Adapted from 'alpha' version by Gary Thomas + * Modified by Cort Dougan (cort at cs.nmt.edu) + * Modified by PPC64 Team, IBM Corp + * Modified by BPA Team, IBM Deutschland Entwicklung GmbH + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ +#undef DEBUG + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "pci.h" + +#ifdef DEBUG +#define DBG(fmt...) udbg_printf(fmt) +#else +#define DBG(fmt...) +#endif + +void bpa_get_cpuinfo(struct seq_file *m) +{ + struct device_node *root; + const char *model = ""; + + root = of_find_node_by_path("/"); + if (root) + model = get_property(root, "model", NULL); + seq_printf(m, "machine\t\t: BPA %s\n", model); + of_node_put(root); +} + +static void bpa_progress(char *s, unsigned short hex) +{ + printk("*** %04x : %s\n", hex, s ? s : ""); +} + +static void __init bpa_setup_arch(void) +{ +#ifdef CONFIG_SMP + smp_init_pSeries(); +#endif + + /* init to some ~sane value until calibrate_delay() runs */ + loops_per_jiffy = 50000000; + + if (ROOT_DEV == 0) { + printk("No ramdisk, default root is /dev/hda2\n"); + ROOT_DEV = Root_HDA2; + } + + /* Find and initialize PCI host bridges */ + init_pci_config_tokens(); + find_and_init_phbs(); + +#ifdef CONFIG_DUMMY_CONSOLE + conswitchp = &dummy_con; +#endif + + // bpa_nvram_init(); +} + +/* + * Early initialization. 
Relocation is on but do not reference unbolted pages + */ +static void __init bpa_init_early(void) +{ + DBG(" -> bpa_init_early()\n"); + + hpte_init_native(); + + pci_direct_iommu_init(); + + ppc64_interrupt_controller = IC_BPA_IIC; + + DBG(" <- bpa_init_early()\n"); +} + + +static int __init bpa_probe(int platform) +{ + if (platform != PLATFORM_BPA) + return 0; + + return 1; +} + +struct machdep_calls __initdata bpa_md = { + .probe = bpa_probe, + .setup_arch = bpa_setup_arch, + .init_early = bpa_init_early, + .get_cpuinfo = bpa_get_cpuinfo, + .restart = rtas_restart, + .power_off = rtas_power_off, + .halt = rtas_halt, + .get_boot_time = rtas_get_boot_time, + .get_rtc_time = rtas_get_rtc_time, + .set_rtc_time = rtas_set_rtc_time, + .calibrate_decr = generic_calibrate_decr, + .progress = bpa_progress, +}; --- linux-cg.orig/arch/ppc64/kernel/cpu_setup_power4.S 2005-06-21 02:58:35.581001216 -0400 +++ linux-cg/arch/ppc64/kernel/cpu_setup_power4.S 2005-06-21 03:01:35.224979280 -0400 @@ -73,7 +73,21 @@ _GLOBAL(__970_cpu_preinit) _GLOBAL(__setup_cpu_power4) blr - + +_GLOBAL(__setup_cpu_be) + /* Set large page sizes LP=0: 16MB, LP=1: 64KB */ + addi r3, 0, 0 + ori r3, r3, HID6_LB + sldi r3, r3, 32 + nor r3, r3, r3 + mfspr r4, SPRN_HID6 + and r4, r4, r3 + addi r3, 0, 0x02000 + sldi r3, r3, 32 + or r4, r4, r3 + mtspr SPRN_HID6, r4 + blr + _GLOBAL(__setup_cpu_ppc970) mfspr r0,SPRN_HID0 li r11,5 /* clear DOZE and SLEEP */ --- linux-cg.orig/arch/ppc64/kernel/cputable.c 2005-06-21 02:58:35.584000760 -0400 +++ linux-cg/arch/ppc64/kernel/cputable.c 2005-06-21 03:01:35.224979280 -0400 @@ -34,6 +34,7 @@ EXPORT_SYMBOL(cur_cpu_spec); extern void __setup_cpu_power3(unsigned long offset, struct cpu_spec* spec); extern void __setup_cpu_power4(unsigned long offset, struct cpu_spec* spec); extern void __setup_cpu_ppc970(unsigned long offset, struct cpu_spec* spec); +extern void __setup_cpu_be(unsigned long offset, struct cpu_spec* spec); /* We only set the altivec features if the kernel was 
compiled with altivec @@ -162,6 +163,16 @@ struct cpu_spec cpu_specs[] = { __setup_cpu_power4, COMMON_PPC64_FW }, + { /* BE DD1.x */ + 0xffff0000, 0x00700000, "Broadband Engine", + CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | + CPU_FTR_SMT, + COMMON_USER_PPC64 | PPC_FEATURE_HAS_ALTIVEC_COMP, + 128, 128, + __setup_cpu_be, + COMMON_PPC64_FW + }, { /* default match */ 0x00000000, 0x00000000, "POWER4 (compatible)", CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | --- linux-cg.orig/arch/ppc64/kernel/irq.c 2005-06-21 02:58:35.586000456 -0400 +++ linux-cg/arch/ppc64/kernel/irq.c 2005-06-21 03:01:35.225979128 -0400 @@ -395,6 +395,9 @@ int virt_irq_create_mapping(unsigned int if (ppc64_interrupt_controller == IC_OPEN_PIC) return real_irq; /* no mapping for openpic (for now) */ + if (ppc64_interrupt_controller == IC_BPA_IIC) + return real_irq; /* no mapping for iic either */ + /* don't map interrupts < MIN_VIRT_IRQ */ if (real_irq < MIN_VIRT_IRQ) { virt_irq_to_real_map[real_irq] = real_irq; --- linux-cg.orig/arch/ppc64/kernel/proc_ppc64.c 2005-06-21 02:58:35.588000152 -0400 +++ linux-cg/arch/ppc64/kernel/proc_ppc64.c 2005-06-21 03:01:35.226978976 -0400 @@ -53,7 +53,7 @@ static int __init proc_ppc64_create(void if (!root) return 1; - if (!(systemcfg->platform & PLATFORM_PSERIES)) + if (!(systemcfg->platform & (PLATFORM_PSERIES | PLATFORM_BPA))) return 0; if (!proc_mkdir("rtas", root)) --- linux-cg.orig/arch/ppc64/kernel/prom_init.c 2005-06-21 02:58:35.589999848 -0400 +++ linux-cg/arch/ppc64/kernel/prom_init.c 2005-06-21 03:01:35.228978672 -0400 @@ -1915,9 +1915,9 @@ unsigned long __init prom_init(unsigned prom_send_capabilities(); /* - * On pSeries, copy the CPU hold code + * On pSeries and BPA, copy the CPU hold code */ - if (RELOC(of_platform) & PLATFORM_PSERIES) + if (RELOC(of_platform) & (PLATFORM_PSERIES | PLATFORM_BPA)) copy_and_flush(0, KERNELBASE - offset, 0x100, 0); /* --- 
linux-cg.orig/arch/ppc64/kernel/setup.c 2005-06-21 02:58:35.592999392 -0400 +++ linux-cg/arch/ppc64/kernel/setup.c 2005-06-21 03:01:35.229978520 -0400 @@ -343,6 +343,7 @@ static void __init setup_cpu_maps(void) extern struct machdep_calls pSeries_md; extern struct machdep_calls pmac_md; extern struct machdep_calls maple_md; +extern struct machdep_calls bpa_md; /* Ultimately, stuff them in an elf section like initcalls... */ static struct machdep_calls __initdata *machines[] = { @@ -355,6 +356,9 @@ static struct machdep_calls __initdata * #ifdef CONFIG_PPC_MAPLE &maple_md, #endif /* CONFIG_PPC_MAPLE */ +#ifdef CONFIG_PPC_BPA + &bpa_md, +#endif NULL }; --- linux-cg.orig/arch/ppc64/kernel/traps.c 2005-06-21 02:58:35.594999088 -0400 +++ linux-cg/arch/ppc64/kernel/traps.c 2005-06-21 03:01:35.230978368 -0400 @@ -126,6 +126,10 @@ int die(const char *str, struct pt_regs printk("POWERMAC "); nl = 1; break; + case PLATFORM_BPA: + printk("BPA "); + nl = 1; + break; } if (nl) printk("\n"); --- linux-cg.orig/include/asm-ppc64/mmu.h 2005-06-21 02:58:35.596998784 -0400 +++ linux-cg/include/asm-ppc64/mmu.h 2005-06-21 03:01:35.231978216 -0400 @@ -47,9 +47,10 @@ #define SLB_VSID_KS ASM_CONST(0x0000000000000800) #define SLB_VSID_KP ASM_CONST(0x0000000000000400) #define SLB_VSID_N ASM_CONST(0x0000000000000200) /* no-execute */ -#define SLB_VSID_L ASM_CONST(0x0000000000000100) /* largepage 16M */ +#define SLB_VSID_L ASM_CONST(0x0000000000000100) /* largepage */ #define SLB_VSID_C ASM_CONST(0x0000000000000080) /* class */ - +#define SLB_VSID_LS ASM_CONST(0x0000000000000070) /* size of largepage */ + #define SLB_VSID_KERNEL (SLB_VSID_KP|SLB_VSID_C) #define SLB_VSID_USER (SLB_VSID_KP|SLB_VSID_KS) --- linux-cg.orig/include/asm-ppc64/processor.h 2005-06-21 02:58:35.598998480 -0400 +++ linux-cg/include/asm-ppc64/processor.h 2005-06-21 03:01:35.232978064 -0400 @@ -138,8 +138,16 @@ #define SPRN_NIADORM 0x3F3 /* Hardware Implementation Register 2 */ #define SPRN_HID4 0x3F4 /* 970 HID4 */ 
#define SPRN_HID5 0x3F6 /* 970 HID5 */ -#define SPRN_TSC 0x3FD /* Thread switch control */ -#define SPRN_TST 0x3FC /* Thread switch timeout */ +#define SPRN_HID6 0x3F9 /* BE HID 6 */ +#define HID6_LB (0x0F<<12) /* Concurrent Large Page Modes */ +#define HID6_DLP (1<<20) /* Disable all large page modes (4K only) */ +#define SPRN_TSCR 0x399 /* Thread switch control on BE */ +#define SPRN_TTR 0x39A /* Thread switch timeout on BE */ +#define TSCR_DEC_ENABLE 0x200000 /* Decrementer Interrupt */ +#define TSCR_EE_ENABLE 0x100000 /* External Interrupt */ +#define TSCR_EE_BOOST 0x080000 /* External Interrupt Boost */ +#define SPRN_TSC 0x3FD /* Thread switch control on others */ +#define SPRN_TST 0x3FC /* Thread switch timeout on others */ #define SPRN_L2CR 0x3F9 /* Level 2 Cache Control Regsiter */ #define SPRN_LR 0x008 /* Link Register */ #define SPRN_PIR 0x3FF /* Processor Identification Register */ @@ -259,6 +267,7 @@ #define PV_970FX 0x003C #define PV_630 0x0040 #define PV_630p 0x0041 +#define PV_BE 0x0070 /* Platforms supported by PPC64 */ #define PLATFORM_PSERIES 0x0100 @@ -267,6 +276,7 @@ #define PLATFORM_LPAR 0x0001 #define PLATFORM_POWERMAC 0x0400 #define PLATFORM_MAPLE 0x0500 +#define PLATFORM_BPA 0x1000 /* Compatibility with drivers coming from PPC32 world */ #define _machine (systemcfg->platform) @@ -278,6 +288,7 @@ #define IC_INVALID 0 #define IC_OPEN_PIC 1 #define IC_PPC_XIC 2 +#define IC_BPA_IIC 3 #define XGLUE(a,b) a##b #define GLUE(a,b) XGLUE(a,b) --- linux-cg.orig/include/asm-ppc64/smp.h 2005-06-21 02:58:35.601998024 -0400 +++ linux-cg/include/asm-ppc64/smp.h 2005-06-21 03:01:35.232978064 -0400 @@ -85,6 +85,14 @@ extern void smp_generic_take_timebase(vo extern struct smp_ops_t *smp_ops; +#ifdef CONFIG_PPC_PSERIES +void vpa_init(int cpu); +#else +static inline void vpa_init(int cpu) +{ +} +#endif /* CONFIG_PPC_PSERIES */ + #endif /* __ASSEMBLY__ */ #endif /* !(_PPC64_SMP_H) */ From arnd at arndb.de Wed Jun 22 07:26:17 2005 From: arnd at arndb.de (Arnd 
Bergmann) Date: Tue, 21 Jun 2005 23:26:17 +0200 Subject: [PATCH 8/11] ppc64: Add driver for BPA interrupt controllers In-Reply-To: <200506212324.19713.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212322.36453.arnd@arndb.de> <200506212324.19713.arnd@arndb.de> Message-ID: <200506212326.18205.arnd@arndb.de> Add support for the integrated interrupt controller on BPA CPUs. There is one of those for each SMT thread. The mapping of interrupt numbers to HW interrupt sources is described in arch/ppc64/kernel/bpa_iic.h. This version hardcodes the 'Spider' chip as the secondary interrupt controller. That is not really generic for the architecture, but at the moment it is the only secondary PIC that exists. A little more work will be needed on this as soon as we have boards with multiple external interrupt controllers. Signed-off-by: Arnd Bergmann -- Kconfig | 15 ++ kernel/Makefile | 8 - kernel/bpa_iic.c | 270 +++++++++++++++++++++++++++++++++++++++++++++++++++ kernel/bpa_iic.h | 62 +++++++++++ kernel/bpa_setup.c | 6 - kernel/pSeries_smp.c | 69 ++++++++++++- kernel/smp.c | 2 kernel/spider-pic.c | 191 ++++++++++++++++++++++++++++++++++++ 8 files changed, 613 insertions(+), 10 deletions(-) Index: linus-2.5/arch/ppc64/Kconfig =================================================================== --- linus-2.5.orig/arch/ppc64/Kconfig 2005-04-22 06:59:52.000000000 +0200 +++ linus-2.5/arch/ppc64/Kconfig 2005-04-22 06:59:58.000000000 +0200 @@ -106,6 +106,21 @@ bool default y +config XICS + depends on PPC_PSERIES + bool + default y + +config MPIC + depends on PPC_PSERIES || PPC_PMAC || PPC_MAPLE + bool + default y + +config BPA_IIC + depends on PPC_BPA + bool + default y + # VMX is pSeries only for now until somebody writes the iSeries # exception vectors for it config ALTIVEC Index: linus-2.5/arch/ppc64/kernel/Makefile =================================================================== --- linus-2.5.orig/arch/ppc64/kernel/Makefile 2005-04-22 06:59:52.000000000 
+0200 +++ linus-2.5/arch/ppc64/kernel/Makefile 2005-04-22 07:01:07.000000000 +0200 @@ -28,13 +28,13 @@ mf.o HvLpEvent.o iSeries_proc.o iSeries_htab.o \ iSeries_iommu.o -obj-$(CONFIG_PPC_MULTIPLATFORM) += nvram.o i8259.o prom_init.o prom.o mpic.o +obj-$(CONFIG_PPC_MULTIPLATFORM) += nvram.o i8259.o prom_init.o prom.o obj-$(CONFIG_PPC_PSERIES) += pSeries_pci.o pSeries_lpar.o pSeries_hvCall.o \ pSeries_nvram.o rtasd.o ras.o pSeries_reconfig.o \ - xics.o pSeries_setup.o pSeries_iommu.o + pSeries_setup.o pSeries_iommu.o -obj-$(CONFIG_PPC_BPA) += bpa_setup.o bpa_nvram.o +obj-$(CONFIG_PPC_BPA) += bpa_setup.o bpa_nvram.o bpa_iic.o spider-pic.o obj-$(CONFIG_EEH) += eeh.o obj-$(CONFIG_PROC_FS) += proc_ppc64.o @@ -50,6 +50,8 @@ obj-$(CONFIG_BOOTX_TEXT) += btext.o obj-$(CONFIG_HVCS) += hvcserver.o obj-$(CONFIG_IBMVIO) += vio.o +obj-$(CONFIG_XICS) += xics.o +obj-$(CONFIG_MPIC) += mpic.o obj-$(CONFIG_PPC_PMAC) += pmac_setup.o pmac_feature.o pmac_pci.o \ pmac_time.o pmac_nvram.o pmac_low_i2c.o Index: linus-2.5/arch/ppc64/kernel/bpa_iic.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linus-2.5/arch/ppc64/kernel/bpa_iic.c 2005-04-22 06:59:58.000000000 +0200 @@ -0,0 +1,270 @@ +/* + * BPA Internal Interrupt Controller + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "bpa_iic.h" + +struct iic_pending_bits { + u32 data; + u8 flags; + u8 class; + u8 source; + u8 prio; +}; + +enum iic_pending_flags { + IIC_VALID = 0x80, + IIC_IPI = 0x40, +}; + +struct iic_regs { + struct iic_pending_bits pending; + struct iic_pending_bits pending_destr; + u64 generate; + u64 prio; +}; + +struct iic { + struct iic_regs __iomem *regs; +}; + +static DEFINE_PER_CPU(struct iic, iic); + +void iic_local_enable(void) +{ + out_be64(&__get_cpu_var(iic).regs->prio, 0xff); +} + +void iic_local_disable(void) +{ + out_be64(&__get_cpu_var(iic).regs->prio, 0x0); +} + +static unsigned int iic_startup(unsigned int irq) +{ + return 0; +} + +static void iic_enable(unsigned int irq) +{ + iic_local_enable(); +} + +static void iic_disable(unsigned int irq) +{ +} + +static void iic_end(unsigned int irq) +{ + iic_local_enable(); +} + +static struct hw_interrupt_type iic_pic = { + .typename = " BPA-IIC ", + .startup = iic_startup, + .enable = iic_enable, + .disable = iic_disable, + .end = iic_end, +}; + +static int iic_external_get_irq(struct iic_pending_bits pending) +{ + int irq; + unsigned char node, unit; + + node = pending.source >> 4; + unit = pending.source & 0xf; + irq = -1; + + /* + * This mapping is specific to the Broadband + * Engine. We might need to get the numbers + * from the device tree to support future CPUs. + */ + switch (unit) { + case 0x00: + case 0x0b: + /* + * One of these units can be connected + * to an external interrupt controller. + */ + if (pending.prio > 0x3f || + pending.class != 2) + break; + irq = IIC_EXT_OFFSET + + spider_get_irq(pending.prio + node * IIC_NODE_STRIDE) + + node * IIC_NODE_STRIDE; + break; + case 0x01 ... 
0x04: + case 0x07 ... 0x0a: + /* + * These units are connected to the SPEs + */ + if (pending.class > 2) + break; + irq = IIC_SPE_OFFSET + + pending.class * IIC_CLASS_STRIDE + + node * IIC_NODE_STRIDE + + unit; + break; + } + if (irq == -1) + printk(KERN_WARNING "Unexpected interrupt class %02x, " + "source %02x, prio %02x, cpu %02x\n", pending.class, + pending.source, pending.prio, smp_processor_id()); + return irq; +} + +/* Get an IRQ number from the pending state register of the IIC */ +int iic_get_irq(struct pt_regs *regs) +{ + struct iic *iic; + int irq; + struct iic_pending_bits pending; + + iic = &__get_cpu_var(iic); + *(unsigned long *) &pending = + in_be64((unsigned long __iomem *) &iic->regs->pending_destr); + + irq = -1; + if (pending.flags & IIC_VALID) { + if (pending.flags & IIC_IPI) { + irq = IIC_IPI_OFFSET + (pending.prio >> 4); +/* + if (irq > 0x80) + printk(KERN_WARNING "Unexpected IPI prio %02x" + "on CPU %02x\n", pending.prio, + smp_processor_id()); +*/ + } else { + irq = iic_external_get_irq(pending); + } + } + return irq; +} + +static struct iic_regs __iomem *find_iic(int cpu) +{ + struct device_node *np; + int nodeid = cpu / 2; + unsigned long regs; + struct iic_regs __iomem *iic_regs; + + for (np = of_find_node_by_type(NULL, "cpu"); + np; + np = of_find_node_by_type(np, "cpu")) { + if (nodeid == *(int *)get_property(np, "node-id", NULL)) + break; + } + + if (!np) { + printk(KERN_WARNING "IIC: CPU %d not found\n", cpu); + iic_regs = NULL; + } else { + regs = *(long *)get_property(np, "iic", NULL); + + /* hack until we have decided on the devtree info */ + regs += 0x400; + if (cpu & 1) + regs += 0x20; + + printk(KERN_DEBUG "IIC for CPU %d at %lx\n", cpu, regs); + iic_regs = __ioremap(regs, sizeof(struct iic_regs), + _PAGE_NO_CACHE); + } + return iic_regs; +} + +#ifdef CONFIG_SMP +void iic_setup_cpu(void) +{ + out_be64(&__get_cpu_var(iic).regs->prio, 0xff); +} + +void iic_cause_IPI(int cpu, int mesg) +{ + out_be64(&per_cpu(iic, 
cpu).regs->generate, mesg); +} + +static irqreturn_t iic_ipi_action(int irq, void *dev_id, struct pt_regs *regs) +{ + + smp_message_recv(irq - IIC_IPI_OFFSET, regs); + return IRQ_HANDLED; +} + +static void iic_request_ipi(int irq, const char *name) +{ + /* IPIs are marked SA_INTERRUPT as they must run with irqs + * disabled */ + get_irq_desc(irq)->handler = &iic_pic; + get_irq_desc(irq)->status |= IRQ_PER_CPU; + request_irq(irq, iic_ipi_action, SA_INTERRUPT, name, NULL); +} + +void iic_request_IPIs(void) +{ + iic_request_ipi(IIC_IPI_OFFSET + PPC_MSG_CALL_FUNCTION, "IPI-call"); + iic_request_ipi(IIC_IPI_OFFSET + PPC_MSG_RESCHEDULE, "IPI-resched"); +#ifdef CONFIG_DEBUGGER + iic_request_ipi(IIC_IPI_OFFSET + PPC_MSG_DEBUGGER_BREAK, "IPI-debug"); +#endif /* CONFIG_DEBUGGER */ +} +#endif /* CONFIG_SMP */ + +static void iic_setup_spe_handlers(void) +{ + int be, isrc; + + /* Assume two threads per BE are present */ + for (be=0; be < num_present_cpus() / 2; be++) { + for (isrc = 0; isrc < IIC_CLASS_STRIDE * 3; isrc++) { + int irq = IIC_NODE_STRIDE * be + IIC_SPE_OFFSET + isrc; + get_irq_desc(irq)->handler = &iic_pic; + } + } +} + +void iic_init_IRQ(void) +{ + int cpu, irq_offset; + struct iic *iic; + + irq_offset = 0; + for_each_cpu(cpu) { + iic = &per_cpu(iic, cpu); + iic->regs = find_iic(cpu); + if (iic->regs) + out_be64(&iic->regs->prio, 0xff); + } + iic_setup_spe_handlers(); +} Index: linus-2.5/arch/ppc64/kernel/bpa_iic.h =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linus-2.5/arch/ppc64/kernel/bpa_iic.h 2005-04-22 06:59:58.000000000 +0200 @@ -0,0 +1,62 @@ +#ifndef ASM_BPA_IIC_H +#define ASM_BPA_IIC_H +#ifdef __KERNEL__ +/* + * Mapping of IIC pending bits into per-node + * interrupt numbers. 
+ * + * IRQ FF CC SS PP FF CC SS PP Description + * + * 00-3f 80 02 +0 00 - 80 02 +0 3f South Bridge + * 00-3f 80 02 +b 00 - 80 02 +b 3f South Bridge + * 41-4a 80 00 +1 ** - 80 00 +a ** SPU Class 0 + * 51-5a 80 01 +1 ** - 80 01 +a ** SPU Class 1 + * 61-6a 80 02 +1 ** - 80 02 +a ** SPU Class 2 + * 70-7f C0 ** ** 00 - C0 ** ** 0f IPI + * + * F flags + * C class + * S source + * P Priority + * + node number + * * don't care + * + * A node consists of a Broadband Engine and an optional + * south bridge device providing a maximum of 64 IRQs. + * The south bridge may be connected to either IOIF0 + * or IOIF1. + * Each SPE is represented as three IRQ lines, one per + * interrupt class. + * 16 IRQ numbers are reserved for inter processor + * interruptions, although these are only used in the + * range of the first node. + * + * This scheme needs 128 IRQ numbers per BIF node ID, + * which means that with the total of 512 lines + * available, we can have a maximum of four nodes. + */ + +enum { + IIC_EXT_OFFSET = 0x00, /* Start of south bridge IRQs */ + IIC_NUM_EXT = 0x40, /* Number of south bridge IRQs */ + IIC_SPE_OFFSET = 0x40, /* Start of SPE interrupts */ + IIC_CLASS_STRIDE = 0x10, /* SPE IRQs per class */ + IIC_IPI_OFFSET = 0x70, /* Start of IPI IRQs */ + IIC_NUM_IPIS = 0x10, /* IRQs reserved for IPI */ + IIC_NODE_STRIDE = 0x80, /* Total IRQs per node */ +}; + +extern void iic_init_IRQ(void); +extern int iic_get_irq(struct pt_regs *regs); +extern void iic_cause_IPI(int cpu, int mesg); +extern void iic_request_IPIs(void); +extern void iic_setup_cpu(void); +extern void iic_local_enable(void); +extern void iic_local_disable(void); + + +extern void spider_init_IRQ(void); +extern int spider_get_irq(unsigned long int_pending); + +#endif +#endif /* ASM_BPA_IIC_H */ Index: linus-2.5/arch/ppc64/kernel/bpa_setup.c =================================================================== --- linus-2.5.orig/arch/ppc64/kernel/bpa_setup.c 2005-04-22 06:59:52.000000000 +0200 +++ 
linus-2.5/arch/ppc64/kernel/bpa_setup.c 2005-04-22 06:59:58.000000000 +0200 @@ -45,6 +45,7 @@ #include #include "pci.h" +#include "bpa_iic.h" #ifdef DEBUG #define DBG(fmt...) udbg_printf(fmt) @@ -143,6 +144,9 @@ static void __init bpa_setup_arch(void) { + ppc_md.init_IRQ = iic_init_IRQ; + ppc_md.get_irq = iic_get_irq; + #ifdef CONFIG_SMP smp_init_pSeries(); #endif @@ -158,7 +162,7 @@ /* Find and initialize PCI host bridges */ init_pci_config_tokens(); find_and_init_phbs(); - + spider_init_IRQ(); #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif Index: linus-2.5/arch/ppc64/kernel/pSeries_smp.c =================================================================== --- linus-2.5.orig/arch/ppc64/kernel/pSeries_smp.c 2005-04-22 06:58:22.000000000 +0200 +++ linus-2.5/arch/ppc64/kernel/pSeries_smp.c 2005-04-22 06:59:58.000000000 +0200 @@ -1,5 +1,5 @@ /* - * SMP support for pSeries machines. + * SMP support for pSeries and BPA machines. * * Dave Engebretsen, Peter Bergner, and * Mike Corrigan {engebret|bergner|mikec}@us.ibm.com @@ -47,6 +47,7 @@ #include #include "mpic.h" +#include "bpa_iic.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -286,6 +287,7 @@ return 1; } +#ifdef CONFIG_XICS static inline void smp_xics_do_message(int cpu, int msg) { set_bit(msg, &xics_ipi_message[cpu].value); @@ -334,6 +336,37 @@ rtas_set_indicator(GLOBAL_INTERRUPT_QUEUE, (1UL << interrupt_server_size) - 1 - default_distrib_server, 1); } +#endif /* CONFIG_XICS */ +#ifdef CONFIG_BPA_IIC +static void smp_iic_message_pass(int target, int msg) +{ + unsigned int i; + + if (target < NR_CPUS) { + iic_cause_IPI(target, msg); + } else { + for_each_online_cpu(i) { + if (target == MSG_ALL_BUT_SELF + && i == smp_processor_id()) + continue; + iic_cause_IPI(i, msg); + } + } +} + +static int __init smp_iic_probe(void) +{ + iic_request_IPIs(); + + return cpus_weight(cpu_possible_map); +} + +static void __devinit smp_iic_setup_cpu(int cpu) +{ + if (cpu != boot_cpuid) + iic_setup_cpu(); +} +#endif /* CONFIG_BPA_IIC */ static DEFINE_SPINLOCK(timebase_lock); static unsigned long timebase = 0; @@ -388,14 +421,15 @@ return 1; } - +#ifdef CONFIG_MPIC static struct smp_ops_t pSeries_mpic_smp_ops = { .message_pass = smp_mpic_message_pass, .probe = smp_mpic_probe, .kick_cpu = smp_pSeries_kick_cpu, .setup_cpu = smp_mpic_setup_cpu, }; - +#endif +#ifdef CONFIG_XICS static struct smp_ops_t pSeries_xics_smp_ops = { .message_pass = smp_xics_message_pass, .probe = smp_xics_probe, @@ -403,6 +437,16 @@ .setup_cpu = smp_xics_setup_cpu, .cpu_bootable = smp_pSeries_cpu_bootable, }; +#endif +#ifdef CONFIG_BPA_IIC +static struct smp_ops_t bpa_iic_smp_ops = { + .message_pass = smp_iic_message_pass, + .probe = smp_iic_probe, + .kick_cpu = smp_pSeries_kick_cpu, + .setup_cpu = smp_iic_setup_cpu, + .cpu_bootable = smp_pSeries_cpu_bootable, +}; +#endif /* This is called very early */ void __init smp_init_pSeries(void) @@ -411,10 +455,25 @@ DBG(" -> smp_init_pSeries()\n"); - if (ppc64_interrupt_controller == IC_OPEN_PIC) + switch (ppc64_interrupt_controller) { +#ifdef CONFIG_MPIC + case IC_OPEN_PIC: smp_ops = &pSeries_mpic_smp_ops; - else + break; 
+#endif +#ifdef CONFIG_XICS + case IC_PPC_XIC: smp_ops = &pSeries_xics_smp_ops; + break; +#endif +#ifdef CONFIG_BPA_IIC + case IC_BPA_IIC: + smp_ops = &bpa_iic_smp_ops; + break; +#endif + default: + panic("Invalid interrupt controller"); + } #ifdef CONFIG_HOTPLUG_CPU smp_ops->cpu_disable = pSeries_cpu_disable; Index: linus-2.5/arch/ppc64/kernel/smp.c =================================================================== --- linus-2.5.orig/arch/ppc64/kernel/smp.c 2005-04-22 06:58:22.000000000 +0200 +++ linus-2.5/arch/ppc64/kernel/smp.c 2005-04-22 06:59:58.000000000 +0200 @@ -71,7 +71,7 @@ int smt_enabled_at_boot = 1; -#ifdef CONFIG_PPC_MULTIPLATFORM +#if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_PMAC) || defined(CONFIG_PPC_MAPLE) void smp_mpic_message_pass(int target, int msg) { /* make sure we're sending something that translates to an IPI */ Index: linus-2.5/arch/ppc64/kernel/spider-pic.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linus-2.5/arch/ppc64/kernel/spider-pic.c 2005-04-22 06:59:58.000000000 +0200 @@ -0,0 +1,191 @@ +/* + * External Interrupt Controller on Spider South Bridge + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#include +#include + +#include +#include +#include + +#include "bpa_iic.h" + +/* register layout taken from Spider spec, table 7.4-4 */ +enum { + TIR_DEN = 0x004, /* Detection Enable Register */ + TIR_MSK = 0x084, /* Mask Level Register */ + TIR_EDC = 0x0c0, /* Edge Detection Clear Register */ + TIR_PNDA = 0x100, /* Pending Register A */ + TIR_PNDB = 0x104, /* Pending Register B */ + TIR_CS = 0x144, /* Current Status Register */ + TIR_LCSA = 0x150, /* Level Current Status Register A */ + TIR_LCSB = 0x154, /* Level Current Status Register B */ + TIR_LCSC = 0x158, /* Level Current Status Register C */ + TIR_LCSD = 0x15c, /* Level Current Status Register D */ + TIR_CFGA = 0x200, /* Setting Register A0 */ + TIR_CFGB = 0x204, /* Setting Register B0 */ + /* 0x208 ... 0x3ff Setting Register An/Bn */ + TIR_PPNDA = 0x400, /* Packet Pending Register A */ + TIR_PPNDB = 0x404, /* Packet Pending Register B */ + TIR_PIERA = 0x408, /* Packet Output Error Register A */ + TIR_PIERB = 0x40c, /* Packet Output Error Register B */ + TIR_PIEN = 0x444, /* Packet Output Enable Register */ + TIR_PIPND = 0x454, /* Packet Output Pending Register */ + TIRDID = 0x484, /* Spider Device ID Register */ + REISTIM = 0x500, /* Reissue Command Timeout Time Setting */ + REISTIMEN = 0x504, /* Reissue Command Timeout Setting */ + REISWAITEN = 0x508, /* Reissue Wait Control*/ +}; + +static void __iomem *spider_pics[4]; + +static void __iomem *spider_get_pic(int irq) +{ + int node = irq / IIC_NODE_STRIDE; + irq %= IIC_NODE_STRIDE; + + if (irq >= IIC_EXT_OFFSET && + irq < IIC_EXT_OFFSET + IIC_NUM_EXT && + spider_pics) + return spider_pics[node]; + return NULL; +} + +static int spider_get_nr(unsigned int irq) +{ + return (irq % IIC_NODE_STRIDE) - IIC_EXT_OFFSET; +} + +static void __iomem *spider_get_irq_config(int irq) +{ + void __iomem *pic; + pic = spider_get_pic(irq); + return pic + TIR_CFGA + 8 * spider_get_nr(irq); +} + +static void spider_enable_irq(unsigned int irq) +{ + void __iomem *cfg = 
spider_get_irq_config(irq); + irq = spider_get_nr(irq); + + out_be32(cfg, in_be32(cfg) | 0x3107000eu); + out_be32(cfg + 4, in_be32(cfg + 4) | 0x00020000u | irq); +} + +static void spider_disable_irq(unsigned int irq) +{ + void __iomem *cfg = spider_get_irq_config(irq); + irq = spider_get_nr(irq); + + out_be32(cfg, in_be32(cfg) & ~0x30000000u); +} + +static unsigned int spider_startup_irq(unsigned int irq) +{ + spider_enable_irq(irq); + return 0; +} + +static void spider_shutdown_irq(unsigned int irq) +{ + spider_disable_irq(irq); +} + +static void spider_end_irq(unsigned int irq) +{ + spider_enable_irq(irq); +} + +static void spider_ack_irq(unsigned int irq) +{ + spider_disable_irq(irq); + iic_local_enable(); +} + +static struct hw_interrupt_type spider_pic = { + .typename = " SPIDER ", + .startup = spider_startup_irq, + .shutdown = spider_shutdown_irq, + .enable = spider_enable_irq, + .disable = spider_disable_irq, + .ack = spider_ack_irq, + .end = spider_end_irq, +}; + + +int spider_get_irq(unsigned long int_pending) +{ + void __iomem *regs = spider_get_pic(int_pending); + unsigned long cs; + int irq; + + cs = in_be32(regs + TIR_CS); + + irq = cs >> 24; + if (irq != 63) + return irq; + + return -1; +} + +void spider_init_IRQ(void) +{ + int node; + struct device_node *dn; + unsigned int *property; + long spiderpic; + int n; + +/* FIXME: detect multiple PICs as soon as the device tree has them */ + for (node = 0; node < 1; node++) { + dn = of_find_node_by_path("/"); + n = prom_n_addr_cells(dn); + property = (unsigned int *) get_property(dn, + "platform-spider-pic", NULL); + + if (!property) + continue; + for (spiderpic = 0; n > 0; --n) + spiderpic = (spiderpic << 32) + *property++; + printk(KERN_DEBUG "SPIDER addr: %lx\n", spiderpic); + spider_pics[node] = __ioremap(spiderpic, 0x800, _PAGE_NO_CACHE); + for (n = 0; n < IIC_NUM_EXT; n++) { + int irq = n + IIC_EXT_OFFSET + node * IIC_NODE_STRIDE; + get_irq_desc(irq)->handler = &spider_pic; + + /* do not mask any 
interrupts because of level */
+			out_be32(spider_pics[node] + TIR_MSK, 0x0);
+
+			/* disable edge detection clear */
+			/* out_be32(spider_pics[node] + TIR_EDC, 0x0); */
+
+			/* enable interrupt packets to be output */
+			out_be32(spider_pics[node] + TIR_PIEN,
+				in_be32(spider_pics[node] + TIR_PIEN) | 0x1);
+
+			/* Enable the interrupt detection enable bit. Do this last! */
+			out_be32(spider_pics[node] + TIR_DEN,
+				in_be32(spider_pics[node] +TIR_DEN) | 0x1);
+
+		}
+	}
+}

From arnd at arndb.de Wed Jun 22 07:11:35 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Tue, 21 Jun 2005 23:11:35 +0200
Subject: [PATCH 1/11] ppc64: consolidate calibrate_decr implementations
In-Reply-To: <200506212310.54156.arnd@arndb.de>
References: <200506212310.54156.arnd@arndb.de>
Message-ID: <200506212311.36010.arnd@arndb.de>

pSeries and maple have almost the same code for calibrate_decr,
and BPA would need yet another copy. Instead, I'm moving the code
to arch/ppc64/kernel/time.c. Some of the related declarations were
missing from header files, so I'm moving those as well.

It would also make sense to merge this with the pmac function of
the same name, so that we end up with just one implementation for
iSeries and one for Open Firmware based machines.
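
For readers skimming the diff, the logic shared by the pSeries and maple copies (and carried over into generic_calibrate_decr) comes down to a property lookup with a hardcoded fallback, followed by a few integer divisions. The following is only a user-space sketch of that arithmetic, not the kernel code: struct tb_params, freq_or_default() and tb_derive() are hypothetical names, and HZ is passed in as a parameter because its value depends on the kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for values the kernel keeps in globals. */
struct tb_params {
	unsigned long ticks_per_jiffy;
	unsigned long ticks_per_sec;
	unsigned long ticks_per_usec;
};

/*
 * Return the frequency from a device tree property if the property
 * was found, otherwise the hardcoded default. This mirrors the
 * "Estimating ... frequency (not found)" fallback in the patch.
 */
static unsigned long freq_or_default(const unsigned int *prop,
				     unsigned long dflt)
{
	return prop ? *prop : dflt;
}

/* Derive the per-jiffy, per-second and per-usec tick counts. */
static struct tb_params tb_derive(unsigned long tb_freq, unsigned long hz)
{
	struct tb_params p;

	p.ticks_per_jiffy = tb_freq / hz;
	/* rounded down to a whole number of jiffies, as in the patch */
	p.ticks_per_sec = p.ticks_per_jiffy * hz;
	p.ticks_per_usec = tb_freq / 1000000;
	return p;
}
```

With the 125 MHz default timebase from the patch and an assumed HZ of 1000, tb_derive() yields 125000 ticks per jiffy and 125 ticks per microsecond.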
Signed-off-by: Arnd Bergmann --- arch/ppc64/kernel/iSeries_setup.c | 5 -- arch/ppc64/kernel/maple_setup.c | 2 - arch/ppc64/kernel/maple_time.c | 51 ---------------------------- arch/ppc64/kernel/pSeries_setup.c | 69 -------------------------------------- arch/ppc64/kernel/pmac_time.c | 8 ---- arch/ppc64/kernel/setup.c | 5 -- arch/ppc64/kernel/time.c | 63 ++++++++++++++++++++++++++++++++++ include/asm-ppc64/time.h | 9 ++++ 8 files changed, 76 insertions(+), 136 deletions(-) --- linux-cg.orig/arch/ppc64/kernel/iSeries_setup.c 2005-06-21 02:10:53.631907448 -0400 +++ linux-cg/arch/ppc64/kernel/iSeries_setup.c 2005-06-21 02:22:45.822002600 -0400 @@ -665,9 +665,6 @@ static void __init iSeries_bolt_kernel(u } } -extern unsigned long ppc_proc_freq; -extern unsigned long ppc_tb_freq; - /* * Document me. */ @@ -766,8 +763,6 @@ static void iSeries_halt(void) mf_power_off(); } -extern void setup_default_decr(void); - /* * void __init iSeries_calibrate_decr() * --- linux-cg.orig/arch/ppc64/kernel/maple_setup.c 2005-06-21 02:10:21.420014872 -0400 +++ linux-cg/arch/ppc64/kernel/maple_setup.c 2005-06-21 02:22:26.403897552 -0400 @@ -235,6 +235,6 @@ struct machdep_calls __initdata maple_md .get_boot_time = maple_get_boot_time, .set_rtc_time = maple_set_rtc_time, .get_rtc_time = maple_get_rtc_time, - .calibrate_decr = maple_calibrate_decr, + .calibrate_decr = generic_calibrate_decr, .progress = maple_progress, }; --- linux-cg.orig/arch/ppc64/kernel/maple_time.c 2005-06-21 02:10:53.633907144 -0400 +++ linux-cg/arch/ppc64/kernel/maple_time.c 2005-06-21 02:22:26.412896184 -0400 @@ -42,11 +42,8 @@ #define DBG(x...) 
#endif -extern void setup_default_decr(void); extern void GregorianDay(struct rtc_time * tm); -extern unsigned long ppc_tb_freq; -extern unsigned long ppc_proc_freq; static int maple_rtc_addr; static int maple_clock_read(int addr) @@ -176,51 +173,3 @@ void __init maple_get_boot_time(struct r maple_get_rtc_time(tm); } -/* XXX FIXME: Some sane defaults: 125 MHz timebase, 1GHz processor */ -#define DEFAULT_TB_FREQ 125000000UL -#define DEFAULT_PROC_FREQ (DEFAULT_TB_FREQ * 8) - -void __init maple_calibrate_decr(void) -{ - struct device_node *cpu; - struct div_result divres; - unsigned int *fp = NULL; - - /* - * The cpu node should have a timebase-frequency property - * to tell us the rate at which the decrementer counts. - */ - cpu = of_find_node_by_type(NULL, "cpu"); - - ppc_tb_freq = DEFAULT_TB_FREQ; - if (cpu != 0) - fp = (unsigned int *)get_property(cpu, "timebase-frequency", NULL); - if (fp != NULL) - ppc_tb_freq = *fp; - else - printk(KERN_ERR "WARNING: Estimating decrementer frequency (not found)\n"); - fp = NULL; - ppc_proc_freq = DEFAULT_PROC_FREQ; - if (cpu != 0) - fp = (unsigned int *)get_property(cpu, "clock-frequency", NULL); - if (fp != NULL) - ppc_proc_freq = *fp; - else - printk(KERN_ERR "WARNING: Estimating processor frequency (not found)\n"); - - of_node_put(cpu); - - printk(KERN_INFO "time_init: decrementer frequency = %lu.%.6lu MHz\n", - ppc_tb_freq/1000000, ppc_tb_freq%1000000); - printk(KERN_INFO "time_init: processor frequency = %lu.%.6lu MHz\n", - ppc_proc_freq/1000000, ppc_proc_freq%1000000); - - tb_ticks_per_jiffy = ppc_tb_freq / HZ; - tb_ticks_per_sec = tb_ticks_per_jiffy * HZ; - tb_ticks_per_usec = ppc_tb_freq / 1000000; - tb_to_us = mulhwu_scale_factor(ppc_tb_freq, 1000000); - div128_by_32(1024*1024, 0, tb_ticks_per_sec, &divres); - tb_to_xs = divres.result_low; - - setup_default_decr(); -} --- linux-cg.orig/arch/ppc64/kernel/pSeries_setup.c 2005-06-21 02:10:53.635906840 -0400 +++ linux-cg/arch/ppc64/kernel/pSeries_setup.c 2005-06-21 
02:22:26.415895728 -0400 @@ -84,9 +84,6 @@ extern void generic_find_legacy_serial_p int fwnmi_active; /* TRUE if an FWNMI handler is present */ -extern unsigned long ppc_proc_freq; -extern unsigned long ppc_tb_freq; - extern void pSeries_system_reset_exception(struct pt_regs *regs); extern int pSeries_machine_check_exception(struct pt_regs *regs); @@ -482,70 +479,6 @@ static void pSeries_progress(char *s, un spin_unlock(&progress_lock); } -extern void setup_default_decr(void); - -/* Some sane defaults: 125 MHz timebase, 1GHz processor */ -#define DEFAULT_TB_FREQ 125000000UL -#define DEFAULT_PROC_FREQ (DEFAULT_TB_FREQ * 8) - -static void __init pSeries_calibrate_decr(void) -{ - struct device_node *cpu; - struct div_result divres; - unsigned int *fp; - int node_found; - - /* - * The cpu node should have a timebase-frequency property - * to tell us the rate at which the decrementer counts. - */ - cpu = of_find_node_by_type(NULL, "cpu"); - - ppc_tb_freq = DEFAULT_TB_FREQ; /* hardcoded default */ - node_found = 0; - if (cpu != 0) { - fp = (unsigned int *)get_property(cpu, "timebase-frequency", - NULL); - if (fp != 0) { - node_found = 1; - ppc_tb_freq = *fp; - } - } - if (!node_found) - printk(KERN_ERR "WARNING: Estimating decrementer frequency " - "(not found)\n"); - - ppc_proc_freq = DEFAULT_PROC_FREQ; - node_found = 0; - if (cpu != 0) { - fp = (unsigned int *)get_property(cpu, "clock-frequency", - NULL); - if (fp != 0) { - node_found = 1; - ppc_proc_freq = *fp; - } - } - if (!node_found) - printk(KERN_ERR "WARNING: Estimating processor frequency " - "(not found)\n"); - - of_node_put(cpu); - - printk(KERN_INFO "time_init: decrementer frequency = %lu.%.6lu MHz\n", - ppc_tb_freq/1000000, ppc_tb_freq%1000000); - printk(KERN_INFO "time_init: processor frequency = %lu.%.6lu MHz\n", - ppc_proc_freq/1000000, ppc_proc_freq%1000000); - - tb_ticks_per_jiffy = ppc_tb_freq / HZ; - tb_ticks_per_sec = tb_ticks_per_jiffy * HZ; - tb_ticks_per_usec = ppc_tb_freq / 1000000; - tb_to_us = 
mulhwu_scale_factor(ppc_tb_freq, 1000000); - div128_by_32(1024*1024, 0, tb_ticks_per_sec, &divres); - tb_to_xs = divres.result_low; - - setup_default_decr(); -} - static int pSeries_check_legacy_ioport(unsigned int baseport) { struct device_node *np; @@ -604,7 +537,7 @@ struct machdep_calls __initdata pSeries_ .get_boot_time = pSeries_get_boot_time, .get_rtc_time = pSeries_get_rtc_time, .set_rtc_time = pSeries_set_rtc_time, - .calibrate_decr = pSeries_calibrate_decr, + .calibrate_decr = generic_calibrate_decr, .progress = pSeries_progress, .check_legacy_ioport = pSeries_check_legacy_ioport, .system_reset_exception = pSeries_system_reset_exception, --- linux-cg.orig/arch/ppc64/kernel/pmac_time.c 2005-06-21 02:10:53.638906384 -0400 +++ linux-cg/arch/ppc64/kernel/pmac_time.c 2005-06-21 02:22:26.417895424 -0400 @@ -40,11 +40,6 @@ #define DBG(x...) #endif -extern void setup_default_decr(void); - -extern unsigned long ppc_tb_freq; -extern unsigned long ppc_proc_freq; - /* Apparently the RTC stores seconds since 1 Jan 1904 */ #define RTC_OFFSET 2082844800 @@ -161,8 +156,7 @@ void __init pmac_get_boot_time(struct rt /* * Query the OF and get the decr frequency. - * This was taken from the pmac time_init() when merging the prep/pmac - * time functions. 
+ * FIXME: merge this with generic_calibrate_decr */ void __init pmac_calibrate_decr(void) { --- linux-cg.orig/arch/ppc64/kernel/setup.c 2005-06-21 02:10:21.423014416 -0400 +++ linux-cg/arch/ppc64/kernel/setup.c 2005-06-21 02:22:45.823002448 -0400 @@ -700,9 +700,6 @@ void machine_halt(void) EXPORT_SYMBOL(machine_halt); -unsigned long ppc_proc_freq; -unsigned long ppc_tb_freq; - static int ppc64_panic_event(struct notifier_block *this, unsigned long event, void *ptr) { @@ -1116,7 +1113,7 @@ void ppc64_dump_msg(unsigned int src, co } /* This should only be called on processor 0 during calibrate decr */ -void setup_default_decr(void) +void __init setup_default_decr(void) { struct paca_struct *lpaca = get_paca(); --- linux-cg.orig/arch/ppc64/kernel/time.c 2005-06-21 02:10:21.425014112 -0400 +++ linux-cg/arch/ppc64/kernel/time.c 2005-06-21 02:22:26.408896792 -0400 @@ -107,6 +107,9 @@ void ppc_adjtimex(void); static unsigned adjusting_time = 0; +unsigned long ppc_proc_freq; +unsigned long ppc_tb_freq; + static __inline__ void timer_check_rtc(void) { /* @@ -472,6 +475,66 @@ int do_settimeofday(struct timespec *tv) EXPORT_SYMBOL(do_settimeofday); +#if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_MAPLE) || defined(CONFIG_PPC_BPA) +void __init generic_calibrate_decr(void) +{ + struct device_node *cpu; + struct div_result divres; + unsigned int *fp; + int node_found; + + /* + * The cpu node should have a timebase-frequency property + * to tell us the rate at which the decrementer counts. 
+ */ + cpu = of_find_node_by_type(NULL, "cpu"); + + ppc_tb_freq = DEFAULT_TB_FREQ; /* hardcoded default */ + node_found = 0; + if (cpu != 0) { + fp = (unsigned int *)get_property(cpu, "timebase-frequency", + NULL); + if (fp != 0) { + node_found = 1; + ppc_tb_freq = *fp; + } + } + if (!node_found) + printk(KERN_ERR "WARNING: Estimating decrementer frequency " + "(not found)\n"); + + ppc_proc_freq = DEFAULT_PROC_FREQ; + node_found = 0; + if (cpu != 0) { + fp = (unsigned int *)get_property(cpu, "clock-frequency", + NULL); + if (fp != 0) { + node_found = 1; + ppc_proc_freq = *fp; + } + } + if (!node_found) + printk(KERN_ERR "WARNING: Estimating processor frequency " + "(not found)\n"); + + of_node_put(cpu); + + printk(KERN_INFO "time_init: decrementer frequency = %lu.%.6lu MHz\n", + ppc_tb_freq/1000000, ppc_tb_freq%1000000); + printk(KERN_INFO "time_init: processor frequency = %lu.%.6lu MHz\n", + ppc_proc_freq/1000000, ppc_proc_freq%1000000); + + tb_ticks_per_jiffy = ppc_tb_freq / HZ; + tb_ticks_per_sec = tb_ticks_per_jiffy * HZ; + tb_ticks_per_usec = ppc_tb_freq / 1000000; + tb_to_us = mulhwu_scale_factor(ppc_tb_freq, 1000000); + div128_by_32(1024*1024, 0, tb_ticks_per_sec, &divres); + tb_to_xs = divres.result_low; + + setup_default_decr(); +} +#endif + void __init time_init(void) { /* This function is only called on the boot processor */ --- linux-cg.orig/include/asm-ppc64/time.h 2005-06-21 02:10:21.427013808 -0400 +++ linux-cg/include/asm-ppc64/time.h 2005-06-21 02:22:26.419895120 -0400 @@ -34,6 +34,15 @@ struct rtc_time; extern void to_tm(int tim, struct rtc_time * tm); extern time_t last_rtc_update; +void generic_calibrate_decr(void); +void setup_default_decr(void); + +/* Some sane defaults: 125 MHz timebase, 1GHz processor */ +extern unsigned long ppc_proc_freq; +#define DEFAULT_PROC_FREQ (DEFAULT_TB_FREQ * 8) +extern unsigned long ppc_tb_freq; +#define DEFAULT_TB_FREQ 125000000UL + /* * By putting all of this stuff into a single struct we * reduce the number of 
cache lines touched by do_gettimeofday.

From arnd at arndb.de Wed Jun 22 07:28:28 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Tue, 21 Jun 2005 23:28:28 +0200
Subject: [PATCH 9/11] ppc64: Add driver for BPA iommu
In-Reply-To: <200506212326.18205.arnd@arndb.de>
References: <200506212310.54156.arnd@arndb.de>
	<200506212324.19713.arnd@arndb.de>
	<200506212326.18205.arnd@arndb.de>
Message-ID: <200506212328.28929.arnd@arndb.de>

Implementation of software load support for the BE iommu. This is very
different from other iommu code on ppc64, since we only do a static mapping.
The mapping is currently hardcoded. It should really be read from the
firmware, but the firmware does not set up the device nodes yet. There is
a single 512MB DMA window for PCI, USB and ethernet at 0x20000000 for
our RAM.

The Cell processor can either keep the I/O page table in memory, like
the hashed page table (hardware load), or have the operating system write
the entries into memory mapped CPU registers (software load).

I use the software load mechanism because I know that all I/O page table
entries for the amount of installed physical memory fit into the IO TLB cache.
Once we get machines with more than 4GB of installed memory, we can either
use hardware I/O page table access like the other platforms do, or
dynamically update the I/O TLB entries when a page fault occurs in the
I/O subsystem.
The software load can then use the macros that I have implemented for the
static mapping in order to do the TLB cache updates.
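
A rough way to see where the 4GB figure could come from: a comment in bpa_iommu.c below describes the pte cache as 4-way associative with a 6 bit index, which gives 256 entries. The sketch below works through that arithmetic. It assumes each cache entry maps one 16MB I/O page, which is one of the page sizes handled in get_iost_entry() but is not stated in the description above; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Entries in a set-associative cache with 2^index_bits sets. */
static uint64_t ioc_entries(unsigned int index_bits, unsigned int ways)
{
	return ((uint64_t)1 << index_bits) * ways;
}

/* Bytes of I/O address space covered if each entry maps one page. */
static uint64_t ioc_coverage(unsigned int index_bits, unsigned int ways,
			     uint64_t page_size)
{
	return ioc_entries(index_bits, ways) * page_size;
}

/* Number of page table entries needed to map a window statically. */
static uint64_t window_entries(uint64_t window_size, uint64_t page_size)
{
	return (window_size + page_size - 1) / page_size;
}

/* Worked numbers under the assumptions named in the text above. */
static void ioc_example_check(void)
{
	/* 64 sets * 4 ways = 256 cached entries */
	assert(ioc_entries(6, 4) == 256);
	/* 256 entries * 16MB = 4GB */
	assert(ioc_coverage(6, 4, 0x1000000ULL) == 0x100000000ULL);
	/* a 512MB window needs 32 entries at 16MB each */
	assert(window_entries(0x20000000ULL, 0x1000000ULL) == 32);
}
```

Under these assumptions the 512MB window needs far fewer entries than the cache holds, and the static scheme stops being sufficient at the 4GB mark.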
Signed-off-by: Arnd Bergmann -- Makefile | 3 bpa_iommu.c | 377 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ bpa_iommu.h | 65 ++++++++++ bpa_setup.c | 3 4 files changed, 446 insertions(+), 2 deletions(-) Index: linus-2.5/arch/ppc64/kernel/Makefile =================================================================== --- linus-2.5.orig/arch/ppc64/kernel/Makefile 2005-04-22 07:01:07.000000000 +0200 +++ linus-2.5/arch/ppc64/kernel/Makefile 2005-04-29 10:01:44.000000000 +0200 @@ -34,7 +34,8 @@ pSeries_nvram.o rtasd.o ras.o pSeries_reconfig.o \ pSeries_setup.o pSeries_iommu.o -obj-$(CONFIG_PPC_BPA) += bpa_setup.o bpa_nvram.o bpa_iic.o spider-pic.o +obj-$(CONFIG_PPC_BPA) += bpa_setup.o bpa_iommu.o bpa_nvram.o \ + bpa_iic.o spider-pic.o obj-$(CONFIG_EEH) += eeh.o obj-$(CONFIG_PROC_FS) += proc_ppc64.o Index: linus-2.5/arch/ppc64/kernel/bpa_iommu.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linus-2.5/arch/ppc64/kernel/bpa_iommu.c 2005-04-29 10:24:03.000000000 +0200 @@ -0,0 +1,377 @@ +/* + * IOMMU implementation for Broadband Processor Architecture + * We just establish a linear mapping at boot by setting all the + * IOPT cache entries in the CPU. + * The mapping functions should be identical to pci_direct_iommu, + * except for the handling of the high order bit that is required + * by the Spider bridge. These should be split into a separate + * file at the point where we get a different bridge chip. + * + * Copyright (C) 2005 IBM Deutschland Entwicklung GmbH, + * Arnd Bergmann + * + * Based on linear mapping + * Copyright (C) 2003 Benjamin Herrenschmidt (benh at kernel.crashing.org) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#undef DEBUG + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "pci.h" +#include "bpa_iommu.h" + +static inline unsigned long +get_iopt_entry(unsigned long real_address, unsigned long ioid, + unsigned long prot) +{ + return (prot & IOPT_PROT_MASK) + | (IOPT_COHERENT) + | (IOPT_ORDER_VC) + | (real_address & IOPT_RPN_MASK) + | (ioid & IOPT_IOID_MASK); +} + +typedef struct { + unsigned long val; +} ioste; + +static inline ioste +mk_ioste(unsigned long val) +{ + ioste ioste = { .val = val, }; + return ioste; +} + +static inline ioste +get_iost_entry(unsigned long iopt_base, unsigned long io_address, unsigned page_size) +{ + unsigned long ps; + unsigned long iostep; + unsigned long nnpt; + unsigned long shift; + + switch (page_size) { + case 0x1000000: + ps = IOST_PS_16M; + nnpt = 0; /* one page per segment */ + shift = 5; /* segment has 16 iopt entries */ + break; + + case 0x100000: + ps = IOST_PS_1M; + nnpt = 0; /* one page per segment */ + shift = 1; /* segment has 256 iopt entries */ + break; + + case 0x10000: + ps = IOST_PS_64K; + nnpt = 0x07; /* 8 pages per io page table */ + shift = 0; /* all entries are used */ + break; + + case 0x1000: + ps = IOST_PS_4K; + nnpt = 0x7f; /* 128 pages per io page table */ + shift = 0; /* all entries are used */ + break; + + default: /* not a known compile time constant */ + BUILD_BUG_ON(1); + break; + } + + iostep = iopt_base + + /* need 8 bytes per iopte */ + (((io_address / page_size * 8) + /* align io page tables on 4k page boundaries */ + << shift) + /* nnpt+1 pages go into each iopt */ + & ~(nnpt << 12)); + + nnpt++; /* this seems to work, but the documentation is not clear + about whether we put nnpt or nnpt-1 into the ioste bits. + In theory, this can't work for 4k pages.
*/ + return mk_ioste(IOST_VALID_MASK + | (iostep & IOST_PT_BASE_MASK) + | ((nnpt << 5) & IOST_NNPT_MASK) + | (ps & IOST_PS_MASK)); +} + +/* compute the address of an io pte */ +static inline unsigned long +get_ioptep(ioste iost_entry, unsigned long io_address) +{ + unsigned long iopt_base; + unsigned long page_size; + unsigned long page_number; + unsigned long iopt_offset; + + iopt_base = iost_entry.val & IOST_PT_BASE_MASK; + page_size = iost_entry.val & IOST_PS_MASK; + + /* decode page size to compute page number */ + page_number = (io_address & 0x0fffffff) >> (10 + 2 * page_size); + /* page number is an offset into the io page table */ + iopt_offset = (page_number << 3) & 0x7fff8ul; + return iopt_base + iopt_offset; +} + +/* compute the tag field of the iopt cache entry */ +static inline unsigned long +get_ioc_tag(ioste iost_entry, unsigned long io_address) +{ + unsigned long iopte = get_ioptep(iost_entry, io_address); + + return IOPT_VALID_MASK + | ((iopte & 0x00000000000000ff8ul) >> 3) + | ((iopte & 0x0000003fffffc0000ul) >> 9); +} + +/* compute the hashed 6 bit index for the 4-way associative pte cache */ +static inline unsigned long +get_ioc_hash(ioste iost_entry, unsigned long io_address) +{ + unsigned long iopte = get_ioptep(iost_entry, io_address); + + return ((iopte & 0x000000000000001f8ul) >> 3) + ^ ((iopte & 0x00000000000020000ul) >> 17) + ^ ((iopte & 0x00000000000010000ul) >> 15) + ^ ((iopte & 0x00000000000008000ul) >> 13) + ^ ((iopte & 0x00000000000004000ul) >> 11) + ^ ((iopte & 0x00000000000002000ul) >> 9) + ^ ((iopte & 0x00000000000001000ul) >> 7); +} + +/* same as above, but pretend that we have a simpler 1-way associative + pte cache with an 8 bit index */ +static inline unsigned long +get_ioc_hash_1way(ioste iost_entry, unsigned long io_address) +{ + unsigned long iopte = get_ioptep(iost_entry, io_address); + + return ((iopte & 0x000000000000001f8ul) >> 3) + ^ ((iopte & 0x00000000000020000ul) >> 17) + ^ ((iopte & 0x00000000000010000ul) >> 15) + ^ 
((iopte & 0x00000000000008000ul) >> 13) + ^ ((iopte & 0x00000000000004000ul) >> 11) + ^ ((iopte & 0x00000000000002000ul) >> 9) + ^ ((iopte & 0x00000000000001000ul) >> 7) + ^ ((iopte & 0x0000000000000c000ul) >> 8); +} + +static inline ioste +get_iost_cache(void __iomem *base, unsigned long index) +{ + unsigned long __iomem *p = (base + IOC_ST_CACHE_DIR); + return mk_ioste(in_be64(&p[index])); +} + +static inline void +set_iost_cache(void __iomem *base, unsigned long index, ioste ste) +{ + unsigned long __iomem *p = (base + IOC_ST_CACHE_DIR); + pr_debug("ioste %02lx was %016lx, store %016lx", index, + get_iost_cache(base, index).val, ste.val); + out_be64(&p[index], ste.val); + pr_debug(" now %016lx\n", get_iost_cache(base, index).val); +} + +static inline unsigned long +get_iopt_cache(void __iomem *base, unsigned long index, unsigned long *tag) +{ + unsigned long __iomem *tags = (void *)(base + IOC_PT_CACHE_DIR); + unsigned long __iomem *p = (void *)(base + IOC_PT_CACHE_REG); + + *tag = tags[index]; + rmb(); + return *p; +} + +static inline void +set_iopt_cache(void __iomem *base, unsigned long index, + unsigned long tag, unsigned long val) +{ + unsigned long __iomem *tags = base + IOC_PT_CACHE_DIR; + unsigned long __iomem *p = base + IOC_PT_CACHE_REG; + pr_debug("iopt %02lx was v%016lx/t%016lx, store v%016lx/t%016lx\n", + index, get_iopt_cache(base, index, &oldtag), oldtag, val, tag); + + out_be64(p, val); + out_be64(&tags[index], tag); +} + +static inline void +set_iost_origin(void __iomem *base) +{ + unsigned long __iomem *p = base + IOC_ST_ORIGIN; + unsigned long origin = IOSTO_ENABLE | IOSTO_SW; + + pr_debug("iost_origin %016lx, now %016lx\n", in_be64(p), origin); + out_be64(p, origin); +} + +static inline void +set_iocmd_config(void __iomem *base) +{ + unsigned long __iomem *p = base + 0xc00; + unsigned long conf; + + conf = in_be64(p); + pr_debug("iost_conf %016lx, now %016lx\n", conf, conf | IOCMD_CONF_TE); + out_be64(p, conf | IOCMD_CONF_TE); +} + +/* FIXME: 
get these from the device tree */ +#define ioc_base 0x20000511000ull +#define ioc_mmio_base 0x20000510000ull +#define ioid 0x48a +#define iopt_phys_offset (- 0x20000000) /* We have a 512MB offset from the SB */ +#define io_page_size 0x1000000 + +static unsigned long map_iopt_entry(unsigned long address) +{ + switch (address >> 20) { + case 0x600: + address = 0x24020000000ull; /* spider i/o */ + break; + default: + address += iopt_phys_offset; + break; + } + + return get_iopt_entry(address, ioid, IOPT_PROT_RW); +} + +static void iommu_bus_setup_null(struct pci_bus *b) { } +static void iommu_dev_setup_null(struct pci_dev *d) { } + +/* initialize the iommu to support a simple linear mapping + * for each DMA window used by any device. For now, we + * happen to know that there is only one DMA window in use, + * starting at iopt_phys_offset. */ +static void bpa_map_iommu(void) +{ + unsigned long address; + void __iomem *base; + ioste ioste; + unsigned long index; + + base = __ioremap(ioc_base, 0x1000, _PAGE_NO_CACHE); + pr_debug("%lx mapped to %p\n", ioc_base, base); + set_iocmd_config(base); + iounmap(base); + + base = __ioremap(ioc_mmio_base, 0x1000, _PAGE_NO_CACHE); + pr_debug("%lx mapped to %p\n", ioc_mmio_base, base); + + set_iost_origin(base); + + for (address = 0; address < 0x100000000ul; address += io_page_size) { + ioste = get_iost_entry(0x10000000000ul, address, io_page_size); + if ((address & 0xfffffff) == 0) /* segment start */ + set_iost_cache(base, address >> 28, ioste); + index = get_ioc_hash_1way(ioste, address); + pr_debug("addr %08lx, index %02lx, ioste %016lx\n", + address, index, ioste.val); + set_iopt_cache(base, + get_ioc_hash_1way(ioste, address), + get_ioc_tag(ioste, address), + map_iopt_entry(address)); + } + iounmap(base); +} + + +static void *bpa_alloc_coherent(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, unsigned int __nocast flag) +{ + void *ret; + + ret = (void *)__get_free_pages(flag, get_order(size)); + if (ret != NULL) { + 
memset(ret, 0, size); + *dma_handle = virt_to_abs(ret) | BPA_DMA_VALID; + } + return ret; +} + +static void bpa_free_coherent(struct device *hwdev, size_t size, + void *vaddr, dma_addr_t dma_handle) +{ + free_pages((unsigned long)vaddr, get_order(size)); +} + +static dma_addr_t bpa_map_single(struct device *hwdev, void *ptr, + size_t size, enum dma_data_direction direction) +{ + return virt_to_abs(ptr) | BPA_DMA_VALID; +} + +static void bpa_unmap_single(struct device *hwdev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction direction) +{ +} + +static int bpa_map_sg(struct device *hwdev, struct scatterlist *sg, + int nents, enum dma_data_direction direction) +{ + int i; + + for (i = 0; i < nents; i++, sg++) { + sg->dma_address = (page_to_phys(sg->page) + sg->offset) + | BPA_DMA_VALID; + sg->dma_length = sg->length; + } + + return nents; +} + +static void bpa_unmap_sg(struct device *hwdev, struct scatterlist *sg, + int nents, enum dma_data_direction direction) +{ +} + +static int bpa_dma_supported(struct device *dev, u64 mask) +{ + return mask < 0x100000000ull; +} + +void bpa_init_iommu(void) +{ + bpa_map_iommu(); + + /* Direct I/O, IOMMU off */ + ppc_md.iommu_dev_setup = iommu_dev_setup_null; + ppc_md.iommu_bus_setup = iommu_bus_setup_null; + + pci_dma_ops.alloc_coherent = bpa_alloc_coherent; + pci_dma_ops.free_coherent = bpa_free_coherent; + pci_dma_ops.map_single = bpa_map_single; + pci_dma_ops.unmap_single = bpa_unmap_single; + pci_dma_ops.map_sg = bpa_map_sg; + pci_dma_ops.unmap_sg = bpa_unmap_sg; + pci_dma_ops.dma_supported = bpa_dma_supported; +} Index: linus-2.5/arch/ppc64/kernel/bpa_iommu.h =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linus-2.5/arch/ppc64/kernel/bpa_iommu.h 2005-04-29 09:47:29.000000000 +0200 @@ -0,0 +1,65 @@ +#ifndef BPA_IOMMU_H +#define BPA_IOMMU_H + +/* some constants */ +enum { + /* segment table entries */ + IOST_VALID_MASK = 0x8000000000000000ul, + 
IOST_TAG_MASK = 0x3000000000000000ul, + IOST_PT_BASE_MASK = 0x000003fffffff000ul, + IOST_NNPT_MASK = 0x0000000000000fe0ul, + IOST_PS_MASK = 0x000000000000000ful, + + IOST_PS_4K = 0x1, + IOST_PS_64K = 0x3, + IOST_PS_1M = 0x5, + IOST_PS_16M = 0x7, + + /* iopt tag register */ + IOPT_VALID_MASK = 0x0000000200000000ul, + IOPT_TAG_MASK = 0x00000001fffffffful, + + /* iopt cache register */ + IOPT_PROT_MASK = 0xc000000000000000ul, + IOPT_PROT_NONE = 0x0000000000000000ul, + IOPT_PROT_READ = 0x4000000000000000ul, + IOPT_PROT_WRITE = 0x8000000000000000ul, + IOPT_PROT_RW = 0xc000000000000000ul, + IOPT_COHERENT = 0x2000000000000000ul, + + IOPT_ORDER_MASK = 0x1800000000000000ul, + /* order access to same IOID/VC on same address */ + IOPT_ORDER_ADDR = 0x0800000000000000ul, + /* similar, but only after a write access */ + IOPT_ORDER_WRITES = 0x1000000000000000ul, + /* Order all accesses to same IOID/VC */ + IOPT_ORDER_VC = 0x1800000000000000ul, + + IOPT_RPN_MASK = 0x000003fffffff000ul, + IOPT_HINT_MASK = 0x0000000000000800ul, + IOPT_IOID_MASK = 0x00000000000007fful, + + IOSTO_ENABLE = 0x8000000000000000ul, + IOSTO_ORIGIN = 0x000003fffffff000ul, + IOSTO_HW = 0x0000000000000800ul, + IOSTO_SW = 0x0000000000000400ul, + + IOCMD_CONF_TE = 0x0000800000000000ul, + + /* memory mapped registers */ + IOC_PT_CACHE_DIR = 0x000, + IOC_ST_CACHE_DIR = 0x800, + IOC_PT_CACHE_REG = 0x910, + IOC_ST_ORIGIN = 0x918, + IOC_CONF = 0x930, + + /* The high bit needs to be set on every DMA address, + only 2GB are addressable */ + BPA_DMA_VALID = 0x80000000, + BPA_DMA_MASK = 0x7fffffff, +}; + + +void bpa_init_iommu(void); + +#endif Index: linus-2.5/arch/ppc64/kernel/bpa_setup.c =================================================================== --- linus-2.5.orig/arch/ppc64/kernel/bpa_setup.c 2005-04-22 06:59:58.000000000 +0200 +++ linus-2.5/arch/ppc64/kernel/bpa_setup.c 2005-04-29 10:01:12.000000000 +0200 @@ -46,6 +46,7 @@ #include "pci.h" #include "bpa_iic.h" +#include "bpa_iommu.h" #ifdef DEBUG #define 
DBG(fmt...) udbg_printf(fmt) @@ -179,7 +180,7 @@ hpte_init_native(); - pci_direct_iommu_init(); + bpa_init_iommu(); ppc64_interrupt_controller = IC_BPA_IIC; From arnd at arndb.de Wed Jun 22 07:34:43 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:34:43 +0200 Subject: [PATCH 10/11] ppc64: SPU file system In-Reply-To: <200506212328.28929.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212326.18205.arnd@arndb.de> <200506212328.28929.arnd@arndb.de> Message-ID: <200506212334.44066.arnd@arndb.de> This is a work-in-progress version of the SPU file system, which is used to run code on the Synergistic Processing Units of the Cell Processor, a.k.a. Broadband Engine. The file system provides a name space similar to POSIX shared memory or message queues. Users that have write permissions on the file system can create directories in the spufs root. Every directory represents an SPU context, which is currently mapped to a physical SPU, but that is going to change to a virtualization scheme in future updates. An SPU context directory contains a predefined set of files used for manipulating the state of the logical SPU. Users can change permissions on those files, but not actually add or remove files without removing the complete directory. The current set of files is: /mem The contents of the local store memory of the SPU. This can be accessed like a regular shared memory file and contains both code and data in the address space of the SPU. The implemented file operations currently are read(), write() and mmap(). We will need our own address space operations as soon as we allow the SPU context to be scheduled away from the physical SPU into page cache. /run A stub file that lets us do ioctl. The only ioctl method we need is the spu_run() call. spu_run suspends the current thread from the host CPU and transfers the flow of execution to the SPU.
The ioctl call returns to the calling thread when a state is entered that cannot be handled by the kernel, e.g. an error in the SPU code or an exit() from it. When a signal is pending for the host CPU thread, the ioctl is interrupted and the SPU is stopped in order to call the signal handler. /mbox The first SPU to CPU communication mailbox. This file is read-only and can be read in units of 32 bits. The file can only be used in non-blocking mode, and even poll() will not block on it. When no data is available in the mailbox, read() returns EAGAIN. /ibox The second SPU to CPU communication mailbox. This file is similar to the first mailbox file, but can be read in blocking I/O mode, and the poll family of system calls can be used to wait for it. /wbox The CPU to SPU communication mailbox. It is write-only and can be written in units of 32 bits. If the mailbox is full, write() will block and poll() can be used to wait for it to become empty again. Other files are planned but are currently not implemented or not functional.
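To make the mailbox semantics above more concrete, here is a minimal user-space sketch of how a program might exchange 32-bit words with an SPU through already opened mbox and wbox file descriptors. The function names spu_mbox_try_read() and spu_wbox_write() are hypothetical and not part of this patch; the sketch only assumes what the description states: transfers happen in units of 32 bits, and an empty mbox makes read() fail with EAGAIN. Opening the files is left to the caller, since the mount point of spufs and the context directory name are not given here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to fetch one 32-bit word from an mbox file
 * descriptor. Returns 1 and stores the word in *out if one was
 * available, 0 if the mailbox was empty (EAGAIN), -1 on any other error.
 */
static int spu_mbox_try_read(int fd, uint32_t *out)
{
    uint32_t word;
    ssize_t n;

    assert(out != NULL);
    n = read(fd, &word, sizeof(word));
    if (n == (ssize_t)sizeof(word)) {
        *out = word;
        return 1;
    }
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}

/*
 * Hypothetical helper: send one 32-bit word through a wbox file
 * descriptor. According to the description, write() blocks while the
 * mailbox is full. Returns 0 on success, -1 on error or short write.
 */
static int spu_wbox_write(int fd, uint32_t word)
{
    ssize_t n = write(fd, &word, sizeof(word));

    return n == (ssize_t)sizeof(word) ? 0 : -1;
}
```

A caller would typically poll the ibox file instead when it wants to sleep until the SPU has produced data, because the mbox file never blocks.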
Some of the changes against the previous version of this patch are: - The file system code is split into separate files - Mailboxes are actually working with interrupts - We no longer oops on some bad DMA accesses - Use of generic simple_attribute code - Better initialization of SPE registers Signed-off-by: Arnd Bergmann -- arch/ppc64/kernel/Makefile | 1 arch/ppc64/kernel/spu_base.c | 678 +++++++++++++++++++++++++++++++++++++++++++ arch/ppc64/mm/hash_utils.c | 1 fs/Kconfig | 10 fs/Makefile | 1 fs/spufs/Makefile | 2 fs/spufs/file.c | 517 ++++++++++++++++++++++++++++++++ fs/spufs/inode.c | 378 +++++++++++++++++++++++ fs/spufs/spufs.h | 53 +++ include/asm-ppc64/spu.h | 468 +++++++++++++++++++++++++++++ mm/memory.c | 1 11 files changed, 2110 insertions(+) --- linux-cg.orig/arch/ppc64/kernel/Makefile 2005-06-21 22:48:42.131979120 -0400 +++ linux-cg/arch/ppc64/kernel/Makefile 2005-06-21 22:48:48.765901360 -0400 @@ -53,6 +53,7 @@ obj-$(CONFIG_HVCS) += hvcserver.o obj-$(CONFIG_IBMVIO) += vio.o obj-$(CONFIG_XICS) += xics.o obj-$(CONFIG_MPIC) += mpic.o +obj-$(CONFIG_SPU_FS) += spu_base.o obj-$(CONFIG_PPC_PMAC) += pmac_setup.o pmac_feature.o pmac_pci.o \ pmac_time.o pmac_nvram.o pmac_low_i2c.o --- linux-cg.orig/arch/ppc64/kernel/spu_base.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/arch/ppc64/kernel/spu_base.c 2005-06-21 22:48:48.767901056 -0400 @@ -0,0 +1,678 @@ +/* + * Low-level SPU handling + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#define DEBUG 1 + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "bpa_iic.h" + +static int __spu_trap_invalid_dma(struct spu *spu) +{ + pr_debug("%s\n", __FUNCTION__); + force_sig(SIGBUS, /* info, */ current); + return 0; +} + +static int __spu_trap_dma_align(struct spu *spu) +{ + pr_debug("%s\n", __FUNCTION__); + force_sig(SIGBUS, /* info, */ current); + return 0; +} + +static int __spu_trap_error(struct spu *spu) +{ + pr_debug("%s\n", __FUNCTION__); + force_sig(SIGILL, /* info, */ current); + return 0; +} + +static int __spu_trap_data_seg(struct spu *spu, unsigned long ea) +{ + struct spu_priv2 __iomem *priv2; + struct mm_struct *mm; + + pr_debug("%s\n", __FUNCTION__); + + if (REGION_ID(ea) != USER_REGION_ID) { + printk("invalid region access at %016lx\n", ea); + return 1; + } + + priv2 = spu->priv2; + mm = spu->mm; + + if (spu->slb_replace >= 8) + spu->slb_replace = 0; + + out_be64(&priv2->slb_index_W, spu->slb_replace); + out_be64(&priv2->slb_vsid_RW, + (get_vsid(mm->context.id, ea) << SLB_VSID_SHIFT) + | SLB_VSID_USER); + out_be64(&priv2->slb_esid_RW, (ea & ESID_MASK) | SLB_ESID_V); + out_be64(&priv2->mfc_control_RW, MFC_CNTL_RESTART_DMA_COMMAND); + + printk("set slb %d context %lx, ea %016lx, vsid %016lx, esid %016lx\n", + spu->slb_replace, mm->context.id, ea, + (get_vsid(mm->context.id, ea) << SLB_VSID_SHIFT)| SLB_VSID_USER, + (ea & ESID_MASK) | SLB_ESID_V); + return 0; +} + +static int __spu_trap_data_map(struct spu *spu, unsigned long ea) +{ + unsigned long dsisr; + struct spu_priv1 __iomem *priv1; + + pr_debug("%s\n", __FUNCTION__); + priv1 = spu->priv1; + dsisr = in_be64(&priv1->mfc_dsisr_RW); + + if (dsisr & MFC_DSISR_PTE_NOT_FOUND) { + printk("pte lookup ea %016lx, dsisr %lx\n", ea, 
dsisr); + wake_up(&spu->stop_wq); + } else { + printk("unexpected data fault ea %016lx, dsisr %lx\n", ea, dsisr); + } + + return 0; +} + +static int __spu_trap_mailbox(struct spu *spu) +{ + wake_up_all(&spu->ibox_wq); + + /* atomically disable SPU mailbox interrupts */ + spin_lock(&spu->register_lock); + out_be64(&spu->priv1->int_mask_class2_RW, + in_be64(&spu->priv1->int_mask_class2_RW) & ~0x1); + spin_unlock(&spu->register_lock); + return 0; +} + +static int __spu_trap_stop(struct spu *spu) +{ + pr_debug("%s\n", __FUNCTION__); + spu->stop_code = in_be32(&spu->problem->spu_status_R); + wake_up(&spu->stop_wq); + return 0; +} + +static int __spu_trap_halt(struct spu *spu) +{ + pr_debug("%s\n", __FUNCTION__); + spu->stop_code = in_be32(&spu->problem->spu_status_R); + wake_up(&spu->stop_wq); + return 0; +} + +static int __spu_trap_tag_group(struct spu *spu) +{ + pr_debug("%s\n", __FUNCTION__); + /* wake_up(&spu->dma_wq); */ + return 0; +} + +static int __spu_trap_spubox(struct spu *spu) +{ + wake_up_all(&spu->wbox_wq); + + /* atomically disable SPU mailbox interrupts */ + spin_lock(&spu->register_lock); + out_be64(&spu->priv1->int_mask_class2_RW, + in_be64(&spu->priv1->int_mask_class2_RW) & ~0x10); + spin_unlock(&spu->register_lock); + return 0; +} + +static irqreturn_t +spu_irq_class_0(int irq, void *data, struct pt_regs *regs) +{ + struct spu *spu; + + spu = data; + spu->class_0_pending = 1; + wake_up(&spu->stop_wq); + + return IRQ_HANDLED; +} + +static int +spu_irq_class_0_bottom(struct spu *spu) +{ + unsigned long stat; + + spu->class_0_pending = 0; + + stat = in_be64(&spu->priv1->int_stat_class0_RW); + + if (stat & 1) /* invalid MFC DMA */ + __spu_trap_invalid_dma(spu); + + if (stat & 2) /* invalid DMA alignment */ + __spu_trap_dma_align(spu); + + if (stat & 4) /* error on SPU */ + __spu_trap_error(spu); + + out_be64(&spu->priv1->int_stat_class0_RW, stat); + return 0; +} + +static irqreturn_t +spu_irq_class_1(int irq, void *data, struct pt_regs *regs) +{ + struct
spu *spu; + unsigned long stat, dar; + + spu = data; + stat = in_be64(&spu->priv1->int_stat_class1_RW); + dar = in_be64(&spu->priv1->mfc_dar_RW); + + if (stat & 1) /* segment fault */ + __spu_trap_data_seg(spu, dar); + + if (stat & 2) { /* mapping fault */ + __spu_trap_data_map(spu, dar); + } + + if (stat & 4) /* ls compare & suspend on get */ + ; + + if (stat & 8) /* ls compare & suspend on put */ + ; + + out_be64(&spu->priv1->int_stat_class1_RW, stat); + return stat ? IRQ_HANDLED : IRQ_NONE; +} + +static irqreturn_t +spu_irq_class_2(int irq, void *data, struct pt_regs *regs) +{ + struct spu *spu; + unsigned long stat; + + spu = data; + stat = in_be64(&spu->priv1->int_stat_class2_RW); + + pr_debug("class 2 interrupt %d, %lx, %lx\n", irq, stat, + in_be64(&spu->priv1->int_mask_class2_RW)); + + + if (stat & 1) /* PPC core mailbox */ + __spu_trap_mailbox(spu); + + if (stat & 2) /* SPU stop-and-signal */ + __spu_trap_stop(spu); + + if (stat & 4) /* SPU halted */ + __spu_trap_halt(spu); + + if (stat & 8) /* DMA tag group complete */ + __spu_trap_tag_group(spu); + + if (stat & 0x10) /* SPU mailbox threshold */ + __spu_trap_spubox(spu); + + out_be64(&spu->priv1->int_stat_class2_RW, stat); + return stat ? 
IRQ_HANDLED : IRQ_NONE; +} + +static int +spu_request_irqs(struct spu *spu) +{ + int ret; + int irq_base; + + irq_base = IIC_NODE_STRIDE * spu->node + IIC_SPE_OFFSET; + + snprintf(spu->irq_c0, sizeof (spu->irq_c0), "spe%02d.0", spu->number); + ret = request_irq(irq_base + spu->isrc, + spu_irq_class_0, 0, spu->irq_c0, spu); + if (ret) + goto out; + out_be64(&spu->priv1->int_mask_class0_RW, 0x7); + + snprintf(spu->irq_c1, sizeof (spu->irq_c1), "spe%02d.1", spu->number); + ret = request_irq(irq_base + IIC_CLASS_STRIDE + spu->isrc, + spu_irq_class_1, 0, spu->irq_c1, spu); + if (ret) + goto out1; + out_be64(&spu->priv1->int_mask_class1_RW, 0x3); + + snprintf(spu->irq_c2, sizeof (spu->irq_c2), "spe%02d.2", spu->number); + ret = request_irq(irq_base + 2*IIC_CLASS_STRIDE + spu->isrc, + spu_irq_class_2, 0, spu->irq_c2, spu); + if (ret) + goto out2; + out_be64(&spu->priv1->int_mask_class2_RW, 0xe); + goto out; + +out2: + free_irq(irq_base + IIC_CLASS_STRIDE + spu->isrc, spu); +out1: + free_irq(irq_base + spu->isrc, spu); +out: + return ret; +} + +static void +spu_free_irqs(struct spu *spu) +{ + int irq_base; + + irq_base = IIC_NODE_STRIDE * spu->node + IIC_SPE_OFFSET; + + free_irq(irq_base + spu->isrc, spu); + free_irq(irq_base + IIC_CLASS_STRIDE + spu->isrc, spu); + free_irq(irq_base + 2*IIC_CLASS_STRIDE + spu->isrc, spu); +} + +static LIST_HEAD(spu_list); +static DECLARE_MUTEX(spu_mutex); + +static void spu_init_channels(struct spu *spu) +{ + static const struct { + unsigned channel; + unsigned count; + } zero_list[] = { + { 0x00, 1, }, { 0x01, 1, }, { 0x03, 1, }, { 0x04, 1, }, + { 0x18, 1, }, { 0x19, 1, }, { 0x1b, 1, }, { 0x1d, 1, }, + }, count_list[] = { + { 0x00, 0, }, { 0x03, 0, }, { 0x04, 0, }, { 0x15, 16, }, + { 0x17, 1, }, { 0x18, 0, }, { 0x19, 0, }, { 0x1b, 0, }, + { 0x1c, 1, }, { 0x1d, 0, }, { 0x1e, 1, }, + }; + struct spu_priv2 *priv2; + int i; + + priv2 = spu->priv2; + + /* initialize all channel data to zero */ + for (i = 0; i < ARRAY_SIZE(zero_list); i++) { + 
int count; + + out_be64(&priv2->spu_chnlcntptr_RW, zero_list[i].channel); + for (count = 0; count < zero_list[i].count; count++) + out_be64(&priv2->spu_chnldata_RW, 0); + } + + /* initialize channel counts to meaningful values */ + for (i = 0; i < ARRAY_SIZE(count_list); i++) { + out_be64(&priv2->spu_chnlcntptr_RW, count_list[i].channel); + out_be64(&priv2->spu_chnlcnt_RW, count_list[i].count); + } +} + +static void spu_init_regs(struct spu *spu) +{ + out_be64(&spu->priv1->int_mask_class0_RW, 0x7); + out_be64(&spu->priv1->int_mask_class1_RW, 0x3); + out_be64(&spu->priv1->int_mask_class2_RW, 0xe); +} + +struct spu *spu_alloc(void) +{ + struct spu *spu; + + down(&spu_mutex); + if (!list_empty(&spu_list)) { + spu = list_entry(spu_list.next, struct spu, list); + list_del_init(&spu->list); + printk("Got SPU %x\n", spu->isrc); + } else { + printk("No SPU left\n"); + spu = NULL; + } + up(&spu_mutex); + + if (spu) { + spu_init_channels(spu); + spu_init_regs(spu); + } + + return spu; +} +EXPORT_SYMBOL(spu_alloc); + +void spu_free(struct spu *spu) +{ + down(&spu_mutex); + list_add_tail(&spu->list, &spu_list); + up(&spu_mutex); +} +EXPORT_SYMBOL(spu_free); + +extern int hash_page(unsigned long ea, unsigned long access, unsigned long trap); //XXX +static int spu_handle_pte_fault(struct spu *spu) +{ + struct spu_problem __iomem *prob; + struct spu_priv1 __iomem *priv1; + struct spu_priv2 __iomem *priv2; + unsigned long ea, access, is_write; + struct mm_struct *mm; + struct vm_area_struct *vma; + int ret; + + printk("%s\n", __FUNCTION__); + prob = spu->problem; + priv1 = spu->priv1; + priv2 = spu->priv2; + + ea = in_be64(&priv1->mfc_dar_RW); + access = _PAGE_PRESENT | _PAGE_USER; + is_write = in_be64(&priv1->mfc_dsisr_RW) & 0x02000000; + mm = spu->mm; + + ret = hash_page(ea, access, 0x300); + if (ret < 0) { + printk("error in hash_page!\n"); + ret = -EFAULT; + goto out_err; + } + + printk("current %ld, spu %ld, ea %ld\n", current->mm->context.id, mm->context.id, ea); + if (!ret) 
{ + printk("hash inserted, vsid %lx\n", get_vsid(current->mm->context.id, ea)); + goto out_restart; + } + + ret = -EFAULT; + if (ea >= TASK_SIZE) + goto out_err; + + down_read(&mm->mmap_sem); + vma = find_vma(mm, ea); + if (!vma) + goto out; + + if (is_write) { + if (!(vma->vm_flags & VM_WRITE)) + goto out; + } + + ret = 0; +/* FIXME add missing code from do_page_fault */ + switch (handle_mm_fault(mm, vma, ea, is_write)) { + case VM_FAULT_MINOR: + printk("minor\n"); + current->min_flt++; + break; + case VM_FAULT_MAJOR: + printk("major\n"); + current->maj_flt++; + break; + case VM_FAULT_SIGBUS: + ret = -EFAULT; + break; + case VM_FAULT_OOM: + ret = -ENOMEM; + break; + default: + BUG(); + } +out: + up_read(&mm->mmap_sem); + if (ret) + goto out_err; +out_restart: + out_be64(&priv2->mfc_control_RW, MFC_CNTL_RESTART_DMA_COMMAND); +out_err: + printk("%s: returning %d\n", __FUNCTION__, ret); + return ret; +} + +int spu_run(struct spu *spu) +{ + struct spu_problem __iomem *prob; + struct spu_priv1 __iomem *priv1; + struct spu_priv2 __iomem *priv2; + unsigned long status; + int count = 10; + int ret; + + prob = spu->problem; + priv1 = spu->priv1; + priv2 = spu->priv2; + spu->mm = current->mm; + out_be32(&prob->spu_runcntl_RW, SPU_RUNCNTL_RUNNABLE); + + do { + ret = wait_event_interruptible(spu->stop_wq, + (!((status = in_be32(&prob->spu_status_R)) & 0x1)) + || (in_be64(&priv1->mfc_dsisr_RW) & MFC_DSISR_PTE_NOT_FOUND)); + + if (status & SPU_STATUS_STOPPED_BY_STOP) + ret = -EAGAIN; + else if (status & SPU_STATUS_STOPPED_BY_HALT) + ret = -EIO; + else if (in_be64(&priv1->mfc_dsisr_RW) & MFC_DSISR_PTE_NOT_FOUND) + ret = spu_handle_pte_fault(spu); + + if (spu->class_0_pending) + spu_irq_class_0_bottom(spu); + + } while (!ret && count--); + out_be32(&prob->spu_runcntl_RW, SPU_RUNCNTL_STOP); + out_be64(&priv2->slb_invalidate_all_W, 0); + spu->mm = NULL; + + return ret; +} +EXPORT_SYMBOL(spu_run); + +static void __iomem * __init map_spe_prop(struct device_node *n, + const char 
*name) +{ + struct address_prop { + unsigned long address; + unsigned int len; + } __attribute__((packed)) *prop; + + void *p; + int proplen; + + p = get_property(n, name, &proplen); + if (proplen != sizeof (struct address_prop)) + return NULL; + + prop = p; + + return ioremap(prop->address, prop->len); +} + +static void spu_unmap(struct spu *spu) +{ + iounmap(spu->priv2); + iounmap(spu->priv1); + iounmap(spu->problem); + iounmap((u8 __iomem *)spu->local_store); +} + +static int __init spu_map_device(struct spu *spu, struct device_node *spe) +{ + unsigned int *isrc_prop; + int ret; + + ret = -ENODEV; + isrc_prop = (u32 *)get_property(spe, "isrc", NULL); + if (!isrc_prop) + goto out; + spu->isrc = *isrc_prop; + + spu->name = get_property(spe, "name", NULL); + if (!spu->name) + goto out; + + /* we use local store as ram, not io memory */ + spu->local_store = (u8 __force *) map_spe_prop(spe, "local-store"); + if (!spu->local_store) + goto out; + + spu->problem= map_spe_prop(spe, "problem"); + if (!spu->problem) + goto out_unmap; + + spu->priv1= map_spe_prop(spe, "priv1"); + if (!spu->priv1) + goto out_unmap; + + spu->priv2= map_spe_prop(spe, "priv2"); + if (!spu->priv2) + goto out_unmap; + ret = 0; + goto out; + +out_unmap: + spu_unmap(spu); +out: + return ret; +} + +static int __init find_spu_node_id(struct device_node *spe) +{ + unsigned int *id; + struct device_node *cpu; + + cpu = spe->parent->parent; + id = (unsigned int *)get_property(cpu, "node-id", NULL); + + return id ? 
*id : 0; +} + +static int __init create_spu(struct device_node *spe) +{ + struct spu *spu; + int ret; + static int number; + + ret = -ENOMEM; + spu = kmalloc(sizeof (*spu), GFP_KERNEL); + if (!spu) + goto out; + + ret = spu_map_device(spu, spe); + if (ret) + goto out_free; + + spu->node = find_spu_node_id(spe); + spu->stop_code = 0; + spu->slb_replace = 0; + spu->mm = NULL; + spu->class_0_pending = 0; + spin_lock_init(&spu->register_lock); + + out_be64(&spu->priv1->mfc_sdr_RW, mfspr(SPRN_SDR1)); + out_be64(&spu->priv1->mfc_sr1_RW, 0x33); + + init_waitqueue_head(&spu->stop_wq); + init_waitqueue_head(&spu->wbox_wq); + init_waitqueue_head(&spu->ibox_wq); + + down(&spu_mutex); + spu->number = number++; + ret = spu_request_irqs(spu); + if (ret) + goto out_unmap; + + list_add(&spu->list, &spu_list); + up(&spu_mutex); + + printk(KERN_DEBUG "Using SPE %s %02x %p %p %p %p %d\n", + spu->name, spu->isrc, spu->local_store, + spu->problem, spu->priv1, spu->priv2, spu->number); + goto out; + +out_unmap: + up(&spu_mutex); + spu_unmap(spu); +out_free: + kfree(spu); +out: + return ret; +} + +static void destroy_spu(struct spu *spu) +{ + list_del_init(&spu->list); + + spu_free_irqs(spu); + spu_unmap(spu); + kfree(spu); +} + +static void cleanup_spu_base(void) +{ + struct spu *spu, *tmp; + down(&spu_mutex); + list_for_each_entry_safe(spu, tmp, &spu_list, list) + destroy_spu(spu); + up(&spu_mutex); +} +module_exit(cleanup_spu_base); + +static int __init init_spu_base(void) +{ + struct device_node *node; + int ret; + + ret = -ENODEV; + for (node = of_find_node_by_type(NULL, "spe"); + node; node = of_find_node_by_type(node, "spe")) { + ret = create_spu(node); + if (ret) { + printk(KERN_WARNING "%s: Error initializing %s\n", + __FUNCTION__, node->name); + cleanup_spu_base(); + break; + } + } + /* in some old firmware versions, the spe is called 'spc', so we + look for that as well */ + for (node = of_find_node_by_type(NULL, "spc"); + node; node = of_find_node_by_type(node, "spc")) { + 
ret = create_spu(node); + if (ret) { + printk(KERN_WARNING "%s: Error initializing %s\n", + __FUNCTION__, node->name); + cleanup_spu_base(); + break; + } + } + return ret; +} +module_init(init_spu_base); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Arnd Bergmann "); --- linux-cg.orig/arch/ppc64/mm/hash_utils.c 2005-06-21 22:48:42.136978360 -0400 +++ linux-cg/arch/ppc64/mm/hash_utils.c 2005-06-21 22:48:48.768900904 -0400 @@ -354,6 +354,7 @@ int hash_page(unsigned long ea, unsigned return ret; } +EXPORT_SYMBOL_GPL(hash_page); void flush_hash_page(unsigned long context, unsigned long ea, pte_t pte, int local) --- linux-cg.orig/fs/Kconfig 2005-06-21 22:48:42.138978056 -0400 +++ linux-cg/fs/Kconfig 2005-06-21 22:48:48.770900600 -0400 @@ -853,6 +853,16 @@ config HUGETLBFS config HUGETLB_PAGE def_bool HUGETLBFS +config SPU_FS + tristate "SPU file system" + default m + depends on PPC_BPA + help + The SPU file system is used to access Synergistic Processing + Units on machines implementing the Broadband Processor + Architecture. 
+ + config RAMFS bool default y --- linux-cg.orig/fs/Makefile 2005-06-21 22:48:42.140977752 -0400 +++ linux-cg/fs/Makefile 2005-06-21 22:48:48.771900448 -0400 @@ -95,3 +95,4 @@ obj-$(CONFIG_BEFS_FS) += befs/ obj-$(CONFIG_HOSTFS) += hostfs/ obj-$(CONFIG_HPPFS) += hppfs/ obj-$(CONFIG_DEBUG_FS) += debugfs/ +obj-$(CONFIG_SPU_FS) += spufs/ --- linux-cg.orig/fs/spufs/Makefile 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/fs/spufs/Makefile 2005-06-21 22:48:52.326892544 -0400 @@ -0,0 +1,2 @@ +obj-$(CONFIG_SPU_FS) += spufs.o +spufs-y += inode.o file.o --- linux-cg.orig/fs/spufs/file.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/fs/spufs/file.c 2005-06-21 22:50:20.599920208 -0400 @@ -0,0 +1,507 @@ +/* + * SPU file system -- file contents + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "spufs.h" + +static int +spufs_mem_open(struct inode *inode, struct file *file) +{ + struct spufs_inode_info *i = SPUFS_I(inode); + file->private_data = i->i_ctx; + return 0; +} + +static ssize_t +spufs_mem_read(struct file *file, char __user *buffer, + size_t size, loff_t *pos) +{ + struct spu *spu; + struct spu_context *ctx; + int ret; + + ctx = file->private_data; + spu = ctx->spu; + + down_read(&ctx->backing_sema); + if (spu->number & 0/*1*/) { + ret = generic_file_read(file, buffer, size, pos); + goto out; + } + + ret = 0; + size = min_t(ssize_t, LS_SIZE - *pos, size); + if (size <= 0) + goto out; + *pos += size; + ret = copy_to_user(buffer, spu->local_store + *pos - size, size); + ret = ret ? -EFAULT : size; + +out: + up_read(&ctx->backing_sema); + return ret; +} + +static ssize_t +spufs_mem_write(struct file *file, const char __user *buffer, + size_t size, loff_t *pos) +{ + struct spu_context *ctx = file->private_data; + struct spu *spu = ctx->spu; + + if (spu->number & 0) //1) + return generic_file_write(file, buffer, size, pos); + + size = min_t(ssize_t, LS_SIZE - *pos, size); + if (size <= 0) + return -EFBIG; + *pos += size; + return copy_from_user(spu->local_store + *pos - size, + buffer, size) ? 
-EFAULT : size; +} + +static int +spufs_mem_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct spu_context *ctx = file->private_data; + struct spu *spu = ctx->spu; + unsigned long pfn; + + if (spu->number & 0) //1) + return generic_file_mmap(file, vma); + + vma->vm_flags |= VM_RESERVED; + pfn = __pa(spu->local_store) >> PAGE_SHIFT; + /* + * This will work for actual SPUs, but not for vmalloc memory: + */ + if (remap_pfn_range(vma, vma->vm_start, pfn, + vma->vm_end-vma->vm_start, vma->vm_page_prot)) + return -EAGAIN; + /**/ + return 0; +} + +static struct file_operations spufs_mem_fops = { + .open = spufs_mem_open, + .read = spufs_mem_read, + .write = spufs_mem_write, + .mmap = spufs_mem_mmap, + .llseek = generic_file_llseek, +}; + +/* generic open function for all pipe-like files */ +static int spufs_pipe_open(struct inode *inode, struct file *file) +{ + struct spufs_inode_info *i = SPUFS_I(inode); + file->private_data = i->i_ctx; + + return nonseekable_open(inode, file); +} + +static ssize_t spufs_mbox_read(struct file *file, char __user *buf, + size_t len, loff_t *pos) +{ + struct spu_context *ctx; + struct spu_problem __iomem *prob; + u32 mbox_stat; + u32 mbox_data; + + if (len < 4) + return -EINVAL; + + ctx = file->private_data; + prob = ctx->spu->problem; + mbox_stat = in_be32(&prob->mb_stat_R); + if (!(mbox_stat & 0x0000ff)) + return -EAGAIN; + + mbox_data = in_be32(&prob->pu_mb_R); + + if (copy_to_user(buf, &mbox_data, sizeof mbox_data)) + return -EFAULT; + + return 4; +} + +static struct file_operations spufs_mbox_fops = { + .open = spufs_pipe_open, + .read = spufs_mbox_read, +}; + +/* low-level ibox access function */ +size_t spu_ibox_read(struct spu *spu, u32 *data) +{ + int ret; + + spin_lock_irq(&spu->register_lock); + + if (in_be32(&spu->problem->mb_stat_R) & 0xff0000) { + /* read the first available word */ + *data = in_be64(&spu->priv2->puint_mb_R); + ret = 4; + } else { + /* make sure we get woken up by the interrupt */ + 
out_be64(&spu->priv1->int_mask_class2_RW, + in_be64(&spu->priv1->int_mask_class2_RW) | 0x1); + ret = 0; + } + + spin_unlock_irq(&spu->register_lock); + return ret; +} +EXPORT_SYMBOL(spu_ibox_read); + +static ssize_t spufs_ibox_read(struct file *file, char __user *buf, + size_t len, loff_t *pos) +{ + struct spu_context *ctx; + u32 ibox_data; + ssize_t ret; + + if (len < 4) + return -EINVAL; + + ctx = file->private_data; + + ret = 0; + if (file->f_flags & O_NONBLOCK) { + if (!spu_ibox_read(ctx->spu, &ibox_data)) + ret = -EAGAIN; + } else { + ret = wait_event_interruptible(ctx->spu->ibox_wq, + spu_ibox_read(ctx->spu, &ibox_data)); + } + + if (ret) + return ret; + + ret = 4; + if (copy_to_user(buf, &ibox_data, sizeof ibox_data)) + ret = -EFAULT; + + return ret; +} + +static unsigned int spufs_ibox_poll(struct file *file, poll_table *wait) +{ + struct spu_context *ctx; + struct spu_problem __iomem *prob; + u32 mbox_stat; + unsigned int mask; + + ctx = file->private_data; + prob = ctx->spu->problem; + mbox_stat = in_be32(&prob->mb_stat_R); + + poll_wait(file, &ctx->spu->ibox_wq, wait); + + mask = 0; + if (mbox_stat & 0xff0000) + mask |= POLLIN | POLLRDNORM; + + return mask; +} + +static struct file_operations spufs_ibox_fops = { + .open = spufs_pipe_open, + .read = spufs_ibox_read, + .poll = spufs_ibox_poll, +}; + +/* low-level mailbox write */ +size_t spu_wbox_write(struct spu *spu, u32 data) +{ + int ret; + + spin_lock_irq(&spu->register_lock); + + if (in_be32(&spu->problem->mb_stat_R) & 0x00ff00) { + /* we have space to write wbox_data to */ + out_be32(&spu->problem->spu_mb_W, data); + ret = 4; + } else { + /* make sure we get woken up by the interrupt when space + becomes available */ + out_be64(&spu->priv1->int_mask_class2_RW, + in_be64(&spu->priv1->int_mask_class2_RW) | 0x10); + ret = 0; + } + + spin_unlock_irq(&spu->register_lock); + return ret; +} +EXPORT_SYMBOL(spu_wbox_write); + +static ssize_t spufs_wbox_write(struct file *file, const char __user *buf, + 
size_t len, loff_t *pos) +{ + struct spu_context *ctx; + u32 wbox_data; + int ret; + + if (len < 4) + return -EINVAL; + + ctx = file->private_data; + + if (copy_from_user(&wbox_data, buf, sizeof wbox_data)) + return -EFAULT; + + ret = 0; + if (file->f_flags & O_NONBLOCK) { + if (!spu_wbox_write(ctx->spu, wbox_data)) + ret = -EAGAIN; + } else { + ret = wait_event_interruptible(ctx->spu->wbox_wq, + spu_wbox_write(ctx->spu, wbox_data)); + } + + return ret ? ret : sizeof wbox_data; +} + +static unsigned int spufs_wbox_poll(struct file *file, poll_table *wait) +{ + struct spu_context *ctx; + struct spu_problem __iomem *prob; + u32 mbox_stat; + unsigned int mask; + + ctx = file->private_data; + prob = ctx->spu->problem; + mbox_stat = in_be32(&prob->mb_stat_R); + + poll_wait(file, &ctx->spu->wbox_wq, wait); + + mask = 0; + if (mbox_stat & 0x00ff00) + mask = POLLOUT | POLLWRNORM; + + return mask; +} + +static struct file_operations spufs_wbox_fops = { + .open = spufs_pipe_open, + .write = spufs_wbox_write, + .poll = spufs_wbox_poll, +}; + +static long spufs_run_spu(struct file *file, struct spu_context *ctx, + u32 *npc, u32 *status) +{ + struct spu_problem __iomem *prob; + int ret; + + if (file->f_flags & O_NONBLOCK) { + ret = -EAGAIN; + if (!down_write_trylock(&ctx->backing_sema)) + goto out; + } else { + down_write(&ctx->backing_sema); + } + + ctx = file->private_data; + prob = ctx->spu->problem; + out_be32(&prob->spu_npc_RW, *npc); + + ret = spu_run(ctx->spu); + + *status = in_be32(&prob->spu_status_R); + *npc = in_be32(&prob->spu_npc_RW); + + up_write(&ctx->backing_sema); + +out: + return ret; +} + +struct spufs_run_arg { + u32 npc; /* inout: Next Program Counter */ + u32 status; /* out: SPU status */ +}; + +/* either this ioctl function or the system call needs to die! 
*/ +static long spufs_run_ioctl(struct file *file, unsigned int num, + unsigned long arg) +{ + struct spufs_run_arg data; + int ret; + + if (num != _IOWR('s', 0, struct spufs_run_arg)) + return -EINVAL; + + if (copy_from_user(&data, (void __user *)arg, sizeof data)) + return -EFAULT; + + ret = spufs_run_spu(file, file->private_data, + &data.npc, &data.status); + + if (copy_to_user((void __user *)arg, &data, sizeof data)) + ret = -EFAULT; + + return ret; +} + +static struct file_operations spufs_run_fops = { + .open = spufs_pipe_open, + .unlocked_ioctl = spufs_run_ioctl, + .compat_ioctl = spufs_run_ioctl, +}; + +static void spufs_signal1_type_set(void *data, u64 val) +{ + struct spu_context *ctx = data; + ctx->sig1_type = !!val; +} + +static u64 spufs_signal1_type_get(void *data) +{ + struct spu_context *ctx = data; + return ctx->sig1_type; +} +DEFINE_SIMPLE_ATTRIBUTE(spufs_signal1_type, spufs_signal1_type_get, + spufs_signal1_type_set, "%llu"); + +static void spufs_signal2_type_set(void *data, u64 val) +{ + struct spu_context *ctx = data; + ctx->sig2_type = !!val; +} + +static u64 spufs_signal2_type_get(void *data) +{ + struct spu_context *ctx = data; + return ctx->sig2_type; +} +DEFINE_SIMPLE_ATTRIBUTE(spufs_signal2_type, spufs_signal2_type_get, + spufs_signal2_type_set, "%llu"); + +#define prob_attr(name) \ +static void spufs_ ## name ## _set(void *data, u64 val) \ +{ \ + struct spu_context *ctx = data; \ + out_be32(&ctx->spu->problem->name, val); \ +} \ +static u64 spufs_ ## name ## _get(void *data) \ +{ \ + struct spu_context *ctx = data; \ + return in_be32(&ctx->spu->problem->name); \ +} \ +DEFINE_SIMPLE_ATTRIBUTE(spufs_ ## name, \ + spufs_ ## name ## _get, \ + spufs_ ## name ## _set, "%llx\n") + +#define priv1_attr(name) \ +static void spufs_ ## name ## _set(void *data, u64 val) \ +{ \ + struct spu_context *ctx = data; \ + out_be64(&ctx->spu->priv1->name, val); \ +} \ +static u64 spufs_ ## name ## _get(void *data) \ +{ \ + struct spu_context *ctx = data; \ + 
return in_be64(&ctx->spu->priv1->name); \ +} \ +DEFINE_SIMPLE_ATTRIBUTE(spufs_ ## name, \ + spufs_ ## name ## _get, \ + spufs_ ## name ## _set, "%llx\n") + +#define priv2_attr(name) \ +static void spufs_ ## name ## _set(void *data, u64 val) \ +{ \ + struct spu_context *ctx = data; \ + out_be64(&ctx->spu->priv2->name, val); \ +} \ +static u64 spufs_ ## name ## _get(void *data) \ +{ \ + struct spu_context *ctx = data; \ + return in_be64(&ctx->spu->priv2->name); \ +} \ +DEFINE_SIMPLE_ATTRIBUTE(spufs_ ## name, \ + spufs_ ## name ## _get, \ + spufs_ ## name ## _set, "%llx\n") + +prob_attr(mb_stat_R); + +priv1_attr(int_stat_class0_RW); +priv1_attr(int_stat_class1_RW); +priv1_attr(int_stat_class2_RW); + +priv1_attr(int_mask_class0_RW); +priv1_attr(int_mask_class1_RW); +priv1_attr(int_mask_class2_RW); + +priv1_attr(mfc_sr1_RW); +priv1_attr(mfc_fir_R); +priv1_attr(mfc_fir_status_or_W); +priv1_attr(mfc_fir_status_and_W); +priv1_attr(mfc_fir_mask_R); +priv1_attr(mfc_fir_mask_or_W); +priv1_attr(mfc_fir_mask_and_W); +priv1_attr(mfc_fir_chkstp_enable_RW); +priv1_attr(mfc_cer_R); +priv1_attr(mfc_dsisr_RW); +priv1_attr(mfc_dsir_R); +priv1_attr(mfc_sdr_RW); +priv2_attr(mfc_control_RW); + +struct tree_descr spufs_dir_contents[] = { + { "mem", &spufs_mem_fops, 0644, }, + { "run", &spufs_run_fops, 0400, }, + { "mbox", &spufs_mbox_fops, 0400, }, + { "ibox", &spufs_ibox_fops, 0400, }, + { "wbox", &spufs_wbox_fops, 0200, }, + { "signal1_type", &spufs_signal1_type, 0600, }, + { "signal2_type", &spufs_signal2_type, 0600, }, +#if 1 /* debugging only */ + { "mb_stat", &spufs_mb_stat_R, 0400, }, + { "class0_mask", &spufs_int_mask_class0_RW, 0600, }, + { "class1_mask", &spufs_int_mask_class1_RW, 0600, }, + { "class2_mask", &spufs_int_mask_class2_RW, 0600, }, + { "class0_stat", &spufs_int_stat_class0_RW, 0600, }, + { "class1_stat", &spufs_int_stat_class1_RW, 0600, }, + { "class2_stat", &spufs_int_stat_class2_RW, 0600, }, + { "sr1", &spufs_mfc_sr1_RW, 0600, }, + { "fir", &spufs_mfc_fir_R, 0400, 
}, + { "fir_status_or", &spufs_mfc_fir_status_or_W, 0200, }, + { "fir_status_and", &spufs_mfc_fir_status_and_W, 0200, }, + { "fir_mask", &spufs_mfc_fir_mask_R, 0400, }, + { "fir_mask_or", &spufs_mfc_fir_mask_or_W, 0200, }, + { "fir_mask_and", &spufs_mfc_fir_mask_and_W, 0200, }, + { "fir_chkstp", &spufs_mfc_fir_chkstp_enable_RW, 0600, }, + { "cer", &spufs_mfc_cer_R, 0400, }, + { "dsisr", &spufs_mfc_dsisr_RW, 0600, }, + { "dsir", &spufs_mfc_dsir_R, 0200, }, + { "cntl", &spufs_mfc_control_RW, 0600, }, + { "sdr", &spufs_mfc_sdr_RW, 0600, }, +#endif + {}, +}; --- linux-cg.orig/fs/spufs/inode.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/fs/spufs/inode.c 2005-06-21 22:48:48.775899840 -0400 @@ -0,0 +1,378 @@ +/* + * SPU file system + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "spufs.h" + +/* SPU context abstraction */ +static struct spu_context * +alloc_spu_context(void) +{ + struct spu_context *ctx; + ctx = kmalloc(sizeof *ctx, GFP_KERNEL); + if (!ctx) + goto out; + ctx->spu = spu_alloc(); + if (!ctx->spu) + goto out_free; + init_rwsem(&ctx->backing_sema); + spin_lock_init(&ctx->mmio_lock); + kref_init(&ctx->kref); + goto out; +out_free: + kfree(ctx); + ctx = NULL; +out: + return ctx; +} + +static void +destroy_spu_context(struct kref *kref) +{ + struct spu_context *ctx; + ctx = container_of(kref, struct spu_context, kref); + if (ctx->spu) + spu_free(ctx->spu); + kfree(ctx); +} + +static struct spu_context * +get_spu_context(struct spu_context *ctx) +{ + kref_get(&ctx->kref); + return ctx; +} + +static void +put_spu_context(struct spu_context *ctx) +{ + kref_put(&ctx->kref, &destroy_spu_context); +} + +/* bits in the inode flags */ +enum { + SPUFS_DIRECT, /* Data resides on a physical SPU */ +}; + +static kmem_cache_t *spufs_inode_cache; + +/* Information about the backing dev, same as ramfs */ + +static struct backing_dev_info spufs_backing_dev_info = { + .ra_pages = 0, /* No readahead */ + .capabilities = BDI_CAP_NO_ACCT_DIRTY | BDI_CAP_NO_WRITEBACK | + BDI_CAP_MAP_DIRECT | BDI_CAP_MAP_COPY | BDI_CAP_READ_MAP | + BDI_CAP_WRITE_MAP, +}; + +static struct address_space_operations spufs_aops = { + .readpage = simple_readpage, + .prepare_write = simple_prepare_write, + .commit_write = simple_commit_write, +}; + +/* Inode operations */ + +static struct inode * +spufs_alloc_inode(struct super_block *sb) +{ + struct spufs_inode_info *ei; + + ei = kmem_cache_alloc(spufs_inode_cache, SLAB_KERNEL); + if (!ei) + return NULL; + return &ei->vfs_inode; +} + +static void +spufs_destroy_inode(struct inode *inode) +{ + kmem_cache_free(spufs_inode_cache, SPUFS_I(inode)); +} + +static void +spufs_init_once(void *p, 
kmem_cache_t * cachep, unsigned long flags) +{ + struct spufs_inode_info *ei = p; + + if ((flags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) == + SLAB_CTOR_CONSTRUCTOR) { + inode_init_once(&ei->vfs_inode); + } +} + +static struct inode * +spufs_new_inode(struct super_block *sb, int mode) +{ + struct inode *inode; + + inode = new_inode(sb); + if (!inode) + goto out; + + inode->i_mode = mode; + inode->i_uid = current->fsuid; + inode->i_gid = current->fsgid; + inode->i_blksize = PAGE_CACHE_SIZE; + inode->i_blocks = 0; + inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; +out: + return inode; +} + +static int +spufs_setattr(struct dentry *dentry, struct iattr *attr) +{ + struct inode *inode = dentry->d_inode; + +/* dump_stack(); + printk("ia_size %lld, i_size:%lld\n", attr->ia_size, inode->i_size); +*/ + if (attr->ia_size != inode->i_size) + return -EINVAL; + return inode_setattr(inode, attr); +} + +static void +spufs_delete_inode(struct inode *inode) +{ + if (SPUFS_I(inode)->i_ctx) + put_spu_context(SPUFS_I(inode)->i_ctx); + clear_inode(inode); +} + +static int +spufs_fill_dir(struct dentry *dir, struct tree_descr *files, + int mode, struct spu_context *ctx) +{ + struct inode *inode; + struct dentry *dentry; + int ret; + + static struct inode_operations iops = { + .getattr = simple_getattr, + .setattr = spufs_setattr, + }; + + ret = -ENOSPC; + while (files->name && files->name[0]) { + dentry = d_alloc_name(dir, files->name); + if (!dentry) + goto out; + inode = spufs_new_inode(dir->d_sb, + S_IFREG | (files->mode & mode)); + if (!inode) + goto out; + inode->i_op = &iops; + inode->i_fop = files->ops; + inode->i_mapping->a_ops = &spufs_aops; + inode->i_mapping->backing_dev_info = &spufs_backing_dev_info; + inode->u.generic_ip = + SPUFS_I(inode)->i_ctx = get_spu_context(ctx); + + d_add(dentry, inode); + files++; + } + return 0; +out: + // FIXME: remove all files that are left + return ret; +} + +static int +spufs_mkdir(struct inode *dir, struct dentry 
*dentry, int mode) +{ + int ret; + struct inode *inode; + struct spu_context *ctx; + + ret = -ENOSPC; + inode = spufs_new_inode(dir->i_sb, mode | S_IFDIR); + if (!inode) + goto out; + + if (dir->i_mode & S_ISGID) { + inode->i_gid = dir->i_gid; + inode->i_mode |= S_ISGID; + } + ctx = alloc_spu_context(); + SPUFS_I(inode)->i_ctx = ctx; + if (!ctx) + goto out_iput; + + inode->i_op = &simple_dir_inode_operations; + inode->i_fop = &simple_dir_operations; + ret = spufs_fill_dir(dentry, spufs_dir_contents, mode, ctx); + if (ret) + goto out_free_ctx; + + d_instantiate(dentry, inode); + dget(dentry); + dir->i_nlink++; + goto out; + +out_free_ctx: + put_spu_context(ctx); +out_iput: + iput(inode); +out: + return ret; +} + +/* This looks really wrong! */ +static int spufs_rmdir(struct inode *root, struct dentry *dir_dentry) +{ + struct dentry *dentry; + int err; + + spin_lock(&dcache_lock); + + /* check if any entry is used */ + err = -EBUSY; + list_for_each_entry(dentry, &dir_dentry->d_subdirs, d_child) { + if (d_unhashed(dentry) || !dentry->d_inode) + continue; + if (atomic_read(&dentry->d_count) != 1) + goto out; + } + /* remove all entries */ + err = 0; + list_for_each_entry(dentry, &dir_dentry->d_subdirs, d_child) { + if (d_unhashed(dentry) || !dentry->d_inode) + continue; + atomic_dec(&dentry->d_count); + __d_drop(dentry); + } +out: + spin_unlock(&dcache_lock); + if (!err) { + shrink_dcache_parent(dir_dentry); + err = simple_rmdir(root, dir_dentry); + } + return err; +} + +/* File system initialization */ + +static int +spufs_create_root(struct super_block *sb) { + static struct inode_operations spufs_dir_inode_operations = { + .lookup = simple_lookup, + .mkdir = spufs_mkdir, + .rmdir = spufs_rmdir, +// .rename = simple_rename, // XXX maybe + }; + + struct inode *inode; + int ret; + + ret = -ENOMEM; + inode = spufs_new_inode(sb, S_IFDIR | 0777); + + if (inode) { + inode->i_op = &spufs_dir_inode_operations; + inode->i_fop = &simple_dir_operations; + SPUFS_I(inode)->i_ctx 
= NULL; + sb->s_root = d_alloc_root(inode); + if (!sb->s_root) + iput(inode); + else + ret = 0; + } + return ret; +} + +static int +spufs_fill_super(struct super_block *sb, void *data, int silent) +{ + static struct super_operations s_ops = { + .alloc_inode = spufs_alloc_inode, + .destroy_inode = spufs_destroy_inode, + .statfs = simple_statfs, + .delete_inode = spufs_delete_inode, + .drop_inode = generic_delete_inode, + }; + + sb->s_maxbytes = MAX_LFS_FILESIZE; + sb->s_blocksize = PAGE_CACHE_SIZE; + sb->s_blocksize_bits = PAGE_CACHE_SHIFT; + sb->s_magic = SPUFS_MAGIC; + sb->s_op = &s_ops; + + return spufs_create_root(sb); +} + +static struct super_block * +spufs_get_sb(struct file_system_type *fstype, int flags, + const char *name, void *data) +{ + return get_sb_single(fstype, flags, data, spufs_fill_super); +} + +static struct file_system_type spufs_type = { + .owner = THIS_MODULE, + .name = "spufs", + .get_sb = spufs_get_sb, + .kill_sb = kill_litter_super, +}; + +static int spufs_init(void) +{ + int ret; + ret = -ENOMEM; + spufs_inode_cache = kmem_cache_create("spufs_inode_cache", + sizeof(struct spufs_inode_info), 0, + SLAB_HWCACHE_ALIGN, spufs_init_once, NULL); + + if (!spufs_inode_cache) + goto out; + ret = register_filesystem(&spufs_type); + if (ret) + kmem_cache_destroy(spufs_inode_cache); +out: + return ret; +} +module_init(spufs_init); + +static void spufs_exit(void) +{ + unregister_filesystem(&spufs_type); + kmem_cache_destroy(spufs_inode_cache); +} +module_exit(spufs_exit); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Arnd Bergmann "); + --- linux-cg.orig/fs/spufs/spufs.h 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/fs/spufs/spufs.h 2005-06-21 22:49:22.676938600 -0400 @@ -0,0 +1,51 @@ +/* + * SPU file system + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the 
Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ +#ifndef SPUFS_H +#define SPUFS_H + +#include +#include +#include + +/* The magic number for our file system */ +enum { + SPUFS_MAGIC = 0x23c9b64e, +}; + +struct spu_context { + struct spu *spu; /* pointer to a physical SPU if SPUFS_DIRECT */ + struct rw_semaphore backing_sema; /* protects the above */ + spinlock_t mmio_lock; /* protects mmio access */ + int sig1_type, sig2_type; + + struct kref kref; +}; + +struct spufs_inode_info { + struct spu_context *i_ctx; + struct inode vfs_inode; +}; +#define SPUFS_I(inode) container_of(inode, struct spufs_inode_info, vfs_inode) + +extern struct tree_descr spufs_dir_contents[]; + +#endif --- linux-cg.orig/include/asm-ppc64/spu.h 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/include/asm-ppc64/spu.h 2005-06-21 22:48:48.778899384 -0400 @@ -0,0 +1,468 @@ +/* + * SPU core / file system interface and HW structures + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#ifndef _SPU_H +#define _SPU_H + +#define LS_ORDER (6) /* 256 KB */ + +#define LS_SIZE (PAGE_SIZE << LS_ORDER) + +struct spu { + char *name; + u8 *local_store; + struct spu_problem __iomem *problem; + struct spu_priv1 __iomem *priv1; + struct spu_priv2 __iomem *priv2; + struct list_head list; + int number; + u32 isrc; + u32 node; + struct kref kref; + size_t ls_size; + unsigned int slb_replace; + struct mm_struct *mm; + int class_0_pending; + spinlock_t register_lock; + + u32 stop_code; + wait_queue_head_t stop_wq; + wait_queue_head_t ibox_wq; + wait_queue_head_t wbox_wq; + + char irq_c0[8]; + char irq_c1[8]; + char irq_c2[8]; +}; + +struct spu *spu_alloc(void); +void spu_free(struct spu *spu); +int spu_run(struct spu *spu); + +size_t spu_wbox_write(struct spu *spu, u32 data); +size_t spu_ibox_read(struct spu *spu, u32 *data); + +/* + * This defines the Local Store, Problem Area and Privilege Area of an SPU. 
+ */ + +union MFC_TagSizeClassCmd { + struct { + u16 mfc_size; + u16 mfc_tag; + u8 pad; + u8 mfc_rclassid; + u16 mfc_cmd; + } u; + struct { + u32 mfc_size_tag32; + u32 mfc_class_cmd32; + } by32; + u64 all64; +}; + +struct MFC_cq_sr { + u64 mfc_cq_data0_RW; + u64 mfc_cq_data1_RW; + u64 mfc_cq_data2_RW; + u64 mfc_cq_data3_RW; +}; + +struct spu_problem { + u8 pad_0x0000_0x3000[0x3000 - 0x0000]; /* 0x0000 */ + + /* DMA Area */ + u8 pad_0x3000_0x3004[0x4]; /* 0x3000 */ + u32 mfc_lsa_W; /* 0x3004 */ + u64 mfc_ea_W; /* 0x3008 */ + union MFC_TagSizeClassCmd mfc_union_W; /* 0x3010 */ + u8 pad_0x3018_0x3104[0xec]; /* 0x3018 */ + u32 dma_qstatus_R; /* 0x3104 */ + u8 pad_0x3108_0x3204[0xfc]; /* 0x3108 */ + u32 dma_querytype_RW; /* 0x3204 */ + u8 pad_0x3208_0x321c[0x14]; /* 0x3208 */ + u32 dma_querymask_RW; /* 0x321c */ + u8 pad_0x3220_0x322c[0xc]; /* 0x3220 */ + u32 dma_tagstatus_R; /* 0x322c */ +#define DMA_TAGSTATUS_INTR_ANY 1u +#define DMA_TAGSTATUS_INTR_ALL 2u + u8 pad_0x3230_0x4000[0x4000 - 0x3230]; /* 0x3230 */ + + /* SPU Control Area */ + u8 pad_0x4000_0x4004[0x4]; /* 0x4000 */ + u32 pu_mb_R; /* 0x4004 */ + u8 pad_0x4008_0x400c[0x4]; /* 0x4008 */ + u32 spu_mb_W; /* 0x400c */ + u8 pad_0x4010_0x4014[0x4]; /* 0x4010 */ + u32 mb_stat_R; /* 0x4014 */ + u8 pad_0x4018_0x401c[0x4]; /* 0x4018 */ + u32 spu_runcntl_RW; /* 0x401c */ +#define SPU_RUNCNTL_STOP 0L +#define SPU_RUNCNTL_RUNNABLE 1L + u8 pad_0x4020_0x4024[0x4]; /* 0x4020 */ + u32 spu_status_R; /* 0x4024 */ +#define SPU_STATUS_STOPPED 0x0 +#define SPU_STATUS_RUNNING 0x1 +#define SPU_STATUS_STOPPED_BY_STOP 0x2 +#define SPU_STATUS_STOPPED_BY_HALT 0x4 +#define SPU_STATUS_WAITING_FOR_CHANNEL 0x8 +#define SPU_STATUS_SINGLE_STEP 0x10 + u8 pad_0x4028_0x402c[0x4]; /* 0x4028 */ + u32 spu_spe_R; /* 0x402c */ + u8 pad_0x4030_0x4034[0x4]; /* 0x4030 */ + u32 spu_npc_RW; /* 0x4034 */ + u8 pad_0x4038_0x14000[0x14000 - 0x4038]; /* 0x4038 */ + + /* Signal Notification Area */ + u8 pad_0x14000_0x1400c[0xc]; /* 0x14000 */ + u32 
signal_notify1; /* 0x1400c */ + u8 pad_0x14010_0x1c00c[0x7ffc]; /* 0x14010 */ + u32 signal_notify2; /* 0x1c00c */ +} __attribute__ ((aligned(0x20000))); + +/* SPU Privilege 2 State Area */ +struct spu_priv2 { + /* MFC Registers */ + u8 pad_0x0000_0x1100[0x1100 - 0x0000]; /* 0x0000 */ + + /* SLB Management Registers */ + u8 pad_0x1100_0x1108[0x8]; /* 0x1100 */ + u64 slb_index_W; /* 0x1108 */ +#define SLB_INDEX_MASK 0x7L + u64 slb_esid_RW; /* 0x1110 */ + u64 slb_vsid_RW; /* 0x1118 */ +#define SLB_VSID_SUPERVISOR_STATE (0x1ull << 11) +#define SLB_VSID_SUPERVISOR_STATE_MASK (0x1ull << 11) +#define SLB_VSID_PROBLEM_STATE (0x1ull << 10) +#define SLB_VSID_PROBLEM_STATE_MASK (0x1ull << 10) +#define SLB_VSID_EXECUTE_SEGMENT (0x1ull << 9) +#define SLB_VSID_NO_EXECUTE_SEGMENT (0x1ull << 9) +#define SLB_VSID_EXECUTE_SEGMENT_MASK (0x1ull << 9) +#define SLB_VSID_4K_PAGE (0x0 << 8) +#define SLB_VSID_LARGE_PAGE (0x1ull << 8) +#define SLB_VSID_PAGE_SIZE_MASK (0x1ull << 8) +#define SLB_VSID_CLASS_MASK (0x1ull << 7) +#define SLB_VSID_VIRTUAL_PAGE_SIZE_MASK (0x1ull << 6) + u64 slb_invalidate_entry_W; /* 0x1120 */ + u64 slb_invalidate_all_W; /* 0x1128 */ + u8 pad_0x1130_0x2000[0x2000 - 0x1130]; /* 0x1130 */ + + /* Context Save / Restore Area */ + struct MFC_cq_sr spuq[16]; /* 0x2000 */ + struct MFC_cq_sr puq[8]; /* 0x2200 */ + u8 pad_0x2300_0x3000[0x3000 - 0x2300]; /* 0x2300 */ + + /* MFC Control */ + u64 mfc_control_RW; /* 0x3000 */ +#define MFC_CNTL_RESUME_DMA_QUEUE (0ull << 0) +#define MFC_CNTL_SUSPEND_DMA_QUEUE (1ull << 0) +#define MFC_CNTL_SUSPEND_DMA_QUEUE_MASK (1ull << 0) +#define MFC_CNTL_NORMAL_DMA_QUEUE_OPERATION (0ull << 8) +#define MFC_CNTL_SUSPEND_IN_PROGRESS (1ull << 8) +#define MFC_CNTL_SUSPEND_COMPLETE (3ull << 8) +#define MFC_CNTL_SUSPEND_DMA_STATUS_MASK (3ull << 8) +#define MFC_CNTL_DMA_QUEUES_EMPTY (1ull << 14) +#define MFC_CNTL_DMA_QUEUES_EMPTY_MASK (1ull << 14) +#define MFC_CNTL_PURGE_DMA_REQUEST (1ull << 15) +#define MFC_CNTL_PURGE_DMA_IN_PROGRESS (1ull << 24) 
+#define MFC_CNTL_PURGE_DMA_COMPLETE (3ull << 24) +#define MFC_CNTL_PURGE_DMA_STATUS_MASK (3ull << 24) +#define MFC_CNTL_RESTART_DMA_COMMAND (1ull << 32) +#define MFC_CNTL_DMA_COMMAND_REISSUE_PENDING (1ull << 32) +#define MFC_CNTL_DMA_COMMAND_REISSUE_STATUS_MASK (1ull << 32) +#define MFC_CNTL_MFC_PRIVILEGE_STATE (2ull << 33) +#define MFC_CNTL_MFC_PROBLEM_STATE (3ull << 33) +#define MFC_CNTL_MFC_KEY_PROTECTION_STATE_MASK (3ull << 33) +#define MFC_CNTL_DECREMENTER_HALTED (1ull << 35) +#define MFC_CNTL_DECREMENTER_RUNNING (1ull << 40) +#define MFC_CNTL_DECREMENTER_STATUS_MASK (1ull << 40) + u8 pad_0x3008_0x4000[0x4000 - 0x3008]; /* 0x3008 */ + + /* Interrupt Mailbox */ + u64 puint_mb_R; /* 0x4000 */ + u8 pad_0x4008_0x4040[0x4040 - 0x4008]; /* 0x4008 */ + + /* SPU Control */ + u64 spu_privcntl_RW; /* 0x4040 */ +#define SPU_PRIVCNTL_MODE_NORMAL (0x0ull << 0) +#define SPU_PRIVCNTL_MODE_SINGLE_STEP (0x1ull << 0) +#define SPU_PRIVCNTL_MODE_MASK (0x1ull << 0) +#define SPU_PRIVCNTL_NO_ATTENTION_EVENT (0x0ull << 1) +#define SPU_PRIVCNTL_ATTENTION_EVENT (0x1ull << 1) +#define SPU_PRIVCNTL_ATTENTION_EVENT_MASK (0x1ull << 1) +#define SPU_PRIVCNT_LOAD_REQUEST_NORMAL (0x0ull << 2) +#define SPU_PRIVCNT_LOAD_REQUEST_ENABLE_MASK (0x1ull << 2) + u8 pad_0x4048_0x4058[0x10]; /* 0x4048 */ + u64 spu_lslr_RW; /* 0x4058 */ + u64 spu_chnlcntptr_RW; /* 0x4060 */ + u64 spu_chnlcnt_RW; /* 0x4068 */ + u64 spu_chnldata_RW; /* 0x4070 */ + u64 spu_cfg_RW; /* 0x4078 */ + u8 pad_0x4080_0x5000[0x5000 - 0x4080]; /* 0x4080 */ + + /* PV2_ImplRegs: Implementation-specific privileged-state 2 regs */ + u64 spu_pm_trace_tag_status_RW; /* 0x5000 */ + u64 spu_tag_status_query_RW; /* 0x5008 */ +#define TAG_STATUS_QUERY_CONDITION_BITS (0x3ull << 32) +#define TAG_STATUS_QUERY_MASK_BITS (0xffffffffull) + u64 spu_cmd_buf1_RW; /* 0x5010 */ +#define SPU_COMMAND_BUFFER_1_LSA_BITS (0x7ffffull << 32) +#define SPU_COMMAND_BUFFER_1_EAH_BITS (0xffffffffull) + u64 spu_cmd_buf2_RW; /* 0x5018 */ +#define 
SPU_COMMAND_BUFFER_2_EAL_BITS ((0xffffffffull) << 32) +#define SPU_COMMAND_BUFFER_2_TS_BITS (0xffffull << 16) +#define SPU_COMMAND_BUFFER_2_TAG_BITS (0x3full) + u64 spu_atomic_status_RW; /* 0x5020 */ +} __attribute__ ((aligned(0x20000))); + +/* SPU Privilege 1 State Area */ +struct spu_priv1 { + /* Control and Configuration Area */ + u64 mfc_sr1_RW; /* 0x000 */ +#define MFC_STATE1_LOCAL_STORAGE_DECODE_MASK 0x01ull +#define MFC_STATE1_BUS_TLBIE_MASK 0x02ull +#define MFC_STATE1_REAL_MODE_OFFSET_ENABLE_MASK 0x04ull +#define MFC_STATE1_PROBLEM_STATE_MASK 0x08ull +#define MFC_STATE1_RELOCATE_MASK 0x10ull +#define MFC_STATE1_MASTER_RUN_CONTROL_MASK 0x20ull + u64 mfc_lpid_RW; /* 0x008 */ + u64 spu_idr_RW; /* 0x010 */ + u64 mfc_vr_RO; /* 0x018 */ +#define MFC_VERSION_BITS (0xffff << 16) +#define MFC_REVISION_BITS (0xffff) +#define MFC_GET_VERSION_BITS(vr) (((vr) & MFC_VERSION_BITS) >> 16) +#define MFC_GET_REVISION_BITS(vr) ((vr) & MFC_REVISION_BITS) + u64 spu_vr_RO; /* 0x020 */ +#define SPU_VERSION_BITS (0xffff << 16) +#define SPU_REVISION_BITS (0xffff) +#define SPU_GET_VERSION_BITS(vr) (((vr) & SPU_VERSION_BITS) >> 16) +#define SPU_GET_REVISION_BITS(vr) ((vr) & SPU_REVISION_BITS) + u8 pad_0x28_0x100[0x100 - 0x28]; /* 0x28 */ + + + /* Interrupt Area */ + u64 int_mask_class0_RW; /* 0x100 */ +#define CLASS0_ENABLE_DMA_ALIGNMENT_INTR 0x1L +#define CLASS0_ENABLE_INVALID_DMA_COMMAND_INTR 0x2L +#define CLASS0_ENABLE_SPU_ERROR_INTR 0x4L +#define CLASS0_ENABLE_MFC_FIR_INTR 0x8L + u64 int_mask_class1_RW; /* 0x108 */ +#define CLASS1_ENABLE_SEGMENT_FAULT_INTR 0x1L +#define CLASS1_ENABLE_STORAGE_FAULT_INTR 0x2L +#define CLASS1_ENABLE_LS_COMPARE_SUSPEND_ON_GET_INTR 0x4L +#define CLASS1_ENABLE_LS_COMPARE_SUSPEND_ON_PUT_INTR 0x8L + u64 int_mask_class2_RW; /* 0x110 */ +#define CLASS2_ENABLE_MAILBOX_INTR 0x1L +#define CLASS2_ENABLE_SPU_STOP_INTR 0x2L +#define CLASS2_ENABLE_SPU_HALT_INTR 0x4L +#define CLASS2_ENABLE_SPU_DMA_TAG_GROUP_COMPLETE_INTR 0x8L + u8 pad_0x118_0x140[0x28]; /* 0x118 */ + u64 
int_stat_class0_RW; /* 0x140 */ + u64 int_stat_class1_RW; /* 0x148 */ + u64 int_stat_class2_RW; /* 0x150 */ + u8 pad_0x158_0x180[0x28]; /* 0x158 */ + u64 int_route_RW; /* 0x180 */ + + /* Interrupt Routing */ + u8 pad_0x188_0x200[0x200 - 0x188]; /* 0x188 */ + + /* Atomic Unit Control Area */ + u64 mfc_atomic_flush_RW; /* 0x200 */ +#define mfc_atomic_flush_enable 0x1L + u8 pad_0x208_0x280[0x78]; /* 0x208 */ + u64 resource_allocation_groupID_RW; /* 0x280 */ + u64 resource_allocation_enable_RW; /* 0x288 */ + u8 pad_0x290_0x380[0x380 - 0x290]; /* 0x290 */ + + /* MFC Fault Isolation Area */ + /* mfc_fir_R: MFC Fault Isolation Register. + * mfc_fir_status_or_W: MFC Fault Isolation Status OR Register. + * mfc_fir_status_and_W: MFC Fault Isolation Status AND Register. + * mfc_fir_mask_R: MFC FIR Mask Register. + * mfc_fir_mask_or_W: MFC FIR Mask OR Register. + * mfc_fir_mask_and_W: MFC FIR Mask AND Register. + * mfc_fir_chkstp_enable_W: MFC FIR Checkstop Enable Register. + */ + u64 mfc_fir_R; /* 0x380 */ + u64 mfc_fir_status_or_W; /* 0x388 */ + u64 mfc_fir_status_and_W; /* 0x390 */ + u64 mfc_fir_mask_R; /* 0x398 */ + u64 mfc_fir_mask_or_W; /* 0x3a0 */ + u64 mfc_fir_mask_and_W; /* 0x3a8 */ + u64 mfc_fir_chkstp_enable_RW; /* 0x3b0 */ + u8 pad_0x3b8_0x3c8[0x3c8 - 0x3b8]; /* 0x3b8 */ + + /* SPU_Cache_ImplRegs: Implementation-dependent cache registers */ + + u64 smf_sbi_signal_sel; /* 0x3c8 */ +#define smf_sbi_mask_lsb 56 +#define smf_sbi_shift (63 - smf_sbi_mask_lsb) +#define smf_sbi_mask (0x301LL << smf_sbi_shift) +#define smf_sbi_bus0_bits (0x001LL << smf_sbi_shift) +#define smf_sbi_bus2_bits (0x100LL << smf_sbi_shift) +#define smf_sbi2_bus0_bits (0x201LL << smf_sbi_shift) +#define smf_sbi2_bus2_bits (0x300LL << smf_sbi_shift) + u64 smf_ato_signal_sel; /* 0x3d0 */ +#define smf_ato_mask_lsb 35 +#define smf_ato_shift (63 - smf_ato_mask_lsb) +#define smf_ato_mask (0x3LL << smf_ato_shift) +#define smf_ato_bus0_bits (0x2LL << smf_ato_shift) +#define smf_ato_bus2_bits (0x1LL << 
smf_ato_shift) + u8 pad_0x3d8_0x400[0x400 - 0x3d8]; /* 0x3d8 */ + + /* TLB Management Registers */ + u64 mfc_sdr_RW; /* 0x400 */ + u8 pad_0x408_0x500[0xf8]; /* 0x408 */ + u64 tlb_index_hint_RO; /* 0x500 */ + u64 tlb_index_W; /* 0x508 */ + u64 tlb_vpn_RW; /* 0x510 */ + u64 tlb_rpn_RW; /* 0x518 */ + u8 pad_0x520_0x540[0x20]; /* 0x520 */ + u64 tlb_invalidate_entry_W; /* 0x540 */ + u64 tlb_invalidate_all_W; /* 0x548 */ + u8 pad_0x550_0x580[0x580 - 0x550]; /* 0x550 */ + + /* SPU_MMU_ImplRegs: Implementation-dependent MMU registers */ + u64 smm_hid; /* 0x580 */ +#define PAGE_SIZE_MASK 0xf000000000000000ull +#define PAGE_SIZE_16MB_64KB 0x2000000000000000ull + u8 pad_0x588_0x600[0x600 - 0x588]; /* 0x588 */ + + /* MFC Status/Control Area */ + u64 mfc_accr_RW; /* 0x600 */ +#define MFC_ACCR_EA_ACCESS_GET (1 << 0) +#define MFC_ACCR_EA_ACCESS_PUT (1 << 1) +#define MFC_ACCR_LS_ACCESS_GET (1 << 3) +#define MFC_ACCR_LS_ACCESS_PUT (1 << 4) + u8 pad_0x608_0x610[0x8]; /* 0x608 */ + u64 mfc_dsisr_RW; /* 0x610 */ +#define MFC_DSISR_PTE_NOT_FOUND (1 << 30) +#define MFC_DSISR_ACCESS_DENIED (1 << 27) +#define MFC_DSISR_ATOMIC (1 << 26) +#define MFC_DSISR_ACCESS_PUT (1 << 25) +#define MFC_DSISR_ADDR_MATCH (1 << 22) +#define MFC_DSISR_LS (1 << 17) +#define MFC_DSISR_L (1 << 16) +#define MFC_DSISR_ADDRESS_OVERFLOW (1 << 0) + u8 pad_0x618_0x620[0x8]; /* 0x618 */ + u64 mfc_dar_RW; /* 0x620 */ + u8 pad_0x628_0x700[0x700 - 0x628]; /* 0x628 */ + + /* Replacement Management Table (RMT) Area */ + u64 rmt_index_RW; /* 0x700 */ + u8 pad_0x708_0x710[0x8]; /* 0x708 */ + u64 rmt_data1_RW; /* 0x710 */ + u8 pad_0x718_0x800[0x800 - 0x718]; /* 0x718 */ + + /* Control/Configuration Registers */ + u64 mfc_dsir_R; /* 0x800 */ +#define MFC_DSIR_Q (1 << 31) +#define MFC_DSIR_SPU_QUEUE MFC_DSIR_Q + u64 mfc_lsacr_RW; /* 0x808 */ +#define MFC_LSACR_COMPARE_MASK ((~0ull) << 32) +#define MFC_LSACR_COMPARE_ADDR ((~0ull) >> 32) + u64 mfc_lscrr_R; /* 0x810 */ +#define MFC_LSCRR_Q (1 << 31) +#define MFC_LSCRR_SPU_QUEUE 
MFC_LSCRR_Q +#define MFC_LSCRR_QI_SHIFT 32 +#define MFC_LSCRR_QI_MASK ((~0ull) << MFC_LSCRR_QI_SHIFT) + u8 pad_0x818_0x900[0x900 - 0x818]; /* 0x818 */ + + /* Real Mode Support Registers */ + u64 mfc_rm_boundary; /* 0x900 */ + u8 pad_0x908_0x938[0x30]; /* 0x908 */ + u64 smf_dma_signal_sel; /* 0x938 */ +#define mfc_dma1_mask_lsb 41 +#define mfc_dma1_shift (63 - mfc_dma1_mask_lsb) +#define mfc_dma1_mask (0x3LL << mfc_dma1_shift) +#define mfc_dma1_bits (0x1LL << mfc_dma1_shift) +#define mfc_dma2_mask_lsb 43 +#define mfc_dma2_shift (63 - mfc_dma2_mask_lsb) +#define mfc_dma2_mask (0x3LL << mfc_dma2_shift) +#define mfc_dma2_bits (0x1LL << mfc_dma2_shift) + u8 pad_0x940_0xa38[0xf8]; /* 0x940 */ + u64 smm_signal_sel; /* 0xa38 */ +#define smm_sig_mask_lsb 12 +#define smm_sig_shift (63 - smm_sig_mask_lsb) +#define smm_sig_mask (0x3LL << smm_sig_shift) +#define smm_sig_bus0_bits (0x2LL << smm_sig_shift) +#define smm_sig_bus2_bits (0x1LL << smm_sig_shift) + u8 pad_0xa40_0xc00[0xc00 - 0xa40]; /* 0xa40 */ + + /* DMA Command Error Area */ + u64 mfc_cer_R; /* 0xc00 */ +#define MFC_CER_Q (1 << 31) +#define MFC_CER_SPU_QUEUE MFC_CER_Q + u8 pad_0xc08_0x1000[0x1000 - 0xc08]; /* 0xc08 */ + + /* PV1_ImplRegs: Implementation-dependent privileged-state 1 regs */ + /* DMA Command Error Area */ + u64 spu_ecc_cntl_RW; /* 0x1000 */ +#define SPU_ECC_CNTL_E (1ull << 0ull) +#define SPU_ECC_CNTL_ENABLE SPU_ECC_CNTL_E +#define SPU_ECC_CNTL_DISABLE (~SPU_ECC_CNTL_E & 1L) +#define SPU_ECC_CNTL_S (1ull << 1ull) +#define SPU_ECC_STOP_AFTER_ERROR SPU_ECC_CNTL_S +#define SPU_ECC_CONTINUE_AFTER_ERROR (~SPU_ECC_CNTL_S & 2L) +#define SPU_ECC_CNTL_B (1ull << 2ull) +#define SPU_ECC_BACKGROUND_ENABLE SPU_ECC_CNTL_B +#define SPU_ECC_BACKGROUND_DISABLE (~SPU_ECC_CNTL_B & 4L) +#define SPU_ECC_CNTL_I_SHIFT 3ull +#define SPU_ECC_CNTL_I_MASK (3ull << SPU_ECC_CNTL_I_SHIFT) +#define SPU_ECC_WRITE_ALWAYS (~SPU_ECC_CNTL_I & 12L) +#define SPU_ECC_WRITE_CORRECTABLE (1ull << SPU_ECC_CNTL_I_SHIFT) +#define 
SPU_ECC_WRITE_UNCORRECTABLE (3ull << SPU_ECC_CNTL_I_SHIFT) +#define SPU_ECC_CNTL_D (1ull << 5ull) +#define SPU_ECC_DETECTION_ENABLE SPU_ECC_CNTL_D +#define SPU_ECC_DETECTION_DISABLE (~SPU_ECC_CNTL_D & 32L) + u64 spu_ecc_stat_RW; /* 0x1008 */ +#define SPU_ECC_CORRECTED_ERROR (1ull << 0ul) +#define SPU_ECC_UNCORRECTED_ERROR (1ull << 1ul) +#define SPU_ECC_SCRUB_COMPLETE (1ull << 2ul) +#define SPU_ECC_SCRUB_IN_PROGRESS (1ull << 3ul) +#define SPU_ECC_INSTRUCTION_ERROR (1ull << 4ul) +#define SPU_ECC_DATA_ERROR (1ull << 5ul) +#define SPU_ECC_DMA_ERROR (1ull << 6ul) +#define SPU_ECC_STATUS_CNT_MASK (256ull << 8) + u64 spu_ecc_addr_RW; /* 0x1010 */ + u64 spu_err_mask_RW; /* 0x1018 */ +#define SPU_ERR_ILLEGAL_INSTR (1ull << 0ul) +#define SPU_ERR_ILLEGAL_CHANNEL (1ull << 1ul) + u8 pad_0x1020_0x1028[0x1028 - 0x1020]; /* 0x1020 */ + + /* SPU Debug-Trace Bus (DTB) Selection Registers */ + u64 spu_trig0_sel; /* 0x1028 */ + u64 spu_trig1_sel; /* 0x1030 */ + u64 spu_trig2_sel; /* 0x1038 */ + u64 spu_trig3_sel; /* 0x1040 */ + u64 spu_trace_sel; /* 0x1048 */ +#define spu_trace_sel_mask 0x1f1fLL +#define spu_trace_sel_bus0_bits 0x1000LL +#define spu_trace_sel_bus2_bits 0x0010LL + u64 spu_event0_sel; /* 0x1050 */ + u64 spu_event1_sel; /* 0x1058 */ + u64 spu_event2_sel; /* 0x1060 */ + u64 spu_event3_sel; /* 0x1068 */ + u64 spu_trace_cntl; /* 0x1070 */ +} __attribute__ ((aligned(0x2000))); + +#endif --- linux-cg.orig/mm/memory.c 2005-06-21 22:48:42.154975624 -0400 +++ linux-cg/mm/memory.c 2005-06-21 22:48:48.780899080 -0400 @@ -2201,6 +2201,7 @@ unsigned long vmalloc_to_pfn(void * vmal { return page_to_pfn(vmalloc_to_page(vmalloc_addr)); } +EXPORT_SYMBOL_GPL(handle_mm_fault); EXPORT_SYMBOL(vmalloc_to_pfn); From arnd at arndb.de Wed Jun 22 07:31:59 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 21 Jun 2005 23:31:59 +0200 Subject: [PATCH 11/11] spufs: Use a system call instead of ioctl In-Reply-To: <200506212330.06734.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> 
<200506212328.28929.arnd@arndb.de> <200506212330.06734.arnd@arndb.de> Message-ID: <200506212331.59883.arnd@arndb.de> This patch makes it possible to use a system call instead of an ioctl to run spu code on spufs. This is mostly for review, to see how ugly it gets. I personally don't like the ioctl implementation very much, but haven't come up with the best solution. The system call doesn't appear to be much better than ioctl here. One other solution that I haven't implemented yet would be an interface that returns a struct { __u32 npc; __u32 status; }; with the help of a read system call and uses lseek to update the npc value. That would require some knowledge about status code in the kernel if we want to avoid calling lseek every time an SPU does a callback that returns to the owning thread. Signed-off-by: Arnd Bergmann -- arch/ppc64/kernel/misc.S | 2 fs/spufs/Makefile | 3 + fs/spufs/file.c | 12 +++++ fs/spufs/spu_run.c | 96 +++++++++++++++++++++++++++++++++++++++++++++ fs/spufs/spufs.h | 2 include/asm-ppc/unistd.h | 3 - include/asm-ppc64/unistd.h | 3 - include/linux/syscalls.h | 3 + kernel/sys_ni.c | 1 9 files changed, 122 insertions(+), 3 deletions(-) --- linux-cg.orig/arch/ppc64/kernel/misc.S 2005-06-21 22:48:52.323893000 -0400 +++ linux-cg/arch/ppc64/kernel/misc.S 2005-06-21 22:51:43.412976752 -0400 @@ -956,6 +956,7 @@ _GLOBAL(sys_call_table32) .llong .sys32_request_key .llong .compat_sys_keyctl .llong .compat_sys_waitid + .llong .sys_spu_run .balign 8 _GLOBAL(sys_call_table) @@ -1232,3 +1233,4 @@ _GLOBAL(sys_call_table) .llong .sys_request_key /* 270 */ .llong .sys_keyctl .llong .sys_waitid + .llong .sys_spu_run --- linux-cg.orig/fs/spufs/Makefile 2005-06-21 22:48:52.326892544 -0400 +++ linux-cg/fs/spufs/Makefile 2005-06-21 22:51:43.413976600 -0400 @@ -1,2 +1,5 @@ obj-$(CONFIG_SPU_FS) += spufs.o +syscall-$(CONFIG_SPU_FS) += spu_run.o + +obj-y += $(syscall-y) $(syscall-m) spufs-y += inode.o file.o --- linux-cg.orig/fs/spufs/file.c 2005-06-21 22:50:20.599920208 
-0400 +++ linux-cg/fs/spufs/file.c 2005-06-21 22:51:56.088914672 -0400 @@ -343,6 +343,16 @@ out: return ret; } +static int spufs_run_open(struct inode *inode, struct file *file) +{ + struct spufs_inode_info *i = SPUFS_I(inode); + file->private_data = i->i_ctx; + + i->i_spu_run = spufs_run_spu; + + return nonseekable_open(inode, file); +} + struct spufs_run_arg { u32 npc; /* inout: Next Program Counter */ u32 status; /* out: SPU status */ @@ -371,7 +381,7 @@ static long spufs_run_ioctl(struct file } static struct file_operations spufs_run_fops = { - .open = spufs_pipe_open, + .open = spufs_run_open, .unlocked_ioctl = spufs_run_ioctl, .compat_ioctl = spufs_run_ioctl, }; --- linux-cg.orig/fs/spufs/spu_run.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/fs/spufs/spu_run.c 2005-06-21 22:51:43.414976448 -0400 @@ -0,0 +1,96 @@ +/* + * SPU file system -- run system call + * + * (C) Copyright IBM Deutschland Entwicklung GmbH 2005 + * + * Author: Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#include +#include +#include +#include +#include + +#include + +#include "spufs.h" + +/** + * sys_spu_run - run code loaded into an SPU + * + * @unpc: next program counter for the SPU + * @ustatus: status of the SPU + * + * This system call transfers the control of execution of a + * user space thread to an SPU. 
It will return when the + * SPU has finished executing or when it hits an error + * condition and it will be interrupted if a signal needs + * to be delivered to a handler in user space. + * + * The next program counter is set to the passed value + * before the SPU starts fetching code and the user space + * pointer gets updated with the new value when returning + * from kernel space. + * + * The status value returned from spu_run reflects the + * value of the spu_status register after the SPU has stopped. + * + * The function must get linked into the kernel, even if spufs + * itself is built as a module, so we can use the pointer in the + * system call table. + */ +long sys_spu_run(int fd, __u32 __user *unpc, __u32 __user *ustatus) +{ + struct file *filp; + struct spufs_inode_info *i; + long ret; + u32 npc, status; + int fput_needed; + + ret = -EBADF; + filp = fget_light(fd, &fput_needed); + if (!filp) + goto out; + + ret = -EFAULT; + if (get_user(npc, unpc) || get_user(status, ustatus)) + goto out_fput; + + ret = -EINVAL; + if (filp->f_vfsmnt->mnt_sb->s_magic != SPUFS_MAGIC) + goto out_fput; + + i = SPUFS_I(filp->f_dentry->d_inode); + /* + * In order to call the underlying spu_run operation, we have the + * function pointer as part of our inode info. This is anything but + * nice, but it helps to avoid registering a global function pointer + * at module load time, which would be even worse imho.
+ */ + if (!i->i_spu_run) + goto out_fput; + ret = i->i_spu_run(filp, i->i_ctx, &npc, &status); + + if (put_user(npc, unpc)) + ret = -EFAULT; + +out_fput: + fput_light(filp, fput_needed); +out: + return ret; +} --- linux-cg.orig/fs/spufs/spufs.h 2005-06-21 22:49:22.676938600 -0400 +++ linux-cg/fs/spufs/spufs.h 2005-06-21 22:51:56.798938904 -0400 @@ -42,6 +42,8 @@ struct spu_context { struct spufs_inode_info { struct spu_context *i_ctx; + long (*i_spu_run)(struct file *filp, struct spu_context *ctx, + u32 *npc, u32 *result); struct inode vfs_inode; }; #define SPUFS_I(inode) container_of(inode, struct spufs_inode_info, vfs_inode) --- linux-cg.orig/include/asm-ppc/unistd.h 2005-06-21 22:48:52.330891936 -0400 +++ linux-cg/include/asm-ppc/unistd.h 2005-06-21 22:51:43.414976448 -0400 @@ -277,8 +277,9 @@ #define __NR_request_key 270 #define __NR_keyctl 271 #define __NR_waitid 272 +#define __NR_spu_run 273 -#define __NR_syscalls 273 +#define __NR_syscalls 274 #define __NR(n) #n --- linux-cg.orig/include/asm-ppc64/unistd.h 2005-06-21 22:48:52.332891632 -0400 +++ linux-cg/include/asm-ppc64/unistd.h 2005-06-21 22:51:43.415976296 -0400 @@ -283,8 +283,9 @@ #define __NR_request_key 270 #define __NR_keyctl 271 #define __NR_waitid 272 +#define __NR_spu_run 273 -#define __NR_syscalls 273 +#define __NR_syscalls 274 #ifdef __KERNEL__ #define NR_syscalls __NR_syscalls #endif --- linux-cg.orig/include/linux/syscalls.h 2005-06-21 22:48:52.335891176 -0400 +++ linux-cg/include/linux/syscalls.h 2005-06-21 22:51:43.416976144 -0400 @@ -505,4 +505,7 @@ asmlinkage long sys_request_key(const ch asmlinkage long sys_keyctl(int cmd, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5); +asmlinkage long sys_spu_run(int fd, __u32 __user *unpc, + __u32 __user *ustatus); + #endif --- linux-cg.orig/kernel/sys_ni.c 2005-06-21 22:48:52.337890872 -0400 +++ linux-cg/kernel/sys_ni.c 2005-06-21 22:51:43.417975992 -0400 @@ -85,3 +85,4 @@ cond_syscall(sys_pciconfig_iobase); 
cond_syscall(sys32_ipc); cond_syscall(sys32_sysctl); cond_syscall(ppc_rtas); +cond_syscall(sys_spu_run); From hollis at penguinppc.org Wed Jun 22 09:51:25 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Tue, 21 Jun 2005 18:51:25 -0500 Subject: [PATCH 7/11] ppc64: add BPA platform type In-Reply-To: <200506212324.19713.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212320.05799.arnd@arndb.de> <200506212322.36453.arnd@arndb.de> <200506212324.19713.arnd@arndb.de> Message-ID: On Jun 21, 2005, at 4:24 PM, Arnd Bergmann wrote: > +static void __init bpa_setup_arch(void) > +{ > ... > + // bpa_nvram_init(); > +} I didn't look closely, but I didn't see this called elsewhere... so probably shouldn't be commented out here? -Hollis From hollis at penguinppc.org Wed Jun 22 10:21:09 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Tue, 21 Jun 2005 19:21:09 -0500 Subject: [PATCH 10/11] ppc64: SPU file system In-Reply-To: <200506212334.44066.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212326.18205.arnd@arndb.de> <200506212328.28929.arnd@arndb.de> <200506212334.44066.arnd@arndb.de> Message-ID: On Jun 21, 2005, at 4:34 PM, Arnd Bergmann wrote: > +union MFC_TagSizeClassCmd { I think great effort has gone in to removing so-called "StudlyCaps" from the ppc64 iSeries code... :) Also, I didn't see "MFC" defined anywhere... it's sort of a pet peeve, but could you make sure all your acronyms are defined? Most of them are described in spu.h, but a few slipped through I think (like "SMF"). And while a comment at the top of every file is great, ones like this: > +/* > + * Low-level SPU handling > + * might be more helpful if they defined SPU and further mentioned it's the coprocessor in the Broadband Processor Architecture... 
-Hollis From trini at kernel.crashing.org Wed Jun 22 09:41:51 2005 From: trini at kernel.crashing.org (Tom Rini) Date: Tue, 21 Jun 2005 16:41:51 -0700 Subject: Discuss: Adding OF Flat Dev Tree to ppc32 In-Reply-To: <1118199997.6850.106.camel@gaston> References: <1117614390.19020.24.camel@gaston> <1117614484.19020.27.camel@gaston> <1117783104.31082.151.camel@gaston> <1117819176.6517.290.camel@cashmere.sps.mot.com> <1118199997.6850.106.camel@gaston> Message-ID: <20050621234151.GB15203@smtp.west.cox.net> On Wed, Jun 08, 2005 at 01:06:37PM +1000, Benjamin Herrenschmidt wrote: [snip] > Regarding code in arch/ppc*, I'm not sure what the right approach would > be. I'd say first copy things around, and we'll see what we end up with. How about we just do: obj-y += ../../ppc64/kernel/flat_tree.o or so like x86_64 does for a handful of things? -- Tom Rini http://gate.crashing.org/~trini/ From benh at kernel.crashing.org Wed Jun 22 10:31:58 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 22 Jun 2005 10:31:58 +1000 Subject: Proposal for reorg of kernel directory In-Reply-To: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> Message-ID: <1119400318.18247.190.camel@gaston> > To prevent bloating the kernel directory, I'd like to propose a > reorganization of the ppc64 tree to look more like the ppc tree. > This includes the creation of "platforms" and "syslib" directories > that would contain platform-specific code and non-platform-specific > system code, respectively. I'm not a fan at all of kernel vs. syslib. Even on ppc32, and years after the split, I still keep trying to get at files in the wrong directory ;) platforms is ok. Ben.
From benh at kernel.crashing.org Wed Jun 22 10:32:37 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 22 Jun 2005 10:32:37 +1000 Subject: Proposal for reorg of kernel directory In-Reply-To: <20050621195254.GA9995@austin.ibm.com> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <20050621195254.GA9995@austin.ibm.com> Message-ID: <1119400358.18247.192.camel@gaston> On Tue, 2005-06-21 at 14:52 -0500, Linas Vepstas wrote: > On Tue, Jun 21, 2005 at 02:02:11PM -0500, Becky Bruce was heard to remark: > > > > Kernel directory: > [...] > > syslib/ > > What's the conceptual difference between kernel and syslib? > > > u3_iommu.c > > Belongs in the pmac directory, I believe. No. Any board using U3 or IBM CPC925 uses that. That includes js20 BM, Maple, and possibly others. Ben. From sfr at canb.auug.org.au Wed Jun 22 10:37:21 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 22 Jun 2005 10:37:21 +1000 Subject: Proposal for reorg of kernel directory In-Reply-To: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> Message-ID: <20050622103721.452140c9.sfr@canb.auug.org.au> On Tue, 21 Jun 2005 14:02:11 -0500 Becky Bruce wrote: > > viopath.c (? - not clear on this one) iSeries only. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/
From benh at kernel.crashing.org Wed Jun 22 10:36:07 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 22 Jun 2005 10:36:07 +1000 Subject: Proposal for reorg of kernel directory In-Reply-To: <200506212125.04138.arnd@arndb.de> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <200506212125.04138.arnd@arndb.de> Message-ID: <1119400568.18247.197.camel@gaston> On Tue, 2005-06-21 at 21:25 +0200, Arnd Bergmann wrote: > On Tuesday 21 June 2005 21:02, Becky Bruce wrote: > > > We've recently begun work on a port of the 64-bit kernel to a > > Freescale part, and noticed that all the platform-specific code is > > currently in the kernel directory. With new 64-bit parts in the > > works, we expect the number of supported platforms to increase > > significantly and to include more embedded systems. > > Hmm, at least I'd hope not to need a new platform type for every > piece of hardware, so there would not be too many of these. I want to avoid an explosion of platform numbers too. Paul, David and I have been discussing that a bit already. The idea would be to make the functions for accessing the flattened device-tree early during boot available to platform code during ppc_md.probe(). That way, in most cases, the need for a platform number goes away. There is some work on the interrupt management and other bits & pieces, but that's mostly only for pSeries and pmac. The idea is to replace that number with the type of HV interface if any, and a "major" platform number to differentiate things like iSeries. > I would like to see platform types like 'everything with 64 bit > Freescale CPUs running on SLOF' and maybe another platform type > for the same CPU with a flat device tree if that differs a lot. I have no problem having as many xxxxx_setup.c files as there are boards around.
We need to give them at least that flexibility :) Provided we don't have to allocate platform number codes for every one of them. Ben. From michael at ellerman.id.au Wed Jun 22 14:37:28 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 22 Jun 2005 14:37:28 +1000 Subject: Spread lpevents by default Message-ID: <200506221437.29590.michael@ellerman.id.au> Hi, Anton mentioned the other day that the iSeries code should spread lpevents by default, rather than requiring the spread_lpevents command line option. The following two patches implement that. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From michael at ellerman.id.au Wed Jun 22 14:55:08 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 22 Jun 2005 14:55:08 +1000 Subject: [PATCH] ppc64: Reorganise paca initialisation macros In-Reply-To: <200506221437.29590.michael@ellerman.id.au> References: <200506221437.29590.michael@ellerman.id.au> Message-ID: <200506221455.08741.michael@ellerman.id.au> This patch reorganises the macros that initialise the paca array with the aim of making the split between iSeries and non-iSeries cleaner.
Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/pacaData.c | 308 ++++++++++++++++++++++--------------------- 1 files changed, 161 insertions(+), 147 deletions(-) Index: work/arch/ppc64/kernel/pacaData.c =================================================================== --- work.orig/arch/ppc64/kernel/pacaData.c +++ work/arch/ppc64/kernel/pacaData.c @@ -42,21 +42,7 @@ extern unsigned long __toc_start; * processors. The processor VPD array needs one entry per physical * processor (not thread). */ -#ifdef CONFIG_PPC_ISERIES -#define EXTRA_INITS(number, lpq) \ - .lppaca_ptr = &paca[number].lppaca, \ - .lpqueue_ptr = (lpq), /* &xItLpQueue, */ \ - .reg_save_ptr = &paca[number].reg_save, \ - .reg_save = { \ - .xDesc = 0xd397d9e2, /* "LpRS" */ \ - .xSize = sizeof(struct ItLpRegSave) \ - }, -#else -#define EXTRA_INITS(number, lpq) -#endif - -#define PACAINITDATA(number,start,lpq,asrr,asrv) \ -{ \ +#define PACA_INIT_COMMON(number, start, asrr, asrv) \ .lock_token = 0x8000, \ .paca_index = (number), /* Paca Index */ \ .default_decr = 0x00ff0000, /* Initial Decr */ \ @@ -74,147 +60,175 @@ extern unsigned long __toc_start; .end_of_quantum = 0xfffffffffffffffful, \ .slb_count = 64, \ }, \ - EXTRA_INITS((number), (lpq)) \ -} -struct paca_struct paca[] = { #ifdef CONFIG_PPC_ISERIES - PACAINITDATA( 0, 1, &xItLpQueue, 0, STAB0_VIRT_ADDR), +#define PACA_INIT_ISERIES(number, lpq) \ + .lppaca_ptr = &paca[number].lppaca, \ + .lpqueue_ptr = (lpq), /* &xItLpQueue, */ \ + .reg_save_ptr = &paca[number].reg_save, \ + .reg_save = { \ + .xDesc = 0xd397d9e2, /* "LpRS" */ \ + .xSize = sizeof(struct ItLpRegSave) \ + } + +#define PACAINITDATA(number) \ +{ \ + PACA_INIT_COMMON(number, 0, 0, 0) \ + PACA_INIT_ISERIES(number, NULL) \ +} + +#define BOOTCPU_PACAINITDATA(number) \ +{ \ + PACA_INIT_COMMON(number, 1, 0, STAB0_VIRT_ADDR) \ + PACA_INIT_ISERIES(number, &xItLpQueue) \ +} + #else - PACAINITDATA( 0, 1, NULL, STAB0_PHYS_ADDR, STAB0_VIRT_ADDR), +#define PACAINITDATA(number) \ +{ \ + 
PACA_INIT_COMMON(number, 0, 0, 0) \ +} + +#define BOOTCPU_PACAINITDATA(number) \ +{ \ + PACA_INIT_COMMON(number, 1, STAB0_PHYS_ADDR, STAB0_VIRT_ADDR) \ +} #endif + +struct paca_struct paca[] = { + BOOTCPU_PACAINITDATA(0), #if NR_CPUS > 1 - PACAINITDATA( 1, 0, NULL, 0, 0), - PACAINITDATA( 2, 0, NULL, 0, 0), - PACAINITDATA( 3, 0, NULL, 0, 0), + PACAINITDATA(1), + PACAINITDATA(2), + PACAINITDATA(3), #if NR_CPUS > 4 - PACAINITDATA( 4, 0, NULL, 0, 0), - PACAINITDATA( 5, 0, NULL, 0, 0), - PACAINITDATA( 6, 0, NULL, 0, 0), - PACAINITDATA( 7, 0, NULL, 0, 0), + PACAINITDATA(4), + PACAINITDATA(5), + PACAINITDATA(6), + PACAINITDATA(7), #if NR_CPUS > 8 - PACAINITDATA( 8, 0, NULL, 0, 0), - PACAINITDATA( 9, 0, NULL, 0, 0), - PACAINITDATA(10, 0, NULL, 0, 0), - PACAINITDATA(11, 0, NULL, 0, 0), - PACAINITDATA(12, 0, NULL, 0, 0), - PACAINITDATA(13, 0, NULL, 0, 0), - PACAINITDATA(14, 0, NULL, 0, 0), - PACAINITDATA(15, 0, NULL, 0, 0), - PACAINITDATA(16, 0, NULL, 0, 0), - PACAINITDATA(17, 0, NULL, 0, 0), - PACAINITDATA(18, 0, NULL, 0, 0), - PACAINITDATA(19, 0, NULL, 0, 0), - PACAINITDATA(20, 0, NULL, 0, 0), - PACAINITDATA(21, 0, NULL, 0, 0), - PACAINITDATA(22, 0, NULL, 0, 0), - PACAINITDATA(23, 0, NULL, 0, 0), - PACAINITDATA(24, 0, NULL, 0, 0), - PACAINITDATA(25, 0, NULL, 0, 0), - PACAINITDATA(26, 0, NULL, 0, 0), - PACAINITDATA(27, 0, NULL, 0, 0), - PACAINITDATA(28, 0, NULL, 0, 0), - PACAINITDATA(29, 0, NULL, 0, 0), - PACAINITDATA(30, 0, NULL, 0, 0), - PACAINITDATA(31, 0, NULL, 0, 0), + PACAINITDATA(8), + PACAINITDATA(9), + PACAINITDATA(10), + PACAINITDATA(11), + PACAINITDATA(12), + PACAINITDATA(13), + PACAINITDATA(14), + PACAINITDATA(15), + PACAINITDATA(16), + PACAINITDATA(17), + PACAINITDATA(18), + PACAINITDATA(19), + PACAINITDATA(20), + PACAINITDATA(21), + PACAINITDATA(22), + PACAINITDATA(23), + PACAINITDATA(24), + PACAINITDATA(25), + PACAINITDATA(26), + PACAINITDATA(27), + PACAINITDATA(28), + PACAINITDATA(29), + PACAINITDATA(30), + PACAINITDATA(31), #if NR_CPUS > 32 - 
PACAINITDATA(32, 0, NULL, 0, 0), - PACAINITDATA(33, 0, NULL, 0, 0), - PACAINITDATA(34, 0, NULL, 0, 0), - PACAINITDATA(35, 0, NULL, 0, 0), - PACAINITDATA(36, 0, NULL, 0, 0), - PACAINITDATA(37, 0, NULL, 0, 0), - PACAINITDATA(38, 0, NULL, 0, 0), - PACAINITDATA(39, 0, NULL, 0, 0), - PACAINITDATA(40, 0, NULL, 0, 0), - PACAINITDATA(41, 0, NULL, 0, 0), - PACAINITDATA(42, 0, NULL, 0, 0), - PACAINITDATA(43, 0, NULL, 0, 0), - PACAINITDATA(44, 0, NULL, 0, 0), - PACAINITDATA(45, 0, NULL, 0, 0), - PACAINITDATA(46, 0, NULL, 0, 0), - PACAINITDATA(47, 0, NULL, 0, 0), - PACAINITDATA(48, 0, NULL, 0, 0), - PACAINITDATA(49, 0, NULL, 0, 0), - PACAINITDATA(50, 0, NULL, 0, 0), - PACAINITDATA(51, 0, NULL, 0, 0), - PACAINITDATA(52, 0, NULL, 0, 0), - PACAINITDATA(53, 0, NULL, 0, 0), - PACAINITDATA(54, 0, NULL, 0, 0), - PACAINITDATA(55, 0, NULL, 0, 0), - PACAINITDATA(56, 0, NULL, 0, 0), - PACAINITDATA(57, 0, NULL, 0, 0), - PACAINITDATA(58, 0, NULL, 0, 0), - PACAINITDATA(59, 0, NULL, 0, 0), - PACAINITDATA(60, 0, NULL, 0, 0), - PACAINITDATA(61, 0, NULL, 0, 0), - PACAINITDATA(62, 0, NULL, 0, 0), - PACAINITDATA(63, 0, NULL, 0, 0), + PACAINITDATA(32), + PACAINITDATA(33), + PACAINITDATA(34), + PACAINITDATA(35), + PACAINITDATA(36), + PACAINITDATA(37), + PACAINITDATA(38), + PACAINITDATA(39), + PACAINITDATA(40), + PACAINITDATA(41), + PACAINITDATA(42), + PACAINITDATA(43), + PACAINITDATA(44), + PACAINITDATA(45), + PACAINITDATA(46), + PACAINITDATA(47), + PACAINITDATA(48), + PACAINITDATA(49), + PACAINITDATA(50), + PACAINITDATA(51), + PACAINITDATA(52), + PACAINITDATA(53), + PACAINITDATA(54), + PACAINITDATA(55), + PACAINITDATA(56), + PACAINITDATA(57), + PACAINITDATA(58), + PACAINITDATA(59), + PACAINITDATA(60), + PACAINITDATA(61), + PACAINITDATA(62), + PACAINITDATA(63), #if NR_CPUS > 64 - PACAINITDATA(64, 0, NULL, 0, 0), - PACAINITDATA(65, 0, NULL, 0, 0), - PACAINITDATA(66, 0, NULL, 0, 0), - PACAINITDATA(67, 0, NULL, 0, 0), - PACAINITDATA(68, 0, NULL, 0, 0), - PACAINITDATA(69, 0, NULL, 0, 0), - 
PACAINITDATA(70, 0, NULL, 0, 0), - PACAINITDATA(71, 0, NULL, 0, 0), - PACAINITDATA(72, 0, NULL, 0, 0), - PACAINITDATA(73, 0, NULL, 0, 0), - PACAINITDATA(74, 0, NULL, 0, 0), - PACAINITDATA(75, 0, NULL, 0, 0), - PACAINITDATA(76, 0, NULL, 0, 0), - PACAINITDATA(77, 0, NULL, 0, 0), - PACAINITDATA(78, 0, NULL, 0, 0), - PACAINITDATA(79, 0, NULL, 0, 0), - PACAINITDATA(80, 0, NULL, 0, 0), - PACAINITDATA(81, 0, NULL, 0, 0), - PACAINITDATA(82, 0, NULL, 0, 0), - PACAINITDATA(83, 0, NULL, 0, 0), - PACAINITDATA(84, 0, NULL, 0, 0), - PACAINITDATA(85, 0, NULL, 0, 0), - PACAINITDATA(86, 0, NULL, 0, 0), - PACAINITDATA(87, 0, NULL, 0, 0), - PACAINITDATA(88, 0, NULL, 0, 0), - PACAINITDATA(89, 0, NULL, 0, 0), - PACAINITDATA(90, 0, NULL, 0, 0), - PACAINITDATA(91, 0, NULL, 0, 0), - PACAINITDATA(92, 0, NULL, 0, 0), - PACAINITDATA(93, 0, NULL, 0, 0), - PACAINITDATA(94, 0, NULL, 0, 0), - PACAINITDATA(95, 0, NULL, 0, 0), - PACAINITDATA(96, 0, NULL, 0, 0), - PACAINITDATA(97, 0, NULL, 0, 0), - PACAINITDATA(98, 0, NULL, 0, 0), - PACAINITDATA(99, 0, NULL, 0, 0), - PACAINITDATA(100, 0, NULL, 0, 0), - PACAINITDATA(101, 0, NULL, 0, 0), - PACAINITDATA(102, 0, NULL, 0, 0), - PACAINITDATA(103, 0, NULL, 0, 0), - PACAINITDATA(104, 0, NULL, 0, 0), - PACAINITDATA(105, 0, NULL, 0, 0), - PACAINITDATA(106, 0, NULL, 0, 0), - PACAINITDATA(107, 0, NULL, 0, 0), - PACAINITDATA(108, 0, NULL, 0, 0), - PACAINITDATA(109, 0, NULL, 0, 0), - PACAINITDATA(110, 0, NULL, 0, 0), - PACAINITDATA(111, 0, NULL, 0, 0), - PACAINITDATA(112, 0, NULL, 0, 0), - PACAINITDATA(113, 0, NULL, 0, 0), - PACAINITDATA(114, 0, NULL, 0, 0), - PACAINITDATA(115, 0, NULL, 0, 0), - PACAINITDATA(116, 0, NULL, 0, 0), - PACAINITDATA(117, 0, NULL, 0, 0), - PACAINITDATA(118, 0, NULL, 0, 0), - PACAINITDATA(119, 0, NULL, 0, 0), - PACAINITDATA(120, 0, NULL, 0, 0), - PACAINITDATA(121, 0, NULL, 0, 0), - PACAINITDATA(122, 0, NULL, 0, 0), - PACAINITDATA(123, 0, NULL, 0, 0), - PACAINITDATA(124, 0, NULL, 0, 0), - PACAINITDATA(125, 0, NULL, 0, 0), - 
PACAINITDATA(126, 0, NULL, 0, 0), - PACAINITDATA(127, 0, NULL, 0, 0), + PACAINITDATA(64), + PACAINITDATA(65), + PACAINITDATA(66), + PACAINITDATA(67), + PACAINITDATA(68), + PACAINITDATA(69), + PACAINITDATA(70), + PACAINITDATA(71), + PACAINITDATA(72), + PACAINITDATA(73), + PACAINITDATA(74), + PACAINITDATA(75), + PACAINITDATA(76), + PACAINITDATA(77), + PACAINITDATA(78), + PACAINITDATA(79), + PACAINITDATA(80), + PACAINITDATA(81), + PACAINITDATA(82), + PACAINITDATA(83), + PACAINITDATA(84), + PACAINITDATA(85), + PACAINITDATA(86), + PACAINITDATA(87), + PACAINITDATA(88), + PACAINITDATA(89), + PACAINITDATA(90), + PACAINITDATA(91), + PACAINITDATA(92), + PACAINITDATA(93), + PACAINITDATA(94), + PACAINITDATA(95), + PACAINITDATA(96), + PACAINITDATA(97), + PACAINITDATA(98), + PACAINITDATA(99), + PACAINITDATA(100), + PACAINITDATA(101), + PACAINITDATA(102), + PACAINITDATA(103), + PACAINITDATA(104), + PACAINITDATA(105), + PACAINITDATA(106), + PACAINITDATA(107), + PACAINITDATA(108), + PACAINITDATA(109), + PACAINITDATA(110), + PACAINITDATA(111), + PACAINITDATA(112), + PACAINITDATA(113), + PACAINITDATA(114), + PACAINITDATA(115), + PACAINITDATA(116), + PACAINITDATA(117), + PACAINITDATA(118), + PACAINITDATA(119), + PACAINITDATA(120), + PACAINITDATA(121), + PACAINITDATA(122), + PACAINITDATA(123), + PACAINITDATA(124), + PACAINITDATA(125), + PACAINITDATA(126), + PACAINITDATA(127), #endif #endif #endif From michael at ellerman.id.au Wed Jun 22 14:58:06 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 22 Jun 2005 14:58:06 +1000 Subject: [PATCH] ppc64: Spread lpevents by default on iSeries In-Reply-To: <200506221437.29590.michael@ellerman.id.au> References: <200506221437.29590.michael@ellerman.id.au> Message-ID: <200506221458.07107.michael@ellerman.id.au> This patch changes the macros that initialise the paca array on iSeries, such that every CPU processes lpevents by default. 
This makes the default behaviour equivalent to specifying "spread_lpevents=n" on the kernel command line, where n = number of cpus. The spread_lpevents command line option remains in case people want the old behaviour. Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/iSeries_setup.c | 3 +++ arch/ppc64/kernel/pacaData.c | 8 ++++---- 2 files changed, 7 insertions(+), 4 deletions(-) Index: work/arch/ppc64/kernel/pacaData.c =================================================================== --- work.orig/arch/ppc64/kernel/pacaData.c +++ work/arch/ppc64/kernel/pacaData.c @@ -62,9 +62,9 @@ extern unsigned long __toc_start; }, \ #ifdef CONFIG_PPC_ISERIES -#define PACA_INIT_ISERIES(number, lpq) \ +#define PACA_INIT_ISERIES(number) \ .lppaca_ptr = &paca[number].lppaca, \ - .lpqueue_ptr = (lpq), /* &xItLpQueue, */ \ + .lpqueue_ptr = &xItLpQueue, \ .reg_save_ptr = &paca[number].reg_save, \ .reg_save = { \ .xDesc = 0xd397d9e2, /* "LpRS" */ \ @@ -74,13 +74,13 @@ extern unsigned long __toc_start; #define PACAINITDATA(number) \ { \ PACA_INIT_COMMON(number, 0, 0, 0) \ - PACA_INIT_ISERIES(number, NULL) \ + PACA_INIT_ISERIES(number) \ } #define BOOTCPU_PACAINITDATA(number) \ { \ PACA_INIT_COMMON(number, 1, 0, STAB0_VIRT_ADDR) \ - PACA_INIT_ISERIES(number, &xItLpQueue) \ + PACA_INIT_ISERIES(number) \ } #else Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -871,6 +871,9 @@ static int set_spread_lpevents(char *str for (i = 1; i < val; ++i) paca[i].lpqueue_ptr = paca[0].lpqueue_ptr; + for (; i < NR_CPUS; ++i) + paca[i].lpqueue_ptr = NULL; + printk("lpevent processing spread over %ld processors\n", val); } else { printk("invalid spread_lpevents %ld\n", val); From david at gibson.dropbear.id.au Wed Jun 22 15:08:55 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 22 Jun 2005 15:08:55 +1000 Subject: [PATCH] 
ppc64: Spread lpevents by default on iSeries In-Reply-To: <200506221458.07107.michael@ellerman.id.au> References: <200506221437.29590.michael@ellerman.id.au> <200506221458.07107.michael@ellerman.id.au> Message-ID: <20050622050855.GI12646@localhost.localdomain> On Wed, Jun 22, 2005 at 02:58:06PM +1000, Michael Ellerman wrote: > This patch changes the macros that initialise the paca array on iSeries, > such that every CPU processes lpevents by default. This makes the default > behaviour equivalent to specifying "spread_lpevents=n" on the kernel > command line, where n = number of cpus. > > The spread_lpevents command line option remains in case people want the > old behaviour. I think a better way of accomplishing this would be to actually remove the lpqueue field from the paca, and reference xItLpQueue directly where we use it. We can check the cpu number explicitly against a limit if we want to avoid spreading lp events. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From michael at ellerman.id.au Wed Jun 22 15:42:37 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 22 Jun 2005 15:42:37 +1000 Subject: [PATCH] ppc64: Spread lpevents by default on iSeries In-Reply-To: <20050622050855.GI12646@localhost.localdomain> References: <200506221437.29590.michael@ellerman.id.au> <200506221458.07107.michael@ellerman.id.au> <20050622050855.GI12646@localhost.localdomain> Message-ID: <200506221542.41582.michael@ellerman.id.au> On Wed, 22 Jun 2005 15:08, David Gibson wrote: > On Wed, Jun 22, 2005 at 02:58:06PM +1000, Michael Ellerman wrote: > > This patch changes the macros that initialise the paca array on iSeries, > > such that every CPU processes lpevents by default. This makes the default > > behaviour equivalent to specifying "spread_lpevents=n" on the kernel > > command line, where n = number of cpus. 
> > > > The spread_lpevents command line option remains in case people want the > > old behaviour. > > I think a better way of accomplishing this would be to actually remove > the lpqueue field from the paca, and reference xItLpQueue directly > where we use it. We can check the cpu number explicitly against a > limit if we want to avoid spreading lp events. Sounds reasonable, I'll have a look at the code. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050622/286361f5/attachment.pgp From arnd at arndb.de Wed Jun 22 18:34:37 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Wed, 22 Jun 2005 10:34:37 +0200 Subject: [PATCH] ppc64: enable BPA nvram driver In-Reply-To: References: <200506212310.54156.arnd@arndb.de> <200506212324.19713.arnd@arndb.de> Message-ID: <200506221034.39268.arnd@arndb.de> Hollis Blanchard noticed that the initialization of the nvram driver was commented out in [PATCH 7/11] ppc64: add BPA platform type, which probably resulted from my reordering the patches incorrectly. 
Signed-off-by: Arnd Bergmann --- linux-cg.orig/arch/ppc64/kernel/bpa_setup.c 2005-06-22 10:33:09.329915056 -0400 +++ linux-cg/arch/ppc64/kernel/bpa_setup.c 2005-06-22 10:32:43.138901744 -0400 @@ -91,7 +91,7 @@ static void __init bpa_setup_arch(void) conswitchp = &dummy_con; #endif - // bpa_nvram_init(); + bpa_nvram_init(); } /* From arnd at arndb.de Wed Jun 22 18:47:13 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Wed, 22 Jun 2005 10:47:13 +0200 Subject: [PATCH 10/11] ppc64: SPU file system In-Reply-To: References: <200506212310.54156.arnd@arndb.de> <200506212334.44066.arnd@arndb.de> Message-ID: <200506221047.14602.arnd@arndb.de> On Wednesday 22 June 2005 02:21, Hollis Blanchard wrote: > On Jun 21, 2005, at 4:34 PM, Arnd Bergmann wrote: > > > +union MFC_TagSizeClassCmd { > > I think great effort has gone in to removing so-called "StudlyCaps" > from the ppc64 iSeries code... :) Yes. I've been wanting to fix this one for ages, but it keeps slipping through. The file used to be shared with user space (bad idea) and the CPU simulator and I tried to at least keep the structure definitions compatible initially. > Also, I didn't see "MFC" defined anywhere... it's sort of a pet peeve, > but could you make sure all your acronyms are defined? Most of them are > described in spu.h, but a few slipped through I think (like "SMF"). good point > And while a comment at the top of every file is great, ones like this: > > +/* > > + * Low-level SPU handling > > + * > might be more helpful if they defined SPU and further mentioned it's > the coprocessor in the Broadband Processor Architecture... Yes, all this is the sort of stuff you never notice while working on a piece of code for months.
Thanks, Arnd <>< From becky.bruce at freescale.com Thu Jun 23 05:36:46 2005 From: becky.bruce at freescale.com (Becky Bruce) Date: Wed, 22 Jun 2005 14:36:46 -0500 Subject: Proposal for reorg of kernel directory In-Reply-To: <1119400318.18247.190.camel@gaston> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <1119400318.18247.190.camel@gaston> Message-ID: On Jun 21, 2005, at 7:31 PM, Benjamin Herrenschmidt wrote: > >> To prevent bloating the kernel directory, I'd like to propose a >> reorganization of the ppc64 tree to look more like the ppc tree. >> This includes the creation of "platforms" and "syslib" directories >> that would contain platform-specific code and non-platform-specific >> system code, respectively. > > I'm not fan at all of kernel vs. syslib. Even on ppc32, and years after > the split, I still keep trying to get at files in the wrong directory > ;) I can't really say I'm a huge fan, either - I had to get Kumar to explain to me the purpose of syslib. However, looking at the 32-bit side, there are quite a few files over there that would make the kernel directory rather large if they were in one directory. I think the idea behind that organization was to have: - "kernel" dir - ppc generic kernel code and processor-specific code - "platforms" dir - platform-specific code - "syslib" - all device/system-level kernel code that is not platform-specific. That said, I don't think the line between the 3 can be drawn perfectly in practice, hence the confusion we have about where to find files. I think there's some merit in having the 3 directories, but if the consensus of this group is that we only separate out the platform code, that works for me. > > platforms is ok. OK, so let's talk about how the organization of platforms would look if we go with the split. As I see it, there are several options: 1. Slam all of the platform-specific files into a flat "platforms" directory. 2.
Do something similar to the 32-bit tree's implementation of the platforms dir, where single-platform code is at the highest level in "platforms". Create subdirectories for platforms which are very similar and share most of their code - I believe this ties in with Arnd's idea about more generic platforms. 3. Subdirs for each platform under the platforms directory. I don't really like this one, but it is an option. With 1 and 2 also comes the issue of file naming - files that are truly platform-specific should probably have the file name prefixed with the platform name. I believe this is mostly true today. Thoughts? Becky From benh at kernel.crashing.org Thu Jun 23 08:00:17 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 23 Jun 2005 08:00:17 +1000 Subject: Proposal for reorg of kernel directory In-Reply-To: References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <1119400318.18247.190.camel@gaston> Message-ID: <1119477618.18247.261.camel@gaston> On Wed, 2005-06-22 at 14:36 -0500, Becky Bruce wrote: > On Jun 21, 2005, at 7:31 PM, Benjamin Herrenschmidt wrote: > > > > >> To prevent bloating the kernel directory, I'd like to propose a > >> reorganization of the ppc64 tree to look more like the ppc tree. > >> This includes the creation of "platforms" and "syslib" directories > >> that would contain platform-specific code and non-platform-specific > >> system code, respectively. > > > > I'm not fan at all of kernel vs. syslib. Even on ppc32, and years after > > the split, I still keep trying to get at files in the wrong directory > > ;) > > I can't really say I'm a huge fan, either - I had to get Kumar to > explain to me the purpose of syslib. However, looking at the 32-bit > side, there are quite a few files over there that would make the kernel > directory rather large if they were in one directory. 
I think the idea > behind that organization was to have: > - "kernel" dir - ppc generic kernel code and processor-specific code > - "platforms" dir - platform-specific code > - "syslib" - all device/system-level kernel code that is not > platform-specific. > > That said, I don't think the line between the 3 can be drawn perfectly > in practice, hence the confusion we have about where to find files. I > think there's some merit in having the 3 directories, but if the > consensus of this group is that we only separate out the platform code, > that works for me. I'd start by splitting only platform. We can do different kind of splits in kernel. For example, we can have kernel32 with *32.c.. > OK, so let's talk about how the organization of platforms would look if > we go with the split. As I see it, there are several options: > > 1. Slam all of the platform-specific files into a flat "platforms" > directory. This is the ppc32 approach. I prefer subdirs but paulus doesn't ;) > 2. Do something similar to the 32-bit tree's implementation of the > platforms dir, where single-platform code is at the highest level in > "platforms". Create subdirectories for platforms which are very > similar and share most of their code - I believe this ties in with > Arnd's idea about more generic platforms. 32 bits started with a flat platforms dir. Things like 4xx popped up afterward afaik. > 3. Subdirs for each platform under the platforms directory. I don't > really like this one, but it is an option. > > With 1 and 2 also comes the issue of file naming - files that are truly > platform-specific should probably have the file name prefixed with the > platform name. I believe this is mostly true today. It is. Ben.
From linas at austin.ibm.com Thu Jun 23 08:37:28 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 22 Jun 2005 17:37:28 -0500 Subject: [PATCH]: PCI Error Recovery Message-ID: <20050622223728.GA6144@austin.ibm.com> Hi, I've attached patches implementing PCI error recovery code. These should apply cleanly against kernel-2.6.12-git4. Details of what this is, and how it works, are in a documentation file, part way down the patch. These patches implement "native" error recovery for four devices: -- the e100, e1000 network cards -- the ipr and sym53c8xx_2 scsi device drivers I've lightly tested against multi-port versions of the sym53c8xx_2 and the e1000 cards. I have not yet done serious stress testing. Please review and if appropriate, please apply. Signed-off-by: Linas Vepstas -------------- next part -------------- --- include/linux/pci.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ include/linux/pci.h 2005-06-22 15:28:29.000000000 -0500 @@ -660,6 +660,81 @@ struct pci_dynids { unsigned int use_driver_data:1; /* pci_driver->driver_data is used */ }; +/* ---------------------------------------------------------------- */ +/** PCI error recovery infrastructure. If a PCI device driver provides + * a set of callbacks in struct pci_error_handlers, then that device driver + * will be notified of PCI bus errors, and can be driven to recovery. + */ + +enum pci_channel_state { + pci_channel_io_normal = 0, /* I/O channel is in normal state */ + pci_channel_io_frozen = 1, /* I/O to channel is blocked */ + pci_channel_io_perm_failure, /* pci card is dead */ +}; + +enum pcierr_result { + PCIERR_RESULT_NONE=0, /* no result/none/not supported in device driver */ + PCIERR_RESULT_CAN_RECOVER=1, /* Device driver can recover without slot reset */ + PCIERR_RESULT_NEED_RESET, /* Device driver wants slot to be reset.
*/ + PCIERR_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */ + PCIERR_RESULT_RECOVERED, /* Device driver is fully recovered and operational */ +}; + +/* PCI bus error event callbacks */ +struct pci_error_handlers +{ + int (*error_detected)(struct pci_dev *dev, enum pci_channel_state error); + int (*mmio_enabled)(struct pci_dev *dev); /* MMIO has been re-enabled, but not DMA */ + int (*link_reset)(struct pci_dev *dev); /* PCI Express link has been reset */ + int (*slot_reset)(struct pci_dev *dev); /* PCI slot has been reset */ + void (*resume)(struct pci_dev *dev); /* Device driver may resume normal operations */ +}; + +/** + * PCI Error notifier event flags. + */ +#define PEH_NOTIFY_ERROR 1 + +/** PEH event -- structure holding pci controller data that describes + * a change in the isolation status of a PCI slot. A pointer + * to this struct is passed as the data pointer in a notify callback. + */ +struct peh_event { + struct list_head list; + struct pci_dev *dev; /* affected device */ + enum pci_channel_state state; /* PCI bus state for the affected device */ + int time_unavail; /* milliseconds until device might be available */ +}; + +/** + * peh_send_failure_event - generate a PCI error event + * @dev pci device + * + * This routine builds a PCI error event which will be delivered + * to all listeners on the peh_notifier_chain. + * + * This routine can be called within an interrupt context; + * the actual event will be delivered in a normal context + * (from a workqueue). + */ +int peh_send_failure_event (struct pci_dev *dev, + enum pci_channel_state state, + int time_unavail); + +/** + * peh_register_notifier - Register to find out about EEH events. + * @nb: notifier block to callback on events + */ +int peh_register_notifier(struct notifier_block *nb); + +/** + * peh_unregister_notifier - Unregister to an EEH event notifier.
+ * @nb: notifier block to callback on events + */ +int peh_unregister_notifier(struct notifier_block *nb); + +/* ---------------------------------------------------------------- */ + struct module; struct pci_driver { struct list_head node; @@ -673,6 +748,7 @@ struct pci_driver { int (*enable_wake) (struct pci_dev *dev, pci_power_t state, int enable); /* Enable wake event */ void (*shutdown) (struct pci_dev *dev); + struct pci_error_handlers err_handler; struct device_driver driver; struct pci_dynids dynids; }; --- Documentation/pci-error-recovery.txt.linas-orig 2005-06-22 15:28:15.000000000 -0500 +++ Documentation/pci-error-recovery.txt 2005-06-22 15:28:29.000000000 -0500 @@ -0,0 +1,242 @@ + + PCI Error Recovery + ------------------ + May 31, 2005 + + +Some PCI bus controllers are able to detect certain "hard" PCI errors +on the bus, such as parity errors on the data and address busses, as +well as SERR and PERR errors. These chipsets are then able to disable +I/O to/from the affected device, so that, for example, a bad DMA +address doesn't end up corrupting system memory. These same chipsets +are also able to reset the affected PCI device, and return it to +working condition. This document describes a generic API for +performing error recovery. + +The core idea is that after a PCI error has been detected, there must +be a way for the kernel to coordinate with all affected device drivers +so that the pci card can be made operational again, possibly after +performing a full electrical #RST of the PCI card. The API below +provides a generic API for device drivers to be notified of PCI +errors, and to be notified of, and respond to, a reset sequence. + +Preliminary sketch of API, cut-n-pasted-n-modified email from +Ben Herrenschmidt, circa 5 April 2005 + +The error recovery API support is exposed to the driver in the form of +a structure of function pointers pointed to by a new field in struct +pci_driver.
The absence of this pointer in pci_driver denotes a +"non-aware" driver; behaviour on these is platform dependent. +Platforms like ppc64 can try to simulate pci hotplug remove/add. + +The definition of "pci_error_token" is not covered here. It is based on +Seto's work on the synchronous error detection. We still need to define +functions for extracting info out of an opaque error token. This is +separate from this API. + +This structure has the form: + +struct pci_error_handlers +{ + int (*error_detected)(struct pci_dev *dev, pci_error_token error); + int (*mmio_enabled)(struct pci_dev *dev); + int (*resume)(struct pci_dev *dev); + int (*link_reset)(struct pci_dev *dev); + int (*slot_reset)(struct pci_dev *dev); +}; + +A driver doesn't have to implement all of these callbacks. The +only mandatory one is error_detected(). If a callback is not +implemented, the corresponding feature is considered unsupported. +For example, if mmio_enabled() and resume() aren't there, then the +driver is assumed not to do any direct recovery and to require +a reset. If link_reset() is not implemented, the card is assumed +not to care about link resets, in which case, if recovery is supported, +the core can try recovery (but not slot_reset() unless it really did +reset the slot). If slot_reset() is not supported, link_reset() can +be called instead on a slot reset. + +At first, the call will always be: + + 1) error_detected() + + Error detected. This is sent once after an error has been detected. At +this point, the device might not be accessible anymore depending on the +platform (the slot will be isolated on ppc64). The driver may already +have "noticed" the error because of a failing IO, but this is the proper +"synchronisation point", that is, it gives a chance to the driver to +clean up, waiting for pending stuff (timers, whatever, etc...) to +complete; it can take semaphores, schedule, etc... everything but touch +the device.
Within this function and after it returns, the driver +shouldn't do any new IOs. Called in task context. This is sort of a +"quiesce" point. See note about interrupts at the end of this doc. + + Result codes: + - PCIERR_RESULT_CAN_RECOVER: + Driver returns this if it thinks it might be able to recover + the HW by just banging IOs or if it wants to be given + a chance to extract some diagnostic information (see + below). + - PCIERR_RESULT_NEED_RESET: + Driver returns this if it thinks it can't recover unless the + slot is reset. + - PCIERR_RESULT_DISCONNECT: + Return this if driver thinks it won't recover at all, + (this will detach the driver ? or just leave it + dangling ? to be decided) + +So at this point, we have called error_detected() for all drivers +on the segment that had the error. On ppc64, the slot is isolated. What +happens now typically depends on the result from the drivers. If all +drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would +re-enable IOs on the slot (or do nothing special if the platform doesn't +isolate slots) and call 2). If not and we can reset slots, we go to 4), +if neither, we have a dead slot. If it's a hotplug slot, we might +"simulate" reset by triggering HW unplug/replug though. + +>>> Current ppc64 implementation assumes that a device driver will +>>> *not* schedule or semaphore in this routine; the current ppc64 +>>> implementation uses one kernel thread to notify all devices; +>>> thus, if one device sleeps/schedules, all devices are affected. +>>> Doing better requires complex multi-threaded logic in the error +>>> recovery implementation (e.g. waiting for all notification threads +>>> to "join" before proceeding with recovery.) This seems excessively +>>> complex and not worth implementing. + +>>> The current ppc64 implementation doesn't much care if the device +>>> attempts i/o at this point, or not. I/O's will fail, returning +>>> a value of 0xff on read, and writes will be dropped.
If the device +>>> driver attempts more than 10K I/O's to a frozen adapter, the +>>> implementation will assume that the device driver has gone into an +>>> infinite loop, and it will panic the kernel. + + 2) mmio_enabled() + + This is the "early recovery" call. IOs are allowed again, but DMA is +not (hrm... to be discussed, I prefer not), with some restrictions. This +is NOT a callback for the driver to start operations again, only to +peek/poke at the device, extract diagnostic information, if any, and +eventually do things like trigger a device local reset or some such, +but not restart operations. This is sent if all drivers on a segment +agree that they can try to recover and no automatic link reset was +performed by the HW. If the platform can't just re-enable IOs without +a slot reset or a link reset, it doesn't call this callback and goes +directly to 3) or 4). All IOs should be done _synchronously_ from +within this callback, errors triggered by them will be returned via +the normal pci_check_whatever() api, no new error_detected() callback +will be issued due to an error happening here. However, such an error +might cause IOs to be re-blocked for the whole segment, and thus +invalidate the recovery that other devices on the same segment might +have done, forcing the whole segment into one of the next states, +that is link reset or slot reset. + + Result codes: + - PCIERR_RESULT_RECOVERED + Driver returns this if it thinks the device is fully + functional and thinks it is ready to start + normal driver operations again. There is no + guarantee that the driver will actually be + allowed to proceed, as another driver on the + same segment might have failed and thus triggered a + slot reset on platforms that support it. + + - PCIERR_RESULT_NEED_RESET + Driver returns this if it thinks the device is not + recoverable in its current state and it needs a slot + reset to proceed. + + - PCIERR_RESULT_DISCONNECT + Same as above.
Total failure, no recovery even after + reset; driver dead. (To be defined more precisely) + +>>> The current ppc64 implementation does not implement this callback. + + 3) link_reset() + + This is called after the link has been reset. This is typically +a PCI Express specific state at this point and is done whenever a +non-fatal error has been detected that can be "solved" by resetting +the link. This call informs the driver of the reset and the driver +should check if the device appears to be in working condition. +This function acts a bit like 2) mmio_enabled(), in that the driver +is not supposed to restart normal driver I/O operations right away. +Instead, it should just "probe" the device to check its recoverability +status. If all is right, then the core will call resume() once all +drivers have ack'd link_reset(). + + Result codes: + (identical to mmio_enabled) + +>>> The current ppc64 implementation does not implement this callback. + + 4) slot_reset() + + This is called after the slot has been soft or hard reset by the +platform. A soft reset consists of asserting the adapter #RST line +and then restoring the PCI BARs and PCI configuration header. If the +platform supports PCI hotplug, then it might instead perform a hard +reset by toggling power on the slot off/on. This call gives drivers +the chance to re-initialize the hardware (re-download firmware, etc.), +but drivers shouldn't restart normal I/O processing operations at +this point. (See note about interrupts; interrupts aren't guaranteed +to be delivered until the resume() callback has been called). If all +device drivers report success on this callback, the platform will call +resume() to complete the error handling and let the driver restart +normal I/O processing. + +A driver can still return a critical failure for this function if +it can't get the device operational after reset.
If the platform +previously tried a soft reset, it might now try a hard reset (power +cycle) and then call slot_reset() again. If the device still can't +be recovered, there is nothing more that can be done; the platform +will typically report a "permanent failure" in such a case. The +device will be considered "dead" in this case. + + Result codes: + - PCIERR_RESULT_DISCONNECT + Same as above. + +>>> The current ppc64 implementation does not try a power-cycle reset +>>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should. + + 5) resume() + + This is called if all drivers on the segment have returned +PCIERR_RESULT_RECOVERED from one of the 3 previous callbacks. +That basically tells the driver to restart activity, that everything +is back up and running. No result code is taken into account here. If +a new error happens, it will restart a new error handling process. + +That's it. I think this covers all the possibilities. The way those +callbacks are called is platform policy. A platform with no slot reset +capability for example may want to just "ignore" drivers that can't +recover (disconnect them) and try to let other cards on the same segment +recover. Keep in mind that in most real life cases, though, there will +be only one driver per segment. + +Now, there is a note about interrupts. If you get an interrupt and your +device is dead or has been isolated, there is a problem :) + +After much thinking, I decided to leave that to the platform. That is, +the recovery API only specifies that: + + - There is no guarantee that interrupt delivery can proceed from any +device on the segment starting from the error detection and until the +restart callback is sent, at which point interrupts are expected to be +fully operational.
+ + - There is no guarantee that interrupt delivery is stopped, that is, a +driver that gets an interrupt after detecting an error, or that detects +an error within the interrupt handler such that it prevents proper +ack'ing of the interrupt (and thus removal of the source) should just +return IRQ_NOTHANDLED. It's up to the platform to deal with that +condition, typically by masking the irq source during the duration of +the error handling. It is expected that the platform "knows" which +interrupts are routed to error-management capable slots and can deal +with temporarily disabling that irq number during error processing (this +isn't terribly complex). That means some IRQ latency for other devices +sharing the interrupt, but there is simply no other way. High end +platforms aren't supposed to share interrupts between many devices +anyway :) + + --- drivers/pci/Makefile.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ drivers/pci/Makefile 2005-06-22 15:28:29.000000000 -0500 @@ -3,7 +3,7 @@ # obj-y += access.o bus.o probe.o remove.o pci.o quirks.o \ - names.o pci-driver.o search.o pci-sysfs.o \ + names.o pci-driver.o pci-error.o search.o pci-sysfs.o \ rom.o obj-$(CONFIG_PROC_FS) += proc.o --- drivers/pci/pci-error.c.linas-orig 2005-06-22 15:28:15.000000000 -0500 +++ drivers/pci/pci-error.c 2005-06-22 15:28:29.000000000 -0500 @@ -0,0 +1,152 @@ +/* + * pci-error.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details.
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include + +#undef DEBUG + +/** Overview: + * PEH, or "PCI Error Handling" is a PCI bridge technology for + * dealing with PCI bus errors that can't be dealt with within the + * usual PCI framework, except by check-stopping the CPU. Systems + * that are designed for high-availability/reliability cannot afford + * to crash due to a "mere" PCI error, thus the need for PEH. + * A PEH-capable bridge operates by converting a detected error + * into a "slot freeze", taking the PCI adapter off-line, making + * the slot behave, from the OS's point of view, as if the slot + * were "empty": all reads return 0xff's and all writes are silently + * ignored. PEH slot isolation events can be triggered by parity + * errors on the address or data busses (e.g. during posted writes), + * which in turn might be caused by low voltage on the bus, dust, + * vibration, humidity, radioactivity or plain-old failed hardware. + * + * Note, however, that one of the leading causes of PEH slot + * freeze events is buggy device drivers, buggy device microcode, + * or buggy device hardware. This is because any attempt by the + * device to bus-master data to a memory address that is not + * assigned to the device will trigger a slot freeze. (The idea + * is to prevent devices-gone-wild from corrupting system memory). + * Buggy hardware/drivers will have a miserable time co-existing + * with PEH. + */ + +/* PEH event workqueue setup. */ +static spinlock_t peh_eventlist_lock = SPIN_LOCK_UNLOCKED; +LIST_HEAD(peh_eventlist); +static void peh_event_handler(void *); +DECLARE_WORK(peh_event_wq, peh_event_handler, NULL); + +static struct notifier_block *peh_notifier_chain; + +/** + * peh_event_handler - dispatch PEH events.
The detection of a frozen + * slot can occur inside an interrupt, where it can be hard to do + * anything about it. The goal of this routine is to pull these + * detection events out of the context of the interrupt handler, and + * re-dispatch them for processing at a later time in a normal context. + * + * @dummy - unused + */ +static void peh_event_handler(void *dummy) +{ + unsigned long flags; + struct peh_event *event; + + while (1) { + spin_lock_irqsave(&peh_eventlist_lock, flags); + event = NULL; + if (!list_empty(&peh_eventlist)) { + event = list_entry(peh_eventlist.next, struct peh_event, list); + list_del(&event->list); + } + spin_unlock_irqrestore(&peh_eventlist_lock, flags); + if (event == NULL) + break; + + printk(KERN_INFO "PEH: Detected PCI bus error on device " + "%s %s\n", + pci_name(event->dev), pci_pretty_name(event->dev)); + + notifier_call_chain (&peh_notifier_chain, + PEH_NOTIFY_ERROR, event); + + pci_dev_put(event->dev); + kfree(event); + } +} + + +/** + * peh_send_failure_event - generate a PCI error event + * @dev pci device + * + * This routine builds a PCI error event which will be delivered + * to all listeners on the peh_notifier_chain. + * + * This routine can be called within an interrupt context; + * the actual event will be delivered in a normal context + * (from a workqueue). 
+ */ +int peh_send_failure_event (struct pci_dev *dev, + enum pci_channel_state state, + int time_unavail) +{ + unsigned long flags; + struct peh_event *event; + + event = kmalloc(sizeof(*event), GFP_ATOMIC); + if (event == NULL) { + printk (KERN_ERR "PEH: out of memory, event not handled\n"); + return 1; + } + + event->dev = dev; + event->state = state; + event->time_unavail = time_unavail; + + /* We may or may not be called in an interrupt context */ + spin_lock_irqsave(&peh_eventlist_lock, flags); + list_add(&event->list, &peh_eventlist); + spin_unlock_irqrestore(&peh_eventlist_lock, flags); + + schedule_work(&peh_event_wq); + + return 0; +} + +/** + * peh_register_notifier - Register to find out about PEH events. + * @nb: notifier block to callback on events + */ +int peh_register_notifier(struct notifier_block *nb) +{ + return notifier_chain_register(&peh_notifier_chain, nb); +} + +/** + * peh_unregister_notifier - Unregister from a PEH event notifier. + * @nb: notifier block to callback on events + */ +int peh_unregister_notifier(struct notifier_block *nb) +{ + return notifier_chain_unregister(&peh_notifier_chain, nb); +} + +/********************** END OF FILE ******************************/ --- drivers/scsi/ipr.c.linas-orig 2005-06-22 15:26:14.000000000 -0500 +++ drivers/scsi/ipr.c 2005-06-22 17:05:14.000000000 -0500 @@ -5326,6 +5326,88 @@ static void ipr_initiate_ioa_reset(struc shutdown_type); } +#ifdef CONFIG_SCSI_IPR_EEH_RECOVERY + +/** If the PCI slot is frozen, hold off all i/o + * activity; then, as soon as the slot is available again, + * initiate an adapter reset. + */ +static int ipr_reset_freeze(struct ipr_cmnd *ipr_cmd) +{ + list_add_tail(&ipr_cmd->queue, &ipr_cmd->ioa_cfg->pending_q); + ipr_cmd->done = ipr_reset_ioa_job; + return IPR_RC_JOB_RETURN; +} + +/** ipr_eeh_frozen -- called when slot has experienced a PCI bus error. + * This routine is called to tell us that the PCI bus is down.
+ * Can't do anything here, except put the device driver into a + * holding pattern, waiting for the PCI bus to come back. + */ +static void ipr_eeh_frozen (struct pci_dev *pdev) +{ + unsigned long flags = 0; + struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev); + + spin_lock_irqsave(ioa_cfg->host->host_lock, flags); + _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_freeze, IPR_SHUTDOWN_NONE); + spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags); +} + +/** ipr_eeh_slot_reset - called when pci slot has been reset. + * + * This routine is called by the pci error recovery + * code after the PCI slot has been reset, just before we + * should resume normal operations. + */ +static int ipr_eeh_slot_reset (struct pci_dev *pdev) +{ + unsigned long flags = 0; + struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev); + + pci_enable_device(pdev); + pci_set_master(pdev); + enable_irq (pdev->irq); + spin_lock_irqsave(ioa_cfg->host->host_lock, flags); + _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_restore_cfg_space, + IPR_SHUTDOWN_NONE); + spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags); + + return PCIERR_RESULT_RECOVERED; +} + +/** This routine is called when the PCI bus has permanently + * failed. This routine should purge all pending I/O and + * shut down the device driver (close and unload). + * XXX Needs to be implemented. + */ +static void ipr_eeh_perm_failure (struct pci_dev *pdev) +{ +#if 0 // XXXXXXXXXXXXXXXXXXXXXXX + ipr_cmd->job_step = ipr_reset_shutdown_ioa; + rc = IPR_RC_JOB_CONTINUE; +#endif +} + +static int ipr_eeh_error_detected (struct pci_dev *pdev, + enum pci_channel_state state) +{ + switch (state) { + case pci_channel_io_frozen: + ipr_eeh_frozen (pdev); + return PCIERR_RESULT_NEED_RESET; + + case pci_channel_io_perm_failure: + ipr_eeh_perm_failure (pdev); + return PCIERR_RESULT_DISCONNECT; + break; + default: + break; + } + return PCIERR_RESULT_NEED_RESET; +} +#endif + /** * ipr_probe_ioa_part2 - Initializes IOAs found in ipr_probe_ioa(..)
* @ioa_cfg: ioa cfg struct @@ -6068,6 +6150,10 @@ static struct pci_driver ipr_driver = { .id_table = ipr_pci_table, .probe = ipr_probe, .remove = ipr_remove, + .err_handler = { + .error_detected = ipr_eeh_error_detected, + .slot_reset = ipr_eeh_slot_reset, + }, .driver = { .shutdown = ipr_shutdown, }, --- drivers/scsi/sym53c8xx_2/sym_glue.c.linas-orig 2005-06-22 15:26:17.000000000 -0500 +++ drivers/scsi/sym53c8xx_2/sym_glue.c 2005-06-22 17:17:00.000000000 -0500 @@ -685,6 +685,10 @@ static irqreturn_t sym53c8xx_intr(int ir struct sym_hcb *np = (struct sym_hcb *)dev_id; if (DEBUG_FLAGS & DEBUG_TINY) printf_debug ("["); +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + if (np->s.io_state != pci_channel_io_normal) + return IRQ_HANDLED; +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ spin_lock_irqsave(np->s.host->host_lock, flags); sym_interrupt(np); @@ -759,6 +763,27 @@ static void sym_eh_done(struct scsi_cmnd */ static void sym_eh_timeout(u_long p) { __sym_eh_done((struct scsi_cmnd *)p, 1); } +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY +static void sym_eeh_timeout(u_long p) +{ + struct sym_eh_wait *ep = (struct sym_eh_wait *) p; + if (!ep) + return; + complete(&ep->done); +} + +static void sym_eeh_done(struct sym_eh_wait *ep) +{ + if (!ep) + return; + ep->timed_out = 0; + if (!del_timer(&ep->timer)) + return; + + complete(&ep->done); +} +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + /* * Generic method for our eh processing. * The 'op' argument tells what we have to do. @@ -799,6 +824,37 @@ prepare: /* Try to proceed the operation we have been asked for */ sts = -1; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + + /* We may be in an error condition because the PCI bus + * went down. In this case, we need to wait until the + * PCI bus is reset, the card is reset, and only then + * proceed with the scsi error recovery. We'll wait + * for 15 seconds for this to happen. 
+ */ +#define WAIT_FOR_PCI_RECOVERY 15 + if (np->s.io_state != pci_channel_io_normal) { + struct sym_eh_wait eeh, *eep = &eeh; + np->s.io_reset_wait = eep; + init_completion(&eep->done); + init_timer(&eep->timer); + eep->to_do = SYM_EH_DO_WAIT; + eep->timer.expires = jiffies + (WAIT_FOR_PCI_RECOVERY*HZ); + eep->timer.function = sym_eeh_timeout; + eep->timer.data = (u_long)eep; + eep->timed_out = 1; /* Be pessimistic for once :) */ + add_timer(&eep->timer); + spin_unlock_irq(np->s.host->host_lock); + wait_for_completion(&eep->done); + spin_lock_irq(np->s.host->host_lock); + if (eep->timed_out) { + printk (KERN_ERR "%s: Timed out waiting for PCI reset\n", + sym_name(np)); + } + np->s.io_reset_wait = NULL; + } +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + switch(op) { case SYM_EH_ABORT: sts = sym_abort_scsiio(np, cmd, 1); @@ -1584,6 +1640,10 @@ static struct Scsi_Host * __devinit sym_ np->maxoffs = dev->chip.offset_max; np->maxburst = dev->chip.burst_max; np->myaddr = dev->host_id; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + np->s.io_state = pci_channel_io_normal; + np->s.io_reset_wait = NULL; +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ /* * Edit its name. @@ -1916,6 +1976,59 @@ static int sym_detach(struct sym_hcb *np return 1; } +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY +/** sym2_io_error_detected() is called when a PCI error is detected */ +static int sym2_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + np->s.io_state = state; + // XXX If slot is permanently frozen, then what? + // Should we scsi_remove_host() maybe ?? + + /* Request a slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** sym2_io_slot_reset is called when the pci bus has been reset. + * Restart the card from scratch.
*/ +static int sym2_io_slot_reset (struct pci_dev *pdev) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + printk (KERN_INFO "%s: recovering from a PCI slot reset\n", + sym_name(np)); + + if (pci_enable_device(pdev)) + printk (KERN_ERR "%s: device setup failed most egregiously\n", + sym_name(np)); + + pci_set_master(pdev); + enable_irq (pdev->irq); + + /* Perform host reset only on one instance of the card */ + if (0 == PCI_FUNC (pdev->devfn)) + sym_reset_scsi_bus(np, 0); + + return PCIERR_RESULT_RECOVERED; +} + +/** sym2_io_resume is called when the error recovery driver + * tells us that it's OK to resume normal operation. + */ +static void sym2_io_resume (struct pci_dev *pdev) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + /* Perform device startup only once for this card. */ + if (0 == PCI_FUNC (pdev->devfn)) + sym_start_up (np, 1); + + np->s.io_state = pci_channel_io_normal; + sym_eeh_done (np->s.io_reset_wait); +} +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + /* * Driver host template.
*/ @@ -2174,6 +2287,13 @@ static struct pci_driver sym2_driver = { .id_table = sym2_id_table, .probe = sym2_probe, .remove = __devexit_p(sym2_remove), +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + .err_handler = { + .error_detected = sym2_io_error_detected, + .slot_reset = sym2_io_slot_reset, + .resume = sym2_io_resume, + }, +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ }; static int __init sym2_init(void) --- drivers/scsi/sym53c8xx_2/sym_glue.h.linas-orig 2005-06-22 15:26:17.000000000 -0500 +++ drivers/scsi/sym53c8xx_2/sym_glue.h 2005-06-22 15:28:29.000000000 -0500 @@ -181,6 +181,10 @@ struct sym_shcb { char chip_name[8]; struct pci_dev *device; + /* pci bus i/o state; waiter for clearing of i/o state */ + enum pci_channel_state io_state; + struct sym_eh_wait *io_reset_wait; + struct Scsi_Host *host; void __iomem * ioaddr; /* MMIO kernel io address */ --- drivers/scsi/sym53c8xx_2/sym_hipd.c.linas-orig 2005-06-22 15:26:17.000000000 -0500 +++ drivers/scsi/sym53c8xx_2/sym_hipd.c 2005-06-22 15:28:29.000000000 -0500 @@ -2806,6 +2806,7 @@ void sym_interrupt (struct sym_hcb *np) u_char istat, istatc; u_char dstat; u_short sist; + u_int icnt; /* * interrupt on the fly ? @@ -2847,6 +2848,7 @@ void sym_interrupt (struct sym_hcb *np) sist = 0; dstat = 0; istatc = istat; + icnt = 0; do { if (istatc & SIP) sist |= INW(np, nc_sist); @@ -2854,6 +2856,14 @@ void sym_interrupt (struct sym_hcb *np) dstat |= INB(np, nc_dstat); istatc = INB(np, nc_istat); istat |= istatc; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + /* Prevent deadlock waiting on a condition that may never clear. */ + icnt ++; + if (100 < icnt) { + if (eeh_slot_is_isolated(np->s.device)) + return; + } +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ } while (istatc & (SIP|DIP)); if (DEBUG_FLAGS & DEBUG_TINY) --- drivers/scsi/Kconfig.linas-orig 2005-06-22 15:26:14.000000000 -0500 +++ drivers/scsi/Kconfig 2005-06-22 15:28:29.000000000 -0500 @@ -1040,6 +1040,14 @@ config SCSI_SYM53C8XX_IOMAPPED the card. 
This is significantly slower then using memory mapped IO. Most people should answer N. +config SCSI_SYM53C8XX_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on SCSI_SYM53C8XX_2 && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config SCSI_IPR tristate "IBM Power Linux RAID adapter support" depends on PCI && SCSI @@ -1065,6 +1073,14 @@ config SCSI_IPR_DUMP If you enable this support, the iprdump daemon can be used to capture adapter failure analysis information. +config SCSI_IPR_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on SCSI_IPR && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config SCSI_ZALON tristate "Zalon SCSI support" depends on GSC && SCSI --- drivers/net/e100.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ drivers/net/e100.c 2005-06-22 17:18:26.000000000 -0500 @@ -2452,6 +2452,67 @@ static void e100_shutdown(struct device #endif } +#ifdef CONFIG_E100_EEH_RECOVERY + +/** e100_io_error_detected() is called when PCI error is detected */ +static int e100_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct nic *nic = netdev_priv(netdev); + + mod_timer(&nic->watchdog, jiffies + 30*HZ); + e100_down(nic); + + /* Request a slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** e100_io_slot_reset is called after the pci bus has been reset. + * Restart the card from scratch. 
*/ +static int e100_io_slot_reset (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct nic *nic = netdev_priv(netdev); + + if(pci_enable_device(pdev)) { + printk(KERN_ERR "e100: Cannot re-enable PCI device after reset.\n"); + return PCIERR_RESULT_DISCONNECT; + } + pci_set_master(pdev); + + /* Only one device per card can do a reset */ + if (0 != PCI_FUNC (pdev->devfn)) + return PCIERR_RESULT_RECOVERED; + + e100_hw_reset(nic); + e100_phy_init(nic); + + if(e100_hw_init(nic)) { + DPRINTK(HW, ERR, "e100_hw_init failed\n"); + return PCIERR_RESULT_DISCONNECT; + } + + return PCIERR_RESULT_RECOVERED; +} + +/** e100_io_resume is called when the error recovery driver + * tells us that it's OK to resume normal operation. + */ +static void e100_io_resume (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct nic *nic = netdev_priv(netdev); + + /* ack any pending wake events, disable PME */ + pci_enable_wake(pdev, 0, 0); + + netif_device_attach(netdev); + if(netif_running(netdev)) + e100_open (netdev); + + mod_timer(&nic->watchdog, jiffies); +} +#endif /* CONFIG_E100_EEH_RECOVERY */ static struct pci_driver e100_driver = { .name = DRV_NAME, @@ -2462,6 +2523,13 @@ static struct pci_driver e100_driver = { .suspend = e100_suspend, .resume = e100_resume, #endif +#ifdef CONFIG_E100_EEH_RECOVERY + .err_handler = { + .error_detected = e100_io_error_detected, + .slot_reset = e100_io_slot_reset, + .resume = e100_io_resume, + }, +#endif /* CONFIG_E100_EEH_RECOVERY */ .driver = { .shutdown = e100_shutdown, --- drivers/net/e1000/e1000_main.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ drivers/net/e1000/e1000_main.c 2005-06-22 17:02:17.000000000 -0500 @@ -171,6 +171,12 @@ static int e1000_resume(struct pci_dev * static void e1000_netpoll (struct net_device *netdev); #endif +#ifdef CONFIG_E1000_EEH_RECOVERY +static int e1000_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state); +static int
e1000_io_slot_reset (struct pci_dev *pdev); +static void e1000_io_resume (struct pci_dev *pdev); +#endif /* CONFIG_E1000_EEH_RECOVERY */ + struct notifier_block e1000_notifier_reboot = { .notifier_call = e1000_notify_reboot, .next = NULL, @@ -191,6 +197,14 @@ static struct pci_driver e1000_driver = .suspend = e1000_suspend, .resume = e1000_resume #endif +#ifdef CONFIG_E1000_EEH_RECOVERY + .err_handler = { + .error_detected = e1000_io_error_detected, + .slot_reset = e1000_io_slot_reset, + .resume = e1000_io_resume, + }, +#endif /* CONFIG_E1000_EEH_RECOVERY */ + }; MODULE_AUTHOR("Intel Corporation, "); @@ -2774,7 +2788,7 @@ e1000_clean_tx_irq(struct e1000_adapter " next_to_use <%x>\n" " next_to_clean <%x>\n" "buffer_info[next_to_clean]\n" - " dma <%llx>\n" + " dma <%lx>\n" " time_stamp <%lx>\n" " next_to_watch <%x>\n" " jiffies <%lx>\n" @@ -3794,4 +3808,91 @@ e1000_netpoll(struct net_device *netdev) } #endif +#ifdef CONFIG_E1000_EEH_RECOVERY + +/** e1000_io_error_detected() is called when a PCI error is detected */ +static int e1000_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct e1000_adapter *adapter = netdev->priv; + + mod_timer(&adapter->watchdog_timer, jiffies + 20 * HZ); + if(netif_running(netdev)) + e1000_down(adapter); + + /* Request a slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** e1000_io_slot_reset is called after the pci bus has been reset. + * Restart the card from scratch. + * Implementation resembles the first-half of the + * e1000_resume routine.
+ */ +static int e1000_io_slot_reset (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct e1000_adapter *adapter = netdev->priv; + + if(pci_enable_device(pdev)) { + printk(KERN_ERR "e1000: Cannot re-enable PCI device after reset.\n"); + return PCIERR_RESULT_DISCONNECT; + } + pci_set_master(pdev); + + pci_enable_wake(pdev, 3, 0); + pci_enable_wake(pdev, 4, 0); /* 4 == D3 cold */ + + /* Perform card reset only on one instance of the card */ + if (0 != PCI_FUNC (pdev->devfn)) + return PCIERR_RESULT_RECOVERED; + + e1000_reset(adapter); + E1000_WRITE_REG(&adapter->hw, WUS, ~0); + + return PCIERR_RESULT_RECOVERED; +} + +/** e1000_io_resume is called when the error recovery driver + * tells us that it's OK to resume normal operation. + * Implementation resembles the second-half of the + * e1000_resume routine. + */ +static void e1000_io_resume (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct e1000_adapter *adapter = netdev->priv; + uint32_t manc, swsm; + + if(netif_running(netdev)) { + if(e1000_up(adapter)) { + printk ("e1000: can't bring device back up after reset\n"); + return; + } + } + + netif_device_attach(netdev); + + if(adapter->hw.mac_type >= e1000_82540 && + adapter->hw.media_type == e1000_media_type_copper) { + manc = E1000_READ_REG(&adapter->hw, MANC); + manc &= ~(E1000_MANC_ARP_EN); + E1000_WRITE_REG(&adapter->hw, MANC, manc); + } + + switch(adapter->hw.mac_type) { + case e1000_82573: + swsm = E1000_READ_REG(&adapter->hw, SWSM); + E1000_WRITE_REG(&adapter->hw, SWSM, + swsm | E1000_SWSM_DRV_LOAD); + break; + default: + break; + } + + mod_timer(&adapter->watchdog_timer, jiffies); +} + +#endif /* CONFIG_E1000_EEH_RECOVERY */ + /* e1000_main.c */ --- drivers/net/Kconfig.linas-orig 2005-06-22 15:26:13.000000000 -0500 +++ drivers/net/Kconfig 2005-06-22 15:28:29.000000000 -0500 @@ -1392,6 +1392,14 @@ config E100 . The module will be called e100.
+config E100_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on E100 && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config LNE390 tristate "Mylex EISA LNE390A/B support (EXPERIMENTAL)" depends on NET_PCI && EISA && EXPERIMENTAL @@ -1839,6 +1847,14 @@ config E1000_NAPI If in doubt, say N. +config E1000_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on E1000 && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config MYRI_SBUS tristate "MyriCOM Gigabit Ethernet support" depends on SBUS --- arch/ppc64/configs/pSeries_defconfig.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ arch/ppc64/configs/pSeries_defconfig 2005-06-22 15:30:33.000000000 -0500 @@ -311,9 +311,11 @@ CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MOD CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set +CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY=y CONFIG_SCSI_IPR=y CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y +CONFIG_SCSI_IPR_EEH_RECOVERY=y # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA2XXX=y @@ -544,6 +546,7 @@ CONFIG_PCNET32=y # CONFIG_DGRS is not set # CONFIG_EEPRO100 is not set CONFIG_E100=y +CONFIG_E100_EEH_RECOVERY=y # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set # CONFIG_NE2K_PCI is not set @@ -562,6 +565,7 @@ CONFIG_ACENIC_OMIT_TIGON_I=y # CONFIG_DL2K is not set CONFIG_E1000=y # CONFIG_E1000_NAPI is not set +CONFIG_E1000_EEH_RECOVERY=y # CONFIG_NS83820 is not set # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set --- include/asm-ppc64/eeh.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ include/asm-ppc64/eeh.h 2005-06-22 15:28:29.000000000 -0500 @@ -1,4 +1,4 @@ -/* +/* * eeh.h * Copyright (C) 2001 Dave Engebretsen & Todd Inglett 
IBM Corporation. * @@ -6,12 +6,12 @@ * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -23,6 +23,7 @@ #include #include #include +#include #include struct pci_dev; @@ -36,6 +37,11 @@ struct notifier_block; #define EEH_MODE_SUPPORTED (1<<0) #define EEH_MODE_NOCHECK (1<<1) #define EEH_MODE_ISOLATED (1<<2) +#define EEH_MODE_RECOVERING (1<<3) + +/* Max number of EEH freezes allowed before we consider the device + * to be permanently disabled. */ +#define EEH_MAX_ALLOWED_FREEZES 5 void __init eeh_init(void); unsigned long eeh_check_failure(const volatile void __iomem *token, @@ -59,35 +65,82 @@ void eeh_add_device_late(struct pci_dev * eeh_remove_device - undo EEH setup for the indicated pci device * @dev: pci device to be removed * - * This routine should be when a device is removed from a running - * system (e.g. by hotplug or dlpar). + * This routine should be called when a device is removed from + * a running system (e.g. by hotplug or dlpar). It unregisters + * the PCI device from the EEH subsystem. I/O errors affecting + * this device will no longer be detected after this call; thus, + * i/o errors affecting this slot may leave this device unusable. 
*/ void eeh_remove_device(struct pci_dev *); -#define EEH_DISABLE 0 -#define EEH_ENABLE 1 -#define EEH_RELEASE_LOADSTORE 2 -#define EEH_RELEASE_DMA 3 +/** + * eeh_slot_is_isolated -- return non-zero value if slot is frozen + */ +int eeh_slot_is_isolated (struct pci_dev *dev); /** - * Notifier event flags. + * eeh_ioaddr_is_isolated -- return non-zero value if device at + * io address is frozen. */ -#define EEH_NOTIFY_FREEZE 1 +int eeh_ioaddr_is_isolated(const volatile void __iomem *token); -/** EEH event -- structure holding pci slot data that describes - * a change in the isolation status of a PCI slot. A pointer - * to this struct is passed as the data pointer in a notify callback. - */ -struct eeh_event { - struct list_head list; - struct pci_dev *dev; - struct device_node *dn; - int reset_state; -}; - -/** Register to find out about EEH events. */ -int eeh_register_notifier(struct notifier_block *nb); -int eeh_unregister_notifier(struct notifier_block *nb); +/** + * eeh_slot_error_detail -- record an EEH error condition to the log + * @severity: 1 if temporary, 2 if permanent failure. + * + * Obtains the EEH error details from the RTAS subsystem, + * and then logs these details with the RTAS error log system. + */ +void eeh_slot_error_detail (struct device_node *dn, int severity); + +/** + * rtas_set_slot_reset -- unfreeze a frozen slot + * + * Clear the EEH-frozen condition on a slot. This routine + * does this by asserting the PCI #RST line for 1/8th of + * a second; this routine will sleep while the adapter is + * being reset. + */ +void rtas_set_slot_reset (struct device_node *dn); + +/** rtas_pci_slot_reset raises/lowers the pci #RST line + * state: 1/0 to raise/lower the #RST + * + * Clear the EEH-frozen condition on a slot. This routine + * asserts the PCI #RST line if the 'state' argument is '1', + * and drops the #RST line if 'state' is '0'. This routine is + * safe to call in an interrupt context.
+ * + */ +void rtas_pci_slot_reset(struct device_node *dn, int state); +void eeh_pci_slot_reset(struct pci_dev *dev, int state); + +/** eeh_pci_slot_availability -- Indicates whether a PCI + * slot is ready to be used. After a PCI reset, it may take a while + * for the PCI fabric to fully reset the communications path to the + * given PCI card. This routine can be used to determine how long + * to wait before a PCI slot might become usable. + * + * This routine returns how long to wait (in milliseconds) before + * the slot is expected to be usable. A value of zero means the + * slot is immediately usable. A negative value means that the + * slot is permanently disabled. + */ +int eeh_pci_slot_availability(struct pci_dev *dev); + +/** Restore device configuration info across device resets. + */ +void eeh_restore_bars(struct device_node *); +void eeh_pci_restore_bars(struct pci_dev *dev); + +/** + * rtas_configure_bridge -- firmware initialization of pci bridge + * + * Ask the firmware to configure any PCI bridge devices + * located behind the indicated node. Required after a + * pci device reset. + */ +void rtas_configure_bridge(struct device_node *dn); /** * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure. @@ -129,7 +182,7 @@ static inline void eeh_remove_device(str #define EEH_IO_ERROR_VALUE(size) (-1UL) #endif /* CONFIG_EEH */ -/* +/* * MMIO read/write operations with EEH support.
*/ static inline u8 eeh_readb(const volatile void __iomem *addr) @@ -251,21 +304,21 @@ static inline void eeh_memcpy_fromio(voi *((u8 *)dest) = *((volatile u8 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 1); - dest = (void *)((unsigned long)dest + 1); + dest = (void *)((unsigned long)dest + 1); n--; } while(n > 4) { *((u32 *)dest) = *((volatile u32 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 4); - dest = (void *)((unsigned long)dest + 4); + dest = (void *)((unsigned long)dest + 4); n -= 4; } while(n) { *((u8 *)dest) = *((volatile u8 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 1); - dest = (void *)((unsigned long)dest + 1); + dest = (void *)((unsigned long)dest + 1); n--; } __asm__ __volatile__ ("sync" : : : "memory"); @@ -287,19 +340,19 @@ static inline void eeh_memcpy_toio(volat while(n && (!EEH_CHECK_ALIGN(vdest, 4) || !EEH_CHECK_ALIGN(src, 4))) { *((volatile u8 *)vdest) = *((u8 *)src); src = (void *)((unsigned long)src + 1); - vdest = (void *)((unsigned long)vdest + 1); + vdest = (void *)((unsigned long)vdest + 1); n--; } while(n > 4) { *((volatile u32 *)vdest) = *((volatile u32 *)src); src = (void *)((unsigned long)src + 4); - vdest = (void *)((unsigned long)vdest + 4); + vdest = (void *)((unsigned long)vdest + 4); n-=4; } while(n) { *((volatile u8 *)vdest) = *((u8 *)src); src = (void *)((unsigned long)src + 1); - vdest = (void *)((unsigned long)vdest + 1); + vdest = (void *)((unsigned long)vdest + 1); n--; } __asm__ __volatile__ ("sync" : : : "memory"); --- include/asm-ppc64/prom.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ include/asm-ppc64/prom.h 2005-06-22 15:28:29.000000000 -0500 @@ -119,6 +119,7 @@ struct property { */ struct pci_controller; struct iommu_table; +struct eeh_recovery_ops; struct device_node { char *name; @@ -137,9 +138,13 @@ struct device_node { int devfn; /* for pci devices */ int 
eeh_mode; /* See eeh.h for possible EEH_MODEs */ int eeh_config_addr; + int eeh_check_count; /* number of times device driver ignored error */ + int eeh_freeze_count; /* number of times this device froze up. */ + int eeh_is_bridge; /* device is pci-to-pci bridge */ int pci_ext_config_space; /* for pci devices */ struct pci_controller *phb; /* for pci devices */ struct iommu_table *iommu_table; /* for phb's or bridges */ + u32 config_space[16]; /* saved PCI config space */ struct property *properties; struct device_node *parent; --- include/asm-ppc64/rtas.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ include/asm-ppc64/rtas.h 2005-06-22 15:28:29.000000000 -0500 @@ -240,4 +240,6 @@ extern unsigned long rtas_rmo_buf; #define GLOBAL_INTERRUPT_QUEUE 9005 +extern int rtas_write_config(struct device_node *dn, int where, int size, u32 val); + #endif /* _PPC64_RTAS_H */ --- arch/ppc64/kernel/eeh.c.linas-orig 2005-06-22 15:26:11.000000000 -0500 +++ arch/ppc64/kernel/eeh.c 2005-06-22 15:28:29.000000000 -0500 @@ -1,32 +1,34 @@ /* + * * eeh.c * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#include +#include #include +#include #include -#include #include #include #include #include #include #include +#include #include #include #include @@ -49,8 +51,8 @@ * were "empty": all reads return 0xff's and all writes are silently * ignored. EEH slot isolation events can be triggered by parity * errors on the address or data busses (e.g. during posted writes), - * which in turn might be caused by dust, vibration, humidity, - * radioactivity or plain-old failed hardware. + * which in turn might be caused by low voltage on the bus, dust, + * vibration, humidity, radioactivity or plain-old failed hardware. * * Note, however, that one of the leading causes of EEH slot * freeze events are buggy device drivers, buggy device microcode, @@ -75,22 +77,13 @@ #define BUID_HI(buid) ((buid) >> 32) #define BUID_LO(buid) ((buid) & 0xffffffff) -/* EEH event workqueue setup. */ -static DEFINE_SPINLOCK(eeh_eventlist_lock); -LIST_HEAD(eeh_eventlist); -static void eeh_event_handler(void *); -DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL); - -static struct notifier_block *eeh_notifier_chain; - /* * If a device driver keeps reading an MMIO register in an interrupt * handler after a slot isolation event has occurred, we assume it * is broken and panic. This sets the threshold for how many read * attempts we allow before panicking. 
*/ -#define EEH_MAX_FAILS 1000 -static atomic_t eeh_fail_count; +#define EEH_MAX_FAILS 100000 /* RTAS tokens */ static int ibm_set_eeh_option; @@ -107,6 +100,10 @@ static DEFINE_SPINLOCK(slot_errbuf_lock) static int eeh_error_buf_size; /* System monitoring statistics */ +static DEFINE_PER_CPU(unsigned long, no_device); +static DEFINE_PER_CPU(unsigned long, no_dn); +static DEFINE_PER_CPU(unsigned long, no_cfg_addr); +static DEFINE_PER_CPU(unsigned long, ignored_check); static DEFINE_PER_CPU(unsigned long, total_mmio_ffs); static DEFINE_PER_CPU(unsigned long, false_positives); static DEFINE_PER_CPU(unsigned long, ignored_failures); @@ -225,9 +222,9 @@ pci_addr_cache_insert(struct pci_dev *de while (*p) { parent = *p; piar = rb_entry(parent, struct pci_io_addr_range, rb_node); - if (alo < piar->addr_lo) { + if (ahi < piar->addr_lo) { p = &parent->rb_left; - } else if (ahi > piar->addr_hi) { + } else if (alo > piar->addr_hi) { p = &parent->rb_right; } else { if (dev != piar->pcidev || @@ -246,6 +243,11 @@ pci_addr_cache_insert(struct pci_dev *de piar->pcidev = dev; piar->flags = flags; +#ifdef DEBUG + printk (KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n", + alo, ahi, pci_name (dev)); +#endif + rb_link_node(&piar->rb_node, parent, p); rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root); @@ -268,9 +270,10 @@ static void __pci_addr_cache_insert_devi /* Skip any devices for which EEH is not enabled. 
*/ if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || dn->eeh_mode & EEH_MODE_NOCHECK) { -#ifdef DEBUG - printk(KERN_INFO "PCI: skip building address cache for=%s %s\n", - pci_name(dev), pci_pretty_name(dev)); +// #ifdef DEBUG +#if 1 + printk(KERN_INFO "PCI: skip building address cache for=%s %s %s\n", + pci_name(dev), pci_pretty_name(dev), dn->type); #endif return; } @@ -369,8 +372,12 @@ void pci_addr_cache_remove_device(struct */ void __init pci_addr_cache_build(void) { + struct device_node *dn; struct pci_dev *dev = NULL; + if (!eeh_subsystem_enabled) + return; + spin_lock_init(&pci_io_addr_cache_root.piar_lock); while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { @@ -379,6 +386,17 @@ void __init pci_addr_cache_build(void) continue; } pci_addr_cache_insert_device(dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + if (dn) { + int i; + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); + + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + dn->eeh_is_bridge = 1; + } } #ifdef DEBUG @@ -390,24 +408,32 @@ void __init pci_addr_cache_build(void) /* --------------------------------------------------------------- */ /* Above lies the PCI Address Cache. Below lies the EEH event infrastructure */ -/** - * eeh_register_notifier - Register to find out about EEH events. - * @nb: notifier block to callback on events - */ -int eeh_register_notifier(struct notifier_block *nb) +void eeh_slot_error_detail (struct device_node *dn, int severity) { - return notifier_chain_register(&eeh_notifier_chain, nb); -} + unsigned long flags; + int rc; -/** - * eeh_unregister_notifier - Unregister to an EEH event notifier. 
- * @nb: notifier block to callback on events - */ -int eeh_unregister_notifier(struct notifier_block *nb) -{ - return notifier_chain_unregister(&eeh_notifier_chain, nb); + if (!dn) return; + + /* Log the error with the rtas logger */ + spin_lock_irqsave(&slot_errbuf_lock, flags); + memset(slot_errbuf, 0, eeh_error_buf_size); + + rc = rtas_call(ibm_slot_error_detail, + 8, 1, NULL, dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), NULL, 0, + virt_to_phys(slot_errbuf), + eeh_error_buf_size, + severity); + + if (rc == 0) + log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); + spin_unlock_irqrestore(&slot_errbuf_lock, flags); } +EXPORT_SYMBOL(eeh_slot_error_detail); + /** * read_slot_reset_state - Read the reset state of a device node's slot * @dn: device node to read @@ -422,6 +448,7 @@ static int read_slot_reset_state(struct outputs = 4; } else { token = ibm_read_slot_reset_state; + rets[2] = 0; /* fake PE Unavailable info */ outputs = 3; } @@ -430,75 +457,8 @@ static int read_slot_reset_state(struct } /** - * eeh_panic - call panic() for an eeh event that cannot be handled. - * The philosophy of this routine is that it is better to panic and - * halt the OS than it is to risk possible data corruption by - * oblivious device drivers that don't know better. - * - * @dev pci device that had an eeh event - * @reset_state current reset state of the device slot - */ -static void eeh_panic(struct pci_dev *dev, int reset_state) -{ - /* - * XXX We should create a separate sysctl for this. - * - * Since the panic_on_oops sysctl is used to halt the system - * in light of potential corruption, we can use it here. 
- */ - if (panic_on_oops) - panic("EEH: MMIO failure (%d) on device:%s %s\n", reset_state, - pci_name(dev), pci_pretty_name(dev)); - else { - __get_cpu_var(ignored_failures)++; - printk(KERN_INFO "EEH: Ignored MMIO failure (%d) on device:%s %s\n", - reset_state, pci_name(dev), pci_pretty_name(dev)); - } -} - -/** - * eeh_event_handler - dispatch EEH events. The detection of a frozen - * slot can occur inside an interrupt, where it can be hard to do - * anything about it. The goal of this routine is to pull these - * detection events out of the context of the interrupt handler, and - * re-dispatch them for processing at a later time in a normal context. - * - * @dummy - unused - */ -static void eeh_event_handler(void *dummy) -{ - unsigned long flags; - struct eeh_event *event; - - while (1) { - spin_lock_irqsave(&eeh_eventlist_lock, flags); - event = NULL; - if (!list_empty(&eeh_eventlist)) { - event = list_entry(eeh_eventlist.next, struct eeh_event, list); - list_del(&event->list); - } - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); - if (event == NULL) - break; - - printk(KERN_INFO "EEH: MMIO failure (%d), notifiying device " - "%s %s\n", event->reset_state, - pci_name(event->dev), pci_pretty_name(event->dev)); - - atomic_set(&eeh_fail_count, 0); - notifier_call_chain (&eeh_notifier_chain, - EEH_NOTIFY_FREEZE, event); - - __get_cpu_var(slot_resets)++; - - pci_dev_put(event->dev); - kfree(event); - } -} - -/** - * eeh_token_to_phys - convert EEH address token to phys address - * @token i/o token, should be address in the form 0xE.... + * eeh_token_to_phys - convert I/O address to phys address + * @token i/o address, should be address in the form 0xA.... */ static inline unsigned long eeh_token_to_phys(unsigned long token) { @@ -513,6 +473,39 @@ static inline unsigned long eeh_token_to return pa | (token & (PAGE_SIZE-1)); } + +/** Mark all devices that are peers of this device as failed. 
+ */ +static inline void eeh_mark_slot (struct device_node *dn) +{ + while (dn) { + dn->eeh_mode |= EEH_MODE_ISOLATED; + if (dn->child) + eeh_mark_slot (dn->child); + dn = dn->sibling; + } +} + +static inline void eeh_clear_slot (struct device_node *dn) +{ + while (dn) { + dn->eeh_mode &= ~(EEH_MODE_RECOVERING|EEH_MODE_ISOLATED); + if (dn->child) + eeh_clear_slot (dn->child); + dn = dn->sibling; + } +} + +static inline struct pci_dev * eeh_find_pci_dev(struct device_node *dn) +{ + struct pci_dev *dev = NULL; + for_each_pci_dev(dev) { + if (pci_device_to_OF_node(dev) == dn) + return dev; + } + return NULL; +} + /** * eeh_dn_check_failure - check if all 1's data is due to EEH slot freeze * @dn device node @@ -528,29 +521,37 @@ static inline unsigned long eeh_token_to * * It is safe to call this routine in an interrupt context. */ +extern void disable_irq_nosync(unsigned int); + int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) { int ret; int rets[3]; - unsigned long flags; - int rc, reset_state; - struct eeh_event *event; + enum pci_channel_state state; __get_cpu_var(total_mmio_ffs)++; if (!eeh_subsystem_enabled) return 0; - if (!dn) + if (!dn) { + __get_cpu_var(no_dn)++; return 0; + } /* Access to IO BARs might get this far and still not want checking. */ if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || dn->eeh_mode & EEH_MODE_NOCHECK) { + __get_cpu_var(ignored_check)++; +#ifdef DEBUG + printk ("EEH:ignored check for %s %s\n", + pci_pretty_name (dev), dn->full_name); +#endif return 0; } if (!dn->eeh_config_addr) { + __get_cpu_var(no_cfg_addr)++; return 0; } @@ -559,12 +560,19 @@ int eeh_dn_check_failure(struct device_n * slot, we know it's bad already, we don't need to check... 
*/ if (dn->eeh_mode & EEH_MODE_ISOLATED) { - atomic_inc(&eeh_fail_count); - if (atomic_read(&eeh_fail_count) >= EEH_MAX_FAILS) { + dn->eeh_check_count ++; + if (dn->eeh_check_count >= EEH_MAX_FAILS) { + printk (KERN_ERR "EEH: Device driver ignored %d bad reads, panicing\n", + dn->eeh_check_count); + dump_stack(); /* re-read the slot reset state */ if (read_slot_reset_state(dn, rets) != 0) rets[0] = -1; /* reset state unknown */ - eeh_panic(dev, rets[0]); + +*((long *) 0x0) = 42; + /* If we are here, then we hit an infinite loop. Stop. */ + panic("EEH: MMIO halt (%d) on device:%s %s\n", rets[0], + pci_name(dev), pci_pretty_name(dev)); } return 0; } @@ -577,53 +585,43 @@ int eeh_dn_check_failure(struct device_n * In any case they must share a common PHB. */ ret = read_slot_reset_state(dn, rets); - if (!(ret == 0 && rets[1] == 1 && (rets[0] == 2 || rets[0] == 4))) { + if (!(ret == 0 && ((rets[1] == 1 && (rets[0] == 2 || rets[0] >= 4)) + || (rets[0] == 5)))) { __get_cpu_var(false_positives)++; return 0; } - /* prevent repeated reports of this failure */ - dn->eeh_mode |= EEH_MODE_ISOLATED; - - reset_state = rets[0]; + /* Note that empty slots will fail; empty slots don't have children... */ + if ((rets[0] == 5) && (dn->child == NULL)) { + __get_cpu_var(false_positives)++; + return 0; + } - spin_lock_irqsave(&slot_errbuf_lock, flags); - memset(slot_errbuf, 0, eeh_error_buf_size); + /* Avoid repeated reports of this failure, including problems + * with other functions on this device, and functions under + * bridges. 
*/ + eeh_mark_slot (dn->parent->child); + __get_cpu_var(slot_resets)++; - rc = rtas_call(ibm_slot_error_detail, - 8, 1, NULL, dn->eeh_config_addr, - BUID_HI(dn->phb->buid), - BUID_LO(dn->phb->buid), NULL, 0, - virt_to_phys(slot_errbuf), - eeh_error_buf_size, - 1 /* Temporary Error */); + if (!dev) + dev = eeh_find_pci_dev (dn); - if (rc == 0) - log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); - spin_unlock_irqrestore(&slot_errbuf_lock, flags); + /* Some devices go crazy if irq's are not ack'ed; disable irq now */ + if (dev) + disable_irq_nosync (dev->irq); + + state = pci_channel_io_normal; + if ((rets[0] == 2) || (rets[0] == 4)) + state = pci_channel_io_frozen; + if (rets[0] == 5) + state = pci_channel_io_perm_failure; - printk(KERN_INFO "EEH: MMIO failure (%d) on device: %s %s\n", - rets[0], dn->name, dn->full_name); - event = kmalloc(sizeof(*event), GFP_ATOMIC); - if (event == NULL) { - eeh_panic(dev, reset_state); - return 1; - } - - event->dev = dev; - event->dn = dn; - event->reset_state = reset_state; - - /* We may or may not be called in an interrupt context */ - spin_lock_irqsave(&eeh_eventlist_lock, flags); - list_add(&event->list, &eeh_eventlist); - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); + peh_send_failure_event (dev, state, rets[2]); /* Most EEH events are due to device driver bugs. Having * a stack trace will help the device-driver authors figure * out what happened. So print that out. */ - dump_stack(); - schedule_work(&eeh_event_wq); + if (rets[0] != 5) dump_stack(); return 0; } @@ -635,7 +633,6 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * @token i/o token, should be address in the form 0xA.... * @val value, should be all 1's (XXX why do we need this arg??) * - * Check for an eeh failure at the given token address. * Check for an EEH failure at the given token address. Call this * routine if the result of a read was all 0xff's and you want to * find out if this is due to an EEH slot freeze event. 
This routine @@ -643,6 +640,7 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * * Note this routine is safe to call in an interrupt context. */ + unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val) { unsigned long addr; @@ -652,8 +650,10 @@ unsigned long eeh_check_failure(const vo /* Finding the phys addr + pci device; this is pretty quick. */ addr = eeh_token_to_phys((unsigned long __force) token); dev = pci_get_device_by_addr(addr); - if (!dev) + if (!dev) { + __get_cpu_var(no_device)++; return val; + } dn = pci_device_to_OF_node(dev); eeh_dn_check_failure (dn, dev); @@ -664,6 +664,234 @@ unsigned long eeh_check_failure(const vo EXPORT_SYMBOL(eeh_check_failure); +/* ------------------------------------------------------------- */ +/* The code below deals with error recovery */ + +int +eeh_slot_is_isolated(struct pci_dev *dev) +{ + struct device_node *dn; + dn = pci_device_to_OF_node(dev); + return (dn->eeh_mode & EEH_MODE_ISOLATED); +} +EXPORT_SYMBOL(eeh_slot_is_isolated); + +int +eeh_ioaddr_is_isolated(const volatile void __iomem *token) +{ + unsigned long addr; + struct pci_dev *dev; + int rc; + + addr = eeh_token_to_phys((unsigned long __force) token); + dev = pci_get_device_by_addr(addr); + if (!dev) + return 0; + rc = eeh_slot_is_isolated(dev); + pci_dev_put(dev); + return rc; +} + +/** eeh_pci_slot_reset -- raises/lowers the pci #RST line + * state: 1/0 to raise/lower the #RST + */ +void +eeh_pci_slot_reset(struct pci_dev *dev, int state) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + rtas_pci_slot_reset (dn, state); +} + +/** Return negative value if a permanent error, else return + * a number of milliseconds to wait until the PCI slot is + * ready to be used. 
+ */ +static int +eeh_slot_availability(struct device_node *dn) +{ + int rc; + int rets[3]; + + rc = read_slot_reset_state(dn, rets); + + if (rc) return rc; + + if (rets[1] == 0) return -1; /* EEH is not supported */ + if (rets[0] == 0) return 0; /* Oll Korrect */ + if (rets[0] == 5) { + if (rets[2] == 0) return -1; /* permanently unavailable */ + return rets[2]; /* number of millisecs to wait */ + } + return -1; +} + +int +eeh_pci_slot_availability(struct pci_dev *dev) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + if (!dn) return -1; + + BUG_ON (dn->phb==NULL); + if (dn->phb==NULL) { + printk (KERN_ERR "EEH, checking on slot with no phb dn=%s dev=%s:%s\n", + dn->full_name, pci_name(dev), pci_pretty_name (dev)); + return -1; + } + return eeh_slot_availability (dn); +} + +void +rtas_pci_slot_reset(struct device_node *dn, int state) +{ + int rc; + + if (!dn) + return; + if (!dn->phb) { + printk (KERN_WARNING "EEH: in slot reset, device node %s has no phb\n", dn->full_name); + return; + } + + dn->eeh_mode |= EEH_MODE_RECOVERING; + rc = rtas_call(ibm_set_slot_reset,4,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), + state); + if (rc) { + printk (KERN_WARNING "EEH: Unable to reset the failed slot, (%d) #RST=%d\n", rc, state); + return; + } + + if (state == 0) + eeh_clear_slot (dn->parent->child); +} + +/** rtas_set_slot_reset -- assert the pci #RST line for 1/4 second + * dn -- device node to be reset. + */ + +void +rtas_set_slot_reset(struct device_node *dn) +{ + int i, rc; + + rtas_pci_slot_reset (dn, 1); + + /* The PCI bus requires that the reset be held high for at least + * a 100 milliseconds. We wait a bit longer 'just in case'. */ + +#define PCI_BUS_RST_HOLD_TIME_MSEC 250 + msleep (PCI_BUS_RST_HOLD_TIME_MSEC); + rtas_pci_slot_reset (dn, 0); + + /* After a PCI slot has been reset, the PCI Express spec requires + * a 1.5 second idle time for the bus to stabilize, before starting + * up traffic. 
*/ +#define PCI_BUS_SETTLE_TIME_MSEC 1800 + msleep (PCI_BUS_SETTLE_TIME_MSEC); + + /* Now double check with the firmware to make sure the device is + * ready to be used; if not, wait for recovery. */ + for (i=0; i<10; i++) { + rc = eeh_slot_availability (dn); + if (rc <= 0) break; + + msleep (rc+100); + } +} + +EXPORT_SYMBOL(rtas_set_slot_reset); + +void +rtas_configure_bridge(struct device_node *dn) +{ + int token = rtas_token ("ibm,configure-bridge"); + int rc; + + if (token == RTAS_UNKNOWN_SERVICE) + return; + rc = rtas_call(token,3,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid)); + if (rc) { + printk (KERN_WARNING "EEH: Unable to configure device bridge (%d) for %s\n", + rc, dn->full_name); + } +} + +EXPORT_SYMBOL(rtas_configure_bridge); + +/* ------------------------------------------------------- */ +/** Save and restore of PCI BARs + * + * Although firmware will set up BARs during boot, it doesn't + * set up device BAR's after a device reset, although it will, + * if requested, set up bridge configuration. Thus, we need to + * configure the PCI devices ourselves. Config-space setup is + * stored in the PCI structures which are normally deleted during + * device removal. Thus, the "save" routine references the + * structures so that they aren't deleted. + */ + +/** + * __restore_bars - Restore the Base Address Registers + * Loads the PCI configuration space base address registers, + * the expansion ROM base address, the latency timer, and etc. + * from the saved values in the device node. 
+ */ +static inline void __restore_bars (struct device_node *dn) +{ + int i; + + if (NULL==dn->phb) return; + for (i=4; i<10; i++) { + rtas_write_config(dn, i*4, 4, dn->config_space[i]); + } + + /* 12 == Expansion ROM Address */ + rtas_write_config(dn, 12*4, 4, dn->config_space[12]); + +#define BYTE_SWAP(OFF) (8*((OFF)/4)+3-(OFF)) +#define SAVED_BYTE(OFF) (((u8 *)(dn->config_space))[BYTE_SWAP(OFF)]) + + rtas_write_config (dn, PCI_CACHE_LINE_SIZE, 1, + SAVED_BYTE(PCI_CACHE_LINE_SIZE)); + + rtas_write_config (dn, PCI_LATENCY_TIMER, 1, + SAVED_BYTE(PCI_LATENCY_TIMER)); + + /* max latency, min grant, interrupt pin and line */ + rtas_write_config(dn, 15*4, 4, dn->config_space[15]); +} + +/** + * eeh_restore_bars - restore the PCI config space info + */ +void eeh_restore_bars(struct device_node *dn) +{ + if (! dn->eeh_is_bridge) + __restore_bars (dn); + + if (dn->child) + eeh_restore_bars (dn->child); +} + +void eeh_pci_restore_bars(struct pci_dev *dev) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + eeh_restore_bars (dn); +} + +/* ------------------------------------------------------------- */ +/* The code below deals with enabling EEH for devices during the + * early boot sequence. EEH must be enabled before any PCI probing + * can be done. 
+ */ + +#define EEH_ENABLE 1 + struct eeh_early_enable_info { unsigned int buid_hi; unsigned int buid_lo; @@ -682,6 +910,8 @@ static void *early_enable_eeh(struct dev int enable; dn->eeh_mode = 0; + dn->eeh_check_count = 0; + dn->eeh_freeze_count = 0; if (status && strcmp(status, "ok") != 0) return NULL; /* ignore devices with bad status */ @@ -743,7 +973,7 @@ static void *early_enable_eeh(struct dev dn->full_name); } - return NULL; + return NULL; } /* @@ -828,7 +1058,9 @@ void eeh_add_device_early(struct device_ return; phb = dn->phb; if (NULL == phb || 0 == phb->buid) { - printk(KERN_WARNING "EEH: Expected buid but found none\n"); + printk(KERN_WARNING "EEH: Expected buid but found none for %s\n", + dn->full_name); + dump_stack(); return; } @@ -847,6 +1079,9 @@ EXPORT_SYMBOL(eeh_add_device_early); */ void eeh_add_device_late(struct pci_dev *dev) { + int i; + struct device_node *dn; + if (!dev || !eeh_subsystem_enabled) return; @@ -856,6 +1091,14 @@ void eeh_add_device_late(struct pci_dev #endif pci_addr_cache_insert_device (dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); + + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + dn->eeh_is_bridge = 1; } EXPORT_SYMBOL(eeh_add_device_late); @@ -885,12 +1128,17 @@ static int proc_eeh_show(struct seq_file unsigned int cpu; unsigned long ffs = 0, positives = 0, failures = 0; unsigned long resets = 0; + unsigned long no_dev = 0, no_dn = 0, no_cfg = 0, no_check = 0; for_each_cpu(cpu) { ffs += per_cpu(total_mmio_ffs, cpu); positives += per_cpu(false_positives, cpu); failures += per_cpu(ignored_failures, cpu); resets += per_cpu(slot_resets, cpu); + no_dev += per_cpu(no_device, cpu); + no_dn += per_cpu(no_dn, cpu); + no_cfg += per_cpu(no_cfg_addr, cpu); + no_check += per_cpu(ignored_check, cpu); } if (0 == eeh_subsystem_enabled) { @@ -898,13 +1146,17 @@ static int 
proc_eeh_show(struct seq_file seq_printf(m, "eeh_total_mmio_ffs=%ld\n", ffs); } else { seq_printf(m, "EEH Subsystem is enabled\n"); - seq_printf(m, "eeh_total_mmio_ffs=%ld\n" + seq_printf(m, + "no device=%ld\n" + "no device node=%ld\n" + "no config address=%ld\n" + "check not wanted=%ld\n" + "eeh_total_mmio_ffs=%ld\n" "eeh_false_positives=%ld\n" "eeh_ignored_failures=%ld\n" - "eeh_slot_resets=%ld\n" - "eeh_fail_count=%d\n", - ffs, positives, failures, resets, - eeh_fail_count.counter); + "eeh_slot_resets=%ld\n", + no_dev, no_dn, no_cfg, no_check, + ffs, positives, failures, resets); } return 0; --- arch/ppc64/kernel/pSeries_pci.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ arch/ppc64/kernel/pSeries_pci.c 2005-06-22 15:28:29.000000000 -0500 @@ -62,7 +62,7 @@ static int config_access_valid(struct de return 0; } -static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) +int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) { int returnval = -1; unsigned long buid, addr; @@ -112,7 +112,7 @@ static int rtas_pci_read_config(struct p return PCIBIOS_DEVICE_NOT_FOUND; } -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +int rtas_write_config(struct device_node *dn, int where, int size, u32 val) { unsigned long buid, addr; int ret; --- drivers/pci/hotplug/rpaphp.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ drivers/pci/hotplug/rpaphp.h 2005-06-22 15:28:29.000000000 -0500 @@ -113,7 +113,8 @@ extern int rpaphp_enable_pci_slot(struct extern int register_pci_slot(struct slot *slot); extern int rpaphp_unconfig_pci_adapter(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); -extern struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev); +extern void init_eeh_handler (void); +extern void exit_eeh_handler (void); /* rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); --- 
drivers/pci/hotplug/rpaphp_core.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ drivers/pci/hotplug/rpaphp_core.c 2005-06-22 15:28:29.000000000 -0500 @@ -460,12 +460,18 @@ static int __init rpaphp_init(void) { info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); + /* Get set to handle EEH events. */ + init_eeh_handler(); + /* read all the PRA info from the system */ return init_rpa(); } static void __exit rpaphp_exit(void) { + /* Let EEH know we are going away. */ + exit_eeh_handler(); + cleanup_slots(); } --- drivers/pci/hotplug/rpaphp_pci.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ drivers/pci/hotplug/rpaphp_pci.c 2005-06-22 15:28:29.000000000 -0500 @@ -24,6 +24,7 @@ */ #include #include +#include #include #include #include "../pci.h" /* for pci_add_new_bus */ @@ -63,6 +64,7 @@ int rpaphp_claim_resource(struct pci_dev root ? "Address space collision on" : "No parent found for", resource, dtype, pci_name(dev), res->start, res->end); + dump_stack(); } return err; } @@ -188,6 +190,19 @@ rpaphp_fixup_new_pci_devices(struct pci_ static int rpaphp_pci_config_bridge(struct pci_dev *dev); +static void rpaphp_eeh_add_bus_device(struct pci_bus *bus) +{ + struct pci_dev *dev; + list_for_each_entry(dev, &bus->devices, bus_list) { + eeh_add_device_late(dev); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + struct pci_bus *subbus = dev->subordinate; + if (bus) + rpaphp_eeh_add_bus_device (subbus); + } + } +} + /***************************************************************************** rpaphp_pci_config_slot() will configure all devices under the given slot->dn and return the the first pci_dev. 
@@ -215,6 +230,8 @@ rpaphp_pci_config_slot(struct device_nod } if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) rpaphp_pci_config_bridge(dev); + + rpaphp_eeh_add_bus_device(bus); } return dev; } @@ -223,7 +240,6 @@ static int rpaphp_pci_config_bridge(stru { u8 sec_busno; struct pci_bus *child_bus; - struct pci_dev *child_dev; dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev)); @@ -240,11 +256,7 @@ static int rpaphp_pci_config_bridge(stru /* do pci_scan_child_bus */ pci_scan_child_bus(child_bus); - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - - /* fixup new pci devices without touching bus struct */ + /* Fixup new pci devices without touching bus struct */ rpaphp_fixup_new_pci_devices(child_bus, 0); /* Make the discovered devices available */ @@ -320,7 +332,6 @@ static void rpaphp_eeh_remove_bus_device if (pdev) rpaphp_eeh_remove_bus_device(pdev); } - } return; } @@ -503,36 +514,3 @@ exit: return retval; } -struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev) -{ - struct list_head *tmp, *n; - struct slot *slot; - - list_for_each_safe(tmp, n, &rpaphp_slot_head) { - struct pci_bus *bus; - struct list_head *ln; - - slot = list_entry(tmp, struct slot, rpaphp_slot_list); - if (slot->bridge == NULL) { - if (slot->dev_type == PCI_DEV) { - printk(KERN_WARNING "PCI slot missing bridge %s %s \n", - slot->name, slot->location); - } - continue; - } - - bus = slot->bridge->subordinate; - if (!bus) { - continue; /* should never happen? 
*/ - } - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev == dev) - return slot->hotplug_slot; - } - } - - return NULL; -} - -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); --- drivers/pci/hotplug/rpaphp_eeh.c.linas-orig 2005-06-22 15:28:15.000000000 -0500 +++ drivers/pci/hotplug/rpaphp_eeh.c 2005-06-22 17:04:51.000000000 -0500 @@ -0,0 +1,380 @@ +/* + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform. + * Copyright (C) 2004, 2005 Linas Vepstas + * + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or (at + * your option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + * + * Send feedback to + * + */ +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../pci.h" +#include "rpaphp.h" + +/** + * pci_search_bus_for_dev - return 1 if device is under this bus, else 0 + * @bus: the bus to search for this device. + * @dev: the pci device we are looking for. + * + * XXX should this be moved to drivers/pci/search.c ? 
+ */ +static int pci_search_bus_for_dev (struct pci_bus *bus, struct pci_dev *dev) +{ + struct list_head *ln; + + if (!bus) return 0; + + for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { + struct pci_dev *pdev = pci_dev_b(ln); + if (pdev == dev) + return 1; + if (pdev->subordinate) { + int rc; + rc = pci_search_bus_for_dev (pdev->subordinate, dev); + if (rc) + return 1; + } + } + return 0; +} + +/** pci_walk_bus - walk bus under this device, calling callback. + * @top device whose peers should be walked + * @cb callback to be called for each device found + * @userdata arbitrary pointer to be passed to callback. + * + * Walk the bus on which this device sits, including any + * bridged devices on busses under this bus. Call the provided + * callback on each device found. + */ +typedef void (*pci_buswalk_cb)(struct pci_dev *, void *); + +static void +pci_walk_bus (struct pci_dev *top, pci_buswalk_cb cb, void *userdata) +{ + struct pci_dev *dev, *tmp; + + spin_lock(&pci_bus_lock); + list_for_each_entry_safe (dev, tmp, &top->bus->devices, bus_list) { + pci_dev_get(dev); + spin_unlock(&pci_bus_lock); + + /* Run device routines with the bus unlocked */ + cb (dev, userdata); + if (dev->subordinate) { + pci_walk_bus (pci_dev_b(&dev->subordinate->devices), cb, userdata); + } + spin_lock(&pci_bus_lock); + pci_dev_put(dev); + } + spin_unlock(&pci_bus_lock); +} + +/** + * rpaphp_find_slot - find and return the slot holding the device + * @dev: pci device for which we want the slot structure. + */ +static struct slot *rpaphp_find_slot(struct pci_dev *dev) +{ + struct list_head *tmp, *n; + struct slot *slot; + + list_for_each_safe(tmp, n, &rpaphp_slot_head) { + struct pci_bus *bus; + + slot = list_entry(tmp, struct slot, rpaphp_slot_list); + + /* PHB's don't have bridges. */ + if (slot->bridge == NULL) + continue; + + /* The PCI device could be the slot itself. 
*/ + if (slot->bridge == dev) + return slot; + + bus = slot->bridge->subordinate; + if (!bus) { + printk (KERN_WARNING "PCI bridge is missing bus: %s %s\n", + pci_name (slot->bridge), pci_pretty_name (slot->bridge)); + continue; /* should never happen? */ + } + + if (pci_search_bus_for_dev (bus, dev)) + return slot; + } + return NULL; +} + +/* ------------------------------------------------------- */ +/** eeh_report_error - report an EEH error to each device, + * collect up and merge the device responses. + */ + +static void eeh_report_error(struct pci_dev *dev, void *userdata) +{ + enum pcierr_result rc, *res = userdata; + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + if (!driver->err_handler.error_detected) + return; + + rc = driver->err_handler.error_detected (dev, pci_channel_io_frozen); + if (*res == PCIERR_RESULT_NONE) *res = rc; + if (*res == PCIERR_RESULT_NEED_RESET) return; + if (*res == PCIERR_RESULT_DISCONNECT && + rc == PCIERR_RESULT_NEED_RESET) *res = rc; +} + +/** eeh_report_reset -- tell this device that the pci slot + * has been reset. + */ + +static void eeh_report_reset(struct pci_dev *dev, void *userdata) +{ + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + if (!driver->err_handler.slot_reset) + return; + + driver->err_handler.slot_reset (dev); +} + +static void eeh_report_resume(struct pci_dev *dev, void *userdata) +{ + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + if (!driver->err_handler.resume) + return; + + driver->err_handler.resume (dev); +} + +static void eeh_report_failure(struct pci_dev *dev, void *userdata) +{ + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + if (!driver->err_handler.error_detected) + return; + + driver->err_handler.error_detected (dev, pci_channel_io_perm_failure); +} + +/* ------------------------------------------------------- */ +/** + * handle_eeh_events -- reset a PCI device after hard lockup. 
+ * + * pSeries systems will isolate a PCI slot if the PCI-Host + * bridge detects address or data parity errors, DMA's + * occuring to wild addresses (which usually happen due to + * bugs in device drivers or in PCI adapter firmware). + * Slot isolations also occur if #SERR, #PERR or other misc + * PCI-related errors are detected. + * + * Recovery process consists of unplugging the device driver + * (which generated hotplug events to userspace), then issuing + * a PCI #RST to the device, then reconfiguring the PCI config + * space for all bridges & devices under this slot, and then + * finally restarting the device drivers (which cause a second + * set of hotplug events to go out to userspace). + */ + +int eeh_reset_device (struct pci_dev *dev, struct device_node *dn, int reconfig) +{ + struct slot *frozen_slot= NULL; + + if (!dev) + return 1; + + if (reconfig) + frozen_slot = rpaphp_find_slot(dev); + + if (reconfig && frozen_slot) rpaphp_unconfig_pci_adapter (frozen_slot); + + /* Reset the pci controller. (Asserts RST#; resets config space). + * Reconfigure bridges and devices */ + rtas_set_slot_reset (dn->child); + + /* Walk over all functions on this device */ + struct device_node *peer = dn->child; + while (peer) { + rtas_configure_bridge(peer); + eeh_restore_bars(peer); + peer = peer->sibling; + } + + /* Give the system 5 seconds to finish running the user-space + * hotplug scripts, e.g. ifdown for ethernet. Yes, this is a hack, + * but if we don't do this, weird things happen. + */ + if (reconfig && frozen_slot) { + ssleep (5); + rpaphp_enable_pci_slot (frozen_slot); + } + return 0; +} + +/* The longest amount of time to wait for a pci device + * to come back on line, in seconds. 
+ */ +#define MAX_WAIT_FOR_RECOVERY 15 + +int handle_eeh_events (struct notifier_block *self, + unsigned long reason, void *ev) +{ + int freeze_count=0; + struct device_node *frozen_device; + struct peh_event *event = ev; + struct pci_dev *dev = event->dev; + int perm_failure = 0; + + if (!dev) + { + printk ("EEH: EEH error caught, but no PCI device specified!\n"); + return 1; + } + + frozen_device = pci_bus_to_OF_node(dev->bus); + if (!frozen_device) + { + printk (KERN_ERR "EEH: Cannot find PCI controller for %s %s\n", + pci_name(dev), pci_pretty_name (dev)); + + return 1; + } + BUG_ON (frozen_device->phb==NULL); + + /* We get "permanent failure" messages on empty slots. + * These are false alarms. Empty slots have no child dn. */ + if ((event->state == pci_channel_io_perm_failure) && (frozen_device == NULL)) + return 0; + + if (frozen_device) + freeze_count = frozen_device->eeh_freeze_count; + freeze_count ++; + if (freeze_count > EEH_MAX_ALLOWED_FREEZES) + perm_failure = 1; + + /* If the reset state is a '5' and the time to reset is 0 (infinity) + * or is more then 15 seconds, then mark this as a permanent failure. + */ + if ((event->state == pci_channel_io_perm_failure) && + ((event->time_unavail <= 0) || + (event->time_unavail > MAX_WAIT_FOR_RECOVERY*1000))) + perm_failure = 1; + + /* Log the error with the rtas logger. */ + if (perm_failure) { + /* + * About 90% of all real-life EEH failures in the field + * are due to poorly seated PCI cards. Only 10% or so are + * due to actual, failed cards. + */ + printk (KERN_ERR + "EEH: device %s:%s has failed %d times \n" + "and has been permanently disabled. Please try reseating\n" + "this device or replacing it.\n", + pci_name (dev), + pci_pretty_name (dev), + freeze_count); + + eeh_slot_error_detail (frozen_device, 2 /* Permanent Error */); + + /* Notify all devices that they're about to go down. 
*/ + pci_walk_bus (dev, eeh_report_failure, 0); + + /* If there's a hotplug slot, unconfigure it */ + // XXX we need alternate way to deconfigure non-hotplug slots. + struct slot * frozen_slot = rpaphp_find_slot(dev); + if (frozen_slot) + rpaphp_unconfig_pci_adapter (frozen_slot); + return 1; + } else { + eeh_slot_error_detail (frozen_device, 1 /* Temporary Error */); + } + + printk (KERN_WARNING + "EEH: This device has failed %d times since last reboot: %s:%s\n", + freeze_count, + pci_name (dev), + pci_pretty_name (dev)); + + /* Walk the various device drivers attached to this slot, + * letting each know about the EEH bug. + */ + enum pcierr_result result = PCIERR_RESULT_NONE; + pci_walk_bus (dev, eeh_report_error, &result); + + /* If all device drivers were EEH-unaware, then pci hotplug + * the device, and hope that clears the error. */ + if (result == PCIERR_RESULT_NONE) { + eeh_reset_device (dev, frozen_device, 1); + } + + /* If any device called out for a reset, then reset the slot */ + if (result == PCIERR_RESULT_NEED_RESET) { + eeh_reset_device (dev, frozen_device, 0); + pci_walk_bus (dev, eeh_report_reset, 0); + } + + /* If all devices reported they can proceed, the re-enable PIO */ + if (result == PCIERR_RESULT_CAN_RECOVER) { + /* XXX Not supported; we brute-force reset the device */ + eeh_reset_device (dev, frozen_device, 0); + pci_walk_bus (dev, eeh_report_reset, 0); + } + + /* Tell all device drivers that they can resume operations */ + pci_walk_bus (dev, eeh_report_resume, 0); + + /* Store the freeze count with the pci adapter, and not the slot. + * This way, if the device is replaced, the count is cleared. 
+	 */
+	frozen_device->eeh_freeze_count = freeze_count;
+
+	return 1;
+}
+
+static struct notifier_block eeh_block;
+
+void __init init_eeh_handler (void)
+{
+	eeh_block.notifier_call = handle_eeh_events;
+	peh_register_notifier (&eeh_block);
+}
+
+void __exit exit_eeh_handler (void)
+{
+	peh_unregister_notifier (&eeh_block);
+}
+
--- drivers/pci/hotplug/Makefile.linas-orig	2005-06-17 14:48:29.000000000 -0500
+++ drivers/pci/hotplug/Makefile	2005-06-22 15:28:29.000000000 -0500
@@ -41,6 +41,7 @@ acpiphp-objs		:=	acpiphp_core.o	\
 				acpiphp_res.o
 
 rpaphp-objs		:=	rpaphp_core.o	\
+				rpaphp_eeh.o	\
 				rpaphp_pci.o	\
 				rpaphp_slot.o	\
 				rpaphp_vio.o

From arnd at arndb.de Thu Jun 23 08:47:15 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 23 Jun 2005 00:47:15 +0200
Subject: Proposal for reorg of kernel directory
In-Reply-To:
References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <1119400318.18247.190.camel@gaston>
Message-ID: <200506230047.15672.arnd@arndb.de>

On Wednesday 22 June 2005 21:36, Becky Bruce wrote:

> OK, so let's talk about how the organization of platforms would look if
> we go with the split. As I see it, there are several options:
>
> 1. Slam all of the platform-specific files into a flat "platforms"
> directory.

That seems to combine the most pain with the least benefit. Instead of
one heap of stuff you'd have two, with unclear rules of which one some
of the files belong to. E.g. interrupt handling code traditionally
belongs in the kernel directory but is also platform specific.

Moving a file around breaks patches and causes trouble when typing
the old names by habit, so it should only be done if there is
- a real advantage in moving it
- an obvious choice where it belongs
- a low risk of having to move it again

> 2. Do something similar to the 32-bit tree's implementation of the
> platforms dir, where single-platform code is at the highest level in
> "platforms".
> Create subdirectories for platforms which are very
> similar and share most of their code - I believe this ties in with
> Arnd's idea about more generic platforms.

Besides ppc32, there is no other arch implementing that scheme. I'd rather
name files in the most common way than trying to mimic ppc32 in order to
minimize the surprise level of people doing infrequent cross-platform
development.

> 3. Subdirs for each platform under the platforms directory. I don't
> really like this one, but it is an option.

This is done by h8300, m68knommu and sh, which are all infrequently used
by most kernel hackers. The most widely used naming scheme would be

4. Have subdirectories directly under arch/ppc64

  arch/
    ppc64/
      kernel
      mach-iSeries
      mach-pSeries
      mach-bpa
      ...

We might want to drop the mach- prefix though, since these are
commonly called platforms, not machines, on ppc64.

Also, while we're at it, we should finally get rid of the StudlyCaps
file names like 'ItLpQueue'. Using the capital S in iSeries and
pSeries is probably a good idea, but it can also be dropped as
a file name prefix if the directory already has the platform name.

My suggestion would be: # move all pci stuff together, regardless of platform (see i386, mips) mv arch/ppc64/kernel/pci_iommu.c arch/ppc64/pci/iommu.c mv arch/ppc64/kernel/pci_direct_iommu.c arch/ppc64/pci/direct_iommu.c mv arch/ppc64/kernel/pci_dn.c arch/ppc64/pci/prom.c mv arch/ppc64/kernel/iSeries_pci.c arch/ppc64/pci/iSeries_pci.c mv arch/ppc64/kernel/iSeries_pci_reset.c arch/ppc64/pci/iSeries_pci_reset.c mv arch/ppc64/kernel/pSeries_pci.c arch/ppc64/pci/pSeries_pci.c mv arch/ppc64/kernel/maple_pci.c arch/ppc64/pci/maple_pci.c mv arch/ppc64/kernel/pmac_pci.c arch/ppc64/pci/pmac_pci.c # interrupt related stuff mv arch/ppc64/kernel/xics.c arch/ppc64/kernel/irq/xics.c mv arch/ppc64/kernel/mpic.c arch/ppc64/kernel/irq/mpic.c mv arch/ppc64/kernel/bpa_iic.c arch/ppc64/kernel/irq/bpa_iic.c mv arch/ppc64/kernel/spider_pic.c arch/ppc64/kernel/irq/spider_pic.c mv arch/ppc64/kernel/irq.c arch/ppc64/kernel/irq/irq.c mv arch/ppc64/kernel/iSeries_irq.c arch/ppc64/kernel/irq/iSeries_irq.c # move ppc32 emulation to a new dir (see x86_64) mv arch/ppc64/kernel/*32.[chS] arch/ppc64/ppc32/ # get rtas out of kernel/ mv arch/ppc64/kernel/rtasd.c arch/ppc64/rtas/rtasd.c mv arch/ppc64/kernel/rtas_flash.c arch/ppc64/rtas/flash.c mv arch/ppc64/kernel/rtas-proc.c arch/ppc64/rtas/proc.c mv arch/ppc64/kernel/rtas_pci.c arch/ppc64/rtas/pci.c # all iSeries files mv arch/ppc64/kernel/HvCall.c arch/ppc64/iSeries/hvcall.c mv arch/ppc64/kernel/hvCall.S arch/ppc64/iSeries/hvcall.S mv arch/ppc64/kernel/HvLpConfig.c arch/ppc64/iSeries/hv_lpconfig.c mv arch/ppc64/kernel/HvLpEvent.c arch/ppc64/iSeries/hv_lpevent.c mv arch/ppc64/kernel/iSeries_htab.c arch/ppc64/iSeries/htab.c mv arch/ppc64/kernel/iSeries_iommu.c arch/ppc64/iSeries/iommu.c mv arch/ppc64/kernel/iSeries_proc.c arch/ppc64/iSeries/proc.c mv arch/ppc64/kernel/iSeries_setup.c arch/ppc64/iSeries/setup.c mv arch/ppc64/kernel/iSeries_setup.h arch/ppc64/iSeries/setup.h mv arch/ppc64/kernel/iSeries_smp.c arch/ppc64/iSeries/smp.c mv 
arch/ppc64/kernel/iSeries_VpdInfo.c arch/ppc64/iSeries/vpdinfo.c mv arch/ppc64/kernel/ItLpQueue.c arch/ppc64/iSeries/it_lpqueue.c mv arch/ppc64/kernel/LparData.c arch/ppc64/iSeries/lpardata.c mv arch/ppc64/kernel/mf.c arch/ppc64/iSeries/mf.c mv arch/ppc64/kernel/XmPciLpEvent.c arch/ppc64/iSeries/xm_pci_lpevent.c # all pSeries files, but leave xics.c in kernel/ (with other interrupt code) mv arch/ppc64/kernel/ras.c arch/ppc64/pSeries/ras.c mv arch/ppc64/kernel/scanlog.c arch/ppc64/pSeries/scanlog.c mv arch/ppc64/kernel/eeh.c arch/ppc64/pSeries/eeh.c mv arch/ppc64/kernel/hvconsole.c arch/ppc64/pSeries/hvconsole.c mv arch/ppc64/kernel/hvcserver.c arch/ppc64/pSeries/hvcserver.c mv arch/ppc64/kernel/pSeries_hvCall.S arch/ppc64/pSeries/hvcall.S mv arch/ppc64/kernel/pSeries_iommu.c arch/ppc64/pSeries/iommu.c mv arch/ppc64/kernel/pSeries_lpar.c arch/ppc64/pSeries/lpar.c mv arch/ppc64/kernel/pSeries_nvram.c arch/ppc64/pSeries/nvram.c mv arch/ppc64/kernel/pSeries_setup.c arch/ppc64/pSeries/setup.c mv arch/ppc64/kernel/pSeries_smp.c arch/ppc64/pSeries/smp.c # all pmac files mv arch/ppc64/kernel/pmac_feature.c arch/ppc64/pmac/feature.c mv arch/ppc64/kernel/pmac.h arch/ppc64/pmac/pmac.h mv arch/ppc64/kernel/pmac_low_i2c.c arch/ppc64/pmac/low_i2c.c mv arch/ppc64/kernel/pmac_nvram.c arch/ppc64/pmac/nvram.c mv arch/ppc64/kernel/pmac_setup.c arch/ppc64/pmac/setup.c mv arch/ppc64/kernel/pmac_smp.c arch/ppc64/pmac/smp.c mv arch/ppc64/kernel/pmac_time.c arch/ppc64/pmac/time.c # BPA from the patches I sent today, excluding iic and spider_pic mv arch/ppc64/kernel/bpa_nvram.c arch/ppc64/bpa/nvram.c mv arch/ppc64/kernel/bpa_iommu.c arch/ppc64/bpa/iommu.c mv arch/ppc64/kernel/bpa_iommu.h arch/ppc64/bpa/iommu.h mv arch/ppc64/kernel/bpa_setup.c arch/ppc64/bpa/setup.c mv arch/ppc64/kernel/spu_base.c arch/ppc64/bpa/spu_base.c # Not sure about a shared path for maple and future platforms mv arch/ppc64/kernel/maple_setup.c arch/ppc64/maple/setup.c mv arch/ppc64/kernel/maple_time.c 
arch/ppc64/maple/time.c It's always a compromise between binding to the platform or to the functionality. A different approach would be to group mostly by functionality, which would probably be irq, iommu, time, smp, nvram and setup, while putting the remaining platform files into platform directories. Arnd <>< From kumar.gala at freescale.com Thu Jun 23 08:56:30 2005 From: kumar.gala at freescale.com (Kumar Gala) Date: Wed, 22 Jun 2005 17:56:30 -0500 Subject: Proposal for reorg of kernel directory In-Reply-To: <200506230047.15672.arnd@arndb.de> References: <200506230047.15672.arnd@arndb.de> Message-ID: <5B05630D-1726-4401-BD86-CAB1EC9772E6@freescale.com> > 4. Have subdirectories directly under arch/ppc64 > > arch/ > ppc64/ > kernel > mach-iSeries > mach-pSeries > mach-bpa > ... > > We might want to drop the mach- prefix though, since these are > commonly called platforms, not machines, on ppc64. > > Also, while we're at it, we should finally get rid of the StudlyCaps > file names like 'ItLpQueue'. Using the capital S in iSeries and > pSeries is probably a good idea, but it can also be dropped as > a file name prefix if the directory already has the platform name. I think one thing you have to realize is that the days of a few ppc64 "mach" or "platforms" are numbered. As 970 and future 64-bit PPC parts from Freescale start to exist I'd expect to see more and more different boards existing. I don't think we want to add directories for each new board port that comes along. - kumar
From linas at austin.ibm.com Fri Jun 24 02:45:56 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Thu, 23 Jun 2005 11:45:56 -0500 Subject: Proposal for reorg of kernel directory In-Reply-To: <200506230047.15672.arnd@arndb.de> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <1119400318.18247.190.camel@gaston> <200506230047.15672.arnd@arndb.de> Message-ID: <20050623164556.GA28499@austin.ibm.com> On Thu, Jun 23, 2005 at 12:47:15AM +0200, Arnd Bergmann was heard to remark: > My suggestion would be: > > # move all pci stuff together, regardless of platform (see i386, mips) > mv arch/ppc64/kernel/pci_iommu.c arch/ppc64/pci/iommu.c > mv arch/ppc64/kernel/pci_direct_iommu.c arch/ppc64/pci/direct_iommu.c > mv arch/ppc64/kernel/pci_dn.c arch/ppc64/pci/prom.c > mv arch/ppc64/kernel/iSeries_pci.c arch/ppc64/pci/iSeries_pci.c > mv arch/ppc64/kernel/iSeries_pci_reset.c arch/ppc64/pci/iSeries_pci_reset.c > mv arch/ppc64/kernel/pSeries_pci.c arch/ppc64/pci/pSeries_pci.c > mv arch/ppc64/kernel/maple_pci.c arch/ppc64/pci/maple_pci.c > mv arch/ppc64/kernel/pmac_pci.c arch/ppc64/pci/pmac_pci.c > > # interrupt related stuff > mv arch/ppc64/kernel/xics.c arch/ppc64/kernel/irq/xics.c > mv arch/ppc64/kernel/mpic.c arch/ppc64/kernel/irq/mpic.c > mv arch/ppc64/kernel/bpa_iic.c arch/ppc64/kernel/irq/bpa_iic.c > mv arch/ppc64/kernel/spider_pic.c arch/ppc64/kernel/irq/spider_pic.c > mv arch/ppc64/kernel/irq.c arch/ppc64/kernel/irq/irq.c > mv arch/ppc64/kernel/iSeries_irq.c arch/ppc64/kernel/irq/iSeries_irq.c > > # move ppc32 emulation to a new dir (see x86_64) > mv arch/ppc64/kernel/*32.[chS] arch/ppc64/ppc32/ > > # get rtas out of kernel/ > mv arch/ppc64/kernel/rtasd.c arch/ppc64/rtas/rtasd.c > mv arch/ppc64/kernel/rtas_flash.c arch/ppc64/rtas/flash.c > mv arch/ppc64/kernel/rtas-proc.c arch/ppc64/rtas/proc.c > mv arch/ppc64/kernel/rtas_pci.c arch/ppc64/rtas/pci.c > > # all iSeries
files > mv arch/ppc64/kernel/HvCall.c arch/ppc64/iSeries/hvcall.c > mv arch/ppc64/kernel/hvCall.S arch/ppc64/iSeries/hvcall.S > mv arch/ppc64/kernel/HvLpConfig.c arch/ppc64/iSeries/hv_lpconfig.c > mv arch/ppc64/kernel/HvLpEvent.c arch/ppc64/iSeries/hv_lpevent.c > mv arch/ppc64/kernel/iSeries_htab.c arch/ppc64/iSeries/htab.c > mv arch/ppc64/kernel/iSeries_iommu.c arch/ppc64/iSeries/iommu.c > mv arch/ppc64/kernel/iSeries_proc.c arch/ppc64/iSeries/proc.c > mv arch/ppc64/kernel/iSeries_setup.c arch/ppc64/iSeries/setup.c > mv arch/ppc64/kernel/iSeries_setup.h arch/ppc64/iSeries/setup.h > mv arch/ppc64/kernel/iSeries_smp.c arch/ppc64/iSeries/smp.c > mv arch/ppc64/kernel/iSeries_VpdInfo.c arch/ppc64/iSeries/vpdinfo.c > mv arch/ppc64/kernel/ItLpQueue.c arch/ppc64/iSeries/it_lpqueue.c > mv arch/ppc64/kernel/LparData.c arch/ppc64/iSeries/lpardata.c > mv arch/ppc64/kernel/mf.c arch/ppc64/iSeries/mf.c > mv arch/ppc64/kernel/XmPciLpEvent.c arch/ppc64/iSeries/xm_pci_lpevent.c > > # all pSeries files, but leave xics.c in kernel/ (with other interrupt code) > mv arch/ppc64/kernel/ras.c arch/ppc64/pSeries/ras.c > mv arch/ppc64/kernel/scanlog.c arch/ppc64/pSeries/scanlog.c > mv arch/ppc64/kernel/eeh.c arch/ppc64/pSeries/eeh.c > mv arch/ppc64/kernel/hvconsole.c arch/ppc64/pSeries/hvconsole.c > mv arch/ppc64/kernel/hvcserver.c arch/ppc64/pSeries/hvcserver.c > mv arch/ppc64/kernel/pSeries_hvCall.S arch/ppc64/pSeries/hvcall.S > mv arch/ppc64/kernel/pSeries_iommu.c arch/ppc64/pSeries/iommu.c > mv arch/ppc64/kernel/pSeries_lpar.c arch/ppc64/pSeries/lpar.c > mv arch/ppc64/kernel/pSeries_nvram.c arch/ppc64/pSeries/nvram.c > mv arch/ppc64/kernel/pSeries_setup.c arch/ppc64/pSeries/setup.c > mv arch/ppc64/kernel/pSeries_smp.c arch/ppc64/pSeries/smp.c > > # all pmac files > mv arch/ppc64/kernel/pmac_feature.c arch/ppc64/pmac/feature.c > mv arch/ppc64/kernel/pmac.h arch/ppc64/pmac/pmac.h > mv arch/ppc64/kernel/pmac_low_i2c.c arch/ppc64/pmac/low_i2c.c > mv arch/ppc64/kernel/pmac_nvram.c 
arch/ppc64/pmac/nvram.c > mv arch/ppc64/kernel/pmac_setup.c arch/ppc64/pmac/setup.c > mv arch/ppc64/kernel/pmac_smp.c arch/ppc64/pmac/smp.c > mv arch/ppc64/kernel/pmac_time.c arch/ppc64/pmac/time.c > > # BPA from the patches I sent today, excluding iic and spider_pic > mv arch/ppc64/kernel/bpa_nvram.c arch/ppc64/bpa/nvram.c > mv arch/ppc64/kernel/bpa_iommu.c arch/ppc64/bpa/iommu.c > mv arch/ppc64/kernel/bpa_iommu.h arch/ppc64/bpa/iommu.h > mv arch/ppc64/kernel/bpa_setup.c arch/ppc64/bpa/setup.c > mv arch/ppc64/kernel/spu_base.c arch/ppc64/bpa/spu_base.c > > # Not sure about a shared path for maple and future platforms > mv arch/ppc64/kernel/maple_setup.c arch/ppc64/maple/setup.c > mv arch/ppc64/kernel/maple_time.c arch/ppc64/maple/time.c Ohh, I like this. But what do I know... --linas From jschopp at austin.ibm.com Fri Jun 24 06:22:30 2005 From: jschopp at austin.ibm.com (Joel Schopp) Date: Thu, 23 Jun 2005 15:22:30 -0500 Subject: Proposal for reorg of kernel directory In-Reply-To: <20050623164556.GA28499@austin.ibm.com> References: <29f48dd6f4b92e0fcd5120bc7bbcc340@freescale.com> <1119400318.18247.190.camel@gaston> <200506230047.15672.arnd@arndb.de> <20050623164556.GA28499@austin.ibm.com> Message-ID: <42BB1A06.3070306@austin.ibm.com> > Ohh, I like this. But what do I know... It's certainly not a democracy, but if it were I'd vote for this one. It seems very clean to me. By the way, could somebody point me to any documentation on the maple specs? From haren at us.ibm.com Fri Jun 24 08:38:05 2005 From: haren at us.ibm.com (Haren Myneni) Date: Thu, 23 Jun 2005 15:38:05 -0700 Subject: 2.6.12-mm1 compilation error with CONFIG_XMON enabled Message-ID: <42BB39CD.7090605@us.ibm.com> Hello, Getting undefined symbol ".cacheflush". This symbol is defined as static (xmon.c) and using in sched_cacheflush() (kernel/sched.c). Thanks Haren -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 2612-mm1-compile_error.patch Type: text/x-patch Size: 504 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050623/5dd2774b/attachment.bin From paulus at samba.org Fri Jun 24 09:53:44 2005 From: paulus at samba.org (Paul Mackerras) Date: Fri, 24 Jun 2005 09:53:44 +1000 Subject: 2.6.12-mm1 compilation error with CONFIG_XMON enabled In-Reply-To: <42BB39CD.7090605@us.ibm.com> References: <42BB39CD.7090605@us.ibm.com> Message-ID: <17083.19336.37546.498966@cargo.ozlabs.ibm.com> Haren Myneni writes: > -static void cacheflush(void); > +void cacheflush(void); This isn't right; the cacheflush() from xmon is totally inappropriate for implementing sched_cacheflush(). Just take the cacheflush call out of sched_cacheflush for now. Paul. From tom_gall at mac.com Fri Jun 24 14:56:38 2005 From: tom_gall at mac.com (Tom _) Date: Thu, 23 Jun 2005 23:56:38 -0500 Subject: F80, 2.6.12 broken Message-ID: Greetings, It would appear the f80 is still busted on recent kernels. Not sure when it broke but it's certainly been in bad shape for 2.6.10,11 and now 12 as well. Last message is returning from prom_init (helpful yeah...) buuut earlier on in the messages from the kernel is this little bit: instantiating rtas at 0x00000000deadbeef ... failed I fully expect that's the real problem. Any patches to try? Advise? Thanks! Tom tgall at gentoo.org From michael at ellerman.id.au Fri Jun 24 17:17:07 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 24 Jun 2005 17:17:07 +1000 Subject: [PATCH] ppc64: Fix compile warnings in arch/ppc64/kernel/lparcfg.c Message-ID: <200506241717.07844.michael@ellerman.id.au> Hi, Stephen's patch to remove LparData.h missed an include in lparcfg.c This fixes a few compile warnings. 
Signed-off-by: Michael Ellerman

Index: work/arch/ppc64/kernel/lparcfg.c
===================================================================
--- work.orig/arch/ppc64/kernel/lparcfg.c
+++ work/arch/ppc64/kernel/lparcfg.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #define MODULE_VERS "1.6"
 #define MODULE_NAME "lparcfg"

From benh at kernel.crashing.org Sat Jun 25 09:05:57 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sat, 25 Jun 2005 09:05:57 +1000
Subject: F80, 2.6.12 broken
In-Reply-To:
References:
Message-ID: <1119654358.16414.20.camel@gaston>

On Thu, 2005-06-23 at 23:56 -0500, Tom _ wrote:
> Greetings,
>
> It would appear the f80 is still busted on recent kernels. Not sure
> when it broke but it's certainly been in bad shape for 2.6.10,11 and
> now 12 as well.
>
> Last message is returning from prom_init (helpful yeah...)
>
> buuut earlier on in the messages from the kernel is this little bit:
>
> instantiating rtas at 0x00000000deadbeef ... failed
>
> I fully expect that's the real problem.
>
> Any patches to try? Advise?

Enable PROM_DEBUG in prom_init.c and send me the log.

Ben.

From benh at kernel.crashing.org Mon Jun 27 11:36:29 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 27 Jun 2005 11:36:29 +1000
Subject: [PATCH] ppc/ppc64: Fix pci mmap via sysfs
Message-ID: <1119836190.5133.59.camel@gaston>

Hi !

This implements the change to the /proc and sysfs PCI mmap functions that
we discussed a while ago: it adds an arch-optional pci_resource_to_user()
that allows munging the value of PCI resources exposed to userland, thus
hiding kernel-internal values. It also uses that callback to sanitize the
exposed values on ppc and ppc64, thus fixing mmap of PCI devices via
/proc and sysfs.

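A minimal standalone sketch of the translation described above, for illustration only. It assumes simplified stand-in types and flag values (`sketch_hose`, `sketch_res`, `SKETCH_RES_IO` and `SKETCH_RES_MEM` are hypothetical names, not the kernel's definitions) and mirrors the arithmetic that the ppc64 pci_resource_to_user() in this patch applies to I/O resources. Unlike the patch, it passes the values through unchanged when no host bridge is given.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins for IORESOURCE_IO / IORESOURCE_MEM. */
#define SKETCH_RES_IO  0x100u
#define SKETCH_RES_MEM 0x200u

/* Hypothetical, stripped-down versions of struct resource and
 * struct pci_controller: only the fields the translation needs. */
struct sketch_res {
	uint64_t start;
	uint64_t end;
	unsigned int flags;
};

struct sketch_hose {
	uint64_t io_base_virt;	/* kernel virtual base of this bridge's I/O window */
	uint64_t io_base_phys;	/* physical base of the same window */
};

/* Convert a kernel-internal resource range into the value shown to
 * userland. Memory resources are passed through; I/O resources are
 * shifted by (pci_io_base - io_base_virt + io_base_phys), relying on
 * unsigned wrap-around for the intermediate subtraction. */
void sketch_resource_to_user(const struct sketch_hose *hose,
			     uint64_t pci_io_base,
			     const struct sketch_res *res,
			     uint64_t *start, uint64_t *end)
{
	uint64_t offset = 0;

	assert(res != NULL && start != NULL && end != NULL);

	/* Assumption of this sketch: no bridge means no translation. */
	if (hose != NULL && (res->flags & SKETCH_RES_IO))
		offset = pci_io_base - hose->io_base_virt + hose->io_base_phys;

	*start = res->start + offset;
	*end = res->end + offset;
}
```

The point of the design is that sysfs, /proc/bus/pci and the mmap path all go through the same helper, so userland sees one consistent value and feeds that same value back when it calls mmap.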
Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/ppc64/kernel/pci.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/pci.c 2005-06-25 09:22:56.000000000 +1000 +++ linux-work/arch/ppc64/kernel/pci.c 2005-06-27 11:34:23.000000000 +1000 @@ -351,9 +351,10 @@ *offset += hose->pci_mem_offset; res_bit = IORESOURCE_MEM; } else { - io_offset = (unsigned long)hose->io_base_virt; + io_offset = (unsigned long)hose->io_base_virt - pci_io_base; *offset += io_offset; res_bit = IORESOURCE_IO; + printk(" -> offset: %lx\n", *offset); } /* @@ -373,12 +374,15 @@ continue; /* In the range of this resource? */ + printk(" r%d: %lx -> %lx\n", i, rp->start, rp->end); if (*offset < (rp->start & PAGE_MASK) || *offset > rp->end) continue; /* found it! construct the final physical address */ - if (mmap_state == pci_mmap_io) - *offset += hose->io_base_phys - io_offset; + if (mmap_state == pci_mmap_io) { + *offset += hose->io_base_phys - io_offset; + printk(" result: %lx\n", *offset); + } return rp; } @@ -944,4 +948,22 @@ } EXPORT_SYMBOL(pci_read_irq_line); +void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + unsigned long offset = 0; + + if (hose == NULL) + return; + + if (rsrc->flags & IORESOURCE_IO) + offset = pci_io_base - (unsigned long)hose->io_base_virt + + hose->io_base_phys; + + *start = rsrc->start + offset; + *end = rsrc->end + offset; +} + #endif /* CONFIG_PPC_MULTIPLATFORM */ Index: linux-work/drivers/pci/pci-sysfs.c =================================================================== --- linux-work.orig/drivers/pci/pci-sysfs.c 2005-06-25 09:22:57.000000000 +1000 +++ linux-work/drivers/pci/pci-sysfs.c 2005-06-27 11:28:21.000000000 +1000 @@ -60,15 +60,18 @@ char * str = buf; int i; int max = 7; + u64 start, end; if (pci_dev->subordinate) max = DEVICE_COUNT_RESOURCE; for (i = 0; i < max; 
i++) { - str += sprintf(str,"0x%016lx 0x%016lx 0x%016lx\n", - pci_resource_start(pci_dev,i), - pci_resource_end(pci_dev,i), - pci_resource_flags(pci_dev,i)); + struct resource *res = &pci_dev->resource[i]; + pci_resource_to_user(pci_dev, i, res, &start, &end); + str += sprintf(str,"0x%016llx 0x%016llx 0x%016llx\n", + (unsigned long long)start, + (unsigned long long)end, + (unsigned long long)res->flags); } return (str - buf); } @@ -313,8 +316,21 @@ struct device, kobj)); struct resource *res = (struct resource *)attr->private; enum pci_mmap_state mmap_type; + u64 start, end; + int i; - vma->vm_pgoff += res->start >> PAGE_SHIFT; + for (i = 0; i < PCI_ROM_RESOURCE; i++) + if (res == &pdev->resource[i]) + break; + if (i >= PCI_ROM_RESOURCE) + return -ENODEV; + + /* pci_mmap_page_range() expects the same kind of entry as coming + * from /proc/bus/pci/ which is a "user visible" value. If this is + * different from the resource itself, arch will do necessary fixup. + */ + pci_resource_to_user(pdev, i, res, &start, &end); + vma->vm_pgoff += start >> PAGE_SHIFT; mmap_type = res->flags & IORESOURCE_MEM ? 
pci_mmap_mem : pci_mmap_io; return pci_mmap_page_range(pdev, vma, mmap_type, 0); Index: linux-work/include/asm-ppc64/pci.h =================================================================== --- linux-work.orig/include/asm-ppc64/pci.h 2005-05-02 10:50:01.000000000 +1000 +++ linux-work/include/asm-ppc64/pci.h 2005-06-27 11:28:21.000000000 +1000 @@ -135,6 +135,11 @@ unsigned long offset, unsigned long size, pgprot_t prot); +#define HAVE_ARCH_PCI_RESOURCE_TO_USER +extern void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end); + #endif /* __KERNEL__ */ Index: linux-work/drivers/pci/proc.c =================================================================== --- linux-work.orig/drivers/pci/proc.c 2005-05-05 15:56:37.000000000 +1000 +++ linux-work/drivers/pci/proc.c 2005-06-27 11:28:21.000000000 +1000 @@ -355,14 +355,20 @@ dev->device, dev->irq); /* Here should be 7 and not PCI_NUM_RESOURCES as we need to preserve compatibility */ - for(i=0; i<7; i++) + for(i=0; i<7; i++) { + u64 start, end; + pci_resource_to_user(dev, i, &dev->resource[i], &start, &end); seq_printf(m, LONG_FORMAT, - dev->resource[i].start | + ((unsigned long)start) | (dev->resource[i].flags & PCI_REGION_FLAG_MASK)); - for(i=0; i<7; i++) + } + for(i=0; i<7; i++) { + u64 start, end; + pci_resource_to_user(dev, i, &dev->resource[i], &start, &end); seq_printf(m, LONG_FORMAT, dev->resource[i].start < dev->resource[i].end ? 
- dev->resource[i].end - dev->resource[i].start + 1 : 0); + (unsigned long)(end - start) + 1 : 0); + } seq_putc(m, '\t'); if (drv) seq_printf(m, "%s", drv->name); Index: linux-work/drivers/pci/pci.c =================================================================== --- linux-work.orig/drivers/pci/pci.c 2005-05-05 15:56:37.000000000 +1000 +++ linux-work/drivers/pci/pci.c 2005-06-27 11:28:21.000000000 +1000 @@ -759,7 +759,7 @@ return 0; } #endif - + static int __devinit pci_init(void) { struct pci_dev *dev = NULL; Index: linux-work/include/linux/pci.h =================================================================== --- linux-work.orig/include/linux/pci.h 2005-05-05 15:56:38.000000000 +1000 +++ linux-work/include/linux/pci.h 2005-06-27 11:28:21.000000000 +1000 @@ -1016,6 +1016,21 @@ #define pci_pretty_name(dev) "" #endif + +/* Some archs don't want to expose struct resource to userland as-is + * in sysfs and /proc + */ +#ifndef HAVE_ARCH_PCI_RESOURCE_TO_USER +static void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end) +{ + *start = rsrc->start; + *end = rsrc->end; +} +#endif /* HAVE_ARCH_PCI_RESOURCE_TO_USER */ + + /* * The world is not perfect and supplies us with broken PCI devices. 
* For at least a part of these bugs we need a work-around, so both Index: linux-work/include/asm-ppc/pci.h =================================================================== --- linux-work.orig/include/asm-ppc/pci.h 2005-05-02 10:49:57.000000000 +1000 +++ linux-work/include/asm-ppc/pci.h 2005-06-27 11:28:21.000000000 +1000 @@ -103,6 +103,12 @@ unsigned long size, pgprot_t prot); +#define HAVE_ARCH_PCI_RESOURCE_TO_USER +extern void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end); + + #endif /* __KERNEL__ */ #endif /* __PPC_PCI_H */ Index: linux-work/arch/ppc/kernel/pci.c =================================================================== --- linux-work.orig/arch/ppc/kernel/pci.c 2005-06-25 09:22:56.000000000 +1000 +++ linux-work/arch/ppc/kernel/pci.c 2005-06-27 11:28:21.000000000 +1000 @@ -1495,7 +1495,7 @@ *offset += hose->pci_mem_offset; res_bit = IORESOURCE_MEM; } else { - io_offset = (unsigned long)hose->io_base_virt; + io_offset = hose->io_base_virt - ___IO_BASE; *offset += io_offset; res_bit = IORESOURCE_IO; } @@ -1522,7 +1522,7 @@ /* found it! 
construct the final physical address */ if (mmap_state == pci_mmap_io) - *offset += hose->io_base_phys - _IO_BASE; + *offset += hose->io_base_phys - io_offset; return rp; } @@ -1739,6 +1739,23 @@ return result; } +void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end) +{ + struct pci_controller *hose = pci_bus_to_hose(dev->bus->number); + unsigned long offset = 0; + + if (hose == NULL) + return; + + if (rsrc->flags & IORESOURCE_IO) + offset = ___IO_BASE - hose->io_base_virt + hose->io_base_phys; + + *start = rsrc->start + offset; + *end = rsrc->end + offset; +} + void __init pci_init_resource(struct resource *res, unsigned long start, unsigned long end, int flags, char *name) From benh at kernel.crashing.org Mon Jun 27 11:38:09 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 27 Jun 2005 11:38:09 +1000 Subject: [PATCH] ppc64: Add new PHY to sungem Message-ID: <1119836289.5133.61.camel@gaston> Hi ! This patch adds support for some new PHY models to sungem as used on some recent Apple iMac G5 models. 
Signed-off-by: Benjamin Herrenschmidt Index: linux-work/drivers/net/sungem.c =================================================================== --- linux-work.orig/drivers/net/sungem.c 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem.c 2005-06-14 10:17:38.000000000 +1000 @@ -3078,7 +3078,9 @@ gp->phy_mii.dev = dev; gp->phy_mii.mdio_read = _phy_read; gp->phy_mii.mdio_write = _phy_write; - +#ifdef CONFIG_PPC_PMAC + gp->phy_mii.platform_data = gp->of_node; +#endif /* By default, we start with autoneg */ gp->want_autoneg = 1; Index: linux-work/drivers/net/sungem_phy.c =================================================================== --- linux-work.orig/drivers/net/sungem_phy.c 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem_phy.c 2005-06-16 07:38:37.000000000 +1000 @@ -32,6 +32,10 @@ #include #include +#ifdef CONFIG_PPC_PMAC +#include +#endif + #include "sungem_phy.h" /* Link modes of the BCM5400 PHY */ @@ -281,10 +285,12 @@ static int bcm5421_init(struct mii_phy* phy) { u16 data; - int rev; + unsigned int id; - rev = phy_read(phy, MII_PHYSID2) & 0x000f; - if (rev == 0) { + id = (phy_read(phy, MII_PHYSID1) << 16 | phy_read(phy, MII_PHYSID2)); + + /* Revision 0 of 5421 needs some fixups */ + if (id == 0x002060e0) { /* This is borrowed from MacOS */ phy_write(phy, 0x18, 0x1007); @@ -297,21 +303,28 @@ data = phy_read(phy, 0x15); phy_write(phy, 0x15, data | 0x0200); } -#if 0 - /* This has to be verified before I enable it */ - /* Enable automatic low-power */ - phy_write(phy, 0x1c, 0x9002); - phy_write(phy, 0x1c, 0xa821); - phy_write(phy, 0x1c, 0x941d); -#endif - return 0; -} -static int bcm5421k2_init(struct mii_phy* phy) -{ - /* Init code borrowed from OF */ - phy_write(phy, 4, 0x01e1); - phy_write(phy, 9, 0x0300); + /* Pick up some init code from OF for K2 version */ + if ((id & 0xfffffff0) == 0x002062e0) { + phy_write(phy, 4, 0x01e1); + phy_write(phy, 9, 0x0300); + } + + /* Check if we can enable automatic low power */ 
+#ifdef CONFIG_PPC_PMAC + if (phy->platform_data) { + struct device_node *np = of_get_parent(phy->platform_data); + int can_low_power = 1; + if (np == NULL || get_property(np, "no-autolowpower", NULL)) + can_low_power = 0; + if (can_low_power) { + /* Enable automatic low-power */ + phy_write(phy, 0x1c, 0x9002); + phy_write(phy, 0x1c, 0xa821); + phy_write(phy, 0x1c, 0x941d); + } + } +#endif /* CONFIG_PPC_PMAC */ return 0; } @@ -762,7 +775,7 @@ /* Broadcom BCM 5421 built-in K2 */ static struct mii_phy_ops bcm5421k2_phy_ops = { - .init = bcm5421k2_init, + .init = bcm5421_init, .suspend = bcm5411_suspend, .setup_aneg = bcm54xx_setup_aneg, .setup_forced = bcm54xx_setup_forced, @@ -779,6 +792,25 @@ .ops = &bcm5421k2_phy_ops }; +/* Broadcom BCM 5462 built-in Vesta */ +static struct mii_phy_ops bcm5462V_phy_ops = { + .init = bcm5421_init, + .suspend = bcm5411_suspend, + .setup_aneg = bcm54xx_setup_aneg, + .setup_forced = bcm54xx_setup_forced, + .poll_link = genmii_poll_link, + .read_link = bcm54xx_read_link, +}; + +static struct mii_phy_def bcm5462V_phy_def = { + .phy_id = 0x002060d0, + .phy_id_mask = 0xfffffff0, + .name = "BCM5462-Vesta", + .features = MII_GBIT_FEATURES, + .magic_aneg = 1, + .ops = &bcm5462V_phy_ops +}; + /* Marvell 88E1101 (Apple seem to deal with 2 different revs, * I masked out the 8 last bits to get both, but some specs * would be useful here) --BenH. 
@@ -824,6 +856,7 @@ &bcm5411_phy_def, &bcm5421_phy_def, &bcm5421k2_phy_def, + &bcm5462V_phy_def, &marvell_phy_def, &genmii_phy_def, NULL Index: linux-work/drivers/net/sungem_phy.h =================================================================== --- linux-work.orig/drivers/net/sungem_phy.h 2005-05-02 10:48:28.000000000 +1000 +++ linux-work/drivers/net/sungem_phy.h 2005-06-14 10:16:14.000000000 +1000 @@ -43,9 +43,10 @@ int pause; /* Provided by host chip */ - struct net_device* dev; + struct net_device *dev; int (*mdio_read) (struct net_device *dev, int mii_id, int reg); void (*mdio_write) (struct net_device *dev, int mii_id, int reg, int val); + void *platform_data; }; /* Pass in a struct mii_phy with dev, mdio_read and mdio_write From benh at kernel.crashing.org Mon Jun 27 12:11:03 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 27 Jun 2005 12:11:03 +1000 Subject: [PATCH] ppc/ppc64: Fix pci mmap via sysfs In-Reply-To: <20050626185727.0ce92772.akpm@osdl.org> References: <1119836190.5133.59.camel@gaston> <20050626185727.0ce92772.akpm@osdl.org> Message-ID: <1119838264.5133.76.camel@gaston> On Sun, 2005-06-26 at 18:57 -0700, Andrew Morton wrote: > Benjamin Herrenschmidt wrote: > > > > Hi ! > > > > This implement the change to /proc and sysfs PCI mmap functions that we > > discussed a while ago, that is adding an arch optional > > pci_resource_to_user() to allow munging on the exposed value of PCI > > resources to userland and thus hiding kernel internal values. It also > > implements using of that callback to sanitize exposed values on ppc an > > ppc64, thus fixing mmap of PCI devices via /proc and sysfs. > > > > You sure you want all those printks in there? One quilt ref later ... :) Hi ! 
This implement the change to /proc and sysfs PCI mmap functions that we discussed a while ago, that is adding an arch optional pci_resource_to_user() to allow munging on the exposed value of PCI resources to userland and thus hiding kernel internal values. It also implements using of that callback to sanitize exposed values on ppc an ppc64, thus fixing mmap of PCI devices via /proc and sysfs. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/ppc64/kernel/pci.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/pci.c 2005-06-25 09:22:56.000000000 +1000 +++ linux-work/arch/ppc64/kernel/pci.c 2005-06-27 12:09:58.000000000 +1000 @@ -351,7 +351,7 @@ *offset += hose->pci_mem_offset; res_bit = IORESOURCE_MEM; } else { - io_offset = (unsigned long)hose->io_base_virt; + io_offset = (unsigned long)hose->io_base_virt - pci_io_base; *offset += io_offset; res_bit = IORESOURCE_IO; } @@ -378,7 +378,7 @@ /* found it! construct the final physical address */ if (mmap_state == pci_mmap_io) - *offset += hose->io_base_phys - io_offset; + *offset += hose->io_base_phys - io_offset; return rp; } @@ -416,9 +416,6 @@ else prot |= _PAGE_GUARDED; - printk("PCI map for %s:%lx, prot: %lx\n", pci_name(dev), rp->start, - prot); - return __pgprot(prot); } @@ -944,4 +941,22 @@ } EXPORT_SYMBOL(pci_read_irq_line); +void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end) +{ + struct pci_controller *hose = pci_bus_to_host(dev->bus); + unsigned long offset = 0; + + if (hose == NULL) + return; + + if (rsrc->flags & IORESOURCE_IO) + offset = pci_io_base - (unsigned long)hose->io_base_virt + + hose->io_base_phys; + + *start = rsrc->start + offset; + *end = rsrc->end + offset; +} + #endif /* CONFIG_PPC_MULTIPLATFORM */ Index: linux-work/drivers/pci/pci-sysfs.c =================================================================== --- linux-work.orig/drivers/pci/pci-sysfs.c 2005-06-25 
09:22:57.000000000 +1000 +++ linux-work/drivers/pci/pci-sysfs.c 2005-06-27 11:28:21.000000000 +1000 @@ -60,15 +60,18 @@ char * str = buf; int i; int max = 7; + u64 start, end; if (pci_dev->subordinate) max = DEVICE_COUNT_RESOURCE; for (i = 0; i < max; i++) { - str += sprintf(str,"0x%016lx 0x%016lx 0x%016lx\n", - pci_resource_start(pci_dev,i), - pci_resource_end(pci_dev,i), - pci_resource_flags(pci_dev,i)); + struct resource *res = &pci_dev->resource[i]; + pci_resource_to_user(pci_dev, i, res, &start, &end); + str += sprintf(str,"0x%016llx 0x%016llx 0x%016llx\n", + (unsigned long long)start, + (unsigned long long)end, + (unsigned long long)res->flags); } return (str - buf); } @@ -313,8 +316,21 @@ struct device, kobj)); struct resource *res = (struct resource *)attr->private; enum pci_mmap_state mmap_type; + u64 start, end; + int i; - vma->vm_pgoff += res->start >> PAGE_SHIFT; + for (i = 0; i < PCI_ROM_RESOURCE; i++) + if (res == &pdev->resource[i]) + break; + if (i >= PCI_ROM_RESOURCE) + return -ENODEV; + + /* pci_mmap_page_range() expects the same kind of entry as coming + * from /proc/bus/pci/ which is a "user visible" value. If this is + * different from the resource itself, arch will do necessary fixup. + */ + pci_resource_to_user(pdev, i, res, &start, &end); + vma->vm_pgoff += start >> PAGE_SHIFT; mmap_type = res->flags & IORESOURCE_MEM ? 
pci_mmap_mem : pci_mmap_io; return pci_mmap_page_range(pdev, vma, mmap_type, 0); Index: linux-work/include/asm-ppc64/pci.h =================================================================== --- linux-work.orig/include/asm-ppc64/pci.h 2005-05-02 10:50:01.000000000 +1000 +++ linux-work/include/asm-ppc64/pci.h 2005-06-27 11:28:21.000000000 +1000 @@ -135,6 +135,11 @@ unsigned long offset, unsigned long size, pgprot_t prot); +#define HAVE_ARCH_PCI_RESOURCE_TO_USER +extern void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end); + #endif /* __KERNEL__ */ Index: linux-work/drivers/pci/proc.c =================================================================== --- linux-work.orig/drivers/pci/proc.c 2005-05-05 15:56:37.000000000 +1000 +++ linux-work/drivers/pci/proc.c 2005-06-27 11:28:21.000000000 +1000 @@ -355,14 +355,20 @@ dev->device, dev->irq); /* Here should be 7 and not PCI_NUM_RESOURCES as we need to preserve compatibility */ - for(i=0; i<7; i++) + for(i=0; i<7; i++) { + u64 start, end; + pci_resource_to_user(dev, i, &dev->resource[i], &start, &end); seq_printf(m, LONG_FORMAT, - dev->resource[i].start | + ((unsigned long)start) | (dev->resource[i].flags & PCI_REGION_FLAG_MASK)); - for(i=0; i<7; i++) + } + for(i=0; i<7; i++) { + u64 start, end; + pci_resource_to_user(dev, i, &dev->resource[i], &start, &end); seq_printf(m, LONG_FORMAT, dev->resource[i].start < dev->resource[i].end ? 
- dev->resource[i].end - dev->resource[i].start + 1 : 0); + (unsigned long)(end - start) + 1 : 0); + } seq_putc(m, '\t'); if (drv) seq_printf(m, "%s", drv->name); Index: linux-work/drivers/pci/pci.c =================================================================== --- linux-work.orig/drivers/pci/pci.c 2005-05-05 15:56:37.000000000 +1000 +++ linux-work/drivers/pci/pci.c 2005-06-27 11:28:21.000000000 +1000 @@ -759,7 +759,7 @@ return 0; } #endif - + static int __devinit pci_init(void) { struct pci_dev *dev = NULL; Index: linux-work/include/linux/pci.h =================================================================== --- linux-work.orig/include/linux/pci.h 2005-05-05 15:56:38.000000000 +1000 +++ linux-work/include/linux/pci.h 2005-06-27 11:28:21.000000000 +1000 @@ -1016,6 +1016,21 @@ #define pci_pretty_name(dev) "" #endif + +/* Some archs don't want to expose struct resource to userland as-is + * in sysfs and /proc + */ +#ifndef HAVE_ARCH_PCI_RESOURCE_TO_USER +static void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end) +{ + *start = rsrc->start; + *end = rsrc->end; +} +#endif /* HAVE_ARCH_PCI_RESOURCE_TO_USER */ + + /* * The world is not perfect and supplies us with broken PCI devices. 
* For at least a part of these bugs we need a work-around, so both Index: linux-work/include/asm-ppc/pci.h =================================================================== --- linux-work.orig/include/asm-ppc/pci.h 2005-05-02 10:49:57.000000000 +1000 +++ linux-work/include/asm-ppc/pci.h 2005-06-27 11:28:21.000000000 +1000 @@ -103,6 +103,12 @@ unsigned long size, pgprot_t prot); +#define HAVE_ARCH_PCI_RESOURCE_TO_USER +extern void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end); + + #endif /* __KERNEL__ */ #endif /* __PPC_PCI_H */ Index: linux-work/arch/ppc/kernel/pci.c =================================================================== --- linux-work.orig/arch/ppc/kernel/pci.c 2005-06-25 09:22:56.000000000 +1000 +++ linux-work/arch/ppc/kernel/pci.c 2005-06-27 12:09:30.000000000 +1000 @@ -1495,7 +1495,7 @@ *offset += hose->pci_mem_offset; res_bit = IORESOURCE_MEM; } else { - io_offset = (unsigned long)hose->io_base_virt; + io_offset = hose->io_base_virt - ___IO_BASE; *offset += io_offset; res_bit = IORESOURCE_IO; } @@ -1522,7 +1522,7 @@ /* found it! 
construct the final physical address */ if (mmap_state == pci_mmap_io) - *offset += hose->io_base_phys - _IO_BASE; + *offset += hose->io_base_phys - io_offset; return rp; } @@ -1560,9 +1560,6 @@ else prot |= _PAGE_GUARDED; - printk("PCI map for %s:%lx, prot: %lx\n", pci_name(dev), rp->start, - prot); - return __pgprot(prot); } @@ -1739,6 +1736,23 @@ return result; } +void pci_resource_to_user(const struct pci_dev *dev, int bar, + const struct resource *rsrc, + u64 *start, u64 *end) +{ + struct pci_controller *hose = pci_bus_to_hose(dev->bus->number); + unsigned long offset = 0; + + if (hose == NULL) + return; + + if (rsrc->flags & IORESOURCE_IO) + offset = ___IO_BASE - hose->io_base_virt + hose->io_base_phys; + + *start = rsrc->start + offset; + *end = rsrc->end + offset; +} + void __init pci_init_resource(struct resource *res, unsigned long start, unsigned long end, int flags, char *name) From benh at kernel.crashing.org Mon Jun 27 14:29:56 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 27 Jun 2005 14:29:56 +1000 Subject: [PATCH] ppc64: Add missing exports Message-ID: <1119846596.5133.99.camel@gaston> Hi ! This patch adds a couple of missing symbol exports. flush_dcache_page is used by the AGP driver and rtc_lock by the RTC driver. 
Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Nick Piggin Index: linux-work/arch/ppc64/kernel/time.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/time.c 2005-06-27 12:08:53.000000000 +1000 +++ linux-work/arch/ppc64/kernel/time.c 2005-06-27 14:31:13.000000000 +1000 @@ -91,7 +91,8 @@ unsigned tb_to_us; unsigned long processor_freq; DEFINE_SPINLOCK(rtc_lock); - +EXPORT_SYMBOL_GPL(rtc_lock); + unsigned long tb_to_ns_scale; unsigned long tb_to_ns_shift; Index: linux-work/arch/ppc64/kernel/ppc_ksyms.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/ppc_ksyms.c 2005-05-02 10:48:08.000000000 +1000 +++ linux-work/arch/ppc64/kernel/ppc_ksyms.c 2005-06-27 14:31:30.000000000 +1000 @@ -75,6 +75,7 @@ EXPORT_SYMBOL(giveup_altivec); #endif EXPORT_SYMBOL(__flush_icache_range); +EXPORT_SYMBOL(flush_dcache_range); #ifdef CONFIG_SMP #ifdef CONFIG_PPC_ISERIES From michael at ellerman.id.au Mon Jun 27 18:24:46 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Mon, 27 Jun 2005 18:24:46 +1000 Subject: Series of patches to spread lpevents by default and related cleanups Message-ID: <200506271824.46507.michael@ellerman.id.au> Hi, The following patches remove the lpq pointer from the paca, then enable spreading of lpevents by default. Then there's a number of cleanups on the code in arch/ppc64/kernel/ItLpQueue.c and surrounds. Most of this should only affect iSeries. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person 
From michael at ellerman.id.au Tue Jun 28 09:16:57 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:16:57 +1000 Subject: [PATCH 1/15] ppc64: Remove lpqueue pointer from the paca on iSeries In-Reply-To: <200506271824.46507.michael@ellerman.id.au> Message-ID: <1119914217.31245.735403323298.qpatch@concordia> Hi, The iSeries code keeps a pointer to the ItLpQueue in its paca struct. But all these pointers end up pointing to the one place, ie. xItLpQueue. So remove the pointer from the paca struct and just refer to xItLpQueue directly where needed. The only complication is that the spread_lpevents logic was implemented by having a NULL lpqueue pointer in the paca on CPUs that weren't supposed to process events. Instead we just compare the spread_lpevents value to the processor id to get the same behaviour. 
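The comparison described above can be modelled outside the kernel roughly as follows. This is only a userspace sketch: the function name cpu_handles_lpevents() and its explicit cpu argument are made up for illustration. The patch itself does the check inside ItLpQueue_isLpIntPending() using smp_processor_id().

```c
#include <assert.h>

/*
 * Userspace model of the gating this patch introduces. In the kernel the
 * CPU number comes from smp_processor_id(); here it is passed in so the
 * logic can be exercised on its own. cpu_handles_lpevents() is a
 * hypothetical name, not something the patch defines.
 */
static unsigned long spread_lpevents = 1;	/* default in this patch */

/* Returns 1 if the given CPU may look at the event queue, 0 otherwise. */
static int cpu_handles_lpevents(unsigned long cpu)
{
	return cpu < spread_lpevents;
}
```

With the default of 1 only CPU 0 passes the check, which matches the old behaviour where only paca[0] had a non-NULL lpqueue pointer unless spread_lpevents was given on the command line.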
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 16 +++++++++------- arch/ppc64/kernel/iSeries_setup.c | 6 ++---- arch/ppc64/kernel/idle.c | 4 ++-- arch/ppc64/kernel/irq.c | 6 ++---- arch/ppc64/kernel/mf.c | 5 ++--- arch/ppc64/kernel/pacaData.c | 1 - arch/ppc64/kernel/time.c | 5 ++--- include/asm-ppc64/paca.h | 1 - 8 files changed, 19 insertions(+), 25 deletions(-) Index: work/arch/ppc64/kernel/idle.c =================================================================== --- work.orig/arch/ppc64/kernel/idle.c +++ work/arch/ppc64/kernel/idle.c @@ -88,7 +88,7 @@ static int iSeries_idle(void) while (1) { if (lpaca->lppaca.shared_proc) { - if (ItLpQueue_isLpIntPending(lpaca->lpqueue_ptr)) + if (ItLpQueue_isLpIntPending(&xItLpQueue)) process_iSeries_events(); if (!need_resched()) yield_shared_processor(); @@ -100,7 +100,7 @@ static int iSeries_idle(void) while (!need_resched()) { HMT_medium(); - if (ItLpQueue_isLpIntPending(lpaca->lpqueue_ptr)) + if (ItLpQueue_isLpIntPending(&xItLpQueue)) process_iSeries_events(); HMT_low(); } Index: work/arch/ppc64/kernel/irq.c =================================================================== --- work.orig/arch/ppc64/kernel/irq.c +++ work/arch/ppc64/kernel/irq.c @@ -269,7 +269,6 @@ out: void do_IRQ(struct pt_regs *regs) { struct paca_struct *lpaca; - struct ItLpQueue *lpq; irq_enter(); @@ -295,9 +294,8 @@ void do_IRQ(struct pt_regs *regs) iSeries_smp_message_recv(regs); } #endif /* CONFIG_SMP */ - lpq = lpaca->lpqueue_ptr; - if (lpq && ItLpQueue_isLpIntPending(lpq)) - lpevent_count += ItLpQueue_process(lpq, regs); + if (ItLpQueue_isLpIntPending(&xItLpQueue)) + lpevent_count += ItLpQueue_process(&xItLpQueue, regs); irq_exit(); Index: work/arch/ppc64/kernel/time.c =================================================================== --- work.orig/arch/ppc64/kernel/time.c +++ work/arch/ppc64/kernel/time.c @@ -367,9 +367,8 @@ int timer_interrupt(struct pt_regs * reg #ifdef CONFIG_PPC_ISERIES { - struct ItLpQueue *lpq = 
lpaca->lpqueue_ptr; - if (lpq && ItLpQueue_isLpIntPending(lpq)) - lpevent_count += ItLpQueue_process(lpq, regs); + if (ItLpQueue_isLpIntPending(&xItLpQueue)) + lpevent_count += ItLpQueue_process(&xItLpQueue, regs); } #endif Index: work/arch/ppc64/kernel/mf.c =================================================================== --- work.orig/arch/ppc64/kernel/mf.c +++ work/arch/ppc64/kernel/mf.c @@ -802,9 +802,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { extern unsigned long lpevent_count; - struct ItLpQueue *lpq = get_paca()->lpqueue_ptr; - if (lpq && ItLpQueue_isLpIntPending(lpq)) - lpevent_count += ItLpQueue_process(lpq, NULL); + if (ItLpQueue_isLpIntPending(&xItLpQueue)) + lpevent_count += ItLpQueue_process(&xItLpQueue, NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -69,15 +69,17 @@ struct HvLpEvent * ItLpQueue_getNextLpEv return nextLpEvent; } +unsigned long spread_lpevents = 1; + int ItLpQueue_isLpIntPending( struct ItLpQueue * lpQueue ) { - int retval = 0; - struct HvLpEvent * nextLpEvent; - if ( lpQueue ) { - nextLpEvent = (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; - retval = nextLpEvent->xFlags.xValid | lpQueue->xPlicOverflowIntPending; - } - return retval; + struct HvLpEvent *next_event; + + if (smp_processor_id() >= spread_lpevents) + return 0; + + next_event = (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; + return next_event->xFlags.xValid | lpQueue->xPlicOverflowIntPending; } void ItLpQueue_clearValid( struct HvLpEvent * event ) Index: work/include/asm-ppc64/paca.h =================================================================== --- work.orig/include/asm-ppc64/paca.h +++ work/include/asm-ppc64/paca.h @@ -62,7 +62,6 @@ struct paca_struct { u16 
paca_index; /* Logical processor number */ u32 default_decr; /* Default decrementer value */ - struct ItLpQueue *lpqueue_ptr; /* LpQueue handled by this CPU */ u64 kernel_toc; /* Kernel TOC address */ u64 stab_real; /* Absolute address of segment table */ u64 stab_addr; /* Virtual address of segment table */ Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -855,17 +855,15 @@ late_initcall(iSeries_src_init); static int set_spread_lpevents(char *str) { - unsigned long i; unsigned long val = simple_strtoul(str, NULL, 0); + extern unsigned long spread_lpevents; /* * The parameter is the number of processors to share in processing * lp events. */ if (( val > 0) && (val <= NR_CPUS)) { - for (i = 1; i < val; ++i) - paca[i].lpqueue_ptr = paca[0].lpqueue_ptr; - + spread_lpevents = val; printk("lpevent processing spread over %ld processors\n", val); } else { printk("invalid spread_lpevents %ld\n", val); Index: work/arch/ppc64/kernel/pacaData.c =================================================================== --- work.orig/arch/ppc64/kernel/pacaData.c +++ work/arch/ppc64/kernel/pacaData.c @@ -45,7 +45,6 @@ extern unsigned long __toc_start; #ifdef CONFIG_PPC_ISERIES #define EXTRA_INITS(number, lpq) \ .lppaca_ptr = &paca[number].lppaca, \ - .lpqueue_ptr = (lpq), /* &xItLpQueue, */ \ .reg_save_ptr = &paca[number].reg_save, \ .reg_save = { \ .xDesc = 0xd397d9e2, /* "LpRS" */ \ From michael at ellerman.id.au Tue Jun 28 09:17:04 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:04 +1000 Subject: [PATCH 2/15] ppc64: Spread lpevents by default on iSeries In-Reply-To: <1119914217.31245.735403323298.qpatch@concordia> Message-ID: <1119914224.102132.945355478304.qpatch@concordia> Hi, With the previous patch in place, spreading lpevents by default becomes a one liner. 
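For reference, the range check that set_spread_lpevents() applies to the boot parameter can be sketched in userspace as below. The NR_CPUS value of 8, the name parse_spread_lpevents() and its return convention are assumptions made for this sketch only; strtoul() stands in for the kernel's simple_strtoul().

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 8	/* assumed value, for this sketch only */

/* With this patch every CPU is eligible unless the parameter says otherwise. */
static unsigned long spread_lpevents = NR_CPUS;

/*
 * Model of the validation in set_spread_lpevents(): accept 1..NR_CPUS,
 * otherwise leave the current setting alone. Returns 1 if the value was
 * accepted, 0 if it was rejected (hypothetical return convention).
 */
static int parse_spread_lpevents(const char *str)
{
	unsigned long val = strtoul(str, NULL, 0);

	if (val > 0 && val <= NR_CPUS) {
		spread_lpevents = val;
		return 1;
	}
	return 0;
}
```

An out-of-range value leaves the NR_CPUS default in place, so a bad parameter now means "spread over all CPUs" rather than "CPU 0 only".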
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -69,7 +69,7 @@ struct HvLpEvent * ItLpQueue_getNextLpEv return nextLpEvent; } -unsigned long spread_lpevents = 1; +unsigned long spread_lpevents = NR_CPUS; int ItLpQueue_isLpIntPending( struct ItLpQueue * lpQueue ) { From michael at ellerman.id.au Tue Jun 28 09:17:09 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:09 +1000 Subject: [PATCH 3/15] ppc64: Reorganise the paca initialisation macros In-Reply-To: <1119914224.102132.945355478304.qpatch@concordia> Message-ID: <1119914229.959253.194917692058.qpatch@concordia> Hi, This patch updates the macros that initialise the paca to remove the lpq parameter. It also rearranges them a bit with the hope of making them a bit clearer. Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/pacaData.c | 306 ++++++++++++++++++++++--------------------- 1 files changed, 160 insertions(+), 146 deletions(-) Index: work/arch/ppc64/kernel/pacaData.c =================================================================== --- work.orig/arch/ppc64/kernel/pacaData.c +++ work/arch/ppc64/kernel/pacaData.c @@ -42,20 +42,7 @@ extern unsigned long __toc_start; * processors. The processor VPD array needs one entry per physical * processor (not thread). 
*/ -#ifdef CONFIG_PPC_ISERIES -#define EXTRA_INITS(number, lpq) \ - .lppaca_ptr = &paca[number].lppaca, \ - .reg_save_ptr = &paca[number].reg_save, \ - .reg_save = { \ - .xDesc = 0xd397d9e2, /* "LpRS" */ \ - .xSize = sizeof(struct ItLpRegSave) \ - }, -#else -#define EXTRA_INITS(number, lpq) -#endif - -#define PACAINITDATA(number,start,lpq,asrr,asrv) \ -{ \ +#define PACA_INIT_COMMON(number, start, asrr, asrv) \ .lock_token = 0x8000, \ .paca_index = (number), /* Paca Index */ \ .default_decr = 0x00ff0000, /* Initial Decr */ \ @@ -73,147 +60,174 @@ extern unsigned long __toc_start; .end_of_quantum = 0xfffffffffffffffful, \ .slb_count = 64, \ }, \ - EXTRA_INITS((number), (lpq)) \ -} -struct paca_struct paca[] = { #ifdef CONFIG_PPC_ISERIES - PACAINITDATA( 0, 1, &xItLpQueue, 0, STAB0_VIRT_ADDR), +#define PACA_INIT_ISERIES(number) \ + .lppaca_ptr = &paca[number].lppaca, \ + .reg_save_ptr = &paca[number].reg_save, \ + .reg_save = { \ + .xDesc = 0xd397d9e2, /* "LpRS" */ \ + .xSize = sizeof(struct ItLpRegSave) \ + } + +#define PACAINITDATA(number) \ +{ \ + PACA_INIT_COMMON(number, 0, 0, 0) \ + PACA_INIT_ISERIES(number) \ +} + +#define BOOTCPU_PACAINITDATA(number) \ +{ \ + PACA_INIT_COMMON(number, 1, 0, STAB0_VIRT_ADDR) \ + PACA_INIT_ISERIES(number) \ +} + #else - PACAINITDATA( 0, 1, NULL, STAB0_PHYS_ADDR, STAB0_VIRT_ADDR), +#define PACAINITDATA(number) \ +{ \ + PACA_INIT_COMMON(number, 0, 0, 0) \ +} + +#define BOOTCPU_PACAINITDATA(number) \ +{ \ + PACA_INIT_COMMON(number, 1, STAB0_PHYS_ADDR, STAB0_VIRT_ADDR) \ +} #endif + +struct paca_struct paca[] = { + BOOTCPU_PACAINITDATA(0), #if NR_CPUS > 1 - PACAINITDATA( 1, 0, NULL, 0, 0), - PACAINITDATA( 2, 0, NULL, 0, 0), - PACAINITDATA( 3, 0, NULL, 0, 0), + PACAINITDATA(1), + PACAINITDATA(2), + PACAINITDATA(3), #if NR_CPUS > 4 - PACAINITDATA( 4, 0, NULL, 0, 0), - PACAINITDATA( 5, 0, NULL, 0, 0), - PACAINITDATA( 6, 0, NULL, 0, 0), - PACAINITDATA( 7, 0, NULL, 0, 0), + PACAINITDATA(4), + PACAINITDATA(5), + PACAINITDATA(6), + 
PACAINITDATA(7), #if NR_CPUS > 8 - PACAINITDATA( 8, 0, NULL, 0, 0), - PACAINITDATA( 9, 0, NULL, 0, 0), - PACAINITDATA(10, 0, NULL, 0, 0), - PACAINITDATA(11, 0, NULL, 0, 0), - PACAINITDATA(12, 0, NULL, 0, 0), - PACAINITDATA(13, 0, NULL, 0, 0), - PACAINITDATA(14, 0, NULL, 0, 0), - PACAINITDATA(15, 0, NULL, 0, 0), - PACAINITDATA(16, 0, NULL, 0, 0), - PACAINITDATA(17, 0, NULL, 0, 0), - PACAINITDATA(18, 0, NULL, 0, 0), - PACAINITDATA(19, 0, NULL, 0, 0), - PACAINITDATA(20, 0, NULL, 0, 0), - PACAINITDATA(21, 0, NULL, 0, 0), - PACAINITDATA(22, 0, NULL, 0, 0), - PACAINITDATA(23, 0, NULL, 0, 0), - PACAINITDATA(24, 0, NULL, 0, 0), - PACAINITDATA(25, 0, NULL, 0, 0), - PACAINITDATA(26, 0, NULL, 0, 0), - PACAINITDATA(27, 0, NULL, 0, 0), - PACAINITDATA(28, 0, NULL, 0, 0), - PACAINITDATA(29, 0, NULL, 0, 0), - PACAINITDATA(30, 0, NULL, 0, 0), - PACAINITDATA(31, 0, NULL, 0, 0), + PACAINITDATA(8), + PACAINITDATA(9), + PACAINITDATA(10), + PACAINITDATA(11), + PACAINITDATA(12), + PACAINITDATA(13), + PACAINITDATA(14), + PACAINITDATA(15), + PACAINITDATA(16), + PACAINITDATA(17), + PACAINITDATA(18), + PACAINITDATA(19), + PACAINITDATA(20), + PACAINITDATA(21), + PACAINITDATA(22), + PACAINITDATA(23), + PACAINITDATA(24), + PACAINITDATA(25), + PACAINITDATA(26), + PACAINITDATA(27), + PACAINITDATA(28), + PACAINITDATA(29), + PACAINITDATA(30), + PACAINITDATA(31), #if NR_CPUS > 32 - PACAINITDATA(32, 0, NULL, 0, 0), - PACAINITDATA(33, 0, NULL, 0, 0), - PACAINITDATA(34, 0, NULL, 0, 0), - PACAINITDATA(35, 0, NULL, 0, 0), - PACAINITDATA(36, 0, NULL, 0, 0), - PACAINITDATA(37, 0, NULL, 0, 0), - PACAINITDATA(38, 0, NULL, 0, 0), - PACAINITDATA(39, 0, NULL, 0, 0), - PACAINITDATA(40, 0, NULL, 0, 0), - PACAINITDATA(41, 0, NULL, 0, 0), - PACAINITDATA(42, 0, NULL, 0, 0), - PACAINITDATA(43, 0, NULL, 0, 0), - PACAINITDATA(44, 0, NULL, 0, 0), - PACAINITDATA(45, 0, NULL, 0, 0), - PACAINITDATA(46, 0, NULL, 0, 0), - PACAINITDATA(47, 0, NULL, 0, 0), - PACAINITDATA(48, 0, NULL, 0, 0), - PACAINITDATA(49, 0, NULL, 0, 0), - 
PACAINITDATA(50, 0, NULL, 0, 0), - PACAINITDATA(51, 0, NULL, 0, 0), - PACAINITDATA(52, 0, NULL, 0, 0), - PACAINITDATA(53, 0, NULL, 0, 0), - PACAINITDATA(54, 0, NULL, 0, 0), - PACAINITDATA(55, 0, NULL, 0, 0), - PACAINITDATA(56, 0, NULL, 0, 0), - PACAINITDATA(57, 0, NULL, 0, 0), - PACAINITDATA(58, 0, NULL, 0, 0), - PACAINITDATA(59, 0, NULL, 0, 0), - PACAINITDATA(60, 0, NULL, 0, 0), - PACAINITDATA(61, 0, NULL, 0, 0), - PACAINITDATA(62, 0, NULL, 0, 0), - PACAINITDATA(63, 0, NULL, 0, 0), + PACAINITDATA(32), + PACAINITDATA(33), + PACAINITDATA(34), + PACAINITDATA(35), + PACAINITDATA(36), + PACAINITDATA(37), + PACAINITDATA(38), + PACAINITDATA(39), + PACAINITDATA(40), + PACAINITDATA(41), + PACAINITDATA(42), + PACAINITDATA(43), + PACAINITDATA(44), + PACAINITDATA(45), + PACAINITDATA(46), + PACAINITDATA(47), + PACAINITDATA(48), + PACAINITDATA(49), + PACAINITDATA(50), + PACAINITDATA(51), + PACAINITDATA(52), + PACAINITDATA(53), + PACAINITDATA(54), + PACAINITDATA(55), + PACAINITDATA(56), + PACAINITDATA(57), + PACAINITDATA(58), + PACAINITDATA(59), + PACAINITDATA(60), + PACAINITDATA(61), + PACAINITDATA(62), + PACAINITDATA(63), #if NR_CPUS > 64 - PACAINITDATA(64, 0, NULL, 0, 0), - PACAINITDATA(65, 0, NULL, 0, 0), - PACAINITDATA(66, 0, NULL, 0, 0), - PACAINITDATA(67, 0, NULL, 0, 0), - PACAINITDATA(68, 0, NULL, 0, 0), - PACAINITDATA(69, 0, NULL, 0, 0), - PACAINITDATA(70, 0, NULL, 0, 0), - PACAINITDATA(71, 0, NULL, 0, 0), - PACAINITDATA(72, 0, NULL, 0, 0), - PACAINITDATA(73, 0, NULL, 0, 0), - PACAINITDATA(74, 0, NULL, 0, 0), - PACAINITDATA(75, 0, NULL, 0, 0), - PACAINITDATA(76, 0, NULL, 0, 0), - PACAINITDATA(77, 0, NULL, 0, 0), - PACAINITDATA(78, 0, NULL, 0, 0), - PACAINITDATA(79, 0, NULL, 0, 0), - PACAINITDATA(80, 0, NULL, 0, 0), - PACAINITDATA(81, 0, NULL, 0, 0), - PACAINITDATA(82, 0, NULL, 0, 0), - PACAINITDATA(83, 0, NULL, 0, 0), - PACAINITDATA(84, 0, NULL, 0, 0), - PACAINITDATA(85, 0, NULL, 0, 0), - PACAINITDATA(86, 0, NULL, 0, 0), - PACAINITDATA(87, 0, NULL, 0, 0), - 
PACAINITDATA(88, 0, NULL, 0, 0), - PACAINITDATA(89, 0, NULL, 0, 0), - PACAINITDATA(90, 0, NULL, 0, 0), - PACAINITDATA(91, 0, NULL, 0, 0), - PACAINITDATA(92, 0, NULL, 0, 0), - PACAINITDATA(93, 0, NULL, 0, 0), - PACAINITDATA(94, 0, NULL, 0, 0), - PACAINITDATA(95, 0, NULL, 0, 0), - PACAINITDATA(96, 0, NULL, 0, 0), - PACAINITDATA(97, 0, NULL, 0, 0), - PACAINITDATA(98, 0, NULL, 0, 0), - PACAINITDATA(99, 0, NULL, 0, 0), - PACAINITDATA(100, 0, NULL, 0, 0), - PACAINITDATA(101, 0, NULL, 0, 0), - PACAINITDATA(102, 0, NULL, 0, 0), - PACAINITDATA(103, 0, NULL, 0, 0), - PACAINITDATA(104, 0, NULL, 0, 0), - PACAINITDATA(105, 0, NULL, 0, 0), - PACAINITDATA(106, 0, NULL, 0, 0), - PACAINITDATA(107, 0, NULL, 0, 0), - PACAINITDATA(108, 0, NULL, 0, 0), - PACAINITDATA(109, 0, NULL, 0, 0), - PACAINITDATA(110, 0, NULL, 0, 0), - PACAINITDATA(111, 0, NULL, 0, 0), - PACAINITDATA(112, 0, NULL, 0, 0), - PACAINITDATA(113, 0, NULL, 0, 0), - PACAINITDATA(114, 0, NULL, 0, 0), - PACAINITDATA(115, 0, NULL, 0, 0), - PACAINITDATA(116, 0, NULL, 0, 0), - PACAINITDATA(117, 0, NULL, 0, 0), - PACAINITDATA(118, 0, NULL, 0, 0), - PACAINITDATA(119, 0, NULL, 0, 0), - PACAINITDATA(120, 0, NULL, 0, 0), - PACAINITDATA(121, 0, NULL, 0, 0), - PACAINITDATA(122, 0, NULL, 0, 0), - PACAINITDATA(123, 0, NULL, 0, 0), - PACAINITDATA(124, 0, NULL, 0, 0), - PACAINITDATA(125, 0, NULL, 0, 0), - PACAINITDATA(126, 0, NULL, 0, 0), - PACAINITDATA(127, 0, NULL, 0, 0), + PACAINITDATA(64), + PACAINITDATA(65), + PACAINITDATA(66), + PACAINITDATA(67), + PACAINITDATA(68), + PACAINITDATA(69), + PACAINITDATA(70), + PACAINITDATA(71), + PACAINITDATA(72), + PACAINITDATA(73), + PACAINITDATA(74), + PACAINITDATA(75), + PACAINITDATA(76), + PACAINITDATA(77), + PACAINITDATA(78), + PACAINITDATA(79), + PACAINITDATA(80), + PACAINITDATA(81), + PACAINITDATA(82), + PACAINITDATA(83), + PACAINITDATA(84), + PACAINITDATA(85), + PACAINITDATA(86), + PACAINITDATA(87), + PACAINITDATA(88), + PACAINITDATA(89), + PACAINITDATA(90), + PACAINITDATA(91), + 
PACAINITDATA(92), + PACAINITDATA(93), + PACAINITDATA(94), + PACAINITDATA(95), + PACAINITDATA(96), + PACAINITDATA(97), + PACAINITDATA(98), + PACAINITDATA(99), + PACAINITDATA(100), + PACAINITDATA(101), + PACAINITDATA(102), + PACAINITDATA(103), + PACAINITDATA(104), + PACAINITDATA(105), + PACAINITDATA(106), + PACAINITDATA(107), + PACAINITDATA(108), + PACAINITDATA(109), + PACAINITDATA(110), + PACAINITDATA(111), + PACAINITDATA(112), + PACAINITDATA(113), + PACAINITDATA(114), + PACAINITDATA(115), + PACAINITDATA(116), + PACAINITDATA(117), + PACAINITDATA(118), + PACAINITDATA(119), + PACAINITDATA(120), + PACAINITDATA(121), + PACAINITDATA(122), + PACAINITDATA(123), + PACAINITDATA(124), + PACAINITDATA(125), + PACAINITDATA(126), + PACAINITDATA(127), #endif #endif #endif From michael at ellerman.id.au Tue Jun 28 09:17:17 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:17 +1000 Subject: [PATCH 4/15] ppc64: Don't pass the pointers to xItLpQueue around In-Reply-To: <1119914229.959253.194917692058.qpatch@concordia> Message-ID: <1119914237.219661.147090075133.qpatch@concordia> Hi, Because there's only one ItLpQueue and we know where it is, ie. xItLpQueue, there's no point passing pointers to it around all over the place. 
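The shape of this change, reduced to a toy example, looks something like the following. The struct, its two fields and both function names are stand-ins invented for this sketch; the real struct ItLpQueue has more fields and the real functions keep their ItLpQueue_* names.

```c
#include <assert.h>

/* Hypothetical stand-in for struct ItLpQueue; only two fields are modelled. */
struct event_queue {
	int valid;		/* next event in the stack is valid */
	int overflow_pending;	/* overflow events are waiting */
};

static struct event_queue the_queue;	/* plays the role of xItLpQueue */

/* Before: every caller passes a pointer that is always &the_queue. */
static int is_pending_old(const struct event_queue *q)
{
	return q->valid | q->overflow_pending;
}

/* After: the function refers to the single instance directly. */
static int is_pending_new(void)
{
	return the_queue.valid | the_queue.overflow_pending;
}
```

Both versions return the same result for the one queue that exists; the second just drops an argument that could never be anything else.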
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 24 ++++++++++++------------ arch/ppc64/kernel/idle.c | 4 ++-- arch/ppc64/kernel/irq.c | 4 ++-- arch/ppc64/kernel/mf.c | 4 ++-- arch/ppc64/kernel/time.c | 4 ++-- include/asm-ppc64/iSeries/ItLpQueue.h | 4 ++-- 6 files changed, 22 insertions(+), 22 deletions(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -17,10 +17,10 @@ #include #include -static __inline__ int set_inUse( struct ItLpQueue * lpQueue ) +static __inline__ int set_inUse(void) { int t; - u32 * inUseP = &(lpQueue->xInUseWord); + u32 * inUseP = &xItLpQueue.xInUseWord; __asm__ __volatile__("\n\ 1: lwarx %0,0,%2 \n\ @@ -31,37 +31,37 @@ static __inline__ int set_inUse( struct stwcx. %0,0,%2 \n\ bne- 1b \n\ 2: eieio" - : "=&r" (t), "=m" (lpQueue->xInUseWord) - : "r" (inUseP), "m" (lpQueue->xInUseWord) + : "=&r" (t), "=m" (xItLpQueue.xInUseWord) + : "r" (inUseP), "m" (xItLpQueue.xInUseWord) : "cc"); return t; } -static __inline__ void clear_inUse( struct ItLpQueue * lpQueue ) +static __inline__ void clear_inUse(void) { - lpQueue->xInUseWord = 0; + xItLpQueue.xInUseWord = 0; } /* Array of LpEvent handler functions */ extern LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; unsigned long ItLpQueueInProcess = 0; -struct HvLpEvent * ItLpQueue_getNextLpEvent( struct ItLpQueue * lpQueue ) +struct HvLpEvent * ItLpQueue_getNextLpEvent(void) { struct HvLpEvent * nextLpEvent = - (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; + (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; if ( nextLpEvent->xFlags.xValid ) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ - lpQueue->xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + + xItLpQueue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + LpEventAlign ) / LpEventAlign ) * LpEventAlign; /* Wrap to 
beginning if no room at end */ - if (lpQueue->xSlicCurEventPtr > lpQueue->xSlicLastValidEventPtr) - lpQueue->xSlicCurEventPtr = lpQueue->xSlicEventStackPtr; + if (xItLpQueue.xSlicCurEventPtr > xItLpQueue.xSlicLastValidEventPtr) + xItLpQueue.xSlicCurEventPtr = xItLpQueue.xSlicEventStackPtr; } else nextLpEvent = NULL; @@ -71,15 +71,15 @@ struct HvLpEvent * ItLpQueue_getNextLpEv unsigned long spread_lpevents = NR_CPUS; -int ItLpQueue_isLpIntPending( struct ItLpQueue * lpQueue ) +int ItLpQueue_isLpIntPending(void) { struct HvLpEvent *next_event; if (smp_processor_id() >= spread_lpevents) return 0; - next_event = (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; - return next_event->xFlags.xValid | lpQueue->xPlicOverflowIntPending; + next_event = (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; + return next_event->xFlags.xValid | xItLpQueue.xPlicOverflowIntPending; } void ItLpQueue_clearValid( struct HvLpEvent * event ) @@ -104,13 +104,13 @@ void ItLpQueue_clearValid( struct HvLpEv event->xFlags.xValid = 0; } -unsigned ItLpQueue_process( struct ItLpQueue * lpQueue, struct pt_regs *regs ) +unsigned ItLpQueue_process(struct pt_regs *regs) { unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; /* If we have recursed, just return */ - if ( !set_inUse( lpQueue ) ) + if ( !set_inUse() ) return 0; if (ItLpQueueInProcess == 0) @@ -119,13 +119,13 @@ unsigned ItLpQueue_process( struct ItLpQ BUG(); for (;;) { - nextLpEvent = ItLpQueue_getNextLpEvent( lpQueue ); + nextLpEvent = ItLpQueue_getNextLpEvent(); if ( nextLpEvent ) { /* Count events to return to caller - * and count processed events in lpQueue + * and count processed events in xItLpQueue */ ++numIntsProcessed; - lpQueue->xLpIntCount++; + xItLpQueue.xLpIntCount++; /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -140,7 +140,7 @@ unsigned ItLpQueue_process( struct ItLpQ * here! 
*/ if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) - lpQueue->xLpIntCountByType[nextLpEvent->xType]++; + xItLpQueue.xLpIntCountByType[nextLpEvent->xType]++; if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && lpEventHandler[nextLpEvent->xType] ) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); @@ -148,19 +148,19 @@ unsigned ItLpQueue_process( struct ItLpQ printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); ItLpQueue_clearValid( nextLpEvent ); - } else if ( lpQueue->xPlicOverflowIntPending ) + } else if ( xItLpQueue.xPlicOverflowIntPending ) /* * No more valid events. If overflow events are * pending process them */ - HvCallEvent_getOverflowLpEvents( lpQueue->xIndex); + HvCallEvent_getOverflowLpEvents( xItLpQueue.xIndex); else break; } ItLpQueueInProcess = 0; mb(); - clear_inUse( lpQueue ); + clear_inUse(); get_paca()->lpevent_count += numIntsProcessed; Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -76,9 +76,9 @@ struct ItLpQueue { extern struct ItLpQueue xItLpQueue; -extern struct HvLpEvent *ItLpQueue_getNextLpEvent(struct ItLpQueue *); -extern int ItLpQueue_isLpIntPending(struct ItLpQueue *); -extern unsigned ItLpQueue_process(struct ItLpQueue *, struct pt_regs *); +extern struct HvLpEvent *ItLpQueue_getNextLpEvent(void); +extern int ItLpQueue_isLpIntPending(void); +extern unsigned ItLpQueue_process(struct pt_regs *); extern void ItLpQueue_clearValid(struct HvLpEvent *); #endif /* _ITLPQUEUE_H */ Index: work/arch/ppc64/kernel/idle.c =================================================================== --- work.orig/arch/ppc64/kernel/idle.c +++ work/arch/ppc64/kernel/idle.c @@ -88,7 +88,7 @@ static int iSeries_idle(void) while (1) { if (lpaca->lppaca.shared_proc) { - if (ItLpQueue_isLpIntPending(&xItLpQueue)) + if (ItLpQueue_isLpIntPending()) 
process_iSeries_events(); if (!need_resched()) yield_shared_processor(); @@ -100,7 +100,7 @@ static int iSeries_idle(void) while (!need_resched()) { HMT_medium(); - if (ItLpQueue_isLpIntPending(&xItLpQueue)) + if (ItLpQueue_isLpIntPending()) process_iSeries_events(); HMT_low(); } Index: work/arch/ppc64/kernel/irq.c =================================================================== --- work.orig/arch/ppc64/kernel/irq.c +++ work/arch/ppc64/kernel/irq.c @@ -294,8 +294,8 @@ void do_IRQ(struct pt_regs *regs) iSeries_smp_message_recv(regs); } #endif /* CONFIG_SMP */ - if (ItLpQueue_isLpIntPending(&xItLpQueue)) - lpevent_count += ItLpQueue_process(&xItLpQueue, regs); + if (ItLpQueue_isLpIntPending()) + lpevent_count += ItLpQueue_process(regs); irq_exit(); Index: work/arch/ppc64/kernel/time.c =================================================================== --- work.orig/arch/ppc64/kernel/time.c +++ work/arch/ppc64/kernel/time.c @@ -367,8 +367,8 @@ int timer_interrupt(struct pt_regs * reg #ifdef CONFIG_PPC_ISERIES { - if (ItLpQueue_isLpIntPending(&xItLpQueue)) - lpevent_count += ItLpQueue_process(&xItLpQueue, regs); + if (ItLpQueue_isLpIntPending()) + lpevent_count += ItLpQueue_process(regs); } #endif Index: work/arch/ppc64/kernel/mf.c =================================================================== --- work.orig/arch/ppc64/kernel/mf.c +++ work/arch/ppc64/kernel/mf.c @@ -802,8 +802,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { extern unsigned long lpevent_count; - if (ItLpQueue_isLpIntPending(&xItLpQueue)) - lpevent_count += ItLpQueue_process(&xItLpQueue, NULL); + if (ItLpQueue_isLpIntPending()) + lpevent_count += ItLpQueue_process(NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } From michael at ellerman.id.au Tue Jun 28 09:17:24 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:24 +1000 Subject: [PATCH 5/15] ppc64: Move 
initialisation of xItLpQueue into ItLpQueue.c In-Reply-To: <1119914237.219661.147090075133.qpatch@concordia> Message-ID: <1119914244.209803.173831178348.qpatch@concordia> Hi, The xItLpQueue is initialised manually in iSeries_setup_arch(). Move this code into ItLpQueue.c for a cleaner separation. Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 23 +++++++++++++++++++++++ arch/ppc64/kernel/iSeries_setup.c | 20 +------------------- include/asm-ppc64/iSeries/ItLpQueue.h | 1 + 3 files changed, 25 insertions(+), 19 deletions(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -166,3 +167,25 @@ unsigned ItLpQueue_process(struct pt_reg return numIntsProcessed; } + +void hvlpevent_queue_setup(void) +{ + void *eventStack; + + /* + * Allocate a page for the Event Stack. The Hypervisor needs the + * absolute real address, so we subtract out the KERNELBASE and add + * in the absolute real address of the kernel load area. 
+ */ + eventStack = alloc_bootmem_pages(LpEventStackSize); + memset(eventStack, 0, LpEventStackSize); + + /* Invoke the hypervisor to initialize the event stack */ + HvCallEvent_setLpEventStack(0, eventStack, LpEventStackSize); + + xItLpQueue.xSlicEventStackPtr = (char *)eventStack; + xItLpQueue.xSlicCurEventPtr = (char *)eventStack; + xItLpQueue.xSlicLastValidEventPtr = (char *)eventStack + + (LpEventStackSize - LpEventMaxSize); + xItLpQueue.xIndex = 0; +} Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -676,7 +676,6 @@ static void __init iSeries_bolt_kernel(u */ static void __init iSeries_setup_arch(void) { - void *eventStack; unsigned procIx = get_paca()->lppaca.dyn_hv_phys_proc_index; /* Add an eye catcher and the systemcfg layout version number */ @@ -685,24 +684,7 @@ static void __init iSeries_setup_arch(vo systemcfg->version.minor = SYSTEMCFG_MINOR; /* Setup the Lp Event Queue */ - - /* Allocate a page for the Event Stack - * The hypervisor wants the absolute real address, so - * we subtract out the KERNELBASE and add in the - * absolute real address of the kernel load area - */ - eventStack = alloc_bootmem_pages(LpEventStackSize); - memset(eventStack, 0, LpEventStackSize); - - /* Invoke the hypervisor to initialize the event stack */ - HvCallEvent_setLpEventStack(0, eventStack, LpEventStackSize); - - /* Initialize fields in our Lp Event Queue */ - xItLpQueue.xSlicEventStackPtr = (char *)eventStack; - xItLpQueue.xSlicCurEventPtr = (char *)eventStack; - xItLpQueue.xSlicLastValidEventPtr = (char *)eventStack + - (LpEventStackSize - LpEventMaxSize); - xItLpQueue.xIndex = 0; + hvlpevent_queue_setup(); /* Compute processor frequency */ procFreqHz = ((1UL << 34) * 1000000) / Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- 
work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -80,5 +80,6 @@ extern struct HvLpEvent *ItLpQueue_getNe extern int ItLpQueue_isLpIntPending(void); extern unsigned ItLpQueue_process(struct pt_regs *); extern void ItLpQueue_clearValid(struct HvLpEvent *); +extern void hvlpevent_queue_setup(void); #endif /* _ITLPQUEUE_H */ From michael at ellerman.id.au Tue Jun 28 09:17:30 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:30 +1000 Subject: [PATCH 6/15] ppc64: Move xItLpQueue proc code into ItLpQueue.c In-Reply-To: <1119914244.209803.173831178348.qpatch@concordia> Message-ID: <1119914250.463962.977117712453.qpatch@concordia> Hi, Move the code that displays xItLpQueue values in /proc into ItLpQueue.c Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 59 +++++++++++++++++++++++++++++++++++++++ arch/ppc64/kernel/iSeries_proc.c | 48 ------------------------------- 2 files changed, 59 insertions(+), 48 deletions(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -12,12 +12,26 @@ #include #include #include +#include +#include #include #include #include #include #include +static char *event_types[9] = { + "Hypervisor\t\t", + "Machine Facilities\t", + "Session Manager\t", + "SPD I/O\t\t", + "Virtual Bus\t\t", + "PCI I/O\t\t", + "RIO I/O\t\t", + "Virtual Lan\t\t", + "Virtual I/O\t\t" +}; + static __inline__ int set_inUse(void) { int t; @@ -189,3 +203,48 @@ void hvlpevent_queue_setup(void) (LpEventStackSize - LpEventMaxSize); xItLpQueue.xIndex = 0; } + +static int proc_lpevents_show(struct seq_file *m, void *v) +{ + unsigned int i; + + seq_printf(m, "LpEventQueue 0\n"); + seq_printf(m, " events processed:\t%lu\n", + (unsigned long)xItLpQueue.xLpIntCount); + + for (i = 0; i < 9; ++i) + seq_printf(m, " %s %10lu\n", event_types[i], + 
(unsigned long)xItLpQueue.xLpIntCountByType[i]); + + seq_printf(m, "\n events processed by processor:\n"); + + for_each_online_cpu(i) + seq_printf(m, " CPU%02d %10u\n", i, paca[i].lpevent_count); + + return 0; +} + +static int proc_lpevents_open(struct inode *inode, struct file *file) +{ + return single_open(file, proc_lpevents_show, NULL); +} + +static struct file_operations proc_lpevents_operations = { + .open = proc_lpevents_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static int __init proc_lpevents_init(void) +{ + struct proc_dir_entry *e; + + e = create_proc_entry("iSeries/lpevents", S_IFREG|S_IRUGO, NULL); + if (e) + e->proc_fops = &proc_lpevents_operations; + + return 0; +} +__initcall(proc_lpevents_init); + Index: work/arch/ppc64/kernel/iSeries_proc.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_proc.c +++ work/arch/ppc64/kernel/iSeries_proc.c @@ -40,50 +40,6 @@ static int __init iseries_proc_create(vo } core_initcall(iseries_proc_create); -static char *event_types[9] = { - "Hypervisor\t\t", - "Machine Facilities\t", - "Session Manager\t", - "SPD I/O\t\t", - "Virtual Bus\t\t", - "PCI I/O\t\t", - "RIO I/O\t\t", - "Virtual Lan\t\t", - "Virtual I/O\t\t" -}; - -static int proc_lpevents_show(struct seq_file *m, void *v) -{ - unsigned int i; - - seq_printf(m, "LpEventQueue 0\n"); - seq_printf(m, " events processed:\t%lu\n", - (unsigned long)xItLpQueue.xLpIntCount); - - for (i = 0; i < 9; ++i) - seq_printf(m, " %s %10lu\n", event_types[i], - (unsigned long)xItLpQueue.xLpIntCountByType[i]); - - seq_printf(m, "\n events processed by processor:\n"); - - for_each_online_cpu(i) - seq_printf(m, " CPU%02d %10u\n", i, paca[i].lpevent_count); - - return 0; -} - -static int proc_lpevents_open(struct inode *inode, struct file *file) -{ - return single_open(file, proc_lpevents_show, NULL); -} - -static struct file_operations proc_lpevents_operations = { - .open = 
proc_lpevents_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, -}; - static unsigned long startTitan = 0; static unsigned long startTb = 0; @@ -148,10 +104,6 @@ static int __init iseries_proc_init(void { struct proc_dir_entry *e; - e = create_proc_entry("iSeries/lpevents", S_IFREG|S_IRUGO, NULL); - if (e) - e->proc_fops = &proc_lpevents_operations; - e = create_proc_entry("iSeries/titanTod", S_IFREG|S_IRUGO, NULL); if (e) e->proc_fops = &proc_titantod_operations; From michael at ellerman.id.au Tue Jun 28 09:17:37 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:37 +1000 Subject: [PATCH 7/15] ppc64: Make two ItLpQueue related functions static In-Reply-To: <1119914250.463962.977117712453.qpatch@concordia> Message-ID: <1119914257.7888.449924609208.qpatch@concordia> Hi, External parties don't need to use ItLpQueue_getNextLpEvent() or ItLpQueue_clearValid(), they're internal to ItLpQueue.c Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 4 ++-- include/asm-ppc64/iSeries/ItLpQueue.h | 2 -- 2 files changed, 2 insertions(+), 4 deletions(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -62,7 +62,7 @@ static __inline__ void clear_inUse(void) extern LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; unsigned long ItLpQueueInProcess = 0; -struct HvLpEvent * ItLpQueue_getNextLpEvent(void) +static struct HvLpEvent * ItLpQueue_getNextLpEvent(void) { struct HvLpEvent * nextLpEvent = (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; @@ -97,7 +97,7 @@ int ItLpQueue_isLpIntPending(void) return next_event->xFlags.xValid | xItLpQueue.xPlicOverflowIntPending; } -void ItLpQueue_clearValid( struct HvLpEvent * event ) +static void ItLpQueue_clearValid( struct HvLpEvent * event ) { /* Clear the valid bit of the event * Also clear bits within this event that 
might Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -76,10 +76,8 @@ struct ItLpQueue { extern struct ItLpQueue xItLpQueue; -extern struct HvLpEvent *ItLpQueue_getNextLpEvent(void); extern int ItLpQueue_isLpIntPending(void); extern unsigned ItLpQueue_process(struct pt_regs *); -extern void ItLpQueue_clearValid(struct HvLpEvent *); extern void hvlpevent_queue_setup(void); #endif /* _ITLPQUEUE_H */ From michael at ellerman.id.au Tue Jun 28 09:17:43 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:43 +1000 Subject: [PATCH 8/15] ppc64: Move definition of xItLpQueue In-Reply-To: <1119914257.7888.449924609208.qpatch@concordia> Message-ID: <1119914263.38102.77846693540.qpatch@concordia> Hi, The xItLpQueue is defined in LparData.c; move it into ItLpQueue.c. LparData.c is the only other file that needs to know about xItLpQueue, so remove the extern declaration from ItLpQueue.h and put it in LparData.c directly. Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 8 ++++++++ arch/ppc64/kernel/LparData.c | 7 +------ include/asm-ppc64/iSeries/ItLpQueue.h | 1 - 3 files changed, 9 insertions(+), 7 deletions(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -20,6 +20,14 @@ #include #include +/* + * The LpQueue is used to pass event data from the hypervisor to + * the partition. This is where I/O interrupt events are communicated. + * + * It is written to by the hypervisor so cannot end up in the BSS. 
+ */ +struct ItLpQueue xItLpQueue __attribute__((__section__(".data"))); + static char *event_types[9] = { "Hypervisor\t\t", "Machine Facilities\t", Index: work/arch/ppc64/kernel/LparData.c =================================================================== --- work.orig/arch/ppc64/kernel/LparData.c +++ work/arch/ppc64/kernel/LparData.c @@ -28,13 +28,8 @@ #include #include -/* The LpQueue is used to pass event data from the hypervisor to - * the partition. This is where I/O interrupt events are communicated. - */ - -/* May be filled in by the hypervisor so cannot end up in the BSS */ -struct ItLpQueue xItLpQueue __attribute__((__section__(".data"))); +extern struct ItLpQueue xItLpQueue; /* The HvReleaseData is the root of the information shared between * the hypervisor and Linux. Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -74,7 +74,6 @@ struct ItLpQueue { u64 xLpIntCountByType[9]; // 0x38-0x7F Event counts by type }; -extern struct ItLpQueue xItLpQueue; extern int ItLpQueue_isLpIntPending(void); extern unsigned ItLpQueue_process(struct pt_regs *); From michael at ellerman.id.au Tue Jun 28 09:17:49 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:49 +1000 Subject: [PATCH 9/15] ppc64: Rename xItLpQueue to hvlpevent_queue In-Reply-To: <1119914263.38102.77846693540.qpatch@concordia> Message-ID: <1119914269.133079.482794003407.qpatch@concordia> Hi, The xItLpQueue is a queue of HvLpEvents that we're given by the Hypervisor. Rename xItLpQueue to hvlpevent_queue and make the type struct hvlpevent_queue. 
Signed-off-by: Michael Ellerman -- Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -26,7 +26,7 @@ * * It is written to by the hypervisor so cannot end up in the BSS. */ -struct ItLpQueue xItLpQueue __attribute__((__section__(".data"))); +struct hvlpevent_queue hvlpevent_queue __attribute__((__section__(".data"))); static char *event_types[9] = { "Hypervisor\t\t", @@ -43,7 +43,7 @@ static char *event_types[9] = { static __inline__ int set_inUse(void) { int t; - u32 * inUseP = &xItLpQueue.xInUseWord; + u32 * inUseP = &hvlpevent_queue.xInUseWord; __asm__ __volatile__("\n\ 1: lwarx %0,0,%2 \n\ @@ -54,8 +54,8 @@ static __inline__ int set_inUse(void) stwcx. %0,0,%2 \n\ bne- 1b \n\ 2: eieio" - : "=&r" (t), "=m" (xItLpQueue.xInUseWord) - : "r" (inUseP), "m" (xItLpQueue.xInUseWord) + : "=&r" (t), "=m" (hvlpevent_queue.xInUseWord) + : "r" (inUseP), "m" (hvlpevent_queue.xInUseWord) : "cc"); return t; @@ -63,7 +63,7 @@ static __inline__ int set_inUse(void) static __inline__ void clear_inUse(void) { - xItLpQueue.xInUseWord = 0; + hvlpevent_queue.xInUseWord = 0; } /* Array of LpEvent handler functions */ @@ -73,18 +73,18 @@ unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * ItLpQueue_getNextLpEvent(void) { struct HvLpEvent * nextLpEvent = - (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; + (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; if ( nextLpEvent->xFlags.xValid ) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ - xItLpQueue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + + hvlpevent_queue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + LpEventAlign ) / LpEventAlign ) * LpEventAlign; /* Wrap to beginning if no room at end */ - if (xItLpQueue.xSlicCurEventPtr > xItLpQueue.xSlicLastValidEventPtr) - xItLpQueue.xSlicCurEventPtr = 
xItLpQueue.xSlicEventStackPtr; + if (hvlpevent_queue.xSlicCurEventPtr > hvlpevent_queue.xSlicLastValidEventPtr) + hvlpevent_queue.xSlicCurEventPtr = hvlpevent_queue.xSlicEventStackPtr; } else nextLpEvent = NULL; @@ -101,8 +101,8 @@ int ItLpQueue_isLpIntPending(void) if (smp_processor_id() >= spread_lpevents) return 0; - next_event = (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; - return next_event->xFlags.xValid | xItLpQueue.xPlicOverflowIntPending; + next_event = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; + return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; } static void ItLpQueue_clearValid( struct HvLpEvent * event ) @@ -145,10 +145,10 @@ unsigned ItLpQueue_process(struct pt_reg nextLpEvent = ItLpQueue_getNextLpEvent(); if ( nextLpEvent ) { /* Count events to return to caller - * and count processed events in xItLpQueue + * and count processed events in hvlpevent_queue */ ++numIntsProcessed; - xItLpQueue.xLpIntCount++; + hvlpevent_queue.xLpIntCount++; /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -163,7 +163,7 @@ unsigned ItLpQueue_process(struct pt_reg * here! */ if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) - xItLpQueue.xLpIntCountByType[nextLpEvent->xType]++; + hvlpevent_queue.xLpIntCountByType[nextLpEvent->xType]++; if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && lpEventHandler[nextLpEvent->xType] ) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); @@ -171,12 +171,12 @@ unsigned ItLpQueue_process(struct pt_reg printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); ItLpQueue_clearValid( nextLpEvent ); - } else if ( xItLpQueue.xPlicOverflowIntPending ) + } else if ( hvlpevent_queue.xPlicOverflowIntPending ) /* * No more valid events. 
If overflow events are * pending process them */ - HvCallEvent_getOverflowLpEvents( xItLpQueue.xIndex); + HvCallEvent_getOverflowLpEvents( hvlpevent_queue.xIndex); else break; } @@ -205,11 +205,11 @@ void hvlpevent_queue_setup(void) /* Invoke the hypervisor to initialize the event stack */ HvCallEvent_setLpEventStack(0, eventStack, LpEventStackSize); - xItLpQueue.xSlicEventStackPtr = (char *)eventStack; - xItLpQueue.xSlicCurEventPtr = (char *)eventStack; - xItLpQueue.xSlicLastValidEventPtr = (char *)eventStack + + hvlpevent_queue.xSlicEventStackPtr = (char *)eventStack; + hvlpevent_queue.xSlicCurEventPtr = (char *)eventStack; + hvlpevent_queue.xSlicLastValidEventPtr = (char *)eventStack + (LpEventStackSize - LpEventMaxSize); - xItLpQueue.xIndex = 0; + hvlpevent_queue.xIndex = 0; } static int proc_lpevents_show(struct seq_file *m, void *v) @@ -218,11 +218,11 @@ static int proc_lpevents_show(struct seq seq_printf(m, "LpEventQueue 0\n"); seq_printf(m, " events processed:\t%lu\n", - (unsigned long)xItLpQueue.xLpIntCount); + (unsigned long)hvlpevent_queue.xLpIntCount); for (i = 0; i < 9; ++i) seq_printf(m, " %s %10lu\n", event_types[i], - (unsigned long)xItLpQueue.xLpIntCountByType[i]); + (unsigned long)hvlpevent_queue.xLpIntCountByType[i]); seq_printf(m, "\n events processed by processor:\n"); Index: work/arch/ppc64/kernel/LparData.c =================================================================== --- work.orig/arch/ppc64/kernel/LparData.c +++ work/arch/ppc64/kernel/LparData.c @@ -29,7 +29,7 @@ #include -extern struct ItLpQueue xItLpQueue; +extern struct hvlpevent_queue hvlpevent_queue; /* The HvReleaseData is the root of the information shared between * the hypervisor and Linux. 
@@ -195,7 +195,7 @@ struct ItVpdAreas itVpdAreas = { 0,0,0, /* 13 - 15 */ sizeof(struct IoHriProcessorVpd),/* 16 length of Proc Vpd */ 0,0,0,0,0,0, /* 17 - 22 */ - sizeof(struct ItLpQueue),/* 23 length of Lp Queue */ + sizeof(struct hvlpevent_queue), /* 23 length of Lp Queue */ 0,0 /* 24 - 25 */ }, .xSlicVpdAdrs = { /* VPD addresses */ @@ -213,7 +213,7 @@ struct ItVpdAreas itVpdAreas = { 0,0,0, /* 13 - 15 */ &xIoHriProcessorVpd, /* 16 Proc Vpd */ 0,0,0,0,0,0, /* 17 - 22 */ - &xItLpQueue, /* 23 Lp Queue */ + &hvlpevent_queue, /* 23 Lp Queue */ 0,0 } }; Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -41,7 +41,7 @@ struct HvLpEvent; #define LpEventMaxSize 256 #define LpEventAlign 64 -struct ItLpQueue { +struct hvlpevent_queue { /* * The xSlicCurEventPtr is the pointer to the next event stack entry * that will become valid. The OS must peek at this entry to determine From michael at ellerman.id.au Tue Jun 28 09:17:55 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:17:55 +1000 Subject: [PATCH 10/15] ppc64: Rename ItLpQueue_* functions to hvlpevent_queue_* In-Reply-To: <1119914269.133079.482794003407.qpatch@concordia> Message-ID: <1119914275.942167.276923052080.qpatch@concordia> Hi, Now that we've renamed the xItLpQueue structure, rename the functions that operate on it also. 
Signed-off-by: Michael Ellerman -- Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -70,7 +70,7 @@ static __inline__ void clear_inUse(void) extern LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; unsigned long ItLpQueueInProcess = 0; -static struct HvLpEvent * ItLpQueue_getNextLpEvent(void) +static struct HvLpEvent * hvlpevent_queue_next_event(void) { struct HvLpEvent * nextLpEvent = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; @@ -94,7 +94,7 @@ static struct HvLpEvent * ItLpQueue_getN unsigned long spread_lpevents = NR_CPUS; -int ItLpQueue_isLpIntPending(void) +int hvlpevent_queue_event_pending(void) { struct HvLpEvent *next_event; @@ -105,7 +105,7 @@ int ItLpQueue_isLpIntPending(void) return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; } -static void ItLpQueue_clearValid( struct HvLpEvent * event ) +static void hvlpevent_clear_valid( struct HvLpEvent * event ) { /* Clear the valid bit of the event * Also clear bits within this event that might @@ -127,7 +127,7 @@ static void ItLpQueue_clearValid( struct event->xFlags.xValid = 0; } -unsigned ItLpQueue_process(struct pt_regs *regs) +unsigned hvlpevent_queue_process(struct pt_regs *regs) { unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; @@ -142,7 +142,7 @@ unsigned ItLpQueue_process(struct pt_reg BUG(); for (;;) { - nextLpEvent = ItLpQueue_getNextLpEvent(); + nextLpEvent = hvlpevent_queue_next_event(); if ( nextLpEvent ) { /* Count events to return to caller * and count processed events in hvlpevent_queue @@ -170,7 +170,7 @@ unsigned ItLpQueue_process(struct pt_reg else printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); - ItLpQueue_clearValid( nextLpEvent ); + hvlpevent_clear_valid( nextLpEvent ); } else if ( hvlpevent_queue.xPlicOverflowIntPending ) /* * No more valid events. 
If overflow events are Index: work/arch/ppc64/kernel/idle.c =================================================================== --- work.orig/arch/ppc64/kernel/idle.c +++ work/arch/ppc64/kernel/idle.c @@ -88,7 +88,7 @@ static int iSeries_idle(void) while (1) { if (lpaca->lppaca.shared_proc) { - if (ItLpQueue_isLpIntPending()) + if (hvlpevent_queue_event_pending()) process_iSeries_events(); if (!need_resched()) yield_shared_processor(); @@ -100,7 +100,7 @@ static int iSeries_idle(void) while (!need_resched()) { HMT_medium(); - if (ItLpQueue_isLpIntPending()) + if (hvlpevent_queue_event_pending()) process_iSeries_events(); HMT_low(); } Index: work/arch/ppc64/kernel/irq.c =================================================================== --- work.orig/arch/ppc64/kernel/irq.c +++ work/arch/ppc64/kernel/irq.c @@ -294,8 +294,8 @@ void do_IRQ(struct pt_regs *regs) iSeries_smp_message_recv(regs); } #endif /* CONFIG_SMP */ - if (ItLpQueue_isLpIntPending()) - lpevent_count += ItLpQueue_process(regs); + if (hvlpevent_queue_event_pending()) + lpevent_count += hvlpevent_queue_process(regs); irq_exit(); Index: work/arch/ppc64/kernel/mf.c =================================================================== --- work.orig/arch/ppc64/kernel/mf.c +++ work/arch/ppc64/kernel/mf.c @@ -802,8 +802,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { extern unsigned long lpevent_count; - if (ItLpQueue_isLpIntPending()) - lpevent_count += ItLpQueue_process(NULL); + if (hvlpevent_queue_event_pending()) + lpevent_count += hvlpevent_queue_process(NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } Index: work/arch/ppc64/kernel/time.c =================================================================== --- work.orig/arch/ppc64/kernel/time.c +++ work/arch/ppc64/kernel/time.c @@ -367,8 +367,8 @@ int timer_interrupt(struct pt_regs * reg #ifdef CONFIG_PPC_ISERIES { - if (ItLpQueue_isLpIntPending()) - 
lpevent_count += ItLpQueue_process(regs); + if (hvlpevent_queue_event_pending()) + lpevent_count += hvlpevent_queue_process(regs); } #endif Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -75,8 +75,8 @@ struct hvlpevent_queue { }; -extern int ItLpQueue_isLpIntPending(void); -extern unsigned ItLpQueue_process(struct pt_regs *); +extern int hvlpevent_queue_event_pending(void); +extern unsigned hvlpevent_queue_process(struct pt_regs *); extern void hvlpevent_queue_setup(void); #endif /* _ITLPQUEUE_H */ From michael at ellerman.id.au Tue Jun 28 09:18:03 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:18:03 +1000 Subject: [PATCH 11/15] ppc64: Don't count number of events processed for caller In-Reply-To: <1119914275.942167.276923052080.qpatch@concordia> Message-ID: <1119914283.627555.717678214194.qpatch@concordia> Hi, Currently we count the number of lpevents processed in 3 separate places. One of these counters is never read, so just remove it. This means hvlpevent_queue_process() no longer needs to return the number of events processed. 
Signed-off-by: Michael Ellerman -- Index: work/arch/ppc64/kernel/irq.c =================================================================== --- work.orig/arch/ppc64/kernel/irq.c +++ work/arch/ppc64/kernel/irq.c @@ -66,7 +66,6 @@ EXPORT_SYMBOL(irq_desc); int distribute_irqs = 1; int __irq_offset_value; int ppc_spurious_interrupts; -unsigned long lpevent_count; u64 ppc64_interrupt_controller; int show_interrupts(struct seq_file *p, void *v) @@ -295,7 +294,7 @@ void do_IRQ(struct pt_regs *regs) } #endif /* CONFIG_SMP */ if (hvlpevent_queue_event_pending()) - lpevent_count += hvlpevent_queue_process(regs); + hvlpevent_queue_process(regs); irq_exit(); Index: work/arch/ppc64/kernel/time.c =================================================================== --- work.orig/arch/ppc64/kernel/time.c +++ work/arch/ppc64/kernel/time.c @@ -98,7 +98,6 @@ unsigned long tb_to_ns_shift; struct gettimeofday_struct do_gtod; extern unsigned long wall_jiffies; -extern unsigned long lpevent_count; extern int smp_tb_synchronized; extern struct timezone sys_tz; @@ -368,7 +367,7 @@ int timer_interrupt(struct pt_regs * reg #ifdef CONFIG_PPC_ISERIES { if (hvlpevent_queue_event_pending()) - lpevent_count += hvlpevent_queue_process(regs); + hvlpevent_queue_process(regs); } #endif Index: work/arch/ppc64/kernel/mf.c =================================================================== --- work.orig/arch/ppc64/kernel/mf.c +++ work/arch/ppc64/kernel/mf.c @@ -801,9 +801,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) return rc; /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { - extern unsigned long lpevent_count; if (hvlpevent_queue_event_pending()) - lpevent_count += hvlpevent_queue_process(NULL); + hvlpevent_queue_process(NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ 
work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -76,7 +76,7 @@ struct hvlpevent_queue { extern int hvlpevent_queue_event_pending(void); -extern unsigned hvlpevent_queue_process(struct pt_regs *); +extern void hvlpevent_queue_process(struct pt_regs *); extern void hvlpevent_queue_setup(void); #endif /* _ITLPQUEUE_H */ Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -127,14 +127,14 @@ static void hvlpevent_clear_valid( struc event->xFlags.xValid = 0; } -unsigned hvlpevent_queue_process(struct pt_regs *regs) +void hvlpevent_queue_process(struct pt_regs *regs) { unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; /* If we have recursed, just return */ if ( !set_inUse() ) - return 0; + return; if (ItLpQueueInProcess == 0) ItLpQueueInProcess = 1; @@ -144,9 +144,6 @@ unsigned hvlpevent_queue_process(struct for (;;) { nextLpEvent = hvlpevent_queue_next_event(); if ( nextLpEvent ) { - /* Count events to return to caller - * and count processed events in hvlpevent_queue - */ ++numIntsProcessed; hvlpevent_queue.xLpIntCount++; /* Call appropriate handler here, passing @@ -186,8 +183,6 @@ unsigned hvlpevent_queue_process(struct clear_inUse(); get_paca()->lpevent_count += numIntsProcessed; - - return numIntsProcessed; } void hvlpevent_queue_setup(void) From michael at ellerman.id.au Tue Jun 28 09:18:12 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:18:12 +1000 Subject: [PATCH 12/15] ppc64: Simplify counting of lpevents, remove lpevent_count from paca In-Reply-To: <1119914283.627555.717678214194.qpatch@concordia> Message-ID: <1119914292.574437.212533125149.qpatch@concordia> Hi, Currently there's a per-cpu count of lpevents processed, a per-queue (ie. global) total count, and a count by event type. Replace all that with a count by event for each cpu. 
We only need to add it up in the proc code. Signed-off-by: Michael Ellerman -- Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -28,7 +28,9 @@ */ struct hvlpevent_queue hvlpevent_queue __attribute__((__section__(".data"))); -static char *event_types[9] = { +DEFINE_PER_CPU(unsigned long[HvLpEvent_Type_NumTypes], hvlpevent_counts); + +static char *event_types[HvLpEvent_Type_NumTypes] = { "Hypervisor\t\t", "Machine Facilities\t", "Session Manager\t", @@ -129,7 +131,6 @@ static void hvlpevent_clear_valid( struc void hvlpevent_queue_process(struct pt_regs *regs) { - unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; /* If we have recursed, just return */ @@ -144,8 +145,6 @@ void hvlpevent_queue_process(struct pt_r for (;;) { nextLpEvent = hvlpevent_queue_next_event(); if ( nextLpEvent ) { - ++numIntsProcessed; - hvlpevent_queue.xLpIntCount++; /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -160,7 +159,7 @@ void hvlpevent_queue_process(struct pt_r * here! */ if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) - hvlpevent_queue.xLpIntCountByType[nextLpEvent->xType]++; + __get_cpu_var(hvlpevent_counts)[nextLpEvent->xType]++; if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && lpEventHandler[nextLpEvent->xType] ) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); @@ -181,8 +180,6 @@ void hvlpevent_queue_process(struct pt_r ItLpQueueInProcess = 0; mb(); clear_inUse(); - - get_paca()->lpevent_count += numIntsProcessed; } void hvlpevent_queue_setup(void) @@ -209,20 +206,37 @@ void hvlpevent_queue_setup(void) static int proc_lpevents_show(struct seq_file *m, void *v) { - unsigned int i; + int cpu, i; + unsigned long sum; + static unsigned long cpu_totals[NR_CPUS]; + + /* FIXME: do we care that there's no locking here? 
*/ + sum = 0; + for_each_online_cpu(cpu) { + cpu_totals[cpu] = 0; + for (i = 0; i < HvLpEvent_Type_NumTypes; i++) { + cpu_totals[cpu] += per_cpu(hvlpevent_counts, cpu)[i]; + } + sum += cpu_totals[cpu]; + } seq_printf(m, "LpEventQueue 0\n"); - seq_printf(m, " events processed:\t%lu\n", - (unsigned long)hvlpevent_queue.xLpIntCount); + seq_printf(m, " events processed:\t%lu\n", sum); - for (i = 0; i < 9; ++i) - seq_printf(m, " %s %10lu\n", event_types[i], - (unsigned long)hvlpevent_queue.xLpIntCountByType[i]); + for (i = 0; i < HvLpEvent_Type_NumTypes; ++i) { + sum = 0; + for_each_online_cpu(cpu) { + sum += per_cpu(hvlpevent_counts, cpu)[i]; + } + + seq_printf(m, " %s %10lu\n", event_types[i], sum); + } seq_printf(m, "\n events processed by processor:\n"); - for_each_online_cpu(i) - seq_printf(m, " CPU%02d %10u\n", i, paca[i].lpevent_count); + for_each_online_cpu(cpu) { + seq_printf(m, " CPU%02d %10lu\n", cpu, cpu_totals[cpu]); + } return 0; } Index: work/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ work/include/asm-ppc64/iSeries/ItLpQueue.h @@ -70,8 +70,6 @@ struct hvlpevent_queue { u8 xIndex; // 0x28 unique sequential index. 
u8 xSlicRsvd[3]; // 0x29-2b u32 xInUseWord; // 0x2C - u64 xLpIntCount; // 0x30 Total Lp Int msgs processed - u64 xLpIntCountByType[9]; // 0x38-0x7F Event counts by type }; Index: work/include/asm-ppc64/paca.h =================================================================== --- work.orig/include/asm-ppc64/paca.h +++ work/include/asm-ppc64/paca.h @@ -90,7 +90,6 @@ struct paca_struct { u64 next_jiffy_update_tb; /* TB value for next jiffy update */ u64 saved_r1; /* r1 save for RTAS calls */ u64 saved_msr; /* MSR saved here by enter_rtas */ - u32 lpevent_count; /* lpevents processed */ u8 proc_enabled; /* irq soft-enable flag */ /* not yet used */ From michael at ellerman.id.au Tue Jun 28 09:18:19 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:18:19 +1000 Subject: [PATCH 13/15] ppc64: Cleanup proc printing of event types In-Reply-To: <1119914292.574437.212533125149.qpatch@concordia> Message-ID: <1119914299.407466.18333142742.qpatch@concordia> Hi, The code that prints event counts by type uses a hand-coded number of tabs to get the alignment right. Instead, use a printf field width for alignment, which will allow us to use the event_type strings elsewhere in the future.
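A minimal sketch of the field-width idea in plain userspace C, for illustration only. The helper name format_event_line() is hypothetical and not part of the patch; the 20- and 10-column widths are taken from the seq_printf() format string in the diff, and snprintf() stands in for seq_printf() here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): format one "name count" row.
 * "%-20s" left-justifies the name in a 20-column field and "%10lu"
 * right-justifies the count in a 10-column field, so the name strings
 * themselves no longer need trailing tabs to line up.
 */
int format_event_line(char *buf, size_t len,
		      const char *name, unsigned long count)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%-20s %10lu\n", name, count);
}
```

With the padding done by the format string, a bare name such as "PCI I/O" can be reused in other messages without carrying alignment characters along with it.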
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -31,15 +31,15 @@ struct hvlpevent_queue hvlpevent_queue _ DEFINE_PER_CPU(unsigned long[HvLpEvent_Type_NumTypes], hvlpevent_counts); static char *event_types[HvLpEvent_Type_NumTypes] = { - "Hypervisor\t\t", - "Machine Facilities\t", - "Session Manager\t", - "SPD I/O\t\t", - "Virtual Bus\t\t", - "PCI I/O\t\t", - "RIO I/O\t\t", - "Virtual Lan\t\t", - "Virtual I/O\t\t" + "Hypervisor", + "Machine Facilities", + "Session Manager", + "SPD I/O", + "Virtual Bus", + "PCI I/O", + "RIO I/O", + "Virtual Lan", + "Virtual I/O" }; static __inline__ int set_inUse(void) @@ -229,7 +229,7 @@ static int proc_lpevents_show(struct seq sum += per_cpu(hvlpevent_counts, cpu)[i]; } - seq_printf(m, " %s %10lu\n", event_types[i], sum); + seq_printf(m, " %-20s %10lu\n", event_types[i], sum); } seq_printf(m, "\n events processed by processor:\n"); From michael at ellerman.id.au Tue Jun 28 09:18:25 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:18:25 +1000 Subject: [PATCH 14/15] ppc64: Cleanup whitespace in arch/ppc64/kernel/ItLpQueue.c In-Reply-To: <1119914299.407466.18333142742.qpatch@concordia> Message-ID: <1119914305.357416.651792101712.qpatch@concordia> Hi, Just cleanup white space. 
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 62 +++++++++++++++++++++--------------------- 1 files changed, 31 insertions(+), 31 deletions(-) Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -1,7 +1,7 @@ /* * ItLpQueue.c * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or @@ -74,21 +74,21 @@ unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * hvlpevent_queue_next_event(void) { - struct HvLpEvent * nextLpEvent = + struct HvLpEvent * nextLpEvent = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; - if ( nextLpEvent->xFlags.xValid ) { + if (nextLpEvent->xFlags.xValid) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ hvlpevent_queue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + - LpEventAlign ) / - LpEventAlign ) * + LpEventAlign) / + LpEventAlign) * LpEventAlign; /* Wrap to beginning if no room at end */ if (hvlpevent_queue.xSlicCurEventPtr > hvlpevent_queue.xSlicLastValidEventPtr) hvlpevent_queue.xSlicCurEventPtr = hvlpevent_queue.xSlicEventStackPtr; } - else + else nextLpEvent = NULL; return nextLpEvent; @@ -107,23 +107,23 @@ int hvlpevent_queue_event_pending(void) return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; } -static void hvlpevent_clear_valid( struct HvLpEvent * event ) +static void hvlpevent_clear_valid(struct HvLpEvent * event) { /* Clear the valid bit of the event * Also clear bits within this event that might * look like valid bits (on 64-byte boundaries) - */ - unsigned extra = (( event->xSizeMinus1 + LpEventAlign ) / - LpEventAlign ) - 1; - switch ( extra ) { - 
case 3: + */ + unsigned extra = ((event->xSizeMinus1 + LpEventAlign) / + LpEventAlign) - 1; + switch (extra) { + case 3: ((struct HvLpEvent*)((char*)event+3*LpEventAlign))->xFlags.xValid=0; - case 2: + case 2: ((struct HvLpEvent*)((char*)event+2*LpEventAlign))->xFlags.xValid=0; - case 1: + case 1: ((struct HvLpEvent*)((char*)event+1*LpEventAlign))->xFlags.xValid=0; - case 0: - ; + case 0: + ; } mb(); event->xFlags.xValid = 0; @@ -136,7 +136,7 @@ void hvlpevent_queue_process(struct pt_r /* If we have recursed, just return */ if ( !set_inUse() ) return; - + if (ItLpQueueInProcess == 0) ItLpQueueInProcess = 1; else @@ -144,35 +144,35 @@ void hvlpevent_queue_process(struct pt_r for (;;) { nextLpEvent = hvlpevent_queue_next_event(); - if ( nextLpEvent ) { - /* Call appropriate handler here, passing + if (nextLpEvent) { + /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it * needs it in a bottom half. (perhaps for * an ACK) - * - * Handlers are responsible for ACK processing + * + * Handlers are responsible for ACK processing * * The Hypervisor guarantees that LpEvents will * only be delivered with types that we have * registered for, so no type check is necessary * here! - */ - if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) + */ + if (nextLpEvent->xType < HvLpEvent_Type_NumTypes) __get_cpu_var(hvlpevent_counts)[nextLpEvent->xType]++; - if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && - lpEventHandler[nextLpEvent->xType] ) + if (nextLpEvent->xType < HvLpEvent_Type_NumTypes && + lpEventHandler[nextLpEvent->xType]) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); else printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); - - hvlpevent_clear_valid( nextLpEvent ); - } else if ( hvlpevent_queue.xPlicOverflowIntPending ) + + hvlpevent_clear_valid(nextLpEvent); + } else if (hvlpevent_queue.xPlicOverflowIntPending) /* * No more valid events. 
If overflow events are * pending process them */ - HvCallEvent_getOverflowLpEvents( hvlpevent_queue.xIndex); + HvCallEvent_getOverflowLpEvents(hvlpevent_queue.xIndex); else break; } From michael at ellerman.id.au Tue Jun 28 09:18:31 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Jun 2005 09:18:31 +1000 Subject: [PATCH 15/15] ppc64: Formatting cleanups in arch/ppc64/kernel/ItLpQueue.c In-Reply-To: <1119914305.357416.651792101712.qpatch@concordia> Message-ID: <1119914311.757215.71391459005.qpatch@concordia> Hi, Just formatting cleanups: * rename some "nextLpEvent" variables to just "event" * make code fit in 80 columns * use brackets around if/else * use a temporary to make hvlpevent_clear_valid clearer Signed-off-by: Michael Ellerman -- Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -74,24 +74,27 @@ unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * hvlpevent_queue_next_event(void) { - struct HvLpEvent * nextLpEvent = - (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; - if (nextLpEvent->xFlags.xValid) { + struct HvLpEvent * event; + event = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; + + if (event->xFlags.xValid) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ - hvlpevent_queue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + - LpEventAlign) / - LpEventAlign) * - LpEventAlign; + hvlpevent_queue.xSlicCurEventPtr += ((event->xSizeMinus1 + + LpEventAlign) / LpEventAlign) * LpEventAlign; + /* Wrap to beginning if no room at end */ - if (hvlpevent_queue.xSlicCurEventPtr > hvlpevent_queue.xSlicLastValidEventPtr) - hvlpevent_queue.xSlicCurEventPtr = hvlpevent_queue.xSlicEventStackPtr; + if (hvlpevent_queue.xSlicCurEventPtr > + hvlpevent_queue.xSlicLastValidEventPtr) { + hvlpevent_queue.xSlicCurEventPtr = + 
hvlpevent_queue.xSlicEventStackPtr; + } + } else { + event = NULL; } - else - nextLpEvent = NULL; - return nextLpEvent; + return event; } unsigned long spread_lpevents = NR_CPUS; @@ -104,34 +107,41 @@ int hvlpevent_queue_event_pending(void) return 0; next_event = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; - return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; + + return next_event->xFlags.xValid | + hvlpevent_queue.xPlicOverflowIntPending; } static void hvlpevent_clear_valid(struct HvLpEvent * event) { - /* Clear the valid bit of the event - * Also clear bits within this event that might - * look like valid bits (on 64-byte boundaries) + /* Tell the Hypervisor that we're done with this event. + * Also clear bits within this event that might look like valid bits. + * ie. on 64-byte boundaries. */ + struct HvLpEvent *tmp; unsigned extra = ((event->xSizeMinus1 + LpEventAlign) / LpEventAlign) - 1; + switch (extra) { case 3: - ((struct HvLpEvent*)((char*)event+3*LpEventAlign))->xFlags.xValid=0; + tmp = (struct HvLpEvent*)((char*)event + 3 * LpEventAlign); + tmp->xFlags.xValid = 0; case 2: - ((struct HvLpEvent*)((char*)event+2*LpEventAlign))->xFlags.xValid=0; + tmp = (struct HvLpEvent*)((char*)event + 2 * LpEventAlign); + tmp->xFlags.xValid = 0; case 1: - ((struct HvLpEvent*)((char*)event+1*LpEventAlign))->xFlags.xValid=0; - case 0: - ; + tmp = (struct HvLpEvent*)((char*)event + 1 * LpEventAlign); + tmp->xFlags.xValid = 0; } + mb(); + event->xFlags.xValid = 0; } void hvlpevent_queue_process(struct pt_regs *regs) { - struct HvLpEvent * nextLpEvent; + struct HvLpEvent * event; /* If we have recursed, just return */ if ( !set_inUse() ) @@ -143,8 +153,8 @@ void hvlpevent_queue_process(struct pt_r BUG(); for (;;) { - nextLpEvent = hvlpevent_queue_next_event(); - if (nextLpEvent) { + event = hvlpevent_queue_next_event(); + if (event) { /* Call appropriate handler here, passing * a pointer to the LpEvent. 
The handler * must make a copy of the LpEvent if it @@ -158,15 +168,15 @@ void hvlpevent_queue_process(struct pt_r * registered for, so no type check is necessary * here! */ - if (nextLpEvent->xType < HvLpEvent_Type_NumTypes) - __get_cpu_var(hvlpevent_counts)[nextLpEvent->xType]++; - if (nextLpEvent->xType < HvLpEvent_Type_NumTypes && - lpEventHandler[nextLpEvent->xType]) - lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); + if (event->xType < HvLpEvent_Type_NumTypes) + __get_cpu_var(hvlpevent_counts)[event->xType]++; + if (event->xType < HvLpEvent_Type_NumTypes && + lpEventHandler[event->xType]) + lpEventHandler[event->xType](event, regs); else - printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); + printk(KERN_INFO "Unexpected Lp Event type=%d\n", event->xType ); - hvlpevent_clear_valid(nextLpEvent); + hvlpevent_clear_valid(event); } else if (hvlpevent_queue.xPlicOverflowIntPending) /* * No more valid events. If overflow events are From sfr at canb.auug.org.au Tue Jun 28 14:10:33 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 28 Jun 2005 14:10:33 +1000 Subject: [PATCH 1/15] ppc64: Remove lpqueue pointer from the paca on iSeries In-Reply-To: <1119914217.31245.735403323298.qpatch@concordia> References: <200506271824.46507.michael@ellerman.id.au> <1119914217.31245.735403323298.qpatch@concordia> Message-ID: <20050628141033.401d7a90.sfr@canb.auug.org.au> Hi Michael, Good work. Just one small comment. 
On Tue, 28 Jun 2005 09:16:57 +1000 Michael Ellerman wrote:
>
> --- work.orig/arch/ppc64/kernel/time.c
> +++ work/arch/ppc64/kernel/time.c
> @@ -367,9 +367,8 @@ int timer_interrupt(struct pt_regs * reg
>
>  #ifdef CONFIG_PPC_ISERIES
>      {
> -        struct ItLpQueue *lpq = lpaca->lpqueue_ptr;
> -        if (lpq && ItLpQueue_isLpIntPending(lpq))
> -            lpevent_count += ItLpQueue_process(lpq, regs);
> +        if (ItLpQueue_isLpIntPending(&xItLpQueue))
> +            lpevent_count += ItLpQueue_process(&xItLpQueue, regs);
>      }

You might as well remove the braces and outdent the code.

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 14:16:45 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 14:16:45 +1000
Subject: [PATCH 3/15] ppc64: Reorganise the paca initialisation macros
In-Reply-To: <1119914229.959253.194917692058.qpatch@concordia>
References: <1119914224.102132.945355478304.qpatch@concordia> <1119914229.959253.194917692058.qpatch@concordia>
Message-ID: <20050628141645.210e3c47.sfr@canb.auug.org.au>

Hi Michael,

On Tue, 28 Jun 2005 09:17:09 +1000 Michael Ellerman wrote:
>
> This patch updates the macros that initialise the paca to remove the lpq
> parameter. It also rearranges them a bit with the hope of making them a
> bit clearer.

More good stuff. One suggestion: now that the initialisations are
shorter, maybe you could bunch them up a bit (like two to a line).

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 14:42:31 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 14:42:31 +1000
Subject: [PATCH 4/15] ppc64: Don't pass the pointers to xItLpQueue around
In-Reply-To: <1119914237.219661.147090075133.qpatch@concordia>
References: <1119914229.959253.194917692058.qpatch@concordia> <1119914237.219661.147090075133.qpatch@concordia>
Message-ID: <20050628144231.528386ed.sfr@canb.auug.org.au>

Hi Michael,

On Tue, 28 Jun 2005 09:17:17 +1000 Michael Ellerman wrote:
>
>      int t;
> -    u32 * inUseP = &(lpQueue->xInUseWord);
> +    u32 * inUseP = &xItLpQueue.xInUseWord;
>
>      __asm__ __volatile__("\n\
>  1:  lwarx %0,0,%2 \n\
> @@ -31,37 +31,37 @@ static __inline__ int set_inUse( struct
>      stwcx. %0,0,%2 \n\
>      bne- 1b \n\
>  2:  eieio"

Could you fix this assembler code up so that it is a set of concatenated
strings rather than one long one, please? I think that is the preferred
formatting these days.

>
>      /* If we have recursed, just return */
> -    if ( !set_inUse( lpQueue ) )
> +    if ( !set_inUse() )

You might as well do simple white space cleanups like these along the way
(as you have done elsewhere).

>              ItLpQueue_clearValid( nextLpEvent );
> -        } else if ( lpQueue->xPlicOverflowIntPending )
> +        } else if ( xItLpQueue.xPlicOverflowIntPending )

More whitespace.

>               */
> -            HvCallEvent_getOverflowLpEvents( lpQueue->xIndex);
> +            HvCallEvent_getOverflowLpEvents( xItLpQueue.xIndex);

And again.

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 14:45:47 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 14:45:47 +1000
Subject: [PATCH 2/15] ppc64: Spread lpevents by default on iSeries
In-Reply-To: <1119914224.102132.945355478304.qpatch@concordia>
References: <1119914217.31245.735403323298.qpatch@concordia> <1119914224.102132.945355478304.qpatch@concordia>
Message-ID: <20050628144547.58369ffb.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:04 +1000 Michael Ellerman wrote:
>
> With the previous patch in place, spreading lpevents by default becomes
> a one liner.
>
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 14:47:21 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 14:47:21 +1000
Subject: [PATCH 1/15] ppc64: Remove lpqueue pointer from the paca on iSeries
In-Reply-To: <1119914217.31245.735403323298.qpatch@concordia>
References: <200506271824.46507.michael@ellerman.id.au> <1119914217.31245.735403323298.qpatch@concordia>
Message-ID: <20050628144721.27c8b50a.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:16:57 +1000 Michael Ellerman wrote:
>
> The iSeries code keeps a pointer to the ItLpQueue in its paca struct. But
> all these pointers end up pointing to the one place, ie. xItLpQueue.
>
> So remove the pointer from the paca struct and just refer to xItLpQueue
> directly where needed.
>
> The only complication is that the spread_lpevents logic was implemented by
> having a NULL lpqueue pointer in the paca on CPUs that weren't supposed to
> process events. Instead we just compare the spread_lpevents value to the
> processor id to get the same behaviour.
>
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

... since my only comment was trivial and can be addressed later.

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 14:50:05 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 14:50:05 +1000
Subject: [PATCH 3/15] ppc64: Reorganise the paca initialisation macros
In-Reply-To: <1119914229.959253.194917692058.qpatch@concordia>
References: <1119914224.102132.945355478304.qpatch@concordia> <1119914229.959253.194917692058.qpatch@concordia>
Message-ID: <20050628145005.19bf5fc2.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:09 +1000 Michael Ellerman wrote:
>
> This patch updates the macros that initialise the paca to remove the lpq
> parameter. It also rearranges them a bit with the hope of making them a
> bit clearer.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

... again with only a trivial comment

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 14:51:11 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 14:51:11 +1000
Subject: [PATCH 4/15] ppc64: Don't pass the pointers to xItLpQueue around
In-Reply-To: <1119914237.219661.147090075133.qpatch@concordia>
References: <1119914229.959253.194917692058.qpatch@concordia> <1119914237.219661.147090075133.qpatch@concordia>
Message-ID: <20050628145111.45da6b22.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:17 +1000 Michael Ellerman wrote:
>
> Because there's only one ItLpQueue and we know where it is, ie. xItLpQueue,
> there's no point passing pointers to it around all over the place.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:08:20 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:08:20 +1000
Subject: [PATCH 5/15] ppc64: Move initialisation of xItLpQueue into ItLpQueue.c
In-Reply-To: <1119914244.209803.173831178348.qpatch@concordia>
References: <1119914237.219661.147090075133.qpatch@concordia> <1119914244.209803.173831178348.qpatch@concordia>
Message-ID: <20050628150820.4b7e8369.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:24 +1000 Michael Ellerman wrote:
>
> The xItLpQueue is initialised manually in iSeries_setup_arch(). Move this code
> into ItLpQueue.c for a cleaner separation.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

> +++ work/arch/ppc64/kernel/ItLpQueue.c
> @@ -11,6 +11,7 @@
>  #include
>  #include
>  #include
> +#include

You can probably remove the include of bootmem.h from iSeries_setup.c.

> +    /*
> +     * Allocate a page for the Event Stack. The Hypervisor needs the
> +     * absolute real address, so we subtract out the KERNELBASE and add
> +     * in the absolute real address of the kernel load area.
> +     */

I don't think this comment is relevant to the code below.

> +    hvlpevent_queue_setup();

Maybe just hvlpevent_setup()?

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:10:33 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:10:33 +1000
Subject: [PATCH 6/15] ppc64: Move xItLpQueue proc code into ItLpQueue.c
In-Reply-To: <1119914250.463962.977117712453.qpatch@concordia>
References: <1119914244.209803.173831178348.qpatch@concordia> <1119914250.463962.977117712453.qpatch@concordia>
Message-ID: <20050628151033.01caad7e.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:30 +1000 Michael Ellerman wrote:
>
> Move the code that displays xItLpQueue values in /proc into ItLpQueue.c
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:12:29 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:12:29 +1000
Subject: [PATCH 7/15] ppc64: Make two ItLpQueue related functions static
In-Reply-To: <1119914257.7888.449924609208.qpatch@concordia>
References: <1119914250.463962.977117712453.qpatch@concordia> <1119914257.7888.449924609208.qpatch@concordia>
Message-ID: <20050628151229.042cefe7.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:37 +1000 Michael Ellerman wrote:
>
> External parties don't need to use ItLpQueue_getNextLpEvent() or
> ItLpQueue_clearValid(), they're internal to ItLpQueue.c
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:13:50 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:13:50 +1000
Subject: [PATCH 8/15] ppc64: Move definition of xItLpQueue
In-Reply-To: <1119914263.38102.77846693540.qpatch@concordia>
References: <1119914257.7888.449924609208.qpatch@concordia> <1119914263.38102.77846693540.qpatch@concordia>
Message-ID: <20050628151350.0b662089.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:43 +1000 Michael Ellerman wrote:
>
> The xItLpQueue is declared in LparData.c, move it into ItLpQueue.c

OK.

> LparData.c is the only other file that needs to know about xItLpQueue, so
> remove the extern definition from ItLpQueue.h and put it in LparData.c
> directly.

Don't do this.

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:15:55 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:15:55 +1000
Subject: [PATCH 9/15] ppc64: Rename xItLpQueue to hvlpevent_queue
In-Reply-To: <1119914269.133079.482794003407.qpatch@concordia>
References: <1119914263.38102.77846693540.qpatch@concordia> <1119914269.133079.482794003407.qpatch@concordia>
Message-ID: <20050628151555.167d1df3.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:49 +1000 Michael Ellerman wrote:
>
> The xItLpQueue is a queue of HvLpEvents that we're given by the Hypervisor.
> Rename xItLpQueue to hvlpevent_queue and make the type struct hvlpevent_queue.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:20:25 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:20:25 +1000
Subject: [PATCH 10/15] ppc64: Rename ItLpQueue_* functions to hvlpevent_queue_*
In-Reply-To: <1119914275.942167.276923052080.qpatch@concordia>
References: <1119914269.133079.482794003407.qpatch@concordia> <1119914275.942167.276923052080.qpatch@concordia>
Message-ID: <20050628152025.0d3c4a54.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:17:55 +1000 Michael Ellerman wrote:
>
> Now that we've renamed the xItLpQueue structure, rename the functions that
> operate on it also.

I have some suggestions for (better) names ...

> -static struct HvLpEvent * ItLpQueue_getNextLpEvent(void)
> +static struct HvLpEvent * hvlpevent_queue_next_event(void)

get_next_hvlpevent

> -int ItLpQueue_isLpIntPending(void)
> +int hvlpevent_queue_event_pending(void)

hvlpevent_pending

> -static void ItLpQueue_clearValid( struct HvLpEvent * event )
> +static void hvlpevent_clear_valid( struct HvLpEvent * event )

OK (but whitespace :-))

> -unsigned ItLpQueue_process(struct pt_regs *regs)
> +unsigned hvlpevent_queue_process(struct pt_regs *regs)

process_hvlpevents

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:22:25 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:22:25 +1000
Subject: [PATCH 11/15] ppc64: Don't count number of events processed for caller
In-Reply-To: <1119914283.627555.717678214194.qpatch@concordia>
References: <1119914275.942167.276923052080.qpatch@concordia> <1119914283.627555.717678214194.qpatch@concordia>
Message-ID: <20050628152225.406af0ac.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:18:03 +1000 Michael Ellerman wrote:
>
> Currently we count the number of lpevents processed in 3 separate places.
>
> One of these counters is never read, so just remove it. This means
> hvlpevent_queue_process() no longer needs to return the number of events
> processed.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:26:09 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:26:09 +1000
Subject: [PATCH 12/15] ppc64: Simplify counting of lpevents, remove lpevent_count from paca
In-Reply-To: <1119914292.574437.212533125149.qpatch@concordia>
References: <1119914283.627555.717678214194.qpatch@concordia> <1119914292.574437.212533125149.qpatch@concordia>
Message-ID: <20050628152609.1a8ad1dd.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:18:12 +1000 Michael Ellerman wrote:
>
> Currently there's a per-cpu count of lpevents processed, a per-queue (ie.
> global) total count, and a count by event type.
>
> Replace all that with a count by event for each cpu. We only need to add
> it up in the proc code.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

> +    for_each_online_cpu(cpu) {
> +        cpu_totals[cpu] = 0;
> +        for (i = 0; i < HvLpEvent_Type_NumTypes; i++) {
> +            cpu_totals[cpu] += per_cpu(hvlpevent_counts, cpu)[i];
> +        }

You don't really need to bracket these simple for loop bodies.

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:27:53 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:27:53 +1000
Subject: [PATCH 13/15] ppc64: Cleanup proc printing of event types
In-Reply-To: <1119914299.407466.18333142742.qpatch@concordia>
References: <1119914292.574437.212533125149.qpatch@concordia> <1119914299.407466.18333142742.qpatch@concordia>
Message-ID: <20050628152753.4cc8ebeb.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:18:19 +1000 Michael Ellerman wrote:
>
> The code that prints event counts by type uses a hand-coded number of tabs
> to get the alignment right. Instead use a printf alignment which will allow
> us to use the event_type strings elsewhere in the future.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:29:01 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:29:01 +1000
Subject: [PATCH 14/15] ppc64: Cleanup whitespace in arch/ppc64/kernel/ItLpQueue.c
In-Reply-To: <1119914305.357416.651792101712.qpatch@concordia>
References: <1119914299.407466.18333142742.qpatch@concordia> <1119914305.357416.651792101712.qpatch@concordia>
Message-ID: <20050628152901.79781e9c.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:18:25 +1000 Michael Ellerman wrote:
>
> Just cleanup white space.
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Tue Jun 28 15:31:53 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 15:31:53 +1000
Subject: [PATCH 15/15] ppc64: Formatting cleanups in arch/ppc64/kernel/ItLpQueue.c
In-Reply-To: <1119914311.757215.71391459005.qpatch@concordia>
References: <1119914305.357416.651792101712.qpatch@concordia> <1119914311.757215.71391459005.qpatch@concordia>
Message-ID: <20050628153153.7aa1088f.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 09:18:31 +1000 Michael Ellerman wrote:
>
> Just formatting cleanups:
> * rename some "nextLpEvent" variables to just "event"
> * make code fit in 80 columns
> * use brackets around if/else
> * use a temporary to make hvlpevent_clear_valid clearer
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

> +        }
> +    } else {
> +        event = NULL;
>      }
> -    else
> -        nextLpEvent = NULL;

You
don't need to bracket these trivial else bodies.

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From michael at ellerman.id.au Tue Jun 28 16:49:16 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 28 Jun 2005 16:49:16 +1000
Subject: Updated: [PATCH 8/15] ppc64: Move definition of xItLpQueue
In-Reply-To: <20050628151350.0b662089.sfr@canb.auug.org.au>
References: <1119914257.7888.449924609208.qpatch@concordia>
	<1119914263.38102.77846693540.qpatch@concordia>
	<20050628151350.0b662089.sfr@canb.auug.org.au>
Message-ID: <200506281649.16626.michael@ellerman.id.au>

On Tue, 28 Jun 2005 15:13, Stephen Rothwell wrote:
> On Tue, 28 Jun 2005 09:17:43 +1000 Michael Ellerman wrote:
> > The xItLpQueue is declared in LparData.c, move it into ItLpQueue.c
>
> OK.
>
> > LparData.c is the only other file that needs to know about xItLpQueue, so
> > remove the extern definition from ItLpQueue.h and put it in LparData.c
> > directly.
>
> Don't do this.

The xItLpQueue is declared extern in ItLpQueue.h and declared in LparData.c
Move the actual declaration into ItLpQueue.c

Signed-off-by: Michael Ellerman
--

 arch/ppc64/kernel/ItLpQueue.c         | 8 ++++++++
 arch/ppc64/kernel/LparData.c          | 7 +------
 include/asm-ppc64/iSeries/ItLpQueue.h | 1 -
 3 files changed, 9 insertions(+), 7 deletions(-)

Index: work/arch/ppc64/kernel/ItLpQueue.c
===================================================================
--- work.orig/arch/ppc64/kernel/ItLpQueue.c
+++ work/arch/ppc64/kernel/ItLpQueue.c
@@ -20,6 +20,14 @@
 #include
 #include
 
+/*
+ * The LpQueue is used to pass event data from the hypervisor to
+ * the partition. This is where I/O interrupt events are communicated.
+ *
+ * It is written to by the hypervisor so cannot end up in the BSS.
+ */
+struct ItLpQueue xItLpQueue __attribute__((__section__(".data")));
+
 static char *event_types[9] = {
 	"Hypervisor\t\t",
 	"Machine Facilities\t",
Index: work/arch/ppc64/kernel/LparData.c
===================================================================
--- work.orig/arch/ppc64/kernel/LparData.c
+++ work/arch/ppc64/kernel/LparData.c
@@ -28,13 +28,6 @@
 #include
 #include
 
-/* The LpQueue is used to pass event data from the hypervisor to
- * the partition. This is where I/O interrupt events are communicated.
- */
-
-/* May be filled in by the hypervisor so cannot end up in the BSS */
-struct ItLpQueue xItLpQueue __attribute__((__section__(".data")));
-
 /* The HvReleaseData is the root of the information shared between
  * the hypervisor and Linux.

From michael at ellerman.id.au Tue Jun 28 16:50:34 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 28 Jun 2005 16:50:34 +1000
Subject: Updated: [PATCH 9/15] ppc64: Rename xItLpQueue to hvlpevent_queue
In-Reply-To: <20050628151555.167d1df3.sfr@canb.auug.org.au>
References: <1119914263.38102.77846693540.qpatch@concordia>
	<1119914269.133079.482794003407.qpatch@concordia>
	<20050628151555.167d1df3.sfr@canb.auug.org.au>
Message-ID: <200506281650.38369.michael@ellerman.id.au>

On Tue, 28 Jun 2005 15:15, Stephen Rothwell wrote:
> On Tue, 28 Jun 2005 09:17:49 +1000 Michael Ellerman wrote:
> > The xItLpQueue is a queue of HvLpEvents that we're given by the
> > Hypervisor. Rename xItLpQueue to hvlpevent_queue and make the type struct
> > hvlpevent_queue.
> >
> > Signed-off-by: Michael Ellerman
>
> Acked-by: Stephen Rothwell

Updated because of change to previous patch.

The xItLpQueue is a queue of HvLpEvents that we're given by the Hypervisor.
Rename xItLpQueue to hvlpevent_queue and make the type struct hvlpevent_queue.

Signed-off-by: Michael Ellerman -- Index: work/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- work.orig/arch/ppc64/kernel/ItLpQueue.c +++ work/arch/ppc64/kernel/ItLpQueue.c @@ -26,7 +26,7 @@ * * It is written to by the hypervisor so cannot end up in the BSS. */ -struct ItLpQueue xItLpQueue __attribute__((__section__(".data"))); +struct hvlpevent_queue hvlpevent_queue __attribute__((__section__(".data"))); static char *event_types[9] = { "Hypervisor\t\t", @@ -43,7 +43,7 @@ static char *event_types[9] = { static __inline__ int set_inUse(void) { int t; - u32 * inUseP = &xItLpQueue.xInUseWord; + u32 * inUseP = &hvlpevent_queue.xInUseWord; __asm__ __volatile__("\n\ 1: lwarx %0,0,%2 \n\ @@ -54,8 +54,8 @@ static __inline__ int set_inUse(void) stwcx. %0,0,%2 \n\ bne- 1b \n\ 2: eieio" - : "=&r" (t), "=m" (xItLpQueue.xInUseWord) - : "r" (inUseP), "m" (xItLpQueue.xInUseWord) + : "=&r" (t), "=m" (hvlpevent_queue.xInUseWord) + : "r" (inUseP), "m" (hvlpevent_queue.xInUseWord) : "cc"); return t; @@ -63,7 +63,7 @@ static __inline__ int set_inUse(void) static __inline__ void clear_inUse(void) { - xItLpQueue.xInUseWord = 0; + hvlpevent_queue.xInUseWord = 0; } /* Array of LpEvent handler functions */ @@ -73,18 +73,18 @@ unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * ItLpQueue_getNextLpEvent(void) { struct HvLpEvent * nextLpEvent = - (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; + (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; if ( nextLpEvent->xFlags.xValid ) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ - xItLpQueue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + + hvlpevent_queue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + LpEventAlign ) / LpEventAlign ) * LpEventAlign; /* Wrap to beginning if no room at end */ - if (xItLpQueue.xSlicCurEventPtr > xItLpQueue.xSlicLastValidEventPtr) - xItLpQueue.xSlicCurEventPtr = 
xItLpQueue.xSlicEventStackPtr; + if (hvlpevent_queue.xSlicCurEventPtr > hvlpevent_queue.xSlicLastValidEventPtr) + hvlpevent_queue.xSlicCurEventPtr = hvlpevent_queue.xSlicEventStackPtr; } else nextLpEvent = NULL; @@ -101,8 +101,8 @@ int ItLpQueue_isLpIntPending(void) if (smp_processor_id() >= spread_lpevents) return 0; - next_event = (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; - return next_event->xFlags.xValid | xItLpQueue.xPlicOverflowIntPending; + next_event = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; + return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; } static void ItLpQueue_clearValid( struct HvLpEvent * event ) @@ -145,10 +145,10 @@ unsigned ItLpQueue_process(struct pt_reg nextLpEvent = ItLpQueue_getNextLpEvent(); if ( nextLpEvent ) { /* Count events to return to caller - * and count processed events in xItLpQueue + * and count processed events in hvlpevent_queue */ ++numIntsProcessed; - xItLpQueue.xLpIntCount++; + hvlpevent_queue.xLpIntCount++; /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -163,7 +163,7 @@ unsigned ItLpQueue_process(struct pt_reg * here! */ if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) - xItLpQueue.xLpIntCountByType[nextLpEvent->xType]++; + hvlpevent_queue.xLpIntCountByType[nextLpEvent->xType]++; if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && lpEventHandler[nextLpEvent->xType] ) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); @@ -171,12 +171,12 @@ unsigned ItLpQueue_process(struct pt_reg printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); ItLpQueue_clearValid( nextLpEvent ); - } else if ( xItLpQueue.xPlicOverflowIntPending ) + } else if ( hvlpevent_queue.xPlicOverflowIntPending ) /* * No more valid events. 
If overflow events are * pending process them */ - HvCallEvent_getOverflowLpEvents( xItLpQueue.xIndex); + HvCallEvent_getOverflowLpEvents( hvlpevent_queue.xIndex); else break; } @@ -205,11 +205,11 @@ void hvlpevent_queue_setup(void) /* Invoke the hypervisor to initialize the event stack */ HvCallEvent_setLpEventStack(0, eventStack, LpEventStackSize); - xItLpQueue.xSlicEventStackPtr = (char *)eventStack; - xItLpQueue.xSlicCurEventPtr = (char *)eventStack; - xItLpQueue.xSlicLastValidEventPtr = (char *)eventStack + + hvlpevent_queue.xSlicEventStackPtr = (char *)eventStack; + hvlpevent_queue.xSlicCurEventPtr = (char *)eventStack; + hvlpevent_queue.xSlicLastValidEventPtr = (char *)eventStack + (LpEventStackSize - LpEventMaxSize); - xItLpQueue.xIndex = 0; + hvlpevent_queue.xIndex = 0; } static int proc_lpevents_show(struct seq_file *m, void *v) @@ -218,11 +218,11 @@ static int proc_lpevents_show(struct seq seq_printf(m, "LpEventQueue 0\n"); seq_printf(m, " events processed:\t%lu\n", - (unsigned long)xItLpQueue.xLpIntCount); + (unsigned long)hvlpevent_queue.xLpIntCount); for (i = 0; i < 9; ++i) seq_printf(m, " %s %10lu\n", event_types[i], - (unsigned long)xItLpQueue.xLpIntCountByType[i]); + (unsigned long)hvlpevent_queue.xLpIntCountByType[i]); seq_printf(m, "\n events processed by processor:\n"); Index: work/arch/ppc64/kernel/LparData.c =================================================================== --- work.orig/arch/ppc64/kernel/LparData.c +++ work/arch/ppc64/kernel/LparData.c @@ -193,7 +193,7 @@ struct ItVpdAreas itVpdAreas = { 0,0,0, /* 13 - 15 */ sizeof(struct IoHriProcessorVpd),/* 16 length of Proc Vpd */ 0,0,0,0,0,0, /* 17 - 22 */ - sizeof(struct ItLpQueue),/* 23 length of Lp Queue */ + sizeof(struct hvlpevent_queue), /* 23 length of Lp Queue */ 0,0 /* 24 - 25 */ }, .xSlicVpdAdrs = { /* VPD addresses */ @@ -211,7 +211,7 @@ struct ItVpdAreas itVpdAreas = { 0,0,0, /* 13 - 15 */ &xIoHriProcessorVpd, /* 16 Proc Vpd */ 0,0,0,0,0,0, /* 17 - 22 */ - &xItLpQueue, /* 23 
Lp Queue */
+		&hvlpevent_queue,		/* 23 Lp Queue */
 		0,0
 	}
 };
Index: work/include/asm-ppc64/iSeries/ItLpQueue.h
===================================================================
--- work.orig/include/asm-ppc64/iSeries/ItLpQueue.h
+++ work/include/asm-ppc64/iSeries/ItLpQueue.h
@@ -41,7 +41,7 @@ struct HvLpEvent;
 #define LpEventMaxSize 256
 #define LpEventAlign 64
 
-struct ItLpQueue {
+struct hvlpevent_queue {
 /*
  * The xSlicCurEventPtr is the pointer to the next event stack entry
  * that will become valid. The OS must peek at this entry to determine

From sfr at canb.auug.org.au Tue Jun 28 16:58:30 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 28 Jun 2005 16:58:30 +1000
Subject: Updated: [PATCH 8/15] ppc64: Move definition of xItLpQueue
In-Reply-To: <200506281649.16626.michael@ellerman.id.au>
References: <1119914257.7888.449924609208.qpatch@concordia>
	<1119914263.38102.77846693540.qpatch@concordia>
	<20050628151350.0b662089.sfr@canb.auug.org.au>
	<200506281649.16626.michael@ellerman.id.au>
Message-ID: <20050628165830.223123bf.sfr@canb.auug.org.au>

On Tue, 28 Jun 2005 16:49:16 +1000 Michael Ellerman wrote:
>
> The xItLpQueue is declared extern in ItLpQueue.h and declared in LparData.c
> Move the actual declaration into ItLpQueue.c
>
> Signed-off-by: Michael Ellerman

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From greg at kroah.com Tue Jun 28 17:51:40 2005
From: greg at kroah.com (Greg KH)
Date: Tue, 28 Jun 2005 00:51:40 -0700
Subject: [PATCH] ppc/ppc64: Fix pci mmap via sysfs
In-Reply-To: <1119838264.5133.76.camel@gaston>
References: <1119836190.5133.59.camel@gaston>
	<20050626185727.0ce92772.akpm@osdl.org>
	<1119838264.5133.76.camel@gaston>
Message-ID: <20050628075140.GF3577@kroah.com>

On Mon, Jun 27, 2005 at 12:11:03PM +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2005-06-26 at 18:57 -0700, Andrew Morton wrote:
> > Benjamin Herrenschmidt wrote:
> > >
> > > Hi !
> > >
> > > This implements the change to /proc and sysfs PCI mmap functions that we
> > > discussed a while ago, that is adding an arch optional
> > > pci_resource_to_user() to allow munging on the exposed value of PCI
> > > resources to userland and thus hiding kernel internal values. It also
> > > implements use of that callback to sanitize exposed values on ppc and
> > > ppc64, thus fixing mmap of PCI devices via /proc and sysfs.
> > >
> >
> > You sure you want all those printks in there?
>
> One quilt ref later ... :)
>
> Hi !
>
> This implements the change to /proc and sysfs PCI mmap functions that we
> discussed a while ago, that is adding an arch optional
> pci_resource_to_user() to allow munging on the exposed value of PCI
> resources to userland and thus hiding kernel internal values. It also
> implements use of that callback to sanitize exposed values on ppc and
> ppc64, thus fixing mmap of PCI devices via /proc and sysfs.

Hm, did I just send the right one to Linus?

thanks,

greg k-h

From paulus at samba.org Tue Jun 28 19:36:22 2005
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 28 Jun 2005 19:36:22 +1000
Subject: [PATCH 1/2] ppc64: fix read/write on large /dev/nvram
In-Reply-To: <200504220850.00339.arnd@arndb.de>
References: <200504220850.00339.arnd@arndb.de>
Message-ID: <17089.6678.934199.608517@cargo.ozlabs.ibm.com>

Arnd Bergmann writes:

> For large nvram devices on ppc64, reading and writing fails because
> of oversized arguments to kmalloc.
> This patch makes the driver use __get_free_page instead of kmalloc
> and sanitizes error handling while touching the functions.

Hmmm, I'm not sure that always using __get_free_page even for small
reads is necessarily a good idea. If we are just limiting the size of
the read or write to PAGE_SIZE we might as well just limit the count
to PAGE_SIZE and continue to use kmalloc. I think that we should
probably limit the buffer size but then loop around until the whole
user request is satisfied.

Regards,
Paul.

From paulus at samba.org Tue Jun 28 20:59:20 2005
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 28 Jun 2005 20:59:20 +1000
Subject: [PATCH 1/2] logical numbering for numa nodes (2nd try)
In-Reply-To: <20050505221520.GB3614@otto>
References: <20050505221520.GB3614@otto>
Message-ID: <17089.11656.794870.997068@cargo.ozlabs.ibm.com>

Nathan Lynch writes:

> This patch fixes the ppc64 numa code to be more consistent with the
> conversion from numnodes to node_online_mask etc. and removes the
> dependence on the platform numa numbering by setting up a mapping
> between the platform ids found in the ibm,associativity properties and
> logical node numbers.

Unfortunately it no longer applies due to other changes to numa.c;
care to send a new version of this patch and the following one?

Paul.

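Paul's suggestion in the /dev/nvram reply earlier in this digest (limit the
buffer size, but then loop around until the whole user request is satisfied)
could look roughly like the userspace sketch below. It is not code from the
thread and not the driver's implementation: CHUNK_SIZE stands in for
PAGE_SIZE, fake_nvram and fake_nvram_read() are hypothetical stand-ins for
the device and for ppc_md.nvram_read(), chunked_read() is an invented name,
and memcpy() takes the place of copy_to_user().

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define CHUNK_SIZE 4096	/* assumption: stand-in for the kernel's PAGE_SIZE */

/* Hypothetical backing store; a real driver talks to firmware instead. */
static unsigned char fake_nvram[10000];

/* Hypothetical stand-in for ppc_md.nvram_read(): copies up to count bytes
 * starting at *pos, advances *pos, returns the number of bytes copied. */
static ssize_t fake_nvram_read(char *dst, size_t count, off_t *pos)
{
	size_t size = sizeof(fake_nvram);

	if (*pos < 0 || (size_t)*pos >= size)
		return 0;
	if (count > size - (size_t)*pos)
		count = size - (size_t)*pos;
	memcpy(dst, fake_nvram + *pos, count);
	*pos += (off_t)count;
	return (ssize_t)count;
}

/* Serve a request of any size through a bounce buffer that never exceeds
 * CHUNK_SIZE, looping until the request is satisfied, the device runs out
 * of data, or the backend reports an error. */
static ssize_t chunked_read(char *out, size_t count, off_t *pos)
{
	static char bounce[CHUNK_SIZE];	/* a driver would allocate this */
	size_t done = 0;

	while (done < count) {
		size_t n = count - done;
		ssize_t ret;

		if (n > CHUNK_SIZE)
			n = CHUNK_SIZE;

		ret = fake_nvram_read(bounce, n, pos);
		if (ret < 0)
			return done ? (ssize_t)done : ret;
		if (ret == 0)
			break;			/* end of device */

		memcpy(out + done, bounce, (size_t)ret);	/* copy_to_user() in a driver */
		done += (size_t)ret;
	}
	return (ssize_t)done;
}
```

With this shape, a tool that asks for the whole nvram in one read() still
gets it in one call, while the allocation stays bounded regardless of the
request size. The write side would mirror it, copying from the caller into
the bounce buffer before each backend write.
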
From arnd at arndb.de Tue Jun 28 22:29:40 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Tue, 28 Jun 2005 14:29:40 +0200
Subject: [PATCH 1/2] ppc64: fix read/write on large /dev/nvram
In-Reply-To: <17089.6678.934199.608517@cargo.ozlabs.ibm.com>
References: <200504220850.00339.arnd@arndb.de>
	<17089.6678.934199.608517@cargo.ozlabs.ibm.com>
Message-ID: <200506281429.41386.arnd@arndb.de>

On Tuesday 28 June 2005 11:36, Paul Mackerras wrote:

> Hmmm, I'm not sure that always using __get_free_page even for small
> reads is necessarily a good idea. If we are just limiting the size of
> the read or write to PAGE_SIZE we might as well just limit the count
> to PAGE_SIZE and continue to use kmalloc. I think that we should
> probably limit the buffer size but then loop around until the whole
> user request is satisfied.

IIRC, the tools we commonly use to access nvram first check the size
and then try to get the whole buffer in a single read request. This
means that we will still use full pages most of the time with kmalloc,
so it shouldn't make any difference one way or the other.

Anyway, here is an updated patch using kmalloc and with a fix for the
bug noticed by Milton.

---
The generic ppc64 nvram code breaks when passed large buffers to
read/write because of the size limit for kmalloc. This patch fixes this
by allocating at most PAGE_SIZE bytes. It also cleans up the error
handling for those functions so they become more readable.

Signed-off-by: Arnd Bergmann --- linux-cg.orig/arch/ppc64/kernel/nvram.c 2005-06-28 13:14:04.275989864 -0400 +++ linux-cg/arch/ppc64/kernel/nvram.c 2005-06-28 13:40:28.377013664 -0400 @@ -81,80 +81,74 @@ static loff_t dev_nvram_llseek(struct fi static ssize_t dev_nvram_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { - ssize_t len; - char *tmp_buffer; - int size; + ssize_t ret; + char *tmp = NULL; + ssize_t size; + + ret = -ENODEV; + if (!ppc_md.nvram_size) + goto out; - if (ppc_md.nvram_size == NULL) - return -ENODEV; + ret = 0; size = ppc_md.nvram_size(); + if (*ppos >= size || size < 0) + goto out; - if (!access_ok(VERIFY_WRITE, buf, count)) - return -EFAULT; - if (*ppos >= size) - return 0; - if (count > size) - count = size; - - tmp_buffer = (char *) kmalloc(count, GFP_KERNEL); - if (!tmp_buffer) { - printk(KERN_ERR "dev_read_nvram: kmalloc failed\n"); - return -ENOMEM; - } - - len = ppc_md.nvram_read(tmp_buffer, count, ppos); - if ((long)len <= 0) { - kfree(tmp_buffer); - return len; - } - - if (copy_to_user(buf, tmp_buffer, len)) { - kfree(tmp_buffer); - return -EFAULT; - } + count = min_t(size_t, count, size - *ppos); + count = min(count, PAGE_SIZE); - kfree(tmp_buffer); - return len; + ret = -ENOMEM; + tmp = (char *) __get_free_page(GFP_KERNEL); + if (!tmp) + goto out; + + ret = ppc_md.nvram_read(tmp, count, ppos); + if (ret <= 0) + goto out; + + if (copy_to_user(buf, tmp, ret)) + ret = -EFAULT; + +out: + free_page((unsigned long)tmp); + return ret; } static ssize_t dev_nvram_write(struct file *file, const char __user *buf, - size_t count, loff_t *ppos) + size_t count, loff_t *ppos) { - ssize_t len; - char * tmp_buffer; - int size; + ssize_t ret; + char *tmp = NULL; + ssize_t size; + + ret = -ENODEV; + if (!ppc_md.nvram_size) + goto out; - if (ppc_md.nvram_size == NULL) - return -ENODEV; + ret = 0; size = ppc_md.nvram_size(); + if (*ppos >= size || size < 0) + goto out; - if (!access_ok(VERIFY_READ, buf, count)) - return -EFAULT; - 
if (*ppos >= size) - return 0; - if (count > size) - count = size; - - tmp_buffer = (char *) kmalloc(count, GFP_KERNEL); - if (!tmp_buffer) { - printk(KERN_ERR "dev_nvram_write: kmalloc failed\n"); - return -ENOMEM; - } - - if (copy_from_user(tmp_buffer, buf, count)) { - kfree(tmp_buffer); - return -EFAULT; - } + count = min_t(size_t, count, size - *ppos); + count = min(count, PAGE_SIZE); - len = ppc_md.nvram_write(tmp_buffer, count, ppos); - if ((long)len <= 0) { - kfree(tmp_buffer); - return len; - } + ret = -ENOMEM; + tmp = (char *) __get_free_page(GFP_KERNEL); + if (!tmp) + goto out; + + ret = -EFAULT; + if (copy_from_user(tmp, buf, count)) + goto out; + + ret = ppc_md.nvram_write(tmp, count, ppos); + +out: + free_page((unsigned long)tmp); + return ret; - kfree(tmp_buffer); - return len; } static int dev_nvram_ioctl(struct inode *inode, struct file *file, From arnd at arndb.de Tue Jun 28 22:34:25 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 28 Jun 2005 14:34:25 +0200 Subject: [PATCH 1/2] ppc64: fix read/write on large /dev/nvram In-Reply-To: <200506281429.41386.arnd@arndb.de> References: <200504220850.00339.arnd@arndb.de> <17089.6678.934199.608517@cargo.ozlabs.ibm.com> <200506281429.41386.arnd@arndb.de> Message-ID: <200506281434.25828.arnd@arndb.de> [ I sent the old patch again in the previous mail, here is the correct one. ] The generic ppc64 nvram code breaks when passed large buffers to read/write because of the size limit for kmalloc. This patch fixes this by allocating at most PAGE_SIZE bytes. It also cleans up the error handling for those function so they become more readable. 
Signed-off-by: Arnd Bergmann --- linux-cg.orig/arch/ppc64/kernel/nvram.c 2005-06-28 13:14:04.275989864 -0400 +++ linux-cg/arch/ppc64/kernel/nvram.c 2005-06-28 14:37:39.836976600 -0400 @@ -81,80 +81,74 @@ static loff_t dev_nvram_llseek(struct fi static ssize_t dev_nvram_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { - ssize_t len; - char *tmp_buffer; - int size; + ssize_t ret; + char *tmp = NULL; + ssize_t size; + + ret = -ENODEV; + if (!ppc_md.nvram_size) + goto out; - if (ppc_md.nvram_size == NULL) - return -ENODEV; + ret = 0; size = ppc_md.nvram_size(); + if (*ppos >= size || size < 0) + goto out; - if (!access_ok(VERIFY_WRITE, buf, count)) - return -EFAULT; - if (*ppos >= size) - return 0; - if (count > size) - count = size; - - tmp_buffer = (char *) kmalloc(count, GFP_KERNEL); - if (!tmp_buffer) { - printk(KERN_ERR "dev_read_nvram: kmalloc failed\n"); - return -ENOMEM; - } - - len = ppc_md.nvram_read(tmp_buffer, count, ppos); - if ((long)len <= 0) { - kfree(tmp_buffer); - return len; - } - - if (copy_to_user(buf, tmp_buffer, len)) { - kfree(tmp_buffer); - return -EFAULT; - } + count = min_t(size_t, count, size - *ppos); + count = min(count, PAGE_SIZE); - kfree(tmp_buffer); - return len; + ret = -ENOMEM; + tmp = kmalloc(count, GFP_KERNEL); + if (!tmp) + goto out; + + ret = ppc_md.nvram_read(tmp, count, ppos); + if (ret <= 0) + goto out; + + if (copy_to_user(buf, tmp, ret)) + ret = -EFAULT; + +out: + kfree(tmp); + return ret; } static ssize_t dev_nvram_write(struct file *file, const char __user *buf, - size_t count, loff_t *ppos) + size_t count, loff_t *ppos) { - ssize_t len; - char * tmp_buffer; - int size; + ssize_t ret; + char *tmp = NULL; + ssize_t size; + + ret = -ENODEV; + if (!ppc_md.nvram_size) + goto out; - if (ppc_md.nvram_size == NULL) - return -ENODEV; + ret = 0; size = ppc_md.nvram_size(); + if (*ppos >= size || size < 0) + goto out; - if (!access_ok(VERIFY_READ, buf, count)) - return -EFAULT; - if (*ppos >= size) - return 0; 
- if (count > size) - count = size; - - tmp_buffer = (char *) kmalloc(count, GFP_KERNEL); - if (!tmp_buffer) { - printk(KERN_ERR "dev_nvram_write: kmalloc failed\n"); - return -ENOMEM; - } - - if (copy_from_user(tmp_buffer, buf, count)) { - kfree(tmp_buffer); - return -EFAULT; - } + count = min_t(size_t, count, size - *ppos); + count = min(count, PAGE_SIZE); - len = ppc_md.nvram_write(tmp_buffer, count, ppos); - if ((long)len <= 0) { - kfree(tmp_buffer); - return len; - } + ret = -ENOMEM; + tmp = kmalloc(count, GFP_KERNEL); + if (!tmp) + goto out; + + ret = -EFAULT; + if (copy_from_user(tmp, buf, count)) + goto out; + + ret = ppc_md.nvram_write(tmp, count, ppos); + +out: + kfree(tmp); + return ret; - kfree(tmp_buffer); - return len; } static int dev_nvram_ioctl(struct inode *inode, struct file *file, From arnd at arndb.de Tue Jun 28 23:28:07 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 28 Jun 2005 15:28:07 +0200 Subject: [PATCH] net: add driver for the NIC on Cell Blades Message-ID: <200506281528.08834.arnd@arndb.de> This patch adds a driver for a new 1000 Mbit ethernet NIC. It is integrated on the south bridge that is used for our Cell Blades. The code gets the MAC address from the Open Firmware device tree, so it won't compile on platforms other than ppc64. This is the first public release, so I don't expect the first version to get merged, but I'd aim for integration within the 2.6.13 time frame. 
From: Jens Osterkamp Cc: Utz Bacher Signed-off-by: Arnd Bergmann --- drivers/net/Kconfig | 6 drivers/net/Makefile | 2 drivers/net/spider_net.c | 2298 +++++++++++++++++++++++++++++++++++++++ drivers/net/spider_net.h | 469 +++++++ drivers/net/spider_net_ethtool.c | 107 + include/linux/pci_ids.h | 1 6 files changed, 2883 insertions(+) --- linux-cg.orig/drivers/net/Kconfig 2005-06-28 14:54:14.571996776 -0400 +++ linux-cg/drivers/net/Kconfig 2005-06-28 15:08:07.506978488 -0400 @@ -2042,6 +2042,12 @@ config BNX2 To compile this driver as a module, choose M here: the module will be called bnx2. This is recommended. +config SPIDER_NET + tristate "Spider Gigabit Ethernet driver" + depends on PCI && PPC_BPA + This driver supports the Gigabit Ethernet chips present on the + Cell Processor-Based Blades from IBM. + config GIANFAR tristate "Gianfar Ethernet" depends on 85xx || 83xx --- linux-cg.orig/drivers/net/Makefile 2005-06-28 14:54:14.574996320 -0400 +++ linux-cg/drivers/net/Makefile 2005-06-28 15:06:01.224003480 -0400 @@ -52,6 +52,8 @@ obj-$(CONFIG_STNIC) += stnic.o 8390.o obj-$(CONFIG_FEALNX) += fealnx.o obj-$(CONFIG_TIGON3) += tg3.o obj-$(CONFIG_BNX2) += bnx2.o +spidernet-y += spider_net.o spider_net_ethtool.o sungem_phy.o +obj-$(CONFIG_SPIDER_NET) += spidernet.o obj-$(CONFIG_TC35815) += tc35815.o obj-$(CONFIG_SKGE) += skge.o obj-$(CONFIG_SK98LIN) += sk98lin/ --- linux-cg.orig/drivers/net/spider_net.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/drivers/net/spider_net.c 2005-06-28 15:06:01.234001960 -0400 @@ -0,0 +1,2298 @@ +/* + * Network device driver for Cell Processor-Based Blade + * + * (C) Copyright IBM Corp. 2005 + * + * Authors : Utz Bacher + * Jens Osterkamp + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "spider_net.h" + +MODULE_AUTHOR("Utz Bacher and Jens Osterkamp " \ + ""); +MODULE_DESCRIPTION("Spider Southbridge Gigabit Ethernet driver"); +MODULE_LICENSE("GPL"); + +static int rx_descriptors = SPIDER_NET_RX_DESCRIPTORS_DEFAULT; +static int tx_descriptors = SPIDER_NET_TX_DESCRIPTORS_DEFAULT; + +module_param(rx_descriptors, int, 0644); +module_param(tx_descriptors, int, 0644); + +MODULE_PARM_DESC(rx_descriptors, "number of descriptors used " \ + "in rx chains"); +MODULE_PARM_DESC(tx_descriptors, "number of descriptors used " \ + "in tx chain"); + +char spider_net_driver_name[] = "spidernet"; + +static struct pci_device_id spider_net_pci_tbl[] = { + { PCI_VENDOR_ID_TOSHIBA_2, PCI_DEVICE_ID_TOSHIBA_SPIDER_NET, + PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, + { 0, } +}; + +MODULE_DEVICE_TABLE(pci, spider_net_pci_tbl); + +/** + * spider_net_read_reg - reads an SMMIO register of a card + * @card: device structure + * @reg: register to read from + * + * returns the content of the specified SMMIO register. 
+ */ +static u32 +spider_net_read_reg(struct spider_net_card *card, u32 reg) +{ + u32 value; + + value = readl(card->regs + reg); + value = le32_to_cpu(value); + + return value; +} + +/** + * spider_net_write_reg - writes to an SMMIO register of a card + * @card: device structure + * @reg: register to write to + * @value: value to write into the specified SMMIO register + */ +static void +spider_net_write_reg(struct spider_net_card *card, u32 reg, u32 value) +{ + value = cpu_to_le32(value); + writel(value, card->regs + reg); +} + +/** + * spider_net_rx_irq_off - switch off rx irq on this spider card + * @card: device structure + * + * switches off rx irq by masking them out in the GHIINTnMSK register + */ +static void +spider_net_rx_irq_off(struct spider_net_card *card) +{ + u32 regvalue; + unsigned long flags; + + spin_lock_irqsave(&card->intmask_lock, flags); + regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); + regvalue &= ~SPIDER_NET_RXINT; + spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); + spin_unlock_irqrestore(&card->intmask_lock, flags); +} + +/** spider_net_write_phy - write to phy register + * @netdev: adapter to be written to + * @mii_id: id of MII + * @reg: PHY register + * @val: value to be written to phy register + * + * spider_net_write_phy_register writes to an arbitrary PHY + * register via the spider GPCWOPCMD register. We assume the queue does + * not run full (not more than 15 commands outstanding). 
+ **/ +static void +spider_net_write_phy(struct net_device *netdev, int mii_id, + int reg, int val) +{ + struct spider_net_card *card = netdev_priv(netdev); + u32 writevalue; + + writevalue = ((u32)mii_id << 21) | + ((u32)reg << 16) | ((u32)val); + + spider_net_write_reg(card, SPIDER_NET_GPCWOPCMD, writevalue); +} + +/** spider_net_read_phy - read from phy register + * @netdev: network device to be read from + * @mii_id: id of MII + * @reg: PHY register + * + * Returns value read from PHY register + * + * spider_net_write_phy reads from an arbitrary PHY + * register via the spider GPCROPCMD register + **/ +static int +spider_net_read_phy(struct net_device *netdev, int mii_id, int reg) +{ + struct spider_net_card *card = netdev_priv(netdev); + u32 readvalue; + + readvalue = ((u32)mii_id << 21) | ((u32)reg << 16); + spider_net_write_reg(card, SPIDER_NET_GPCROPCMD, readvalue); + + /* we don't use semaphores to wait for an SPIDER_NET_GPROPCMPINT + * interrupt, as we poll for the completion of the read operation + * in spider_net_read_phy. 
Should take about 50 us */ + do { + readvalue = spider_net_read_reg(card, SPIDER_NET_GPCROPCMD); + } while (readvalue & SPIDER_NET_GPREXEC); + + readvalue &= SPIDER_NET_GPRDAT_MASK; + + return readvalue; +} + +/** + * spider_net_rx_irq_on - switch on rx irq on this spider card + * @card: device structure + * + * switches on rx irq by enabling them in the GHIINTnMSK register + */ +static void +spider_net_rx_irq_on(struct spider_net_card *card) +{ + u32 regvalue; + unsigned long flags; + + spin_lock_irqsave(&card->intmask_lock, flags); + regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); + regvalue |= SPIDER_NET_RXINT; + spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); + spin_unlock_irqrestore(&card->intmask_lock, flags); +} + +/** + * spider_net_tx_irq_off - switch off tx irq on this spider card + * @card: device structure + * + * switches off tx irq by masking them out in the GHIINTnMSK register + */ +static void +spider_net_tx_irq_off(struct spider_net_card *card) +{ + u32 regvalue; + unsigned long flags; + + spin_lock_irqsave(&card->intmask_lock, flags); + regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); + regvalue &= ~SPIDER_NET_TXINT; + spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); + spin_unlock_irqrestore(&card->intmask_lock, flags); +} + +/** + * spider_net_tx_irq_on - switch on tx irq on this spider card + * @card: device structure + * + * switches on tx irq by enabling them in the GHIINTnMSK register + */ +static void +spider_net_tx_irq_on(struct spider_net_card *card) +{ + u32 regvalue; + unsigned long flags; + + spin_lock_irqsave(&card->intmask_lock, flags); + regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); + regvalue |= SPIDER_NET_TXINT; + spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); + spin_unlock_irqrestore(&card->intmask_lock, flags); +} + +/** + * spider_net_set_promisc - sets the unicast address or the promiscuous mode + * @card: card structure + * + * 
spider_net_set_promisc sets the unicast destination address filter and + * thus either allows for non-promisc mode or promisc mode + */ +static void +spider_net_set_promisc(struct spider_net_card *card) +{ + u32 macu, macl; + struct net_device *netdev = card->netdev; + + if (netdev->flags & IFF_PROMISC) { + /* clear destination entry 0 */ + spider_net_write_reg(card, SPIDER_NET_GMRUAFILnR, 0); + spider_net_write_reg(card, SPIDER_NET_GMRUAFILnR + 0x04, 0); + spider_net_write_reg(card, SPIDER_NET_GMRUA0FIL15R, + SPIDER_NET_PROMISC_VALUE); + } else { + macu = netdev->dev_addr[0]; + macu <<= 8; + macu |= netdev->dev_addr[1]; + memcpy(&macl, &netdev->dev_addr[2], sizeof(macl)); + + macu |= SPIDER_NET_UA_DESCR_VALUE; + spider_net_write_reg(card, SPIDER_NET_GMRUAFILnR, macu); + spider_net_write_reg(card, SPIDER_NET_GMRUAFILnR + 0x04, macl); + spider_net_write_reg(card, SPIDER_NET_GMRUA0FIL15R, + SPIDER_NET_NONPROMISC_VALUE); + } +} + +/** + * spider_net_get_mac_address - read mac address from spider card + * @card: device structure + * + * reads MAC address from GMACUNIMACU and GMACUNIMACL registers + */ +static int +spider_net_get_mac_address(struct net_device *netdev) +{ + struct spider_net_card *card = netdev_priv(netdev); + u32 macl, macu; + + macl = spider_net_read_reg(card, SPIDER_NET_GMACUNIMACL); + macu = spider_net_read_reg(card, SPIDER_NET_GMACUNIMACU); + + netdev->dev_addr[0] = (macu >> 24) & 0xff; + netdev->dev_addr[1] = (macu >> 16) & 0xff; + netdev->dev_addr[2] = (macu >> 8) & 0xff; + netdev->dev_addr[3] = macu & 0xff; + netdev->dev_addr[4] = (macl >> 8) & 0xff; + netdev->dev_addr[5] = macl & 0xff; + + if (!is_valid_ether_addr(&netdev->dev_addr[0])) + return -EINVAL; + + return 0; +} + +/** + * spider_net_get_descr_status -- returns the status of a descriptor + * @descr: descriptor to look at + * + * returns the status as in the dmac_cmd_status field of the descriptor + */ +static enum spider_net_descr_status +spider_net_get_descr_status(struct 
spider_net_descr *descr) +{ + u32 cmd_status; + rmb(); + cmd_status = descr->dmac_cmd_status; + rmb(); + cmd_status >>= SPIDER_NET_DESCR_IND_PROC_SHIFT; + /* no need to mask out any bits, as cmd_status is 32 bits wide only + * (and unsigned) */ + return cmd_status; +} + +/** + * spider_net_set_descr_status -- sets the status of a descriptor + * @descr: descriptor to change + * @status: status to set in the descriptor + * + * changes the status to the specified value. Doesn't change other bits + * in the status + */ +static void +spider_net_set_descr_status(struct spider_net_descr *descr, + enum spider_net_descr_status status) +{ + u32 cmd_status; + /* read the status */ + mb(); + cmd_status = descr->dmac_cmd_status; + /* clean the upper 4 bits */ + cmd_status &= SPIDER_NET_DESCR_IND_PROC_MASKO; + /* add the status to it */ + cmd_status |= ((u32)status)<<SPIDER_NET_DESCR_IND_PROC_SHIFT; + descr->dmac_cmd_status = cmd_status; + wmb(); +} + +/** + * spider_net_free_chain - free descriptor chain + * @card: card structure + * @chain: address of chain + * + */ +static void +spider_net_free_chain(struct spider_net_card *card, + struct spider_net_descr_chain *chain) +{ + struct spider_net_descr *descr; + + for (descr = chain->tail; !descr->bus_addr; descr = descr->next) { + pci_unmap_single(card->pdev, descr->bus_addr, + SPIDER_NET_DESCR_SIZE, PCI_DMA_BIDIRECTIONAL); + descr->bus_addr = 0; + } +} + +/** + * spider_net_init_chain - links descriptor chain + * @card: card structure + * @chain: address of chain + * @start_descr: address of descriptor array + * @no: number of descriptors + * + * we manage a circular list that mirrors the hardware structure, + * except that the hardware uses bus addresses.
+ * + * returns 0 on success, <0 on failure + */ +static int +spider_net_init_chain(struct spider_net_card *card, + struct spider_net_descr_chain *chain, + struct spider_net_descr *start_descr, int no) +{ + int i; + struct spider_net_descr *descr; + + spin_lock_init(&card->chain_lock); + + descr = start_descr; + memset(descr, 0, sizeof(*descr) * no); + + /* set up the hardware pointers in each descriptor */ + for (i=0; i<no; i++, descr++) { + descr->bus_addr = + pci_map_single(card->pdev, descr, + SPIDER_NET_DESCR_SIZE, + PCI_DMA_BIDIRECTIONAL); + + if (descr->bus_addr == DMA_ERROR_CODE) + goto iommu_error; + + descr->next = descr + 1; + descr->prev = descr - 1; + + } + /* do actual circular list */ + (descr-1)->next = start_descr; + start_descr->prev = descr-1; + + descr = start_descr; + for (i=0; i < no; i++, descr++) { + descr->next_descr_addr = descr->next->bus_addr; + } + + chain->head = start_descr; + chain->tail = start_descr; + + return 0; + +iommu_error: + descr = start_descr; + for (i=0; i < no; i++, descr++) + if (descr->bus_addr) + pci_unmap_single(card->pdev, descr->bus_addr, + SPIDER_NET_DESCR_SIZE, PCI_DMA_BIDIRECTIONAL); + return -ENOMEM; +} + +/** + * spider_net_free_rx_chain_contents - frees descr contents in rx chain + * @card: card structure + * + * returns 0 on success, <0 on failure + */ +static void +spider_net_free_rx_chain_contents(struct spider_net_card *card) +{ + struct spider_net_descr *descr; + + descr = card->rx_chain.head; + while (descr->next != card->rx_chain.head) { + if (descr->skb) { + dev_kfree_skb(descr->skb); + pci_unmap_single(card->pdev, descr->buf_addr, + SPIDER_NET_MAX_MTU, + PCI_DMA_BIDIRECTIONAL); + } + descr = descr->next; + } +} + +/** + * spider_net_prepare_rx_descr - reinitializes a rx descriptor + * @card: card structure + * @descr: descriptor to re-init + * + * return 0 on success, <0 on failure + * + * allocates a new rx skb, iommu-maps it and attaches it to the descriptor.
+ * Activate the descriptor state-wise + */ +static int +spider_net_prepare_rx_descr(struct spider_net_card *card, + struct spider_net_descr *descr) +{ + int error = 0; + int offset; + int bufsize; + + /* we need to round up the buffer size to a multiple of 128 */ + bufsize = (SPIDER_NET_MAX_MTU + SPIDER_NET_RXBUF_ALIGN - 1) & + (~(SPIDER_NET_RXBUF_ALIGN - 1)); + + /* and we need to have it 128 byte aligned, therefore we allocate a + * bit more */ + /* allocate an skb */ + descr->skb = dev_alloc_skb(bufsize + SPIDER_NET_RXBUF_ALIGN - 1); + if (!descr->skb) { + if (net_ratelimit()) + if (netif_msg_rx_err(card)) + pr_err("Not enough memory to allocate " + "rx buffer\n"); + return -ENOMEM; + } + descr->buf_size = bufsize; + descr->result_size = 0; + descr->valid_size = 0; + descr->data_status = 0; + descr->data_error = 0; + + offset = ((unsigned long)descr->skb->data) & + (SPIDER_NET_RXBUF_ALIGN - 1); + if (offset) + skb_reserve(descr->skb, SPIDER_NET_RXBUF_ALIGN - offset); + /* io-mmu-map the skb */ + descr->buf_addr = pci_map_single(card->pdev, descr->skb->data, + SPIDER_NET_MAX_MTU, + PCI_DMA_BIDIRECTIONAL); + if (descr->buf_addr == DMA_ERROR_CODE) { + dev_kfree_skb_any(descr->skb); + if (netif_msg_rx_err(card)) + pr_err("Could not iommu-map rx buffer\n"); + spider_net_set_descr_status(descr, SPIDER_NET_DESCR_NOT_IN_USE); + } else { + descr->dmac_cmd_status = SPIDER_NET_DMAC_RX_CARDOWNED; + } + + return error; +} + +/** + * spider_net_enable_rxchtails - sets RX dmac chain tail addresses + * @card: card structure + * + * spider_net_enable_rxchtails sets the RX DMAC chain tail addresses in the + * chip by writing to the appropriate register. DMA is enabled in + * spider_net_enable_rxdmac.
+ */ +static void +spider_net_enable_rxchtails(struct spider_net_card *card) +{ + /* assume chain is aligned correctly */ + spider_net_write_reg(card, SPIDER_NET_GDADCHA , + card->rx_chain.tail->bus_addr); +} + +/** + * spider_net_enable_rxdmac - enables a receive DMA controller + * @card: card structure + * + * spider_net_enable_rxdmac enables the DMA controller by setting RX_DMA_EN + * in the GDADMACCNTR register + */ +static void +spider_net_enable_rxdmac(struct spider_net_card *card) +{ + spider_net_write_reg(card, SPIDER_NET_GDADMACCNTR, + SPIDER_NET_DMA_RX_VALUE); +} + +/** + * spider_net_refill_rx_chain - refills descriptors/skbs in the rx chains + * @card: card structure + * + * refills descriptors in all chains (last used chain first): allocates skbs + * and iommu-maps them. + */ +static void +spider_net_refill_rx_chain(struct spider_net_card *card) +{ + struct spider_net_descr_chain *chain; + int count = 0; + unsigned long flags; + + chain = &card->rx_chain; + + spin_lock_irqsave(&card->chain_lock, flags); + while (spider_net_get_descr_status(chain->head) == + SPIDER_NET_DESCR_NOT_IN_USE) { + if (spider_net_prepare_rx_descr(card, chain->head)) + break; + count++; + chain->head = chain->head->next; + } + spin_unlock_irqrestore(&card->chain_lock, flags); + + /* could be optimized, only do that, if we know the DMA processing + * has terminated */ + if (count) + spider_net_enable_rxdmac(card); +} + +/** + * spider_net_alloc_rx_skbs - allocates rx skbs in rx descriptor chains + * @card: card structure + * + * returns 0 on success, <0 on failure + */ +static int +spider_net_alloc_rx_skbs(struct spider_net_card *card) +{ + int result; + struct spider_net_descr_chain *chain; + + result = -ENOMEM; + + chain = &card->rx_chain; + /* put at least one buffer into the chain. if this fails, + * we've got a problem. 
if not, spider_net_refill_rx_chain + * will do the rest at the end of this function */ + if (spider_net_prepare_rx_descr(card, chain->head)) + goto error; + else + chain->head = chain->head->next; + + /* this will allocate the rest of the rx buffers; if not, it's + * business as usual later on */ + spider_net_refill_rx_chain(card); + return 0; + +error: + spider_net_free_rx_chain_contents(card); + return result; +} + +/** + * spider_net_release_tx_descr - processes a used tx descriptor + * @card: card structure + * @descr: descriptor to release + * + * releases a used tx descriptor (unmapping, freeing of skb) + */ +static void +spider_net_release_tx_descr(struct spider_net_card *card, + struct spider_net_descr *descr) +{ + struct sk_buff *skb; + + /* unmap the skb */ + skb = descr->skb; + pci_unmap_single(card->pdev, descr->buf_addr, skb->len, + PCI_DMA_BIDIRECTIONAL); + + dev_kfree_skb_any(skb); + + /* set status to not used */ + spider_net_set_descr_status(descr, SPIDER_NET_DESCR_NOT_IN_USE); +} + +/** + * spider_net_release_tx_chain - processes sent tx descriptors + * @card: adapter structure + * @brutal: if set, don't care about whether descriptor seems to be in use + * + * releases the tx descriptors that spider has finished with (if non-brutal) + * or simply release tx descriptors (if brutal) + */ +static void +spider_net_release_tx_chain(struct spider_net_card *card, int brutal) +{ + struct spider_net_descr_chain *tx_chain = &card->tx_chain; + enum spider_net_descr_status status; + + spider_net_tx_irq_off(card); + + /* no lock for chain needed, if this is only executed once at a time */ +again: + for (;;) { + status = spider_net_get_descr_status(tx_chain->tail); + switch (status) { + case SPIDER_NET_DESCR_CARDOWNED: + if (!brutal) goto out; + /* fallthrough, if we release the descriptors + * brutally (then we don't care about + * SPIDER_NET_DESCR_CARDOWNED) */ + case SPIDER_NET_DESCR_RESPONSE_ERROR: + case SPIDER_NET_DESCR_PROTECTION_ERROR: + case 
SPIDER_NET_DESCR_FORCE_END: + if (netif_msg_tx_err(card)) + pr_err("%s: forcing end of tx descriptor " + "with status x%02x\n", + card->netdev->name, status); + card->netdev_stats.tx_dropped++; + break; + + case SPIDER_NET_DESCR_COMPLETE: + card->netdev_stats.tx_packets++; + card->netdev_stats.tx_bytes += + tx_chain->tail->skb->len; + break; + + default: /* any other value (== SPIDER_NET_DESCR_NOT_IN_USE) */ + goto out; + } + spider_net_release_tx_descr(card, tx_chain->tail); + tx_chain->tail = tx_chain->tail->next; + } +out: + netif_wake_queue(card->netdev); + + if (!brutal) { + /* switch on tx irqs (while we are still in the interrupt + * handler, so we don't get an interrupt), check again + * for done descriptors. This results in fewer interrupts */ + spider_net_tx_irq_on(card); + status = spider_net_get_descr_status(tx_chain->tail); + switch (status) { + case SPIDER_NET_DESCR_RESPONSE_ERROR: + case SPIDER_NET_DESCR_PROTECTION_ERROR: + case SPIDER_NET_DESCR_FORCE_END: + case SPIDER_NET_DESCR_COMPLETE: + goto again; + default: + break; + } + } + +} + +/** + * spider_net_get_multicast_hash - generates hash for multicast filter table + * @addr: multicast address + * + * returns the hash value. + * + * spider_net_get_multicast_hash calculates a hash value for a given multicast + * address, that is used to set the multicast filter tables + */ +static u8 +spider_net_get_multicast_hash(struct net_device *netdev, __u8 *addr) +{ + /* FIXME: an addr of 01:00:5e:00:00:01 must result in 0xa9, + * ff:ff:ff:ff:ff:ff must result in 0xfd */ + u32 crc; + u8 hash; + + crc = crc32_be(~0, addr, netdev->addr_len); + + hash = (crc >> 27); + hash <<= 3; + hash |= crc & 7; + + return hash; +} + +/** + * spider_net_set_multi - sets multicast addresses and promisc flags + * @netdev: interface device structure + * + * spider_net_set_multi configures multicast addresses as needed for the + * netdev interface. 
It also sets up multicast, allmulti and promisc + * flags appropriately + */ +static void +spider_net_set_multi(struct net_device *netdev) +{ + struct dev_mc_list *mc; + u8 hash; + int i; + u32 reg; + struct spider_net_card *card = netdev_priv(netdev); + unsigned long bitmask[SPIDER_NET_MULTICAST_HASHES / BITS_PER_LONG] = + {0, }; + + spider_net_set_promisc(card); + + if (netdev->flags & IFF_ALLMULTI) { + for (i = 0; i < SPIDER_NET_MULTICAST_HASHES; i++) { + set_bit(i, bitmask); + } + goto write_hash; + } + + /* well, we know, what the broadcast hash value is: it's 0xfd + hash = spider_net_get_multicast_hash(netdev, netdev->broadcast); */ + set_bit(0xfd, bitmask); + + for (mc = netdev->mc_list; mc; mc = mc->next) { + hash = spider_net_get_multicast_hash(netdev, mc->dmi_addr); + set_bit(hash, bitmask); + } + +write_hash: + for (i = 0; i < SPIDER_NET_MULTICAST_HASHES / 4; i++) { + reg = 0; + if (test_bit(i * 4, bitmask)) + reg += 0x08; + reg <<= 8; + if (test_bit(i * 4 + 1, bitmask)) + reg += 0x08; + reg <<= 8; + if (test_bit(i * 4 + 2, bitmask)) + reg += 0x08; + reg <<= 8; + if (test_bit(i * 4 + 3, bitmask)) + reg += 0x08; + + spider_net_write_reg(card, SPIDER_NET_GMRMHFILnR + i * 4, reg); + } +} + +/** + * spider_net_disable_rxdmac - disables the receive DMA controller + * @card: card structure + * + * spider_net_disable_rxdmac terminates processing on the DMA controller by + * turning off DMA and issuing a force end + */ +static void +spider_net_disable_rxdmac(struct spider_net_card *card) +{ + spider_net_write_reg(card, SPIDER_NET_GDADMACCNTR, + SPIDER_NET_DMA_RX_FEND_VALUE); +} + +/** + * spider_net_stop - called upon ifconfig down + * @netdev: interface device structure + * + * always returns 0 + */ +int +spider_net_stop(struct net_device *netdev) +{ + struct spider_net_card *card = netdev_priv(netdev); + + netif_poll_disable(netdev); + netif_carrier_off(netdev); + netif_stop_queue(netdev); + + /* disable/mask all interrupts */ + spider_net_write_reg(card,
SPIDER_NET_GHIINT0MSK, 0); + spider_net_write_reg(card, SPIDER_NET_GHIINT1MSK, 0); + spider_net_write_reg(card, SPIDER_NET_GHIINT2MSK, 0); + + spider_net_write_reg(card, SPIDER_NET_GDTDMACCNTR, + SPIDER_NET_DMA_TX_FEND_VALUE); + + /* turn off DMA, force end */ + spider_net_disable_rxdmac(card); + + /* release chains */ + spider_net_release_tx_chain(card, 1); + + /* switch off card */ + spider_net_write_reg(card, SPIDER_NET_CKRCTRL, + SPIDER_NET_CKRCTRL_STOP_VALUE); + + spider_net_free_chain(card, &card->tx_chain); + spider_net_free_chain(card, &card->rx_chain); + + return 0; +} + +/** + * spider_net_get_next_tx_descr - returns the next available tx descriptor + * @card: device structure to get descriptor from + * + * returns the address of the next descriptor, or NULL if not available. + */ +static struct spider_net_descr * +spider_net_get_next_tx_descr(struct spider_net_card *card) +{ + /* check, if head points to not-in-use descr */ + if ( spider_net_get_descr_status(card->tx_chain.head) == + SPIDER_NET_DESCR_NOT_IN_USE ) { + return card->tx_chain.head; + } else { + return NULL; + } +} + +/** + * spider_net_set_txdescr_cmdstat - sets the tx descriptor command field + * @descr: descriptor structure to fill out + * @skb: packet to consider + * + * fills out the command and status field of the descriptor structure, + * depending on hardware checksum settings. This function assumes a wmb() + * has executed before. + */ +static void +spider_net_set_txdescr_cmdstat(struct spider_net_descr *descr, + struct sk_buff *skb) +{ + if (skb->ip_summed != CHECKSUM_HW) { + descr->dmac_cmd_status = SPIDER_NET_DMAC_CMDSTAT_NOCS; + return; + } + + /* is packet ip? + * if yes: tcp? udp? 
*/ + if (skb->protocol == htons(ETH_P_IP)) { + if (skb->nh.iph->protocol == IPPROTO_TCP) { + descr->dmac_cmd_status = SPIDER_NET_DMAC_CMDSTAT_TCPCS; + } else if (skb->nh.iph->protocol == IPPROTO_UDP) { + descr->dmac_cmd_status = SPIDER_NET_DMAC_CMDSTAT_UDPCS; + } else { /* the stack should checksum non-tcp and non-udp + packets on his own: NETIF_F_IP_CSUM */ + descr->dmac_cmd_status = SPIDER_NET_DMAC_CMDSTAT_NOCS; + } + } +} + +/** + * spider_net_prepare_tx_descr - fill tx descriptor with skb data + * @card: card structure + * @descr: descriptor structure to fill out + * @skb: packet to use + * + * returns 0 on success, <0 on failure. + * + * fills out the descriptor structure with skb data and len. Copies data, + * if needed (32bit DMA!) + */ +static int +spider_net_prepare_tx_descr(struct spider_net_card *card, + struct spider_net_descr *descr, + struct sk_buff *skb) +{ + descr->buf_addr = pci_map_single(card->pdev, skb->data, + skb->len, PCI_DMA_BIDIRECTIONAL); + if (descr->buf_addr == DMA_ERROR_CODE) { + if (netif_msg_tx_err(card)) + pr_err("could not iommu-map packet (%p, %i). " + "Dropping packet\n", skb->data, skb->len); + return -ENOMEM; + } + + descr->buf_size = skb->len; + descr->skb = skb; + descr->data_status = 0; + + /* make sure the above values are in memory before we change the + * status */ + wmb(); + + spider_net_set_txdescr_cmdstat(descr,skb); + + return 0; +} + +/** + * spider_net_kick_tx_dma - enables TX DMA processing + * @card: card structure + * @descr: descriptor address to enable TX processing at + * + * spider_net_kick_tx_dma writes the current tx chain head as start address + * of the tx descriptor chain and enables the transmission DMA engine + */ +static void +spider_net_kick_tx_dma(struct spider_net_card *card, + struct spider_net_descr *descr) +{ + /* this is the only descriptor in the output chain. 
+ * Enable TX DMA */ + + spider_net_write_reg(card, SPIDER_NET_GDTDCHA, + descr->bus_addr); + + spider_net_write_reg(card, SPIDER_NET_GDTDMACCNTR, + SPIDER_NET_DMA_TX_VALUE); +} + +/** + * spider_net_xmit - transmits a frame over the device + * @skb: packet to send out + * @netdev: interface device structure + * + * returns 0 on success, <0 on failure + */ +static int +spider_net_xmit(struct sk_buff *skb, struct net_device *netdev) +{ + struct spider_net_card *card = netdev_priv(netdev); + struct spider_net_descr *descr; + int result; + + descr = spider_net_get_next_tx_descr(card); + + if (!descr) { + netif_stop_queue(netdev); + + descr = spider_net_get_next_tx_descr(card); + if (!descr) + goto error; + else + netif_start_queue(netdev); + } + + result = spider_net_prepare_tx_descr(card, descr, skb); + if (result) + goto error; + + card->tx_chain.head = card->tx_chain.head->next; + + /* make sure the status from spider_net_prepare_tx_descr is in + * memory before we check out the previous descriptor */ + wmb(); + + if (spider_net_get_descr_status(descr->prev) != + SPIDER_NET_DESCR_CARDOWNED) + spider_net_kick_tx_dma(card, descr); + + return NETDEV_TX_OK; + +error: + card->netdev_stats.tx_dropped++; + return NETDEV_TX_LOCKED; +} + +/** + * spider_net_do_ioctl - called for device ioctls + * @netdev: interface device structure + * @ifr: request parameter structure for ioctl + * @cmd: command code for ioctl + * + * returns 0 on success, <0 on failure. Currently, we have no special ioctls. 
+ * -EOPNOTSUPP is returned, if an unknown ioctl was requested + */ +static int +spider_net_do_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd) +{ + switch (cmd) { + default: + return -EOPNOTSUPP; + } +} + +/** + * spider_net_pass_skb_up - takes an skb from a descriptor and passes it on + * @descr: descriptor to process + * @card: card structure + * + * returns 1 on success, 0 if no packet was passed to the stack + * + * iommu-unmaps the skb, fills out skb structure and passes the data to the + * stack. The descriptor state is not changed. + */ +static int +spider_net_pass_skb_up(struct spider_net_descr *descr, + struct spider_net_card *card) +{ + struct sk_buff *skb; + struct net_device *netdev; + u32 data_status, data_error; + + data_status = descr->data_status; + data_error = descr->data_error; + + netdev = card->netdev; + + /* check for errors in the data_error flag */ + if ((data_error & SPIDER_NET_DATA_ERROR_MASK) && + netif_msg_rx_err(card)) + pr_err("error in received descriptor found, " + "data_status=x%08x, data_error=x%08x\n", + data_status, data_error); + + /* prepare skb, unmap descriptor */ + skb = descr->skb; + pci_unmap_single(card->pdev, descr->buf_addr, SPIDER_NET_MAX_MTU, + PCI_DMA_BIDIRECTIONAL); + + /* the cases we'll throw away the packet immediately */ + if (data_error & SPIDER_NET_DESTROY_RX_FLAGS) + return 0; + + skb->dev = netdev; + skb_put(skb, descr->valid_size); + + /* the card seems to add 2 bytes of junk in front + * of the ethernet frame */ +#define SPIDER_MISALIGN 2 + skb_pull(skb, SPIDER_MISALIGN); + skb->protocol = eth_type_trans(skb, netdev); + + /* checksum offload */ + if (card->options.rx_csum) { + if ( (data_status & SPIDER_NET_DATA_STATUS_CHK_MASK) && + (!(data_error & SPIDER_NET_DATA_ERROR_CHK_MASK)) ) + skb->ip_summed = CHECKSUM_UNNECESSARY; + else + skb->ip_summed = CHECKSUM_NONE; + } else { + skb->ip_summed = CHECKSUM_NONE; + } + + if (data_status & SPIDER_NET_VLAN_PACKET) { + /* further enhancements: 
HW-accel VLAN + * vlan_hwaccel_receive_skb + */ + } + + /* pass skb up to stack */ + netif_receive_skb(skb); + + /* update netdevice statistics */ + card->netdev_stats.rx_packets++; + card->netdev_stats.rx_bytes += skb->len; + + return 1; +} + +/** + * spider_net_decode_descr - processes an rx descriptor + * @card: card structure + * + * returns 1 if a packet has been sent to the stack, otherwise 0 + * + * processes an rx descriptor by iommu-unmapping the data buffer and passing + * the packet up to the stack + */ +static int +spider_net_decode_one_descr(struct spider_net_card *card) +{ + enum spider_net_descr_status status; + struct spider_net_descr *descr; + struct spider_net_descr_chain *chain; + int result; + + chain = &card->rx_chain; + descr = chain->tail; + + status = spider_net_get_descr_status(descr); + + if (status == SPIDER_NET_DESCR_CARDOWNED) { + /* nothing in the descriptor yet */ + return 0; + } + + if (status == SPIDER_NET_DESCR_NOT_IN_USE) { + /* not initialized yet, I bet chain->tail == chain->head + * and the ring is empty */ + spider_net_refill_rx_chain(card); + return 0; + } + + /* descriptor definitively used -- move on head */ + chain->tail = descr->next; + + result = 0; + if ( (status == SPIDER_NET_DESCR_RESPONSE_ERROR) || + (status == SPIDER_NET_DESCR_PROTECTION_ERROR) || + (status == SPIDER_NET_DESCR_FORCE_END) ) { + if (netif_msg_rx_err(card)) + pr_err("%s: dropping RX descriptor with state %d\n", + card->netdev->name, status); + card->netdev_stats.rx_dropped++; + goto refill; + } + + if ( (status != SPIDER_NET_DESCR_COMPLETE) && + (status != SPIDER_NET_DESCR_FRAME_END) ) { + if (netif_msg_rx_err(card)) + pr_err("%s: RX descriptor with state %d\n", + card->netdev->name, status); + goto refill; + } + + /* ok, we've got a packet in descr */ + result = spider_net_pass_skb_up(descr, card); +refill: + spider_net_set_descr_status(descr, SPIDER_NET_DESCR_NOT_IN_USE); + /* change the descriptor state: */ + spider_net_refill_rx_chain(card); + + 
return result; +} + +/** + * spider_net_poll - NAPI poll function called by the stack to return packets + * @netdev: interface device structure + * @budget: number of packets we can pass to the stack at most + * + * returns 0 if no more packets available to the driver/stack. Returns 1, + * if the quota is exceeded, but the driver has still packets. + * + * spider_net_poll returns all packets from the rx descriptors to the stack + * (using netif_receive_skb). If all/enough packets are up, the driver + * reenables interrupts and returns 0. If not, 1 is returned. + */ +static int +spider_net_poll(struct net_device *netdev, int *budget) +{ + struct spider_net_card *card = netdev_priv(netdev); + int packets_to_do, packets_done = 0; + int no_more_packets = 0; + + packets_to_do = min(*budget, netdev->quota); + + while (packets_to_do) { + if (spider_net_decode_one_descr(card)) { + packets_done++; + packets_to_do--; + } else { + /* no more packets for the stack */ + no_more_packets = 1; + break; + } + } + + netdev->quota -= packets_done; + *budget -= packets_done; + + /* if all packets are in the stack, enable interrupts and return 0 */ + /* if not, return 1 */ + if (no_more_packets) { + netif_rx_complete(netdev); + spider_net_rx_irq_on(card); + return 0; + } + + return 1; +} + +/** + * spider_net_vlan_rx_reg - initializes VLAN structures in the driver and card + * @netdev: interface device structure + * @grp: vlan_group structure that is registered (NULL on destroying interface) + */ +static void +spider_net_vlan_rx_reg(struct net_device *netdev, struct vlan_group *grp) +{ + /* further enhancement... yet to do */ + return; +} + +/** + * spider_net_vlan_rx_add - adds VLAN id to the card filter + * @netdev: interface device structure + * @vid: VLAN id to add + */ +static void +spider_net_vlan_rx_add(struct net_device *netdev, uint16_t vid) +{ + /* further enhancement... 
yet to do */ + /* add vid to card's VLAN filter table */ + return; +} + +/** + * spider_net_vlan_rx_kill - removes VLAN id from the card filter + * @netdev: interface device structure + * @vid: VLAN id to remove + */ +static void +spider_net_vlan_rx_kill(struct net_device *netdev, uint16_t vid) +{ + /* further enhancement... yet to do */ + /* remove vid from card's VLAN filter table */ +} + +/** + * spider_net_get_stats - get interface statistics + * @netdev: interface device structure + * + * returns the interface statistics residing in the spider_net_card struct + */ +static struct net_device_stats * +spider_net_get_stats(struct net_device *netdev) +{ + struct spider_net_card *card = netdev_priv(netdev); + struct net_device_stats *stats = &card->netdev_stats; + return stats; +} + +/** + * spider_net_change_mtu - changes the MTU of an interface + * @netdev: interface device structure + * @new_mtu: new MTU value + * + * returns 0 on success, <0 on failure + */ +static int +spider_net_change_mtu(struct net_device *netdev, int new_mtu) +{ + /* no need to re-alloc skbs or so -- the max mtu is about 2.3k + * and mtu is outbound only anyway */ + if ( (new_mtu < SPIDER_NET_MIN_MTU ) || + (new_mtu > SPIDER_NET_MAX_MTU) ) + return -EINVAL; + netdev->mtu = new_mtu; + return 0; +} + +/** + * spider_net_set_mac - sets the MAC of an interface + * @netdev: interface device structure + * @p: pointer to new MAC address + * + * Returns 0 on success, <0 on failure. The address can only be changed + * while the interface is down; -EBUSY is returned otherwise.
+ */ +static int +spider_net_set_mac(struct net_device *netdev, void *p) +{ + struct spider_net_card *card = netdev_priv(netdev); + u32 macl, macu; + struct sockaddr *addr = p; + + /* GMACTPE and GMACRPE must be off, so we only allow this, if + * the device is down */ + if (netdev->flags & IFF_UP) + return -EBUSY; + + if (!is_valid_ether_addr(addr->sa_data)) + return -EADDRNOTAVAIL; + + macu = (addr->sa_data[0]<<24) + (addr->sa_data[1]<<16) + + (addr->sa_data[2]<<8) + (addr->sa_data[3]); + macl = (addr->sa_data[4]<<8) + (addr->sa_data[5]); + spider_net_write_reg(card, SPIDER_NET_GMACUNIMACU, macu); + spider_net_write_reg(card, SPIDER_NET_GMACUNIMACL, macl); + + spider_net_set_promisc(card); + + /* look up, whether we have been successful */ + if (spider_net_get_mac_address(netdev)) + return -EADDRNOTAVAIL; + if (memcmp(netdev->dev_addr,addr->sa_data,netdev->addr_len)) + return -EADDRNOTAVAIL; + + return 0; +} + +/** + * spider_net_enable_txdmac - enables a TX DMA controller + * @card: card structure + * + * spider_net_enable_txdmac enables the TX DMA controller by setting the + * descriptor chain tail address + */ +static void +spider_net_enable_txdmac(struct spider_net_card *card) +{ + /* assume chain is aligned correctly */ + spider_net_write_reg(card, SPIDER_NET_GDTDCHA, + card->tx_chain.tail->bus_addr); +} + +/** + * spider_net_handle_error_irq - handles errors raised by an interrupt + * @card: card structure + * @status_reg: interrupt status register 0 (GHIINT0STS) + * + * spider_net_handle_error_irq treats or ignores all error conditions + * found when an interrupt is presented + */ +static void +spider_net_handle_error_irq(struct spider_net_card *card, u32 status_reg) +{ + u32 error_reg1, error_reg2; + u32 i; + int show_error = 1; + + error_reg1 = spider_net_read_reg(card, SPIDER_NET_GHIINT1STS); + error_reg2 = spider_net_read_reg(card, SPIDER_NET_GHIINT2STS); + + /* check GHIINT0STS ************************************/ + if (status_reg) + for (i = 0; i < 
32; i++) + if (status_reg & (1<tx_chain.tail == card->tx_chain.head) + spider_net_kick_tx_dma(card); + show_error = 0; */ + break; + + /* case SPIDER_NET_G1TMCNTINT: not used. print a message */ + /* case SPIDER_NET_GFREECNTINT: not used. print a message */ + } + + /* check GHIINT1STS ************************************/ + if (error_reg1) + for (i = 0; i < 32; i++) + if (error_reg1 & (1<netdev); + spider_net_enable_rxchtails(card); + spider_net_enable_rxdmac(card); + break; + + /* case SPIDER_NET_GTMSHTINT: problem, print a message */ + case SPIDER_NET_GDTINVDINT: + /* allrighty. tx from previous descr ok */ + show_error = 0; + break; + /* case SPIDER_NET_GRFDFLLINT: print a message down there */ + /* case SPIDER_NET_GRFCFLLINT: print a message down there */ + /* case SPIDER_NET_GRFBFLLINT: print a message down there */ + /* case SPIDER_NET_GRFAFLLINT: print a message down there */ + + /* chain end */ + case SPIDER_NET_GDDDCEINT: /* fallthrough */ + case SPIDER_NET_GDCDCEINT: /* fallthrough */ + case SPIDER_NET_GDBDCEINT: /* fallthrough */ + case SPIDER_NET_GDADCEINT: + if (netif_msg_intr(card)) + pr_err("got descriptor chain end interrupt, " + "restarting DMAC %c.\n", + 'D'+i-SPIDER_NET_GDDDCEINT); + spider_net_refill_rx_chain(card); + show_error = 0; + break; + + /* invalid descriptor */ + case SPIDER_NET_GDDINVDINT: /* fallthrough */ + case SPIDER_NET_GDCINVDINT: /* fallthrough */ + case SPIDER_NET_GDBINVDINT: /* fallthrough */ + case SPIDER_NET_GDAINVDINT: + /* could happen when rx chain is full */ + spider_net_refill_rx_chain(card); + show_error = 0; + break; + + /* case SPIDER_NET_GDTRSERINT: problem, print a message */ + /* case SPIDER_NET_GDDRSERINT: problem, print a message */ + /* case SPIDER_NET_GDCRSERINT: problem, print a message */ + /* case SPIDER_NET_GDBRSERINT: problem, print a message */ + /* case SPIDER_NET_GDARSERINT: problem, print a message */ + /* case SPIDER_NET_GDSERINT: problem, print a message */ + /* case SPIDER_NET_GDTPTERINT: problem, 
print a message */ + /* case SPIDER_NET_GDDPTERINT: problem, print a message */ + /* case SPIDER_NET_GDCPTERINT: problem, print a message */ + /* case SPIDER_NET_GDBPTERINT: problem, print a message */ + /* case SPIDER_NET_GDAPTERINT: problem, print a message */ + default: + show_error = 1; + break; + } + + /* check GHIINT2STS ************************************/ + if (error_reg2) + for (i = 0; i < 32; i++) + if (error_reg2 & (1<irq); + spider_net_interrupt(netdev->irq, netdev, NULL); + enable_irq(netdev->irq); +} +#endif /* CONFIG_NET_POLL_CONTROLLER */ + +/** + * spider_net_init_card - initializes the card + * @card: card structure + * + * spider_net_init_card initializes the card so that other registers can + * be used + */ +static void +spider_net_init_card(struct spider_net_card *card) +{ + spider_net_write_reg(card, SPIDER_NET_CKRCTRL, + SPIDER_NET_CKRCTRL_STOP_VALUE); + + spider_net_write_reg(card, SPIDER_NET_CKRCTRL, + SPIDER_NET_CKRCTRL_RUN_VALUE); +} + +/** + * spider_net_enable_card - enables the card by setting all kinds of regs + * @card: card structure + * + * spider_net_enable_card sets a lot of SMMIO registers to enable the device + */ +static void +spider_net_enable_card(struct spider_net_card *card) +{ + int i; + /* the following array consists of (register),(value) pairs + * that are set in this function. 
A register of 0 ends the list */ + u32 regs[][2] = { + { SPIDER_NET_GRESUMINTNUM, 0 }, + { SPIDER_NET_GREINTNUM, 0 }, + + /* set interrupt frame number registers */ + /* clear the single DMA engine registers first */ + { SPIDER_NET_GFAFRMNUM, SPIDER_NET_GFXFRAMES_VALUE }, + { SPIDER_NET_GFBFRMNUM, SPIDER_NET_GFXFRAMES_VALUE }, + { SPIDER_NET_GFCFRMNUM, SPIDER_NET_GFXFRAMES_VALUE }, + { SPIDER_NET_GFDFRMNUM, SPIDER_NET_GFXFRAMES_VALUE }, + /* then set, what we really need */ + { SPIDER_NET_GFFRMNUM, SPIDER_NET_FRAMENUM_VALUE }, + + /* timer counter registers and stuff */ + { SPIDER_NET_GFREECNNUM, 0 }, + { SPIDER_NET_GONETIMENUM, 0 }, + { SPIDER_NET_GTOUTFRMNUM, 0 }, + + /* RX mode setting */ + { SPIDER_NET_GRXMDSET, SPIDER_NET_RXMODE_VALUE }, + /* TX mode setting */ + { SPIDER_NET_GTXMDSET, SPIDER_NET_TXMODE_VALUE }, + /* IPSEC mode setting */ + { SPIDER_NET_GIPSECINIT, SPIDER_NET_IPSECINIT_VALUE }, + + { SPIDER_NET_GFTRESTRT, SPIDER_NET_RESTART_VALUE }, + + { SPIDER_NET_GMRWOLCTRL, 0 }, + { SPIDER_NET_GTESTMD, 0 }, + + { SPIDER_NET_GMACINTEN, 0 }, + + /* flow control stuff */ + { SPIDER_NET_GMACAPAUSE, SPIDER_NET_MACAPAUSE_VALUE }, + { SPIDER_NET_GMACTXPAUSE, SPIDER_NET_TXPAUSE_VALUE }, + + { SPIDER_NET_GMACBSTLMT, SPIDER_NET_BURSTLMT_VALUE }, + { 0, 0} + }; + + i = 0; + while (regs[i][0]) { + spider_net_write_reg(card, regs[i][0], regs[i][1]); + i++; + } + + /* clear unicast filter table entries 1 to 14 */ + for (i = 1; i <= 14; i++) { + spider_net_write_reg(card, + SPIDER_NET_GMRUAFILnR + i * 8, + 0x00080000); + spider_net_write_reg(card, + SPIDER_NET_GMRUAFILnR + i * 8 + 4, + 0x00000000); + } + + spider_net_write_reg(card, SPIDER_NET_GMRUA0FIL15R, 0x08080000); + + spider_net_write_reg(card, SPIDER_NET_ECMODE, SPIDER_NET_ECMODE_VALUE); + + /* set chain tail adress for RX chains and + * enable DMA */ + spider_net_enable_rxchtails(card); + spider_net_enable_rxdmac(card); + + spider_net_write_reg(card, SPIDER_NET_GRXDMAEN, SPIDER_NET_WOL_VALUE); + + /* set chain 
tail adress for TX chain */ + spider_net_enable_txdmac(card); + + spider_net_write_reg(card, SPIDER_NET_GMACLENLMT, + SPIDER_NET_LENLMT_VALUE); + spider_net_write_reg(card, SPIDER_NET_GMACMODE, + SPIDER_NET_MACMODE_VALUE); + spider_net_write_reg(card, SPIDER_NET_GMACOPEMD, + SPIDER_NET_OPMODE_VALUE); + + /* set interrupt mask registers */ + spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, + SPIDER_NET_INT0_MASK_VALUE); + spider_net_write_reg(card, SPIDER_NET_GHIINT1MSK, + SPIDER_NET_INT1_MASK_VALUE); + spider_net_write_reg(card, SPIDER_NET_GHIINT2MSK, + SPIDER_NET_INT2_MASK_VALUE); +} + +/** + * spider_net_open - called upon ifonfig up + * @netdev: interface device structure + * + * returns 0 on success, <0 on failure + * + * spider_net_open allocates all the descriptors and memory needed for + * operation, sets up multicast list and enables interrupts + */ +int +spider_net_open(struct net_device *netdev) +{ + struct spider_net_card *card = netdev_priv(netdev); + int result; + + result = -ENOMEM; + if (spider_net_init_chain(card, &card->tx_chain, + card->descr, tx_descriptors)) + goto alloc_tx_failed; + if (spider_net_init_chain(card, &card->rx_chain, + card->descr + tx_descriptors, rx_descriptors)) + goto alloc_rx_failed; + + /* allocate rx skbs */ + if (spider_net_alloc_rx_skbs(card)) + goto alloc_skbs_failed; + + spider_net_set_multi(netdev); + + /* further enhancement: setup hw vlan, if needed */ + + result = -EBUSY; + if (request_irq(netdev->irq, spider_net_interrupt, + SA_SHIRQ, netdev->name, netdev)) + goto register_int_failed; + + spider_net_enable_card(card); + + return 0; + +register_int_failed: + spider_net_free_rx_chain_contents(card); +alloc_skbs_failed: + spider_net_free_chain(card, &card->rx_chain); +alloc_rx_failed: + spider_net_free_chain(card, &card->tx_chain); +alloc_tx_failed: + return result; +} + +/** + * spider_net_setup_phy - setup PHY + * @card: card structure + * + * returns 0 on success, <0 on failure + * + * spider_net_setup_phy is used 
as part of spider_net_probe. Sets + * the PHY to 1000 Mbps + **/ +static int +spider_net_setup_phy(struct spider_net_card *card) +{ + struct mii_phy *phy = &card->phy; + + spider_net_write_reg(card, SPIDER_NET_GDTDMASEL, + SPIDER_NET_DMASEL_VALUE); + spider_net_write_reg(card, SPIDER_NET_GPCCTRL, + SPIDER_NET_PHY_CTRL_VALUE); + phy->mii_id = 1; + phy->dev = card->netdev; + phy->mdio_read = spider_net_read_phy; + phy->mdio_write = spider_net_write_phy; + + mii_phy_probe(phy, phy->mii_id); + + if (phy->def->ops->setup_forced) + phy->def->ops->setup_forced(phy, SPEED_1000, DUPLEX_FULL); + + /* the following two writes could be moved to sungem_phy.c */ + /* enable fiber mode */ + spider_net_write_phy(card->netdev, 1, MII_NCONFIG, 0x9020); + /* LEDs active in both modes, autosense prio = fiber */ + spider_net_write_phy(card->netdev, 1, MII_NCONFIG, 0x945f); + + phy->def->ops->read_link(phy); + pr_info("Found %s with %i Mbps, %s-duplex.\n", phy->def->name, + phy->speed, phy->duplex==1 ? "Full" : "Half"); + + return 0; +} + +/** + * spider_net_download_firmware - loads firmware into the adapter + * @card: card structure + * @firmware: firmware pointer + * + * spider_net_download_firmware loads the firmware opened by + * spider_net_init_firmware into the adapter. 
+ */ +static void +spider_net_download_firmware(struct spider_net_card *card, + const struct firmware *firmware) +{ + int sequencer, i; + u32 *fw_ptr = (u32 *)firmware->data; + + /* stop sequencers */ + spider_net_write_reg(card, SPIDER_NET_GSINIT, + SPIDER_NET_STOP_SEQ_VALUE); + + for (sequencer = 0; sequencer < 6; sequencer++) { + spider_net_write_reg(card, + SPIDER_NET_GSnPRGADR + sequencer * 8, 0); + for (i = 0; i < SPIDER_NET_FIRMWARE_LEN; i++) { + spider_net_write_reg(card, SPIDER_NET_GSnPRGDAT + + sequencer * 8, *fw_ptr); + fw_ptr++; + } + } + + spider_net_write_reg(card, SPIDER_NET_GSINIT, + SPIDER_NET_RUN_SEQ_VALUE); +} + +/** + * spider_net_init_firmware - reads in firmware parts + * @card: card structure + * + * Returns 0 on success, <0 on failure + * + * spider_net_init_firmware opens the sequencer firmware and does some basic + * checks. This function opens and releases the firmware structure. A call + * to download the firmware is performed before the release. + * + * Firmware format + * =============== + * spider_fw.bin is expected to be a file containing 6*1024*4 bytes, 4k being + * the program for each sequencer. Use the command + * tail -q -n +2 Seq_code1_0x088.txt Seq_code2_0x090.txt \ + * Seq_code3_0x098.txt Seq_code4_0x0A0.txt Seq_code5_0x0A8.txt \ + * Seq_code6_0x0B0.txt | xxd -r -p -c4 > spider_fw.bin + * + * to generate spider_fw.bin, if you have sequencer programs with something + * like the following contents for each sequencer: + * + * + * + * ... 
+ * <1024th 4-BYTES-WORD FOR SEQUENCER> + */ +static int +spider_net_init_firmware(struct spider_net_card *card) +{ + const struct firmware *firmware; + int err = -EIO; + + if (request_firmware(&firmware, + SPIDER_NET_FIRMWARE_NAME, &card->pdev->dev) < 0) { + if (netif_msg_probe(card)) + pr_err("Couldn't read in sequencer data file %s.\n", + SPIDER_NET_FIRMWARE_NAME); + firmware = NULL; + goto out; + } + + if (firmware->size != 6 * SPIDER_NET_FIRMWARE_LEN * sizeof(u32)) { + if (netif_msg_probe(card)) + pr_err("Invalid size of sequencer data file %s.\n", + SPIDER_NET_FIRMWARE_NAME); + goto out; + } + + spider_net_download_firmware(card, firmware); + + err = 0; +out: + release_firmware(firmware); + + return err; +} + +/** + * spider_net_workaround_rxramfull - work around firmware bug + * @card: card structure + * + * no return value + **/ +static void +spider_net_workaround_rxramfull(struct spider_net_card *card) +{ + int i, sequencer = 0; + + /* cancel reset */ + spider_net_write_reg(card, SPIDER_NET_CKRCTRL, + SPIDER_NET_CKRCTRL_RUN_VALUE); + + /* empty sequencer data */ + for (sequencer = 0; sequencer < 6; sequencer++) { + spider_net_write_reg(card, SPIDER_NET_GSnPRGDAT + + sequencer * 8, 0x0); + for (i = 0; i < SPIDER_NET_FIRMWARE_LEN; i++) { + spider_net_write_reg(card, SPIDER_NET_GSnPRGDAT + + sequencer * 8, 0x0); + } + } + + /* set sequencer operation */ + spider_net_write_reg(card, SPIDER_NET_GSINIT, 0x000000fe); + + /* reset */ + spider_net_write_reg(card, SPIDER_NET_CKRCTRL, + SPIDER_NET_CKRCTRL_STOP_VALUE); +} + +/** + * spider_net_tx_timeout_task - task scheduled by the watchdog timeout + * function (to be called not under interrupt status) + * @data: data, is interface device structure + * + * called as task when tx hangs, resets interface (if interface is up) + */ +static void +spider_net_tx_timeout_task(void *data) +{ + struct net_device *netdev = data; + struct spider_net_card *card = netdev_priv(netdev); + + if (!(netdev->flags & IFF_UP)) + goto out; 
+ + netif_device_detach(netdev); + spider_net_stop(netdev); + + spider_net_workaround_rxramfull(card); + spider_net_init_card(card); + + if (spider_net_setup_phy(card)) + goto out; + if (spider_net_init_firmware(card)) + goto out; + + spider_net_open(netdev); + spider_net_kick_tx_dma(card, card->tx_chain.head); + netif_device_attach(netdev); + +out: + atomic_dec(&card->tx_timeout_task_counter); +} + +/** + * spider_net_tx_timeout - called when the tx timeout watchdog kicks in. + * @netdev: interface device structure + * + * called, if tx hangs. Schedules a task that resets the interface + */ +static void +spider_net_tx_timeout(struct net_device *netdev) +{ + struct spider_net_card *card; + + card = netdev_priv(netdev); + atomic_inc(&card->tx_timeout_task_counter); + if (netdev->flags & IFF_UP) + schedule_work(&card->tx_timeout_task); + else + atomic_dec(&card->tx_timeout_task_counter); +} + +/** + * spider_net_setup_netdev_ops - initialization of net_device operations + * @netdev: net_device structure + * + * fills out function pointers in the net_device structure + */ +static void +spider_net_setup_netdev_ops(struct net_device *netdev) +{ + netdev->open = &spider_net_open; + netdev->stop = &spider_net_stop; + netdev->hard_start_xmit = &spider_net_xmit; + netdev->get_stats = &spider_net_get_stats; + netdev->set_multicast_list = &spider_net_set_multi; + netdev->set_mac_address = &spider_net_set_mac; + netdev->change_mtu = &spider_net_change_mtu; + netdev->do_ioctl = &spider_net_do_ioctl; + /* tx watchdog */ + netdev->tx_timeout = &spider_net_tx_timeout; + netdev->watchdog_timeo = SPIDER_NET_WATCHDOG_TIMEOUT; + /* NAPI */ + netdev->poll = &spider_net_poll; + netdev->weight = SPIDER_NET_NAPI_WEIGHT; + /* HW VLAN */ + netdev->vlan_rx_register = &spider_net_vlan_rx_reg; + netdev->vlan_rx_add_vid = &spider_net_vlan_rx_add; + netdev->vlan_rx_kill_vid = &spider_net_vlan_rx_kill; +#ifdef CONFIG_NET_POLL_CONTROLLER + /* poll controller */ + netdev->poll_controller = 
&spider_net_poll_controller; +#endif /* CONFIG_NET_POLL_CONTROLLER */ + /* ethtool ops */ + netdev->ethtool_ops = &spider_net_ethtool_ops; +} + +/** + * spider_net_setup_netdev - initialization of net_device + * @card: card structure + * + * Returns 0 on success or <0 on failure + * + * spider_net_setup_netdev initializes the net_device structure + **/ +static int +spider_net_setup_netdev(struct spider_net_card *card) +{ + int result; + struct net_device *netdev = card->netdev; + struct device_node *dn; + struct sockaddr addr; + u8 *mac; + + SET_MODULE_OWNER(netdev); + SET_NETDEV_DEV(netdev, &card->pdev->dev); + + pci_set_drvdata(card->pdev, netdev); + spin_lock_init(&card->intmask_lock); + netdev->irq = card->pdev->irq; + + card->options.rx_csum = SPIDER_NET_RX_CSUM_DEFAULT; + + spider_net_setup_netdev_ops(netdev); + + netdev->features = 0; + /* some time: NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX | + * NETIF_F_HW_VLAN_FILTER */ + + netdev->irq = card->pdev->irq; + + dn = pci_device_to_OF_node(card->pdev); + mac = (u8 *)get_property(dn, "local-mac-address", NULL); + memcpy(addr.sa_data, mac, ETH_ALEN); + + result = spider_net_set_mac(netdev, &addr); + if ((result) && (netif_msg_probe(card))) + pr_err("Failed to set MAC address: %i\n", result); + + result = register_netdev(netdev); + if (result) { + if (netif_msg_probe(card)) + pr_err("Couldn't register net_device: %i\n", + result); + return result; + } + + if (netif_msg_probe(card)) + pr_info("Initialized device %s.\n", netdev->name); + + return 0; +} + +/** + * spider_net_alloc_card - allocates net_device and card structure + * + * returns the card structure or NULL in case of errors + * + * the card and net_device structures are linked to each other + */ +static struct spider_net_card * +spider_net_alloc_card(void) +{ + struct net_device *netdev; + struct spider_net_card *card; + size_t alloc_size; + + alloc_size = sizeof (*card) + + sizeof (struct spider_net_descr) * rx_descriptors + + sizeof (struct 
spider_net_descr) * tx_descriptors; + netdev = alloc_etherdev(alloc_size); + if (!netdev) + return NULL; + + card = netdev_priv(netdev); + card->netdev = netdev; + card->msg_enable = SPIDER_NET_DEFAULT_MSG; + INIT_WORK(&card->tx_timeout_task, spider_net_tx_timeout_task, netdev); + init_waitqueue_head(&card->waitq); + atomic_set(&card->tx_timeout_task_counter, 0); + + return card; +} + +/** + * spider_net_undo_pci_setup - releases PCI ressources + * @card: card structure + * + * spider_net_undo_pci_setup releases the mapped regions + */ +static void +spider_net_undo_pci_setup(struct spider_net_card *card) +{ + iounmap(card->regs); + pci_release_regions(card->pdev); +} + +/** + * spider_net_setup_pci_dev - sets up the device in terms of PCI operations + * @card: card structure + * @pdev: PCI device + * + * Returns the card structure or NULL if any errors occur + * + * spider_net_setup_pci_dev initializes pdev and together with the + * functions called in spider_net_open configures the device so that + * data can be transferred over it + * The net_device structure is attached to the card structure, if the + * function returns without error. 
+ **/ +static struct spider_net_card * +spider_net_setup_pci_dev(struct pci_dev *pdev) +{ + struct spider_net_card *card; + unsigned long mmio_start, mmio_len; + + if (pci_enable_device(pdev)) { + pr_err("Couldn't enable PCI device\n"); + return NULL; + } + + if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM)) { + pr_err("Couldn't find proper PCI device base address.\n"); + goto out_disable_dev; + } + + if (pci_request_regions(pdev, spider_net_driver_name)) { + pr_err("Couldn't obtain PCI resources, aborting.\n"); + goto out_disable_dev; + } + + pci_set_master(pdev); + + card = spider_net_alloc_card(); + if (!card) { + pr_err("Couldn't allocate net_device structure, " + "aborting.\n"); + goto out_release_regions; + } + card->pdev = pdev; + + /* fetch base address and length of first resource */ + mmio_start = pci_resource_start(pdev, 0); + mmio_len = pci_resource_len(pdev, 0); + + card->netdev->mem_start = mmio_start; + card->netdev->mem_end = mmio_start + mmio_len; + card->regs = ioremap(mmio_start, mmio_len); + + if (!card->regs) { + pr_err("Couldn't obtain PCI resources, aborting.\n"); + goto out_release_regions; + } + + return card; + +out_release_regions: + pci_release_regions(pdev); +out_disable_dev: + pci_disable_device(pdev); + pci_set_drvdata(pdev, NULL); + return NULL; +} + +/** + * spider_net_probe - initialization of a device + * @pdev: PCI device + * @ent: entry in the device id list + * + * Returns 0 on success, <0 on failure + * + * spider_net_probe initializes pdev and registers a net_device + * structure for it. 
After that, the device can be ifconfig'ed up + **/ +static int __devinit +spider_net_probe(struct pci_dev *pdev, const struct pci_device_id *ent) +{ + int err = -EIO; + struct spider_net_card *card; + + card = spider_net_setup_pci_dev(pdev); + if (!card) + goto out; + + spider_net_workaround_rxramfull(card); + spider_net_init_card(card); + + err = spider_net_setup_phy(card); + if (err) + goto out_undo_pci; + + err = spider_net_init_firmware(card); + if (err) + goto out_undo_pci; + + err = spider_net_setup_netdev(card); + if (err) + goto out_undo_pci; + + return 0; + +out_undo_pci: + spider_net_undo_pci_setup(card); + free_netdev(card->netdev); +out: + return err; +} + +/** + * spider_net_remove - removal of a device + * @pdev: PCI device + * + * Returns 0 on success, <0 on failure + * + * spider_net_remove is called to remove the device and unregisters the + * net_device + **/ +static void __devexit +spider_net_remove(struct pci_dev *pdev) +{ + struct net_device *netdev; + struct spider_net_card *card; + + netdev = pci_get_drvdata(pdev); + card = netdev_priv(netdev); + + wait_event(card->waitq, + atomic_read(&card->tx_timeout_task_counter) == 0); + + unregister_netdev(netdev); + spider_net_undo_pci_setup(card); + free_netdev(netdev); + + free_irq(to_pci_dev(netdev->class_dev.dev)->irq, netdev); +} + +static struct pci_driver spider_net_driver = { + .owner = THIS_MODULE, + .name = spider_net_driver_name, + .id_table = spider_net_pci_tbl, + .probe = spider_net_probe, + .remove = __devexit_p(spider_net_remove) +}; + +/** + * spider_net_init - init function when the driver is loaded + * + * spider_net_init registers the device driver + */ +static int __init spider_net_init(void) +{ + if (rx_descriptors < SPIDER_NET_RX_DESCRIPTORS_MIN) { + rx_descriptors = SPIDER_NET_RX_DESCRIPTORS_MIN; + pr_info("adjusting rx descriptors to %i.\n", rx_descriptors); + } + if (rx_descriptors > SPIDER_NET_RX_DESCRIPTORS_MAX) { + rx_descriptors = SPIDER_NET_RX_DESCRIPTORS_MAX; + 
pr_info("adjusting rx descriptors to %i.\n", rx_descriptors); + } + if (tx_descriptors < SPIDER_NET_TX_DESCRIPTORS_MIN) { + tx_descriptors = SPIDER_NET_TX_DESCRIPTORS_MIN; + pr_info("adjusting tx descriptors to %i.\n", tx_descriptors); + } + if (tx_descriptors > SPIDER_NET_TX_DESCRIPTORS_MAX) { + tx_descriptors = SPIDER_NET_TX_DESCRIPTORS_MAX; + pr_info("adjusting tx descriptors to %i.\n", tx_descriptors); + } + + return pci_register_driver(&spider_net_driver); +} + +/** + * spider_net_cleanup - exit function when driver is unloaded + * + * spider_net_cleanup unregisters the device driver + */ +static void __exit spider_net_cleanup(void) +{ + pci_unregister_driver(&spider_net_driver); +} + +module_init(spider_net_init); +module_exit(spider_net_cleanup); --- linux-cg.orig/drivers/net/spider_net.h 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/drivers/net/spider_net.h 2005-06-28 15:06:01.237001504 -0400 @@ -0,0 +1,469 @@ +/* + * Network device driver for Cell Processor-Based Blade + * + * (C) Copyright IBM Corp. 2005 + * + * Authors : Utz Bacher + * Jens Osterkamp + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#ifndef _SPIDER_NET_H +#define _SPIDER_NET_H + +#include "sungem_phy.h" + +extern int spider_net_stop(struct net_device *netdev); +extern int spider_net_open(struct net_device *netdev); + +extern struct ethtool_ops spider_net_ethtool_ops; + +extern char spider_net_driver_name[]; + +#define SPIDER_NET_MAX_MTU 2308 +#define SPIDER_NET_MIN_MTU 64 + +#define SPIDER_NET_RXBUF_ALIGN 128 + +#define SPIDER_NET_RX_DESCRIPTORS_DEFAULT 64 +#define SPIDER_NET_RX_DESCRIPTORS_MIN 16 +#define SPIDER_NET_RX_DESCRIPTORS_MAX 256 + +#define SPIDER_NET_TX_DESCRIPTORS_DEFAULT 64 +#define SPIDER_NET_TX_DESCRIPTORS_MIN 16 +#define SPIDER_NET_TX_DESCRIPTORS_MAX 256 + +#define SPIDER_NET_RX_CSUM_DEFAULT 1 + +#define SPIDER_NET_WATCHDOG_TIMEOUT 5*HZ +#define SPIDER_NET_NAPI_WEIGHT 64 + +#define SPIDER_NET_FIRMWARE_LEN 1024 +#define SPIDER_NET_FIRMWARE_NAME "spider_fw.bin" + +/** spider_net SMMIO registers */ +#define SPIDER_NET_GHIINT0STS 0x00000000 +#define SPIDER_NET_GHIINT1STS 0x00000004 +#define SPIDER_NET_GHIINT2STS 0x00000008 +#define SPIDER_NET_GHIINT0MSK 0x00000010 +#define SPIDER_NET_GHIINT1MSK 0x00000014 +#define SPIDER_NET_GHIINT2MSK 0x00000018 + +#define SPIDER_NET_GRESUMINTNUM 0x00000020 +#define SPIDER_NET_GREINTNUM 0x00000024 + +#define SPIDER_NET_GFFRMNUM 0x00000028 +#define SPIDER_NET_GFAFRMNUM 0x0000002c +#define SPIDER_NET_GFBFRMNUM 0x00000030 +#define SPIDER_NET_GFCFRMNUM 0x00000034 +#define SPIDER_NET_GFDFRMNUM 0x00000038 + +/* clear them (don't use it) */ +#define SPIDER_NET_GFREECNNUM 0x0000003c +#define SPIDER_NET_GONETIMENUM 0x00000040 + +#define SPIDER_NET_GTOUTFRMNUM 0x00000044 + +#define SPIDER_NET_GTXMDSET 0x00000050 +#define SPIDER_NET_GPCCTRL 0x00000054 +#define SPIDER_NET_GRXMDSET 0x00000058 +#define SPIDER_NET_GIPSECINIT 0x0000005c +#define SPIDER_NET_GFTRESTRT 0x00000060 +#define SPIDER_NET_GRXDMAEN 0x00000064 +#define SPIDER_NET_GMRWOLCTRL 0x00000068 +#define SPIDER_NET_GPCWOPCMD 0x0000006c +#define SPIDER_NET_GPCROPCMD 0x00000070 +#define 
SPIDER_NET_GTTFRMCNT 0x00000078 +#define SPIDER_NET_GTESTMD 0x0000007c + +#define SPIDER_NET_GSINIT 0x00000080 +#define SPIDER_NET_GSnPRGADR 0x00000084 +#define SPIDER_NET_GSnPRGDAT 0x00000088 + +#define SPIDER_NET_GMACOPEMD 0x00000100 +#define SPIDER_NET_GMACLENLMT 0x00000108 +#define SPIDER_NET_GMACINTEN 0x00000118 +#define SPIDER_NET_GMACPHYCTRL 0x00000120 + +#define SPIDER_NET_GMACAPAUSE 0x00000154 +#define SPIDER_NET_GMACTXPAUSE 0x00000164 + +#define SPIDER_NET_GMACMODE 0x000001b0 +#define SPIDER_NET_GMACBSTLMT 0x000001b4 + +#define SPIDER_NET_GMACUNIMACU 0x000001c0 +#define SPIDER_NET_GMACUNIMACL 0x000001c8 + +#define SPIDER_NET_GMRMHFILnR 0x00000400 +#define SPIDER_NET_MULTICAST_HASHES 256 + +#define SPIDER_NET_GMRUAFILnR 0x00000500 +#define SPIDER_NET_GMRUA0FIL15R 0x00000578 + +/* RX DMA controller registers, all 0x00000a.. are for DMA controller A, + * 0x00000b.. for DMA controller B, etc. */ +#define SPIDER_NET_GDADCHA 0x00000a00 +#define SPIDER_NET_GDADMACCNTR 0x00000a04 +#define SPIDER_NET_GDACTDPA 0x00000a08 +#define SPIDER_NET_GDACTDCNT 0x00000a0c +#define SPIDER_NET_GDACDBADDR 0x00000a20 +#define SPIDER_NET_GDACDBSIZE 0x00000a24 +#define SPIDER_NET_GDACNEXTDA 0x00000a28 +#define SPIDER_NET_GDACCOMST 0x00000a2c +#define SPIDER_NET_GDAWBCOMST 0x00000a30 +#define SPIDER_NET_GDAWBRSIZE 0x00000a34 +#define SPIDER_NET_GDAWBVSIZE 0x00000a38 +#define SPIDER_NET_GDAWBTRST 0x00000a3c +#define SPIDER_NET_GDAWBTRERR 0x00000a40 + +/* TX DMA controller registers */ +#define SPIDER_NET_GDTDCHA 0x00000e00 +#define SPIDER_NET_GDTDMACCNTR 0x00000e04 +#define SPIDER_NET_GDTCDPA 0x00000e08 +#define SPIDER_NET_GDTDMASEL 0x00000e14 + +#define SPIDER_NET_ECMODE 0x00000f00 +/* clock and reset control register */ +#define SPIDER_NET_CKRCTRL 0x00000ff0 + +/** SCONFIG registers */ +#define SPIDER_NET_SCONFIG_IOACTE 0x00002810 + +/** hardcoded register values */ +#define SPIDER_NET_INT0_MASK_VALUE 0x3f7fe3ff +#define SPIDER_NET_INT1_MASK_VALUE 0xffffffff +/* no MAC aborts -> 
auto retransmission */ +#define SPIDER_NET_INT2_MASK_VALUE 0xfffffff1 + +/* clear counter when interrupt sources are cleared +#define SPIDER_NET_FRAMENUM_VALUE 0x0001f001 */ +/* we rely on flagged descriptor interrupts */ +#define SPIDER_NET_FRAMENUM_VALUE 0x00000000 +/* set this first, then the FRAMENUM_VALUE */ +#define SPIDER_NET_GFXFRAMES_VALUE 0x00000000 + +#define SPIDER_NET_STOP_SEQ_VALUE 0x00000000 +#define SPIDER_NET_RUN_SEQ_VALUE 0x0000007e + +#define SPIDER_NET_PHY_CTRL_VALUE 0x00040040 +/* #define SPIDER_NET_PHY_CTRL_VALUE 0x01070080*/ +#define SPIDER_NET_RXMODE_VALUE 0x00000011 +/* auto retransmission in case of MAC aborts */ +#define SPIDER_NET_TXMODE_VALUE 0x00010000 +#define SPIDER_NET_RESTART_VALUE 0x00000000 +#define SPIDER_NET_WOL_VALUE 0x00001111 +#if 0 +#define SPIDER_NET_WOL_VALUE 0x00000000 +#endif +#define SPIDER_NET_IPSECINIT_VALUE 0x00f000f8 + +/* pause frames: automatic, no upper retransmission count */ +/* outside loopback mode: ETOMOD signal dont matter, not connected */ +#define SPIDER_NET_OPMODE_VALUE 0x00000063 +/*#define SPIDER_NET_OPMODE_VALUE 0x001b0062*/ +#define SPIDER_NET_LENLMT_VALUE 0x00000908 + +#define SPIDER_NET_MACAPAUSE_VALUE 0x00000800 /* about 1 ms */ +#define SPIDER_NET_TXPAUSE_VALUE 0x00000000 + +#define SPIDER_NET_MACMODE_VALUE 0x00000001 +#define SPIDER_NET_BURSTLMT_VALUE 0x00000200 /* about 16 us */ + +/* 1(0) enable r/tx dma + * 0000000 fixed to 0 + * + * 000000 fixed to 0 + * 0(1) en/disable descr writeback on force end + * 0(1) force end + * + * 000000 fixed to 0 + * 00 burst alignment: 128 bytes + * + * 00000 fixed to 0 + * 0 descr writeback size 32 bytes + * 0(1) descr chain end interrupt enable + * 0(1) descr status writeback enable */ + +/* to set RX_DMA_EN */ +#define SPIDER_NET_DMA_RX_VALUE 0x80000000 +#define SPIDER_NET_DMA_RX_FEND_VALUE 0x00030003 +/* to set TX_DMA_EN */ +#define SPIDER_NET_DMA_TX_VALUE 0x80000000 +#define SPIDER_NET_DMA_TX_FEND_VALUE 0x00030003 + +/* SPIDER_NET_UA_DESCR_VALUE is OR'ed 
with the unicast address */ +#define SPIDER_NET_UA_DESCR_VALUE 0x00080000 +#define SPIDER_NET_PROMISC_VALUE 0x00080000 +#define SPIDER_NET_NONPROMISC_VALUE 0x00000000 + +#define SPIDER_NET_DMASEL_VALUE 0x00000001 + +#define SPIDER_NET_ECMODE_VALUE 0x00000000 + +#define SPIDER_NET_CKRCTRL_RUN_VALUE 0x1fff010f +#define SPIDER_NET_CKRCTRL_STOP_VALUE 0x0000010f + +#define SPIDER_NET_SBIMSTATE_VALUE 0x00000000 +#define SPIDER_NET_SBTMSTATE_VALUE 0x00000000 + +/* SPIDER_NET_GHIINT0STS bits, in reverse order so that they can be used + * with 1 << SPIDER_NET_... */ +enum spider_net_int0_status { + SPIDER_NET_GPHYINT = 0, + SPIDER_NET_GMAC2INT, + SPIDER_NET_GMAC1INT, + SPIDER_NET_GIPSINT, + SPIDER_NET_GFIFOINT, + SPIDER_NET_GDMACINT, + SPIDER_NET_GSYSINT, + SPIDER_NET_GPWOPCMPINT, + SPIDER_NET_GPROPCMPINT, + SPIDER_NET_GPWFFINT, + SPIDER_NET_GRMDADRINT, + SPIDER_NET_GRMARPINT, + SPIDER_NET_GRMMPINT, + SPIDER_NET_GDTDEN0INT, + SPIDER_NET_GDDDEN0INT, + SPIDER_NET_GDCDEN0INT, + SPIDER_NET_GDBDEN0INT, + SPIDER_NET_GDADEN0INT, + SPIDER_NET_GDTFDCINT, + SPIDER_NET_GDDFDCINT, + SPIDER_NET_GDCFDCINT, + SPIDER_NET_GDBFDCINT, + SPIDER_NET_GDAFDCINT, + SPIDER_NET_GTTEDINT, + SPIDER_NET_GDTDCEINT, + SPIDER_NET_GRFDNMINT, + SPIDER_NET_GRFCNMINT, + SPIDER_NET_GRFBNMINT, + SPIDER_NET_GRFANMINT, + SPIDER_NET_GRFNMINT, + SPIDER_NET_G1TMCNTINT, + SPIDER_NET_GFREECNTINT +}; +/* GHIINT1STS bits */ +enum spider_net_int1_status { + SPIDER_NET_GTMFLLINT = 0, + SPIDER_NET_GRMFLLINT, + SPIDER_NET_GTMSHTINT, + SPIDER_NET_GDTINVDINT, + SPIDER_NET_GRFDFLLINT, + SPIDER_NET_GDDDCEINT, + SPIDER_NET_GDDINVDINT, + SPIDER_NET_GRFCFLLINT, + SPIDER_NET_GDCDCEINT, + SPIDER_NET_GDCINVDINT, + SPIDER_NET_GRFBFLLINT, + SPIDER_NET_GDBDCEINT, + SPIDER_NET_GDBINVDINT, + SPIDER_NET_GRFAFLLINT, + SPIDER_NET_GDADCEINT, + SPIDER_NET_GDAINVDINT, + SPIDER_NET_GDTRSERINT, + SPIDER_NET_GDDRSERINT, + SPIDER_NET_GDCRSERINT, + SPIDER_NET_GDBRSERINT, + SPIDER_NET_GDARSERINT, + SPIDER_NET_GDSERINT, + SPIDER_NET_GDTPTERINT, + 
SPIDER_NET_GDDPTERINT, + SPIDER_NET_GDCPTERINT, + SPIDER_NET_GDBPTERINT, + SPIDER_NET_GDAPTERINT +}; +/* GHIINT2STS bits */ +enum spider_net_int2_status { + SPIDER_NET_GPROPERINT = 0, + SPIDER_NET_GMCTCRSNGINT, + SPIDER_NET_GMCTLCOLINT, + SPIDER_NET_GMCTTMOTINT, + SPIDER_NET_GMCRCAERINT, + SPIDER_NET_GMCRCALERINT, + SPIDER_NET_GMCRALNERINT, + SPIDER_NET_GMCROVRINT, + SPIDER_NET_GMCRRNTINT, + SPIDER_NET_GMCRRXERINT, + SPIDER_NET_GTITCSERINT, + SPIDER_NET_GTIFMTERINT, + SPIDER_NET_GTIPKTRVKINT, + SPIDER_NET_GTISPINGINT, + SPIDER_NET_GTISADNGINT, + SPIDER_NET_GTISPDNGINT, + SPIDER_NET_GRIFMTERINT, + SPIDER_NET_GRIPKTRVKINT, + SPIDER_NET_GRISPINGINT, + SPIDER_NET_GRISADNGINT, + SPIDER_NET_GRISPDNGINT +}; + +#define SPIDER_NET_TXINT ( (1 << SPIDER_NET_GTTEDINT) | \ + (1 << SPIDER_NET_GDTDCEINT) | \ + (1 << SPIDER_NET_GDTFDCINT) ) + +/* we rely on flagged descriptor interrupts*/ +#define SPIDER_NET_RXINT ( (1 << SPIDER_NET_GDAFDCINT) | \ + (1 << SPIDER_NET_GRMFLLINT) ) + +#define SPIDER_NET_GPREXEC 0x80000000 +#define SPIDER_NET_GPRDAT_MASK 0x0000ffff + +/* descriptor bits + * + * 1010 descriptor ready + * 0 descr in middle of chain + * 000 fixed to 0 + * + * 0 no interrupt on completion + * 000 fixed to 0 + * 1 no ipsec processing + * 1 last descriptor for this frame + * 00 no checksum + * 10 tcp checksum + * 11 udp checksum + * + * 00 fixed to 0 + * 0 fixed to 0 + * 0 no interrupt on response errors + * 0 no interrupt on invalid descr + * 0 no interrupt on dma process termination + * 0 no interrupt on descr chain end + * 0 no interrupt on descr complete + * + * 000 fixed to 0 + * 0 response error interrupt status + * 0 invalid descr status + * 0 dma termination status + * 0 descr chain end status + * 0 descr complete status */ +#define SPIDER_NET_DMAC_CMDSTAT_NOCS 0xa00c0000 +#define SPIDER_NET_DMAC_CMDSTAT_TCPCS 0xa00e0000 +#define SPIDER_NET_DMAC_CMDSTAT_UDPCS 0xa00f0000 +#define SPIDER_NET_DESCR_IND_PROC_SHIFT 28 +#define SPIDER_NET_DESCR_IND_PROC_MASKO 0x0fffffff + 
+/* descr ready, descr is in middle of chain, get interrupt on completion */ +#define SPIDER_NET_DMAC_RX_CARDOWNED 0xa0800000 + +/* multicast is no problem */ +#define SPIDER_NET_DATA_ERROR_MASK 0xffffbfff + +enum spider_net_descr_status { + SPIDER_NET_DESCR_COMPLETE = 0x00, /* used in rx and tx */ + SPIDER_NET_DESCR_RESPONSE_ERROR = 0x01, /* used in rx and tx */ + SPIDER_NET_DESCR_PROTECTION_ERROR = 0x02, /* used in rx and tx */ + SPIDER_NET_DESCR_FRAME_END = 0x04, /* used in rx */ + SPIDER_NET_DESCR_FORCE_END = 0x05, /* used in rx and tx */ + SPIDER_NET_DESCR_CARDOWNED = 0x0a, /* used in rx and tx */ + SPIDER_NET_DESCR_NOT_IN_USE /* any other value */ +}; + +struct spider_net_descr { + /* as defined by the hardware */ + dma_addr_t buf_addr; + u32 buf_size; + dma_addr_t next_descr_addr; + u32 dmac_cmd_status; + u32 result_size; + u32 valid_size; /* all zeroes for tx */ + u32 data_status; + u32 data_error; /* all zeroes for tx */ + + /* used in the driver */ + struct sk_buff *skb; + dma_addr_t bus_addr; + struct spider_net_descr *next; + struct spider_net_descr *prev; +} __attribute__((aligned(32))); + +struct spider_net_descr_chain { + /* we walk from tail to head */ + struct spider_net_descr *head; + struct spider_net_descr *tail; +}; + +/* descriptor data_status bits */ +#define SPIDER_NET_RXIPCHK 29 +#define SPIDER_NET_TCPUDPIPCHK 28 +#define SPIDER_NET_DATA_STATUS_CHK_MASK (1 << SPIDER_NET_RXIPCHK | \ + 1 << SPIDER_NET_TCPUDPIPCHK) + +#define SPIDER_NET_VLAN_PACKET 21 + +/* descriptor data_error bits */ +#define SPIDER_NET_RXIPCHKERR 27 +#define SPIDER_NET_RXTCPCHKERR 26 +#define SPIDER_NET_DATA_ERROR_CHK_MASK (1 << SPIDER_NET_RXIPCHKERR | \ + 1 << SPIDER_NET_RXTCPCHKERR) + +/* the cases we don't pass the packet to the stack */ +#define SPIDER_NET_DESTROY_RX_FLAGS 0x70138000 + +#define SPIDER_NET_DESCR_SIZE 32 + +/* this will be bigger some time */ +struct spider_net_options { + int rx_csum; /* for rx: if 0 ip_summed=NONE, + if 1 and hw has verified, 
ip_summed=UNNECESSARY */ +}; + +#define SPIDER_NET_DEFAULT_MSG ( NETIF_MSG_DRV | \ + NETIF_MSG_PROBE | \ + NETIF_MSG_LINK | \ + NETIF_MSG_TIMER | \ + NETIF_MSG_IFDOWN | \ + NETIF_MSG_IFUP | \ + NETIF_MSG_RX_ERR | \ + NETIF_MSG_TX_ERR | \ + NETIF_MSG_TX_QUEUED | \ + NETIF_MSG_INTR | \ + NETIF_MSG_TX_DONE | \ + NETIF_MSG_RX_STATUS | \ + NETIF_MSG_PKTDATA | \ + NETIF_MSG_HW | \ + NETIF_MSG_WOL ) + +struct spider_net_card { + struct net_device *netdev; + struct pci_dev *pdev; + struct mii_phy phy; + + void __iomem *regs; + + struct spider_net_descr_chain tx_chain; + struct spider_net_descr_chain rx_chain; + spinlock_t chain_lock; + + struct net_device_stats netdev_stats; + + struct spider_net_options options; + + spinlock_t intmask_lock; + + struct work_struct tx_timeout_task; + atomic_t tx_timeout_task_counter; + wait_queue_head_t waitq; + + /* for ethtool */ + int msg_enable; + + struct spider_net_descr descr[0]; +}; + +#define pr_err(fmt,arg...) \ + printk(KERN_ERR fmt ,##arg) + +#endif --- linux-cg.orig/drivers/net/spider_net_ethtool.c 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/drivers/net/spider_net_ethtool.c 2005-06-28 15:06:01.238001352 -0400 @@ -0,0 +1,107 @@ +/* + * Network device driver for Cell Processor-Based Blade + * + * (C) Copyright IBM Corp. 2005 + * + * Authors : Utz Bacher + * Jens Osterkamp + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#include +#include +#include + +#include "spider_net.h" + +static void +spider_net_ethtool_get_drvinfo(struct net_device *netdev, + struct ethtool_drvinfo *drvinfo) +{ + struct spider_net_card *card; + card = netdev_priv(netdev); + + /* clear and fill out info */ + memset(drvinfo, 0, sizeof(struct ethtool_drvinfo)); + strncpy(drvinfo->driver, spider_net_driver_name, 32); + strncpy(drvinfo->version, "0.1", 32); + strcpy(drvinfo->fw_version, "no information"); + strncpy(drvinfo->bus_info, pci_name(card->pdev), 32); +} + +static void +spider_net_ethtool_get_wol(struct net_device *netdev, + struct ethtool_wolinfo *wolinfo) +{ + /* no support for wol */ + wolinfo->supported = 0; + wolinfo->wolopts = 0; +} + +static u32 +spider_net_ethtool_get_msglevel(struct net_device *netdev) +{ + struct spider_net_card *card; + card = netdev_priv(netdev); + return card->msg_enable; +} + +static void +spider_net_ethtool_set_msglevel(struct net_device *netdev, + u32 level) +{ + struct spider_net_card *card; + card = netdev_priv(netdev); + card->msg_enable = level; +} + +static int +spider_net_ethtool_nway_reset(struct net_device *netdev) +{ + if (netif_running(netdev)) { + spider_net_stop(netdev); + spider_net_open(netdev); + } + return 0; +} + +static u32 +spider_net_ethtool_get_rx_csum(struct net_device *netdev) +{ + struct spider_net_card *card = netdev->priv; + + return card->options.rx_csum; +} + +static int +spider_net_ethtool_set_rx_csum(struct net_device *netdev, u32 n) +{ + struct spider_net_card *card = netdev->priv; + + card->options.rx_csum = n; + return 0; +} + +struct ethtool_ops spider_net_ethtool_ops = { + .get_drvinfo = spider_net_ethtool_get_drvinfo, + .get_wol = spider_net_ethtool_get_wol, + .get_msglevel = spider_net_ethtool_get_msglevel, + .set_msglevel = 
spider_net_ethtool_set_msglevel, + .nway_reset = spider_net_ethtool_nway_reset, + .get_rx_csum = spider_net_ethtool_get_rx_csum, + .set_rx_csum = spider_net_ethtool_set_rx_csum, +}; + --- linux-cg.orig/include/linux/pci_ids.h 2005-06-28 14:54:14.583994952 -0400 +++ linux-cg/include/linux/pci_ids.h 2005-06-28 15:06:01.240001048 -0400 @@ -1597,6 +1597,7 @@ #define PCI_DEVICE_ID_TOSHIBA_TC35815CF 0x0030 #define PCI_DEVICE_ID_TOSHIBA_TX4927 0x0180 #define PCI_DEVICE_ID_TOSHIBA_TC86C001_MISC 0x0108 +#define PCI_DEVICE_ID_TOSHIBA_SPIDER_NET 0x01b3 #define PCI_VENDOR_ID_RICOH 0x1180 #define PCI_DEVICE_ID_RICOH_RL5C465 0x0465 From arnd at arndb.de Tue Jun 28 23:47:03 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 28 Jun 2005 15:47:03 +0200 Subject: [PATCH] net: add missing include to netdevice.h In-Reply-To: <200506281528.08834.arnd@arndb.de> References: <200506281528.08834.arnd@arndb.de> Message-ID: <200506281547.04620.arnd@arndb.de> linux/etherdevice.h can't be included standalone at the moment, which is required in order to sort the header files in the recommended alphabetic order. This patch fixes that and is needed to build spider_net. 
Signed-off-by: Arnd Bergmann

--- linux-cg.orig/include/linux/etherdevice.h 2005-05-31 07:48:50.044932320 -0400
+++ linux-cg/include/linux/etherdevice.h 2005-05-31 07:49:06.808914320 -0400
@@ -25,6 +25,7 @@
 #define _LINUX_ETHERDEVICE_H

 #include
+#include
 #include

 #ifdef __KERNEL__

From arjan at infradead.org Tue Jun 28 23:53:18 2005
From: arjan at infradead.org (Arjan van de Ven)
Date: Tue, 28 Jun 2005 15:53:18 +0200
Subject: [PATCH] net: add driver for the NIC on Cell Blades
In-Reply-To: <200506281528.08834.arnd@arndb.de>
References: <200506281528.08834.arnd@arndb.de>
Message-ID: <1119966799.3175.32.camel@laptopd505.fenrus.org>

On Tue, 2005-06-28 at 15:28 +0200, Arnd Bergmann wrote:
>
> +static void
> +spider_net_rx_irq_off(struct spider_net_card *card)
> +{
> +	u32 regvalue;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&card->intmask_lock, flags);
> +	regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK);
> +	regvalue &= ~SPIDER_NET_RXINT;
> +	spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue);
> +	spin_unlock_irqrestore(&card->intmask_lock, flags);
> +}

I think you have a PCI posting bug here....

From arnd at arndb.de Wed Jun 29 00:43:53 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Tue, 28 Jun 2005 16:43:53 +0200
Subject: [PATCH] bpa: add default config
Message-ID: <200506281643.53866.arnd@arndb.de>

This adds a bpa_defconfig file and make target. The config settings are made for the current version of the Cell Processor Based Blade, so there are not too many drivers enabled. A few more drivers might get added in the future though.
Signed-off-by: Arnd Bergmann --- linux-cg.orig/arch/ppc64/configs/bpa_defconfig 1969-12-31 19:00:00.000000000 -0500 +++ linux-cg/arch/ppc64/configs/bpa_defconfig 2005-06-28 16:06:41.079981312 -0400 @@ -0,0 +1,1017 @@ +# +# Automatically generated make config: don't edit +# Linux kernel version: 2.6.12 +# Tue Jun 28 16:06:30 2005 +# +CONFIG_64BIT=y +CONFIG_MMU=y +CONFIG_RWSEM_XCHGADD_ALGORITHM=y +CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_GENERIC_ISA_DMA=y +CONFIG_HAVE_DEC_LOCK=y +CONFIG_EARLY_PRINTK=y +CONFIG_COMPAT=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y +CONFIG_FORCE_MAX_ZONEORDER=13 + +# +# Code maturity level options +# +CONFIG_EXPERIMENTAL=y +CONFIG_CLEAN_COMPILE=y +CONFIG_LOCK_KERNEL=y +CONFIG_INIT_ENV_ARG_LIMIT=32 + +# +# General setup +# +CONFIG_LOCALVERSION="" +CONFIG_SWAP=y +CONFIG_SYSVIPC=y +# CONFIG_POSIX_MQUEUE is not set +# CONFIG_BSD_PROCESS_ACCT is not set +CONFIG_SYSCTL=y +# CONFIG_AUDIT is not set +CONFIG_HOTPLUG=y +CONFIG_KOBJECT_UEVENT=y +# CONFIG_IKCONFIG is not set +# CONFIG_CPUSETS is not set +# CONFIG_EMBEDDED is not set +CONFIG_KALLSYMS=y +# CONFIG_KALLSYMS_ALL is not set +# CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_EPOLL=y +CONFIG_SHMEM=y +CONFIG_CC_ALIGN_FUNCTIONS=0 +CONFIG_CC_ALIGN_LABELS=0 +CONFIG_CC_ALIGN_LOOPS=0 +CONFIG_CC_ALIGN_JUMPS=0 +# CONFIG_TINY_SHMEM is not set +CONFIG_BASE_SMALL=0 + +# +# Loadable module support +# +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +# CONFIG_MODULE_FORCE_UNLOAD is not set +CONFIG_OBSOLETE_MODPARM=y +# CONFIG_MODVERSIONS is not set +# CONFIG_MODULE_SRCVERSION_ALL is not set +# CONFIG_KMOD is not set +CONFIG_STOP_MACHINE=y +CONFIG_SYSVIPC_COMPAT=y + +# +# Platform support +# +# CONFIG_PPC_ISERIES is not set +CONFIG_PPC_MULTIPLATFORM=y +# CONFIG_PPC_PSERIES is not set +CONFIG_PPC_BPA=y +# CONFIG_PPC_PMAC is not set +# CONFIG_PPC_MAPLE is not set +CONFIG_PPC=y +CONFIG_PPC64=y +CONFIG_PPC_OF=y +CONFIG_BPA_IIC=y +CONFIG_ALTIVEC=y 
+CONFIG_KEXEC=y +# CONFIG_U3_DART is not set +# CONFIG_BOOTX_TEXT is not set +# CONFIG_POWER4_ONLY is not set +# CONFIG_IOMMU_VMERGE is not set +CONFIG_SMP=y +CONFIG_NR_CPUS=4 +CONFIG_ARCH_SELECT_MEMORY_MODEL=y +CONFIG_ARCH_FLATMEM_ENABLE=y +CONFIG_SELECT_MEMORY_MODEL=y +CONFIG_FLATMEM_MANUAL=y +# CONFIG_DISCONTIGMEM_MANUAL is not set +# CONFIG_SPARSEMEM_MANUAL is not set +CONFIG_FLATMEM=y +CONFIG_FLAT_NODE_MEM_MAP=y +# CONFIG_NUMA is not set +CONFIG_SCHED_SMT=y +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set +# CONFIG_PREEMPT is not set +CONFIG_PREEMPT_BKL=y +CONFIG_GENERIC_HARDIRQS=y +CONFIG_PPC_RTAS=y +CONFIG_RTAS_PROC=y +CONFIG_RTAS_FLASH=y +CONFIG_SECCOMP=y +CONFIG_ISA_DMA_API=y + +# +# General setup +# +CONFIG_PCI=y +CONFIG_PCI_DOMAINS=y +CONFIG_BINFMT_ELF=y +# CONFIG_BINFMT_MISC is not set +CONFIG_PCI_LEGACY_PROC=y +CONFIG_PCI_NAMES=y +# CONFIG_PCI_DEBUG is not set + +# +# PCCARD (PCMCIA/CardBus) support +# +# CONFIG_PCCARD is not set + +# +# PCI Hotplug Support +# +# CONFIG_HOTPLUG_PCI is not set +CONFIG_PROC_DEVICETREE=y +# CONFIG_CMDLINE_BOOL is not set + +# +# Device Drivers +# + +# +# Generic Driver Options +# +CONFIG_STANDALONE=y +CONFIG_PREVENT_FIRMWARE_BUILD=y +CONFIG_FW_LOADER=y +# CONFIG_DEBUG_DRIVER is not set + +# +# Memory Technology Devices (MTD) +# +# CONFIG_MTD is not set + +# +# Parallel port support +# +# CONFIG_PARPORT is not set + +# +# Plug and Play support +# + +# +# Block devices +# +# CONFIG_BLK_DEV_FD is not set +# CONFIG_BLK_CPQ_DA is not set +# CONFIG_BLK_CPQ_CISS_DA is not set +# CONFIG_BLK_DEV_DAC960 is not set +# CONFIG_BLK_DEV_UMEM is not set +# CONFIG_BLK_DEV_COW_COMMON is not set +CONFIG_BLK_DEV_LOOP=y +# CONFIG_BLK_DEV_CRYPTOLOOP is not set +CONFIG_BLK_DEV_NBD=y +# CONFIG_BLK_DEV_SX8 is not set +CONFIG_BLK_DEV_RAM=y +CONFIG_BLK_DEV_RAM_COUNT=16 +CONFIG_BLK_DEV_RAM_SIZE=131072 +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE="" +# CONFIG_CDROM_PKTCDVD is not set + +# +# IO Schedulers +# +CONFIG_IOSCHED_NOOP=y 
+CONFIG_IOSCHED_AS=y +CONFIG_IOSCHED_DEADLINE=y +CONFIG_IOSCHED_CFQ=y +# CONFIG_ATA_OVER_ETH is not set + +# +# ATA/ATAPI/MFM/RLL support +# +CONFIG_IDE=y +CONFIG_BLK_DEV_IDE=y + +# +# Please see Documentation/ide.txt for help/info on IDE drives +# +# CONFIG_BLK_DEV_IDE_SATA is not set +CONFIG_BLK_DEV_IDEDISK=y +CONFIG_IDEDISK_MULTI_MODE=y +# CONFIG_BLK_DEV_IDECD is not set +# CONFIG_BLK_DEV_IDETAPE is not set +# CONFIG_BLK_DEV_IDEFLOPPY is not set +# CONFIG_IDE_TASK_IOCTL is not set + +# +# IDE chipset support/bugfixes +# +CONFIG_IDE_GENERIC=y +CONFIG_BLK_DEV_IDEPCI=y +CONFIG_IDEPCI_SHARE_IRQ=y +# CONFIG_BLK_DEV_OFFBOARD is not set +CONFIG_BLK_DEV_GENERIC=y +# CONFIG_BLK_DEV_OPTI621 is not set +# CONFIG_BLK_DEV_SL82C105 is not set +CONFIG_BLK_DEV_IDEDMA_PCI=y +# CONFIG_BLK_DEV_IDEDMA_FORCED is not set +CONFIG_IDEDMA_PCI_AUTO=y +# CONFIG_IDEDMA_ONLYDISK is not set +CONFIG_BLK_DEV_AEC62XX=y +# CONFIG_BLK_DEV_ALI15X3 is not set +# CONFIG_BLK_DEV_AMD74XX is not set +# CONFIG_BLK_DEV_CMD64X is not set +# CONFIG_BLK_DEV_TRIFLEX is not set +# CONFIG_BLK_DEV_CY82C693 is not set +# CONFIG_BLK_DEV_CS5520 is not set +# CONFIG_BLK_DEV_CS5530 is not set +# CONFIG_BLK_DEV_HPT34X is not set +# CONFIG_BLK_DEV_HPT366 is not set +# CONFIG_BLK_DEV_SC1200 is not set +# CONFIG_BLK_DEV_PIIX is not set +# CONFIG_BLK_DEV_IT821X is not set +# CONFIG_BLK_DEV_NS87415 is not set +# CONFIG_BLK_DEV_PDC202XX_OLD is not set +# CONFIG_BLK_DEV_PDC202XX_NEW is not set +# CONFIG_BLK_DEV_SVWKS is not set +CONFIG_BLK_DEV_SIIMAGE=y +# CONFIG_BLK_DEV_SLC90E66 is not set +# CONFIG_BLK_DEV_TRM290 is not set +# CONFIG_BLK_DEV_VIA82CXXX is not set +# CONFIG_IDE_ARM is not set +CONFIG_BLK_DEV_IDEDMA=y +# CONFIG_IDEDMA_IVB is not set +CONFIG_IDEDMA_AUTO=y +# CONFIG_BLK_DEV_HD is not set + +# +# SCSI device support +# +# CONFIG_SCSI is not set + +# +# Multi-device support (RAID and LVM) +# +# CONFIG_MD is not set + +# +# Fusion MPT device support +# +# CONFIG_FUSION is not set + +# +# IEEE 1394 (FireWire) 
support +# +# CONFIG_IEEE1394 is not set + +# +# I2O device support +# +# CONFIG_I2O is not set + +# +# Macintosh device drivers +# + +# +# Networking support +# +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_MMAP is not set +CONFIG_UNIX=y +# CONFIG_NET_KEY is not set +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is not set +CONFIG_IP_FIB_HASH=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=y +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +CONFIG_SYN_COOKIES=y +# CONFIG_INET_AH is not set +# CONFIG_INET_ESP is not set +# CONFIG_INET_IPCOMP is not set +CONFIG_INET_TUNNEL=y +CONFIG_IP_TCPDIAG=y +CONFIG_IP_TCPDIAG_IPV6=y +# CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y + +# +# IP: Virtual Server Configuration +# +# CONFIG_IP_VS is not set +CONFIG_IPV6=y +# CONFIG_IPV6_PRIVACY is not set +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_INET6_TUNNEL=m +CONFIG_IPV6_TUNNEL=m +CONFIG_NETFILTER=y +# CONFIG_NETFILTER_DEBUG is not set + +# +# IP: Netfilter Configuration +# +CONFIG_IP_NF_CONNTRACK=y +# CONFIG_IP_NF_CT_ACCT is not set +# CONFIG_IP_NF_CONNTRACK_MARK is not set +CONFIG_IP_NF_CT_PROTO_SCTP=y +CONFIG_IP_NF_FTP=m +CONFIG_IP_NF_IRC=m +CONFIG_IP_NF_TFTP=m +CONFIG_IP_NF_AMANDA=m +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_LIMIT=m +CONFIG_IP_NF_MATCH_IPRANGE=m +CONFIG_IP_NF_MATCH_MAC=m +CONFIG_IP_NF_MATCH_PKTTYPE=m +CONFIG_IP_NF_MATCH_MARK=m +CONFIG_IP_NF_MATCH_MULTIPORT=m +CONFIG_IP_NF_MATCH_TOS=m +CONFIG_IP_NF_MATCH_RECENT=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_DSCP=m +CONFIG_IP_NF_MATCH_AH_ESP=m +CONFIG_IP_NF_MATCH_LENGTH=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_MATCH_TCPMSS=m +CONFIG_IP_NF_MATCH_HELPER=m +CONFIG_IP_NF_MATCH_STATE=m +CONFIG_IP_NF_MATCH_CONNTRACK=m +CONFIG_IP_NF_MATCH_OWNER=m +CONFIG_IP_NF_MATCH_ADDRTYPE=m +CONFIG_IP_NF_MATCH_REALM=m +CONFIG_IP_NF_MATCH_SCTP=m +CONFIG_IP_NF_MATCH_COMMENT=m 
+CONFIG_IP_NF_MATCH_HASHLIMIT=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_LOG=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_TARGET_TCPMSS=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_NAT_NEEDED=y +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_SAME=m +CONFIG_IP_NF_NAT_SNMP_BASIC=m +CONFIG_IP_NF_NAT_IRC=m +CONFIG_IP_NF_NAT_FTP=m +CONFIG_IP_NF_NAT_TFTP=m +CONFIG_IP_NF_NAT_AMANDA=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_TOS=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_DSCP=m +CONFIG_IP_NF_TARGET_MARK=m +CONFIG_IP_NF_TARGET_CLASSIFY=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_TARGET_NOTRACK=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m + +# +# IPv6: Netfilter Configuration (EXPERIMENTAL) +# +# CONFIG_IP6_NF_QUEUE is not set +# CONFIG_IP6_NF_IPTABLES is not set +CONFIG_XFRM=y +# CONFIG_XFRM_USER is not set + +# +# SCTP Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set + +# +# QoS and/or fair queueing +# +# CONFIG_NET_SCHED is not set +CONFIG_NET_CLS_ROUTE=y + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +# CONFIG_NETPOLL is not set +# CONFIG_NET_POLL_CONTROLLER is not set +# CONFIG_HAMRADIO is not set +# CONFIG_IRDA is not set +# CONFIG_BT is not set +CONFIG_NETDEVICES=y +# CONFIG_DUMMY is not set +# CONFIG_BONDING is not set +# CONFIG_EQUALIZER is not set +# CONFIG_TUN is not set + +# +# ARCnet devices +# +# CONFIG_ARCNET is not set + +# +# Ethernet (10 or 100Mbit) +# +CONFIG_NET_ETHERNET=y +CONFIG_MII=y +# CONFIG_HAPPYMEAL is not set +# CONFIG_SUNGEM is not set +# CONFIG_NET_VENDOR_3COM is not 
set + +# +# Tulip family network device support +# +# CONFIG_NET_TULIP is not set +# CONFIG_HP100 is not set +# CONFIG_NET_PCI is not set + +# +# Ethernet (1000 Mbit) +# +# CONFIG_ACENIC is not set +# CONFIG_DL2K is not set +CONFIG_E1000=m +# CONFIG_E1000_NAPI is not set +# CONFIG_NS83820 is not set +# CONFIG_HAMACHI is not set +# CONFIG_YELLOWFIN is not set +# CONFIG_R8169 is not set +CONFIG_SKGE=m +# CONFIG_SK98LIN is not set +# CONFIG_TIGON3 is not set +# CONFIG_BNX2 is not set +CONFIG_SPIDER_NET=m +# CONFIG_MV643XX_ETH is not set + +# +# Ethernet (10000 Mbit) +# +# CONFIG_IXGB is not set +# CONFIG_S2IO is not set + +# +# Token Ring devices +# +# CONFIG_TR is not set + +# +# Wireless LAN (non-hamradio) +# +# CONFIG_NET_RADIO is not set + +# +# Wan interfaces +# +# CONFIG_WAN is not set +# CONFIG_FDDI is not set +# CONFIG_HIPPI is not set +# CONFIG_PPP is not set +# CONFIG_SLIP is not set +# CONFIG_SHAPER is not set +# CONFIG_NETCONSOLE is not set + +# +# ISDN subsystem +# +# CONFIG_ISDN is not set + +# +# Telephony Support +# +# CONFIG_PHONE is not set + +# +# Input device support +# +CONFIG_INPUT=y + +# +# Userland interfaces +# +CONFIG_INPUT_MOUSEDEV=y +# CONFIG_INPUT_MOUSEDEV_PSAUX is not set +CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 +CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 +# CONFIG_INPUT_JOYDEV is not set +# CONFIG_INPUT_TSDEV is not set +# CONFIG_INPUT_EVDEV is not set +# CONFIG_INPUT_EVBUG is not set + +# +# Input Device Drivers +# +# CONFIG_INPUT_KEYBOARD is not set +# CONFIG_INPUT_MOUSE is not set +# CONFIG_INPUT_JOYSTICK is not set +# CONFIG_INPUT_TOUCHSCREEN is not set +# CONFIG_INPUT_MISC is not set + +# +# Hardware I/O ports +# +CONFIG_SERIO=y +# CONFIG_SERIO_I8042 is not set +CONFIG_SERIO_SERPORT=y +# CONFIG_SERIO_PCIPS2 is not set +# CONFIG_SERIO_RAW is not set +# CONFIG_GAMEPORT is not set + +# +# Character devices +# +CONFIG_VT=y +CONFIG_VT_CONSOLE=y +CONFIG_HW_CONSOLE=y +CONFIG_SERIAL_NONSTANDARD=y +# CONFIG_ROCKETPORT is not set +# CONFIG_CYCLADES is 
not set +# CONFIG_MOXA_SMARTIO is not set +# CONFIG_ISI is not set +# CONFIG_SYNCLINK is not set +# CONFIG_SYNCLINKMP is not set +# CONFIG_N_HDLC is not set +# CONFIG_SPECIALIX is not set +# CONFIG_SX is not set +# CONFIG_STALDRV is not set + +# +# Serial drivers +# +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_NR_UARTS=4 +# CONFIG_SERIAL_8250_EXTENDED is not set + +# +# Non-8250 serial port support +# +CONFIG_SERIAL_CORE=y +CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_JSM is not set +CONFIG_UNIX98_PTYS=y +# CONFIG_LEGACY_PTYS is not set + +# +# IPMI +# +# CONFIG_IPMI_HANDLER is not set + +# +# Watchdog Cards +# +CONFIG_WATCHDOG=y +# CONFIG_WATCHDOG_NOWAYOUT is not set + +# +# Watchdog Device Drivers +# +# CONFIG_SOFT_WATCHDOG is not set +CONFIG_WATCHDOG_RTAS=y + +# +# PCI-based Watchdog Cards +# +# CONFIG_PCIPCWATCHDOG is not set +# CONFIG_WDTPCI is not set +# CONFIG_RTC is not set +# CONFIG_GEN_RTC is not set +# CONFIG_DTLK is not set +# CONFIG_R3964 is not set +# CONFIG_APPLICOM is not set + +# +# Ftape, the floppy tape device driver +# +# CONFIG_AGP is not set +# CONFIG_DRM is not set +# CONFIG_RAW_DRIVER is not set +# CONFIG_HANGCHECK_TIMER is not set + +# +# TPM devices +# +# CONFIG_TCG_TPM is not set + +# +# I2C support +# +CONFIG_I2C=y +# CONFIG_I2C_CHARDEV is not set + +# +# I2C Algorithms +# +CONFIG_I2C_ALGOBIT=y +# CONFIG_I2C_ALGOPCF is not set +# CONFIG_I2C_ALGOPCA is not set + +# +# I2C Hardware Bus support +# +# CONFIG_I2C_ALI1535 is not set +# CONFIG_I2C_ALI1563 is not set +# CONFIG_I2C_ALI15X3 is not set +# CONFIG_I2C_AMD756 is not set +# CONFIG_I2C_AMD8111 is not set +# CONFIG_I2C_I801 is not set +# CONFIG_I2C_I810 is not set +# CONFIG_I2C_PIIX4 is not set +# CONFIG_I2C_ISA is not set +# CONFIG_I2C_NFORCE2 is not set +# CONFIG_I2C_PARPORT_LIGHT is not set +# CONFIG_I2C_PROSAVAGE is not set +# CONFIG_I2C_SAVAGE4 is not set +# CONFIG_SCx200_ACB is not set +# CONFIG_I2C_SIS5595 is not set +# CONFIG_I2C_SIS630 is not set +# 
CONFIG_I2C_SIS96X is not set +# CONFIG_I2C_STUB is not set +# CONFIG_I2C_VIA is not set +# CONFIG_I2C_VIAPRO is not set +# CONFIG_I2C_VOODOO3 is not set +# CONFIG_I2C_PCA_ISA is not set + +# +# Hardware Sensors Chip support +# +# CONFIG_I2C_SENSOR is not set +# CONFIG_SENSORS_ADM1021 is not set +# CONFIG_SENSORS_ADM1025 is not set +# CONFIG_SENSORS_ADM1026 is not set +# CONFIG_SENSORS_ADM1031 is not set +# CONFIG_SENSORS_ADM9240 is not set +# CONFIG_SENSORS_ASB100 is not set +# CONFIG_SENSORS_ATXP1 is not set +# CONFIG_SENSORS_DS1621 is not set +# CONFIG_SENSORS_FSCHER is not set +# CONFIG_SENSORS_FSCPOS is not set +# CONFIG_SENSORS_GL518SM is not set +# CONFIG_SENSORS_GL520SM is not set +# CONFIG_SENSORS_IT87 is not set +# CONFIG_SENSORS_LM63 is not set +# CONFIG_SENSORS_LM75 is not set +# CONFIG_SENSORS_LM77 is not set +# CONFIG_SENSORS_LM78 is not set +# CONFIG_SENSORS_LM80 is not set +# CONFIG_SENSORS_LM83 is not set +# CONFIG_SENSORS_LM85 is not set +# CONFIG_SENSORS_LM87 is not set +# CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set +# CONFIG_SENSORS_MAX1619 is not set +# CONFIG_SENSORS_PC87360 is not set +# CONFIG_SENSORS_SMSC47B397 is not set +# CONFIG_SENSORS_SIS5595 is not set +# CONFIG_SENSORS_SMSC47M1 is not set +# CONFIG_SENSORS_VIA686A is not set +# CONFIG_SENSORS_W83781D is not set +# CONFIG_SENSORS_W83L785TS is not set +# CONFIG_SENSORS_W83627HF is not set +# CONFIG_SENSORS_W83627EHF is not set + +# +# Other I2C Chip support +# +# CONFIG_SENSORS_DS1337 is not set +# CONFIG_SENSORS_DS1374 is not set +# CONFIG_SENSORS_EEPROM is not set +# CONFIG_SENSORS_PCF8574 is not set +# CONFIG_SENSORS_PCA9539 is not set +# CONFIG_SENSORS_PCF8591 is not set +# CONFIG_SENSORS_RTC8564 is not set +# CONFIG_SENSORS_MAX6875 is not set +# CONFIG_I2C_DEBUG_CORE is not set +# CONFIG_I2C_DEBUG_ALGO is not set +# CONFIG_I2C_DEBUG_BUS is not set +# CONFIG_I2C_DEBUG_CHIP is not set + +# +# Dallas's 1-wire bus +# +# CONFIG_W1 is not set + +# +# Misc devices +# 
+ +# +# Multimedia devices +# +# CONFIG_VIDEO_DEV is not set + +# +# Digital Video Broadcasting Devices +# +# CONFIG_DVB is not set + +# +# Graphics support +# +# CONFIG_FB is not set + +# +# Console display driver support +# +# CONFIG_VGA_CONSOLE is not set +CONFIG_DUMMY_CONSOLE=y + +# +# Sound +# +# CONFIG_SOUND is not set + +# +# USB support +# +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB_ARCH_HAS_OHCI=y +# CONFIG_USB is not set + +# +# USB Gadget Support +# +# CONFIG_USB_GADGET is not set + +# +# MMC/SD Card support +# +# CONFIG_MMC is not set + +# +# InfiniBand support +# +# CONFIG_INFINIBAND is not set + +# +# SN Devices +# + +# +# File systems +# +CONFIG_EXT2_FS=y +# CONFIG_EXT2_FS_XATTR is not set +# CONFIG_EXT2_FS_XIP is not set +CONFIG_EXT3_FS=y +CONFIG_EXT3_FS_XATTR=y +# CONFIG_EXT3_FS_POSIX_ACL is not set +# CONFIG_EXT3_FS_SECURITY is not set +CONFIG_JBD=y +# CONFIG_JBD_DEBUG is not set +CONFIG_FS_MBCACHE=y +# CONFIG_REISERFS_FS is not set +# CONFIG_JFS_FS is not set +CONFIG_FS_POSIX_ACL=y + +# +# XFS support +# +# CONFIG_XFS_FS is not set +# CONFIG_MINIX_FS is not set +# CONFIG_ROMFS_FS is not set +# CONFIG_QUOTA is not set +CONFIG_DNOTIFY=y +# CONFIG_AUTOFS_FS is not set +# CONFIG_AUTOFS4_FS is not set + +# +# CD-ROM/DVD Filesystems +# +CONFIG_ISO9660_FS=m +CONFIG_JOLIET=y +# CONFIG_ZISOFS is not set +CONFIG_UDF_FS=m +CONFIG_UDF_NLS=y + +# +# DOS/FAT/NT Filesystems +# +CONFIG_FAT_FS=m +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_FAT_DEFAULT_CODEPAGE=437 +CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" +# CONFIG_NTFS_FS is not set + +# +# Pseudo filesystems +# +CONFIG_PROC_FS=y +CONFIG_PROC_KCORE=y +CONFIG_SYSFS=y +# CONFIG_DEVPTS_FS_XATTR is not set +CONFIG_TMPFS=y +CONFIG_TMPFS_XATTR=y +# CONFIG_TMPFS_SECURITY is not set +CONFIG_HUGETLBFS=y +CONFIG_HUGETLB_PAGE=y +CONFIG_SPU_FS=m +CONFIG_RAMFS=y + +# +# Miscellaneous filesystems +# +# CONFIG_ADFS_FS is not set +# CONFIG_AFFS_FS is not set +# CONFIG_HFS_FS is not set +# CONFIG_HFSPLUS_FS is not set +# 
CONFIG_BEFS_FS is not set +# CONFIG_BFS_FS is not set +# CONFIG_EFS_FS is not set +# CONFIG_CRAMFS is not set +# CONFIG_VXFS_FS is not set +# CONFIG_HPFS_FS is not set +# CONFIG_QNX4FS_FS is not set +# CONFIG_SYSV_FS is not set +# CONFIG_UFS_FS is not set + +# +# Network File Systems +# +CONFIG_NFS_FS=m +CONFIG_NFS_V3=y +CONFIG_NFS_V3_ACL=y +# CONFIG_NFS_V4 is not set +# CONFIG_NFS_DIRECTIO is not set +CONFIG_NFSD=m +CONFIG_NFSD_V2_ACL=y +CONFIG_NFSD_V3=y +CONFIG_NFSD_V3_ACL=y +# CONFIG_NFSD_V4 is not set +CONFIG_NFSD_TCP=y +CONFIG_LOCKD=m +CONFIG_LOCKD_V4=y +CONFIG_EXPORTFS=m +CONFIG_NFS_ACL_SUPPORT=m +CONFIG_NFS_COMMON=y +CONFIG_SUNRPC=m +# CONFIG_RPCSEC_GSS_KRB5 is not set +# CONFIG_RPCSEC_GSS_SPKM3 is not set +# CONFIG_SMB_FS is not set +# CONFIG_CIFS is not set +# CONFIG_NCP_FS is not set +# CONFIG_CODA_FS is not set +# CONFIG_AFS_FS is not set + +# +# Partition Types +# +CONFIG_PARTITION_ADVANCED=y +# CONFIG_ACORN_PARTITION is not set +# CONFIG_OSF_PARTITION is not set +# CONFIG_AMIGA_PARTITION is not set +# CONFIG_ATARI_PARTITION is not set +# CONFIG_MAC_PARTITION is not set +CONFIG_MSDOS_PARTITION=y +# CONFIG_BSD_DISKLABEL is not set +# CONFIG_MINIX_SUBPARTITION is not set +# CONFIG_SOLARIS_X86_PARTITION is not set +# CONFIG_UNIXWARE_DISKLABEL is not set +# CONFIG_LDM_PARTITION is not set +# CONFIG_SGI_PARTITION is not set +# CONFIG_ULTRIX_PARTITION is not set +# CONFIG_SUN_PARTITION is not set +CONFIG_EFI_PARTITION=y + +# +# Native Language Support +# +CONFIG_NLS=m +CONFIG_NLS_DEFAULT="iso8859-1" +# CONFIG_NLS_CODEPAGE_437 is not set +# CONFIG_NLS_CODEPAGE_737 is not set +# CONFIG_NLS_CODEPAGE_775 is not set +# CONFIG_NLS_CODEPAGE_850 is not set +# CONFIG_NLS_CODEPAGE_852 is not set +# CONFIG_NLS_CODEPAGE_855 is not set +# CONFIG_NLS_CODEPAGE_857 is not set +# CONFIG_NLS_CODEPAGE_860 is not set +# CONFIG_NLS_CODEPAGE_861 is not set +# CONFIG_NLS_CODEPAGE_862 is not set +# CONFIG_NLS_CODEPAGE_863 is not set +# CONFIG_NLS_CODEPAGE_864 is not set +# 
CONFIG_NLS_CODEPAGE_865 is not set +# CONFIG_NLS_CODEPAGE_866 is not set +# CONFIG_NLS_CODEPAGE_869 is not set +# CONFIG_NLS_CODEPAGE_936 is not set +# CONFIG_NLS_CODEPAGE_950 is not set +# CONFIG_NLS_CODEPAGE_932 is not set +# CONFIG_NLS_CODEPAGE_949 is not set +# CONFIG_NLS_CODEPAGE_874 is not set +# CONFIG_NLS_ISO8859_8 is not set +# CONFIG_NLS_CODEPAGE_1250 is not set +# CONFIG_NLS_CODEPAGE_1251 is not set +# CONFIG_NLS_ASCII is not set +CONFIG_NLS_ISO8859_1=m +CONFIG_NLS_ISO8859_2=m +CONFIG_NLS_ISO8859_3=m +CONFIG_NLS_ISO8859_4=m +CONFIG_NLS_ISO8859_5=m +CONFIG_NLS_ISO8859_6=m +CONFIG_NLS_ISO8859_7=m +CONFIG_NLS_ISO8859_9=m +CONFIG_NLS_ISO8859_13=m +CONFIG_NLS_ISO8859_14=m +CONFIG_NLS_ISO8859_15=m +# CONFIG_NLS_KOI8_R is not set +# CONFIG_NLS_KOI8_U is not set +# CONFIG_NLS_UTF8 is not set + +# +# Profiling support +# +# CONFIG_PROFILING is not set + +# +# Kernel hacking +# +# CONFIG_PRINTK_TIME is not set +CONFIG_DEBUG_KERNEL=y +CONFIG_MAGIC_SYSRQ=y +CONFIG_LOG_BUF_SHIFT=15 +# CONFIG_SCHEDSTATS is not set +# CONFIG_DEBUG_SLAB is not set +# CONFIG_DEBUG_SPINLOCK is not set +CONFIG_DEBUG_SPINLOCK_SLEEP=y +# CONFIG_DEBUG_KOBJECT is not set +# CONFIG_DEBUG_INFO is not set +CONFIG_DEBUG_FS=y +# CONFIG_DEBUG_STACKOVERFLOW is not set +# CONFIG_KPROBES is not set +# CONFIG_DEBUG_STACK_USAGE is not set +CONFIG_DEBUGGER=y +# CONFIG_XMON is not set +# CONFIG_PPCDBG is not set +CONFIG_IRQSTACKS=y + +# +# Security options +# +# CONFIG_KEYS is not set +# CONFIG_SECURITY is not set + +# +# Cryptographic options +# +CONFIG_CRYPTO=y +CONFIG_CRYPTO_HMAC=y +# CONFIG_CRYPTO_NULL is not set +# CONFIG_CRYPTO_MD4 is not set +CONFIG_CRYPTO_MD5=m +CONFIG_CRYPTO_SHA1=m +# CONFIG_CRYPTO_SHA256 is not set +# CONFIG_CRYPTO_SHA512 is not set +# CONFIG_CRYPTO_WP512 is not set +# CONFIG_CRYPTO_TGR192 is not set +CONFIG_CRYPTO_DES=m +# CONFIG_CRYPTO_BLOWFISH is not set +# CONFIG_CRYPTO_TWOFISH is not set +# CONFIG_CRYPTO_SERPENT is not set +# CONFIG_CRYPTO_AES is not set +# 
CONFIG_CRYPTO_CAST5 is not set +# CONFIG_CRYPTO_CAST6 is not set +# CONFIG_CRYPTO_TEA is not set +# CONFIG_CRYPTO_ARC4 is not set +# CONFIG_CRYPTO_KHAZAD is not set +# CONFIG_CRYPTO_ANUBIS is not set +CONFIG_CRYPTO_DEFLATE=m +# CONFIG_CRYPTO_MICHAEL_MIC is not set +# CONFIG_CRYPTO_CRC32C is not set +# CONFIG_CRYPTO_TEST is not set + +# +# Hardware crypto devices +# + +# +# Library routines +# +# CONFIG_CRC_CCITT is not set +CONFIG_CRC32=y +# CONFIG_LIBCRC32C is not set +CONFIG_ZLIB_INFLATE=m +CONFIG_ZLIB_DEFLATE=m From jbarnes at virtuousgeek.org Wed Jun 29 07:19:57 2005 From: jbarnes at virtuousgeek.org (Jesse Barnes) Date: Tue, 28 Jun 2005 14:19:57 -0700 Subject: [PATCH] ppc/ppc64: Fix pci mmap via sysfs In-Reply-To: <1119838264.5133.76.camel@gaston> References: <1119836190.5133.59.camel@gaston> <20050626185727.0ce92772.akpm@osdl.org> <1119838264.5133.76.camel@gaston> Message-ID: <200506281419.57400.jbarnes@virtuousgeek.org> On Sunday, June 26, 2005 7:11 pm, Benjamin Herrenschmidt wrote: > On Sun, 2005-06-26 at 18:57 -0700, Andrew Morton wrote: > > Benjamin Herrenschmidt wrote: > > > Hi ! > > > > > > This implement the change to /proc and sysfs PCI mmap functions > > > that we discussed a while ago, that is adding an arch optional > > > pci_resource_to_user() to allow munging on the exposed value of > > > PCI resources to userland and thus hiding kernel internal values. > > > It also implements using of that callback to sanitize exposed > > > values on ppc an ppc64, thus fixing mmap of PCI devices via /proc > > > and sysfs. > > > > You sure you want all those printks in there? > > One quilt ref later ... :) This one looks better. :) Thanks for fixing this up. Please document it in sysfs-pci.txt too (I guess we don't have a similar document for /proc/bus/pci unfortunately) so that people won't miss it when they implement /proc/bus/pci and sysfs PCI mmap support in the future. 
Thanks,
Jesse

From davem at davemloft.net Wed Jun 29 08:59:01 2005
From: davem at davemloft.net (David S. Miller)
Date: Tue, 28 Jun 2005 15:59:01 -0700 (PDT)
Subject: [PATCH] net: add missing include to netdevice.h
In-Reply-To: <200506281547.04620.arnd@arndb.de>
References: <200506281528.08834.arnd@arndb.de> <200506281547.04620.arnd@arndb.de>
Message-ID: <20050628.155901.78169707.davem@davemloft.net>

From: Arnd Bergmann
Date: Tue, 28 Jun 2005 15:47:03 +0200

> linux/etherdevice.h can't be included standalone at the moment, which
> is required in order to sort the header files in the recommended
> alphabetic order. This patch fixes that and is needed to build spider_net.
>
> Signed-off-by: Arnd Bergmann

Applied, thanks a lot Arnd.

From frowand at mvista.com Wed Jun 29 09:48:04 2005
From: frowand at mvista.com (Frank Rowand)
Date: Tue, 28 Jun 2005 16:48:04 -0700
Subject: [PATCH] ppc64: change duplicate Kconfig menu "General setup" to "Bus Options"
Message-ID: <20050628234804.GB18004@mossi.mvista.com>

arch/ppc64/Kconfig defines a "General setup" menu, but also sources
init/Kconfig which also defines a "General setup" menu. Both of these
menus appear at the top level of make menuconfig. Having two menus with
the same name is confusing.

This patch renames the ppc64/Kconfig menu to be "Bus Options" and moves
options in this menu which are not bus related to the end of the
"Platform support" menu.

There are many variations among architectures on the exact naming of the
"Bus Options" menu. I chose to use the simplest one, which is also used
in arch/ppc/Kconfig.

If this patch looks good, please ack and forward.

This patch applies against linux-2.6.12

Signed-off-by: Frank Rowand

Index: linux-2.6.12/arch/ppc64/Kconfig
===================================================================
--- linux-2.6.12.orig/arch/ppc64/Kconfig
+++ linux-2.6.12/arch/ppc64/Kconfig
@@ -296,13 +296,46 @@ config SECCOMP

 	  If unsure, say Y. Only embedded should say N here.
+source "fs/Kconfig.binfmt"
+
+config HOTPLUG_CPU
+	bool "Support for hot-pluggable CPUs"
+	depends on SMP && EXPERIMENTAL && (PPC_PSERIES || PPC_PMAC)
+	select HOTPLUG
+	---help---
+	  Say Y here to be able to turn CPUs off and on.
+
+	  Say N if you are unsure.
+
+config PROC_DEVICETREE
+	bool "Support for Open Firmware device tree in /proc"
+	depends on !PPC_ISERIES
+	help
+	  This option adds a device-tree directory under /proc which contains
+	  an image of the device tree that the kernel copies from Open
+	  Firmware. If unsure, say Y here.
+
+config CMDLINE_BOOL
+	bool "Default bootloader kernel arguments"
+	depends on !PPC_ISERIES
+
+config CMDLINE
+	string "Initial kernel command string"
+	depends on CMDLINE_BOOL
+	default "console=ttyS0,9600 console=tty0 root=/dev/sda2"
+	help
+	  On some platforms, there is currently no way for the boot loader to
+	  pass arguments to the kernel. For these platforms, you can supply
+	  some command-line options at build time by entering them here. In
+	  most cases you will need to specify the root device here.
+
 endmenu

 config ISA_DMA_API
 	bool
 	default y

-menu "General setup"
+menu "Bus Options"

 config ISA
 	bool
@@ -335,45 +368,12 @@ config PCI_DOMAINS
 	bool
 	default PCI

-source "fs/Kconfig.binfmt"
-
 source "drivers/pci/Kconfig"

-config HOTPLUG_CPU
-	bool "Support for hot-pluggable CPUs"
-	depends on SMP && EXPERIMENTAL && (PPC_PSERIES || PPC_PMAC)
-	select HOTPLUG
-	---help---
-	  Say Y here to be able to turn CPUs off and on.
-
-	  Say N if you are unsure.
-
 source "drivers/pcmcia/Kconfig"

 source "drivers/pci/hotplug/Kconfig"

-config PROC_DEVICETREE
-	bool "Support for Open Firmware device tree in /proc"
-	depends on !PPC_ISERIES
-	help
-	  This option adds a device-tree directory under /proc which contains
-	  an image of the device tree that the kernel copies from Open
-	  Firmware. If unsure, say Y here.
- -config CMDLINE_BOOL - bool "Default bootloader kernel arguments" - depends on !PPC_ISERIES - -config CMDLINE - string "Initial kernel command string" - depends on CMDLINE_BOOL - default "console=ttyS0,9600 console=tty0 root=/dev/sda2" - help - On some platforms, there is currently no way for the boot loader to - pass arguments to the kernel. For these platforms, you can supply - some command-line options at build time by entering them here. In - most cases you will need to specify the root device here. - endmenu source "drivers/Kconfig" From linas at austin.ibm.com Wed Jun 29 09:54:01 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:54:01 -0500 Subject: [PATCH 0/13] PCI Error Recovery Overview Message-ID: <20050628235401.GA6272@austin.ibm.com> Hi, The next 13 patches implement PCI error recovery along the lines of earlier discussions. It's broken out into little pieces for easy digestibility. These should apply cleanly against kernel-2.6.12-git10. Details of what this is, and how it works, are in a documentation file, part way down the patch.
These patches implement "native" error recovery for five devices: -- the e100, e1000 and ixgb network cards -- the ipr and sym53c8xx_2 scsi device drivers [PATCH 1/13]: PCI Err: pci.h header file changes [PATCH 2/13]: PCI Err: Overview Documentation [PATCH 3/13]: PCI Err: IPR scsi device driver recovery [PATCH 4/13]: PCI Err: e100 ethernet driver recovery [PATCH 5/13]: PCI Err: e1000 ethernet driver recovery [PATCH 6/13]: PCI Err: ixgb ethernet driver recovery [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery [PATCH 8/13]: PCI Err: Event delivery utility [PATCH 9/13]: PCI Err: Whitespace janitoring [PATCH 10/13]: PCI Err: PPC64-specific recovery infrastructure [PATCH 11/13]: PCI Err: RPA-PHP janitoring [PATCH 12/13]: PCI Err: RPA-PHP clarification [PATCH 13/13]: PCI Err: RPA-PHP-specific error recovery driver --linas From linas at austin.ibm.com Wed Jun 29 09:58:17 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:58:17 -0500 Subject: [PATCH 1/13]: PCI Err: pci.h header file changes Message-ID: <20050628235817.GA6324@austin.ibm.com> pci-err-1-pci.h.patch This patch adds PCI error recovery callbacks, error state and error return codes to include/linux/pci.h. These are closely described in the next patch, a documentation file. Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/include/linux/pci.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/include/linux/pci.h 2005-06-24 14:44:59.000000000 -0500 @@ -660,6 +660,37 @@ struct pci_dynids { unsigned int use_driver_data:1; /* pci_driver->driver_data is used */ }; +/* ---------------------------------------------------------------- */ +/** PCI error recovery infrastructure. If a PCI device driver provides + * a set of callbacks in struct pci_error_handlers, then that device driver + * will be notified of PCI bus errors, and can be driven to recovery.
+ */ + +enum pci_channel_state { + pci_channel_io_normal = 0, /* I/O channel is in normal state */ + pci_channel_io_frozen = 1, /* I/O to channel is blocked */ + pci_channel_io_perm_failure, /* pci card is dead */ +}; + +enum pcierr_result { + PCIERR_RESULT_NONE=0, /* no result/none/not supported in device driver */ + PCIERR_RESULT_CAN_RECOVER=1, /* Device driver can recover without slot reset */ + PCIERR_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */ + PCIERR_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */ + PCIERR_RESULT_RECOVERED, /* Device driver is fully recovered and operational */ +}; + +/* PCI bus error event callbacks */ +struct pci_error_handlers +{ + enum pci_channel_state error_state; /* current error state */ + int (*error_detected)(struct pci_dev *dev, enum pci_channel_state error); + int (*mmio_enabled)(struct pci_dev *dev); /* MMIO has been re-enabled, but not DMA */ + int (*link_reset)(struct pci_dev *dev); /* PCI Express link has been reset */ + int (*slot_reset)(struct pci_dev *dev); /* PCI slot has been reset */ + void (*resume)(struct pci_dev *dev); /* Device driver may resume normal operations */ +}; + struct module; struct pci_driver { struct list_head node; @@ -673,6 +704,7 @@ struct pci_driver { int (*enable_wake) (struct pci_dev *dev, pci_power_t state, int enable); /* Enable wake event */ void (*shutdown) (struct pci_dev *dev); + struct pci_error_handlers err_handler; struct device_driver driver; struct pci_dynids dynids; }; From linas at austin.ibm.com Wed Jun 29 09:58:30 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:58:30 -0500 Subject: [PATCH 2/13]: PCI Err: Overview Documentation Message-ID: <20050628235830.GA6337@austin.ibm.com> pci-err-2-docs.patch Documentation that provides an overview of PCI error recovery. Based on a note from BenH several months back, with minor editing to harmonize function names with code.
Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/Documentation/pci-error-recovery.txt.linas-orig 2005-06-22 15:28:15.000000000 -0500 +++ linux-2.6.12-git10/Documentation/pci-error-recovery.txt 2005-06-22 15:28:29.000000000 -0500 @@ -0,0 +1,242 @@ + + PCI Error Recovery + ------------------ + May 31, 2005 + + +Some PCI bus controllers are able to detect certain "hard" PCI errors +on the bus, such as parity errors on the data and address busses, as +well as SERR and PERR errors. These chipsets are then able to disable +I/O to/from the affected device, so that, for example, a bad DMA +address doesn't end up corrupting system memory. These same chipsets +are also able to reset the affected PCI device, and return it to +working condition. This document describes a generic API for +performing error recovery. + +The core idea is that after a PCI error has been detected, there must +be a way for the kernel to coordinate with all affected device drivers +so that the pci card can be made operational again, possibly after +performing a full electrical #RST of the PCI card. The API below +provides a generic API for device drivers to be notified of PCI +errors, and to be notified of, and respond to, a reset sequence. + +Preliminary sketch of API, cut-n-pasted-n-modified email from +Ben Herrenschmidt, circa 5 April 2005 + +The error recovery API support is exposed to the driver in the form of +a structure of function pointers pointed to by a new field in struct +pci_driver. The absence of this pointer in pci_driver denotes a +"non-aware" driver; behaviour on these is platform dependent. +Platforms like ppc64 can try to simulate pci hotplug remove/add. + +The definition of "pci_error_token" is not covered here. It is based on +Seto's work on the synchronous error detection. We still need to define +functions for extracting information out of an opaque error token. This is +separate from this API.
+ +This structure has the form: + +struct pci_error_handlers +{ + int (*error_detected)(struct pci_dev *dev, pci_error_token error); + int (*mmio_enabled)(struct pci_dev *dev); + int (*resume)(struct pci_dev *dev); + int (*link_reset)(struct pci_dev *dev); + int (*slot_reset)(struct pci_dev *dev); +}; + +A driver doesn't have to implement all of these callbacks. The +only mandatory one is error_detected(). If a callback is not +implemented, the corresponding feature is considered unsupported. +For example, if mmio_enabled() and resume() aren't there, then the +driver is assumed as not doing any direct recovery and requires +a reset. If link_reset() is not implemented, the card is assumed as +not caring about link resets, in which case, if recovery is supported, +the core can try recovery (but not slot_reset() unless it really did +reset the slot). If slot_reset() is not supported, link_reset() can +be called instead on a slot reset. + +At first, the call will always be : + + 1) error_detected() + + Error detected. This is sent once after an error has been detected. At +this point, the device might not be accessible anymore depending on the +platform (the slot will be isolated on ppc64). The driver may already +have "noticed" the error because of a failing IO, but this is the proper +"synchronisation point", that is, it gives a chance to the driver to +clean up, waiting for pending stuff (timers, whatever, etc...) to +complete; it can take semaphores, schedule, etc... everything but touch +the device. Within this function and after it returns, the driver +shouldn't do any new IOs. Called in task context. This is sort of a +"quiesce" point. See note about interrupts at the end of this doc. + + Result codes: + - PCIERR_RESULT_CAN_RECOVER: + Driver returns this if it thinks it might be able to recover + the HW by just banging IOs or if it wants to be given + a chance to extract some diagnostic information (see + below).
+ - PCIERR_RESULT_NEED_RESET: + Driver returns this if it thinks it can't recover unless the + slot is reset. + - PCIERR_RESULT_DISCONNECT: + Return this if driver thinks it won't recover at all, + (this will detach the driver ? or just leave it + dangling ? to be decided) + +So at this point, we have called error_detected() for all drivers +on the segment that had the error. On ppc64, the slot is isolated. What +happens now typically depends on the result from the drivers. If all +drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would +re-enable IOs on the slot (or do nothing special if the platform doesn't +isolate slots) and call 2). If not and we can reset slots, we go to 4), +if neither, we have a dead slot. If it's a hotplug slot, we might +"simulate" reset by triggering HW unplug/replug though. + +>>> Current ppc64 implementation assumes that a device driver will +>>> *not* schedule or semaphore in this routine; the current ppc64 +>>> implementation uses one kernel thread to notify all devices; +>>> thus, if one device sleeps/schedules, all devices are affected. +>>> Doing better requires complex multi-threaded logic in the error +>>> recovery implementation (e.g. waiting for all notification threads +>>> to "join" before proceeding with recovery.) This seems excessively +>>> complex and not worth implementing. + +>>> The current ppc64 implementation doesn't much care if the device +>>> attempts i/o at this point, or not. I/O's will fail, returning +>>> a value of 0xff on read, and writes will be dropped. If the device +>>> driver attempts more than 10K I/O's to a frozen adapter, it will +>>> assume that the device driver has gone into an infinite loop, and +>>> it will panic the kernel. + + 2) mmio_enabled() + + This is the "early recovery" call. IOs are allowed again, but DMA is +not (hrm... to be discussed, I prefer not), with some restrictions.
This +is NOT a callback for the driver to start operations again, only to +peek/poke at the device, extract diagnostic information, if any, and +eventually do things like trigger a device local reset or some such, +but not restart operations. This is sent if all drivers on a segment +agree that they can try to recover and no automatic link reset was +performed by the HW. If the platform can't just re-enable IOs without +a slot reset or a link reset, it doesn't call this callback and goes +directly to 3) or 4). All IOs should be done _synchronously_ from +within this callback, errors triggered by them will be returned via +the normal pci_check_whatever() api, no new error_detected() callback +will be issued due to an error happening here. However, such an error +might cause IOs to be re-blocked for the whole segment, and thus +invalidate the recovery that other devices on the same segment might +have done, forcing the whole segment into one of the next states, +that is link reset or slot reset. + + Result codes: + - PCIERR_RESULT_RECOVERED + Driver returns this if it thinks the device is fully + functional and thinks it is ready to start + normal driver operations again. There is no + guarantee that the driver will actually be + allowed to proceed, as another driver on the + same segment might have failed and thus triggered a + slot reset on platforms that support it. + + - PCIERR_RESULT_NEED_RESET + Driver returns this if it thinks the device is not + recoverable in its current state and it needs a slot + reset to proceed. + + - PCIERR_RESULT_DISCONNECT + Same as above. Total failure, no recovery even after + reset; driver dead. (To be defined more precisely) + +>>> The current ppc64 implementation does not implement this callback. + + 3) link_reset() + + This is called after the link has been reset. This is typically +a PCI Express specific state at this point and is done whenever a +non-fatal error has been detected that can be "solved" by resetting +the link.
This call informs the driver of the reset and the driver +should check if the device appears to be in working condition. +This function acts a bit like 2) mmio_enabled(), in that the driver +is not supposed to restart normal driver I/O operations right away. +Instead, it should just "probe" the device to check its recoverability +status. If all is right, then the core will call resume() once all +drivers have ack'd link_reset(). + + Result codes: + (identical to mmio_enabled) + +>>> The current ppc64 implementation does not implement this callback. + + 4) slot_reset() + + This is called after the slot has been soft or hard reset by the +platform. A soft reset consists of asserting the adapter #RST line +and then restoring the PCI BARs and PCI configuration header. If the +platform supports PCI hotplug, then it might instead perform a hard +reset by toggling power on the slot off/on. This call gives drivers +the chance to re-initialize the hardware (re-download firmware, etc.), +but drivers shouldn't restart normal I/O processing operations at +this point. (See note about interrupts; interrupts aren't guaranteed +to be delivered until the resume() callback has been called). If all +device drivers report success on this callback, the platform will call +resume() to complete the error handling and let the driver restart +normal I/O processing. + +A driver can still return a critical failure for this function if +it can't get the device operational after reset. If the platform +previously tried a soft reset, it might now try a hard reset (power +cycle) and then call slot_reset() again. If the device still can't +be recovered, there is nothing more that can be done; the platform +will typically report a "permanent failure" in such a case. The +device will be considered "dead" in this case. + + Result codes: + - PCIERR_RESULT_DISCONNECT + Same as above.
+ +>>> The current ppc64 implementation does not try a power-cycle reset +>>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should. + + 5) resume() + + This is called if all drivers on the segment have returned +PCIERR_RESULT_RECOVERED from one of the 3 previous callbacks. +That basically tells the driver to restart activity, that everything +is back and running. No result code is taken into account here. If +a new error happens, it will restart a new error handling process. + +That's it. I think this covers all the possibilities. The way those +callbacks are called is platform policy. A platform with no slot reset +capability for example may want to just "ignore" drivers that can't +recover (disconnect them) and try to let other cards on the same segment +recover. Keep in mind that in most real life cases, though, there will +be only one driver per segment. + +Now, there is a note about interrupts. If you get an interrupt and your +device is dead or has been isolated, there is a problem :) + +After much thinking, I decided to leave that to the platform. That is, +the recovery API only specifies that: + + - There is no guarantee that interrupt delivery can proceed from any +device on the segment starting from the error detection and until the +restart callback is sent, at which point interrupts are expected to be +fully operational. + + - There is no guarantee that interrupt delivery is stopped, that is, a +driver that gets an interrupt after detecting an error, or that detects +an error within the interrupt handler such that it prevents proper +ack'ing of the interrupt (and thus removal of the source) should just +return IRQ_NOTHANDLED. It's up to the platform to deal with that +condition, typically by masking the irq source during the duration of +the error handling.
It is expected that the platform "knows" which +interrupts are routed to error-management capable slots and can deal +with temporarily disabling that irq number during error processing (this +isn't terribly complex). That means some IRQ latency for other devices +sharing the interrupt, but there is simply no other way. High end +platforms aren't supposed to share interrupts between many devices +anyway :) + + From linas at austin.ibm.com Wed Jun 29 09:58:39 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:58:39 -0500 Subject: [PATCH 3/13]: PCI Err: IPR scsi device driver recovery Message-ID: <20050628235839.GA6362@austin.ibm.com> pci-err-3-ipr.patch Adds PCI error recovery callbacks to the IPR SCSI controller driver. Tested, seems to work well, a variant of this ships already in the Novell/SUSE SLES9 SP2 kernel. Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/scsi/ipr.c.linas-orig 2005-06-22 15:26:14.000000000 -0500 +++ linux-2.6.12-git10/drivers/scsi/ipr.c 2005-06-22 17:05:14.000000000 -0500 @@ -5326,6 +5326,88 @@ static void ipr_initiate_ioa_reset(struc shutdown_type); } +#ifdef CONFIG_SCSI_IPR_EEH_RECOVERY + +/** If the PCI slot is frozen, hold off all i/o + * activity; then, as soon as the slot is available again, + * initiate an adapter reset. + */ +static int ipr_reset_freeze(struct ipr_cmnd *ipr_cmd) +{ + list_add_tail(&ipr_cmd->queue, &ipr_cmd->ioa_cfg->pending_q); + ipr_cmd->done = ipr_reset_ioa_job; + return IPR_RC_JOB_RETURN; +} + +/** ipr_eeh_frozen -- called when slot has experience PCI bus error. + * This routine is called to tell us that the PCI bus is down. + * Can't do anything here, except put the device driver into a + * holding pattern, waiting for the PCI bus to come back. 
+ */ +static void ipr_eeh_frozen (struct pci_dev *pdev) +{ + unsigned long flags = 0; + struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev); + + spin_lock_irqsave(ioa_cfg->host->host_lock, flags); + _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_freeze, IPR_SHUTDOWN_NONE); + spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags); +} + +/** ipr_eeh_slot_reset - called when pci slot has been reset. + * + * This routine is called by the pci error recovery recovery + * code after the PCI slot has been reset, just before we + * should resume normal operations. + */ +static int ipr_eeh_slot_reset (struct pci_dev *pdev) +{ + unsigned long flags = 0; + struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev); + + pci_enable_device(pdev); + pci_set_master(pdev); + enable_irq (pdev->irq); + spin_lock_irqsave(ioa_cfg->host->host_lock, flags); + _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_restore_cfg_space, + IPR_SHUTDOWN_NONE); + spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags); + + return PCIERR_RESULT_RECOVERED; +} + +/** This routine is called when the PCI bus has permanently + * failed. This routine should purge all pending I/O and + * shut down the device driver (close and unload). + * XXX Needs to be implemented. + */ +static void ipr_eeh_perm_failure (struct pci_dev *pdev) +{ +#if 0 // XXXXXXXXXXXXXXXXXXXXXXX + ipr_cmd->job_step = ipr_reset_shutdown_ioa; + rc = IPR_RC_JOB_CONTINUE; +#endif +} + +static int ipr_eeh_error_detected (struct pci_dev *pdev, + enum pci_channel_state state) +{ + switch (state) { + case pci_channel_io_frozen: + ipr_eeh_frozen (pdev); + return PCIERR_RESULT_NEED_RESET; + + case pci_channel_io_perm_failure: + ipr_eeh_perm_failure (pdev); + return PCIERR_RESULT_DISCONNECT; + break; + default: + break; + } + return PCIERR_RESULT_NEED_RESET; +} +#endif + /** * ipr_probe_ioa_part2 - Initializes IOAs found in ipr_probe_ioa(..) 
* @ioa_cfg: ioa cfg struct @@ -6068,6 +6150,10 @@ static struct pci_driver ipr_driver = { .id_table = ipr_pci_table, .probe = ipr_probe, .remove = ipr_remove, + .err_handler = { + .error_detected = ipr_eeh_error_detected, + .slot_reset = ipr_eeh_slot_reset, + }, .driver = { .shutdown = ipr_shutdown, }, --- linux-2.6.12-git10/drivers/scsi/Kconfig.linas-orig 2005-06-22 15:26:14.000000000 -0500 +++ linux-2.6.12-git10/drivers/scsi/Kconfig 2005-06-22 15:28:29.000000000 -0500 @@ -1065,6 +1065,14 @@ config SCSI_IPR_DUMP If you enable this support, the iprdump daemon can be used to capture adapter failure analysis information. +config SCSI_IPR_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on SCSI_IPR && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config SCSI_ZALON tristate "Zalon SCSI support" depends on GSC && SCSI --- linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig 2005-06-22 15:30:33.000000000 -0500 @@ -314,6 +314,7 @@ CONFIG_SCSI_IPR CONFIG_SCSI_IPR=y CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y +CONFIG_SCSI_IPR_EEH_RECOVERY=y # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set CONFIG_SCSI_QLA2XXX=y From linas at austin.ibm.com Wed Jun 29 09:58:48 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:58:48 -0500 Subject: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery Message-ID: <20050628235848.GA6376@austin.ibm.com> pci-err-4-e100.patch Adds PCI error recovery callbacks to the Intel E100 ethernet device driver. Lightly tested on an E100 two-port card. 
Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/net/e100.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/net/e100.c 2005-06-22 17:18:26.000000000 -0500 @@ -2460,6 +2460,67 @@ static void e100_shutdown(struct device #endif } +#ifdef CONFIG_E100_EEH_RECOVERY + +/** e100_io_error_detected() is called when PCI error is detected */ +static int e100_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct nic *nic = netdev_priv(netdev); + + mod_timer(&nic->watchdog, jiffies + 30*HZ); + e100_down(nic); + + /* Request a slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** e100_io_slot_reset is called after the pci bus has been reset. + * Restart the card from scratch. */ +static int e100_io_slot_reset (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct nic *nic = netdev_priv(netdev); + + if(pci_enable_device(pdev)) { + printk(KERN_ERR "e100: Cannot re-enable PCI device after reset.\n"); + return PCIERR_RESULT_DISCONNECT; + } + pci_set_master(pdev); + + /* Only one device per card can do a reset */ + if (0 != PCI_FUNC (pdev->devfn)) + return PCIERR_RESULT_RECOVERED; + + e100_hw_reset(nic); + e100_phy_init(nic); + + if(e100_hw_init(nic)) { + DPRINTK(HW, ERR, "e100_hw_init failed\n"); + return PCIERR_RESULT_DISCONNECT; + } + + return PCIERR_RESULT_RECOVERED; +} + +/** e100_io_resume is called when the error recovery driver + * tells us that its OK to resume normal operation. 
+ */ +static void e100_io_resume (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct nic *nic = netdev_priv(netdev); + + /* ack any pending wake events, disable PME */ + pci_enable_wake(pdev, 0, 0); + + netif_device_attach(netdev); + if(netif_running(netdev)) + e100_open (netdev); + + mod_timer(&nic->watchdog, jiffies); +} +#endif /* CONFIG_E100_EEH_RECOVERY */ static struct pci_driver e100_driver = { .name = DRV_NAME, @@ -2470,6 +2531,13 @@ static struct pci_driver e100_driver = { .suspend = e100_suspend, .resume = e100_resume, #endif +#ifdef CONFIG_E100_EEH_RECOVERY + .err_handler = { + .error_detected = e100_io_error_detected, + .slot_reset = e100_io_slot_reset, + .resume = e100_io_resume, + }, +#endif /* CONFIG_E100_EEH_RECOVERY */ .driver = { .shutdown = e100_shutdown, --- linux-2.6.12-git10/drivers/net/Kconfig.linas-orig 2005-06-22 15:26:13.000000000 -0500 +++ linux-2.6.12-git10/drivers/net/Kconfig 2005-06-22 15:28:29.000000000 -0500 @@ -1392,6 +1392,14 @@ config E100 . The module will be called e100. +config E100_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on E100 && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. 
+ config LNE390 tristate "Mylex EISA LNE390A/B support (EXPERIMENTAL)" depends on NET_PCI && EISA && EXPERIMENTAL --- linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig 2005-06-22 15:30:33.000000000 -0500 @@ -545,6 +545,7 @@ CONFIG_PCNET32=y # CONFIG_DGRS is not set # CONFIG_EEPRO100 is not set CONFIG_E100=y +CONFIG_E100_EEH_RECOVERY=y # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set # CONFIG_NE2K_PCI is not set From linas at austin.ibm.com Wed Jun 29 09:58:55 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:58:55 -0500 Subject: [PATCH 5/13]: PCI Err: e1000 ethernet driver recovery Message-ID: <20050628235855.GA6389@austin.ibm.com> pci-err-5-e1000.patch Provide PCI Error recovery callbacks for the Intel gigabit E1000 ethernet driver. Tested with 2-port card. This patch shares some common logic with the power-management shutdown and resume code; one could split up the power managment code so that both this and that were even more similar, but there didn't seem to be much to gain from doing this. 
Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/net/e1000/e1000_main.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/net/e1000/e1000_main.c 2005-06-22 17:02:17.000000000 -0500 @@ -173,6 +173,12 @@ static int e1000_resume(struct pci_dev * static void e1000_netpoll (struct net_device *netdev); #endif +#ifdef CONFIG_E1000_EEH_RECOVERY +static int e1000_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state); +static int e1000_io_slot_reset (struct pci_dev *pdev); +static void e1000_io_resume (struct pci_dev *pdev); +#endif /* CONFIG_E1000_EEH_RECOVERY */ + struct notifier_block e1000_notifier_reboot = { .notifier_call = e1000_notify_reboot, .next = NULL, @@ -193,6 +199,14 @@ static struct pci_driver e1000_driver = .suspend = e1000_suspend, .resume = e1000_resume #endif +#ifdef CONFIG_E1000_EEH_RECOVERY + .err_handler = { + .error_detected = e1000_io_error_detected, + .slot_reset = e1000_io_slot_reset, + .resume = e1000_io_resume, + }, +#endif /* CONFIG_E1000_EEH_RECOVERY */ + }; MODULE_AUTHOR("Intel Corporation, "); @@ -3820,4 +3834,91 @@ e1000_netpoll(struct net_device *netdev) } #endif +#ifdef CONFIG_E1000_EEH_RECOVERY + +/** e1000_io_error_detected() is called when PCI error is detected */ +static int e1000_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct e1000_adapter *adapter = netdev->priv; + + mod_timer(&adapter->watchdog_timer, jiffies + 20 * HZ); + if(netif_running(netdev)) + e1000_down(adapter); + + /* Request a slot slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** e1000_io_slot_reset is called after the pci bus has been reset. + * Restart the card from scratch. + * Implementation resembles the first-half of the + * e1000_resume routine. 
+ */ +static int e1000_io_slot_reset (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct e1000_adapter *adapter = netdev->priv; + + if(pci_enable_device(pdev)) { + printk(KERN_ERR "e1000: Cannot re-enable PCI device after reset.\n"); + return PCIERR_RESULT_DISCONNECT; + } + pci_set_master(pdev); + + pci_enable_wake(pdev, 3, 0); + pci_enable_wake(pdev, 4, 0); /* 4 == D3 cold */ + + /* Perform card reset only on one instance of the card */ + if (0 != PCI_FUNC (pdev->devfn)) + return PCIERR_RESULT_RECOVERED; + + e1000_reset(adapter); + E1000_WRITE_REG(&adapter->hw, WUS, ~0); + + return PCIERR_RESULT_RECOVERED; +} + +/** e1000_io_resume is called when the error recovery driver + * tells us that its OK to resume normal operation. + * Implementation resembles the second-half of the + * e1000_resume routine. + */ +static void e1000_io_resume (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct e1000_adapter *adapter = netdev->priv; + uint32_t manc, swsm; + + if(netif_running(netdev)) { + if(e1000_up(adapter)) { + printk ("e1000: can't bring device back up after reset\n"); + return; + } + } + + netif_device_attach(netdev); + + if(adapter->hw.mac_type >= e1000_82540 && + adapter->hw.media_type == e1000_media_type_copper) { + manc = E1000_READ_REG(&adapter->hw, MANC); + manc &= ~(E1000_MANC_ARP_EN); + E1000_WRITE_REG(&adapter->hw, MANC, manc); + } + + switch(adapter->hw.mac_type) { + case e1000_82573: + swsm = E1000_READ_REG(&adapter->hw, SWSM); + E1000_WRITE_REG(&adapter->hw, SWSM, + swsm | E1000_SWSM_DRV_LOAD); + break; + default: + break; + } + + mod_timer(&adapter->watchdog_timer, jiffies); +} + +#endif /* CONFIG_E1000_EEH_RECOVERY */ + /* e1000_main.c */ --- linux-2.6.12-git10/drivers/net/Kconfig.linas-orig 2005-06-22 15:26:13.000000000 -0500 +++ linux-2.6.12-git10/drivers/net/Kconfig 2005-06-22 15:28:29.000000000 -0500 @@ -1847,6 +1847,14 @@ config E1000_NAPI If in doubt, say N. 
+config E1000_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on E1000 && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config MYRI_SBUS tristate "MyriCOM Gigabit Ethernet support" depends on SBUS --- linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig 2005-06-22 15:30:33.000000000 -0500 @@ -564,6 +564,7 @@ CONFIG_ACENIC_OMIT_TIGON_I=y # CONFIG_DL2K is not set CONFIG_E1000=y # CONFIG_E1000_NAPI is not set +CONFIG_E1000_EEH_RECOVERY=y # CONFIG_NS83820 is not set # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set From linas at austin.ibm.com Wed Jun 29 09:59:08 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:59:08 -0500 Subject: [PATCH 6/13]: PCI Err: ixgb ethernet driver recovery Message-ID: <20050628235908.GA6402@austin.ibm.com> pci-err-6-ixgb.patch Adds PCI Error recovery callbacks to the Intel 10-gigabit ethernet ixgb device driver. Lightly tested, works. 
Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/net/ixgb/ixgb_main.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/net/ixgb/ixgb_main.c 2005-06-27 15:57:48.000000000 -0500 @@ -128,6 +128,12 @@ static void ixgb_restore_vlan(struct ixg static void ixgb_netpoll(struct net_device *dev); #endif +#ifdef CONFIG_IXGB_EEH_RECOVERY +static int ixgb_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state); +static int ixgb_io_slot_reset (struct pci_dev *pdev); +static void ixgb_io_resume (struct pci_dev *pdev); +#endif /* CONFIG_IXGB_EEH_RECOVERY */ + /* Exported from other modules */ extern void ixgb_check_options(struct ixgb_adapter *adapter); @@ -137,6 +143,14 @@ static struct pci_driver ixgb_driver = { .id_table = ixgb_pci_tbl, .probe = ixgb_probe, .remove = __devexit_p(ixgb_remove), +#ifdef CONFIG_IXGB_EEH_RECOVERY + .err_handler = { + .error_detected = ixgb_io_error_detected, + .slot_reset = ixgb_io_slot_reset, + .resume = ixgb_io_resume, + }, +#endif /* CONFIG_IXGB_EEH_RECOVERY */ + }; MODULE_AUTHOR("Intel Corporation, "); @@ -2119,4 +2133,67 @@ static void ixgb_netpoll(struct net_devi } #endif +#ifdef CONFIG_IXGB_EEH_RECOVERY + +/** ixgb_io_error_detected() is called when PCI error is detected */ +static int ixgb_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct ixgb_adapter *adapter = netdev->priv; + + if(netif_running(netdev)) + ixgb_down(adapter, TRUE); + + /* Request a slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** ixgb_io_slot_reset is called after the pci bus has been reset. + * Restart the card from scratch. + * Implementation resembles the first-half of the + * ixgb_resume routine. 
+ */ +static int ixgb_io_slot_reset (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct ixgb_adapter *adapter = netdev->priv; + + if(pci_enable_device(pdev)) { + printk(KERN_ERR "ixgb: Cannot re-enable PCI device after reset.\n"); + return PCIERR_RESULT_DISCONNECT; + } + pci_set_master(pdev); + + + /* Perform card reset only on one instance of the card */ + if (0 != PCI_FUNC (pdev->devfn)) + return PCIERR_RESULT_RECOVERED; + + ixgb_reset(adapter); + + return PCIERR_RESULT_RECOVERED; +} + +/** ixgb_io_resume is called when the error recovery driver + * tells us that its OK to resume normal operation. + * Implementation resembles the second-half of the + * ixgb_resume routine. + */ +static void ixgb_io_resume (struct pci_dev *pdev) +{ + struct net_device *netdev = pci_get_drvdata(pdev); + struct ixgb_adapter *adapter = netdev->priv; + + if(netif_running(netdev)) { + if(ixgb_up(adapter)) { + printk ("ixgb: can't bring device back up after reset\n"); + return; + } + } + + netif_device_attach(netdev); + mod_timer(&adapter->watchdog_timer, jiffies); +} +#endif /* CONFIG_IXGB_EEH_RECOVERY */ + /* ixgb_main.c */ --- linux-2.6.12-git10/drivers/net/Kconfig.linas-orig 2005-06-22 15:26:13.000000000 -0500 +++ linux-2.6.12-git10/drivers/net/Kconfig 2005-06-27 14:41:46.000000000 -0500 @@ -2146,6 +2146,14 @@ config IXGB_NAPI If in doubt, say N. +config IXGB_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on IXGB && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. 
+
 config S2IO
 	tristate "S2IO 10Gbe XFrame NIC"
 	depends on PCI
--- linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig.linas-orig	2005-06-17 14:48:29.000000000 -0500
+++ linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig	2005-06-27 14:43:24.000000000 -0500
@@ -580,6 +580,7 @@ CONFIG_TIGON3=y
 #
 CONFIG_IXGB=m
 # CONFIG_IXGB_NAPI is not set
+CONFIG_IXGB_EEH_RECOVERY=y
 CONFIG_S2IO=m
 # CONFIG_S2IO_NAPI is not set
 # CONFIG_2BUFF_MODE is not set

From benh at kernel.crashing.org Wed Jun 29 09:53:59 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 29 Jun 2005 09:53:59 +1000
Subject: [PATCH] ppc/ppc64: Fix pci mmap via sysfs
In-Reply-To: <20050628075140.GF3577@kroah.com>
References: <1119836190.5133.59.camel@gaston> <20050626185727.0ce92772.akpm@osdl.org> <1119838264.5133.76.camel@gaston> <20050628075140.GF3577@kroah.com>
Message-ID: <1120002840.5133.216.camel@gaston>

> > This implements the change to /proc and sysfs PCI mmap functions that we
> > discussed a while ago, that is adding an arch optional
> > pci_resource_to_user() to allow munging on the exposed value of PCI
> > resources to userland and thus hiding kernel internal values. It also
> > implements using of that callback to sanitize exposed values on ppc and
> > ppc64, thus fixing mmap of PCI devices via /proc and sysfs.
>
> Hm, did I just send the right one to Linus?

I'll check & send any additional fix that may be necessary (I just got
notified that it breaks iSeries ... :)

Ben.

From linas at austin.ibm.com Wed Jun 29 09:59:19 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 28 Jun 2005 18:59:19 -0500
Subject: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery
Message-ID: <20050628235919.GA6415@austin.ibm.com>

pci-err-7-symbios.patch

Adds PCI Error recovery callbacks to the Symbios Sym53c8xx driver.
Tested, seems to work well under i/o stress to one disk. Not
stress tested under heavy i/o to multiple scsi devices.
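The interrupt-handler change in this patch guards a status-polling loop
against a slot that has been frozen. A minimal standalone sketch of that
idea follows, under stated assumptions: the toy_hw structure, the
read_status() simulation and the 0x03 busy mask are hypothetical, and the
threshold of 100 polls is simply borrowed from the patch. A frozen slot is
modelled as reading back all ones, so the busy bits never clear.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical channel states, loosely following pci_channel_state. */
enum channel_state { CHANNEL_NORMAL, CHANNEL_FROZEN };

#define STATUS_BUSY       0x03u  /* hypothetical "interrupt pending" bits */
#define POLL_CHECK_AFTER  100u   /* only consult the state after this many polls */

/* Toy hardware: 'pending' is how many more reads report busy. */
struct toy_hw {
	enum channel_state state;
	unsigned pending;
};

/* Simulated register read. A frozen slot returns all ones forever. */
static unsigned read_status(struct toy_hw *hw)
{
	if (hw->state != CHANNEL_NORMAL)
		return 0xffu;
	if (hw->pending > 0) {
		hw->pending--;
		return STATUS_BUSY;
	}
	return 0;
}

/* Poll until the busy bits clear. Returns 0 once they clear, or -1 if the
 * loop was abandoned because the channel is no longer in the normal state. */
int drain_status(struct toy_hw *hw)
{
	unsigned polls = 0;

	assert(hw != NULL);

	while (read_status(hw) & STATUS_BUSY) {
		polls++;
		/* The cheap counter test comes first, so a healthy device
		 * that clears quickly never pays for the state check. */
		if (polls > POLL_CHECK_AFTER && hw->state != CHANNEL_NORMAL)
			return -1;
	}
	return 0;
}
```

A healthy device that stays busy for longer than the threshold still
completes normally; only the combination of "many polls" and "channel not
normal" breaks out of the loop.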
Note the check of the pci error state flag inside an infinite loop inside the interrupt handler. Without this check, the device can spin forever, locking up hard, long before the asynchronous error event (and callbacks) are ever called. Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/scsi/sym53c8xx_2/sym_glue.c.linas-orig 2005-06-22 15:26:17.000000000 -0500 +++ linux-2.6.12-git10/drivers/scsi/sym53c8xx_2/sym_glue.c 2005-06-22 17:17:00.000000000 -0500 @@ -685,6 +685,10 @@ static irqreturn_t sym53c8xx_intr(int ir struct sym_hcb *np = (struct sym_hcb *)dev_id; if (DEBUG_FLAGS & DEBUG_TINY) printf_debug ("["); +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + if (np->s.io_state != pci_channel_io_normal) + return IRQ_HANDLED; +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ spin_lock_irqsave(np->s.host->host_lock, flags); sym_interrupt(np); @@ -759,6 +763,27 @@ static void sym_eh_done(struct scsi_cmnd */ static void sym_eh_timeout(u_long p) { __sym_eh_done((struct scsi_cmnd *)p, 1); } +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY +static void sym_eeh_timeout(u_long p) +{ + struct sym_eh_wait *ep = (struct sym_eh_wait *) p; + if (!ep) + return; + complete(&ep->done); +} + +static void sym_eeh_done(struct sym_eh_wait *ep) +{ + if (!ep) + return; + ep->timed_out = 0; + if (!del_timer(&ep->timer)) + return; + + complete(&ep->done); +} +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + /* * Generic method for our eh processing. * The 'op' argument tells what we have to do. @@ -799,6 +824,37 @@ prepare: /* Try to proceed the operation we have been asked for */ sts = -1; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + + /* We may be in an error condition because the PCI bus + * went down. In this case, we need to wait until the + * PCI bus is reset, the card is reset, and only then + * proceed with the scsi error recovery. We'll wait + * for 15 seconds for this to happen. 
+ */ +#define WAIT_FOR_PCI_RECOVERY 15 + if (np->s.io_state != pci_channel_io_normal) { + struct sym_eh_wait eeh, *eep = &eeh; + np->s.io_reset_wait = eep; + init_completion(&eep->done); + init_timer(&eep->timer); + eep->to_do = SYM_EH_DO_WAIT; + eep->timer.expires = jiffies + (WAIT_FOR_PCI_RECOVERY*HZ); + eep->timer.function = sym_eeh_timeout; + eep->timer.data = (u_long)eep; + eep->timed_out = 1; /* Be pessimistic for once :) */ + add_timer(&eep->timer); + spin_unlock_irq(np->s.host->host_lock); + wait_for_completion(&eep->done); + spin_lock_irq(np->s.host->host_lock); + if (eep->timed_out) { + printk (KERN_ERR "%s: Timed out waiting for PCI reset\n", + sym_name(np)); + } + np->s.io_reset_wait = NULL; + } +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + switch(op) { case SYM_EH_ABORT: sts = sym_abort_scsiio(np, cmd, 1); @@ -1584,6 +1640,10 @@ static struct Scsi_Host * __devinit sym_ np->maxoffs = dev->chip.offset_max; np->maxburst = dev->chip.burst_max; np->myaddr = dev->host_id; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + np->s.io_state = pci_channel_io_normal; + np->s.io_reset_wait = NULL; +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ /* * Edit its name. @@ -1916,6 +1976,59 @@ static int sym_detach(struct sym_hcb *np return 1; } +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY +/** sym2_io_error_detected() is called when PCI error is detected */ +static int sym2_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + np->s.io_state = state; + // XXX If slot is permanently frozen, then what? + // Should we scsi_remove_host() maybe ?? + + /* Request a slot slot reset. */ + return PCIERR_RESULT_NEED_RESET; +} + +/** sym2_io_slot_reset is called when the pci bus has been reset. + * Restart the card from scratch. 
*/ +static int sym2_io_slot_reset (struct pci_dev *pdev) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + printk (KERN_INFO "%s: recovering from a PCI slot reset\n", + sym_name(np)); + + if (pci_enable_device(pdev)) + printk (KERN_ERR "%s: device setup failed most egregiously\n", + sym_name(np)); + + pci_set_master(pdev); + enable_irq (pdev->irq); + + /* Perform host reset only on one instance of the card */ + if (0 == PCI_FUNC (pdev->devfn)) + sym_reset_scsi_bus(np, 0); + + return PCIERR_RESULT_RECOVERED; +} + +/** sym2_io_resume is called when the error recovery driver + * tells us that its OK to resume normal operation. + */ +static void sym2_io_resume (struct pci_dev *pdev) +{ + struct sym_hcb *np = pci_get_drvdata(pdev); + + /* Perform device startup only once for this card. */ + if (0 == PCI_FUNC (pdev->devfn)) + sym_start_up (np, 1); + + np->s.io_state = pci_channel_io_normal; + sym_eeh_done (np->s.io_reset_wait); +} +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ + /* * Driver host template. 
*/ @@ -2174,6 +2287,13 @@ static struct pci_driver sym2_driver = { .id_table = sym2_id_table, .probe = sym2_probe, .remove = __devexit_p(sym2_remove), +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + .err_handler = { + .error_detected = sym2_io_error_detected, + .slot_reset = sym2_io_slot_reset, + .resume = sym2_io_resume, + }, +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ }; static int __init sym2_init(void) --- linux-2.6.12-git10/drivers/scsi/sym53c8xx_2/sym_glue.h.linas-orig 2005-06-22 15:26:17.000000000 -0500 +++ linux-2.6.12-git10/drivers/scsi/sym53c8xx_2/sym_glue.h 2005-06-22 15:28:29.000000000 -0500 @@ -181,6 +181,10 @@ struct sym_shcb { char chip_name[8]; struct pci_dev *device; + /* pci bus i/o state; waiter for clearing of i/o state */ + enum pci_channel_state io_state; + struct sym_eh_wait *io_reset_wait; + struct Scsi_Host *host; void __iomem * ioaddr; /* MMIO kernel io address */ --- linux-2.6.12-git10/drivers/scsi/sym53c8xx_2/sym_hipd.c.linas-orig 2005-06-22 15:26:17.000000000 -0500 +++ linux-2.6.12-git10/drivers/scsi/sym53c8xx_2/sym_hipd.c 2005-06-22 15:28:29.000000000 -0500 @@ -2806,6 +2806,7 @@ void sym_interrupt (struct sym_hcb *np) u_char istat, istatc; u_char dstat; u_short sist; + u_int icnt; /* * interrupt on the fly ? @@ -2847,6 +2848,7 @@ void sym_interrupt (struct sym_hcb *np) sist = 0; dstat = 0; istatc = istat; + icnt = 0; do { if (istatc & SIP) sist |= INW(np, nc_sist); @@ -2854,6 +2856,14 @@ void sym_interrupt (struct sym_hcb *np) dstat |= INB(np, nc_dstat); istatc = INB(np, nc_istat); istat |= istatc; +#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY + /* Prevent deadlock waiting on a condition that may never clear. 
*/ + icnt ++; + if (100 < icnt) { + if (np->s.device->driver->err_handler.error_state != pci_channel_io_normal) + return; + } +#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */ } while (istatc & (SIP|DIP)); if (DEBUG_FLAGS & DEBUG_TINY) --- linux-2.6.12-git10/drivers/scsi/Kconfig.linas-orig 2005-06-22 15:26:14.000000000 -0500 +++ linux-2.6.12-git10/drivers/scsi/Kconfig 2005-06-22 15:28:29.000000000 -0500 @@ -1040,6 +1040,14 @@ config SCSI_SYM53C8XX_IOMAPPED the card. This is significantly slower then using memory mapped IO. Most people should answer N. +config SCSI_SYM53C8XX_EEH_RECOVERY + bool "Enable PCI bus error recovery" + depends on SCSI_SYM53C8XX_2 && PPC_PSERIES + help + If you say Y here, the driver will be able to recover from + PCI bus errors on many PowerPC platforms. IBM pSeries users + should answer Y. + config SCSI_IPR tristate "IBM Power Linux RAID adapter support" depends on PCI && SCSI --- linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/arch/ppc64/configs/pSeries_defconfig 2005-06-22 15:30:33.000000000 -0500 @@ -311,6 +311,7 @@ CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MOD CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set +CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY=y CONFIG_SCSI_IPR=y CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y From linas at austin.ibm.com Wed Jun 29 09:59:33 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:59:33 -0500 Subject: [PATCH 8/13]: PCI Err: Event delivery utility Message-ID: <20050628235932.GA6429@austin.ibm.com> pci-err-8-pci-err-event.patch [RFC] PCI Error distribution utility routine. 
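The utility in this patch queues an error event where the error is
detected and delivers it to listeners later, from a normal context. A
minimal single-threaded sketch of that split is given here. It is an
illustration only: the event and event_queue types, queue_event(),
drain_events() and order_listener() are hypothetical names, locking is
omitted entirely (the patch protects its list with a spinlock), and
first-in-first-out delivery is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event record, loosely following struct peh_event. */
struct event {
	struct event *next;
	int dev_id;   /* stands in for the affected device */
	int state;    /* stands in for the channel state */
};

typedef void (*listener_fn)(const struct event *ev, void *ctx);

/* Hypothetical queue with a single listener instead of a notifier chain. */
struct event_queue {
	struct event *head;
	struct event *tail;
	listener_fn listener;
	void *ctx;
};

/* Detection side: record the event and return at once.
 * Returns 0 on success, 1 if no memory was available. */
int queue_event(struct event_queue *q, int dev_id, int state)
{
	struct event *ev;

	assert(q != NULL);

	ev = malloc(sizeof(*ev));
	if (ev == NULL)
		return 1;
	ev->next = NULL;
	ev->dev_id = dev_id;
	ev->state = state;

	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
	return 0;
}

/* Worker side: deliver every queued event in arrival order, then free it.
 * Returns the number of events delivered. */
int drain_events(struct event_queue *q)
{
	struct event *ev;
	int delivered = 0;

	assert(q != NULL);

	while ((ev = q->head) != NULL) {
		q->head = ev->next;
		if (q->head == NULL)
			q->tail = NULL;
		if (q->listener)
			q->listener(ev, q->ctx);
		free(ev);
		delivered++;
	}
	return delivered;
}

/* Example listener: appends each dev_id as a decimal digit to *ctx,
 * which makes the delivery order visible. */
void order_listener(const struct event *ev, void *ctx)
{
	int *seen = ctx;
	*seen = *seen * 10 + ev->dev_id;
}
```

Queuing events for devices 1 and 2 and then draining delivers both, in
that order, and leaves the queue empty for the next pass.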
This patch defines a utility routine that hasn't yet been discussed
much on the mailing list; I've made this architecture independent
with the idea that various architectures may find it handy, but
it's not directly required, or relevant, to the overall EEH error
recovery mechanism. (It could be buried in arch-dependent code or
implemented differently.)

The current design has the arch dependent code detect a PCI bus error.
That code uses this utility to generate a detection event. This event
is then caught by PCI hotplug code, which drives the slot recovery.
If the affected device drivers have recovery callbacks, these are
used; all other devices are hotplugged.

There are certainly other (simpler) ways to attach the arch-specific
error detection code to the hot-plug mediated recovery code; this
routine is rather left-over from earlier email discussions.
Should this stay, or not?

Signed-off-by: Linas Vepstas
-------------- next part --------------
--- linux-2.6.12-git10/include/linux/pci.h.linas-orig	2005-06-17 14:48:29.000000000 -0500
+++ linux-2.6.12-git10/include/linux/pci.h	2005-06-22 15:28:29.000000000 -0500
@@ -691,6 +691,51 @@ struct pci_error_handlers
 	void (*resume)(struct pci_dev *dev); /* Device driver may resume normal operations */
 };
 
+/**
+ * PCI Error notifier event flags.
+ */
+#define PEH_NOTIFY_ERROR 1
+
+/** PEH event -- structure holding pci controller data that describes
+ * a change in the isolation status of a PCI slot. A pointer
+ * to this struct is passed as the data pointer in a notify callback.
+ */
+struct peh_event {
+	struct list_head list;
+	struct pci_dev *dev; /* affected device */
+	enum pci_channel_state state; /* PCI bus state for the affected device */
+	int time_unavail; /* milliseconds until device might be available */
+};
+
+/**
+ * peh_send_failure_event - generate a PCI error event
+ * @dev pci device
+ *
+ * This routine builds a PCI error event which will be delivered
+ * to all listeners on the peh_notifier_chain.
+ * + * This routine can be called within an interrupt context; + * the actual event will be delivered in a normal context + * (from a workqueue). + */ +int peh_send_failure_event (struct pci_dev *dev, + enum pci_channel_state state, + int time_unavail); + +/** + * peh_register_notifier - Register to find out about EEH events. + * @nb: notifier block to callback on events + */ +int peh_register_notifier(struct notifier_block *nb); + +/** + * peh_unregister_notifier - Unregister to an EEH event notifier. + * @nb: notifier block to callback on events + */ +int peh_unregister_notifier(struct notifier_block *nb); + +/* ---------------------------------------------------------------- */ + struct module; struct pci_driver { struct list_head node; --- linux-2.6.12-git10/drivers/pci/Makefile.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/Makefile 2005-06-22 15:28:29.000000000 -0500 @@ -3,7 +3,7 @@ # obj-y += access.o bus.o probe.o remove.o pci.o quirks.o \ - names.o pci-driver.o search.o pci-sysfs.o \ + names.o pci-driver.o pci-error.o search.o pci-sysfs.o \ rom.o obj-$(CONFIG_PROC_FS) += proc.o --- linux-2.6.12-git10/drivers/pci/pci-error.c.linas-orig 2005-06-22 15:28:15.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/pci-error.c 2005-06-22 15:28:29.000000000 -0500 @@ -0,0 +1,152 @@ +/* + * pci-error.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include + +#undef DEBUG + +/** Overview: + * PEH, or "PCI Error Handling" is a PCI bridge technology for + * dealing with PCI bus errors that can't be dealt with within the + * usual PCI framework, except by check-stopping the CPU. Systems + * that are designed for high-availability/reliability cannot afford + * to crash due to a "mere" PCI error, thus the need for PEH. + * An PEH-capable bridge operates by converting a detected error + * into a "slot freeze", taking the PCI adapter off-line, making + * the slot behave, from the OS'es point of view, as if the slot + * were "empty": all reads return 0xff's and all writes are silently + * ignored. PEH slot isolation events can be triggered by parity + * errors on the address or data busses (e.g. during posted writes), + * which in turn might be caused by low voltage on the bus, dust, + * vibration, humidity, radioactivity or plain-old failed hardware. + * + * Note, however, that one of the leading causes of PEH slot + * freeze events are buggy device drivers, buggy device microcode, + * or buggy device hardware. This is because any attempt by the + * device to bus-master data to a memory address that is not + * assigned to the device will trigger a slot freeze. (The idea + * is to prevent devices-gone-wild from corrupting system memory). + * Buggy hardware/drivers will have a miserable time co-existing + * with PEH. + */ + +/* PEH event workqueue setup. */ +static spinlock_t peh_eventlist_lock = SPIN_LOCK_UNLOCKED; +LIST_HEAD(peh_eventlist); +static void peh_event_handler(void *); +DECLARE_WORK(peh_event_wq, peh_event_handler, NULL); + +static struct notifier_block *peh_notifier_chain; + +/** + * peh_event_handler - dispatch PEH events. 
The detection of a frozen + * slot can occur inside an interrupt, where it can be hard to do + * anything about it. The goal of this routine is to pull these + * detection events out of the context of the interrupt handler, and + * re-dispatch them for processing at a later time in a normal context. + * + * @dummy - unused + */ +static void peh_event_handler(void *dummy) +{ + unsigned long flags; + struct peh_event *event; + + while (1) { + spin_lock_irqsave(&peh_eventlist_lock, flags); + event = NULL; + if (!list_empty(&peh_eventlist)) { + event = list_entry(peh_eventlist.next, struct peh_event, list); + list_del(&event->list); + } + spin_unlock_irqrestore(&peh_eventlist_lock, flags); + if (event == NULL) + break; + + printk(KERN_INFO "PEH: Detected PCI bus error on device " + "%s %s\n", + pci_name(event->dev), pci_pretty_name(event->dev)); + + notifier_call_chain (&peh_notifier_chain, + PEH_NOTIFY_ERROR, event); + + pci_dev_put(event->dev); + kfree(event); + } +} + + +/** + * peh_send_failure_event - generate a PCI error event + * @dev pci device + * + * This routine builds a PCI error event which will be delivered + * to all listeners on the peh_notifier_chain. + * + * This routine can be called within an interrupt context; + * the actual event will be delivered in a normal context + * (from a workqueue). 
+ */ +int peh_send_failure_event (struct pci_dev *dev, + enum pci_channel_state state, + int time_unavail) +{ + unsigned long flags; + struct peh_event *event; + + event = kmalloc(sizeof(*event), GFP_ATOMIC); + if (event == NULL) { + printk (KERN_ERR "PEH: out of memory, event not handled\n"); + return 1; + } + + event->dev = dev; + event->state = state; + event->time_unavail = time_unavail; + + /* We may or may not be called in an interrupt context */ + spin_lock_irqsave(&peh_eventlist_lock, flags); + list_add(&event->list, &peh_eventlist); + spin_unlock_irqrestore(&peh_eventlist_lock, flags); + + schedule_work(&peh_event_wq); + + return 0; +} + +/** + * peh_register_notifier - Register to find out about EEH events. + * @nb: notifier block to callback on events + */ +int peh_register_notifier(struct notifier_block *nb) +{ + return notifier_chain_register(&peh_notifier_chain, nb); +} + +/** + * peh_unregister_notifier - Unregister to an EEH event notifier. + * @nb: notifier block to callback on events + */ +int peh_unregister_notifier(struct notifier_block *nb) +{ + return notifier_chain_unregister(&peh_notifier_chain, nb); +} + +/********************** END OF FILE ******************************/ From linas at austin.ibm.com Wed Jun 29 09:59:44 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 18:59:44 -0500 Subject: [PATCH 9/13]: PCI Err: Whitespace janitoring Message-ID: <20050628235944.GA6442@austin.ibm.com> pci-err-9-whitespace-janitor.patch Whitespace janitoring -- remove trailing blanks at ends of various lines. Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/include/asm-ppc64/eeh.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/include/asm-ppc64/eeh.h 2005-06-22 15:28:29.000000000 -0500 @@ -1,4 +1,4 @@ -/* +/* * eeh.h * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation. 
* @@ -6,12 +6,12 @@ * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -129,7 +129,7 @@ static inline void eeh_remove_device(str #define EEH_IO_ERROR_VALUE(size) (-1UL) #endif /* CONFIG_EEH */ -/* +/* * MMIO read/write operations with EEH support. */ static inline u8 eeh_readb(const volatile void __iomem *addr) @@ -251,21 +251,21 @@ static inline void eeh_memcpy_fromio(voi *((u8 *)dest) = *((volatile u8 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 1); - dest = (void *)((unsigned long)dest + 1); + dest = (void *)((unsigned long)dest + 1); n--; } while(n > 4) { *((u32 *)dest) = *((volatile u32 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 4); - dest = (void *)((unsigned long)dest + 4); + dest = (void *)((unsigned long)dest + 4); n -= 4; } while(n) { *((u8 *)dest) = *((volatile u8 *)vsrc); __asm__ __volatile__ ("eieio" : : : "memory"); vsrc = (void *)((unsigned long)vsrc + 1); - dest = (void *)((unsigned long)dest + 1); + dest = (void *)((unsigned long)dest + 1); n--; } __asm__ __volatile__ ("sync" : : : "memory"); @@ -287,19 +287,19 @@ static inline void eeh_memcpy_toio(volat while(n && (!EEH_CHECK_ALIGN(vdest, 4) || !EEH_CHECK_ALIGN(src, 4))) { *((volatile u8 *)vdest) = *((u8 *)src); src = (void *)((unsigned long)src + 1); - vdest = (void *)((unsigned long)vdest + 1); + vdest = (void 
*)((unsigned long)vdest + 1);
 		n--;
 	}
 	while(n > 4) {
 		*((volatile u32 *)vdest) = *((volatile u32 *)src);
 		src = (void *)((unsigned long)src + 4);
-		vdest = (void *)((unsigned long)vdest + 4);
+		vdest = (void *)((unsigned long)vdest + 4);
 		n-=4;
 	}
 	while(n) {
 		*((volatile u8 *)vdest) = *((u8 *)src);
 		src = (void *)((unsigned long)src + 1);
-		vdest = (void *)((unsigned long)vdest + 1);
+		vdest = (void *)((unsigned long)vdest + 1);
 		n--;
 	}
 	__asm__ __volatile__ ("sync" : : : "memory");

From linas at austin.ibm.com Wed Jun 29 09:59:56 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 28 Jun 2005 18:59:56 -0500
Subject: [PATCH 10/13]: PCI Err: PPC64-specific recovery infrastructure
Message-ID: <20050628235956.GA6455@austin.ibm.com>

pci-err-10-ppc64.patch

Implements the ppc64-specific parts of detecting PCI bus errors
(via calls to the firmware to ask the hardware pci bridges) and
the related mechanisms for resetting the affected PCI slots
(again, via firmware calls).

Signed-off-by: Linas Vepstas
-------------- next part --------------
--- linux-2.6.12-git10/include/asm-ppc64/eeh.h.linas-orig	2005-06-17 14:48:29.000000000 -0500
+++ linux-2.6.12-git10/include/asm-ppc64/eeh.h	2005-06-28 16:54:06.000000000 -0500
@@ -36,6 +36,11 @@ struct notifier_block;
 #define EEH_MODE_SUPPORTED (1<<0)
 #define EEH_MODE_NOCHECK (1<<1)
 #define EEH_MODE_ISOLATED (1<<2)
+#define EEH_MODE_RECOVERING (1<<3)
+
+/* Max number of EEH freezes allowed before we consider the device
+ * to be permanently disabled. */
+#define EEH_MAX_ALLOWED_FREEZES 5
 
 void __init eeh_init(void);
 unsigned long eeh_check_failure(const volatile void __iomem *token,
@@ -59,35 +64,71 @@ void eeh_add_device_late(struct pci_dev
  * eeh_remove_device - undo EEH setup for the indicated pci device
  * @dev: pci device to be removed
  *
- * This routine should be when a device is removed from a running
- * system (e.g. by hotplug or dlpar).
+ * This routine should be called when a device is removed from
+ * a running system (e.g.
by hotplug or dlpar). It unregisters
+ * the PCI device from the EEH subsystem. I/O errors affecting
+ * this device will no longer be detected after this call; thus,
+ * i/o errors affecting this slot may leave this device unusable.
  */
 void eeh_remove_device(struct pci_dev *);
 
-#define EEH_DISABLE 0
-#define EEH_ENABLE 1
-#define EEH_RELEASE_LOADSTORE 2
-#define EEH_RELEASE_DMA 3
+/**
+ * eeh_slot_error_detail -- record an EEH error condition to the log
+ * @severity: 1 if temporary, 2 if permanent failure.
+ *
+ * Obtains the EEH error details from the RTAS subsystem,
+ * and then logs these details with the RTAS error log system.
+ */
+void eeh_slot_error_detail (struct device_node *dn, int severity);
 
 /**
- * Notifier event flags.
+ * rtas_set_slot_reset -- unfreeze a frozen slot
+ *
+ * Clear the EEH-frozen condition on a slot. This routine
+ * does this by asserting the PCI #RST line for 1/8th of
+ * a second; this routine will sleep while the adapter is
+ * being reset.
  */
-#define EEH_NOTIFY_FREEZE 1
+void rtas_set_slot_reset (struct device_node *dn);
 
-/** EEH event -- structure holding pci slot data that describes
- * a change in the isolation status of a PCI slot. A pointer
- * to this struct is passed as the data pointer in a notify callback.
- */
-struct eeh_event {
-	struct list_head list;
-	struct pci_dev *dev;
-	struct device_node *dn;
-	int reset_state;
-};
-
-/** Register to find out about EEH events. */
-int eeh_register_notifier(struct notifier_block *nb);
-int eeh_unregister_notifier(struct notifier_block *nb);
+/** rtas_pci_slot_reset raises/lowers the pci #RST line
+ * state: 1/0 to raise/lower the #RST
+ *
+ * Clear the EEH-frozen condition on a slot. This routine
+ * asserts the PCI #RST line if the 'state' argument is '1',
+ * and drops the #RST line if 'state' is '0'. This routine is
+ * safe to call in an interrupt context.
+ *
+ */
+void rtas_pci_slot_reset(struct device_node *dn, int state);
+void eeh_pci_slot_reset(struct pci_dev *dev, int state);
+
+/** eeh_pci_slot_availability -- Indicates whether a PCI
+ * slot is ready to be used. After a PCI reset, it may take a while
+ * for the PCI fabric to fully reset the communications path to the
+ * given PCI card. This routine can be used to determine how long
+ * to wait before a PCI slot might become usable.
+ *
+ * This routine returns how long to wait (in milliseconds) before
+ * the slot is expected to be usable. A value of zero means the
+ * slot is immediately usable. A negative value means that the
+ * slot is permanently disabled.
+ */
+int eeh_pci_slot_availability(struct pci_dev *dev);
+
+/** Restore device configuration info across device resets.
+ */
+void eeh_restore_bars(struct device_node *);
+void eeh_pci_restore_bars(struct pci_dev *dev);
+
+/**
+ * rtas_configure_bridge -- firmware initialization of pci bridge
+ *
+ * Ask the firmware to configure any PCI bridge devices
+ * located behind the indicated node. Required after a
+ * pci device reset.
+ */
+void rtas_configure_bridge(struct device_node *dn);
 
 /**
  * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
--- linux-2.6.12-git10/include/asm-ppc64/prom.h.linas-orig	2005-06-17 14:48:29.000000000 -0500
+++ linux-2.6.12-git10/include/asm-ppc64/prom.h	2005-06-28 16:33:31.000000000 -0500
@@ -119,6 +119,7 @@ struct property {
  */
 struct pci_controller;
 struct iommu_table;
+struct eeh_recovery_ops;
 
 struct device_node {
 	char *name;
@@ -137,9 +138,13 @@ struct device_node {
 	int devfn; /* for pci devices */
 	int eeh_mode; /* See eeh.h for possible EEH_MODEs */
 	int eeh_config_addr;
+	int eeh_check_count; /* number of times device driver ignored error */
+	int eeh_freeze_count; /* number of times this device froze up.
*/ + int eeh_is_bridge; /* device is pci-to-pci bridge */ int pci_ext_config_space; /* for pci devices */ struct pci_controller *phb; /* for pci devices */ struct iommu_table *iommu_table; /* for phb's or bridges */ + u32 config_space[16]; /* saved PCI config space */ struct property *properties; struct device_node *parent; --- linux-2.6.12-git10/include/asm-ppc64/rtas.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/include/asm-ppc64/rtas.h 2005-06-22 15:28:29.000000000 -0500 @@ -246,4 +246,6 @@ extern unsigned long rtas_rmo_buf; #define GLOBAL_INTERRUPT_QUEUE 9005 +extern int rtas_write_config(struct device_node *dn, int where, int size, u32 val); + #endif /* _PPC64_RTAS_H */ --- linux-2.6.12-git10/arch/ppc64/kernel/eeh.c.linas-orig 2005-06-28 12:17:02.000000000 -0500 +++ linux-2.6.12-git10/arch/ppc64/kernel/eeh.c 2005-06-28 18:26:30.000000000 -0500 @@ -1,32 +1,34 @@ /* + * * eeh.c * Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#include +#include #include +#include #include -#include #include #include #include #include #include #include +#include #include #include #include @@ -49,8 +51,8 @@ * were "empty": all reads return 0xff's and all writes are silently * ignored. 
EEH slot isolation events can be triggered by parity * errors on the address or data busses (e.g. during posted writes), - * which in turn might be caused by dust, vibration, humidity, - * radioactivity or plain-old failed hardware. + * which in turn might be caused by low voltage on the bus, dust, + * vibration, humidity, radioactivity or plain-old failed hardware. * * Note, however, that one of the leading causes of EEH slot * freeze events are buggy device drivers, buggy device microcode, @@ -75,22 +77,13 @@ #define BUID_HI(buid) ((buid) >> 32) #define BUID_LO(buid) ((buid) & 0xffffffff) -/* EEH event workqueue setup. */ -static DEFINE_SPINLOCK(eeh_eventlist_lock); -LIST_HEAD(eeh_eventlist); -static void eeh_event_handler(void *); -DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL); - -static struct notifier_block *eeh_notifier_chain; - /* * If a device driver keeps reading an MMIO register in an interrupt * handler after a slot isolation event has occurred, we assume it * is broken and panic. This sets the threshold for how many read * attempts we allow before panicking. 
*/ -#define EEH_MAX_FAILS 1000 -static atomic_t eeh_fail_count; +#define EEH_MAX_FAILS 100000 /* RTAS tokens */ static int ibm_set_eeh_option; @@ -107,6 +100,10 @@ static DEFINE_SPINLOCK(slot_errbuf_lock) static int eeh_error_buf_size; /* System monitoring statistics */ +static DEFINE_PER_CPU(unsigned long, no_device); +static DEFINE_PER_CPU(unsigned long, no_dn); +static DEFINE_PER_CPU(unsigned long, no_cfg_addr); +static DEFINE_PER_CPU(unsigned long, ignored_check); static DEFINE_PER_CPU(unsigned long, total_mmio_ffs); static DEFINE_PER_CPU(unsigned long, false_positives); static DEFINE_PER_CPU(unsigned long, ignored_failures); @@ -225,9 +222,9 @@ pci_addr_cache_insert(struct pci_dev *de while (*p) { parent = *p; piar = rb_entry(parent, struct pci_io_addr_range, rb_node); - if (alo < piar->addr_lo) { + if (ahi < piar->addr_lo) { p = &parent->rb_left; - } else if (ahi > piar->addr_hi) { + } else if (alo > piar->addr_hi) { p = &parent->rb_right; } else { if (dev != piar->pcidev || @@ -246,6 +243,11 @@ pci_addr_cache_insert(struct pci_dev *de piar->pcidev = dev; piar->flags = flags; +#ifdef DEBUG + printk (KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n", + alo, ahi, pci_name (dev)); +#endif + rb_link_node(&piar->rb_node, parent, p); rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root); @@ -268,9 +270,10 @@ static void __pci_addr_cache_insert_devi /* Skip any devices for which EEH is not enabled. 
*/ if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || dn->eeh_mode & EEH_MODE_NOCHECK) { -#ifdef DEBUG - printk(KERN_INFO "PCI: skip building address cache for=%s %s\n", - pci_name(dev), pci_pretty_name(dev)); +// #ifdef DEBUG +#if 1 + printk(KERN_INFO "PCI: skip building address cache for=%s %s %s\n", + pci_name(dev), pci_pretty_name(dev), dn->type); #endif return; } @@ -369,8 +372,12 @@ void pci_addr_cache_remove_device(struct */ void __init pci_addr_cache_build(void) { + struct device_node *dn; struct pci_dev *dev = NULL; + if (!eeh_subsystem_enabled) + return; + spin_lock_init(&pci_io_addr_cache_root.piar_lock); while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { @@ -379,6 +386,17 @@ void __init pci_addr_cache_build(void) continue; } pci_addr_cache_insert_device(dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + if (dn) { + int i; + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); + + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + dn->eeh_is_bridge = 1; + } } #ifdef DEBUG @@ -390,24 +408,32 @@ void __init pci_addr_cache_build(void) /* --------------------------------------------------------------- */ /* Above lies the PCI Address Cache. Below lies the EEH event infrastructure */ -/** - * eeh_register_notifier - Register to find out about EEH events. - * @nb: notifier block to callback on events - */ -int eeh_register_notifier(struct notifier_block *nb) +void eeh_slot_error_detail (struct device_node *dn, int severity) { - return notifier_chain_register(&eeh_notifier_chain, nb); -} + unsigned long flags; + int rc; -/** - * eeh_unregister_notifier - Unregister to an EEH event notifier. 
- * @nb: notifier block to callback on events - */ -int eeh_unregister_notifier(struct notifier_block *nb) -{ - return notifier_chain_unregister(&eeh_notifier_chain, nb); + if (!dn) return; + + /* Log the error with the rtas logger */ + spin_lock_irqsave(&slot_errbuf_lock, flags); + memset(slot_errbuf, 0, eeh_error_buf_size); + + rc = rtas_call(ibm_slot_error_detail, + 8, 1, NULL, dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), NULL, 0, + virt_to_phys(slot_errbuf), + eeh_error_buf_size, + severity); + + if (rc == 0) + log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); + spin_unlock_irqrestore(&slot_errbuf_lock, flags); } +EXPORT_SYMBOL(eeh_slot_error_detail); + /** * read_slot_reset_state - Read the reset state of a device node's slot * @dn: device node to read @@ -422,6 +448,7 @@ static int read_slot_reset_state(struct outputs = 4; } else { token = ibm_read_slot_reset_state; + rets[2] = 0; /* fake PE Unavailable info */ outputs = 3; } @@ -430,75 +457,8 @@ static int read_slot_reset_state(struct } /** - * eeh_panic - call panic() for an eeh event that cannot be handled. - * The philosophy of this routine is that it is better to panic and - * halt the OS than it is to risk possible data corruption by - * oblivious device drivers that don't know better. - * - * @dev pci device that had an eeh event - * @reset_state current reset state of the device slot - */ -static void eeh_panic(struct pci_dev *dev, int reset_state) -{ - /* - * XXX We should create a separate sysctl for this. - * - * Since the panic_on_oops sysctl is used to halt the system - * in light of potential corruption, we can use it here. 
- */ - if (panic_on_oops) - panic("EEH: MMIO failure (%d) on device:%s %s\n", reset_state, - pci_name(dev), pci_pretty_name(dev)); - else { - __get_cpu_var(ignored_failures)++; - printk(KERN_INFO "EEH: Ignored MMIO failure (%d) on device:%s %s\n", - reset_state, pci_name(dev), pci_pretty_name(dev)); - } -} - -/** - * eeh_event_handler - dispatch EEH events. The detection of a frozen - * slot can occur inside an interrupt, where it can be hard to do - * anything about it. The goal of this routine is to pull these - * detection events out of the context of the interrupt handler, and - * re-dispatch them for processing at a later time in a normal context. - * - * @dummy - unused - */ -static void eeh_event_handler(void *dummy) -{ - unsigned long flags; - struct eeh_event *event; - - while (1) { - spin_lock_irqsave(&eeh_eventlist_lock, flags); - event = NULL; - if (!list_empty(&eeh_eventlist)) { - event = list_entry(eeh_eventlist.next, struct eeh_event, list); - list_del(&event->list); - } - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); - if (event == NULL) - break; - - printk(KERN_INFO "EEH: MMIO failure (%d), notifiying device " - "%s %s\n", event->reset_state, - pci_name(event->dev), pci_pretty_name(event->dev)); - - atomic_set(&eeh_fail_count, 0); - notifier_call_chain (&eeh_notifier_chain, - EEH_NOTIFY_FREEZE, event); - - __get_cpu_var(slot_resets)++; - - pci_dev_put(event->dev); - kfree(event); - } -} - -/** - * eeh_token_to_phys - convert EEH address token to phys address - * @token i/o token, should be address in the form 0xE.... + * eeh_token_to_phys - convert I/O address to phys address + * @token i/o address, should be address in the form 0xA.... 
*/ static inline unsigned long eeh_token_to_phys(unsigned long token) { @@ -513,6 +473,46 @@ static inline unsigned long eeh_token_to return pa | (token & (PAGE_SIZE-1)); } +static inline struct pci_dev * eeh_find_pci_dev(struct device_node *dn) +{ + struct pci_dev *dev = NULL; + for_each_pci_dev(dev) { + if (pci_device_to_OF_node(dev) == dn) + return dev; + } + return NULL; +} + +/** Mark all devices that are peers of this device as failed. + * Mark the device driver too, so that it can see the failure + * immediately (needed for polling in interrupts). + */ +static inline void eeh_mark_slot (struct device_node *dn) +{ + while (dn) { + dn->eeh_mode |= EEH_MODE_ISOLATED; + + /* Mark the pci device driver too */ + struct pci_dev *dev = eeh_find_pci_dev (dn); + if (dev && dev->driver) { + dev->driver->err_handler.error_state = pci_channel_io_frozen; + } + if (dn->child) + eeh_mark_slot (dn->child); + dn = dn->sibling; + } +} + +static inline void eeh_clear_slot (struct device_node *dn) +{ + while (dn) { + dn->eeh_mode &= ~(EEH_MODE_RECOVERING|EEH_MODE_ISOLATED); + if (dn->child) + eeh_clear_slot (dn->child); + dn = dn->sibling; + } +} + /** * eeh_dn_check_failure - check if all 1's data is due to EEH slot freeze * @dn device node @@ -528,29 +528,37 @@ static inline unsigned long eeh_token_to * * It is safe to call this routine in an interrupt context. */ +extern void disable_irq_nosync(unsigned int); + int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) { int ret; int rets[3]; - unsigned long flags; - int rc, reset_state; - struct eeh_event *event; + enum pci_channel_state state; __get_cpu_var(total_mmio_ffs)++; if (!eeh_subsystem_enabled) return 0; - if (!dn) + if (!dn) { + __get_cpu_var(no_dn)++; return 0; + } /* Access to IO BARs might get this far and still not want checking. 
*/ if (!(dn->eeh_mode & EEH_MODE_SUPPORTED) || dn->eeh_mode & EEH_MODE_NOCHECK) { + __get_cpu_var(ignored_check)++; +#ifdef DEBUG + printk ("EEH:ignored check for %s %s\n", + pci_pretty_name (dev), dn->full_name); +#endif return 0; } if (!dn->eeh_config_addr) { + __get_cpu_var(no_cfg_addr)++; return 0; } @@ -559,12 +567,19 @@ int eeh_dn_check_failure(struct device_n * slot, we know it's bad already, we don't need to check... */ if (dn->eeh_mode & EEH_MODE_ISOLATED) { - atomic_inc(&eeh_fail_count); - if (atomic_read(&eeh_fail_count) >= EEH_MAX_FAILS) { + dn->eeh_check_count ++; + if (dn->eeh_check_count >= EEH_MAX_FAILS) { + printk (KERN_ERR "EEH: Device driver ignored %d bad reads, panicing\n", + dn->eeh_check_count); + dump_stack(); /* re-read the slot reset state */ if (read_slot_reset_state(dn, rets) != 0) rets[0] = -1; /* reset state unknown */ - eeh_panic(dev, rets[0]); + +*((long *) 0x0) = 42; + /* If we are here, then we hit an infinite loop. Stop. */ + panic("EEH: MMIO halt (%d) on device:%s %s\n", rets[0], + pci_name(dev), pci_pretty_name(dev)); } return 0; } @@ -577,53 +592,43 @@ int eeh_dn_check_failure(struct device_n * In any case they must share a common PHB. */ ret = read_slot_reset_state(dn, rets); - if (!(ret == 0 && rets[1] == 1 && (rets[0] == 2 || rets[0] == 4))) { + if (!(ret == 0 && ((rets[1] == 1 && (rets[0] == 2 || rets[0] >= 4)) + || (rets[0] == 5)))) { __get_cpu_var(false_positives)++; return 0; } - /* prevent repeated reports of this failure */ - dn->eeh_mode |= EEH_MODE_ISOLATED; - - reset_state = rets[0]; + /* Note that empty slots will fail; empty slots don't have children... */ + if ((rets[0] == 5) && (dn->child == NULL)) { + __get_cpu_var(false_positives)++; + return 0; + } - spin_lock_irqsave(&slot_errbuf_lock, flags); - memset(slot_errbuf, 0, eeh_error_buf_size); + /* Avoid repeated reports of this failure, including problems + * with other functions on this device, and functions under + * bridges. 
*/ + eeh_mark_slot (dn->parent->child); + __get_cpu_var(slot_resets)++; - rc = rtas_call(ibm_slot_error_detail, - 8, 1, NULL, dn->eeh_config_addr, - BUID_HI(dn->phb->buid), - BUID_LO(dn->phb->buid), NULL, 0, - virt_to_phys(slot_errbuf), - eeh_error_buf_size, - 1 /* Temporary Error */); + if (!dev) + dev = eeh_find_pci_dev (dn); - if (rc == 0) - log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); - spin_unlock_irqrestore(&slot_errbuf_lock, flags); + /* Some devices go crazy if irq's are not ack'ed; disable irq now */ + if (dev) + disable_irq_nosync (dev->irq); + + state = pci_channel_io_normal; + if ((rets[0] == 2) || (rets[0] == 4)) + state = pci_channel_io_frozen; + if (rets[0] == 5) + state = pci_channel_io_perm_failure; - printk(KERN_INFO "EEH: MMIO failure (%d) on device: %s %s\n", - rets[0], dn->name, dn->full_name); - event = kmalloc(sizeof(*event), GFP_ATOMIC); - if (event == NULL) { - eeh_panic(dev, reset_state); - return 1; - } - - event->dev = dev; - event->dn = dn; - event->reset_state = reset_state; - - /* We may or may not be called in an interrupt context */ - spin_lock_irqsave(&eeh_eventlist_lock, flags); - list_add(&event->list, &eeh_eventlist); - spin_unlock_irqrestore(&eeh_eventlist_lock, flags); + peh_send_failure_event (dev, state, rets[2]); /* Most EEH events are due to device driver bugs. Having * a stack trace will help the device-driver authors figure * out what happened. So print that out. */ - dump_stack(); - schedule_work(&eeh_event_wq); + if (rets[0] != 5) dump_stack(); return 0; } @@ -635,7 +640,6 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * @token i/o token, should be address in the form 0xA.... * @val value, should be all 1's (XXX why do we need this arg??) * - * Check for an eeh failure at the given token address. * Check for an EEH failure at the given token address. Call this * routine if the result of a read was all 0xff's and you want to * find out if this is due to an EEH slot freeze event. 
This routine @@ -643,6 +647,7 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * * Note this routine is safe to call in an interrupt context. */ + unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val) { unsigned long addr; @@ -652,8 +657,10 @@ unsigned long eeh_check_failure(const vo /* Finding the phys addr + pci device; this is pretty quick. */ addr = eeh_token_to_phys((unsigned long __force) token); dev = pci_get_device_by_addr(addr); - if (!dev) + if (!dev) { + __get_cpu_var(no_device)++; return val; + } dn = pci_device_to_OF_node(dev); eeh_dn_check_failure (dn, dev); @@ -664,6 +671,209 @@ unsigned long eeh_check_failure(const vo EXPORT_SYMBOL(eeh_check_failure); +/* ------------------------------------------------------------- */ +/* The code below deals with error recovery */ + +/** eeh_pci_slot_reset -- raises/lowers the pci #RST line + * state: 1/0 to raise/lower the #RST + */ +void +eeh_pci_slot_reset(struct pci_dev *dev, int state) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + rtas_pci_slot_reset (dn, state); +} + +/** Return negative value if a permanent error, else return + * a number of milliseconds to wait until the PCI slot is + * ready to be used. 
+ */ +static int +eeh_slot_availability(struct device_node *dn) +{ + int rc; + int rets[3]; + + rc = read_slot_reset_state(dn, rets); + + if (rc) return rc; + + if (rets[1] == 0) return -1; /* EEH is not supported */ + if (rets[0] == 0) return 0; /* Oll Korrect */ + if (rets[0] == 5) { + if (rets[2] == 0) return -1; /* permanently unavailable */ + return rets[2]; /* number of millisecs to wait */ + } + return -1; +} + +int +eeh_pci_slot_availability(struct pci_dev *dev) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + if (!dn) return -1; + + BUG_ON (dn->phb==NULL); + if (dn->phb==NULL) { + printk (KERN_ERR "EEH, checking on slot with no phb dn=%s dev=%s:%s\n", + dn->full_name, pci_name(dev), pci_pretty_name (dev)); + return -1; + } + return eeh_slot_availability (dn); +} + +void +rtas_pci_slot_reset(struct device_node *dn, int state) +{ + int rc; + + if (!dn) + return; + if (!dn->phb) { + printk (KERN_WARNING "EEH: in slot reset, device node %s has no phb\n", dn->full_name); + return; + } + + dn->eeh_mode |= EEH_MODE_RECOVERING; + rc = rtas_call(ibm_set_slot_reset,4,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), + state); + if (rc) { + printk (KERN_WARNING "EEH: Unable to reset the failed slot, (%d) #RST=%d\n", rc, state); + return; + } + + if (state == 0) + eeh_clear_slot (dn->parent->child); +} + +/** rtas_set_slot_reset -- assert the pci #RST line for 1/4 second + * dn -- device node to be reset. + */ + +void +rtas_set_slot_reset(struct device_node *dn) +{ + int i, rc; + + rtas_pci_slot_reset (dn, 1); + + /* The PCI bus requires that the reset be held high for at least + * 100 milliseconds. We wait a bit longer 'just in case'. */ + +#define PCI_BUS_RST_HOLD_TIME_MSEC 250 + msleep (PCI_BUS_RST_HOLD_TIME_MSEC); + rtas_pci_slot_reset (dn, 0); + + /* After a PCI slot has been reset, the PCI Express spec requires + * a 1.5 second idle time for the bus to stabilize, before starting + * up traffic.
*/ +#define PCI_BUS_SETTLE_TIME_MSEC 1800 + msleep (PCI_BUS_SETTLE_TIME_MSEC); + + /* Now double check with the firmware to make sure the device is + * ready to be used; if not, wait for recovery. */ + for (i=0; i<10; i++) { + rc = eeh_slot_availability (dn); + if (rc <= 0) break; + + msleep (rc+100); + } +} + +EXPORT_SYMBOL(rtas_set_slot_reset); + +void +rtas_configure_bridge(struct device_node *dn) +{ + int token = rtas_token ("ibm,configure-bridge"); + int rc; + + if (token == RTAS_UNKNOWN_SERVICE) + return; + rc = rtas_call(token,3,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid)); + if (rc) { + printk (KERN_WARNING "EEH: Unable to configure device bridge (%d) for %s\n", + rc, dn->full_name); + } +} + +EXPORT_SYMBOL(rtas_configure_bridge); + +/* ------------------------------------------------------- */ +/** Save and restore of PCI BARs + * + * Although firmware will set up BARs during boot, it doesn't + * set up device BAR's after a device reset, although it will, + * if requested, set up bridge configuration. Thus, we need to + * configure the PCI devices ourselves. Config-space setup is + * stored in the PCI structures which are normally deleted during + * device removal. Thus, the "save" routine references the + * structures so that they aren't deleted. + */ + +/** + * __restore_bars - Restore the Base Address Registers + * Loads the PCI configuration space base address registers, + * the expansion ROM base address, the latency timer, and etc. + * from the saved values in the device node. 
+ */ +static inline void __restore_bars (struct device_node *dn) +{ + int i; + + if (NULL==dn->phb) return; + for (i=4; i<10; i++) { + rtas_write_config(dn, i*4, 4, dn->config_space[i]); + } + + /* 12 == Expansion ROM Address */ + rtas_write_config(dn, 12*4, 4, dn->config_space[12]); + +#define BYTE_SWAP(OFF) (8*((OFF)/4)+3-(OFF)) +#define SAVED_BYTE(OFF) (((u8 *)(dn->config_space))[BYTE_SWAP(OFF)]) + + rtas_write_config (dn, PCI_CACHE_LINE_SIZE, 1, + SAVED_BYTE(PCI_CACHE_LINE_SIZE)); + + rtas_write_config (dn, PCI_LATENCY_TIMER, 1, + SAVED_BYTE(PCI_LATENCY_TIMER)); + + /* max latency, min grant, interrupt pin and line */ + rtas_write_config(dn, 15*4, 4, dn->config_space[15]); +} + +/** + * eeh_restore_bars - restore the PCI config space info + */ +void eeh_restore_bars(struct device_node *dn) +{ + if (! dn->eeh_is_bridge) + __restore_bars (dn); + + if (dn->child) + eeh_restore_bars (dn->child); +} + +void eeh_pci_restore_bars(struct pci_dev *dev) +{ + struct device_node *dn = pci_device_to_OF_node(dev); + eeh_restore_bars (dn); +} + +/* ------------------------------------------------------------- */ +/* The code below deals with enabling EEH for devices during the + * early boot sequence. EEH must be enabled before any PCI probing + * can be done. 
+ */ + +#define EEH_ENABLE 1 + struct eeh_early_enable_info { unsigned int buid_hi; unsigned int buid_lo; @@ -682,6 +892,8 @@ static void *early_enable_eeh(struct dev int enable; dn->eeh_mode = 0; + dn->eeh_check_count = 0; + dn->eeh_freeze_count = 0; if (status && strcmp(status, "ok") != 0) return NULL; /* ignore devices with bad status */ @@ -743,7 +955,7 @@ static void *early_enable_eeh(struct dev dn->full_name); } - return NULL; + return NULL; } /* @@ -828,7 +1040,9 @@ void eeh_add_device_early(struct device_ return; phb = dn->phb; if (NULL == phb || 0 == phb->buid) { - printk(KERN_WARNING "EEH: Expected buid but found none\n"); + printk(KERN_WARNING "EEH: Expected buid but found none for %s\n", + dn->full_name); + dump_stack(); return; } @@ -847,6 +1061,9 @@ EXPORT_SYMBOL(eeh_add_device_early); */ void eeh_add_device_late(struct pci_dev *dev) { + int i; + struct device_node *dn; + if (!dev || !eeh_subsystem_enabled) return; @@ -856,6 +1073,14 @@ void eeh_add_device_late(struct pci_dev #endif pci_addr_cache_insert_device (dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); + + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + dn->eeh_is_bridge = 1; } EXPORT_SYMBOL(eeh_add_device_late); @@ -885,12 +1110,17 @@ static int proc_eeh_show(struct seq_file unsigned int cpu; unsigned long ffs = 0, positives = 0, failures = 0; unsigned long resets = 0; + unsigned long no_dev = 0, no_dn = 0, no_cfg = 0, no_check = 0; for_each_cpu(cpu) { ffs += per_cpu(total_mmio_ffs, cpu); positives += per_cpu(false_positives, cpu); failures += per_cpu(ignored_failures, cpu); resets += per_cpu(slot_resets, cpu); + no_dev += per_cpu(no_device, cpu); + no_dn += per_cpu(no_dn, cpu); + no_cfg += per_cpu(no_cfg_addr, cpu); + no_check += per_cpu(ignored_check, cpu); } if (0 == eeh_subsystem_enabled) { @@ -898,13 +1128,17 @@ static int 
proc_eeh_show(struct seq_file seq_printf(m, "eeh_total_mmio_ffs=%ld\n", ffs); } else { seq_printf(m, "EEH Subsystem is enabled\n"); - seq_printf(m, "eeh_total_mmio_ffs=%ld\n" + seq_printf(m, + "no device=%ld\n" + "no device node=%ld\n" + "no config address=%ld\n" + "check not wanted=%ld\n" + "eeh_total_mmio_ffs=%ld\n" "eeh_false_positives=%ld\n" "eeh_ignored_failures=%ld\n" - "eeh_slot_resets=%ld\n" - "eeh_fail_count=%d\n", - ffs, positives, failures, resets, - eeh_fail_count.counter); + "eeh_slot_resets=%ld\n", + no_dev, no_dn, no_cfg, no_check, + ffs, positives, failures, resets); } return 0; --- linux-2.6.12-git10/arch/ppc64/kernel/rtas_pci.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/arch/ppc64/kernel/rtas_pci.c 2005-06-22 15:28:29.000000000 -0500 @@ -58,7 +58,7 @@ static int config_access_valid(struct de return 0; } -static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) +int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) { int returnval = -1; unsigned long buid, addr; @@ -108,7 +108,7 @@ static int rtas_pci_read_config(struct p return PCIBIOS_DEVICE_NOT_FOUND; } -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +int rtas_write_config(struct device_node *dn, int where, int size, u32 val) { unsigned long buid, addr; int ret; From linas at austin.ibm.com Wed Jun 29 10:00:06 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 19:00:06 -0500 Subject: [PATCH 11/13]: PCI Err: RPA-PHP janitoring Message-ID: <20050629000006.GA6468@austin.ibm.com> pci-err-11-rpaphp-janitor.patch Remove dead code. 
Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/pci/hotplug/rpaphp.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/hotplug/rpaphp.h 2005-06-22 15:28:29.000000000 -0500 @@ -113,7 +113,6 @@ extern int rpaphp_enable_pci_slot(struct extern int register_pci_slot(struct slot *slot); extern int rpaphp_unconfig_pci_adapter(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); -extern struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev); /* rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); --- linux-2.6.12-git10/drivers/pci/hotplug/rpaphp_pci.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/hotplug/rpaphp_pci.c 2005-06-22 15:28:29.000000000 -0500 @@ -503,36 +503,3 @@ exit: return retval; } -struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev) -{ - struct list_head *tmp, *n; - struct slot *slot; - - list_for_each_safe(tmp, n, &rpaphp_slot_head) { - struct pci_bus *bus; - struct list_head *ln; - - slot = list_entry(tmp, struct slot, rpaphp_slot_list); - if (slot->bridge == NULL) { - if (slot->dev_type == PCI_DEV) { - printk(KERN_WARNING "PCI slot missing bridge %s %s \n", - slot->name, slot->location); - } - continue; - } - - bus = slot->bridge->subordinate; - if (!bus) { - continue; /* should never happen? 
*/ - } - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev == dev) - return slot->hotplug_slot; - } - } - - return NULL; -} - -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); From linas at austin.ibm.com Wed Jun 29 10:00:15 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 19:00:15 -0500 Subject: [PATCH 12/13]: PCI Err: RPA-PHP clarification Message-ID: <20050629000015.GA6481@austin.ibm.com> pci-err-12-rpaphp-symmetry.patch Restructure handling of bus remove and bus add to make the pairs of calls more symmetrical. Doesn't change function, but does make the code easier to understand. Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/pci/hotplug/rpaphp_pci.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/hotplug/rpaphp_pci.c 2005-06-22 15:28:29.000000000 -0500 @@ -63,6 +63,7 @@ int rpaphp_claim_resource(struct pci_dev root ? "Address space collision on" : "No parent found for", resource, dtype, pci_name(dev), res->start, res->end); + dump_stack(); } return err; } @@ -188,6 +189,19 @@ rpaphp_fixup_new_pci_devices(struct pci_ static int rpaphp_pci_config_bridge(struct pci_dev *dev); +static void rpaphp_eeh_add_bus_device(struct pci_bus *bus) +{ + struct pci_dev *dev; + list_for_each_entry(dev, &bus->devices, bus_list) { + eeh_add_device_late(dev); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + struct pci_bus *subbus = dev->subordinate; + if (subbus) + rpaphp_eeh_add_bus_device (subbus); + } + } +} + /***************************************************************************** rpaphp_pci_config_slot() will configure all devices under the given slot->dn and return the the first pci_dev.
@@ -215,6 +229,8 @@ rpaphp_pci_config_slot(struct device_nod } if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) rpaphp_pci_config_bridge(dev); + + rpaphp_eeh_add_bus_device(bus); } return dev; } @@ -223,7 +239,6 @@ static int rpaphp_pci_config_bridge(stru { u8 sec_busno; struct pci_bus *child_bus; - struct pci_dev *child_dev; dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev)); @@ -240,11 +255,7 @@ static int rpaphp_pci_config_bridge(stru /* do pci_scan_child_bus */ pci_scan_child_bus(child_bus); - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - - /* fixup new pci devices without touching bus struct */ + /* Fixup new pci devices without touching bus struct */ rpaphp_fixup_new_pci_devices(child_bus, 0); /* Make the discovered devices available */ @@ -320,7 +331,6 @@ static void rpaphp_eeh_remove_bus_device if (pdev) rpaphp_eeh_remove_bus_device(pdev); } - } return; } From linas at austin.ibm.com Wed Jun 29 10:00:27 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 28 Jun 2005 19:00:27 -0500 Subject: [PATCH 13/13]: PCI Err: RPA-PHP-specific error recovery driver Message-ID: <20050629000027.GA6494@austin.ibm.com> pci-err-13-rpaphp-eeh.patch PCI Error recovery driver, ppc64-specific implementation. For various historical reasons, this driver is in the pci hotplug directory, although it could be moved to arch/ppc64. It is here because the driver falls back to using the pci hotplug routines if the device driver does not support native error recovery.
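The fallback described above might be pictured roughly as follows. This is only a sketch under stated assumptions: the mock_* types, mock_error_detected() and choose_recovery() are hypothetical stand-ins invented for illustration, not the kernel's structures and not code from this patch. Only the idea comes from the description: use the driver's error_detected callback when it has one, otherwise unplug and re-plug the slot through the hotplug routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; NOT the real definitions. */
struct mock_err_handler {
	int (*error_detected)(void *dev);	/* NULL: no native recovery */
};

struct mock_driver {
	struct mock_err_handler err_handler;
};

struct mock_pci_dev {
	struct mock_driver *driver;		/* NULL: no driver bound */
};

enum recovery_path {
	RECOVER_NATIVE,		/* driver recovers through its own callbacks */
	RECOVER_HOTPLUG		/* unconfigure the slot, reset, re-enable it */
};

/* Example callback a recovery-aware driver could supply (illustrative). */
int mock_error_detected(void *dev)
{
	(void)dev;
	return 0;
}

/* Decide how one device would be recovered (illustrative helper). */
enum recovery_path choose_recovery(const struct mock_pci_dev *dev)
{
	assert(dev != NULL);

	/* Native recovery needs both a bound driver and the callback. */
	if (dev->driver && dev->driver->err_handler.error_detected)
		return RECOVER_NATIVE;

	/* Everything else falls back to the hotplug remove/add path. */
	return RECOVER_HOTPLUG;
}
```

A device with no driver bound takes the hotplug path as well in this sketch; whether the real driver treats that case the same way is an assumption.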
Signed-off-by: Linas Vepstas -------------- next part -------------- --- linux-2.6.12-git10/drivers/pci/hotplug/rpaphp.h.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/hotplug/rpaphp.h 2005-06-22 15:28:29.000000000 -0500 @@ -113,6 +113,8 @@ extern int rpaphp_enable_pci_slot(struct extern int register_pci_slot(struct slot *slot); extern int rpaphp_unconfig_pci_adapter(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); +extern void init_eeh_handler (void); +extern void exit_eeh_handler (void); /* rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); --- linux-2.6.12-git10/drivers/pci/hotplug/rpaphp_core.c.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/hotplug/rpaphp_core.c 2005-06-22 15:28:29.000000000 -0500 @@ -460,12 +460,18 @@ static int __init rpaphp_init(void) { info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); + /* Get set to handle EEH events. */ + init_eeh_handler(); + /* read all the PRA info from the system */ return init_rpa(); } static void __exit rpaphp_exit(void) { + /* Let EEH know we are going away. */ + exit_eeh_handler(); + cleanup_slots(); } --- drivers/pci/hotplug/rpaphp_eeh.c.linas-orig 2005-06-28 12:47:20.000000000 -0500 +++ drivers/pci/hotplug/rpaphp_eeh.c 2005-06-28 17:36:17.000000000 -0500 @@ -0,0 +1,383 @@ +/* + * PCI Hot Plug Controller Driver for RPA-compliant PPC64 platform. + * Copyright (C) 2004, 2005 Linas Vepstas + * + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or (at + * your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + * + * Send feedback to + * + */ +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../pci.h" +#include "rpaphp.h" + +/** + * pci_search_bus_for_dev - return 1 if device is under this bus, else 0 + * @bus: the bus to search for this device. + * @dev: the pci device we are looking for. + * + * XXX should this be moved to drivers/pci/search.c ? + */ +static int pci_search_bus_for_dev (struct pci_bus *bus, struct pci_dev *dev) +{ + struct list_head *ln; + + if (!bus) return 0; + + for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { + struct pci_dev *pdev = pci_dev_b(ln); + if (pdev == dev) + return 1; + if (pdev->subordinate) { + int rc; + rc = pci_search_bus_for_dev (pdev->subordinate, dev); + if (rc) + return 1; + } + } + return 0; +} + +/** pci_walk_bus - walk bus under this device, calling callback. + * @top device whose peers should be walked + * @cb callback to be called for each device found + * @userdata arbitrary pointer to be passed to callback. + * + * Walk the bus on which this device sits, including any + * bridged devices on busses under this bus. Call the provided + * callback on each device found. 
+ */ +typedef void (*pci_buswalk_cb)(struct pci_dev *, void *); + +static void +pci_walk_bus (struct pci_dev *top, pci_buswalk_cb cb, void *userdata) +{ + struct pci_dev *dev, *tmp; + + spin_lock(&pci_bus_lock); + list_for_each_entry_safe (dev, tmp, &top->bus->devices, bus_list) { + pci_dev_get(dev); + spin_unlock(&pci_bus_lock); + + /* Run device routines with the bus unlocked */ + cb (dev, userdata); + if (dev->subordinate && !list_empty(&dev->subordinate->devices)) { + pci_walk_bus (pci_dev_b(dev->subordinate->devices.next), cb, userdata); + } + spin_lock(&pci_bus_lock); + pci_dev_put(dev); + } + spin_unlock(&pci_bus_lock); +} + +/** + * rpaphp_find_slot - find and return the slot holding the device + * @dev: pci device for which we want the slot structure. + */ +static struct slot *rpaphp_find_slot(struct pci_dev *dev) +{ + struct list_head *tmp, *n; + struct slot *slot; + + list_for_each_safe(tmp, n, &rpaphp_slot_head) { + struct pci_bus *bus; + + slot = list_entry(tmp, struct slot, rpaphp_slot_list); + + /* PHB's don't have bridges. */ + if (slot->bridge == NULL) + continue; + + /* The PCI device could be the slot itself. */ + if (slot->bridge == dev) + return slot; + + bus = slot->bridge->subordinate; + if (!bus) { + printk (KERN_WARNING "PCI bridge is missing bus: %s %s\n", + pci_name (slot->bridge), pci_pretty_name (slot->bridge)); + continue; /* should never happen? */ + } + + if (pci_search_bus_for_dev (bus, dev)) + return slot; + } + return NULL; +} + +/* ------------------------------------------------------- */ +/** eeh_report_error - report an EEH error to each device, + * collect up and merge the device responses.
+ */ + +static void eeh_report_error(struct pci_dev *dev, void *userdata) +{ + enum pcierr_result rc, *res = userdata; + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + driver->err_handler.error_state = pci_channel_io_frozen; + if (!driver->err_handler.error_detected) + return; + + rc = driver->err_handler.error_detected (dev, pci_channel_io_frozen); + if (*res == PCIERR_RESULT_NONE) *res = rc; + if (*res == PCIERR_RESULT_NEED_RESET) return; + if (*res == PCIERR_RESULT_DISCONNECT && + rc == PCIERR_RESULT_NEED_RESET) *res = rc; +} + +/** eeh_report_reset -- tell this device that the pci slot + * has been reset. + */ + +static void eeh_report_reset(struct pci_dev *dev, void *userdata) +{ + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + if (!driver->err_handler.slot_reset) + return; + + driver->err_handler.slot_reset (dev); +} + +static void eeh_report_resume(struct pci_dev *dev, void *userdata) +{ + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + driver->err_handler.error_state = pci_channel_io_normal; + if (!driver->err_handler.resume) + return; + + driver->err_handler.resume (dev); +} + +static void eeh_report_failure(struct pci_dev *dev, void *userdata) +{ + struct pci_driver *driver = dev->driver; + + if (!driver) + return; + driver->err_handler.error_state = pci_channel_io_perm_failure; + if (!driver->err_handler.error_detected) + return; + + driver->err_handler.error_detected (dev, pci_channel_io_perm_failure); +} + +/* ------------------------------------------------------- */ +/** + * handle_eeh_events -- reset a PCI device after hard lockup. + * + * pSeries systems will isolate a PCI slot if the PCI-Host + * bridge detects address or data parity errors, DMA's + * occurring to wild addresses (which usually happen due to + * bugs in device drivers or in PCI adapter firmware). + * Slot isolations also occur if #SERR, #PERR or other misc + * PCI-related errors are detected.
+ * + * Recovery process consists of unplugging the device driver + * (which generated hotplug events to userspace), then issuing + * a PCI #RST to the device, then reconfiguring the PCI config + * space for all bridges & devices under this slot, and then + * finally restarting the device drivers (which cause a second + * set of hotplug events to go out to userspace). + */ + +int eeh_reset_device (struct pci_dev *dev, struct device_node *dn, int reconfig) +{ + struct slot *frozen_slot= NULL; + + if (!dev) + return 1; + + if (reconfig) + frozen_slot = rpaphp_find_slot(dev); + + if (reconfig && frozen_slot) rpaphp_unconfig_pci_adapter (frozen_slot); + + /* Reset the pci controller. (Asserts RST#; resets config space). + * Reconfigure bridges and devices */ + rtas_set_slot_reset (dn->child); + + /* Walk over all functions on this device */ + struct device_node *peer = dn->child; + while (peer) { + rtas_configure_bridge(peer); + eeh_restore_bars(peer); + peer = peer->sibling; + } + + /* Give the system 5 seconds to finish running the user-space + * hotplug scripts, e.g. ifdown for ethernet. Yes, this is a hack, + * but if we don't do this, weird things happen. + */ + if (reconfig && frozen_slot) { + ssleep (5); + rpaphp_enable_pci_slot (frozen_slot); + } + return 0; +} + +/* The longest amount of time to wait for a pci device + * to come back on line, in seconds. 
+ */ +#define MAX_WAIT_FOR_RECOVERY 15 + +int handle_eeh_events (struct notifier_block *self, + unsigned long reason, void *ev) +{ + int freeze_count=0; + struct device_node *frozen_device; + struct peh_event *event = ev; + struct pci_dev *dev = event->dev; + int perm_failure = 0; + + if (!dev) + { + printk ("EEH: EEH error caught, but no PCI device specified!\n"); + return 1; + } + + frozen_device = pci_bus_to_OF_node(dev->bus); + if (!frozen_device) + { + printk (KERN_ERR "EEH: Cannot find PCI controller for %s %s\n", + pci_name(dev), pci_pretty_name (dev)); + + return 1; + } + BUG_ON (frozen_device->phb==NULL); + + /* We get "permanent failure" messages on empty slots. + * These are false alarms. Empty slots have no child dn. */ + if ((event->state == pci_channel_io_perm_failure) && (frozen_device == NULL)) + return 0; + + if (frozen_device) + freeze_count = frozen_device->eeh_freeze_count; + freeze_count ++; + if (freeze_count > EEH_MAX_ALLOWED_FREEZES) + perm_failure = 1; + + /* If the reset state is a '5' and the time to reset is 0 (infinity) + * or is more then 15 seconds, then mark this as a permanent failure. + */ + if ((event->state == pci_channel_io_perm_failure) && + ((event->time_unavail <= 0) || + (event->time_unavail > MAX_WAIT_FOR_RECOVERY*1000))) + perm_failure = 1; + + /* Log the error with the rtas logger. */ + if (perm_failure) { + /* + * About 90% of all real-life EEH failures in the field + * are due to poorly seated PCI cards. Only 10% or so are + * due to actual, failed cards. + */ + printk (KERN_ERR + "EEH: device %s:%s has failed %d times \n" + "and has been permanently disabled. Please try reseating\n" + "this device or replacing it.\n", + pci_name (dev), + pci_pretty_name (dev), + freeze_count); + + eeh_slot_error_detail (frozen_device, 2 /* Permanent Error */); + + /* Notify all devices that they're about to go down. 
*/ + pci_walk_bus (dev, eeh_report_failure, 0); + + /* If there's a hotplug slot, unconfigure it */ + // XXX we need alternate way to deconfigure non-hotplug slots. + struct slot * frozen_slot = rpaphp_find_slot(dev); + if (frozen_slot) + rpaphp_unconfig_pci_adapter (frozen_slot); + return 1; + } else { + eeh_slot_error_detail (frozen_device, 1 /* Temporary Error */); + } + + printk (KERN_WARNING + "EEH: This device has failed %d times since last reboot: %s:%s\n", + freeze_count, + pci_name (dev), + pci_pretty_name (dev)); + + /* Walk the various device drivers attached to this slot, + * letting each know about the EEH bug. + */ + enum pcierr_result result = PCIERR_RESULT_NONE; + pci_walk_bus (dev, eeh_report_error, &result); + + /* If all device drivers were EEH-unaware, then pci hotplug + * the device, and hope that clears the error. */ + if (result == PCIERR_RESULT_NONE) { + eeh_reset_device (dev, frozen_device, 1); + } + + /* If any device called out for a reset, then reset the slot */ + if (result == PCIERR_RESULT_NEED_RESET) { + eeh_reset_device (dev, frozen_device, 0); + pci_walk_bus (dev, eeh_report_reset, 0); + } + + /* If all devices reported they can proceed, the re-enable PIO */ + if (result == PCIERR_RESULT_CAN_RECOVER) { + /* XXX Not supported; we brute-force reset the device */ + eeh_reset_device (dev, frozen_device, 0); + pci_walk_bus (dev, eeh_report_reset, 0); + } + + /* Tell all device drivers that they can resume operations */ + pci_walk_bus (dev, eeh_report_resume, 0); + + /* Store the freeze count with the pci adapter, and not the slot. + * This way, if the device is replaced, the count is cleared. 
+ */ + frozen_device->eeh_freeze_count = freeze_count; + + return 1; +} + +static struct notifier_block eeh_block; + +void __init init_eeh_handler (void) +{ + eeh_block.notifier_call = handle_eeh_events; + peh_register_notifier (&eeh_block); +} + +void __exit exit_eeh_handler (void) +{ + peh_unregister_notifier (&eeh_block); +} + --- linux-2.6.12-git10/drivers/pci/hotplug/Makefile.linas-orig 2005-06-17 14:48:29.000000000 -0500 +++ linux-2.6.12-git10/drivers/pci/hotplug/Makefile 2005-06-22 15:28:29.000000000 -0500 @@ -41,6 +41,7 @@ acpiphp-objs := acpiphp_core.o \ acpiphp_res.o rpaphp-objs := rpaphp_core.o \ + rpaphp_eeh.o \ rpaphp_pci.o \ rpaphp_slot.o \ rpaphp_vio.o From greg at kroah.com Wed Jun 29 10:29:51 2005 From: greg at kroah.com (Greg KH) Date: Tue, 28 Jun 2005 17:29:51 -0700 Subject: [PATCH 1/13]: PCI Err: pci.h header file changes In-Reply-To: <20050628235817.GA6324@austin.ibm.com> References: <20050628235817.GA6324@austin.ibm.com> Message-ID: <20050629002951.GA17885@kroah.com> On Tue, Jun 28, 2005 at 06:58:17PM -0500, Linas Vepstas wrote: > @@ -673,6 +704,7 @@ struct pci_driver { > int (*enable_wake) (struct pci_dev *dev, pci_power_t state, int enable); /* Enable wake event */ > void (*shutdown) (struct pci_dev *dev); > > + struct pci_error_handlers err_handler; > struct device_driver driver; > struct pci_dynids dynids; > }; Shouldn't that be a pointer and not the whole structure? Wouldn't that make it easier to "reuse" error handlers? 
thanks, greg k-h From arnd at arndb.de Wed Jun 29 10:38:58 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Wed, 29 Jun 2005 02:38:58 +0200 Subject: [PATCH] net: add driver for the NIC on Cell Blades In-Reply-To: <1119966799.3175.32.camel@laptopd505.fenrus.org> References: <200506281528.08834.arnd@arndb.de> <1119966799.3175.32.camel@laptopd505.fenrus.org> Message-ID: <200506290238.59231.arnd@arndb.de> On Tuesday 28 June 2005 15:53, Arjan van de Ven wrote: > > > +static void > > +spider_net_rx_irq_off(struct spider_net_card *card) > > +{ > > + u32 regvalue; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&card->intmask_lock, flags); > > + regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); > > + regvalue &= ~SPIDER_NET_RXINT; > > + spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); > > + spin_unlock_irqrestore(&card->intmask_lock, flags); > > +} > > I think you have a PCI posting bug here.... Could you be more specific? My guess would be that the 'sync' in writel takes care of this. Should there be an extra mmiowb() in here or are you referring to some other problem? 
Arnd <>< From benh at kernel.crashing.org Wed Jun 29 11:43:56 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 29 Jun 2005 11:43:56 +1000 Subject: [PATCH 1/13]: PCI Err: pci.h header file changes In-Reply-To: <20050628235817.GA6324@austin.ibm.com> References: <20050628235817.GA6324@austin.ibm.com> Message-ID: <1120009436.5133.225.camel@gaston> On Tue, 2005-06-28 at 18:58 -0500, Linas Vepstas wrote: > + > +/* PCI bus error event callbacks */ > +struct pci_error_handlers > +{ > + enum pci_channel_state error_state; /* current error state */ > + int (*error_detected)(struct pci_dev *dev, enum pci_channel_state error); > + int (*mmio_enabled)(struct pci_dev *dev); /* MMIO has been reenabled, but not DMA */ > + int (*link_reset)(struct pci_dev *dev); /* PCI Express link has been reset */ > + int (*slot_reset)(struct pci_dev *dev); /* PCI slot has been reset */ > + void (*resume)(struct pci_dev *dev); /* Device driver may resume normal operations */ > +}; The state variable shouldn't be in that structure. As Greg pointed out, we should have a pointer to that structure in pci_driver, not a copy, and the error state should be in pci_dev. (With your current code, the state is per driver, which is broken anyway.) Ben. 
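To make the layout under discussion concrete, here is a minimal userspace sketch, not kernel code. It assumes a driver that holds a pointer to a shared, constant table of callbacks, with the channel state stored in each device. Every `demo_` name is invented for illustration and does not correspond to the real pci.h definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative only. */
enum demo_channel_state { DEMO_IO_NORMAL, DEMO_IO_FROZEN, DEMO_IO_PERM_FAILURE };

struct demo_dev;

/* Callbacks only, no state: one const table can be shared by many drivers. */
struct demo_error_handlers {
	int (*error_detected)(struct demo_dev *dev, enum demo_channel_state state);
	void (*resume)(struct demo_dev *dev);
};

struct demo_driver {
	const struct demo_error_handlers *err_handler; /* pointer, may be NULL */
};

struct demo_dev {
	struct demo_driver *driver;
	enum demo_channel_state error_state; /* per device, not per driver */
	int resets_requested;
};

static int demo_nic_error_detected(struct demo_dev *dev, enum demo_channel_state state)
{
	(void)state;
	dev->resets_requested++;
	return 1; /* stands in for "need reset" */
}

static void demo_nic_resume(struct demo_dev *dev)
{
	dev->error_state = DEMO_IO_NORMAL;
}

/* A generic table that several NIC drivers could point at. */
const struct demo_error_handlers demo_generic_nic_handlers = {
	.error_detected = demo_nic_error_detected,
	.resume = demo_nic_resume,
};

/* Freeze one device: only that device's state changes. */
int demo_report_error(struct demo_dev *dev)
{
	dev->error_state = DEMO_IO_FROZEN;
	if (!dev->driver || !dev->driver->err_handler ||
	    !dev->driver->err_handler->error_detected)
		return 0;
	return dev->driver->err_handler->error_detected(dev, dev->error_state);
}
```

With two devices bound to the same driver, freezing one leaves the other in the normal state, which is not possible when the state lives in the driver.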
From benh at kernel.crashing.org Wed Jun 29 11:46:58 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 29 Jun 2005 11:46:58 +1000 Subject: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery In-Reply-To: <20050628235848.GA6376@austin.ibm.com> References: <20050628235848.GA6376@austin.ibm.com> Message-ID: <1120009619.5133.228.camel@gaston> On Tue, 2005-06-28 at 18:58 -0500, Linas Vepstas wrote: > /** e100_io_error_detected() is called when PCI error is detected */ > +static int e100_io_error_detected (struct pci_dev *pdev, enum > pci_channel_state state) > +{ > + struct net_device *netdev = pci_get_drvdata(pdev); > + struct nic *nic = netdev_priv(netdev); > + > + mod_timer(&nic->watchdog, jiffies + 30*HZ); > + e100_down(nic); > + > + /* Request a slot reset. */ > + return PCIERR_RESULT_NEED_RESET; > +} I'm not sure just "pushing" the watchdog timer to 30sec in the future is the way to go here. What about netif_stop_queue() or so ? Ben. From benh at kernel.crashing.org Wed Jun 29 11:51:07 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 29 Jun 2005 11:51:07 +1000 Subject: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery In-Reply-To: <20050628235919.GA6415@austin.ibm.com> References: <20050628235919.GA6415@austin.ibm.com> Message-ID: <1120009868.5133.232.camel@gaston> On Tue, 2005-06-28 at 18:59 -0500, Linas Vepstas wrote: > pci-err-7-symbios.patch > > Adds PCI Error recoervy callbacks to the Symbios Sym53c8xx driver. > Tested, seems to work well under i/o stress to one disk. Not > stress tested under heavy i/o to multiple scsi devices. > > Note the check of the pci error state flag inside an infinite > loop inside the interrupt handler. Without this check, the > device can spin forever, locking up hard, long before the > asynchronous error event (and callbacks) are ever called. I don't understand the logic of that check. In general, I don't think checking the error state is reliable at all. 
You may be in an interrupt on the only CPU in the system, thus the error management code may have no chance to update that error state field while you are looping... It may work for us since we call the eeh stuff from the IO accessors but will not in the generic case. Normally, you should check for non-responding hardware by testing things like reading all ff's or having a timeout in the loop. The bug is that the driver has a potential infinite loop in the first place. The only type of "synchronous" error checking that can be done is what is proposed by Hidetoshi Seto. You could use his stuff here. Ben. From benh at kernel.crashing.org Wed Jun 29 11:59:47 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 29 Jun 2005 11:59:47 +1000 Subject: [PATCH 8/13]: PCI Err: Event delivery utility In-Reply-To: <20050628235932.GA6429@austin.ibm.com> References: <20050628235932.GA6429@austin.ibm.com> Message-ID: <1120010387.5133.235.camel@gaston> On Tue, 2005-06-28 at 18:59 -0500, Linas Vepstas wrote: > pci-err-8-pci-err-event.patch > > [RFC] > > PCI Error distribution utility routine. This patch defines > a utility routine that hasn't yet been discussed much on > the mailing list; I've made this architecture independent > with the idea that various architectures may find it handy, > but its not directly required, or relevant, to the overall > EEH error recovery mechanism. (It could be buried in > arch-dependent code or implemented differently.) > > The current design has the arch dependent code detect > a PCI bus error. That code uses this utility to generate > a detection event. This event is then caught by PCI > hotplug code, which drives the slot recovery. If the > affected device drivers have recovery callbacks, these > are used; all other devices are hotplugged. 
> > There are certainly other (simpler) ways to attach the > arch-specific error detection code to the hot-plug mediated > recovery code; this routine is rather left-over from > earlier email discussions. Should this stay, or not? Certainly needs to be in a separate .h at least ... Also, you have some lifetime issues. You probably want to do a get() on pci_dev when you put it in your struct and put() it after the notifier... Oh wait, you are doing pci_dev_put() ... but no pci_dev_get() ... The latter must be missing from peh_send_failure_event(). I'd keep that in arch code for now. Ben. From ak at muc.de Wed Jun 29 13:02:37 2005 From: ak at muc.de (Andi Kleen) Date: 29 Jun 2005 05:02:37 +0200 Subject: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery In-Reply-To: <20050628235919.GA6415@austin.ibm.com> References: <20050628235919.GA6415@austin.ibm.com> Message-ID: <20050629030237.GB71992@muc.de> On Tue, Jun 28, 2005 at 06:59:19PM -0500, Linas Vepstas wrote: > > pci-err-7-symbios.patch > > Adds PCI Error recovery callbacks to the Symbios Sym53c8xx driver. > Tested, seems to work well under i/o stress to one disk. Not > stress tested under heavy i/o to multiple scsi devices. What does this do to the IO requests currently being processed by the firmware? Do they all get aborted? Is it ensured that they all error out properly? 
-Andi From ak at muc.de Wed Jun 29 13:04:15 2005 From: ak at muc.de (Andi Kleen) Date: 29 Jun 2005 05:04:15 +0200 Subject: [PATCH 1/13]: PCI Err: pci.h header file changes In-Reply-To: <20050629002951.GA17885@kroah.com> References: <20050628235817.GA6324@austin.ibm.com> <20050629002951.GA17885@kroah.com> Message-ID: <20050629030415.GC71992@muc.de> On Tue, Jun 28, 2005 at 05:29:51PM -0700, Greg KH wrote: > On Tue, Jun 28, 2005 at 06:58:17PM -0500, Linas Vepstas wrote: > > @@ -673,6 +704,7 @@ struct pci_driver { > > int (*enable_wake) (struct pci_dev *dev, pci_power_t state, int enable); /* Enable wake event */ > > void (*shutdown) (struct pci_dev *dev); > > > > + struct pci_error_handlers err_handler; > > struct device_driver driver; > > struct pci_dynids dynids; > > }; > > Shouldn't that be a pointer and not the whole structure? Wouldn't that > make it easier to "reuse" error handlers? Yes, it's a good idea. In fact we could have a generic NIC error handler structure then that just calls the watchdog timeout function. I suspect that would be sufficient for most NICs. -Andi From benh at kernel.crashing.org Wed Jun 29 14:34:53 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 29 Jun 2005 14:34:53 +1000 Subject: [PATCH 10/13]: PCI Err: PPC64-specific recovery infrastructure In-Reply-To: <20050628235956.GA6455@austin.ibm.com> References: <20050628235956.GA6455@austin.ibm.com> Message-ID: <1120019694.5133.244.camel@gaston> On Tue, 2005-06-28 at 18:59 -0500, Linas Vepstas wrote: > > struct device_node { > char *name; > @@ -137,9 +138,13 @@ struct device_node { > int devfn; /* for pci devices */ > int eeh_mode; /* See eeh.h for possible > EEH_MODEs */ > int eeh_config_addr; > + int eeh_check_count; /* number of times device driver > ignored error */ > + int eeh_freeze_count; /* number of times this device froze > up. 
*/ > + int eeh_is_bridge; /* device is pci-to-pci bridge */ > int pci_ext_config_space; /* for pci devices */ > struct pci_controller *phb; /* for pci devices */ > struct iommu_table *iommu_table; /* for phb's or > bridges */ > + u32 config_space[16]; /* saved PCI config space */ > > struct property *properties; > struct device_node *parent; Please, do not add crap to struct device_node. It's already bloated enough and we intend to instead get rid of the stuff in there. Do you actually need to save the config space at all ? Can't you just use "assigned-address" property to fill up the BARs again ? As for the other EEH things, well, we probably need to bite the bullet and do what we talked about doing for a while, that is split the PCI related junk out of struct device_node and into a separate structure. We could maybe at first (to ease the transition) keep a pointer to it in device_node, and we can create that structure early in pci_dn. That way, we only really need to add gunk to PCI devices and not to all nodes. Same for VIO actually. Ben. 
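The split suggested above can be sketched in plain C as follows. This is only an illustration under assumptions: the `demo_` structure and function names are invented, the field selection is arbitrary, and real kernel code would use its own allocators rather than `calloc()`.

```c
#include <stdlib.h>

/* Hypothetical sketch: PCI-only bookkeeping moved out of the generic node. */
struct demo_pci_dn {
	int devfn;
	int eeh_mode;
	int eeh_freeze_count;
};

struct demo_device_node {
	const char *name;
	struct demo_pci_dn *pci; /* NULL for non-PCI nodes */
};

/* Allocate the side structure only for nodes that are PCI devices. */
int demo_attach_pci_data(struct demo_device_node *dn, int devfn)
{
	struct demo_pci_dn *pdn = calloc(1, sizeof(*pdn));

	if (!pdn)
		return -1;
	pdn->devfn = devfn;
	dn->pci = pdn;
	return 0;
}

/* Accessor that tolerates non-PCI nodes. */
int demo_freeze_count(const struct demo_device_node *dn)
{
	return dn->pci ? dn->pci->eeh_freeze_count : 0;
}
```

Non-PCI nodes then pay for one pointer instead of every PCI and EEH field.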
From arjan at infradead.org Wed Jun 29 17:32:25 2005 From: arjan at infradead.org (Arjan van de Ven) Date: Wed, 29 Jun 2005 09:32:25 +0200 Subject: [PATCH] net: add driver for the NIC on Cell Blades In-Reply-To: <200506290238.59231.arnd@arndb.de> References: <200506281528.08834.arnd@arndb.de> <1119966799.3175.32.camel@laptopd505.fenrus.org> <200506290238.59231.arnd@arndb.de> Message-ID: <1120030346.3196.21.camel@laptopd505.fenrus.org> On Wed, 2005-06-29 at 02:38 +0200, Arnd Bergmann wrote: > On Tuesday 28 June 2005 15:53, Arjan van de Ven wrote: > > > > > +static void > > > +spider_net_rx_irq_off(struct spider_net_card *card) > > > +{ > > > + u32 regvalue; > > > + unsigned long flags; > > > + > > > + spin_lock_irqsave(&card->intmask_lock, flags); > > > + regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); > > > + regvalue &= ~SPIDER_NET_RXINT; > > > + spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); > > > + spin_unlock_irqrestore(&card->intmask_lock, flags); > > > +} > > > > I think you have a PCI posting bug here.... > > Could you be more specific? My guess would be that the 'sync' in writel > takes care of this. Should there be an extra mmiowb() in here or are > you referring to some other problem? Different problem. The sync will get the byte out of the CPU. It won't get it out of the PCI bridges... In short, PCI bridges are allowed to buffer (post) writes until data traffic in the other direction happens (e.g. readl() or DMA). In cases where you want your writel to hit the device instantly (and disabling irqs is generally one of those) you need to flush this posting cache with a dummy readl(). 
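The posting behaviour described above can be modelled in a few lines of plain C. This is a toy userspace simulation, not driver code: `toy_bridge`, `toy_writel()`, `toy_readl()` and `toy_irq_off()` are invented names, and the one-entry buffer is an assumption made to keep the sketch short. It shows a write sitting in the bridge until a read from the device forces it through.

```c
#include <stdint.h>

/* Toy model: a bridge with a one-entry posted-write buffer in front of
 * a single device register. All names here are hypothetical. */
struct toy_bridge {
	uint32_t device_reg; /* what the device actually sees */
	uint32_t posted_val; /* write waiting inside the bridge */
	int has_posted;      /* non-zero while a write is buffered */
};

/* The CPU-side write completes as soon as the bridge accepts it. */
void toy_writel(struct toy_bridge *b, uint32_t val)
{
	b->posted_val = val;
	b->has_posted = 1;
}

/* A read must not pass earlier posted writes, so the bridge drains
 * its buffer before the read reaches the device. */
uint32_t toy_readl(struct toy_bridge *b)
{
	if (b->has_posted) {
		b->device_reg = b->posted_val;
		b->has_posted = 0;
	}
	return b->device_reg;
}

/* Mask an interrupt bit and flush the write with a dummy read. */
void toy_irq_off(struct toy_bridge *b, uint32_t irq_bit)
{
	uint32_t v = toy_readl(b);

	v &= ~irq_bit;
	toy_writel(b, v);
	(void)toy_readl(b); /* dummy read: forces the posted write out */
}
```

Without the final dummy read in `toy_irq_off()`, the mask update would still be sitting in the bridge when the function returns, which is the situation being pointed out for `spider_net_rx_irq_off()`.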
http://ftp.linux.org.uk/pub/linux/willy/patches/debug-write.diff is a patch to simulate this behavior more aggressively From michael at ellerman.id.au Wed Jun 29 17:43:48 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:43:48 +1000 Subject: Updated patches for iSeries cleanup Message-ID: <200506291743.52405.michael@ellerman.id.au> Hi, Here is an updated series of patches to clean up some of the iSeries code. I've added two new patches. One moves set_spread_lpevents() into ItLpQueue.c and therefore makes spread_lpevents static. And the other replaces the custom sort-of-atomic in ItLpQueue with a straight spinlock. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 4/17] ppc64: Reorganise the paca initialisation macros In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.658299.199492530402.qpatch@concordia> Hi, This patch updates the macros that initialise the paca to remove the lpq parameter. It also rearranges them a bit in the hope of making them clearer. 
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/pacaData.c | 306 ++++++++++++++++++++++--------------------- 1 files changed, 160 insertions(+), 146 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/pacaData.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pacaData.c +++ ppc64-2.6/arch/ppc64/kernel/pacaData.c @@ -42,20 +42,7 @@ extern unsigned long __toc_start; * processors. The processor VPD array needs one entry per physical * processor (not thread). */ -#ifdef CONFIG_PPC_ISERIES -#define EXTRA_INITS(number, lpq) \ - .lppaca_ptr = &paca[number].lppaca, \ - .reg_save_ptr = &paca[number].reg_save, \ - .reg_save = { \ - .xDesc = 0xd397d9e2, /* "LpRS" */ \ - .xSize = sizeof(struct ItLpRegSave) \ - }, -#else -#define EXTRA_INITS(number, lpq) -#endif - -#define PACAINITDATA(number,start,lpq,asrr,asrv) \ -{ \ +#define PACA_INIT_COMMON(number, start, asrr, asrv) \ .lock_token = 0x8000, \ .paca_index = (number), /* Paca Index */ \ .default_decr = 0x00ff0000, /* Initial Decr */ \ @@ -73,147 +60,79 @@ extern unsigned long __toc_start; .end_of_quantum = 0xfffffffffffffffful, \ .slb_count = 64, \ }, \ - EXTRA_INITS((number), (lpq)) \ -} -struct paca_struct paca[] = { #ifdef CONFIG_PPC_ISERIES - PACAINITDATA( 0, 1, &xItLpQueue, 0, STAB0_VIRT_ADDR), +#define PACA_INIT_ISERIES(number) \ + .lppaca_ptr = &paca[number].lppaca, \ + .reg_save_ptr = &paca[number].reg_save, \ + .reg_save = { \ + .xDesc = 0xd397d9e2, /* "LpRS" */ \ + .xSize = sizeof(struct ItLpRegSave) \ + } + +#define PACA_INIT(number) \ +{ \ + PACA_INIT_COMMON(number, 0, 0, 0) \ + PACA_INIT_ISERIES(number) \ +} + +#define BOOTCPU_PACA_INIT(number) \ +{ \ + PACA_INIT_COMMON(number, 1, 0, STAB0_VIRT_ADDR) \ + PACA_INIT_ISERIES(number) \ +} + #else - PACAINITDATA( 0, 1, NULL, STAB0_PHYS_ADDR, STAB0_VIRT_ADDR), +#define PACA_INIT(number) \ +{ \ + PACA_INIT_COMMON(number, 0, 0, 0) \ +} + +#define BOOTCPU_PACA_INIT(number) \ +{ \ + PACA_INIT_COMMON(number, 1, 
STAB0_PHYS_ADDR, STAB0_VIRT_ADDR) \ +} #endif + +struct paca_struct paca[] = { + BOOTCPU_PACA_INIT(0), #if NR_CPUS > 1 - PACAINITDATA( 1, 0, NULL, 0, 0), - PACAINITDATA( 2, 0, NULL, 0, 0), - PACAINITDATA( 3, 0, NULL, 0, 0), + PACA_INIT( 1), PACA_INIT( 2), PACA_INIT( 3), #if NR_CPUS > 4 - PACAINITDATA( 4, 0, NULL, 0, 0), - PACAINITDATA( 5, 0, NULL, 0, 0), - PACAINITDATA( 6, 0, NULL, 0, 0), - PACAINITDATA( 7, 0, NULL, 0, 0), + PACA_INIT( 4), PACA_INIT( 5), PACA_INIT( 6), PACA_INIT( 7), #if NR_CPUS > 8 - PACAINITDATA( 8, 0, NULL, 0, 0), - PACAINITDATA( 9, 0, NULL, 0, 0), - PACAINITDATA(10, 0, NULL, 0, 0), - PACAINITDATA(11, 0, NULL, 0, 0), - PACAINITDATA(12, 0, NULL, 0, 0), - PACAINITDATA(13, 0, NULL, 0, 0), - PACAINITDATA(14, 0, NULL, 0, 0), - PACAINITDATA(15, 0, NULL, 0, 0), - PACAINITDATA(16, 0, NULL, 0, 0), - PACAINITDATA(17, 0, NULL, 0, 0), - PACAINITDATA(18, 0, NULL, 0, 0), - PACAINITDATA(19, 0, NULL, 0, 0), - PACAINITDATA(20, 0, NULL, 0, 0), - PACAINITDATA(21, 0, NULL, 0, 0), - PACAINITDATA(22, 0, NULL, 0, 0), - PACAINITDATA(23, 0, NULL, 0, 0), - PACAINITDATA(24, 0, NULL, 0, 0), - PACAINITDATA(25, 0, NULL, 0, 0), - PACAINITDATA(26, 0, NULL, 0, 0), - PACAINITDATA(27, 0, NULL, 0, 0), - PACAINITDATA(28, 0, NULL, 0, 0), - PACAINITDATA(29, 0, NULL, 0, 0), - PACAINITDATA(30, 0, NULL, 0, 0), - PACAINITDATA(31, 0, NULL, 0, 0), + PACA_INIT( 8), PACA_INIT( 9), PACA_INIT( 10), PACA_INIT( 11), + PACA_INIT( 12), PACA_INIT( 13), PACA_INIT( 14), PACA_INIT( 15), + PACA_INIT( 16), PACA_INIT( 17), PACA_INIT( 18), PACA_INIT( 19), + PACA_INIT( 20), PACA_INIT( 21), PACA_INIT( 22), PACA_INIT( 23), + PACA_INIT( 24), PACA_INIT( 25), PACA_INIT( 26), PACA_INIT( 27), + PACA_INIT( 28), PACA_INIT( 29), PACA_INIT( 30), PACA_INIT( 31), #if NR_CPUS > 32 - PACAINITDATA(32, 0, NULL, 0, 0), - PACAINITDATA(33, 0, NULL, 0, 0), - PACAINITDATA(34, 0, NULL, 0, 0), - PACAINITDATA(35, 0, NULL, 0, 0), - PACAINITDATA(36, 0, NULL, 0, 0), - PACAINITDATA(37, 0, NULL, 0, 0), - PACAINITDATA(38, 0, NULL, 0, 
0), - PACAINITDATA(39, 0, NULL, 0, 0), - PACAINITDATA(40, 0, NULL, 0, 0), - PACAINITDATA(41, 0, NULL, 0, 0), - PACAINITDATA(42, 0, NULL, 0, 0), - PACAINITDATA(43, 0, NULL, 0, 0), - PACAINITDATA(44, 0, NULL, 0, 0), - PACAINITDATA(45, 0, NULL, 0, 0), - PACAINITDATA(46, 0, NULL, 0, 0), - PACAINITDATA(47, 0, NULL, 0, 0), - PACAINITDATA(48, 0, NULL, 0, 0), - PACAINITDATA(49, 0, NULL, 0, 0), - PACAINITDATA(50, 0, NULL, 0, 0), - PACAINITDATA(51, 0, NULL, 0, 0), - PACAINITDATA(52, 0, NULL, 0, 0), - PACAINITDATA(53, 0, NULL, 0, 0), - PACAINITDATA(54, 0, NULL, 0, 0), - PACAINITDATA(55, 0, NULL, 0, 0), - PACAINITDATA(56, 0, NULL, 0, 0), - PACAINITDATA(57, 0, NULL, 0, 0), - PACAINITDATA(58, 0, NULL, 0, 0), - PACAINITDATA(59, 0, NULL, 0, 0), - PACAINITDATA(60, 0, NULL, 0, 0), - PACAINITDATA(61, 0, NULL, 0, 0), - PACAINITDATA(62, 0, NULL, 0, 0), - PACAINITDATA(63, 0, NULL, 0, 0), + PACA_INIT( 32), PACA_INIT( 33), PACA_INIT( 34), PACA_INIT( 35), + PACA_INIT( 36), PACA_INIT( 37), PACA_INIT( 38), PACA_INIT( 39), + PACA_INIT( 40), PACA_INIT( 41), PACA_INIT( 42), PACA_INIT( 43), + PACA_INIT( 44), PACA_INIT( 45), PACA_INIT( 46), PACA_INIT( 47), + PACA_INIT( 48), PACA_INIT( 49), PACA_INIT( 50), PACA_INIT( 51), + PACA_INIT( 52), PACA_INIT( 53), PACA_INIT( 54), PACA_INIT( 55), + PACA_INIT( 56), PACA_INIT( 57), PACA_INIT( 58), PACA_INIT( 59), + PACA_INIT( 60), PACA_INIT( 61), PACA_INIT( 62), PACA_INIT( 63), #if NR_CPUS > 64 - PACAINITDATA(64, 0, NULL, 0, 0), - PACAINITDATA(65, 0, NULL, 0, 0), - PACAINITDATA(66, 0, NULL, 0, 0), - PACAINITDATA(67, 0, NULL, 0, 0), - PACAINITDATA(68, 0, NULL, 0, 0), - PACAINITDATA(69, 0, NULL, 0, 0), - PACAINITDATA(70, 0, NULL, 0, 0), - PACAINITDATA(71, 0, NULL, 0, 0), - PACAINITDATA(72, 0, NULL, 0, 0), - PACAINITDATA(73, 0, NULL, 0, 0), - PACAINITDATA(74, 0, NULL, 0, 0), - PACAINITDATA(75, 0, NULL, 0, 0), - PACAINITDATA(76, 0, NULL, 0, 0), - PACAINITDATA(77, 0, NULL, 0, 0), - PACAINITDATA(78, 0, NULL, 0, 0), - PACAINITDATA(79, 0, NULL, 0, 0), - 
PACAINITDATA(80, 0, NULL, 0, 0), - PACAINITDATA(81, 0, NULL, 0, 0), - PACAINITDATA(82, 0, NULL, 0, 0), - PACAINITDATA(83, 0, NULL, 0, 0), - PACAINITDATA(84, 0, NULL, 0, 0), - PACAINITDATA(85, 0, NULL, 0, 0), - PACAINITDATA(86, 0, NULL, 0, 0), - PACAINITDATA(87, 0, NULL, 0, 0), - PACAINITDATA(88, 0, NULL, 0, 0), - PACAINITDATA(89, 0, NULL, 0, 0), - PACAINITDATA(90, 0, NULL, 0, 0), - PACAINITDATA(91, 0, NULL, 0, 0), - PACAINITDATA(92, 0, NULL, 0, 0), - PACAINITDATA(93, 0, NULL, 0, 0), - PACAINITDATA(94, 0, NULL, 0, 0), - PACAINITDATA(95, 0, NULL, 0, 0), - PACAINITDATA(96, 0, NULL, 0, 0), - PACAINITDATA(97, 0, NULL, 0, 0), - PACAINITDATA(98, 0, NULL, 0, 0), - PACAINITDATA(99, 0, NULL, 0, 0), - PACAINITDATA(100, 0, NULL, 0, 0), - PACAINITDATA(101, 0, NULL, 0, 0), - PACAINITDATA(102, 0, NULL, 0, 0), - PACAINITDATA(103, 0, NULL, 0, 0), - PACAINITDATA(104, 0, NULL, 0, 0), - PACAINITDATA(105, 0, NULL, 0, 0), - PACAINITDATA(106, 0, NULL, 0, 0), - PACAINITDATA(107, 0, NULL, 0, 0), - PACAINITDATA(108, 0, NULL, 0, 0), - PACAINITDATA(109, 0, NULL, 0, 0), - PACAINITDATA(110, 0, NULL, 0, 0), - PACAINITDATA(111, 0, NULL, 0, 0), - PACAINITDATA(112, 0, NULL, 0, 0), - PACAINITDATA(113, 0, NULL, 0, 0), - PACAINITDATA(114, 0, NULL, 0, 0), - PACAINITDATA(115, 0, NULL, 0, 0), - PACAINITDATA(116, 0, NULL, 0, 0), - PACAINITDATA(117, 0, NULL, 0, 0), - PACAINITDATA(118, 0, NULL, 0, 0), - PACAINITDATA(119, 0, NULL, 0, 0), - PACAINITDATA(120, 0, NULL, 0, 0), - PACAINITDATA(121, 0, NULL, 0, 0), - PACAINITDATA(122, 0, NULL, 0, 0), - PACAINITDATA(123, 0, NULL, 0, 0), - PACAINITDATA(124, 0, NULL, 0, 0), - PACAINITDATA(125, 0, NULL, 0, 0), - PACAINITDATA(126, 0, NULL, 0, 0), - PACAINITDATA(127, 0, NULL, 0, 0), + PACA_INIT( 64), PACA_INIT( 65), PACA_INIT( 66), PACA_INIT( 67), + PACA_INIT( 68), PACA_INIT( 69), PACA_INIT( 70), PACA_INIT( 71), + PACA_INIT( 72), PACA_INIT( 73), PACA_INIT( 74), PACA_INIT( 75), + PACA_INIT( 76), PACA_INIT( 77), PACA_INIT( 78), PACA_INIT( 79), + PACA_INIT( 80), PACA_INIT( 
81), PACA_INIT( 82), PACA_INIT( 83), + PACA_INIT( 84), PACA_INIT( 85), PACA_INIT( 86), PACA_INIT( 87), + PACA_INIT( 88), PACA_INIT( 89), PACA_INIT( 90), PACA_INIT( 91), + PACA_INIT( 92), PACA_INIT( 93), PACA_INIT( 94), PACA_INIT( 95), + PACA_INIT( 96), PACA_INIT( 97), PACA_INIT( 98), PACA_INIT( 99), + PACA_INIT(100), PACA_INIT(101), PACA_INIT(102), PACA_INIT(103), + PACA_INIT(104), PACA_INIT(105), PACA_INIT(106), PACA_INIT(107), + PACA_INIT(108), PACA_INIT(109), PACA_INIT(110), PACA_INIT(111), + PACA_INIT(112), PACA_INIT(113), PACA_INIT(114), PACA_INIT(115), + PACA_INIT(116), PACA_INIT(117), PACA_INIT(118), PACA_INIT(119), + PACA_INIT(120), PACA_INIT(121), PACA_INIT(122), PACA_INIT(123), + PACA_INIT(124), PACA_INIT(125), PACA_INIT(126), PACA_INIT(127), #endif #endif #endif From michael at ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 14/17] ppc64: Cleanup proc printing of event types In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.309794.326454602624.qpatch@concordia> Hi, The code that prints event counts by type uses a hand-coded number of tabs to get the alignment right. Instead use a printf alignment, which will allow us to use the event_type strings elsewhere in the future. 
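As a rough illustration of the field-width approach, the fragment below formats one row in userspace with `snprintf()`. The helper name `format_event_line()` is made up for this sketch, and the leading indentation in the format string is an assumption, since the whitespace of the original call may not have survived in the archive.

```c
#include <stdio.h>

/* Hypothetical helper: format one "name count" row using printf field
 * widths instead of trailing tabs in the name strings. "%-20s"
 * left-justifies the name in a 20-column field; "%10lu" right-justifies
 * the count in a 10-column field. */
int format_event_line(char *buf, size_t len, const char *name, unsigned long count)
{
	return snprintf(buf, len, " %-20s %10lu\n", name, count);
}
```

Because the padding comes from the format string, names of different lengths such as "PCI I/O" and "Machine Facilities" produce rows of the same width, and the bare strings stay reusable in other output.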
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -31,15 +31,15 @@ struct hvlpevent_queue hvlpevent_queue _ DEFINE_PER_CPU(unsigned long[HvLpEvent_Type_NumTypes], hvlpevent_counts); static char *event_types[HvLpEvent_Type_NumTypes] = { - "Hypervisor\t\t", - "Machine Facilities\t", - "Session Manager\t", - "SPD I/O\t\t", - "Virtual Bus\t\t", - "PCI I/O\t\t", - "RIO I/O\t\t", - "Virtual Lan\t\t", - "Virtual I/O\t\t" + "Hypervisor", + "Machine Facilities", + "Session Manager", + "SPD I/O", + "Virtual Bus", + "PCI I/O", + "RIO I/O", + "Virtual Lan", + "Virtual I/O" }; static __inline__ int set_inUse(void) @@ -248,7 +248,7 @@ static int proc_lpevents_show(struct seq sum += per_cpu(hvlpevent_counts, cpu)[i]; } - seq_printf(m, " %s %10lu\n", event_types[i], sum); + seq_printf(m, " %-20s %10lu\n", event_types[i], sum); } seq_printf(m, "\n events processed by processor:\n"); From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 7/17] ppc64: Move xItLpQueue proc code into ItLpQueue.c In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.856822.525818440039.qpatch@concordia> Hi, Move the code that displays xItLpQueue values in /proc into ItLpQueue.c Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 59 +++++++++++++++++++++++++++++++++++++++ arch/ppc64/kernel/iSeries_proc.c | 48 ------------------------------- 2 files changed, 59 insertions(+), 48 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ 
ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -12,12 +12,26 @@ #include #include #include +#include +#include #include #include #include #include #include +static char *event_types[9] = { + "Hypervisor\t\t", + "Machine Facilities\t", + "Session Manager\t", + "SPD I/O\t\t", + "Virtual Bus\t\t", + "PCI I/O\t\t", + "RIO I/O\t\t", + "Virtual Lan\t\t", + "Virtual I/O\t\t" +}; + static __inline__ int set_inUse(void) { int t; @@ -208,3 +222,48 @@ void setup_hvlpevent_queue(void) (LpEventStackSize - LpEventMaxSize); xItLpQueue.xIndex = 0; } + +static int proc_lpevents_show(struct seq_file *m, void *v) +{ + unsigned int i; + + seq_printf(m, "LpEventQueue 0\n"); + seq_printf(m, " events processed:\t%lu\n", + (unsigned long)xItLpQueue.xLpIntCount); + + for (i = 0; i < 9; ++i) + seq_printf(m, " %s %10lu\n", event_types[i], + (unsigned long)xItLpQueue.xLpIntCountByType[i]); + + seq_printf(m, "\n events processed by processor:\n"); + + for_each_online_cpu(i) + seq_printf(m, " CPU%02d %10u\n", i, paca[i].lpevent_count); + + return 0; +} + +static int proc_lpevents_open(struct inode *inode, struct file *file) +{ + return single_open(file, proc_lpevents_show, NULL); +} + +static struct file_operations proc_lpevents_operations = { + .open = proc_lpevents_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static int __init proc_lpevents_init(void) +{ + struct proc_dir_entry *e; + + e = create_proc_entry("iSeries/lpevents", S_IFREG|S_IRUGO, NULL); + if (e) + e->proc_fops = &proc_lpevents_operations; + + return 0; +} +__initcall(proc_lpevents_init); + Index: ppc64-2.6/arch/ppc64/kernel/iSeries_proc.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_proc.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_proc.c @@ -40,50 +40,6 @@ static int __init iseries_proc_create(vo } core_initcall(iseries_proc_create); -static char *event_types[9] = { - "Hypervisor\t\t", - "Machine Facilities\t", - "Session 
Manager\t", - "SPD I/O\t\t", - "Virtual Bus\t\t", - "PCI I/O\t\t", - "RIO I/O\t\t", - "Virtual Lan\t\t", - "Virtual I/O\t\t" -}; - -static int proc_lpevents_show(struct seq_file *m, void *v) -{ - unsigned int i; - - seq_printf(m, "LpEventQueue 0\n"); - seq_printf(m, " events processed:\t%lu\n", - (unsigned long)xItLpQueue.xLpIntCount); - - for (i = 0; i < 9; ++i) - seq_printf(m, " %s %10lu\n", event_types[i], - (unsigned long)xItLpQueue.xLpIntCountByType[i]); - - seq_printf(m, "\n events processed by processor:\n"); - - for_each_online_cpu(i) - seq_printf(m, " CPU%02d %10u\n", i, paca[i].lpevent_count); - - return 0; -} - -static int proc_lpevents_open(struct inode *inode, struct file *file) -{ - return single_open(file, proc_lpevents_show, NULL); -} - -static struct file_operations proc_lpevents_operations = { - .open = proc_lpevents_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, -}; - static unsigned long startTitan = 0; static unsigned long startTb = 0; @@ -148,10 +104,6 @@ static int __init iseries_proc_init(void { struct proc_dir_entry *e; - e = create_proc_entry("iSeries/lpevents", S_IFREG|S_IRUGO, NULL); - if (e) - e->proc_fops = &proc_lpevents_operations; - e = create_proc_entry("iSeries/titanTod", S_IFREG|S_IRUGO, NULL); if (e) e->proc_fops = &proc_titantod_operations; From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 8/17] ppc64: Make two ItLpQueue related functions static In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.918274.246684346050.qpatch@concordia> Hi, External parties don't need to use ItLpQueue_getNextLpEvent() or ItLpQueue_clearValid(), they're internal to ItLpQueue.c Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 4 ++-- include/asm-ppc64/iSeries/ItLpQueue.h | 2 -- 2 files changed, 2 insertions(+), 4 deletions(-) Index: 
ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -62,7 +62,7 @@ static __inline__ void clear_inUse(void) extern LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; unsigned long ItLpQueueInProcess = 0; -struct HvLpEvent * ItLpQueue_getNextLpEvent(void) +static struct HvLpEvent * ItLpQueue_getNextLpEvent(void) { struct HvLpEvent * nextLpEvent = (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; @@ -97,7 +97,7 @@ int ItLpQueue_isLpIntPending(void) return next_event->xFlags.xValid | xItLpQueue.xPlicOverflowIntPending; } -void ItLpQueue_clearValid( struct HvLpEvent * event ) +static void ItLpQueue_clearValid( struct HvLpEvent * event ) { /* Clear the valid bit of the event * Also clear bits within this event that might Index: ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -76,10 +76,8 @@ struct ItLpQueue { extern struct ItLpQueue xItLpQueue; -extern struct HvLpEvent *ItLpQueue_getNextLpEvent(void); extern int ItLpQueue_isLpIntPending(void); extern unsigned ItLpQueue_process(struct pt_regs *); -extern void ItLpQueue_clearValid(struct HvLpEvent *); extern void setup_hvlpevent_queue(void); #endif /* _ITLPQUEUE_H */ From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 9/17] ppc64: Move definition of xItLpQueue In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.986142.956678160627.qpatch@concordia> Hi, The xItLpQueue is declared in LparData.c, move it into ItLpQueue.c LparData.c is the only other file that needs to know about xItLpQueue, so remove the extern definition from ItLpQueue.h and put it in 
LparData.c directly. Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 8 ++++++++ arch/ppc64/kernel/LparData.c | 7 +------ include/asm-ppc64/iSeries/ItLpQueue.h | 1 - 3 files changed, 9 insertions(+), 7 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -20,6 +20,14 @@ #include #include +/* + * The LpQueue is used to pass event data from the hypervisor to + * the partition. This is where I/O interrupt events are communicated. + * + * It is written to by the hypervisor so cannot end up in the BSS. + */ +struct ItLpQueue xItLpQueue __attribute__((__section__(".data"))); + static char *event_types[9] = { "Hypervisor\t\t", "Machine Facilities\t", Index: ppc64-2.6/arch/ppc64/kernel/LparData.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/LparData.c +++ ppc64-2.6/arch/ppc64/kernel/LparData.c @@ -28,13 +28,6 @@ #include #include -/* The LpQueue is used to pass event data from the hypervisor to - * the partition. This is where I/O interrupt events are communicated. - */ - -/* May be filled in by the hypervisor so cannot end up in the BSS */ -struct ItLpQueue xItLpQueue __attribute__((__section__(".data"))); - /* The HvReleaseData is the root of the information shared between * the hypervisor and Linux. From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 5/17] ppc64: Don't pass the pointers to xItLpQueue around In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.732424.791684246072.qpatch@concordia> Hi, Because there's only one ItLpQueue and we know where it is, ie. xItLpQueue, there's no point passing pointers to it around all over the place.
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 24 ++++++++++++------------ arch/ppc64/kernel/idle.c | 4 ++-- arch/ppc64/kernel/irq.c | 4 ++-- arch/ppc64/kernel/mf.c | 4 ++-- arch/ppc64/kernel/time.c | 4 ++-- include/asm-ppc64/iSeries/ItLpQueue.h | 4 ++-- 6 files changed, 22 insertions(+), 22 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -17,10 +17,10 @@ #include #include -static __inline__ int set_inUse( struct ItLpQueue * lpQueue ) +static __inline__ int set_inUse(void) { int t; - u32 * inUseP = &(lpQueue->xInUseWord); + u32 * inUseP = &xItLpQueue.xInUseWord; __asm__ __volatile__("\n\ 1: lwarx %0,0,%2 \n\ @@ -31,37 +31,37 @@ static __inline__ int set_inUse( struct stwcx. %0,0,%2 \n\ bne- 1b \n\ 2: eieio" - : "=&r" (t), "=m" (lpQueue->xInUseWord) - : "r" (inUseP), "m" (lpQueue->xInUseWord) + : "=&r" (t), "=m" (xItLpQueue.xInUseWord) + : "r" (inUseP), "m" (xItLpQueue.xInUseWord) : "cc"); return t; } -static __inline__ void clear_inUse( struct ItLpQueue * lpQueue ) +static __inline__ void clear_inUse(void) { - lpQueue->xInUseWord = 0; + xItLpQueue.xInUseWord = 0; } /* Array of LpEvent handler functions */ extern LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; unsigned long ItLpQueueInProcess = 0; -struct HvLpEvent * ItLpQueue_getNextLpEvent( struct ItLpQueue * lpQueue ) +struct HvLpEvent * ItLpQueue_getNextLpEvent(void) { struct HvLpEvent * nextLpEvent = - (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; + (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; if ( nextLpEvent->xFlags.xValid ) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ - lpQueue->xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + + xItLpQueue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + LpEventAlign ) / LpEventAlign ) * 
LpEventAlign; /* Wrap to beginning if no room at end */ - if (lpQueue->xSlicCurEventPtr > lpQueue->xSlicLastValidEventPtr) - lpQueue->xSlicCurEventPtr = lpQueue->xSlicEventStackPtr; + if (xItLpQueue.xSlicCurEventPtr > xItLpQueue.xSlicLastValidEventPtr) + xItLpQueue.xSlicCurEventPtr = xItLpQueue.xSlicEventStackPtr; } else nextLpEvent = NULL; @@ -71,15 +71,15 @@ struct HvLpEvent * ItLpQueue_getNextLpEv static unsigned long spread_lpevents = NR_CPUS; -int ItLpQueue_isLpIntPending( struct ItLpQueue * lpQueue ) +int ItLpQueue_isLpIntPending(void) { struct HvLpEvent *next_event; if (smp_processor_id() >= spread_lpevents) return 0; - next_event = (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; - return next_event->xFlags.xValid | lpQueue->xPlicOverflowIntPending; + next_event = (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; + return next_event->xFlags.xValid | xItLpQueue.xPlicOverflowIntPending; } void ItLpQueue_clearValid( struct HvLpEvent * event ) @@ -104,13 +104,13 @@ void ItLpQueue_clearValid( struct HvLpEv event->xFlags.xValid = 0; } -unsigned ItLpQueue_process( struct ItLpQueue * lpQueue, struct pt_regs *regs ) +unsigned ItLpQueue_process(struct pt_regs *regs) { unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; /* If we have recursed, just return */ - if ( !set_inUse( lpQueue ) ) + if ( !set_inUse() ) return 0; if (ItLpQueueInProcess == 0) @@ -119,13 +119,13 @@ unsigned ItLpQueue_process( struct ItLpQ BUG(); for (;;) { - nextLpEvent = ItLpQueue_getNextLpEvent( lpQueue ); + nextLpEvent = ItLpQueue_getNextLpEvent(); if ( nextLpEvent ) { /* Count events to return to caller - * and count processed events in lpQueue + * and count processed events in xItLpQueue */ ++numIntsProcessed; - lpQueue->xLpIntCount++; + xItLpQueue.xLpIntCount++; /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -140,7 +140,7 @@ unsigned ItLpQueue_process( struct ItLpQ * here! 
*/ if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) - lpQueue->xLpIntCountByType[nextLpEvent->xType]++; + xItLpQueue.xLpIntCountByType[nextLpEvent->xType]++; if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && lpEventHandler[nextLpEvent->xType] ) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); @@ -148,19 +148,19 @@ unsigned ItLpQueue_process( struct ItLpQ printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); ItLpQueue_clearValid( nextLpEvent ); - } else if ( lpQueue->xPlicOverflowIntPending ) + } else if ( xItLpQueue.xPlicOverflowIntPending ) /* * No more valid events. If overflow events are * pending process them */ - HvCallEvent_getOverflowLpEvents( lpQueue->xIndex); + HvCallEvent_getOverflowLpEvents( xItLpQueue.xIndex); else break; } ItLpQueueInProcess = 0; mb(); - clear_inUse( lpQueue ); + clear_inUse(); get_paca()->lpevent_count += numIntsProcessed; Index: ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -76,9 +76,9 @@ struct ItLpQueue { extern struct ItLpQueue xItLpQueue; -extern struct HvLpEvent *ItLpQueue_getNextLpEvent(struct ItLpQueue *); -extern int ItLpQueue_isLpIntPending(struct ItLpQueue *); -extern unsigned ItLpQueue_process(struct ItLpQueue *, struct pt_regs *); +extern struct HvLpEvent *ItLpQueue_getNextLpEvent(void); +extern int ItLpQueue_isLpIntPending(void); +extern unsigned ItLpQueue_process(struct pt_regs *); extern void ItLpQueue_clearValid(struct HvLpEvent *); #endif /* _ITLPQUEUE_H */ Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -88,7 +88,7 @@ static int iSeries_idle(void) while (1) { if (lpaca->lppaca.shared_proc) { - if (ItLpQueue_isLpIntPending(&xItLpQueue)) + if 
(ItLpQueue_isLpIntPending()) process_iSeries_events(); if (!need_resched()) yield_shared_processor(); @@ -100,7 +100,7 @@ static int iSeries_idle(void) while (!need_resched()) { HMT_medium(); - if (ItLpQueue_isLpIntPending(&xItLpQueue)) + if (ItLpQueue_isLpIntPending()) process_iSeries_events(); HMT_low(); } Index: ppc64-2.6/arch/ppc64/kernel/irq.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/irq.c +++ ppc64-2.6/arch/ppc64/kernel/irq.c @@ -294,8 +294,8 @@ void do_IRQ(struct pt_regs *regs) iSeries_smp_message_recv(regs); } #endif /* CONFIG_SMP */ - if (ItLpQueue_isLpIntPending(&xItLpQueue)) - lpevent_count += ItLpQueue_process(&xItLpQueue, regs); + if (ItLpQueue_isLpIntPending()) + lpevent_count += ItLpQueue_process(regs); irq_exit(); Index: ppc64-2.6/arch/ppc64/kernel/time.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/time.c +++ ppc64-2.6/arch/ppc64/kernel/time.c @@ -367,8 +367,8 @@ int timer_interrupt(struct pt_regs * reg set_dec(next_dec); #ifdef CONFIG_PPC_ISERIES - if (ItLpQueue_isLpIntPending(&xItLpQueue)) - lpevent_count += ItLpQueue_process(&xItLpQueue, regs); + if (ItLpQueue_isLpIntPending()) + lpevent_count += ItLpQueue_process(regs); #endif /* collect purr register values often, for accurate calculations */ Index: ppc64-2.6/arch/ppc64/kernel/mf.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/mf.c +++ ppc64-2.6/arch/ppc64/kernel/mf.c @@ -802,8 +802,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { extern unsigned long lpevent_count; - if (ItLpQueue_isLpIntPending(&xItLpQueue)) - lpevent_count += ItLpQueue_process(&xItLpQueue, NULL); + if (ItLpQueue_isLpIntPending()) + lpevent_count += ItLpQueue_process(NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } From michael at 
ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 12/17] ppc64: Don't count number of events processed for caller In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.182981.633567678532.qpatch@concordia> Hi, Currently we count the number of lpevents processed in 3 separate places. One of these counters is never read, so just remove it. This means hvlpevent_queue_process() no longer needs to return the number of events processed. Signed-off-by: Michael Ellerman -- Index: ppc64-2.6/arch/ppc64/kernel/irq.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/irq.c +++ ppc64-2.6/arch/ppc64/kernel/irq.c @@ -66,7 +66,6 @@ EXPORT_SYMBOL(irq_desc); int distribute_irqs = 1; int __irq_offset_value; int ppc_spurious_interrupts; -unsigned long lpevent_count; u64 ppc64_interrupt_controller; int show_interrupts(struct seq_file *p, void *v) @@ -295,7 +294,7 @@ void do_IRQ(struct pt_regs *regs) } #endif /* CONFIG_SMP */ if (hvlpevent_is_pending()) - lpevent_count += process_hvlpevents(regs); + process_hvlpevents(regs); irq_exit(); Index: ppc64-2.6/arch/ppc64/kernel/time.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/time.c +++ ppc64-2.6/arch/ppc64/kernel/time.c @@ -99,7 +99,6 @@ unsigned long tb_to_ns_shift; struct gettimeofday_struct do_gtod; extern unsigned long wall_jiffies; -extern unsigned long lpevent_count; extern int smp_tb_synchronized; extern struct timezone sys_tz; @@ -368,7 +367,7 @@ int timer_interrupt(struct pt_regs * reg #ifdef CONFIG_PPC_ISERIES if (hvlpevent_is_pending()) - lpevent_count += process_hvlpevents(regs); + process_hvlpevents(regs); #endif /* collect purr register values often, for accurate calculations */ Index: ppc64-2.6/arch/ppc64/kernel/mf.c =================================================================== ---
ppc64-2.6.orig/arch/ppc64/kernel/mf.c +++ ppc64-2.6/arch/ppc64/kernel/mf.c @@ -801,9 +801,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) return rc; /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { - extern unsigned long lpevent_count; if (hvlpevent_is_pending()) - lpevent_count += process_hvlpevents(NULL); + process_hvlpevents(NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } Index: ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -77,7 +77,7 @@ struct hvlpevent_queue { extern struct hvlpevent_queue hvlpevent_queue; extern int hvlpevent_is_pending(void); -extern unsigned process_hvlpevents(struct pt_regs *); +extern void process_hvlpevents(struct pt_regs *); extern void setup_hvlpevent_queue(void); #endif /* _ITLPQUEUE_H */ Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -127,14 +127,14 @@ static void hvlpevent_clear_valid( struc event->xFlags.xValid = 0; } -unsigned process_hvlpevents(struct pt_regs *regs) +void process_hvlpevents(struct pt_regs *regs) { unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; /* If we have recursed, just return */ if ( !set_inUse() ) - return 0; + return; if (ItLpQueueInProcess == 0) ItLpQueueInProcess = 1; @@ -144,9 +144,6 @@ unsigned process_hvlpevents(struct pt_re for (;;) { nextLpEvent = get_next_hvlpevent(); if ( nextLpEvent ) { - /* Count events to return to caller - * and count processed events in hvlpevent_queue - */ ++numIntsProcessed; hvlpevent_queue.xLpIntCount++; /* Call appropriate handler here, passing @@ -186,8 +183,6 @@ unsigned process_hvlpevents(struct pt_re clear_inUse(); get_paca()->lpevent_count += 
numIntsProcessed; - - return numIntsProcessed; } static int set_spread_lpevents(char *str) From michael at ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 10/17] ppc64: Rename xItLpQueue to hvlpevent_queue In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.53661.614977365606.qpatch@concordia> Hi, The xItLpQueue is a queue of HvLpEvents that we're given by the Hypervisor. Rename xItLpQueue to hvlpevent_queue and make the type struct hvlpevent_queue. Signed-off-by: Michael Ellerman -- Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -26,7 +26,7 @@ * * It is written to by the hypervisor so cannot end up in the BSS. */ -struct ItLpQueue xItLpQueue __attribute__((__section__(".data"))); +struct hvlpevent_queue hvlpevent_queue __attribute__((__section__(".data"))); static char *event_types[9] = { "Hypervisor\t\t", @@ -43,7 +43,7 @@ static char *event_types[9] = { static __inline__ int set_inUse(void) { int t; - u32 * inUseP = &xItLpQueue.xInUseWord; + u32 * inUseP = &hvlpevent_queue.xInUseWord; __asm__ __volatile__("\n\ 1: lwarx %0,0,%2 \n\ @@ -54,8 +54,8 @@ static __inline__ int set_inUse(void) stwcx. 
%0,0,%2 \n\ bne- 1b \n\ 2: eieio" - : "=&r" (t), "=m" (xItLpQueue.xInUseWord) - : "r" (inUseP), "m" (xItLpQueue.xInUseWord) + : "=&r" (t), "=m" (hvlpevent_queue.xInUseWord) + : "r" (inUseP), "m" (hvlpevent_queue.xInUseWord) : "cc"); return t; @@ -63,7 +63,7 @@ static __inline__ int set_inUse(void) static __inline__ void clear_inUse(void) { - xItLpQueue.xInUseWord = 0; + hvlpevent_queue.xInUseWord = 0; } /* Array of LpEvent handler functions */ @@ -73,18 +73,18 @@ unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * ItLpQueue_getNextLpEvent(void) { struct HvLpEvent * nextLpEvent = - (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; + (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; if ( nextLpEvent->xFlags.xValid ) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ - xItLpQueue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + + hvlpevent_queue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + LpEventAlign ) / LpEventAlign ) * LpEventAlign; /* Wrap to beginning if no room at end */ - if (xItLpQueue.xSlicCurEventPtr > xItLpQueue.xSlicLastValidEventPtr) - xItLpQueue.xSlicCurEventPtr = xItLpQueue.xSlicEventStackPtr; + if (hvlpevent_queue.xSlicCurEventPtr > hvlpevent_queue.xSlicLastValidEventPtr) + hvlpevent_queue.xSlicCurEventPtr = hvlpevent_queue.xSlicEventStackPtr; } else nextLpEvent = NULL; @@ -101,8 +101,8 @@ int ItLpQueue_isLpIntPending(void) if (smp_processor_id() >= spread_lpevents) return 0; - next_event = (struct HvLpEvent *)xItLpQueue.xSlicCurEventPtr; - return next_event->xFlags.xValid | xItLpQueue.xPlicOverflowIntPending; + next_event = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; + return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; } static void ItLpQueue_clearValid( struct HvLpEvent * event ) @@ -145,10 +145,10 @@ unsigned ItLpQueue_process(struct pt_reg nextLpEvent = ItLpQueue_getNextLpEvent(); if ( nextLpEvent ) { /* Count events to return to 
caller - * and count processed events in xItLpQueue + * and count processed events in hvlpevent_queue */ ++numIntsProcessed; - xItLpQueue.xLpIntCount++; + hvlpevent_queue.xLpIntCount++; /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -163,7 +163,7 @@ unsigned ItLpQueue_process(struct pt_reg * here! */ if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) - xItLpQueue.xLpIntCountByType[nextLpEvent->xType]++; + hvlpevent_queue.xLpIntCountByType[nextLpEvent->xType]++; if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && lpEventHandler[nextLpEvent->xType] ) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); @@ -171,12 +171,12 @@ unsigned ItLpQueue_process(struct pt_reg printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); ItLpQueue_clearValid( nextLpEvent ); - } else if ( xItLpQueue.xPlicOverflowIntPending ) + } else if ( hvlpevent_queue.xPlicOverflowIntPending ) /* * No more valid events. If overflow events are * pending process them */ - HvCallEvent_getOverflowLpEvents( xItLpQueue.xIndex); + HvCallEvent_getOverflowLpEvents( hvlpevent_queue.xIndex); else break; } @@ -224,11 +224,11 @@ void setup_hvlpevent_queue(void) /* Invoke the hypervisor to initialize the event stack */ HvCallEvent_setLpEventStack(0, eventStack, LpEventStackSize); - xItLpQueue.xSlicEventStackPtr = (char *)eventStack; - xItLpQueue.xSlicCurEventPtr = (char *)eventStack; - xItLpQueue.xSlicLastValidEventPtr = (char *)eventStack + + hvlpevent_queue.xSlicEventStackPtr = (char *)eventStack; + hvlpevent_queue.xSlicCurEventPtr = (char *)eventStack; + hvlpevent_queue.xSlicLastValidEventPtr = (char *)eventStack + (LpEventStackSize - LpEventMaxSize); - xItLpQueue.xIndex = 0; + hvlpevent_queue.xIndex = 0; } static int proc_lpevents_show(struct seq_file *m, void *v) @@ -237,11 +237,11 @@ static int proc_lpevents_show(struct seq seq_printf(m, "LpEventQueue 0\n"); seq_printf(m, " events processed:\t%lu\n", - (unsigned 
long)xItLpQueue.xLpIntCount); + (unsigned long)hvlpevent_queue.xLpIntCount); for (i = 0; i < 9; ++i) seq_printf(m, " %s %10lu\n", event_types[i], - (unsigned long)xItLpQueue.xLpIntCountByType[i]); + (unsigned long)hvlpevent_queue.xLpIntCountByType[i]); seq_printf(m, "\n events processed by processor:\n"); Index: ppc64-2.6/arch/ppc64/kernel/LparData.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/LparData.c +++ ppc64-2.6/arch/ppc64/kernel/LparData.c @@ -193,7 +193,7 @@ struct ItVpdAreas itVpdAreas = { 0,0,0, /* 13 - 15 */ sizeof(struct IoHriProcessorVpd),/* 16 length of Proc Vpd */ 0,0,0,0,0,0, /* 17 - 22 */ - sizeof(struct ItLpQueue),/* 23 length of Lp Queue */ + sizeof(struct hvlpevent_queue), /* 23 length of Lp Queue */ 0,0 /* 24 - 25 */ }, .xSlicVpdAdrs = { /* VPD addresses */ @@ -211,7 +211,7 @@ struct ItVpdAreas itVpdAreas = { 0,0,0, /* 13 - 15 */ &xIoHriProcessorVpd, /* 16 Proc Vpd */ 0,0,0,0,0,0, /* 17 - 22 */ - &xItLpQueue, /* 23 Lp Queue */ + &hvlpevent_queue, /* 23 Lp Queue */ 0,0 } }; Index: ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -41,7 +41,7 @@ struct HvLpEvent; #define LpEventMaxSize 256 #define LpEventAlign 64 -struct ItLpQueue { +struct hvlpevent_queue { /* * The xSlicCurEventPtr is the pointer to the next event stack entry * that will become valid. 
The OS must peek at this entry to determine @@ -74,7 +74,7 @@ struct ItLpQueue { u64 xLpIntCountByType[9]; // 0x38-0x7F Event counts by type }; -extern struct ItLpQueue xItLpQueue; +extern struct hvlpevent_queue hvlpevent_queue; extern int ItLpQueue_isLpIntPending(void); extern unsigned ItLpQueue_process(struct pt_regs *); From michael at ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 17/17] ppc64: Replace custom locking code with a spinlock In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.507146.708612834273.qpatch@concordia> Hi, The hvlpevent_queue (formerly ItLpQueue) has a member called xInUseWord which is used for serialising access to the queue. Because it's a word (ie. 32 bit) there's a custom 32-bit version of test_and_set_bit() or thereabouts in ItLpQueue.c. The xInUseWord is not shared with the hypervisor, so we can replace it with a spinlock and remove the custom code. There is also another locking mechanism (ItLpQueueInProcess). This is redundant because it's only manipulated while the lock's held. Remove it. Signed-off-by: Michael Ellerman -- Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -42,35 +42,8 @@ static char *event_types[HvLpEvent_Type_ "Virtual I/O" }; -static __inline__ int set_inUse(void) -{ - int t; - u32 * inUseP = &hvlpevent_queue.xInUseWord; - - __asm__ __volatile__("\n\ -1: lwarx %0,0,%2 \n\ - cmpwi 0,%0,0 \n\ - li %0,0 \n\ - bne- 2f \n\ - addi %0,%0,1 \n\ - stwcx. 
%0,0,%2 \n\ - bne- 1b \n\ -2: eieio" - : "=&r" (t), "=m" (hvlpevent_queue.xInUseWord) - : "r" (inUseP), "m" (hvlpevent_queue.xInUseWord) - : "cc"); - - return t; -} - -static __inline__ void clear_inUse(void) -{ - hvlpevent_queue.xInUseWord = 0; -} - /* Array of LpEvent handler functions */ extern LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; -unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * get_next_hvlpevent(void) { @@ -144,14 +117,9 @@ void process_hvlpevents(struct pt_regs * struct HvLpEvent * event; /* If we have recursed, just return */ - if ( !set_inUse() ) + if (!spin_trylock(&hvlpevent_queue.lock)) return; - if (ItLpQueueInProcess == 0) - ItLpQueueInProcess = 1; - else - BUG(); - for (;;) { event = get_next_hvlpevent(); if (event) { @@ -187,9 +155,7 @@ void process_hvlpevents(struct pt_regs * break; } - ItLpQueueInProcess = 0; - mb(); - clear_inUse(); + spin_unlock(&hvlpevent_queue.lock); } static int set_spread_lpevents(char *str) Index: ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -69,7 +69,7 @@ struct hvlpevent_queue { char *xSlicEventStackPtr; // 0x20 u8 xIndex; // 0x28 unique sequential index. u8 xSlicRsvd[3]; // 0x29-2b - u32 xInUseWord; // 0x2C + spinlock_t lock; }; extern struct hvlpevent_queue hvlpevent_queue; From michael at ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 13/17] ppc64: Simplify counting of lpevents, remove lpevent_count from paca In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.246449.86296880874.qpatch@concordia> Hi, Currently there's a per-cpu count of lpevents processed, a per-queue (ie. global) total count, and a count by event type. Replace all that with a count by event for each cpu. 
We only need to add it up in the proc code. Signed-off-by: Michael Ellerman -- Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -28,7 +28,9 @@ */ struct hvlpevent_queue hvlpevent_queue __attribute__((__section__(".data"))); -static char *event_types[9] = { +DEFINE_PER_CPU(unsigned long[HvLpEvent_Type_NumTypes], hvlpevent_counts); + +static char *event_types[HvLpEvent_Type_NumTypes] = { "Hypervisor\t\t", "Machine Facilities\t", "Session Manager\t", @@ -129,7 +131,6 @@ static void hvlpevent_clear_valid( struc void process_hvlpevents(struct pt_regs *regs) { - unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; /* If we have recursed, just return */ @@ -144,8 +145,6 @@ void process_hvlpevents(struct pt_regs * for (;;) { nextLpEvent = get_next_hvlpevent(); if ( nextLpEvent ) { - ++numIntsProcessed; - hvlpevent_queue.xLpIntCount++; /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -160,7 +159,7 @@ void process_hvlpevents(struct pt_regs * * here! */ if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) - hvlpevent_queue.xLpIntCountByType[nextLpEvent->xType]++; + __get_cpu_var(hvlpevent_counts)[nextLpEvent->xType]++; if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && lpEventHandler[nextLpEvent->xType] ) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); @@ -181,8 +180,6 @@ void process_hvlpevents(struct pt_regs * ItLpQueueInProcess = 0; mb(); clear_inUse(); - - get_paca()->lpevent_count += numIntsProcessed; } static int set_spread_lpevents(char *str) @@ -228,20 +225,37 @@ void setup_hvlpevent_queue(void) static int proc_lpevents_show(struct seq_file *m, void *v) { - unsigned int i; + int cpu, i; + unsigned long sum; + static unsigned long cpu_totals[NR_CPUS]; + + /* FIXME: do we care that there's no locking here? 
*/ + sum = 0; + for_each_online_cpu(cpu) { + cpu_totals[cpu] = 0; + for (i = 0; i < HvLpEvent_Type_NumTypes; i++) { + cpu_totals[cpu] += per_cpu(hvlpevent_counts, cpu)[i]; + } + sum += cpu_totals[cpu]; + } seq_printf(m, "LpEventQueue 0\n"); - seq_printf(m, " events processed:\t%lu\n", - (unsigned long)hvlpevent_queue.xLpIntCount); + seq_printf(m, " events processed:\t%lu\n", sum); - for (i = 0; i < 9; ++i) - seq_printf(m, " %s %10lu\n", event_types[i], - (unsigned long)hvlpevent_queue.xLpIntCountByType[i]); + for (i = 0; i < HvLpEvent_Type_NumTypes; ++i) { + sum = 0; + for_each_online_cpu(cpu) { + sum += per_cpu(hvlpevent_counts, cpu)[i]; + } + + seq_printf(m, " %s %10lu\n", event_types[i], sum); + } seq_printf(m, "\n events processed by processor:\n"); - for_each_online_cpu(i) - seq_printf(m, " CPU%02d %10u\n", i, paca[i].lpevent_count); + for_each_online_cpu(cpu) { + seq_printf(m, " CPU%02d %10lu\n", cpu, cpu_totals[cpu]); + } return 0; } Index: ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -70,8 +70,6 @@ struct hvlpevent_queue { u8 xIndex; // 0x28 unique sequential index. 
u8 xSlicRsvd[3]; // 0x29-2b u32 xInUseWord; // 0x2C - u64 xLpIntCount; // 0x30 Total Lp Int msgs processed - u64 xLpIntCountByType[9]; // 0x38-0x7F Event counts by type }; extern struct hvlpevent_queue hvlpevent_queue; Index: ppc64-2.6/include/asm-ppc64/paca.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/paca.h +++ ppc64-2.6/include/asm-ppc64/paca.h @@ -89,7 +89,6 @@ struct paca_struct { u64 next_jiffy_update_tb; /* TB value for next jiffy update */ u64 saved_r1; /* r1 save for RTAS calls */ u64 saved_msr; /* MSR saved here by enter_rtas */ - u32 lpevent_count; /* lpevents processed */ u8 proc_enabled; /* irq soft-enable flag */ /* not yet used */ From michael at ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 15/17] ppc64: Cleanup whitespace in arch/ppc64/kernel/ItLpQueue.c In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.381277.523763163161.qpatch@concordia> Hi, Just cleanup white space. 
Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 62 +++++++++++++++++++++--------------------- 1 files changed, 31 insertions(+), 31 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -1,7 +1,7 @@ /* * ItLpQueue.c * Copyright (C) 2001 Mike Corrigan IBM Corporation - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or @@ -74,21 +74,21 @@ unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * get_next_hvlpevent(void) { - struct HvLpEvent * nextLpEvent = + struct HvLpEvent * nextLpEvent = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; - if ( nextLpEvent->xFlags.xValid ) { + if (nextLpEvent->xFlags.xValid) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ hvlpevent_queue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + - LpEventAlign ) / - LpEventAlign ) * + LpEventAlign) / + LpEventAlign) * LpEventAlign; /* Wrap to beginning if no room at end */ if (hvlpevent_queue.xSlicCurEventPtr > hvlpevent_queue.xSlicLastValidEventPtr) hvlpevent_queue.xSlicCurEventPtr = hvlpevent_queue.xSlicEventStackPtr; } - else + else nextLpEvent = NULL; return nextLpEvent; @@ -107,23 +107,23 @@ int hvlpevent_is_pending(void) return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; } -static void hvlpevent_clear_valid( struct HvLpEvent * event ) +static void hvlpevent_clear_valid(struct HvLpEvent * event) { /* Clear the valid bit of the event * Also clear bits within this event that might * look like valid bits (on 64-byte boundaries) - */ - unsigned extra = (( event->xSizeMinus1 + LpEventAlign ) / - LpEventAlign ) - 1; - switch ( extra ) { - case 
3: + */ + unsigned extra = ((event->xSizeMinus1 + LpEventAlign) / + LpEventAlign) - 1; + switch (extra) { + case 3: ((struct HvLpEvent*)((char*)event+3*LpEventAlign))->xFlags.xValid=0; - case 2: + case 2: ((struct HvLpEvent*)((char*)event+2*LpEventAlign))->xFlags.xValid=0; - case 1: + case 1: ((struct HvLpEvent*)((char*)event+1*LpEventAlign))->xFlags.xValid=0; - case 0: - ; + case 0: + ; } mb(); event->xFlags.xValid = 0; @@ -136,7 +136,7 @@ void process_hvlpevents(struct pt_regs * /* If we have recursed, just return */ if ( !set_inUse() ) return; - + if (ItLpQueueInProcess == 0) ItLpQueueInProcess = 1; else @@ -144,35 +144,35 @@ void process_hvlpevents(struct pt_regs * for (;;) { nextLpEvent = get_next_hvlpevent(); - if ( nextLpEvent ) { - /* Call appropriate handler here, passing + if (nextLpEvent) { + /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it * needs it in a bottom half. (perhaps for * an ACK) - * - * Handlers are responsible for ACK processing + * + * Handlers are responsible for ACK processing * * The Hypervisor guarantees that LpEvents will * only be delivered with types that we have * registered for, so no type check is necessary * here! - */ - if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes ) + */ + if (nextLpEvent->xType < HvLpEvent_Type_NumTypes) __get_cpu_var(hvlpevent_counts)[nextLpEvent->xType]++; - if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes && - lpEventHandler[nextLpEvent->xType] ) + if (nextLpEvent->xType < HvLpEvent_Type_NumTypes && + lpEventHandler[nextLpEvent->xType]) lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); else printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); - - hvlpevent_clear_valid( nextLpEvent ); - } else if ( hvlpevent_queue.xPlicOverflowIntPending ) + + hvlpevent_clear_valid(nextLpEvent); + } else if (hvlpevent_queue.xPlicOverflowIntPending) /* * No more valid events. 
If overflow events are * pending process them */ - HvCallEvent_getOverflowLpEvents( hvlpevent_queue.xIndex); + HvCallEvent_getOverflowLpEvents(hvlpevent_queue.xIndex); else break; } From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 6/17] ppc64: Move initialisation of xItLpQueue into ItLpQueue.c In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.797398.892661549698.qpatch@concordia> Hi, The xItLpQueue is initialised manually in iSeries_setup_arch(). Move this code into ItLpQueue.c for a cleaner separation. Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 23 +++++++++++++++++++++++ arch/ppc64/kernel/iSeries_setup.c | 20 +------------------- include/asm-ppc64/iSeries/ItLpQueue.h | 1 + 3 files changed, 25 insertions(+), 19 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -186,3 +187,24 @@ static int set_spread_lpevents(char *str } __setup("spread_lpevents=", set_spread_lpevents); +void setup_hvlpevent_queue(void) +{ + void *eventStack; + + /* + * Allocate a page for the Event Stack. The Hypervisor needs the + * absolute real address, so we subtract out the KERNELBASE and add + * in the absolute real address of the kernel load area.
+ */ + eventStack = alloc_bootmem_pages(LpEventStackSize); + memset(eventStack, 0, LpEventStackSize); + + /* Invoke the hypervisor to initialize the event stack */ + HvCallEvent_setLpEventStack(0, eventStack, LpEventStackSize); + + xItLpQueue.xSlicEventStackPtr = (char *)eventStack; + xItLpQueue.xSlicCurEventPtr = (char *)eventStack; + xItLpQueue.xSlicLastValidEventPtr = (char *)eventStack + + (LpEventStackSize - LpEventMaxSize); + xItLpQueue.xIndex = 0; +} Index: ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c @@ -24,7 +24,6 @@ #include #include #include -#include #include #include #include @@ -676,7 +675,6 @@ static void __init iSeries_bolt_kernel(u */ static void __init iSeries_setup_arch(void) { - void *eventStack; unsigned procIx = get_paca()->lppaca.dyn_hv_phys_proc_index; /* Add an eye catcher and the systemcfg layout version number */ @@ -685,24 +683,7 @@ static void __init iSeries_setup_arch(vo systemcfg->version.minor = SYSTEMCFG_MINOR; /* Setup the Lp Event Queue */ - - /* Allocate a page for the Event Stack - * The hypervisor wants the absolute real address, so - * we subtract out the KERNELBASE and add in the - * absolute real address of the kernel load area - */ - eventStack = alloc_bootmem_pages(LpEventStackSize); - memset(eventStack, 0, LpEventStackSize); - - /* Invoke the hypervisor to initialize the event stack */ - HvCallEvent_setLpEventStack(0, eventStack, LpEventStackSize); - - /* Initialize fields in our Lp Event Queue */ - xItLpQueue.xSlicEventStackPtr = (char *)eventStack; - xItLpQueue.xSlicCurEventPtr = (char *)eventStack; - xItLpQueue.xSlicLastValidEventPtr = (char *)eventStack + - (LpEventStackSize - LpEventMaxSize); - xItLpQueue.xIndex = 0; + setup_hvlpevent_queue(); /* Compute processor frequency */ procFreqHz = ((1UL << 34) * 1000000) / Index: 
ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -80,5 +80,6 @@ extern struct HvLpEvent *ItLpQueue_getNe extern int ItLpQueue_isLpIntPending(void); extern unsigned ItLpQueue_process(struct pt_regs *); extern void ItLpQueue_clearValid(struct HvLpEvent *); +extern void setup_hvlpevent_queue(void); #endif /* _ITLPQUEUE_H */ From michael at ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 11/17] ppc64: Rename ItLpQueue_* functions to hvlpevent_queue_* In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.119909.156768041053.qpatch@concordia> Hi, Now that we've renamed the xItLpQueue structure, rename the functions that operate on it also. Signed-off-by: Michael Ellerman -- Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -70,7 +70,7 @@ static __inline__ void clear_inUse(void) extern LpEventHandler lpEventHandler[HvLpEvent_Type_NumTypes]; unsigned long ItLpQueueInProcess = 0; -static struct HvLpEvent * ItLpQueue_getNextLpEvent(void) +static struct HvLpEvent * get_next_hvlpevent(void) { struct HvLpEvent * nextLpEvent = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; @@ -94,7 +94,7 @@ static struct HvLpEvent * ItLpQueue_getN static unsigned long spread_lpevents = NR_CPUS; -int ItLpQueue_isLpIntPending(void) +int hvlpevent_is_pending(void) { struct HvLpEvent *next_event; @@ -105,7 +105,7 @@ int ItLpQueue_isLpIntPending(void) return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; } -static void ItLpQueue_clearValid( struct HvLpEvent * event ) +static void hvlpevent_clear_valid( struct 
HvLpEvent * event ) { /* Clear the valid bit of the event * Also clear bits within this event that might @@ -127,7 +127,7 @@ static void ItLpQueue_clearValid( struct event->xFlags.xValid = 0; } -unsigned ItLpQueue_process(struct pt_regs *regs) +unsigned process_hvlpevents(struct pt_regs *regs) { unsigned numIntsProcessed = 0; struct HvLpEvent * nextLpEvent; @@ -142,7 +142,7 @@ unsigned ItLpQueue_process(struct pt_reg BUG(); for (;;) { - nextLpEvent = ItLpQueue_getNextLpEvent(); + nextLpEvent = get_next_hvlpevent(); if ( nextLpEvent ) { /* Count events to return to caller * and count processed events in hvlpevent_queue @@ -170,7 +170,7 @@ unsigned ItLpQueue_process(struct pt_reg else printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); - ItLpQueue_clearValid( nextLpEvent ); + hvlpevent_clear_valid( nextLpEvent ); } else if ( hvlpevent_queue.xPlicOverflowIntPending ) /* * No more valid events. If overflow events are Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -88,7 +88,7 @@ static int iSeries_idle(void) while (1) { if (lpaca->lppaca.shared_proc) { - if (ItLpQueue_isLpIntPending()) + if (hvlpevent_is_pending()) process_iSeries_events(); if (!need_resched()) yield_shared_processor(); @@ -100,7 +100,7 @@ static int iSeries_idle(void) while (!need_resched()) { HMT_medium(); - if (ItLpQueue_isLpIntPending()) + if (hvlpevent_is_pending()) process_iSeries_events(); HMT_low(); } Index: ppc64-2.6/arch/ppc64/kernel/irq.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/irq.c +++ ppc64-2.6/arch/ppc64/kernel/irq.c @@ -294,8 +294,8 @@ void do_IRQ(struct pt_regs *regs) iSeries_smp_message_recv(regs); } #endif /* CONFIG_SMP */ - if (ItLpQueue_isLpIntPending()) - lpevent_count += ItLpQueue_process(regs); + if (hvlpevent_is_pending()) + lpevent_count += 
process_hvlpevents(regs); irq_exit(); Index: ppc64-2.6/arch/ppc64/kernel/mf.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/mf.c +++ ppc64-2.6/arch/ppc64/kernel/mf.c @@ -802,8 +802,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { extern unsigned long lpevent_count; - if (ItLpQueue_isLpIntPending()) - lpevent_count += ItLpQueue_process(NULL); + if (hvlpevent_is_pending()) + lpevent_count += process_hvlpevents(NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } Index: ppc64-2.6/arch/ppc64/kernel/time.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/time.c +++ ppc64-2.6/arch/ppc64/kernel/time.c @@ -367,8 +367,8 @@ int timer_interrupt(struct pt_regs * reg set_dec(next_dec); #ifdef CONFIG_PPC_ISERIES - if (ItLpQueue_isLpIntPending()) - lpevent_count += ItLpQueue_process(regs); + if (hvlpevent_is_pending()) + lpevent_count += process_hvlpevents(regs); #endif /* collect purr register values often, for accurate calculations */ Index: ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/iSeries/ItLpQueue.h +++ ppc64-2.6/include/asm-ppc64/iSeries/ItLpQueue.h @@ -76,8 +76,8 @@ struct hvlpevent_queue { extern struct hvlpevent_queue hvlpevent_queue; -extern int ItLpQueue_isLpIntPending(void); -extern unsigned ItLpQueue_process(struct pt_regs *); +extern int hvlpevent_is_pending(void); +extern unsigned process_hvlpevents(struct pt_regs *); extern void setup_hvlpevent_queue(void); #endif /* _ITLPQUEUE_H */ From michael at ellerman.id.au Wed Jun 29 17:49:53 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:53 +1000 Subject: [PATCH 16/17] ppc64: Formatting cleanups in arch/ppc64/kernel/ItLpQueue.c In-Reply-To: 
<200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031393.445340.442647516480.qpatch@concordia> Hi, Just formatting cleanups: * rename some "nextLpEvent" variables to just "event" * make code fit in 80 columns * use brackets around if/else * use a temporary to make hvlpevent_clear_valid clearer Signed-off-by: Michael Ellerman -- Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -74,24 +74,27 @@ unsigned long ItLpQueueInProcess = 0; static struct HvLpEvent * get_next_hvlpevent(void) { - struct HvLpEvent * nextLpEvent = - (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; - if (nextLpEvent->xFlags.xValid) { + struct HvLpEvent * event; + event = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; + + if (event->xFlags.xValid) { /* rmb() needed only for weakly consistent machines (regatta) */ rmb(); /* Set pointer to next potential event */ - hvlpevent_queue.xSlicCurEventPtr += ((nextLpEvent->xSizeMinus1 + - LpEventAlign) / - LpEventAlign) * - LpEventAlign; + hvlpevent_queue.xSlicCurEventPtr += ((event->xSizeMinus1 + + LpEventAlign) / LpEventAlign) * LpEventAlign; + /* Wrap to beginning if no room at end */ - if (hvlpevent_queue.xSlicCurEventPtr > hvlpevent_queue.xSlicLastValidEventPtr) - hvlpevent_queue.xSlicCurEventPtr = hvlpevent_queue.xSlicEventStackPtr; + if (hvlpevent_queue.xSlicCurEventPtr > + hvlpevent_queue.xSlicLastValidEventPtr) { + hvlpevent_queue.xSlicCurEventPtr = + hvlpevent_queue.xSlicEventStackPtr; + } + } else { + event = NULL; } - else - nextLpEvent = NULL; - return nextLpEvent; + return event; } static unsigned long spread_lpevents = NR_CPUS; @@ -104,34 +107,41 @@ int hvlpevent_is_pending(void) return 0; next_event = (struct HvLpEvent *)hvlpevent_queue.xSlicCurEventPtr; - return next_event->xFlags.xValid | hvlpevent_queue.xPlicOverflowIntPending; + + return 
next_event->xFlags.xValid | + hvlpevent_queue.xPlicOverflowIntPending; } static void hvlpevent_clear_valid(struct HvLpEvent * event) { - /* Clear the valid bit of the event - * Also clear bits within this event that might - * look like valid bits (on 64-byte boundaries) + /* Tell the Hypervisor that we're done with this event. + * Also clear bits within this event that might look like valid bits. + * ie. on 64-byte boundaries. */ + struct HvLpEvent *tmp; unsigned extra = ((event->xSizeMinus1 + LpEventAlign) / LpEventAlign) - 1; + switch (extra) { case 3: - ((struct HvLpEvent*)((char*)event+3*LpEventAlign))->xFlags.xValid=0; + tmp = (struct HvLpEvent*)((char*)event + 3 * LpEventAlign); + tmp->xFlags.xValid = 0; case 2: - ((struct HvLpEvent*)((char*)event+2*LpEventAlign))->xFlags.xValid=0; + tmp = (struct HvLpEvent*)((char*)event + 2 * LpEventAlign); + tmp->xFlags.xValid = 0; case 1: - ((struct HvLpEvent*)((char*)event+1*LpEventAlign))->xFlags.xValid=0; - case 0: - ; + tmp = (struct HvLpEvent*)((char*)event + 1 * LpEventAlign); + tmp->xFlags.xValid = 0; } + mb(); + event->xFlags.xValid = 0; } void process_hvlpevents(struct pt_regs *regs) { - struct HvLpEvent * nextLpEvent; + struct HvLpEvent * event; /* If we have recursed, just return */ if ( !set_inUse() ) @@ -143,8 +153,8 @@ void process_hvlpevents(struct pt_regs * BUG(); for (;;) { - nextLpEvent = get_next_hvlpevent(); - if (nextLpEvent) { + event = get_next_hvlpevent(); + if (event) { /* Call appropriate handler here, passing * a pointer to the LpEvent. The handler * must make a copy of the LpEvent if it @@ -158,15 +168,15 @@ void process_hvlpevents(struct pt_regs * * registered for, so no type check is necessary * here! 
*/ - if (nextLpEvent->xType < HvLpEvent_Type_NumTypes) - __get_cpu_var(hvlpevent_counts)[nextLpEvent->xType]++; - if (nextLpEvent->xType < HvLpEvent_Type_NumTypes && - lpEventHandler[nextLpEvent->xType]) - lpEventHandler[nextLpEvent->xType](nextLpEvent, regs); + if (event->xType < HvLpEvent_Type_NumTypes) + __get_cpu_var(hvlpevent_counts)[event->xType]++; + if (event->xType < HvLpEvent_Type_NumTypes && + lpEventHandler[event->xType]) + lpEventHandler[event->xType](event, regs); else - printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType ); + printk(KERN_INFO "Unexpected Lp Event type=%d\n", event->xType ); - hvlpevent_clear_valid(nextLpEvent); + hvlpevent_clear_valid(event); } else if (hvlpevent_queue.xPlicOverflowIntPending) /* * No more valid events. If overflow events are From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 3/17] ppc64: Move set_spread_lpevents() into ItLpQueue.c In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.586588.723966985383.qpatch@concordia> Hi, The only code outside ItLpQueue.c that refers to spread_lpevents is in set_spread_lpevents(), so move it inside ItLpQueue.c and make spread_lpevents static.
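For illustration only (not part of the posted patch): a minimal userspace sketch of how the spread_lpevents parameter is validated and then consulted per CPU. It assumes the standard C strtoul() as a stand-in for the kernel's simple_strtoul(), an arbitrary NR_CPUS of 32, and a hypothetical helper cpu_handles_lpevents() that mirrors the smp_processor_id() comparison used elsewhere in this series. Unlike the kernel handler, which always returns 1, this version reports whether the value was accepted.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 32	/* assumed value, for this sketch only */

/* Number of processors that share lp event processing. */
static unsigned long spread_lpevents = NR_CPUS;

/*
 * Parse and validate the parameter string.  Returns 1 if the value
 * was accepted and 0 if it was rejected (the setting is then left
 * unchanged).  strtoul() stands in for the kernel's simple_strtoul().
 */
static int set_spread_lpevents(const char *str)
{
	unsigned long val;

	assert(str != NULL);
	val = strtoul(str, NULL, 0);

	if (val > 0 && val <= NR_CPUS) {
		spread_lpevents = val;
		return 1;
	}
	return 0;
}

/* Hypothetical helper: only CPUs below the limit process lp events. */
static int cpu_handles_lpevents(unsigned long cpu)
{
	return cpu < spread_lpevents;
}
```

With the variable static, the parser and the per-CPU check are the only two places that touch it, which is the point of moving the handler into the same file.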
Signed-off-by: Michael Ellerman -- Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -69,7 +69,7 @@ struct HvLpEvent * ItLpQueue_getNextLpEv return nextLpEvent; } -unsigned long spread_lpevents = NR_CPUS; +static unsigned long spread_lpevents = NR_CPUS; int ItLpQueue_isLpIntPending( struct ItLpQueue * lpQueue ) { @@ -166,3 +166,23 @@ unsigned ItLpQueue_process( struct ItLpQ return numIntsProcessed; } + +static int set_spread_lpevents(char *str) +{ + unsigned long val = simple_strtoul(str, NULL, 0); + + /* + * The parameter is the number of processors to share in processing + * lp events. + */ + if (( val > 0) && (val <= NR_CPUS)) { + spread_lpevents = val; + printk("lpevent processing spread over %ld processors\n", val); + } else { + printk("invalid spread_lpevents %ld\n", val); + } + + return 1; +} +__setup("spread_lpevents=", set_spread_lpevents); + Index: ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c @@ -853,26 +853,6 @@ static int __init iSeries_src_init(void) late_initcall(iSeries_src_init); -static int set_spread_lpevents(char *str) -{ - unsigned long val = simple_strtoul(str, NULL, 0); - extern unsigned long spread_lpevents; - - /* - * The parameter is the number of processors to share in processing - * lp events. 
- */ - if (( val > 0) && (val <= NR_CPUS)) { - spread_lpevents = val; - printk("lpevent processing spread over %ld processors\n", val); - } else { - printk("invalid spread_lpevents %ld\n", val); - } - - return 1; -} -__setup("spread_lpevents=", set_spread_lpevents); - #ifndef CONFIG_PCI void __init iSeries_init_IRQ(void) { } #endif From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 2/17] ppc64: Spread lpevents by default on iSeries In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.485152.495413222621.qpatch@concordia> Hi, With the previous patch in place, spreading lpevents by default becomes a one liner. Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -69,7 +69,7 @@ struct HvLpEvent * ItLpQueue_getNextLpEv return nextLpEvent; } -unsigned long spread_lpevents = 1; +unsigned long spread_lpevents = NR_CPUS; int ItLpQueue_isLpIntPending( struct ItLpQueue * lpQueue ) { From michael at ellerman.id.au Wed Jun 29 17:49:52 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 29 Jun 2005 17:49:52 +1000 Subject: [PATCH 1/17] ppc64: Remove lpqueue pointer from the paca on iSeries In-Reply-To: <200506291743.52405.michael@ellerman.id.au> Message-ID: <1120031392.369488.211620163893.qpatch@concordia> Hi, The iSeries code keeps a pointer to the ItLpQueue in its paca struct. But all these pointers end up pointing to the one place, ie. xItLpQueue. So remove the pointer from the paca struct and just refer to xItLpQueue directly where needed. 
The only complication is that the spread_lpevents logic was implemented by having a NULL lpqueue pointer in the paca on CPUs that weren't supposed to process events. Instead we just compare the spread_lpevents value to the processor id to get the same behaviour. Signed-off-by: Michael Ellerman -- arch/ppc64/kernel/ItLpQueue.c | 16 +++++++++------- arch/ppc64/kernel/iSeries_setup.c | 6 ++---- arch/ppc64/kernel/idle.c | 4 ++-- arch/ppc64/kernel/irq.c | 6 ++---- arch/ppc64/kernel/mf.c | 5 ++--- arch/ppc64/kernel/pacaData.c | 1 - arch/ppc64/kernel/time.c | 5 ++--- include/asm-ppc64/paca.h | 1 - 8 files changed, 19 insertions(+), 25 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -88,7 +88,7 @@ static int iSeries_idle(void) while (1) { if (lpaca->lppaca.shared_proc) { - if (ItLpQueue_isLpIntPending(lpaca->lpqueue_ptr)) + if (ItLpQueue_isLpIntPending(&xItLpQueue)) process_iSeries_events(); if (!need_resched()) yield_shared_processor(); @@ -100,7 +100,7 @@ static int iSeries_idle(void) while (!need_resched()) { HMT_medium(); - if (ItLpQueue_isLpIntPending(lpaca->lpqueue_ptr)) + if (ItLpQueue_isLpIntPending(&xItLpQueue)) process_iSeries_events(); HMT_low(); } Index: ppc64-2.6/arch/ppc64/kernel/irq.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/irq.c +++ ppc64-2.6/arch/ppc64/kernel/irq.c @@ -269,7 +269,6 @@ out: void do_IRQ(struct pt_regs *regs) { struct paca_struct *lpaca; - struct ItLpQueue *lpq; irq_enter(); @@ -295,9 +294,8 @@ void do_IRQ(struct pt_regs *regs) iSeries_smp_message_recv(regs); } #endif /* CONFIG_SMP */ - lpq = lpaca->lpqueue_ptr; - if (lpq && ItLpQueue_isLpIntPending(lpq)) - lpevent_count += ItLpQueue_process(lpq, regs); + if (ItLpQueue_isLpIntPending(&xItLpQueue)) + lpevent_count += ItLpQueue_process(&xItLpQueue, regs); irq_exit(); 
Index: ppc64-2.6/arch/ppc64/kernel/time.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/time.c +++ ppc64-2.6/arch/ppc64/kernel/time.c @@ -367,11 +367,8 @@ int timer_interrupt(struct pt_regs * reg set_dec(next_dec); #ifdef CONFIG_PPC_ISERIES - { - struct ItLpQueue *lpq = lpaca->lpqueue_ptr; - if (lpq && ItLpQueue_isLpIntPending(lpq)) - lpevent_count += ItLpQueue_process(lpq, regs); - } + if (ItLpQueue_isLpIntPending(&xItLpQueue)) + lpevent_count += ItLpQueue_process(&xItLpQueue, regs); #endif /* collect purr register values often, for accurate calculations */ Index: ppc64-2.6/arch/ppc64/kernel/mf.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/mf.c +++ ppc64-2.6/arch/ppc64/kernel/mf.c @@ -802,9 +802,8 @@ int mf_get_boot_rtc(struct rtc_time *tm) /* We need to poll here as we are not yet taking interrupts */ while (rtc_data.busy) { extern unsigned long lpevent_count; - struct ItLpQueue *lpq = get_paca()->lpqueue_ptr; - if (lpq && ItLpQueue_isLpIntPending(lpq)) - lpevent_count += ItLpQueue_process(lpq, NULL); + if (ItLpQueue_isLpIntPending(&xItLpQueue)) + lpevent_count += ItLpQueue_process(&xItLpQueue, NULL); } return rtc_set_tm(rtc_data.rc, rtc_data.ce_msg.ce_msg, tm); } Index: ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/ItLpQueue.c +++ ppc64-2.6/arch/ppc64/kernel/ItLpQueue.c @@ -69,15 +69,17 @@ struct HvLpEvent * ItLpQueue_getNextLpEv return nextLpEvent; } +unsigned long spread_lpevents = 1; + int ItLpQueue_isLpIntPending( struct ItLpQueue * lpQueue ) { - int retval = 0; - struct HvLpEvent * nextLpEvent; - if ( lpQueue ) { - nextLpEvent = (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; - retval = nextLpEvent->xFlags.xValid | lpQueue->xPlicOverflowIntPending; - } - return retval; + struct HvLpEvent *next_event; + + if (smp_processor_id() >= 
spread_lpevents) + return 0; + + next_event = (struct HvLpEvent *)lpQueue->xSlicCurEventPtr; + return next_event->xFlags.xValid | lpQueue->xPlicOverflowIntPending; } void ItLpQueue_clearValid( struct HvLpEvent * event ) Index: ppc64-2.6/include/asm-ppc64/paca.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/paca.h +++ ppc64-2.6/include/asm-ppc64/paca.h @@ -20,7 +20,6 @@ #include #include #include -#include #include register struct paca_struct *local_paca asm("r13"); @@ -62,7 +61,6 @@ struct paca_struct { u16 paca_index; /* Logical processor number */ u32 default_decr; /* Default decrementer value */ - struct ItLpQueue *lpqueue_ptr; /* LpQueue handled by this CPU */ u64 kernel_toc; /* Kernel TOC address */ u64 stab_real; /* Absolute address of segment table */ u64 stab_addr; /* Virtual address of segment table */ Index: ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c @@ -855,17 +855,15 @@ late_initcall(iSeries_src_init); static int set_spread_lpevents(char *str) { - unsigned long i; unsigned long val = simple_strtoul(str, NULL, 0); + extern unsigned long spread_lpevents; /* * The parameter is the number of processors to share in processing * lp events. 
*/ if (( val > 0) && (val <= NR_CPUS)) { - for (i = 1; i < val; ++i) - paca[i].lpqueue_ptr = paca[0].lpqueue_ptr; - + spread_lpevents = val; printk("lpevent processing spread over %ld processors\n", val); } else { printk("invalid spread_lpevents %ld\n", val); Index: ppc64-2.6/arch/ppc64/kernel/pacaData.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pacaData.c +++ ppc64-2.6/arch/ppc64/kernel/pacaData.c @@ -45,7 +45,6 @@ extern unsigned long __toc_start; #ifdef CONFIG_PPC_ISERIES #define EXTRA_INITS(number, lpq) \ .lppaca_ptr = &paca[number].lppaca, \ - .lpqueue_ptr = (lpq), /* &xItLpQueue, */ \ .reg_save_ptr = &paca[number].reg_save, \ .reg_save = { \ .xDesc = 0xd397d9e2, /* "LpRS" */ \ From arnd at arndb.de Wed Jun 29 22:55:11 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Wed, 29 Jun 2005 14:55:11 +0200 Subject: [PATCH] net: add driver for the NIC on Cell Blades In-Reply-To: <1120030346.3196.21.camel@laptopd505.fenrus.org> References: <200506281528.08834.arnd@arndb.de> <200506290238.59231.arnd@arndb.de> <1120030346.3196.21.camel@laptopd505.fenrus.org> Message-ID: <200506291455.12506.arnd@arndb.de> On Wednesday 29 June 2005 09:32, Arjan van de Ven wrote: > different problem. the sync will get the byte out of the cpu. It won't > get it out of the pci bridges... > > In short, pci bridges are allowed to buffer (post) writes until data > traffic in the other direction happens (eg readl() or dma). Ok, understood. This patch changes all io writes within a spinlock to use a checking version of spider_net_write_reg(). Jens, could you verify if these are the only places where you rely on the write making it to the chip and then (n)ack this patch? Note that in our setup, we know that there are no PCI bridges involved because the device is not really PCI based but directly attached to the FlexIO port and just fakes a PCI-like config space.
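For illustration only (not from the original thread): a small userspace model of the posted-write behaviour described above, assuming a bridge that holds a write back until a read passes through it. The struct fake_bridge type and the bridge_*() and write_reg_sync() helpers are hypothetical names invented for this sketch; real drivers use writel() and readl() on an ioremap()ed address instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NREGS 4

/* Toy model: the bridge may hold one posted write per register. */
struct fake_bridge {
	uint32_t device_regs[NREGS];	/* what the device actually sees */
	uint32_t posted_val[NREGS];	/* writes still held in the bridge */
	int posted_pending[NREGS];
};

static void bridge_init(struct fake_bridge *b)
{
	memset(b, 0, sizeof(*b));
}

/* A write is only posted; it does not reach the device yet. */
static void bridge_writel(struct fake_bridge *b, unsigned int reg, uint32_t v)
{
	assert(reg < NREGS);
	b->posted_val[reg] = v;
	b->posted_pending[reg] = 1;
}

/* A read forces every posted write out to the device first. */
static uint32_t bridge_readl(struct fake_bridge *b, unsigned int reg)
{
	unsigned int i;

	assert(reg < NREGS);
	for (i = 0; i < NREGS; i++) {
		if (b->posted_pending[i]) {
			b->device_regs[i] = b->posted_val[i];
			b->posted_pending[i] = 0;
		}
	}
	return b->device_regs[reg];
}

/* Write, then read back so the value is known to have arrived. */
static void write_reg_sync(struct fake_bridge *b, unsigned int reg, uint32_t v)
{
	bridge_writel(b, reg, v);
	(void)bridge_readl(b, reg);
}
```

In this model a plain bridge_writel() leaves device_regs untouched until some later read, while write_reg_sync() guarantees the value is visible to the device before the caller drops its lock.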
Signed-off-by: Arnd Bergmann --- linux-cg.orig/drivers/net/spider_net.c 2005-06-29 14:22:31.516896448 -0400 +++ linux-cg/drivers/net/spider_net.c 2005-06-29 14:22:22.567931328 -0400 @@ -109,6 +109,23 @@ spider_net_write_reg(struct spider_net_c } /** + * spider_net_write_reg_sync - writes to an SMMIO register of a card + * @card: device structure + * @reg: register to write to + * @value: value to write into the specified SMMIO register + * + * Unlike spider_net_write_reg, this will also make sure the + * data arrives on the card by reading the reg again. + */ +static void +spider_net_write_reg_sync(struct spider_net_card *card, u32 reg, u32 value) +{ + value = cpu_to_le32(value); + writel(value, card->regs + reg); + (void)readl(card->regs + reg); +} + +/** * spider_net_rx_irq_off - switch off rx irq on this spider card * @card: device structure * @@ -123,7 +140,7 @@ spider_net_rx_irq_off(struct spider_net_ spin_lock_irqsave(&card->intmask_lock, flags); regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); regvalue &= ~SPIDER_NET_RXINT; - spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); + spider_net_write_reg_sync(card, SPIDER_NET_GHIINT0MSK, regvalue); spin_unlock_irqrestore(&card->intmask_lock, flags); } @@ -196,7 +213,7 @@ spider_net_rx_irq_on(struct spider_net_c spin_lock_irqsave(&card->intmask_lock, flags); regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); regvalue |= SPIDER_NET_RXINT; - spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); + spider_net_write_reg_sync(card, SPIDER_NET_GHIINT0MSK, regvalue); spin_unlock_irqrestore(&card->intmask_lock, flags); } @@ -215,7 +232,7 @@ spider_net_tx_irq_off(struct spider_net_ spin_lock_irqsave(&card->intmask_lock, flags); regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK); regvalue &= ~SPIDER_NET_TXINT; - spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue); + spider_net_write_reg_sync(card, SPIDER_NET_GHIINT0MSK, regvalue); 
 	spin_unlock_irqrestore(&card->intmask_lock, flags);
 }
 
@@ -234,7 +251,7 @@ spider_net_tx_irq_on(struct spider_net_c
 	spin_lock_irqsave(&card->intmask_lock, flags);
 	regvalue = spider_net_read_reg(card, SPIDER_NET_GHIINT0MSK);
 	regvalue |= SPIDER_NET_TXINT;
-	spider_net_write_reg(card, SPIDER_NET_GHIINT0MSK, regvalue);
+	spider_net_write_reg_sync(card, SPIDER_NET_GHIINT0MSK, regvalue);
 	spin_unlock_irqrestore(&card->intmask_lock, flags);
 }

From olof at lixom.net Wed Jun 29 01:05:09 2005
From: olof at lixom.net (Olof Johansson)
Date: Tue, 28 Jun 2005 10:05:09 -0500
Subject: [PATCH 4/15] ppc64: Don't pass the pointers to xItLpQueue around
In-Reply-To: <20050628144231.528386ed.sfr@canb.auug.org.au>
References: <1119914229.959253.194917692058.qpatch@concordia> <1119914237.219661.147090075133.qpatch@concordia> <20050628144231.528386ed.sfr@canb.auug.org.au>
Message-ID: <20050628150509.GA14155@austin.ibm.com>

On Tue, Jun 28, 2005 at 02:42:31PM +1000, Stephen Rothwell wrote:
> Hi Michael,
>
> On Tue, 28 Jun 2005 09:17:17 +1000 Michael Ellerman wrote:
> >
> > int t;
> > - u32 * inUseP = &(lpQueue->xInUseWord);
> > + u32 * inUseP = &xItLpQueue.xInUseWord;
> >
> > __asm__ __volatile__("\n\
> > 1: lwarx %0,0,%2 \n\
> > @@ -31,37 +31,37 @@ static __inline__ int set_inUse( struct
> > stwcx. %0,0,%2 \n\
> > bne- 1b \n\
> > 2: eieio"
>
> Could you fix this assembler code up so that it is a set of concatenated
> strings rather than one long one, please? I think that is the preferred
> formatting these days.

Actually, this is a once-used block of assembly that, as far as I can tell, essentially does a cmpxchg() on a 32-bit value, does it not? How about axing the function altogether and/or using cmpxchg() instead?
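
For illustration only (not from the original thread): a minimal userspace sketch of the compare-and-exchange idea, assuming the intent of set_inUse() is "claim the word if it is 0, otherwise report that it is already in use". It uses C11 <stdatomic.h> rather than the kernel's cmpxchg(), and the names in_use_word, set_in_use() and clear_in_use() are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the queue's xInUseWord; 0 means "not in use". */
static _Atomic uint32_t in_use_word;

/*
 * Atomically change the word from 0 to 1.
 * Returns 1 if this caller claimed it, 0 if it was already in use.
 */
static int set_in_use(void)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong(&in_use_word, &expected, 1);
}

/* Release the word so that the next caller can claim it. */
static void clear_in_use(void)
{
	atomic_store(&in_use_word, 0);
}
```

A second set_in_use() before clear_in_use() returns 0, which corresponds to the "if we have recursed, just return" check in the event-processing loop.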
-Olof From johnrose at austin.ibm.com Thu Jun 30 01:28:38 2005 From: johnrose at austin.ibm.com (John Rose) Date: Wed, 29 Jun 2005 10:28:38 -0500 Subject: [PATCH 10/13]: PCI Err: PPC64-specific recovery infrastructure In-Reply-To: <20050628235956.GA6455@austin.ibm.com> References: <20050628235956.GA6455@austin.ibm.com> Message-ID: <1120058918.19616.4.camel@sinatra.austin.ibm.com> (resend, dropped cc's) Hi Linas- The new functions eeh_report_error, eeh_report_reset, eeh_report_resume, eeh_report_failure, eeh_reset_device, and handle_eeh_events do not belong in this driver, imho. Most of them have nothing to do with PCI hotplug (eeh_report_*). The ones that do are clients of the enable/disable functionality, and as such are not part of the actual hotplug implementation. In my mind, it makes sense to logically separate things when possible. I'm currently making an effort to reduce bloat in this driver, and this would add to it. The functions eeh_report_*, along with the handling routines, could just as easily exist in the kernel or in a 3rd module (drivers/pci/hotplug/rpaphp_pcierr.ko?). You've suggested the third module idea before, and I think it makes more sense than lumping this in w/ rpaphp. You could export [enable,disable]_slot from rpaphp, and have a module dependency, similar to rpadlpar_io. Thanks- John On Tue, 2005-06-28 at 18:59, Linas Vepstas wrote: > pci-err-10-ppc64.patch > > Implements ppc64-specific parts of detecting PCI bus errors, > (via calls to the firmware to ask the hardware pci bridges) > and the related mechanisms for resetting the affected PCI > slots (again, via firmware calls). 
> > Signed-off-by: Linas Vepstas > > ______________________________________________________________________ > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev From linas at austin.ibm.com Thu Jun 30 01:59:54 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 29 Jun 2005 10:59:54 -0500 Subject: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery In-Reply-To: <1120009619.5133.228.camel@gaston> References: <20050628235848.GA6376@austin.ibm.com> <1120009619.5133.228.camel@gaston> Message-ID: <20050629155954.GH28499@austin.ibm.com> On Wed, Jun 29, 2005 at 11:46:58AM +1000, Benjamin Herrenschmidt was heard to remark: > On Tue, 2005-06-28 at 18:58 -0500, Linas Vepstas wrote: > > /** e100_io_error_detected() is called when PCI error is detected */ > > +static int e100_io_error_detected (struct pci_dev *pdev, enum > > pci_channel_state state) > > +{ > > + struct net_device *netdev = pci_get_drvdata(pdev); > > + struct nic *nic = netdev_priv(netdev); > > + > > + mod_timer(&nic->watchdog, jiffies + 30*HZ); > > + e100_down(nic); > > + > > + /* Request a slot reset. */ > > + return PCIERR_RESULT_NEED_RESET; > > +} > > I'm not sure just "pushing" the watchdog timer to 30sec in the future is > the way to go here. What about netif_stop_queue() or so ? Yep, OK. Pushig the timer would in fact break if the device was marked perm disabled. 
--linas From linas at austin.ibm.com Thu Jun 30 02:34:08 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 29 Jun 2005 11:34:08 -0500 Subject: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery In-Reply-To: <20050629030237.GB71992@muc.de> References: <20050628235919.GA6415@austin.ibm.com> <20050629030237.GB71992@muc.de> Message-ID: <20050629163408.GI28499@austin.ibm.com> On Wed, Jun 29, 2005 at 05:02:37AM +0200, Andi Kleen was heard to remark: > On Tue, Jun 28, 2005 at 06:59:19PM -0500, Linas Vepstas wrote: > > > > pci-err-7-symbios.patch > > > > Adds PCI Error recovery callbacks to the Symbios Sym53c8xx driver. > > Tested, seems to work well under i/o stress to one disk. Not > > stress tested under heavy i/o to multiple scsi devices. > > What does this do to the IO requests currently being processed > by the firmware? Do they get all aborted? Is it ensured > that they all error out properly? Interesting question; two replies. From the hardware point of view, the scsi card is soft-reset, which wipes out all state on the card, including any command queues on the card. In-progress transactions, e.g. disk drives in the middle of receiving commands or in the process of responding to reads, are lost. The freshly rebooted scsi controller may wonder why disks are suddenly sending it data. This may sound alarming, but it is not much different than the existing standard/generic SCSI bus-reset/host reset sequences, which I believe (hope) work correctly. In particular, there shouldn't be any data corruption; here's why: From the kernel point of view, file system i/o goes through the block device, through to scsi_dispatch_cmd(), to the symbios driver. Any queued requests stay queued until they are fulfilled. Queued requests get replayed, in a fashion similar to what would be needed after a host reset. 
In particular, there shouldn't be any (permanent) file system corruption because any inconsistent state on the disk would get over-written when the queued requests get re-issued. At least, that's how I think it should work. My testing was light ... inject errors while doing mild single-disk i/o. Haven't run any full stress tests, which would, e.g., write patterns to multiple disks and then read back the patterns and bit-compare. Someday, I hope to run this test :) However, if this reveals bugs, I believe these will be generic bugs, rather than PCI error recovery related bugs. FWIW, yes, I have heard of devices that "cheat", and report back that a transaction is complete, even though it is still pending in firmware somewhere, either on the host or the disk. Those devices get screwed. No doubt, this will happen to some giant banking customer, and result in the corruption of serious financial data. There will be hundreds of airplane trips as dozens of techies will be hunched over the system wondering "what happened" while executives fume in the corner, threatening billion dollar lawsuits. The net output of this will be a one-line patch to drivers/scsi/scsi_lib.c which will be lost in the noise of the LKML. Shit happens. --linas From ak at muc.de Thu Jun 30 02:58:29 2005 From: ak at muc.de (Andi Kleen) Date: 29 Jun 2005 18:58:29 +0200 Subject: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery In-Reply-To: <20050629155954.GH28499@austin.ibm.com> References: <20050628235848.GA6376@austin.ibm.com> <1120009619.5133.228.camel@gaston> <20050629155954.GH28499@austin.ibm.com> Message-ID: <20050629165828.GA73550@muc.de> > Yep, OK. Pushing the timer would in fact break if the device was marked > perm disabled. I think for network drivers you should just write a generic error handler (perhaps in net/core/dev.c) that calls the watchdog handler. Then all drivers could be easily converted without much code duplication. 
-Andi From linas at austin.ibm.com Thu Jun 30 06:48:23 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 29 Jun 2005 15:48:23 -0500 Subject: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery In-Reply-To: <1120009868.5133.232.camel@gaston> References: <20050628235919.GA6415@austin.ibm.com> <1120009868.5133.232.camel@gaston> Message-ID: <20050629204823.GM28499@austin.ibm.com> On Wed, Jun 29, 2005 at 11:51:07AM +1000, Benjamin Herrenschmidt was heard to remark: > On Tue, 2005-06-28 at 18:59 -0500, Linas Vepstas wrote: > > pci-err-7-symbios.patch > > > > Adds PCI Error recoervy callbacks to the Symbios Sym53c8xx driver. > > Tested, seems to work well under i/o stress to one disk. Not > > stress tested under heavy i/o to multiple scsi devices. > > > > Note the check of the pci error state flag inside an infinite > > loop inside the interrupt handler. Without this check, the > > device can spin forever, locking up hard, long before the > > asynchronous error event (and callbacks) are ever called. > > Normally, you should check for non-responding hardware by testing things > like reading all ff's or having a timeout in the loop. For ppc64, that does happen in the loop, and so the flag does get set synchronously, even on a single-cpu system. But point taken. > The bug is that > the driver has a potential infinite loop in the first place. > > The only type of "synchronous" error checking that can be done is what > is proposed by Hidetoshi Seto. You could use his stuff here. Yes. However, I will leave this bit in for now, (and mark it as a hack) until Seto-san's patches are on deck. I'd rather not have a built-in pre-req right now. 
--linas From linas at austin.ibm.com Thu Jun 30 07:14:35 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 29 Jun 2005 16:14:35 -0500 Subject: [PATCH 8/13]: PCI Err: Event delivery utility In-Reply-To: <1120010387.5133.235.camel@gaston> References: <20050628235932.GA6429@austin.ibm.com> <1120010387.5133.235.camel@gaston> Message-ID: <20050629211435.GN28499@austin.ibm.com> On Wed, Jun 29, 2005 at 11:59:47AM +1000, Benjamin Herrenschmidt was heard to remark: > On Tue, 2005-06-28 at 18:59 -0500, Linas Vepstas wrote: > > pci-err-8-pci-err-event.patch > > > > [RFC] > > > > PCI Error distribution utility routine. This patch defines > > a utility routine that hasn't yet been discussed much on > > Certainly needs to be in a separate .h at least ... Also, you have some > lifetime issues. You probably want to do a get() on pci_dev when you put > it in your struct and put() it after the notifier... Oh wait, you are > doing pci_dev_put() ... but no pci_dev_get() ... The later must be > missing from peh_send_failure_event(). I'm pretty sure this was balanced, there is a get very early on when the error is detected. But I'll review. > I'd keep that in arch code for now. OK, I'm moving it there. It did seem both confusing and semi-pointless after the last round of changes. --linas From benh at kernel.crashing.org Thu Jun 30 09:40:07 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 30 Jun 2005 09:40:07 +1000 Subject: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery In-Reply-To: <20050629165828.GA73550@muc.de> References: <20050628235848.GA6376@austin.ibm.com> <1120009619.5133.228.camel@gaston> <20050629155954.GH28499@austin.ibm.com> <20050629165828.GA73550@muc.de> Message-ID: <1120088407.31924.21.camel@gaston> On Wed, 2005-06-29 at 18:58 +0200, Andi Kleen wrote: > > Yep, OK. Pushig the timer would in fact break if the device was marked > > perm disabled. 
> > I think for network drivers you should just write a generic error handler > (perhaps in net/core/dev.c) that calls the watchdog handler. > Then all drivers could be easily converted without much code duplication. Provided the watchdog timer completely reconfigures the device from reset since the slot will be reset... Ben. From benh at kernel.crashing.org Thu Jun 30 09:41:14 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 30 Jun 2005 09:41:14 +1000 Subject: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery In-Reply-To: <20050629204823.GM28499@austin.ibm.com> References: <20050628235919.GA6415@austin.ibm.com> <1120009868.5133.232.camel@gaston> <20050629204823.GM28499@austin.ibm.com> Message-ID: <1120088475.31924.23.camel@gaston> On Wed, 2005-06-29 at 15:48 -0500, Linas Vepstas wrote: > > The only type of "synchronous" error checking that can be done is what > > is proposed by Hidetoshi Seto. You could use his stuff here. > > Yes. However, I will leave this bit in for now, (and mark it as a hack) > until Seto-san's patches are on deck. I'd rather not have a built-in > pre-req right now. No, check for ff's where they don't make sense or add a timeout to the loop. That's the correct solution for now. Ben. From benh at kernel.crashing.org Thu Jun 30 09:42:01 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 30 Jun 2005 09:42:01 +1000 Subject: [PATCH 8/13]: PCI Err: Event delivery utility In-Reply-To: <20050629211435.GN28499@austin.ibm.com> References: <20050628235932.GA6429@austin.ibm.com> <1120010387.5133.235.camel@gaston> <20050629211435.GN28499@austin.ibm.com> Message-ID: <1120088522.31924.25.camel@gaston> On Wed, 2005-06-29 at 16:14 -0500, Linas Vepstas wrote: > I'm pretty sure this was balanced, there is a get very early on when the > error is detected. But I'll review. > > > I'd keep that in arch code for now. > > OK, I'm moving it there. 
It did seem both confusing and semi-pointless > after the last round of changes. Well, it's logical for the get and put to be in the same "layer" don't you think ? Ben. From benh at kernel.crashing.org Thu Jun 30 09:48:20 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 30 Jun 2005 09:48:20 +1000 Subject: [PATCH] net: add driver for the NIC on Cell Blades In-Reply-To: <1120030346.3196.21.camel@laptopd505.fenrus.org> References: <200506281528.08834.arnd@arndb.de> <1119966799.3175.32.camel@laptopd505.fenrus.org> <200506290238.59231.arnd@arndb.de> <1120030346.3196.21.camel@laptopd505.fenrus.org> Message-ID: <1120088901.31924.31.camel@gaston> > > > > Could you be more specific? My guess would be that the 'sync' in writel > > takes care of this. Should there be an extra mmiowb() in here or are > > you referring to some other problem? > > different problem. the sync will get the byte out of the cpu. It won't > get it out of the pci bridges... > > In short, pci bridges are allowed to buffer (post) writes until data > traffic in the other direction happens (eg readl() or dma). > > In cases where you want your writel to hit the device instantly (and > disabling irqs is generally one of those) you need to flush this posting > cache with a dummy readl(). As I keep repeating over and over again, nothing will guarantee that your interrupt is actually disabled here, not even a readl(). You _must_ make sure you are ready for a spurious interrupt coming in, though in that case, the interrupt will probably be very short. The thing is, interrupts (not MSIs tho, but the problem is still potentially there at the APIC level) are totally asynchronous to PCI transactions. By the time you readl() back (and thus flush your PCI posting buffers), you are approx. sure that the chip will have de-asserted its IRQ line (oh well, that isn't even sure, it may take a few cycles depending on the HW you have there). 
But the IRQ may already have been "captured" by the PIC, or even by several layers of PICs. By the time the IRQ de-assertion propagates all the way to the CPU, you may already have taken it. Since those PCI IRQs are level, hopefully, it will go down real soon (unless you have a misconfigured PIC along the chain), so in most cases, it's just a matter of ignoring it. If it's safe, just read your device IRQ status reg and your IRQ routine will "notice" that there is nothing to do. (Note that returning IRQ_NONE there may confuse the core, heh !). If it's not, then you need a local variable indicating that you did indeed mask interrupts, and that causes your handler to just bail out. It may get re-entered a couple of times, but ultimately, the IRQ will go down. Ben From kravetz at us.ibm.com Thu Jun 30 06:42:35 2005 From: kravetz at us.ibm.com (Mike Kravetz) Date: Wed, 29 Jun 2005 13:42:35 -0700 Subject: [PATCH] POWER 4 fails to boot with NUMA Message-ID: <20050629204235.GA9043@w-mikek2.ibm.com> If CONFIG_NUMA is set, some POWER 4 systems will fail to boot. This is because of special processing needed to handle invalid node IDs (0xffff) on POWER 4. My previous patch to handle memory 'holes' within nodes forgot to add this special case for POWER 4 in one place. In reality, I'm not sure that configuring the kernel for NUMA on POWER 4 makes much sense. Are there POWER 4 based systems with NUMA characteristics that are presented by the firmware? But, distros want one kernel for all systems so NUMA is on by default in their kernels. The patch handles those cases. 
-- Signed-off-by: Mike Kravetz diff -Naupr linux-2.6.12.1/arch/ppc64/mm/numa.c linux-2.6.12.1.work/arch/ppc64/mm/numa.c --- linux-2.6.12.1/arch/ppc64/mm/numa.c 2005-06-22 19:33:05.000000000 +0000 +++ linux-2.6.12.1.work/arch/ppc64/mm/numa.c 2005-06-29 18:52:47.000000000 +0000 @@ -644,7 +644,12 @@ void __init do_init_bootmem(void) new_range: mem_start = read_n_cells(addr_cells, &memcell_buf); mem_size = read_n_cells(size_cells, &memcell_buf); - numa_domain = numa_enabled ? of_node_numa_domain(memory) : 0; + if (numa_enabled) { + numa_domain = of_node_numa_domain(memory); + if (numa_domain >= MAX_NUMNODES) + numa_domain = 0; + } else + numa_domain = 0; if (numa_domain != nid) continue; From michael at ellerman.id.au Thu Jun 30 10:28:35 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 10:28:35 +1000 Subject: [PATCH 4/15] ppc64: Don't pass the pointers to xItLpQueue around In-Reply-To: <20050628150509.GA14155@austin.ibm.com> References: <1119914229.959253.194917692058.qpatch@concordia> <20050628144231.528386ed.sfr@canb.auug.org.au> <20050628150509.GA14155@austin.ibm.com> Message-ID: <200506301028.39540.michael@ellerman.id.au> On Wed, 29 Jun 2005 01:05, Olof Johansson wrote: > On Tue, Jun 28, 2005 at 02:42:31PM +1000, Stephen Rothwell wrote: > > Hi Michael, > > > > On Tue, 28 Jun 2005 09:17:17 +1000 Michael Ellerman wrote: > > > int t; > > > - u32 * inUseP = &(lpQueue->xInUseWord); > > > + u32 * inUseP = &xItLpQueue.xInUseWord; > > > > > > __asm__ __volatile__("\n\ > > > 1: lwarx %0,0,%2 \n\ > > > @@ -31,37 +31,37 @@ static __inline__ int set_inUse( struct > > > stwcx. %0,0,%2 \n\ > > > bne- 1b \n\ > > > 2: eieio" > > > > Could you fix this assembler code up so that it is a set of concatenated > > strings rather than one longe one, please? I think that is the preferred > > formatting these days. > > Actually, this is a once-used block of assembly that, as far as I can > tell, essentially does a cmpxchg() on a 32-bit value, does it not? 
> > How about axing the function alltogether and/or use cmpxchg() instead? Yep, great minds think alike ;D See my patch 17/17 ppc64: Replace custom locking code with a spinlock. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050630/20eda7ff/attachment.pgp From linas at austin.ibm.com Thu Jun 30 10:29:36 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 29 Jun 2005 19:29:36 -0500 Subject: [PATCH 8/13]: PCI Err: Event delivery utility In-Reply-To: <1120088522.31924.25.camel@gaston> References: <20050628235932.GA6429@austin.ibm.com> <1120010387.5133.235.camel@gaston> <20050629211435.GN28499@austin.ibm.com> <1120088522.31924.25.camel@gaston> Message-ID: <20050630002936.GS28499@austin.ibm.com> On Thu, Jun 30, 2005 at 09:42:01AM +1000, Benjamin Herrenschmidt was heard to remark: > On Wed, 2005-06-29 at 16:14 -0500, Linas Vepstas wrote: > > > I'm pretty sure this was balanced, there is a get very early on when the > > error is detected. But I'll review. > > > > > I'd keep that in arch code for now. > > > > OK, I'm moving it there. It did seem both confusing and semi-pointless > > after the last round of changes. > > Well, it's logical for the get and put to be in the same "layer" don't > you think ? Yes, it could be made more symmetrical; I'll do that. The get happened along with the malloc of the event structure, the put happens right before the free of the same structure. 
The reason for the event was to get the recovery of the error out of line from the detection of the error; detection may occur in an interrupt context; recovery happens in a work queue. Thus, get may happen in that interrupt context, the put only after the work is complete. I'll make the code more symmetrical with regards to the event malloc/free and that should make it more readable. --linas From laforge at gnumonks.org Thu Jun 30 18:04:39 2005 From: laforge at gnumonks.org (Harald Welte) Date: Thu, 30 Jun 2005 10:04:39 +0200 Subject: mmio latency measurements Message-ID: <20050630080439.GD25641@sunbeam.de.gnumonks.org> Hi! I'm involved in a group of people (mostly Linux networking developers) that is currently working on a tool to measure the MMIO latency of PCI devices (such as network boards) on various hardware. The preliminary tool is available from http://svn.gnumonks.org/trunk/mmio_test This tool works fine on a number of architectures (x86, x86_64, ia64), but fails on ppc64. So what is it trying to do? It wants to measure the number of cpu cycles spent for a MMIO read from a device-specific address: 1. It mmap()'s a pci device's MMIO registers from /dev/mem. 2. It reads the cpu cycle counter (include/asm/timex.h:get_cycles()) 3. It reads the requested mmio register eight times 4. It reads the cpu cycle counter via get_cycles() again. However, the number of cycles we get by doing this is at least a factor of a thousand smaller than on other comparable hardware. So here are my questions: 1. Is this a valid use of get_cycles() 2. Are there any special caveats ? 3. Please propose an alternative mechanism, in case get_cycles() won't work Any patches and/or comments welcome. Thanks! -- - Harald Welte http://gnumonks.org/ ============================================================================ "Privacy in residential applications is a desirable marketing option." (ETSI EN 300 175-7 Ch. 
A6) From benh at kernel.crashing.org Thu Jun 30 18:56:58 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 30 Jun 2005 18:56:58 +1000 Subject: mmio latency measurements In-Reply-To: <20050630080439.GD25641@sunbeam.de.gnumonks.org> References: <20050630080439.GD25641@sunbeam.de.gnumonks.org> Message-ID: <1120121818.31924.52.camel@gaston> > 1. Is this a valid use of get_cycles() > 2. Are there any special caveats ? > 3. Please propose an alternative mechanism, in case get_cycles() won't > work On ppc64, there is no cycle-counter per se, but a HW timebase that ticks at a fixed frequency (independently of the CPU frequency nowadays). That's the value get_cycles() returns. You can see the calibration value of the timebase in hertz in /proc/cpuinfo. Ben. From laforge at gnumonks.org Thu Jun 30 20:00:36 2005 From: laforge at gnumonks.org (Harald Welte) Date: Thu, 30 Jun 2005 12:00:36 +0200 Subject: mmio latency measurements In-Reply-To: <1120121818.31924.52.camel@gaston> References: <20050630080439.GD25641@sunbeam.de.gnumonks.org> <1120121818.31924.52.camel@gaston> Message-ID: <20050630100036.GJ25641@sunbeam.de.gnumonks.org> On Thu, Jun 30, 2005 at 06:56:58PM +1000, Benjamin Herrenschmidt wrote: > > > 1. Is this a valid use of get_cycles() > > 2. Are there any special caveats ? > > 3. Please propose an alternative mechanism, in case get_cycles() won't > > work > > On ppc64, there is no cycle-counter per se, but a HW timebase that ticks > at a fixed frequency (independently of the CPU frequency nowadays). > That's the value get_cycles() returns. You can see the calibration value > of the timebase in hertz in /proc/cpuinfo. 
thanks a lot for your fast reply, I'll adapt the program to accommodate this. -- - Harald Welte http://gnumonks.org/ ============================================================================ "Privacy in residential applications is a desirable marketing option." (ETSI EN 300 175-7 Ch. A6) From michael at ellerman.id.au Thu Jun 30 20:16:49 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:16:49 +1000 Subject: [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Message-ID: <200506302016.55125.michael@ellerman.id.au> Hi y'all, The following is a series of patches for the iseries_veth driver. They're not ready for merging yet, as we need to do more extensive testing. However any feedback you have will be greatly appreciated. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person 
From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 5/12] iseries_veth: Try to avoid pathological reset behaviour In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.441162.530324669503.qpatch@concordia> The iseries_veth driver contains a state machine which is used to manage how connections are set up and negotiated between LPARs. If one side of a connection resets for some reason, the two LPARs can get stuck in a race to re-setup the connection. This can lead to the connection being declared dead by one or both ends. In practice this happens ~8/10 times a connection is reset, although it's rare for connections to be reset. (an example here: http://michael.ellerman.id.au/files/misc/veth-trace.html) The core of the problem is that the end that resets the connection doesn't wait for the other end to become aware of the reset. So the resetting end starts setting the connection back up, and then receives a reset from the other end (which is the response to the initial reset). And so on. We're severely limited in what we can do to fix this. The protocol between LPARs is essentially fixed, as we have to interoperate with both OS/400 and old Linux drivers. Which also means we need a fix that only changes the code on one end. The only fix I've found, given that, is to just blindly sleep for a bit when resetting the connection, in the hope that the other end will get itself sorted. Needless to say I'd love it if someone has a better idea. This does work; I've so far been unable to get it to break, whereas without the fix a reset of one end will lead to a dead connection ~8/10 times. 
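The decision described above can be sketched in isolation, roughly as follows. This is a userspace illustration only: the state bits, RESET_BACKOFF_SECS and backoff_after_reset() are hypothetical names, and the 5 second figure is assumed to mirror the 5 * HZ delay used in the patch below.

```c
/* Hypothetical state bits; the real driver tracks more than these. */
enum {
	STATE_RESET    = 1 << 0,
	STATE_SHUTDOWN = 1 << 1,
};

#define RESET_BACKOFF_SECS 5	/* assumed to mirror 5 * HZ in the patch */

/*
 * Called once the reset path has torn the connection down.
 * Returns how many seconds to wait before the state machine runs
 * again; 0 means "carry on immediately".
 */
static int backoff_after_reset(unsigned int state)
{
	if (state & STATE_RESET)	/* another reset arrived: restart now */
		return 0;
	if (state & STATE_SHUTDOWN)	/* shutting down: never stall teardown */
		return 0;
	return RESET_BACKOFF_SECS;	/* let the other end notice the reset */
}
```

The point of the shutdown check is that the back-off is a courtesy to the remote end, so it should never delay our own teardown.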
--- drivers/net/iseries_veth.c | 23 +++++++++++++++++++++-- 1 files changed, 21 insertions(+), 2 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -324,8 +324,12 @@ static void veth_take_monitor_ack(struct spin_lock_irqsave(&cnx->lock, flags); veth_debug("cnx %d: lost connection.\n", cnx->remote_lp); - cnx->state |= VETH_STATE_RESET; - veth_kick_statemachine(cnx); + /* Avoid kicking the statemachine once we're shutdown. + * It's unnecessary and it could break veth_stop_connection(). */ + if (! (cnx->state & VETH_STATE_SHUTDOWN)) { + cnx->state |= VETH_STATE_RESET; + veth_kick_statemachine(cnx); + } spin_unlock_irqrestore(&cnx->lock, flags); } @@ -483,6 +487,12 @@ static void veth_statemachine(void *p) if (cnx->state & VETH_STATE_RESET) goto restart; + + /* Hack, wait for the other end to reset itself. */ + if (! (cnx->state & VETH_STATE_SHUTDOWN)) { + schedule_delayed_work(&cnx->statemachine_wq, 5 * HZ); + goto out; + } } if (cnx->state & VETH_STATE_SHUTDOWN) @@ -667,6 +677,15 @@ static void veth_stop_connection(u8 rlp) veth_kick_statemachine(cnx); spin_unlock_irq(&cnx->lock); + /* There's a slim chance the reset code has just queued the + * statemachine to run in five seconds. If so we need to cancel + * that and requeue the work to run now. */ + if (cancel_delayed_work(&cnx->statemachine_wq)) { + spin_lock_irq(&cnx->lock); + veth_kick_statemachine(cnx); + spin_unlock_irq(&cnx->lock); + } + /* Wait for the state machine to run. 
*/ flush_scheduled_work(); } From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 4/12] iseries_veth: Remove a FIXME WRT deletion of the ack_timer In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.355585.362623134076.qpatch@concordia> The iseries_veth driver has a timer which we use to send acks. When the connection is reset or stopped we need to delete the timer. Currently we only call del_timer() when resetting a connection, which means the timer might run again while the connection is being re-setup. As it turns out that's ok, because the flags the timer consults have been reset. It's cleaner though to call del_timer_sync() once we've dropped the lock, although the timer may still run between us dropping the lock and calling del_timer_sync(), but as above that's ok. --- drivers/net/iseries_veth.c | 21 +++++++++++++-------- 1 files changed, 13 insertions(+), 8 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -450,13 +450,15 @@ static void veth_statemachine(void *p) if (cnx->state & VETH_STATE_RESET) { int i; - del_timer(&cnx->ack_timer); - if (cnx->state & VETH_STATE_OPEN) HvCallEvent_closeLpEventPath(cnx->remote_lp, HvLpEvent_Type_VirtualLan); - /* reset ack data */ + /* + * Reset ack data. This prevents the ack_timer actually + * doing anything, even if it runs one more time when + * we drop the lock below. + */ memset(&cnx->pending_acks, 0xff, sizeof (cnx->pending_acks)); cnx->num_pending_acks = 0; @@ -469,9 +471,16 @@ static void veth_statemachine(void *p) if (cnx->msgs) for (i = 0; i < VETH_NUMBUFFERS; ++i) veth_recycle_msg(cnx, cnx->msgs + i); + + /* Drop the lock so we can do stuff that might sleep or + * take other locks. 
*/ spin_unlock_irq(&cnx->lock); + + del_timer_sync(&cnx->ack_timer); veth_flush_pending(cnx); + spin_lock_irq(&cnx->lock); + if (cnx->state & VETH_STATE_RESET) goto restart; } @@ -658,12 +667,8 @@ static void veth_stop_connection(u8 rlp) veth_kick_statemachine(cnx); spin_unlock_irq(&cnx->lock); + /* Wait for the state machine to run. */ flush_scheduled_work(); - - /* FIXME: not sure if this is necessary - will already have - * been deleted by the state machine, just want to make sure - * its not running any more */ - del_timer_sync(&cnx->ack_timer); } static void veth_destroy_connection(u8 rlp) From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 3/12] iseries_veth: Make init_connection() & destroy_connection() symmetrical In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.290253.340047065213.qpatch@concordia> This patch makes veth_init_connection() and veth_destroy_connection() symmetrical in that they allocate/deallocate the same data. Currently if there's an error while initialising connections (ie. ENOMEM) we call veth_module_cleanup(), however this will oops because we call driver_unregister() before we've called driver_register(). I've never seen this actually happen though. So instead we explicitly call veth_destroy_connection() in a reverse loop for the connections we've successfully initialised. 
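A minimal userspace sketch of the reverse-loop unwind described above, under the assumption that destroying a never-initialised connection is a harmless no-op (as the NULL check in veth_destroy_connection() suggests). NUM_CNX, fail_at and the helper names are invented for the example; malloc()/free() stand in for the driver's real allocations.

```c
#include <stdlib.h>

#define NUM_CNX 4		/* stand-in for HVMAXARCHITECTEDLPS */

static void *cnx[NUM_CNX];
static int fail_at = -1;	/* test knob: index whose init should fail */

static int init_connection(int i)
{
	if (i == fail_at)
		return -1;	/* pretend we hit -ENOMEM here */
	cnx[i] = malloc(16);
	return cnx[i] ? 0 : -1;
}

static void destroy_connection(int i)
{
	free(cnx[i]);		/* free(NULL) is a no-op, like "if (!cnx) return" */
	cnx[i] = NULL;
}

static int module_init_sketch(void)
{
	int i, rc;

	for (i = 0; i < NUM_CNX; i++) {
		rc = init_connection(i);
		if (rc != 0) {
			/* Unwind index i and everything set up before it. */
			for (; i >= 0; i--)
				destroy_connection(i);
			return rc;
		}
	}
	return 0;
}
```

Because init and destroy handle the same data, the error path only ever undoes what init did, and nothing that was never registered gets unregistered.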
--- drivers/net/iseries_veth.c | 22 +++++++++++----------- 1 files changed, 11 insertions(+), 11 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -664,6 +664,14 @@ static void veth_stop_connection(u8 rlp) * been deleted by the state machine, just want to make sure * its not running any more */ del_timer_sync(&cnx->ack_timer); +} + +static void veth_destroy_connection(u8 rlp) +{ + struct veth_lpar_connection *cnx = veth_cnx[rlp]; + + if (! cnx) + return; if (cnx->num_events > 0) mf_deallocate_lp_events(cnx->remote_lp, @@ -675,14 +683,6 @@ static void veth_stop_connection(u8 rlp) HvLpEvent_Type_VirtualLan, cnx->num_ack_events, NULL, NULL); -} - -static void veth_destroy_connection(u8 rlp) -{ - struct veth_lpar_connection *cnx = veth_cnx[rlp]; - - if (! cnx) - return; kfree(cnx->msgs); kfree(cnx); @@ -1424,15 +1424,15 @@ module_exit(veth_module_cleanup); int __init veth_module_init(void) { - int i; - int rc; + int i, rc; this_lp = HvLpConfig_getLpIndex_outline(); for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) { rc = veth_init_connection(i); if (rc != 0) { - veth_module_cleanup(); + for (; i >= 0; i--) + veth_destroy_connection(i); return rc; } } From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 8/12] iseries_veth: Replace lock-protected atomic with an ordinary variable In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.641415.625704757570.qpatch@concordia> The iseries_veth driver uses atomic ops to manipulate the in_use field of one of its per-connection structures. However all references to the flag occur while the connection's lock is held, so the atomic ops aren't necessary. 
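To illustrate the reasoning, here is a small userspace sketch in which a pthread mutex stands in for the connection's spinlock. The struct, lock and function names are hypothetical; the only point is that a flag which is read and written exclusively under one lock can be a plain int.

```c
#include <pthread.h>

struct msg {
	int in_use;	/* plain int: every access happens under cnx_lock */
};

static pthread_mutex_t cnx_lock = PTHREAD_MUTEX_INITIALIZER;

/* Mark a message as carrying a frame. */
static void claim_msg(struct msg *m)
{
	pthread_mutex_lock(&cnx_lock);
	m->in_use = 1;
	pthread_mutex_unlock(&cnx_lock);
}

/* Returns 1 if the message was in use and has now been recycled. */
static int recycle_msg(struct msg *m)
{
	int was_in_use;

	pthread_mutex_lock(&cnx_lock);
	was_in_use = m->in_use;	/* test ... */
	m->in_use = 0;		/* ... and clear, atomic w.r.t. other lock holders */
	pthread_mutex_unlock(&cnx_lock);

	return was_in_use;
}
```

The lock already makes the test-and-clear indivisible with respect to every other holder, which is all that test_and_clear_bit() was providing.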
--- drivers/net/iseries_veth.c | 13 +++++++------ 1 files changed, 7 insertions(+), 6 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -117,7 +117,7 @@ struct veth_msg { struct veth_msg *next; struct VethFramesData data; int token; - unsigned long in_use; + int in_use; struct sk_buff *skb; struct device *dev; }; @@ -959,6 +959,8 @@ static int veth_transmit_to_one(struct s goto drop; } + msg->in_use = 1; + dma_length = skb->len; dma_address = dma_map_single(port->dev, skb->data, dma_length, DMA_TO_DEVICE); @@ -973,7 +975,6 @@ static int veth_transmit_to_one(struct s msg->data.addr[0] = dma_address; msg->data.len[0] = dma_length; msg->data.eofmask = 1 << VETH_EOF_SHIFT; - set_bit(0, &(msg->in_use)); rc = veth_signaldata(cnx, VethEventTypeFrames, msg->token, &msg->data); if (rc != HvLpEvent_Rc_Good) @@ -983,10 +984,8 @@ static int veth_transmit_to_one(struct s return 0; recycle_and_drop: + /* we free the skb below, so tell veth_recycle_msg() not to. */ msg->skb = NULL; - /* need to set in use to make veth_recycle_msg in case this - * was a mapping failure */ - set_bit(0, &msg->in_use); veth_recycle_msg(cnx, msg); drop: port->stats.tx_errors++; @@ -1068,12 +1067,14 @@ static int veth_start_xmit(struct sk_buf return 0; } +/* You must hold the connection's lock when you call this function.
*/ static void veth_recycle_msg(struct veth_lpar_connection *cnx, struct veth_msg *msg) { u32 dma_address, dma_length; - if (test_and_clear_bit(0, &msg->in_use)) { + if (msg->in_use) { + msg->in_use = 0; dma_address = msg->data.addr[0]; dma_length = msg->data.len[0]; From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 2/12] iseries_veth: Cleanup error and debug messages In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.217047.4847506912.qpatch@concordia> This patch: * converts uses of veth_printk() to veth_debug()/veth_error() * makes terminology consistent, ie. always refer to LPAR not lpar * be consistent about printing return codes as %d not %x * make printf formats fit in 80 columns --- drivers/net/iseries_veth.c | 87 ++++++++++++++++++++++----------------------- 1 files changed, 43 insertions(+), 44 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -287,7 +287,7 @@ static void veth_take_cap(struct veth_lp HvLpEvent_Type_VirtualLan); if (cnx->state & VETH_STATE_GOTCAPS) { - veth_error("Received a second capabilities from lpar %d\n", + veth_error("Received a second capabilities from LPAR %d.\n", cnx->remote_lp); event->base_event.xRc = HvLpEvent_Rc_BufferNotAvailable; HvCallEvent_ackLpEvent((struct HvLpEvent *) event); @@ -306,7 +306,7 @@ static void veth_take_cap_ack(struct vet spin_lock_irqsave(&cnx->lock, flags); if (cnx->state & VETH_STATE_GOTCAPACK) { - veth_error("Received a second capabilities ack from lpar %d\n", + veth_error("Received a second capabilities ack from LPAR %d.\n", cnx->remote_lp); } else { memcpy(&cnx->cap_ack_event, event, @@ -323,8 +323,7 @@ static void veth_take_monitor_ack(struct unsigned long flags; spin_lock_irqsave(&cnx->lock, flags); - 
veth_printk(KERN_DEBUG, "Monitor ack returned for lpar %d\n", - cnx->remote_lp); + veth_debug("cnx %d: lost connection.\n", cnx->remote_lp); cnx->state |= VETH_STATE_RESET; veth_kick_statemachine(cnx); spin_unlock_irqrestore(&cnx->lock, flags); @@ -345,8 +344,8 @@ static void veth_handle_ack(struct VethL veth_take_monitor_ack(cnx, event); break; default: - veth_error("Unknown ack type %d from lpar %d\n", - event->base_event.xSubtype, rlp); + veth_error("Unknown ack type %d from LPAR %d.\n", + event->base_event.xSubtype, rlp); }; } @@ -382,8 +381,8 @@ static void veth_handle_int(struct VethL veth_receive(cnx, event); break; default: - veth_error("Unknown interrupt type %d from lpar %d\n", - event->base_event.xSubtype, rlp); + veth_error("Unknown interrupt type %d from LPAR %d.\n", + event->base_event.xSubtype, rlp); }; } @@ -409,8 +408,8 @@ static int veth_process_caps(struct veth || (remote_caps->ack_threshold > VETH_MAX_ACKS_PER_MSG) || (remote_caps->ack_threshold == 0) || (cnx->ack_timeout == 0) ) { - veth_error("Received incompatible capabilities from lpar %d\n", - cnx->remote_lp); + veth_error("Received incompatible capabilities from LPAR %d.\n", + cnx->remote_lp); return HvLpEvent_Rc_InvalidSubtypeData; } @@ -427,8 +426,8 @@ static int veth_process_caps(struct veth cnx->num_ack_events += num; if (cnx->num_ack_events < num_acks_needed) { - veth_error("Couldn't allocate enough ack events for lpar %d\n", - cnx->remote_lp); + veth_error("Couldn't allocate enough ack events " + "for LPAR %d.\n", cnx->remote_lp); return HvLpEvent_Rc_BufferNotAvailable; } @@ -507,9 +506,8 @@ static void veth_statemachine(void *p) } else { if ( (rc != HvLpEvent_Rc_PartitionDead) && (rc != HvLpEvent_Rc_PathClosed) ) - veth_error("Error sending monitor to " - "lpar %d, rc=%x\n", - rlp, (int) rc); + veth_error("Error sending monitor to LPAR %d, " + "rc = %d\n", rlp, rc); /* Oh well, hope we get a cap from the other * end and do better when that kicks us */ @@ -532,9 +530,9 @@ static void 
veth_statemachine(void *p) } else { if ( (rc != HvLpEvent_Rc_PartitionDead) && (rc != HvLpEvent_Rc_PathClosed) ) - veth_error("Error sending caps to " - "lpar %d, rc=%x\n", - rlp, (int) rc); + veth_error("Error sending caps to LPAR %d, " + "rc = %d\n", rlp, rc); + /* Oh well, hope we get a cap from the other * end and do better when that kicks us */ goto out; @@ -574,10 +572,8 @@ static void veth_statemachine(void *p) add_timer(&cnx->ack_timer); cnx->state |= VETH_STATE_READY; } else { - veth_printk(KERN_ERR, "Caps rejected (rc=%d) by " - "lpar %d\n", - cnx->cap_ack_event.base_event.xRc, - rlp); + veth_error("Caps rejected by LPAR %d, rc = %d\n", + rlp, cnx->cap_ack_event.base_event.xRc); goto cant_cope; } } @@ -590,8 +586,8 @@ static void veth_statemachine(void *p) /* FIXME: we get here if something happens we really can't * cope with. The link will never work once we get here, and * all we can do is not lock the rest of the system up */ - veth_error("Badness on connection to lpar %d (state=%04lx) " - " - shutting down\n", rlp, cnx->state); + veth_error("Unrecoverable error on connection to LPAR %d, shutting down" + " (state = 0x%04lx)\n", rlp, cnx->state); cnx->state |= VETH_STATE_SHUTDOWN; spin_unlock_irq(&cnx->lock); } @@ -623,7 +619,7 @@ static int veth_init_connection(u8 rlp) msgs = kmalloc(VETH_NUMBUFFERS * sizeof(struct veth_msg), GFP_KERNEL); if (! 
msgs) { - veth_error("Can't allocate buffers for lpar %d\n", rlp); + veth_error("Can't allocate buffers for LPAR %d.\n", rlp); return -ENOMEM; } @@ -639,8 +635,7 @@ static int veth_init_connection(u8 rlp) cnx->num_events = veth_allocate_events(rlp, 2 + VETH_NUMBUFFERS); if (cnx->num_events < (2 + VETH_NUMBUFFERS)) { - veth_error("Can't allocate events for lpar %d, only got %d\n", - rlp, cnx->num_events); + veth_error("Can't allocate enough events for LPAR %d.\n", rlp); return -ENOMEM; } @@ -898,15 +893,17 @@ static struct net_device * __init veth_p rc = register_netdev(dev); if (rc != 0) { - veth_printk(KERN_ERR, - "Failed to register ethernet device for vlan %d\n", - vlan); + veth_error("Failed registering net device for vlan%d.\n", vlan); free_netdev(dev); return NULL; } - veth_printk(KERN_DEBUG, "%s attached to iSeries vlan %d (lpar_map=0x%04x)\n", - dev->name, vlan, port->lpar_map); + veth_info("%s attached to iSeries vlan %d.\n", dev->name, vlan); + + for (i = 0; i < HVMAXARCHITECTEDLPS; i++) { + if (port->lpar_map & (1 << i)) + veth_info("%s connected to LPAR %d.\n", dev->name, i); + } return dev; } @@ -1039,7 +1036,7 @@ static int veth_start_xmit(struct sk_buf dev_kfree_skb(skb); } else { if (port->pending_skb) { - veth_error("%s: Tx while skb was pending!\n", + veth_error("%s: TX while skb was pending!\n", dev->name); dev_kfree_skb(skb); spin_unlock_irqrestore(&port->pending_gate, flags); @@ -1075,10 +1072,10 @@ static void veth_recycle_msg(struct veth memset(&msg->data, 0, sizeof(msg->data)); veth_stack_push(cnx, msg); - } else - if (cnx->state & VETH_STATE_OPEN) - veth_error("Bogus frames ack from lpar %d (#%d)\n", - cnx->remote_lp, msg->token); + } else if (cnx->state & VETH_STATE_OPEN) { + veth_error("Non-pending frame (# %d) acked by LPAR %d.\n", + cnx->remote_lp, msg->token); + } } static void veth_flush_pending(struct veth_lpar_connection *cnx) @@ -1188,8 +1185,8 @@ static void veth_flush_acks(struct veth_ 0, &cnx->pending_acks); if (rc != 
HvLpEvent_Rc_Good) - veth_error("Error 0x%x acking frames from lpar %d!\n", - (unsigned)rc, cnx->remote_lp); + veth_error("Failed acking frames from LPAR %d, rc = %d\n", + cnx->remote_lp, (int)rc); cnx->num_pending_acks = 0; memset(&cnx->pending_acks, 0xff, sizeof(cnx->pending_acks)); @@ -1225,9 +1222,10 @@ static void veth_receive(struct veth_lpa /* make sure that we have at least 1 EOF entry in the * remaining entries */ if (! (senddata->eofmask >> (startchunk + VETH_EOF_SHIFT))) { - veth_error("missing EOF frag in event " - "eofmask=0x%x startchunk=%d\n", - (unsigned) senddata->eofmask, startchunk); + veth_error("Missing EOF fragment in event " + "eofmask = 0x%x startchunk = %d\n", + (unsigned)senddata->eofmask, + startchunk); break; } @@ -1246,8 +1244,9 @@ static void veth_receive(struct veth_lpa /* nchunks == # of chunks in this frame */ if ((length - ETH_HLEN) > VETH_MAX_MTU) { - veth_error("Received oversize frame from lpar %d " - "(length=%d)\n", cnx->remote_lp, length); + veth_error("Received oversize frame from LPAR %d " + "(length = %d)\n", + cnx->remote_lp, length); continue; } From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 10/12] iseries_veth: Remove TX timeout code In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.908017.660889424014.qpatch@concordia> The iseries_veth driver uses the generic TX timeout watchdog, however a better solution is in the works, so remove this code. 
--- drivers/net/iseries_veth.c | 48 --------------------------------------------- 1 files changed, 48 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -813,49 +813,6 @@ static struct ethtool_ops ops = { .get_link = veth_get_link, }; -static void veth_tx_timeout(struct net_device *dev) -{ - struct veth_port *port = (struct veth_port *)dev->priv; - struct net_device_stats *stats = &port->stats; - unsigned long flags; - int i; - - stats->tx_errors++; - - spin_lock_irqsave(&port->pending_gate, flags); - - if (!port->pending_lpmask) { - spin_unlock_irqrestore(&port->pending_gate, flags); - return; - } - - printk(KERN_WARNING "%s: Tx timeout! Resetting lp connections: %08x\n", - dev->name, port->pending_lpmask); - - for (i = 0; i < HVMAXARCHITECTEDLPS; i++) { - struct veth_lpar_connection *cnx = veth_cnx[i]; - - if (! (port->pending_lpmask & (1<lock); - cnx->state |= VETH_STATE_RESET; - veth_kick_statemachine(cnx); - spin_unlock(&cnx->lock); - } - - spin_unlock_irqrestore(&port->pending_gate, flags); -} - static struct net_device * __init veth_probe_one(int vlan, struct device *vdev) { struct net_device *dev; @@ -904,9 +861,6 @@ static struct net_device * __init veth_p dev->set_multicast_list = veth_set_multicast_list; SET_ETHTOOL_OPS(dev, &ops); - dev->watchdog_timeo = 2 * (VETH_ACKTIMEOUT * HZ / 1000000); - dev->tx_timeout = veth_tx_timeout; - SET_NETDEV_DEV(dev, vdev); rc = register_netdev(dev); @@ -1047,8 +1001,6 @@ static int veth_start_xmit(struct sk_buf lpmask = veth_transmit_to_many(skb, lpmask, dev); - dev->trans_start = jiffies; - if (! 
lpmask) { dev_kfree_skb(skb); } else { From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 7/12] iseries_veth: Remove redundant message stack lock In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.569702.406803544780.qpatch@concordia> The iseries_veth driver keeps a stack of messages for each connection and a lock to protect the stack. However there is also a per-connection lock which makes the message stack lock redundant. Remove the message stack lock and document the fact that callers of the stack-manipulation functions must hold the connection's lock. --- drivers/net/iseries_veth.c | 12 +++--------- 1 files changed, 3 insertions(+), 9 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -143,7 +143,6 @@ struct veth_lpar_connection { struct VethCapData remote_caps; u32 ack_timeout; - spinlock_t msg_stack_lock; struct veth_msg *msg_stack_head; }; @@ -190,27 +189,23 @@ static void veth_timed_ack(unsigned long #define veth_debug(fmt, args...) do {} while (0) #endif +/* You must hold the connection's lock when you call this function. */ static inline void veth_stack_push(struct veth_lpar_connection *cnx, struct veth_msg *msg) { - unsigned long flags; - - spin_lock_irqsave(&cnx->msg_stack_lock, flags); msg->next = cnx->msg_stack_head; cnx->msg_stack_head = msg; - spin_unlock_irqrestore(&cnx->msg_stack_lock, flags); } +/* You must hold the connection's lock when you call this function.
*/ static inline struct veth_msg *veth_stack_pop(struct veth_lpar_connection *cnx) { - unsigned long flags; struct veth_msg *msg; - spin_lock_irqsave(&cnx->msg_stack_lock, flags); msg = cnx->msg_stack_head; if (msg) cnx->msg_stack_head = cnx->msg_stack_head->next; - spin_unlock_irqrestore(&cnx->msg_stack_lock, flags); + return msg; } @@ -643,7 +638,6 @@ static int veth_init_connection(u8 rlp) cnx->msgs = msgs; memset(msgs, 0, VETH_NUMBUFFERS * sizeof(struct veth_msg)); - spin_lock_init(&cnx->msg_stack_lock); for (i = 0; i < VETH_NUMBUFFERS; i++) { msgs[i].token = i; From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 6/12] iseries_veth: Fix broken promiscuous handling In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.505562.276017099853.qpatch@concordia> Due to a logic bug, once promiscuous mode is enabled in the iseries_veth driver it is never disabled. The driver keeps two flags, promiscuous and all_mcast which have exactly the same effect. This is because we only ever receive packets destined for us, or multicast packets. So consolidate them into one promiscuous flag for simplicity. 
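The logic bug described above is easy to see in a stripped-down model. What follows is a user-space sketch only: struct sk_port, sk_set_rx_mode() and the SK_* constants are hypothetical names invented for illustration (the flag bits are not the real IFF_* values). It shows the consolidated flag being cleared on the non-promiscuous path, which is the step whose absence meant promiscuous mode was never turned off.

```c
#include <assert.h>

#define SK_PROMISC	0x1u	/* hypothetical flag bits, not the real IFF_* values */
#define SK_ALLMULTI	0x2u
#define SK_MAX_MCAST	12	/* hypothetical stand-in for VETH_MAX_MCAST */

struct sk_port {
	int promiscuous;
	int num_mcast;
};

/*
 * Model of the receive-mode update: promiscuous, all-multicast and
 * "too many multicast addresses" all collapse into one flag, and the
 * flag is explicitly cleared when none of them applies.
 */
static void sk_set_rx_mode(struct sk_port *port, unsigned int flags,
			   int mc_count)
{
	assert(mc_count >= 0);

	if ((flags & (SK_PROMISC | SK_ALLMULTI)) || mc_count > SK_MAX_MCAST) {
		port->promiscuous = 1;
	} else {
		port->promiscuous = 0;	/* without this the flag sticks forever */
		port->num_mcast = mc_count;
	}
}
```

In this model, enabling promiscuous mode and then calling sk_set_rx_mode() again with no flags and a small multicast list leaves promiscuous at 0, which is the behaviour the patch is after.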
--- drivers/net/iseries_veth.c | 16 +++++----------- 1 files changed, 5 insertions(+), 11 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -159,7 +159,6 @@ struct veth_port { rwlock_t mcast_gate; int promiscuous; - int all_mcast; int num_mcast; u64 mcast_addr[VETH_MAX_MCAST]; }; @@ -754,17 +753,15 @@ static void veth_set_multicast_list(stru write_lock_irqsave(&port->mcast_gate, flags); - if (dev->flags & IFF_PROMISC) { /* set promiscuous mode */ - printk(KERN_INFO "%s: Promiscuous mode enabled.\n", - dev->name); + if ((dev->flags & IFF_PROMISC) || (dev->flags & IFF_ALLMULTI) || + (dev->mc_count > VETH_MAX_MCAST)) { port->promiscuous = 1; - } else if ( (dev->flags & IFF_ALLMULTI) - || (dev->mc_count > VETH_MAX_MCAST) ) { - port->all_mcast = 1; } else { struct dev_mc_list *dmi = dev->mc_list; int i; + port->promiscuous = 0; + /* Update table */ port->num_mcast = 0; @@ -1147,12 +1144,9 @@ static inline int veth_frame_wanted(stru if ( (mac_addr == port->mac_addr) || (mac_addr == 0xffffffffffff0000) ) return 1; - if (! (((char *) &mac_addr)[0] & 0x01)) - return 0; - read_lock_irqsave(&port->mcast_gate, flags); - if (port->promiscuous || port->all_mcast) { + if (port->promiscuous) { wanted = 1; goto out; } From michael at ellerman.id.au Thu Jun 30 20:20:40 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:40 +1000 Subject: [PATCH 12/12] iseries_veth: Simplify full-queue handling In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126840.155491.927718131055.qpatch@concordia> The iseries_veth driver may have multiple netdevices sending packets over a single connection to another LPAR. If the bandwidth to the other LPAR is exceeded, all the netdevices must have their queues stopped.
The current code achieves this by queueing one incoming skb on the per-netdevice port structure. When the connection is able to send more packets it flushes the queued packet for all netdevices and restarts their queues. This arrangement makes less sense now that we have per-connection TX timers, rather than the per-netdevice generic TX timer. The new code simply detects when one of the connections is full, and stops the queue of all associated netdevices. Then when a packet is acked on that connection (ie. there is space again) all the queues are woken up. --- drivers/net/iseries_veth.c | 108 ++++++++++++++++++++++++++------------------- 1 files changed, 64 insertions(+), 44 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -158,10 +158,11 @@ struct veth_port { u64 mac_addr; HvLpIndexMap lpar_map; - spinlock_t pending_gate; - struct sk_buff *pending_skb; - HvLpIndexMap pending_lpmask; + /* queue_lock protects the stopped_map and dev's queue. */ + spinlock_t queue_lock; + HvLpIndexMap stopped_map; + /* mcast_gate protects promiscuous, num_mcast & mcast_addr. 
*/ rwlock_t mcast_gate; int promiscuous; int num_mcast; @@ -174,7 +175,8 @@ static struct net_device *veth_dev[HVMAX static int veth_start_xmit(struct sk_buff *skb, struct net_device *dev); static void veth_recycle_msg(struct veth_lpar_connection *, struct veth_msg *); -static void veth_flush_pending(struct veth_lpar_connection *cnx); +static void veth_wake_queues(struct veth_lpar_connection *cnx); +static void veth_stop_queues(struct veth_lpar_connection *cnx); static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *); static void veth_timed_ack(unsigned long ptr); static void veth_timed_reset(unsigned long ptr); @@ -216,6 +218,12 @@ static inline struct veth_msg *veth_stac return msg; } +/* You must hold the connection's lock when you call this function. */ +static inline int veth_stack_is_empty(struct veth_lpar_connection *cnx) +{ + return cnx->msg_stack_head == NULL; +} + static inline HvLpEvent_Rc veth_signalevent(struct veth_lpar_connection *cnx, u16 subtype, HvLpEvent_AckInd ackind, HvLpEvent_AckType acktype, @@ -384,12 +392,12 @@ static void veth_handle_int(struct VethL } } - if (acked > 0) + if (acked > 0) { cnx->last_contact = jiffies; + veth_wake_queues(cnx); + } spin_unlock_irqrestore(&cnx->lock, flags); - - veth_flush_pending(cnx); break; case VethEventTypeFrames: veth_receive(cnx, event); @@ -485,7 +493,9 @@ static void veth_statemachine(void *p) for (i = 0; i < VETH_NUMBUFFERS; ++i) veth_recycle_msg(cnx, cnx->msgs + i); } + cnx->outstanding_tx = 0; + veth_wake_queues(cnx); /* Drop the lock so we can do stuff that might sleep or * take other locks. 
*/ @@ -494,8 +504,6 @@ static void veth_statemachine(void *p) del_timer_sync(&cnx->ack_timer); del_timer_sync(&cnx->reset_timer); - veth_flush_pending(cnx); - spin_lock_irq(&cnx->lock); if (cnx->state & VETH_STATE_RESET) @@ -852,8 +860,9 @@ static struct net_device * __init veth_p port = (struct veth_port *) dev->priv; - spin_lock_init(&port->pending_gate); + spin_lock_init(&port->queue_lock); rwlock_init(&port->mcast_gate); + port->stopped_map = 0; for (i = 0; i < HVMAXARCHITECTEDLPS; i++) { HvLpVirtualLanIndexMap map; @@ -969,6 +978,9 @@ static int veth_transmit_to_one(struct s cnx->last_contact = jiffies; cnx->outstanding_tx++; + if (veth_stack_is_empty(cnx)) + veth_stop_queues(cnx); + spin_unlock_irqrestore(&cnx->lock, flags); return 0; @@ -1012,7 +1024,6 @@ static int veth_start_xmit(struct sk_buf { unsigned char *frame = skb->data; struct veth_port *port = (struct veth_port *) dev->priv; - unsigned long flags; HvLpIndexMap lpmask; if (! (frame[0] & 0x01)) { @@ -1029,27 +1040,9 @@ static int veth_start_xmit(struct sk_buf lpmask = port->lpar_map; } - spin_lock_irqsave(&port->pending_gate, flags); - - lpmask = veth_transmit_to_many(skb, lpmask, dev); + veth_transmit_to_many(skb, lpmask, dev); - if (! 
lpmask) { - dev_kfree_skb(skb); - } else { - if (port->pending_skb) { - veth_error("%s: TX while skb was pending!\n", - dev->name); - dev_kfree_skb(skb); - spin_unlock_irqrestore(&port->pending_gate, flags); - return 1; - } - - port->pending_skb = skb; - port->pending_lpmask = lpmask; - netif_stop_queue(dev); - } - - spin_unlock_irqrestore(&port->pending_gate, flags); + dev_kfree_skb(skb); return 0; } @@ -1081,9 +1074,10 @@ static void veth_recycle_msg(struct veth } } -static void veth_flush_pending(struct veth_lpar_connection *cnx) +static void veth_wake_queues(struct veth_lpar_connection *cnx) { int i; + for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++) { struct net_device *dev = veth_dev[i]; struct veth_port *port; @@ -1097,19 +1091,45 @@ static void veth_flush_pending(struct ve if (! (port->lpar_map & (1<<cnx->remote_lp))) continue; - spin_lock_irqsave(&port->pending_gate, flags); - if (port->pending_skb) { - port->pending_lpmask = - veth_transmit_to_many(port->pending_skb, - port->pending_lpmask, - dev); - if (! port->pending_lpmask) { - dev_kfree_skb_any(port->pending_skb); - port->pending_skb = NULL; - netif_wake_queue(dev); - } + spin_lock_irqsave(&port->queue_lock, flags); + + port->stopped_map &= ~(1 << cnx->remote_lp); + + if (0 == port->stopped_map && netif_queue_stopped(dev)) { + veth_debug("cnx %d: woke queue for %s.\n", + cnx->remote_lp, dev->name); + netif_wake_queue(dev); } - spin_unlock_irqrestore(&port->pending_gate, flags); + spin_unlock_irqrestore(&port->queue_lock, flags); + } +} + +static void veth_stop_queues(struct veth_lpar_connection *cnx) +{ + int i; + + for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++) { + struct net_device *dev = veth_dev[i]; + struct veth_port *port; + + if (! dev) + continue; + + port = (struct veth_port *)dev->priv; + + /* If this cnx is not on the vlan for this port, continue */ + if (!
(port->lpar_map & (1 << cnx->remote_lp))) + continue; + + spin_lock(&port->queue_lock); + + netif_stop_queue(dev); + port->stopped_map |= (1 << cnx->remote_lp); + + veth_debug("cnx %d: stopped queue for %s, map = 0x%x.\n", + cnx->remote_lp, dev->name, port->stopped_map); + + spin_unlock(&port->queue_lock); } } From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 1/12] iseries_veth: Make error messages more user friendly, and add a debug macro In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.106943.128468321759.qpatch@concordia> Currently the iseries_veth driver prints the file name and line number in its error messages. This isn't very useful for most users, so just print "iseries_veth: message" instead. Also add a veth_debug() and veth_info() macro to replace the current veth_printk(). --- drivers/net/iseries_veth.c | 15 ++++++++++++--- 1 files changed, 12 insertions(+), 3 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -79,6 +79,8 @@ #include #include +#define DEBUG 1 + #include "iseries_veth.h" MODULE_AUTHOR("Kyle Lucke "); @@ -176,11 +178,18 @@ static void veth_timed_ack(unsigned long * Utility functions */ -#define veth_printk(prio, fmt, args...) \ - printk(prio "%s: " fmt, __FILE__, ## args) +#define veth_info(fmt, args...) \ + printk(KERN_INFO "iseries_veth: " fmt, ## args) #define veth_error(fmt, args...) \ - printk(KERN_ERR "(%s:%3.3d) ERROR: " fmt, __FILE__, __LINE__ , ## args) + printk(KERN_ERR "iseries_veth: Error: " fmt, ## args) + +#ifdef DEBUG +#define veth_debug(fmt, args...) \ + printk(KERN_DEBUG "iseries_veth: " fmt, ## args) +#else +#define veth_debug(fmt, args...) 
do {} while (0) +#endif static inline void veth_stack_push(struct veth_lpar_connection *cnx, struct veth_msg *msg) From michael at ellerman.id.au Thu Jun 30 20:20:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:39 +1000 Subject: [PATCH 9/12] iseries_veth: Use ref counts to track lifecycle of connection structs In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126839.794259.894526862881.qpatch@concordia> The iseries_veth driver can attach to multiple vlans, which correspond to multiple net devices. However there is only 1 connection between each LPAR, so the connection structure may be shared by multiple net devices. This makes module removal messy, because we can't deallocate the connections until we know there are no net devices still using them. The solution is to use ref counts on the connections, so we can delete them (actually stop) as soon as the ref count hits zero. This patch fixes (part of) a bug we were seeing with IPv6 sending probes to a dead LPAR, which would then hang us forever due to leftover skbs. This patch has the (minor?) side effect that we only start negotiating a connection with LPARs which are on one of our vlans. The previous behaviour was to start negotiation with all LPARs unconditionally; will have to think about that one.
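The reference-counting scheme described above can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the kernel's kref: struct sk_ref, struct sk_cnx and the sk_* functions are hypothetical names, and the counter here is a plain int with no atomicity, so it only illustrates the lifecycle. One reference is held by the driver, each net device on the vlan takes another, and the stop routine runs when the last one is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, non-atomic stand-in for a reference counter. */
struct sk_ref {
	int count;
};

/* Hypothetical stand-in for the per-LPAR connection structure. */
struct sk_cnx {
	int stopped;
	struct sk_ref refcount;
};

/* The initial reference belongs to the driver infrastructure. */
static void sk_ref_init(struct sk_ref *r)
{
	r->count = 1;
}

static void sk_ref_get(struct sk_ref *r)
{
	assert(r->count > 0);	/* never revive a released object */
	r->count++;
}

/* Drop one reference; call release() and return 1 if it was the last. */
static int sk_ref_put(struct sk_ref *r, void (*release)(struct sk_ref *))
{
	assert(r->count > 0);

	if (--r->count == 0) {
		release(r);
		return 1;
	}
	return 0;
}

/* Release callback: recover the enclosing connection and stop it. */
static void sk_stop_connection(struct sk_ref *r)
{
	struct sk_cnx *cnx =
		(struct sk_cnx *)((char *)r - offsetof(struct sk_cnx, refcount));

	cnx->stopped = 1;
}
```

With two net devices sharing the connection, the count goes 1, 2, 3. Dropping the driver's reference and one device's reference leaves the connection running; only the final put invokes sk_stop_connection().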
--- drivers/net/iseries_veth.c | 89 ++++++++++++++++++++++++++++++--------------- 1 files changed, 61 insertions(+), 28 deletions(-) Index: veth-dev/drivers/net/iseries_veth.c =================================================================== --- veth-dev.orig/drivers/net/iseries_veth.c +++ veth-dev/drivers/net/iseries_veth.c @@ -129,6 +129,7 @@ struct veth_lpar_connection { int num_events; struct VethCapData local_caps; + struct kref refcount; struct timer_list ack_timer; spinlock_t lock; @@ -620,6 +621,10 @@ static int veth_init_connection(u8 rlp) return -ENOMEM; memset(cnx, 0, sizeof(*cnx)); + /* This gets us 1 reference, which is held on behalf of the driver + * infrastructure. It's released at module unload. */ + kref_init(&cnx->refcount); + cnx->remote_lp = rlp; spin_lock_init(&cnx->lock); INIT_WORK(&cnx->statemachine_wq, veth_statemachine, cnx); @@ -658,12 +663,10 @@ static int veth_init_connection(u8 rlp) return 0; } -static void veth_stop_connection(u8 rlp) +static void veth_stop_connection(struct kref *ref) { - struct veth_lpar_connection *cnx = veth_cnx[rlp]; - - if (! cnx) - return; + struct veth_lpar_connection *cnx; + cnx = container_of(ref, struct veth_lpar_connection, refcount); spin_lock_irq(&cnx->lock); cnx->state |= VETH_STATE_RESET | VETH_STATE_SHUTDOWN; @@ -1352,15 +1355,31 @@ static void veth_timed_ack(unsigned long static int veth_remove(struct vio_dev *vdev) { - int i = vdev->unit_address; + struct veth_lpar_connection *cnx; struct net_device *dev; + struct veth_port *port; + int i; - dev = veth_dev[i]; - if (dev != NULL) { - veth_dev[i] = NULL; - unregister_netdev(dev); - free_netdev(dev); + dev = veth_dev[vdev->unit_address]; + + if (! 
dev) + return 0; + + port = netdev_priv(dev); + + for (i = 0; i < HVMAXARCHITECTEDLPS; i++) { + cnx = veth_cnx[i]; + + if (cnx && (port->lpar_map & (1 << i))) { + /* Drop our reference to connections on our VLAN */ + kref_put(&cnx->refcount, veth_stop_connection); + } } + + veth_dev[vdev->unit_address] = NULL; + unregister_netdev(dev); + free_netdev(dev); + return 0; } @@ -1368,6 +1387,7 @@ static int veth_probe(struct vio_dev *vd { int i = vdev->unit_address; struct net_device *dev; + struct veth_port *port; dev = veth_probe_one(i, &vdev->dev); if (dev == NULL) { @@ -1376,11 +1396,19 @@ static int veth_probe(struct vio_dev *vd } veth_dev[i] = dev; - /* Start the state machine on each connection, to commence - * link negotiation */ - for (i = 0; i < HVMAXARCHITECTEDLPS; i++) - if (veth_cnx[i]) + port = (struct veth_port*)netdev_priv(dev); + + /* Start the state machine on each connection on this vlan. If we're + * the first dev to do so this will commence link negotiation */ + for (i = 0; i < HVMAXARCHITECTEDLPS; i++) { + if (! (port->lpar_map & (1 << i))) + continue; + + if (veth_cnx[i]) { + kref_get(&(veth_cnx[i]->refcount)); veth_kick_statemachine(veth_cnx[i]); + } + } return 0; } @@ -1409,26 +1437,31 @@ static struct vio_driver veth_driver = { void __exit veth_module_cleanup(void) { int i; + struct veth_lpar_connection *cnx; - /* Stop the queues first to stop any new packets being sent. */ - for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++) - if (veth_dev[i]) - netif_stop_queue(veth_dev[i]); + /* Drop the driver's references to the connections. */ + for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) { + cnx = veth_cnx[i]; - /* Stop the connections before we unregister the driver. This - * ensures there's no skbs lying around holding the device open. 
*/ - for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) - veth_stop_connection(i); + if (cnx) { + kref_put(&cnx->refcount, veth_stop_connection); + } + } - HvLpEvent_unregisterHandler(HvLpEvent_Type_VirtualLan); + /* Unregister the driver, which will close all the netdevs and stop + * the connections when they're no longer referenced. */ + vio_unregister_driver(&veth_driver); - /* Hypervisor callbacks may have scheduled more work while we - * were stoping connections. Now that we've disconnected from - * the hypervisor make sure everything's finished. */ + /* Make sure each connection's state machine has run to completion. */ flush_scheduled_work(); - vio_unregister_driver(&veth_driver); + /* Disconnect our "irq" to stop events coming from the Hypervisor. */ + HvLpEvent_unregisterHandler(HvLpEvent_Type_VirtualLan); + + /* Make sure any work queued from Hypervisor callbacks is finished. */ + flush_scheduled_work(); + /* Deallocate everything. */ for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) veth_destroy_connection(i); From michael at ellerman.id.au Thu Jun 30 20:20:40 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 30 Jun 2005 20:20:40 +1000 Subject: [PATCH 11/12] iseries_veth: Add a per-connection ack timer In-Reply-To: <200506302016.55125.michael@ellerman.id.au> Message-ID: <1120126840.39112.35278125306.qpatch@concordia> Currently the iseries_veth driver contravenes the specification in Documentation/networking/driver.txt, in that if packets are not acked by the other LPAR they will sit around forever. This patch adds a per-connection timer which fires if we've had no acks for five seconds. This is superior to the generic TX timer because it catches the case of a small number of packets being sent and never acked. 
---

 drivers/net/iseries_veth.c |   75 +++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 69 insertions(+), 6 deletions(-)

Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -132,6 +132,11 @@ struct veth_lpar_connection {
 	struct kref refcount;
 	struct timer_list ack_timer;
 
+	struct timer_list reset_timer;
+	unsigned int reset_timeout;
+	unsigned long last_contact;
+	int outstanding_tx;
+
 	spinlock_t lock;
 	unsigned long state;
 	HvLpInstanceId src_inst;
@@ -171,7 +176,8 @@ static int veth_start_xmit(struct sk_buf
 static void veth_recycle_msg(struct veth_lpar_connection *, struct veth_msg *);
 static void veth_flush_pending(struct veth_lpar_connection *cnx);
 static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *);
-static void veth_timed_ack(unsigned long connectionPtr);
+static void veth_timed_ack(unsigned long ptr);
+static void veth_timed_reset(unsigned long ptr);
 /*
  * Utility functions
  */
@@ -353,7 +359,7 @@ static void veth_handle_int(struct VethL
 	HvLpIndex rlp = event->base_event.xSourceLp;
 	struct veth_lpar_connection *cnx = veth_cnx[rlp];
 	unsigned long flags;
-	int i;
+	int i, acked = 0;
 
 	BUG_ON(! cnx);
 
@@ -367,13 +373,22 @@ static void veth_handle_int(struct VethL
 		break;
 	case VethEventTypeFramesAck:
 		spin_lock_irqsave(&cnx->lock, flags);
+
 		for (i = 0; i < VETH_MAX_ACKS_PER_MSG; ++i) {
 			u16 msgnum = event->u.frames_ack_data.token[i];
 
-			if (msgnum < VETH_NUMBUFFERS)
+			if (msgnum < VETH_NUMBUFFERS) {
 				veth_recycle_msg(cnx, cnx->msgs + msgnum);
+				cnx->outstanding_tx--;
+				acked++;
+			}
 		}
+
+		if (acked > 0)
+			cnx->last_contact = jiffies;
+
 		spin_unlock_irqrestore(&cnx->lock, flags);
+
 		veth_flush_pending(cnx);
 		break;
 	case VethEventTypeFrames:
@@ -447,8 +462,6 @@ static void veth_statemachine(void *p)
 
 restart:
 	if (cnx->state & VETH_STATE_RESET) {
-		int i;
-
 		if (cnx->state & VETH_STATE_OPEN)
 			HvCallEvent_closeLpEventPath(cnx->remote_lp,
 						     HvLpEvent_Type_VirtualLan);
@@ -467,15 +480,20 @@ static void veth_statemachine(void *p)
 				| VETH_STATE_SENTCAPACK | VETH_STATE_READY);
 
 		/* Clean up any leftover messages */
-		if (cnx->msgs)
+		if (cnx->msgs) {
+			int i;
 			for (i = 0; i < VETH_NUMBUFFERS; ++i)
 				veth_recycle_msg(cnx, cnx->msgs + i);
+		}
+		cnx->outstanding_tx = 0;
 
 		/* Drop the lock so we can do stuff that might sleep or
 		 * take other locks. */
 		spin_unlock_irq(&cnx->lock);
 
 		del_timer_sync(&cnx->ack_timer);
+		del_timer_sync(&cnx->reset_timer);
+
 		veth_flush_pending(cnx);
 
 		spin_lock_irq(&cnx->lock);
@@ -628,9 +646,16 @@ static int veth_init_connection(u8 rlp)
 	cnx->remote_lp = rlp;
 	spin_lock_init(&cnx->lock);
 	INIT_WORK(&cnx->statemachine_wq, veth_statemachine, cnx);
+
 	init_timer(&cnx->ack_timer);
 	cnx->ack_timer.function = veth_timed_ack;
 	cnx->ack_timer.data = (unsigned long) cnx;
+
+	init_timer(&cnx->reset_timer);
+	cnx->reset_timer.function = veth_timed_reset;
+	cnx->reset_timer.data = (unsigned long) cnx;
+	cnx->reset_timeout = 5 * HZ * (VETH_ACKTIMEOUT / 1000000);
+
 	memset(&cnx->pending_acks, 0xff, sizeof (cnx->pending_acks));
 
 	veth_cnx[rlp] = cnx;
@@ -937,6 +962,13 @@ static int veth_transmit_to_one(struct s
 	if (rc != HvLpEvent_Rc_Good)
 		goto recycle_and_drop;
 
+	/* If the timer's not already running, start it now. */
+	if (0 == cnx->outstanding_tx)
+		mod_timer(&cnx->reset_timer, jiffies + cnx->reset_timeout);
+
+	cnx->last_contact = jiffies;
+	cnx->outstanding_tx++;
+
 	spin_unlock_irqrestore(&cnx->lock, flags);
 	return 0;
 
@@ -1081,6 +1113,37 @@ static void veth_flush_pending(struct ve
 	}
 }
 
+static void veth_timed_reset(unsigned long ptr)
+{
+	struct veth_lpar_connection *cnx = (struct veth_lpar_connection *)ptr;
+	unsigned long trigger_time, flags;
+
+	/* FIXME is it possible this fires after veth_stop_connection()?
+	 * That would reschedule the statemachine for 5 seconds and probably
+	 * execute it after the module's been unloaded. Hmm. */
+
+	spin_lock_irqsave(&cnx->lock, flags);
+
+	if (cnx->outstanding_tx > 0) {
+		trigger_time = cnx->last_contact + cnx->reset_timeout;
+
+		if (trigger_time < jiffies) {
+			cnx->state |= VETH_STATE_RESET;
+			veth_kick_statemachine(cnx);
+			veth_error("%d packets not acked by LPAR %d within %d "
+					"seconds, resetting.\n",
+					cnx->outstanding_tx, cnx->remote_lp,
+					cnx->reset_timeout / HZ);
+		} else {
+			/* Reschedule the timer */
+			trigger_time = jiffies + cnx->reset_timeout;
+			mod_timer(&cnx->reset_timer, trigger_time);
+		}
+	}
+
+	spin_unlock_irqrestore(&cnx->lock, flags);
+}
+
 /*
  * Rx path
  */
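
A note on the expiry check in veth_timed_reset() above: the patch compares
"trigger_time < jiffies" directly, which can misfire once around the point
where the jiffies counter wraps. The kernel provides time_after() in
<linux/jiffies.h> for this kind of comparison. Below is a minimal user-space
sketch of the same decision logic, not the driver's code: ticks_t,
ticks_after(), ack_timer_decide() and the ACK_* names are hypothetical and
exist only for illustration, and the signed-difference trick assumes the
usual two's-complement conversion from unsigned long to long.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the type of the kernel's jiffies counter. */
typedef unsigned long ticks_t;

/* Hypothetical names for the three outcomes of a timer expiry. */
enum ack_action {
	ACK_IDLE,	/* nothing outstanding, let the timer lapse */
	ACK_REARM,	/* deadline not reached yet, re-arm the timer */
	ACK_RESET	/* deadline passed with unacked packets, reset */
};

/*
 * True if tick a is later than tick b, even when the counter has wrapped
 * between them. Same idea as the kernel's time_after(): take the unsigned
 * difference and look at its sign. Assumes two's-complement conversion.
 */
bool ticks_after(ticks_t a, ticks_t b)
{
	return (long)(b - a) < 0;
}

/*
 * Mirror of the branch structure in veth_timed_reset(), with the
 * wraparound-safe comparison swapped in for "trigger_time < jiffies".
 */
enum ack_action ack_timer_decide(int outstanding_tx, ticks_t last_contact,
				 ticks_t timeout, ticks_t now)
{
	assert(timeout > 0);

	if (outstanding_tx <= 0)
		return ACK_IDLE;

	if (ticks_after(now, last_contact + timeout))
		return ACK_RESET;

	return ACK_REARM;
}
```

With last_contact ten ticks before the counter wraps and a timeout of 500,
the deadline lands at tick 489 after the wrap. A plain "<" comparison taken
five ticks before the wrap would see 489 as already in the past and reset
the connection early; the signed-difference form re-arms the timer instead.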