From nathanl at austin.ibm.com Tue Nov 2 13:47:22 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Mon, 01 Nov 2004 20:47:22 -0600 Subject: [patch] mmu_context_init needs to run earlier Message-ID: <1099363642.22996.345.camel@pants.austin.ibm.com> Hi- I am seeing "kernel BUG in mmu_context_init at arch/ppc64/mm/init.c:528" in latest 2.6 bk kernels. It looks as if arch_initcall is not early enough for mmu_context_init -- I inserted printk's in that function and init_new_context, and indeed, init_new_context is being called before mmu_context_init. Not sure this is the best fix, or that this completely eliminates the races, but I didn't see any other obvious solution. Boot-tested on a p630. Signed-off-by: Nathan Lynch --- diff -puN arch/ppc64/mm/init.c~ppc64-make-mmu_context_init-core_initcall arch/ppc64/mm/init.c --- linux-2.6.10-rc1-bk11/arch/ppc64/mm/init.c~ppc64-make-mmu_context_init-core_initcall 2004-11-01 19:51:46.000000000 -0600 +++ linux-2.6.10-rc1-bk11-nathanl/arch/ppc64/mm/init.c 2004-11-01 19:53:24.000000000 -0600 @@ -529,7 +529,7 @@ static int __init mmu_context_init(void) return 0; } -arch_initcall(mmu_context_init); +core_initcall(mmu_context_init); /* * Do very early mm setup. _ From benh at kernel.crashing.org Tue Nov 2 15:47:23 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 02 Nov 2004 15:47:23 +1100 Subject: [patch] mmu_context_init needs to run earlier In-Reply-To: <1099363642.22996.345.camel@pants.austin.ibm.com> References: <1099363642.22996.345.camel@pants.austin.ibm.com> Message-ID: <1099370843.29689.448.camel@gaston> On Mon, 2004-11-01 at 20:47 -0600, Nathan Lynch wrote: > Hi- > > I am seeing "kernel BUG in mmu_context_init at arch/ppc64/mm/init.c:528" > in latest 2.6 bk kernels. It looks as if arch_initcall is not early > enough for mmu_context_init -- I inserted printk's in that function and > init_new_context, and indeed, init_new_context is being called before > mmu_context_init. 
> > Not sure this is the best fix, or that this completely eliminates the > races, but I didn't see any other obvious solution. Boot-tested on a > p630. Do you have a backtrace of who is trying to get a context that early ? If it's some call of usermode helpers, I doubt it's very sane to do that before the arch initcalls have run ! It would be interesting to know who is triggering it. Ben. From nathanl at austin.ibm.com Tue Nov 2 16:59:19 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Mon, 01 Nov 2004 23:59:19 -0600 Subject: [patch] mmu_context_init needs to run earlier In-Reply-To: <1099370843.29689.448.camel@gaston> References: <1099363642.22996.345.camel@pants.austin.ibm.com> <1099370843.29689.448.camel@gaston> Message-ID: <1099375159.9590.4.camel@localhost.localdomain> On Tue, 2004-11-02 at 15:47 +1100, Benjamin Herrenschmidt wrote: > Do you have a backtrace of who is trying to get a context that > early ? > > If it's some call of usermode helpers, I doubt it's very sane to do that > before the arch initcalls have run ! It would be interesting to know who > is triggering it. > Sure, I inserted a WARN_ON in init_new_context. I see several of these before hitting the BUG_ON in mmu_context_init. I'm assuming these are from the driver core trying to run /sbin/hotplug. 
Badness in init_new_context at arch/ppc64/mm/init.c:483 Call Trace: [c00000000ff7fa00] [c00000000ff7faa0] 0xc00000000ff7faa0 (unreliable) [c00000000ff7faa0] [c0000000000c96dc] .do_execve+0xdc/0x2ac [c00000000ff7fb60] [c000000000016790] .sys_execve+0x7c/0x104 [c00000000ff7fc00] [c000000000011b80] syscall_exit+0x0/0x18 --- Exception: c01 at .____call_usermodehelper+0xcc/0xf8 LR = .____call_usermodehelper+0x9c/0xf8 Nathan From benh at kernel.crashing.org Tue Nov 2 17:05:11 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 02 Nov 2004 17:05:11 +1100 Subject: [patch] mmu_context_init needs to run earlier In-Reply-To: <1099375159.9590.4.camel@localhost.localdomain> References: <1099363642.22996.345.camel@pants.austin.ibm.com> <1099370843.29689.448.camel@gaston> <1099375159.9590.4.camel@localhost.localdomain> Message-ID: <1099375511.29693.463.camel@gaston> On Mon, 2004-11-01 at 23:59 -0600, Nathan Lynch wrote: > On Tue, 2004-11-02 at 15:47 +1100, Benjamin Herrenschmidt wrote: > > Do you have a backtrace of who is trying to get a context that > > early ? > > > > If it's some call of usermode helpers, I doubt it's very sane to do that > > before the arch initcalls have run ! It would be interesting to know who > > is triggering it. > > > > Sure, I inserted a WARN_ON in init_new_context. I see several of these > before hitting the BUG_ON in mmu_context_init. I'm assuming these are > from the driver core trying to run /sbin/hotplug. Yah. It would be interesting to find out who is triggering those calls (what drivers are probed that early during boot). It doesn't happen on my g5 for some reason. Ben. 
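For readers following the ordering argument in this thread: the kernel runs initcalls level by level, with core_initcall before postcore_initcall before arch_initcall, so anything that runs at an earlier level cannot rely on a function registered with arch_initcall(). Below is a minimal userspace sketch of that ordering, not kernel code. The names `toy_initcall`, `run_initcalls`, `early_user`, `simulate_boot` and the flag `context_ready` are hypothetical, and placing the early caller at the postcore level is an assumption made only for illustration; the thread does not establish where the hotplug exec actually originates.

```c
#include <stddef.h>

/* Initcall levels in the order the 2.6 kernel runs them (subset). */
enum level { LEVEL_CORE = 1, LEVEL_POSTCORE = 2, LEVEL_ARCH = 3 };

struct toy_initcall {
	enum level level;
	int (*fn)(void);
};

static int context_ready;   /* set once the toy mmu_context_init() has run */
static int early_failures;  /* counts users that ran before it was ready */

/* Stand-in for mmu_context_init(): makes contexts usable. */
static int toy_mmu_context_init(void)
{
	context_ready = 1;
	return 0;
}

/* Hypothetical early user, standing in for an exec of /sbin/hotplug. */
static int early_user(void)
{
	if (!context_ready)
		early_failures++;
	return 0;
}

/* Run every call of level 1, then level 2, then level 3. */
static void run_initcalls(const struct toy_initcall *calls, size_t n)
{
	for (int lvl = LEVEL_CORE; lvl <= LEVEL_ARCH; lvl++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == (enum level)lvl)
				calls[i].fn();
}

/* Returns how many early users ran before the context was initialised. */
int simulate_boot(enum level mmu_level)
{
	const struct toy_initcall calls[] = {
		{ mmu_level, toy_mmu_context_init },
		{ LEVEL_POSTCORE, early_user },
	};

	context_ready = 0;
	early_failures = 0;
	run_initcalls(calls, sizeof(calls) / sizeof(calls[0]));
	return early_failures;
}
```

In this toy model, simulate_boot(LEVEL_ARCH) counts one early user and simulate_boot(LEVEL_CORE) counts none, which is the effect the one-line patch aims for. A caller that itself runs at the core level, or before any initcall at all, would not be covered by the change, which matches the caveat in the original mail about not fully eliminating the race.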
From nathanl at austin.ibm.com Wed Nov 3 08:55:33 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Tue, 02 Nov 2004 15:55:33 -0600 Subject: [patch] mmu_context_init needs to run earlier In-Reply-To: <1099375511.29693.463.camel@gaston> References: <1099363642.22996.345.camel@pants.austin.ibm.com> <1099370843.29689.448.camel@gaston> <1099375159.9590.4.camel@localhost.localdomain> <1099375511.29693.463.camel@gaston> Message-ID: <1099432532.23845.90.camel@pants.austin.ibm.com> On Tue, 2004-11-02 at 00:05, Benjamin Herrenschmidt wrote: > On Mon, 2004-11-01 at 23:59 -0600, Nathan Lynch wrote: > > On Tue, 2004-11-02 at 15:47 +1100, Benjamin Herrenschmidt wrote: > > > Do you have a backtrace of who is trying to get a context that > > > early ? > > > > > > If it's some call of usermode helpers, I doubt it's very sane to do that > > > before the arch initcalls have run ! It would be interesting to know who > > > is triggering it. > > > > > > > Sure, I inserted a WARN_ON in init_new_context. I see several of these > > before hitting the BUG_ON in mmu_context_init. I'm assuming these are > > from the driver core trying to run /sbin/hotplug. > > Yah. It would be interesting to find out who is triggering those calls > (what drivers are probed that early during boot). It doesn't happen on > my g5 for some reason. Ok, here's a boot log with kobject debugging turned on. I can't interpret all of this but I believe a couple of them are due to sysdev_class_register for cpus and nodes. I don't know why I'm the only person running into this -- maybe it's something to do with my turning on every possible debug option ;) Regardless, I've got a better patch (I think) for the mmu context thing on the way. checking if image is initramfs... it is Freeing initrd memory: 1939k freed subsystem devices: registering kobject devices: registering. parent: , set: subsystem bus: registering kobject bus: registering. parent: , set: subsystem class: registering kobject class: registering. 
parent: , set: subsystem firmware: registering kobject firmware: registering. parent: , set: kobject platform: registering. parent: , set: devices subsystem platform: registering kobject platform: registering. parent: , set: bus kobject_hotplug fill_kobj_path: path = '/bus/platform' kobject_hotplug: /sbin/hotplug bus seq=1 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/platform SUBSYSTEM=bus kobject_hotplug - call_usermodehelper returned -1 kobject devices: registering. parent: platform, set: kobject_hotplug fill_kobj_path: path = '/bus/platform/devices' kobject_hotplug: /sbin/hotplug bus seq=2 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/platform/devices SUBSYSTEM=bus kobject_hotplug - call_usermodehelper returned -1 kobject drivers: registering. parent: platform, set: kobject_hotplug fill_kobj_path: path = '/bus/platform/drivers' kobject_hotplug: /sbin/hotplug bus seq=3 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/platform/drivers SUBSYSTEM=bus kobject_hotplug - call_usermodehelper returned -1 subsystem system: registering kobject system: registering. parent: devices, set: kobject cpu: registering. parent: , set: system kobject_hotplug fill_kobj_path: path = '/devices/system/cpu' kobject_hotplug: /sbin/hotplug system seq=4 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/devices/system/cpu SUBSYSTEM=system kobject_hotplug - call_usermodehelper returned -1 subsystem kernel: registering kobject kernel: registering. parent: , set: NET: Registered protocol family 16 subsystem of_platform: registering kobject of_platform: registering. parent: , set: bus kobject_hotplug fill_kobj_path: path = '/bus/of_platform' kobject_hotplug: /sbin/hotplug bus seq=5 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/of_platform SUBSYSTEM=bus kobject devices: registering. 
parent: of_platform, set: kobject_hotplug fill_kobj_path: path = '/bus/of_platform/devices' kobject_hotplug: /sbin/hotplug bus seq=6 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/of_platform/devices SUBSYSTEM=bus kobject drivers: registering. parent: of_platform, set: kobject_hotplug fill_kobj_path: path = '/bus/of_platform/drivers' kobject_hotplug: /sbin/hotplug bus seq=7 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/of_platform/drivers SUBSYSTEM=bus subsystem pci_bus: registering kobject pci_bus: registering. parent: , set: class kobject_hotplug fill_kobj_path: path = '/class/pci_bus' kobject_hotplug: /sbin/hotplug class seq=8 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/class/pci_bus SUBSYSTEM=class subsystem pci: registering kobject pci: registering. parent: , set: bus kobject_hotplug fill_kobj_path: path = '/bus/pci' kobject_hotplug: /sbin/hotplug bus seq=9 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/pci SUBSYSTEM=bus kobject devices: registering. parent: pci, set: kobject_hotplug fill_kobj_path: path = '/bus/pci/devices' kobject_hotplug: /sbin/hotplug bus seq=10 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/pci/devices SUBSYSTEM=bus kobject drivers: registering. parent: pci, set: kobject_hotplug fill_kobj_path: path = '/bus/pci/drivers' kobject_hotplug: /sbin/hotplug bus seq=11 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/bus/pci/drivers SUBSYSTEM=bus subsystem tty: registering kobject tty: registering. parent: , set: class kobject_hotplug fill_kobj_path: path = '/class/tty' kobject_hotplug: /sbin/hotplug class seq=12 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/class/tty SUBSYSTEM=class kobject node: registering. 
parent: , set: system kobject_hotplug fill_kobj_path: path = '/devices/system/node' kobject_hotplug: /sbin/hotplug system seq=13 HOME=/ PATH=/sbin:/bin:/usr/sbin:/usr/bin ACTION=add DEVPATH=/devices/system/node SUBSYSTEM=system kernel BUG in mmu_context_init at arch/ppc64/mm/init.c:528! cpu 0x1: Vector: 700 (Program Check) at [c0000000047cbb60] pc: c000000000439fa4: .mmu_context_init+0x4c/0x68 lr: c000000000439f8c: .mmu_context_init+0x34/0x68 sp: c0000000047cbde0 msr: 9000000000029032 current = 0xc0000001fe78d7f0 paca = 0xc0000000004cdd00 pid = 1, comm = swapper enter ? for help 1:mon> From benh at kernel.crashing.org Wed Nov 3 09:22:43 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 03 Nov 2004 09:22:43 +1100 Subject: [patch] mmu_context_init needs to run earlier In-Reply-To: <1099432532.23845.90.camel@pants.austin.ibm.com> References: <1099363642.22996.345.camel@pants.austin.ibm.com> <1099370843.29689.448.camel@gaston> <1099375159.9590.4.camel@localhost.localdomain> <1099375511.29693.463.camel@gaston> <1099432532.23845.90.camel@pants.austin.ibm.com> Message-ID: <1099434163.20294.15.camel@gaston> > Ok, here's a boot log with kobject debugging turned on. I can't > interpret all of this but I believe a couple of them are due to > sysdev_class_register for cpus and nodes. > > I don't know why I'm the only person running into this -- maybe it's > something to do with my turning on every possible debug option ;) > > Regardless, I've got a better patch (I think) for the mmu context thing > on the way. Ok, it's all of the platform stuff etc... I suppose you run into that because you actually have an initramfs with an /sbin/hotplug in it, do you ? Some other ppl experienced it, you aren't the only one ;) Ben. 
From nathanl at austin.ibm.com Wed Nov 3 09:46:33 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Tue, 02 Nov 2004 16:46:33 -0600 Subject: [patch] mmu_context_init needs to run earlier In-Reply-To: <1099434163.20294.15.camel@gaston> References: <1099363642.22996.345.camel@pants.austin.ibm.com> <1099370843.29689.448.camel@gaston> <1099375159.9590.4.camel@localhost.localdomain> <1099375511.29693.463.camel@gaston> <1099432532.23845.90.camel@pants.austin.ibm.com> <1099434163.20294.15.camel@gaston> Message-ID: <1099435593.23845.96.camel@pants.austin.ibm.com> On Tue, 2004-11-02 at 16:22, Benjamin Herrenschmidt wrote: > > Ok, here's a boot log with kobject debugging turned on. I can't > > interpret all of this but I believe a couple of them are due to > > sysdev_class_register for cpus and nodes. > > > > I don't know why I'm the only person running into this -- maybe it's > > something to do with my turning on every possible debug option ;) > > > > Regardless, I've got a better patch (I think) for the mmu context thing > > on the way. > > Ok, it's all of the platform stuff etc... I suppose you run into that > because you actually have an initramfs with an /sbin/hotplug in it, do > you ? Some other ppl experienced it, you aren't the only one ;) Right, I'm using initrd. It makes sense now, thanks. 
Nathan From linas at austin.ibm.com Wed Nov 3 10:06:18 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 2 Nov 2004 17:06:18 -0600 Subject: [PATCH] iommu fixes, round 3 In-Reply-To: <1098998916.692.20.camel@sinatra.austin.ibm.com> References: <1098775712.6897.17.camel@gaston> <1098808895.32293.23.camel@sinatra.austin.ibm.com> <1098813781.32293.40.camel@sinatra.austin.ibm.com> <16768.10849.741580.850491@cargo.ozlabs.ibm.com> <1098998916.692.20.camel@sinatra.austin.ibm.com> Message-ID: <20041102230618.GQ10026@austin.ibm.com> On Thu, Oct 28, 2004 at 04:28:36PM -0500, John Rose was heard to remark: > This patch changes the following iommu-related things: > > - Renames the [i,p]series versions of iommu_devnode_init(), to keep things > logically separate where possible. > > - Moves iommu_free_table() to generic iommu.c > > - Creates of_cleanup_node(), which will directly precede the dynamic removal of > any device node > > Comments welcome. FYI, without this patch, I get BUG_ON crashes when I hotplug-remove a PCI card, on the nov. 1 2.6.10-rc1 kernel. The BUG_ON is in free_pages, called from iommu_free_table() called from of_remove_node() With this patch, things get back to normal. Please forward & apply. 
--linas

From benh at kernel.crashing.org Wed Nov 3 15:18:51 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 03 Nov 2004 15:18:51 +1100
Subject: [PATCH] iommu fixes, round 3
In-Reply-To: <1098998916.692.20.camel@sinatra.austin.ibm.com>
References: <1098775712.6897.17.camel@gaston> <1098808895.32293.23.camel@sinatra.austin.ibm.com> <1098813781.32293.40.camel@sinatra.austin.ibm.com> <16768.10849.741580.850491@cargo.ozlabs.ibm.com> <1098998916.692.20.camel@sinatra.austin.ibm.com>
Message-ID: <1099455531.31630.35.camel@gaston>

On Thu, 2004-10-28 at 16:28 -0500, John Rose wrote:
> This patch changes the following iommu-related things:
>
> - Renames the [i,p]series versions of iommu_devnode_init(), to keep things
> logically separate where possible.
>
> - Moves iommu_free_table() to generic iommu.c
>
> - Creates of_cleanup_node(), which will directly precede the dynamic removal of
> any device node

Hrm... one thing I'm still annoyed with is that you are still calling of_cleanup_node() from within of_remove_node(). That call should be moved to the caller.

Ben.

From sfr at canb.auug.org.au Wed Nov 3 18:21:28 2004
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Wed, 3 Nov 2004 18:21:28 +1100
Subject: [PATCH] PPC64 iSeries iommu cleanups
Message-ID: <20041103182128.6d1a7d3a.sfr@canb.auug.org.au>

Hi Andrew,

This patch just does some cleanups of iSeries_iommu.c:
	remove lots of unneeded includes
	use list_for_each_entry
	white space formatting
No semantic changes.

Signed-off-by: Stephen Rothwell

Please apply and send to Linus.
-- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk/arch/ppc64/kernel/iSeries_iommu.c linus-bk-iommu.1/arch/ppc64/kernel/iSeries_iommu.c --- linus-bk/arch/ppc64/kernel/iSeries_iommu.c 2004-04-13 09:25:09.000000000 +1000 +++ linus-bk-iommu.1/arch/ppc64/kernel/iSeries_iommu.c 2004-11-02 18:24:31.000000000 +1100 @@ -25,30 +25,14 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#include -#include #include -#include -#include -#include -#include -#include #include -#include -#include -#include -#include +#include -#include -#include #include -#include -#include - #include - -#include "pci.h" - +#include +#include extern struct list_head iSeries_Global_Device_List; @@ -76,12 +60,11 @@ tce.te_bits.tb_pciwr = 1; } - rc = HvCallXm_setTce((u64)tbl->it_index, - (u64)index, - tce.te_word); + rc = HvCallXm_setTce((u64)tbl->it_index, (u64)index, + tce.te_word); if (rc) - panic("PCI_DMA: HvCallXm_setTce failed, Rc: 0x%lx\n", rc); - + panic("PCI_DMA: HvCallXm_setTce failed, Rc: 0x%lx\n", + rc); index++; uaddr += PAGE_SIZE; } @@ -90,20 +73,14 @@ static void tce_free_iSeries(struct iommu_table *tbl, long index, long npages) { u64 rc; - union tce_entry tce; while (npages--) { - tce.te_word = 0; - rc = HvCallXm_setTce((u64)tbl->it_index, - (u64)index, - tce.te_word); - + rc = HvCallXm_setTce((u64)tbl->it_index, (u64)index, 0); if (rc) - panic("PCI_DMA: HvCallXm_setTce failed, Rc: 0x%lx\n", rc); - + panic("PCI_DMA: HvCallXm_setTce failed, Rc: 0x%lx\n", + rc); index++; } - } @@ -115,17 +92,14 @@ { struct iSeries_Device_Node *dp; - for (dp = (struct iSeries_Device_Node *)iSeries_Global_Device_List.next; - dp != (struct iSeries_Device_Node *)&iSeries_Global_Device_List; - dp = (struct iSeries_Device_Node *)dp->Device_List.next) - if (dp->iommu_table != NULL && - dp->iommu_table->it_type == TCE_PCI && - dp->iommu_table->it_offset == tbl->it_offset && - dp->iommu_table->it_index == tbl->it_index && - 
dp->iommu_table->it_size == tbl->it_size) + list_for_each_entry(dp, &iSeries_Global_Device_List, Device_List) { + if ((dp->iommu_table != NULL) && + (dp->iommu_table->it_type == TCE_PCI) && + (dp->iommu_table->it_offset == tbl->it_offset) && + (dp->iommu_table->it_index == tbl->it_index) && + (dp->iommu_table->it_size == tbl->it_size)) return dp->iommu_table; - - + } return NULL; } @@ -143,15 +117,14 @@ { struct iommu_table_cb *parms; - parms = (struct iommu_table_cb*)kmalloc(sizeof(*parms), GFP_KERNEL); - + parms = kmalloc(sizeof(*parms), GFP_KERNEL); if (parms == NULL) panic("PCI_DMA: TCE Table Allocation failed."); memset(parms, 0, sizeof(*parms)); - parms->itc_busno = ISERIES_BUS(dn); - parms->itc_slotno = dn->LogicalSlot; + parms->itc_busno = ISERIES_BUS(dn); + parms->itc_slotno = dn->LogicalSlot; parms->itc_virtbus = 0; HvCallXm_getTceTableParms(ISERIES_HV_ADDR(parms)); @@ -159,34 +132,32 @@ if (parms->itc_size == 0) panic("PCI_DMA: parms->size is zero, parms is 0x%p", parms); - tbl->it_size = parms->itc_size; - tbl->it_busno = parms->itc_busno; - tbl->it_offset = parms->itc_offset; - tbl->it_index = parms->itc_index; - tbl->it_entrysize = sizeof(union tce_entry); - tbl->it_blocksize = 1; - tbl->it_type = TCE_PCI; + tbl->it_size = parms->itc_size; + tbl->it_busno = parms->itc_busno; + tbl->it_offset = parms->itc_offset; + tbl->it_index = parms->itc_index; + tbl->it_entrysize = sizeof(union tce_entry); + tbl->it_blocksize = 1; + tbl->it_type = TCE_PCI; kfree(parms); } -void iommu_devnode_init(struct iSeries_Device_Node *dn) { +void iommu_devnode_init(struct iSeries_Device_Node *dn) +{ struct iommu_table *tbl; - tbl = (struct iommu_table *)kmalloc(sizeof(struct iommu_table), GFP_KERNEL); + tbl = kmalloc(sizeof(struct iommu_table), GFP_KERNEL); iommu_table_getparms(dn, tbl); /* Look for existing tce table */ dn->iommu_table = iommu_table_find(tbl); - if (dn->iommu_table == NULL) dn->iommu_table = iommu_init_table(tbl); else kfree(tbl); - - return; } 
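The switch to list_for_each_entry in the patch above replaces a loop that cast struct list_head pointers straight to struct iSeries_Device_Node pointers, which is only valid while the list member sits at offset zero. A rough userspace sketch of the idiom follows. The macros are simplified re-creations of the kernel's list helpers (using the GCC/Clang `__typeof__` extension), and `struct dev_node`, its fields, and `find_by_index` are hypothetical stand-ins, not the iSeries structures.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified version of the kernel macro; relies on __typeof__. */
#define list_for_each_entry(pos, head, member)                          \
	for (pos = container_of((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head);                                      \
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Hypothetical node: the list member is deliberately NOT the first field,
 * so a plain cast from list_head to dev_node would be wrong here. */
struct dev_node {
	int index;
	struct list_head link;
};

/* Return the first node whose index matches, or NULL if there is none. */
struct dev_node *find_by_index(struct list_head *head, int index)
{
	struct dev_node *dn;

	list_for_each_entry(dn, head, link) {
		if (dn->index == index)
			return dn;
	}
	return NULL;
}
```

Because the macro goes through container_of, the loop keeps working if the list member moves within the structure, and the casts disappear from the call site.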
From brking at us.ibm.com Thu Nov 4 02:10:19 2004
From: brking at us.ibm.com (brking at us.ibm.com)
Date: Wed, 03 Nov 2004 09:10:19 -0600
Subject: [PATCH 1/2] ppc64: Block config accesses during BIST (revised - resend)
Message-ID: <200411031510.iA3FAK7t022615@d01av03.pok.ibm.com>

Resending...

Some PCI adapters on pSeries and iSeries hardware (ipr scsi adapters) have an exposure today in that they issue BIST to the adapter to reset the card. If, during the time it takes to complete BIST, userspace attempts to access PCI config space, the host bus bridge will master abort the access since the ipr adapter does not respond on the PCI bus for a brief period of time when running BIST. This master abort results in the host PCI bridge isolating that PCI device from the rest of the system, making the device unusable until Linux is rebooted.

This patch is an attempt to close that exposure by introducing some blocking code in the arch specific PCI code. The intent is to have the ipr device driver invoke these routines to prevent userspace PCI accesses from occurring during this window.

It has been tested by running BIST on an ipr adapter while running a script which looped reading the config space of that adapter through sysfs. Without the patch, an EEH error occurs. With the patch there is no EEH error. Tested on Power 5 and iSeries Power 4.
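The blocking scheme described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct toy_dev`, `TOY_CFGIO_BLOCKED`, and the `toy_*` functions are hypothetical names, the config space is a plain array, and the spinlock that would serialize the flag check against the access is reduced to comments. It shows the intended semantics: while blocked, reads return all ones and writes are dropped but reported as successful.

```c
#include <stdint.h>

#define TOY_CFG_WORDS     64   /* 256 bytes of config space as 32-bit words */
#define TOY_CFGIO_BLOCKED 0x1u /* hypothetical flag bit */

struct toy_dev {
	unsigned int flags;
	uint32_t cfg[TOY_CFG_WORDS];
};

/* A driver would call this just before starting BIST. */
void toy_block_config_io(struct toy_dev *dev)
{
	/* real code: take the config lock here */
	dev->flags |= TOY_CFGIO_BLOCKED;
	/* real code: release the config lock here */
}

/* ... and this once the adapter responds on the bus again. */
void toy_unblock_config_io(struct toy_dev *dev)
{
	dev->flags &= ~TOY_CFGIO_BLOCKED;
}

/* Read one word; returns 0 on success, -1 on a bad index. */
int toy_read_config(struct toy_dev *dev, unsigned int word, uint32_t *val)
{
	if (word >= TOY_CFG_WORDS)
		return -1;
	if (dev->flags & TOY_CFGIO_BLOCKED)
		*val = 0xFFFFFFFFu;       /* all ones; the device is not touched */
	else
		*val = dev->cfg[word];
	return 0;
}

/* Write one word; a blocked write is dropped but still returns 0. */
int toy_write_config(struct toy_dev *dev, unsigned int word, uint32_t val)
{
	if (word >= TOY_CFG_WORDS)
		return -1;
	if (!(dev->flags & TOY_CFGIO_BLOCKED))
		dev->cfg[word] = val;
	return 0;
}
```

One consequence of gating every access on the flag is that the driver's own BIST write has to bypass the gate, which is why the patch adds a separate pci_start_bist() that calls the unlocked write helper directly.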
Signed-off-by: Brian King --- linux-2.6.10-rc1-bk13-bjking1/arch/ppc64/kernel/iSeries_pci.c | 128 +++++++++- linux-2.6.10-rc1-bk13-bjking1/arch/ppc64/kernel/pSeries_pci.c | 103 +++++++- linux-2.6.10-rc1-bk13-bjking1/include/asm-ppc64/iSeries/iSeries_pci.h | 1 linux-2.6.10-rc1-bk13-bjking1/include/asm-ppc64/pci.h | 6 linux-2.6.10-rc1-bk13-bjking1/include/asm-ppc64/prom.h | 4 5 files changed, 226 insertions(+), 16 deletions(-) diff -puN include/asm-ppc64/prom.h~ppc64_block_cfg_io_during_bist include/asm-ppc64/prom.h --- linux-2.6.10-rc1-bk13/include/asm-ppc64/prom.h~ppc64_block_cfg_io_during_bist 2004-11-03 08:52:08.000000000 -0600 +++ linux-2.6.10-rc1-bk13-bjking1/include/asm-ppc64/prom.h 2004-11-03 08:52:08.000000000 -0600 @@ -183,11 +183,15 @@ extern struct device_node *of_chosen; /* flag descriptions */ #define OF_STALE 0 /* node is slated for deletion */ #define OF_DYNAMIC 1 /* node and properties were allocated via kmalloc */ +#define OF_NO_CFGIO 2 /* config space accesses should fail */ #define OF_IS_STALE(x) test_bit(OF_STALE, &x->_flags) #define OF_MARK_STALE(x) set_bit(OF_STALE, &x->_flags) #define OF_IS_DYNAMIC(x) test_bit(OF_DYNAMIC, &x->_flags) #define OF_MARK_DYNAMIC(x) set_bit(OF_DYNAMIC, &x->_flags) +#define OF_IS_CFGIO_BLOCKED(x) test_bit(OF_NO_CFGIO, &x->_flags) +#define OF_UNBLOCK_CFGIO(x) clear_bit(OF_NO_CFGIO, &x->_flags) +#define OF_BLOCK_CFGIO(x) set_bit(OF_NO_CFGIO, &x->_flags) /* * Until 32-bit ppc can add proc_dir_entries to its device_node diff -puN arch/ppc64/kernel/pSeries_pci.c~ppc64_block_cfg_io_during_bist arch/ppc64/kernel/pSeries_pci.c --- linux-2.6.10-rc1-bk13/arch/ppc64/kernel/pSeries_pci.c~ppc64_block_cfg_io_during_bist 2004-11-03 08:52:08.000000000 -0600 +++ linux-2.6.10-rc1-bk13-bjking1/arch/ppc64/kernel/pSeries_pci.c 2004-11-03 08:52:08.000000000 -0600 @@ -30,6 +30,7 @@ #include #include #include +#include #include #include @@ -52,18 +53,17 @@ static int ibm_read_pci_config; static int ibm_write_pci_config; static int 
s7a_workaround; +static spinlock_t config_lock = SPIN_LOCK_UNLOCKED; extern unsigned long pci_probe_only; extern struct mpic *pSeries_mpic; -static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) +static int __rtas_read_config(struct device_node *dn, int where, int size, u32 *val) { int returnval = -1; unsigned long buid, addr; int ret; - if (!dn) - return PCIBIOS_DEVICE_NOT_FOUND; if (where & (size - 1)) return PCIBIOS_BAD_REGISTER_NUMBER; @@ -87,6 +87,23 @@ static int rtas_read_config(struct devic return PCIBIOS_SUCCESSFUL; } +static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) +{ + unsigned long flags; + int ret = 0; + + if (!dn) + return PCIBIOS_DEVICE_NOT_FOUND; + + spin_lock_irqsave(&config_lock, flags); + if (OF_IS_CFGIO_BLOCKED(dn)) + *val = -1; + else + ret = __rtas_read_config(dn, where, size, val); + spin_unlock_irqrestore(&config_lock, flags); + return ret; +} + static int rtas_pci_read_config(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val) @@ -105,13 +122,11 @@ static int rtas_pci_read_config(struct p return PCIBIOS_DEVICE_NOT_FOUND; } -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +static int __rtas_write_config(struct device_node *dn, int where, int size, u32 val) { unsigned long buid, addr; int ret; - if (!dn) - return PCIBIOS_DEVICE_NOT_FOUND; if (where & (size - 1)) return PCIBIOS_BAD_REGISTER_NUMBER; @@ -129,6 +144,21 @@ static int rtas_write_config(struct devi return PCIBIOS_SUCCESSFUL; } +static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +{ + unsigned long flags; + int ret = 0; + + if (!dn) + return PCIBIOS_DEVICE_NOT_FOUND; + + spin_lock_irqsave(&config_lock, flags); + if (!OF_IS_CFGIO_BLOCKED(dn)) + ret = __rtas_write_config(dn, where, size, val); + spin_unlock_irqrestore(&config_lock, flags); + return ret; +} + static int rtas_pci_write_config(struct pci_bus *bus, unsigned int devfn, int 
where, int size, u32 val) @@ -152,6 +182,67 @@ struct pci_ops rtas_pci_ops = { rtas_pci_write_config }; +/** + * pci_block_config_io - Block PCI config reads/writes + * @pdev: pci device struct + * + * This function blocks any PCI config accesses from occurring. + * Device drivers may call this prior to running BIST if the + * adapter cannot handle PCI config reads or writes when + * running BIST. When blocked, any writes will be ignored and + * treated as successful and any reads will return all 1's data. + * + * Return value: + * nothing + **/ +void pci_block_config_io(struct pci_dev *pdev) +{ + struct device_node *dn = pci_device_to_OF_node(pdev); + unsigned long flags; + + spin_lock_irqsave(&config_lock, flags); + OF_BLOCK_CFGIO(dn); + spin_unlock_irqrestore(&config_lock, flags); +} +EXPORT_SYMBOL(pci_block_config_io); + +/** + * pci_unblock_config_io - Unblock PCI config reads/writes + * @pdev: pci device struct + * + * This function allows PCI config accesses to resume. + * + * Return value: + * nothing + **/ +void pci_unblock_config_io(struct pci_dev *pdev) +{ + struct device_node *dn = pci_device_to_OF_node(pdev); + unsigned long flags; + + spin_lock_irqsave(&config_lock, flags); + OF_UNBLOCK_CFGIO(dn); + spin_unlock_irqrestore(&config_lock, flags); +} +EXPORT_SYMBOL(pci_unblock_config_io); + +/** + * pci_start_bist - Start BIST on a PCI device + * @pdev: pci device struct + * + * This function allows a device driver to start BIST + * when PCI config accesses are disabled. 
+ * + * Return value: + * nothing + **/ +int pci_start_bist(struct pci_dev *pdev) +{ + struct device_node *dn = pci_device_to_OF_node(pdev); + return __rtas_write_config(dn, PCI_BIST, 1, PCI_BIST_START); +} +EXPORT_SYMBOL(pci_start_bist); + static void python_countermeasures(unsigned long addr) { void __iomem *chip_regs; diff -puN include/asm-ppc64/pci.h~ppc64_block_cfg_io_during_bist include/asm-ppc64/pci.h --- linux-2.6.10-rc1-bk13/include/asm-ppc64/pci.h~ppc64_block_cfg_io_during_bist 2004-11-03 08:52:08.000000000 -0600 +++ linux-2.6.10-rc1-bk13-bjking1/include/asm-ppc64/pci.h 2004-11-03 08:52:08.000000000 -0600 @@ -244,6 +244,12 @@ extern int pci_read_irq_line(struct pci_ extern void pcibios_add_platform_entries(struct pci_dev *dev); +extern void pci_block_config_io(struct pci_dev *dev); + +extern void pci_unblock_config_io(struct pci_dev *dev); + +extern int pci_start_bist(struct pci_dev *dev); + #endif /* __KERNEL__ */ #endif /* __PPC64_PCI_H */ diff -puN include/asm-ppc64/iSeries/iSeries_pci.h~ppc64_block_cfg_io_during_bist include/asm-ppc64/iSeries/iSeries_pci.h --- linux-2.6.10-rc1-bk13/include/asm-ppc64/iSeries/iSeries_pci.h~ppc64_block_cfg_io_during_bist 2004-11-03 08:52:08.000000000 -0600 +++ linux-2.6.10-rc1-bk13-bjking1/include/asm-ppc64/iSeries/iSeries_pci.h 2004-11-03 08:52:08.000000000 -0600 @@ -91,6 +91,7 @@ struct iSeries_Device_Node { int ReturnCode; /* Return Code Holder */ int IoRetry; /* Current Retry Count */ int Flags; /* Possible flags(disable/bist)*/ +#define ISERIES_CFGIO_BLOCKED 1 u16 Vendor; /* Vendor ID */ u8 LogicalSlot; /* Hv Slot Index for Tces */ struct iommu_table* iommu_table;/* Device TCE Table */ diff -puN arch/ppc64/kernel/iSeries_pci.c~ppc64_block_cfg_io_during_bist arch/ppc64/kernel/iSeries_pci.c --- linux-2.6.10-rc1-bk13/arch/ppc64/kernel/iSeries_pci.c~ppc64_block_cfg_io_during_bist 2004-11-03 08:52:08.000000000 -0600 +++ linux-2.6.10-rc1-bk13-bjking1/arch/ppc64/kernel/iSeries_pci.c 2004-11-03 08:52:08.000000000 -0600 @@ 
-28,6 +28,7 @@ #include #include #include +#include #include #include @@ -77,6 +78,7 @@ static int Pci_Retry_Max = 3; /* Only re static int Pci_Error_Flag = 1; /* Set Retry Error on. */ static struct pci_ops iSeries_pci_ops; +static spinlock_t config_lock = SPIN_LOCK_UNLOCKED; /* * Table defines @@ -603,16 +605,12 @@ static u64 hv_cfg_write_func[4] = { /* * Read PCI config space */ -static int iSeries_pci_read_config(struct pci_bus *bus, unsigned int devfn, +static int __iSeries_pci_read_config(struct iSeries_Device_Node *node, int offset, int size, u32 *val) { - struct iSeries_Device_Node *node = find_Device_Node(bus->number, devfn); u64 fn; struct HvCallPci_LoadReturn ret; - if (node == NULL) - return PCIBIOS_DEVICE_NOT_FOUND; - fn = hv_cfg_read_func[(size - 1) & 3]; HvCall3Ret16(fn, &ret, node->DsaAddr.DsaAddr, offset, 0); @@ -625,20 +623,36 @@ static int iSeries_pci_read_config(struc return 0; } +static int iSeries_pci_read_config(struct pci_bus *bus, unsigned int devfn, + int offset, int size, u32 *val) +{ + struct iSeries_Device_Node *node = find_Device_Node(bus->number, devfn); + int ret = PCIBIOS_DEVICE_NOT_FOUND; + unsigned long flags; + + if (node) { + ret = 0; + spin_lock_irqsave(&config_lock, flags); + if (node->Flags & ISERIES_CFGIO_BLOCKED) + *val = -1; + else + ret = __iSeries_pci_read_config(node, offset, size, val); + spin_unlock_irqrestore(&config_lock, flags); + } + + return ret; +} + /* * Write PCI config space */ -static int iSeries_pci_write_config(struct pci_bus *bus, unsigned int devfn, +static int __iSeries_pci_write_config(struct iSeries_Device_Node *node, int offset, int size, u32 val) { - struct iSeries_Device_Node *node = find_Device_Node(bus->number, devfn); u64 fn; u64 ret; - if (node == NULL) - return PCIBIOS_DEVICE_NOT_FOUND; - fn = hv_cfg_write_func[(size - 1) & 3]; ret = HvCall4(fn, node->DsaAddr.DsaAddr, offset, val, 0); @@ -648,6 +662,23 @@ static int iSeries_pci_write_config(stru return 0; } +static int 
iSeries_pci_write_config(struct pci_bus *bus, unsigned int devfn, + int offset, int size, u32 val) +{ + struct iSeries_Device_Node *node = find_Device_Node(bus->number, devfn); + int ret = PCIBIOS_DEVICE_NOT_FOUND; + unsigned long flags; + + if (node) { + spin_lock_irqsave(&config_lock, flags); + if (!(node->Flags & ISERIES_CFGIO_BLOCKED)) + ret = __iSeries_pci_write_config(node, offset, size, val); + spin_unlock_irqrestore(&config_lock, flags); + } + + return ret; +} + static struct pci_ops iSeries_pci_ops = { .read = iSeries_pci_read_config, .write = iSeries_pci_write_config @@ -906,3 +937,80 @@ void iSeries_Write_Long(u32 data, volati } while (CheckReturnCode("WWL", DevNode, rc) != 0); } EXPORT_SYMBOL(iSeries_Write_Long); + +/** + * pci_block_config_io - Block PCI config reads/writes + * @pdev: pci device struct + * + * This function blocks any PCI config accesses from occurring. + * Device drivers may call this prior to running BIST if the + * adapter cannot handle PCI config reads or writes when + * running BIST. When blocked, any writes will be ignored and + * treated as successful and any reads will return all 1's data. + * + * Return value: + * nothing + **/ +void pci_block_config_io(struct pci_dev *pdev) +{ + struct iSeries_Device_Node *node; + unsigned long flags; + + node = find_Device_Node(pdev->bus->number, pdev->devfn); + + if (node == NULL) + return; + + spin_lock_irqsave(&config_lock, flags); + node->Flags |= ISERIES_CFGIO_BLOCKED; + spin_unlock_irqrestore(&config_lock, flags); +} +EXPORT_SYMBOL(pci_block_config_io); + +/** + * pci_unblock_config_io - Unblock PCI config reads/writes + * @pdev: pci device struct + * + * This function allows PCI config accesses to resume. 
+ * + * Return value: + * nothing + **/ +void pci_unblock_config_io(struct pci_dev *pdev) +{ + struct iSeries_Device_Node *node; + unsigned long flags; + + node = find_Device_Node(pdev->bus->number, pdev->devfn); + + if (node == NULL) + return; + + spin_lock_irqsave(&config_lock, flags); + node->Flags &= ~ISERIES_CFGIO_BLOCKED; + spin_unlock_irqrestore(&config_lock, flags); +} +EXPORT_SYMBOL(pci_unblock_config_io); + +/** + * pci_start_bist - Start BIST on a PCI device + * @pdev: pci device struct + * + * This function allows a device driver to start BIST + * when PCI config accesses are disabled. + * + * Return value: + * PCIBIOS_DEVICE_NOT_FOUND or the config write result + **/ +int pci_start_bist(struct pci_dev *pdev) +{ + struct iSeries_Device_Node *node; + + node = find_Device_Node(pdev->bus->number, pdev->devfn); + + if (node == NULL) + return PCIBIOS_DEVICE_NOT_FOUND; + + return __iSeries_pci_write_config(node, PCI_BIST, 1, PCI_BIST_START); +} +EXPORT_SYMBOL(pci_start_bist); _ From brking at us.ibm.com Thu Nov 4 02:10:26 2004 From: brking at us.ibm.com (brking at us.ibm.com) Date: Wed, 03 Nov 2004 09:10:26 -0600 Subject: [PATCH 2/2] ipr_block_config_io_during_bist (resend) Message-ID: <200411031510.iA3FAR2u010150@d03av02.boulder.ibm.com> Change ipr to use new ppc64 pci APIs to block PCI config space accesses when running BIST to prevent PCI master aborts.
Signed-off-by: Brian King --- linux-2.6.10-rc1-bk13-bjking1/drivers/scsi/ipr.c | 5 ++++- linux-2.6.10-rc1-bk13-bjking1/drivers/scsi/ipr.h | 7 +++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff -puN drivers/scsi/ipr.c~ipr_block_config_io_during_bist drivers/scsi/ipr.c --- linux-2.6.10-rc1-bk13/drivers/scsi/ipr.c~ipr_block_config_io_during_bist 2004-11-03 09:08:27.000000000 -0600 +++ linux-2.6.10-rc1-bk13-bjking1/drivers/scsi/ipr.c 2004-11-03 09:08:27.000000000 -0600 @@ -4935,6 +4935,7 @@ static int ipr_reset_restore_cfg_space(s int rc; ENTER; + pci_unblock_config_io(ioa_cfg->pdev); rc = pci_restore_state(ioa_cfg->pdev); if (rc != PCIBIOS_SUCCESSFUL) { @@ -4989,9 +4990,11 @@ static int ipr_reset_start_bist(struct i int rc; ENTER; - rc = pci_write_config_byte(ioa_cfg->pdev, PCI_BIST, PCI_BIST_START); + pci_block_config_io(ioa_cfg->pdev); + rc = pci_start_bist(ioa_cfg->pdev); if (rc != PCIBIOS_SUCCESSFUL) { + pci_unblock_config_io(ioa_cfg->pdev); ipr_cmd->ioasa.ioasc = cpu_to_be32(IPR_IOASC_PCI_ACCESS_ERROR); rc = IPR_RC_JOB_CONTINUE; } else { diff -puN drivers/scsi/ipr.h~ipr_block_config_io_during_bist drivers/scsi/ipr.h --- linux-2.6.10-rc1-bk13/drivers/scsi/ipr.h~ipr_block_config_io_during_bist 2004-11-03 09:08:27.000000000 -0600 +++ linux-2.6.10-rc1-bk13-bjking1/drivers/scsi/ipr.h 2004-11-03 09:08:27.000000000 -0600 @@ -1112,6 +1112,13 @@ __FUNCTION__, __LINE__, ioa_cfg #define ipr_remove_dump_file(kobj, attr) do { } while(0) #endif +#if !defined(CONFIG_PPC_PSERIES) && !defined(CONFIG_PPC_ISERIES) +#define pci_block_config_io(dev) do { } while(0) +#define pci_unblock_config_io(dev) do { } while(0) +#define pci_start_bist(dev) \ + pci_write_config_byte(dev, PCI_BIST, PCI_BIST_START) +#endif + /* * Error logging macros */ _ From johnrose at austin.ibm.com Thu Nov 4 02:50:20 2004 From: johnrose at austin.ibm.com (John Rose) Date: Wed, 03 Nov 2004 09:50:20 -0600 Subject: [PATCH] iommu fixes, round 3 In-Reply-To: <1099455531.31630.35.camel@gaston> References: 
<1098775712.6897.17.camel@gaston> <1098808895.32293.23.camel@sinatra.austin.ibm.com> <1098813781.32293.40.camel@sinatra.austin.ibm.com> <16768.10849.741580.850491@cargo.ozlabs.ibm.com> <1098998916.692.20.camel@sinatra.austin.ibm.com> <1099455531.31630.35.camel@gaston> Message-ID: <1099497020.21421.0.camel@sinatra.austin.ibm.com> > > - Creates of_cleanup_node(), which will directly precede the dynamic removal of > > any device node > > Hrm... one thing I'm still annoyed with is that you are still calling > of_cleanup_node() from within of_remove_node(). That call should be > moved to the caller. :) Respectfully, I still disagree. The caller is a procfs-specific function related to an interface that we're hoping to deprecate soon. We want this to happen any time a node is removed, not anytime a node is removed using interface so-and-so. To me, it makes sense to put this here since of_add_node() calls of_finish_node_dynamic(), which creates the table. John From olof at austin.ibm.com Thu Nov 4 04:17:30 2004 From: olof at austin.ibm.com (Olof Johansson) Date: Wed, 3 Nov 2004 11:17:30 -0600 Subject: [PATCH] PPC64 VIO iommu table property parsing wrong Message-ID: <20041103171730.GA31267@4> Andrew, please apply: With current firmware, the ibm,my-dma-window property now contains two panes for VSCSI server nodes. This breaks the current tests in the setup code. There's a bunch of references to pre-GA firmware bugs. That's a while ago, so we can remove the workarounds without breaking anyone. 
Signed-off-by: Olof Johansson --- linux-2.5-olof/arch/ppc64/kernel/vio.c | 19 +------------------ 1 files changed, 1 insertion(+), 18 deletions(-) diff -puN arch/ppc64/kernel/vio.c~vio-iommu arch/ppc64/kernel/vio.c --- linux-2.5/arch/ppc64/kernel/vio.c~vio-iommu 2004-11-03 09:50:29.829990236 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/vio.c 2004-11-03 10:12:07.313786376 -0600 @@ -521,24 +521,7 @@ static struct iommu_table * vio_build_io newTceTable = (struct iommu_table *) kmalloc(sizeof(struct iommu_table), GFP_KERNEL); - /* RPA docs say that #address-cells is always 1 for virtual - devices, but some older boxes' OF returns 2. This should - be removed by GA, unless there is legacy OFs that still - have 2 for #address-cells */ - size = ((dma_window[1+vio_num_address_cells] >> PAGE_SHIFT) << 3) - >> PAGE_SHIFT; - - /* This is just an ugly kludge. Remove as soon as the OF for all - machines actually follow the spec and encodes the offset field - as phys-encode (that is, #address-cells wide)*/ - if (dma_window_property_size == 12) { - size = ((dma_window[1] >> PAGE_SHIFT) << 3) >> PAGE_SHIFT; - } else if (dma_window_property_size == 20) { - size = ((dma_window[4] >> PAGE_SHIFT) << 3) >> PAGE_SHIFT; - } else { - printk(KERN_WARNING "vio_build_iommu_table: Invalid size of ibm,my-dma-window=%i, using 0x80 for size\n", dma_window_property_size); - size = 0x80; - } + size = ((dma_window[4] >> PAGE_SHIFT) << 3) >> PAGE_SHIFT; /* There should be some code to extract the phys-encoded offset using prom_n_addr_cells(). 
However, according to a comment _ From benh at kernel.crashing.org Thu Nov 4 09:15:00 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 04 Nov 2004 09:15:00 +1100 Subject: [PATCH] iommu fixes, round 3 In-Reply-To: <1099497020.21421.0.camel@sinatra.austin.ibm.com> References: <1098775712.6897.17.camel@gaston> <1098808895.32293.23.camel@sinatra.austin.ibm.com> <1098813781.32293.40.camel@sinatra.austin.ibm.com> <16768.10849.741580.850491@cargo.ozlabs.ibm.com> <1098998916.692.20.camel@sinatra.austin.ibm.com> <1099455531.31630.35.camel@gaston> <1099497020.21421.0.camel@sinatra.austin.ibm.com> Message-ID: <1099520100.31629.52.camel@gaston> > :) Respectfully, I still disagree. The caller is a procfs-specific function > related to an interface that we're hoping to deprecate soon. We want this to > happen any time a node is removed, not anytime a node is removed using > interface so-and-so. > > To me, it makes sense to put this here since of_add_node() calls > of_finish_dynamic_node(), which creates the table. I hate that interface... but I suppose we can merge the patch for now. I think this should be changed though. It's no business of the low level device-tree manipulation functions to know about such things as iommu tables. And what will happen the day I remove the iommu table pointer from the struct device-node anyway ? If your interface to userland relies on that, then it's broken and will have to be reworked :( Maybe we can get away with creating a notifier mechanism for something in the kernel to get called back after nodes are being added and before they are being removed. That would be OK I suppose, but the low level tree manipulation has to stay separate. I do intend, in the long run, to remove all those additional fields we put in struct device-node... Ben.
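[Editorial sketch, not part of the thread.] The notifier idea Ben describes above could look roughly like the following user-space C model. Everything here is an assumption made for illustration: the names (of_register_notifier, of_node_event, node_stub and so on) are hypothetical and do not come from any posted patch, and there is no locking. The point is only that the generic add/remove paths fire callbacks and never touch iommu state themselves.

```c
#include <stddef.h>

/* Hypothetical event codes; not taken from any posted patch. */
enum of_node_event { OF_NODE_ADDED, OF_NODE_REMOVING };

/* Stand-in for struct device_node, reduced to what the sketch needs. */
struct node_stub {
	const char *full_name;
	void *iommu_table;	/* owned by the iommu code, not the tree code */
};

struct of_notifier {
	int (*call)(struct of_notifier *self, enum of_node_event ev,
		    struct node_stub *np);
	struct of_notifier *next;
};

static struct of_notifier *of_notifier_chain;

/* Push a subscriber on the chain (no locking in this sketch). */
void of_register_notifier(struct of_notifier *nb)
{
	nb->next = of_notifier_chain;
	of_notifier_chain = nb;
}

static void of_notify(enum of_node_event ev, struct node_stub *np)
{
	struct of_notifier *nb;

	for (nb = of_notifier_chain; nb != NULL; nb = nb->next)
		nb->call(nb, ev, np);
}

/* Generic tree code: it knows nothing about iommu tables. */
void of_add_node_stub(struct node_stub *np)
{
	/* ... link np into the tree here ... */
	of_notify(OF_NODE_ADDED, np);		/* after the node is added */
}

void of_remove_node_stub(struct node_stub *np)
{
	of_notify(OF_NODE_REMOVING, np);	/* before the node is removed */
	/* ... unlink np from the tree here ... */
}

/* iommu side: creates and frees its table in response to events. */
static int iommu_table_placeholder;

static int iommu_of_event(struct of_notifier *self, enum of_node_event ev,
			  struct node_stub *np)
{
	(void)self;
	switch (ev) {
	case OF_NODE_ADDED:
		np->iommu_table = &iommu_table_placeholder;
		break;
	case OF_NODE_REMOVING:
		np->iommu_table = NULL;	/* real code would free the table */
		break;
	}
	return 0;
}

struct of_notifier iommu_of_notifier = { .call = iommu_of_event, .next = NULL };
```

With this shape the iommu code would register iommu_of_notifier once at init time, and of_remove_node() would no longer need to call iommu_free_table() directly.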
From johnrose at austin.ibm.com Thu Nov 4 09:50:50 2004 From: johnrose at austin.ibm.com (John Rose) Date: Wed, 03 Nov 2004 16:50:50 -0600 Subject: [PATCH] iommu fixes, round 3 In-Reply-To: <1099520100.31629.52.camel@gaston> References: <1098775712.6897.17.camel@gaston> <1098808895.32293.23.camel@sinatra.austin.ibm.com> <1098813781.32293.40.camel@sinatra.austin.ibm.com> <16768.10849.741580.850491@cargo.ozlabs.ibm.com> <1098998916.692.20.camel@sinatra.austin.ibm.com> <1099455531.31630.35.camel@gaston> <1099497020.21421.0.camel@sinatra.austin.ibm.com> <1099520100.31629.52.camel@gaston> Message-ID: <1099522250.21421.22.camel@sinatra.austin.ibm.com> On Wed, 2004-11-03 at 16:15, Benjamin Herrenschmidt wrote: > And what will happen the day I remove the iommu table pointer > from the struct device-node anyway ? This would break the current table creation and management scheme, so some reworking would have to be done anyway. As for cleaning up struct device_node, you're preaching to the choir. How will the tables be associated with devices in the new case? > If your interface to userland relies on that, then it's broken and will > have to be reworked :( User-space DLPAR stuff doesn't care about these tables, or at what point they're freed, if that's what you mean. 
Thanks for looking at the patch, I'll take reluctant acceptance over nothing :) John From benh at kernel.crashing.org Thu Nov 4 09:51:47 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 04 Nov 2004 09:51:47 +1100 Subject: [PATCH] iommu fixes, round 3 In-Reply-To: <1099522250.21421.22.camel@sinatra.austin.ibm.com> References: <1098775712.6897.17.camel@gaston> <1098808895.32293.23.camel@sinatra.austin.ibm.com> <1098813781.32293.40.camel@sinatra.austin.ibm.com> <16768.10849.741580.850491@cargo.ozlabs.ibm.com> <1098998916.692.20.camel@sinatra.austin.ibm.com> <1099455531.31630.35.camel@gaston> <1099497020.21421.0.camel@sinatra.austin.ibm.com> <1099520100.31629.52.camel@gaston> <1099522250.21421.22.camel@sinatra.austin.ibm.com> Message-ID: <1099522307.31629.82.camel@gaston> On Wed, 2004-11-03 at 16:50 -0600, John Rose wrote: > On Wed, 2004-11-03 at 16:15, Benjamin Herrenschmidt wrote: > > > And what will happen the day I remove the iommu table pointer > > from the struct device-node anyway ? > > This would break the current table creation and management scheme, so > some reworking would have to be done anyway. As for cleaning up struct > device_node, you're preaching to the choir. How will the tables be > associated with devices in the new case? Some structure attached to the device, but not the device-node. But it's not there yet anyway, it's a long term goal. > > If your interface to userland relies on that, then it's broken and will > > have to be reworked :( > > User-space DLPAR stuff doesn't care about these tables, or at what point > they're freed, if that's what you mean. Thanks for looking at the > patch, I'll take reluctant acceptance over nothing :) Hehe, well, we need to fix the problem for now anyway. Ben. 
From anton at samba.org Thu Nov 4 19:10:04 2004 From: anton at samba.org (Anton Blanchard) Date: Thu, 4 Nov 2004 19:10:04 +1100 Subject: [PATCH] ppc64: Add option for oprofile to backtrace through spinlocks Message-ID: <20041104081003.GB5357@krispykreme.ozlabs.ibm.com> Hi, Now that spinlocks are always out of line, oprofile needs to backtrace through them. The following patch adds this but also adds the ability to turn it off (via the backtrace_spinlocks option in oprofilefs). The backout option is included because the backtracing here is best effort. On ppc64 the performance monitor exception is not an NMI, we get them only when interrupts are enabled. This means we can receive a profile hit that is inside a spinlock when our PC is somewhere completely different. In this patch we check to make sure the PC of the performance monitor exception as well as the current PC is inside the spinlock region. If so then we find the callers PC. If this is not true we play it safe and leave the tick inside the lock region. Also, now that we execute the SLB handler in real mode we have to adjust the address range that we consider as valid real mode addresses. Otherwise the SLB miss handler will end up as unknown kernel profile hits. Signed-off-by: Anton Blanchard diff -puN arch/ppc64/oprofile/op_model_power4.c~oprofile_backtrace arch/ppc64/oprofile/op_model_power4.c --- gr_work/arch/ppc64/oprofile/op_model_power4.c~oprofile_backtrace 2004-09-14 04:04:47.995524298 -0500 +++ gr_work-anton/arch/ppc64/oprofile/op_model_power4.c 2004-09-14 04:37:43.108261897 -0500 @@ -32,6 +32,13 @@ static u32 mmcr0_val; static u64 mmcr1_val; static u32 mmcra_val; +/* + * Since we do not have an NMI, backtracing through spinlocks is + * only a best guess. In light of this, allow it to be disabled at + * runtime. 
+ */ +static int backtrace_spinlocks; + static void power4_reg_setup(struct op_counter_config *ctr, struct op_system_config *sys, int num_ctrs) @@ -59,6 +66,8 @@ static void power4_reg_setup(struct op_c mmcr1_val = sys->mmcr1; mmcra_val = sys->mmcra; + backtrace_spinlocks = sys->backtrace_spinlocks; + for (i = 0; i < num_counters; ++i) reset_value[i] = 0x80000000UL - ctr[i].count; @@ -170,19 +179,38 @@ static void __attribute_used__ kernel_un { } +static unsigned long check_spinlock_pc(struct pt_regs *regs, + unsigned long profile_pc) +{ + unsigned long pc = instruction_pointer(regs); + + /* + * If both the SIAR (sampled instruction) and the perfmon exception + * occurred in a spinlock region then we account the sample to the + * calling function. This isnt 100% correct, we really need soft + * IRQ disable so we always get the perfmon exception at the + * point at which the SIAR is set. + */ + if (backtrace_spinlocks && in_lock_functions(pc) && + in_lock_functions(profile_pc)) + return regs->link; + else + return profile_pc; +} + /* * On GQ and newer the MMCRA stores the HV and PR bits at the time * the SIAR was sampled. We use that to work out if the SIAR was sampled in * the hypervisor, our exception vectors or RTAS. */ -static unsigned long get_pc(void) +static unsigned long get_pc(struct pt_regs *regs) { unsigned long pc = mfspr(SPRN_SIAR); unsigned long mmcra; /* Cant do much about it */ if (!mmcra_has_sihv) - return pc; + return check_spinlock_pc(regs, pc); mmcra = mfspr(SPRN_MMCRA); @@ -196,10 +224,6 @@ static unsigned long get_pc(void) if (mmcra & MMCRA_SIPR) return pc; - /* Were we in our exception vectors? */ - if (pc < 0x4000UL) - return (unsigned long)__va(pc); - #ifdef CONFIG_PPC_PSERIES /* Were we in RTAS? */ if (pc >= rtas.base && pc < (rtas.base + rtas.size)) @@ -207,12 +231,16 @@ static unsigned long get_pc(void) return *((unsigned long *)rtas_bucket); #endif + /* Were we in our exception vectors or SLB real mode miss handler? 
*/ + if (pc < 0x1000000UL) + return (unsigned long)__va(pc); + /* Not sure where we were */ if (pc < KERNELBASE) /* function descriptor madness */ return *((unsigned long *)kernel_unknown_bucket); - return pc; + return check_spinlock_pc(regs, pc); } static int get_kernel(unsigned long pc) @@ -239,7 +267,7 @@ static void power4_handle_interrupt(stru unsigned int cpu = smp_processor_id(); unsigned int mmcr0; - pc = get_pc(); + pc = get_pc(regs); is_kernel = get_kernel(pc); /* set the PMM bit (see comment below) */ diff -L op_model_power4.c -puN /dev/null /dev/null diff -puN arch/ppc64/oprofile/common.c~oprofile_backtrace arch/ppc64/oprofile/common.c --- gr_work/arch/ppc64/oprofile/common.c~oprofile_backtrace 2004-09-14 04:38:28.408023510 -0500 +++ gr_work-anton/arch/ppc64/oprofile/common.c 2004-09-14 04:40:18.825344482 -0500 @@ -112,11 +112,16 @@ static int op_ppc64_create_files(struct oprofilefs_create_ulong(sb, root, "enable_kernel", &sys.enable_kernel); oprofilefs_create_ulong(sb, root, "enable_user", &sys.enable_user); + oprofilefs_create_ulong(sb, root, "backtrace_spinlocks", + &sys.backtrace_spinlocks); /* Default to tracing both kernel and user */ sys.enable_kernel = 1; sys.enable_user = 1; + /* Turn on backtracing through spinlocks by default */ + sys.backtrace_spinlocks = 1; + return 0; } diff -puN arch/ppc64/oprofile/op_impl.h~oprofile_backtrace arch/ppc64/oprofile/op_impl.h --- gr_work/arch/ppc64/oprofile/op_impl.h~oprofile_backtrace 2004-09-14 04:38:59.694872442 -0500 +++ gr_work-anton/arch/ppc64/oprofile/op_impl.h 2004-09-14 04:39:17.624700077 -0500 @@ -71,6 +71,7 @@ struct op_system_config { unsigned long mmcra; unsigned long enable_kernel; unsigned long enable_user; + unsigned long backtrace_spinlocks; }; /* Per-arch configuration */ _ From anton at samba.org Fri Nov 5 02:50:32 2004 From: anton at samba.org (Anton Blanchard) Date: Fri, 5 Nov 2004 02:50:32 +1100 Subject: RTAS error log sequence numbers Message-ID: 
<20041104155032.GB1268@krispykreme.ozlabs.ibm.com> Hi, We can end up reusing RTAS error log sequence numbers - by calling log_error out of rtas_call before we have done nvram_init. eg on a p630 with a graphics card it doesnt like: RTAS: event: 1, Type: Internal Device Failure, Severity: 5 ... PCI: Probing PCI hardware RTAS: event: 2, Type: Internal Device Failure, Severity: 5 RTAS: event: 3, Type: Internal Device Failure, Severity: 5 RTAS: event: 4, Type: Internal Device Failure, Severity: 5 RTAS: event: 5, Type: Internal Device Failure, Severity: 5 RTAS: event: 6, Type: Internal Device Failure, Severity: 5 RTAS: event: 7, Type: Internal Device Failure, Severity: 5 RTAS: event: 8, Type: Internal Device Failure, Severity: 5 RTAS: event: 9, Type: Internal Device Failure, Severity: 5 RTAS: event: 10, Type: Internal Device Failure, Severity: 5 RTAS: event: 11, Type: Internal Device Failure, Severity: 5 RTAS: event: 12, Type: Internal Device Failure, Severity: 5 RTAS: event: 13, Type: Internal Device Failure, Severity: 5 RTAS: event: 14, Type: Internal Device Failure, Severity: 5 RTAS: event: 15, Type: Internal Device Failure, Severity: 5 RTAS: event: 16, Type: Internal Device Failure, Severity: 5 RTAS: event: 17, Type: Internal Device Failure, Severity: 5 RTAS: event: 18, Type: Internal Device Failure, Severity: 5 RTAS: event: 19, Type: Internal Device Failure, Severity: 5 RTAS: event: 20, Type: Internal Device Failure, Severity: 5 RTAS: event: 21, Type: Internal Device Failure, Severity: 5 RTAS: event: 22, Type: Internal Device Failure, Severity: 5 RTAS: event: 23, Type: Internal Device Failure, Severity: 5 RTAS: event: 24, Type: Internal Device Failure, Severity: 5 RTAS: event: 25, Type: Internal Device Failure, Severity: 5 RTAS: event: 26, Type: Internal Device Failure, Severity: 5 RTAS: event: 27, Type: Internal Device Failure, Severity: 5 RTAS: event: 28, Type: Internal Device Failure, Severity: 5 RTAS: event: 29, Type: Internal Device Failure, Severity: 5 RTAS: 
event: 30, Type: Internal Device Failure, Severity: 5 RTAS: event: 31, Type: Internal Device Failure, Severity: 5 RTAS: event: 32, Type: Internal Device Failure, Severity: 5 RTAS: event: 33, Type: Internal Device Failure, Severity: 5 RTAS: event: 34, Type: Internal Device Failure, Severity: 5 RTAS: event: 35, Type: Internal Device Failure, Severity: 5 RTAS: event: 36, Type: Internal Device Failure, Severity: 5 RTAS: event: 37, Type: Internal Device Failure, Severity: 5 ... RTAS daemon started RTAS: event: 42, Type: Unknown, Severity: 2 On reboot we get the same 1-37 error logs then the last one at 43. Maybe we don't care about persistent error log numbers but I thought I'd check that the tools handle it OK. Anton From linas at austin.ibm.com Fri Nov 5 03:38:00 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Thu, 4 Nov 2004 10:38:00 -0600 Subject: RTAS error log sequence numbers In-Reply-To: <20041104155032.GB1268@krispykreme.ozlabs.ibm.com> References: <20041104155032.GB1268@krispykreme.ozlabs.ibm.com> Message-ID: <20041104163759.GR10026@austin.ibm.com> On Fri, Nov 05, 2004 at 02:50:32AM +1100, Anton Blanchard was heard to remark: > > Hi, > > We can end up reusing RTAS error log sequence numbers - by calling > log_error out of rtas_call before we have done nvram_init. eg on a p630 Curiously, nvram_init happens late in the boot sequence. I'm not sure why, other than the application of the principle "move things as late into the boot sequence as possible." > On reboot we get the same 1-37 error logs then the last one at 43. Maybe > we don't care about persistent error log numbers but I thought I'd check > that the tools handle it OK. I assume you mean "unique error log numbers" that are monotonically increasing across boots. This would require moving nvram_init to very early in the boot sequence, since RTAS errors can occur very early. I'll volunteer to do this shuffle, as long as there is no objection in principle.
I don't have much of a feel for the pros and cons of this. --linas From moilanen at austin.ibm.com Fri Nov 5 04:04:52 2004 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Thu, 4 Nov 2004 11:04:52 -0600 Subject: RTAS error log sequence numbers In-Reply-To: <20041104163759.GR10026@austin.ibm.com> References: <20041104155032.GB1268@krispykreme.ozlabs.ibm.com> <20041104163759.GR10026@austin.ibm.com> Message-ID: <20041104110452.38f2152e@localhost> > > On reboot we get the same 1-37 error logs then the last one at 43. Maybe > > we don't care about persistent error log numbers but I thought I'd check > > that the tools handle it OK. > > I assume you mean "unique error log numbers" that are monotonically > increasing across boots. This would require moving nvram_init to > very early in the boot sequence, since RTAS errors can occur very early. > I'll volunteer to do this shuffle, as long as there is no objection > in principle. > > I don't have much of a feel for the pros and cons of this. As long as nvram_init is called after pSeries/pmac_nvram_init, there should not be an issue. In fact you could just as easily call nvram_init() at the end of pSeries/pmac_nvram_init(). You also need to add code that sets error_log_cnt from nvram (and remove it from nvram_read_error_log). Currently we set error_log_cnt when rtasd starts up, which may be after the first log_error. The user-level daemons (ELA and rtas_errd) were supposed to be able to handle duplicate sequence numbers since there are situations where we cannot guarantee a unique sequence number.
Jake From nfont at austin.ibm.com Fri Nov 5 04:08:14 2004 From: nfont at austin.ibm.com (Nathan Fontenot) Date: Thu, 04 Nov 2004 11:08:14 -0600 Subject: RTAS error log sequence numbers In-Reply-To: <20041104163759.GR10026@austin.ibm.com> References: <20041104155032.GB1268@krispykreme.ozlabs.ibm.com> <20041104163759.GR10026@austin.ibm.com> Message-ID: <418A61FE.7030500@austin.ibm.com> Linas Vepstas wrote: > I assume you mean "unique error log numbers" that are monotonically > increasing across boots. This would require moving nvram_init to > very early in the boot sequence, since RTAS errors can occur very early. > I'll volunteer to do this shuffle, as long as there is no objection > in principle. > > I don't have much of a feel for the pros and cons of this. You would also need to initialize the error log count by either reading the last RTAS event stored in nvram or starting the rtasd kernel daemon, before anyone calls log_error(). Moving this to earlier in the boot sequence would be nice but I'm not sure it's worth the effort. Is there any way to guarantee that this is done before anyone calls log_error()? -Nathan F. From nfont at austin.ibm.com Fri Nov 5 03:34:38 2004 From: nfont at austin.ibm.com (Nathan Fontenot) Date: Thu, 04 Nov 2004 10:34:38 -0600 Subject: RTAS error log sequence numbers In-Reply-To: <20041104155032.GB1268@krispykreme.ozlabs.ibm.com> References: <20041104155032.GB1268@krispykreme.ozlabs.ibm.com> Message-ID: <418A5A1E.5050102@austin.ibm.com> Anton Blanchard wrote: > On reboot we get the same 1-37 error logs then the last one at 43. Maybe > we don't care about persistent error log numbers but I thought I'd check > that the tools handle it OK. Yes, the tools (namely rtas_errd) handle this just fine. The tools aren't really concerned about the log number; it's more for end users to track RTAS events. The error log count isn't initialized until the rtasd kernel daemon starts and reads the last event stored in nvram.
This is why the count starts to look sane after rtasd starts. We could put code in to initialize the error log count earlier if people really want it, but I don't think it's really necessary. > > Anton -- Nathan Fontenot Power Linux Platform Serviceability Home: IBM Austin 908/1E-036 Phone: 512.838.3377 (T/L 678.3377) Email: nfont at austin.ibm.com From johnrose at austin.ibm.com Fri Nov 5 08:29:30 2004 From: johnrose at austin.ibm.com (John Rose) Date: Thu, 04 Nov 2004 15:29:30 -0600 Subject: [PATCH] PPC64 pSeries iommu cleanups Message-ID: <1099603770.30815.4.camel@sinatra.austin.ibm.com> Hi Paul- Here's a resend of the last iommu patch I sent, re-based against current linus bk. This patch changes the following iommu-related things: - Renames the [i,p]series versions of iommu_devnode_init(), to keep things logically separate where possible. - Moves iommu_free_table() to generic iommu.c - Creates of_cleanup_node(), which will directly precede the dynamic removal of any device node Comments welcome.
Thanks- John Signed-off-by: John Rose diff -puN arch/ppc64/kernel/iSeries_iommu.c~iommu_free_table_fix4 arch/ppc64/kernel/iSeries_iommu.c --- 2_6_ketchup/arch/ppc64/kernel/iSeries_iommu.c~iommu_free_table_fix4 2004-11-04 15:22:10.000000000 -0600 +++ 2_6_ketchup-johnrose/arch/ppc64/kernel/iSeries_iommu.c 2004-11-04 15:22:10.000000000 -0600 @@ -171,7 +171,7 @@ static void iommu_table_getparms(struct } -void iommu_devnode_init(struct iSeries_Device_Node *dn) { +void iommu_devnode_init_iSeries(struct iSeries_Device_Node *dn) { struct iommu_table *tbl; tbl = (struct iommu_table *)kmalloc(sizeof(struct iommu_table), GFP_KERNEL); diff -puN arch/ppc64/kernel/iSeries_pci.c~iommu_free_table_fix4 arch/ppc64/kernel/iSeries_pci.c --- 2_6_ketchup/arch/ppc64/kernel/iSeries_pci.c~iommu_free_table_fix4 2004-11-04 15:22:10.000000000 -0600 +++ 2_6_ketchup-johnrose/arch/ppc64/kernel/iSeries_pci.c 2004-11-04 15:22:10.000000000 -0600 @@ -329,7 +329,7 @@ void __init iSeries_pci_final_fixup(void iSeries_Device_Information(pdev, Buffer, sizeof(Buffer)); printk("%d. 
%s\n", DeviceCount, Buffer); - iommu_devnode_init(node); + iommu_devnode_init_iSeries(node); } else printk("PCI: Device Tree not found for 0x%016lX\n", (unsigned long)pdev); diff -puN arch/ppc64/kernel/iommu.c~iommu_free_table_fix4 arch/ppc64/kernel/iommu.c --- 2_6_ketchup/arch/ppc64/kernel/iommu.c~iommu_free_table_fix4 2004-11-04 15:22:10.000000000 -0600 +++ 2_6_ketchup-johnrose/arch/ppc64/kernel/iommu.c 2004-11-04 15:22:10.000000000 -0600 @@ -425,6 +425,39 @@ struct iommu_table *iommu_init_table(str return tbl; } +void iommu_free_table(struct device_node *dn) +{ + struct iommu_table *tbl = dn->iommu_table; + unsigned long bitmap_sz, i; + unsigned int order; + + if (!tbl || !tbl->it_map) { + printk(KERN_ERR "%s: expected TCE map for %s\n", __FUNCTION__, + dn->full_name); + return; + } + + /* verify that table contains no entries */ + /* it_mapsize is in entries, and we're examining 64 at a time */ + for (i = 0; i < (tbl->it_mapsize/64); i++) { + if (tbl->it_map[i] != 0) { + printk(KERN_WARNING "%s: Unexpected TCEs for %s\n", + __FUNCTION__, dn->full_name); + break; + } + } + + /* calculate bitmap size in bytes */ + bitmap_sz = (tbl->it_mapsize + 7) / 8; + + /* free bitmap */ + order = get_order(bitmap_sz); + free_pages((unsigned long) tbl->it_map, order); + + /* free table */ + kfree(tbl); +} + /* Creates TCEs for a user provided buffer. The user buffer must be * contiguous real kernel storage (not vmalloc). The address of the buffer * passed here is the kernel (virtual) address of the buffer. 
The buffer diff -puN arch/ppc64/kernel/pSeries_iommu.c~iommu_free_table_fix4 arch/ppc64/kernel/pSeries_iommu.c --- 2_6_ketchup/arch/ppc64/kernel/pSeries_iommu.c~iommu_free_table_fix4 2004-11-04 15:22:10.000000000 -0600 +++ 2_6_ketchup-johnrose/arch/ppc64/kernel/pSeries_iommu.c 2004-11-04 15:22:10.000000000 -0600 @@ -276,7 +276,7 @@ static void iommu_buses_init(void) first_phb = 0; for (dn = first_dn; dn != NULL; dn = dn->sibling) - iommu_devnode_init(dn); + iommu_devnode_init_pSeries(dn); } } @@ -298,7 +298,7 @@ static void iommu_buses_init_lpar(struct * Do it now because iommu_table_setparms_lpar needs it. */ busdn->bussubno = bus->number; - iommu_devnode_init(busdn); + iommu_devnode_init_pSeries(busdn); } /* look for a window on a bridge even if the PHB had one */ @@ -397,7 +397,7 @@ static void iommu_table_setparms_lpar(st } -void iommu_devnode_init(struct device_node *dn) +void iommu_devnode_init_pSeries(struct device_node *dn) { struct iommu_table *tbl; @@ -412,39 +412,6 @@ void iommu_devnode_init(struct device_no dn->iommu_table = iommu_init_table(tbl); } -void iommu_free_table(struct device_node *dn) -{ - struct iommu_table *tbl = dn->iommu_table; - unsigned long bitmap_sz, i; - unsigned int order; - - if (!tbl || !tbl->it_map) { - printk(KERN_ERR "%s: expected TCE map for %s\n", __FUNCTION__, - dn->full_name); - return; - } - - /* verify that table contains no entries */ - /* it_mapsize is in entries, and we're examining 64 at a time */ - for (i = 0; i < (tbl->it_mapsize/64); i++) { - if (tbl->it_map[i] != 0) { - printk(KERN_WARNING "%s: Unexpected TCEs for %s\n", - __FUNCTION__, dn->full_name); - break; - } - } - - /* calculate bitmap size in bytes */ - bitmap_sz = (tbl->it_mapsize + 7) / 8; - - /* free bitmap */ - order = get_order(bitmap_sz); - free_pages((unsigned long) tbl->it_map, order); - - /* free table */ - kfree(tbl); -} - void iommu_setup_pSeries(void) { struct pci_dev *dev = NULL; @@ -469,7 +436,6 @@ void iommu_setup_pSeries(void) } } - /* 
These are called very early. */ void tce_init_pSeries(void) { diff -puN arch/ppc64/kernel/prom.c~iommu_free_table_fix4 arch/ppc64/kernel/prom.c --- 2_6_ketchup/arch/ppc64/kernel/prom.c~iommu_free_table_fix4 2004-11-04 15:22:10.000000000 -0600 +++ 2_6_ketchup-johnrose/arch/ppc64/kernel/prom.c 2004-11-04 15:22:10.000000000 -0600 @@ -1740,7 +1740,7 @@ static int of_finish_dynamic_node(struct if (strcmp(node->name, "pci") == 0 && get_property(node, "ibm,dma-window", NULL)) { node->bussubno = node->busno; - iommu_devnode_init(node); + iommu_devnode_init_pSeries(node); } else node->iommu_table = parent->iommu_table; #endif /* CONFIG_PPC_PSERIES */ @@ -1802,6 +1802,15 @@ int of_add_node(const char *path, struct } /* + * Prepare an OF node for removal from system + */ +static void of_cleanup_node(struct device_node *np) +{ + if (np->iommu_table && get_property(np, "ibm,dma-window", NULL)) + iommu_free_table(np); +} + +/* * Remove an OF device node from the system. * Caller should have already "gotten" np. */ @@ -1818,13 +1827,7 @@ int of_remove_node(struct device_node *n return -EBUSY; } - /* XXX This is a layering violation, should be moved to the caller - * --BenH. - */ -#ifdef CONFIG_PPC_PSERIES - if (np->iommu_table) - iommu_free_table(np); -#endif /* CONFIG_PPC_PSERIES */ + of_cleanup_node(np); write_lock(&devtree_lock); OF_MARK_STALE(np); diff -puN include/asm-ppc64/iommu.h~iommu_free_table_fix4 include/asm-ppc64/iommu.h --- 2_6_ketchup/include/asm-ppc64/iommu.h~iommu_free_table_fix4 2004-11-04 15:22:10.000000000 -0600 +++ 2_6_ketchup-johnrose/include/asm-ppc64/iommu.h 2004-11-04 15:22:10.000000000 -0600 @@ -110,22 +110,18 @@ struct scatterlist; extern void iommu_setup_pSeries(void); extern void iommu_setup_u3(void); -/* Creates table for an individual device node */ -/* XXX: This isn't generic, please name it accordingly or add - * some ppc_md. hooks for iommu implementations to do what they - * need to do. --BenH. 
- */ -extern void iommu_devnode_init(struct device_node *dn); - /* Frees table for an individual device node */ -/* XXX: This isn't generic, please name it accordingly or add - * some ppc_md. hooks for iommu implementations to do what they - * need to do. --BenH. - */ extern void iommu_free_table(struct device_node *dn); #endif /* CONFIG_PPC_MULTIPLATFORM */ +#ifdef CONFIG_PPC_PSERIES + +/* Creates table for an individual device node */ +extern void iommu_devnode_init_pSeries(struct device_node *dn); + +#endif /* CONFIG_PPC_PSERIES */ + #ifdef CONFIG_PPC_ISERIES /* Walks all buses and creates iommu tables */ @@ -136,7 +132,7 @@ extern void __init iommu_vio_init(void); struct iSeries_Device_Node; /* Creates table for an individual device node */ -extern void iommu_devnode_init(struct iSeries_Device_Node *dn); +extern void iommu_devnode_init_iSeries(struct iSeries_Device_Node *dn); #endif /* CONFIG_PPC_ISERIES */ _ From anton at samba.org Fri Nov 5 16:09:33 2004 From: anton at samba.org (Anton Blanchard) Date: Fri, 5 Nov 2004 16:09:33 +1100 Subject: RTAS error log sequence numbers In-Reply-To: <418A61FE.7030500@austin.ibm.com> References: <20041104155032.GB1268@krispykreme.ozlabs.ibm.com> <20041104163759.GR10026@austin.ibm.com> <418A61FE.7030500@austin.ibm.com> Message-ID: <20041105050933.GC8470@krispykreme.ozlabs.ibm.com> > Moving this to earlier in the boot sequence would be nice but I'm not > sure it's worth the effort. Is there any way to guarantee that this is > done before anyone calls log_error()? Since the userspace tools can handle it, I'm OK to ignore the issue. Anton From l_indien at magic.fr Fri Nov 5 23:14:03 2004 From: l_indien at magic.fr (J. Mayer) Date: Fri, 05 Nov 2004 13:14:03 +0100 Subject: Booting Imac G5 Message-ID: <1099656843.8346.7.camel@rapid> Hi, I have a new Imac G5 and I made Linux boot on it. Here's a patch proposal to get the Sungem ethernet device, the firewire and the IDE controller recognized. 
There still are major issues: - serial ATA freezes during disc probe - the RTC isn't recognized - of course, there is no power / fan management. My patch is a very minimal one which made me able to boot from CDROM and firewire disk drive, that's a start ;-) Note that this patch was originally done against the gentoo version of linux-2.6.8 but applies well against kernel.org 2.6.9. I'll try to take a look and solve the SATA issue during this week-end. Regards. -- J. Mayer Never organized -------------- next part -------------- A non-text attachment was scrubbed... Name: linux-2.6.8-gentoo.diff Type: text/x-patch Size: 2150 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041105/3350b11d/attachment.bin From olh at suse.de Sat Nov 6 09:03:41 2004 From: olh at suse.de (Olaf Hering) Date: Fri, 5 Nov 2004 23:03:41 +0100 Subject: [PATCH] call ibm,os-term only if its available Message-ID: <20041105220341.GA28064@suse.de> The rtas property 'ibm,os-term' is not available on JS20, a panic will print: unable to mount root filesystem on /dev/hda Kernel panic - not syncing: Attempted to kill init! <0>ibm,os-term call failed -1 Rebooting in 42 seconds.. Signed-off-by: Olaf Hering diff -purN linux-2.6.10-rc1-bk15.orig/arch/ppc64/kernel/rtas.c linux-2.6.10-rc1-bk15.ibm,os-term/arch/ppc64/kernel/rtas.c --- linux-2.6.10-rc1-bk15.orig/arch/ppc64/kernel/rtas.c 2004-11-05 14:52:14.747905961 +0100 +++ linux-2.6.10-rc1-bk15.ibm,os-term/arch/ppc64/kernel/rtas.c 2004-11-05 23:00:10.581515367 +0100 @@ -439,6 +439,9 @@ void rtas_os_term(char *str) { int status; + if (RTAS_UNKNOWN_SERVICE == rtas_token("ibm,os-term")) + return; + snprintf(rtas_os_term_buf, 2048, "OS panic: %s", str); do { -- USB is for mice, FireWire is for men! 
sUse lINUX ag, nÜRNBERG From benh at kernel.crashing.org Sat Nov 6 11:56:05 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 06 Nov 2004 11:56:05 +1100 Subject: Booting Imac G5 In-Reply-To: <1099656843.8346.7.camel@rapid> References: <1099656843.8346.7.camel@rapid> Message-ID: <1099702566.3946.49.camel@gaston> On Fri, 2004-11-05 at 13:14 +0100, J. Mayer wrote: > Hi, > > I have a new Imac G5 and I made Linux boot on it. Here's a patch > proposal to get the Sungem ethernet device, the firewire and the IDE > controller recognized. > There still are major issues: > - serial ATA freezes during disc probe > - the RTC isn't recognized > - of course, there is no power / fan management. > My patch is a very minimal one which made me able to boot from CDROM and > firewire disk drive, that's a start ;-) > Note that this patch was originally done against the gentoo version of > linux-2.6.8 but applies well against kernel.org 2.6.9. > I'll try to take a look and solve the SATA issue during this week-end. Nice ! Did you submit the new PCI IDs to the online database too ? http://pciids.sourceforge.net/ Ben. From l_indien at magic.fr Sat Nov 6 22:28:56 2004 From: l_indien at magic.fr (J. Mayer) Date: Sat, 06 Nov 2004 12:28:56 +0100 Subject: Booting Imac G5 In-Reply-To: <1099702566.3946.49.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> Message-ID: <1099740535.8346.33.camel@rapid> On Sat, 2004-11-06 at 01:56, Benjamin Herrenschmidt wrote: > On Fri, 2004-11-05 at 13:14 +0100, J. Mayer wrote: > > Hi, > > > > I have a new Imac G5 and I made Linux boot on it. Here's a patch > > proposal to get the Sungem ethernet device, the firewire and the IDE > > controller recognized. > > There still are major issues: > > - serial ATA freezes during disc probe > > - the RTC isn't recognized > > - of course, there is no power / fan management. 
> > My patch is a very minimal one which made me able to boot from CDROM and > > firewire disk drive, that's a start ;-) > > Note that this patch was originally done against the gentoo version of > > linux-2.6.8 but applies well against kernel.org 2.6.9. > > I'll try to take a look and solve the SATA issue during this week-end. > > Nice ! Did you submit the new PCI IDs to the online database too ? I just did, now that you remind me of it ;-) I also added the two following devices that I just identified, using /proc/device-tree to locate them: 004f Shasta Mac I/O 0058 U3 AGP bridge Regards. -- J. Mayer Never organized From l_indien at magic.fr Sun Nov 7 06:25:23 2004 From: l_indien at magic.fr (J. Mayer) Date: Sat, 06 Nov 2004 20:25:23 +0100 Subject: Booting Imac G5 In-Reply-To: <1099702566.3946.49.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> Message-ID: <1099769123.8346.41.camel@rapid> On Sat, 2004-11-06 at 01:56, Benjamin Herrenschmidt wrote: > On Fri, 2004-11-05 at 13:14 +0100, J. Mayer wrote: > > Hi, > > > > I have a new Imac G5 and I made Linux boot on it. Here's a patch > > proposal to get the Sungem ethernet device, the firewire and the IDE > > controller recognized. > > There still are major issues: > > - serial ATA freezes during disc probe > > - the RTC isn't recognized > > - of course, there is no power / fan management. > > My patch is a very minimal one which made me able to boot from CDROM and > > firewire disk drive, that's a start ;-) > > Note that this patch was originally done against the gentoo version of > > linux-2.6.8 but applies well against kernel.org 2.6.9. > > I'll try to take a look and solve the SATA issue during this week-end. > Hi again, as I can see you wrote the SATA driver for Pmac, you may have an idea of what is going wrong on the Imac. I did activate DPRINTK and VPRINTK in libata and added a few messages. It seems that the SET_FEATURES command never completes. 
So the insmod stays blocked but the machine is still fully usable from another shell. I attach here the complete dmesg I got when booting. Please note that the message: "ata_dev_set_xfermode: qc_issue xfer_mode=12" used to be "... xfer=70" (note the printk I added is decimal) but I tried to force it to the xfer_mode I saw from ata_host_set_pio trace, and it changed nothing. Regards. -- J. Mayer Never organized -------------- next part -------------- Found initrd at 0xc000000001b00000:0xc000000001b76aad trying to initialize btext ... Starting Linux PPC64 2.6.9 ----------------------------------------------------- naca = 0xc000000000004000 naca->pftSize = 0x17 naca->debug_switch = 0x0 naca->interrupt_controller = 0x1 systemcfg = 0xc000000000005000 systemcfg->processorCount = 0x0 systemcfg->physicalMemorySize = 0x20000000 systemcfg->dCacheL1LineSize = 0x80 systemcfg->iCacheL1LineSize = 0x80 htab_data.htab = 0xc00000001f800000 htab_data.num_ptegs = 0x10000 ----------------------------------------------------- [boot]0100 MM Init [boot]0100 MM Init Done Linux version 2.6.9 (root at imac) (gcc version 3.4.1 20040803 (Gentoo Linux 3.4.1-r3, ssp-3.4-2, pie-8.7.6.5)) #6 Sat Nov 6 19:21:12 CET 2004 [boot]0012 Setup Arch Using native/NAP idle loop Found U3 memory controller & host bridge, revision: 57 Mapped at 0xe000000080152000 Found a K2 mac-io controller, rev: 0, mapped at 0xe000000080193000 PowerMac motherboard: IMac G5 nvram: Checking bank 0... nvram: gen0=118, gen1=117 nvram: Active bank is: 0 Adding PCI host bridge /pci at 0,f0000000 Found U3-AGP PCI host bridge. Firmware bus number: 240->255 Adding PCI host bridge /ht at 0,f2000000 Can't get bus-range for /ht at 0,f2000000, assume bus 0 U3/HT: hole, 0 end at 8fffffff, 1 start at b0000000 Found U3-HT PCI host bridge. 
Firmware bus number: 0->239 Can't get bus-range for /ht at 0,f2000000 PCI Host 0, io start: fffffffffd800000; io end: fffffffffdffffff PCI Host 1, io start: 0; io end: 3fffff Top of RAM: 0x20000000, Total RAM: 0x20000000 Memory hole size: 0MB On node 0 totalpages: 131072 DMA zone: 131072 pages, LIFO batch:16 Normal zone: 0 pages, LIFO batch:1 HighMem zone: 0 pages, LIFO batch:1 [boot]0015 Setup Done Built 1 zonelists Kernel command line: root=/dev/ram rw ramdisk_size=11000 init=/linuxrc devfs real_root=/dev/scsi/host0/bus0/target0/lun0/part14 devf real_root=/dev/scsi/host0/bus0/target0/lun0/part14 PowerMac using OpenPIC irq controller at 0x80040000 [boot]0020 OpenPic Init OpenPIC Version 1.2 (4 CPUs and 124 IRQ sources) at e000000082e1c000 [boot]0025 OpenPic Done Slave OpenPIC at 0xf8040000 hooked on IRQ 96 [boot]0020 OpenPic U3 Init OpenPIC (U3) Version 1.2 [boot]0025 OpenPic U3 Done PID hash table entries: 4096 (order: 12, 131072 bytes) time_init: decrementer frequency = 33.333333 MHz Console: colour dummy device 80x25 Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes) Inode-cache hash table entries: 65536 (order: 7, 524288 bytes) Memory: 500480k/524288k available (3244k kernel code, 23472k reserved, 1484k data, 322k bss, 164k init) Calibrating delay loop... 66.56 BogoMIPS (lpj=33280) Mount-cache hash table entries: 256 (order: 0, 4096 bytes) checking if image is initramfs...it isn't (no cpio magic); looks like an initrd Freeing initrd memory: 474k freed NET: Registered protocol family 16 PCI: Probing PCI hardware U3-DART: table not allocated, using direct DMA PCI: Probing PCI hardware done SCSI subsystem initialized usbcore: registered new driver usbfs usbcore: registered new driver hub nvram_init: Could not find nvram partition for nvram buffered error logging. 
devfs: 2004-01-31 Richard Gooch (rgooch at atnf.csiro.au) devfs: boot_options: 0x1 Initializing Cryptographic API Using unsupported 1440x900 NVDA,Display-A at a0008000, depth=8, pitch=1536 Console: switching to colour frame buffer device 180x56 fb0: Open Firmware frame buffer device on /pci at 0,f0000000/NVDA,Parent at 10/NVDA,Display-A at 0 RAMDISK driver initialized: 16 RAM disks of 11000K size 1024 blocksize loop: loaded (max 8 devices) sungem.c:v0.98 8/24/03 David S. Miller (davem at redhat.com) eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:0d:93:57:f6:f6 PHY ID: 4061e4, addr: 0 eth0: Found BCM5221 PHY MacIO PCI driver attached to K2 chipset Warning: no ADB interface detected Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx PCI: Enabling device: (0001:02:0d.0), cmd 2 ide0: Found Apple OHare ATA controller, bus ID 3, irq 38 Probing IDE interface ide0... hda: MATSHITADVD-R UJ-825, ATAPI CD/DVD-ROM drive hda: MDMA, cycleTime: 150, accessTime: 75, recTime: 75 hda: Set MDMA timing for mode 2, reg: 0x00221526 hda: Enabling MultiWord DMA 2 Using anticipatory io scheduler ide0 at 0xe0000000831f0000-0xe0000000831f0007,0xe0000000831f0160 on irq 38 hda: ATAPI 24X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, (U)DMA Uniform CD-ROM driver Revision: 3.20 ieee1394: Initialized config rom entry `ip1394' ohci1394: $Rev: 1223 $ Ben Collins PCI: Enabling device: (0001:02:0e.0), cmd 2 ohci1394: fw-host0: Unexpected PCI resource length of 1000! 
ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[39] MMIO=[80100000-801007ff] Max Packet=[2048] sbp2: $Rev: 1219 $ Ben Collins ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI) PCI: Enabling device: (0001:01:0b.0), cmd 2 ohci_hcd 0001:01:0b.0: NEC Corporation USB ohci_hcd 0001:01:0b.0: irq 70, pci mem e0000000831f3000 ohci_hcd 0001:01:0b.0: new USB bus registered, assigned bus number 1 hub 1-0:1.0: USB hub found hub 1-0:1.0: 3 ports detected PCI: Enabling device: (0001:01:0b.1), cmd 2 ohci_hcd 0001:01:0b.1: NEC Corporation USB (#2) ohci_hcd 0001:01:0b.1: irq 70, pci mem e0000000831f4000 ohci_hcd 0001:01:0b.1: new USB bus registered, assigned bus number 2 hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected usbcore: registered new driver hiddev usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.0:USB HID core driver mice: PS/2 mouse device common for all mice i2c /dev entries driver Found KeyWest i2c on "u3", 2 channels, stepping: 4 bits Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits NET: Registered protocol family 26 NET: Registered protocol family 2 IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 131072 bind 65536) NET: Registered protocol family 1 NET: Registered protocol family 17 RAMDISK: Compressed image found at block 0 EXT2-fs warning: checktime reached, running e2fsck is recommended VFS: Mounted root (ext2 filesystem). 
Mounted devfs on /dev Freeing unused kernel memory: 164k freed usb 1-1: new full speed USB device using address 2 hub 1-1:1.0: USB hub found hub 1-1:1.0: 3 ports detected usb 1-2: new low speed USB device using address 3 input: USB HID v1.10 Mouse [Logitech Trackball] on usb-0001:01:0b.0-2 usb 1-3: new full speed USB device using address 4 ieee1394: Node added: ID:BUS[0-00:1023] GUID[0030e000e0000e1c] ieee1394: Host added: ID:BUS[0-01:1023] GUID[000d93fffe57f6f6] scsi0 : SCSI emulation for IEEE-1394 SBP-2 Devices input: USB HID v1.11 Keyboard [05ac:1000] on usb-0001:01:0b.0-3 input: USB HID v1.11 Mouse [05ac:1000] on usb-0001:01:0b.0-3 usb 2-1: new low speed USB device using address 2 input: USB HID v1.10 Keyboard [CHICONY USB Keyboard] on usb-0001:01:0b.1-1 input,hiddev0: USB HID v1.10 Device [CHICONY USB Keyboard] on usb-0001:01:0b.1-1 usb 1-1.3: new full speed USB device using address 5 input: USB HID v1.10 Keyboard [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:01:0b.0-1.3 input: USB HID v1.10 Device [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:01:0b.0-1.3 ieee1394: sbp2: Logged into SBP-2 device ieee1394: Node 0-00:1023: Max speed [S400] - Max payload [2048] Vendor: IBM-DTLA Model: -307030 Rev: Type: Direct-Access ANSI SCSI revision: 06 SCSI device sda: 60036480 512-byte hdwr sectors (30739 MB) sda: asking for cache data failed sda: assuming drive cache: write through /dev/scsi/host0/bus0/target0/lun0: [mac] p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 Attached scsi disk sda at scsi0, channel 0, id 0, lun 0 Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0, type 0 EXT2-fs warning: checktime reached, running e2fsck is recommended ieee1394: unsolicited response packet received - no tlabel match EXT2-fs warning: checktime reached, running e2fsck is recommended EXT2-fs warning: checktime reached, running e2fsck is recommended PCI: Enabling device: (0001:01:0b.2), cmd 6 ehci_hcd 0001:01:0b.2: NEC Corporation USB 2.0 ehci_hcd 
0001:01:0b.2: irq 70, pci mem e0000000831f8000 ehci_hcd 0001:01:0b.2: new USB bus registered, assigned bus number 3 ehci_hcd 0001:01:0b.2: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10 usb 2-1: USB disconnect, address 2 hub 3-0:1.0: USB hub found hub 3-0:1.0: 5 ports detected drivers/usb/input/hid-core.c: can't resubmit intr, 0001:01:0b.1-1/input1, status -19 usb 1-1: USB disconnect, address 2 usb 1-1.3: USB disconnect, address 5 usb 1-2: USB disconnect, address 3 usb 1-3: USB disconnect, address 4 usb 1-1: new full speed USB device using address 6 hub 1-1:1.0: USB hub found hub 1-1:1.0: 3 ports detected usb 1-2: new low speed USB device using address 7 input: USB HID v1.10 Mouse [Logitech Trackball] on usb-0001:01:0b.0-2 usb 1-3: new full speed USB device using address 8 input: USB HID v1.11 Keyboard [05ac:1000] on usb-0001:01:0b.0-3 input: USB HID v1.11 Mouse [05ac:1000] on usb-0001:01:0b.0-3 usb 2-1: new low speed USB device using address 3 input: USB HID v1.10 Keyboard [CHICONY USB Keyboard] on usb-0001:01:0b.1-1 input,hiddev0: USB HID v1.10 Device [CHICONY USB Keyboard] on usb-0001:01:0b.1-1 usb 1-1.3: new full speed USB device using address 9 input: USB HID v1.10 Keyboard [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:01:0b.0-1.3 input: USB HID v1.10 Device [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:01:0b.0-1.3 PHY ID: 4061e4, addr: 0 NET: Registered protocol family 10 Disabled Privacy Extensions on device c00000000048a7a8(lo) IPv6 over IPv4 tunneling driver eth0: Link is up at 100 Mbps, full-duplex. eth0: Pause is disabled hda: MDMA, cycleTime: 150, accessTime: 75, recTime: 75 hda: Set MDMA timing for mode 2, reg: 0x00221526 hda: Enabling MultiWord DMA 2 libata version 1.02 loaded. 
sata_svw version 1.04 ata_device_add: ENTER ata_host_add: ENTER ata_port_start: prd alloc, virt c000000012a7f000, dma 12a7f000 ata1: SATA max UDMA/133 cmd 0xE0000000831F9000 ctl 0xE0000000831F9020 bmdma 0xE0000000831F9030 irq 0 ata_host_add: ENTER ata_port_start: prd alloc, virt c000000012a73000, dma 12a73000 ata2: SATA max UDMA/133 cmd 0xE0000000831F9100 ctl 0xE0000000831F9120 bmdma 0xE0000000831F9130 irq 0 ata_host_add: ENTER ata_port_start: prd alloc, virt c00000001291b000, dma 1291b000 ata3: SATA max UDMA/133 cmd 0xE0000000831F9200 ctl 0xE0000000831F9220 bmdma 0xE0000000831F9230 irq 0 ata_host_add: ENTER ata_port_start: prd alloc, virt c000000011430000, dma 11430000 ata4: SATA max UDMA/133 cmd 0xE0000000831F9300 ctl 0xE0000000831F9320 bmdma 0xE0000000831F9330 irq 0 ata_device_add: probe begin ata_device_add: ata1: probe begin ata_bus_reset: ENTER, host 1, port 0 ata_dev_classify: found ATA device by sig ata_bus_reset: EXIT ata_dev_identify: ENTER, host 1, dev 0 ata_dev_select: ENTER, ata1: device 0, wait 1 ata_dev_identify: do ATA identify ata_sg_setup_one: mapped buffer of 512 bytes for read ata_fill_sg: PRD[0] = (0x12A7B3C0, 0x200) ata_dev_select: ENTER, ata1: device 0, wait 1 ata_exec_command_mmio: ata1: cmd 0xEC ata_pio_sector: data read ata_sg_clean: unmapping 1 sg elements ata_qc_complete: EXIT ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:007f ata_dump_id: 49==0x2f00 53==0x0007 63==0x0407 64==0x0003 75==0x0000 ata_dump_id: 80==0x007e 81==0x001b 82==0x346b 83==0x7d01 84==0x4003 ata_dump_id: 88==0x007f 93==0x0000 ata1: dev 0 ATA, max UDMA/133, 156301488 sectors: lba48 ata_dev_identify: EXIT, drv_stat = 0x50 ata_dev_identify: ENTER/EXIT (host 1, dev 1) -- nodev ata_host_set_pio: base 0x8 xfer_mode 0xc mask 0x1f x 4 ata_dev_set_xfermode: set features - xfer mode ata_dev_set_xfermode: qc_issue xfer_mode=12 ata_dev_select: ENTER, ata1: device 0, wait 1 ata_exec_command_mmio: ata1: cmd 0xEF ata_dev_set_xfermode: wait for completion 
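[Editorial note] The decimal values mentioned in the message above (xfer_mode=12 and the earlier 70) are easier to read once decoded. The following is a minimal sketch, assuming the standard ATA SET FEATURES transfer-mode encoding (0x08|n for PIO with flow control, 0x20|n for Multiword DMA, 0x40|n for Ultra DMA). The helper name xfer_mode_name is hypothetical and is not part of libata.

```c
#include <stdio.h>
#include <string.h>

/*
 * Decode a SET FEATURES (subcommand 0x03) transfer-mode value into a
 * readable name. Hypothetical helper for illustration only.
 *   0x08 | n : PIO mode n (flow control)
 *   0x20 | n : Multiword DMA mode n
 *   0x40 | n : Ultra DMA mode n
 */
static const char *xfer_mode_name(unsigned int mode, char *buf, size_t len)
{
	if (mode >= 0x40 && mode <= 0x47)
		snprintf(buf, len, "UDMA%u", mode & 0x07);
	else if (mode >= 0x20 && mode <= 0x27)
		snprintf(buf, len, "MWDMA%u", mode & 0x07);
	else if (mode >= 0x08 && mode <= 0x0f)
		snprintf(buf, len, "PIO%u", mode & 0x07);
	else
		snprintf(buf, len, "unknown(0x%02x)", mode);
	return buf;
}
```

Under that assumption, decimal 12 (0x0c) is PIO4, which matches the "xfer_mode 0xc" printed by ata_host_set_pio in the log, and decimal 70 (0x46) is UDMA6.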
From benh at kernel.crashing.org Sun Nov 7 07:50:42 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 07 Nov 2004 07:50:42 +1100 Subject: Booting Imac G5 In-Reply-To: <1099769123.8346.41.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> Message-ID: <1099774242.10262.99.camel@gaston> > as I can see you wrote the SATA driver for Pmac, you may have an idea of > what is going wrong on the Imac. > I did activate DPRINTK and VPRINTK in libata and added a few messages. > It seems that the SET_FEATURES command never completes. So the insmod > stays blocked but the machine is still fully usable from another shell. > I attach here the complete dmesg I got when booting. > Please note that the message: > "ata_dev_set_xfermode: qc_issue xfer_mode=12" used to be "... xfer=70" > (note the printk I added is decimal) but I tried to force it to the > xfer_mode I saw from ata_host_set_pio trace, and it changed nothing. Difficult to say at this point... you can try not resetting the PHY for now (remove ATA_FLAG_SATA_RESET from host_flags). Did you look at Darwin code for anything that may have changed ? Also, I think you can try lowering the max DMA speed: probe_ent->udma_mask = 0x7f; to probe_ent->udma_mask = 0x3f; Ben. From benh at kernel.crashing.org Sun Nov 7 10:51:21 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 07 Nov 2004 10:51:21 +1100 Subject: Booting Imac G5 In-Reply-To: <1099774242.10262.99.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> Message-ID: <1099785081.5295.114.camel@gaston> Ok, a new Darwin is out and the driver there has some additional bits, related to the SATA cell. I'm hacking together a patch. Ben. 
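[Editorial note] The udma_mask change suggested in this thread (0x7f to 0x3f) can be read as lowering the highest permitted Ultra DMA mode by one. A minimal sketch, assuming the usual convention that bit n of the mask enables UDMA mode n; the helper name highest_udma_mode is hypothetical and does not exist in libata.

```c
#include <stdio.h>

/*
 * Return the highest UDMA mode enabled in a udma_mask-style bitmask,
 * assuming bit n set means "UDMA mode n allowed". Returns -1 for an
 * empty mask. Hypothetical helper for illustration only.
 */
static int highest_udma_mode(unsigned int udma_mask)
{
	int mode = -1;

	while (udma_mask) {
		mode++;
		udma_mask >>= 1;
	}
	return mode;
}
```

Under that assumption, 0x7f allows up to UDMA6 (the "UDMA/133" shown in the probe messages) and 0x3f caps the controller at UDMA5 (UDMA/100).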
From benh at kernel.crashing.org Sun Nov 7 11:32:48 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 07 Nov 2004 11:32:48 +1100 Subject: Booting Imac G5 In-Reply-To: <1099785081.5295.114.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> Message-ID: <1099787569.3946.116.camel@gaston> On Sun, 2004-11-07 at 10:51 +1100, Benjamin Herrenschmidt wrote: > Ok, a new Darwin is out and the driver there has some additional bits, > related to the SATA cell. I'm hacking together a patch. Index: linux-work/drivers/scsi/sata_svw.c =================================================================== --- linux-work.orig/drivers/scsi/sata_svw.c 2004-10-13 09:02:05.000000000 +1000 +++ linux-work/drivers/scsi/sata_svw.c 2004-11-07 11:23:41.945588808 +1100 @@ -49,7 +49,7 @@ #endif /* CONFIG_PPC_OF */ #define DRV_NAME "sata_svw" -#define DRV_VERSION "1.04" +#define DRV_VERSION "1.05" /* Taskfile registers offsets */ #define K2_SATA_TF_CMD_OFFSET 0x00 @@ -75,10 +75,19 @@ #define K2_SATA_SICR1_OFFSET 0x80 #define K2_SATA_SICR2_OFFSET 0x84 #define K2_SATA_SIM_OFFSET 0x88 +#define K2_SATA_MDIO_ACCESS 0x8c /* Port stride */ #define K2_SATA_PORT_OFFSET 0x100 +/* Private structure */ +struct k2_sata_priv +{ +#ifdef CONFIG_PPC_OF + struct device_node *of_node; +#endif + int need_mdio_phy_reset; +}; static u32 k2_sata_scr_read (struct ata_port *ap, unsigned int sc_reg) { @@ -96,6 +105,42 @@ writel(val, (void *) ap->ioaddr.scr_addr + (sc_reg * 4)); } +static u16 k2_sata_mdio_read(struct ata_host_set *host, int reg) +{ + u16 val; + int timeout; + + writel(host_set->mmio_base + K2_SATA_MDIO_ACCESS, + (reg & 0x1f) | 0x4000); + for(timeout = 10000; timeout > 0; timeout++) { + val = readl(host_set->mmio_base + K2_SATA_MDIO_ACCESS); + if (val & 0x8000) + break; + udelay(100); + } + if (timeout <= 0) { + printk(KERN_WARNING "sata_svw: 
timeout reading MDIO reg %d\n", reg); + return 0xffff; + } + return val >> 16; +} + +static void k2_sata_mdio_write(struct ata_host_set *host, int reg, u16 val) +{ + u16 val; + int timeout; + + writel(host_set->mmio_base + K2_SATA_MDIO_ACCESS, + (reg & 0x1f) | (((u32)val) << 16) | 0x2000); + for(timeout = 10000; timeout > 0; timeout++) { + val = readl(host_set->mmio_base + K2_SATA_MDIO_ACCESS); + if (val & 0x8000) + break; + udelay(100); + } + if (timeout <= 0) + printk(KERN_WARNING "sata_svw: timeout writing MDIO reg %d\n", reg); +} static void k2_sata_tf_load(struct ata_port *ap, struct ata_taskfile *tf) { @@ -220,6 +265,31 @@ return readl((void *) ap->ioaddr.status_addr); } +static void k2_sata_mdio_phy_reset(struct ata_host_set *host_set); +{ + u16 reg; + + reg = k2_sata_mdio_read(host_set, 4); + k2_sata_mdio_write(host_set, 4, reg | 0x0008); + udelay(200); + k2_sata_mdio_write(host_set, 4, reg); + udelay(250); +} + +static void k2_sata_host_start(struct ata_host_set *host_set) +{ + struct k2_sata_priv *pp; + + pp = host_set->private_data; + + /* Some cell revs need a HW reset of the PHY layer at this point, and + * on wakeup from power management + */ + if (pp->need_mdio_phy_reset) + k2_sata_mdio_phy_reset(host_set); +} + + #ifdef CONFIG_PPC_OF /* * k2_sata_proc_info @@ -237,15 +307,15 @@ { struct ata_port *ap; struct device_node *np; + struct k2_sata_priv *pp; int len, index; /* Find the ata_port */ ap = (struct ata_port *) &shost->hostdata[0]; if (ap == NULL) return 0; - - /* Find the OF node for the PCI device proper */ - np = pci_device_to_OF_node(ap->host_set->pdev); + pp = ap->host_set->private_data; + np = pp->of_node; if (np == NULL) return 0; @@ -310,6 +380,7 @@ .scr_write = k2_sata_scr_write, .port_start = ata_port_start, .port_stop = ata_port_stop, + .host_start = k2_sata_host_start, }; static void k2_sata_setup_port(struct ata_ioports *port, unsigned long base) @@ -338,6 +409,7 @@ struct ata_probe_ent *probe_ent = NULL; unsigned long base; void 
*mmio_base; + struct k2_sata_priv *pp = NULL; int rc; if (!printed_version++) @@ -374,10 +446,31 @@ rc = -ENOMEM; goto err_out_regions; } - memset(probe_ent, 0, sizeof(*probe_ent)); + + pp = (struct k2_sata_priv *)kmalloc(sizeof(struct k2_sata_priv), GFP_KERNEL); + if (pp == NULL) { + rc = -ENOMEM; + goto err_out_free_ent; + } + memset(pp, 0, sizeof(struct k2_sata_priv)); + probe_ent->pdev = pdev; INIT_LIST_HEAD(&probe_ent->node); + probe_ent->private_data = pdev; + +#ifdef CONFIG_PPC_OF + /* Find the OF node for the PCI device proper */ + pp->of_node = pci_device_to_OF_node(ap->host_set->pdev); + + /* Check for revision 1 */ + if (pp->of_node) { + u32 *rev; + rev = (u32 *)get_property(pp->of_node, "cell-revision", NULL); + if (rev && (*rev) > 0) + pp->need_mdio_phy_reset = 1; + } +#endif /* CONFIG_PPC_OF */ mmio_base = ioremap(pci_resource_start(pdev, 5), pci_resource_len(pdev, 5)); @@ -429,7 +522,10 @@ return 0; err_out_free_ent: - kfree(probe_ent); + if (pp) + kfree(pp); + if (probe_ent) + kfree(probe_ent); err_out_regions: pci_release_regions(pdev); err_out: Index: linux-work/drivers/scsi/libata-core.c =================================================================== --- linux-work.orig/drivers/scsi/libata-core.c 2004-11-07 11:24:09.617382056 +1100 +++ linux-work/drivers/scsi/libata-core.c 2004-11-07 11:24:40.491688448 +1100 @@ -3271,6 +3271,9 @@ host_set->private_data = ent->private_data; host_set->ops = ent->port_ops; + if (host_set->ops->host_start) + host_set->ops->host_start(host_set); + /* register each port bound to this device */ for (i = 0; i < ent->n_ports; i++) { struct ata_port *ap; Index: linux-work/include/linux/libata.h =================================================================== --- linux-work.orig/include/linux/libata.h 2004-11-07 11:23:56.598361248 +1100 +++ linux-work/include/linux/libata.h 2004-11-07 11:24:54.242597992 +1100 @@ -349,6 +349,7 @@ int (*port_start) (struct ata_port *ap); void (*port_stop) (struct ata_port *ap); + void 
(*host_start) (struct ata_host_set *host_set); void (*host_stop) (struct ata_host_set *host_set); }; From benh at kernel.crashing.org Sun Nov 7 11:33:49 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 07 Nov 2004 11:33:49 +1100 Subject: Booting Imac G5 (Wrong patch !) In-Reply-To: <1099787569.3946.116.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> Message-ID: <1099787630.3884.118.camel@gaston> Oops, sent the wrong patch, here it is: Index: linux-work/drivers/scsi/sata_svw.c =================================================================== --- linux-work.orig/drivers/scsi/sata_svw.c 2004-10-13 09:02:05.000000000 +1000 +++ linux-work/drivers/scsi/sata_svw.c 2004-11-07 11:31:52.229054392 +1100 @@ -49,7 +49,7 @@ #endif /* CONFIG_PPC_OF */ #define DRV_NAME "sata_svw" -#define DRV_VERSION "1.04" +#define DRV_VERSION "1.05" /* Taskfile registers offsets */ #define K2_SATA_TF_CMD_OFFSET 0x00 @@ -75,10 +75,19 @@ #define K2_SATA_SICR1_OFFSET 0x80 #define K2_SATA_SICR2_OFFSET 0x84 #define K2_SATA_SIM_OFFSET 0x88 +#define K2_SATA_MDIO_ACCESS 0x8c /* Port stride */ #define K2_SATA_PORT_OFFSET 0x100 +/* Private structure */ +struct k2_sata_priv +{ +#ifdef CONFIG_PPC_OF + struct device_node *of_node; +#endif + int need_mdio_phy_reset; +}; static u32 k2_sata_scr_read (struct ata_port *ap, unsigned int sc_reg) { @@ -96,6 +105,41 @@ writel(val, (void *) ap->ioaddr.scr_addr + (sc_reg * 4)); } +static u16 k2_sata_mdio_read(struct ata_host_set *host_set, int reg) +{ + u32 val; + int timeout; + + writel((reg & 0x1f) | 0x4000, + host_set->mmio_base + K2_SATA_MDIO_ACCESS); + for(timeout = 10000; timeout > 0; timeout--) { + val = readl(host_set->mmio_base + K2_SATA_MDIO_ACCESS); + if (val & 0x8000) + break; + udelay(100); + } + if (timeout <= 0) { + printk(KERN_WARNING "sata_svw: 
timeout reading MDIO reg %d\n", reg); + return 0xffff; + } + return val >> 16; +} + +static void k2_sata_mdio_write(struct ata_host_set *host_set, int reg, u16 val) +{ + int timeout; + + writel((reg & 0x1f) | (((u32)val) << 16) | 0x2000, + host_set->mmio_base + K2_SATA_MDIO_ACCESS); + for(timeout = 10000; timeout > 0; timeout--) { + val = readl(host_set->mmio_base + K2_SATA_MDIO_ACCESS); + if (val & 0x8000) + break; + udelay(100); + } + if (timeout <= 0) + printk(KERN_WARNING "sata_svw: timeout writing MDIO reg %d\n", reg); +} static void k2_sata_tf_load(struct ata_port *ap, struct ata_taskfile *tf) { @@ -220,6 +264,31 @@ return readl((void *) ap->ioaddr.status_addr); } +static void k2_sata_mdio_phy_reset(struct ata_host_set *host_set) +{ + u16 reg; + + reg = k2_sata_mdio_read(host_set, 4); + k2_sata_mdio_write(host_set, 4, reg | 0x0008); + udelay(200); + k2_sata_mdio_write(host_set, 4, reg); + udelay(250); +} + +static void k2_sata_host_start(struct ata_host_set *host_set) +{ + struct k2_sata_priv *pp; + + pp = host_set->private_data; + + /* Some cell revs need a HW reset of the PHY layer at this point, and + * on wakeup from power management + */ + if (pp->need_mdio_phy_reset) + k2_sata_mdio_phy_reset(host_set); +} + + #ifdef CONFIG_PPC_OF /* * k2_sata_proc_info @@ -237,15 +306,15 @@ { struct ata_port *ap; struct device_node *np; + struct k2_sata_priv *pp; int len, index; /* Find the ata_port */ ap = (struct ata_port *) &shost->hostdata[0]; if (ap == NULL) return 0; - - /* Find the OF node for the PCI device proper */ - np = pci_device_to_OF_node(ap->host_set->pdev); + pp = ap->host_set->private_data; + np = pp->of_node; if (np == NULL) return 0; @@ -310,6 +379,7 @@ .scr_write = k2_sata_scr_write, .port_start = ata_port_start, .port_stop = ata_port_stop, + .host_start = k2_sata_host_start, }; static void k2_sata_setup_port(struct ata_ioports *port, unsigned long base) @@ -338,6 +408,7 @@ struct ata_probe_ent *probe_ent = NULL; unsigned long base; void *mmio_base; 
+ struct k2_sata_priv *pp = NULL; int rc; if (!printed_version++) @@ -374,10 +445,31 @@ rc = -ENOMEM; goto err_out_regions; } - memset(probe_ent, 0, sizeof(*probe_ent)); + + pp = (struct k2_sata_priv *)kmalloc(sizeof(struct k2_sata_priv), GFP_KERNEL); + if (pp == NULL) { + rc = -ENOMEM; + goto err_out_free_ent; + } + memset(pp, 0, sizeof(struct k2_sata_priv)); + probe_ent->pdev = pdev; INIT_LIST_HEAD(&probe_ent->node); + probe_ent->private_data = pp; + +#ifdef CONFIG_PPC_OF + /* Find the OF node for the PCI device proper */ + pp->of_node = pci_device_to_OF_node(pdev); + + /* Check for revision 1 */ + if (pp->of_node) { + u32 *rev; + rev = (u32 *)get_property(pp->of_node, "cell-revision", NULL); + if (rev && (*rev) > 0) + pp->need_mdio_phy_reset = 1; + } +#endif /* CONFIG_PPC_OF */ mmio_base = ioremap(pci_resource_start(pdev, 5), pci_resource_len(pdev, 5)); @@ -429,7 +521,10 @@ return 0; err_out_free_ent: - kfree(probe_ent); + if (pp) + kfree(pp); + if (probe_ent) + kfree(probe_ent); err_out_regions: pci_release_regions(pdev); err_out: Index: linux-work/drivers/scsi/libata-core.c =================================================================== --- linux-work.orig/drivers/scsi/libata-core.c 2004-11-07 11:24:09.617382056 +1100 +++ linux-work/drivers/scsi/libata-core.c 2004-11-07 11:24:40.491688448 +1100 @@ -3271,6 +3271,9 @@ host_set->private_data = ent->private_data; host_set->ops = ent->port_ops; + if (host_set->ops->host_start) + host_set->ops->host_start(host_set); + /* register each port bound to this device */ for (i = 0; i < ent->n_ports; i++) { struct ata_port *ap; Index: linux-work/include/linux/libata.h =================================================================== --- linux-work.orig/include/linux/libata.h 2004-11-07 11:23:56.598361248 +1100 +++ linux-work/include/linux/libata.h 2004-11-07 11:24:54.242597992 +1100 @@ -349,6 +349,7 @@ int (*port_start) (struct ata_port *ap); void (*port_stop) (struct ata_port *ap); + void (*host_start) (struct 
ata_host_set *host_set); void (*host_stop) (struct ata_host_set *host_set); }; From anton at samba.org Mon Nov 8 04:20:30 2004 From: anton at samba.org (Anton Blanchard) Date: Mon, 8 Nov 2004 04:20:30 +1100 Subject: [RFC] Consolidate lots of hugepage code In-Reply-To: <20041029034817.GY12934@holomorphy.com> References: <20041029033708.GF12247@zax> <20041029034817.GY12934@holomorphy.com> Message-ID: <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> Hi, > Further consolidation is premature given that outstanding hugetlb bugs > have the implication that architectures' needs are not being served by > the current arch/core split. I have at least two relatively major hugetlb > bugs outstanding, the lack of a flush_dcache_page() analogue first, and > another (soon to be a reported to affected distros) less well-understood. > Unless they're directly toward the end of restoring hugetlb to a sound > state, they're counterproductive to merge before patches doing so. Could you point me at a summary of these 2 issues? Anton From wli at holomorphy.com Mon Nov 8 06:20:24 2004 From: wli at holomorphy.com (William Lee Irwin III) Date: Sun, 7 Nov 2004 11:20:24 -0800 Subject: [RFC] Consolidate lots of hugepage code In-Reply-To: <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> References: <20041029033708.GF12247@zax> <20041029034817.GY12934@holomorphy.com> <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> Message-ID: <20041107192024.GM2890@holomorphy.com> At some point in the past, I wrote: >> Further consolidation is premature given that outstanding hugetlb bugs >> have the implication that architectures' needs are not being served by >> the current arch/core split. I have at least two relatively major hugetlb >> bugs outstanding, the lack of a flush_dcache_page() analogue first, and >> another (soon to be a reported to affected distros) less well-understood. 
>> Unless they're directly toward the end of restoring hugetlb to a sound >> state, they're counterproductive to merge before patches doing so. On Mon, Nov 08, 2004 at 04:20:30AM +1100, Anton Blanchard wrote: > Could you point me at a summary of these 2 issues? It's all pretty obvious. The first is checking page size vs. cache size and whether it's VI or does anything unusual; thus far things look hopeful that flush_dcache_page() analogues are unnecessary. More information about Super-H is needed to wrap up what will probably be no more than an audit. The second is a triplefault on x86-64 under some condition involving a long-running database regression test. There has obviously been considerably less progress there in no small part due to the amount of time required to reproduce the issue. -- wli From anton at samba.org Mon Nov 8 06:30:07 2004 From: anton at samba.org (Anton Blanchard) Date: Mon, 8 Nov 2004 06:30:07 +1100 Subject: [RFC] Consolidate lots of hugepage code In-Reply-To: <20041107192024.GM2890@holomorphy.com> References: <20041029033708.GF12247@zax> <20041029034817.GY12934@holomorphy.com> <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> <20041107192024.GM2890@holomorphy.com> Message-ID: <20041107193007.GC16976@krispykreme.ozlabs.ibm.com> Hi, > It's all pretty obvious. The first is checking page size vs. cache size > and whether it's VI or does anything unusual; thus far things look > hopeful that flush_dcache_page() analogues are unnecessary. More > information about Super-H is needed to wrap up what will probably be no > more than an audit. Good to hear. > The second is a triplefault on x86-64 under some > condition involving a long-running database regression test. There has > obviously been considerably less progress there in no small part due to > the amount of time required to reproduce the issue. OK. We have not seen a similar issue on ppc64 even with extensive testing (although with HPC apps). 
The question is how long we should hold off on further hugetlb development waiting for this one bug report on a single architecture to be chased. Anton From wli at holomorphy.com Mon Nov 8 08:09:43 2004 From: wli at holomorphy.com (William Lee Irwin III) Date: Sun, 7 Nov 2004 13:09:43 -0800 Subject: [RFC] Consolidate lots of hugepage code In-Reply-To: <20041107193007.GC16976@krispykreme.ozlabs.ibm.com> References: <20041029033708.GF12247@zax> <20041029034817.GY12934@holomorphy.com> <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> <20041107192024.GM2890@holomorphy.com> <20041107193007.GC16976@krispykreme.ozlabs.ibm.com> Message-ID: <20041107210943.GN2890@holomorphy.com> At some point in the past, I wrote: >> The second is a triplefault on x86-64 under some >> condition involving a long-running database regression test. There has >> obviously been considerably less progress there in no small part due to >> the amount of time required to reproduce the issue. On Mon, Nov 08, 2004 at 06:30:07AM +1100, Anton Blanchard wrote: > OK. We have not seen a similar issue on ppc64 even with extensive > testing (although with HPC apps). The question is how long we should > hold off on further hugetlb development waiting for this one bug report > on a single architecture to be chased. Until it's fixed. Until then I'm considering it a byproduct of that same development. And with your report, that makes it two architectures, not one. The concepts of the features etc. are all generally okay, though very buzzword-centric. In general the audits and sweeps have been lacking thoroughness in the architecture-specific areas. I expect that particular issue to have been the cause of these two bugreports. 
-- wli From anton at samba.org Mon Nov 8 08:22:12 2004 From: anton at samba.org (Anton Blanchard) Date: Mon, 8 Nov 2004 08:22:12 +1100 Subject: [RFC] Consolidate lots of hugepage code In-Reply-To: <20041107210943.GN2890@holomorphy.com> References: <20041029033708.GF12247@zax> <20041029034817.GY12934@holomorphy.com> <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> <20041107192024.GM2890@holomorphy.com> <20041107193007.GC16976@krispykreme.ozlabs.ibm.com> <20041107210943.GN2890@holomorphy.com> Message-ID: <20041107212212.GD16976@krispykreme.ozlabs.ibm.com> > On Mon, Nov 08, 2004 at 06:30:07AM +1100, Anton Blanchard wrote: > > OK. We have not seen a similar issue on ppc64 even with extensive > > testing (although with HPC apps). The question is how long we should > > hold off on further hugetlb development waiting for this one bug report > > on a single architecture to be chased. > > Until it's fixed. Until then I'm considering it a byproduct of that same > development. And with your report, that makes it two architectures, not > one. We _arent_ seeing it on ppc64. Can we at least have a complete bug report if we are to halt all hugetlb development? At the moment we dont have much information to go on at all. Anton From wli at holomorphy.com Mon Nov 8 09:49:48 2004 From: wli at holomorphy.com (William Lee Irwin III) Date: Sun, 7 Nov 2004 14:49:48 -0800 Subject: [RFC] Consolidate lots of hugepage code In-Reply-To: <20041107212212.GD16976@krispykreme.ozlabs.ibm.com> References: <20041029033708.GF12247@zax> <20041029034817.GY12934@holomorphy.com> <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> <20041107192024.GM2890@holomorphy.com> <20041107193007.GC16976@krispykreme.ozlabs.ibm.com> <20041107210943.GN2890@holomorphy.com> <20041107212212.GD16976@krispykreme.ozlabs.ibm.com> Message-ID: <20041107224948.GO2890@holomorphy.com> At some point in the past, I wrote: >> Until it's fixed. Until then I'm considering it a byproduct of that same >> development. 
And with your report, that makes it two architectures, not >> one. On Mon, Nov 08, 2004 at 08:22:12AM +1100, Anton Blanchard wrote: > We _arent_ seeing it on ppc64. Can we at least have a complete bug > report if we are to halt all hugetlb development? At the moment we dont > have much information to go on at all. Sorry, I don't get complete bugreports myself. If you care to try to actually fix something (it's doubtful you yourself are the culprit) I'm still trying to reproduce it myself with long-running database tests. It's reliably reproducible on the reporters' machines. The particular bug is only one piece of evidence. Just asking basic questions about what was done for architecture code reveals that all this "development" is not paying proper attention to architecture code. I merely insist that development toward the end of stabilization occur prior to that for large feature work. And frankly, I'm rather unimpressed with the gravity of the proposed featurework, particularly in comparison to the stability requirements of users on typical production systems. Nor am I impressed with the quality. The patch presentations have been messy, the audits (as mentioned above) incomplete, the benefits not clearly demonstrated, and the code itself not so pretty. Just respinning the patches so they're properly incremental and the code somewhat cleaner (e.g. some recent one nested tabs 5 deep or so) would already remedy a large number of the issues with the featurework. Once arranged that way the audits' incompleteness can be dealt with by those with the fortitude to thoroughly audit and/or prior architecture knowledge to correct the patches for arches they don't deal with properly. 
-- wli From segher at kernel.crashing.org Mon Nov 8 20:27:15 2004 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Mon, 8 Nov 2004 10:27:15 +0100 Subject: Booting Imac G5 In-Reply-To: <1099702566.3946.49.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> Message-ID: <60960DF4-3168-11D9-A1A1-000A95A4DC02@kernel.crashing.org> > Nice ! Did you submit the new PCI IDs to the online database too ? > > http://pciids.sourceforge.net/ And please :-) Segher From segher at kernel.crashing.org Mon Nov 8 20:28:28 2004 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Mon, 8 Nov 2004 10:28:28 +0100 Subject: Booting Imac G5 In-Reply-To: <1099740535.8346.33.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099740535.8346.33.camel@rapid> Message-ID: <8C3E72E1-3168-11D9-A1A1-000A95A4DC02@kernel.crashing.org> > I also added the two following devices that I just identified, using > /proc/device-tree to locate them: > 004f Shasta Mac I/O > 0058 U3 AGP bridge It's U3L, instead. Segher From l_indien at magic.fr Mon Nov 8 22:14:28 2004 From: l_indien at magic.fr (J. Mayer) Date: Mon, 08 Nov 2004 12:14:28 +0100 Subject: Booting Imac G5 In-Reply-To: <60960DF4-3168-11D9-A1A1-000A95A4DC02@kernel.crashing.org> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <60960DF4-3168-11D9-A1A1-000A95A4DC02@kernel.crashing.org> Message-ID: <1099912468.8346.1121.camel@rapid> On Mon, 2004-11-08 at 10:27, Segher Boessenkool wrote: > > Nice ! Did you submit the new PCI IDs to the online database too ? > > > > http://pciids.sourceforge.net/ > > And please :-) OK, done. -- J. Mayer Never organized From l_indien at magic.fr Tue Nov 9 00:30:12 2004 From: l_indien at magic.fr (J. Mayer) Date: Mon, 08 Nov 2004 14:30:12 +0100 Subject: Booting Imac G5 (Wrong patch !) 
In-Reply-To: <1099787630.3884.118.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> Message-ID: <1099920612.8346.1460.camel@rapid> On Sun, 2004-11-07 at 01:33, Benjamin Herrenschmidt wrote: > Oops, sent the wrong patch, here it is: [...] I made a few tries; this does not help. I'm not sure, but I feel like we are missing an IRQ... I'll do more testing when I have more time. -- J. Mayer Never organized From brking at us.ibm.com Tue Nov 9 03:19:34 2004 From: brking at us.ibm.com (brking at us.ibm.com) Date: Mon, 08 Nov 2004 10:19:34 -0600 Subject: [PATCH 1/2] ppc64: Block config accesses during BIST #3 Message-ID: <200411081619.iA8GJabM014634@d03av02.boulder.ibm.com> Below is a revised patch in an attempt to share more code between iSeries and pSeries and also get full ppc64 support of the new APIs essentially for free. Some PCI adapters on pSeries and iSeries hardware (ipr scsi adapters) have an exposure today in that they issue BIST to the adapter to reset the card. If, during the time it takes to complete BIST, userspace attempts to access PCI config space, the host bus bridge will master abort the access since the ipr adapter does not respond on the PCI bus for a brief period of time when running BIST. This master abort results in the host PCI bridge isolating that PCI device from the rest of the system, making the device unusable until Linux is rebooted. This patch is an attempt to close that exposure by introducing some blocking code in the arch-specific PCI code. The intent is to have the ipr device driver invoke these routines to prevent userspace PCI accesses from occurring during this window. It has been tested by running BIST on an ipr adapter while running a script which looped reading the config space of that adapter through sysfs.
Without the patch, an EEH error occurs. With the patch there is no EEH error. Tested on Power 5. Signed-off-by: Brian King --- linux-2.6.10-rc1-bk18-bjking1/arch/ppc64/kernel/pSeries_pci.c | 2 linux-2.6.10-rc1-bk18-bjking1/arch/ppc64/kernel/pci.c | 112 +++++++++- linux-2.6.10-rc1-bk18-bjking1/arch/ppc64/kernel/pci.h | 1 linux-2.6.10-rc1-bk18-bjking1/include/asm-ppc64/pci-bridge.h | 4 linux-2.6.10-rc1-bk18-bjking1/include/asm-ppc64/pci.h | 13 + 5 files changed, 129 insertions(+), 3 deletions(-) diff -puN include/asm-ppc64/pci.h~ppc64_block_cfg_io_during_bist_revised include/asm-ppc64/pci.h --- linux-2.6.10-rc1-bk18/include/asm-ppc64/pci.h~ppc64_block_cfg_io_during_bist_revised 2004-11-08 09:32:48.000000000 -0600 +++ linux-2.6.10-rc1-bk18-bjking1/include/asm-ppc64/pci.h 2004-11-08 09:32:48.000000000 -0600 @@ -85,6 +85,7 @@ struct pci_dma_ops { }; extern struct pci_dma_ops pci_dma_ops; +extern struct pci_ops pci_ops; static inline void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size, dma_addr_t *dma_handle) @@ -244,6 +245,18 @@ extern int pci_read_irq_line(struct pci_ extern void pcibios_add_platform_entries(struct pci_dev *dev); +extern void pci_block_config_io(struct pci_dev *dev); + +extern void pci_unblock_config_io(struct pci_dev *dev); + +extern int pci_start_bist(struct pci_dev *dev); + +extern int pcibios_read_config(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *val); + +extern int pcibios_write_config(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 val); + #endif /* __KERNEL__ */ #endif /* __PPC64_PCI_H */ diff -puN arch/ppc64/kernel/pci.c~ppc64_block_cfg_io_during_bist_revised arch/ppc64/kernel/pci.c --- linux-2.6.10-rc1-bk18/arch/ppc64/kernel/pci.c~ppc64_block_cfg_io_during_bist_revised 2004-11-08 09:32:48.000000000 -0600 +++ linux-2.6.10-rc1-bk18-bjking1/arch/ppc64/kernel/pci.c 2004-11-08 09:32:48.000000000 -0600 @@ -321,7 +321,7 @@ static int __init pcibios_init(void) /* Scan all of the recorded PCI
controllers. */ list_for_each_entry_safe(hose, tmp, &hose_list, list_node) { hose->last_busno = 0xff; - bus = pci_scan_bus(hose->first_busno, hose->ops, + bus = pci_scan_bus(hose->first_busno, &pci_ops, hose->arch_data); hose->bus = bus; hose->last_busno = bus->subordinate; @@ -547,6 +547,104 @@ int pci_mmap_page_range(struct pci_dev * return ret; } +static spinlock_t config_lock = SPIN_LOCK_UNLOCKED; + +int pcibios_read_config(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *val) +{ + struct pci_controller *hose = pci_bus_to_host(bus); + unsigned long flags; + int rc = 0; + + spin_lock_irqsave(&config_lock, flags); + if (hose && !(hose->block_cfg_io_mask & (1 << PCI_SLOT(devfn)))) + rc = hose->ops->read(bus, devfn, where, size, val); + else + *val = -1; + spin_unlock_irqrestore(&config_lock, flags); + return rc; +} +EXPORT_SYMBOL(pcibios_read_config); + +int pcibios_write_config(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 val) +{ + struct pci_controller *hose = pci_bus_to_host(bus); + unsigned long flags; + int rc = 0; + + spin_lock_irqsave(&config_lock, flags); + if (hose && !(hose->block_cfg_io_mask & (1 << PCI_SLOT(devfn)))) + rc = hose->ops->write(bus, devfn, where, size, val); + spin_unlock_irqrestore(&config_lock, flags); + return rc; +} +EXPORT_SYMBOL(pcibios_write_config); + +struct pci_ops pci_ops = { + pcibios_read_config, + pcibios_write_config +}; + +/** + * pci_block_config_io - Block PCI config reads/writes + * @pdev: pci device struct + * + * This function blocks any PCI config accesses from occurring. + * When blocked, any writes will be ignored and treated as + * successful and any reads will return all 1's data. 
+ * + * Return value: + * nothing + **/ +void pci_block_config_io(struct pci_dev *pdev) +{ + struct pci_controller *hose = PCI_GET_PHB_PTR(pdev); + unsigned long flags; + + spin_lock_irqsave(&config_lock, flags); + hose->block_cfg_io_mask |= (1 << PCI_SLOT(pdev->devfn)); + spin_unlock_irqrestore(&config_lock, flags); +} +EXPORT_SYMBOL(pci_block_config_io); + +/** + * pci_unblock_config_io - Unblock PCI config reads/writes + * @pdev: pci device struct + * + * This function allows PCI config accesses to resume. + * + * Return value: + * nothing + **/ +void pci_unblock_config_io(struct pci_dev *pdev) +{ + struct pci_controller *hose = PCI_GET_PHB_PTR(pdev); + unsigned long flags; + + spin_lock_irqsave(&config_lock, flags); + hose->block_cfg_io_mask &= ~(1 << PCI_SLOT(pdev->devfn)); + spin_unlock_irqrestore(&config_lock, flags); +} +EXPORT_SYMBOL(pci_unblock_config_io); + +/** + * pci_start_bist - Start BIST on a PCI device + * @pdev: pci device struct + * + * This function allows a device driver to start BIST + * when PCI config accesses are disabled. + * + * Return value: + * return code of the underlying config space write + **/ +int pci_start_bist(struct pci_dev *pdev) +{ + struct pci_controller *hose = pci_bus_to_host(pdev->bus); + return hose->ops->write(pdev->bus, pdev->devfn, PCI_BIST, 1, PCI_BIST_START); +} +EXPORT_SYMBOL(pci_start_bist); + #ifdef CONFIG_PPC_MULTIPLATFORM static ssize_t pci_show_devspec(struct device *dev, char *buf) { @@ -852,6 +950,18 @@ struct pci_controller* pci_find_hose_for return NULL; } +struct pci_controller* pci_find_hose_for_bus(struct pci_bus *bus) +{ + while (bus) { + struct pci_controller *hose, *tmp; + list_for_each_entry_safe(hose, tmp, &hose_list, list_node) + if (hose->bus == bus) + return hose; + bus=bus->parent; + } + return NULL; +} + /* * ppc64 can have multifunction devices that do not respond to function 0. * In this case we must scan all functions.
diff -puN include/asm-ppc64/pci-bridge.h~ppc64_block_cfg_io_during_bist_revised include/asm-ppc64/pci-bridge.h --- linux-2.6.10-rc1-bk18/include/asm-ppc64/pci-bridge.h~ppc64_block_cfg_io_during_bist_revised 2004-11-08 09:32:48.000000000 -0600 +++ linux-2.6.10-rc1-bk18-bjking1/include/asm-ppc64/pci-bridge.h 2004-11-08 09:32:48.000000000 -0600 @@ -65,6 +65,7 @@ struct pci_controller { unsigned long buid; unsigned long dma_window_base_cur; unsigned long dma_window_size; + unsigned int block_cfg_io_mask; }; /* @@ -100,6 +101,7 @@ extern int pcibios_remove_root_bus(struc #define PCI_GET_DN(dev) ((struct device_node *)((dev)->sysdata)) extern void phbs_remap_io(void); +extern struct pci_controller* pci_find_hose_for_bus(struct pci_bus *bus); static inline struct pci_controller *pci_bus_to_host(struct pci_bus *bus) { @@ -113,7 +115,7 @@ static inline struct pci_controller *pci busdn = b->sysdata; } if (busdn == NULL) - return NULL; + return pci_find_hose_for_bus(bus); return busdn->phb; } diff -puN arch/ppc64/kernel/pci.h~ppc64_block_cfg_io_during_bist_revised arch/ppc64/kernel/pci.h --- linux-2.6.10-rc1-bk18/arch/ppc64/kernel/pci.h~ppc64_block_cfg_io_during_bist_revised 2004-11-08 09:32:48.000000000 -0600 +++ linux-2.6.10-rc1-bk18-bjking1/arch/ppc64/kernel/pci.h 2004-11-08 09:32:48.000000000 -0600 @@ -19,6 +19,7 @@ extern struct pci_controller* pci_alloc_ extern void pci_setup_phb_io(struct pci_controller *hose, int primary); extern struct pci_controller* pci_find_hose_for_OF_device(struct device_node* node); +extern struct pci_controller* pci_find_hose_for_bus(struct pci_bus *bus); extern void pci_setup_phb_io_dynamic(struct pci_controller *hose); diff -puN arch/ppc64/kernel/pSeries_pci.c~ppc64_block_cfg_io_during_bist_revised arch/ppc64/kernel/pSeries_pci.c --- linux-2.6.10-rc1-bk18/arch/ppc64/kernel/pSeries_pci.c~ppc64_block_cfg_io_during_bist_revised 2004-11-08 09:32:48.000000000 -0600 +++ linux-2.6.10-rc1-bk18-bjking1/arch/ppc64/kernel/pSeries_pci.c 2004-11-08 
09:32:48.000000000 -0600 @@ -434,7 +434,7 @@ struct pci_controller * __devinit init_p pci_devs_phb_init_dynamic(phb); phb->last_busno = 0xff; - bus = pci_scan_bus(phb->first_busno, phb->ops, phb->arch_data); + bus = pci_scan_bus(phb->first_busno, &pci_ops, phb->arch_data); phb->bus = bus; phb->last_busno = bus->subordinate; _ From brking at us.ibm.com Tue Nov 9 03:19:42 2004 From: brking at us.ibm.com (brking at us.ibm.com) Date: Mon, 08 Nov 2004 10:19:42 -0600 Subject: [PATCH 2/2] ipr: Block config IO during BIST (#3) Message-ID: <200411081619.iA8GJgFS000609@d03av01.boulder.ibm.com> Change ipr to use new ppc64 pci APIs to block PCI config space accesses when running BIST to prevent PCI master aborts. Signed-off-by: Brian King --- linux-2.6.10-rc1-bk18-bjking1/drivers/scsi/ipr.c | 5 ++++- linux-2.6.10-rc1-bk18-bjking1/drivers/scsi/ipr.h | 7 +++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff -puN drivers/scsi/ipr.h~ipr_block_config_io_during_bist_revised drivers/scsi/ipr.h --- linux-2.6.10-rc1-bk18/drivers/scsi/ipr.h~ipr_block_config_io_during_bist_revised 2004-11-08 09:32:53.000000000 -0600 +++ linux-2.6.10-rc1-bk18-bjking1/drivers/scsi/ipr.h 2004-11-08 09:32:53.000000000 -0600 @@ -1112,6 +1112,13 @@ __FUNCTION__, __LINE__, ioa_cfg #define ipr_remove_dump_file(kobj, attr) do { } while(0) #endif +#ifndef CONFIG_PPC64 +#define pci_block_config_io(dev) do { } while(0) +#define pci_unblock_config_io(dev) do { } while(0) +#define pci_start_bist(dev) \ + pci_write_config_byte(dev, PCI_BIST, PCI_BIST_START) +#endif + /* * Error logging macros */ diff -puN drivers/scsi/ipr.c~ipr_block_config_io_during_bist_revised drivers/scsi/ipr.c --- linux-2.6.10-rc1-bk18/drivers/scsi/ipr.c~ipr_block_config_io_during_bist_revised 2004-11-08 09:32:53.000000000 -0600 +++ linux-2.6.10-rc1-bk18-bjking1/drivers/scsi/ipr.c 2004-11-08 09:32:53.000000000 -0600 @@ -4935,6 +4935,7 @@ static int ipr_reset_restore_cfg_space(s int rc; ENTER; + pci_unblock_config_io(ioa_cfg->pdev); rc = 
pci_restore_state(ioa_cfg->pdev); if (rc != PCIBIOS_SUCCESSFUL) { @@ -4989,9 +4990,11 @@ static int ipr_reset_start_bist(struct i int rc; ENTER; - rc = pci_write_config_byte(ioa_cfg->pdev, PCI_BIST, PCI_BIST_START); + pci_block_config_io(ioa_cfg->pdev); + rc = pci_start_bist(ioa_cfg->pdev); if (rc != PCIBIOS_SUCCESSFUL) { + pci_unblock_config_io(ioa_cfg->pdev); ipr_cmd->ioasa.ioasc = cpu_to_be32(IPR_IOASC_PCI_ACCESS_ERROR); rc = IPR_RC_JOB_CONTINUE; } else { _ From ebenoit at hopevale.com Tue Nov 9 04:03:19 2004 From: ebenoit at hopevale.com (ebenoit at hopevale.com) Date: Mon, 8 Nov 2004 12:03:19 -0500 Subject: G5 two SCSI hard drive partitioning Message-ID: <1099933399.418fa6d79514e@www.hopevale.com> I am not sure if I am on the correct list, so excuse me if this does not relate. I am installing Mandrake 9.1 PPC on a G5 with two SCSI drives: the first drive is 36GB and the second is 74GB. Here is my question: How can I partition them so that I can have a /home directory of 90GB? I have tried to use LVM, but have not found enough information to set it up correctly with other partitions. Plus, it fails with a 'pvcreate failed' error message. I thought using linear RAID would do the trick, but again I am a beginner with both of these partitioning schemes. Here is what I have tried to accomplish:
bootstrap | 10MB | no mount | apple_bootstrap
root | 2GB | / | ext3
swap | 800MB | swap | Linux Swap
home | 90GB | /home | LVM
Thank you for your comments and/or suggestions, Eric ------------------------------------------------------------ Hopevale Union Free School District: http://www.hopevale.com From linas at austin.ibm.com Tue Nov 9 05:16:19 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Mon, 8 Nov 2004 12:16:19 -0600 Subject: [PATCH] PPC64 Poor assembly coding style Message-ID: <20041108181619.GT10026@austin.ibm.com> Hi, Doug Maxey reported a bug with the latest/greatest gas assembler that demonstrates some poor coding style in entry.S and head.S.
The following patch cleans up that style, and also avoids assembler confusion. Basically, in entry.S, cmpldi 0,r0,NR_syscalls should be written as either cmpldi r0,NR_syscalls or as cmpldi cr0,r0,NR_syscalls All three forms are theoretically equivalent; in practice, I find the first alternative the cleanest (and also consistent with usage elsewhere in the files). The new assembler seems to be mistaking NR_syscalls for a register number, which is clearly out of bounds (it's not in 0..31). I think it would be cleaner overall to just drop the superfluous leading cr0. There are two other confusing usages, in head.S: I propose that cmpldi cr0,r5,0 should be cmpldi r5,0 cmpld 0,r6,r5 should be cmpld r6,r5 --linas Signed-off-by: Linas Vepstas -------------- next part -------------- ===== arch/ppc64/kernel/entry.S 1.46 vs edited ===== --- 1.46/arch/ppc64/kernel/entry.S 2004-10-07 16:52:16 -05:00 +++ edited/arch/ppc64/kernel/entry.S 2004-11-08 11:45:59 -06:00 @@ -122,7 +122,7 @@ SystemCall_common: andi.
r11,r10,_TIF_SYSCALL_T_OR_A bne- syscall_dotrace syscall_dotrace_cont: - cmpldi 0,r0,NR_syscalls + cmpldi r0,NR_syscalls bge- syscall_enosys system_call: /* label this so stack traces look sane */ ===== arch/ppc64/kernel/head.S 1.81 vs edited ===== --- 1.81/arch/ppc64/kernel/head.S 2004-10-19 02:18:43 -05:00 +++ edited/arch/ppc64/kernel/head.S 2004-11-08 11:49:04 -06:00 @@ -1303,7 +1303,7 @@ _GLOBAL(__start_initialization_multiplat /* * Are we booted from a PROM Of-type client-interface ? */ - cmpldi cr0,r5,0 + cmpldi r5,0 bne .__boot_from_prom /* yes -> prom */ /* Save parameters */ @@ -1439,7 +1439,7 @@ _GLOBAL(copy_and_flush) dcbst r6,r3 /* write it to memory */ sync icbi r6,r3 /* flush the icache line */ - cmpld 0,r6,r5 + cmpld r6,r5 blt 4b sync addi r5,r5,8 From moilanen at austin.ibm.com Tue Nov 9 05:23:53 2004 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Mon, 8 Nov 2004 12:23:53 -0600 Subject: [PATCH] rtasd: window when error_log_cnt could get zeroed Message-ID: <20041108122353.5a6ea0e8@localhost> There appears to be a hole where, if we get a log_error() call, we could zero out our error log count in nvram. When rtasd() starts up, it turns on the logging via 'no_more_logging = 0'. If we get a log_error() call after that is set but before nvram_read_error_log has actually read nvram to set error_log_cnt, the log_error() call will write back to nvram an uninitialized error_log_cnt value, and wipe out our sequence number. To close the hole, simply move the 'no_more_logging = 0' to after nvram sets error_log_cnt but before pSeries_log_error is called. I also changed the 'no_more_logging' variable to be 'no_logging' since it's no longer only used when we stop logging. I also removed the "volatile" part of no_more_logging, since it's unneeded.
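The ordering window described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: every `sim_*` name is hypothetical, "NVRAM" is a plain variable, and the error report is modelled as simply bumping the in-memory counter and writing it back.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state discussed in the message above. */
static unsigned int sim_nvram_seq;      /* sequence number kept in "NVRAM" */
static unsigned int sim_error_log_cnt;  /* in-memory copy, 0 until NVRAM is read */
static int sim_no_logging = 1;          /* 1 = writes to "NVRAM" are refused */

/* Models an error report: bump the in-memory count and write it back. */
static void sim_log_error(void)
{
    if (sim_no_logging)
        return;                          /* report is not written to "NVRAM" */
    sim_nvram_seq = ++sim_error_log_cnt;
}

/* Models the daemon reading the stored sequence number at start-up. */
static void sim_read_nvram(void)
{
    sim_error_log_cnt = sim_nvram_seq;
}

/*
 * Runs the start-up sequence with one error arriving inside the window.
 * enable_first = true mimics the old ordering (logging enabled before the
 * read), false mimics the patched ordering (enabled only after the read).
 * Returns the sequence number left in "NVRAM".
 */
unsigned int sim_startup(bool enable_first, unsigned int stored_seq)
{
    sim_nvram_seq = stored_seq;
    sim_error_log_cnt = 0;
    sim_no_logging = 1;

    if (enable_first)
        sim_no_logging = 0;
    sim_log_error();                     /* error arrives in the window */
    sim_read_nvram();
    sim_no_logging = 0;
    return sim_nvram_seq;
}

/* Small self-check of both orderings. */
void sim_selfcheck(void)
{
    assert(sim_startup(true, 41) == 1);   /* old ordering: stored 41 is lost */
    assert(sim_startup(false, 41) == 41); /* patched ordering: 41 survives */
}
```

In this simplified model the early report is dropped under the patched ordering; the point of the sketch is only that the stored sequence number is no longer overwritten by a counter that was never initialised from NVRAM.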
Thanks, Jake Signed-off-by: Jake Moilanen --- diff -puN arch/ppc64/kernel/rtasd.c~rtasd-no_more_logging-race arch/ppc64/kernel/rtasd.c --- linux-2.6-bk/arch/ppc64/kernel/rtasd.c~rtasd-no_more_logging-race Mon Nov 8 11:51:11 2004 +++ linux-2.6-bk-moilanen/arch/ppc64/kernel/rtasd.c Mon Nov 8 12:19:47 2004 @@ -48,7 +48,7 @@ static unsigned int rtas_error_log_buffe static int full_rtas_msgs = 0; -extern volatile int no_more_logging; +extern int no_logging; volatile int error_log_cnt = 0; @@ -213,7 +213,7 @@ void pSeries_log_error(char *buf, unsign } /* Write error to NVRAM */ - if (!no_more_logging && !(err_type & ERR_FLAG_BOOT)) + if (!no_logging && !(err_type & ERR_FLAG_BOOT)) nvram_write_error_log(buf, len, err_type); /* @@ -225,8 +225,8 @@ void pSeries_log_error(char *buf, unsign printk_log_rtas(buf, len); /* Check to see if we need to or have stopped logging */ - if (fatal || no_more_logging) { - no_more_logging = 1; + if (fatal || no_logging) { + no_logging = 1; spin_unlock_irqrestore(&rtasd_log_lock, s); return; } @@ -299,7 +299,7 @@ static ssize_t rtas_log_read(struct file spin_lock_irqsave(&rtasd_log_lock, s); /* if it's 0, then we know we got the last one (the one in NVRAM) */ - if (rtas_log_size == 0 && !no_more_logging) + if (rtas_log_size == 0 && !no_logging) nvram_clear_error_log(); spin_unlock_irqrestore(&rtasd_log_lock, s); @@ -417,9 +417,6 @@ static int rtasd(void *unused) goto error; } - /* We can use rtas_log_buf now */ - no_more_logging = 0; - printk(KERN_ERR "RTAS daemon started\n"); DEBUG("will sleep for %d jiffies\n", (HZ*60/rtas_event_scan_rate) / 2); @@ -428,6 +425,10 @@ static int rtasd(void *unused) memset(logdata, 0, rtas_error_log_max); rc = nvram_read_error_log(logdata, rtas_error_log_max, &err_type); + + /* We can use rtas_log_buf now */ + no_logging = 0; + if (!rc) { if (err_type != ERR_FLAG_ALREADY_LOGGED) { pSeries_log_error(logdata, err_type | ERR_FLAG_BOOT, 0); diff -puN arch/ppc64/kernel/nvram.c~rtasd-no_more_logging-race 
arch/ppc64/kernel/nvram.c --- linux-2.6-bk/arch/ppc64/kernel/nvram.c~rtasd-no_more_logging-race Mon Nov 8 11:52:39 2004 +++ linux-2.6-bk-moilanen/arch/ppc64/kernel/nvram.c Mon Nov 8 12:20:13 2004 @@ -43,9 +43,9 @@ static struct nvram_partition * nvram_pa static long nvram_error_log_index = -1; static long nvram_error_log_size = 0; -volatile int no_more_logging = 1; /* Until we initialize everything, - * make sure we don't try logging - * anything */ +int no_logging = 1; /* Until we initialize everything, + * make sure we don't try logging + * anything */ extern volatile int error_log_cnt; @@ -640,7 +640,7 @@ int nvram_write_error_log(char * buff, i loff_t tmp_index; struct err_log_info info; - if (no_more_logging) { + if (no_logging) { return -EPERM; } _ From linas at austin.ibm.com Tue Nov 9 05:35:19 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Mon, 8 Nov 2004 12:35:19 -0600 Subject: [PATCH] PPC64 Poor assembly coding style In-Reply-To: <20041108181619.GT10026@austin.ibm.com> References: <20041108181619.GT10026@austin.ibm.com> Message-ID: <20041108183519.GU10026@austin.ibm.com> I wrote: > Doug Maxey reported a bug with the latest/greatest gas assembler > that demonstrates some poor coding style in entry.S and head.S. > The following patch cleans up that style, and also avoids assembler > confusion. Basically, in entry.S, > > cmpldi 0,r0,NR_syscalls should be written as either > > cmpldi r0,NR_syscalls or as cmpldi cr0,r0,NR_syscalls > > All three forms are theoretically equivalent; in practice, > I find the first alternative the cleanest (and also consistent > with usage elsewhere in the files). > > The new assembler seems to be mistaking NR_syscalls for a register > number, which is clearly out of bounds (its not in 0..31). > I think it would be cleaner overall to just drop the superfluous > leading cr0. 
There are two other confusing usages, in head.S: > I propose that > cmpldi cr0,r5,0 should be cmpldi r5,0 > cmpld 0,r6,r5 should be cmpld r6,r5 After mailing this, a grep revealed that it's not just cmpl but in fact many cmp instructions that are sometimes written one way, and sometimes another. The following larger patch cleans up all the various usages, making (I believe) the code slightly easier to read. --linas Signed-off-by: Linas Vepstas -------------- next part -------------- ===== arch/ppc64/kernel/entry.S 1.46 vs edited ===== --- 1.46/arch/ppc64/kernel/entry.S 2004-10-07 16:52:16 -05:00 +++ edited/arch/ppc64/kernel/entry.S 2004-11-08 12:18:08 -06:00 @@ -122,7 +122,7 @@ SystemCall_common: andi. r11,r10,_TIF_SYSCALL_T_OR_A bne- syscall_dotrace syscall_dotrace_cont: - cmpldi 0,r0,NR_syscalls + cmpldi r0,NR_syscalls bge- syscall_enosys system_call: /* label this so stack traces look sane */ @@ -204,7 +204,7 @@ syscall_enosys: syscall_error: lbz r11,TI_SC_NOERR(r12) - cmpwi 0,r11,0 + cmpwi r11,0 bne- syscall_error_cont neg r3,r3 oris r5,r5,0x1000 /* Set SO bit in CR */ @@ -313,7 +313,7 @@ _GLOBAL(ppc32_rt_sigreturn) _GLOBAL(ppc64_rt_sigreturn) bl .sys_rt_sigreturn -80: cmpdi 0,r3,0 +80: cmpdi r3,0 blt syscall_exit clrrdi r4,r1,THREAD_SHIFT ld r4,TI_FLAGS(r4) @@ -488,7 +488,7 @@ _GLOBAL(ret_from_except_lite) restore: #ifdef CONFIG_PPC_ISERIES ld r5,SOFTE(r1) - cmpdi 0,r5,0 + cmpdi r5,0 beq 4f /* Check for pending interrupts (iSeries) */ ld r3,PACALPPACA+LPPACAANYINT(r13) ===== arch/ppc64/kernel/head.S 1.81 vs edited ===== --- 1.81/arch/ppc64/kernel/head.S 2004-10-19 02:18:43 -05:00 +++ edited/arch/ppc64/kernel/head.S 2004-11-08 12:19:46 -06:00 @@ -156,7 +156,7 @@ _GLOBAL(__secondary_hold) /* All secondary cpu's wait here until told to start. 
*/ 100: ld r4,__secondary_hold_spinloop@l(0) - cmpdi 0,r4,1 + cmpdi r4,1 bne 100b #ifdef CONFIG_HMT @@ -326,7 +326,7 @@ label##_Iseries: \ mtspr SPRG1,r13; /* save r13 */ \ EXCEPTION_PROLOG_ISERIES_1(PACA_EXGEN); \ lbz r10,PACAPROCENABLED(r13); \ - cmpwi 0,r10,0; \ + cmpwi r10,0; \ beq- label##_Iseries_masked; \ EXCEPTION_PROLOG_ISERIES_2; \ b label##_common; \ @@ -638,7 +638,7 @@ SystemReset_Iseries: ori r24,r24,MSR_RI mtmsrd r24 /* RI on */ lhz r24,PACAPACAINDEX(r13) /* Get processor # */ - cmpwi 0,r24,0 /* Are we processor 0? */ + cmpwi r24,0 /* Are we processor 0? */ beq .__start_initialization_iSeries /* Start up the first processor */ mfspr r4,CTRLF li r5,RUNLATCH /* Turn off the run light */ @@ -657,7 +657,7 @@ SystemReset_Iseries: addi r1,r3,THREAD_SIZE subi r1,r1,STACK_FRAME_OVERHEAD - cmpwi 0,r23,0 + cmpwi r23,0 beq iseries_secondary_smp_loop /* Loop until told to go */ #ifdef SECONDARY_PROCESSORS bne .__secondary_start /* Loop until told to go */ @@ -1219,7 +1219,7 @@ _GLOBAL(pseries_secondary_smp_init) ld r1,PACAEMERGSP(r13) subi r1,r1,STACK_FRAME_OVERHEAD - cmpwi 0,r23,0 + cmpwi r23,0 #ifdef CONFIG_SMP #ifdef SECONDARY_PROCESSORS bne .__secondary_start @@ -1303,7 +1303,7 @@ _GLOBAL(__start_initialization_multiplat /* * Are we booted from a PROM Of-type client-interface ? 
*/ - cmpldi cr0,r5,0 + cmpldi r5,0 bne .__boot_from_prom /* yes -> prom */ /* Save parameters */ @@ -1439,7 +1439,7 @@ _GLOBAL(copy_and_flush) dcbst r6,r3 /* write it to memory */ sync icbi r6,r3 /* flush the icache line */ - cmpld 0,r6,r5 + cmpld r6,r5 blt 4b sync addi r5,r5,8 @@ -1472,7 +1472,7 @@ _STATIC(load_up_fpu) #ifndef CONFIG_SMP ld r3,last_task_used_math@got(r2) ld r4,0(r3) - cmpdi 0,r4,0 + cmpdi r4,0 beq 1f /* Save FP state to last_task_used_math's THREAD struct */ addi r4,r4,THREAD @@ -1528,11 +1528,11 @@ _GLOBAL(giveup_fpu) ori r5,r5,MSR_FP mtmsrd r5 /* enable use of fpu now */ isync - cmpdi 0,r3,0 + cmpdi r3,0 beqlr- /* if no previous owner, done */ addi r3,r3,THREAD /* want THREAD of task */ ld r5,PT_REGS(r3) - cmpdi 0,r5,0 + cmpdi r5,0 SAVE_32FPRS(0, r3) mffs fr0 stfd fr0,THREAD_FPSCR(r3) @@ -1578,7 +1578,7 @@ _STATIC(load_up_altivec) #ifndef CONFIG_SMP ld r3,last_task_used_altivec@got(r2) ld r4,0(r3) - cmpdi 0,r4,0 + cmpdi r4,0 beq 1f /* Save VMX state to last_task_used_altivec's THREAD struct */ addi r4,r4,THREAD @@ -1600,7 +1600,7 @@ _STATIC(load_up_altivec) * all 1's */ mfspr r4,SPRN_VRSAVE - cmpdi 0,r4,0 + cmpdi r4,0 bne+ 1f li r4,-1 mtspr SPRN_VRSAVE,r4 @@ -1647,11 +1647,11 @@ _GLOBAL(giveup_altivec) oris r5,r5,MSR_VEC@h mtmsrd r5 /* enable use of VMX now */ isync - cmpdi 0,r3,0 + cmpdi r3,0 beqlr- /* if no previous owner, done */ addi r3,r3,THREAD /* want THREAD of task */ ld r5,PT_REGS(r3) - cmpdi 0,r5,0 + cmpdi r5,0 SAVE_32VRS(0,r4,r3) mfvscr vr0 li r4,THREAD_VSCR ===== arch/ppc64/kernel/misc.S 1.92 vs edited ===== --- 1.92/arch/ppc64/kernel/misc.S 2004-10-27 16:35:17 -05:00 +++ edited/arch/ppc64/kernel/misc.S 2004-11-08 12:21:40 -06:00 @@ -82,10 +82,10 @@ _GLOBAL(local_irq_disable) _GLOBAL(local_irq_restore) lbz r5,PACAPROCENABLED(r13) /* Check if things are setup the way we want _already_. */ - cmpw 0,r3,r5 + cmpw r3,r5 beqlr /* are we enabling interrupts? 
*/ - cmpdi 0,r3,0 + cmpdi r3,0 stb r3,PACAPROCENABLED(r13) beqlr /* Check pending interrupts */ @@ -374,7 +374,7 @@ _GLOBAL(__flush_dcache_icache) * The *_ns versions don't do byte-swapping. */ _GLOBAL(_insb) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,1 blelr- @@ -387,7 +387,7 @@ _GLOBAL(_insb) blr _GLOBAL(_outsb) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,1 blelr- @@ -398,7 +398,7 @@ _GLOBAL(_outsb) blr _GLOBAL(_insw) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,2 blelr- @@ -411,7 +411,7 @@ _GLOBAL(_insw) blr _GLOBAL(_outsw) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,2 blelr- @@ -422,7 +422,7 @@ _GLOBAL(_outsw) blr _GLOBAL(_insl) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,4 blelr- @@ -435,7 +435,7 @@ _GLOBAL(_insl) blr _GLOBAL(_outsl) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,4 blelr- @@ -447,7 +447,7 @@ _GLOBAL(_outsl) /* _GLOBAL(ide_insw) now in drivers/ide/ide-iops.c */ _GLOBAL(_insw_ns) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,2 blelr- @@ -461,7 +461,7 @@ _GLOBAL(_insw_ns) /* _GLOBAL(ide_outsw) now in drivers/ide/ide-iops.c */ _GLOBAL(_outsw_ns) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,2 blelr- @@ -472,7 +472,7 @@ _GLOBAL(_outsw_ns) blr _GLOBAL(_insl_ns) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,4 blelr- @@ -485,7 +485,7 @@ _GLOBAL(_insl_ns) blr _GLOBAL(_outsl_ns) - cmpwi 0,r5,0 + cmpwi r5,0 mtctr r5 subi r4,r4,4 blelr- @@ -526,7 +526,7 @@ _GLOBAL(identify_cpu) lwz r8,CPU_SPEC_PVR_MASK(r3) and r8,r8,r7 lwz r9,CPU_SPEC_PVR_VALUE(r3) - cmplw 0,r9,r8 + cmplw r9,r8 beq 1f addi r3,r3,CPU_SPEC_ENTRY_SIZE b 1b @@ -670,7 +670,7 @@ _GLOBAL(kernel_thread) li r4,0 /* new sp (unused) */ li r0,__NR_clone sc - cmpdi 0,r3,0 /* parent or child? */ + cmpdi r3,0 /* parent or child? 
*/ bne 1f /* return if parent */ li r0,0 stdu r0,-STACK_FRAME_OVERHEAD(r1) From moilanen at austin.ibm.com Tue Nov 9 08:14:59 2004 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Mon, 8 Nov 2004 15:14:59 -0600 Subject: [PATCH 1/1] rtas_flash_4gig In-Reply-To: <16768.28322.583827.9327@cargo.ozlabs.ibm.com> References: <200410041942.i94Jg4WA154540@westrelay04.boulder.ibm.com> <16758.55568.809557.670513@cargo.ozlabs.ibm.com> <20041020170817.0ee49b64@localhost> <16768.28322.583827.9327@cargo.ozlabs.ibm.com> Message-ID: <20041108151459.267f0165@localhost> > OK, but I don't see that we make any attempt at all to try to make > sure the memory for the block list pages is below 4G. I also don't > see where we check the ibm,flash-block-version property (to see if we > can in fact use a linked list of headers) or where we check that the > pages we are using don't overlap OF's memory (i.e. real-size bytes > starting at real-base). Correct me if I'm wrong, but the ibm,flash-block-version will always be 1 if we have ibm,update-flash-64-and-reboot (from 270 onwards), and if we don't have the ibm,update-flash-64-and-reboot, we will not have gotten to this point anyways. The block list pages do not have the restriction of being under 4G. They just have to be page aligned and not cross a LMB. Since we only allocate one page for every memory block using get_zeroed_page, that should be fine. Why do you feel we need to check if the pages overlap OF's memory? We won't be flashing until well after OF is out of the picture. 
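(Illustration only, not part of the original message.) The alignment argument above can be sketched in plain userspace C. The 4 KB page size, the 16 MB LMB size and every name below are assumptions made for the sketch; none of them come from the thread or from the rtas_flash code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; neither comes from the thread. */
#define SKETCH_PAGE_SIZE 4096ULL        /* 4 KB page */
#define SKETCH_LMB_SIZE  (16ULL << 20)  /* hypothetical 16 MB LMB */

/* True if addr is the first byte of a page. */
static bool page_aligned(uint64_t addr)
{
	return (addr & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* True if the page-sized region starting at addr stays inside one LMB. */
static bool within_one_lmb(uint64_t addr)
{
	uint64_t last = addr + SKETCH_PAGE_SIZE - 1;

	return (addr / SKETCH_LMB_SIZE) == (last / SKETCH_LMB_SIZE);
}

/* The two conditions stated for a block list page, combined. */
static bool block_list_page_ok(uint64_t addr)
{
	return page_aligned(addr) && within_one_lmb(addr);
}
```

As long as the LMB size is a multiple of the page size, a single page-aligned page can never straddle an LMB boundary, which is the reasoning behind allocating exactly one page at a time.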
Thanks, Jake From paulus at samba.org Tue Nov 9 08:40:22 2004 From: paulus at samba.org (Paul Mackerras) Date: Tue, 9 Nov 2004 08:40:22 +1100 Subject: [PATCH] PPC64 Poor assembly coding style In-Reply-To: <20041108181619.GT10026@austin.ibm.com> References: <20041108181619.GT10026@austin.ibm.com> Message-ID: <16783.59334.715911.661486@cargo.ozlabs.ibm.com> Linas Vepstas writes: > Doug Maxey reported a bug with the latest/greatest gas assembler > that demonstrates some poor coding style in entry.S and head.S. > The following patch cleans up that style, and also avoids assembler > confusion. Basically, in entry.S, > > cmpldi 0,r0,NR_syscalls should be written as either > > cmpldi r0,NR_syscalls or as cmpldi cr0,r0,NR_syscalls What is the actual bug here? Paul. From benh at kernel.crashing.org Tue Nov 9 09:33:56 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 09 Nov 2004 09:33:56 +1100 Subject: [PATCH] PPC64 Poor assembly coding style In-Reply-To: <20041108181619.GT10026@austin.ibm.com> References: <20041108181619.GT10026@austin.ibm.com> Message-ID: <1099953236.5295.179.camel@gaston> On Mon, 2004-11-08 at 12:16 -0600, Linas Vepstas wrote: > Hi, > > Doug Maxey reported a bug with the latest/greatest gas assembler > that demonstrates some poor coding style in entry.S and head.S. > The following patch cleans up that style, and also avoids assembler > confusion. Basically, in entry.S, > > cmpldi 0,r0,NR_syscalls should be written as either > > cmpldi r0,NR_syscalls or as cmpldi cr0,r0,NR_syscalls > > All three forms are theoretically equivalent; in practice, > I find the first alternative the cleanest (and also consistent > with usage elsewhere in the files). > > The new assembler seems to be mistaking NR_syscalls for a register > number, which is clearly out of bounds (its not in 0..31). > I think it would be cleaner overall to just drop the superfluous > leading cr0. 
There are two other confusing usages, in head.S: > I propose that > cmpldi cr0,r5,0 should be cmpldi r5,0 > cmpld 0,r6,r5 should be cmpld r6,r5 I don't agree with "poor coding style" here, and you should remember that this code has been worked on by different people with different habits. Besides, gas does not know those register names, so the above is actually cmpldi 0,5,0 and cmpld 0,6,5 and writing that is perfectly valid. We defined some macros for register names to make things easier in the kernel, but they are not mandatory at all. Removing the first argument is an accepted construct, but in no way "should" it be done that way. In fact, I'd rather _add_ the implicit first argument in all cases to avoid confusion. You are welcome to submit a patch doing so, but please, avoid naming it "poor coding style" ... Ben.
From amodra at bigpond.net.au Tue Nov 9 10:42:52 2004 From: amodra at bigpond.net.au (Alan Modra) Date: Tue, 9 Nov 2004 10:12:52 +1030 Subject: [PATCH] PPC64 Poor assembly coding style In-Reply-To: <20041108181619.GT10026@austin.ibm.com> References: <20041108181619.GT10026@austin.ibm.com> Message-ID: <20041108234252.GA12837@bubble.modra.org> On Mon, Nov 08, 2004 at 12:16:19PM -0600, Linas Vepstas wrote: > The new assembler seems to be mistaking NR_syscalls for a register > number, which is clearly out of bounds (its not in 0..31). Testcase? -- Alan Modra IBM OzLabs - Linux Technology Centre
From hollis at penguinppc.org Tue Nov 9 12:31:11 2004 From: hollis at penguinppc.org (Hollis Blanchard) Date: Mon, 8 Nov 2004 19:31:11 -0600 Subject: G5 two SCSI hard drive partitioning In-Reply-To: <1099933399.418fa6d79514e@www.hopevale.com> References: <1099933399.418fa6d79514e@www.hopevale.com> Message-ID: <09CEC382-31EF-11D9-9933-000A95A0560C@penguinppc.org> On Nov 8, 2004, at 11:03 AM, ebenoit at hopevale.com wrote: > I am not sure if I am on the correct list, so excuse me if this does > not relate. > > I am installing mandrake 9.1 ppc on a G5 with two SCSI drives the > first drive is > 36GB and the second is 74GB. > > Here is my question: > > How can I partition them so that I can have a /home directory of 90GB? Yes, you would probably have more luck on a Mandrake mailing list; see http://www.mandrakelinux.com/en/flists.php3 for details. -Hollis
From sfr at canb.auug.org.au Tue Nov 9 18:42:23 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 9 Nov 2004 18:42:23 +1100 Subject: [0/6] PPC64 iSeries Machine Facilities code cleanup Message-ID: <20041109184223.16ea3414.sfr@canb.auug.org.au> Hi Andrew, The following patches clean up iSeries_mf.c and related files. There are a couple of simple fixes in here, but mainly this is just reorganisation and tidying. (Lots of Studly Caps disappear!) The overall diffstat looks like this:
 arch/ppc64/kernel/Makefile | 2
 arch/ppc64/kernel/iSeries_pci.c | 6
 arch/ppc64/kernel/iSeries_setup.c | 9
 arch/ppc64/kernel/mf.c | 1118 +++++++++++++++++++++-----------------
 arch/ppc64/kernel/mf_proc.c | 250 --------
 arch/ppc64/kernel/rtc.c | 4
 arch/ppc64/kernel/viopath.c | 4
 drivers/net/iseries_veth.c | 6
 include/asm-ppc64/iSeries/mf.h | 41 -
 9 files changed, 668 insertions(+), 772 deletions(-)
Please apply them all (in order) and send to Linus. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/
From sfr at canb.auug.org.au Tue Nov 9 18:45:51 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 9 Nov 2004 18:45:51 +1100 Subject: [PATCH] [1/6] PPC64 iSeries combine some MF code In-Reply-To: <20041109184223.16ea3414.sfr@canb.auug.org.au> References: <20041109184223.16ea3414.sfr@canb.auug.org.au> Message-ID: <20041109184551.03b8a32c.sfr@canb.auug.org.au> This patch just moves mf_proc.c into mf.c in anticipation of more cleanup to come, so mf_proc.c ceases to exist. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk/arch/ppc64/kernel/Makefile linus-bk-mf.0.5/arch/ppc64/kernel/Makefile --- linus-bk/arch/ppc64/kernel/Makefile 2004-10-29 10:45:15.000000000 +1000 +++ linus-bk-mf.0.5/arch/ppc64/kernel/Makefile 2004-11-09 16:28:49.000000000 +1100 @@ -22,7 +22,7 @@ obj-$(CONFIG_PPC_ISERIES) += iSeries_irq.o \ iSeries_VpdInfo.o XmPciLpEvent.o \ - HvCall.o HvLpConfig.o LparData.o mf_proc.o \ + HvCall.o HvLpConfig.o LparData.o \ iSeries_setup.o ItLpQueue.o hvCall.o \ mf.o HvLpEvent.o iSeries_proc.o iSeries_htab.o \ iSeries_iommu.o diff -ruN linus-bk/arch/ppc64/kernel/mf.c linus-bk-mf.0.5/arch/ppc64/kernel/mf.c --- linus-bk/arch/ppc64/kernel/mf.c 2004-05-07 06:56:27.000000000 +1000 +++ linus-bk-mf.0.5/arch/ppc64/kernel/mf.c 2004-11-09 16:30:30.000000000 +1100 @@ -1,6 +1,7 @@ /* * mf.c * Copyright (C) 2001 Troy D. 
Armstrong IBM Corporation + * Copyright (C) 2004 Stephen Rothwell IBM Corporation * * This modules exists as an interface between a Linux secondary partition * running on an iSeries and the primary partition's Virtual Service @@ -1079,3 +1080,232 @@ return signal_ce_msg(ceTime, NULL); } + +static int proc_mf_dump_cmdline(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + int len = count; + char *p; + + if (off) { + *eof = 1; + return 0; + } + + len = mf_getCmdLine(page, &len, (u64)data); + + p = page; + while (len < (count - 1)) { + if (!*p || *p == '\n') + break; + p++; + len++; + } + *p = '\n'; + p++; + *p = 0; + + return p - page; +} + +#if 0 +static int proc_mf_dump_vmlinux(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + int sizeToGet = count; + + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + + if (mf_getVmlinuxChunk(page, &sizeToGet, off, (u64)data) == 0) { + if (sizeToGet != 0) { + *start = page + off; + return sizeToGet; + } + *eof = 1; + return 0; + } + *eof = 1; + return 0; +} +#endif + +static int proc_mf_dump_side(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + int len; + char mf_current_side = mf_getSide(); + + len = sprintf(page, "%c\n", mf_current_side); + + if (len <= (off + count)) + *eof = 1; + *start = page + off; + len -= off; + if (len > count) + len = count; + if (len < 0) + len = 0; + return len; +} + +static int proc_mf_change_side(struct file *file, const char __user *buffer, + unsigned long count, void *data) +{ + char stkbuf[10]; + + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + + if (count > (sizeof(stkbuf) - 1)) + count = sizeof(stkbuf) - 1; + if (copy_from_user(stkbuf, buffer, count)) + return -EFAULT; + stkbuf[count] = 0; + if ((*stkbuf != 'A') && (*stkbuf != 'B') && + (*stkbuf != 'C') && (*stkbuf != 'D')) { + printk(KERN_ERR "mf_proc.c: proc_mf_change_side: invalid side\n"); + return -EINVAL; + } + + mf_setSide(*stkbuf); + + return count; +} + 
+static int proc_mf_dump_src(char *page, char **start, off_t off, + int count, int *eof, void *data) +{ + int len; + + mf_getSrcHistory(page, count); + len = count; + len -= off; + if (len < count) { + *eof = 1; + if (len <= 0) + return 0; + } else + len = count; + *start = page + off; + return len; +} + +static int proc_mf_change_src(struct file *file, const char __user *buffer, + unsigned long count, void *data) +{ + char stkbuf[10]; + + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + + if ((count < 4) && (count != 1)) { + printk(KERN_ERR "mf_proc: invalid src\n"); + return -EINVAL; + } + + if (count > (sizeof(stkbuf) - 1)) + count = sizeof(stkbuf) - 1; + if (copy_from_user(stkbuf, buffer, count)) + return -EFAULT; + + if ((count == 1) && (*stkbuf == '\0')) + mf_clearSrc(); + else + mf_displaySrc(*(u32 *)stkbuf); + + return count; +} + +static int proc_mf_change_cmdline(struct file *file, const char *buffer, + unsigned long count, void *data) +{ + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + + mf_setCmdLine(buffer, count, (u64)data); + + return count; +} + +static ssize_t proc_mf_change_vmlinux(struct file *file, + const char __user *buf, + size_t count, loff_t *ppos) +{ + struct inode * inode = file->f_dentry->d_inode; + struct proc_dir_entry * dp = PDE(inode); + int rc; + if (!capable(CAP_SYS_ADMIN)) + return -EACCES; + + rc = mf_setVmlinuxChunk(buf, count, *ppos, (u64)dp->data); + if (rc < 0) + return rc; + + *ppos += count; + + return count; +} + +static struct file_operations proc_vmlinux_operations = { + .write = proc_mf_change_vmlinux, +}; + +static int __init mf_proc_init(void) +{ + struct proc_dir_entry *mf_proc_root; + struct proc_dir_entry *ent; + struct proc_dir_entry *mf; + char name[2]; + int i; + + mf_proc_root = proc_mkdir("iSeries/mf", NULL); + if (!mf_proc_root) + return 1; + + name[1] = '\0'; + for (i = 0; i < 4; i++) { + name[0] = 'A' + i; + mf = proc_mkdir(name, mf_proc_root); + if (!mf) + return 1; + + ent = create_proc_entry("cmdline", 
S_IFREG|S_IRUSR|S_IWUSR, mf); + if (!ent) + return 1; + ent->nlink = 1; + ent->data = (void *)(long)i; + ent->read_proc = proc_mf_dump_cmdline; + ent->write_proc = proc_mf_change_cmdline; + + if (i == 3) /* no vmlinux entry for 'D' */ + continue; + + ent = create_proc_entry("vmlinux", S_IFREG|S_IWUSR, mf); + if (!ent) + return 1; + ent->nlink = 1; + ent->data = (void *)(long)i; + ent->proc_fops = &proc_vmlinux_operations; + } + + ent = create_proc_entry("side", S_IFREG|S_IRUSR|S_IWUSR, mf_proc_root); + if (!ent) + return 1; + ent->nlink = 1; + ent->data = (void *)0; + ent->read_proc = proc_mf_dump_side; + ent->write_proc = proc_mf_change_side; + + ent = create_proc_entry("src", S_IFREG|S_IRUSR|S_IWUSR, mf_proc_root); + if (!ent) + return 1; + ent->nlink = 1; + ent->data = (void *)0; + ent->read_proc = proc_mf_dump_src; + ent->write_proc = proc_mf_change_src; + + return 0; +} + +__initcall(mf_proc_init); diff -ruN linus-bk/arch/ppc64/kernel/mf_proc.c linus-bk-mf.0.5/arch/ppc64/kernel/mf_proc.c --- linus-bk/arch/ppc64/kernel/mf_proc.c 2004-08-24 07:22:47.000000000 +1000 +++ linus-bk-mf.0.5/arch/ppc64/kernel/mf_proc.c 1970-01-01 10:00:00.000000000 +1000 @@ -1,250 +0,0 @@ -/* - * mf_proc.c - * Copyright (C) 2001 Kyle A. Lucke IBM Corporation - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
- * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ -#include -#include -#include - -static int proc_mf_dump_cmdline(char *page, char **start, off_t off, - int count, int *eof, void *data) -{ - int len = count; - char *p; - - if (off) { - *eof = 1; - return 0; - } - - len = mf_getCmdLine(page, &len, (u64)data); - - p = page; - while (len < (count - 1)) { - if (!*p || *p == '\n') - break; - p++; - len++; - } - *p = '\n'; - p++; - *p = 0; - - return p - page; -} - -#if 0 -static int proc_mf_dump_vmlinux(char *page, char **start, off_t off, - int count, int *eof, void *data) -{ - int sizeToGet = count; - - if (!capable(CAP_SYS_ADMIN)) - return -EACCES; - - if (mf_getVmlinuxChunk(page, &sizeToGet, off, (u64)data) == 0) { - if (sizeToGet != 0) { - *start = page + off; - return sizeToGet; - } - *eof = 1; - return 0; - } - *eof = 1; - return 0; -} -#endif - -static int proc_mf_dump_side(char *page, char **start, off_t off, - int count, int *eof, void *data) -{ - int len; - char mf_current_side = mf_getSide(); - - len = sprintf(page, "%c\n", mf_current_side); - - if (len <= (off + count)) - *eof = 1; - *start = page + off; - len -= off; - if (len > count) - len = count; - if (len < 0) - len = 0; - return len; -} - -static int proc_mf_change_side(struct file *file, const char __user *buffer, - unsigned long count, void *data) -{ - char stkbuf[10]; - - if (!capable(CAP_SYS_ADMIN)) - return -EACCES; - - if (count > (sizeof(stkbuf) - 1)) - count = sizeof(stkbuf) - 1; - if (copy_from_user(stkbuf, buffer, count)) - return -EFAULT; - stkbuf[count] = 0; - if ((*stkbuf != 'A') && (*stkbuf != 'B') && - (*stkbuf != 'C') && (*stkbuf != 'D')) { - printk(KERN_ERR "mf_proc.c: proc_mf_change_side: invalid side\n"); - return -EINVAL; - } - - mf_setSide(*stkbuf); - - return count; -} - -static int proc_mf_dump_src(char *page, 
char **start, off_t off, - int count, int *eof, void *data) -{ - int len; - - mf_getSrcHistory(page, count); - len = count; - len -= off; - if (len < count) { - *eof = 1; - if (len <= 0) - return 0; - } else - len = count; - *start = page + off; - return len; -} - -static int proc_mf_change_src(struct file *file, const char __user *buffer, - unsigned long count, void *data) -{ - char stkbuf[10]; - - if (!capable(CAP_SYS_ADMIN)) - return -EACCES; - - if ((count < 4) && (count != 1)) { - printk(KERN_ERR "mf_proc: invalid src\n"); - return -EINVAL; - } - - if (count > (sizeof(stkbuf) - 1)) - count = sizeof(stkbuf) - 1; - if (copy_from_user(stkbuf, buffer, count)) - return -EFAULT; - - if ((count == 1) && (*stkbuf == '\0')) - mf_clearSrc(); - else - mf_displaySrc(*(u32 *)stkbuf); - - return count; -} - -static int proc_mf_change_cmdline(struct file *file, const char *buffer, - unsigned long count, void *data) -{ - if (!capable(CAP_SYS_ADMIN)) - return -EACCES; - - mf_setCmdLine(buffer, count, (u64)data); - - return count; -} - -static ssize_t proc_mf_change_vmlinux(struct file *file, - const char __user *buf, - size_t count, loff_t *ppos) -{ - struct inode * inode = file->f_dentry->d_inode; - struct proc_dir_entry * dp = PDE(inode); - int rc; - if (!capable(CAP_SYS_ADMIN)) - return -EACCES; - - rc = mf_setVmlinuxChunk(buf, count, *ppos, (u64)dp->data); - if (rc < 0) - return rc; - - *ppos += count; - - return count; -} - -static struct file_operations proc_vmlinux_operations = { - .write = proc_mf_change_vmlinux, -}; - -static int __init mf_proc_init(void) -{ - struct proc_dir_entry *mf_proc_root; - struct proc_dir_entry *ent; - struct proc_dir_entry *mf; - char name[2]; - int i; - - mf_proc_root = proc_mkdir("iSeries/mf", NULL); - if (!mf_proc_root) - return 1; - - name[1] = '\0'; - for (i = 0; i < 4; i++) { - name[0] = 'A' + i; - mf = proc_mkdir(name, mf_proc_root); - if (!mf) - return 1; - - ent = create_proc_entry("cmdline", S_IFREG|S_IRUSR|S_IWUSR, mf); - if 
(!ent) - return 1; - ent->nlink = 1; - ent->data = (void *)(long)i; - ent->read_proc = proc_mf_dump_cmdline; - ent->write_proc = proc_mf_change_cmdline; - - if (i == 3) /* no vmlinux entry for 'D' */ - continue; - - ent = create_proc_entry("vmlinux", S_IFREG|S_IWUSR, mf); - if (!ent) - return 1; - ent->nlink = 1; - ent->data = (void *)(long)i; - ent->proc_fops = &proc_vmlinux_operations; - } - - ent = create_proc_entry("side", S_IFREG|S_IRUSR|S_IWUSR, mf_proc_root); - if (!ent) - return 1; - ent->nlink = 1; - ent->data = (void *)0; - ent->read_proc = proc_mf_dump_side; - ent->write_proc = proc_mf_change_side; - - ent = create_proc_entry("src", S_IFREG|S_IRUSR|S_IWUSR, mf_proc_root); - if (!ent) - return 1; - ent->nlink = 1; - ent->data = (void *)0; - ent->read_proc = proc_mf_dump_src; - ent->write_proc = proc_mf_change_src; - - return 0; -} - -__initcall(mf_proc_init);
From sfr at canb.auug.org.au Tue Nov 9 18:48:13 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 9 Nov 2004 18:48:13 +1100 Subject: [PATCH] [2/6] PPC64 iSeries remove trailing white space In-Reply-To: <20041109184551.03b8a32c.sfr@canb.auug.org.au> References: <20041109184223.16ea3414.sfr@canb.auug.org.au> <20041109184551.03b8a32c.sfr@canb.auug.org.au> Message-ID: <20041109184813.1a6e02cf.sfr@canb.auug.org.au> Nothing more than removing trailing white space from mf.c. 
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk-mf.0.5/arch/ppc64/kernel/mf.c linus-bk-mf.0.6/arch/ppc64/kernel/mf.c --- linus-bk-mf.0.5/arch/ppc64/kernel/mf.c 2004-11-09 16:30:30.000000000 +1100 +++ linus-bk-mf.0.6/arch/ppc64/kernel/mf.c 2004-11-09 16:39:08.000000000 +1100 @@ -9,17 +9,17 @@ * all partitions in the iSeries. It also provides miscellaneous low-level * machine facility type operations. * - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. - * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA @@ -497,7 +497,7 @@ pending_event_head = pending_event_head->next; two = pending_event_head; free_pending_event(oldHead); - } + } spin_unlock_irqrestore(&pending_event_spinlock, flags); /* send next waiting event */ @@ -695,7 +695,7 @@ break; case 'B': newSide = 1; break; - case 'C': newSide = 2; + case 'C': newSide = 2; break; default: newSide = 3; break; @@ -959,7 +959,7 @@ *time = mktime(year, mon, day, hour, min, sec); *time += (jiffies / HZ); - + /* * Now THIS is a nasty hack! 
* It ensures that the first two calls to mf_getRtcTime get different @@ -1020,7 +1020,7 @@ if (year <= 69) year += 100; - + tm->tm_sec = sec; tm->tm_min = min; tm->tm_hour = hour; @@ -1051,17 +1051,17 @@ char ceTime[12] = "\x00\x00\x00\x41\x00\x00\x00\x00\x00\x00\x00\x00"; u8 day, mon, hour, min, sec, y1, y2; unsigned year; - + year = 1900 + tm->tm_year; y1 = year / 100; y2 = year % 100; - + sec = tm->tm_sec; min = tm->tm_min; hour = tm->tm_hour; day = tm->tm_mday; mon = tm->tm_mon + 1; - + BIN_TO_BCD(sec); BIN_TO_BCD(min); BIN_TO_BCD(hour); @@ -1077,7 +1077,7 @@ ceTime[8] = hour; ceTime[10] = day; ceTime[11] = mon; - + return signal_ce_msg(ceTime, NULL); } @@ -1093,7 +1093,7 @@ } len = mf_getCmdLine(page, &len, (u64)data); - + p = page; while (len < (count - 1)) { if (!*p || *p == '\n') @@ -1146,7 +1146,7 @@ len = count; if (len < 0) len = 0; - return len; + return len; } static int proc_mf_change_side(struct file *file, const char __user *buffer, @@ -1180,15 +1180,15 @@ mf_getSrcHistory(page, count); len = count; - len -= off; - if (len < count) { - *eof = 1; - if (len <= 0) - return 0; - } else - len = count; - *start = page + off; - return len; + len -= off; + if (len < count) { + *eof = 1; + if (len <= 0) + return 0; + } else + len = count; + *start = page + off; + return len; } static int proc_mf_change_src(struct file *file, const char __user *buffer, @@ -1214,7 +1214,7 @@ else mf_displaySrc(*(u32 *)stkbuf); - return count; + return count; } static int proc_mf_change_cmdline(struct file *file, const char *buffer, @@ -1225,10 +1225,10 @@ mf_setCmdLine(buffer, count, (u64)data); - return count; + return count; } -static ssize_t proc_mf_change_vmlinux(struct file *file, +static ssize_t proc_mf_change_vmlinux(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { @@ -1244,7 +1244,7 @@ *ppos += count; - return count; + return count; } static struct file_operations proc_vmlinux_operations = {
From sfr at canb.auug.org.au Tue Nov 9 18:51:31 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 9 Nov 2004 18:51:31 +1100 Subject: [PATCH] [3/6] PPC64 iSeries remove some Studly Caps In-Reply-To: <20041109184813.1a6e02cf.sfr@canb.auug.org.au> References: <20041109184223.16ea3414.sfr@canb.auug.org.au> <20041109184551.03b8a32c.sfr@canb.auug.org.au> <20041109184813.1a6e02cf.sfr@canb.auug.org.au> Message-ID: <20041109185131.29e6eabd.sfr@canb.auug.org.au> This patch changes the externally referenced names in mf.c to not use Studly Caps and removes a couple of no-longer-used functions. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk-mf.0.6/arch/ppc64/kernel/iSeries_pci.c linus-bk-mf.0.7/arch/ppc64/kernel/iSeries_pci.c --- linus-bk-mf.0.6/arch/ppc64/kernel/iSeries_pci.c 2004-11-08 16:21:10.000000000 +1100 +++ linus-bk-mf.0.7/arch/ppc64/kernel/iSeries_pci.c 2004-11-09 16:42:02.000000000 +1100 @@ -309,7 +309,7 @@ PPCDBG(PPCDBG_BUSWALK, "iSeries_pcibios_fixup Entry.\n"); /* Fix up at the device node and pci_dev relationship */ - mf_displaySrc(0xC9000100); + mf_display_src(0xC9000100); printk("pcibios_final_fixup\n"); for_each_pci_dev(pdev) { @@ -335,7 +335,7 @@ pdev->irq = node->Irq; } iSeries_activate_IRQs(); - mf_displaySrc(0xC9000200); + mf_display_src(0xC9000200); } void pcibios_fixup_bus(struct pci_bus *PciBus) @@ -677,7 +677,7 @@ */ if ((DevNode->IoRetry > Pci_Retry_Max) && (Pci_Error_Flag > 0)) { - mf_displaySrc(0xB6000103); + mf_display_src(0xB6000103); panic_timeout = 0; panic("PCI: Hardware I/O Error, SRC B6000103, " "Automatic Reboot Disabled.\n"); diff -ruN linus-bk-mf.0.6/arch/ppc64/kernel/iSeries_setup.c linus-bk-mf.0.7/arch/ppc64/kernel/iSeries_setup.c 
--- linus-bk-mf.0.6/arch/ppc64/kernel/iSeries_setup.c 2004-09-24 15:23:06.000000000 +1000 +++ linus-bk-mf.0.7/arch/ppc64/kernel/iSeries_setup.c 2004-11-09 16:48:02.000000000 +1100 @@ -732,7 +732,7 @@ */ void iSeries_power_off(void) { - mf_powerOff(); + mf_power_off(); } /* @@ -740,7 +740,7 @@ */ void iSeries_halt(void) { - mf_powerOff(); + mf_power_off(); } /* JDH Hack */ @@ -796,9 +796,9 @@ printk("Progress: [%04x] - %s\n", (unsigned)code, st); if (!piranha_simulator && mf_initialized) { if (code != 0xffff) - mf_displayProgress(code); + mf_display_progress(code); else - mf_clearSrc(); + mf_clear_src(); } } diff -ruN linus-bk-mf.0.6/arch/ppc64/kernel/mf.c linus-bk-mf.0.7/arch/ppc64/kernel/mf.c --- linus-bk-mf.0.6/arch/ppc64/kernel/mf.c 2004-11-09 16:39:08.000000000 +1100 +++ linus-bk-mf.0.7/arch/ppc64/kernel/mf.c 2004-11-09 17:27:46.000000000 +1100 @@ -357,7 +357,7 @@ if (rc) { printk(KERN_ALERT "mf.c: SIGINT to init failed (%d), " "hard shutdown commencing\n", rc); - mf_powerOff(); + mf_power_off(); } else printk(KERN_INFO "mf.c: init has been successfully notified " "to proceed with shutdown\n"); @@ -533,7 +533,7 @@ * Global kernel interface to allocate and seed events into the * Hypervisor. */ -void mf_allocateLpEvents(HvLpIndex targetLp, HvLpEvent_Type type, +void mf_allocate_lp_events(HvLpIndex targetLp, HvLpEvent_Type type, unsigned size, unsigned count, MFCompleteHandler hdlr, void *userToken) { @@ -560,13 +560,13 @@ if ((rc != 0) && (hdlr != NULL)) (*hdlr)(userToken, rc); } -EXPORT_SYMBOL(mf_allocateLpEvents); +EXPORT_SYMBOL(mf_allocate_lp_events); /* * Global kernel interface to unseed and deallocate events already in * Hypervisor. 
*/ -void mf_deallocateLpEvents(HvLpIndex targetLp, HvLpEvent_Type type, +void mf_deallocate_lp_events(HvLpIndex targetLp, HvLpEvent_Type type, unsigned count, MFCompleteHandler hdlr, void *userToken) { struct pending_event *ev = new_pending_event(); @@ -591,13 +591,13 @@ if ((rc != 0) && (hdlr != NULL)) (*hdlr)(userToken, rc); } -EXPORT_SYMBOL(mf_deallocateLpEvents); +EXPORT_SYMBOL(mf_deallocate_lp_events); /* * Global kernel interface to tell the VSP object in the primary * partition to power this partition off. */ -void mf_powerOff(void) +void mf_power_off(void) { printk(KERN_INFO "mf.c: Down it goes...\n"); signal_ce_msg("\x00\x00\x00\x4D\x00\x00\x00\x00\x00\x00\x00\x00", NULL); @@ -618,7 +618,7 @@ /* * Display a single word SRC onto the VSP control panel. */ -void mf_displaySrc(u32 word) +void mf_display_src(u32 word) { u8 ce[12]; @@ -633,7 +633,7 @@ /* * Display a single word SRC of the form "PROGXXXX" on the VSP control panel. */ -void mf_displayProgress(u16 value) +void mf_display_progress(u16 value) { u8 ce[12]; u8 src[72]; @@ -657,7 +657,7 @@ * Clear the VSP control panel. Used to "erase" an SRC that was * previously displayed. 
*/ -void mf_clearSrc(void) +void mf_clear_src(void) { signal_ce_msg("\x00\x00\x00\x4B\x00\x00\x00\x00\x00\x00\x00\x00", NULL); } @@ -909,15 +909,6 @@ return rc; } -int mf_setRtcTime(unsigned long time) -{ - struct rtc_time tm; - - to_tm(time, &tm); - - return mf_setRtc(&tm); -} - struct RtcTimeData { struct completion com; struct CeMsgData xCeMsg; @@ -933,47 +924,7 @@ complete(&rtc->com); } -static unsigned long lastsec = 1; - -int mf_getRtcTime(unsigned long *time) -{ - u32 dataWord1 = *((u32 *)(&xSpCommArea.xBcdTimeAtIplStart)); - u32 dataWord2 = *(((u32 *)&(xSpCommArea.xBcdTimeAtIplStart)) + 1); - int year = 1970; - int year1 = (dataWord1 >> 24) & 0x000000FF; - int year2 = (dataWord1 >> 16) & 0x000000FF; - int sec = (dataWord1 >> 8) & 0x000000FF; - int min = dataWord1 & 0x000000FF; - int hour = (dataWord2 >> 24) & 0x000000FF; - int day = (dataWord2 >> 8) & 0x000000FF; - int mon = dataWord2 & 0x000000FF; - - BCD_TO_BIN(sec); - BCD_TO_BIN(min); - BCD_TO_BIN(hour); - BCD_TO_BIN(day); - BCD_TO_BIN(mon); - BCD_TO_BIN(year1); - BCD_TO_BIN(year2); - year = year1 * 100 + year2; - - *time = mktime(year, mon, day, hour, min, sec); - *time += (jiffies / HZ); - - /* - * Now THIS is a nasty hack! - * It ensures that the first two calls to mf_getRtcTime get different - * answers. That way the loop in init_time (time.c) will not think - * the clock is stuck. 
- */ - if (lastsec) { - *time -= lastsec; - --lastsec; - } - return 0; -} - -int mf_getRtc(struct rtc_time *tm) +int mf_get_rtc(struct rtc_time *tm) { struct CeMsgCompleteData ceComplete; struct RtcTimeData rtcData; @@ -999,7 +950,7 @@ tm->tm_mday = 10; tm->tm_mon = 8; tm->tm_year = 71; - mf_setRtc(tm); + mf_set_rtc(tm); } { u32 dataWord1 = *((u32 *)(rtcData.xCeMsg.ce_msg+4)); @@ -1046,7 +997,7 @@ return rc; } -int mf_setRtc(struct rtc_time * tm) +int mf_set_rtc(struct rtc_time * tm) { char ceTime[12] = "\x00\x00\x00\x41\x00\x00\x00\x00\x00\x00\x00\x00"; u8 day, mon, hour, min, sec, y1, y2; @@ -1210,9 +1161,9 @@ return -EFAULT; if ((count == 1) && (*stkbuf == '\0')) - mf_clearSrc(); + mf_clear_src(); else - mf_displaySrc(*(u32 *)stkbuf); + mf_display_src(*(u32 *)stkbuf); return count; } diff -ruN linus-bk-mf.0.6/arch/ppc64/kernel/rtc.c linus-bk-mf.0.7/arch/ppc64/kernel/rtc.c --- linus-bk-mf.0.6/arch/ppc64/kernel/rtc.c 2004-09-09 09:59:49.000000000 +1000 +++ linus-bk-mf.0.7/arch/ppc64/kernel/rtc.c 2004-11-09 16:50:00.000000000 +1100 @@ -275,7 +275,7 @@ if (piranha_simulator) return; - mf_getRtc(rtc_tm); + mf_get_rtc(rtc_tm); rtc_tm->tm_mon--; } @@ -285,7 +285,7 @@ */ int iSeries_set_rtc_time(struct rtc_time *tm) { - mf_setRtc(tm); + mf_set_rtc(tm); return 0; } diff -ruN linus-bk-mf.0.6/arch/ppc64/kernel/viopath.c linus-bk-mf.0.7/arch/ppc64/kernel/viopath.c --- linus-bk-mf.0.6/arch/ppc64/kernel/viopath.c 2004-09-09 09:59:49.000000000 +1000 +++ linus-bk-mf.0.7/arch/ppc64/kernel/viopath.c 2004-11-09 16:46:36.000000000 +1100 @@ -473,7 +473,7 @@ parms.used_wait_atomic = 0; parms.sem = &Semaphore; } - mf_allocateLpEvents(remoteLp, HvLpEvent_Type_VirtualIo, 250, /* It would be nice to put a real number here! */ + mf_allocate_lp_events(remoteLp, HvLpEvent_Type_VirtualIo, 250, /* It would be nice to put a real number here! 
*/ numEvents, &viopath_donealloc, &parms); if (in_atomic()) { while (atomic_read(&wait_atomic)) @@ -582,7 +582,7 @@ doneAllocParms.used_wait_atomic = 0; doneAllocParms.sem = &Semaphore; - mf_deallocateLpEvents(remoteLp, HvLpEvent_Type_VirtualIo, + mf_deallocate_lp_events(remoteLp, HvLpEvent_Type_VirtualIo, numReq, &viopath_donealloc, &doneAllocParms); down(&Semaphore); diff -ruN linus-bk-mf.0.6/drivers/net/iseries_veth.c linus-bk-mf.0.7/drivers/net/iseries_veth.c --- linus-bk-mf.0.6/drivers/net/iseries_veth.c 2004-10-20 21:20:19.000000000 +1000 +++ linus-bk-mf.0.7/drivers/net/iseries_veth.c 2004-11-09 16:47:08.000000000 +1100 @@ -248,7 +248,7 @@ { struct veth_allocation vc = { COMPLETION_INITIALIZER(vc.c), 0 }; - mf_allocateLpEvents(rlp, HvLpEvent_Type_VirtualLan, + mf_allocate_lp_events(rlp, HvLpEvent_Type_VirtualLan, sizeof(struct VethLpEvent), number, &veth_complete_allocation, &vc); wait_for_completion(&vc.c); @@ -662,12 +662,12 @@ del_timer_sync(&cnx->ack_timer); if (cnx->num_events > 0) - mf_deallocateLpEvents(cnx->remote_lp, + mf_deallocate_lp_events(cnx->remote_lp, HvLpEvent_Type_VirtualLan, cnx->num_events, NULL, NULL); if (cnx->num_ack_events > 0) - mf_deallocateLpEvents(cnx->remote_lp, + mf_deallocate_lp_events(cnx->remote_lp, HvLpEvent_Type_VirtualLan, cnx->num_ack_events, NULL, NULL); diff -ruN linus-bk-mf.0.6/include/asm-ppc64/iSeries/mf.h linus-bk-mf.0.7/include/asm-ppc64/iSeries/mf.h --- linus-bk-mf.0.6/include/asm-ppc64/iSeries/mf.h 2004-03-17 22:09:24.000000000 +1100 +++ linus-bk-mf.0.7/include/asm-ppc64/iSeries/mf.h 2004-11-09 17:27:54.000000000 +1100 @@ -1,6 +1,7 @@ /* * mf.h * Copyright (C) 2001 Troy D. 
Armstrong IBM Corporation + * Copyright (C) 2004 Stephen Rothwell IBM Corporation * * This modules exists as an interface between a Linux secondary partition * running on an iSeries and the primary partition's Virtual Service @@ -35,18 +36,18 @@ typedef void (*MFCompleteHandler)(void *clientToken, int returnCode); -extern void mf_allocateLpEvents(HvLpIndex targetLp, HvLpEvent_Type type, +extern void mf_allocate_lp_events(HvLpIndex targetLp, HvLpEvent_Type type, unsigned size, unsigned amount, MFCompleteHandler hdlr, void *userToken); -extern void mf_deallocateLpEvents(HvLpIndex targetLp, HvLpEvent_Type type, +extern void mf_deallocate_lp_events(HvLpIndex targetLp, HvLpEvent_Type type, unsigned count, MFCompleteHandler hdlr, void *userToken); -extern void mf_powerOff(void); +extern void mf_power_off(void); extern void mf_reboot(void); -extern void mf_displaySrc(u32 word); -extern void mf_displayProgress(u16 value); -extern void mf_clearSrc(void); +extern void mf_display_src(u32 word); +extern void mf_display_progress(u16 value); +extern void mf_clear_src(void); extern void mf_init(void); @@ -62,9 +63,7 @@ u64 side); extern int mf_getVmlinuxChunk(char *buffer, int *size, int offset, u64 side); -extern int mf_setRtcTime(unsigned long time); -extern int mf_getRtcTime(unsigned long *time); -extern int mf_getRtc( struct rtc_time * tm ); -extern int mf_setRtc( struct rtc_time * tm ); +extern int mf_get_rtc(struct rtc_time *tm); +extern int mf_set_rtc(struct rtc_time *tm); #endif /* MF_H_INCLUDED */ 
From benh at kernel.crashing.org Tue Nov 9 18:53:45 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 09 Nov 2004 18:53:45 +1100 Subject: [PATCH] ppc64: Fix G5 low level i2c code Message-ID: <1099986825.3946.221.camel@gaston> Hi! The code in pmac_low_i2c.c is a low-level synchronous version of the i2c keywest driver, for use by platform code early during boot or during sleep/wakeup cycles to communicate with some motherboard chips, typically clock chips. It wasn't used on G5 until now, which is good because it wasn't 64-bit clean :) This patch fixes that, and also removes the use of udelay(): the code can be used for synchronizing the HW timebase, so it must operate while the timebase is frozen (and our implementation of udelay uses that timebase). Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/ppc64/kernel/pmac_low_i2c.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/pmac_low_i2c.c 2004-11-09 14:56:12.224427352 +1100 +++ linux-work/arch/ppc64/kernel/pmac_low_i2c.c 2004-11-09 16:26:21.187140032 +1100 @@ -16,9 +16,10 @@ * properties parser */ +#undef DEBUG + #include #include -#include #include #include #include @@ -33,12 +34,12 @@ #define MAX_LOW_I2C_HOST 4 -#if 1 +#ifdef DEBUG #define DBG(x...) do {\ printk(KERN_DEBUG "KW:" x); \ } while(0) #else -#define DBGG(x...) +#define DBG(x...) 
#endif struct low_i2c_host; @@ -50,11 +51,11 @@ struct device_node *np; /* OF device node */ struct semaphore mutex; /* Access mutex for use by i2c-keywest */ low_i2c_func_t func; /* Access function */ - unsigned is_open : 1; /* Poor man's access control */ + unsigned int is_open : 1; /* Poor man's access control */ int mode; /* Current mode */ int channel; /* Current channel */ int num_channels; /* Number of channels */ - unsigned long base; /* For keywest-i2c, base address */ + void __iomem *base; /* For keywest-i2c, base address */ int bsteps; /* And register stepping */ int speed; /* And speed */ }; @@ -154,14 +155,12 @@ static inline u8 __kw_read_reg(struct low_i2c_host *host, reg_t reg) { - return in_8(((volatile u8 *)host->base) - + (((unsigned)reg) << host->bsteps)); + return readb(host->base + (((unsigned int)reg) << host->bsteps)); } static inline void __kw_write_reg(struct low_i2c_host *host, reg_t reg, u8 val) { - out_8(((volatile u8 *)host->base) - + (((unsigned)reg) << host->bsteps), val); + writeb(val, host->base + (((unsigned)reg) << host->bsteps)); (void)__kw_read_reg(host, reg_subaddr); } @@ -174,14 +173,19 @@ */ static u8 kw_wait_interrupt(struct low_i2c_host* host) { - int i; + int i, j; u8 isr; - for (i = 0; i < 200000; i++) { + for (i = 0; i < 100000; i++) { isr = kw_read_reg(reg_isr) & KW_I2C_IRQ_MASK; if (isr != 0) return isr; - udelay(1); + + /* This code is used with the timebase frozen, we cannot rely + * on udelay ! 
For now, just use a bogus loop + */ + for (j = 1; j < 10000; j++) + mb(); } return isr; } @@ -190,6 +194,8 @@ { u8 ack; + DBG("kw_handle_interrupt(%s, isr: %x)\n", __kw_state_names[state], isr); + if (isr == 0) { if (state != state_stop) { DBG("KW: Timeout !\n"); @@ -301,11 +307,9 @@ break; case pmac_low_i2c_mode_stdsub: mode_reg |= KW_I2C_MODE_STANDARDSUB; - kw_write_reg(reg_subaddr, subaddr); break; case pmac_low_i2c_mode_combined: mode_reg |= KW_I2C_MODE_COMBINED; - kw_write_reg(reg_subaddr, subaddr); break; } @@ -317,6 +321,11 @@ /* Set up address and r/w bit */ kw_write_reg(reg_addr, addr); + /* Set up the sub address */ + if ((mode_reg & KW_I2C_MODE_MODE_MASK) == KW_I2C_MODE_STANDARDSUB + || (mode_reg & KW_I2C_MODE_MODE_MASK) == KW_I2C_MODE_COMBINED) + kw_write_reg(reg_subaddr, subaddr); + /* Start sending address & disable interrupt*/ kw_write_reg(reg_ier, 0 /*KW_I2C_IRQ_MASK*/); kw_write_reg(reg_control, KW_I2C_CTL_XADDR); @@ -333,7 +342,7 @@ static void keywest_low_i2c_add(struct device_node *np) { struct low_i2c_host *host = find_low_i2c_host(NULL); - unsigned long *psteps, *prate, steps, aoffset = 0; + u32 *psteps, *prate, steps, aoffset = 0; struct device_node *parent; if (host == NULL) { @@ -345,7 +354,7 @@ init_MUTEX(&host->mutex); host->np = of_node_get(np); - psteps = (unsigned long *)get_property(np, "AAPL,address-step", NULL); + psteps = (u32 *)get_property(np, "AAPL,address-step", NULL); steps = psteps ? 
(*psteps) : 0x10; for (host->bsteps = 0; (steps & 0x01) == 0; host->bsteps++) steps >>= 1; @@ -357,7 +366,7 @@ } /* Select interface rate */ host->speed = KW_I2C_MODE_100KHZ; - prate = (unsigned long *)get_property(np, "AAPL,i2c-rate", NULL); + prate = (u32 *)get_property(np, "AAPL,i2c-rate", NULL); if (prate) switch(*prate) { case 100: host->speed = KW_I2C_MODE_100KHZ; @@ -369,8 +378,9 @@ host->speed = KW_I2C_MODE_25KHZ; break; } + host->mode = pmac_low_i2c_mode_std; - host->base = (unsigned long)ioremap(np->addrs[0].address + aoffset, + host->base = ioremap(np->addrs[0].address + aoffset, np->addrs[0].size); host->func = keywest_low_i2c_func; } From sfr at canb.auug.org.au Tue Nov 9 18:55:47 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 9 Nov 2004 18:55:47 +1100 Subject: [PATCH] [4/6] PPC64 iSeries more MF cleanup In-Reply-To: <20041109185131.29e6eabd.sfr@canb.auug.org.au> References: <20041109184223.16ea3414.sfr@canb.auug.org.au> <20041109184551.03b8a32c.sfr@canb.auug.org.au> <20041109184813.1a6e02cf.sfr@canb.auug.org.au> <20041109185131.29e6eabd.sfr@canb.auug.org.au> Message-ID: <20041109185547.6eaf99ee.sfr@canb.auug.org.au> This patch starts the improvement of the style of the MF code: - remove a union that is used where casts suffice - add a helper function (signal_ce_msg_simple) and use it - replace some painful code that converts a byte array to a couple of u32's and then shifts and masks the bytes back out. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk-mf.0.7/arch/ppc64/kernel/mf.c linus-bk-mf.0.8/arch/ppc64/kernel/mf.c --- linus-bk-mf.0.7/arch/ppc64/kernel/mf.c 2004-11-09 17:27:46.000000000 +1100 +++ linus-bk-mf.0.8/arch/ppc64/kernel/mf.c 2004-11-09 17:57:33.000000000 +1100 @@ -48,13 +48,8 @@ * This is the structure layout for the Machine Facilites LPAR event * flows. 
*/ -union safe_cast { - u64 ptr_as_u64; - void *ptr; -}; - struct VspCmdData { - union safe_cast token; + u64 token; u16 cmd; HvLpIndex lp_index; u8 result_code; @@ -215,12 +210,8 @@ if (ev1 == ev) rc = -EIO; - else if (ev1->hdlr != NULL) { - union safe_cast mySafeCast; - - mySafeCast.ptr_as_u64 = ev1->event.hp_lp_event.xCorrelationToken; - (*ev1->hdlr)(mySafeCast.ptr, -EIO); - } + else if (ev1->hdlr != NULL) + (*ev1->hdlr)((void *)ev1->event.hp_lp_event.xCorrelationToken, -EIO); spin_lock_irqsave(&pending_event_spinlock, flags); free_pending_event(ev1); @@ -287,7 +278,7 @@ ev->event.hp_lp_event.xSubtype = 6; ev->event.hp_lp_event.x.xSubtypeData = subtype_data('M', 'F', 'V', 'I'); - ev->event.data.vsp_cmd.token.ptr = &response; + ev->event.data.vsp_cmd.token = (u64)&response; ev->event.data.vsp_cmd.cmd = vspCmd->cmd; ev->event.data.vsp_cmd.lp_index = HvLpConfig_getLpIndex(); ev->event.data.vsp_cmd.result_code = 0xFF; @@ -322,6 +313,18 @@ } /* + * Send a 12-byte CE message (with no data) to the primary partition VSP object + */ +static int signal_ce_msg_simple(u8 ce_op, struct CeMsgCompleteData *completion) +{ + u8 ce_msg[12]; + + memset(ce_msg, 0, sizeof(ce_msg)); + ce_msg[3] = ce_op; + return signal_ce_msg(ce_msg, completion); +} + +/* * Send a 12-byte CE message and DMA data to the primary partition VSP object */ static int dma_and_signal_ce_msg(char *ce_msg, @@ -385,7 +388,7 @@ if ((event->data.ce_msg.ce_msg[5] & 0x20) != 0) { printk(KERN_INFO "mf.c: Commencing partition shutdown\n"); if (shutdown() == 0) - signal_ce_msg("\x00\x00\x00\xDB\x00\x00\x00\x00\x00\x00\x00\x00", NULL); + signal_ce_msg_simple(0xDB, NULL); } break; case 0xC0: /* get time */ @@ -464,16 +467,13 @@ case 4: /* allocate */ case 5: /* deallocate */ if (pending_event_head->hdlr != NULL) { - union safe_cast mySafeCast; - - mySafeCast.ptr_as_u64 = event->hp_lp_event.xCorrelationToken; - (*pending_event_head->hdlr)(mySafeCast.ptr, event->data.alloc.count); + (*pending_event_head->hdlr)((void 
*)event->hp_lp_event.xCorrelationToken, event->data.alloc.count); } freeIt = 1; break; case 6: { - struct VspRspData *rsp = (struct VspRspData *)event->data.vsp_cmd.token.ptr; + struct VspRspData *rsp = (struct VspRspData *)event->data.vsp_cmd.token; if (rsp != NULL) { if (rsp->response != NULL) @@ -543,11 +543,8 @@ if (ev == NULL) { rc = -ENOMEM; } else { - union safe_cast mine; - - mine.ptr = userToken; ev->event.hp_lp_event.xSubtype = 4; - ev->event.hp_lp_event.xCorrelationToken = mine.ptr_as_u64; + ev->event.hp_lp_event.xCorrelationToken = (u64)userToken; ev->event.hp_lp_event.x.xSubtypeData = subtype_data('M', 'F', 'M', 'A'); ev->event.data.alloc.target_lp = targetLp; @@ -575,11 +572,8 @@ if (ev == NULL) rc = -ENOMEM; else { - union safe_cast mine; - - mine.ptr = userToken; ev->event.hp_lp_event.xSubtype = 5; - ev->event.hp_lp_event.xCorrelationToken = mine.ptr_as_u64; + ev->event.hp_lp_event.xCorrelationToken = (u64)userToken; ev->event.hp_lp_event.x.xSubtypeData = subtype_data('M', 'F', 'M', 'D'); ev->event.data.alloc.target_lp = targetLp; @@ -600,8 +594,9 @@ void mf_power_off(void) { printk(KERN_INFO "mf.c: Down it goes...\n"); - signal_ce_msg("\x00\x00\x00\x4D\x00\x00\x00\x00\x00\x00\x00\x00", NULL); - for (;;); + signal_ce_msg_simple(0x4d, NULL); + for (;;) + ; } /* @@ -611,8 +606,9 @@ void mf_reboot(void) { printk(KERN_INFO "mf.c: Preparing to bounce...\n"); - signal_ce_msg("\x00\x00\x00\x4E\x00\x00\x00\x00\x00\x00\x00\x00", NULL); - for (;;); + signal_ce_msg_simple(0x4e, NULL); + for (;;) + ; } /* @@ -659,7 +655,7 @@ */ void mf_clear_src(void) { - signal_ce_msg("\x00\x00\x00\x4B\x00\x00\x00\x00\x00\x00\x00\x00", NULL); + signal_ce_msg_simple(0x4b, NULL); } /* @@ -678,7 +674,7 @@ HvLpEvent_registerHandler(HvLpEvent_Type_MachineFac, &hvHandler); /* virtual continue ack */ - signal_ce_msg("\x00\x00\x00\x57\x00\x00\x00\x00\x00\x00\x00\x00", NULL); + signal_ce_msg_simple(0x57, NULL); /* initialization complete */ printk(KERN_NOTICE "mf.c: iSeries Linux LPAR 
Machine Facilities initialized\n"); @@ -935,8 +931,7 @@ init_completion(&rtcData.com); ceComplete.handler = &getRtcTimeComplete; ceComplete.token = (void *)&rtcData; - rc = signal_ce_msg("\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00", - &ceComplete); + rc = signal_ce_msg_simple(0x40, &ce_complete); if (rc == 0) { wait_for_completion(&rtcData.com); @@ -953,14 +948,13 @@ mf_set_rtc(tm); } { - u32 dataWord1 = *((u32 *)(rtcData.xCeMsg.ce_msg+4)); - u32 dataWord2 = *((u32 *)(rtcData.xCeMsg.ce_msg+8)); - u8 year = (dataWord1 >> 16) & 0x000000FF; - u8 sec = (dataWord1 >> 8) & 0x000000FF; - u8 min = dataWord1 & 0x000000FF; - u8 hour = (dataWord2 >> 24) & 0x000000FF; - u8 day = (dataWord2 >> 8) & 0x000000FF; - u8 mon = dataWord2 & 0x000000FF; + u8 *ce_msg = rtcData.xCeMsg.ce_msg; + u8 year = ce_msg[5]; + u8 sec = ce_msg[6]; + u8 min = ce_msg[7]; + u8 hour = ce_msg[8]; + u8 day = ce_msg[10]; + u8 mon = ce_msg[11]; BCD_TO_BIN(sec); BCD_TO_BIN(min); @@ -997,7 +991,7 @@ return rc; } -int mf_set_rtc(struct rtc_time * tm) +int mf_set_rtc(struct rtc_time *tm) { char ceTime[12] = "\x00\x00\x00\x41\x00\x00\x00\x00\x00\x00\x00\x00"; u8 day, mon, hour, min, sec, y1, y2; 
From sfr at canb.auug.org.au Tue Nov 9 18:57:59 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 9 Nov 2004 18:57:59 +1100 Subject: [PATCH] [5/6] PPC64 iSeries remove more Studly Caps from MF code In-Reply-To: <20041109185547.6eaf99ee.sfr@canb.auug.org.au> References: <20041109184223.16ea3414.sfr@canb.auug.org.au> <20041109184551.03b8a32c.sfr@canb.auug.org.au> <20041109184813.1a6e02cf.sfr@canb.auug.org.au> <20041109185131.29e6eabd.sfr@canb.auug.org.au> <20041109185547.6eaf99ee.sfr@canb.auug.org.au> Message-ID: <20041109185759.493d19fd.sfr@canb.auug.org.au> Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk-mf.0.8/arch/ppc64/kernel/mf.c linus-bk-mf.0.9/arch/ppc64/kernel/mf.c --- linus-bk-mf.0.8/arch/ppc64/kernel/mf.c 2004-11-09 17:57:33.000000000 +1100 +++ linus-bk-mf.0.9/arch/ppc64/kernel/mf.c 2004-11-09 18:19:18.000000000 +1100 @@ -48,7 +48,7 @@ * This is the structure layout for the Machine Facilites LPAR event * flows. 
*/ -struct VspCmdData { +struct vsp_cmd_data { u64 token; u16 cmd; HvLpIndex lp_index; @@ -77,12 +77,12 @@ } sub_data; }; -struct VspRspData { +struct vsp_rsp_data { struct completion com; - struct VspCmdData *response; + struct vsp_cmd_data *response; }; -struct AllocData { +struct alloc_data { u16 size; u16 type; u32 count; @@ -91,30 +91,30 @@ HvLpIndex target_lp; }; -struct CeMsgData; +struct ce_msg_data; -typedef void (*CeMsgCompleteHandler)(void *token, struct CeMsgData *vspCmdRsp); +typedef void (*ce_msg_comp_hdlr)(void *token, struct ce_msg_data *vsp_cmd_rsp); -struct CeMsgCompleteData { - CeMsgCompleteHandler handler; +struct ce_msg_comp_data { + ce_msg_comp_hdlr handler; void *token; }; -struct CeMsgData { +struct ce_msg_data { u8 ce_msg[12]; char reserved[4]; - struct CeMsgCompleteData *completion; + struct ce_msg_comp_data *completion; }; -struct IoMFLpEvent { +struct io_mf_lp_event { struct HvLpEvent hp_lp_event; u16 subtype_result_code; u16 reserved1; u32 reserved2; union { - struct AllocData alloc; - struct CeMsgData ce_msg; - struct VspCmdData vsp_cmd; + struct alloc_data alloc; + struct ce_msg_data ce_msg; + struct vsp_cmd_data vsp_cmd; } data; }; @@ -130,7 +130,7 @@ */ struct pending_event { struct pending_event *next; - struct IoMFLpEvent event; + struct io_mf_lp_event event; MFCompleteHandler hdlr; char dma_data[72]; unsigned dma_data_length; @@ -168,7 +168,7 @@ unsigned long flags; int go = 1; struct pending_event *ev1; - HvLpEvent_Rc hvRc; + HvLpEvent_Rc hv_rc; /* enqueue the event */ if (ev != NULL) { @@ -195,11 +195,11 @@ pending_event_head->dma_data_length, HvLpDma_Direction_LocalToRemote); - hvRc = HvCallEvent_signalLpEvent( + hv_rc = HvCallEvent_signalLpEvent( &pending_event_head->event.hp_lp_event); - if (hvRc != HvLpEvent_Rc_Good) { + if (hv_rc != HvLpEvent_Rc_Good) { printk(KERN_ERR "mf.c: HvCallEvent_signalLpEvent() failed with %d\n", - (int)hvRc); + (int)hv_rc); spin_lock_irqsave(&pending_event_spinlock, flags); ev1 = 
pending_event_head; @@ -228,7 +228,7 @@ static struct pending_event *new_pending_event(void) { struct pending_event *ev = NULL; - HvLpIndex primaryLp = HvLpConfig_getPrimaryLpIndex(); + HvLpIndex primary_lp = HvLpConfig_getPrimaryLpIndex(); unsigned long flags; struct HvLpEvent *hev; @@ -253,38 +253,38 @@ hev->xFlags.xFunction = HvLpEvent_Function_Int; hev->xType = HvLpEvent_Type_MachineFac; hev->xSourceLp = HvLpConfig_getLpIndex(); - hev->xTargetLp = primaryLp; + hev->xTargetLp = primary_lp; hev->xSizeMinus1 = sizeof(ev->event)-1; hev->xRc = HvLpEvent_Rc_Good; - hev->xSourceInstanceId = HvCallEvent_getSourceLpInstanceId(primaryLp, + hev->xSourceInstanceId = HvCallEvent_getSourceLpInstanceId(primary_lp, HvLpEvent_Type_MachineFac); - hev->xTargetInstanceId = HvCallEvent_getTargetLpInstanceId(primaryLp, + hev->xTargetInstanceId = HvCallEvent_getTargetLpInstanceId(primary_lp, HvLpEvent_Type_MachineFac); return ev; } -static int signal_vsp_instruction(struct VspCmdData *vspCmd) +static int signal_vsp_instruction(struct vsp_cmd_data *vsp_cmd) { struct pending_event *ev = new_pending_event(); int rc; - struct VspRspData response; + struct vsp_rsp_data response; if (ev == NULL) return -ENOMEM; init_completion(&response.com); - response.response = vspCmd; + response.response = vsp_cmd; ev->event.hp_lp_event.xSubtype = 6; ev->event.hp_lp_event.x.xSubtypeData = subtype_data('M', 'F', 'V', 'I'); ev->event.data.vsp_cmd.token = (u64)&response; - ev->event.data.vsp_cmd.cmd = vspCmd->cmd; + ev->event.data.vsp_cmd.cmd = vsp_cmd->cmd; ev->event.data.vsp_cmd.lp_index = HvLpConfig_getLpIndex(); ev->event.data.vsp_cmd.result_code = 0xFF; ev->event.data.vsp_cmd.reserved = 0; memcpy(&(ev->event.data.vsp_cmd.sub_data), - &(vspCmd->sub_data), sizeof(vspCmd->sub_data)); + &(vsp_cmd->sub_data), sizeof(vsp_cmd->sub_data)); mb(); rc = signal_event(ev); @@ -297,7 +297,7 @@ /* * Send a 12-byte CE message to the primary partition VSP object */ -static int signal_ce_msg(char *ce_msg, struct 
CeMsgCompleteData *completion) +static int signal_ce_msg(char *ce_msg, struct ce_msg_comp_data *completion) { struct pending_event *ev = new_pending_event(); @@ -315,7 +315,7 @@ /* * Send a 12-byte CE message (with no data) to the primary partition VSP object */ -static int signal_ce_msg_simple(u8 ce_op, struct CeMsgCompleteData *completion) +static int signal_ce_msg_simple(u8 ce_op, struct ce_msg_comp_data *completion) { u8 ce_msg[12]; @@ -328,7 +328,7 @@ * Send a 12-byte CE message and DMA data to the primary partition VSP object */ static int dma_and_signal_ce_msg(char *ce_msg, - struct CeMsgCompleteData *completion, void *dma_data, + struct ce_msg_comp_data *completion, void *dma_data, unsigned dma_data_length, unsigned remote_address) { struct pending_event *ev = new_pending_event(); @@ -371,9 +371,9 @@ * The primary partition VSP object is sending us a new * event flow. Handle it... */ -static void intReceived(struct IoMFLpEvent *event) +static void handle_int(struct io_mf_lp_event *event) { - int freeIt = 0; + int free_it = 0; struct pending_event *two = NULL; /* ack the interrupt */ @@ -396,9 +396,9 @@ (pending_event_head->event.data.ce_msg.ce_msg[3] != 0x40)) break; - freeIt = 1; + free_it = 1; if (pending_event_head->event.data.ce_msg.completion != 0) { - CeMsgCompleteHandler handler = pending_event_head->event.data.ce_msg.completion->handler; + ce_msg_comp_hdlr handler = pending_event_head->event.data.ce_msg.completion->handler; void *token = pending_event_head->event.data.ce_msg.completion->token; if (handler != NULL) @@ -408,7 +408,7 @@ } /* remove from queue */ - if (freeIt == 1) { + if (free_it == 1) { unsigned long flags; spin_lock_irqsave(&pending_event_spinlock, flags); @@ -439,11 +439,11 @@ * of a flow we sent to them. If there are other flows queued * up, we must send another one now... 
*/ -static void ackReceived(struct IoMFLpEvent *event) +static void handle_ack(struct io_mf_lp_event *event) { unsigned long flags; struct pending_event * two = NULL; - unsigned long freeIt = 0; + unsigned long free_it = 0; /* handle current event */ if (pending_event_head != NULL) { @@ -451,10 +451,10 @@ case 0: /* CE msg */ if (event->data.ce_msg.ce_msg[3] == 0x40) { if (event->data.ce_msg.ce_msg[2] != 0) { - freeIt = 1; + free_it = 1; if (pending_event_head->event.data.ce_msg.completion != 0) { - CeMsgCompleteHandler handler = pending_event_head->event.data.ce_msg.completion->handler; + ce_msg_comp_hdlr handler = pending_event_head->event.data.ce_msg.completion->handler; void *token = pending_event_head->event.data.ce_msg.completion->token; if (handler != NULL) @@ -462,18 +462,18 @@ } } } else - freeIt = 1; + free_it = 1; break; case 4: /* allocate */ case 5: /* deallocate */ if (pending_event_head->hdlr != NULL) { (*pending_event_head->hdlr)((void *)event->hp_lp_event.xCorrelationToken, event->data.alloc.count); } - freeIt = 1; + free_it = 1; break; case 6: { - struct VspRspData *rsp = (struct VspRspData *)event->data.vsp_cmd.token; + struct vsp_rsp_data *rsp = (struct vsp_rsp_data *)event->data.vsp_cmd.token; if (rsp != NULL) { if (rsp->response != NULL) @@ -481,7 +481,7 @@ complete(&rsp->com); } else printk(KERN_ERR "mf.c: no rsp\n"); - freeIt = 1; + free_it = 1; } break; } @@ -491,7 +491,7 @@ /* remove from queue */ spin_lock_irqsave(&pending_event_spinlock, flags); - if ((pending_event_head != NULL) && (freeIt == 1)) { + if ((pending_event_head != NULL) && (free_it == 1)) { struct pending_event *oldHead = pending_event_head; pending_event_head = pending_event_head->next; @@ -511,15 +511,15 @@ * parse it enough to know if it is an interrupt or an * acknowledge. 
*/ -static void hvHandler(struct HvLpEvent *event, struct pt_regs *regs) +static void hv_handler(struct HvLpEvent *event, struct pt_regs *regs) { if ((event != NULL) && (event->xType == HvLpEvent_Type_MachineFac)) { switch(event->xFlags.xFunction) { case HvLpEvent_Function_Ack: - ackReceived((struct IoMFLpEvent *)event); + handle_ack((struct io_mf_lp_event *)event); break; case HvLpEvent_Function_Int: - intReceived((struct IoMFLpEvent *)event); + handle_int((struct io_mf_lp_event *)event); break; default: printk(KERN_ERR "mf.c: non ack/int event received\n"); @@ -533,9 +533,9 @@ * Global kernel interface to allocate and seed events into the * Hypervisor. */ -void mf_allocate_lp_events(HvLpIndex targetLp, HvLpEvent_Type type, +void mf_allocate_lp_events(HvLpIndex target_lp, HvLpEvent_Type type, unsigned size, unsigned count, MFCompleteHandler hdlr, - void *userToken) + void *user_token) { struct pending_event *ev = new_pending_event(); int rc; @@ -544,10 +544,10 @@ rc = -ENOMEM; } else { ev->event.hp_lp_event.xSubtype = 4; - ev->event.hp_lp_event.xCorrelationToken = (u64)userToken; + ev->event.hp_lp_event.xCorrelationToken = (u64)user_token; ev->event.hp_lp_event.x.xSubtypeData = subtype_data('M', 'F', 'M', 'A'); - ev->event.data.alloc.target_lp = targetLp; + ev->event.data.alloc.target_lp = target_lp; ev->event.data.alloc.type = type; ev->event.data.alloc.size = size; ev->event.data.alloc.count = count; @@ -555,7 +555,7 @@ rc = signal_event(ev); } if ((rc != 0) && (hdlr != NULL)) - (*hdlr)(userToken, rc); + (*hdlr)(user_token, rc); } EXPORT_SYMBOL(mf_allocate_lp_events); @@ -563,8 +563,8 @@ * Global kernel interface to unseed and deallocate events already in * Hypervisor. 
*/ -void mf_deallocate_lp_events(HvLpIndex targetLp, HvLpEvent_Type type, - unsigned count, MFCompleteHandler hdlr, void *userToken) +void mf_deallocate_lp_events(HvLpIndex target_lp, HvLpEvent_Type type, + unsigned count, MFCompleteHandler hdlr, void *user_token) { struct pending_event *ev = new_pending_event(); int rc; @@ -573,17 +573,17 @@ rc = -ENOMEM; else { ev->event.hp_lp_event.xSubtype = 5; - ev->event.hp_lp_event.xCorrelationToken = (u64)userToken; + ev->event.hp_lp_event.xCorrelationToken = (u64)user_token; ev->event.hp_lp_event.x.xSubtypeData = subtype_data('M', 'F', 'M', 'D'); - ev->event.data.alloc.target_lp = targetLp; + ev->event.data.alloc.target_lp = target_lp; ev->event.data.alloc.type = type; ev->event.data.alloc.count = count; ev->hdlr = hdlr; rc = signal_event(ev); } if ((rc != 0) && (hdlr != NULL)) - (*hdlr)(userToken, rc); + (*hdlr)(user_token, rc); } EXPORT_SYMBOL(mf_deallocate_lp_events); @@ -671,7 +671,7 @@ i < sizeof(pending_event_prealloc) / sizeof(*pending_event_prealloc); ++i) free_pending_event(&pending_event_prealloc[i]); - HvLpEvent_registerHandler(HvLpEvent_Type_MachineFac, &hvHandler); + HvLpEvent_registerHandler(HvLpEvent_Type_MachineFac, &hv_handler); /* virtual continue ack */ signal_ce_msg_simple(0x57, NULL); @@ -682,60 +682,60 @@ void mf_setSide(char side) { - u64 newSide; - struct VspCmdData myVspCmd; + u64 new_side; + struct vsp_cmd_data vsp_cmd; - memset(&myVspCmd, 0, sizeof(myVspCmd)); + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); switch (side) { - case 'A': newSide = 0; + case 'A': new_side = 0; break; - case 'B': newSide = 1; + case 'B': new_side = 1; break; - case 'C': newSide = 2; + case 'C': new_side = 2; break; - default: newSide = 3; + default: new_side = 3; break; } - myVspCmd.sub_data.ipl_type = newSide; - myVspCmd.cmd = 10; + vsp_cmd.sub_data.ipl_type = new_side; + vsp_cmd.cmd = 10; - (void)signal_vsp_instruction(&myVspCmd); + (void)signal_vsp_instruction(&vsp_cmd); } char mf_getSide(void) { - char returnValue = ' '; 
+ char return_value = ' '; int rc = 0; - struct VspCmdData myVspCmd; + struct vsp_cmd_data vsp_cmd; - memset(&myVspCmd, 0, sizeof(myVspCmd)); - myVspCmd.cmd = 2; - myVspCmd.sub_data.ipl_type = 0; + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 2; + vsp_cmd.sub_data.ipl_type = 0; mb(); - rc = signal_vsp_instruction(&myVspCmd); + rc = signal_vsp_instruction(&vsp_cmd); if (rc != 0) - return returnValue; + return return_value; - if (myVspCmd.result_code == 0) { - switch (myVspCmd.sub_data.ipl_type) { - case 0: returnValue = 'A'; + if (vsp_cmd.result_code == 0) { + switch (vsp_cmd.sub_data.ipl_type) { + case 0: return_value = 'A'; break; - case 1: returnValue = 'B'; + case 1: return_value = 'B'; break; - case 2: returnValue = 'C'; + case 2: return_value = 'C'; break; - default: returnValue = 'D'; + default: return_value = 'D'; break; } } - return returnValue; + return return_value; } void mf_getSrcHistory(char *buffer, int size) { #if 0 - struct IplTypeReturnStuff returnStuff; + struct IplTypeReturnStuff return_stuff; struct pending_event *ev = new_pending_event(); int rc = 0; char *pages[4]; @@ -748,13 +748,13 @@ || (pages[2] == NULL) || (pages[3] == NULL)) return -ENOMEM; - returnStuff.xType = 0; - returnStuff.xRc = 0; - returnStuff.xDone = 0; + return_stuff.xType = 0; + return_stuff.xRc = 0; + return_stuff.xDone = 0; ev->event.hp_lp_event.xSubtype = 6; ev->event.hp_lp_event.x.xSubtypeData = subtype_data('M', 'F', 'V', 'I'); - ev->event.data.vsp_cmd.xEvent = &returnStuff; + ev->event.data.vsp_cmd.xEvent = &return_stuff; ev->event.data.vsp_cmd.cmd = 4; ev->event.data.vsp_cmd.lp_index = HvLpConfig_getLpIndex(); ev->event.data.vsp_cmd.result_code = 0xFF; @@ -767,9 +767,9 @@ if (signal_event(ev) != 0) return; - while (returnStuff.xDone != 1) + while (return_stuff.xDone != 1) udelay(10); - if (returnStuff.xRc == 0) + if (return_stuff.xRc == 0) memcpy(buffer, pages[0], size); kfree(pages[0]); kfree(pages[1]); @@ -780,7 +780,7 @@ void mf_setCmdLine(const char 
*cmdline, int size, u64 side) { - struct VspCmdData myVspCmd; + struct vsp_cmd_data vsp_cmd; dma_addr_t dma_addr = 0; char *page = dma_alloc_coherent(iSeries_vio_dev, size, &dma_addr, GFP_ATOMIC); @@ -792,21 +792,21 @@ copy_from_user(page, cmdline, size); - memset(&myVspCmd, 0, sizeof(myVspCmd)); - myVspCmd.cmd = 31; - myVspCmd.sub_data.kern.token = dma_addr; - myVspCmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - myVspCmd.sub_data.kern.side = side; - myVspCmd.sub_data.kern.length = size; + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 31; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = side; + vsp_cmd.sub_data.kern.length = size; mb(); - (void)signal_vsp_instruction(&myVspCmd); + (void)signal_vsp_instruction(&vsp_cmd); dma_free_coherent(iSeries_vio_dev, size, page, dma_addr); } int mf_getCmdLine(char *cmdline, int *size, u64 side) { - struct VspCmdData myVspCmd; + struct vsp_cmd_data vsp_cmd; int rc; int len = *size; dma_addr_t dma_addr; @@ -814,18 +814,18 @@ dma_addr = dma_map_single(iSeries_vio_dev, cmdline, len, DMA_FROM_DEVICE); memset(cmdline, 0, len); - memset(&myVspCmd, 0, sizeof(myVspCmd)); - myVspCmd.cmd = 33; - myVspCmd.sub_data.kern.token = dma_addr; - myVspCmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - myVspCmd.sub_data.kern.side = side; - myVspCmd.sub_data.kern.length = len; + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 33; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = side; + vsp_cmd.sub_data.kern.length = len; mb(); - rc = signal_vsp_instruction(&myVspCmd); + rc = signal_vsp_instruction(&vsp_cmd); if (rc == 0) { - if (myVspCmd.result_code == 0) - len = myVspCmd.sub_data.length_out; + if (vsp_cmd.result_code == 0) + len = vsp_cmd.sub_data.length_out; #if 0 else memcpy(cmdline, "Bad cmdline", 11); @@ -840,7 +840,7 
@@ int mf_setVmlinuxChunk(const char *buffer, int size, int offset, u64 side) { - struct VspCmdData myVspCmd; + struct vsp_cmd_data vsp_cmd; int rc; dma_addr_t dma_addr = 0; char *page = dma_alloc_coherent(iSeries_vio_dev, size, &dma_addr, @@ -852,18 +852,18 @@ } copy_from_user(page, buffer, size); - memset(&myVspCmd, 0, sizeof(myVspCmd)); + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); - myVspCmd.cmd = 30; - myVspCmd.sub_data.kern.token = dma_addr; - myVspCmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - myVspCmd.sub_data.kern.side = side; - myVspCmd.sub_data.kern.offset = offset; - myVspCmd.sub_data.kern.length = size; + vsp_cmd.cmd = 30; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = side; + vsp_cmd.sub_data.kern.offset = offset; + vsp_cmd.sub_data.kern.length = size; mb(); - rc = signal_vsp_instruction(&myVspCmd); + rc = signal_vsp_instruction(&vsp_cmd); if (rc == 0) { - if (myVspCmd.result_code == 0) + if (vsp_cmd.result_code == 0) rc = 0; else rc = -ENOMEM; @@ -876,7 +876,7 @@ int mf_getVmlinuxChunk(char *buffer, int *size, int offset, u64 side) { - struct VspCmdData myVspCmd; + struct vsp_cmd_data vsp_cmd; int rc; int len = *size; dma_addr_t dma_addr; @@ -884,18 +884,18 @@ dma_addr = dma_map_single(iSeries_vio_dev, buffer, len, DMA_FROM_DEVICE); memset(buffer, 0, len); - memset(&myVspCmd, 0, sizeof(myVspCmd)); - myVspCmd.cmd = 32; - myVspCmd.sub_data.kern.token = dma_addr; - myVspCmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - myVspCmd.sub_data.kern.side = side; - myVspCmd.sub_data.kern.offset = offset; - myVspCmd.sub_data.kern.length = len; + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 32; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = side; + vsp_cmd.sub_data.kern.offset = offset; + vsp_cmd.sub_data.kern.length = len; mb(); - rc = 
signal_vsp_instruction(&myVspCmd); + rc = signal_vsp_instruction(&vsp_cmd); if (rc == 0) { - if (myVspCmd.result_code == 0) - *size = myVspCmd.sub_data.length_out; + if (vsp_cmd.result_code == 0) + *size = vsp_cmd.sub_data.length_out; else rc = -ENOMEM; } @@ -905,39 +905,39 @@ return rc; } -struct RtcTimeData { +struct rtc_time_data { struct completion com; - struct CeMsgData xCeMsg; - int xRc; + struct ce_msg_data ce_msg; + int rc; }; -void getRtcTimeComplete(void * token, struct CeMsgData *ceMsg) +static void get_rtc_time_complete(void *token, struct ce_msg_data *ce_msg) { - struct RtcTimeData *rtc = (struct RtcTimeData *)token; + struct rtc_time_data *rtc = token; - memcpy(&(rtc->xCeMsg), ceMsg, sizeof(rtc->xCeMsg)); - rtc->xRc = 0; + memcpy(&rtc->ce_msg, ce_msg, sizeof(rtc->ce_msg)); + rtc->rc = 0; complete(&rtc->com); } int mf_get_rtc(struct rtc_time *tm) { - struct CeMsgCompleteData ceComplete; - struct RtcTimeData rtcData; + struct ce_msg_comp_data ce_complete; + struct rtc_time_data rtc_data; int rc; - memset(&ceComplete, 0, sizeof(ceComplete)); - memset(&rtcData, 0, sizeof(rtcData)); - init_completion(&rtcData.com); - ceComplete.handler = &getRtcTimeComplete; - ceComplete.token = (void *)&rtcData; + memset(&ce_complete, 0, sizeof(ce_complete)); + memset(&rtc_data, 0, sizeof(rtc_data)); + init_completion(&rtc_data.com); + ce_complete.handler = &get_rtc_time_complete; + ce_complete.token = &rtc_data; rc = signal_ce_msg_simple(0x40, &ce_complete); if (rc == 0) { - wait_for_completion(&rtcData.com); + wait_for_completion(&rtc_data.com); - if (rtcData.xRc == 0) { - if ((rtcData.xCeMsg.ce_msg[2] == 0xa9) || - (rtcData.xCeMsg.ce_msg[2] == 0xaf)) { + if (rtc_data.rc == 0) { + if ((rtc_data.ce_msg.ce_msg[2] == 0xa9) || + (rtc_data.ce_msg.ce_msg[2] == 0xaf)) { /* TOD clock is not set */ tm->tm_sec = 1; tm->tm_min = 1; @@ -948,7 +948,7 @@ mf_set_rtc(tm); } { - u8 *ce_msg = rtcData.xCeMsg.ce_msg; + u8 *ce_msg = rtc_data.ce_msg.ce_msg; u8 year = ce_msg[5]; u8 sec = 
ce_msg[6]; u8 min = ce_msg[7]; @@ -974,7 +974,7 @@ tm->tm_year = year; } } else { - rc = rtcData.xRc; + rc = rtc_data.rc; tm->tm_sec = 0; tm->tm_min = 0; tm->tm_hour = 0; @@ -993,7 +993,7 @@ int mf_set_rtc(struct rtc_time *tm) { - char ceTime[12] = "\x00\x00\x00\x41\x00\x00\x00\x00\x00\x00\x00\x00"; + char ce_time[12] = "\x00\x00\x00\x41\x00\x00\x00\x00\x00\x00\x00\x00"; u8 day, mon, hour, min, sec, y1, y2; unsigned year; @@ -1015,15 +1015,15 @@ BIN_TO_BCD(y1); BIN_TO_BCD(y2); - ceTime[4] = y1; - ceTime[5] = y2; - ceTime[6] = sec; - ceTime[7] = min; - ceTime[8] = hour; - ceTime[10] = day; - ceTime[11] = mon; + ce_time[4] = y1; + ce_time[5] = y2; + ce_time[6] = sec; + ce_time[7] = min; + ce_time[8] = hour; + ce_time[10] = day; + ce_time[11] = mon; - return signal_ce_msg(ceTime, NULL); + return signal_ce_msg(ce_time, NULL); } static int proc_mf_dump_cmdline(char *page, char **start, off_t off, From sfr at canb.auug.org.au Tue Nov 9 19:02:26 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 9 Nov 2004 19:02:26 +1100 Subject: [PATCH] [6/6] PPC64 iSeries last of the cleanups fo the MF code In-Reply-To: <20041109185759.493d19fd.sfr@canb.auug.org.au> References: <20041109184223.16ea3414.sfr@canb.auug.org.au> <20041109184551.03b8a32c.sfr@canb.auug.org.au> <20041109184813.1a6e02cf.sfr@canb.auug.org.au> <20041109185131.29e6eabd.sfr@canb.auug.org.au> <20041109185547.6eaf99ee.sfr@canb.auug.org.au> <20041109185759.493d19fd.sfr@canb.auug.org.au> Message-ID: <20041109190226.035641e0.sfr@canb.auug.org.au> This last patch is a bit of a mess because it mainly consists of combining some single-use small functions into their callers and rearranging some other code.
Some intermediate variables are introduced and some code is restructured to improve its readability (and hopefully maintainability). Overall there are no semantic changes. Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk-mf.0.9/arch/ppc64/kernel/iSeries_setup.c linus-bk-mf.1/arch/ppc64/kernel/iSeries_setup.c --- linus-bk-mf.0.9/arch/ppc64/kernel/iSeries_setup.c 2004-11-09 16:48:02.000000000 +1100 +++ linus-bk-mf.1/arch/ppc64/kernel/iSeries_setup.c 2004-11-09 14:14:38.000000000 +1100 @@ -55,6 +55,7 @@ #include #include #include +#include extern void hvlog(char *fmt, ...); diff -ruN linus-bk-mf.0.9/arch/ppc64/kernel/mf.c linus-bk-mf.1/arch/ppc64/kernel/mf.c --- linus-bk-mf.0.9/arch/ppc64/kernel/mf.c 2004-11-09 18:19:18.000000000 +1100 +++ linus-bk-mf.1/arch/ppc64/kernel/mf.c 2004-11-09 16:26:01.000000000 +1100 @@ -25,24 +25,21 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#include #include #include #include #include -#include -#include #include -#include -#include #include -#include -#include -#include -#include #include #include + +#include +#include #include +#include +#include +#include /* * This is the structure layout for the Machine Facilites LPAR event @@ -198,8 +195,8 @@ hv_rc = HvCallEvent_signalLpEvent( &pending_event_head->event.hp_lp_event); if (hv_rc != HvLpEvent_Rc_Good) { - printk(KERN_ERR "mf.c: HvCallEvent_signalLpEvent() failed with %d\n", - (int)hv_rc); + printk(KERN_ERR "mf.c: HvCallEvent_signalLpEvent() " + "failed with %d\n", (int)hv_rc); spin_lock_irqsave(&pending_event_spinlock, flags); ev1 = pending_event_head; @@ -238,12 +235,13 @@ pending_event_avail = pending_event_avail->next; } spin_unlock_irqrestore(&pending_event_spinlock, flags); - if (ev == NULL) - ev = kmalloc(sizeof(struct pending_event),GFP_ATOMIC); if (ev == NULL) { - printk(KERN_ERR "mf.c: unable to kmalloc %ld bytes\n", - sizeof(struct pending_event)); -
return NULL; + ev = kmalloc(sizeof(struct pending_event), GFP_ATOMIC); + if (ev == NULL) { + printk(KERN_ERR "mf.c: unable to kmalloc %ld bytes\n", + sizeof(struct pending_event)); + return NULL; + } } memset(ev, 0, sizeof(struct pending_event)); hev = &ev->event.hp_lp_event; @@ -254,7 +252,7 @@ hev->xType = HvLpEvent_Type_MachineFac; hev->xSourceLp = HvLpConfig_getLpIndex(); hev->xTargetLp = primary_lp; - hev->xSizeMinus1 = sizeof(ev->event)-1; + hev->xSizeMinus1 = sizeof(ev->event) - 1; hev->xRc = HvLpEvent_Rc_Good; hev->xSourceInstanceId = HvCallEvent_getSourceLpInstanceId(primary_lp, HvLpEvent_Type_MachineFac); @@ -373,8 +371,10 @@ */ static void handle_int(struct io_mf_lp_event *event) { - int free_it = 0; - struct pending_event *two = NULL; + struct ce_msg_data *ce_msg_data; + struct ce_msg_data *pce_msg_data; + unsigned long flags; + struct pending_event *pev; /* ack the interrupt */ event->hp_lp_event.xRc = HvLpEvent_Rc_Good; @@ -383,49 +383,42 @@ /* process interrupt */ switch (event->hp_lp_event.xSubtype) { case 0: /* CE message */ - switch (event->data.ce_msg.ce_msg[3]) { + ce_msg_data = &event->data.ce_msg; + switch (ce_msg_data->ce_msg[3]) { case 0x5B: /* power control notification */ - if ((event->data.ce_msg.ce_msg[5] & 0x20) != 0) { + if ((ce_msg_data->ce_msg[5] & 0x20) != 0) { printk(KERN_INFO "mf.c: Commencing partition shutdown\n"); if (shutdown() == 0) signal_ce_msg_simple(0xDB, NULL); } break; case 0xC0: /* get time */ - if ((pending_event_head == NULL) || - (pending_event_head->event.data.ce_msg.ce_msg[3] - != 0x40)) + spin_lock_irqsave(&pending_event_spinlock, flags); + pev = pending_event_head; + if (pev != NULL) + pending_event_head = pending_event_head->next; + spin_unlock_irqrestore(&pending_event_spinlock, flags); + if (pev == NULL) break; - free_it = 1; - if (pending_event_head->event.data.ce_msg.completion != 0) { - ce_msg_comp_hdlr handler = pending_event_head->event.data.ce_msg.completion->handler; - void *token = 
pending_event_head->event.data.ce_msg.completion->token; + pce_msg_data = &pev->event.data.ce_msg; + if (pce_msg_data->ce_msg[3] != 0x40) + break; + if (pce_msg_data->completion != NULL) { + ce_msg_comp_hdlr handler = + pce_msg_data->completion->handler; + void *token = pce_msg_data->completion->token; if (handler != NULL) - (*handler)(token, &(event->data.ce_msg)); + (*handler)(token, ce_msg_data); } - break; - } - - /* remove from queue */ - if (free_it == 1) { - unsigned long flags; - spin_lock_irqsave(&pending_event_spinlock, flags); - if (pending_event_head != NULL) { - struct pending_event *oldHead = - pending_event_head; - - pending_event_head = pending_event_head->next; - two = pending_event_head; - free_pending_event(oldHead); - } + free_pending_event(pev); spin_unlock_irqrestore(&pending_event_spinlock, flags); + /* send next waiting event */ + if (pending_event_head != NULL) + signal_event(NULL); + break; } - - /* send next waiting event */ - if (two != NULL) - signal_event(NULL); break; case 1: /* IT sys shutdown */ printk(KERN_INFO "mf.c: Commencing system shutdown\n"); @@ -442,52 +435,57 @@ static void handle_ack(struct io_mf_lp_event *event) { unsigned long flags; - struct pending_event * two = NULL; + struct pending_event *two = NULL; unsigned long free_it = 0; + struct ce_msg_data *ce_msg_data; + struct ce_msg_data *pce_msg_data; + struct vsp_rsp_data *rsp; /* handle current event */ - if (pending_event_head != NULL) { - switch (event->hp_lp_event.xSubtype) { - case 0: /* CE msg */ - if (event->data.ce_msg.ce_msg[3] == 0x40) { - if (event->data.ce_msg.ce_msg[2] != 0) { - free_it = 1; - if (pending_event_head->event.data.ce_msg.completion - != 0) { - ce_msg_comp_hdlr handler = pending_event_head->event.data.ce_msg.completion->handler; - void *token = pending_event_head->event.data.ce_msg.completion->token; - - if (handler != NULL) - (*handler)(token, &(event->data.ce_msg)); - } - } - } else - free_it = 1; - break; - case 4: /* allocate */ - case 5: 
/* deallocate */ - if (pending_event_head->hdlr != NULL) { - (*pending_event_head->hdlr)((void *)event->hp_lp_event.xCorrelationToken, event->data.alloc.count); - } + if (pending_event_head == NULL) { + printk(KERN_ERR "mf.c: stack empty for receiving ack\n"); + return; + } + + switch (event->hp_lp_event.xSubtype) { + case 0: /* CE msg */ + ce_msg_data = &event->data.ce_msg; + if (ce_msg_data->ce_msg[3] != 0x40) { free_it = 1; break; - case 6: - { - struct vsp_rsp_data *rsp = (struct vsp_rsp_data *)event->data.vsp_cmd.token; - - if (rsp != NULL) { - if (rsp->response != NULL) - memcpy(rsp->response, &(event->data.vsp_cmd), sizeof(event->data.vsp_cmd)); - complete(&rsp->com); - } else - printk(KERN_ERR "mf.c: no rsp\n"); - free_it = 1; - } + } + if (ce_msg_data->ce_msg[2] == 0) break; + free_it = 1; + pce_msg_data = &pending_event_head->event.data.ce_msg; + if (pce_msg_data->completion != NULL) { + ce_msg_comp_hdlr handler = + pce_msg_data->completion->handler; + void *token = pce_msg_data->completion->token; + + if (handler != NULL) + (*handler)(token, ce_msg_data); } + break; + case 4: /* allocate */ + case 5: /* deallocate */ + if (pending_event_head->hdlr != NULL) + (*pending_event_head->hdlr)((void *)event->hp_lp_event.xCorrelationToken, event->data.alloc.count); + free_it = 1; + break; + case 6: + free_it = 1; + rsp = (struct vsp_rsp_data *)event->data.vsp_cmd.token; + if (rsp == NULL) { + printk(KERN_ERR "mf.c: no rsp\n"); + break; + } + if (rsp->response != NULL) + memcpy(rsp->response, &event->data.vsp_cmd, + sizeof(event->data.vsp_cmd)); + complete(&rsp->com); + break; } - else - printk(KERN_ERR "mf.c: stack empty for receiving ack\n"); /* remove from queue */ spin_lock_irqsave(&pending_event_spinlock, flags); @@ -618,7 +616,9 @@ { u8 ce[12]; - memcpy(ce, "\x00\x00\x00\x4A\x00\x00\x00\x01\x00\x00\x00\x00", 12); + memset(ce, 0, sizeof(ce)); + ce[3] = 0x4a; + ce[7] = 0x01; ce[8] = word >> 24; ce[9] = word >> 16; ce[10] = word >> 8; @@ -677,232 +677,8 @@ 
signal_ce_msg_simple(0x57, NULL); /* initialization complete */ - printk(KERN_NOTICE "mf.c: iSeries Linux LPAR Machine Facilities initialized\n"); -} - -void mf_setSide(char side) -{ - u64 new_side; - struct vsp_cmd_data vsp_cmd; - - memset(&vsp_cmd, 0, sizeof(vsp_cmd)); - switch (side) { - case 'A': new_side = 0; - break; - case 'B': new_side = 1; - break; - case 'C': new_side = 2; - break; - default: new_side = 3; - break; - } - vsp_cmd.sub_data.ipl_type = new_side; - vsp_cmd.cmd = 10; - - (void)signal_vsp_instruction(&vsp_cmd); -} - -char mf_getSide(void) -{ - char return_value = ' '; - int rc = 0; - struct vsp_cmd_data vsp_cmd; - - memset(&vsp_cmd, 0, sizeof(vsp_cmd)); - vsp_cmd.cmd = 2; - vsp_cmd.sub_data.ipl_type = 0; - mb(); - rc = signal_vsp_instruction(&vsp_cmd); - - if (rc != 0) - return return_value; - - if (vsp_cmd.result_code == 0) { - switch (vsp_cmd.sub_data.ipl_type) { - case 0: return_value = 'A'; - break; - case 1: return_value = 'B'; - break; - case 2: return_value = 'C'; - break; - default: return_value = 'D'; - break; - } - } - return return_value; -} - -void mf_getSrcHistory(char *buffer, int size) -{ -#if 0 - struct IplTypeReturnStuff return_stuff; - struct pending_event *ev = new_pending_event(); - int rc = 0; - char *pages[4]; - - pages[0] = kmalloc(4096, GFP_ATOMIC); - pages[1] = kmalloc(4096, GFP_ATOMIC); - pages[2] = kmalloc(4096, GFP_ATOMIC); - pages[3] = kmalloc(4096, GFP_ATOMIC); - if ((ev == NULL) || (pages[0] == NULL) || (pages[1] == NULL) - || (pages[2] == NULL) || (pages[3] == NULL)) - return -ENOMEM; - - return_stuff.xType = 0; - return_stuff.xRc = 0; - return_stuff.xDone = 0; - ev->event.hp_lp_event.xSubtype = 6; - ev->event.hp_lp_event.x.xSubtypeData = - subtype_data('M', 'F', 'V', 'I'); - ev->event.data.vsp_cmd.xEvent = &return_stuff; - ev->event.data.vsp_cmd.cmd = 4; - ev->event.data.vsp_cmd.lp_index = HvLpConfig_getLpIndex(); - ev->event.data.vsp_cmd.result_code = 0xFF; - ev->event.data.vsp_cmd.reserved = 0; - 
ev->event.data.vsp_cmd.sub_data.page[0] = ISERIES_HV_ADDR(pages[0]); - ev->event.data.vsp_cmd.sub_data.page[1] = ISERIES_HV_ADDR(pages[1]); - ev->event.data.vsp_cmd.sub_data.page[2] = ISERIES_HV_ADDR(pages[2]); - ev->event.data.vsp_cmd.sub_data.page[3] = ISERIES_HV_ADDR(pages[3]); - mb(); - if (signal_event(ev) != 0) - return; - - while (return_stuff.xDone != 1) - udelay(10); - if (return_stuff.xRc == 0) - memcpy(buffer, pages[0], size); - kfree(pages[0]); - kfree(pages[1]); - kfree(pages[2]); - kfree(pages[3]); -#endif -} - -void mf_setCmdLine(const char *cmdline, int size, u64 side) -{ - struct vsp_cmd_data vsp_cmd; - dma_addr_t dma_addr = 0; - char *page = dma_alloc_coherent(iSeries_vio_dev, size, &dma_addr, - GFP_ATOMIC); - - if (page == NULL) { - printk(KERN_ERR "mf.c: couldn't allocate memory to set command line\n"); - return; - } - - copy_from_user(page, cmdline, size); - - memset(&vsp_cmd, 0, sizeof(vsp_cmd)); - vsp_cmd.cmd = 31; - vsp_cmd.sub_data.kern.token = dma_addr; - vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - vsp_cmd.sub_data.kern.side = side; - vsp_cmd.sub_data.kern.length = size; - mb(); - (void)signal_vsp_instruction(&vsp_cmd); - - dma_free_coherent(iSeries_vio_dev, size, page, dma_addr); -} - -int mf_getCmdLine(char *cmdline, int *size, u64 side) -{ - struct vsp_cmd_data vsp_cmd; - int rc; - int len = *size; - dma_addr_t dma_addr; - - dma_addr = dma_map_single(iSeries_vio_dev, cmdline, len, - DMA_FROM_DEVICE); - memset(cmdline, 0, len); - memset(&vsp_cmd, 0, sizeof(vsp_cmd)); - vsp_cmd.cmd = 33; - vsp_cmd.sub_data.kern.token = dma_addr; - vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - vsp_cmd.sub_data.kern.side = side; - vsp_cmd.sub_data.kern.length = len; - mb(); - rc = signal_vsp_instruction(&vsp_cmd); - - if (rc == 0) { - if (vsp_cmd.result_code == 0) - len = vsp_cmd.sub_data.length_out; -#if 0 - else - memcpy(cmdline, "Bad cmdline", 11); -#endif - } - - dma_unmap_single(iSeries_vio_dev, dma_addr, 
*size, DMA_FROM_DEVICE); - - return len; -} - - -int mf_setVmlinuxChunk(const char *buffer, int size, int offset, u64 side) -{ - struct vsp_cmd_data vsp_cmd; - int rc; - dma_addr_t dma_addr = 0; - char *page = dma_alloc_coherent(iSeries_vio_dev, size, &dma_addr, - GFP_ATOMIC); - - if (page == NULL) { - printk(KERN_ERR "mf.c: couldn't allocate memory to set vmlinux chunk\n"); - return -ENOMEM; - } - - copy_from_user(page, buffer, size); - memset(&vsp_cmd, 0, sizeof(vsp_cmd)); - - vsp_cmd.cmd = 30; - vsp_cmd.sub_data.kern.token = dma_addr; - vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - vsp_cmd.sub_data.kern.side = side; - vsp_cmd.sub_data.kern.offset = offset; - vsp_cmd.sub_data.kern.length = size; - mb(); - rc = signal_vsp_instruction(&vsp_cmd); - if (rc == 0) { - if (vsp_cmd.result_code == 0) - rc = 0; - else - rc = -ENOMEM; - } - - dma_free_coherent(iSeries_vio_dev, size, page, dma_addr); - - return rc; -} - -int mf_getVmlinuxChunk(char *buffer, int *size, int offset, u64 side) -{ - struct vsp_cmd_data vsp_cmd; - int rc; - int len = *size; - dma_addr_t dma_addr; - - dma_addr = dma_map_single(iSeries_vio_dev, buffer, len, - DMA_FROM_DEVICE); - memset(buffer, 0, len); - memset(&vsp_cmd, 0, sizeof(vsp_cmd)); - vsp_cmd.cmd = 32; - vsp_cmd.sub_data.kern.token = dma_addr; - vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; - vsp_cmd.sub_data.kern.side = side; - vsp_cmd.sub_data.kern.offset = offset; - vsp_cmd.sub_data.kern.length = len; - mb(); - rc = signal_vsp_instruction(&vsp_cmd); - if (rc == 0) { - if (vsp_cmd.result_code == 0) - *size = vsp_cmd.sub_data.length_out; - else - rc = -ENOMEM; - } - - dma_unmap_single(iSeries_vio_dev, dma_addr, len, DMA_FROM_DEVICE); - - return rc; + printk(KERN_NOTICE "mf.c: iSeries Linux LPAR Machine Facilities " + "initialized\n"); } struct rtc_time_data { @@ -932,68 +708,66 @@ ce_complete.handler = &get_rtc_time_complete; ce_complete.token = &rtc_data; rc = signal_ce_msg_simple(0x40, 
&ce_complete); - if (rc == 0) { - wait_for_completion(&rtc_data.com); - - if (rtc_data.rc == 0) { - if ((rtc_data.ce_msg.ce_msg[2] == 0xa9) || - (rtc_data.ce_msg.ce_msg[2] == 0xaf)) { - /* TOD clock is not set */ - tm->tm_sec = 1; - tm->tm_min = 1; - tm->tm_hour = 1; - tm->tm_mday = 10; - tm->tm_mon = 8; - tm->tm_year = 71; - mf_set_rtc(tm); - } - { - u8 *ce_msg = rtc_data.ce_msg.ce_msg; - u8 year = ce_msg[5]; - u8 sec = ce_msg[6]; - u8 min = ce_msg[7]; - u8 hour = ce_msg[8]; - u8 day = ce_msg[10]; - u8 mon = ce_msg[11]; - - BCD_TO_BIN(sec); - BCD_TO_BIN(min); - BCD_TO_BIN(hour); - BCD_TO_BIN(day); - BCD_TO_BIN(mon); - BCD_TO_BIN(year); - - if (year <= 69) - year += 100; - - tm->tm_sec = sec; - tm->tm_min = min; - tm->tm_hour = hour; - tm->tm_mday = day; - tm->tm_mon = mon; - tm->tm_year = year; - } - } else { - rc = rtc_data.rc; - tm->tm_sec = 0; - tm->tm_min = 0; - tm->tm_hour = 0; - tm->tm_mday = 15; - tm->tm_mon = 5; - tm->tm_year = 52; - - } - tm->tm_wday = 0; - tm->tm_yday = 0; - tm->tm_isdst = 0; + if (rc) + return rc; + wait_for_completion(&rtc_data.com); + tm->tm_wday = 0; + tm->tm_yday = 0; + tm->tm_isdst = 0; + if (rtc_data.rc) { + tm->tm_sec = 0; + tm->tm_min = 0; + tm->tm_hour = 0; + tm->tm_mday = 15; + tm->tm_mon = 5; + tm->tm_year = 52; + return rtc_data.rc; + } + + if ((rtc_data.ce_msg.ce_msg[2] == 0xa9) || + (rtc_data.ce_msg.ce_msg[2] == 0xaf)) { + /* TOD clock is not set */ + tm->tm_sec = 1; + tm->tm_min = 1; + tm->tm_hour = 1; + tm->tm_mday = 10; + tm->tm_mon = 8; + tm->tm_year = 71; + mf_set_rtc(tm); + } + { + u8 *ce_msg = rtc_data.ce_msg.ce_msg; + u8 year = ce_msg[5]; + u8 sec = ce_msg[6]; + u8 min = ce_msg[7]; + u8 hour = ce_msg[8]; + u8 day = ce_msg[10]; + u8 mon = ce_msg[11]; + + BCD_TO_BIN(sec); + BCD_TO_BIN(min); + BCD_TO_BIN(hour); + BCD_TO_BIN(day); + BCD_TO_BIN(mon); + BCD_TO_BIN(year); + + if (year <= 69) + year += 100; + + tm->tm_sec = sec; + tm->tm_min = min; + tm->tm_hour = hour; + tm->tm_mday = day; + tm->tm_mon = mon; + 
tm->tm_year = year; } - return rc; + return 0; } int mf_set_rtc(struct rtc_time *tm) { - char ce_time[12] = "\x00\x00\x00\x41\x00\x00\x00\x00\x00\x00\x00\x00"; + char ce_time[12]; u8 day, mon, hour, min, sec, y1, y2; unsigned year; @@ -1015,6 +789,8 @@ BIN_TO_BCD(y1); BIN_TO_BCD(y2); + memset(ce_time, 0, sizeof(ce_time)); + ce_time[3] = 0x41; ce_time[4] = y1; ce_time[5] = y2; ce_time[6] = sec; @@ -1026,34 +802,96 @@ return signal_ce_msg(ce_time, NULL); } +#ifdef CONFIG_PROC_FS + static int proc_mf_dump_cmdline(char *page, char **start, off_t off, int count, int *eof, void *data) { - int len = count; + int len; char *p; + struct vsp_cmd_data vsp_cmd; + int rc; + dma_addr_t dma_addr; - if (off) { - *eof = 1; + /* The HV appears to return no more than 256 bytes of command line */ + if (off >= 256) return 0; - } - - len = mf_getCmdLine(page, &len, (u64)data); + if ((off + count) > 256) + count = 256 - off; + dma_addr = dma_map_single(iSeries_vio_dev, page, off + count, + DMA_FROM_DEVICE); + if (dma_mapping_error(dma_addr)) + return -ENOMEM; + memset(page, 0, off + count); + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 33; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = (u64)data; + vsp_cmd.sub_data.kern.length = off + count; + mb(); + rc = signal_vsp_instruction(&vsp_cmd); + dma_unmap_single(iSeries_vio_dev, dma_addr, off + count, + DMA_FROM_DEVICE); + if (rc) + return rc; + if (vsp_cmd.result_code != 0) + return -ENOMEM; p = page; - while (len < (count - 1)) { - if (!*p || *p == '\n') + len = 0; + while (len < (off + count)) { + if ((*p == '\0') || (*p == '\n')) { + if (*p == '\0') + *p = '\n'; + p++; + len++; + *eof = 1; break; + } p++; len++; } - *p = '\n'; - p++; - *p = 0; - return p - page; + if (len < off) { + *eof = 1; + len = 0; + } + return len; } #if 0 +static int mf_getVmlinuxChunk(char *buffer, int *size, int offset, u64 side) +{ + struct vsp_cmd_data 
vsp_cmd; + int rc; + int len = *size; + dma_addr_t dma_addr; + + dma_addr = dma_map_single(iSeries_vio_dev, buffer, len, + DMA_FROM_DEVICE); + memset(buffer, 0, len); + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 32; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = side; + vsp_cmd.sub_data.kern.offset = offset; + vsp_cmd.sub_data.kern.length = len; + mb(); + rc = signal_vsp_instruction(&vsp_cmd); + if (rc == 0) { + if (vsp_cmd.result_code == 0) + *size = vsp_cmd.sub_data.length_out; + else + rc = -ENOMEM; + } + + dma_unmap_single(iSeries_vio_dev, dma_addr, len, DMA_FROM_DEVICE); + + return rc; +} + static int proc_mf_dump_vmlinux(char *page, char **start, off_t off, int count, int *eof, void *data) { @@ -1079,7 +917,28 @@ int count, int *eof, void *data) { int len; - char mf_current_side = mf_getSide(); + char mf_current_side = ' '; + struct vsp_cmd_data vsp_cmd; + + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 2; + vsp_cmd.sub_data.ipl_type = 0; + mb(); + + if (signal_vsp_instruction(&vsp_cmd) == 0) { + if (vsp_cmd.result_code == 0) { + switch (vsp_cmd.sub_data.ipl_type) { + case 0: mf_current_side = 'A'; + break; + case 1: mf_current_side = 'B'; + break; + case 2: mf_current_side = 'C'; + break; + default: mf_current_side = 'D'; + break; + } + } + } len = sprintf(page, "%c\n", mf_current_side); @@ -1097,30 +956,92 @@ static int proc_mf_change_side(struct file *file, const char __user *buffer, unsigned long count, void *data) { - char stkbuf[10]; + char side; + u64 newSide; + struct vsp_cmd_data vsp_cmd; if (!capable(CAP_SYS_ADMIN)) return -EACCES; - if (count > (sizeof(stkbuf) - 1)) - count = sizeof(stkbuf) - 1; - if (copy_from_user(stkbuf, buffer, count)) + if (count == 0) + return 0; + + if (get_user(side, buffer)) return -EFAULT; - stkbuf[count] = 0; - if ((*stkbuf != 'A') && (*stkbuf != 'B') && - (*stkbuf != 'C') && (*stkbuf != 'D')) { + + switch 
(side) { + case 'A': newSide = 0; + break; + case 'B': newSide = 1; + break; + case 'C': newSide = 2; + break; + case 'D': newSide = 3; + break; + default: printk(KERN_ERR "mf_proc.c: proc_mf_change_side: invalid side\n"); return -EINVAL; } - mf_setSide(*stkbuf); + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.sub_data.ipl_type = newSide; + vsp_cmd.cmd = 10; + + (void)signal_vsp_instruction(&vsp_cmd); return count; } +#if 0 +static void mf_getSrcHistory(char *buffer, int size) +{ + struct IplTypeReturnStuff return_stuff; + struct pending_event *ev = new_pending_event(); + int rc = 0; + char *pages[4]; + + pages[0] = kmalloc(4096, GFP_ATOMIC); + pages[1] = kmalloc(4096, GFP_ATOMIC); + pages[2] = kmalloc(4096, GFP_ATOMIC); + pages[3] = kmalloc(4096, GFP_ATOMIC); + if ((ev == NULL) || (pages[0] == NULL) || (pages[1] == NULL) + || (pages[2] == NULL) || (pages[3] == NULL)) + return -ENOMEM; + + return_stuff.xType = 0; + return_stuff.xRc = 0; + return_stuff.xDone = 0; + ev->event.hp_lp_event.xSubtype = 6; + ev->event.hp_lp_event.x.xSubtypeData = + subtype_data('M', 'F', 'V', 'I'); + ev->event.data.vsp_cmd.xEvent = &return_stuff; + ev->event.data.vsp_cmd.cmd = 4; + ev->event.data.vsp_cmd.lp_index = HvLpConfig_getLpIndex(); + ev->event.data.vsp_cmd.result_code = 0xFF; + ev->event.data.vsp_cmd.reserved = 0; + ev->event.data.vsp_cmd.sub_data.page[0] = ISERIES_HV_ADDR(pages[0]); + ev->event.data.vsp_cmd.sub_data.page[1] = ISERIES_HV_ADDR(pages[1]); + ev->event.data.vsp_cmd.sub_data.page[2] = ISERIES_HV_ADDR(pages[2]); + ev->event.data.vsp_cmd.sub_data.page[3] = ISERIES_HV_ADDR(pages[3]); + mb(); + if (signal_event(ev) != 0) + return; + + while (return_stuff.xDone != 1) + udelay(10); + if (return_stuff.xRc == 0) + memcpy(buffer, pages[0], size); + kfree(pages[0]); + kfree(pages[1]); + kfree(pages[2]); + kfree(pages[3]); +} +#endif + static int proc_mf_dump_src(char *page, char **start, off_t off, int count, int *eof, void *data) { +#if 0 int len; mf_getSrcHistory(page, 
count); @@ -1134,6 +1055,9 @@ len = count; *start = page + off; return len; +#else + return 0; +#endif } static int proc_mf_change_src(struct file *file, const char __user *buffer, @@ -1162,34 +1086,91 @@ return count; } -static int proc_mf_change_cmdline(struct file *file, const char *buffer, +static int proc_mf_change_cmdline(struct file *file, const char __user *buffer, unsigned long count, void *data) { + struct vsp_cmd_data vsp_cmd; + dma_addr_t dma_addr; + char *page; + int ret = -EACCES; + if (!capable(CAP_SYS_ADMIN)) - return -EACCES; + goto out; - mf_setCmdLine(buffer, count, (u64)data); + dma_addr = 0; + page = dma_alloc_coherent(iSeries_vio_dev, count, &dma_addr, + GFP_ATOMIC); + ret = -ENOMEM; + if (page == NULL) + goto out; + + ret = -EFAULT; + if (copy_from_user(page, buffer, count)) + goto out_free; - return count; + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 31; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = (u64)data; + vsp_cmd.sub_data.kern.length = count; + mb(); + (void)signal_vsp_instruction(&vsp_cmd); + ret = count; + +out_free: + dma_free_coherent(iSeries_vio_dev, count, page, dma_addr); +out: + return ret; } static ssize_t proc_mf_change_vmlinux(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { - struct inode * inode = file->f_dentry->d_inode; - struct proc_dir_entry * dp = PDE(inode); - int rc; + struct proc_dir_entry *dp = PDE(file->f_dentry->d_inode); + ssize_t rc; + dma_addr_t dma_addr; + char *page; + struct vsp_cmd_data vsp_cmd; + + rc = -EACCES; if (!capable(CAP_SYS_ADMIN)) - return -EACCES; + goto out; - rc = mf_setVmlinuxChunk(buf, count, *ppos, (u64)dp->data); - if (rc < 0) - return rc; + dma_addr = 0; + page = dma_alloc_coherent(iSeries_vio_dev, count, &dma_addr, + GFP_ATOMIC); + rc = -ENOMEM; + if (page == NULL) { + printk(KERN_ERR "mf.c: couldn't allocate memory to set vmlinux chunk\n"); + goto out; + 
} + rc = -EFAULT; + if (copy_from_user(page, buf, count)) + goto out_free; - *ppos += count; + memset(&vsp_cmd, 0, sizeof(vsp_cmd)); + vsp_cmd.cmd = 30; + vsp_cmd.sub_data.kern.token = dma_addr; + vsp_cmd.sub_data.kern.address_type = HvLpDma_AddressType_TceIndex; + vsp_cmd.sub_data.kern.side = (u64)dp->data; + vsp_cmd.sub_data.kern.offset = *ppos; + vsp_cmd.sub_data.kern.length = count; + mb(); + rc = signal_vsp_instruction(&vsp_cmd); + if (rc) + goto out_free; + rc = -ENOMEM; + if (vsp_cmd.result_code != 0) + goto out_free; - return count; + *ppos += count; + rc = count; +out_free: + dma_free_coherent(iSeries_vio_dev, count, page, dma_addr); +out: + return rc; } static struct file_operations proc_vmlinux_operations = { @@ -1254,3 +1235,5 @@ } __initcall(mf_proc_init); + +#endif /* CONFIG_PROC_FS */ diff -ruN linus-bk-mf.0.9/include/asm-ppc64/iSeries/mf.h linus-bk-mf.1/include/asm-ppc64/iSeries/mf.h --- linus-bk-mf.0.9/include/asm-ppc64/iSeries/mf.h 2004-11-09 17:27:54.000000000 +1100 +++ linus-bk-mf.1/include/asm-ppc64/iSeries/mf.h 2004-11-09 17:31:44.000000000 +1100 @@ -24,13 +24,13 @@ * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#ifndef MF_H_INCLUDED -#define MF_H_INCLUDED +#ifndef _ASM_PPC64_ISERIES_MF_H +#define _ASM_PPC64_ISERIES_MF_H -#include +#include #include -#include +#include struct rtc_time; @@ -51,19 +51,7 @@ extern void mf_init(void); -extern void mf_setSide(char side); -extern char mf_getSide(void); - -extern void mf_setCmdLine(const char *cmdline, int size, u64 side); -extern int mf_getCmdLine(char *cmdline, int *size, u64 side); - -extern void mf_getSrcHistory(char *buffer, int size); - -extern int mf_setVmlinuxChunk(const char *buffer, int size, int offset, - u64 side); -extern int mf_getVmlinuxChunk(char *buffer, int *size, int offset, u64 side); - extern int mf_get_rtc(struct rtc_time *tm); extern int mf_set_rtc(struct rtc_time *tm); -#endif /* 
MF_H_INCLUDED */ +#endif /* _ASM_PPC64_ISERIES_MF_H */ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041109/6bda4030/attachment.pgp From benh at kernel.crashing.org Tue Nov 9 19:01:49 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 09 Nov 2004 19:01:49 +1100 Subject: [PATCH] ppc64: Add HW CPU timebase sync Message-ID: <1099987309.10262.227.camel@gaston> Hi ! This patch which requires "ppc64: Fix G5 low level i2c code" to be applied first, implements support for doing a HW synchronization of the CPU timebases on SMP G5 machines. When the proper clock chips are found on i2c, they are used to stop the timebase clock source during the synchronization process. This replace the software sync algorithm we used so far and provide slightly more precise results. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/ppc64/kernel/pmac_smp.c =================================================================== --- linux-work.orig/arch/ppc64/kernel/pmac_smp.c 2004-10-25 21:58:12.000000000 +1000 +++ linux-work/arch/ppc64/kernel/pmac_smp.c 2004-11-09 18:54:46.658302632 +1100 @@ -31,7 +31,6 @@ #include #include #include -#include #include #include #include @@ -51,6 +50,7 @@ #include #include #include +#include #include "mpic.h" @@ -66,31 +66,164 @@ extern struct smp_ops_t *smp_ops; +static void (*pmac_tb_freeze)(int freeze); +static struct device_node *pmac_tb_clock_chip_host; +static spinlock_t timebase_lock = SPIN_LOCK_UNLOCKED; +static unsigned long timebase; + +static void smp_core99_cypress_tb_freeze(int freeze) +{ + u8 data; + int rc; + + /* Strangely, the device-tree says address is 0xd2, but darwin + * accesses 0xd0 ... 
+ */ + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_combined); + rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, + 0xd0 | pmac_low_i2c_read, + 0x81, &data, 1); + if (rc != 0) + goto bail; + + data = (data & 0xf3) | (freeze ? 0x00 : 0x0c); + + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_stdsub); + rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, + 0xd0 | pmac_low_i2c_write, + 0x81, &data, 1); + + bail: + if (rc != 0) { + printk("Cypress Timebase %s rc: %d\n", + freeze ? "freeze" : "unfreeze", rc); + panic("Timebase freeze failed !\n"); + } +} + +static void smp_core99_pulsar_tb_freeze(int freeze) +{ + u8 data; + int rc; + + /* Strangely, the device-tree says address is 0xd2, but darwin + * accesses 0xd0 ... + */ + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_combined); + rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, + 0xd4 | pmac_low_i2c_read, + 0x2e, &data, 1); + if (rc != 0) + goto bail; + + data = (data & 0x88) | (freeze ? 0x11 : 0x22); + + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_stdsub); + rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, + 0xd4 | pmac_low_i2c_write, + 0x2e, &data, 1); + bail: + if (rc != 0) { + printk(KERN_ERR "Pulsar Timebase %s rc: %d\n", + freeze ? 
"freeze" : "unfreeze", rc); + panic("Timebase freeze failed !\n"); + } +} + + +static void smp_core99_give_timebase(void) +{ + /* Open i2c bus for synchronous access */ + if (pmac_low_i2c_open(pmac_tb_clock_chip_host, 0)) + panic("Can't open i2c for TB sync !\n"); + + spin_lock(&timebase_lock); + (*pmac_tb_freeze)(1); + mb(); + timebase = get_tb(); + spin_unlock(&timebase_lock); + + while (timebase) + barrier(); + + spin_lock(&timebase_lock); + (*pmac_tb_freeze)(0); + spin_unlock(&timebase_lock); + + /* Close i2c bus */ + pmac_low_i2c_close(pmac_tb_clock_chip_host); +} + + +static void __devinit smp_core99_take_timebase(void) +{ + while (!timebase) + barrier(); + spin_lock(&timebase_lock); + set_tb(timebase >> 32, timebase & 0xffffffff); + timebase = 0; + spin_unlock(&timebase_lock); +} + + static int __init smp_core99_probe(void) { - struct device_node *cpus; - int ncpus = 1; + struct device_node *cpus; + struct device_node *cc; + int ncpus = 0; /* Maybe use systemconfiguration here ? */ if (ppc_md.progress) ppc_md.progress("smp_core99_probe", 0x345); - cpus = find_type_devices("cpu"); - if (cpus == NULL) - return 0; - while ((cpus = cpus->next) != NULL) + /* Count CPUs in the device-tree */ + for (cpus = NULL; (cpus = of_find_node_by_type(cpus, "cpu")) != NULL;) ++ncpus; printk(KERN_INFO "PowerMac SMP probe found %d cpus\n", ncpus); - if (ncpus > 1) - mpic_request_ipis(); + /* Nothing more to do if less than 2 of them */ + if (ncpus <= 1) + return 1; + + /* Look for the clock chip */ + for (cc = NULL; (cc = of_find_node_by_name(cc, "i2c-hwclock")) != NULL;) { + struct device_node *p = of_get_parent(cc); + u32 *reg; + int ok; + ok = p && device_is_compatible(p, "uni-n-i2c"); + if (!ok) + goto next; + reg = (u32 *)get_property(cc, "reg", NULL); + if (reg == NULL) + goto next; + switch (*reg) { + case 0xd2: + pmac_tb_freeze = smp_core99_cypress_tb_freeze; + printk(KERN_INFO "Timebase clock is Cypress chip\n"); + break; + case 0xd4: + pmac_tb_freeze = 
smp_core99_pulsar_tb_freeze; + printk(KERN_INFO "Timebase clock is Pulsar chip\n"); + break; + } + if (pmac_tb_freeze != NULL) { + pmac_tb_clock_chip_host = p; + smp_ops->give_timebase = smp_core99_give_timebase; + smp_ops->take_timebase = smp_core99_take_timebase; + break; + } + next: + of_node_put(p); + } + + mpic_request_ipis(); return ncpus; } static void __init smp_core99_kick_cpu(int nr) { - int save_vector; + int save_vector, j; unsigned long new_vector; unsigned long flags; volatile unsigned int *vector @@ -135,7 +268,8 @@ * ideally, all that crap will be done in prom.c and the CPU left * in a RAM-based wait loop like CHRP. */ - mdelay(1); + for (j = 1; j < 1000000; j++) + mb(); /* Restore our exception vector */ *vector = save_vector; From scheel at vnet.ibm.com Wed Nov 10 07:41:29 2004 From: scheel at vnet.ibm.com (Jeff Scheel) Date: Tue, 9 Nov 2004 14:41:29 -0600 Subject: Fw: [PATCH] PURR data on iSeries Linux Message-ID: <003a01c4c69c$7d69a590$5f560a09@rchland.ibm.com> From: > P.S. Sorry I had to use my personal email address but I > can't get the patch out without losing tabs. I think this > works. :-) Shoot, the patch was truncated by my other mailer...I'll repost as soon as I dump my current mailer. Sorry. Comments still accepted. From jscheel at magnaspeed.net Wed Nov 10 07:14:50 2004 From: jscheel at magnaspeed.net (jscheel at magnaspeed.net) Date: Tue, 09 Nov 2004 15:14:50 -0500 Subject: [PATCH] PURR data on iSeries Linux Message-ID: With the addtion of the PURR for Power5 systems, applications have begun being built to utilize this value. One application in particular is looking for equivalent information for Linux running on legacy iSeries systems. The data necessary to report this data is available from the hypervisor. It simply has to be retrieved and reported by /proc/ppc64/lparcfg interface on iSeries. For those interested in testing this patch, retrieve the purr data at two defined intervals and calculate the difference. 
Then, multiply the delta by 100, divide by the number of seconds in your interval, divide by the "timebase" value from "/proc/cpuinfo", and divide again by the number of processors. This will provide the physical cpu utilization ranging from 1 to 100. For shared processor configurations, this will depend on workload with the actual value somewhere between 1 and 100. For dedicated processors, this number should always be 100 as the operating system gets all of the physical processor capacity. This patch to arch/ppc64/lparcfg.c reports this data. It has been tested on legacy iSeries systems and is not dependent on the Power5 PURR implementation. Please consider it for inclusion in the arch/ppc64 tree. Additionally, I would appreciate any and all comments. Thanks, -Jeff P.S. Sorry I had to use my personal email address but I can't get the patch out without losing tabs. I think this works. :-) Signed-off-by: Jeff Scheel --- linuxppc-2.6.9_rc1.orig/arch/ppc64/kernel/lparcfg.c 2004-11-09 07:03:43.354383000 -0600 +++ linuxppc-2.6.9_rc1/arch/ppc64/kernel/lparcfg.c 2004-11-09 10:40:36.375934020 -0600 @@ -34,7 +34,7 @@ #include #include -#define MODULE_VERS "1.4" +#define MODULE_VERS "1.5" #define MODULE_NAME "lparcfg" /* #define LPARCFG_DEBUG */ @@ -70,6 +70,30 @@ #ifdef CONFIG_PPC_ISERIES +static unsigned long get_purr(void); + +/* + * For iSeries legacy systems, the PPA purr function is available from the + * xEmulatedTimeBase field in the paca. 
+ */ +static unsigned long get_purr() +{ + unsigned long sum_purr=0; + int cpu; + struct paca_struct *lpaca; + + for_each_online_cpu(cpu) { + lpaca = paca + cpu; + sum_purr += lpaca->xLpPaca.xEmulatedTimeBase; + +#ifdef PURR_DEBUG + printk(KERN_INFO "get_purr for cpu (%x) has value (%lx) \n", + cpu,lpaca->xLpPaca.xEmulatedTimeBase); +#endif + } + return sum_purr; +} + #define lparcfg_write NULL /* @@ -81,6 +105,7 @@ int shared, entitled_capacity, max_entitled_capacity; int processors, max_processors; struct paca_struct *lpaca = get_paca(); + unsigned long purr = get_purr(); shared = (int)(lpaca->lppaca_ptr->xSharedProc); seq_printf(m, "serial_number=%c%c%c%c%c%c%c\n", @@ -131,6 +156,7 @@ seq_printf(m, "pool_capacity=%d\n", (int)(HvLpConfig_getNumProcsInSharedPool(pool_id) * 100)); + seq_printf(m, "purr=%ld\n", purr); } seq_printf(m, "shared_processor_mode=%d\n", shared); From scheel at vnet.ibm.com Wed Nov 10 08:27:01 2004 From: scheel at vnet.ibm.com (Jeff Scheel) Date: Tue, 09 Nov 2004 15:27:01 -0600 Subject: [Fwd: Fw: [PATCH] PURR data on iSeries Linux] Message-ID: <41913625.2020101@vnet.ibm.com> > Shoot, the patch was truncated by my other mailer...I'll repost as soon as I > dump my current mailer. Sorry. Let's try this again. -Jeff --- linuxppc-2.6.9_rc1.orig/arch/ppc64/kernel/lparcfg.c 2004-11-09 07:03:43.354383000 -0600 +++ linuxppc-2.6.9_rc1/arch/ppc64/kernel/lparcfg.c 2004-11-09 10:40:36.375934020 -0600 @@ -34,7 +34,7 @@ #include #include -#define MODULE_VERS "1.4" +#define MODULE_VERS "1.5" #define MODULE_NAME "lparcfg" /* #define LPARCFG_DEBUG */ @@ -70,6 +70,30 @@ #ifdef CONFIG_PPC_ISERIES +static unsigned long get_purr(void); + +/* + * For iSeries legacy systems, the PPA purr function is available from the + * xEmulatedTimeBase field in the paca. 
+ */ +static unsigned long get_purr() +{ + unsigned long sum_purr=0; + int cpu; + struct paca_struct *lpaca; + + for_each_online_cpu(cpu) { + lpaca = paca + cpu; + sum_purr += lpaca->xLpPaca.xEmulatedTimeBase; + +#ifdef PURR_DEBUG + printk(KERN_INFO "get_purr for cpu (%x) has value (%lx) \n", + cpu,lpaca->xLpPaca.xEmulatedTimeBase); +#endif + } + return sum_purr; +} + #define lparcfg_write NULL /* @@ -81,6 +105,7 @@ int shared, entitled_capacity, max_entitled_capacity; int processors, max_processors; struct paca_struct *lpaca = get_paca(); + unsigned long purr = get_purr(); shared = (int)(lpaca->lppaca_ptr->xSharedProc); seq_printf(m, "serial_number=%c%c%c%c%c%c%c\n", @@ -131,6 +156,7 @@ seq_printf(m, "pool_capacity=%d\n", (int)(HvLpConfig_getNumProcsInSharedPool(pool_id) * 100)); + seq_printf(m, "purr=%ld\n", purr); } seq_printf(m, "shared_processor_mode=%d\n", shared); From dwm at austin.ibm.com Wed Nov 10 10:03:13 2004 From: dwm at austin.ibm.com (Doug Maxey) Date: Tue, 09 Nov 2004 17:03:13 -0600 Subject: [PATCH] PPC64 Poor assembly coding style In-Reply-To: <20041108234252.GA12837@bubble.modra.org> Message-ID: <200411092303.iA9N3DwB019266@falcon10.austin.ibm.com> On Tue, 09 Nov 2004 10:12:52 +1030, Alan Modra wrote: >On Mon, Nov 08, 2004 at 12:16:19PM -0600, Linas Vepstas wrote: >> The new assembler seems to be mistaking NR_syscalls for a register >> number, which is clearly out of bounds (its not in 0..31). > >Testcase? I tripped over this while trying to build both rh and suse distro kernels on one machine. Seemed easier to just have the same compiler level for both, the rh rpm version is 3.4.2, so I got real brave and built from scratch. 8) compile arch/ppc64/kernel/entry.S with gcc_3_4_2_release. Binutils is binutils-2.15.92.0.2 This particular history is from the, ahem, distro sources. Different error in the mainline sources. Is there a set of patches to get this version of the toolchain up to snuff? 
++doug COMMAND=={make} ARGS={O=/build/dwm/build/s9-sp1-1103.edit/ppc64 zImage} STARTED Tue Nov 9 16:21:15 2004 ON io-browns cmd=={make O=/build/dwm/build/s9-sp1-1103.edit/ppc64 zImage} Using /build/dwm/linux/s9-sp1-1103.edit as source for kernel CHK include/linux/version.h HOSTCC scripts/basic/fixdep HOSTCC scripts/basic/split-include HOSTCC scripts/basic/docproc HOSTCC scripts/genksyms/genksyms.o HOSTCC scripts/genksyms/lex.o HOSTCC scripts/genksyms/parse.o HOSTLD scripts/genksyms/genksyms HOSTCC scripts/conmakehash HOSTCC scripts/kallsyms CC scripts/empty.o HOSTCC scripts/mk_elfconfig MKELF scripts/elfconfig.h HOSTCC scripts/file2alias.o HOSTCC scripts/modpost.o HOSTCC scripts/sumversion.o HOSTLD scripts/modpost HOSTCC scripts/pnmtologo HOSTCC scripts/bin2c CC arch/ppc64/kernel/asm-offsets.s CHK include/asm-ppc64/offsets.h CC init/main.o CHK include/linux/compile.h UPD include/linux/compile.h CC init/version.o CC init/do_mounts.o CC init/do_mounts_rd.o /build/dwm/linux/s9-sp1-1103.edit/init/do_mounts_rd.c:309: warning: conflicting types for built-in function 'malloc' CC init/do_mounts_initrd.o CC init/do_mounts_md.o LD init/mounts.o CC init/initramfs.o LD init/built-in.o CC init/kerntypes.o HOSTCC usr/gen_init_cpio CPIO usr/initramfs_data.cpio GZIP usr/initramfs_data.cpio.gz AS usr/initramfs_data.o LD usr/built-in.o CC arch/ppc64/kernel/setup.o /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/setup.c: In function `set_preferred_console': /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/setup.c:477: warning: 'offset' might be used uninitialized in this function AS arch/ppc64/kernel/entry.o /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S: Assembler messages: /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:100: Error: operand out of range (268 is not between 0 and 31) /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:100: Error: missing operand /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:164: Error: operand out 
of range (268 is not between 0 and 31) /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:164: Error: missing operand /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:239: Error: operand out of range (3 is not between 0 and 1) /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:239: Error: missing operand /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:242: Error: operand out of range (3 is not between 0 and 1) /build/dwm/linux/s9-sp1-1103.edit/arch/ppc64/kernel/entry.S:242: Error: missing operand make[2]: *** [arch/ppc64/kernel/entry.o] Error 1 make[1]: *** [arch/ppc64/kernel] Error 2 make: *** [zImage] Error 2 CC { Reading specs from /opt/gcc-3.4.2/lib/gcc/powerpc64-unknown-linux-gnu/3.4.2/specs Configured with: /build/dwm/toolchain/gcc-3.4.2/configure --prefix=/opt/gcc-3.4.2 --disable-multilib --with-ld=/opt/binutils-2.15.92.0.2/bin/ld --with-as=/opt/binutils-2.15.92.0.2/bin/as --enable-languages=c,c++,f77 --enable-altivec Thread model: posix gcc version 3.4.2 } UNAME Linux io-browns 2.6.5-7.97-pseries64 #1 SMP Fri Jul 2 14:21:59 UTC 2004 ppc64 ppc64 ppc64 GNU/Linux uid=1001(dwm) gid=100(users) groups=10(wheel),14(uucp),16(dialout),17(audio),33(video),100(users) COMPLETE at Tue Nov 9 16:21:29 2004 RETURN from {make O=/build/dwm/build/s9-sp1-1103.edit/ppc64 zImage} is 2 ELAPSED time 0:00:14 source lines: 98 andi. r11,r10,_TIF_SYSCALL_TRACE 99 bne- 50f 100 cmpli 0,r0,NR_syscalls 101 bge- 66f 102 /* ... 162 /* XXX check this - Anton */ 163 ld r9,GPR9(r1) 164 cmpli 0,r0,NR_syscalls 165 bge- 66f 166 /* 167 * Need to vector to 32 Bit or default sys_call_table ... 237 andi. 
r4,r4,_TIF_SYSCALL_TRACE
238 	bne-	81f
239 	cmpi	0,r3,0
240 	bge	.ret_from_except
241 	b	.ret_from_syscall_1
242 81:	cmpi	0,r3,0
243 	blt	.ret_from_syscall_2
244 	bl	.do_syscall_trace

From schwab at suse.de Wed Nov 10 10:09:07 2004
From: schwab at suse.de (Andreas Schwab)
Date: Wed, 10 Nov 2004 00:09:07 +0100
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To: <200411092303.iA9N3DwB019266@falcon10.austin.ibm.com> (Doug
	Maxey's message of "Tue, 09 Nov 2004 17:03:13 -0600")
References: <200411092303.iA9N3DwB019266@falcon10.austin.ibm.com>
Message-ID:

Doug Maxey writes:

> compile arch/ppc64/kernel/entry.S with gcc_3_4_2_release. Binutils is
> binutils-2.15.92.0.2

See .

Andreas.

--
Andreas Schwab, SuSE Labs, schwab at suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

From dwm at austin.ibm.com Wed Nov 10 10:22:05 2004
From: dwm at austin.ibm.com (Doug Maxey)
Date: Tue, 09 Nov 2004 17:22:05 -0600
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To:
Message-ID: <200411092322.iA9NM5W0019417@falcon10.austin.ibm.com>

On Wed, 10 Nov 2004 00:09:07 +0100, Andreas Schwab wrote:
>Doug Maxey writes:
>
>> compile arch/ppc64/kernel/entry.S with gcc_3_4_2_release. Binutils is
>> binutils-2.15.92.0.2
>
>See .
>

Ok. Pulled binutils off kernel.org. Is there an 'official' site? or better
yet, cvs repos and tag?
++doug

From amodra at bigpond.net.au Wed Nov 10 10:22:10 2004
From: amodra at bigpond.net.au (Alan Modra)
Date: Wed, 10 Nov 2004 09:52:10 +1030
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To: <200411092303.iA9N3DwB019266@falcon10.austin.ibm.com>
References: <20041108234252.GA12837@bubble.modra.org>
	<200411092303.iA9N3DwB019266@falcon10.austin.ibm.com>
Message-ID: <20041109232210.GA19474@bubble.modra.org>

On Tue, Nov 09, 2004 at 05:03:13PM -0600, Doug Maxey wrote:
> entry.S:100: Error: operand out of range (268 is not between 0 and 31)
> entry.S:100: Error: missing operand
[snip]
> 100 	cmpli	0,r0,NR_syscalls

OK, this one is Segher's patch, now reverted, making the L operand on
cmpli non-optional.

Any of "cmpli cr,l,r,imm", "cmplwi cr,r,imm", or "cmpldi cr,r,imm" will
be accepted by the assembler.

--
Alan Modra
IBM OzLabs - Linux Technology Centre

From amodra at bigpond.net.au Wed Nov 10 10:34:32 2004
From: amodra at bigpond.net.au (Alan Modra)
Date: Wed, 10 Nov 2004 10:04:32 +1030
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To: <200411092322.iA9NM5W0019417@falcon10.austin.ibm.com>
References: <200411092322.iA9NM5W0019417@falcon10.austin.ibm.com>
Message-ID: <20041109233432.GB19474@bubble.modra.org>

On Tue, Nov 09, 2004 at 05:22:05PM -0600, Doug Maxey wrote:
> Ok. Pulled binutils off kernel.org. Is there an 'official' site? or better
> yet, cvs repos and tag?
See http://sources.redhat.com/binutils/

--
Alan Modra
IBM OzLabs - Linux Technology Centre

From benh at kernel.crashing.org Wed Nov 10 10:56:56 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 10 Nov 2004 10:56:56 +1100
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To: <20041109232210.GA19474@bubble.modra.org>
References: <20041108234252.GA12837@bubble.modra.org>
	<200411092303.iA9N3DwB019266@falcon10.austin.ibm.com>
	<20041109232210.GA19474@bubble.modra.org>
Message-ID: <1100044616.3884.246.camel@gaston>

On Wed, 2004-11-10 at 09:52 +1030, Alan Modra wrote:
> On Tue, Nov 09, 2004 at 05:03:13PM -0600, Doug Maxey wrote:
> > entry.S:100: Error: operand out of range (268 is not between 0 and 31)
> > entry.S:100: Error: missing operand
> [snip]
> > 100 	cmpli	0,r0,NR_syscalls
>
> OK, this one is Segher's patch, now reverted, making the L operand on
> cmpli non-optional.
>
> Any of "cmpli cr,l,r,imm", "cmplwi cr,r,imm", or "cmpldi cr,r,imm" will
> be accepted by the assembler.

Yah, Segher seems to have a deep love for the idea of pushing gas changes
that break pretty much everything out there ;)

Ben.

From dwm at austin.ibm.com Wed Nov 10 12:27:37 2004
From: dwm at austin.ibm.com (Doug Maxey)
Date: Tue, 09 Nov 2004 19:27:37 -0600
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To: <20041109233432.GB19474@bubble.modra.org>
Message-ID: <200411100127.iAA1RbM9020068@falcon10.austin.ibm.com>

On Wed, 10 Nov 2004 10:04:32 +1030, Alan Modra wrote:
>On Tue, Nov 09, 2004 at 05:22:05PM -0600, Doug Maxey wrote:
>> Ok. Pulled binutils off kernel.org. Is there an 'official' site? or better
>> yet, cvs repos and tag?
>
>See http://sources.redhat.com/binutils/

Cool, fixed it.
But now have another issue, will start another thread: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors ++doug From dwm at austin.ibm.com Wed Nov 10 12:40:32 2004 From: dwm at austin.ibm.com (Doug Maxey) Date: Tue, 09 Nov 2004 19:40:32 -0600 Subject: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors Message-ID: <200411100140.iAA1eWu1020118@falcon10.austin.ibm.com> Same compiler (3.4.2) with updated binutils, but this time with mainline kernel, gets link errors on final kernel zImage. 'make vmlinux' does complete _without_ errors. Have I created a monster by missing some flags? BTW, ld apparently does not return error, so the make just blasts on by. ++doug cmd=={make -j4 O=/build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64 zImage} Using /build/dwm/linux/lk-2.6.10-rc1-mm4.edit as source for kernel CHK include/linux/version.h GEN /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64/Makefile SYMLINK include/asm -> include/asm-ppc64 UPD include/linux/version.h ... STRIP vmlinux.strip BOOTAS arch/ppc64/boot/crt0.o BOOTAS arch/ppc64/boot/string.o BOOTCC arch/ppc64/boot/prom.o BOOTCC arch/ppc64/boot/main.o BOOTCC arch/ppc64/boot/zlib.o BOOTAS arch/ppc64/boot/div64.o HOSTCC arch/ppc64/boot/addnote GZIP arch/ppc64/boot/kernel-vmlinux.strip.gz Generating arch/ppc64/boot/imagesize.c ls -l vmlinux.strip | \ awk '{printf "/* generated -- do not edit! 
*/\n" \ "unsigned long vmlinux_filesize = %d;\n", $5}' > arch/ppc64/boot/imagesize.c nm -n vmlinux | tail -n 1 | \ awk '{printf "unsigned long vmlinux_memsize = 0x%s;\n", substr($1,8)}' \ >> arch/ppc64/boot/imagesize.c BOOTCC arch/ppc64/boot/imagesize.o touch arch/ppc64/boot/kernel-vmlinux.strip.c BOOTCC arch/ppc64/boot/kernel-vmlinux.strip.o objcopy arch/ppc64/boot/kernel-vmlinux.strip.o --add-section=.kernel:vmlinux.strip=arch/ppc64/boot/kernel-vmlinux.strip.gz --set-section-flags=.kernel:vmlinux.strip=contents,alloc,load,readonly,data ADDNOTE arch/ppc64/boot/zImage ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/crt0.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/string.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/prom.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/main.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/zlib.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/imagesize.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/div64.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/kernel-vmlinux.strip.o' is incompatible with powerpc:common output arch/ppc64/boot/prom.o(.text+0x3d0): In function `.fputs': : undefined reference to `.strlen' arch/ppc64/boot/main.o(.text+0x5c0): In function `.start': : undefined reference to `.memmove' arch/ppc64/boot/main.o(.text+0x688): In function `.start': : undefined reference to `.flush_cache' arch/ppc64/boot/main.o(.text+0x7b0): In function `.start': : undefined reference to `.memmove' 
arch/ppc64/boot/zlib.o(.text+0x934): In function `.inflateIncomp': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0x9c4): In function `.inflateIncomp': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0xb38): In function `.inflate_flush': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0xbc0): In function `.inflate_flush': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0xc64): In function `.inflate_flush': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0x1e30): more undefined references to `.memcpy' follow From amodra at bigpond.net.au Wed Nov 10 12:54:26 2004 From: amodra at bigpond.net.au (Alan Modra) Date: Wed, 10 Nov 2004 12:24:26 +1030 Subject: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors In-Reply-To: <200411100140.iAA1eWu1020118@falcon10.austin.ibm.com> References: <200411100140.iAA1eWu1020118@falcon10.austin.ibm.com> Message-ID: <20041110015426.GC19474@bubble.modra.org> On Tue, Nov 09, 2004 at 07:40:32PM -0600, Doug Maxey wrote: > Have I created a monster by missing some flags? Probably a wrong BOOTCC compiler or something like that. I can't tell because of the idiotic kernel practice of hiding actual commands run by make. If you use make V=1 it will likely become obvious. -- Alan Modra IBM OzLabs - Linux Technology Centre From sfr at canb.auug.org.au Wed Nov 10 14:38:52 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 10 Nov 2004 14:38:52 +1100 Subject: [Fwd: Fw: [PATCH] PURR data on iSeries Linux] In-Reply-To: <41913625.2020101@vnet.ibm.com> References: <41913625.2020101@vnet.ibm.com> Message-ID: <20041110143852.661b465f.sfr@canb.auug.org.au> Hi Jeff, Looks good in general. I cannot, of course, comment on the semantics due to lack of documentation of the HV interfaces :-) Just a couple of comments: On Tue, 09 Nov 2004 15:27:01 -0600 Jeff Scheel wrote: > > #ifdef CONFIG_PPC_ISERIES > > +static unsigned long get_purr(void); This prototype is unnecessary. 
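To illustrate why the forward declaration can go: in C, a static function whose definition appears before its first use needs no separate prototype, as long as the definition itself spells out "(void)". Below is a minimal userland sketch of that pattern. The emulated_timebase[] array is a hypothetical stand-in for the per-CPU paca data, and the names sum_emulated_timebase() and report_purr() are illustrative only; none of them come from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-CPU xEmulatedTimeBase values.
 * The real code reads these from the paca, which has no userland
 * equivalent, so fixed sample values are used here.
 */
static const unsigned long emulated_timebase[] = { 10, 20, 30 };

/*
 * Defined before its first use, so no forward declaration is needed.
 * "(void)" states explicitly that the function takes no arguments;
 * an empty "()" in a C definition leaves the parameters unspecified.
 */
static unsigned long sum_emulated_timebase(void)
{
	unsigned long sum = 0;
	size_t cpu;
	size_t ncpus = sizeof(emulated_timebase) / sizeof(emulated_timebase[0]);

	for (cpu = 0; cpu < ncpus; cpu++)
		sum += emulated_timebase[cpu];

	return sum;
}

/* The first caller comes after the definition above. */
static unsigned long report_purr(void)
{
	return sum_emulated_timebase();
}
```

A forward declaration would only become necessary if a caller were placed above the definition in the file.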
> +
> +/*
> + * For iSeries legacy systems, the PPA purr function is available from the
> + * xEmulatedTimeBase field in the paca.
> + */
> +static unsigned long get_purr()

Need "void" in parameter list.

> +{
> +	unsigned long sum_purr=0;

Use spaces around "=".

> +	int cpu;
> +	struct paca_struct *lpaca;
> +
> +	for_each_online_cpu(cpu) {

Rusty suggests using for_each_cpu in all circumstances when collecting
stats. If the machine is not capable of cpu hotplug, there is no penalty.

> +		lpaca = paca + cpu;
> +		sum_purr += lpaca->xLpPaca.xEmulatedTimeBase;
> +
> +#ifdef PURR_DEBUG
> +		printk(KERN_INFO "get_purr for cpu (%x) has value (%lx) \n",

Maybe decimal would be more meaningful for these values than hex.

> +			cpu,lpaca->xLpPaca.xEmulatedTimeBase);

Spaces after commas.

You caught me in "picky" mode :-)

--
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From benh at kernel.crashing.org Wed Nov 10 15:09:08 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 10 Nov 2004 15:09:08 +1100
Subject: ppc64 vDSO update
Message-ID: <1100059748.10262.292.camel@gaston>

At the URL below, you can find a new version of the ppc64 vDSO patch
against a recent Linus bk tree. I intend to submit it upstream real soon
as the work on non-executable stack is waiting for it, though we must
first make sure the way symbols are exported to userland is ok for glibc.
http://gate.crashing.org/~benh/ppc64-vdso-20041110.diff

Following Roland's comments, plus my own updates, I've made the following
changes:

 - Renamed _v_ and __v_* symbols to __kernel_* to match the x86 vDSO

 - Added symbol versions (currently LINUX_2.6.10) and don't export a few
   things that are really internal to the vDSO.

 - Added a new export to userland: __kernel_sync_dicache (prototype below)
   that does the dcache flush / icache invalidate necessary to turn data
   into executable code. I also added the dynamic symbol patching
   mechanism I wrote about earlier, so that the "__kernel_sync_dicache"
   symbol is automatically modified by the kernel to point to the right
   version based on the CPU you are running on. At this point, there are
   2 implementations, a generic one and a POWER5 one. Ultimately, this
   mechanism will be used for a lot more "alternatives" once we get more
   functions in the vDSO, most notably locks.

In the long run, I plan to work on improving the kernel side
implementation, for example by avoiding some of the runtime symbol lookup
done by the kernel at boot and replacing it with build-time generation,
that sort of thing, along with getting more functions exposed to
glibc/userland.

For now, however, I'd like to settle with what we have here, so we can
properly "hook" glibc & ld.so on to __kernel_gettimeofday() and
__kernel_sync_dicache().

Ben.

From dwm at austin.ibm.com Wed Nov 10 16:27:00 2004
From: dwm at austin.ibm.com (Doug Maxey)
Date: Tue, 09 Nov 2004 23:27:00 -0600
Subject: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors
In-Reply-To: <20041110015426.GC19474@bubble.modra.org>
Message-ID: <200411100527.iAA5R0cO020672@falcon10.austin.ibm.com>

On Wed, 10 Nov 2004 12:24:26 +1030, Alan Modra wrote:
>On Tue, Nov 09, 2004 at 07:40:32PM -0600, Doug Maxey wrote:
>> Have I created a monster by missing some flags?
>
>Probably a wrong BOOTCC compiler or something like that. I can't tell
>because of the idiotic kernel practice of hiding actual commands run
>by make.
If you use make V=1 it will likely become obvious. > Below is the 'make O=... V=1 zImage' from the point of the previous failure. I should point out that with the stock gcc-3.3.3-43.24, the compile _does_ succeed. -------------- next part -------------- COMMAND=={make} ARGS={O=/build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64 V=1 zImage} STARTED Tue Nov 9 23:14:15 2004 ON io-browns cmd=={make O=/build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64 V=1 zImage} make -C /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64 \ KBUILD_SRC=/build/dwm/linux/lk-2.6.10-rc1-mm4.edit KBUILD_VERBOSE=1 \ KBUILD_CHECK= KBUILD_EXTMOD="" \ -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/Makefile zImage Using /build/dwm/linux/lk-2.6.10-rc1-mm4.edit as source for kernel if [ -h /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include/asm -o -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/.config ]; then \ echo " /build/dwm/linux/lk-2.6.10-rc1-mm4.edit is not clean, please run 'make mrproper'";\ echo " in the '/build/dwm/linux/lk-2.6.10-rc1-mm4.edit' directory.";\ /bin/false; \ fi; if [ ! -d include2 ]; then mkdir -p include2; fi; ln -fsn /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include/asm-ppc64 include2/asm if /usr/bin/env test ! /build/dwm/linux/lk-2.6.10-rc1-mm4.edit -ef /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64; then \ /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mkmakefile \ /build/dwm/linux/lk-2.6.10-rc1-mm4.edit /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64 2 6 \ > /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64/Makefile; \ echo ' GEN /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64/Makefile'; \ fi GEN /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64/Makefile CHK include/linux/version.h make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/Makefile silentoldconfig make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=scripts/basic if /usr/bin/env test ! 
/build/dwm/linux/lk-2.6.10-rc1-mm4.edit -ef /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64; then \ /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mkmakefile \ /build/dwm/linux/lk-2.6.10-rc1-mm4.edit /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64 2 6 \ > /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64/Makefile; \ echo ' GEN /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64/Makefile'; \ fi GEN /build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64/Makefile make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=scripts/kconfig silentoldconfig scripts/kconfig/conf -s arch/ppc64/Kconfig # # using defaults found in .config # SPLIT include/linux/autoconf.h -> include/config/* make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=scripts/basic make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=scripts make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=scripts/genksyms make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=scripts/mod make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=arch/ppc64/kernel arch/ppc64/kernel/asm-offsets.s make[2]: `arch/ppc64/kernel/asm-offsets.s' is up to date. 
make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=init CHK include/linux/compile.h make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=usr set -e; echo ' CHK usr/initramfs_list'; mkdir -p usr/; /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/gen_initramfs_list.sh "" > usr/initramfs_list.tmp; if [ -r usr/initramfs_list ] && cmp -s usr/initramfs_list usr/initramfs_list.tmp; then rm -f usr/initramfs_list.tmp; else echo ' UPD usr/initramfs_list'; mv -f usr/initramfs_list.tmp usr/initramfs_list; fi CHK usr/initramfs_list make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=arch/ppc64/kernel make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=arch/ppc64/mm make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=arch/ppc64/xmon make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=kernel gzip -f -9 < .config > kernel/config_data.gz (echo "const char kernel_config_data[] = MAGIC_START"; cat kernel/config_data.gz | scripts/bin2c; echo "MAGIC_END;") > kernel/config_data.h gcc -m64 -Wp,-MD,kernel/.configs.o.d -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/kernel -Ikernel -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -O2 -fno-omit-frame-pointer -msoft-float -pipe -mminimal-toc -mtraceback=none -mtune=power4 -funit-at-a-time -Wdeclaration-after-statement -DKBUILD_BASENAME=configs -DKBUILD_MODNAME=configs -c -o kernel/.tmp_configs.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/kernel/configs.c make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=kernel/irq ld -m elf64ppc -m elf64ppc -r -o kernel/built-in.o kernel/sched.o kernel/fork.o kernel/exec_domain.o kernel/panic.o kernel/printk.o kernel/profile.o kernel/exit.o kernel/itimer.o kernel/time.o kernel/softirq.o kernel/resource.o 
kernel/sysctl.o kernel/capability.o kernel/ptrace.o kernel/timer.o kernel/user.o kernel/signal.o kernel/sys.o kernel/kmod.o kernel/workqueue.o kernel/pid.o kernel/rcupdate.o kernel/intermodule.o kernel/extable.o kernel/params.o kernel/posix-timers.o kernel/kthread.o kernel/wait.o kernel/kfifo.o kernel/sys_ni.o kernel/futex.o kernel/dma.o kernel/cpu.o kernel/spinlock.o kernel/module.o kernel/kallsyms.o kernel/acct.o kernel/compat.o kernel/configs.o kernel/stop_machine.o kernel/audit.o kernel/auditsc.o kernel/ksysfs.o kernel/irq/built-in.o make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=mm make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/autofs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/autofs4 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/cifs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/cramfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/devpts make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/exportfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/ext2 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/ext3 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/fat make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/hfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/hfsplus make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/hugetlbfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/isofs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/jbd make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/jfs make -f 
/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/lockd make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/minix make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/msdos make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/ncpfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/nfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/nfsd make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/nls make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/partitions make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/proc make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/ramfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/reiserfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/smbfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/sysfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/udf make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/vfat make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=fs/xfs make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=ipc make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=security make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=security/selinux make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=security/selinux/ss make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=crypto make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/base make -f 
/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/base/power make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/block make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/cdrom make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/char make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/char/drm make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/char/ipmi make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/char/watchdog make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/firmware make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/i2c make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/i2c/algos make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/i2c/busses make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/i2c/chips make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/ide make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/ide/arm make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/ide/legacy make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/ide/pci make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/ieee1394 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/input make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/input/keyboard make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/input/misc make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/input/mouse make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build 
obj=drivers/input/serio make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/md make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/common make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/b2c2 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/bt8xx make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/cinergyT2 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/dibusb make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/dvb-core make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/frontends make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/ttpci make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/ttusb-budget make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/dvb/ttusb-dec make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/radio make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/media/video make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/misc make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/net make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/net/appletalk make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/net/bonding make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/net/e1000 make -f 
/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/net/fc make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/net/ixgb make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/net/tokenring make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/pci make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/pci/hotplug make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/scsi make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/scsi/ibmvscsi make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/scsi/qla2xxx make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/scsi/sym53c8xx_2 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/serial make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/usb make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/usb/core make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/usb/host make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/usb/input make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/usb/serial make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/usb/storage make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/video make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/video/aty make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/video/console make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/video/logo make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=drivers/video/matrox make -f 
/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=sound make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=arch/ppc64/oprofile make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/802 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/8021q make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/appletalk make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/bridge make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/bridge/netfilter make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/core make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/ethernet make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/ipv4 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/ipv4/netfilter make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/ipv6 make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/ipv6/netfilter make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/ipx make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/key make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/llc make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/netlink make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/packet make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/sched make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/sctp make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/sunrpc make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/sunrpc/auth_gss make -f 
/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/unix make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=net/xfrm make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=lib make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=lib/zlib_deflate make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=lib/zlib_inflate make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=arch/ppc64/lib set -e; . /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mkversion > .tmp_version; mv -f .tmp_version .version; make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=init CHK include/linux/compile.h UPD include/linux/compile.h gcc -m64 -Wp,-MD,init/.version.o.d -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/init -Iinit -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -O2 -fno-omit-frame-pointer -msoft-float -pipe -mminimal-toc -mtraceback=none -mtune=power4 -funit-at-a-time -Wdeclaration-after-statement -DKBUILD_BASENAME=version -DKBUILD_MODNAME=version -c -o init/.tmp_version.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/init/version.c ld -m elf64ppc -m elf64ppc -r -o init/built-in.o init/main.o init/version.o init/mounts.o init/initramfs.o init/calibrate.o ld -m elf64ppc -m elf64ppc -Bstatic -e 0xc000000000000000 -Ttext 0xc000000000000000 -o .tmp_vmlinux1 -T arch/ppc64/kernel/vmlinux.lds arch/ppc64/kernel/head.o init/built-in.o --start-group usr/built-in.o arch/ppc64/kernel/built-in.o arch/ppc64/mm/built-in.o arch/ppc64/xmon/built-in.o kernel/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o lib/lib.a arch/ppc64/lib/lib.a lib/built-in.o arch/ppc64/lib/built-in.o drivers/built-in.o sound/built-in.o arch/ppc64/oprofile/built-in.o net/built-in.o --end-group echo 
'cmd_.tmp_vmlinux1 := ld -m elf64ppc -m elf64ppc -Bstatic -e 0xc000000000000000 -Ttext 0xc000000000000000 -o .tmp_vmlinux1 -T arch/ppc64/kernel/vmlinux.lds arch/ppc64/kernel/head.o init/built-in.o --start-group usr/built-in.o arch/ppc64/kernel/built-in.o arch/ppc64/mm/built-in.o arch/ppc64/xmon/built-in.o kernel/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o lib/lib.a arch/ppc64/lib/lib.a lib/built-in.o arch/ppc64/lib/built-in.o drivers/built-in.o sound/built-in.o arch/ppc64/oprofile/built-in.o net/built-in.o --end-group ' > ./..tmp_vmlinux1.cmd nm -n .tmp_vmlinux1 | scripts/kallsyms > .tmp_kallsyms1.S set -e; echo ' gcc -m64 -Wp,-MD,./..tmp_kallsyms1.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms1.o .tmp_kallsyms1.S'; gcc -m64 -Wp,-MD,./..tmp_kallsyms1.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms1.o .tmp_kallsyms1.S; scripts/basic/fixdep ./..tmp_kallsyms1.o.d .tmp_kallsyms1.o 'gcc -m64 -Wp,-MD,./..tmp_kallsyms1.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms1.o .tmp_kallsyms1.S' > ./..tmp_kallsyms1.o.tmp; rm -f ./..tmp_kallsyms1.o.d; mv -f ./..tmp_kallsyms1.o.tmp ./..tmp_kallsyms1.o.cmd gcc -m64 -Wp,-MD,./..tmp_kallsyms1.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms1.o .tmp_kallsyms1.S ld -m elf64ppc -m elf64ppc -Bstatic -e 0xc000000000000000 -Ttext 0xc000000000000000 -o .tmp_vmlinux2 -T arch/ppc64/kernel/vmlinux.lds arch/ppc64/kernel/head.o init/built-in.o --start-group usr/built-in.o arch/ppc64/kernel/built-in.o arch/ppc64/mm/built-in.o arch/ppc64/xmon/built-in.o kernel/built-in.o mm/built-in.o 
fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o lib/lib.a arch/ppc64/lib/lib.a lib/built-in.o arch/ppc64/lib/built-in.o drivers/built-in.o sound/built-in.o arch/ppc64/oprofile/built-in.o net/built-in.o --end-group .tmp_kallsyms1.o nm -n .tmp_vmlinux2 | scripts/kallsyms > .tmp_kallsyms2.S set -e; echo ' gcc -m64 -Wp,-MD,./..tmp_kallsyms2.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms2.o .tmp_kallsyms2.S'; gcc -m64 -Wp,-MD,./..tmp_kallsyms2.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms2.o .tmp_kallsyms2.S; scripts/basic/fixdep ./..tmp_kallsyms2.o.d .tmp_kallsyms2.o 'gcc -m64 -Wp,-MD,./..tmp_kallsyms2.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms2.o .tmp_kallsyms2.S' > ./..tmp_kallsyms2.o.tmp; rm -f ./..tmp_kallsyms2.o.d; mv -f ./..tmp_kallsyms2.o.tmp ./..tmp_kallsyms2.o.cmd gcc -m64 -Wp,-MD,./..tmp_kallsyms2.o.d -D__ASSEMBLY__ -nostdinc -iwithprefix include -D__KERNEL__ -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -c -o .tmp_kallsyms2.o .tmp_kallsyms2.S ld -m elf64ppc -m elf64ppc -Bstatic -e 0xc000000000000000 -Ttext 0xc000000000000000 -o vmlinux -T arch/ppc64/kernel/vmlinux.lds arch/ppc64/kernel/head.o init/built-in.o --start-group usr/built-in.o arch/ppc64/kernel/built-in.o arch/ppc64/mm/built-in.o arch/ppc64/xmon/built-in.o kernel/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o lib/lib.a arch/ppc64/lib/lib.a lib/built-in.o arch/ppc64/lib/built-in.o drivers/built-in.o sound/built-in.o arch/ppc64/oprofile/built-in.o net/built-in.o --end-group .tmp_kallsyms2.o echo 'cmd_vmlinux := ld -m elf64ppc -m elf64ppc -Bstatic -e 0xc000000000000000 -Ttext 0xc000000000000000 
-o vmlinux -T arch/ppc64/kernel/vmlinux.lds arch/ppc64/kernel/head.o init/built-in.o --start-group usr/built-in.o arch/ppc64/kernel/built-in.o arch/ppc64/mm/built-in.o arch/ppc64/xmon/built-in.o kernel/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o lib/lib.a arch/ppc64/lib/lib.a lib/built-in.o arch/ppc64/lib/built-in.o drivers/built-in.o sound/built-in.o arch/ppc64/oprofile/built-in.o net/built-in.o --end-group .tmp_kallsyms2.o' > ./.vmlinux.cmd echo ' /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mksysmap System.map' && /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mksysmap vmlinux System.map; if [ $? -ne 0 ]; then rm -f vmlinux; /bin/false; fi; /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mksysmap System.map echo ' /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mksysmap .tmp_System.map' && /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mksysmap .tmp_vmlinux2 .tmp_System.map /bin/sh /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/mksysmap .tmp_System.map cmp -s System.map .tmp_System.map || (echo Inconsistent kallsyms data; echo Try setting CONFIG_KALLSYMS_EXTRA_PASS; rm .tmp_kallsyms* ; /bin/false ) make -f /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/scripts/Makefile.build obj=arch/ppc64/boot arch/ppc64/boot/zImage strip -s vmlinux -o vmlinux.strip gzip -f -9 < vmlinux.strip > arch/ppc64/boot/kernel-vmlinux.strip.gz touch arch/ppc64/boot/kernel-vmlinux.strip.c gcc -Wp,-MD,arch/ppc64/boot/.kernel-vmlinux.strip.o.d -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -c -o arch/ppc64/boot/kernel-vmlinux.strip.o arch/ppc64/boot/kernel-vmlinux.strip.c objcopy arch/ppc64/boot/kernel-vmlinux.strip.o --add-section=.kernel:vmlinux.strip=arch/ppc64/boot/kernel-vmlinux.strip.gz --set-section-flags=.kernel:vmlinux.strip=contents,alloc,load,readonly,data gcc -Wp,-MD,arch/ppc64/boot/.crt0.o.d -D__ASSEMBLY__ 
-Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -traditional -c -o arch/ppc64/boot/crt0.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/crt0.S gcc -Wp,-MD,arch/ppc64/boot/.string.o.d -D__ASSEMBLY__ -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -traditional -c -o arch/ppc64/boot/string.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/string.S gcc -Wp,-MD,arch/ppc64/boot/.prom.o.d -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -c -o arch/ppc64/boot/prom.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/prom.c gcc -Wp,-MD,arch/ppc64/boot/.main.o.d -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -c -o arch/ppc64/boot/main.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/main.c gcc -Wp,-MD,arch/ppc64/boot/.zlib.o.d -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -c -o arch/ppc64/boot/zlib.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/zlib.c Generating arch/ppc64/boot/imagesize.c ls -l vmlinux.strip | \ awk '{printf "/* generated -- do not edit! 
*/\n" \ "unsigned long vmlinux_filesize = %d;\n", $5}' > arch/ppc64/boot/imagesize.c nm -n vmlinux | tail -n 1 | \ awk '{printf "unsigned long vmlinux_memsize = 0x%s;\n", substr($1,8)}' \ >> arch/ppc64/boot/imagesize.c gcc -Wp,-MD,arch/ppc64/boot/.imagesize.o.d -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -c -o arch/ppc64/boot/imagesize.o arch/ppc64/boot/imagesize.c gcc -Wp,-MD,arch/ppc64/boot/.div64.o.d -D__ASSEMBLY__ -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -traditional -c -o arch/ppc64/boot/div64.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/div64.S gcc -Wp,-MD,arch/ppc64/boot/.addnote.d -Iarch/ppc64/boot -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -o arch/ppc64/boot/addnote /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/addnote.c ld -Ttext 0x00400000 -e _start -T /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/zImage.lds -o arch/ppc64/boot/zImage arch/ppc64/boot/crt0.o arch/ppc64/boot/string.o arch/ppc64/boot/prom.o arch/ppc64/boot/main.o arch/ppc64/boot/zlib.o arch/ppc64/boot/imagesize.o arch/ppc64/boot/div64.o arch/ppc64/boot/kernel-vmlinux.strip.o && arch/ppc64/boot/addnote arch/ppc64/boot/zImage ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/crt0.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/string.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/prom.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/main.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/zlib.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 
architecture of input file `arch/ppc64/boot/imagesize.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/div64.o' is incompatible with powerpc:common output ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/kernel-vmlinux.strip.o' is incompatible with powerpc:common output arch/ppc64/boot/prom.o(.text+0x3d0): In function `.fputs': : undefined reference to `.strlen' arch/ppc64/boot/main.o(.text+0x5c0): In function `.start': : undefined reference to `.memmove' arch/ppc64/boot/main.o(.text+0x688): In function `.start': : undefined reference to `.flush_cache' arch/ppc64/boot/main.o(.text+0x7b0): In function `.start': : undefined reference to `.memmove' arch/ppc64/boot/zlib.o(.text+0x934): In function `.inflateIncomp': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0x9c4): In function `.inflateIncomp': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0xb38): In function `.inflate_flush': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0xbc0): In function `.inflate_flush': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0xc64): In function `.inflate_flush': : undefined reference to `.memcpy' arch/ppc64/boot/zlib.o(.text+0x1e30): more undefined references to `.memcpy' follow CC { Reading specs from /opt/gcc-3.4.2/lib/gcc/powerpc64-unknown-linux-gnu/3.4.2/specs Configured with: /build/dwm/toolchain/gcc-3.4.2/configure --prefix=/opt/gcc-3.4.2 --disable-multilib --with-ld=/opt/binutils-2.15-041109/bin/ld --with-as=/opt/binutils-2.15-041109/bin/as --enable-languages=c,c++,f77 --enable-altivec Thread model: posix gcc version 3.4.2 } UNAME Linux io-browns 2.6.5-7.97-pseries64 #1 SMP Fri Jul 2 14:21:59 UTC 2004 ppc64 ppc64 ppc64 GNU/Linux uid=1001(dwm) gid=100(users) groups=10(wheel),14(uucp),16(dialout),17(audio),33(video),100(users) COMPLETE at Tue Nov 9 23:14:40 2004 RETURN from {make 
O=/build/dwm/build/lk-2.6.10-rc1-mm4.edit/ppc64 V=1 zImage} is 0
ELAPSED time 0:00:25

From anton at samba.org Wed Nov 10 16:21:53 2004
From: anton at samba.org (Anton Blanchard)
Date: Wed, 10 Nov 2004 16:21:53 +1100
Subject: [PATCH] ppc64: Fix for cpu hotplug + NUMA
Message-ID: <20041110052153.GA4417@krispykreme.ozlabs.ibm.com>

Hi,

It was possible to hotplug add a new cpu whose node id was numnodes or
larger. This meant we had a cpu with a node id that didn't have any
NODE_DATA() backing it.

The following patch catches this in numa_setup_cpu and forces the node
to 0. We also make a pass through all cpus at boot to look for the
maximum node id - this catches the case at boot.

As for hotplug, since the node doesn't have any memory backing it
(otherwise the node would have been onlined), we can assume the machine
isn't configured for performance and just force the cpu into node 0.

Signed-off-by: Anton Blanchard

diff -puN arch/ppc64/mm/numa.c~add_numa_cpu_summary arch/ppc64/mm/numa.c
--- gr_work/arch/ppc64/mm/numa.c~add_numa_cpu_summary	2004-11-09 18:09:01.482506301 -0600
+++ gr_work-anton/arch/ppc64/mm/numa.c	2004-11-09 22:50:34.425545566 -0600
@@ -216,7 +216,7 @@ static int numa_setup_cpu(unsigned long
 
 	numa_domain = of_node_numa_domain(cpu);
 
-	if (numa_domain >= MAX_NUMNODES) {
+	if (numa_domain >= numnodes) {
 		/*
 		 * POWER4 LPAR uses 0xffff as invalid node,
 		 * dont warn in this case.
@@ -265,6 +265,7 @@ static int cpu_numa_callback(struct noti
 
 static int __init parse_numa_properties(void)
 {
+	struct device_node *cpu = NULL;
 	struct device_node *memory = NULL;
 	int max_domain = 0;
 	long entries = lmb_end_of_DRAM() >> MEMORY_INCREMENT_SHIFT;
@@ -290,6 +291,28 @@ static int __init parse_numa_properties(
 
 	max_domain = numa_setup_cpu(boot_cpuid);
 
+	/*
+	 * Even though we connect cpus to numa domains later in SMP init,
+	 * we need to know the maximum node id now. This is because each
+	 * node id must have NODE_DATA etc backing it.
+	 * As a result of hotplug we could still have cpus appear later on
+	 * with larger node ids. In that case we force the cpu into node 0.
+	 */
+	for_each_cpu(i) {
+		int numa_domain;
+
+		cpu = find_cpu_node(i);
+
+		if (cpu) {
+			numa_domain = of_node_numa_domain(cpu);
+			of_node_put(cpu);
+
+			if (numa_domain < MAX_NUMNODES &&
+			    max_domain < numa_domain)
+				max_domain = numa_domain;
+		}
+	}
+
 	memory = NULL;
 	while ((memory = of_find_node_by_type(memory, "memory")) != NULL) {
 		unsigned long start;

From anton at samba.org Wed Nov 10 16:41:37 2004
From: anton at samba.org (Anton Blanchard)
Date: Wed, 10 Nov 2004 16:41:37 +1100
Subject: [PATCH] ppc64: Small OF fixes
Message-ID: <20041110054137.GB4417@krispykreme.ozlabs.ibm.com>

Hi,

A few small fixes:

- Check for the model property before using it. Also check for the
  correct property name for nighthawk.

- Check for the existence of freeze-time-base before using it to
  synchronize timebases instead of making it conditional on !LPAR

Signed-off-by: Anton Blanchard

diff -puN arch/ppc64/mm/hash_native.c~baremetal1 arch/ppc64/mm/hash_native.c
--- gr_work/arch/ppc64/mm/hash_native.c~baremetal1	2004-11-09 02:49:07.623626848 -0600
+++ gr_work-anton/arch/ppc64/mm/hash_native.c	2004-11-09 14:25:44.100397586 -0600
@@ -405,7 +405,7 @@ void hpte_init_native(void)
 	root = of_find_node_by_path("/");
 	if (root) {
 		model = get_property(root, "model", NULL);
-		if (!strcmp(model, "CHRP IBM,9076-N81")) {
+		if (model && !strcmp(model, "IBM,9076-N81")) {
 			of_node_put(root);
 			goto bail;
 		}
--- gr_work/arch/ppc64/kernel/pSeries_smp.c~baremetal1	2004-11-09 15:52:13.809408014 -0600
+++ gr_work-anton/arch/ppc64/kernel/pSeries_smp.c	2004-11-09 15:56:18.418031140 -0600
@@ -382,12 +382,11 @@ void __init smp_init_pSeries(void)
 	vpa_init(boot_cpuid);
 
 	/* Non-lpar has additional take/give timebase */
-	if (systemcfg->platform == PLATFORM_PSERIES) {
+	if (rtas_token("freeze-time-base") != RTAS_UNKNOWN_SERVICE) {
 		smp_ops->give_timebase = pSeries_give_timebase;
 		smp_ops->take_timebase = pSeries_take_timebase;
 	}

-	DBG(" <- smp_init_pSeries()\n");
 }

From amodra at bigpond.net.au Wed Nov 10 16:47:39 2004
From: amodra at bigpond.net.au (Alan Modra)
Date: Wed, 10 Nov 2004 16:17:39 +1030
Subject: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors
In-Reply-To: <200411100527.iAA5R0cO020672@falcon10.austin.ibm.com>
References: <20041110015426.GC19474@bubble.modra.org> <200411100527.iAA5R0cO020672@falcon10.austin.ibm.com>
Message-ID: <20041110054739.GB32175@bubble.modra.org>

On Tue, Nov 09, 2004 at 11:27:00PM -0600, Doug Maxey wrote:
> gcc -Wp,-MD,arch/ppc64/boot/.crt0.o.d -D__ASSEMBLY__ -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -Iinclude -Iinclude2 -I/build/dwm/linux/lk-2.6.10-rc1-mm4.edit/include -fno-builtin -traditional -c -o arch/ppc64/boot/crt0.o /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/crt0.S

OK, you used plain unadorned gcc to compile crt0.S, without -m64 or -m32.
That ought to produce ppc32 code.

> ld -Ttext 0x00400000 -e _start -T /build/dwm/linux/lk-2.6.10-rc1-mm4.edit/arch/ppc64/boot/zImage.lds -o arch/ppc64/boot/zImage arch/ppc64/boot/crt0.o arch/ppc64/boot/string.o arch/ppc64/boot/prom.o arch/ppc64/boot/main.o arch/ppc64/boot/zlib.o arch/ppc64/boot/imagesize.o arch/ppc64/boot/div64.o arch/ppc64/boot/kernel-vmlinux.strip.o && arch/ppc64/boot/addnote arch/ppc64/boot/zImage

And here you used plain unadorned ld too, which ought to be producing
ppc32 output from ppc32 input, and indeed the powerpc:common output
selection says you have the right ld.

> ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/crt0.o' is incompatible with powerpc:common output

But this says your gcc produces ppc64 code by default..

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

From dwm at austin.ibm.com Wed Nov 10 17:11:08 2004
From: dwm at austin.ibm.com (Doug Maxey)
Date: Wed, 10 Nov 2004 00:11:08 -0600
Subject: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors
In-Reply-To: <20041110054739.GB32175@bubble.modra.org>
Message-ID: <200411100611.iAA6B8Ww020817@falcon10.austin.ibm.com>

On Wed, 10 Nov 2004 16:17:39 +1030, Alan Modra wrote:
>> ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/crt0.o' is incompatible with powerpc:common output
>
>But this says your gcc produces ppc64 code by default..
>

Yes indeed. Do you have the incantation for gcc configure that will
accomplish that?

++doug

From amodra at bigpond.net.au Wed Nov 10 17:28:14 2004
From: amodra at bigpond.net.au (Alan Modra)
Date: Wed, 10 Nov 2004 16:58:14 +1030
Subject: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors
In-Reply-To: <200411100611.iAA6B8Ww020817@falcon10.austin.ibm.com>
References: <20041110054739.GB32175@bubble.modra.org> <200411100611.iAA6B8Ww020817@falcon10.austin.ibm.com>
Message-ID: <20041110062814.GC32175@bubble.modra.org>

On Wed, Nov 10, 2004 at 12:11:08AM -0600, Doug Maxey wrote:
>
> On Wed, 10 Nov 2004 16:17:39 +1030, Alan Modra wrote:
> >> ld: warning: powerpc:common64 architecture of input file `arch/ppc64/boot/crt0.o' is incompatible with powerpc:common output
> >
> >But this says your gcc produces ppc64 code by default..
> >
>
> Yes indeed. Do you have the incantation for gcc configure that will
> accomplish that?

Use --with-cpu=default32

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

From dwm at austin.ibm.com Wed Nov 10 18:50:13 2004
From: dwm at austin.ibm.com (Doug Maxey)
Date: Wed, 10 Nov 2004 01:50:13 -0600
Subject: ppc64 gcc-3.4.2 lk-2.6.10-rc1-mm4 link errors
In-Reply-To: <20041110062814.GC32175@bubble.modra.org>
Message-ID: <200411100750.iAA7oDI5021055@falcon10.austin.ibm.com>

On Wed, 10 Nov 2004 16:58:14 +1030, Alan Modra wrote:
>
>Use --with-cpu=default32

That put some gravy on those bisquits. :)

Now to actually test. Will get to that later this morning.

++doug

From segher at kernel.crashing.org Wed Nov 10 23:59:42 2004
From: segher at kernel.crashing.org (Segher Boessenkool)
Date: Wed, 10 Nov 2004 13:59:42 +0100
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To: <1100044616.3884.246.camel@gaston>
References: <20041108234252.GA12837@bubble.modra.org> <200411092303.iA9N3DwB019266@falcon10.austin.ibm.com> <20041109232210.GA19474@bubble.modra.org> <1100044616.3884.246.camel@gaston>
Message-ID: <63B9230B-3318-11D9-89B3-000A95A4DC02@kernel.crashing.org>

On 10-nov-04, at 0:56, Benjamin Herrenschmidt wrote:

> On Wed, 2004-11-10 at 09:52 +1030, Alan Modra wrote:
>> On Tue, Nov 09, 2004 at 05:03:13PM -0600, Doug Maxey wrote:
>>> entry.S:100: Error: operand out of range (268 is not between 0 and 31)
>>> entry.S:100: Error: missing operand
>> [snip]
>>> 100 cmpli 0,r0,NR_syscalls
>>
>> OK, this one is Segher's patch, now reverted, making the L operand on
>> cmpli non-optional.
>>
>> Any of "cmpli cr,l,r,imm", "cmplwi cr,r,imm", or "cmpldi cr,r,imm"
>> will be accepted by the assembler.
>
>
> Yah, Segher seem to have a deep love with the idea of pushing gas
> changes that break pretty much everything out there ;)

It's not a "change", it's a bugfix. And it only is reverted because
it also changed tlbie syntax to conform to the architecture
specification -- that unfortunately is directly contradicted by the PEM.
Will fix, but I'm lazy, so you have some time to fix all your bugs
before it's too late ;-)


Segher

From benh at kernel.crashing.org Thu Nov 11 00:03:17 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 11 Nov 2004 00:03:17 +1100
Subject: [PATCH] PPC64 Poor assembly coding style
In-Reply-To: <63B9230B-3318-11D9-89B3-000A95A4DC02@kernel.crashing.org>
References: <20041108234252.GA12837@bubble.modra.org> <200411092303.iA9N3DwB019266@falcon10.austin.ibm.com> <20041109232210.GA19474@bubble.modra.org> <1100044616.3884.246.camel@gaston> <63B9230B-3318-11D9-89B3-000A95A4DC02@kernel.crashing.org>
Message-ID: <1100091797.25791.2.camel@gaston>

> It's not a "change", it's a bugfix. And it only is reverted because
> it also changed tlbie syntax to conform to the architecture
> specification -- that unfortunately is directly contradicted by the
> PEM. Will fix, but I'm lazy, so you have some time to fix all your
> bugs before it's too late ;-)

I'm sure it would please everybody if you send a kernel patch _before_
changing binutils tho... :)

Ben.

From benh at kernel.crashing.org Thu Nov 11 00:10:21 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 11 Nov 2004 00:10:21 +1100
Subject: ppc64 vDSO update
In-Reply-To: <1100059748.10262.292.camel@gaston>
References: <1100059748.10262.292.camel@gaston>
Message-ID: <1100092221.25814.7.camel@gaston>

.../...

> - Added a new export to userland: __kernel_sync_dicache (prototype below) that
> does the dcache flush / icache invalidate necessary to turn data into executable,
> I also added the dynamic symbol patching mechanism I wrote about earlier so that
> the "__kernel_sync_dicache" symbol is automatically modified by the kernel to
> point to the right version based on the CPU you are running on. At this point,
> there are 2 implementations, a generic one and a POWER5 one.
> Ultimately, this mechanism will be used for a lot more "alternatives"
> once we get more functions in the vDSO, most notably locks.
> .../...
>
> I forgot to describe the prototype of that new function, here it is:
>
> void __kernel_sync_dicache(unsigned long start, unsigned long end)
>
> Flushes the data cache & invalidate the instruction cache for the
> provided range [start, end[
>
> Ben.
>
>
>

From scheel at vnet.ibm.com Thu Nov 11 00:31:43 2004
From: scheel at vnet.ibm.com (Jeff Scheel)
Date: Wed, 10 Nov 2004 07:31:43 -0600
Subject: [Fwd: Fw: [PATCH] PURR data on iSeries Linux]
In-Reply-To: <20041110143852.661b465f.sfr@canb.auug.org.au>
References: <41913625.2020101@vnet.ibm.com> <20041110143852.661b465f.sfr@canb.auug.org.au>
Message-ID: <4192183F.3000609@vnet.ibm.com>

Stephen Rothwell wrote:

>Looks good in general. I cannot, of course, comment on the semantics due
>to lack of documentation of the HV interfaces :-)
>
>
You have as much documentation as I do. :-( The difference was that I
have additional "resources" like access to the hypervisor developer who
wrote the code. I'd send him to you if you'll send me a self-addressed
"box". :-)

>Just a couple of comments:
>
>
Thanks. Does this look better?

-Jeff

Signed-off by: Jeff Scheel (scheel at vnet.ibm.com)

--- linuxppc-2.6.9_rc1.orig/arch/ppc64/kernel/lparcfg.c	2004-11-09 07:03:43.354383000 -0600
+++ linuxppc-2.6.9_rc1/arch/ppc64/kernel/lparcfg.c	2004-11-10 07:11:14.588438917 -0600
@@ -34,7 +34,7 @@
 #include
 #include

-#define MODULE_VERS "1.4"
+#define MODULE_VERS "1.5"
 #define MODULE_NAME "lparcfg"

 /* #define LPARCFG_DEBUG */
@@ -70,6 +70,28 @@

 #ifdef CONFIG_PPC_ISERIES

+/*
+ * For iSeries legacy systems, the PPA purr function is available from the
+ * xEmulatedTimeBase field in the paca.
+ */
+static unsigned long get_purr(void)
+{
+	unsigned long sum_purr = 0;
+	int cpu;
+	struct paca_struct *lpaca;
+
+	for_each_cpu(cpu) {
+		lpaca = paca + cpu;
+		sum_purr += lpaca->xLpPaca.xEmulatedTimeBase;
+
+#ifdef PURR_DEBUG
+		printk(KERN_INFO "get_purr for cpu (%d) has value (%ld) \n",
+			cpu, lpaca->xLpPaca.xEmulatedTimeBase);
+#endif
+	}
+	return sum_purr;
+}
+
 #define lparcfg_write NULL

 /*
@@ -81,6 +103,7 @@
 	int shared, entitled_capacity, max_entitled_capacity;
 	int processors, max_processors;
 	struct paca_struct *lpaca = get_paca();
+	unsigned long purr = get_purr();

 	shared = (int)(lpaca->lppaca_ptr->xSharedProc);
 	seq_printf(m, "serial_number=%c%c%c%c%c%c%c\n",
@@ -131,6 +154,7 @@
 		seq_printf(m, "pool_capacity=%d\n",
 			   (int)(HvLpConfig_getNumProcsInSharedPool(pool_id) *
 				 100));
+		seq_printf(m, "purr=%ld\n", purr);
 	}

 	seq_printf(m, "shared_processor_mode=%d\n", shared);

From scheel at vnet.ibm.com Thu Nov 11 08:22:03 2004
From: scheel at vnet.ibm.com (Jeff Scheel)
Date: Wed, 10 Nov 2004 15:22:03 -0600
Subject: [Fwd: Fw: [PATCH] PURR data on iSeries Linux]
In-Reply-To: <4192183F.3000609@vnet.ibm.com>
References: <41913625.2020101@vnet.ibm.com> <20041110143852.661b465f.sfr@canb.auug.org.au> <4192183F.3000609@vnet.ibm.com>
Message-ID: <4192867B.1090701@vnet.ibm.com>

Got one additional suggestion from Will Schmidt on including the version
in the iSeries lparcfg data.

I've included that and attached a new patch.

-Jeff

Signed-off by: Jeff Scheel (scheel at vnet.ibm.com)

--- linuxppc-2.6.10_rc1.orig/arch/ppc64/kernel/lparcfg.c	2004-11-09 07:03:43.354383000 -0600
+++ linuxppc-2.6.10_rc1/arch/ppc64/kernel/lparcfg.c	2004-11-10 10:38:47.904484533 -0600
@@ -34,7 +34,7 @@
 #include
 #include

-#define MODULE_VERS "1.4"
+#define MODULE_VERS "1.5"
 #define MODULE_NAME "lparcfg"

 /* #define LPARCFG_DEBUG */
@@ -70,6 +70,28 @@

 #ifdef CONFIG_PPC_ISERIES

+/*
+ * For iSeries legacy systems, the PPA purr function is available from the
+ * xEmulatedTimeBase field in the paca.
+ */
+static unsigned long get_purr(void)
+{
+	unsigned long sum_purr = 0;
+	int cpu;
+	struct paca_struct *lpaca;
+
+	for_each_cpu(cpu) {
+		lpaca = paca + cpu;
+		sum_purr += lpaca->xLpPaca.xEmulatedTimeBase;
+
+#ifdef PURR_DEBUG
+		printk(KERN_INFO "get_purr for cpu (%d) has value (%ld) \n",
+			cpu, lpaca->xLpPaca.xEmulatedTimeBase);
+#endif
+	}
+	return sum_purr;
+}
+
 #define lparcfg_write NULL

 /*
@@ -81,6 +103,9 @@
 	int shared, entitled_capacity, max_entitled_capacity;
 	int processors, max_processors;
 	struct paca_struct *lpaca = get_paca();
+	unsigned long purr = get_purr();
+
+	seq_printf(m, "%s %s \n", MODULE_NAME, MODULE_VERS);

 	shared = (int)(lpaca->lppaca_ptr->xSharedProc);
 	seq_printf(m, "serial_number=%c%c%c%c%c%c%c\n",
@@ -131,6 +156,7 @@
 		seq_printf(m, "pool_capacity=%d\n",
 			   (int)(HvLpConfig_getNumProcsInSharedPool(pool_id) *
 				 100));
+		seq_printf(m, "purr=%ld\n", purr);
 	}

 	seq_printf(m, "shared_processor_mode=%d\n", shared);

From will_schmidt at vnet.ibm.com Thu Nov 11 02:17:06 2004
From: will_schmidt at vnet.ibm.com (will schmidt)
Date: Wed, 10 Nov 2004 09:17:06 -0600
Subject: [PATCH] PURR data on iSeries Linux
In-Reply-To: 
References: 
Message-ID: <419230F2.30204@vnet.ibm.com>

Hi,
Since we're incrementing the lparcfg version,.. It might be helpful
to actually print the value on the iSeries leg, like we do on the
non-iSeries leg. (an earlier oversight on my part..)

something like
	seq_printf(m, "%s %s \n", MODULE_NAME, MODULE_VERS);
up near the top of lparcfg_data().

-Will

jscheel at magnaspeed.net wrote:
> With the addition of the PURR for Power5 systems,
> applications have begun being built to utilize this value.
> One application in particular is looking for equivalent
> information for Linux running on legacy iSeries systems.
>
> The data necessary to report this data is available from
> the hypervisor. It simply has to be retrieved and reported
> by /proc/ppc64/lparcfg interface on iSeries.
>
> For those interested in testing this patch, retrieve the
> purr data at two defined intervals and calculate the
> difference. Then, multiply the delta by 100, divide by
> the number of seconds in your interval, divide by the
> "timebase" value from "/proc/cpuinfo", and divide again by
> the number of processors. This will provide the physical
> cpu utilization ranging from 1 to 100.
>
> For shared processor configurations, this will depend on
> workload with the actual value somewhere between 1 and 100.
> For dedicated processors, this number should always be
> 100 as the operating system gets all of the physical
> processor capacity.
>
> This patch to arch/ppc64/lparcfg.c reports this data. It
> has been tested on legacy iSeries systems and is not
> dependent on the Power5 PURR implementation.
> Please consider it for inclusion in the arch/ppc64 tree.
> Additionally, I would appreciate any and all comments.
>
> Thanks,
> -Jeff
>
> P.S. Sorry I had to use my personal email address but I
> can't get the patch out without losing tabs. I think this
> works. :-)
>
> Signed-off-by: Jeff Scheel
>
> --- linuxppc-2.6.9_rc1.orig/arch/ppc64/kernel/lparcfg.c	2004-11-09 07:03:43.354383000 -0600
> +++ linuxppc-2.6.9_rc1/arch/ppc64/kernel/lparcfg.c	2004-11-09 10:40:36.375934020 -0600
> @@ -34,7 +34,7 @@
>  #include
>  #include
>
> -#define MODULE_VERS "1.4"
> +#define MODULE_VERS "1.5"
>  #define MODULE_NAME "lparcfg"
>
>  /* #define LPARCFG_DEBUG */
> @@ -70,6 +70,30 @@
>
>  #ifdef CONFIG_PPC_ISERIES
>
> +static unsigned long get_purr(void);
> +
> +/*
> + * For iSeries legacy systems, the PPA purr function is available from the
> + * xEmulatedTimeBase field in the paca.
> + */
> +static unsigned long get_purr()
> +{
> +	unsigned long sum_purr=0;
> +	int cpu;
> +	struct paca_struct *lpaca;
> +
> +	for_each_online_cpu(cpu) {
> +		lpaca = paca + cpu;
> +		sum_purr += lpaca->xLpPaca.xEmulatedTimeBase;
> +
> +#ifdef PURR_DEBUG
> +		printk(KERN_INFO "get_purr for cpu (%x) has value (%lx) \n",
> +			cpu,lpaca->xLpPaca.xEmulatedTimeBase);
> +#endif
> +	}
> +	return sum_purr;
> +}
> +
>  #define lparcfg_write NULL
>
>  /*
> @@ -81,6 +105,7 @@
>  	int shared, entitled_capacity, max_entitled_capacity;
>  	int processors, max_processors;
>  	struct paca_struct *lpaca = get_paca();
> +	unsigned long purr = get_purr();
>
>  	shared = (int)(lpaca->lppaca_ptr->xSharedProc);
>  	seq_printf(m, "serial_number=%c%c%c%c%c%c%c\n",
> @@ -131,6 +156,7 @@
>  		seq_printf(m, "pool_capacity=%d\n",
>  			   (int)(HvLpConfig_getNumProcsInSharedPool(pool_id) *
>  				 100));
> +		seq_printf(m, "purr=%ld\n", purr);
>  	}
>
>  	seq_printf(m, "shared_processor_mode=%d\n", shared);

From benh at kernel.crashing.org Thu Nov 11 14:32:48 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 11 Nov 2004 14:32:48 +1100
Subject: [RFC] ppc64 and calling conventions
Message-ID: <1100143968.24782.15.camel@gaston>

Hi !

Before I submit the ppc64 vDSO upstream, I want to make sure we decide
on the calling convention to the library. I've been toying with a couple
of options, but I'd like some feedback.

The current one still exports 'normal' function symbols, that is
procedure descriptors and "dot" symbols for the actual function. The
whole thing is linked at +1Mb and mapping it elsewhere would require
some sort of relocations to be done, or the call sites in glibc to do
some additional arithmetic. (I can do relocation from the kernel
triggering copy-on-write, but that seem like a bad idea ...)

Something that was proposed a while ago (by Ulrich I think) is that
instead, I could export simple non-descriptor symbols that point
directly to the code and link the whole thing at 0. That way, what
gets exported by the vDSO becomes offsets to the functions. Since glibc
will need call glue anyway, it doesn't make much difference, but it
allows the vDSO to be mapped anywhere.

What do you think ? Should I stick to exporting normal symbols or switch
to the above idea ?

Additionally, we may consider, in the future, some ld.so trickery so
that applications' OPDs are directly "fixed up" to point to the vDSO to
save the cost of the trampoline.

Ben.

From roland at redhat.com Thu Nov 11 14:49:43 2004
From: roland at redhat.com (Roland McGrath)
Date: Wed, 10 Nov 2004 19:49:43 -0800
Subject: [RFC] ppc64 and calling conventions
In-Reply-To: Benjamin Herrenschmidt's message of Thursday, 11 November 2004 14:32:48 +1100 <1100143968.24782.15.camel@gaston>
Message-ID: <200411110349.iAB3nhl5009825@magilla.sf.frob.com>

Frankly I don't think we are ready to commit now to saying what is useful.
There is significant work to be done and issues to be decided before glibc
will ever actually call any of your entry points. I would recommend that
you start by getting a working vDSO in that has no interesting entry
points, just the signal trampoline code (and it doesn't really matter
whether you give that symbols or not, except perhaps to gdb). We have the
support already there to make the trampoline unwind info available so that
you can start using it and supporting nonexecutable stacks.

I can't recommend right now that you do anything immediately to provide any
entry points that you might feel compelled to stay compatible with later.
We just haven't ironed out exactly how such things should be tied into
libc, and until we do, all bets are off as to whether anything you choose
now is something you'll want to stick with in the long run.
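
[Editorial sketch] The "link at 0 and export offsets" idea raised in this thread can be illustrated roughly as below. This is only a sketch under stated assumptions: `struct vdso_sym_offset` and `vdso_resolve()` are hypothetical names, not glibc or kernel interfaces, and a real ppc64 caller would still need the call glue mentioned in the thread on top of the raw code address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration: when the vDSO image is linked at address 0,
 * the value of each exported symbol is simply its offset from wherever
 * the kernel happened to map the image.
 */
struct vdso_sym_offset {
	const char *name;	/* exported symbol name */
	uint64_t offset;	/* symbol value when linked at 0 */
};

/* Turn an exported offset into a code address for a given mapping base. */
static uint64_t vdso_resolve(uint64_t map_base,
			     const struct vdso_sym_offset *sym)
{
	return map_base + sym->offset;
}
```

With this shape the same image can be mapped at any base without relocations; only the caller's addition changes.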

From sfr at canb.auug.org.au Thu Nov 11 14:56:30 2004
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Thu, 11 Nov 2004 14:56:30 +1100
Subject: [Fwd: Fw: [PATCH] PURR data on iSeries Linux]
In-Reply-To: <4192867B.1090701@vnet.ibm.com>
References: <41913625.2020101@vnet.ibm.com> <20041110143852.661b465f.sfr@canb.auug.org.au> <4192183F.3000609@vnet.ibm.com> <4192867B.1090701@vnet.ibm.com>
Message-ID: <20041111145630.3e76d1a5.sfr@canb.auug.org.au>

On Wed, 10 Nov 2004 15:22:03 -0600 Jeff Scheel wrote:
>
> Got one additional suggestion from Will Schmidt on including the version
> in the iSeries lparcfg data.
>
> I've included that and attached a new patch.

Looks fine to me. Send it to Andrew ...

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From benh at kernel.crashing.org Thu Nov 11 15:09:52 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 11 Nov 2004 15:09:52 +1100
Subject: [RFC] ppc64 and calling conventions
In-Reply-To: <200411110349.iAB3nhl5009825@magilla.sf.frob.com>
References: <200411110349.iAB3nhl5009825@magilla.sf.frob.com>
Message-ID: <1100146192.24782.23.camel@gaston>

On Wed, 2004-11-10 at 19:49 -0800, Roland McGrath wrote:
> Frankly I don't think we are ready to commit now to saying what is useful.
> There is significant work to be done and issues to be decided before glibc
> will ever actually call any of your entry points.

As far as the ppc vDSO is concerned, that is the issue that remains to
be decided... What are those issues that would block it on glibc side ?
x86_64 already calls the vDSO for gettimeofday(), though x86 uses hard
coded addresses outside of TASK_SIZE, we need something different.

> I would recommend that
> you start by getting a working vDSO in that has no interesting entry
> points, just the signal trampoline code (and it doesn't really matter
> whether you give that symbols or not, except perhaps to gdb). We have the
> support already there to make the trampoline unwind info available so that
> you can start using it and supporting nonexecutable stacks.

That is easy, I can just remove them... or leave them in and if we want
to expose things differently, symbol versioning will make sure there is
no backward compatibility problem. One of the reasons here is that we
have a rather urgent need for the userland gettimeofday for the JVM
folks (though the JVM is such a special case, I may even end up adding a
couple of "jvm special" calls in the vDSO that they would get to
directly).

> I can't recommend right now that you do anything immediately to provide any
> entry points that you might feel compelled to stay compatible with later.
> We just haven't ironed out exactly how such things should be tied into
> libc, and until we do, all bets are off as to whether anything you choose
> now is something you'll want to stick with in the long run.

Ok, so can we start this discussion now then ? I'll separately commit a
version of the vDSO with only the signal tramps, and maybe
__kernel_gettimeofday.

Ben.

From paulus at samba.org Thu Nov 11 16:07:39 2004
From: paulus at samba.org (Paul Mackerras)
Date: Thu, 11 Nov 2004 16:07:39 +1100
Subject: [PATCH 1/1] rtas_flash_4gig
In-Reply-To: <20041108151459.267f0165@localhost>
References: <200410041942.i94Jg4WA154540@westrelay04.boulder.ibm.com> <16758.55568.809557.670513@cargo.ozlabs.ibm.com> <20041020170817.0ee49b64@localhost> <16768.28322.583827.9327@cargo.ozlabs.ibm.com> <20041108151459.267f0165@localhost>
Message-ID: <16786.62363.70577.484065@cargo.ozlabs.ibm.com>

Jake Moilanen writes:

> Correct me if I'm wrong, but the ibm,flash-block-version will always be
> 1 if we have ibm,update-flash-64-and-reboot (from 270 onwards), and if
> we don't have the ibm,update-flash-64-and-reboot, we will not have
> gotten to this point anyways.

The RPA doesn't say that discontiguous block list support is
required. To be properly compliant, we should check for the
ibm,flash-block-version property, even if we "know" that every machine
will have it.

> The block list pages do not have the restriction of being under 4G. They
> just have to be page aligned and not cross a LMB. Since we only
> allocate one page for every memory block using get_zeroed_page, that
> should be fine.

The block_list has to be below 4GB. The RPA doesn't actually say
whether the block_list extensions need to be below 4GB also, but I
don't see why there would be the requirement on the first block_list
but not the subsequent ones. I have sent off a query to the powers
that be to ask.

> Why do you feel we need to check if the pages overlap OF's memory?

RPA requirement R-7.3.7.2-4.

Regards,
Paul.
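
[Editorial sketch] The two address constraints discussed above (page alignment, and staying below 4GB) can be expressed as a small check. This is a hypothetical helper, not code from rtas_flash: `block_list_page_ok()`, `SKETCH_PAGE_SIZE` and `SKETCH_4GB` are made-up names, a 4K page size is assumed, and the sketch conservatively applies the 4GB limit to every block-list page even though the thread leaves open whether extension pages need it. It does not model the LMB-crossing or OF-overlap requirements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed 4K pages */
#define SKETCH_4GB	 (1ULL << 32)

/*
 * Hypothetical check for one block-list page: it must start on a page
 * boundary and lie entirely below the 4GB mark.
 */
static bool block_list_page_ok(uint64_t phys_addr)
{
	if (phys_addr & (SKETCH_PAGE_SIZE - 1))
		return false;		/* not page aligned */
	if (phys_addr + SKETCH_PAGE_SIZE > SKETCH_4GB)
		return false;		/* page reaches past 4GB */
	return true;
}
```

A caller could run such a check on each allocated page and retry the allocation when it fails; whether that policy is right depends on the answer to Paul's query.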

From drepper at redhat.com Thu Nov 11 16:11:15 2004
From: drepper at redhat.com (Ulrich Drepper)
Date: Wed, 10 Nov 2004 21:11:15 -0800
Subject: [RFC] ppc64 and calling conventions
In-Reply-To: <1100143968.24782.15.camel@gaston>
References: <1100143968.24782.15.camel@gaston>
Message-ID: <4192F473.3040800@redhat.com>

Benjamin Herrenschmidt wrote:

> The current one still exports 'normal' function symbols, that is
> procedure descriptors and "dot" symbols for the actual function.

What dot symbols? They are dead. We don't have them anymore.

> Something that was proposed a while ago (by Ulrich I think) is that
> instead, I could export simple non-descriptor symbols that point
> directly to the code and link the whole thing at 0.

Well, something like this. The vDSO must be freely relocatable without
changes unless you forever want to have one address. ld.so won't
perform any relocations. So, letting libc figure out the descriptors is
possible, just provide symbol offsets.

-- 
Ulrich Drepper, Red Hat, Inc., 444 Castro St, Mountain View, CA

From benh at kernel.crashing.org Thu Nov 11 16:56:26 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 11 Nov 2004 16:56:26 +1100
Subject: [RFC] ppc64 and calling conventions
In-Reply-To: <4192F473.3040800@redhat.com>
References: <1100143968.24782.15.camel@gaston> <4192F473.3040800@redhat.com>
Message-ID: <1100152586.24782.26.camel@gaston>

On Wed, 2004-11-10 at 21:11 -0800, Ulrich Drepper wrote:
> Benjamin Herrenschmidt wrote:
>
> > The current one still exports 'normal' function symbols, that is
> > procedure descriptors and "dot" symbols for the actual function.
>
> What dot symbols? They are dead.
> We don't have them anymore.

They are handy to get to the actual function with a debugger, we were
pretty unhappy in kernel-land to see them gone... but I can remove them.

> > Something that was proposed a while ago (by Ulrich I think) is that
> > instead, I could export simple non-descriptor symbols that point
> > directly to the code and link the whole thing at 0.
>
> Well, something like this. The vDSO must be freely relocatable without
> changes unless you forever want to have one address. ld.so won't
> perform any relocations. So, letting libc figure out the descriptors is
> possible, just provide symbol offsets.

Yah, I prefer that solution too.

Ben.

From l_indien at magic.fr Fri Nov 12 01:31:48 2004
From: l_indien at magic.fr (J. Mayer)
Date: Thu, 11 Nov 2004 15:31:48 +0100
Subject: Booting Imac G5 : success !
In-Reply-To: <1099787630.3884.118.camel@gaston>
References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston>
Message-ID: <1100183508.8346.5849.camel@rapid>

Hi,

I got a hack to fix the SATA problem on Imac G5. I was right thinking we
were missing IRQs. Here's what I think about it:
the SATA driver tries to use IRQ 0, which is not the good one. When I
look into the OF device tree, I can see this:
# hexdump /proc/device-tree/ht\@0\,f2000000/pci\@3/k2-sata-root\@c/k2-sata\@0/interrupts
0000000 0000 0000
0000004
# hexdump /proc/device-tree/ht\@0\,f2000000/pci\@3/k2-sata-root\@c/interrupts
0000000 0000 000a 0000 0001
0000008

It seems that the driver uses the first (bad) one so I added a test:
if (pdev->irq == 0) pdev->irq = 0xA;
in drivers/scsi/sata_svw.c
then SATA works properly.
Of course, this is just a hack, I think the real patch is to be in the
OF tree parse, not to keep invalid IRQ definitions.

You may know better than me the proper place and way to fix this issue
(generic OF tree routines, SATA driver ?).

Regards.

-- 
J. Mayer
Never organized

From scheel at vnet.ibm.com Fri Nov 12 01:58:03 2004
From: scheel at vnet.ibm.com (Jeff Scheel)
Date: Thu, 11 Nov 2004 08:58:03 -0600
Subject: [PATCH] iSeries legacy model emulation of PURR
Message-ID: <41937DFB.8030700@vnet.ibm.com>

Andrew,

Here's a patch to extend the current Linux on Power support for PURR to
legacy IBM iSeries servers (pre-Power5 processor models). This patch
enables the reporting of timebase metrics to reflect physical processor
utilization in a system running multiple logical partitions which share
the same physical processors.

The patch simply uses existing user interfaces for Linux IBM Power5
based servers to report data already collected by the hypervisor. The
values reported with each call are running values in units of the system
timebase. The calculation of physical processor utilization results
from two samples (purr1 and purr2) differing by a known interval (time)
such that:

    physical utilization = (purr2 - purr1) / (time * number of procs * timebase)

where the number of procs and timebase can be obtained from
/proc/cpuinfo.

Applications have been written to the interface already defined and
these applications have value back on the legacy iSeries models.

Please consider this patch for inclusion. It has been reviewed by Stephen.

-Jeff

Signed-off by: Jeff Scheel (scheel at vnet.ibm.com)

--- linuxppc-2.6.10_rc1.orig/arch/ppc64/kernel/lparcfg.c	2004-11-09 07:03:43.354383000 -0600
+++ linuxppc-2.6.10_rc1/arch/ppc64/kernel/lparcfg.c	2004-11-10 10:38:47.904484533 -0600
@@ -34,7 +34,7 @@
 #include
 #include

-#define MODULE_VERS "1.4"
+#define MODULE_VERS "1.5"
 #define MODULE_NAME "lparcfg"

 /* #define LPARCFG_DEBUG */
@@ -70,6 +70,28 @@

 #ifdef CONFIG_PPC_ISERIES

+/*
+ * For iSeries legacy systems, the PPA purr function is available from the
+ * xEmulatedTimeBase field in the paca.
+ */
+static unsigned long get_purr(void)
+{
+	unsigned long sum_purr = 0;
+	int cpu;
+	struct paca_struct *lpaca;
+
+	for_each_cpu(cpu) {
+		lpaca = paca + cpu;
+		sum_purr += lpaca->xLpPaca.xEmulatedTimeBase;
+
+#ifdef PURR_DEBUG
+		printk(KERN_INFO "get_purr for cpu (%d) has value (%ld) \n",
+			cpu, lpaca->xLpPaca.xEmulatedTimeBase);
+#endif
+	}
+	return sum_purr;
+}
+
 #define lparcfg_write NULL

 /*
@@ -81,6 +103,9 @@
 	int shared, entitled_capacity, max_entitled_capacity;
 	int processors, max_processors;
 	struct paca_struct *lpaca = get_paca();
+	unsigned long purr = get_purr();
+
+	seq_printf(m, "%s %s \n", MODULE_NAME, MODULE_VERS);

 	shared = (int)(lpaca->lppaca_ptr->xSharedProc);
 	seq_printf(m, "serial_number=%c%c%c%c%c%c%c\n",
@@ -131,6 +156,7 @@
 		seq_printf(m, "pool_capacity=%d\n",
 			   (int)(HvLpConfig_getNumProcsInSharedPool(pool_id) *
 				 100));
+		seq_printf(m, "purr=%ld\n", purr);
 	}

 	seq_printf(m, "shared_processor_mode=%d\n", shared);

From benh at kernel.crashing.org Fri Nov 12 08:42:41 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 12 Nov 2004 08:42:41 +1100
Subject: Booting Imac G5 : success !
In-Reply-To: <1100183508.8346.5849.camel@rapid>
References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid>
Message-ID: <1100209361.16927.8.camel@gaston>

On Thu, 2004-11-11 at 15:31 +0100, J. Mayer wrote:
> Hi,
>
> I got a hack to fix the SATA problem on Imac G5. I was right thinking we
> were missing IRQs. Here's what I think about it:
> the SATA driver tries to use IRQ 0, which is not the good one.
> When I look into the OF device tree, I can see this:
> # hexdump /proc/device-tree/ht\@0\,f2000000/pci\@3/k2-sata-root\@c/k2-sata\@0/interrupts
> 0000000 0000 0000
> 0000004
> # hexdump /proc/device-tree/ht\@0\,f2000000/pci\@3/k2-sata-root\@c/interrupts
> 0000000 0000 000a 0000 0001
> 0000008
>
> It seems that the driver uses the first (bad) one so I added a test:
> if (pdev->irq == 0) pdev->irq = 0xA;
> in drivers/scsi/sata_svw.c
> then SATA works properly.
> Of course, this is just a hack, I think the real patch is to be in the
> OF tree parse, not to keep invalid IRQ definitions.
>
> You may know better than me the proper place and way to fix this issue
> (generic OF tree routines, SATA driver ?).

No, the probing code doesn't use what is in k2-sata, but what is in
k2-sata-root, which is the real PCI device. I don't know why it wouldn't
have parsed correctly, I suspect a problem with the PCI<->OF node
matching, you can check what's going on in arch/ppc64/kernel/pci.c

int pci_read_irq_line(struct pci_dev *pci_dev)

Where it reads the IRQ line from OF and puts it in the pci_dev struct.

(Note that SATA used to be on IRQ 0 on previous G5s)

Ben.

From anton at samba.org Fri Nov 12 13:00:38 2004
From: anton at samba.org (Anton Blanchard)
Date: Fri, 12 Nov 2004 13:00:38 +1100
Subject: ppc64 NUMA code needs an IQ injection
Message-ID: <20041112020038.GB20769@krispykreme.ozlabs.ibm.com>

Hi,

I was shown a box containing 128GB RAM of which only 32GB was visible
inside Linux. The dmesg shows:

memory region 0 to 10000000 maps to domain 0
memory region 10000000 to 20000000 maps to domain 0
memory region 20000000 to 30000000 maps to domain 1
memory region 30000000 to 40000000 maps to domain 2
Hole in node, disabling region start 40000000 length 10000000
...
Hole in node, disabling region start 1800000000 length 10000000
memory region 1810000000 to 1820000000 maps to domain 3

The device tree highlights the strange behaviour.
The NUMA domain is the last number: ./memory at 0/ibm,associativity 00000003 00000000 00000000 00000000 ./memory at 10000000/ibm,associativity 00000003 00000000 00000000 00000000 ./memory at 20000000/ibm,associativity 00000003 00000000 00000000 00000001 ./memory at 30000000/ibm,associativity 00000003 00000000 00000000 00000002 ./memory at 40000000/ibm,associativity 00000003 00000000 00000000 00000000 ./memory at 50000000/ibm,associativity 00000003 00000000 00000000 00000000 Firmware thinks there are 2 256MB regions on node 1 and node 2. This turned out to be a firmware bug, however its a bit rude to disable 3/4 of the machines memory because of it. Our numa code expects each node to be contiguous and to appear in order, and gets upset with the above layout. Anton From david at gibson.dropbear.id.au Fri Nov 12 13:29:13 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 12 Nov 2004 13:29:13 +1100 Subject: [TRIVIAL] ppc64: Kill unused KRANGE_{START,END} macros Message-ID: <20041112022913.GE25274@zax> Andrew, please apply: Remove KRANGE_{START,END} macros from ppc64 code. These were not used anywhere. Further KRANGE_END was misleading, since it implied a limit on the linear mapping range based on the pagetable structure, whereas in fact the linear mapping does not use a (Linux) pagetable at all. Index: working-2.6/include/asm-ppc64/pgtable.h =================================================================== --- working-2.6.orig/include/asm-ppc64/pgtable.h 2004-10-29 13:17:44.000000000 +1000 +++ working-2.6/include/asm-ppc64/pgtable.h 2004-11-12 13:20:42.941952600 +1100 @@ -67,12 +67,6 @@ #define IMALLOC_END (IMALLOC_BASE + PGTABLE_EA_MASK) /* - * Define the address range mapped virt <-> physical - */ -#define KRANGE_START KERNELBASE -#define KRANGE_END (KRANGE_START + PGTABLE_EA_MASK) - -/* * Define the user address range */ #define USER_START (0UL) -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist. 
NOT _the_ _other_ _way_ | _around_! http://www.ozlabs.org/people/dgibson From l_indien at magic.fr Sat Nov 13 02:53:07 2004 From: l_indien at magic.fr (J. Mayer) Date: Fri, 12 Nov 2004 16:53:07 +0100 Subject: Booting Imac G5 : success ! In-Reply-To: <1100209361.16927.8.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> Message-ID: <1100274787.9674.1703.camel@rapid> On Thu, 2004-11-11 at 22:42, Benjamin Herrenschmidt wrote: > On Thu, 2004-11-11 at 15:31 +0100, J. Mayer wrote: > > Hi, > > > > I got a hack to fix the SATA problem on Imac G5. I was right thinking we > > were missing IRQs. [...] > > It seems that the driver uses the first (bad) one so I added a test: > > if (pdev->irq == 0) pdev->irq = 0xA; > > in drivers/scsi/sata_svw.c > > then SATA works properly. > > Of course, this is just a hack, I think the real patch is to be in the > > OF tree parse, not to keep invalid IRQ definitions. > > > > You may know better than me the proper place and way to fix this issue > > (generic OF tree routines, SATA driver ?). > > No, the probing code doesn't use what is in k2-sata, but what is in > k2-sata-root, which is the real PCI device. I don't know why it wouldn't > have parsed correctly, I suspect a problem with the PCI<->OF node > matching, you can check what's going on in arch/ppc64/kernel/pci.c > > int pci_read_irq_line(struct pci_dev *pci_dev) > > Where it reads the IRQ line from OF and puts it in the pci_dev struct. OK, you're right. The problem is that in the PCI configuration space of the SATA root, there's no IRQ line defined. I can see we're in the same case for the IDE controller, and I saw this code: drivers/ide/ppc/pmac.c:pmac_ide_pci_attach np = pci_device_to_OF_node(pdev); ... 
if (np->n_intrs == 0) pmif->irq = pdev->irq; else pmif->irq = np->intrs[0].line; Then I did the same in the sata driver and I got it working. Do you think this is an acceptable fix ? Regards. -- J. Mayer Never organized From sonny at burdell.org Sat Nov 13 08:59:52 2004 From: sonny at burdell.org (Sonny Rao) Date: Fri, 12 Nov 2004 16:59:52 -0500 Subject: ppc64 NUMA code needs an IQ injection In-Reply-To: <20041112020038.GB20769@krispykreme.ozlabs.ibm.com> References: <20041112020038.GB20769@krispykreme.ozlabs.ibm.com> Message-ID: <20041112215952.GA28253@kevlar.burdell.org> On Fri, Nov 12, 2004 at 01:00:38PM +1100, Anton Blanchard wrote: > > Hi, > > I was shown a box containing 128GB RAM of which only 32GB was visible > inside Linux. The dmesg shows: > > memory region 0 to 10000000 maps to domain 0 > memory region 10000000 to 20000000 maps to domain 0 > memory region 20000000 to 30000000 maps to domain 1 > memory region 30000000 to 40000000 maps to domain 2 > Hole in node, disabling region start 40000000 length 10000000 > ... > Hole in node, disabling region start 1800000000 length 10000000 > memory region 1810000000 to 1820000000 maps to domain 3 > > The device tree highlights the strange behaviour. The NUMA domain is the > last number: > > ./memory at 0/ibm,associativity > 00000003 00000000 00000000 00000000 > ./memory at 10000000/ibm,associativity > 00000003 00000000 00000000 00000000 > ./memory at 20000000/ibm,associativity > 00000003 00000000 00000000 00000001 > ./memory at 30000000/ibm,associativity > 00000003 00000000 00000000 00000002 > ./memory at 40000000/ibm,associativity > 00000003 00000000 00000000 00000000 > ./memory at 50000000/ibm,associativity > 00000003 00000000 00000000 00000000 > > Firmware thinks there are 2 256MB regions on node 1 and node 2. This > turned out to be a firmware bug, however its a bit rude to disable 3/4 > of the machines memory because of it. 
> > Our numa code expects each node to be contiguous and to appear in order, > and gets upset with the above layout. I've seen this on a Power5 machine as well. It was an 8-way 570 with 128GB of RAM running a partition with all 8 CPUs and RAM, then I shut that partition down and activated a partition with only 4 CPUs and 64GB of RAM, but Linux only gave me 32GB. I also saw similar messages with "Hole in node, disabling region." I believe it was running the latest firmware as well (0434E ?), has it been fixed in a newer version? Sonny From benh at kernel.crashing.org Sat Nov 13 12:14:17 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 13 Nov 2004 12:14:17 +1100 Subject: Booting Imac G5 : success ! In-Reply-To: <1100274787.9674.1703.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> Message-ID: <1100308457.20512.79.camel@gaston> > OK, you're right. The problem is that in the PCI configuration space of > the SATA root, there's no IRQ line defined. I can see we're in the same > case for the IDE controller, and I saw this code: > drivers/ide/ppc/pmac.c:pmac_ide_pci_attach > np = pci_device_to_OF_node(pdev); > ... > if (np->n_intrs == 0) > pmif->irq = pdev->irq; > else > pmif->irq = np->intrs[0].line; > Then I did the same in the sata driver and I got it working. Do you > think this is an acceptable fix ? Wait wait, what is the problem exactly ? Is the device lacking the INTERRUPT_PIN in the config space ? Or is pci_device_to_OF_node() returning NULL, or is node->n_intrs == 0 ? In the latter two cases, that means there is something wrong with the parsing of the OF interrupt tree in prom.c and that has to be fixed.
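To make the fallback quoted above concrete, here is a minimal sketch in plain C. It is not kernel code: `mock_of_node`, `mock_pci_dev` and `pick_irq` are hypothetical stand-ins that model only the fields named in the thread (`n_intrs`, `intrs[0].line`, `irq`), and the NULL check on the OF node is an addition that the quoted pmac IDE snippet does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * mentioned in the thread are modelled. */
struct mock_intr {
    int line;
};

struct mock_of_node {
    int n_intrs;                /* number of parsed OF interrupts */
    struct mock_intr intrs[4];
};

struct mock_pci_dev {
    int irq;                    /* value left by the PCI IRQ probing */
    struct mock_of_node *np;    /* PCI<->OF match result, may be NULL */
};

enum irq_source {
    IRQ_FROM_OF,                /* taken from the OF interrupt tree */
    IRQ_FROM_PCI                /* fell back to the pci_dev value */
};

/* Prefer the first OF interrupt, otherwise keep what the PCI code found. */
static int pick_irq(const struct mock_pci_dev *pdev, enum irq_source *src)
{
    assert(pdev != NULL && src != NULL);

    if (pdev->np != NULL && pdev->np->n_intrs > 0) {
        *src = IRQ_FROM_OF;
        return pdev->np->intrs[0].line;
    }
    *src = IRQ_FROM_PCI;
    return pdev->irq;
}
```

With a device whose `irq` is 0 but whose OF node carries one interrupt with line 0xA, `pick_irq()` returns 0xA and reports `IRQ_FROM_OF`. With no matching OF node, or with `n_intrs == 0`, it returns the `pci_dev` value unchanged, which is the situation where the OF parsing itself would need a look.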
Once you have booted with your hack fix, can you send me a tarball of /proc/device-tree please ? Ben. From benh at kernel.crashing.org Sat Nov 13 13:03:58 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 13 Nov 2004 13:03:58 +1100 Subject: Booting Imac G5 : success ! In-Reply-To: <1100311368.16435.22.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> Message-ID: <1100311438.20592.106.camel@gaston> > As I said, we're lacking an IRQ pin in the PCI configuration space of > the SATA controler, not in the OF tree (as you can see in the dump I > previously sent). (No, you wrote that we're lacking an IRQ _line_ which is different :) Anyway, the correct fix is to do a quirk I suppose, that "corrects" the IRQ pin register. Is any other device having the same bug ? Note that we could also probably just ignore the IRQ PIN in the PCI header, I have to check if it would have an impact on pSeries tho. > So, pci_read_irq_line returns 0 at the first test: > if (intpin == 0) { > PPCDBG(PPCDBG_BUSWALK,"\tDevice: %s No Interrupt used by device.\n", > pci_name(pci_dev)); > return 0; > } > > So, taking it from the OF tree instead of PCI space, the same way it's > done for IDE, solves the problem. > > > Once you have booted with your hack fix, can you send me a tarball > > of /proc/device-tree please ? > > Here you are... I also join a tarball of /proc/bus/pci so you can see > all PCI devices configuration space dumps. > -- Benjamin Herrenschmidt From l_indien at magic.fr Sat Nov 13 13:22:31 2004 From: l_indien at magic.fr (J. 
Mayer) Date: Sat, 13 Nov 2004 03:22:31 +0100 Subject: Booting Imac G5 : success ! In-Reply-To: <1100311438.20592.106.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> Message-ID: <1100312551.16435.37.camel@rapid> On Sat, 2004-11-13 at 03:03, Benjamin Herrenschmidt wrote: > > As I said, we're lacking an IRQ pin in the PCI configuration space of > > the SATA controler, not in the OF tree (as you can see in the dump I > > previously sent). > > (No, you wrote that we're lacking an IRQ _line_ which is different :) Ooops, you're right, sorry. > Anyway, the correct fix is to do a quirk I suppose, that "corrects" the > IRQ pin register. Is any other device having the same bug ? The only other device I can see that should have an IRQ pin set and have not is the IDE controler. That's why I did copy the code which gets the IRQ number from the IDE driver code, as I could see the IDE controler driver did work ! I'll check on my Ibook if I can see such a bug on IDE. -- J. Mayer Never organized From benh at kernel.crashing.org Sat Nov 13 13:21:57 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 13 Nov 2004 13:21:57 +1100 Subject: Booting Imac G5 : success ! 
In-Reply-To: <1100312551.16435.37.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> Message-ID: <1100312517.20592.109.camel@gaston> On Sat, 2004-11-13 at 03:22 +0100, J. Mayer wrote: > The only other device I can see that should have an IRQ pin set and have > not is the IDE controler. > That's why I did copy the code which gets the IRQ number from the IDE > driver code, as I could see the IDE controler driver did work ! > I'll check on my Ibook if I can see such a bug on IDE. I suppose the right fix then is to ignore the IRQ pin on pmac. Just copy pci_read_irq_line() to pmac_pci.c, call it pmac_pci_read_irq_line and remove the bit that reads the IRQ PIN .. (and remove the PPCDBG ugly macros too while you are at it :) Ben. From l_indien at magic.fr Sat Nov 13 13:02:48 2004 From: l_indien at magic.fr (J. Mayer) Date: Sat, 13 Nov 2004 03:02:48 +0100 Subject: Booting Imac G5 : success ! In-Reply-To: <1100308457.20512.79.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> Message-ID: <1100311368.16435.22.camel@rapid> On Sat, 2004-11-13 at 02:14, Benjamin Herrenschmidt wrote: > > OK, you're right. 
The problem is that in the PCI configuration space of > > the SATA root, there's no IRQ line defined. I can see we're in the same > > case for the IDE controller, and I saw this code: > > drivers/ide/ppc/pmac.c:pmac_ide_pci_attach > > np = pci_device_to_OF_node(pdev); > > ... > > if (np->n_intrs == 0) > > pmif->irq = pdev->irq; > > else > > pmif->irq = np->intrs[0].line; > > Then I did the same in the sata driver and I got it working. Do you > > think this is an acceptable fix ? > > Wait wait, what is the problem exactly ? is the device lacking the > INTERRUPT_PIN in the config space ? or is pci_device_to_OF_node() > returning NULL or is node->n_intrs == 0 ? > > In the 2 later cases, that means there is something wrong with the > parsing of the OF interrupt tree in prom.c and that has to be fixed. As I said, we're lacking an IRQ pin in the PCI configuration space of the SATA controler, not in the OF tree (as you can see in the dump I previously sent). So, pci_read_irq_line returns 0 at the first test: if (intpin == 0) { PPCDBG(PPCDBG_BUSWALK,"\tDevice: %s No Interrupt used by device.\n", pci_name(pci_dev)); return 0; } So, taking it from the OF tree instead of PCI space, the same way it's done for IDE, solves the problem. > Once you have booted with your hack fix, can you send me a tarball > of /proc/device-tree please ? Here you are... I also join a tarball of /proc/bus/pci so you can see all PCI devices configuration space dumps. -- J. Mayer Never organized -------------- next part -------------- A non-text attachment was scrubbed... Name: device-tree.tgz Type: application/x-compressed-tar Size: 38978 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041113/307b798c/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: pci_bus.tgz Type: application/x-compressed-tar Size: 1835 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041113/307b798c/attachment-0001.bin From ivan at vmfacility.fr Sun Nov 14 09:21:13 2004 From: ivan at vmfacility.fr (Ivan Warren) Date: Sat, 13 Nov 2004 23:21:13 +0100 Subject: Issue with ppc64/vibmscsi Message-ID: Folks, I am running into the following problem : I have started experimenting running linux ppc64 on a newly acquired IBM 9111-520 (p520). I am attempting to run a linux kernel (2.6.9) in a partition. All the devices are virtual. I (shamelessly) used the debian ppc d-i installer... It wouldn't complete, but went far enough to have a usable root filesystem. So I installed yaboot, did the ybin, etc.. so I could boot from the disk.. The kernel is cross compiled (on a ia32 system)... Now.. My problem starts when I attempt to do some heavy I/O operations.. (namely debian's apt-get something which I believe to do heavy I/O using db).. At this point, I start getting heavy I/O errors - to a point where the root fs is remounted read-only.. The virt scsi client adapter is then made disabled (all further I/O fail). the virtual I/O server shows this : LABEL: CLIENT_FAILURE IDENTIFIER: 37DDE80C Date/Time: Sat Nov 13 13:07:51 CST 2004 Sequence Number: 54 Machine Id: 00C1721E4C00 Node Id: vios1 Class: S Type: TEMP Resource Name: vhost3 Description Misbehaved Virtual SCSI Client Probable Causes Bad IU, or SRP Violation Failure Causes Bad IU, or SRP Violation Recommended Actions Remove Virtual SCSI Client, then Configure the same instance Detail Data Module RC Location Data srp_parse_descriptor_lis 0000000000000002 00000006 C00000000126B3C0 2E000 And the console shows : ibmvscsi: Virtual adapter failed! 
SCSI error : <0 0 1 0> return code = 0x70000 end_request: I/O error, dev sda, sector 13438632 SCSI error : <0 0 1 0> return code = 0x70000 end_request: I/O error, dev sda, sector 13438640 SCSI error : <0 0 1 0> return code = 0x70000 .. ad libidum ... I added a few printk to the srp/rdma driver and I get this : (notes in () are hand edited comments) (Note : This is the srp_event_struct iu field dump) Sending IU : 02000000 00010000 00000000 00000000 00000000 81000000 00000000 00000000 280000CD 0EA00000 08000000 00000000 00000000 02050000 00000000 00001000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (note : this is the CRQ request for the above SRP block) rpa_scsi : CRQ_SEND : CRQ = 8001000000000100 - 4300 (failing SRP) Sending IU : 02000000 00020002 00000000 00000000 00000000 81000000 00000000 00000000 280000CD 0EA80001 70000000 00000000 00000000 00004444 00000000 00000020 0002E000 00000000 02052000 00000000 0000E000 00000000 0C000000 00000000 00020000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 (failing CRQ) rpa_scsi : CRQ_SEND : CRQ = 8001000000000100 - 4400 ibmvscsi: Virtual adapter failed! SCSI error : <0 0 1 0> return code = 0x70000 end_request: I/O error, dev sda, sector 13438632 ... Basically, I cannot see anything wrong with the last failing request... 
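As a sanity check on the failing request, the CDB bytes in the second IU dump can be decoded by hand. The sketch below assumes that the ten bytes starting at `28` in that dump are a standard READ(10) CDB (opcode 0x28, big-endian 32-bit LBA in bytes 2..5, big-endian 16-bit transfer length in bytes 7..8) and that blocks are 512 bytes. The helper names `read10_lba` and `read10_blocks` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* READ(10) operation code. */
#define READ_10 0x28

/* CDB bytes copied from the failing IU dump: 280000CD 0EA80001 7000... */
static const uint8_t failing_cdb[10] = {
    0x28, 0x00, 0x00, 0xCD, 0x0E, 0xA8, 0x00, 0x01, 0x70, 0x00
};

/* Big-endian 32-bit logical block address, CDB bytes 2..5. */
static uint32_t read10_lba(const uint8_t *cdb)
{
    assert(cdb != NULL && cdb[0] == READ_10);
    return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
           ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
}

/* Big-endian 16-bit transfer length in blocks, CDB bytes 7..8. */
static uint16_t read10_blocks(const uint8_t *cdb)
{
    assert(cdb != NULL && cdb[0] == READ_10);
    return (uint16_t)(((unsigned)cdb[7] << 8) | cdb[8]);
}
```

For the bytes above, `read10_lba()` gives 0xCD0EA8 = 13438632, the sector reported in the first end_request error, and `read10_blocks()` gives 0x170 = 368 blocks. At 512 bytes per block that is 0x2E000 bytes, the same value that shows up in the IU after the CDB and in the VIOS detail data.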
(SRP Request type 02 : SRP_TYPE_CMD, data in format 2 (indirect) - 2 data in descriptors) - and some of the CDB fields I recognize : SCSI Command code 28 and LBA CD0EA8 (which matches sector 13438632 indicated afterwards..).. The rest is way too obscure for me.. This problem is *almost* always reproducible (~90% of the time - occurs when attempting the same operation).. I attempted deleting/recreating the virtual device, changed the size, to no avail.. Questions : - Is this *really* a misbehaving client - or - a buggy server (VIOS at 1.1.20, p520 FW at SF220_51)? - In the latter case, how do I report this to IBM (knowing roll-your-own kernels are probably not supported).. - If this is a misbehaving client, what extra information is needed (knowing that my SRP, SCSI, VSCSI knowledge is somewhat limited) ? Thanks, --Ivan From anton at samba.org Sun Nov 14 10:37:02 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 10:37:02 +1100 Subject: [PATCH] ppc64: ratelimit some rtas errors Message-ID: <20041113233702.GB16377@krispykreme.ozlabs.ibm.com> Use printk_ratelimit() in rtc code to avoid flooding the kernel log buffer with errors. Also use rtas_get_error_log_max() instead of duplicating it in __fetch_rtas_last_error. Signed-off-by: Anton Blanchard ===== rtas.c 1.45 vs edited ===== --- 1.45/arch/ppc64/kernel/rtas.c 2004-10-26 02:20:03 +10:00 +++ edited/rtas.c 2004-11-14 10:12:54 +11:00 @@ -102,6 +102,27 @@ return tokp ? *tokp : RTAS_UNKNOWN_SERVICE; } +/* + * Return the firmware-specified size of the error log buffer + * for all rtas calls that require an error buffer argument. + * This includes 'check-exception' and 'rtas-last-error'.
+ */ +int rtas_get_error_log_max(void) +{ + static int rtas_error_log_max; + if (rtas_error_log_max) + return rtas_error_log_max; + + rtas_error_log_max = rtas_token ("rtas-error-log-max"); + if ((rtas_error_log_max == RTAS_UNKNOWN_SERVICE) || + (rtas_error_log_max > RTAS_ERROR_LOG_MAX)) { + printk (KERN_WARNING "RTAS: bad log buffer size %d\n", rtas_error_log_max); + rtas_error_log_max = RTAS_ERROR_LOG_MAX; + } + return rtas_error_log_max; +} + + /** Return a copy of the detailed error text associated with the * most recent failed call to rtas. Because the error text * might go stale if there are any other intervening rtas calls, @@ -114,12 +135,7 @@ struct rtas_args err_args, save_args; u32 bufsz; - bufsz = rtas_token ("rtas-error-log-max"); - if ((bufsz == RTAS_UNKNOWN_SERVICE) || - (bufsz > RTAS_ERROR_LOG_MAX)) { - printk (KERN_WARNING "RTAS: bad log buffer size %d\n", bufsz); - bufsz = RTAS_ERROR_LOG_MAX; - } + bufsz = rtas_get_error_log_max(); err_args.token = rtas_token("rtas-last-error"); err_args.nargs = 2; @@ -539,26 +555,6 @@ enter_rtas(__pa(rtas_args)); panic("Alas, I survived.\n"); -} - -/* - * Return the firmware-specified size of the error log buffer - * for all rtas calls that require an error buffer argument. - * This includes 'check-exception' and 'rtas-last-error'. 
- */ -int rtas_get_error_log_max(void) -{ - static int rtas_error_log_max; - if (rtas_error_log_max) - return rtas_error_log_max; - - rtas_error_log_max = rtas_token ("rtas-error-log-max"); - if ((rtas_error_log_max == RTAS_UNKNOWN_SERVICE) || - (rtas_error_log_max > RTAS_ERROR_LOG_MAX)) { - printk (KERN_WARNING "RTAS: bad log buffer size %d\n", rtas_error_log_max); - rtas_error_log_max = RTAS_ERROR_LOG_MAX; - } - return rtas_error_log_max; } /* ===== rtc.c 1.15 vs edited ===== --- 1.15/arch/ppc64/kernel/rtc.c 2004-09-08 16:32:57 +10:00 +++ edited/rtc.c 2004-11-14 10:05:36 +11:00 @@ -356,7 +356,7 @@ } } while (error == RTAS_CLOCK_BUSY && (__get_tb() < max_wait_tb)); - if (error != 0) { + if (error != 0 && printk_ratelimit()) { printk(KERN_WARNING "error: reading the clock failed (%d)\n", error); return; @@ -384,7 +384,7 @@ do { error = rtas_call(rtas_token("get-time-of-day"), 0, 8, ret); if (error == RTAS_CLOCK_BUSY || rtas_is_extended_busy(error)) { - if (in_interrupt()) { + if (in_interrupt() && printk_ratelimit()) { printk(KERN_WARNING "error: reading clock would delay interrupt\n"); return; /* delay not allowed */ } @@ -395,7 +395,7 @@ } } while (error == RTAS_CLOCK_BUSY && (__get_tb() < max_wait_tb)); - if (error != 0) { + if (error != 0 && printk_ratelimit()) { printk(KERN_WARNING "error: reading the clock failed (%d)\n", error); return; @@ -430,7 +430,7 @@ } } while (error == RTAS_CLOCK_BUSY && (__get_tb() < max_wait_tb)); - if (error != 0) + if (error != 0 && printk_ratelimit()) printk(KERN_WARNING "error: setting the clock failed (%d)\n", error); From anton at samba.org Sun Nov 14 10:48:06 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 10:48:06 +1100 Subject: [PATCH] ppc64: Use pci_device_to_OF_node Message-ID: <20041113234806.GC16377@krispykreme.ozlabs.ibm.com> PCI_GET_DN() doesnt check to see if ->sysdata has been initialised correctly - we should instead use pci_device_to_OF_node. 
Leave PCI_GET_DN() in the one performance critical case (iommu table lookup in pci DMA functions). In this case ->sysdata is guaranteed to have been initialised by the iommu setup code. Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pSeries_iommu.c~remove_PCI_GET_DN arch/ppc64/kernel/pSeries_iommu.c --- foobar2/arch/ppc64/kernel/pSeries_iommu.c~remove_PCI_GET_DN 2004-11-13 10:08:48.224833361 +1100 +++ foobar2-anton/arch/ppc64/kernel/pSeries_iommu.c 2004-11-13 10:24:20.385047282 +1100 @@ -290,7 +290,11 @@ static void iommu_buses_init_lpar(struct for (ln=bus_list->next; ln != bus_list; ln=ln->next) { bus = pci_bus_b(ln); - busdn = PCI_GET_DN(bus); + + if (bus->self) + busdn = pci_device_to_OF_node(bus->self); + else + busdn = bus->sysdata; /* must be a phb */ dma_window = (unsigned int *)get_property(busdn, "ibm,dma-window", NULL); if (dma_window) { @@ -423,7 +427,7 @@ void iommu_setup_pSeries(void) * up the device tree to find it. */ for_each_pci_dev(dev) { - mydn = dn = PCI_GET_DN(dev); + mydn = dn = pci_device_to_OF_node(dev); while (dn && dn->iommu_table == NULL) dn = dn->parent; diff -puN include/asm-ppc64/pci-bridge.h~remove_PCI_GET_DN include/asm-ppc64/pci-bridge.h --- foobar2/include/asm-ppc64/pci-bridge.h~remove_PCI_GET_DN 2004-11-13 10:18:33.520682245 +1100 +++ foobar2-anton/include/asm-ppc64/pci-bridge.h 2004-11-13 10:18:52.171300722 +1100 @@ -84,11 +84,6 @@ extern void pci_process_bridge_OF_ranges extern int pcibios_remove_root_bus(struct pci_controller *phb); -/* Use this macro after the PCI bus walk for max performance when it - * is known that sysdata is correct. 
- */ -#define PCI_GET_DN(dev) ((struct device_node *)((dev)->sysdata)) - extern void phbs_remap_io(void); static inline struct pci_controller *pci_bus_to_host(struct pci_bus *bus) diff -puN arch/ppc64/kernel/pci_iommu.c~remove_PCI_GET_DN arch/ppc64/kernel/pci_iommu.c --- foobar2/arch/ppc64/kernel/pci_iommu.c~remove_PCI_GET_DN 2004-11-13 10:19:05.203239344 +1100 +++ foobar2-anton/arch/ppc64/kernel/pci_iommu.c 2004-11-13 10:20:53.108170239 +1100 @@ -43,6 +43,13 @@ #include #endif /* CONFIG_PPC_ISERIES */ +/* + * We can use ->sysdata directly and avoid the extra work in + * pci_device_to_OF_node since ->sysdata will have been initialised + * in the iommu init code for all devices. + */ +#define PCI_GET_DN(dev) ((struct device_node *)((dev)->sysdata)) + static inline struct iommu_table *devnode_table(struct pci_dev *dev) { if (!dev) _ From anton at samba.org Sun Nov 14 11:10:41 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 11:10:41 +1100 Subject: [PATCH] suggested kexec API changes Message-ID: <20041114001041.GD16377@krispykreme.ozlabs.ibm.com> Hi, Milton, Todd and I have been working our way through the kexec API and have a few suggested changes. Patches to follow. One thing I havent done yet is to make secondary cpus spin on the real cpu id - the wrapper does not necessarily know how the kernel is going to do the logical to real translation. Anton From anton at samba.org Sun Nov 14 11:12:36 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 11:12:36 +1100 Subject: [PATCH] suggested kexec API changes In-Reply-To: <20041114001041.GD16377@krispykreme.ozlabs.ibm.com> References: <20041114001041.GD16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114001236.GE16377@krispykreme.ozlabs.ibm.com> Move the linux,rtas* properties into the /rtas node and make them 32bit. Use rtas-size and avoid duplicating it in linux,rtas-size. 
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/prom_init.c~fix_rtas_of arch/ppc64/kernel/prom_init.c --- foobar2/arch/ppc64/kernel/prom_init.c~fix_rtas_of 2004-11-13 09:24:42.030565821 +1100 +++ foobar2-anton/arch/ppc64/kernel/prom_init.c 2004-11-13 09:59:48.610764091 +1100 @@ -701,9 +701,9 @@ static void __init prom_instantiate_rtas { unsigned long offset = reloc_offset(); struct prom_t *_prom = PTRRELOC(&prom); - phandle prom_rtas; - u64 base, entry = 0; - u32 size = 0; + phandle prom_rtas, rtas_node; + u32 base, entry = 0; + u32 size = 0; prom_debug("prom_instantiate_rtas: start...\n"); @@ -723,12 +723,12 @@ static void __init prom_instantiate_rtas } prom_printf("instantiating rtas at 0x%x", base); - prom_rtas = call_prom("open", 1, 1, ADDR("/rtas")); + rtas_node = call_prom("open", 1, 1, ADDR("/rtas")); prom_printf("..."); if (call_prom("call-method", 3, 2, ADDR("instantiate-rtas"), - prom_rtas, base) != PROM_ERROR) { + rtas_node, base) != PROM_ERROR) { entry = (long)_prom->args.rets[1]; } if (entry == 0) { @@ -739,9 +739,8 @@ static void __init prom_instantiate_rtas reserve_mem(base, size); - prom_setprop(_prom->chosen, "linux,rtas-base", &base, sizeof(base)); - prom_setprop(_prom->chosen, "linux,rtas-entry", &entry, sizeof(entry)); - prom_setprop(_prom->chosen, "linux,rtas-size", &size, sizeof(size)); + prom_setprop(prom_rtas, "linux,rtas-base", &base, sizeof(base)); + prom_setprop(prom_rtas, "linux,rtas-entry", &entry, sizeof(entry)); prom_debug("rtas base = 0x%x\n", base); prom_debug("rtas entry = 0x%x\n", entry); diff -puN arch/ppc64/kernel/rtas.c~fix_rtas_of arch/ppc64/kernel/rtas.c --- foobar2/arch/ppc64/kernel/rtas.c~fix_rtas_of 2004-11-13 09:24:42.035565480 +1100 +++ foobar2-anton/arch/ppc64/kernel/rtas.c 2004-11-13 09:59:23.997753829 +1100 @@ -573,15 +573,15 @@ void __init rtas_initialize(void) */ rtas.dev = of_find_node_by_name(NULL, "rtas"); if (rtas.dev) { - u64 *basep, *entryp; + u32 *basep, *entryp; u32 *sizep; - basep = (u64 
*)get_property(of_chosen, "linux,rtas-base", NULL); - sizep = (u32 *)get_property(of_chosen, "linux,rtas-size", NULL); + basep = (u32 *)get_property(rtas.dev, "linux,rtas-base", NULL); + sizep = (u32 *)get_property(rtas.dev, "rtas-size", NULL); if (basep != NULL && sizep != NULL) { rtas.base = *basep; rtas.size = *sizep; - entryp = (u64 *)get_property(of_chosen, "linux,rtas-entry", NULL); + entryp = (u32 *)get_property(rtas.dev, "linux,rtas-entry", NULL); if (entryp == NULL) /* Ugh */ rtas.entry = rtas.base; else _ From anton at samba.org Sun Nov 14 11:17:52 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 11:17:52 +1100 Subject: [PATCH] suggested kexec API changes In-Reply-To: <20041114001615.GF16377@krispykreme.ozlabs.ibm.com> References: <20041114001041.GD16377@krispykreme.ozlabs.ibm.com> <20041114001236.GE16377@krispykreme.ozlabs.ibm.com> <20041114001615.GF16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114001752.GG16377@krispykreme.ozlabs.ibm.com> Reserve the kernel memory (0 - klimit) in the kernel instead of the wrapper. Remove an old comment that incorrectly referred to klimit. 
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/prom.c~dont_reserve_klimit_in_wrapper arch/ppc64/kernel/prom.c --- foobar2/arch/ppc64/kernel/prom.c~dont_reserve_klimit_in_wrapper 2004-11-13 10:49:17.563463329 +1100 +++ foobar2-anton/arch/ppc64/kernel/prom.c 2004-11-13 10:49:44.965783856 +1100 @@ -1019,6 +1019,7 @@ void __init early_init_devtree(void *par /* Scan memory nodes and rebuild LMBs */ lmb_init(); + lmb_reserve(0, __pa(klimit)); scan_flat_dt(early_init_dt_scan_root, NULL); scan_flat_dt(early_init_dt_scan_memory, NULL); lmb_analyze(); diff -puN arch/ppc64/kernel/prom_init.c~dont_reserve_klimit_in_wrapper arch/ppc64/kernel/prom_init.c --- foobar2/arch/ppc64/kernel/prom_init.c~dont_reserve_klimit_in_wrapper 2004-11-13 10:49:17.568462987 +1100 +++ foobar2-anton/arch/ppc64/kernel/prom_init.c 2004-11-13 10:49:17.601460735 +1100 @@ -1603,11 +1603,6 @@ unsigned long __init prom_init(unsigned prom_debug("offset=0x%x\n", offset); /* - * Reserve kernel in reserve map - */ - reserve_mem(0, __pa(RELOC(klimit))); - - /* * Check for an initrd */ prom_check_initrd(r3, r4); diff -puN include/asm-ppc64/rtas.h~dont_reserve_klimit_in_wrapper include/asm-ppc64/rtas.h --- foobar2/include/asm-ppc64/rtas.h~dont_reserve_klimit_in_wrapper 2004-11-13 10:49:17.574462578 +1100 +++ foobar2-anton/include/asm-ppc64/rtas.h 2004-11-13 10:49:17.592461349 +1100 @@ -149,7 +149,7 @@ struct rtas_error_log { unsigned long target:4; /* Target of failed operation */ unsigned long type:8; /* General event or error*/ unsigned long extended_log_length:32; /* length in bytes */ - unsigned char buffer[1]; /* allocated by klimit bump */ + unsigned char buffer[1]; }; struct flash_block { _ From anton at samba.org Sun Nov 14 11:16:16 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 11:16:16 +1100 Subject: [PATCH] suggested kexec API changes In-Reply-To: <20041114001236.GE16377@krispykreme.ozlabs.ibm.com> References: <20041114001041.GD16377@krispykreme.ozlabs.ibm.com> 
<20041114001236.GE16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114001615.GF16377@krispykreme.ozlabs.ibm.com> Remove linux,has-tce-table since we can just look for linux,tce-base and linux,tce-size. Make linux,tce-base store real addresses instead of virtual ones, the wrapper may not know the translation the kernel will use. Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pSeries_iommu.c~iommu_real_addr arch/ppc64/kernel/pSeries_iommu.c --- gr_base/arch/ppc64/kernel/pSeries_iommu.c~iommu_real_addr 2004-11-12 04:51:30.405991684 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_iommu.c 2004-11-12 04:51:30.422988913 -0600 @@ -317,19 +317,16 @@ static void iommu_table_setparms(struct node = (struct device_node *)phb->arch_data; - if (get_property(node, "linux,has-tce-table", NULL) == NULL) { - printk(KERN_ERR "PCI_DMA: iommu_table_setparms: %s has no tce table !\n", - dn->full_name); - return; - } basep = (unsigned long *)get_property(node, "linux,tce-base", NULL); sizep = (unsigned int *)get_property(node, "linux,tce-size", NULL); if (basep == NULL || sizep == NULL) { - printk(KERN_ERR "PCI_DMA: iommu_table_setparms: %s has missing tce" - " entries !\n", dn->full_name); + printk(KERN_ERR "PCI_DMA: iommu_table_setparms: %s has " + "missing tce entries !\n", dn->full_name); return; } - memset((void *)(*basep), 0, *sizep); + + tbl->it_base = (unsigned long)__va(*basep); + memset((void *)tbl->it_base, 0, *sizep); tbl->it_busno = phb->bus->number; @@ -353,7 +350,6 @@ static void iommu_table_setparms(struct if (phb->dma_window_base_cur > (1 << 19)) panic("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); - tbl->it_base = *basep; tbl->it_index = 0; tbl->it_entrysize = sizeof(union tce_entry); tbl->it_blocksize = 16; diff -puN arch/ppc64/kernel/prom_init.c~iommu_real_addr arch/ppc64/kernel/prom_init.c --- gr_base/arch/ppc64/kernel/prom_init.c~iommu_real_addr 2004-11-12 04:51:30.411990706 -0600 +++ gr_base-anton/arch/ppc64/kernel/prom_init.c 2004-11-12 
04:51:30.426988260 -0600 @@ -760,7 +760,7 @@ static void __init prom_initialize_tce_t unsigned long offset = reloc_offset(); char compatible[64], type[64], model[64]; char *path = RELOC(prom_scratch); - u64 base, vbase, align; + u64 base, align; u32 minalign, minsize; u64 tce_entry, *tce_entryp; u64 local_alloc_top, local_alloc_bottom; @@ -832,12 +832,9 @@ static void __init prom_initialize_tce_t if (base < local_alloc_bottom) local_alloc_bottom = base; - vbase = (unsigned long)abs_to_virt(base); - /* Save away the TCE table attributes for later use. */ - prom_setprop(node, "linux,tce-base", &vbase, sizeof(vbase)); + prom_setprop(node, "linux,tce-base", &base, sizeof(base)); prom_setprop(node, "linux,tce-size", &minsize, sizeof(minsize)); - prom_setprop(node, "linux,has-tce-table", NULL, 0); /* It seems OF doesn't null-terminate the path :-( */ memset(path, 0, sizeof(path)); _ From anton at samba.org Sun Nov 14 15:13:40 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 15:13:40 +1100 Subject: [PATCH] ppc64: avoid 32bit only syscalls in unistd.h Message-ID: <20041114041340.GH16377@krispykreme.ozlabs.ibm.com> Steve Munroe points out that ppc64 glibc builds stubs for a number of 32bit only syscalls. While none of them exist in the kernel syscall table, their existence in unistd.h means glibc still tries to use them then falls back onto the 64bit safe versions. 
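The effect described above comes from compile-time tests on the __NR_* macros: if the header defines a number, the C library builds the "try it first" path. A minimal sketch of that mechanism follows. SKETCH_NR_stat64 is a hypothetical stand-in for __NR_stat64, and the returned strings only label which path would be compiled; this is not glibc's actual code.

```c
#include <assert.h>
#include <string.h>

/*
 * Assumption: SKETCH_NR_stat64 plays the role of __NR_stat64.
 * Uncomment the define to model a header that still advertises
 * the 32-bit-only call.
 */
/* #define SKETCH_NR_stat64 195 */

/* Reports which code path a library would compile in. */
static inline const char *sketch_stat_strategy(void)
{
#ifdef SKETCH_NR_stat64
	return "try stat64, fall back to stat on ENOSYS";
#else
	return "call stat directly";
#endif
}
```

With the define commented out, as the patch does for the real header, only the direct path is built and the wasted first attempt disappears.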
Signed-off-by: Anton Blanchard ===== include/asm-ppc64/unistd.h 1.35 vs edited ===== --- 1.35/include/asm-ppc64/unistd.h 2004-10-22 19:27:40 +10:00 +++ edited/include/asm-ppc64/unistd.h 2004-11-12 05:25:02 +11:00 @@ -202,19 +202,19 @@ #define __NR_vfork 189 #define __NR_ugetrlimit 190 /* SuS compliant getrlimit */ #define __NR_readahead 191 -#define __NR_mmap2 192 -#define __NR_truncate64 193 -#define __NR_ftruncate64 194 -#define __NR_stat64 195 -#define __NR_lstat64 196 -#define __NR_fstat64 197 +/* #define __NR_mmap2 192 32bit only */ +/* #define __NR_truncate64 193 32bit only */ +/* #define __NR_ftruncate64 194 32bit only */ +/* #define __NR_stat64 195 32bit only */ +/* #define __NR_lstat64 196 32bit only */ +/* #define __NR_fstat64 197 32bit only */ #define __NR_pciconfig_read 198 #define __NR_pciconfig_write 199 #define __NR_pciconfig_iobase 200 #define __NR_multiplexer 201 #define __NR_getdents64 202 #define __NR_pivot_root 203 -#define __NR_fcntl64 204 +/* #define __NR_fcntl64 204 32bit only */ #define __NR_madvise 205 #define __NR_mincore 206 #define __NR_gettid 207 @@ -236,7 +236,7 @@ #define __NR_sched_getaffinity 223 /* 224 currently unused */ #define __NR_tuxcall 225 -#define __NR_sendfile64 226 +/* #define __NR_sendfile64 226 32bit only */ #define __NR_io_setup 227 #define __NR_io_destroy 228 #define __NR_io_getevents 229 @@ -264,7 +264,7 @@ #define __NR_utimes 251 #define __NR_statfs64 252 #define __NR_fstatfs64 253 -#define __NR_fadvise64_64 254 +/* #define __NR_fadvise64_64 254 32bit only */ #define __NR_rtas 255 /* Number 256 is reserved for sys_debug_setcontext */ /* Number 257 is reserved for vserver */ From anton at samba.org Sun Nov 14 15:30:12 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 15:30:12 +1100 Subject: [PATCH] ppc64: reduce ifdef clutter in arch/ppc64/kernel/sysfs.c Message-ID: <20041114043012.GI16377@krispykreme.ozlabs.ibm.com> From: Christoph Hellwig Reduce ifdef clutter in arch/ppc64/kernel/sysfs.c 
Signed-off-by: Anton Blanchard Index: linux-2.5/arch/ppc64/kernel/sysfs.c =================================================================== --- linux-2.5.orig/arch/ppc64/kernel/sysfs.c 2004-11-14 15:14:37.794925342 +1100 +++ linux-2.5/arch/ppc64/kernel/sysfs.c 2004-11-14 15:20:24.829990790 +1100 @@ -17,8 +17,7 @@ /* SMT stuff */ -#ifndef CONFIG_PPC_ISERIES - +#ifdef CONFIG_PPC_MULTIPLATFORM /* default to snooze disabled */ DEFINE_PER_CPU(unsigned long, smt_snooze_delay); @@ -94,19 +93,6 @@ } __setup("smt-snooze-delay=", setup_smt_snooze_delay); -#endif - - -/* PMC stuff */ - -#ifdef CONFIG_PPC_ISERIES -void ppc64_enable_pmcs(void) -{ - /* XXX Implement for iseries */ -} -#endif - -#ifdef CONFIG_PPC_MULTIPLATFORM /* * Enabling PMCs will slow partition context switch times so we only do * it the first time we write to the PMCs. @@ -183,6 +169,14 @@ } #endif /* CONFIG_PPC_PSERIES */ } + +#else + +/* PMC stuff */ +void ppc64_enable_pmcs(void) +{ + /* XXX Implement for iseries */ +} #endif /* CONFIG_PPC_MULTIPLATFORM */ EXPORT_SYMBOL(ppc64_enable_pmcs); From anton at samba.org Sun Nov 14 17:33:01 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 17:33:01 +1100 Subject: [PATCH] ppc64: remove BUG()s in pcibios_fixup_bus In-Reply-To: <20041114062944.GN16377@krispykreme.ozlabs.ibm.com> References: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> <20041114061825.GK16377@krispykreme.ozlabs.ibm.com> <20041114062202.GL16377@krispykreme.ozlabs.ibm.com> <20041114062801.GM16377@krispykreme.ozlabs.ibm.com> <20041114062944.GN16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114063301.GO16377@krispykreme.ozlabs.ibm.com> BUG() on missing IO or memory resources in pcibios_fixup_bus is rude, remove it. Also use list_for_each_entry. 
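As background for the list_for_each_entry conversion: the macro walks the embedded list_head links and hands back the containing structure on each step, so the open-coded pointer chase and the pci_dev_b() call go away. The following is a simplified userspace sketch, not the kernel's definition. The SKETCH_* macros take the entry type explicitly instead of using typeof, and struct sketch_dev is a hypothetical stand-in for struct pci_dev.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its member. */
#define SKETCH_CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified variant: the entry type is passed explicitly. */
#define SKETCH_LIST_FOR_EACH_ENTRY(pos, head, type, member)		\
	for (pos = SKETCH_CONTAINER_OF((head)->next, type, member);	\
	     &pos->member != (head);					\
	     pos = SKETCH_CONTAINER_OF(pos->member.next, type, member))

/* Hypothetical stand-in for struct pci_dev. */
struct sketch_dev {
	int class_code;			/* 24-bit class/subclass/prog-if */
	struct list_head bus_list;
};

static inline void sketch_list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static inline void sketch_list_add_tail(struct list_head *entry,
					struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Count devices whose class/subclass differs from bridge_class. */
static inline int sketch_count_non_bridges(struct list_head *devices,
					   int bridge_class)
{
	struct sketch_dev *dev;
	int n = 0;

	SKETCH_LIST_FOR_EACH_ENTRY(dev, devices, struct sketch_dev, bus_list) {
		if ((dev->class_code >> 8) != bridge_class)
			n++;
	}
	return n;
}
```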
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pci.c~pci5 arch/ppc64/kernel/pci.c --- gr_base/arch/ppc64/kernel/pci.c~pci5 2004-11-12 04:50:39.326631502 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.c 2004-11-12 04:50:39.338629545 -0600 @@ -815,9 +815,6 @@ EXPORT_SYMBOL(pcibios_fixup_device_resou void __devinit pcibios_fixup_bus(struct pci_bus *bus) { struct pci_controller *hose = PCI_GET_PHB_PTR(bus); - struct list_head *ln; - - /* XXX or bus->parent? */ struct pci_dev *dev = bus->self; struct resource *res; int i; @@ -827,18 +824,13 @@ void __devinit pcibios_fixup_bus(struct hose->bus = bus; bus->resource[0] = res = &hose->io_resource; - if (!res->flags) - BUG(); /* No I/O resource for this PHB? */ - if (request_resource(&ioport_resource, res)) + if (res->flags && request_resource(&ioport_resource, res)) printk(KERN_ERR "Failed to request IO on " "PCI domain %d\n", pci_domain_nr(bus)); - for (i = 0; i < 3; ++i) { res = &hose->mem_resources[i]; - if (!res->flags && i == 0) - BUG(); /* No memory resource for this PHB? 
*/ bus->resource[i+1] = res; if (res->flags && request_resource(&iomem_resource, res)) printk(KERN_ERR "Failed to request MEM on " @@ -853,12 +845,10 @@ void __devinit pcibios_fixup_bus(struct pcibios_fixup_device_resources(dev, bus); } - /* XXX Need to check why Alpha doesnt do this - Anton */ if (!pci_probe_only) return; - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *dev = pci_dev_b(ln); + list_for_each_entry(dev, &bus->devices, bus_list) { if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) pcibios_fixup_device_resources(dev, bus); } _ From anton at samba.org Sun Nov 14 17:18:25 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 17:18:25 +1100 Subject: [PATCH] ppc64: remove phb_set_model In-Reply-To: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> References: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114061825.GK16377@krispykreme.ozlabs.ibm.com> phb_set_model does a lot of work just to set up a text string that almost nothing uses. Replace this all with an is_python() check. 
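The is_python() check mentioned above amounts to a NULL-safe substring match on the node's "model" property. A standalone sketch of that logic is given here. sketch_is_python is a hypothetical helper that takes the model string directly, whereas the patch obtains it from the device node with get_property().

```c
#include <assert.h>
#include <string.h>

/*
 * Assumption: the caller has already fetched the "model" property;
 * a missing property is represented by NULL.
 * Returns 1 if the model string names a Python bridge, 0 otherwise.
 */
static inline int sketch_is_python(const char *model)
{
	if (model != NULL && strstr(model, "Python") != NULL)
		return 1;

	return 0;
}
```

A missing property simply yields 0, which is why the error printk from the old get_phb_type() is no longer needed.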
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/iSeries_pci.c~pci2 arch/ppc64/kernel/iSeries_pci.c --- gr_base/arch/ppc64/kernel/iSeries_pci.c~pci2 2004-11-12 04:50:34.571616109 -0600 +++ gr_base-anton/arch/ppc64/kernel/iSeries_pci.c 2004-11-12 04:50:34.613609258 -0600 @@ -256,7 +256,7 @@ unsigned long __init find_and_init_phbs( int ret = HvCallXm_testBus(bus); if (ret == 0) { printk("bus %d appears to exist\n", bus); - phb = pci_alloc_pci_controller(phb_type_hypervisor); + phb = pci_alloc_pci_controller(); if (phb == NULL) return -1; phb->pci_mem_offset = phb->local_number = bus; diff -puN arch/ppc64/kernel/maple_pci.c~pci2 arch/ppc64/kernel/maple_pci.c --- gr_base/arch/ppc64/kernel/maple_pci.c~pci2 2004-11-12 04:50:34.576615293 -0600 +++ gr_base-anton/arch/ppc64/kernel/maple_pci.c 2004-11-12 04:50:34.614609095 -0600 @@ -325,7 +325,7 @@ static int __init add_bridge(struct devi dev->full_name); } - hose = pci_alloc_pci_controller(phb_type_apple); + hose = pci_alloc_pci_controller(); if (!hose) return -ENOMEM; hose->arch_data = dev; diff -puN arch/ppc64/kernel/pSeries_pci.c~pci2 arch/ppc64/kernel/pSeries_pci.c --- gr_base/arch/ppc64/kernel/pSeries_pci.c~pci2 2004-11-12 04:50:34.581614478 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_pci.c 2004-11-12 04:50:34.617608606 -0600 @@ -149,6 +149,16 @@ struct pci_ops rtas_pci_ops = { rtas_pci_write_config }; +static int is_python(struct device_node *dev) +{ + char *model = (char *)get_property(dev, "model", NULL); + + if (model && strstr(model, "Python")) + return 1; + + return 0; +} + static void python_countermeasures(unsigned long addr) { void __iomem *chip_regs; @@ -218,33 +228,6 @@ unsigned long __devinit get_phb_buid (st return buid; } -static enum phb_types get_phb_type(struct device_node *dev) -{ - enum phb_types type; - char *model; - - model = (char *)get_property(dev, "model", NULL); - - if (!model) { - printk(KERN_ERR "%s: phb has no model property\n", - __FUNCTION__); - model = ""; - } - - if 
(strstr(model, "Python")) { - type = phb_type_python; - } else if (strstr(model, "Speedwagon")) { - type = phb_type_speedwagon; - } else if (strstr(model, "Winnipeg")) { - type = phb_type_winnipeg; - } else { - printk(KERN_ERR "%s: unknown PHB %s\n", __FUNCTION__, model); - type = phb_type_unknown; - } - - return type; -} - static int get_phb_reg_prop(struct device_node *dev, unsigned int addr_size_words, struct reg_property64 *reg) @@ -288,21 +271,18 @@ static struct pci_controller *alloc_phb( { struct pci_controller *phb; struct reg_property64 reg_struct; - enum phb_types phb_type; struct property *of_prop; int rc; - phb_type = get_phb_type(dev); - rc = get_phb_reg_prop(dev, addr_size_words, &reg_struct); if (rc) return NULL; - phb = pci_alloc_pci_controller(phb_type); + phb = pci_alloc_pci_controller(); if (phb == NULL) return NULL; - if (phb_type == phb_type_python) + if (is_python(dev)) python_countermeasures(reg_struct.address); rc = phb_set_bus_ranges(dev, phb); @@ -336,20 +316,17 @@ static struct pci_controller * __devinit { struct pci_controller *phb; struct reg_property64 reg_struct; - enum phb_types phb_type; int rc; - phb_type = get_phb_type(dev); - rc = get_phb_reg_prop(dev, addr_size_words, &reg_struct); if (rc) return NULL; - phb = pci_alloc_phb_dynamic(phb_type); + phb = pci_alloc_phb_dynamic(); if (phb == NULL) return NULL; - if (phb_type == phb_type_python) + if (is_python(dev)) python_countermeasures(reg_struct.address); rc = phb_set_bus_ranges(dev, phb); diff -puN arch/ppc64/kernel/pci.c~pci2 arch/ppc64/kernel/pci.c --- gr_base/arch/ppc64/kernel/pci.c~pci2 2004-11-12 04:50:34.587613499 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.c 2004-11-12 04:50:34.619608279 -0600 @@ -181,43 +181,10 @@ void pcibios_align_resource(void *data, res->start = start; } -static void phb_set_model(struct pci_controller *hose, - enum phb_types controller_type) -{ - char *model; - - switch(controller_type) { -#ifdef CONFIG_PPC_ISERIES - case phb_type_hypervisor: - model =
"PHB HV"; - break; -#endif - case phb_type_python: - model = "PHB PY"; - break; - case phb_type_speedwagon: - model = "PHB SW"; - break; - case phb_type_winnipeg: - model = "PHB WP"; - break; - case phb_type_apple: - model = "PHB APPLE"; - break; - default: - model = "PHB UK"; - break; - } - - if(strlen(model) < 8) - strcpy(hose->what,model); - else - memcpy(hose->what,model,7); -} /* * Allocate pci_controller(phb) initialized common variables. */ -struct pci_controller * __init pci_alloc_pci_controller(enum phb_types controller_type) +struct pci_controller * __init pci_alloc_pci_controller() { struct pci_controller *hose; @@ -233,10 +200,7 @@ struct pci_controller * __init pci_alloc } memset(hose, 0, sizeof(struct pci_controller)); - phb_set_model(hose, controller_type); - hose->is_dynamic = 0; - hose->type = controller_type; hose->global_number = global_phb_number++; list_add_tail(&hose->list_node, &hose_list); @@ -247,7 +211,7 @@ struct pci_controller * __init pci_alloc /* * Dymnamically allocate pci_controller(phb), initialize common variables. 
*/ -struct pci_controller * pci_alloc_phb_dynamic(enum phb_types controller_type) +struct pci_controller * pci_alloc_phb_dynamic() { struct pci_controller *hose; @@ -259,10 +223,7 @@ struct pci_controller * pci_alloc_phb_dy } memset(hose, 0, sizeof(struct pci_controller)); - phb_set_model(hose, controller_type); - hose->is_dynamic = 1; - hose->type = controller_type; hose->global_number = global_phb_number++; list_add_tail(&hose->list_node, &hose_list); diff -puN arch/ppc64/kernel/pci.h~pci2 arch/ppc64/kernel/pci.h --- gr_base/arch/ppc64/kernel/pci.h~pci2 2004-11-12 04:50:34.591612846 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.h 2004-11-12 04:50:34.621735837 -0600 @@ -14,8 +14,8 @@ extern unsigned long isa_io_base; -extern struct pci_controller* pci_alloc_pci_controller(enum phb_types controller_type); -extern struct pci_controller* pci_alloc_phb_dynamic(enum phb_types controller_type); +extern struct pci_controller* pci_alloc_pci_controller(void); +extern struct pci_controller* pci_alloc_phb_dynamic(void); extern void pci_setup_phb_io(struct pci_controller *hose, int primary); extern struct pci_controller* pci_find_hose_for_OF_device(struct device_node* node); @@ -50,7 +50,8 @@ void pci_addr_cache_remove_device(struct void init_pci_config_tokens (void); unsigned long get_phb_buid (struct device_node *); -extern int pci_probe_only; +extern unsigned long pci_probe_only; +extern unsigned long pci_assign_all_buses; extern int pci_read_irq_line(struct pci_dev *pci_dev); #endif /* __PPC_KERNEL_PCI_H__ */ diff -puN arch/ppc64/kernel/pmac_pci.c~pci2 arch/ppc64/kernel/pmac_pci.c --- gr_base/arch/ppc64/kernel/pmac_pci.c~pci2 2004-11-12 04:50:34.597611868 -0600 +++ gr_base-anton/arch/ppc64/kernel/pmac_pci.c 2004-11-12 04:50:34.623735511 -0600 @@ -614,7 +614,7 @@ static int __init add_bridge(struct devi dev->full_name); } - hose = pci_alloc_pci_controller(phb_type_apple); + hose = pci_alloc_pci_controller(); if (!hose) return -ENOMEM; hose->arch_data = dev; diff -puN 
include/asm-ppc64/pci-bridge.h~pci2 include/asm-ppc64/pci-bridge.h --- gr_base/include/asm-ppc64/pci-bridge.h~pci2 2004-11-12 04:50:34.601611215 -0600 +++ gr_base-anton/include/asm-ppc64/pci-bridge.h 2004-11-12 04:50:34.624735348 -0600 @@ -18,21 +18,11 @@ struct pci_controller; extern struct pci_controller* pci_find_hose_for_OF_device(struct device_node* node); -enum phb_types { - phb_type_unknown = 0x0, - phb_type_hypervisor = 0x1, - phb_type_python = 0x10, - phb_type_speedwagon = 0x11, - phb_type_winnipeg = 0x12, - phb_type_apple = 0xff -}; - /* * Structure of a PCI controller (host bridge) */ struct pci_controller { char what[8]; /* Eye catcher */ - enum phb_types type; /* Type of hardware */ struct pci_bus *bus; char is_dynamic; void *arch_data; _ From anton at samba.org Sun Nov 14 17:10:36 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 17:10:36 +1100 Subject: [PATCH] ppc64: pci cleanup Message-ID: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> Cleanup ppc64 pci code. Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/iSeries_pci.c~pci1 arch/ppc64/kernel/iSeries_pci.c --- gr_base/arch/ppc64/kernel/iSeries_pci.c~pci1 2004-11-12 04:50:33.577644518 -0600 +++ gr_base-anton/arch/ppc64/kernel/iSeries_pci.c 2004-11-12 04:50:33.624636852 -0600 @@ -292,7 +292,6 @@ void iSeries_pcibios_init(void) iomm_table_initialize(); find_and_init_phbs(); io_page_mask = -1; - /* pci_assign_all_busses = 0; SFRXXX*/ PPCDBG(PPCDBG_BUSWALK, "iSeries_pcibios_init Exit.\n"); } diff -puN arch/ppc64/kernel/maple_pci.c~pci1 arch/ppc64/kernel/maple_pci.c --- gr_base/arch/ppc64/kernel/maple_pci.c~pci1 2004-11-12 04:50:33.582643702 -0600 +++ gr_base-anton/arch/ppc64/kernel/maple_pci.c 2004-11-12 04:50:33.620637504 -0600 @@ -32,9 +32,6 @@ #define DBG(x...) 
#endif -extern int pci_probe_only; -extern int pci_read_irq_line(struct pci_dev *pci_dev); - static struct pci_controller *u3_agp, *u3_ht; static int __init fixup_one_level_bus_range(struct device_node *node, int higher) @@ -377,7 +374,7 @@ void __init maple_pcibios_fixup(void) DBG(" -> maple_pcibios_fixup\n"); - while ((dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) + for_each_pci_dev(dev) pci_read_irq_line(dev); /* Do the mapping of the IO space */ diff -puN arch/ppc64/kernel/pSeries_pci.c~pci1 arch/ppc64/kernel/pSeries_pci.c --- gr_base/arch/ppc64/kernel/pSeries_pci.c~pci1 2004-11-12 04:50:33.587642887 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_pci.c 2004-11-12 04:50:33.622637178 -0600 @@ -26,7 +26,6 @@ #include #include #include -#include #include #include #include @@ -37,7 +36,6 @@ #include #include #include -#include #include #include #include @@ -53,7 +51,6 @@ static int ibm_write_pci_config; static int s7a_workaround; -extern unsigned long pci_probe_only; extern struct mpic *pSeries_mpic; static int rtas_read_config(struct device_node *dn, int where, int size, u32 *val) @@ -248,17 +245,16 @@ static enum phb_types get_phb_type(struc return type; } -int get_phb_reg_prop(struct device_node *dev, unsigned int addr_size_words, - struct reg_property64 *reg) +static int get_phb_reg_prop(struct device_node *dev, + unsigned int addr_size_words, + struct reg_property64 *reg) { unsigned int *ui_ptr = NULL, len; /* Found a PHB, now figure out where his registers are mapped. 
*/ ui_ptr = (unsigned int *) get_property(dev, "reg", &len); - if (ui_ptr == NULL) { - PPCDBG(PPCDBG_PHBINIT, "\tget reg failed.\n"); + if (ui_ptr == NULL) return 1; - } if (addr_size_words == 1) { reg->address = ((struct reg_property32 *)ui_ptr)->address; @@ -270,7 +266,8 @@ int get_phb_reg_prop(struct device_node return 0; } -int phb_set_bus_ranges(struct device_node *dev, struct pci_controller *phb) +static int phb_set_bus_ranges(struct device_node *dev, + struct pci_controller *phb) { int *bus_range; unsigned int len; diff -puN arch/ppc64/kernel/pci.c~pci1 arch/ppc64/kernel/pci.c --- gr_base/arch/ppc64/kernel/pci.c~pci1 2004-11-12 04:50:33.593641908 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.c 2004-11-12 04:50:33.615638320 -0600 @@ -16,14 +16,9 @@ #include #include #include -#include #include #include -#include -#include -#include #include -#include #include #include @@ -33,11 +28,8 @@ #include #include #include -#include -#include -#include -#include #include +#include #include "pci.h" @@ -50,8 +42,10 @@ unsigned long pci_probe_only = 1; unsigned long pci_assign_all_buses = 0; -/* legal IO pages under MAX_ISA_PORT. This is to ensure we don't touch - devices we don't have access to. */ +/* + * legal IO pages under MAX_ISA_PORT. This is to ensure we don't touch + * devices we don't have access to. 
+ */ unsigned long io_page_mask; EXPORT_SYMBOL(io_page_mask); @@ -702,7 +696,7 @@ void __init pci_setup_phb_io(struct pci_ struct device_node *isa_dn; hose->io_base_virt = reserve_phb_iospace(size); - PPCDBG(PPCDBG_PHBINIT, "phb%d io_base_phys 0x%lx io_base_virt 0x%lx\n", + DBG("phb%d io_base_phys 0x%lx io_base_virt 0x%lx\n", hose->global_number, hose->io_base_phys, (unsigned long) hose->io_base_virt); @@ -733,7 +727,7 @@ void __devinit pci_setup_phb_io_dynamic( hose->io_base_virt = __ioremap(hose->io_base_phys, size, _PAGE_NO_CACHE); - PPCDBG(PPCDBG_PHBINIT, "phb%d io_base_phys 0x%lx io_base_virt 0x%lx\n", + DBG("phb%d io_base_phys 0x%lx io_base_virt 0x%lx\n", hose->global_number, hose->io_base_phys, (unsigned long) hose->io_base_virt); @@ -833,13 +827,10 @@ void phbs_remap_io(void) } -/*********************************************************************** - * pci_find_hose_for_OF_device - * +/* * This function finds the PHB that matching device_node in the * OpenFirmware by scanning all the pci_controllers. - * - ***********************************************************************/ + */ struct pci_controller* pci_find_hose_for_OF_device(struct device_node *node) { while (node) { @@ -972,44 +963,31 @@ void __devinit pcibios_fixup_bus(struct } EXPORT_SYMBOL(pcibios_fixup_bus); -/****************************************************************** - * pci_read_irq_line - * - * Reads the Interrupt Pin to determine if interrupt is use by card. +/* + * Reads the interrupt pin to determine if interrupt is use by card. * If the interrupt is used, then gets the interrupt line from the * openfirmware and sets it in the pci_dev and pci_config line. 
- * - ******************************************************************/ + */ int pci_read_irq_line(struct pci_dev *pci_dev) { u8 intpin; struct device_node *node; pci_read_config_byte(pci_dev, PCI_INTERRUPT_PIN, &intpin); - - if (intpin == 0) { - PPCDBG(PPCDBG_BUSWALK,"\tDevice: %s No Interrupt used by device.\n", - pci_name(pci_dev)); - return 0; - } + if (intpin == 0) + return 0; node = pci_device_to_OF_node(pci_dev); - if (node == NULL) { - PPCDBG(PPCDBG_BUSWALK,"\tDevice: %s Device Node not found.\n", - pci_name(pci_dev)); - return -1; - } - if (node->n_intrs == 0) { - PPCDBG(PPCDBG_BUSWALK,"\tDevice: %s No Device OF interrupts defined.\n", - pci_name(pci_dev)); - return -1; - } + if (node == NULL) + return -1; + + if (node->n_intrs == 0) + return -1; + pci_dev->irq = node->intrs[0].line; pci_write_config_byte(pci_dev, PCI_INTERRUPT_LINE, pci_dev->irq); - - PPCDBG(PPCDBG_BUSWALK,"\tDevice: %s pci_dev->irq = 0x%02X\n", - pci_name(pci_dev), pci_dev->irq); + return 0; } EXPORT_SYMBOL(pci_read_irq_line); diff -puN arch/ppc64/kernel/pci.h~pci1 arch/ppc64/kernel/pci.h --- gr_base/arch/ppc64/kernel/pci.h~pci1 2004-11-12 04:50:33.597641256 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.h 2004-11-12 04:50:33.616638157 -0600 @@ -25,16 +25,11 @@ extern void pci_setup_phb_io_dynamic(str extern struct list_head hose_list; extern int global_phb_number; -/******************************************************************* - * Platform functions that are brand specific implementation. 
- *******************************************************************/ extern unsigned long find_and_init_phbs(void); extern struct pci_dev *ppc64_isabridge_dev; /* may be NULL if no ISA bus */ -/******************************************************************* - * PCI device_node operations - *******************************************************************/ +/* PCI device_node operations */ struct device_node; typedef void *(*traverse_func)(struct device_node *me, void *data); void *traverse_pci_devices(struct device_node *start, traverse_func pre, @@ -55,5 +50,7 @@ void pci_addr_cache_remove_device(struct void init_pci_config_tokens (void); unsigned long get_phb_buid (struct device_node *); +extern int pci_probe_only; +extern int pci_read_irq_line(struct pci_dev *pci_dev); #endif /* __PPC_KERNEL_PCI_H__ */ diff -puN arch/ppc64/kernel/pmac_pci.c~pci1 arch/ppc64/kernel/pmac_pci.c --- gr_base/arch/ppc64/kernel/pmac_pci.c~pci1 2004-11-12 04:50:33.603640277 -0600 +++ gr_base-anton/arch/ppc64/kernel/pmac_pci.c 2004-11-12 04:50:33.618637830 -0600 @@ -39,9 +39,6 @@ #define DBG(x...) #endif -extern int pci_probe_only; -extern int pci_read_irq_line(struct pci_dev *pci_dev); - /* XXX Could be per-controller, but I don't think we risk anything by * assuming we won't have both UniNorth and Bandit */ static int has_uninorth; _ From anton at samba.org Sun Nov 14 17:22:02 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 17:22:02 +1100 Subject: [PATCH] ppc64: make fixup_winbond_82c105 pseries specific In-Reply-To: <20041114061825.GK16377@krispykreme.ozlabs.ibm.com> References: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> <20041114061825.GK16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114062202.GL16377@krispykreme.ozlabs.ibm.com> The winbond irq fixup is pSeries specific. Move it into pSeries_pci.c and check for PLATFORM_PSERIES. 
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pSeries_pci.c~pci2b arch/ppc64/kernel/pSeries_pci.c --- gr_base/arch/ppc64/kernel/pSeries_pci.c~pci2b 2004-11-12 04:50:35.605714058 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_pci.c 2004-11-12 04:50:35.621711448 -0600 @@ -542,3 +542,30 @@ void __init pSeries_final_fixup(void) pci_addr_cache_build(); } +/* + * Assume the winbond 82c105 is the IDE controller on a + * p610. We should probably be more careful in case + * someone tries to plug in a similar adapter. + */ +static void fixup_winbond_82c105(struct pci_dev* dev) +{ + int i; + unsigned int reg; + + if (!(systemcfg->platform & PLATFORM_PSERIES)) + return; + + printk("Using INTC for W82c105 IDE controller.\n"); + pci_read_config_dword(dev, 0x40, &reg); + /* Enable LEGIRQ to use INTC instead of ISA interrupts */ + pci_write_config_dword(dev, 0x40, reg | (1<<11)); + + for (i = 0; i < DEVICE_COUNT_RESOURCE; ++i) { + /* zap the 2nd function of the winbond chip */ + if (dev->resource[i].flags & IORESOURCE_IO + && dev->bus->number == 0 && dev->devfn == 0x81) + dev->resource[i].flags &= ~IORESOURCE_IO; + } +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND, PCI_DEVICE_ID_WINBOND_82C105, + fixup_winbond_82c105); diff -puN arch/ppc64/kernel/pci.c~pci2b arch/ppc64/kernel/pci.c --- gr_base/arch/ppc64/kernel/pci.c~pci2b 2004-11-12 04:50:35.610713243 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.c 2004-11-12 04:50:35.624710959 -0600 @@ -87,30 +87,6 @@ static void fixup_broken_pcnet32(struct } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TRIDENT, PCI_ANY_ID, fixup_broken_pcnet32); -static void fixup_windbond_82c105(struct pci_dev* dev) -{ - /* Assume the windbond 82c105 is the IDE controller on a - * p610. We should probably be more careful in case - * someone tries to plug in a similar adapter.
- */ - int i; - unsigned int reg; - - printk("Using INTC for W82c105 IDE controller.\n"); - pci_read_config_dword(dev, 0x40, &reg); - /* Enable LEGIRQ to use INTC instead of ISA interrupts */ - pci_write_config_dword(dev, 0x40, reg | (1<<11)); - - for (i = 0; i < DEVICE_COUNT_RESOURCE; ++i) { - /* zap the 2nd function of the winbond chip */ - if (dev->resource[i].flags & IORESOURCE_IO - && dev->bus->number == 0 && dev->devfn == 0x81) - dev->resource[i].flags &= ~IORESOURCE_IO; - } -} -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND, PCI_DEVICE_ID_WINBOND_82C105, - fixup_windbond_82c105); - void pcibios_resource_to_bus(struct pci_dev *dev, struct pci_bus_region *region, struct resource *res) { _ From anton at samba.org Sun Nov 14 17:34:27 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 17:34:27 +1100 Subject: [PATCH] ppc64: get_phb_reg_prop only required on python PCI machines In-Reply-To: <20041114063301.GO16377@krispykreme.ozlabs.ibm.com> References: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> <20041114061825.GK16377@krispykreme.ozlabs.ibm.com> <20041114062202.GL16377@krispykreme.ozlabs.ibm.com> <20041114062801.GM16377@krispykreme.ozlabs.ibm.com> <20041114062944.GN16377@krispykreme.ozlabs.ibm.com> <20041114063301.GO16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114063427.GP16377@krispykreme.ozlabs.ibm.com> get_phb_reg_prop was only used for python PCI machines, so remove it from common code and call it from there.
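To make the role of addr_size_words concrete: the "reg" property holds an address and a size, each one or two 32-bit cells wide, and the Python workaround then rounds the address down to the 1 MB register file. The sketch below decodes such a property portably. The sketch_* names and the struct layout are assumptions for illustration, and the cells are read as big-endian explicitly rather than by casting as the kernel code does on ppc64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct reg_property64. */
struct sketch_reg64 {
	uint64_t address;
	uint64_t size;
};

/* Read one big-endian 32-bit device tree cell. */
static inline uint32_t sketch_cell(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success and 1 on failure, like the helper in the patch. */
static inline int sketch_decode_reg(const unsigned char *prop,
				    unsigned int addr_size_words,
				    struct sketch_reg64 *reg)
{
	if (prop == NULL || reg == NULL)
		return 1;

	if (addr_size_words == 1) {
		/* One cell each: widen 32-bit values to 64 bits. */
		reg->address = sketch_cell(prop);
		reg->size = sketch_cell(prop + 4);
	} else if (addr_size_words == 2) {
		/* Two cells each: high word first. */
		reg->address = ((uint64_t)sketch_cell(prop) << 32) |
			       sketch_cell(prop + 4);
		reg->size = ((uint64_t)sketch_cell(prop + 8) << 32) |
			    sketch_cell(prop + 12);
	} else {
		return 1;
	}

	return 0;
}

/* Round an address down to the start of its 1 MB register file. */
static inline uint64_t sketch_python_regs_base(uint64_t address)
{
	return address & ~(uint64_t)0xfffff;
}
```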
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pSeries_pci.c~pci6 arch/ppc64/kernel/pSeries_pci.c --- gr_base/arch/ppc64/kernel/pSeries_pci.c~pci6 2004-11-12 04:51:00.826306721 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_pci.c 2004-11-12 04:51:00.837304928 -0600 @@ -159,13 +159,39 @@ static int is_python(struct device_node return 0; } -static void python_countermeasures(unsigned long addr) +static int get_phb_reg_prop(struct device_node *dev, + unsigned int addr_size_words, + struct reg_property64 *reg) { + unsigned int *ui_ptr = NULL, len; + + /* Found a PHB, now figure out where his registers are mapped. */ + ui_ptr = (unsigned int *)get_property(dev, "reg", &len); + if (ui_ptr == NULL) + return 1; + + if (addr_size_words == 1) { + reg->address = ((struct reg_property32 *)ui_ptr)->address; + reg->size = ((struct reg_property32 *)ui_ptr)->size; + } else { + *reg = *((struct reg_property64 *)ui_ptr); + } + + return 0; +} + +static void python_countermeasures(struct device_node *dev, + unsigned int addr_size_words) +{ + struct reg_property64 reg_struct; void __iomem *chip_regs; volatile u32 val; + if (get_phb_reg_prop(dev, addr_size_words, &reg_struct)) + return; + /* Python's register file is 1 MB in size. */ - chip_regs = ioremap(addr & ~(0xfffffUL), 0x100000); + chip_regs = ioremap(reg_struct.address & ~(0xfffffUL), 0x100000); /* * Firmware doesn't always clear this bit which is critical @@ -228,27 +254,6 @@ unsigned long __devinit get_phb_buid (st return buid; } -static int get_phb_reg_prop(struct device_node *dev, - unsigned int addr_size_words, - struct reg_property64 *reg) -{ - unsigned int *ui_ptr = NULL, len; - - /* Found a PHB, now figure out where his registers are mapped.
*/ - ui_ptr = (unsigned int *) get_property(dev, "reg", &len); - if (ui_ptr == NULL) - return 1; - - if (addr_size_words == 1) { - reg->address = ((struct reg_property32 *)ui_ptr)->address; - reg->size = ((struct reg_property32 *)ui_ptr)->size; - } else { - *reg = *((struct reg_property64 *)ui_ptr); - } - - return 0; -} - static int phb_set_bus_ranges(struct device_node *dev, struct pci_controller *phb) { @@ -270,15 +275,10 @@ static int __devinit setup_phb(struct de struct pci_controller *phb, unsigned int addr_size_words) { - struct reg_property64 reg_struct; - - if (get_phb_reg_prop(dev, addr_size_words, &reg_struct)) - return 1; - pci_setup_pci_controller(phb); if (is_python(dev)) - python_countermeasures(reg_struct.address); + python_countermeasures(dev, addr_size_words); if (phb_set_bus_ranges(dev, phb)) return 1; _ From anton at samba.org Sun Nov 14 17:28:01 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 17:28:01 +1100 Subject: [PATCH] ppc64: remove duplication in pci_alloc_* In-Reply-To: <20041114062202.GL16377@krispykreme.ozlabs.ibm.com> References: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> <20041114061825.GK16377@krispykreme.ozlabs.ibm.com> <20041114062202.GL16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114062801.GM16377@krispykreme.ozlabs.ibm.com> We duplicated the code in pci_alloc_pci_controller twice and had an ifdef for iseries as well, just to select between kmalloc and bootmem memory. Change this so we instead pass the allocation into a common function - pci_setup_pci_controller. Also use a spinlock around the hose_list and global_phb_number code since we can now modify it at runtime via hotplug.
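The shape of this refactoring, reduced to a userspace sketch: the caller supplies the storage (boot-time or dynamic), and one common routine does the initialisation and the locked update of the shared list and counter. Everything named sketch_* is hypothetical, the spinlock is modelled with a C11 atomic_flag, and entries are pushed at the head of a singly linked list for brevity where the kernel code appends with list_add_tail().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct pci_controller. */
struct sketch_phb {
	int global_number;
	int is_dynamic;
	struct sketch_phb *next;
};

/* Shared state that may now change at runtime, so it needs a lock. */
static atomic_flag sketch_hose_lock = ATOMIC_FLAG_INIT;
static struct sketch_phb *sketch_hose_list;
static int sketch_global_phb_number;

static inline void sketch_lock(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&sketch_hose_lock,
						 memory_order_acquire))
		;
}

static inline void sketch_unlock(void)
{
	atomic_flag_clear_explicit(&sketch_hose_lock, memory_order_release);
}

/*
 * Common setup: the caller chooses how the memory was obtained,
 * this function only initialises it and links it in under the lock.
 */
static inline void sketch_setup_phb(struct sketch_phb *phb)
{
	assert(phb != NULL);
	memset(phb, 0, sizeof(*phb));

	sketch_lock();
	phb->global_number = sketch_global_phb_number++;
	phb->next = sketch_hose_list;	/* head insert, for brevity */
	sketch_hose_list = phb;
	sketch_unlock();
}
```

A boot-time caller would pass statically or early-allocated storage, while a hotplug caller would pass heap memory and set is_dynamic afterwards; neither needs its own copy of the setup code.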
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pSeries_pci.c~pci3 arch/ppc64/kernel/pSeries_pci.c --- gr_base/arch/ppc64/kernel/pSeries_pci.c~pci3 2004-11-12 04:50:36.956624582 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_pci.c 2004-11-12 04:50:36.994618384 -0600 @@ -266,48 +266,61 @@ static int phb_set_bus_ranges(struct dev return 0; } -static struct pci_controller *alloc_phb(struct device_node *dev, - unsigned int addr_size_words) +static int __devinit setup_phb(struct device_node *dev, + struct pci_controller *phb, + unsigned int addr_size_words) { - struct pci_controller *phb; struct reg_property64 reg_struct; - struct property *of_prop; - int rc; - rc = get_phb_reg_prop(dev, addr_size_words, &reg_struct); - if (rc) - return NULL; + if (get_phb_reg_prop(dev, addr_size_words, &reg_struct)) + return 1; - phb = pci_alloc_pci_controller(); - if (phb == NULL) - return NULL; + pci_setup_pci_controller(phb); if (is_python(dev)) python_countermeasures(reg_struct.address); - rc = phb_set_bus_ranges(dev, phb); - if (rc) - return NULL; + if (phb_set_bus_ranges(dev, phb)) + return 1; - of_prop = (struct property *)alloc_bootmem(sizeof(struct property) + - sizeof(phb->global_number)); + phb->arch_data = dev; + phb->ops = &rtas_pci_ops; + phb->buid = get_phb_buid(dev); - if (!of_prop) { - kfree(phb); - return NULL; - } + return 0; +} +static void __devinit add_linux_pci_domain(struct device_node *dev, + struct pci_controller *phb, + struct property *of_prop) +{ memset(of_prop, 0, sizeof(struct property)); of_prop->name = "linux,pci-domain"; of_prop->length = sizeof(phb->global_number); of_prop->value = (unsigned char *)&of_prop[1]; memcpy(of_prop->value, &phb->global_number, sizeof(phb->global_number)); prom_add_property(dev, of_prop); +} - phb->arch_data = dev; - phb->ops = &rtas_pci_ops; +static struct pci_controller * __init alloc_phb(struct device_node *dev, + unsigned int addr_size_words) +{ + struct pci_controller *phb; + struct property *of_prop; -
phb->buid = get_phb_buid(dev); + phb = (struct pci_controller *)alloc_bootmem(sizeof(struct pci_controller)); + if (phb == NULL) + return NULL; + + of_prop = (struct property *)alloc_bootmem(sizeof(struct property) + + sizeof(phb->global_number)); + if (!of_prop) + return NULL; + + if (setup_phb(dev, phb, addr_size_words)) + return NULL; + + add_linux_pci_domain(dev, phb, of_prop); return phb; } @@ -315,30 +328,18 @@ static struct pci_controller *alloc_phb( static struct pci_controller * __devinit alloc_phb_dynamic(struct device_node *dev, unsigned int addr_size_words) { struct pci_controller *phb; - struct reg_property64 reg_struct; - int rc; - - rc = get_phb_reg_prop(dev, addr_size_words, &reg_struct); - if (rc) - return NULL; - phb = pci_alloc_phb_dynamic(); + phb = (struct pci_controller *)kmalloc(sizeof(struct pci_controller), + GFP_KERNEL); if (phb == NULL) return NULL; - if (is_python(dev)) - python_countermeasures(reg_struct.address); - - rc = phb_set_bus_ranges(dev, phb); - if (rc) + if (setup_phb(dev, phb, addr_size_words)) return NULL; - /* TODO: linux,pci-domain? */ - - phb->arch_data = dev; - phb->ops = &rtas_pci_ops; + phb->is_dynamic = 1; - phb->buid = get_phb_buid(dev); + /* TODO: linux,pci-domain? */ return phb; } diff -puN arch/ppc64/kernel/pci.c~pci3 arch/ppc64/kernel/pci.c --- gr_base/arch/ppc64/kernel/pci.c~pci3 2004-11-12 04:50:36.962623603 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.c 2004-11-12 04:50:36.996618058 -0600 @@ -157,54 +157,19 @@ void pcibios_align_resource(void *data, res->start = start; } -/* - * Allocate pci_controller(phb) initialized common variables.
- */ -struct pci_controller * __init pci_alloc_pci_controller() -{ - struct pci_controller *hose; - -#ifdef CONFIG_PPC_ISERIES - hose = (struct pci_controller *)kmalloc(sizeof(struct pci_controller), - GFP_KERNEL); -#else - hose = (struct pci_controller *)alloc_bootmem(sizeof(struct pci_controller)); -#endif - if (hose == NULL) { - printk(KERN_ERR "PCI: Allocate pci_controller failed.\n"); - return NULL; - } - memset(hose, 0, sizeof(struct pci_controller)); - - hose->is_dynamic = 0; - hose->global_number = global_phb_number++; - - list_add_tail(&hose->list_node, &hose_list); - - return hose; -} +static spinlock_t hose_spinlock = SPIN_LOCK_UNLOCKED; /* - * Dymnamically allocate pci_controller(phb), initialize common variables. + * pci_controller(phb) initialized common variables. */ -struct pci_controller * pci_alloc_phb_dynamic() +void __devinit pci_setup_pci_controller(struct pci_controller *hose) { - struct pci_controller *hose; - - hose = (struct pci_controller *)kmalloc(sizeof(struct pci_controller), - GFP_KERNEL); - if(hose == NULL) { - printk(KERN_ERR "PCI: Allocate pci_controller failed.\n"); - return NULL; - } memset(hose, 0, sizeof(struct pci_controller)); - hose->is_dynamic = 1; + spin_lock(&hose_spinlock); hose->global_number = global_phb_number++; - list_add_tail(&hose->list_node, &hose_list); - - return hose; + spin_unlock(&hose_spinlock); } static void __init pcibios_claim_one_bus(struct pci_bus *b) diff -puN arch/ppc64/kernel/pci.h~pci3 arch/ppc64/kernel/pci.h --- gr_base/arch/ppc64/kernel/pci.h~pci3 2004-11-12 04:50:36.967622788 -0600 +++ gr_base-anton/arch/ppc64/kernel/pci.h 2004-11-12 04:50:36.997617894 -0600 @@ -14,8 +14,7 @@ extern unsigned long isa_io_base; -extern struct pci_controller* pci_alloc_pci_controller(void); -extern struct pci_controller* pci_alloc_phb_dynamic(void); +extern void pci_setup_pci_controller(struct pci_controller *hose); extern void pci_setup_phb_io(struct pci_controller *hose, int primary); extern struct pci_controller* 
pci_find_hose_for_OF_device(struct device_node* node); diff -puN arch/ppc64/kernel/pmac_pci.c~pci3 arch/ppc64/kernel/pmac_pci.c --- gr_base/arch/ppc64/kernel/pmac_pci.c~pci3 2004-11-12 04:50:36.972621972 -0600 +++ gr_base-anton/arch/ppc64/kernel/pmac_pci.c 2004-11-12 04:50:36.999617568 -0600 @@ -614,9 +614,11 @@ static int __init add_bridge(struct devi dev->full_name); } - hose = pci_alloc_pci_controller(); - if (!hose) - return -ENOMEM; + hose = (struct pci_controller *)alloc_bootmem(sizeof(struct pci_controller)); + if (hose == NULL) + return -ENOMEM; + pci_setup_pci_controller(hose); + hose->arch_data = dev; hose->first_busno = bus_range ? bus_range[0] : 0; hose->last_busno = bus_range ? bus_range[1] : 0xff; diff -puN arch/ppc64/kernel/maple_pci.c~pci3 arch/ppc64/kernel/maple_pci.c --- gr_base/arch/ppc64/kernel/maple_pci.c~pci3 2004-11-12 04:50:36.977621157 -0600 +++ gr_base-anton/arch/ppc64/kernel/maple_pci.c 2004-11-12 04:50:37.001617242 -0600 @@ -325,9 +325,11 @@ static int __init add_bridge(struct devi dev->full_name); } - hose = pci_alloc_pci_controller(); - if (!hose) - return -ENOMEM; + hose = (struct pci_controller *)alloc_bootmem(sizeof(struct pci_controller)); + if (hose == NULL) + return -ENOMEM; + pci_setup_pci_controller(hose); + hose->arch_data = dev; hose->first_busno = bus_range ? bus_range[0] : 0; hose->last_busno = bus_range ? 
bus_range[1] : 0xff; diff -puN arch/ppc64/kernel/iSeries_pci.c~pci3 arch/ppc64/kernel/iSeries_pci.c --- gr_base/arch/ppc64/kernel/iSeries_pci.c~pci3 2004-11-12 04:50:36.982620341 -0600 +++ gr_base-anton/arch/ppc64/kernel/iSeries_pci.c 2004-11-12 04:50:37.003616916 -0600 @@ -256,9 +256,12 @@ unsigned long __init find_and_init_phbs( int ret = HvCallXm_testBus(bus); if (ret == 0) { printk("bus %d appears to exist\n", bus); - phb = pci_alloc_pci_controller(); + + phb = (struct pci_controller *)kmalloc(sizeof(struct pci_controller), GFP_KERNEL); if (phb == NULL) - return -1; + return -ENOMEM; + pci_setup_pci_controller(phb); + phb->pci_mem_offset = phb->local_number = bus; phb->first_busno = bus; phb->last_busno = bus; _ From anton at samba.org Sun Nov 14 17:29:44 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 17:29:44 +1100 Subject: [PATCH] ppc64: OF overrides for pci_probe_only, pci_assign_all_buses In-Reply-To: <20041114062801.GM16377@krispykreme.ozlabs.ibm.com> References: <20041114061036.GJ16377@krispykreme.ozlabs.ibm.com> <20041114061825.GK16377@krispykreme.ozlabs.ibm.com> <20041114062202.GL16377@krispykreme.ozlabs.ibm.com> <20041114062801.GM16377@krispykreme.ozlabs.ibm.com> Message-ID: <20041114062944.GN16377@krispykreme.ozlabs.ibm.com> Allow pci_probe_only and pci_assign_all_buses to be modified via OF properties. Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pSeries_pci.c~pci4 arch/ppc64/kernel/pSeries_pci.c --- gr_base/arch/ppc64/kernel/pSeries_pci.c~pci4 2004-11-12 04:50:38.272670698 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_pci.c 2004-11-12 04:50:38.283668903 -0600 @@ -386,6 +386,24 @@ unsigned long __init find_and_init_phbs( of_node_put(root); pci_devs_phb_init(); + /* + * pci_probe_only and pci_assign_all_buses can be set via properties + * in chosen. 
+ */ + if (of_chosen) { + int *prop; + + prop = (int *)get_property(of_chosen, "linux,pci-probe-only", + NULL); + if (prop) + pci_probe_only = *prop; + + prop = (int *)get_property(of_chosen, + "linux,pci-assign-all-buses", NULL); + if (prop) + pci_assign_all_buses = *prop; + } + return 0; } _ From anton at samba.org Sun Nov 14 19:52:45 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 14 Nov 2004 19:52:45 +1100 Subject: [PATCH] ppc64: cleanups hpte_init_native, kill warning for !PSERIES builds Message-ID: <20041114085245.GB11375@krispykreme.ozlabs.ibm.com> From: Christoph Hellwig this splits out a small helper that checks whether tlb batching should be enabled from hpte_init_native, thus cleaning up the ifdef hell and killing a warning for pmac builds. Signed-off-by: Anton Blanchard Index: linux-2.5/arch/ppc64/mm/hash_native.c =================================================================== --- linux-2.5.orig/arch/ppc64/mm/hash_native.c 2004-11-14 15:52:18.747505472 +1100 +++ linux-2.5/arch/ppc64/mm/hash_native.c 2004-11-14 19:47:45.772890897 +1100 @@ -387,33 +387,37 @@ local_irq_restore(flags); } -void hpte_init_native(void) -{ #ifdef CONFIG_PPC_PSERIES - struct device_node *root; - const char *model; -#endif /* CONFIG_PPC_PSERIES */ +/* Disable TLB batching on nighthawk */ +static inline int tlb_batching_enabled(void) +{ + struct device_node *root = of_find_node_by_path("/"); + int enabled = 1; + + if (root) { + const char *model = get_property(root, "model", NULL); + if (model && !strcmp(model, "IBM,9076-N81")) + enabled = 0; + of_node_put(root); + } + return enabled; +} +#else +static inline int tlb_batching_enabled(void) +{ + return 1; +} +#endif + +void hpte_init_native(void) +{ ppc_md.hpte_invalidate = native_hpte_invalidate; ppc_md.hpte_updatepp = native_hpte_updatepp; ppc_md.hpte_updateboltedpp = native_hpte_updateboltedpp; ppc_md.hpte_insert = native_hpte_insert; ppc_md.hpte_remove = native_hpte_remove; - -#ifdef CONFIG_PPC_PSERIES - /* 
Disable TLB batching on nighthawk */ - root = of_find_node_by_path("/"); - if (root) { - model = get_property(root, "model", NULL); - if (model && !strcmp(model, "IBM,9076-N81")) { - of_node_put(root); - goto bail; - } - of_node_put(root); - } -#endif /* CONFIG_PPC_PSERIES */ - - ppc_md.flush_hash_range = native_flush_hash_range; - bail: + if (tlb_batching_enabled()) + ppc_md.flush_hash_range = native_flush_hash_range; htab_finish_init(); } From l_indien at magic.fr Mon Nov 15 01:40:50 2004 From: l_indien at magic.fr (J. Mayer) Date: Sun, 14 Nov 2004 15:40:50 +0100 Subject: Booting Imac G5 In-Reply-To: <1100312517.20592.109.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> Message-ID: <1100443250.16435.57.camel@rapid> On Sat, 2004-11-13 at 03:21, Benjamin Herrenschmidt wrote: > On Sat, 2004-11-13 at 03:22 +0100, J. Mayer wrote: > > > The only other device I can see that should have an IRQ pin set and have > > not is the IDE controler. > > That's why I did copy the code which gets the IRQ number from the IDE > > driver code, as I could see the IDE controler driver did work ! > > I'll check on my Ibook if I can see such a bug on IDE. > > I suppose the right fix then is to ignore the IRQ pin on pmac. Just copy > pci_read_irq_line() to pmac_pci.c, call it pmac_pci_read_irq_line and > remove the bit that reads the IRQ PIN .. 
(and remove the PPCDBG ugly > macros too while you are at it :) OK, what I've done is a bit different: I still read the IRQ pin and patch the PCI IRQ line only if the IRQ pin is set, so the configuration of the other PCI devices won't change. Here you'll find the full patch I use now on my Imac G5, including your SATA initialisation fixups. Here's a quick overview of current kernel status: - Pmac PCI IRQ fix: needed for IDE & SATA on Imac G5 (should not break other targets). - Ethernet PHY failure if the cable is plugged during boot - No RTC - No SMU management - Bad detection of frame-buffer virtual res in riva-fb: should use xres=1440 yres=900 & virtual_xres=1536 - Have to unplug/replug the USB keyboard after kernel boot to make it work with kernel 2.6.10-rc1. No such problem with 2.6.9. - Lots of segfaults occurring when multiple concurrent processes are running, especially during compilations. Maybe the RAM is bad (I'm trying to port memtest86 to check this) or the CPU state is not completely saved / restored when rescheduling: it seems not to occur when compiling without doing anything else concurrently. Do you have any idea about the last point ? Could this be Altivec context save / restore problems ? Regards. -- J. Mayer Never organized -------------- next part -------------- A non-text attachment was scrubbed... Name: linux-2.6.10-rc1.diff Type: text/x-patch Size: 19419 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041114/c01f56d0/attachment.bin From l_indien at magic.fr Mon Nov 15 02:34:48 2004 From: l_indien at magic.fr (J.
Mayer) Date: Sun, 14 Nov 2004 16:34:48 +0100 Subject: Booting Imac G5 : Wrong patch In-Reply-To: <1100312517.20592.109.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> Message-ID: <1100446488.16435.67.camel@rapid> On Sat, 2004-11-13 at 03:21, Benjamin Herrenschmidt wrote: > On Sat, 2004-11-13 at 03:22 +0100, J. Mayer wrote: > > > The only other device I can see that should have an IRQ pin set and have > > not is the IDE controler. > > That's why I did copy the code which gets the IRQ number from the IDE > > driver code, as I could see the IDE controler driver did work ! > > I'll check on my Ibook if I can see such a bug on IDE. > > I suppose the right fix then is to ignore the IRQ pin on pmac. Just copy > pci_read_irq_line() to pmac_pci.c, call it pmac_pci_read_irq_line and > remove the bit that reads the IRQ PIN .. (and remove the PPCDBG ugly > macros too while you are at it :) The previous patch I sent was buggy: I added the shasta controler in the pmac_ide driver but made a mistake: I missed an else statement so the controler was still recognized as an Ohare one. It's now well recognized (but uses k2 UDMA 100 timings) and runs in UDMA2 mode instead of MDMA2. The corrected full patch is attached. -- J. Mayer Never organized -------------- next part -------------- A non-text attachment was scrubbed... 
Name: linux-2.6.10-rc1.diff Type: text/x-patch Size: 19505 bytes Desc: Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041114/1ccddd3f/attachment.bin From benh at kernel.crashing.org Mon Nov 15 09:07:20 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 15 Nov 2004 09:07:20 +1100 Subject: Booting Imac G5 In-Reply-To: <1100443250.16435.57.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> Message-ID: <1100470040.20593.143.camel@gaston> > Here's a quick overview of current kernel status: > - Pmac PCI IRQ fix: needed for IDE & SATA on > Imac G5 (should not break other targets). > - Ethernet PHY failure if the cable is plugged > during boot Ok, have to check out what's in darwin, there are some bits for the Shasta chipset that I haven't adapted yet. > - No RTC RTC is probably in the SMU... > - No SMU management > - Bad detection of frame-buffer virtual res in > riva-fb: > should use xres=1440 yres=900 & > virtual_xres=1536 Ok, weird. Will check that when I get it. Does X "nv" driver work when booting with offb ? > - Have to unplug/replug the USB keyboard > after kernel boot to make it work with kernel > 2.6.10-rc1. No such problem with 2.6.9. There have been various USB related issue in 2.6.10-rc*, have you tried the latest bk ? > - Lot's of segfaults occuring when multiple > concurent processes are running, especially > during compilations. 
Maybe the RAM is bad > (I'm trying to port memtest86 to check this) > or the CPU state is not completely saved / > restored when rescheduling: it seems not > to occur when compiling without doing > anything else concurently. > > Do you have any idea about the last point ? Could this be Altivec > context save / restore problems ? I very much doubt it has anything to do with CPU context saving/restoring... Could be lots of different things, difficult to say at this point. Thermal problem ? Clock chip setup problem ? Bad RAMs ? ... Ben. From benh at kernel.crashing.org Mon Nov 15 09:09:00 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 15 Nov 2004 09:09:00 +1100 Subject: Booting Imac G5 In-Reply-To: <1100443250.16435.57.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> Message-ID: <1100470140.20592.145.camel@gaston> On Sun, 2004-11-14 at 15:40 +0100, J. Mayer wrote: + */ + pci_read_config_byte(pci_dev, PCI_INTERRUPT_PIN, &intpin); + if (intpin != 0) + pci_write_config_byte(pci_dev, PCI_INTERRUPT_LINE, pci_dev->irq); You don't need to read PCI_INTERRUPT_PIN, just unconditionally write to PCI_INTERRUPT_LINE... Ben. 
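For readers following the thread, the two variants under discussion (write PCI_INTERRUPT_LINE only when PCI_INTERRUPT_PIN is non-zero, versus write it unconditionally) can be compared with a small userland sketch. This is not the pmac_pci.c code from the attached patch: mock_pci_dev, the mock_* accessors and both fixup_* functions are hypothetical stand-ins for struct pci_dev and pci_read_config_byte()/pci_write_config_byte(). Only the two config-space offsets are the standard ones.

```c
/*
 * Userland sketch only. mock_pci_dev and the mock_* helpers are hypothetical
 * stand-ins for struct pci_dev and pci_{read,write}_config_byte(); they are
 * not kernel APIs and not the code from the patch in this thread.
 */
#include <assert.h>
#include <stdint.h>

#define PCI_INTERRUPT_LINE 0x3c	/* standard config-space offset */
#define PCI_INTERRUPT_PIN  0x3d	/* standard config-space offset */

struct mock_pci_dev {
	uint8_t config[256];	/* fake configuration space */
	unsigned int irq;	/* IRQ number chosen by the platform code */
};

void mock_write_config_byte(struct mock_pci_dev *dev, int where, uint8_t val)
{
	dev->config[where] = val;
}

uint8_t mock_read_config_byte(const struct mock_pci_dev *dev, int where)
{
	return dev->config[where];
}

/* Variant quoted from the patch: update the line only if a pin is advertised. */
void fixup_irq_line_if_pin(struct mock_pci_dev *dev)
{
	if (mock_read_config_byte(dev, PCI_INTERRUPT_PIN) != 0)
		mock_write_config_byte(dev, PCI_INTERRUPT_LINE, (uint8_t)dev->irq);
}

/* Variant suggested in the reply: always write the line register. */
void fixup_irq_line_always(struct mock_pci_dev *dev)
{
	mock_write_config_byte(dev, PCI_INTERRUPT_LINE, (uint8_t)dev->irq);
}

/* A device that reports no interrupt pin is skipped by the first variant. */
void check_irq_fixups(void)
{
	struct mock_pci_dev dev = { .irq = 39 };	/* pin register reads 0 */

	fixup_irq_line_if_pin(&dev);
	assert(mock_read_config_byte(&dev, PCI_INTERRUPT_LINE) == 0);

	fixup_irq_line_always(&dev);
	assert(mock_read_config_byte(&dev, PCI_INTERRUPT_LINE) == 39);
}
```

The sketch only illustrates why the conditional form leaves a device with a zero pin register untouched, which is the case reported earlier in the thread for the IDE controller. Whether the unconditional write is safe on other machines is a question for the real driver, not something this mock can show.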
From benh at kernel.crashing.org Mon Nov 15 09:15:29 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 15 Nov 2004 09:15:29 +1100 Subject: Booting Imac G5 : Wrong patch In-Reply-To: <1100446488.16435.67.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100446488.16435.67.camel@rapid> Message-ID: <1100470529.20512.148.camel@gaston> On Sun, 2004-11-14 at 16:34 +0100, J. Mayer wrote: > The previous patch I sent was buggy: > I added the shasta controler in the pmac_ide driver but made a mistake: > I missed an else statement so the controler was still recognized as an > Ohare one. > It's now well recognized (but uses k2 UDMA 100 timings) and runs in > UDMA2 mode instead of MDMA2. > > The corrected full patch is attached. I see that you added U/DMA 133 to the capabilities of the ide-pmac driver, but didn't update the timing tables... Have you checked those ? Also, it would be nice if you could avoid making lines go beyond 80 columns ;) Ben. From l_indien at magic.fr Mon Nov 15 09:31:06 2004 From: l_indien at magic.fr (J. 
Mayer) Date: Sun, 14 Nov 2004 23:31:06 +0100 Subject: Booting Imac G5 In-Reply-To: <1100470040.20593.143.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> Message-ID: <1100471466.16435.82.camel@rapid> On Sun, 2004-11-14 at 23:07, Benjamin Herrenschmidt wrote: > > Here's a quick overview of current kernel status: > > - Pmac PCI IRQ fix: needed for IDE & SATA on > > Imac G5 (should not break other targets). > > - Ethernet PHY failure if the cable is plugged > > during boot > > Ok, have to check out what's in darwin, there are some bits for the > Shasta chipset that I haven't adapted yet. OK. > > > - No RTC > > RTC is probably in the SMU... Yes, it's an i2c RTC clock in the SMU. I'm working to get SMU I2C work, looking to Apple's drivers. With luck, the RTC will be quite a standard one ;-) > > - No SMU management > > - Bad detection of frame-buffer virtual res in > > riva-fb: > > should use xres=1440 yres=900 & > > virtual_xres=1536 > > Ok, weird. Will check that when I get it. Does X "nv" driver work when > booting with offb ? xorg runs well when booting with ofb. The only issue is that it doesn't restore the OF original mode when exiting. The only problem I had was to find the right modeline to put in the monitor section to be able to use the native resolution. 
Here it is, if it can be useful: HorizSync 28.0-110.0 VertRefresh 43.0-90.0 Modeline "1440x900" 100.00 1440 1456 1464 1536 900 916 924 940 The xorg nvidia driver worked really well: it can use other resolutions (even if it does not seem useful), turn off the backlight, ... > > - Have to unplug/replug the USB keyboard > > after kernel boot to make it work with kernel > > 2.6.10-rc1. No such problem with 2.6.9. > > There have been various USB related issue in 2.6.10-rc*, have you tried > the latest bk ? Hum... I don't use bk. Can't compile it... > > - Lot's of segfaults occuring when multiple > > concurent processes are running, especially > > during compilations. Maybe the RAM is bad > > (I'm trying to port memtest86 to check this) > > or the CPU state is not completely saved / > > restored when rescheduling: it seems not > > to occur when compiling without doing > > anything else concurently. > > > > Do you have any idea about the last point ? Could this be Altivec > > context save / restore problems ? > > I very much doubt it has anything to do with CPU context > saving/restoring... Could be lots of different things, difficult to say > at this point. Thermal problem ? Clock chip setup problem ? Bad > RAMs ? ... I don't think it could be a thermal problem as the fans are running full speed when SMU is not managed. But it could be bad RAM. I have to check this point. OS X seems to run well, but I didn't stress it as much as I do with Linux ! > + */ > + pci_read_config_byte(pci_dev, PCI_INTERRUPT_PIN, &intpin); > + if (intpin != 0) > + pci_write_config_byte(pci_dev, PCI_INTERRUPT_LINE, pci_dev->irq); > > You don't need to read PCI_INTERRUPT_PIN, just unconditionally write to PCI_INTERRUPT_LINE... OK, I'll change this. -- J. Mayer Never organized From l_indien at magic.fr Mon Nov 15 09:37:48 2004 From: l_indien at magic.fr (J.
Mayer) Date: Sun, 14 Nov 2004 23:37:48 +0100 Subject: Booting Imac G5 : Wrong patch In-Reply-To: <1100470529.20512.148.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100446488.16435.67.camel@rapid> <1100470529.20512.148.camel@gaston> Message-ID: <1100471868.16435.90.camel@rapid> On Sun, 2004-11-14 at 23:15, Benjamin Herrenschmidt wrote: > On Sun, 2004-11-14 at 16:34 +0100, J. Mayer wrote: > > > The previous patch I sent was buggy: > > I added the shasta controler in the pmac_ide driver but made a mistake: > > I missed an else statement so the controler was still recognized as an > > Ohare one. > > It's now well recognized (but uses k2 UDMA 100 timings) and runs in > > UDMA2 mode instead of MDMA2. > > > > The corrected full patch is attached. > > I see that you added U/DMA 133 to the capabilities of the ide-pmac > driver, but didn't update the timing tables... Have you checked those ? You're right, I should have #if zeroed those two lines, 'cause I got no ideas of the needed timings... My first goal is to get the controller well recognized (ie not as an Ohare !). > Also, it would be nice if you could avoid making lines go beyond 80 > columns ;) You're right. I hate this too, in fact ! Should never cut & paste ;-) -- J. 
Mayer Never organized From benh at kernel.crashing.org Mon Nov 15 09:43:55 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 15 Nov 2004 09:43:55 +1100 Subject: Booting Imac G5 In-Reply-To: <1100471466.16435.82.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <1100471466.16435.82.camel@rapid> Message-ID: <1100472236.20593.155.camel@gaston> On Sun, 2004-11-14 at 23:31 +0100, J. Mayer wrote: > On Sun, 2004-11-14 at 23:07, Benjamin Herrenschmidt wrote: > > > Here's a quick overview of current kernel status: > > > - Pmac PCI IRQ fix: needed for IDE & SATA on > > > Imac G5 (should not break other targets). > > > - Ethernet PHY failure if the cable is plugged > > > during boot > > > > Ok, have to check out what's in darwin, there are some bits for the > > Shasta chipset that I haven't adapted yet. > > OK. > > > > > > - No RTC > > > > RTC is probably in the SMU... > > Yes, it's an i2c RTC clock in the SMU. I'm working to get SMU I2C work, > looking to Apple's drivers. > With luck, the RTC will be quite a standard one ;-) Well, the Apple SMU driver isn't open sourced, though you may find out how it works from the OF code... > > > > - No SMU management > > > - Bad detection of frame-buffer virtual res in > > > riva-fb: > > > should use xres=1440 yres=900 & > > > virtual_xres=1536 > > > > Ok, weird. Will check that when I get it. Does X "nv" driver work when > > booting with offb ? 
> > xorg runs well when booting with ofb. The only issue is that it doesn't > restore the OF original mode when exiting. The only problem I had was to > find the right modeline to put in the monitor section to be able to use > the native resolution. Here it is, if it can be useful: > HorizSync 28.0-110.0 > VertRefresh 43.0-90.0 > Modeline "1440x900" 100.00 1440 1456 1464 1536 900 916 924 > 940 > The xorg nvidia driver worker really well: it can use other resolutions > (even if it does not seem useful), turn of the backlight, ... Doesn't OF node for the display contain an EDID ? That can be parsed to generate a correct ModeLine.. > > > - Have to unplug/replug the USB keyboard > > > after kernel boot to make it work with kernel > > > 2.6.10-rc1. No such problem with 2.6.9. > > > > There have been various USB related issue in 2.6.10-rc*, have you tried > > the latest bk ? > > Hum... I don't use bk. Can't compile it... There are snapshots too, dayly or so > > > - Lot's of segfaults occuring when multiple > > > concurent processes are running, especially > > > during compilations. Maybe the RAM is bad > > > (I'm trying to port memtest86 to check this) > > > or the CPU state is not completely saved / > > > restored when rescheduling: it seems not > > > to occur when compiling without doing > > > anything else concurently. > > > > > > Do you have any idea about the last point ? Could this be Altivec > > > context save / restore problems ? > > > > I very much doubt it has anything to do with CPU context > > saving/restoring... Could be lots of different things, difficult to say > > at this point. Thermal problem ? Clock chip setup problem ? Bad > > RAMs ? ... > > I don't think it could be a thermal problem as the fans are running full > speed when SMU is not managed. But it could be bad RAM. I have to check > this point. OS X seems to run well, but I didn't stressed it a lot as I > do with Linux ! OS X tend to run well in lots of crappy cases, strangely. 
I think Linux somewhat stresses the HW more. From sfr at canb.auug.org.au Mon Nov 15 16:53:57 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Mon, 15 Nov 2004 16:53:57 +1100 Subject: [PATCH] PPC64 iSeries: don't share request queues in viocd Message-ID: <20041115165357.2e738704.sfr@canb.auug.org.au> Hi Andrew, This patch fixes the virtual cdrom driver to not share a single request queue. Sharing the queue causes an oops if you remove the module and more than one cdrom exists. Signed-off-by: Stephen Rothwell Please apply and send to Linus. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linux-2.6.9/drivers/cdrom/viocd.c linux-2.6.9-sfr.1/drivers/cdrom/viocd.c --- linux-2.6.9/drivers/cdrom/viocd.c 2004-10-19 07:55:35.000000000 +1000 +++ linux-2.6.9-sfr.1/drivers/cdrom/viocd.c 2004-11-15 16:29:36.000000000 +1100 @@ -154,7 +154,6 @@ #define DEVICE_NR(di) ((di) - &viocd_diskinfo[0]) -static request_queue_t *viocd_queue; static spinlock_t viocd_reqlock; #define MAX_CD_REQ 1 @@ -503,6 +502,18 @@ return ret; } +static void restart_all_queues(int first_index) +{ + int i; + + for (i = first_index + 1; i < viocd_numdev; i++) + if (viocd_diskinfo[i].viocd_disk) + blk_run_queue(viocd_diskinfo[i].viocd_disk->queue); + for (i = 0; i <= first_index; i++) + if (viocd_diskinfo[i].viocd_disk) + blk_run_queue(viocd_diskinfo[i].viocd_disk->queue); +} + /* This routine handles incoming CD LP events */ static void vio_handle_cd_event(struct HvLpEvent *event) { @@ -532,7 +543,7 @@ case viocdopen: if (event->xRc == 0) { di = &viocd_diskinfo[bevent->disk]; - blk_queue_hardsect_size(viocd_queue, + blk_queue_hardsect_size(di->viocd_disk->queue, bevent->block_size); set_capacity(di->viocd_disk, bevent->media_size * @@ -584,7 +595,7 @@ /* restart handling of incoming requests */ spin_unlock_irqrestore(&viocd_reqlock, flags); - blk_run_queue(viocd_queue); + restart_all_queues(bevent->disk); break; default: @@ -624,6 +635,7 @@ struct 
disk_info *d; struct cdrom_device_info *c; struct cdrom_info *ci; + struct request_queue *q; deviceno = vdev->unit_address; if (deviceno >= viocd_numdev) @@ -643,17 +655,22 @@ if (register_cdrom(c) != 0) { printk(VIOCD_KERN_WARNING "Cannot register viocd CD-ROM %s!\n", c->name); - return 0; + goto out; } printk(VIOCD_KERN_INFO "cd %s is iSeries resource %10.10s " "type %4.4s, model %3.3s\n", c->name, ci->rsrcname, ci->type, ci->model); + q = blk_init_queue(do_viocd_request, &viocd_reqlock); + if (q == NULL) { + printk(VIOCD_KERN_WARNING "Cannot allocate queue for %s!\n", + c->name); + goto out_unregister_cdrom; + } gendisk = alloc_disk(1); if (gendisk == NULL) { printk(VIOCD_KERN_WARNING "Cannot create gendisk for %s!\n", c->name); - unregister_cdrom(c); - return 0; + goto out_cleanup_queue; } gendisk->major = VIOCD_MAJOR; gendisk->first_minor = deviceno; @@ -661,7 +678,10 @@ sizeof(gendisk->disk_name)); snprintf(gendisk->devfs_name, sizeof(gendisk->devfs_name), VIOCD_DEVICE_DEVFS "%d", deviceno); - gendisk->queue = viocd_queue; + blk_queue_max_hw_segments(q, 1); + blk_queue_max_phys_segments(q, 1); + blk_queue_max_sectors(q, 4096 / 512); + gendisk->queue = q; gendisk->fops = &viocd_fops; gendisk->flags = GENHD_FL_CD|GENHD_FL_REMOVABLE; set_capacity(gendisk, 0); @@ -670,8 +690,14 @@ d->dev = &vdev->dev; gendisk->driverfs_dev = d->dev; add_disk(gendisk); - return 0; + +out_cleanup_queue: + blk_cleanup_queue(q); +out_unregister_cdrom: + unregister_cdrom(c); +out: + return -ENODEV; } static int viocd_remove(struct vio_dev *vdev) @@ -683,6 +709,7 @@ "Cannot unregister viocd CD-ROM %s!\n", d->viocd_info.name); del_gendisk(d->viocd_disk); + blk_cleanup_queue(d->viocd_disk->queue); put_disk(d->viocd_disk); return 0; } @@ -742,18 +769,10 @@ goto out_undo_vio; spin_lock_init(&viocd_reqlock); - viocd_queue = blk_init_queue(do_viocd_request, &viocd_reqlock); - if (viocd_queue == NULL) { - ret = -ENOMEM; - goto out_free_info; - } - blk_queue_max_hw_segments(viocd_queue, 1); - 
blk_queue_max_phys_segments(viocd_queue, 1); - blk_queue_max_sectors(viocd_queue, 4096 / 512); ret = vio_register_driver(&viocd_driver); if (ret) - goto out_cleanup_queue; + goto out_free_info; e = create_proc_entry("iSeries/viocd", S_IFREG|S_IRUGO, NULL); if (e) { @@ -763,8 +782,6 @@ return 0; -out_cleanup_queue: - blk_cleanup_queue(viocd_queue); out_free_info: dma_free_coherent(iSeries_vio_dev, sizeof(*viocd_unitinfo) * VIOCD_MAX_CD, @@ -781,7 +798,6 @@ { remove_proc_entry("iSeries/viocd", NULL); vio_unregister_driver(&viocd_driver); - blk_cleanup_queue(viocd_queue); if (viocd_unitinfo != NULL) dma_free_coherent(iSeries_vio_dev, sizeof(*viocd_unitinfo) * VIOCD_MAX_CD, -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041115/d7cb081e/attachment.pgp From l_indien at magic.fr Mon Nov 15 17:03:48 2004 From: l_indien at magic.fr (J. Mayer) Date: Mon, 15 Nov 2004 07:03:48 +0100 Subject: Booting Imac G5 In-Reply-To: <1100472236.20593.155.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <1100471466.16435.82.camel@rapid> <1100472236.20593.155.camel@gaston> Message-ID: <1100498628.16435.102.camel@rapid> On Sun, 2004-11-14 at 23:43, Benjamin Herrenschmidt wrote: > On Sun, 2004-11-14 at 23:31 +0100, J. 
Mayer wrote: > > On Sun, 2004-11-14 at 23:07, Benjamin Herrenschmidt wrote: [...] > > > > - No RTC > > > > > > RTC is probably in the SMU... > > > > Yes, it's an i2c RTC clock in the SMU. I'm working to get SMU I2C work, > > looking to Apple's drivers. > > With luck, the RTC will be quite a standard one ;-) > > Well, the Apple SMU driver isn't open sourced, though you may find out > how it works from the OF code... Yes, I saw... Will see if I can do something... > > > > - No SMU management > > > > - Bad detection of frame-buffer virtual res in > > > > riva-fb: > > > > should use xres=1440 yres=900 & > > > > virtual_xres=1536 > > > > > > Ok, weird. Will check that when I get it. Does X "nv" driver work when > > > booting with offb ? > > > > xorg runs well when booting with ofb. The only issue is that it doesn't > > restore the OF original mode when exiting. The only problem I had was to > > find the right modeline to put in the monitor section to be able to use > > the native resolution. Here it is, if it can be useful: > > HorizSync 28.0-110.0 > > VertRefresh 43.0-90.0 > > Modeline "1440x900" 100.00 1440 1456 1464 1536 900 916 924 > > 940 > > The xorg nvidia driver worker really well: it can use other resolutions > > (even if it does not seem useful), turn of the backlight, ... > > Doesn't OF node for the display contain an EDID ? That can be parsed to > generate a correct ModeLine.. Thanks for pointing this out. So, I now have a modeline for 1440x900-60 > > > > - Have to unplug/replug the USB keyboard > > > > after kernel boot to make it work with kernel > > > > 2.6.10-rc1. No such problem with 2.6.9. > > > > > > There have been various USB related issue in 2.6.10-rc*, have you tried > > > the latest bk ? > > > > Hum... I don't use bk. Can't compile it... > > There are snapshots too, dayly or so It's not such a big issue. And I know I'm using an rc, not a release. As it works with the 2.6.9 kernel, I'm quite certain it is or will be fixed.
> > > > - Lot's of segfaults occuring when multiple > > > > concurent processes are running, especially > > > > during compilations. Maybe the RAM is bad > > > > (I'm trying to port memtest86 to check this) > > > > or the CPU state is not completely saved / > > > > restored when rescheduling: it seems not > > > > to occur when compiling without doing > > > > anything else concurently. > > > > > > > > Do you have any idea about the last point ? Could this be Altivec > > > > context save / restore problems ? > > > > > > I very much doubt it has anything to do with CPU context > > > saving/restoring... Could be lots of different things, difficult to say > > > at this point. Thermal problem ? Clock chip setup problem ? Bad > > > RAMs ? ... > > > > I don't think it could be a thermal problem as the fans are running full > > speed when SMU is not managed. But it could be bad RAM. I have to check > > this point. OS X seems to run well, but I didn't stressed it a lot as I > > do with Linux ! > > OS X tend to run well in lots of crappy cases, strangely. I think Linux > somewhat stresses the HW more. I ported some parts of memtest86 to ppc64 as a userland program. It's certainly not the best way, but I'm now able to launch it as an init replacement. I only have a few tests for now and may add more, but it should show whether my RAM is bad (I would hate Apple if that's the case!). In the meantime, I implemented the timings for UDMA133, but I can't really test them, as Apple put a 40-pin cable on the SuperDrive. The good news is that it still works! I noticed hdparm doesn't like it: the drive reports buggy features... I may replace the cable when I'm not too lazy to reopen the box ;-) -- J.
Mayer Never organized From miltonm at bga.com Mon Nov 15 21:21:01 2004 From: miltonm at bga.com (Milton Miller) Date: Mon, 15 Nov 2004 04:21:01 -0600 (CST) Subject: [PATCH][PPC64] update_process_times simplification Message-ID: <200411151021.iAFAL1p3070901@sullivan.realtime.net> When the update_process_times call was moved out of do_timer for the UP case, the replicator didn't track down the hiding and just added ifndef SMP. This removes the ifdefs and the indirection of calling another file for one function in a third file. Untested. Signed-off-by: Milton Miller ===== arch/ppc64/kernel/smp.c 1.98 vs edited ===== --- 1.98/arch/ppc64/kernel/smp.c 2004-10-25 03:29:39 +02:00 +++ edited/arch/ppc64/kernel/smp.c 2004-11-15 10:11:09 +01:00 @@ -159,11 +159,6 @@ static void __init smp_space_timers(unsi } } -void smp_local_timer_interrupt(struct pt_regs * regs) -{ - update_process_times(user_mode(regs)); -} - void smp_message_recv(int msg, struct pt_regs *regs) { switch(msg) { ===== arch/ppc64/kernel/time.c 1.40 vs edited ===== --- 1.40/arch/ppc64/kernel/time.c 2004-10-20 10:37:07 +02:00 +++ edited/arch/ppc64/kernel/time.c 2004-11-15 10:13:07 +01:00 @@ -67,8 +67,6 @@ #include #include -void smp_local_timer_interrupt(struct pt_regs *); - u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES; EXPORT_SYMBOL(jiffies_64); @@ -258,8 +256,6 @@ int timer_interrupt(struct pt_regs * reg lpaca->lppaca.xIntDword.xFields.xDecrInt = 0; while (lpaca->next_jiffy_update_tb <= (cur_tb = get_tb())) { - -#ifdef CONFIG_SMP /* * We cannot disable the decrementer, so in the period * between this cpu's being marked offline in cpu_online_map @@ -268,8 +264,7 @@ int timer_interrupt(struct pt_regs * reg * is the case. */ if (!cpu_is_offline(cpu)) - smp_local_timer_interrupt(regs); -#endif + update_process_times(user_mode(regs)); /* * No need to check whether cpu is offline here; boot_cpuid * should have been fixed up by now. 
@@ -278,9 +273,6 @@ int timer_interrupt(struct pt_regs * reg write_seqlock(&xtime_lock); tb_last_stamp = lpaca->next_jiffy_update_tb; do_timer(regs); -#ifndef CONFIG_SMP - update_process_times(user_mode(regs)); -#endif timer_sync_xtime( cur_tb ); timer_check_rtc(); write_sequnlock(&xtime_lock); From segher at kernel.crashing.org Mon Nov 15 20:26:47 2004 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Mon, 15 Nov 2004 10:26:47 +0100 Subject: Booting Imac G5 In-Reply-To: <1100470040.20593.143.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> Message-ID: <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> >> - Ethernet PHY failure if the cable is plugged >> during boot > > Ok, have to check out what's in darwin, there are some bits for the > Shasta chipset that I haven't adapted yet. The PHY is sort-of new (it's on the Vesta chip). >> - No RTC > > RTC is probably in the SMU... It is. >> - Have to unplug/replug the USB keyboard >> after kernel boot to make it work with kernel >> 2.6.10-rc1. No such problem with 2.6.9. > > There have been various USB related issue in 2.6.10-rc*, have you tried > the latest bk ? Do you have an url for that? USB is misbehaving on my 7,2 as well. >> - Lot's of segfaults occuring when multiple >> concurent processes are running, especially >> during compilations. 
Maybe the RAM is bad >> (I'm trying to port memtest86 to check this) >> or the CPU state is not completely saved / >> restored when rescheduling: it seems not >> to occur when compiling without doing >> anything else concurently. >> >> Do you have any idea about the last point ? Could this be Altivec >> context save / restore problems ? > > I very much doubt it has anything to do with CPU context > saving/restoring... Could be lots of different things, difficult to say > at this point. Thermal problem ? Clock chip setup problem ? Bad > RAMs ? ... Doesn't sound like a hardware problem, as it only occurs if he is running multiple processes... Or maybe it is... try running without X? Segher From hch at infradead.org Mon Nov 15 22:34:10 2004 From: hch at infradead.org (Christoph Hellwig) Date: Mon, 15 Nov 2004 11:34:10 +0000 Subject: [PATCH] PPC64 iSeries: don't share request queues in viocd In-Reply-To: <20041115165357.2e738704.sfr@canb.auug.org.au> References: <20041115165357.2e738704.sfr@canb.auug.org.au> Message-ID: <20041115113410.GA14471@infradead.org> On Mon, Nov 15, 2004 at 04:53:57PM +1100, Stephen Rothwell wrote: > Hi Andrew, > > This patch fixes the virtual cdrom driver to not share a single request > queue. Sharing the queue causes an oops if you remove the module and more > than one cdrom exists. Maybe you should fix that underlying bug? Queues are supposed to be shareable. 
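[Editor's note] The restart_all_queues() helper in Stephen's viocd patch earlier in this thread walks the per-device queues starting just after the device that completed a request and wraps around to finish with that device, presumably so that the other drives get a turn first. The following is only a userspace sketch of that visiting order, not driver code: restart_order(), has_disk and NUM_DEV are hypothetical names invented for illustration, and the real function calls blk_run_queue() on each queue instead of recording an index.

```c
#include <stddef.h>

#define NUM_DEV 4	/* hypothetical number of device slots */

/*
 * Record the order in which queues would be restarted.
 * has_disk[i] is nonzero when slot i has a disk (a stand-in for
 * viocd_diskinfo[i].viocd_disk != NULL in the patch).
 * Returns the number of slots written to order[].
 */
static int restart_order(const int *has_disk, int numdev,
			 int first_index, int *order)
{
	int i, n = 0;

	/* First the slots after the one that just completed a request... */
	for (i = first_index + 1; i < numdev; i++)
		if (has_disk[i])
			order[n++] = i;
	/* ...then wrap around, ending with first_index itself. */
	for (i = 0; i <= first_index; i++)
		if (has_disk[i])
			order[n++] = i;
	return n;
}
```

Calling restart_order() with has_disk = {1, 1, 0, 1} and first_index = 1 visits slots 3, 0 and 1 in that order; the empty slot 2 is skipped.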
From nickpiggin at yahoo.com.au Mon Nov 15 22:44:33 2004 From: nickpiggin at yahoo.com.au (Nick Piggin) Date: Mon, 15 Nov 2004 22:44:33 +1100 Subject: [PATCH] PPC64 iSeries: don't share request queues in viocd In-Reply-To: <20041115113410.GA14471@infradead.org> References: <20041115165357.2e738704.sfr@canb.auug.org.au> <20041115113410.GA14471@infradead.org> Message-ID: <419896A1.50605@yahoo.com.au> Christoph Hellwig wrote: > On Mon, Nov 15, 2004 at 04:53:57PM +1100, Stephen Rothwell wrote: > >>Hi Andrew, >> >>This patch fixes the virtual cdrom driver to not share a single request >>queue. Sharing the queue causes an oops if you remove the module and more >>than one cdrom exists. > > > Maybe you should fix that underlying bug? Queues are supposed to be > shareable. > I think shared queues are actually quite fundamentally broken at the moment (as pointed out to me by Al). It stems from the refcounting / conceptual relationship between a gendisk and a queue (I think; it's been a while since I looked at the code). I had something which just about fixed it up except that I couldn't work out an appropriate place and name for the "queue" in the sysfs hierarchy (IIRC I just had it as a sequentially increasing number, in /sys/block/). It is a relationship that I don't think sysfs can capture very well: queues are shared between multiple other objects, but they have no meaning outside the context of one of these objects. From ivan at vmfacility.fr Mon Nov 15 23:07:52 2004 From: ivan at vmfacility.fr (Ivan Warren) Date: Mon, 15 Nov 2004 13:07:52 +0100 Subject: Issue with ppc64/vibmscsi In-Reply-To: Message-ID: <016101c4cb0b$bbdc43f0$ad0aff51@vmfacility.fr> Folks, It seems (in my experience) that the pSeries AIX based Virtual I/O Server (at release 1.1.1.20) has a limit of 128K (256 sectors) request size when performing I/Os over the Virtual SCSI interface (using the SRP/RDMA hypervisor based transport).
The symptom is that some large I/Os will fail the adapter (putting it offline). The Virtual I/O server indicates an errlog entry of type "37DDE80C" (Misbehaved Virtual SCSI Client). I couldn't find any documentation on any fixed request size limit anywhere. I tried the following, which works around the problem (in my particular case): set max_sectors=256 in the scsi_host_template in ibmvscsi.c. Of course, this is a bit .. brutal .. (it would limit the transfer size in environments that probably do not need it, such as when talking to an iSeries OS/400 or i5/OS partition) - And eventually, the limit in VIOS may go away (if, as I suspect, this is an actual problem in the current VIOS code). So a config statement may be appropriate.. Or maybe just wait for a new VIOS release or set of PTFs.. --Ivan From l_indien at magic.fr Mon Nov 15 23:45:07 2004 From: l_indien at magic.fr (J. Mayer) Date: Mon, 15 Nov 2004 13:45:07 +0100 Subject: Booting Imac G5 In-Reply-To: <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> Message-ID: <1100522706.16435.203.camel@rapid> On Mon, 2004-11-15 at 10:26, Segher Boessenkool wrote: > >> - Ethernet PHY failure if the cable is plugged > >> during boot > > > > Ok, have to check out what's in darwin, there are some bits for the > > Shasta chipset that I haven't adapted yet.
> > The PHY is sort-of new (it's on the Vesta chip). > > >> - No RTC > > > > RTC is probably in the SMU... > > It is. Apple says it's an external RTC in the developer notes. The problem is getting an I2C driver for the SMU. With the Forth code, it seems that it can be done ;-) [...] > >> - Lot's of segfaults occuring when multiple > >> concurent processes are running, especially > >> during compilations. Maybe the RAM is bad > >> (I'm trying to port memtest86 to check this) > >> or the CPU state is not completely saved / > >> restored when rescheduling: it seems not > >> to occur when compiling without doing > >> anything else concurently. > >> > >> Do you have any idea about the last point ? Could this be Altivec > >> context save / restore problems ? > > > > I very much doubt it has anything to do with CPU context > > saving/restoring... Could be lots of different things, difficult to say > > at this point. Thermal problem ? Clock chip setup problem ? Bad > > RAMs ? ... > > Doesn't sound like a hardware problem, as it only occurs if he is > running multiple processes... I checked my RAM for more than two hours: I ran a memory checker I wrote, based on memtest86, which locks as much memory as it can and tests it in a loop. It is a standalone program, not using the libc, launched by a size-reduced Linux kernel (~1MB of code). So, I've been able to check 480 MB of RAM (I have 512 MB) without any fault. Each test suite took about 20 minutes, and it ran 7 times before I stopped it. As I get a lot of segfaults when compiling, I can't believe it's a hardware CPU/cache/RAM problem when not a single error occurred during the memory check. > Or maybe it is... try running without X? I can hardly run X. I can launch it, but it's not really usable, as forking to launch xterms, etc., makes it crash quickly. I mainly use the iMac via ssh for now... -- J.
Mayer Never organized From segher at kernel.crashing.org Mon Nov 15 23:48:24 2004 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Mon, 15 Nov 2004 13:48:24 +0100 Subject: Booting Imac G5 In-Reply-To: <1100522706.16435.203.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1100522706.16435.203.camel@rapid> Message-ID: >>> RTC is probably in the SMU... >> >> It is. > > Apple say it's an external RTC in the developper notes. The problem is > to get an I2C driver for the SMU. With the forth code, seems that it > can > be made ;-) As far as I know you just ask the SMU the time, you don't have to talk to the IIC yourself. Or maybe that has changed... checking... no, it hasn't (the actual commands did change, though). Segher From anton at samba.org Tue Nov 16 02:56:51 2004 From: anton at samba.org (Anton Blanchard) Date: Tue, 16 Nov 2004 02:56:51 +1100 Subject: Issue with ppc64/vibmscsi In-Reply-To: <016101c4cb0b$bbdc43f0$ad0aff51@vmfacility.fr> References: <016101c4cb0b$bbdc43f0$ad0aff51@vmfacility.fr> Message-ID: <20041115155650.GL9180@krispykreme.ozlabs.ibm.com> Hi Ivan, > It seems (by my experience) that the pSeries AIX based Virtual I/O Server > (at release 1.1.1.20) has a limit of 128K (256 sectors) request size when > performing I/Os over the Virtual SCSI interface (using the SRP/RDMA > hypervisor based transport). 
> > The symptom is that some large I/Os will fail the adapter (putting it > offline). > > The Virtual I/O server indicates an errlog entry of type "37DDE80C" > (Misbehaved Virtual SCSI Client) > > I couldn't find any documentation on any fixed request size limit anywhere. > > I attempted this, and this circumvents the problem (in my particular case) : > > Set max_sectors=256 to the scsi_host_template in ibmvscsi.c > > Of course, this is a bit .. brutal .. (it would limit the transfer size in > environments that probably do not need it, such as when talking to an > iSeries OS/400 or i5/OS partition) - And eventually, the limit in VIOS may > go away (if, as I suspect, this is an actual problem in the current VIOS > code). > > So a config statement may be appropriate.. Or maybe just wait for a new VIOS > release or set pf PTFs.. I agree it's an AIX bug. Dave, Santi, does this sound familiar to you? Anton From santil at us.ibm.com Tue Nov 16 04:55:40 2004 From: santil at us.ibm.com (Santiago Leon) Date: Mon, 15 Nov 2004 11:55:40 -0600 Subject: Issue with ppc64/vibmscsi In-Reply-To: <20041115155650.GL9180@krispykreme.ozlabs.ibm.com> References: <016101c4cb0b$bbdc43f0$ad0aff51@vmfacility.fr> <20041115155650.GL9180@krispykreme.ozlabs.ibm.com> Message-ID: <4198ED9C.4030504@us.ibm.com> Anton Blanchard wrote: > I agree its an AIX bug. Dave, Santi, does this sound familiar to you? > > Anton > Actually, it's a vscsi client driver bug :)... here's a patch that will fix it... -- Santiago A. Leon Power Linux Development IBM Linux Technology Center -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: ltc12082.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041115/ab7e6c8e/attachment.txt From ivan at vmfacility.fr Tue Nov 16 05:20:17 2004 From: ivan at vmfacility.fr (Ivan Warren) Date: Mon, 15 Nov 2004 19:20:17 +0100 Subject: Issue with ppc64/vibmscsi In-Reply-To: <4198ED9C.4030504@us.ibm.com> Message-ID: <01c201c4cb3f$c231ac70$ad0aff51@vmfacility.fr> > > Actually, it's a vscsi client driver bug :)... here's a patch > that will > fix it... > > -- > Santiago A. Leon > Power Linux Development > IBM Linux Technology Center > Works for me ! scsi0 : IBM POWER Virtual SCSI Adapter 1.5.2 ibmvscsi: partner initialization complete ibmvscsi: SRP_LOGIN succeeded ibmvscsi: host srp version: 16.a, host partition VIOS1 (1), OS 3, max io 131072 Using anticipatory io scheduler Vendor: AIX Model: VDASD Rev: Type: Direct-Access ANSI SCSI revision: 03 No more I/O errors during large I/Os ! Many thanks ! --Ivan Note : Never mind the 1.5.2... I had to hand-apply the patch because the version was 1.5.1 (2.6.9 kernel).. From anton at samba.org Tue Nov 16 05:25:39 2004 From: anton at samba.org (Anton Blanchard) Date: Tue, 16 Nov 2004 05:25:39 +1100 Subject: Issue with ppc64/vibmscsi In-Reply-To: <4198ED9C.4030504@us.ibm.com> References: <016101c4cb0b$bbdc43f0$ad0aff51@vmfacility.fr> <20041115155650.GL9180@krispykreme.ozlabs.ibm.com> <4198ED9C.4030504@us.ibm.com> Message-ID: <20041115182539.GN9180@krispykreme.ozlabs.ibm.com> > Actually, it's a vscsi client driver bug :)... here's a patch that will > fix it... Nice, does this need to be sent onto Linus? Be good to get it into 2.6.10. Anton From nfont at austin.ibm.com Tue Nov 16 06:14:28 2004 From: nfont at austin.ibm.com (Nathan Fontenot) Date: Mon, 15 Nov 2004 13:14:28 -0600 Subject: [PATCH] read_slot_reset_state2 rtas call Message-ID: <41990014.1060404@austin.ibm.com> Paul or Anton, Could you please forward upstream. 
The EEH code needs a small update to start using the ibm,read-slot-reset-state2 rtas call if available. The currently used ibm,read-slot-reset-state call will be going away in the near future. This patch attempts to use the newer rtas call if available and falls back to the older version otherwise. This will maintain EEH slot checking capabilities on all future and current firmware levels. Signed-off-by: Nathan Fontenot -- Nathan Fontenot Power Linux Platform Serviceability Home: IBM Austin 908/1E-036 Phone: 512.838.3377 (T/L 678.3377) Email: nfont at austin.ibm.com -------------- next part -------------- A non-text attachment was scrubbed... Name: read_slot_reset_state2-linux.patch Type: text/x-patch Size: 2714 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041115/02fadccd/attachment.bin From andmike at us.ibm.com Tue Nov 16 06:12:43 2004 From: andmike at us.ibm.com (Mike Anderson) Date: Mon, 15 Nov 2004 11:12:43 -0800 Subject: Issue with ppc64/vibmscsi In-Reply-To: <20041115182539.GN9180@krispykreme.ozlabs.ibm.com> References: <016101c4cb0b$bbdc43f0$ad0aff51@vmfacility.fr> <20041115155650.GL9180@krispykreme.ozlabs.ibm.com> <4198ED9C.4030504@us.ibm.com> <20041115182539.GN9180@krispykreme.ozlabs.ibm.com> Message-ID: <20041115191243.GB8759@us.ibm.com> Anton Blanchard [anton at samba.org] wrote: > > > Actually, it's a vscsi client driver bug :)... here's a patch that will > > fix it... > > Nice, does this need to be sent onto Linus? Be good to get it into > 2.6.10. > or send it to James Bottomley to get it added to -rc1 fixes.
http://marc.theaimsgroup.com/?t=110046745200002&r=1&w=2 -andmike -- Michael Anderson andmike at us.ibm.com From sleddog at us.ibm.com Tue Nov 16 07:48:21 2004 From: sleddog at us.ibm.com (Dave C Boutcher) Date: Mon, 15 Nov 2004 14:48:21 -0600 Subject: Issue with ppc64/vibmscsi In-Reply-To: <20041115191243.GB8759@us.ibm.com> References: <016101c4cb0b$bbdc43f0$ad0aff51@vmfacility.fr> <20041115155650.GL9180@krispykreme.ozlabs.ibm.com> <4198ED9C.4030504@us.ibm.com> <20041115182539.GN9180@krispykreme.ozlabs.ibm.com> <20041115191243.GB8759@us.ibm.com> Message-ID: <20041115204821.GA2709@cs.umn.edu> On Mon, Nov 15, 2004 at 11:12:43AM -0800, Mike Anderson wrote: > Anton Blanchard [anton at samba.org] wrote: > > > > > Actually, it's a vscsi client driver bug :)... here's a patch that will > > > fix it... > > > > Nice, does this need to be sent onto Linus? Be good to get it into > > 2.6.10. > > > > or send it to James Bottomley to get it added to -rc1 fixes. Ya, I have a couple of fixes queued up to go... I've been travelling so I'm a little behind. -- Dave Boutcher From sfr at canb.auug.org.au Tue Nov 16 11:49:21 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 16 Nov 2004 11:49:21 +1100 Subject: [PATCH] PPC64: add missing braces to rtc driver Message-ID: <20041116114921.0df6d838.sfr@canb.auug.org.au> Hi Andrew, This patch fixes the PPC64 rtc driver where a pair of braces was missing. Not a big bug, but a bug nonetheless. Also, while I was there, use C99 initialisers.
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus-bk/arch/ppc64/kernel/rtc.c linus-bk-rtc.1/arch/ppc64/kernel/rtc.c --- linus-bk/arch/ppc64/kernel/rtc.c 2004-11-12 09:09:48.000000000 +1100 +++ linus-bk-rtc.1/arch/ppc64/kernel/rtc.c 2004-11-10 01:28:56.000000000 +1100 @@ -185,11 +185,10 @@ .release = rtc_release, }; -static struct miscdevice rtc_dev= -{ - RTC_MINOR, - "rtc", - &rtc_fops +static struct miscdevice rtc_dev = { + .minor = RTC_MINOR, + .name = "rtc", + .fops = &rtc_fops }; static int __init rtc_init(void) @@ -201,9 +200,11 @@ return retval; #ifdef CONFIG_PROC_FS - if (create_proc_read_entry ("driver/rtc", 0, NULL, rtc_read_proc, NULL) == NULL) + if (create_proc_read_entry("driver/rtc", 0, NULL, rtc_read_proc, NULL) + == NULL) { misc_deregister(&rtc_dev); return -ENOMEM; + } #endif printk(KERN_INFO "i/pSeries Real Time Clock Driver v" RTC_VERSION "\n"); From sfr at canb.auug.org.au Tue Nov 16 16:54:03 2004 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 16 Nov 2004 16:54:03 +1100 Subject: [PATCH] PPC64 iSeries: fix viodasd remove Message-ID: <20041116165403.7fd5a83c.sfr@canb.auug.org.au> Hi Andrew, This patch just makes sure that we do not dereference a viodasd gendisk pointer after it has been freed.
Signed-off-by: Stephen Rothwell -- Cheers, Stephen Rothwell sfr at canb.auug.org.au diff -ruNp linus-bk/drivers/block/viodasd.c linus-bk-viodasd.1/drivers/block/viodasd.c --- linus-bk/drivers/block/viodasd.c 2004-06-30 15:40:03.000000000 +1000 +++ linus-bk-viodasd.1/drivers/block/viodasd.c 2004-11-16 16:36:00.000000000 +1100 @@ -764,8 +764,8 @@ static int viodasd_remove(struct vio_dev d = &viodasd_devices[vdev->unit_address]; if (d->disk) { del_gendisk(d->disk); - put_disk(d->disk); blk_cleanup_queue(d->disk->queue); + put_disk(d->disk); d->disk = NULL; } d->dev = NULL; From olof at austin.ibm.com Wed Nov 17 06:01:54 2004 From: olof at austin.ibm.com (Olof Johansson) Date: Tue, 16 Nov 2004 13:01:54 -0600 Subject: [PATCH] [PPC64] Fix iSeries build (lparcfg) Message-ID: <20041116190154.GA20952@4> Hi, Andrew, please apply: Jeff Scheel's addition of PURR reporting in lparcfg conflicted with Stephen's cleanup of the iSeries namespace. This fixes the build break.
Signed-off-by: Olof Johansson --- linux-2.5-olof/arch/ppc64/kernel/lparcfg.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff -puN arch/ppc64/kernel/lparcfg.c~lparcfg-buildfix arch/ppc64/kernel/lparcfg.c --- linux-2.5/arch/ppc64/kernel/lparcfg.c~lparcfg-buildfix 2004-11-16 12:24:17.064191383 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/lparcfg.c 2004-11-16 12:24:17.070192742 -0600 @@ -82,11 +82,11 @@ static unsigned long get_purr(void) for_each_cpu(cpu) { lpaca = paca + cpu; - sum_purr += lpaca->xLpPaca.xEmulatedTimeBase; + sum_purr += lpaca->lppaca.xEmulatedTimeBase; #ifdef PURR_DEBUG printk(KERN_INFO "get_purr for cpu (%d) has value (%ld) \n", - cpu, lpaca->xLpPaca.xEmulatedTimeBase); + cpu, lpaca->lppaca.xEmulatedTimeBase); #endif } return sum_purr; _ From olof at austin.ibm.com Wed Nov 17 08:02:01 2004 From: olof at austin.ibm.com (Olof Johansson) Date: Tue, 16 Nov 2004 15:02:01 -0600 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids Message-ID: <20041116210201.GA7368@localhost.localdomain> Hi, Below patch changes the early CPU spinup code to be based on physical CPU ID instead of logical. This will make it possible to kexec off of a different cpu than 0, for example after it's been hot-unplugged. The booted cpu will still be mapped as logical cpu 0, since there's various stuff in the early boot that assumes logical boot cpuid is 0. Also, it expands the kexec boot param structure to allow the booted physical cpuid to be passed in. This includes bumping the version number to 2 for backwards compat. 
Signed-off-by: Olof Johansson --- linux-2.5-olof/arch/ppc64/kernel/asm-offsets.c | 1 linux-2.5-olof/arch/ppc64/kernel/head.S | 62 ++++++++++++++++--------- linux-2.5-olof/arch/ppc64/kernel/pacaData.c | 1 linux-2.5-olof/arch/ppc64/kernel/prom.c | 17 +++++- linux-2.5-olof/arch/ppc64/kernel/prom_init.c | 4 - linux-2.5-olof/arch/ppc64/kernel/setup.c | 21 ++++++++ linux-2.5-olof/include/asm-ppc64/prom.h | 2 linux-2.5-olof/include/asm-ppc64/smp.h | 1 8 files changed, 81 insertions(+), 28 deletions(-) diff -puN arch/ppc64/kernel/asm-offsets.c~boot-cpuid arch/ppc64/kernel/asm-offsets.c --- linux-2.5/arch/ppc64/kernel/asm-offsets.c~boot-cpuid 2004-11-16 12:41:26.546908234 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/asm-offsets.c 2004-11-16 13:24:49.372405523 -0600 @@ -103,6 +103,7 @@ int main(void) DEFINE(PACA_EXDSI, offsetof(struct paca_struct, exdsi)); DEFINE(PACAEMERGSP, offsetof(struct paca_struct, emergency_sp)); DEFINE(PACALPPACA, offsetof(struct paca_struct, lppaca)); + DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id)); DEFINE(LPPACASRR0, offsetof(struct ItLpPaca, xSavedSrr0)); DEFINE(LPPACASRR1, offsetof(struct ItLpPaca, xSavedSrr1)); DEFINE(LPPACAANYINT, offsetof(struct ItLpPaca, xIntDword.xAnyInt)); diff -puN arch/ppc64/kernel/head.S~boot-cpuid arch/ppc64/kernel/head.S --- linux-2.5/arch/ppc64/kernel/head.S~boot-cpuid 2004-11-16 12:41:26.548908679 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/head.S 2004-11-16 13:20:02.404741718 -0600 @@ -26,6 +26,7 @@ #define SECONDARY_PROCESSORS #include +#include #include #include #include @@ -1192,7 +1193,7 @@ unrecov_slb: /* * On pSeries, secondary processors spin in the following code. - * At entry, r3 = this processor's number (in Linux terms, not hardware). 
+ * At entry, r3 = this processor's number (physical cpu id) */ _GLOBAL(pseries_secondary_smp_init) mr r24,r3 @@ -1204,13 +1205,27 @@ _GLOBAL(pseries_secondary_smp_init) /* Copy some CPU settings from CPU 0 */ bl .__restore_cpu_setup - /* Set up a paca value for this processor. */ - LOADADDR(r5, paca) /* Get base vaddr of paca array */ - mulli r13,r24,PACA_SIZE /* Calculate vaddr of right paca */ - add r13,r13,r5 /* for this processor. */ - mtspr SPRG3,r13 /* Save vaddr of paca in SPRG3 */ -1: - HMT_LOW + /* Set up a paca value for this processor. Since we have the + * physical cpu id in r3, we need to search the pacas to find + * which logical id maps to our physical one. + */ + LOADADDR(r13, paca) /* Get base vaddr of paca array */ + li r5,0 /* logical cpu id */ +1: lhz r6,PACAHWCPUID(r13) /* Load HW procid from paca */ + cmpw r6,r24 /* Compare to our id */ + beq 2f + addi r13,r13,PACA_SIZE /* Loop to next PACA on miss */ + addi r5,r5,1 + cmpwi r5,NR_CPUS + blt 1b + +99: HMT_LOW /* Couldn't find our CPU id */ + b 99b + +2: mtspr SPRG3,r13 /* Save vaddr of paca in SPRG3 */ + /* From now on, r24 is expected to be logical cpuid */ + mr r24,r5 +3: HMT_LOW lbz r23,PACAPROCSTART(r13) /* Test if this processor should */ /* start. */ sync @@ -1225,7 +1240,7 @@ _GLOBAL(pseries_secondary_smp_init) bne .__secondary_start #endif #endif - b 1b /* Loop until told to go */ + b 3b /* Loop until told to go */ #ifdef CONFIG_PPC_ISERIES _STATIC(__start_initialization_iSeries) /* Clear out the BSS */ @@ -1921,19 +1936,6 @@ _STATIC(start_here_multiplatform) bl .__save_cpu_setup sync -#ifdef CONFIG_SMP - /* All secondary cpus are now spinning on a common - * spinloop, release them all now so they can start - * to spin on their individual paca spinloops. - * For non SMP kernels, the secondary cpus never - * get out of the common spinloop. 
- */ - li r3,1 - LOADADDR(r5,__secondary_hold_spinloop) - tophys(r4,r5) - std r3,0(r4) -#endif - /* Setup a valid physical PACA pointer in SPRG3 for early_setup * note that boot_cpuid can always be 0 nowadays since there is * nowhere it can be initialized differently before we reach this @@ -2131,6 +2133,22 @@ _GLOBAL(hmt_start_secondary) blr #endif +#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) +_GLOBAL(smp_release_cpus) + /* All secondary cpus are spinning on a common + * spinloop, release them all now so they can start + * to spin on their individual paca spinloops. + * For non SMP kernels, the secondary cpus never + * get out of the common spinloop. + */ + li r3,1 + LOADADDR(r5,__secondary_hold_spinloop) + std r3,0(r5) + sync + blr +#endif /* CONFIG_SMP && !CONFIG_PPC_ISERIES */ + + /* * We put a few things here that have to be page-aligned. * This stuff goes at the beginning of the data segment, diff -puN arch/ppc64/kernel/pacaData.c~boot-cpuid arch/ppc64/kernel/pacaData.c --- linux-2.5/arch/ppc64/kernel/pacaData.c~boot-cpuid 2004-11-16 12:41:26.551909346 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/pacaData.c 2004-11-16 12:41:26.572914016 -0600 @@ -58,6 +58,7 @@ extern unsigned long __toc_start; .stab_real = (asrr), /* Real pointer to segment table */ \ .stab_addr = (asrv), /* Virt pointer to segment table */ \ .cpu_start = (start), /* Processor start */ \ + .hw_cpu_id = 0xffff, \ .lppaca = { \ .xDesc = 0xd397d781, /* "LpPa" */ \ .xSize = sizeof(struct ItLpPaca), \ diff -puN arch/ppc64/kernel/prom.c~boot-cpuid arch/ppc64/kernel/prom.c --- linux-2.5/arch/ppc64/kernel/prom.c~boot-cpuid 2004-11-16 12:41:26.554910013 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/prom.c 2004-11-16 12:41:26.573914239 -0600 @@ -853,10 +853,19 @@ static int __init early_init_dt_scan_cpu } } - /* Check if it's the boot-cpu, set it's hw index in paca now */ - if (get_flat_dt_prop(node, "linux,boot-cpu", NULL) != NULL) { - u32 *prop = get_flat_dt_prop(node, "reg", NULL); - 
paca[0].hw_cpu_id = prop == NULL ? 0 : *prop; + if (initial_boot_params && initial_boot_params->version >= 2) { + /* version 2 of the kexec param format adds the phys cpuid + * of booted proc. + */ + boot_cpuid_phys = initial_boot_params->boot_cpuid_phys; + boot_cpuid = 0; + } else { + /* Check if it's the boot-cpu, set it's hw index in paca now */ + if (get_flat_dt_prop(node, "linux,boot-cpu", NULL) != NULL) { + u32 *prop = get_flat_dt_prop(node, "reg", NULL); + set_hard_smp_processor_id(0, prop == NULL ? 0 : *prop); + boot_cpuid_phys = get_hard_smp_processor_id(0); + } } return 0; diff -puN arch/ppc64/kernel/prom_init.c~boot-cpuid arch/ppc64/kernel/prom_init.c --- linux-2.5/arch/ppc64/kernel/prom_init.c~boot-cpuid 2004-11-16 12:41:26.556910458 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/prom_init.c 2004-11-16 12:41:26.575914683 -0600 @@ -992,13 +992,13 @@ static void __init prom_hold_cpus(void) /* Primary Thread of non-boot cpu */ prom_printf("%x : starting cpu hw idx %x... ", cpuid, reg); call_prom("start-cpu", 3, 0, node, - secondary_hold, cpuid); + secondary_hold, reg); for ( i = 0 ; (i < 100000000) && (*acknowledge == ((unsigned long)-1)); i++ ) mb(); - if (*acknowledge == cpuid) { + if (*acknowledge == reg) { prom_printf("done\n"); /* We have to get every CPU out of OF, * even if we never start it. 
*/ diff -puN arch/ppc64/kernel/setup.c~boot-cpuid arch/ppc64/kernel/setup.c --- linux-2.5/arch/ppc64/kernel/setup.c~boot-cpuid 2004-11-16 12:41:26.559911125 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/setup.c 2004-11-16 13:22:53.060669846 -0600 @@ -99,6 +99,8 @@ extern void htab_initialize(void); extern void early_init_devtree(void *flat_dt); extern void unflatten_device_tree(void); +extern void smp_release_cpus(void); + unsigned long decr_overclock = 1; unsigned long decr_overclock_proc0 = 1; unsigned long decr_overclock_set = 0; @@ -106,6 +108,7 @@ unsigned long decr_overclock_proc0_set = int have_of = 1; int boot_cpuid = 0; +int boot_cpuid_phys = 0; dev_t boot_dev; /* @@ -242,6 +245,7 @@ static void __init setup_cpu_maps(void) { struct device_node *dn = NULL; int cpu = 0; + int swap_cpuid = 0; check_smt_enabled(); @@ -266,11 +270,23 @@ static void __init setup_cpu_maps(void) cpu_set(cpu, cpu_present_map); set_hard_smp_processor_id(cpu, intserv[j]); } + if (intserv[j] == boot_cpuid_phys) + swap_cpuid = cpu; cpu_set(cpu, cpu_possible_map); cpu++; } } + /* Swap CPU id 0 with boot_cpuid_phys, so we can always assume that + * boot cpu is logical 0. + */ + if (boot_cpuid_phys != get_hard_smp_processor_id(0)) { + u32 tmp; + tmp = get_hard_smp_processor_id(0); + set_hard_smp_processor_id(0, boot_cpuid_phys); + set_hard_smp_processor_id(swap_cpuid, tmp); + } + /* * On pSeries LPAR, we need to know how many cpus * could possibly be added to this partition. @@ -630,6 +646,11 @@ void __init setup_system(void) * iSeries has already initialized the cpu maps at this point. 
*/ setup_cpu_maps(); + + /* Release secondary cpus out of their spinloops at 0x60 now that + * we can map physical -> logical CPU ids + */ + smp_release_cpus(); #endif /* defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) */ printk("Starting Linux PPC64 %s\n", UTS_RELEASE); diff -puN include/asm-ppc64/prom.h~boot-cpuid include/asm-ppc64/prom.h --- linux-2.5/include/asm-ppc64/prom.h~boot-cpuid 2004-11-16 12:41:26.561911570 -0600 +++ linux-2.5-olof/include/asm-ppc64/prom.h 2004-11-16 12:41:26.577915128 -0600 @@ -56,6 +56,8 @@ struct boot_param_header u32 off_mem_rsvmap; /* offset to memory reserve map */ u32 version; /* format version */ u32 last_comp_version; /* last compatible version */ + /* version 2 fields below */ + u32 boot_cpuid_phys; /* Which physical CPU id we're booting on */ }; diff -puN include/asm-ppc64/smp.h~boot-cpuid include/asm-ppc64/smp.h --- linux-2.5/include/asm-ppc64/smp.h~boot-cpuid 2004-11-16 12:41:26.564912237 -0600 +++ linux-2.5-olof/include/asm-ppc64/smp.h 2004-11-16 12:41:26.577915128 -0600 @@ -27,6 +27,7 @@ #include extern int boot_cpuid; +extern int boot_cpuid_phys; extern void cpu_die(void) __attribute__((noreturn)); _ From benh at kernel.crashing.org Wed Nov 17 09:54:48 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 17 Nov 2004 09:54:48 +1100 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <20041116210201.GA7368@localhost.localdomain> References: <20041116210201.GA7368@localhost.localdomain> Message-ID: <1100645688.14625.70.camel@gaston> On Tue, 2004-11-16 at 15:02 -0600, Olof Johansson wrote: > Hi, > > Below patch changes the early CPU spinup code to be based on physical > CPU ID instead of logical. This will make it possible to kexec off of > a different cpu than 0, for example after it's been hot-unplugged. > > The booted cpu will still be mapped as logical cpu 0, since there's > various stuff in the early boot that assumes logical boot cpuid is 0. 
> > Also, it expands the kexec boot param structure to allow the booted > physical cpuid to be passed in. This includes bumping the version number > to 2 for backwards compat. Why don't you put it in a property instead ? I'm against adding things to the structure itself unless absolutely necessary. Also, what about moving the hold loop to there too ? The area below 0x100 is a bit "sensitive" and I'd like to get rid of it. Ben. From anton at samba.org Wed Nov 17 10:15:16 2004 From: anton at samba.org (Anton Blanchard) Date: Wed, 17 Nov 2004 10:15:16 +1100 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <1100645688.14625.70.camel@gaston> References: <20041116210201.GA7368@localhost.localdomain> <1100645688.14625.70.camel@gaston> Message-ID: <20041116231516.GA26260@krispykreme.ozlabs.ibm.com> Hi, > Why don't you put it in a property instead ? I'm against adding things > to the structure itself unless absolutely necessary. It requires you to scan the device tree and modify it if you kexec boot off a different cpu. Possible but it is extra work, especially on a crashdump kexec where you want to do the minimum work on kexec. Anton From benh at kernel.crashing.org Wed Nov 17 10:18:26 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 17 Nov 2004 10:18:26 +1100 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <20041116231516.GA26260@krispykreme.ozlabs.ibm.com> References: <20041116210201.GA7368@localhost.localdomain> <1100645688.14625.70.camel@gaston> <20041116231516.GA26260@krispykreme.ozlabs.ibm.com> Message-ID: <1100647106.14553.79.camel@gaston> On Wed, 2004-11-17 at 10:15 +1100, Anton Blanchard wrote: > Hi, > > > Why don't you put it in a property instead ? I'm against adding things > > to the structure itself unless absolutely necessary. > > It requires you to scan the device tree and modify it if you kexec boot > off a different cpu. 
Possible but it is extra work, especially on a > crashdump kexec where you want to do the minimum work on kexec. kexec has to do the device-tree flattening anyway ... adding/modifying that property 'on the fly' while doing so is easy... Ben. From anton at samba.org Wed Nov 17 10:25:52 2004 From: anton at samba.org (Anton Blanchard) Date: Wed, 17 Nov 2004 10:25:52 +1100 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <1100647106.14553.79.camel@gaston> References: <20041116210201.GA7368@localhost.localdomain> <1100645688.14625.70.camel@gaston> <20041116231516.GA26260@krispykreme.ozlabs.ibm.com> <1100647106.14553.79.camel@gaston> Message-ID: <20041116232552.GB26260@krispykreme.ozlabs.ibm.com> > kexec has to do the device-tree flattening anyway ... adding/modifying > that property 'on the fly' while doing so is easy... Why can't it reuse the boot one or preprepare one? That's better than doing the work after the machine has oopsed, isn't it? Anton From benh at kernel.crashing.org Wed Nov 17 10:25:14 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 17 Nov 2004 10:25:14 +1100 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <20041116232552.GB26260@krispykreme.ozlabs.ibm.com> References: <20041116210201.GA7368@localhost.localdomain> <1100645688.14625.70.camel@gaston> <20041116231516.GA26260@krispykreme.ozlabs.ibm.com> <1100647106.14553.79.camel@gaston> <20041116232552.GB26260@krispykreme.ozlabs.ibm.com> Message-ID: <1100647514.14625.81.camel@gaston> On Wed, 2004-11-17 at 10:25 +1100, Anton Blanchard wrote: > > kexec has to do the device-tree flattening anyway ... adding/modifying > > that property 'on the fly' while doing so is easy... > > Why can't it reuse the boot one or preprepare one? That's better than > doing the work after the machine has oopsed, isn't it? Might be an issue with hotplug hardware... Ben. 
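For readers following the structure-versus-property debate in this thread, here is a minimal sketch, under stated assumptions, of the version gate that the posted patch relies on: the new field is trusted only when the header says it is version 2 or later, and older headers fall back to another source (in the patch, the "linux,boot-cpu" node's "reg" property). The field names `version`, `last_comp_version` and `boot_cpuid_phys` come from the prom.h hunk; the cut-down struct and all `sketch_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct boot_param_header. */
struct sketch_boot_params {
	uint32_t version;           /* format version */
	uint32_t last_comp_version; /* last compatible version */
	uint32_t boot_cpuid_phys;   /* only meaningful when version >= 2 */
};

/*
 * Return the physical id of the booting cpu.  A version 1 header has no
 * boot_cpuid_phys field, so the caller supplies a fallback value obtained
 * some other way (for example from the device tree).
 */
uint32_t sketch_boot_cpuid_phys(const struct sketch_boot_params *bp,
				uint32_t fallback)
{
	if (bp != NULL && bp->version >= 2)
		return bp->boot_cpuid_phys;
	return fallback;
}
```

The property-based alternative argued for above would drop the extra field and always take the fallback path, at the cost of having to update the flattened tree before each kexec.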
From olof at austin.ibm.com Wed Nov 17 10:32:50 2004 From: olof at austin.ibm.com (Olof Johansson) Date: Tue, 16 Nov 2004 17:32:50 -0600 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <1100647514.14625.81.camel@gaston> References: <20041116210201.GA7368@localhost.localdomain> <1100645688.14625.70.camel@gaston> <20041116231516.GA26260@krispykreme.ozlabs.ibm.com> <1100647106.14553.79.camel@gaston> <20041116232552.GB26260@krispykreme.ozlabs.ibm.com> <1100647514.14625.81.camel@gaston> Message-ID: <419A8E22.3080706@austin.ibm.com> Benjamin Herrenschmidt wrote: > On Wed, 2004-11-17 at 10:25 +1100, Anton Blanchard wrote: > > > kexec has to do the device-tree flattening anyway ... adding/modifying > > > that property 'on the fly' while doing so is easy... > > > > Why can't it reuse the boot one or preprepare one? That's better than > > doing the work after the machine has oopsed, isn't it? > > Might be an issue with hotplug hardware... Hotplug events would reasonably trigger re-flattenings of the new device tree, etc. There'd be a window where the new tree might not be built, but it can be dealt with separately. -Olof From david at gibson.dropbear.id.au Wed Nov 17 11:32:59 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 17 Nov 2004 11:32:59 +1100 Subject: [PATCH] [PPC64] Fix iSeries build (lparcfg) In-Reply-To: <20041116190154.GA20952@4> References: <20041116190154.GA20952@4> Message-ID: <20041117003259.GB31704@zax> On Tue, Nov 16, 2004 at 01:01:54PM -0600, Olof Johansson wrote: > Hi, > > Andrew, please apply: > > Jeff Scheel's addition of PURR reporting in lparcfg conflicted with > Stephen's cleanup of the iSeries namespace. This fixes the build > break. Actually, that looks like a conflict with my PACA cleanup, which went in some time ago, rather than sfr's cleanups. But the patch looks good, anyway. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist. 
NOT _the_ _other_ _way_ | _around_! http://www.ozlabs.org/people/dgibson From david at gibson.dropbear.id.au Wed Nov 17 18:16:08 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 17 Nov 2004 18:16:08 +1100 Subject: [RFC] sysfs cpu cleanup Message-ID: <20041117071608.GB19019@zax> Currently the ppc64 sysfs code registers an entry for each possible cpu in sysfs, rather than just online cpus. That makes sense, since the sysfs entries are needed to control onlining of the cpus. However, this is done even if CONFIG_HOTPLUG_CPU is not set, or if it is not a hotplug capable (DLPAR) machine, which is a bit misleading. Secondly, it also registers all the other sysfs entries (physical_id and the pmc stuff) on all possible cpus, although they are quite meaningless on non-online cpus. This patch alters the code to only register sysfs directories at boot for cpus which are either online or could be onlined (cpu is possible, and CONFIG_HOTPLUG_CPU and an lpar machine). Furthermore, the entries apart from 'online' itself are only registered for online CPUs (and deregistered again if a cpu goes offline). Anyone see any problems with this approach? Also, this has not yet been tested in the presence of actual cpu hotplugging... Index: working-2.6/arch/ppc64/kernel/sysfs.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/sysfs.c 2004-10-19 13:37:21.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/sysfs.c 2004-11-17 17:55:22.284033824 +1100 @@ -12,6 +12,7 @@ #include #include +static DEFINE_PER_CPU(struct cpu, cpu_devices); /* SMT stuff */ @@ -259,8 +260,29 @@ static SYSDEV_ATTR(pmc8, 0600, show_pmc8, store_pmc8); static SYSDEV_ATTR(purr, 0600, show_purr, NULL); -static void __init register_cpu_pmc(struct sys_device *s) +/* Only valid if CPU is online. 
*/ +static ssize_t show_physical_id(struct sys_device *dev, char *buf) { + struct cpu *cpu = container_of(dev, struct cpu, sysdev); + + return sprintf(buf, "%u\n", get_hard_smp_processor_id(cpu->sysdev.id)); +} +static SYSDEV_ATTR(physical_id, 0444, show_physical_id, NULL); + +void register_cpu_online(int cpu) +{ + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + + sysdev_create_file(s, &attr_physical_id); + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_create_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + sysdev_create_file(s, &attr_mmcr0); sysdev_create_file(s, &attr_mmcr1); @@ -283,6 +305,45 @@ sysdev_create_file(s, &attr_purr); } +#ifdef CONFIG_HOTPLUG_CPU +void unregister_cpu_online(int cpu) +{ + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + + BUG_ON(c->no_control); + + sysdev_remove_file(s, &attr_physical_id); + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_remove_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + + sysdev_remove_file(s, &attr_mmcr0); + sysdev_remove_file(s, &attr_mmcr1); + + if (cur_cpu_spec->cpu_features & CPU_FTR_MMCRA) + sysdev_remove_file(s, &attr_mmcra); + + sysdev_remove_file(s, &attr_pmc1); + sysdev_remove_file(s, &attr_pmc2); + sysdev_remove_file(s, &attr_pmc3); + sysdev_remove_file(s, &attr_pmc4); + sysdev_remove_file(s, &attr_pmc5); + sysdev_remove_file(s, &attr_pmc6); + + if (cur_cpu_spec->cpu_features & CPU_FTR_PMC8) { + sysdev_remove_file(s, &attr_pmc7); + sysdev_remove_file(s, &attr_pmc8); + } + + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_remove_file(s, &attr_purr); +} +#endif /* CONFIG_HOTPLUG_CPU */ /* NUMA stuff */ @@ -313,18 +374,6 @@ #endif -/* Only valid if CPU is online. 
*/ -static ssize_t show_physical_id(struct sys_device *dev, char *buf) -{ - struct cpu *cpu = container_of(dev, struct cpu, sysdev); - - return sprintf(buf, "%u\n", get_hard_smp_processor_id(cpu->sysdev.id)); -} -static SYSDEV_ATTR(physical_id, 0444, show_physical_id, NULL); - - -static DEFINE_PER_CPU(struct cpu, cpu_devices); - static int __init topology_init(void) { int cpu; @@ -348,16 +397,12 @@ if (systemcfg->platform != PLATFORM_PSERIES_LPAR) c->no_control = 1; - register_cpu(c, cpu, parent); + if (cpu_online(cpu) || (c->no_control == 0)) + register_cpu(c, cpu, parent); - register_cpu_pmc(&c->sysdev); + if (cpu_online(cpu)) + register_cpu_online(cpu); - sysdev_create_file(&c->sysdev, &attr_physical_id); - -#ifndef CONFIG_PPC_ISERIES - if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) - sysdev_create_file(&c->sysdev, &attr_smt_snooze_delay); -#endif } return 0; Index: working-2.6/arch/ppc64/kernel/smp.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/smp.c 2004-10-19 13:37:56.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/smp.c 2004-11-17 17:56:42.827035864 +1100 @@ -82,6 +82,8 @@ void smp_call_function_interrupt(void); extern long register_vpa(unsigned long flags, unsigned long proc, unsigned long vpa); +extern void register_cpu_online(int cpu); +extern void unregister_cpu_online(int cpu); int smt_enabled_at_boot = 1; @@ -291,6 +293,8 @@ int cpu_status; unsigned int pcpu = get_hard_smp_processor_id(cpu); + unregister_cpu_online(cpu); + for (tries = 0; tries < 25; tries++) { cpu_status = query_cpu_stopped(pcpu); if (cpu_status == 0 || cpu_status == -1) @@ -919,6 +923,11 @@ while (!cpu_online(cpu)) cpu_relax(); +#ifdef CONFIG_HOTPLUG_CPU + if (system_state >= SYSTEM_RUNNING) /* This is a hotplug */ + register_cpu_online(cpu); +#endif /* CONFIG_HOTPLUG_CPU */ + return 0; } -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist. 
NOT _the_ _other_ _way_ | _around_! http://www.ozlabs.org/people/dgibson From l_indien at magic.fr Thu Nov 18 01:56:59 2004 From: l_indien at magic.fr (J. Mayer) Date: Wed, 17 Nov 2004 15:56:59 +0100 Subject: Booting Imac G5 In-Reply-To: References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1100522706.16435.203.camel@rapid> Message-ID: <1100703416.16435.207.camel@rapid> On Mon, 2004-11-15 at 13:48, Segher Boessenkool wrote: > >>> RTC is probably in the SMU... > >> > >> It is. > > > > Apple say it's an external RTC in the developper notes. The problem is > > to get an I2C driver for the SMU. With the forth code, seems that it > > can > > be made ;-) > > As far as I know you just ask the SMU the time, you don't have to > talk to the IIC yourself. Or maybe that has changed... checking... > no, it hasn't (the actual commands did change, though). OK, I made a confusion between the SMU system clock, which is accessed via I2C and the RTC which is directly accessed. I still need to test my code, then I will have (soon) a RTC driver... -- J. 
Mayer Never organized From nathanl at austin.ibm.com Thu Nov 18 01:59:12 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Wed, 17 Nov 2004 08:59:12 -0600 Subject: [RFC] sysfs cpu cleanup In-Reply-To: <20041117071608.GB19019@zax> References: <20041117071608.GB19019@zax> Message-ID: <1100703552.8092.15.camel@localhost.localdomain> Hi David- On Wed, 2004-11-17 at 18:16 +1100, David Gibson wrote: > Current the ppc64 sysfs code registers an entry for each possible cpu > in sysfs, rather than just online cpus. That makes sense, since the > sysfs entries are needed to control onlining of the cpus. However, > this is done even if CONFIG_HOTPLUG_CPU is not set, or if it is not a > hotplug capable (DLPAR) machine, which is a bit misleading. Secondly > it also registers all the other sysfs entries (physical_id and the pmc > stuff) on all possible cpus, although they are quite meaningless on > non-online cpus. > > This patch alters the code to only register sysfs directories at boot > for cpus which are either online or could be onlined (cpu is possible, > and CONFIG_HOTPLUG_CPU and an lpar machine). Furthermore, the entries > apart from 'online' itself are only registered for online CPUs (and > deregistered again if a cpu goes offline). > > Anyone see any problems with this approach? Also, this has not yet > been tested in the presence of actual cpu hotplugging... See http://www.ussg.iu.edu/hypermail/linux/kernel/0410.3/0020.html for what I think is the best solution to this - dynamic cpu device registration. In short, the driver model "core" would register all present cpus at boot, which would be correct regardless of CONFIG_HOTPLUG_CPU. Addition and removal of the sysfs entities should correspond to addition and removal of cpus (not online/offline), imo. Of course, I haven't had much time to work on this lately. Your patch looks fine to me except for the treatment of the physical_id attribute. 
We need this to be present even on offline cpus, because it is sometimes used to determine which cpu to start. Nathan From benh at kernel.crashing.org Thu Nov 18 08:30:39 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 18 Nov 2004 08:30:39 +1100 Subject: Booting Imac G5 In-Reply-To: <1100703416.16435.207.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1100522706.16435.203.camel@rapid> <1100703416.16435.207.camel@rapid> Message-ID: <1100727040.14553.108.camel@gaston> On Wed, 2004-11-17 at 15:56 +0100, J. Mayer wrote: > On Mon, 2004-11-15 at 13:48, Segher Boessenkool wrote: > > >>> RTC is probably in the SMU... > > >> > > >> It is. > > > > > > Apple say it's an external RTC in the developper notes. The problem is > > > to get an I2C driver for the SMU. With the forth code, seems that it > > > can > > > be made ;-) > > > > As far as I know you just ask the SMU the time, you don't have to > > talk to the IIC yourself. Or maybe that has changed... checking... > > no, it hasn't (the actual commands did change, though). > > OK, I made a confusion between the SMU system clock, which is accessed > via I2C and the RTC which is directly accessed. > I still need to test my code, then I will have (soon) a RTC driver... Which i2c bus is it on ? 
You may need to use the low-level i2c routines in pmac_low_i2c instead of the high level driver to access the RTC early at boot. Ben. From l_indien at magic.fr Thu Nov 18 09:26:00 2004 From: l_indien at magic.fr (J. Mayer) Date: Wed, 17 Nov 2004 23:26:00 +0100 Subject: Booting Imac G5 In-Reply-To: <1100727040.14553.108.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1100522706.16435.203.camel@rapid> <1100703416.16435.207.camel@rapid> <1100727040.14553.108.camel@gaston> Message-ID: <1100730357.16435.250.camel@rapid> On Wed, 2004-11-17 at 22:30, Benjamin Herrenschmidt wrote: > On Wed, 2004-11-17 at 15:56 +0100, J. Mayer wrote: > > On Mon, 2004-11-15 at 13:48, Segher Boessenkool wrote: > > > >>> RTC is probably in the SMU... > > > >> > > > >> It is. > > > > > > > > Apple say it's an external RTC in the developper notes. The problem is > > > > to get an I2C driver for the SMU. With the forth code, seems that it > > > > can > > > > be made ;-) > > > > > > As far as I know you just ask the SMU the time, you don't have to > > > talk to the IIC yourself. Or maybe that has changed... checking... > > > no, it hasn't (the actual commands did change, though). > > > > OK, I made a confusion between the SMU system clock, which is accessed > > via I2C and the RTC which is directly accessed. > > I still need to test my code, then I will have (soon) a RTC driver... 
> > Which i2c bus is it on ? You may need to use the low-level i2c routines > in pmac_low_i2c instead of the high level driver to access the RTC early > at boot. There seem to be an I2C controller (or pseudo I2C, as I saw in Apple code) in the SMU with two I2C buses. One bus has two devices connected, according to the OF tree: the system clock (used to control the CPU clock, as I understand it) and the temperature sensor located on the hard drive. As the RTC is accessed directly using SMU commands, I2C won't be needed early at boot time. -- J. Mayer Never organized From paulus at samba.org Thu Nov 18 10:35:15 2004 From: paulus at samba.org (Paul Mackerras) Date: Thu, 18 Nov 2004 10:35:15 +1100 Subject: [PATCH] read_slot_reset_state2 rtas call In-Reply-To: <41990014.1060404@austin.ibm.com> References: <41990014.1060404@austin.ibm.com> Message-ID: <16795.57395.506121.891877@cargo.ozlabs.ibm.com> Nathan Fontenot writes: > Could you please forward upstream. Sure. Since we are now supposed to be in bug-fix/stabilization mode for 2.6.10, I'll defer this one until 2.6.10 is out, unless you have an argument why it needs to be in 2.6.10. Thanks, Paul. From linas at austin.ibm.com Thu Nov 18 10:52:20 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 17 Nov 2004 17:52:20 -0600 Subject: [PATCH] PPC64: EEH Recovery Message-ID: <20041117235219.GD13762@austin.ibm.com> Hi Paul, The patch below implements hotplug style EEH error recovery. Its split into two pieces: a part that needs to be applied to the PPC64 arch tree, and a part that needs to be applied to the RPA PHP hotplug tree. The PPC64 part needs to go in first. Assuming this doesn't generate a round of discussion, please forward upstream to akpm/torvalds. 
Signed-off-by: Linas Vepstas -------------- next part -------------- ===== arch/ppc64/kernel/eeh.c 1.40 vs edited ===== --- 1.40/arch/ppc64/kernel/eeh.c 2004-10-25 14:47:50 -05:00 +++ edited/arch/ppc64/kernel/eeh.c 2004-11-17 17:31:41 -06:00 @@ -17,21 +17,19 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#include +#include #include #include -#include #include #include #include #include #include -#include +#include #include #include #include #include -#include #include "pci.h" #undef DEBUG @@ -89,7 +87,6 @@ static struct notifier_block *eeh_notifi * attempts we allow before panicking. */ #define EEH_MAX_FAILS 1000 -static atomic_t eeh_fail_count; /* RTAS tokens */ static int ibm_set_eeh_option; @@ -223,9 +220,9 @@ pci_addr_cache_insert(struct pci_dev *de while (*p) { parent = *p; piar = rb_entry(parent, struct pci_io_addr_range, rb_node); - if (alo < piar->addr_lo) { + if (ahi < piar->addr_lo) { p = &parent->rb_left; - } else if (ahi > piar->addr_hi) { + } else if (alo > piar->addr_hi) { p = &parent->rb_right; } else { if (dev != piar->pcidev || @@ -243,6 +240,11 @@ pci_addr_cache_insert(struct pci_dev *de piar->addr_hi = ahi; piar->pcidev = dev; piar->flags = flags; + +#ifdef DEBUG + printk (KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n", + alo, ahi, pci_name (dev)); +#endif rb_link_node(&piar->rb_node, parent, p); rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root); @@ -377,6 +379,9 @@ void __init pci_addr_cache_build(void) continue; } pci_addr_cache_insert_device(dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + pci_save_state (dev); } #ifdef DEBUG @@ -388,6 +393,32 @@ void __init pci_addr_cache_build(void) /* --------------------------------------------------------------- */ /* Above lies the PCI Address Cache. 
Below lies the EEH event infrastructure */ +void eeh_slot_error_detail (struct device_node *dn, int severity) +{ + unsigned long flags; + int rc; + + if (!dn) return; + + /* Log the error with the rtas logger */ + spin_lock_irqsave(&slot_errbuf_lock, flags); + memset(slot_errbuf, 0, eeh_error_buf_size); + + rc = rtas_call(ibm_slot_error_detail, + 8, 1, NULL, dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), NULL, 0, + virt_to_phys(slot_errbuf), + eeh_error_buf_size, + severity); + + if (rc == 0) + log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); + spin_unlock_irqrestore(&slot_errbuf_lock, flags); +} + +EXPORT_SYMBOL(eeh_slot_error_detail); + /** * eeh_register_notifier - Register to find out about EEH events. * @nb: notifier block to callback on events @@ -462,11 +493,9 @@ static void eeh_event_handler(void *dumm "%s %s\n", event->reset_state, pci_name(event->dev), pci_pretty_name(event->dev)); - atomic_set(&eeh_fail_count, 0); - notifier_call_chain (&eeh_notifier_chain, - EEH_NOTIFY_FREEZE, event); - __get_cpu_var(slot_resets)++; + notifier_call_chain (&eeh_notifier_chain, + EEH_NOTIFY_FREEZE, event); pci_dev_put(event->dev); kfree(event); @@ -510,7 +539,7 @@ int eeh_dn_check_failure(struct device_n int ret; int rets[2]; unsigned long flags; - int rc, reset_state; + int reset_state; struct eeh_event *event; __get_cpu_var(total_mmio_ffs)++; @@ -530,14 +559,15 @@ int eeh_dn_check_failure(struct device_n if (!dn->eeh_config_addr) { return 0; } - + /* * If we already have a pending isolation event for this * slot, we know it's bad already, we don't need to check... 
*/ if (dn->eeh_mode & EEH_MODE_ISOLATED) { - atomic_inc(&eeh_fail_count); - if (atomic_read(&eeh_fail_count) >= EEH_MAX_FAILS) { + dn->eeh_freeze_count ++; + if (dn->eeh_freeze_count >= EEH_MAX_FAILS) { + dump_stack(); /* re-read the slot reset state */ rets[0] = -1; rtas_call(ibm_read_slot_reset_state, 3, 3, rets, @@ -565,28 +595,17 @@ int eeh_dn_check_failure(struct device_n return 0; } - /* prevent repeated reports of this failure */ + /* Prevent repeated reports of this failure */ dn->eeh_mode |= EEH_MODE_ISOLATED; reset_state = rets[0]; + /* Log the error with the rtas logger */ + if (dn->eeh_freeze_count < EEH_MAX_ALLOWED_FREEZES) { + eeh_slot_error_detail (dn, 1 /* Temporary Error */); + } else { + eeh_slot_error_detail (dn, 2 /* Permanent Error */); + } - spin_lock_irqsave(&slot_errbuf_lock, flags); - memset(slot_errbuf, 0, eeh_error_buf_size); - - rc = rtas_call(ibm_slot_error_detail, - 8, 1, NULL, dn->eeh_config_addr, - BUID_HI(dn->phb->buid), - BUID_LO(dn->phb->buid), NULL, 0, - virt_to_phys(slot_errbuf), - eeh_error_buf_size, - 1 /* Temporary Error */); - - if (rc == 0) - log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); - spin_unlock_irqrestore(&slot_errbuf_lock, flags); - - printk(KERN_INFO "EEH: MMIO failure (%d) on device: %s %s\n", - rets[0], dn->name, dn->full_name); event = kmalloc(sizeof(*event), GFP_ATOMIC); if (event == NULL) { eeh_panic(dev, reset_state); @@ -618,7 +637,6 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * @token i/o token, should be address in the form 0xA.... * @val value, should be all 1's (XXX why do we need this arg??) * - * Check for an eeh failure at the given token address. * Check for an EEH failure at the given token address. Call this * routine if the result of a read was all 0xff's and you want to * find out if this is due to an EEH slot freeze event. This routine @@ -626,6 +644,7 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * * Note this routine is safe to call in an interrupt context. 
*/ + unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val) { unsigned long addr; @@ -635,7 +654,7 @@ unsigned long eeh_check_failure(const vo /* Finding the phys addr + pci device; this is pretty quick. */ addr = eeh_token_to_phys((unsigned long __force) token); dev = pci_get_device_by_addr(addr); - if (!dev) + if (!dev) return val; dn = pci_device_to_OF_node(dev); @@ -647,6 +666,174 @@ unsigned long eeh_check_failure(const vo EXPORT_SYMBOL(eeh_check_failure); +/* ------------------------------------------------------------- */ +/* The code below deals with error recovery */ + +void +rtas_set_slot_reset(struct device_node *dn) +{ + int token = rtas_token ("ibm,set-slot-reset"); + int rc; + + if (token == RTAS_UNKNOWN_SERVICE) + return; + rc = rtas_call(token,4,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), + 1); + if (rc) { + printk (KERN_WARNING "EEH: Unable to reset the failed slot\n"); + return; + } + + /* The PCI bus requires that the reset be held high for at least + * a 100 milliseconds. We wait a bit longer 'just in case'. + */ + msleep (200); + + rc = rtas_call(token,4,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), + 0); +} + +EXPORT_SYMBOL(rtas_set_slot_reset); + +void +rtas_configure_bridge(struct device_node *dn) +{ + int token = rtas_token ("ibm,configure-bridge"); + int rc; + + if (token == RTAS_UNKNOWN_SERVICE) + return; + rc = rtas_call(token,3,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid)); + if (rc) { + printk (KERN_WARNING "EEH: Unable to configure device bridge\n"); + } +} + +EXPORT_SYMBOL(rtas_configure_bridge); + +/* ------------------------------------------------------- */ +/** Save and restore of PCI BARs + * + * Although firmware will set up BARs during boot, it doesn't + * set up device BAR's after a device reset, although it will, + * if requested, set up bridge configuration. 
Thus, we need to + * configure the PCI devices ourselves. Config-space setup is + * stored in the PCI structures which are normally deleted during + * device removal. Thus, the "save" routine references the + * structures so that they aren't deleted. + */ + + +struct eeh_cfg_tree +{ + struct eeh_cfg_tree *sibling; + struct eeh_cfg_tree *child; + struct pci_dev *dev; + struct device_node *dn; +}; + +static inline struct pci_dev * eeh_get_pci_dev(struct device_node *dn) +{ + struct pci_dev *dev = NULL; + char bus_id[BUS_ID_SIZE]; + + sprintf(bus_id, "%04x:%02x:%02x.%d",dn->phb->global_number, + dn->busno, PCI_SLOT(dn->devfn), PCI_FUNC(dn->devfn)); + + while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { + if (!strcmp(pci_name(dev), bus_id)) + return dev; + } + return NULL; +} + +/** + * eeh_save_bars - save the PCI config space info + */ +struct eeh_cfg_tree * eeh_save_bars(struct device_node *dn) +{ + struct eeh_cfg_tree *cnode; + struct pci_dev *dev; + + dev = eeh_get_pci_dev (dn); + if (!dev) + return NULL; + + cnode = kmalloc(sizeof(struct eeh_cfg_tree), GFP_KERNEL); + if (!cnode) + return NULL; + + cnode->dev = dev; + + of_node_get(dn); + cnode->dn = dn; + + cnode->sibling = NULL; + cnode->child = NULL; + + if (dn->child) { + cnode->child = eeh_save_bars (dn->child); + } + if (dn->sibling) { + cnode->sibling = eeh_save_bars (dn->sibling); + } + + return cnode; +} +EXPORT_SYMBOL(eeh_save_bars); + +/** + * __restore_bars - Restore the Base Address Registers + * Loads the PCI configuration space base address registers + * and the expansion ROM base address from the array + * passed as the second argument. 
+ */ +static inline void __restore_bars (struct device_node *dn, u32 *cfg_hdr) +{ + int i; + for (i=4; i<10; i++) { + rtas_write_config(dn, i*4, 4, cfg_hdr[i]); + } + rtas_write_config(dn, 12*4, 4, cfg_hdr[12]); +} + +/** + * eeh_restore_bars - restore the PCI config space info + */ +void eeh_restore_bars(struct eeh_cfg_tree *tree) +{ + if (tree->dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) { + __restore_bars (tree->dn, tree->dev->saved_config_space); + } + + if (tree->child) { + eeh_restore_bars (tree->child); + } + if (tree->sibling) { + eeh_restore_bars (tree->sibling); + } + + of_node_put (tree->dn); + pci_dev_put (tree->dev); + kfree (tree); +} +EXPORT_SYMBOL(eeh_restore_bars); + +/* ------------------------------------------------------------- */ +/* The code below deals with enabling EEH for devices during the + * early boot sequence. EEH must be enabled before any PCI probing + * can be done. + */ + struct eeh_early_enable_info { unsigned int buid_hi; unsigned int buid_lo; @@ -840,6 +1027,9 @@ void eeh_add_device_late(struct pci_dev #endif pci_addr_cache_insert_device (dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + pci_save_state (dev); } EXPORT_SYMBOL(eeh_add_device_late); @@ -885,10 +1075,8 @@ static int proc_eeh_show(struct seq_file seq_printf(m, "eeh_total_mmio_ffs=%ld\n" "eeh_false_positives=%ld\n" "eeh_ignored_failures=%ld\n" - "eeh_slot_resets=%ld\n" - "eeh_fail_count=%d\n", - ffs, positives, failures, resets, - eeh_fail_count.counter); + "eeh_slot_resets=%ld\n", + ffs, positives, failures, resets); } return 0; ===== arch/ppc64/kernel/pSeries_pci.c 1.59 vs edited ===== --- 1.59/arch/ppc64/kernel/pSeries_pci.c 2004-11-15 21:29:10 -06:00 +++ edited/arch/ppc64/kernel/pSeries_pci.c 2004-11-17 16:18:02 -06:00 @@ -102,7 +102,7 @@ static int rtas_pci_read_config(struct p return PCIBIOS_DEVICE_NOT_FOUND; } -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +int rtas_write_config(struct 
device_node *dn, int where, int size, u32 val) { unsigned long buid, addr; int ret; @@ -125,6 +125,7 @@ static int rtas_write_config(struct devi return PCIBIOS_SUCCESSFUL; } +EXPORT_SYMBOL(rtas_write_config); static int rtas_pci_write_config(struct pci_bus *bus, unsigned int devfn, ===== include/asm-ppc64/eeh.h 1.23 vs edited ===== --- 1.23/include/asm-ppc64/eeh.h 2004-10-25 18:17:38 -05:00 +++ edited/include/asm-ppc64/eeh.h 2004-11-17 16:10:58 -06:00 @@ -22,8 +22,8 @@ #include #include -#include #include +#include struct pci_dev; struct device_node; @@ -33,6 +33,10 @@ struct device_node; #define EEH_MODE_NOCHECK (1<<1) #define EEH_MODE_ISOLATED (1<<2) +/* Max number of EEH freezes allowed before we consider the device + * to be permanently disabled. */ +#define EEH_MAX_ALLOWED_FREEZES 5 + #ifdef CONFIG_PPC_PSERIES extern void __init eeh_init(void); unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val); @@ -57,6 +61,34 @@ void eeh_add_device_early(struct device_ void eeh_add_device_late(struct pci_dev *); /** + * eeh_slot_error_detail -- record and EEH error condition to the log + * @severity: 1 if temporary, 2 if permanent failure. + * + * Obtains the the EEH error details from the RTAS subsystem, + * and then logs these details with the RTAS error log system. + */ +void eeh_slot_error_detail (struct device_node *dn, int severity); + +/** + * rtas_set_slot_reset -- unfreeze a frozen slot + * + * Clear the EEH-frozen condition on a slot. This routine + * does this by asserting the PCI #RST line for 1/8th of + * a second; this routine will sleep while the adapter is + * being reset. + */ +void rtas_set_slot_reset (struct device_node *dn); + +/** + * rtas_configure_bridge -- firmware initialization of pci bridge + * + * Ask the firmware to configure any PCI bridge devices + * located behind the indicated node. Required after a + * pci device reset. 
+ */ +void rtas_configure_bridge(struct device_node *dn); + +/** * eeh_remove_device - undo EEH setup for the indicated pci device * @dev: pci device to be removed * @@ -91,6 +123,13 @@ struct eeh_event { /** Register to find out about EEH events. */ int eeh_register_notifier(struct notifier_block *nb); int eeh_unregister_notifier(struct notifier_block *nb); + +/** Save and restore device configuration info across + * device resets + */ +struct eeh_cfg_tree; +struct eeh_cfg_tree * eeh_save_bars(struct device_node *dn); +void eeh_restore_bars(struct eeh_cfg_tree *tree); /** * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure. ===== include/asm-ppc64/prom.h 1.23 vs edited ===== --- 1.23/include/asm-ppc64/prom.h 2004-10-24 20:55:43 -05:00 +++ edited/include/asm-ppc64/prom.h 2004-11-17 16:00:37 -06:00 @@ -162,6 +162,7 @@ struct device_node { int status; /* Current device status (non-zero is bad) */ int eeh_mode; /* See eeh.h for possible EEH_MODEs */ int eeh_config_addr; + int eeh_freeze_count; /* number of times this device froze up. 
*/ struct pci_controller *phb; /* for pci devices */ struct iommu_table *iommu_table; /* for phb's or bridges */ ===== include/asm-ppc64/rtas.h 1.24 vs edited ===== --- 1.24/include/asm-ppc64/rtas.h 2004-09-22 00:42:53 -05:00 +++ edited/include/asm-ppc64/rtas.h 2004-11-17 16:00:37 -06:00 @@ -241,4 +241,6 @@ extern void rtas_stop_self(void); /* RMO buffer reserved for user-space RTAS use */ extern unsigned long rtas_rmo_buf; +extern int rtas_write_config(struct device_node *dn, int where, int size, u32 val); + #endif /* _PPC64_RTAS_H */ -------------- next part -------------- ===== drivers/pci/hotplug/rpaphp.h 1.11 vs edited ===== --- 1.11/drivers/pci/hotplug/rpaphp.h 2004-10-06 11:43:44 -05:00 +++ edited/drivers/pci/hotplug/rpaphp.h 2004-11-17 16:00:37 -06:00 @@ -126,6 +126,8 @@ extern int register_pci_slot(struct slot extern int rpaphp_unconfig_pci_adapter(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); extern struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev); +extern void init_eeh_handler (void); +extern void exit_eeh_handler (void); /* rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); ===== drivers/pci/hotplug/rpaphp_core.c 1.18 vs edited ===== --- 1.18/drivers/pci/hotplug/rpaphp_core.c 2004-10-06 11:43:44 -05:00 +++ edited/drivers/pci/hotplug/rpaphp_core.c 2004-11-17 16:00:37 -06:00 @@ -443,12 +443,18 @@ static int __init rpaphp_init(void) { info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); + /* Get set to handle EEH events. */ + init_eeh_handler(); + /* read all the PRA info from the system */ return init_rpa(); } static void __exit rpaphp_exit(void) { + /* Let EEH know we are going away. 
*/ + exit_eeh_handler(); + cleanup_slots(); } ===== drivers/pci/hotplug/rpaphp_pci.c 1.16 vs edited ===== --- 1.16/drivers/pci/hotplug/rpaphp_pci.c 2004-10-19 11:54:38 -05:00 +++ edited/drivers/pci/hotplug/rpaphp_pci.c 2004-11-17 17:23:39 -06:00 @@ -22,8 +22,12 @@ * Send feedback to * */ +#include +#include #include +#include #include +#include #include #include "../pci.h" /* for pci_add_new_bus */ @@ -63,6 +67,7 @@ int rpaphp_claim_resource(struct pci_dev root ? "Address space collision on" : "No parent found for", resource, dtype, pci_name(dev), res->start, res->end); + dump_stack(); } return err; } @@ -185,6 +190,19 @@ rpaphp_fixup_new_pci_devices(struct pci_ static int rpaphp_pci_config_bridge(struct pci_dev *dev); +static void rpaphp_eeh_add_bus_device(struct pci_bus *bus) +{ + struct pci_dev *dev; + list_for_each_entry(dev, &bus->devices, bus_list) { + eeh_add_device_late(dev); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + struct pci_bus *subbus = dev->subordinate; + if (subbus) + rpaphp_eeh_add_bus_device (subbus); + } + } +} + /***************************************************************************** rpaphp_pci_config_slot() will configure all devices under the given slot->dn and return the the first pci_dev.
@@ -212,6 +230,8 @@ rpaphp_pci_config_slot(struct device_nod } if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) rpaphp_pci_config_bridge(dev); + + rpaphp_eeh_add_bus_device(bus); } return dev; } @@ -220,7 +240,6 @@ static int rpaphp_pci_config_bridge(stru { u8 sec_busno; struct pci_bus *child_bus; - struct pci_dev *child_dev; dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev)); @@ -237,11 +256,7 @@ static int rpaphp_pci_config_bridge(stru /* do pci_scan_child_bus */ pci_scan_child_bus(child_bus); - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - - /* fixup new pci devices without touching bus struct */ + /* Fixup new pci devices without touching bus struct */ rpaphp_fixup_new_pci_devices(child_bus, 0); /* Make the discovered devices available */ @@ -279,7 +294,7 @@ static void print_slot_pci_funcs(struct return; } #else -static void print_slot_pci_funcs(struct slot *slot) +static inline void print_slot_pci_funcs(struct slot *slot) { return; } @@ -361,7 +376,6 @@ static void rpaphp_eeh_remove_bus_device if (pdev) rpaphp_eeh_remove_bus_device(pdev); } - } return; } @@ -563,10 +577,14 @@ exit: return retval; } -struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev) +/** + * rpaphp_find_slot - find and return the slot holding the device + * @dev: pci device for which we want the slot structure. + */ +static struct slot *rpaphp_find_slot(struct pci_dev *dev) { - struct list_head *tmp, *n; - struct slot *slot; + struct list_head *tmp, *n; + struct slot *slot; list_for_each_safe(tmp, n, &rpaphp_slot_head) { struct pci_bus *bus; @@ -585,14 +603,109 @@ struct hotplug_slot *rpaphp_find_hotplug if (!bus) { continue; /* should never happen? 
*/ } + for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev == dev) - return slot->hotplug_slot; + struct pci_dev *pdev = pci_dev_b(ln); + if (pdev == dev) + return slot; } } return NULL; } -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); +/* ------------------------------------------------------- */ +/** + * handle_eeh_events -- reset a PCI device after hard lockup. + * + * pSeries systems will isolate a PCI slot if the PCI-Host + * bridge detects address or data parity errors, DMA's + * occuring to wild addresses (which usually happen due to + * bugs in device drivers or in PCI adapter firmware). + * Slot isolations also occur if #SERR, #PERR or other misc + * PCI-related errors are detected. + * + * Recovery process consists of unplugging the device driver + * (which generated hotplug events to userspace), then issuing + * a PCI #RST to the device, then reconfiguring the PCI config + * space for all bridges & devices under this slot, and then + * finally restarting the device drivers (which cause a second + * set of hotplug events to go out to userspace). + */ +int handle_eeh_events (struct notifier_block *self, + unsigned long reason, void *ev) +{ + struct eeh_event *event = ev; + struct slot *frozen_slot; + struct eeh_cfg_tree * saved_bars; + + frozen_slot = rpaphp_find_slot(event->dev); + if (!frozen_slot) + { + printk (KERN_ERR + "EEH: Cannot find PCI slot for EEH error! dev=%p dn=%p\n", + event->dev, event->dn); + return 1; + } + + /* Keep a copy of the config space registers */ + saved_bars = eeh_save_bars(frozen_slot->dn); + of_node_get(event->dn); + pci_dev_get(event->dev); + + rpaphp_unconfig_pci_adapter (frozen_slot); + + event->dn->eeh_freeze_count ++; + if (event->dn->eeh_freeze_count > EEH_MAX_ALLOWED_FREEZES) { + /* + * About 90% of all real-life EEH failures in the field + * are due to poorly seated PCI cards. 
Only 10% or so are + * due to actual, failed cards + */ + printk (KERN_ERR + "EEH: device %s:%s has failed %d times \n" + "and has been permanently disabled. Please try reseating\n" + "this device or replacing it.\n", + pci_name (event->dev), + pci_pretty_name (event->dev), + EEH_MAX_ALLOWED_FREEZES); + goto rdone; + } + + /* Reset the pci controller. (Asserts RST#; resets config space). + * Reconfigure bridges and devices */ + rtas_set_slot_reset (event->dn); + rtas_configure_bridge(event->dn); + eeh_restore_bars(saved_bars); + + /* Give the system 5 seconds to finish running the user-space + * hotplug scripts, e.g. ifdown for ethernet. Yes, this is a hack, + * but if we don't do this, weird things happen. + */ + ssleep (5); + + rpaphp_enable_pci_slot (frozen_slot); + + /* The new device node is different than the old one; + * copy over the freeze count, so that we don't lose track of it. + */ + frozen_slot->dn->eeh_freeze_count = event->dn->eeh_freeze_count; +rdone: + of_node_put(event->dn); + pci_dev_put(event->dev); + return 0; +} + +static struct notifier_block eeh_block; + +void __init init_eeh_handler (void) +{ + eeh_block.notifier_call = handle_eeh_events; + eeh_register_notifier (&eeh_block); +} + +void __exit exit_eeh_handler (void) +{ + eeh_unregister_notifier (&eeh_block); +} +

From paulus at samba.org Thu Nov 18 11:58:42 2004
From: paulus at samba.org (Paul Mackerras)
Date: Thu, 18 Nov 2004 11:58:42 +1100
Subject: [PATCH] PPC64 move emulate_step to arch/ppc64/lib
Message-ID: <16795.62402.289937.407188@cargo.ozlabs.ibm.com>

This patch moves the emulate_step function, which is used in xmon's
single-stepping code, out of xmon.c and into arch/ppc64/lib/sstep.c,
so that kprobes can use it too.

Andrew: if you prefer to defer this until after 2.6.10, that's OK, but
I think it would be safe to go in now, since the only thing it can
break is xmon.
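
For illustration only, and not part of the patch: the kind of decoding
that emulate_step does can be sketched in stand-alone C for the
unconditional branch (primary opcode 18). The helper name
branch_target is an assumption made for this sketch; the masks and the
sign-extension step mirror the "case 18" arm of emulate_step, and the
32-bit-mode (MSR_SF) masking done by the real code is left out.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not in the patch: compute the target of a
 * PowerPC I-form branch (primary opcode 18) located at address nip.
 */
static uint64_t branch_target(uint32_t instr, uint64_t nip)
{
	/* LI field; the low two bits (AA, LK) are masked off */
	uint64_t imm = instr & 0x03fffffc;

	/* bit 25 is the sign bit of the 26-bit displacement */
	if (imm & 0x02000000)
		imm -= 0x04000000;	/* sign-extend; wraps modulo 2^64 */

	/* AA bit clear: the displacement is relative to nip */
	if ((instr & 2) == 0)
		imm += nip;

	return imm;
}
```

With these definitions, branch_target(0x48000010, 0x1000), which is
"b .+16" at address 0x1000, gives 0x1010, and
branch_target(0x4bfffff0, 0x1000), which is "b .-16", gives 0xff0.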
Signed-off-by: Paul Mackerras diff -urN linux-2.5/arch/ppc64/lib/Makefile test/arch/ppc64/lib/Makefile --- linux-2.5/arch/ppc64/lib/Makefile 2004-07-12 09:12:03.000000000 +1000 +++ test/arch/ppc64/lib/Makefile 2004-11-18 11:14:04.180366608 +1100 @@ -15,3 +15,4 @@ obj-$(CONFIG_PCI) += e2a.o endif +lib-$(CONFIG_XMON) += sstep.o diff -urN linux-2.5/arch/ppc64/lib/sstep.c test/arch/ppc64/lib/sstep.c --- /dev/null 2004-08-12 23:33:25.000000000 +1000 +++ test/arch/ppc64/lib/sstep.c 2004-11-18 11:21:40.464360680 +1100 @@ -0,0 +1,141 @@ +/* + * Single-step support. + * + * Copyright (C) 2004 Paul Mackerras , IBM + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ +#include +#include +#include +#include + +extern char SystemCall_common[]; + +/* Bits in SRR1 that are copied from MSR */ +#define MSR_MASK 0xffffffff87c0ffff + +/* + * Determine whether a conditional branch instruction would branch. + */ +static int branch_taken(unsigned int instr, struct pt_regs *regs) +{ + unsigned int bo = (instr >> 21) & 0x1f; + unsigned int bi; + + if ((bo & 4) == 0) { + /* decrement counter */ + --regs->ctr; + if (((bo >> 1) & 1) ^ (regs->ctr == 0)) + return 0; + } + if ((bo & 0x10) == 0) { + /* check bit from CR */ + bi = (instr >> 16) & 0x1f; + if (((regs->ccr >> (31 - bi)) & 1) != ((bo >> 3) & 1)) + return 0; + } + return 1; +} + +/* + * Emulate instructions that cause a transfer of control. + * Returns 1 if the step was emulated, 0 if not, + * or -1 if the instruction is one that should not be stepped, + * such as an rfid, or a mtmsrd that would clear MSR_RI. 
+ */ +int emulate_step(struct pt_regs *regs, unsigned int instr) +{ + unsigned int opcode, rd; + unsigned long int imm; + + opcode = instr >> 26; + switch (opcode) { + case 16: /* bc */ + imm = (signed short)(instr & 0xfffc); + if ((instr & 2) == 0) + imm += regs->nip; + regs->nip += 4; + if ((regs->msr & MSR_SF) == 0) + regs->nip &= 0xffffffffUL; + if (instr & 1) + regs->link = regs->nip; + if (branch_taken(instr, regs)) + regs->nip = imm; + return 1; + case 17: /* sc */ + /* + * N.B. this uses knowledge about how the syscall + * entry code works. If that is changed, this will + * need to be changed also. + */ + regs->gpr[9] = regs->gpr[13]; + regs->gpr[11] = regs->nip + 4; + regs->gpr[12] = regs->msr & MSR_MASK; + regs->gpr[13] = (unsigned long) get_paca(); + regs->nip = (unsigned long) &SystemCall_common; + regs->msr = MSR_KERNEL; + return 1; + case 18: /* b */ + imm = instr & 0x03fffffc; + if (imm & 0x02000000) + imm -= 0x04000000; + if ((instr & 2) == 0) + imm += regs->nip; + if (instr & 1) { + regs->link = regs->nip + 4; + if ((regs->msr & MSR_SF) == 0) + regs->link &= 0xffffffffUL; + } + if ((regs->msr & MSR_SF) == 0) + imm &= 0xffffffffUL; + regs->nip = imm; + return 1; + case 19: + switch (instr & 0x7fe) { + case 0x20: /* bclr */ + case 0x420: /* bcctr */ + imm = (instr & 0x400)? regs->ctr: regs->link; + regs->nip += 4; + if ((regs->msr & MSR_SF) == 0) { + regs->nip &= 0xffffffffUL; + imm &= 0xffffffffUL; + } + if (instr & 1) + regs->link = regs->nip; + if (branch_taken(instr, regs)) + regs->nip = imm; + return 1; + case 0x24: /* rfid, scary */ + return -1; + } + case 31: + rd = (instr >> 21) & 0x1f; + switch (instr & 0x7fe) { + case 0xa6: /* mfmsr */ + regs->gpr[rd] = regs->msr & MSR_MASK; + regs->nip += 4; + if ((regs->msr & MSR_SF) == 0) + regs->nip &= 0xffffffffUL; + return 1; + case 0x164: /* mtmsrd */ + /* only MSR_EE and MSR_RI get changed if bit 15 set */ + /* mtmsrd doesn't change MSR_HV and MSR_ME */ + imm = (instr & 0x10000)? 
0x8002: 0xefffffffffffefffUL; + imm = (regs->msr & MSR_MASK & ~imm) + | (regs->gpr[rd] & imm); + if ((imm & MSR_RI) == 0) + /* can't step mtmsrd that would clear MSR_RI */ + return -1; + regs->msr = imm; + regs->nip += 4; + if ((imm & MSR_SF) == 0) + regs->nip &= 0xffffffffUL; + return 1; + } + } + return 0; +} diff -urN linux-2.5/arch/ppc64/xmon/xmon.c test/arch/ppc64/xmon/xmon.c --- linux-2.5/arch/ppc64/xmon/xmon.c 2004-10-26 16:06:41.000000000 +1000 +++ test/arch/ppc64/xmon/xmon.c 2004-11-18 11:21:33.745317056 +1100 @@ -31,6 +31,7 @@ #include #include #include +#include #include "nonstdio.h" #include "privinst.h" @@ -85,9 +86,6 @@ #define BP_NUM(bp) ((bp) - bpts + 1) -/* Bits in SRR1 that are copied from MSR */ -#define MSR_MASK 0xffffffff87c0ffff - /* Prototypes */ static int cmds(struct pt_regs *); static int mread(unsigned long, void *, int); @@ -132,7 +130,6 @@ static void bootcmds(void); void dump_segments(void); static void symbol_lookup(void); -static int emulate_step(struct pt_regs *regs, unsigned int instr); static void xmon_print_symbol(unsigned long address, const char *mid, const char *after); static const char *getvecname(unsigned long vec); @@ -148,7 +145,6 @@ extern int setjmp(long *); extern void longjmp(long *, int); extern unsigned long _ASR; -extern char SystemCall_common[]; pte_t *find_linux_pte(pgd_t *pgdir, unsigned long va); /* from htab.c */ @@ -488,6 +484,9 @@ if (stepped == 0) { regs->nip = (unsigned long) &bp->instr[0]; atomic_inc(&bp->ref_count); + } else if (stepped < 0) { + printf("Couldn't single-step %s instruction\n", + (IS_RFID(bp->instr[0])? 
"rfid": "mtmsrd")); } } } @@ -755,108 +754,6 @@ set_iabr(0); } -static int branch_taken(unsigned int instr, struct pt_regs *regs) -{ - unsigned int bo = (instr >> 21) & 0x1f; - unsigned int bi; - - if ((bo & 4) == 0) { - /* decrement counter */ - --regs->ctr; - if (((bo >> 1) & 1) ^ (regs->ctr == 0)) - return 0; - } - if ((bo & 0x10) == 0) { - /* check bit from CR */ - bi = (instr >> 16) & 0x1f; - if (((regs->ccr >> (31 - bi)) & 1) != ((bo >> 3) & 1)) - return 0; - } - return 1; -} - -/* - * Emulate instructions that cause a transfer of control. - * Returns 1 if the step was emulated, 0 if not, - * or -1 if the instruction is one that should not be stepped, - * such as an rfid, or a mtmsrd that would clear MSR_RI. - */ -static int emulate_step(struct pt_regs *regs, unsigned int instr) -{ - unsigned int opcode, rd; - unsigned long int imm; - - opcode = instr >> 26; - switch (opcode) { - case 16: /* bc */ - imm = (signed short)(instr & 0xfffc); - if ((instr & 2) == 0) - imm += regs->nip; - regs->nip += 4; /* XXX check 32-bit mode */ - if (instr & 1) - regs->link = regs->nip; - if (branch_taken(instr, regs)) - regs->nip = imm; - return 1; - case 17: /* sc */ - regs->gpr[9] = regs->gpr[13]; - regs->gpr[11] = regs->nip + 4; - regs->gpr[12] = regs->msr & MSR_MASK; - regs->gpr[13] = (unsigned long) get_paca(); - regs->nip = (unsigned long) &SystemCall_common; - regs->msr = MSR_KERNEL; - return 1; - case 18: /* b */ - imm = instr & 0x03fffffc; - if (imm & 0x02000000) - imm -= 0x04000000; - if ((instr & 2) == 0) - imm += regs->nip; - if (instr & 1) - regs->link = regs->nip + 4; - regs->nip = imm; - return 1; - case 19: - switch (instr & 0x7fe) { - case 0x20: /* bclr */ - case 0x420: /* bcctr */ - imm = (instr & 0x400)? 
regs->ctr: regs->link; - regs->nip += 4; /* XXX check 32-bit mode */ - if (instr & 1) - regs->link = regs->nip; - if (branch_taken(instr, regs)) - regs->nip = imm; - return 1; - case 0x24: /* rfid, scary */ - printf("Can't single-step an rfid instruction\n"); - return -1; - } - case 31: - rd = (instr >> 21) & 0x1f; - switch (instr & 0x7fe) { - case 0xa6: /* mfmsr */ - regs->gpr[rd] = regs->msr & MSR_MASK; - regs->nip += 4; - return 1; - case 0x164: /* mtmsrd */ - /* only MSR_EE and MSR_RI get changed if bit 15 set */ - /* mtmsrd doesn't change MSR_HV and MSR_ME */ - imm = (instr & 0x10000)? 0x8002: 0xefffffffffffefffUL; - imm = (regs->msr & MSR_MASK & ~imm) - | (regs->gpr[rd] & imm); - if ((imm & MSR_RI) == 0) { - printf("Can't step an instruction that would " - "clear MSR.RI\n"); - return -1; - } - regs->msr = imm; - regs->nip += 4; - return 1; - } - } - return 0; -} - /* Command interpreting routine */ static char *last_cmd; @@ -988,8 +885,11 @@ if ((regs->msr & (MSR_SF|MSR_PR|MSR_IR)) == (MSR_SF|MSR_IR)) { if (mread(regs->nip, &instr, 4) == 4) { stepped = emulate_step(regs, instr); - if (stepped < 0) + if (stepped < 0) { + printf("Couldn't single-step %s instruction\n", + (IS_RFID(instr)? "rfid": "mtmsrd")); return 0; + } if (stepped > 0) { regs->trap = 0xd00 | (regs->trap & 1); printf("stepped to "); diff -urN linux-2.5/include/asm-ppc64/sstep.h test/include/asm-ppc64/sstep.h --- /dev/null 2004-08-12 23:33:25.000000000 +1000 +++ test/include/asm-ppc64/sstep.h 2004-11-18 11:14:59.596330520 +1100 @@ -0,0 +1,13 @@ +/* + * Copyright (C) 2004 Paul Mackerras , IBM + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +struct pt_regs; + +/* Emulate instructions that cause a transfer of control. 
*/ +extern int emulate_step(struct pt_regs *regs, unsigned int instr);

From olof at austin.ibm.com Thu Nov 18 12:10:47 2004
From: olof at austin.ibm.com (Olof Johansson)
Date: Wed, 17 Nov 2004 19:10:47 -0600
Subject: [PATCH] PPC64: Make pci_alloc_consistent() conform to API docs
Message-ID: <20041118011047.GA436@austin.ibm.com>

Hi,

Documentation/DMA-mapping.txt says that pci_alloc_consistent() must
return a mapping that is aligned to the smallest power-of-two order at
least as large as the allocation. We're currently breaking this with
our iommu code.

To fix this, add an align_order argument to the relevant functions and
pass it down. Specifying an align_order of 0 gives the same behaviour
as before.

Signed-off-by: Olof Johansson
---
diff -puN arch/ppc64/kernel/iommu.c~alloc_consistent_order arch/ppc64/kernel/iommu.c --- linux-2.5/arch/ppc64/kernel/iommu.c~alloc_consistent_order 2004-11-17 18:59:24.449585800 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/iommu.c 2004-11-17 19:09:43.062542336 -0600 @@ -59,13 +59,18 @@ static int __init setup_iommu(char *str) __setup("iommu=", setup_iommu); -static unsigned long iommu_range_alloc(struct iommu_table *tbl, unsigned long npages, - unsigned long *handle) +static unsigned long iommu_range_alloc(struct iommu_table *tbl, + unsigned long npages, + unsigned long *handle, + unsigned int align_order) { unsigned long n, end, i, start; unsigned long limit; int largealloc = npages > 15; int pass = 0; + unsigned long align_mask; + + align_mask = (1UL << align_order) - 1; /* This allocator was derived from x86_64's bit string search */ @@ -97,6 +102,10 @@ static unsigned long iommu_range_alloc(s again: n = find_next_zero_bit(tbl->it_map, limit, start); + + /* Align allocation */ + n = (n + align_mask) & ~align_mask; + end = n + npages; if (unlikely(end >= limit)) { @@ -141,14 +150,15 @@ static unsigned long iommu_range_alloc(s } static dma_addr_t iommu_alloc(struct iommu_table *tbl, void *page, - unsigned int npages, enum dma_data_direction
direction) + unsigned int npages, enum dma_data_direction direction, + unsigned int align_order) { unsigned long entry, flags; dma_addr_t ret = DMA_ERROR_CODE; spin_lock_irqsave(&(tbl->it_lock), flags); - entry = iommu_range_alloc(tbl, npages, NULL); + entry = iommu_range_alloc(tbl, npages, NULL, align_order); if (unlikely(entry == DMA_ERROR_CODE)) { spin_unlock_irqrestore(&(tbl->it_lock), flags); @@ -264,7 +274,7 @@ int iommu_map_sg(struct device *dev, str vaddr = (unsigned long)page_address(s->page) + s->offset; npages = PAGE_ALIGN(vaddr + slen) - (vaddr & PAGE_MASK); npages >>= PAGE_SHIFT; - entry = iommu_range_alloc(tbl, npages, &handle); + entry = iommu_range_alloc(tbl, npages, &handle, 0); DBG(" - vaddr: %lx, size: %lx\n", vaddr, slen); @@ -478,7 +488,7 @@ dma_addr_t iommu_map_single(struct iommu npages >>= PAGE_SHIFT; if (tbl) { - dma_handle = iommu_alloc(tbl, vaddr, npages, direction); + dma_handle = iommu_alloc(tbl, vaddr, npages, direction, 0); if (dma_handle == DMA_ERROR_CODE) { if (printk_ratelimit()) { printk(KERN_INFO "iommu_alloc failed, " @@ -537,7 +547,7 @@ void *iommu_alloc_consistent(struct iomm memset(ret, 0, size); /* Set up tces to cover the allocated range */ - mapping = iommu_alloc(tbl, ret, npages, DMA_BIDIRECTIONAL); + mapping = iommu_alloc(tbl, ret, npages, DMA_BIDIRECTIONAL, order); if (mapping == DMA_ERROR_CODE) { free_pages((unsigned long)ret, order); ret = NULL; _

From ananth at in.ibm.com Thu Nov 18 21:13:02 2004
From: ananth at in.ibm.com (Ananth N Mavinakayanahalli)
Date: Thu, 18 Nov 2004 15:43:02 +0530
Subject: [PATCH] Kprobes for PPC64 - updated
Message-ID: <20041118101302.GA8830@in.ibm.com>

Hi,

Here is the updated kprobes patch for ppc64. Paul's "move emulate_step
to arch/ppc64/lib" patch is a prereq. Current patch is against
2.6.9-rc2-mm1.

Kprobes (Kernel dynamic probes) is a lightweight mechanism for kernel
modules to insert probes into a running kernel, without the need to
modify the underlying source.
The probe handlers can then be coded to log relevant data at the probe point. More information on kprobes can be found at: http://www-124.ibm.com/developerworks/oss/linux/projects/kprobes/ Jprobes (or jumper probes) is a small infrastructure to access function arguments. It can be used by defining a small stub with the same template as the routine in kernel, within which the required parameters can be logged. Thanks, Ananth Signed-off-by: Ananth N Mavinakayanahalli diff -Naurp temp/linux-2.6.10-rc2/arch/ppc64/Kconfig.debug linux-2.6.10-rc2/arch/ppc64/Kconfig.debug --- temp/linux-2.6.10-rc2/arch/ppc64/Kconfig.debug 2004-11-15 06:58:19.000000000 +0530 +++ linux-2.6.10-rc2/arch/ppc64/Kconfig.debug 2004-11-18 11:19:50.000000000 +0530 @@ -6,6 +6,16 @@ config DEBUG_STACKOVERFLOW bool "Check for stack overflows" depends on DEBUG_KERNEL +config KPROBES + bool "Kprobes" + depends on DEBUG_KERNEL + help + Kprobes allows you to trap at almost any kernel address and + execute a callback function. register_kprobe() establishes + a probepoint and specifies the callback. Kprobes is useful + for kernel debugging, non-intrusive instrumentation and testing. + If in doubt, say "N". + config DEBUG_STACK_USAGE bool "Stack utilization instrumentation" depends on DEBUG_KERNEL diff -Naurp temp/linux-2.6.10-rc2/arch/ppc64/kernel/kprobes.c linux-2.6.10-rc2/arch/ppc64/kernel/kprobes.c --- temp/linux-2.6.10-rc2/arch/ppc64/kernel/kprobes.c 1970-01-01 05:30:00.000000000 +0530 +++ linux-2.6.10-rc2/arch/ppc64/kernel/kprobes.c 2004-11-18 11:23:44.000000000 +0530 @@ -0,0 +1,258 @@ +/* + * Kernel Probes (KProbes) + * arch/ppc64/kernel/kprobes.c + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2002, 2004 + * + * 2002-Oct Created by Vamsi Krishna S Kernel + * Probes initial implementation ( includes contributions from + * Rusty Russell). + * 2004-July Suparna Bhattacharya added jumper probes + * interface to access function arguments. + * 2004-Nov Ananth N Mavinakayanahalli kprobes port + * for PPC64 + */ + +#include +#include +#include +#include +#include +#include +#include + +/* kprobe_status settings */ +#define KPROBE_HIT_ACTIVE 0x00000001 +#define KPROBE_HIT_SS 0x00000002 + +static struct kprobe *current_kprobe; +static unsigned long kprobe_status, kprobe_saved_msr; +static struct pt_regs jprobe_saved_regs; + +int arch_prepare_kprobe(struct kprobe *p) +{ + memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t)); + if (IS_MTMSRD(p->ainsn.insn[0]) || IS_RFID(p->ainsn.insn[0])) + /* cannot put bp on RFID/MTMSRD */ + return 1; + return 0; +} + +void arch_remove_kprobe(struct kprobe *p) +{ +} + +static inline void disarm_kprobe(struct kprobe *p, struct pt_regs *regs) +{ + *p->addr = p->opcode; + regs->nip = (unsigned long)p->addr; +} + +static inline void prepare_singlestep(struct kprobe *p, struct pt_regs *regs) +{ + regs->msr |= MSR_SE; + regs->nip = (unsigned long)&p->ainsn.insn; +} + +static inline int kprobe_handler(struct pt_regs *regs) +{ + struct kprobe *p; + int ret = 0; + unsigned int *addr = (unsigned int *)regs->nip; + + /* We're in an interrupt, but this is clear and BUG()-safe. 
*/ + preempt_disable(); + + /* Check we're not actually recursing */ + if (kprobe_running()) { + /* We *are* holding lock here, so this is safe. + Disarm the probe we just hit, and ignore it. */ + p = get_kprobe(addr); + if (p) { + disarm_kprobe(p, regs); + ret = 1; + } else { + p = current_kprobe; + if (p->break_handler && p->break_handler(p, regs)) { + goto ss_probe; + } + } + /* If it's not ours, can't be delete race, (we hold lock). */ + goto no_kprobe; + } + + lock_kprobes(); + p = get_kprobe(addr); + if (!p) { + unlock_kprobes(); + if (*addr != BREAKPOINT_INSTRUCTION) { + /* + * The breakpoint instruction was removed right + * after we hit it. Another cpu has removed + * either a probepoint or a debugger breakpoint + * at this address. In either case, no further + * handling of this interrupt is appropriate. + */ + ret = 1; + } + /* Not one of ours: let kernel handle it */ + goto no_kprobe; + } + + kprobe_status = KPROBE_HIT_ACTIVE; + current_kprobe = p; + kprobe_saved_msr = regs->msr; + if (p->pre_handler(p, regs)) { + /* handler has already set things up, so skip ss setup */ + return 1; + } + +ss_probe: + prepare_singlestep(p, regs); + kprobe_status = KPROBE_HIT_SS; + return 1; + +no_kprobe: + preempt_enable_no_resched(); + return ret; +} + +/* + * Called after single-stepping. p->addr is the address of the + * instruction whose first byte has been replaced by the "breakpoint" + * instruction. To avoid the SMP problems that can occur when we + * temporarily put back the original opcode to single-step, we + * single-stepped a copy of the instruction. The address of this + * copy is p->ainsn.insn. 
+ */ +static void resume_execution(struct kprobe *p, struct pt_regs *regs) +{ + int ret; + + regs->nip = (unsigned long)p->addr; + ret = emulate_step(regs, p->ainsn.insn[0]); + if (ret == 0) + regs->nip = (unsigned long)p->addr + 4; + + regs->msr &= ~MSR_SE; +} + +static inline int post_kprobe_handler(struct pt_regs *regs) +{ + if (!kprobe_running()) + return 0; + + if (current_kprobe->post_handler) + current_kprobe->post_handler(current_kprobe, regs, 0); + + resume_execution(current_kprobe, regs); + regs->msr |= kprobe_saved_msr; + + unlock_kprobes(); + preempt_enable_no_resched(); + + /* + * if somebody else is singlestepping across a probe point, msr + * will have SE set, in which case, continue the remaining processing + * of do_debug, as if this is not a probe hit. + */ + if (regs->msr & MSR_SE) + return 0; + + return 1; +} + +/* Interrupts disabled, kprobe_lock held. */ +static inline int kprobe_fault_handler(struct pt_regs *regs, int trapnr) +{ + if (current_kprobe->fault_handler + && current_kprobe->fault_handler(current_kprobe, regs, trapnr)) + return 1; + + if (kprobe_status & KPROBE_HIT_SS) { + resume_execution(current_kprobe, regs); + regs->msr |= kprobe_saved_msr; + + unlock_kprobes(); + preempt_enable_no_resched(); + } + return 0; +} + +/* + * Wrapper routine to for handling exceptions. 
+ */ +int kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, + void *data) +{ + struct die_args *args = (struct die_args *)data; + switch (val) { + case DIE_IABR_MATCH: + case DIE_DABR_MATCH: + case DIE_BPT: + if (kprobe_handler(args->regs)) + return NOTIFY_STOP; + break; + case DIE_SSTEP: + if (post_kprobe_handler(args->regs)) + return NOTIFY_STOP; + break; + case DIE_GPF: + case DIE_PAGE_FAULT: + if (kprobe_running() && + kprobe_fault_handler(args->regs, args->trapnr)) + return NOTIFY_STOP; + break; + default: + break; + } + return NOTIFY_DONE; +} + +int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs) +{ + struct jprobe *jp = container_of(p, struct jprobe, kp); + + memcpy(&jprobe_saved_regs, regs, sizeof(struct pt_regs)); + + /* setup return addr to the jprobe handler routine */ + regs->nip = (unsigned long)(((func_descr_t *)jp->entry)->entry); + regs->gpr[2] = (unsigned long)(((func_descr_t *)jp->entry)->toc); + + return 1; +} + +void jprobe_return(void) +{ + preempt_enable_no_resched(); + asm volatile("trap" ::: "memory"); +} + +void jprobe_return_end(void) +{ +}; + +int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs) +{ + /* + * FIXME - we should ideally be validating that we got here 'cos + * of the "trap" in jprobe_return() above, before restoring the + * saved regs... 
+ */ + memcpy(regs, &jprobe_saved_regs, sizeof(struct pt_regs)); + return 1; +} diff -Naurp temp/linux-2.6.10-rc2/arch/ppc64/kernel/Makefile linux-2.6.10-rc2/arch/ppc64/kernel/Makefile --- temp/linux-2.6.10-rc2/arch/ppc64/kernel/Makefile 2004-11-15 06:58:32.000000000 +0530 +++ linux-2.6.10-rc2/arch/ppc64/kernel/Makefile 2004-11-18 11:18:52.000000000 +0530 @@ -61,5 +61,6 @@ obj-$(CONFIG_PPC_MAPLE) += smp-tbsync.o endif obj-$(CONFIG_ALTIVEC) += vecemu.o vector.o +obj-$(CONFIG_KPROBES) += kprobes.o CFLAGS_ioctl32.o += -Ifs/ diff -Naurp temp/linux-2.6.10-rc2/arch/ppc64/kernel/traps.c linux-2.6.10-rc2/arch/ppc64/kernel/traps.c --- temp/linux-2.6.10-rc2/arch/ppc64/kernel/traps.c 2004-11-15 06:57:41.000000000 +0530 +++ linux-2.6.10-rc2/arch/ppc64/kernel/traps.c 2004-11-18 11:18:52.000000000 +0530 @@ -29,6 +29,7 @@ #include #include #include +#include #include #include @@ -61,6 +62,20 @@ EXPORT_SYMBOL(__debugger_dabr_match); EXPORT_SYMBOL(__debugger_fault_handler); #endif +struct notifier_block *ppc64_die_chain; +static spinlock_t die_notifier_lock = SPIN_LOCK_UNLOCKED; + +int register_die_notifier(struct notifier_block *nb) +{ + int err = 0; + unsigned long flags; + + spin_lock_irqsave(&die_notifier_lock, flags); + err = notifier_chain_register(&ppc64_die_chain, nb); + spin_unlock_irqrestore(&die_notifier_lock, flags); + return err; +} + /* * Trap & Exception support */ @@ -287,6 +302,9 @@ UnknownException(struct pt_regs *regs) void InstructionBreakpointException(struct pt_regs *regs) { + if (notify_die(DIE_IABR_MATCH, "iabr_match", regs, 5, + 5, SIGTRAP) == NOTIFY_STOP) + return; if (debugger_iabr_match(regs)) return; _exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip); @@ -297,6 +315,9 @@ SingleStepException(struct pt_regs *regs { regs->msr &= ~MSR_SE; /* Turn off 'trace' bit */ + if (notify_die(DIE_SSTEP, "single_step", regs, 5, + 5, SIGTRAP) == NOTIFY_STOP) + return; if (debugger_sstep(regs)) return; @@ -470,6 +491,9 @@ ProgramCheckException(struct pt_regs *re } else if 
(regs->msr & 0x20000) { /* trap exception */ + if (notify_die(DIE_BPT, "breakpoint", regs, 5, + 5, SIGTRAP) == NOTIFY_STOP) + return; if (debugger_bpt(regs)) return; diff -Naurp temp/linux-2.6.10-rc2/arch/ppc64/lib/Makefile linux-2.6.10-rc2/arch/ppc64/lib/Makefile --- temp/linux-2.6.10-rc2/arch/ppc64/lib/Makefile 2004-11-18 12:21:15.331888520 +0530 +++ linux-2.6.10-rc2/arch/ppc64/lib/Makefile 2004-11-18 11:23:19.000000000 +0530 @@ -15,4 +15,4 @@ ifdef CONFIG_PPC_ISERIES obj-$(CONFIG_PCI) += e2a.o endif -lib-$(CONFIG_XMON) += sstep.o +lib-$(CONFIG_DEBUG_KERNEL) += sstep.o diff -Naurp temp/linux-2.6.10-rc2/arch/ppc64/mm/fault.c linux-2.6.10-rc2/arch/ppc64/mm/fault.c --- temp/linux-2.6.10-rc2/arch/ppc64/mm/fault.c 2004-11-18 12:20:55.654957144 +0530 +++ linux-2.6.10-rc2/arch/ppc64/mm/fault.c 2004-11-18 11:18:52.000000000 +0530 @@ -36,6 +36,7 @@ #include #include #include +#include /* * Check whether the instruction at regs->nip is a store using @@ -95,6 +96,10 @@ int do_page_fault(struct pt_regs *regs, BUG_ON((trap == 0x380) || (trap == 0x480)); + if (notify_die(DIE_PAGE_FAULT, "page_fault", regs, error_code, + 11, SIGSEGV) == NOTIFY_STOP) + return 0; + if (trap == 0x300) { if (debugger_fault_handler(regs)) return 0; @@ -105,6 +110,9 @@ int do_page_fault(struct pt_regs *regs, return SIGSEGV; if (error_code & 0x00400000) { + if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, error_code, + 11, SIGSEGV) == NOTIFY_STOP) + return 0; if (debugger_dabr_match(regs)) return 0; } diff -Naurp temp/linux-2.6.10-rc2/arch/ppc64/xmon/xmon.c linux-2.6.10-rc2/arch/ppc64/xmon/xmon.c --- temp/linux-2.6.10-rc2/arch/ppc64/xmon/xmon.c 2004-11-18 12:21:15.342886848 +0530 +++ linux-2.6.10-rc2/arch/ppc64/xmon/xmon.c 2004-11-18 11:21:35.000000000 +0530 @@ -229,17 +229,6 @@ extern inline void sync(void) */ /* - * We don't allow single-stepping an mtmsrd that would clear - * MSR_RI, since that would make the exception unrecoverable. 
- * Since we need to single-step to proceed from a breakpoint, - * we don't allow putting a breakpoint on an mtmsrd instruction. - * Similarly we don't allow breakpoints on rfid instructions. - * These macros tell us if an instruction is a mtmsrd or rfid. - */ -#define IS_MTMSRD(instr) (((instr) & 0xfc0007fe) == 0x7c000164) -#define IS_RFID(instr) (((instr) & 0xfc0007fe) == 0x4c000024) - -/* * Disable surveillance (the service processor watchdog function) * while we are in xmon. * XXX we should re-enable it when we leave. :) diff -Naurp temp/linux-2.6.10-rc2/include/asm-ppc64/kdebug.h linux-2.6.10-rc2/include/asm-ppc64/kdebug.h --- temp/linux-2.6.10-rc2/include/asm-ppc64/kdebug.h 1970-01-01 05:30:00.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-ppc64/kdebug.h 2004-11-18 11:18:52.000000000 +0530 @@ -0,0 +1,43 @@ +#ifndef _PPC64_KDEBUG_H +#define _PPC64_KDEBUG_H 1 + +/* nearly identical to x86_64/i386 code */ + +#include + +struct pt_regs; + +struct die_args { + struct pt_regs *regs; + const char *str; + long err; + int trapnr; + int signr; +}; + +/* + Note - you should never unregister because that can race with NMIs. + If you really want to do it first unregister - then synchronize_kernel - + then free. + */ +int register_die_notifier(struct notifier_block *nb); +extern struct notifier_block *ppc64_die_chain; + +/* Grossly misnamed. 
*/ +enum die_val { + DIE_OOPS = 1, + DIE_IABR_MATCH, + DIE_DABR_MATCH, + DIE_BPT, + DIE_SSTEP, + DIE_GPF, + DIE_PAGE_FAULT, +}; + +static inline int notify_die(enum die_val val,char *str,struct pt_regs *regs,long err,int trap, int sig) +{ + struct die_args args = { .regs=regs, .str=str, .err=err, .trapnr=trap,.signr=sig }; + return notifier_call_chain(&ppc64_die_chain, val, &args); +} + +#endif diff -Naurp temp/linux-2.6.10-rc2/include/asm-ppc64/kprobes.h linux-2.6.10-rc2/include/asm-ppc64/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-ppc64/kprobes.h 1970-01-01 05:30:00.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-ppc64/kprobes.h 2004-11-18 11:20:48.000000000 +0530 @@ -0,0 +1,54 @@ +#ifndef _ASM_KPROBES_H +#define _ASM_KPROBES_H +/* + * Kernel Probes (KProbes) + * include/asm-ppc64/kprobes.h + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) IBM Corporation, 2002, 2004 + * + * 2002-Oct Created by Vamsi Krishna S Kernel + * Probes initial implementation ( includes suggestions from + * Rusty Russell). 
+ * 2004-Nov Modified for PPC64 by Ananth N Mavinakayanahalli + * + */ +#include +#include + +struct pt_regs; + +typedef unsigned int kprobe_opcode_t; +#define BREAKPOINT_INSTRUCTION 0x7fe00008 /* trap */ +#define MAX_INSN_SIZE 1 + +/* Architecture specific copy of original instruction */ +struct arch_specific_insn { + /* copy of original instruction */ + kprobe_opcode_t insn[MAX_INSN_SIZE]; +}; + +#ifdef CONFIG_KPROBES +extern int kprobe_exceptions_notify(struct notifier_block *self, + unsigned long val, void *data); +#else /* !CONFIG_KPROBES */ +static inline int kprobe_exceptions_notify(struct notifier_block *self, + unsigned long val, void *data) +{ + return 0; +} +#endif +#endif /* _ASM_KPROBES_H */ diff -Naurp temp/linux-2.6.10-rc2/include/asm-ppc64/sstep.h linux-2.6.10-rc2/include/asm-ppc64/sstep.h --- temp/linux-2.6.10-rc2/include/asm-ppc64/sstep.h 2004-11-18 12:21:15.344886544 +0530 +++ linux-2.6.10-rc2/include/asm-ppc64/sstep.h 2004-11-18 11:21:48.000000000 +0530 @@ -9,5 +9,16 @@ struct pt_regs; +/* + * We don't allow single-stepping an mtmsrd that would clear + * MSR_RI, since that would make the exception unrecoverable. + * Since we need to single-step to proceed from a breakpoint, + * we don't allow putting a breakpoint on an mtmsrd instruction. + * Similarly we don't allow breakpoints on rfid instructions. + * These macros tell us if an instruction is a mtmsrd or rfid. + */ +#define IS_MTMSRD(instr) (((instr) & 0xfc0007fe) == 0x7c000164) +#define IS_RFID(instr) (((instr) & 0xfc0007fe) == 0x4c000024) + /* Emulate instructions that cause a transfer of control. 
*/ extern int emulate_step(struct pt_regs *regs, unsigned int instr); From ananth at in.ibm.com Thu Nov 18 21:26:41 2004 From: ananth at in.ibm.com (Ananth N Mavinakayanahalli) Date: Thu, 18 Nov 2004 15:56:41 +0530 Subject: [PATCH] Kprobes: wrapper to define jprobe.entry Message-ID: <20041118102641.GB8830@in.ibm.com> Hi, Here is a patch that adds a wrapper for defining jprobe.entry to make it easy to handle the three dword function descriptors defined by the PowerPC ELF ABI. Current patch against 2.6.10-rc2-mm1 + kprobes patch for ppc64. Changes for adding this wrapper for x86, ppc64 (tested) and x86_64 (untested) below. The earlier method of defining jprobe.entry will continue to work. Here is a pseudocode snippet to use jprobes with this patch. ............ struct jprobe jp; jtcp_v4_rcv(struct skbuff *skb) { /* decode and log skb related details as required */ jprobe_return(); return 0; } init_module { jp.kp.addr = (kprobe_opcode_t *); jp.entry = JPROBE_ENTRY(jtcp_v4_rcv); register_jprobe(&jp); return 0; } cleanup_module { unregister_jprobe(&jp); } ............ Dave, I am not aware of the semantics for sparc64 for making this change. Thanks, Ananth Signed-off-by: Ananth N Mavinakayanahalli diff -Naurp temp/linux-2.6.10-rc2/include/asm-i386/kprobes.h linux-2.6.10-rc2/include/asm-i386/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-i386/kprobes.h 2004-11-15 06:57:53.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-i386/kprobes.h 2004-11-18 12:28:03.873952360 +0530 @@ -38,6 +38,8 @@ typedef u8 kprobe_opcode_t; ? 
(MAX_STACK_SIZE) \ : (((unsigned long)current_thread_info()) + THREAD_SIZE - (ADDR))) +#define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)pentry + /* Architecture specific copy of original instruction*/ struct arch_specific_insn { /* copy of the original instruction */ diff -Naurp temp/linux-2.6.10-rc2/include/asm-ppc64/kprobes.h linux-2.6.10-rc2/include/asm-ppc64/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-ppc64/kprobes.h 2004-11-18 12:26:43.236962848 +0530 +++ linux-2.6.10-rc2/include/asm-ppc64/kprobes.h 2004-11-18 12:28:03.875952056 +0530 @@ -35,6 +35,8 @@ typedef unsigned int kprobe_opcode_t; #define BREAKPOINT_INSTRUCTION 0x7fe00008 /* trap */ #define MAX_INSN_SIZE 1 +#define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)((func_descr_t *)pentry) + /* Architecture specific copy of original instruction */ struct arch_specific_insn { /* copy of original instruction */ diff -Naurp temp/linux-2.6.10-rc2/include/asm-x86_64/kprobes.h linux-2.6.10-rc2/include/asm-x86_64/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-x86_64/kprobes.h 2004-11-15 06:56:41.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-x86_64/kprobes.h 2004-11-18 12:28:03.877951752 +0530 @@ -37,6 +37,8 @@ typedef u8 kprobe_opcode_t; ? (MAX_STACK_SIZE) \ : (((unsigned long)current_thread_info()) + THREAD_SIZE - (ADDR))) +#define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)pentry + /* Architecture specific copy of original instruction*/ struct arch_specific_insn { /* copy of the original instruction */ From nfont at austin.ibm.com Fri Nov 19 01:24:57 2004 From: nfont at austin.ibm.com (Nathan Fontenot) Date: Thu, 18 Nov 2004 08:24:57 -0600 Subject: [PATCH] read_slot_reset_state2 rtas call In-Reply-To: <16795.57395.506121.891877@cargo.ozlabs.ibm.com> References: <41990014.1060404@austin.ibm.com> <16795.57395.506121.891877@cargo.ozlabs.ibm.com> Message-ID: <419CB0B9.2080708@austin.ibm.com> not a problem at all. thanks. -Nathan F. 
Paul Mackerras wrote: > Nathan Fontenot writes: > > >>Could you please forward upstream. > > > Sure. Since we are now supposed to be in bug-fix/stabilization mode > for 2.6.10, I'll defer this one until 2.6.10 is out, unless you have > an argument why it needs to be in 2.6.10. > > Thanks, > Paul. > > From akpm at osdl.org Fri Nov 19 09:47:46 2004 From: akpm at osdl.org (Andrew Morton) Date: Thu, 18 Nov 2004 14:47:46 -0800 Subject: [PATCH] Kprobes: wrapper to define jprobe.entry In-Reply-To: <20041118102641.GB8830@in.ibm.com> References: <20041118102641.GB8830@in.ibm.com> Message-ID: <20041118144746.7daa9395.akpm@osdl.org> Ananth N Mavinakayanahalli wrote: > > Here is a patch that adds a wrapper for defining jprobe.entry to make > it easy to handle the three dword function descriptors defined by the > PowerPC ELF ABI. > > Current patch against 2.6.10-rc2-mm1 + kprobes patch for ppc64. I don't have the kprobes-for-ppc64 patch here. > Changes for adding this wrapper for x86, ppc64 (tested) and x86_64 > (untested) below. The earlier method of defining jprobe.entry will > continue to work. So what should I do with this? I'm inclined to drop it until the x86_64 part has been tested and Dave has had a go at the sparc64 version. From davem at davemloft.net Fri Nov 19 09:43:47 2004 From: davem at davemloft.net (David S. Miller) Date: Thu, 18 Nov 2004 14:43:47 -0800 Subject: [PATCH] Kprobes: wrapper to define jprobe.entry In-Reply-To: <20041118144746.7daa9395.akpm@osdl.org> References: <20041118102641.GB8830@in.ibm.com> <20041118144746.7daa9395.akpm@osdl.org> Message-ID: <20041118144347.27008df7.davem@davemloft.net> On Thu, 18 Nov 2004 14:47:46 -0800 Andrew Morton wrote: > Ananth N Mavinakayanahalli wrote: > > Changes for adding this wrapper for x86, ppc64 (tested) and x86_64 > > (untested) below. The earlier method of defining jprobe.entry will > > continue to work. > > So what should I do with this? 
I'm inclined to drop it until the x86_64 > part has been tested and Dave has had a go at the sparc64 version. Yes, now that we have kprobe support on 4 platforms, it is important that anyone who changes public parts of this interface do the per-platform fixups necessary to coincide with such changes. I think the person changing the data type should be the one fixing up sparc64 :-) From haveblue at us.ibm.com Fri Nov 19 09:52:33 2004 From: haveblue at us.ibm.com (Dave Hansen) Date: Thu, 18 Nov 2004 14:52:33 -0800 Subject: should cpus_in_xmon be volatile? Message-ID: <1100818353.24982.348.camel@localhost> I'm getting a warning (one of many) during a build of 2.6.10-rc2-mm2: memhotplug/arch/ppc64/xmon/xmon.c: In function `xmon_core': memhotplug/arch/ppc64/xmon/xmon.c:401: warning: passing arg 1 of `__cpus_weight' discards qualifiers from pointer target type It's this chunk of code: for (timeout = 100000000; timeout != 0; --timeout) if (cpus_weight(cpus_in_xmon) >= ncpus) break; Is that warning because cpus_in_xmon is volatile and __cpus_weight() doesn't take a pointer to a volatile type? I do notice that none of the test/set_bit() functions have volatile types, and I have the feeling this is because they take pointers to begin with. Does the fact that __cpus_weight() takes a pointer mean that cpus_in_xmon doesn't really need to be declared volatile? 
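For what it's worth, the warning itself can be reproduced outside the kernel. The following is only a sketch under stated assumptions: mask_t, MASK_WORDS, mask_weight(), mask_weight_volatile() and in_xmon are hypothetical stand-ins (for cpumask_t, __cpus_weight() and cpus_in_xmon), not the kernel's definitions. It shows why passing the address of a volatile object to a function taking a non-volatile pointer draws the "discards qualifiers" diagnostic, and one way to count bits without the warning.

```c
#include <stddef.h>

#define MASK_WORDS 2

/* Hypothetical stand-in for cpumask_t; the real type is sized by NR_CPUS. */
typedef struct {
	unsigned long bits[MASK_WORDS];
} mask_t;

/* Like __cpus_weight() in spirit: the parameter points to non-volatile data. */
int mask_weight(const mask_t *m)
{
	int n = 0;
	size_t i;

	for (i = 0; i < MASK_WORDS; i++) {
		unsigned long w = m->bits[i];

		while (w) {
			w &= w - 1;	/* clear the lowest set bit */
			n++;
		}
	}
	return n;
}

/* Declared volatile, as cpus_in_xmon is described in the message above. */
volatile mask_t in_xmon;

/*
 * Calling mask_weight(&in_xmon) converts "volatile mask_t *" to
 * "const mask_t *", which is what the compiler complains about.
 * One warning-free option (an assumption, not the kernel's approach):
 * read each word through the volatile pointer into a private snapshot,
 * then count the snapshot.
 */
int mask_weight_volatile(const volatile mask_t *m)
{
	mask_t snap;
	size_t i;

	for (i = 0; i < MASK_WORDS; i++)
		snap.bits[i] = m->bits[i];
	return mask_weight(&snap);
}
```

With a helper like this, each pass of the timeout loop re-reads the mask from memory. Whether cpus_in_xmon needs to be volatile at all is a separate question that this sketch does not settle.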
-- Dave From david at gibson.dropbear.id.au Fri Nov 19 10:44:44 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 19 Nov 2004 10:44:44 +1100 Subject: [RFC] sysfs cpu cleanup In-Reply-To: <1100703552.8092.15.camel@localhost.localdomain> References: <20041117071608.GB19019@zax> <1100703552.8092.15.camel@localhost.localdomain> Message-ID: <20041118234444.GA16796@zax> On Wed, Nov 17, 2004 at 08:59:12AM -0600, Nathan Lynch wrote: > Hi David- > > On Wed, 2004-11-17 at 18:16 +1100, David Gibson wrote: > > Currently the ppc64 sysfs code registers an entry for each possible cpu > > in sysfs, rather than just online cpus. That makes sense, since the > > sysfs entries are needed to control onlining of the cpus. However, > > this is done even if CONFIG_HOTPLUG_CPU is not set, or if it is not a > > hotplug capable (DLPAR) machine, which is a bit misleading. Secondly > > it also registers all the other sysfs entries (physical_id and the pmc > > stuff) on all possible cpus, although they are quite meaningless on > > non-online cpus. > > > > This patch alters the code to only register sysfs directories at boot > > for cpus which are either online or could be onlined (cpu is possible, > > and CONFIG_HOTPLUG_CPU and an lpar machine). Furthermore, the entries > > apart from 'online' itself are only registered for online CPUs (and > > deregistered again if a cpu goes offline). > > > > Anyone see any problems with this approach? Also, this has not yet > > been tested in the presence of actual cpu hotplugging... > > See http://www.ussg.iu.edu/hypermail/linux/kernel/0410.3/0020.html for > what I think is the best solution to this - dynamic cpu device > registration. In short, the driver model "core" would register all > present cpus at boot, which would be correct regardless of > CONFIG_HOTPLUG_CPU. Addition and removal of the sysfs entities should > correspond to addition and removal of cpus (not online/offline), imo. Sounds fairly reasonable. 
> Of course, I haven't had much time to work on this lately. Your patch > looks fine to me except for the treatment of the physical_id attribute. > We need this to be present even on offline cpus, because it is sometimes > used to determine which cpu to start. Well, I did wonder about the physical_id attribute. However for the time being, at least, the physical_id showed as 0 for all offline CPUs, so at present it's not meaningful for offline CPUs - maybe it should be, in which case there's another bug... -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist. NOT _the_ _other_ _way_ | _around_! http://www.ozlabs.org/people/dgibson From nathanl at austin.ibm.com Fri Nov 19 11:22:02 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Thu, 18 Nov 2004 18:22:02 -0600 Subject: [RFC] sysfs cpu cleanup In-Reply-To: <20041118234444.GA16796@zax> References: <20041117071608.GB19019@zax> <1100703552.8092.15.camel@localhost.localdomain> <20041118234444.GA16796@zax> Message-ID: <1100823721.6601.11.camel@biclops> On Thu, 2004-11-18 at 17:44, David Gibson wrote: > On Wed, Nov 17, 2004 at 08:59:12AM -0600, Nathan Lynch wrote: > > Of course, I haven't had much time to work on this lately. Your patch > > looks fine to me except for the treatment of the physical_id attribute. > > We need this to be present even on offline cpus, because it is sometimes > > used to determine which cpu to start. > > Well, I did wonder about the physical_id attribute. However for the > time being, at least, the physical_id showed as 0 for all offline > CPUs, so at present it's not meaningful for offline CPUs - maybe it > should be, in which case there's another bug... Hmm, are you sure that's the case? Cpus which are possible but not present have the bogus physical_id, yes, and this has always bugged me, but a cpu which is offlined retains its physical_id value. 
Example from a system with 2 present cpus and 8 possible cpus: linux:~ # cat listcpus.sh #!/bin/bash for cpu in $(find /sys/devices/system/cpu/ -type d -name 'cpu*') ; do echo -ne "$cpu" echo -ne "\tonline $(cat $cpu/online)" echo -ne "\tphysical_id $(cat $cpu/physical_id)" echo done linux:~ # ./listcpus.sh /sys/devices/system/cpu/cpu7 online 0 physical_id 0 /sys/devices/system/cpu/cpu6 online 0 physical_id 0 /sys/devices/system/cpu/cpu5 online 0 physical_id 0 /sys/devices/system/cpu/cpu4 online 0 physical_id 0 /sys/devices/system/cpu/cpu3 online 0 physical_id 0 /sys/devices/system/cpu/cpu2 online 0 physical_id 0 /sys/devices/system/cpu/cpu1 online 1 physical_id 1 /sys/devices/system/cpu/cpu0 online 1 physical_id 0 linux:~ # echo 0 > /sys/devices/system/cpu/cpu1/online linux:~ # ./listcpus.sh /sys/devices/system/cpu/cpu7 online 0 physical_id 0 /sys/devices/system/cpu/cpu6 online 0 physical_id 0 /sys/devices/system/cpu/cpu5 online 0 physical_id 0 /sys/devices/system/cpu/cpu4 online 0 physical_id 0 /sys/devices/system/cpu/cpu3 online 0 physical_id 0 /sys/devices/system/cpu/cpu2 online 0 physical_id 0 /sys/devices/system/cpu/cpu1 online 0 physical_id 1 /sys/devices/system/cpu/cpu0 online 1 physical_id 0 It looks like Olof's early processor spinup patch from a few days ago would change the physical_id attribute to show -1 for nonpresent cpus, fwiw. 
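The same check can be done from C rather than bash. This is a rough sketch only: read_cpu_attr() and list_cpus() are hypothetical helper names, the root directory is passed in (normally /sys/devices/system/cpu), and it assumes both attributes read back as plain decimal integers, as in the output above. Unlike the find-based script, it walks cpu numbers 0..max_cpus-1 in order and skips any cpu whose attributes cannot be read.

```c
#include <stdio.h>

/*
 * Read one integer attribute ("online", "physical_id") for a cpu.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
int read_cpu_attr(const char *root, int cpu, const char *attr, long *val)
{
	char path[256];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/cpu%d/%s", root, cpu, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", val) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Print one line per cpu directory that has both attributes,
 * in the same layout as listcpus.sh. Returns the number printed.
 */
int list_cpus(const char *root, int max_cpus)
{
	int cpu, count = 0;

	for (cpu = 0; cpu < max_cpus; cpu++) {
		long online, phys;

		if (read_cpu_attr(root, cpu, "online", &online) ||
		    read_cpu_attr(root, cpu, "physical_id", &phys))
			continue;
		printf("%s/cpu%d\tonline %ld\tphysical_id %ld\n",
		       root, cpu, online, phys);
		count++;
	}
	return count;
}
```

Calling list_cpus("/sys/devices/system/cpu", 8) before and after offlining a cpu should show whether physical_id survives the transition on a given kernel.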
Nathan From david at gibson.dropbear.id.au Fri Nov 19 13:30:59 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 19 Nov 2004 13:30:59 +1100 Subject: [RFC] sysfs cpu cleanup In-Reply-To: <1100823721.6601.11.camel@biclops> References: <20041117071608.GB19019@zax> <1100703552.8092.15.camel@localhost.localdomain> <20041118234444.GA16796@zax> <1100823721.6601.11.camel@biclops> Message-ID: <20041119023059.GF16796@zax> On Thu, Nov 18, 2004 at 06:22:02PM -0600, Nathan Lynch wrote: > On Thu, 2004-11-18 at 17:44, David Gibson wrote: > > On Wed, Nov 17, 2004 at 08:59:12AM -0600, Nathan Lynch wrote: > > > Of course, I haven't had much time to work on this lately. Your patch > > > looks fine to me except for the treatment of the physical_id attribute. > > > We need this to be present even on offline cpus, because it is sometimes > > > used to determine which cpu to start. > > > > Well, I did wonder about the physical_id attribute. However for the > > time being, at least, the physical_id showed as 0 for all offline > > CPUs, so at present it's not meaningful for offline CPUs - maybe it > > should be, in which case there's another bug... > > Hmm, are you sure that's the case? Cpus which are possible but not > present have the bogus physical_id, yes, and this has always bugged me, > but a cpu which is offlined retains its physical_id value. 
Example from > a system with 2 present cpus and 8 possible cpus: > > linux:~ # cat listcpus.sh > #!/bin/bash > > for cpu in $(find /sys/devices/system/cpu/ -type d -name 'cpu*') ; do > echo -ne "$cpu" > echo -ne "\tonline $(cat $cpu/online)" > echo -ne "\tphysical_id $(cat $cpu/physical_id)" > echo > done > > linux:~ # ./listcpus.sh > /sys/devices/system/cpu/cpu7 online 0 physical_id 0 > /sys/devices/system/cpu/cpu6 online 0 physical_id 0 > /sys/devices/system/cpu/cpu5 online 0 physical_id 0 > /sys/devices/system/cpu/cpu4 online 0 physical_id 0 > /sys/devices/system/cpu/cpu3 online 0 physical_id 0 > /sys/devices/system/cpu/cpu2 online 0 physical_id 0 > /sys/devices/system/cpu/cpu1 online 1 physical_id 1 > /sys/devices/system/cpu/cpu0 online 1 physical_id 0 > linux:~ # echo 0 > /sys/devices/system/cpu/cpu1/online > linux:~ # ./listcpus.sh > /sys/devices/system/cpu/cpu7 online 0 physical_id 0 > /sys/devices/system/cpu/cpu6 online 0 physical_id 0 > /sys/devices/system/cpu/cpu5 online 0 physical_id 0 > /sys/devices/system/cpu/cpu4 online 0 physical_id 0 > /sys/devices/system/cpu/cpu3 online 0 physical_id 0 > /sys/devices/system/cpu/cpu2 online 0 physical_id 0 > /sys/devices/system/cpu/cpu1 online 0 physical_id 1 > /sys/devices/system/cpu/cpu0 online 1 physical_id 0 Ah, I see, yes, same behaviour confirmed here. I guess I was confused, because the first time I was looking at this the lpar only had one "present" cpu, although 4 possible were showing up. The fact that there is no indication of which CPUs are present in there is certainly a bad thing, which I guess your proposal to only register present CPUs would address. As would the change below. > It looks like Olof's early processor spinup patch from a few days ago > would change the physical_id attribute to show -1 for nonpresent cpus, > fwiw. That would be good. Anyway, here is a new version of the patch which leaves physical_id there for all possible CPUs. 
If there are no problems that anyone can see, I'll forward this on to Andrew Morton. ==== Currently the ppc64 sysfs code registers an entry for each possible cpu in sysfs, rather than just online cpus. That makes sense, since the sysfs entries are needed to control onlining of the cpus. However, this is done even if CONFIG_HOTPLUG_CPU is not set, or if it is not a hotplug capable (DLPAR) machine, which is a bit misleading. Secondly it also registers all the other sysfs entries (mostly performance monitoring controls) on all possible cpus, although they are quite meaningless on non-online cpus. This patch alters the code to only register sysfs directories at boot for cpus which are either online or could be onlined (cpu is possible, and CONFIG_HOTPLUG_CPU and an lpar machine). Furthermore, the entries apart from 'online' itself and 'physical_id' are only registered for online CPUs (and deregistered again if a cpu goes offline). Index: working-2.6/arch/ppc64/kernel/sysfs.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/sysfs.c 2004-10-19 13:37:21.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/sysfs.c 2004-11-19 13:07:10.690978384 +1100 @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -13,6 +14,8 @@ #include +static DEFINE_PER_CPU(struct cpu, cpu_devices); + /* SMT stuff */ #ifndef CONFIG_PPC_ISERIES @@ -259,8 +262,18 @@ static SYSDEV_ATTR(pmc8, 0600, show_pmc8, store_pmc8); static SYSDEV_ATTR(purr, 0600, show_purr, NULL); -static void __init register_cpu_pmc(struct sys_device *s) +void register_cpu_online(int cpu) { + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_create_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + sysdev_create_file(s, &attr_mmcr0); sysdev_create_file(s, &attr_mmcr1); @@ -283,6 +296,43 @@ sysdev_create_file(s, 
&attr_purr); } +#ifdef CONFIG_HOTPLUG_CPU +void unregister_cpu_online(int cpu) +{ + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + + BUG_ON(c->no_control); + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_remove_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + + sysdev_remove_file(s, &attr_mmcr0); + sysdev_remove_file(s, &attr_mmcr1); + + if (cur_cpu_spec->cpu_features & CPU_FTR_MMCRA) + sysdev_remove_file(s, &attr_mmcra); + + sysdev_remove_file(s, &attr_pmc1); + sysdev_remove_file(s, &attr_pmc2); + sysdev_remove_file(s, &attr_pmc3); + sysdev_remove_file(s, &attr_pmc4); + sysdev_remove_file(s, &attr_pmc5); + sysdev_remove_file(s, &attr_pmc6); + + if (cur_cpu_spec->cpu_features & CPU_FTR_PMC8) { + sysdev_remove_file(s, &attr_pmc7); + sysdev_remove_file(s, &attr_pmc8); + } + + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_remove_file(s, &attr_purr); +} +#endif /* CONFIG_HOTPLUG_CPU */ /* NUMA stuff */ @@ -312,8 +362,7 @@ } #endif - -/* Only valid if CPU is online. */ +/* Only valid if CPU is present. */ static ssize_t show_physical_id(struct sys_device *dev, char *buf) { struct cpu *cpu = container_of(dev, struct cpu, sysdev); @@ -322,9 +371,6 @@ } static SYSDEV_ATTR(physical_id, 0444, show_physical_id, NULL); - -static DEFINE_PER_CPU(struct cpu, cpu_devices); - static int __init topology_init(void) { int cpu; @@ -345,19 +391,19 @@ * CPU. For instance, the boot cpu might never be valid * for hotplugging. 
*/ +#ifdef CONFIG_HOTPLUG_CPU if (systemcfg->platform != PLATFORM_PSERIES_LPAR) +#endif c->no_control = 1; - register_cpu(c, cpu, parent); + if (cpu_online(cpu) || (c->no_control == 0)) + register_cpu(c, cpu, parent); - register_cpu_pmc(&c->sysdev); + sysdev_create_file(s, &attr_physical_id); - sysdev_create_file(&c->sysdev, &attr_physical_id); + if (cpu_online(cpu)) + register_cpu_online(cpu); -#ifndef CONFIG_PPC_ISERIES - if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) - sysdev_create_file(&c->sysdev, &attr_smt_snooze_delay); -#endif } return 0; Index: working-2.6/arch/ppc64/kernel/smp.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/smp.c 2004-10-19 13:37:56.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/smp.c 2004-11-17 17:56:42.000000000 +1100 @@ -82,6 +82,8 @@ void smp_call_function_interrupt(void); extern long register_vpa(unsigned long flags, unsigned long proc, unsigned long vpa); +extern void register_cpu_online(int cpu); +extern void unregister_cpu_online(int cpu); int smt_enabled_at_boot = 1; @@ -291,6 +293,8 @@ int cpu_status; unsigned int pcpu = get_hard_smp_processor_id(cpu); + unregister_cpu_online(cpu); + for (tries = 0; tries < 25; tries++) { cpu_status = query_cpu_stopped(pcpu); if (cpu_status == 0 || cpu_status == -1) @@ -919,6 +923,11 @@ while (!cpu_online(cpu)) cpu_relax(); +#ifdef CONFIG_HOTPLUG_CPU + if (system_state >= SYSTEM_RUNNING) /* This is a hotplug */ + register_cpu_online(cpu); +#endif /* CONFIG_HOTPLUG_CPU */ + return 0; } -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist. NOT _the_ _other_ _way_ | _around_! 
http://www.ozlabs.org/people/dgibson From nathanl at austin.ibm.com Fri Nov 19 15:39:06 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Thu, 18 Nov 2004 22:39:06 -0600 Subject: [PATCH] ppc64 NUMA code needs an IQ injection In-Reply-To: <20041112020038.GB20769@krispykreme.ozlabs.ibm.com> References: <20041112020038.GB20769@krispykreme.ozlabs.ibm.com> Message-ID: <1100839146.6601.21.camel@biclops> Ran into this on a 4GB partition - all but about ~300MB was thrown away. Does this look ok? It works for me, but I've not tested on firmware without the bug. Fall back to non-numa setup upon discovering unexpected memory layout as presented by firmware, instead of throwing away regions. Signed-off-by: Nathan Lynch Index: linux-2.6.10-rc2-bk3/arch/ppc64/mm/numa.c =================================================================== --- linux-2.6.10-rc2-bk3.orig/arch/ppc64/mm/numa.c 2004-11-19 02:14:28.000000000 +0000 +++ linux-2.6.10-rc2-bk3/arch/ppc64/mm/numa.c 2004-11-19 03:58:22.000000000 +0000 @@ -345,8 +345,6 @@ numa_domain = 0; } - node_set_online(numa_domain); - if (max_domain < numa_domain) max_domain = numa_domain; @@ -361,14 +359,18 @@ init_node_data[numa_domain].node_start_pfn + init_node_data[numa_domain].node_spanned_pages; if (shouldstart != (start / PAGE_SIZE)) { - printk(KERN_ERR "WARNING: Hole in node, " - "disabling region start %lx " - "length %lx\n", start, size); - continue; + /* Revert to non-numa for now */ + printk(KERN_ERR + "WARNING: Unexpected node layout: " + "region start %lx length %lx\n", + start, size); + goto err; } init_node_data[numa_domain].node_spanned_pages += size / PAGE_SIZE; } else { + node_set_online(numa_domain); + init_node_data[numa_domain].node_start_pfn = start / PAGE_SIZE; init_node_data[numa_domain].node_spanned_pages = @@ -387,6 +389,15 @@ numnodes = max_domain + 1; return 0; +err: + /* Something has gone wrong; revert any setup we've done */ + for_each_node(i) { + node_set_offline(i); + 
init_node_data[i].node_start_pfn = 0; + init_node_data[i].node_spanned_pages = 0; + } + numnodes = 1; + return -1; } static void __init setup_nonnuma(void) From paulus at samba.org Fri Nov 19 15:41:50 2004 From: paulus at samba.org (Paul Mackerras) Date: Fri, 19 Nov 2004 15:41:50 +1100 Subject: [PATCH] PPC64: Make pci_alloc_consistent() conform to API docs In-Reply-To: <20041118011047.GA436@austin.ibm.com> References: <20041118011047.GA436@austin.ibm.com> Message-ID: <16797.31118.417357.647479@cargo.ozlabs.ibm.com> Olof Johansson writes: > Documentation/DMA-mapping.txt says that pci_alloc_consistent() needs to return > a mapping that is aligned by the closest larger order of two as the allocation. > > We're currently breaking this with our iommu code. To fix this, add align_order > arguments to the relevant functions and pass it down. Specifying align_order of > 0 gives same behaviour as previous. Andrew, Olof tells me that this fix is needed in order for a workaround for a hardware bug in the e1000 driver to be effective. Could it go in 2.6.10 please? Thanks, Paul. From ananth at in.ibm.com Fri Nov 19 17:52:58 2004 From: ananth at in.ibm.com (Ananth N Mavinakayanahalli) Date: Fri, 19 Nov 2004 12:22:58 +0530 Subject: [PATCH] Kprobes: wrapper to define jprobe.entry In-Reply-To: <20041118144746.7daa9395.akpm@osdl.org> References: <20041118102641.GB8830@in.ibm.com> <20041118144746.7daa9395.akpm@osdl.org> Message-ID: <20041119065258.GA6863@in.ibm.com> On Thu, Nov 18, 2004 at 02:47:46PM -0800, Andrew Morton wrote: > Ananth N Mavinakayanahalli wrote: Hi Andrew, > > > > Here is a patch that adds a wrapper for defining jprobe.entry to make > > it easy to handle the three dword function descriptors defined by the > > PowerPC ELF ABI. > > > > Current patch against 2.6.10-rc2-mm1 + kprobes patch for ppc64. > > I don't have the kprobes-for-ppc64 patch here. > > > Changes for adding this wrapper for x86, ppc64 (tested) and x86_64 > > (untested) below. 
The earlier method of defining jprobe.entry will > > continue to work. > > So what should I do with this? I'm inclined to drop it until the x86_64 > part has been tested and Dave has had a go at the sparc64 version. I have now tested the patch succesfully on x86_64 and updated it for sparc64 too (Dave says the change looks good). Please apply. Thanks, Ananth Signed-off-by: Ananth N Mavinakayanahalli diff -Naurp temp/linux-2.6.10-rc2/include/asm-i386/kprobes.h linux-2.6.10-rc2/include/asm-i386/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-i386/kprobes.h 2004-11-19 10:14:44.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-i386/kprobes.h 2004-11-19 10:05:16.000000000 +0530 @@ -38,6 +38,8 @@ typedef u8 kprobe_opcode_t; ? (MAX_STACK_SIZE) \ : (((unsigned long)current_thread_info()) + THREAD_SIZE - (ADDR))) +#define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)pentry + /* Architecture specific copy of original instruction*/ struct arch_specific_insn { /* copy of the original instruction */ diff -Naurp temp/linux-2.6.10-rc2/include/asm-ppc64/kprobes.h linux-2.6.10-rc2/include/asm-ppc64/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-ppc64/kprobes.h 2004-11-19 10:14:44.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-ppc64/kprobes.h 2004-11-19 10:05:16.000000000 +0530 @@ -35,6 +35,8 @@ typedef unsigned int kprobe_opcode_t; #define BREAKPOINT_INSTRUCTION 0x7fe00008 /* trap */ #define MAX_INSN_SIZE 1 +#define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)((func_descr_t *)pentry) + /* Architecture specific copy of original instruction */ struct arch_specific_insn { /* copy of original instruction */ diff -Naurp temp/linux-2.6.10-rc2/include/asm-sparc64/kprobes.h linux-2.6.10-rc2/include/asm-sparc64/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-sparc64/kprobes.h 2004-11-15 06:57:53.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-sparc64/kprobes.h 2004-11-19 10:07:24.000000000 +0530 @@ -10,6 +10,8 @@ typedef u32 kprobe_opcode_t; #define BREAKPOINT_INSTRUCTION_2 0x91d02071 /* ta 0x71 
*/ #define MAX_INSN_SIZE 2 +#define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)pentry + /* Architecture specific copy of original instruction*/ struct arch_specific_insn { /* copy of the original instruction */ diff -Naurp temp/linux-2.6.10-rc2/include/asm-x86_64/kprobes.h linux-2.6.10-rc2/include/asm-x86_64/kprobes.h --- temp/linux-2.6.10-rc2/include/asm-x86_64/kprobes.h 2004-11-19 10:14:44.000000000 +0530 +++ linux-2.6.10-rc2/include/asm-x86_64/kprobes.h 2004-11-19 10:05:16.000000000 +0530 @@ -37,6 +37,8 @@ typedef u8 kprobe_opcode_t; ? (MAX_STACK_SIZE) \ : (((unsigned long)current_thread_info()) + THREAD_SIZE - (ADDR))) +#define JPROBE_ENTRY(pentry) (kprobe_opcode_t *)pentry + /* Architecture specific copy of original instruction*/ struct arch_specific_insn { /* copy of the original instruction */ From akpm at osdl.org Fri Nov 19 18:05:06 2004 From: akpm at osdl.org (Andrew Morton) Date: Thu, 18 Nov 2004 23:05:06 -0800 Subject: [PATCH] Kprobes: wrapper to define jprobe.entry In-Reply-To: <20041119065258.GA6863@in.ibm.com> References: <20041118102641.GB8830@in.ibm.com> <20041118144746.7daa9395.akpm@osdl.org> <20041119065258.GA6863@in.ibm.com> Message-ID: <20041118230506.4d20b3c9.akpm@osdl.org> Ananth N Mavinakayanahalli wrote: > > > > > > > Here is a patch that adds a wrapper for defining jprobe.entry to make > > > it easy to handle the three dword function descriptors defined by the > > > PowerPC ELF ABI. > > > > > > Current patch against 2.6.10-rc2-mm1 + kprobes patch for ppc64. > > > > I don't have the kprobes-for-ppc64 patch here. > > > > > Changes for adding this wrapper for x86, ppc64 (tested) and x86_64 > > > (untested) below. The earlier method of defining jprobe.entry will > > > continue to work. > > > > So what should I do with this? I'm inclined to drop it until the x86_64 > > part has been tested and Dave has had a go at the sparc64 version. 
>
> I have now tested the patch succesfully on x86_64 and updated it for
> sparc64 too (Dave says the change looks good).

What is the review and testing status of the kprobes-for-ppc64 patch which
you sent?

From anton at samba.org Fri Nov 19 18:25:54 2004
From: anton at samba.org (Anton Blanchard)
Date: Fri, 19 Nov 2004 18:25:54 +1100
Subject: kexec interface changes
Message-ID: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com>

Here is a second pass at the kexec interface changes. Any comments? I'm
hoping to get these into 2.6.10 - they make incompatible changes to
the interface which we should resolve before kexec users appear.

Anton

From anton at samba.org Fri Nov 19 18:32:11 2004
From: anton at samba.org (Anton Blanchard)
Date: Fri, 19 Nov 2004 18:32:11 +1100
Subject: [PATCH] ppc64: linux,rtas* fixes
In-Reply-To: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com>
References: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com>
Message-ID: <20041119073211.GC12007@krispykreme.ozlabs.ibm.com>

Move the linux,rtas* properties into the /rtas node and make them 32bit.
Use rtas-size and avoid duplicating it in linux,rtas-size.

Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/prom_init.c~ppc64:linux.rtas_fixes.patch?id=432 arch/ppc64/kernel/prom_init.c --- gr_base2/arch/ppc64/kernel/prom_init.c~ppc64:linux.rtas_fixes.patch?id=432 2004-11-14 16:01:43.138791178 -0600 +++ gr_base2-anton/arch/ppc64/kernel/prom_init.c 2004-11-16 00:44:16.371850845 -0600 @@ -701,9 +701,9 @@ static void __init prom_instantiate_rtas { unsigned long offset = reloc_offset(); struct prom_t *_prom = PTRRELOC(&prom); - phandle prom_rtas; - u64 base, entry = 0; - u32 size = 0; + phandle prom_rtas, rtas_node; + u32 base, entry = 0; + u32 size = 0; prom_debug("prom_instantiate_rtas: start...\n"); @@ -723,12 +723,12 @@ static void __init prom_instantiate_rtas } prom_printf("instantiating rtas at 0x%x", base); - prom_rtas = call_prom("open", 1, 1, ADDR("/rtas")); + rtas_node = call_prom("open", 1, 1, ADDR("/rtas")); prom_printf("..."); if (call_prom("call-method", 3, 2, ADDR("instantiate-rtas"), - prom_rtas, base) != PROM_ERROR) { + rtas_node, base) != PROM_ERROR) { entry = (long)_prom->args.rets[1]; } if (entry == 0) { @@ -739,9 +739,8 @@ static void __init prom_instantiate_rtas reserve_mem(base, size); - prom_setprop(_prom->chosen, "linux,rtas-base", &base, sizeof(base)); - prom_setprop(_prom->chosen, "linux,rtas-entry", &entry, sizeof(entry)); - prom_setprop(_prom->chosen, "linux,rtas-size", &size, sizeof(size)); + prom_setprop(prom_rtas, "linux,rtas-base", &base, sizeof(base)); + prom_setprop(prom_rtas, "linux,rtas-entry", &entry, sizeof(entry)); prom_debug("rtas base = 0x%x\n", base); prom_debug("rtas entry = 0x%x\n", entry); diff -puN arch/ppc64/kernel/rtas.c~ppc64:linux.rtas_fixes.patch?id=432 arch/ppc64/kernel/rtas.c --- gr_base2/arch/ppc64/kernel/rtas.c~ppc64:linux.rtas_fixes.patch?id=432 2004-11-14 16:01:43.144790340 -0600 +++ gr_base2-anton/arch/ppc64/kernel/rtas.c 2004-11-14 16:01:43.159788246 -0600 @@ -573,15 +573,15 @@ void __init rtas_initialize(void) */ rtas.dev = of_find_node_by_name(NULL, 
"rtas"); if (rtas.dev) { - u64 *basep, *entryp; + u32 *basep, *entryp; u32 *sizep; - basep = (u64 *)get_property(of_chosen, "linux,rtas-base", NULL); - sizep = (u32 *)get_property(of_chosen, "linux,rtas-size", NULL); + basep = (u32 *)get_property(rtas.dev, "linux,rtas-base", NULL); + sizep = (u32 *)get_property(rtas.dev, "rtas-size", NULL); if (basep != NULL && sizep != NULL) { rtas.base = *basep; rtas.size = *sizep; - entryp = (u64 *)get_property(of_chosen, "linux,rtas-entry", NULL); + entryp = (u32 *)get_property(rtas.dev, "linux,rtas-entry", NULL); if (entryp == NULL) /* Ugh */ rtas.entry = rtas.base; else _ From anton at samba.org Fri Nov 19 18:36:15 2004 From: anton at samba.org (Anton Blanchard) Date: Fri, 19 Nov 2004 18:36:15 +1100 Subject: [PATCH] ppc64: linux,tce* changes In-Reply-To: <20041119073434.GD12007@krispykreme.ozlabs.ibm.com> References: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com> <20041119073211.GC12007@krispykreme.ozlabs.ibm.com> <20041119073434.GD12007@krispykreme.ozlabs.ibm.com> Message-ID: <20041119073615.GE12007@krispykreme.ozlabs.ibm.com> Remove linux,has-tce-table since we can just look for linux,tce-base and linux,tce-size. Make linux,tce-base store real addresses instead of virtual ones, the wrapper may not know the translation the kernel will use. 
Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/pSeries_iommu.c~iommu_real_addr arch/ppc64/kernel/pSeries_iommu.c --- gr_base/arch/ppc64/kernel/pSeries_iommu.c~iommu_real_addr 2004-11-12 04:51:30.405991684 -0600 +++ gr_base-anton/arch/ppc64/kernel/pSeries_iommu.c 2004-11-12 04:51:30.422988913 -0600 @@ -317,19 +317,16 @@ static void iommu_table_setparms(struct node = (struct device_node *)phb->arch_data; - if (get_property(node, "linux,has-tce-table", NULL) == NULL) { - printk(KERN_ERR "PCI_DMA: iommu_table_setparms: %s has no tce table !\n", - dn->full_name); - return; - } basep = (unsigned long *)get_property(node, "linux,tce-base", NULL); sizep = (unsigned int *)get_property(node, "linux,tce-size", NULL); if (basep == NULL || sizep == NULL) { - printk(KERN_ERR "PCI_DMA: iommu_table_setparms: %s has missing tce" - " entries !\n", dn->full_name); + printk(KERN_ERR "PCI_DMA: iommu_table_setparms: %s has " + "missing tce entries !\n", dn->full_name); return; } - memset((void *)(*basep), 0, *sizep); + + tbl->it_base = (unsigned long)__va(*basep); + memset((void *)tbl->it_base, 0, *sizep); tbl->it_busno = phb->bus->number; @@ -353,7 +350,6 @@ static void iommu_table_setparms(struct if (phb->dma_window_base_cur > (1 << 19)) panic("PCI_DMA: Unexpected number of IOAs under this PHB.\n"); - tbl->it_base = *basep; tbl->it_index = 0; tbl->it_entrysize = sizeof(union tce_entry); tbl->it_blocksize = 16; diff -puN arch/ppc64/kernel/prom_init.c~iommu_real_addr arch/ppc64/kernel/prom_init.c --- gr_base/arch/ppc64/kernel/prom_init.c~iommu_real_addr 2004-11-12 04:51:30.411990706 -0600 +++ gr_base-anton/arch/ppc64/kernel/prom_init.c 2004-11-12 04:51:30.426988260 -0600 @@ -760,7 +760,7 @@ static void __init prom_initialize_tce_t unsigned long offset = reloc_offset(); char compatible[64], type[64], model[64]; char *path = RELOC(prom_scratch); - u64 base, vbase, align; + u64 base, align; u32 minalign, minsize; u64 tce_entry, *tce_entryp; u64 local_alloc_top, 
local_alloc_bottom; @@ -832,12 +832,9 @@ static void __init prom_initialize_tce_t if (base < local_alloc_bottom) local_alloc_bottom = base; - vbase = (unsigned long)abs_to_virt(base); - /* Save away the TCE table attributes for later use. */ - prom_setprop(node, "linux,tce-base", &vbase, sizeof(vbase)); + prom_setprop(node, "linux,tce-base", &base, sizeof(base)); prom_setprop(node, "linux,tce-size", &minsize, sizeof(minsize)); - prom_setprop(node, "linux,has-tce-table", NULL, 0); /* It seems OF doesn't null-terminate the path :-( */ memset(path, 0, sizeof(path)); From anton at samba.org Fri Nov 19 18:34:34 2004 From: anton at samba.org (Anton Blanchard) Date: Fri, 19 Nov 2004 18:34:34 +1100 Subject: [PATCH] ppc64: Reserve kernel memory in kernel instead of wrapper In-Reply-To: <20041119073211.GC12007@krispykreme.ozlabs.ibm.com> References: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com> <20041119073211.GC12007@krispykreme.ozlabs.ibm.com> Message-ID: <20041119073434.GD12007@krispykreme.ozlabs.ibm.com> Reserve the kernel memory (0 - klimit) in the kernel instead of the wrapper. Remove an old comment that incorrectly referred to klimit. Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/prom.c~ppc64:reserve_klimit_in_kernel.patch?id=433 arch/ppc64/kernel/prom.c --- gr_base2/arch/ppc64/kernel/prom.c~ppc64:reserve_klimit_in_kernel.patch?id=433 2004-11-15 16:19:13.046782549 -0600 +++ gr_base2-anton/arch/ppc64/kernel/prom.c 2004-11-15 16:19:19.657739184 -0600 @@ -1023,6 +1023,7 @@ void __init early_init_devtree(void *par scan_flat_dt(early_init_dt_scan_memory, NULL); lmb_analyze(); systemcfg->physicalMemorySize = lmb_phys_mem_size(); + lmb_reserve(0, __pa(klimit)); DBG("Phys. 
mem: %lx\n", systemcfg->physicalMemorySize); diff -puN arch/ppc64/kernel/prom_init.c~ppc64:reserve_klimit_in_kernel.patch?id=433 arch/ppc64/kernel/prom_init.c --- gr_base2/arch/ppc64/kernel/prom_init.c~ppc64:reserve_klimit_in_kernel.patch?id=433 2004-11-15 16:19:13.052781655 -0600 +++ gr_base2-anton/arch/ppc64/kernel/prom_init.c 2004-11-15 16:19:13.074778378 -0600 @@ -1606,11 +1606,6 @@ unsigned long __init prom_init(unsigned prom_debug("offset=0x%x\n", offset); /* - * Reserve kernel in reserve map - */ - reserve_mem(0, __pa(RELOC(klimit))); - - /* * Check for an initrd */ prom_check_initrd(r3, r4); diff -puN include/asm-ppc64/rtas.h~ppc64:reserve_klimit_in_kernel.patch?id=433 include/asm-ppc64/rtas.h --- gr_base2/include/asm-ppc64/rtas.h~ppc64:reserve_klimit_in_kernel.patch?id=433 2004-11-15 16:19:13.058780761 -0600 +++ gr_base2-anton/include/asm-ppc64/rtas.h 2004-11-15 16:19:13.077777931 -0600 @@ -149,7 +149,7 @@ struct rtas_error_log { unsigned long target:4; /* Target of failed operation */ unsigned long type:8; /* General event or error*/ unsigned long extended_log_length:32; /* length in bytes */ - unsigned char buffer[1]; /* allocated by klimit bump */ + unsigned char buffer[1]; }; struct flash_block { _ From anton at samba.org Fri Nov 19 18:39:47 2004 From: anton at samba.org (Anton Blanchard) Date: Fri, 19 Nov 2004 18:39:47 +1100 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <20041119073615.GE12007@krispykreme.ozlabs.ibm.com> References: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com> <20041119073211.GC12007@krispykreme.ozlabs.ibm.com> <20041119073434.GD12007@krispykreme.ozlabs.ibm.com> <20041119073615.GE12007@krispykreme.ozlabs.ibm.com> Message-ID: <20041119073947.GF12007@krispykreme.ozlabs.ibm.com> From: Olof Johansson Below patch changes the early CPU spinup code to be based on physical CPU ID instead of logical. 
This will make it possible to kexec off of a different cpu than 0, for example after it's been hot-unplugged. The booted cpu will still be mapped as logical cpu 0, since there's various stuff in the early boot that assumes logical boot cpuid is 0. Also, it expands the kexec boot param structure to allow the booted physical cpuid to be passed in. This includes bumping the version number to 2 for backwards compat. Signed-off-by: Olof Johansson Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/asm-offsets.c~boot-cpuid arch/ppc64/kernel/asm-offsets.c --- linux-2.5/arch/ppc64/kernel/asm-offsets.c~boot-cpuid 2004-11-16 12:41:26.546908234 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/asm-offsets.c 2004-11-16 13:24:49.372405523 -0600 @@ -103,6 +103,7 @@ int main(void) DEFINE(PACA_EXDSI, offsetof(struct paca_struct, exdsi)); DEFINE(PACAEMERGSP, offsetof(struct paca_struct, emergency_sp)); DEFINE(PACALPPACA, offsetof(struct paca_struct, lppaca)); + DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id)); DEFINE(LPPACASRR0, offsetof(struct ItLpPaca, xSavedSrr0)); DEFINE(LPPACASRR1, offsetof(struct ItLpPaca, xSavedSrr1)); DEFINE(LPPACAANYINT, offsetof(struct ItLpPaca, xIntDword.xAnyInt)); diff -puN arch/ppc64/kernel/head.S~boot-cpuid arch/ppc64/kernel/head.S --- linux-2.5/arch/ppc64/kernel/head.S~boot-cpuid 2004-11-16 12:41:26.548908679 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/head.S 2004-11-16 13:20:02.404741718 -0600 @@ -26,6 +26,7 @@ #define SECONDARY_PROCESSORS #include +#include #include #include #include @@ -1192,7 +1193,7 @@ unrecov_slb: /* * On pSeries, secondary processors spin in the following code. - * At entry, r3 = this processor's number (in Linux terms, not hardware). + * At entry, r3 = this processor's number (physical cpu id) */ _GLOBAL(pseries_secondary_smp_init) mr r24,r3 @@ -1204,13 +1205,27 @@ _GLOBAL(pseries_secondary_smp_init) /* Copy some CPU settings from CPU 0 */ bl .__restore_cpu_setup - /* Set up a paca value for this processor. 
*/ - LOADADDR(r5, paca) /* Get base vaddr of paca array */ - mulli r13,r24,PACA_SIZE /* Calculate vaddr of right paca */ - add r13,r13,r5 /* for this processor. */ - mtspr SPRG3,r13 /* Save vaddr of paca in SPRG3 */ -1: - HMT_LOW + /* Set up a paca value for this processor. Since we have the + * physical cpu id in r3, we need to search the pacas to find + * which logical id maps to our physical one. + */ + LOADADDR(r13, paca) /* Get base vaddr of paca array */ + li r5,0 /* logical cpu id */ +1: lhz r6,PACAHWCPUID(r13) /* Load HW procid from paca */ + cmpw r6,r24 /* Compare to our id */ + beq 2f + addi r13,r13,PACA_SIZE /* Loop to next PACA on miss */ + addi r5,r5,1 + cmpwi r5,NR_CPUS + blt 1b + +99: HMT_LOW /* Couldn't find our CPU id */ + b 99b + +2: mtspr SPRG3,r13 /* Save vaddr of paca in SPRG3 */ + /* From now on, r24 is expected to be logica cpuid */ + mr r24,r5 +3: HMT_LOW lbz r23,PACAPROCSTART(r13) /* Test if this processor should */ /* start. */ sync @@ -1225,7 +1240,7 @@ _GLOBAL(pseries_secondary_smp_init) bne .__secondary_start #endif #endif - b 1b /* Loop until told to go */ + b 3b /* Loop until told to go */ #ifdef CONFIG_PPC_ISERIES _STATIC(__start_initialization_iSeries) /* Clear out the BSS */ @@ -1921,19 +1936,6 @@ _STATIC(start_here_multiplatform) bl .__save_cpu_setup sync -#ifdef CONFIG_SMP - /* All secondary cpus are now spinning on a common - * spinloop, release them all now so they can start - * to spin on their individual paca spinloops. - * For non SMP kernels, the secondary cpus never - * get out of the common spinloop. 
- */ - li r3,1 - LOADADDR(r5,__secondary_hold_spinloop) - tophys(r4,r5) - std r3,0(r4) -#endif - /* Setup a valid physical PACA pointer in SPRG3 for early_setup * note that boot_cpuid can always be 0 nowadays since there is * nowhere it can be initialized differently before we reach this @@ -2131,6 +2133,22 @@ _GLOBAL(hmt_start_secondary) blr #endif +#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) +_GLOBAL(smp_release_cpus) + /* All secondary cpus are spinning on a common + * spinloop, release them all now so they can start + * to spin on their individual paca spinloops. + * For non SMP kernels, the secondary cpus never + * get out of the common spinloop. + */ + li r3,1 + LOADADDR(r5,__secondary_hold_spinloop) + std r3,0(r5) + sync + blr +#endif /* CONFIG_SMP && !CONFIG_PPC_ISERIES */ + + /* * We put a few things here that have to be page-aligned. * This stuff goes at the beginning of the data segment, diff -puN arch/ppc64/kernel/pacaData.c~boot-cpuid arch/ppc64/kernel/pacaData.c --- linux-2.5/arch/ppc64/kernel/pacaData.c~boot-cpuid 2004-11-16 12:41:26.551909346 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/pacaData.c 2004-11-16 12:41:26.572914016 -0600 @@ -58,6 +58,7 @@ extern unsigned long __toc_start; .stab_real = (asrr), /* Real pointer to segment table */ \ .stab_addr = (asrv), /* Virt pointer to segment table */ \ .cpu_start = (start), /* Processor start */ \ + .hw_cpu_id = 0xffff, \ .lppaca = { \ .xDesc = 0xd397d781, /* "LpPa" */ \ .xSize = sizeof(struct ItLpPaca), \ diff -puN arch/ppc64/kernel/prom.c~boot-cpuid arch/ppc64/kernel/prom.c --- linux-2.5/arch/ppc64/kernel/prom.c~boot-cpuid 2004-11-16 12:41:26.554910013 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/prom.c 2004-11-16 12:41:26.573914239 -0600 @@ -853,10 +853,19 @@ static int __init early_init_dt_scan_cpu } } - /* Check if it's the boot-cpu, set it's hw index in paca now */ - if (get_flat_dt_prop(node, "linux,boot-cpu", NULL) != NULL) { - u32 *prop = get_flat_dt_prop(node, "reg", NULL); - 
paca[0].hw_cpu_id = prop == NULL ? 0 : *prop; + if (initial_boot_params && initial_boot_params->version >= 2) { + /* version 2 of the kexec param format adds the phys cpuid + * of booted proc. + */ + boot_cpuid_phys = initial_boot_params->boot_cpuid_phys; + boot_cpuid = 0; + } else { + /* Check if it's the boot-cpu, set it's hw index in paca now */ + if (get_flat_dt_prop(node, "linux,boot-cpu", NULL) != NULL) { + u32 *prop = get_flat_dt_prop(node, "reg", NULL); + set_hard_smp_processor_id(0, prop == NULL ? 0 : *prop); + boot_cpuid_phys = get_hard_smp_processor_id(0); + } } return 0; diff -puN arch/ppc64/kernel/prom_init.c~boot-cpuid arch/ppc64/kernel/prom_init.c --- linux-2.5/arch/ppc64/kernel/prom_init.c~boot-cpuid 2004-11-16 12:41:26.556910458 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/prom_init.c 2004-11-16 12:41:26.575914683 -0600 @@ -992,13 +992,13 @@ static void __init prom_hold_cpus(void) /* Primary Thread of non-boot cpu */ prom_printf("%x : starting cpu hw idx %x... ", cpuid, reg); call_prom("start-cpu", 3, 0, node, - secondary_hold, cpuid); + secondary_hold, reg); for ( i = 0 ; (i < 100000000) && (*acknowledge == ((unsigned long)-1)); i++ ) mb(); - if (*acknowledge == cpuid) { + if (*acknowledge == reg) { prom_printf("done\n"); /* We have to get every CPU out of OF, * even if we never start it. 
*/ diff -puN arch/ppc64/kernel/setup.c~boot-cpuid arch/ppc64/kernel/setup.c --- linux-2.5/arch/ppc64/kernel/setup.c~boot-cpuid 2004-11-16 12:41:26.559911125 -0600 +++ linux-2.5-olof/arch/ppc64/kernel/setup.c 2004-11-16 13:22:53.060669846 -0600 @@ -99,6 +99,8 @@ extern void htab_initialize(void); extern void early_init_devtree(void *flat_dt); extern void unflatten_device_tree(void); +extern void smp_release_cpus(void); + unsigned long decr_overclock = 1; unsigned long decr_overclock_proc0 = 1; unsigned long decr_overclock_set = 0; @@ -106,6 +108,7 @@ unsigned long decr_overclock_proc0_set = int have_of = 1; int boot_cpuid = 0; +int boot_cpuid_phys = 0; dev_t boot_dev; /* @@ -242,6 +245,7 @@ static void __init setup_cpu_maps(void) { struct device_node *dn = NULL; int cpu = 0; + int swap_cpuid = 0; check_smt_enabled(); @@ -266,11 +270,23 @@ static void __init setup_cpu_maps(void) cpu_set(cpu, cpu_present_map); set_hard_smp_processor_id(cpu, intserv[j]); } + if (intserv[j] == boot_cpuid_phys) + swap_cpuid = cpu; cpu_set(cpu, cpu_possible_map); cpu++; } } + /* Swap CPU id 0 with boot_cpuid_phys, so we can always assume that + * boot cpu is logical 0. + */ + if (boot_cpuid_phys != get_hard_smp_processor_id(0)) { + u32 tmp; + tmp = get_hard_smp_processor_id(0); + set_hard_smp_processor_id(0, boot_cpuid_phys); + set_hard_smp_processor_id(swap_cpuid, tmp); + } + /* * On pSeries LPAR, we need to know how many cpus * could possibly be added to this partition. @@ -630,6 +646,11 @@ void __init setup_system(void) * iSeries has already initialized the cpu maps at this point. 
*/ setup_cpu_maps(); + + /* Release secondary cpus out of their spinloops at 0x60 now that + * we can map physical -> logical CPU ids + */ + smp_release_cpus(); #endif /* defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) */ printk("Starting Linux PPC64 %s\n", UTS_RELEASE); diff -puN include/asm-ppc64/prom.h~boot-cpuid include/asm-ppc64/prom.h --- linux-2.5/include/asm-ppc64/prom.h~boot-cpuid 2004-11-16 12:41:26.561911570 -0600 +++ linux-2.5-olof/include/asm-ppc64/prom.h 2004-11-16 12:41:26.577915128 -0600 @@ -56,6 +56,8 @@ struct boot_param_header u32 off_mem_rsvmap; /* offset to memory reserve map */ u32 version; /* format version */ u32 last_comp_version; /* last compatible version */ + /* version 2 fields below */ + u32 boot_cpuid_phys; /* Which physical CPU id we're booting on */ }; diff -puN include/asm-ppc64/smp.h~boot-cpuid include/asm-ppc64/smp.h --- linux-2.5/include/asm-ppc64/smp.h~boot-cpuid 2004-11-16 12:41:26.564912237 -0600 +++ linux-2.5-olof/include/asm-ppc64/smp.h 2004-11-16 12:41:26.577915128 -0600 @@ -27,6 +27,7 @@ #include extern int boot_cpuid; +extern int boot_cpuid_phys; extern void cpu_die(void) __attribute__((noreturn)); _ From akpm at osdl.org Fri Nov 19 18:51:43 2004 From: akpm at osdl.org (Andrew Morton) Date: Thu, 18 Nov 2004 23:51:43 -0800 Subject: kexec interface changes In-Reply-To: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com> References: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com> Message-ID: <20041118235143.66ebd9b0.akpm@osdl.org> Anton Blanchard wrote: > > Here is a second pass at the kexec interface changes. Any comments? Im > hoping to get these into 2.6.10 - they make incompatible changes to > the interface which we should resolve before kexec users appear. But kexec won't be in 2.6.10. Confused. 
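
The logical/physical cpu id handling in the spinup patch above can be sketched in isolation. This is a user-space illustration only, not kernel code: hw_ids[] is a hypothetical stand-in for the per-cpu paca hw_cpu_id fields, and both function names are made up for the example. It shows the two steps the patch description mentions, searching for the logical id that matches a physical id, and swapping ids so that the booted cpu is always logical cpu 0. Unlike the patch, the sketch leaves the table untouched when the boot cpu's physical id is not found; that is a choice made for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative sketch only. hw_ids[i] plays the role of
 * paca[i].hw_cpu_id: the physical id assigned to logical cpu i.
 */

/*
 * Return the logical cpu whose physical id is 'phys', or -1 if no
 * entry matches. This mirrors the linear search a secondary cpu does
 * over the pacas when it only knows its own physical id.
 */
static int logical_from_phys(const unsigned int *hw_ids, size_t ncpus,
			     unsigned int phys)
{
	size_t i;

	for (i = 0; i < ncpus; i++)
		if (hw_ids[i] == phys)
			return (int)i;
	return -1;
}

/*
 * Exchange physical ids so that the boot cpu ends up as logical cpu 0.
 * Returns 0 on success, or -1 (table unchanged) if 'boot_phys' is not
 * present in the table.
 */
static int make_boot_cpu_logical_zero(unsigned int *hw_ids, size_t ncpus,
				      unsigned int boot_phys)
{
	int boot_logical = logical_from_phys(hw_ids, ncpus, boot_phys);
	unsigned int tmp;

	if (boot_logical < 0)
		return -1;
	assert((size_t)boot_logical < ncpus);

	/* Swap the entries for logical 0 and the boot cpu's old slot. */
	tmp = hw_ids[0];
	hw_ids[0] = boot_phys;
	hw_ids[boot_logical] = tmp;
	return 0;
}
```

With a table of physical ids {0, 2, 4, 6} and a boot cpu whose physical id is 4, the swap yields {4, 2, 0, 6}: the boot cpu becomes logical 0 and the cpu that used to be logical 0 takes over logical slot 2.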
From ananth at in.ibm.com Fri Nov 19 20:03:56 2004 From: ananth at in.ibm.com (Ananth N Mavinakayanahalli) Date: Fri, 19 Nov 2004 14:33:56 +0530 Subject: [PATCH] Kprobes: wrapper to define jprobe.entry In-Reply-To: <20041118230506.4d20b3c9.akpm@osdl.org> References: <20041118102641.GB8830@in.ibm.com> <20041118144746.7daa9395.akpm@osdl.org> <20041119065258.GA6863@in.ibm.com> <20041118230506.4d20b3c9.akpm@osdl.org> Message-ID: <20041119090356.GA10082@in.ibm.com> On Thu, Nov 18, 2004 at 11:05:06PM -0800, Andrew Morton wrote: > Ananth N Mavinakayanahalli wrote: > > > > > > > > > > Here is a patch that adds a wrapper for defining jprobe.entry to make > > > > it easy to handle the three dword function descriptors defined by the > > > > PowerPC ELF ABI. > > > > > > > > Current patch against 2.6.10-rc2-mm1 + kprobes patch for ppc64. > > > > > > I don't have the kprobes-for-ppc64 patch here. > > > > > > > Changes for adding this wrapper for x86, ppc64 (tested) and x86_64 > > > > (untested) below. The earlier method of defining jprobe.entry will > > > > continue to work. > > > > > > So what should I do with this? I'm inclined to drop it until the x86_64 > > > part has been tested and Dave has had a go at the sparc64 version. > > > > I have now tested the patch succesfully on x86_64 and updated it for > > sparc64 too (Dave says the change looks good). > > What is the review and testing status of the kprobes-for-ppc64 patch which > you sent? > The patch was earlier posted on the PPC64 mailing list for comments and Paul had reviewed it. I had to update the patch to take care of the base kprobe changes that were made to accomodate the x86_64 port. The patch is tested on POWER3 (uni) and POWER4 (lpar). 
Thanks, Ananth From jschopp at austin.ibm.com Sat Nov 20 03:42:51 2004 From: jschopp at austin.ibm.com (Joel Schopp) Date: Fri, 19 Nov 2004 10:42:51 -0600 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <20041116210201.GA7368@localhost.localdomain> References: <20041116210201.GA7368@localhost.localdomain> Message-ID: <419E228B.8080904@austin.ibm.com> > Hi, > > Below patch changes the early CPU spinup code to be based on physical > CPU ID instead of logical. This will make it possible to kexec off of > a different cpu than 0, for example after it's been hot-unplugged. > > The booted cpu will still be mapped as logical cpu 0, since there's > various stuff in the early boot that assumes logical boot cpuid is 0. > > Also, it expands the kexec boot param structure to allow the booted > physical cpuid to be passed in. This includes bumping the version number > to 2 for backwards compat. I don't see anything here that pops out as conflicting with cpu hotplug, but I am curious if you have tested it with cpu hotplug. If you haven't I'd be happy to test it for you today. If you could compile a kernel with CONFIG_HOTPLUG_CPU and CONFIG_SCSI_IPR both set to y I'll boot it up and kick it around. From nathanl at austin.ibm.com Sat Nov 20 03:52:11 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Fri, 19 Nov 2004 10:52:11 -0600 Subject: [PATCH] ppc64: Make early processor spinup based on physical ids In-Reply-To: <419E228B.8080904@austin.ibm.com> References: <20041116210201.GA7368@localhost.localdomain> <419E228B.8080904@austin.ibm.com> Message-ID: <1100883131.3237.1.camel@biclops> On Fri, 2004-11-19 at 10:42, Joel Schopp wrote: > > Hi, > > > > Below patch changes the early CPU spinup code to be based on physical > > CPU ID instead of logical. This will make it possible to kexec off of > > a different cpu than 0, for example after it's been hot-unplugged. 
> > > > The booted cpu will still be mapped as logical cpu 0, since there's > > various stuff in the early boot that assumes logical boot cpuid is 0. > > > > Also, it expands the kexec boot param structure to allow the booted > > physical cpuid to be passed in. This includes bumping the version number > > to 2 for backwards compat. > > I don't see anything here that pops out as conflicting with cpu hotplug, > but I am curious if you have tested it with cpu hotplug. If you haven't > I'd be happy to test it for you today. If you could compile a kernel > with CONFIG_HOTPLUG_CPU and CONFIG_SCSI_IPR both set to y I'll boot it > up and kick it around. Olof and I already checked that cpu hotplug works fine with this patch. Nathan From sleddog at us.ibm.com Sat Nov 20 06:54:10 2004 From: sleddog at us.ibm.com (Dave C Boutcher) Date: Fri, 19 Nov 2004 13:54:10 -0600 Subject: Kernel panic: select_hpte_slot found entry already valid Message-ID: <20041119195410.GA14818@cs.umn.edu> This look familiar to anyone? Legacy iseries error. I recall benh was playing around in there last :-) This is a customer running Red Hat in production. "Len Goldenstein" wrote on 11/15/2004 08:27:17 AM: > I had the LPAR running fine for quite a while at the RHEL QU2 > 2.4.21-15.EL kernel - no problems with backups at all. > After running that for a couple weeks, I went back to the > -20EL kernel and did the backups without the NFS mounts for a week and > no problems. > > I mounted the NFS file systems (which are other the LPARs on the > same 820) and the backup ran fine for a few days until this morning. > The console log showed this: > > Kernel panic: select_hpte_slot found entry already valid > > Rebooting in 180 seconds.. > > I did a quick google search and found this: > http://listcrawler.com/message2.jsp?id=41980 > > It may be a race condition in the ppc64 kernel. I'm not a > kernel guy by any means though so maybe you guys will have some > insight into the problem. 
> > I put a problem ticket in with RedHat but their support is less > than stellar. -- Dave Boutcher From linas at austin.ibm.com Sat Nov 20 07:29:01 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Fri, 19 Nov 2004 14:29:01 -0600 Subject: should cpus_in_xmon be volatile? In-Reply-To: <1100818353.24982.348.camel@localhost> References: <1100818353.24982.348.camel@localhost> Message-ID: <20041119202901.GB23780@austin.ibm.com> On Thu, Nov 18, 2004 at 02:52:33PM -0800, Dave Hansen was heard to remark: > I'm getting a warning (one of many) during a build of 2.6.10-rc2-mm2: > > memhotplug/arch/ppc64/xmon/xmon.c: In function `xmon_core': > memhotplug/arch/ppc64/xmon/xmon.c:401: warning: passing arg 1 of > `__cpus_weight' discards qualifiers from pointer target type > > It's this chunk of code: > > for (timeout = 100000000; timeout != 0; --timeout) > if (cpus_weight(cpus_in_xmon) >= ncpus) > break; > > Is that warning because cpus_in_xmon is volatile and __cpus_weight() > doesn't take a pointer to a volatile type? > > I do notice that none of the test/set_bit() functions have volatile > types, and I have the feeling this is because they take pointers to > being with. Does the fact that since __cpus_weight() takes a pointer > that cpus_in_xmon doesn't really need to be declared volatile? Well, to make >>that particular<< loop work correctly, the volatile is not needed. Why? Because cpus_weight() is extern __bitmap_weight() and since its extern, the compiler must be definition invoke it each time in the loop, since the compiler must assume that the called routine is changing the value of the thing being pointed at. i.e. the call has a side-effect. So even with -O6 and without the volatile, that loop will compile correctly. However, if someone changed the extern __bitmap_weight() to be inline __bitmap_weight(), then the compiler could potentially see that it had no side effects, and decide to optimize away the entire loop. 
If __bitmap_weight() were changed to be inline, then it would also need to be changed to have a volatile argument. What the compiler is complaining about here is that it knows that __bitmap_weight() is not treating the arg as if it were volatile, and thus it might work incorrectly (e.g. if it went to sleep, waiting for a bit to change; then it would sleep forever). Since we know what __bitmap_weight() does, we know it doesn't need to be volatile, so that's OK. Since we know that in other places we probably need to have cpus_in_xmon be volatile, we should leave it volatile. (For example, it's used in other places that I think might break if it weren't declared volatile; not sure, didn't look that carefully.) The proper solution is then to cast away volatile before calling cpus_weight. That, at least, is how I see it. --linas From linas at austin.ibm.com Sat Nov 20 07:41:50 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Fri, 19 Nov 2004 14:41:50 -0600 Subject: [PATCH] PPC64: fix to Re: should cpus_in_xmon be volatile? In-Reply-To: <1100818353.24982.348.camel@localhost> References: <1100818353.24982.348.camel@localhost> Message-ID: <20041119204150.GC23780@austin.ibm.com> Hi, Here's a patch that fixes a compiler warning. It casts away volatile as per previous email.
On Thu, Nov 18, 2004 at 02:52:33PM -0800, Dave Hansen was heard to remark: > I'm getting a warning (one of many) during a build of 2.6.10-rc2-mm2: > > memhotplug/arch/ppc64/xmon/xmon.c: In function `xmon_core': > memhotplug/arch/ppc64/xmon/xmon.c:401: warning: passing arg 1 of > `__cpus_weight' discards qualifiers from pointer target type Signed-off-by: Linas Vepstas -------------- next part -------------- --- arch/ppc64/xmon/xmon.c.orig 2004-11-19 14:34:11.000000000 -0600 +++ arch/ppc64/xmon/xmon.c 2004-11-19 14:34:36.000000000 -0600 @@ -402,7 +407,7 @@ int xmon_core(struct pt_regs *regs, int smp_send_debugger_break(MSG_ALL_BUT_SELF); /* wait for other cpus to come in */ for (timeout = 100000000; timeout != 0; --timeout) - if (cpus_weight(cpus_in_xmon) >= ncpus) + if (cpus_weight(*((cpumask_t *) &cpus_in_xmon)) >= ncpus) break; } remove_bpts(); From paulus at samba.org Sat Nov 20 10:00:15 2004 From: paulus at samba.org (Paul Mackerras) Date: Sat, 20 Nov 2004 10:00:15 +1100 Subject: [PATCH] PPC64: fix to Re: should cpus_in_xmon be volatile? In-Reply-To: <20041119204150.GC23780@austin.ibm.com> References: <1100818353.24982.348.camel@localhost> <20041119204150.GC23780@austin.ibm.com> Message-ID: <16798.31487.919850.745372@cargo.ozlabs.ibm.com> Linas Vepstas writes: > Here's a patch that fixes a compiler warning. It casts away volatile > as per previous email. I think this is a better approach: remove the volatile from cpus_in_xmon, and put a barrier() in the loop that waits for the other cpus to come in to xmon. 
Signed-off-by: Paul Mackerras diff -urN linux-2.5/arch/ppc64/xmon/xmon.c test/arch/ppc64/xmon/xmon.c --- linux-2.5/arch/ppc64/xmon/xmon.c 2004-10-26 16:06:41.000000000 +1000 +++ test/arch/ppc64/xmon/xmon.c 2004-11-20 09:56:00.291426520 +1100 @@ -39,7 +39,7 @@ #define skipbl xmon_skipbl #ifdef CONFIG_SMP -volatile cpumask_t cpus_in_xmon = CPU_MASK_NONE; +cpumask_t cpus_in_xmon = CPU_MASK_NONE; static unsigned long xmon_taken = 1; static int xmon_owner; static int xmon_gate; @@ -401,9 +401,11 @@ if (ncpus > 1) { smp_send_debugger_break(MSG_ALL_BUT_SELF); /* wait for other cpus to come in */ - for (timeout = 100000000; timeout != 0; --timeout) + for (timeout = 100000000; timeout != 0; --timeout) { if (cpus_weight(cpus_in_xmon) >= ncpus) break; + barrier(); + } } remove_bpts(); disable_surveillance(); From linas at austin.ibm.com Sat Nov 20 10:44:02 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Fri, 19 Nov 2004 17:44:02 -0600 Subject: [PATCH] PPC64: fix to Re: should cpus_in_xmon be volatile? In-Reply-To: <16798.31487.919850.745372@cargo.ozlabs.ibm.com> References: <1100818353.24982.348.camel@localhost> <20041119204150.GC23780@austin.ibm.com> <16798.31487.919850.745372@cargo.ozlabs.ibm.com> Message-ID: <20041119234402.GD23780@austin.ibm.com> On Sat, Nov 20, 2004 at 10:00:15AM +1100, Paul Mackerras was heard to remark: > Linas Vepstas writes: > > > Here's a patch that fixes a compiler warning. It casts away volatile > > as per previous email. > > I think this is a better approach: remove the volatile from > cpus_in_xmon, OK, the reason I didn't remove the volatile is that it wasn't clear whether any of the other spots were written in a way that required volatile. I was too lazy to do that review; I assume it's safe, then?
--linas From anton at samba.org Sun Nov 21 02:27:32 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 21 Nov 2004 02:27:32 +1100 Subject: kexec interface changes In-Reply-To: <20041118235143.66ebd9b0.akpm@osdl.org> References: <20041119072554.GB12007@krispykreme.ozlabs.ibm.com> <20041118235143.66ebd9b0.akpm@osdl.org> Message-ID: <20041120152732.GA11932@krispykreme.ozlabs.ibm.com> > But kexec won't be in 2.6.10. Confused. Well there is already enough in current BK to kexec into a ppc64 kernel. On top of that, the kexec interface provides an entry point for simple boot loaders that dont implement the full openfirmware interfaces. Anton From anton at samba.org Sun Nov 21 02:38:49 2004 From: anton at samba.org (Anton Blanchard) Date: Sun, 21 Nov 2004 02:38:49 +1100 Subject: [RFC] Consolidate lots of hugepage code In-Reply-To: <20041107224948.GO2890@holomorphy.com> References: <20041029033708.GF12247@zax> <20041029034817.GY12934@holomorphy.com> <20041107172030.GA16976@krispykreme.ozlabs.ibm.com> <20041107192024.GM2890@holomorphy.com> <20041107193007.GC16976@krispykreme.ozlabs.ibm.com> <20041107210943.GN2890@holomorphy.com> <20041107212212.GD16976@krispykreme.ozlabs.ibm.com> <20041107224948.GO2890@holomorphy.com> Message-ID: <20041120153849.GB11932@krispykreme.ozlabs.ibm.com> Hi wli, Any progress on this? If not Id like to suggest we get Davids patch into -mm. Anton > Sorry, I don't get complete bugreports myself. If you care to try to > actually fix something (it's doubtful you yourself are the culprit) I'm > still trying to reproduce it myself with long-running database tests. > It's reliably reproducible on the reporters' machines. > > The particular bug is only one piece of evidence. Just asking basic > questions about what was done for architecture code reveals that > all this "development" is not paying proper attention to architecture > code. I merely insist that development toward the end of stabilization > occur prior to that for large feature work. 
> > And frankly, I'm rather unimpressed with the gravity of the proposed > featurework, particularly in comparison to the stability requirements > of users on typical production systems. > > Nor am I impressed with the quality. The patch presentations have been > messy, the audits (as mentioned above) incomplete, the benefits not > clearly demonstrated, and the code itself not so pretty. Just > respinning the patches so they're properly incremental and the code > somewhat cleaner (e.g. some recent one nested tabs 5 deep or so) > would already remedy a large number of the issues with the featurework. > Once arranged that way the audits' incompleteness can be dealt with by > those with the fortitude to thoroughly audit and/or prior architecture > knowledge to correct the patches for arches they don't deal with properly. > > > -- wli From miltonm at bga.com Sun Nov 21 09:11:04 2004 From: miltonm at bga.com (Milton Miller) Date: Sat, 20 Nov 2004 16:11:04 -0600 (CST) Subject: [PATCH 1/2] ppc64: Block config accesses during BIST #3 Message-ID: <200411202211.iAKMB4lI025034@sullivan.realtime.net> Hi Brian. Sorry it took so long to look at this, but I was totally burried 2 weeks ago and am just catching up with the fun stuff. A few comments, mostly 1/2. 0) line numbers are off after Anton's clean up pci controller allocation :) 1) I don't see any reason for adding exports pcibios_*config* routines 2) pci_ops should go in pci-bridge, its private to the arch. Maybe call this block_ops ? 3) in pci_bus_to_host don't worry about searching for the hose, Ben and I agree the pci code will always have bus->sysdata when calling config routines. (pci_bus sysdata is copied form parent, and to pci_dev) 4) define a HAVE_PCI_CONFIG_BLOCK or something for the drivers to key on 5) Other places we return 1's we are careful to return the right number of them (ie FF, FFFF, or FFFFFFFF depending on size). (ok _PCI_NOP doesn't even set the return). 
And last but not least, how it is broken: 6) There is a hose per phb not per bus. You are not including the bus number in your block table. With an EADS bridge, you are blocking the other slots on the same phb. The real problem with number 6 is that (as mentioned in 5) the bus->sysdata is not always correct. We currently use bus->sysdata in 2 or 3 places in arch code, and then later explicitly copy the sysdata down from the phb (which I think is redundant with 2.6 pci code). Actually we never use bus->sysdata except for host bridges (bus->self == NULL) but it is still copied to the pci devs on hotplug etc so it has to be compatible with dev->sysdata aka an OF device node. The cleanest is to add this to struct pci_bus but we could also use a word in struct device_node (call to_OF_node on bus->self, sysdata if NULL) if needed. We could also search a list of (bus, mask) pairs or something, but we don't use many of the 256 possible buses on each PHB. milton From miltonm at realtime.net Sun Nov 21 10:11:53 2004 From: miltonm at realtime.net (Milton D. Miller II) Date: Sat, 20 Nov 2004 17:11:53 -0600 (CST) Subject: [PATCH] PPC64: EEH Recovery Message-ID: <200411202311.iAKNBrw0025283@sullivan.realtime.net> > Hi Paul, > > The patch below implements hotplug style EEH error recovery. > Its split into two pieces: a part that needs to be applied to the > PPC64 arch tree, and a part that needs to be applied to the > RPA PHP hotplug tree. The PPC64 part needs to go in first. > > Assuming this doesn't generate a round of discussion, please > forward upstream to akpm/torvalds. Here's some discussion :) Just reading the diff, not the patched code. Hopefully I undid all the html correctly. 1) why are you EXPORT_SYMBOL rtas_write_config? eeh.c isn't a module 2) I object to grabbing pci devices so they don't disappear and reappear. I worry about duplicate devices across register/unregister and sysfs kobject lifetimes getting confused and duplicate names.
I'd prefer we just kept the pci config stuff we are going to restore off the of device node. > @@ -635,7 +654,7 @@ unsigned long eeh_check_failure(const vo > /* Finding the phys addr + pci device; this is pretty quick. */ > addr = eeh_token_to_phys((unsigned long __force) token); > dev = pci_get_device_by_addr(addr); > - if (!dev) > + if (!dev) > return val; > > dn = pci_device_to_OF_node(dev); adding trailing white space. tsk tsk PS: there is more white space, this one caught my eye. > +/* ------------------------------------------------------- */ > +/** Save and restore of PCI BARs > + * > + * Although firmware will set up BARs during boot, it doesn't > + * set up device BAR's after a device reset, although it will, > + * if requested, set up bridge configuration. Thus, we need to > + * configure the PCI devices ourselves. Config-space setup is > + * stored in the PCI structures which are normally deleted during > + * device removal. Thus, the "save" routine references the > + * structures so that they aren't deleted. > + */ > + > + > +struct eeh_cfg_tree > +{ > + struct eeh_cfg_tree *sibling; > + struct eeh_cfg_tree *child; > + struct pci_dev *dev; > + struct device_node *dn; > +}; Do we need a tree for this? > + > +static inline struct pci_dev * eeh_get_pci_dev(struct device_node *dn) > +{ > + struct pci_dev *dev = NULL; > + char bus_id[BUS_ID_SIZE]; > + > + sprintf(bus_id, "%04x:%02x:%02x.%d",dn->phb->global_number, > + dn->busno, PCI_SLOT(dn->devfn), PCI_FUNC(dn->devfn)); > + > + while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { > + if (!strcmp(pci_name(dev), bus_id)) > + return dev; > + } > + return NULL; > +} IICK 1) matching on a stirng when we have the device_node? really? 
please match on pci_device_to_OF_node(dev) 2) for_each_pcidev() please > + > +/** > + * eeh_save_bars - save the PCI config space info > + */ > +struct eeh_cfg_tree * eeh_save_bars(struct device_node *dn) > +{ > + struct eeh_cfg_tree *cnode; > + struct pci_dev *dev; > + > + dev = eeh_get_pci_dev (dn); > + if (!dev) > + return NULL; > + > + cnode = kmalloc(sizeof(struct eeh_cfg_tree), GFP_KERNEL); > + if (!cnode) > + return NULL; > + > + cnode->dev = dev; > + > + of_node_get(dn); > + cnode->dn = dn; > + > + cnode->sibling = NULL; > + cnode->child = NULL; > + > + if (dn->child) { > + cnode->child = eeh_save_bars (dn->child); > + } > + if (dn->sibling) { > + cnode->sibling = eeh_save_bars (dn->sibling); > + } > + > + return cnode; > +} > +EXPORT_SYMBOL(eeh_save_bars); > + > +/** > + * __restore_bars - Restore the Base Address Registers > + * Loads the PCI configuration space base address registers > + * and the expansion ROM base address from the array > + * passed as the second argument. > + */ > +static inline void __restore_bars (struct device_node *dn, u32 *cfg_hdr) > +{ > + int i; > + for (i=4; i<10; i++) { > + rtas_write_config(dn, i*4, 4, cfg_hdr[i]); > + } > + rtas_write_config(dn, 12*4, 4, cfg_hdr[12]); > +} > + > +/** > + * eeh_restore_bars - restore the PCI config space info > + */ > +void eeh_restore_bars(struct eeh_cfg_tree *tree) > +{ > + if (tree->dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) { > + __restore_bars (tree->dn, tree->dev->saved_config_space); > + } > + > + if (tree->child) { > + eeh_restore_bars (tree->child); > + } > + if (tree->sibling) { > + eeh_restore_bars (tree->sibling); > + } > + > + of_node_put (tree->dn); > + pci_dev_put (tree->dev); > + kfree (tree); > +} > +EXPORT_SYMBOL(eeh_restore_bars); > + How about a list of (dn *, pci config words to write)? or an array of dn > > -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); not analyzed ... 
> +/* ------------------------------------------------------- */ > +/** > + * handle_eeh_events -- reset a PCI device after hard lockup. > + * > + * pSeries systems will isolate a PCI slot if the PCI-Host > + * bridge detects address or data parity errors, DMA's > + * occuring to wild addresses (which usually happen due to > + * bugs in device drivers or in PCI adapter firmware). > + * Slot isolations also occur if #SERR, #PERR or other misc > + * PCI-related errors are detected. > + * > + * Recovery process consists of unplugging the device driver > + * (which generated hotplug events to userspace), then issuing > + * a PCI #RST to the device, then reconfiguring the PCI config > + * space for all bridges & devices under this slot, and then > + * finally restarting the device drivers (which cause a second > + * set of hotplug events to go out to userspace). > + */ > +int handle_eeh_events (struct notifier_block *self, > + unsigned long reason, void *ev) > +{ > + struct eeh_event *event = ev; > + struct slot *frozen_slot; > + struct eeh_cfg_tree * saved_bars; > + > + frozen_slot = rpaphp_find_slot(event->dev); > + if (!frozen_slot) > + { > + printk (KERN_ERR > + "EEH: Cannot find PCI slot for EEH error! dev=%p dn=%p\n", > + event->dev, event->dn); > + return 1; > + } > + > + /* Keep a copy of the config space registers */ > + saved_bars = eeh_save_bars(frozen_slot->dn); > + of_node_get(event->dn); > + pci_dev_get(event->dev); > + > + rpaphp_unconfig_pci_adapter (frozen_slot); > + > + event->dn->eeh_freeze_count ++; > + if (event->dn->eeh_freeze_count > EEH_MAX_ALLOWED_FREEZES) { > + /* > + * About 90% of all real-life EEH failures in the field > + * are due to poorly seated PCI cards. Only 10% or so are > + * due to actual, failed cards > + */ > + printk (KERN_ERR > + "EEH: device %s:%s has failed %d times \n" > + "and has been permanently disabled. 
Please try reseating\n" > + "this device or replacing it.\n", > + pci_name (event->dev), > + pci_pretty_name (event->dev), > + EEH_MAX_ALLOWED_FREEZES); > + goto rdone; > + } > + > + /* Reset the pci controller. (Asserts RST#; resets config space). > + * Reconfigure bridges and devices */ > + rtas_set_slot_reset (event->dn); > + rtas_configure_bridge(event->dn); > + eeh_restore_bars(saved_bars); > + > + /* Give the system 5 seconds to finish running the user-space > + * hotplug scripts, e.g. ifdown for ethernet. Yes, this is a hack, > + * but if we don't do this, weird things happen. > + */ > + ssleep (5); > + > + rpaphp_enable_pci_slot (frozen_slot); > + > + /* The new device node is different than the old one; > + * copy over the freeze count, so that we don't loose track of it. > + */ > + frozen_slot->dn->eeh_freeze_count = event->dn->eeh_freeze_count; > +rdone: > + of_node_put(event->dn); > + pci_dev_put(event->dev); > + return 0; > +} see comments with concerns about lifetimes. > + > +static struct notifier_block eeh_block; > + > +void __init init_eeh_handler (void) > +{ > + eeh_block.notifier_call = handle_eeh_events; > + eeh_register_notifier (&eeh_block); > +} > + > +void __exit exit_eeh_handler (void) > +{ > + eeh_unregister_notifier (&eeh_block); > +} > + From brking at us.ibm.com Sun Nov 21 10:16:49 2004 From: brking at us.ibm.com (Brian King) Date: Sat, 20 Nov 2004 17:16:49 -0600 Subject: [PATCH 1/2] ppc64: Block config accesses during BIST #3 In-Reply-To: <200411202211.iAKMB4lI025034@sullivan.realtime.net> References: <200411202211.iAKMB4lI025034@sullivan.realtime.net> Message-ID: <419FD061.2050908@us.ibm.com> Thanks for taking a look at this. I'm currently looking at handling this in the generic pci layer instead. I recently submitted a patch to linux-kernel to just this. Thanks, -Brian Milton Miller wrote: > Hi Brian. > > Sorry it took so long to look at this, but I was totally burried 2 weeks > ago and am just catching up with the fun stuff. 
> > A few comments, mostly 1/2. > > 0) line numbers are off after Anton's clean up pci controller allocation :) > 1) I don't see any reason for adding exports pcibios_*config* routines > 2) pci_ops should go in pci-bridge, its private to the arch. Maybe call > this block_ops ? > 3) in pci_bus_to_host don't worry about searching for the hose, Ben and > I agree the pci code will always have bus->sysdata when calling > config routines. (pci_bus sysdata is copied form parent, and to pci_dev) > 4) define a HAVE_PCI_CONFIG_BLOCK or something for the drivers to key on > 5) Other places we return 1's we are careful to return the right number > of them (ie FF, FFFF, or FFFFFFFF depending on size). (ok _PCI_NOP > doesn't even set the return). > > And last but not least, how it is broken: > > 6) There is a hose per phb not per bus. You are not including the bus > number in your block table. With a EADS bridge, you are blocking the > other slots on the same phb. > > The real problem with number 6 is that (as mentioned in 5) the bus->sysdata > is not always correct. We currently use bus->sysdata 2 or 3 places in > arch code, and then later explicitly copy the sysdata down from the phb > (which I think is redundant with 2.6 pci code). Actually we never use > bus-sysdata except for host bridges (bus->self == NULL) but it is still > copied to the pci devs on hotplug etc so it has to be compatable with > dev->sysdata aka a OF device node. > > The cleanest is to add this to struct pci_bus but we could also use a word > in struct device_node (call to_OF_node on bus->self, sysdata if NULL) if > needed. We could also search a list of (bus, mask) pairs or something, but > we don't use many of the 256 possible buses on each PHB. 
> > milton > -- Brian King eServer Storage I/O IBM Linux Technology Center From benh at kernel.crashing.org Sun Nov 21 10:36:13 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 21 Nov 2004 10:36:13 +1100 Subject: [PATCH] PPC64: EEH Recovery In-Reply-To: <200411202311.iAKNBrw0025283@sullivan.realtime.net> References: <200411202311.iAKNBrw0025283@sullivan.realtime.net> Message-ID: <1100993773.3796.29.camel@gaston> On Sat, 2004-11-20 at 17:11 -0600, Milton D. Miller II wrote: > 2) I object to grabing pci devices so they don't disappear and reappear. > I worry about duplicate devices across register/unregister and sysfs > kobject lifetimes getting confused and duplicate names. > > I'd prefer we just kept the pci config stuff we are going to restore > off the of device node. Agreed... though it could even be a driver responsibility to restore the stuff ... The basic stuff like BARs don't need to be saved/restored I suppose too, just get the kernel to re-assign addresses after the old ones have been freed ... > adding trailing white space. tsk tsk > PS: there is more white space, this one caught my eye. Oh, I do that too all the time, bad bad emacs :) > Do we need a tree for this? No, this is overbloat > > + > > +static inline struct pci_dev * eeh_get_pci_dev(struct device_node *dn) > > +{ > > + struct pci_dev *dev = NULL; > > + char bus_id[BUS_ID_SIZE]; > > + > > + sprintf(bus_id, "%04x:%02x:%02x.%d",dn->phb->global_number, > > + dn->busno, PCI_SLOT(dn->devfn), PCI_FUNC(dn->devfn)); > > + > > + while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { > > + if (!strcmp(pci_name(dev), bus_id)) > > + return dev; > > + } > > + return NULL; > > +} > > > IICK Indeed, GACK ! > 1) matching on a stirng when we have the device_node? really? > please match on pci_device_to_OF_node(dev) > 2) for_each_pcidev() please Yup. > How about a list of (dn *, pci config words to write)? > or an array of dn I don't understand why we need to do that... 
it's totally redundant with just unplugging/re-plugging the device, the kernel will then re-assign addresses to it. From benh at kernel.crashing.org Sun Nov 21 10:42:31 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 21 Nov 2004 10:42:31 +1100 Subject: [PATCH 1/2] ppc64: Block config accesses during BIST #3 In-Reply-To: <200411202211.iAKMB4lI025034@sullivan.realtime.net> References: <200411202211.iAKMB4lI025034@sullivan.realtime.net> Message-ID: <1100994151.3795.35.camel@gaston> > 3) in pci_bus_to_host don't worry about searching for the hose, Ben and > I agree the pci code will always have bus->sysdata when calling > config routines. (pci_bus sysdata is copied form parent, and to pci_dev) Right, I have this patch I haven't sent yet ... Or maybe I did, bk is dead today, I can't verify... > 6) There is a hose per phb not per bus. You are not including the bus > number in your block table. With a EADS bridge, you are blocking the > other slots on the same phb. Right. I need that for using the patch for pmac dead devices too. > The real problem with number 6 is that (as mentioned in 5) the bus->sysdata > is not always correct. It starts its life being the PHB "parent" node. Then, it's updated to be the actual P2P bridge node, or for a PHB, stays on the parent node. > We currently use bus->sysdata 2 or 3 places in > arch code, and then later explicitly copy the sysdata down from the phb > (which I think is redundant with 2.6 pci code). Actually we never use > bus-sysdata except for host bridges (bus->self == NULL) but it is still > copied to the pci devs on hotplug etc so it has to be compatable with > dev->sysdata aka a OF device node. With the simplified pci_bus_to_host(), we use bus->sysdata->phb, which should be fine at all times. > The cleanest is to add this to struct pci_bus but we could also use a word > in struct device_node (call to_OF_node on bus->self, sysdata if NULL) if > needed.
We could also search a list of (bus, mask) pairs or something, but > we don't use many of the 256 possible buses on each PHB. Maybe the best solution is to put an arch hook in there for the locking instead of the generic code finally.... Ben. From miltonm at bga.com Sun Nov 21 21:25:31 2004 From: miltonm at bga.com (Milton Miller) Date: Sun, 21 Nov 2004 04:25:31 -0600 Subject: [PATCH] PPC64: EEH Recovery Message-ID: [that = write the bars after device removal and reinsert] Ben wrote: > I don't understand why we need to do that... it's totally redundant > with > just unplugging/re-plugging the device, the kernel will then re-assign > addresses to it. > In my other kernel work, I have yet to see the kernel assign our bars correctly and take pre-existing bars at the same time. Not to say that it shouldn't, just that code to restore the bars may be required. In other words, I would feel better if they were restored before calling the pci layer to re-probe the slot. However, the only thing that should be needed is a list of device nodes and values. Actually the values are already in the "assigned-addresses" property. milton From benh at kernel.crashing.org Mon Nov 22 12:48:15 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 12:48:15 +1100 Subject: [PATCH] ppc64: Fix default command line Message-ID: <1101088095.13598.13.camel@gaston> Hi ! The ppc64 kernel can be built with a default command line (CONFIG_CMDLINE) for cases where none is provided by the firmware. However, some OF implementations always pass a "bootargs" property that only contains the "0" terminating byte of a C string, which caused us to think there was a command line, and not use the built-in one. This patch fixes it.
Signed-off-by: Benjamin Herrenschmidt

Index: linux-work/arch/ppc64/kernel/prom.c
===================================================================
--- linux-work.orig/arch/ppc64/kernel/prom.c	2004-11-22 11:59:42.918494552 +1100
+++ linux-work/arch/ppc64/kernel/prom.c	2004-11-22 12:45:41.563116824 +1100
@@ -823,7 +823,7 @@
 			strlcpy(cmd_line, p, min(l, COMMAND_LINE_SIZE));
 	}
 #ifdef CONFIG_CMDLINE
-	if (l == 0) /* dbl check */
+	if (l == 0 || (l == 1 && (*p) == 0))
 		strlcpy(cmd_line, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
 #endif /* CONFIG_CMDLINE */

From benh at kernel.crashing.org Mon Nov 22 12:49:57 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 22 Nov 2004 12:49:57 +1100
Subject: [PATCH] ppc64: Fix typo when parsing isa "reg" properties
Message-ID: <1101088198.13597.15.camel@gaston>

Hi !

This patch fixes a typo in the code that parses Open Firmware properties
for devices under an "isa" node: the incorrect struct size was used when
parsing the "reg" property of these.

Signed-off-by: Benjamin Herrenschmidt

Index: linux-work/arch/ppc64/kernel/prom.c
===================================================================
--- linux-work.orig/arch/ppc64/kernel/prom.c	2004-11-22 11:49:24.000000000 +1100
+++ linux-work/arch/ppc64/kernel/prom.c	2004-11-22 11:59:42.918494552 +1100
@@ -441,7 +441,7 @@
 	if (rp != 0 && l >= sizeof(struct isa_reg_property)) {
 		i = 0;
 		adr = (struct address_range *) mem_start;
-		while ((l -= sizeof(struct reg_property)) >= 0) {
+		while ((l -= sizeof(struct isa_reg_property)) >= 0) {
 			if (!measure_only) {
 				adr[i].space = rp[i].space;
 				adr[i].address = rp[i].address;

From benh at kernel.crashing.org Mon Nov 22 12:52:33 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 22 Nov 2004 12:52:33 +1100
Subject: [PATCH] ppc64: pci_bus_to_host() simplification
Message-ID: <1101088353.13612.19.camel@gaston>

Hi !
The pci_bus_to_host() inline function used on ppc64 to find the
pci_controller structure a given pci_bus resides on used to contain bogus
tree-walking code, which fortunately ended up never being necessary
since "sysdata" always points to a device_node structure that has the
proper "phb" field (even if it is not the device-node of the actual P2P,
which happens during boot).

Signed-off-by: Benjamin Herrenschmidt

Index: linux-work/include/asm-ppc64/pci-bridge.h
===================================================================
--- linux-work.orig/include/asm-ppc64/pci-bridge.h	2004-11-22 11:50:50.000000000 +1100
+++ linux-work/include/asm-ppc64/pci-bridge.h	2004-11-22 12:45:52.695424456 +1100
@@ -88,17 +88,9 @@
 static inline struct pci_controller *pci_bus_to_host(struct pci_bus *bus)
 {
-	struct device_node *busdn;
+	struct device_node *busdn = bus->sysdata;
-	busdn = bus->sysdata;
-	if (busdn == 0) {
-		struct pci_bus *b;
-		for (b = bus->parent; b && bus->sysdata == 0; b = b->parent)
-			;
-		busdn = b->sysdata;
-	}
-	if (busdn == NULL)
-		return NULL;
+	BUG_ON(busdn == NULL);
 	return busdn->phb;
 }

From benh at kernel.crashing.org Mon Nov 22 12:54:46 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 22 Nov 2004 12:54:46 +1100
Subject: [PATCH] ppc64: Fix early serial setup baud rate
Message-ID: <1101088486.13598.23.camel@gaston>

Hi !

The "udbg" code used on ppc64 for early consoles including early serial
console recently got a new "default speed" option. This was implemented
as a switch case that missed a few important cases, one being necessary
for a board being released soon.

This patch fixes it by using the proper division to calculate the dll
value for the uart instead of that bogus switch/case.
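The divisor arithmetic mentioned above can be sketched on its own like
this. It is a minimal illustration under the same assumption the patch
makes, namely that a divisor of 1 corresponds to 115200 baud; the names
UDBG_MAX_BAUD, udbg_divisor(), udbg_dll() and udbg_dlm() are hypothetical
and do not exist in udbg.c.

```c
#include <stdint.h>

/* Assumed: UART input clock / 16, i.e. the rate obtained with divisor 1. */
#define UDBG_MAX_BAUD 115200u

/*
 * Hypothetical helper: 16-bit divisor for the requested speed.
 * A speed of 0 falls back to divisor 12 (9600 baud), as in the patch.
 */
static uint16_t udbg_divisor(unsigned int speed)
{
	return (uint16_t)(speed ? (UDBG_MAX_BAUD / speed) : 12);
}

/* Low byte of the divisor, the value written to the DLL register. */
static uint8_t udbg_dll(uint16_t divisor)
{
	return (uint8_t)(divisor & 0xff);
}

/* High byte of the divisor, the value written to the DLM register. */
static uint8_t udbg_dlm(uint16_t divisor)
{
	return (uint8_t)(divisor >> 8);
}
```

For 300 baud the divisor is 384 (0x180), so DLL gets 0x80 and DLM gets
0x01; a u8 variable as in the old code could not hold that value.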
Signed-off-by: Benjamin Herrenschmidt

Index: linux-work/arch/ppc64/kernel/udbg.c
===================================================================
--- linux-work.orig/arch/ppc64/kernel/udbg.c	2004-11-22 11:49:25.000000000 +1100
+++ linux-work/arch/ppc64/kernel/udbg.c	2004-11-22 11:59:47.652774832 +1100
@@ -56,28 +56,17 @@
 void udbg_init_uart(void __iomem *comport, unsigned int speed)
 {
-	u8 dll = 12;
+	u16 dll = speed ? (115200 / speed) : 12;
-	switch(speed) {
-	case 115200:
-		dll = 1;
-		break;
-	case 57600:
-		dll = 2;
-		break;
-	case 38400:
-		dll = 3;
-		break;
-	}
 	if (comport) {
 		udbg_comport = (struct NS16550 __iomem *)comport;
 		out_8(&udbg_comport->lcr, 0x00);
 		out_8(&udbg_comport->ier, 0xff);
 		out_8(&udbg_comport->ier, 0x00);
 		out_8(&udbg_comport->lcr, 0x80);	/* Access baud rate */
-		out_8(&udbg_comport->dll, dll);	/* 1 = 115200, 2 = 57600,
+		out_8(&udbg_comport->dll, dll & 0xff);	/* 1 = 115200, 2 = 57600,
 							   3 = 38400, 12 = 9600 baud */
-		out_8(&udbg_comport->dlm, 0x00);	/* dll >> 8 which should be zero
+		out_8(&udbg_comport->dlm, dll >> 8);	/* dll >> 8 which should be zero
 							   for fast rates; */
 		out_8(&udbg_comport->lcr, 0x03);	/* 8 data, 1 stop, no parity */
 		out_8(&udbg_comport->mcr, 0x03);	/* RTS/DTR */

From l_indien at magic.fr Mon Nov 22 14:24:55 2004
From: l_indien at magic.fr (J.
Mayer) Date: Mon, 22 Nov 2004 04:24:55 +0100 Subject: Booting Imac G5 In-Reply-To: <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> Message-ID: <1101093895.31127.22.camel@rapid> Hi, On Mon, 2004-11-15 at 10:26, Segher Boessenkool wrote: > >> > >> - Have to unplug/replug the USB keyboard > >> after kernel boot to make it work with kernel > >> 2.6.10-rc1. No such problem with 2.6.9. > > > > There have been various USB related issue in 2.6.10-rc*, have you tried > > the latest bk ? > > Do you have an url for that? USB is misbehaving on my 7,2 as well. I just tested 2.6.10-rc2 and USB related problems I saw are now fixed. > >> - Lot's of segfaults occuring when multiple > >> concurent processes are running, especially > >> during compilations. Maybe the RAM is bad > >> (I'm trying to port memtest86 to check this) > >> or the CPU state is not completely saved / > >> restored when rescheduling: it seems not > >> to occur when compiling without doing > >> anything else concurently. > >> > >> Do you have any idea about the last point ? Could this be Altivec > >> context save / restore problems ? > > > > I very much doubt it has anything to do with CPU context > > saving/restoring... Could be lots of different things, difficult to say > > at this point. Thermal problem ? Clock chip setup problem ? Bad > > RAMs ? ... 
> > Doesn't sound like a hardware problem, as it only occurs if he is > running multiple processes... > > Or maybe it is... try running without X? I think I found the problem. While testing my RTC driver, I had some problems which seemed to be cache related. I made a try: in arch/ppc64/kernel/misc.S, in flush_dcache_range and flush_dcache_phys_range, I added a dcbf 0,r6 just after the dcbst 0,r6. This fixed the problems I had in the RTC driver and I can notice that I got no more segfault using this patch (I even can now update my Gentoo under X11 using Gnome !). I guess I could remove the dcbst, as dcbz does the same plus L1 cache invalidation. I'm not completly sure of what the problem is, but I think it can be either the dbcst is not appropriate, as it does not invalidate the L1 cache in 970FX, or HID registers are not well programmed. I'll send soon a new patch which will make the RTC accesses, reboot and halt available on SMU based machines (just need a few more tests and cleanups). Regards. -- J. Mayer Never organized From benh at kernel.crashing.org Mon Nov 22 14:38:36 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 14:38:36 +1100 Subject: ppc64 vDSO update Message-ID: <1101094716.13598.39.camel@gaston> At the URL below, you can find a new version of the ppc64 vDSO patch against a recent Linus bk tree. I intend to submit it upstream real soon as the work on non-executable stack is waiting for it, though we must first make sure the way symbols are exported to userland is ok for glibc. http://gate.crashing.org/~benh/ppc64-vdso-20041122.diff Following various comments, I've changed the way the 64 bits vDSO exports symbols, it now no-longer has function descriptors & "dot" symbols, the symbols point directly to the functions with a link base of 0 (so the symbol value is really an offset to the code now). 
You can still build the old way though with a #define in include/asm-ppc64/vdso.h, at least until we have formally agreed on what mecanism we want to use. (Craig: the signal issue is fixed now, either when building with descriptors or without). Ben. From benh at kernel.crashing.org Mon Nov 22 14:42:18 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 14:42:18 +1100 Subject: Booting Imac G5 In-Reply-To: <1101093895.31127.22.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> Message-ID: <1101094938.13598.42.camel@gaston> > I think I found the problem. > While testing my RTC driver, I had some problems which seemed to be > cache related. > I made a try: > in arch/ppc64/kernel/misc.S, in flush_dcache_range and > flush_dcache_phys_range, I added a > dcbf 0,r6 just after the dcbst 0,r6. > This fixed the problems I had in the RTC driver and I can notice that I > got no more segfault using this patch (I even can now update my Gentoo > under X11 using Gnome !). > I guess I could remove the dcbst, as dcbz does the same plus L1 cache > invalidation. > I'm not completly sure of what the problem is, but I think it can be > either the dbcst is not appropriate, as it does not invalidate the L1 > cache in 970FX, or HID registers are not well programmed. Hrm... that's bad ... 
dcbf will actually invalidate the line from the cache, not only flush it, which is not what we want (it's correct still, but not optimal) and there may well be existing code using dcbst.... I'll investigate, but that is definitely not normal. What is your exact PVR value ? (cpu rev.) and can you send me the values in HID0, HID1, HID4 and HID5 ? (just add some printk("HIDx: %lx\n", mfspr(SPRN_HIDx)); somewhere in the kernel) Ben. From benh at kernel.crashing.org Mon Nov 22 14:49:14 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 14:49:14 +1100 Subject: Booting Imac G5 In-Reply-To: <1101094938.13598.42.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> Message-ID: <1101095354.13598.44.camel@gaston> > Hrm... that's bad ... dcbf will actually invalidate the line from the > cache, not only flush it, which is not what we want (it's correct still, > but not optimal) and there may well be existing code using dcbst.... > I'll investigate, but that is definitely not normal. What is your exact > PVR value ? (cpu rev.) and can you send me the values in HID0, HID1, > HID4 and HID5 ? 
(just add some printk("HIDx: %lx\n", mfspr(SPRN_HIDx)); > somewhere in the kernel) In addition, the L1 D cache of the 970 is write-through, so it should never contain dirty data anyway... so neither dcbf nor dcbst is strictly necessary, provided we have a sync before the icbi afaik ... Ben. From benh at kernel.crashing.org Mon Nov 22 14:51:29 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 14:51:29 +1100 Subject: Booting Imac G5 In-Reply-To: <1101094938.13598.42.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> Message-ID: <1101095489.13598.46.camel@gaston> On Mon, 2004-11-22 at 14:42 +1100, Benjamin Herrenschmidt wrote: > > I think I found the problem. > > While testing my RTC driver, I had some problems which seemed to be > > cache related. > > I made a try: > > in arch/ppc64/kernel/misc.S, in flush_dcache_range and > > flush_dcache_phys_range, I added a > > dcbf 0,r6 just after the dcbst 0,r6. > > This fixed the problems I had in the RTC driver and I can notice that I > > got no more segfault using this patch (I even can now update my Gentoo > > under X11 using Gnome !). > > I guess I could remove the dcbst, as dcbz does the same plus L1 cache > > invalidation. 
> > I'm not completly sure of what the problem is, but I think it can be > > either the dbcst is not appropriate, as it does not invalidate the L1 > > cache in 970FX, or HID registers are not well programmed. Oh, and flush_dcache_range/phys_range aren't really used anyway for anything but the IOMMU code, which shouldn't be used on the iMac since you have <= 2Gb of RAM on this machine... so unless you did something else, those 2 functions don't explain the issue (flush_dcache_icache is another one tho). Ben. From benh at kernel.crashing.org Mon Nov 22 14:57:44 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 14:57:44 +1100 Subject: Kernel panic: select_hpte_slot found entry already valid In-Reply-To: <20041119195410.GA14818@cs.umn.edu> References: <20041119195410.GA14818@cs.umn.edu> Message-ID: <1101095864.13597.49.camel@gaston> On Fri, 2004-11-19 at 13:54 -0600, Dave C Boutcher wrote: > This look familiar to anyone? Legacy iseries error. I recall > benh was playing around in there last :-) This is a customer > running Red Hat in production. It tickles old bad memories ... but mostly in 2.6, I don't remember dealing with legacy iSeries issues in 2.4, but that doesn't mean I didn't :) Normally, 2.4 iSeries htab code has hashed spinlocks protecting it, no ? Ben. From olof at austin.ibm.com Mon Nov 22 15:59:47 2004 From: olof at austin.ibm.com (Olof Johansson) Date: Sun, 21 Nov 2004 22:59:47 -0600 Subject: Kernel panic: select_hpte_slot found entry already valid In-Reply-To: <20041119195410.GA14818@cs.umn.edu> References: <20041119195410.GA14818@cs.umn.edu> Message-ID: <20041122045947.GA6286@austin.ibm.com> On Fri, Nov 19, 2004 at 01:54:10PM -0600, Dave C Boutcher wrote: > This look familiar to anyone? Legacy iseries error. I recall > benh was playing around in there last :-) This is a customer > running Red Hat in production. 
Looks similar to something that should have been fixed in time for the 20.EL kernel, at least with the recreate we did have then. (Redhat BZ 120270). Len: Did this happen on both 15.EL and 20.EL, or on just one of them? I can't tell for sure from the email Dave included. -Olof From l_indien at magic.fr Mon Nov 22 17:23:00 2004 From: l_indien at magic.fr (J. Mayer) Date: Mon, 22 Nov 2004 07:23:00 +0100 Subject: Booting Imac G5 In-Reply-To: <1101094938.13598.42.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> Message-ID: <1101104580.31127.115.camel@rapid> On Mon, 2004-11-22 at 04:42, Benjamin Herrenschmidt wrote: > > I think I found the problem. > > While testing my RTC driver, I had some problems which seemed to be > > cache related. > > I made a try: > > in arch/ppc64/kernel/misc.S, in flush_dcache_range and > > flush_dcache_phys_range, I added a > > dcbf 0,r6 just after the dcbst 0,r6. > > This fixed the problems I had in the RTC driver and I can notice that I > > got no more segfault using this patch (I even can now update my Gentoo > > under X11 using Gnome !). > > I guess I could remove the dcbst, as dcbz does the same plus L1 cache > > invalidation. 
> > I'm not completly sure of what the problem is, but I think it can be > > either the dbcst is not appropriate, as it does not invalidate the L1 > > cache in 970FX, or HID registers are not well programmed. > > Hrm... that's bad ... dcbf will actually invalidate the line from the > cache, not only flush it, which is not what we want (it's correct still, > but not optimal) and there may well be existing code using dcbst.... > I'll investigate, but that is definitely not normal. What is your exact > PVR value ? (cpu rev.) and can you send me the values in HID0, HID1, > HID4 and HID5 ? (just add some printk("HIDx: %lx\n", mfspr(SPRN_HIDx)); > somewhere in the kernel) OK, my mistake... I forgot one point: I was using 2.6.10-rc2 while my previous tests were with 2.6.10-rc1. So it seems the bug was corrected between those two pre-release. Great ! For your information, here are some registers dumps: PVR: 003c0300 MSR: 9000000000009032 PIR: 00000000 HID0: 0051108100000000 HID1: fd3c200000000000 HID4: 0000001000000000 HID5: 0000000000000000 For the SMU driver I need a function that uses dcbf, because we send a buffer address to the SMU and the chip may modify this buffer. So I now added another helper, like flush_dcache_phys_range but using dcbf instead of dcbst. > In addition, the L1 D cache of the 970 is write-through, so it should > never contain dirty data anyway... so neither dcbf nor dcbst is strictly > necessary, provided we have a sync before the icbi afaik ... Well, don't we still need dcbst to flush the L2 cache ? Furthermore, in flush_dcache_range & flush_dcache_phys_range, I can see no icbi and I don't see why we should have one, as the thing is to flush data cache, not icache. -- J. 
Mayer Never organized From benh at kernel.crashing.org Mon Nov 22 18:42:42 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 18:42:42 +1100 Subject: Booting Imac G5 In-Reply-To: <1101104580.31127.115.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> Message-ID: <1101109362.22529.82.camel@gaston> > OK, my mistake... > I forgot one point: I was using 2.6.10-rc2 while my previous tests were > with 2.6.10-rc1. > So it seems the bug was corrected between those two pre-release. Great ! > > For your information, here are some registers dumps: > PVR: 003c0300 MSR: 9000000000009032 PIR: 00000000 > HID0: 0051108100000000 HID1: fd3c200000000000 > HID4: 0000001000000000 HID5: 0000000000000000 > > For the SMU driver I need a function that uses dcbf, because we send a > buffer address to the SMU and the chip may modify this buffer. So I now > added another helper, like flush_dcache_phys_range but using dcbf > instead of dcbst. Why ? dcbst should work as well... I use flush_dcache_phys_range() for the IOMMU without problem so far... > > In addition, the L1 D cache of the 970 is write-through, so it should > > never contain dirty data anyway... 
so neither dcbf nor dcbst is strictly > > necessary, provided we have a sync before the icbi afaik ... > > Well, don't we still need dcbst to flush the L2 cache ? Not for flush_icache_dcache (that is I/D cache coherency) since only the L1 cache is split. > Furthermore, in flush_dcache_range & flush_dcache_phys_range, I can see > no icbi and I don't see why we should have one, as the thing is to flush > data cache, not icache. They are mean't for a different purpose, which is to sync with non-coherent DMA devices, like the IOMMU (I'm not sure you actually need something like that for the SPU, normally, the northbridge should take care of coherency ...) Ben. From l_indien at magic.fr Mon Nov 22 19:26:35 2004 From: l_indien at magic.fr (J. Mayer) Date: Mon, 22 Nov 2004 09:26:35 +0100 Subject: Booting Imac G5 In-Reply-To: <1101109362.22529.82.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> Message-ID: <1101111995.31127.155.camel@rapid> On Mon, 2004-11-22 at 08:42, Benjamin Herrenschmidt wrote: > > OK, my mistake... > > I forgot one point: I was using 2.6.10-rc2 while my previous tests were > > with 2.6.10-rc1. > > So it seems the bug was corrected between those two pre-release. Great ! 
> > > > For your information, here are some registers dumps: > > PVR: 003c0300 MSR: 9000000000009032 PIR: 00000000 > > HID0: 0051108100000000 HID1: fd3c200000000000 > > HID4: 0000001000000000 HID5: 0000000000000000 > > > > For the SMU driver I need a function that uses dcbf, because we send a > > buffer address to the SMU and the chip may modify this buffer. So I now > > added another helper, like flush_dcache_phys_range but using dcbf > > instead of dcbst. > > Why ? dcbst should work as well... I use flush_dcache_phys_range() for > the IOMMU without problem so far... I need to invalidate the L1 cache because I need the memory to be read again after the SMU updated the buffer. If I don't, I will read back my request instead of the SMU answer. This because the caches don't know that another chip will update this memory area. In fact, I spent quite a long time before I understood why I didn't get valid SMU messages back.... > > > In addition, the L1 D cache of the 970 is write-through, so it should > > > never contain dirty data anyway... so neither dcbf nor dcbst is strictly > > > necessary, provided we have a sync before the icbi afaik ... > > > > Well, don't we still need dcbst to flush the L2 cache ? > > Not for flush_icache_dcache (that is I/D cache coherency) since only the > L1 cache is split. Well, OK, I was still thinking about flush_dcache_range & flush_dcache_phys_range.... > > Furthermore, in flush_dcache_range & flush_dcache_phys_range, I can see > > no icbi and I don't see why we should have one, as the thing is to flush > > data cache, not icache. > > They are mean't for a different purpose, which is to sync with > non-coherent DMA devices, like the IOMMU (I'm not sure you actually need > something like that for the SPU, normally, the northbridge should take > care of coherency ...) As the SMU is a separate chip, I'm not so surprised that the northbridge does not know that it may "corrupt" some memory. 
The SMU may act as a fake CPU while doing this if it was located on the processor bus. If I understand things well, the CPU acts as an IIC slave for the SPU (which seems to be a part of the SMU) then can receive commands from this unit. But it seems that the CPU cannot talk directly to the SPU. I implemented SMU communication the same way Apple does in the OF driver: we talk through a doorbell buffer and a GPIO all located in the MacIO space. The doorbell buffer only contains the physical address of the buffer used to exchange data with the SMU. -- J. Mayer Never organized From benh at kernel.crashing.org Mon Nov 22 19:41:04 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 19:41:04 +1100 Subject: Booting Imac G5 In-Reply-To: <1101111995.31127.155.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> Message-ID: <1101112864.13597.94.camel@gaston> > I need to invalidate the L1 cache because I need the memory to be read > again after the SMU updated the buffer. If I don't, I will read back my > request instead of the SMU answer. No, the system should be fully cache coherent. 
> This because the caches don't know that another chip will update this > memory area. They do normally > In fact, I spent quite a long time before I understood why I didn't get > valid SMU messages back.... That's strange... I suspect the SMU must bypass normal cache coherency protocol in some way, but that is strange, since according to the base address of the SMU area, it's in the middle of K2, which is on a fully coherent bus. > As the SMU is a separate chip, I'm not so surprised that the northbridge > does not know that it may "corrupt" some memory. The SMU may act as a > fake CPU while doing this if it was located on the processor bus. It can't be on the processor bus. Which is why I'm surprised. It seem to be on K2, which is on a coherent bus. You may just be missing memory barriers in fact. Can you show me your code ? > If I understand things well, the CPU acts as an IIC slave for the SPU > (which seems to be a part of the SMU) then can receive commands from > this unit. But it seems that the CPU cannot talk directly to the SPU. Hrm... the CPU is definitely an i2c slave of the SPU, but the command processing goes through some MMIOs wirted to K2 no ? > I implemented SMU communication the same way Apple does in the OF > driver: we talk through a doorbell buffer and a GPIO all located in the > MacIO space. The doorbell buffer only contains the physical address of > the buffer used to exchange data with the SMU. Right... well, it's possible that the SMU actually uses i2c accesses to read the memory buffer, in which case it will indeed bypass coherency protocols... scary ! Ben. From l_indien at magic.fr Mon Nov 22 20:09:14 2004 From: l_indien at magic.fr (J. 
Mayer) Date: Mon, 22 Nov 2004 10:09:14 +0100 Subject: Booting Imac G5 In-Reply-To: <1101112864.13597.94.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> Message-ID: <1101114554.31127.174.camel@rapid> On Mon, 2004-11-22 at 09:41, Benjamin Herrenschmidt wrote: [...] > > In fact, I spent quite a long time before I understood why I didn't get > > valid SMU messages back.... > > That's strange... I suspect the SMU must bypass normal cache coherency > protocol in some way, but that is strange, since according to the base > address of the SMU area, it's in the middle of K2, which is on a fully > coherent bus. Yes, the doorbell buffer and the doorbell GPIOs are located in the K2 and I had no problem using them. > > As the SMU is a separate chip, I'm not so surprised that the northbridge > > does not know that it may "corrupt" some memory. The SMU may act as a > > fake CPU while doing this if it was located on the processor bus. > > It can't be on the processor bus. Which is why I'm surprised. Yes, was a foolish idea.... > It seem to > be on K2, which is on a coherent bus. You may just be missing memory > barriers in fact. Can you show me your code ? 
Sure, you'll find it attached. > > If I understand things well, the CPU acts as an IIC slave for the SPU > > (which seems to be a part of the SMU) then can receive commands from > > this unit. But it seems that the CPU cannot talk directly to the SPU. > > Hrm... the CPU is definitely an i2c slave of the SPU, but the command > processing goes through some MMIOs wirted to K2 no ? It seems we don't access directly the SPU or it's not showed in the OF tree. We send commands to the SMU which "translate" those commands. But I didn't spent lot's of time on SPU parts, for now only RTC, reboot and halt. > > I implemented SMU communication the same way Apple does in the OF > > driver: we talk through a doorbell buffer and a GPIO all located in the > > MacIO space. The doorbell buffer only contains the physical address of > > the buffer used to exchange data with the SMU. > > Right... well, it's possible that the SMU actually uses i2c accesses to > read the memory buffer, in which case it will indeed bypass coherency > protocols... scary ! Maybe. Take a look at my code, which is quite simple even if still not clean. Maybe you'll find a mistake I didn't see... -- J. Mayer Never organized -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smu.c Type: text/x-csrc Size: 12574 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041122/cfa3a115/attachment.c From benh at kernel.crashing.org Mon Nov 22 20:37:34 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 20:37:34 +1100 Subject: Booting Imac G5 In-Reply-To: <1101114554.31127.174.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> Message-ID: <1101116254.13598.98.camel@gaston> On Mon, 2004-11-22 at 10:09 +0100, J. Mayer wrote: > > be on K2, which is on a coherent bus. You may just be missing memory > > barriers in fact. Can you show me your code ? > > Sure, you'll find it attached. Does it work if you replace your flush_* with an mb() ? > It seems we don't access directly the SPU or it's not showed in the OF > tree. We send commands to the SMU which "translate" those commands. But > I didn't spent lot's of time on SPU parts, for now only RTC, reboot and > halt. Yah, it's a bit messy... it can also eat PMU commands it seems ... > Maybe. Take a look at my code, which is quite simple even if still not > clean. 
Maybe you'll find a mistake I didn't see... I'll have a closer look tomorrow, nice work tho ! I'm hoping to get an iMac G5 at work anytime soon (it was supposed to get here a few weeks ago, but the order didn't go thru properly internally, so it had to be re-issued). I'll be able to help more. Ben. From l_indien at magic.fr Mon Nov 22 21:31:44 2004 From: l_indien at magic.fr (J. Mayer) Date: Mon, 22 Nov 2004 11:31:44 +0100 Subject: Booting Imac G5 In-Reply-To: <1101116254.13598.98.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> Message-ID: <1101119504.31127.192.camel@rapid> On Mon, 2004-11-22 at 10:37, Benjamin Herrenschmidt wrote: > On Mon, 2004-11-22 at 10:09 +0100, J. Mayer wrote: > > > > be on K2, which is on a coherent bus. You may just be missing memory > > > barriers in fact. Can you show me your code ? > > > > Sure, you'll find it attached. > > Does it work if you replace your flush_* with an mb() ? Just made a few tries: - I made the driver more secure, taking the lock before any cmd_buf access and releasing it when answer have been treated. 
I tried to do mb() at the start of do_cmd() and after the cmd_done call.
Then, when reading the RTC, I read all zeroes except the first byte, which
is 0x81. The cmd byte isn't modified, whereas I should read back the
command ack, which is the complement of the requested command (even if I
don't check it, for now).
I tried to add a call to flush_dcache_phys_range, and got the same
result.
But with flush_inval_dcache_phys_range (yes, the name is quite
strange....), with or without mb(), it works.

> > It seems we don't access the SPU directly, or it isn't shown in the OF
> > tree. We send commands to the SMU, which "translates" those commands. But
> > I haven't spent a lot of time on the SPU parts; for now only RTC, reboot
> > and halt.
>
> Yah, it's a bit messy... it can also eat PMU commands it seems ...

I'm not sure it'll eat all PMU commands, but it sure can handle some...

> > Maybe. Take a look at my code, which is quite simple even if still not
> > clean. Maybe you'll find a mistake I didn't see...
>
> I'll have a closer look tomorrow, nice work tho !
>
> I'm hoping to get an iMac G5 at work anytime soon (it was supposed to
> get here a few weeks ago, but the order didn't go thru properly
> internally, so it had to be re-issued). I'll be able to help more.

All right !
But I think I reached my first goal: I got a usable and stable machine
with all basic features available.
This is a good start for the real, deep work that is needed !

-- 
J.
Mayer Never organized From benh at kernel.crashing.org Mon Nov 22 22:16:18 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 22 Nov 2004 22:16:18 +1100 Subject: Booting Imac G5 In-Reply-To: <1101119504.31127.192.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> Message-ID: <1101122178.13612.106.camel@gaston> > I tried to do mb() at the start of do_cmd() and after the cmd_done call. > Then, when reading the RTC, I read all zeroes but the first byte which > is 0x81. The cmd byte isn't modified, when I should read the command ack > which is the complement of the requested command (even if I don't check > it, for now). > I tried to add a call to flush_dcache_phys_range, then I got the same > result. > But using flush_inval_dcache_phys_range (yes the name is quite > strange....), with or without mb(), then it works. Ok, that seem to indicate that indeed, that crap isn't cache coherent ... I wonder how they actually access the memory from the SMU... maybe i2c commands to U3... 
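
The ack convention described above (the SMU writes back the complement of the command byte) gives a cheap way to tell a stale buffer from a real reply. A minimal sketch, assuming only that convention; the enum and function names are hypothetical and not taken from the attached smu.c:

```c
#include <stdint.h>

/* Hypothetical classification of the first byte read back from the
 * command buffer after the doorbell has been rung. */
enum smu_ack_state {
	SMU_ACK_OK,	/* byte is the complement of the command */
	SMU_ACK_STALE,	/* byte still holds the command we wrote */
	SMU_ACK_BAD	/* anything else */
};

/* Assumption: the reply overwrites the command byte with ~cmd. If we
 * still see cmd itself, either the SMU has not answered yet or we are
 * reading a stale cache line. */
static enum smu_ack_state smu_classify_ack(uint8_t cmd, uint8_t readback)
{
	if (readback == (uint8_t)~cmd)
		return SMU_ACK_OK;
	if (readback == cmd)
		return SMU_ACK_STALE;
	return SMU_ACK_BAD;
}
```

Seeing SMU_ACK_STALE after the doorbell has completed would match the "cmd byte isn't modified" symptom and point at the cache rather than at the SMU.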
> I'm not sure it'll eat all PMU commands, but it sure can treat some... Probably not all, I'm not even sure Apple themselves still knows what all PMU commands do :) The PMU is a mess that evolved from the original one that was in the very first Mac Portable ! > All right ! > But I think I reached my first goal: I got a usable and stable machine > with all basic features available. Yes, that's really nice. Is rivafb working too ? I've seen somebody submitting a patch that recently got into Linus bk fixing some issues with the 5200 > This is a good start for the real deep needed work ! Hehe, indeed ! Ben. From len at ghy.com Tue Nov 23 01:38:44 2004 From: len at ghy.com (Len Goldenstein) Date: Mon, 22 Nov 2004 08:38:44 -0600 Subject: Kernel panic: select_hpte_slot found entry already valid In-Reply-To: <20041122045947.GA6286@austin.ibm.com> Message-ID: Hi Olof, This only occurs on 20.EL. 15.EL (and previous) has no had any problems at all running in the same production environment. ----------------------------------------- Len Goldenstein Network Administrator IBM Certified Specialist eServer iSeries Linux Technical Solutions Geo. H. Young & Co. Ltd. 809 - 167 Lombard Ave. Winnipeg, MB R3B 3H8 Phone: (204) 947-6700 Ext. 203 Fax: (204) 947-3306 len at ghy.com http://www.ghy.com Due to changing circumstances and the complexity of Customs Regulations, our opinions are based on the available information and should not be considered binding. If you would like a binding ruling, please do not hesitate in contacting our Professional Service Department via email at consulting at ghy.com. 
GHY International is PIP Approved (Canada) and C-TPAT Certified (USA) > -----Original Message----- > From: Olof Johansson [mailto:olof at austin.ibm.com] > Sent: November 21, 2004 11:00 PM > To: sleddog at us.ibm.com; len at ghy.com > Cc: linuxppc64-dev at ozlabs.org > Subject: Re: Kernel panic: select_hpte_slot found entry already valid > > > On Fri, Nov 19, 2004 at 01:54:10PM -0600, Dave C Boutcher wrote: > > This look familiar to anyone? Legacy iseries error. I recall > > benh was playing around in there last :-) This is a customer > > running Red Hat in production. > > Looks similar to something that should have been fixed in time for the > 20.EL kernel, at least with the recreate we did have then. (Redhat BZ > 120270). > > Len: Did this happen on both 15.EL and 20.EL, or on just one of them? I > can't tell for sure from the email Dave included. > > > -Olof > From nathanl at austin.ibm.com Tue Nov 23 05:58:54 2004 From: nathanl at austin.ibm.com (Nathan Lynch) Date: Mon, 22 Nov 2004 12:58:54 -0600 Subject: [RFC] sysfs cpu cleanup In-Reply-To: <20041119023059.GF16796@zax> References: <20041117071608.GB19019@zax> <1100703552.8092.15.camel@localhost.localdomain> <20041118234444.GA16796@zax> <1100823721.6601.11.camel@biclops> <20041119023059.GF16796@zax> Message-ID: <1101149934.6189.16.camel@biclops> On Thu, 2004-11-18 at 20:30, David Gibson wrote: > Index: working-2.6/arch/ppc64/kernel/smp.c > =================================================================== > --- working-2.6.orig/arch/ppc64/kernel/smp.c 2004-10-19 13:37:56.000000000 +1000 > +++ working-2.6/arch/ppc64/kernel/smp.c 2004-11-17 17:56:42.000000000 +1100 > @@ -82,6 +82,8 @@ > void smp_call_function_interrupt(void); > extern long register_vpa(unsigned long flags, unsigned long proc, > unsigned long vpa); > +extern void register_cpu_online(int cpu); > +extern void unregister_cpu_online(int cpu); > > int smt_enabled_at_boot = 1; > > @@ -291,6 +293,8 @@ > int cpu_status; > unsigned int pcpu = 
get_hard_smp_processor_id(cpu); > > + unregister_cpu_online(cpu); > + > for (tries = 0; tries < 25; tries++) { > cpu_status = query_cpu_stopped(pcpu); > if (cpu_status == 0 || cpu_status == -1) > @@ -919,6 +923,11 @@ > while (!cpu_online(cpu)) > cpu_relax(); > > +#ifdef CONFIG_HOTPLUG_CPU > + if (system_state >= SYSTEM_RUNNING) /* This is a hotplug */ > + register_cpu_online(cpu); > +#endif /* CONFIG_HOTPLUG_CPU */ > + > return 0; > } I apologize for not noticing this before :/ Instead of explicitly calling into the sysfs.c code from smp.c, I think it would be better to use a cpu hotplug notifier block. Even without CONFIG_HOTPLUG_CPU turned on, you get the "online" notification at boot for each cpu as it is brought up. See the ppc64 numa code for an example usage. Nathan From linas at austin.ibm.com Tue Nov 23 06:50:44 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Mon, 22 Nov 2004 13:50:44 -0600 Subject: [PATCH] PPC64: EEH Recovery In-Reply-To: <200411202311.iAKNBrw0025283@sullivan.realtime.net> References: <200411202311.iAKNBrw0025283@sullivan.realtime.net> Message-ID: <20041122195044.GF23780@austin.ibm.com> On Sat, Nov 20, 2004 at 05:11:53PM -0600, Milton D. Miller II was heard to remark: > > The patch below implements hotplug style EEH error recovery. > > Here's some discussion :) > > 1) why are you EXPORT_SYMBOL rtas_write_config? eeh.c isn't a module No reason. Force of habit. I'll remove that. > 2) I object to grabing pci devices so they don't disappear and reappear. > I worry about duplicate devices across register/unregister and sysfs > kobject lifetimes getting confused and duplicate names. I'll double-check, I was under the impression that the unregister happened, and I was just avodiing the final free(). Maybe I'm sorely mistaken. > I'd prefer we just kept the pci config stuff we are going to restore > off the of device node. I could do that. 
However, while coding this, I was anticipating the comment "why did you
add this stuff to struct device_node, when there's already a spot for it
in the pci_dev struct?" and so I recycled the pci_dev structs. I don't
much care one way or the other about this issue.

> > -	if (!dev)
> > +	if (!dev)
> > 		return val;
> >
> > 	dn = pci_device_to_OF_node(dev);
>
> adding trailing white space. tsk tsk

Oops.

This comes from the nasty habit of the linux coding style not using
braces for single-line if's written as double-lines. So if one
inserts braces during debug, one has to remember to remove
them afterwards, which can lead to inadvertent whitespace.

Personally, I'd like to get the linux kernel coding style changed;
this *is* an issue I do care about, a lot. However, I figure that
would be an unpleasant battle. Oh well.

> > +struct eeh_cfg_tree
> > +{
> > +	struct eeh_cfg_tree *sibling;
> > +	struct eeh_cfg_tree *child;
> > +	struct pci_dev *dev;
> > +	struct device_node *dn;
> > +};
>
> Do we need a tree for this?

? Don't understand the question. PCI devices are arranged in a tree.
One of the cards I test with has a bridge and several devices under it.
So one has to walk the whole tree, which might be arbitrarily deep, to
get to all of the devices.

> IICK
>
> 1) matching on a string when we have the device_node? really?
>    please match on pci_device_to_OF_node(dev)
> 2) for_each_pcidev() please

Sorry, I copied pre-existing code that does this. I'll fix it here, and
in the other places as well.

> How about a list of (dn *, pci config words to write)?
> or an array of dn

? Don't understand the question.

> > -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot);
>
> not analyzed ...

??
--linas From linas at austin.ibm.com Tue Nov 23 07:21:18 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Mon, 22 Nov 2004 14:21:18 -0600 Subject: [PATCH] PPC64: EEH Recovery In-Reply-To: <1100993773.3796.29.camel@gaston> References: <200411202311.iAKNBrw0025283@sullivan.realtime.net> <1100993773.3796.29.camel@gaston> Message-ID: <20041122202118.GG23780@austin.ibm.com> On Sun, Nov 21, 2004 at 10:36:13AM +1100, Benjamin Herrenschmidt was heard to remark: > On Sat, 2004-11-20 at 17:11 -0600, Milton D. Miller II wrote: > > > 2) I object to grabing pci devices so they don't disappear and reappear. > > I worry about duplicate devices across register/unregister and sysfs > > kobject lifetimes getting confused and duplicate names. > > > > I'd prefer we just kept the pci config stuff we are going to restore > > off the of device node. > > Agreed... though it could even be a driver responsibility to restore the > stuff ... The basic stuff like BARs don't need to be saved/restored I > suppose too, just get the kernel to re-assign addresses after the old > ones have been freed ... I don't know how to get the kernel to assign BAR's. I looked for this, didn't find anything, and thus concluded that I had to save & restore BAR's myself. The power-management suspend-resume code comes close, but not quite. There were no firmware calls for this. The firmware will set up the BAR's when power is toggled (initial system powerup, or pci hotplug). However, it will *not* automatically reconfigure after a reset. There is an explicit call to ask it to reconfigure a bridge after reset, but not a device. So I used the configure-bridge rtas call, but did the devices myself, manually. > > How about a list of (dn *, pci config words to write)? > > or an array of dn > > I don't understand why we need to do that... it's totally redundant with > just unplugging/re-plugging the device, the kernel will then re-assign > addresses to it. You mean "the firmware". 
Yes, I thought about doing that, but the problem seemed to be that the
rpa_php_hotplug tools were available only as RPMs from the IBM website,
and were specific to the PPC64 architecture. So I figured that asking
for generic, architecture-independent udev and hotplug scripts to be
modified to invoke a PPC64-specific closed-source binary was not going
to work.

So it seemed easier to do the reset in the kernel; it's not a whole
lotta lines of code. The only gotcha was to save and restore the
various BARs so that the devices would come up properly.

--linas

From benh at kernel.crashing.org Tue Nov 23 09:12:36 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Tue, 23 Nov 2004 09:12:36 +1100
Subject: [PATCH] PPC64: EEH Recovery
In-Reply-To: <20041122195044.GF23780@austin.ibm.com>
References: <200411202311.iAKNBrw0025283@sullivan.realtime.net> <20041122195044.GF23780@austin.ibm.com>
Message-ID: <1101161556.13598.128.camel@gaston>

> I'll double-check, I was under the impression that the unregister
> happened, and I was just avoiding the final free(). Maybe I'm sorely
> mistaken.

In fact, I don't think there is much problem keeping the pci_dev
structure around if it has been unlinked from the bus & other kobject
hierarchy, but make sure of that ... It will ultimately go away though.

> > > -	if (!dev)
> > > +	if (!dev)
> > > 		return val;
> > >
> > > 	dn = pci_device_to_OF_node(dev);
> >
> > adding trailing white space. tsk tsk
>
> Oops.
>
> This comes from the nasty habit of the linux coding style not using
> braces for single-line if's written as double-lines.

"nasty habit" ... Is there anything in linux you won't qualify of "nasty
habits" one day ? :)

I hate braces for one lines. they are ugly.

> So if one
> inserts braces during debug, one has to remember to remove
> them afterwards, which can lead to inadvertent whitespace.

You should always do a dirdiff pass to review your patches, that catches
these things, and sometimes others...
> Personally, I'd like to get the linux kernel coding style changed;
> this *is* an issue I do care about, a lot. However, I figure that
> would be an unpleasant battle. Oh well.

Get the linux kernel coding style changed ? wow ! good luck :) You won't
win a battle whose entire point is to create ugliness ;)

> ? Don't understand the question. PCI devices are arranged in a tree.
> One of the cards I test with has a bridge and several devices under it.
> So one has to walk the whole tree, which might be arbitrarily deep, to
> get to all of the devices.

Then attach your structures to the device-tree instead. No need to
duplicate the structure.

From linas at austin.ibm.com Tue Nov 23 11:57:11 2004
From: linas at austin.ibm.com (Linas Vepstas)
Date: Mon, 22 Nov 2004 18:57:11 -0600
Subject: [PATCH] PPC64: EEH Recovery
In-Reply-To: <1101161556.13598.128.camel@gaston>
References: <200411202311.iAKNBrw0025283@sullivan.realtime.net> <20041122195044.GF23780@austin.ibm.com> <1101161556.13598.128.camel@gaston>
Message-ID: <20041123005711.GH23780@austin.ibm.com>

On Tue, Nov 23, 2004 at 09:12:36AM +1100, Benjamin Herrenschmidt was heard to remark:
>
> "nasty habit" ... Is there anything in linux you won't qualify of "nasty
> habits" one day ? :)

Yikes! Have I really been that cranky, that often? Bad weather must
have put me in a bad mood :). That, or lack of fresh air.

> I hate braces for one lines. they are ugly.

Ahh, I occasionally have to debug code that looks like

   if(a)
      if (b)
         x=y;
   else
      x=z;

so I get cranky when I don't see braces. That, and the fact that I've
helped other people debug code that had started life as

   if (b)
      x=y;

and had been changed to

   if (b)
      f(y);
      x=y;

and so I have come to enjoy braces as a way of avoiding
stupid-error-prone-human-errors. So what if it's ugly, as long as
it saves you time and trouble later?

> > ? Don't understand the question. PCI devices are arranged in a tree.
> > One of the cards I test with has a bridge and several devices under it.
> > So one has to walk the whole tree, which might be arbitrarily deep, to
> > get to all of the devices.
>
> Then attach your structures to the device-tree instead. No need to
> duplicate the structure.

Heh :) Removing the pci device also causes all of the pointers in the
device tree for that device to be nulled out, and so one loses the
topology. So I had three bad choices:

1) Change the device-remove code to not null-out the pointers.
   This seemed potentially dangerous, and might have unexpected
   side-effects. And worse: if someone, one day in the future,
   changed the code back to null out the pointers, my code would
   break. So I didn't do this.

2) Add pointers to the device tree which point to exactly the same
   things as other pointers in the device tree, but which aren't
   null'ed out during device delete. This seemed to lead to confusion:
   someday, someone will wonder why there are two sets of identical
   pointers in the struct, and "fix it".

3) Create the structs I needed. Simple, easy, limited in scope to just
   that one file, doesn't have to be exposed externally ... seemed like
   such an easy idea. So that's what I did.

--linas

From benh at kernel.crashing.org Tue Nov 23 13:37:20 2004
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Tue, 23 Nov 2004 13:37:20 +1100
Subject: [PATCH] PPC64: EEH Recovery
In-Reply-To: <20041123005711.GH23780@austin.ibm.com>
References: <200411202311.iAKNBrw0025283@sullivan.realtime.net> <20041122195044.GF23780@austin.ibm.com> <1101161556.13598.128.camel@gaston> <20041123005711.GH23780@austin.ibm.com>
Message-ID: <1101177441.6633.12.camel@gaston>

> Ahh, I occasionally have to debug code that looks like
>
>    if(a)
>       if (b)
>          x=y;
>    else
>       x=z;
>
> so I get cranky when I don't see braces. That, and the fact that I've
> helped other people debug code that had started life as

Of course, the above case would definitely require { } to be sane and
gcc would probably even warn about it.
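
For reference, here are the two readings of the quoted fragment, each written with explicit braces. This is only a sketch: the function names are made up for illustration, and x is passed in so its previous value is visible when no branch assigns it. In C an else binds to the nearest unmatched if, so the compiler takes the first form whatever the indentation suggests.

```c
/* What the compiler actually does with the unbraced fragment:
 * the else belongs to the inner "if (b)". */
static int as_parsed(int a, int b, int y, int z, int x)
{
	if (a) {
		if (b)
			x = y;
		else
			x = z;
	}
	return x;
}

/* What indentation that lines the else up with "if (a)" would suggest
 * the author meant. */
static int as_indented(int a, int b, int y, int z, int x)
{
	if (a) {
		if (b)
			x = y;
	} else {
		x = z;
	}
	return x;
}
```

The two differ whenever exactly one of a and b is true: with a false, as_parsed() leaves x alone while as_indented() assigns z; with a true and b false it is the other way round.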
> and so I have come to enjoy braces as a way of avoiding > stupid-error-prone-human-errors. So what if its ugly, as long as > it saves you time and trouble later? Because it is ugly :) taste taste ... > Heh :) Removing the pci device also causes all of the pointers in the > device tree for that device to be nulled out, and so one looses the > topology. So I had three bad choices: Hrm... just keeping the domain/bus/devfn should be enough anyway > 1) Change the device-remove code to not null-out the pointers. > This seemed potentially dangerous, and might have unexpected > side-effects. And worse: if someone, one day in the future, > changed the code back to null out the pointers, my code would > break. So I didn't do this. > > 2) Add pointers to the device tree which point to exactly the same > things as other pointers in the device tree, but which aren't > null'ed out during device delete. This seemed to lead to confusion: > someday, someone will wonder why there are two sets of identical > pointers in the struct, and "fix it". > > 3) create the structs I needed. Simple, easy, limited in scope to just > that one file, doesn't have to be exposed externally ... seemed like > such an easy idea. So that's what I did. 
> > --linas -- Benjamin Herrenschmidt From david at gibson.dropbear.id.au Tue Nov 23 15:04:46 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Tue, 23 Nov 2004 15:04:46 +1100 Subject: [RFC] sysfs cpu cleanup In-Reply-To: <1101149934.6189.16.camel@biclops> References: <20041117071608.GB19019@zax> <1100703552.8092.15.camel@localhost.localdomain> <20041118234444.GA16796@zax> <1100823721.6601.11.camel@biclops> <20041119023059.GF16796@zax> <1101149934.6189.16.camel@biclops> Message-ID: <20041123040446.GA5646@zax> On Mon, Nov 22, 2004 at 12:58:54PM -0600, Nathan Lynch wrote: > On Thu, 2004-11-18 at 20:30, David Gibson wrote: > > Index: working-2.6/arch/ppc64/kernel/smp.c > > =================================================================== > > +++ working-2.6/arch/ppc64/kernel/smp.c 2004-11-17 17:56:42.000000000 +1100 > > @@ -82,6 +82,8 @@ > > void smp_call_function_interrupt(void); > > extern long register_vpa(unsigned long flags, unsigned long proc, > > unsigned long vpa); > > +extern void register_cpu_online(int cpu); > > +extern void unregister_cpu_online(int cpu); > > > > int smt_enabled_at_boot = 1; > > > > @@ -291,6 +293,8 @@ > > int cpu_status; > > unsigned int pcpu = get_hard_smp_processor_id(cpu); > > > > + unregister_cpu_online(cpu); > > + > > for (tries = 0; tries < 25; tries++) { > > cpu_status = query_cpu_stopped(pcpu); > > if (cpu_status == 0 || cpu_status == -1) > > @@ -919,6 +923,11 @@ > > while (!cpu_online(cpu)) > > cpu_relax(); > > > > +#ifdef CONFIG_HOTPLUG_CPU > > + if (system_state >= SYSTEM_RUNNING) /* This is a hotplug */ > > + register_cpu_online(cpu); > > +#endif /* CONFIG_HOTPLUG_CPU */ > > + > > return 0; > > } > > I apologize for not noticing this before :/ > > Instead of explicitly calling into the sysfs.c code from smp.c, I think > it would be better to use a cpu hotplug notifier block. Even without > CONFIG_HOTPLUG_CPU turned on, you get the "online" notification at boot > for each cpu as it is brought up. 
See the ppc64 numa code for an > example usage. Ah, good point. How about this, then. Note that we still need to register the stuff for each online cpu in topology_init(), since smp_init() is called before the notifier block is registered. ==== Currently the ppc64 sysfs code registers an entry for each possible cpu in sysfs, rather than just online cpus. That makes sense, since the sysfs entries are needed to control onlining of the cpus. However, this is done even if CONFIG_HOTPLUG_CPU is not set, or if it is not a hotplug capable (DLPAR) machine, which is a bit misleading. Secondly it also registers all the other sysfs entries (mostly performance monitoring controls) on all possible cpus, although they are quite meaningless on non-online cpus. This patch alters the code to only register sysfs directories at boot for cpus which are either online or could be onlined (cpu is possible, and CONFIG_HOTPLUG_CPU and an lpar machine). Furthermore, the entries apart from 'online' itself and 'physical_id' are only registered for online CPUs (and deregistered again if a cpu goes offline). 
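
To illustrate the ordering issue mentioned above (cpus brought up before the notifier is registered never generate a callback), here is a small userspace model. It is a sketch only: every name in it is a stand-in invented for illustration, the single callback slot replaces the real notifier chain, and none of it is kernel code.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4

typedef void (*model_cpu_cb)(int cpu);

static model_cpu_cb model_online_cb;		/* one-slot "notifier chain" */
static int model_cpu_online[MODEL_NR_CPUS];
static int model_attrs_registered[MODEL_NR_CPUS];

/* Stand-in for register_cpu_online() in the patch. */
static void model_register_attrs(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_attrs_registered[cpu] = 1;
}

/* Stand-in for bringing a cpu up: the callback only fires if
 * somebody has registered one by then. */
static void model_bring_up(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_cpu_online[cpu] = 1;
	if (model_online_cb)
		model_online_cb(cpu);
}

/* Stand-in for topology_init(): register the callback, then catch up
 * on every cpu that came online before the callback existed. */
static void model_topology_init(void)
{
	int cpu;

	model_online_cb = model_register_attrs;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_cpu_online[cpu])
			model_register_attrs(cpu);
}
```

Dropping the catch-up loop from model_topology_init() would leave the cpus that came up early without their entries, which is the case the explicit cpu_online() check in topology_init() covers.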
Index: working-2.6/arch/ppc64/kernel/sysfs.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/sysfs.c 2004-10-19 13:37:21.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/sysfs.c 2004-11-23 14:38:33.422111976 +1100 @@ -6,6 +6,8 @@ #include #include #include +#include +#include #include #include #include @@ -13,6 +15,8 @@ #include +static DEFINE_PER_CPU(struct cpu, cpu_devices); + /* SMT stuff */ #ifndef CONFIG_PPC_ISERIES @@ -259,8 +263,18 @@ static SYSDEV_ATTR(pmc8, 0600, show_pmc8, store_pmc8); static SYSDEV_ATTR(purr, 0600, show_purr, NULL); -static void __init register_cpu_pmc(struct sys_device *s) +static void register_cpu_online(unsigned int cpu) { + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_create_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + sysdev_create_file(s, &attr_mmcr0); sysdev_create_file(s, &attr_mmcr1); @@ -283,6 +297,65 @@ sysdev_create_file(s, &attr_purr); } +#ifdef CONFIG_HOTPLUG_CPU +static void unregister_cpu_online(unsigned int cpu) +{ + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + + BUG_ON(c->no_control); + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_remove_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + + sysdev_remove_file(s, &attr_mmcr0); + sysdev_remove_file(s, &attr_mmcr1); + + if (cur_cpu_spec->cpu_features & CPU_FTR_MMCRA) + sysdev_remove_file(s, &attr_mmcra); + + sysdev_remove_file(s, &attr_pmc1); + sysdev_remove_file(s, &attr_pmc2); + sysdev_remove_file(s, &attr_pmc3); + sysdev_remove_file(s, &attr_pmc4); + sysdev_remove_file(s, &attr_pmc5); + sysdev_remove_file(s, &attr_pmc6); + + if (cur_cpu_spec->cpu_features & CPU_FTR_PMC8) { + sysdev_remove_file(s, &attr_pmc7); + sysdev_remove_file(s, &attr_pmc8); + } + + if (cur_cpu_spec->cpu_features & 
CPU_FTR_SMT) + sysdev_remove_file(s, &attr_purr); +} +#endif /* CONFIG_HOTPLUG_CPU */ + +static int __devinit sysfs_cpu_notify(struct notifier_block *self, + unsigned long action, void *hcpu) +{ + unsigned int cpu = (unsigned int)(long)hcpu; + + switch (action) { + case CPU_ONLINE: + register_cpu_online(cpu); + break; +#ifdef CONFIG_HOTPLUG_CPU + case CPU_DEAD: + unregister_cpu_online(cpu); + break; +#endif + } + return NOTIFY_OK; +} + +static struct notifier_block __devinitdata sysfs_cpu_nb = { + .notifier_call = sysfs_cpu_notify, +}; /* NUMA stuff */ @@ -312,8 +385,7 @@ } #endif - -/* Only valid if CPU is online. */ +/* Only valid if CPU is present. */ static ssize_t show_physical_id(struct sys_device *dev, char *buf) { struct cpu *cpu = container_of(dev, struct cpu, sysdev); @@ -322,9 +394,6 @@ } static SYSDEV_ATTR(physical_id, 0444, show_physical_id, NULL); - -static DEFINE_PER_CPU(struct cpu, cpu_devices); - static int __init topology_init(void) { int cpu; @@ -332,6 +401,8 @@ register_nodes(); + register_cpu_notifier(&sysfs_cpu_nb); + for_each_cpu(cpu) { struct cpu *c = &per_cpu(cpu_devices, cpu); @@ -345,19 +416,19 @@ * CPU. For instance, the boot cpu might never be valid * for hotplugging. 
*/ +#ifdef CONFIG_HOTPLUG_CPU if (systemcfg->platform != PLATFORM_PSERIES_LPAR) +#endif c->no_control = 1; - register_cpu(c, cpu, parent); - - register_cpu_pmc(&c->sysdev); + if (cpu_online(cpu) || (c->no_control == 0)) { + register_cpu(c, cpu, parent); - sysdev_create_file(&c->sysdev, &attr_physical_id); + sysdev_create_file(&c->sysdev, &attr_physical_id); + } -#ifndef CONFIG_PPC_ISERIES - if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) - sysdev_create_file(&c->sysdev, &attr_smt_snooze_delay); -#endif + if (cpu_online(cpu)) + register_cpu_online(cpu); } return 0; Index: working-2.6/arch/ppc64/kernel/smp.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/smp.c 2004-10-19 13:37:56.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/smp.c 2004-11-23 14:05:14.039119728 +1100 @@ -82,6 +82,8 @@ void smp_call_function_interrupt(void); extern long register_vpa(unsigned long flags, unsigned long proc, unsigned long vpa); +extern void register_cpu_online(int cpu); +extern void unregister_cpu_online(int cpu); int smt_enabled_at_boot = 1; -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist. NOT _the_ _other_ _way_ | _around_! http://www.ozlabs.org/people/dgibson From miltonm at bga.com Tue Nov 23 15:34:47 2004 From: miltonm at bga.com (Milton Miller) Date: Mon, 22 Nov 2004 22:34:47 -0600 Subject: [PATCH] PPC64: EEH Recovery Message-ID: <015EEEE2-3D09-11D9-8147-003065DC03B0@bga.com> On Tue Nov 23 06:50:44 EST 2004, Linas Vepstas wrote: > On Sat, Nov 20, 2004 at 05:11:53PM -0600, Milton D. Miller II was > heard to remark: > ? Don't understand the question. PCI devices are arranged in a tree. > One of the cards I test with has a bridge and several devices under it. > So one has to walk the whole tree, which might be arbitrarily deep, to > get to all of the devices. Just because the devices are in a tree doesn't mean you need to store that structure. 
If you only walk the tree in one path then you can walk the tree as you
save the state and just store a linked list (such as linux/list.h).
(The power management code uses this trick.)

> > 2) I object to grabbing pci devices so they don't disappear and
> > reappear.
> > I worry about duplicate devices across register/unregister and
> > sysfs kobject lifetimes getting confused and duplicate names.
>
> I'll double-check, I was under the impression that the unregister
> happened, and I was just avoiding the final free(). Maybe I'm sorely
> mistaken.

You seem to have a strong desire to save a struct pci_dev and then later
reinsert that same device back in the tree. You indicated that the pci
layer actually is unregistering and re-registering the device. Since
you are not using the pci side of the device to restore the registers,
it seems like a lot of work that is not needed. Just write the BARs
back, then start a new pci device with a new lifetime like a hotplug
slot driver would.

Re information to save: you have identified the BARs. This information
should be in the device tree as the assigned-addresses property. You
probably should also save the pci latency timer and a few other
registers.

> > > -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot);
> >
> > not analyzed ...
>
> ??

You are removing a symbol export. I haven't looked for other users of
this. If it has no current users it could be a separate patch.

milton

PS: I'll be on vacation for a week, so I might not respond to further
mail for a bit.

From tiwari.amit at gmail.com Tue Nov 23 17:29:31 2004
From: tiwari.amit at gmail.com (Amit K Tiwari)
Date: Tue, 23 Nov 2004 11:59:31 +0530
Subject: Y-HPC Source Code
Message-ID:

Hi,

Is the source code for Y-HPC (Yellow Dog Linux version 3.0.1 from
Terrasoft) available? Where can I get it?
Amit K T From segher at kernel.crashing.org Tue Nov 23 16:55:35 2004 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Tue, 23 Nov 2004 06:55:35 +0100 Subject: should cpus_in_xmon be volatile? In-Reply-To: <20041119202901.GB23780@austin.ibm.com> References: <1100818353.24982.348.camel@localhost> <20041119202901.GB23780@austin.ibm.com> Message-ID: <4B69F2F8-3D14-11D9-8998-000A95A4DC02@kernel.crashing.org> > Well, to make >>that particular<< loop work correctly, the volatile is > not > needed. Why? Because cpus_weight() is extern __bitmap_weight() and > since > its extern, the compiler must be definition invoke it each time in the > loop, since the compiler must assume that the called routine is > changing > the value of the thing being pointed at. i.e. the call has a > side-effect. That's not correct. External linkage is an abstract concept, and by no means prevents the compiler from optimising across the boundaries of a translation unit (e.g., when performing whole-program optimisation). Of course, that's not the current (default) behaviour of GCC, but that doesn't make it a correct C program. > However, if someone changed the extern __bitmap_weight() to be > inline __bitmap_weight(), then the compiler could potentially see that > it had no side effects, and decide to optimize away the entire loop. It can potentially do that anyway. Nothing in the C standard prevents it from doing that. Segher From l_indien at magic.fr Tue Nov 23 23:20:02 2004 From: l_indien at magic.fr (J. 
Mayer) Date: Tue, 23 Nov 2004 13:20:02 +0100 Subject: Booting Imac G5 In-Reply-To: <1101122178.13612.106.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> <1101122178.13612.106.camel@gaston> Message-ID: <1101212402.31127.1034.camel@rapid> On Mon, 2004-11-22 at 12:16, Benjamin Herrenschmidt wrote: > > I tried to do mb() at the start of do_cmd() and after the cmd_done call. > > Then, when reading the RTC, I read all zeroes but the first byte which > > is 0x81. The cmd byte isn't modified, when I should read the command ack > > which is the complement of the requested command (even if I don't check > > it, for now). > > I tried to add a call to flush_dcache_phys_range, then I got the same > > result. > > But using flush_inval_dcache_phys_range (yes the name is quite > > strange....), with or without mb(), then it works. > > Ok, that seem to indicate that indeed, that crap isn't cache > coherent ... I wonder how they actually access the memory from the > SMU... maybe i2c commands to U3... They may use the SPU interface to access the RAM. 
But, in fact, looking at the OF SMU driver, I should have been warned about the cache issue: this routine is quite clear: : clr-cache-lines cmd-buf ^dcbf ^sync ; It's called just before ringing the doorbell... > > I'm not sure it'll eat all PMU commands, but it sure can treat some... > > Probably not all, I'm not even sure Apple themselves still knows what > all PMU commands do :) The PMU is a mess that evolved from the original > one that was in the very first Mac Portable ! ;-) > > All right ! > > But I think I reached my first goal: I got a usable and stable machine > > with all basic features available. > > Yes, that's really nice. Is rivafb working too ? I've seen somebody > submitting a patch that recently got into Linus bk fixing some issues > with the 5200 rivafb works well, the only problem is to make it use the right video mode: it does not use the right virtual xres. I didn't spent time to try to fix this issue, but I guess it can be easily fixed. -- J. Mayer Never organized From segher at kernel.crashing.org Tue Nov 23 23:35:26 2004 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Tue, 23 Nov 2004 13:35:26 +0100 Subject: Booting Imac G5 In-Reply-To: <1101212402.31127.1034.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> 
<1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> <1101122178.13612.106.camel@gaston> <1101212402.31127.1034.camel@rapid> Message-ID: <26D0E0B9-3D4C-11D9-8998-000A95A4DC02@kernel.crashing.org> > They may use the SPU interface to access the RAM. But, in fact, > looking > at the OF SMU driver, I should have been warned about the cache issue: > this routine is quite clear: > : clr-cache-lines > cmd-buf ^dcbf ^sync ; Well yes, but Apple OF tends to use very heavy hammers for no reason at all, all of the time ;-) I'd really like to _understand_ the issue -- although that's not necessary at all for programming the driver, of course, monkey-see, monkey-do worked fine for all the other Mac drivers ;-P Segher From l_indien at magic.fr Wed Nov 24 00:28:41 2004 From: l_indien at magic.fr (J. Mayer) Date: Tue, 23 Nov 2004 14:28:41 +0100 Subject: Booting Imac G5 In-Reply-To: <26D0E0B9-3D4C-11D9-8998-000A95A4DC02@kernel.crashing.org> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid>
<1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> <1101122178.13612.106.camel@gaston> <1101212402.31127.1034.camel@rapid> <26D0E0B9-3D4C-11D9-8998-000A95A4DC02@kernel.crashing.org> Message-ID: <1101216521.31127.1047.camel@rapid> On Tue, 2004-11-23 at 13:35, Segher Boessenkool wrote: > > They may use the SPU interface to access the RAM. But, in fact, > > looking > > at the OF SMU driver, I should have been warned about the cache issue: > > this routine is quite clear: > > : clr-cache-lines > > cmd-buf ^dcbf ^sync ; > > Well yes, but Apple OF tends to use very heavy hammers for no > reason at all, all of the time ;-) I've seen that Darwin does not seem to be the best example for an optimisation lesson ;-) > I'd really like to _understand_ the issue -- although that's > not necessary at all for programming the driver, of course, > monkey-see, monkey-do worked fine for all the other Mac > drivers ;-P I'd like to understand what the hardware does too ! It seems logical to me that it would use the SPU interface to access the memory. If the SPU interface is really like a JTAG interface as it seems to be, then it can entirely control the CPU, so it could also invalidate the caches when needed. Only getting (and understanding !) the SPU firmware may tell us what it really does.... -- J. Mayer Never organized From linas at austin.ibm.com Wed Nov 24 05:22:51 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 23 Nov 2004 12:22:51 -0600 Subject: should cpus_in_xmon be volatile? In-Reply-To: <4B69F2F8-3D14-11D9-8998-000A95A4DC02@kernel.crashing.org> References: <1100818353.24982.348.camel@localhost> <20041119202901.GB23780@austin.ibm.com> <4B69F2F8-3D14-11D9-8998-000A95A4DC02@kernel.crashing.org> Message-ID: <20041123182251.GI23780@austin.ibm.com> On Tue, Nov 23, 2004 at 06:55:35AM +0100, Segher Boessenkool was heard to remark: > >Well, to make >>that particular<< loop work correctly, the volatile is > >not > >needed. Why?
Because cpus_weight() is extern __bitmap_weight() and > >since > >its extern, the compiler must be definition invoke it each time in the > >loop, since the compiler must assume that the called routine is > >changing > >the value of the thing being pointed at. i.e. the call has a > >side-effect. > > That's not correct. External linkage is an abstract concept, and by no > means prevents the compiler from optimising across the boundaries of a > translation unit (e.g., when performing whole-program optimisation). > > Of course, that's not the current (default) behaviour of GCC, but that > doesn't make it a correct C program. > > >However, if someone changed the extern __bitmap_weight() to be > >inline __bitmap_weight(), then the compiler could potentially see that > >it had no side effects, and decide to optimize away the entire loop. > > It can potentially do that anyway. Nothing in the C standard prevents > it from doing that. OK, Here in the US, the holidays are close, so what the heck, another long academic reply follows. Yes, of course. If there is a way for a compiler to determine that a particular call has no side-effects, then it is quite valid for the optimization to be performed. That's the point I was trying to make. Since, as far as I know, there aren't any compilers that actually *do* whole-program optimisation, the distinction is academic. I guess that come the day that gcc does whole-program optimization, then the only safe thing to do is to assume that all global vars are volatile, and that therefore any subroutine acting on a global does have a side-effect. However, this potentially plays havoc with function signatures: if globals are implicitly volatile, then what to do about routines that take globals as arguments? If the argument to a subroutine is declared volatile, then the compiler is prevented from doing certain types of optimizations within that subroutine (e.g. if the argument is used in a loop). 
So this kind of naive whole-program optimization would lead to a massive performance degradation. The alternative is to explicitly declare globals to be volatile, and then cast-away volatileness for those subroutines where we know that it's not important. The third alternative was Paul's patch: pepper the code with memory barriers wherever they may seem needed. This way, a whole-program optimizer can assume that globals aren't volatile, and instead we depend on manually-placed barriers for program correctness. Which is in the spirit of how things are done today. --linas From linas at austin.ibm.com Wed Nov 24 05:59:16 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 23 Nov 2004 12:59:16 -0600 Subject: [PATCH] PPC64: EEH Recovery In-Reply-To: <015EEEE2-3D09-11D9-8147-003065DC03B0@bga.com> References: <015EEEE2-3D09-11D9-8147-003065DC03B0@bga.com> Message-ID: <20041123185916.GJ23780@austin.ibm.com> On Mon, Nov 22, 2004 at 10:34:47PM -0600, Milton Miller was heard to remark: > On Tue Nov 23 06:50:44 EST 2004, Linas Vepstas wrote: > >On Sat, Nov 20, 2004 at 05:11:53PM -0600, Milton D. Miller II was > >heard to remark: > > >? Don't understand the question. PCI devices are arranged in a tree. > >One of the cards I test with has a bridge and several devices under it. > >So one has to walk the whole tree, which might be arbitrarily deep, to > >get to all of the devices. > > Just because the devices are in a tree doesn't mean you need to store > that structure. If you only walk the tree in one path then you can > walk the tree as you save the state and just store a linked list (such > as linux/list.h). (The power management code uses this trick). Sure, yes. Trees can always be flattened to lists. But for this particular case, this seems to make things more complex, not simpler. The current code would be smaller, faster & easier to debug than having to convert it to use a linked list.
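As an aside, here is a minimal userspace sketch of the flattening idea discussed above: walk a child/sibling tree once, depth-first, and chain one saved entry per node into a singly linked list that is later replayed front to back. The names struct node, struct saved, save_walk, restore_all and demo_flatten are hypothetical stand-ins for device nodes and saved config space; this is not the EEH code and does not use the kernel's linux/list.h API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device-tree node (child/sibling links). */
struct node {
	int id;
	struct node *child;
	struct node *sibling;
};

/* One saved entry; entries are chained in the order they were visited. */
struct saved {
	int id;
	struct saved *next;
};

/*
 * Walk the tree depth-first and append one entry per node.
 * 'tail' points at the link to fill in next; the updated tail is
 * returned so the caller keeps appending. Returns NULL if malloc fails.
 */
struct saved **save_walk(const struct node *n, struct saved **tail)
{
	for (; n != NULL; n = n->sibling) {
		struct saved *s = malloc(sizeof(*s));
		if (s == NULL)
			return NULL;
		s->id = n->id;
		s->next = NULL;
		*tail = s;
		tail = &s->next;
		/* Children are appended before the next sibling. */
		tail = save_walk(n->child, tail);
		if (tail == NULL)
			return NULL;
	}
	return tail;
}

/* Replay the list front to back, freeing entries; returns the count. */
int restore_all(struct saved *head)
{
	int count = 0;
	while (head != NULL) {
		struct saved *next = head->next;
		/* A real restore would write the saved registers back here. */
		free(head);
		head = next;
		count++;
	}
	return count;
}

/* A bridge with two devices under it, plus one sibling of the bridge. */
int demo_flatten(void)
{
	struct node dev_b = { 3, NULL, NULL };
	struct node dev_a = { 2, NULL, &dev_b };
	struct node other = { 4, NULL, NULL };
	struct node bridge = { 1, &dev_a, &other };
	struct saved *head = NULL;

	if (save_walk(&bridge, &head) == NULL) {
		restore_all(head);	/* free whatever was saved so far */
		return -1;
	}
	/* Visit order is bridge, its children, then the bridge's sibling. */
	assert(head->id == 1 && head->next->id == 2);
	assert(head->next->next->id == 3 && head->next->next->next->id == 4);
	return restore_all(head);
}
```

Calling demo_flatten() from a main() exercises both passes. Whether this is simpler than keeping a mirrored tree depends on whether the restore pass needs the parent/child relationship, which is the point being argued here.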
> >> 2) I object to grabbing pci devices so they don't disappear and > >reappear. > >> I worry about duplicate devices across register/unregister and > >sysfs > >> kobject lifetimes getting confused and duplicate names. > > > >I'll double-check, I was under the impression that the unregister > >happened, and I was just avoiding the final free(). Maybe I'm sorely > >mista > You seem to have a strong desire to save a struct pci_dev and then > later reinsert that same device back in the tree. No not at all. As I said, all I really wanted to do was to save the bars. I actually thought about saving the bars in the device_node, but I then anticipated that some wise-guy on this mailing list would suggest that I save the bars in the pci_dev instead. So I saved the bars in pci_dev instead. I did *not* anticipate your comments. I could save the bars in the device_node, I don't really care. It's the same amount of work either way. I guess that there is an intellectual challenge to understanding the finer points of kobjects which means that avoiding the use of kobjects and using the device_node instead is indeed a safer thing to do. So, OK, you're right the BAR's should be saved in the device node. > registers, it seems like a lot of work that is not needed. Just write It's the same amount of cpu cycles, same amount of lines of code. As to work, it's less work to not change anything :) I'll convert the code to use device node instead. > Re information to save: you have identified the bars. This information > should be in the device tree as the assigned-addresses property. You I looked at that once, and decided I didn't like it; I vaguely remember it wasn't exactly 1-1 with the bars, so I'd have to do more math to restore from there. I'd rather do a block copy. > probably should also save the pci latency timer and a few other > registers. Yeah I wondered about this, but maybe didn't wonder enough.
I was vaguely thinking that the device driver would set these up, but in fact, this is probably a bad assumption. I'll crawl through the PCI specs some more... > >> > -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); > >> > >> not analyzed ... > > > >?? > You are removing a symbol export. I haven't looked for other users of > this. If it has no current users it could be a separate patch. That subroutine no longer exists! (there are no other users; I added this routine for EEH a long time ago, and was the only user. It's been replaced by something better.) --linas From l_indien at magic.fr Wed Nov 24 22:58:44 2004 From: l_indien at magic.fr (J. Mayer) Date: Wed, 24 Nov 2004 12:58:44 +0100 Subject: New patch for Imac G5 In-Reply-To: <1101122178.13612.106.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> <1101122178.13612.106.camel@gaston> Message-ID: <1101297524.31127.1159.camel@rapid> Hi, here's the new version of the complete patch I use for the Imac G5. It includes Benjamin's fixes for Shasta SATA initialisation.
Here's the current status of the port:
- Pmac PCI IRQ fix: needed for IDE & SATA on Imac G5
- Ethernet PHY failure if the cable is plugged during (re)boot
- RTC works.
- SMU management: support for reboot and shutdown
- Bad detection of frame-buffer virtual res in riva-fb: should use xres=1440 yres=900 & virtual_xres=1536
- Have to unplug/replug the USB keyboard after kernel boot to make it work with kernel 2.6.10-rc1. No such problem with 2.6.9. Fixed with kernel 2.6.10-rc2
- Lots of segfaults occurring when multiple concurrent processes are running. Fixed with kernel 2.6.10-rc2
Regards. -- J. Mayer Never organized -------------- next part -------------- A non-text attachment was scrubbed... Name: linux-2.6.10-rc2.diff.gz Type: application/x-gzip Size: 10212 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041124/5aaa1bc5/attachment.bin From linas at austin.ibm.com Thu Nov 25 08:26:33 2004 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 24 Nov 2004 15:26:33 -0600 Subject: [PATCH] PPC64: EEH Recovery (Revised) In-Reply-To: <20041123185916.GJ23780@austin.ibm.com> References: <015EEEE2-3D09-11D9-8147-003065DC03B0@bga.com> <20041123185916.GJ23780@austin.ibm.com> Message-ID: <20041124212633.GK23780@austin.ibm.com> Hi, The patch below implements hotplug style EEH error recovery. It's split into two pieces: a part that needs to be applied to the PPC64 arch tree, and a part that needs to be applied to the RPA PHP hotplug tree. The PPC64 part needs to go in first. This is a revised patch that incorporates changes suggested by Milton Miller, in particular, the saving of BARs to the device node.
Signed-off-by: Linas Vepstas --linas -------------- next part -------------- ===== arch/ppc64/kernel/eeh.c 1.40 vs edited ===== --- 1.40/arch/ppc64/kernel/eeh.c 2004-10-25 14:47:50 -05:00 +++ edited/arch/ppc64/kernel/eeh.c 2004-11-24 15:11:53 -06:00 @@ -17,21 +17,19 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ -#include +#include #include #include -#include #include #include #include #include #include -#include +#include #include #include #include #include -#include #include "pci.h" #undef DEBUG @@ -89,7 +87,6 @@ static struct notifier_block *eeh_notifi * attempts we allow before panicking. */ #define EEH_MAX_FAILS 1000 -static atomic_t eeh_fail_count; /* RTAS tokens */ static int ibm_set_eeh_option; @@ -223,9 +220,9 @@ pci_addr_cache_insert(struct pci_dev *de while (*p) { parent = *p; piar = rb_entry(parent, struct pci_io_addr_range, rb_node); - if (alo < piar->addr_lo) { + if (ahi < piar->addr_lo) { p = &parent->rb_left; - } else if (ahi > piar->addr_hi) { + } else if (alo > piar->addr_hi) { p = &parent->rb_right; } else { if (dev != piar->pcidev || @@ -243,6 +240,11 @@ pci_addr_cache_insert(struct pci_dev *de piar->addr_hi = ahi; piar->pcidev = dev; piar->flags = flags; + +#ifdef DEBUG + printk (KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n", + alo, ahi, pci_name (dev)); +#endif rb_link_node(&piar->rb_node, parent, p); rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root); @@ -367,6 +369,7 @@ void pci_addr_cache_remove_device(struct */ void __init pci_addr_cache_build(void) { + struct device_node *dn; struct pci_dev *dev = NULL; spin_lock_init(&pci_io_addr_cache_root.piar_lock); @@ -377,6 +380,14 @@ void __init pci_addr_cache_build(void) continue; } pci_addr_cache_insert_device(dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + if (dn) { + int i; + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); + } } #ifdef DEBUG 
@@ -388,6 +399,32 @@ void __init pci_addr_cache_build(void) /* --------------------------------------------------------------- */ /* Above lies the PCI Address Cache. Below lies the EEH event infrastructure */ +void eeh_slot_error_detail (struct device_node *dn, int severity) +{ + unsigned long flags; + int rc; + + if (!dn) return; + + /* Log the error with the rtas logger */ + spin_lock_irqsave(&slot_errbuf_lock, flags); + memset(slot_errbuf, 0, eeh_error_buf_size); + + rc = rtas_call(ibm_slot_error_detail, + 8, 1, NULL, dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), NULL, 0, + virt_to_phys(slot_errbuf), + eeh_error_buf_size, + severity); + + if (rc == 0) + log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); + spin_unlock_irqrestore(&slot_errbuf_lock, flags); +} + +EXPORT_SYMBOL(eeh_slot_error_detail); + /** * eeh_register_notifier - Register to find out about EEH events. * @nb: notifier block to callback on events @@ -462,11 +499,9 @@ static void eeh_event_handler(void *dumm "%s %s\n", event->reset_state, pci_name(event->dev), pci_pretty_name(event->dev)); - atomic_set(&eeh_fail_count, 0); - notifier_call_chain (&eeh_notifier_chain, - EEH_NOTIFY_FREEZE, event); - __get_cpu_var(slot_resets)++; + notifier_call_chain (&eeh_notifier_chain, + EEH_NOTIFY_FREEZE, event); pci_dev_put(event->dev); kfree(event); @@ -490,6 +525,17 @@ static inline unsigned long eeh_token_to return pa | (token & (PAGE_SIZE-1)); } +static inline struct pci_dev * eeh_get_pci_dev(struct device_node *dn) +{ + struct pci_dev *dev = NULL; + + for_each_pci_dev(dev) { + if (pci_device_to_OF_node(dev) == dn) + return dev; + } + return NULL; +} + /** * eeh_dn_check_failure - check if all 1's data is due to EEH slot freeze * @dn device node @@ -510,7 +556,7 @@ int eeh_dn_check_failure(struct device_n int ret; int rets[2]; unsigned long flags; - int rc, reset_state; + int reset_state; struct eeh_event *event; __get_cpu_var(total_mmio_ffs)++; @@ -527,17 +573,17 @@ int 
eeh_dn_check_failure(struct device_n return 0; } - if (!dn->eeh_config_addr) { + if (!dn->eeh_config_addr) return 0; - } /* * If we already have a pending isolation event for this * slot, we know it's bad already, we don't need to check... */ if (dn->eeh_mode & EEH_MODE_ISOLATED) { - atomic_inc(&eeh_fail_count); - if (atomic_read(&eeh_fail_count) >= EEH_MAX_FAILS) { + dn->eeh_freeze_count ++; + if (dn->eeh_freeze_count >= EEH_MAX_FAILS) { + dump_stack(); /* re-read the slot reset state */ rets[0] = -1; rtas_call(ibm_read_slot_reset_state, 3, 3, rets, @@ -565,34 +611,25 @@ int eeh_dn_check_failure(struct device_n return 0; } - /* prevent repeated reports of this failure */ + /* Prevent repeated reports of this failure */ dn->eeh_mode |= EEH_MODE_ISOLATED; reset_state = rets[0]; + /* Log the error with the rtas logger */ + if (dn->eeh_freeze_count < EEH_MAX_ALLOWED_FREEZES) { + eeh_slot_error_detail (dn, 1 /* Temporary Error */); + } else { + eeh_slot_error_detail (dn, 2 /* Permanent Error */); + } - spin_lock_irqsave(&slot_errbuf_lock, flags); - memset(slot_errbuf, 0, eeh_error_buf_size); - - rc = rtas_call(ibm_slot_error_detail, - 8, 1, NULL, dn->eeh_config_addr, - BUID_HI(dn->phb->buid), - BUID_LO(dn->phb->buid), NULL, 0, - virt_to_phys(slot_errbuf), - eeh_error_buf_size, - 1 /* Temporary Error */); - - if (rc == 0) - log_error(slot_errbuf, ERR_TYPE_RTAS_LOG, 0); - spin_unlock_irqrestore(&slot_errbuf_lock, flags); - - printk(KERN_INFO "EEH: MMIO failure (%d) on device: %s %s\n", - rets[0], dn->name, dn->full_name); event = kmalloc(sizeof(*event), GFP_ATOMIC); if (event == NULL) { - eeh_panic(dev, reset_state); + printk (KERN_ERR "EEH: out of memory, event not handled\n"); return 1; } + if (!dev) + dev = eeh_get_pci_dev (dn); event->dev = dev; event->dn = dn; event->reset_state = reset_state; @@ -618,7 +655,6 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * @token i/o token, should be address in the form 0xA.... 
* @val value, should be all 1's (XXX why do we need this arg??) * - * Check for an eeh failure at the given token address. * Check for an EEH failure at the given token address. Call this * routine if the result of a read was all 0xff's and you want to * find out if this is due to an EEH slot freeze event. This routine @@ -626,6 +662,7 @@ EXPORT_SYMBOL(eeh_dn_check_failure); * * Note this routine is safe to call in an interrupt context. */ + unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val) { unsigned long addr; @@ -647,6 +684,172 @@ unsigned long eeh_check_failure(const vo EXPORT_SYMBOL(eeh_check_failure); +/* ------------------------------------------------------------- */ +/* The code below deals with error recovery */ + +void +rtas_set_slot_reset(struct device_node *dn) +{ + int token = rtas_token ("ibm,set-slot-reset"); + int rc; + + if (token == RTAS_UNKNOWN_SERVICE) + return; + rc = rtas_call(token,4,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), + 1); + if (rc) { + printk (KERN_WARNING "EEH: Unable to reset the failed slot\n"); + return; + } + + /* The PCI bus requires that the reset be held high for at least + * a 100 milliseconds. We wait a bit longer 'just in case'. 
+ */ + msleep (200); + + rc = rtas_call(token,4,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid), + 0); +} + +EXPORT_SYMBOL(rtas_set_slot_reset); + +void +rtas_configure_bridge(struct device_node *dn) +{ + int token = rtas_token ("ibm,configure-bridge"); + int rc; + + if (token == RTAS_UNKNOWN_SERVICE) + return; + rc = rtas_call(token,3,1, NULL, + dn->eeh_config_addr, + BUID_HI(dn->phb->buid), + BUID_LO(dn->phb->buid)); + if (rc) { + printk (KERN_WARNING "EEH: Unable to configure device bridge\n"); + } +} + +EXPORT_SYMBOL(rtas_configure_bridge); + +/* ------------------------------------------------------- */ +/** Save and restore of PCI BARs + * + * Although firmware will set up BARs during boot, it doesn't + * set up device BAR's after a device reset, although it will, + * if requested, set up bridge configuration. Thus, we need to + * configure the PCI devices ourselves. Config-space setup is + * stored in the PCI structures which are normally deleted during + * device removal. Thus, the "save" routine references the + * structures so that they aren't deleted. 
+ */ + + +struct eeh_cfg_tree +{ + struct eeh_cfg_tree *sibling; + struct eeh_cfg_tree *child; + struct device_node *dn; + int is_bridge; +}; + +/** + * eeh_save_bars - save the PCI config space info + */ +struct eeh_cfg_tree * eeh_save_bars(struct device_node *dn) +{ + struct pci_dev *dev; + struct eeh_cfg_tree *cnode; + + dev = eeh_get_pci_dev(dn); + if (!dev) + return NULL; + + cnode = kmalloc(sizeof(struct eeh_cfg_tree), GFP_KERNEL); + if (!cnode) + return NULL; + + cnode->is_bridge = 0; + + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + cnode->is_bridge = 1; + + of_node_get(dn); + cnode->dn = dn; + + cnode->sibling = NULL; + cnode->child = NULL; + + if (dn->child) { + cnode->child = eeh_save_bars (dn->child); + } + if (dn->sibling) { + cnode->sibling = eeh_save_bars (dn->sibling); + } + + return cnode; +} +EXPORT_SYMBOL(eeh_save_bars); + +/** + * __restore_bars - Restore the Base Address Registers + * Loads the PCI configuration space base address registers, + * the expansion ROM base address, the latency timer, and etc. + * from the saved values in the device node. 
+ */ +static inline void __restore_bars (struct device_node *dn) +{ + int i; + for (i=4; i<10; i++) { + rtas_write_config(dn, i*4, 4, dn->config_space[i]); + } + + /* 12 == Expansion ROM Address */ + rtas_write_config(dn, 12*4, 4, dn->config_space[12]); + +#define SAVED_BYTE(OFF) (((u8 *)(dn->config_space))[OFF]) + + rtas_write_config (dn, PCI_CACHE_LINE_SIZE, 1, + SAVED_BYTE(PCI_CACHE_LINE_SIZE)); + + rtas_write_config (dn, PCI_LATENCY_TIMER, 1, + SAVED_BYTE(PCI_LATENCY_TIMER)); + + rtas_write_config (dn, PCI_INTERRUPT_LINE, 1, + SAVED_BYTE(PCI_INTERRUPT_LINE)); +} + +/** + * eeh_restore_bars - restore the PCI config space info + */ +void eeh_restore_bars(struct eeh_cfg_tree *tree) +{ + if (!(tree->is_bridge)) + __restore_bars (tree->dn); + + if (tree->child) + eeh_restore_bars (tree->child); + + if (tree->sibling) + eeh_restore_bars (tree->sibling); + + of_node_put (tree->dn); + kfree (tree); +} +EXPORT_SYMBOL(eeh_restore_bars); + +/* ------------------------------------------------------------- */ +/* The code below deals with enabling EEH for devices during the + * early boot sequence. EEH must be enabled before any PCI probing + * can be done. 
+ */ + struct eeh_early_enable_info { unsigned int buid_hi; unsigned int buid_lo; @@ -831,6 +1034,9 @@ EXPORT_SYMBOL(eeh_add_device_early); */ void eeh_add_device_late(struct pci_dev *dev) { + int i; + struct device_node *dn; + if (!dev || !eeh_subsystem_enabled) return; @@ -840,6 +1046,11 @@ void eeh_add_device_late(struct pci_dev #endif pci_addr_cache_insert_device (dev); + + /* Save the BAR's; firmware doesn't restore these after EEH reset */ + dn = pci_device_to_OF_node(dev); + for (i = 0; i < 16; i++) + pci_read_config_dword(dev, i * 4, &dn->config_space[i]); } EXPORT_SYMBOL(eeh_add_device_late); @@ -885,10 +1096,8 @@ static int proc_eeh_show(struct seq_file seq_printf(m, "eeh_total_mmio_ffs=%ld\n" "eeh_false_positives=%ld\n" "eeh_ignored_failures=%ld\n" - "eeh_slot_resets=%ld\n" - "eeh_fail_count=%d\n", - ffs, positives, failures, resets, - eeh_fail_count.counter); + "eeh_slot_resets=%ld\n", + ffs, positives, failures, resets); } return 0; ===== arch/ppc64/kernel/pSeries_pci.c 1.59 vs edited ===== --- 1.59/arch/ppc64/kernel/pSeries_pci.c 2004-11-15 21:29:10 -06:00 +++ edited/arch/ppc64/kernel/pSeries_pci.c 2004-11-22 13:33:35 -06:00 @@ -102,7 +102,7 @@ static int rtas_pci_read_config(struct p return PCIBIOS_DEVICE_NOT_FOUND; } -static int rtas_write_config(struct device_node *dn, int where, int size, u32 val) +int rtas_write_config(struct device_node *dn, int where, int size, u32 val) { unsigned long buid, addr; int ret; ===== include/asm-ppc64/eeh.h 1.23 vs edited ===== --- 1.23/include/asm-ppc64/eeh.h 2004-10-25 18:17:38 -05:00 +++ edited/include/asm-ppc64/eeh.h 2004-11-17 16:10:58 -06:00 @@ -22,8 +22,8 @@ #include #include -#include #include +#include struct pci_dev; struct device_node; @@ -33,6 +33,10 @@ struct device_node; #define EEH_MODE_NOCHECK (1<<1) #define EEH_MODE_ISOLATED (1<<2) +/* Max number of EEH freezes allowed before we consider the device + * to be permanently disabled. 
*/ +#define EEH_MAX_ALLOWED_FREEZES 5 + #ifdef CONFIG_PPC_PSERIES extern void __init eeh_init(void); unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned long val); @@ -57,6 +61,34 @@ void eeh_add_device_early(struct device_ void eeh_add_device_late(struct pci_dev *); /** + * eeh_slot_error_detail -- record and EEH error condition to the log + * @severity: 1 if temporary, 2 if permanent failure. + * + * Obtains the the EEH error details from the RTAS subsystem, + * and then logs these details with the RTAS error log system. + */ +void eeh_slot_error_detail (struct device_node *dn, int severity); + +/** + * rtas_set_slot_reset -- unfreeze a frozen slot + * + * Clear the EEH-frozen condition on a slot. This routine + * does this by asserting the PCI #RST line for 1/8th of + * a second; this routine will sleep while the adapter is + * being reset. + */ +void rtas_set_slot_reset (struct device_node *dn); + +/** + * rtas_configure_bridge -- firmware initialization of pci bridge + * + * Ask the firmware to configure any PCI bridge devices + * located behind the indicated node. Required after a + * pci device reset. + */ +void rtas_configure_bridge(struct device_node *dn); + +/** * eeh_remove_device - undo EEH setup for the indicated pci device * @dev: pci device to be removed * @@ -91,6 +123,13 @@ struct eeh_event { /** Register to find out about EEH events. */ int eeh_register_notifier(struct notifier_block *nb); int eeh_unregister_notifier(struct notifier_block *nb); + +/** Save and restore device configuration info across + * device resets + */ +struct eeh_cfg_tree; +struct eeh_cfg_tree * eeh_save_bars(struct device_node *dn); +void eeh_restore_bars(struct eeh_cfg_tree *tree); /** * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure. 
===== include/asm-ppc64/prom.h 1.23 vs edited ===== --- 1.23/include/asm-ppc64/prom.h 2004-10-24 20:55:43 -05:00 +++ edited/include/asm-ppc64/prom.h 2004-11-24 11:47:24 -06:00 @@ -162,8 +162,10 @@ struct device_node { int status; /* Current device status (non-zero is bad) */ int eeh_mode; /* See eeh.h for possible EEH_MODEs */ int eeh_config_addr; + int eeh_freeze_count; /* number of times this device froze up. */ struct pci_controller *phb; /* for pci devices */ struct iommu_table *iommu_table; /* for phb's or bridges */ + u32 config_space[16]; /* saved PCI config space */ struct property *properties; struct device_node *parent; ===== include/asm-ppc64/rtas.h 1.24 vs edited ===== --- 1.24/include/asm-ppc64/rtas.h 2004-09-22 00:42:53 -05:00 +++ edited/include/asm-ppc64/rtas.h 2004-11-17 16:00:37 -06:00 @@ -241,4 +241,6 @@ extern void rtas_stop_self(void); /* RMO buffer reserved for user-space RTAS use */ extern unsigned long rtas_rmo_buf; +extern int rtas_write_config(struct device_node *dn, int where, int size, u32 val); + #endif /* _PPC64_RTAS_H */ -------------- next part -------------- ===== drivers/pci/hotplug/rpaphp.h 1.11 vs edited ===== --- 1.11/drivers/pci/hotplug/rpaphp.h 2004-10-06 11:43:44 -05:00 +++ edited/drivers/pci/hotplug/rpaphp.h 2004-11-17 16:00:37 -06:00 @@ -126,6 +126,8 @@ extern int register_pci_slot(struct slot extern int rpaphp_unconfig_pci_adapter(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); extern struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev); +extern void init_eeh_handler (void); +extern void exit_eeh_handler (void); /* rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); ===== drivers/pci/hotplug/rpaphp_core.c 1.18 vs edited ===== --- 1.18/drivers/pci/hotplug/rpaphp_core.c 2004-10-06 11:43:44 -05:00 +++ edited/drivers/pci/hotplug/rpaphp_core.c 2004-11-17 16:00:37 -06:00 @@ -443,12 +443,18 @@ static int __init rpaphp_init(void) { 
info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); + /* Get set to handle EEH events. */ + init_eeh_handler(); + /* read all the PRA info from the system */ return init_rpa(); } static void __exit rpaphp_exit(void) { + /* Let EEH know we are going away. */ + exit_eeh_handler(); + cleanup_slots(); } ===== drivers/pci/hotplug/rpaphp_pci.c 1.16 vs edited ===== --- 1.16/drivers/pci/hotplug/rpaphp_pci.c 2004-10-19 11:54:38 -05:00 +++ edited/drivers/pci/hotplug/rpaphp_pci.c 2004-11-24 15:08:41 -06:00 @@ -22,8 +22,12 @@ * Send feedback to * */ +#include +#include #include +#include #include +#include #include #include "../pci.h" /* for pci_add_new_bus */ @@ -63,6 +67,7 @@ int rpaphp_claim_resource(struct pci_dev root ? "Address space collision on" : "No parent found for", resource, dtype, pci_name(dev), res->start, res->end); + dump_stack(); } return err; } @@ -185,6 +190,19 @@ rpaphp_fixup_new_pci_devices(struct pci_ static int rpaphp_pci_config_bridge(struct pci_dev *dev); +static void rpaphp_eeh_add_bus_device(struct pci_bus *bus) +{ + struct pci_dev *dev; + list_for_each_entry(dev, &bus->devices, bus_list) { + eeh_add_device_late(dev); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + struct pci_bus *subbus = dev->subordinate; + if (subbus) + rpaphp_eeh_add_bus_device (subbus); + } + } +} + /***************************************************************************** rpaphp_pci_config_slot() will configure all devices under the given slot->dn and return the the first pci_dev.
@@ -212,6 +230,8 @@ rpaphp_pci_config_slot(struct device_nod } if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) rpaphp_pci_config_bridge(dev); + + rpaphp_eeh_add_bus_device(bus); } return dev; } @@ -220,7 +240,6 @@ static int rpaphp_pci_config_bridge(stru { u8 sec_busno; struct pci_bus *child_bus; - struct pci_dev *child_dev; dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev)); @@ -237,11 +256,7 @@ static int rpaphp_pci_config_bridge(stru /* do pci_scan_child_bus */ pci_scan_child_bus(child_bus); - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - - /* fixup new pci devices without touching bus struct */ + /* Fixup new pci devices without touching bus struct */ rpaphp_fixup_new_pci_devices(child_bus, 0); /* Make the discovered devices available */ @@ -279,7 +294,7 @@ static void print_slot_pci_funcs(struct return; } #else -static void print_slot_pci_funcs(struct slot *slot) +static inline void print_slot_pci_funcs(struct slot *slot) { return; } @@ -361,7 +376,6 @@ static void rpaphp_eeh_remove_bus_device if (pdev) rpaphp_eeh_remove_bus_device(pdev); } - } return; } @@ -563,10 +577,14 @@ exit: return retval; } -struct hotplug_slot *rpaphp_find_hotplug_slot(struct pci_dev *dev) +/** + * rpaphp_find_slot - find and return the slot holding the device + * @dev: pci device for which we want the slot structure. + */ +static struct slot *rpaphp_find_slot(struct pci_dev *dev) { - struct list_head *tmp, *n; - struct slot *slot; + struct list_head *tmp, *n; + struct slot *slot; list_for_each_safe(tmp, n, &rpaphp_slot_head) { struct pci_bus *bus; @@ -585,14 +603,109 @@ struct hotplug_slot *rpaphp_find_hotplug if (!bus) { continue; /* should never happen? 
*/ } + for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev == dev) - return slot->hotplug_slot; + struct pci_dev *pdev = pci_dev_b(ln); + if (pdev == dev) + return slot; } } return NULL; } -EXPORT_SYMBOL_GPL(rpaphp_find_hotplug_slot); +/* ------------------------------------------------------- */ +/** + * handle_eeh_events -- reset a PCI device after hard lockup. + * + * pSeries systems will isolate a PCI slot if the PCI-Host + * bridge detects address or data parity errors, DMA's + * occuring to wild addresses (which usually happen due to + * bugs in device drivers or in PCI adapter firmware). + * Slot isolations also occur if #SERR, #PERR or other misc + * PCI-related errors are detected. + * + * Recovery process consists of unplugging the device driver + * (which generated hotplug events to userspace), then issuing + * a PCI #RST to the device, then reconfiguring the PCI config + * space for all bridges & devices under this slot, and then + * finally restarting the device drivers (which cause a second + * set of hotplug events to go out to userspace). + */ +int handle_eeh_events (struct notifier_block *self, + unsigned long reason, void *ev) +{ + struct eeh_event *event = ev; + struct slot *frozen_slot; + struct eeh_cfg_tree * saved_bars; + + frozen_slot = rpaphp_find_slot(event->dev); + if (!frozen_slot) + { + printk (KERN_ERR + "EEH: Cannot find PCI slot for EEH error! dev=%p dn=%p\n", + event->dev, event->dn); + return 1; + } + + /* Keep a copy of the config space registers */ + saved_bars = eeh_save_bars(frozen_slot->dn); + of_node_get(event->dn); + pci_dev_get(event->dev); + + rpaphp_unconfig_pci_adapter (frozen_slot); + + event->dn->eeh_freeze_count ++; + if (event->dn->eeh_freeze_count > EEH_MAX_ALLOWED_FREEZES) { + /* + * About 90% of all real-life EEH failures in the field + * are due to poorly seated PCI cards. 
Only 10% or so are + * due to actual, failed cards + */ + printk (KERN_ERR + "EEH: device %s:%s has failed %d times \n" + "and has been permanently disabled. Please try reseating\n" + "this device or replacing it.\n", + pci_name (event->dev), + pci_pretty_name (event->dev), + EEH_MAX_ALLOWED_FREEZES); + goto rdone; + } + + /* Reset the pci controller. (Asserts RST#; resets config space). + * Reconfigure bridges and devices */ + rtas_set_slot_reset (event->dn); + rtas_configure_bridge(event->dn); + eeh_restore_bars(saved_bars); + + /* Give the system 5 seconds to finish running the user-space + * hotplug scripts, e.g. ifdown for ethernet. Yes, this is a hack, + * but if we don't do this, weird things happen. + */ + ssleep (5); + + rpaphp_enable_pci_slot (frozen_slot); + + /* The new device node is different than the old one; + * copy over the freeze count, so that we don't loose track of it. + */ + frozen_slot->dn->eeh_freeze_count = event->dn->eeh_freeze_count; +rdone: + of_node_put(event->dn); + pci_dev_put(event->dev); + return 0; +} + +static struct notifier_block eeh_block; + +void __init init_eeh_handler (void) +{ + eeh_block.notifier_call = handle_eeh_events; + eeh_register_notifier (&eeh_block); +} + +void __exit exit_eeh_handler (void) +{ + eeh_unregister_notifier (&eeh_block); +} + From benh at kernel.crashing.org Thu Nov 25 09:12:05 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 25 Nov 2004 09:12:05 +1100 Subject: New patch for Imac G5 In-Reply-To: <1101297524.31127.1159.camel@rapid> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> 
<1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> <1101122178.13612.106.camel@gaston> <1101297524.31127.1159.camel@rapid> Message-ID: <1101334325.4683.7.camel@gaston> On Wed, 2004-11-24 at 12:58 +0100, J. Mayer wrote: > Hi, > > here's the new version of the complete patch I use for the Imac G5. > It includes Bejamin fixes for Shasta SATA initialisation. Here's the > current status of the port: Can you check if it works without the SATA change ? In Darwin, I think that code is only used on wakeup from sleep, which we aren't about to get working on this machine ... > - Pmac PCI IRQ fix: needed for IDE & SATA on Imac G5 > - Ethernet PHY failure if the cable is plugged during (re)boot > - RTC works. > - SMU management: support for reboot and shutdown > - Bad detection of frame-buffer virtual res in riva-fb: > should use xres=1440 yres=900 & virtual_xres=1536 > - Have to unplug/replug the USB keyboard after kernel boot to make it > work > with kernel 2.6.10-rc1. No such problem with 2.6.9. > Fixed with kernel 2.6.10-rc2 > - Lot's of segfaults occuring when multiple > concurent processes are running. > Fixed with kernel 2.6.10-rc2 Can you do split patches ? This is more handy for me to manage... Ben. From l_indien at magic.fr Thu Nov 25 21:47:57 2004 From: l_indien at magic.fr (J. 
Mayer) Date: Thu, 25 Nov 2004 11:47:57 +0100 Subject: New patch for Imac G5 In-Reply-To: <1101334325.4683.7.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> <1101122178.13612.106.camel@gaston> <1101297524.31127.1159.camel@rapid> <1101334325.4683.7.camel@gaston> Message-ID: <1101379677.31127.1180.camel@rapid> On Wed, 2004-11-24 at 23:12, Benjamin Herrenschmidt wrote: > On Wed, 2004-11-24 at 12:58 +0100, J. Mayer wrote: > > Hi, > > > > here's the new version of the complete patch I use for the Imac G5. > > It includes Bejamin fixes for Shasta SATA initialisation. Here's the > > current status of the port: > > Can you check if it works without the SATA change ? In Darwin, I think > that code is only used on wakeup from sleep, which we aren't about to > get working on this machine ... OK, I'll test this. [...] > Can you do split patches ? This is more handy for me to manage... Yes, I'll do this too. -- J. 
Mayer Never organized From paulus at samba.org Thu Nov 25 22:39:03 2004 From: paulus at samba.org (Paul Mackerras) Date: Thu, 25 Nov 2004 22:39:03 +1100 Subject: [PATCH] read_slot_reset_state2 rtas call In-Reply-To: <41990014.1060404@austin.ibm.com> References: <41990014.1060404@austin.ibm.com> Message-ID: <16805.50263.984132.312515@cargo.ozlabs.ibm.com> Nathan Fontenot writes: > This patch attempts to use the newer rtas call if available and falls > back the older version otherwise. This will maintain EEH slot checking > capabilities on all future and current firmware levels. David Howells pointed out that one of the read_slot_reset_state calls wasn't checking the return value, so here's a version of the patch that does. I'll send this to akpm later. Paul. diff -urN linux-2.5/arch/ppc64/kernel/eeh.c test/arch/ppc64/kernel/eeh.c --- linux-2.5/arch/ppc64/kernel/eeh.c 2004-10-26 16:06:41.000000000 +1000 +++ test/arch/ppc64/kernel/eeh.c 2004-11-25 22:24:59.681460824 +1100 @@ -95,6 +95,7 @@ static int ibm_set_eeh_option; static int ibm_set_slot_reset; static int ibm_read_slot_reset_state; +static int ibm_read_slot_reset_state2; static int ibm_slot_error_detail; static int eeh_subsystem_enabled; @@ -407,6 +408,27 @@ } /** + * read_slot_reset_state - Read the reset state of a device node's slot + * @dn: device node to read + * @rets: array to return results in + */ +static int read_slot_reset_state(struct device_node *dn, int rets[]) +{ + int token, outputs; + + if (ibm_read_slot_reset_state2 != RTAS_UNKNOWN_SERVICE) { + token = ibm_read_slot_reset_state2; + outputs = 4; + } else { + token = ibm_read_slot_reset_state; + outputs = 3; + } + + return rtas_call(token, 3, outputs, rets, dn->eeh_config_addr, + BUID_HI(dn->phb->buid), BUID_LO(dn->phb->buid)); +} + +/** * eeh_panic - call panic() for an eeh event that cannot be handled. 
* The philosophy of this routine is that it is better to panic and * halt the OS than it is to risk possible data corruption by @@ -508,7 +530,7 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) { int ret; - int rets[2]; + int rets[3]; unsigned long flags; int rc, reset_state; struct eeh_event *event; @@ -539,11 +561,8 @@ atomic_inc(&eeh_fail_count); if (atomic_read(&eeh_fail_count) >= EEH_MAX_FAILS) { /* re-read the slot reset state */ - rets[0] = -1; - rtas_call(ibm_read_slot_reset_state, 3, 3, rets, - dn->eeh_config_addr, - BUID_HI(dn->phb->buid), - BUID_LO(dn->phb->buid)); + if (read_slot_reset_state(dn, rets) != 0) + rets[0] = -1; /* reset state unknown */ eeh_panic(dev, rets[0]); } return 0; @@ -556,10 +575,7 @@ * function zero of a multi-function device. * In any case they must share a common PHB. */ - ret = rtas_call(ibm_read_slot_reset_state, 3, 3, rets, - dn->eeh_config_addr, BUID_HI(dn->phb->buid), - BUID_LO(dn->phb->buid)); - + ret = read_slot_reset_state(dn, rets); if (!(ret == 0 && rets[1] == 1 && (rets[0] == 2 || rets[0] == 4))) { __get_cpu_var(false_positives)++; return 0; @@ -755,6 +771,7 @@ ibm_set_eeh_option = rtas_token("ibm,set-eeh-option"); ibm_set_slot_reset = rtas_token("ibm,set-slot-reset"); + ibm_read_slot_reset_state2 = rtas_token("ibm,read-slot-reset-state2"); ibm_read_slot_reset_state = rtas_token("ibm,read-slot-reset-state"); ibm_slot_error_detail = rtas_token("ibm,slot-error-detail"); From david at gibson.dropbear.id.au Fri Nov 26 14:59:59 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 26 Nov 2004 14:59:59 +1100 Subject: [PPC64] Tweaks to ppc64 cpu sysfs information Message-ID: <20041126035959.GK11370@zax> Andrew, please apply: Currently the ppc64 sysfs code registers an entry for each possible cpu in sysfs, rather than just online cpus. That makes sense, since the sysfs entries are needed to control onlining of the cpus. 
However, this is done even if CONFIG_HOTPLUG_CPU is not set, or if it is not a hotplug capable (DLPAR) machine, which is a bit misleading. Secondly it also registers all the other sysfs entries (mostly performance monitoring controls) on all possible cpus, although they are quite meaningless on non-online cpus. This patch alters the code to only register sysfs directories at boot for cpus which are either online or could be onlined (cpu is possible, and CONFIG_HOTPLUG_CPU and an lpar machine). Furthermore, the entries apart from 'online' itself and 'physical_id' are only registered for online CPUs (and deregistered again if a cpu goes offline).
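The registration rule described above can be modelled in a few lines of plain C. This is only a sketch of the decision logic, not the kernel code from the patch: the struct and function names are made up for illustration, and the two flags merely stand in for CONFIG_HOTPLUG_CPU and the LPAR platform check.

```c
#include <stdbool.h>

/* Hypothetical inputs describing one possible cpu at boot. */
struct cpu_state {
	bool online;          /* cpu is currently online */
	bool hotplug_config;  /* stands in for CONFIG_HOTPLUG_CPU */
	bool lpar;            /* stands in for the pSeries LPAR platform check */
};

/* A cpu can be onlined/offlined only with hotplug support on an LPAR machine. */
static bool cpu_controllable(const struct cpu_state *c)
{
	return c->hotplug_config && c->lpar;
}

/* The sysfs directory exists for cpus that are online or could be onlined. */
static bool wants_sysfs_dir(const struct cpu_state *c)
{
	return c->online || cpu_controllable(c);
}

/* Attributes other than 'online' and 'physical_id' exist only while online. */
static bool wants_online_attrs(const struct cpu_state *c)
{
	return c->online;
}
```

Under these assumptions an offline cpu on a non-hotplug kernel gets no sysfs directory at all, while an offline but onlinable cpu gets a directory without the performance monitoring entries.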
Index: working-2.6/arch/ppc64/kernel/sysfs.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/sysfs.c 2004-11-17 11:19:38.000000000 +1100 +++ working-2.6/arch/ppc64/kernel/sysfs.c 2004-11-26 12:28:39.064454600 +1100 @@ -7,6 +7,8 @@ #include #include #include +#include +#include #include #include @@ -15,6 +17,8 @@ #include +static DEFINE_PER_CPU(struct cpu, cpu_devices); + /* SMT stuff */ #ifdef CONFIG_PPC_MULTIPLATFORM @@ -255,8 +259,18 @@ static SYSDEV_ATTR(pmc8, 0600, show_pmc8, store_pmc8); static SYSDEV_ATTR(purr, 0600, show_purr, NULL); -static void __init register_cpu_pmc(struct sys_device *s) +static void register_cpu_online(unsigned int cpu) { + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_create_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + sysdev_create_file(s, &attr_mmcr0); sysdev_create_file(s, &attr_mmcr1); @@ -279,6 +293,65 @@ sysdev_create_file(s, &attr_purr); } +#ifdef CONFIG_HOTPLUG_CPU +static void unregister_cpu_online(unsigned int cpu) +{ + struct cpu *c = &per_cpu(cpu_devices, cpu); + struct sys_device *s = &c->sysdev; + + BUG_ON(c->no_control); + +#ifndef CONFIG_PPC_ISERIES + if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) + sysdev_remove_file(s, &attr_smt_snooze_delay); +#endif + + /* PMC stuff */ + + sysdev_remove_file(s, &attr_mmcr0); + sysdev_remove_file(s, &attr_mmcr1); + + if (cur_cpu_spec->cpu_features & CPU_FTR_MMCRA) + sysdev_remove_file(s, &attr_mmcra); + + sysdev_remove_file(s, &attr_pmc1); + sysdev_remove_file(s, &attr_pmc2); + sysdev_remove_file(s, &attr_pmc3); + sysdev_remove_file(s, &attr_pmc4); + sysdev_remove_file(s, &attr_pmc5); + sysdev_remove_file(s, &attr_pmc6); + + if (cur_cpu_spec->cpu_features & CPU_FTR_PMC8) { + sysdev_remove_file(s, &attr_pmc7); + sysdev_remove_file(s, &attr_pmc8); + } + + if (cur_cpu_spec->cpu_features & 
CPU_FTR_SMT) + sysdev_remove_file(s, &attr_purr); +} +#endif /* CONFIG_HOTPLUG_CPU */ + +static int __devinit sysfs_cpu_notify(struct notifier_block *self, + unsigned long action, void *hcpu) +{ + unsigned int cpu = (unsigned int)(long)hcpu; + + switch (action) { + case CPU_ONLINE: + register_cpu_online(cpu); + break; +#ifdef CONFIG_HOTPLUG_CPU + case CPU_DEAD: + unregister_cpu_online(cpu); + break; +#endif + } + return NOTIFY_OK; +} + +static struct notifier_block __devinitdata sysfs_cpu_nb = { + .notifier_call = sysfs_cpu_notify, +}; /* NUMA stuff */ @@ -308,8 +381,7 @@ } #endif - -/* Only valid if CPU is online. */ +/* Only valid if CPU is present. */ static ssize_t show_physical_id(struct sys_device *dev, char *buf) { struct cpu *cpu = container_of(dev, struct cpu, sysdev); @@ -318,9 +390,6 @@ } static SYSDEV_ATTR(physical_id, 0444, show_physical_id, NULL); - -static DEFINE_PER_CPU(struct cpu, cpu_devices); - static int __init topology_init(void) { int cpu; @@ -328,6 +397,8 @@ register_nodes(); + register_cpu_notifier(&sysfs_cpu_nb); + for_each_cpu(cpu) { struct cpu *c = &per_cpu(cpu_devices, cpu); @@ -341,19 +412,19 @@ * CPU. For instance, the boot cpu might never be valid * for hotplugging. */ +#ifdef CONFIG_HOTPLUG_CPU if (systemcfg->platform != PLATFORM_PSERIES_LPAR) +#endif c->no_control = 1; - register_cpu(c, cpu, parent); - - register_cpu_pmc(&c->sysdev); + if (cpu_online(cpu) || (c->no_control == 0)) { + register_cpu(c, cpu, parent); - sysdev_create_file(&c->sysdev, &attr_physical_id); + sysdev_create_file(&c->sysdev, &attr_physical_id); + } -#ifndef CONFIG_PPC_ISERIES - if (cur_cpu_spec->cpu_features & CPU_FTR_SMT) - sysdev_create_file(&c->sysdev, &attr_smt_snooze_delay); -#endif + if (cpu_online(cpu)) + register_cpu_online(cpu); } return 0; -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist. NOT _the_ _other_ _way_ | _around_! 
http://www.ozlabs.org/people/dgibson From dwmw2 at infradead.org Sat Nov 27 01:18:49 2004 From: dwmw2 at infradead.org (David Woodhouse) Date: Fri, 26 Nov 2004 14:18:49 +0000 Subject: Some ppc64 signal/ptrace patches. Message-ID: <1101478729.8191.9543.camel@hades.cambridge.redhat.com> First, single-stepping into and out of signals wasn't working. We were sending a SIGTRAP to the debugger on the second instruction of the handler, not the first. Second, the return path from sigsuspend() was stomping on the r4 and r5 registers (the args to the signal handler) by using syscall_exit to get back to userspace instead of ret_from_except. Third, signals were remaining masked when a signal handler was _NOT_ invoked due to having a bogus altstack. We aborted the setup of the signal frame and forced a SIGSEGV, but we didn't put the original signal mask back. As discovered by Bodo Stroesser. -- dwmw2 -------------- next part -------------- A non-text attachment was scrubbed... Name: linux-2.6.10-ppc64-sigmasking.patch Type: text/x-patch Size: 4353 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041126/64ea2fb1/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: linux-2.6.10-ppc64-sigsuspend.patch Type: text/x-patch Size: 1643 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041126/64ea2fb1/attachment-0001.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: linux-2.6.10-ppc64-singlestep.patch Type: text/x-patch Size: 3914 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041126/64ea2fb1/attachment-0002.bin From benh at kernel.crashing.org Sat Nov 27 15:20:18 2004 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 27 Nov 2004 15:20:18 +1100 Subject: Some ppc64 signal/ptrace patches. 
In-Reply-To: <1101478729.8191.9543.camel@hades.cambridge.redhat.com>
References: <1101478729.8191.9543.camel@hades.cambridge.redhat.com>
Message-ID: <1101529218.4667.10.camel@gaston>

On Fri, 2004-11-26 at 14:18 +0000, David Woodhouse wrote:
> First, single-stepping into and out of signals wasn't working. We were
> sending a SIGTRAP to the debugger on the second instruction of the
> handler, not the first.
>
> Second, the return path from sigsuspend() was stomping on the r4 and r5
> registers (the args to the signal handler) by using syscall_exit to get
> back to userspace instead of ret_from_except.
>
> Third, signals were remaining masked when a signal handler was _NOT_
> invoked due to having a bogus altstack. We aborted the setup of the
> signal frame and forced a SIGSEGV, but we didn't put the original signal
> mask back. As discovered by Bodo Stroesser.

I haven't looked in detail yet at the patch, but have you double checked
that restarting is still OK ? I remember the whole thing being quite
fragile ...

Ben.

From l_indien at magic.fr Sat Nov 27 23:43:35 2004
From: l_indien at magic.fr (J.
Mayer) Date: Sat, 27 Nov 2004 13:43:35 +0100 Subject: New patch for Imac G5 In-Reply-To: <1101334325.4683.7.camel@gaston> References: <1099656843.8346.7.camel@rapid> <1099702566.3946.49.camel@gaston> <1099769123.8346.41.camel@rapid> <1099774242.10262.99.camel@gaston> <1099785081.5295.114.camel@gaston> <1099787569.3946.116.camel@gaston> <1099787630.3884.118.camel@gaston> <1100183508.8346.5849.camel@rapid> <1100209361.16927.8.camel@gaston> <1100274787.9674.1703.camel@rapid> <1100308457.20512.79.camel@gaston> <1100311368.16435.22.camel@rapid> <1100311438.20592.106.camel@gaston> <1100312551.16435.37.camel@rapid> <1100312517.20592.109.camel@gaston> <1100443250.16435.57.camel@rapid> <1100470040.20593.143.camel@gaston> <78F844EA-36E8-11D9-B463-000A95A4DC02@kernel.crashing.org> <1101093895.31127.22.camel@rapid> <1101094938.13598.42.camel@gaston> <1101104580.31127.115.camel@rapid> <1101109362.22529.82.camel@gaston> <1101111995.31127.155.camel@rapid> <1101112864.13597.94.camel@gaston> <1101114554.31127.174.camel@rapid> <1101116254.13598.98.camel@gaston> <1101119504.31127.192.camel@rapid> <1101122178.13612.106.camel@gaston> <1101297524.31127.1159.camel@rapid> <1101334325.4683.7.camel@gaston> Message-ID: <1101559415.31127.4179.camel@rapid> On Wed, 2004-11-24 at 23:12, Benjamin Herrenschmidt wrote: > On Wed, 2004-11-24 at 12:58 +0100, J. Mayer wrote: > > Hi, > > > > here's the new version of the complete patch I use for the Imac G5. > > It includes Bejamin fixes for Shasta SATA initialisation. Here's the > > current status of the port: > > Can you check if it works without the SATA change ? In Darwin, I think > that code is only used on wakeup from sleep, which we aren't about to > get working on this machine ... I just checked, it works well without your patch. > > - Pmac PCI IRQ fix: needed for IDE & SATA on Imac G5 > > - Ethernet PHY failure if the cable is plugged during (re)boot > > - RTC works. 
> > - SMU management: support for reboot and shutdown
> > - Bad detection of frame-buffer virtual res in riva-fb:
> > should use xres=1440 yres=900 & virtual_xres=1536
> > - Have to unplug/replug the USB keyboard after kernel boot to make it
> > work
> > with kernel 2.6.10-rc1. No such problem with 2.6.9.
> > Fixed with kernel 2.6.10-rc2
> > - Lots of segfaults occurring when multiple
> > concurrent processes are running.
> > Fixed with kernel 2.6.10-rc2
> >
> Can you do split patches ? This is more handy for me to manage...

Here you'll find a tarball with the patch split up.

Regards.

--
J. Mayer
Never organized
-------------- next part --------------
A non-text attachment was scrubbed...
Name: linux-2.6.10-rc2_ImacG5.tgz
Type: application/x-compressed-tar
Size: 10484 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20041127/b781155c/attachment.bin

From dwmw2 at infradead.org Sun Nov 28 00:18:12 2004
From: dwmw2 at infradead.org (David Woodhouse)
Date: Sat, 27 Nov 2004 13:18:12 +0000
Subject: Some ppc64 signal/ptrace patches.
In-Reply-To: <1101529218.4667.10.camel@gaston>
References: <1101478729.8191.9543.camel@hades.cambridge.redhat.com> <1101529218.4667.10.camel@gaston>
Message-ID: <1101561492.21273.5107.camel@baythorne.infradead.org>

On Sat, 2004-11-27 at 15:20 +1100, Benjamin Herrenschmidt wrote:
> I haven't looked in detail yet at the patch, but have you double checked
> that restarting is still OK ? I remember the whole thing being quite
> fragile ...

It should be -- I haven't done anything which really ought to affect
it.

The code in signal.c and signal32.c seems to have a lot of duplication,
but also a lot of code which is _almost_ duplicated but just slightly
different for no adequately defined reason. I wonder if it makes sense
to have a go at unifying it a bit, or at least making the 32 and 64
bits look alike if they _do_ have to be separate?
-- dwmw2 From rusty at rustcorp.com.au Mon Nov 29 08:58:11 2004 From: rusty at rustcorp.com.au (Rusty Russell) Date: Mon, 29 Nov 2004 08:58:11 +1100 Subject: [PPC64] Tweaks to ppc64 cpu sysfs information In-Reply-To: <20041126035959.GK11370@zax> References: <20041126035959.GK11370@zax> Message-ID: <1101679091.25347.9.camel@localhost.localdomain> On Fri, 2004-11-26 at 14:59 +1100, David Gibson wrote: > Andrew, please apply: > > Currently the ppc64 sysfs code registers an entry for each possible > cpu in sysfs, rather than just online cpus. That makes sense, since > the sysfs entries are needed to control onlining of the cpus. > However, this is done even if CONFIG_HOTPLUG_CPU is not set, or if it > is not a hotplug capable (DLPAR) machine, which is a bit misleading. > Secondly it also registers all the other sysfs entries (mostly > performance monitoring controls) on all possible cpus, although they > are quite meaningless on non-online cpus. Surely if !CONFIG_HOTPLUG_CPU, then online == possible? If not, it should be. That would solve part of the problem. Rusty. -- A bad analogy is like a leaky screwdriver -- Richard Braakman From david at gibson.dropbear.id.au Mon Nov 29 11:50:33 2004 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 29 Nov 2004 11:50:33 +1100 Subject: [PPC64] Tweaks to ppc64 cpu sysfs information In-Reply-To: <1101679091.25347.9.camel@localhost.localdomain> References: <20041126035959.GK11370@zax> <1101679091.25347.9.camel@localhost.localdomain> Message-ID: <20041129005033.GB4155@zax> On Mon, Nov 29, 2004 at 08:58:11AM +1100, Paul 'Rusty' Russell wrote: > On Fri, 2004-11-26 at 14:59 +1100, David Gibson wrote: > > Andrew, please apply: > > > > Currently the ppc64 sysfs code registers an entry for each possible > > cpu in sysfs, rather than just online cpus. That makes sense, since > > the sysfs entries are needed to control onlining of the cpus. 
> > However, this is done even if CONFIG_HOTPLUG_CPU is not set, or if it
> > is not a hotplug capable (DLPAR) machine, which is a bit misleading.
> > Secondly it also registers all the other sysfs entries (mostly
> > performance monitoring controls) on all possible cpus, although they
> > are quite meaningless on non-online cpus.
>
> Surely if !CONFIG_HOTPLUG_CPU, then online == possible? If not, it
> should be. That would solve part of the problem.

No, it's not. Yes, it probably should be. I thought about it, but
wasn't sure what other consequences that might have. I figured my
patch would definitely fix some things, and there's actually less
overlap with setting online==possible than you might think, partly
because my patch will do the right thing if we ever have systems with
some CPUs on/offlineable and others not.

--
David Gibson                   | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist. NOT _the_ _other_ _way_
                               | _around_!
http://www.ozlabs.org/people/dgibson

From igor at cs.wisc.edu Tue Nov 30 08:00:17 2004
From: igor at cs.wisc.edu (Igor Grobman)
Date: Mon, 29 Nov 2004 15:00:17 -0600 (CST)
Subject: power4 performance counters
Message-ID:

Is there a publicly available document that specifies the events
countable on power4 performance counters? I found POWER4.evs on an AIX
box (part of pmapi library), but I would like to write my own code that
accesses those counters on linux, and incorporating that information
probably violates IBM copyright.

Thanks,
Igor

From anton at samba.org Tue Nov 30 08:49:14 2004
From: anton at samba.org (Anton Blanchard)
Date: Tue, 30 Nov 2004 08:49:14 +1100
Subject: power4 performance counters
In-Reply-To:
References:
Message-ID: <20041129214914.GD17540@krispykreme.ozlabs.ibm.com>

> Is there a publicly available document that specifies the events
> countable on power4 performance counters?
> I found POWER4.evs on an AIX
> box (part of pmapi library), but I would like to write my own code that
> accesses those counters on linux, and incorporating that information
> probably violates IBM copyright.

oprofile has a summary of the common groups for POWER4, POWER5 and 970.
It supports hardware counters on recent 2.6 kernels. You may need to
take a CVS snapshot of oprofile; it's fairly recent.

For a background of how the performance counters work in POWER4/5 and
970 (the concepts are the same even though the muxes are different),
check out the 970FX book4:

http://www-306.ibm.com/chips/techlib/techlib.nsf/products/PowerPC_970_and_970FX_Microprocessors

Let me know if you get stuck. Just out of interest, are you planning to
work on one of the performance counter packages (e.g. pmapi, perfctr)?

BTW the performance counter registers (PMC*, MMCR*) are exported in 2.6
in sysfs (/sys/devices/system/cpu/cpu*), so you can read and write them
for each cpu that way. They are only 32 bits wide, so you need to read
them periodically if you are doing it all from userspace (e.g. a few
times a second to check for wrap).

There is also an mfspr instruction usable from userspace to grab the PMC
values, so you can avoid hitting the kernel on reads if you want to do
low overhead stuff in userspace.

Anton
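
The wrap handling described above can be sketched roughly as follows. This is a minimal userspace sketch, not code from oprofile or the kernel: the helper names (pmc_accumulate, read_pmc_sysfs) are made up for illustration, and the hexadecimal text format of the sysfs attribute is an assumption, so check it on the target kernel before relying on it.

```c
#include <stdint.h>
#include <stdio.h>

/* Running 64-bit total built from samples of a 32-bit hardware counter. */
struct pmc_acc {
	uint64_t total;   /* accumulated events */
	uint32_t last;    /* previous raw sample */
	int primed;       /* 0 until the first sample has been seen */
};

/*
 * Fold a new raw 32-bit sample into the total. Unsigned subtraction
 * modulo 2^32 gives the right delta across a single wrap, so this is
 * correct as long as the counter is sampled more often than it wraps.
 */
static void pmc_accumulate(struct pmc_acc *acc, uint32_t now)
{
	if (acc->primed)
		acc->total += (uint32_t)(now - acc->last);
	acc->last = now;
	acc->primed = 1;
}

/*
 * Hypothetical reader: parse one counter value from a sysfs attribute.
 * The path is supplied by the caller; the hex text format is an assumption.
 * Returns 0 on success, -1 on failure.
 */
static int read_pmc_sysfs(const char *path, uint32_t *val)
{
	unsigned long v;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fscanf(f, "%lx", &v) != 1) {
		fclose(f);
		return -1;
	}
	fclose(f);
	*val = (uint32_t)v;
	return 0;
}
```

Calling pmc_accumulate() on each counter a few times a second, as suggested above, should be enough to catch every wrap; the first sample only primes the accumulator and adds nothing to the total.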