From trini at kernel.crashing.org Wed Feb 1 02:08:24 2006
From: trini at kernel.crashing.org (Tom Rini)
Date: Tue, 31 Jan 2006 08:08:24 -0700
Subject: Maple fails to boot current git
In-Reply-To: <1138679592.4934.1.camel@localhost.localdomain>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<20060130231118.GA19671@localhost.localdomain>
	<1138679592.4934.1.camel@localhost.localdomain>
Message-ID: <20060131150824.GO22672@smtp.west.cox.net>

On Tue, Jan 31, 2006 at 02:53:11PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2006-01-31 at 12:11 +1300, David Gibson wrote:
> > On Mon, Jan 30, 2006 at 10:17:59AM -0700, Tom Rini wrote:
> > > Hello, trying to boot my maple board (ppc64_defconfig +
> > > CONFIG_PPC_EARLY_DEBUG_MAPLE=y) fails as follows (the "dirty" is
> > > #define DEBUG in kernel/prom_parse.c and platforms/maple/time.c):
> >
> > Crud. Our Maple is stuffed at the moment (doesn't complete the CPU
> > init script, so PIBS never even comes up on the 970), so I can't
> > really investigate.
>
> Well, the RTC problem definitely looks like a bogus or lack of "ranges"
> property or the fact that the parser doesn't recognize "ht" as a PCI
> bus. You may want to try updating prom_parse.c to treat "ht" as a PCI
> bus and see if that helps.

With the following, I get parent bus is pci now, but still:
OF: ** translation for device /ht@0/isa@4/rtc@900 **
OF: bus is isa (na=2, ns=1) on /ht@0/isa@4
OF: translating address: 00000001 00000900
OF: parent bus is pci (na=3, ns=2) on /ht@0
OF: walking ranges...
OF: not found !
Maple: Unable to translate RTC address
Maple: No device node for RTC, assuming legacy address (0x70)

diff --git a/arch/powerpc/kernel/prom_parse.c b/arch/powerpc/kernel/prom_parse.c
index a8099c8..6006201 100644
--- a/arch/powerpc/kernel/prom_parse.c
+++ b/arch/powerpc/kernel/prom_parse.c
@@ -1,4 +1,4 @@
-#undef DEBUG
+#define DEBUG
 #include
 #include
 
@@ -113,8 +113,10 @@ static unsigned int of_bus_default_get_f
 
 static int of_bus_pci_match(struct device_node *np)
 {
-	/* "vci" is for the /chaos bridge on 1st-gen PCI powermacs */
-	return !strcmp(np->type, "pci") || !strcmp(np->type, "vci");
+	/* "vci" is for the /chaos bridge on 1st-gen PCI powermacs, "ht"
+	 * is the maple board. */
+	return !strcmp(np->type, "pci") || !strcmp(np->type, "vci") ||
+		!strcmp(np->type, "ht");
 }
 
 static void of_bus_pci_count_cells(struct device_node *np,
@@ -239,6 +241,16 @@ static struct of_bus of_busses[] = {
 		.translate = of_bus_pci_translate,
 		.get_flags = of_bus_pci_get_flags,
 	},
+	/* HT */
+	{
+		.name = "ht",
+		.addresses = "assigned-addresses",
+		.match = of_bus_pci_match,
+		.count_cells = of_bus_pci_count_cells,
+		.map = of_bus_pci_map,
+		.translate = of_bus_pci_translate,
+		.get_flags = of_bus_pci_get_flags,
+	},
 	/* ISA */
 	{
 		.name = "isa",

-- 
Tom Rini
http://gate.crashing.org/~trini/

From trini at kernel.crashing.org Wed Feb 1 02:11:17 2006
From: trini at kernel.crashing.org (Tom Rini)
Date: Tue, 31 Jan 2006 08:11:17 -0700
Subject: [PATCH 2.6.16-rc1] Fix booting Maple boards (was: Re: LINUXPPC64 Maple fails to boot current git)
In-Reply-To: <1138662630.3417.26.camel@brick.watson.ibm.com>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<1138662630.3417.26.camel@brick.watson.ibm.com>
Message-ID: <20060131151117.GP22672@smtp.west.cox.net>

On Mon, Jan 30, 2006 at 06:10:30PM -0500, Michal Ostrowski wrote:
> I saw something similar on a JS-20 w SLOF. The last message you see is
> related to the RTC driver, but the next thing to run after that is
> console_init(), which was where my system was dying.
>
> Dropping the "#ifdef CONFIG_ISA" statements in
> arch/powerpc/kernel/legacy_serial.c appears to fix things, and I've been
> told that a patch to this effect has been posted (though I've yet to see
> it).

The following gets my Maple booting again, and I _think_ is testing what
was intended
---
When looking for legacy serial ports, condition poking of "ISA" areas on
CONFIG_GENERIC_ISA_DMA, rather than CONFIG_ISA as some boards (such as
the Maple) have no ISA slots, but do have ISA serial ports.

Signed-off-by: Tom Rini

 arch/powerpc/kernel/legacy_serial.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c
index f970ace..3dd7b39 100644
--- a/arch/powerpc/kernel/legacy_serial.c
+++ b/arch/powerpc/kernel/legacy_serial.c
@@ -134,7 +134,7 @@ static int __init add_legacy_soc_port(st
 	return add_legacy_port(np, -1, UPIO_MEM, addr, addr, NO_IRQ, flags);
 }
 
-#ifdef CONFIG_ISA
+#ifdef CONFIG_GENERIC_ISA_DMA
 static int __init add_legacy_isa_port(struct device_node *np,
 				      struct device_node *isa_brg)
 {
@@ -276,7 +276,7 @@ void __init find_legacy_serial_ports(voi
 		of_node_put(soc);
 	}
 
-#ifdef CONFIG_ISA
+#ifdef CONFIG_GENERIC_ISA_DMA
 	/* First fill our array with ISA ports */
 	for (np = NULL; (np = of_find_node_by_type(np, "serial"));) {
 		struct device_node *isa = of_get_parent(np);

-- 
Tom Rini
http://gate.crashing.org/~trini/

From linas at austin.ibm.com Wed Feb 1 07:22:14 2006
From: linas at austin.ibm.com (linas)
Date: Tue, 31 Jan 2006 14:22:14 -0600
Subject: creating PCI-related sysfs entries
Message-ID: <20060131202214.GZ19465@austin.ibm.com>

Hi,

I want to create some sysfs entries in order to report on the
status of PCI slots. (If you are guessing that this pertains
to the PCI error recovery code, you'd be right).
I'm having trouble figuring out the best way to do this.

There are existing entries at /sys/bus/pci/slots/... but these
are for hotplug slots only; none of the soldered-onto-the-MB
devices show up here. Is this intentional, or is this a bug/
oversight/not-yet-implemented thing?

I also want to report some roll-up system-wide statistics;
both /sys/module and /sys/class seem reasonable. My code
does not compile as a module. Suggestions?

Yes, I'm going to RTFM shortly after I hit the send key,
assuming I find the FM.

--linas

From greg at kroah.com Wed Feb 1 07:34:56 2006
From: greg at kroah.com (Greg KH)
Date: Tue, 31 Jan 2006 12:34:56 -0800
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131202214.GZ19465@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
Message-ID: <20060131203456.GA23819@kroah.com>

On Tue, Jan 31, 2006 at 02:22:14PM -0600, linas wrote:
>
> Hi,
>
> I want to create some sysfs entries in order to report on the
> status of PCI slots. (If you are guessing that this pertains
> to the PCI error recovery code, you'd be right). I'm having
> trouble figuring out the best way to do this.
>
> There are existing entries at /sys/bus/pci/slots/... but these
> are for hotplug slots only; none of the soldered-onto-the-MB
> devices show up here. Is this intentional, or is this a bug/
> oversight/not-yet-implemented thing?

Not implemented, as it's up to a pci hotplug controller driver to
provide those slots. It sounds like your driver needs to be expanded :)

> I also want to report some roll-up system-wide statistics;
> both /sys/module and /sys/class seem reasonable. My code
> does not compile as a module. Suggestions?

What kind of statistics? Is this driver related? PCI bus related?
Device related?

thanks,

greg k-h

From linas at austin.ibm.com Wed Feb 1 08:08:05 2006
From: linas at austin.ibm.com (linas)
Date: Tue, 31 Jan 2006 15:08:05 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131203456.GA23819@kroah.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
Message-ID: <20060131210805.GA19465@austin.ibm.com>

On Tue, Jan 31, 2006 at 12:34:56PM -0800, Greg KH was heard to remark:
> On Tue, Jan 31, 2006 at 02:22:14PM -0600, linas wrote:
> >
> > I want to create some sysfs entries in order to report on the
> > status of PCI slots. (If you are guessing that this pertains
> > to the PCI error recovery code, you'd be right). I'm having
> > trouble figuring out the best way to do this.
> >
> > There are existing entries at /sys/bus/pci/slots/... but these
> > are for hotplug slots only; none of the soldered-onto-the-MB
> > devices show up here. Is this intentional, or is this a bug/
> > oversight/not-yet-implemented thing?
>
> Not implemented, as it's up to a pci hotplug controller driver to
> provide those slots. It sounds like your driver needs to be expanded :)

Hmm. But these slots are not hot-pluggable; should the arch
use the hotplug infrastructure even on those slots?

I note that /sys/devices/pciXXXX does have all of the pci
slots listed, so perhaps that is where I can place per-slot data.

> > I also want to report some roll-up system-wide statistics;
> > both /sys/module and /sys/class seem reasonable. My code
> > does not compile as a module. Suggestions?
>
> What kind of statistics? Is this driver related? PCI bus related?
> Device related?

Related to the PCI error recovery. I'm not sure how to conceptually
peg this: one could say that it is the driver for a specific type
of pci-host bridge, although the code is not currently structured
as such. Should I try to restructure it as such?
If so, I'm not clear on how to proceed; I can't say I've clearly seen a
kernel abstraction of a pci-host bridge device onto which to staple myself.

I wanted to report a few read-only statistics, and a few writeable
parameters:

Read-only:
-- total number of PCI device resets due to detected errors
-- total number of "false positives" (probable errors that weren't)
-- some other misc related stats.

Most, but not all, of these statistics could be obtained by
totalling up the per-slot statistics.

Writable:
-- Number of reset tries to perform before concluding that the
   device is hopelessly dead. Resets are disruptive and intensive,
   and I don't want to get stuck in an inf loop on a dead device.

Linas.

From greg at kroah.com Wed Feb 1 08:26:24 2006
From: greg at kroah.com (Greg KH)
Date: Tue, 31 Jan 2006 13:26:24 -0800
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131210805.GA19465@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
Message-ID: <20060131212624.GA10513@kroah.com>

On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> On Tue, Jan 31, 2006 at 12:34:56PM -0800, Greg KH was heard to remark:
> > On Tue, Jan 31, 2006 at 02:22:14PM -0600, linas wrote:
> > >
> > > I want to create some sysfs entries in order to report on the
> > > status of PCI slots. (If you are guessing that this pertains
> > > to the PCI error recovery code, you'd be right). I'm having
> > > trouble figuring out the best way to do this.
> > >
> > > There are existing entries at /sys/bus/pci/slots/... but these
> > > are for hotplug slots only; none of the soldered-onto-the-MB
> > > devices show up here. Is this intentional, or is this a bug/
> > > oversight/not-yet-implemented thing?
> >
> > Not implemented, as it's up to a pci hotplug controller driver to
> > provide those slots. It sounds like your driver needs to be expanded :)
>
> Hmm. But these slots are not hot-pluggable; should the arch
> use the hotplug infrastructure even on those slots?

Why not? It's a good place to put them, right?

> I note that /sys/devices/pciXXXX does have all of the pci
> slots listed, so perhaps that is where I can place per-slot data.

That's only because your arch might happen to have 1 device per slot,
which is not true for other arches. And I bet it's also not true for
your non-virtual boxes...

> > > I also want to report some roll-up system-wide statistics;
> > > both /sys/module and /sys/class seem reasonable. My code
> > > does not compile as a module. Suggestions?
> >
> > What kind of statistics? Is this driver related? PCI bus related?
> > Device related?
>
> Related to the PCI error recovery. I'm not sure how to conceptually
> peg this: one could say that it is the driver for a specific type
> of pci-host bridge, although the code is not currently structured
> as such. Should I try to restructure it as such? If so, I'm not
> clear on how to proceed; I can't say I've clearly seen a kernel
> abstraction of a pci-host bridge device onto which to staple myself.

People have suggested that they create such a driver for a long time.
Why not just do that?

> I wanted to report a few read-only statistics, and a few writeable
> parameters:
>
> Read-only:
> -- total number of PCI device resets due to detected errors
> -- total number of "false positives" (probable errors that weren't)
> -- some other misc related stats.

These are all "per slot" right?

> Most, but not all, of these statistics could be obtained by
> totalling up the per-slot statistics.
>
> Writable:
> -- Number of reset tries to perform before concluding that the
>    device is hopelessly dead. Resets are disruptive and intensive,
>    and I don't want to get stuck in an inf loop on a dead device.

Why would you want to change this value? Just pick one at build time.

thanks,

greg k-h

From benh at kernel.crashing.org Wed Feb 1 08:31:34 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 01 Feb 2006 08:31:34 +1100
Subject: Maple fails to boot current git
In-Reply-To: <20060131150824.GO22672@smtp.west.cox.net>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<20060130231118.GA19671@localhost.localdomain>
	<1138679592.4934.1.camel@localhost.localdomain>
	<20060131150824.GO22672@smtp.west.cox.net>
Message-ID: <1138743094.4934.11.camel@localhost.localdomain>

On Tue, 2006-01-31 at 08:08 -0700, Tom Rini wrote:
> On Tue, Jan 31, 2006 at 02:53:11PM +1100, Benjamin Herrenschmidt wrote:
> > On Tue, 2006-01-31 at 12:11 +1300, David Gibson wrote:
> > > On Mon, Jan 30, 2006 at 10:17:59AM -0700, Tom Rini wrote:
> > > > Hello, trying to boot my maple board (ppc64_defconfig +
> > > > CONFIG_PPC_EARLY_DEBUG_MAPLE=y) fails as follows (the "dirty" is
> > > > #define DEBUG in kernel/prom_parse.c and platforms/maple/time.c):
> > >
> > > Crud. Our Maple is stuffed at the moment (doesn't complete the CPU
> > > init script, so PIBS never even comes up on the 970), so I can't
> > > really investigate.
> >
> > Well, the RTC problem definitely looks like a bogus or lack of "ranges"
> > property or the fact that the parser doesn't recognize "ht" as a PCI
> > bus. You may want to try updating prom_parse.c to treat "ht" as a PCI
> > bus and see if that helps.
>
> With the following, I get parent bus is pci now, but still:
> OF: ** translation for device /ht@0/isa@4/rtc@900 **
> OF: bus is isa (na=2, ns=1) on /ht@0/isa@4
> OF: translating address: 00000001 00000900
> OF: parent bus is pci (na=3, ns=2) on /ht@0
> OF: walking ranges...
> OF: not found !
> Maple: Unable to translate RTC address
> Maple: No device node for RTC, assuming legacy address (0x70)

Can you send me the device-tree dump ?

Ben.
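
[Editorial aside, not part of the original thread.] For readers following
the OF debug output above, here is a minimal user-space sketch of what
"walking ranges" amounts to for an ISA child bus with na=2, ns=1 under a
parent with na=3. It is not the kernel's prom_parse.c code: the struct,
the function name isa_translate, and any range values fed to it are
hypothetical and serve only to illustrate the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical ISA bridge "ranges" property:
 * child address (2 cells), parent address (3 cells), size (1 cell).
 * This layout is an assumption made for illustration. */
struct isa_range {
	uint32_t child[2];   /* [0] = space flag (1 = I/O), [1] = offset */
	uint32_t parent[3];  /* PCI-style address: phys.hi, phys.mid, phys.lo */
	uint32_t size;       /* length of the window in bytes */
};

/* Walk the table looking for an entry in the same address space that
 * covers addr. Returns 0 and fills out[] on success, -1 when nothing
 * matches (the "not found" case). */
static int isa_translate(const struct isa_range *r, size_t n,
			 const uint32_t addr[2], uint32_t out[3])
{
	assert(r != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (r[i].child[0] != addr[0])
			continue;	/* different address space */
		if (addr[1] < r[i].child[1] ||
		    addr[1] - r[i].child[1] >= r[i].size)
			continue;	/* offset outside this window */
		out[0] = r[i].parent[0];
		out[1] = r[i].parent[1];
		out[2] = r[i].parent[2] + (addr[1] - r[i].child[1]);
		return 0;
	}
	return -1;
}
```

With a table holding one I/O-space entry that covers offset 0x900, a
lookup for the RTC address (00000001 00000900) succeeds. With an empty
table, or one that describes only another address space, it returns -1,
which is the kind of situation the "not found" line reports; a
device-tree dump would show which case applies here.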
From markh at osdl.org Wed Feb 1 08:33:00 2006
From: markh at osdl.org (Mark Haverkamp)
Date: Tue, 31 Jan 2006 13:33:00 -0800
Subject: iommu_alloc failure and panic
In-Reply-To: <43DF691E.1020008@emulex.com>
References: <200601310118.k0V1Il7Z018408@falcon30.maxeymade.com>
	<43DF691E.1020008@emulex.com>
Message-ID: <1138743180.15732.15.camel@markh3.pdx.osdl.net>

On Tue, 2006-01-31 at 08:41 -0500, James Smart wrote:
> >> 2) The emulex driver has been prone to problems in the past where it's
> >> been very aggressive at starting DMA operations, and I think it can
> >> be avoided with tuning. What I don't know is if it's because of this,
> >> or simply because of the large number of targets you have. Cc:ing James
> >> Smart.
>
> I don't have data points for the 2.6 kernel, but I can comment on what I
> have seen on the 2.4 kernel.
>
> The issue that I saw on the 2.4 kernel was that the pci dma alloc routine
> was inappropriately allocating from the dma s/g maps. On systems with less
> than 4Gig of memory, or on those with no iommu (emt64), the checks around
> adapter-supported dma masks were off (I'm going to be loose in terms to not
> describe it in detail). The result was, although the adapter could support
> a fully 64bit address and/or although the physical dma address would be under
> 32-bits, the logic forced allocation from the mapped dma pool. On some
> systems, this pool was originally only 16MB. Around 2.4.30, the swiotlb was
> introduced, which reduced the issue, but unfortunately, still never solved the
> allocation logic. It fails less as the swiotlb simply had more space.
> As far as I know, this problem doesn't exist in the 2.6 kernel. I'd have to
> go look at the dma map functions to make sure.
>
> Why was the lpfc driver prone to the dma map exhaustion failures ? Due to the
> default # of commands per lun and max sg segments reported by the driver to
> the scsi midlayer, the scsi mid-layer's preallocation of dma maps for commands
> for each lun, and the fact that our FC configs were usually large, had lots
> of luns, and replicated the resources for each path to the same storage.
>
> Ultimately, what I think is the real issue here is the way the scsi mid-layer
> is preallocating dma maps for the luns. 16000 luns is a huge number.
> Multiply this by a max sg segment count of 64 by the driver, and a number
> between 3 and 30 commands per lun, and you can see the numbers. Scsi does do
> some interesting allocation algorithms once it hits an allocation failure.
> One side effect of this is that it is fairly efficient at allocating the
> bulk of the dma pool.

James,

Thanks for the information. I tried loading the lpfc driver with
lpfc_lun_queue_depth=1 and haven't seen iommu_alloc failures. I'm still
curious why the alloc failures lead to a panic though.

Mark.

>
> -- james s

-- 
Mark Haverkamp

From grundler at parisc-linux.org Wed Feb 1 09:48:52 2006
From: grundler at parisc-linux.org (Grant Grundler)
Date: Tue, 31 Jan 2006 15:48:52 -0700
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131210805.GA19465@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
Message-ID: <20060131224852.GA25579@colo.lackof.org>

On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> Related to the PCI error recovery. I'm not sure how to conceptually
> peg this: one could say that it is the driver for a specific type
> of pci-host bridge, although the code is not currently structured
> as such. Should I try to restructure it as such? If so, I'm not
> clear on how to proceed; I can't say I've clearly seen a kernel
> abstraction of a pci-host bridge device onto which to staple myself.

AFAIK, no pci-host device abstraction exists.
Each arch deals with pci-host bridges as it sees fit.

But access methods to some PCI features are abstracted:
o method access to CFG space
o method to register IRQs
o advertise MMIO/IO Port routing.

Sounds like you want to add another method for error recovery
stats/control.

grant

From James.Smart at Emulex.Com Wed Feb 1 00:41:50 2006
From: James.Smart at Emulex.Com (James Smart)
Date: Tue, 31 Jan 2006 08:41:50 -0500
Subject: iommu_alloc failure and panic
In-Reply-To: <200601310118.k0V1Il7Z018408@falcon30.maxeymade.com>
References: <200601310118.k0V1Il7Z018408@falcon30.maxeymade.com>
Message-ID: <43DF691E.1020008@emulex.com>

>> 2) The emulex driver has been prone to problems in the past where it's
>> been very aggressive at starting DMA operations, and I think it can
>> be avoided with tuning. What I don't know is if it's because of this,
>> or simply because of the large number of targets you have. Cc:ing James
>> Smart.

I don't have data points for the 2.6 kernel, but I can comment on what I
have seen on the 2.4 kernel.

The issue that I saw on the 2.4 kernel was that the pci dma alloc routine
was inappropriately allocating from the dma s/g maps. On systems with less
than 4Gig of memory, or on those with no iommu (emt64), the checks around
adapter-supported dma masks were off (I'm going to be loose in terms to not
describe it in detail). The result was, although the adapter could support
a fully 64bit address and/or although the physical dma address would be under
32-bits, the logic forced allocation from the mapped dma pool. On some
systems, this pool was originally only 16MB. Around 2.4.30, the swiotlb was
introduced, which reduced the issue, but unfortunately, still never solved the
allocation logic. It fails less as the swiotlb simply had more space.
As far as I know, this problem doesn't exist in the 2.6 kernel. I'd have to
go look at the dma map functions to make sure.

Why was the lpfc driver prone to the dma map exhaustion failures ? Due to the
default # of commands per lun and max sg segments reported by the driver to
the scsi midlayer, the scsi mid-layer's preallocation of dma maps for commands
for each lun, and the fact that our FC configs were usually large, had lots
of luns, and replicated the resources for each path to the same storage.

Ultimately, what I think is the real issue here is the way the scsi mid-layer
is preallocating dma maps for the luns. 16000 luns is a huge number.
Multiply this by a max sg segment count of 64 by the driver, and a number
between 3 and 30 commands per lun, and you can see the numbers. Scsi does do
some interesting allocation algorithms once it hits an allocation failure.
One side effect of this is that it is fairly efficient at allocating the
bulk of the dma pool.

-- james s

From olh at suse.de Wed Feb 1 19:26:21 2006
From: olh at suse.de (Olaf Hering)
Date: Wed, 1 Feb 2006 09:26:21 +0100
Subject: [PATCH] ppc64: per cpu data optimisations
In-Reply-To: <20060111021644.GC4767@krispykreme>
References: <20060111021644.GC4767@krispykreme>
Message-ID: <20060201082621.GA29274@suse.de>

On Wed, Jan 11, Anton Blanchard wrote:

Anton,

this causes trouble if you have sles10 installed and if runlevel 6 is
your default runlevel (aka reboot in a loop).
What's wrong with the patch? See
https://bugzilla.novell.com/show_bug.cgi?id=145459 for details.
There are 2 other bugs which are seen also on other archs, will start
looking at them now.
-- 
short story of a lazy sysadmin:
 alias appserv=wotan

From linas at austin.ibm.com Thu Feb 2 08:30:18 2006
From: linas at austin.ibm.com (linas)
Date: Wed, 1 Feb 2006 15:30:18 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131212624.GA10513@kroah.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131212624.GA10513@kroah.com>
Message-ID: <20060201213018.GG14705@austin.ibm.com>

On Tue, Jan 31, 2006 at 01:26:24PM -0800, Greg KH was heard to remark:
> On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> >
> > ... the PCI error recovery. I'm not sure how to conceptually
> > peg this: one could say that it is the driver for a specific type
> > of pci-host bridge, although the code is not currently structured
> > as such. Should I try to restructure it as such? If so, I'm not
> > clear on how to proceed; I can't say I've clearly seen a kernel
> > abstraction of a pci-host bridge device onto which to staple myself.
>
> People have suggested that they create such a driver for a long time.
> Why not just do that?

OK. Let me get this straight, then. Create a generic
struct pci_host_bridge, which encapsulates some (all?) of the
functions that Grant Grundler mentions in his email:

Grant Grundler :
<> Each arch deals with pci-host bridges as it sees fit.
<>
<>But access methods to some PCI features are abstracted:
<>o method access to CFG space
<>o method to register IRQs
<>o advertise MMIO/IO Port routing.

At the risk of over-engineering, maybe there should be a
struct bus_host_bridge, and struct pci_host_bridge would
derive from that?

--linas

p.s. rest of message:

> > I wanted to report a few read-only statistics, and a few writeable
> > parameters:
> >
> > Read-only:
> > -- total number of PCI device resets due to detected errors
> > -- total number of "false positives" (probable errors that weren't)
> > -- some other misc related stats.
>
> These are all "per slot" right?

Right. I'll keep them that way.

> > Writable:
> > -- Number of reset tries to perform before concluding that the
> >    device is hopelessly dead. Resets are disruptive and intensive,
> >    and I don't want to get stuck in an inf loop on a dead device.
>
> Why would you want to change this value? Just pick one at build time.

OK.

--linas

From linas at austin.ibm.com Thu Feb 2 08:35:46 2006
From: linas at austin.ibm.com (linas)
Date: Wed, 1 Feb 2006 15:35:46 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131224852.GA25579@colo.lackof.org>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
Message-ID: <20060201213546.GH14705@austin.ibm.com>

On Tue, Jan 31, 2006 at 03:48:52PM -0700, Grant Grundler was heard to remark:
> On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> > Related to the PCI error recovery. I'm not sure how to conceptually
> > peg this: one could say that it is the driver for a specific type
> > of pci-host bridge, although the code is not currently structured
> > as such. Should I try to restructure it as such? If so, I'm not
> > clear on how to proceed; I can't say I've clearly seen a kernel
> > abstraction of a pci-host bridge device onto which to staple myself.
>
> AFAIK, no pci-host device abstraction exists.
> Each arch deals with pci-host bridges as it sees fit.
>
> But access methods to some PCI features are abstracted:
> o method access to CFG space
> o method to register IRQs
> o advertise MMIO/IO Port routing.
>
> Sounds like you want to add another method for error recovery
> stats/control.

Actually, the "recovery" part is already (mostly) in mainline,
See Documentation/pci-error-recovery.txt

What's hanging out are patches to specific device drivers,
which have been submitted, but haven't been accepted.
Another issue is that there's no implementation at this time for
any arch other than powerpc,
although the latest pci express bridges support this function in principle.

--linas

From linas at austin.ibm.com Thu Feb 2 11:19:06 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Wed, 1 Feb 2006 18:19:06 -0600
Subject: [PATCH 1/2]: PowerPC/PCI Hotplug build break
In-Reply-To: <1138833335.6933.5.camel@sinatra.austin.ibm.com>
References: <1138833335.6933.5.camel@sinatra.austin.ibm.com>
Message-ID: <20060202001906.GA24916@austin.ibm.com>

Please apply ASAP:

Build break: Building PCI hotplug on PowerPC results in a build break,
due to failure to export symbols.

Reported today by Dave Jones :
drivers/pci/hotplug/rpaphp.ko needs unknown symbol pcibios_add_pci_devices

This patch fixes the break in the arch/powerpc tree.
Next patch fixes same problem in drivers/pci tree

Signed-off-by: Linas Vepstas

---
 pci_dlpar.c | 3 +++
 1 files changed, 3 insertions(+)

Index: linux-2.6.16-rc1-git5/arch/powerpc/platforms/pseries/pci_dlpar.c
===================================================================
--- linux-2.6.16-rc1-git5.orig/arch/powerpc/platforms/pseries/pci_dlpar.c	2006-02-01 18:06:12.380829512 -0600
+++ linux-2.6.16-rc1-git5/arch/powerpc/platforms/pseries/pci_dlpar.c	2006-02-01 18:11:41.040673750 -0600
@@ -58,6 +58,7 @@
 
 	return find_bus_among_children(pdn->phb->bus, dn);
 }
+EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
 
 /**
  * pcibios_remove_pci_devices - remove all devices under this bus
@@ -106,6 +107,7 @@
 		}
 	}
 }
+EXPORT_SYMBOL_GPL(pcibios_fixup_new_pci_devices);
 
 static int
 pcibios_pci_config_bridge(struct pci_dev *dev)
@@ -172,3 +174,4 @@
 		pcibios_pci_config_bridge(dev);
 	}
 }
+EXPORT_SYMBOL_GPL(pcibios_add_pci_devices);

From linas at austin.ibm.com Thu Feb 2 11:21:09 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Wed, 1 Feb 2006 18:21:09 -0600
Subject: [PATCH 2/2]: PowerPC/PCI Hotplug build break
Message-ID: <20060202002109.GB24916@austin.ibm.com>

Please apply ASAP:

Build break: Building PCI hotplug on PowerPC results in a build break,
due to failure to export symbols.

Reported today by Dave Jones :
drivers/pci/hotplug/rpaphp.ko needs unknown symbol pcibios_add_pci_devices

This patch fixes same problem in drivers/pci tree
Previous patch fixes the break in the arch/powerpc tree.

Signed-off-by: Linas Vepstas

---
 rpaphp_slot.c | 1 +
 1 files changed, 1 insertion(+)

Index: linux-2.6.16-rc1-git5/drivers/pci/hotplug/rpaphp_slot.c
===================================================================
--- linux-2.6.16-rc1-git5.orig/drivers/pci/hotplug/rpaphp_slot.c	2006-02-01 18:06:06.022722369 -0600
+++ linux-2.6.16-rc1-git5/drivers/pci/hotplug/rpaphp_slot.c	2006-02-01 18:11:46.049970222 -0600
@@ -159,6 +159,7 @@
 	dbg("%s - Exit: rc[%d]\n", __FUNCTION__, retval);
 	return retval;
 }
+EXPORT_SYMBOL_GPL(rpaphp_deregister_slot);
 
 int rpaphp_register_slot(struct slot *slot)
 {

From grundler at parisc-linux.org Thu Feb 2 16:52:43 2006
From: grundler at parisc-linux.org (Grant Grundler)
Date: Wed, 1 Feb 2006 22:52:43 -0700
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060201213546.GH14705@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
	<20060201213546.GH14705@austin.ibm.com>
Message-ID: <20060202055243.GA12588@colo.lackof.org>

On Wed, Feb 01, 2006 at 03:35:46PM -0600, linas wrote:
> > Sounds like you want to add another method for error recovery
> > stats/control.
>
> Actually, the "recovery" part is already (mostly) in mainline,
> See Documentation/pci-error-recovery.txt

Yes - I've reviewed a few of the times you submitted it.

What I meant was, you want to formalize error recovery methods
and make it a peer to the other resources access methods I listed.

...
> Another issue is that there's no implementation at this time for
> any arch other than powerpc,

Well, some ia64 chipsets have some limited support but it's
really up to the respective companies to drive that.

> although the latest pci express bridges support this function in principle.

"Nguyen, Tom L" has proposed patches
to support PCI-e AER (Advanced Error Reporting):
	http://lkml.org/lkml/2005/3/11/269

I've cc'd him in case he has an interest in resurrecting
those patches and adapting them to the current framework
(and vice versa).

grant

From linas at austin.ibm.com Fri Feb 3 03:36:36 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Thu, 2 Feb 2006 10:36:36 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060202055243.GA12588@colo.lackof.org>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
	<20060201213546.GH14705@austin.ibm.com>
	<20060202055243.GA12588@colo.lackof.org>
Message-ID: <20060202163636.GD24916@austin.ibm.com>

On Wed, Feb 01, 2006 at 10:52:43PM -0700, Grant Grundler was heard to remark:
> On Wed, Feb 01, 2006 at 03:35:46PM -0600, linas wrote:
> > > Sounds like you want to add another method for error recovery
> > > stats/control.
> >
> > Actually, the "recovery" part is already (mostly) in mainline,
> > See Documentation/pci-error-recovery.txt
>
> What I meant was, you want to formalize error recovery methods
> and make it a peer to the other resources access methods I listed.

Hmm. Not sure what you mean by "a peer".

pci config-space i/o is done through callbacks in the pci bus->ops
structure. The PCI error recovery is done via callbacks in the pci dev
structure. Was there something you'd like to see done differently?

Given GregKH's remarks, it sounded like there was some interest in
a "struct bus_host_bridge" abstraction, and I'd be willing
to take a shot at that, provided there is general interest and
general agreement. I'm not quite sure what such a struct might
contain, just yet, I'm just imagining it might be non-empty.

> "Nguyen, Tom L" has proposed patches
> to support PCI-e AER (Advanced Error Reporting):

I kept looking at AER, and could not figure out what to do
with it.

--linas

From jfaslist at yahoo.fr Fri Feb 3 04:03:06 2006
From: jfaslist at yahoo.fr (jfaslist)
Date: Thu, 02 Feb 2006 18:03:06 +0100
Subject: Maple freezing on PCI Target-Abort
Message-ID: <43E23B4A.4020402@yahoo.fr>

Hi,

We have designed our own IBM970fx motherboard which is an (almost) clone
of the IBM Maple reference kit.
We are seeing that whenever a PIO read PCI cycle bound to the PCI bus
that is across the AMD8111 is ended w/ a target-abort, the whole system
freezes. The device signaling the TA is a PCI-VME bridge. It does so as
the address passed is invalid.
When the system hangs, using the service processor, I can access some
AMD8111, CPC925 registers from which I can draw the following conclusions:
1- The AMD8111 secondary status tells me the AMD8111 got a TA
2- The CPC925 status/command register (0cf8070010) tells me that the
TA error was forwarded to the CPC925.
3- The CPC925 APIEXCP register tells me that a DERR exception was signaled.

From what I can read on the CPC925 and IBM970 cpu user manual, the DERR
is the bus error that is returned to the CPU by the CPC925 to let it
know that the cycle ended w/ an error.

I have the following questions:
-What exception vector is taking care of a DERR excp? From what I can
see it seems to be the "machine check" vector. But that seems a bit
drastic to me. After all this is just a PCI target abort.
-I expect that the normal behavior would be for the kernel to send a signal termination to the user process which caused the PIO READ PCI cycle (from a previously mmap()'ed VMA address). Is it doable on this platform? Since a READ operation is coupled by nature, I think this is the only acceptable way. I have tried to set the MSR[RI] bit before doing the PCI cycle, but it didn't change anything. Also on our design we disconnect the CPC925 checkstop pin from the 970 machine check pin.(see page 39 of cpc925 user's manual). So a DERR shouldn't cause a machine check I would think. I realize that these questions are very H/W related but couldn't find the answer in IBM doc. Thanks for the help, -- Best regards, _______________________________________ jean-francois simon - themis computer 5, rue irene joliot curie 38330 eybens - france +33 (0)4 76 14 77 85 From grundler at parisc-linux.org Fri Feb 3 06:39:02 2006 From: grundler at parisc-linux.org (Grant Grundler) Date: Thu, 2 Feb 2006 12:39:02 -0700 Subject: creating PCI-related sysfs entries In-Reply-To: <20060202163636.GD24916@austin.ibm.com> References: <20060131202214.GZ19465@austin.ibm.com> <20060131203456.GA23819@kroah.com> <20060131210805.GA19465@austin.ibm.com> <20060131224852.GA25579@colo.lackof.org> <20060201213546.GH14705@austin.ibm.com> <20060202055243.GA12588@colo.lackof.org> <20060202163636.GD24916@austin.ibm.com> Message-ID: <20060202193902.GA5424@colo.lackof.org> On Thu, Feb 02, 2006 at 10:36:36AM -0600, Linas Vepstas wrote: > Hmm. Not sure what you mean by "a peer". Just at the same level of the architecture - i.e. a server like the others. > pci config-space i/o is done through callbacks in the pci bus->ops > structure.
The PCI error recovery is done via callbacks in the pci dev > structure. Was there something you'd like to see done differently? No. Each set of callbacks serves a different purpose. The services/resources at the bus level are different from those at the device level. My guess is error handling/containment can abstract at the "bus" level since we can't always guarantee "one device per slot" (think multifunction devices). > Given GregKH's remarks, it sounded like there was some interest in > a "struct bus_host_bridge" abstraction, and I'd be willing > to take a shot at that, provided there is general interest and > general agreement. I'm not quite sure what such a struct might > contain, just yet, I'm just imagining it might be non-empty. Yes, I agree don't have a better idea other than what I already pointed out. > I kept looking at AER, and could not figure out what to do > with it. I haven't either - other folks in HP "own" that. grant From linas at austin.ibm.com Fri Feb 3 07:46:24 2006 From: linas at austin.ibm.com (Linas Vepstas) Date: Thu, 2 Feb 2006 14:46:24 -0600 Subject: creating PCI-related sysfs entries In-Reply-To: <20060202193902.GA5424@colo.lackof.org> References: <20060131202214.GZ19465@austin.ibm.com> <20060131203456.GA23819@kroah.com> <20060131210805.GA19465@austin.ibm.com> <20060131224852.GA25579@colo.lackof.org> <20060201213546.GH14705@austin.ibm.com> <20060202055243.GA12588@colo.lackof.org> <20060202163636.GD24916@austin.ibm.com> <20060202193902.GA5424@colo.lackof.org> Message-ID: <20060202204624.GM24916@austin.ibm.com> On Thu, Feb 02, 2006 at 12:39:02PM -0700, Grant Grundler was heard to remark: > > My guess is error handling/containment can abstract at the "bus" level > since we can't always guarantee "one device per slot" (think > multifunction devices). :-) Yes, testing with multi-function cards exposed bugs, but the code should work fine with them. 
In particular, the design allows multi-function devices to "vote" how they want to be reset, with the dumbest voter getting their way. The bus disconnect is reported to all functions on all affected cards/slots. This allows all instances of a device driver to react appropriately. However, for card setup/init, typically, you want to have only one driver instance do that. You'll notice in the sym53cxx2 patch I just sent, there's a + if (PCI_FUNC(pdev->devfn) == 0) + sym_reset_scsi_bus(np, 0); so that the other instances don't fall over each other resetting. There's similar code in the e100, e1000 and ixgb drivers; I tested multi-function versions of these. (not sure about ixgb). > Yes, I agree don't have a better idea other than what I already > pointed out. Hmm. well, I may have lost the thread of what that was. --linas From geoffrey.levand at am.sony.com Fri Feb 3 09:47:12 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Thu, 02 Feb 2006 14:47:12 -0800 Subject: [PATCH] spufs split off platform code In-Reply-To: <200601280457.08170.arnd@arndb.de> References: <200601280457.08170.arnd@arndb.de> Message-ID: <43E28BF0.8060700@am.sony.com> Arnd Bergmann wrote: > I guess that the "spc" device type can be removed now, I don't think > that any systems are left that have not been converted. > > Do you have "spe" type nodes at all? Is there anything that you need to > do different about them? Yes, spc can be removed. I think we can arrange it so some of the create_spu code can go back to generic code. Still investigating... >>+void spu_free_irqs(struct spu *spu) >>+{ >>+ int irq_base; >>+ >>+ if(!spu->priv_data) { >>+ pr_debug("null priv_data in %p\n", spu); >>+ return; >>+ } > > > It may be just me, but I don't like this bit of coding style: > You are trying to deal with priv_data being either allocated > or not allocated at this point.
Better make sure that you have > freed the structure before returning an error from any function > that would allocate it on success. Then get rid of the check > here. Yes, it really doesn't add any value does it. >>+struct spu_priv_data; >>+struct spu_phys { >>+ unsigned long addr; >>+ unsigned long size; >>+}; >> >> struct spu { >>+ struct spu_priv_data *priv_data; /* opaque */ >> char *name; > > > If you want priv_data to point to different types of data structures > depending on the context, I find it easier to understand if there is > a simple void pointer and the actual struct definitions have different > type names. Yes, a good tip. I'm looking into pushing these differences down into the lower level platform code. Hopefully it will simplify these parts. -Geoff From linas at austin.ibm.com Fri Feb 3 11:06:02 2006 From: linas at austin.ibm.com (Linas Vepstas) Date: Thu, 2 Feb 2006 18:06:02 -0600 Subject: [PATCH]: Documentation: Updated PCI Error Recovery Message-ID: <20060203000602.GQ24916@austin.ibm.com> I'm not sure who I'm addressing this patch to: Linus, maybe? Please apply. Fingers crossed, I hope this may make it into 2.6.16. --linas This patch is a cleanup/restructuring/clarification of the PCI error handling doc. It should look rather professional at this point.
Signed-off-by: Linas Vepstas -- pci-error-recovery.txt | 462 ++++++++++++++++++++++++++++++++----------------- 1 files changed, 306 insertions(+), 156 deletions(-) Index: linux-2.6.16-rc1-git5/Documentation/pci-error-recovery.txt =================================================================== --- linux-2.6.16-rc1-git5.orig/Documentation/pci-error-recovery.txt 2006-02-01 17:09:01.000000000 -0600 +++ linux-2.6.16-rc1-git5/Documentation/pci-error-recovery.txt 2006-02-02 18:04:57.714942210 -0600 @@ -1,246 +1,396 @@ PCI Error Recovery ------------------ - May 31, 2005 + February 2, 2006 - Current document maintainer: - Linas Vepstas + Current document maintainer: + Linas Vepstas -Some PCI bus controllers are able to detect certain "hard" PCI errors -on the bus, such as parity errors on the data and address busses, as -well as SERR and PERR errors. These chipsets are then able to disable -I/O to/from the affected device, so that, for example, a bad DMA -address doesn't end up corrupting system memory. These same chipsets -are also able to reset the affected PCI device, and return it to -working condition. This document describes a generic API form -performing error recovery. - -The core idea is that after a PCI error has been detected, there must -be a way for the kernel to coordinate with all affected device drivers -so that the pci card can be made operational again, possibly after -performing a full electrical #RST of the PCI card. The API below -provides a generic API for device drivers to be notified of PCI -errors, and to be notified of, and respond to, a reset sequence. - -Preliminary sketch of API, cut-n-pasted-n-modified email from -Ben Herrenschmidt, circa 5 april 2005 +Many PCI bus controllers are able to detect a variety of hardware +PCI errors on the bus, such as parity errors on the data and address +busses, as well as SERR and PERR errors. 
Some of the more advanced +chipsets are able to deal with these errors; these include PCI-E chipsets, +and the PCI-host bridges found on IBM Power4 and Power5-based pSeries +boxes. A typical action taken is to disconnect the affected device, +halting all I/O to it. The goal of a disconnection is to avoid system +corruption; for example, to halt system memory corruption due to DMA's +to "wild" addresses. Typically, a reconnection mechanism is also +offered, so that the affected PCI device(s) are reset and put back +into working condition. The reset phase requires coordination +between the affected device drivers and the PCI controller chip. +This document describes a generic API for notifying device drivers +of a bus disconnection, and then performing error recovery. +This API is currently implemented in the 2.6.16 and later kernels. + +Reporting and recovery is performed in several steps. First, when +a PCI hardware error has resulted in a bus disconnect, that event +is reported as soon as possible to all affected device drivers, +including multiple instances of a device driver on multi-function +cards. This allows device drivers to avoid deadlocking in spinloops, +waiting for some i/o-space register to change, when it never will. +It also gives the drivers a chance to defer incoming I/O as +needed. + +Next, recovery is performed in several stages. Most of the complexity +is forced by the need to handle multi-function devices, that is, +devices that have multiple device drivers associated with them. +In the first stage, each driver is allowed to indicate what type +of reset it desires, the choices being a simple re-enabling of I/O +or requesting a hard reset (a full electrical #RST of the PCI card). +If any driver requests a full reset, that is what will be done. + +After a full reset and/or a re-enabling of I/O, all drivers are +again notified, so that they may then perform any device setup/config +that may be required. 
After these have all completed, a final +"resume normal operations" event is sent out. + +The biggest reason for choosing a kernel-based implementation rather +than a user-space implementation was the need to deal with bus +disconnects of PCI devices attached to storage media, and, in particular, +disconnects from devices holding the root file system. If the root +file system is disconnected, a user-space mechanism would have to go +through a large number of contortions to complete recovery. Almost all +of the current Linux file systems are not tolerant of disconnection +from/reconnection to their underlying block device. By contrast, +bus errors are easy to manage in the device driver. Indeed, most +device drivers already handle very similar recovery procedures; +for example, the SCSI-generic layer already provides significant +mechanisms for dealing with SCSI bus errors and SCSI bus resets. + + +Detailed Design +--------------- +Design and implementation details below, based on a chain of +public email discussions with Ben Herrenschmidt, circa 5 April 2005. The error recovery API support is exposed to the driver in the form of a structure of function pointers pointed to by a new field in struct -pci_driver. The absence of this pointer in pci_driver denotes an -"non-aware" driver, behaviour on these is platform dependant. -Platforms like ppc64 can try to simulate pci hotplug remove/add. - -The definition of "pci_error_token" is not covered here. It is based on -Seto's work on the synchronous error detection. We still need to define -functions for extracting infos out of an opaque error token. This is -separate from this API. +pci_driver. A driver that fails to provide the structure is "non-aware", +and the actual recovery steps taken are platform dependent. The +arch/powerpc implementation will simulate a PCI hotplug remove/add. 
This structure has the form: - struct pci_error_handlers { - int (*error_detected)(struct pci_dev *dev, pci_error_token error); + int (*error_detected)(struct pci_dev *dev, enum pci_channel_state); int (*mmio_enabled)(struct pci_dev *dev); - int (*resume)(struct pci_dev *dev); int (*link_reset)(struct pci_dev *dev); int (*slot_reset)(struct pci_dev *dev); + void (*resume)(struct pci_dev *dev); +}; + +The possible channel states are: +enum pci_channel_state { + pci_channel_io_normal, /* I/O channel is in normal state */ + pci_channel_io_frozen, /* I/O to channel is blocked */ + pci_channel_io_perm_failure, /* PCI card is dead */ +}; + +Possible return values are: +enum pci_ers_result { + PCI_ERS_RESULT_NONE, /* no result/none/not supported in device driver */ + PCI_ERS_RESULT_CAN_RECOVER, /* Device driver can recover without slot reset */ + PCI_ERS_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */ + PCI_ERS_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */ + PCI_ERS_RESULT_RECOVERED, /* Device driver is fully recovered and operational */ }; -A driver doesn't have to implement all of these callbacks. The -only mandatory one is error_detected(). If a callback is not -implemented, the corresponding feature is considered unsupported. -For example, if mmio_enabled() and resume() aren't there, then the -driver is assumed as not doing any direct recovery and requires +A driver does not have to implement all of these callbacks; however, +if it implements any, it must implement error_detected(). If a callback +is not implemented, the corresponding feature is considered unsupported. +For example, if mmio_enabled() and resume() aren't there, then it +is assumed that the driver is not doing any direct recovery and requires a reset. 
If link_reset() is not implemented, the card is assumed as -not caring about link resets, in which case, if recover is supported, -the core can try recover (but not slot_reset() unless it really did -reset the slot). If slot_reset() is not supported, link_reset() can -be called instead on a slot reset. - -At first, the call will always be : - - 1) error_detected() - - Error detected. This is sent once after an error has been detected. At -this point, the device might not be accessible anymore depending on the -platform (the slot will be isolated on ppc64). The driver may already -have "noticed" the error because of a failing IO, but this is the proper -"synchronisation point", that is, it gives a chance to the driver to -cleanup, waiting for pending stuff (timers, whatever, etc...) to -complete; it can take semaphores, schedule, etc... everything but touch -the device. Within this function and after it returns, the driver +not caring about link resets. Typically a driver will want to know about +a slot_reset(). + +The actual steps taken by a platform to recover from a PCI error +event will be platform-dependent, but will follow the general +sequence described below. + +STEP 0: Error Event +------------------- +A PCI bus error is detected by the PCI hardware. On powerpc, the slot +is isolated, in that all I/O is blocked: all reads return 0xffffffff, +all writes are ignored. + + +STEP 1: Notification +-------------------- +Platform calls the error_detected() callback on every instance of +every driver affected by the error. + +At this point, the device might not be accessible anymore, depending on +the platform (the slot will be isolated on powerpc). The driver may +already have "noticed" the error because of a failing I/O, but this +is the proper "synchronization point", that is, it gives the driver +a chance to clean up, waiting for pending stuff (timers, whatever, etc...) +to complete; it can take semaphores, schedule, etc... everything but +touch the device.
Within this function and after it returns, the driver shouldn't do any new IOs. Called in task context. This is sort of a "quiesce" point. See note about interrupts at the end of this doc. - Result codes: - - PCIERR_RESULT_CAN_RECOVER: - Driever returns this if it thinks it might be able to recover +All drivers participating in this system must implement this call. +The driver must return one of the following result codes: + - PCI_ERS_RESULT_CAN_RECOVER: + Driver returns this if it thinks it might be able to recover the HW by just banging IOs or if it wants to be given - a chance to extract some diagnostic informations (see - below). - - PCIERR_RESULT_NEED_RESET: - Driver returns this if it thinks it can't recover unless the - slot is reset. - - PCIERR_RESULT_DISCONNECT: - Return this if driver thinks it won't recover at all, - (this will detach the driver ? or just leave it - dangling ? to be decided) - -So at this point, we have called error_detected() for all drivers -on the segment that had the error. On ppc64, the slot is isolated. What -happens now typically depends on the result from the drivers. If all -drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would -re-enable IOs on the slot (or do nothing special if the platform doesn't -isolate slots) and call 2). If not and we can reset slots, we go to 4), -if neither, we have a dead slot. If it's an hotplug slot, we might -"simulate" reset by triggering HW unplug/replug though. + a chance to extract some diagnostic information (see + mmio_enable, below). + - PCI_ERS_RESULT_NEED_RESET: + Driver returns this if it can't recover without a hard + slot reset. + - PCI_ERS_RESULT_DISCONNECT: + Driver returns this if it doesn't want to recover at all. + +The next step taken will depend on the result codes returned by the +drivers. 
+ +If all drivers on the segment/slot return PCI_ERS_RESULT_CAN_RECOVER, +then the platform should re-enable IOs on the slot (or do nothing in +particular, if the platform doesn't isolate slots), and recovery +proceeds to STEP 2 (MMIO Enabled). + +If any driver requested a slot reset (by returning PCI_ERS_RESULT_NEED_RESET), +then recovery proceeds to STEP 4 (Slot Reset). + +If the platform is unable to recover the slot, the next step +is STEP 6 (Permanent Failure). ->>> Current ppc64 implementation assumes that a device driver will ->>> *not* schedule or semaphore in this routine; the current ppc64 +>>> The current powerpc implementation assumes that a device driver will +>>> *not* schedule or semaphore in this routine; the current powerpc >>> implementation uses one kernel thread to notify all devices; ->>> thus, of one device sleeps/schedules, all devices are affected. +>>> thus, if one device sleeps/schedules, all devices are affected. >>> Doing better requires complex multi-threaded logic in the error >>> recovery implementation (e.g. waiting for all notification threads >>> to "join" before proceeding with recovery.) This seems excessively >>> complex and not worth implementing. ->>> The current ppc64 implementation doesn't much care if the device ->>> attempts i/o at this point, or not. I/O's will fail, returning +>>> The current powerpc implementation doesn't much care if the device +>>> attempts I/O at this point, or not. I/O's will fail, returning >>> a value of 0xff on read, and writes will be dropped. If the device >>> driver attempts more than 10K I/O's to a frozen adapter, it will >>> assume that the device driver has gone into an infinite loop, and ->>> it will panic the the kernel. +>>> it will panic the kernel. There doesn't seem to be any other +>>> way of stopping a device driver that insists on spinning on I/O.
- 2) mmio_enabled() +STEP 2: MMIO Enabled +-------------------- +The platform re-enables MMIO to the device (but typically not the +DMA), and then calls the mmio_enabled() callback on all affected +device drivers. - This is the "early recovery" call. IOs are allowed again, but DMA is +This is the "early recovery" call. IOs are allowed again, but DMA is not (hrm... to be discussed, I prefer not), with some restrictions. This is NOT a callback for the driver to start operations again, only to peek/poke at the device, extract diagnostic information, if any, and eventually do things like trigger a device local reset or some such, -but not restart operations. This is sent if all drivers on a segment -agree that they can try to recover and no automatic link reset was -performed by the HW. If the platform can't just re-enable IOs without -a slot reset or a link reset, it doesn't call this callback and goes -directly to 3) or 4). All IOs should be done _synchronously_ from -within this callback, errors triggered by them will be returned via -the normal pci_check_whatever() api, no new error_detected() callback -will be issued due to an error happening here. However, such an error -might cause IOs to be re-blocked for the whole segment, and thus -invalidate the recovery that other devices on the same segment might -have done, forcing the whole segment into one of the next states, -that is link reset or slot reset. +but not restart operations. This callback is made if all drivers on +a segment agree that they can try to recover and if no automatic link reset +was performed by the HW.
If the platform can't just re-enable IOs without +a slot reset or a link reset, it won't call this callback, and instead +will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset). + +>>> The following is proposed; no platform implements this yet: +>>> Proposal: All I/O's should be done _synchronously_ from within +>>> this callback, errors triggered by them will be returned via +>>> the normal pci_check_whatever() API, no new error_detected() +>>> callback will be issued due to an error happening here. However, +>>> such an error might cause IOs to be re-blocked for the whole +>>> segment, and thus invalidate the recovery that other devices +>>> on the same segment might have done, forcing the whole segment +>>> into one of the next states, that is, link reset or slot reset. - Result codes: - - PCIERR_RESULT_RECOVERED +The driver should return one of the following result codes: + - PCI_ERS_RESULT_RECOVERED Driver returns this if it thinks the device is fully - functionnal and thinks it is ready to start + functional and thinks it is ready to start normal driver operations again. There is no guarantee that the driver will actually be allowed to proceed, as another driver on the same segment might have failed and thus triggered a slot reset on platforms that support it. - - PCIERR_RESULT_NEED_RESET + - PCI_ERS_RESULT_NEED_RESET Driver returns this if it thinks the device is not recoverable in it's current state and it needs a slot reset to proceed. - - PCIERR_RESULT_DISCONNECT + - PCI_ERS_RESULT_DISCONNECT Same as above. Total failure, no recovery even after reset driver dead. (To be defined more precisely) ->>> The current ppc64 implementation does not implement this callback. - - 3) link_reset() - - This is called after the link has been reset. This is typically -a PCI Express specific state at this point and is done whenever a -non-fatal error has been detected that can be "solved" by resetting -the link.
This call informs the driver of the reset and the driver -should check if the device appears to be in working condition. -This function acts a bit like 2) mmio_enabled(), in that the driver -is not supposed to restart normal driver I/O operations right away. -Instead, it should just "probe" the device to check it's recoverability -status. If all is right, then the core will call resume() once all -drivers have ack'd link_reset(). +The next step taken depends on the results returned by the drivers. +If all drivers returned PCI_ERS_RESULT_RECOVERED, then the platform +proceeds to either STEP 3 (Link Reset) or to STEP 5 (Resume Operations). + +If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform +proceeds to STEP 4 (Slot Reset). + +>>> The current powerpc implementation does not implement this callback. + + +STEP 3: Link Reset +------------------ +The platform resets the link, and then calls the link_reset() callback +on all affected device drivers. This is a PCI-Express specific state +and is done whenever a non-fatal error has been detected that can be +"solved" by resetting the link. This call informs the driver of the +reset and the driver should check to see if the device appears to be +in working condition. + +The driver is not supposed to restart normal driver I/O operations +at this point. It should limit itself to "probing" the device to +check its recoverability status. If all is right, then the platform +will call resume() once all drivers have ack'd link_reset(). Result codes: - (identical to mmio_enabled) + (identical to STEP 2 (MMIO Enabled)) ->>> The current ppc64 implementation does not implement this callback. +The platform then proceeds to either STEP 4 (Slot Reset) or STEP 5 +(Resume Operations). - 4) slot_reset() +>>> The current powerpc implementation does not implement this callback. - This is called after the slot has been soft or hard reset by the -platform.
A soft reset consists of asserting the adapter #RST line -and then restoring the PCI BARs and PCI configuration header. If the -platform supports PCI hotplug, then it might instead perform a hard -reset by toggling power on the slot off/on. This call gives drivers -the chance to re-initialize the hardware (re-download firmware, etc.), -but drivers shouldn't restart normal I/O processing operations at -this point. (See note about interrupts; interrupts aren't guaranteed -to be delivered until the resume() callback has been called). If all -device drivers report success on this callback, the patform will call -resume() to complete the error handling and let the driver restart -normal I/O processing. + +STEP 4: Slot Reset +------------------ +The platform performs a soft or hard reset of the device, and then +calls the slot_reset() callback. + +A soft reset consists of asserting the adapter #RST line and then +restoring the PCI BARs and PCI configuration header to a state +that is equivalent to what it would be after a fresh system +power-on followed by power-on BIOS/system firmware initialization. +If the platform supports PCI hotplug, then the reset might be +performed by toggling the slot electrical power off/on. + +It is important for the platform to restore the PCI config space +to the "fresh poweron" state, rather than the "last state". After +a slot reset, the device driver will almost always use its standard +device initialization routines, and an unusual config space setup +may result in hung devices, kernel panics, or silent data corruption. + +This call gives drivers the chance to re-initialize the hardware +(re-download firmware, etc.). At this point, the driver may assume +that the card is in a fresh state and is fully functional. In +particular, interrupt generation should work normally. + +Drivers should not yet restart normal I/O processing operations +at this point.
If all device drivers report success on this +callback, the platform will call resume() to complete the sequence, +and let the driver restart normal I/O processing. A driver can still return a critical failure for this function if it can't get the device operational after reset. If the platform -previously tried a soft reset, it migh now try a hard reset (power +previously tried a soft reset, it might now try a hard reset (power cycle) and then call slot_reset() again. It the device still can't be recovered, there is nothing more that can be done; the platform will typically report a "permanent failure" in such a case. The device will be considered "dead" in this case. +Drivers for multi-function cards will need to coordinate among +themselves as to which driver instance will perform any "one-shot" +or global device initialization. For example, the Symbios sym53cxx2 +driver performs device init only from PCI function 0: + ++ if (PCI_FUNC(pdev->devfn) == 0) ++ sym_reset_scsi_bus(np, 0); + Result codes: - - PCIERR_RESULT_DISCONNECT + - PCI_ERS_RESULT_DISCONNECT Same as above. ->>> The current ppc64 implementation does not try a power-cycle reset ->>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should. - - 5) resume() +Platform proceeds either to STEP 5 (Resume Operations) or STEP 6 (Permanent +Failure). - This is called if all drivers on the segment have returned -PCIERR_RESULT_RECOVERED from one of the 3 prevous callbacks. -That basically tells the driver to restart activity, tht everything -is back and running. No result code is taken into account here. If -a new error happens, it will restart a new error handling process. - -That's it. I think this covers all the possibilities. The way those -callbacks are called is platform policy. 
A platform with no slot reset -capability for example may want to just "ignore" drivers that can't +>>> The current powerpc implementation does not currently try a +>>> power-cycle reset if the driver returned PCI_ERS_RESULT_DISCONNECT. +>>> However, it probably should. + + +STEP 5: Resume Operations +------------------------- +The platform will call the resume() callback on all affected device +drivers if all drivers on the segment have returned +PCI_ERS_RESULT_RECOVERED from one of the 3 previous callbacks. +The goal of this callback is to tell the driver to restart activity, +that everything is back and running. This callback does not return +a result code. + +At this point, if a new error happens, the platform will restart +a new error recovery sequence. + +STEP 6: Permanent Failure +------------------------- +A "permanent failure" has occurred, and the platform cannot recover +the device. The platform will call error_detected() with a +pci_channel_state value of pci_channel_io_perm_failure. + +The device driver should, at this point, assume the worst. It should +cancel all pending I/O, refuse all new I/O, returning -EIO to +higher layers. The device driver should then clean up all of its +memory and remove itself from kernel operations, much as it would +during system shutdown. + +The platform will typically notify the system operator of the +permanent failure in some way. If the device is hotplug-capable, +the operator will probably want to remove and replace the device. +Note, however, not all failures are truly "permanent". Some are +caused by over-heating, some by a poorly seated card. Many +PCI error events are caused by software bugs, e.g. DMA's to +wild addresses or bogus split transactions due to programming +errors. See the discussion in powerpc/eeh-pci-error-recovery.txt +for additional detail on real-life experience of the causes of +software errors. 
+ + +Conclusion; General Remarks +--------------------------- +The way those callbacks are called is platform policy. A platform with +no slot reset capability may want to just "ignore" drivers that can't recover (disconnect them) and try to let other cards on the same segment recover. Keep in mind that in most real life cases, though, there will be only one driver per segment. -Now, there is a note about interrupts. If you get an interrupt and your +Now, a note about interrupts. If you get an interrupt and your device is dead or has been isolated, there is a problem :) - -After much thinking, I decided to leave that to the platform. That is, -the recovery API only precies that: +The current policy is to turn this into a platform policy. +That is, the recovery API only requires that: - There is no guarantee that interrupt delivery can proceed from any device on the segment starting from the error detection and until the -restart callback is sent, at which point interrupts are expected to be +resume callback is sent, at which point interrupts are expected to be fully operational. - - There is no guarantee that interrupt delivery is stopped, that is, ad -river that gets an interrupts after detecting an error, or that detects -and error within the interrupt handler such that it prevents proper + - There is no guarantee that interrupt delivery is stopped, that is, +a driver that gets an interrupt after detecting an error, or that detects +an error within the interrupt handler such that it prevents proper ack'ing of the interrupt (and thus removal of the source) should just -return IRQ_NOTHANDLED. It's up to the platform to deal with taht -condition, typically by masking the irq source during the duration of +return IRQ_NOTHANDLED. It's up to the platform to deal with that +condition, typically by masking the IRQ source during the duration of the error handling. 
It is expected that the platform "knows" which interrupts are routed to error-management capable slots and can deal -with temporarily disabling that irq number during error processing (this +with temporarily disabling that IRQ number during error processing (this isn't terribly complex). That means some IRQ latency for other devices sharing the interrupt, but there is simply no other way. High end platforms aren't supposed to share interrupts between many devices anyway :) +>>> Implementation details for the powerpc platform are discussed in +>>> the file Documentation/powerpc/eeh-pci-error-recovery.txt + +>>> As of this writing, there are six device drivers with patches +>>> implementing error recovery. Not all of these patches are in +>>> mainline yet. These may be used as "examples": +>>> +>>> drivers/scsi/ipr.c +>>> drivers/scsi/sym53cxx_2 +>>> drivers/next/e100.c +>>> drivers/net/e1000 +>>> drivers/net/ixgb +>>> drivers/net/s2io.c -Revised: 31 May 2005 Linas Vepstas +The End +------- From benh at kernel.crashing.org Fri Feb 3 12:42:37 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 03 Feb 2006 12:42:37 +1100 Subject: Maple freezing on PCI Target-Abort In-Reply-To: <43E23B4A.4020402@yahoo.fr> References: <43E23B4A.4020402@yahoo.fr> Message-ID: <1138930958.4934.102.camel@localhost.localdomain> > -What exception vector is taking care of a DERR excp? From what I can > see it seems to be the "machine check" vector. But that seems a bit > drastic to me. After all this is just a PCI target abort. I would expect a machine check yes. > -I expect that the normal behavior would be for the kernel to send a > signal termination to the user process which caused the PIO READ PCI > cycle (from a previously mmap()'ed VMA address). Is it doable on this > platform? Since a READ operation is coupled by nature, I think this is > the only acceptable way. It should SIGBUS except if the problem occurred in the kernel. 
I don't know why it's not doing so; maybe you are hitting an issue/errata or misconfiguration of the 925? > I have tried to set the MSR[RI] bit before doing the PCI cycle, but it > didn't change anything. Also on our design we disconnect the > CPC925 checkstop pin from the 970 machine check pin (see page 39 of > cpc925 user's manual). So a DERR shouldn't cause a machine check I would > think. > > I realize that these questions are very H/W related but couldn't find > the answer in IBM doc. From benh at kernel.crashing.org Fri Feb 3 12:45:03 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 03 Feb 2006 12:45:03 +1100 Subject: creating PCI-related sysfs entries In-Reply-To: <20060131210805.GA19465@austin.ibm.com> References: <20060131202214.GZ19465@austin.ibm.com> <20060131203456.GA23819@kroah.com> <20060131210805.GA19465@austin.ibm.com> Message-ID: <1138931103.4934.105.camel@localhost.localdomain> On Tue, 2006-01-31 at 15:08 -0600, linas wrote: > Hmm. But these slots are not hot-pluggable; should the arch > use the hotplug infrastructure even on those slots? If those are EEH slots, they should probably be treated as hotpluggable... after all, didn't we discuss back then that one strategy we could use for recovery is simulating an unplug/replug to the driver along with a slot hard reset? 
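Editorial aside on the recovery discussion: the PCI error recovery documentation patch earlier in this archive says that the platform calls resume() only if all drivers on the segment have returned PCI_ERS_RESULT_RECOVERED. The following is a minimal userspace sketch of that one rule, not kernel code and not anything posted in this thread. Every name in it (the sketch_ers_result enum and sketch_segment_may_resume()) is hypothetical and chosen for illustration only; what the platform does when the check fails is left out, since that is platform policy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI_ERS_RESULT_* codes named in the
 * documentation; the values are illustrative and not taken from any
 * kernel header. */
enum sketch_ers_result {
	SKETCH_ERS_RECOVERED,
	SKETCH_ERS_NEED_RESET,
	SKETCH_ERS_DISCONNECT
};

/* Return true only if every driver on the segment reported RECOVERED,
 * which is the condition the documentation gives for calling resume().
 * An empty segment (n == 0) trivially satisfies the condition. */
bool sketch_segment_may_resume(const enum sketch_ers_result *results, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (results[i] != SKETCH_ERS_RECOVERED)
			return false;
	}
	return true;
}
```

A caller would collect one result per driver on the segment and invoke the resume step only when this check passes; a single driver reporting anything other than RECOVERED is enough to prevent it.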
From benh at kernel.crashing.org Fri Feb 3 12:56:01 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 03 Feb 2006 12:56:01 +1100 Subject: [PATCH 2.6.16-rc1] Fix booting Maple boards (was: Re: LINUXPPC64 Maple fails to boot current git) In-Reply-To: <20060131151117.GP22672@smtp.west.cox.net> References: <20060130171759.GE22672@smtp.west.cox.net> <1138662630.3417.26.camel@brick.watson.ibm.com> <20060131151117.GP22672@smtp.west.cox.net> Message-ID: <1138931761.4934.113.camel@localhost.localdomain> > When looking for legacy serial ports, condition poking of "ISA" areas > on CONFIG_GENERIC_ISA_DMA, rather than CONFIG_ISA as some boards (such > as the Maple) have no ISA slots, but do have ISA serial ports. Hrm... not sure ISA_DMA has anything to do with that at all... in fact it's more like "has legacy devices". I don't remember adding the ifdef CONFIG_ISA in the first place, maybe I did... it's a bit dodgy I'd say. Indeed, lots of machines have ISA devices (a superIO typically) without having ISA slots... Ben. 
> Signed-off-by: Tom Rini > > arch/powerpc/kernel/legacy_serial.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c > index f970ace..3dd7b39 100644 > --- a/arch/powerpc/kernel/legacy_serial.c > +++ b/arch/powerpc/kernel/legacy_serial.c > @@ -134,7 +134,7 @@ static int __init add_legacy_soc_port(st > return add_legacy_port(np, -1, UPIO_MEM, addr, addr, NO_IRQ, flags); > } > > -#ifdef CONFIG_ISA > +#ifdef CONFIG_GENERIC_ISA_DMA > static int __init add_legacy_isa_port(struct device_node *np, > struct device_node *isa_brg) > { > @@ -276,7 +276,7 @@ void __init find_legacy_serial_ports(voi > of_node_put(soc); > } > > -#ifdef CONFIG_ISA > +#ifdef CONFIG_GENERIC_ISA_DMA > /* First fill our array with ISA ports */ > for (np = NULL; (np = of_find_node_by_type(np, "serial"));) { > struct device_node *isa = of_get_parent(np); > From benh at kernel.crashing.org Fri Feb 3 12:53:21 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 03 Feb 2006 12:53:21 +1100 Subject: creating PCI-related sysfs entries In-Reply-To: <20060131212624.GA10513@kroah.com> References: <20060131202214.GZ19465@austin.ibm.com> <20060131203456.GA23819@kroah.com> <20060131210805.GA19465@austin.ibm.com> <20060131212624.GA10513@kroah.com> Message-ID: <1138931602.4934.110.camel@localhost.localdomain> > That's only because your arch might happen to have 1 device per slot, > which is not true for other arches. And I bet it's also not true for > your non-virtual boxes... Even that is not true since we can have multi-function devices or devices with p2p bridges but the basic entity where the error management infos is available to us is indeed the physical slot. > People have suggested that they create such a driver for a long time. > Why not just do that? Depends if he wants per domain statistics or really per slot ... 
we do have per-slot control on most IBM machines, thus I would rather have this info there (though if he also wants consolidated "global" stats, then yes, a host controller driver might be the way to go). From linas at austin.ibm.com Fri Feb 3 13:03:41 2006 From: linas at austin.ibm.com (Linas Vepstas) Date: Thu, 2 Feb 2006 20:03:41 -0600 Subject: creating PCI-related sysfs entries In-Reply-To: <1138931103.4934.105.camel@localhost.localdomain> References: <20060131202214.GZ19465@austin.ibm.com> <20060131203456.GA23819@kroah.com> <20060131210805.GA19465@austin.ibm.com> <1138931103.4934.105.camel@localhost.localdomain> Message-ID: <20060203020341.GR24916@austin.ibm.com> On Fri, Feb 03, 2006 at 12:45:03PM +1100, Benjamin Herrenschmidt was heard to remark: > On Tue, 2006-01-31 at 15:08 -0600, linas wrote: > > > Hmm. But these slots are not hot-pluggable; should the arch > > use the hotplug infrastructure even on those slots? > > If those are EEH slots, they should probably be treated as hotpluggable... > after all, didn't we discuss back then that one strategy we could use > for recovery is simulating an unplug/replug to the driver along with a slot > hard reset? Yes, and EEH does do that (in mainline, 10K times in a row, last I tried). This email was in reference to the layout of /sys/bus/pci/slots, which seems to have only hotplug slots in there; I am not yet sure why. It's possible John Rose can shed some rapid insight? --linas From gregkh at suse.de Fri Feb 3 14:48:41 2006 From: gregkh at suse.de (Greg KH) Date: Thu, 2 Feb 2006 19:48:41 -0800 Subject: [PATCH]: Documentation: Updated PCI Error Recovery In-Reply-To: <20060203000602.GQ24916@austin.ibm.com> References: <20060203000602.GQ24916@austin.ibm.com> Message-ID: <20060203034841.GA14169@suse.de> On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote: > > I'm not sure who I'm addressing this patch to: Linus, maybe? 
As it's PCI related, I'll take it, like the other PCI stuff, and put it into my trees, which go into -mm, and then into Linus's tree. I'll add this to my queue. thanks, greg k-h From boutcher at cs.umn.edu Fri Feb 3 18:18:50 2006 From: boutcher at cs.umn.edu (Dave C Boutcher) Date: Fri, 3 Feb 2006 01:18:50 -0600 Subject: [PATCH 0/3] powerpc minor fixes to the rtas_percpu_suspend_me routine Message-ID: <17379.986.599275.637898@hound.rchland.ibm.com> A series of small fixes to the rtas_percpu_suspend_me routine for problems discovered since it was pushed to 2.6.16-rc1. Dave Boutcher From boutcher at cs.umn.edu Fri Feb 3 18:18:36 2006 From: boutcher at cs.umn.edu (Dave C Boutcher) Date: Fri, 3 Feb 2006 01:18:36 -0600 Subject: [PATCH 3/3] powerpc remove useless call to touch_softlockup_watchdog Message-ID: <17379.972.53558.75428@hound.rchland.ibm.com> It turns out that we can't stop the watchdog from triggering here. If we touch the timer (which just uses the current jiffies value) before we enable interrupts, it does nothing because jiffies are not mass-updated until after we enable interrupts. If we touch the timer after we enable interrupts, it's too late because the softlockup watchdog will already have triggered. The touch_softlockup_watchdog call removed below does nothing. 
Signed-off-by: Dave Boutcher --- arch/powerpc/kernel/rtas.c | 4 ---- 1 files changed, 0 insertions(+), 4 deletions(-) 14caae1e3b5508ce8798618f9f952f14e7c6d41a diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index 4038ac1..1ecfcf8 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -598,10 +598,6 @@ static void rtas_percpu_suspend_me(void } out: - /* before we restore interrupts, make sure we don't - * generate a spurious soft lockup errors - */ - touch_softlockup_watchdog(); local_irq_restore(flags); return; } -- 1.1.4.g7310 From boutcher at cs.umn.edu Fri Feb 3 18:18:39 2006 From: boutcher at cs.umn.edu (Dave C Boutcher) Date: Fri, 3 Feb 2006 01:18:39 -0600 Subject: [PATCH 2/3] powerpc prod all processors after ibm,suspend-me Message-ID: <17379.975.326033.286493@hound.rchland.ibm.com> We need to prod everyone here since this is the only CPU that is guaranteed to be running after the ibm,suspend-me RTAS call returns. Signed-off-by: Dave Boutcher --- arch/powerpc/kernel/rtas.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) 9d615a50c077f82926732c8b9f366bebe50a4660 diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index 107bd86..4038ac1 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -565,6 +565,7 @@ static int ibm_suspend_me_token = RTAS_U #ifdef CONFIG_PPC_PSERIES static void rtas_percpu_suspend_me(void *info) { + int i; long rc; long flags; struct rtas_suspend_me_data *data = @@ -589,6 +590,8 @@ static void rtas_percpu_suspend_me(void data->waiting = 0; data->args->args[data->args->nargs] = rtas_call(ibm_suspend_me_token, 0, 1, NULL); + for_each_cpu(i) + plpar_hcall_norets(H_PROD,i); } else { data->waiting = -EBUSY; printk(KERN_ERR "Error on H_Join hypervisor call\n"); -- 1.1.4.g7310 From boutcher at cs.umn.edu Fri Feb 3 18:18:46 2006 From: boutcher at cs.umn.edu (Dave C Boutcher) Date: Fri, 3 Feb 2006 01:18:46 -0600 Subject: [PATCH 1/3] powerpc return correct rtas 
status from ibm,suspend-me Message-ID: <17379.982.159401.407606@hound.rchland.ibm.com> Correctly return the status from the RTAS call. rtas_call() returns the status as its return value. Signed-off-by: Dave Boutcher --- arch/powerpc/kernel/rtas.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) a0f3095607ff19d730f2ed5181bd37df231d4015 diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index 7fe4a5c..107bd86 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -587,8 +587,8 @@ static void rtas_percpu_suspend_me(void if (rc == H_Continue) { data->waiting = 0; - rtas_call(ibm_suspend_me_token, 0, 1, - data->args->args); + data->args->args[data->args->nargs] = + rtas_call(ibm_suspend_me_token, 0, 1, NULL); } else { data->waiting = -EBUSY; printk(KERN_ERR "Error on H_Join hypervisor call\n"); -- 1.1.4.g7310 From michael at ellerman.id.au Fri Feb 3 19:05:14 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 03 Feb 2006 19:05:14 +1100 Subject: [PATCH] powerpc: Don't start secondary CPUs in a UP && KEXEC kernel Message-ID: <20060203080536.DA5AF68A10@ozlabs.org> Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to unconditionally call it from setup_system(). On a UP && KEXEC kernel we'll start up the secondary CPUs which will then go berserk and we die. Simple fix is to conditionally call smp_release_cpus() in setup_system(). With that in place we don't need the dummy definition of smp_release_cpus() because all call sites are #ifdef'ed either SMP or KEXEC. 
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/setup_64.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: kdump/arch/powerpc/kernel/setup_64.c =================================================================== --- kdump.orig/arch/powerpc/kernel/setup_64.c +++ kdump/arch/powerpc/kernel/setup_64.c @@ -311,8 +311,6 @@ void smp_release_cpus(void) DBG(" <- smp_release_cpus()\n"); } -#else -#define smp_release_cpus() #endif /* CONFIG_SMP || CONFIG_KEXEC */ /* @@ -470,10 +468,12 @@ void __init setup_system(void) check_smt_enabled(); smp_setup_cpu_maps(); +#ifdef CONFIG_SMP /* Release secondary cpus out of their spinloops at 0x60 now that * we can map physical -> logical CPU ids */ smp_release_cpus(); +#endif printk("Starting Linux PPC64 %s\n", system_utsname.version); From michael at ellerman.id.au Fri Feb 3 19:05:47 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 03 Feb 2006 19:05:47 +1100 Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel Message-ID: <20060203080609.403CA68A1F@ozlabs.org> It's possible for prom_init to allocate the flat device tree inside the kdump crash kernel region. If this happens, when we load the kdump kernel we overwrite the flattened device tree, which is bad. We could make prom_init try and avoid allocating inside the crash kernel region, but then we run into issues if the crash kernel region uses all the space inside the RMO. The easiest solution is to move the flat device tree once we're running in the kernel. 
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/prom.c | 27 +++++++++++++++++++++++++++ arch/powerpc/kernel/setup_64.c | 3 +++ include/asm-powerpc/prom.h | 2 ++ 3 files changed, 32 insertions(+) Index: kdump/arch/powerpc/kernel/prom.c =================================================================== --- kdump.orig/arch/powerpc/kernel/prom.c +++ kdump/arch/powerpc/kernel/prom.c @@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n return 0; } + +#ifdef CONFIG_KEXEC +/* We may have allocated the flat device tree inside the crash kernel region + * in prom_init. If so we need to move it out into regular memory. */ +void kdump_move_device_tree(void) +{ + unsigned long start, end; + struct boot_param_header *new; + + start = __pa((unsigned long)initial_boot_params); + end = start + initial_boot_params->totalsize; + + if (end < crashk_res.start || start > crashk_res.end) + return; + + new = (struct boot_param_header*) + __va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE)); + + memcpy(new, initial_boot_params, initial_boot_params->totalsize); + + initial_boot_params = new; + + DBG("Flat device tree blob moved to %p\n", initial_boot_params); + + /* XXX should we unreserve the old DT? 
*/ +} +#endif /* CONFIG_KEXEC */ Index: kdump/arch/powerpc/kernel/setup_64.c =================================================================== --- kdump.orig/arch/powerpc/kernel/setup_64.c +++ kdump/arch/powerpc/kernel/setup_64.c @@ -398,6 +398,9 @@ void __init setup_system(void) { DBG(" -> setup_system()\n"); +#ifdef CONFIG_KEXEC + kdump_move_device_tree(); +#endif /* * Unflatten the device-tree passed by prom_init or kexec */ Index: kdump/include/asm-powerpc/prom.h =================================================================== --- kdump.orig/include/asm-powerpc/prom.h +++ kdump/include/asm-powerpc/prom.h @@ -222,5 +222,7 @@ extern int of_address_to_resource(struct extern int of_pci_address_to_resource(struct device_node *dev, int bar, struct resource *r); +extern void kdump_move_device_tree(void); + #endif /* __KERNEL__ */ #endif /* _POWERPC_PROM_H */ From benh at kernel.crashing.org Fri Feb 3 20:07:37 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 03 Feb 2006 20:07:37 +1100 Subject: creating PCI-related sysfs entries In-Reply-To: <20060203020341.GR24916@austin.ibm.com> References: <20060131202214.GZ19465@austin.ibm.com> <20060131203456.GA23819@kroah.com> <20060131210805.GA19465@austin.ibm.com> <1138931103.4934.105.camel@localhost.localdomain> <20060203020341.GR24916@austin.ibm.com> Message-ID: <1138957657.4934.124.camel@localhost.localdomain> On Thu, 2006-02-02 at 20:03 -0600, Linas Vepstas wrote: > Yes, and EEH does do that (in mainline, 10K times in a row, > last I tried). This email was in reference to the > layout of /sys/bus/pci/slots which seems to have only hotplug > slots in there; I am not yet sure why. Its possible John Rose > can shed some rapid insight? Ok... also, about this "max number of resets" thing, it would be useful in fact to have a rate limit rather ... 
a network card that for some reason needs to be reset about once a day is still fairly usable and it would be nice if the system didn't consider it dead after 10 days ... Also, it might be useful to have an entry to force a retry on a card that has been considered dead... Ben. From galak at kernel.crashing.org Sat Feb 4 01:25:08 2006 From: galak at kernel.crashing.org (Kumar Gala) Date: Fri, 3 Feb 2006 08:25:08 -0600 Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel In-Reply-To: <20060203080609.403CA68A1F@ozlabs.org> References: <20060203080609.403CA68A1F@ozlabs.org> Message-ID: <8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org> On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote: > It's possible for prom_init to allocate the flat device tree inside the > kdump crash kernel region. If this happens, when we load the kdump kernel we > overwrite the flattened device tree, which is bad. > > We could make prom_init try and avoid allocating inside the crash kernel > region, but then we run into issues if the crash kernel region uses all the > space inside the RMO. The easiest solution is to move the flat device tree > once we're running in the kernel. > > Signed-off-by: Michael Ellerman Doesn't setup_32.c need a similar change? - k > --- > > arch/powerpc/kernel/prom.c | 27 +++++++++++++++++++++++++++ > arch/powerpc/kernel/setup_64.c | 3 +++ > include/asm-powerpc/prom.h | 2 ++ > 3 files changed, 32 insertions(+) > > Index: kdump/arch/powerpc/kernel/prom.c > =================================================================== > --- kdump.orig/arch/powerpc/kernel/prom.c > +++ kdump/arch/powerpc/kernel/prom.c > @@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n > > return 0; > } > + > +#ifdef CONFIG_KEXEC > +/* We may have allocated the flat device tree inside the crash kernel region > + * in prom_init. If so we need to move it out into regular memory. 
*/ > +void kdump_move_device_tree(void) > +{ > + unsigned long start, end; > + struct boot_param_header *new; > + > + start = __pa((unsigned long)initial_boot_params); > + end = start + initial_boot_params->totalsize; > + > + if (end < crashk_res.start || start > crashk_res.end) > + return; > + > + new = (struct boot_param_header*) > + __va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE)); > + > + memcpy(new, initial_boot_params, initial_boot_params->totalsize); > + > + initial_boot_params = new; > + > + DBG("Flat device tree blob moved to %p\n", initial_boot_params); > + > + /* XXX should we unreserve the old DT? */ > +} > +#endif /* CONFIG_KEXEC */ > Index: kdump/arch/powerpc/kernel/setup_64.c > =================================================================== > --- kdump.orig/arch/powerpc/kernel/setup_64.c > +++ kdump/arch/powerpc/kernel/setup_64.c > @@ -398,6 +398,9 @@ void __init setup_system(void) > { > DBG(" -> setup_system()\n"); > > +#ifdef CONFIG_KEXEC > + kdump_move_device_tree(); > +#endif > /* > * Unflatten the device-tree passed by prom_init or kexec > */ > Index: kdump/include/asm-powerpc/prom.h > =================================================================== > --- kdump.orig/include/asm-powerpc/prom.h > +++ kdump/include/asm-powerpc/prom.h > @@ -222,5 +222,7 @@ extern int of_address_to_resource(struct > extern int of_pci_address_to_resource(struct device_node *dev, int > bar, > struct resource *r); > > +extern void kdump_move_device_tree(void); > + > #endif /* __KERNEL__ */ > #endif /* _POWERPC_PROM_H */ > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev From trini at kernel.crashing.org Sat Feb 4 01:47:13 2006 From: trini at kernel.crashing.org (Tom Rini) Date: Fri, 3 Feb 2006 07:47:13 -0700 Subject: LINUXPPC64 Maple fails to boot current git) In-Reply-To: <1138931761.4934.113.camel@localhost.localdomain> References: 
<20060130171759.GE22672@smtp.west.cox.net> <1138662630.3417.26.camel@brick.watson.ibm.com> <20060131151117.GP22672@smtp.west.cox.net> <1138931761.4934.113.camel@localhost.localdomain> Message-ID: <20060203144713.GE3800@smtp.west.cox.net> On Fri, Feb 03, 2006 at 12:56:01PM +1100, Benjamin Herrenschmidt wrote: > > > When looking for legacy serial ports, condition poking of "ISA" areas > > on CONFIG_GENERIC_ISA_DMA, rather than CONFIG_ISA as some boards (such > > as the Maple) have no ISA slots, but do have ISA serial ports. > > Hrm... not sure ISA_DMA has anything to do with that at all.. in fact > its more like "has legacy devices". I don't remember adding the ifdef > CONFIG_ISA in the first place, maybe I did... it's a bit dodgy I'd say. > Indeed, lots of machines have ISA devices (a superIO typically) without > having ISA slots... Olaf says that he sent a patch to Andrew, who should be passing it along if not already, to just remove the #ifdefs there. -- Tom Rini http://gate.crashing.org/~trini/ From ericvh at gmail.com Sat Feb 4 01:54:41 2006 From: ericvh at gmail.com (Eric Van Hensbergen) Date: Fri, 3 Feb 2006 08:54:41 -0600 (CST) Subject: [patch 0/3] systemsim patch cleanup Message-ID: <20060203145441.6EC0A5A8075@localhost.localdomain> These are a set of code cleanups based on Arnd's systemsim patch-set sent out on January 14th. This patch attempts to clean-up some of the issues with the bogus network and bogus disk facilities of systemsim -- but is largely cosmetic. We had looked at incorporating the bogus devices into the IBM-maintained virtualization drivers in the past, but at the time it didn't look like there was a good match in the veth or the vscsi code -- the call-thru's would not integrate as nicely as they did with the hvc console code. The bogus disk and bogus network drivers are largely a stop-gap measure for systems the simulator doesn't have complete device models for. 
More complete device models are already in the plans for systemsim-cell, which will likely eventually replace the need for the "bogus" drivers. As such, I'll maintain the existing bogus drivers out-of-tree in my git repository on kernel.org (/pub/scm/linux/kernel/git/ericvh/systemsim.git) Unless there are any objections, I'll continue cc:'ing the ppc64-dev list on modifications to the patches. -eric From ericvh at gmail.com Sat Feb 4 01:56:17 2006 From: ericvh at gmail.com (Eric Van Hensbergen) Date: Fri, 3 Feb 2006 08:56:17 -0600 (CST) Subject: [patch 3/3] systemsim: new systemsim default configuration Message-ID: <20060203145617.D6FCD5A809C@localhost.localdomain> Subject: [PATCH] systemsim: clean up default configuration Signed-off-by: Eric Van Hensbergen --- arch/powerpc/configs/systemsim_defconfig | 125 +++++++----------------------- 1 files changed, 28 insertions(+), 97 deletions(-) 72e13e73b5998b853a9bd20e8c425486818ed09a diff --git a/arch/powerpc/configs/systemsim_defconfig b/arch/powerpc/configs/systemsim_defconfig index 59f1d0f..f7daa08 100644 --- a/arch/powerpc/configs/systemsim_defconfig +++ b/arch/powerpc/configs/systemsim_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: -# Fri Jan 13 09:33:18 2006 +# Linux kernel version: 2.6.16-rc1 +# Thu Feb 2 15:18:13 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -18,7 +18,6 @@ CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_PPC_OF=y CONFIG_PPC_UDBG_16550=y -# CONFIG_CRASH_DUMP is not set CONFIG_GENERIC_TBSYNC=y # @@ -57,7 +56,7 @@ CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y # CONFIG_CPUSETS is not set CONFIG_INITRAMFS_SOURCE="" -CONFIG_CC_OPTIMIZE_FOR_SIZE=y +# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set @@ -100,11 +99,11 @@ CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y -# CONFIG_DEFAULT_AS is not set +CONFIG_DEFAULT_AS=y # 
CONFIG_DEFAULT_DEADLINE is not set # CONFIG_DEFAULT_CFQ is not set -CONFIG_DEFAULT_NOOP=y -CONFIG_DEFAULT_IOSCHED="noop" +# CONFIG_DEFAULT_NOOP is not set +CONFIG_DEFAULT_IOSCHED="anticipatory" # # Platform support @@ -116,7 +115,7 @@ CONFIG_PPC_MULTIPLATFORM=y CONFIG_PPC_PSERIES=y # CONFIG_PPC_PMAC is not set CONFIG_PPC_MAPLE=y -CONFIG_PPC_CELL=y +# CONFIG_PPC_CELL is not set CONFIG_PPC_SYSTEMSIM=y CONFIG_SYSTEMSIM_IDLE=y CONFIG_XICS=y @@ -126,9 +125,8 @@ CONFIG_PPC_RTAS=y CONFIG_RTAS_ERROR_LOGGING=y CONFIG_RTAS_PROC=y # CONFIG_RTAS_FLASH is not set -CONFIG_MMIO_NVRAM=y +# CONFIG_MMIO_NVRAM is not set CONFIG_MPIC_BROKEN_U3=y -CONFIG_CELL_IIC=y CONFIG_IBMVIO=y # CONFIG_IBMEBUS is not set # CONFIG_PPC_MPC106 is not set @@ -136,11 +134,6 @@ CONFIG_IBMVIO=y # CONFIG_WANT_EARLY_SERIAL is not set # -# Cell Broadband Engine options -# -CONFIG_SPU_FS=m - -# # Kernel options # # CONFIG_HZ_100 is not set @@ -157,6 +150,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 # CONFIG_IOMMU_VMERGE is not set # CONFIG_HOTPLUG_CPU is not set # CONFIG_KEXEC is not set +# CONFIG_CRASH_DUMP is not set # CONFIG_IRQ_ALL_CPUS is not set # CONFIG_PPC_SPLPAR is not set CONFIG_EEH=y @@ -299,6 +293,7 @@ CONFIG_BRIDGE_NETFILTER=y # Core Netfilter Configuration # # CONFIG_NETFILTER_NETLINK is not set +# CONFIG_NETFILTER_XTABLES is not set # # IP: Netfilter Configuration @@ -315,91 +310,11 @@ CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m # CONFIG_IP_NF_PPTP is not set CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -# CONFIG_IP_NF_MATCH_IPRANGE is not set -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m 
-CONFIG_IP_NF_MATCH_OWNER=m -# CONFIG_IP_NF_MATCH_PHYSDEV is not set -# CONFIG_IP_NF_MATCH_ADDRTYPE is not set -# CONFIG_IP_NF_MATCH_REALM is not set -# CONFIG_IP_NF_MATCH_SCTP is not set -# CONFIG_IP_NF_MATCH_DCCP is not set -# CONFIG_IP_NF_MATCH_COMMENT is not set -# CONFIG_IP_NF_MATCH_HASHLIMIT is not set -# CONFIG_IP_NF_MATCH_STRING is not set -# CONFIG_IP_NF_MATCH_POLICY is not set -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -# CONFIG_IP_NF_TARGET_NFQUEUE is not set -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -# CONFIG_IP_NF_TARGET_NETMAP is not set -# CONFIG_IP_NF_TARGET_SAME is not set -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -# CONFIG_IP_NF_TARGET_CLASSIFY is not set -# CONFIG_IP_NF_TARGET_TTL is not set -# CONFIG_IP_NF_RAW is not set -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m # # IPv6: Netfilter Configuration (EXPERIMENTAL) # # CONFIG_IP6_NF_QUEUE is not set -CONFIG_IP6_NF_IPTABLES=m -CONFIG_IP6_NF_MATCH_LIMIT=m -CONFIG_IP6_NF_MATCH_MAC=m -CONFIG_IP6_NF_MATCH_RT=m -CONFIG_IP6_NF_MATCH_OPTS=m -CONFIG_IP6_NF_MATCH_FRAG=m -CONFIG_IP6_NF_MATCH_HL=m -CONFIG_IP6_NF_MATCH_MULTIPORT=m -CONFIG_IP6_NF_MATCH_OWNER=m -CONFIG_IP6_NF_MATCH_MARK=m -CONFIG_IP6_NF_MATCH_IPV6HEADER=m -CONFIG_IP6_NF_MATCH_AHESP=m -CONFIG_IP6_NF_MATCH_LENGTH=m -CONFIG_IP6_NF_MATCH_EUI64=m -# CONFIG_IP6_NF_MATCH_PHYSDEV is not set -# CONFIG_IP6_NF_MATCH_POLICY is not set -CONFIG_IP6_NF_FILTER=m -CONFIG_IP6_NF_TARGET_LOG=m -# CONFIG_IP6_NF_TARGET_REJECT is not set -# CONFIG_IP6_NF_TARGET_NFQUEUE is not set -CONFIG_IP6_NF_MANGLE=m -CONFIG_IP6_NF_TARGET_MARK=m -# 
CONFIG_IP6_NF_TARGET_HL is not set -# CONFIG_IP6_NF_RAW is not set # # DECnet: Netfilter Configuration @@ -443,6 +358,11 @@ CONFIG_IPDDP_ENCAP=y CONFIG_IPDDP_DECAP=y # CONFIG_X25 is not set # CONFIG_LAPB is not set + +# +# TIPC Configuration (EXPERIMENTAL) +# +# CONFIG_TIPC is not set CONFIG_NET_DIVERT=y # CONFIG_ECONET is not set CONFIG_WAN_ROUTER=m @@ -555,6 +475,7 @@ CONFIG_MTD_CFI_I2=y # CONFIG_MTD_RAM is not set # CONFIG_MTD_ROM is not set # CONFIG_MTD_ABSENT is not set +# CONFIG_MTD_OBSOLETE_CHIPS is not set # # Mapping drivers for chip access @@ -707,7 +628,6 @@ CONFIG_SYSTEMSIM_NET=y # CONFIG_SK98LIN is not set # CONFIG_TIGON3 is not set # CONFIG_BNX2 is not set -# CONFIG_SPIDER_NET is not set # CONFIG_MV643XX_ETH is not set # @@ -815,7 +735,7 @@ CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y # CONFIG_SERIAL_8250_CONSOLE is not set CONFIG_SERIAL_8250_NR_UARTS=4 -CONFIG_SERIAL_8250_RUNTIME_UARTS=2 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # CONFIG_SERIAL_8250_EXTENDED is not set # @@ -826,7 +746,10 @@ CONFIG_SERIAL_CORE=y CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 +CONFIG_HVC_DRIVER=y # CONFIG_HVC_CONSOLE is not set +CONFIG_HVC_FSS=y +CONFIG_HVC_RTAS=y # CONFIG_HVCS is not set # @@ -864,6 +787,12 @@ CONFIG_LEGACY_PTY_COUNT=256 # CONFIG_I2C is not set # +# SPI support +# +# CONFIG_SPI is not set +# CONFIG_SPI_MASTER is not set + +# # Dallas's 1-wire bus # # CONFIG_W1 is not set @@ -1057,6 +986,7 @@ CONFIG_UNIXWARE_DISKLABEL=y CONFIG_SGI_PARTITION=y # CONFIG_ULTRIX_PARTITION is not set CONFIG_SUN_PARTITION=y +# CONFIG_KARMA_PARTITION is not set # CONFIG_EFI_PARTITION is not set # @@ -1137,6 +1067,7 @@ CONFIG_DEBUG_SPINLOCK_SLEEP=y # CONFIG_DEBUG_INFO is not set # CONFIG_DEBUG_FS is not set # CONFIG_DEBUG_VM is not set +CONFIG_FORCED_INLINING=y # CONFIG_RCU_TORTURE_TEST is not set # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_DEBUG_STACK_USAGE is not set -- 1.0.GIT From ericvh at gmail.com Sat Feb 4 01:55:06 2006 From: ericvh at gmail.com 
(Eric Van Hensbergen) Date: Fri, 3 Feb 2006 08:55:06 -0600 (CST) Subject: [patch 1/3] systemsim: cleanup systemsim network patch Message-ID: <20060203145506.1E0405A807B@localhost.localdomain> Subject: [PATCH] systemsim: clean-up systemsim network patch Incorporate some of the LKML feedback, clean-up naming conventions and fix a bogus free in the close routine. Signed-off-by: Eric Van Hensbergen --- drivers/net/systemsim_net.c | 113 ++++++++++++++++++++++--------------------- 1 files changed, 57 insertions(+), 56 deletions(-) 79e30c5718a29c6de20e45f00bc1b458b359c29c diff --git a/drivers/net/systemsim_net.c b/drivers/net/systemsim_net.c index babc1fb..0a4cea9 100644 --- a/drivers/net/systemsim_net.c +++ b/drivers/net/systemsim_net.c @@ -60,32 +60,32 @@ #include #include -#define MAMBO_BOGUS_NET_PROBE 119 -#define MAMBO_BOGUS_NET_SEND 120 -#define MAMBO_BOGUS_NET_RECV 121 +#define SYSTEMSIM_NET_PROBE 119 +#define SYSTEMSIM_NET_SEND 120 +#define SYSTEMSIM_NET_RECV 121 -static inline int MamboBogusNetProbe(int devno, void *buf) +static inline int systemsim_bogusnet_probe(int devno, void *buf) { - return callthru2(MAMBO_BOGUS_NET_PROBE, + return callthru2(SYSTEMSIM_NET_PROBE, (unsigned long)devno, (unsigned long)buf); } -static inline int MamboBogusNetSend(int devno, void *buf, ulong size) +static inline int systemsim_bogusnet_send(int devno, void *buf, ulong size) { - return callthru3(MAMBO_BOGUS_NET_SEND, + return callthru3(SYSTEMSIM_NET_SEND, (unsigned long)devno, (unsigned long)buf, (unsigned long)size); } -static inline int MamboBogusNetRecv(int devno, void *buf, ulong size) +static inline int systemsim_bogusnet_recv(int devno, void *buf, ulong size) { - return callthru3(MAMBO_BOGUS_NET_RECV, + return callthru3(SYSTEMSIM_NET_RECV, (unsigned long)devno, (unsigned long)buf, (unsigned long)size); } static irqreturn_t -mambonet_interrupt(int irq, void *dev_instance, struct pt_regs *regs); +systemsim_net_intr(int irq, void *dev_instance, struct pt_regs *regs); #define 
INIT_BOTTOM_HALF(x,y,z) INIT_WORK(x, y, (void*)z) #define SCHEDULE_BOTTOM_HALF(x) schedule_delayed_work(x, 1) @@ -100,18 +100,18 @@ struct netdev_private { struct net_device_stats stats; }; -static int mambonet_probedev(int devno, void *buf) +static int systemsim_net_probedev(int devno, void *buf) { - struct device_node *mambo; + struct device_node *systemsim; struct device_node *net; unsigned int *reg; - mambo = find_path_device("/mambo"); + systemsim = find_path_device("/systemsim"); - if (mambo == NULL) { + if (systemsim == NULL) { return -1; } - net = find_path_device("/mambo/bogus-net at 0"); + net = find_path_device("/systemsim/bogus-net at 0"); if (net == NULL) { return -1; } @@ -121,20 +121,20 @@ static int mambonet_probedev(int devno, return -1; } - return MamboBogusNetProbe(devno, buf); + return systemsim_bogusnet_probe(devno, buf); } -static int mambonet_send(int devno, void *buf, ulong size) +static int systemsim_net_send(int devno, void *buf, ulong size) { - return MamboBogusNetSend(devno, buf, size); + return systemsim_bogusnet_send(devno, buf, size); } -static int mambonet_recv(int devno, void *buf, ulong size) +static int systemsim_net_recv(int devno, void *buf, ulong size) { - return MamboBogusNetRecv(devno, buf, size); + return systemsim_bogusnet_recv(devno, buf, size); } -static int mambonet_start_xmit(struct sk_buff *skb, struct net_device *dev) +static int systemsim_net_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct netdev_private *priv = (struct netdev_private *)dev->priv; int devno = priv->devno; @@ -142,7 +142,7 @@ static int mambonet_start_xmit(struct sk skb->dev = dev; /* we might need to checksum or something */ - mambonet_send(devno, skb->data, skb->len); + systemsim_net_send(devno, skb->data, skb->len); dev->last_rx = jiffies; priv->stats.rx_bytes += skb->len; @@ -155,7 +155,7 @@ static int mambonet_start_xmit(struct sk return (0); } -static int mambonet_poll(struct net_device *dev, int *budget) +static int 
systemsim_net_poll(struct net_device *dev, int *budget) { struct netdev_private *np = dev->priv; int devno = np->devno; @@ -166,7 +166,7 @@ static int mambonet_poll(struct net_devi int max_frames = min(*budget, dev->quota); int ret = 0; - while ((ns = mambonet_recv(devno, buffer, 1600)) > 0) { + while ((ns = systemsim_net_recv(devno, buffer, 1600)) > 0) { if ((skb = dev_alloc_skb(ns + 2)) != NULL) { skb->dev = dev; skb_reserve(skb, 2); /* 16 byte align the IP @@ -209,12 +209,12 @@ static int mambonet_poll(struct net_devi return ret; } -static void mambonet_timer(struct net_device *dev) +static void systemsim_net_timer(struct net_device *dev) { int budget = 16; struct netdev_private *priv = (struct netdev_private *)dev->priv; - mambonet_poll(dev, &budget); + systemsim_net_poll(dev, &budget); if (!priv->closing) { SCHEDULE_BOTTOM_HALF(&priv->poll_task); @@ -228,7 +228,7 @@ static struct net_device_stats *get_stat } static irqreturn_t -mambonet_interrupt(int irq, void *dev_instance, struct pt_regs *regs) +systemsim_net_intr(int irq, void *dev_instance, struct pt_regs *regs) { struct net_device *dev = dev_instance; if (netif_rx_schedule_prep(dev)) { @@ -237,7 +237,7 @@ mambonet_interrupt(int irq, void *dev_in return IRQ_HANDLED; } -static int mambonet_open(struct net_device *dev) +static int systemsim_net_open(struct net_device *dev) { struct netdev_private *priv; int ret = 0; @@ -245,29 +245,30 @@ static int mambonet_open(struct net_devi priv = dev->priv; /* - * we can't start polling in mambonet_init, because I don't think + * we can't start polling in systemsim_net_init, because I don't think * workqueues are usable that early. so start polling now. 
*/ if (dev->irq) { - ret = request_irq(dev->irq, &mambonet_interrupt, 0, + ret = request_irq(dev->irq, &systemsim_net_intr, 0, dev->name, dev); if (ret == 0) { netif_start_queue(dev); } else { - printk(KERN_ERR "mambonet: request irq failed\n"); + printk(KERN_ERR "systemsim net: request irq failed\n"); } - MamboBogusNetProbe(priv->devno, NULL); /* probe with NULL to activate interrupts */ + /* probe with NULL to activate interrupts */ + systemsim_bogusnet_probe(priv->devno, NULL); } else { - mambonet_timer(dev); + systemsim_net_timer(dev); } return ret; } -static int mambonet_close(struct net_device *dev) +static int systemsim_net_close(struct net_device *dev) { struct netdev_private *priv; @@ -282,30 +283,29 @@ static int mambonet_close(struct net_dev KILL_BOTTOM_HALF(&priv->poll_task); } - kfree(priv); - return 0; } -static struct net_device_stats mambonet_stats; +static struct net_device_stats systemsim_net_stats; -static struct net_device_stats *mambonet_get_stats(struct net_device *dev) +static struct net_device_stats *systemsim_net_get_stats(struct net_device *dev) { - return &mambonet_stats; + return &systemsim_net_stats; } -static int mambonet_set_mac_address(struct net_device *dev, void *p) +static int systemsim_net_set_mac_address(struct net_device *dev, void *p) { return -EOPNOTSUPP; } -static int mambonet_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) +static int systemsim_net_ioctl(struct net_device *dev, struct ifreq *ifr, + int cmd) { return -EOPNOTSUPP; } static int nextdevno = 0; /* running count of device numbers */ /* Initialize the rest of the device. 
*/ -int __init do_mambonet_probe(struct net_device *dev) +int __init do_systemsim_net_probe(struct net_device *dev) { struct netdev_private *priv; int devno = nextdevno++; @@ -313,7 +313,7 @@ int __init do_mambonet_probe(struct net_ printk("eth%d: bogus network driver initialization\n", devno); - irq = mambonet_probedev(devno, dev->dev_addr); + irq = systemsim_net_probedev(devno, dev->dev_addr); if (irq < 0) { printk("No IRQ retreived\n"); @@ -328,14 +328,14 @@ int __init do_mambonet_probe(struct net_ dev->irq = irq; dev->mtu = MAMBO_MTU; - dev->open = mambonet_open; - dev->poll = mambonet_poll; + dev->open = systemsim_net_open; + dev->poll = systemsim_net_poll; dev->weight = 16; - dev->stop = mambonet_close; - dev->hard_start_xmit = mambonet_start_xmit; - dev->get_stats = mambonet_get_stats; - dev->set_mac_address = mambonet_set_mac_address; - dev->do_ioctl = mambonet_ioctl; + dev->stop = systemsim_net_close; + dev->hard_start_xmit = systemsim_net_start_xmit; + dev->get_stats = systemsim_net_get_stats; + dev->set_mac_address = systemsim_net_set_mac_address; + dev->do_ioctl = systemsim_net_ioctl; dev->priv = kmalloc(sizeof(struct netdev_private), GFP_KERNEL); if (dev->priv == NULL) @@ -348,14 +348,14 @@ int __init do_mambonet_probe(struct net_ dev->get_stats = get_stats; if (dev->irq == 0) { - INIT_BOTTOM_HALF(&priv->poll_task, (void *)mambonet_timer, + INIT_BOTTOM_HALF(&priv->poll_task, (void *)systemsim_net_timer, (void *)dev); } return (0); }; -struct net_device *__init mambonet_probe(int unit) +struct net_device *__init systemsim_net_probe(int unit) { struct net_device *dev = alloc_etherdev(0); int err; @@ -366,7 +366,7 @@ struct net_device *__init mambonet_probe sprintf(dev->name, "eth%d", unit); netdev_boot_setup_check(dev); - err = do_mambonet_probe(dev); + err = do_systemsim_net_probe(dev); if (err) goto out; @@ -382,11 +382,12 @@ struct net_device *__init mambonet_probe return ERR_PTR(err); } -int __init init_mambonet(void) +int __init 
init_systemsim_net(void) { - mambonet_probe(0); + systemsim_net_probe(0); return 0; } -module_init(init_mambonet); +module_init(init_systemsim_net); +MODULE_DESCRIPTION("Systemsim Network Driver"); MODULE_LICENSE("GPL"); -- 1.0.GIT From ericvh at gmail.com Sat Feb 4 01:55:36 2006 From: ericvh at gmail.com (Eric Van Hensbergen) Date: Fri, 3 Feb 2006 08:55:36 -0600 (CST) Subject: [patch 2/3] systemsim: cleanup systemsim block driver patch Message-ID: <20060203145536.CB9C35A8098@localhost.localdomain> Subject: [PATCH] systemsim: clean up systemsim block driver Clean-up the systemsim block driver and integrate some of the suggestions from LKML. Signed-off-by: Eric Van Hensbergen --- drivers/block/systemsim_bd.c | 159 ++++++++++++++++++++++++------------------ 1 files changed, 91 insertions(+), 68 deletions(-) ea40711c3a573b917cade94c1bdca659e4f3f905 diff --git a/drivers/block/systemsim_bd.c b/drivers/block/systemsim_bd.c index deecfb8..bec453e 100644 --- a/drivers/block/systemsim_bd.c +++ b/drivers/block/systemsim_bd.c @@ -11,7 +11,7 @@ * written by Pavel Machek and Steven Whitehouse * * Some code is from the IBM Full System Simulator Group in ARL - * Author: PAtrick Bohrer + * Author: Patrick Bohrer * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -43,7 +43,7 @@ #include #include #include - +#include #include #include @@ -52,21 +52,21 @@ #include #define MAJOR_NR 42 -#define MAX_MBD 128 +#define MAX_SYSTEMSIM_BD 128 -#define MBD_SET_BLKSIZE _IO( 0xab, 1 ) -#define MBD_SET_SIZE _IO( 0xab, 2 ) -#define MBD_SET_SIZE_BLOCKS _IO( 0xab, 7 ) -#define MBD_DISCONNECT _IO( 0xab, 8 ) +#define SYSTEMSIM_BD_SET_BLKSIZE _IO( 0xab, 1 ) +#define SYSTEMSIM_BD_SET_SIZE _IO( 0xab, 2 ) +#define SYSTEMSIM_BD_SET_SIZE_BLOCKS _IO( 0xab, 7 ) +#define SYSTEMSIM_BD_DISCONNECT _IO( 0xab, 8 ) -struct mbd_device { +struct systemsim_bd_device { int initialized; int refcnt; int flags; struct gendisk 
*disk; }; -static struct mbd_device mbd_dev[MAX_MBD]; +static struct systemsim_bd_device systemsim_bd_dev[MAX_SYSTEMSIM_BD]; #define BD_INFO_SYNC 0 #define BD_INFO_STATUS 1 @@ -79,7 +79,7 @@ static struct mbd_device mbd_dev[MAX_MBD #define BOGUS_DISK_INFO 118 static inline int -MamboBogusDiskRead(int devno, void *buf, ulong sect, ulong nrsect) +systemsim_disk_read(int devno, void *buf, ulong sect, ulong nrsect) { return callthru3(BOGUS_DISK_READ, (unsigned long)buf, (unsigned long)sect, @@ -87,34 +87,34 @@ MamboBogusDiskRead(int devno, void *buf, } static inline int -MamboBogusDiskWrite(int devno, void *buf, ulong sect, ulong nrsect) +systemsim_disk_write(int devno, void *buf, ulong sect, ulong nrsect) { return callthru3(BOGUS_DISK_WRITE, (unsigned long)buf, (unsigned long)sect, (unsigned long)((nrsect << 16) | devno)); } -static inline int MamboBogusDiskInfo(int op, int devno) +static inline int systemsim_disk_info(int op, int devno) { return callthru2(BOGUS_DISK_INFO, (unsigned long)op, (unsigned long)devno); } -static int mbd_init_disk(int devno) +static int systemsim_bd_init_disk(int devno) { - struct gendisk *disk = mbd_dev[devno].disk; + struct gendisk *disk = systemsim_bd_dev[devno].disk; unsigned int sz; /* check disk configured */ - if (!MamboBogusDiskInfo(BD_INFO_STATUS, devno)) { + if (!systemsim_disk_info(BD_INFO_STATUS, devno)) { printk(KERN_ERR "Attempting to open bogus disk before initializaiton\n"); return 0; } - mbd_dev[devno].initialized++; + systemsim_bd_dev[devno].initialized++; - sz = MamboBogusDiskInfo(BD_INFO_DEVSZ, devno); + sz = systemsim_disk_info(BD_INFO_DEVSZ, devno); printk("Initializing disk %d with devsz %u\n", devno, sz); @@ -123,7 +123,7 @@ static int mbd_init_disk(int devno) return 1; } -static void do_mbd_request(request_queue_t * q) +static void do_systemsim_bd_request(request_queue_t * q) { int result = 0; struct request *req; @@ -133,14 +133,14 @@ static void do_mbd_request(request_queue switch (rq_data_dir(req)) { case READ: - 
result = MamboBogusDiskRead(minor, - req->buffer, req->sector, - req->current_nr_sectors); - break; - case WRITE: - result = MamboBogusDiskWrite(minor, + result = systemsim_disk_read(minor, req->buffer, req->sector, req->current_nr_sectors); + break; + case WRITE: + result = systemsim_disk_write(minor, + req->buffer, req->sector, + req->current_nr_sectors); }; if (result) @@ -150,108 +150,131 @@ static void do_mbd_request(request_queue } } -static int mbd_release(struct inode *inode, struct file *file) +static int systemsim_bd_release(struct inode *inode, struct file *file) { - struct mbd_device *lo; + struct systemsim_bd_device *lo; int dev; if (!inode) return -ENODEV; dev = inode->i_bdev->bd_disk->first_minor; - if (dev >= MAX_MBD) + if (dev >= MAX_SYSTEMSIM_BD) return -ENODEV; - if (MamboBogusDiskInfo(BD_INFO_SYNC, dev) < 0) { - printk(KERN_ALERT "mbd_release: unable to sync\n"); + if (systemsim_disk_info(BD_INFO_SYNC, dev) < 0) { + printk(KERN_ALERT "systemsim_bd_release: unable to sync\n"); } - lo = &mbd_dev[dev]; + lo = &systemsim_bd_dev[dev]; if (lo->refcnt <= 0) - printk(KERN_ALERT "mbd_release: refcount(%d) <= 0\n", + printk(KERN_ALERT "systemsim_bd_release: refcount(%d) <= 0\n", lo->refcnt); lo->refcnt--; return 0; } -static int mbd_revalidate(struct gendisk *disk) +static int systemsim_bd_revalidate(struct gendisk *disk) { int devno = disk->first_minor; - mbd_init_disk(devno); + systemsim_bd_init_disk(devno); return 0; } -static int mbd_open(struct inode *inode, struct file *file) +static int systemsim_bd_open(struct inode *inode, struct file *file) { int dev; if (!inode) return -EINVAL; dev = inode->i_bdev->bd_disk->first_minor; - if (dev >= MAX_MBD) + if (dev >= MAX_SYSTEMSIM_BD) return -ENODEV; check_disk_change(inode->i_bdev); - if (!mbd_dev[dev].initialized) - if (!mbd_init_disk(dev)) + if (!systemsim_bd_dev[dev].initialized) + if (!systemsim_bd_init_disk(dev)) return -ENODEV; - mbd_dev[dev].refcnt++; + systemsim_bd_dev[dev].refcnt++; return 0; } 
-static struct block_device_operations mbd_fops = { +static struct block_device_operations systemsim_bd_fops = { owner:THIS_MODULE, - open:mbd_open, - release:mbd_release, - /* media_changed: mbd_check_change, */ - revalidate_disk:mbd_revalidate, + open:systemsim_bd_open, + release:systemsim_bd_release, + /* media_changed: systemsim_bd_check_change, */ + revalidate_disk:systemsim_bd_revalidate, }; -static spinlock_t mbd_lock = SPIN_LOCK_UNLOCKED; +static spinlock_t systemsim_bd_lock = SPIN_LOCK_UNLOCKED; -static int __init mbd_init(void) +static int __init systemsim_bd_init(void) { + struct device_node *systemsim; int err = -ENOMEM; int i; - for (i = 0; i < MAX_MBD; i++) { + systemsim = find_path_device("/systemsim"); + + if (systemsim == NULL) { + printk("NO SYSTEMSIM BOGUS DISK DETECTED\n"); + return -1; + } + + /* + * We could detect which disks are configured in openfirmware + * but I think this unnecessarily limits us from being able to + * hot-plug bogus disks durning run-time. + * + */ + + for (i = 0; i < MAX_SYSTEMSIM_BD; i++) { struct gendisk *disk = alloc_disk(1); if (!disk) goto out; - mbd_dev[i].disk = disk; + systemsim_bd_dev[i].disk = disk; /* * The new linux 2.5 block layer implementation requires * every gendisk to have its very own request_queue struct. * These structs are big so we dynamically allocate them. 
*/ - disk->queue = blk_init_queue(do_mbd_request, &mbd_lock); + disk->queue = + blk_init_queue(do_systemsim_bd_request, &systemsim_bd_lock); if (!disk->queue) { put_disk(disk); goto out; } } - if (register_blkdev(MAJOR_NR, "mbd")) { + if (register_blkdev(MAJOR_NR, "systemsim_bd")) { err = -EIO; goto out; } #ifdef MODULE - printk("mambo bogus disk: registered device at major %d\n", MAJOR_NR); + printk("systemsim bogus disk: registered device at major %d\n", + MAJOR_NR); #else - printk("mambo bogus disk: compiled in with kernel\n"); + printk("systemsim bogus disk: compiled in with kernel\n"); #endif + /* + * left device name alone for now as too much depends on it + * external to the kernel + * + */ + devfs_mk_dir("mambobd"); - for (i = 0; i < MAX_MBD; i++) { /* load defaults */ - struct gendisk *disk = mbd_dev[i].disk; - mbd_dev[i].initialized = 0; - mbd_dev[i].refcnt = 0; - mbd_dev[i].flags = 0; + for (i = 0; i < MAX_SYSTEMSIM_BD; i++) { /* load defaults */ + struct gendisk *disk = systemsim_bd_dev[i].disk; + systemsim_bd_dev[i].initialized = 0; + systemsim_bd_dev[i].refcnt = 0; + systemsim_bd_dev[i].flags = 0; disk->major = MAJOR_NR; disk->first_minor = i; - disk->fops = &mbd_fops; - disk->private_data = &mbd_dev[i]; + disk->fops = &systemsim_bd_fops; + disk->private_data = &systemsim_bd_dev[i]; sprintf(disk->disk_name, "mambobd%d", i); sprintf(disk->devfs_name, "mambobd%d", i); set_capacity(disk, 0x7ffffc00ULL << 1); /* 2 TB */ @@ -261,25 +284,25 @@ static int __init mbd_init(void) return 0; out: while (i--) { - if (mbd_dev[i].disk->queue) - blk_cleanup_queue(mbd_dev[i].disk->queue); - put_disk(mbd_dev[i].disk); + if (systemsim_bd_dev[i].disk->queue) + blk_cleanup_queue(systemsim_bd_dev[i].disk->queue); + put_disk(systemsim_bd_dev[i].disk); } return -EIO; } -static void __exit mbd_cleanup(void) +static void __exit systemsim_bd_cleanup(void) { devfs_remove("mambobd"); - if (unregister_blkdev(MAJOR_NR, "mbd") != 0) - printk("mbd: cleanup_module failed\n"); + if 
(unregister_blkdev(MAJOR_NR, "systemsim_bd") != 0) + printk("systemsim_bd: cleanup_module failed\n"); else - printk("mbd: module cleaned up.\n"); + printk("systemsim_bd: module cleaned up.\n"); } -module_init(mbd_init); -module_exit(mbd_cleanup); +module_init(systemsim_bd_init); +module_exit(systemsim_bd_cleanup); -MODULE_DESCRIPTION("Mambo Block Device"); +MODULE_DESCRIPTION("Systemsim Block Device"); MODULE_LICENSE("GPL"); -- 1.0.GIT

From jfaslist at yahoo.fr Sat Feb 4 02:58:36 2006
From: jfaslist at yahoo.fr (jfaslist)
Date: Fri, 03 Feb 2006 16:58:36 +0100
Subject: Maple freezing on PCI Target-Abort
In-Reply-To: <1138930958.4934.102.camel@localhost.localdomain>
References: <43E23B4A.4020402@yahoo.fr>
	<1138930958.4934.102.camel@localhost.localdomain>
Message-ID: <43E37DAC.4030606@yahoo.fr>

Hi,
Yes, we are going to dig into all this CPC925 and Processor Interface
initialization.
Note that I checked that both MSR_ME and MSR_RI were set prior to
triggering the PCI Target-Abort.

-MSR_ME: If not set, the CPU will "checkstop" on a machine check.
-MSR_RI: So that the exception is recoverable.

Regarding MSR_RI, this should always be set, I think?
Thanks
-jfs

Benjamin Herrenschmidt wrote:

>>-What exception vector is taking care of a DERR excp? From what I can
>>see it seems to be the "machine check" vector. But that seems a bit
>>drastic to me. After all this is just a PCI target abort.
>
>I would expect a machine check yes.
>
>>-I expect that the normal behavior would be for the kernel to send a
>>signal termination to the user process which caused the PIO READ PCI
>>cycle (from a previously mmap()'ed VMA address). Is it doable on this
>>platform? Since a READ operation is coupled by nature, I think this is
>>the only acceptable way.
>
>It should SIGBUS except if the problem occurred in the kernel. I don't
>know why it's not doing so, maybe you are hitting an issue/errata or
>misconfiguration of the 925 ?
>
>>I have tried to set the MSR[RI] bit before doing the PCI cycle, but it
>>didn't change anything. Also on our design we disconnect the
>>CPC925 checkstop pin from the 970 machine check pin.(see page 39 of
>>cpc925 user's manual). So a DERR shouldn't cause a machine check I would
>>think.
>>
>>I realize that these questions are very H/W related but couldn't find
>>the answer in IBM doc.
>

From ahuja at austin.ibm.com Wed Feb 1 06:11:54 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Tue, 31 Jan 2006 13:11:54 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <20060126204432.GG19465@austin.ibm.com>
References: <43CFC094.8000709@austin.ibm.com>
	<20060126204432.GG19465@austin.ibm.com>
Message-ID: <43DFB67A.5080508@austin.ibm.com>

Yes, it is probably a good idea to have a #define for it, but since purr is
only available on the power5 architecture, none of the other architectures
really need this code, and maybe I should enclose this for power5 setup
only.

>>+static ssize_t show_dispatchedcycles(struct sys_device *, char *);
>>+static ssize_t show_offline_cpu_cycles(struct sys_device *, char *);
>>+
>>+static SYSDEV_ATTR(offline_cpu_cycles, 0444, show_offline_cpu_cycles, NULL);
>>+static SYSDEV_ATTR(cpu_dispatched_cycles, 0444, show_dispatchedcycles, NULL);
>
>I think you need a #ifdef CONFIG_PPC64 around the above.
>
>>-	if (cpu_has_feature(CPU_FTR_SMT))
>>+	if (cpu_has_feature(CPU_FTR_SMT)) {
>> 		sysdev_create_file(s, &attr_purr);
>>+		sysdev_create_file(s, &attr_offline_cpu_cycles);
>>+		sysdev_create_file(s, &attr_cpu_dispatched_cycles);
>>+	}
>
>Shouldn't this be CPU_FTR_PURR not FTR_SMT ? (and also in the next
>section too).
>

Yes, the original was FTR_SMT. I overlooked it. Thanks for pointing it out.

>>+/* Defined in setup.c */
>>+extern u64 offline_cpu_total_tb;
>>+extern u64 offline_cpu_total_cpu_util;
>>+extern u64 offline_cpu_total_krncycles;
>>+extern u64 offline_cpu_total_idle;
>
>These should be in a header file, probably arch/powerpc/kernel/setup.h
>
>>+static ssize_t show_offline_cpu_cycles(struct sys_device *dev, char *buf)
>
>#ifdef CONFIG_PPC64 surrounding the above ....
>
>--linas
>

Okay, I can move it around, if it's okay with everyone else.

Thanks for the comments.

From linas at austin.ibm.com Sat Feb 4 03:58:30 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Fri, 3 Feb 2006 10:58:30 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <1138957657.4934.124.camel@localhost.localdomain>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<1138931103.4934.105.camel@localhost.localdomain>
	<20060203020341.GR24916@austin.ibm.com>
	<1138957657.4934.124.camel@localhost.localdomain>
Message-ID: <20060203165829.GS24916@austin.ibm.com>

On Fri, Feb 03, 2006 at 08:07:37PM +1100, Benjamin Herrenschmidt was heard to remark:
> On Thu, 2006-02-02 at 20:03 -0600, Linas Vepstas wrote:
>
> > Yes, and EEH does do that (in mainline, 10K times in a row,
> > last I tried). This email was in reference to the
> > layout of /sys/bus/pci/slots which seems to have only hotplug
> > slots in there; I am not yet sure why. Its possible John Rose
> > can shed some rapid insight?
>
> Ok... also, about this "max number of resets" thing, it would be useful
> in fact to have a rate limit rather ... a network card that for some
> reason need to be reset about once a day is still fairly useable and it
> would be nice if the system didn't consider it dead after 10 days ...

Yes, I've often thought about this.
Only two designs come to mind:

1) A timer pops every 8 hours, and decrements the failure count by 1.
   Thus, anything less than 3 resets a day would be acceptable.

2) Store the jiffies of the last reset. Increment the fail count
   only if previous jiffies is less than 8 hours ago. Set fail count
   to 1 if previous jiffies is more than 48 hours ago.
   Advantage over 1: no timers.

Any preferences?

> Also, it might be useful to have an entry to force a retry on a card
> that has been considered dead...

Actually, hotplug remove/add or dlpar remove/add can be used to clear
the count (and that's how I do my test cases). The problem is that
the documentation for this is buried somewhere where it cannot be found.

Actually, this is one of my bigger/biggest concerns: the info about
any of this is unfindable. I'd like to hype it up a bit, but am
not sure how.

--linas

From linas at austin.ibm.com Sat Feb 4 04:08:34 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Fri, 3 Feb 2006 11:08:34 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <1138931602.4934.110.camel@localhost.localdomain>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131212624.GA10513@kroah.com>
	<1138931602.4934.110.camel@localhost.localdomain>
Message-ID: <20060203170834.GT24916@austin.ibm.com>

On Fri, Feb 03, 2006 at 12:53:21PM +1100, Benjamin Herrenschmidt was heard to remark:
>
> > People have suggested that they create such a driver for a long time.
> > Why not just do that?
>
> if he also wants consolidated "global" stats,
> then yes, a host controller driver might be the way to go).

I've had trouble parsing these suggestions. I can certainly hack up
some pci-host structure so I can publish a few stats (the goal is to
eliminate /proc/ppc64/eeh). By "hack" I mean something that would
live with either the rpaphp code or the powerpc code.
However, this is a different type of activity than the idea "define a
generic architecture-neutral pci-host bridge structure". Maybe I
should just do the first, and if it ignites anyone's imagination,
we can talk about the second.

--linas

From info at schihei.de Sat Feb 4 05:04:42 2006
From: info at schihei.de (Heiko J Schick)
Date: Fri, 3 Feb 2006 19:04:42 +0100
Subject: kernel debugging tool
In-Reply-To: <0ITB0074BC0LA6@mmp2.samsung.com>
References: <0ITB0074BC0LA6@mmp2.samsung.com>
Message-ID: <07030771-257D-4204-A0C4-1833B9F9FBD3@schihei.de>

Hello,

you can also use XMON or KDB, which are kernel debuggers. XMON is
normally included in PowerPC kernels. I think for KDB you have to
patch your kernel, but that could be wrong.

If you dump out the crash instruction and compare it with the
assembler output of your GCC, you can quickly find the source code
line which caused the kernel panic.

Perhaps the following links help, too:

http://urbanmyth.org/linux/oops/
http://www-128.ibm.com/developerworks/library/l-kprobes.html?ca=dgr-lnxw42Kprobe
http://www-128.ibm.com/developerworks/linux/library/l-kdbug/

These are sometimes very useful, too. :)

On Jan 19, 2006, at 1:00 AM, Hyo Jung Song wrote:

> WE are interested in Cell BE (broadband engine) Linux patch.
> (found in
> http://www.bsc.es/projects/deepcomputing/linuxoncell/cbexdev.html)
> We want to debug kernel sources sometimes. How can we do it?
> I believe you guys debugged kernel source codes for CBE and you
> used
> some tools.
> Could you please some tips for this? Thank you.
>
> Hyo Jung Song
> Senior Engineer
> Samsung Electronics
> tel. 82-2-3416-0355
>
> -----Original Message-----
> From: Cell Support [mailto:cell_support at bsc.es]
> Sent: Wednesday, January 18, 2006 11:27 PM
> To: hjsong at samsung.com
> Cc: cell_support at bsc.es
> Subject: Re: Fwd: kernel debugging tool
>
> Dear Hyo,
>
> we don't develop linux patches for Cell BE.
> We got them from public
> kernel mailing lists and post them to help
> people to built a kernel that works with Cell BE. This avoids
> having to
> go through kernel mailing lists to
> find the correct patch files that fit a specific kernel release. Hope
> this helps people.
>
> We think you should post your question to a linux kernel mailing list.
> Regarding the ppc64 kernel development,
> the linuxppc64-dev at ozlabs.org is the right place
> (https://ozlabs.org/mailman/listinfo/linuxppc64-dev). But
> you can also sent this to the http://www.kernel.org mailing lists.
> Probable, kernel developers can help you
> because they are always debugging new their code.
>
> Hope this helps.
>
> Regards,
>
>> Sender : hjsong at samsung.com
>> Date : 2006-01-17 10:13
>> Title : kernel debugging tool
>>
>> Hi.
>>
>> WE are interested in CBE Linux patch.
>> We want to debug kernel sources sometimes. How can we do it?
>> I believe you guys debugged kernel source codes for CBE and you used
>> some tools.
>> Could you please some tips for this? Thank you.
>>
>> Hyo Jung Song
>> Senior Engineer
>> Samsung Electronics
>> tel. 82-2-3416-0355
>
> _______________________________________________
> Linuxppc64-dev mailing list
> Linuxppc64-dev at ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc64-dev

From info at schihei.de Sat Feb 4 04:55:19 2006
From: info at schihei.de (Heiko J Schick)
Date: Fri, 3 Feb 2006 18:55:19 +0100
Subject: kernel debugging tool
In-Reply-To: <0ITB0074BC0LA6@mmp2.samsung.com>
References: <0ITB0074BC0LA6@mmp2.samsung.com>
Message-ID: <759A45E8-8A3E-47CD-B3C9-880C0EBDC25B@schihei.de>

Hello,

you can also use XMON or KDB, which are kernel debuggers. XMON is
normally included in PowerPC kernels. I think for KDB you have to
patch your kernel, but that could be wrong.

Sometimes also very useful, too. :)

On Jan 19, 2006, at 1:00 AM, Hyo Jung Song wrote:

> WE are interested in Cell BE (broadband engine) Linux patch.
> (found in
> http://www.bsc.es/projects/deepcomputing/linuxoncell/cbexdev.html)
> We want to debug kernel sources sometimes. How can we do it?
> I believe you guys debugged kernel source codes for CBE and you
> used
> some tools.
> Could you please some tips for this? Thank you.
>
> Hyo Jung Song
> Senior Engineer
> Samsung Electronics
> tel. 82-2-3416-0355
>
> -----Original Message-----
> From: Cell Support [mailto:cell_support at bsc.es]
> Sent: Wednesday, January 18, 2006 11:27 PM
> To: hjsong at samsung.com
> Cc: cell_support at bsc.es
> Subject: Re: Fwd: kernel debugging tool
>
> Dear Hyo,
>
> we don't develop linux patches for Cell BE. We got them from public
> kernel mailing lists and post them to help
> people to built a kernel that works with Cell BE. This avoids
> having to
> go through kernel mailing lists to
> find the correct patch files that fit a specific kernel release. Hope
> this helps people.
>
> We think you should post your question to a linux kernel mailing list.
> Regarding the ppc64 kernel development,
> the linuxppc64-dev at ozlabs.org is the right place
> (https://ozlabs.org/mailman/listinfo/linuxppc64-dev). But
> you can also sent this to the http://www.kernel.org mailing lists.
> Probable, kernel developers can help you
> because they are always debugging new their code.
>
> Hope this helps.
>
> Regards,
>
>> Sender : hjsong at samsung.com
>> Date : 2006-01-17 10:13
>> Title : kernel debugging tool
>>
>> Hi.
>>
>> WE are interested in CBE Linux patch.
>> We want to debug kernel sources sometimes. How can we do it?
>> I believe you guys debugged kernel source codes for CBE and you used
>> some tools.
>> Could you please some tips for this? Thank you.
>>
>> Hyo Jung Song
>> Senior Engineer
>> Samsung Electronics
>> tel.
>> 82-2-3416-0355
>
> _______________________________________________
> Linuxppc64-dev mailing list
> Linuxppc64-dev at ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc64-dev

------------------------------------------------------------------------
heiko j schick          tel1: +49 (0) 7031 438635
theodor-storm-weg 14    tel2: +49 (0) 7431 971370
71101 schoenaich        mobil-tel: +49 (0) 172 9365733

email: info at schihei.de
homepage: http://www.schihei.de/
icq: 29165160
pgp-key id: 0x899AD7DC
------------------------------------------------------------------------

From haren at us.ibm.com Sat Feb 4 06:03:11 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Fri, 03 Feb 2006 11:03:11 -0800
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel
In-Reply-To: <20060203080609.403CA68A1F@ozlabs.org>
References: <20060203080609.403CA68A1F@ozlabs.org>
Message-ID: <43E3A8EF.8000609@us.ibm.com>

Michael Ellerman wrote:

>It's possible for prom_init to allocate the flat device tree inside the
>kdump crash kernel region. If this happens, when we load the kdump kernel we
>overwrite the flattened device tree, which is bad.
>
>We could make prom_init try and avoid allocating inside the crash kernel
>region, but then we run into issues if the crash kernel region uses all the
>space inside the RMO. The easiest solution is to move the flat device tree
>once we're running in the kernel.
>
>Signed-off-by: Michael Ellerman
>---
>
> arch/powerpc/kernel/prom.c | 27 +++++++++++++++++++++++++++
> arch/powerpc/kernel/setup_64.c | 3 +++
> include/asm-powerpc/prom.h | 2 ++
> 3 files changed, 32 insertions(+)
>
>Index: kdump/arch/powerpc/kernel/prom.c
>===================================================================
>--- kdump.orig/arch/powerpc/kernel/prom.c
>+++ kdump/arch/powerpc/kernel/prom.c
>@@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n
>
> 	return 0;
> }
>+
>+#ifdef CONFIG_KEXEC
>+/* We may have allocated the flat device tree inside the crash kernel region
>+ * in prom_init. If so we need to move it out into regular memory. */
>+void kdump_move_device_tree(void)

Should be void __init kdump_move_device_tree(void)

>+{
>+	unsigned long start, end;
>+	struct boot_param_header *new;
>+
>+	start = __pa((unsigned long)initial_boot_params);
>+	end = start + initial_boot_params->totalsize;
>+
>+	if (end < crashk_res.start || start > crashk_res.end)
>+		return;
>+
>+	new = (struct boot_param_header*)
>+		__va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE));
>+
>+	memcpy(new, initial_boot_params, initial_boot_params->totalsize);

We are touching the second kernel memory and the kexec boot will not
initialize this region. So, reset this memory.

memset((void *)initial_boot_params, 0, initial_boot_params->totalsize);

Thanks
Haren

From cfriesen at nortel.com Sat Feb 4 10:21:00 2006
From: cfriesen at nortel.com (Christopher Friesen)
Date: Fri, 03 Feb 2006 17:21:00 -0600
Subject: how to limit memory with 2.6.10 on ppc64 machine?
Message-ID: <43E3E55C.90504@nortel.com>

I'm running 2.6.10 on a ppc64 machine with 4GB of memory.

We're debugging an issue and would like to try and see if disabling the
U3 DART makes the problem go away. Unfortunately, this particular blade
is unstable if not all the memory banks are populated.

After some frustration I looked at the code and realized that the "mem="
functionality is not supported for ppc64 on this particular kernel.

Can anyone give me some advice on the simplest way to limit this thing
to under 2GB of memory so that the DART is not allocated/used?

Does anyone know when support for "mem=" was added? I know it is there
in the current git version, but the "powerpc" consolidation means
everything is all different now.

Thanks,

Chris

From benh at kernel.crashing.org Sat Feb 4 11:12:54 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sat, 04 Feb 2006 11:12:54 +1100
Subject: Maple freezing on PCI Target-Abort
In-Reply-To: <43E37DAC.4030606@yahoo.fr>
References: <43E23B4A.4020402@yahoo.fr> <1138930958.4934.102.camel@localhost.localdomain> <43E37DAC.4030606@yahoo.fr>
Message-ID: <1139011975.8543.4.camel@localhost.localdomain>

On Fri, 2006-02-03 at 16:58 +0100, jfaslist wrote:
> Hi,
> Yes, we are going to dig into all this CPC925 and Processor Interface
> initialization.
> Note that I checked that both MSR_ME and MSR_RI were set prior to
> triggering the PCI Target-Abort.
>
> -MSR_ME: If not set the CPU will "checkstop" on a machine check.
> -MSR_RI: So that the exception is recoverable.
>
> Regarding MSR_RI, this should always be set, I think?

Yes, MSR:RI is always set by the kernel except in the rare code path
where taking an exception is actually unsafe (like in some of the
exception handling code itself)

Ben.

From michael at ellerman.id.au Sat Feb 4 11:54:02 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Sat, 4 Feb 2006 11:54:02 +1100
Subject: how to limit memory with 2.6.10 on ppc64 machine?
In-Reply-To: <43E3E55C.90504@nortel.com>
References: <43E3E55C.90504@nortel.com>
Message-ID: <200602041154.12710.michael@ellerman.id.au>

On Sat, 4 Feb 2006 10:21, Christopher Friesen wrote:
> I'm running 2.6.10 on a ppc64 machine with 4GB of memory.
>
> We're debugging an issue and would like to try and see if disabling the
> U3 DART makes the problem go away. Unfortunately, this particular blade
> is unstable if not all the memory banks are populated.
>
> After some frustration I looked at the code and realized that the "mem="
> functionality is not supported for ppc64 on this particular kernel.
>
> Can anyone give me some advice on the simplest way to limit this thing
> to under 2GB of memory so that the DART is not allocated/used?
>
> Does anyone know when support for "mem=" was added? I know it is there
> in the current git version, but the "powerpc" consolidation means
> everything is all different now.

From memory (harhar) the mem= support was merged in 2.6.11, so the original
patch should _probably_ apply on a vanilla 2.6.10 tree, try it:

http://patchwork.ozlabs.org/linuxppc64/patch?id=724

cheers

--
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From akpm at osdl.org Sat Feb 4 15:25:31 2006
From: akpm at osdl.org (Andrew Morton)
Date: Fri, 3 Feb 2006 20:25:31 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was: msi support)
In-Reply-To: <20060203201441.194be500.pj@sgi.com>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com> <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com> <20060203201441.194be500.pj@sgi.com>
Message-ID: <20060203202531.27d685fa.akpm@osdl.org>

Paul Jackson wrote:
>
> The following patch seems to be breaking my ia64 sn2_defconfig
> build of 2.6.16-rc1-mm5:
>
> gregkh-pci-altix-msi-support-git-ia64-fix.patch
>
> I'm guessing you should remove it for now.
>
>
> Details:
> ========
>
> When I try to build an ia64 sn2_defconfig 2.6.16-rc1-mm5, the
> build fails:
>
> arch/ia64/sn/pci/tioce_provider.c:699:49: macro "ATE_MAKE" passed 3 arguments, but takes just 2
> arch/ia64/sn/pci/tioce_provider.c: In function `tioce_reserve_m32':
> arch/ia64/sn/pci/tioce_provider.c:699: error: `ATE_MAKE' undeclared (first use in this function)
>
> If I remove the patch:
>
> gregkh-pci-altix-msi-support-git-ia64-fix.patch
>
> then it compiles fine.

OK. I autodrop several of Greg's MSI patches because a) they had bugs which
broke stuff a while ago and b) they don't apply and I'm lazy. So it
looks like you've found a fix for a patch which isn't actually in -mm any
more. I sent that fix to Greg the other day.

> It seems that someone added a patchset to change the ATE_MAKE()
> macro from 2 to 3 args, then someone added this above fix patch
> for a missed change, then someone reverted it all back to 2 args,
> but leaving this fix patch.
>
> I guess it means Andrew should remove the above patch.

I'll do that, thanks.

From akpm at osdl.org Sat Feb 4 15:27:42 2006
From: akpm at osdl.org (Andrew Morton)
Date: Fri, 3 Feb 2006 20:27:42 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was: msi support)
In-Reply-To: <20060203202531.27d685fa.akpm@osdl.org>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com> <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com> <20060203201441.194be500.pj@sgi.com> <20060203202531.27d685fa.akpm@osdl.org>
Message-ID: <20060203202742.1e514fcc.akpm@osdl.org>

Andrew Morton wrote:
>
> So it
> looks like you've found a fix for a patch which isn't actually in -mm any
> more. I sent that fix to Greg the other day.

Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
git-ia64.patch when gregkh-pci-altix-msi-support.patch is also applied, so
it's not presently useful to either Greg or Tony. I'll take care of it,
somehow..

From pj at sgi.com Sat Feb 4 15:14:41 2006
From: pj at sgi.com (Paul Jackson)
Date: Fri, 3 Feb 2006 20:14:41 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was: msi support)
In-Reply-To: <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com> <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
Message-ID: <20060203201441.194be500.pj@sgi.com>

Andrew,

The following patch seems to be breaking my ia64 sn2_defconfig
build of 2.6.16-rc1-mm5:

  gregkh-pci-altix-msi-support-git-ia64-fix.patch

I'm guessing you should remove it for now.


Details:
========

When I try to build an ia64 sn2_defconfig 2.6.16-rc1-mm5, the
build fails:

  arch/ia64/sn/pci/tioce_provider.c:699:49: macro "ATE_MAKE" passed 3 arguments, but takes just 2
  arch/ia64/sn/pci/tioce_provider.c: In function `tioce_reserve_m32':
  arch/ia64/sn/pci/tioce_provider.c:699: error: `ATE_MAKE' undeclared (first use in this function)

If I remove the patch:

  gregkh-pci-altix-msi-support-git-ia64-fix.patch

then it compiles fine.

It seems that someone added a patchset to change the ATE_MAKE()
macro from 2 to 3 args, then someone added this above fix patch
for a missed change, then someone reverted it all back to 2 args,
but leaving this fix patch.

I guess it means Andrew should remove the above patch.

But I really do not know what is going on here.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson 1.925.600.0401

From maule at sgi.com Sat Feb 4 15:42:34 2006
From: maule at sgi.com (Mark Maule)
Date: Fri, 3 Feb 2006 22:42:34 -0600
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was: msi support)
In-Reply-To: <20060203202742.1e514fcc.akpm@osdl.org>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com> <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com> <20060203201441.194be500.pj@sgi.com> <20060203202531.27d685fa.akpm@osdl.org> <20060203202742.1e514fcc.akpm@osdl.org>
Message-ID: <20060204044234.GA31134@sgi.com>

On Fri, Feb 03, 2006 at 08:27:42PM -0800, Andrew Morton wrote:
> Andrew Morton wrote:
> >
> > So it
> > looks like you've found a fix for a patch which isn't actually in -mm any
> > more. I sent that fix to Greg the other day.
>
> Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
> git-ia64.patch when gregkh-pci-altix-msi-support.patch is also applied, so
> it's not presently useful to either Greg or Tony. I'll take care of it,
> somehow..
>

I think what happened here is that I submitted a patchset for msi
abstractions (and others posted a couple of subsequent bugfix incrementals),
but these were not taken into the 2.6.16 base 'cause of their invasiveness.
These patches touched the tioce_provider.c file.

Then I submitted another patch which touched the tioce_provider.c file, and
it looks like I probably based this file on the previous msi versions which
were being held back, so in order for everything to build, you need all of
the msi patches applied first.

What's the preferred way to handle this ... fix the current ia64 build and
then resubmit the msi patches relative to that base?

Mark

From akpm at osdl.org Sat Feb 4 16:08:07 2006
From: akpm at osdl.org (Andrew Morton)
Date: Fri, 3 Feb 2006 21:08:07 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was: msi support)
In-Reply-To: <20060204044234.GA31134@sgi.com>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com> <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com> <20060203201441.194be500.pj@sgi.com> <20060203202531.27d685fa.akpm@osdl.org> <20060203202742.1e514fcc.akpm@osdl.org> <20060204044234.GA31134@sgi.com>
Message-ID: <20060203210807.56a48888.akpm@osdl.org>

Mark Maule wrote:
>
> On Fri, Feb 03, 2006 at 08:27:42PM -0800, Andrew Morton wrote:
> > Andrew Morton wrote:
> > >
> > > So it
> > > looks like you've found a fix for a patch which isn't actually in -mm any
> > > more. I sent that fix to Greg the other day.
> >
> > Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
> > git-ia64.patch when gregkh-pci-altix-msi-support.patch is also applied, so
> > it's not presently useful to either Greg or Tony. I'll take care of it,
> > somehow..
> >
>
> I think what happened here is that I submitted a patchset for msi
> abstractions (and others posted a couple of subsequent bugfix incrementals),
> but these were not taken into the 2.6.16 base 'cause of their invasiveness.
> These patches touched the tioce_provider.c file.
>
> Then I submitted another patch which touched the tioce_provider.c file, and
> it looks like I probably based this file on the previous msi versions which
> were being held back, so in order for everything to build, you need all of
> the msi patches applied first.
>
> What's the preferred way to handle this ... fix the current ia64 build and
> then resubmit the msi patches relative to that base?
>

umm, tricky. This situation doesn't arise very often.

What you could do is to prepare the patches against Tony's latest tree.
Then I can put them in -mm and Greg can drop them. Once Tony merges up with Linus I transfer the patches to Greg. Or we put the patches into Tony's tree. Either way - they'll be the same patches. But it does mean that the patches won't be merged into mainline until Tony merges up. If that's a problem then we'll need to think again. From benh at kernel.crashing.org Sun Feb 5 09:56:02 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 05 Feb 2006 09:56:02 +1100 Subject: [PATCH] powerpc: Fix PowerMac sound i2c In-Reply-To: <87y80rkmk7.fsf@briny.internal.ondioline.org> References: <1136695956.30123.44.camel@localhost.localdomain> <87y80rkmk7.fsf@briny.internal.ondioline.org> Message-ID: <1139093763.5634.3.camel@localhost.localdomain> On Sat, 2006-02-04 at 18:26 +0000, Paul Collins wrote: > Hi Ben, > > Benjamin Herrenschmidt writes: > > > My patch reworking the PowerMac i2c code break the sound drivers as they > > used to rely on some broken behaviour of i2c-keywest that is gone now. > > This patch should fix them (tested on a g5 with alsa only). It might > > also fix an oops if the alsa driver hits an unsupported chip. > > Applied Linus's current git tree, this patch makes ALSA sound on my > PowerBook5,4 work again. The second patch does not work because the > i2c wrapper (I assume that's what i2c_smbus_write_i2c_block_data is) > has apparently not yet returned. > > It would be nice to have this fix in 2.6.16 if possible. The second patch is the one that should go in, but it relies on an i2c fix (undoing some Bunk damage) that is still staging in Greg tree... I don't know what's up with that, I'll ask around. Ben. 
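
Editorial illustration: the helper in the patch under discussion (tumbler_write_block() / tas_write_block(), quoted in Paul's message in this thread) fills a length-prefixed buffer before handing it to i2c_smbus_xfer(). What follows is a minimal userspace sketch of just that packing step, plus the most-significant-byte-first split used for the volume registers. It is not the driver code: pack_block(), split_be() and BLOCK_MAX are hypothetical names for illustration, BLOCK_MAX is assumed to mirror the kernel's I2C_SMBUS_BLOCK_MAX, and the bounds check is an addition that the quoted helper does not have.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_MAX 32	/* assumption: mirrors the kernel's I2C_SMBUS_BLOCK_MAX */

/*
 * Pack 'len' payload bytes into a length-prefixed buffer, the layout the
 * quoted helper builds: out[0] = len, out[1..len] = payload.
 * 'out' must have room for BLOCK_MAX + 1 bytes.
 * Returns 0 on success, -1 if 'len' is out of range (check added here).
 */
static int pack_block(uint8_t *out, const uint8_t *values, int len)
{
	if (len < 1 || len > BLOCK_MAX)
		return -1;
	out[0] = (uint8_t)len;
	memcpy(&out[1], values, (size_t)len);
	return 0;
}

/*
 * Split 'vol' into 'nbytes' bytes (1..4), most significant byte first,
 * using the same shift expression as the quoted mono-volume code.
 */
static void split_be(uint8_t *block, uint32_t vol, int nbytes)
{
	int i;

	for (i = 0; i < nbytes; i++)
		block[i] = (vol >> ((nbytes - i - 1) * 8)) & 0xff;
}
```

A caller would run split_be() on a register value and then pack_block() the result; in the real drivers the packed buffer lives in a union i2c_smbus_data.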
From haren at us.ibm.com Sun Feb 5 13:25:14 2006 From: haren at us.ibm.com (Haren Myneni) Date: Sat, 04 Feb 2006 18:25:14 -0800 Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel In-Reply-To: <8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org> References: <20060203080609.403CA68A1F@ozlabs.org> <8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org> Message-ID: <43E5620A.6060503@us.ibm.com> Kumar Gala wrote: >On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote: > > > >>It's possible for prom_init to allocate the flat device tree inside >>the >>kdump crash kernel region. If this happens, when we load the kdump >>kernel we >>overwrite the flattened device tree, which is bad. >> >>We could make prom_init try and avoid allocating inside the crash >>kernel >>region, but then we run into issues if the crash kernel region uses >>all the >>space inside the RMO. The easiest solution is to move the flat >>device tree >>once we're running in the kernel. >> >>Signed-off-by: Michael Ellerman >> >> > >Doesn't setup_32.c need a similar change? > > At present kdump will not be supported on ppc32. In case, kdump_move_device_tree() can be called at the beginning of unflatten_device_tree() to support both ppc32 and 64 instead of making changes in setup_64.c and setup_32.c. The extern definition can be removed from asm-powerpc/prom.h and this function can be static. Michael, what do you think if we have some printk to tell the user that device_tree is moved to new location. Because, the console messages from prom_init are saying about old addresses. 
Thanks Haren >- k > > > >>--- >> >> arch/powerpc/kernel/prom.c | 27 +++++++++++++++++++++++++++ >> arch/powerpc/kernel/setup_64.c | 3 +++ >> include/asm-powerpc/prom.h | 2 ++ >> 3 files changed, 32 insertions(+) >> >>Index: kdump/arch/powerpc/kernel/prom.c >>=================================================================== >>--- kdump.orig/arch/powerpc/kernel/prom.c >>+++ kdump/arch/powerpc/kernel/prom.c >>@@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n >> >> return 0; >> } >>+ >>+#ifdef CONFIG_KEXEC >>+/* We may have allocated the flat device tree inside the crash >>kernel region >>+ * in prom_init. If so we need to move it out into regular memory. */ >>+void kdump_move_device_tree(void) >>+{ >>+ unsigned long start, end; >>+ struct boot_param_header *new; >>+ >>+ start = __pa((unsigned long)initial_boot_params); >>+ end = start + initial_boot_params->totalsize; >>+ >>+ if (end < crashk_res.start || start > crashk_res.end) >>+ return; >>+ >>+ new = (struct boot_param_header*) >>+ __va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE)); >>+ >>+ memcpy(new, initial_boot_params, initial_boot_params->totalsize); >>+ >>+ initial_boot_params = new; >>+ >>+ DBG("Flat device tree blob moved to %p\n", initial_boot_params); >>+ >>+ /* XXX should we unreserve the old DT? 
*/ >>+} >>+#endif /* CONFIG_KEXEC */ >>Index: kdump/arch/powerpc/kernel/setup_64.c >>=================================================================== >>--- kdump.orig/arch/powerpc/kernel/setup_64.c >>+++ kdump/arch/powerpc/kernel/setup_64.c >>@@ -398,6 +398,9 @@ void __init setup_system(void) >> { >> DBG(" -> setup_system()\n"); >> >>+#ifdef CONFIG_KEXEC >>+ kdump_move_device_tree(); >>+#endif >> /* >> * Unflatten the device-tree passed by prom_init or kexec >> */ >>Index: kdump/include/asm-powerpc/prom.h >>=================================================================== >>--- kdump.orig/include/asm-powerpc/prom.h >>+++ kdump/include/asm-powerpc/prom.h >>@@ -222,5 +222,7 @@ extern int of_address_to_resource(struct >> extern int of_pci_address_to_resource(struct device_node *dev, int >>bar, >> struct resource *r); >> >>+extern void kdump_move_device_tree(void); >>+ >> #endif /* __KERNEL__ */ >> #endif /* _POWERPC_PROM_H */ >>_______________________________________________ >>Linuxppc64-dev mailing list >>Linuxppc64-dev at ozlabs.org >>https://ozlabs.org/mailman/listinfo/linuxppc64-dev >> >> > >_______________________________________________ >Linuxppc64-dev mailing list >Linuxppc64-dev at ozlabs.org >https://ozlabs.org/mailman/listinfo/linuxppc64-dev > > > From paul at briny.ondioline.org Sun Feb 5 05:26:16 2006 From: paul at briny.ondioline.org (Paul Collins) Date: Sat, 04 Feb 2006 18:26:16 +0000 Subject: [PATCH] powerpc: Fix PowerMac sound i2c In-Reply-To: <1136695956.30123.44.camel@localhost.localdomain> (Benjamin Herrenschmidt's message of "Sun, 08 Jan 2006 15:52:36 +1100") References: <1136695956.30123.44.camel@localhost.localdomain> Message-ID: <87y80rkmk7.fsf@briny.internal.ondioline.org> Hi Ben, Benjamin Herrenschmidt writes: > My patch reworking the PowerMac i2c code break the sound drivers as they > used to rely on some broken behaviour of i2c-keywest that is gone now. > This patch should fix them (tested on a g5 with alsa only). 
It might > also fix an oops if the alsa driver hits an unsupported chip. Applied Linus's current git tree, this patch makes ALSA sound on my PowerBook5,4 work again. The second patch does not work because the i2c wrapper (I assume that's what i2c_smbus_write_i2c_block_data is) has apparently not yet returned. It would be nice to have this fix in 2.6.16 if possible. > Signed-off-by: Benjamin Herrenschmidt > > Index: linux-work/sound/ppc/tumbler.c > =================================================================== > --- linux-work.orig/sound/ppc/tumbler.c 2005-11-24 17:19:14.000000000 +1100 > +++ linux-work/sound/ppc/tumbler.c 2006-01-08 15:18:09.000000000 +1100 > @@ -137,6 +137,22 @@ static int send_init_client(pmac_keywest > return 0; > } > > +static int tumbler_write_block(struct i2c_client *client, u8 reg, int len, > + u8 *values) > +{ > + union i2c_smbus_data data; > + int err; > + > + data.block[0] = len; > + memcpy(&data.block[1], values, len); > + err = i2c_smbus_xfer(client->adapter, client->addr, client->flags, > + I2C_SMBUS_WRITE, reg, I2C_SMBUS_I2C_BLOCK_DATA, > + &data); > + return err; > +} > + > + > + > > static int tumbler_init_client(pmac_keywest_t *i2c) > { > @@ -239,8 +255,7 @@ static int tumbler_set_master_volume(pma > block[4] = (right_vol >> 8) & 0xff; > block[5] = (right_vol >> 0) & 0xff; > > - if (i2c_smbus_write_block_data(mix->i2c.client, TAS_REG_VOL, > - 6, block) < 0) { > + if (tumbler_write_block(mix->i2c.client, TAS_REG_VOL, 6, block) < 0) { > snd_printk("failed to set volume \n"); > return -EINVAL; > } > @@ -340,8 +355,7 @@ static int tumbler_set_drc(pmac_tumbler_ > val[1] = 0; > } > > - if (i2c_smbus_write_block_data(mix->i2c.client, TAS_REG_DRC, > - 2, val) < 0) { > + if (tumbler_write_block(mix->i2c.client, TAS_REG_DRC, 2, val) < 0) { > snd_printk("failed to set DRC\n"); > return -EINVAL; > } > @@ -376,8 +390,7 @@ static int snapper_set_drc(pmac_tumbler_ > val[4] = 0x60; > val[5] = 0xa0; > > - if 
(i2c_smbus_write_block_data(mix->i2c.client, TAS_REG_DRC, > - 6, val) < 0) { > + if (tumbler_write_block(mix->i2c.client, TAS_REG_DRC, 6, val) < 0) { > snd_printk("failed to set DRC\n"); > return -EINVAL; > } > @@ -481,8 +494,8 @@ static int tumbler_set_mono_volume(pmac_ > vol = info->table[vol]; > for (i = 0; i < info->bytes; i++) > block[i] = (vol >> ((info->bytes - i - 1) * 8)) & 0xff; > - if (i2c_smbus_write_block_data(mix->i2c.client, info->reg, > - info->bytes, block) < 0) { > + if (tumbler_write_block(mix->i2c.client, info->reg, > + info->bytes, block) < 0) { > snd_printk("failed to set mono volume %d\n", info->index); > return -EINVAL; > } > @@ -611,7 +624,7 @@ static int snapper_set_mix_vol1(pmac_tum > for (j = 0; j < 3; j++) > block[i * 3 + j] = (vol >> ((2 - j) * 8)) & 0xff; > } > - if (i2c_smbus_write_block_data(mix->i2c.client, reg, 9, block) < 0) { > + if (tumbler_write_block(mix->i2c.client, reg, 9, block) < 0) { > snd_printk("failed to set mono volume %d\n", reg); > return -EINVAL; > } > Index: linux-work/sound/oss/dmasound/tas_common.h > =================================================================== > --- linux-work.orig/sound/oss/dmasound/tas_common.h 2005-11-24 17:19:14.000000000 +1100 > +++ linux-work/sound/oss/dmasound/tas_common.h 2006-01-08 15:33:29.000000000 +1100 > @@ -157,6 +157,21 @@ tas_mono_to_stereo(uint mono) > return mono | (mono<<8); > } > > +static int tas_write_block(struct i2c_client *client, u8 reg, int len, u8 *vals) > +{ > + union i2c_smbus_data data; > + int err; > + > + data.block[0] = len; > + memcpy(&data.block[1], vals, len); > + err = i2c_smbus_xfer(client->adapter, client->addr, client->flags, > + I2C_SMBUS_WRITE, reg, I2C_SMBUS_I2C_BLOCK_DATA, > + &data); > + return err; > +} > + > + > + > /* > * Todo: make these functions a bit more efficient ! 
> */ > @@ -178,10 +193,8 @@ tas_write_register( struct tas_data_t *s > if (write_mode & WRITE_SHADOW) > memcpy(self->shadow[reg_num],data,reg_width); > if (write_mode & WRITE_HW) { > - rc=i2c_smbus_write_block_data(self->client, > - reg_num, > - reg_width, > - data); > + rc = tas_write_block(self->client, reg_num, > + reg_width, data); > if (rc < 0) { > printk("tas: I2C block write failed \n"); > return rc; > @@ -199,10 +212,8 @@ tas_sync_register( struct tas_data_t *se > > if (reg_width==0 || self==NULL) > return -EINVAL; > - rc=i2c_smbus_write_block_data(self->client, > - reg_num, > - reg_width, > - self->shadow[reg_num]); > + rc = tas_write_block(self->client, reg_num, > + reg_width, self->shadow[reg_num]); > if (rc < 0) { > printk("tas: I2C block write failed \n"); > return rc; > Index: linux-work/sound/ppc/pmac.c > =================================================================== > --- linux-work.orig/sound/ppc/pmac.c 2005-12-19 16:13:48.000000000 +1100 > +++ linux-work/sound/ppc/pmac.c 2006-01-08 15:37:10.000000000 +1100 > @@ -74,7 +74,7 @@ static int snd_pmac_dbdma_alloc(pmac_t * > > static void snd_pmac_dbdma_free(pmac_t *chip, pmac_dbdma_t *rec) > { > - if (rec) { > + if (rec->space) { > unsigned int rsize = sizeof(struct dbdma_cmd) * (rec->size + 1); > > dma_free_coherent(&chip->pdev->dev, rsize, rec->space, rec->dma_base); > @@ -895,6 +895,7 @@ static int __init snd_pmac_detect(pmac_t > chip->can_capture = 1; > chip->num_freqs = ARRAY_SIZE(awacs_freqs); > chip->freq_table = awacs_freqs; > + chip->pdev = NULL; > > chip->control_mask = MASK_IEPC | MASK_IEE | 0x11; /* default */ > -- Dag vijandelijk luchtschip de huismeester is dood From bdc at carlstrom.com Sun Feb 5 17:10:48 2006 From: bdc at carlstrom.com (Brian D. 
Carlstrom) Date: 5 Feb 2006 06:10:48 -0000 Subject: G5 fan problems return moving to 2.6.15 with dual processor 2.7GHz machine Message-ID: <20060205061048.7261.qmail@electricrain.com> I've been having problems with overheating on two of my three dual processor 2.7GHzs running Fedora Core 4's 2.6.14 kernels since November. Because of a pending deadline and the time of year, I simply opened the window and let nature cool the machines. In early January, I saw the therm_pm72.c fix in 2.6.15 http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007299.html [PATCH] powerpc: more g5 overtemp problem fix I tried Fedora's updates-testing 2.6.15 kernel to get the fix. That caused the fans to blow full blast like the old days, which was better than leaving the window open, which had several issues with storm winds closing the windows, unhappy facilities managers, and "helpful" co-workers closing the windows for me. Finally last week I had some time to work on this. My first step was to backport the therm_pm72.c fix from 2.6.14 and that worked like a charm, allowing humans with hearing to inhabit the office again. I'm running CPU simulations 24/7 on these machines, and without this fix they were powering off once a day or more without any fix, although I'd fixed them to reboot instead with a /sbin/critical_overtemp script that called "reboot -f". However, even after reporting the problems with the 2.6.15 updates-testing kernel, Fedora Core released the 2.6.15 kernel update anyway. I decided to try and debug what is going on since other people are going to start seeing this issue. 
Looking at the dmesg output change between 2.6.14 and 2.6.15, both start with the following: PowerMac G5 Thermal control driver 1.2b2 Detected fan controls: 0: PWM fan, id 1, location: BACKSIDE,SYS CTRLR FAN 1: RPM fan, id 2, location: DRIVE BAY 2: PWM fan, id 2, location: SLOT,PCI FAN 3: RPM fan, id 3, location: CPU A INTAKE 4: RPM fan, id 4, location: CPU A EXHAUST 5: RPM fan, id 5, location: CPU B INTAKE 6: RPM fan, id 6, location: CPU B EXHAUST 7: RPM fan, id 1, location: CPU A PUMP 8: RPM fan, id 0, location: CPU B PUMP However, 2.6.14 has the following addition line which I've come to expect on the 2.5GHz and 2.7GHz machines, although not on the 2.0GHz machines of course: Liquid cooling pumps detected, using new algorithm ! I decided to do a little more debugging before reporting this. I built the driver with "#define DEBUG" and added some additional DBG tracing messages (marked "XXX bdc" below). Here is the output with therm_pm72 built into the kernel, not as a module: Feb 4 12:19:06 youngmc kernel: Detected fan controls: Feb 4 12:19:06 youngmc kernel: 0: PWM fan, id 1, location: BACKSIDE,SYS CTRLR FAN Feb 4 12:19:06 youngmc kernel: 1: RPM fan, id 2, location: DRIVE BAY Feb 4 12:19:06 youngmc kernel: 2: PWM fan, id 2, location: SLOT,PCI FAN Feb 4 12:19:06 youngmc kernel: 3: RPM fan, id 3, location: CPU A INTAKE Feb 4 12:19:06 youngmc kernel: 4: RPM fan, id 4, location: CPU A EXHAUST Feb 4 12:19:06 youngmc kernel: 5: RPM fan, id 5, location: CPU B INTAKE Feb 4 12:19:06 youngmc kernel: 6: RPM fan, id 6, location: CPU B EXHAUST Feb 4 12:19:06 youngmc kernel: 7: RPM fan, id 1, location: CPU A PUMP Feb 4 12:19:06 youngmc kernel: 8: RPM fan, id 0, location: CPU B PUMP Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=monid Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=dvi Feb 4 12:19:06 youngmc kernel: XXX 
bdc therm_pm72_attach Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=vga Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=crt2 Feb 4 12:19:06 youngmc kernel: Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 Feb 4 12:19:06 youngmc kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx Feb 4 12:19:06 youngmc kernel: ide0: Found Apple K2 ATA-6 controller, bus ID 3, irq 39 Feb 4 12:19:06 youngmc kernel: hda: PIONEER DVD-RW DVR-109, ATAPI CD/DVD-ROM drive Feb 4 12:19:06 youngmc kernel: hda: Enabling Ultra DMA 4 Feb 4 12:19:06 youngmc kernel: ide0 at 0xd000080083656000-0xd000080083656007,0xd000080083656160 on irq 39 Feb 4 12:19:06 youngmc kernel: hda: ATAPI 32X DVD-ROM DVD-R CD-R/RW drive, 2000kB Cache, UDMA(66) Feb 4 12:19:06 youngmc kernel: Uniform CD-ROM driver Revision: 3.20 Feb 4 12:19:06 youngmc kernel: ide-floppy driver 0.99.newide Feb 4 12:19:06 youngmc kernel: usbcore: registered new driver libusual Feb 4 12:19:06 youngmc kernel: usbcore: registered new driver hiddev Feb 4 12:19:06 youngmc kernel: usbcore: registered new driver usbhid Feb 4 12:19:06 youngmc kernel: drivers/usb/input/hid-core.c: v2.6:USB HID core driver Feb 4 12:19:06 youngmc kernel: mice: PS/2 mouse device common for all mice Feb 4 12:19:06 youngmc kernel: /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address ! Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=mac-io 0 Feb 4 12:19:06 youngmc kernel: Found K2 Feb 4 12:19:06 youngmc kernel: Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits I'm guessing I should have seen a "found U3-0", but I see a suspicious message here: /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address ! that I do not see in the working 2.6.14 boot. 
I was wondering if the change from module to builtin was causing the problem (grasping at straws I guess) so I also tried building it as a module. I get the similar results: Feb 4 12:59:26 youngmc kernel: /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address ! Feb 4 12:59:26 youngmc kernel: Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits ... Feb 4 12:59:27 youngmc kernel: Detected fan controls: Feb 4 12:59:27 youngmc kernel: 0: PWM fan, id 1, location: BACKSIDE,SYS CTRLR FAN Feb 4 12:59:27 youngmc kernel: 1: RPM fan, id 2, location: DRIVE BAY Feb 4 12:59:27 youngmc kernel: 2: PWM fan, id 2, location: SLOT,PCI FAN Feb 4 12:59:27 youngmc kernel: 3: RPM fan, id 3, location: CPU A INTAKE Feb 4 12:59:27 youngmc kernel: 4: RPM fan, id 4, location: CPU A EXHAUST Feb 4 12:59:27 youngmc kernel: 5: RPM fan, id 5, location: CPU B INTAKE Feb 4 12:59:27 youngmc kernel: 6: RPM fan, id 6, location: CPU B EXHAUST Feb 4 12:59:27 youngmc kernel: 7: RPM fan, id 1, location: CPU A PUMP Feb 4 12:59:28 youngmc kernel: 8: RPM fan, id 0, location: CPU B PUMP Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=monid Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=dvi Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=vga Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=crt2 Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach Feb 4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=mac-io 0 Feb 4 12:59:28 youngmc kernel: Found K2 Now this "/u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !" 
warning I'm seeing in both cases looked familar, in fact I was on a thread about it when the 2.7GHz machines first came out: http://patchwork.ozlabs.org/linuxppc64/patch?id=1982 The code that this patched applied to has moved to a new location arch/powerpc/kernel/prom_init.c, but logically it still seems like it should cover my case. The code says: if (u3_rev < 0x35 || u3_rev > 0x39) return; and my u3_rev looks to be 0x35 $ hexdump /proc/device-tree/u3 at 0,f8000000/device-rev 0000000 0000 0035 0000004 Unforunately it looks like I need to use prom_print to add debugging, which I'm guessing only comes to the console which I'm not near right now. Before going further, is there something obvious that the Fedora 2.6.15 kernel is doing wrong, given that the 2.6.14 kernel works and the 2.6.15 seems to have a regression? I'm willing to do some more debugging or try a more up-to-date kernel to help resolve this issue. One last note, my dual processor 2.0GHz and 2.5GHz machines are running fine with 2.6.15... -bri From benh at kernel.crashing.org Sun Feb 5 20:06:24 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 05 Feb 2006 20:06:24 +1100 Subject: G5 fan problems return moving to 2.6.15 with dual processor 2.7GHz machine In-Reply-To: <20060205061048.7261.qmail@electricrain.com> References: <20060205061048.7261.qmail@electricrain.com> Message-ID: <1139130385.5634.14.camel@localhost.localdomain> > I'm guessing I should have seen a "found U3-0", but I see a suspicious message here: > /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address ! > that I do not see in the working 2.6.14 boot. > > I was wondering if the change from module to builtin was causing the > problem (grasping at straws I guess) so I also tried building it as a > module. I get the similar results: > > Feb 4 12:59:26 youngmc kernel: /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address ! 
> Feb 4 12:59:26 youngmc kernel: Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits That looks like a fix in prom_init.c that is missing from 2.6.15.. I'll have to double check... Apple indeed seems to have a "bug" in the device-tree of some of those machines. prom_init.c has some code to fix it up, but there have been several versions of the fix and maybe that broke some way... Ah... now I'm reading the rest of the message and see that you figured that out already ;) > Now this "/u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !" > warning I'm seeing in both cases looked familar, in fact I was on a > thread about it when the 2.7GHz machines first came out: > > http://patchwork.ozlabs.org/linuxppc64/patch?id=1982 > > The code that this patched applied to has moved to a new location > arch/powerpc/kernel/prom_init.c, but logically it still seems like it > should cover my case. The code says: > > if (u3_rev < 0x35 || u3_rev > 0x39) > return; > > and my u3_rev looks to be 0x35 > $ hexdump /proc/device-tree/u3 at 0,f8000000/device-rev > 0000000 0000 0035 > 0000004 > > Unforunately it looks like I need to use prom_print to add debugging, > which I'm guessing only comes to the console which I'm not near right > now. > > Before going further, is there something obvious that the Fedora > 2.6.15 kernel is doing wrong, given that the 2.6.14 kernel works and > the 2.6.15 seems to have a regression? I'm willing to do some more > debugging or try a more up-to-date kernel to help resolve this issue. > > One last note, my dual processor 2.0GHz and 2.5GHz machines are running > fine with 2.6.15... Might be something in that prom_init.c fix that broke... it would be really nice if you could give a try with the console and find out what it is ... Unfortunately, I don't have access to one of these machines with the "problem" at the moment... Cheers, Ben. 
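For reference, the revision gate quoted in this thread can be checked in isolation. The following is only a user-space sketch, not the prom_init.c code: the helper names be32_cell() and u3_rev_in_fixup_range() are made up for illustration, and it assumes device-rev is a single 4-byte big-endian cell, which is what the hexdump output above suggests.

```c
#include <stdint.h>

/* Decode one 4-byte big-endian device-tree cell (assumed layout of
 * the device-rev property shown in the hexdump above). */
uint32_t be32_cell(const unsigned char b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Mirrors the quoted test "if (u3_rev < 0x35 || u3_rev > 0x39) return;".
 * Returns 1 when the revision is inside the range the fixup covers,
 * 0 when the quoted code would bail out early. */
int u3_rev_in_fixup_range(uint32_t u3_rev)
{
	return !(u3_rev < 0x35 || u3_rev > 0x39);
}
```

With the bytes 00 00 00 35 the decoded revision is 0x35, which sits on the lower edge of the range, so the quoted check by itself would not skip this machine. That points at some other part of the fixup rather than the revision test.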
From galak at kernel.crashing.org Mon Feb 6 03:41:18 2006 From: galak at kernel.crashing.org (Kumar Gala) Date: Sun, 5 Feb 2006 10:41:18 -0600 Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel In-Reply-To: <43E5620A.6060503@us.ibm.com> References: <20060203080609.403CA68A1F@ozlabs.org> <8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org> <43E5620A.6060503@us.ibm.com> Message-ID: <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org> On Feb 4, 2006, at 8:25 PM, Haren Myneni wrote: > Kumar Gala wrote: > >> On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote: >> >> >>> It's possible for prom_init to allocate the flat device tree >>> inside the >>> kdump crash kernel region. If this happens, when we load the >>> kdump kernel we >>> overwrite the flattened device tree, which is bad. >>> >>> We could make prom_init try and avoid allocating inside the >>> crash kernel >>> region, but then we run into issues if the crash kernel region >>> uses all the >>> space inside the RMO. The easiest solution is to move the flat >>> device tree >>> once we're running in the kernel. >>> >>> Signed-off-by: Michael Ellerman >>> >> >> Doesn't setup_32.c need a similar change? >> > At present kdump will not be supported on ppc32. > In case, kdump_move_device_tree() can be called at the beginning of > unflatten_device_tree() to support both ppc32 and 64 instead of > making changes in setup_64.c and setup_32.c. The extern definition > can be removed from asm-powerpc/prom.h and this function can be > static. > > Michael, what do you think if we have some printk to tell the user > that device_tree is moved to new location. Because, the console > messages from prom_init are saying about old addresses. What's the issue with kdump on ppc32? One of the reasons we merged ppc32 and ppc64 was to avoid such issues going forward? 
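As an aside on the patch under discussion: the decision whether to move the flat device tree comes down to an interval test against the crash kernel region. Below is a rough user-space model of that test only. struct model_region and dt_overlaps_crash_region() are hypothetical names, and the region is assumed to use an inclusive end address in the way crashk_res.end is used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for crashk_res: [start, end], end inclusive. */
struct model_region {
	uint64_t start;
	uint64_t end;
};

/* Mirrors the early return in the patch,
 *   if (end < crashk_res.start || start > crashk_res.end) return;
 * and reports true when the blob would have to be moved. */
bool dt_overlaps_crash_region(uint64_t dt_start, uint64_t dt_size,
			      struct model_region crashk)
{
	/* Like the patch, this is start + totalsize, i.e. one past the
	 * last byte of the blob. */
	uint64_t dt_end = dt_start + dt_size;

	return !(dt_end < crashk.start || dt_start > crashk.end);
}
```

One property worth noting: because the blob end is computed as start plus size, a blob that finishes exactly where the region begins is still reported as overlapping. That costs an unnecessary copy in that one corner case but errs on the safe side.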
- k >>> --- >>> >>> arch/powerpc/kernel/prom.c | 27 +++++++++++++++++++++++++++ >>> arch/powerpc/kernel/setup_64.c | 3 +++ >>> include/asm-powerpc/prom.h | 2 ++ >>> 3 files changed, 32 insertions(+) >>> >>> Index: kdump/arch/powerpc/kernel/prom.c >>> =================================================================== >>> --- kdump.orig/arch/powerpc/kernel/prom.c >>> +++ kdump/arch/powerpc/kernel/prom.c >>> @@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n >>> >>> return 0; >>> } >>> + >>> +#ifdef CONFIG_KEXEC >>> +/* We may have allocated the flat device tree inside the crash >>> kernel region >>> + * in prom_init. If so we need to move it out into regular >>> memory. */ >>> +void kdump_move_device_tree(void) >>> +{ >>> + unsigned long start, end; >>> + struct boot_param_header *new; >>> + >>> + start = __pa((unsigned long)initial_boot_params); >>> + end = start + initial_boot_params->totalsize; >>> + >>> + if (end < crashk_res.start || start > crashk_res.end) >>> + return; >>> + >>> + new = (struct boot_param_header*) >>> + __va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE)); >>> + >>> + memcpy(new, initial_boot_params, initial_boot_params->totalsize); >>> + >>> + initial_boot_params = new; >>> + >>> + DBG("Flat device tree blob moved to %p\n", initial_boot_params); >>> + >>> + /* XXX should we unreserve the old DT? 
*/ >>> +} >>> +#endif /* CONFIG_KEXEC */ >>> Index: kdump/arch/powerpc/kernel/setup_64.c >>> =================================================================== >>> --- kdump.orig/arch/powerpc/kernel/setup_64.c >>> +++ kdump/arch/powerpc/kernel/setup_64.c >>> @@ -398,6 +398,9 @@ void __init setup_system(void) >>> { >>> DBG(" -> setup_system()\n"); >>> >>> +#ifdef CONFIG_KEXEC >>> + kdump_move_device_tree(); >>> +#endif >>> /* >>> * Unflatten the device-tree passed by prom_init or kexec >>> */ >>> Index: kdump/include/asm-powerpc/prom.h >>> =================================================================== >>> --- kdump.orig/include/asm-powerpc/prom.h >>> +++ kdump/include/asm-powerpc/prom.h >>> @@ -222,5 +222,7 @@ extern int of_address_to_resource(struct >>> extern int of_pci_address_to_resource(struct device_node *dev, >>> int bar, >>> struct resource *r); >>> >>> +extern void kdump_move_device_tree(void); >>> + >>> #endif /* __KERNEL__ */ >>> #endif /* _POWERPC_PROM_H */ >>> _______________________________________________ >>> Linuxppc64-dev mailing list >>> Linuxppc64-dev at ozlabs.org >>> https://ozlabs.org/mailman/listinfo/linuxppc64-dev >>> >> >> _______________________________________________ >> Linuxppc64-dev mailing list >> Linuxppc64-dev at ozlabs.org >> https://ozlabs.org/mailman/listinfo/linuxppc64-dev >> >> From david at gibson.dropbear.id.au Mon Feb 6 13:18:53 2006 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 6 Feb 2006 13:18:53 +1100 Subject: Hugepages need clear_user_highpage() not clear_highpage() Message-ID: <20060206021853.GC10708@localhost.localdomain> When hugepages are newly allocated to a file in mm/hugetlb.c, we clear them with a call to clear_highpage() on each of the subpages. We should be using clear_user_highpage(): on powerpc, at least, clear_highpage() doesn't correctly mark the page as icache dirty so if the page is executed shortly after it's possible to get strange results. 
This is a bugfix and should go into 2.6.16. Signed-off-by: David Gibson Index: working-2.6/mm/hugetlb.c =================================================================== --- working-2.6.orig/mm/hugetlb.c 2006-02-06 12:58:07.000000000 +1100 +++ working-2.6/mm/hugetlb.c 2006-02-06 12:58:19.000000000 +1100 @@ -107,7 +107,7 @@ struct page *alloc_huge_page(struct vm_a set_page_count(page, 1); page[1].mapping = (void *)free_huge_page; for (i = 0; i < (HPAGE_SIZE/PAGE_SIZE); ++i) - clear_highpage(&page[i]); + clear_user_highpage(&page[i], addr); return page; } -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Mon Feb 6 13:24:53 2006 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 6 Feb 2006 13:24:53 +1100 Subject: powerpc: Cleanup, consolidating icache dirtying logic Message-ID: <20060206022453.GA19390@localhost.localdomain> The code to mark a page as icache dirty (so that it will later be icache-dcache flushed when we try to execute from it) is duplicated in three places: flush_dcache_page() does this marking and nothing else, but clear_user_page() and copy_user_page() duplicate it, since those functions make the page icache dirty themselves. This patch makes those other functions call flush_dcache_page() instead, so the logic's all in one place. This will make life less confusing if we ever need to tweak the details of the lazy icache flush mechanism.
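To illustrate the consolidation described above, here is a small user-space model, not kernel code. It assumes, as the removed lines in the diff suggest, that a set PG_arch_1 bit means "icache clean" and that clearing it marks the page icache dirty. struct model_page, MODEL_PG_ICACHE_CLEAN and the model_* functions are hypothetical stand-ins, and the coherent_icache argument stands in for cpu_has_feature(CPU_FTR_COHERENT_ICACHE).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the PG_arch_1 page flag. */
#define MODEL_PG_ICACHE_CLEAN (1UL << 0)

/* Hypothetical stand-in for struct page. */
struct model_page {
	unsigned long flags;
};

/* The one place that marks a page icache dirty. */
void model_flush_dcache_page(struct model_page *pg, bool coherent_icache)
{
	if (coherent_icache)
		return;	/* nothing to track on coherent hardware */

	/* Test first so an already-dirty page is not written again
	 * (the kernel does this to avoid an atomic op). */
	if (pg->flags & MODEL_PG_ICACHE_CLEAN)
		pg->flags &= ~MODEL_PG_ICACHE_CLEAN;
}

/* After the cleanup both helpers just delegate. */
void model_clear_user_page(struct model_page *pg, bool coherent_icache)
{
	/* clear_page() would zero the page contents here */
	model_flush_dcache_page(pg, coherent_icache);
}

void model_copy_user_page(struct model_page *pg, bool coherent_icache)
{
	/* copy_page() would copy the page contents here */
	model_flush_dcache_page(pg, coherent_icache);
}
```

The point of the shape is that any later change to the lazy flush policy touches model_flush_dcache_page() only, which is the same argument the patch makes for the real functions.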
arch/powerpc/mm/mem.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-) Signed-off-by: David Gibson Index: working-2.6/arch/powerpc/mm/mem.c =================================================================== --- working-2.6.orig/arch/powerpc/mm/mem.c 2006-02-06 12:58:07.000000000 +1100 +++ working-2.6/arch/powerpc/mm/mem.c 2006-02-06 13:20:29.000000000 +1100 @@ -435,17 +435,12 @@ void clear_user_page(void *page, unsigne { clear_page(page); - if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) - return; /* * We shouldnt have to do this, but some versions of glibc * require it (ld.so assumes zero filled pages are icache clean) * - Anton */ - - /* avoid an atomic op if possible */ - if (test_bit(PG_arch_1, &pg->flags)) - clear_bit(PG_arch_1, &pg->flags); + flush_dcache_page(pg); } EXPORT_SYMBOL(clear_user_page); @@ -469,12 +464,7 @@ void copy_user_page(void *vto, void *vfr return; #endif - if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) - return; - - /* avoid an atomic op if possible */ - if (test_bit(PG_arch_1, &pg->flags)) - clear_bit(PG_arch_1, &pg->flags); + flush_dcache_page(pg); } void flush_icache_user_range(struct vm_area_struct *vma, struct page *page, -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/~dgibson From haren at us.ibm.com Mon Feb 6 13:46:05 2006 From: haren at us.ibm.com (Haren Myneni) Date: Sun, 05 Feb 2006 18:46:05 -0800 Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel In-Reply-To: <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org> References: <20060203080609.403CA68A1F@ozlabs.org> <8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org> <43E5620A.6060503@us.ibm.com> <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org> Message-ID: <43E6B86D.4070108@us.ibm.com> Kumar Gala wrote: > > On Feb 4, 2006, at 8:25 PM, Haren Myneni wrote: > >> Kumar Gala wrote: >> >>> On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote: >>> >>> >>>> It's possible for prom_init to allocate the flat device tree >>>> inside the >>>> kdump crash kernel region. If this happens, when we load the >>>> kdump kernel we >>>> overwrite the flattened device tree, which is bad. >>>> >>>> We could make prom_init try and avoid allocating inside the crash >>>> kernel >>>> region, but then we run into issues if the crash kernel region >>>> uses all the >>>> space inside the RMO. The easiest solution is to move the flat >>>> device tree >>>> once we're running in the kernel. >>>> >>>> Signed-off-by: Michael Ellerman >>>> >>> >>> Doesn't setup_32.c need a similar change? >>> >> At present kdump will not be supported on ppc32. >> In case, kdump_move_device_tree() can be called at the beginning of >> unflatten_device_tree() to support both ppc32 and 64 instead of >> making changes in setup_64.c and setup_32.c. The extern definition >> can be removed from asm-powerpc/prom.h and this function can be static. >> >> Michael, what do you think if we have some printk to tell the user >> that device_tree is moved to new location. Because, the console >> messages from prom_init are saying about old addresses. > > > What's the issue with kdump on ppc32? One of the reasons we merged > ppc32 and ppc64 was to avoid such issues going forward? 
Main issue is with the user level kexec-tools which does not have the kdump support for PPC32. One of the changes should be some cleanup/merge the way it happened in the kernel. At present, normal kexec support is included for gamecube/ppc32. Any help is appreciated. Thanks Haren > > - k > >>>> --- >>>> >>>> arch/powerpc/kernel/prom.c | 27 +++++++++++++++++++++++++++ >>>> arch/powerpc/kernel/setup_64.c | 3 +++ >>>> include/asm-powerpc/prom.h | 2 ++ >>>> 3 files changed, 32 insertions(+) >>>> >>>> Index: kdump/arch/powerpc/kernel/prom.c >>>> =================================================================== >>>> --- kdump.orig/arch/powerpc/kernel/prom.c >>>> +++ kdump/arch/powerpc/kernel/prom.c >>>> @@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n >>>> >>>> return 0; >>>> } >>>> + >>>> +#ifdef CONFIG_KEXEC >>>> +/* We may have allocated the flat device tree inside the crash >>>> kernel region >>>> + * in prom_init. If so we need to move it out into regular >>>> memory. */ >>>> +void kdump_move_device_tree(void) >>>> +{ >>>> + unsigned long start, end; >>>> + struct boot_param_header *new; >>>> + >>>> + start = __pa((unsigned long)initial_boot_params); >>>> + end = start + initial_boot_params->totalsize; >>>> + >>>> + if (end < crashk_res.start || start > crashk_res.end) >>>> + return; >>>> + >>>> + new = (struct boot_param_header*) >>>> + __va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE)); >>>> + >>>> + memcpy(new, initial_boot_params, initial_boot_params->totalsize); >>>> + >>>> + initial_boot_params = new; >>>> + >>>> + DBG("Flat device tree blob moved to %p\n", initial_boot_params); >>>> + >>>> + /* XXX should we unreserve the old DT? 
*/ >>>> +} >>>> +#endif /* CONFIG_KEXEC */ >>>> Index: kdump/arch/powerpc/kernel/setup_64.c >>>> =================================================================== >>>> --- kdump.orig/arch/powerpc/kernel/setup_64.c >>>> +++ kdump/arch/powerpc/kernel/setup_64.c >>>> @@ -398,6 +398,9 @@ void __init setup_system(void) >>>> { >>>> DBG(" -> setup_system()\n"); >>>> >>>> +#ifdef CONFIG_KEXEC >>>> + kdump_move_device_tree(); >>>> +#endif >>>> /* >>>> * Unflatten the device-tree passed by prom_init or kexec >>>> */ >>>> Index: kdump/include/asm-powerpc/prom.h >>>> =================================================================== >>>> --- kdump.orig/include/asm-powerpc/prom.h >>>> +++ kdump/include/asm-powerpc/prom.h >>>> @@ -222,5 +222,7 @@ extern int of_address_to_resource(struct >>>> extern int of_pci_address_to_resource(struct device_node *dev, >>>> int bar, >>>> struct resource *r); >>>> >>>> +extern void kdump_move_device_tree(void); >>>> + >>>> #endif /* __KERNEL__ */ >>>> #endif /* _POWERPC_PROM_H */ >>>> _______________________________________________ >>>> Linuxppc64-dev mailing list >>>> Linuxppc64-dev at ozlabs.org >>>> https://ozlabs.org/mailman/listinfo/linuxppc64-dev >>>> >>> >>> _______________________________________________ >>> Linuxppc64-dev mailing list >>> Linuxppc64-dev at ozlabs.org >>> https://ozlabs.org/mailman/listinfo/linuxppc64-dev >>> >>> > > From michael at ellerman.id.au Mon Feb 6 14:04:49 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Mon, 6 Feb 2006 14:04:49 +1100 Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel In-Reply-To: <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org> References: <20060203080609.403CA68A1F@ozlabs.org> <43E5620A.6060503@us.ibm.com> <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org> Message-ID: <200602061404.52478.michael@ellerman.id.au> On Mon, 6 Feb 2006 03:41, Kumar Gala wrote: > On Feb 4, 2006, at 8:25 PM, Haren Myneni wrote: > > Kumar Gala wrote: > >> 
Doesn't setup_32.c need a similar change? > > > > At present kdump will not be supported on ppc32. > > In case, kdump_move_device_tree() can be called at the beginning of > > unflatten_device_tree() to support both ppc32 and 64 instead of > > making changes in setup_64.c and setup_32.c. The extern definition > > can be removed from asm-powerpc/prom.h and this function can be > > static. > > What's the issue with kdump on ppc32? One of the reasons we merged > ppc32 and ppc64 was to avoid such issues going forward? Currently in mainline all we have is hooks for ppc32 kexec, no actual implementation. Apparently it exists somewhere, but I haven't seen it. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060206/03a24485/attachment.pgp From michael at ellerman.id.au Mon Feb 6 17:34:14 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Mon, 06 Feb 2006 17:34:14 +1100 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked Message-ID: <20060206063434.22D37689F3@ozlabs.org> Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked. This can explode if we take the decrementer interrupt while we're in a module, because the toc pointer in r2 will be the module's toc pointer. Instead do an immediate load. I'm not sure if we really need the trickery in here, what do people think? 
arch/powerpc/kernel/head_64.S | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) Index: iseries/arch/powerpc/kernel/head_64.S =================================================================== --- iseries.orig/arch/powerpc/kernel/head_64.S +++ iseries/arch/powerpc/kernel/head_64.S @@ -752,8 +752,13 @@ decrementer_iSeries_masked: li r11,1 ld r12,PACALPPACAPTR(r13) stb r11,LPPACADECRINT(r12) - LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) - lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) + /* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy + * using the usual macros. We want to be fast, so we assume the top + * word of the lppaca pointer is the same as the top word of + * &tb_ticks_per_jiffy. + */ + oris r12,r12,tb_ticks_per_jiffy at h + lwz r12,tb_ticks_per_jiffy at l(r12) mtspr SPRN_DEC,r12 /* fall through */ From david at gibson.dropbear.id.au Mon Feb 6 17:42:47 2006 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 6 Feb 2006 17:42:47 +1100 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked In-Reply-To: <20060206063434.22D37689F3@ozlabs.org> References: <20060206063434.22D37689F3@ozlabs.org> Message-ID: <20060206064247.GA31631@localhost.localdomain> On Mon, Feb 06, 2006 at 05:34:14PM +1100, Michael Ellerman wrote: > Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using > LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked. > > This can explode if we take the decrementer interrupt while we're in a module, > because the toc pointer in r2 will be the module's toc pointer. > > Instead do an immediate load. I'm not sure if we really need the trickery in > here, what do people think? 
> > arch/powerpc/kernel/head_64.S | 9 +++++++-- > 1 files changed, 7 insertions(+), 2 deletions(-) > > Index: iseries/arch/powerpc/kernel/head_64.S > =================================================================== > --- iseries.orig/arch/powerpc/kernel/head_64.S > +++ iseries/arch/powerpc/kernel/head_64.S > @@ -752,8 +752,13 @@ decrementer_iSeries_masked: > li r11,1 > ld r12,PACALPPACAPTR(r13) > stb r11,LPPACADECRINT(r12) > - LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) > - lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) > + /* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy > + * using the usual macros. We want to be fast, so we assume the top > + * word of the lppaca pointer is the same as the top word of > + * &tb_ticks_per_jiffy. > + */ you need a clrrdi r12,r12,32 here, because r12's low word may not contain zero. > + oris r12,r12,tb_ticks_per_jiffy at h > + lwz r12,tb_ticks_per_jiffy at l(r12) > mtspr SPRN_DEC,r12 > /* fall through */ > > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Mon Feb 6 17:44:38 2006 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 6 Feb 2006 17:44:38 +1100 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked In-Reply-To: <20060206064247.GA31631@localhost.localdomain> References: <20060206063434.22D37689F3@ozlabs.org> <20060206064247.GA31631@localhost.localdomain> Message-ID: <20060206064438.GB31631@localhost.localdomain> On Mon, Feb 06, 2006 at 05:42:47PM +1100, David Gibson wrote: > On Mon, Feb 06, 2006 at 05:34:14PM +1100, Michael Ellerman wrote: > > Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using > > LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked. > > > > This can explode if we take the decrementer interrupt while we're in a module, > > because the toc pointer in r2 will be the module's toc pointer. > > > > Instead do an immediate load. I'm not sure if we really need the trickery in > > here, what do people think? > > > > arch/powerpc/kernel/head_64.S | 9 +++++++-- > > 1 files changed, 7 insertions(+), 2 deletions(-) > > > > Index: iseries/arch/powerpc/kernel/head_64.S > > =================================================================== > > --- iseries.orig/arch/powerpc/kernel/head_64.S > > +++ iseries/arch/powerpc/kernel/head_64.S > > @@ -752,8 +752,13 @@ decrementer_iSeries_masked: > > li r11,1 > > ld r12,PACALPPACAPTR(r13) > > stb r11,LPPACADECRINT(r12) > > - LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) > > - lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) > > + /* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy > > + * using the usual macros. We want to be fast, so we assume the top > > + * word of the lppaca pointer is the same as the top word of > > + * &tb_ticks_per_jiffy. > > + */ > > you need a > clrrdi r12,r12,32 > here, because r12's low word may not contain zero. 
> > > + oris r12,r12,tb_ticks_per_jiffy at h Oh, and that needs to be @ha, because the load offset below is treated as signed. > > + lwz r12,tb_ticks_per_jiffy at l(r12) > > mtspr SPRN_DEC,r12 > > /* fall through */ -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From olof at lixom.net Mon Feb 6 17:53:54 2006 From: olof at lixom.net (Olof Johansson) Date: Mon, 6 Feb 2006 00:53:54 -0600 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked In-Reply-To: <20060206064247.GA31631@localhost.localdomain> References: <20060206063434.22D37689F3@ozlabs.org> <20060206064247.GA31631@localhost.localdomain> Message-ID: <20060206065354.GA7626@pb15.lixom.net> On Mon, Feb 06, 2006 at 05:42:47PM +1100, David Gibson wrote: > On Mon, Feb 06, 2006 at 05:34:14PM +1100, Michael Ellerman wrote: > > Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using > > LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked. > > > > This can explode if we take the decrementer interrupt while we're in a module, > > because the toc pointer in r2 will be the module's toc pointer. > > > > Instead do an immediate load. I'm not sure if we really need the trickery in > > here, what do people think? 
> > > > arch/powerpc/kernel/head_64.S | 9 +++++++-- > > 1 files changed, 7 insertions(+), 2 deletions(-) > > > > Index: iseries/arch/powerpc/kernel/head_64.S > > =================================================================== > > --- iseries.orig/arch/powerpc/kernel/head_64.S > > +++ iseries/arch/powerpc/kernel/head_64.S > > @@ -752,8 +752,13 @@ decrementer_iSeries_masked: > > li r11,1 > > ld r12,PACALPPACAPTR(r13) > > stb r11,LPPACADECRINT(r12) > > - LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) > > - lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) > > + /* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy > > + * using the usual macros. We want to be fast, so we assume the top > > + * word of the lppaca pointer is the same as the top word of > > + * &tb_ticks_per_jiffy. > > + */ > > + oris r12,r12,tb_ticks_per_jiffy at h > > you need a > clrrdi r12,r12,32 > here, because r12's low word may not contain zero. Or do: oris r12,r11,tb_ticks_per_jiffy at ha since r11 only contains '1'. It's a bit obfuscated though, it depends on if the saved single extra instruction is that precious or not. :-) -Olof From olof at lixom.net Mon Feb 6 17:55:47 2006 From: olof at lixom.net (Olof Johansson) Date: Mon, 6 Feb 2006 00:55:47 -0600 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked In-Reply-To: <20060206065354.GA7626@pb15.lixom.net> References: <20060206063434.22D37689F3@ozlabs.org> <20060206064247.GA31631@localhost.localdomain> <20060206065354.GA7626@pb15.lixom.net> Message-ID: <20060206065547.GB7626@pb15.lixom.net> On Mon, Feb 06, 2006 at 12:53:54AM -0600, Olof Johansson wrote: > Or do: > oris r12,r11,tb_ticks_per_jiffy at ha > > since r11 only contains '1'. It's a bit obfuscated though, it depends on > if the saved single extra instruction is that precious or not. :-) DOH. That obviously won't work. Nevermind. 
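The @h versus @ha point raised earlier in this thread can be checked with plain arithmetic. The sketch below is a user-space model, not assembler: the sym_*() helpers and effective_addr() are made-up names, it assumes the low word of the base register has already been cleared so that OR-ing in the high half behaves like an add, and it ignores the case where the @ha adjustment would carry out of the low 32 bits.

```c
#include <stdint.h>

/* @h: bits 16..31 of the symbol address, unadjusted. */
uint32_t sym_h(uint64_t addr)
{
	return (uint32_t)((addr >> 16) & 0xffff);
}

/* @l: low 16 bits of the symbol address. */
uint32_t sym_l(uint64_t addr)
{
	return (uint32_t)(addr & 0xffff);
}

/* @ha: high half, bumped by one when the low half will be
 * sign-extended to a negative displacement. */
uint32_t sym_ha(uint64_t addr)
{
	return (uint32_t)(((addr + 0x8000) >> 16) & 0xffff);
}

/* Address a "lwz rD,lo(rA)" would touch when rA holds
 * (top_word << 32) | (hi << 16): the 16-bit displacement is
 * sign-extended before being added to the base. */
uint64_t effective_addr(uint64_t top_word, uint32_t hi, uint32_t lo)
{
	uint64_t base = (top_word << 32) | ((uint64_t)hi << 16);
	int64_t disp = (lo & 0x8000) ? (int64_t)lo - 0x10000 : (int64_t)lo;

	return base + (uint64_t)disp;
}
```

Taking a made-up address such as 0xc000000000459abc, the low half 0x9abc has its top bit set, so it acts as a negative displacement. Pairing it with @h (0x0045) lands 0x10000 below the symbol, while @ha (0x0046) lands on it exactly. For symbols whose low half is below 0x8000 the two agree, which is why this kind of bug can hide depending on where the linker places the variable.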
-Olof From wli at holomorphy.com Mon Feb 6 19:19:39 2006 From: wli at holomorphy.com (William Lee Irwin III) Date: Mon, 6 Feb 2006 00:19:39 -0800 Subject: Hugepages need clear_user_highpage() not clear_highpage() In-Reply-To: <20060206021853.GC10708@localhost.localdomain> References: <20060206021853.GC10708@localhost.localdomain> Message-ID: <20060206081939.GA6789@holomorphy.com> On Mon, Feb 06, 2006 at 01:18:53PM +1100, David Gibson wrote: > When hugepages are newly allocated to a file in mm/hugetlb.c, we clear > them with a call to clear_highpage() on each of the subpages. We > should be using clear_user_highpage(): on powerpc, at least, > clear_highpage() doesn't correctly mark the page as icache dirty so if > the page is executed shortly after it's possible to get strange > results. > This is a bugfix and should go into 2.6.16. > Signed-off-by: David Gibson Not sure how this got past the usual crapfilters. Sorry about that. Acked-by: William Irwin -- wli From bgill at freescale.com Tue Feb 7 07:26:31 2006 From: bgill at freescale.com (Becky Bruce) Date: Mon, 6 Feb 2006 14:26:31 -0600 (CST) Subject: [PATCH] documentation: add bus-frequency property to SOC node Message-ID: Updated SOC node definition in documentation to include bus-frequency property. Also extended mdio example to match specification. 
Signed-off-by: Becky Bruce Signed-off-by: Kumar Gala --- commit 3441bf59c7e1dc3823f9be57838a2536c78f6f8f tree 2901a0e19418f1fe904ff0d041c630b3af048961 parent 66c490c9b00c52cd0f1e088ad689c9148e46f49e author Becky Bruce Thu, 02 Feb 2006 15:41:11 -0600 committer Becky Bruce Thu, 02 Feb 2006 15:41:11 -0600 Documentation/powerpc/booting-without-of.txt | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index 1284498..54e5f9b 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -880,6 +880,10 @@ address which can extend beyond that lim - device_type : Should be "soc" - ranges : Should be defined as specified in 1) to describe the translation of SOC addresses for memory mapped SOC registers. + - bus-frequency: Contains the bus frequency for the SOC node. + Typically, the value of this field is filled in by the boot + loader. + Recommended properties: @@ -919,6 +923,7 @@ SOC. device_type = "soc"; ranges = <00000000 e0000000 00100000> reg = ; + bus-frequency = <0>; } @@ -1170,6 +1175,8 @@ platforms are moved over to use the flat mdio at 24520 { reg = <24520 20>; + device_type = "mdio"; + compatible = "gianfar"; ethernet-phy at 0 { ...... @@ -1317,6 +1324,7 @@ not necessary as they are usually the sa device_type = "soc"; ranges = <00000000 e0000000 00100000> reg = ; + bus-frequency = <0>; mdio at 24520 { reg = <24520 20>; From mikey at neuling.org Tue Feb 7 10:58:21 2006 From: mikey at neuling.org (Michael Neuling) Date: Tue, 7 Feb 2006 10:58:21 +1100 Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down Message-ID: <20060207105821.bfd5ea21.mikey@neuling.org> Paulus, We call unregister_vpa but we don't check to see if the hypervisor supports this. Please apply. 
Signed-off-by: Michael Neuling Acked-by: Anton Blanchard -- arch/powerpc/platforms/pseries/setup.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c =================================================================== --- linux-2.6-powerpc.orig/arch/powerpc/platforms/pseries/setup.c +++ linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c @@ -585,7 +585,7 @@ static int pSeries_pci_probe_mode(struct static void pseries_kexec_cpu_down(int crash_shutdown, int secondary) { /* Don't risk a hypervisor call if we're crashing */ - if (!crash_shutdown) { + if (firmware_has_feature(FW_FEATURE_SPLPAR) && !crash_shutdown) { unsigned long vpa = __pa(get_lppaca()); if (unregister_vpa(hard_smp_processor_id(), vpa)) { From sfr at canb.auug.org.au Tue Feb 7 10:59:06 2006 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 7 Feb 2006 10:59:06 +1100 Subject: [PATCH] powerpc: wire up the *at system calls Message-ID: <20060207105906.04a22df3.sfr@canb.auug.org.au> Signed-off-by: Stephen Rothwell --- arch/powerpc/kernel/systbl.S | 13 +++++++++++++ include/asm-powerpc/unistd.h | 15 ++++++++++++++- 2 files changed, 27 insertions(+), 1 deletions(-) This depends on the patch that creates all the compat wrappers.
-- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ d02d8208813d8cae2c814a85734a1a31fed2f3ac diff --git a/arch/powerpc/kernel/systbl.S b/arch/powerpc/kernel/systbl.S index 007b15e..fe16d9c 100644 --- a/arch/powerpc/kernel/systbl.S +++ b/arch/powerpc/kernel/systbl.S @@ -323,3 +323,16 @@ SYSCALL(spu_run) SYSCALL(spu_create) COMPAT_SYS(pselect6) COMPAT_SYS(ppoll) +COMPAT_SYS(openat) +COMPAT_SYS(mkdirat) +COMPAT_SYS(mknodat) +COMPAT_SYS(fchownat) +COMPAT_SYS(futimesat) +COMPAT_SYS(newfstatat) +COMPAT_SYS(unlinkat) +COMPAT_SYS(renameat) +COMPAT_SYS(linkat) +COMPAT_SYS(symlinkat) +COMPAT_SYS(readlinkat) +COMPAT_SYS(fchmodat) +COMPAT_SYS(faccessat) diff --git a/include/asm-powerpc/unistd.h b/include/asm-powerpc/unistd.h index a40cdff..d05b85e 100644 --- a/include/asm-powerpc/unistd.h +++ b/include/asm-powerpc/unistd.h @@ -300,8 +300,21 @@ #define __NR_spu_create 279 #define __NR_pselect6 280 #define __NR_ppoll 281 +#define __NR_openat 282 +#define __NR_mkdirat 283 +#define __NR_mknodat 284 +#define __NR_fchownat 285 +#define __NR_futimesat 286 +#define __NR_newfstatat 287 +#define __NR_unlinkat 288 +#define __NR_renameat 289 +#define __NR_linkat 290 +#define __NR_symlinkat 291 +#define __NR_readlinkat 292 +#define __NR_fchmodat 293 +#define __NR_faccessat 294 -#define __NR_syscalls 282 +#define __NR_syscalls 295 #ifdef __KERNEL__ #define __NR__exit __NR_exit -- 1.1.5 From michael at ellerman.id.au Tue Feb 7 11:07:58 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 7 Feb 2006 11:07:58 +1100 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked In-Reply-To: <20060206063434.22D37689F3@ozlabs.org> References: <20060206063434.22D37689F3@ozlabs.org> Message-ID: <200602071108.01571.michael@ellerman.id.au> On Mon, 6 Feb 2006 17:34, Michael Ellerman wrote: > Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using > LOAD_REG_ADDRBASE, which uses the toc pointer, in > decrementer_iSeries_masked. 
> > This can explode if we take the decrementer interrupt while we're in a > module, because the toc pointer in r2 will be the module's toc pointer. > > Instead do an immediate load. I'm not sure if we really need the trickery > in here, what do people think? I think we answered that question pretty thoroughly, I'll post an updated and simplified version soon. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From michael at ellerman.id.au Tue Feb 7 11:21:13 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 7 Feb 2006 11:21:13 +1100 Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down In-Reply-To: <20060207105821.bfd5ea21.mikey@neuling.org> References: <20060207105821.bfd5ea21.mikey@neuling.org> Message-ID: <200602071121.17076.michael@ellerman.id.au> On Tue, 7 Feb 2006 10:58, Michael Neuling wrote: > Index: linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c > =================================================================== > --- linux-2.6-powerpc.orig/arch/powerpc/platforms/pseries/setup.c > +++ linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c > @@ -585,7 +585,7 @@ static int pSeries_pci_probe_mode(struct > static void pseries_kexec_cpu_down(int crash_shutdown, int secondary) > { > /* Don't risk a hypervisor call if we're crashing */ > - if (!crash_shutdown) { > + if (firmware_has_feature(FW_FEATURE_SPLPAR) && !crash_shutdown) { > unsigned long vpa = __pa(get_lppaca()); > > if (unregister_vpa(hard_smp_processor_id(), vpa)) { Is SPLPAR the right
test? I would have thought LPAR? cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From mikey at neuling.org Tue Feb 7 11:58:16 2006 From: mikey at neuling.org (Michael Neuling) Date: Tue, 7 Feb 2006 11:58:16 +1100 Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down In-Reply-To: <200602071121.17076.michael@ellerman.id.au> References: <20060207105821.bfd5ea21.mikey@neuling.org> <200602071121.17076.michael@ellerman.id.au> Message-ID: <20060207115816.c2314f86.mikey@neuling.org> > Is SPLPAR the right test? I would have thought LPAR? I missed your patch which added this but you're right. Revised patch attached. Now depends on MPE's patches from here: http://patchwork.ozlabs.org/linuxppc64/patch?id=4088 -- We call unregister_vpa but we don't check to see if the hypervisor supports this.
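For readers following the thread, the shape of the guard under discussion can be sketched outside the kernel. This is only an illustration under stated assumptions: the `SKETCH_*` bits and the `sketch_*` functions are hypothetical stand-ins, not the kernel's `firmware_has_feature()`, `FW_FEATURE_LPAR` or `FW_FEATURE_SPLPAR`. The required feature bit is passed in as a parameter, so the sketch takes no position on which of the two bits is the correct test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits, for illustration only. They merely stand in
 * for the kernel's FW_FEATURE_LPAR and FW_FEATURE_SPLPAR constants. */
#define SKETCH_FW_LPAR   (1UL << 0)
#define SKETCH_FW_SPLPAR (1UL << 1)

/* Stand-in for firmware_has_feature(): true if the given bit is set. */
static bool sketch_has_feature(unsigned long features, unsigned long bit)
{
    return (features & bit) != 0;
}

/*
 * Decide whether the kexec shutdown path should attempt the hypervisor
 * call at all: only when the firmware advertises the required feature
 * and we are not in a crash shutdown.
 */
static bool sketch_should_unregister_vpa(unsigned long features,
                                         unsigned long required_bit,
                                         bool crash_shutdown)
{
    return sketch_has_feature(features, required_bit) && !crash_shutdown;
}

/* Small self-check exercising the combinations discussed in the thread. */
void sketch_vpa_self_check(void)
{
    /* Feature present, normal shutdown: make the call. */
    assert(sketch_should_unregister_vpa(SKETCH_FW_SPLPAR, SKETCH_FW_SPLPAR, false));
    /* Feature present, but crashing: skip the call. */
    assert(!sketch_should_unregister_vpa(SKETCH_FW_SPLPAR, SKETCH_FW_SPLPAR, true));
    /* No firmware features at all: skip the call. */
    assert(!sketch_should_unregister_vpa(0UL, SKETCH_FW_SPLPAR, false));
}
```

Without the feature test, the original code would attempt the call on any non-crash shutdown, including on firmware that does not provide it.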
Signed-off-by: Michael Neuling arch/powerpc/platforms/pseries/setup.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c =================================================================== --- linux-2.6-powerpc.orig/arch/powerpc/platforms/pseries/setup.c +++ linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c @@ -585,7 +585,7 @@ static int pSeries_pci_probe_mode(struct static void pseries_kexec_cpu_down(int crash_shutdown, int secondary) { /* Don't risk a hypervisor call if we're crashing */ - if (!crash_shutdown) { + if (firmware_has_feature(FW_FEATURE_LPAR) && !crash_shutdown) { unsigned long vpa = __pa(get_lppaca()); if (unregister_vpa(hard_smp_processor_id(), vpa)) { From michael at ellerman.id.au Tue Feb 7 13:26:14 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 07 Feb 2006 13:26:14 +1100 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked Message-ID: <20060207022639.1F591689DD@ozlabs.org> Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked. This can explode if we take the decrementer interrupt while we're in a module, because the toc pointer in r2 will be the module's toc pointer. arch/powerpc/kernel/head_64.S | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) Index: iseries/arch/powerpc/kernel/head_64.S =================================================================== --- iseries.orig/arch/powerpc/kernel/head_64.S +++ iseries/arch/powerpc/kernel/head_64.S @@ -749,11 +749,12 @@ iSeries_secondary_smp_loop: .globl decrementer_iSeries_masked decrementer_iSeries_masked: + /* We may not have a valid TOC pointer in here. 
*/ li r11,1 ld r12,PACALPPACAPTR(r13) stb r11,LPPACADECRINT(r12) - LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) - lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) + LOAD_REG_IMMEDIATE(r12, tb_ticks_per_jiffy) + lwz r12,0(r12) mtspr SPRN_DEC,r12 /* fall through */ From ntl at pobox.com Tue Feb 7 14:39:53 2006 From: ntl at pobox.com (Nathan Lynch) Date: Mon, 6 Feb 2006 21:39:53 -0600 Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down In-Reply-To: <20060207115816.c2314f86.mikey@neuling.org> References: <20060207105821.bfd5ea21.mikey@neuling.org> <200602071121.17076.michael@ellerman.id.au> <20060207115816.c2314f86.mikey@neuling.org> Message-ID: <20060207033953.GH18730@localhost.localdomain> Michael Neuling wrote: > > Is SPLPAR the right test? I would have thought LPAR? > > I missed your patch which added this but you're right. Actually I think the original patch is correct. VPAs come into play only when the hypervisor supports the SPLPAR option. From paulus at samba.org Tue Feb 7 14:41:39 2006 From: paulus at samba.org (Paul Mackerras) Date: Tue, 7 Feb 2006 14:41:39 +1100 Subject: merge these lists? Message-ID: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> A lot of messages seem to get cross-posted to both linuxppc-dev and linuxppc64-dev these days, since we are all working in the one tree. Rather than having to cross-post, I propose that we create a single powerpc-dev at ozlabs.org list to replace linuxppc-dev and linuxppc64-dev. (The linuxppc-embedded list would continue as at present.) If we do this, we would set up the new list with the union of the subscribers of the old lists, and make emails sent to linuxppc-dev and linuxppc64-dev go to the new list, so it should be painless. Thoughts? Comments? Objections? Paul. From jk at ozlabs.org Tue Feb 7 14:45:45 2006 From: jk at ozlabs.org (Jeremy Kerr) Date: Tue, 7 Feb 2006 14:45:45 +1100 Subject: merge these lists? 
In-Reply-To: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> Message-ID: <200602071445.45805.jk@ozlabs.org> Paul, > If we do this, we would set up the new list with the union of the > subscribers of the old lists, and make emails sent to linuxppc-dev > and linuxppc64-dev go to the new list, so it should be painless. > > Thoughts? Comments? Objections? How about the patchwork lists? Should I look at merging those too? Jeremy From haren at us.ibm.com Tue Feb 7 14:50:03 2006 From: haren at us.ibm.com (Haren Myneni) Date: Mon, 06 Feb 2006 19:50:03 -0800 Subject: [PATCH] Fix in free initrd when overlapped with crashkernel region Message-ID: <43E818EB.7010003@us.ibm.com> It is possible that the reserved crashkernel region can be overlapped with initrd since the bootloader sets the initrd location. When the initrd region is freed, the second kernel memory will not be contiguous. The Kexec_load can cause an oops since there is no contiguous memory to write the second kernel or this memory could be used in the first kernel itself and may not be part of the dump. For example, on powerpc, the initrd is located at 36MB and the crashkernel starts at 32MB. The kexec_load caused panic since writing into non-allocated memory (after 36MB). We could see the similar issue even on other archs. One possibility is to move the initrd outside of crashkernel region. But, the initrd region will be freed anyway before the system is up. This patch fixes this issue and frees only regions that are not part of crashkernel memory in case overlaps. Thanks Haren Signed-off-by: Haren Myneni -------------- next part -------------- A non-text attachment was scrubbed... 
Name: kdump-initrd-overlap-fix.patch Type: text/x-patch Size: 1723 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060206/cb4e1a08/attachment.bin From olof at lixom.net Tue Feb 7 15:12:43 2006 From: olof at lixom.net (Olof Johansson) Date: Mon, 6 Feb 2006 22:12:43 -0600 Subject: merge these lists? In-Reply-To: <200602071445.45805.jk@ozlabs.org> References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> <200602071445.45805.jk@ozlabs.org> Message-ID: <20060207041243.GC7626@pb15.lixom.net> On Tue, Feb 07, 2006 at 02:45:45PM +1100, Jeremy Kerr wrote: > Paul, > > > If we do this, we would set up the new list with the union of the > > subscribers of the old lists, and make emails sent to linuxppc-dev > > and linuxppc64-dev go to the new list, so it should be painless. > > > > Thoughts? Comments? Objections? > > How about the patchwork lists? Should I look at merging those too? I get a feeling that our maintainers might not be using them much any more (most patches since August of last year are still "New"), but I find it convenient to search for a patch that you know has gone by but can't find in your list mbox. I would appreciate either a merge, or a new archive started for the new list. It's useful. -Olof From michael at ellerman.id.au Tue Feb 7 15:29:49 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 7 Feb 2006 15:29:49 +1100 Subject: merge these lists? 
In-Reply-To: <20060207041243.GC7626@pb15.lixom.net> References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> <200602071445.45805.jk@ozlabs.org> <20060207041243.GC7626@pb15.lixom.net> Message-ID: <200602071529.52073.michael@ellerman.id.au> On Tue, 7 Feb 2006 15:12, Olof Johansson wrote: > On Tue, Feb 07, 2006 at 02:45:45PM +1100, Jeremy Kerr wrote: > > Paul, > > > > > If we do this, we would set up the new list with the union of the > > > subscribers of the old lists, and make emails sent to linuxppc-dev > > > and linuxppc64-dev go to the new list, so it should be painless. > > > > > > Thoughts? Comments? Objections? > > > > How about the patchwork lists? Should I look at merging those too? > > I get a feeling that our maintainers might not be using them much any > more (most patches since August of last year are still "New"), but I > find it convenient to search for a patch that you know has gone by but > can't find in your list mbox. > > I would appreciate either a merge, or a new archive started for the new > list. It's useful. And while Jk has nothing else to do .. I'd like to be able to manage my own patches, ie. set them as obsolete etc etc. Oh and yeah I think we should merge the lists :D cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person
From ntl at pobox.com Tue Feb 7 15:44:23 2006 From: ntl at pobox.com (Nathan Lynch) Date: Mon, 6 Feb 2006 22:44:23 -0600 Subject: [PATCH] avoid timer interrupt replay effect when onlining cpu Message-ID: <20060207044422.GI18730@localhost.localdomain> When a cpu is hotplug-onlined, if we don't set per_cpu(last_jiffy) to something sane, timer_interrupt will execute its while loop for every tick missed since the cpu was last online (or since the system was booted, if we're adding a new cpu). This can cause weird hangs, ssh sessions dropping, and we can even go xmon if we take a global IPI at the wrong time. Signed-off-by: Nathan Lynch --- powerpc-timer_interrupt-replay.orig/arch/powerpc/kernel/smp.c +++ powerpc-timer_interrupt-replay/arch/powerpc/kernel/smp.c @@ -540,6 +540,9 @@ int __devinit start_secondary(void *unus if (smp_ops->take_timebase) smp_ops->take_timebase(); + if (system_state > SYSTEM_BOOTING) + per_cpu(last_jiffy, cpu) = get_tb(); + spin_lock(&call_lock); cpu_set(cpu, cpu_online_map); spin_unlock(&call_lock); From mikey at neuling.org Tue Feb 7 15:55:39 2006 From: mikey at neuling.org (Michael Neuling) Date: Tue, 7 Feb 2006 15:55:39 +1100 Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down In-Reply-To: <20060207033953.GH18730@localhost.localdomain> References: <20060207105821.bfd5ea21.mikey@neuling.org> <200602071121.17076.michael@ellerman.id.au> <20060207115816.c2314f86.mikey@neuling.org> <20060207033953.GH18730@localhost.localdomain> Message-ID: <20060207155539.8e2130b7.mikey@neuling.org> > > > Is SPLPAR the right test? I would have thought LPAR? > > > > I missed your patch which added this but you're right. > > Actually I think the original patch is correct. VPAs come into play > only when the hypervisor supports the SPLPAR option. My bad.
Looking at the PAPR you're right. Original patch is good. Second patch is bogus. Mikey From michael at ellerman.id.au Tue Feb 7 16:02:33 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 7 Feb 2006 16:02:33 +1100 Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked In-Reply-To: <20060207022639.1F591689DD@ozlabs.org> References: <20060207022639.1F591689DD@ozlabs.org> Message-ID: <200602071602.36387.michael@ellerman.id.au> On Tue, 7 Feb 2006 13:26, Michael Ellerman wrote: > Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using > LOAD_REG_ADDRBASE, which uses the toc pointer, in > decrementer_iSeries_masked. > > This can explode if we take the decrementer interrupt while we're in a > module, because the toc pointer in r2 will be the module's toc pointer. Ooops ... Signed-off-by: Michael Ellerman From michael at ellerman.id.au Tue Feb 7 16:03:11 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 7 Feb 2006 16:03:11 +1100 Subject: [PATCH] powerpc: Fix !SMP build of rtas.c In-Reply-To: <20060131061807.0104468A53@ozlabs.org> References: <20060131061807.0104468A53@ozlabs.org> Message-ID: <200602071603.14043.michael@ellerman.id.au> On Tue, 31 Jan 2006 17:17, Michael Ellerman wrote: > arch/powerpc/kernel/rtas.c is getting hvcall.h via spinlock.h, but when > we're building for UP we don't include spinlock.h. > > arch/powerpc/kernel/rtas.c | 1 + > 1 files changed, 1 insertion(+) Signed-off-by: Michael Ellerman
From michael at ellerman.id.au Tue Feb 7 16:03:52 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 7 Feb 2006 16:03:52 +1100 Subject: [PATCH] powerpc: ibmveth: Harden driver initilisation for kexec In-Reply-To: <20060131041055.5623C68A46@ozlabs.org> References: <20060131041055.5623C68A46@ozlabs.org> Message-ID: <200602071603.55743.michael@ellerman.id.au> On Tue, 31 Jan 2006 15:10, Michael Ellerman wrote: > After a kexec the veth driver will fail when trying to register with the > Hypervisor because the previous kernel has not unregistered. > > So if the registration fails, we unregister and then try again. Sorry this is missing: Signed-off-by: Michael Ellerman From michael at ellerman.id.au Tue Feb 7 16:22:00 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 07 Feb 2006 16:22:00 +1100 Subject: [PATCH] powerpc: Make BUG_ON & WARN_ON play nice with compile-time optimisations Message-ID: <20060207052220.917C668A92@ozlabs.org> Currently if you do BUG_ON(0) you'll still get a trap instruction in your object, although it'll never trigger. That's ok, but a bit ugly, it'd be nice if the compiler could completely eliminate any trace of the BUG_ON. So update the BUG_ON & WARN_ON macros to make this possible. From the comment in the patch: The if statement in BUG_ON and WARN_ON gives the compiler a chance to do compile-time optimisation and possibly elide the entire block.
The check for !__builtin_constant_p(x) has the opposite effect: if we must do the test at runtime then we avoid a spurious compare and branch by ensuring the if condition is always true. I've confirmed it works in both cases: if the condition is false at compile time we get no code emitted for the BUG statement. If the condition needs to be evaluated at runtime we get the same code we used to, i.e. only one test in the trap instruction. It's not clear from the patch due to the whitespace changes, but there are no changes to the inline asm whatsoever. For consideration for 2.6.17 I guess. Signed-off-by: Michael Ellerman --- include/asm-powerpc/bug.h | 46 +++++++++++++++++++++++++++++----------------- 1 files changed, 29 insertions(+), 17 deletions(-) Index: iseries/include/asm-powerpc/bug.h =================================================================== --- iseries.orig/include/asm-powerpc/bug.h +++ iseries/include/asm-powerpc/bug.h @@ -39,25 +39,37 @@ struct bug_entry *find_bug(unsigned long : : "i" (__LINE__), "i" (__FILE__), "i" (__FUNCTION__)); \ } while (0) -#define BUG_ON(x) do { \ - __asm__ __volatile__( \ - "1: "PPC_TLNEI" %0,0\n" \ - ".section __bug_table,\"a\"\n" \ - "\t"PPC_LONG" 1b,%1,%2,%3\n" \ - ".previous" \ - : : "r" ((long)(x)), "i" (__LINE__), \ - "i" (__FILE__), "i" (__FUNCTION__)); \ +/* + * The if statement in BUG_ON and WARN_ON gives the compiler a chance to do + * compile-time optimisation and possibly elide the entire block. The check + * for !__builtin_constant_p(x) has the opposite effect, if we must do the + * test at runtime then we avoid a spurious compare and branch by ensuring + * the if condition is always true.
+ */ + +#define BUG_ON(x) do { \ + if (!__builtin_constant_p(x) || (x)) { \ + __asm__ __volatile__( \ + "1: "PPC_TLNEI" %0,0\n" \ + ".section __bug_table,\"a\"\n" \ + "\t"PPC_LONG" 1b,%1,%2,%3\n" \ + ".previous" \ + : : "r" ((long)(x)), "i" (__LINE__), \ + "i" (__FILE__), "i" (__FUNCTION__)); \ + } \ } while (0) -#define WARN_ON(x) do { \ - __asm__ __volatile__( \ - "1: "PPC_TLNEI" %0,0\n" \ - ".section __bug_table,\"a\"\n" \ - "\t"PPC_LONG" 1b,%1,%2,%3\n" \ - ".previous" \ - : : "r" ((long)(x)), \ - "i" (__LINE__ + BUG_WARNING_TRAP), \ - "i" (__FILE__), "i" (__FUNCTION__)); \ +#define WARN_ON(x) do { \ + if (!__builtin_constant_p(x) || (x)) { \ + __asm__ __volatile__( \ + "1: "PPC_TLNEI" %0,0\n" \ + ".section __bug_table,\"a\"\n" \ + "\t"PPC_LONG" 1b,%1,%2,%3\n" \ + ".previous" \ + : : "r" ((long)(x)), \ + "i" (__LINE__ + BUG_WARNING_TRAP), \ + "i" (__FILE__), "i" (__FUNCTION__)); \ + } \ } while (0) #define HAVE_ARCH_BUG From sfr at canb.auug.org.au Tue Feb 7 17:40:17 2006 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 7 Feb 2006 17:40:17 +1100 Subject: [PATCH] compat: add compat functions for *at syscalls In-Reply-To: <20060206.160140.59716704.davem@davemloft.net> References: <20060207105631.39a1080c.sfr@canb.auug.org.au> <20060206.160140.59716704.davem@davemloft.net> Message-ID: <20060207174017.5e3b0ce0.sfr@canb.auug.org.au> On Mon, 06 Feb 2006 16:01:40 -0800 (PST) "David S. Miller" wrote: > > From: Stephen Rothwell > Date: Tue, 7 Feb 2006 10:56:31 +1100 > > > This adds compat version of all the remaining *at syscalls > > so that the "dfd" arguments can be properly sign extended. > > > > Signed-off-by: Stephen Rothwell > > I do the sign extension with tiny stubs in arch/sparc64/kernel/sys32.S > so that the arg frobbing does not consume a stack frame, which is what > happens if you do this in C code. 
> > We need to revisit this at some point and make a way for all > compat platforms to do this with a portable table of some kind > that expands a bunch of macros defined by the platform. How about the following (modifying Linus' suggestion and copying what sparc64 already does)? The assumption is that all arguments have been zero extended by the compat syscall entry code, so we just sign extend those that need it. I am not sure of the sparc64 code below; s390 doesn't seem to follow our "all arguments are zero extended" assumption and x86_64 may not need any of these wrappers anyway. It may be that we would be better following Linus's suggestion of generating stubs for all of the compat syscalls. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ Subject: [PATCH] compat: introduce kernel/compat_wrapper.S and the necessary compat_wrapper.h with implementations for powerpc and sparc64. compat_wrapper.S builds wrappers for those syscalls that require sign extension for some of their arguments.
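To make the sign-extension problem concrete, here is a small user-space sketch in C. It is an illustration only and not part of the patch: the `sketch_*` names are hypothetical, and the functions merely model what the description above assumes (the compat entry code zero extends every 32-bit argument) and what the powerpc `extsw` and sparc64 `sra arg, 0, arg` instructions in the wrappers then do. The example value is `AT_FDCWD` (-100), a typical negative "dfd" argument.

```c
#include <assert.h>
#include <stdint.h>

/* Models the assumed compat entry path: a 32-bit argument arrives
 * zero extended in a 64-bit register. */
static uint64_t sketch_zero_extend32(uint32_t arg)
{
    return (uint64_t)arg;
}

/* Models extsw (powerpc) or "sra arg, 0, arg" (sparc64): reinterpret
 * the low 32 bits of the register as a signed value. */
static int64_t sketch_sign_extend32(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;
    int64_t value = (int64_t)low;

    if (low & UINT32_C(0x80000000))
        value -= INT64_C(1) << 32;  /* sign bit set: subtract 2^32 */
    return value;
}

/* Small self-check using AT_FDCWD (-100) as the negative dfd. */
void sketch_signext_self_check(void)
{
    const int32_t at_fdcwd = -100;
    uint64_t reg = sketch_zero_extend32((uint32_t)at_fdcwd);

    /* Zero extended, the 64-bit value is a large positive number... */
    assert(reg == UINT64_C(4294967196));
    /* ...and only after sign extension does it read as -100 again. */
    assert(sketch_sign_extend32(reg) == -100);
    /* Small non-negative descriptors are unaffected. */
    assert(sketch_sign_extend32(sketch_zero_extend32(3)) == 3);
}
```

Without the wrapper, a 64-bit syscall implementation that takes "dfd" as a signed int in a 64-bit register could be handed 4294967196 rather than -100, which is why only the arguments that can legitimately be negative need the extra instruction.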
Signed-off-by: Stephen Rothwell --- arch/sparc64/kernel/systbls.S | 6 +++--- include/asm-ia64/compat_wrapper.h | 15 +++++++++++++++ include/asm-mips/compat_wrapper.h | 15 +++++++++++++++ include/asm-parisc/compat_wrapper.h | 15 +++++++++++++++ include/asm-powerpc/compat_wrapper.h | 28 ++++++++++++++++++++++++++++ include/asm-s390/compat_wrapper.h | 15 +++++++++++++++ include/asm-sparc64/compat_wrapper.h | 33 +++++++++++++++++++++++++++++++++ include/asm-x86_64/compat_wrapper.h | 15 +++++++++++++++ include/linux/compat.h | 22 ++++++++++++++++++++++ kernel/Makefile | 2 +- kernel/compat_wrapper.S | 18 ++++++++++++++++++ 11 files changed, 180 insertions(+), 4 deletions(-) create mode 100644 include/asm-ia64/compat_wrapper.h create mode 100644 include/asm-mips/compat_wrapper.h create mode 100644 include/asm-parisc/compat_wrapper.h create mode 100644 include/asm-powerpc/compat_wrapper.h create mode 100644 include/asm-s390/compat_wrapper.h create mode 100644 include/asm-sparc64/compat_wrapper.h create mode 100644 include/asm-x86_64/compat_wrapper.h create mode 100644 kernel/compat_wrapper.S 1cffeae9ae628af849952cf90fbfca1d98befb97 diff --git a/arch/sparc64/kernel/systbls.S b/arch/sparc64/kernel/systbls.S index 2881faf..a2cc631 100644 --- a/arch/sparc64/kernel/systbls.S +++ b/arch/sparc64/kernel/systbls.S @@ -77,9 +77,9 @@ sys_call_table32: /*270*/ .word sys32_io_submit, sys_io_cancel, compat_sys_io_getevents, sys32_mq_open, sys_mq_unlink .word compat_sys_mq_timedsend, compat_sys_mq_timedreceive, compat_sys_mq_notify, compat_sys_mq_getsetattr, compat_sys_waitid /*280*/ .word sys_ni_syscall, sys_add_key, sys_request_key, sys_keyctl, compat_sys_openat - .word sys_mkdirat, sys_mknodat, sys_fchownat, compat_sys_futimesat, compat_sys_newfstatat -/*285*/ .word sys_unlinkat, sys_renameat, sys_linkat, sys_symlinkat, sys_readlinkat - .word sys_fchmodat, sys_faccessat, compat_sys_pselect6, compat_sys_ppoll + .word compat_sys_mkdirat, compat_sys_mknodat, compat_sys_fchownat, 
compat_sys_futimesat, compat_sys_newfstatat +/*285*/ .word compat_sys_unlinkat, compat_sys_renameat, compat_sys_linkat, compat_sys_symlinkat, compat_sys_readlinkat + .word compat_sys_fchmodat, compat_sys_faccessat, compat_sys_pselect6, compat_sys_ppoll #endif /* CONFIG_COMPAT */ diff --git a/include/asm-ia64/compat_wrapper.h b/include/asm-ia64/compat_wrapper.h new file mode 100644 index 0000000..f82befc --- /dev/null +++ b/include/asm-ia64/compat_wrapper.h @@ -0,0 +1,15 @@ +/* + * Definitions used to generate the sign extending stubs + * for compat syscalls + */ + +#define ARG1 +#define ARG2 +#define ARG3 +#define ARG4 +#define ARG5 +#define ARG6 + +#define compat_fn1(fn, arg) + +#define compat_fn2(fn, arg1, arg2) diff --git a/include/asm-mips/compat_wrapper.h b/include/asm-mips/compat_wrapper.h new file mode 100644 index 0000000..f82befc --- /dev/null +++ b/include/asm-mips/compat_wrapper.h @@ -0,0 +1,15 @@ +/* + * Definitions used to generate the sign extending stubs + * for compat syscalls + */ + +#define ARG1 +#define ARG2 +#define ARG3 +#define ARG4 +#define ARG5 +#define ARG6 + +#define compat_fn1(fn, arg) + +#define compat_fn2(fn, arg1, arg2) diff --git a/include/asm-parisc/compat_wrapper.h b/include/asm-parisc/compat_wrapper.h new file mode 100644 index 0000000..f82befc --- /dev/null +++ b/include/asm-parisc/compat_wrapper.h @@ -0,0 +1,15 @@ +/* + * Definitions used to generate the sign extending stubs + * for compat syscalls + */ + +#define ARG1 +#define ARG2 +#define ARG3 +#define ARG4 +#define ARG5 +#define ARG6 + +#define compat_fn1(fn, arg) + +#define compat_fn2(fn, arg1, arg2) diff --git a/include/asm-powerpc/compat_wrapper.h b/include/asm-powerpc/compat_wrapper.h new file mode 100644 index 0000000..9bc0669 --- /dev/null +++ b/include/asm-powerpc/compat_wrapper.h @@ -0,0 +1,28 @@ +/* + * Definitions used to generate the sign extending stubs + * for compat syscalls + * + * Copyright (C) 2006 Stephen Rothwell, IBM Corp + */ + +#define ARG1 %r3 +#define 
ARG2 %r4 +#define ARG3 %r5 +#define ARG4 %r6 +#define ARG5 %r7 +#define ARG6 %r8 + +#define compat_fn1(fn, arg) \ + .text; \ + .global .compat_sys_ ## fn; \ +.compat_sys_ ## fn: \ + extsw arg, arg; \ + b .sys_ ## fn + +#define compat_fn2(fn, arg1, arg2) \ + .text; \ + .global .compat_sys_ ## fn; \ +.compat_sys_ ## fn: \ + extsw arg1, arg1; \ + extsw arg2, arg2; \ + b .sys_ ## fn diff --git a/include/asm-s390/compat_wrapper.h b/include/asm-s390/compat_wrapper.h new file mode 100644 index 0000000..f82befc --- /dev/null +++ b/include/asm-s390/compat_wrapper.h @@ -0,0 +1,15 @@ +/* + * Definitions used to generate the sign extending stubs + * for compat syscalls + */ + +#define ARG1 +#define ARG2 +#define ARG3 +#define ARG4 +#define ARG5 +#define ARG6 + +#define compat_fn1(fn, arg) + +#define compat_fn2(fn, arg1, arg2) diff --git a/include/asm-sparc64/compat_wrapper.h b/include/asm-sparc64/compat_wrapper.h new file mode 100644 index 0000000..42afb2c --- /dev/null +++ b/include/asm-sparc64/compat_wrapper.h @@ -0,0 +1,33 @@ +/* + * Definitions used to generate the sign extending stubs + * for compat syscalls + * + * Copyright (C) 2006 Stephen Rothwell, IBM Corp + * Based on arch/sparc64/kernel/sys32.S + */ + +#define ARG1 %o0 +#define ARG2 %o1 +#define ARG3 %o2 +#define ARG4 %o3 +#define ARG5 %o4 +#define ARG6 %o5 + +#define compat_fn1(fn, arg) \ + .text; \ + .align 32; \ + .globl compat_sys_ ## fn; \ +compat_sys_ ## fn: \ + sethi %hi(sys_ ## fn), %g1; \ + jmpl %g1 + %lo(sys_ ## fn), %g0; \ + sra arg, 0, arg + +#define compat_fn2(fn, arg1, arg2) \ + .text; \ + .align 32; \ + .globl compat_sys_ ## fn; \ +compat_sys_ ## fn: \ + sethi %hi(sys_ ## fn), %g1; \ + sra arg1, 0, arg1; \ + jmpl %g1 + %lo(sys_ ## fn), %g0; \ + sra arg2, 0, arg2 diff --git a/include/asm-x86_64/compat_wrapper.h b/include/asm-x86_64/compat_wrapper.h new file mode 100644 index 0000000..f82befc --- /dev/null +++ b/include/asm-x86_64/compat_wrapper.h @@ -0,0 +1,15 @@ +/* + * Definitions used to generate 
the sign extending stubs + * for compat syscalls + */ + +#define ARG1 +#define ARG2 +#define ARG3 +#define ARG4 +#define ARG5 +#define ARG6 + +#define compat_fn1(fn, arg) + +#define compat_fn2(fn, arg1, arg2) diff --git a/include/linux/compat.h b/include/linux/compat.h index 2d7e7f1..b501201 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -168,6 +168,28 @@ asmlinkage long compat_sys_newfstatat(un int flag); asmlinkage long compat_sys_openat(unsigned int dfd, const char __user *filename, int flags, int mode); +asmlinkage long compat_sys_mkdirat(unsigned int dfd, + const char __user * pathname, int mode); +asmlinkage long compat_sys_mknodat(unsigned int dfd, + const char __user *filename, int mode, unsigned dev); +asmlinkage long compat_sys_fchownat(unsigned int dfd, + const char __user *filename, uid_t user, gid_t group, int flag); +asmlinkage long compat_sys_unlinkat(unsigned int dfd, + const char __user *pathname, int flag); +asmlinkage long compat_sys_renameat(unsigned int olddfd, + const char __user *oldname, unsigned int newdfd, + const char __user *newname); +asmlinkage long compat_sys_linkat(unsigned int olddfd, + const char __user *oldname, unsigned int newdfd, + const char __user *newname); +asmlinkage long compat_sys_symlinkat(const char __user *oldname, + unsigned int newdfd, const char __user *newname); +asmlinkage long compat_sys_readlinkat(unsigned int dfd, + const char __user *path, char __user *buf, int bufsiz); +asmlinkage long compat_sys_fchmodat(unsigned int dfd, + const char __user *filename, mode_t mode); +asmlinkage long compat_sys_faccessat(unsigned int dfd, + const char __user *filename, int mode); #endif /* CONFIG_COMPAT */ #endif /* _LINUX_COMPAT_H */ diff --git a/kernel/Makefile b/kernel/Makefile index 4ae0fbd..a0679c4 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -22,7 +22,7 @@ obj-$(CONFIG_KALLSYMS) += kallsyms.o obj-$(CONFIG_PM) += power/ obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o obj-$(CONFIG_KEXEC) += kexec.o 
-obj-$(CONFIG_COMPAT) += compat.o +obj-$(CONFIG_COMPAT) += compat.o compat_wrapper.o obj-$(CONFIG_CPUSETS) += cpuset.o obj-$(CONFIG_IKCONFIG) += configs.o obj-$(CONFIG_STOP_MACHINE) += stop_machine.o diff --git a/kernel/compat_wrapper.S b/kernel/compat_wrapper.S new file mode 100644 index 0000000..da009eb --- /dev/null +++ b/kernel/compat_wrapper.S @@ -0,0 +1,18 @@ +/* + * Copyright (C) 2006 Stephen Rothwell, IBM Corp + * + * this file will generate compat_ wrapper functions for + * syscalls that need sign extension for some of their arguments + */ +#include + +compat_fn1(mkdirat, ARG1) +compat_fn1(mknodat, ARG1) +compat_fn1(fchownat, ARG1) +compat_fn1(unlinkat, ARG1) +compat_fn2(renameat, ARG1, ARG3) +compat_fn2(linkat, ARG1, ARG3) +compat_fn1(symlinkat, ARG2) +compat_fn1(readlinkat, ARG1) +compat_fn1(fchmodat, ARG1) +compat_fn1(faccessat, ARG1) -- 1.1.5 From paulus at samba.org Tue Feb 7 19:47:39 2006 From: paulus at samba.org (Paul Mackerras) Date: Tue, 7 Feb 2006 19:47:39 +1100 Subject: [PATCH] Fix in free initrd when overlapped with crashkernel region In-Reply-To: <43E818EB.7010003@us.ibm.com> References: <43E818EB.7010003@us.ibm.com> Message-ID: <17384.24235.960221.979322@cargo.ozlabs.ibm.com> Haren Myneni writes: > --- 2616-rc2.orig/include/linux/kexec.h 2006-02-06 19:08:01.000000000 -0800 > +++ 2616-rc2/include/linux/kexec.h 2006-02-06 19:06:37.000000000 -0800 > @@ -6,6 +6,7 @@ > #include > #include > #include > +#include > #include What's this hunk for? Paul. From hch at lst.de Tue Feb 7 21:56:43 2006 From: hch at lst.de (Christoph Hellwig) Date: Tue, 7 Feb 2006 11:56:43 +0100 Subject: merge these lists?
In-Reply-To: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> Message-ID: <20060207105643.GA22234@lst.de> On Tue, Feb 07, 2006 at 02:41:39PM +1100, Paul Mackerras wrote: > A lot of messages seem to get cross-posted to both linuxppc-dev and > linuxppc64-dev these days, since we are all working in the one tree. > Rather than having to cross-post, I propose that we create a single > powerpc-dev at ozlabs.org list to replace linuxppc-dev and > linuxppc64-dev. (The linuxppc-embedded list would continue as at > present.) Why not just kill linuxppc64-dev and leave linuxppc-dev? Probably not worth to remove the well-known and widely used address just for the sake of it. From galak at kernel.crashing.org Wed Feb 8 01:35:23 2006 From: galak at kernel.crashing.org (Kumar Gala) Date: Tue, 7 Feb 2006 08:35:23 -0600 Subject: merge these lists? In-Reply-To: <20060207105643.GA22234@lst.de> References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> <20060207105643.GA22234@lst.de> Message-ID: On Feb 7, 2006, at 4:56 AM, Christoph Hellwig wrote: > On Tue, Feb 07, 2006 at 02:41:39PM +1100, Paul Mackerras wrote: >> A lot of messages seem to get cross-posted to both linuxppc-dev and >> linuxppc64-dev these days, since we are all working in the one tree. >> Rather than having to cross-post, I propose that we create a single >> powerpc-dev at ozlabs.org list to replace linuxppc-dev and >> linuxppc64-dev. (The linuxppc-embedded list would continue as at >> present.) > > Why not just kill linuxppc64-dev and leave linuxppc-dev? Probably not > worth to remove the well-known and widely used address just for the > sake of it. I agree. Let's just kill linuxppc64-dev and direct people at linuxppc-dev. - kumar From galak at kernel.crashing.org Wed Feb 8 01:36:35 2006 From: galak at kernel.crashing.org (Kumar Gala) Date: Tue, 7 Feb 2006 08:36:35 -0600 Subject: merge these lists? 
In-Reply-To: <200602071445.45805.jk@ozlabs.org>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> <200602071445.45805.jk@ozlabs.org>
Message-ID: <6FED97E8-FD6B-4CEC-983F-5CA149A50E79@kernel.crashing.org>

On Feb 6, 2006, at 9:45 PM, Jeremy Kerr wrote:
> Paul,
>
>> If we do this, we would set up the new list with the union of the
>> subscribers of the old lists, and make emails sent to linuxppc-dev
>> and linuxppc64-dev go to the new list, so it should be painless.
>>
>> Thoughts? Comments? Objections?
>
> How about the patchwork lists? Should I look at merging those too?

Hmm, how about a merged patchwork starting after 2.6.16?

- kumar

From linas at austin.ibm.com Wed Feb 8 03:43:05 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 7 Feb 2006 10:43:05 -0600
Subject: merge these lists?
In-Reply-To:
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> <20060207105643.GA22234@lst.de>
Message-ID: <20060207164305.GI24916@austin.ibm.com>

On Tue, Feb 07, 2006 at 08:35:23AM -0600, Kumar Gala was heard to remark:
>
> I agree. Let's just kill linuxppc64-dev and direct people at
> linuxppc-dev.

Can a sysadmin merge the subscription lists?

I didn't even know that there was a linuxppc-dev list; I thought the merge of these two lists occurred a year ago, when it was moved to ozlabs :-/

--linas

From jschopp at austin.ibm.com Wed Feb 8 03:45:38 2006
From: jschopp at austin.ibm.com (Joel Schopp)
Date: Tue, 07 Feb 2006 10:45:38 -0600
Subject: merge these lists?
In-Reply-To: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
Message-ID: <43E8CEB2.4020009@austin.ibm.com>

> A lot of messages seem to get cross-posted to both linuxppc-dev and
> linuxppc64-dev these days, since we are all working in the one tree.
> Rather than having to cross-post, I propose that we create a single
> powerpc-dev at ozlabs.org list to replace linuxppc-dev and
> linuxppc64-dev.
(The linuxppc-embedded list would continue as at > present.) > > If we do this, we would set up the new list with the union of the > subscribers of the old lists, and make emails sent to linuxppc-dev and > linuxppc64-dev go to the new list, so it should be painless. > > Thoughts? Comments? Objections? Marvelous idea. I like the new name too, matches the kernel tree and all that. A few points to make sure we get right, most of which have already been mentioned: -New archives should have links to the old archives. -Old list addresses should automagically send to new list. -The subscribers to both lists should automagically be subscribed to the new list. From jschopp at austin.ibm.com Wed Feb 8 04:06:25 2006 From: jschopp at austin.ibm.com (Joel Schopp) Date: Tue, 07 Feb 2006 11:06:25 -0600 Subject: [PATCH] avoid timer interrupt replay effect when onlining cpu In-Reply-To: <20060207044422.GI18730@localhost.localdomain> References: <20060207044422.GI18730@localhost.localdomain> Message-ID: <43E8D391.2090208@austin.ibm.com> > Signed-off-by: Nathan Lynch > > > --- powerpc-timer_interrupt-replay.orig/arch/powerpc/kernel/smp.c > +++ powerpc-timer_interrupt-replay/arch/powerpc/kernel/smp.c > @@ -540,6 +540,9 @@ int __devinit start_secondary(void *unus > if (smp_ops->take_timebase) > smp_ops->take_timebase(); > > + if (system_state > SYSTEM_BOOTING) > + per_cpu(last_jiffy, cpu) = get_tb(); > + > spin_lock(&call_lock); > cpu_set(cpu, cpu_online_map); > spin_unlock(&call_lock); > _______________________________________________ Yep, this bug has been seen in SUSE & Redhat distro kernels and this patch fixes it. While we are here, is there any reason we still have next_jiffy_update_tb in the paca? It isn't used anywhere anymore. Acked-by: Joel Schopp From bdc at carlstrom.com Wed Feb 8 06:26:17 2006 From: bdc at carlstrom.com (Brian D. 
Carlstrom) Date: Tue, 7 Feb 2006 11:26:17 -0800 Subject: G5 fan problems return moving to 2.6.15 with dual processor 2.7GHz machine In-Reply-To: <1139130385.5634.14.camel@localhost.localdomain> References: <20060205061048.7261.qmail@electricrain.com> <1139130385.5634.14.camel@localhost.localdomain> Message-ID: <17384.62553.442011.514155@zot.electricrain.com> Benjamin Herrenschmidt writes: > Might be something in that prom_init.c fix that broke... it would be > really nice if you could give a try with the console and find out what > it is ... Unfortunately, I don't have access to one of these machines > with the "problem" at the moment... Well, I added several prom_printf calls to prom_init.c's fixup_device_tree routine. I assumed I would spot these scrolling by during boot before what appears to be the video mode switch. However, I didn't see anything, but I wasn't sure if it wasn't just going by too fast. I tried using PROM_BUG to halt the output, but that just resulted in returning to an OpenFirmware prompt, although with a white background instead of the usual black background when I go their from yaboot with 'o'. I also tried putting a "while (1) ;" after one of my prom_printf, in case the illegal instruction used by PROM_BUG was causing the output to be lost, since it was clearing the screen to display the OpenFirmware prompt. However then I just got a pure white screen. So clearly in both cases it was running my changed code, but I see no output. I tried reviewing some OpenFirmware doc, looking at their talk of debugging via serial and telnet, but that all seemed to be a dead end, although I learned much more about the device tree. :) Clearly I could theoretically debug by moving the while(1); around to see what branches are being taken, but since I'm away from the machines today, I figured I'd ask how I'm expected to use prom_printf, before returning to debugging tomorrow. Sorry my lack of ppc experience is showing. 
-bri From geoffrey.levand at am.sony.com Wed Feb 8 07:37:32 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Tue, 07 Feb 2006 12:37:32 -0800 Subject: __setup_cpu_be problem Message-ID: <43E9050C.2000300@am.sony.com> Arnd, It seems HID6 is a hypervisor resource... Can we just have '.cpu_setup = __setup_cpu_power4', and you setup your page sizes somewhere else? -Geoff struct cpu_spec cpu_specs[] = { { /* Cell Broadband Engine */ .cpu_setup = __setup_cpu_be, }, _GLOBAL(__setup_cpu_be) /* Set large page sizes LP=0: 16MB, LP=1: 64KB */ addi r3, 0, 0 ori r3, r3, HID6_LB sldi r3, r3, 32 nor r3, r3, r3 mfspr r4, SPRN_HID6 and r4, r4, r3 addi r3, 0, 0x02000 sldi r3, r3, 32 or r4, r4, r3 mtspr SPRN_HID6, r4 blr From heiko.carstens at de.ibm.com Tue Feb 7 20:31:54 2006 From: heiko.carstens at de.ibm.com (Heiko Carstens) Date: Tue, 7 Feb 2006 10:31:54 +0100 Subject: [PATCH] compat: add compat functions for *at syscalls In-Reply-To: <20060207174017.5e3b0ce0.sfr@canb.auug.org.au> References: <20060207105631.39a1080c.sfr@canb.auug.org.au> <20060206.160140.59716704.davem@davemloft.net> <20060207174017.5e3b0ce0.sfr@canb.auug.org.au> Message-ID: <20060207093154.GA9311@osiris.boeblingen.de.ibm.com> > How about the following (modifiying Linus' suggestion and copying what > sparc64 already does)? > > The assumption is that all arguments have been zero extended by the compat > syscall entry code, so we just sign extend those that need it. > > I am not sure of the sparc64 code below, s390 doesn't seem to follow our > "all arguments are zero extended" assumption and x86_64 may not need any > of these wrappers anyway. On s390 we do already sign extension for int/long and zero extension for the unsigned parameters. Even though I wasn't aware that we should do zero extension for _all_ parameters of the compat system calls, regardless of their type. 
In addition we must do pointer conversion to 64 bit, since the compat tasks have the most significant bit set (to distinguish between 24- and 31-bit addressing mode). Therefore I think Linus' suggestion with having something like compat_fn6(sys_waitif, SARG, UARG, UARG, SARG, UARG); would be better. Just that we would need something for pointers as well. And to make things just a bit more complicated: only the first five parameters are in registers. Number six and the following are already on the stack. E.g. the compat wrapper for the futex syscall would need extra assembly code to do conversion on the stack. Maybe having defines like SARG1..SARG6 that would define assembly code instead of the register would do the job. Thanks, Heiko From heiko.carstens at de.ibm.com Wed Feb 8 00:29:49 2006 From: heiko.carstens at de.ibm.com (Heiko Carstens) Date: Tue, 7 Feb 2006 14:29:49 +0100 Subject: [PATCH] compat: add compat functions for *at syscalls In-Reply-To: <20060207093154.GA9311@osiris.boeblingen.de.ibm.com> References: <20060207105631.39a1080c.sfr@canb.auug.org.au> <20060206.160140.59716704.davem@davemloft.net> <20060207174017.5e3b0ce0.sfr@canb.auug.org.au> <20060207093154.GA9311@osiris.boeblingen.de.ibm.com> Message-ID: <20060207132949.GB9311@osiris.boeblingen.de.ibm.com> > Therefore I think Linus' suggestion with having something like > > compat_fn6(sys_waitif, SARG, UARG, UARG, SARG, UARG); > > would be better. Just that we would need something for pointers as well. > And to make things just a bit more complicated: only the first five > parameters are in registers. Number six and the following are already on > the stack. E.g. the compat wrapper for the futex syscall would need extra > assembly code to do conversion on the stack. > > Maybe having defines like SARG1..SARG6 that would define assembly code > instead of the register would do the job. Ah, stupid me... the SARG define defines assembly code of course. 
Just that we would need different defines for arguments that are in registers or on the stack. Is s390 the only architecture that has argument six on the stack? Heiko From greg at kroah.com Wed Feb 8 09:21:44 2006 From: greg at kroah.com (Greg KH) Date: Tue, 7 Feb 2006 14:21:44 -0800 Subject: [PATCH]: Documentation: Updated PCI Error Recovery In-Reply-To: <20060203000602.GQ24916@austin.ibm.com> References: <20060203000602.GQ24916@austin.ibm.com> Message-ID: <20060207222144.GA15622@kroah.com> On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote: > > I'm not sure who I'm addressing this patch to: Linus, maybe? > > Please apply. Fingers crossed, I hope this may make it into 2.6.16. This does not apply to the current tree, what kernel did you do it against? Care to respin it against the latest -git release? thanks, greg k-h From akpm at osdl.org Wed Feb 8 09:30:52 2006 From: akpm at osdl.org (Andrew Morton) Date: Tue, 7 Feb 2006 14:30:52 -0800 Subject: [PATCH]: Documentation: Updated PCI Error Recovery In-Reply-To: <20060207222144.GA15622@kroah.com> References: <20060203000602.GQ24916@austin.ibm.com> <20060207222144.GA15622@kroah.com> Message-ID: <20060207143052.19978ca7.akpm@osdl.org> Greg KH wrote: > > On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote: > > > > I'm not sure who I'm addressing this patch to: Linus, maybe? > > > > Please apply. Fingers crossed, I hope this may make it into 2.6.16. > > This does not apply to the current tree, what kernel did you do it > against? > > Care to respin it against the latest -git release? > err, I already merged it. 
Saw "documentation" and leapt on it ;) From greg at kroah.com Wed Feb 8 09:39:56 2006 From: greg at kroah.com (Greg KH) Date: Tue, 7 Feb 2006 14:39:56 -0800 Subject: [PATCH]: Documentation: Updated PCI Error Recovery In-Reply-To: <20060207143052.19978ca7.akpm@osdl.org> References: <20060203000602.GQ24916@austin.ibm.com> <20060207222144.GA15622@kroah.com> <20060207143052.19978ca7.akpm@osdl.org> Message-ID: <20060207223956.GA19009@kroah.com> On Tue, Feb 07, 2006 at 02:30:52PM -0800, Andrew Morton wrote: > Greg KH wrote: > > > > On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote: > > > > > > I'm not sure who I'm addressing this patch to: Linus, maybe? > > > > > > Please apply. Fingers crossed, I hope this may make it into 2.6.16. > > > > This does not apply to the current tree, what kernel did you do it > > against? > > > > Care to respin it against the latest -git release? > > > > err, I already merged it. Saw "documentation" and leapt on it ;) Ah, nevermind then... For some reason patch didn't say it looked like it had already been applied, otherwise I would have caught that... thanks, greg k-h From slpratt at austin.ibm.com Wed Feb 8 09:45:48 2006 From: slpratt at austin.ibm.com (Steven Pratt) Date: Tue, 07 Feb 2006 16:45:48 -0600 Subject: make install fails Message-ID: <43E9231C.4000004@austin.ibm.com> Hey, does anyone know why you can no longer do "make install" on ppc kernels on recent releases? 
Steve From akpm at osdl.org Wed Feb 8 09:53:47 2006 From: akpm at osdl.org (Andrew Morton) Date: Tue, 7 Feb 2006 14:53:47 -0800 Subject: [PATCH]: Documentation: Updated PCI Error Recovery In-Reply-To: <20060207223956.GA19009@kroah.com> References: <20060203000602.GQ24916@austin.ibm.com> <20060207222144.GA15622@kroah.com> <20060207143052.19978ca7.akpm@osdl.org> <20060207223956.GA19009@kroah.com> Message-ID: <20060207145347.72c0a77e.akpm@osdl.org> Greg KH wrote: > > On Tue, Feb 07, 2006 at 02:30:52PM -0800, Andrew Morton wrote: > > Greg KH wrote: > > > > > > On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote: > > > > > > > > I'm not sure who I'm addressing this patch to: Linus, maybe? > > > > > > > > Please apply. Fingers crossed, I hope this may make it into 2.6.16. > > > > > > This does not apply to the current tree, what kernel did you do it > > > against? > > > > > > Care to respin it against the latest -git release? > > > > > > > err, I already merged it. Saw "documentation" and leapt on it ;) > > Ah, nevermind then... For some reason patch didn't say it looked like > it had already been applied, otherwise I would have caught that... > It could be all the newly-added trailing whitespace I chopped off. `patch -p1 -R -l --dry-run'. From haren at us.ibm.com Wed Feb 8 10:01:30 2006 From: haren at us.ibm.com (Haren Myneni) Date: Tue, 07 Feb 2006 15:01:30 -0800 Subject: [PATCH] Fix in free initrd when overlapped with crashkernel region In-Reply-To: <17384.24235.960221.979322@cargo.ozlabs.ibm.com> References: <43E818EB.7010003@us.ibm.com> <17384.24235.960221.979322@cargo.ozlabs.ibm.com> Message-ID: <43E926CA.3000601@us.ibm.com> Paul Mackerras wrote: >Haren Myneni writes: > > > >>--- 2616-rc2.orig/include/linux/kexec.h 2006-02-06 19:08:01.000000000 -0800 >>+++ 2616-rc2/include/linux/kexec.h 2006-02-06 19:06:37.000000000 -0800 >>@@ -6,6 +6,7 @@ >> #include >> #include >> #include >>+#include >> #include >> >> > >What's this hunk for? > >Paul. 
> > crashk_res is an extern declaration in kexec.h. Declared as "struct resource" which is defined in linux/ioport.h. For other places wherever this variable is used, ioport.h got included through some other header file. Whereas for initramfs.c, either we need to include ioport.h explicitly or include in kexec.h. Chosen the later one. Probably, some comment would be better to make it clear. Paul, do you prefer to repost the patch with the comment? Thanks Haren >- >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >the body of a message to majordomo at vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html >Please read the FAQ at http://www.tux.org/lkml/ > > > From geoffrey.levand at am.sony.com Wed Feb 8 10:10:47 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Tue, 07 Feb 2006 15:10:47 -0800 Subject: [PATCH] fix prom_init undefined error Message-ID: <43E928F7.1050307@am.sony.com> Paul, This patch fixes a build error when CONFIG_PPC_OF=n, CONFIG_PPC_MULTIPLATFORM=y. It makes the conditionals consistent in arch/powerpc/kernel/Makefile and head_64.S to both be on CONFIG_PPC_OF. arch/powerpc/kernel/head_64.o: In function `.__boot_from_prom': linux/arch/powerpc/kernel/head_64.S:(.text+0x8158): undefined reference to `.prom_init' obj-$(CONFIG_PPC_OF) += prom_init.o Signed-off-by: Geoff Levand -- --- powerpc.git.orig/arch/powerpc/kernel/head_64.S 2006-02-07 13:18:14.000000000 -0800 +++ powerpc.git/arch/powerpc/kernel/head_64.S 2006-02-07 14:51:15.000000000 -0800 @@ -1515,7 +1515,7 @@ * */ _GLOBAL(__start_initialization_multiplatform) -#ifdef CONFIG_PPC_MULTIPLATFORM +#ifdef CONFIG_PPC_OF /* * Are we booted from a PROM Of-type client-interface ? 
*/ @@ -1542,7 +1542,7 @@ bl .__mmu_off b .__after_prom_start -#ifdef CONFIG_PPC_MULTIPLATFORM +#ifdef CONFIG_PPC_OF _STATIC(__boot_from_prom) /* Save parameters */ mr r31,r3 From greg at kroah.com Wed Feb 8 10:19:27 2006 From: greg at kroah.com (Greg KH) Date: Tue, 7 Feb 2006 15:19:27 -0800 Subject: [PATCH]: Documentation: Updated PCI Error Recovery In-Reply-To: <20060207145347.72c0a77e.akpm@osdl.org> References: <20060203000602.GQ24916@austin.ibm.com> <20060207222144.GA15622@kroah.com> <20060207143052.19978ca7.akpm@osdl.org> <20060207223956.GA19009@kroah.com> <20060207145347.72c0a77e.akpm@osdl.org> Message-ID: <20060207231927.GB19648@kroah.com> On Tue, Feb 07, 2006 at 02:53:47PM -0800, Andrew Morton wrote: > Greg KH wrote: > > > > On Tue, Feb 07, 2006 at 02:30:52PM -0800, Andrew Morton wrote: > > > Greg KH wrote: > > > > > > > > On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote: > > > > > > > > > > I'm not sure who I'm addressing this patch to: Linus, maybe? > > > > > > > > > > Please apply. Fingers crossed, I hope this may make it into 2.6.16. > > > > > > > > This does not apply to the current tree, what kernel did you do it > > > > against? > > > > > > > > Care to respin it against the latest -git release? > > > > > > > > > > err, I already merged it. Saw "documentation" and leapt on it ;) > > > > Ah, nevermind then... For some reason patch didn't say it looked like > > it had already been applied, otherwise I would have caught that... > > > > It could be all the newly-added trailing whitespace I chopped off. > `patch -p1 -R -l --dry-run'. Yup, that was it, quilt would have stripped them off for me too. Linas, please don't do this anymore... 
thanks,

greg k-h

From haren at us.ibm.com Wed Feb 8 10:47:03 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Tue, 07 Feb 2006 15:47:03 -0800
Subject: [PATCH] Trivial fix to set the proper timeout value for kdump
Message-ID: <43E93177.5020601@us.ibm.com>

The panic CPU waits forever, because of a very large timeout value, if some CPU is not responding to an IPI. This patch fixes the issue: the panic CPU now waits at most 10 seconds and then proceeds with the kdump boot.

Thanks
Haren

Signed-off-by: Haren Myneni

-------------- next part --------------
A non-text attachment was scrubbed...
Name: kdump-timeout-value-fix.patch
Type: text/x-patch
Size: 616 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060207/2b4e3870/attachment.bin

From davem at davemloft.net Wed Feb 8 09:57:25 2006
From: davem at davemloft.net (David S. Miller)
Date: Tue, 07 Feb 2006 14:57:25 -0800 (PST)
Subject: [PATCH] compat: add compat functions for *at syscalls
In-Reply-To: <20060207132949.GB9311@osiris.boeblingen.de.ibm.com>
References: <20060207174017.5e3b0ce0.sfr@canb.auug.org.au> <20060207093154.GA9311@osiris.boeblingen.de.ibm.com> <20060207132949.GB9311@osiris.boeblingen.de.ibm.com>
Message-ID: <20060207.145725.22157385.davem@davemloft.net>

From: Heiko Carstens
Date: Tue, 7 Feb 2006 14:29:49 +0100

> Ah, stupid me... the SARG define defines assembly code of course. Just
> that we would need different defines for arguments that are in registers
> or on the stack. Is s390 the only architecture that has argument six on
> the stack?

If I remember correctly, o32 mips binaries put arg 6 on the stack too.

From sfr at canb.auug.org.au Wed Feb 8 11:01:50 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Wed, 8 Feb 2006 11:01:50 +1100
Subject: merge these lists?
In-Reply-To: <20060207164305.GI24916@austin.ibm.com>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> <20060207105643.GA22234@lst.de> <20060207164305.GI24916@austin.ibm.com>
Message-ID: <20060208110150.5d9d1936.sfr@canb.auug.org.au>

On Tue, 7 Feb 2006 10:43:05 -0600 linas at austin.ibm.com (Linas Vepstas) wrote:
>
> On Tue, Feb 07, 2006 at 08:35:23AM -0600, Kumar Gala was heard to remark:
> >
> > I agree. Let's just kill linuxppc64-dev and direct people at
> > linuxppc-dev.
>
> Can a sysadmin merge the subscription lists?

Yes, "a sysadmin" could do that. However, those that are subscribed with different addresses on each list will end up subscribed twice and those who have changed their preferences on the abandoned list will have to fix them as well.

--
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au Wed Feb 8 11:07:18 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Wed, 8 Feb 2006 11:07:18 +1100
Subject: Membership stats (Was: Re: merge these lists?)
In-Reply-To: <20060208110150.5d9d1936.sfr@canb.auug.org.au>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com> <20060207105643.GA22234@lst.de> <20060207164305.GI24916@austin.ibm.com> <20060208110150.5d9d1936.sfr@canb.auug.org.au>
Message-ID: <20060208110718.57e9f9f5.sfr@canb.auug.org.au>

On Wed, 8 Feb 2006 11:01:50 +1100 Stephen Rothwell wrote:
>
> Yes, "a sysadmin" could do that. However, those that are
> subscribed with different addresses on each list will end
> up subscribed twice and those who have changed their preferences on
> the abandoned list will have to fix them as well.

Just for interest:

	members of linuxppc-dev		473
	members of linuxppc64-dev	264
	common				 98

But, as I said, "common" above does not count those who have different addresses subscribed to each list.

--
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From benh at kernel.crashing.org Wed Feb 8 14:56:56 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 08 Feb 2006 14:56:56 +1100
Subject: G5 fan problems return moving to 2.6.15 with dual processor 2.7GHz machine
In-Reply-To: <17384.62553.442011.514155@zot.electricrain.com>
References: <20060205061048.7261.qmail@electricrain.com> <1139130385.5634.14.camel@localhost.localdomain> <17384.62553.442011.514155@zot.electricrain.com>
Message-ID: <1139371016.8187.1.camel@localhost.localdomain>

On Tue, 2006-02-07 at 11:26 -0800, Brian D. Carlstrom wrote:
> Benjamin Herrenschmidt writes:
> > Might be something in that prom_init.c fix that broke... it would be
> > really nice if you could give a try with the console and find out what
> > it is ... Unfortunately, I don't have access to one of these machines
> > with the "problem" at the moment...
>
> Well, I added several prom_printf calls to prom_init.c's
> fixup_device_tree routine. I assumed I would spot these scrolling by
> during boot before what appears to be the video mode switch. However, I
> didn't see anything, but I wasn't sure if it wasn't just going by too
> fast.
>
> I tried using PROM_BUG to halt the output, but that just resulted in
> returning to an OpenFirmware prompt, although with a white background
> instead of the usual black background when I go there from yaboot with
> 'o'.
> > I also tried putting a "while (1) ;" after one of my prom_printf, in > case the illegal instruction used by PROM_BUG was causing the output to > be lost, since it was clearing the screen to display the OpenFirmware > prompt. However then I just got a pure white screen. So clearly in both > cases it was running my changed code, but I see no output. > > I tried reviewing some OpenFirmware doc, looking at their talk of > debugging via serial and telnet, but that all seemed to be a dead end, > although I learned much more about the device tree. :) > > Clearly I could theoretically debug by moving the while(1); around to > see what branches are being taken, but since I'm away from the machines > today, I figured I'd ask how I'm expected to use prom_printf, before > returning to debugging tomorrow. Sorry my lack of ppc experience is > showing. prom_printf should work ... try booting manually (from the OF command line) and maybe comment out the code that opens the displays... (it may be clearing the screen).... Ben. From benh at kernel.crashing.org Wed Feb 8 14:59:21 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 08 Feb 2006 14:59:21 +1100 Subject: [PATCH] fix prom_init undefined error In-Reply-To: <43E928F7.1050307@am.sony.com> References: <43E928F7.1050307@am.sony.com> Message-ID: <1139371162.8187.3.camel@localhost.localdomain> On Tue, 2006-02-07 at 15:10 -0800, Geoff Levand wrote: > Paul, > > This patch fixes a build error when CONFIG_PPC_OF=n, > CONFIG_PPC_MULTIPLATFORM=y. It makes the conditionals > consistent in arch/powerpc/kernel/Makefile and head_64.S > to both be on CONFIG_PPC_OF. > > arch/powerpc/kernel/head_64.o: In function `.__boot_from_prom': > linux/arch/powerpc/kernel/head_64.S:(.text+0x8158): undefined reference to `.prom_init' > > obj-$(CONFIG_PPC_OF) += prom_init.o With ARCH=powerpc, CONFIG_PPC_OF should always be set. It's supposed to be set when the device-tree accessors exist which they always do. 
Besides, I'll be removing support for !MULTIPLATFORM too :) (Except for iSeries at least for a little while). Look at the patch I posted that removes _machine for an idea of where things are going. Why do you want CONFIG_PPC_OF not set ? Ben. From benh at kernel.crashing.org Wed Feb 8 16:42:51 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 08 Feb 2006 16:42:51 +1100 Subject: [PATCH] powerpc: Thermal control for dual core G5s Message-ID: <1139377372.8187.16.camel@localhost.localdomain> This patch adds a windfarm module, windfarm_pm112, for the dual core G5s (both 2 and 4 core models), keeping the machine from getting into vacuum-cleaner mode ;) For proper credits, the patch was initially written by Paul Mackerras, and slightly reworked by me to add overtemp handling among others. The patch also removes the sysfs attributes from windfarm_pm81 and windfarm_pm91 and instead adds code to the windfarm core to automagically expose attributes for sensor & controls. Signed-off-by; Benjamin Herrenschmidt Index: linux-work/drivers/macintosh/Kconfig =================================================================== --- linux-work.orig/drivers/macintosh/Kconfig 2006-01-12 16:33:08.000000000 +1100 +++ linux-work/drivers/macintosh/Kconfig 2006-02-07 13:45:57.000000000 +1100 @@ -187,6 +187,14 @@ config WINDFARM_PM91 This driver provides thermal control for the PowerMac9,1 which is the recent (SMU based) single CPU desktop G5 +config WINDFARM_PM112 + tristate "Support for thermal management on PowerMac11,2" + depends on WINDFARM && I2C && PMAC_SMU + select I2C_PMAC_SMU + help + This driver provides thermal control for the PowerMac11,2 + which are the recent dual and quad G5 machines using the + 970MP dual-core processor. 
config ANSLCD tristate "Support for ANS LCD display" Index: linux-work/drivers/macintosh/Makefile =================================================================== --- linux-work.orig/drivers/macintosh/Makefile 2005-11-09 11:49:03.000000000 +1100 +++ linux-work/drivers/macintosh/Makefile 2006-02-07 13:45:57.000000000 +1100 @@ -35,3 +35,8 @@ obj-$(CONFIG_WINDFARM_PM91) += windf windfarm_smu_sensors.o \ windfarm_lm75_sensor.o windfarm_pid.o \ windfarm_cpufreq_clamp.o windfarm_pm91.o +obj-$(CONFIG_WINDFARM_PM112) += windfarm_pm112.o windfarm_smu_sat.o \ + windfarm_smu_controls.o \ + windfarm_smu_sensors.o \ + windfarm_max6690_sensor.o \ + windfarm_lm75_sensor.o windfarm_pid.o Index: linux-work/drivers/macintosh/windfarm.h =================================================================== --- linux-work.orig/drivers/macintosh/windfarm.h 2005-11-09 11:49:03.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm.h 2006-02-07 13:45:57.000000000 +1100 @@ -14,6 +14,7 @@ #include #include #include +#include /* Display a 16.16 fixed point value */ #define FIX32TOPRINT(f) ((f) >> 16),((((f) & 0xffff) * 1000) >> 16) @@ -39,6 +40,7 @@ struct wf_control { char *name; int type; struct kref ref; + struct device_attribute attr; }; #define WF_CONTROL_TYPE_GENERIC 0 @@ -87,6 +89,7 @@ struct wf_sensor { struct wf_sensor_ops *ops; char *name; struct kref ref; + struct device_attribute attr; }; /* Same lifetime rules as controls */ Index: linux-work/drivers/macintosh/windfarm_core.c =================================================================== --- linux-work.orig/drivers/macintosh/windfarm_core.c 2005-11-09 11:49:03.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm_core.c 2006-02-07 13:45:57.000000000 +1100 @@ -55,6 +55,10 @@ static unsigned int wf_overtemp; static unsigned int wf_overtemp_counter; struct task_struct *wf_thread; +static struct platform_device wf_platform_device = { + .name = "windfarm", +}; + /* * Utilities & tick thread */ @@ -156,6 +160,40 @@ static 
void wf_control_release(struct kr kfree(ct); } +static ssize_t wf_show_control(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct wf_control *ctrl = container_of(attr, struct wf_control, attr); + s32 val = 0; + int err; + + err = ctrl->ops->get_value(ctrl, &val); + if (err < 0) + return err; + return sprintf(buf, "%d\n", val); +} + +/* This is really only for debugging... */ +static ssize_t wf_store_control(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct wf_control *ctrl = container_of(attr, struct wf_control, attr); + int val; + int err; + char *endp; + + val = simple_strtoul(buf, &endp, 0); + while (endp < buf + count && (*endp == ' ' || *endp == '\n')) + ++endp; + if (endp - buf < count) + return -EINVAL; + err = ctrl->ops->set_value(ctrl, val); + if (err < 0) + return err; + return count; +} + int wf_register_control(struct wf_control *new_ct) { struct wf_control *ct; @@ -172,6 +210,13 @@ int wf_register_control(struct wf_contro kref_init(&new_ct->ref); list_add(&new_ct->link, &wf_controls); + new_ct->attr.attr.name = new_ct->name; + new_ct->attr.attr.owner = THIS_MODULE; + new_ct->attr.attr.mode = 0644; + new_ct->attr.show = wf_show_control; + new_ct->attr.store = wf_store_control; + device_create_file(&wf_platform_device.dev, &new_ct->attr); + DBG("wf: Registered control %s\n", new_ct->name); wf_notify(WF_EVENT_NEW_CONTROL, new_ct); @@ -246,6 +291,19 @@ static void wf_sensor_release(struct kre kfree(sr); } +static ssize_t wf_show_sensor(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct wf_sensor *sens = container_of(attr, struct wf_sensor, attr); + s32 val = 0; + int err; + + err = sens->ops->get_value(sens, &val); + if (err < 0) + return err; + return sprintf(buf, "%d.%03d\n", FIX32TOPRINT(val)); +} + int wf_register_sensor(struct wf_sensor *new_sr) { struct wf_sensor *sr; @@ -262,6 +320,13 @@ int wf_register_sensor(struct wf_sensor kref_init(&new_sr->ref); 
list_add(&new_sr->link, &wf_sensors); + new_sr->attr.attr.name = new_sr->name; + new_sr->attr.attr.owner = THIS_MODULE; + new_sr->attr.attr.mode = 0444; + new_sr->attr.show = wf_show_sensor; + new_sr->attr.store = NULL; + device_create_file(&wf_platform_device.dev, &new_sr->attr); + DBG("wf: Registered sensor %s\n", new_sr->name); wf_notify(WF_EVENT_NEW_SENSOR, new_sr); @@ -395,10 +460,6 @@ int wf_is_overtemp(void) } EXPORT_SYMBOL_GPL(wf_is_overtemp); -static struct platform_device wf_platform_device = { - .name = "windfarm", -}; - static int __init windfarm_core_init(void) { DBG("wf: core loaded\n"); Index: linux-work/drivers/macintosh/windfarm_max6690_sensor.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-work/drivers/macintosh/windfarm_max6690_sensor.c 2006-02-07 16:05:23.000000000 +1100 @@ -0,0 +1,169 @@ +/* + * Windfarm PowerMac thermal control. MAX6690 sensor. + * + * Copyright (C) 2005 Paul Mackerras, IBM Corp. + * + * Use and redistribute under the terms of the GNU GPL v2. + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "windfarm.h" + +#define VERSION "0.1" + +/* This currently only exports the external temperature sensor, + since that's all the control loops need. 
*/ + +/* Some MAX6690 register numbers */ +#define MAX6690_INTERNAL_TEMP 0 +#define MAX6690_EXTERNAL_TEMP 1 + +struct wf_6690_sensor { + struct i2c_client i2c; + struct wf_sensor sens; +}; + +#define wf_to_6690(x) container_of((x), struct wf_6690_sensor, sens) +#define i2c_to_6690(x) container_of((x), struct wf_6690_sensor, i2c) + +static int wf_max6690_attach(struct i2c_adapter *adapter); +static int wf_max6690_detach(struct i2c_client *client); + +static struct i2c_driver wf_max6690_driver = { + .driver = { + .name = "wf_max6690", + }, + .attach_adapter = wf_max6690_attach, + .detach_client = wf_max6690_detach, +}; + +static int wf_max6690_get(struct wf_sensor *sr, s32 *value) +{ + struct wf_6690_sensor *max = wf_to_6690(sr); + s32 data; + + if (max->i2c.adapter == NULL) + return -ENODEV; + + /* chip gets initialized by firmware */ + data = i2c_smbus_read_byte_data(&max->i2c, MAX6690_EXTERNAL_TEMP); + if (data < 0) + return data; + *value = data << 16; + return 0; +} + +static void wf_max6690_release(struct wf_sensor *sr) +{ + struct wf_6690_sensor *max = wf_to_6690(sr); + + if (max->i2c.adapter) { + i2c_detach_client(&max->i2c); + max->i2c.adapter = NULL; + } + kfree(max); +} + +static struct wf_sensor_ops wf_max6690_ops = { + .get_value = wf_max6690_get, + .release = wf_max6690_release, + .owner = THIS_MODULE, +}; + +static void wf_max6690_create(struct i2c_adapter *adapter, u8 addr) +{ + struct wf_6690_sensor *max; + char *name = "u4-temp"; + + max = kzalloc(sizeof(struct wf_6690_sensor), GFP_KERNEL); + if (max == NULL) { + printk(KERN_ERR "windfarm: Couldn't create MAX6690 sensor %s: " + "no memory\n", name); + return; + } + + max->sens.ops = &wf_max6690_ops; + max->sens.name = name; + max->i2c.addr = addr >> 1; + max->i2c.adapter = adapter; + max->i2c.driver = &wf_max6690_driver; + strncpy(max->i2c.name, name, I2C_NAME_SIZE-1); + + if (i2c_attach_client(&max->i2c)) { + printk(KERN_ERR "windfarm: failed to attach MAX6690 sensor\n"); + goto fail; + } + + if 
(wf_register_sensor(&max->sens)) { + i2c_detach_client(&max->i2c); + goto fail; + } + + return; + + fail: + kfree(max); +} + +static int wf_max6690_attach(struct i2c_adapter *adapter) +{ + struct device_node *busnode, *dev = NULL; + struct pmac_i2c_bus *bus; + const char *loc; + u32 *reg; + + bus = pmac_i2c_adapter_to_bus(adapter); + if (bus == NULL) + return -ENODEV; + busnode = pmac_i2c_get_bus_node(bus); + + while ((dev = of_get_next_child(busnode, dev)) != NULL) { + if (!device_is_compatible(dev, "max6690")) + continue; + loc = get_property(dev, "hwsensor-location", NULL); + reg = (u32 *) get_property(dev, "reg", NULL); + if (!loc || !reg) + continue; + printk("found max6690, loc=%s reg=%x\n", loc, *reg); + if (strcmp(loc, "BACKSIDE")) + continue; + wf_max6690_create(adapter, *reg); + } + + return 0; +} + +static int wf_max6690_detach(struct i2c_client *client) +{ + struct wf_6690_sensor *max = i2c_to_6690(client); + + max->i2c.adapter = NULL; + wf_unregister_sensor(&max->sens); + + return 0; +} + +static int __init wf_max6690_sensor_init(void) +{ + return i2c_add_driver(&wf_max6690_driver); +} + +static void __exit wf_max6690_sensor_exit(void) +{ + i2c_del_driver(&wf_max6690_driver); +} + +module_init(wf_max6690_sensor_init); +module_exit(wf_max6690_sensor_exit); + +MODULE_AUTHOR("Paul Mackerras "); +MODULE_DESCRIPTION("MAX6690 sensor objects for PowerMac thermal control"); +MODULE_LICENSE("GPL"); Index: linux-work/drivers/macintosh/windfarm_pid.c =================================================================== --- linux-work.orig/drivers/macintosh/windfarm_pid.c 2005-11-09 11:49:03.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm_pid.c 2006-02-07 13:45:57.000000000 +1100 @@ -88,8 +88,8 @@ EXPORT_SYMBOL_GPL(wf_cpu_pid_init); s32 wf_cpu_pid_run(struct wf_cpu_pid_state *st, s32 new_power, s32 new_temp) { - s64 error, integ, deriv, prop; - s32 target, sval, adj; + s64 integ, deriv, prop; + s32 error, target, sval, adj; int i, hlen = 
st->param.history_len; /* Calculate error term */ @@ -117,7 +117,7 @@ s32 wf_cpu_pid_run(struct wf_cpu_pid_sta integ += st->errors[(st->index + hlen - i) % hlen]; integ *= st->param.interval; integ *= st->param.gr; - sval = st->param.tmax - ((integ >> 20) & 0xffffffff); + sval = st->param.tmax - (s32)(integ >> 20); adj = min(st->param.ttarget, sval); DBG("integ: %lx, sval: %lx, adj: %lx\n", integ, sval, adj); @@ -129,7 +129,7 @@ s32 wf_cpu_pid_run(struct wf_cpu_pid_sta deriv *= st->param.gd; /* Calculate proportional term */ - prop = (new_temp - adj); + prop = st->last_delta = (new_temp - adj); prop *= st->param.gp; DBG("deriv: %lx, prop: %lx\n", deriv, prop); Index: linux-work/drivers/macintosh/windfarm_pid.h =================================================================== --- linux-work.orig/drivers/macintosh/windfarm_pid.h 2005-11-09 11:49:03.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm_pid.h 2006-02-07 13:45:57.000000000 +1100 @@ -72,6 +72,7 @@ struct wf_cpu_pid_state { int index; /* index of current power */ int tindex; /* index of current temp */ s32 target; /* current target value */ + s32 last_delta; /* last Tactual - Ttarget */ s32 powers[WF_PID_MAX_HISTORY]; /* power history buffer */ s32 errors[WF_PID_MAX_HISTORY]; /* error history buffer */ s32 temps[2]; /* temp. history buffer */ Index: linux-work/drivers/macintosh/windfarm_pm112.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-work/drivers/macintosh/windfarm_pm112.c 2006-02-08 16:28:38.000000000 +1100 @@ -0,0 +1,698 @@ +/* + * Windfarm PowerMac thermal control. + * Control loops for machines with SMU and PPC970MP processors. + * + * Copyright (C) 2005 Paul Mackerras, IBM Corp. + * Copyright (C) 2006 Benjamin Herrenschmidt, IBM Corp. + * + * Use and redistribute under the terms of the GNU GPL v2. 
+ */ +#include +#include +#include +#include +#include +#include +#include +#include + +#include "windfarm.h" +#include "windfarm_pid.h" + +#define VERSION "0.2" + +#define DEBUG +#undef LOTSA_DEBUG + +#ifdef DEBUG +#define DBG(args...) printk(args) +#else +#define DBG(args...) do { } while(0) +#endif + +#ifdef LOTSA_DEBUG +#define DBG_LOTS(args...) printk(args) +#else +#define DBG_LOTS(args...) do { } while(0) +#endif + +/* define this to force CPU overtemp to 60 degree, useful for testing + * the overtemp code + */ +#undef HACKED_OVERTEMP + +/* We currently only handle 2 chips, 4 cores... */ +#define NR_CHIPS 2 +#define NR_CORES 4 +#define NR_CPU_FANS 3 * NR_CHIPS + +/* Controls and sensors */ +static struct wf_sensor *sens_cpu_temp[NR_CORES]; +static struct wf_sensor *sens_cpu_power[NR_CORES]; +static struct wf_sensor *hd_temp; +static struct wf_sensor *slots_power; +static struct wf_sensor *u4_temp; + +static struct wf_control *cpu_fans[NR_CPU_FANS]; +static char *cpu_fan_names[NR_CPU_FANS] = { + "cpu-rear-fan-0", + "cpu-rear-fan-1", + "cpu-front-fan-0", + "cpu-front-fan-1", + "cpu-pump-0", + "cpu-pump-1", +}; +static struct wf_control *cpufreq_clamp; + +/* Second pump isn't required (and isn't actually present) */ +#define CPU_FANS_REQD (NR_CPU_FANS - 2) +#define FIRST_PUMP 4 +#define LAST_PUMP 5 + +/* We keep a temperature history for average calculation of 180s */ +#define CPU_TEMP_HIST_SIZE 180 + +/* Scale factor for fan speed, *100 */ +static int cpu_fan_scale[NR_CPU_FANS] = { + 100, + 100, + 97, /* inlet fans run at 97% of exhaust fan */ + 97, + 100, /* updated later */ + 100, /* updated later */ +}; + +static struct wf_control *backside_fan; +static struct wf_control *slots_fan; +static struct wf_control *drive_bay_fan; + +/* PID loop state */ +static struct wf_cpu_pid_state cpu_pid[NR_CORES]; +static u32 cpu_thist[CPU_TEMP_HIST_SIZE]; +static int cpu_thist_pt; +static s64 cpu_thist_total; +static s32 cpu_all_tmax = 100 << 16; +static int 
cpu_last_target; +static struct wf_pid_state backside_pid; +static int backside_tick; +static struct wf_pid_state slots_pid; +static int slots_started; +static struct wf_pid_state drive_bay_pid; +static int drive_bay_tick; + +static int nr_cores; +static int have_all_controls; +static int have_all_sensors; +static int started; + +static int failure_state; +#define FAILURE_SENSOR 1 +#define FAILURE_FAN 2 +#define FAILURE_PERM 4 +#define FAILURE_LOW_OVERTEMP 8 +#define FAILURE_HIGH_OVERTEMP 16 + +/* Overtemp values */ +#define LOW_OVER_AVERAGE 0 +#define LOW_OVER_IMMEDIATE (10 << 16) +#define LOW_OVER_CLEAR ((-10) << 16) +#define HIGH_OVER_IMMEDIATE (14 << 16) +#define HIGH_OVER_AVERAGE (10 << 16) +#define HIGH_OVER_IMMEDIATE (14 << 16) + + +/* Implementation... */ +static int create_cpu_loop(int cpu) +{ + int chip = cpu / 2; + int core = cpu & 1; + struct smu_sdbp_header *hdr; + struct smu_sdbp_cpupiddata *piddata; + struct wf_cpu_pid_param pid; + struct wf_control *main_fan = cpu_fans[0]; + s32 tmax; + int fmin; + + /* Get PID params from the appropriate SAT */ + hdr = smu_sat_get_sdb_partition(chip, 0xC8 + core, NULL); + if (hdr == NULL) { + printk(KERN_WARNING"windfarm: can't get CPU PID fan config\n"); + return -EINVAL; + } + piddata = (struct smu_sdbp_cpupiddata *)&hdr[1]; + + /* Get FVT params to get Tmax; if not found, assume default */ + hdr = smu_sat_get_sdb_partition(chip, 0xC4 + core, NULL); + if (hdr) { + struct smu_sdbp_fvt *fvt = (struct smu_sdbp_fvt *)&hdr[1]; + tmax = fvt->maxtemp << 16; + } else + tmax = 95 << 16; /* default to 95 degrees C */ + + /* We keep a global tmax for overtemp calculations */ + if (tmax < cpu_all_tmax) + cpu_all_tmax = tmax; + + /* + * Darwin has a minimum fan speed of 1000 rpm for the 4-way and + * 515 for the 2-way. That appears to be overkill, so for now, + * impose a minimum of 750 or 515. + */ + fmin = (nr_cores > 2) ? 
750 : 515; + + /* Initialize PID loop */ + pid.interval = 1; /* seconds */ + pid.history_len = piddata->history_len; + pid.gd = piddata->gd; + pid.gp = piddata->gp; + pid.gr = piddata->gr / piddata->history_len; + pid.pmaxadj = (piddata->max_power << 16) - (piddata->power_adj << 8); + pid.ttarget = tmax - (piddata->target_temp_delta << 16); + pid.tmax = tmax; + pid.min = main_fan->ops->get_min(main_fan); + pid.max = main_fan->ops->get_max(main_fan); + if (pid.min < fmin) + pid.min = fmin; + + wf_cpu_pid_init(&cpu_pid[cpu], &pid); + return 0; +} + +static void cpu_max_all_fans(void) +{ + int i; + + /* We max all CPU fans in case of a sensor error. We also do the + * cpufreq clamping now, even if it's supposedly done later by the + * generic code anyway, we do it earlier here to react faster + */ + if (cpufreq_clamp) + wf_control_set_max(cpufreq_clamp); + for (i = 0; i < NR_CPU_FANS; ++i) + if (cpu_fans[i]) + wf_control_set_max(cpu_fans[i]); +} + +static int cpu_check_overtemp(s32 temp) +{ + int new_state = 0; + s32 t_avg, t_old; + + /* First check for immediate overtemps */ + if (temp >= (cpu_all_tmax + LOW_OVER_IMMEDIATE)) { + new_state |= FAILURE_LOW_OVERTEMP; + if ((failure_state & FAILURE_LOW_OVERTEMP) == 0) + printk(KERN_ERR "windfarm: Overtemp due to immediate CPU" + " temperature !\n"); + } + if (temp >= (cpu_all_tmax + HIGH_OVER_IMMEDIATE)) { + new_state |= FAILURE_HIGH_OVERTEMP; + if ((failure_state & FAILURE_HIGH_OVERTEMP) == 0) + printk(KERN_ERR "windfarm: Critical overtemp due to" + " immediate CPU temperature !\n"); + } + + /* We calculate a history of max temperatures and use that for the + * overtemp management + */ + t_old = cpu_thist[cpu_thist_pt]; + cpu_thist[cpu_thist_pt] = temp; + cpu_thist_pt = (cpu_thist_pt + 1) % CPU_TEMP_HIST_SIZE; + cpu_thist_total -= t_old; + cpu_thist_total += temp; + t_avg = cpu_thist_total / CPU_TEMP_HIST_SIZE; + + DBG_LOTS("t_avg = %d.%03d (out: %d.%03d, in: %d.%03d)\n", + FIX32TOPRINT(t_avg), FIX32TOPRINT(t_old), 
FIX32TOPRINT(temp)); + + /* Now check for average overtemps */ + if (t_avg >= (cpu_all_tmax + LOW_OVER_AVERAGE)) { + new_state |= FAILURE_LOW_OVERTEMP; + if ((failure_state & FAILURE_LOW_OVERTEMP) == 0) + printk(KERN_ERR "windfarm: Overtemp due to average CPU" + " temperature !\n"); + } + if (t_avg >= (cpu_all_tmax + HIGH_OVER_AVERAGE)) { + new_state |= FAILURE_HIGH_OVERTEMP; + if ((failure_state & FAILURE_HIGH_OVERTEMP) == 0) + printk(KERN_ERR "windfarm: Critical overtemp due to" + " average CPU temperature !\n"); + } + + /* Now handle overtemp conditions. We don't currently use the windfarm + * overtemp handling core as it's not fully suited to the needs of those + * new machine. This will be fixed later. + */ + if (new_state) { + /* High overtemp -> immediate shutdown */ + if (new_state & FAILURE_HIGH_OVERTEMP) + machine_power_off(); + if ((failure_state & new_state) != new_state) + cpu_max_all_fans(); + failure_state |= new_state; + } else if ((failure_state & FAILURE_LOW_OVERTEMP) && + (temp < (cpu_all_tmax + LOW_OVER_CLEAR))) { + printk(KERN_ERR "windfarm: Overtemp condition cleared !\n"); + failure_state &= ~FAILURE_LOW_OVERTEMP; + } + + return failure_state & (FAILURE_LOW_OVERTEMP | FAILURE_HIGH_OVERTEMP); +} + +static void cpu_fans_tick(void) +{ + int err, cpu; + s32 greatest_delta = 0; + s32 temp, power, t_max = 0; + int i, t, target = 0; + struct wf_sensor *sr; + struct wf_control *ct; + struct wf_cpu_pid_state *sp; + + DBG_LOTS(KERN_DEBUG); + for (cpu = 0; cpu < nr_cores; ++cpu) { + /* Get CPU core temperature */ + sr = sens_cpu_temp[cpu]; + err = sr->ops->get_value(sr, &temp); + if (err) { + DBG("\n"); + printk(KERN_WARNING "windfarm: CPU %d temperature " + "sensor error %d\n", cpu, err); + failure_state |= FAILURE_SENSOR; + cpu_max_all_fans(); + return; + } + + /* Keep track of highest temp */ + t_max = max(t_max, temp); + + /* Get CPU power */ + sr = sens_cpu_power[cpu]; + err = sr->ops->get_value(sr, &power); + if (err) { + DBG("\n"); + 
printk(KERN_WARNING "windfarm: CPU %d power " + "sensor error %d\n", cpu, err); + failure_state |= FAILURE_SENSOR; + cpu_max_all_fans(); + return; + } + + /* Run PID */ + sp = &cpu_pid[cpu]; + t = wf_cpu_pid_run(sp, power, temp); + + if (cpu == 0 || sp->last_delta > greatest_delta) { + greatest_delta = sp->last_delta; + target = t; + } + DBG_LOTS("[%d] P=%d.%.3d T=%d.%.3d ", + cpu, FIX32TOPRINT(power), FIX32TOPRINT(temp)); + } + DBG_LOTS("fans = %d, t_max = %d.%03d\n", target, FIX32TOPRINT(t_max)); + + /* Darwin limits decrease to 20 per iteration */ + if (target < (cpu_last_target - 20)) + target = cpu_last_target - 20; + cpu_last_target = target; + for (cpu = 0; cpu < nr_cores; ++cpu) + cpu_pid[cpu].target = target; + + /* Handle possible overtemps */ + if (cpu_check_overtemp(t_max)) + return; + + /* Set fans */ + for (i = 0; i < NR_CPU_FANS; ++i) { + ct = cpu_fans[i]; + if (ct == NULL) + continue; + err = ct->ops->set_value(ct, target * cpu_fan_scale[i] / 100); + if (err) { + printk(KERN_WARNING "windfarm: fan %s reports " + "error %d\n", ct->name, err); + failure_state |= FAILURE_FAN; + break; + } + } +} + +/* Backside/U4 fan */ +static struct wf_pid_param backside_param = { + .interval = 5, + .history_len = 2, + .gd = 48 << 20, + .gp = 5 << 20, + .gr = 0, + .itarget = 64 << 16, + .additive = 1, +}; + +static void backside_fan_tick(void) +{ + s32 temp; + int speed; + int err; + + if (!backside_fan || !u4_temp) + return; + if (!backside_tick) { + /* first time; initialize things */ + backside_param.min = backside_fan->ops->get_min(backside_fan); + backside_param.max = backside_fan->ops->get_max(backside_fan); + wf_pid_init(&backside_pid, &backside_param); + backside_tick = 1; + } + if (--backside_tick > 0) + return; + backside_tick = backside_pid.param.interval; + + err = u4_temp->ops->get_value(u4_temp, &temp); + if (err) { + printk(KERN_WARNING "windfarm: U4 temp sensor error %d\n", + err); + failure_state |= FAILURE_SENSOR; + wf_control_set_max(backside_fan); 
+ return; + } + speed = wf_pid_run(&backside_pid, temp); + DBG_LOTS("backside PID temp=%d.%.3d speed=%d\n", + FIX32TOPRINT(temp), speed); + + err = backside_fan->ops->set_value(backside_fan, speed); + if (err) { + printk(KERN_WARNING "windfarm: backside fan error %d\n", err); + failure_state |= FAILURE_FAN; + } +} + +/* Drive bay fan */ +static struct wf_pid_param drive_bay_prm = { + .interval = 5, + .history_len = 2, + .gd = 30 << 20, + .gp = 5 << 20, + .gr = 0, + .itarget = 40 << 16, + .additive = 1, +}; + +static void drive_bay_fan_tick(void) +{ + s32 temp; + int speed; + int err; + + if (!drive_bay_fan || !hd_temp) + return; + if (!drive_bay_tick) { + /* first time; initialize things */ + drive_bay_prm.min = drive_bay_fan->ops->get_min(drive_bay_fan); + drive_bay_prm.max = drive_bay_fan->ops->get_max(drive_bay_fan); + wf_pid_init(&drive_bay_pid, &drive_bay_prm); + drive_bay_tick = 1; + } + if (--drive_bay_tick > 0) + return; + drive_bay_tick = drive_bay_pid.param.interval; + + err = hd_temp->ops->get_value(hd_temp, &temp); + if (err) { + printk(KERN_WARNING "windfarm: drive bay temp sensor " + "error %d\n", err); + failure_state |= FAILURE_SENSOR; + wf_control_set_max(drive_bay_fan); + return; + } + speed = wf_pid_run(&drive_bay_pid, temp); + DBG_LOTS("drive_bay PID temp=%d.%.3d speed=%d\n", + FIX32TOPRINT(temp), speed); + + err = drive_bay_fan->ops->set_value(drive_bay_fan, speed); + if (err) { + printk(KERN_WARNING "windfarm: drive bay fan error %d\n", err); + failure_state |= FAILURE_FAN; + } +} + +/* PCI slots area fan */ +/* This makes the fan speed proportional to the power consumed */ +static struct wf_pid_param slots_param = { + .interval = 1, + .history_len = 2, + .gd = 0, + .gp = 0, + .gr = 0x1277952, + .itarget = 0, + .min = 1560, + .max = 3510, +}; + +static void slots_fan_tick(void) +{ + s32 power; + int speed; + int err; + + if (!slots_fan || !slots_power) + return; + if (!slots_started) { + /* first time; initialize things */ + 
wf_pid_init(&slots_pid, &slots_param); + slots_started = 1; + } + + err = slots_power->ops->get_value(slots_power, &power); + if (err) { + printk(KERN_WARNING "windfarm: slots power sensor error %d\n", + err); + failure_state |= FAILURE_SENSOR; + wf_control_set_max(slots_fan); + return; + } + speed = wf_pid_run(&slots_pid, power); + DBG_LOTS("slots PID power=%d.%.3d speed=%d\n", + FIX32TOPRINT(power), speed); + + err = slots_fan->ops->set_value(slots_fan, speed); + if (err) { + printk(KERN_WARNING "windfarm: slots fan error %d\n", err); + failure_state |= FAILURE_FAN; + } +} + +static void set_fail_state(void) +{ + int i; + + if (cpufreq_clamp) + wf_control_set_max(cpufreq_clamp); + for (i = 0; i < NR_CPU_FANS; ++i) + if (cpu_fans[i]) + wf_control_set_max(cpu_fans[i]); + if (backside_fan) + wf_control_set_max(backside_fan); + if (slots_fan) + wf_control_set_max(slots_fan); + if (drive_bay_fan) + wf_control_set_max(drive_bay_fan); +} + +static void pm112_tick(void) +{ + int i, last_failure; + + if (!started) { + started = 1; + for (i = 0; i < nr_cores; ++i) { + if (create_cpu_loop(i) < 0) { + failure_state = FAILURE_PERM; + set_fail_state(); + break; + } + } + DBG_LOTS("cpu_all_tmax=%d.%03d\n", FIX32TOPRINT(cpu_all_tmax)); + +#ifdef HACKED_OVERTEMP + cpu_all_tmax = 60 << 16; +#endif + } + + /* Permanent failure, bail out */ + if (failure_state & FAILURE_PERM) + return; + /* Clear all failure bits except low overtemp which will be eventually + * cleared by the control loop itself + */ + last_failure = failure_state; + failure_state &= FAILURE_LOW_OVERTEMP; + cpu_fans_tick(); + backside_fan_tick(); + slots_fan_tick(); + drive_bay_fan_tick(); + + DBG_LOTS("last_failure: 0x%x, failure_state: %x\n", + last_failure, failure_state); + + /* Check for failures. 
Any failure causes cpufreq clamping */ + if (failure_state && last_failure == 0 && cpufreq_clamp) + wf_control_set_max(cpufreq_clamp); + if (failure_state == 0 && last_failure && cpufreq_clamp) + wf_control_set_min(cpufreq_clamp); + + /* That's it for now, we might want to deal with other failures + * differently in the future though + */ +} + +static void pm112_new_control(struct wf_control *ct) +{ + int i, max_exhaust; + + if (cpufreq_clamp == NULL && !strcmp(ct->name, "cpufreq-clamp")) { + if (wf_get_control(ct) == 0) + cpufreq_clamp = ct; + } + + for (i = 0; i < NR_CPU_FANS; ++i) { + if (!strcmp(ct->name, cpu_fan_names[i])) { + if (cpu_fans[i] == NULL && wf_get_control(ct) == 0) + cpu_fans[i] = ct; + break; + } + } + if (i >= NR_CPU_FANS) { + /* not a CPU fan, try the others */ + if (!strcmp(ct->name, "backside-fan")) { + if (backside_fan == NULL && wf_get_control(ct) == 0) + backside_fan = ct; + } else if (!strcmp(ct->name, "slots-fan")) { + if (slots_fan == NULL && wf_get_control(ct) == 0) + slots_fan = ct; + } else if (!strcmp(ct->name, "drive-bay-fan")) { + if (drive_bay_fan == NULL && wf_get_control(ct) == 0) + drive_bay_fan = ct; + } + return; + } + + for (i = 0; i < CPU_FANS_REQD; ++i) + if (cpu_fans[i] == NULL) + return; + + /* work out pump scaling factors */ + max_exhaust = cpu_fans[0]->ops->get_max(cpu_fans[0]); + for (i = FIRST_PUMP; i <= LAST_PUMP; ++i) + if ((ct = cpu_fans[i]) != NULL) + cpu_fan_scale[i] = + ct->ops->get_max(ct) * 100 / max_exhaust; + + have_all_controls = 1; +} + +static void pm112_new_sensor(struct wf_sensor *sr) +{ + unsigned int i; + + if (have_all_sensors) + return; + if (!strncmp(sr->name, "cpu-temp-", 9)) { + i = sr->name[9] - '0'; + if (sr->name[10] == 0 && i < NR_CORES && + sens_cpu_temp[i] == NULL && wf_get_sensor(sr) == 0) + sens_cpu_temp[i] = sr; + + } else if (!strncmp(sr->name, "cpu-power-", 10)) { + i = sr->name[10] - '0'; + if (sr->name[11] == 0 && i < NR_CORES && + sens_cpu_power[i] == NULL && wf_get_sensor(sr) == 
0) + sens_cpu_power[i] = sr; + } else if (!strcmp(sr->name, "hd-temp")) { + if (hd_temp == NULL && wf_get_sensor(sr) == 0) + hd_temp = sr; + } else if (!strcmp(sr->name, "slots-power")) { + if (slots_power == NULL && wf_get_sensor(sr) == 0) + slots_power = sr; + } else if (!strcmp(sr->name, "u4-temp")) { + if (u4_temp == NULL && wf_get_sensor(sr) == 0) + u4_temp = sr; + } else + return; + + /* check if we have all the sensors we need */ + for (i = 0; i < nr_cores; ++i) + if (sens_cpu_temp[i] == NULL || sens_cpu_power[i] == NULL) + return; + + have_all_sensors = 1; +} + +static int pm112_wf_notify(struct notifier_block *self, + unsigned long event, void *data) +{ + switch (event) { + case WF_EVENT_NEW_SENSOR: + pm112_new_sensor(data); + break; + case WF_EVENT_NEW_CONTROL: + pm112_new_control(data); + break; + case WF_EVENT_TICK: + if (have_all_controls && have_all_sensors) + pm112_tick(); + } + return 0; +} + +static struct notifier_block pm112_events = { + .notifier_call = pm112_wf_notify, +}; + +static int wf_pm112_probe(struct device *dev) +{ + wf_register_client(&pm112_events); + return 0; +} + +static int wf_pm112_remove(struct device *dev) +{ + wf_unregister_client(&pm112_events); + /* should release all sensors and controls */ + return 0; +} + +static struct device_driver wf_pm112_driver = { + .name = "windfarm", + .bus = &platform_bus_type, + .probe = wf_pm112_probe, + .remove = wf_pm112_remove, +}; + +static int __init wf_pm112_init(void) +{ + struct device_node *cpu; + + if (!machine_is_compatible("PowerMac11,2")) + return -ENODEV; + + /* Count the number of CPU cores */ + nr_cores = 0; + for (cpu = NULL; (cpu = of_find_node_by_type(cpu, "cpu")) != NULL; ) + ++nr_cores; + + printk(KERN_INFO "windfarm: initializing for dual-core desktop G5\n"); + driver_register(&wf_pm112_driver); + return 0; +} + +static void __exit wf_pm112_exit(void) +{ + driver_unregister(&wf_pm112_driver); +} + +module_init(wf_pm112_init); +module_exit(wf_pm112_exit); + 
+MODULE_AUTHOR("Paul Mackerras "); +MODULE_DESCRIPTION("Thermal control for PowerMac11,2"); +MODULE_LICENSE("GPL"); Index: linux-work/drivers/macintosh/windfarm_smu_controls.c =================================================================== --- linux-work.orig/drivers/macintosh/windfarm_smu_controls.c 2006-01-12 16:33:08.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm_smu_controls.c 2006-02-07 13:45:57.000000000 +1100 @@ -24,7 +24,7 @@ #include "windfarm.h" -#define VERSION "0.3" +#define VERSION "0.4" #undef DEBUG @@ -34,6 +34,8 @@ #define DBG(args...) do { } while(0) #endif +static int smu_supports_new_fans_ops = 1; + /* * SMU fans control object */ @@ -59,23 +61,49 @@ static int smu_set_fan(int pwm, u8 id, u /* Fill SMU command structure */ cmd.cmd = SMU_CMD_FAN_COMMAND; - cmd.data_len = 14; + + /* The SMU has an "old" and a "new" way of setting the fan speed + * Unfortunately, I found no reliable way to know which one works + * on a given machine model. After some investigations it appears + * that MacOS X just tries the new one, and if it fails fallbacks + * to the old ones ... Ugh. + */ + retry: + if (smu_supports_new_fans_ops) { + buffer[0] = 0x30; + buffer[1] = id; + *((u16 *)(&buffer[2])) = value; + cmd.data_len = 4; + } else { + if (id > 7) + return -EINVAL; + /* Fill argument buffer */ + memset(buffer, 0, 16); + buffer[0] = pwm ? 0x10 : 0x00; + buffer[1] = 0x01 << id; + *((u16 *)&buffer[2 + id * 2]) = value; + cmd.data_len = 14; + } + cmd.reply_len = 16; cmd.data_buf = cmd.reply_buf = buffer; cmd.status = 0; cmd.done = smu_done_complete; cmd.misc = &comp; - /* Fill argument buffer */ - memset(buffer, 0, 16); - buffer[0] = pwm ?
0x10 : 0x00; - buffer[1] = 0x01 << id; - *((u16 *)&buffer[2 + id * 2]) = value; - rc = smu_queue_cmd(&cmd); if (rc) return rc; wait_for_completion(&comp); + + /* Handle fallback (see coment above) */ + if (cmd.status != 0 && smu_supports_new_fans_ops) { + printk(KERN_WARNING "windfarm: SMU failed new fan command " + "falling back to old method\n"); + smu_supports_new_fans_ops = 0; + goto retry; + } + return cmd.status; } @@ -158,19 +186,29 @@ static struct smu_fan_control *smu_fan_c /* Names used on desktop models */ if (!strcmp(l, "Rear Fan 0") || !strcmp(l, "Rear Fan") || - !strcmp(l, "Rear fan 0") || !strcmp(l, "Rear fan")) + !strcmp(l, "Rear fan 0") || !strcmp(l, "Rear fan") || + !strcmp(l, "CPU A EXHAUST")) fct->ctrl.name = "cpu-rear-fan-0"; - else if (!strcmp(l, "Rear Fan 1") || !strcmp(l, "Rear fan 1")) + else if (!strcmp(l, "Rear Fan 1") || !strcmp(l, "Rear fan 1") || + !strcmp(l, "CPU B EXHAUST")) fct->ctrl.name = "cpu-rear-fan-1"; else if (!strcmp(l, "Front Fan 0") || !strcmp(l, "Front Fan") || - !strcmp(l, "Front fan 0") || !strcmp(l, "Front fan")) + !strcmp(l, "Front fan 0") || !strcmp(l, "Front fan") || + !strcmp(l, "CPU A INTAKE")) fct->ctrl.name = "cpu-front-fan-0"; - else if (!strcmp(l, "Front Fan 1") || !strcmp(l, "Front fan 1")) + else if (!strcmp(l, "Front Fan 1") || !strcmp(l, "Front fan 1") || + !strcmp(l, "CPU B INTAKE")) fct->ctrl.name = "cpu-front-fan-1"; - else if (!strcmp(l, "Slots Fan") || !strcmp(l, "Slots fan")) + else if (!strcmp(l, "CPU A PUMP")) + fct->ctrl.name = "cpu-pump-0"; + else if (!strcmp(l, "Slots Fan") || !strcmp(l, "Slots fan") || + !strcmp(l, "EXPANSION SLOTS INTAKE")) fct->ctrl.name = "slots-fan"; - else if (!strcmp(l, "Drive Bay") || !strcmp(l, "Drive bay")) + else if (!strcmp(l, "Drive Bay") || !strcmp(l, "Drive bay") || + !strcmp(l, "DRIVE BAY A INTAKE")) fct->ctrl.name = "drive-bay-fan"; + else if (!strcmp(l, "BACKSIDE")) + fct->ctrl.name = "backside-fan"; /* Names used on iMac models */ if (!strcmp(l, "System Fan") 
|| !strcmp(l, "System fan")) @@ -223,7 +261,8 @@ static int __init smu_controls_init(void /* Look for RPM fans */ for (fans = NULL; (fans = of_get_next_child(smu, fans)) != NULL;) - if (!strcmp(fans->name, "rpm-fans")) + if (!strcmp(fans->name, "rpm-fans") || + device_is_compatible(fans, "smu-rpm-fans")) break; for (fan = NULL; fans && (fan = of_get_next_child(fans, fan)) != NULL;) { Index: linux-work/drivers/macintosh/windfarm_smu_sensors.c =================================================================== --- linux-work.orig/drivers/macintosh/windfarm_smu_sensors.c 2006-01-12 16:33:08.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm_smu_sensors.c 2006-02-07 13:45:57.000000000 +1100 @@ -220,14 +220,29 @@ static struct smu_ad_sensor *smu_ads_cre !strcmp(l, "CPU T-Diode")) { ads->sens.ops = &smu_cputemp_ops; ads->sens.name = "cpu-temp"; + if (cpudiode == NULL) { + DBG("wf: cpudiode partition (%02x) not found\n", + SMU_SDB_CPUDIODE_ID); + goto fail; + } } else if (!strcmp(c, "current-sensor") && !strcmp(l, "CPU Current")) { ads->sens.ops = &smu_cpuamp_ops; ads->sens.name = "cpu-current"; + if (cpuvcp == NULL) { + DBG("wf: cpuvcp partition (%02x) not found\n", + SMU_SDB_CPUVCP_ID); + goto fail; + } } else if (!strcmp(c, "voltage-sensor") && !strcmp(l, "CPU Voltage")) { ads->sens.ops = &smu_cpuvolt_ops; ads->sens.name = "cpu-voltage"; + if (cpuvcp == NULL) { + DBG("wf: cpuvcp partition (%02x) not found\n", + SMU_SDB_CPUVCP_ID); + goto fail; + } } else if (!strcmp(c, "power-sensor") && !strcmp(l, "Slots Power")) { ads->sens.ops = &smu_slotspow_ops; @@ -365,29 +380,22 @@ smu_cpu_power_create(struct wf_sensor *v return NULL; } -static int smu_fetch_param_partitions(void) +static void smu_fetch_param_partitions(void) { struct smu_sdbp_header *hdr; /* Get CPU voltage/current/power calibration data */ hdr = smu_get_sdb_partition(SMU_SDB_CPUVCP_ID, NULL); - if (hdr == NULL) { - DBG("wf: cpuvcp partition (%02x) not found\n", - SMU_SDB_CPUVCP_ID); - return -ENODEV; + 
if (hdr != NULL) { + cpuvcp = (struct smu_sdbp_cpuvcp *)&hdr[1]; + /* Keep version around */ + cpuvcp_version = hdr->version; } - cpuvcp = (struct smu_sdbp_cpuvcp *)&hdr[1]; - /* Keep version around */ - cpuvcp_version = hdr->version; /* Get CPU diode calibration data */ hdr = smu_get_sdb_partition(SMU_SDB_CPUDIODE_ID, NULL); - if (hdr == NULL) { - DBG("wf: cpudiode partition (%02x) not found\n", - SMU_SDB_CPUDIODE_ID); - return -ENODEV; - } - cpudiode = (struct smu_sdbp_cpudiode *)&hdr[1]; + if (hdr != NULL) + cpudiode = (struct smu_sdbp_cpudiode *)&hdr[1]; /* Get slots power calibration data if any */ hdr = smu_get_sdb_partition(SMU_SDB_SLOTSPOW_ID, NULL); @@ -398,23 +406,18 @@ static int smu_fetch_param_partitions(vo hdr = smu_get_sdb_partition(SMU_SDB_DEBUG_SWITCHES_ID, NULL); if (hdr != NULL) debugswitches = (u8 *)&hdr[1]; - - return 0; } static int __init smu_sensors_init(void) { struct device_node *smu, *sensors, *s; struct smu_ad_sensor *volt_sensor = NULL, *curr_sensor = NULL; - int rc; if (!smu_present()) return -ENODEV; /* Get parameters partitions */ - rc = smu_fetch_param_partitions(); - if (rc) - return rc; + smu_fetch_param_partitions(); smu = of_find_node_by_type(NULL, "smu"); if (smu == NULL) Index: linux-work/drivers/macintosh/windfarm_smu_sat.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-work/drivers/macintosh/windfarm_smu_sat.c 2006-02-07 16:18:32.000000000 +1100 @@ -0,0 +1,418 @@ +/* + * Windfarm PowerMac thermal control. SMU "satellite" controller sensors. + * + * Copyright (C) 2005 Paul Mackerras, IBM Corp. + * + * Released under the terms of the GNU GPL v2. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "windfarm.h" + +#define VERSION "0.2" + +#define DEBUG + +#ifdef DEBUG +#define DBG(args...) printk(args) +#else +#define DBG(args...) 
do { } while(0) +#endif + +/* If the cache is older than 800ms we'll refetch it */ +#define MAX_AGE msecs_to_jiffies(800) + +struct wf_sat { + int nr; + atomic_t refcnt; + struct semaphore mutex; + unsigned long last_read; /* jiffies when cache last updated */ + u8 cache[16]; + struct i2c_client i2c; + struct device_node *node; +}; + +static struct wf_sat *sats[2]; + +struct wf_sat_sensor { + int index; + int index2; /* used for power sensors */ + int shift; + struct wf_sat *sat; + struct wf_sensor sens; +}; + +#define wf_to_sat(c) container_of(c, struct wf_sat_sensor, sens) +#define i2c_to_sat(c) container_of(c, struct wf_sat, i2c) + +static int wf_sat_attach(struct i2c_adapter *adapter); +static int wf_sat_detach(struct i2c_client *client); + +static struct i2c_driver wf_sat_driver = { + .driver = { + .name = "wf_smu_sat", + }, + .attach_adapter = wf_sat_attach, + .detach_client = wf_sat_detach, +}; + +/* + * XXX i2c_smbus_read_i2c_block_data doesn't pass the requested + * length down to the low-level driver, so we use this, which + * works well enough with the SMU i2c driver code... 
+ */ +static int sat_read_block(struct i2c_client *client, u8 command, + u8 *values, int len) +{ + union i2c_smbus_data data; + int err; + + data.block[0] = len; + err = i2c_smbus_xfer(client->adapter, client->addr, client->flags, + I2C_SMBUS_READ, command, I2C_SMBUS_I2C_BLOCK_DATA, + &data); + if (!err) + memcpy(values, data.block, len); + return err; +} + +struct smu_sdbp_header *smu_sat_get_sdb_partition(unsigned int sat_id, int id, + unsigned int *size) +{ + struct wf_sat *sat; + int err; + unsigned int i, len; + u8 *buf; + u8 data[4]; + + /* TODO: Add the resulting partition to the device-tree */ + + if (sat_id > 1 || (sat = sats[sat_id]) == NULL) + return NULL; + + err = i2c_smbus_write_word_data(&sat->i2c, 8, id << 8); + if (err) { + printk(KERN_ERR "smu_sat_get_sdb_part wr error %d\n", err); + return NULL; + } + + len = i2c_smbus_read_word_data(&sat->i2c, 9); + if (len < 0) { + printk(KERN_ERR "smu_sat_get_sdb_part rd len error\n"); + return NULL; + } + if (len == 0) { + printk(KERN_ERR "smu_sat_get_sdb_part no partition %x\n", id); + return NULL; + } + + len = le16_to_cpu(len); + len = (len + 3) & ~3; + buf = kmalloc(len, GFP_KERNEL); + if (buf == NULL) + return NULL; + + for (i = 0; i < len; i += 4) { + err = sat_read_block(&sat->i2c, 0xa, data, 4); + if (err) { + printk(KERN_ERR "smu_sat_get_sdb_part rd err %d\n", + err); + goto fail; + } + buf[i] = data[1]; + buf[i+1] = data[0]; + buf[i+2] = data[3]; + buf[i+3] = data[2]; + } +#ifdef DEBUG + DBG(KERN_DEBUG "sat %d partition %x:", sat_id, id); + for (i = 0; i < len; ++i) + DBG(" %x", buf[i]); + DBG("\n"); +#endif + + if (size) + *size = len; + return (struct smu_sdbp_header *) buf; + + fail: + kfree(buf); + return NULL; +} + +/* refresh the cache */ +static int wf_sat_read_cache(struct wf_sat *sat) +{ + int err; + + err = sat_read_block(&sat->i2c, 0x3f, sat->cache, 16); + if (err) + return err; + sat->last_read = jiffies; +#ifdef LOTSA_DEBUG + { + int i; + DBG(KERN_DEBUG "wf_sat_get: data is"); + for (i 
= 0; i < 16; ++i) + DBG(" %.2x", sat->cache[i]); + DBG("\n"); + } +#endif + return 0; +} + +static int wf_sat_get(struct wf_sensor *sr, s32 *value) +{ + struct wf_sat_sensor *sens = wf_to_sat(sr); + struct wf_sat *sat = sens->sat; + int i, err; + s32 val; + + if (sat->i2c.adapter == NULL) + return -ENODEV; + + down(&sat->mutex); + if (time_after(jiffies, (sat->last_read + MAX_AGE))) { + err = wf_sat_read_cache(sat); + if (err) + goto fail; + } + + i = sens->index * 2; + val = ((sat->cache[i] << 8) + sat->cache[i+1]) << sens->shift; + if (sens->index2 >= 0) { + i = sens->index2 * 2; + /* 4.12 * 8.8 -> 12.20; shift right 4 to get 16.16 */ + val = (val * ((sat->cache[i] << 8) + sat->cache[i+1])) >> 4; + } + + *value = val; + err = 0; + + fail: + up(&sat->mutex); + return err; +} + +static void wf_sat_release(struct wf_sensor *sr) +{ + struct wf_sat_sensor *sens = wf_to_sat(sr); + struct wf_sat *sat = sens->sat; + + if (atomic_dec_and_test(&sat->refcnt)) { + if (sat->i2c.adapter) { + i2c_detach_client(&sat->i2c); + sat->i2c.adapter = NULL; + } + if (sat->nr >= 0) + sats[sat->nr] = NULL; + kfree(sat); + } + kfree(sens); +} + +static struct wf_sensor_ops wf_sat_ops = { + .get_value = wf_sat_get, + .release = wf_sat_release, + .owner = THIS_MODULE, +}; + +static void wf_sat_create(struct i2c_adapter *adapter, struct device_node *dev) +{ + struct wf_sat *sat; + struct wf_sat_sensor *sens; + u32 *reg; + char *loc, *type; + u8 addr, chip, core; + struct device_node *child; + int shift, cpu, index; + char *name; + int vsens[2], isens[2]; + + reg = (u32 *) get_property(dev, "reg", NULL); + if (reg == NULL) + return; + addr = *reg; + DBG(KERN_DEBUG "wf_sat: creating sat at address %x\n", addr); + + sat = kzalloc(sizeof(struct wf_sat), GFP_KERNEL); + if (sat == NULL) + return; + sat->nr = -1; + sat->node = of_node_get(dev); + atomic_set(&sat->refcnt, 0); + init_MUTEX(&sat->mutex); + sat->i2c.addr = (addr >> 1) & 0x7f; + sat->i2c.adapter = adapter; + sat->i2c.driver = 
&wf_sat_driver; + strncpy(sat->i2c.name, "smu-sat", I2C_NAME_SIZE-1); + + if (i2c_attach_client(&sat->i2c)) { + printk(KERN_ERR "windfarm: failed to attach smu-sat to i2c\n"); + goto fail; + } + + vsens[0] = vsens[1] = -1; + isens[0] = isens[1] = -1; + child = NULL; + while ((child = of_get_next_child(dev, child)) != NULL) { + reg = (u32 *) get_property(child, "reg", NULL); + type = get_property(child, "device_type", NULL); + loc = get_property(child, "location", NULL); + if (reg == NULL || loc == NULL) + continue; + + /* the cooked sensors are between 0x30 and 0x37 */ + if (*reg < 0x30 || *reg > 0x37) + continue; + index = *reg - 0x30; + + /* expect location to be CPU [AB][01] ... */ + if (strncmp(loc, "CPU ", 4) != 0) + continue; + chip = loc[4] - 'A'; + core = loc[5] - '0'; + if (chip > 1 || core > 1) { + printk(KERN_ERR "wf_sat_create: don't understand " + "location %s for %s\n", loc, child->full_name); + continue; + } + cpu = 2 * chip + core; + if (sat->nr < 0) + sat->nr = chip; + else if (sat->nr != chip) { + printk(KERN_ERR "wf_sat_create: can't cope with " + "multiple CPU chips on one SAT (%s)\n", loc); + continue; + } + + if (strcmp(type, "voltage-sensor") == 0) { + name = "cpu-voltage"; + shift = 4; + vsens[core] = index; + } else if (strcmp(type, "current-sensor") == 0) { + name = "cpu-current"; + shift = 8; + isens[core] = index; + } else if (strcmp(type, "temp-sensor") == 0) { + name = "cpu-temp"; + shift = 10; + } else + continue; /* hmmm shouldn't happen */ + + /* the +16 is enough for "cpu-voltage-n" */ + sens = kzalloc(sizeof(struct wf_sat_sensor) + 16, GFP_KERNEL); + if (sens == NULL) { + printk(KERN_ERR "wf_sat_create: couldn't create " + "%s sensor %d (no memory)\n", name, cpu); + continue; + } + sens->index = index; + sens->index2 = -1; + sens->shift = shift; + sens->sat = sat; + atomic_inc(&sat->refcnt); + sens->sens.ops = &wf_sat_ops; + sens->sens.name = (char *) (sens + 1); + snprintf(sens->sens.name, 16, "%s-%d", name, cpu); + + if 
(wf_register_sensor(&sens->sens)) { + atomic_dec(&sat->refcnt); + kfree(sens); + } + } + + /* make the power sensors */ + for (core = 0; core < 2; ++core) { + if (vsens[core] < 0 || isens[core] < 0) + continue; + cpu = 2 * sat->nr + core; + sens = kzalloc(sizeof(struct wf_sat_sensor) + 16, GFP_KERNEL); + if (sens == NULL) { + printk(KERN_ERR "wf_sat_create: couldn't create power " + "sensor %d (no memory)\n", cpu); + continue; + } + sens->index = vsens[core]; + sens->index2 = isens[core]; + sens->shift = 0; + sens->sat = sat; + atomic_inc(&sat->refcnt); + sens->sens.ops = &wf_sat_ops; + sens->sens.name = (char *) (sens + 1); + snprintf(sens->sens.name, 16, "cpu-power-%d", cpu); + + if (wf_register_sensor(&sens->sens)) { + atomic_dec(&sat->refcnt); + kfree(sens); + } + } + + if (sat->nr >= 0) + sats[sat->nr] = sat; + + return; + + fail: + kfree(sat); +} + +static int wf_sat_attach(struct i2c_adapter *adapter) +{ + struct device_node *busnode, *dev = NULL; + struct pmac_i2c_bus *bus; + + bus = pmac_i2c_adapter_to_bus(adapter); + if (bus == NULL) + return -ENODEV; + busnode = pmac_i2c_get_bus_node(bus); + + while ((dev = of_get_next_child(busnode, dev)) != NULL) + if (device_is_compatible(dev, "smu-sat")) + wf_sat_create(adapter, dev); + return 0; +} + +static int wf_sat_detach(struct i2c_client *client) +{ + struct wf_sat *sat = i2c_to_sat(client); + + /* XXX TODO */ + + sat->i2c.adapter = NULL; + return 0; +} + +static int __init sat_sensors_init(void) +{ + int err; + + err = i2c_add_driver(&wf_sat_driver); + if (err < 0) + return err; + return 0; +} + +static void __exit sat_sensors_exit(void) +{ + i2c_del_driver(&wf_sat_driver); +} + +module_init(sat_sensors_init); +/*module_exit(sat_sensors_exit); Uncomment when cleanup is implemented */ + +MODULE_AUTHOR("Paul Mackerras "); +MODULE_DESCRIPTION("SMU satellite sensors for PowerMac thermal control"); +MODULE_LICENSE("GPL"); Index: linux-work/include/asm-powerpc/smu.h 
=================================================================== --- linux-work.orig/include/asm-powerpc/smu.h 2006-01-13 16:55:09.000000000 +1100 +++ linux-work/include/asm-powerpc/smu.h 2006-02-07 13:45:57.000000000 +1100 @@ -521,6 +521,11 @@ struct smu_sdbp_cpupiddata { extern struct smu_sdbp_header *smu_get_sdb_partition(int id, unsigned int *size); +/* Get "sdb" partition data from an SMU satellite */ +extern struct smu_sdbp_header *smu_sat_get_sdb_partition(unsigned int sat_id, + int id, unsigned int *size); + + #endif /* __KERNEL__ */ Index: linux-work/drivers/macintosh/windfarm_pm91.c =================================================================== --- linux-work.orig/drivers/macintosh/windfarm_pm91.c 2005-11-09 11:49:03.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm_pm91.c 2006-02-08 16:34:39.000000000 +1100 @@ -458,45 +458,6 @@ static void wf_smu_slots_fans_tick(struc /* - * ****** Attributes ****** - * - */ - -#define BUILD_SHOW_FUNC_FIX(name, data) \ -static ssize_t show_##name(struct device *dev, \ - struct device_attribute *attr, \ - char *buf) \ -{ \ - ssize_t r; \ - s32 val = 0; \ - data->ops->get_value(data, &val); \ - r = sprintf(buf, "%d.%03d", FIX32TOPRINT(val)); \ - return r; \ -} \ -static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL); - - -#define BUILD_SHOW_FUNC_INT(name, data) \ -static ssize_t show_##name(struct device *dev, \ - struct device_attribute *attr, \ - char *buf) \ -{ \ - s32 val = 0; \ - data->ops->get_value(data, &val); \ - return sprintf(buf, "%d", val); \ -} \ -static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL); - -BUILD_SHOW_FUNC_INT(cpu_fan, fan_cpu_main); -BUILD_SHOW_FUNC_INT(hd_fan, fan_hd); -BUILD_SHOW_FUNC_INT(slots_fan, fan_slots); - -BUILD_SHOW_FUNC_FIX(cpu_temp, sensor_cpu_temp); -BUILD_SHOW_FUNC_FIX(cpu_power, sensor_cpu_power); -BUILD_SHOW_FUNC_FIX(hd_temp, sensor_hd_temp); -BUILD_SHOW_FUNC_FIX(slots_power, sensor_slots_power); - -/* * ****** Setup / Init / Misc ... 
****** * */ @@ -581,10 +542,8 @@ static void wf_smu_new_control(struct wf return; if (fan_cpu_main == NULL && !strcmp(ct->name, "cpu-rear-fan-0")) { - if (wf_get_control(ct) == 0) { + if (wf_get_control(ct) == 0) fan_cpu_main = ct; - device_create_file(wf_smu_dev, &dev_attr_cpu_fan); - } } if (fan_cpu_second == NULL && !strcmp(ct->name, "cpu-rear-fan-1")) { @@ -603,17 +562,13 @@ static void wf_smu_new_control(struct wf } if (fan_hd == NULL && !strcmp(ct->name, "drive-bay-fan")) { - if (wf_get_control(ct) == 0) { + if (wf_get_control(ct) == 0) fan_hd = ct; - device_create_file(wf_smu_dev, &dev_attr_hd_fan); - } } if (fan_slots == NULL && !strcmp(ct->name, "slots-fan")) { - if (wf_get_control(ct) == 0) { + if (wf_get_control(ct) == 0) fan_slots = ct; - device_create_file(wf_smu_dev, &dev_attr_slots_fan); - } } if (fan_cpu_main && (fan_cpu_second || fan_cpu_third) && fan_hd && @@ -627,31 +582,23 @@ static void wf_smu_new_sensor(struct wf_ return; if (sensor_cpu_power == NULL && !strcmp(sr->name, "cpu-power")) { - if (wf_get_sensor(sr) == 0) { + if (wf_get_sensor(sr) == 0) sensor_cpu_power = sr; - device_create_file(wf_smu_dev, &dev_attr_cpu_power); - } } if (sensor_cpu_temp == NULL && !strcmp(sr->name, "cpu-temp")) { - if (wf_get_sensor(sr) == 0) { + if (wf_get_sensor(sr) == 0) sensor_cpu_temp = sr; - device_create_file(wf_smu_dev, &dev_attr_cpu_temp); - } } if (sensor_hd_temp == NULL && !strcmp(sr->name, "hd-temp")) { - if (wf_get_sensor(sr) == 0) { + if (wf_get_sensor(sr) == 0) sensor_hd_temp = sr; - device_create_file(wf_smu_dev, &dev_attr_hd_temp); - } } if (sensor_slots_power == NULL && !strcmp(sr->name, "slots-power")) { - if (wf_get_sensor(sr) == 0) { + if (wf_get_sensor(sr) == 0) sensor_slots_power = sr; - device_create_file(wf_smu_dev, &dev_attr_slots_power); - } } if (sensor_cpu_power && sensor_cpu_temp && @@ -720,40 +667,26 @@ static int wf_smu_remove(struct device * * with that except by adding locks all over... 
I'll do that * eventually but heh, who ever rmmod this module anyway ? */ - if (sensor_cpu_power) { - device_remove_file(wf_smu_dev, &dev_attr_cpu_power); + if (sensor_cpu_power) wf_put_sensor(sensor_cpu_power); - } - if (sensor_cpu_temp) { - device_remove_file(wf_smu_dev, &dev_attr_cpu_temp); + if (sensor_cpu_temp) wf_put_sensor(sensor_cpu_temp); - } - if (sensor_hd_temp) { - device_remove_file(wf_smu_dev, &dev_attr_hd_temp); + if (sensor_hd_temp) wf_put_sensor(sensor_hd_temp); - } - if (sensor_slots_power) { - device_remove_file(wf_smu_dev, &dev_attr_slots_power); + if (sensor_slots_power) wf_put_sensor(sensor_slots_power); - } /* Release all controls */ - if (fan_cpu_main) { - device_remove_file(wf_smu_dev, &dev_attr_cpu_fan); + if (fan_cpu_main) wf_put_control(fan_cpu_main); - } if (fan_cpu_second) wf_put_control(fan_cpu_second); if (fan_cpu_third) wf_put_control(fan_cpu_third); - if (fan_hd) { - device_remove_file(wf_smu_dev, &dev_attr_hd_fan); + if (fan_hd) wf_put_control(fan_hd); - } - if (fan_slots) { - device_remove_file(wf_smu_dev, &dev_attr_slots_fan); + if (fan_slots) wf_put_control(fan_slots); - } if (cpufreq_clamp) wf_put_control(cpufreq_clamp); Index: linux-work/drivers/macintosh/windfarm_pm81.c =================================================================== --- linux-work.orig/drivers/macintosh/windfarm_pm81.c 2006-01-13 16:55:07.000000000 +1100 +++ linux-work/drivers/macintosh/windfarm_pm81.c 2006-02-08 16:35:28.000000000 +1100 @@ -538,45 +538,6 @@ static void wf_smu_cpu_fans_tick(struct } } - -/* - * ****** Attributes ****** - * - */ - -#define BUILD_SHOW_FUNC_FIX(name, data) \ -static ssize_t show_##name(struct device *dev, \ - struct device_attribute *attr, \ - char *buf) \ -{ \ - ssize_t r; \ - s32 val = 0; \ - data->ops->get_value(data, &val); \ - r = sprintf(buf, "%d.%03d", FIX32TOPRINT(val)); \ - return r; \ -} \ -static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL); - - -#define BUILD_SHOW_FUNC_INT(name, data) \ -static ssize_t 
show_##name(struct device *dev, \ - struct device_attribute *attr, \ - char *buf) \ -{ \ - s32 val = 0; \ - data->ops->get_value(data, &val); \ - return sprintf(buf, "%d", val); \ -} \ -static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL); - -BUILD_SHOW_FUNC_INT(cpu_fan, fan_cpu_main); -BUILD_SHOW_FUNC_INT(sys_fan, fan_system); -BUILD_SHOW_FUNC_INT(hd_fan, fan_hd); - -BUILD_SHOW_FUNC_FIX(cpu_temp, sensor_cpu_temp); -BUILD_SHOW_FUNC_FIX(cpu_power, sensor_cpu_power); -BUILD_SHOW_FUNC_FIX(hd_temp, sensor_hd_temp); - /* * ****** Setup / Init / Misc ... ****** * @@ -654,17 +615,13 @@ static void wf_smu_new_control(struct wf return; if (fan_cpu_main == NULL && !strcmp(ct->name, "cpu-fan")) { - if (wf_get_control(ct) == 0) { + if (wf_get_control(ct) == 0) fan_cpu_main = ct; - device_create_file(wf_smu_dev, &dev_attr_cpu_fan); - } } if (fan_system == NULL && !strcmp(ct->name, "system-fan")) { - if (wf_get_control(ct) == 0) { + if (wf_get_control(ct) == 0) fan_system = ct; - device_create_file(wf_smu_dev, &dev_attr_sys_fan); - } } if (cpufreq_clamp == NULL && !strcmp(ct->name, "cpufreq-clamp")) { @@ -683,10 +640,8 @@ static void wf_smu_new_control(struct wf } if (fan_hd == NULL && !strcmp(ct->name, "drive-bay-fan")) { - if (wf_get_control(ct) == 0) { + if (wf_get_control(ct) == 0) fan_hd = ct; - device_create_file(wf_smu_dev, &dev_attr_hd_fan); - } } if (fan_system && fan_hd && fan_cpu_main && cpufreq_clamp) @@ -699,24 +654,18 @@ static void wf_smu_new_sensor(struct wf_ return; if (sensor_cpu_power == NULL && !strcmp(sr->name, "cpu-power")) { - if (wf_get_sensor(sr) == 0) { + if (wf_get_sensor(sr) == 0) sensor_cpu_power = sr; - device_create_file(wf_smu_dev, &dev_attr_cpu_power); - } } if (sensor_cpu_temp == NULL && !strcmp(sr->name, "cpu-temp")) { - if (wf_get_sensor(sr) == 0) { + if (wf_get_sensor(sr) == 0) sensor_cpu_temp = sr; - device_create_file(wf_smu_dev, &dev_attr_cpu_temp); - } } if (sensor_hd_temp == NULL && !strcmp(sr->name, "hd-temp")) { - if (wf_get_sensor(sr) 
== 0) { + if (wf_get_sensor(sr) == 0) sensor_hd_temp = sr; - device_create_file(wf_smu_dev, &dev_attr_hd_temp); - } } if (sensor_cpu_power && sensor_cpu_temp && sensor_hd_temp) @@ -794,32 +743,20 @@ static int wf_smu_remove(struct device * * with that except by adding locks all over... I'll do that * eventually but heh, who ever rmmod this module anyway ? */ - if (sensor_cpu_power) { - device_remove_file(wf_smu_dev, &dev_attr_cpu_power); + if (sensor_cpu_power) wf_put_sensor(sensor_cpu_power); - } - if (sensor_cpu_temp) { - device_remove_file(wf_smu_dev, &dev_attr_cpu_temp); + if (sensor_cpu_temp) wf_put_sensor(sensor_cpu_temp); - } - if (sensor_hd_temp) { - device_remove_file(wf_smu_dev, &dev_attr_hd_temp); + if (sensor_hd_temp) wf_put_sensor(sensor_hd_temp); - } /* Release all controls */ - if (fan_cpu_main) { - device_remove_file(wf_smu_dev, &dev_attr_cpu_fan); + if (fan_cpu_main) wf_put_control(fan_cpu_main); - } - if (fan_hd) { - device_remove_file(wf_smu_dev, &dev_attr_hd_fan); + if (fan_hd) wf_put_control(fan_hd); - } - if (fan_system) { - device_remove_file(wf_smu_dev, &dev_attr_sys_fan); + if (fan_system) wf_put_control(fan_system); - } if (cpufreq_clamp) wf_put_control(cpufreq_clamp); From torvalds at osdl.org Wed Feb 8 17:07:30 2006 From: torvalds at osdl.org (Linus Torvalds) Date: Tue, 7 Feb 2006 22:07:30 -0800 (PST) Subject: [PATCH] powerpc: Thermal control for dual core G5s In-Reply-To: <1139377372.8187.16.camel@localhost.localdomain> References: <1139377372.8187.16.camel@localhost.localdomain> Message-ID: On Wed, 8 Feb 2006, Benjamin Herrenschmidt wrote: > > This patch adds a windfarm module, windfarm_pm112, for the dual core G5s > (both 2 and 4 core models), keeping the machine from getting into > vacuum-cleaner mode ;) This seems to introduce a new warning.. 
arch/powerpc/platforms/83xx/Kconfig:10: warning: 'select' used by config symbol 'MPC834x_SYS' refer to undefined symbol 'DEFAULT_UIMAGE' drivers/macintosh/Kconfig:193: warning: 'select' used by config symbol 'WINDFARM_PM112' refer to undefined symbol 'I2C_PMAC_SMU' Hmm? Linus From benh at kernel.crashing.org Wed Feb 8 17:38:51 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 08 Feb 2006 17:38:51 +1100 Subject: [PATCH] powerpc: Thermal control for dual core G5s In-Reply-To: References: <1139377372.8187.16.camel@localhost.localdomain> Message-ID: <1139380731.5003.1.camel@localhost.localdomain> On Tue, 2006-02-07 at 22:07 -0800, Linus Torvalds wrote: > > On Wed, 8 Feb 2006, Benjamin Herrenschmidt wrote: > > > > This patch adds a windfarm module, windfarm_pm112, for the dual core G5s > > (both 2 and 4 core models), keeping the machine from getting into > > vacuum-cleaner mode ;) > > This seems to introduce a new warning.. > > arch/powerpc/platforms/83xx/Kconfig:10: > warning: 'select' used by config symbol 'MPC834x_SYS' refer to undefined symbol 'DEFAULT_UIMAGE' Former is not mine... > drivers/macintosh/Kconfig:193: > warning: 'select' used by config symbol 'WINDFARM_PM112' refer to undefined symbol 'I2C_PMAC_SMU' Ok, looks like I forgot to update the Kconfig for the new i2c driver, it should select I2C_POWERMAC instead. Do you want a new patch or can you just fix it there ? Ben. 
From galak at kernel.crashing.org Thu Feb 9 01:52:33 2006 From: galak at kernel.crashing.org (Kumar Gala) Date: Wed, 8 Feb 2006 08:52:33 -0600 Subject: [PATCH] powerpc: Thermal control for dual core G5s In-Reply-To: References: <1139377372.8187.16.camel@localhost.localdomain> Message-ID: <87E547D2-A8FC-4A89-89A4-60313C9647B0@kernel.crashing.org> On Feb 8, 2006, at 12:07 AM, Linus Torvalds wrote: > > > On Wed, 8 Feb 2006, Benjamin Herrenschmidt wrote: >> >> This patch adds a windfarm module, windfarm_pm112, for the dual >> core G5s >> (both 2 and 4 core models), keeping the machine from getting into >> vacuum-cleaner mode ;) > > This seems to introduce a new warning.. > > arch/powerpc/platforms/83xx/Kconfig:10: > warning: 'select' used by config symbol 'MPC834x_SYS' refer to > undefined symbol 'DEFAULT_UIMAGE' That's my fault. Paul did push a simple build system update. I'll ask him to do so. http://ozlabs.org/pipermail/linuxppc-dev/2006-January/020980.html - kumar From dwmw2 at infradead.org Thu Feb 9 05:01:00 2006 From: dwmw2 at infradead.org (David Woodhouse) Date: Wed, 08 Feb 2006 18:01:00 +0000 Subject: PPC64 boot failure with 2.6.15 In-Reply-To: <1138660828.12601.21.camel@localhost.localdomain> References: <200601251821.47557.pat@computer-refuge.org> <200601252307.45741.pat@computer-refuge.org> <200601260051.00902.pat@computer-refuge.org> <200601262025.44777.pat@computer-refuge.org> <1138660828.12601.21.camel@localhost.localdomain> Message-ID: <1139421660.4183.15.camel@pmac.infradead.org> On Tue, 2006-01-31 at 09:40 +1100, Benjamin Herrenschmidt wrote: > Interesting... best would be to try to bisect to find out what > specific patch broke it but I understand that's not easy with those > old kernels that were maintained with bitkeeper... Maybe you could try > to spot which daily bk snapshot broke it if they are still available > somewhere ? What's wrong with just using 'git bisect' on the BK->git converted tree? 
-- dwmw2 From linas at austin.ibm.com Thu Feb 9 05:29:13 2006 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 8 Feb 2006 12:29:13 -0600 Subject: [PATCH]: Documentation: Updated PCI Error Recovery In-Reply-To: <20060207231927.GB19648@kroah.com> References: <20060203000602.GQ24916@austin.ibm.com> <20060207222144.GA15622@kroah.com> <20060207143052.19978ca7.akpm@osdl.org> <20060207223956.GA19009@kroah.com> <20060207145347.72c0a77e.akpm@osdl.org> <20060207231927.GB19648@kroah.com> Message-ID: <20060208182913.GQ24916@austin.ibm.com> On Tue, Feb 07, 2006 at 03:19:27PM -0800, Greg KH was heard to remark: > > It could be all the newly-added trailing whitespace I chopped off. > > Yup, that was it, quilt would have stripped them off for me too. Linas, > please don't do this anymore... Sorry; I'm usually good about that in code, but the Pavlovian reaction didn't trip on docs. --linas From geoffrey.levand at am.sony.com Thu Feb 9 14:08:48 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Wed, 08 Feb 2006 19:08:48 -0800 Subject: [PATCH] fix prom_init undefined error In-Reply-To: <1139371162.8187.3.camel@localhost.localdomain> References: <1139371162.8187.3.camel@localhost.localdomain> Message-ID: <43EAB240.8030107@am.sony.com> Benjamin Herrenschmidt wrote: > On Tue, 2006-02-07 at 15:10 -0800, Geoff Levand wrote: > >>Paul, >> >>This patch fixes a build error when CONFIG_PPC_OF=n, >>CONFIG_PPC_MULTIPLATFORM=y. It makes the conditionals >>consistent in arch/powerpc/kernel/Makefile and head_64.S >>to both be on CONFIG_PPC_OF. >> >> arch/powerpc/kernel/head_64.o: In function `.__boot_from_prom': >> linux/arch/powerpc/kernel/head_64.S:(.text+0x8158): undefined > > reference to `.prom_init' > >>obj-$(CONFIG_PPC_OF) += prom_init.o > > > With ARCH=powerpc, CONFIG_PPC_OF should always be set. It's supposed to > be set when the device-tree accessors exist which they always do. OK, that makes things clear. 
> Besides, I'll be removing support for !MULTIPLATFORM too :) (Except for > iSeries at least for a little while). Look at the patch I posted that > removes _machine for an idea of where things are going. I'll try and rework things when you get rid of MULTIPLATFORM. When can we expect it? -Geoff From benh at kernel.crashing.org Thu Feb 9 14:44:45 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 09 Feb 2006 14:44:45 +1100 Subject: [PATCH] fix prom_init undefined error In-Reply-To: <43EAB240.8030107@am.sony.com> References: <1139371162.8187.3.camel@localhost.localdomain> <43EAB240.8030107@am.sony.com> Message-ID: <1139456685.5003.33.camel@localhost.localdomain> > > Besides, I'll be removing support for !MULTIPLATFORM too :) (Except for > > iSeries at least for a little while). Look at the patch I posted that > > removes _machine for an idea of where things are going. > > I'll try and rework things when you get rid of MULTIPLATFORM. When can > we expect it? Well, I posted a first patch that removes _machine and makes the platform "probe" closer between 32 and 64 bits already a few weeks ago before I went on vacation... I'll revive that patch today or tomorrow and have it merged in powerpc.git. At which point, I'll tackle in no specific order, removing of pre-parsed interrupt stuff in device_node (and implement a proper generic OF interrupt tree parser), making the early init code between 32 and 64 bits even more similar (see the discussion about boot code I posted a month or two ago), etc... I'm sorry I can't promise any timeframe at this point though due to personal constraints (just had a baby). Ben. From bdc at carlstrom.com Thu Feb 9 17:02:37 2006 From: bdc at carlstrom.com (Brian D.
Carlstrom) Date: Wed, 8 Feb 2006 22:02:37 -0800 Subject: G5 fan problems return moving to 2.6.15 with dual processor 2.7GHz machine In-Reply-To: <1139371016.8187.1.camel@localhost.localdomain> References: <20060205061048.7261.qmail@electricrain.com> <1139130385.5634.14.camel@localhost.localdomain> <17384.62553.442011.514155@zot.electricrain.com> <1139371016.8187.1.camel@localhost.localdomain> Message-ID: <17386.56061.78892.44180@zot.electricrain.com> Benjamin Herrenschmidt writes: > prom_printf should work ... try booting manually (from the OF command > line) and maybe comment out the code that opens the displays... (it > may be clearing the screen).... I tried commenting out prom_check_displays and that does prevent the clearing of the screen, but still no visible prom_printf output. The last output seems to be from yaboot; it's certainly not one of the prom_print messages from prom_init.c. For good measure, I also tried adding "video=ofonly". Still no prom_printf output visible. However, when I rebooted back to my 2.6.14 kernel, I saw the usual prom_printf messages from prom_init without any changes. I reviewed the prom_init.c diffs between 2.6.14 and 2.6.15 but they are large enough that it's not easy to spot an obvious problem. In any case, I didn't have much time to really look at this today, just enough to try disabling prom_check_displays; I'll have to look more on Friday. -bri From michael at ellerman.id.au Thu Feb 9 17:03:27 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 09 Feb 2006 17:03:27 +1100 Subject: [PATCH 1/3] powerpc: Clean up pSeries firmware feature initialisation Message-ID: <1139465007.297357.792110844862.qpush@concordia> Clean up fw_feature_init in platforms/pseries/setup.c. Clean up white space and replace the while loop with a for loop - which seems clearer to me.
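For readers following along outside the kernel tree, here is a minimal userspace sketch of the for-loop form this description refers to: walking a property made of NUL-terminated strings packed back to back and OR-ing matching entries into a bitmask. It is an illustration only, not the patch itself. The names feature_table, parse_features and the FEAT_* bit values are made up for the example, and it assumes the buffer's last byte is a NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only; not the kernel's values. */
#define FEAT_PFT    (1UL << 0)
#define FEAT_TCE    (1UL << 1)
#define FEAT_SPLPAR (1UL << 2)

struct feature {
	unsigned long val;
	const char *name;
};

/* Hypothetical lookup table, shaped like a name-to-bit mapping. */
static const struct feature feature_table[] = {
	{ FEAT_PFT,    "hcall-pft" },
	{ FEAT_TCE,    "hcall-tce" },
	{ FEAT_SPLPAR, "hcall-splpar" },
};

#define N_FEATURES (sizeof(feature_table) / sizeof(feature_table[0]))

/*
 * Walk a buffer of NUL-terminated strings packed back to back and
 * return the OR of the bits for every string found in the table.
 * Assumption: the buffer ends with a NUL, otherwise strlen() would
 * run past the end.
 */
static unsigned long parse_features(const char *prop, int len)
{
	unsigned long mask = 0;
	const char *s;
	size_t i;

	assert(len == 0 || prop[len - 1] == '\0');

	/* Each step skips one string plus its terminating NUL. */
	for (s = prop; s < prop + len; s += strlen(s) + 1) {
		for (i = 0; i < N_FEATURES; i++) {
			if (strcmp(feature_table[i].name, s))
				continue;

			/* we have a match */
			mask |= feature_table[i].val;
			break;
		}
	}
	return mask;
}
```

Calling parse_features() on a buffer holding "hcall-pft", "hcall-unknown" and "hcall-splpar" should set only the first and third bits; the unknown name is skipped without error. Keeping the cursor in a separate pointer, rather than advancing the property pointer and decrementing the length, leaves the original pointer and length untouched.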
Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/setup.c | 45 +++++++++++++++------------------ 1 files changed, 21 insertions(+), 24 deletions(-) Index: to-merge/arch/powerpc/platforms/pseries/setup.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/setup.c +++ to-merge/arch/powerpc/platforms/pseries/setup.c @@ -263,48 +263,45 @@ static int __init pSeries_init_panel(voi arch_initcall(pSeries_init_panel); -/* Build up the ppc64_firmware_features bitmask field - * using contents of device-tree/ibm,hypertas-functions. - * Ultimately this functionality may be moved into prom.c prom_init(). +/* Build up the firmware features bitmask using the contents of + * device-tree/ibm,hypertas-functions. Ultimately this functionality may + * be moved into prom.c prom_init(). */ static void __init fw_feature_init(void) { - struct device_node * dn; - char * hypertas; - unsigned int len; + struct device_node *dn; + char *hypertas, *s; + int len, i; DBG(" -> fw_feature_init()\n"); - ppc64_firmware_features = 0; dn = of_find_node_by_path("/rtas"); if (dn == NULL) { - printk(KERN_ERR "WARNING ! Cannot find RTAS in device-tree !\n"); + printk(KERN_ERR "WARNING! 
Cannot find RTAS in device-tree!\n"); goto no_rtas; } hypertas = get_property(dn, "ibm,hypertas-functions", &len); - if (hypertas) { - while (len > 0){ - int i, hypertas_len; + if (hypertas == NULL) + goto no_hypertas; + + for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) { + for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) { /* check value against table of strings */ - for(i=0; i < FIRMWARE_MAX_FEATURES ;i++) { - if ((firmware_features_table[i].name) && - (strcmp(firmware_features_table[i].name,hypertas))==0) { - /* we have a match */ - ppc64_firmware_features |= - (firmware_features_table[i].val); - break; - } - } - hypertas_len = strlen(hypertas); - len -= hypertas_len +1; - hypertas+= hypertas_len +1; + if (!firmware_features_table[i].name || + strcmp(firmware_features_table[i].name, s)) + continue; + + /* we have a match */ + ppc64_firmware_features |= + firmware_features_table[i].val; + break; } } +no_hypertas: of_node_put(dn); no_rtas: - DBG(" <- fw_feature_init()\n"); } From michael at ellerman.id.au Thu Feb 9 17:03:33 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 09 Feb 2006 17:03:33 +1100 Subject: [PATCH 2/3] powerpc: Move pSeries firmware feature setup into platforms/pseries In-Reply-To: <1139465007.297357.792110844862.qpush@concordia> Message-ID: <20060209060356.5606C679F6@ozlabs.org> Currently we have some stuff in firmware.h and kernel/firmware.c that is #ifdef CONFIG_PPC_PSERIES. Move it all into platforms/pseries. 
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/firmware.c | 25 ------- arch/powerpc/platforms/pseries/Makefile | 3 arch/powerpc/platforms/pseries/firmware.c | 104 ++++++++++++++++++++++++++++++ arch/powerpc/platforms/pseries/firmware.h | 17 ++++ arch/powerpc/platforms/pseries/setup.c | 46 ------------- include/asm-powerpc/firmware.h | 9 -- 6 files changed, 124 insertions(+), 80 deletions(-) Index: to-merge/arch/powerpc/kernel/firmware.c =================================================================== --- to-merge.orig/arch/powerpc/kernel/firmware.c +++ to-merge/arch/powerpc/kernel/firmware.c @@ -18,28 +18,3 @@ #include unsigned long ppc64_firmware_features; - -#ifdef CONFIG_PPC_PSERIES -firmware_feature_t firmware_features_table[FIRMWARE_MAX_FEATURES] = { - {FW_FEATURE_PFT, "hcall-pft"}, - {FW_FEATURE_TCE, "hcall-tce"}, - {FW_FEATURE_SPRG0, "hcall-sprg0"}, - {FW_FEATURE_DABR, "hcall-dabr"}, - {FW_FEATURE_COPY, "hcall-copy"}, - {FW_FEATURE_ASR, "hcall-asr"}, - {FW_FEATURE_DEBUG, "hcall-debug"}, - {FW_FEATURE_PERF, "hcall-perf"}, - {FW_FEATURE_DUMP, "hcall-dump"}, - {FW_FEATURE_INTERRUPT, "hcall-interrupt"}, - {FW_FEATURE_MIGRATE, "hcall-migrate"}, - {FW_FEATURE_PERFMON, "hcall-perfmon"}, - {FW_FEATURE_CRQ, "hcall-crq"}, - {FW_FEATURE_VIO, "hcall-vio"}, - {FW_FEATURE_RDMA, "hcall-rdma"}, - {FW_FEATURE_LLAN, "hcall-lLAN"}, - {FW_FEATURE_BULK, "hcall-bulk"}, - {FW_FEATURE_XDABR, "hcall-xdabr"}, - {FW_FEATURE_MULTITCE, "hcall-multi-tce"}, - {FW_FEATURE_SPLPAR, "hcall-splpar"}, -}; -#endif Index: to-merge/arch/powerpc/platforms/pseries/Makefile =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/Makefile +++ to-merge/arch/powerpc/platforms/pseries/Makefile @@ -1,5 +1,6 @@ obj-y := pci.o lpar.o hvCall.o nvram.o reconfig.o \ - setup.o iommu.o ras.o rtasd.o pci_dlpar.o + setup.o iommu.o ras.o rtasd.o pci_dlpar.o \ + firmware.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_IBMVIO) += vio.o 
obj-$(CONFIG_XICS) += xics.o Index: to-merge/arch/powerpc/platforms/pseries/firmware.c =================================================================== --- /dev/null +++ to-merge/arch/powerpc/platforms/pseries/firmware.c @@ -0,0 +1,104 @@ +/* + * pSeries firmware setup code. + * + * Portions from arch/powerpc/platforms/pseries/setup.c: + * Copyright (C) 1995 Linus Torvalds + * Adapted from 'alpha' version by Gary Thomas + * Modified by Cort Dougan (cort at cs.nmt.edu) + * Modified by PPC64 Team, IBM Corp + * + * Portions from arch/powerpc/kernel/firmware.c + * Copyright (C) 2001 Ben. Herrenschmidt (benh at kernel.crashing.org) + * Modifications for ppc64: + * Copyright (C) 2003 Dave Engebretsen + * Copyright (C) 2005 Stephen Rothwell, IBM Corporation + * + * Copyright 2006 IBM Corporation. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#undef DEBUG + +#include +#include + +#ifdef DEBUG +#define DBG(fmt...) udbg_printf(fmt) +#else +#define DBG(fmt...) 
+#endif + +typedef struct { + unsigned long val; + char * name; +} firmware_feature_t; + +static __initdata firmware_feature_t +firmware_features_table[FIRMWARE_MAX_FEATURES] = { + {FW_FEATURE_PFT, "hcall-pft"}, + {FW_FEATURE_TCE, "hcall-tce"}, + {FW_FEATURE_SPRG0, "hcall-sprg0"}, + {FW_FEATURE_DABR, "hcall-dabr"}, + {FW_FEATURE_COPY, "hcall-copy"}, + {FW_FEATURE_ASR, "hcall-asr"}, + {FW_FEATURE_DEBUG, "hcall-debug"}, + {FW_FEATURE_PERF, "hcall-perf"}, + {FW_FEATURE_DUMP, "hcall-dump"}, + {FW_FEATURE_INTERRUPT, "hcall-interrupt"}, + {FW_FEATURE_MIGRATE, "hcall-migrate"}, + {FW_FEATURE_PERFMON, "hcall-perfmon"}, + {FW_FEATURE_CRQ, "hcall-crq"}, + {FW_FEATURE_VIO, "hcall-vio"}, + {FW_FEATURE_RDMA, "hcall-rdma"}, + {FW_FEATURE_LLAN, "hcall-lLAN"}, + {FW_FEATURE_BULK, "hcall-bulk"}, + {FW_FEATURE_XDABR, "hcall-xdabr"}, + {FW_FEATURE_MULTITCE, "hcall-multi-tce"}, + {FW_FEATURE_SPLPAR, "hcall-splpar"}, +}; + +/* Build up the firmware features bitmask using the contents of + * device-tree/ibm,hypertas-functions. Ultimately this functionality may + * be moved into prom.c prom_init(). + */ +void __init fw_feature_init(void) +{ + struct device_node *dn; + char *hypertas, *s; + int len, i; + + DBG(" -> fw_feature_init()\n"); + + dn = of_find_node_by_path("/rtas"); + if (dn == NULL) { + printk(KERN_ERR "WARNING! 
Cannot find RTAS in device-tree!\n"); + goto no_rtas; + } + + hypertas = get_property(dn, "ibm,hypertas-functions", &len); + if (hypertas == NULL) + goto no_hypertas; + + for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) { + for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) { + /* check value against table of strings */ + if (!firmware_features_table[i].name || + strcmp(firmware_features_table[i].name, s)) + continue; + + /* we have a match */ + ppc64_firmware_features |= + firmware_features_table[i].val; + break; + } + } + +no_hypertas: + of_node_put(dn); +no_rtas: + DBG(" <- fw_feature_init()\n"); +} Index: to-merge/arch/powerpc/platforms/pseries/setup.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/setup.c +++ to-merge/arch/powerpc/platforms/pseries/setup.c @@ -60,7 +60,6 @@ #include #include #include "xics.h" -#include #include #include #include @@ -70,6 +69,7 @@ #include "plpar_wrappers.h" #include "ras.h" +#include "firmware.h" #ifdef DEBUG #define DBG(fmt...) udbg_printf(fmt) @@ -262,50 +262,6 @@ static int __init pSeries_init_panel(voi } arch_initcall(pSeries_init_panel); - -/* Build up the firmware features bitmask using the contents of - * device-tree/ibm,hypertas-functions. Ultimately this functionality may - * be moved into prom.c prom_init(). - */ -static void __init fw_feature_init(void) -{ - struct device_node *dn; - char *hypertas, *s; - int len, i; - - DBG(" -> fw_feature_init()\n"); - - dn = of_find_node_by_path("/rtas"); - if (dn == NULL) { - printk(KERN_ERR "WARNING! 
Cannot find RTAS in device-tree!\n"); - goto no_rtas; - } - - hypertas = get_property(dn, "ibm,hypertas-functions", &len); - if (hypertas == NULL) - goto no_hypertas; - - for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) { - for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) { - /* check value against table of strings */ - if (!firmware_features_table[i].name || - strcmp(firmware_features_table[i].name, s)) - continue; - - /* we have a match */ - ppc64_firmware_features |= - firmware_features_table[i].val; - break; - } - } - -no_hypertas: - of_node_put(dn); -no_rtas: - DBG(" <- fw_feature_init()\n"); -} - - static void __init pSeries_discover_pic(void) { struct device_node *np; Index: to-merge/include/asm-powerpc/firmware.h =================================================================== --- to-merge.orig/include/asm-powerpc/firmware.h +++ to-merge/include/asm-powerpc/firmware.h @@ -89,15 +89,6 @@ static inline unsigned long firmware_has (FW_FEATURE_POSSIBLE & ppc64_firmware_features & feature); } -#ifdef CONFIG_PPC_PSERIES -typedef struct { - unsigned long val; - char * name; -} firmware_feature_t; - -extern firmware_feature_t firmware_features_table[]; -#endif - extern void system_reset_fwnmi(void); extern void machine_check_fwnmi(void); Index: to-merge/arch/powerpc/platforms/pseries/firmware.h =================================================================== --- /dev/null +++ to-merge/arch/powerpc/platforms/pseries/firmware.h @@ -0,0 +1,17 @@ +/* + * Copyright 2006 IBM Corporation. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#ifndef _PSERIES_FIRMWARE_H +#define _PSERIES_FIRMWARE_H + +#include + +extern void __init fw_feature_init(void); + +#endif /* _PSERIES_FIRMWARE_H */ From michael at ellerman.id.au Thu Feb 9 17:03:35 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 09 Feb 2006 17:03:35 +1100 Subject: [PATCH 3/3] powerpc: Replace platform_is_lpar() with a firmware feature In-Reply-To: <1139465007.297357.792110844862.qpush@concordia> Message-ID: <20060209060359.2F789679F7@ozlabs.org> It has been decreed that platform numbers are evil, so as a step in that direction, replace platform_is_lpar() with a FW_FEATURE_LPAR bit. Signed-off-by: Michael Ellerman --- arch/powerpc/mm/hash_utils_64.c | 4 ++-- arch/powerpc/oprofile/op_model_power4.c | 3 ++- arch/powerpc/platforms/iseries/setup.c | 10 +++++++--- arch/powerpc/platforms/pseries/iommu.c | 2 +- arch/powerpc/platforms/pseries/setup.c | 11 +++++++---- arch/powerpc/platforms/pseries/smp.c | 2 +- arch/powerpc/platforms/pseries/xics.c | 3 ++- include/asm-powerpc/firmware.h | 7 ++++--- include/asm-powerpc/processor.h | 1 - 9 files changed, 26 insertions(+), 17 deletions(-) Index: to-merge/include/asm-powerpc/firmware.h =================================================================== --- to-merge.orig/include/asm-powerpc/firmware.h +++ to-merge/include/asm-powerpc/firmware.h @@ -41,6 +41,7 @@ #define FW_FEATURE_MULTITCE (1UL<<19) #define FW_FEATURE_SPLPAR (1UL<<20) #define FW_FEATURE_ISERIES (1UL<<21) +#define FW_FEATURE_LPAR (1UL<<22) enum { #ifdef CONFIG_PPC64 @@ -51,10 +52,10 @@ enum { FW_FEATURE_MIGRATE | FW_FEATURE_PERFMON | FW_FEATURE_CRQ | FW_FEATURE_VIO | FW_FEATURE_RDMA | FW_FEATURE_LLAN | FW_FEATURE_BULK | FW_FEATURE_XDABR | FW_FEATURE_MULTITCE | - FW_FEATURE_SPLPAR, + FW_FEATURE_SPLPAR | FW_FEATURE_LPAR, FW_FEATURE_PSERIES_ALWAYS = 0, - FW_FEATURE_ISERIES_POSSIBLE = FW_FEATURE_ISERIES, - FW_FEATURE_ISERIES_ALWAYS = FW_FEATURE_ISERIES, + FW_FEATURE_ISERIES_POSSIBLE = FW_FEATURE_ISERIES | FW_FEATURE_LPAR, 
+ FW_FEATURE_ISERIES_ALWAYS = FW_FEATURE_ISERIES | FW_FEATURE_LPAR, FW_FEATURE_POSSIBLE = #ifdef CONFIG_PPC_PSERIES FW_FEATURE_PSERIES_POSSIBLE | Index: to-merge/arch/powerpc/mm/hash_utils_64.c =================================================================== --- to-merge.orig/arch/powerpc/mm/hash_utils_64.c +++ to-merge/arch/powerpc/mm/hash_utils_64.c @@ -421,7 +421,7 @@ void __init htab_initialize(void) htab_hash_mask = pteg_count - 1; - if (platform_is_lpar()) { + if (firmware_has_feature(FW_FEATURE_LPAR)) { /* Using a hypervisor which owns the htab */ htab_address = NULL; _SDR1 = 0; @@ -515,7 +515,7 @@ void __init htab_initialize(void) void htab_initialize_secondary(void) { - if (!platform_is_lpar()) + if (!firmware_has_feature(FW_FEATURE_LPAR)) mtspr(SPRN_SDR1, _SDR1); } Index: to-merge/arch/powerpc/oprofile/op_model_power4.c =================================================================== --- to-merge.orig/arch/powerpc/oprofile/op_model_power4.c +++ to-merge/arch/powerpc/oprofile/op_model_power4.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include #include @@ -232,7 +233,7 @@ static unsigned long get_pc(struct pt_re mmcra = mfspr(SPRN_MMCRA); /* Were we in the hypervisor? 
*/ - if (platform_is_lpar() && (mmcra & MMCRA_SIHV)) + if (firmware_has_feature(FW_FEATURE_LPAR) && (mmcra & MMCRA_SIHV)) /* function descriptor madness */ return *((unsigned long *)hypervisor_bucket); Index: to-merge/arch/powerpc/platforms/pseries/iommu.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/iommu.c +++ to-merge/arch/powerpc/platforms/pseries/iommu.c @@ -582,7 +582,7 @@ void iommu_init_early_pSeries(void) return; } - if (platform_is_lpar()) { + if (firmware_has_feature(FW_FEATURE_LPAR)) { if (firmware_has_feature(FW_FEATURE_MULTITCE)) { ppc_md.tce_build = tce_buildmulti_pSeriesLP; ppc_md.tce_free = tce_freemulti_pSeriesLP; Index: to-merge/arch/powerpc/platforms/pseries/setup.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/setup.c +++ to-merge/arch/powerpc/platforms/pseries/setup.c @@ -246,7 +246,7 @@ static void __init pSeries_setup_arch(vo ppc_md.idle_loop = default_idle; } - if (platform_is_lpar()) + if (firmware_has_feature(FW_FEATURE_LPAR)) ppc_md.enable_pmcs = pseries_lpar_enable_pmcs; else ppc_md.enable_pmcs = power4_enable_pmcs; @@ -326,7 +326,7 @@ static void __init pSeries_init_early(vo fw_feature_init(); - if (platform_is_lpar()) + if (firmware_has_feature(FW_FEATURE_LPAR)) hpte_init_lpar(); else { hpte_init_native(); @@ -334,7 +334,7 @@ static void __init pSeries_init_early(vo get_property(of_chosen, "linux,iommu-off", NULL)); } - if (platform_is_lpar()) + if (firmware_has_feature(FW_FEATURE_LPAR)) find_udbg_vterm(); if (firmware_has_feature(FW_FEATURE_DABR)) @@ -390,6 +390,9 @@ static int __init pSeries_probe(int plat * it here ... 
*/ + if (platform == PLATFORM_PSERIES_LPAR) + ppc64_firmware_features |= FW_FEATURE_LPAR; + return 1; } @@ -529,7 +532,7 @@ static void pseries_shared_idle(void) static int pSeries_pci_probe_mode(struct pci_bus *bus) { - if (platform_is_lpar()) + if (firmware_has_feature(FW_FEATURE_LPAR)) return PCI_PROBE_DEVTREE; return PCI_PROBE_NORMAL; } Index: to-merge/arch/powerpc/platforms/pseries/smp.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/smp.c +++ to-merge/arch/powerpc/platforms/pseries/smp.c @@ -443,7 +443,7 @@ void __init smp_init_pSeries(void) smp_ops->cpu_die = pSeries_cpu_die; /* Processors can be added/removed only on LPAR */ - if (platform_is_lpar()) + if (firmware_has_feature(FW_FEATURE_LPAR)) pSeries_reconfig_notifier_register(&pSeries_smp_nb); #endif Index: to-merge/arch/powerpc/platforms/pseries/xics.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/xics.c +++ to-merge/arch/powerpc/platforms/pseries/xics.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -536,7 +537,7 @@ nextnode: of_node_put(np); } - if (platform_is_lpar()) + if (firmware_has_feature(FW_FEATURE_LPAR)) ops = &pSeriesLP_ops; else { #ifdef CONFIG_SMP Index: to-merge/arch/powerpc/platforms/iseries/setup.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/iseries/setup.c +++ to-merge/arch/powerpc/platforms/iseries/setup.c @@ -303,8 +303,6 @@ static void __init iSeries_init_early(vo { DBG(" -> iSeries_init_early()\n"); - ppc64_firmware_features = FW_FEATURE_ISERIES; - ppc64_interrupt_controller = IC_ISERIES; #if defined(CONFIG_BLK_DEV_INITRD) @@ -710,7 +708,13 @@ void __init iSeries_init_IRQ(void) { } static int __init iseries_probe(int platform) { - return PLATFORM_ISERIES_LPAR == platform; + if (PLATFORM_ISERIES_LPAR != platform) + return 0; + + 
ppc64_firmware_features |= FW_FEATURE_ISERIES; + ppc64_firmware_features |= FW_FEATURE_LPAR; + + return 1; } struct machdep_calls __initdata iseries_md = { Index: to-merge/include/asm-powerpc/processor.h =================================================================== --- to-merge.orig/include/asm-powerpc/processor.h +++ to-merge/include/asm-powerpc/processor.h @@ -52,7 +52,6 @@ #ifdef __KERNEL__ #define platform_is_pseries() (_machine == PLATFORM_PSERIES || \ _machine == PLATFORM_PSERIES_LPAR) -#define platform_is_lpar() (!!(_machine & PLATFORM_LPAR)) #if defined(CONFIG_PPC_MULTIPLATFORM) extern int _machine; From ntl at pobox.com Thu Feb 9 18:23:32 2006 From: ntl at pobox.com (Nathan Lynch) Date: Thu, 9 Feb 2006 01:23:32 -0600 Subject: [PATCH 1/3] powerpc: Clean up pSeries firmware feature initialisation In-Reply-To: <1139465007.297357.792110844862.qpush@concordia> References: <1139465007.297357.792110844862.qpush@concordia> Message-ID: <20060209072331.GK18730@localhost.localdomain> Michael Ellerman wrote: ... > DBG(" -> fw_feature_init()\n"); > > - ppc64_firmware_features = 0; > dn = of_find_node_by_path("/rtas"); > if (dn == NULL) { > - printk(KERN_ERR "WARNING ! Cannot find RTAS in device-tree !\n"); > + printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n"); > goto no_rtas; > } > > hypertas = get_property(dn, "ibm,hypertas-functions", &len); > - if (hypertas) { > - while (len > 0){ > - int i, hypertas_len; > + if (hypertas == NULL) > + goto no_hypertas; ... > +no_hypertas: > of_node_put(dn); > no_rtas: > - > DBG(" <- fw_feature_init()\n"); > } of_node_put can handle a null pointer fine, so you could get away with just one label here. 
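For readers following the review comment above, here is a minimal user-space sketch of the single-label cleanup it suggests. It is not the kernel code: `struct node`, `node_get()`, `node_put()` and `count_entries()` are hypothetical stand-ins invented for this illustration. The only behaviour borrowed from the thread is that the release function treats a NULL pointer as a no-op, which is what allows both error paths to share one label. The loop walks a packed list of NUL-terminated strings the same way the patch walks "ibm,hypertas-functions".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node; not the kernel type. */
struct node {
	int refcount;
	const char *prop;	/* packed NUL-terminated strings, or NULL */
	int prop_len;		/* total length in bytes, including NULs */
};

/* Counts non-NULL releases; exists only to make the behaviour visible. */
static int put_calls;

/* Hypothetical lookup: takes a reference and returns the node, or NULL. */
static struct node *node_get(struct node *n)
{
	if (n != NULL)
		n->refcount++;
	return n;
}

/* Like the of_node_put() described in the thread: NULL is a no-op. */
static void node_put(struct node *n)
{
	if (n == NULL)
		return;
	n->refcount--;
	put_calls++;
}

/*
 * Returns the number of strings in the property, or -1 if the node or
 * the property is missing. Both failure paths jump to the same label
 * because node_put(NULL) is harmless.
 */
static int count_entries(struct node *candidate)
{
	struct node *dn;
	const char *s;
	int count = -1;

	dn = node_get(candidate);
	if (dn == NULL)
		goto out;

	if (dn->prop == NULL)
		goto out;

	count = 0;
	for (s = dn->prop; s < dn->prop + dn->prop_len; s += strlen(s) + 1)
		count++;

out:
	node_put(dn);
	return count;
}
```

The trade-off is small: one label instead of two, at the cost of relying on the release function's NULL handling rather than making the skipped put explicit.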
From cfriesen at nortel.com Fri Feb 10 03:03:22 2006 From: cfriesen at nortel.com (Christopher Friesen) Date: Thu, 09 Feb 2006 10:03:22 -0600 Subject: question on ptep_clear_flush_dirty() for ppc64 Message-ID: <43EB67CA.7000207@nortel.com> I notice that (at least for 2.6.10) ptep_clear_flush_dirty() for ppc64 simply does ptep_test_and_clear_dirty(), then calls flush_tlb_pending(). I want to call ptep_clear_flush_dirty() for a large number of pages (tens of thousands) in an optimal manner--would it be legal for me to call ptep_test_and_clear_dirty() for each page, then call flush_tlb_pending() once at the end? Are there any implications for SMP machines? The reason I ask is that in a small experiment I did this increased the speed of a certain task by a factor of about 25%, which is significant in our application. I just wanted to make sure it was safe. Thanks, Chris From benh at kernel.crashing.org Fri Feb 10 10:04:40 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 10 Feb 2006 10:04:40 +1100 Subject: G5 fan problems return moving to 2.6.15 with dual processor 2.7GHz machine In-Reply-To: <17386.56061.78892.44180@zot.electricrain.com> References: <20060205061048.7261.qmail@electricrain.com> <1139130385.5634.14.camel@localhost.localdomain> <17384.62553.442011.514155@zot.electricrain.com> <1139371016.8187.1.camel@localhost.localdomain> <17386.56061.78892.44180@zot.electricrain.com> Message-ID: <1139526280.5003.40.camel@localhost.localdomain> On Wed, 2006-02-08 at 22:02 -0800, Brian D. Carlstrom wrote: > Benjamin Herrenschmidt writes: > > prom_printf should work ... try booting manually (from the OF command > > line) and maybe comment out the code that opens the displays... (it > > may be clearing the screen).... > > I tried commenting out prom_check_displays and that does prevent the > clearing of the screen, but still no visible prom_printf output.
The > last output seems to be from yaboot, it's certainly not one of the > prom_print messages from prom_init.c. For good measure, I also tried > adding "video=ofonly". Still no prom_printf output visible. > > However, when I rebooted back to my 2.6.14 kernel, I saw the usual > prom_printf messages from prom_init without any changes. I reviewed the > prom_init.c diffs between 2.6.14 and 2.6.15 but they are large enough > that it's not easy to spot an obvious problem. > > In any case, I didn't have much time to really look at this today, just > enough to try disabling prom_check_displays, I'll have to look more > Friday. That is strange... Ben. From olof at lixom.net Fri Feb 10 10:25:12 2006 From: olof at lixom.net (Olof Johansson) Date: Thu, 9 Feb 2006 17:25:12 -0600 Subject: [PATCH 2/3] powerpc: Move pSeries firmware feature setup into platforms/pseries In-Reply-To: <20060209060356.5606C679F6@ozlabs.org> References: <1139465007.297357.792110844862.qpush@concordia> <20060209060356.5606C679F6@ozlabs.org> Message-ID: <20060209232511.GN4833@pb15.lixom.net> On Thu, Feb 09, 2006 at 05:03:33PM +1100, Michael Ellerman wrote: > Currently we have some stuff in firmware.h and kernel/firmware.c that is > #ifdef CONFIG_PPC_PSERIES. Move it all into platforms/pseries. I suggest renaming it to something like fw_set_hv_features() while you're at it, since all features it parses and sets are hypervisor-related. There are other, not yet fully merged hypervisor guest ports that might want to share this code (Xen, rhype) for non-pseries machines, so there's a chance that a move into platforms will just need to be undone down the road. However, since that code isn't merged yet, let's worry about that then instead of now.
-Olof From michael at ellerman.id.au Fri Feb 10 10:48:46 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 10 Feb 2006 10:48:46 +1100 Subject: [PATCH 1/3] powerpc: Clean up pSeries firmware feature initialisation In-Reply-To: <20060209072331.GK18730@localhost.localdomain> References: <1139465007.297357.792110844862.qpush@concordia> <20060209072331.GK18730@localhost.localdomain> Message-ID: <200602101048.50034.michael@ellerman.id.au> On Thu, 9 Feb 2006 18:23, Nathan Lynch wrote: > Michael Ellerman wrote: > > +no_hypertas: > > of_node_put(dn); > > no_rtas: > > - > > DBG(" <- fw_feature_init()\n"); > > } > > of_node_put can handle a null pointer fine, so you could get away with > just one label here. Nice, I'll fix it up and resend. -- Michael Ellerman IBM OzLabs wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From benh at kernel.crashing.org Fri Feb 10 13:54:27 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 10 Feb 2006 13:54:27 +1100 Subject: question on ptep_clear_flush_dirty() for ppc64 In-Reply-To: <43EB67CA.7000207@nortel.com> References: <43EB67CA.7000207@nortel.com> Message-ID: <1139540068.5003.72.camel@localhost.localdomain> On Thu, 2006-02-09 at 10:03 -0600, Christopher Friesen wrote: > I notice that (at least for 2.6.10) ptep_clear_flush_dirty() for ppc64 > simply does ptep_test_and_clear_dirty(), then calls flush_tlb_pending().
> > I want to call ptep_clear_flush_dirty() for a large number of pages > (tens of thousands) in an optimal manner--would it be legal for me to > call ptep_test_and_clear_dirty() for each page, then call > flush_tlb_pending() once at the end? Are there any implications for SMP > machines? > > The reason I ask is that in a small experiment I did this increased the > speed of a certain task by a factor of about 25%, which is significant > in our application. I just wanted to make sure it was safe. If you do that, just beware that if any "new" dirtying happens between the update of the Linux PTE and the flush_tlb_pending(), it will not be lost... this is not a problem if you only "use" the result of the function (the collected dirty bits) after you flush_tlb_pending() since you will have those already marked dirty. I don't think you need the page table lock, though I'm a bit tired at the moment and may be missing something, and I don't think you need to disable preemption as a context switch will call flush_tlb_pending()... Ben. From benh at kernel.crashing.org Fri Feb 10 14:01:15 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 10 Feb 2006 14:01:15 +1100 Subject: __setup_cpu_be problem In-Reply-To: <43E9050C.2000300@am.sony.com> References: <43E9050C.2000300@am.sony.com> Message-ID: <1139540476.5003.74.camel@localhost.localdomain> On Tue, 2006-02-07 at 12:37 -0800, Geoff Levand wrote: > Arnd, > > It seems HID6 is a hypervisor resource... Can we just have > '.cpu_setup = __setup_cpu_power4', and you setup your > page sizes somewhere else? Or better test if MSR:HV is set ? :) But yeah, he shouldn't have to set the page sizes there anyway, I would expect the firmware to do it and pass the right sizes via the device-tree since that's what the kernel expects. (Though you really want LP0 to be 16M and not 1M as the kernel can't really deal with the latter properly anyway with the current page table layouts).
> -Geoff > > struct cpu_spec cpu_specs[] = { > { /* Cell Broadband Engine */ > .cpu_setup = __setup_cpu_be, > }, > > _GLOBAL(__setup_cpu_be) > /* Set large page sizes LP=0: 16MB, LP=1: 64KB */ > addi r3, 0, 0 > ori r3, r3, HID6_LB > sldi r3, r3, 32 > nor r3, r3, r3 > mfspr r4, SPRN_HID6 > and r4, r4, r3 > addi r3, 0, 0x02000 > sldi r3, r3, 32 > or r4, r4, r3 > mtspr SPRN_HID6, r4 > blr > > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev From arnd at arndb.de Fri Feb 10 15:21:15 2006 From: arnd at arndb.de (arnd at arndb.de) Date: Fri, 10 Feb 2006 05:21:15 +0100 Subject: AW: Re: __setup_cpu_be problem Message-ID: <2812322.110611139545275893.JavaMail.servlet@kundenserver> > >On Tue, 2006-02-07 at 12:37 -0800, Geoff Levand wrote: >> Arnd, >> >> It seems HID6 is a hypervisor resource... Can we just have >> '.cpu_setup = __setup_cpu_power4', and you setup your >> page sizes somewhere else? > >Or better test if MSR:HV is set ? :) But yeah, he shouldn't have to set >the page sizes there anyway, I would expect the firmware to do it and >pass the right sizes via the device-tree since that's what the kernel >expects. (Though you really want LP0 to be 16M and not 1M as the kernel >can't really deal with the latter properly anyway with the current page >table layouts). > [/me is using webmail from some distant location in .nz, sorry if the mail gets messed up] Doing it in the firmware sounds like the right solution to me. I would however not want to do that if the current firmware sets the wrong page sizes. I know that Hartmut wanted me to provide him with the right device tree information that he needs to create to say that the page sizes are 16M, 64k and 4k. Maybe we can find a combined solution for these problems. Using __setup_cpu_power4 should be ok.
We could probably do a fallback in the cell setup to see if the properties are in the device tree and do our own HID6 setup stuff if not, normally expecting that the firmware settings match the device tree. Geoff, if your firmware does not already have the properties for large page sizes, could you add them? Ben, could you point Hartmut (and maybe Geoff) to the documentation for what the device tree needs to look like? Hartmut, can you find out the value of HID6 when you enter the kernel from the firmware? Arnd <>< From benh at kernel.crashing.org Fri Feb 10 15:35:15 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 10 Feb 2006 15:35:15 +1100 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <2812322.110611139545275893.JavaMail.servlet@kundenserver> References: <2812322.110611139545275893.JavaMail.servlet@kundenserver> Message-ID: <1139546116.5003.81.camel@localhost.localdomain> > [/me is using webmail from some distant location in .nz, sorry if > the mail gets messed up] > > Doing it in the firmware sounds like the right solution to me. > I would however not want to do that if the current firmware > sets the wrong page sizes. > > I know that Hartmut wanted me to provide him with the right device > tree information that he needs to create to say that the page > sizes are 16M, 64k and 4k. Maybe we can find a combined solution > for these problems. Using __setup_cpu_power4 should be ok. I don't completely understand your statement ... sorry > We could probably do a fallback in the cell setup to see if > the properties are in the device tree and do our own HID6 > setup stuff if not, normally expecting that the firmware settings > match the device tree. We should not touch HID6 at all ... we should assume the firmware set it appropriately and has set up matching page size entries in the device-tree. I don't think we need to support changing that value especially since the kernel doesn't quite support 1M large page sizes anyway.
> Geoff, if your firmware does not already have the properties > for large page sizes, could you add them? > > Ben, could you point Hartmut (and maybe Geoff) to the documentation > for what the device tree needs to look like? I'm not sure we published that yet :) I would suggest looking at what the kernel does to parse these instead in hash_utils.c until I get a formal IBM approval for the spec to be published > Hartmut, can you find out the value of HID6 when you enter the kernel > from the firmware? Ben. From michael at ellerman.id.au Fri Feb 10 15:47:32 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 10 Feb 2006 15:47:32 +1100 Subject: [PATCH 1/2] powerpc: Clean up pSeries firmware feature initialisation Message-ID: <1139546852.877687.10893684750.qpush@concordia> Clean up fw_feature_init in platforms/pseries/setup.c. Clean up white space and replace the while loop with a for loop - which seems clearer to me. Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/setup.c | 48 +++++++++++++++------------------ 1 files changed, 22 insertions(+), 26 deletions(-) Index: to-merge/arch/powerpc/platforms/pseries/setup.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/setup.c +++ to-merge/arch/powerpc/platforms/pseries/setup.c @@ -263,48 +263,44 @@ static int __init pSeries_init_panel(voi arch_initcall(pSeries_init_panel); -/* Build up the ppc64_firmware_features bitmask field - * using contents of device-tree/ibm,hypertas-functions. - * Ultimately this functionality may be moved into prom.c prom_init(). +/* Build up the firmware features bitmask using the contents of + * device-tree/ibm,hypertas-functions. Ultimately this functionality may + * be moved into prom.c prom_init().
*/ static void __init fw_feature_init(void) { - struct device_node * dn; - char * hypertas; - unsigned int len; + struct device_node *dn; + char *hypertas, *s; + int len, i; DBG(" -> fw_feature_init()\n"); - ppc64_firmware_features = 0; dn = of_find_node_by_path("/rtas"); if (dn == NULL) { - printk(KERN_ERR "WARNING ! Cannot find RTAS in device-tree !\n"); - goto no_rtas; + printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n"); + goto out; } hypertas = get_property(dn, "ibm,hypertas-functions", &len); - if (hypertas) { - while (len > 0){ - int i, hypertas_len; + if (hypertas == NULL) + goto out; + + for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) { + for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) { /* check value against table of strings */ - for(i=0; i < FIRMWARE_MAX_FEATURES ;i++) { - if ((firmware_features_table[i].name) && - (strcmp(firmware_features_table[i].name,hypertas))==0) { - /* we have a match */ - ppc64_firmware_features |= - (firmware_features_table[i].val); - break; - } - } - hypertas_len = strlen(hypertas); - len -= hypertas_len +1; - hypertas+= hypertas_len +1; + if (!firmware_features_table[i].name || + strcmp(firmware_features_table[i].name, s)) + continue; + + /* we have a match */ + ppc64_firmware_features |= + firmware_features_table[i].val; + break; } } +out: of_node_put(dn); -no_rtas: - DBG(" <- fw_feature_init()\n"); } From michael at ellerman.id.au Fri Feb 10 15:47:36 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 10 Feb 2006 15:47:36 +1100 Subject: [PATCH 2/2] powerpc: Move pSeries firmware feature setup into platforms/pseries In-Reply-To: <1139546852.877687.10893684750.qpush@concordia> Message-ID: <20060210044802.B8D1E67B20@ozlabs.org> Currently we have some stuff in firmware.h and kernel/firmware.c that is #ifdef CONFIG_PPC_PSERIES. Move it all into platforms/pseries. 
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/firmware.c | 25 ------- arch/powerpc/platforms/pseries/Makefile | 3 arch/powerpc/platforms/pseries/firmware.c | 103 ++++++++++++++++++++++++++++++ arch/powerpc/platforms/pseries/firmware.h | 17 ++++ arch/powerpc/platforms/pseries/setup.c | 45 ------------- include/asm-powerpc/firmware.h | 9 -- 6 files changed, 123 insertions(+), 79 deletions(-) Index: to-merge/arch/powerpc/kernel/firmware.c =================================================================== --- to-merge.orig/arch/powerpc/kernel/firmware.c +++ to-merge/arch/powerpc/kernel/firmware.c @@ -18,28 +18,3 @@ #include unsigned long ppc64_firmware_features; - -#ifdef CONFIG_PPC_PSERIES -firmware_feature_t firmware_features_table[FIRMWARE_MAX_FEATURES] = { - {FW_FEATURE_PFT, "hcall-pft"}, - {FW_FEATURE_TCE, "hcall-tce"}, - {FW_FEATURE_SPRG0, "hcall-sprg0"}, - {FW_FEATURE_DABR, "hcall-dabr"}, - {FW_FEATURE_COPY, "hcall-copy"}, - {FW_FEATURE_ASR, "hcall-asr"}, - {FW_FEATURE_DEBUG, "hcall-debug"}, - {FW_FEATURE_PERF, "hcall-perf"}, - {FW_FEATURE_DUMP, "hcall-dump"}, - {FW_FEATURE_INTERRUPT, "hcall-interrupt"}, - {FW_FEATURE_MIGRATE, "hcall-migrate"}, - {FW_FEATURE_PERFMON, "hcall-perfmon"}, - {FW_FEATURE_CRQ, "hcall-crq"}, - {FW_FEATURE_VIO, "hcall-vio"}, - {FW_FEATURE_RDMA, "hcall-rdma"}, - {FW_FEATURE_LLAN, "hcall-lLAN"}, - {FW_FEATURE_BULK, "hcall-bulk"}, - {FW_FEATURE_XDABR, "hcall-xdabr"}, - {FW_FEATURE_MULTITCE, "hcall-multi-tce"}, - {FW_FEATURE_SPLPAR, "hcall-splpar"}, -}; -#endif Index: to-merge/arch/powerpc/platforms/pseries/Makefile =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/Makefile +++ to-merge/arch/powerpc/platforms/pseries/Makefile @@ -1,5 +1,6 @@ obj-y := pci.o lpar.o hvCall.o nvram.o reconfig.o \ - setup.o iommu.o ras.o rtasd.o pci_dlpar.o + setup.o iommu.o ras.o rtasd.o pci_dlpar.o \ + firmware.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_IBMVIO) += vio.o 
obj-$(CONFIG_XICS) += xics.o Index: to-merge/arch/powerpc/platforms/pseries/firmware.c =================================================================== --- /dev/null +++ to-merge/arch/powerpc/platforms/pseries/firmware.c @@ -0,0 +1,103 @@ +/* + * pSeries firmware setup code. + * + * Portions from arch/powerpc/platforms/pseries/setup.c: + * Copyright (C) 1995 Linus Torvalds + * Adapted from 'alpha' version by Gary Thomas + * Modified by Cort Dougan (cort at cs.nmt.edu) + * Modified by PPC64 Team, IBM Corp + * + * Portions from arch/powerpc/kernel/firmware.c + * Copyright (C) 2001 Ben. Herrenschmidt (benh at kernel.crashing.org) + * Modifications for ppc64: + * Copyright (C) 2003 Dave Engebretsen + * Copyright (C) 2005 Stephen Rothwell, IBM Corporation + * + * Copyright 2006 IBM Corporation. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#undef DEBUG + +#include +#include + +#ifdef DEBUG +#define DBG(fmt...) udbg_printf(fmt) +#else +#define DBG(fmt...) 
+#endif + +typedef struct { + unsigned long val; + char * name; +} firmware_feature_t; + +static __initdata firmware_feature_t +firmware_features_table[FIRMWARE_MAX_FEATURES] = { + {FW_FEATURE_PFT, "hcall-pft"}, + {FW_FEATURE_TCE, "hcall-tce"}, + {FW_FEATURE_SPRG0, "hcall-sprg0"}, + {FW_FEATURE_DABR, "hcall-dabr"}, + {FW_FEATURE_COPY, "hcall-copy"}, + {FW_FEATURE_ASR, "hcall-asr"}, + {FW_FEATURE_DEBUG, "hcall-debug"}, + {FW_FEATURE_PERF, "hcall-perf"}, + {FW_FEATURE_DUMP, "hcall-dump"}, + {FW_FEATURE_INTERRUPT, "hcall-interrupt"}, + {FW_FEATURE_MIGRATE, "hcall-migrate"}, + {FW_FEATURE_PERFMON, "hcall-perfmon"}, + {FW_FEATURE_CRQ, "hcall-crq"}, + {FW_FEATURE_VIO, "hcall-vio"}, + {FW_FEATURE_RDMA, "hcall-rdma"}, + {FW_FEATURE_LLAN, "hcall-lLAN"}, + {FW_FEATURE_BULK, "hcall-bulk"}, + {FW_FEATURE_XDABR, "hcall-xdabr"}, + {FW_FEATURE_MULTITCE, "hcall-multi-tce"}, + {FW_FEATURE_SPLPAR, "hcall-splpar"}, +}; + +/* Build up the firmware features bitmask using the contents of + * device-tree/ibm,hypertas-functions. Ultimately this functionality may + * be moved into prom.c prom_init(). + */ +void __init fw_feature_init(void) +{ + struct device_node *dn; + char *hypertas, *s; + int len, i; + + DBG(" -> fw_feature_init()\n"); + + dn = of_find_node_by_path("/rtas"); + if (dn == NULL) { + printk(KERN_ERR "WARNING! 
Cannot find RTAS in device-tree!\n"); + goto out; + } + + hypertas = get_property(dn, "ibm,hypertas-functions", &len); + if (hypertas == NULL) + goto out; + + for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) { + for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) { + /* check value against table of strings */ + if (!firmware_features_table[i].name || + strcmp(firmware_features_table[i].name, s)) + continue; + + /* we have a match */ + ppc64_firmware_features |= + firmware_features_table[i].val; + break; + } + } + +out: + of_node_put(dn); + DBG(" <- fw_feature_init()\n"); +} Index: to-merge/arch/powerpc/platforms/pseries/setup.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/setup.c +++ to-merge/arch/powerpc/platforms/pseries/setup.c @@ -60,7 +60,6 @@ #include #include #include "xics.h" -#include #include #include #include @@ -70,6 +69,7 @@ #include "plpar_wrappers.h" #include "ras.h" +#include "firmware.h" #ifdef DEBUG #define DBG(fmt...) udbg_printf(fmt) @@ -262,49 +262,6 @@ static int __init pSeries_init_panel(voi } arch_initcall(pSeries_init_panel); - -/* Build up the firmware features bitmask using the contents of - * device-tree/ibm,hypertas-functions. Ultimately this functionality may - * be moved into prom.c prom_init(). - */ -static void __init fw_feature_init(void) -{ - struct device_node *dn; - char *hypertas, *s; - int len, i; - - DBG(" -> fw_feature_init()\n"); - - dn = of_find_node_by_path("/rtas"); - if (dn == NULL) { - printk(KERN_ERR "WARNING! 
Cannot find RTAS in device-tree!\n"); - goto out; - } - - hypertas = get_property(dn, "ibm,hypertas-functions", &len); - if (hypertas == NULL) - goto out; - - for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) { - for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) { - /* check value against table of strings */ - if (!firmware_features_table[i].name || - strcmp(firmware_features_table[i].name, s)) - continue; - - /* we have a match */ - ppc64_firmware_features |= - firmware_features_table[i].val; - break; - } - } - -out: - of_node_put(dn); - DBG(" <- fw_feature_init()\n"); -} - - static void __init pSeries_discover_pic(void) { struct device_node *np; Index: to-merge/include/asm-powerpc/firmware.h =================================================================== --- to-merge.orig/include/asm-powerpc/firmware.h +++ to-merge/include/asm-powerpc/firmware.h @@ -89,15 +89,6 @@ static inline unsigned long firmware_has (FW_FEATURE_POSSIBLE & ppc64_firmware_features & feature); } -#ifdef CONFIG_PPC_PSERIES -typedef struct { - unsigned long val; - char * name; -} firmware_feature_t; - -extern firmware_feature_t firmware_features_table[]; -#endif - extern void system_reset_fwnmi(void); extern void machine_check_fwnmi(void); Index: to-merge/arch/powerpc/platforms/pseries/firmware.h =================================================================== --- /dev/null +++ to-merge/arch/powerpc/platforms/pseries/firmware.h @@ -0,0 +1,17 @@ +/* + * Copyright 2006 IBM Corporation. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */
+
+#ifndef _PSERIES_FIRMWARE_H
+#define _PSERIES_FIRMWARE_H
+
+#include
+
+extern void __init fw_feature_init(void);
+
+#endif /* _PSERIES_FIRMWARE_H */

From geoffrey.levand at am.sony.com Sat Feb 11 03:44:26 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Fri, 10 Feb 2006 08:44:26 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139546116.5003.81.camel@localhost.localdomain>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver> <1139546116.5003.81.camel@localhost.localdomain>
Message-ID: <43ECC2EA.5040505@am.sony.com>

Benjamin Herrenschmidt wrote:
>> [/me is using webmail from some distant location in .nz, sorry if
>> the mail gets messed up]
>>
>> Doing it in the firmware sounds like the right solution to me.
>> I would however not want to do that if the current firmware
>> sets the wrong page sizes.

Then we should treat support for changing page sizes as a workaround for
a firmware bug.

>> I know that Hartmut wanted me to provide him with the right device
>> tree information that he needs to create to say that the page
>> sizes are 16M, 64k and 4k. Maybe we can find a combined solution
>> for these problems.

Using __setup_cpu_power4 should be ok.

> I don't completely understand your statement ... sorry
>
>> We could probably do a fallback in the cell setup to see if
>> the properties are in the device tree and do our own HID6
>> setup stuff if not, normally expecting that the firmware settings
>> match the device tree.
>
> We should not touch HID6 at all ... we should assume the firmware set it
> appropriately and have setup matching page size entries in the
> device-tree. I don't think we need to support changing that value
> especially since the kernel doesn't quite support 1M large page sizes
> anyway.

This seems reasonable. In the general case we shouldn't change the sizes.

>> Ben, could you point Hartmut (and maybe Geoff) to the documentation
>> for how the device tree needs to look?
>
> I'm not sure we published that yet :) I would suggest looking at what
> the kernel does to parse these instead in hash_utils.c until I get a
> formal IBM approval for the spec to be published

It seems we can work with the info in hash_utils_64.c, but it would be
good to get the documentation.

-Geoff

From ahuja at austin.ibm.com Sat Feb 11 07:45:08 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Fri, 10 Feb 2006 14:45:08 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <43DFB67A.5080508@austin.ibm.com>
References: <43CFC094.8000709@austin.ibm.com> <20060126204432.GG19465@austin.ibm.com> <43DFB67A.5080508@austin.ibm.com>
Message-ID: <43ECFB54.9050907@austin.ibm.com>

I have made some changes to the patch. I will drop another patch at a
later date with enhancements and some modifications. At this point, I
would like to resubmit this for review, and to have it forwarded if
there are no further comments.

Thanks,
Manish Ahuja

--------------------------------------------------------------------

Correct time accounting is important: users want an accurate feel for
how the system is performing. With virtualization and the addition of
abstraction layers between the hardware and the OS, we have introduced
changes that currently prevent us from measuring performance correctly.
Any activity that collects metrics at the moment faces the challenge of
collecting values that might be bogus.

POWER5 machines have a per-hardware-thread register which counts at a
rate which is proportional to the percentage of cycles on which the cpu
dispatches an instruction for this thread (if the thread gets all the
dispatch cycles it counts at the same rate as the timebase register).
This register is also context-switched by the hypervisor. Thus it gives
a fine-grained measure of the actual cpu usage by the thread over time.

This patch builds on a patch submitted earlier.
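
To make the relationship between the per-thread register and the
timebase concrete, here is a minimal sketch. It is not code from the
patch: the helper name thread_dispatch_pct() and its two delta
parameters are hypothetical, and the sketch assumes both registers were
sampled at the start and end of the same interval. It relies only on the
property stated above, that a thread receiving every dispatch cycle
advances the register at the timebase rate.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): percentage of dispatch
 * cycles a hardware thread received during one sampling interval.
 *
 * reg_delta: change in the per-thread register over the interval
 * tb_delta:  change in the timebase register over the same interval
 *
 * Assumes reg_delta is small enough that reg_delta * 100 does not
 * overflow 64 bits, which holds for ordinary sampling intervals.
 */
static unsigned int thread_dispatch_pct(uint64_t reg_delta, uint64_t tb_delta)
{
	if (tb_delta == 0)
		return 0;	/* empty interval, nothing to report */
	if (reg_delta >= tb_delta)
		return 100;	/* thread got all the dispatch cycles */
	return (unsigned int)((reg_delta * 100) / tb_delta);
}
```

A caller would read both registers twice, subtract, and pass the two
differences in; integer division truncates, so a third of the cycles
reports as 33.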
This patch provides a framework and data that allow other tools to
report measurements accurately.

This patch calculates the amount of real physical time spent by the
processor in each of the USER and SYS modes. It does so by trapping
entries into and exits from the kernel. The calculated values are
available from /sys/devices/system/cpu/cpuX/dispatched_cycles. These
values are calculated during interrupts and context switches.

To be able to correctly report all cycles, it is important to be able to
track all the cycles that are given to lpars that are either offline or
have been removed since the system started. All such cycles are
calculated and stored in
/sys/devices/system/cpu/cpuX/offline_cpu_cycles

A few tools are in the works that will exploit the values being
calculated. Example output looks as follows.

 %user    %sys   %wait   %idle
------  ------  ------  ------
 00.90    0.09    0.00   99.01

This patch also keeps track of exact user/kernel times for each process
and updates them accordingly to be used by tools like CKRM.

I am working with the performance group to calculate the impact of this
patch. I will add those numbers as soon as something becomes available.

Signed-off-by: Manish Ahuja

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cpu_acct.patch.2
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060210/9d849630/attachment.txt

From olof at lixom.net Sat Feb 11 10:23:25 2006
From: olof at lixom.net (Olof Johansson)
Date: Fri, 10 Feb 2006 17:23:25 -0600
Subject: [PATCH] rename fw_feature_init()
Message-ID: <20060210232325.GA4795@pb15.lixom.net>

Hi,

fw_feature_init() on pSeries is really just a setup of hypervisor
features. Rename the function accordingly.

Signed-off-by: Olof Johansson --- firmware.c | 18 +++++++++--------- firmware.h | 2 +- setup.c | 2 +- 3 files changed, 11 insertions(+), 11 deletions(-) Index: powerpc-git/arch/powerpc/platforms/pseries/firmware.c =================================================================== --- powerpc-git.orig/arch/powerpc/platforms/pseries/firmware.c +++ powerpc-git/arch/powerpc/platforms/pseries/firmware.c @@ -35,10 +35,10 @@ typedef struct { unsigned long val; char * name; -} firmware_feature_t; +} hypervisor_feature_t; -static __initdata firmware_feature_t -firmware_features_table[FIRMWARE_MAX_FEATURES] = { +static __initdata hypervisor_feature_t +hypervisor_features_table[FIRMWARE_MAX_FEATURES] = { {FW_FEATURE_PFT, "hcall-pft"}, {FW_FEATURE_TCE, "hcall-tce"}, {FW_FEATURE_SPRG0, "hcall-sprg0"}, @@ -65,13 +65,13 @@ firmware_features_table[FIRMWARE_MAX_FEA * device-tree/ibm,hypertas-functions. Ultimately this functionality may * be moved into prom.c prom_init(). */ -void __init fw_feature_init(void) +void __init hypervisor_feature_init(void) { struct device_node *dn; char *hypertas, *s; int len, i; - DBG(" -> fw_feature_init()\n"); + DBG(" -> hypervisor_feature_init()\n"); dn = of_find_node_by_path("/rtas"); if (dn == NULL) { @@ -86,18 +86,18 @@ void __init fw_feature_init(void) for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) { for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) { /* check value against table of strings */ - if (!firmware_features_table[i].name || - strcmp(firmware_features_table[i].name, s)) + if (!hypervisor_features_table[i].name || + strcmp(hypervisor_features_table[i].name, s)) continue; /* we have a match */ ppc64_firmware_features |= - firmware_features_table[i].val; + hypervisor_features_table[i].val; break; } } out: of_node_put(dn); - DBG(" <- fw_feature_init()\n"); + DBG(" <- hypervisor_feature_init()\n"); } Index: powerpc-git/arch/powerpc/platforms/pseries/firmware.h =================================================================== --- 
powerpc-git.orig/arch/powerpc/platforms/pseries/firmware.h +++ powerpc-git/arch/powerpc/platforms/pseries/firmware.h @@ -12,6 +12,6 @@ #include -extern void __init fw_feature_init(void); +extern void __init hypervisor_feature_init(void); #endif /* _PSERIES_FIRMWARE_H */ Index: powerpc-git/arch/powerpc/platforms/pseries/setup.c =================================================================== --- powerpc-git.orig/arch/powerpc/platforms/pseries/setup.c +++ powerpc-git/arch/powerpc/platforms/pseries/setup.c @@ -324,7 +324,7 @@ static void __init pSeries_init_early(vo DBG(" -> pSeries_init_early()\n"); - fw_feature_init(); + hypervisor_feature_init(); if (platform_is_lpar()) hpte_init_lpar(); From olof at lixom.net Sat Feb 11 10:49:03 2006 From: olof at lixom.net (Olof Johansson) Date: Fri, 10 Feb 2006 17:49:03 -0600 Subject: [PATCH] Update {g5,pseries,ppc64}_defconfig Message-ID: <20060210234903.GB4795@pb15.lixom.net> Hi, For powerpc.git (post-2.6.16): Update defconfigs for g5, pseries and generic ppc64. Default choices for everything, with the following exceptions: * Enable WINDFARM_PM112 on g5 and ppc64. 
* Increase CONFIG_NR_CPUS to 4 on g5_defconfig Signed-off-by: Olof Johansson Index: powerpc-git/arch/powerpc/configs/g5_defconfig =================================================================== --- powerpc-git.orig/arch/powerpc/configs/g5_defconfig +++ powerpc-git/arch/powerpc/configs/g5_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.15-rc5 -# Tue Dec 20 15:59:30 2005 +# Linux kernel version: 2.6.16-rc2 +# Fri Feb 10 17:33:08 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -16,6 +16,10 @@ CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_PPC_OF=y +# CONFIG_PPC_UDBG_16550 is not set +CONFIG_GENERIC_TBSYNC=y +# CONFIG_DEFAULT_UIMAGE is not set # # Processor support @@ -26,13 +30,12 @@ CONFIG_PPC_FPU=y CONFIG_ALTIVEC=y CONFIG_PPC_STD_MMU=y CONFIG_SMP=y -CONFIG_NR_CPUS=2 +CONFIG_NR_CPUS=4 # # Code maturity level options # CONFIG_EXPERIMENTAL=y -CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 @@ -47,8 +50,6 @@ CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y # CONFIG_AUDIT is not set -CONFIG_HOTPLUG=y -CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y # CONFIG_CPUSETS is not set @@ -58,8 +59,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y +CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y @@ -68,8 +71,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 +CONFIG_SLAB=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 +# CONFIG_SLOB is not set # # Loadable module support @@ -112,13 +117,12 @@ CONFIG_PPC_PMAC=y CONFIG_PPC_PMAC64=y # CONFIG_PPC_MAPLE is not set # CONFIG_PPC_CELL is not set -CONFIG_PPC_OF=y CONFIG_U3_DART=y CONFIG_MPIC=y # CONFIG_PPC_RTAS is not set # CONFIG_MMIO_NVRAM is not 
set +CONFIG_MPIC_BROKEN_U3=y # CONFIG_PPC_MPC106 is not set -CONFIG_GENERIC_TBSYNC=y CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y # CONFIG_CPU_FREQ_DEBUG is not set @@ -151,6 +155,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_IOMMU_VMERGE=y # CONFIG_HOTPLUG_CPU is not set CONFIG_KEXEC=y +# CONFIG_CRASH_DUMP is not set CONFIG_IRQ_ALL_CPUS=y # CONFIG_NUMA is not set CONFIG_ARCH_SELECT_MEMORY_MODEL=y @@ -202,6 +207,7 @@ CONFIG_NET=y # # Networking options # +# CONFIG_NETDEBUG is not set CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y @@ -239,6 +245,7 @@ CONFIG_NETFILTER=y # Core Netfilter Configuration # # CONFIG_NETFILTER_NETLINK is not set +# CONFIG_NETFILTER_XTABLES is not set # # IP: Netfilter Configuration @@ -255,65 +262,6 @@ CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m # CONFIG_IP_NF_PPTP is not set CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -# CONFIG_IP_NF_MATCH_DCCP is not set -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_CONNBYTES=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_MATCH_STRING=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_TARGET_NFQUEUE=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m 
-CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_TTL=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m # # DCCP Configuration (EXPERIMENTAL) @@ -324,6 +272,11 @@ CONFIG_IP_NF_ARP_MANGLE=m # SCTP Configuration (EXPERIMENTAL) # # CONFIG_IP_SCTP is not set + +# +# TIPC Configuration (EXPERIMENTAL) +# +# CONFIG_TIPC is not set # CONFIG_ATM is not set # CONFIG_BRIDGE is not set # CONFIG_VLAN_8021Q is not set @@ -342,7 +295,6 @@ CONFIG_LLC=y # QoS and/or fair queueing # # CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y # # Network testing @@ -545,13 +497,7 @@ CONFIG_SCSI_SATA_SVW=y # CONFIG_SCSI_IPR is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set -CONFIG_SCSI_QLA2XXX=y -# CONFIG_SCSI_QLA21XX is not set -# CONFIG_SCSI_QLA22XX is not set -# CONFIG_SCSI_QLA2300 is not set -# CONFIG_SCSI_QLA2322 is not set -# CONFIG_SCSI_QLA6312 is not set -# CONFIG_SCSI_QLA24XX is not set +# CONFIG_SCSI_QLA_FC is not set # CONFIG_SCSI_LPFC is not set # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set @@ -614,7 +560,6 @@ CONFIG_IEEE1394_SBP2=m CONFIG_IEEE1394_ETH1394=m CONFIG_IEEE1394_DV1394=m CONFIG_IEEE1394_RAWIO=y -# CONFIG_IEEE1394_CMP is not set # # I2O device support @@ -630,6 +575,7 @@ CONFIG_THERM_PM72=y CONFIG_WINDFARM=y CONFIG_WINDFARM_PM81=y CONFIG_WINDFARM_PM91=y +CONFIG_WINDFARM_PM112=y # # Network device support @@ -682,6 +628,7 @@ CONFIG_E1000=y # CONFIG_R8169 is not set # CONFIG_SIS190 is not set # CONFIG_SKGE is not set +# CONFIG_SKY2 is not set # CONFIG_SK98LIN is not set CONFIG_TIGON3=m # CONFIG_BNX2 is not set @@ 
-861,8 +808,7 @@ CONFIG_I2C_ALGOBIT=y # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set # CONFIG_I2C_PIIX4 is not set -CONFIG_I2C_KEYWEST=y -CONFIG_I2C_PMAC_SMU=y +CONFIG_I2C_POWERMAC=y # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT_LIGHT is not set # CONFIG_I2C_PROSAVAGE is not set @@ -895,6 +841,12 @@ CONFIG_I2C_PMAC_SMU=y # CONFIG_I2C_DEBUG_CHIP is not set # +# SPI support +# +# CONFIG_SPI is not set +# CONFIG_SPI_MASTER is not set + +# # Dallas's 1-wire bus # # CONFIG_W1 is not set @@ -961,7 +913,6 @@ CONFIG_FB_RADEON_I2C=y # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set -# CONFIG_FB_CYBLA is not set # CONFIG_FB_TRIDENT is not set # CONFIG_FB_VIRTUAL is not set @@ -1008,9 +959,10 @@ CONFIG_SND_OSSEMUL=y CONFIG_SND_MIXER_OSS=m CONFIG_SND_PCM_OSS=m CONFIG_SND_SEQUENCER_OSS=y +# CONFIG_SND_DYNAMIC_MINORS is not set +CONFIG_SND_SUPPORT_OLD_API=y # CONFIG_SND_VERBOSE_PRINTK is not set # CONFIG_SND_DEBUG is not set -CONFIG_SND_GENERIC_DRIVER=y # # Generic devices @@ -1024,6 +976,8 @@ CONFIG_SND_GENERIC_DRIVER=y # # PCI devices # +# CONFIG_SND_AD1889 is not set +# CONFIG_SND_ALS4000 is not set # CONFIG_SND_ALI5451 is not set # CONFIG_SND_ATIIXP is not set # CONFIG_SND_ATIIXP_MODEM is not set @@ -1032,39 +986,38 @@ CONFIG_SND_GENERIC_DRIVER=y # CONFIG_SND_AU8830 is not set # CONFIG_SND_AZT3328 is not set # CONFIG_SND_BT87X is not set -# CONFIG_SND_CS46XX is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_CMIPCI is not set # CONFIG_SND_CS4281 is not set +# CONFIG_SND_CS46XX is not set # CONFIG_SND_EMU10K1 is not set # CONFIG_SND_EMU10K1X is not set -# CONFIG_SND_CA0106 is not set -# CONFIG_SND_KORG1212 is not set -# CONFIG_SND_MIXART is not set -# CONFIG_SND_NM256 is not set -# CONFIG_SND_RME32 is not set -# CONFIG_SND_RME96 is not set -# CONFIG_SND_RME9652 is not set -# CONFIG_SND_HDSP is not set -# CONFIG_SND_HDSPM is not set -# CONFIG_SND_TRIDENT is not set -# CONFIG_SND_YMFPCI is not set -# CONFIG_SND_AD1889 
is not set -# CONFIG_SND_ALS4000 is not set -# CONFIG_SND_CMIPCI is not set # CONFIG_SND_ENS1370 is not set # CONFIG_SND_ENS1371 is not set # CONFIG_SND_ES1938 is not set # CONFIG_SND_ES1968 is not set -# CONFIG_SND_MAESTRO3 is not set # CONFIG_SND_FM801 is not set +# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_HDSP is not set +# CONFIG_SND_HDSPM is not set # CONFIG_SND_ICE1712 is not set # CONFIG_SND_ICE1724 is not set # CONFIG_SND_INTEL8X0 is not set # CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_KORG1212 is not set +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +# CONFIG_SND_PCXHR is not set +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set # CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_TRIDENT is not set # CONFIG_SND_VIA82XX is not set # CONFIG_SND_VIA82XX_MODEM is not set # CONFIG_SND_VX222 is not set -# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_YMFPCI is not set # # ALSA PowerMac devices @@ -1136,13 +1089,16 @@ CONFIG_USB_STORAGE_DPCM=y CONFIG_USB_STORAGE_SDDR09=y CONFIG_USB_STORAGE_SDDR55=y CONFIG_USB_STORAGE_JUMPSHOT=y +# CONFIG_USB_STORAGE_ALAUDA is not set # CONFIG_USB_STORAGE_ONETOUCH is not set +# CONFIG_USB_LIBUSUAL is not set # # USB Input Devices # CONFIG_USB_HID=y CONFIG_USB_HIDINPUT=y +# CONFIG_USB_HIDINPUT_POWERBOOK is not set CONFIG_HID_FF=y CONFIG_HID_PID=y CONFIG_LOGITECH_FF=y @@ -1159,6 +1115,7 @@ CONFIG_USB_HIDDEV=y # CONFIG_USB_YEALINK is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_ATI_REMOTE2 is not set # CONFIG_USB_KEYSPAN_REMOTE is not set # CONFIG_USB_APPLETOUCH is not set @@ -1207,6 +1164,7 @@ CONFIG_USB_SERIAL_GENERIC=y # CONFIG_USB_SERIAL_AIRPRIME is not set # CONFIG_USB_SERIAL_ANYDATA is not set CONFIG_USB_SERIAL_BELKIN=m +# CONFIG_USB_SERIAL_WHITEHEAT is not set CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m # CONFIG_USB_SERIAL_CP2101 is not set CONFIG_USB_SERIAL_CYPRESS_M8=m @@ -1288,6 +1246,10 @@ 
CONFIG_USB_EZUSB=y # # +# EDAC - error detection and reporting (RAS) +# + +# # File systems # CONFIG_EXT2_FS=y @@ -1317,6 +1279,7 @@ CONFIG_XFS_EXPORT=y CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_XFS_RT is not set +# CONFIG_OCFS2_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y @@ -1357,6 +1320,7 @@ CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y # CONFIG_RELAYFS_FS is not set +# CONFIG_CONFIGFS_FS is not set # # Miscellaneous filesystems @@ -1426,6 +1390,7 @@ CONFIG_MSDOS_PARTITION=y # CONFIG_SGI_PARTITION is not set # CONFIG_ULTRIX_PARTITION is not set # CONFIG_SUN_PARTITION is not set +# CONFIG_KARMA_PARTITION is not set # CONFIG_EFI_PARTITION is not set # @@ -1481,10 +1446,6 @@ CONFIG_CRC32=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m -CONFIG_TEXTSEARCH=y -CONFIG_TEXTSEARCH_KMP=m -CONFIG_TEXTSEARCH_BM=m -CONFIG_TEXTSEARCH_FSM=m # # Instrumentation Support @@ -1497,24 +1458,31 @@ CONFIG_OPROFILE=y # Kernel hacking # # CONFIG_PRINTK_TIME is not set -CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +CONFIG_DEBUG_MUTEXES=y # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set +CONFIG_FORCED_INLINING=y # CONFIG_RCU_TORTURE_TEST is not set # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_DEBUG_STACK_USAGE is not set # CONFIG_DEBUGGER is not set CONFIG_IRQSTACKS=y CONFIG_BOOTX_TEXT=y +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set +# CONFIG_PPC_EARLY_DEBUG_G5 is not set +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set # # Security options Index: powerpc-git/arch/powerpc/configs/ppc64_defconfig 
=================================================================== --- powerpc-git.orig/arch/powerpc/configs/ppc64_defconfig +++ powerpc-git/arch/powerpc/configs/ppc64_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.15-rc5 -# Tue Dec 20 15:59:38 2005 +# Linux kernel version: 2.6.16-rc2 +# Fri Feb 10 17:32:14 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -16,6 +16,10 @@ CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_PPC_OF=y +CONFIG_PPC_UDBG_16550=y +CONFIG_GENERIC_TBSYNC=y +# CONFIG_DEFAULT_UIMAGE is not set # # Processor support @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=32 # Code maturity level options # CONFIG_EXPERIMENTAL=y -CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 @@ -48,8 +51,6 @@ CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y # CONFIG_AUDIT is not set -CONFIG_HOTPLUG=y -CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_CPUSETS=y @@ -59,8 +60,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y +CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y @@ -69,8 +72,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 +CONFIG_SLAB=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 +# CONFIG_SLOB is not set # # Loadable module support @@ -113,7 +118,6 @@ CONFIG_PPC_PMAC=y CONFIG_PPC_PMAC64=y CONFIG_PPC_MAPLE=y # CONFIG_PPC_CELL is not set -CONFIG_PPC_OF=y CONFIG_XICS=y CONFIG_U3_DART=y CONFIG_MPIC=y @@ -124,8 +128,8 @@ CONFIG_RTAS_FLASH=m # CONFIG_MMIO_NVRAM is not set CONFIG_MPIC_BROKEN_U3=y CONFIG_IBMVIO=y +# CONFIG_IBMEBUS is not set # CONFIG_PPC_MPC106 is not set -CONFIG_GENERIC_TBSYNC=y CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y # CONFIG_CPU_FREQ_DEBUG is not set @@ -158,6 +162,7 @@ 
CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_IOMMU_VMERGE=y CONFIG_HOTPLUG_CPU=y CONFIG_KEXEC=y +# CONFIG_CRASH_DUMP is not set CONFIG_IRQ_ALL_CPUS=y CONFIG_PPC_SPLPAR=y CONFIG_EEH=y @@ -178,6 +183,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y CONFIG_SPARSEMEM_EXTREME=y # CONFIG_MEMORY_HOTPLUG is not set CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_MIGRATION=y # CONFIG_PPC_64K_PAGES is not set # CONFIG_SCHED_SMT is not set CONFIG_PROC_DEVICETREE=y @@ -221,6 +227,7 @@ CONFIG_NET=y # # Networking options # +# CONFIG_NETDEBUG is not set CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y @@ -260,6 +267,7 @@ CONFIG_NETFILTER=y CONFIG_NETFILTER_NETLINK=y CONFIG_NETFILTER_NETLINK_QUEUE=m CONFIG_NETFILTER_NETLINK_LOG=m +# CONFIG_NETFILTER_XTABLES is not set # # IP: Netfilter Configuration @@ -277,65 +285,6 @@ CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m # CONFIG_IP_NF_PPTP is not set CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -CONFIG_IP_NF_MATCH_DCCP=m -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_CONNBYTES=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_MATCH_STRING=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_TARGET_NFQUEUE=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m 
-CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_TTL=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m # # DCCP Configuration (EXPERIMENTAL) @@ -346,6 +295,11 @@ CONFIG_IP_NF_ARP_MANGLE=m # SCTP Configuration (EXPERIMENTAL) # # CONFIG_IP_SCTP is not set + +# +# TIPC Configuration (EXPERIMENTAL) +# +# CONFIG_TIPC is not set # CONFIG_ATM is not set # CONFIG_BRIDGE is not set # CONFIG_VLAN_8021Q is not set @@ -364,7 +318,6 @@ CONFIG_LLC=y # QoS and/or fair queueing # # CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y # # Network testing @@ -572,13 +525,7 @@ CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set -CONFIG_SCSI_QLA2XXX=y -CONFIG_SCSI_QLA21XX=m -CONFIG_SCSI_QLA22XX=m -CONFIG_SCSI_QLA2300=m -CONFIG_SCSI_QLA2322=m -CONFIG_SCSI_QLA6312=m -CONFIG_SCSI_QLA24XX=m +# CONFIG_SCSI_QLA_FC is not set CONFIG_SCSI_LPFC=m # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set @@ -642,8 +589,6 @@ CONFIG_IEEE1394_SBP2=m CONFIG_IEEE1394_ETH1394=m CONFIG_IEEE1394_DV1394=m CONFIG_IEEE1394_RAWIO=y -CONFIG_IEEE1394_CMP=m -CONFIG_IEEE1394_AMDTP=m # # I2O device support @@ -659,6 +604,7 @@ CONFIG_THERM_PM72=y CONFIG_WINDFARM=y CONFIG_WINDFARM_PM81=y CONFIG_WINDFARM_PM91=y +CONFIG_WINDFARM_PM112=y # # Network device support @@ -731,6 +677,7 @@ CONFIG_E1000=y # CONFIG_R8169 is not set # CONFIG_SIS190 is not set # CONFIG_SKGE is not set +# CONFIG_SKY2 is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y @@ 
-853,6 +800,7 @@ CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=4 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # CONFIG_SERIAL_8250_EXTENDED is not set # @@ -880,6 +828,7 @@ CONFIG_HVCS=m # CONFIG_WATCHDOG is not set # CONFIG_RTC is not set CONFIG_GEN_RTC=y +# CONFIG_GEN_RTC_X is not set # CONFIG_DTLK is not set # CONFIG_R3964 is not set # CONFIG_APPLICOM is not set @@ -923,8 +872,7 @@ CONFIG_I2C_AMD8111=y # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set # CONFIG_I2C_PIIX4 is not set -CONFIG_I2C_KEYWEST=y -CONFIG_I2C_PMAC_SMU=y +CONFIG_I2C_POWERMAC=y # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT_LIGHT is not set # CONFIG_I2C_PROSAVAGE is not set @@ -957,6 +905,12 @@ CONFIG_I2C_PMAC_SMU=y # CONFIG_I2C_DEBUG_CHIP is not set # +# SPI support +# +# CONFIG_SPI is not set +# CONFIG_SPI_MASTER is not set + +# # Dallas's 1-wire bus # # CONFIG_W1 is not set @@ -1028,7 +982,6 @@ CONFIG_FB_RADEON_I2C=y # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set -# CONFIG_FB_CYBLA is not set # CONFIG_FB_TRIDENT is not set # CONFIG_FB_VIRTUAL is not set @@ -1073,9 +1026,10 @@ CONFIG_SND_OSSEMUL=y CONFIG_SND_MIXER_OSS=m CONFIG_SND_PCM_OSS=m CONFIG_SND_SEQUENCER_OSS=y +# CONFIG_SND_DYNAMIC_MINORS is not set +CONFIG_SND_SUPPORT_OLD_API=y # CONFIG_SND_VERBOSE_PRINTK is not set # CONFIG_SND_DEBUG is not set -CONFIG_SND_GENERIC_DRIVER=y # # Generic devices @@ -1089,6 +1043,8 @@ CONFIG_SND_GENERIC_DRIVER=y # # PCI devices # +# CONFIG_SND_AD1889 is not set +# CONFIG_SND_ALS4000 is not set # CONFIG_SND_ALI5451 is not set # CONFIG_SND_ATIIXP is not set # CONFIG_SND_ATIIXP_MODEM is not set @@ -1097,39 +1053,38 @@ CONFIG_SND_GENERIC_DRIVER=y # CONFIG_SND_AU8830 is not set # CONFIG_SND_AZT3328 is not set # CONFIG_SND_BT87X is not set -# CONFIG_SND_CS46XX is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_CMIPCI is not set # CONFIG_SND_CS4281 is not set +# CONFIG_SND_CS46XX is not set # 
CONFIG_SND_EMU10K1 is not set # CONFIG_SND_EMU10K1X is not set -# CONFIG_SND_CA0106 is not set -# CONFIG_SND_KORG1212 is not set -# CONFIG_SND_MIXART is not set -# CONFIG_SND_NM256 is not set -# CONFIG_SND_RME32 is not set -# CONFIG_SND_RME96 is not set -# CONFIG_SND_RME9652 is not set -# CONFIG_SND_HDSP is not set -# CONFIG_SND_HDSPM is not set -# CONFIG_SND_TRIDENT is not set -# CONFIG_SND_YMFPCI is not set -# CONFIG_SND_AD1889 is not set -# CONFIG_SND_ALS4000 is not set -# CONFIG_SND_CMIPCI is not set # CONFIG_SND_ENS1370 is not set # CONFIG_SND_ENS1371 is not set # CONFIG_SND_ES1938 is not set # CONFIG_SND_ES1968 is not set -# CONFIG_SND_MAESTRO3 is not set # CONFIG_SND_FM801 is not set +# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_HDSP is not set +# CONFIG_SND_HDSPM is not set # CONFIG_SND_ICE1712 is not set # CONFIG_SND_ICE1724 is not set # CONFIG_SND_INTEL8X0 is not set # CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_KORG1212 is not set +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +# CONFIG_SND_PCXHR is not set +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set # CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_TRIDENT is not set # CONFIG_SND_VIA82XX is not set # CONFIG_SND_VIA82XX_MODEM is not set # CONFIG_SND_VX222 is not set -# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_YMFPCI is not set # # ALSA PowerMac devices @@ -1201,13 +1156,16 @@ CONFIG_USB_STORAGE=m # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set +# CONFIG_USB_STORAGE_ALAUDA is not set # CONFIG_USB_STORAGE_ONETOUCH is not set +# CONFIG_USB_LIBUSUAL is not set # # USB Input Devices # CONFIG_USB_HID=y CONFIG_USB_HIDINPUT=y +# CONFIG_USB_HIDINPUT_POWERBOOK is not set # CONFIG_HID_FF is not set CONFIG_USB_HIDDEV=y # CONFIG_USB_AIPTEK is not set @@ -1221,6 +1179,7 @@ CONFIG_USB_HIDDEV=y # CONFIG_USB_YEALINK is not set # 
CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_ATI_REMOTE2 is not set # CONFIG_USB_KEYSPAN_REMOTE is not set # CONFIG_USB_APPLETOUCH is not set @@ -1307,6 +1266,10 @@ CONFIG_INFINIBAND_IPOIB=m # # +# EDAC - error detection and reporting (RAS) +# + +# # File systems # CONFIG_EXT2_FS=y @@ -1340,6 +1303,7 @@ CONFIG_XFS_EXPORT=y CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_XFS_RT is not set +# CONFIG_OCFS2_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y @@ -1379,6 +1343,7 @@ CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y # CONFIG_RELAYFS_FS is not set +# CONFIG_CONFIGFS_FS is not set # # Miscellaneous filesystems @@ -1449,6 +1414,7 @@ CONFIG_MSDOS_PARTITION=y # CONFIG_SGI_PARTITION is not set # CONFIG_ULTRIX_PARTITION is not set # CONFIG_SUN_PARTITION is not set +# CONFIG_KARMA_PARTITION is not set # CONFIG_EFI_PARTITION is not set # @@ -1504,10 +1470,6 @@ CONFIG_CRC32=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m -CONFIG_TEXTSEARCH=y -CONFIG_TEXTSEARCH_KMP=m -CONFIG_TEXTSEARCH_BM=m -CONFIG_TEXTSEARCH_FSM=m # # Instrumentation Support @@ -1520,18 +1482,20 @@ CONFIG_OPROFILE=y # Kernel hacking # # CONFIG_PRINTK_TIME is not set -CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +CONFIG_DEBUG_MUTEXES=y # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set +CONFIG_FORCED_INLINING=y # CONFIG_RCU_TORTURE_TEST is not set CONFIG_DEBUG_STACKOVERFLOW=y CONFIG_DEBUG_STACK_USAGE=y @@ -1540,6 +1504,11 @@ CONFIG_XMON=y # CONFIG_XMON_DEFAULT is not set CONFIG_IRQSTACKS=y CONFIG_BOOTX_TEXT=y +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set +# CONFIG_PPC_EARLY_DEBUG_G5 is not set +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set +# 
CONFIG_PPC_EARLY_DEBUG_MAPLE is not set +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set # # Security options Index: powerpc-git/arch/powerpc/configs/pseries_defconfig =================================================================== --- powerpc-git.orig/arch/powerpc/configs/pseries_defconfig +++ powerpc-git/arch/powerpc/configs/pseries_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.15-rc5 -# Tue Dec 20 15:59:40 2005 +# Linux kernel version: 2.6.16-rc2 +# Fri Feb 10 17:33:32 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -16,6 +16,10 @@ CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_PPC_OF=y +CONFIG_PPC_UDBG_16550=y +# CONFIG_GENERIC_TBSYNC is not set +# CONFIG_DEFAULT_UIMAGE is not set # # Processor support @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=128 # Code maturity level options # CONFIG_EXPERIMENTAL=y -CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 @@ -49,8 +52,6 @@ CONFIG_POSIX_MQUEUE=y CONFIG_SYSCTL=y CONFIG_AUDIT=y CONFIG_AUDITSYSCALL=y -CONFIG_HOTPLUG=y -CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_CPUSETS=y @@ -60,8 +61,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y +CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y @@ -70,8 +73,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 +CONFIG_SLAB=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 +# CONFIG_SLOB is not set # # Loadable module support @@ -113,7 +118,6 @@ CONFIG_PPC_PSERIES=y # CONFIG_PPC_PMAC is not set # CONFIG_PPC_MAPLE is not set # CONFIG_PPC_CELL is not set -CONFIG_PPC_OF=y CONFIG_XICS=y # CONFIG_U3_DART is not set CONFIG_MPIC=y @@ -123,8 +127,8 @@ CONFIG_RTAS_PROC=y CONFIG_RTAS_FLASH=m # CONFIG_MMIO_NVRAM is not set CONFIG_IBMVIO=y +# 
CONFIG_IBMEBUS is not set # CONFIG_PPC_MPC106 is not set -# CONFIG_GENERIC_TBSYNC is not set # CONFIG_CPU_FREQ is not set # CONFIG_WANT_EARLY_SERIAL is not set @@ -145,6 +149,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_IOMMU_VMERGE=y CONFIG_HOTPLUG_CPU=y CONFIG_KEXEC=y +# CONFIG_CRASH_DUMP is not set CONFIG_IRQ_ALL_CPUS=y CONFIG_PPC_SPLPAR=y CONFIG_EEH=y @@ -165,6 +170,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y CONFIG_SPARSEMEM_EXTREME=y # CONFIG_MEMORY_HOTPLUG is not set CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_MIGRATION=y CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y # CONFIG_PPC_64K_PAGES is not set CONFIG_SCHED_SMT=y @@ -209,6 +215,7 @@ CONFIG_NET=y # # Networking options # +# CONFIG_NETDEBUG is not set CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y @@ -248,6 +255,7 @@ CONFIG_NETFILTER=y CONFIG_NETFILTER_NETLINK=y CONFIG_NETFILTER_NETLINK_QUEUE=m CONFIG_NETFILTER_NETLINK_LOG=m +# CONFIG_NETFILTER_XTABLES is not set # # IP: Netfilter Configuration @@ -265,65 +273,6 @@ CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m # CONFIG_IP_NF_PPTP is not set CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -# CONFIG_IP_NF_MATCH_DCCP is not set -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_CONNBYTES=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_MATCH_STRING=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m 
-CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_TARGET_NFQUEUE=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_TTL=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m # # DCCP Configuration (EXPERIMENTAL) @@ -334,6 +283,11 @@ CONFIG_IP_NF_ARP_MANGLE=m # SCTP Configuration (EXPERIMENTAL) # # CONFIG_IP_SCTP is not set + +# +# TIPC Configuration (EXPERIMENTAL) +# +# CONFIG_TIPC is not set # CONFIG_ATM is not set # CONFIG_BRIDGE is not set # CONFIG_VLAN_8021Q is not set @@ -352,7 +306,6 @@ CONFIG_LLC=y # QoS and/or fair queueing # # CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y # # Network testing @@ -550,13 +503,7 @@ CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set -CONFIG_SCSI_QLA2XXX=y -CONFIG_SCSI_QLA21XX=m -CONFIG_SCSI_QLA22XX=m -CONFIG_SCSI_QLA2300=m -CONFIG_SCSI_QLA2322=m -CONFIG_SCSI_QLA6312=m -CONFIG_SCSI_QLA24XX=m +# CONFIG_SCSI_QLA_FC is not set CONFIG_SCSI_LPFC=m # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set @@ -678,6 +625,7 @@ CONFIG_E1000=y # CONFIG_R8169 is not set # CONFIG_SIS190 is not set # CONFIG_SKGE is not set +# CONFIG_SKY2 is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y @@ -803,6 +751,7 @@ CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=4 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # 
CONFIG_SERIAL_8250_EXTENDED is not set # @@ -909,6 +858,12 @@ CONFIG_I2C_ALGOBIT=y # CONFIG_I2C_DEBUG_CHIP is not set # +# SPI support +# +# CONFIG_SPI is not set +# CONFIG_SPI_MASTER is not set + +# # Dallas's 1-wire bus # # CONFIG_W1 is not set @@ -976,7 +931,6 @@ CONFIG_FB_RADEON_I2C=y # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set -# CONFIG_FB_CYBLA is not set # CONFIG_FB_TRIDENT is not set # CONFIG_FB_VIRTUAL is not set @@ -1061,12 +1015,15 @@ CONFIG_USB_STORAGE=y # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set +# CONFIG_USB_STORAGE_ALAUDA is not set +# CONFIG_USB_LIBUSUAL is not set # # USB Input Devices # CONFIG_USB_HID=y CONFIG_USB_HIDINPUT=y +# CONFIG_USB_HIDINPUT_POWERBOOK is not set # CONFIG_HID_FF is not set CONFIG_USB_HIDDEV=y # CONFIG_USB_AIPTEK is not set @@ -1080,6 +1037,7 @@ CONFIG_USB_HIDDEV=y # CONFIG_USB_YEALINK is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_ATI_REMOTE2 is not set # CONFIG_USB_KEYSPAN_REMOTE is not set # CONFIG_USB_APPLETOUCH is not set @@ -1167,6 +1125,10 @@ CONFIG_INFINIBAND_IPOIB=m # # +# EDAC - error detection and reporting (RAS) +# + +# # File systems # CONFIG_EXT2_FS=y @@ -1200,6 +1162,7 @@ CONFIG_XFS_EXPORT=y CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_XFS_RT is not set +# CONFIG_OCFS2_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y @@ -1240,6 +1203,7 @@ CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y # CONFIG_RELAYFS_FS is not set +# CONFIG_CONFIGFS_FS is not set # # Miscellaneous filesystems @@ -1351,10 +1315,6 @@ CONFIG_CRC32=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m -CONFIG_TEXTSEARCH=y -CONFIG_TEXTSEARCH_KMP=m -CONFIG_TEXTSEARCH_BM=m -CONFIG_TEXTSEARCH_FSM=m # # Instrumentation Support @@ -1367,18 +1327,20 @@ CONFIG_OPROFILE=y # Kernel hacking # # CONFIG_PRINTK_TIME is not set 
-CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +CONFIG_DEBUG_MUTEXES=y # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set +CONFIG_FORCED_INLINING=y # CONFIG_RCU_TORTURE_TEST is not set CONFIG_DEBUG_STACKOVERFLOW=y CONFIG_DEBUG_STACK_USAGE=y @@ -1387,6 +1349,11 @@ CONFIG_XMON=y CONFIG_XMON_DEFAULT=y CONFIG_IRQSTACKS=y # CONFIG_BOOTX_TEXT is not set +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set +# CONFIG_PPC_EARLY_DEBUG_G5 is not set +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set # # Security options From arnd at arndb.de Sun Feb 12 15:52:25 2006 From: arnd at arndb.de (Arnd Bergmann) Date: Sun, 12 Feb 2006 05:52:25 +0100 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <1139546116.5003.81.camel@localhost.localdomain> References: <2812322.110611139545275893.JavaMail.servlet@kundenserver> <1139546116.5003.81.camel@localhost.localdomain> Message-ID: <200602120552.26164.arnd@arndb.de> On Friday 10 February 2006 05:35, Benjamin Herrenschmidt wrote: > > > Doing it in the firmware sounds like the right solution to me. > > I would however not want to do that if the current firmware > > sets the wrong page sizes. > > > > I know that Hartmut wanted me to provide him with the right device > > tree information that he needs to create to say that the page > > size are 16M, 64k and 4k. Maybe we can find a combined solution > > for these problems. Using __setup_cpu_power4 should be ok. > > I don't completely understand your statement ... sorry The current firmware on the Cell blades does neither the setup of the HID6 register nor have the correct tables in the device tree. 
Since I'm still currently sitting in a garden in NZ instead of the
Böblingen lab, I can't find out what the HID6 power-on defaults
are. We might get away with just leaving the default there, but that
might prevent us from using 16M and/or 64k pages and there are
definitely some applications which depend on 16M hugetlb mappings
on Cell.

The two problems we are facing currently are:
- If HID6 defaults to disabling 16M large pages, the kernel will
  get the wrong information from the CPU features and applications
  that use it break. The firmware should add the setup of HID6
  _now_, but we also should be prepared for users of old firmware
  that want to upgrade their kernel without upgrading the firmware
  at the same time.
- We want to use 64k pages in the future, so the firmware needs to
  add the 'ibm,segment-page-sizes' property ASAP, preferably at
  the same time they start setting up HID6. I currently have a
  hack for the kernel to override that, but we're in the process
  of eliminating all the special hacks that won't make it into
  the mainline kernel.

> > We could probably do a fallback in the cell setup to see if
> > the properties are in the device tree and do our own HID6
> > setup stuff if not, normally expecting that the firmware settings
> > match the device tree.
>
> We should not touch HID6 at all ... we should assume the firmware set it
> appropriately and have set up matching page size entries in the
> device-tree. I don't think we need to support changing that value
> especially since the kernel doesn't quite support 1M large page sizes
> anyway.

Yes, 1M mappings are probably not of much use to us, and other OSs
already do whatever they like ;-).

> > Geoff, if your firmware does not already have the properties
> > for large page sizes, could you add them?
> >
> > Ben, could you point Hartmut (and maybe Geoff) to the documentation
> > for what the device tree needs to look like?
>
> I'm not sure we published that yet :) I would suggest looking at what
> the kernel does to parse these instead in hash_utils.c until I get a
> formal IBM approval for the spec to be published.

Then please try to at least send the spec or a link to Hartmut's IBM
internal address (hpenner at de.ibm.com). I already pointed him to the
linux code when it was initially merged, but he argued that reverse
engineering that code is not good enough to be sure to get the
property right and not having it in there is better than having incorrect
properties.

	Arnd <><

From benh at kernel.crashing.org Mon Feb 13 08:24:22 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 13 Feb 2006 08:24:22 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <200602120552.26164.arnd@arndb.de>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<1139546116.5003.81.camel@localhost.localdomain>
	<200602120552.26164.arnd@arndb.de>
Message-ID: <1139779462.5247.30.camel@localhost.localdomain>

> The current firmware on the Cell blades does neither the setup of
> the HID6 register nor have the correct tables in the device tree.
>
> Since I'm still currently sitting in a garden in NZ instead of the
> Böblingen lab, I can't find out what the HID6 power-on defaults
> are. We might get away with just leaving the default there, but that
> might prevent us from using 16M and/or 64k pages and there are
> definitely some applications which depend on 16M hugetlb mappings
> on Cell.

Yes, however, how widely distributed and "frozen" is this current
Cell firmware? I mean, do we really need to add a workaround to the
kernel instead of just fixing the firmware here?

> The two problems we are facing currently are:
> - If HID6 defaults to disabling 16M large pages, the kernel will
>   get the wrong information from the CPU features and applications
>   that use it break. The firmware should add the setup of HID6
>   _now_, but we also should be prepared for users of old firmware
>   that want to upgrade their kernel without upgrading the firmware
>   at the same time.

Do we really need to support old/broken firmware? It's not like we had
a released product all over the field...

> - We want to use 64k pages in the future, so the firmware needs to
>   add the 'ibm,segment-page-sizes' property ASAP, preferably at
>   the same time they start setting up HID6. I currently have a
>   hack for the kernel to override that, but we're in the process
>   of eliminating all the special hacks that won't make it into
>   the mainline kernel.

The only things you need are to have this property set and the new
ibm,pa-features, for which I need to dig out the latest spec.... The
problem is that the kernel will currently not enable 64k pages on any
processor due to the lack of a feature bit (intentionally) from the
cputable. That bit will be extracted from ibm,pa-features at least on
pSeries. It's the bit indicating that L=1 works for cache inhibited
mappings.

> Yes, 1M mappings are probably not of much use to us, and other OSs
> already do whatever they like ;-).

Sure. Note that the firmware can still set HID6 to 1M pages and put the
appropriate entries in the device-tree for 1M large pages. Linux won't
be able to use them as-is though but at least the device-tree info
will be sane. I don't want to enter a debate whether we should be able to
change HID6 etc... right now. It's more a firmware configuration issue
as far as I'm concerned.

> Then please try to at least send the spec or a link to Hartmut's IBM
> internal address (hpenner at de.ibm.com). I already pointed him to the
> linux code when it was initially merged, but he argued that reverse
> engineering that code is not good enough to be sure to get the
> property right and not having it in there is better than having incorrect
> properties.

Will do

Ben.
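
For readers following the 'ibm,segment-page-sizes' discussion above, here
is a minimal sketch of walking such a property. It assumes the cell layout
that the kernel's hash_utils parser appears to expect: for each segment
base page size, three cells <shift> <slbenc> <n>, followed by n pairs of
<actual-page-shift> <penc>. This is a reading of that parser, not the
unpublished spec the thread asks for. The struct and function names are
hypothetical, and the cells are assumed to be in host byte order already.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result record for illustration; not a kernel structure. */
struct seg_page_size {
	uint32_t shift;    /* log2 of the base page size, e.g. 12 for 4k */
	uint32_t slbenc;   /* SLB encoding cell for this base page size */
	int      has_penc; /* 1 if a pair with actual shift == shift was seen */
	uint32_t penc;     /* encoding cell taken from that matching pair */
};

/*
 * Walk a cell array assumed to be laid out as
 *     <shift> <slbenc> <n>  then n pairs of  <actual-shift> <penc>
 * repeated once per base page size.
 *
 * Returns the number of records written to out[], or -1 if the array
 * is truncated or out[] is too small.
 */
static int parse_seg_page_sizes(const uint32_t *prop, size_t ncells,
				struct seg_page_size *out, size_t max_out)
{
	size_t i = 0, n = 0;

	while (i < ncells) {
		uint32_t shift, slbenc, pairs;

		if (ncells - i < 3)
			return -1;	/* header of a record is cut short */
		shift  = prop[i];
		slbenc = prop[i + 1];
		pairs  = prop[i + 2];
		i += 3;

		if (pairs > (ncells - i) / 2)
			return -1;	/* not enough cells for the pairs */
		if (n == max_out)
			return -1;	/* caller's buffer is full */

		out[n].shift = shift;
		out[n].slbenc = slbenc;
		out[n].has_penc = 0;
		out[n].penc = 0;

		/* Keep only the pair whose actual size equals the base size. */
		for (uint32_t p = 0; p < pairs; p++, i += 2) {
			if (prop[i] == shift) {
				out[n].penc = prop[i + 1];
				out[n].has_penc = 1;
			}
		}
		n++;
	}
	return (int)n;
}
```

A caller would read the property from the cpu node, divide its byte length
by four to get the cell count, and pass both in. Rejecting a truncated
array outright, rather than using whatever parsed so far, follows the
point made in the thread that a missing property is better than an
incorrect one.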
From benh at kernel.crashing.org Mon Feb 13 09:11:44 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 13 Feb 2006 09:11:44 +1100 Subject: [PATCH] Update {g5,pseries,ppc64}_defconfig In-Reply-To: <20060210234903.GB4795@pb15.lixom.net> References: <20060210234903.GB4795@pb15.lixom.net> Message-ID: <1139782304.5247.42.camel@localhost.localdomain> On Fri, 2006-02-10 at 17:49 -0600, Olof Johansson wrote: > Hi, > > For powerpc.git (post-2.6.16): > > Update defconfigs for g5, pseries and generic ppc64. Default choices > for everything, with the following exceptions: > > * Enable WINDFARM_PM112 on g5 and ppc64. > * Increase CONFIG_NR_CPUS to 4 on g5_defconfig You probably also want to make tg3 built-in... Ben. > > Signed-off-by: Olof Johansson > > Index: powerpc-git/arch/powerpc/configs/g5_defconfig > =================================================================== > --- powerpc-git.orig/arch/powerpc/configs/g5_defconfig > +++ powerpc-git/arch/powerpc/configs/g5_defconfig > @@ -1,7 +1,7 @@ > # > # Automatically generated make config: don't edit > -# Linux kernel version: 2.6.15-rc5 > -# Tue Dec 20 15:59:30 2005 > +# Linux kernel version: 2.6.16-rc2 > +# Fri Feb 10 17:33:08 2006 > # > CONFIG_PPC64=y > CONFIG_64BIT=y > @@ -16,6 +16,10 @@ CONFIG_COMPAT=y > CONFIG_SYSVIPC_COMPAT=y > CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y > CONFIG_ARCH_MAY_HAVE_PC_FDC=y > +CONFIG_PPC_OF=y > +# CONFIG_PPC_UDBG_16550 is not set > +CONFIG_GENERIC_TBSYNC=y > +# CONFIG_DEFAULT_UIMAGE is not set > > # > # Processor support > @@ -26,13 +30,12 @@ CONFIG_PPC_FPU=y > CONFIG_ALTIVEC=y > CONFIG_PPC_STD_MMU=y > CONFIG_SMP=y > -CONFIG_NR_CPUS=2 > +CONFIG_NR_CPUS=4 > > # > # Code maturity level options > # > CONFIG_EXPERIMENTAL=y > -CONFIG_CLEAN_COMPILE=y > CONFIG_LOCK_KERNEL=y > CONFIG_INIT_ENV_ARG_LIMIT=32 > > @@ -47,8 +50,6 @@ CONFIG_POSIX_MQUEUE=y > # CONFIG_BSD_PROCESS_ACCT is not set > CONFIG_SYSCTL=y > # CONFIG_AUDIT is not set > -CONFIG_HOTPLUG=y > 
-CONFIG_KOBJECT_UEVENT=y > CONFIG_IKCONFIG=y > CONFIG_IKCONFIG_PROC=y > # CONFIG_CPUSETS is not set > @@ -58,8 +59,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y > CONFIG_KALLSYMS=y > # CONFIG_KALLSYMS_ALL is not set > # CONFIG_KALLSYMS_EXTRA_PASS is not set > +CONFIG_HOTPLUG=y > CONFIG_PRINTK=y > CONFIG_BUG=y > +CONFIG_ELF_CORE=y > CONFIG_BASE_FULL=y > CONFIG_FUTEX=y > CONFIG_EPOLL=y > @@ -68,8 +71,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 > CONFIG_CC_ALIGN_LABELS=0 > CONFIG_CC_ALIGN_LOOPS=0 > CONFIG_CC_ALIGN_JUMPS=0 > +CONFIG_SLAB=y > # CONFIG_TINY_SHMEM is not set > CONFIG_BASE_SMALL=0 > +# CONFIG_SLOB is not set > > # > # Loadable module support > @@ -112,13 +117,12 @@ CONFIG_PPC_PMAC=y > CONFIG_PPC_PMAC64=y > # CONFIG_PPC_MAPLE is not set > # CONFIG_PPC_CELL is not set > -CONFIG_PPC_OF=y > CONFIG_U3_DART=y > CONFIG_MPIC=y > # CONFIG_PPC_RTAS is not set > # CONFIG_MMIO_NVRAM is not set > +CONFIG_MPIC_BROKEN_U3=y > # CONFIG_PPC_MPC106 is not set > -CONFIG_GENERIC_TBSYNC=y > CONFIG_CPU_FREQ=y > CONFIG_CPU_FREQ_TABLE=y > # CONFIG_CPU_FREQ_DEBUG is not set > @@ -151,6 +155,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 > CONFIG_IOMMU_VMERGE=y > # CONFIG_HOTPLUG_CPU is not set > CONFIG_KEXEC=y > +# CONFIG_CRASH_DUMP is not set > CONFIG_IRQ_ALL_CPUS=y > # CONFIG_NUMA is not set > CONFIG_ARCH_SELECT_MEMORY_MODEL=y > @@ -202,6 +207,7 @@ CONFIG_NET=y > # > # Networking options > # > +# CONFIG_NETDEBUG is not set > CONFIG_PACKET=y > # CONFIG_PACKET_MMAP is not set > CONFIG_UNIX=y > @@ -239,6 +245,7 @@ CONFIG_NETFILTER=y > # Core Netfilter Configuration > # > # CONFIG_NETFILTER_NETLINK is not set > +# CONFIG_NETFILTER_XTABLES is not set > > # > # IP: Netfilter Configuration > @@ -255,65 +262,6 @@ CONFIG_IP_NF_TFTP=m > CONFIG_IP_NF_AMANDA=m > # CONFIG_IP_NF_PPTP is not set > CONFIG_IP_NF_QUEUE=m > -CONFIG_IP_NF_IPTABLES=m > -CONFIG_IP_NF_MATCH_LIMIT=m > -CONFIG_IP_NF_MATCH_IPRANGE=m > -CONFIG_IP_NF_MATCH_MAC=m > -CONFIG_IP_NF_MATCH_PKTTYPE=m > -CONFIG_IP_NF_MATCH_MARK=m > -CONFIG_IP_NF_MATCH_MULTIPORT=m 
> -CONFIG_IP_NF_MATCH_TOS=m > -CONFIG_IP_NF_MATCH_RECENT=m > -CONFIG_IP_NF_MATCH_ECN=m > -CONFIG_IP_NF_MATCH_DSCP=m > -CONFIG_IP_NF_MATCH_AH_ESP=m > -CONFIG_IP_NF_MATCH_LENGTH=m > -CONFIG_IP_NF_MATCH_TTL=m > -CONFIG_IP_NF_MATCH_TCPMSS=m > -CONFIG_IP_NF_MATCH_HELPER=m > -CONFIG_IP_NF_MATCH_STATE=m > -CONFIG_IP_NF_MATCH_CONNTRACK=m > -CONFIG_IP_NF_MATCH_OWNER=m > -CONFIG_IP_NF_MATCH_ADDRTYPE=m > -CONFIG_IP_NF_MATCH_REALM=m > -CONFIG_IP_NF_MATCH_SCTP=m > -# CONFIG_IP_NF_MATCH_DCCP is not set > -CONFIG_IP_NF_MATCH_COMMENT=m > -CONFIG_IP_NF_MATCH_CONNMARK=m > -CONFIG_IP_NF_MATCH_CONNBYTES=m > -CONFIG_IP_NF_MATCH_HASHLIMIT=m > -CONFIG_IP_NF_MATCH_STRING=m > -CONFIG_IP_NF_FILTER=m > -CONFIG_IP_NF_TARGET_REJECT=m > -CONFIG_IP_NF_TARGET_LOG=m > -CONFIG_IP_NF_TARGET_ULOG=m > -CONFIG_IP_NF_TARGET_TCPMSS=m > -CONFIG_IP_NF_TARGET_NFQUEUE=m > -CONFIG_IP_NF_NAT=m > -CONFIG_IP_NF_NAT_NEEDED=y > -CONFIG_IP_NF_TARGET_MASQUERADE=m > -CONFIG_IP_NF_TARGET_REDIRECT=m > -CONFIG_IP_NF_TARGET_NETMAP=m > -CONFIG_IP_NF_TARGET_SAME=m > -CONFIG_IP_NF_NAT_SNMP_BASIC=m > -CONFIG_IP_NF_NAT_IRC=m > -CONFIG_IP_NF_NAT_FTP=m > -CONFIG_IP_NF_NAT_TFTP=m > -CONFIG_IP_NF_NAT_AMANDA=m > -CONFIG_IP_NF_MANGLE=m > -CONFIG_IP_NF_TARGET_TOS=m > -CONFIG_IP_NF_TARGET_ECN=m > -CONFIG_IP_NF_TARGET_DSCP=m > -CONFIG_IP_NF_TARGET_MARK=m > -CONFIG_IP_NF_TARGET_CLASSIFY=m > -CONFIG_IP_NF_TARGET_TTL=m > -CONFIG_IP_NF_TARGET_CONNMARK=m > -CONFIG_IP_NF_TARGET_CLUSTERIP=m > -CONFIG_IP_NF_RAW=m > -CONFIG_IP_NF_TARGET_NOTRACK=m > -CONFIG_IP_NF_ARPTABLES=m > -CONFIG_IP_NF_ARPFILTER=m > -CONFIG_IP_NF_ARP_MANGLE=m > > # > # DCCP Configuration (EXPERIMENTAL) > @@ -324,6 +272,11 @@ CONFIG_IP_NF_ARP_MANGLE=m > # SCTP Configuration (EXPERIMENTAL) > # > # CONFIG_IP_SCTP is not set > + > +# > +# TIPC Configuration (EXPERIMENTAL) > +# > +# CONFIG_TIPC is not set > # CONFIG_ATM is not set > # CONFIG_BRIDGE is not set > # CONFIG_VLAN_8021Q is not set > @@ -342,7 +295,6 @@ CONFIG_LLC=y > # QoS and/or fair queueing > # > # 
CONFIG_NET_SCHED is not set > -CONFIG_NET_CLS_ROUTE=y > > # > # Network testing > @@ -545,13 +497,7 @@ CONFIG_SCSI_SATA_SVW=y > # CONFIG_SCSI_IPR is not set > # CONFIG_SCSI_QLOGIC_FC is not set > # CONFIG_SCSI_QLOGIC_1280 is not set > -CONFIG_SCSI_QLA2XXX=y > -# CONFIG_SCSI_QLA21XX is not set > -# CONFIG_SCSI_QLA22XX is not set > -# CONFIG_SCSI_QLA2300 is not set > -# CONFIG_SCSI_QLA2322 is not set > -# CONFIG_SCSI_QLA6312 is not set > -# CONFIG_SCSI_QLA24XX is not set > +# CONFIG_SCSI_QLA_FC is not set > # CONFIG_SCSI_LPFC is not set > # CONFIG_SCSI_DC395x is not set > # CONFIG_SCSI_DC390T is not set > @@ -614,7 +560,6 @@ CONFIG_IEEE1394_SBP2=m > CONFIG_IEEE1394_ETH1394=m > CONFIG_IEEE1394_DV1394=m > CONFIG_IEEE1394_RAWIO=y > -# CONFIG_IEEE1394_CMP is not set > > # > # I2O device support > @@ -630,6 +575,7 @@ CONFIG_THERM_PM72=y > CONFIG_WINDFARM=y > CONFIG_WINDFARM_PM81=y > CONFIG_WINDFARM_PM91=y > +CONFIG_WINDFARM_PM112=y > > # > # Network device support > @@ -682,6 +628,7 @@ CONFIG_E1000=y > # CONFIG_R8169 is not set > # CONFIG_SIS190 is not set > # CONFIG_SKGE is not set > +# CONFIG_SKY2 is not set > # CONFIG_SK98LIN is not set > CONFIG_TIGON3=m > # CONFIG_BNX2 is not set > @@ -861,8 +808,7 @@ CONFIG_I2C_ALGOBIT=y > # CONFIG_I2C_I801 is not set > # CONFIG_I2C_I810 is not set > # CONFIG_I2C_PIIX4 is not set > -CONFIG_I2C_KEYWEST=y > -CONFIG_I2C_PMAC_SMU=y > +CONFIG_I2C_POWERMAC=y > # CONFIG_I2C_NFORCE2 is not set > # CONFIG_I2C_PARPORT_LIGHT is not set > # CONFIG_I2C_PROSAVAGE is not set > @@ -895,6 +841,12 @@ CONFIG_I2C_PMAC_SMU=y > # CONFIG_I2C_DEBUG_CHIP is not set > > # > +# SPI support > +# > +# CONFIG_SPI is not set > +# CONFIG_SPI_MASTER is not set > + > +# > # Dallas's 1-wire bus > # > # CONFIG_W1 is not set > @@ -961,7 +913,6 @@ CONFIG_FB_RADEON_I2C=y > # CONFIG_FB_KYRO is not set > # CONFIG_FB_3DFX is not set > # CONFIG_FB_VOODOO1 is not set > -# CONFIG_FB_CYBLA is not set > # CONFIG_FB_TRIDENT is not set > # CONFIG_FB_VIRTUAL is not set > > @@ 
-1008,9 +959,10 @@ CONFIG_SND_OSSEMUL=y > CONFIG_SND_MIXER_OSS=m > CONFIG_SND_PCM_OSS=m > CONFIG_SND_SEQUENCER_OSS=y > +# CONFIG_SND_DYNAMIC_MINORS is not set > +CONFIG_SND_SUPPORT_OLD_API=y > # CONFIG_SND_VERBOSE_PRINTK is not set > # CONFIG_SND_DEBUG is not set > -CONFIG_SND_GENERIC_DRIVER=y > > # > # Generic devices > @@ -1024,6 +976,8 @@ CONFIG_SND_GENERIC_DRIVER=y > # > # PCI devices > # > +# CONFIG_SND_AD1889 is not set > +# CONFIG_SND_ALS4000 is not set > # CONFIG_SND_ALI5451 is not set > # CONFIG_SND_ATIIXP is not set > # CONFIG_SND_ATIIXP_MODEM is not set > @@ -1032,39 +986,38 @@ CONFIG_SND_GENERIC_DRIVER=y > # CONFIG_SND_AU8830 is not set > # CONFIG_SND_AZT3328 is not set > # CONFIG_SND_BT87X is not set > -# CONFIG_SND_CS46XX is not set > +# CONFIG_SND_CA0106 is not set > +# CONFIG_SND_CMIPCI is not set > # CONFIG_SND_CS4281 is not set > +# CONFIG_SND_CS46XX is not set > # CONFIG_SND_EMU10K1 is not set > # CONFIG_SND_EMU10K1X is not set > -# CONFIG_SND_CA0106 is not set > -# CONFIG_SND_KORG1212 is not set > -# CONFIG_SND_MIXART is not set > -# CONFIG_SND_NM256 is not set > -# CONFIG_SND_RME32 is not set > -# CONFIG_SND_RME96 is not set > -# CONFIG_SND_RME9652 is not set > -# CONFIG_SND_HDSP is not set > -# CONFIG_SND_HDSPM is not set > -# CONFIG_SND_TRIDENT is not set > -# CONFIG_SND_YMFPCI is not set > -# CONFIG_SND_AD1889 is not set > -# CONFIG_SND_ALS4000 is not set > -# CONFIG_SND_CMIPCI is not set > # CONFIG_SND_ENS1370 is not set > # CONFIG_SND_ENS1371 is not set > # CONFIG_SND_ES1938 is not set > # CONFIG_SND_ES1968 is not set > -# CONFIG_SND_MAESTRO3 is not set > # CONFIG_SND_FM801 is not set > +# CONFIG_SND_HDA_INTEL is not set > +# CONFIG_SND_HDSP is not set > +# CONFIG_SND_HDSPM is not set > # CONFIG_SND_ICE1712 is not set > # CONFIG_SND_ICE1724 is not set > # CONFIG_SND_INTEL8X0 is not set > # CONFIG_SND_INTEL8X0M is not set > +# CONFIG_SND_KORG1212 is not set > +# CONFIG_SND_MAESTRO3 is not set > +# CONFIG_SND_MIXART is not set > +# 
CONFIG_SND_NM256 is not set > +# CONFIG_SND_PCXHR is not set > +# CONFIG_SND_RME32 is not set > +# CONFIG_SND_RME96 is not set > +# CONFIG_SND_RME9652 is not set > # CONFIG_SND_SONICVIBES is not set > +# CONFIG_SND_TRIDENT is not set > # CONFIG_SND_VIA82XX is not set > # CONFIG_SND_VIA82XX_MODEM is not set > # CONFIG_SND_VX222 is not set > -# CONFIG_SND_HDA_INTEL is not set > +# CONFIG_SND_YMFPCI is not set > > # > # ALSA PowerMac devices > @@ -1136,13 +1089,16 @@ CONFIG_USB_STORAGE_DPCM=y > CONFIG_USB_STORAGE_SDDR09=y > CONFIG_USB_STORAGE_SDDR55=y > CONFIG_USB_STORAGE_JUMPSHOT=y > +# CONFIG_USB_STORAGE_ALAUDA is not set > # CONFIG_USB_STORAGE_ONETOUCH is not set > +# CONFIG_USB_LIBUSUAL is not set > > # > # USB Input Devices > # > CONFIG_USB_HID=y > CONFIG_USB_HIDINPUT=y > +# CONFIG_USB_HIDINPUT_POWERBOOK is not set > CONFIG_HID_FF=y > CONFIG_HID_PID=y > CONFIG_LOGITECH_FF=y > @@ -1159,6 +1115,7 @@ CONFIG_USB_HIDDEV=y > # CONFIG_USB_YEALINK is not set > # CONFIG_USB_XPAD is not set > # CONFIG_USB_ATI_REMOTE is not set > +# CONFIG_USB_ATI_REMOTE2 is not set > # CONFIG_USB_KEYSPAN_REMOTE is not set > # CONFIG_USB_APPLETOUCH is not set > > @@ -1207,6 +1164,7 @@ CONFIG_USB_SERIAL_GENERIC=y > # CONFIG_USB_SERIAL_AIRPRIME is not set > # CONFIG_USB_SERIAL_ANYDATA is not set > CONFIG_USB_SERIAL_BELKIN=m > +# CONFIG_USB_SERIAL_WHITEHEAT is not set > CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m > # CONFIG_USB_SERIAL_CP2101 is not set > CONFIG_USB_SERIAL_CYPRESS_M8=m > @@ -1288,6 +1246,10 @@ CONFIG_USB_EZUSB=y > # > > # > +# EDAC - error detection and reporting (RAS) > +# > + > +# > # File systems > # > CONFIG_EXT2_FS=y > @@ -1317,6 +1279,7 @@ CONFIG_XFS_EXPORT=y > CONFIG_XFS_SECURITY=y > CONFIG_XFS_POSIX_ACL=y > # CONFIG_XFS_RT is not set > +# CONFIG_OCFS2_FS is not set > # CONFIG_MINIX_FS is not set > # CONFIG_ROMFS_FS is not set > CONFIG_INOTIFY=y > @@ -1357,6 +1320,7 @@ CONFIG_HUGETLBFS=y > CONFIG_HUGETLB_PAGE=y > CONFIG_RAMFS=y > # CONFIG_RELAYFS_FS is not set > +# 
CONFIG_CONFIGFS_FS is not set > > # > # Miscellaneous filesystems > @@ -1426,6 +1390,7 @@ CONFIG_MSDOS_PARTITION=y > # CONFIG_SGI_PARTITION is not set > # CONFIG_ULTRIX_PARTITION is not set > # CONFIG_SUN_PARTITION is not set > +# CONFIG_KARMA_PARTITION is not set > # CONFIG_EFI_PARTITION is not set > > # > @@ -1481,10 +1446,6 @@ CONFIG_CRC32=y > CONFIG_LIBCRC32C=m > CONFIG_ZLIB_INFLATE=y > CONFIG_ZLIB_DEFLATE=m > -CONFIG_TEXTSEARCH=y > -CONFIG_TEXTSEARCH_KMP=m > -CONFIG_TEXTSEARCH_BM=m > -CONFIG_TEXTSEARCH_FSM=m > > # > # Instrumentation Support > @@ -1497,24 +1458,31 @@ CONFIG_OPROFILE=y > # Kernel hacking > # > # CONFIG_PRINTK_TIME is not set > -CONFIG_DEBUG_KERNEL=y > CONFIG_MAGIC_SYSRQ=y > +CONFIG_DEBUG_KERNEL=y > CONFIG_LOG_BUF_SHIFT=17 > CONFIG_DETECT_SOFTLOCKUP=y > # CONFIG_SCHEDSTATS is not set > # CONFIG_DEBUG_SLAB is not set > +CONFIG_DEBUG_MUTEXES=y > # CONFIG_DEBUG_SPINLOCK is not set > # CONFIG_DEBUG_SPINLOCK_SLEEP is not set > # CONFIG_DEBUG_KOBJECT is not set > # CONFIG_DEBUG_INFO is not set > CONFIG_DEBUG_FS=y > # CONFIG_DEBUG_VM is not set > +CONFIG_FORCED_INLINING=y > # CONFIG_RCU_TORTURE_TEST is not set > # CONFIG_DEBUG_STACKOVERFLOW is not set > # CONFIG_DEBUG_STACK_USAGE is not set > # CONFIG_DEBUGGER is not set > CONFIG_IRQSTACKS=y > CONFIG_BOOTX_TEXT=y > +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set > +# CONFIG_PPC_EARLY_DEBUG_G5 is not set > +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set > +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set > +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set > > # > # Security options > Index: powerpc-git/arch/powerpc/configs/ppc64_defconfig > =================================================================== > --- powerpc-git.orig/arch/powerpc/configs/ppc64_defconfig > +++ powerpc-git/arch/powerpc/configs/ppc64_defconfig > @@ -1,7 +1,7 @@ > # > # Automatically generated make config: don't edit > -# Linux kernel version: 2.6.15-rc5 > -# Tue Dec 20 15:59:38 2005 > +# Linux kernel version: 2.6.16-rc2 > +# Fri Feb 10 17:32:14 2006 
> # > CONFIG_PPC64=y > CONFIG_64BIT=y > @@ -16,6 +16,10 @@ CONFIG_COMPAT=y > CONFIG_SYSVIPC_COMPAT=y > CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y > CONFIG_ARCH_MAY_HAVE_PC_FDC=y > +CONFIG_PPC_OF=y > +CONFIG_PPC_UDBG_16550=y > +CONFIG_GENERIC_TBSYNC=y > +# CONFIG_DEFAULT_UIMAGE is not set > > # > # Processor support > @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=32 > # Code maturity level options > # > CONFIG_EXPERIMENTAL=y > -CONFIG_CLEAN_COMPILE=y > CONFIG_LOCK_KERNEL=y > CONFIG_INIT_ENV_ARG_LIMIT=32 > > @@ -48,8 +51,6 @@ CONFIG_POSIX_MQUEUE=y > # CONFIG_BSD_PROCESS_ACCT is not set > CONFIG_SYSCTL=y > # CONFIG_AUDIT is not set > -CONFIG_HOTPLUG=y > -CONFIG_KOBJECT_UEVENT=y > CONFIG_IKCONFIG=y > CONFIG_IKCONFIG_PROC=y > CONFIG_CPUSETS=y > @@ -59,8 +60,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y > CONFIG_KALLSYMS=y > CONFIG_KALLSYMS_ALL=y > # CONFIG_KALLSYMS_EXTRA_PASS is not set > +CONFIG_HOTPLUG=y > CONFIG_PRINTK=y > CONFIG_BUG=y > +CONFIG_ELF_CORE=y > CONFIG_BASE_FULL=y > CONFIG_FUTEX=y > CONFIG_EPOLL=y > @@ -69,8 +72,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 > CONFIG_CC_ALIGN_LABELS=0 > CONFIG_CC_ALIGN_LOOPS=0 > CONFIG_CC_ALIGN_JUMPS=0 > +CONFIG_SLAB=y > # CONFIG_TINY_SHMEM is not set > CONFIG_BASE_SMALL=0 > +# CONFIG_SLOB is not set > > # > # Loadable module support > @@ -113,7 +118,6 @@ CONFIG_PPC_PMAC=y > CONFIG_PPC_PMAC64=y > CONFIG_PPC_MAPLE=y > # CONFIG_PPC_CELL is not set > -CONFIG_PPC_OF=y > CONFIG_XICS=y > CONFIG_U3_DART=y > CONFIG_MPIC=y > @@ -124,8 +128,8 @@ CONFIG_RTAS_FLASH=m > # CONFIG_MMIO_NVRAM is not set > CONFIG_MPIC_BROKEN_U3=y > CONFIG_IBMVIO=y > +# CONFIG_IBMEBUS is not set > # CONFIG_PPC_MPC106 is not set > -CONFIG_GENERIC_TBSYNC=y > CONFIG_CPU_FREQ=y > CONFIG_CPU_FREQ_TABLE=y > # CONFIG_CPU_FREQ_DEBUG is not set > @@ -158,6 +162,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 > CONFIG_IOMMU_VMERGE=y > CONFIG_HOTPLUG_CPU=y > CONFIG_KEXEC=y > +# CONFIG_CRASH_DUMP is not set > CONFIG_IRQ_ALL_CPUS=y > CONFIG_PPC_SPLPAR=y > CONFIG_EEH=y > @@ -178,6 +183,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y 
> CONFIG_SPARSEMEM_EXTREME=y > # CONFIG_MEMORY_HOTPLUG is not set > CONFIG_SPLIT_PTLOCK_CPUS=4 > +CONFIG_MIGRATION=y > # CONFIG_PPC_64K_PAGES is not set > # CONFIG_SCHED_SMT is not set > CONFIG_PROC_DEVICETREE=y > @@ -221,6 +227,7 @@ CONFIG_NET=y > # > # Networking options > # > +# CONFIG_NETDEBUG is not set > CONFIG_PACKET=y > # CONFIG_PACKET_MMAP is not set > CONFIG_UNIX=y > @@ -260,6 +267,7 @@ CONFIG_NETFILTER=y > CONFIG_NETFILTER_NETLINK=y > CONFIG_NETFILTER_NETLINK_QUEUE=m > CONFIG_NETFILTER_NETLINK_LOG=m > +# CONFIG_NETFILTER_XTABLES is not set > > # > # IP: Netfilter Configuration > @@ -277,65 +285,6 @@ CONFIG_IP_NF_TFTP=m > CONFIG_IP_NF_AMANDA=m > # CONFIG_IP_NF_PPTP is not set > CONFIG_IP_NF_QUEUE=m > -CONFIG_IP_NF_IPTABLES=m > -CONFIG_IP_NF_MATCH_LIMIT=m > -CONFIG_IP_NF_MATCH_IPRANGE=m > -CONFIG_IP_NF_MATCH_MAC=m > -CONFIG_IP_NF_MATCH_PKTTYPE=m > -CONFIG_IP_NF_MATCH_MARK=m > -CONFIG_IP_NF_MATCH_MULTIPORT=m > -CONFIG_IP_NF_MATCH_TOS=m > -CONFIG_IP_NF_MATCH_RECENT=m > -CONFIG_IP_NF_MATCH_ECN=m > -CONFIG_IP_NF_MATCH_DSCP=m > -CONFIG_IP_NF_MATCH_AH_ESP=m > -CONFIG_IP_NF_MATCH_LENGTH=m > -CONFIG_IP_NF_MATCH_TTL=m > -CONFIG_IP_NF_MATCH_TCPMSS=m > -CONFIG_IP_NF_MATCH_HELPER=m > -CONFIG_IP_NF_MATCH_STATE=m > -CONFIG_IP_NF_MATCH_CONNTRACK=m > -CONFIG_IP_NF_MATCH_OWNER=m > -CONFIG_IP_NF_MATCH_ADDRTYPE=m > -CONFIG_IP_NF_MATCH_REALM=m > -CONFIG_IP_NF_MATCH_SCTP=m > -CONFIG_IP_NF_MATCH_DCCP=m > -CONFIG_IP_NF_MATCH_COMMENT=m > -CONFIG_IP_NF_MATCH_CONNMARK=m > -CONFIG_IP_NF_MATCH_CONNBYTES=m > -CONFIG_IP_NF_MATCH_HASHLIMIT=m > -CONFIG_IP_NF_MATCH_STRING=m > -CONFIG_IP_NF_FILTER=m > -CONFIG_IP_NF_TARGET_REJECT=m > -CONFIG_IP_NF_TARGET_LOG=m > -CONFIG_IP_NF_TARGET_ULOG=m > -CONFIG_IP_NF_TARGET_TCPMSS=m > -CONFIG_IP_NF_TARGET_NFQUEUE=m > -CONFIG_IP_NF_NAT=m > -CONFIG_IP_NF_NAT_NEEDED=y > -CONFIG_IP_NF_TARGET_MASQUERADE=m > -CONFIG_IP_NF_TARGET_REDIRECT=m > -CONFIG_IP_NF_TARGET_NETMAP=m > -CONFIG_IP_NF_TARGET_SAME=m > -CONFIG_IP_NF_NAT_SNMP_BASIC=m > -CONFIG_IP_NF_NAT_IRC=m 
> -CONFIG_IP_NF_NAT_FTP=m > -CONFIG_IP_NF_NAT_TFTP=m > -CONFIG_IP_NF_NAT_AMANDA=m > -CONFIG_IP_NF_MANGLE=m > -CONFIG_IP_NF_TARGET_TOS=m > -CONFIG_IP_NF_TARGET_ECN=m > -CONFIG_IP_NF_TARGET_DSCP=m > -CONFIG_IP_NF_TARGET_MARK=m > -CONFIG_IP_NF_TARGET_CLASSIFY=m > -CONFIG_IP_NF_TARGET_TTL=m > -CONFIG_IP_NF_TARGET_CONNMARK=m > -CONFIG_IP_NF_TARGET_CLUSTERIP=m > -CONFIG_IP_NF_RAW=m > -CONFIG_IP_NF_TARGET_NOTRACK=m > -CONFIG_IP_NF_ARPTABLES=m > -CONFIG_IP_NF_ARPFILTER=m > -CONFIG_IP_NF_ARP_MANGLE=m > > # > # DCCP Configuration (EXPERIMENTAL) > @@ -346,6 +295,11 @@ CONFIG_IP_NF_ARP_MANGLE=m > # SCTP Configuration (EXPERIMENTAL) > # > # CONFIG_IP_SCTP is not set > + > +# > +# TIPC Configuration (EXPERIMENTAL) > +# > +# CONFIG_TIPC is not set > # CONFIG_ATM is not set > # CONFIG_BRIDGE is not set > # CONFIG_VLAN_8021Q is not set > @@ -364,7 +318,6 @@ CONFIG_LLC=y > # QoS and/or fair queueing > # > # CONFIG_NET_SCHED is not set > -CONFIG_NET_CLS_ROUTE=y > > # > # Network testing > @@ -572,13 +525,7 @@ CONFIG_SCSI_IPR_TRACE=y > CONFIG_SCSI_IPR_DUMP=y > # CONFIG_SCSI_QLOGIC_FC is not set > # CONFIG_SCSI_QLOGIC_1280 is not set > -CONFIG_SCSI_QLA2XXX=y > -CONFIG_SCSI_QLA21XX=m > -CONFIG_SCSI_QLA22XX=m > -CONFIG_SCSI_QLA2300=m > -CONFIG_SCSI_QLA2322=m > -CONFIG_SCSI_QLA6312=m > -CONFIG_SCSI_QLA24XX=m > +# CONFIG_SCSI_QLA_FC is not set > CONFIG_SCSI_LPFC=m > # CONFIG_SCSI_DC395x is not set > # CONFIG_SCSI_DC390T is not set > @@ -642,8 +589,6 @@ CONFIG_IEEE1394_SBP2=m > CONFIG_IEEE1394_ETH1394=m > CONFIG_IEEE1394_DV1394=m > CONFIG_IEEE1394_RAWIO=y > -CONFIG_IEEE1394_CMP=m > -CONFIG_IEEE1394_AMDTP=m > > # > # I2O device support > @@ -659,6 +604,7 @@ CONFIG_THERM_PM72=y > CONFIG_WINDFARM=y > CONFIG_WINDFARM_PM81=y > CONFIG_WINDFARM_PM91=y > +CONFIG_WINDFARM_PM112=y > > # > # Network device support > @@ -731,6 +677,7 @@ CONFIG_E1000=y > # CONFIG_R8169 is not set > # CONFIG_SIS190 is not set > # CONFIG_SKGE is not set > +# CONFIG_SKY2 is not set > # CONFIG_SK98LIN is not set > # 
CONFIG_VIA_VELOCITY is not set > CONFIG_TIGON3=y > @@ -853,6 +800,7 @@ CONFIG_HW_CONSOLE=y > CONFIG_SERIAL_8250=y > CONFIG_SERIAL_8250_CONSOLE=y > CONFIG_SERIAL_8250_NR_UARTS=4 > +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 > # CONFIG_SERIAL_8250_EXTENDED is not set > > # > @@ -880,6 +828,7 @@ CONFIG_HVCS=m > # CONFIG_WATCHDOG is not set > # CONFIG_RTC is not set > CONFIG_GEN_RTC=y > +# CONFIG_GEN_RTC_X is not set > # CONFIG_DTLK is not set > # CONFIG_R3964 is not set > # CONFIG_APPLICOM is not set > @@ -923,8 +872,7 @@ CONFIG_I2C_AMD8111=y > # CONFIG_I2C_I801 is not set > # CONFIG_I2C_I810 is not set > # CONFIG_I2C_PIIX4 is not set > -CONFIG_I2C_KEYWEST=y > -CONFIG_I2C_PMAC_SMU=y > +CONFIG_I2C_POWERMAC=y > # CONFIG_I2C_NFORCE2 is not set > # CONFIG_I2C_PARPORT_LIGHT is not set > # CONFIG_I2C_PROSAVAGE is not set > @@ -957,6 +905,12 @@ CONFIG_I2C_PMAC_SMU=y > # CONFIG_I2C_DEBUG_CHIP is not set > > # > +# SPI support > +# > +# CONFIG_SPI is not set > +# CONFIG_SPI_MASTER is not set > + > +# > # Dallas's 1-wire bus > # > # CONFIG_W1 is not set > @@ -1028,7 +982,6 @@ CONFIG_FB_RADEON_I2C=y > # CONFIG_FB_KYRO is not set > # CONFIG_FB_3DFX is not set > # CONFIG_FB_VOODOO1 is not set > -# CONFIG_FB_CYBLA is not set > # CONFIG_FB_TRIDENT is not set > # CONFIG_FB_VIRTUAL is not set > > @@ -1073,9 +1026,10 @@ CONFIG_SND_OSSEMUL=y > CONFIG_SND_MIXER_OSS=m > CONFIG_SND_PCM_OSS=m > CONFIG_SND_SEQUENCER_OSS=y > +# CONFIG_SND_DYNAMIC_MINORS is not set > +CONFIG_SND_SUPPORT_OLD_API=y > # CONFIG_SND_VERBOSE_PRINTK is not set > # CONFIG_SND_DEBUG is not set > -CONFIG_SND_GENERIC_DRIVER=y > > # > # Generic devices > @@ -1089,6 +1043,8 @@ CONFIG_SND_GENERIC_DRIVER=y > # > # PCI devices > # > +# CONFIG_SND_AD1889 is not set > +# CONFIG_SND_ALS4000 is not set > # CONFIG_SND_ALI5451 is not set > # CONFIG_SND_ATIIXP is not set > # CONFIG_SND_ATIIXP_MODEM is not set > @@ -1097,39 +1053,38 @@ CONFIG_SND_GENERIC_DRIVER=y > # CONFIG_SND_AU8830 is not set > # CONFIG_SND_AZT3328 is not set > # 
CONFIG_SND_BT87X is not set > -# CONFIG_SND_CS46XX is not set > +# CONFIG_SND_CA0106 is not set > +# CONFIG_SND_CMIPCI is not set > # CONFIG_SND_CS4281 is not set > +# CONFIG_SND_CS46XX is not set > # CONFIG_SND_EMU10K1 is not set > # CONFIG_SND_EMU10K1X is not set > -# CONFIG_SND_CA0106 is not set > -# CONFIG_SND_KORG1212 is not set > -# CONFIG_SND_MIXART is not set > -# CONFIG_SND_NM256 is not set > -# CONFIG_SND_RME32 is not set > -# CONFIG_SND_RME96 is not set > -# CONFIG_SND_RME9652 is not set > -# CONFIG_SND_HDSP is not set > -# CONFIG_SND_HDSPM is not set > -# CONFIG_SND_TRIDENT is not set > -# CONFIG_SND_YMFPCI is not set > -# CONFIG_SND_AD1889 is not set > -# CONFIG_SND_ALS4000 is not set > -# CONFIG_SND_CMIPCI is not set > # CONFIG_SND_ENS1370 is not set > # CONFIG_SND_ENS1371 is not set > # CONFIG_SND_ES1938 is not set > # CONFIG_SND_ES1968 is not set > -# CONFIG_SND_MAESTRO3 is not set > # CONFIG_SND_FM801 is not set > +# CONFIG_SND_HDA_INTEL is not set > +# CONFIG_SND_HDSP is not set > +# CONFIG_SND_HDSPM is not set > # CONFIG_SND_ICE1712 is not set > # CONFIG_SND_ICE1724 is not set > # CONFIG_SND_INTEL8X0 is not set > # CONFIG_SND_INTEL8X0M is not set > +# CONFIG_SND_KORG1212 is not set > +# CONFIG_SND_MAESTRO3 is not set > +# CONFIG_SND_MIXART is not set > +# CONFIG_SND_NM256 is not set > +# CONFIG_SND_PCXHR is not set > +# CONFIG_SND_RME32 is not set > +# CONFIG_SND_RME96 is not set > +# CONFIG_SND_RME9652 is not set > # CONFIG_SND_SONICVIBES is not set > +# CONFIG_SND_TRIDENT is not set > # CONFIG_SND_VIA82XX is not set > # CONFIG_SND_VIA82XX_MODEM is not set > # CONFIG_SND_VX222 is not set > -# CONFIG_SND_HDA_INTEL is not set > +# CONFIG_SND_YMFPCI is not set > > # > # ALSA PowerMac devices > @@ -1201,13 +1156,16 @@ CONFIG_USB_STORAGE=m > # CONFIG_USB_STORAGE_SDDR09 is not set > # CONFIG_USB_STORAGE_SDDR55 is not set > # CONFIG_USB_STORAGE_JUMPSHOT is not set > +# CONFIG_USB_STORAGE_ALAUDA is not set > # CONFIG_USB_STORAGE_ONETOUCH is not set > +# 
CONFIG_USB_LIBUSUAL is not set > > # > # USB Input Devices > # > CONFIG_USB_HID=y > CONFIG_USB_HIDINPUT=y > +# CONFIG_USB_HIDINPUT_POWERBOOK is not set > # CONFIG_HID_FF is not set > CONFIG_USB_HIDDEV=y > # CONFIG_USB_AIPTEK is not set > @@ -1221,6 +1179,7 @@ CONFIG_USB_HIDDEV=y > # CONFIG_USB_YEALINK is not set > # CONFIG_USB_XPAD is not set > # CONFIG_USB_ATI_REMOTE is not set > +# CONFIG_USB_ATI_REMOTE2 is not set > # CONFIG_USB_KEYSPAN_REMOTE is not set > # CONFIG_USB_APPLETOUCH is not set > > @@ -1307,6 +1266,10 @@ CONFIG_INFINIBAND_IPOIB=m > # > > # > +# EDAC - error detection and reporting (RAS) > +# > + > +# > # File systems > # > CONFIG_EXT2_FS=y > @@ -1340,6 +1303,7 @@ CONFIG_XFS_EXPORT=y > CONFIG_XFS_SECURITY=y > CONFIG_XFS_POSIX_ACL=y > # CONFIG_XFS_RT is not set > +# CONFIG_OCFS2_FS is not set > # CONFIG_MINIX_FS is not set > # CONFIG_ROMFS_FS is not set > CONFIG_INOTIFY=y > @@ -1379,6 +1343,7 @@ CONFIG_HUGETLBFS=y > CONFIG_HUGETLB_PAGE=y > CONFIG_RAMFS=y > # CONFIG_RELAYFS_FS is not set > +# CONFIG_CONFIGFS_FS is not set > > # > # Miscellaneous filesystems > @@ -1449,6 +1414,7 @@ CONFIG_MSDOS_PARTITION=y > # CONFIG_SGI_PARTITION is not set > # CONFIG_ULTRIX_PARTITION is not set > # CONFIG_SUN_PARTITION is not set > +# CONFIG_KARMA_PARTITION is not set > # CONFIG_EFI_PARTITION is not set > > # > @@ -1504,10 +1470,6 @@ CONFIG_CRC32=y > CONFIG_LIBCRC32C=m > CONFIG_ZLIB_INFLATE=y > CONFIG_ZLIB_DEFLATE=m > -CONFIG_TEXTSEARCH=y > -CONFIG_TEXTSEARCH_KMP=m > -CONFIG_TEXTSEARCH_BM=m > -CONFIG_TEXTSEARCH_FSM=m > > # > # Instrumentation Support > @@ -1520,18 +1482,20 @@ CONFIG_OPROFILE=y > # Kernel hacking > # > # CONFIG_PRINTK_TIME is not set > -CONFIG_DEBUG_KERNEL=y > CONFIG_MAGIC_SYSRQ=y > +CONFIG_DEBUG_KERNEL=y > CONFIG_LOG_BUF_SHIFT=17 > CONFIG_DETECT_SOFTLOCKUP=y > # CONFIG_SCHEDSTATS is not set > # CONFIG_DEBUG_SLAB is not set > +CONFIG_DEBUG_MUTEXES=y > # CONFIG_DEBUG_SPINLOCK is not set > # CONFIG_DEBUG_SPINLOCK_SLEEP is not set > # CONFIG_DEBUG_KOBJECT 
is not set > # CONFIG_DEBUG_INFO is not set > CONFIG_DEBUG_FS=y > # CONFIG_DEBUG_VM is not set > +CONFIG_FORCED_INLINING=y > # CONFIG_RCU_TORTURE_TEST is not set > CONFIG_DEBUG_STACKOVERFLOW=y > CONFIG_DEBUG_STACK_USAGE=y > @@ -1540,6 +1504,11 @@ CONFIG_XMON=y > # CONFIG_XMON_DEFAULT is not set > CONFIG_IRQSTACKS=y > CONFIG_BOOTX_TEXT=y > +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set > +# CONFIG_PPC_EARLY_DEBUG_G5 is not set > +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set > +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set > +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set > > # > # Security options > Index: powerpc-git/arch/powerpc/configs/pseries_defconfig > =================================================================== > --- powerpc-git.orig/arch/powerpc/configs/pseries_defconfig > +++ powerpc-git/arch/powerpc/configs/pseries_defconfig > @@ -1,7 +1,7 @@ > # > # Automatically generated make config: don't edit > -# Linux kernel version: 2.6.15-rc5 > -# Tue Dec 20 15:59:40 2005 > +# Linux kernel version: 2.6.16-rc2 > +# Fri Feb 10 17:33:32 2006 > # > CONFIG_PPC64=y > CONFIG_64BIT=y > @@ -16,6 +16,10 @@ CONFIG_COMPAT=y > CONFIG_SYSVIPC_COMPAT=y > CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y > CONFIG_ARCH_MAY_HAVE_PC_FDC=y > +CONFIG_PPC_OF=y > +CONFIG_PPC_UDBG_16550=y > +# CONFIG_GENERIC_TBSYNC is not set > +# CONFIG_DEFAULT_UIMAGE is not set > > # > # Processor support > @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=128 > # Code maturity level options > # > CONFIG_EXPERIMENTAL=y > -CONFIG_CLEAN_COMPILE=y > CONFIG_LOCK_KERNEL=y > CONFIG_INIT_ENV_ARG_LIMIT=32 > > @@ -49,8 +52,6 @@ CONFIG_POSIX_MQUEUE=y > CONFIG_SYSCTL=y > CONFIG_AUDIT=y > CONFIG_AUDITSYSCALL=y > -CONFIG_HOTPLUG=y > -CONFIG_KOBJECT_UEVENT=y > CONFIG_IKCONFIG=y > CONFIG_IKCONFIG_PROC=y > CONFIG_CPUSETS=y > @@ -60,8 +61,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y > CONFIG_KALLSYMS=y > CONFIG_KALLSYMS_ALL=y > # CONFIG_KALLSYMS_EXTRA_PASS is not set > +CONFIG_HOTPLUG=y > CONFIG_PRINTK=y > CONFIG_BUG=y > +CONFIG_ELF_CORE=y > CONFIG_BASE_FULL=y > 
CONFIG_FUTEX=y > CONFIG_EPOLL=y > @@ -70,8 +73,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 > CONFIG_CC_ALIGN_LABELS=0 > CONFIG_CC_ALIGN_LOOPS=0 > CONFIG_CC_ALIGN_JUMPS=0 > +CONFIG_SLAB=y > # CONFIG_TINY_SHMEM is not set > CONFIG_BASE_SMALL=0 > +# CONFIG_SLOB is not set > > # > # Loadable module support > @@ -113,7 +118,6 @@ CONFIG_PPC_PSERIES=y > # CONFIG_PPC_PMAC is not set > # CONFIG_PPC_MAPLE is not set > # CONFIG_PPC_CELL is not set > -CONFIG_PPC_OF=y > CONFIG_XICS=y > # CONFIG_U3_DART is not set > CONFIG_MPIC=y > @@ -123,8 +127,8 @@ CONFIG_RTAS_PROC=y > CONFIG_RTAS_FLASH=m > # CONFIG_MMIO_NVRAM is not set > CONFIG_IBMVIO=y > +# CONFIG_IBMEBUS is not set > # CONFIG_PPC_MPC106 is not set > -# CONFIG_GENERIC_TBSYNC is not set > # CONFIG_CPU_FREQ is not set > # CONFIG_WANT_EARLY_SERIAL is not set > > @@ -145,6 +149,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 > CONFIG_IOMMU_VMERGE=y > CONFIG_HOTPLUG_CPU=y > CONFIG_KEXEC=y > +# CONFIG_CRASH_DUMP is not set > CONFIG_IRQ_ALL_CPUS=y > CONFIG_PPC_SPLPAR=y > CONFIG_EEH=y > @@ -165,6 +170,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y > CONFIG_SPARSEMEM_EXTREME=y > # CONFIG_MEMORY_HOTPLUG is not set > CONFIG_SPLIT_PTLOCK_CPUS=4 > +CONFIG_MIGRATION=y > CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y > # CONFIG_PPC_64K_PAGES is not set > CONFIG_SCHED_SMT=y > @@ -209,6 +215,7 @@ CONFIG_NET=y > # > # Networking options > # > +# CONFIG_NETDEBUG is not set > CONFIG_PACKET=y > # CONFIG_PACKET_MMAP is not set > CONFIG_UNIX=y > @@ -248,6 +255,7 @@ CONFIG_NETFILTER=y > CONFIG_NETFILTER_NETLINK=y > CONFIG_NETFILTER_NETLINK_QUEUE=m > CONFIG_NETFILTER_NETLINK_LOG=m > +# CONFIG_NETFILTER_XTABLES is not set > > # > # IP: Netfilter Configuration > @@ -265,65 +273,6 @@ CONFIG_IP_NF_TFTP=m > CONFIG_IP_NF_AMANDA=m > # CONFIG_IP_NF_PPTP is not set > CONFIG_IP_NF_QUEUE=m > -CONFIG_IP_NF_IPTABLES=m > -CONFIG_IP_NF_MATCH_LIMIT=m > -CONFIG_IP_NF_MATCH_IPRANGE=m > -CONFIG_IP_NF_MATCH_MAC=m > -CONFIG_IP_NF_MATCH_PKTTYPE=m > -CONFIG_IP_NF_MATCH_MARK=m > -CONFIG_IP_NF_MATCH_MULTIPORT=m > 
-CONFIG_IP_NF_MATCH_TOS=m > -CONFIG_IP_NF_MATCH_RECENT=m > -CONFIG_IP_NF_MATCH_ECN=m > -CONFIG_IP_NF_MATCH_DSCP=m > -CONFIG_IP_NF_MATCH_AH_ESP=m > -CONFIG_IP_NF_MATCH_LENGTH=m > -CONFIG_IP_NF_MATCH_TTL=m > -CONFIG_IP_NF_MATCH_TCPMSS=m > -CONFIG_IP_NF_MATCH_HELPER=m > -CONFIG_IP_NF_MATCH_STATE=m > -CONFIG_IP_NF_MATCH_CONNTRACK=m > -CONFIG_IP_NF_MATCH_OWNER=m > -CONFIG_IP_NF_MATCH_ADDRTYPE=m > -CONFIG_IP_NF_MATCH_REALM=m > -CONFIG_IP_NF_MATCH_SCTP=m > -# CONFIG_IP_NF_MATCH_DCCP is not set > -CONFIG_IP_NF_MATCH_COMMENT=m > -CONFIG_IP_NF_MATCH_CONNMARK=m > -CONFIG_IP_NF_MATCH_CONNBYTES=m > -CONFIG_IP_NF_MATCH_HASHLIMIT=m > -CONFIG_IP_NF_MATCH_STRING=m > -CONFIG_IP_NF_FILTER=m > -CONFIG_IP_NF_TARGET_REJECT=m > -CONFIG_IP_NF_TARGET_LOG=m > -CONFIG_IP_NF_TARGET_ULOG=m > -CONFIG_IP_NF_TARGET_TCPMSS=m > -CONFIG_IP_NF_TARGET_NFQUEUE=m > -CONFIG_IP_NF_NAT=m > -CONFIG_IP_NF_NAT_NEEDED=y > -CONFIG_IP_NF_TARGET_MASQUERADE=m > -CONFIG_IP_NF_TARGET_REDIRECT=m > -CONFIG_IP_NF_TARGET_NETMAP=m > -CONFIG_IP_NF_TARGET_SAME=m > -CONFIG_IP_NF_NAT_SNMP_BASIC=m > -CONFIG_IP_NF_NAT_IRC=m > -CONFIG_IP_NF_NAT_FTP=m > -CONFIG_IP_NF_NAT_TFTP=m > -CONFIG_IP_NF_NAT_AMANDA=m > -CONFIG_IP_NF_MANGLE=m > -CONFIG_IP_NF_TARGET_TOS=m > -CONFIG_IP_NF_TARGET_ECN=m > -CONFIG_IP_NF_TARGET_DSCP=m > -CONFIG_IP_NF_TARGET_MARK=m > -CONFIG_IP_NF_TARGET_CLASSIFY=m > -CONFIG_IP_NF_TARGET_TTL=m > -CONFIG_IP_NF_TARGET_CONNMARK=m > -CONFIG_IP_NF_TARGET_CLUSTERIP=m > -CONFIG_IP_NF_RAW=m > -CONFIG_IP_NF_TARGET_NOTRACK=m > -CONFIG_IP_NF_ARPTABLES=m > -CONFIG_IP_NF_ARPFILTER=m > -CONFIG_IP_NF_ARP_MANGLE=m > > # > # DCCP Configuration (EXPERIMENTAL) > @@ -334,6 +283,11 @@ CONFIG_IP_NF_ARP_MANGLE=m > # SCTP Configuration (EXPERIMENTAL) > # > # CONFIG_IP_SCTP is not set > + > +# > +# TIPC Configuration (EXPERIMENTAL) > +# > +# CONFIG_TIPC is not set > # CONFIG_ATM is not set > # CONFIG_BRIDGE is not set > # CONFIG_VLAN_8021Q is not set > @@ -352,7 +306,6 @@ CONFIG_LLC=y > # QoS and/or fair queueing > # > # CONFIG_NET_SCHED 
is not set > -CONFIG_NET_CLS_ROUTE=y > > # > # Network testing > @@ -550,13 +503,7 @@ CONFIG_SCSI_IPR_TRACE=y > CONFIG_SCSI_IPR_DUMP=y > # CONFIG_SCSI_QLOGIC_FC is not set > # CONFIG_SCSI_QLOGIC_1280 is not set > -CONFIG_SCSI_QLA2XXX=y > -CONFIG_SCSI_QLA21XX=m > -CONFIG_SCSI_QLA22XX=m > -CONFIG_SCSI_QLA2300=m > -CONFIG_SCSI_QLA2322=m > -CONFIG_SCSI_QLA6312=m > -CONFIG_SCSI_QLA24XX=m > +# CONFIG_SCSI_QLA_FC is not set > CONFIG_SCSI_LPFC=m > # CONFIG_SCSI_DC395x is not set > # CONFIG_SCSI_DC390T is not set > @@ -678,6 +625,7 @@ CONFIG_E1000=y > # CONFIG_R8169 is not set > # CONFIG_SIS190 is not set > # CONFIG_SKGE is not set > +# CONFIG_SKY2 is not set > # CONFIG_SK98LIN is not set > # CONFIG_VIA_VELOCITY is not set > CONFIG_TIGON3=y > @@ -803,6 +751,7 @@ CONFIG_HW_CONSOLE=y > CONFIG_SERIAL_8250=y > CONFIG_SERIAL_8250_CONSOLE=y > CONFIG_SERIAL_8250_NR_UARTS=4 > +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 > # CONFIG_SERIAL_8250_EXTENDED is not set > > # > @@ -909,6 +858,12 @@ CONFIG_I2C_ALGOBIT=y > # CONFIG_I2C_DEBUG_CHIP is not set > > # > +# SPI support > +# > +# CONFIG_SPI is not set > +# CONFIG_SPI_MASTER is not set > + > +# > # Dallas's 1-wire bus > # > # CONFIG_W1 is not set > @@ -976,7 +931,6 @@ CONFIG_FB_RADEON_I2C=y > # CONFIG_FB_KYRO is not set > # CONFIG_FB_3DFX is not set > # CONFIG_FB_VOODOO1 is not set > -# CONFIG_FB_CYBLA is not set > # CONFIG_FB_TRIDENT is not set > # CONFIG_FB_VIRTUAL is not set > > @@ -1061,12 +1015,15 @@ CONFIG_USB_STORAGE=y > # CONFIG_USB_STORAGE_SDDR09 is not set > # CONFIG_USB_STORAGE_SDDR55 is not set > # CONFIG_USB_STORAGE_JUMPSHOT is not set > +# CONFIG_USB_STORAGE_ALAUDA is not set > +# CONFIG_USB_LIBUSUAL is not set > > # > # USB Input Devices > # > CONFIG_USB_HID=y > CONFIG_USB_HIDINPUT=y > +# CONFIG_USB_HIDINPUT_POWERBOOK is not set > # CONFIG_HID_FF is not set > CONFIG_USB_HIDDEV=y > # CONFIG_USB_AIPTEK is not set > @@ -1080,6 +1037,7 @@ CONFIG_USB_HIDDEV=y > # CONFIG_USB_YEALINK is not set > # CONFIG_USB_XPAD is not set > # 
CONFIG_USB_ATI_REMOTE is not set > +# CONFIG_USB_ATI_REMOTE2 is not set > # CONFIG_USB_KEYSPAN_REMOTE is not set > # CONFIG_USB_APPLETOUCH is not set > > @@ -1167,6 +1125,10 @@ CONFIG_INFINIBAND_IPOIB=m > # > > # > +# EDAC - error detection and reporting (RAS) > +# > + > +# > # File systems > # > CONFIG_EXT2_FS=y > @@ -1200,6 +1162,7 @@ CONFIG_XFS_EXPORT=y > CONFIG_XFS_SECURITY=y > CONFIG_XFS_POSIX_ACL=y > # CONFIG_XFS_RT is not set > +# CONFIG_OCFS2_FS is not set > # CONFIG_MINIX_FS is not set > # CONFIG_ROMFS_FS is not set > CONFIG_INOTIFY=y > @@ -1240,6 +1203,7 @@ CONFIG_HUGETLBFS=y > CONFIG_HUGETLB_PAGE=y > CONFIG_RAMFS=y > # CONFIG_RELAYFS_FS is not set > +# CONFIG_CONFIGFS_FS is not set > > # > # Miscellaneous filesystems > @@ -1351,10 +1315,6 @@ CONFIG_CRC32=y > CONFIG_LIBCRC32C=m > CONFIG_ZLIB_INFLATE=y > CONFIG_ZLIB_DEFLATE=m > -CONFIG_TEXTSEARCH=y > -CONFIG_TEXTSEARCH_KMP=m > -CONFIG_TEXTSEARCH_BM=m > -CONFIG_TEXTSEARCH_FSM=m > > # > # Instrumentation Support > @@ -1367,18 +1327,20 @@ CONFIG_OPROFILE=y > # Kernel hacking > # > # CONFIG_PRINTK_TIME is not set > -CONFIG_DEBUG_KERNEL=y > CONFIG_MAGIC_SYSRQ=y > +CONFIG_DEBUG_KERNEL=y > CONFIG_LOG_BUF_SHIFT=17 > CONFIG_DETECT_SOFTLOCKUP=y > # CONFIG_SCHEDSTATS is not set > # CONFIG_DEBUG_SLAB is not set > +CONFIG_DEBUG_MUTEXES=y > # CONFIG_DEBUG_SPINLOCK is not set > # CONFIG_DEBUG_SPINLOCK_SLEEP is not set > # CONFIG_DEBUG_KOBJECT is not set > # CONFIG_DEBUG_INFO is not set > CONFIG_DEBUG_FS=y > # CONFIG_DEBUG_VM is not set > +CONFIG_FORCED_INLINING=y > # CONFIG_RCU_TORTURE_TEST is not set > CONFIG_DEBUG_STACKOVERFLOW=y > CONFIG_DEBUG_STACK_USAGE=y > @@ -1387,6 +1349,11 @@ CONFIG_XMON=y > CONFIG_XMON_DEFAULT=y > CONFIG_IRQSTACKS=y > # CONFIG_BOOTX_TEXT is not set > +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set > +# CONFIG_PPC_EARLY_DEBUG_G5 is not set > +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set > +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set > +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set > > # > # Security options 
From olof at lixom.net Mon Feb 13 10:30:31 2006
From: olof at lixom.net (Olof Johansson)
Date: Sun, 12 Feb 2006 17:30:31 -0600
Subject: [PATCH] Update {g5,pseries,ppc64}_defconfig
In-Reply-To: <1139782304.5247.42.camel@localhost.localdomain>
References: <20060210234903.GB4795@pb15.lixom.net> <1139782304.5247.42.camel@localhost.localdomain>
Message-ID: <20060212233030.GA12445@pb15.lixom.net>

On Mon, Feb 13, 2006 at 09:11:44AM +1100, Benjamin Herrenschmidt wrote:
> You probably also want to make tg3 built-in...

Good point. New patch below.

---

Update defconfigs for g5, pseries and generic ppc64.

Default choices for everything, with the following exceptions:

* Enable WINDFARM_PM112 on g5 and ppc64.
* Increase CONFIG_NR_CPUS to 4 in g5_defconfig
* CONFIG_TIGON3=y instead of =m in g5_defconfig

Signed-off-by: Olof Johansson

Index: powerpc-git/arch/powerpc/configs/g5_defconfig
===================================================================
--- powerpc-git.orig/arch/powerpc/configs/g5_defconfig
+++ powerpc-git/arch/powerpc/configs/g5_defconfig
@@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.15-rc5 -# Tue Dec 20 15:59:30 2005 +# Linux kernel version: 2.6.16-rc2 +# Fri Feb 10 17:33:08 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -16,6 +16,10 @@ CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_PPC_OF=y +# CONFIG_PPC_UDBG_16550 is not set +CONFIG_GENERIC_TBSYNC=y +# CONFIG_DEFAULT_UIMAGE is not set # # Processor support @@ -26,13 +30,12 @@ CONFIG_PPC_FPU=y CONFIG_ALTIVEC=y CONFIG_PPC_STD_MMU=y CONFIG_SMP=y -CONFIG_NR_CPUS=2 +CONFIG_NR_CPUS=4 # # Code maturity level options # CONFIG_EXPERIMENTAL=y -CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 @@ -47,8 +50,6 @@ CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y # CONFIG_AUDIT is not set -CONFIG_HOTPLUG=y -CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y # CONFIG_CPUSETS is not set @@ -58,8 +59,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y +CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y @@ -68,8 +71,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 +CONFIG_SLAB=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 +# CONFIG_SLOB is not set # # Loadable module support @@ -112,13 +117,12 @@ CONFIG_PPC_PMAC=y CONFIG_PPC_PMAC64=y # CONFIG_PPC_MAPLE is not set # CONFIG_PPC_CELL is not set -CONFIG_PPC_OF=y CONFIG_U3_DART=y CONFIG_MPIC=y # CONFIG_PPC_RTAS is not set # CONFIG_MMIO_NVRAM is not set +CONFIG_MPIC_BROKEN_U3=y # CONFIG_PPC_MPC106 is not set -CONFIG_GENERIC_TBSYNC=y CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y # CONFIG_CPU_FREQ_DEBUG is not set @@ -151,6 +155,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_IOMMU_VMERGE=y # CONFIG_HOTPLUG_CPU is not set CONFIG_KEXEC=y +# CONFIG_CRASH_DUMP is not set CONFIG_IRQ_ALL_CPUS=y # CONFIG_NUMA is not set CONFIG_ARCH_SELECT_MEMORY_MODEL=y @@ -202,6 +207,7 @@ CONFIG_NET=y # # Networking options # +# CONFIG_NETDEBUG is not set CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y @@ -239,6 +245,7 @@ CONFIG_NETFILTER=y # Core Netfilter Configuration # # CONFIG_NETFILTER_NETLINK is not set +# CONFIG_NETFILTER_XTABLES is not set # # IP: Netfilter Configuration @@ -255,65 +262,6 @@ CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m # CONFIG_IP_NF_PPTP is not set CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m 
-CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -# CONFIG_IP_NF_MATCH_DCCP is not set -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_CONNBYTES=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_MATCH_STRING=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_TARGET_NFQUEUE=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_TTL=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m # # DCCP Configuration (EXPERIMENTAL) @@ -324,6 +272,11 @@ CONFIG_IP_NF_ARP_MANGLE=m # SCTP Configuration (EXPERIMENTAL) # # CONFIG_IP_SCTP is not set + +# +# TIPC Configuration (EXPERIMENTAL) +# +# CONFIG_TIPC is not set # CONFIG_ATM is not set # CONFIG_BRIDGE is not set # CONFIG_VLAN_8021Q is not set @@ -342,7 +295,6 @@ CONFIG_LLC=y # QoS and/or fair queueing # # CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y # # Network testing @@ -545,13 +497,7 @@ CONFIG_SCSI_SATA_SVW=y # CONFIG_SCSI_IPR is not set # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set -CONFIG_SCSI_QLA2XXX=y -# CONFIG_SCSI_QLA21XX is not set -# CONFIG_SCSI_QLA22XX is not set -# CONFIG_SCSI_QLA2300 is not set -# 
CONFIG_SCSI_QLA2322 is not set -# CONFIG_SCSI_QLA6312 is not set -# CONFIG_SCSI_QLA24XX is not set +# CONFIG_SCSI_QLA_FC is not set # CONFIG_SCSI_LPFC is not set # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set @@ -614,7 +560,6 @@ CONFIG_IEEE1394_SBP2=m CONFIG_IEEE1394_ETH1394=m CONFIG_IEEE1394_DV1394=m CONFIG_IEEE1394_RAWIO=y -# CONFIG_IEEE1394_CMP is not set # # I2O device support @@ -630,6 +575,7 @@ CONFIG_THERM_PM72=y CONFIG_WINDFARM=y CONFIG_WINDFARM_PM81=y CONFIG_WINDFARM_PM91=y +CONFIG_WINDFARM_PM112=y # # Network device support @@ -682,8 +628,9 @@ CONFIG_E1000=y # CONFIG_R8169 is not set # CONFIG_SIS190 is not set # CONFIG_SKGE is not set +# CONFIG_SKY2 is not set # CONFIG_SK98LIN is not set -CONFIG_TIGON3=m +CONFIG_TIGON3=y # CONFIG_BNX2 is not set # CONFIG_MV643XX_ETH is not set @@ -861,8 +808,7 @@ CONFIG_I2C_ALGOBIT=y # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set # CONFIG_I2C_PIIX4 is not set -CONFIG_I2C_KEYWEST=y -CONFIG_I2C_PMAC_SMU=y +CONFIG_I2C_POWERMAC=y # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT_LIGHT is not set # CONFIG_I2C_PROSAVAGE is not set @@ -895,6 +841,12 @@ CONFIG_I2C_PMAC_SMU=y # CONFIG_I2C_DEBUG_CHIP is not set # +# SPI support +# +# CONFIG_SPI is not set +# CONFIG_SPI_MASTER is not set + +# # Dallas's 1-wire bus # # CONFIG_W1 is not set @@ -961,7 +913,6 @@ CONFIG_FB_RADEON_I2C=y # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set -# CONFIG_FB_CYBLA is not set # CONFIG_FB_TRIDENT is not set # CONFIG_FB_VIRTUAL is not set @@ -1008,9 +959,10 @@ CONFIG_SND_OSSEMUL=y CONFIG_SND_MIXER_OSS=m CONFIG_SND_PCM_OSS=m CONFIG_SND_SEQUENCER_OSS=y +# CONFIG_SND_DYNAMIC_MINORS is not set +CONFIG_SND_SUPPORT_OLD_API=y # CONFIG_SND_VERBOSE_PRINTK is not set # CONFIG_SND_DEBUG is not set -CONFIG_SND_GENERIC_DRIVER=y # # Generic devices @@ -1024,6 +976,8 @@ CONFIG_SND_GENERIC_DRIVER=y # # PCI devices # +# CONFIG_SND_AD1889 is not set +# CONFIG_SND_ALS4000 is not set # CONFIG_SND_ALI5451 
is not set # CONFIG_SND_ATIIXP is not set # CONFIG_SND_ATIIXP_MODEM is not set @@ -1032,39 +986,38 @@ CONFIG_SND_GENERIC_DRIVER=y # CONFIG_SND_AU8830 is not set # CONFIG_SND_AZT3328 is not set # CONFIG_SND_BT87X is not set -# CONFIG_SND_CS46XX is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_CMIPCI is not set # CONFIG_SND_CS4281 is not set +# CONFIG_SND_CS46XX is not set # CONFIG_SND_EMU10K1 is not set # CONFIG_SND_EMU10K1X is not set -# CONFIG_SND_CA0106 is not set -# CONFIG_SND_KORG1212 is not set -# CONFIG_SND_MIXART is not set -# CONFIG_SND_NM256 is not set -# CONFIG_SND_RME32 is not set -# CONFIG_SND_RME96 is not set -# CONFIG_SND_RME9652 is not set -# CONFIG_SND_HDSP is not set -# CONFIG_SND_HDSPM is not set -# CONFIG_SND_TRIDENT is not set -# CONFIG_SND_YMFPCI is not set -# CONFIG_SND_AD1889 is not set -# CONFIG_SND_ALS4000 is not set -# CONFIG_SND_CMIPCI is not set # CONFIG_SND_ENS1370 is not set # CONFIG_SND_ENS1371 is not set # CONFIG_SND_ES1938 is not set # CONFIG_SND_ES1968 is not set -# CONFIG_SND_MAESTRO3 is not set # CONFIG_SND_FM801 is not set +# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_HDSP is not set +# CONFIG_SND_HDSPM is not set # CONFIG_SND_ICE1712 is not set # CONFIG_SND_ICE1724 is not set # CONFIG_SND_INTEL8X0 is not set # CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_KORG1212 is not set +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +# CONFIG_SND_PCXHR is not set +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set # CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_TRIDENT is not set # CONFIG_SND_VIA82XX is not set # CONFIG_SND_VIA82XX_MODEM is not set # CONFIG_SND_VX222 is not set -# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_YMFPCI is not set # # ALSA PowerMac devices @@ -1136,13 +1089,16 @@ CONFIG_USB_STORAGE_DPCM=y CONFIG_USB_STORAGE_SDDR09=y CONFIG_USB_STORAGE_SDDR55=y CONFIG_USB_STORAGE_JUMPSHOT=y +# CONFIG_USB_STORAGE_ALAUDA is 
not set # CONFIG_USB_STORAGE_ONETOUCH is not set +# CONFIG_USB_LIBUSUAL is not set # # USB Input Devices # CONFIG_USB_HID=y CONFIG_USB_HIDINPUT=y +# CONFIG_USB_HIDINPUT_POWERBOOK is not set CONFIG_HID_FF=y CONFIG_HID_PID=y CONFIG_LOGITECH_FF=y @@ -1159,6 +1115,7 @@ CONFIG_USB_HIDDEV=y # CONFIG_USB_YEALINK is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_ATI_REMOTE2 is not set # CONFIG_USB_KEYSPAN_REMOTE is not set # CONFIG_USB_APPLETOUCH is not set @@ -1207,6 +1164,7 @@ CONFIG_USB_SERIAL_GENERIC=y # CONFIG_USB_SERIAL_AIRPRIME is not set # CONFIG_USB_SERIAL_ANYDATA is not set CONFIG_USB_SERIAL_BELKIN=m +# CONFIG_USB_SERIAL_WHITEHEAT is not set CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m # CONFIG_USB_SERIAL_CP2101 is not set CONFIG_USB_SERIAL_CYPRESS_M8=m @@ -1288,6 +1246,10 @@ CONFIG_USB_EZUSB=y # # +# EDAC - error detection and reporting (RAS) +# + +# # File systems # CONFIG_EXT2_FS=y @@ -1317,6 +1279,7 @@ CONFIG_XFS_EXPORT=y CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_XFS_RT is not set +# CONFIG_OCFS2_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y @@ -1357,6 +1320,7 @@ CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y # CONFIG_RELAYFS_FS is not set +# CONFIG_CONFIGFS_FS is not set # # Miscellaneous filesystems @@ -1426,6 +1390,7 @@ CONFIG_MSDOS_PARTITION=y # CONFIG_SGI_PARTITION is not set # CONFIG_ULTRIX_PARTITION is not set # CONFIG_SUN_PARTITION is not set +# CONFIG_KARMA_PARTITION is not set # CONFIG_EFI_PARTITION is not set # @@ -1481,10 +1446,6 @@ CONFIG_CRC32=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m -CONFIG_TEXTSEARCH=y -CONFIG_TEXTSEARCH_KMP=m -CONFIG_TEXTSEARCH_BM=m -CONFIG_TEXTSEARCH_FSM=m # # Instrumentation Support @@ -1497,24 +1458,31 @@ CONFIG_OPROFILE=y # Kernel hacking # # CONFIG_PRINTK_TIME is not set -CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y # 
CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +CONFIG_DEBUG_MUTEXES=y # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set +CONFIG_FORCED_INLINING=y # CONFIG_RCU_TORTURE_TEST is not set # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_DEBUG_STACK_USAGE is not set # CONFIG_DEBUGGER is not set CONFIG_IRQSTACKS=y CONFIG_BOOTX_TEXT=y +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set +# CONFIG_PPC_EARLY_DEBUG_G5 is not set +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set # # Security options Index: powerpc-git/arch/powerpc/configs/ppc64_defconfig =================================================================== --- powerpc-git.orig/arch/powerpc/configs/ppc64_defconfig +++ powerpc-git/arch/powerpc/configs/ppc64_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.15-rc5 -# Tue Dec 20 15:59:38 2005 +# Linux kernel version: 2.6.16-rc2 +# Fri Feb 10 17:32:14 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -16,6 +16,10 @@ CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_PPC_OF=y +CONFIG_PPC_UDBG_16550=y +CONFIG_GENERIC_TBSYNC=y +# CONFIG_DEFAULT_UIMAGE is not set # # Processor support @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=32 # Code maturity level options # CONFIG_EXPERIMENTAL=y -CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 @@ -48,8 +51,6 @@ CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y # CONFIG_AUDIT is not set -CONFIG_HOTPLUG=y -CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_CPUSETS=y @@ -59,8 +60,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y 
+CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y @@ -69,8 +72,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 +CONFIG_SLAB=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 +# CONFIG_SLOB is not set # # Loadable module support @@ -113,7 +118,6 @@ CONFIG_PPC_PMAC=y CONFIG_PPC_PMAC64=y CONFIG_PPC_MAPLE=y # CONFIG_PPC_CELL is not set -CONFIG_PPC_OF=y CONFIG_XICS=y CONFIG_U3_DART=y CONFIG_MPIC=y @@ -124,8 +128,8 @@ CONFIG_RTAS_FLASH=m # CONFIG_MMIO_NVRAM is not set CONFIG_MPIC_BROKEN_U3=y CONFIG_IBMVIO=y +# CONFIG_IBMEBUS is not set # CONFIG_PPC_MPC106 is not set -CONFIG_GENERIC_TBSYNC=y CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y # CONFIG_CPU_FREQ_DEBUG is not set @@ -158,6 +162,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_IOMMU_VMERGE=y CONFIG_HOTPLUG_CPU=y CONFIG_KEXEC=y +# CONFIG_CRASH_DUMP is not set CONFIG_IRQ_ALL_CPUS=y CONFIG_PPC_SPLPAR=y CONFIG_EEH=y @@ -178,6 +183,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y CONFIG_SPARSEMEM_EXTREME=y # CONFIG_MEMORY_HOTPLUG is not set CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_MIGRATION=y # CONFIG_PPC_64K_PAGES is not set # CONFIG_SCHED_SMT is not set CONFIG_PROC_DEVICETREE=y @@ -221,6 +227,7 @@ CONFIG_NET=y # # Networking options # +# CONFIG_NETDEBUG is not set CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y @@ -260,6 +267,7 @@ CONFIG_NETFILTER=y CONFIG_NETFILTER_NETLINK=y CONFIG_NETFILTER_NETLINK_QUEUE=m CONFIG_NETFILTER_NETLINK_LOG=m +# CONFIG_NETFILTER_XTABLES is not set # # IP: Netfilter Configuration @@ -277,65 +285,6 @@ CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m # CONFIG_IP_NF_PPTP is not set CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m 
-CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -CONFIG_IP_NF_MATCH_DCCP=m -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_CONNBYTES=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_MATCH_STRING=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_TARGET_NFQUEUE=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_TTL=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m # # DCCP Configuration (EXPERIMENTAL) @@ -346,6 +295,11 @@ CONFIG_IP_NF_ARP_MANGLE=m # SCTP Configuration (EXPERIMENTAL) # # CONFIG_IP_SCTP is not set + +# +# TIPC Configuration (EXPERIMENTAL) +# +# CONFIG_TIPC is not set # CONFIG_ATM is not set # CONFIG_BRIDGE is not set # CONFIG_VLAN_8021Q is not set @@ -364,7 +318,6 @@ CONFIG_LLC=y # QoS and/or fair queueing # # CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y # # Network testing @@ -572,13 +525,7 @@ CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set -CONFIG_SCSI_QLA2XXX=y -CONFIG_SCSI_QLA21XX=m -CONFIG_SCSI_QLA22XX=m -CONFIG_SCSI_QLA2300=m 
-CONFIG_SCSI_QLA2322=m -CONFIG_SCSI_QLA6312=m -CONFIG_SCSI_QLA24XX=m +# CONFIG_SCSI_QLA_FC is not set CONFIG_SCSI_LPFC=m # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set @@ -642,8 +589,6 @@ CONFIG_IEEE1394_SBP2=m CONFIG_IEEE1394_ETH1394=m CONFIG_IEEE1394_DV1394=m CONFIG_IEEE1394_RAWIO=y -CONFIG_IEEE1394_CMP=m -CONFIG_IEEE1394_AMDTP=m # # I2O device support @@ -659,6 +604,7 @@ CONFIG_THERM_PM72=y CONFIG_WINDFARM=y CONFIG_WINDFARM_PM81=y CONFIG_WINDFARM_PM91=y +CONFIG_WINDFARM_PM112=y # # Network device support @@ -731,6 +677,7 @@ CONFIG_E1000=y # CONFIG_R8169 is not set # CONFIG_SIS190 is not set # CONFIG_SKGE is not set +# CONFIG_SKY2 is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y @@ -853,6 +800,7 @@ CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=4 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # CONFIG_SERIAL_8250_EXTENDED is not set # @@ -880,6 +828,7 @@ CONFIG_HVCS=m # CONFIG_WATCHDOG is not set # CONFIG_RTC is not set CONFIG_GEN_RTC=y +# CONFIG_GEN_RTC_X is not set # CONFIG_DTLK is not set # CONFIG_R3964 is not set # CONFIG_APPLICOM is not set @@ -923,8 +872,7 @@ CONFIG_I2C_AMD8111=y # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set # CONFIG_I2C_PIIX4 is not set -CONFIG_I2C_KEYWEST=y -CONFIG_I2C_PMAC_SMU=y +CONFIG_I2C_POWERMAC=y # CONFIG_I2C_NFORCE2 is not set # CONFIG_I2C_PARPORT_LIGHT is not set # CONFIG_I2C_PROSAVAGE is not set @@ -957,6 +905,12 @@ CONFIG_I2C_PMAC_SMU=y # CONFIG_I2C_DEBUG_CHIP is not set # +# SPI support +# +# CONFIG_SPI is not set +# CONFIG_SPI_MASTER is not set + +# # Dallas's 1-wire bus # # CONFIG_W1 is not set @@ -1028,7 +982,6 @@ CONFIG_FB_RADEON_I2C=y # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set -# CONFIG_FB_CYBLA is not set # CONFIG_FB_TRIDENT is not set # CONFIG_FB_VIRTUAL is not set @@ -1073,9 +1026,10 @@ CONFIG_SND_OSSEMUL=y CONFIG_SND_MIXER_OSS=m CONFIG_SND_PCM_OSS=m 
CONFIG_SND_SEQUENCER_OSS=y +# CONFIG_SND_DYNAMIC_MINORS is not set +CONFIG_SND_SUPPORT_OLD_API=y # CONFIG_SND_VERBOSE_PRINTK is not set # CONFIG_SND_DEBUG is not set -CONFIG_SND_GENERIC_DRIVER=y # # Generic devices @@ -1089,6 +1043,8 @@ CONFIG_SND_GENERIC_DRIVER=y # # PCI devices # +# CONFIG_SND_AD1889 is not set +# CONFIG_SND_ALS4000 is not set # CONFIG_SND_ALI5451 is not set # CONFIG_SND_ATIIXP is not set # CONFIG_SND_ATIIXP_MODEM is not set @@ -1097,39 +1053,38 @@ CONFIG_SND_GENERIC_DRIVER=y # CONFIG_SND_AU8830 is not set # CONFIG_SND_AZT3328 is not set # CONFIG_SND_BT87X is not set -# CONFIG_SND_CS46XX is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_CMIPCI is not set # CONFIG_SND_CS4281 is not set +# CONFIG_SND_CS46XX is not set # CONFIG_SND_EMU10K1 is not set # CONFIG_SND_EMU10K1X is not set -# CONFIG_SND_CA0106 is not set -# CONFIG_SND_KORG1212 is not set -# CONFIG_SND_MIXART is not set -# CONFIG_SND_NM256 is not set -# CONFIG_SND_RME32 is not set -# CONFIG_SND_RME96 is not set -# CONFIG_SND_RME9652 is not set -# CONFIG_SND_HDSP is not set -# CONFIG_SND_HDSPM is not set -# CONFIG_SND_TRIDENT is not set -# CONFIG_SND_YMFPCI is not set -# CONFIG_SND_AD1889 is not set -# CONFIG_SND_ALS4000 is not set -# CONFIG_SND_CMIPCI is not set # CONFIG_SND_ENS1370 is not set # CONFIG_SND_ENS1371 is not set # CONFIG_SND_ES1938 is not set # CONFIG_SND_ES1968 is not set -# CONFIG_SND_MAESTRO3 is not set # CONFIG_SND_FM801 is not set +# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_HDSP is not set +# CONFIG_SND_HDSPM is not set # CONFIG_SND_ICE1712 is not set # CONFIG_SND_ICE1724 is not set # CONFIG_SND_INTEL8X0 is not set # CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_KORG1212 is not set +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +# CONFIG_SND_PCXHR is not set +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set # CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_TRIDENT is 
not set # CONFIG_SND_VIA82XX is not set # CONFIG_SND_VIA82XX_MODEM is not set # CONFIG_SND_VX222 is not set -# CONFIG_SND_HDA_INTEL is not set +# CONFIG_SND_YMFPCI is not set # # ALSA PowerMac devices @@ -1201,13 +1156,16 @@ CONFIG_USB_STORAGE=m # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set +# CONFIG_USB_STORAGE_ALAUDA is not set # CONFIG_USB_STORAGE_ONETOUCH is not set +# CONFIG_USB_LIBUSUAL is not set # # USB Input Devices # CONFIG_USB_HID=y CONFIG_USB_HIDINPUT=y +# CONFIG_USB_HIDINPUT_POWERBOOK is not set # CONFIG_HID_FF is not set CONFIG_USB_HIDDEV=y # CONFIG_USB_AIPTEK is not set @@ -1221,6 +1179,7 @@ CONFIG_USB_HIDDEV=y # CONFIG_USB_YEALINK is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_ATI_REMOTE2 is not set # CONFIG_USB_KEYSPAN_REMOTE is not set # CONFIG_USB_APPLETOUCH is not set @@ -1307,6 +1266,10 @@ CONFIG_INFINIBAND_IPOIB=m # # +# EDAC - error detection and reporting (RAS) +# + +# # File systems # CONFIG_EXT2_FS=y @@ -1340,6 +1303,7 @@ CONFIG_XFS_EXPORT=y CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_XFS_RT is not set +# CONFIG_OCFS2_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y @@ -1379,6 +1343,7 @@ CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y # CONFIG_RELAYFS_FS is not set +# CONFIG_CONFIGFS_FS is not set # # Miscellaneous filesystems @@ -1449,6 +1414,7 @@ CONFIG_MSDOS_PARTITION=y # CONFIG_SGI_PARTITION is not set # CONFIG_ULTRIX_PARTITION is not set # CONFIG_SUN_PARTITION is not set +# CONFIG_KARMA_PARTITION is not set # CONFIG_EFI_PARTITION is not set # @@ -1504,10 +1470,6 @@ CONFIG_CRC32=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m -CONFIG_TEXTSEARCH=y -CONFIG_TEXTSEARCH_KMP=m -CONFIG_TEXTSEARCH_BM=m -CONFIG_TEXTSEARCH_FSM=m # # Instrumentation Support @@ -1520,18 +1482,20 @@ CONFIG_OPROFILE=y # Kernel hacking # # CONFIG_PRINTK_TIME is not set 
-CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +CONFIG_DEBUG_MUTEXES=y # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set +CONFIG_FORCED_INLINING=y # CONFIG_RCU_TORTURE_TEST is not set CONFIG_DEBUG_STACKOVERFLOW=y CONFIG_DEBUG_STACK_USAGE=y @@ -1540,6 +1504,11 @@ CONFIG_XMON=y # CONFIG_XMON_DEFAULT is not set CONFIG_IRQSTACKS=y CONFIG_BOOTX_TEXT=y +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set +# CONFIG_PPC_EARLY_DEBUG_G5 is not set +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set # # Security options Index: powerpc-git/arch/powerpc/configs/pseries_defconfig =================================================================== --- powerpc-git.orig/arch/powerpc/configs/pseries_defconfig +++ powerpc-git/arch/powerpc/configs/pseries_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.15-rc5 -# Tue Dec 20 15:59:40 2005 +# Linux kernel version: 2.6.16-rc2 +# Fri Feb 10 17:33:32 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -16,6 +16,10 @@ CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_PPC_OF=y +CONFIG_PPC_UDBG_16550=y +# CONFIG_GENERIC_TBSYNC is not set +# CONFIG_DEFAULT_UIMAGE is not set # # Processor support @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=128 # Code maturity level options # CONFIG_EXPERIMENTAL=y -CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 @@ -49,8 +52,6 @@ CONFIG_POSIX_MQUEUE=y CONFIG_SYSCTL=y CONFIG_AUDIT=y CONFIG_AUDITSYSCALL=y -CONFIG_HOTPLUG=y -CONFIG_KOBJECT_UEVENT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_CPUSETS=y @@ -60,8 +61,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y 
CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y +CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y @@ -70,8 +73,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 +CONFIG_SLAB=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 +# CONFIG_SLOB is not set # # Loadable module support @@ -113,7 +118,6 @@ CONFIG_PPC_PSERIES=y # CONFIG_PPC_PMAC is not set # CONFIG_PPC_MAPLE is not set # CONFIG_PPC_CELL is not set -CONFIG_PPC_OF=y CONFIG_XICS=y # CONFIG_U3_DART is not set CONFIG_MPIC=y @@ -123,8 +127,8 @@ CONFIG_RTAS_PROC=y CONFIG_RTAS_FLASH=m # CONFIG_MMIO_NVRAM is not set CONFIG_IBMVIO=y +# CONFIG_IBMEBUS is not set # CONFIG_PPC_MPC106 is not set -# CONFIG_GENERIC_TBSYNC is not set # CONFIG_CPU_FREQ is not set # CONFIG_WANT_EARLY_SERIAL is not set @@ -145,6 +149,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_IOMMU_VMERGE=y CONFIG_HOTPLUG_CPU=y CONFIG_KEXEC=y +# CONFIG_CRASH_DUMP is not set CONFIG_IRQ_ALL_CPUS=y CONFIG_PPC_SPLPAR=y CONFIG_EEH=y @@ -165,6 +170,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y CONFIG_SPARSEMEM_EXTREME=y # CONFIG_MEMORY_HOTPLUG is not set CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_MIGRATION=y CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y # CONFIG_PPC_64K_PAGES is not set CONFIG_SCHED_SMT=y @@ -209,6 +215,7 @@ CONFIG_NET=y # # Networking options # +# CONFIG_NETDEBUG is not set CONFIG_PACKET=y # CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y @@ -248,6 +255,7 @@ CONFIG_NETFILTER=y CONFIG_NETFILTER_NETLINK=y CONFIG_NETFILTER_NETLINK_QUEUE=m CONFIG_NETFILTER_NETLINK_LOG=m +# CONFIG_NETFILTER_XTABLES is not set # # IP: Netfilter Configuration @@ -265,65 +273,6 @@ CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m # CONFIG_IP_NF_PPTP is not set CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m 
-CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -# CONFIG_IP_NF_MATCH_DCCP is not set -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_CONNBYTES=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_MATCH_STRING=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_TARGET_NFQUEUE=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_TTL=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m # # DCCP Configuration (EXPERIMENTAL) @@ -334,6 +283,11 @@ CONFIG_IP_NF_ARP_MANGLE=m # SCTP Configuration (EXPERIMENTAL) # # CONFIG_IP_SCTP is not set + +# +# TIPC Configuration (EXPERIMENTAL) +# +# CONFIG_TIPC is not set # CONFIG_ATM is not set # CONFIG_BRIDGE is not set # CONFIG_VLAN_8021Q is not set @@ -352,7 +306,6 @@ CONFIG_LLC=y # QoS and/or fair queueing # # CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y # # Network testing @@ -550,13 +503,7 @@ CONFIG_SCSI_IPR_TRACE=y 
CONFIG_SCSI_IPR_DUMP=y # CONFIG_SCSI_QLOGIC_FC is not set # CONFIG_SCSI_QLOGIC_1280 is not set -CONFIG_SCSI_QLA2XXX=y -CONFIG_SCSI_QLA21XX=m -CONFIG_SCSI_QLA22XX=m -CONFIG_SCSI_QLA2300=m -CONFIG_SCSI_QLA2322=m -CONFIG_SCSI_QLA6312=m -CONFIG_SCSI_QLA24XX=m +# CONFIG_SCSI_QLA_FC is not set CONFIG_SCSI_LPFC=m # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set @@ -678,6 +625,7 @@ CONFIG_E1000=y # CONFIG_R8169 is not set # CONFIG_SIS190 is not set # CONFIG_SKGE is not set +# CONFIG_SKY2 is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y @@ -803,6 +751,7 @@ CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=4 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # CONFIG_SERIAL_8250_EXTENDED is not set # @@ -909,6 +858,12 @@ CONFIG_I2C_ALGOBIT=y # CONFIG_I2C_DEBUG_CHIP is not set # +# SPI support +# +# CONFIG_SPI is not set +# CONFIG_SPI_MASTER is not set + +# # Dallas's 1-wire bus # # CONFIG_W1 is not set @@ -976,7 +931,6 @@ CONFIG_FB_RADEON_I2C=y # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set -# CONFIG_FB_CYBLA is not set # CONFIG_FB_TRIDENT is not set # CONFIG_FB_VIRTUAL is not set @@ -1061,12 +1015,15 @@ CONFIG_USB_STORAGE=y # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set +# CONFIG_USB_STORAGE_ALAUDA is not set +# CONFIG_USB_LIBUSUAL is not set # # USB Input Devices # CONFIG_USB_HID=y CONFIG_USB_HIDINPUT=y +# CONFIG_USB_HIDINPUT_POWERBOOK is not set # CONFIG_HID_FF is not set CONFIG_USB_HIDDEV=y # CONFIG_USB_AIPTEK is not set @@ -1080,6 +1037,7 @@ CONFIG_USB_HIDDEV=y # CONFIG_USB_YEALINK is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_ATI_REMOTE2 is not set # CONFIG_USB_KEYSPAN_REMOTE is not set # CONFIG_USB_APPLETOUCH is not set @@ -1167,6 +1125,10 @@ CONFIG_INFINIBAND_IPOIB=m # # +# EDAC - error detection and reporting (RAS) +# + +# # 
File systems # CONFIG_EXT2_FS=y @@ -1200,6 +1162,7 @@ CONFIG_XFS_EXPORT=y CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_XFS_RT is not set +# CONFIG_OCFS2_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set CONFIG_INOTIFY=y @@ -1240,6 +1203,7 @@ CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y # CONFIG_RELAYFS_FS is not set +# CONFIG_CONFIGFS_FS is not set # # Miscellaneous filesystems @@ -1351,10 +1315,6 @@ CONFIG_CRC32=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m -CONFIG_TEXTSEARCH=y -CONFIG_TEXTSEARCH_KMP=m -CONFIG_TEXTSEARCH_BM=m -CONFIG_TEXTSEARCH_FSM=m # # Instrumentation Support @@ -1367,18 +1327,20 @@ CONFIG_OPROFILE=y # Kernel hacking # # CONFIG_PRINTK_TIME is not set -CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y +CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_SCHEDSTATS is not set # CONFIG_DEBUG_SLAB is not set +CONFIG_DEBUG_MUTEXES=y # CONFIG_DEBUG_SPINLOCK is not set # CONFIG_DEBUG_SPINLOCK_SLEEP is not set # CONFIG_DEBUG_KOBJECT is not set # CONFIG_DEBUG_INFO is not set CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set +CONFIG_FORCED_INLINING=y # CONFIG_RCU_TORTURE_TEST is not set CONFIG_DEBUG_STACKOVERFLOW=y CONFIG_DEBUG_STACK_USAGE=y @@ -1387,6 +1349,11 @@ CONFIG_XMON=y CONFIG_XMON_DEFAULT=y CONFIG_IRQSTACKS=y # CONFIG_BOOTX_TEXT is not set +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set +# CONFIG_PPC_EARLY_DEBUG_G5 is not set +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set # # Security options From anton at samba.org Mon Feb 13 14:48:35 2006 From: anton at samba.org (Anton Blanchard) Date: Mon, 13 Feb 2006 14:48:35 +1100 Subject: [PATCH] powerpc: Fix runlatch performance issues Message-ID: <20060213034835.GB7922@krispykreme> The runlatch SPR can take a lot of time to write. My original runlatch code would set it on every exception entry even though most of the time this was not required. 
It would also continually set it in the idle loop, which is an issue on an SMT capable processor. Now we cache the runlatch value in a thread_info bit, and only check for it in decrementer and hardware interrupt exceptions as well as the idle loop. Boot tested on POWER3, POWER5 and iSeries, and compile tested on pmac32. Signed-off-by: Anton Blanchard --- Index: build/arch/powerpc/kernel/head_64.S =================================================================== --- build.orig/arch/powerpc/kernel/head_64.S 2006-02-11 14:50:46.000000000 +1100 +++ build/arch/powerpc/kernel/head_64.S 2006-02-13 13:11:22.000000000 +1100 @@ -321,7 +321,6 @@ exception_marker: label##_pSeries: \ HMT_MEDIUM; \ mtspr SPRN_SPRG1,r13; /* save r13 */ \ - RUNLATCH_ON(r13); \ EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common) #define STD_EXCEPTION_ISERIES(n, label, area) \ @@ -329,7 +328,6 @@ label##_pSeries: \ label##_iSeries: \ HMT_MEDIUM; \ mtspr SPRN_SPRG1,r13; /* save r13 */ \ - RUNLATCH_ON(r13); \ EXCEPTION_PROLOG_ISERIES_1(area); \ EXCEPTION_PROLOG_ISERIES_2; \ b label##_common @@ -339,7 +337,6 @@ label##_iSeries: \ label##_iSeries: \ HMT_MEDIUM; \ mtspr SPRN_SPRG1,r13; /* save r13 */ \ - RUNLATCH_ON(r13); \ EXCEPTION_PROLOG_ISERIES_1(PACA_EXGEN); \ lbz r10,PACAPROCENABLED(r13); \ cmpwi 0,r10,0; \ @@ -392,6 +389,7 @@ label##_common: \ label##_common: \ EXCEPTION_PROLOG_COMMON(trap, PACA_EXGEN); \ DISABLE_INTS; \ + bl .ppc64_runlatch_on; \ addi r3,r1,STACK_FRAME_OVERHEAD; \ bl hdlr; \ b .ret_from_except_lite @@ -409,7 +407,6 @@ __start_interrupts: _machine_check_pSeries: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ - RUNLATCH_ON(r13) EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common) .
= 0x300 @@ -436,7 +433,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_SLB) data_access_slb_pSeries: HMT_MEDIUM mtspr SPRN_SPRG1,r13 - RUNLATCH_ON(r13) mfspr r13,SPRN_SPRG3 /* get paca address into r13 */ std r3,PACA_EXSLB+EX_R3(r13) mfspr r3,SPRN_DAR @@ -462,7 +458,6 @@ data_access_slb_pSeries: instruction_access_slb_pSeries: HMT_MEDIUM mtspr SPRN_SPRG1,r13 - RUNLATCH_ON(r13) mfspr r13,SPRN_SPRG3 /* get paca address into r13 */ std r3,PACA_EXSLB+EX_R3(r13) mfspr r3,SPRN_SRR0 /* SRR0 is faulting address */ @@ -493,7 +488,6 @@ instruction_access_slb_pSeries: .globl system_call_pSeries system_call_pSeries: HMT_MEDIUM - RUNLATCH_ON(r9) mr r9,r13 mfmsr r10 mfspr r13,SPRN_SPRG3 @@ -577,7 +571,6 @@ slb_miss_user_pseries: system_reset_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ - RUNLATCH_ON(r13) EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common) .globl machine_check_fwnmi @@ -585,7 +578,6 @@ system_reset_fwnmi: machine_check_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ - RUNLATCH_ON(r13) EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common) #ifdef CONFIG_PPC_ISERIES @@ -896,7 +888,6 @@ unrecov_fer: .align 7 .globl data_access_common data_access_common: - RUNLATCH_ON(r10) /* It wont fit in the 0x300 handler */ mfspr r10,SPRN_DAR std r10,PACA_EXGEN+EX_DAR(r13) mfspr r10,SPRN_DSISR @@ -1044,6 +1035,7 @@ hardware_interrupt_common: EXCEPTION_PROLOG_COMMON(0x500, PACA_EXGEN) hardware_interrupt_entry: DISABLE_INTS + bl .ppc64_runlatch_on addi r3,r1,STACK_FRAME_OVERHEAD bl .do_IRQ b .ret_from_except_lite Index: build/arch/powerpc/kernel/process.c =================================================================== --- build.orig/arch/powerpc/kernel/process.c 2006-02-11 14:50:46.000000000 +1100 +++ build/arch/powerpc/kernel/process.c 2006-02-13 13:12:45.000000000 +1100 @@ -888,3 +888,35 @@ void dump_stack(void) show_stack(current, NULL); } EXPORT_SYMBOL(dump_stack); + +#ifdef CONFIG_PPC64 +void ppc64_runlatch_on(void) +{ + unsigned long ctrl; + + if 
(cpu_has_feature(CPU_FTR_CTRL) && !test_thread_flag(TIF_RUNLATCH)) { + HMT_medium(); + + ctrl = mfspr(SPRN_CTRLF); + ctrl |= CTRL_RUNLATCH; + mtspr(SPRN_CTRLT, ctrl); + + set_thread_flag(TIF_RUNLATCH); + } +} + +void ppc64_runlatch_off(void) +{ + unsigned long ctrl; + + if (cpu_has_feature(CPU_FTR_CTRL) && test_thread_flag(TIF_RUNLATCH)) { + HMT_medium(); + + clear_thread_flag(TIF_RUNLATCH); + + ctrl = mfspr(SPRN_CTRLF); + ctrl &= ~CTRL_RUNLATCH; + mtspr(SPRN_CTRLT, ctrl); + } +} +#endif Index: build/arch/powerpc/platforms/iseries/setup.c =================================================================== --- build.orig/arch/powerpc/platforms/iseries/setup.c 2006-02-11 14:50:46.000000000 +1100 +++ build/arch/powerpc/platforms/iseries/setup.c 2006-02-11 14:50:55.000000000 +1100 @@ -648,6 +648,7 @@ static void yield_shared_processor(void) * here and let the timer_interrupt code sort out the actual time. */ get_lppaca()->int_dword.fields.decr_int = 1; + ppc64_runlatch_on(); process_iSeries_events(); } Index: build/include/asm-powerpc/reg.h =================================================================== --- build.orig/include/asm-powerpc/reg.h 2006-02-11 14:50:46.000000000 +1100 +++ build/include/asm-powerpc/reg.h 2006-02-11 14:50:55.000000000 +1100 @@ -615,27 +615,9 @@ #define proc_trap() asm volatile("trap") #ifdef CONFIG_PPC64 -static inline void ppc64_runlatch_on(void) -{ - unsigned long ctrl; - - if (cpu_has_feature(CPU_FTR_CTRL)) { - ctrl = mfspr(SPRN_CTRLF); - ctrl |= CTRL_RUNLATCH; - mtspr(SPRN_CTRLT, ctrl); - } -} - -static inline void ppc64_runlatch_off(void) -{ - unsigned long ctrl; - - if (cpu_has_feature(CPU_FTR_CTRL)) { - ctrl = mfspr(SPRN_CTRLF); - ctrl &= ~CTRL_RUNLATCH; - mtspr(SPRN_CTRLT, ctrl); - } -} + +extern void ppc64_runlatch_on(void); +extern void ppc64_runlatch_off(void); extern unsigned long scom970_read(unsigned int address); extern void scom970_write(unsigned int address, unsigned long value); @@ -645,15 +627,6 @@ extern void 
scom970_write(unsigned int a #define __get_SP() ({unsigned long sp; \ asm volatile("mr %0,1": "=r" (sp)); sp;}) -#else /* __ASSEMBLY__ */ - -#define RUNLATCH_ON(REG) \ -BEGIN_FTR_SECTION \ - mfspr (REG),SPRN_CTRLF; \ - ori (REG),(REG),CTRL_RUNLATCH; \ - mtspr SPRN_CTRLT,(REG); \ -END_FTR_SECTION_IFSET(CPU_FTR_CTRL) - #endif /* __ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_REG_H */ Index: build/include/asm-powerpc/thread_info.h =================================================================== --- build.orig/include/asm-powerpc/thread_info.h 2006-02-11 14:50:46.000000000 +1100 +++ build/include/asm-powerpc/thread_info.h 2006-02-11 14:50:55.000000000 +1100 @@ -113,7 +113,7 @@ static inline struct thread_info *curren #define TIF_POLLING_NRFLAG 4 /* true if poll_idle() is polling TIF_NEED_RESCHED */ #define TIF_32BIT 5 /* 32 bit binary */ -/* #define SPARE 6 */ +#define TIF_RUNLATCH 6 /* Is the runlatch enabled? */ #define TIF_ABI_PENDING 7 /* 32/64 bit switch needed */ #define TIF_SYSCALL_AUDIT 8 /* syscall auditing active */ #define TIF_SINGLESTEP 9 /* singlestepping active */ @@ -131,7 +131,7 @@ static inline struct thread_info *curren #define _TIF_NEED_RESCHED (1< Hi, HMT support is currently broken and needs to be reworked to play nicely with the SMT scheduler. Remove the bit rotten bits for the time being. I also updated an incorrect comment, we enter __secondary_hold with the physical cpu id in r3. 
Anton Signed-off-by: Anton Blanchard --- Index: build/arch/powerpc/kernel/head_64.S =================================================================== --- build.orig/arch/powerpc/kernel/head_64.S 2006-02-13 18:08:18.000000000 +1100 +++ build/arch/powerpc/kernel/head_64.S 2006-02-13 18:08:19.000000000 +1100 @@ -139,7 +139,7 @@ _GLOBAL(__secondary_hold) ori r24,r24,MSR_RI mtmsrd r24 /* RI on */ - /* Grab our linux cpu number */ + /* Grab our physical cpu number */ mr r24,r3 /* Tell the master cpu we're here */ @@ -153,11 +153,6 @@ _GLOBAL(__secondary_hold) cmpdi 0,r4,1 bne 100b -#ifdef CONFIG_HMT - LOAD_REG_IMMEDIATE(r4, .hmt_init) - mtctr r4 - bctr -#else #ifdef CONFIG_SMP LOAD_REG_IMMEDIATE(r4, .pSeries_secondary_smp_init) mtctr r4 @@ -166,7 +161,6 @@ _GLOBAL(__secondary_hold) #else BUG_OPCODE #endif -#endif /* This value is used to mark exception frames on the stack. */ .section ".toc","aw" @@ -1810,22 +1804,6 @@ _STATIC(start_here_multiplatform) ori r6,r6,MSR_RI mtmsrd r6 /* RI on */ -#ifdef CONFIG_HMT - /* Start up the second thread on cpu 0 */ - mfspr r3,SPRN_PVR - srwi r3,r3,16 - cmpwi r3,0x34 /* Pulsar */ - beq 90f - cmpwi r3,0x36 /* Icestar */ - beq 90f - cmpwi r3,0x37 /* SStar */ - beq 90f - b 91f /* HMT not supported */ -90: li r3,0 - bl .hmt_start_secondary -91: -#endif - /* The following gets the stack and TOC set up with the regs */ /* pointing to the real addr of the kernel stack. This is */ /* all done to support the C function call below which sets */ @@ -1939,77 +1917,8 @@ _STATIC(start_here_common) bl .start_kernel -_GLOBAL(hmt_init) -#ifdef CONFIG_HMT - LOAD_REG_IMMEDIATE(r5, hmt_thread_data) - mfspr r7,SPRN_PVR - srwi r7,r7,16 - cmpwi r7,0x34 /* Pulsar */ - beq 90f - cmpwi r7,0x36 /* Icestar */ - beq 91f - cmpwi r7,0x37 /* SStar */ - beq 91f - b 101f -90: mfspr r6,SPRN_PIR - andi. r6,r6,0x1f - b 92f -91: mfspr r6,SPRN_PIR - andi.
r6,r6,0x3ff -92: sldi r4,r24,3 - stwx r6,r5,r4 - bl .hmt_start_secondary - b 101f - -__hmt_secondary_hold: - LOAD_REG_IMMEDIATE(r5, hmt_thread_data) - clrldi r5,r5,4 - li r7,0 - mfspr r6,SPRN_PIR - mfspr r8,SPRN_PVR - srwi r8,r8,16 - cmpwi r8,0x34 - bne 93f - andi. r6,r6,0x1f - b 103f -93: andi. r6,r6,0x3f - -103: lwzx r8,r5,r7 - cmpw r8,r6 - beq 104f - addi r7,r7,8 - b 103b - -104: addi r7,r7,4 - lwzx r9,r5,r7 - mr r24,r9 -101: -#endif - mr r3,r24 - b .pSeries_secondary_smp_init - -#ifdef CONFIG_HMT -_GLOBAL(hmt_start_secondary) - LOAD_REG_IMMEDIATE(r4,__hmt_secondary_hold) - clrldi r4,r4,4 - mtspr SPRN_NIADORM, r4 - mfspr r4, SPRN_MSRDORM - li r5, -65 - and r4, r4, r5 - mtspr SPRN_MSRDORM, r4 - lis r4,0xffef - ori r4,r4,0x7403 - mtspr SPRN_TSC, r4 - li r4,0x1f4 - mtspr SPRN_TST, r4 - mfspr r4, SPRN_HID0 - ori r4, r4, 0x1 - mtspr SPRN_HID0, r4 - mfspr r4, SPRN_CTRLF - oris r4, r4, 0x40 - mtspr SPRN_CTRLT, r4 - blr -#endif + /* Not reached */ + BUG_OPCODE /* * We put a few things here that have to be page-aligned. Index: build/arch/powerpc/kernel/prom_init.c =================================================================== --- build.orig/arch/powerpc/kernel/prom_init.c 2006-02-13 15:04:15.000000000 +1100 +++ build/arch/powerpc/kernel/prom_init.c 2006-02-13 18:08:19.000000000 +1100 @@ -205,14 +205,6 @@ static cell_t __initdata regbuf[1024]; #define MAX_CPU_THREADS 2 -/* TO GO */ -#ifdef CONFIG_HMT -struct { - unsigned int pir; - unsigned int threadid; -} hmt_thread_data[NR_CPUS]; -#endif /* CONFIG_HMT */ - /* * Error results ... some OF calls will return "-1" on error, some * will return 0, some will return either. 
To simplify, here are @@ -1319,10 +1311,6 @@ static void __init prom_hold_cpus(void) */ *spinloop = 0; -#ifdef CONFIG_HMT - for (i = 0; i < NR_CPUS; i++) - RELOC(hmt_thread_data)[i].pir = 0xdeadbeef; -#endif /* look for cpus */ for (node = 0; prom_next_node(&node); ) { type[0] = 0; @@ -1389,32 +1377,6 @@ static void __init prom_hold_cpus(void) /* Reserve cpu #s for secondary threads. They start later. */ cpuid += cpu_threads; } -#ifdef CONFIG_HMT - /* Only enable HMT on processors that provide support. */ - if (__is_processor(PV_PULSAR) || - __is_processor(PV_ICESTAR) || - __is_processor(PV_SSTAR)) { - prom_printf(" starting secondary threads\n"); - - for (i = 0; i < NR_CPUS; i += 2) { - if (!cpu_online(i)) - continue; - - if (i == 0) { - unsigned long pir = mfspr(SPRN_PIR); - if (__is_processor(PV_PULSAR)) { - RELOC(hmt_thread_data)[i].pir = - pir & 0x1f; - } else { - RELOC(hmt_thread_data)[i].pir = - pir & 0x3ff; - } - } - } - } else { - prom_printf("Processor is not HMT capable\n"); - } -#endif if (cpuid > NR_CPUS) prom_printf("WARNING: maximum CPUs (" __stringify(NR_CPUS) Index: build/arch/powerpc/platforms/pseries/Kconfig =================================================================== --- build.orig/arch/powerpc/platforms/pseries/Kconfig 2005-11-05 20:51:08.000000000 +1100 +++ build/arch/powerpc/platforms/pseries/Kconfig 2006-02-13 18:09:21.000000000 +1100 @@ -9,13 +9,6 @@ config PPC_SPLPAR processors, that is, which share physical processors between two or more partitions. -config HMT - bool "Hardware multithreading" - depends on SMP && PPC_PSERIES && BROKEN - help - This option enables hardware multithreading on RS64 cpus. - pSeries systems p620 and p660 have such a cpu type. 
-
 config EEH
 	bool "PCI Extended Error Handling (EEH)" if EMBEDDED
 	depends on PPC_PSERIES

From utz.bacher at de.ibm.com Tue Feb 14 07:36:25 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Mon, 13 Feb 2006 21:36:25 +0100 (CET)
Subject: [FYI/PATCH 2/3] fix IIC device tree interpretation for Cell
Message-ID:

This patch applies on top of Arnd's posting (patch id 4188) from 1/18
(on top of 2.6.15.4).
It fixes the Linux interpretation of the Cell SLOF device tree IIC
target IDs and is recommended for running on a Cell blade today.

Cc: Jens Osterkamp
Cc: Arnd Bergmann
From: Gerhard Stenzel
Signed-off-by: Utz Bacher

Index: linux-2.6.15.4/arch/powerpc/platforms/cell/interrupt.c
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/interrupt.c
+++ linux-2.6.15.4/arch/powerpc/platforms/cell/interrupt.c
@@ -254,13 +254,13 @@
 	iic = &per_cpu(iic, np[0]);
 	iic->regs = __ioremap(regs[0], sizeof(struct iic_regs), _PAGE_NO_CACHE);
-	iic->target_id = (np[0] << 4) + 0xe;
+	iic->target_id = ((np[0] & 2) << 3) + ((np[0] & 1) ? 0xf : 0xe);
 	printk("IIC for CPU %d at %lx mapped to %p\n", np[0], regs[0], iic->regs);
 
 	iic = &per_cpu(iic, np[1]);
 	iic->regs = __ioremap(regs[2], sizeof(struct iic_regs), _PAGE_NO_CACHE);
-	iic->target_id = (np[1] << 3) + 0xe;
+	iic->target_id = ((np[1] & 2) << 3) + ((np[1] & 1) ? 0xf : 0xe);
 	printk("IIC for CPU %d at %lx mapped to %p\n", np[1], regs[2], iic->regs);
 
 	found++;

From utz.bacher at de.ibm.com Tue Feb 14 07:33:42 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Mon, 13 Feb 2006 21:33:42 +0100 (CET)
Subject: [FYI/PATCH 1/3] reenable CONFIG_GEN_RTC for Cell
Message-ID:

This patch applies on top of Arnd's posting (patch id 4182) from 1/18
(on top of 2.6.15.4).
It reenables CONFIG_GEN_RTC, which allows the clock to be set on Cell
blades, and is recommended for running on such a system today.

Cc: Arnd Bergmann
From: Gerhard Stenzel
Signed-off-by: Utz Bacher

Index: linux-2.6.15.4/arch/powerpc/configs/cell_defconfig
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/configs/cell_defconfig
+++ linux-2.6.15.4/arch/powerpc/configs/cell_defconfig
@@ -654,7 +654,7 @@
 # CONFIG_PCIPCWATCHDOG is not set
 # CONFIG_WDTPCI is not set
 # CONFIG_RTC is not set
-# CONFIG_GEN_RTC is not set
+CONFIG_GEN_RTC=y
 # CONFIG_DTLK is not set
 # CONFIG_R3964 is not set
 # CONFIG_APPLICOM is not set

From HPENNER at de.ibm.com Tue Feb 14 05:19:10 2006
From: HPENNER at de.ibm.com (Hartmut Penner)
Date: Mon, 13 Feb 2006 19:19:10 +0100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139779462.5247.30.camel@localhost.localdomain>
Message-ID:

Hello,

the initial value set by FW of the HID6 is 0x00010034_00000000.

I would like to support the large pages in the firmware, but need to
know exactly what properties I have to set. I looked at the Linux code,
but I am still not quite sure what values to put into
ibm,segment-page-sizes.

Could somebody enlighten me how to find out? I am in Rochester right
now; is there somebody here I could talk to about this?

regards, Hartmut

Benjamin Herrenschmidt
02/12/06 10:24 PM

To:      Arnd Bergmann
cc:      Masato.Noguchi at jp.sony.com, linuxppc64-dev at ozlabs.org,
         geoffrey.levand at am.sony.com, Hartmut Penner/Germany/IBM at IBMDE
Subject: Re: AW: Re: __setup_cpu_be problem

> The current firmware on the Cell blades does neither the setup of
> the HID6 register nor have the correct tables in the device tree.
>
> Since I'm still currently sitting in a garden in NZ instead of the
> Böblingen lab, I can't find out what the HID6 power-on defaults
> are. We might get away with just leaving the default there, but that
> might prevent us from using 16M and/or 64k pages and there are
> definitely some application which depend on 16M hugetlb mappings
> on Cell.

Yes, however, how widely distributed and "frozen" is this current
Cell firmware ? I mean, do we really need to add a workaround to the
kernel instead of just fixing the firmware here ?

> The two problems we are facing currently are:
> - If HID6 defaults to disabling 16M large pages, the kernel will
>   get the wrong information from the CPU features and applications
>   that use it break. The firmware should add the setup of HID6
>   _now_, but we also should be prepared for users of old firmware
>   that want to upgrade their kernel without upgrading the firmware
>   at the same time.

Do we really need to support old/broken firmware ? It's not like we had
a released product all over the field...

> - We want to use 64k pages in the future, so the firmware needs to
>   add the 'ibm,segment-page-sizes' property ASAP, preferably at
>   the same time they start setting up HID6. I currently have a
>   hack for the kernel to override that, but we're in the process
>   of eliminating all the special hacks that won't make it into
>   the mainline kernel.

The only thing you need is to have this property set and the new
ibm,pa-feature for which I need to dig out the latest spec....

The problem is that the kernel will currently not enable 64k pages on
any processor due to the lack of a feature bit (intentionally) from the
cputable. That bit will be extracted from ibm,pa-features at least on
pSeries. It's the bit indicating that L=1 works for cache inhibited
mappings.

> Yes, 1M mappings are probably not of much use to us, and other OSs
> already do whatever they like ;-).

Sure.
Note that the firmware can still set HID6 to 1M pages and put the
appropriate entries in the device-tree for 1M large pages. Linux won't
be able to use them as-is though but at least the device-tree info will
be sane. I don't want to enter a debate whether we should be able to
change HID6 etc... right now. It's more a firmware configuration issue
as far as I'm concerned.

> Then please try to at least send the spec or a link to Hartmut's IBM
> internal address (hpenner at de.ibm.com). I already pointed him to the
> linux code when it was initially merged, but he argued that reverse
> engineering that code is not good enough to be sure to get the
> property right and not having it in there is better than having incorrect
> properties.

Will do

Ben.

From arnd at arndb.de Tue Feb 14 09:17:31 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 13 Feb 2006 23:17:31 +0100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To:
References:
Message-ID: <200602132317.32034.arnd@arndb.de>

On Monday 13 February 2006 19:19, Hartmut Penner wrote:
> the initial value set by FW of the HID6 is 0x00010034_00000000.

Ok, good: Both large page sizes are set to 16M, neither is 1M or 64k.

That means that we can just rip out the HID6 setup from the kernel
without losing the ability for 16M pages. Geoff, please submit a
patch to replace __setup_cpu_be with __setup_cpu_power4 if that
solves your problem.

The new 64k page support has never worked so far on Cell because
of missing spufs code for this, so we don't get a regression either
way. We still need the firmware changes (HID6 setup and the device
tree properties) in order to support 64k pages, but we don't need
to worry about breaking stuff in the process.

	Arnd <><

From arnd at arndb.de Tue Feb 14 09:24:44 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 13 Feb 2006 23:24:44 +0100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139779462.5247.30.camel@localhost.localdomain>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
 <200602120552.26164.arnd@arndb.de>
 <1139779462.5247.30.camel@localhost.localdomain>
Message-ID: <200602132324.45433.arnd@arndb.de>

On Sunday 12 February 2006 22:24, Benjamin Herrenschmidt wrote:
> > The current firmware on the Cell blades does neither the setup of
> > the HID6 register nor have the correct tables in the device tree.
> >
> > Since I'm still currently sitting in a garden in NZ instead of the
> > Böblingen lab, I can't find out what the HID6 power-on defaults
> > are. We might get away with just leaving the default there, but that
> > might prevent us from using 16M and/or 64k pages and there are
> > definitely some application which depend on 16M hugetlb mappings
> > on Cell.
>
> Yes, however, how widely distributed and "frozen" is this current
> Cell firmware ? I mean, do we really need to add a workaround to the
> kernel instead of just fixing the firmware here ?

The firmware update procedure is a little tricky, so our firmware
people decided to do as few updates as possible, which means we won't
have small 'hotfix' updates going to the customer.

> > The two problems we are facing currently are:
> > - If HID6 defaults to disabling 16M large pages, the kernel will
> >   get the wrong information from the CPU features and applications
> >   that use it break. The firmware should add the setup of HID6
> >   _now_, but we also should be prepared for users of old firmware
> >   that want to upgrade their kernel without upgrading the firmware
> >   at the same time.
>
> Do we really need to support old/broken firmware ? It's not like we had
> a released product all over the field...

Basically, we do want to support old firmware that went out in our
customer shipments, but as I wrote in the other mail, we don't need
to worry about that in this case. Also, the requirement is only to
be able to boot with the mainline kernel; for production setup, users
of the currently shipping hardware would also need other patches e.g.
to work around performance errata in the CPU stepping.

I expect that for the systems that ship in larger quantities (Mercury,
Sony and IBM ones in the foreseeable future) we can do without ugly
hacks of that sort.

	Arnd <><

From benh at kernel.crashing.org Tue Feb 14 09:40:49 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Tue, 14 Feb 2006 09:40:49 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <200602132317.32034.arnd@arndb.de>
References: <200602132317.32034.arnd@arndb.de>
Message-ID: <1139870450.5237.34.camel@localhost.localdomain>

On Mon, 2006-02-13 at 23:17 +0100, Arnd Bergmann wrote:
> On Monday 13 February 2006 19:19, Hartmut Penner wrote:
> > the initial value set by FW of the HID6 is 0x00010034_00000000.
>
> Ok, good: Both large page sizes are set to 16M, neither is 1M or 64k.

That should be changed. One should be set to 64K and the other to 16M.
At this point, it's not yet clear how the kernel will make use of 64K
pages; it requires a feature bit that is never set (indicating that
cache inhibited L pages are supported). It will be provided by the
ibm,pa-feature property in the long run but last I looked, it wasn't
yet implemented by any firmware.

> That means that we can just rip out the HID6 setup from the kernel
> without losing the ability for 16M pages. Geoff, please submit a
> patch to replace __setup_cpu_be with __setup_cpu_power4 if that
> solves your problem.
>
> The new 64k page support has never worked so far on Cell because
> of missing spufs code for this, so we don't get a regression either
> way.
We still need the firmware changes (HID6 setup and the device > tree properties) in order to support 64k pages, but we don't need > to worry about breaking stuff in the process. Ben. From utz.bacher at de.ibm.com Tue Feb 14 12:58:34 2006 From: utz.bacher at de.ibm.com (Utz Bacher) Date: Tue, 14 Feb 2006 02:58:34 +0100 (CET) Subject: [FYI/PATCH 3/3] increase direct mapping sizes for spufs Message-ID: This patch applies on top of Arnd's postings (patch ids 4192, 4185, 4190) from 1/17 (on top of 2.6.15.4). It maps 16k instead of 4k for each problem-state mapped subarea. The mfc mapping contains the Multisource Synchronization Area, the MFC Command Parameter Area and the MFC Command Queue Control Area; the cntl mapping contains the SPU Control Area while the signal1 and signal2 mapping contain the relevant Signal-Notification Area. This allows libspe to build on direct problem state mapping and is recommended for running on a Cell blade today. The code may change in the near future. Cc: Mark Nutter Cc: Arnd Bergmann From: Ulrich Weigand Signed-off-by: Utz Bacher Index: linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c =================================================================== --- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/context.c +++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c @@ -116,13 +116,13 @@ if (ctx->local_store) unmap_mapping_range(ctx->local_store, 0, LS_SIZE, 1); if (ctx->mfc) - unmap_mapping_range(ctx->mfc, 0, 0x1000, 1); + unmap_mapping_range(ctx->mfc, 0, 0x4000, 1); if (ctx->cntl) - unmap_mapping_range(ctx->cntl, 0, 0x1000, 1); + unmap_mapping_range(ctx->cntl, 0, 0x4000, 1); if (ctx->signal1) - unmap_mapping_range(ctx->signal1, 0, 0x1000, 1); + unmap_mapping_range(ctx->signal1, 0, 0x4000, 1); if (ctx->signal2) - unmap_mapping_range(ctx->signal2, 0, 0x1000, 1); + unmap_mapping_range(ctx->signal2, 0, 0x4000, 1); } int spu_acquire_runnable(struct spu_context *ctx) Index: 
linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c =================================================================== --- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/file.c +++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c @@ -158,7 +158,7 @@ int ret; offset += vma->vm_pgoff << PAGE_SHIFT; - if (offset > 0x1000) + if (offset >= 0x4000) goto out; ret = spu_acquire_runnable(ctx); From geoffrey.levand at am.sony.com Tue Feb 14 13:08:01 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Mon, 13 Feb 2006 18:08:01 -0800 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <200602132317.32034.arnd@arndb.de> References: <200602132317.32034.arnd@arndb.de> Message-ID: <43F13B81.9020804@am.sony.com> Arnd Bergmann wrote: > That means that we can just rip out the HID6 setup from the kernel > without losing the ability for 16M pages. Geoff, please submit a > patch to replace __setup_cpu_be with __setup_cpu_power4 if that > solves your problem. This patch removes the incorrect and unneeded Cell processor setup routine __setup_cpu_be. __setup_cpu_be improperly accesses the hypervisor page size configuration at SPR HID6. The correct behavior is for the firmware or hypervisor to setup the correct page size configuration and pass those settings to the kernel in the device-tree. 
Signed-off-by: Geoff Levand -- diff --git a/arch/powerpc/kernel/cpu_setup_power4.S b/arch/powerpc/kernel/cpu_setup_power4.S index b61d86e..5c96481 100644 --- a/arch/powerpc/kernel/cpu_setup_power4.S +++ b/arch/powerpc/kernel/cpu_setup_power4.S @@ -76,20 +76,6 @@ _GLOBAL(__970_cpu_preinit) _GLOBAL(__setup_cpu_power4) blr -_GLOBAL(__setup_cpu_be) - /* Set large page sizes LP=0: 16MB, LP=1: 64KB */ - addi r3, 0, 0 - ori r3, r3, HID6_LB - sldi r3, r3, 32 - nor r3, r3, r3 - mfspr r4, SPRN_HID6 - and r4, r4, r3 - addi r3, 0, 0x02000 - sldi r3, r3, 32 - or r4, r4, r3 - mtspr SPRN_HID6, r4 - blr - _GLOBAL(__setup_cpu_ppc970) mfspr r0,SPRN_HID0 li r11,5 /* clear DOZE and SLEEP */ diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c index 3191be7..19fc380 100644 --- a/arch/powerpc/kernel/cputable.c +++ b/arch/powerpc/kernel/cputable.c @@ -33,7 +33,6 @@ EXPORT_SYMBOL(cur_cpu_spec); #ifdef CONFIG_PPC64 extern void __setup_cpu_power3(unsigned long offset, struct cpu_spec* spec); extern void __setup_cpu_power4(unsigned long offset, struct cpu_spec* spec); -extern void __setup_cpu_be(unsigned long offset, struct cpu_spec* spec); #else extern void __setup_cpu_603(unsigned long offset, struct cpu_spec* spec); extern void __setup_cpu_604(unsigned long offset, struct cpu_spec* spec); @@ -270,7 +269,7 @@ struct cpu_spec cpu_specs[] = { PPC_FEATURE_CELL | PPC_FEATURE_HAS_ALTIVEC_COMP, .icache_bsize = 128, .dcache_bsize = 128, - .cpu_setup = __setup_cpu_be, + .cpu_setup = __setup_cpu_power4, .platform = "ppc-cell-be", }, { /* default match */ From arndb at de.ibm.com Tue Feb 14 14:46:18 2006 From: arndb at de.ibm.com (Arnd Bergmann) Date: Tue, 14 Feb 2006 04:46:18 +0100 Subject: [FYI/PATCH 3/3] increase direct mapping sizes for spufs In-Reply-To: References: Message-ID: <200602140446.19422.arndb@de.ibm.com> On Tuesday 14 February 2006 02:58, Utz Bacher wrote: > This patch applies on top of Arnd's postings (patch ids 4192, 4185, 4190) > from 1/17 (on top of 
2.6.15.4). > It maps 16k instead of 4k for each problem-state mapped subarea. The mfc > mapping contains the Multisource Synchronization Area, the MFC Command > Parameter Area and the MFC Command Queue Control Area; the cntl mapping > contains the SPU Control Area while the signal1 and signal2 mapping > contain the relevant Signal-Notification Area. > This allows libspe to build on direct problem state mapping and is > recommended for running on a Cell blade today. The code may change in the > near future. > > Cc: Mark Nutter > Cc: Arnd Bergmann > From: Ulrich Weigand > Signed-off-by: Utz Bacher Nack. Both the intent and the implementation are flawed. Please keep the size of each problem state mapping to one page. Your description is not completely clear on the actual problem. I assume that the code that I posted earlier had the wrong start address for the MFC page, if that's what happened, please just fix the start address. Last time we discussed this, the understanding was that the Multisource Synchronization Area does not need to be exposed to user space. If a need for that has now come up, we should add a new file for it that also allows synchronizing with file operations. Alternatively, we could implement that as a 'fsync' file operation on the mfc file. Arnd <>< From sfr at canb.auug.org.au Tue Feb 14 18:32:59 2006 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 14 Feb 2006 18:32:59 +1100 Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics In-Reply-To: <17393.16261.768862.724265@cargo.ozlabs.ibm.com> References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com> Message-ID: <20060214183259.28a6a501.sfr@canb.auug.org.au> Hi Manish, Paul has asked me to have a look at this patch and to also consider what has been done for similar work in s390. 
I will compare this to s390 tomorrow, but for now here are some
preliminary comments:

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/asm-offsets.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/asm-offsets.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/asm-offsets.c	2006-01-17 15:39:03.000000000 -0800
> @@ -144,6 +144,10 @@
>  	DEFINE(LPPACASRR1, offsetof(struct lppaca, saved_srr1));
>  	DEFINE(LPPACAANYINT, offsetof(struct lppaca, int_dword.any_int));
>  	DEFINE(LPPACADECRINT, offsetof(struct lppaca, int_dword.fields.decr_int));
> +	DEFINE(PACA_STARTB, offsetof(struct paca_struct, start_tb));
> +	DEFINE(PACA_CDFLAG, offsetof(struct paca_struct, cdflag));
> +	DEFINE(PACA_DELTATB, offsetof(struct paca_struct, delta_tb));

Why not PACA_START_TB and PACA_DELTA_TB? Also, start_tb and delta_tb
don't really store time base values, but PURR values.

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/entry_64.S	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S	2006-01-17 15:39:03.000000000 -0800
> @@ -520,7 +520,19 @@
>  	 * r13 is our per cpu area, only restore it if we are returning to
>  	 * userspace
>  	 */
> +
>  	beq	1f
> +BEGIN_FTR_SECTION
> +	li	r10,0
> +	stb	r10,PACA_CDFLAG(r13)

cdflag gets set here but is not set or used anywhere else.

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c	2006-01-17 21:20:25.000000000 -0800
> @@ -243,6 +243,7 @@
>  	struct thread_struct *new_thread, *old_thread;
>  	unsigned long flags;
>  	struct task_struct *last;
> +	struct paca_struct *lpaca;

This could have been declared below (near pd)

>
>  #ifdef CONFIG_SMP
>  	/* avoid complexity of lazy save/restore of fpu
> @@ -313,19 +314,34 @@
>  	new_thread = &new->thread;
>  	old_thread = &current->thread;
>
> -#ifdef CONFIG_PPC64
> -	/*
> -	 * Collect processor utilization data per process
> -	 */
> -	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
> -		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
> -		long unsigned start_tb, current_tb;
> -		start_tb = old_thread->start_tb;
> -		cu->current_tb = current_tb = mfspr(SPRN_PURR);
> -		old_thread->accum_tb += (current_tb - start_tb);
> -		new_thread->start_tb = current_tb;
> +
> +/* Collect cpu_util utilization data per process and per processor wise */
> +	if (cpu_has_feature(CPU_FTR_PURR)) {
> +		struct cpu_usage *pd = &__get_cpu_var(cpu_usage_array);

Was there some good reason to change this variable name from cu to pd?

> +		long unsigned start_cpu_util, current_cpu_util;
> +
> +		if ( old_thread->start_cpu_util )
> +			pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
> +		else
> +			old_thread->start_cpu_util = pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);

Probably better would be:

		pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
		if (old_thread->start_cpu_util == 0)
			old_thread->start_cpu_util = current_cpu_util;

> +
> +		/* store delta_tb & mftb into cpu_util data array for *
> +		 * later easy access otherwise you have to do run_on_cpu *
> +		 * which is expensive */

Comment style should be:

		/* store delta_tb & mftb into cpu_util data array for
		 * later easy access otherwise you have to do run_on_cpu
		 * which is expensive
		 */

> +
> +		lpaca = get_paca();
> +		pd->collected_krntb = lpaca->delta_tb;
> +		pd->collected_timebase = mftb();
> +
> +		start_cpu_util = old_thread->start_cpu_util;
> +		old_thread->total_dp += (current_cpu_util - start_cpu_util);
> +
> +		/* collect time from entry into kernel to now and account it *
> +		 * in process kernel time */

Comment style again.

> +
> +		old_thread->proc_stime += (current_cpu_util - lpaca->start_tb);
> +		new_thread->start_cpu_util = current_cpu_util;
>  	}
> -#endif
>
>  	local_irq_save(flags);
>  	last = _switch(old_thread, new_thread);
> Index: linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/setup_64.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c	2006-02-10 11:51:28.197401840 -0800
> @@ -851,3 +851,153 @@
> +static void collect_cpu_deltas(int cpu)

> +static void post_cpu_deltas(int cpu)

Should those two be #ifdef CONFIG_HOTPLUG_CPU ?

> +	/* Initialize the global variables to zero */
> +	offline_cpu_total_tb = 0;
> +	offline_cpu_total_cpu_util = 0;
> +	offline_cpu_total_krncycles = 0;
> +	offline_cpu_total_idle = 0;

You don't need to set these to zero explicitly.
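
To illustrate the point about explicit zeroing: objects with static storage duration are zero-initialized by the C language before any code runs, so counters like these start at 0 without an assignment. The following is only a minimal standalone sketch. It assumes the patch declares these variables at file scope, the names are borrowed from the quoted hunk, and `offline_totals_are_zero` is a hypothetical helper that is not part of the patch.

```c
#include <stdbool.h>

/* File-scope objects have static storage duration, so the C language
 * guarantees they start out as zero; no explicit "= 0" is required.
 * (Assumption: the patch declares its accumulators at file scope.) */
static unsigned long offline_cpu_total_tb;
static unsigned long offline_cpu_total_cpu_util;
static unsigned long offline_cpu_total_krncycles;
static unsigned long offline_cpu_total_idle;

/* Hypothetical helper, for illustration only: reports whether every
 * accumulator still holds its implicit initial value. */
bool offline_totals_are_zero(void)
{
	return offline_cpu_total_tb == 0 &&
	       offline_cpu_total_cpu_util == 0 &&
	       offline_cpu_total_krncycles == 0 &&
	       offline_cpu_total_idle == 0;
}
```

An explicit reset would only be needed if the totals had to be cleared again later, after they had already been accumulated into.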

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/sysfs.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c	2006-02-10 12:36:02.375372096 -0800
> @@ -232,8 +240,11 @@
>  	if (cur_cpu_spec->num_pmcs >= 8)
>  		sysdev_create_file(s, &attr_pmc8);
>
> -	if (cpu_has_feature(CPU_FTR_SMT))
> +	if (cpu_has_feature(CPU_FTR_PURR)) {
>  		sysdev_create_file(s, &attr_purr);

This will mean that the "purr" file doesn't exist in some cases where it
used to (even if it was useless). Not sure if that is a problem for any
user mode utilities.

> Index: linux-2.6.15-rc6/include/asm-powerpc/processor.h
> ===================================================================
> --- linux-2.6.15-rc6.orig/include/asm-powerpc/processor.h	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/include/asm-powerpc/processor.h	2006-01-17 21:31:17.000000000 -0800
> @@ -177,6 +177,9 @@
>  #ifdef CONFIG_PPC64
>  	unsigned long start_tb; /* Start purr when proc switched in */
>  	unsigned long accum_tb; /* Total accumilated purr for process */
> +	unsigned long start_cpu_util; /* Start cpu_util when proc switch in */
> +	unsigned long total_dp ; /* Total delta cpu_util accum for proc */
> +	unsigned long proc_stime; /* Was pad,Now process cpu_util stime */

total_dp and proc_stime are not used anywhere, and start_tb and accum_tb
are no longer used.

--
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From mohan at in.ibm.com Tue Feb 14 23:07:48 2006
From: mohan at in.ibm.com (Mohan Kumar M)
Date: Tue, 14 Feb 2006 17:37:48 +0530
Subject: kexec tools gcc 4.1.0 issue
Message-ID: <1139918867.8472.100.camel@explorer.in.ibm.com>

Hi,

Latest kexec tools for PPC64 with purgatory patch
(ppc64-kdump-purgatory-backup-support.patch) was not working with gcc
version 4.1.0 due to the change in object file generation.
Here is the patch to fix this issue.

This patch is created on top of the following level of kexec-tools:
- kexec-tools-1.101.tar.gz (from eric biederman's site or from lse site)
- kexec-tools-1.101-kdump6.patch (consolidated patch posted on
  http://lse.sourceforge.net/kdump/patches/1.101-kdump6/kexec-tools-1.101-kdump6.patch)

Review and suggestions are welcome.

Note: Resending the patch since it was not delivered to both the
fastboot and linuxppc64-dev mailing lists.

Regards,
Mohan.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: kexec-ppc-gcc410-fix.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060214/03a7bcc7/attachment.txt

From mohan at in.ibm.com Tue Feb 14 23:30:03 2006
From: mohan at in.ibm.com (Mohan Kumar M)
Date: Tue, 14 Feb 2006 18:00:03 +0530
Subject: kexec tools gcc warnings cleanup
Message-ID: <1139918870.8472.102.camel@explorer.in.ibm.com>

Clean up the warnings generated in GCC 4.1.0 compilation of kexec-tools.

This patch is created on top of the following level of kexec-tools:
- kexec-tools-1.101.tar.gz (from eric biederman's site or from lse site)
- kexec-tools-1.101-kdump6.patch (consolidated patch posted on
  http://lse.sourceforge.net/kdump/patches/1.101-kdump6/kexec-tools-1.101-kdump6.patch)

Review and suggestions are welcome.

Note: Resending the patch since it was not delivered to both the
fastboot and linuxppc64-dev mailing lists.

Regards,
Mohan.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: kexec-gcc-cleanup.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060214/2d50a242/attachment.txt

From geoffrey.levand at am.sony.com Wed Feb 15 05:22:08 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Tue, 14 Feb 2006 10:22:08 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <200602132324.45433.arnd@arndb.de>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
 <200602120552.26164.arnd@arndb.de>
 <1139779462.5247.30.camel@localhost.localdomain>
 <200602132324.45433.arnd@arndb.de>
Message-ID: <43F21FD0.507@am.sony.com>

Arnd Bergmann wrote:
> On Sunday 12 February 2006 22:24, Benjamin Herrenschmidt wrote:
>> > The current firmware on the Cell blades does neither the setup of
>> > the HID6 register nor have the correct tables in the device tree.
>> >
>> > Since I'm still currently sitting in a garden in NZ instead of the
>> > Böblingen lab, I can't find out what the HID6 power-on defaults
>> > are. We might get away with just leaving the default there, but that
>> > might prevent us from using 16M and/or 64k pages and there are
>> > definitely some application which depend on 16M hugetlb mappings
>> > on Cell.
>>
>> Yes, however, how widely distributed and "frozen" is this current
>> Cell firmware ? I mean, do we really need to add a workaround to the
>> kernel instead of just fixing the firmware here ?
>
> The firmware update procedure is a little tricky, so our firmware
> people decided to do as few updates as possible, which means we won't
> have small 'hotfix' updates going to the customer.
>
>> > The two problems we are facing currently are:
>> > - If HID6 defaults to disabling 16M large pages, the kernel will
>> >   get the wrong information from the CPU features and applications
>> >   that use it break.
The firmware should add the setup if HID6 >> > _now_, but we also should be prepared for users of old firmware >> > that want to upgrade their kernel without upgrading the firmware >> > at the same time. >> >> Do we really need to support old/broken firmware ? It's not like we had >> a released product all over the field... > > Basically, we do want to support old firmware that went out in our > customer shippings, but as I wrote in the other mail, we don't need > to worry about that in this case. Also, the requirement is only to > be able to boot with the mainline kernel, for production setup, users > of the currently shipping hardware would also need other patches e.g. > to work around performance errata in the CPU stepping. Sorry about changing my mind on this Ben, but after reading the Book 4 docs on page sizes I see that each partition can have independent page size settings. I made the wrong assumption that all partitions needed the same size setting. Based on this, and on Arnd's comments, I think in general we will need to setup page sizes in the kernel. This is particularly true if we setup 16M + 64k pages for the spufs, since the cpu default is 16M + 16M, which is probably what most firmware will use. At any rate, I don't think we need to worry about it so much now, since those settings can be handled inside the platform code. If it makes sense later we can have an interface to access the page size settings. 

-Geoff

From benh at kernel.crashing.org Wed Feb 15 08:22:09 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 15 Feb 2006 08:22:09 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <43F21FD0.507@am.sony.com>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
 <200602120552.26164.arnd@arndb.de>
 <1139779462.5247.30.camel@localhost.localdomain>
 <200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com>
Message-ID: <1139952130.7903.24.camel@localhost.localdomain>

>
> Sorry about changing my mind on this Ben, but after reading the Book 4
> docs on page sizes I see that each partition can have independent
> page size settings. I made the wrong assumption that all partitions
> needed the same size setting. Based on this, and on Arnd's comments,
> I think in general we will need to setup page sizes in the kernel.
> This is particularly true if we setup 16M + 64k pages for the spufs,
> since the cpu default is 16M + 16M, which is probably what most
> firmware will use.
>
> At any rate, I don't think we need to worry about it so much now,
> since those settings can be handled inside the platform code. If
> it makes sense later we can have an interface to access the page
> size settings.

Well, I'll have to look more closely at the initialisation then. The
kernel currently assumes that non-legacy page sizes (that is something
other than 4k and 16M) are completely described at boot by the
device-tree, and that would imply HID6 has already been set up.

If we want to do something differently, that means that we need a "hook"
for the platform code to fill the page size description array. However,
currently, there is no platform hook between that array being filled
from the device-tree and the memory management being initialized based
on those data.

An option would be to let platform probe() functions fill the table and
set a variable telling the later hash init code to ignore the
device-tree description of page sizes ...

Ben.
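
The option described in the last paragraph above (platform code fills the table and sets a flag so that the later init step skips the device-tree values) could look roughly like the sketch below. This is only a standalone illustration under stated assumptions, not kernel code: `psize_def`, `psize_table`, `psizes_set_by_platform`, `platform_set_page_sizes`, `init_page_sizes_from_dt` and `page_shift_at` are all hypothetical names invented for this example, and page sizes are represented simply by their log2 shift (12 for 4k, 16 for 64k, 24 for 16M).

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PSIZES 4

/* Hypothetical, simplified page size descriptor. */
struct psize_def {
	unsigned int shift;	/* log2 of the page size; 0 = unused slot */
};

static struct psize_def psize_table[MAX_PSIZES];

/* Set once platform code has filled psize_table itself. */
static bool psizes_set_by_platform;

/* Platform probe() path: install the sizes and tell the later init
 * code that the device-tree description must be ignored. */
void platform_set_page_sizes(const unsigned int *shifts, size_t n)
{
	size_t i;

	for (i = 0; i < MAX_PSIZES; i++)
		psize_table[i].shift = (i < n) ? shifts[i] : 0;
	psizes_set_by_platform = true;
}

/* Later init path: apply device-tree values only if the platform did
 * not override them. Returns true when the DT values were applied. */
bool init_page_sizes_from_dt(const unsigned int *dt_shifts, size_t n)
{
	size_t i;

	if (psizes_set_by_platform)
		return false;

	for (i = 0; i < MAX_PSIZES; i++)
		psize_table[i].shift = (i < n) ? dt_shifts[i] : 0;
	return true;
}

/* Accessor used by the rest of the (hypothetical) init code. */
unsigned int page_shift_at(size_t idx)
{
	return idx < MAX_PSIZES ? psize_table[idx].shift : 0;
}
```

With this shape the ordering problem mentioned above goes away: the platform hook can run at any point before the device-tree step, because that step checks the flag rather than relying on being called first.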
From olof at lixom.net Wed Feb 15 08:27:46 2006 From: olof at lixom.net (Olof Johansson) Date: Tue, 14 Feb 2006 15:27:46 -0600 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <1139952130.7903.24.camel@localhost.localdomain> References: <2812322.110611139545275893.JavaMail.servlet@kundenserver> <200602120552.26164.arnd@arndb.de> <1139779462.5247.30.camel@localhost.localdomain> <200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com> <1139952130.7903.24.camel@localhost.localdomain> Message-ID: <20060214212746.GA6291@pb15.lixom.net> On Wed, Feb 15, 2006 at 08:22:09AM +1100, Benjamin Herrenschmidt wrote: > Well, I'll have to look more closely at the initialisation then. The > kernel currently assume that non-legacy page sizes (that is something > other than 4k and 16M) are completely described at boot by the > device-tree, and that would imply HID6 has already been setup. Isn't this something that should be configured in the hypervisor / partition firmware on the machine then, instead of hacked into the kernel? The hypervisor would of course switch HID contents when dispatching different partitions, if needed. -Olof From d.herrendoerfer at de.ibm.com Wed Feb 15 01:45:07 2006 From: d.herrendoerfer at de.ibm.com (Dirk Herrendoerfer) Date: Tue, 14 Feb 2006 15:45:07 +0100 Subject: libspe-1.0.1 Message-ID: <8b812ffd71fa2e5bedb463af40cf3184@de.ibm.com> This is the current snapshot of libspe. It is an update to version 1.0, conforming to the JSRE SPE Runtime Management Library documentation version 1.1. New in this release is the availability of direct problem state mapping, and PPE-initiated DMA. D. Herrendoerfer -------------- next part -------------- A non-text attachment was scrubbed... 
Name: libspe-1.0.1.tar.gz Type: application/x-gzip Size: 40465 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060214/8422c1a3/attachment.bin From hollis at penguinppc.org Wed Feb 15 09:31:49 2006 From: hollis at penguinppc.org (Hollis Blanchard) Date: Tue, 14 Feb 2006 16:31:49 -0600 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <43F21FD0.507@am.sony.com> References: <2812322.110611139545275893.JavaMail.servlet@kundenserver> <200602120552.26164.arnd@arndb.de> <1139779462.5247.30.camel@localhost.localdomain> <200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com> Message-ID: <1139956309.780.254385845@webmail.messagingengine.com> On Tue, 2006-02-14 at 10:22 -0800, Geoff Levand wrote: > > Sorry about changing my mind on this Ben, but after reading the Book 4 > docs on page sizes I see that each partition can have independent > page size settings. I made the wrong assumption that all partitions > needed the same size setting. Based on this, and on Arnd's comments, > I think in general we will need to setup page sizes in the kernel. On Tue, 2006-02-14 at 15:27 -0600, Olof Johansson wrote: > > Isn't this something that should be configured in the hypervisor / > partition firmware on the machine then, instead of hacked into the > kernel? The hypervisor would of course switch HID contents when > dispatching different partitions, if needed. I agree with Olof; I don't follow the original leap of logic. If every partition can have independent page size settings, and especially if HID6 is a hypervisor-privileged resource as mentioned earlier, then the hypervisor needs to set it. Only the hypervisor can restore each partition's different HID6 value when it switches between them... 
-Hollis From geoffrey.levand at am.sony.com Wed Feb 15 10:14:47 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Tue, 14 Feb 2006 15:14:47 -0800 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <1139956309.780.254385845@webmail.messagingengine.com> References: <1139956309.780.254385845@webmail.messagingengine.com> Message-ID: <43F26467.3060407@am.sony.com> Hollis Blanchard wrote: > On Tue, 2006-02-14 at 10:22 -0800, Geoff Levand wrote: >> >> Sorry about changing my mind on this Ben, but after reading the Book 4 >> docs on page sizes I see that each partition can have independent >> page size settings. I made the wrong assumption that all partitions >> needed the same size setting. Based on this, and on Arnd's comments, >> I think in general we will need to setup page sizes in the kernel. > > On Tue, 2006-02-14 at 15:27 -0600, Olof Johansson wrote: >> >> Isn't this something that should be configured in the hypervisor / >> partition firmware on the machine then, instead of hacked into the >> kernel? The hypervisor would of course switch HID contents when >> dispatching different partitions, if needed. > > I agree with Olof; I don't follow the original leap of logic. > > If every partition can have independent page size settings, and > especially if HID6 is a hypervisor-privileged resource as mentioned > earlier, then the hypervisor needs to set it. Only the hypervisor can > restore each partition's different HID6 value when it switches between > them... > I guess what I am thinking of are cases like when the firmware has no clue and just uses defaults, or when the firmware or hypervisor expect the kernel to set what sizes work best for it. In these cases a change needs to be initiated by the kernel. 
-Geoff From olof at lixom.net Wed Feb 15 10:25:47 2006 From: olof at lixom.net (Olof Johansson) Date: Tue, 14 Feb 2006 17:25:47 -0600 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <43F26467.3060407@am.sony.com> References: <1139956309.780.254385845@webmail.messagingengine.com> <43F26467.3060407@am.sony.com> Message-ID: <20060214232547.GB6291@pb15.lixom.net> On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote: > Hollis Blanchard wrote: > I guess what I am thinking of are cases like when the firmware has no > clue and just uses defaults, or when the firmware or hypervisor > expect the kernel to set what sizes work best for it. In these > cases a change needs to be initiated by the kernel. But then give the firmware a clue, and fix it. For the partitioned case, I'm sure you have ways for an alpha partition to define the characteristics of a guest partition, and/or a small controller image running in your hypervisor for similar purposes. If the kernel needs to set "what works best for it", then you should look into some of the ELF header flag stuff that IBM pSeries firmware architects seems to love these days, it seems to be the preferred way for the OS to tell firmware/hypervisor what it wants. There should be no need to introduce yet another interface for this. There are plenty of them already. 
-Olof From hollis at penguinppc.org Wed Feb 15 10:42:28 2006 From: hollis at penguinppc.org (Hollis Blanchard) Date: Tue, 14 Feb 2006 17:42:28 -0600 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <20060214232547.GB6291@pb15.lixom.net> References: <1139956309.780.254385845@webmail.messagingengine.com> <43F26467.3060407@am.sony.com> <20060214232547.GB6291@pb15.lixom.net> Message-ID: <1139960548.9067.254390313@webmail.messagingengine.com> On Tue, 14 Feb 2006 17:25:47 -0600, "Olof Johansson" said: > On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote: > > I guess what I am thinking of are cases like when the firmware has no > > clue and just uses defaults, or when the firmware or hypervisor > > expect the kernel to set what sizes work best for it. In these > > cases a change needs to be initiated by the kernel. > > But then give the firmware a clue, and fix it. For the partitioned > case, I'm sure you have ways for an alpha partition to define the > characteristics of a guest partition, and/or a small controller image > running in your hypervisor for similar purposes. > > If the kernel needs to set "what works best for it", then you should > look into some of the ELF header flag stuff that IBM pSeries firmware > architects seems to love these days, it seems to be the preferred way > for the OS to tell firmware/hypervisor what it wants. The solution used with the IBM pSeries hypervisor (look for "fake_elf" in prom_init.c, in particular the "rpa_note" part of it) is considered poor by some kernel developers. Implementing something more fine-grained, like a "capabilities" hcall/rtas method/whatever would allow for much more flexibility, which makes sense since the information we want to communicate will undoubtedly grow on future platforms. In this case, an hcall requesting two page sizes would allow the hypervisor to validate the request and implement it as needed on differing hardware, whether it's via HID6 or some other hypervisor-privileged mechanism. 
-Hollis From geoffrey.levand at am.sony.com Wed Feb 15 10:43:35 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Tue, 14 Feb 2006 15:43:35 -0800 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <20060214232547.GB6291@pb15.lixom.net> References: <1139956309.780.254385845@webmail.messagingengine.com> <43F26467.3060407@am.sony.com> <20060214232547.GB6291@pb15.lixom.net> Message-ID: <43F26B27.4080208@am.sony.com> Olof Johansson wrote: > On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote: >> Hollis Blanchard wrote: > >> I guess what I am thinking of are cases like when the firmware has no >> clue and just uses defaults, or when the firmware or hypervisor >> expect the kernel to set what sizes work best for it. In these >> cases a change needs to be initiated by the kernel. > > But then give the firmware a clue, and fix it. For the partitioned > case, I'm sure you have ways for an alpha partition to define the > characteristics of a guest partition, and/or a small controller image > running in your hypervisor for similar purposes. > > If the kernel needs to set "what works best for it", then you should > look into some of the ELF header flag stuff that IBM pSeries firmware > architects seems to love these days, it seems to be the preferred way > for the OS to tell firmware/hypervisor what it wants. > > There should be no need to introduce yet another interface for this. There > are plenty of them already. > I wish the part where I wrote 'I don't think we need to worry about it so much now' didn't get cut from the discussion, since I am in agreement with you to try to avoid some new mechanisms... 
-Geoff From olof at lixom.net Wed Feb 15 10:45:40 2006 From: olof at lixom.net (Olof Johansson) Date: Tue, 14 Feb 2006 17:45:40 -0600 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <43F26B27.4080208@am.sony.com> References: <1139956309.780.254385845@webmail.messagingengine.com> <43F26467.3060407@am.sony.com> <20060214232547.GB6291@pb15.lixom.net> <43F26B27.4080208@am.sony.com> Message-ID: <20060214234539.GC6291@pb15.lixom.net> On Tue, Feb 14, 2006 at 03:43:35PM -0800, Geoff Levand wrote: > I wish the part where I wrote 'I don't think we need to worry about it > so much now' didn't get cut from the discussion, since I am in agreement > with you to try to avoid some new mechanisms... Ok, sounds good. :) -Olof From geoffrey.levand at am.sony.com Wed Feb 15 11:22:46 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Tue, 14 Feb 2006 16:22:46 -0800 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <1139960548.9067.254390313@webmail.messagingengine.com> References: <1139956309.780.254385845@webmail.messagingengine.com> <43F26467.3060407@am.sony.com> <20060214232547.GB6291@pb15.lixom.net> <1139960548.9067.254390313@webmail.messagingengine.com> Message-ID: <43F27456.9080105@am.sony.com> Hollis Blanchard wrote: > On Tue, 14 Feb 2006 17:25:47 -0600, "Olof Johansson" > said: >> On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote: >> > I guess what I am thinking of are cases like when the firmware has no >> > clue and just uses defaults, or when the firmware or hypervisor >> > expect the kernel to set what sizes work best for it. In these >> > cases a change needs to be initiated by the kernel. >> >> But then give the firmware a clue, and fix it. For the partitioned >> case, I'm sure you have ways for an alpha partition to define the >> characteristics of a guest partition, and/or a small controller image >> running in your hypervisor for similar purposes. 
>> >> If the kernel needs to set "what works best for it", then you should >> look into some of the ELF header flag stuff that IBM pSeries firmware >> architects seems to love these days, it seems to be the preferred way >> for the OS to tell firmware/hypervisor what it wants. > > The solution used with the IBM pSeries hypervisor (look for "fake_elf" > in prom_init.c, in particular the "rpa_note" part of it) is considered > poor by some kernel developers. Implementing something more > fine-grained, like a "capabilities" hcall/rtas method/whatever would > allow for much more flexibility, which makes sense since the information > we want to communicate will undoubtedly grow on future platforms. Certainly looks clunky... > In this case, an hcall requesting two page sizes would allow the > hypervisor to validate the request and implement it as needed on > differing hardware, whether it's via HID6 or some other > hypervisor-privileged mechanism. That seems a better way. Do you have any ideas on what other 'capabilities' are or would be desirable? -Geoff From jimix at watson.ibm.com Wed Feb 15 22:21:57 2006 From: jimix at watson.ibm.com (Jimi Xenidis) Date: Wed, 15 Feb 2006 06:21:57 -0500 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: References: Message-ID: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com> On Feb 13, 2006, at 1:19 PM, Hartmut Penner wrote: > I would like to support the large pages in the Firmware, but need > to know > excactly what properties I have to set. Why? What would you gain from using Large Pages? Is your FW that big? Are thinking of using Large PAges in IO space? Cuz I don't think you can. 
-JX From paulus at samba.org Wed Feb 15 22:30:49 2006 From: paulus at samba.org (Paul Mackerras) Date: Wed, 15 Feb 2006 22:30:49 +1100 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com> References: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com> Message-ID: <17395.4329.1338.898562@cargo.ozlabs.ibm.com> Jimi Xenidis writes: > Are thinking of using Large PAges in IO space? Cuz I don't think you > can. Why not? Paul. From jimix at watson.ibm.com Wed Feb 15 23:11:33 2006 From: jimix at watson.ibm.com (Jimi Xenidis) Date: Wed, 15 Feb 2006 07:11:33 -0500 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <17395.4329.1338.898562@cargo.ozlabs.ibm.com> References: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com> <17395.4329.1338.898562@cargo.ozlabs.ibm.com> Message-ID: <1B5EC317-861F-46D2-AA05-AEC16DFE4737@watson.ibm.com> On Feb 15, 2006, at 6:30 AM, Paul Mackerras wrote: > Jimi Xenidis writes: > >> Are thinking of using Large PAges in IO space? Cuz I don't think you >> can. > > Why not? From the 970 User manual: To avoid accidental large/small page translation aliasing, the 970FX implements a HID4 bit (HID4[61]) to disable the large page facility and does not permit cache inhibited accesses to an address in a large page. I'm not 100% but I believe this effects P4, and maybe even P5. WRT Cell, I believe the BPA_Map can be mapped with large pages but I'm not sure about "real" devices. -JX From ahuja at austin.ibm.com Wed Feb 15 12:05:55 2006 From: ahuja at austin.ibm.com (Manish Ahuja) Date: Tue, 14 Feb 2006 19:05:55 -0600 Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics In-Reply-To: <20060214183259.28a6a501.sfr@canb.auug.org.au> References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com> <20060214183259.28a6a501.sfr@canb.auug.org.au> Message-ID: <43F27E73.30709@austin.ibm.com> Stephen Rothwell wrote: >Why not PACA_START_TB and PACA_DELTA_TB? 
Also, start_tb and delta_tb don't really >store time base values, but PURR values. > > Stephen, Thanks for the review. I will address all points or make appropriate changes where required. Just a quick note before I head out for the day. I will send another detailed response a bit later. On why these are called tb and not purr. I presume when i dropped the last patch, we weren't exhaustively tracking anything else other than purr and Paul M suggested that I use "tb" instead of purr. I would personally prefer purr as it makes reading the code easier as it suggests exactly what is being tracked. I can try and change it back to purr if Paul M agrees to it. Thanks, Manish. From olof at lixom.net Thu Feb 16 00:58:56 2006 From: olof at lixom.net (Olof Johansson) Date: Wed, 15 Feb 2006 07:58:56 -0600 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com> References: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com> Message-ID: <20060215135856.GE6291@pb15.lixom.net> On Wed, Feb 15, 2006 at 06:21:57AM -0500, Jimi Xenidis wrote: > > On Feb 13, 2006, at 1:19 PM, Hartmut Penner wrote: > > > I would like to support the large pages in the Firmware, but need > > to know > > excactly what properties I have to set. > > Why? What would you gain from using Large Pages? > Is your FW that big? I read that as he wants to have firmware configure which large pages to use, and use the architected manners in which FW tells the OS which pagesizes are available and what fields to set to select them, not necessarily use them to map firmware memory? > Are thinking of using Large PAges in IO space? Cuz I don't think you > can. POWER5+ can use large I/O pages, at least 64K. Other processors might also, but I don't know about Cell. The problem with I/O pages on PPC 2.01 was when the page size was only selected in the SLB entry. 
Since it's not a hypervisor resource, the OS could break isolation requirements by mapping a 16M I/O page that allowed access to other partitons' I/O space right after it's own. That's probably why PPC970 has the HID bits to disable it. This changed in PPC 2.02, where the L bit was introduced in the PTE entry as well. So, there the HV has a chance to verify it being set properly before allowing hash table insertions, which should allow for 16MB I/O pages also in a partitioned environment. I'm not sure if it's actually used anywhere or not. -Olof From utz.bacher at de.ibm.com Tue Feb 14 07:38:50 2006 From: utz.bacher at de.ibm.com (Utz Bacher) Date: Mon, 13 Feb 2006 21:38:50 +0100 (CET) Subject: [FYI/PATCH 3/3] increase direct mapping sizes for spufs Message-ID: This patch applies on top of Arnd's postings (patch ids 4192, 4185, 4190) from 1/17 (on top of 2.6.15.4). It maps 16k instead of 4k for each problem-state mapped subarea. The mfc mapping contains the Multisource Synchronization Area, the MFC Command Parameter Area and the MFC Command Queue Control Area; the cntl mapping contains the SPU Control Area while the signal1 and signal2 mapping contain the relevant Signal-Notification Area. This allows libspe to build on direct problem state mapping and is recommended for running on a Cell blade today. The code may well change in the near future. 
Cc: Mark Nutter Cc: Arnd Bergmann From: Ulrich Weigand Signed-off-by: Utz Bacher Index: linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c =================================================================== --- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/context.c +++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c @@ -116,13 +116,13 @@ if (ctx->local_store) unmap_mapping_range(ctx->local_store, 0, LS_SIZE, 1); if (ctx->mfc) - unmap_mapping_range(ctx->mfc, 0, 0x1000, 1); + unmap_mapping_range(ctx->mfc, 0, 0x4000, 1); if (ctx->cntl) - unmap_mapping_range(ctx->cntl, 0, 0x1000, 1); + unmap_mapping_range(ctx->cntl, 0, 0x4000, 1); if (ctx->signal1) - unmap_mapping_range(ctx->signal1, 0, 0x1000, 1); + unmap_mapping_range(ctx->signal1, 0, 0x4000, 1); if (ctx->signal2) - unmap_mapping_range(ctx->signal2, 0, 0x1000, 1); + unmap_mapping_range(ctx->signal2, 0, 0x4000, 1); } int spu_acquire_runnable(struct spu_context *ctx) Index: linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c =================================================================== --- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/file.c +++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c @@ -158,7 +158,7 @@ int ret; offset += vma->vm_pgoff << PAGE_SHIFT; - if (offset > 0x1000) + if (offset >= 0x4000) goto out; ret = spu_acquire_runnable(ctx); From olof at lixom.net Thu Feb 16 02:02:09 2006 From: olof at lixom.net (Olof Johansson) Date: Wed, 15 Feb 2006 09:02:09 -0600 Subject: [PATCH] [2.6.16] powerpc: Fix OOPS in lparcfg on G5 Message-ID: <20060215150209.GF6291@pb15.lixom.net> Hi, Bugfix, so please consider for 2.6.16: Hit the following with LTP with a ppc64_defconfig kernel on a G5: Unable to handle kernel paging request for data at address 0x00000030 Faulting instruction address: 0xc00000000001f6d0 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=32 POWERMAC Modules linked in: NIP: C00000000001F6D0 LR: C00000000001F6CC CTR: 0000000000000000 REGS: 
c000000054853790 TRAP: 0300 Not tainted (2.6.16-rc3-mm1) MSR: 9000000000009032 CR: 24000444 XER: 00000000 DAR: 0000000000000030, DSISR: 0000000040000000 TASK = c00000005fb65810[4820] 'proc01' THREAD: c000000054850000 CPU: 1 GPR00: C00000000001F6CC C000000054853A10 C00000000079FBB0 C0000000007A32E8 GPR04: C0000000004AE220 0000000000000000 0000000000000020 0000000000000000 GPR08: C000000000610178 0000000000000072 C00000005FFFEE62 0000000000000092 GPR12: 0000000000000002 C0000000005BA100 00000000100D0000 0000000010116C88 GPR16: 00000000100D0000 00000000FFFF9008 0000000000000000 0000000000000000 GPR20: 000000001001B5D8 000000000FF58224 C000000054853E08 C00000000F44A330 GPR24: C00000005E47B700 0000000010016FB4 0000000000000000 C00000000F44A300 GPR28: 0000000000000000 0000000000000000 C0000000004AE220 0000000000000000 NIP [C00000000001F6D0] .of_find_property+0x30/0xa8 LR [C00000000001F6CC] .of_find_property+0x2c/0xa8 Call Trace: [C000000054853A10] [C00000000001F6CC] .of_find_property+0x2c/0xa8 (unreliable) [C000000054853AA0] [C00000000001F758] .get_property+0x10/0x34 [C000000054853B10] [C00000000001D3C8] .lparcfg_data+0x11c/0x6c8 [C000000054853C20] [C0000000000DC78C] .seq_read+0x198/0x418 [C000000054853CF0] [C0000000000B2634] .vfs_read+0xd0/0x1b0 [C000000054853D90] [C0000000000B32FC] .sys_read+0x4c/0x8c [C000000054853E30] [C0000000000086F8] syscall_exit+0x0/0x40 It happens since the lookup of the /rtas device node is never checked for success and just passed into get_property. It doesn't make sense to create the lparcfg proc entry on non-LPAR systems at all. On LPAR systems, there will always be an RTAS so the lookup will always succeed. 
Signed-off-by: Olof Johansson Index: linux/arch/powerpc/kernel/lparcfg.c =================================================================== --- linux.orig/arch/powerpc/kernel/lparcfg.c +++ linux/arch/powerpc/kernel/lparcfg.c @@ -565,6 +565,9 @@ int __init lparcfg_init(void) struct proc_dir_entry *ent; mode_t mode = S_IRUSR | S_IRGRP | S_IROTH; + if (!platform_is_lpar()) + return 0; + /* Allow writing if we have FW_FEATURE_SPLPAR */ if (firmware_has_feature(FW_FEATURE_SPLPAR)) { lparcfg_fops.write = lparcfg_write; From jimix at watson.ibm.com Wed Feb 15 22:29:20 2006 From: jimix at watson.ibm.com (Jimi Xenidis) Date: Wed, 15 Feb 2006 06:29:20 -0500 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <1139956309.780.254385845@webmail.messagingengine.com> References: <2812322.110611139545275893.JavaMail.servlet@kundenserver> <200602120552.26164.arnd@arndb.de> <1139779462.5247.30.camel@localhost.localdomain> <200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com> <1139956309.780.254385845@webmail.messagingengine.com> Message-ID: <7EC0BED8-30BC-4F52-92AE-19C7CEC3AD47@watson.ibm.com> It is important we consider the cases where the hypervisor is present and not present. There is also the problem of different Hypervisors. I do not think FW without Hypervisor has any business choosing the page sizes for an OS. For Hypervisor machines, as discussed below, it needs to be negotiated. There are plenty of things that need to be negotiated like this, and it is likely that each hypervisor will do this differently. I guess for Hypervisors we'll wait and see. -JX On Feb 14, 2006, at 5:31 PM, Hollis Blanchard wrote: > On Tue, 2006-02-14 at 10:22 -0800, Geoff Levand wrote: >> >> Sorry about changing my mind on this Ben, but after reading the >> Book 4 >> docs on page sizes I see that each partition can have independent >> page size settings. I made the wrong assumption that all partitions >> needed the same size setting. 
Based on this, and on Arnd's comments, >> I think in general we will need to setup page sizes in the kernel. > > On Tue, 2006-02-14 at 15:27 -0600, Olof Johansson wrote: >> >> Isn't this something that should be configured in the hypervisor / >> partition firmware on the machine then, instead of hacked into the >> kernel? The hypervisor would of course switch HID contents when >> dispatching different partitions, if needed. > > I agree with Olof; I don't follow the original leap of logic. > > If every partition can have independent page size settings, and > especially if HID6 is a hypervisor-privileged resource as mentioned > earlier, then the hypervisor needs to set it. Only the hypervisor can > restore each partition's different HID6 value when it switches between > them... > > -Hollis > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev > From mohan at in.ibm.com Wed Feb 15 23:57:22 2006 From: mohan at in.ibm.com (Mohan Kumar M) Date: Wed, 15 Feb 2006 18:27:22 +0530 Subject: kexec tools gcc 4.1.0 issue In-Reply-To: <1139918867.8472.100.camel@explorer.in.ibm.com> References: <1139918867.8472.100.camel@explorer.in.ibm.com> Message-ID: <20060215125722.GA15333@in.ibm.com> Hi, One more patch is required to solve the gcc 4.1.0 issue with kexec-tools. When users run ./configure script without running autoconf, -mcall-aixdesc flag will not be added to the EXTRA_CFLAGS. This patch adds the flag to configure script also. So that even if the user does not run autoconf, -mcall-aixdesc flag is added to EXTRA_CFLAGS. This patch is required in addition to kexec-ppc-gcc410-fix.patch. When users run ./configure script without running autoconf, "-mcall-aixdesc" flag will not be included to the EXTRA_CFLAGS. This patch adds this flag to EXTRA_CLFAGS in "configure" script also. 
Signed-off-by: Mohan --- configure | 3 ++- 1 files changed, 2 insertions(+), 1 deletion(-) diff -puN configure~kexec-ppc-gcc410-fix-2 configure --- kexec-tools-1.101/configure~kexec-ppc-gcc410-fix-2 2006-02-15 18:01:30.000000000 +0530 +++ kexec-tools-1.101-mohan/configure 2006-02-15 18:03:18.000000000 +0530 @@ -1413,8 +1413,9 @@ fi EXTRA_CFLAGS="" # Check whether ppc64. Add -m64 for building 64-bit binary +# Add -mcall-aixdesc to generate dot-symbols as in gcc 3.3.3 if test "$ARCH" = ppc64; then - EXTRA_CFLAGS="$EXTRA_CFLAGS -m64" + EXTRA_CFLAGS="$EXTRA_CFLAGS -m64 -mcall-aixdesc" fi; # Check whether --with-objdir or --without-objdir was given. _ On Tue, Feb 14, 2006 at 05:37:48PM +0530, Mohan Kumar M wrote: > Hi, > > Latest kexec tools for PPC64 with purgatory patch > (ppc64-kdump-purgatory-backup-support.patch) was not working with gcc > version 4.1.0 due to the change in object file generation. > > Here is the patch to fix this issue. > > This patch is created on top of the following level of > kexec-tools: > > - kexec-tools-1.101.tar.gz (from eric biederman's site or > from lse site) > - kexec-tools-1.101-kdump6.patch (consolidated patch posted > on > http://lse.sourceforge.net/kdump/patches/1.101-kdump6/kexec-tools-1.101-kdump6.patch) > > Review and suggestions are welcome. > > Note: > Resending the patch since its not delivered to both fastboot and > linuxppc64-dev mailing list. > > Regards, > Mohan. 
From benh at kernel.crashing.org Thu Feb 16 09:27:09 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 16 Feb 2006 09:27:09 +1100 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <1B5EC317-861F-46D2-AA05-AEC16DFE4737@watson.ibm.com> References: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com> <17395.4329.1338.898562@cargo.ozlabs.ibm.com> <1B5EC317-861F-46D2-AA05-AEC16DFE4737@watson.ibm.com> Message-ID: <1140042429.4054.6.camel@localhost.localdomain> On Wed, 2006-02-15 at 07:11 -0500, Jimi Xenidis wrote: > On Feb 15, 2006, at 6:30 AM, Paul Mackerras wrote: > > > Jimi Xenidis writes: > > > >> Are thinking of using Large PAges in IO space? Cuz I don't think you > >> can. > > > > Why not? > > From the 970 User manual: > To avoid accidental large/small page translation aliasing, the > 970FX implements a HID4 bit (HID4[61]) to > disable the large page facility and does not permit cache > inhibited accesses to an address in a large page. > > I'm not 100% but I believe this effects P4, and maybe even P5. > WRT Cell, I believe the BPA_Map can be mapped with large pages but > I'm not sure about "real" devices. AS 2.03 lifts this limitation, and from GS DD2.1 onward, L=1 can be cache inhibited (this is a requirement for the kernel to be able to use 64k HW pages btw). I think Cell works that way too but that remains to be confirmed. Ben. 
From benh at kernel.crashing.org Thu Feb 16 09:36:19 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 16 Feb 2006 09:36:19 +1100 Subject: AW: Re: __setup_cpu_be problem In-Reply-To: <7EC0BED8-30BC-4F52-92AE-19C7CEC3AD47@watson.ibm.com> References: <2812322.110611139545275893.JavaMail.servlet@kundenserver> <200602120552.26164.arnd@arndb.de> <1139779462.5247.30.camel@localhost.localdomain> <200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com> <1139956309.780.254385845@webmail.messagingengine.com> <7EC0BED8-30BC-4F52-92AE-19C7CEC3AD47@watson.ibm.com> Message-ID: <1140042979.4054.14.camel@localhost.localdomain> On Wed, 2006-02-15 at 06:29 -0500, Jimi Xenidis wrote: > It is important we consider the cases where the hypervisor is present > and not present. > There is also the problem of different Hypervisors. > I do not think FW without Hypervisor has any business choosing the > page sizes for an OS. > For Hypervisor machines, as discussed below, it needs to be negotiated. > There are plenty of things that need to be negotiated like this, and > it is likely that each hypervisor will do this differently. Page sizes are normally not "chosen" in that the architecture was written with the intent that a given CPU model supports a given range of page sizes and that gets exposed via the device-tree. What is causing the current "situation" is that Cell was designed slightly differently :) It supports 2 large page size encodings but 3 actual large page sizes. The matching of one of the encodings to one of the large page sizes is done in software via HID6. This doesn't quite fit in anything that has been defined by our firmware stuff, thus my initial idea to try to have Cell based firmwares pick the encodings that make sense for linux, populate the device-tree accordingly and forget about it (that is 64k and 16M). 
However, I suppose there might be applications where 1M makes sense, non-linux OSes or even future versions of linux that get "fixed" to handle 1M large pages.... Thus if we want that configurable, the question is "where". I'm not a big fan of having yet another mechanism for detecting page sizes in the hash code though. I'd really like that we stick to the current mechanism via the device-tree. Thus if we want a way to select the page sizes on CPUs like Cell, it should be done before we retrieve the device-tree from OF, so that the firmware, when instructed to change it, can appropriately update the device-tree properties. The simple way I think is an nvram OF option in /options, along with other OF environment variables. The more complicated way would be a specific OF or rtas call (I'd rather avoid HV calls from prom_init but if we have to ...). Ben. From ntl at pobox.com Thu Feb 16 12:47:41 2006 From: ntl at pobox.com (Nathan Lynch) Date: Wed, 15 Feb 2006 19:47:41 -0600 Subject: [PATCH] [2.6.16] powerpc: Fix OOPS in lparcfg on G5 In-Reply-To: <20060215150209.GF6291@pb15.lixom.net> References: <20060215150209.GF6291@pb15.lixom.net> Message-ID: <20060216014741.GD3293@localhost.localdomain> Olof Johansson wrote: > Hi, > > Bugfix, so please consider for 2.6.16: > > > Hit the following with LTP with a ppc64_defconfig kernel on a G5: > > Unable to handle kernel paging request for data at address 0x00000030 > Faulting instruction address: 0xc00000000001f6d0 > Oops: Kernel access of bad area, sig: 11 [#1] > SMP NR_CPUS=32 POWERMAC > Modules linked in: > NIP: C00000000001F6D0 LR: C00000000001F6CC CTR: 0000000000000000 > REGS: c000000054853790 TRAP: 0300 Not tainted (2.6.16-rc3-mm1) > MSR: 9000000000009032 CR: 24000444 XER: 00000000 > DAR: 0000000000000030, DSISR: 0000000040000000 > TASK = c00000005fb65810[4820] 'proc01' THREAD: c000000054850000 CPU: 1 > GPR00: C00000000001F6CC C000000054853A10 C00000000079FBB0 C0000000007A32E8 > GPR04: C0000000004AE220 0000000000000000 
0000000000000020 0000000000000000 > GPR08: C000000000610178 0000000000000072 C00000005FFFEE62 0000000000000092 > GPR12: 0000000000000002 C0000000005BA100 00000000100D0000 0000000010116C88 > GPR16: 00000000100D0000 00000000FFFF9008 0000000000000000 0000000000000000 > GPR20: 000000001001B5D8 000000000FF58224 C000000054853E08 C00000000F44A330 > GPR24: C00000005E47B700 0000000010016FB4 0000000000000000 C00000000F44A300 > GPR28: 0000000000000000 0000000000000000 C0000000004AE220 0000000000000000 > NIP [C00000000001F6D0] .of_find_property+0x30/0xa8 > LR [C00000000001F6CC] .of_find_property+0x2c/0xa8 > Call Trace: > [C000000054853A10] [C00000000001F6CC] .of_find_property+0x2c/0xa8 (unreliable) > [C000000054853AA0] [C00000000001F758] .get_property+0x10/0x34 > [C000000054853B10] [C00000000001D3C8] .lparcfg_data+0x11c/0x6c8 > [C000000054853C20] [C0000000000DC78C] .seq_read+0x198/0x418 > [C000000054853CF0] [C0000000000B2634] .vfs_read+0xd0/0x1b0 > [C000000054853D90] [C0000000000B32FC] .sys_read+0x4c/0x8c > [C000000054853E30] [C0000000000086F8] syscall_exit+0x0/0x40 > > > It happens since the lookup of the /rtas device node is never checked for > success and just passed into get_property. > > It doesn't make sense to create the lparcfg proc entry on non-LPAR > systems at all. Despite the lparcfg name, I think there are apps which depend on it even on non-lpar systems; we should still create the file on non-lpar Power4, for example. From michael at ellerman.id.au Thu Feb 16 14:13:48 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 16 Feb 2006 14:13:48 +1100 Subject: [PATCH 0/3] powerpc: Bug fixes for 2.6.16 Message-ID: <1140059628.718206.692588263539.qpush@concordia> This is a series of three bug fixes which I think should go in for 2.6.16. The first makes UP kernels work again, we were unconditionally starting secondary cpus. 
Paulus, you said you didn't like this much, but I think it's the best option for 2.6.16; I have a patch that cleans this stuff up in the works but it'll take a bit longer. The second patch makes UP to SMP kexec work again, this was supposed to work in the past but was never tested and got busted somewhere along the line. The third fixes a long-standing bug on pSeries machines, where if secondary threads have different logical/physical ids we fail to spin them up correctly. We don't normally hit this because the logical/physical ids are the same. Built for pSeries, iSeries and pmac32. Booted on P5 LPAR, Power3 and iSeries. From michael at ellerman.id.au Thu Feb 16 14:13:50 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 16 Feb 2006 14:13:50 +1100 Subject: [PATCH 1/3] powerpc: Don't start secondary CPUs in a UP && KEXEC kernel In-Reply-To: <1140059628.718206.692588263539.qpush@concordia> Message-ID: <20060216031415.8F491679F2@ozlabs.org> Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to unconditionally call it from setup_system(). On a UP && KEXEC kernel we'll start up the secondary CPUs which will then go berserk and we die. Simple fix is to conditionally call smp_release_cpus() in setup_system(). With that in place we don't need the dummy definition of smp_release_cpus() because all call sites are #ifdef'ed either SMP or KEXEC.
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/setup_64.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: to-merge/arch/powerpc/kernel/setup_64.c =================================================================== --- to-merge.orig/arch/powerpc/kernel/setup_64.c +++ to-merge/arch/powerpc/kernel/setup_64.c @@ -311,8 +311,6 @@ void smp_release_cpus(void) DBG(" <- smp_release_cpus()\n"); } -#else -#define smp_release_cpus() #endif /* CONFIG_SMP || CONFIG_KEXEC */ /* @@ -473,10 +471,12 @@ void __init setup_system(void) check_smt_enabled(); smp_setup_cpu_maps(); +#ifdef CONFIG_SMP /* Release secondary cpus out of their spinloops at 0x60 now that * we can map physical -> logical CPU ids */ smp_release_cpus(); +#endif printk("Starting Linux PPC64 %s\n", system_utsname.version); From michael at ellerman.id.au Thu Feb 16 14:13:51 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 16 Feb 2006 14:13:51 +1100 Subject: [PATCH 2/3] powerpc: Make UP -> SMP kexec work again In-Reply-To: <1140059628.718206.692588263539.qpush@concordia> Message-ID: <20060216031417.4024067AA0@ozlabs.org> For UP to SMP kexec to work we need to jump into pSeries_secondary_smp_init even on a UP + KEXEC kernel. The secondary cpus will not find their hw_cpu_id in the paca and so they'll jump into kexec_wait, ready for a kexec.
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/head_64.S | 4 +--- 1 files changed, 1 insertion(+), 3 deletions(-) Index: to-merge/arch/powerpc/kernel/head_64.S =================================================================== --- to-merge.orig/arch/powerpc/kernel/head_64.S +++ to-merge/arch/powerpc/kernel/head_64.S @@ -155,8 +155,7 @@ _GLOBAL(__secondary_hold) SET_REG_IMMEDIATE(r4, .hmt_init) mtctr r4 bctr -#else -#ifdef CONFIG_SMP +#elif defined(CONFIG_SMP) || defined(CONFIG_KEXEC) LOAD_REG_IMMEDIATE(r4, .pSeries_secondary_smp_init) mtctr r4 mr r3,r24 @@ -164,7 +163,6 @@ _GLOBAL(__secondary_hold) #else BUG_OPCODE #endif -#endif /* This value is used to mark exception frames on the stack. */ .section ".toc","aw" From michael at ellerman.id.au Thu Feb 16 14:13:53 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 16 Feb 2006 14:13:53 +1100 Subject: [PATCH 3/3] powerpc: Fix bug in spinup of renumbered secondary threads In-Reply-To: <1140059628.718206.692588263539.qpush@concordia> Message-ID: <20060216031418.C2E0467B51@ozlabs.org> If the logical and physical cpu ids of a secondary thread don't match, we will fail to spin the thread up on pSeries machines due to a bug in pseries/smp.c We call the RTAS "start-cpu" method with the physical cpu id, the address of pSeries_secondary_smp_init and the value to pass that function in r3. Currently we pass "lcpu", the logical cpu id, but pSeries_secondary_smp_init expects the physical cpu id in r3. We should be passing pcpu instead. 
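The logical/physical mismatch described above can be illustrated with a minimal user-space sketch. This is not the kernel code: the table hw_cpu_id_sketch, the constant NR_CPUS_SKETCH and the functions secondary_init_sketch() and start_cpu_sketch() are hypothetical stand-ins, and the physical ids 0, 2, 4, 6 are only an assumed example of renumbered threads. The sketch assumes the secondary entry point searches a per-cpu table for the physical id it is handed in r3.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical logical -> physical cpu id table (index = logical id).
 * Logical and physical ids differ for every cpu except cpu 0. */
static const unsigned int hw_cpu_id_sketch[NR_CPUS_SKETCH] = { 0, 2, 4, 6 };

/* Stand-in for the secondary entry point: it receives one value
 * (r3 in the real code) and looks for a table entry whose physical
 * id matches it. Returns the logical id found, or -1 if none. */
static int secondary_init_sketch(unsigned int r3)
{
	for (int lcpu = 0; lcpu < NR_CPUS_SKETCH; lcpu++) {
		if (hw_cpu_id_sketch[lcpu] == r3)
			return lcpu;
	}
	return -1;
}

/* Stand-in for the "start-cpu" call: pcpu selects which thread to
 * start, r3_arg is the value handed to the entry point. */
static int start_cpu_sketch(unsigned int pcpu, unsigned int r3_arg)
{
	(void)pcpu; /* only the r3 argument matters for the lookup */
	return secondary_init_sketch(r3_arg);
}

/* Small self-check: start logical cpu 1 (physical id 2) both ways. */
static void sketch_self_check(void)
{
	unsigned int lcpu = 1;
	unsigned int pcpu = hw_cpu_id_sketch[lcpu];

	assert(start_cpu_sketch(pcpu, pcpu) == (int)lcpu); /* fixed call */
	assert(start_cpu_sketch(pcpu, lcpu) != (int)lcpu); /* old call */
}
```

With this assumed table, handing over the logical id either finds no entry at all or matches a different cpu's entry, while handing over the physical id always resolves to the intended logical cpu.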
Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/smp.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: to-merge/arch/powerpc/platforms/pseries/smp.c =================================================================== --- to-merge.orig/arch/powerpc/platforms/pseries/smp.c +++ to-merge/arch/powerpc/platforms/pseries/smp.c @@ -292,7 +292,7 @@ static inline int __devinit smp_startup_ if (start_cpu == RTAS_UNKNOWN_SERVICE) return 1; - status = rtas_call(start_cpu, 3, 1, NULL, pcpu, start_here, lcpu); + status = rtas_call(start_cpu, 3, 1, NULL, pcpu, start_here, pcpu); if (status != 0) { printk(KERN_ERR "start-cpu failed: %i\n", status); return 0; From olof at lixom.net Thu Feb 16 14:40:44 2006 From: olof at lixom.net (Olof Johansson) Date: Wed, 15 Feb 2006 21:40:44 -0600 Subject: [PATCH] [2.6.16] powerpc: Fix OOPS in lparcfg on G5 In-Reply-To: <20060216014741.GD3293@localhost.localdomain> References: <20060215150209.GF6291@pb15.lixom.net> <20060216014741.GD3293@localhost.localdomain> Message-ID: <20060216034044.GK6291@pb15.lixom.net> On Wed, Feb 15, 2006 at 07:47:41PM -0600, Nathan Lynch wrote: > Despite the lparcfg name, I think there are apps which depend on it > even on non-lpar systems; we should still create the file on non-lpar > Power4, for example. Hrm, ok. Thanks Nathan. Paulus, please apply for 2.6.16. Thanks, Olof --- Fallback gracefully when reading /proc/ppc64/lparcfg when the /rtas device node can't be found. 
Signed-off-by: Olof Johansson Index: powerpc-git/arch/powerpc/kernel/lparcfg.c =================================================================== --- powerpc-git.orig/arch/powerpc/kernel/lparcfg.c +++ powerpc-git/arch/powerpc/kernel/lparcfg.c @@ -341,7 +341,7 @@ static int lparcfg_data(struct seq_file const char *system_id = ""; unsigned int *lp_index_ptr, lp_index = 0; struct device_node *rtas_node; - int *lrdrp; + int *lrdrp = NULL; rootdn = find_path_device("/"); if (rootdn) { @@ -362,7 +362,9 @@ static int lparcfg_data(struct seq_file seq_printf(m, "partition_id=%d\n", (int)lp_index); rtas_node = find_path_device("/rtas"); - lrdrp = (int *)get_property(rtas_node, "ibm,lrdr-capacity", NULL); + if (rtas_node) + lrdrp = (int *)get_property(rtas_node, "ibm,lrdr-capacity", + NULL); if (lrdrp == NULL) { partition_potential_processors = vdso_data->processorCount; From latten at austin.ibm.com Thu Feb 16 10:31:26 2006 From: latten at austin.ibm.com (Joy Latten) Date: Wed, 15 Feb 2006 17:31:26 -0600 Subject: problem booting Message-ID: <1140046286.3137.160.camel@faith.austin.ibm.com> Al Viro recommended I send this problem to linuxppc64-dev. I have Rawhide installed on a pseries lpar. It is working fine. The Rawhide kernel is vmlinuz-2.6.15-1.1948_FC5. I installed lspp.8 from Steve Grubb. When I rebooted my machine, I received the below kernel panic. I have seen something similar when downloading a vanilla kernel from kernel.org and using the default config file in arch/powerpc/configs/ ppc64_defconfig. I usually turn on selinux and ipsec protocols and ensure ibmveth and ibmvscsi are included in my kernel. I do not use initrd. A co-worker gave me a .config that seems to get past my problems, so I concluded that perhaps my config was missing something the lpar needed. I have included his config that works ok for me in my email. I apologize for such a large email. The only thing I change is his use of initrd. I do not use initrd. Perhaps I should...
I will next try and compile with the Rawhide config and a kernel.org kernel and see if it works ok or not. My gcc version is gcc version 4.1.0 20060213 (Red Hat 4.1.0-0.25) Oh, I have also been using arch/powerpc/boot/zImage for my kernel. Advise if I should be using vmlinux instead. Thanks. Let me know if there are any questions. Regards, Joy Latten --------------------------------------------------------------------- boot: 2.6.15-1.1941.4 Please wait, loading kernel... Elf32 kernel loaded... Loading ramdisk... ramdisk loaded at 02200000, size: 1117 Kbytes OF stdout device is: /vdevice/vty at 30000000 command line: ro console=hvc0 root=LABEL=/1 memory layout at init: memory_limit : 00000000 (16 MB aligned) alloc_bottom : 02318000 alloc_top : 08000000 alloc_top_hi : 88000000 rmo_top : 08000000 ram_top : 88000000 Looking for displays instantiating rtas at 0x077d7000 ... done 00000000 : boot cpu 00000000 00000002 : starting cpu hw idx 00000002... done 00000004 : starting cpu hw idx 00000004... done 00000006 : starting cpu hw idx 00000006... done WARNING: maximum CPUs (4) exceeded: ignoring extras copying OF device tree ... Building dt strings... Building dt structure... Device tree strings 0x02619000 -> 0x02619f49 Device tree struct 0x0261a000 -> 0x02621000 Calling quiesce ...
returning from prom_init DEFAULT CATCH!, exception-handler=fff00300 at %SRR0: 0000000000c3c21c %SRR1: 8000000000003002 Call History ------------ @ - c3c1b0 find-method - c467f4 (poplocals) - c3a718 $call-method - c468ac (poplocals) - c3a718 key-fillq - c46e24 ?xoff - c46f20 (poplocals) - c3a718 (stdout-write) - c4754c (type) - c475d8 _syscatch - c4d43c _exception - c4cf00 - c39834 _syscatch - c4d3a0 _syscatch - c4d3a0 invalid pointer - 1800000000864 Client's Fix Pt Regs: 00 00080000000001f4 ffffffffff2581d4 00000000deadbeef fffffffffffffffc 04 0000000000000000 0000000000000000 000003fe007d0000 0000000000c03010 08 0000000008000000 000000000000003a 00000000003ff000 0000000000000008 0c 0000000000004000 0000000000000000 0000000000000000 0000000000000000 10 0000000000db3710 0000000000db3710 0000000000c465f4 0000000000c467f4 14 0000000000000000 0000000001bfff81 0000000001ef46f0 0000000000117400 18 0000000000c13000 0000000000c38000 0000000000c14f40 0000000000c16fc0 1c 0000000000c20000 0000000000c3fd20 0000000000c11f98 0000000000c10fd0 Special Regs: %IV: 00000300 %CR: 82000082 %XER: 00000000 %DSISR: 08000000 %SRR0: 0000000000c3c21c %SRR1: 8000000000003002 %LR: 0000000000c3c1b0 %CTR: 0000000000000000 %DAR: ffffffffff2581d4 Virtual PID = 0 PFW: Unable to send error log! 
ofdbg 0 > -------------- next part -------------- # # Automatically generated make config: don't edit # Linux kernel version: 2.6.15 # Wed Feb 8 15:35:32 2006 # CONFIG_PPC64=y CONFIG_64BIT=y CONFIG_PPC_MERGE=y CONFIG_MMU=y CONFIG_GENERIC_HARDIRQS=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_PPC=y CONFIG_EARLY_PRINTK=y CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y # # Processor support # # CONFIG_POWER4_ONLY is not set CONFIG_POWER3=y CONFIG_POWER4=y CONFIG_PPC_FPU=y CONFIG_ALTIVEC=y CONFIG_PPC_STD_MMU=y CONFIG_SMP=y CONFIG_NR_CPUS=128 # # Code maturity level options # CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup # CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set CONFIG_SYSCTL=y CONFIG_AUDIT=y CONFIG_AUDITSYSCALL=y CONFIG_HOTPLUG=y CONFIG_KOBJECT_UEVENT=y # CONFIG_IKCONFIG is not set CONFIG_CPUSETS=y CONFIG_INITRAMFS_SOURCE="/boot/initrd-2.6.15.cpio" CONFIG_INITRAMFS_ROOT_UID=0 CONFIG_INITRAMFS_ROOT_GID=0 CONFIG_CC_OPTIMIZE_FOR_SIZE=y # CONFIG_EMBEDDED is not set CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set CONFIG_KALLSYMS_EXTRA_PASS=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 # # Loadable module support # CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # CONFIG_MODULE_FORCE_UNLOAD is not set CONFIG_OBSOLETE_MODPARM=y CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_KMOD=y CONFIG_STOP_MACHINE=y # # Block layer # # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y CONFIG_DEFAULT_AS=y # CONFIG_DEFAULT_DEADLINE is not 
set # CONFIG_DEFAULT_CFQ is not set # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="anticipatory" # # Platform support # CONFIG_PPC_MULTIPLATFORM=y # CONFIG_PPC_ISERIES is not set # CONFIG_EMBEDDED6xx is not set # CONFIG_APUS is not set CONFIG_PPC_PSERIES=y CONFIG_PPC_PMAC=y CONFIG_PPC_PMAC64=y CONFIG_PPC_MAPLE=y CONFIG_PPC_CELL=y CONFIG_PPC_OF=y CONFIG_XICS=y CONFIG_U3_DART=y CONFIG_MPIC=y CONFIG_PPC_RTAS=y CONFIG_RTAS_ERROR_LOGGING=y CONFIG_RTAS_PROC=y CONFIG_RTAS_FLASH=y CONFIG_MMIO_NVRAM=y CONFIG_MPIC_BROKEN_U3=y CONFIG_CELL_IIC=y CONFIG_IBMVIO=y # CONFIG_PPC_MPC106 is not set CONFIG_GENERIC_TBSYNC=y CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y CONFIG_CPU_FREQ_DEBUG=y CONFIG_CPU_FREQ_STAT=m CONFIG_CPU_FREQ_STAT_DETAILS=y # CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y CONFIG_CPU_FREQ_GOV_PERFORMANCE=y CONFIG_CPU_FREQ_GOV_POWERSAVE=m CONFIG_CPU_FREQ_GOV_USERSPACE=y CONFIG_CPU_FREQ_GOV_ONDEMAND=m CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m CONFIG_CPU_FREQ_PMAC64=y # CONFIG_WANT_EARLY_SERIAL is not set # # Kernel options # # CONFIG_HZ_100 is not set CONFIG_HZ_250=y # CONFIG_HZ_1000 is not set CONFIG_HZ=250 # CONFIG_PREEMPT_NONE is not set CONFIG_PREEMPT_VOLUNTARY=y # CONFIG_PREEMPT is not set CONFIG_PREEMPT_BKL=y CONFIG_BINFMT_ELF=y CONFIG_BINFMT_MISC=y CONFIG_FORCE_MAX_ZONEORDER=13 CONFIG_IOMMU_VMERGE=y CONFIG_HOTPLUG_CPU=y # CONFIG_KEXEC is not set CONFIG_IRQ_ALL_CPUS=y CONFIG_PPC_SPLPAR=y CONFIG_EEH=y CONFIG_SCANLOG=y CONFIG_LPARCFG=y CONFIG_NUMA=y CONFIG_ARCH_SELECT_MEMORY_MODEL=y CONFIG_ARCH_SPARSEMEM_ENABLE=y CONFIG_ARCH_SPARSEMEM_DEFAULT=y CONFIG_SELECT_MEMORY_MODEL=y # CONFIG_FLATMEM_MANUAL is not set # CONFIG_DISCONTIGMEM_MANUAL is not set CONFIG_SPARSEMEM_MANUAL=y CONFIG_SPARSEMEM=y CONFIG_NEED_MULTIPLE_NODES=y CONFIG_HAVE_MEMORY_PRESENT=y # CONFIG_SPARSEMEM_STATIC is not set CONFIG_SPARSEMEM_EXTREME=y # CONFIG_MEMORY_HOTPLUG is not set CONFIG_SPLIT_PTLOCK_CPUS=4 CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y # 
CONFIG_PPC_64K_PAGES is not set CONFIG_SCHED_SMT=y CONFIG_PROC_DEVICETREE=y # CONFIG_CMDLINE_BOOL is not set CONFIG_PM=y CONFIG_PM_LEGACY=y CONFIG_PM_DEBUG=y # CONFIG_SECCOMP is not set CONFIG_ISA_DMA_API=y # # Bus options # CONFIG_GENERIC_ISA_DMA=y CONFIG_PPC_I8259=y # CONFIG_PPC_INDIRECT_PCI is not set CONFIG_PCI=y CONFIG_PCI_DOMAINS=y CONFIG_PCI_LEGACY_PROC=y # CONFIG_PCI_DEBUG is not set # # PCCARD (PCMCIA/CardBus) support # CONFIG_PCCARD=y # CONFIG_PCMCIA_DEBUG is not set CONFIG_PCMCIA=y CONFIG_PCMCIA_LOAD_CIS=y CONFIG_PCMCIA_IOCTL=y CONFIG_CARDBUS=y # # PC-card bridges # CONFIG_YENTA=y CONFIG_PD6729=m CONFIG_I82092=m CONFIG_PCCARD_NONSTATIC=y # # PCI Hotplug Support # CONFIG_HOTPLUG_PCI=y # CONFIG_HOTPLUG_PCI_FAKE is not set # CONFIG_HOTPLUG_PCI_CPCI is not set CONFIG_HOTPLUG_PCI_SHPC=m CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE=y # CONFIG_HOTPLUG_PCI_SHPC_PHPRM_LEGACY is not set CONFIG_HOTPLUG_PCI_RPA=m CONFIG_HOTPLUG_PCI_RPA_DLPAR=m CONFIG_KERNEL_START=0xc000000000000000 # # Networking # CONFIG_NET=y # # Networking options # CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_UNIX=y CONFIG_XFRM=y CONFIG_XFRM_USER=y CONFIG_NET_KEY=m CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_ASK_IP_FIB_HASH=y # CONFIG_IP_FIB_TRIE is not set CONFIG_IP_FIB_HASH=y CONFIG_IP_MULTIPLE_TABLES=y CONFIG_IP_ROUTE_FWMARK=y CONFIG_IP_ROUTE_MULTIPATH=y # CONFIG_IP_ROUTE_MULTIPATH_CACHED is not set CONFIG_IP_ROUTE_VERBOSE=y # CONFIG_IP_PNP is not set CONFIG_NET_IPIP=m CONFIG_NET_IPGRE=m CONFIG_NET_IPGRE_BROADCAST=y CONFIG_IP_MROUTE=y CONFIG_IP_PIMSM_V1=y CONFIG_IP_PIMSM_V2=y # CONFIG_ARPD is not set CONFIG_SYN_COOKIES=y CONFIG_INET_AH=m CONFIG_INET_ESP=m CONFIG_INET_IPCOMP=m CONFIG_INET_TUNNEL=m CONFIG_INET_DIAG=m CONFIG_INET_TCP_DIAG=m # CONFIG_TCP_CONG_ADVANCED is not set CONFIG_TCP_CONG_BIC=y # # IP: Virtual Server Configuration # CONFIG_IP_VS=m # CONFIG_IP_VS_DEBUG is not set CONFIG_IP_VS_TAB_BITS=12 # # IPVS transport protocol load balancing support # 
CONFIG_IP_VS_PROTO_TCP=y CONFIG_IP_VS_PROTO_UDP=y CONFIG_IP_VS_PROTO_ESP=y CONFIG_IP_VS_PROTO_AH=y # # IPVS scheduler # CONFIG_IP_VS_RR=m CONFIG_IP_VS_WRR=m CONFIG_IP_VS_LC=m CONFIG_IP_VS_WLC=m CONFIG_IP_VS_LBLC=m CONFIG_IP_VS_LBLCR=m CONFIG_IP_VS_DH=m CONFIG_IP_VS_SH=m CONFIG_IP_VS_SED=m CONFIG_IP_VS_NQ=m # # IPVS application helper # CONFIG_IP_VS_FTP=m CONFIG_IPV6=m CONFIG_IPV6_PRIVACY=y CONFIG_INET6_AH=m CONFIG_INET6_ESP=m CONFIG_INET6_IPCOMP=m CONFIG_INET6_TUNNEL=m CONFIG_IPV6_TUNNEL=m CONFIG_NETFILTER=y # CONFIG_NETFILTER_DEBUG is not set CONFIG_BRIDGE_NETFILTER=y # # Core Netfilter Configuration # CONFIG_NETFILTER_NETLINK=m CONFIG_NETFILTER_NETLINK_QUEUE=m CONFIG_NETFILTER_NETLINK_LOG=m # # IP: Netfilter Configuration # CONFIG_IP_NF_CONNTRACK=m CONFIG_IP_NF_CT_ACCT=y CONFIG_IP_NF_CONNTRACK_MARK=y CONFIG_IP_NF_CONNTRACK_EVENTS=y CONFIG_IP_NF_CONNTRACK_NETLINK=m CONFIG_IP_NF_CT_PROTO_SCTP=m CONFIG_IP_NF_FTP=m CONFIG_IP_NF_IRC=m CONFIG_IP_NF_NETBIOS_NS=m CONFIG_IP_NF_TFTP=m CONFIG_IP_NF_AMANDA=m CONFIG_IP_NF_PPTP=m CONFIG_IP_NF_QUEUE=m CONFIG_IP_NF_IPTABLES=m CONFIG_IP_NF_MATCH_LIMIT=m CONFIG_IP_NF_MATCH_IPRANGE=m CONFIG_IP_NF_MATCH_MAC=m CONFIG_IP_NF_MATCH_PKTTYPE=m CONFIG_IP_NF_MATCH_MARK=m CONFIG_IP_NF_MATCH_MULTIPORT=m CONFIG_IP_NF_MATCH_TOS=m CONFIG_IP_NF_MATCH_RECENT=m CONFIG_IP_NF_MATCH_ECN=m CONFIG_IP_NF_MATCH_DSCP=m CONFIG_IP_NF_MATCH_AH_ESP=m CONFIG_IP_NF_MATCH_LENGTH=m CONFIG_IP_NF_MATCH_TTL=m CONFIG_IP_NF_MATCH_TCPMSS=m CONFIG_IP_NF_MATCH_HELPER=m CONFIG_IP_NF_MATCH_STATE=m CONFIG_IP_NF_MATCH_CONNTRACK=m CONFIG_IP_NF_MATCH_OWNER=m CONFIG_IP_NF_MATCH_PHYSDEV=m CONFIG_IP_NF_MATCH_ADDRTYPE=m CONFIG_IP_NF_MATCH_REALM=m CONFIG_IP_NF_MATCH_SCTP=m CONFIG_IP_NF_MATCH_DCCP=m CONFIG_IP_NF_MATCH_COMMENT=m CONFIG_IP_NF_MATCH_CONNMARK=m CONFIG_IP_NF_MATCH_CONNBYTES=m CONFIG_IP_NF_MATCH_HASHLIMIT=m CONFIG_IP_NF_MATCH_STRING=m CONFIG_IP_NF_FILTER=m CONFIG_IP_NF_TARGET_REJECT=m CONFIG_IP_NF_TARGET_LOG=m CONFIG_IP_NF_TARGET_ULOG=m CONFIG_IP_NF_TARGET_TCPMSS=m 
CONFIG_IP_NF_TARGET_NFQUEUE=m CONFIG_IP_NF_NAT=m CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=m CONFIG_IP_NF_TARGET_REDIRECT=m CONFIG_IP_NF_TARGET_NETMAP=m CONFIG_IP_NF_TARGET_SAME=m CONFIG_IP_NF_NAT_SNMP_BASIC=m CONFIG_IP_NF_NAT_IRC=m CONFIG_IP_NF_NAT_FTP=m CONFIG_IP_NF_NAT_TFTP=m CONFIG_IP_NF_NAT_AMANDA=m CONFIG_IP_NF_NAT_PPTP=m CONFIG_IP_NF_MANGLE=m CONFIG_IP_NF_TARGET_TOS=m CONFIG_IP_NF_TARGET_ECN=m CONFIG_IP_NF_TARGET_DSCP=m CONFIG_IP_NF_TARGET_MARK=m CONFIG_IP_NF_TARGET_CLASSIFY=m CONFIG_IP_NF_TARGET_TTL=m CONFIG_IP_NF_TARGET_CONNMARK=m CONFIG_IP_NF_TARGET_CLUSTERIP=m CONFIG_IP_NF_RAW=m CONFIG_IP_NF_TARGET_NOTRACK=m CONFIG_IP_NF_ARPTABLES=m CONFIG_IP_NF_ARPFILTER=m CONFIG_IP_NF_ARP_MANGLE=m # # IPv6: Netfilter Configuration (EXPERIMENTAL) # CONFIG_IP6_NF_QUEUE=m CONFIG_IP6_NF_IPTABLES=m CONFIG_IP6_NF_MATCH_LIMIT=m CONFIG_IP6_NF_MATCH_MAC=m CONFIG_IP6_NF_MATCH_RT=m CONFIG_IP6_NF_MATCH_OPTS=m CONFIG_IP6_NF_MATCH_FRAG=m CONFIG_IP6_NF_MATCH_HL=m CONFIG_IP6_NF_MATCH_MULTIPORT=m CONFIG_IP6_NF_MATCH_OWNER=m CONFIG_IP6_NF_MATCH_MARK=m CONFIG_IP6_NF_MATCH_IPV6HEADER=m CONFIG_IP6_NF_MATCH_AHESP=m CONFIG_IP6_NF_MATCH_LENGTH=m CONFIG_IP6_NF_MATCH_EUI64=m CONFIG_IP6_NF_MATCH_PHYSDEV=m CONFIG_IP6_NF_FILTER=m CONFIG_IP6_NF_TARGET_LOG=m CONFIG_IP6_NF_TARGET_REJECT=m CONFIG_IP6_NF_TARGET_NFQUEUE=m CONFIG_IP6_NF_MANGLE=m CONFIG_IP6_NF_TARGET_MARK=m CONFIG_IP6_NF_TARGET_HL=m CONFIG_IP6_NF_RAW=m # # Bridge: Netfilter Configuration # CONFIG_BRIDGE_NF_EBTABLES=m CONFIG_BRIDGE_EBT_BROUTE=m CONFIG_BRIDGE_EBT_T_FILTER=m CONFIG_BRIDGE_EBT_T_NAT=m CONFIG_BRIDGE_EBT_802_3=m CONFIG_BRIDGE_EBT_AMONG=m CONFIG_BRIDGE_EBT_ARP=m CONFIG_BRIDGE_EBT_IP=m CONFIG_BRIDGE_EBT_LIMIT=m CONFIG_BRIDGE_EBT_MARK=m CONFIG_BRIDGE_EBT_PKTTYPE=m CONFIG_BRIDGE_EBT_STP=m CONFIG_BRIDGE_EBT_VLAN=m CONFIG_BRIDGE_EBT_ARPREPLY=m CONFIG_BRIDGE_EBT_DNAT=m CONFIG_BRIDGE_EBT_MARK_T=m CONFIG_BRIDGE_EBT_REDIRECT=m CONFIG_BRIDGE_EBT_SNAT=m CONFIG_BRIDGE_EBT_LOG=m CONFIG_BRIDGE_EBT_ULOG=m # # DCCP Configuration 
(EXPERIMENTAL) # CONFIG_IP_DCCP=m CONFIG_INET_DCCP_DIAG=m # # DCCP CCIDs Configuration (EXPERIMENTAL) # CONFIG_IP_DCCP_CCID3=m CONFIG_IP_DCCP_TFRC_LIB=m # # DCCP Kernel Hacking # # CONFIG_IP_DCCP_DEBUG is not set CONFIG_IP_DCCP_UNLOAD_HACK=y # # SCTP Configuration (EXPERIMENTAL) # CONFIG_IP_SCTP=m # CONFIG_SCTP_DBG_MSG is not set # CONFIG_SCTP_DBG_OBJCNT is not set # CONFIG_SCTP_HMAC_NONE is not set # CONFIG_SCTP_HMAC_SHA1 is not set CONFIG_SCTP_HMAC_MD5=y CONFIG_ATM=m CONFIG_ATM_CLIP=m # CONFIG_ATM_CLIP_NO_ICMP is not set CONFIG_ATM_LANE=m # CONFIG_ATM_MPOA is not set CONFIG_ATM_BR2684=m # CONFIG_ATM_BR2684_IPFILTER is not set CONFIG_BRIDGE=m CONFIG_VLAN_8021Q=m # CONFIG_DECNET is not set CONFIG_LLC=y # CONFIG_LLC2 is not set CONFIG_IPX=m # CONFIG_IPX_INTERN is not set CONFIG_ATALK=m CONFIG_DEV_APPLETALK=y CONFIG_IPDDP=m CONFIG_IPDDP_ENCAP=y CONFIG_IPDDP_DECAP=y # CONFIG_X25 is not set # CONFIG_LAPB is not set CONFIG_NET_DIVERT=y # CONFIG_ECONET is not set CONFIG_WAN_ROUTER=m # # QoS and/or fair queueing # CONFIG_NET_SCHED=y CONFIG_NET_SCH_CLK_JIFFIES=y # CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set # CONFIG_NET_SCH_CLK_CPU is not set # # Queueing/Scheduling # CONFIG_NET_SCH_CBQ=m CONFIG_NET_SCH_HTB=m CONFIG_NET_SCH_HFSC=m CONFIG_NET_SCH_ATM=m CONFIG_NET_SCH_PRIO=m CONFIG_NET_SCH_RED=m CONFIG_NET_SCH_SFQ=m CONFIG_NET_SCH_TEQL=m CONFIG_NET_SCH_TBF=m CONFIG_NET_SCH_GRED=m CONFIG_NET_SCH_DSMARK=m CONFIG_NET_SCH_NETEM=m CONFIG_NET_SCH_INGRESS=m # # Classification # CONFIG_NET_CLS=y CONFIG_NET_CLS_BASIC=m CONFIG_NET_CLS_TCINDEX=m CONFIG_NET_CLS_ROUTE4=m CONFIG_NET_CLS_ROUTE=y CONFIG_NET_CLS_FW=m CONFIG_NET_CLS_U32=m CONFIG_CLS_U32_PERF=y CONFIG_CLS_U32_MARK=y CONFIG_NET_CLS_RSVP=m CONFIG_NET_CLS_RSVP6=m CONFIG_NET_EMATCH=y CONFIG_NET_EMATCH_STACK=32 CONFIG_NET_EMATCH_CMP=m CONFIG_NET_EMATCH_NBYTE=m CONFIG_NET_EMATCH_U32=m CONFIG_NET_EMATCH_META=m CONFIG_NET_EMATCH_TEXT=m # CONFIG_NET_CLS_ACT is not set CONFIG_NET_CLS_POLICE=y CONFIG_NET_CLS_IND=y CONFIG_NET_ESTIMATOR=y # 
# Network testing # CONFIG_NET_PKTGEN=m # CONFIG_HAMRADIO is not set CONFIG_IRDA=m # # IrDA protocols # CONFIG_IRLAN=m CONFIG_IRNET=m CONFIG_IRCOMM=m # CONFIG_IRDA_ULTRA is not set # # IrDA options # CONFIG_IRDA_CACHE_LAST_LSAP=y CONFIG_IRDA_FAST_RR=y # CONFIG_IRDA_DEBUG is not set # # Infrared-port device drivers # # # SIR device drivers # CONFIG_IRTTY_SIR=m # # Dongle support # CONFIG_DONGLE=y CONFIG_ESI_DONGLE=m CONFIG_ACTISYS_DONGLE=m CONFIG_TEKRAM_DONGLE=m CONFIG_LITELINK_DONGLE=m CONFIG_MA600_DONGLE=m CONFIG_GIRBIL_DONGLE=m CONFIG_MCP2120_DONGLE=m CONFIG_OLD_BELKIN_DONGLE=m CONFIG_ACT200L_DONGLE=m # # Old SIR device drivers # # # Old Serial dongle support # # # FIR device drivers # CONFIG_USB_IRDA=m CONFIG_SIGMATEL_FIR=m CONFIG_NSC_FIR=m CONFIG_WINBOND_FIR=m CONFIG_SMC_IRCC_FIR=m CONFIG_ALI_FIR=m CONFIG_VLSI_FIR=m CONFIG_VIA_FIR=m CONFIG_BT=m CONFIG_BT_L2CAP=m CONFIG_BT_SCO=m CONFIG_BT_RFCOMM=m CONFIG_BT_RFCOMM_TTY=y CONFIG_BT_BNEP=m CONFIG_BT_BNEP_MC_FILTER=y CONFIG_BT_BNEP_PROTO_FILTER=y CONFIG_BT_CMTP=m CONFIG_BT_HIDP=m # # Bluetooth device drivers # CONFIG_BT_HCIUSB=m CONFIG_BT_HCIUSB_SCO=y CONFIG_BT_HCIUART=m CONFIG_BT_HCIUART_H4=y CONFIG_BT_HCIUART_BCSP=y CONFIG_BT_HCIBCM203X=m CONFIG_BT_HCIBPA10X=m CONFIG_BT_HCIBFUSB=m CONFIG_BT_HCIDTL1=m CONFIG_BT_HCIBT3C=m CONFIG_BT_HCIBLUECARD=m CONFIG_BT_HCIBTUART=m CONFIG_BT_HCIVHCI=m CONFIG_IEEE80211=m CONFIG_IEEE80211_DEBUG=y CONFIG_IEEE80211_CRYPT_WEP=m CONFIG_IEEE80211_CRYPT_CCMP=m CONFIG_IEEE80211_CRYPT_TKIP=m # # Device Drivers # # # Generic Driver Options # CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=y # CONFIG_DEBUG_DRIVER is not set # # Connector - unified userspace <-> kernelspace linker # CONFIG_CONNECTOR=m # # Memory Technology Devices (MTD) # CONFIG_MTD=m # CONFIG_MTD_DEBUG is not set CONFIG_MTD_CONCAT=m CONFIG_MTD_PARTITIONS=y CONFIG_MTD_REDBOOT_PARTS=m CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=-1 # CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED is not set # CONFIG_MTD_REDBOOT_PARTS_READONLY 
is not set CONFIG_MTD_CMDLINE_PARTS=y # # User Modules And Translation Layers # CONFIG_MTD_CHAR=m CONFIG_MTD_BLOCK=m CONFIG_MTD_BLOCK_RO=m CONFIG_FTL=m CONFIG_NFTL=m CONFIG_NFTL_RW=y CONFIG_INFTL=m CONFIG_RFD_FTL=m # # RAM/ROM/Flash chip drivers # CONFIG_MTD_CFI=m CONFIG_MTD_JEDECPROBE=m CONFIG_MTD_GEN_PROBE=m # CONFIG_MTD_CFI_ADV_OPTIONS is not set CONFIG_MTD_MAP_BANK_WIDTH_1=y CONFIG_MTD_MAP_BANK_WIDTH_2=y CONFIG_MTD_MAP_BANK_WIDTH_4=y # CONFIG_MTD_MAP_BANK_WIDTH_8 is not set # CONFIG_MTD_MAP_BANK_WIDTH_16 is not set # CONFIG_MTD_MAP_BANK_WIDTH_32 is not set CONFIG_MTD_CFI_I1=y CONFIG_MTD_CFI_I2=y # CONFIG_MTD_CFI_I4 is not set # CONFIG_MTD_CFI_I8 is not set CONFIG_MTD_CFI_INTELEXT=m CONFIG_MTD_CFI_AMDSTD=m CONFIG_MTD_CFI_AMDSTD_RETRY=3 CONFIG_MTD_CFI_STAA=m CONFIG_MTD_CFI_UTIL=m CONFIG_MTD_RAM=m CONFIG_MTD_ROM=m CONFIG_MTD_ABSENT=m # # Mapping drivers for chip access # CONFIG_MTD_COMPLEX_MAPPINGS=y # CONFIG_MTD_PHYSMAP is not set CONFIG_MTD_PCI=m # CONFIG_MTD_PLATRAM is not set # # Self-contained MTD device drivers # CONFIG_MTD_PMC551=m # CONFIG_MTD_PMC551_BUGFIX is not set # CONFIG_MTD_PMC551_DEBUG is not set # CONFIG_MTD_SLRAM is not set # CONFIG_MTD_PHRAM is not set CONFIG_MTD_MTDRAM=m CONFIG_MTDRAM_TOTAL_SIZE=4096 CONFIG_MTDRAM_ERASE_SIZE=128 # CONFIG_MTD_BLKMTD is not set CONFIG_MTD_BLOCK2MTD=m # # Disk-On-Chip Device Drivers # CONFIG_MTD_DOC2000=m # CONFIG_MTD_DOC2001 is not set CONFIG_MTD_DOC2001PLUS=m CONFIG_MTD_DOCPROBE=m CONFIG_MTD_DOCECC=m # CONFIG_MTD_DOCPROBE_ADVANCED is not set CONFIG_MTD_DOCPROBE_ADDRESS=0 # # NAND Flash Device Drivers # CONFIG_MTD_NAND=m # CONFIG_MTD_NAND_VERIFY_WRITE is not set CONFIG_MTD_NAND_IDS=m # CONFIG_MTD_NAND_DISKONCHIP is not set # CONFIG_MTD_NAND_NANDSIM is not set # # OneNAND Flash Device Drivers # # CONFIG_MTD_ONENAND is not set # # Parallel port support # CONFIG_PARPORT=m CONFIG_PARPORT_PC=m CONFIG_PARPORT_SERIAL=m # CONFIG_PARPORT_PC_FIFO is not set # CONFIG_PARPORT_PC_SUPERIO is not set CONFIG_PARPORT_PC_PCMCIA=m 
CONFIG_PARPORT_NOT_PC=y # CONFIG_PARPORT_GSC is not set CONFIG_PARPORT_1284=y # # Plug and Play support # # # Block devices # CONFIG_BLK_DEV_FD=m CONFIG_PARIDE=m CONFIG_PARIDE_PARPORT=m # # Parallel IDE high-level drivers # CONFIG_PARIDE_PD=m CONFIG_PARIDE_PCD=m CONFIG_PARIDE_PF=m CONFIG_PARIDE_PT=m CONFIG_PARIDE_PG=m # # Parallel IDE protocol modules # CONFIG_PARIDE_ATEN=m CONFIG_PARIDE_BPCK=m CONFIG_PARIDE_COMM=m CONFIG_PARIDE_DSTR=m CONFIG_PARIDE_FIT2=m CONFIG_PARIDE_FIT3=m CONFIG_PARIDE_EPAT=m CONFIG_PARIDE_EPATC8=y CONFIG_PARIDE_EPIA=m CONFIG_PARIDE_FRIQ=m CONFIG_PARIDE_FRPW=m CONFIG_PARIDE_KBIC=m CONFIG_PARIDE_KTTI=m CONFIG_PARIDE_ON20=m CONFIG_PARIDE_ON26=m # CONFIG_BLK_CPQ_DA is not set CONFIG_BLK_CPQ_CISS_DA=m CONFIG_CISS_SCSI_TAPE=y CONFIG_BLK_DEV_DAC960=m CONFIG_BLK_DEV_UMEM=m # CONFIG_BLK_DEV_COW_COMMON is not set CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_CRYPTOLOOP=m CONFIG_BLK_DEV_NBD=m CONFIG_BLK_DEV_SX8=m CONFIG_BLK_DEV_UB=m CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_COUNT=16 CONFIG_BLK_DEV_RAM_SIZE=16384 CONFIG_BLK_DEV_INITRD=y CONFIG_CDROM_PKTCDVD=m CONFIG_CDROM_PKTCDVD_BUFFERS=8 # CONFIG_CDROM_PKTCDVD_WCACHE is not set CONFIG_ATA_OVER_ETH=m # # ATA/ATAPI/MFM/RLL support # CONFIG_IDE=y CONFIG_BLK_DEV_IDE=y # # Please see Documentation/ide.txt for help/info on IDE drives # # CONFIG_BLK_DEV_IDE_SATA is not set CONFIG_BLK_DEV_IDEDISK=y CONFIG_IDEDISK_MULTI_MODE=y CONFIG_BLK_DEV_IDECS=m CONFIG_BLK_DEV_IDECD=y # CONFIG_BLK_DEV_IDETAPE is not set CONFIG_BLK_DEV_IDEFLOPPY=y CONFIG_BLK_DEV_IDESCSI=m CONFIG_IDE_TASK_IOCTL=y # # IDE chipset support/bugfixes # CONFIG_IDE_GENERIC=y CONFIG_BLK_DEV_IDEPCI=y CONFIG_IDEPCI_SHARE_IRQ=y # CONFIG_BLK_DEV_OFFBOARD is not set CONFIG_BLK_DEV_GENERIC=y # CONFIG_BLK_DEV_OPTI621 is not set CONFIG_BLK_DEV_SL82C105=y CONFIG_BLK_DEV_IDEDMA_PCI=y # CONFIG_BLK_DEV_IDEDMA_FORCED is not set CONFIG_IDEDMA_PCI_AUTO=y # CONFIG_IDEDMA_ONLYDISK is not set CONFIG_BLK_DEV_AEC62XX=y CONFIG_BLK_DEV_ALI15X3=y # CONFIG_WDC_ALI15X3 is not set 
CONFIG_BLK_DEV_AMD74XX=y CONFIG_BLK_DEV_CMD64X=y CONFIG_BLK_DEV_TRIFLEX=y CONFIG_BLK_DEV_CY82C693=y CONFIG_BLK_DEV_CS5520=y CONFIG_BLK_DEV_CS5530=y CONFIG_BLK_DEV_HPT34X=y # CONFIG_HPT34X_AUTODMA is not set CONFIG_BLK_DEV_HPT366=y # CONFIG_BLK_DEV_SC1200 is not set CONFIG_BLK_DEV_PIIX=y CONFIG_BLK_DEV_IT821X=y # CONFIG_BLK_DEV_NS87415 is not set CONFIG_BLK_DEV_PDC202XX_OLD=y # CONFIG_PDC202XX_BURST is not set CONFIG_BLK_DEV_PDC202XX_NEW=y CONFIG_PDC202XX_FORCE=y CONFIG_BLK_DEV_SVWKS=y CONFIG_BLK_DEV_SIIMAGE=y CONFIG_BLK_DEV_SLC90E66=y # CONFIG_BLK_DEV_TRM290 is not set CONFIG_BLK_DEV_VIA82CXXX=y CONFIG_BLK_DEV_IDE_PMAC=y CONFIG_BLK_DEV_IDE_PMAC_ATA100FIRST=y CONFIG_BLK_DEV_IDEDMA_PMAC=y CONFIG_BLK_DEV_IDE_PMAC_BLINK=y # CONFIG_IDE_ARM is not set CONFIG_BLK_DEV_IDEDMA=y # CONFIG_IDEDMA_IVB is not set CONFIG_IDEDMA_AUTO=y # CONFIG_BLK_DEV_HD is not set # # SCSI device support # CONFIG_RAID_ATTRS=m CONFIG_SCSI=y CONFIG_SCSI_PROC_FS=y # # SCSI support type (disk, tape, CD-ROM) # CONFIG_BLK_DEV_SD=y CONFIG_CHR_DEV_ST=m CONFIG_CHR_DEV_OSST=m CONFIG_BLK_DEV_SR=m CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=y CONFIG_CHR_DEV_SCH=m # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs # CONFIG_SCSI_MULTI_LUN=y CONFIG_SCSI_CONSTANTS=y CONFIG_SCSI_LOGGING=y # # SCSI Transport Attributes # CONFIG_SCSI_SPI_ATTRS=m CONFIG_SCSI_FC_ATTRS=m CONFIG_SCSI_ISCSI_ATTRS=m CONFIG_SCSI_SAS_ATTRS=m # # SCSI low-level drivers # CONFIG_ISCSI_TCP=m CONFIG_BLK_DEV_3W_XXXX_RAID=m CONFIG_SCSI_3W_9XXX=m CONFIG_SCSI_ACARD=m CONFIG_SCSI_AACRAID=m CONFIG_SCSI_AIC7XXX=m CONFIG_AIC7XXX_CMDS_PER_DEVICE=4 CONFIG_AIC7XXX_RESET_DELAY_MS=15000 # CONFIG_AIC7XXX_DEBUG_ENABLE is not set CONFIG_AIC7XXX_DEBUG_MASK=0 # CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set CONFIG_SCSI_AIC7XXX_OLD=m CONFIG_SCSI_AIC79XX=m CONFIG_AIC79XX_CMDS_PER_DEVICE=4 CONFIG_AIC79XX_RESET_DELAY_MS=15000 # CONFIG_AIC79XX_ENABLE_RD_STRM is not set # CONFIG_AIC79XX_DEBUG_ENABLE is not set CONFIG_AIC79XX_DEBUG_MASK=0 # CONFIG_AIC79XX_REG_PRETTY_PRINT is not set CONFIG_MEGARAID_NEWGEN=y CONFIG_MEGARAID_MM=m CONFIG_MEGARAID_MAILBOX=m CONFIG_MEGARAID_SAS=m CONFIG_SCSI_SATA=m CONFIG_SCSI_SATA_AHCI=m CONFIG_SCSI_SATA_SVW=m CONFIG_SCSI_ATA_PIIX=m CONFIG_SCSI_SATA_MV=m CONFIG_SCSI_SATA_NV=m CONFIG_SCSI_PDC_ADMA=m CONFIG_SCSI_SATA_QSTOR=m CONFIG_SCSI_SATA_PROMISE=m CONFIG_SCSI_SATA_SX4=m CONFIG_SCSI_SATA_SIL=m CONFIG_SCSI_SATA_SIL24=m CONFIG_SCSI_SATA_SIS=m CONFIG_SCSI_SATA_ULI=m CONFIG_SCSI_SATA_VIA=m CONFIG_SCSI_SATA_VITESSE=m CONFIG_SCSI_SATA_INTEL_COMBINED=y # CONFIG_SCSI_BUSLOGIC is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_EATA is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set CONFIG_SCSI_GDTH=m CONFIG_SCSI_IPS=m CONFIG_SCSI_IBMVSCSI=y CONFIG_SCSI_INITIO=m CONFIG_SCSI_INIA100=m CONFIG_SCSI_PPA=m CONFIG_SCSI_IMM=m # CONFIG_SCSI_IZIP_EPP16 is not set # CONFIG_SCSI_IZIP_SLOW_CTR is not set CONFIG_SCSI_SYM53C8XX_2=m CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 # CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set CONFIG_SCSI_IPR=m CONFIG_SCSI_IPR_TRACE=y CONFIG_SCSI_IPR_DUMP=y # CONFIG_SCSI_QLOGIC_FC is not set 
CONFIG_SCSI_QLOGIC_1280=m CONFIG_SCSI_QLA2XXX=y CONFIG_SCSI_QLA21XX=m CONFIG_SCSI_QLA22XX=m CONFIG_SCSI_QLA2300=m CONFIG_SCSI_QLA2322=m CONFIG_SCSI_QLA6312=m CONFIG_SCSI_QLA24XX=m CONFIG_SCSI_LPFC=m CONFIG_SCSI_DC395x=m # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_DEBUG is not set # # PCMCIA SCSI adapter support # # CONFIG_PCMCIA_FDOMAIN is not set CONFIG_PCMCIA_QLOGIC=m CONFIG_PCMCIA_SYM53C500=m # # Multi-device support (RAID and LVM) # CONFIG_MD=y CONFIG_BLK_DEV_MD=y CONFIG_MD_LINEAR=m CONFIG_MD_RAID0=m CONFIG_MD_RAID1=m CONFIG_MD_RAID10=m CONFIG_MD_RAID5=m CONFIG_MD_RAID6=m CONFIG_MD_MULTIPATH=m CONFIG_MD_FAULTY=m CONFIG_BLK_DEV_DM=m CONFIG_DM_CRYPT=m CONFIG_DM_SNAPSHOT=m CONFIG_DM_MIRROR=m CONFIG_DM_ZERO=m CONFIG_DM_MULTIPATH=m CONFIG_DM_MULTIPATH_EMC=m # # Fusion MPT device support # CONFIG_FUSION=y CONFIG_FUSION_SPI=m CONFIG_FUSION_FC=m CONFIG_FUSION_SAS=m CONFIG_FUSION_MAX_SGE=40 CONFIG_FUSION_CTL=m CONFIG_FUSION_LAN=m # # IEEE 1394 (FireWire) support # CONFIG_IEEE1394=m # # Subsystem Options # # CONFIG_IEEE1394_VERBOSEDEBUG is not set CONFIG_IEEE1394_OUI_DB=y CONFIG_IEEE1394_EXTRA_CONFIG_ROMS=y CONFIG_IEEE1394_CONFIG_ROM_IP1394=y # CONFIG_IEEE1394_EXPORT_FULL_API is not set # # Device Drivers # CONFIG_IEEE1394_PCILYNX=m CONFIG_IEEE1394_OHCI1394=m # # Protocol Drivers # CONFIG_IEEE1394_VIDEO1394=m CONFIG_IEEE1394_SBP2=m # CONFIG_IEEE1394_SBP2_PHYS_DMA is not set CONFIG_IEEE1394_ETH1394=m CONFIG_IEEE1394_DV1394=m CONFIG_IEEE1394_RAWIO=m CONFIG_IEEE1394_CMP=m CONFIG_IEEE1394_AMDTP=m # # I2O device support # # CONFIG_I2O is not set # # Macintosh device drivers # CONFIG_ADB_PMU=y CONFIG_PMAC_SMU=y CONFIG_THERM_PM72=y CONFIG_WINDFARM=y CONFIG_WINDFARM_PM81=y CONFIG_WINDFARM_PM91=y # # Network device support # CONFIG_NETDEVICES=y CONFIG_DUMMY=m CONFIG_BONDING=m CONFIG_EQUALIZER=m CONFIG_TUN=m # # ARCnet devices # # CONFIG_ARCNET is not set # # PHY device support # CONFIG_PHYLIB=m # # MII PHY device drivers # CONFIG_MARVELL_PHY=m CONFIG_DAVICOM_PHY=m 
CONFIG_QSEMI_PHY=m CONFIG_LXT_PHY=m CONFIG_CICADA_PHY=m # # Ethernet (10 or 100Mbit) # CONFIG_NET_ETHERNET=y CONFIG_MII=m CONFIG_HAPPYMEAL=m CONFIG_SUNGEM=m CONFIG_CASSINI=m CONFIG_NET_VENDOR_3COM=y CONFIG_VORTEX=m CONFIG_TYPHOON=m # # Tulip family network device support # CONFIG_NET_TULIP=y CONFIG_DE2104X=m CONFIG_TULIP=m # CONFIG_TULIP_MWI is not set CONFIG_TULIP_MMIO=y # CONFIG_TULIP_NAPI is not set CONFIG_DE4X5=m CONFIG_WINBOND_840=m CONFIG_DM9102=m CONFIG_ULI526X=m CONFIG_PCMCIA_XIRCOM=m # CONFIG_HP100 is not set CONFIG_IBMVETH=m CONFIG_NET_PCI=y CONFIG_PCNET32=m CONFIG_AMD8111_ETH=m CONFIG_AMD8111E_NAPI=y CONFIG_ADAPTEC_STARFIRE=m CONFIG_ADAPTEC_STARFIRE_NAPI=y CONFIG_B44=m CONFIG_FORCEDETH=m CONFIG_DGRS=m # CONFIG_EEPRO100 is not set CONFIG_E100=m CONFIG_FEALNX=m CONFIG_NATSEMI=m CONFIG_NE2K_PCI=m CONFIG_8139CP=m CONFIG_8139TOO=m # CONFIG_8139TOO_PIO is not set # CONFIG_8139TOO_TUNE_TWISTER is not set CONFIG_8139TOO_8129=y # CONFIG_8139_OLD_RX_RESET is not set CONFIG_SIS900=m CONFIG_EPIC100=m CONFIG_SUNDANCE=m # CONFIG_SUNDANCE_MMIO is not set CONFIG_VIA_RHINE=m CONFIG_VIA_RHINE_MMIO=y CONFIG_NET_POCKET=y CONFIG_DE600=m CONFIG_DE620=m # # Ethernet (1000 Mbit) # CONFIG_ACENIC=m # CONFIG_ACENIC_OMIT_TIGON_I is not set CONFIG_DL2K=m CONFIG_E1000=m CONFIG_E1000_NAPI=y CONFIG_NS83820=m CONFIG_HAMACHI=m CONFIG_YELLOWFIN=m CONFIG_R8169=m CONFIG_R8169_NAPI=y CONFIG_R8169_VLAN=y CONFIG_SIS190=m CONFIG_SKGE=m # CONFIG_SK98LIN is not set CONFIG_VIA_VELOCITY=m CONFIG_TIGON3=m CONFIG_BNX2=m # CONFIG_MV643XX_ETH is not set # # Ethernet (10000 Mbit) # CONFIG_CHELSIO_T1=m CONFIG_IXGB=m CONFIG_IXGB_NAPI=y CONFIG_S2IO=m CONFIG_S2IO_NAPI=y # # Token Ring devices # CONFIG_TR=y CONFIG_IBMOL=m CONFIG_3C359=m # CONFIG_TMS380TR is not set # # Wireless LAN (non-hamradio) # CONFIG_NET_RADIO=y # # Obsolete Wireless cards support (pre-802.11) # # CONFIG_STRIP is not set CONFIG_PCMCIA_WAVELAN=m CONFIG_PCMCIA_NETWAVE=m # # Wireless 802.11 Frequency Hopping cards support # # 
CONFIG_PCMCIA_RAYCS is not set # # Wireless 802.11b ISA/PCI cards support # # CONFIG_IPW2100 is not set # CONFIG_IPW2200 is not set CONFIG_AIRO=m CONFIG_HERMES=m CONFIG_APPLE_AIRPORT=m CONFIG_PLX_HERMES=m CONFIG_TMD_HERMES=m CONFIG_NORTEL_HERMES=m CONFIG_PCI_HERMES=m CONFIG_ATMEL=m CONFIG_PCI_ATMEL=m # # Wireless 802.11b Pcmcia/Cardbus cards support # CONFIG_PCMCIA_HERMES=m CONFIG_PCMCIA_SPECTRUM=m CONFIG_AIRO_CS=m CONFIG_PCMCIA_ATMEL=m CONFIG_PCMCIA_WL3501=m # # Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support # CONFIG_PRISM54=m CONFIG_HOSTAP=m CONFIG_HOSTAP_FIRMWARE=y CONFIG_HOSTAP_PLX=m CONFIG_HOSTAP_PCI=m CONFIG_HOSTAP_CS=m CONFIG_NET_WIRELESS=y # # PCMCIA network device support # CONFIG_NET_PCMCIA=y CONFIG_PCMCIA_3C589=m CONFIG_PCMCIA_3C574=m CONFIG_PCMCIA_FMVJ18X=m CONFIG_PCMCIA_PCNET=m CONFIG_PCMCIA_NMCLAN=m CONFIG_PCMCIA_SMC91C92=m CONFIG_PCMCIA_XIRC2PS=m CONFIG_PCMCIA_AXNET=m # # Wan interfaces # # CONFIG_WAN is not set # # ATM drivers # # CONFIG_ATM_DUMMY is not set CONFIG_ATM_TCP=m CONFIG_ATM_LANAI=m CONFIG_ATM_ENI=m # CONFIG_ATM_ENI_DEBUG is not set # CONFIG_ATM_ENI_TUNE_BURST is not set # CONFIG_ATM_FIRESTREAM is not set # CONFIG_ATM_ZATM is not set CONFIG_ATM_IDT77252=m # CONFIG_ATM_IDT77252_DEBUG is not set # CONFIG_ATM_IDT77252_RCV_ALL is not set CONFIG_ATM_IDT77252_USE_SUNI=y # CONFIG_ATM_AMBASSADOR is not set # CONFIG_ATM_HORIZON is not set CONFIG_ATM_FORE200E_MAYBE=m # CONFIG_ATM_FORE200E_PCA is not set CONFIG_ATM_HE=m # CONFIG_ATM_HE_USE_SUNI is not set CONFIG_FDDI=y # CONFIG_DEFXX is not set CONFIG_SKFP=m # CONFIG_HIPPI is not set CONFIG_PLIP=m CONFIG_PPP=m CONFIG_PPP_MULTILINK=y CONFIG_PPP_FILTER=y CONFIG_PPP_ASYNC=m CONFIG_PPP_SYNC_TTY=m CONFIG_PPP_DEFLATE=m # CONFIG_PPP_BSDCOMP is not set CONFIG_PPP_MPPE=m CONFIG_PPPOE=m CONFIG_PPPOATM=m CONFIG_SLIP=m CONFIG_SLIP_COMPRESSED=y CONFIG_SLIP_SMART=y # CONFIG_SLIP_MODE_SLIP6 is not set CONFIG_NET_FC=y # CONFIG_SHAPER is not set CONFIG_NETCONSOLE=m CONFIG_NETPOLL=y # CONFIG_NETPOLL_RX is not set 
CONFIG_NETPOLL_TRAP=y CONFIG_NET_POLL_CONTROLLER=y # # ISDN subsystem # CONFIG_ISDN=m # # Old ISDN4Linux # CONFIG_ISDN_I4L=m CONFIG_ISDN_PPP=y CONFIG_ISDN_PPP_VJ=y CONFIG_ISDN_MPP=y CONFIG_IPPP_FILTER=y # CONFIG_ISDN_PPP_BSDCOMP is not set CONFIG_ISDN_AUDIO=y CONFIG_ISDN_TTY_FAX=y # # ISDN feature submodules # CONFIG_ISDN_DIVERSION=m # # ISDN4Linux hardware drivers # # # Passive cards # CONFIG_ISDN_DRV_HISAX=m # # D-channel protocol features # CONFIG_HISAX_EURO=y CONFIG_DE_AOC=y CONFIG_HISAX_NO_SENDCOMPLETE=y CONFIG_HISAX_NO_LLC=y CONFIG_HISAX_NO_KEYPAD=y CONFIG_HISAX_1TR6=y CONFIG_HISAX_NI1=y CONFIG_HISAX_MAX_CARDS=8 # # HiSax supported cards # CONFIG_HISAX_16_3=y CONFIG_HISAX_S0BOX=y CONFIG_HISAX_AVM_A1_PCMCIA=y CONFIG_HISAX_ELSA=y CONFIG_HISAX_DIEHLDIVA=y CONFIG_HISAX_SEDLBAUER=y CONFIG_HISAX_NICCY=y CONFIG_HISAX_BKM_A4T=y CONFIG_HISAX_SCT_QUADRO=y CONFIG_HISAX_GAZEL=y CONFIG_HISAX_W6692=y CONFIG_HISAX_HFC_SX=y # CONFIG_HISAX_DEBUG is not set # # HiSax PCMCIA card service modules # CONFIG_HISAX_SEDLBAUER_CS=m CONFIG_HISAX_ELSA_CS=m CONFIG_HISAX_AVM_A1_CS=m CONFIG_HISAX_TELES_CS=m # # HiSax sub driver modules # CONFIG_HISAX_ST5481=m # CONFIG_HISAX_HFCUSB is not set CONFIG_HISAX_HFC4S8S=m CONFIG_HISAX_FRITZ_PCIPNP=m CONFIG_HISAX_HDLC=y # # Active cards # # # CAPI subsystem # CONFIG_ISDN_CAPI=m CONFIG_ISDN_DRV_AVMB1_VERBOSE_REASON=y CONFIG_ISDN_CAPI_MIDDLEWARE=y CONFIG_ISDN_CAPI_CAPI20=m CONFIG_ISDN_CAPI_CAPIFS_BOOL=y CONFIG_ISDN_CAPI_CAPIFS=m CONFIG_ISDN_CAPI_CAPIDRV=m # # CAPI hardware drivers # # # Active AVM cards # CONFIG_CAPI_AVM=y CONFIG_ISDN_DRV_AVMB1_B1PCI=m CONFIG_ISDN_DRV_AVMB1_B1PCIV4=y CONFIG_ISDN_DRV_AVMB1_B1PCMCIA=m CONFIG_ISDN_DRV_AVMB1_AVM_CS=m CONFIG_ISDN_DRV_AVMB1_T1PCI=m CONFIG_ISDN_DRV_AVMB1_C4=m # # Active Eicon DIVA Server cards # CONFIG_CAPI_EICON=y CONFIG_ISDN_DIVAS=m CONFIG_ISDN_DIVAS_BRIPCI=y CONFIG_ISDN_DIVAS_PRIPCI=y CONFIG_ISDN_DIVAS_DIVACAPI=m CONFIG_ISDN_DIVAS_USERIDI=m CONFIG_ISDN_DIVAS_MAINT=m # # Telephony Support # # CONFIG_PHONE 
is not set # # Input device support # CONFIG_INPUT=y # # Userland interfaces # CONFIG_INPUT_MOUSEDEV=y # CONFIG_INPUT_MOUSEDEV_PSAUX is not set CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 CONFIG_INPUT_JOYDEV=m # CONFIG_INPUT_TSDEV is not set CONFIG_INPUT_EVDEV=y # CONFIG_INPUT_EVBUG is not set # # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y # CONFIG_KEYBOARD_SUNKBD is not set # CONFIG_KEYBOARD_LKKBD is not set # CONFIG_KEYBOARD_XTKBD is not set # CONFIG_KEYBOARD_NEWTON is not set CONFIG_INPUT_MOUSE=y CONFIG_MOUSE_PS2=y CONFIG_MOUSE_SERIAL=m CONFIG_MOUSE_VSXXXAA=m CONFIG_INPUT_JOYSTICK=y CONFIG_JOYSTICK_ANALOG=m CONFIG_JOYSTICK_A3D=m CONFIG_JOYSTICK_ADI=m CONFIG_JOYSTICK_COBRA=m CONFIG_JOYSTICK_GF2K=m CONFIG_JOYSTICK_GRIP=m CONFIG_JOYSTICK_GRIP_MP=m CONFIG_JOYSTICK_GUILLEMOT=m CONFIG_JOYSTICK_INTERACT=m CONFIG_JOYSTICK_SIDEWINDER=m CONFIG_JOYSTICK_TMDC=m CONFIG_JOYSTICK_IFORCE=m CONFIG_JOYSTICK_IFORCE_USB=y CONFIG_JOYSTICK_IFORCE_232=y CONFIG_JOYSTICK_WARRIOR=m CONFIG_JOYSTICK_MAGELLAN=m CONFIG_JOYSTICK_SPACEORB=m CONFIG_JOYSTICK_SPACEBALL=m CONFIG_JOYSTICK_STINGER=m CONFIG_JOYSTICK_TWIDJOY=m CONFIG_JOYSTICK_DB9=m CONFIG_JOYSTICK_GAMECON=m CONFIG_JOYSTICK_TURBOGRAFX=m CONFIG_JOYSTICK_JOYDUMP=m CONFIG_INPUT_TOUCHSCREEN=y CONFIG_TOUCHSCREEN_GUNZE=m CONFIG_TOUCHSCREEN_ELO=m CONFIG_TOUCHSCREEN_MTOUCH=m CONFIG_TOUCHSCREEN_MK712=m CONFIG_INPUT_MISC=y # CONFIG_INPUT_PCSPKR is not set CONFIG_INPUT_UINPUT=m # # Hardware I/O ports # CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_SERIO_SERPORT=y # CONFIG_SERIO_PARKBD is not set # CONFIG_SERIO_PCIPS2 is not set CONFIG_SERIO_LIBPS2=y # CONFIG_SERIO_RAW is not set CONFIG_GAMEPORT=m CONFIG_GAMEPORT_NS558=m CONFIG_GAMEPORT_L4=m CONFIG_GAMEPORT_EMU10K1=m CONFIG_GAMEPORT_FM801=m # # Character devices # CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_SERIAL_NONSTANDARD=y CONFIG_ROCKETPORT=m # CONFIG_CYCLADES is not set # CONFIG_DIGIEPCA is not set # CONFIG_MOXA_SMARTIO is 
not set # CONFIG_ISI is not set # CONFIG_SYNCLINK is not set # CONFIG_SYNCLINKMP is not set CONFIG_N_HDLC=m # CONFIG_SPECIALIX is not set # CONFIG_SX is not set CONFIG_STALDRV=y # # Serial drivers # CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_CS=m CONFIG_SERIAL_8250_NR_UARTS=32 CONFIG_SERIAL_8250_EXTENDED=y CONFIG_SERIAL_8250_MANY_PORTS=y CONFIG_SERIAL_8250_SHARE_IRQ=y CONFIG_SERIAL_8250_DETECT_IRQ=y CONFIG_SERIAL_8250_RSA=y # # Non-8250 serial port support # CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_SERIAL_PMACZILOG=m CONFIG_SERIAL_ICOM=m # CONFIG_SERIAL_JSM is not set CONFIG_UNIX98_PTYS=y # CONFIG_LEGACY_PTYS is not set CONFIG_PRINTER=m CONFIG_LP_CONSOLE=y CONFIG_PPDEV=m CONFIG_TIPAR=m CONFIG_HVC_CONSOLE=y CONFIG_HVCS=y # # IPMI # CONFIG_IPMI_HANDLER=m # CONFIG_IPMI_PANIC_EVENT is not set CONFIG_IPMI_DEVICE_INTERFACE=m CONFIG_IPMI_SI=m CONFIG_IPMI_WATCHDOG=m CONFIG_IPMI_POWEROFF=m # # Watchdog Cards # CONFIG_WATCHDOG=y # CONFIG_WATCHDOG_NOWAYOUT is not set # # Watchdog Device Drivers # CONFIG_SOFT_WATCHDOG=m CONFIG_WATCHDOG_RTAS=m # # PCI-based Watchdog Cards # CONFIG_PCIPCWATCHDOG=m CONFIG_WDTPCI=m CONFIG_WDT_501_PCI=y # # USB-based Watchdog Cards # CONFIG_USBPCWATCHDOG=m # CONFIG_RTC is not set CONFIG_GEN_RTC=y # CONFIG_GEN_RTC_X is not set CONFIG_DTLK=m CONFIG_R3964=m # CONFIG_APPLICOM is not set # # Ftape, the floppy tape device driver # CONFIG_AGP=y CONFIG_AGP_UNINORTH=y CONFIG_DRM=m CONFIG_DRM_TDFX=m CONFIG_DRM_R128=m CONFIG_DRM_RADEON=m CONFIG_DRM_MGA=m CONFIG_DRM_SIS=m CONFIG_DRM_VIA=m CONFIG_DRM_SAVAGE=m # # PCMCIA character devices # # CONFIG_SYNCLINK_CS is not set CONFIG_CARDMAN_4000=m CONFIG_CARDMAN_4040=m # CONFIG_RAW_DRIVER is not set CONFIG_HANGCHECK_TIMER=m # # TPM devices # # CONFIG_TCG_TPM is not set # CONFIG_TELCLOCK is not set # # I2C support # CONFIG_I2C=y CONFIG_I2C_CHARDEV=m # # I2C Algorithms # CONFIG_I2C_ALGOBIT=y CONFIG_I2C_ALGOPCF=m CONFIG_I2C_ALGOPCA=m # # I2C Hardware Bus support # # 
CONFIG_I2C_ALI1535 is not set # CONFIG_I2C_ALI1563 is not set # CONFIG_I2C_ALI15X3 is not set # CONFIG_I2C_AMD756 is not set # CONFIG_I2C_AMD8111 is not set # CONFIG_I2C_I801 is not set # CONFIG_I2C_I810 is not set # CONFIG_I2C_PIIX4 is not set CONFIG_I2C_ISA=m CONFIG_I2C_KEYWEST=y CONFIG_I2C_PMAC_SMU=y CONFIG_I2C_NFORCE2=m CONFIG_I2C_PARPORT=m CONFIG_I2C_PARPORT_LIGHT=m CONFIG_I2C_PROSAVAGE=m CONFIG_I2C_SAVAGE4=m # CONFIG_SCx200_ACB is not set # CONFIG_I2C_SIS5595 is not set # CONFIG_I2C_SIS630 is not set # CONFIG_I2C_SIS96X is not set CONFIG_I2C_STUB=m # CONFIG_I2C_VIA is not set # CONFIG_I2C_VIAPRO is not set CONFIG_I2C_VOODOO3=m CONFIG_I2C_PCA_ISA=m # # Miscellaneous I2C Chip support # CONFIG_SENSORS_DS1337=m CONFIG_SENSORS_DS1374=m CONFIG_SENSORS_EEPROM=m CONFIG_SENSORS_PCF8574=m CONFIG_SENSORS_PCA9539=m CONFIG_SENSORS_PCF8591=m CONFIG_SENSORS_RTC8564=m CONFIG_SENSORS_MAX6875=m CONFIG_RTC_X1205_I2C=m # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set # CONFIG_I2C_DEBUG_CHIP is not set # # Dallas's 1-wire bus # CONFIG_W1=m CONFIG_W1_MATROX=m CONFIG_W1_DS9490=m CONFIG_W1_DS9490_BRIDGE=m CONFIG_W1_THERM=m CONFIG_W1_SMEM=m CONFIG_W1_DS2433=m CONFIG_W1_DS2433_CRC=y # # Hardware Monitoring support # CONFIG_HWMON=m CONFIG_HWMON_VID=m CONFIG_SENSORS_ADM1021=m CONFIG_SENSORS_ADM1025=m CONFIG_SENSORS_ADM1026=m CONFIG_SENSORS_ADM1031=m CONFIG_SENSORS_ADM9240=m CONFIG_SENSORS_ASB100=m CONFIG_SENSORS_ATXP1=m CONFIG_SENSORS_DS1621=m CONFIG_SENSORS_FSCHER=m CONFIG_SENSORS_FSCPOS=m CONFIG_SENSORS_GL518SM=m CONFIG_SENSORS_GL520SM=m CONFIG_SENSORS_IT87=m CONFIG_SENSORS_LM63=m CONFIG_SENSORS_LM75=m CONFIG_SENSORS_LM77=m CONFIG_SENSORS_LM78=m CONFIG_SENSORS_LM80=m CONFIG_SENSORS_LM83=m CONFIG_SENSORS_LM85=m CONFIG_SENSORS_LM87=m CONFIG_SENSORS_LM90=m CONFIG_SENSORS_LM92=m CONFIG_SENSORS_MAX1619=m CONFIG_SENSORS_PC87360=m CONFIG_SENSORS_SIS5595=m CONFIG_SENSORS_SMSC47M1=m CONFIG_SENSORS_SMSC47B397=m CONFIG_SENSORS_VIA686A=m 
CONFIG_SENSORS_W83781D=m CONFIG_SENSORS_W83792D=m CONFIG_SENSORS_W83L785TS=m CONFIG_SENSORS_W83627HF=m CONFIG_SENSORS_W83627EHF=m # CONFIG_HWMON_DEBUG_CHIP is not set # # Misc devices # # # Multimedia Capabilities Port drivers # # # Multimedia devices # CONFIG_VIDEO_DEV=m # # Video For Linux # # # Video Adapters # CONFIG_VIDEO_BT848=m CONFIG_VIDEO_BT848_DVB=y CONFIG_VIDEO_SAA6588=m CONFIG_VIDEO_BWQCAM=m CONFIG_VIDEO_CQCAM=m CONFIG_VIDEO_W9966=m CONFIG_VIDEO_CPIA=m CONFIG_VIDEO_CPIA_PP=m CONFIG_VIDEO_CPIA_USB=m CONFIG_VIDEO_SAA5246A=m CONFIG_VIDEO_SAA5249=m CONFIG_TUNER_3036=m # CONFIG_VIDEO_STRADIS is not set # CONFIG_VIDEO_ZORAN is not set CONFIG_VIDEO_SAA7134=m CONFIG_VIDEO_SAA7134_ALSA=m CONFIG_VIDEO_SAA7134_DVB=m CONFIG_VIDEO_SAA7134_DVB_ALL_FRONTENDS=y CONFIG_VIDEO_MXB=m CONFIG_VIDEO_DPC=m CONFIG_VIDEO_HEXIUM_ORION=m CONFIG_VIDEO_HEXIUM_GEMINI=m CONFIG_VIDEO_CX88=m CONFIG_VIDEO_CX88_DVB=m CONFIG_VIDEO_CX88_DVB_ALL_FRONTENDS=y CONFIG_VIDEO_EM28XX=m CONFIG_VIDEO_OVCAMCHIP=m CONFIG_VIDEO_AUDIO_DECODER=m CONFIG_VIDEO_DECODER=m # # Radio Adapters # CONFIG_RADIO_GEMTEK_PCI=m CONFIG_RADIO_MAXIRADIO=m CONFIG_RADIO_MAESTRO=m # # Digital Video Broadcasting Devices # CONFIG_DVB=y CONFIG_DVB_CORE=m # # Supported SAA7146 based PCI Adapters # CONFIG_DVB_AV7110=m CONFIG_DVB_AV7110_OSD=y CONFIG_DVB_BUDGET=m CONFIG_DVB_BUDGET_CI=m CONFIG_DVB_BUDGET_AV=m CONFIG_DVB_BUDGET_PATCH=m # # Supported USB Adapters # CONFIG_DVB_USB=m # CONFIG_DVB_USB_DEBUG is not set CONFIG_DVB_USB_A800=m CONFIG_DVB_USB_DIBUSB_MB=m CONFIG_DVB_USB_DIBUSB_MC=m CONFIG_DVB_USB_UMT_010=m CONFIG_DVB_USB_CXUSB=m CONFIG_DVB_USB_DIGITV=m CONFIG_DVB_USB_VP7045=m CONFIG_DVB_USB_VP702X=m CONFIG_DVB_USB_NOVA_T_USB2=m CONFIG_DVB_USB_DTT200U=m CONFIG_DVB_TTUSB_BUDGET=m CONFIG_DVB_TTUSB_DEC=m CONFIG_DVB_CINERGYT2=m CONFIG_DVB_CINERGYT2_TUNING=y CONFIG_DVB_CINERGYT2_STREAM_URB_COUNT=32 CONFIG_DVB_CINERGYT2_STREAM_BUF_SIZE=512 CONFIG_DVB_CINERGYT2_QUERY_INTERVAL=250 CONFIG_DVB_CINERGYT2_ENABLE_RC_INPUT_DEVICE=y 
CONFIG_DVB_CINERGYT2_RC_QUERY_INTERVAL=100 # # Supported FlexCopII (B2C2) Adapters # CONFIG_DVB_B2C2_FLEXCOP=m CONFIG_DVB_B2C2_FLEXCOP_PCI=m CONFIG_DVB_B2C2_FLEXCOP_USB=m # CONFIG_DVB_B2C2_FLEXCOP_DEBUG is not set # # Supported BT878 Adapters # CONFIG_DVB_BT8XX=m # # Supported Pluto2 Adapters # CONFIG_DVB_PLUTO2=m # # Supported DVB Frontends # # # Customise DVB Frontends # # # DVB-S (satellite) frontends # CONFIG_DVB_STV0299=m CONFIG_DVB_CX24110=m CONFIG_DVB_TDA8083=m CONFIG_DVB_TDA80XX=m CONFIG_DVB_MT312=m CONFIG_DVB_VES1X93=m CONFIG_DVB_S5H1420=m # # DVB-T (terrestrial) frontends # CONFIG_DVB_SP8870=m CONFIG_DVB_SP887X=m CONFIG_DVB_CX22700=m CONFIG_DVB_CX22702=m CONFIG_DVB_L64781=m CONFIG_DVB_TDA1004X=m CONFIG_DVB_NXT6000=m CONFIG_DVB_MT352=m CONFIG_DVB_DIB3000MB=m CONFIG_DVB_DIB3000MC=m # # DVB-C (cable) frontends # CONFIG_DVB_ATMEL_AT76C651=m CONFIG_DVB_VES1820=m CONFIG_DVB_TDA10021=m CONFIG_DVB_STV0297=m # # ATSC (North American/Korean Terresterial DTV) frontends # CONFIG_DVB_NXT2002=m CONFIG_DVB_NXT200X=m CONFIG_DVB_OR51211=m CONFIG_DVB_OR51132=m CONFIG_DVB_BCM3510=m CONFIG_DVB_LGDT330X=m CONFIG_VIDEO_SAA7146=m CONFIG_VIDEO_SAA7146_VV=m CONFIG_VIDEO_VIDEOBUF=m CONFIG_VIDEO_TUNER=m CONFIG_VIDEO_BUF=m CONFIG_VIDEO_BUF_DVB=m CONFIG_VIDEO_BTCX=m CONFIG_VIDEO_IR=m CONFIG_VIDEO_TVEEPROM=m # # Graphics support # CONFIG_FB=y CONFIG_FB_CFB_FILLRECT=y CONFIG_FB_CFB_COPYAREA=y CONFIG_FB_CFB_IMAGEBLIT=y CONFIG_FB_MACMODES=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y CONFIG_FB_CIRRUS=m # CONFIG_FB_PM2 is not set # CONFIG_FB_CYBER2000 is not set CONFIG_FB_OF=y # CONFIG_FB_CONTROL is not set # CONFIG_FB_PLATINUM is not set # CONFIG_FB_VALKYRIE is not set # CONFIG_FB_CT65550 is not set # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set CONFIG_VIDEO_SELECT=y # CONFIG_FB_S1D13XXX is not set # CONFIG_FB_NVIDIA is not set CONFIG_FB_RIVA=m # CONFIG_FB_RIVA_I2C is not set # CONFIG_FB_RIVA_DEBUG is not set CONFIG_FB_MATROX=m 
CONFIG_FB_MATROX_MILLENIUM=y CONFIG_FB_MATROX_MYSTIQUE=y CONFIG_FB_MATROX_G=y CONFIG_FB_MATROX_I2C=m CONFIG_FB_MATROX_MAVEN=m CONFIG_FB_MATROX_MULTIHEAD=y # CONFIG_FB_RADEON_OLD is not set CONFIG_FB_RADEON=y CONFIG_FB_RADEON_I2C=y # CONFIG_FB_RADEON_DEBUG is not set # CONFIG_FB_ATY128 is not set # CONFIG_FB_ATY is not set CONFIG_FB_SAVAGE=m CONFIG_FB_SAVAGE_I2C=y CONFIG_FB_SAVAGE_ACCEL=y # CONFIG_FB_SIS is not set CONFIG_FB_NEOMAGIC=m CONFIG_FB_KYRO=m CONFIG_FB_3DFX=m CONFIG_FB_3DFX_ACCEL=y CONFIG_FB_VOODOO1=m CONFIG_FB_CYBLA=m CONFIG_FB_TRIDENT=m CONFIG_FB_TRIDENT_ACCEL=y # CONFIG_FB_VIRTUAL is not set # # Console display driver support # CONFIG_VGA_CONSOLE=y CONFIG_DUMMY_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y # CONFIG_FONTS is not set CONFIG_FONT_8x8=y CONFIG_FONT_8x16=y # # Logo configuration # CONFIG_LOGO=y # CONFIG_LOGO_LINUX_MONO is not set # CONFIG_LOGO_LINUX_VGA16 is not set CONFIG_LOGO_LINUX_CLUT224=y CONFIG_BACKLIGHT_LCD_SUPPORT=y CONFIG_BACKLIGHT_CLASS_DEVICE=m CONFIG_BACKLIGHT_DEVICE=y CONFIG_LCD_CLASS_DEVICE=m CONFIG_LCD_DEVICE=y # # Sound # CONFIG_SOUND=m # # Advanced Linux Sound Architecture # CONFIG_SND=m CONFIG_SND_AC97_CODEC=m CONFIG_SND_AC97_BUS=m CONFIG_SND_TIMER=m CONFIG_SND_PCM=m CONFIG_SND_HWDEP=m CONFIG_SND_RAWMIDI=m CONFIG_SND_SEQUENCER=m CONFIG_SND_SEQ_DUMMY=m CONFIG_SND_OSSEMUL=y CONFIG_SND_MIXER_OSS=m CONFIG_SND_PCM_OSS=m CONFIG_SND_SEQUENCER_OSS=y # CONFIG_SND_VERBOSE_PRINTK is not set # CONFIG_SND_DEBUG is not set CONFIG_SND_GENERIC_DRIVER=y # # Generic devices # CONFIG_SND_MPU401_UART=m CONFIG_SND_OPL3_LIB=m CONFIG_SND_VX_LIB=m CONFIG_SND_DUMMY=m CONFIG_SND_VIRMIDI=m CONFIG_SND_MTPAV=m # CONFIG_SND_SERIAL_U16550 is not set CONFIG_SND_MPU401=m # # PCI devices # CONFIG_SND_ALI5451=m CONFIG_SND_ATIIXP=m CONFIG_SND_ATIIXP_MODEM=m CONFIG_SND_AU8810=m CONFIG_SND_AU8820=m CONFIG_SND_AU8830=m CONFIG_SND_AZT3328=m CONFIG_SND_BT87X=m # CONFIG_SND_BT87X_OVERCLOCK is not set CONFIG_SND_CS46XX=m 
CONFIG_SND_CS46XX_NEW_DSP=y CONFIG_SND_CS4281=m CONFIG_SND_EMU10K1=m CONFIG_SND_EMU10K1X=m CONFIG_SND_CA0106=m CONFIG_SND_KORG1212=m CONFIG_SND_MIXART=m CONFIG_SND_NM256=m CONFIG_SND_RME32=m CONFIG_SND_RME96=m CONFIG_SND_RME9652=m CONFIG_SND_HDSP=m CONFIG_SND_HDSPM=m CONFIG_SND_TRIDENT=m CONFIG_SND_YMFPCI=m CONFIG_SND_AD1889=m CONFIG_SND_ALS4000=m CONFIG_SND_CMIPCI=m CONFIG_SND_ENS1370=m CONFIG_SND_ENS1371=m CONFIG_SND_ES1938=m CONFIG_SND_ES1968=m CONFIG_SND_MAESTRO3=m CONFIG_SND_FM801=m CONFIG_SND_FM801_TEA575X=m CONFIG_SND_ICE1712=m CONFIG_SND_ICE1724=m CONFIG_SND_INTEL8X0=m CONFIG_SND_INTEL8X0M=m CONFIG_SND_SONICVIBES=m CONFIG_SND_VIA82XX=m CONFIG_SND_VIA82XX_MODEM=m CONFIG_SND_VX222=m CONFIG_SND_HDA_INTEL=m # # ALSA PowerMac devices # CONFIG_SND_POWERMAC=m CONFIG_SND_POWERMAC_AUTO_DRC=y # # USB devices # CONFIG_SND_USB_AUDIO=m CONFIG_SND_USB_USX2Y=m # # PCMCIA devices # # # Open Sound System # # CONFIG_SOUND_PRIME is not set # # USB support # CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB=y # CONFIG_USB_DEBUG is not set # # Miscellaneous USB options # CONFIG_USB_DEVICEFS=y # CONFIG_USB_BANDWIDTH is not set # CONFIG_USB_DYNAMIC_MINORS is not set # CONFIG_USB_SUSPEND is not set # CONFIG_USB_OTG is not set # # USB Host Controller Drivers # CONFIG_USB_EHCI_HCD=m CONFIG_USB_EHCI_SPLIT_ISO=y CONFIG_USB_EHCI_ROOT_HUB_TT=y CONFIG_USB_ISP116X_HCD=m CONFIG_USB_OHCI_HCD=m # CONFIG_USB_OHCI_BIG_ENDIAN is not set CONFIG_USB_OHCI_LITTLE_ENDIAN=y CONFIG_USB_UHCI_HCD=m CONFIG_USB_SL811_HCD=m CONFIG_USB_SL811_CS=m # # USB Device Class drivers # # CONFIG_OBSOLETE_OSS_USB_DRIVER is not set CONFIG_USB_ACM=m CONFIG_USB_PRINTER=m # # NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support' # # # may also be needed; see USB_STORAGE Help for more information # CONFIG_USB_STORAGE=m # CONFIG_USB_STORAGE_DEBUG is not set CONFIG_USB_STORAGE_DATAFAB=y CONFIG_USB_STORAGE_FREECOM=y CONFIG_USB_STORAGE_ISD200=y CONFIG_USB_STORAGE_DPCM=y CONFIG_USB_STORAGE_USBAT=y 
CONFIG_USB_STORAGE_SDDR09=y CONFIG_USB_STORAGE_SDDR55=y CONFIG_USB_STORAGE_JUMPSHOT=y # # USB Input Devices # CONFIG_USB_HID=y CONFIG_USB_HIDINPUT=y CONFIG_HID_FF=y CONFIG_HID_PID=y CONFIG_LOGITECH_FF=y CONFIG_THRUSTMASTER_FF=y CONFIG_USB_HIDDEV=y CONFIG_USB_AIPTEK=m CONFIG_USB_WACOM=m CONFIG_USB_ACECAD=m CONFIG_USB_KBTAB=m CONFIG_USB_POWERMATE=m CONFIG_USB_MTOUCH=m CONFIG_USB_ITMTOUCH=m CONFIG_USB_EGALAX=m # CONFIG_USB_YEALINK is not set CONFIG_USB_XPAD=m CONFIG_USB_ATI_REMOTE=m CONFIG_USB_KEYSPAN_REMOTE=m CONFIG_USB_APPLETOUCH=m # # USB Imaging devices # CONFIG_USB_MDC800=m CONFIG_USB_MICROTEK=m # # USB Multimedia devices # CONFIG_USB_DABUSB=m CONFIG_USB_VICAM=m CONFIG_USB_DSBR=m CONFIG_USB_IBMCAM=m CONFIG_USB_KONICAWC=m CONFIG_USB_OV511=m CONFIG_USB_SE401=m CONFIG_USB_SN9C102=m CONFIG_USB_STV680=m CONFIG_USB_W9968CF=m CONFIG_USB_PWC=m # # USB Network Adapters # CONFIG_USB_CATC=m CONFIG_USB_KAWETH=m CONFIG_USB_PEGASUS=m CONFIG_USB_RTL8150=m CONFIG_USB_USBNET=m CONFIG_USB_NET_AX8817X=m CONFIG_USB_NET_CDCETHER=m CONFIG_USB_NET_GL620A=m CONFIG_USB_NET_NET1080=m CONFIG_USB_NET_PLUSB=m CONFIG_USB_NET_RNDIS_HOST=m CONFIG_USB_NET_CDC_SUBSET=m CONFIG_USB_ALI_M5632=y CONFIG_USB_AN2720=y CONFIG_USB_BELKIN=y CONFIG_USB_ARMLINUX=y CONFIG_USB_EPSON2888=y CONFIG_USB_NET_ZAURUS=m CONFIG_USB_ZD1201=m CONFIG_USB_MON=y # # USB port drivers # CONFIG_USB_USS720=m # # USB Serial Converter support # CONFIG_USB_SERIAL=m CONFIG_USB_SERIAL_GENERIC=y CONFIG_USB_SERIAL_AIRPRIME=m CONFIG_USB_SERIAL_ANYDATA=m CONFIG_USB_SERIAL_BELKIN=m CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m CONFIG_USB_SERIAL_CP2101=m CONFIG_USB_SERIAL_CYPRESS_M8=m CONFIG_USB_SERIAL_EMPEG=m CONFIG_USB_SERIAL_FTDI_SIO=m CONFIG_USB_SERIAL_VISOR=m CONFIG_USB_SERIAL_IPAQ=m CONFIG_USB_SERIAL_IR=m CONFIG_USB_SERIAL_EDGEPORT=m CONFIG_USB_SERIAL_EDGEPORT_TI=m CONFIG_USB_SERIAL_GARMIN=m CONFIG_USB_SERIAL_IPW=m CONFIG_USB_SERIAL_KEYSPAN_PDA=m CONFIG_USB_SERIAL_KEYSPAN=m CONFIG_USB_SERIAL_KEYSPAN_MPR=y CONFIG_USB_SERIAL_KEYSPAN_USA28=y 
CONFIG_USB_SERIAL_KEYSPAN_USA28X=y CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y CONFIG_USB_SERIAL_KEYSPAN_USA19=y CONFIG_USB_SERIAL_KEYSPAN_USA18X=y CONFIG_USB_SERIAL_KEYSPAN_USA19W=y CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y CONFIG_USB_SERIAL_KEYSPAN_USA49W=y CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y CONFIG_USB_SERIAL_KLSI=m CONFIG_USB_SERIAL_KOBIL_SCT=m CONFIG_USB_SERIAL_MCT_U232=m CONFIG_USB_SERIAL_PL2303=m CONFIG_USB_SERIAL_HP4X=m CONFIG_USB_SERIAL_SAFE=m CONFIG_USB_SERIAL_SAFE_PADDED=y CONFIG_USB_SERIAL_TI=m CONFIG_USB_SERIAL_CYBERJACK=m CONFIG_USB_SERIAL_XIRCOM=m CONFIG_USB_SERIAL_OPTION=m CONFIG_USB_SERIAL_OMNINET=m CONFIG_USB_EZUSB=y # # USB Miscellaneous drivers # CONFIG_USB_EMI62=m # CONFIG_USB_EMI26 is not set CONFIG_USB_AUERSWALD=m CONFIG_USB_RIO500=m CONFIG_USB_LEGOTOWER=m CONFIG_USB_LCD=m CONFIG_USB_LED=m # CONFIG_USB_CYTHERM is not set CONFIG_USB_PHIDGETKIT=m CONFIG_USB_PHIDGETSERVO=m CONFIG_USB_IDMOUSE=m CONFIG_USB_SISUSBVGA=m CONFIG_USB_SISUSBVGA_CON=y CONFIG_USB_LD=m CONFIG_USB_TEST=m # # USB DSL modem support # CONFIG_USB_ATM=m CONFIG_USB_SPEEDTOUCH=m CONFIG_USB_CXACRU=m CONFIG_USB_XUSBATM=m # # USB Gadget Support # # CONFIG_USB_GADGET is not set # # MMC/SD Card support # CONFIG_MMC=m # CONFIG_MMC_DEBUG is not set CONFIG_MMC_BLOCK=m CONFIG_MMC_WBSD=m # # InfiniBand support # CONFIG_INFINIBAND=m CONFIG_INFINIBAND_USER_MAD=m CONFIG_INFINIBAND_USER_ACCESS=m CONFIG_INFINIBAND_MTHCA=m # CONFIG_INFINIBAND_MTHCA_DEBUG is not set CONFIG_INFINIBAND_IPOIB=m # CONFIG_INFINIBAND_IPOIB_DEBUG is not set CONFIG_INFINIBAND_SRP=m # # SN Devices # # # File systems # CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y CONFIG_EXT2_FS_SECURITY=y # CONFIG_EXT2_FS_XIP is not set CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y CONFIG_EXT3_FS_SECURITY=y CONFIG_JBD=y # CONFIG_JBD_DEBUG is not set CONFIG_FS_MBCACHE=y CONFIG_REISERFS_FS=m # CONFIG_REISERFS_CHECK is not set 
CONFIG_REISERFS_PROC_INFO=y CONFIG_REISERFS_FS_XATTR=y CONFIG_REISERFS_FS_POSIX_ACL=y CONFIG_REISERFS_FS_SECURITY=y CONFIG_JFS_FS=m CONFIG_JFS_POSIX_ACL=y CONFIG_JFS_SECURITY=y # CONFIG_JFS_DEBUG is not set # CONFIG_JFS_STATISTICS is not set CONFIG_FS_POSIX_ACL=y CONFIG_XFS_FS=m CONFIG_XFS_EXPORT=y CONFIG_XFS_QUOTA=y CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_XFS_RT is not set CONFIG_MINIX_FS=m CONFIG_ROMFS_FS=m CONFIG_INOTIFY=y CONFIG_QUOTA=y # CONFIG_QFMT_V1 is not set CONFIG_QFMT_V2=y CONFIG_QUOTACTL=y CONFIG_DNOTIFY=y CONFIG_AUTOFS_FS=m CONFIG_AUTOFS4_FS=m CONFIG_FUSE_FS=m # # CD-ROM/DVD Filesystems # CONFIG_ISO9660_FS=y CONFIG_JOLIET=y CONFIG_ZISOFS=y CONFIG_ZISOFS_FS=y CONFIG_UDF_FS=m CONFIG_UDF_NLS=y # # DOS/FAT/NT Filesystems # CONFIG_FAT_FS=m CONFIG_MSDOS_FS=m CONFIG_VFAT_FS=m CONFIG_FAT_DEFAULT_CODEPAGE=437 CONFIG_FAT_DEFAULT_IOCHARSET="ascii" # CONFIG_NTFS_FS is not set # # Pseudo filesystems # CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y CONFIG_TMPFS=y CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y CONFIG_RELAYFS_FS=m # # Miscellaneous filesystems # # CONFIG_ADFS_FS is not set CONFIG_AFFS_FS=m CONFIG_HFS_FS=m CONFIG_HFSPLUS_FS=m CONFIG_BEFS_FS=m # CONFIG_BEFS_DEBUG is not set CONFIG_BFS_FS=m CONFIG_EFS_FS=m # CONFIG_JFFS_FS is not set CONFIG_JFFS2_FS=m CONFIG_JFFS2_FS_DEBUG=0 CONFIG_JFFS2_FS_WRITEBUFFER=y CONFIG_JFFS2_SUMMARY=y # CONFIG_JFFS2_COMPRESSION_OPTIONS is not set CONFIG_JFFS2_ZLIB=y CONFIG_JFFS2_RTIME=y # CONFIG_JFFS2_RUBIN is not set CONFIG_CRAMFS=m CONFIG_VXFS_FS=m # CONFIG_HPFS_FS is not set CONFIG_QNX4FS_FS=m CONFIG_SYSV_FS=m CONFIG_UFS_FS=m # CONFIG_UFS_FS_WRITE is not set # # Network File Systems # CONFIG_NFS_FS=m CONFIG_NFS_V3=y CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y CONFIG_NFS_DIRECTIO=y CONFIG_NFSD=m CONFIG_NFSD_V2_ACL=y CONFIG_NFSD_V3=y CONFIG_NFSD_V3_ACL=y CONFIG_NFSD_V4=y CONFIG_NFSD_TCP=y CONFIG_LOCKD=m CONFIG_LOCKD_V4=y CONFIG_EXPORTFS=m CONFIG_NFS_ACL_SUPPORT=m CONFIG_NFS_COMMON=y CONFIG_SUNRPC=m 
CONFIG_SUNRPC_GSS=m CONFIG_RPCSEC_GSS_KRB5=m CONFIG_RPCSEC_GSS_SPKM3=m # CONFIG_SMB_FS is not set CONFIG_CIFS=m # CONFIG_CIFS_STATS is not set CONFIG_CIFS_XATTR=y CONFIG_CIFS_POSIX=y # CONFIG_CIFS_EXPERIMENTAL is not set CONFIG_NCP_FS=m CONFIG_NCPFS_PACKET_SIGNING=y CONFIG_NCPFS_IOCTL_LOCKING=y CONFIG_NCPFS_STRONG=y CONFIG_NCPFS_NFS_NS=y CONFIG_NCPFS_OS2_NS=y CONFIG_NCPFS_SMALLDOS=y CONFIG_NCPFS_NLS=y CONFIG_NCPFS_EXTRAS=y CONFIG_CODA_FS=m # CONFIG_CODA_FS_OLD_API is not set # CONFIG_AFS_FS is not set CONFIG_9P_FS=m # # Partition Types # CONFIG_PARTITION_ADVANCED=y # CONFIG_ACORN_PARTITION is not set CONFIG_OSF_PARTITION=y CONFIG_AMIGA_PARTITION=y # CONFIG_ATARI_PARTITION is not set CONFIG_MAC_PARTITION=y CONFIG_MSDOS_PARTITION=y CONFIG_BSD_DISKLABEL=y CONFIG_MINIX_SUBPARTITION=y CONFIG_SOLARIS_X86_PARTITION=y CONFIG_UNIXWARE_DISKLABEL=y # CONFIG_LDM_PARTITION is not set CONFIG_SGI_PARTITION=y # CONFIG_ULTRIX_PARTITION is not set CONFIG_SUN_PARTITION=y CONFIG_EFI_PARTITION=y # # Native Language Support # CONFIG_NLS=y CONFIG_NLS_DEFAULT="utf8" CONFIG_NLS_CODEPAGE_437=y CONFIG_NLS_CODEPAGE_737=m CONFIG_NLS_CODEPAGE_775=m CONFIG_NLS_CODEPAGE_850=m CONFIG_NLS_CODEPAGE_852=m CONFIG_NLS_CODEPAGE_855=m CONFIG_NLS_CODEPAGE_857=m CONFIG_NLS_CODEPAGE_860=m CONFIG_NLS_CODEPAGE_861=m CONFIG_NLS_CODEPAGE_862=m CONFIG_NLS_CODEPAGE_863=m CONFIG_NLS_CODEPAGE_864=m CONFIG_NLS_CODEPAGE_865=m CONFIG_NLS_CODEPAGE_866=m CONFIG_NLS_CODEPAGE_869=m CONFIG_NLS_CODEPAGE_936=m CONFIG_NLS_CODEPAGE_950=m CONFIG_NLS_CODEPAGE_932=m CONFIG_NLS_CODEPAGE_949=m CONFIG_NLS_CODEPAGE_874=m CONFIG_NLS_ISO8859_8=m CONFIG_NLS_CODEPAGE_1250=m CONFIG_NLS_CODEPAGE_1251=m CONFIG_NLS_ASCII=y CONFIG_NLS_ISO8859_1=m CONFIG_NLS_ISO8859_2=m CONFIG_NLS_ISO8859_3=m CONFIG_NLS_ISO8859_4=m CONFIG_NLS_ISO8859_5=m CONFIG_NLS_ISO8859_6=m CONFIG_NLS_ISO8859_7=m CONFIG_NLS_ISO8859_9=m CONFIG_NLS_ISO8859_13=m CONFIG_NLS_ISO8859_14=m CONFIG_NLS_ISO8859_15=m CONFIG_NLS_KOI8_R=m CONFIG_NLS_KOI8_U=m CONFIG_NLS_UTF8=m # # 
Library routines # CONFIG_CRC_CCITT=m CONFIG_CRC16=m CONFIG_CRC32=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m CONFIG_TEXTSEARCH=y CONFIG_TEXTSEARCH_KMP=m CONFIG_TEXTSEARCH_BM=m CONFIG_TEXTSEARCH_FSM=m # # Instrumentation Support # CONFIG_PROFILING=y CONFIG_OPROFILE=m # CONFIG_KPROBES is not set # # Kernel hacking # # CONFIG_PRINTK_TIME is not set CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_SCHEDSTATS is not set CONFIG_DEBUG_SLAB=y CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_SPINLOCK_SLEEP=y # CONFIG_DEBUG_KOBJECT is not set CONFIG_DEBUG_INFO=y CONFIG_DEBUG_FS=y # CONFIG_DEBUG_VM is not set # CONFIG_RCU_TORTURE_TEST is not set CONFIG_DEBUG_STACKOVERFLOW=y CONFIG_DEBUG_STACK_USAGE=y CONFIG_DEBUGGER=y CONFIG_XMON=y CONFIG_XMON_DEFAULT=y CONFIG_IRQSTACKS=y CONFIG_BOOTX_TEXT=y # # Security options # CONFIG_KEYS=y CONFIG_KEYS_DEBUG_PROC_KEYS=y CONFIG_SECURITY=y CONFIG_SECURITY_NETWORK=y CONFIG_SECURITY_CAPABILITIES=y # CONFIG_SECURITY_ROOTPLUG is not set # CONFIG_SECURITY_SECLVL is not set CONFIG_SECURITY_SELINUX=y CONFIG_SECURITY_SELINUX_BOOTPARAM=y CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1 CONFIG_SECURITY_SELINUX_DISABLE=y CONFIG_SECURITY_SELINUX_DEVELOP=y CONFIG_SECURITY_SELINUX_AVC_STATS=y CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1 CONFIG_KEYS_COMPAT=y # # Cryptographic options # CONFIG_CRYPTO=y CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_NULL=m CONFIG_CRYPTO_MD4=m CONFIG_CRYPTO_MD5=y CONFIG_CRYPTO_SHA1=y CONFIG_CRYPTO_SHA256=m CONFIG_CRYPTO_SHA512=m CONFIG_CRYPTO_WP512=m CONFIG_CRYPTO_TGR192=m CONFIG_CRYPTO_DES=m CONFIG_CRYPTO_BLOWFISH=m CONFIG_CRYPTO_TWOFISH=m CONFIG_CRYPTO_SERPENT=m CONFIG_CRYPTO_AES=m CONFIG_CRYPTO_CAST5=m CONFIG_CRYPTO_CAST6=m CONFIG_CRYPTO_TEA=m CONFIG_CRYPTO_ARC4=m CONFIG_CRYPTO_KHAZAD=m CONFIG_CRYPTO_ANUBIS=m CONFIG_CRYPTO_DEFLATE=m CONFIG_CRYPTO_MICHAEL_MIC=m CONFIG_CRYPTO_CRC32C=m # CONFIG_CRYPTO_TEST is not set # # Hardware crypto devices # From ahuja at austin.ibm.com Thu 
Feb 16 15:44:02 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Wed, 15 Feb 2006 22:44:02 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <20060214183259.28a6a501.sfr@canb.auug.org.au>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com> <20060214183259.28a6a501.sfr@canb.auug.org.au>
Message-ID: <43F40312.2020800@austin.ibm.com>

Paulus, Stephen:

I have rebuilt this patch with suggestions and against 2.6.16-git3-rc3. This should apply cleanly.

Stephen, answering some of your queries:

>Why not PACA_START_TB and PACA_DELTA_TB? Also, start_tb and delta_tb don't really
>store time base values, but PURR values.
>
>

When I dropped the earlier patch, we were tracking only PURRs, but since the PURR was a function of the timebase in a sense, one of the comments was that I use "tb" instead of "purr". It made sense then, but now with so many things being tracked, I would ideally like to change tb/*_cpu_util to purr, as that adds quite a bit of readability.

>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S
>>===================================================================
>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/entry_64.S 2005-12-18 16:36:54.000000000 -0800
>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S 2006-01-17 15:39:03.000000000 -0800
>>@@ -520,7 +520,19 @@
>> * r13 is our per cpu area, only restore it if we are returning to
>> * userspace
>> */
>>+
>> beq 1f
>>+BEGIN_FTR_SECTION
>>+ li r10,0
>>+ stb r10,PACA_CDFLAG(r13)
>>
>>
>
>cdflag get set here but not set or used anywhere else.
>
>
>

I have a segment of code that uses this functionality. I pulled it out, since somewhere my math wasn't adding up. I left it to be dropped as a patch later. But if you wish, I can take this out now and add it later.
>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
>>===================================================================
>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c 2005-12-18 16:36:54.000000000 -0800
>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c 2006-01-17 21:20:25.000000000 -0800
>>@@ -243,6 +243,7 @@
>> struct thread_struct *new_thread, *old_thread;
>> unsigned long flags;
>> struct task_struct *last;
>>+ struct paca_struct *lpaca;
>>
>>
>
>This could have been declared below (near pd)
>
>
>
Yes... But it seems fine there..

#ifdef CONFIG_SMP
/* avoid complexity of lazy save/restore of fpu
>>@@ -313,19 +314,34 @@
>> new_thread = &new->thread;
>> old_thread = &current->thread;
>>
>>-#ifdef CONFIG_PPC64
>>- /*
>>- * Collect processor utilization data per process
>>- */
>>- if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
>>- struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
>>- long unsigned start_tb, current_tb;
>>- start_tb = old_thread->start_tb;
>>- cu->current_tb = current_tb = mfspr(SPRN_PURR);
>>- old_thread->accum_tb += (current_tb - start_tb);
>>- new_thread->start_tb = current_tb;
>>+
>>+/* Collect cpu_util utilization data per process and per processor wise */
>>+ if (cpu_has_feature(CPU_FTR_PURR)) {
>>+ struct cpu_usage *pd = &__get_cpu_var(cpu_usage_array);
>>
>>
>
>Was there some good reason to change this variable name from cu to pd?
>
>
Not really.. except pd stood for purr data and I liked the abbr more.

>>+ long unsigned start_cpu_util, current_cpu_util;
>>+
>>+ if ( old_thread->start_cpu_util )
>>+ pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
>>+ else
>>+ old_thread->start_cpu_util = pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
>>
>>
>
>Probably better would be:
> pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
> if (old_thread->start_cpu_util == 0)
> old_thread->start_cpu_util = current_cpu_util;
>
>
>
Yeah, that should have been obvious. Changed as requested.
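
For readers following the review, here is a minimal user-space sketch of the
ordering suggested above: sample the PURR once, then fix up a task that has
never been switched in. It is only an illustration under stated assumptions.
read_purr(), fake_purr, struct thread_acct, struct cpu_acct and
account_switch() are hypothetical stand-ins for mfspr(SPRN_PURR),
thread_struct, cpu_usage and the hunk in the context-switch path; none of
them are the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-task fields discussed in the patch. */
struct thread_acct {
	uint64_t start_cpu_util;	/* PURR value when the task was switched in */
	uint64_t total_dp;		/* accumulated PURR delta for the task */
};

/* Hypothetical stand-in for the per-cpu cpu_usage data. */
struct cpu_acct {
	uint64_t current_cpu_util;	/* last PURR value sampled on this cpu */
};

/* Fake PURR the caller advances; the kernel would use mfspr(SPRN_PURR). */
static uint64_t fake_purr;

static uint64_t read_purr(void)
{
	return fake_purr;
}

/* Sample once, initialise a never-run task, then account the delta. */
static void account_switch(struct cpu_acct *pd,
			   struct thread_acct *old_thread,
			   struct thread_acct *new_thread)
{
	uint64_t current_cpu_util;

	pd->current_cpu_util = current_cpu_util = read_purr();
	if (old_thread->start_cpu_util == 0)
		old_thread->start_cpu_util = current_cpu_util;

	old_thread->total_dp += current_cpu_util - old_thread->start_cpu_util;
	new_thread->start_cpu_util = current_cpu_util;
}
```

The single read means both the per-cpu and per-task values come from the same
sample, which is the point of the suggested rewrite.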
>>+ >>+ /* store delta_tb & mftb into cpu_util data array for * >>+ * later easy access otherwise you have to do run_on_cpu * >>+ * which is expensive */ >> >> > >Comment style should be: > > /* store delta_tb & mftb into cpu_util data array for > * later easy access otherwise you have to do run_on_cpu > * which is expensive > */ > > > Changed as requested. >>+ >>+ lpaca = get_paca(); >>+ pd->collected_krntb = lpaca->delta_tb; >>+ pd->collected_timebase = mftb(); >>+ >>+ start_cpu_util = old_thread->start_cpu_util; >>+ old_thread->total_dp += (current_cpu_util - start_cpu_util); >>+ >>+ /* collect time from entry into kernel to now and account it * >>+ * in process kernel time */ >> >> > >Comment style again. > > > Changed as requested. >>+ >>+ old_thread->proc_stime += (current_cpu_util - lpaca->start_tb); >>+ new_thread->start_cpu_util = current_cpu_util; >> } >>-#endif >> >> local_irq_save(flags); >> last = _switch(old_thread, new_thread); >>Index: linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c >>=================================================================== >>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/setup_64.c 2005-12-18 16:36:54.000000000 -0800 >>+++ linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c 2006-02-10 11:51:28.197401840 -0800 >>@@ -851,3 +851,153 @@ >> >> > > > >>+static void collect_cpu_deltas(int cpu) >> >> > > > >>+static void post_cpu_deltas(int cpu) >> >> > >Should those two be #ifdef CONFIG_HOTPLUG_CPU ? > > > Yeah, they should be and are now rightly so. >>+ /* Initialize the global variables to zero */ >>+ offline_cpu_total_tb = 0; >>+ offline_cpu_total_cpu_util = 0; >>+ offline_cpu_total_krncycles = 0; >>+ offline_cpu_total_idle = 0; >> >> > >You don't need to set these to zero explicitly. > > > Ok .. But since they are done.. No harm done.. 
>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c >>=================================================================== >>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/sysfs.c 2005-12-18 16:36:54.000000000 -0800 >>+++ linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c 2006-02-10 12:36:02.375372096 -0800 >>@@ -232,8 +240,11 @@ >> if (cur_cpu_spec->num_pmcs >= 8) >> sysdev_create_file(s, &attr_pmc8); >> >>- if (cpu_has_feature(CPU_FTR_SMT)) >>+ if (cpu_has_feature(CPU_FTR_PURR)) { >> sysdev_create_file(s, &attr_purr); >> >> > >This will mean that the "purr" file doesn't exist in some cases where it >used to (even if it was useless). Not sure if that is a problem for any >user mode utilities. > > > I truly doubt it. But if there is such a utility, then it shouldn't really see purr if its not a power5 system. >>Index: linux-2.6.15-rc6/include/asm-powerpc/processor.h >>=================================================================== >>--- linux-2.6.15-rc6.orig/include/asm-powerpc/processor.h 2005-12-18 16:36:54.000000000 -0800 >>+++ linux-2.6.15-rc6/include/asm-powerpc/processor.h 2006-01-17 21:31:17.000000000 -0800 >>@@ -177,6 +177,9 @@ >> #ifdef CONFIG_PPC64 >> unsigned long start_tb; /* Start purr when proc switched in */ >> unsigned long accum_tb; /* Total accumilated purr for process */ >>+ unsigned long start_cpu_util; /* Start cpu_util when proc switch in */ >>+ unsigned long total_dp ; /* Total delta cpu_util accum for proc */ >>+ unsigned long proc_stime; /* Was pad,Now process cpu_util stime */ >> >> > >total_dp and proc_stime are not used anywhere and start_tb accum_tb are no longer used. > > total_dp & proc_stime are being used. I think I made a mistake and while porting from 2.6.11.8 to 2.6.15, I changed things. I have gone ahead and deleted these values. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: cpu_patch-git3-rc3 Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060215/4944d874/attachment.txt From sharada at in.ibm.com Fri Feb 17 01:15:06 2006 From: sharada at in.ibm.com (R Sharada) Date: Thu, 16 Feb 2006 19:45:06 +0530 Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar Message-ID: <20060216141506.GA5064@in.ibm.com> Hello, The htab-size calculated for kexec/kdump on 2.6 kernels was broken, leading to wrong value being exported to /proc/device-tree Here is a patch fixing it up. This has been tested on 2.6.16-rc2 Thanks and Regards, Sharada We export a value linux,htab-size to /proc/device-tree so that kexec-tools can use that to exclude the htab region when trying to make space for the kexec segments. htab-size was earlier calculated in export_htab_values using ppc64_pft_size. ppc64_pft_size no longer holds a valid size for all machines. So, define a new variable htab_size in hash_utils_64.c which is initialized to the htab size value obtained in htab_initialize. Use this variable to set the htab-size in export_htab_values() Signed-off-by: R Sharada --- diff -puN arch/powerpc/mm/hash_utils_64.c~kdump-save-htab-size arch/powerpc/mm/hash_utils_64.c --- linux-2.6.16-rc2-htab/arch/powerpc/mm/hash_utils_64.c~kdump-save-htab-size 2006-02-16 18:29:54.000000000 +0530 +++ linux-2.6.16-rc2-htab-sharada/arch/powerpc/mm/hash_utils_64.c 2006-02-16 19:21:57.000000000 +0530 @@ -95,6 +95,10 @@ int mmu_virtual_psize = MMU_PAGE_4K; int mmu_huge_psize = MMU_PAGE_16M; unsigned int HPAGE_SHIFT; #endif +#ifdef CONFIG_KEXEC +#define HASH_GROUP_SIZE 0x80 /* size of each hash group, asm/mmu.h */ +unsigned long htab_size; +#endif /* There are definitions of page sizes arrays to be used when none * is provided by the firmware. 
@@ -445,6 +449,9 @@ void __init htab_initialize(void) /* Set SDR1 */ mtspr(SPRN_SDR1, _SDR1); +#ifdef CONFIG_KEXEC + htab_size = (htab_hash_mask + 1) * HASH_GROUP_SIZE; +#endif } mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX; diff -puN arch/powerpc/kernel/machine_kexec_64.c~kdump-save-htab-size arch/powerpc/kernel/machine_kexec_64.c --- linux-2.6.16-rc2-htab/arch/powerpc/kernel/machine_kexec_64.c~kdump-save-htab-size 2006-02-16 18:29:54.000000000 +0530 +++ linux-2.6.16-rc2-htab-sharada/arch/powerpc/kernel/machine_kexec_64.c 2006-02-16 18:53:06.000000000 +0530 @@ -26,8 +26,6 @@ #include #include -#define HASH_GROUP_SIZE 0x80 /* size of each hash group, asm/mmu.h */ - int default_machine_kexec_prepare(struct kimage *image) { int i; @@ -61,7 +59,7 @@ int default_machine_kexec_prepare(struct */ if (htab_address) { low = __pa(htab_address); - high = low + (htab_hash_mask + 1) * HASH_GROUP_SIZE; + high = low + htab_size; for (i = 0; i < image->nr_segments; i++) { begin = image->segment[i].mem; @@ -294,7 +292,7 @@ void default_machine_kexec(struct kimage } /* Values we need to export to the second kernel via the device tree. 
 */
-static unsigned long htab_base, htab_size, kernel_end;
+static unsigned long htab_base, kernel_end;

 static struct property htab_base_prop = {
 .name = "linux,htab-base",
@@ -332,7 +330,6 @@ static void __init export_htab_values(vo
 htab_base = __pa(htab_address);
 prom_add_property(node, &htab_base_prop);

- htab_size = 1UL << ppc64_pft_size;
 prom_add_property(node, &htab_size_prop);

 out:
--- linux-2.6.16-rc2-htab/include/asm-powerpc/mmu.h~kdump-save-htab-size 2006-02-16 18:57:43.000000000 +0530
+++ linux-2.6.16-rc2-htab-sharada/include/asm-powerpc/mmu.h 2006-02-16 19:21:29.000000000 +0530
@@ -113,6 +113,9 @@ typedef struct {

 extern hpte_t *htab_address;
 extern unsigned long htab_hash_mask;
+#ifdef CONFIG_KEXEC
+extern unsigned long htab_size;
+#endif

 /*
 * Page size definition
_

From olof at lixom.net Fri Feb 17 01:26:54 2006
From: olof at lixom.net (Olof Johansson)
Date: Thu, 16 Feb 2006 08:26:54 -0600
Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar
In-Reply-To: <20060216141506.GA5064@in.ibm.com>
References: <20060216141506.GA5064@in.ibm.com>
Message-ID: <20060216142654.GL6291@pb15.lixom.net>

On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote:
> Hello,
> The htab-size calculated for kexec/kdump on 2.6 kernels was broken,
> leading to wrong value being exported to /proc/device-tree
> Here is a patch fixing it up.

Why you don't use htab_hash_mask in kexec instead of introducing a
global variable? I.e. do htab_size = (htab_hash_mask + 1) * HASH_GROUP_SIZE
in export_htab_values()?

Saves a new global and a bunch of #ifdefs.


Thanks,

-Olof

From d.herrendoerfer at de.ibm.com Fri Feb 17 03:45:10 2006
From: d.herrendoerfer at de.ibm.com (Dirk Herrendoerfer)
Date: Thu, 16 Feb 2006 17:45:10 +0100
Subject: [FYI/PATCH] Missing SPUFS context initializer
Message-ID: <22731be5999078b4021bf72c7909d9a4@de.ibm.com>

This patch adds a missing initialization in the spufs context.
Without this patch unmapping of the mfc file will result in a kernel oops.
Index: linux/arch/powerpc/platforms/cell/spufs/context.c =================================================================== --- linux.orig/arch/powerpc/platforms/cell/spufs/context.c +++ linux/arch/powerpc/platforms/cell/spufs/context.c @@ -51,6 +51,7 @@ struct spu_context *alloc_spu_context(vo ctx->ibox_fasync = NULL; ctx->wbox_fasync = NULL; ctx->mfc_fasync = NULL; + ctx->mfc = NULL; ctx->tagwait = 0; ctx->state = SPU_STATE_SAVED; ctx->local_store = NULL; From michael at ellerman.id.au Fri Feb 17 08:56:49 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 17 Feb 2006 08:56:49 +1100 Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar In-Reply-To: <20060216142654.GL6291@pb15.lixom.net> References: <20060216141506.GA5064@in.ibm.com> <20060216142654.GL6291@pb15.lixom.net> Message-ID: <200602170856.52733.michael@ellerman.id.au> On Fri, 17 Feb 2006 01:26, Olof Johansson wrote: > On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote: > > Hello, > > The htab-size calculated for kexec/kdump on 2.6 kernels was broken, > > leading to wrong value being exported to /proc/device-tree > > Here is a patch fixing it up. > > Why you don't use htab_hash_mask in kexec instead of introducing a > global variable? Separation of concerns? We currently do the calculation in the kexec code, but that's gotten out of sync with the htab code, so just do it in one place and export it. > Saves a new global and a bunch of #ifdefs. 
I agree the patch could be a bit simpler, eg (totally untested): Index: to-merge/arch/powerpc/kernel/machine_kexec_64.c =================================================================== --- to-merge.orig/arch/powerpc/kernel/machine_kexec_64.c +++ to-merge/arch/powerpc/kernel/machine_kexec_64.c @@ -26,8 +26,6 @@ #include #include -#define HASH_GROUP_SIZE 0x80 /* size of each hash group, asm/mmu.h */ - int default_machine_kexec_prepare(struct kimage *image) { int i; @@ -61,7 +59,7 @@ int default_machine_kexec_prepare(struct */ if (htab_address) { low = __pa(htab_address); - high = low + (htab_hash_mask + 1) * HASH_GROUP_SIZE; + high = low + htab_size_bytes; for (i = 0; i < image->nr_segments; i++) { begin = image->segment[i].mem; @@ -294,7 +292,7 @@ void default_machine_kexec(struct kimage } /* Values we need to export to the second kernel via the device tree. */ -static unsigned long htab_base, htab_size, kernel_end; +static unsigned long htab_base, kernel_end; static struct property htab_base_prop = { .name = "linux,htab-base", @@ -305,7 +303,7 @@ static struct property htab_base_prop = static struct property htab_size_prop = { .name = "linux,htab-size", .length = sizeof(unsigned long), - .value = (unsigned char *)&htab_size, + .value = (unsigned char *)&htab_size_bytes, }; static struct property kernel_end_prop = { @@ -331,8 +329,6 @@ static void __init export_htab_values(vo htab_base = __pa(htab_address); prom_add_property(node, &htab_base_prop); - - htab_size = 1UL << ppc64_pft_size; prom_add_property(node, &htab_size_prop); out: Index: to-merge/arch/powerpc/mm/hash_utils_64.c =================================================================== --- to-merge.orig/arch/powerpc/mm/hash_utils_64.c +++ to-merge/arch/powerpc/mm/hash_utils_64.c @@ -88,6 +88,7 @@ static unsigned long _SDR1; struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; hpte_t *htab_address; +unsigned long htab_size_bytes; unsigned long htab_hash_mask; int mmu_linear_psize = MMU_PAGE_4K; int 
mmu_virtual_psize = MMU_PAGE_4K; @@ -399,7 +400,7 @@ void create_section_mapping(unsigned lon void __init htab_initialize(void) { - unsigned long table, htab_size_bytes; + unsigned long table; unsigned long pteg_count; unsigned long mode_rw; unsigned long base = 0, size = 0; Index: to-merge/include/asm-powerpc/mmu.h =================================================================== --- to-merge.orig/include/asm-powerpc/mmu.h +++ to-merge/include/asm-powerpc/mmu.h @@ -112,6 +112,7 @@ typedef struct { } hpte_t; extern hpte_t *htab_address; +extern unsigned long htab_size_bytes; extern unsigned long htab_hash_mask; /* -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060217/770342eb/attachment.pgp From olof at lixom.net Fri Feb 17 09:11:42 2006 From: olof at lixom.net (Olof Johansson) Date: Thu, 16 Feb 2006 16:11:42 -0600 Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar In-Reply-To: <200602170856.52733.michael@ellerman.id.au> References: <20060216141506.GA5064@in.ibm.com> <20060216142654.GL6291@pb15.lixom.net> <200602170856.52733.michael@ellerman.id.au> Message-ID: <20060216221142.GA4772@pb15.lixom.net> On Fri, Feb 17, 2006 at 08:56:49AM +1100, Michael Ellerman wrote: > On Fri, 17 Feb 2006 01:26, Olof Johansson wrote: > > On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote: > > > Hello, > > > The htab-size calculated for kexec/kdump on 2.6 kernels was broken, > > > leading to wrong value being exported to /proc/device-tree > > > Here is a patch fixing it up. > > > > Why you don't use htab_hash_mask in kexec instead of introducing a > > global variable? > > Separation of concerns? We currently do the calculation in the kexec code, but > that's gotten out of sync with the htab code, so just do it in one place and > export it. 
Eh, it's not like the hash group size is likely to change anytime soon, given that it's fixed in the architecture. But sure, it avoids exposing knowledge of it to kexec. > > Saves a new global and a bunch of #ifdefs. > > I agree the patch could be a bit simpler, eg (totally untested): That looks considerably better. No more complaints from me. :) -Olof From michael at ellerman.id.au Fri Feb 17 09:24:05 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 17 Feb 2006 09:24:05 +1100 Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar In-Reply-To: <20060216221142.GA4772@pb15.lixom.net> References: <20060216141506.GA5064@in.ibm.com> <200602170856.52733.michael@ellerman.id.au> <20060216221142.GA4772@pb15.lixom.net> Message-ID: <200602170924.09133.michael@ellerman.id.au> On Fri, 17 Feb 2006 09:11, Olof Johansson wrote: > On Fri, Feb 17, 2006 at 08:56:49AM +1100, Michael Ellerman wrote: > > On Fri, 17 Feb 2006 01:26, Olof Johansson wrote: > > > On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote: > > > > Hello, > > > > The htab-size calculated for kexec/kdump on 2.6 kernels was broken, > > > > leading to wrong value being exported to /proc/device-tree > > > > Here is a patch fixing it up. > > > > > > Why you don't use htab_hash_mask in kexec instead of introducing a > > > global variable? > > > > Separation of concerns? We currently do the calculation in the kexec > > code, but that's gotten out of sync with the htab code, so just do it in > > one place and export it. > > Eh, it's not like the hash group size is likely to change anytime soon, > given that it's fixed in the architecture. But sure, it avoids exposing > knowledge of it to kexec. Yeah sure, it's possible we could change the meaning of htab_hash_mask in future, although I agree it's unlikely. > That looks considerably better. No more complaints from me. :) Cool, I'll get it tested and send it up. 
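
As a worked example of the size calculation debated in this thread, the
sketch below computes the hash table size in bytes from the mask. It is a
user-space illustration, not the kernel code: htab_size_from_mask() and
htab_size_from_pft() are hypothetical helper names, and the example mask
value is made up. HASH_GROUP_SIZE (0x80) and the two formulas are the ones
quoted in the patches.

```c
#define HASH_GROUP_SIZE 0x80UL	/* bytes per hash group, as in the patch */

/*
 * htab_hash_mask is (number of hash groups - 1), so add one back
 * before scaling by the group size.
 */
static unsigned long htab_size_from_mask(unsigned long htab_hash_mask)
{
	return (htab_hash_mask + 1) * HASH_GROUP_SIZE;
}

/*
 * The older calculation: only correct when pft_size really holds
 * log2 of the table size, which the original post says is no longer
 * true on all machines.
 */
static unsigned long htab_size_from_pft(unsigned int pft_size)
{
	return 1UL << pft_size;
}
```

With a mask of 0x3fff there are 0x4000 groups of 0x80 bytes each, i.e.
0x200000 bytes (2 MB); the two helpers agree only when pft_size is 21 for
that table.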
cheers

--
Michael Ellerman
IBM OzLabs

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From dwg at au1.ibm.com Thu Feb 16 20:10:27 2006
From: dwg at au1.ibm.com (David Gibson)
Date: Thu, 16 Feb 2006 20:10:27 +1100
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <43F40312.2020800@austin.ibm.com>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com> <20060214183259.28a6a501.sfr@canb.auug.org.au> <43F40312.2020800@austin.ibm.com>
Message-ID: <20060216091027.GA826@localhost.localdomain>

On Wed, Feb 15, 2006 at 10:44:02PM -0600, Manish Ahuja wrote:
[snip]
> >>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
> >>===================================================================
> >>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c 2005-12-18
> >>16:36:54.000000000 -0800
> >>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c 2006-01-17
> >>21:20:25.000000000 -0800
> >>@@ -243,6 +243,7 @@
> >> struct thread_struct *new_thread, *old_thread;
> >> unsigned long flags;
> >> struct task_struct *last;
> >>+ struct paca_struct *lpaca;
> >>
> >>
> >
> >This could have been declared below (near pd)
>
> Yes... But it seems fine there..

Actually, I've been trying to get rid of lpaca locals everywhere.
Using get_paca() directly is barely more verbose, and usually clearer.

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Fri Feb 17 14:54:11 2006 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 17 Feb 2006 14:54:11 +1100 Subject: powerpc: Fix accidentally-working typo in __pud_free_tlb Message-ID: <20060217035411.GB21696@localhost.localdomain> Andrew, Paulus, please apply. One of the parameters to the __pud_free_tlb() macro for powerpc is incorrect (see patch) . We get away with it by accident, because the one place the macro is called, the second parameter is a variable named "pud". Nonetheless, this should be fixed for 2.6.16. Signed-off-by: David Gibson Index: working-2.6/include/asm-powerpc/pgalloc.h =================================================================== --- working-2.6.orig/include/asm-powerpc/pgalloc.h 2006-01-16 13:02:29.000000000 +1100 +++ working-2.6/include/asm-powerpc/pgalloc.h 2006-02-17 14:48:13.000000000 +1100 @@ -146,7 +146,7 @@ extern void pgtable_free_tlb(struct mmu_ pgtable_free_tlb(tlb, pgtable_free_cache(pmd, \ PMD_CACHE_NUM, PMD_TABLE_SIZE-1)) #ifndef CONFIG_PPC_64K_PAGES -#define __pud_free_tlb(tlb, pmd) \ +#define __pud_free_tlb(tlb, pud) \ pgtable_free_tlb(tlb, pgtable_free_cache(pud, \ PUD_CACHE_NUM, PUD_TABLE_SIZE-1)) #endif /* CONFIG_PPC_64K_PAGES */ -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From utz.bacher at de.ibm.com Fri Feb 17 16:36:23 2006 From: utz.bacher at de.ibm.com (Utz Bacher) Date: Fri, 17 Feb 2006 06:36:23 +0100 (CET) Subject: [FYI/PATCH 1/4] add syscall declarations used by spufs Message-ID: This adds some syscall declarations used for spufs that were missing. It applies on 2.6.15 and 2.6.15.4. I didn't see that Arnd posted it already; this is for folks who want to build something right now, while Arnd might revisit this some time when he's back. 
From: Arnd Bergmann Signed-off-by: Utz Bacher Index: linux-2.6.16-rc/include/linux/syscalls.h =================================================================== --- linux-2.6.16-rc.orig/include/linux/syscalls.h +++ linux-2.6.16-rc/include/linux/syscalls.h @@ -511,7 +511,21 @@ asmlinkage long sys_ioprio_set(int which asmlinkage long sys_ioprio_get(int which, int who); asmlinkage long sys_set_mempolicy(int mode, unsigned long __user *nmask, unsigned long maxnode); +asmlinkage long sys_mbind(unsigned long start, unsigned long len, + unsigned long mode, + unsigned long __user *nmask, + unsigned long maxnode, + unsigned flags); +asmlinkage long sys_get_mempolicy(int __user *policy, + unsigned long __user *nmask, + unsigned long maxnode, + unsigned long addr, unsigned long flags); +asmlinkage long sys_inotify_init(void); +asmlinkage long sys_inotify_add_watch(int fd, const char __user *path, + u32 mask); +asmlinkage long sys_inotify_rm_watch(int fd, u32 wd); + asmlinkage long sys_spu_run(int fd, __u32 __user *unpc, __u32 __user *ustatus); asmlinkage long sys_spu_create(const char __user *name, Index: linux-2.6.16-rc/fs/inotify.c =================================================================== --- linux-2.6.16-rc.orig/fs/inotify.c +++ linux-2.6.16-rc/fs/inotify.c @@ -33,6 +33,7 @@ #include #include #include +#include #include From utz.bacher at de.ibm.com Fri Feb 17 16:39:09 2006 From: utz.bacher at de.ibm.com (Utz Bacher) Date: Fri, 17 Feb 2006 06:39:09 +0100 (CET) Subject: [FYI/PATCH 4/4] Idle code for IBM Full System Simulator Message-ID: Improve system simulator idle loop. The patch applies on 2.6.15 and 2.6.15.4. 
Cc: Arnd Bergmann From: Sidney Manning Signed-off-by: Utz Bacher Index: linux/arch/powerpc/kernel/setup_64.c =================================================================== --- linux.orig/arch/powerpc/kernel/setup_64.c +++ linux/arch/powerpc/kernel/setup_64.c @@ -647,10 +647,10 @@ void __init setup_arch(char **cmdline_p) conswitchp = &dummy_con; #endif - ppc_md.setup_arch(); - setup_systemsim_idle(); + ppc_md.setup_arch(); + /* Use the default idle loop if the platform hasn't provided one. */ if (NULL == ppc_md.idle_loop) { ppc_md.idle_loop = default_idle; From utz.bacher at de.ibm.com Fri Feb 17 16:37:23 2006 From: utz.bacher at de.ibm.com (Utz Bacher) Date: Fri, 17 Feb 2006 06:37:23 +0100 (CET) Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator Message-ID: Enable control-C for the system simulator console. This patch applies on 2.6.15 and 2.6.15.4. Cc: Arnd Bergmann From: Sidney Manning Signed-off-by: Utz Bacher Index: linux/drivers/char/tty_io.c =================================================================== --- linux.orig/drivers/char/tty_io.c +++ linux/drivers/char/tty_io.c @@ -1838,7 +1838,9 @@ retry_open: if (driver) { /* Don't let /dev/console block */ filp->f_flags |= O_NONBLOCK; +#ifndef CONFIG_PPC_SYSTEMSIM noctty = 1; +#endif goto got_driver; } up(&tty_sem); From paulus at samba.org Fri Feb 17 19:39:13 2006 From: paulus at samba.org (Paul Mackerras) Date: Fri, 17 Feb 2006 19:39:13 +1100 Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator In-Reply-To: References: Message-ID: <17397.35761.56383.60273@cargo.ozlabs.ibm.com> Utz Bacher writes: > +#ifndef CONFIG_PPC_SYSTEMSIM > noctty = 1; > +#endif Why is this awful hack necessary? Paul. 
From utz.bacher at de.ibm.com Fri Feb 17 16:38:17 2006 From: utz.bacher at de.ibm.com (Utz Bacher) Date: Fri, 17 Feb 2006 06:38:17 +0100 (CET) Subject: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator Message-ID: This patch applies on 2.6.15 and 2.6.15.4 and changes defconfig to more reasonable values and sets flags for cross compiling. The patch is not intended for inclusion, but useful for building stuff for the IBM Full System Simulator. Cc: Arnd Bergmann From: Sidney Manning Signed-off-by: Utz Bacher Index: linux/arch/powerpc/Makefile =================================================================== --- linux.orig/arch/powerpc/Makefile +++ linux/arch/powerpc/Makefile @@ -105,6 +105,11 @@ ifndef CONFIG_FSL_BOOKE CFLAGS += -mstring endif + +ifneq ($(CROSS_COMPILE),) +cpu-as-$(CONFIG_PPC_CELL) += -Wa,-mcellppu +endif + cpu-as-$(CONFIG_PPC64BRIDGE) += -Wa,-mppc64bridge cpu-as-$(CONFIG_4xx) += -Wa,-m405 cpu-as-$(CONFIG_6xx) += -Wa,-maltivec Index: linux/arch/powerpc/configs/cbesim_defconfig =================================================================== --- linux.orig/arch/powerpc/configs/cbesim_defconfig +++ linux/arch/powerpc/configs/cbesim_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit # Linux kernel version: 2.6.15 -# Mon Jan 9 12:38:39 2006 +# Wed Jan 18 12:44:06 2006 # CONFIG_PPC64=y CONFIG_64BIT=y @@ -27,7 +27,6 @@ CONFIG_PPC_FPU=y CONFIG_ALTIVEC=y CONFIG_PPC_STD_MMU=y CONFIG_SMP=y -# CONFIG_BE_DD1 is not set CONFIG_NR_CPUS=4 # @@ -45,8 +44,12 @@ CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_SWAP=y CONFIG_SYSVIPC=y +# CONFIG_POSIX_MQUEUE is not set # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_SYSCTL=y +# CONFIG_AUDIT is not set +CONFIG_HOTPLUG=y +CONFIG_KOBJECT_UEVENT=y # CONFIG_IKCONFIG is not set # CONFIG_CPUSETS is not set CONFIG_INITRAMFS_SOURCE="" @@ -55,7 +58,6 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set 
-CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_BASE_FULL=y @@ -92,11 +94,11 @@ CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y -# CONFIG_DEFAULT_AS is not set +CONFIG_DEFAULT_AS=y # CONFIG_DEFAULT_DEADLINE is not set # CONFIG_DEFAULT_CFQ is not set -CONFIG_DEFAULT_NOOP=y -CONFIG_DEFAULT_IOSCHED="noop" +# CONFIG_DEFAULT_NOOP is not set +CONFIG_DEFAULT_IOSCHED="anticipatory" # # Platform support @@ -110,6 +112,8 @@ CONFIG_PPC_MULTIPLATFORM=y # CONFIG_PPC_MAPLE is not set CONFIG_PPC_CELL=y CONFIG_PPC_OF=y +CONFIG_PPC_SYSTEMSIM=y +CONFIG_SYSTEMSIM_IDLE=y # CONFIG_U3_DART is not set CONFIG_MPIC=y CONFIG_PPC_RTAS=y @@ -127,7 +131,9 @@ CONFIG_CELL_IIC=y # # Cell Broadband Engine options # +# CONFIG_BE_DD2 is not set CONFIG_SPU_FS=m +CONFIG_SPUFS_MMAP=y # # Kernel options @@ -145,7 +151,7 @@ CONFIG_BINFMT_MISC=y CONFIG_FORCE_MAX_ZONEORDER=13 # CONFIG_IOMMU_VMERGE is not set CONFIG_KEXEC=y -# CONFIG_IRQ_ALL_CPUS is not set +CONFIG_IRQ_ALL_CPUS=y # CONFIG_NUMA is not set CONFIG_ARCH_SELECT_MEMORY_MODEL=y CONFIG_ARCH_FLATMEM_ENABLE=y @@ -193,7 +199,168 @@ CONFIG_KERNEL_START=0xc000000000000000 # # Networking # -# CONFIG_NET is not set +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_MMAP is not set +CONFIG_UNIX=y +CONFIG_XFRM=y +# CONFIG_XFRM_USER is not set +# CONFIG_NET_KEY is not set +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is not set +CONFIG_IP_FIB_HASH=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=y +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +CONFIG_SYN_COOKIES=y +# CONFIG_INET_AH is not set +# CONFIG_INET_ESP is not set +# CONFIG_INET_IPCOMP is not set +CONFIG_INET_TUNNEL=y +CONFIG_INET_DIAG=y +CONFIG_INET_TCP_DIAG=y +# CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y + +# +# IP: Virtual Server Configuration +# +# CONFIG_IP_VS is not set +CONFIG_IPV6=y +# CONFIG_IPV6_PRIVACY is not set +CONFIG_INET6_AH=m 
+CONFIG_INET6_ESP=m +CONFIG_INET6_IPCOMP=m +CONFIG_INET6_TUNNEL=m +CONFIG_IPV6_TUNNEL=m +CONFIG_NETFILTER=y +# CONFIG_NETFILTER_DEBUG is not set + +# +# Core Netfilter Configuration +# +# CONFIG_NETFILTER_NETLINK is not set + +# +# IP: Netfilter Configuration +# +CONFIG_IP_NF_CONNTRACK=y +# CONFIG_IP_NF_CT_ACCT is not set +# CONFIG_IP_NF_CONNTRACK_MARK is not set +# CONFIG_IP_NF_CONNTRACK_EVENTS is not set +CONFIG_IP_NF_CT_PROTO_SCTP=y +CONFIG_IP_NF_FTP=m +CONFIG_IP_NF_IRC=m +# CONFIG_IP_NF_NETBIOS_NS is not set +CONFIG_IP_NF_TFTP=m +CONFIG_IP_NF_AMANDA=m +# CONFIG_IP_NF_PPTP is not set +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_LIMIT=m +CONFIG_IP_NF_MATCH_IPRANGE=m +CONFIG_IP_NF_MATCH_MAC=m +CONFIG_IP_NF_MATCH_PKTTYPE=m +CONFIG_IP_NF_MATCH_MARK=m +CONFIG_IP_NF_MATCH_MULTIPORT=m +CONFIG_IP_NF_MATCH_TOS=m +CONFIG_IP_NF_MATCH_RECENT=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_DSCP=m +CONFIG_IP_NF_MATCH_AH_ESP=m +CONFIG_IP_NF_MATCH_LENGTH=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_MATCH_TCPMSS=m +CONFIG_IP_NF_MATCH_HELPER=m +CONFIG_IP_NF_MATCH_STATE=m +CONFIG_IP_NF_MATCH_CONNTRACK=m +CONFIG_IP_NF_MATCH_OWNER=m +CONFIG_IP_NF_MATCH_ADDRTYPE=m +CONFIG_IP_NF_MATCH_REALM=m +CONFIG_IP_NF_MATCH_SCTP=m +# CONFIG_IP_NF_MATCH_DCCP is not set +CONFIG_IP_NF_MATCH_COMMENT=m +CONFIG_IP_NF_MATCH_HASHLIMIT=m +# CONFIG_IP_NF_MATCH_STRING is not set +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_LOG=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_TARGET_TCPMSS=m +# CONFIG_IP_NF_TARGET_NFQUEUE is not set +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_NAT_NEEDED=y +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_SAME=m +CONFIG_IP_NF_NAT_SNMP_BASIC=m +CONFIG_IP_NF_NAT_IRC=m +CONFIG_IP_NF_NAT_FTP=m +CONFIG_IP_NF_NAT_TFTP=m +CONFIG_IP_NF_NAT_AMANDA=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_TOS=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_DSCP=m +CONFIG_IP_NF_TARGET_MARK=m 
+CONFIG_IP_NF_TARGET_CLASSIFY=m +# CONFIG_IP_NF_TARGET_TTL is not set +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_TARGET_NOTRACK=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m + +# +# IPv6: Netfilter Configuration (EXPERIMENTAL) +# +# CONFIG_IP6_NF_QUEUE is not set +# CONFIG_IP6_NF_IPTABLES is not set + +# +# DCCP Configuration (EXPERIMENTAL) +# +# CONFIG_IP_DCCP is not set + +# +# SCTP Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set + +# +# QoS and/or fair queueing +# +# CONFIG_NET_SCHED is not set +CONFIG_NET_CLS_ROUTE=y + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +# CONFIG_HAMRADIO is not set +# CONFIG_IRDA is not set +# CONFIG_BT is not set +# CONFIG_IEEE80211 is not set # # Device Drivers @@ -210,6 +377,7 @@ CONFIG_FW_LOADER=y # # Connector - unified userspace <-> kernelspace linker # +# CONFIG_CONNECTOR is not set # # Memory Technology Devices (MTD) @@ -236,17 +404,73 @@ CONFIG_FW_LOADER=y # CONFIG_BLK_DEV_COW_COMMON is not set CONFIG_BLK_DEV_LOOP=y # CONFIG_BLK_DEV_CRYPTOLOOP is not set +# CONFIG_BLK_DEV_NBD is not set # CONFIG_BLK_DEV_SX8 is not set CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_COUNT=16 CONFIG_BLK_DEV_RAM_SIZE=131072 CONFIG_BLK_DEV_INITRD=y +CONFIG_BLK_DEV_SYSTEMSIM=y # CONFIG_CDROM_PKTCDVD is not set +# CONFIG_ATA_OVER_ETH is not set # # ATA/ATAPI/MFM/RLL support # -# CONFIG_IDE is not set +CONFIG_IDE=y +CONFIG_BLK_DEV_IDE=y + +# +# Please see Documentation/ide.txt for help/info on IDE drives +# +# CONFIG_BLK_DEV_IDE_SATA is not set +CONFIG_BLK_DEV_IDEDISK=y +CONFIG_IDEDISK_MULTI_MODE=y +# CONFIG_BLK_DEV_IDECD is not set +# CONFIG_BLK_DEV_IDETAPE is not set +# 
CONFIG_BLK_DEV_IDEFLOPPY is not set +# CONFIG_IDE_TASK_IOCTL is not set + +# +# IDE chipset support/bugfixes +# +CONFIG_IDE_GENERIC=y +CONFIG_BLK_DEV_IDEPCI=y +CONFIG_IDEPCI_SHARE_IRQ=y +# CONFIG_BLK_DEV_OFFBOARD is not set +CONFIG_BLK_DEV_GENERIC=y +# CONFIG_BLK_DEV_OPTI621 is not set +# CONFIG_BLK_DEV_SL82C105 is not set +CONFIG_BLK_DEV_IDEDMA_PCI=y +# CONFIG_BLK_DEV_IDEDMA_FORCED is not set +CONFIG_IDEDMA_PCI_AUTO=y +# CONFIG_IDEDMA_ONLYDISK is not set +CONFIG_BLK_DEV_AEC62XX=y +# CONFIG_BLK_DEV_ALI15X3 is not set +# CONFIG_BLK_DEV_AMD74XX is not set +# CONFIG_BLK_DEV_CMD64X is not set +# CONFIG_BLK_DEV_TRIFLEX is not set +# CONFIG_BLK_DEV_CY82C693 is not set +# CONFIG_BLK_DEV_CS5520 is not set +# CONFIG_BLK_DEV_CS5530 is not set +# CONFIG_BLK_DEV_HPT34X is not set +# CONFIG_BLK_DEV_HPT366 is not set +# CONFIG_BLK_DEV_SC1200 is not set +# CONFIG_BLK_DEV_PIIX is not set +# CONFIG_BLK_DEV_IT821X is not set +# CONFIG_BLK_DEV_NS87415 is not set +# CONFIG_BLK_DEV_PDC202XX_OLD is not set +# CONFIG_BLK_DEV_PDC202XX_NEW is not set +# CONFIG_BLK_DEV_SVWKS is not set +CONFIG_BLK_DEV_SIIMAGE=y +# CONFIG_BLK_DEV_SLC90E66 is not set +# CONFIG_BLK_DEV_TRM290 is not set +# CONFIG_BLK_DEV_VIA82CXXX is not set +# CONFIG_IDE_ARM is not set +CONFIG_BLK_DEV_IDEDMA=y +# CONFIG_IDEDMA_IVB is not set +CONFIG_IDEDMA_AUTO=y +# CONFIG_BLK_DEV_HD is not set # # SCSI device support @@ -282,12 +506,92 @@ CONFIG_BLK_DEV_INITRD=y # # Network device support # +CONFIG_NETDEVICES=y +# CONFIG_DUMMY is not set +# CONFIG_BONDING is not set +# CONFIG_EQUALIZER is not set +# CONFIG_TUN is not set + +# +# ARCnet devices +# +# CONFIG_ARCNET is not set + +# +# PHY device support +# +# CONFIG_PHYLIB is not set + +# +# Ethernet (10 or 100Mbit) +# +CONFIG_NET_ETHERNET=y +CONFIG_MII=y +# CONFIG_HAPPYMEAL is not set +# CONFIG_SUNGEM is not set +# CONFIG_CASSINI is not set +# CONFIG_NET_VENDOR_3COM is not set + +# +# Tulip family network device support +# +# CONFIG_NET_TULIP is not set +# CONFIG_HP100 is not 
set +CONFIG_SYSTEMSIM_NET=y +# CONFIG_NET_PCI is not set + +# +# Ethernet (1000 Mbit) +# +# CONFIG_ACENIC is not set +# CONFIG_DL2K is not set +CONFIG_E1000=y +# CONFIG_E1000_NAPI is not set +# CONFIG_NS83820 is not set +# CONFIG_HAMACHI is not set +# CONFIG_YELLOWFIN is not set +# CONFIG_R8169 is not set +# CONFIG_SIS190 is not set +# CONFIG_SKGE is not set +# CONFIG_SK98LIN is not set +# CONFIG_TIGON3 is not set +# CONFIG_BNX2 is not set +# CONFIG_MV643XX_ETH is not set + +# +# Ethernet (10000 Mbit) +# +# CONFIG_CHELSIO_T1 is not set +# CONFIG_IXGB is not set +# CONFIG_S2IO is not set + +# +# Token Ring devices +# +# CONFIG_TR is not set + +# +# Wireless LAN (non-hamradio) +# +# CONFIG_NET_RADIO is not set + +# +# Wan interfaces +# +# CONFIG_WAN is not set +# CONFIG_FDDI is not set +# CONFIG_HIPPI is not set +# CONFIG_PPP is not set +# CONFIG_SLIP is not set +# CONFIG_SHAPER is not set +# CONFIG_NETCONSOLE is not set # CONFIG_NETPOLL is not set # CONFIG_NET_POLL_CONTROLLER is not set # # ISDN subsystem # +# CONFIG_ISDN is not set # # Telephony Support @@ -367,7 +671,7 @@ CONFIG_UNIX98_PTYS=y # CONFIG_LEGACY_PTYS is not set CONFIG_HVC_DRIVER=y CONFIG_HVC_FSS=y -# CONFIG_HVC_RTAS is not set +CONFIG_HVC_RTAS=y # # IPMI @@ -414,7 +718,57 @@ CONFIG_WATCHDOG_RTAS=y # # I2C support # -# CONFIG_I2C is not set +CONFIG_I2C=y +# CONFIG_I2C_CHARDEV is not set + +# +# I2C Algorithms +# +CONFIG_I2C_ALGOBIT=y +# CONFIG_I2C_ALGOPCF is not set +# CONFIG_I2C_ALGOPCA is not set + +# +# I2C Hardware Bus support +# +# CONFIG_I2C_ALI1535 is not set +# CONFIG_I2C_ALI1563 is not set +# CONFIG_I2C_ALI15X3 is not set +# CONFIG_I2C_AMD756 is not set +# CONFIG_I2C_AMD8111 is not set +# CONFIG_I2C_I801 is not set +# CONFIG_I2C_I810 is not set +# CONFIG_I2C_PIIX4 is not set +# CONFIG_I2C_NFORCE2 is not set +# CONFIG_I2C_PARPORT_LIGHT is not set +# CONFIG_I2C_PROSAVAGE is not set +# CONFIG_I2C_SAVAGE4 is not set +# CONFIG_SCx200_ACB is not set +# CONFIG_I2C_SIS5595 is not set +# 
CONFIG_I2C_SIS630 is not set +# CONFIG_I2C_SIS96X is not set +# CONFIG_I2C_STUB is not set +# CONFIG_I2C_VIA is not set +# CONFIG_I2C_VIAPRO is not set +# CONFIG_I2C_VOODOO3 is not set +# CONFIG_I2C_PCA_ISA is not set + +# +# Miscellaneous I2C Chip support +# +# CONFIG_SENSORS_DS1337 is not set +# CONFIG_SENSORS_DS1374 is not set +# CONFIG_SENSORS_EEPROM is not set +# CONFIG_SENSORS_PCF8574 is not set +# CONFIG_SENSORS_PCA9539 is not set +# CONFIG_SENSORS_PCF8591 is not set +# CONFIG_SENSORS_RTC8564 is not set +# CONFIG_SENSORS_MAX6875 is not set +# CONFIG_RTC_X1205_I2C is not set +# CONFIG_I2C_DEBUG_CORE is not set +# CONFIG_I2C_DEBUG_ALGO is not set +# CONFIG_I2C_DEBUG_BUS is not set +# CONFIG_I2C_DEBUG_CHIP is not set # # Dallas's 1-wire bus @@ -443,6 +797,7 @@ CONFIG_WATCHDOG_RTAS=y # # Digital Video Broadcasting Devices # +# CONFIG_DVB is not set # # Graphics support @@ -484,7 +839,14 @@ CONFIG_USB_ARCH_HAS_OHCI=y # # InfiniBand support # -# CONFIG_INFINIBAND is not set +CONFIG_INFINIBAND=y +CONFIG_INFINIBAND_USER_MAD=m +CONFIG_INFINIBAND_USER_ACCESS=m +CONFIG_INFINIBAND_MTHCA=m +CONFIG_INFINIBAND_MTHCA_DEBUG=y +CONFIG_INFINIBAND_IPOIB=m +CONFIG_INFINIBAND_IPOIB_DEBUG=y +CONFIG_INFINIBAND_IPOIB_DEBUG_DATA=y # # SN Devices @@ -496,10 +858,16 @@ CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_EXT2_FS=y # CONFIG_EXT2_FS_XATTR is not set # CONFIG_EXT2_FS_XIP is not set -# CONFIG_EXT3_FS is not set +CONFIG_EXT3_FS=y +CONFIG_EXT3_FS_XATTR=y +# CONFIG_EXT3_FS_POSIX_ACL is not set +# CONFIG_EXT3_FS_SECURITY is not set +CONFIG_JBD=y +# CONFIG_JBD_DEBUG is not set +CONFIG_FS_MBCACHE=y # CONFIG_REISERFS_FS is not set # CONFIG_JFS_FS is not set -# CONFIG_FS_POSIX_ACL is not set +CONFIG_FS_POSIX_ACL=y # CONFIG_XFS_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set @@ -513,14 +881,20 @@ CONFIG_DNOTIFY=y # # CD-ROM/DVD Filesystems # -# CONFIG_ISO9660_FS is not set -# CONFIG_UDF_FS is not set +CONFIG_ISO9660_FS=m +CONFIG_JOLIET=y +# CONFIG_ZISOFS is not set 
+CONFIG_UDF_FS=m +CONFIG_UDF_NLS=y # # DOS/FAT/NT Filesystems # -# CONFIG_MSDOS_FS is not set -# CONFIG_VFAT_FS is not set +CONFIG_FAT_FS=m +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_FAT_DEFAULT_CODEPAGE=437 +CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" # CONFIG_NTFS_FS is not set # @@ -534,7 +908,6 @@ CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_RAMFS=y # CONFIG_RELAYFS_FS is not set -# CONFIG_CONFIGFS_FS is not set # # Miscellaneous filesystems @@ -554,6 +927,35 @@ CONFIG_RAMFS=y # CONFIG_UFS_FS is not set # +# Network File Systems +# +CONFIG_NFS_FS=m +CONFIG_NFS_V3=y +CONFIG_NFS_V3_ACL=y +# CONFIG_NFS_V4 is not set +# CONFIG_NFS_DIRECTIO is not set +CONFIG_NFSD=m +CONFIG_NFSD_V2_ACL=y +CONFIG_NFSD_V3=y +CONFIG_NFSD_V3_ACL=y +# CONFIG_NFSD_V4 is not set +CONFIG_NFSD_TCP=y +CONFIG_LOCKD=m +CONFIG_LOCKD_V4=y +CONFIG_EXPORTFS=m +CONFIG_NFS_ACL_SUPPORT=m +CONFIG_NFS_COMMON=y +CONFIG_SUNRPC=m +# CONFIG_RPCSEC_GSS_KRB5 is not set +# CONFIG_RPCSEC_GSS_SPKM3 is not set +# CONFIG_SMB_FS is not set +# CONFIG_CIFS is not set +# CONFIG_NCP_FS is not set +# CONFIG_CODA_FS is not set +# CONFIG_AFS_FS is not set +# CONFIG_9P_FS is not set + +# # Partition Types # CONFIG_PARTITION_ADVANCED=y @@ -576,7 +978,46 @@ CONFIG_EFI_PARTITION=y # # Native Language Support # -# CONFIG_NLS is not set +CONFIG_NLS=m +CONFIG_NLS_DEFAULT="iso8859-1" +# CONFIG_NLS_CODEPAGE_437 is not set +# CONFIG_NLS_CODEPAGE_737 is not set +# CONFIG_NLS_CODEPAGE_775 is not set +# CONFIG_NLS_CODEPAGE_850 is not set +# CONFIG_NLS_CODEPAGE_852 is not set +# CONFIG_NLS_CODEPAGE_855 is not set +# CONFIG_NLS_CODEPAGE_857 is not set +# CONFIG_NLS_CODEPAGE_860 is not set +# CONFIG_NLS_CODEPAGE_861 is not set +# CONFIG_NLS_CODEPAGE_862 is not set +# CONFIG_NLS_CODEPAGE_863 is not set +# CONFIG_NLS_CODEPAGE_864 is not set +# CONFIG_NLS_CODEPAGE_865 is not set +# CONFIG_NLS_CODEPAGE_866 is not set +# CONFIG_NLS_CODEPAGE_869 is not set +# CONFIG_NLS_CODEPAGE_936 is not set +# CONFIG_NLS_CODEPAGE_950 is not set +# 
CONFIG_NLS_CODEPAGE_932 is not set +# CONFIG_NLS_CODEPAGE_949 is not set +# CONFIG_NLS_CODEPAGE_874 is not set +# CONFIG_NLS_ISO8859_8 is not set +# CONFIG_NLS_CODEPAGE_1250 is not set +# CONFIG_NLS_CODEPAGE_1251 is not set +# CONFIG_NLS_ASCII is not set +CONFIG_NLS_ISO8859_1=m +CONFIG_NLS_ISO8859_2=m +CONFIG_NLS_ISO8859_3=m +CONFIG_NLS_ISO8859_4=m +CONFIG_NLS_ISO8859_5=m +CONFIG_NLS_ISO8859_6=m +CONFIG_NLS_ISO8859_7=m +CONFIG_NLS_ISO8859_9=m +CONFIG_NLS_ISO8859_13=m +CONFIG_NLS_ISO8859_14=m +CONFIG_NLS_ISO8859_15=m +# CONFIG_NLS_KOI8_R is not set +# CONFIG_NLS_KOI8_U is not set +# CONFIG_NLS_UTF8 is not set # # Library routines @@ -614,6 +1055,7 @@ CONFIG_DEBUG_FS=y # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_DEBUG_STACK_USAGE is not set # CONFIG_DEBUGGER is not set +# CONFIG_XMON is not set CONFIG_IRQSTACKS=y # CONFIG_BOOTX_TEXT is not set From segher at kernel.crashing.org Fri Feb 17 21:25:42 2006 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Fri, 17 Feb 2006 11:25:42 +0100 Subject: [PATCH] Fix some MPIC + HT APIC buglets Message-ID: <4df057680bcb35ef633804504735670c@kernel.crashing.org> Do disable, not enable, the HT APIC IRQ in the function that is supposed to. Enable the MPIC IRQ before enabling the downstream APIC IRQ, avoids potentially losing an interrupt. 
Signed-off-by: Segher Boessenkool --- Index: linux/arch/powerpc/sysdev/mpic.c =================================================================== --- linux.orig/arch/powerpc/sysdev/mpic.c +++ linux/arch/powerpc/sysdev/mpic.c @@ -234,7 +234,7 @@ spin_lock_irqsave(&mpic->fixup_lock, flags); writeb(0x10 + 2 * fixup->index, fixup->base + 2); tmp = readl(fixup->base + 4); - tmp &= ~1U; + tmp |= 1; writel(tmp, fixup->base + 4); spin_unlock_irqrestore(&mpic->fixup_lock, flags); } @@ -446,14 +446,15 @@ #ifdef CONFIG_MPIC_BROKEN_U3 struct mpic *mpic = mpic_from_irq(irq); unsigned int src = irq - mpic->irq_offset; +#endif /* CONFIG_MPIC_BROKEN_U3 */ + + mpic_enable_irq(irq); +#ifdef CONFIG_MPIC_BROKEN_U3 if (mpic_is_ht_interrupt(mpic, src)) mpic_startup_ht_interrupt(mpic, src, irq_desc[irq].status); - #endif /* CONFIG_MPIC_BROKEN_U3 */ - mpic_enable_irq(irq); - return 0; } -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-mpic-fixes Type: application/octet-stream Size: 1116 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060217/33614aca/attachment.obj From segher at kernel.crashing.org Fri Feb 17 21:30:30 2006 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Fri, 17 Feb 2006 11:30:30 +0100 Subject: [PATCH] Don't re-assign PCI resources on Maple Message-ID: <7b18da50a94a66a434941526bfc5fbfd@kernel.crashing.org> Maple firmware does not need PCI resource allocation, and in fact, it can cause problems in some strange cases. Signed-off-by: Segher Boessenkool --- Index: linux/arch/powerpc/platforms/maple/pci.c =================================================================== --- linux.orig/arch/powerpc/platforms/maple/pci.c +++ linux/arch/powerpc/platforms/maple/pci.c @@ -435,8 +435,8 @@ PCI_DN(np)->busno = 0xf0; } - /* Tell pci.c to use the common resource allocation mecanism */ - pci_probe_only = 0; + /* Tell pci.c to not change any resource allocations. 
*/ + pci_probe_only = 1; /* Allow all IO */ io_page_mask = -1; -------------- next part -------------- A non-text attachment was scrubbed... Name: patch-maple-pci-setup Type: application/octet-stream Size: 663 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060217/1acd9b58/attachment.obj From hch at lst.de Sat Feb 18 05:32:54 2006 From: hch at lst.de (Christoph Hellwig) Date: Fri, 17 Feb 2006 19:32:54 +0100 Subject: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator In-Reply-To: References: Message-ID: <20060217183254.GA3951@lst.de> > + > +ifneq ($(CROSS_COMPILE),) > +cpu-as-$(CONFIG_PPC_CELL) += -Wa,-mcellppu > +endif the CROSS_COMPILE setting is wrong. cross-compilation should not affect selection of assembler flags. > + > cpu-as-$(CONFIG_PPC64BRIDGE) += -Wa,-mppc64bridge > cpu-as-$(CONFIG_4xx) += -Wa,-m405 > cpu-as-$(CONFIG_6xx) += -Wa,-maltivec From rolandd at cisco.com Sat Feb 18 11:55:32 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:55:32 -0800 Subject: [PATCH 00/22] [RFC] IBM eHCA InfiniBand adapter driver Message-ID: <20060218005532.13620.79663.stgit@localhost.localdomain> Here's a series of patches that add an InfiniBand adapter driver for IBM eHCA hardware. Please look it over with an eye towards issues that need to be addressed before merging this upstream. This patch series is somewhat unusual in that I am not the original author of this driver -- I am just sending it for review for the authors, who are apparently not able to post patches themselves due to internal issues at IBM. However they are cc'ed and will respond to comments in this thread. In fact I have some issues with the code myself that need to be addressed before this driver is mergeable. I've included most of them in the individual patches, although I have some general comments too. However I would like to get some early feedback for the ehca authors from the wider community. 
In particular I think it's important to run this past the ppc64 experts, since I'm not sure what the standards for this sort of pSeries driver are.

Anyway, my general comments:

- The #ifs that test EHCA_USERDRIVER and __KERNEL__ should be killed. We know that this is kernel code, so there's no reason to include userspace compatibility junk.
- Many of the comments look like they are for some automatic documentation system that is not quite kerneldoc. They should be fixed to be real kerneldoc comments.
- In general there is a huge amount of code in large inline functions in .h files. Things should be reorganized to cut this down to a sane amount.

Thanks,
  Roland

From rolandd at cisco.com Sat Feb 18 11:57:10 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:10 -0800
Subject: [PATCH 03/22] pHype specific stuff
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005709.13620.77409.stgit@localhost.localdomain>

From: Roland Dreier

It's not clear what the connection between hcp_phyp.c and hcp_phyp.h really is -- they don't seem to be very closely related. Again, hcp_phyp.h has some rather large functions that belong in a .c file and maybe shouldn't be inlined (although maybe the generated assembly ends up being small because it's just fiddling registers around).

For a change, hipz_galpa_load() and hipz_galpa_store() actually look simple enough that they could probably become inline functions in a header (and just kill hcp_phyp.c). This would also make the comments about them being inline in ehca_galpa.h true.

Is ehca_galpa.h needed at all, or can it be folded into another file? Why is its abstraction needed?
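As a rough illustration of the inline-accessor idea raised above for hipz_galpa_load() and hipz_galpa_store(), here is a minimal userspace sketch. The names galpa_load() and galpa_store() are hypothetical, the struct is a stand-in for the patch's struct h_galpa, and the EDEB tracing is left out. A real driver would presumably go through the kernel's I/O accessors rather than a plain pointer dereference; the volatile cast here only keeps the compiler from eliding the access.

```c
#include <stdint.h>

/* Stand-in for the patch's struct h_galpa (assumption: only fw_handle matters here). */
struct h_galpa {
	uint64_t fw_handle;	/* address where the register page is mapped */
};

/* Hypothetical inline replacement for hipz_galpa_load(): read a 64-bit register. */
static inline uint64_t galpa_load(struct h_galpa galpa, uint32_t offset)
{
	volatile uint64_t *reg =
		(volatile uint64_t *)(uintptr_t)(galpa.fw_handle + offset);

	return *reg;
}

/* Hypothetical inline replacement for hipz_galpa_store(): write a 64-bit register. */
static inline void galpa_store(struct h_galpa galpa, uint32_t offset,
			       uint64_t value)
{
	volatile uint64_t *reg =
		(volatile uint64_t *)(uintptr_t)(galpa.fw_handle + offset);

	*reg = value;
}
```

With both helpers this small, keeping them as static inline functions in one header would remove the need for a separate hcp_phyp.c, which is the point of the comment above.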
--- drivers/infiniband/hw/ehca/ehca_galpa.h | 74 +++++++ drivers/infiniband/hw/ehca/hcp_phyp.c | 81 +++++++ drivers/infiniband/hw/ehca/hcp_phyp.h | 338 +++++++++++++++++++++++++++++++ 3 files changed, 493 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_galpa.h b/drivers/infiniband/hw/ehca/ehca_galpa.h new file mode 100644 index 0000000..d64115c --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_galpa.h @@ -0,0 +1,74 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * pSeries interface definitions + * + * Authors: Waleri Fomin + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_galpa.h,v 1.6 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __EHCA_GALPA_H__ +#define __EHCA_GALPA_H__ + +/* eHCA page (mapped into p-memory) + resource to access eHCA register pages in CPU address space +*/ +struct h_galpa { + u64 fw_handle; + /* for pSeries this is a 64bit memory address where + I/O memory is mapped into CPU address space (kv) */ +}; + +/** + resource to access eHCA address space registers, all types +*/ +struct h_galpas { + u32 pid; /*PID of userspace galpa checking */ + struct h_galpa user; /* user space accessible resource, + set to 0 if unused */ + struct h_galpa kernel; /* kernel space accessible resource, + set to 0 if unused */ +}; +/** @brief store value at offset into galpa, will be inline function + */ +void hipz_galpa_store(struct h_galpa galpa, u32 offset, u64 value); + +/** @brief return value from offset in galpa, will be inline function + */ +u64 hipz_galpa_load(struct h_galpa galpa, u32 offset); + +#endif /* __EHCA_GALPA_H__ */ diff --git a/drivers/infiniband/hw/ehca/hcp_phyp.c b/drivers/infiniband/hw/ehca/hcp_phyp.c new file mode 100644 index 0000000..129e61b --- /dev/null +++ b/drivers/infiniband/hw/ehca/hcp_phyp.c @@ -0,0 +1,81 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * load store abstraction for ehca register access + * + * Authors: Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. 
+ * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: hcp_phyp.c,v 1.10 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#define DEB_PREFIX "PHYP" + +#ifdef __KERNEL__ +#include "ehca_kernel.h" +#include "hipz_hw.h" +/* #include "hipz_structs.h" */ +/* TODO: still necessary */ +#include "ehca_classes.h" +#else /* !__KERNEL__ */ +#include "ehca_utools.h" +#include "ehca_galpa.h" +#endif + +#ifndef EHCA_USERDRIVER /* TODO: is this correct */ + +u64 hipz_galpa_load(struct h_galpa galpa, u32 offset) +{ + u64 addr = galpa.fw_handle + offset; + u64 out; + EDEB_EN(7, "addr=%lx offset=%x ", addr, offset); + out = *(u64 *) addr; + EDEB_EX(7, "addr=%lx value=%lx", addr, out); + return out; +}; + +void hipz_galpa_store(struct h_galpa galpa, u32 offset, u64 value) +{ + u64 addr = galpa.fw_handle + offset; + EDEB(7, "addr=%lx offset=%x value=%lx", addr, + offset, value); + *(u64 *) addr = value; +#ifdef EHCA_USE_HCALL + /* hipz_galpa_load(galpa, offset); */ + /* synchronize explicitly */ +#endif +}; + +#endif /* EHCA_USERDRIVER */ diff --git a/drivers/infiniband/hw/ehca/hcp_phyp.h b/drivers/infiniband/hw/ehca/hcp_phyp.h new file mode 100644 index 0000000..c82fb4b --- /dev/null +++ b/drivers/infiniband/hw/ehca/hcp_phyp.h @@ -0,0 +1,338 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Firmware calls + * + * Authors: Christoph Raisch + * Waleri Fomin + * Gerd Bayer + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. 
+ * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: hcp_phyp.h,v 1.16 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __HCP_PHYP_H__ +#define __HCP_PHYP_H__ + +#ifndef EHCA_USERDRIVER +inline static int hcall_map_page(u64 physaddr, u64 * mapaddr) +{ + *mapaddr = (u64)(ioremap(physaddr, 4096)); + + EDEB(7, "ioremap physaddr=%lx mapaddr=%lx", physaddr, *mapaddr); + return 0; +} + +inline static int hcall_unmap_page(u64 mapaddr) +{ + EDEB(7, "mapaddr=%lx", mapaddr); + iounmap((void *)(mapaddr)); + return 0; +} +#else +int hcall_map_page(u64 physaddr, u64 * mapaddr); +int hcall_unmap_page(u64 mapaddr); +#endif + +struct hcall { + u64 regs[11]; +}; + +/** + * @brief returns time to wait in secs for the given long busy error code + */ +inline static u32 getLongBusyTimeSecs(int longBusyRetCode) +{ + switch (longBusyRetCode) { + case H_LongBusyOrder1msec: + return 1; + case H_LongBusyOrder10msec: + return 10; + case H_LongBusyOrder100msec: + return 100; + case H_LongBusyOrder1sec: + return 1000; + case H_LongBusyOrder10sec: + return 10000; + case H_LongBusyOrder100sec: + return 100000; + default: + return 1; + } /* eof switch */ +} + +inline static long plpar_hcall_7arg_7ret(unsigned long opcode, + unsigned long arg1, /* References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005704.13620.88286.stgit@localhost.localdomain> From: Roland Dreier This is horribly non-portable. How much of a performance difference does it make? How does it do on ppc64 systems where the cacheline size is not 32? 
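To make the portability concern above concrete: dcbz zeroes the whole cache block containing the given address, so the number of bytes cleared depends on the CPU's block size rather than on anything the caller states. Below is a minimal sketch of a size-explicit fallback. The helper name clear_cacheline_portable() and its line_size parameter are assumptions for illustration, not part of the patch, and it uses memset() instead of dcbz, so it says nothing about the performance question.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: zero the line_size-byte block that contains adr.
 * line_size must be a power of two. Returns the start of the cleared block.
 */
static inline void *clear_cacheline_portable(void *adr, size_t line_size)
{
	/* Round the address down to the start of its block. */
	uintptr_t start = (uintptr_t)adr & ~((uintptr_t)line_size - 1);

	memset((void *)start, 0, line_size);
	return (void *)start;
}
```

A caller that sized its buffers for 32-byte lines would see 128 bytes cleared when line_size is 128, which is the kind of surprise the question above is pointing at.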
--- drivers/infiniband/hw/ehca/ehca_asm.h | 58 +++++++++++++++++++++++++++++++++ 1 files changed, 58 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_asm.h b/drivers/infiniband/hw/ehca/ehca_asm.h new file mode 100644 index 0000000..6a09ac5 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_asm.h @@ -0,0 +1,58 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Some helper macros with assembler instructions + * + * Authors: Khadija Souissi + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_asm.h,v 1.7 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#ifndef __EHCA_ASM_H__ +#define __EHCA_ASM_H__ + +#if defined(CONFIG_PPC_PSERIES) || defined (__PPC64__) || defined (__PPC__) + +#define clear_cacheline(adr) __asm__ __volatile("dcbz 0,%0"::"r"(adr)) + +#elif defined(CONFIG_ARCH_S390) +#error "unsupported yet" +#else +#error "invalid platform" +#endif + +#endif /* __EHCA_ASM_H__ */ From rolandd at cisco.com Sat Feb 18 11:57:07 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:07 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005707.13620.20538.stgit@localhost.localdomain> From: Roland Dreier This is a very large file with way too much code for a .h file. The functions look too big to be inlined also. Is there any way for this code to move to a .c file? 
--- drivers/infiniband/hw/ehca/hcp_if.h | 2022 +++++++++++++++++++++++++++++++++++ 1 files changed, 2022 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/hcp_if.h b/drivers/infiniband/hw/ehca/hcp_if.h new file mode 100644 index 0000000..70bf77f --- /dev/null +++ b/drivers/infiniband/hw/ehca/hcp_if.h @@ -0,0 +1,2022 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Firmware Infiniband Interface code for POWER + * + * Authors: Gerd Bayer + * Christoph Raisch + * Waleri Fomin + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: hcp_if.h,v 1.62 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __HCP_IF_H__ +#define __HCP_IF_H__ + +#include "ehca_tools.h" +#include "hipz_structs.h" +#include "ehca_classes.h" + +#ifndef EHCA_USE_HCALL +#include "hcz_queue.h" +#include "hcz_mrmw.h" +#include "hcz_emmio.h" +#include "sim_prom.h" +#endif +#include "hipz_fns.h" +#include "hcp_sense.h" +#include "ehca_irq.h" + +#ifndef CONFIG_PPC64 +#ifndef Z_SERIES +#warning "included with wrong target, this is a p file" +#endif +#endif + +#ifdef EHCA_USE_HCALL + +#ifndef EHCA_USERDRIVER +#include "hcp_phyp.h" +#else +#include "testbench/hcallbridge.h" +#endif +#endif + +inline static int hcp_galpas_ctor(struct h_galpas *galpas, + u64 paddr_kernel, u64 paddr_user) +{ + int rc = 0; + + rc = hcall_map_page(paddr_kernel, &galpas->kernel.fw_handle); + if (rc != 0) + return (rc); + + galpas->user.fw_handle = paddr_user; + + EDEB(7, "paddr_kernel=%lx paddr_user=%lx galpas->kernel=%lx" + " galpas->user=%lx", + paddr_kernel, paddr_user, galpas->kernel.fw_handle, + galpas->user.fw_handle); + + return (rc); +} + +inline static int hcp_galpas_dtor(struct h_galpas *galpas) +{ + int rc = 0; + + if (galpas->kernel.fw_handle != 0) + rc = hcall_unmap_page(galpas->kernel.fw_handle); + + if (rc != 0) + return (rc); + + galpas->user.fw_handle = galpas->kernel.fw_handle = 0; + + return rc; +} + +/** + * hipz_h_alloc_resource_eq - Allocate EQ resources in HW and FW, initalize + * 
resources, create the empty EQPT (ring). + * + * @eq_handle: eq handle for this queue + * @act_nr_of_entries: actual number of queue entries + * @act_pages: actual number of queue pages + * @eq_ist: used by hcp_H_XIRR() call + */ +inline static u64 hipz_h_alloc_resource_eq(const struct + ipz_adapter_handle + hcp_adapter_handle, + struct ehca_pfeq *pfeq, + const u32 neq_control, + const u32 + number_of_entries, + struct ipz_eq_handle + *eq_handle, + u32 * act_nr_of_entries, + u32 * act_pages, + u32 * eq_ist) +{ + u64 retcode; + u64 dummy; + u64 act_nr_of_entries_out = 0; + u64 act_pages_out = 0; + u64 eq_ist_out = 0; + u64 allocate_controls = 0; + u32 x = (u64)(&x); + + EDEB_EN(7, "pfeq=%p hcp_adapter_handle=%lx new_control=%x" + " number_of_entries=%x", + pfeq, hcp_adapter_handle.handle, neq_control, + number_of_entries); + +#ifndef EHCA_USE_HCALL + retcode = simp_h_alloc_resource_eq(hcp_adapter_handle, pfeq, + neq_control, + number_of_entries, + eq_handle, + act_nr_of_entries, + act_pages, eq_ist); +#else + + /* resource type */ + allocate_controls = 3ULL; + + /* ISN is associated */ + if (neq_control != 1) { + allocate_controls = (1ULL << (63 - 7)) | allocate_controls; + } + + /* notification event queue */ + if (neq_control == 1) { + allocate_controls = (1ULL << 63) | allocate_controls; + } + + retcode = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + allocate_controls, /* r5 */ + number_of_entries, /* r6 */ + 0, 0, 0, 0, + &eq_handle->handle, /* r4 */ + &dummy, /* r5 */ + &dummy, /* r6 */ + &act_nr_of_entries_out, /* r7 */ + &act_pages_out, /* r8 */ + &eq_ist_out, /* r8 */ + &dummy); + + *act_nr_of_entries = (u32) act_nr_of_entries_out; + *act_pages = (u32) act_pages_out; + *eq_ist = (u32) eq_ist_out; + +#endif /* EHCA_USE_HCALL */ + + if (retcode == H_NOT_ENOUGH_RESOURCES) { + EDEB_ERR(4, "Not enough resource - retcode=%lx ", retcode); + } + + EDEB_EX(7, "act_nr_of_entries=%x act_pages=%x eq_ist=%x", + *act_nr_of_entries, 
*act_pages, *eq_ist); + + return retcode; +} + +static inline u64 hipz_h_reset_event(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ipz_eq_handle eq_handle, + const u64 event_mask) +{ + u64 retcode = 0; + u64 dummy; + + EDEB_EN(7, "eq_handle=%lx, adapter_handle=%lx event_mask=%lx", + eq_handle.handle, hcp_adapter_handle.handle, event_mask); + +#ifndef EHCA_USE_HCALL + /* TODO: Not implemented yet */ +#else + + retcode = plpar_hcall_7arg_7ret(H_RESET_EVENTS, + hcp_adapter_handle.handle, /* r4 */ + eq_handle.handle, /* r5 */ + event_mask, /* r6 */ + 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif + EDEB(7, "retcode=%lx", retcode); + + return retcode; +} + +/** + * hipz_h_alloc_resource_cq - Allocate CQ resources in HW and FW, initialize + * resources, create the empty CQPT (ring). + * + * @eq_handle: eq handle to use for this cq + * @cq_handle: cq handle for this queue + * @act_nr_of_entries: actual number of queue entries + * @act_pages: actual number of queue pages + * @galpas: contains logical address of priv. 
storage and + * log_user_storage + */ +static inline u64 hipz_h_alloc_resource_cq(const struct + ipz_adapter_handle + hcp_adapter_handle, + struct ehca_pfcq *pfcq, + const struct ipz_eq_handle + eq_handle, + const u32 cq_token, + const u32 + number_of_entries, + struct ipz_cq_handle + *cq_handle, + u32 * act_nr_of_entries, + u32 * act_pages, + struct h_galpas *galpas) +{ + u64 retcode = 0; + u64 dummy; + u64 act_nr_of_entries_out; + u64 act_pages_out; + u64 g_la_privileged_out; + u64 g_la_user_out; + /* stack location is a unique identifier for a process from beginning + * to end of this frame */ + u32 x = (u64)(&x); + + EDEB_EN(7, "pfcq=%p hcp_adapter_handle=%lx eq_handle=%lx cq_token=%x" + " number_of_entries=%x", + pfcq, hcp_adapter_handle.handle, eq_handle.handle, + cq_token, number_of_entries); + +#ifndef EHCA_USE_HCALL + retcode = simp_h_alloc_resource_cq(hcp_adapter_handle, + pfcq, + eq_handle, + cq_token, + number_of_entries, + cq_handle, + act_nr_of_entries, + act_pages, galpas); +#else + retcode = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + 2, /* r5 */ + eq_handle.handle, /* r6 */ + cq_token, /* r7 */ + number_of_entries, /* r8 */ + 0, 0, + &cq_handle->handle, /* r4 */ + &dummy, /* r5 */ + &dummy, /* r6 */ + &act_nr_of_entries_out, /* r7 */ + &act_pages_out, /* r8 */ + &g_la_privileged_out, /* r9 */ + &g_la_user_out); /* r10 */ + + *act_nr_of_entries = (u32) act_nr_of_entries_out; + *act_pages = (u32) act_pages_out; + + if (retcode == 0) { + hcp_galpas_ctor(galpas, g_la_privileged_out, g_la_user_out); + } +#endif /* EHCA_US_HCALL */ + + if (retcode == H_NOT_ENOUGH_RESOURCES) { + EDEB_ERR(4, "Not enough resources. 
retcode=%lx", retcode); + } + + EDEB_EX(7, "cq_handle=%lx act_nr_of_entries=%x act_pages=%x", + cq_handle->handle, *act_nr_of_entries, *act_pages); + + return retcode; +} + +#define H_ALL_RES_QP_Enhanced_QP_Operations EHCA_BMASK_IBM(9,11) +#define H_ALL_RES_QP_QP_PTE_Pin EHCA_BMASK_IBM(12,12) +#define H_ALL_RES_QP_Service_Type EHCA_BMASK_IBM(13,15) +#define H_ALL_RES_QP_LL_RQ_CQE_Posting EHCA_BMASK_IBM(18,18) +#define H_ALL_RES_QP_LL_SQ_CQE_Posting EHCA_BMASK_IBM(19,21) +#define H_ALL_RES_QP_Signalling_Type EHCA_BMASK_IBM(22,23) +#define H_ALL_RES_QP_UD_Address_Vector_L_Key_Control EHCA_BMASK_IBM(31,31) +#define H_ALL_RES_QP_Resource_Type EHCA_BMASK_IBM(56,63) + +#define H_ALL_RES_QP_Max_Outstanding_Send_Work_Requests EHCA_BMASK_IBM(0,15) +#define H_ALL_RES_QP_Max_Outstanding_Receive_Work_Requests EHCA_BMASK_IBM(16,31) +#define H_ALL_RES_QP_Max_Send_SG_Elements EHCA_BMASK_IBM(32,39) +#define H_ALL_RES_QP_Max_Receive_SG_Elements EHCA_BMASK_IBM(40,47) + +#define H_ALL_RES_QP_Act_Outstanding_Send_Work_Requests EHCA_BMASK_IBM(16,31) +#define H_ALL_RES_QP_Act_Outstanding_Receive_Work_Requests EHCA_BMASK_IBM(48,63) +#define H_ALL_RES_QP_Act_Send_SG_Elements EHCA_BMASK_IBM(8,15) +#define H_ALL_RES_QP_Act_Receeive_SG_Elements EHCA_BMASK_IBM(24,31) + +#define H_ALL_RES_QP_Send_Queue_Size_pages EHCA_BMASK_IBM(0,31) +#define H_ALL_RES_QP_Receive_Queue_Size_pages EHCA_BMASK_IBM(32,63) + +/* direct access qp controls */ +#define DAQP_CTRL_ENABLE 0x01 +#define DAQP_CTRL_SEND_COMPLETION 0x20 +#define DAQP_CTRL_RECV_COMPLETION 0x40 + +/** + * hipz_h_alloc_resource_qp - Allocate QP resources in HW and FW, + * initialize resources, create empty QPPTs (2 rings). 
+ * + * @h_galpas to access HCA resident QP attributes + */ +static inline u64 hipz_h_alloc_resource_qp(const struct + ipz_adapter_handle + adapter_handle, + struct ehca_pfqp *pfqp, + const u8 servicetype, + const u8 daqp_ctrl, + const u8 signalingtype, + const u8 ud_av_l_key_ctl, + const struct ipz_cq_handle send_cq_handle, + const struct ipz_cq_handle receive_cq_handle, + const struct ipz_eq_handle async_eq_handle, + const u32 qp_token, + const struct ipz_pd pd, + const u16 max_nr_send_wqes, + const u16 max_nr_receive_wqes, + const u8 max_nr_send_sges, + const u8 max_nr_receive_sges, + const u32 ud_av_l_key, + struct ipz_qp_handle *qp_handle, + u32 * qp_nr, + u16 * act_nr_send_wqes, + u16 * act_nr_receive_wqes, + u8 * act_nr_send_sges, + u8 * act_nr_receive_sges, + u32 * nr_sq_pages, + u32 * nr_rq_pages, + struct h_galpas *h_galpas) +{ + u64 retcode = H_Success; + u64 allocate_controls; + u64 max_r10_reg; + u64 dummy = 0; + u64 qp_nr_out = 0; + u64 r6_out = 0; + u64 r7_out = 0; + u64 r8_out = 0; + u64 g_la_user_out = 0; + u64 r11_out = 0; + + EDEB_EN(7, "pfqp=%p adapter_handle=%lx servicetype=%x signalingtype=%x" + " ud_av_l_key=%x send_cq_handle=%lx receive_cq_handle=%lx" + " async_eq_handle=%lx qp_token=%x pd=%x max_nr_send_wqes=%x" + " max_nr_receive_wqes=%x max_nr_send_sges=%x" + " max_nr_receive_sges=%x ud_av_l_key=%x galpa.pid=%x", + pfqp, adapter_handle.handle, servicetype, signalingtype, + ud_av_l_key, send_cq_handle.handle, + receive_cq_handle.handle, async_eq_handle.handle, qp_token, + pd.value, max_nr_send_wqes, max_nr_receive_wqes, + max_nr_send_sges, max_nr_receive_sges, ud_av_l_key, + h_galpas->pid); + +#ifndef EHCA_USE_HCALL + retcode = simp_h_alloc_resource_qp(adapter_handle, + pfqp, + servicetype, + signalingtype, + ud_av_l_key_ctl, + send_cq_handle, + receive_cq_handle, + async_eq_handle, + qp_token, + pd, + max_nr_send_wqes, + max_nr_receive_wqes, + max_nr_send_sges, + max_nr_receive_sges, + ud_av_l_key, + qp_handle, + qp_nr, + 
act_nr_send_wqes, + act_nr_receive_wqes, + act_nr_send_sges, + act_nr_receive_sges, + nr_sq_pages, nr_rq_pages, h_galpas); + +#else + allocate_controls = + EHCA_BMASK_SET(H_ALL_RES_QP_Enhanced_QP_Operations, + (daqp_ctrl & DAQP_CTRL_ENABLE) ? 1 : 0) + | EHCA_BMASK_SET(H_ALL_RES_QP_QP_PTE_Pin, 0) + | EHCA_BMASK_SET(H_ALL_RES_QP_Service_Type, servicetype) + | EHCA_BMASK_SET(H_ALL_RES_QP_Signalling_Type, signalingtype) + | EHCA_BMASK_SET(H_ALL_RES_QP_LL_RQ_CQE_Posting, + (daqp_ctrl & DAQP_CTRL_RECV_COMPLETION) ? 1 : 0) + | EHCA_BMASK_SET(H_ALL_RES_QP_LL_SQ_CQE_Posting, + (daqp_ctrl & DAQP_CTRL_SEND_COMPLETION) ? 1 : 0) + | EHCA_BMASK_SET(H_ALL_RES_QP_UD_Address_Vector_L_Key_Control, + ud_av_l_key_ctl) + | EHCA_BMASK_SET(H_ALL_RES_QP_Resource_Type, 1); + + max_r10_reg = + EHCA_BMASK_SET(H_ALL_RES_QP_Max_Outstanding_Send_Work_Requests, + max_nr_send_wqes) + | EHCA_BMASK_SET(H_ALL_RES_QP_Max_Outstanding_Receive_Work_Requests, + max_nr_receive_wqes) + | EHCA_BMASK_SET(H_ALL_RES_QP_Max_Send_SG_Elements, + max_nr_send_sges) + | EHCA_BMASK_SET(H_ALL_RES_QP_Max_Receive_SG_Elements, + max_nr_receive_sges); + + + retcode = plpar_hcall_9arg_9ret(H_ALLOC_RESOURCE, + adapter_handle.handle, /* r4 */ + allocate_controls, /* r5 */ + send_cq_handle.handle, /* r6 */ + receive_cq_handle.handle,/* r7 */ + async_eq_handle.handle, /* r8 */ + ((u64) qp_token << 32) + | pd.value, /* r9 */ + max_r10_reg, /* r10 */ + ud_av_l_key, /* r11 */ + 0, + &qp_handle->handle, /* r4 */ + &qp_nr_out, /* r5 */ + &r6_out, /* r6 */ + &r7_out, /* r7 */ + &r8_out, /* r8 */ + &dummy, /* r9 */ + &g_la_user_out, /* r10 */ + &r11_out, + &dummy); + + /* extract outputs */ + *qp_nr = (u32) qp_nr_out; + *act_nr_send_wqes = (u16) + EHCA_BMASK_GET(H_ALL_RES_QP_Act_Outstanding_Send_Work_Requests, + r6_out); + *act_nr_receive_wqes = (u16) + EHCA_BMASK_GET(H_ALL_RES_QP_Act_Outstanding_Receive_Work_Requests, + r6_out); + *act_nr_send_sges = + (u8) EHCA_BMASK_GET(H_ALL_RES_QP_Act_Send_SG_Elements, + r7_out); + 
*act_nr_receive_sges = + (u8) EHCA_BMASK_GET(H_ALL_RES_QP_Act_Receeive_SG_Elements, + r7_out); + *nr_sq_pages = + (u32) EHCA_BMASK_GET(H_ALL_RES_QP_Send_Queue_Size_pages, + r8_out); + *nr_rq_pages = + (u32) EHCA_BMASK_GET(H_ALL_RES_QP_Receive_Queue_Size_pages, + r8_out); + if (retcode == 0) { + hcp_galpas_ctor(h_galpas, g_la_user_out, g_la_user_out); + } +#endif /* EHCA_USE_HCALL */ + + if (retcode == H_NOT_ENOUGH_RESOURCES) { + EDEB_ERR(4, "Not enough resources. retcode=%lx", + retcode); + } + + EDEB_EX(7, "qp_nr=%x act_nr_send_wqes=%x" + " act_nr_receive_wqes=%x act_nr_send_sges=%x" + " act_nr_receive_sges=%x nr_sq_pages=%x" + " nr_rq_pages=%x galpa.user=%lx galpa.kernel=%lx", + *qp_nr, *act_nr_send_wqes, *act_nr_receive_wqes, + *act_nr_send_sges, *act_nr_receive_sges, *nr_sq_pages, + *nr_rq_pages, h_galpas->user.fw_handle, + h_galpas->kernel.fw_handle); + + return (retcode); +} + +static inline u64 hipz_h_query_port(const struct ipz_adapter_handle + hcp_adapter_handle, + const u8 port_id, + struct query_port_rblock + *query_port_response_block) +{ + u64 retcode = H_Success; + u64 dummy; + u64 r_cb; + + EDEB_EN(7, "hcp_adapter_handle=%lx port_id %x", + hcp_adapter_handle.handle, port_id); + + if ((((u64)query_port_response_block) & 0xfff) != 0) { + EDEB_ERR(4, "response block not page aligned"); + retcode = H_Parameter; + return (retcode); + } + +#ifndef EHCA_USE_HCALL + retcode = 0; +#else + r_cb = ehca_kv_to_g(query_port_response_block); + + retcode = plpar_hcall_7arg_7ret(H_QUERY_PORT, + hcp_adapter_handle.handle, /* r4 */ + port_id, /* r5 */ + r_cb, /* r6 */ + 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + + EDEB(7, "offset0=%x offset1=%x offset2=%x offset3=%x", + ((u32 *) query_port_response_block)[0], + ((u32 *) query_port_response_block)[1], + ((u32 *) query_port_response_block)[2], + ((u32 *) query_port_response_block)[3]); + EDEB(7, "offset4=%x offset5=%x offset6=%x offset7=%x", + ((u32 
*) query_port_response_block)[4], + ((u32 *) query_port_response_block)[5], + ((u32 *) query_port_response_block)[6], + ((u32 *) query_port_response_block)[7]); + EDEB(7, "offset8=%x offset9=%x offseta=%x offsetb=%x", + ((u32 *) query_port_response_block)[8], + ((u32 *) query_port_response_block)[9], + ((u32 *) query_port_response_block)[10], + ((u32 *) query_port_response_block)[11]); + EDEB(7, "offsetc=%x offsetd=%x offsete=%x offsetf=%x", + ((u32 *) query_port_response_block)[12], + ((u32 *) query_port_response_block)[13], + ((u32 *) query_port_response_block)[14], + ((u32 *) query_port_response_block)[15]); + EDEB(7, "offset31=%x offset35=%x offset36=%x", + ((u32 *) query_port_response_block)[32], + ((u32 *) query_port_response_block)[36], + ((u32 *) query_port_response_block)[37]); + EDEB(7, "offset200=%x offset201=%x offset202=%x " + "offset203=%x", + ((u32 *) query_port_response_block)[0x200], + ((u32 *) query_port_response_block)[0x201], + ((u32 *) query_port_response_block)[0x202], + ((u32 *) query_port_response_block)[0x203]); + + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_query_hca(const struct ipz_adapter_handle + hcp_adapter_handle, + struct query_hca_rblock + *query_hca_rblock) +{ + u64 retcode = 0; + u64 dummy; + u64 r_cb; + EDEB_EN(7, "hcp_adapter_handle=%lx", hcp_adapter_handle.handle); + + if ((((u64)query_hca_rblock) & 0xfff) != 0) { + EDEB_ERR(4, "response block not page aligned"); + retcode = H_Parameter; + return (retcode); + } + +#ifndef EHCA_USE_HCALL + retcode = 0; +#else + r_cb = ehca_kv_to_g(query_hca_rblock); + + retcode = plpar_hcall_7arg_7ret(H_QUERY_HCA, + hcp_adapter_handle.handle, /* r4 */ + r_cb, /* r5 */ + 0, 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + + EDEB(7, "offset0=%x offset1=%x offset2=%x offset3=%x", + ((u32 *) query_hca_rblock)[0], + ((u32 *) query_hca_rblock)[1], + ((u32 *) query_hca_rblock)[2], ((u32 *) 
query_hca_rblock)[3]); + EDEB(7, "offset4=%x offset5=%x offset6=%x offset7=%x", + ((u32 *) query_hca_rblock)[4], + ((u32 *) query_hca_rblock)[5], + ((u32 *) query_hca_rblock)[6], ((u32 *) query_hca_rblock)[7]); + EDEB(7, "offset8=%x offset9=%x offseta=%x offsetb=%x", + ((u32 *) query_hca_rblock)[8], + ((u32 *) query_hca_rblock)[9], + ((u32 *) query_hca_rblock)[10], ((u32 *) query_hca_rblock)[11]); + EDEB(7, "offsetc=%x offsetd=%x offsete=%x offsetf=%x", + ((u32 *) query_hca_rblock)[12], + ((u32 *) query_hca_rblock)[13], + ((u32 *) query_hca_rblock)[14], ((u32 *) query_hca_rblock)[15]); + EDEB(7, "offset136=%x offset192=%x offset204=%x", + ((u32 *) query_hca_rblock)[32], + ((u32 *) query_hca_rblock)[48], ((u32 *) query_hca_rblock)[51]); + EDEB(7, "offset231=%x offset235=%x", + ((u32 *) query_hca_rblock)[57], ((u32 *) query_hca_rblock)[58]); + EDEB(7, "offset200=%x offset201=%x offset202=%x offset203=%x", + ((u32 *) query_hca_rblock)[0x201], + ((u32 *) query_hca_rblock)[0x202], + ((u32 *) query_hca_rblock)[0x203], + ((u32 *) query_hca_rblock)[0x204]); + + EDEB_EX(7, "retcode=%lx hcp_adapter_handle=%lx", + retcode, hcp_adapter_handle.handle); + + return retcode; +} + +/** + * hipz_h_register_rpage - hcp_if.h internal function for all + * hcp_H_REGISTER_RPAGE calls. 
+ * + * @logical_address_of_page: kv transformation to GX address in this routine + */ +static inline u64 hipz_h_register_rpage(const struct + ipz_adapter_handle + hcp_adapter_handle, + const u8 pagesize, + const u8 queue_type, + const u64 resource_handle, + const u64 + logical_address_of_page, + u64 count) +{ + u64 retcode = 0; + u64 dummy; + + EDEB_EN(7, "hcp_adapter_handle=%lx pagesize=%x queue_type=%x" + " resource_handle=%lx logical_address_of_page=%lx count=%lx", + hcp_adapter_handle.handle, pagesize, queue_type, + resource_handle, logical_address_of_page, count); + +#ifndef EHCA_USE_HCALL + EDEB_ERR(4, "Not implemented"); +#else + retcode = plpar_hcall_7arg_7ret(H_REGISTER_RPAGES, + hcp_adapter_handle.handle, /* r4 */ + queue_type | pagesize << 8, /* r5 */ + resource_handle, /* r6 */ + logical_address_of_page, /* r7 */ + count, /* r8 */ + 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_register_rpage_eq(const struct + ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_eq_handle + eq_handle, + struct ehca_pfeq *pfeq, + const u8 pagesize, + const u8 queue_type, + const u64 + logical_address_of_page, + const u64 count) +{ + u64 retcode = 0; + + EDEB_EN(7, "pfeq=%p hcp_adapter_handle=%lx eq_handle=%lx pagesize=%x" + " queue_type=%x logical_address_of_page=%lx count=%lx", + pfeq, hcp_adapter_handle.handle, eq_handle.handle, pagesize, + queue_type,logical_address_of_page, count); + +#ifndef EHCA_USE_HCALL + retcode = + simp_h_register_rpage_eq(hcp_adapter_handle, eq_handle, pfeq, + pagesize, queue_type, + logical_address_of_page, count); +#else + if (count != 1) { + EDEB_ERR(4, "Ppage counter=%lx", count); + return (H_Parameter); + } + retcode = hipz_h_register_rpage(hcp_adapter_handle, + pagesize, + queue_type, + eq_handle.handle, + logical_address_of_page, count); +#endif /* EHCA_USE_HCALL */ + + EDEB_EX(7, 
"retcode=%lx", retcode); + + return retcode; +} + +static inline u32 hipz_request_interrupt(struct ehca_irq_info *irq_info, + irqreturn_t(*handler) + (int, void *, struct pt_regs *)) +{ + + int ret = 0; + + EDEB_EN(7, "ist=0x%x", irq_info->ist); + +#ifdef EHCA_USE_HCALL +#ifndef EHCA_USERDRIVER + ret = ibmebus_request_irq(NULL, irq_info->ist, handler, + SA_INTERRUPT, "ehca", (void *)irq_info); + + if (ret < 0) + EDEB_ERR(4, "Can't map interrupt handler."); +#else + struct hcall_irq_info hirq = {.irq = irq_info->irq, + .ist = irq_info->ist, + .pid = irq_info->pid}; + + hirq = hirq; + ret = hcall_reg_eqh(&hirq, ehca_interrupt_eq); +#endif /* EHCA_USERDRIVER */ +#endif /* EHCA_USE_HCALL */ + + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +static inline void hipz_free_interrupt(struct ehca_irq_info *irq_info) +{ +#ifdef EHCA_USE_HCALL +#ifndef EHCA_USERDRIVER + ibmebus_free_irq(NULL, irq_info->ist, (void *)irq_info); +#endif +#endif +} + +static inline u32 hipz_h_query_int_state(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_irq_info *irq_info) +{ + u32 rc = 0; + u64 dummy = 0; + + EDEB_EN(7, "ist=0x%x", irq_info->ist); + +#ifdef EHCA_USE_HCALL +#ifdef EHCA_USERDRIVER + /* TODO: Not implemented yet */ +#else + rc = plpar_hcall_7arg_7ret(H_QUERY_INT_STATE, + hcp_adapter_handle.handle, /* r4 */ + irq_info->ist, /* r5 */ + 0, 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); + + if ((rc != H_Success) && (rc != H_Busy)) + EDEB_ERR(4, "Could not query interrupt state."); +#endif +#endif + EDEB_EX(7, "interrupt state: %x", rc); + + return rc; +} + +static inline u64 hipz_h_register_rpage_cq(const struct + ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_cq_handle + cq_handle, + struct ehca_pfcq *pfcq, + const u8 pagesize, + const u8 queue_type, + const u64 + logical_address_of_page, + const u64 count, + const struct h_galpa gal) +{ + u64 retcode = 0; + + EDEB_EN(7, "pfcq=%p hcp_adapter_handle=%lx 
cq_handle=%lx pagesize=%x" + " queue_type=%x logical_address_of_page=%lx count=%lx", + pfcq, hcp_adapter_handle.handle, cq_handle.handle, pagesize, + queue_type, logical_address_of_page, count); + +#ifndef EHCA_USE_HCALL + retcode = + simp_h_register_rpage_cq(hcp_adapter_handle, cq_handle, pfcq, + pagesize, queue_type, + logical_address_of_page, count, gal); +#else + if (count != 1) { + EDEB_ERR(4, "Page counter=%lx", count); + return (H_Parameter); + } + + retcode = + hipz_h_register_rpage(hcp_adapter_handle, pagesize, queue_type, + cq_handle.handle, logical_address_of_page, + count); +#endif /* EHCA_USE_HCALL */ + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_register_rpage_qp(const struct + ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, + struct ehca_pfqp *pfqp, + const u8 pagesize, + const u8 queue_type, + const u64 + logical_address_of_page, + const u64 count, + const struct h_galpa + galpa) +{ + u64 retcode = 0; + + EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx qp_handle=%lx pagesize=%x" + " queue_type=%x logical_address_of_page=%lx count=%lx", + pfqp, hcp_adapter_handle.handle, qp_handle.handle, pagesize, + queue_type, logical_address_of_page, count); + +#ifndef EHCA_USE_HCALL + retcode = simp_h_register_rpage_qp(hcp_adapter_handle, + qp_handle, + pfqp, + pagesize, + queue_type, + logical_address_of_page, + count, galpa); +#else + if (count != 1) { + EDEB_ERR(4, "Page counter=%lx", count); + return (H_Parameter); + } + + retcode = hipz_h_register_rpage(hcp_adapter_handle, + pagesize, + queue_type, + qp_handle.handle, + logical_address_of_page, count); +#endif /* EHCA_USE_HCALL */ + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_remove_rpt_cq(const struct + ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_cq_handle + cq_handle, + struct ehca_pfcq *pfcq) +{ + u64 retcode = 0; + + EDEB_EN(7, "pfcq=%p hcp_adapter_handle=%lx cq_handle=%lx", 
+ pfcq, hcp_adapter_handle.handle, cq_handle.handle); + +#ifndef EHCA_USE_HCALL + retcode = simp_h_remove_rpt_cq(hcp_adapter_handle, cq_handle, pfcq); +#else + /* TODO: hcall not implemented */ +#endif + EDEB_EX(7, "retcode=%lx", retcode); + + return 0; +} + +static inline u64 hipz_h_remove_rpt_eq(const struct + ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_eq_handle + eq_handle, + struct ehca_pfeq *pfeq) +{ + u64 retcode = 0; + + EDEB_EX(7, "hcp_adapter_handle=%lx eq_handle=%lx", + hcp_adapter_handle.handle, eq_handle.handle); + +#ifndef EHCA_USE_HCALL + retcode = simp_h_remove_rpt_eq(hcp_adapter_handle, eq_handle, pfeq); +#else + /* TODO: hcall not implemented */ +#endif + EDEB_EX(7, "retcode=%lx", retcode); + + return 0; +} + +static inline u64 hipz_h_remove_rpt_qp(const struct + ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, + struct ehca_pfqp *pfqp) +{ + u64 retcode = 0; + + EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx qp_handle=%lx", + pfqp, hcp_adapter_handle.handle, qp_handle.handle); + +#ifndef EHCA_USE_HCALL + retcode = simp_h_remove_rpt_qp(hcp_adapter_handle, qp_handle, pfqp); +#else + /* TODO: hcall not implemented */ +#endif + EDEB_EX(7, "retcode=%lx", retcode); + + return 0; +} + +static inline u64 hipz_h_disable_and_get_wqe(const struct + ipz_adapter_handle + hcp_adapter_handle, + const struct + ipz_qp_handle qp_handle, + struct ehca_pfqp *pfqp, + void **log_addr_next_sq_wqe_tb_processed, + void **log_addr_next_rq_wqe_tb_processed, + int dis_and_get_function_code) +{ + u64 retcode = 0; + u8 function_code = 1; + u64 dummy, dummy1, dummy2; + + EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx function=%x qp_handle=%lx", + pfqp, hcp_adapter_handle.handle, function_code, qp_handle.handle); + + if (log_addr_next_sq_wqe_tb_processed==NULL) { + log_addr_next_sq_wqe_tb_processed = (void**)&dummy1; + } + if (log_addr_next_rq_wqe_tb_processed==NULL) { + log_addr_next_rq_wqe_tb_processed = (void**)&dummy2; + } +#ifndef 
EHCA_USE_HCALL + retcode = + simp_h_disable_and_get_wqe(hcp_adapter_handle, qp_handle, pfqp, + log_addr_next_sq_wqe_tb_processed, + log_addr_next_rq_wqe_tb_processed); +#else + + retcode = plpar_hcall_7arg_7ret(H_DISABLE_AND_GETC, + hcp_adapter_handle.handle, /* r4 */ + dis_and_get_function_code, /* r5 */ + /* function code 1-disQP ret + * SQ RQ wqe ptr + * 2- ret SQ wqe ptr + * 3- ret. RQ count */ + qp_handle.handle, /* r6 */ + 0, 0, 0, 0, + (void*)log_addr_next_sq_wqe_tb_processed, /* r4 */ + (void*)log_addr_next_rq_wqe_tb_processed, /* r5 */ + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + EDEB_EX(7, "retcode=%lx ladr_next_rq_wqe_out=%p" + " ladr_next_sq_wqe_out=%p", retcode, + *log_addr_next_sq_wqe_tb_processed, + *log_addr_next_rq_wqe_tb_processed); + + return retcode; +} + +enum hcall_sigt { + HCALL_SIGT_NO_CQE = 0, + HCALL_SIGT_BY_WQE = 1, + HCALL_SIGT_EVERY = 2 +}; + +static inline u64 hipz_h_modify_qp(const struct ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, struct ehca_pfqp *pfqp, + const u64 update_mask, + struct hcp_modify_qp_control_block + *mqpcb, + struct h_galpa gal) +{ + u64 retcode = 0; + u64 invalid_attribute_identifier = 0; + u64 rc_attrib_mask = 0; + u64 dummy; + u64 r_cb; + EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx qp_handle=%lx" + " update_mask=%lx qp_state=%x mqpcb=%p", + pfqp, hcp_adapter_handle.handle, qp_handle.handle, + update_mask, mqpcb->qp_state, mqpcb); + +#ifndef EHCA_USE_HCALL + simp_h_modify_qp(hcp_adapter_handle, qp_handle, pfqp, update_mask, + mqpcb, gal); +#else + r_cb = ehca_kv_to_g(mqpcb); + retcode = plpar_hcall_7arg_7ret(H_MODIFY_QP, + hcp_adapter_handle.handle, /* r4 */ + qp_handle.handle, /* r5 */ + update_mask, /* r6 */ + r_cb, /* r7 */ + 0, 0, 0, + &invalid_attribute_identifier, /* r4 */ + &dummy, /* r5 */ + &dummy, /* r6 */ + &dummy, /* r7 */ + &dummy, /* r8 */ + &rc_attrib_mask, /* r9 */ + &dummy); +#endif + if (retcode == H_NOT_ENOUGH_RESOURCES) 
{ + EDEB_ERR(4, "Insufficient resources retcode=%lx", retcode); + } + + EDEB_EX(7, "retcode=%lx invalid_attribute_identifier=%lx" + " invalid_attribute_MASK=%lx", retcode, + invalid_attribute_identifier, rc_attrib_mask); + + return retcode; +} + +static inline u64 hipz_h_query_qp(const struct ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, struct ehca_pfqp *pfqp, + struct hcp_modify_qp_control_block + *qqpcb, struct h_galpa gal) +{ + u64 retcode = 0; + u64 dummy; + u64 r_cb; + EDEB_EN(7, "hcp_adapter_handle=%lx qp_handle=%lx", + hcp_adapter_handle.handle, qp_handle.handle); + +#ifndef EHCA_USE_HCALL + simp_h_query_qp(hcp_adapter_handle, qp_handle, qqpcb, gal); +#else + r_cb = ehca_kv_to_g(qqpcb); + EDEB(7, "r_cb=%lx", r_cb); + + retcode = plpar_hcall_7arg_7ret(H_QUERY_QP, + hcp_adapter_handle.handle, /* r4 */ + qp_handle.handle, /* r5 */ + r_cb, /* r6 */ + 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); + +#endif + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_destroy_qp(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_qp *qp) +{ + u64 retcode = 0; + u64 dummy; + u64 ladr_next_sq_wqe_out; + u64 ladr_next_rq_wqe_out; + + EDEB_EN(7, "qp = %p ,ipz_qp_handle=%lx adapter_handle=%lx", + qp, qp->ipz_qp_handle.handle, hcp_adapter_handle.handle); + +#ifndef EHCA_USE_HCALL + retcode = + simp_h_destroy_qp(hcp_adapter_handle, qp, + qp->ehca_qp_core.galpas.user); +#else + + retcode = hcp_galpas_dtor(&qp->ehca_qp_core.galpas); + + retcode = plpar_hcall_7arg_7ret(H_DISABLE_AND_GETC, + hcp_adapter_handle.handle, /* r4 */ + /* function code */ + 1, /* r5 */ + qp->ipz_qp_handle.handle, /* r6 */ + 0, 0, 0, 0, + &ladr_next_sq_wqe_out, /* r4 */ + &ladr_next_rq_wqe_out, /* r5 */ + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); + if (retcode == H_Hardware) { + EDEB_ERR(4, "HCA not operational. 
retcode=%lx", retcode); + } + + retcode = plpar_hcall_7arg_7ret(H_FREE_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + qp->ipz_qp_handle.handle, /* r5 */ + 0, 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + + if (retcode == H_Resource) { + EDEB_ERR(4, "Resource still in use. retcode=%lx", retcode); + } + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_define_aqp0(const struct ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, struct h_galpa gal, + u32 port) +{ + u64 retcode = 0; + u64 dummy; + + EDEB_EN(7, "port=%x ipz_qp_handle=%lx adapter_handle=%lx", + port, qp_handle.handle, hcp_adapter_handle.handle); + +#ifndef EHCA_USE_HCALL + /* TODO: not implemented yet */ +#else + + retcode = plpar_hcall_7arg_7ret(H_DEFINE_AQP0, + hcp_adapter_handle.handle, /* r4 */ + qp_handle.handle, /* r5 */ + port, /* r6 */ + 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); + +#endif /* EHCA_USE_HCALL */ + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_define_aqp1(const struct ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, struct h_galpa gal, + u32 port, u32 * pma_qp_nr, + u32 * bma_qp_nr) +{ + u64 retcode = 0; + u64 dummy; + u64 pma_qp_nr_out; + u64 bma_qp_nr_out; + + EDEB_EN(7, "port=%x qp_handle=%lx adapter_handle=%lx", + port, qp_handle.handle, hcp_adapter_handle.handle); + +#ifndef EHCA_USE_HCALL + /* TODO: not implemented yet */ +#else + + retcode = plpar_hcall_7arg_7ret(H_DEFINE_AQP1, + hcp_adapter_handle.handle, /* r4 */ + qp_handle.handle, /* r5 */ + port, /* r6 */ + 0, 0, 0, 0, + &pma_qp_nr_out, /* r4 */ + &bma_qp_nr_out, /* r5 */ + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); + *pma_qp_nr = (u32) pma_qp_nr_out; + *bma_qp_nr = (u32) bma_qp_nr_out; + +#endif + if (retcode == H_ALIAS_EXIST) { + EDEB_ERR(4, "AQP1 already exists. 
retcode=%lx", retcode); + } + + EDEB_EX(7, "retcode=%lx pma_qp_nr=%i bma_qp_nr=%i", + retcode, (int)*pma_qp_nr, (int)*bma_qp_nr); + + return retcode; +} + +/* TODO: Don't use ib_* types in this file */ +static inline u64 hipz_h_attach_mcqp(const struct ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, struct h_galpa gal, + u16 mcg_dlid, union ib_gid dgid) +{ + u64 retcode = 0; + u64 dummy; + + EDEB_EN(7, "qp_handle=%lx adapter_handle=%lx\nMCG_DGID =" + " %d.%d.%d.%d.%d.%d.%d.%d." + " %d.%d.%d.%d.%d.%d.%d.%d\n", + qp_handle.handle, hcp_adapter_handle.handle, + dgid.raw[0], dgid.raw[1], + dgid.raw[2], dgid.raw[3], + dgid.raw[4], dgid.raw[5], + dgid.raw[6], dgid.raw[7], + dgid.raw[0 + 8], dgid.raw[1 + 8], + dgid.raw[2 + 8], dgid.raw[3 + 8], + dgid.raw[4 + 8], dgid.raw[5 + 8], + dgid.raw[6 + 8], dgid.raw[7 + 8]); + +#ifndef EHCA_USE_HCALL + /* TODO: not implemented yet */ +#else + retcode = plpar_hcall_7arg_7ret(H_ATTACH_MCQP, + hcp_adapter_handle.handle, /* r4 */ + qp_handle.handle, /* r5 */ + mcg_dlid, /* r6 */ + dgid.global.interface_id, /* r7 */ + dgid.global.subnet_prefix, /* r8 */ + 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + if (retcode == H_NOT_ENOUGH_RESOURCES) { + EDEB_ERR(4, "Not enough resources. retcode=%lx", retcode); + } + + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_detach_mcqp(const struct ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_qp_handle + qp_handle, struct h_galpa gal, + u16 mcg_dlid, union ib_gid dgid) +{ + u64 retcode = 0; + u64 dummy; + + EDEB_EN(7, "qp_handle=%lx adapter_handle=%lx\nMCG_DGID =" + " %d.%d.%d.%d.%d.%d.%d.%d." 
+ " %d.%d.%d.%d.%d.%d.%d.%d\n", + qp_handle.handle, hcp_adapter_handle.handle, + dgid.raw[0], dgid.raw[1], + dgid.raw[2], dgid.raw[3], + dgid.raw[4], dgid.raw[5], + dgid.raw[6], dgid.raw[7], + dgid.raw[0 + 8], dgid.raw[1 + 8], + dgid.raw[2 + 8], dgid.raw[3 + 8], + dgid.raw[4 + 8], dgid.raw[5 + 8], + dgid.raw[6 + 8], dgid.raw[7 + 8]); +#ifndef EHCA_USE_HCALL + /* TODO: not implemented yet */ +#else + retcode = plpar_hcall_7arg_7ret(H_DETACH_MCQP, + hcp_adapter_handle.handle, /* r4 */ + qp_handle.handle, /* r5 */ + mcg_dlid, /* r6 */ + dgid.global.interface_id, /* r7 */ + dgid.global.subnet_prefix, /* r8 */ + 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + EDEB(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_destroy_cq(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_cq *cq, + u8 force_flag) +{ + u64 retcode = 0; + u64 dummy; + + EDEB_EN(7, "cq->pf=%p cq=.%p ipz_cq_handle=%lx adapter_handle=%lx", + &cq->pf, cq, cq->ipz_cq_handle.handle, hcp_adapter_handle.handle); + +#ifndef EHCA_USE_HCALL + simp_h_destroy_cq(hcp_adapter_handle, cq, + cq->ehca_cq_core.galpas.kernel); +#else + retcode = hcp_galpas_dtor(&cq->ehca_cq_core.galpas); + if (retcode != 0) { + EDEB_ERR(4, "Could not destruct cp->galpas"); + return (H_Resource); + } + + retcode = plpar_hcall_7arg_7ret(H_FREE_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + cq->ipz_cq_handle.handle, /* r5 */ + force_flag!=0 ? 
1L : 0L, /* r6 */ + 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif + + if (retcode == H_Resource) { + EDEB(4, "retcode=%lx ", retcode); + } + + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +static inline u64 hipz_h_destroy_eq(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_eq *eq) +{ + u64 retcode = 0; + u64 dummy; + + EDEB_EN(7, "eq->pf=%p eq=%p ipz_eq_handle=%lx adapter_handle=%lx", + &eq->pf, eq, eq->ipz_eq_handle.handle, + hcp_adapter_handle.handle); + +#ifndef EHCA_USE_HCALL + /* TODO: not implemented yet */ +#else + + retcode = hcp_galpas_dtor(&eq->galpas); + if (retcode != 0) { + EDEB_ERR(4, "Could not destruct ep->galpas"); + return (H_Resource); + } + + retcode = plpar_hcall_7arg_7ret(H_FREE_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + eq->ipz_eq_handle.handle, /* r5 */ + 0, 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); + +#endif + if (retcode == H_Resource) { + EDEB_ERR(4, "Resource in use. retcode=%lx ", retcode); + } + EDEB_EX(7, "retcode=%lx", retcode); + + return retcode; +} + +/** + * hipz_h_alloc_resource_mr - Allocate MR resources in HW and FW, initialize + * resources. 
+ *
+ * @pfmr: platform specific for MR
+ * @pfshca: platform specific for SHCA
+ * @vaddr: Memory Region I/O Virtual Address
+ * @length: Memory Region Length
+ * @access_ctrl: Memory Region Access Controls
+ * @pd: Protection Domain
+ * @mr_handle: Memory Region Handle
+ */
+static inline u64 hipz_h_alloc_resource_mr(const struct ipz_adapter_handle
+					   hcp_adapter_handle,
+					   struct ehca_pfmr *pfmr,
+					   struct ehca_pfshca
+					   *pfshca,
+					   const u64 vaddr,
+					   const u64 length,
+					   const u32 access_ctrl,
+					   const struct ipz_pd pd,
+					   struct ipz_mrmw_handle
+					   *mr_handle,
+					   u32 * lkey,
+					   u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 lkey_out;
+	u64 rkey_out;
+
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p vaddr=%lx length=%lx"
+		" access_ctrl=%x pd=%x pfshca=%p",
+		hcp_adapter_handle.handle, pfmr, vaddr, length, access_ctrl,
+		pd.value, pfshca);
+
+#ifndef EHCA_USE_HCALL
+	rc = simp_hcz_h_alloc_resource_mr(hcp_adapter_handle,
+					  pfmr,
+					  pfshca,
+					  vaddr,
+					  length,
+					  access_ctrl,
+					  pd,
+					  (struct hcz_mrmw_handle *)mr_handle,
+					  lkey, rkey);
+	EDEB_EX(7, "rc=%lx mr_handle.mrwpte=%p mr_handle.page_index=%x"
+		" lkey=%x rkey=%x",
+		rc, mr_handle->mrwpte, mr_handle->page_index, *lkey, *rkey);
+#else
+
+	rc = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE,
+				   hcp_adapter_handle.handle, /* r4 */
+				   5, /* r5 */
+				   vaddr, /* r6 */
+				   length, /* r7 */
+				   ((((u64) access_ctrl) << 32ULL)), /* r8 */
+				   pd.value, /* r9 */
+				   0,
+				   &mr_handle->handle, /* r4 */
+				   &dummy, /* r5 */
+				   &lkey_out, /* r6 */
+				   &rkey_out, /* r7 */
+				   &dummy,
+				   &dummy,
+				   &dummy);
+	*lkey = (u32) lkey_out;
+	*rkey = (u32) rkey_out;
+
+	EDEB_EX(7, "rc=%lx mr_handle=%lx lkey=%x rkey=%x",
+		rc, mr_handle->handle, *lkey, *rkey);
+#endif /* EHCA_USE_HCALL */
+
+	return rc;
+}
+
+/**
+ * hipz_h_register_rpage_mr - Register MR resource page in HW and FW.
+ * + * @pfmr: platform specific for MR + * @pfshca: platform specific for SHCA + * @queue_type: must be zero for MR + */ +static inline u64 hipz_h_register_rpage_mr(const struct ipz_adapter_handle + hcp_adapter_handle, + const struct ipz_mrmw_handle + *mr_handle, + struct ehca_pfmr *pfmr, + struct ehca_pfshca *pfshca, + const u8 pagesize, + const u8 queue_type, + const u64 + logical_address_of_page, + const u64 count) +{ + u64 rc = H_Success; + +#ifndef EHCA_USE_HCALL + EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle.mrwpte=%p" + " mr_handle.page_index=%x pagesize=%x queue_type=%x " + " logical_address_of_page=%lx count=%lx pfshca=%p", + hcp_adapter_handle.handle, pfmr, mr_handle->mrwpte, + mr_handle->page_index, pagesize, queue_type, + logical_address_of_page, count, pfshca); + + rc = simp_hcz_h_register_rpage_mr(hcp_adapter_handle, + (struct hcz_mrmw_handle *)mr_handle, + pfmr, + pfshca, + pagesize, + queue_type, + logical_address_of_page, count); +#else + EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle=%lx pagesize=%x" + " queue_type=%x logical_address_of_page=%lx count=%lx", + hcp_adapter_handle.handle, pfmr, mr_handle->handle, pagesize, + queue_type, logical_address_of_page, count); + + if ((count > 1) && (logical_address_of_page & 0xfff)) { + ehca_catastrophic("ERROR: logical_address_of_page " + "not on a 4k boundary"); + rc = H_Parameter; + } else { + rc = hipz_h_register_rpage(hcp_adapter_handle, pagesize, + queue_type, mr_handle->handle, + logical_address_of_page, count); + } +#endif /* EHCA_USE_HCALL */ + EDEB_EX(7, "rc=%lx", rc); + + return rc; +} + +/** + * hipz_h_query_mr - Query MR in HW and FW. 
+ *
+ * @pfmr: platform specific for MR
+ * @mr_handle: Memory Region Handle
+ * @mr_local_length: Local MR Length
+ * @mr_local_vaddr: Local MR I/O Virtual Address
+ * @mr_remote_length: Remote MR Length
+ * @mr_remote_vaddr: Remote MR I/O Virtual Address
+ * @access_ctrl: Memory Region Access Controls
+ * @pd: Protection Domain
+ * @lkey: L_Key
+ * @rkey: R_Key
+ */
+static inline u64 hipz_h_query_mr(const struct ipz_adapter_handle
+				  hcp_adapter_handle,
+				  struct ehca_pfmr *pfmr,
+				  const struct ipz_mrmw_handle
+				  *mr_handle,
+				  u64 * mr_local_length,
+				  u64 * mr_local_vaddr,
+				  u64 * mr_remote_length,
+				  u64 * mr_remote_vaddr,
+				  u32 * access_ctrl,
+				  struct ipz_pd *pd,
+				  u32 * lkey,
+				  u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 acc_ctrl_pd_out;
+	u64 r9_out;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle.mrwpte=%p"
+		" mr_handle.page_index=%x",
+		hcp_adapter_handle.handle, pfmr, mr_handle->mrwpte,
+		mr_handle->page_index);
+
+	rc = simp_hcz_h_query_mr(hcp_adapter_handle,
+				 pfmr,
+				 mr_handle,
+				 mr_local_length,
+				 mr_local_vaddr,
+				 mr_remote_length,
+				 mr_remote_vaddr, access_ctrl, pd, lkey, rkey);
+
+	EDEB_EX(7, "rc=%lx mr_local_length=%lx mr_local_vaddr=%lx"
+		" mr_remote_length=%lx mr_remote_vaddr=%lx access_ctrl=%x"
+		" pd=%x lkey=%x rkey=%x",
+		rc, *mr_local_length, *mr_local_vaddr, *mr_remote_length,
+		*mr_remote_vaddr, *access_ctrl, pd->value, *lkey, *rkey);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle=%lx",
+		hcp_adapter_handle.handle, pfmr, mr_handle->handle);
+
+
+	rc = plpar_hcall_7arg_7ret(H_QUERY_MR,
+				   hcp_adapter_handle.handle, /* r4 */
+				   mr_handle->handle, /* r5 */
+				   0, 0, 0, 0, 0,
+				   mr_local_length, /* r4 */
+				   mr_local_vaddr, /* r5 */
+				   mr_remote_length, /* r6 */
+				   mr_remote_vaddr, /* r7 */
+				   &acc_ctrl_pd_out, /* r8 */
+				   &r9_out,
+				   &dummy);
+
+	*access_ctrl = acc_ctrl_pd_out >> 32;
+	pd->value = (u32) acc_ctrl_pd_out;
+	*lkey = (u32) (r9_out >> 32);
+	*rkey = (u32) (r9_out & (0xffffffff));
+
+
EDEB_EX(7, "rc=%lx mr_local_length=%lx mr_local_vaddr=%lx" + " mr_remote_length=%lx mr_remote_vaddr=%lx access_ctrl=%x" + " pd=%x lkey=%x rkey=%x", + rc, *mr_local_length, *mr_local_vaddr, *mr_remote_length, + *mr_remote_vaddr, *access_ctrl, pd->value, *lkey, *rkey); +#endif /* EHCA_USE_HCALL */ + + return rc; +} + +/** + * hipz_h_free_resource_mr - Free MR resources in HW and FW. + * + * @pfmr: platform specific for MR + * @mr_handle: Memory Region Handle + */ +static inline u64 hipz_h_free_resource_mr(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_pfmr *pfmr, + const struct ipz_mrmw_handle + *mr_handle) +{ + u64 rc = H_Success; + u64 dummy; + +#ifndef EHCA_USE_HCALL + EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle.mrwpte=%p" + " mr_handle.page_index=%x", + hcp_adapter_handle.handle, pfmr, mr_handle->mrwpte, + mr_handle->page_index); + + rc = simp_hcz_h_free_resource_mr(hcp_adapter_handle, pfmr, mr_handle); +#else + EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle=%lx", + hcp_adapter_handle.handle, pfmr, mr_handle->handle); + + rc = plpar_hcall_7arg_7ret(H_FREE_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + mr_handle->handle, /* r5 */ + 0, 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + EDEB_EX(7, "rc=%lx", rc); + + return rc; +} + +/** + * hipz_h_reregister_pmr - Reregister MR in HW and FW. 
+ *
+ * @pfmr: platform specific for MR
+ * @pfshca: platform specific for SHCA
+ * @mr_handle: Memory Region Handle
+ * @vaddr_in: Memory Region I/O Virtual Address
+ * @length: Memory Region Length
+ * @access_ctrl: Memory Region Access Controls
+ * @pd: Protection Domain
+ * @mr_addr_cb: Logical Address of MR Control Block
+ * @vaddr_out: Memory Region I/O Virtual Address
+ * @lkey: L_Key
+ * @rkey: R_Key
+ *
+ */
+static inline u64 hipz_h_reregister_pmr(const struct ipz_adapter_handle
+					hcp_adapter_handle,
+					struct ehca_pfmr *pfmr,
+					struct ehca_pfshca *pfshca,
+					const struct ipz_mrmw_handle
+					*mr_handle,
+					const u64 vaddr_in,
+					const u64 length,
+					const u32 access_ctrl,
+					const struct ipz_pd pd,
+					const u64 mr_addr_cb,
+					u64 * vaddr_out,
+					u32 * lkey,
+					u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 lkey_out;
+	u64 rkey_out;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p pfshca=%p"
+		" mr_handle.mrwpte=%p mr_handle.page_index=%x vaddr_in=%lx"
+		" length=%lx access_ctrl=%x pd=%x mr_addr_cb=%lx",
+		hcp_adapter_handle.handle, pfmr, pfshca, mr_handle->mrwpte,
+		mr_handle->page_index, vaddr_in, length, access_ctrl,
+		pd.value, mr_addr_cb);
+
+	rc = simp_hcz_h_reregister_pmr(hcp_adapter_handle, pfmr, pfshca,
+				       mr_handle, vaddr_in, length, access_ctrl,
+				       pd, mr_addr_cb, vaddr_out, lkey, rkey);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p pfshca=%p mr_handle=%lx "
+		"vaddr_in=%lx length=%lx access_ctrl=%x pd=%x mr_addr_cb=%lx",
+		hcp_adapter_handle.handle, pfmr, pfshca, mr_handle->handle,
+		vaddr_in, length, access_ctrl, pd.value, mr_addr_cb);
+
+	rc = plpar_hcall_7arg_7ret(H_REREGISTER_PMR,
+				   hcp_adapter_handle.handle, /* r4 */
+				   mr_handle->handle, /* r5 */
+				   vaddr_in, /* r6 */
+				   length, /* r7 */
+				   /* r8 */
+				   ((((u64) access_ctrl) << 32ULL) | pd.value),
+				   mr_addr_cb, /* r9 */
+				   0,
+				   &dummy, /* r4 */
+				   vaddr_out, /* r5 */
+				   &lkey_out, /* r6 */
+				   &rkey_out, /* r7 */
+				   &dummy,
+				   &dummy,
+				   &dummy);
+	*lkey = (u32) lkey_out;
+	*rkey =
(u32) rkey_out; +#endif /* EHCA_USE_HCALL */ + + EDEB_EX(7, "rc=%lx vaddr_out=%lx lkey=%x rkey=%x", + rc, *vaddr_out, *lkey, *rkey); + return rc; +} + +/** @brief + as defined in carols hcall document +*/ + +/** + * Register shared MR in HW and FW. + * + * @pfmr: platform specific for new shared MR + * @orig_pfmr: platform specific for original MR + * @pfshca: platform specific for SHCA + * @orig_mr_handle: Memory Region Handle of original MR + * @vaddr_in: Memory Region I/O Virtual Address of new shared MR + * @access_ctrl: Memory Region Access Controls of new shared MR + * @pd: Protection Domain of new shared MR + * @mr_handle: Memory Region Handle of new shared MR + * @lkey: L_Key of new shared MR + * @rkey: R_Key of new shared MR + */ +static inline u64 hipz_h_register_smr(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_pfmr *pfmr, + struct ehca_pfmr *orig_pfmr, + struct ehca_pfshca *pfshca, + const struct ipz_mrmw_handle + *orig_mr_handle, + const u64 vaddr_in, + const u32 access_ctrl, + const struct ipz_pd pd, + struct ipz_mrmw_handle + *mr_handle, + u32 * lkey, + u32 * rkey) +{ + u64 rc = H_Success; + u64 dummy; + u64 lkey_out; + u64 rkey_out; + +#ifndef EHCA_USE_HCALL + EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p orig_pfmr=%p pfshca=%p" + " orig_mr_handle.mrwpte=%p orig_mr_handle.page_index=%x" + " vaddr_in=%lx access_ctrl=%x pd=%x", + hcp_adapter_handle.handle, pfmr, orig_pfmr, pfshca, + orig_mr_handle->mrwpte, orig_mr_handle->page_index, + vaddr_in, access_ctrl, pd.value); + + rc = simp_hcz_h_register_smr(hcp_adapter_handle, pfmr, orig_pfmr, + pfshca, orig_mr_handle, vaddr_in, + access_ctrl, pd, + (struct hcz_mrmw_handle *)mr_handle, lkey, + rkey); + EDEB_EX(7, "rc=%lx mr_handle.mrwpte=%p mr_handle.page_index=%x" + " lkey=%x rkey=%x", + rc, mr_handle->mrwpte, mr_handle->page_index, *lkey, *rkey); +#else + EDEB_EN(7, "hcp_adapter_handle=%lx orig_pfmr=%p pfshca=%p" + " orig_mr_handle=%lx vaddr_in=%lx access_ctrl=%x pd=%x", + 
hcp_adapter_handle.handle, orig_pfmr, pfshca, + orig_mr_handle->handle, vaddr_in, access_ctrl, pd.value); + + + rc = plpar_hcall_7arg_7ret(H_REGISTER_SMR, + hcp_adapter_handle.handle, /* r4 */ + orig_mr_handle->handle, /* r5 */ + vaddr_in, /* r6 */ + ((((u64) access_ctrl) << 32ULL)), /* r7 */ + pd.value, /* r8 */ + 0, 0, + &mr_handle->handle, /* r4 */ + &dummy, /* r5 */ + &lkey_out, /* r6 */ + &rkey_out, /* r7 */ + &dummy, + &dummy, + &dummy); + *lkey = (u32) lkey_out; + *rkey = (u32) rkey_out; + + EDEB_EX(7, "rc=%lx mr_handle=%lx lkey=%x rkey=%x", + rc, mr_handle->handle, *lkey, *rkey); +#endif /* EHCA_USE_HCALL */ + + return rc; +} + +static inline u64 hipz_h_alloc_resource_mw(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_pfmw *pfmw, + struct ehca_pfshca *pfshca, + const struct ipz_pd pd, + struct ipz_mrmw_handle *mw_handle, + u32 * rkey) +{ + u64 rc = H_Success; + u64 dummy; + u64 rkey_out; + + EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p pd=%x pfshca=%p", + hcp_adapter_handle.handle, pfmw, pd.value, pfshca); + +#ifndef EHCA_USE_HCALL + + rc = simp_hcz_h_alloc_resource_mw(hcp_adapter_handle, pfmw, pfshca, pd, + (struct hcz_mrmw_handle *)mw_handle, + rkey); + EDEB_EX(7, "rc=%lx mw_handle.mrwpte=%p mw_handle.page_index=%x rkey=%x", + rc, mw_handle->mrwpte, mw_handle->page_index, *rkey); +#else + rc = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + 6, /* r5 */ + pd.value, /* r6 */ + 0, 0, 0, 0, + &mw_handle->handle, /* r4 */ + &dummy, /* r5 */ + &dummy, /* r6 */ + &rkey_out, /* r7 */ + &dummy, + &dummy, + &dummy); + *rkey = (u32) rkey_out; + + EDEB_EX(7, "rc=%lx mw_handle=%lx rkey=%x", + rc, mw_handle->handle, *rkey); +#endif /* EHCA_USE_HCALL */ + return rc; +} + +static inline u64 hipz_h_query_mw(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_pfmw *pfmw, + const struct ipz_mrmw_handle + *mw_handle, + u32 * rkey, + struct ipz_pd *pd) +{ + u64 rc = H_Success; + u64 dummy; + u64 pd_out; + u64 
rkey_out; + +#ifndef EHCA_USE_HCALL + EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle.mrwpte=%p" + " mw_handle.page_index=%x", + hcp_adapter_handle.handle, pfmw, mw_handle->mrwpte, + mw_handle->page_index); + + rc = simp_hcz_h_query_mw(hcp_adapter_handle, pfmw, mw_handle, rkey, pd); + + EDEB_EX(7, "rc=%lx rkey=%x pd=%x", rc, *rkey, pd->value); +#else + EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle=%lx", + hcp_adapter_handle.handle, pfmw, mw_handle->handle); + + rc = plpar_hcall_7arg_7ret(H_QUERY_MW, + hcp_adapter_handle.handle, /* r4 */ + mw_handle->handle, /* r5 */ + 0, 0, 0, 0, 0, + &dummy, /* r4 */ + &dummy, /* r5 */ + &dummy, /* r6 */ + &rkey_out, /* r7 */ + &pd_out, /* r8 */ + &dummy, + &dummy); + *rkey = (u32) rkey_out; + pd->value = (u32) pd_out; + + EDEB_EX(7, "rc=%lx rkey=%x pd=%x", rc, *rkey, pd->value); +#endif /* EHCA_USE_HCALL */ + + return rc; +} + +static inline u64 hipz_h_free_resource_mw(const struct ipz_adapter_handle + hcp_adapter_handle, + struct ehca_pfmw *pfmw, + const struct ipz_mrmw_handle + *mw_handle) +{ + u64 rc = H_Success; + u64 dummy; + +#ifndef EHCA_USE_HCALL + EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle.mrwpte=%p" + " mw_handle.page_index=%x", + hcp_adapter_handle.handle, pfmw, mw_handle->mrwpte, + mw_handle->page_index); + + rc = simp_hcz_h_free_resource_mw(hcp_adapter_handle, pfmw, mw_handle); +#else + EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle=%lx", + hcp_adapter_handle.handle, pfmw, mw_handle->handle); + + rc = plpar_hcall_7arg_7ret(H_FREE_RESOURCE, + hcp_adapter_handle.handle, /* r4 */ + mw_handle->handle, /* r5 */ + 0, 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); +#endif /* EHCA_USE_HCALL */ + EDEB_EX(7, "rc=%lx", rc); + + return rc; +} + +static inline u64 hipz_h_error_data(const struct ipz_adapter_handle + adapter_handle, + const u64 ressource_handle, + void *rblock, + unsigned long *byte_count) +{ + u64 rc = H_Success; + u64 dummy; + u64 r_cb; + + 
EDEB_EN(7, "adapter_handle=%lx ressource_handle=%lx rblock=%p", + adapter_handle.handle, ressource_handle, rblock); + + if ((((u64)rblock) & 0xfff) != 0) { + EDEB_ERR(4, "rblock not page aligned."); + rc = H_Parameter; + return rc; + } + + r_cb = ehca_kv_to_g(rblock); + + rc = plpar_hcall_7arg_7ret(H_ERROR_DATA, + adapter_handle.handle, + ressource_handle, + r_cb, + 0, 0, 0, 0, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy, + &dummy); + + EDEB_EX(7, "rc=%lx", rc); + + return rc; +} + +#endif /* __HCP_IF_H__ */ From rolandd at cisco.com Sat Feb 18 11:57:14 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:14 -0800 Subject: [PATCH 04/22] OF adapter probing In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005712.13620.82908.stgit@localhost.localdomain> From: Roland Dreier hipz_probe_adapters() looks a little funny -- it seems to bail out of all the remaining adapters if one of them isn't quite right. --- drivers/infiniband/hw/ehca/hcp_sense.c | 144 ++++++++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/hcp_sense.h | 136 ++++++++++++++++++++++++++++++ 2 files changed, 280 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/hcp_sense.c b/drivers/infiniband/hw/ehca/hcp_sense.c new file mode 100644 index 0000000..83fa4a3 --- /dev/null +++ b/drivers/infiniband/hw/ehca/hcp_sense.c @@ -0,0 +1,144 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * ehca detection and query code for POWER + * + * Authors: Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. 
+ * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: hcp_sense.c,v 1.10 2006/02/06 10:17:34 schickhj Exp $ + */ + +#define DEB_PREFIX "snse" + +#include "ehca_kernel.h" +#include "ehca_tools.h" + +int hipz_count_adapters(void) +{ + int num = 0; + struct device_node *dn = NULL; + + EDEB_EN(7, ""); + + while ((dn = of_find_node_by_name(dn, "lhca"))) { + num++; + } + + of_node_put(dn); + + if (num == 0) { + EDEB_ERR(4, "No lhca node name was found in the" + " Open Firmware device tree."); + return -ENODEV; + } + + EDEB(6, " ... 
found %x adapter(s)", num); + + EDEB_EX(7, "num=%x", num); + + return num; +} + +int hipz_probe_adapters(char **adapter_list) +{ + int ret = 0; + int num = 0; + struct device_node *dn = NULL; + char *loc; + + EDEB_EN(7, "adapter_list=%p", adapter_list); + + while ((dn = of_find_node_by_name(dn, "lhca"))) { + loc = get_property(dn, "ibm,loc-code", NULL); + if (loc == NULL) { + EDEB_ERR(4, "No ibm,loc-code property for" + " lhca Open Firmware device tree node."); + ret = -ENODEV; + goto probe_adapters0; + } + + adapter_list[num] = loc; + EDEB(6, " ... found adapter[%x] with loc-code: %s", num, loc); + num++; + } + + probe_adapters0: + of_node_put(dn); + + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +u64 hipz_get_adapter_handle(char *adapter) +{ + struct device_node *dn = NULL; + char *loc; + u64 *u64data = NULL; + u64 ret = 0; + + EDEB_EN(7, "adapter=%p", adapter); + + while ((dn = of_find_node_by_name(dn, "lhca"))) { + loc = get_property(dn, "ibm,loc-code", NULL); + if (loc == NULL) { + EDEB_ERR(4, "No ibm,loc-code property for" + " lhca Open Firmware device tree node."); + goto get_adapter_handle0; + } + + if (strcmp(loc, adapter) == 0) { + u64data = + (u64 *) get_property(dn, "ibm,hca-handle", NULL); + break; + } + } + + if (u64data == NULL) { + EDEB_ERR(4, "No ibm,hca-handle property for" + " lhca Open Firmware device tree node with" + " ibm,loc-code: %s.", adapter); + goto get_adapter_handle0; + } + + ret = *u64data; + + get_adapter_handle0: + of_node_put(dn); + + EDEB_EX(7, "ret=%lx",ret); + + return ret; +} diff --git a/drivers/infiniband/hw/ehca/hcp_sense.h b/drivers/infiniband/hw/ehca/hcp_sense.h new file mode 100644 index 0000000..a49040b --- /dev/null +++ b/drivers/infiniband/hw/ehca/hcp_sense.h @@ -0,0 +1,136 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * ehca detection and query code for POWER + * + * Authors: Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. 
+ * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: hcp_sense.h,v 1.11 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef HCP_SENSE_H +#define HCP_SENSE_H + +int hipz_count_adapters(void); +int hipz_probe_adapters(char **adapter_list); +u64 hipz_get_adapter_handle(char *adapter); + +/* query hca response block */ +struct query_hca_rblock { + u32 cur_reliable_dg; + u32 cur_qp; + u32 cur_cq; + u32 cur_eq; + u32 cur_mr; + u32 cur_mw; + u32 cur_ee_context; + u32 cur_mcast_grp; + u32 cur_qp_attached_mcast_grp; + u32 reserved1; + u32 cur_ipv6_qp; + u32 cur_eth_qp; + u32 cur_hp_mr; + u32 reserved2[3]; + u32 max_rd_domain; + u32 max_qp; + u32 max_cq; + u32 max_eq; + u32 max_mr; + u32 max_hp_mr; + u32 max_mw; + u32 max_mrwpte; + u32 max_special_mrwpte; + u32 max_rd_ee_context; + u32 max_mcast_grp; + u32 max_qps_attached_all_mcast_grp; + u32 max_qps_attached_mcast_grp; + u32 max_raw_ipv6_qp; + u32 max_raw_ethy_qp; + u32 internal_clock_frequency; + u32 max_pd; + u32 max_ah; + u32 max_cqe; + u32 max_wqes_wq; + u32 max_partitions; + u32 max_rr_ee_context; + u32 max_rr_qp; + u32 max_rr_hca; + u32 max_act_wqs_ee_context; + u32 max_act_wqs_qp; + u32 max_sge; + u32 max_sge_rd; + u32 memory_page_size_supported; + u64 max_mr_size; + u32 local_ca_ack_delay; + u32 num_ports; + u32 vendor_id; + u32 vendor_part_id; + u32 hw_ver; + u64 node_guid; + u64 hca_cap_indicators; + u32 data_counter_register_size; + u32 max_shared_rq; + u32 max_isns_eq; + u32 max_neq; +} __attribute__ ((packed)); + +/* query port response block */ +struct query_port_rblock { + u32 state; + u32 bad_pkey_cntr; + u32 lmc; + u32 lid; + u32 subnet_timeout; + u32 qkey_viol_cntr; + u32 sm_sl; + u32 sm_lid; + u32 capability_mask; + u32 init_type_reply; + u32 pkey_tbl_len; + u32 gid_tbl_len; + u64 gid_prefix; + u32 port_nr; + u16 pkey_entries[16]; + u8 reserved1[32]; + u32 trent_size; + u32 trbuf_size; + u64 max_msg_sz; + u32 max_mtu; + u32 vl_cap; + u8 reserved2[1900]; + u64 guid_entries[255]; +} __attribute__ ((packed)); + +#endif From rolandd at cisco.com 
Sat Feb 18 11:57:17 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:17 -0800 Subject: [PATCH 05/22] HW register abstractions In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005717.13620.85161.stgit@localhost.localdomain> From: Roland Dreier Does hipz_structs.h really need a whole file to hold 5 #defines? --- drivers/infiniband/hw/ehca/hipz_fns.h | 83 ++++++ drivers/infiniband/hw/ehca/hipz_fns_core.h | 123 +++++++++ drivers/infiniband/hw/ehca/hipz_hw.h | 382 ++++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/hipz_structs.h | 54 ++++ 4 files changed, 642 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/hipz_fns.h b/drivers/infiniband/hw/ehca/hipz_fns.h new file mode 100644 index 0000000..4231b65 --- /dev/null +++ b/drivers/infiniband/hw/ehca/hipz_fns.h @@ -0,0 +1,83 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * HW abstraction register functions + * + * Authors: Christoph Raisch + * Reinhard Ernst + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: hipz_fns.h,v 1.15 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __HIPZ_FNS_H__ +#define __HIPZ_FNS_H__ + +#include "hipz_structs.h" +#include "ehca_classes.h" +#include "hipz_hw.h" +#ifndef EHCA_USE_HCALL +#include "sim_gal.h" +#endif + +#include "hipz_fns_core.h" + +#define hipz_galpa_store_eq(gal,offset,value)\ + hipz_galpa_store(gal,EQTEMM_OFFSET(offset),value) +#define hipz_galpa_load_eq(gal,offset)\ + hipz_galpa_load(gal,EQTEMM_OFFSET(offset)) + +#define hipz_galpa_store_qped(gal,offset,value)\ + hipz_galpa_store(gal,QPEDMM_OFFSET(offset),value) +#define hipz_galpa_load_qped(gal,offset)\ + hipz_galpa_load(gal,QPEDMM_OFFSET(offset)) + +#define hipz_galpa_store_mrmw(gal,offset,value)\ + hipz_galpa_store(gal,MRMWMM_OFFSET(offset),value) +#define hipz_galpa_load_mrmw(gal,offset)\ + hipz_galpa_load(gal,MRMWMM_OFFSET(offset)) + +inline static void hipz_load_FEC(struct ehca_cq_core *cq_core, u32 * count) +{ + uint64_t reg = 0; + EDEB_EN(7, "cq_core=%p", cq_core); + { + struct h_galpa gal = cq_core->galpas.kernel; + reg = hipz_galpa_load_cq(gal, CQx_FEC); + *count = EHCA_BMASK_GET(CQx_FEC_CQE_cnt, reg); + } + EDEB_EX(7,"cq_core=%p CQx_FEC=%lx", cq_core,reg); +} + 
+#endif /* __HIPZ_FNS_H__ */
diff --git a/drivers/infiniband/hw/ehca/hipz_fns_core.h b/drivers/infiniband/hw/ehca/hipz_fns_core.h
new file mode 100644
index 0000000..a60b808
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hipz_fns_core.h
@@ -0,0 +1,123 @@
+/*
+ * IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ * HW abstraction register functions
+ *
+ * Authors: Christoph Raisch
+ * Reinhard Ernst
+ * Hoang-Nam Nguyen
+ *
+ * Copyright (c) 2005 IBM Corporation
+ *
+ * All rights reserved.
+ *
+ * This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ * BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ * + * $Id: hipz_fns_core.h,v 1.10 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __HIPZ_FNS_CORE_H__ +#define __HIPZ_FNS_CORE_H__ + +#include "ehca_galpa.h" +#include "hipz_hw.h" + +#define hipz_galpa_store_cq(gal,offset,value)\ + hipz_galpa_store(gal,CQTEMM_OFFSET(offset),value) +#define hipz_galpa_load_cq(gal,offset)\ + hipz_galpa_load(gal,CQTEMM_OFFSET(offset)) + +#define hipz_galpa_store_qp(gal,offset,value)\ + hipz_galpa_store(gal,QPTEMM_OFFSET(offset),value) +#define hipz_galpa_load_qp(gal,offset)\ + hipz_galpa_load(gal,QPTEMM_OFFSET(offset)) + +inline static void hipz_update_SQA(struct ehca_qp_core *qp_core, u16 nr_wqes) +{ + struct h_galpa gal; + + EDEB_EN(7, "qp_core=%p", qp_core); + gal = qp_core->galpas.kernel; + /* ringing doorbell :-) */ + hipz_galpa_store_qp(gal, QPx_SQA, EHCA_BMASK_SET(QPx_SQAdder, nr_wqes)); + EDEB_EX(7, "qp_core=%p QPx_SQA = %i", qp_core, nr_wqes); +} + +inline static void hipz_update_RQA(struct ehca_qp_core *qp_core, u16 nr_wqes) +{ + struct h_galpa gal; + + EDEB_EN(7, "qp_core=%p", qp_core); + gal = qp_core->galpas.kernel; + /* ringing doorbell :-) */ + hipz_galpa_store_qp(gal, QPx_RQA, EHCA_BMASK_SET(QPx_RQAdder, nr_wqes)); + EDEB_EX(7, "qp_core=%p QPx_RQA = %i", qp_core, nr_wqes); +} + +inline static void hipz_update_FECA(struct ehca_cq_core *cq_core, u32 nr_cqes) +{ + struct h_galpa gal; + + EDEB_EN(7, "cq_core=%p", cq_core); + gal = cq_core->galpas.kernel; + hipz_galpa_store_cq(gal, CQx_FECA, + EHCA_BMASK_SET(CQx_FECAdder, nr_cqes)); + EDEB_EX(7, "cq_core=%p CQx_FECA = %i", cq_core, nr_cqes); +} + +inline static void hipz_set_CQx_N0(struct ehca_cq_core *cq_core, u32 value) +{ + struct h_galpa gal; + u64 CQx_N0_reg = 0; + + EDEB_EN(7, "cq_core=%p event on solicited completion -- write CQx_N0", + cq_core); + gal = cq_core->galpas.kernel; + hipz_galpa_store_cq(gal, CQx_N0, + EHCA_BMASK_SET(CQx_N0_generate_solicited_comp_event, + value)); + CQx_N0_reg = hipz_galpa_load_cq(gal, CQx_N0); + EDEB_EX(7, "cq_core=%p loaded 
CQx_N0=%lx", cq_core,(unsigned long)CQx_N0_reg);
+}
+
+inline static void hipz_set_CQx_N1(struct ehca_cq_core *cq_core, u32 value)
+{
+	struct h_galpa gal;
+	u64 CQx_N1_reg = 0;
+
+	EDEB_EN(7, "cq_core=%p event on completion -- write CQx_N1",
+		cq_core);
+	gal = cq_core->galpas.kernel;
+	hipz_galpa_store_cq(gal, CQx_N1,
+			    EHCA_BMASK_SET(CQx_N1_generate_comp_event, value));
+	CQx_N1_reg = hipz_galpa_load_cq(gal, CQx_N1);
+	EDEB_EX(7, "cq_core=%p loaded CQx_N1=%lx", cq_core,(unsigned long)CQx_N1_reg);
+}
+
+#endif /* __HIPZ_FNS_CORE_H__ */
diff --git a/drivers/infiniband/hw/ehca/hipz_hw.h b/drivers/infiniband/hw/ehca/hipz_hw.h
new file mode 100644
index 0000000..6fa005b
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hipz_hw.h
@@ -0,0 +1,382 @@
+/*
+ * IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ * eHCA register definitions
+ *
+ * Authors: Christoph Raisch
+ * Reinhard Ernst
+ * Waleri Fomin
+ *
+ * Copyright (c) 2005 IBM Corporation
+ *
+ * All rights reserved.
+ *
+ * This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ * BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: hipz_hw.h,v 1.7 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __HIPZ_HW_H__ +#define __HIPZ_HW_H__ + +#ifdef __KERNEL__ +#include "ehca_tools.h" +#include "ehca_kernel.h" +#else /* !__KERNEL__ */ +#include "ehca_utools.h" +#endif + +/** @brief Queue Pair Table Memory + */ +struct hipz_QPTEMM { + u64 QPx_HCR; +#define QPx_HCR_PKEY_Mode EHCA_BMASK_IBM(1,2) +#define QPx_HCR_Special_QP_Mode EHCA_BMASK_IBM(6,7) + u64 QPx_C; +#define QPx_C_Enabled EHCA_BMASK_IBM(0,0) +#define QPx_C_Disabled EHCA_BMASK_IBM(1,1) +#define QPx_C_Req_State EHCA_BMASK_IBM(16,23) +#define QPx_C_Res_State EHCA_BMASK_IBM(25,31) +#define QPx_C_disable_ETE_check EHCA_BMASK_IBM(7,7) + u64 QPx_HERR; + u64 QPx_AER; +/* 0x20*/ + u64 QPx_SQA; +#define QPx_SQAdder EHCA_BMASK_IBM(48,63) + u64 QPx_SQC; + u64 QPx_RQA; +#define QPx_RQAdder EHCA_BMASK_IBM(48,63) + u64 QPx_RQC; +/* 0x40*/ + u64 QPx_ST; + u64 QPx_PMSTATE; +#define QPx_PMSTATE_BITS EHCA_BMASK_IBM(30,31) + u64 QPx_PMFA; + u64 QPx_PKEY; +#define QPx_PKEY_value EHCA_BMASK_IBM(48,63) +/* 0x60*/ + u64 QPx_PKEYA; +#define QPx_PKEYA_index0 EHCA_BMASK_IBM(0,15) +#define QPx_PKEYA_index1 EHCA_BMASK_IBM(16,31) +#define QPx_PKEYA_index2 EHCA_BMASK_IBM(32,47) +#define QPx_PKEYA_index3 EHCA_BMASK_IBM(48,63) + u64 QPx_PKEYB; +#define QPx_PKEYB_index4 EHCA_BMASK_IBM(0,15) +#define QPx_PKEYB_index5 EHCA_BMASK_IBM(16,31) +#define QPx_PKEYB_index6 EHCA_BMASK_IBM(32,47) +#define QPx_PKEYB_index7 
EHCA_BMASK_IBM(48,63) + u64 QPx_PKEYC; +#define QPx_PKEYC_index8 EHCA_BMASK_IBM(0,15) +#define QPx_PKEYC_index9 EHCA_BMASK_IBM(16,31) +#define QPx_PKEYC_index10 EHCA_BMASK_IBM(32,47) +#define QPx_PKEYC_index11 EHCA_BMASK_IBM(48,63) + u64 QPx_PKEYD; +#define QPx_PKEYD_index12 EHCA_BMASK_IBM(0,15) +#define QPx_PKEYD_index13 EHCA_BMASK_IBM(16,31) +#define QPx_PKEYD_index14 EHCA_BMASK_IBM(32,47) +#define QPx_PKEYD_index15 EHCA_BMASK_IBM(48,63) +/* 0x80*/ + u64 QPx_QKEY; +#define QPx_QKEY_value EHCA_BMASK_IBM(32,63) + u64 QPx_DQP; +#define QPx_DQP_number EHCA_BMASK_IBM(40,63) + u64 QPx_DLIDP; +#define QPx_DLID_PRIMARY EHCA_BMASK_IBM(48,63) +#define QPx_DLIDP_GRH EHCA_BMASK_IBM(31,31) + u64 QPx_PORTP; +#define QPx_PORT_Primary EHCA_BMASK_IBM(57,63) +/* 0xa0*/ + u64 QPx_SLIDP; +#define QPx_SLIDP_p_path EHCA_BMASK_IBM(48,63) +#define QPx_SLIDP_lmc EHCA_BMASK_IBM(37,39) + u64 QPx_SLIDPP; +#define QPx_SLID_PRIM_PATH EHCA_BMASK_IBM(57,63) + u64 QPx_DLIDA; +#define QPx_DLIDA_GRH EHCA_BMASK_IBM(31,31) + u64 QPx_PORTA; +#define QPx_PORT_Alternate EHCA_BMASK_IBM(57,63) +/* 0xc0*/ + u64 QPx_SLIDA; + u64 QPx_SLIDPA; + u64 QPx_SLVL; +#define QPx_SLVL_BITS EHCA_BMASK_IBM(56,59) +#define QPx_SLVL_VL EHCA_BMASK_IBM(60,63) + u64 QPx_IPD; +#define QPx_IPD_max_static_rate EHCA_BMASK_IBM(56,63) +/* 0xe0*/ + u64 QPx_MTU; +#define QPx_MTU_size EHCA_BMASK_IBM(56,63) + u64 QPx_LATO; +#define QPx_LATO_BITS EHCA_BMASK_IBM(59,63) + u64 QPx_RLIMIT; +#define QPx_RETRY_COUNT EHCA_BMASK_IBM(61,63) + u64 QPx_RNRLIMIT; +#define QPx_RNR_RETRY_COUNT EHCA_BMASK_IBM(61,63) +/* 0x100*/ + u64 QPx_T; + u64 QPx_SQHP; + u64 QPx_SQPTP; + u64 QPx_NSPSN; +#define QPx_NSPSN_value EHCA_BMASK_IBM(40,63) +/* 0x120*/ + u64 QPx_NSPSNHWM; +#define QPx_NSPSNHWM_value EHCA_BMASK_IBM(40,63) + u64 reserved1; + u64 QPx_SDSI; + u64 QPx_SDSBC; +/* 0x140*/ + u64 QPx_SQWSIZE; +#define QPx_SQWSIZE_value EHCA_BMASK_IBM(61,63) + u64 QPx_SQWTS; + u64 QPx_LSN; + u64 QPx_NSSN; +/* 0x160 */ + u64 QPx_MOR; +#define QPx_MOR_value 
EHCA_BMASK_IBM(48,63) + u64 QPx_COR; + u64 QPx_SQSIZE; +#define QPx_SQSIZE_value EHCA_BMASK_IBM(60,63) + u64 QPx_ERC; +/* 0x180*/ + u64 QPx_RNRRC; +#define QPx_RNRRESP_value EHCA_BMASK_IBM(59,63) + u64 QPx_ERNRWT; + u64 QPx_RNRRESP; +#define QPx_RNRRESP_WTR EHCA_BMASK_IBM(59,63) + u64 QPx_LMSNA; +/* 0x1a0 */ + u64 QPx_SQHPC; + u64 QPx_SQCPTP; + u64 QPx_SIGT; + u64 QPx_WQECNT; +/* 0x1c0*/ + + u64 QPx_RQHP; + u64 QPx_RQPTP; + u64 QPx_RQSIZE; +#define QPx_RQSIZE_value EHCA_BMASK_IBM(60,63) + u64 QPx_NRR; +#define QPx_NRR_value EHCA_BMASK_IBM(61,63) +/* 0x1e0*/ + u64 QPx_RDMAC; +#define QPx_RDMAC_value EHCA_BMASK_IBM(61,63) + u64 QPx_NRPSN; +#define QPx_NRPSN_value EHCA_BMASK_IBM(40,63) + u64 QPx_LAPSN; +#define QPx_LAPSN_value EHCA_BMASK_IBM(40,63) + u64 QPx_LCR; +/* 0x200*/ + u64 QPx_RWC; + u64 QPx_RWVA; + u64 QPx_RDSI; + u64 QPx_RDSBC; +/* 0x220*/ + u64 QPx_RQWSIZE; +#define QPx_RQWSIZE_value EHCA_BMASK_IBM(61,63) + u64 QPx_CRMSN; + u64 QPx_RDD; +#define QPx_RDD_VALUE EHCA_BMASK_IBM(32,63) + u64 QPx_LARPSN; +#define QPx_LARPSN_value EHCA_BMASK_IBM(40,63) +/* 0x240*/ + u64 QPx_PD; + u64 QPx_SCQN; + u64 QPx_RCQN; + u64 QPx_AEQN; +/* 0x260*/ + u64 QPx_AAELOG; + u64 QPx_RAM; + u64 QPx_RDMAQE0; + u64 QPx_RDMAQE1; +/* 0x280*/ + u64 QPx_RDMAQE2; + u64 QPx_RDMAQE3; + u64 QPx_NRPSNHWM; +#define QPx_NRPSNHWM_value EHCA_BMASK_IBM(40,63) +/* 0x298*/ + u64 reserved[(0x400 - 0x298) / 8]; +/* 0x400 extended data */ + u64 reserved_ext[(0x500 - 0x400) / 8]; +/* 0x500 */ + u64 reserved2[(0x1000 - 0x500) / 8]; +/* 0x1000 */ +}; + +#define QPTEMM_OFFSET(x) offsetof(struct hipz_QPTEMM,x) + +/** @brief MRMWPT Entry Memory Map + */ +struct hipz_MRMWMM { + /* 0x00 */ + u64 MRx_HCR; +#define MRx_HCR_LPARID_VALID EHCA_BMASK_IBM(0,0) + + u64 MRx_C; + u64 MRx_HERR; + u64 MRx_AER; + /* 0x20 */ + u64 MRx_PP; + u64 reserved1; + u64 reserved2; + u64 reserved3; + /* 0x40 */ + u64 reserved4[(0x200 - 0x40) / 8]; + /* 0x200 */ + u64 MRx_CTL[64]; + +}; + +#define MRMWMM_OFFSET(x) offsetof(struct 
hipz_MRMWMM,x) + +/** @brief QPEDMM + */ +struct hipz_QPEDMM { + /* 0x00 */ + u64 reserved0[(0x400) / 8]; + /* 0x400 */ + u64 QPEDx_PHH; +#define QPEDx_PHH_TClass EHCA_BMASK_IBM(4,11) +#define QPEDx_PHH_HopLimit EHCA_BMASK_IBM(56,63) +#define QPEDx_PHH_FlowLevel EHCA_BMASK_IBM(12,31) + u64 QPEDx_PPSGP; +#define QPEDx_PPSGP_PPPidx EHCA_BMASK_IBM(0,63) + /* 0x410 */ + u64 QPEDx_PPSGU; +#define QPEDx_PPSGU_PPPSGID EHCA_BMASK_IBM(0,63) + u64 QPEDx_PPDGP; + /* 0x420 */ + u64 QPEDx_PPDGU; + u64 QPEDx_APH; + /* 0x430 */ + u64 QPEDx_APSGP; + u64 QPEDx_APSGU; + /* 0x440 */ + u64 QPEDx_APDGP; + u64 QPEDx_APDGU; + /* 0x450 */ + u64 QPEDx_APAV; + u64 QPEDx_APSAV; + /* 0x460 */ + u64 QPEDx_HCR; + u64 reserved1[4]; + /* 0x488 */ + u64 QPEDx_RRL0; + /* 0x490 */ + u64 QPEDx_RRRKEY0; + u64 QPEDx_RRVA0; + /* 0x4A0 */ + u64 reserved2; + u64 QPEDx_RRL1; + /* 0x4B0 */ + u64 QPEDx_RRRKEY1; + u64 QPEDx_RRVA1; + /* 0x4C0 */ + u64 reserved3; + u64 QPEDx_RRL2; + /* 0x4D0 */ + u64 QPEDx_RRRKEY2; + u64 QPEDx_RRVA2; + /* 0x4E0 */ + u64 reserved4; + u64 QPEDx_RRL3; + /* 0x4F0 */ + u64 QPEDx_RRRKEY3; + u64 QPEDx_RRVA3; +}; + +#define QPEDMM_OFFSET(x) offsetof(struct hipz_QPEDMM,x) + +/** @brief CQ Table Entry Memory Map + */ +struct hipz_CQTEMM { + u64 CQx_HCR; +#define CQx_HCR_LPARID_valid EHCA_BMASK_IBM(0,0) + u64 CQx_C; +#define CQx_C_Enable EHCA_BMASK_IBM(0,0) +#define CQx_C_Disable_Complete EHCA_BMASK_IBM(1,1) +#define CQx_C_Error_Reset EHCA_BMASK_IBM(23,23) + u64 CQx_HERR; + u64 CQx_AER; +/* 0x20 */ + u64 CQx_PTP; + u64 CQx_TP; +#define CQx_FEC_CQE_cnt EHCA_BMASK_IBM(32,63) + u64 CQx_FEC; + u64 CQx_FECA; +#define CQx_FECAdder EHCA_BMASK_IBM(32,63) +/* 0x40 */ + u64 CQx_EP; +#define CQx_EP_Event_Pending EHCA_BMASK_IBM(0,0) +#define CQx_EQ_number EHCA_BMASK_IBM(0,15) +#define CQx_EQ_CQtoken EHCA_BMASK_IBM(32,63) + u64 CQx_EQ; +/* 0x50 */ + u64 reserved1; + u64 CQx_N0; +#define CQx_N0_generate_solicited_comp_event EHCA_BMASK_IBM(0,0) +/* 0x60 */ + u64 CQx_N1; +#define 
CQx_N1_generate_comp_event EHCA_BMASK_IBM(0,0) + u64 reserved2[(0x1000 - 0x60) / 8]; +/* 0x1000 */ +}; + +#define CQTEMM_OFFSET(x) offsetof(struct hipz_CQTEMM,x) + +/** @brief EQ Table Entry Memory Map + */ +struct hipz_EQTEMM { + u64 EQx_HCR; +#define EQx_HCR_LPARID_valid EHCA_BMASK_IBM(0,0) +#define EQx_HCR_ENABLE_PSB EHCA_BMASK_IBM(8,8) + u64 EQx_C; +#define EQx_C_Enable EHCA_BMASK_IBM(0,0) +#define EQx_C_Error_Reset EHCA_BMASK_IBM(23,23) +#define EQx_C_Comp_Event EHCA_BMASK_IBM(17,17) + + u64 EQx_HERR; + u64 EQx_AER; +/* 0x20 */ + u64 EQx_PTP; + u64 EQx_TP; + u64 EQx_SSBA; + u64 EQx_PSBA; + +/* 0x40 */ + u64 EQx_CEC; + u64 EQx_MEQL; + u64 EQx_XISBI; + u64 EQx_XISC; +/* 0x60 */ + u64 EQx_IT; + +}; +#define EQTEMM_OFFSET(x) offsetof(struct hipz_EQTEMM,x) + +#endif diff --git a/drivers/infiniband/hw/ehca/hipz_structs.h b/drivers/infiniband/hw/ehca/hipz_structs.h new file mode 100644 index 0000000..bd2dcad --- /dev/null +++ b/drivers/infiniband/hw/ehca/hipz_structs.h @@ -0,0 +1,54 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Infiniband Firmware structure definition + * + * Authors: Waleri Fomin + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. 
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $Id: hipz_structs.h,v 1.8 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __HIPZ_STRUCTS_H__
+#define __HIPZ_STRUCTS_H__
+
+/* access control defines for MR/MW */
+#define HIPZ_ACCESSCTRL_L_WRITE 0x00800000
+#define HIPZ_ACCESSCTRL_R_WRITE 0x00400000
+#define HIPZ_ACCESSCTRL_R_READ 0x00200000
+#define HIPZ_ACCESSCTRL_R_ATOMIC 0x00100000
+#define HIPZ_ACCESSCTRL_MW_BIND 0x00080000
+
+#endif /* __HIPZ_STRUCTS_H__ */

From rolandd at cisco.com Sat Feb 18 11:57:27 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:27 -0800
Subject: [PATCH 10/22] ehca IRQ handling
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005727.13620.58832.stgit@localhost.localdomain>

From: Roland Dreier

Where is the irq_count field of struct ehca_irq_info ever used? I
couldn't find it used anywhere, so it can be deleted.

The logic in ehca_interrupt_eq() is too convoluted for me to follow;
there are two nested while () {} loops inside a do {} while () loop,
and ehca_poll_eq() is called in three different places. Is there any
way to untangle this?
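
One possible shape for an untangled loop, as a rough sketch only: drain
the queue from a single poll site, then ask once whether the interrupt
is still asserted, and repeat. Everything named mock_* below is a
hypothetical stand-in written for this illustration (a plain array
plays the event queue, and MOCK_EQE_COMPLETION is an arbitrary flag
bit); none of it is the driver's real API, and the sketch is not
claimed to be behaviour-identical to the posted function. For example,
it queries the interrupt state even when the first poll finds nothing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit, standing in for EQE_COMPLETION_EVENT. */
#define MOCK_EQE_COMPLETION (1ULL << 62)

/* Hypothetical event queue: entries[next..visible) are ready to poll. */
struct mock_eq {
	const uint64_t *entries;
	size_t count;	/* total entries that will ever arrive */
	size_t visible;	/* entries that have arrived so far */
	size_t next;	/* next entry to hand out */
};

struct mock_stats {
	int completions;
	int others;
};

/* Stand-in for ehca_poll_eq(): next ready entry, or NULL if none. */
static const uint64_t *mock_poll_eq(struct mock_eq *eq)
{
	if (eq->next < eq->visible)
		return &eq->entries[eq->next++];
	return NULL;
}

/*
 * Stand-in for hipz_h_query_int_state(): reports 1 if more entries
 * arrived while we were draining, and makes them visible.
 */
static int mock_query_int_state(struct mock_eq *eq)
{
	if (eq->visible < eq->count) {
		eq->visible = eq->count;
		return 1;
	}
	return 0;
}

/* Per-entry dispatch, kept out of the loop so the loop stays short. */
static void mock_handle_eqe(uint64_t eqe, struct mock_stats *stats)
{
	if (eqe & MOCK_EQE_COMPLETION)
		stats->completions++;
	else
		stats->others++;
}

/* One poll site, one exit test; returns the number of entries handled. */
static int mock_drain_eq(struct mock_eq *eq, int hw_level,
			 struct mock_stats *stats)
{
	const uint64_t *eqe;
	int handled = 0;

	for (;;) {
		while ((eqe = mock_poll_eq(eq)) != NULL) {
			mock_handle_eqe(*eqe, stats);
			handled++;
		}
		/* Only hw_level >= 2 re-checks the interrupt state. */
		if (hw_level < 2 || !mock_query_int_state(eq))
			break;
	}

	return handled;
}
```

Moving the per-entry work into its own helper would also give a natural
place to check the idr_find() result for NULL before it is passed on.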
--- drivers/infiniband/hw/ehca/ehca_irq.c | 436 +++++++++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/ehca_irq.h | 90 +++++++ 2 files changed, 526 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c new file mode 100644 index 0000000..1bba58e --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_irq.c @@ -0,0 +1,436 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Functions for EQs, NEQs and interrupts + * + * Authors: Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_irq.c,v 1.64 2006/02/15 08:15:25 schickhj Exp $ + */ + +#include "ehca_kernel.h" +#include "ehca_irq.h" + +#define DEB_PREFIX "eirq" + +#include "ehca_kernel.h" +#include "ehca_classes.h" +#include "ehca_tools.h" +#include "ehca_eq.h" +#include "ehca_irq.h" +#include "hcp_if.h" + +#define EQE_COMPLETION_EVENT EHCA_BMASK_IBM(1,1) +#define EQE_CQ_QP_NUMBER EHCA_BMASK_IBM(8,31) +#define EQE_EE_IDENTIFIER EHCA_BMASK_IBM(2,7) +#define EQE_CQ_NUMBER EHCA_BMASK_IBM(8,31) +#define EQE_QP_NUMBER EHCA_BMASK_IBM(8,31) +#define EQE_QP_TOKEN EHCA_BMASK_IBM(32,63) +#define EQE_CQ_TOKEN EHCA_BMASK_IBM(32,63) + +#define NEQE_COMPLETION_EVENT EHCA_BMASK_IBM(1,1) +#define NEQE_EVENT_CODE EHCA_BMASK_IBM(2,7) +#define NEQE_PORT_NUMBER EHCA_BMASK_IBM(8,15) +#define NEQE_PORT_AVAILABILITY EHCA_BMASK_IBM(16,16) + +#define ERROR_DATA_LENGTH EHCA_BMASK_IBM(52,63) + +static inline void comp_event_callback(struct ehca_cq *cq) +{ + unsigned long spl_flags = 0; + + EDEB_EN(7, "cq=%p", cq); + + if (cq->ib_cq.comp_handler == NULL) + return; + + spin_lock_irqsave(&cq->cb_lock, spl_flags); + cq->ib_cq.comp_handler(&cq->ib_cq, cq->ib_cq.cq_context); + spin_unlock_irqrestore(&cq->cb_lock, spl_flags); + + EDEB_EX(7, "cq=%p", cq); + + return; +} + +int ehca_error_data(struct ehca_shca *shca, + u64 ressource) +{ + + unsigned long ret = 0; + u64 *rblock; + unsigned long block_count; + + EDEB_EN(7, "ressource=%lx", ressource); + + rblock = 
kmalloc(PAGE_SIZE, GFP_KERNEL); + if (rblock == NULL) { + EDEB_ERR(4, "Cannot allocate rblock memory."); + ret = -ENOMEM; + goto error_data1; + } + + memset(rblock, 0, PAGE_SIZE); + + ret = hipz_h_error_data(shca->ipz_hca_handle, + ressource, + rblock, + &block_count); + + if (ret == H_R_STATE) { + EDEB_ERR(4, "No error data is available: %lx.", ressource); + } + else if (ret == H_Success) { + int length; + + length = EHCA_BMASK_GET(ERROR_DATA_LENGTH, rblock[0]); + + if (length > PAGE_SIZE) + length = PAGE_SIZE; + + EDEB_ERR(4, "Error data is available: %lx.", ressource); + EDEB_ERR(4, "EHCA ----- error data begin " + "---------------------------------------------------"); + EDEB_DMP(4, rblock, length, "ressource=%lx", ressource); + EDEB_ERR(4, "EHCA ----- error data end " + "-----------------------------------------------------"); + } + else { + EDEB_ERR(4, "Error data could not be fetched: %lx", ressource); + } + + kfree(rblock); + + error_data1: + return ret; + +} + +static void qp_event_callback(struct ehca_shca *shca, + u64 eqe, + enum ib_event_type event_type) +{ + struct ib_event event; + struct ehca_qp *qp; + u32 token = EHCA_BMASK_GET(EQE_QP_TOKEN, eqe); + + EDEB_EN(7, "eqe=%lx", eqe); + + down_read(&ehca_qp_idr_sem); + qp = idr_find(&ehca_qp_idr, token); + up_read(&ehca_qp_idr_sem); + + if (qp == NULL) + return; + + if (event_type == IB_EVENT_QP_FATAL) + EDEB_ERR(4, "QP 0x%x (ressource=%lx) has errors.", + qp->ib_qp.qp_num, qp->ipz_qp_handle.handle); + + ehca_error_data(shca, qp->ipz_qp_handle.handle); + + if (qp->ib_qp.event_handler == NULL) + return; + + event.device = &shca->ib_device; + event.event = event_type; + event.element.qp = &qp->ib_qp; + + qp->ib_qp.event_handler(&event, qp->ib_qp.qp_context); + + EDEB_EX(7, "qp=%p", qp); + + return; +} + +static void cq_event_callback(struct ehca_shca *shca, + u64 eqe) +{ + struct ehca_cq *cq; + u32 token = EHCA_BMASK_GET(EQE_CQ_TOKEN, eqe); + + EDEB_EN(7, "eqe=%lx", eqe); + + down_read(&ehca_cq_idr_sem); + 
cq = idr_find(&ehca_cq_idr, token); + up_read(&ehca_cq_idr_sem); + + if (cq == NULL) + return; + + EDEB_ERR(4, "CQ 0x%x (ressource=%lx) has errors.", + cq->cq_number, cq->ipz_cq_handle.handle); + + ehca_error_data(shca, cq->ipz_cq_handle.handle); + + EDEB_EX(7, "cq=%p", cq); + + return; +} + +static void parse_identifier(struct ehca_shca *shca, u64 eqe) +{ + u8 identifier = EHCA_BMASK_GET(EQE_EE_IDENTIFIER, eqe); + + EDEB_EN(7, "shca=%p eqe=%lx", shca, eqe); + + switch (identifier) { + case 0x02: /* path migrated */ + qp_event_callback(shca, eqe, IB_EVENT_PATH_MIG); + break; + case 0x03: /* communication established */ + qp_event_callback(shca, eqe, IB_EVENT_COMM_EST); + break; + case 0x04: /* send queue drained */ + qp_event_callback(shca, eqe, IB_EVENT_SQ_DRAINED); + break; + case 0x05: /* QP error */ + case 0x06: /* QP error */ + qp_event_callback(shca, eqe, IB_EVENT_QP_FATAL); + break; + case 0x07: /* CQ error */ + case 0x08: /* CQ error */ + cq_event_callback(shca, eqe); + break; + case 0x09: /* MRMWPTE error */ + case 0x0A: /* port event */ + case 0x0B: /* MR access error */ + case 0x0C: /* EQ error */ + case 0x0D: /* P/Q_Key mismatch */ + case 0x10: /* sampling complete */ + case 0x11: /* unaffiliated access error */ + case 0x12: /* path migrating error */ + case 0x13: /* interface trace stopped */ + case 0x14: /* first error capture info available */ + default: + EDEB_ERR(4, "Unknown identifier: %x on %s.", + identifier, shca->ib_device.name); + break; + } + + EDEB_EN(7, "eqe=%lx identifier=%x", eqe, identifier); + + return; +} + +static void parse_ec(struct ehca_shca *shca, u64 eqe) +{ + struct ib_event event; + u8 ec = EHCA_BMASK_GET(NEQE_EVENT_CODE, eqe); + u8 port = EHCA_BMASK_GET(NEQE_PORT_NUMBER, eqe); + + EDEB_EN(7, "shca=%p eqe=%lx", shca, eqe); + + switch (ec) { + case 0x30: /* port availability change */ + if (EHCA_BMASK_GET(NEQE_PORT_AVAILABILITY, eqe)) { + EDEB(4, "%s: port %x is active.", + shca->ib_device.name, port); + event.device = 
&shca->ib_device; + event.event = IB_EVENT_PORT_ACTIVE; + event.element.port_num = port; + shca->sport[port - 1].port_state = IB_PORT_ACTIVE; + ib_dispatch_event(&event); + } else { + EDEB(4, "%s: port %x is inactive.", + shca->ib_device.name, port); + event.device = &shca->ib_device; + event.event = IB_EVENT_PORT_ERR; + event.element.port_num = port; + shca->sport[port - 1].port_state = IB_PORT_DOWN; + ib_dispatch_event(&event); + } + break; + case 0x31: + /* port configuration change */ + /* disruptive change is caused by */ + /* LID, PKEY or SM change */ + EDEB(4, "EHCA disruptive port %x " + "configuration change.", port); + + EDEB(4, "%s: port %x is inactive.", + shca->ib_device.name, port); + event.device = &shca->ib_device; + event.event = IB_EVENT_PORT_ERR; + event.element.port_num = port; + shca->sport[port - 1].port_state = IB_PORT_DOWN; + ib_dispatch_event(&event); + + EDEB(4, "%s: port %x is active.", + shca->ib_device.name, port); + event.device = &shca->ib_device; + event.event = IB_EVENT_PORT_ACTIVE; + event.element.port_num = port; + shca->sport[port - 1].port_state = IB_PORT_ACTIVE; + ib_dispatch_event(&event); + break; + case 0x32: /* adapter malfunction */ + case 0x33: /* trace stopped */ + default: + EDEB_ERR(4, "Unknown event code: %x on %s.", + ec, shca->ib_device.name); + break; + } + + EDEB_EN(7, "eqe=%lx ec=%x", eqe, ec); + + return; +} + +static inline void reset_eq_pending(struct ehca_cq *cq) +{ + u64 CQx_EP = 0; + struct h_galpa gal = cq->ehca_cq_core.galpas.kernel; + + EDEB_EN(7, "cq=%p", cq); + + hipz_galpa_store_cq(gal, CQx_EP, 0x0); + CQx_EP = hipz_galpa_load(gal, CQTEMM_OFFSET(CQx_EP)); + EDEB(7, "CQx_EP=%lx", CQx_EP); + + EDEB_EX(7, "cq=%p", cq); + + return; +} + +void ehca_interrupt_eq(void *data) +{ + struct ehca_irq_info *irq_info; + struct ehca_shca *shca; + struct ehca_eqe *eqe; + int int_state; + + EDEB_EN(7, "data=%p", data); + + irq_info = (struct ehca_irq_info *)data; + shca = to_shca(eq); + + do { + eqe = (struct ehca_eqe 
*)ehca_poll_eq(shca, &shca->eq); + + if ((shca->hw_level >= 2) && (eqe != NULL)) + int_state = 1; + else + int_state = 0; + + while ((int_state == 1) || (eqe != 0)) { + while (eqe) { + u64 eqe_value = eqe->entry; + + EDEB(7, "eqe_value=%lx", eqe_value); + + /* TODO: better structure */ + if (EHCA_BMASK_GET(EQE_COMPLETION_EVENT, + eqe_value)) { + extern struct idr ehca_cq_idr; + u32 token; + struct ehca_cq *cq; + + EDEB(7, "... completion event"); + token = + EHCA_BMASK_GET(EQE_CQ_TOKEN, + eqe_value); + down_read(&ehca_cq_idr_sem); + cq = idr_find(&ehca_cq_idr, token); + up_read(&ehca_cq_idr_sem); + reset_eq_pending(cq); + comp_event_callback(cq); + } else { + EDEB(7, "... non completion event"); + parse_identifier(shca, eqe_value); + } + eqe = + (struct ehca_eqe *)ehca_poll_eq(shca, + &shca->eq); + } + + /* TODO: do we need hw_level */ + if (shca->hw_level >= 2) + int_state = + hipz_h_query_int_state(shca->ipz_hca_handle, + irq_info); + eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->eq); + + } + } while (int_state != 0); + + EDEB_EX(7, "shca=%p", shca); + + return; +} + +void ehca_interrupt_neq(void *data) +{ + struct ehca_irq_info *irq_info; + struct ehca_shca *shca; + struct ehca_eqe *eqe; + u64 ret = H_Success; + + EDEB_EN(7, "data=%p", data); + + irq_info = (struct ehca_irq_info *)data; + shca = to_shca(neq); + eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->neq); + + while (eqe) { + if (!EHCA_BMASK_GET(NEQE_COMPLETION_EVENT, eqe->entry)) + parse_ec(shca, eqe->entry); + + eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->neq); + } + + ret = hipz_h_reset_event(shca->ipz_hca_handle, + shca->neq.ipz_eq_handle, 0xFFFFFFFFFFFFFFFF); + + if (ret != H_Success) + EDEB_ERR(4, "Can't clear notification events."); + + EDEB_EX(7, "shca=%p", shca); + + return; +} + +irqreturn_t ehca_interrupt(int irq, void *dev_id, struct pt_regs *regs) +{ + struct ehca_irq_info *info = (struct ehca_irq_info *)dev_id; + + EDEB_EN(7, "dev_id=%p", dev_id); + + queue_work(info->wq, 
info->work); + + EDEB_EX(7, ""); + + return IRQ_HANDLED; +} diff --git a/drivers/infiniband/hw/ehca/ehca_irq.h b/drivers/infiniband/hw/ehca/ehca_irq.h new file mode 100644 index 0000000..43b2e3e --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_irq.h @@ -0,0 +1,90 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Function definitions and structs for EQs, NEQs and interrupts + * + * Authors: Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: ehca_irq.h,v 1.25 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __EHCA_IRQ_H +#define __EHCA_IRQ_H + + +struct ehca_shca; + +#include +#include + +#ifndef EHCA_USERDRIVER +#include +#endif + +#ifndef __KERNEL__ +#define NO_IRQ (-1) +#include +#include +#endif + +#ifndef EHCA_USERDRIVER +#define to_shca(queue) container_of(irq_info->eq, \ + struct ehca_shca, \ + queue) +#else +extern struct ehca_module ehca_module; +#define to_shca(queue) list_entry(ehca_module.shca_list.next, \ + struct ehca_shca, shca_list) +#endif + +struct ehca_irq_info { + __u32 ist; + __u32 irq; + void *eq; + + atomic_t irq_count; + struct workqueue_struct *wq; + struct work_struct *work; + + pid_t pid; +}; + +void ehca_interrupt_eq(void *data); +void ehca_interrupt_neq(void *data); +irqreturn_t ehca_interrupt(int irq, void *dev_id, struct pt_regs *regs); +irqreturn_t ehca_interrupt_direct(int irq, void *dev_id, struct pt_regs *regs); +int ehca_error_data(struct ehca_shca *shca, u64 ressource); + +#endif From rolandd at cisco.com Sat Feb 18 11:57:21 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:21 -0800 Subject: [PATCH 07/22] Hypercall definitions In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005721.13620.84990.stgit@localhost.localdomain> From: Roland Dreier Do these defines belong in the ehca driver, or should they be put somewhere in generic hypercall support? 
--- drivers/infiniband/hw/ehca/ehca_common.h | 115 ++++++++++++++++++++++++++++++ 1 files changed, 115 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_common.h b/drivers/infiniband/hw/ehca/ehca_common.h new file mode 100644 index 0000000..922f010 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_common.h @@ -0,0 +1,115 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * hcad local defines + * + * Authors: Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: ehca_common.h,v 1.15 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __EHCA_COMMON_H__ +#define __EHCA_COMMON_H__ + +#ifdef CONFIG_PPC64 +#include + +#define H_PARTIAL_STORE 16 +#define H_PAGE_REGISTERED 15 +#define H_IN_PROGRESS 14 +#define H_PARTIAL 5 +#define H_NOT_AVAILABLE 3 +#define H_Closed 2 +#define H_ADAPTER_PARM -17 +#define H_RH_PARM -18 +#define H_RCQ_PARM -19 +#define H_SCQ_PARM -20 +#define H_EQ_PARM -21 +#define H_RT_PARM -22 +#define H_ST_PARM -23 +#define H_SIGT_PARM -24 +#define H_TOKEN_PARM -25 +#define H_MLENGTH_PARM -27 +#define H_MEM_PARM -28 +#define H_MEM_ACCESS_PARM -29 +#define H_ATTR_PARM -30 +#define H_PORT_PARM -31 +#define H_MCG_PARM -32 +#define H_VL_PARM -33 +#define H_TSIZE_PARM -34 +#define H_TRACE_PARM -35 + +#define H_MASK_PARM -37 +#define H_MCG_FULL -38 +#define H_ALIAS_EXIST -39 +#define H_P_COUNTER -40 +#define H_TABLE_FULL -41 +#define H_ALT_TABLE -42 +#define H_MR_CONDITION -43 +#define H_NOT_ENOUGH_RESOURCES -44 +#define H_R_STATE -45 +#define H_RESCINDEND -46 + +/* H call defines to be moved to kernel */ +#define H_RESET_EVENTS 0x15C +#define H_ALLOC_RESOURCE 0x160 +#define H_FREE_RESOURCE 0x164 +#define H_MODIFY_QP 0x168 +#define H_QUERY_QP 0x16C +#define H_REREGISTER_PMR 0x170 +#define H_REGISTER_SMR 0x174 +#define H_QUERY_MR 0x178 +#define H_QUERY_MW 0x17C +#define H_QUERY_HCA 0x180 +#define H_QUERY_PORT 0x184 +#define H_MODIFY_PORT 0x188 +#define H_DEFINE_AQP1 0x18C +#define H_GET_TRACE_BUFFER 0x190 +#define H_DEFINE_AQP0 0x194 +#define H_RESIZE_MR 0x198 +#define H_ATTACH_MCQP 0x19C +#define H_DETACH_MCQP 0x1A0 +#define H_CREATE_RPT 0x1A4 +#define H_REMOVE_RPT 0x1A8 +#define H_REGISTER_RPAGES 0x1AC +#define H_DISABLE_AND_GETC 0x1B0 +#define H_ERROR_DATA 0x1B4 +#define H_GET_HCA_INFO 0x1B8 +#define H_GET_PERF_COUNT 0x1BC +#define H_MANAGE_TRACE 0x1C0 +#define H_QUERY_INT_STATE 0x1E4 +#endif + +#endif /* __EHCA_COMMON_H__ */ From rolandd at cisco.com Sat Feb 18 11:57:25 2006 From: rolandd at cisco.com 
(Roland Dreier) Date: Fri, 17 Feb 2006 16:57:25 -0800 Subject: [PATCH 09/22] ehca classes In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005725.13620.32014.stgit@localhost.localdomain> From: Roland Dreier The fact that ehca_cq_delete and ehca_qp_delete return an int seems a little silly, given that the functions can never fail. The code in ehca_classes.c seems like a misuse of the kmem_cache API; rather than wrapping kmem_cache_alloc() and doing extra initialization, why not just use the kmem_cache's constructor to do this? --- drivers/infiniband/hw/ehca/ehca_classes.c | 191 +++++++++++ drivers/infiniband/hw/ehca/ehca_classes.h | 369 +++++++++++++++++++++ drivers/infiniband/hw/ehca/ehca_classes_core.h | 73 ++++ drivers/infiniband/hw/ehca/ehca_classes_pSeries.h | 256 +++++++++++++++ 4 files changed, 889 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.c b/drivers/infiniband/hw/ehca/ehca_classes.c new file mode 100644 index 0000000..9819788 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_classes.c @@ -0,0 +1,191 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * struct initialisations and allocation + * + * Authors: Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. 
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $Id: ehca_classes.c,v 1.21 2006/02/06 16:20:38 schickhj Exp $
+ */
+
+#define DEB_PREFIX "clas"
+#include "ehca_kernel.h"
+
+#include "ehca_classes.h"
+
+struct ehca_pd *ehca_pd_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_pd *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_pd, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_pd));
+
+	return me;
+}
+
+void ehca_pd_delete(struct ehca_pd *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_pd, me);
+}
+
+struct ehca_cq *ehca_cq_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_cq *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_cq, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_cq));
+	spin_lock_init(&me->spinlock);
+	spin_lock_init(&me->cb_lock);
+
+	return me;
+}
+
+int ehca_cq_delete(struct ehca_cq *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_cq, me);
+
+	return H_Success;
+}
+
+struct ehca_qp *ehca_qp_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_qp *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_qp, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_qp));
+	spin_lock_init(&me->spinlock_s);
+	spin_lock_init(&me->spinlock_r);
+
+	return me;
+}
+
+int ehca_qp_delete(struct ehca_qp *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_qp, me);
+
+	return H_Success;
+}
+
+struct ehca_av *ehca_av_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_av *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_av, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_av));
+
+	return me;
+}
+
+int ehca_av_delete(struct ehca_av *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_av, me);
+
+	return H_Success;
+}
+
+struct ehca_mr *ehca_mr_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_mr *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_mr, SLAB_KERNEL);
+	if (me) {
+		memset(me, 0, sizeof(struct ehca_mr));
+		spin_lock_init(&me->mrlock);
+		EDEB_EX(7, "ehca_mr=%p sizeof(ehca_mr_t)=%x", me,
+			(u32) sizeof(struct ehca_mr));
+	} else {
+		EDEB_ERR(3, "alloc failed");
+	}
+
+	return me;
+}
+
+void ehca_mr_delete(struct ehca_mr *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_mr, me);
+}
+
+struct ehca_mw *ehca_mw_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_mw *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_mw, SLAB_KERNEL);
+	if (me) {
+		memset(me, 0, sizeof(struct ehca_mw));
+		spin_lock_init(&me->mwlock);
+		EDEB_EX(7, "ehca_mw=%p sizeof(ehca_mw_t)=%x", me,
+			(u32) sizeof(struct ehca_mw));
+	} else {
+		EDEB_ERR(3, "alloc failed");
+	}
+
+	return me;
+}
+
+void ehca_mw_delete(struct ehca_mw *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_mw, me);
+}
+
diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
new file mode 100644
index 0000000..1d72aaf
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -0,0 +1,369 @@
+/*
+ * IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ * struct definitions for hcad internal structures
+ *
+ * Authors: Christoph Raisch
+ *
+ * Copyright (c) 2005 IBM Corporation
+ *
+ * All rights reserved.
+ *
+ * This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ * BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $Id: ehca_classes.h,v 1.80 2006/02/06 16:20:38 schickhj Exp $
+ */
+
+#ifndef __EHCA_CLASSES_H__
+#define __EHCA_CLASSES_H__
+
+#include "ehca_kernel.h"
+#include "ipz_pt_fn.h"
+
+#include
+
+struct ehca_module;
+struct ehca_qp;
+struct ehca_cq;
+struct ehca_eq;
+struct ehca_mr;
+struct ehca_mw;
+struct ehca_pd;
+struct ehca_av;
+
+#ifndef CONFIG_PPC64
+#ifndef Z_SERIES
+#error "no series defined"
+#endif
+#endif
+
+#ifdef CONFIG_PPC64
+#include "ehca_classes_pSeries.h"
+#endif
+
+#ifdef Z_SERIES
+#include "ehca_classes_zSeries.h"
+#endif
+
+#include
+#include
+
+#include "ehca_irq.h"
+
+#include "ehca_classes_core.h"
+
+/** @brief HCAD class
+ *
+ * contains HCAD specific data
+ *
+ */
+struct ehca_module {
+	struct list_head shca_list;
+	spinlock_t shca_lock;
+
+	kmem_cache_t *cache_pd;
+	kmem_cache_t *cache_cq;
+	kmem_cache_t *cache_qp;
+	kmem_cache_t *cache_av;
+	kmem_cache_t *cache_mr;
+	kmem_cache_t *cache_mw;
+
+	struct ehca_pfmodule pf; /* platform specific part of HCA */
+};
+
+/** @brief EQ class
+ */
+struct ehca_eq {
+	u32 length; /* length of EQ */
+	struct ipz_queue ipz_queue; /* EQ in kv */
+	struct ipz_eq_handle ipz_eq_handle;
+	struct ehca_irq_info irq_info;
+	struct work_struct work;
+	struct h_galpas galpas;
+	int is_initialized;
+
+	struct ehca_pfeq pf; /* platform specific part of EQ */
+
+	spinlock_t spinlock;
+};
+
+/** static port
+ */
+struct ehca_sport {
+	struct ib_cq *ibcq_aqp1; /* CQ for AQP1 */
+	struct ib_qp *ibqp_aqp1; /* QP for AQP1 */
+	enum ib_port_state port_state;
+};
+
+/** @brief HCA class "static HCA"
+ *
+ * contains HCA specific data per HCA (or vHCA?)
+ * per instance reported by firmware
+ *
+ */
+struct ehca_shca {
+	struct ib_device ib_device;
+	struct ibmebus_dev *ibmebus_dev;
+	u8 num_ports;
+	int hw_level;
+	struct list_head shca_list;
+	struct ipz_adapter_handle ipz_hca_handle; /* firmware HCA handle */
+	struct ehca_bridge_handle bridge;
+	struct ehca_sport sport[2];
+	struct ehca_eq eq; /* event queue */
+	struct ehca_eq neq; /* notification event queue */
+	struct ehca_mr *maxmr; /* internal max MR (for kernel users) */
+	struct ehca_pd *pd; /* internal pd (for kernel users) */
+	struct ehca_pfshca pf; /* platform specific part of HCA */
+	struct h_galpas galpas;
+};
+
+/** @brief protection domain
+ */
+struct ehca_pd {
+	struct ib_pd ib_pd; /* gen2 qp, must always be first in ehca_pd */
+	struct ipz_pd fw_pd;
+	struct ehca_pfpd pf;
+};
+
+/** @brief QP class
+ */
+struct ehca_qp {
+	struct ib_qp ib_qp; /* gen2 qp, must always be first in ehca_qp */
+	struct ehca_qp_core ehca_qp_core; /* common fields for
+					     user/kernel space */
+	u32 token;
+	spinlock_t spinlock_s;
+	spinlock_t spinlock_r;
+	u32 sq_max_inline_data_size; /* max # of inline data can be sent */
+	struct ipz_qp_handle ipz_qp_handle; /* QP handle for h-calls */
+	struct ehca_pfqp pf; /* platform specific part of QP */
+	struct ib_qp_init_attr init_attr;
+	/* adr mapping for s/r queues and fw handle bw kernel&user space */
+	u64 uspace_squeue;
+	u64 uspace_rqueue;
+	u64 uspace_fwh;
+	struct ehca_cq* send_cq;
+	unsigned int sqerr_purgeflag;
+	struct list_head list_entries;
+};
+
+#define QP_HASHTAB_LEN 7
+/** @brief CQ class
+ */
+struct ehca_cq {
+	struct ib_cq ib_cq; /* gen2 cq, must always be first
+			       in ehca_cq */
+	struct ehca_cq_core ehca_cq_core; /* common fields for
+					     user/kernel space */
+	spinlock_t spinlock;
+	u32 cq_number;
+	u32 token;
+	u32 nr_of_entries;
+	/* fw specific data common for p+z */
+	struct ipz_cq_handle ipz_cq_handle; /* CQ handle for h-calls */
+	/* pf specific code */
+	struct ehca_pfcq pf; /* platform specific part of CQ */
+	spinlock_t cb_lock; /* completion event handler */
+	/* adr mapping for queue and fw handle bw kernel&user space */
+	u64 uspace_queue;
+	u64 uspace_fwh;
+	struct list_head qp_hashtab[QP_HASHTAB_LEN];
+};
+
+
+/** @brief MR flags
+ */
+enum ehca_mr_flag {
+	EHCA_MR_FLAG_FMR = 0x80000000, /* FMR, created with ehca_alloc_fmr */
+	EHCA_MR_FLAG_MAXMR = 0x40000000, /* max-MR */
+	EHCA_MR_FLAG_USER = 0x20000000 /* user space TODO...necessary????. */
+};
+
+/** @brief MR class
+ */
+struct ehca_mr {
+	union {
+		struct ib_mr ib_mr; /* must always be first in ehca_mr */
+		struct ib_fmr ib_fmr; /* must always be first in ehca_mr */
+	} ib;
+
+	spinlock_t mrlock;
+
+	/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+	 * !!! ehca_mr_deletenew() memsets from flags to end of structure
+	 * !!! DON'T move flags or insert another field before.
+	 * !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */
+
+	enum ehca_mr_flag flags;
+	u32 num_pages; /* number of MR pages */
+	int acl; /* ACL (stored here for usage in reregister) */
+	u64 *start; /* virtual start address (stored here for */
+		    /* usage in reregister) */
+	u64 size; /* size (stored here for usage in reregister) */
+	u32 fmr_page_size; /* page size for FMR */
+	u32 fmr_max_pages; /* max pages for FMR */
+	u32 fmr_max_maps; /* max outstanding maps for FMR */
+	u32 fmr_map_cnt; /* map counter for FMR */
+	/* fw specific data */
+	struct ipz_mrmw_handle ipz_mr_handle; /* MR handle for h-calls */
+	struct h_galpas galpas;
+	/* data for userspace bridge */
+	u32 nr_of_pages;
+	void *pagearray;
+
+	struct ehca_pfmr pf; /* platform specific part of MR */
+};
+
+/** @brief MW class
+ */
+struct ehca_mw {
+	struct ib_mw ib_mw; /* gen2 mw, must always be first in ehca_mw */
+	spinlock_t mwlock;
+
+	u8 never_bound; /* indication MW was never bound */
+	struct ipz_mrmw_handle ipz_mw_handle; /* MW handle for h-calls */
+	struct h_galpas galpas;
+
+	struct ehca_pfmw pf; /* platform specific part of MW */
+};
+
+/** @brief MR page info type
+ */
+enum ehca_mr_pgi_type {
+	EHCA_MR_PGI_PHYS = 1, /* type of ehca_reg_phys_mr,
+			       * ehca_rereg_phys_mr,
+			       * ehca_reg_internal_maxmr */
+	EHCA_MR_PGI_USER = 2, /* type of ehca_reg_user_mr */
+	EHCA_MR_PGI_FMR = 3 /* type of ehca_map_phys_fmr */
+};
+
+/** @brief MR page info
+ */
+struct ehca_mr_pginfo {
+	enum ehca_mr_pgi_type type;
+	u64 num_pages;
+	u64 page_count;
+
+	/* type EHCA_MR_PGI_PHYS section */
+	int num_phys_buf;
+	struct ib_phys_buf *phys_buf_array;
+	u64 next_buf;
+	u64 next_page;
+
+	/* type EHCA_MR_PGI_USER section */
+	struct ib_umem *region;
+	struct ib_umem_chunk *next_chunk;
+	u64 next_nmap;
+
+	/* type EHCA_MR_PGI_FMR section */
+	u64 *page_list;
+	u64 next_listelem;
+};
+
+
+/** @brief address vector suitable for a ud enqueue request
+ */
+struct ehca_av {
+	struct ib_ah ib_ah; /* gen2 ah, must always be first in ehca_ah */
+	struct ehca_ud_av av;
+};
+
+/** @brief user context
+ */
+struct ehca_ucontext {
+	struct ib_ucontext ib_ucontext;
+};
+
+struct ehca_module *ehca_module_new(void);
+
+int ehca_module_delete(struct ehca_module *me);
+
+int ehca_eq_ctor(struct ehca_eq *eq);
+
+int ehca_eq_dtor(struct ehca_eq *eq);
+
+struct ehca_shca *ehca_shca_new(void);
+
+int ehca_shca_delete(struct ehca_shca *me);
+
+struct ehca_sport *ehca_sport_new(struct ehca_shca *anchor); /*anchor?? */
+
+struct ehca_cq *ehca_cq_new(void);
+
+int ehca_cq_delete(struct ehca_cq *me);
+
+struct ehca_av *ehca_av_new(void);
+
+int ehca_av_delete(struct ehca_av *me);
+
+struct ehca_pd *ehca_pd_new(void);
+
+void ehca_pd_delete(struct ehca_pd *me);
+
+struct ehca_qp *ehca_qp_new(void);
+
+int ehca_qp_delete(struct ehca_qp *me);
+
+struct ehca_mr *ehca_mr_new(void);
+
+void ehca_mr_delete(struct ehca_mr *me);
+
+struct ehca_mw *ehca_mw_new(void);
+
+void ehca_mw_delete(struct ehca_mw *me);
+
+extern struct rw_semaphore ehca_qp_idr_sem;
+extern struct rw_semaphore ehca_cq_idr_sem;
+extern struct idr ehca_qp_idr;
+extern struct idr ehca_cq_idr;
+
+/*
+ * resp structs for comm bw user and kernel space
+ */
+struct ehca_create_cq_resp {
+	u32 cq_number;
+	u32 token;
+	struct ehca_cq_core ehca_cq_core;
+};
+
+struct ehca_create_qp_resp {
+	u32 qp_num;
+	u32 token;
+	struct ehca_qp_core ehca_qp_core;
+};
+
+/*
+ * helper funcs to link send cq and qp
+ */
+int ehca_cq_assign_qp(struct ehca_cq *cq, struct ehca_qp *qp);
+int ehca_cq_unassign_qp(struct ehca_cq *cq, unsigned int qp_num);
+struct ehca_qp* ehca_cq_get_qp(struct ehca_cq *cq, int qp_num);
+
+#endif /* __EHCA_CLASSES_H__ */
diff --git a/drivers/infiniband/hw/ehca/ehca_classes_core.h b/drivers/infiniband/hw/ehca/ehca_classes_core.h
new file mode 100644
index 0000000..5e864b3
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_classes_core.h
@@ -0,0 +1,73 @@
+/*
+ * IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ * core struct definitions for hcad internal structures and
+ * to be used/compiled commonly in user and kernel space
+ *
+ * Authors: Christoph Raisch
+ *          Hoang-Nam Nguyen
+ *
+ * Copyright (c) 2005 IBM Corporation
+ *
+ * All rights reserved.
+ *
+ * This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ * BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $Id: ehca_classes_core.h,v 1.12 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_CLASSES_CORE_H__
+#define __EHCA_CLASSES_CORE_H__
+
+#include "ipz_pt_fn_core.h"
+#include "ehca_galpa.h"
+
+/** @brief qp core contains common fields for user/kernel space
+ */
+struct ehca_qp_core {
+	/* kernel space: enum ib_qp_type, user space: enum ibv_qp_type */
+	int qp_type;
+	int dummy1; /* 8 byte alignment */
+	struct ipz_queue ipz_squeue;
+	struct ipz_queue ipz_rqueue;
+	struct h_galpas galpas;
+	unsigned int qkey;
+	int dummy2; /* 8 byte alignment */
+	/* qp_num assigned by ehca: sqp0/1 may have got different numbers */
+	unsigned int real_qp_num;
+};
+
+/** @brief cq core contains common fields for user/kernel space
+ */
+struct ehca_cq_core {
+	struct ipz_queue ipz_queue;
+	struct h_galpas galpas;
+};
+
+#endif /* __EHCA_CLASSES_CORE_H__ */
diff --git a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
new file mode 100644
index 0000000..8f86137
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
@@ -0,0 +1,256 @@
+/*
+ * IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ * pSeries interface definitions
+ *
+ * Authors: Waleri Fomin
+ *          Christoph Raisch
+ *
+ * Copyright (c) 2005 IBM Corporation
+ *
+ * All rights reserved.
+ *
+ * This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ * BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $Id: ehca_classes_pSeries.h,v 1.24 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_CLASSES_PSERIES_H__
+#define __EHCA_CLASSES_PSERIES_H__
+
+#include "ehca_galpa.h"
+#include "ipz_pt_fn.h"
+
+
+struct ehca_pfmodule {
+};
+
+struct ehca_pfshca {
+};
+
+struct ehca_pfqp {
+	struct ipz_qpt sqpt;
+	struct ipz_qpt rqpt;
+	struct ehca_bridge_handle bridge;
+};
+
+struct ehca_pfcq {
+	struct ipz_qpt qpt;
+	struct ehca_bridge_handle bridge;
+	u32 cqnr;
+};
+
+struct ehca_pfeq {
+	struct ipz_qpt qpt;
+	struct ehca_bridge_handle bridge;
+	struct h_galpa galpa;
+	u32 eqnr;
+};
+
+struct ehca_pfpd {
+};
+
+struct ehca_pfmr {
+	struct ehca_bridge_handle bridge;
+};
+struct ehca_pfmw {
+};
+
+struct ipz_adapter_handle {
+	u64 handle;
+};
+
+struct ipz_cq_handle {
+	u64 handle;
+};
+
+struct ipz_eq_handle {
+	u64 handle;
+};
+
+struct ipz_qp_handle {
+	u64 handle;
+};
+struct ipz_mrmw_handle {
+	u64 handle;
+};
+
+struct ipz_pd {
+	u32 value;
+};
+
+struct hcp_modify_qp_control_block {
+	u32 qkey; /* 00 */
+	u32 rdd; /* reliable datagram domain */
+	u32 send_psn; /* 02 */
+	u32 receive_psn; /* 03 */
+	u32 prim_phys_port; /* 04 */
+	u32 alt_phys_port; /* 05 */
+	u32 prim_p_key_idx; /* 06 */
+	u32 alt_p_key_idx; /* 07 */
+	u32 rdma_atomic_ctrl; /* 08 */
+	u32 qp_state; /* 09 */
+	u32 reserved_10; /* 10 */
+	u32 rdma_nr_atomic_resp_res; /* 11 */
+	u32 path_migration_state; /* 12 */
+	u32 rdma_atomic_outst_dest_qp; /* 13 */
+	u32 dest_qp_nr; /* 14 */
+	u32 min_rnr_nak_timer_field; /* 15 */
+	u32 service_level; /* 16 */
+	u32 send_grh_flag; /* 17 */
+	u32 retry_count; /* 18 */
+	u32 timeout; /* 19 */
+	u32 path_mtu; /* 20 */
+	u32 max_static_rate; /* 21 */
+	u32 dlid; /* 22 */
+	u32 rnr_retry_count; /* 23 */
+	u32 source_path_bits; /* 24 */
+	u32 traffic_class; /* 25 */
+	u32 hop_limit; /* 26 */
+	u32 source_gid_idx; /* 27 */
+	u32 flow_label; /* 28 */
+	u32 reserved_29; /* 29 */
+	union { /* 30 */
+		u64 dw[2];
+		u8 byte[16];
+	} dest_gid;
+	u32 service_level_al; /* 34 */
+	u32 send_grh_flag_al; /* 35 */
+	u32 retry_count_al; /* 36 */
+	u32 timeout_al; /* 37 */
+	u32 max_static_rate_al; /* 38 */
+	u32 dlid_al; /* 39 */
+	u32 rnr_retry_count_al; /* 40 */
+	u32 source_path_bits_al; /* 41 */
+	u32 traffic_class_al; /* 42 */
+	u32 hop_limit_al; /* 43 */
+	u32 source_gid_idx_al; /* 44 */
+	u32 flow_label_al; /* 45 */
+	u32 reserved_46; /* 46 */
+	u32 reserved_47; /* 47 */
+	union { /* 48 */
+		u64 dw[2];
+		u8 byte[16];
+	} dest_gid_al;
+	u32 max_nr_outst_send_wr; /* 52 */
+	u32 max_nr_outst_recv_wr; /* 53 */
+	u32 disable_ete_credit_check; /* 54 */
+	u32 qp_number; /* 55 */
+	u64 send_queue_handle; /* 56 */
+	u64 recv_queue_handle; /* 58 */
+	u32 actual_nr_sges_in_sq_wqe; /* 60 */
+	u32 actual_nr_sges_in_rq_wqe; /* 61 */
+	u32 qp_enable; /* 62 */
+	u32 curr_srq_limit; /* 63 */
+	u64 qp_aff_asyn_ev_log_reg; /* 64 */
+	u64 shared_rq_hndl; /* 66 */
+	u64 trigg_doorbell_qp_hndl; /* 68 */
+	u32 reserved_70_127[58]; /* 70 */
+};
+
+#define MQPCB_MASK_QKEY EHCA_BMASK_IBM(0,0)
+#define MQPCB_MASK_SEND_PSN EHCA_BMASK_IBM(2,2)
+#define MQPCB_MASK_RECEIVE_PSN EHCA_BMASK_IBM(3,3)
+#define MQPCB_MASK_PRIM_PHYS_PORT EHCA_BMASK_IBM(4,4)
+#define MQPCB_PRIM_PHYS_PORT EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_ALT_PHYS_PORT EHCA_BMASK_IBM(5,5)
+#define MQPCB_MASK_PRIM_P_KEY_IDX EHCA_BMASK_IBM(6,6)
+#define MQPCB_PRIM_P_KEY_IDX EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_ALT_P_KEY_IDX EHCA_BMASK_IBM(7,7)
+#define MQPCB_MASK_RDMA_ATOMIC_CTRL EHCA_BMASK_IBM(8,8)
+#define MQPCB_MASK_QP_STATE EHCA_BMASK_IBM(9,9)
+#define MQPCB_QP_STATE EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES EHCA_BMASK_IBM(11,11)
+#define MQPCB_MASK_PATH_MIGRATION_STATE EHCA_BMASK_IBM(12,12)
+#define MQPCB_MASK_RDMA_ATOMIC_OUTST_DEST_QP EHCA_BMASK_IBM(13,13)
+#define MQPCB_MASK_DEST_QP_NR EHCA_BMASK_IBM(14,14)
+#define MQPCB_MASK_MIN_RNR_NAK_TIMER_FIELD EHCA_BMASK_IBM(15,15)
+#define MQPCB_MASK_SERVICE_LEVEL EHCA_BMASK_IBM(16,16)
+#define MQPCB_MASK_SEND_GRH_FLAG EHCA_BMASK_IBM(17,17)
+#define MQPCB_MASK_RETRY_COUNT EHCA_BMASK_IBM(18,18)
+#define MQPCB_MASK_TIMEOUT EHCA_BMASK_IBM(19,19)
+#define MQPCB_MASK_PATH_MTU EHCA_BMASK_IBM(20,20)
+#define MQPCB_PATH_MTU EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_MAX_STATIC_RATE EHCA_BMASK_IBM(21,21)
+#define MQPCB_MAX_STATIC_RATE EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_DLID EHCA_BMASK_IBM(22,22)
+#define MQPCB_DLID EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_RNR_RETRY_COUNT EHCA_BMASK_IBM(23,23)
+#define MQPCB_RNR_RETRY_COUNT EHCA_BMASK_IBM(29,31)
+#define MQPCB_MASK_SOURCE_PATH_BITS EHCA_BMASK_IBM(24,24)
+#define MQPCB_SOURCE_PATH_BITS EHCA_BMASK_IBM(25,31)
+#define MQPCB_MASK_TRAFFIC_CLASS EHCA_BMASK_IBM(25,25)
+#define MQPCB_TRAFFIC_CLASS EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_HOP_LIMIT EHCA_BMASK_IBM(26,26)
+#define MQPCB_HOP_LIMIT EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_SOURCE_GID_IDX EHCA_BMASK_IBM(27,27)
+#define MQPCB_SOURCE_GID_IDX EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_FLOW_LABEL EHCA_BMASK_IBM(28,28)
+#define MQPCB_FLOW_LABEL EHCA_BMASK_IBM(12,31)
+#define MQPCB_MASK_DEST_GID EHCA_BMASK_IBM(30,30)
+#define MQPCB_MASK_SERVICE_LEVEL_AL EHCA_BMASK_IBM(31,31)
+#define MQPCB_SERVICE_LEVEL_AL EHCA_BMASK_IBM(28,31)
+#define MQPCB_MASK_SEND_GRH_FLAG_AL EHCA_BMASK_IBM(32,32)
+#define MQPCB_SEND_GRH_FLAG_AL EHCA_BMASK_IBM(31,31)
+#define MQPCB_MASK_RETRY_COUNT_AL EHCA_BMASK_IBM(33,33)
+#define MQPCB_RETRY_COUNT_AL EHCA_BMASK_IBM(29,31)
+#define MQPCB_MASK_TIMEOUT_AL EHCA_BMASK_IBM(34,34)
+#define MQPCB_TIMEOUT_AL EHCA_BMASK_IBM(27,31)
+#define MQPCB_MASK_MAX_STATIC_RATE_AL EHCA_BMASK_IBM(35,35)
+#define MQPCB_MAX_STATIC_RATE_AL EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_DLID_AL EHCA_BMASK_IBM(36,36)
+#define MQPCB_DLID_AL EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_RNR_RETRY_COUNT_AL EHCA_BMASK_IBM(37,37)
+#define MQPCB_RNR_RETRY_COUNT_AL EHCA_BMASK_IBM(29,31)
+#define MQPCB_MASK_SOURCE_PATH_BITS_AL EHCA_BMASK_IBM(38,38)
+#define MQPCB_SOURCE_PATH_BITS_AL EHCA_BMASK_IBM(25,31)
+#define MQPCB_MASK_TRAFFIC_CLASS_AL EHCA_BMASK_IBM(39,39)
+#define MQPCB_TRAFFIC_CLASS_AL EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_HOP_LIMIT_AL EHCA_BMASK_IBM(40,40)
+#define MQPCB_HOP_LIMIT_AL EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_SOURCE_GID_IDX_AL EHCA_BMASK_IBM(41,41)
+#define MQPCB_SOURCE_GID_IDX_AL EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_FLOW_LABEL_AL EHCA_BMASK_IBM(42,42)
+#define MQPCB_FLOW_LABEL_AL EHCA_BMASK_IBM(12,31)
+#define MQPCB_MASK_DEST_GID_AL EHCA_BMASK_IBM(44,44)
+#define MQPCB_MASK_MAX_NR_OUTST_SEND_WR EHCA_BMASK_IBM(45,45)
+#define MQPCB_MAX_NR_OUTST_SEND_WR EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_MAX_NR_OUTST_RECV_WR EHCA_BMASK_IBM(46,46)
+#define MQPCB_MAX_NR_OUTST_RECV_WR EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_DISABLE_ETE_CREDIT_CHECK EHCA_BMASK_IBM(47,47)
+#define MQPCB_DISABLE_ETE_CREDIT_CHECK EHCA_BMASK_IBM(31,31)
+#define MQPCB_QP_NUMBER EHCA_BMASK_IBM(8,31)
+#define MQPCB_MASK_QP_ENABLE EHCA_BMASK_IBM(48,48)
+#define MQPCB_QP_ENABLE EHCA_BMASK_IBM(31,31)
+#define MQPCB_MASK_CURR_SQR_LIMIT EHCA_BMASK_IBM(49,49)
+#define MQPCB_CURR_SQR_LIMIT EHCA_BMASK_IBM(15,31)
+#define MQPCB_MASK_QP_AFF_ASYN_EV_LOG_REG EHCA_BMASK_IBM(50,50)
+#define MQPCB_MASK_SHARED_RQ_HNDL EHCA_BMASK_IBM(51,51)
+
+#endif /* __EHCA_CLASSES_PSERIES_H__ */

From rolandd at cisco.com Sat Feb 18 11:57:39 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:39 -0800
Subject: [PATCH 12/22] ehca low-level verbs
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005739.13620.15633.stgit@localhost.localdomain>

From: Roland Dreier

What is ehca_init_module()? It is declared but never defined.
---

 drivers/infiniband/hw/ehca/ehca_iverbs.h |  163 ++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_qes.h    |  274 ++++++++++++++++++++++++++++++
 2 files changed, 437 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
new file mode 100644
index 0000000..b1319a9
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
@@ -0,0 +1,163 @@
+/*
+ * IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ * Function definitions for internal functions
+ *
+ * Authors: Heiko J Schick
+ *          Khadija Souissi
+ *          Christoph Raisch
+ *          Hoang-Nam Nguyen
+ *
+ * Copyright (c) 2005 IBM Corporation
+ *
+ * All rights reserved.
+ *
+ * This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ * BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $Id: ehca_iverbs.h,v 1.32 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_IVERBS_H__
+#define __EHCA_IVERBS_H__
+
+#include "ehca_classes.h"
+/** ehca internal verb for testuse
+ */
+void ehca_init_module(void);
+
+int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props);
+int ehca_query_port(struct ib_device *ibdev,
+		    u8 port, struct ib_port_attr *props);
+int ehca_query_pkey(struct ib_device *ibdev, u8 port, u16 index, u16 * pkey);
+int ehca_query_gid(struct ib_device *ibdev, u8 port,
+		   int index, union ib_gid *gid);
+int ehca_modify_port(struct ib_device *ibdev,
+		     u8 port, int port_modify_mask,
+		     struct ib_port_modify *props);
+
+struct ib_pd *ehca_alloc_pd(struct ib_device *device,
+			    struct ib_ucontext *context,
+			    struct ib_udata *udata);
+
+int ehca_dealloc_pd(struct ib_pd *pd);
+
+struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr);
+int ehca_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr);
+int ehca_query_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr);
+int ehca_destroy_ah(struct ib_ah *ah);
+
+struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe,
+			     struct ib_ucontext *context,
+			     struct ib_udata *udata);
+int ehca_resize_cq(struct ib_cq *cq, int cqe);
+
+int ehca_destroy_cq(struct ib_cq *cq);
+
+int ehca_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
+
+int ehca_peek_cq(struct ib_cq *cq, int wc_cnt);
+
+int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify);
+
+struct ib_qp *ehca_create_qp(struct ib_pd *pd,
+			     struct ib_qp_init_attr *init_attr,
+			     struct ib_udata *udata);
+
+u64 ehca_define_sqp(struct ehca_shca *shca, struct ehca_qp *ibqp,
+		    struct ib_qp_init_attr *qp_init_attr);
+
+int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask);
+
+int ehca_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
+		  int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
+
+int ehca_destroy_qp(struct ib_qp *qp);
+
+int ehca_post_send(struct ib_qp *qp,
+		   struct ib_send_wr *send_wr, struct ib_send_wr **bad_send_wr);
+
+int ehca_post_recv(struct ib_qp *qp,
+		   struct ib_recv_wr *recv_wr, struct ib_recv_wr **bad_recv_wr);
+
+struct ib_mr *ehca_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
+
+struct ib_mr *ehca_reg_phys_mr(struct ib_pd *pd,
+			       struct ib_phys_buf *phys_buf_array,
+			       int num_phys_buf,
+			       int mr_access_flags, u64 *iova_start);
+
+struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd,
+			       struct ib_umem *region,
+			       int mr_access_flags, struct ib_udata *udata);
+
+int ehca_rereg_phys_mr(struct ib_mr *mr,
+		       int mr_rereg_mask,
+		       struct ib_pd *pd,
+		       struct ib_phys_buf *phys_buf_array,
+		       int num_phys_buf, int mr_access_flags, u64 *iova_start);
+
+int ehca_query_mr(struct ib_mr *mr, struct ib_mr_attr *mr_attr);
+
+int ehca_dereg_mr(struct ib_mr *mr);
+
+struct ib_mw *ehca_alloc_mw(struct ib_pd *pd);
+
+int ehca_bind_mw(struct ib_qp *qp,
+		 struct ib_mw *mw, struct ib_mw_bind *mw_bind);
+
+int ehca_dealloc_mw(struct ib_mw *mw);
+
+struct ib_fmr *ehca_alloc_fmr(struct ib_pd *pd,
+			      int mr_access_flags,
+			      struct ib_fmr_attr *fmr_attr);
+
+int ehca_map_phys_fmr(struct ib_fmr *fmr,
+		      u64 *page_list, int list_len, u64 iova);
+
+int ehca_unmap_fmr(struct list_head *fmr_list);
+
+int ehca_dealloc_fmr(struct ib_fmr *fmr);
+
+int ehca_attach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid);
+
+int ehca_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid);
+
+struct ib_ucontext *ehca_alloc_ucontext(struct ib_device *device,
+					struct ib_udata *udata);
+int ehca_dealloc_ucontext(struct ib_ucontext *context);
+
+int ehca_mmap(struct ib_ucontext *context, struct vm_area_struct *vma);
+
+int ehca_poll_eqs(void *data);
+
+int ehca_mmap_nopage(u64 foffset,u64 length,void ** mapped,struct vm_area_struct ** vma);
+int ehca_mmap_register(u64 physical,void ** mapped,struct vm_area_struct ** vma);
+int ehca_munmap(unsigned long addr, size_t len);
+
+#endif
diff --git a/drivers/infiniband/hw/ehca/ehca_qes.h b/drivers/infiniband/hw/ehca/ehca_qes.h
new file mode 100644
index 0000000..e9420e3
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_qes.h
@@ -0,0 +1,274 @@
+/*
+ * IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ * Hardware request structures
+ *
+ * Authors: Waleri Fomin
+ *          Reinhard Ernst
+ *          Christoph Raisch
+ *
+ * Copyright (c) 2005 IBM Corporation
+ *
+ * All rights reserved.
+ *
+ * This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ * BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_qes.h,v 1.9 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#ifndef _EHCA_QES_H_ +#define _EHCA_QES_H_ + +/** DON'T include any kernel related files here!!! + * This file is used commonly in user and kernel space!!! 
+ */ + +/** + * virtual scatter gather entry to specify remote adresses with length + */ +struct ehca_vsgentry { + u64 vaddr; + u32 lkey; + u32 length; +}; + +#define GRH_FLAG_MASK EHCA_BMASK_IBM(7,7) +#define GRH_IPVERSION_MASK EHCA_BMASK_IBM(0,3) +#define GRH_TCLASS_MASK EHCA_BMASK_IBM(4,12) +#define GRH_FLOWLABEL_MASK EHCA_BMASK_IBM(13,31) +#define GRH_PAYLEN_MASK EHCA_BMASK_IBM(32,47) +#define GRH_NEXTHEADER_MASK EHCA_BMASK_IBM(48,55) +#define GRH_HOPLIMIT_MASK EHCA_BMASK_IBM(56,63) + +/** + * Unreliable Datagram Address Vector Format + * see IBTA Vol1 chapter 8.3 Global Routing Header + */ +struct ehca_ud_av { + u8 sl; + u8 lnh; + u16 dlid; + u8 reserved1; + u8 reserved2; + u8 reserved3; + u8 slid_path_bits; + u8 reserved4; + u8 ipd; + u8 reserved5; + u8 pmtu; + u32 reserved6; + u64 reserved7; + union { + struct { + u64 word_0; /* always set to 6 */ + /*should be 0x1B for IB transport */ + u64 word_1; + u64 word_2; + u64 word_3; + u64 word_4; + } grh; + struct { + u32 wd_0; + u32 wd_1; + /* DWord_1 --> SGID */ + + u32 sgid_wd3; + /* bits 127 - 96 */ + + u32 sgid_wd2; + /* bits 95 - 64 */ + /* DWord_2 */ + + u32 sgid_wd1; + /* bits 63 - 32 */ + + u32 sgid_wd0; + /* bits 31 - 0 */ + /* DWord_3 --> DGID */ + + u32 dgid_wd3; + /* bits 127 - 96 + **/ + u32 dgid_wd2; + /* bits 95 - 64 + DWord_4 */ + u32 dgid_wd1; + /* bits 63 - 32 */ + + u32 dgid_wd0; + /* bits 31 - 0 */ + } grh_l; + }; +}; + +/* maximum number of sg entries allowed in a WQE */ +#define MAX_WQE_SG_ENTRIES 252 + +#define WQE_OPTYPE_SEND 0x80 +#define WQE_OPTYPE_RDMAREAD 0x40 +#define WQE_OPTYPE_RDMAWRITE 0x20 +#define WQE_OPTYPE_CMPSWAP 0x10 +#define WQE_OPTYPE_FETCHADD 0x08 +#define WQE_OPTYPE_BIND 0x04 + +#define WQE_WRFLAG_REQ_SIGNAL_COM 0x80 +#define WQE_WRFLAG_FENCE 0x40 +#define WQE_WRFLAG_IMM_DATA_PRESENT 0x20 +#define WQE_WRFLAG_SOLIC_EVENT 0x10 + +#define WQEF_CACHE_HINT 0x80 +#define WQEF_CACHE_HINT_RD_WR 0x40 +#define WQEF_TIMED_WQE 0x20 +#define WQEF_PURGE 0x08 + +#define 
MW_BIND_ACCESSCTRL_R_WRITE 0x40 +#define MW_BIND_ACCESSCTRL_R_READ 0x20 +#define MW_BIND_ACCESSCTRL_R_ATOMIC 0x10 + +struct ehca_wqe { + u64 work_request_id; + u8 optype; + u8 wr_flag; + u16 pkeyi; + u8 wqef; + u8 nr_of_data_seg; + u16 wqe_provided_slid; + u32 destination_qp_number; + u32 resync_psn_sqp; + u32 local_ee_context_qkey; + u32 immediate_data; + union { + struct { + u64 remote_virtual_adress; + u32 rkey; + u32 reserved; + u64 atomic_1st_op_dma_len; + u64 atomic_2nd_op; + struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES]; + + } nud; + struct { + u64 ehca_ud_av_ptr; + u64 reserved1; + u64 reserved2; + u64 reserved3; + struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES]; + } ud_avp; + struct { + struct ehca_ud_av ud_av; + struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES - + 2]; + } ud_av; + struct { + u64 reserved0; + u64 reserved1; + u64 reserved2; + u64 reserved3; + struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES]; + } all_rcv; + + struct { + u64 reserved; + u32 rkey; + u32 old_rkey; + u64 reserved1; + u64 reserved2; + u64 virtual_address; + u32 reserved3; + u32 length; + u32 reserved4; + u16 reserved5; + u8 reserved6; + u8 lr_ctl; + u32 lkey; + u32 reserved7; + u64 reserved8; + u64 reserved9; + u64 reserved10; + u64 reserved11; + } bind; + struct { + u64 reserved12; + u64 reserved13; + u32 size; + u32 start; + } inline_data; + } u; + +}; + +#define WC_SEND_RECEIVE EHCA_BMASK_IBM(0,0) +#define WC_IMM_DATA EHCA_BMASK_IBM(1,1) +#define WC_GRH_PRESENT EHCA_BMASK_IBM(2,2) +#define WC_SE_BIT EHCA_BMASK_IBM(3,3) + +struct ehca_cqe { + u64 work_request_id; + u8 optype; + u8 w_completion_flags; + u16 reserved1; + u32 nr_bytes_transferred; + u32 immediate_data; + u32 local_qp_number; + u8 freed_resource_count; + u8 service_level; + u16 wqe_count; + u32 qp_token; + u32 qkey_ee_token; + u32 remote_qp_number; + u16 dlid; + u16 rlid; + u16 reserved2; + u16 pkey_index; + u32 cqe_timestamp; + u32 wqe_timestamp; + u8 wqe_timestamp_valid; + u8 reserved3; + u8 reserved4; + u8 
cqe_flags; + u32 status; +}; + +struct ehca_eqe { + u64 entry; +}; + +struct ehca_mrte { + u64 starting_va; + u64 length; /* length of memory region in bytes*/ + u32 pd; + u8 key_instance; + u8 pagesize; + u8 mr_control; + u8 local_remote_access_ctrl; + u8 reserved[0x20 - 0x18]; + u64 at_pointer[4]; +}; +#endif /*_EHCA_QES_H_*/ From rolandd at cisco.com Sat Feb 18 11:57:23 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:23 -0800 Subject: [PATCH 08/22] Generic ehca headers In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005723.13620.10389.stgit@localhost.localdomain> From: Roland Dreier The defines of TRUE and FALSE look rather useless. Why are they needed? What is struct ehca_cache for? It doesn't seem to be used anywhere. ehca_kv_to_g() looks completely horrible. The whole idea of using vmalloc()ed kernel memory to do DMA seems unacceptable to me. It's usual to include all headers before all headers. --- drivers/infiniband/hw/ehca/ehca_flightrecorder.h | 74 ++++ drivers/infiniband/hw/ehca/ehca_kernel.h | 135 +++++++ drivers/infiniband/hw/ehca/ehca_tools.h | 431 ++++++++++++++++++++++ 3 files changed, 640 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_flightrecorder.h b/drivers/infiniband/hw/ehca/ehca_flightrecorder.h new file mode 100644 index 0000000..7c631ad --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_flightrecorder.h @@ -0,0 +1,74 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * flightrecorder macros + * + * Authors: Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. 
+ * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_flightrecorder.h,v 1.5 2006/02/06 10:17:34 schickhj Exp $ + */ +/*****************************************************************************/ +#ifndef EHCA_FLIGHTRECORDER_H +#define EHCA_FLIGHTRECORDER_H + +#define ED_EXTEND1(x,ar1...) \ + unsigned long __EDEB_R2=(const unsigned long)x-0;ED_EXTEND2(ar1) +#define ED_EXTEND2(x,ar1...) \ + unsigned long __EDEB_R3=(const unsigned long)x-0;ED_EXTEND3(ar1) +#define ED_EXTEND3(x,ar1...) \ + unsigned long __EDEB_R4=(const unsigned long)x-0;ED_EXTEND4(ar1) +#define ED_EXTEND4(x,ar1...) 
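[Editorial sketch, not part of the posted patch.] The ED_EXTEND1..ED_EXTEND4 chain above peels one argument per step off a GNU named-variadic list, and the trailing "-0" keeps the expansion valid when an argument is missing: an empty x turns "(unsigned long)x-0" into a cast of -0, which is 0. The following is a minimal userspace illustration of that idea, assuming gcc or clang. FLIGHT_PACK, pack1, pack3 and flight_pack_selfcheck are hypothetical names introduced only for this sketch, and r1..r4 stand in for the patch's __EDEB_R1..__EDEB_R4.

```c
#include <assert.h>

/*
 * Userspace restatement of the ED_EXTEND chain from the patch.
 * "ar1..." is the GNU named-variadic-macro extension (gcc, clang).
 * Each step declares one local and hands the remaining arguments on.
 * When no argument is left, x is empty and "(unsigned long)x-0"
 * reads as "(unsigned long)-0", i.e. 0.
 */
#define ED_EXTEND1(x, ar1...) \
	unsigned long r2 = (unsigned long)x-0; ED_EXTEND2(ar1)
#define ED_EXTEND2(x, ar1...) \
	unsigned long r3 = (unsigned long)x-0; ED_EXTEND3(ar1)
#define ED_EXTEND3(x, ar1...) \
	unsigned long r4 = (unsigned long)x-0; ED_EXTEND4(ar1)
#define ED_EXTEND4(x, ar1...)

/* Hypothetical helper: store up to four values, zero-padding the rest. */
#define FLIGHT_PACK(out, x, ar1...) do { \
	unsigned long r1 = (unsigned long)x-0; ED_EXTEND1(ar1) \
	(out)[0] = r1; (out)[1] = r2; (out)[2] = r3; (out)[3] = r4; \
} while (0)

/* One argument supplied: slots 1..3 are filled with 0. */
void pack1(unsigned long out[4], unsigned long a)
{
	FLIGHT_PACK(out, a);
}

/* Three arguments supplied: only slot 3 is filled with 0. */
void pack3(unsigned long out[4], unsigned long a,
	   unsigned long b, unsigned long c)
{
	FLIGHT_PACK(out, a, b, c);
}

/* Small self-check exercising both helpers. */
void flight_pack_selfcheck(void)
{
	unsigned long rec[4] = { 9, 9, 9, 9 };

	pack3(rec, 1, 2, 3);
	assert(rec[0] == 1 && rec[1] == 2 && rec[2] == 3 && rec[3] == 0);

	pack1(rec, 7);
	assert(rec[0] == 7 && rec[1] == 0 && rec[2] == 0 && rec[3] == 0);
}
```

Calling pack3(rec, 1, 2, 3) leaves rec as {1, 2, 3, 0}. The sketch only shows the argument padding; it does not model the atomic index or the ring buffer that the driver's own macro uses.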
+ +#define EHCA_FLIGHTRECORDER_SIZE 65536 + +extern atomic_t ehca_flightrecorder_index; +extern unsigned long ehca_flightrecorder[EHCA_FLIGHTRECORDER_SIZE]; + +/* Not nice, but -O2 optimized */ + +#define ED_FLIGHT_LOG(x,ar1...) { \ + u32 flight_offset = ((u32) \ + atomic_add_return(4, &ehca_flightrecorder_index)) \ + % EHCA_FLIGHTRECORDER_SIZE; \ + unsigned long *flight_trline = &ehca_flightrecorder[flight_offset]; \ + unsigned long __EDEB_R1 = (unsigned long) x-0; ED_EXTEND1(ar1) \ + flight_trline[0]=__EDEB_R1,flight_trline[1]=__EDEB_R2, \ + flight_trline[2]=__EDEB_R3,flight_trline[3]=__EDEB_R4; } + +#define EHCA_FLIGHTRECORDER_BACKLOG 60 + +void ehca_flight_to_printk(void); + +#endif diff --git a/drivers/infiniband/hw/ehca/ehca_kernel.h b/drivers/infiniband/hw/ehca/ehca_kernel.h new file mode 100644 index 0000000..f119149 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_kernel.h @@ -0,0 +1,135 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * generalized functions for code shared between kernel and userspace + * + * Authors: Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_kernel.h,v 1.39 2006/02/06 11:45:10 schickhj Exp $ + */ + +#ifndef _EHCA_KERNEL_H_ +#define _EHCA_KERNEL_H_ + +#define FALSE (1==0) +#define TRUE (1==1) + +#define big_little_target 0 /* needed for simulation */ +#include + +#include +#include "ehca_common.h" +#include "ehca_kernel.h" + +/** + * Handle to be used for adress translation mechanisms, currently a placeholder. + */ +struct ehca_bridge_handle { + int handle; +}; + +inline static int ehca_adr_bad(void *adr) +{ + return (adr == 0); +}; + +#ifdef EHCA_USERDRIVER +/* userspace replacement for kernel functions */ +#include "ehca_usermain.h" +#else /* USERDRIVER */ +/* kernel includes */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct ehca_cache { + kmem_cache_t *cache; + int size; +}; + +#ifdef __powerpc64__ +#include +#include +#include +#else +#endif + +#include + +#include + + +/** + * ehca_kv_to_g - Converts a kernel virtual address to visible addresses + * (i.e. a physical/absolute address). 
+ */ +inline static u64 ehca_kv_to_g(void *adr) +{ + u64 raddr; +#ifndef CONFIG_PPC64 + raddr = virt_to_phys((u64)adr); +#else + /* we need to find not only the physical address + * but the absolute to account for memory segmentation */ + raddr = virt_to_abs((u64)adr); +#endif + if (((u64)adr & VMALLOC_START) == VMALLOC_START) { + raddr = phys_to_abs((page_to_pfn(vmalloc_to_page(adr)) << + PAGE_SHIFT)); + } + return (raddr); +} + +#endif /* USERDRIVER */ +#include + + +#endif /* _EHCA_KERNEL_H_ */ diff --git a/drivers/infiniband/hw/ehca/ehca_tools.h b/drivers/infiniband/hw/ehca/ehca_tools.h new file mode 100644 index 0000000..915a0b7 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_tools.h @@ -0,0 +1,431 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * auxiliary functions + * + * Authors: Christoph Raisch + * Khadija Souissi + * Waleri Fomin + * Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_tools.h,v 1.43 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#ifndef EHCA_TOOLS_H +#define EHCA_TOOLS_H + +#include "ehca_flightrecorder.h" +#include "ehca_common.h" + +#define flightlog_value() mftb() + +#ifndef sizeofmember +#define sizeofmember(TYPE, MEMBER) (sizeof( ((TYPE *)0)->MEMBER)) +#endif + +#define EHCA_EDEB_TRACE_MASK_SIZE 32 +extern u8 ehca_edeb_mask[EHCA_EDEB_TRACE_MASK_SIZE]; +#define EDEB_ID_TO_U32(str4) (str4[3] | (str4[2] << 8) | (str4[1] << 16) | \ + (str4[0] << 24)) + +inline static u64 ehca_edeb_filter(const u32 level, + const u32 id, const u32 line) +{ + u64 ret = 0; + u32 filenr = 0; + u32 filter_level = 9; + u32 dynamic_level = 0; + /* This is code written for the gcc -O2 optimizer which should colapse + * to two single ints filter_level is the first level kicked out by + * compiler means trace everythin below 6. 
*/ + if (id == EDEB_ID_TO_U32("ehav")) { + filenr = 0x01; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("clas")) { + filenr = 0x02; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("cqeq")) { + filenr = 0x03; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("shca")) { + filenr = 0x05; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("eirq")) { + filenr = 0x06; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("lMad")) { + filenr = 0x07; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("mcas")) { + filenr = 0x08; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("mrmw")) { + filenr = 0x09; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("vpd ")) { + filenr = 0x0a; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("e_qp")) { + filenr = 0x0b; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("uqes")) { + filenr = 0x0c; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("PHYP")) { + filenr = 0x0d; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("snse")) { + filenr = 0x0e; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("iptz")) { + filenr = 0x0f; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("spta")) { + filenr = 0x10; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("simp")) { + filenr = 0x11; + filter_level = 8; + } + if (id == EDEB_ID_TO_U32("reqs")) { + filenr = 0x12; + filter_level = 8; + } + + if ((filenr - 1) > sizeof(ehca_edeb_mask)) { + filenr = 0; + } + + if (filenr == 0) { + filter_level = 9; + } /* default */ + ret = filenr * 0x10000 + line; + if (filter_level <= level) { + return (ret | 0x100000000); /* this is the flag to not trace */ + } + dynamic_level = ehca_edeb_mask[filenr]; + if (likely(dynamic_level <= level)) { + ret = ret | 0x100000000; + }; + return ret; +} + +#ifdef EHCA_USE_HCALL_KERNEL +#ifdef CONFIG_PPC_PSERIES + +#include + +/** + * IS_EDEB_ON - Checks if debug is on for the given level. 
+ */ +#define IS_EDEB_ON(level) \ + ((ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__) & 0x100000000)==0) + +#define EDEB_P_GENERIC(level,idstring,format,args...) \ +do { \ + u64 ehca_edeb_filterresult = \ + ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__);\ + if ((ehca_edeb_filterresult & 0x100000000) == 0) \ + printk("PU%04x %08x:%s " idstring " "format "\n", \ + get_paca()->paca_index, (u32)(ehca_edeb_filterresult), \ + __func__, ##args); \ + if (unlikely(ehca_edeb_mask[0x1e]!=0)) \ + ED_FLIGHT_LOG((((u64)(get_paca()->paca_index)<< 32) | \ + ((u64)(ehca_edeb_filterresult & 0xffffffff)) << 40 | \ + (flightlog_value()&0xffffffff)), args); \ +} while (1==0) + +#elif CONFIG_ARCH_S390 + +#include +#define EDEB_P_GENERIC(level,idstring,format,args...) \ +do { \ + u64 ehca_edeb_filterresult = \ + ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__);\ + if ((ehca_edeb_filterresult & 0x100000000) == 0) \ + printk("PU%04x %08x:%s " idstring " "format "\n", \ + smp_processor_id(), (u32)(ehca_edeb_filterresult), \ + __func__, ##args); \ +} while (1==0) + +#elif REAL_HCALL + +#define EDEB_P_GENERIC(level,idstring,format,args...) \ +do { \ + u64 ehca_edeb_filterresult = \ + ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__); \ + if ((ehca_edeb_filterresult & 0x100000000) == 0) \ + printk("%08x:%s " idstring " "format "\n", \ + (u32)(ehca_edeb_filterresult), \ + __func__, ##args); \ +} while (1==0) + +#endif +#else + +#define IS_EDEB_ON(level) (1) + +#define EDEB_P_GENERIC(level,idstring,format,args...) \ +do { \ + printk("%s " idstring " "format "\n", \ + __func__, ##args); \ +} while (1==0) + +#endif + +/** + * EDEB - Trace output macro. + * @level tracelevel + * @format optional format string, use "" if not desired + * @args printf like arguments for trace, use %Lx for u64, %x for u32 + * %p for pointer + */ +#define EDEB(level,format,args...) \ + EDEB_P_GENERIC(level,"",format,##args) +#define EDEB_ERR(level,format,args...) 
\ + EDEB_P_GENERIC(level,"HCAD_ERROR ",format,##args) +#define EDEB_EN(level,format,args...) \ + EDEB_P_GENERIC(level,">>>",format,##args) +#define EDEB_EX(level,format,args...) \ + EDEB_P_GENERIC(level,"<<<",format,##args) + +/** + * EDEB macro to dump a memory block, whose length is n*8 bytes. + * Each line has the following layout: + * adr=X ofs=Y <8 bytes hex> <8 bytes hex> + */ + +#define EDEB_DMP(level,adr,len,format,args...) \ + do { \ + unsigned int x; \ + unsigned int l = (unsigned int)(len); \ + unsigned char *deb = (unsigned char*)(adr); \ + for (x = 0; x < l; x += 16) { \ + EDEB(level, format " adr=%p ofs=%04x %016lx %016lx", \ + ##args, deb, x, *((u64 *)&deb[0]), *((u64 *)&deb[8])); \ + deb += 16; \ + } \ + } while (0) + +#define LOCATION __FILE__ " " + +/* define a bitmask, little endian version */ +#define EHCA_BMASK(pos,length) (((pos)<<16)+(length)) +/* define a bitmask, the ibm way... */ +#define EHCA_BMASK_IBM(from,to) (((63-to)<<16)+((to)-(from)+1)) +/* internal function, don't use */ +#define EHCA_BMASK_SHIFTPOS(mask) (((mask)>>16)&0xffff) +/* internal function, don't use */ +#define EHCA_BMASK_MASK(mask) (0xffffffffffffffffULL >> ((64-(mask))&0xffff)) +/* return value shifted and masked by mask\n + * variable|=HCA_BMASK_SET(MY_MASK,0x4711) ORs the bits in variable\n + * variable&=~HCA_BMASK_SET(MY_MASK,-1) clears the bits from the mask + * in variable + */ +#define EHCA_BMASK_SET(mask,value) \ + ((EHCA_BMASK_MASK(mask) & ((u64)(value)))<<EHCA_BMASK_SHIFTPOS(mask)) + +/* extract a parameter from value by mask */ +#define EHCA_BMASK_GET(mask,value) \ + (EHCA_BMASK_MASK(mask) & (((u64)(value))>>EHCA_BMASK_SHIFTPOS(mask))) + +/** + * ehca_fixme - Dummy function which will be removed in production code + * to find all todos by compiler. 
+ */ +void ehca_fixme(void); + +extern void exit(int); +inline static void ehca_catastrophic(char *str) +{ +#ifndef EHCA_USERDRIVER + printk(KERN_ERR "HCAD_ERROR %s\n", str); + ehca_flight_to_printk(); +#else + exit(1); +#endif +} + +#define PARANOIA_MODE +#ifdef PARANOIA_MODE + +#define EHCA_CHECK_ADR_P(adr) \ + if (unlikely(adr==0)) { \ + EDEB_ERR(4, "adr=%p check failed line %i", adr, \ + __LINE__); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_ADR(adr) \ + if (unlikely(adr==0)) { \ + EDEB_ERR(4, "adr=%p check failed line %i", adr, \ + __LINE__); \ + return -EFAULT; } + +#define EHCA_CHECK_DEVICE_P(device) \ + if (unlikely(device==0)) { \ + EDEB_ERR(4, "device=%p check failed", device); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_DEVICE(device) \ + if (unlikely(device==0)) { \ + EDEB_ERR(4, "device=%p check failed", device); \ + return -EFAULT; } + +#define EHCA_CHECK_PD(pd) \ + if (unlikely(pd==0)) { \ + EDEB_ERR(4, "pd=%p check failed", pd); \ + return -EFAULT; } + +#define EHCA_CHECK_PD_P(pd) \ + if (unlikely(pd==0)) { \ + EDEB_ERR(4, "pd=%p check failed", pd); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_AV(av) \ + if (unlikely(av==0)) { \ + EDEB_ERR(4, "av=%p check failed", av); \ + return -EFAULT; } + +#define EHCA_CHECK_AV_P(av) \ + if (unlikely(av==0)) { \ + EDEB_ERR(4, "av=%p check failed", av); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_CQ(cq) \ + if (unlikely(cq==0)) { \ + EDEB_ERR(4, "cq=%p check failed", cq); \ + return -EFAULT; } + +#define EHCA_CHECK_CQ_P(cq) \ + if (unlikely(cq==0)) { \ + EDEB_ERR(4, "cq=%p check failed", cq); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_EQ(eq) \ + if (unlikely(eq==0)) { \ + EDEB_ERR(4, "eq=%p check failed", eq); \ + return -EFAULT; } + +#define EHCA_CHECK_EQ_P(eq) \ + if (unlikely(eq==0)) { \ + EDEB_ERR(4, "eq=%p check failed", eq); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_QP(qp) \ + if (unlikely(qp==0)) { \ + EDEB_ERR(4, "qp=%p check failed", qp); \ + return 
-EFAULT; } + +#define EHCA_CHECK_QP_P(qp) \ + if (unlikely(qp==0)) { \ + EDEB_ERR(4, "qp=%p check failed", qp); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_MR(mr) \ + if (unlikely(mr==0)) { \ + EDEB_ERR(4, "mr=%p check failed", mr); \ + return -EFAULT; } + +#define EHCA_CHECK_MR_P(mr) \ + if (unlikely(mr==0)) { \ + EDEB_ERR(4, "mr=%p check failed", mr); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_MW(mw) \ + if (unlikely(mw==0)) { \ + EDEB_ERR(4, "mw=%p check failed", mw); \ + return -EFAULT; } + +#define EHCA_CHECK_MW_P(mw) \ + if (unlikely(mw==0)) { \ + EDEB_ERR(4, "mw=%p check failed", mw); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_CHECK_FMR(fmr) \ + if (unlikely(fmr==0)) { \ + EDEB_ERR(4, "fmr=%p check failed", fmr); \ + return -EFAULT; } + +#define EHCA_CHECK_FMR_P(fmr) \ + if (unlikely(fmr==0)) { \ + EDEB_ERR(4, "fmr=%p check failed", fmr); \ + return ERR_PTR(-EFAULT); } + +#define EHCA_REGISTER_PD(device,pd) +#define EHCA_REGISTER_AV(pd,av) +#define EHCA_DEREGISTER_PD(PD) +#define EHCA_DEREGISTER_AV(av) +#else +#define EHCA_CHECK_DEVICE_P(device) + +#define EHCA_CHECK_PD(pd) +#define EHCA_REGISTER_PD(device,pd) +#define EHCA_DEREGISTER_PD(PD) +#endif + +/** + * ehca2ib_return_code - Returns ib return code corresponding to the given + * ehca return code. 
+ */ +static inline int ehca2ib_return_code(u64 ehca_rc) +{ + switch (ehca_rc) { + case H_Success: + return 0; + case H_Busy: + return -EBUSY; + case H_NoMem: + return -ENOMEM; + default: + return -EINVAL; + } +} + +#endif /* EHCA_TOOLS_H */ From rolandd at cisco.com Sat Feb 18 11:57:45 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:45 -0800 Subject: [PATCH 15/22] ehca queue pair handling In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005745.13620.43256.stgit@localhost.localdomain> From: Roland Dreier --- drivers/infiniband/hw/ehca/ehca_qp.c | 1528 ++++++++++++++++++++++++++++++++++ 1 files changed, 1528 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c new file mode 100644 index 0000000..e5b1b80 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -0,0 +1,1528 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * QP functions + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Hoang-Nam Nguyen + * Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_qp.c,v 1.159 2006/02/15 15:01:24 nguyen Exp $ + */ + + +#define DEB_PREFIX "e_qp" + +#include "ehca_kernel.h" + +#include "ehca_classes.h" +#include "ehca_tools.h" +#include "hcp_if.h" +#include "ehca_qes.h" + +#include "ehca_iverbs.h" +#include +#include + +#include +#include + +/** @brief attributes not supported by query qp + */ +#define QP_ATTR_QUERY_NOT_SUPPORTED (IB_QP_MAX_DEST_RD_ATOMIC | \ + IB_QP_MAX_QP_RD_ATOMIC | \ + IB_QP_ACCESS_FLAGS | \ + IB_QP_EN_SQD_ASYNC_NOTIFY) + +/** @brief ehca (internal) qp state values + */ +enum ehca_qp_state { + EHCA_QPS_RESET = 1, + EHCA_QPS_INIT = 2, + EHCA_QPS_RTR = 3, + EHCA_QPS_RTS = 5, + EHCA_QPS_SQD = 6, + EHCA_QPS_SQE = 8, + EHCA_QPS_ERR = 128 +}; + +/** @brief qp state transitions as defined by IB Arch Rel 1.1 page 431 + */ +enum ib_qp_statetrans { + IB_QPST_ANY2RESET, + IB_QPST_ANY2ERR, + IB_QPST_RESET2INIT, + IB_QPST_INIT2RTR, + IB_QPST_INIT2INIT, + IB_QPST_RTR2RTS, + IB_QPST_RTS2SQD, + IB_QPST_RTS2RTS, + IB_QPST_SQD2RTS, + IB_QPST_SQE2RTS, + IB_QPST_SQD2SQD, + IB_QPST_MAX /* nr of transitions, this must be last!!! 
*/ +}; + +/** @brief returns ehca qp state corresponding to given ib qp state + */ +static inline enum ehca_qp_state ib2ehca_qp_state(enum ib_qp_state ib_qp_state) +{ + switch (ib_qp_state) { + case IB_QPS_RESET: + return EHCA_QPS_RESET; + case IB_QPS_INIT: + return EHCA_QPS_INIT; + case IB_QPS_RTR: + return EHCA_QPS_RTR; + case IB_QPS_RTS: + return EHCA_QPS_RTS; + case IB_QPS_SQD: + return EHCA_QPS_SQD; + case IB_QPS_SQE: + return EHCA_QPS_SQE; + case IB_QPS_ERR: + return EHCA_QPS_ERR; + default: + EDEB_ERR(4, "invalid ib_qp_state=%x", ib_qp_state); + return -EINVAL; + } +} + +/** @brief returns ib qp state corresponding to given ehca qp state + */ +static inline enum ib_qp_state ehca2ib_qp_state(enum ehca_qp_state + ehca_qp_state) +{ + switch (ehca_qp_state) { + case EHCA_QPS_RESET: + return IB_QPS_RESET; + case EHCA_QPS_INIT: + return IB_QPS_INIT; + case EHCA_QPS_RTR: + return IB_QPS_RTR; + case EHCA_QPS_RTS: + return IB_QPS_RTS; + case EHCA_QPS_SQD: + return IB_QPS_SQD; + case EHCA_QPS_SQE: + return IB_QPS_SQE; + case EHCA_QPS_ERR: + return IB_QPS_ERR; + default: + EDEB_ERR(4,"invalid ehca_qp_state=%x",ehca_qp_state); + return -EINVAL; + } +} + +/** @brief qp type + * used as index for req_attr and opt_attr of struct ehca_modqp_statetrans + */ +enum ehca_qp_type { + QPT_RC = 0, + QPT_UC = 1, + QPT_UD = 2, + QPT_SQP = 3, + QPT_MAX +}; + +/** @brief returns ehca qp type corresponding to ib qp type + */ +static inline enum ehca_qp_type ib2ehcaqptype(enum ib_qp_type ibqptype) +{ + switch (ibqptype) { + case IB_QPT_SMI: + case IB_QPT_GSI: + return QPT_SQP; + case IB_QPT_RC: + return QPT_RC; + case IB_QPT_UC: + return QPT_UC; + case IB_QPT_UD: + return QPT_UD; + default: + EDEB_ERR(4,"Invalid ibqptype=%x", ibqptype); + return -EINVAL; + } +} + +static inline enum ib_qp_statetrans get_modqp_statetrans(int ib_fromstate, + int ib_tostate) +{ + int index = -EINVAL; + switch (ib_tostate) { + case IB_QPS_RESET: + index = IB_QPST_ANY2RESET; + break; + case IB_QPS_INIT: + if 
(ib_fromstate == IB_QPS_RESET) { + index = IB_QPST_RESET2INIT; + } else if (ib_fromstate == IB_QPS_INIT) { + index = IB_QPST_INIT2INIT; + } + break; + case IB_QPS_RTR: + if (ib_fromstate == IB_QPS_INIT) { + index = IB_QPST_INIT2RTR; + } + break; + case IB_QPS_RTS: + if (ib_fromstate == IB_QPS_RTR) { + index = IB_QPST_RTR2RTS; + } else if (ib_fromstate == IB_QPS_RTS) { + index = IB_QPST_RTS2RTS; + } else if (ib_fromstate == IB_QPS_SQD) { + index = IB_QPST_SQD2RTS; + } else if (ib_fromstate == IB_QPS_SQE) { + index = IB_QPST_SQE2RTS; + } + break; + case IB_QPS_SQD: + if (ib_fromstate == IB_QPS_RTS) { + index = IB_QPST_RTS2SQD; + } + break; + case IB_QPS_SQE: + /* not allowed via mod qp */ + break; + case IB_QPS_ERR: + index = IB_QPST_ANY2ERR; + break; + default: + return -EINVAL; + } + + return index; +} + +/** @brief ehca service types + */ +enum ehca_service_type { + ST_RC = 0, + ST_UC = 1, + ST_RD = 2, + ST_UD = 3 +}; + +/** @brief returns hcp service type corresponding to given ib qp type + * used by create_qp() + */ +static inline int ibqptype2servicetype(enum ib_qp_type ibqptype) +{ + switch (ibqptype) { + case IB_QPT_SMI: + case IB_QPT_GSI: + return ST_UD; + case IB_QPT_RC: + return ST_RC; + case IB_QPT_UC: + return ST_UC; + case IB_QPT_UD: + return ST_UD; + case IB_QPT_RAW_IPV6: + return -EINVAL; + case IB_QPT_RAW_ETY: + return -EINVAL; + default: + EDEB_ERR(4, "Invalid ibqptype=%x", ibqptype); + return -EINVAL; + } +} + +/* init_qp_queues - Initializes/constructs r/squeue and registers queue pages. 
+ * returns 0 if successful, + * -EXXXX if not + */ +static inline int init_qp_queues(struct ipz_adapter_handle ipz_hca_handle, + struct ehca_qp *my_qp, + int nr_sq_pages, + int nr_rq_pages, + int swqe_size, + int rwqe_size, + int nr_send_sges, int nr_receive_sges) +{ + int ret = -EINVAL; + int cnt = 0; + void *vpage = NULL; + u64 rpage = 0; + int ipz_rc = -1; + u64 hipz_rc = H_Parameter; + + ipz_rc = ipz_queue_ctor(&my_qp->ehca_qp_core.ipz_squeue, + nr_sq_pages, + EHCA_PAGESIZE, swqe_size, nr_send_sges); + if (!ipz_rc) { + EDEB_ERR(4, "Cannot allocate page for squeue. ipz_rc=%x", + ipz_rc); + ret = -EBUSY; + return ret; + } + + ipz_rc = ipz_queue_ctor(&my_qp->ehca_qp_core.ipz_rqueue, + nr_rq_pages, + EHCA_PAGESIZE, rwqe_size, nr_receive_sges); + if (!ipz_rc) { + EDEB_ERR(4, "Cannot allocate page for rqueue. ipz_rc=%x", + ipz_rc); + ret = -EBUSY; + goto init_qp_queues0; + } + /* register SQ pages */ + for (cnt = 0; cnt < nr_sq_pages; cnt++) { + vpage = ipz_QPageit_get_inc(&my_qp->ehca_qp_core.ipz_squeue); + if (!vpage) { + EDEB_ERR(4, "SQ ipz_QPageit_get_inc() " + "failed p_vpage= %p", vpage); + ret = -EINVAL; + goto init_qp_queues1; + } + rpage = ehca_kv_to_g(vpage); + + hipz_rc = hipz_h_register_rpage_qp(ipz_hca_handle, + my_qp->ipz_qp_handle, + &my_qp->pf, 0, 0, /*TODO*/ + rpage, 1, + my_qp->ehca_qp_core.galpas.kernel); + if (hipz_rc < H_Success) { + EDEB_ERR(4,"SQ hipz_qp_register_rpage() faield " + " rc=%lx", hipz_rc); + ret = ehca2ib_return_code(hipz_rc); + goto init_qp_queues1; + } + /* for sq no need to check hipz_rc against + e.g. 
H_PAGE_REGISTERED */ + } + + ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_squeue); + + /* register RQ pages */ + for (cnt = 0; cnt < nr_rq_pages; cnt++) { + vpage = ipz_QPageit_get_inc(&my_qp->ehca_qp_core.ipz_rqueue); + if (!vpage) { + EDEB_ERR(4,"RQ ipz_QPageit_get_inc() " + "failed p_vpage = %p", vpage); + hipz_rc = H_Resource; + ret = -EINVAL; + goto init_qp_queues1; + } + + rpage = ehca_kv_to_g(vpage); + + hipz_rc = hipz_h_register_rpage_qp(ipz_hca_handle, + my_qp->ipz_qp_handle, + &my_qp->pf, 0, 1, /*TODO*/ + rpage, 1, + my_qp->ehca_qp_core.galpas. + kernel); + if (hipz_rc < H_Success) { + EDEB_ERR(4, "RQ hipz_qp_register_rpage() failed " + "rc=%lx", hipz_rc); + ret = ehca2ib_return_code(hipz_rc); + goto init_qp_queues1; + } + if (cnt == (nr_rq_pages - 1)) { /* last page! */ + if (hipz_rc != H_Success) { + EDEB_ERR(4,"RQ hipz_qp_register_rpage() " + "hipz_rc= %lx ", hipz_rc); + ret = ehca2ib_return_code(hipz_rc); + goto init_qp_queues1; + } + vpage = ipz_QPageit_get_inc(&my_qp->ehca_qp_core.ipz_rqueue); + if (vpage != NULL) { + EDEB_ERR(4,"ipz_QPageit_get_inc() " + "should not succeed vpage=%p", + vpage); + ret = -EINVAL; + goto init_qp_queues1; + } + } else { + if (hipz_rc != H_PAGE_REGISTERED) { + EDEB_ERR(4,"RQ hipz_qp_register_rpage() " + "hipz_rc= %lx ", hipz_rc); + ret = ehca2ib_return_code(hipz_rc); + goto init_qp_queues1; + } + } + } + + ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_rqueue); + + return 0; + + init_qp_queues1: + ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_rqueue); + init_qp_queues0: + ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_squeue); + return ret; +} + + +struct ib_qp *ehca_create_qp(struct ib_pd *pd, + struct ib_qp_init_attr *init_attr, + struct ib_udata *udata) +{ + static int da_msg_size[]={ 128, 256, 512, 1024, 2048, 4096 }; + int ret = -EINVAL; + int servicetype = 0; + int sigtype = 0; + + struct ehca_qp *my_qp = NULL; + struct ehca_pd *my_pd = NULL; + struct ehca_shca *shca = NULL; + struct ehca_cq *recv_ehca_cq = NULL; + struct ehca_cq 
*send_ehca_cq = NULL; + struct ib_ucontext *context = NULL; + u64 hipz_rc = H_Parameter; + int max_send_sge; + int max_recv_sge; + /* h_call's out parameters */ + u16 act_nr_send_wqes = 0, act_nr_receive_wqes = 0; + u8 act_nr_send_sges = 0, act_nr_receive_sges = 0; + u32 qp_nr = 0, + nr_sq_pages = 0, swqe_size = 0, rwqe_size = 0, nr_rq_pages = 0; + u8 daqp_completion; + u8 isdaqp; + EDEB_EN(7,"pd=%p init_attr=%p", pd, init_attr); + + EHCA_CHECK_PD_P(pd); + EHCA_CHECK_ADR_P(init_attr); + + if (init_attr->sq_sig_type != IB_SIGNAL_REQ_WR && + init_attr->sq_sig_type != IB_SIGNAL_ALL_WR) { + EDEB_ERR(4, "init_attr->sg_sig_type=%x not allowed", + init_attr->sq_sig_type); + return ERR_PTR(-EINVAL); + } + + /* save daqp completion bits */ + daqp_completion = init_attr->qp_type & 0x60; + /* save daqp bit */ + isdaqp = (init_attr->qp_type & 0x80) ? 1 : 0; + init_attr->qp_type = init_attr->qp_type & 0x1F; + + if (init_attr->qp_type != IB_QPT_UD && + init_attr->qp_type != IB_QPT_SMI && + init_attr->qp_type != IB_QPT_GSI && + init_attr->qp_type != IB_QPT_UC && + init_attr->qp_type != IB_QPT_RC) { + EDEB_ERR(4,"wrong QP Type=%x",init_attr->qp_type); + return ERR_PTR(-EINVAL); + } + if (init_attr->qp_type != IB_QPT_RC && isdaqp != 0) { + EDEB_ERR(4,"unsupported LL QP Type=%x",init_attr->qp_type); + return ERR_PTR(-EINVAL); + } + + if (pd->uobject && udata != NULL) { + context = pd->uobject->context; + } + + my_qp = ehca_qp_new(); + if (!my_qp) { + EDEB_ERR(4, "pd=%p not enough memory to alloc qp", pd); + return ERR_PTR(-ENOMEM); + } + + my_pd = container_of(pd, struct ehca_pd, ib_pd); + + shca = container_of(pd->device, struct ehca_shca, ib_device); + recv_ehca_cq = container_of(init_attr->recv_cq, struct ehca_cq, ib_cq); + send_ehca_cq = container_of(init_attr->send_cq, struct ehca_cq, ib_cq); + + my_qp->init_attr = *init_attr; + + do { + if (!idr_pre_get(&ehca_qp_idr, GFP_KERNEL)) { + ret = -ENOMEM; + EDEB_ERR(4, "Can't reserve idr resources."); + goto create_qp_exit0; + } + + 
down_write(&ehca_qp_idr_sem); + ret = idr_get_new(&ehca_qp_idr, my_qp, &my_qp->token); + up_write(&ehca_qp_idr_sem); + + } while (ret == -EAGAIN); + + if (ret) { + ret = -ENOMEM; + EDEB_ERR(4, "Can't allocate new idr entry."); + goto create_qp_exit0; + } + + servicetype = ibqptype2servicetype(init_attr->qp_type); + if (servicetype < 0) { + ret = -EINVAL; + EDEB_ERR(4, "Invalid qp_type=%x", init_attr->qp_type); + goto create_qp_exit0; + } + + if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) { + sigtype = HCALL_SIGT_EVERY; + } else { + sigtype = HCALL_SIGT_BY_WQE; + } + + /* UD_AV CIRCUMVENTION */ + max_send_sge=init_attr->cap.max_send_sge; + max_recv_sge=init_attr->cap.max_recv_sge; + if (IB_QPT_UD == init_attr->qp_type || + IB_QPT_GSI == init_attr->qp_type || + IB_QPT_SMI == init_attr->qp_type) { + max_send_sge += 2; + max_recv_sge += 2; + } + + EDEB(7, "isdaqp=%x daqp_completion=%x", isdaqp, daqp_completion); + + hipz_rc = hipz_h_alloc_resource_qp(shca->ipz_hca_handle, + &my_qp->pf, + servicetype, + isdaqp | daqp_completion, + sigtype, 0, /* no ud ad lkey ctrl */ + send_ehca_cq->ipz_cq_handle, + recv_ehca_cq->ipz_cq_handle, + shca->eq.ipz_eq_handle, + my_qp->token, + my_pd->fw_pd, + (u16) init_attr->cap.max_send_wr + 1, /* fixme(+1 ??) */ + (u16) init_attr->cap.max_recv_wr + 1, /* fixme(+1 ??) 
*/ + (u8) max_send_sge, + (u8) max_recv_sge, + 0, /* ignored if ud ad lkey ctrl is 0 */ + &my_qp->ipz_qp_handle, + &qp_nr, + &act_nr_send_wqes, + &act_nr_receive_wqes, + &act_nr_send_sges, + &act_nr_receive_sges, + &nr_sq_pages, + &nr_rq_pages, + &my_qp->ehca_qp_core.galpas); + if (hipz_rc != H_Success) { + EDEB_ERR(4, "h_alloc_resource_qp() failed rc=%lx", hipz_rc); + ret = ehca2ib_return_code(hipz_rc); + goto create_qp_exit1; + } + + /* store real qp_num as we got from ehca */ + my_qp->ehca_qp_core.real_qp_num = qp_nr; + + switch (init_attr->qp_type) { + case IB_QPT_RC: + if (isdaqp == 0) { + swqe_size = offsetof(struct ehca_wqe, + u.nud.sg_list[(act_nr_send_sges)]); + rwqe_size = offsetof(struct ehca_wqe, + u.nud.sg_list[(act_nr_receive_sges)]); + } else { /* for daqp we need to use msg size, not wqe size */ + swqe_size = da_msg_size[max_send_sge]; + rwqe_size = da_msg_size[max_recv_sge]; + act_nr_send_sges=1; + act_nr_receive_sges=1; + } + break; + case IB_QPT_UC: + swqe_size = offsetof(struct ehca_wqe, + u.nud.sg_list[(act_nr_send_sges)]); + rwqe_size = offsetof(struct ehca_wqe, + u.nud.sg_list[(act_nr_receive_sges)]); + break; + + case IB_QPT_UD: + case IB_QPT_GSI: + case IB_QPT_SMI: + /* UD circumvention */ + act_nr_receive_sges -= 2; + act_nr_send_sges -= 2; + swqe_size = offsetof(struct ehca_wqe, + u.ud_av.sg_list[(act_nr_send_sges)]); + rwqe_size = offsetof(struct ehca_wqe, + u.ud_av.sg_list[(act_nr_receive_sges)]); + + if (IB_QPT_GSI == init_attr->qp_type || + IB_QPT_SMI == init_attr->qp_type) { + act_nr_send_wqes = init_attr->cap.max_send_wr; + act_nr_receive_wqes = init_attr->cap.max_recv_wr; + act_nr_send_sges = init_attr->cap.max_send_sge; + act_nr_receive_sges = init_attr->cap.max_recv_sge; + qp_nr = (init_attr->qp_type == IB_QPT_SMI) ? 
0 : 1; + } + + break; + + default: + break; + } + + /* initializes r/squeue and registers queue pages */ + ret = init_qp_queues(shca->ipz_hca_handle, my_qp, + nr_sq_pages, nr_rq_pages, + swqe_size, rwqe_size, + act_nr_send_sges, act_nr_receive_sges); + if (ret != 0) { + EDEB_ERR(4,"Couldn't initialize r/squeue and pages ret=%x", + ret); + goto create_qp_exit2; + } + + my_qp->ib_qp.pd = &my_pd->ib_pd; + my_qp->ib_qp.device = my_pd->ib_pd.device; + + my_qp->ib_qp.recv_cq = init_attr->recv_cq; + my_qp->ib_qp.send_cq = init_attr->send_cq; + + my_qp->ib_qp.qp_num = qp_nr; + my_qp->ib_qp.qp_type = init_attr->qp_type; + + my_qp->ehca_qp_core.qp_type = init_attr->qp_type; + my_qp->ib_qp.srq = init_attr->srq; + + my_qp->ib_qp.qp_context = init_attr->qp_context; + my_qp->ib_qp.event_handler = init_attr->event_handler; + + init_attr->cap.max_inline_data = 0; /* not supported? */ + init_attr->cap.max_recv_sge = act_nr_receive_sges; + init_attr->cap.max_recv_wr = act_nr_receive_wqes; + init_attr->cap.max_send_sge = act_nr_send_sges; + init_attr->cap.max_send_wr = act_nr_send_wqes; + + /* TODO : define_apq0() not supported yet */ + if (init_attr->qp_type == IB_QPT_GSI) { + if ((hipz_rc = ehca_define_sqp(shca, my_qp, init_attr))) { + EDEB_ERR(4, "ehca_define_sqp() failed rc=%lx", hipz_rc); + ret = ehca2ib_return_code(hipz_rc); + goto create_qp_exit3; + } + } + + if (init_attr->send_cq != NULL) { + struct ehca_cq *cq = container_of(init_attr->send_cq, + struct ehca_cq, ib_cq); + ret = ehca_cq_assign_qp(cq, my_qp); + if (ret != 0) { + EDEB_ERR(4, "Couldn't assign qp to send_cq ret=%x", ret); + goto create_qp_exit3; + } + my_qp->send_cq = cq; + } + + /* copy queues, galpa data to user space */ + if (context != NULL && udata != NULL) { + struct ehca_create_qp_resp resp; + struct vm_area_struct * vma; + resp.qp_num = qp_nr; + resp.token = my_qp->token; + resp.ehca_qp_core = my_qp->ehca_qp_core; + + ehca_mmap_nopage(((u64) (my_qp->token) << 32) | 0x22000000, + 
my_qp->ehca_qp_core.ipz_rqueue.queue_length, + ((void**)&resp.ehca_qp_core.ipz_rqueue.queue), + &vma); + my_qp->uspace_rqueue = (u64)resp.ehca_qp_core.ipz_rqueue.queue; + ehca_mmap_nopage(((u64) (my_qp->token) << 32) | 0x23000000, + my_qp->ehca_qp_core.ipz_squeue.queue_length, + ((void**)&resp.ehca_qp_core.ipz_squeue.queue), + &vma); + my_qp->uspace_squeue = (u64)resp.ehca_qp_core.ipz_squeue.queue; + ehca_mmap_register(my_qp->ehca_qp_core.galpas.user.fw_handle, + ((void**)&resp.ehca_qp_core.galpas.kernel.fw_handle), + &vma); + my_qp->uspace_fwh = (u64)resp.ehca_qp_core.galpas.kernel.fw_handle; + + if (ib_copy_to_udata(udata, &resp, sizeof resp)) { + EDEB_ERR(4, "Copy to udata failed"); + ret = -EINVAL; + goto create_qp_exit3; + } + } + + EDEB_EX(7, "ehca_qp=%p qp_num=%x, token=%x", + my_qp, qp_nr, my_qp->token); + return (&my_qp->ib_qp); + + create_qp_exit3: + ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_rqueue); + ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_squeue); + + create_qp_exit2: + hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp); + + create_qp_exit1: + down_write(&ehca_qp_idr_sem); + idr_remove(&ehca_qp_idr, my_qp->token); + up_write(&ehca_qp_idr_sem); + + create_qp_exit0: + ehca_qp_delete(my_qp); + EDEB_EX(4, "failed ret=%x", ret); + return ERR_PTR(ret); + +} + +/** called by internal_modify_qp() at trans sqe -> rts: + * set purge bit of bad wqe and subsequent wqes to avoid reentering sqe + * @return total number of bad wqes in bad_wqe_cnt + */ +static int prepare_sqe_rts(struct ehca_qp *my_qp, struct ehca_shca *shca, + int *bad_wqe_cnt) +{ + int ret = 0; + u64 hipz_rc = H_Success; + struct ipz_queue *squeue = NULL; + void *bad_send_wqe_p = NULL; + void *bad_send_wqe_v = NULL; + void *squeue_start_p = NULL; + void *squeue_end_p = NULL; + void *squeue_start_v = NULL; + void *squeue_end_v = NULL; + struct ehca_wqe *wqe = NULL; + int qp_num = my_qp->ib_qp.qp_num; + + EDEB_EN(7, "ehca_qp=%p qp_num=%x ", my_qp, qp_num); + + /* get send wqe pointer */ + hipz_rc = 
hipz_h_disable_and_get_wqe(shca->ipz_hca_handle, + my_qp->ipz_qp_handle, &my_qp->pf, + &bad_send_wqe_p, NULL, 2); + if (hipz_rc != H_Success) { + EDEB_ERR(4, "hipz_h_disable_and_get_wqe() failed " + "ehca_qp=%p qp_num=%x hipz_rc=%lx", + my_qp, qp_num, hipz_rc); + ret = ehca2ib_return_code(hipz_rc); + goto prepare_sqe_rts_exit1; + } + bad_send_wqe_p = (void*)((u64)bad_send_wqe_p & (~(1L<<63))); + EDEB(7, "qp_num=%x bad_send_wqe_p=%p", qp_num, bad_send_wqe_p); + /* convert wqe pointer to vadr */ + bad_send_wqe_v = abs_to_virt((u64)bad_send_wqe_p); + EDEB_DMP(6, bad_send_wqe_v, 32, "qp_num=%x bad_wqe", qp_num); + + squeue = &my_qp->ehca_qp_core.ipz_squeue; + squeue_start_p = (void*)ehca_kv_to_g(squeue->queue); + squeue_end_p = squeue_start_p+squeue->queue_length; + squeue_start_v = abs_to_virt((u64)squeue_start_p); + squeue_end_v = abs_to_virt((u64)squeue_end_p); + EDEB(6, "qp_num=%x squeue_start_v=%p squeue_end_v=%p", + qp_num, squeue_start_v, squeue_end_v); + + /* loop sets wqe's purge bit */ + wqe = (struct ehca_wqe*)bad_send_wqe_v; + *bad_wqe_cnt = 0; + while (wqe->optype != 0xff && wqe->wqef != 0xff) { + EDEB_DMP(6, wqe, 32, "qp_num=%x wqe", qp_num); + wqe->nr_of_data_seg = 0; /* suppress data access */ + wqe->wqef = WQEF_PURGE; /* WQE to be purged */ + wqe = (struct ehca_wqe*)((u8*)wqe+squeue->qe_size); + *bad_wqe_cnt = (*bad_wqe_cnt)+1; + if ((void*)wqe >= squeue_end_v) { + wqe = squeue_start_v; + } + } /* eof while wqe */ + /* bad wqe will be reprocessed and ignored when poll_cq() is called, + i.e. nr of wqes with flush error status is one less */ + EDEB(6, "qp_num=%x flusherr_wqe_cnt=%x", qp_num, (*bad_wqe_cnt)-1); + wqe->wqef = 0; + + prepare_sqe_rts_exit1: + + EDEB_EX(7, "ehca_qp=%p qp_num=%x ret=%x", my_qp, qp_num, ret); + return ret; +} + +/** @brief internal modify qp with circumvention to handle aqp0 properly + * smi_reset2init indicates if this is an internal reset-to-init-call for + * smi. This flag must always be zero if called from ehca_modify_qp()!
+ * This internal func was introduced to avoid recursion of ehca_modify_qp()! + */ +static int internal_modify_qp(struct ib_qp *ibqp, + struct ib_qp_attr *attr, + int attr_mask, int smi_reset2init) +{ + enum ib_qp_state qp_cur_state = 0, qp_new_state = 0; + int cnt = 0, qp_attr_idx = 0, retcode = 0; + + enum ib_qp_statetrans statetrans; + struct hcp_modify_qp_control_block *mqpcb = NULL; + struct ehca_qp *my_qp = NULL; + struct ehca_shca *shca = NULL; + u64 update_mask = 0; + u64 hipz_rc = H_Success; + int bad_wqe_cnt = 0; + int squeue_locked = 0; + unsigned long spl_flags = 0; + + my_qp = container_of(ibqp, struct ehca_qp, ib_qp); + shca = container_of(ibqp->pd->device, struct ehca_shca, ib_device); + + EDEB_EN(7, "ehca_qp=%p qp_num=%x ibqp_type=%x " + "new qp_state=%x attribute_mask=%x", + my_qp, ibqp->qp_num, ibqp->qp_type, + attr->qp_state, attr_mask); + + /* do query_qp to obtain current attr values */ + mqpcb = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (mqpcb == NULL) { + retcode = -ENOMEM; + EDEB_ERR(4, "Could not get zeroed page for mqpcb " + "ehca_qp=%p qp_num=%x ", my_qp, ibqp->qp_num); + goto modify_qp_exit0; + } + memset(mqpcb, 0, PAGE_SIZE); + + hipz_rc = hipz_h_query_qp(shca->ipz_hca_handle, + my_qp->ipz_qp_handle, + &my_qp->pf, + mqpcb, my_qp->ehca_qp_core.galpas.kernel); + if (hipz_rc != H_Success) { + EDEB_ERR(4, "hipz_h_query_qp() failed " + "ehca_qp=%p qp_num=%x hipz_rc=%lx", + my_qp, ibqp->qp_num, hipz_rc); + retcode = ehca2ib_return_code(hipz_rc); + goto modify_qp_exit1; + } + EDEB(7, "ehca_qp=%p qp_num=%x ehca_qp_state=%x", + my_qp, ibqp->qp_num, mqpcb->qp_state); + + qp_cur_state = ehca2ib_qp_state(mqpcb->qp_state); + + if (qp_cur_state == -EINVAL) { /* invalid qp state */ + retcode = -EINVAL; + EDEB_ERR(4, "Invalid current ehca_qp_state=%x " + "ehca_qp=%p qp_num=%x", + mqpcb->qp_state, my_qp, ibqp->qp_num); + goto modify_qp_exit1; + } + /* circumvention to set aqp0 initial state to init + as expected by IB spec */ + if (smi_reset2init == 0 && +
ibqp->qp_type == IB_QPT_SMI && + qp_cur_state == IB_QPS_RESET && + (attr_mask & IB_QP_STATE) + && attr->qp_state == IB_QPS_INIT) { /* RESET -> INIT */ + struct ib_qp_attr smiqp_attr = { + .qp_state = IB_QPS_INIT, + .port_num = my_qp->init_attr.port_num, + .pkey_index = 0, + .qkey = 0 + }; + int smiqp_attr_mask = IB_QP_STATE | IB_QP_PORT | + IB_QP_PKEY_INDEX | IB_QP_QKEY; + int smirc = internal_modify_qp( + ibqp, &smiqp_attr, smiqp_attr_mask, 1); + if (smirc != 0) { + EDEB_ERR(4, "SMI RESET -> INIT failed. " + "ehca_modify_qp() rc=%x", smirc); + retcode = H_Parameter; + goto modify_qp_exit1; + } + qp_cur_state = IB_QPS_INIT; + EDEB(7, "SMI RESET -> INIT succeeded"); + } + /* is transmitted current state equal to "real" current state */ + if (attr_mask & IB_QP_CUR_STATE) { + if (qp_cur_state != attr->cur_qp_state) { + retcode = -EINVAL; + EDEB_ERR(4, "Invalid IB_QP_CUR_STATE " + "attr->curr_qp_state=%x <> " + "actual cur_qp_state=%x. " + "ehca_qp=%p qp_num=%x", + attr->cur_qp_state, qp_cur_state, + my_qp, ibqp->qp_num); + goto modify_qp_exit1; + } + } + + EDEB(7, "ehca_qp=%p qp_num=%x current qp_state=%x " + "new qp_state=%x attribute_mask=%x", + my_qp, ibqp->qp_num, qp_cur_state, attr->qp_state, attr_mask); + + qp_new_state = attr_mask & IB_QP_STATE ? 
attr->qp_state : qp_cur_state; + if (!smi_reset2init && + !ib_modify_qp_is_ok(qp_cur_state, qp_new_state, ibqp->qp_type, + attr_mask)) { + retcode = -EINVAL; + EDEB_ERR(4, "Invalid qp transition new_state=%x cur_state=%x " + "ehca_qp=%p qp_num=%x attr_mask=%x", + qp_new_state, qp_cur_state, my_qp, ibqp->qp_num, + attr_mask); + goto modify_qp_exit1; + } + + if ((mqpcb->qp_state = ib2ehca_qp_state(qp_new_state))) { + update_mask = EHCA_BMASK_SET(MQPCB_MASK_QP_STATE, 1); + } else { + retcode = -EINVAL; + EDEB_ERR(4, "Invalid new qp state=%x " + "ehca_qp=%p qp_num=%x", + qp_new_state, my_qp, ibqp->qp_num); + goto modify_qp_exit1; + } + + /* retrieve state transition struct to get req and opt attrs */ + statetrans = get_modqp_statetrans(qp_cur_state, qp_new_state); + if (statetrans < 0) { + retcode = -EINVAL; + EDEB_ERR(4, " qp_cur_state=%x " + "new_qp_state=%x State_xsition=%x " + "ehca_qp=%p qp_num=%x", + qp_cur_state, qp_new_state, + statetrans, my_qp, ibqp->qp_num); + goto modify_qp_exit1; + } + + qp_attr_idx = ib2ehcaqptype(ibqp->qp_type); + + if (qp_attr_idx < 0) { + retcode = qp_attr_idx; + EDEB_ERR(4, "Invalid QP type=%x ehca_qp=%p qp_num=%x", + ibqp->qp_type, my_qp, ibqp->qp_num); + goto modify_qp_exit1; + } + + EDEB(7, "ehca_qp=%p qp_num=%x qp_state_xsit=%x", + my_qp, ibqp->qp_num, statetrans); + + /* sqe -> rts: set purge bit of bad wqe before actual trans */ + if ((my_qp->ehca_qp_core.qp_type == IB_QPT_UD + || my_qp->ehca_qp_core.qp_type == IB_QPT_GSI + || my_qp->ehca_qp_core.qp_type == IB_QPT_SMI) + && statetrans == IB_QPST_SQE2RTS) { + /* mark next free wqe if kernel */ + if (my_qp->uspace_squeue == 0) { + struct ehca_wqe *wqe = NULL; + /* lock send queue */ + spin_lock_irqsave(&my_qp->spinlock_s, spl_flags); + squeue_locked = 1; + /* mark next free wqe */ + wqe=(struct ehca_wqe*) + my_qp->ehca_qp_core.ipz_squeue.current_q_addr; + wqe->optype = wqe->wqef = 0xff; + EDEB(7, "qp_num=%x next_free_wqe=%p", + ibqp->qp_num, wqe); + } + retcode = 
prepare_sqe_rts(my_qp, shca, &bad_wqe_cnt); + if (retcode != 0) { + EDEB_ERR(4, "prepare_sqe_rts() failed " + "ehca_qp=%p qp_num=%x ret=%x", + my_qp, ibqp->qp_num, retcode); + goto modify_qp_exit2; + } + } + + /* enable RDMA_Atomic_Control if reset->init and reliable connection + this is necessary since gen2 does not provide that flag, + but pHyp requires it */ + if (statetrans == IB_QPST_RESET2INIT && + (ibqp->qp_type == IB_QPT_RC || ibqp->qp_type == IB_QPT_UC)) { + mqpcb->rdma_atomic_ctrl = 3; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RDMA_ATOMIC_CTRL, 1); + } + /* circ. pHyp requires #RDMA/Atomic Responder Resources for UC INIT -> RTR */ + if (statetrans == IB_QPST_INIT2RTR && + (ibqp->qp_type == IB_QPT_UC) && + !(attr_mask & IB_QP_MAX_DEST_RD_ATOMIC)) { + mqpcb->rdma_nr_atomic_resp_res = 1; /* default to 1 */ + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES, 1); + } + + if (attr_mask & IB_QP_PKEY_INDEX) { + mqpcb->prim_p_key_idx = attr->pkey_index; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_P_KEY_IDX, 1); + EDEB(7, "ehca_qp=%p qp_num=%x " + "IB_QP_PKEY_INDEX update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_PORT) { + if (attr->port_num < 1 || attr->port_num > shca->num_ports) { + retcode = -EINVAL; + EDEB_ERR(4, "Invalid port=%x.
" + "ehca_qp=%p qp_num=%x num_ports=%x", + attr->port_num, my_qp, ibqp->qp_num, + shca->num_ports); + goto modify_qp_exit2; + } + mqpcb->prim_phys_port = attr->port_num; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_PHYS_PORT, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_PORT update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_QKEY) { + mqpcb->qkey = attr->qkey; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_QKEY, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_QKEY update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_AV) { + mqpcb->dlid = attr->ah_attr.dlid; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DLID, 1); + mqpcb->source_path_bits = attr->ah_attr.src_path_bits; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SOURCE_PATH_BITS, 1); + mqpcb->service_level = attr->ah_attr.sl; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SERVICE_LEVEL, 1); + mqpcb->max_static_rate = attr->ah_attr.static_rate; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_MAX_STATIC_RATE, 1); + + /* only if GRH is TRUE we might consider SOURCE_GID_IDX and DEST_GID + * otherwise phype will return H_ATTR_PARM!!! 
+ */ + if (attr->ah_attr.ah_flags == IB_AH_GRH) { + mqpcb->send_grh_flag = 1 << 31; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_SEND_GRH_FLAG, 1); + mqpcb->source_gid_idx = attr->ah_attr.grh.sgid_index; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_SOURCE_GID_IDX, 1); + + for (cnt = 0; cnt < 16; cnt++) { + mqpcb->dest_gid.byte[cnt] = + attr->ah_attr.grh.dgid.raw[cnt]; + } + + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DEST_GID, 1); + mqpcb->flow_label = attr->ah_attr.grh.flow_label; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_FLOW_LABEL, 1); + mqpcb->hop_limit = attr->ah_attr.grh.hop_limit; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_HOP_LIMIT, 1); + mqpcb->traffic_class = attr->ah_attr.grh.traffic_class; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_TRAFFIC_CLASS, 1); + } + + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_AV update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + + if (attr_mask & IB_QP_PATH_MTU) { + mqpcb->path_mtu = attr->path_mtu; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PATH_MTU, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_PATH_MTU update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_TIMEOUT) { + mqpcb->timeout = attr->timeout; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_TIMEOUT, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_TIMEOUT update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_RETRY_CNT) { + mqpcb->retry_count = attr->retry_cnt; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RETRY_COUNT, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_RETRY_CNT update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_RNR_RETRY) { + mqpcb->rnr_retry_count = attr->rnr_retry; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RNR_RETRY_COUNT, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_RNR_RETRY update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_RQ_PSN) { + mqpcb->receive_psn = attr->rq_psn; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RECEIVE_PSN, 1); + EDEB(7, "ehca_qp=%p 
qp_num=%x IB_QP_RQ_PSN update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_MAX_DEST_RD_ATOMIC) { + /* @TODO CHECK THIS with our spec */ + mqpcb->rdma_nr_atomic_resp_res = attr->max_dest_rd_atomic; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_MAX_DEST_RD_ATOMIC " + "update_mask=%lx", my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_MAX_QP_RD_ATOMIC) { + /* @TODO CHECK THIS with our spec */ + mqpcb->rdma_atomic_outst_dest_qp = attr->max_rd_atomic; + update_mask |= + EHCA_BMASK_SET + (MQPCB_MASK_RDMA_ATOMIC_OUTST_DEST_QP, 1); + EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_MAX_QP_RD_ATOMIC " + "update_mask=%lx", my_qp, ibqp->qp_num, update_mask); + } + if (attr_mask & IB_QP_ALT_PATH) { + mqpcb->dlid_al = attr->alt_ah_attr.dlid; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DLID_AL, 1); + mqpcb->source_path_bits_al = attr->alt_ah_attr.src_path_bits; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_SOURCE_PATH_BITS_AL, 1); + mqpcb->service_level_al = attr->alt_ah_attr.sl; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SERVICE_LEVEL_AL, 1); + mqpcb->max_static_rate_al = attr->alt_ah_attr.static_rate; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_MAX_STATIC_RATE_AL, 1); + + /* only if GRH is TRUE we might consider SOURCE_GID_IDX and DEST_GID + * otherwise phype will return H_ATTR_PARM!!! 
+ */ + if (attr->alt_ah_attr.ah_flags == IB_AH_GRH) { + mqpcb->send_grh_flag_al = 1 << 31; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_SEND_GRH_FLAG_AL, 1); + mqpcb->source_gid_idx_al = + attr->alt_ah_attr.grh.sgid_index; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_SOURCE_GID_IDX_AL, 1); + + for (cnt = 0; cnt < 16; cnt++) { + mqpcb->dest_gid_al.byte[cnt] = + attr->alt_ah_attr.grh.dgid.raw[cnt]; + } + + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_DEST_GID_AL, 1); + mqpcb->flow_label_al = attr->alt_ah_attr.grh.flow_label; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_FLOW_LABEL_AL, 1); + mqpcb->hop_limit_al = attr->alt_ah_attr.grh.hop_limit; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_HOP_LIMIT_AL, 1); + mqpcb->traffic_class_al = + attr->alt_ah_attr.grh.traffic_class; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_TRAFFIC_CLASS_AL, 1); + } + + EDEB(7, "ehca_qp=%p qp_num=%x " + "IB_QP_ALT_PATH update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + + if (attr_mask & IB_QP_MIN_RNR_TIMER) { + mqpcb->min_rnr_nak_timer_field = attr->min_rnr_timer; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_MIN_RNR_NAK_TIMER_FIELD, 1); + EDEB(7, "ehca_qp=%p qp_num=%x " + "IB_QP_MIN_RNR_TIMER update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + + if (attr_mask & IB_QP_SQ_PSN) { + mqpcb->send_psn = attr->sq_psn; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SEND_PSN, 1); + EDEB(7, "ehca_qp=%p qp_num=%x " + "IB_QP_SQ_PSN update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + + if (attr_mask & IB_QP_DEST_QPN) { + mqpcb->dest_qp_nr = attr->dest_qp_num; + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DEST_QP_NR, 1); + EDEB(7, "ehca_qp=%p qp_num=%x " + "IB_QP_DEST_QPN update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + } + + if (attr_mask & IB_QP_PATH_MIG_STATE) { + mqpcb->path_migration_state = attr->path_mig_state; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_PATH_MIGRATION_STATE, 1); + EDEB(7, "ehca_qp=%p qp_num=%x " + "IB_QP_PATH_MIG_STATE update_mask=%lx", my_qp, + 
ibqp->qp_num, update_mask); + } + + if (attr_mask & IB_QP_CAP) { + mqpcb->max_nr_outst_send_wr = attr->cap.max_send_wr+1; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_MAX_NR_OUTST_SEND_WR, 1); + mqpcb->max_nr_outst_recv_wr = attr->cap.max_recv_wr+1; + update_mask |= + EHCA_BMASK_SET(MQPCB_MASK_MAX_NR_OUTST_RECV_WR, 1); + EDEB(7, "ehca_qp=%p qp_num=%x " + "IB_QP_CAP update_mask=%lx", + my_qp, ibqp->qp_num, update_mask); + /* TODO no support for max_send/recv_sge??? */ + } + + EDEB_DMP(7, mqpcb, 4*70, "ehca_qp=%p qp_num=%x", my_qp, ibqp->qp_num); + + hipz_rc = hipz_h_modify_qp(shca->ipz_hca_handle, + my_qp->ipz_qp_handle, + &my_qp->pf, + update_mask, + mqpcb, my_qp->ehca_qp_core.galpas.kernel); + + if (hipz_rc != H_Success) { + retcode = ehca2ib_return_code(hipz_rc); + EDEB_ERR(4, "hipz_h_modify_qp() failed rc=%lx " + "ehca_qp=%p qp_num=%x", + hipz_rc, my_qp, ibqp->qp_num); + goto modify_qp_exit2; + } + + if ((my_qp->ehca_qp_core.qp_type == IB_QPT_UD + || my_qp->ehca_qp_core.qp_type == IB_QPT_GSI + || my_qp->ehca_qp_core.qp_type == IB_QPT_SMI) + && statetrans == IB_QPST_SQE2RTS) { + /* doorbell to reprocessing wqes */ + iosync(); /* serialize GAL register access */ + hipz_update_SQA(&my_qp->ehca_qp_core, bad_wqe_cnt-1); + EDEB(6, "doorbell for %x wqes", bad_wqe_cnt); + } + + if (statetrans == IB_QPST_RESET2INIT || + statetrans == IB_QPST_INIT2INIT) { + mqpcb->qp_enable = TRUE; + mqpcb->qp_state = EHCA_QPS_INIT; + update_mask = 0; + update_mask = EHCA_BMASK_SET(MQPCB_MASK_QP_ENABLE, 1); + + EDEB(7, "ehca_qp=%p qp_num=%x " + "RESET_2_INIT needs an additional enable " + "-> update_mask=%lx", my_qp, ibqp->qp_num, update_mask); + + hipz_rc = hipz_h_modify_qp(shca->ipz_hca_handle, + my_qp->ipz_qp_handle, + &my_qp->pf, + update_mask, + mqpcb, + my_qp->ehca_qp_core.galpas.kernel); + + if (hipz_rc != H_Success) { + retcode = ehca2ib_return_code(hipz_rc); + EDEB_ERR(4, "ENABLE in context of " + "RESET_2_INIT failed! 
" + "Maybe you didn't get a LID" + "hipz_rc=%lx ehca_qp=%p qp_num=%x", + hipz_rc, my_qp, ibqp->qp_num); + goto modify_qp_exit2; + } + } + + if (statetrans == IB_QPST_ANY2RESET) { + ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_rqueue); + ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_squeue); + } + + if (attr_mask & IB_QP_QKEY) { + my_qp->ehca_qp_core.qkey = attr->qkey; + } + + modify_qp_exit2: + if (squeue_locked) { /* this means: sqe -> rts */ + spin_unlock_irqrestore(&my_qp->spinlock_s, spl_flags); + my_qp->sqerr_purgeflag = 1; + } + + modify_qp_exit1: + kfree(mqpcb); + + modify_qp_exit0: + EDEB_EX(7, "ehca_qp=%p qp_num=%x ibqp_type=%x retcode=%x", + my_qp, ibqp->qp_num, ibqp->qp_type, retcode); + return retcode; +} + +int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask) +{ + int ret = 0; + struct ehca_qp *my_qp = NULL; + + EHCA_CHECK_ADR(ibqp); + EHCA_CHECK_ADR(attr); + EHCA_CHECK_ADR(ibqp->device); + + my_qp = container_of(ibqp, struct ehca_qp, ib_qp); + + EDEB_EN(7, "ehca_qp=%p qp_num=%x ibqp_type=%x attr_mask=%x", + my_qp, ibqp->qp_num, ibqp->qp_type, attr_mask); + + ret = internal_modify_qp(ibqp, attr, attr_mask, 0); + + EDEB_EX(7, "ehca_qp=%p qp_num=%x ibqp_type=%x ret=%x", + my_qp, ibqp->qp_num, ibqp->qp_type, ret); + return ret; +} + +int ehca_query_qp(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr) +{ + struct ehca_qp *my_qp = NULL; + struct ehca_shca *shca = NULL; + struct hcp_modify_qp_control_block *qpcb = NULL; + + struct ipz_adapter_handle adapter_handle; + int cnt = 0, retcode = 0; + u64 hipz_rc = H_Success; + + EHCA_CHECK_ADR(qp); + EHCA_CHECK_ADR(qp_attr); + EHCA_CHECK_DEVICE(qp->device); + + my_qp = container_of(qp, struct ehca_qp, ib_qp); + + EDEB_EN(7, "ehca_qp=%p qp_num=%x " + "qp_attr=%p qp_attr_mask=%x qp_init_attr=%p", + my_qp, qp->qp_num, qp_attr, qp_attr_mask, qp_init_attr); + + shca = container_of(qp->device, struct ehca_shca, ib_device); + adapter_handle = 
shca->ipz_hca_handle; + + if (qp_attr_mask & QP_ATTR_QUERY_NOT_SUPPORTED) { + retcode = -EINVAL; + EDEB_ERR(4,"Invalid attribute mask " + "ehca_qp=%p qp_num=%x qp_attr_mask=%x ", + my_qp, qp->qp_num, qp_attr_mask); + goto query_qp_exit0; + } + + qpcb = kmalloc(EHCA_PAGESIZE, GFP_KERNEL ); + + if (qpcb == NULL) { + retcode = -ENOMEM; + EDEB_ERR(4,"Out of memory for qpcb " + "ehca_qp=%p qp_num=%x", my_qp, qp->qp_num); + goto query_qp_exit0; + } + memset(qpcb, 0, sizeof(*qpcb)); + + hipz_rc = hipz_h_query_qp(adapter_handle, + my_qp->ipz_qp_handle, + &my_qp->pf, + qpcb, my_qp->ehca_qp_core.galpas.kernel); + + if (hipz_rc != H_Success) { + retcode = ehca2ib_return_code(hipz_rc); + EDEB_ERR(4,"hipz_h_query_qp() failed " + "ehca_qp=%p qp_num=%x hipz_rc=%lx", + my_qp, qp->qp_num, hipz_rc); + goto query_qp_exit1; + } + + qp_attr->cur_qp_state = ehca2ib_qp_state(qpcb->qp_state); + qp_attr->qp_state = qp_attr->cur_qp_state; + if (qp_attr->cur_qp_state == -EINVAL) { + retcode = -EINVAL; + EDEB_ERR(4,"Got invalid ehca_qp_state=%x " + "ehca_qp=%p qp_num=%x", + qpcb->qp_state, my_qp, qp->qp_num); + goto query_qp_exit1; + } + + if (qp_attr->qp_state == IB_QPS_SQD) { + qp_attr->sq_draining = TRUE; + } + + qp_attr->qkey = qpcb->qkey; + qp_attr->path_mtu = qpcb->path_mtu; + qp_attr->path_mig_state = qpcb->path_migration_state; + qp_attr->rq_psn = qpcb->receive_psn; + qp_attr->sq_psn = qpcb->send_psn; + qp_attr->min_rnr_timer = qpcb->min_rnr_nak_timer_field; + qp_attr->cap.max_send_wr = qpcb->max_nr_outst_send_wr-1; + qp_attr->cap.max_recv_wr = qpcb->max_nr_outst_recv_wr-1; + /* UD_AV CIRCUMVENTION */ + if (my_qp->ehca_qp_core.qp_type == IB_QPT_UD) { + qp_attr->cap.max_send_sge = + qpcb->actual_nr_sges_in_sq_wqe - 2; + qp_attr->cap.max_recv_sge = + qpcb->actual_nr_sges_in_rq_wqe - 2; + } else { + qp_attr->cap.max_send_sge = + qpcb->actual_nr_sges_in_sq_wqe; + qp_attr->cap.max_recv_sge = + qpcb->actual_nr_sges_in_rq_wqe; + } + + qp_attr->cap.max_inline_data = 
my_qp->sq_max_inline_data_size; + qp_attr->dest_qp_num = qpcb->dest_qp_nr; + + qp_attr->pkey_index = + EHCA_BMASK_GET(MQPCB_PRIM_P_KEY_IDX, qpcb->prim_p_key_idx); + + qp_attr->port_num = + EHCA_BMASK_GET(MQPCB_PRIM_PHYS_PORT, qpcb->prim_phys_port); + + qp_attr->timeout = qpcb->timeout; + qp_attr->retry_cnt = qpcb->retry_count; + qp_attr->rnr_retry = qpcb->rnr_retry_count; + + qp_attr->alt_pkey_index = + EHCA_BMASK_GET(MQPCB_PRIM_P_KEY_IDX, qpcb->alt_p_key_idx); + + qp_attr->alt_port_num = qpcb->alt_phys_port; + qp_attr->alt_timeout = qpcb->timeout_al; + + /* primary av */ + qp_attr->ah_attr.sl = qpcb->service_level; + + if (qpcb->send_grh_flag) { + qp_attr->ah_attr.ah_flags = IB_AH_GRH; + } + + qp_attr->ah_attr.static_rate = qpcb->max_static_rate; + qp_attr->ah_attr.dlid = qpcb->dlid; + qp_attr->ah_attr.src_path_bits = qpcb->source_path_bits; + qp_attr->ah_attr.port_num = qp_attr->port_num; + + /* primary GRH */ + qp_attr->ah_attr.grh.traffic_class = qpcb->traffic_class; + qp_attr->ah_attr.grh.hop_limit = qpcb->hop_limit; + qp_attr->ah_attr.grh.sgid_index = qpcb->source_gid_idx; + qp_attr->ah_attr.grh.flow_label = qpcb->flow_label; + + for (cnt = 0; cnt < 16; cnt++) { + qp_attr->ah_attr.grh.dgid.raw[cnt] = + qpcb->dest_gid.byte[cnt]; + } + + /* alternate AV */ + qp_attr->alt_ah_attr.sl = qpcb->service_level_al; + if (qpcb->send_grh_flag_al) { + qp_attr->alt_ah_attr.ah_flags = IB_AH_GRH; + } + + qp_attr->alt_ah_attr.static_rate = qpcb->max_static_rate_al; + qp_attr->alt_ah_attr.dlid = qpcb->dlid_al; + qp_attr->alt_ah_attr.src_path_bits = qpcb->source_path_bits_al; + + /* alternate GRH */ + qp_attr->alt_ah_attr.grh.traffic_class = qpcb->traffic_class_al; + qp_attr->alt_ah_attr.grh.hop_limit = qpcb->hop_limit_al; + qp_attr->alt_ah_attr.grh.sgid_index = qpcb->source_gid_idx_al; + qp_attr->alt_ah_attr.grh.flow_label = qpcb->flow_label_al; + + for (cnt = 0; cnt < 16; cnt++) { + qp_attr->alt_ah_attr.grh.dgid.raw[cnt] = + qpcb->dest_gid_al.byte[cnt]; + } + + /* return init 
attributes given in ehca_create_qp */ + if (qp_init_attr != NULL) { + *qp_init_attr = my_qp->init_attr; + } + + EDEB(7, "ehca_qp=%p qp_number=%x dest_qp_number=%x " + "dlid=%x path_mtu=%x dest_gid=%lx_%lx " + "service_level=%x qp_state=%x", + my_qp, qpcb->qp_number, qpcb->dest_qp_nr, + qpcb->dlid, qpcb->path_mtu, + qpcb->dest_gid.dw[0], qpcb->dest_gid.dw[1], + qpcb->service_level, qpcb->qp_state); + + EDEB_DMP(7, qpcb, 4*70, "ehca_qp=%p qp_num=%x", my_qp, qp->qp_num); + + query_qp_exit1: + kfree(qpcb); + + query_qp_exit0: + EDEB_EX(7, "ehca_qp=%p qp_num=%x retcode=%x", + my_qp, qp->qp_num, retcode); + return retcode; +} + +int ehca_destroy_qp(struct ib_qp *ibqp) +{ + struct ehca_qp *my_qp = NULL; + struct ehca_shca *shca = NULL; + struct ehca_pfqp *qp_pf = NULL; + u32 qp_num = 0; + int retcode = 0; + u64 hipz_ret = H_Success; + u8 port_num = 0; + enum ib_qp_type qp_type; + + EHCA_CHECK_ADR(ibqp); + + my_qp = container_of(ibqp, struct ehca_qp, ib_qp); + qp_num = ibqp->qp_num; + qp_pf = &my_qp->pf; + + shca = container_of(ibqp->device, struct ehca_shca, ib_device); + + EDEB_EN(7, "ehca_qp=%p qp_num=%x", my_qp, ibqp->qp_num); + + if (my_qp->send_cq != NULL) { + retcode = ehca_cq_unassign_qp(my_qp->send_cq, + my_qp->ehca_qp_core.real_qp_num); + if (retcode != 0) { + EDEB_ERR(4, "Couldn't unassign qp from send_cq " + "ret=%x qp_num=%x cq_num=%x", + retcode, my_qp->ib_qp.qp_num, + my_qp->send_cq->cq_number); + goto destroy_qp_exit0; + } + } + + down_write(&ehca_qp_idr_sem); + idr_remove(&ehca_qp_idr, my_qp->token); + up_write(&ehca_qp_idr_sem); + + /* un-mmap if vma alloc */ + if (my_qp->uspace_rqueue != 0) { + struct ehca_qp_core *qp_core = &my_qp->ehca_qp_core; + retcode = ehca_munmap(my_qp->uspace_rqueue, + qp_core->ipz_rqueue.queue_length); + retcode = ehca_munmap(my_qp->uspace_squeue, + qp_core->ipz_squeue.queue_length); + retcode = ehca_munmap(my_qp->uspace_fwh, 4096); + } + + hipz_ret = hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp); + if (hipz_ret != H_Success) { 
+ EDEB_ERR(4, "hipz_h_destroy_qp() failed " + "rc=%lx ehca_qp=%p qp_num=%x", + hipz_ret, qp_pf, qp_num); + goto destroy_qp_exit0; + } + + port_num = my_qp->init_attr.port_num; + qp_type = my_qp->init_attr.qp_type; + + /* TODO: later with IB_QPT_SMI */ + if (qp_type == IB_QPT_GSI) { + struct ib_event event; + + EDEB(4, "EHCA port %x is inactive.", port_num); + event.device = &shca->ib_device; + event.event = IB_EVENT_PORT_ERR; + event.element.port_num = port_num; + shca->sport[port_num - 1].port_state = IB_PORT_DOWN; + ib_dispatch_event(&event); + } + + ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_rqueue); + ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_squeue); + ehca_qp_delete(my_qp); + + destroy_qp_exit0: + retcode = ehca2ib_return_code(hipz_ret); + EDEB_EX(7,"ret=%x", retcode); + return retcode; +} + +/* eof ehca_qp.c */ From rolandd at cisco.com Sat Feb 18 11:57:19 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:19 -0800 Subject: [PATCH 06/22] Queue handling In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005719.13620.95136.stgit@localhost.localdomain> From: Roland Dreier Code like #ifndef __PPC64__ void * dummy1; /* make sure we use the same thing on 32 bit */ #endif looks _very_ suspicious. Much better to make sure that the structures are laid out the same no matter what the word size of the architecture is rather than relying on fragile hacks like this. 
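As a rough illustration of the alternative (a sketch only, not code from this patch): one way to get the same layout on 32-bit and 64-bit builds is to store the addresses as fixed-width 64-bit integers and check the layout at compile time. The struct name ipz_queue_fixed, the reserved field and both helpers below are hypothetical; kernel code would use u64/u32 and BUILD_BUG_ON() rather than <stdint.h> and static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout-stable variant of struct ipz_queue: addresses are
 * kept as 64-bit integers, so no #ifndef __PPC64__ padding is needed. */
struct ipz_queue_fixed {
	uint64_t current_q_addr;	/* address of current queue entry */
	uint64_t queue;			/* address of first queue entry */
	uint32_t qe_size;		/* queue entry size */
	uint32_t act_nr_of_sg;
	uint32_t queue_length;		/* queue length allocated in bytes */
	uint32_t pagesize;
	uint32_t toggle_state;		/* toggle flag - per page */
	uint32_t reserved;		/* explicit padding to a multiple of 8 */
};

/* The build fails if the layout ever differs from the expected one. */
static_assert(sizeof(struct ipz_queue_fixed) == 40, "unexpected size");
static_assert(offsetof(struct ipz_queue_fixed, queue) == 8, "unexpected offset");
static_assert(offsetof(struct ipz_queue_fixed, qe_size) == 16, "unexpected offset");

/* Record the queue base address and reset the iterator to it. */
static inline void ipz_queue_fixed_init(struct ipz_queue_fixed *q, void *base)
{
	q->queue = (uint64_t)(uintptr_t)base;
	q->current_q_addr = q->queue;
}

/* Convert the stored 64-bit address back into a usable pointer. */
static inline void *ipz_queue_fixed_current(const struct ipz_queue_fixed *q)
{
	return (void *)(uintptr_t)q->current_q_addr;
}
```

The casts through uintptr_t keep the conversion well-defined on both word sizes, at the cost of one accessor per pointer-like field.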
--- drivers/infiniband/hw/ehca/ipz_pt_fn.c | 137 ++++++++++++++++++++++ drivers/infiniband/hw/ehca/ipz_pt_fn.h | 165 +++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/ipz_pt_fn_core.h | 152 +++++++++++++++++++++++++ 3 files changed, 454 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c new file mode 100644 index 0000000..d6c490c --- /dev/null +++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c @@ -0,0 +1,137 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * internal queue handling + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ipz_pt_fn.c,v 1.16 2006/02/06 10:17:34 schickhj Exp $ + */ + +#define DEB_PREFIX "iptz" + +#include "ehca_kernel.h" +#include "ehca_tools.h" +#include "ipz_pt_fn.h" + +extern int ehca_hwlevel; + +void *ipz_QPageit_get_inc(struct ipz_queue *queue) +{ + void *retvalue = NULL; + u8 *EOF_last_page = queue->queue + queue->queue_length; + + retvalue = queue->current_q_addr; + queue->current_q_addr += queue->pagesize; + if (queue->current_q_addr > EOF_last_page) { + queue->current_q_addr -= queue->pagesize; + retvalue = NULL; + } + + if ((((u64)retvalue) % EHCA_PAGESIZE) != 0) { + EDEB(4, "ERROR!! 
not at PAGE-Boundary"); + return (NULL); + } + EDEB(7, "queue=%p retvalue=%p", queue, retvalue); + return (retvalue); +} + +void *ipz_QEit_EQ_get_inc(struct ipz_queue *queue) +{ + void *retvalue = NULL; + u8 *last_entry_in_q = queue->queue + queue->queue_length + - queue->qe_size; + + retvalue = queue->current_q_addr; + queue->current_q_addr += queue->qe_size; + if (queue->current_q_addr > last_entry_in_q) { + queue->current_q_addr = queue->queue; + queue->toggle_state = (~queue->toggle_state) & 1; + } + + EDEB(7, "queue=%p retvalue=%p new current_q_addr=%p qe_size=%x", + queue, retvalue, queue->current_q_addr, queue->qe_size); + + return (retvalue); +} + +int ipz_queue_ctor(struct ipz_queue *queue, + const u32 nr_of_pages, + const u32 pagesize, const u32 qe_size, const u32 nr_of_sg) +{ + EDEB_EN(7, "nr_of_pages=%x pagesize=%x qe_size=%x", + nr_of_pages, pagesize, qe_size); + queue->queue_length = nr_of_pages * pagesize; + queue->queue = vmalloc(queue->queue_length); + if (queue->queue == 0) { + EDEB(4, "ERROR!! didn't get the memory"); + return (FALSE); + } + if ((((u64)queue->queue) & (EHCA_PAGESIZE - 1)) != 0) { + EDEB(4, "ERROR!! 
QUEUE doesn't start at " + "page boundary"); + vfree(queue->queue); + return (FALSE); + } + + memset(queue->queue, 0, queue->queue_length); + queue->current_q_addr = queue->queue; + queue->qe_size = qe_size; + queue->act_nr_of_sg = nr_of_sg; + queue->pagesize = pagesize; + queue->toggle_state = 1; + EDEB_EX(7, "queue_length=%x queue=%p qe_size=%x" + " act_nr_of_sg=%x", queue->queue_length, queue->queue, + queue->qe_size, queue->act_nr_of_sg); + return TRUE; +} + +int ipz_queue_dtor(struct ipz_queue *queue) +{ + EDEB_EN(7, "ipz_queue pointer=%p", queue); + if (queue == NULL) { + return (FALSE); + } + if (queue->queue == NULL) { + return (FALSE); + } + EDEB(7, "destructing a queue with the following " + "properties:\n nr_of_pages=%x pagesize=%x qe_size=%x", + queue->act_nr_of_sg, queue->pagesize, queue->qe_size); + vfree(queue->queue); + + EDEB_EX(7, "queue freed!"); + return TRUE; +} diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.h b/drivers/infiniband/hw/ehca/ipz_pt_fn.h new file mode 100644 index 0000000..2e197db --- /dev/null +++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.h @@ -0,0 +1,165 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * internal queue handling + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ipz_pt_fn.h,v 1.11 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __IPZ_PT_FN_H__ +#define __IPZ_PT_FN_H__ + +#include "ipz_pt_fn_core.h" +#include "ehca_qes.h" + +#define EHCA_PAGESIZE 4096UL +#define EHCA_PT_ENTRIES 512UL + +/** @brief generic page table + */ +struct ipz_pt { + u64 entries[EHCA_PT_ENTRIES]; +}; + +/** @brief generic page + */ +struct ipz_page { + u8 entries[EHCA_PAGESIZE]; +}; + +/** @brief page table for a queue, only to be used in pf + */ +struct ipz_qpt { + /* queue page tables (kv), use u64 because we know the element length */ + u64 *qpts; + u32 allocated_qpts_entries; + u32 nr_of_PTEs; /* number of page table entries PTE iterators */ + u64 *current_pte_addr; +}; + +/** @brief constructor for a ipz_queue_t, placement new for ipz_queue_t, + new for all dependent datastructors + + all QP Tables are the same + flow: + -# allocate+pin queue + @see ipz_qpt_ctor() + @returns true if ok, false if out of memory + */ +int ipz_queue_ctor(struct ipz_queue *queue, const u32 nr_of_pages, + const u32 pagesize, + const u32 qe_size, /* queue entry size*/ + const u32 nr_of_sg); + +/** @brief destructor for a ipz_queue_t + -# free queue + @see 
ipz_queue_ctor() + @returns true if ok, false if queue was NULL-ptr of free failed +*/ +int ipz_queue_dtor(struct ipz_queue *queue); + +/** @brief constructor for a ipz_qpt_t, + * placement new for struct ipz_queue, new for all dependent datastructors + * + * all QP Tables are the same, + * flow: + * -# allocate+pin queue + * -# initialise ptcb + * -# allocate+pin PTs + * -# link PTs to a ring, according to HCA Arch, set bit62 id needed + * -# the ring must have room for exactly nr_of_PTEs + * @see ipz_qpt_ctor() + */ +void ipz_qpt_ctor(struct ipz_qpt *qpt, + struct ehca_bridge_handle bridge, + const u32 nr_of_QEs, + const u32 pagesize, + const u32 qe_size, + const u8 lowbyte, const u8 toggle, + u32 * act_nr_of_QEs, + u32 * act_nr_of_pages); + +/** @brief return current Queue Entry, increment Queue Entry iterator by one + step in struct ipz_queue, will wrap in ringbuffer + @returns address (kv) of Queue Entry BEFORE increment + @warning don't use in parallel with ipz_QPageit_get_inc() + @warning unpredictable results may occur if steps>act_nr_of_queue_entries + + fix EQ page problems + */ +void *ipz_QEit_EQ_get_inc(struct ipz_queue *queue); + +/** @brief return current Event Queue Entry, increment Queue Entry iterator + by one step in struct ipz_queue if valid, will wrap in ringbuffer + @returns address (kv) of Queue Entry BEFORE increment + @returns 0 and does not increment, if wrong valid state + @warning don't use in parallel with ipz_queue_QPageit_get_inc() + @warning unpredictable results may occur if steps>act_nr_of_queue_entries + */ +inline static void *ipz_QEit_EQ_get_inc_valid(struct ipz_queue *queue) +{ + void *retvalue = ipz_QEit_get(queue); + u32 qe = *(u8 *) retvalue; + EDEB(7, "ipz_QEit_EQ_get_inc_valid qe=%x", qe); + if ((qe >> 7) == (queue->toggle_state & 1)) { + /* this is a good one */ + ipz_QEit_EQ_get_inc(queue); + } else { + retvalue = NULL; + } + return (retvalue); +} + +/** + @returns address (GX) of first queue entry + */ +inline static u64 
ipz_qpt_get_firstpage(struct ipz_qpt *qpt) +{ + return (be64_to_cpu(qpt->qpts[0])); +} + +/** + @returns address (kv) of first page of queue page table + */ +inline static void *ipz_qpt_get_qpt(struct ipz_qpt *qpt) +{ + return (qpt->qpts); +} + +#endif /* __IPZ_PT_FN_H__ */ diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn_core.h b/drivers/infiniband/hw/ehca/ipz_pt_fn_core.h new file mode 100644 index 0000000..1b9a114 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ipz_pt_fn_core.h @@ -0,0 +1,152 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * internal queue handling + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ipz_pt_fn_core.h,v 1.12 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef __IPZ_PT_FN_CORE_H__ +#define __IPZ_PT_FN_CORE_H__ + +#ifdef __KERNEL__ +#include "ehca_tools.h" +#else /* some replacements for kernel stuff */ +#include "ehca_utools.h" +#endif + +#include "ehca_qes.h" + +/** @brief generic queue in linux kernel virtual memory (kv) + */ +struct ipz_queue { +#ifndef __PPC64__ + void * dummy1; /* make sure we use the same thing on 32 bit */ +#endif + u8 *current_q_addr; /* current queue entry */ +#ifndef __PPC64__ + void * dummy2; +#endif + u8 *queue; /* points to first queue entry */ + u32 qe_size; /* queue entry size */ + u32 act_nr_of_sg; + u32 queue_length; /* queue length allocated in bytes */ + u32 pagesize; + u32 toggle_state; /* toggle flag - per page */ + u32 dummy3; /* 64 bit alignment*/ +}; + +/** @brief return current Queue Entry + @returns address (kv) of Queue Entry + */ +static inline void *ipz_QEit_get(struct ipz_queue *queue) +{ + return (queue->current_q_addr); +} + +/** @brief return current Queue Page , increment Queue Page iterator from + page to page in struct ipz_queue, last increment will return 0! 
and + NOT wrap + @returns address (kv) of Queue Page + @warning don't use in parallel with ipz_QE_get_inc() + */ +void *ipz_QPageit_get_inc(struct ipz_queue *queue); + +/** @brief return current Queue Entry, increment Queue Entry iterator by one + step in struct ipz_queue, will wrap in ringbuffer + @returns address (kv) of Queue Entry BEFORE increment + @warning don't use in parallel with ipz_QPageit_get_inc() + @warning unpredictable results may occur if steps>act_nr_of_queue_entries + */ +static inline void *ipz_QEit_get_inc(struct ipz_queue *queue) +{ + void *retvalue = 0; + u8 *last_entry_in_q = queue->queue + queue->queue_length + - queue->qe_size; + + retvalue = queue->current_q_addr; + queue->current_q_addr += queue->qe_size; + if (queue->current_q_addr > last_entry_in_q) { + queue->current_q_addr = queue->queue; + /* toggle the valid flag */ + queue->toggle_state = (~queue->toggle_state) & 1; + } + + EDEB(7, "queue=%p retvalue=%p new current_q_addr=%p qe_size=%x", + queue, retvalue, queue->current_q_addr, queue->qe_size); + + return (retvalue); +} + +/** @brief return current Queue Entry, increment Queue Entry iterator by one + step in struct ipz_queue, will wrap in ringbuffer + @returns address (kv) of Queue Entry BEFORE increment + @returns 0 and does not increment, if wrong valid state + @warning don't use in parallel with ipz_QPageit_get_inc() + @warning unpredictable results may occur if steps>act_nr_of_queue_entries + */ +inline static void *ipz_QEit_get_inc_valid(struct ipz_queue *queue) +{ + void *retvalue = ipz_QEit_get(queue); +#ifdef USERSPACE_DRIVER + + u32 qe = + ((struct ehca_cqe *)(ehca_ktou((struct ehca_cqe *)retvalue)))-> + cqe_flags; +#else + u32 qe = ((struct ehca_cqe *)retvalue)->cqe_flags; +#endif + if ((qe >> 7) == (queue->toggle_state & 1)) { + /* this is a good one */ + ipz_QEit_get_inc(queue); + } else + retvalue = 0; + return (retvalue); +} + +/** @brief returns and resets Queue Entry iterator + @returns address (kv) of first Queue 
Entry + */ +static inline void *ipz_QEit_reset(struct ipz_queue *queue) +{ + queue->current_q_addr = queue->queue; + return (queue->queue); +} + +#endif /* __IPZ_PT_FN_CORE_H__ */ From rolandd at cisco.com Sat Feb 18 11:57:41 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:41 -0800 Subject: [PATCH 13/22] HCA query functions In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005741.13620.93906.stgit@localhost.localdomain> From: Roland Dreier --- drivers/infiniband/hw/ehca/ehca_hca.c | 321 +++++++++++++++++++++++++++++++++ 1 files changed, 321 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c new file mode 100644 index 0000000..af05a5c --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_hca.c @@ -0,0 +1,321 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * HCA query functions + * + * Authors: Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_hca.c,v 1.46 2006/02/06 10:17:34 schickhj Exp $ + */ + +#undef DEB_PREFIX +#define DEB_PREFIX "shca" + +#include "ehca_kernel.h" +#include "ehca_tools.h" + +#include "hcp_if.h" /* TODO: later via hipz_* header file */ + +#define TO_MAX_INT(dest, src) \ + if (src >= INT_MAX) \ + dest = INT_MAX; \ + else \ + dest = src + +int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props) +{ + int ret = 0; + struct ehca_shca *shca; + struct query_hca_rblock *rblock; + + EDEB_EN(7, ""); + EHCA_CHECK_DEVICE(ibdev); + + memset(props, 0, sizeof(struct ib_device_attr)); + shca = container_of(ibdev, struct ehca_shca, ib_device); + + rblock = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (rblock == NULL) { + EDEB_ERR(4, "Can't allocate rblock memory."); + ret = -ENOMEM; + goto query_device0; + } + + memset(rblock, 0, PAGE_SIZE); + + if (hipz_h_query_hca(shca->ipz_hca_handle, rblock) != H_Success) { + EDEB_ERR(4, "Can't query device properties"); + ret = -EINVAL; + goto query_device1; + } + props->fw_ver = rblock->hw_ver; + /* TODO: memcpy(&props->sys_image_guid, ...); */ + props->max_mr_size = rblock->max_mr_size; + /* TODO: props->page_size_cap */ + props->vendor_id = 
rblock->vendor_id >> 8; + props->vendor_part_id = rblock->vendor_part_id >> 16; + props->hw_ver = rblock->hw_ver; + TO_MAX_INT(props->max_qp, (rblock->max_qp - rblock->cur_qp)); + /* TODO: props->max_qp_wr = */ + /* TODO: props->device_cap_flags */ + props->max_sge = rblock->max_sge; + props->max_sge_rd = rblock->max_sge_rd; + TO_MAX_INT(props->max_cq, (rblock->max_cq - rblock->cur_cq)); + props->max_cqe = rblock->max_cqe; + TO_MAX_INT(props->max_mr, (rblock->max_mr - rblock->cur_mr)); + TO_MAX_INT(props->max_pd, rblock->max_pd); + /* TODO: props->max_qp_rd_atom */ + /* TODO: props->max_qp_init_rd_atom */ + /* TODO: props->atomic_cap */ + /* TODO: props->max_ee */ + /* TODO: props->max_rdd */ + props->max_mw = rblock->max_mw; + TO_MAX_INT(props->max_mw, (rblock->max_mw - rblock->cur_mw)); + props->max_raw_ipv6_qp = rblock->max_raw_ipv6_qp; + props->max_raw_ethy_qp = rblock->max_raw_ethy_qp; + props->max_mcast_grp = rblock->max_mcast_grp; + props->max_mcast_qp_attach = rblock->max_qps_attached_mcast_grp; + props->max_total_mcast_qp_attach = rblock->max_qps_attached_all_mcast_grp; + + TO_MAX_INT(props->max_ah, rblock->max_ah); + + props->max_fmr = rblock->max_mr; + /* TODO: props->max_map_per_fmr */ + + /* TODO: props->max_srq */ + /* TODO: props->max_srq_wr */ + /* TODO: props->max_srq_sge */ + props->max_srq = 0; + props->max_srq_wr = 0; + props->max_srq_sge = 0; + + /* TODO: props->max_pkeys */ + props->max_pkeys = 16; + + props->local_ca_ack_delay = rblock->local_ca_ack_delay; + + query_device1: + kfree(rblock); + + query_device0: + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +int ehca_query_port(struct ib_device *ibdev, + u8 port, struct ib_port_attr *props) +{ + int ret = 0; + struct ehca_shca *shca; + struct query_port_rblock *rblock; + + EDEB_EN(7, "port=%x", port); + EHCA_CHECK_DEVICE(ibdev); + + memset(props, 0, sizeof(struct ib_port_attr)); + shca = container_of(ibdev, struct ehca_shca, ib_device); + + rblock = kmalloc(PAGE_SIZE, GFP_KERNEL); + if 
(rblock == NULL) { + EDEB_ERR(4, "Can't allocate rblock memory."); + ret = -ENOMEM; + goto query_port0; + } + + memset(rblock, 0, PAGE_SIZE); + + if (hipz_h_query_port(shca->ipz_hca_handle, port, rblock) != H_Success) { + EDEB_ERR(4, "Can't query port properties"); + ret = -EINVAL; + goto query_port1; + } + + props->state = rblock->state; + + switch (rblock->max_mtu) { + case 0x1: + props->active_mtu = props->max_mtu = IB_MTU_256; + break; + case 0x2: + props->active_mtu = props->max_mtu = IB_MTU_512; + break; + case 0x3: + props->active_mtu = props->max_mtu = IB_MTU_1024; + break; + case 0x4: + props->active_mtu = props->max_mtu = IB_MTU_2048; + break; + case 0x5: + props->active_mtu = props->max_mtu = IB_MTU_4096; + break; + default: + EDEB_ERR(4, "Unknown MTU size: %x.", rblock->max_mtu); + } + + props->gid_tbl_len = rblock->gid_tbl_len; + /* TODO: props->port_cap_flags */ + props->max_msg_sz = rblock->max_msg_sz; + props->bad_pkey_cntr = rblock->bad_pkey_cntr; + props->qkey_viol_cntr = rblock->qkey_viol_cntr; + props->pkey_tbl_len = rblock->pkey_tbl_len; + props->lid = rblock->lid; + props->sm_lid = rblock->sm_lid; + props->lmc = rblock->lmc; + /* TODO: max_vl_num */ + props->sm_sl = rblock->sm_sl; + props->subnet_timeout = rblock->subnet_timeout; + props->init_type_reply = rblock->init_type_reply; + + /* TODO: props->active_width */ + props->active_width = IB_WIDTH_12X; + /* TODO: props->active_speed */ + + /* TODO: props->phys_state */ + + query_port1: + kfree(rblock); + + query_port0: + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +int ehca_query_pkey(struct ib_device *ibdev, u8 port, u16 index, u16 *pkey) +{ + int ret = 0; + struct ehca_shca *shca; + struct query_port_rblock *rblock; + + EDEB_EN(7, "port=%x index=%x", port, index); + EHCA_CHECK_DEVICE(ibdev); + + if (index > 16) { + EDEB_ERR(4, "Invalid index: %x.", index); + ret = -EINVAL; + goto query_pkey0; + } + + shca = container_of(ibdev, struct ehca_shca, ib_device); + + rblock = 
kmalloc(PAGE_SIZE, GFP_KERNEL); + if (rblock == NULL) { + EDEB_ERR(4, "Can't allocate rblock memory."); + ret = -ENOMEM; + goto query_pkey0; + } + + memset(rblock, 0, PAGE_SIZE); + + if (hipz_h_query_port(shca->ipz_hca_handle, port, rblock) != H_Success) { + EDEB_ERR(4, "Can't query port properties"); + ret = -EINVAL; + goto query_pkey1; + } + + memcpy(pkey, &rblock->pkey_entries + index, sizeof(u16)); + + query_pkey1: + kfree(rblock); + + query_pkey0: + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +int ehca_query_gid(struct ib_device *ibdev, u8 port, + int index, union ib_gid *gid) +{ + int ret = 0; + struct ehca_shca *shca; + struct query_port_rblock *rblock; + + EDEB_EN(7, "port=%x index=%x", port, index); + EHCA_CHECK_DEVICE(ibdev); + + if (index > 255) { + EDEB_ERR(4, "Invalid index: %x.", index); + ret = -EINVAL; + goto query_gid0; + } + + shca = container_of(ibdev, struct ehca_shca, ib_device); + + rblock = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (rblock == NULL) { + EDEB_ERR(4, "Can't allocate rblock memory."); + ret = -ENOMEM; + goto query_gid0; + } + + memset(rblock, 0, PAGE_SIZE); + + if (hipz_h_query_port(shca->ipz_hca_handle, port, rblock) != H_Success) { + EDEB_ERR(4, "Can't query port properties"); + ret = -EINVAL; + goto query_gid1; + } + + memcpy(&gid->raw[0], &rblock->gid_prefix, sizeof(u64)); + memcpy(&gid->raw[8], &rblock->guid_entries[index], sizeof(u64)); + + query_gid1: + kfree(rblock); + + query_gid0: + EDEB_EX(7, "ret=%x GID=%lx%lx", ret, + *(u64 *) & gid->raw[0], + *(u64 *) & gid->raw[8]); + + return ret; +} + +int ehca_modify_port(struct ib_device *ibdev, + u8 port, int port_modify_mask, + struct ib_port_modify *props) +{ + int ret = 0; + + EDEB_EN(7, "port=%x", port); + EHCA_CHECK_DEVICE(ibdev); + + /* TODO: implementation */ + + EDEB_EX(7, "ret=%x", ret); + + return ret; +} From rolandd at cisco.com Sat Feb 18 11:57:52 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:52 -0800 Subject: [PATCH 18/22] ehca 
address vectors, multicast groups, protection domains In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005752.13620.3255.stgit@localhost.localdomain> From: Roland Dreier --- drivers/infiniband/hw/ehca/ehca_av.c | 258 +++++++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/ehca_mcast.c | 194 +++++++++++++++++++++++ drivers/infiniband/hw/ehca/ehca_pd.c | 100 ++++++++++++ 3 files changed, 552 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c new file mode 100644 index 0000000..f5382c2 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_av.c @@ -0,0 +1,258 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * address vector functions + * + * Authors: Reinhard Ernst + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_av.c,v 1.28 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#define DEB_PREFIX "ehav" + +#include "ehca_kernel.h" +#include "ehca_tools.h" +#include "ehca_iverbs.h" +#include "hcp_if.h" + +struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) +{ + extern int ehca_static_rate; + int retcode = 0; + struct ehca_av *av = NULL; + + EHCA_CHECK_PD_P(pd); + EHCA_CHECK_ADR_P(ah_attr); + + EDEB_EN(7,"pd=%p ah_attr=%p", pd, ah_attr); + + av = ehca_av_new(); + if (!av) { + EDEB_ERR(4,"Out of memory pd=%p ah_attr=%p", pd, ah_attr); + retcode = -ENOMEM; + goto create_ah_exit0; + } + + av->av.sl = ah_attr->sl; + av->av.dlid = ntohs(ah_attr->dlid); + av->av.slid_path_bits = ah_attr->src_path_bits; + + if (ehca_static_rate < 0) { + av->av.ipd = ah_attr->static_rate; + } else { + av->av.ipd = ehca_static_rate; + } + + av->av.lnh = ah_attr->ah_flags; + av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_IPVERSION_MASK, 6); + av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_TCLASS_MASK, + ah_attr->grh.traffic_class); + av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_FLOWLABEL_MASK, + ah_attr->grh.flow_label); + av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_HOPLIMIT_MASK, + ah_attr->grh.hop_limit); + av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_NEXTHEADER_MASK, 0x1B); + /* IB transport */ + av->av.grh.word_0 = be64_to_cpu(av->av.grh.word_0); + /* set sgid in grh.word_1 */ + if (ah_attr->ah_flags & IB_AH_GRH) { + int rc = 0; + struct ib_port_attr port_attr; + 
union ib_gid gid; + memset(&port_attr, 0, sizeof(port_attr)); + rc = ehca_query_port(pd->device, ah_attr->port_num, + &port_attr); + if (rc != 0) { /* invalid port number */ + retcode = -EINVAL; + EDEB_ERR(4, "Invalid port number " + "ehca_query_port() returned %x " + "pd=%p ah_attr=%p", rc, pd, ah_attr); + goto create_ah_exit1; + } + memset(&gid, 0, sizeof(gid)); + rc = ehca_query_gid(pd->device, + ah_attr->port_num, + ah_attr->grh.sgid_index, &gid); + if (rc != 0) { + retcode = -EINVAL; + EDEB_ERR(4, "Failed to retrieve sgid " + "ehca_query_gid() returned %x " + "pd=%p ah_attr=%p", rc, pd, ah_attr); + goto create_ah_exit1; + } + memcpy(&av->av.grh.word_1, &gid, sizeof(gid)); + } + /* for the time being we use a hard coded PMTU of 2048 Bytes */ + av->av.pmtu = 4; /* TODO */ + + /* dgid comes in grh.word_3 */ + memcpy(&av->av.grh.word_3, &ah_attr->grh.dgid, + sizeof(ah_attr->grh.dgid)); + + EHCA_REGISTER_AV(device, pd); + + EDEB_EX(7,"pd=%p ah_attr=%p av=%p", pd, ah_attr, av); + return (&av->ib_ah); + + create_ah_exit1: + ehca_av_delete(av); + + create_ah_exit0: + EDEB_EX(7,"retcode=%x pd=%p ah_attr=%p", retcode, pd, ah_attr); + return ERR_PTR(retcode); +} + +int ehca_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr) +{ + struct ehca_av *av = NULL; + struct ehca_ud_av new_ehca_av; + int ret = 0; + + EHCA_CHECK_AV(ah); + EHCA_CHECK_ADR(ah_attr); + + EDEB_EN(7,"ah=%p ah_attr=%p", ah, ah_attr); + + memset(&new_ehca_av, 0, sizeof(new_ehca_av)); + new_ehca_av.sl = ah_attr->sl; + new_ehca_av.dlid = ntohs(ah_attr->dlid); + new_ehca_av.slid_path_bits = ah_attr->src_path_bits; + new_ehca_av.ipd = ah_attr->static_rate; + new_ehca_av.lnh = EHCA_BMASK_SET(GRH_FLAG_MASK, + ((ah_attr->ah_flags & IB_AH_GRH) > 0)); + new_ehca_av.grh.word_0 = EHCA_BMASK_SET(GRH_TCLASS_MASK, + ah_attr->grh.traffic_class); + new_ehca_av.grh.word_0 |= EHCA_BMASK_SET(GRH_FLOWLABEL_MASK, + ah_attr->grh.flow_label); + new_ehca_av.grh.word_0 |= EHCA_BMASK_SET(GRH_HOPLIMIT_MASK, +
ah_attr->grh.hop_limit); + new_ehca_av.grh.word_0 |= EHCA_BMASK_SET(GRH_NEXTHEADER_MASK, 0x1b); + new_ehca_av.grh.word_0 = be64_to_cpu(new_ehca_av.grh.word_0); + + /* set sgid in grh.word_1 */ + if (ah_attr->ah_flags & IB_AH_GRH) { + int rc = 0; + struct ib_port_attr port_attr; + union ib_gid gid; + memset(&port_attr, 0, sizeof(port_attr)); + rc = ehca_query_port(ah->device, ah_attr->port_num, + &port_attr); + if (rc != 0) { /* invalid port number */ + ret = -EINVAL; + EDEB_ERR(4, "Invalid port number " + "ehca_query_port() returned %x " + "ah=%p ah_attr=%p port_num=%x", + rc, ah, ah_attr, ah_attr->port_num); + goto modify_ah_exit1; + } + memset(&gid, 0, sizeof(gid)); + rc = ehca_query_gid(ah->device, + ah_attr->port_num, + ah_attr->grh.sgid_index, &gid); + if (rc != 0) { + ret = -EINVAL; + EDEB_ERR(4, + "Failed to retrieve sgid " + "ehca_query_gid() returned %x " + "ah=%p ah_attr=%p port_num=%x " + "sgid_index=%x", + rc, ah, ah_attr, ah_attr->port_num, + ah_attr->grh.sgid_index); + goto modify_ah_exit1; + } + memcpy(&new_ehca_av.grh.word_1, &gid, sizeof(gid)); + } + + new_ehca_av.pmtu = 4; /* TODO: see comment in create_ah() */ + + memcpy(&new_ehca_av.grh.word_3, &ah_attr->grh.dgid, + sizeof(ah_attr->grh.dgid)); + + av = container_of(ah, struct ehca_av, ib_ah); + av->av = new_ehca_av; + + modify_ah_exit1: + EDEB_EX(7,"ret=%x ah=%p ah_attr=%p", ret, ah, ah_attr); + + return ret; +} + +int ehca_query_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr) +{ + int ret = 0; + struct ehca_av *av = NULL; + + EHCA_CHECK_AV(ah); + EHCA_CHECK_ADR(ah_attr); + + EDEB_EN(7,"ah=%p ah_attr=%p", ah, ah_attr); + + av = container_of(ah, struct ehca_av, ib_ah); + memcpy(&ah_attr->grh.dgid, &av->av.grh.word_3, + sizeof(ah_attr->grh.dgid)); + ah_attr->sl = av->av.sl; + + ah_attr->dlid = av->av.dlid; + + ah_attr->src_path_bits = av->av.slid_path_bits; + ah_attr->static_rate = av->av.ipd; + ah_attr->ah_flags = EHCA_BMASK_GET(GRH_FLAG_MASK, av->av.lnh); + ah_attr->grh.traffic_class = 
EHCA_BMASK_GET(GRH_TCLASS_MASK, + av->av.grh.word_0); + ah_attr->grh.hop_limit = EHCA_BMASK_GET(GRH_HOPLIMIT_MASK, + av->av.grh.word_0); + ah_attr->grh.flow_label = EHCA_BMASK_GET(GRH_FLOWLABEL_MASK, + av->av.grh.word_0); + + EDEB_EX(7,"ah=%p ah_attr=%p ret=%x", ah, ah_attr, ret); + return ret; +} + +int ehca_destroy_ah(struct ib_ah *ah) +{ + int ret = 0; + + EHCA_CHECK_AV(ah); + EHCA_DEREGISTER_AV(ah); + + EDEB_EN(7,"ah=%p", ah); + + ehca_av_delete(container_of(ah, struct ehca_av, ib_ah)); + + EDEB_EX(7,"ret=%x ah=%p", ret, ah); + return ret; +} diff --git a/drivers/infiniband/hw/ehca/ehca_mcast.c b/drivers/infiniband/hw/ehca/ehca_mcast.c new file mode 100644 index 0000000..b49bcf6 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_mcast.c @@ -0,0 +1,194 @@ + +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * mcast functions + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Hoang-Nam Nguyen + * Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_mcast.c,v 1.20 2006/02/06 10:17:34 schickhj Exp $ + */ + +#define DEB_PREFIX "mcas" + +#include "ehca_kernel.h" +#include "ehca_classes.h" +#include "ehca_tools.h" +#include "hcp_if.h" +#include "ehca_qes.h" +#include +#include +#include "ehca_iverbs.h" + +#define MAX_MC_LID 0xFFFE +#define MIN_MC_LID 0xC000 /* Multicast limits */ +#define EHCA_VALID_MULTICAST_GID(gid) ((gid)[0] == 0xFF) +#define EHCA_VALID_MULTICAST_LID(lid) (((lid) >= MIN_MC_LID) && ((lid) <= MAX_MC_LID)) + +int ehca_attach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + struct ehca_qp *my_qp = NULL; + struct ehca_shca *shca = NULL; + union ib_gid my_gid; + u64 hipz_rc = H_Success; + int retcode = 0; + + EHCA_CHECK_ADR(ibqp); + EHCA_CHECK_ADR(gid); + + my_qp = container_of(ibqp, struct ehca_qp, ib_qp); + + EHCA_CHECK_QP(my_qp); + if (ibqp->qp_type != IB_QPT_UD) { + EDEB_ERR(4, "invalid qp_type %x gid, retcode=%x", + ibqp->qp_type, EINVAL); + return (-EINVAL); + } + + shca = container_of(ibqp->pd->device, struct ehca_shca, ib_device); + EHCA_CHECK_ADR(shca); + + if (!(EHCA_VALID_MULTICAST_GID(gid->raw))) { + EDEB_ERR(4, "gid is not valid multicast gid retcode=%x", + EINVAL); + return (-EINVAL); + } else if ((lid < MIN_MC_LID) || (lid > MAX_MC_LID)) { + EDEB_ERR(4, "lid=%x is not valid multicast lid retcode=%x", + lid, EINVAL); + return (-EINVAL); + } + + memcpy(&my_gid.raw, gid->raw, sizeof(union ib_gid)); + + hipz_rc =
hipz_h_attach_mcqp(shca->ipz_hca_handle, + my_qp->ipz_qp_handle, + my_qp->ehca_qp_core.galpas.kernel, + lid, my_gid); + if (H_Success != hipz_rc) { + EDEB_ERR(4, + "ehca_qp=%p qp_num=%x hipz_h_attach_mcqp() failed " + "hipz_rc=%lx", my_qp, ibqp->qp_num, hipz_rc); + } + retcode = ehca2ib_return_code(hipz_rc); + + EDEB_EX(7, "mcast attach retcode=%x\n" + "ehca_qp=%p qp_num=%x lid=%x\n" + "my_gid= %x %x %x %x\n" + " %x %x %x %x\n" + " %x %x %x %x\n" + " %x %x %x %x\n", + retcode, my_qp, ibqp->qp_num, lid, + my_gid.raw[0], my_gid.raw[1], + my_gid.raw[2], my_gid.raw[3], + my_gid.raw[4], my_gid.raw[5], + my_gid.raw[6], my_gid.raw[7], + my_gid.raw[8], my_gid.raw[9], + my_gid.raw[10], my_gid.raw[11], + my_gid.raw[12], my_gid.raw[13], + my_gid.raw[14], my_gid.raw[15]); + + return retcode; +} + +int ehca_detach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + struct ehca_qp *my_qp = NULL; + struct ehca_shca *shca = NULL; + union ib_gid my_gid; + u64 hipz_rc = H_Success; + int retcode = 0; + + EHCA_CHECK_ADR(ibqp); + EHCA_CHECK_ADR(gid); + + my_qp = container_of(ibqp, struct ehca_qp, ib_qp); + + EHCA_CHECK_QP(my_qp); + if (ibqp->qp_type != IB_QPT_UD) { + EDEB_ERR(4, "invalid qp_type %x gid, retcode=%x", + ibqp->qp_type, EINVAL); + return (-EINVAL); + } + + shca = container_of(ibqp->pd->device, struct ehca_shca, ib_device); + EHCA_CHECK_ADR(shca); + + if (!(EHCA_VALID_MULTICAST_GID(gid->raw))) { + EDEB_ERR(4, "gid is not valid multicast gid retcode=%x", + EINVAL); + return (-EINVAL); + } else if ((lid < MIN_MC_LID) || (lid > MAX_MC_LID)) { + EDEB_ERR(4, "lid=%x is not valid multicast lid retcode=%x", + lid, EINVAL); + return (-EINVAL); + } + + EDEB_EN(7, "dgid=%p qp_num=%x lid=%x", + gid, ibqp->qp_num, lid); + + memcpy(&my_gid.raw, gid->raw, sizeof(union ib_gid)); + + hipz_rc = hipz_h_detach_mcqp(shca->ipz_hca_handle, + my_qp->ipz_qp_handle, + my_qp->ehca_qp_core.galpas.kernel, + lid, my_gid); + if (H_Success != hipz_rc) { + EDEB_ERR(4, + "ehca_qp=%p qp_num=%x
hipz_h_detach_mcqp() failed " + "hipz_rc=%lx", my_qp, ibqp->qp_num, hipz_rc); + } + retcode = ehca2ib_return_code(hipz_rc); + + EDEB_EX(7, "mcast detach retcode=%x\n" + "ehca_qp=%p qp_num=%x lid=%x\n" + "my_gid= %x %x %x %x\n" + " %x %x %x %x\n" + " %x %x %x %x\n" + " %x %x %x %x\n", + retcode, my_qp, ibqp->qp_num, lid, + my_gid.raw[0], my_gid.raw[1], + my_gid.raw[2], my_gid.raw[3], + my_gid.raw[4], my_gid.raw[5], + my_gid.raw[6], my_gid.raw[7], + my_gid.raw[8], my_gid.raw[9], + my_gid.raw[10], my_gid.raw[11], + my_gid.raw[12], my_gid.raw[13], + my_gid.raw[14], my_gid.raw[15]); + + return retcode; +} diff --git a/drivers/infiniband/hw/ehca/ehca_pd.c b/drivers/infiniband/hw/ehca/ehca_pd.c new file mode 100644 index 0000000..e110320 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_pd.c @@ -0,0 +1,100 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * PD functions + * + * Authors: Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_pd.c,v 1.25 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#define DEB_PREFIX "vpd " + +#include "ehca_kernel.h" +#include "ehca_tools.h" +#include "ehca_iverbs.h" + +struct ib_pd *ehca_alloc_pd(struct ib_device *device, + struct ib_ucontext *context, struct ib_udata *udata) +{ + struct ib_pd *mypd = NULL; + struct ehca_pd *pd = NULL; + + EDEB_EN(7, "device=%p context=%p udata=%p", device, context, udata); + + EHCA_CHECK_DEVICE_P(device); + + pd = ehca_pd_new(); + if (!pd) { + EDEB_ERR(4, "ERROR device=%p context=%p pd=%p " + "out of memory", device, context, mypd); + return ERR_PTR(-ENOMEM); + } + + /* kernel pd when (device,-1,0) + * user pd only if context != -1 */ + if (context == NULL) { + /* kernel pds after init reuses always + * the one created in ehca_shca_reopen() + */ + struct ehca_shca *shca = container_of(device, struct ehca_shca, + ib_device); + pd->fw_pd.value = shca->pd->fw_pd.value; + } else { + pd->fw_pd.value = (u64)pd; + } + + mypd = &pd->ib_pd; + + EHCA_REGISTER_PD(device, pd); + + EDEB_EX(7, "device=%p context=%p pd=%p", device, context, mypd); + + return (mypd); +} + +int ehca_dealloc_pd(struct ib_pd *pd) +{ + int ret = 0; + EDEB_EN(7, "pd=%p", pd); + + EHCA_CHECK_PD(pd); + EHCA_DEREGISTER_PD(pd); + ehca_pd_delete(container_of(pd, struct ehca_pd, ib_pd)); + + EDEB_EX(7, "pd=%p", pd); + return ret; +} From rolandd at cisco.com Sat Feb 18 11:57:50 2006 From: rolandd at cisco.com (Roland Dreier) 
Date: Fri, 17 Feb 2006 16:57:50 -0800 Subject: [PATCH 17/22] Special QP functions In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005750.13620.62709.stgit@localhost.localdomain> From: Roland Dreier The wait for the port to become active when creating QP 1 seems bizarre. Why can't we just create QP 1 before the port is active? What is the issue with creating QP 0? Without QP 0, it's impossible to run a subnet manager on top of ehca. --- drivers/infiniband/hw/ehca/ehca_sqp.c | 135 +++++++++++++++++++++++++++++++++ 1 files changed, 135 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c new file mode 100644 index 0000000..bbad4cb --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_sqp.c @@ -0,0 +1,135 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * SQP functions + * + * Authors: Khadija Souissi + * Heiko J Schick + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_sqp.c,v 1.35 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#define DEB_PREFIX "e_qp" + +#include "ehca_kernel.h" +#include "ehca_classes.h" +#include "ehca_tools.h" +#include "hcp_if.h" +#include "ehca_qes.h" +#include "ehca_iverbs.h" + +#include +#include + +extern int ehca_create_aqp1(struct ehca_shca *shca, struct ehca_sport *sport); +extern int ehca_destroy_aqp1(struct ehca_sport *sport); + +extern int ehca_port_act_time; + +/** + * ehca_define_aqp0 - TODO + * + * @ehca_qp: : TODO adapter_handle, ipz_qp_handle, galpas.kernel + * @qp_init_attr : TODO for port number + */ +u64 ehca_define_sqp(struct ehca_shca *shca, + struct ehca_qp *ehca_qp, + struct ib_qp_init_attr *qp_init_attr) +{ + + u32 pma_qp_nr = 0; + u32 bma_qp_nr = 0; + u64 ret = H_Success; + u8 port = qp_init_attr->port_num; + int counter = 0; + + EDEB_EN(7, "port=%x qp_type=%x", + port, qp_init_attr->qp_type); + + shca->sport[port - 1].port_state = IB_PORT_DOWN; + + switch (qp_init_attr->qp_type) { + case IB_QPT_SMI: + /* TODO: function not supported yet */ + /* + ret = hipz_h_define_aqp0(shca->ipz_hca_handle, + ehca_qp->ipz_qp_handle, + ehca_qp->galpas.kernel, + (u32)qp_init_attr->port_num); + */ 
+ break; + case IB_QPT_GSI: + ret = hipz_h_define_aqp1(shca->ipz_hca_handle, + ehca_qp->ipz_qp_handle, + ehca_qp->ehca_qp_core.galpas.kernel, + (u32) qp_init_attr->port_num, + &pma_qp_nr, &bma_qp_nr); + + if (ret != H_Success) { + EDEB_ERR(4, "Can't define AQP1 for port %x. rc=%lx", + port, ret); + goto ehca_define_aqp1; + } + break; + default: + ret = H_Parameter; + goto ehca_define_aqp1; + } + +#ifndef EHCA_USERDRIVER + while ((shca->sport[port - 1].port_state != IB_PORT_ACTIVE) && + (counter < ehca_port_act_time)) { + EDEB(6, "... wait until port %x is active", + port); + msleep_interruptible(1000); + counter++; + } + + if (counter == ehca_port_act_time) { + EDEB_ERR(4, "Port %x is not active.", port); + ret = H_Hardware; + } +#else + if (shca->sport[port - 1].port_state != IB_PORT_ACTIVE) { + sleep(20); + } +#endif + + ehca_define_aqp1: + EDEB_EX(7, "ret=%lx", ret); + + return ret; +} From rolandd at cisco.com Sat Feb 18 11:57:37 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:37 -0800 Subject: [PATCH 11/22] ehca event queues In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005730.13620.53494.stgit@localhost.localdomain> From: Roland Dreier In ehca_poll_eqs(), is there any reason not to use list_for_each_entry()? Since ehca_poll_eqs() defers all the work to a workqueue, is there any reason for it to run in a kernel thread? Why not just make it a recurring timer?
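As a rough illustration of the list_for_each_entry() question above, here is a userspace sketch; it is not part of the patch. The list helpers are simplified stand-ins for the kernel's <linux/list.h>, and struct demo_shca, demo_count_initialized() and demo_run() are hypothetical names invented for this example. In ehca_poll_eqs() the loop body would presumably call queue_work() rather than count entries.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the <linux/list.h> helpers, so the
 * sketch compiles on its own.  The driver would use the real macros. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Uses __typeof__, a GNU C extension, as the kernel macro does. */
#define list_for_each_entry(pos, head, member)				\
	for (pos = container_of((head)->next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Hypothetical, cut-down stand-in for struct ehca_shca. */
struct demo_shca {
	int eq_is_initialized;
	struct list_head shca_list;
};

/* Walks the list without a separate list_head cursor or list_entry()
 * call; counts the entries whose EQ is marked initialized. */
int demo_count_initialized(struct list_head *head)
{
	struct demo_shca *shca;
	int n = 0;

	list_for_each_entry(shca, head, shca_list) {
		if (shca->eq_is_initialized)
			n++;
	}
	return n;
}

/* Builds a three-entry list (two initialized) and walks it. */
int demo_run(void)
{
	struct list_head head = { &head, &head };
	struct demo_shca a = { 1, { NULL, NULL } };
	struct demo_shca b = { 0, { NULL, NULL } };
	struct demo_shca c = { 1, { NULL, NULL } };

	list_add_tail(&a.shca_list, &head);
	list_add_tail(&b.shca_list, &head);
	list_add_tail(&c.shca_list, &head);

	return demo_count_initialized(&head);
}
```

The only point of the sketch is that the typed cursor replaces the list_for_each() plus list_entry() pair; locking and the kthread_should_stop() check are left out.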
--- drivers/infiniband/hw/ehca/ehca_eq.c | 242 ++++++++++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/ehca_eq.h | 78 +++++++++++ 2 files changed, 320 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_eq.c b/drivers/infiniband/hw/ehca/ehca_eq.c new file mode 100644 index 0000000..e508edb --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_eq.c @@ -0,0 +1,242 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Event queue handling + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Heiko J Schick + * Hoang-Nam Nguyen + * + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_eq.c,v 1.40 2006/02/06 16:20:38 schickhj Exp $ + */ + +#define DEB_PREFIX "e_eq" + +#include "ehca_eq.h" +#include "ehca_kernel.h" +#include "ehca_classes.h" +#include "hcp_if.h" +#include "ehca_iverbs.h" +#include "ipz_pt_fn.h" +#include "ehca_qes.h" +#include "ehca_irq.h" + +/* TODO: should be defined in ehca_classes_pSeries.h */ +#define HIPZ_EQ_REGISTER_ORIG 0 + +int ehca_create_eq(struct ehca_shca *shca, + struct ehca_eq *eq, + const enum ehca_eq_type type, const u32 length) +{ + extern struct workqueue_struct *ehca_wq; + u64 ret = H_Success; + u32 nr_pages = 0; + u32 i; + void *vpage = NULL; + + EDEB_EN(7, "shca=%p eq=%p length=%x", shca, eq, length); + EHCA_CHECK_ADR(shca); + EHCA_CHECK_ADR(eq); + + spin_lock_init(&eq->spinlock); + eq->is_initialized = 0; + + if (type!=EHCA_EQ && type!=EHCA_NEQ) { + EDEB_ERR(4, "Invalid EQ type %x. eq=%p", type, eq); + return -EINVAL; + } + if (length==0) { + EDEB_ERR(4, "EQ length must not be zero. eq=%p", eq); + return -EINVAL; + } + + ret = hipz_h_alloc_resource_eq(shca->ipz_hca_handle, + &eq->pf, + type, + length, + &eq->ipz_eq_handle, + &eq->length, + &nr_pages, &eq->irq_info.ist); + + if (ret != H_Success) { + EDEB_ERR(4, "Can't allocate EQ / NEQ. eq=%p", eq); + return -EINVAL; + } + + ret = ipz_queue_ctor(&eq->ipz_queue, nr_pages, + EHCA_PAGESIZE, sizeof(struct ehca_eqe), 0); + if (!ret) { + EDEB_ERR(4, "Can't allocate EQ pages. 
eq=%p", eq); + goto create_eq_exit1; + } + + for (i = 0; i < nr_pages; i++) { + u64 rpage; + + if (!(vpage = ipz_QPageit_get_inc(&eq->ipz_queue))) { + ret = H_Resource; + goto create_eq_exit2; + } + + rpage = ehca_kv_to_g(vpage); + ret = hipz_h_register_rpage_eq(shca->ipz_hca_handle, + eq->ipz_eq_handle, + &eq->pf, + 0, + HIPZ_EQ_REGISTER_ORIG, rpage, 1); + + if (i == (nr_pages - 1)) { + /* last page */ + vpage = ipz_QPageit_get_inc(&eq->ipz_queue); + if ((ret != H_Success) || (vpage != 0)) { + goto create_eq_exit2; + } + } else { + if ((ret != H_PAGE_REGISTERED) || (vpage == 0)) { + goto create_eq_exit2; + } + } + } + + ipz_QEit_reset(&eq->ipz_queue); + +#ifndef EHCA_USERDRIVER + { + pid_t pid = 0; + (eq->irq_info).pid = pid; + (eq->irq_info).eq = eq; + (eq->irq_info).wq = ehca_wq; + (eq->irq_info).work = &(eq->work); + } +#endif + + /* register interrupt handlers and initialize work queues */ + if (type == EHCA_EQ) { + INIT_WORK(&(eq->work), + ehca_interrupt_eq, (void *)&(eq->irq_info)); + eq->is_initialized = 1; + hipz_request_interrupt(&(eq->irq_info), ehca_interrupt); + } else if (type == EHCA_NEQ) { + INIT_WORK(&(eq->work), + ehca_interrupt_neq, (void *)&(eq->irq_info)); + hipz_request_interrupt(&(eq->irq_info), ehca_interrupt); + } + + EDEB_EX(7, "ret=%lx", ret); + + return 0; + + create_eq_exit2: + ipz_queue_dtor(&eq->ipz_queue); + + create_eq_exit1: + hipz_h_destroy_eq(shca->ipz_hca_handle, eq); + + EDEB_EX(7, "ret=%lx", ret); + + return -EINVAL; +} + +void *ehca_poll_eq(struct ehca_shca *shca, struct ehca_eq *eq) +{ + unsigned long flags = 0; + void *eqe = NULL; + + EDEB_EN(7, "shca=%p eq=%p", shca, eq); + EHCA_CHECK_ADR_P(shca); + EHCA_CHECK_EQ_P(eq); + + spin_lock_irqsave(&eq->spinlock, flags); + eqe = ipz_QEit_EQ_get_inc_valid(&eq->ipz_queue); + spin_unlock_irqrestore(&eq->spinlock, flags); + + EDEB_EX(7, "eq=%p eqe=%p", eq, eqe); + + return eqe; +} + +int ehca_poll_eqs(void *data) +{ + extern struct workqueue_struct *ehca_wq; + struct ehca_shca *shca; 
+ struct ehca_module* module = data; + struct list_head *entry; + + do { + spin_lock(&module->shca_lock); + list_for_each(entry, &module->shca_list) { + shca = list_entry(entry, struct ehca_shca, shca_list); + + if (shca->eq.is_initialized && !kthread_should_stop()) + queue_work(ehca_wq, &shca->eq.work); + } + spin_unlock(&module->shca_lock); + + msleep_interruptible(1000); + } + while(!kthread_should_stop()); + + return 0; +} + +int ehca_destroy_eq(struct ehca_shca *shca, struct ehca_eq *eq) +{ + unsigned long flags = 0; + u64 retcode = H_Success; + + EDEB_EN(7, "shca=%p eq=%p", shca, eq); + EHCA_CHECK_ADR(shca); + EHCA_CHECK_EQ(eq); + + spin_lock_irqsave(&eq->spinlock, flags); + hipz_free_interrupt(&(eq->irq_info)); + + retcode = hipz_h_destroy_eq(shca->ipz_hca_handle, eq); + + spin_unlock_irqrestore(&eq->spinlock, flags); + + if (retcode != H_Success) { + EDEB_ERR(4, "Can't free EQ resources."); + return -EINVAL; + } + ipz_queue_dtor(&eq->ipz_queue); + + EDEB_EX(7, "retcode=%lx", retcode); + + return 0; +} + diff --git a/drivers/infiniband/hw/ehca/ehca_eq.h b/drivers/infiniband/hw/ehca/ehca_eq.h new file mode 100644 index 0000000..d09f21b --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_eq.h @@ -0,0 +1,78 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Completion queue, event queue handling helper functions + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Heiko J Schick + * Hoang-Nam Nguyen + * + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. 
+ * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: ehca_eq.h,v 1.10 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef EHCA_EQ_H +#define EHCA_EQ_H + +#include "ehca_classes.h" +#include "ehca_common.h" + +enum ehca_eq_type { + EHCA_EQ = 0, /* event queue */ + EHCA_NEQ /* notification event queue */ +}; + +/** @brief hcad internal create EQ + */ +int ehca_create_eq(struct ehca_shca *shca, + struct ehca_eq *eq, /* struct contains eq to create */ + enum ehca_eq_type type, + const u32 length); + +/** @brief destroy the eq + */ +int ehca_destroy_eq(struct ehca_shca *shca, struct ehca_eq *eq); + +/** @brief hcad internal poll EQ + - check if new EQE available, + - if yes, increment EQE pointer + - otherwise return 0 + @returns pointer to EQE if new valid EQE available + */ +void *ehca_poll_eq(struct ehca_shca *shca, struct ehca_eq *eq); + +#endif /* EHCA_EQ_H */ + From rolandd at cisco.com Sat Feb 18 11:57:48 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:48 -0800 Subject: [PATCH 16/22] ehca post send/receive and poll CQ In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005748.13620.45620.stgit@localhost.localdomain> From: Roland Dreier There are an awful lot of magic numbers scattered around. Probably they should become enums somewhere. The compatibility defines for using the kernel file in userspace shouldn't go into the kernel.
--- drivers/infiniband/hw/ehca/ehca_reqs.c | 401 ++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/ehca_reqs_core.c | 420 +++++++++++++++++++++++++++ 2 files changed, 821 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c new file mode 100644 index 0000000..659e6ba --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_reqs.c @@ -0,0 +1,401 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * post_send/recv, poll_cq, req_notify + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Hoang-Nam Nguyen + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_reqs.c,v 1.41 2006/02/06 10:17:34 schickhj Exp $ + */ + + +#define DEB_PREFIX "reqs" + +#include "ehca_kernel.h" +#include "ehca_classes.h" +#include "ehca_tools.h" +#include "hcp_if.h" +#include "ehca_qes.h" +#include "ehca_iverbs.h" + +/* include some inline service routines */ +#include "ehca_asm.h" +#include "ehca_reqs_core.c" + +int ehca_post_send(struct ib_qp *qp, + struct ib_send_wr *send_wr, + struct ib_send_wr **bad_send_wr) +{ + struct ehca_qp *my_qp = NULL; + struct ib_send_wr *cur_send_wr = NULL; + struct ehca_wqe *wqe_p = NULL; + int wqe_cnt = 0; + int retcode = 0; + unsigned long spl_flags = 0; + + EHCA_CHECK_ADR(qp); + my_qp = container_of(qp, struct ehca_qp, ib_qp); + EHCA_CHECK_QP(my_qp); + EHCA_CHECK_ADR(send_wr); + EDEB_EN(7, "ehca_qp=%p qp_num=%x send_wr=%p bad_send_wr=%p", + my_qp, qp->qp_num, send_wr, bad_send_wr); + + /* LOCK the QUEUE */ + spin_lock_irqsave(&my_qp->spinlock_s, spl_flags); + + /* loop processes list of send reqs */ + for (cur_send_wr = send_wr; cur_send_wr != NULL; + cur_send_wr = cur_send_wr->next) { + void *start_addr = + &my_qp->ehca_qp_core.ipz_squeue.current_q_addr; + /* get pointer next to free WQE */ + wqe_p = ipz_QEit_get_inc(&my_qp->ehca_qp_core.ipz_squeue); + if (unlikely(wqe_p == NULL)) { + /* too many posted work requests: queue overflow */ + if (bad_send_wr != NULL) { + *bad_send_wr = cur_send_wr; + } + if (wqe_cnt==0) { + retcode = -ENOMEM; + EDEB_ERR(4, "Too 
many posted WQEs qp_num=%x", + qp->qp_num); + } + goto post_send_exit0; + } + /* write a SEND WQE into the QUEUE */ + retcode = ehca_write_swqe(&my_qp->ehca_qp_core, + wqe_p, cur_send_wr); + /* if something failed, + reset the free entry pointer to the start value + */ + if (unlikely(retcode != 0)) { + my_qp->ehca_qp_core.ipz_squeue.current_q_addr = + start_addr; + *bad_send_wr = cur_send_wr; + if (wqe_cnt==0) { + retcode = -EINVAL; + EDEB_ERR(4, "Could not write WQE qp_num=%x", + qp->qp_num); + } + goto post_send_exit0; + } + wqe_cnt++; + EDEB(7, "ehca_qp=%p qp_num=%x wqe_cnt=%d", + my_qp, qp->qp_num, wqe_cnt); + } /* eof for cur_send_wr */ + + post_send_exit0: + /* UNLOCK the QUEUE */ + spin_unlock_irqrestore(&my_qp->spinlock_s, spl_flags); + iosync(); /* serialize GAL register access */ + hipz_update_SQA(&my_qp->ehca_qp_core, wqe_cnt); + EDEB_EX(7, "ehca_qp=%p qp_num=%x ret=%x wqe_cnt=%d", + my_qp, qp->qp_num, retcode, wqe_cnt); + return retcode; +} + +int ehca_post_recv(struct ib_qp *qp, + struct ib_recv_wr *recv_wr, + struct ib_recv_wr **bad_recv_wr) +{ + struct ehca_qp *my_qp = NULL; + struct ib_recv_wr *cur_recv_wr = NULL; + struct ehca_wqe *wqe_p = NULL; + int wqe_cnt = 0; + int retcode = 0; + unsigned long spl_flags = 0; + + EHCA_CHECK_ADR(qp); + my_qp = container_of(qp, struct ehca_qp, ib_qp); + EHCA_CHECK_QP(my_qp); + EHCA_CHECK_ADR(recv_wr); + EDEB_EN(7, "ehca_qp=%p qp_num=%x recv_wr=%p bad_recv_wr=%p", + my_qp, qp->qp_num, recv_wr, bad_recv_wr); + + /* LOCK the QUEUE */ + spin_lock_irqsave(&my_qp->spinlock_r, spl_flags); + + /* loop processes list of send reqs */ + for (cur_recv_wr = recv_wr; cur_recv_wr != NULL; + cur_recv_wr = cur_recv_wr->next) { + void *start_addr = + &my_qp->ehca_qp_core.ipz_rqueue.current_q_addr; + /* get pointer next to free WQE */ + wqe_p = ipz_QEit_get_inc(&my_qp->ehca_qp_core.ipz_rqueue); + if (unlikely(wqe_p == NULL)) { + /* too many posted work requests: queue overflow */ + if (bad_recv_wr != NULL) { + *bad_recv_wr = 
cur_recv_wr; + } + if (wqe_cnt==0) { + retcode = -ENOMEM; + EDEB_ERR(4, "Too many posted WQEs qp_num=%x", + qp->qp_num); + } + goto post_recv_exit0; + } + /* write a RECV WQE into the QUEUE */ + retcode = + ehca_write_rwqe(&my_qp->ehca_qp_core, wqe_p, cur_recv_wr); + /* if something failed, + reset the free entry pointer to the start value + */ + if (unlikely(retcode != 0)) { + my_qp->ehca_qp_core.ipz_rqueue.current_q_addr = + start_addr; + *bad_recv_wr = cur_recv_wr; + if (wqe_cnt==0) { + retcode = -EINVAL; + EDEB_ERR(4, "Could not write WQE qp_num=%x", + qp->qp_num); + } + goto post_recv_exit0; + } + wqe_cnt++; + EDEB(7, "ehca_qp=%p qp_num=%x wqe_cnt=%d", + my_qp, qp->qp_num, wqe_cnt); + } /* eof for cur_recv_wr */ + + post_recv_exit0: + spin_unlock_irqrestore(&my_qp->spinlock_r, spl_flags); + iosync(); /* serialize GAL register access */ + hipz_update_RQA(&my_qp->ehca_qp_core, wqe_cnt); + EDEB_EX(7, "ehca_qp=%p qp_num=%x ret=%x wqe_cnt=%d", + my_qp, qp->qp_num, retcode, wqe_cnt); + return retcode; +} + +/** + * Table converts ehca wc opcode to ib + * Since we use zero to indicate invalid opcode, the actual ib opcode must + * be decremented!!! 
+ */ +static const u8 ib_wc_opcode[255] = { + [0x01] = IB_WC_RECV+1, + [0x02] = IB_WC_RECV_RDMA_WITH_IMM+1, + [0x04] = IB_WC_BIND_MW+1, + [0x08] = IB_WC_FETCH_ADD+1, + [0x10] = IB_WC_COMP_SWAP+1, + [0x20] = IB_WC_RDMA_WRITE+1, + [0x40] = IB_WC_RDMA_READ+1, + [0x80] = IB_WC_SEND+1 +}; + +/** @brief internal function to poll one entry of cq + */ +static inline int ehca_poll_cq_one(struct ib_cq *cq, struct ib_wc *wc) +{ + int retcode = 0; + struct ehca_cq *my_cq = container_of(cq, struct ehca_cq, ib_cq); + struct ehca_cqe *cqe = NULL; + int cqe_count = 0; + + EDEB_EN(7, "ehca_cq=%p cq_num=%x wc=%p", my_cq, my_cq->cq_number, wc); + + poll_cq_one_read_cqe: + cqe = (struct ehca_cqe *) + ipz_QEit_get_inc_valid(&my_cq->ehca_cq_core.ipz_queue); + if (cqe == NULL) { + retcode = -EAGAIN; + EDEB(7, "Completion queue is empty ehca_cq=%p cq_num=%x " + "retcode=%x", my_cq, my_cq->cq_number, retcode); + goto poll_cq_one_exit0; + } + cqe_count++; + if (unlikely(cqe->status & 0x10)) { /* purge bit set */ + struct ehca_qp *qp=ehca_cq_get_qp(my_cq, cqe->local_qp_number); + int purgeflag = 0; + unsigned long spl_flags = 0; + if (qp==NULL) { /* should not happen */ + EDEB_ERR(4, "cq_num=%x qp_num=%x " + "could not find qp -> ignore cqe", + my_cq->cq_number, cqe->local_qp_number); + EDEB_DMP(4, cqe, 64, "cq_num=%x qp_num=%x", + my_cq->cq_number, cqe->local_qp_number); + /* ignore this purged cqe */ + goto poll_cq_one_read_cqe; + } + spin_lock_irqsave(&qp->spinlock_s, spl_flags); + purgeflag = qp->sqerr_purgeflag; + spin_unlock_irqrestore(&qp->spinlock_s, spl_flags); + if (purgeflag!=0) { + EDEB(6, "Got CQE with purged bit qp_num=%x src_qp=%x", + cqe->local_qp_number, cqe->remote_qp_number); + EDEB_DMP(6, cqe, 64, "qp_num=%x src_qp=%x", + cqe->local_qp_number, cqe->remote_qp_number); + /* ignore this to avoid double cqes of bad wqe + that caused sqe and turn off purge flag */ + qp->sqerr_purgeflag = 0; + goto poll_cq_one_read_cqe; + } + } + + /* tracing cqe */ + if (IS_EDEB_ON(7)) { + 
EDEB(7, "Received COMPLETION ehca_cq=%p cq_num=%x -----", + my_cq, my_cq->cq_number); + EDEB_DMP(7, cqe, 64, "ehca_cq=%p cq_num=%x", + my_cq, my_cq->cq_number); + EDEB(7, "ehca_cq=%p cq_num=%x -------------------------", + my_cq, my_cq->cq_number); + } + + /* we got a completion! */ + wc->wr_id = cqe->work_request_id; + + /* eval ib_wc_opcode */ + wc->opcode = ib_wc_opcode[cqe->optype]-1; + if (unlikely(wc->opcode == -1)) { + EDEB_ERR(4, "Invalid cqe->OPType=%x cqe->status=%x " + "ehca_cq=%p cq_num=%x", + cqe->optype, cqe->status, my_cq, my_cq->cq_number); + /* dump cqe for other infos */ + EDEB_DMP(4, cqe, 64, "ehca_cq=%p cq_num=%x", my_cq, my_cq->cq_number); + /* update also queue adder to throw away this entry!!! */ + goto poll_cq_one_exit0; + } + /* eval ib_wc_status */ + if (unlikely(cqe->status & 0x80000000)) { /* complete with errors */ + map_ib_wc_status(cqe->status, &wc->status); + wc->vendor_err = wc->status; + } else { + wc->status = IB_WC_SUCCESS; + } + + wc->qp_num = cqe->local_qp_number; + wc->byte_len = ntohl(cqe->nr_bytes_transferred); + wc->pkey_index = cqe->pkey_index; + wc->slid = cqe->rlid; + wc->dlid_path_bits = cqe->dlid; + wc->src_qp = cqe->remote_qp_number; + wc->wc_flags = cqe->w_completion_flags; + wc->imm_data = cqe->immediate_data; + wc->sl = cqe->service_level; + + if (wc->status != IB_WC_SUCCESS) { + EDEB(6, "ehca_cq=%p cq_num=%x WARNING unsuccessful cqe " + "OPType=%x status=%x qp_num=%x src_qp=%x wr_id=%lx cqe=%p", + my_cq, my_cq->cq_number, cqe->optype, cqe->status, + cqe->local_qp_number, cqe->remote_qp_number, + cqe->work_request_id, cqe); + } + + poll_cq_one_exit0: + if (cqe_count>0) { + hipz_update_FECA(&my_cq->ehca_cq_core, cqe_count); + } + + EDEB_EX(7, "retcode=%x ehca_cq=%p cq_number=%x wc=%p " + "status=%x opcode=%x qp_num=%x byte_len=%x", + retcode, my_cq, my_cq->cq_number, wc, wc->status, + wc->opcode, wc->qp_num, wc->byte_len); + return (retcode); +} + +int ehca_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc 
*wc) +{ + struct ehca_cq *my_cq = NULL; + int nr = 0; + struct ib_wc *current_wc = NULL; + int retcode = 0; + unsigned long spl_flags = 0; + + EHCA_CHECK_CQ(cq); + EHCA_CHECK_ADR(wc); + + my_cq = container_of(cq, struct ehca_cq, ib_cq); + EHCA_CHECK_CQ(my_cq); + + EDEB_EN(7, "ehca_cq=%p cq_num=%x num_entries=%d wc=%p", + my_cq, my_cq->cq_number, num_entries, wc); + + if (num_entries < 1) { + EDEB_ERR(4, "Invalid num_entries=%d ehca_cq=%p cq_num=%x", + num_entries, my_cq, my_cq->cq_number); + retcode = -EINVAL; + goto poll_cq_exit0; + } + + current_wc = wc; + spin_lock_irqsave(&my_cq->spinlock, spl_flags); + for (nr = 0; nr < num_entries; nr++) { + retcode = ehca_poll_cq_one(cq, current_wc); + if (0 != retcode) { + break; + } + current_wc++; + } /* eof for nr */ + spin_unlock_irqrestore(&my_cq->spinlock, spl_flags); + if (-EAGAIN == retcode || 0 == retcode) { + retcode = nr; + } + + poll_cq_exit0: + EDEB_EX(7, "ehca_cq=%p cq_num=%x retcode=%x wc=%p nr_entries=%d", + my_cq, my_cq->cq_number, retcode, wc, nr); + return (retcode); +} + +int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify) +{ + struct ehca_cq *my_cq = NULL; + int retcode = 0; + + EHCA_CHECK_CQ(cq); + my_cq = container_of(cq, struct ehca_cq, ib_cq); + EHCA_CHECK_CQ(my_cq); + EDEB_EN(7, "ehca_cq=%p cq_num=%x cq_notif=%x", + my_cq, my_cq->cq_number, cq_notify); + + switch (cq_notify) { + case IB_CQ_SOLICITED: + hipz_set_CQx_N0(&my_cq->ehca_cq_core, 1); + break; + case IB_CQ_NEXT_COMP: + hipz_set_CQx_N1(&my_cq->ehca_cq_core, 1); + break; + default: + retcode = -EINVAL; + } + + EDEB_EX(7, "ehca_cq=%p cq_num=%x retcode=%x", + my_cq, my_cq->cq_number, retcode); + + return (retcode); +} + +/* eof ehca_reqs.c */ diff --git a/drivers/infiniband/hw/ehca/ehca_reqs_core.c b/drivers/infiniband/hw/ehca/ehca_reqs_core.c new file mode 100644 index 0000000..c0b7281 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_reqs_core.c @@ -0,0 +1,420 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux 
on POWER + * + * post_send/recv, poll_cq, req_notify + * Common code to be included statically in respective user/kernel + * modules, i.e. ehca_ureqs.c/ehca_reqs.c + * This module contains C code only. Including modules must include + * all required header files. + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Hoang-Nam Nguyen + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: ehca_reqs_core.c,v 1.40 2006/02/06 10:17:34 schickhj Exp $ + */ + +/** THIS following block of defines + * replaces ib types of kernel space to corresponding ones in user space, + * so that the implemented inline functions below can be compiled and + * work in both user and kernel space. + * However this ASSUMES that there is no functional differences between ib + * types in kernel e.g. ib_send_wr and user space e.g. ibv_send_wr. + */ + +#ifndef __KERNEL__ +#define ib_recv_wr ibv_recv_wr +#define ib_send_wr ibv_send_wr +#define ehca_av ehcau_av +/* ib_wr_opcode */ +#define IB_WR_SEND IBV_WR_SEND +#define IB_WR_SEND_WITH_IMM IBV_WR_SEND_WITH_IMM +#define IB_WR_RDMA_WRITE IBV_WR_RDMA_WRITE +#define IB_WR_RDMA_WRITE_WITH_IMM IBV_WR_RDMA_WRITE_WITH_IMM +#define IB_WR_RDMA_READ IBV_WR_RDMA_READ +/* ib_qp_type */ +#define IB_QPT_RC IBV_QPT_RC +#define IB_QPT_UC IBV_QPT_UC +#define IB_QPT_UD IBV_QPT_UD +/* ib_wc_opcode */ +#define ib_wc_opcode ibv_wc_opcode +#define IB_WC_SEND IBV_WC_SEND +#define IB_WC_RDMA_WRITE IBV_WC_RDMA_WRITE +#define IB_WC_RDMA_READ IBV_WC_RDMA_READ +#define IB_WC_COMP_SWAP IBV_WC_COMP_SWAP +#define IB_WC_FETCH_ADD IBV_WC_FETCH_ADD +#define IB_WC_BIND_MW IBV_WC_BIND_MW +#define IB_WC_RECV IBV_WC_RECV +#define IB_WC_RECV_RDMA_WITH_IMM IBV_WC_RECV_RDMA_WITH_IMM +/* ib_wc_status */ +#define ib_wc_status ibv_wc_status +#define IB_WC_LOC_LEN_ERR IBV_WC_LOC_LEN_ERR +#define IB_WC_LOC_QP_OP_ERR IBV_WC_LOC_QP_OP_ERR +#define IB_WC_LOC_EEC_OP_ERR IBV_WC_LOC_EEC_OP_ERR +#define IB_WC_LOC_PROT_ERR IBV_WC_LOC_PROT_ERR +#define IB_WC_WR_FLUSH_ERR IBV_WC_WR_FLUSH_ERR +#define IB_WC_MW_BIND_ERR IBV_WC_MW_BIND_ERR +#define IB_WC_GENERAL_ERR IBV_WC_GENERAL_ERR +#define IB_WC_REM_INV_REQ_ERR IBV_WC_REM_INV_REQ_ERR +#define IB_WC_REM_ACCESS_ERR IBV_WC_REM_ACCESS_ERR +#define IB_WC_REM_OP_ERR IBV_WC_REM_OP_ERR +#define IB_WC_REM_INV_RD_REQ_ERR IBV_WC_REM_INV_RD_REQ_ERR +#define IB_WC_RETRY_EXC_ERR IBV_WC_RETRY_EXC_ERR +#define IB_WC_RNR_RETRY_EXC_ERR 
IBV_WC_RNR_RETRY_EXC_ERR +#define IB_WC_REM_ABORT_ERR IBV_WC_REM_ABORT_ERR +#define IB_WC_INV_EECN_ERR IBV_WC_INV_EECN_ERR +#define IB_WC_INV_EEC_STATE_ERR IBV_WC_INV_EEC_STATE_ERR +#define IB_WC_BAD_RESP_ERR IBV_WC_BAD_RESP_ERR +#define IB_WC_FATAL_ERR IBV_WC_FATAL_ERR +#define IB_WC_SUCCESS IBV_WC_SUCCESS +/* ib_send_flags */ +#define IB_SEND_FENCE IBV_SEND_FENCE +#define IB_SEND_SIGNALED IBV_SEND_SIGNALED +#define IB_SEND_SOLICITED IBV_SEND_SOLICITED +#define IB_SEND_INLINE IBV_SEND_INLINE +#endif + +static inline int ehca_write_rwqe(struct ehca_qp_core *qp_core, + struct ehca_wqe *wqe_p, + struct ib_recv_wr *recv_wr) +{ + u8 cnt_ds; + if (unlikely((recv_wr->num_sge < 0) || + (recv_wr->num_sge > qp_core->ipz_rqueue.act_nr_of_sg))) { + EDEB_ERR(4, "Invalid number of WQE SGE. " + "num_sqe=%x max_nr_of_sg=%x", + recv_wr->num_sge, qp_core->ipz_rqueue.act_nr_of_sg); + return (-EINVAL); /* invalid SG list length */ + } + + clear_cacheline(wqe_p); + clear_cacheline((u8 *) wqe_p + 32); + clear_cacheline((u8 *) wqe_p + 64); + + wqe_p->work_request_id = be64_to_cpu(recv_wr->wr_id); + wqe_p->nr_of_data_seg = recv_wr->num_sge; + + for (cnt_ds = 0; cnt_ds < recv_wr->num_sge; cnt_ds++) { + wqe_p->u.all_rcv.sg_list[cnt_ds].vaddr = + be64_to_cpu(recv_wr->sg_list[cnt_ds].addr); + wqe_p->u.all_rcv.sg_list[cnt_ds].lkey = + ntohl(recv_wr->sg_list[cnt_ds].lkey); + wqe_p->u.all_rcv.sg_list[cnt_ds].length = + ntohl(recv_wr->sg_list[cnt_ds].length); + } + + if (IS_EDEB_ON(7)) { + EDEB(7, "RECEIVE WQE written into queue qp_core=%p", qp_core); + EDEB_DMP(7, wqe_p, 16*(6 + wqe_p->nr_of_data_seg), + "qp_core=%p", qp_core); + } + + return (0); +} + +/* internal use only + uncomment this line to enable trace output of GSI send wr */ +/* #define DEBUG_GSI_SEND_WR 1 */ +#if defined(__KERNEL__) && defined(DEBUG_GSI_SEND_WR) + +/* need ib_mad struct */ +#include + +static void trace_send_wr_ud(const struct ib_send_wr *send_wr) +{ + int idx = 0; + int j = 0; + while (send_wr != NULL) { + struct 
ib_mad_hdr *mad_hdr = send_wr->wr.ud.mad_hdr; + struct ib_sge *sge = send_wr->sg_list; + EDEB(4, "send_wr#%x wr_id=%lx num_sge=%x " + "send_flags=%x opcode=%x",idx, send_wr->wr_id, + send_wr->num_sge, send_wr->send_flags, send_wr->opcode); + if (mad_hdr != NULL) { + EDEB(4, "send_wr#%x mad_hdr base_version=%x " + "mgmt_class=%x class_version=%x method=%x " + "status=%x class_specific=%x tid=%lx attr_id=%x " + "resv=%x attr_mod=%x", + idx, mad_hdr->base_version, mad_hdr->mgmt_class, + mad_hdr->class_version, mad_hdr->method, + mad_hdr->status, mad_hdr->class_specific, + mad_hdr->tid, mad_hdr->attr_id, mad_hdr->resv, + mad_hdr->attr_mod); + } + for (j = 0; j < send_wr->num_sge; j++) { +#ifdef EHCA_USERDRIVER + u8 *data = (u8 *) sge->addr; +#else + u8 *data = (u8 *) abs_to_virt(sge->addr); +#endif + EDEB(4, "send_wr#%x sge#%x addr=%p length=%x lkey=%x", + idx, j, data, sge->length, sge->lkey); + /* assume length is n*16 */ + EDEB_DMP(4, data, sge->length, "send_wr#%x sge#%x", idx, j); + sge++; + } /* eof for j */ + idx++; + send_wr = send_wr->next; + } /* eof while send_wr */ +} + +#endif /* __KERNEL__ && DEBUG_GSI_SEND_WR */ + +static inline int ehca_write_swqe(struct ehca_qp_core *qp_core, + struct ehca_wqe *wqe_p, + const struct ib_send_wr *send_wr) +{ + u32 idx; + u64 dma_length; + struct ehca_av *my_av; + u32 remote_qkey = send_wr->wr.ud.remote_qkey; + + clear_cacheline(wqe_p); + clear_cacheline((u8 *) wqe_p + 32); + + if (unlikely((send_wr->num_sge < 0) || + (send_wr->num_sge > qp_core->ipz_squeue.act_nr_of_sg))) { + EDEB_ERR(4, "Invalid number of WQE SGE. 
" + "num_sqe=%x max_nr_of_sg=%x", + send_wr->num_sge, qp_core->ipz_rqueue.act_nr_of_sg); + return (-EINVAL); /* invalid SG list length */ + } + + wqe_p->work_request_id = be64_to_cpu(send_wr->wr_id); + + switch (send_wr->opcode) { + case IB_WR_SEND: + case IB_WR_SEND_WITH_IMM: + wqe_p->optype = WQE_OPTYPE_SEND; + break; + case IB_WR_RDMA_WRITE: + case IB_WR_RDMA_WRITE_WITH_IMM: + wqe_p->optype = WQE_OPTYPE_RDMAWRITE; + break; + case IB_WR_RDMA_READ: + wqe_p->optype = WQE_OPTYPE_RDMAREAD; + break; + default: + EDEB_ERR(4, "Invalid opcode=%x", send_wr->opcode); + return (-EINVAL); /* invalid opcode */ + } + + wqe_p->wqef = (send_wr->opcode) & 0xF0; + + wqe_p->wr_flag = 0; + if (send_wr->send_flags & IB_SEND_SIGNALED) { + wqe_p->wr_flag |= WQE_WRFLAG_REQ_SIGNAL_COM; + } + + if (send_wr->opcode == IB_WR_SEND_WITH_IMM || + send_wr->opcode == IB_WR_RDMA_WRITE_WITH_IMM) { + /* this might not work as long as HW does not support it */ + wqe_p->immediate_data = send_wr->imm_data; + wqe_p->wr_flag |= WQE_WRFLAG_IMM_DATA_PRESENT; + } + + wqe_p->nr_of_data_seg = send_wr->num_sge; + + switch (qp_core->qp_type) { +#ifdef __KERNEL__ + case IB_QPT_SMI: + case IB_QPT_GSI: +#endif /* __KERNEL__ */ + /* no break is intential here */ + case IB_QPT_UD: + /* IB 1.2 spec C10-15 compliance */ + if (send_wr->wr.ud.remote_qkey & 0x80000000) { + remote_qkey = qp_core->qkey; + } + wqe_p->destination_qp_number = + ntohl(send_wr->wr.ud.remote_qpn << 8); + wqe_p->local_ee_context_qkey = ntohl(remote_qkey); + if (send_wr->wr.ud.ah==NULL) { + EDEB_ERR(4, "wr.ud.ah is NULL. 
qp_core=%p", qp_core); + return (-EINVAL); + } + my_av = container_of(send_wr->wr.ud.ah, struct ehca_av, ib_ah); + wqe_p->u.ud_av.ud_av = my_av->av; + + /* omitted check of IB_SEND_INLINE + since HW does not support it */ + for (idx = 0; idx < send_wr->num_sge; idx++) { + wqe_p->u.ud_av.sg_list[idx].vaddr = + be64_to_cpu(send_wr->sg_list[idx].addr); + wqe_p->u.ud_av.sg_list[idx].lkey = + ntohl(send_wr->sg_list[idx].lkey); + wqe_p->u.ud_av.sg_list[idx].length = + ntohl(send_wr->sg_list[idx].length); + } /* eof for idx */ +#ifdef __KERNEL__ + if (qp_core->qp_type == IB_QPT_SMI || + qp_core->qp_type == IB_QPT_GSI) { + wqe_p->u.ud_av.ud_av.pmtu = 1; + } + if (qp_core->qp_type == IB_QPT_GSI) { + wqe_p->pkeyi = + ntohs(send_wr->wr.ud.pkey_index); +#ifdef DEBUG_GSI_SEND_WR + trace_send_wr_ud(send_wr); +#endif /* DEBUG_GSI_SEND_WR */ + } +#endif /* __KERNEL__ */ + break; + + case IB_QPT_UC: + if (send_wr->send_flags & IB_SEND_FENCE) { + wqe_p->wr_flag |= WQE_WRFLAG_FENCE; + } + /* no break is intential here */ + case IB_QPT_RC: + /*@@TODO atomic???*/ + wqe_p->u.nud.remote_virtual_adress = + be64_to_cpu(send_wr->wr.rdma.remote_addr); + wqe_p->u.nud.rkey = ntohl(send_wr->wr.rdma.rkey); + + /* omitted checking of IB_SEND_INLINE + since HW does not support it */ + dma_length = 0; + for (idx = 0; idx < send_wr->num_sge; idx++) { + wqe_p->u.nud.sg_list[idx].vaddr = + be64_to_cpu(send_wr->sg_list[idx].addr); + wqe_p->u.nud.sg_list[idx].lkey = + ntohl(send_wr->sg_list[idx].lkey); + wqe_p->u.nud.sg_list[idx].length = + ntohl(send_wr->sg_list[idx].length); + dma_length += send_wr->sg_list[idx].length; + } /* eof idx */ + wqe_p->u.nud.atomic_1st_op_dma_len = be64_to_cpu(dma_length); + + break; + + default: + EDEB_ERR(4, "Invalid qptype=%x", qp_core->qp_type); + return (-EINVAL); + } + + if (IS_EDEB_ON(7)) { + EDEB(7, "SEND WQE written into queue qp_core=%p ", qp_core); + EDEB_DMP(7, wqe_p, 16*(6 + wqe_p->nr_of_data_seg), + "qp_core=%p", qp_core); + } + return (0); +} + +/** @brief 
convert cqe_status to ib_wc_status + */ +static inline void map_ib_wc_status(u32 cqe_status, + enum ib_wc_status *wc_status) +{ + if (unlikely(cqe_status & 0x80000000)) { /* complete with errors */ + switch (cqe_status & 0x0000003F) { + case 0x01: + case 0x21: + *wc_status = IB_WC_LOC_LEN_ERR; + break; + case 0x02: + case 0x22: + *wc_status = IB_WC_LOC_QP_OP_ERR; + break; + case 0x03: + case 0x23: + *wc_status = IB_WC_LOC_EEC_OP_ERR; + break; + case 0x04: + case 0x24: + *wc_status = IB_WC_LOC_PROT_ERR; + break; + case 0x05: + case 0x25: + *wc_status = IB_WC_WR_FLUSH_ERR; + break; + case 0x06: + *wc_status = IB_WC_MW_BIND_ERR; + break; + case 0x07: /* remote error - look into bits 20:24 */ + switch ((cqe_status & 0x0000F800) >> 11) { + case 0x0: + /* PSN Sequence Error! + couldn't find a matching VAPI status! */ + *wc_status = IB_WC_GENERAL_ERR; + break; + case 0x1: + *wc_status = IB_WC_REM_INV_REQ_ERR; + break; + case 0x2: + *wc_status = IB_WC_REM_ACCESS_ERR; + break; + case 0x3: + *wc_status = IB_WC_REM_OP_ERR; + break; + case 0x4: + *wc_status = IB_WC_REM_INV_RD_REQ_ERR; + break; + } + break; + case 0x08: + *wc_status = IB_WC_RETRY_EXC_ERR; + break; + case 0x09: + *wc_status = IB_WC_RNR_RETRY_EXC_ERR; + break; + case 0x0A: + case 0x2D: + *wc_status = IB_WC_REM_ABORT_ERR; + break; + case 0x0B: + case 0x2E: + *wc_status = IB_WC_INV_EECN_ERR; + break; + case 0x0C: + case 0x2F: + *wc_status = IB_WC_INV_EEC_STATE_ERR; + break; + case 0x0D: + *wc_status = IB_WC_BAD_RESP_ERR; + break; + case 0x10: + /* WQE purged */ + *wc_status = IB_WC_WR_FLUSH_ERR; + break; + default: + *wc_status = IB_WC_FATAL_ERR; + + } + } else { + *wc_status = IB_WC_SUCCESS; + } +} + From rolandd at cisco.com Sat Feb 18 11:57:59 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:59 -0800 Subject: [PATCH 21/22] ehca main file In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> 
Message-ID: <20060218005759.13620.10968.stgit@localhost.localdomain> From: Roland Dreier What is ehca_show_flightrecorder() trying to do that snprintf() is not fast enough? If you need to pass a binary structure back to userspace (with a kernel address in it??) then sysfs is not the right place to put it. Look at debugfs; or relayfs might make the most sense for your flightrecorder stuff. --- drivers/infiniband/hw/ehca/ehca_main.c | 1032 ++++++++++++++++++++++++++++++++ 1 files changed, 1032 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c new file mode 100644 index 0000000..2e2be06 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -0,0 +1,1032 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * module start stop, hca detection + * + * Authors: Heiko J Schick + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_main.c,v 1.137 2006/02/06 16:20:38 schickhj Exp $ + */ + +#define DEB_PREFIX "shca" + +#include "ehca_kernel.h" +#include "ehca_tools.h" +#include "ehca_classes.h" +#include "ehca_iverbs.h" +#include "ehca_eq.h" +#include "ehca_mrmw.h" + +#include "hcp_sense.h" /* TODO: later via hipz_* header file */ +#include "hcp_if.h" /* TODO: later via hipz_* header file */ + +MODULE_LICENSE("Dual BSD/GPL"); +MODULE_AUTHOR("Christoph Raisch "); +MODULE_DESCRIPTION("IBM eServer HCA Driver"); +MODULE_VERSION("EHCA2_0047"); + +#ifdef EHCA_USERDRIVER +int ehca_open_aqp1 = 1; +#else +int ehca_open_aqp1 = 0; +#endif +int ehca_tracelevel = -1; +int ehca_hw_level = 0; +int ehca_nr_ports = 2; +int ehca_use_hp_mr = 0; +int ehca_port_act_time = 30; +int ehca_poll_all_eqs = 1; +int ehca_static_rate = -1; + +module_param_named(open_aqp1, ehca_open_aqp1, int, 0); +module_param_named(tracelevel, ehca_tracelevel, int, 0); +module_param_named(hw_level, ehca_hw_level, int, 0); +module_param_named(nr_ports, ehca_nr_ports, int, 0); +module_param_named(use_hp_mr, ehca_use_hp_mr, int, 0); +module_param_named(port_act_time, ehca_port_act_time, int, 0); +module_param_named(poll_all_eqs, ehca_poll_all_eqs, int, 0); +module_param_named(static_rate, ehca_static_rate, int, 0); + +MODULE_PARM_DESC(open_aqp1, "0 no define AQP1 on startup (default)," + "1 define AQP1 on startup"); +MODULE_PARM_DESC(tracelevel, "0 maximum performance (no messages)," + "9 
maximum messages (no performance)"); +MODULE_PARM_DESC(hw_level, "0 autosensing," + "1 v. 0.20," + "2 v. 0.21"); +MODULE_PARM_DESC(nr_ports, "number of connected ports (default: 2)"); +MODULE_PARM_DESC(use_hp_mr, "use high performance MRs," + "0 no (default)," + "1 yes"); +MODULE_PARM_DESC(port_act_time, "time to wait for port activation" + "(default: 30 sec.)"); +MODULE_PARM_DESC(poll_all_eqs, "polls all event queues periodically" + "0 no," + "1 yes (default)"); +MODULE_PARM_DESC(static_rate, "set permanent static rate (default: disabled)"); + +/* This external trace mask controls what will end up in the + * kernel ring buffer. Number 6 means, that everything between + * 0 and 5 will be stored. + */ +u8 ehca_edeb_mask[EHCA_EDEB_TRACE_MASK_SIZE]={6,6,6,6, + 6,6,6,6, + 6,6,6,6, + 6,6,6,6, + 6,6,6,6, + 6,6,6,6, + 6,6,6,6, + 6,6,1,0}; + /* offset 0x1e is flightrecorder */ +EXPORT_SYMBOL(ehca_edeb_mask); + +atomic_t ehca_flightrecorder_index = ATOMIC_INIT(1); +unsigned long ehca_flightrecorder[EHCA_FLIGHTRECORDER_SIZE]; +EXPORT_SYMBOL(ehca_flightrecorder_index); +EXPORT_SYMBOL(ehca_flightrecorder); + +DECLARE_RWSEM(ehca_qp_idr_sem); +DECLARE_RWSEM(ehca_cq_idr_sem); +DEFINE_IDR(ehca_qp_idr); +DEFINE_IDR(ehca_cq_idr); + +struct ehca_module ehca_module; +struct workqueue_struct *ehca_wq; +struct task_struct *ehca_kthread_eq; + +/** + * ehca_init_trace - TODO + */ +void ehca_init_trace(void) +{ + EDEB_EN(7, ""); + + if (ehca_tracelevel != -1) { + int i; + for (i = 0; i < EHCA_EDEB_TRACE_MASK_SIZE; i++) + ehca_edeb_mask[i] = ehca_tracelevel; + } + + EDEB_EX(7, ""); +} + +/** + * ehca_init_flight - TODO + */ +void ehca_init_flight(void) +{ + EDEB_EN(7, ""); + + memset(ehca_flightrecorder, 0xFA, + sizeof(unsigned long) * EHCA_FLIGHTRECORDER_SIZE); + atomic_set(&ehca_flightrecorder_index, 0); + ehca_flightrecorder[0] = 0x12345678abcdef0; + + EDEB_EX(7, ""); +} + +/** + * ehca_flight_to_printk - TODO + */ +void ehca_flight_to_printk(void) +{ + int cur_offset = 
atomic_read(&ehca_flightrecorder_index); + int new_offset = cur_offset - (EHCA_FLIGHTRECORDER_BACKLOG * 4); + u32 flight_offset; + int i; + + if (new_offset < 0) + new_offset = EHCA_FLIGHTRECORDER_SIZE + new_offset - 4; + + printk(KERN_ERR + "EHCA ----- flight recorder begin " + "-------------------------------------------\n"); + + for (i = 0; i < EHCA_FLIGHTRECORDER_BACKLOG; i++) { + new_offset += 4; + flight_offset = (u32) new_offset % EHCA_FLIGHTRECORDER_SIZE; + + printk(KERN_ERR "EHCA %02d: %.16lX %.16lX %.16lX %.16lX\n", + i + 1, + ehca_flightrecorder[flight_offset], + ehca_flightrecorder[flight_offset + 1], + ehca_flightrecorder[flight_offset + 2], + ehca_flightrecorder[flight_offset + 3]); + } + + printk(KERN_ERR + "EHCA ----- flight recorder end " + "---------------------------------------------\n"); +} + +#define EHCA_CACHE_CREATE(name) \ + ehca_module->cache_##name = \ + kmem_cache_create("ehca_cache_"#name, \ + sizeof(struct ehca_##name), \ + 0, SLAB_HWCACHE_ALIGN, \ + NULL, NULL); \ + if (ehca_module->cache_##name == NULL) { \ + EDEB_ERR(4, "Cannot create "#name" SLAB cache."); \ + return -ENOMEM; \ + } \ + +/** + * ehca_caches_create: TODO + */ +int ehca_caches_create(struct ehca_module *ehca_module) +{ + EDEB_EN(7, ""); + + EHCA_CACHE_CREATE(pd); + EHCA_CACHE_CREATE(cq); + EHCA_CACHE_CREATE(qp); + EHCA_CACHE_CREATE(av); + EHCA_CACHE_CREATE(mw); + EHCA_CACHE_CREATE(mr); + + EDEB_EX(7, ""); + + return 0; +} + +#define EHCA_CACHE_DESTROY(name) \ + ret = kmem_cache_destroy(ehca_module->cache_##name); \ + if (ret != 0) { \ + EDEB_ERR(4, "Cannot destroy "#name" SLAB cache. 
ret=%x", ret); \ + return ret; \ + } \ + +/** + * ehca_caches_destroy - TODO + */ +int ehca_caches_destroy(struct ehca_module *ehca_module) +{ + int ret; + + EDEB_EN(7, ""); + + EHCA_CACHE_DESTROY(pd); + EHCA_CACHE_DESTROY(cq); + EHCA_CACHE_DESTROY(qp); + EHCA_CACHE_DESTROY(av); + EHCA_CACHE_DESTROY(mw); + EHCA_CACHE_DESTROY(mr); + + EDEB_EX(7, ""); + + return 0; +} + +#define EHCA_HCAAVER EHCA_BMASK_IBM(32,39) +#define EHCA_REVID EHCA_BMASK_IBM(40,63) + +/** + * ehca_num_ports - TODO + */ +int ehca_sense_attributes(struct ehca_shca *shca) +{ + int ret = -EINVAL; + u64 rc = H_Success; + struct query_hca_rblock *rblock; + + EDEB_EN(7, "shca=%p", shca); + + rblock = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (rblock == NULL) { + EDEB_ERR(4, "Cannot allocate rblock memory."); + ret = -ENOMEM; + goto num_ports0; + } + + memset(rblock, 0, PAGE_SIZE); + + rc = hipz_h_query_hca(shca->ipz_hca_handle, rblock); + if (rc != H_Success) { + EDEB_ERR(4, "Cannot query device properties.rc=%lx", rc); + ret = -EPERM; + goto num_ports1; + } + + if (ehca_nr_ports == 1) + shca->num_ports = 1; + else + shca->num_ports = (u8) rblock->num_ports; + + EDEB(6, " ... found %x ports", rblock->num_ports); + + if (ehca_hw_level == 0) { + u32 hcaaver; + u32 revid; + + hcaaver = EHCA_BMASK_GET(EHCA_HCAAVER, rblock->hw_ver); + revid = EHCA_BMASK_GET(EHCA_REVID, rblock->hw_ver); + + EDEB(6, " ... hardware version=%x:%x", + hcaaver, revid); + + if ((hcaaver == 1) && (revid == 0)) + shca->hw_level = 0; + else if ((hcaaver == 1) && (revid == 1)) + shca->hw_level = 1; + else if ((hcaaver == 1) && (revid == 2)) + shca->hw_level = 2; + } + EDEB(6, " ... 
hardware level=%x", shca->hw_level); + + ret = 0; + + num_ports1: + kfree(rblock); + + num_ports0: + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +static int init_node_guid(struct ehca_shca* shca) +{ + int ret = 0; + struct query_hca_rblock *rblock; + + EDEB_EN(7, ""); + + rblock = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (rblock == NULL) { + EDEB_ERR(4, "Can't allocate rblock memory."); + ret = -ENOMEM; + goto init_node_guid0; + } + + memset(rblock, 0, PAGE_SIZE); + + if (hipz_h_query_hca(shca->ipz_hca_handle, rblock) != H_Success) { + EDEB_ERR(4, "Can't query device properties"); + ret = -EINVAL; + goto init_node_guid1; + } + + memcpy(&shca->ib_device.node_guid, &rblock->node_guid, (sizeof(u64))); + + init_node_guid1: + kfree(rblock); + + init_node_guid0: + EDEB_EX(7, "node_guid=%lx ret=%x", shca->ib_device.node_guid, ret); + + return ret; +} + +int ehca_register_device(struct ehca_shca *shca) +{ + int ret = 0; + + EDEB_EN(7, "shca=%p", shca); + + ret = init_node_guid(shca); + if (ret != 0) + return ret; + + strlcpy(shca->ib_device.name, "ehca%d", IB_DEVICE_NAME_MAX); + shca->ib_device.owner = THIS_MODULE; + + /* TODO: ABI ver later with define */ + shca->ib_device.uverbs_abi_ver = 1; + shca->ib_device.uverbs_cmd_mask = + (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | + (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | + (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | + (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_REG_MR) | + (1ull << IB_USER_VERBS_CMD_DEREG_MR) | + (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | + (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | + (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | + (1ull << IB_USER_VERBS_CMD_CREATE_QP) | + (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | + (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | + (1ull << IB_USER_VERBS_CMD_ATTACH_MCAST) | + (1ull << IB_USER_VERBS_CMD_DETACH_MCAST); + + shca->ib_device.node_type = RDMA_NODE_IB_CA; + shca->ib_device.phys_port_cnt = shca->num_ports; + 
shca->ib_device.dma_device = &shca->ibmebus_dev->ofdev.dev; + shca->ib_device.query_device = ehca_query_device; + shca->ib_device.query_port = ehca_query_port; + shca->ib_device.query_gid = ehca_query_gid; + shca->ib_device.query_pkey = ehca_query_pkey; + /* shca->in_device.modify_device = ehca_modify_device */ + shca->ib_device.modify_port = ehca_modify_port; + shca->ib_device.alloc_ucontext = ehca_alloc_ucontext; + shca->ib_device.dealloc_ucontext = ehca_dealloc_ucontext; + shca->ib_device.alloc_pd = ehca_alloc_pd; + shca->ib_device.dealloc_pd = ehca_dealloc_pd; + shca->ib_device.create_ah = ehca_create_ah; + /* shca->ib_device.modify_ah = ehca_modify_ah; */ + shca->ib_device.query_ah = ehca_query_ah; + shca->ib_device.destroy_ah = ehca_destroy_ah; + shca->ib_device.create_qp = ehca_create_qp; + shca->ib_device.modify_qp = ehca_modify_qp; + shca->ib_device.query_qp = ehca_query_qp; + shca->ib_device.destroy_qp = ehca_destroy_qp; + shca->ib_device.post_send = ehca_post_send; + shca->ib_device.post_recv = ehca_post_recv; + shca->ib_device.create_cq = ehca_create_cq; + shca->ib_device.destroy_cq = ehca_destroy_cq; + + /* TODO: disabled due to func signature conflict */ + /* shca->ib_device.resize_cq = ehca_resize_cq; */ + + shca->ib_device.poll_cq = ehca_poll_cq; + /* shca->ib_device.peek_cq = ehca_peek_cq; */ + shca->ib_device.req_notify_cq = ehca_req_notify_cq; + /* shca->ib_device.req_ncomp_notif = ehca_req_ncomp_notif; */ + shca->ib_device.get_dma_mr = ehca_get_dma_mr; + shca->ib_device.reg_phys_mr = ehca_reg_phys_mr; + shca->ib_device.reg_user_mr = ehca_reg_user_mr; + shca->ib_device.query_mr = ehca_query_mr; + shca->ib_device.dereg_mr = ehca_dereg_mr; + shca->ib_device.rereg_phys_mr = ehca_rereg_phys_mr; + shca->ib_device.alloc_mw = ehca_alloc_mw; + shca->ib_device.bind_mw = ehca_bind_mw; + shca->ib_device.dealloc_mw = ehca_dealloc_mw; + shca->ib_device.alloc_fmr = ehca_alloc_fmr; + shca->ib_device.map_phys_fmr = ehca_map_phys_fmr; + shca->ib_device.unmap_fmr 
= ehca_unmap_fmr; + shca->ib_device.dealloc_fmr = ehca_dealloc_fmr; + shca->ib_device.attach_mcast = ehca_attach_mcast; + shca->ib_device.detach_mcast = ehca_detach_mcast; + /* shca->ib_device.process_mad = ehca_process_mad; */ + shca->ib_device.mmap = ehca_mmap; + + ret = ib_register_device(&shca->ib_device); + + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +/** + * ehca_create_aqp1 - TODO + * + * @shca: TODO + */ +static int ehca_create_aqp1(struct ehca_shca *shca, u32 port) +{ + struct ehca_sport *sport; + struct ib_cq *ibcq; + struct ib_qp *ibqp; + struct ib_qp_init_attr qp_init_attr; + int ret = 0; + + EDEB_EN(7, "shca=%p port=%x", shca, port); + + sport = &shca->sport[port - 1]; + + if (sport->ibcq_aqp1 != NULL) { + EDEB_ERR(4, "AQP1 CQ is already created."); + return -EPERM; + } + + ibcq = ib_create_cq(&shca->ib_device, NULL, NULL, (void*)(-1), 10); + if (IS_ERR(ibcq)) { + EDEB_ERR(4, "Cannot create AQP1 CQ."); + return PTR_ERR(ibcq); + } + sport->ibcq_aqp1 = ibcq; + + if (sport->ibqp_aqp1 != NULL) { + EDEB_ERR(4, "AQP1 QP is already created."); + ret = -EPERM; + goto create_aqp1; + } + + memset(&qp_init_attr, 0, sizeof(struct ib_qp_init_attr)); + qp_init_attr.send_cq = ibcq; + qp_init_attr.recv_cq = ibcq; + qp_init_attr.sq_sig_type = IB_SIGNAL_ALL_WR; + qp_init_attr.cap.max_send_wr = 100; + qp_init_attr.cap.max_recv_wr = 100; + qp_init_attr.cap.max_send_sge = 2; + qp_init_attr.cap.max_recv_sge = 1; + qp_init_attr.qp_type = IB_QPT_GSI; + qp_init_attr.port_num = port; + qp_init_attr.qp_context = NULL; + qp_init_attr.event_handler = NULL; + qp_init_attr.srq = NULL; + + ibqp = ib_create_qp(&shca->pd->ib_pd, &qp_init_attr); + if (IS_ERR(ibqp)) { + EDEB_ERR(4, "Cannot create AQP1 QP."); + ret = PTR_ERR(ibqp); + goto create_aqp1; + } + sport->ibqp_aqp1 = ibqp; + + EDEB_EX(7, "ret=%x", ret); + + return ret; + + create_aqp1: + ib_destroy_cq(sport->ibcq_aqp1); + + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +/** + * ehca_destroy_aqp1 - TODO + */ +static int 
ehca_destroy_aqp1(struct ehca_sport *sport) +{ + int ret = 0; + + EDEB_EN(7, "sport=%p", sport); + + ret = ib_destroy_qp(sport->ibqp_aqp1); + if (ret != 0) { + EDEB_ERR(4, "Cannot destroy AQP1 QP. ret=%x", ret); + goto destroy_aqp1; + } + + ret = ib_destroy_cq(sport->ibcq_aqp1); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy AQP1 CQ. ret=%x", ret); + + destroy_aqp1: + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +static ssize_t ehca_show_debug_level(struct device_driver *ddp, char *buf) +{ + int f; + int total = 0; + total += snprintf(buf + total, PAGE_SIZE - total, "%d", + ehca_edeb_mask[0]); + for (f = 1; f < EHCA_EDEB_TRACE_MASK_SIZE; f++) { + total += snprintf(buf + total, PAGE_SIZE - total, ",%d", + ehca_edeb_mask[f]); + } + + total += snprintf(buf + total, PAGE_SIZE - total, "\n"); + + return total; +} + +static ssize_t ehca_store_debug_level(struct device_driver *ddp, + const char *buf, size_t count) +{ + int f; + for (f = 0; f < EHCA_EDEB_TRACE_MASK_SIZE; f++) { + char value = buf[f * 2] - '0'; + if ((value <= 9) && (count >= f * 2)) { + ehca_edeb_mask[f] = value; + } + } + return count; +} +DRIVER_ATTR(debug_level, S_IRUSR | S_IWUSR, + ehca_show_debug_level, ehca_store_debug_level); + +static ssize_t ehca_show_flightrecorder(struct device_driver *ddp, + char *buf) +{ + /* this is not style compliant, but snprintf is not fast enough */ + u64 *lbuf = (u64 *) buf; + lbuf[0] = (u64) & ehca_flightrecorder; + lbuf[1] = EHCA_FLIGHTRECORDER_SIZE; + lbuf[2] = atomic_read(&ehca_flightrecorder_index); + return sizeof(u64) * 3; +} +DRIVER_ATTR(flightrecorder, S_IRUSR, ehca_show_flightrecorder, 0); + +void ehca_create_driver_sysfs(struct ibmebus_driver *drv) +{ + driver_create_file(&drv->driver, &driver_attr_debug_level); + driver_create_file(&drv->driver, &driver_attr_flightrecorder); +} + +void ehca_remove_driver_sysfs(struct ibmebus_driver *drv) +{ + driver_remove_file(&drv->driver, &driver_attr_debug_level); + driver_remove_file(&drv->driver, 
&driver_attr_flightrecorder); +} + +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12) +#define EHCA_RESOURCE_ATTR_H(name) \ +static ssize_t ehca_show_##name(struct device *dev, \ + struct device_attribute *attr, \ + char *buf) +#else +#define EHCA_RESOURCE_ATTR_H(name) \ +static ssize_t ehca_show_##name(struct device *dev, \ + char *buf) +#endif + +#define EHCA_RESOURCE_ATTR(name) \ +EHCA_RESOURCE_ATTR_H(name) \ +{ \ + struct ehca_shca *shca; \ + struct query_hca_rblock *rblock; \ + int len; \ + \ + shca = dev->driver_data; \ + \ + rblock = kmalloc(PAGE_SIZE, GFP_KERNEL); \ + if (rblock == NULL) { \ + EDEB_ERR(4, "Can't allocate rblock memory."); \ + return 0; \ + } \ + \ + memset(rblock, 0, PAGE_SIZE); \ + \ + if (hipz_h_query_hca(shca->ipz_hca_handle, rblock) != H_Success) { \ + EDEB_ERR(4, "Can't query device properties"); \ + kfree(rblock); \ + return 0; \ + } \ + \ + if ((strcmp(#name, "num_ports") == 0) && (ehca_nr_ports == 1)) \ + len = snprintf(buf, 256, "1"); \ + else \ + len = snprintf(buf, 256, "%d", rblock->name); \ + \ + if (len < 0) \ + return 0; \ + buf[len] = '\n'; \ + buf[len+1] = 0; \ + \ + kfree(rblock); \ + \ + return len+1; \ +} \ +static DEVICE_ATTR(name, S_IRUGO, ehca_show_##name, NULL); + +EHCA_RESOURCE_ATTR(num_ports); +EHCA_RESOURCE_ATTR(hw_ver); +EHCA_RESOURCE_ATTR(max_eq); +EHCA_RESOURCE_ATTR(cur_eq); +EHCA_RESOURCE_ATTR(max_cq); +EHCA_RESOURCE_ATTR(cur_cq); +EHCA_RESOURCE_ATTR(max_qp); +EHCA_RESOURCE_ATTR(cur_qp); +EHCA_RESOURCE_ATTR(max_mr); +EHCA_RESOURCE_ATTR(cur_mr); +EHCA_RESOURCE_ATTR(max_mw); +EHCA_RESOURCE_ATTR(cur_mw); +EHCA_RESOURCE_ATTR(max_pd); +EHCA_RESOURCE_ATTR(max_ah); + +static ssize_t ehca_show_adapter_handle(struct device *dev, +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12) + struct device_attribute *attr, +#endif + char *buf) +{ + struct ehca_shca *shca = dev->driver_data; + + return sprintf(buf, "%lx\n", shca->ipz_hca_handle.handle); + +} +static DEVICE_ATTR(adapter_handle, S_IRUGO, ehca_show_adapter_handle, 
NULL); + + + +void ehca_create_device_sysfs(struct ibmebus_dev *dev) +{ + device_create_file(&dev->ofdev.dev, &dev_attr_adapter_handle); + device_create_file(&dev->ofdev.dev, &dev_attr_num_ports); + device_create_file(&dev->ofdev.dev, &dev_attr_hw_ver); + device_create_file(&dev->ofdev.dev, &dev_attr_max_eq); + device_create_file(&dev->ofdev.dev, &dev_attr_cur_eq); + device_create_file(&dev->ofdev.dev, &dev_attr_max_cq); + device_create_file(&dev->ofdev.dev, &dev_attr_cur_cq); + device_create_file(&dev->ofdev.dev, &dev_attr_max_qp); + device_create_file(&dev->ofdev.dev, &dev_attr_cur_qp); + device_create_file(&dev->ofdev.dev, &dev_attr_max_mr); + device_create_file(&dev->ofdev.dev, &dev_attr_cur_mr); + device_create_file(&dev->ofdev.dev, &dev_attr_max_mw); + device_create_file(&dev->ofdev.dev, &dev_attr_cur_mw); + device_create_file(&dev->ofdev.dev, &dev_attr_max_pd); + device_create_file(&dev->ofdev.dev, &dev_attr_max_ah); +} + +void ehca_remove_device_sysfs(struct ibmebus_dev *dev) +{ + device_remove_file(&dev->ofdev.dev, &dev_attr_adapter_handle); + device_remove_file(&dev->ofdev.dev, &dev_attr_num_ports); + device_remove_file(&dev->ofdev.dev, &dev_attr_hw_ver); + device_remove_file(&dev->ofdev.dev, &dev_attr_max_eq); + device_remove_file(&dev->ofdev.dev, &dev_attr_cur_eq); + device_remove_file(&dev->ofdev.dev, &dev_attr_max_cq); + device_remove_file(&dev->ofdev.dev, &dev_attr_cur_cq); + device_remove_file(&dev->ofdev.dev, &dev_attr_max_qp); + device_remove_file(&dev->ofdev.dev, &dev_attr_cur_qp); + device_remove_file(&dev->ofdev.dev, &dev_attr_max_mr); + device_remove_file(&dev->ofdev.dev, &dev_attr_cur_mr); + device_remove_file(&dev->ofdev.dev, &dev_attr_max_mw); + device_remove_file(&dev->ofdev.dev, &dev_attr_cur_mw); + device_remove_file(&dev->ofdev.dev, &dev_attr_max_pd); + device_remove_file(&dev->ofdev.dev, &dev_attr_max_ah); +} + +/** + * ehca_probe - TODO + */ +static int __devinit ehca_probe(struct ibmebus_dev *dev, + const struct of_device_id *id) +{ 
+ struct ehca_shca *shca; + u64 *handle; + struct ib_pd *ibpd; + int ret = 0; + + EDEB_EN(7, "name=%s", dev->name); + + handle = (u64 *)get_property(dev->ofdev.node, "ibm,hca-handle", NULL); + if (!handle) { + EDEB_ERR(4, "Cannot get eHCA handle for adapter: %s.", + dev->ofdev.node->full_name); + return -ENODEV; + } + + if (!(*handle)) { + EDEB_ERR(4, "Wrong eHCA handle for adapter: %s.", + dev->ofdev.node->full_name); + return -ENODEV; + } + + shca = (struct ehca_shca *)ib_alloc_device(sizeof(*shca)); + if (shca == NULL) { + EDEB_ERR(4, "Cannot allocate shca memory."); + return -ENOMEM; + } + + shca->ibmebus_dev = dev; + shca->ipz_hca_handle.handle = *handle; + dev->ofdev.dev.driver_data = shca; + + ret = ehca_sense_attributes(shca); + if (ret < 0) { + EDEB_ERR(4, "Cannot sense eHCA attributes."); + goto probe1; + } + + /* create event queues */ + ret = ehca_create_eq(shca, &shca->eq, EHCA_EQ, 2048); + if (ret != 0) { + EDEB_ERR(4, "Cannot create EQ."); + goto probe1; + } + + ret = ehca_create_eq(shca, &shca->neq, EHCA_NEQ, 513); + if (ret != 0) { + EDEB_ERR(4, "Cannot create NEQ."); + goto probe2; + } + + /* create internal protection domain */ + ibpd = ehca_alloc_pd(&shca->ib_device, (void*)(-1), 0); + if (IS_ERR(ibpd)) { + EDEB_ERR(4, "Cannot create internal PD."); + ret = PTR_ERR(ibpd); + goto probe3; + } + + shca->pd = container_of(ibpd, struct ehca_pd, ib_pd); + shca->pd->ib_pd.device = &shca->ib_device; + + /* create internal max MR */ + if (shca->maxmr == 0) { + struct ehca_mr *e_maxmr = 0; + ret = ehca_reg_internal_maxmr(shca, shca->pd, &e_maxmr); + if (ret != 0) { + EDEB_ERR(4, "Cannot create internal MR. 
ret=%x", ret); + goto probe4; + } + shca->maxmr = e_maxmr; + } + + ret = ehca_register_device(shca); + if (ret != 0) { + EDEB_ERR(4, "Cannot register Infiniband device."); + goto probe5; + } + + /* create AQP1 for port 1 */ + if (ehca_open_aqp1 == 1) { + shca->sport[0].port_state = IB_PORT_DOWN; + ret = ehca_create_aqp1(shca, 1); + if (ret != 0) { + EDEB_ERR(4, "Cannot create AQP1 for port 1."); + goto probe6; + } + } + + /* create AQP1 for port 2 */ + if ((ehca_open_aqp1 == 1) && (shca->num_ports == 2)) { + shca->sport[1].port_state = IB_PORT_DOWN; + ret = ehca_create_aqp1(shca, 2); + if (ret != 0) { + EDEB_ERR(4, "Cannot create AQP1 for port 2."); + goto probe7; + } + } + + ehca_create_device_sysfs(dev); + + spin_lock(&ehca_module.shca_lock); + list_add(&shca->shca_list, &ehca_module.shca_list); + spin_unlock(&ehca_module.shca_lock); + + EDEB_EX(7, "ret=%x", ret); + + return 0; + + probe7: + ret = ehca_destroy_aqp1(&shca->sport[0]); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy AQP1 for port 1. ret=%x", ret); + + probe6: + ib_unregister_device(&shca->ib_device); + + probe5: + ret = ehca_dereg_internal_maxmr(shca); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy internal MR. ret=%x", ret); + + probe4: + ret = ehca_dealloc_pd(&shca->pd->ib_pd); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy internal PD. ret=%x", ret); + + probe3: + ret = ehca_destroy_eq(shca, &shca->neq); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy NEQ. ret=%x", ret); + + probe2: + ret = ehca_destroy_eq(shca, &shca->eq); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy EQ. 
ret=%x", ret); + + probe1: + ib_dealloc_device(&shca->ib_device); + + EDEB_EX(4, "ret=%x", ret); + + return -EINVAL; +} + +static int __devexit ehca_remove(struct ibmebus_dev *dev) +{ + struct ehca_shca *shca = dev->ofdev.dev.driver_data; + int ret; + + EDEB_EN(7, "shca=%p", shca); + + ehca_remove_device_sysfs(dev); + + if (ehca_open_aqp1 == 1) { + int i; + + for (i = 0; i < shca->num_ports; i++) { + ret = ehca_destroy_aqp1(&shca->sport[i]); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy AQP1 for port %x." + " ret=%x", i, ret); + } + } + + ib_unregister_device(&shca->ib_device); + + ret = ehca_dereg_internal_maxmr(shca); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy internal MR. ret=%x", ret); + + ret = ehca_dealloc_pd(&shca->pd->ib_pd); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy internal PD. ret=%x", ret); + + ret = ehca_destroy_eq(shca, &shca->eq); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy EQ. ret=%x", ret); + + ret = ehca_destroy_eq(shca, &shca->neq); + if (ret != 0) + EDEB_ERR(4, "Cannot destroy NEQ. ret=%x", ret); + + ib_dealloc_device(&shca->ib_device); + + spin_lock(&ehca_module.shca_lock); + list_del(&shca->shca_list); + spin_unlock(&ehca_module.shca_lock); + + EDEB_EX(7, "ret=%x", ret); + + return ret; +} + +static struct of_device_id ehca_device_table[] = +{ + { + .name = "lhca", + .compatible = "IBM,lhca", + }, + {}, +}; + +static struct ibmebus_driver ehca_driver = { + .name = "ehca", + .id_table = ehca_device_table, + .probe = ehca_probe, + .remove = ehca_remove, +}; + +/** + * ehca_module_init - eHCA initialization routine. 
+ */ +int __init ehca_module_init(void) +{ + int ret = 0; + + printk(KERN_INFO "eHCA Infiniband Device Driver " + "(Rel.: EHCA2_0047)\n"); + EDEB_EN(7, ""); + + idr_init(&ehca_qp_idr); + idr_init(&ehca_cq_idr); + + INIT_LIST_HEAD(&ehca_module.shca_list); + spin_lock_init(&ehca_module.shca_lock); + + ehca_init_trace(); + ehca_init_flight(); + + ehca_wq = create_workqueue("ehca"); + if (ehca_wq == NULL) { + EDEB_ERR(4, "Cannot create workqueue."); + ret = -ENOMEM; + goto module_init0; + } + + if ((ret = ehca_caches_create(&ehca_module)) != 0) { + ehca_catastrophic("Cannot create SLAB caches"); + ret = -ENOMEM; + goto module_init1; + } + + if ((ret = ibmebus_register_driver(&ehca_driver)) != 0) { + ehca_catastrophic("Cannot register eHCA device driver"); + ret = -EINVAL; + goto module_init2; + } + + ehca_create_driver_sysfs(&ehca_driver); + + if (ehca_poll_all_eqs != 1) { + EDEB_ERR(4, "WARNING!!!"); + EDEB_ERR(4, "It is possible to lose interrupts."); + + return 0; + } + + ehca_kthread_eq = kthread_create(ehca_poll_eqs, &ehca_module, + "ehca_poll_eqs"); + if (IS_ERR(ehca_kthread_eq)) { + EDEB_ERR(4, "Cannot create kthread_eq"); + ret = PTR_ERR(ehca_kthread_eq); + goto module_init3; + } + + wake_up_process(ehca_kthread_eq); + + EDEB_EX(7, "ret=%x", ret); + + return 0; + + module_init3: + ehca_remove_driver_sysfs(&ehca_driver); + ibmebus_unregister_driver(&ehca_driver); + + module_init2: + ehca_caches_destroy(&ehca_module); + + module_init1: + destroy_workqueue(ehca_wq); + + module_init0: + EDEB_EX(7, "ret=%x", ret); + + return ret; +}; + +/** + * ehca_module_exit - eHCA exit routine. 
+ */ +void __exit ehca_module_exit(void) +{ + EDEB_EN(7, ""); + + if (ehca_poll_all_eqs == 1) + kthread_stop(ehca_kthread_eq); + + ehca_remove_driver_sysfs(&ehca_driver); + ibmebus_unregister_driver(&ehca_driver); + + if (ehca_caches_destroy(&ehca_module) != 0) + ehca_catastrophic("Cannot destroy SLAB caches"); + + destroy_workqueue(ehca_wq); + +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,15) + idr_destroy_ext(&ehca_cq_idr); + idr_destroy_ext(&ehca_qp_idr); +#else + idr_destroy(&ehca_cq_idr); + idr_destroy(&ehca_qp_idr); +#endif + + EDEB_EX(7, ""); +}; + +module_init(ehca_module_init); +module_exit(ehca_module_exit); From rolandd at cisco.com Sat Feb 18 11:57:43 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:43 -0800 Subject: [PATCH 14/22] ehca completion queue handling In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005743.13620.29456.stgit@localhost.localdomain> From: Roland Dreier --- drivers/infiniband/hw/ehca/ehca_cq.c | 416 ++++++++++++++++++++++++++++++++++ 1 files changed, 416 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c new file mode 100644 index 0000000..ebee9c3 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_cq.c @@ -0,0 +1,416 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * Completion queue handling + * + * Authors: Waleri Fomin + * Reinhard Ernst + * Heiko J Schick + * Hoang-Nam Nguyen + * + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. 
+ * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: ehca_cq.c,v 1.61 2006/02/06 10:17:34 schickhj Exp $ + */ + +#define DEB_PREFIX "e_cq" + +#include "ehca_kernel.h" +#include "ehca_common.h" +#include "ehca_iverbs.h" +#include "ehca_classes.h" +#include "ehca_irq.h" +#include "hcp_if.h" +#include +#include + +#define HIPZ_CQ_REGISTER_ORIG 0 + +int ehca_cq_assign_qp(struct ehca_cq *cq, struct ehca_qp *qp) +{ + unsigned int qp_num = qp->ehca_qp_core.real_qp_num; + unsigned int key = qp_num%QP_HASHTAB_LEN; + unsigned long spl_flags = 0; + spin_lock_irqsave(&cq->spinlock, spl_flags); + list_add(&qp->list_entries, &cq->qp_hashtab[key]); + spin_unlock_irqrestore(&cq->spinlock, spl_flags); + EDEB(7, "cq_num=%x real_qp_num=%x", cq->cq_number, qp_num); + return 0; +} + +int ehca_cq_unassign_qp(struct ehca_cq *cq, unsigned int real_qp_num) +{ + int ret = -EINVAL; + unsigned int key = real_qp_num%QP_HASHTAB_LEN; + struct list_head *iter = NULL; + struct ehca_qp *qp = NULL; + unsigned long spl_flags = 0; + spin_lock_irqsave(&cq->spinlock, spl_flags); + list_for_each(iter, &cq->qp_hashtab[key]) { + qp = list_entry(iter, struct ehca_qp, list_entries); + if (qp->ehca_qp_core.real_qp_num == real_qp_num) { + list_del(iter); + EDEB(7, "removed qp from cq .cq_num=%x real_qp_num=%x", + cq->cq_number, real_qp_num); + ret = 0; + break; + } + } + spin_unlock_irqrestore(&cq->spinlock, spl_flags); + if (ret!=0) { + EDEB_ERR(4, "qp not found cq_num=%x real_qp_num=%x", + cq->cq_number, real_qp_num); + } + return ret; +} + +struct ehca_qp* ehca_cq_get_qp(struct ehca_cq *cq, int real_qp_num) +{ + struct ehca_qp *ret = NULL; + unsigned int key = real_qp_num%QP_HASHTAB_LEN; + struct list_head *iter = NULL; + struct ehca_qp *qp = NULL; + list_for_each(iter, &cq->qp_hashtab[key]) { + qp = list_entry(iter, struct ehca_qp, list_entries); + if (qp->ehca_qp_core.real_qp_num == real_qp_num) { + ret = qp; + break; + } + } + return ret; +} + +struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, + struct ib_ucontext *context, + 
struct ib_udata *udata) +{ + struct ib_cq *cq = NULL; + struct ehca_cq *my_cq = NULL; + u32 number_of_entries = cqe; + struct ehca_shca *shca = NULL; + struct ipz_adapter_handle adapter_handle; + struct ipz_eq_handle eq_handle; + struct ipz_cq_handle *cq_handle_ref = NULL; + u32 act_nr_of_entries = 0; + u32 act_pages = 0; + u32 counter = 0; + void *vpage = NULL; + u64 rpage = 0; + struct h_galpa gal; + u64 CQx_FEC = 0; + u64 hipz_rc = H_Success; + int ipz_rc = 0; + int ret = 0; + const u32 additional_cqe=20; + int i= 0; + + EHCA_CHECK_DEVICE_P(device); + EDEB_EN(7, "device=%p cqe=%x context=%p", + device, cqe, context); + /* cq's maximum depth is 4GB-64 + * but we need additional 20 as buffer for receiving errors cqes + */ + if (cqe>=0xFFFFFFFF-64-additional_cqe) { + return ERR_PTR(-EINVAL); + } + number_of_entries += additional_cqe; + + my_cq = ehca_cq_new(); + if (my_cq == NULL) { + cq = ERR_PTR(-ENOMEM); + EDEB_ERR(4, + "Out of memory for ehca_cq struct " + "device=%p", device); + goto create_cq_exit0; + } + cq = &my_cq->ib_cq; + + shca = container_of(device, struct ehca_shca, ib_device); + adapter_handle = shca->ipz_hca_handle; + eq_handle = shca->eq.ipz_eq_handle; + cq_handle_ref = &my_cq->ipz_cq_handle; + + do { + if (!idr_pre_get(&ehca_cq_idr, GFP_KERNEL)) { + cq = ERR_PTR(-ENOMEM); + EDEB_ERR(4, + "Can't reserve idr resources. " + "device=%p", device); + goto create_cq_exit1; + } + + down_write(&ehca_cq_idr_sem); + ret = idr_get_new(&ehca_cq_idr, my_cq, &my_cq->token); + up_write(&ehca_cq_idr_sem); + + } while (ret == -EAGAIN); + + if (ret) { + cq = ERR_PTR(-ENOMEM); + EDEB_ERR(4, + "Can't allocate new idr entry. 
" + "device=%p", device); + goto create_cq_exit1; + } + + hipz_rc = hipz_h_alloc_resource_cq(adapter_handle, + &my_cq->pf, + eq_handle, + my_cq->token, + number_of_entries, + cq_handle_ref, + &act_nr_of_entries, + &act_pages, + &my_cq->ehca_cq_core.galpas); + if (hipz_rc != H_Success) { + EDEB_ERR(4, + "hipz_h_alloc_resource_cq() failed " + "hipz_rc=%lx device=%p", hipz_rc, device); + cq = ERR_PTR(ehca2ib_return_code(hipz_rc)); + goto create_cq_exit2; + } + + ipz_rc = + ipz_queue_ctor(&my_cq->ehca_cq_core.ipz_queue, act_pages, + EHCA_PAGESIZE, sizeof(struct ehca_cqe), 0); + if (!ipz_rc) { + EDEB_ERR(4, + "ipz_queue_ctor() failed " + "ipz_rc=%x device=%p", ipz_rc, device); + cq = ERR_PTR(-EINVAL); + goto create_cq_exit3; + } + + for (counter = 0; counter < act_pages; counter++) { + vpage = ipz_QPageit_get_inc(&my_cq->ehca_cq_core.ipz_queue); + if (!vpage) { + EDEB_ERR(4, "ipz_QPageit_get_inc() " + "returns NULL device=%p", device); + cq = ERR_PTR(-EAGAIN); + goto create_cq_exit4; + } + rpage = ehca_kv_to_g(vpage); + + hipz_rc = hipz_h_register_rpage_cq(adapter_handle, + my_cq->ipz_cq_handle, + &my_cq->pf, + 0, + HIPZ_CQ_REGISTER_ORIG, + rpage, + 1, + my_cq->ehca_cq_core.galpas. 
+ kernel); + + if (hipz_rc < H_Success) { + EDEB_ERR(4, "hipz_h_register_rpage_cq() failed " + "ehca_cq=%p cq_num=%x hipz_rc=%lx " + "counter=%i act_pages=%i", + my_cq, my_cq->cq_number, + hipz_rc, counter, act_pages); + cq = ERR_PTR(-EINVAL); + goto create_cq_exit4; + } + + if (counter == (act_pages - 1)) { + vpage = ipz_QPageit_get_inc( + &my_cq->ehca_cq_core.ipz_queue); + if ((hipz_rc != H_Success) || (vpage != 0)) { + EDEB_ERR(4, "Registration of pages not " + "complete ehca_cq=%p cq_num=%x " + "hipz_rc=%lx", + my_cq, my_cq->cq_number, hipz_rc); + cq = ERR_PTR(-EAGAIN); + goto create_cq_exit4; + } + } else { + if (hipz_rc != H_PAGE_REGISTERED) { + EDEB_ERR(4, "Registration of page failed " + "ehca_cq=%p cq_num=%x hipz_rc=%lx " + "counter=%i act_pages=%i", + my_cq, my_cq->cq_number, + hipz_rc, counter, act_pages); + cq = ERR_PTR(-ENOMEM); + goto create_cq_exit4; + } + } + } + + ipz_QEit_reset(&my_cq->ehca_cq_core.ipz_queue); + + gal = my_cq->ehca_cq_core.galpas.kernel; + CQx_FEC = hipz_galpa_load(gal, CQTEMM_OFFSET(CQx_FEC)); + EDEB(8, "ehca_cq=%p cq_num=%x CQx_FEC=%lx", + my_cq, my_cq->cq_number, CQx_FEC); + + my_cq->ib_cq.cqe = my_cq->nr_of_entries = + act_nr_of_entries-additional_cqe; + my_cq->cq_number = (my_cq->ipz_cq_handle.handle) & 0xffff; + + for (i=0; i<QP_HASHTAB_LEN; i++) { + INIT_LIST_HEAD(&my_cq->qp_hashtab[i]); + } + + if (context) { + struct ehca_create_cq_resp resp; + struct vm_area_struct * vma; + resp.cq_number = my_cq->cq_number; + resp.token = my_cq->token; + resp.ehca_cq_core = my_cq->ehca_cq_core; + + ehca_mmap_nopage(((u64) (my_cq->token) << 32) | 0x12000000, + my_cq->ehca_cq_core.ipz_queue.queue_length, + ((void**)&resp.ehca_cq_core.ipz_queue.queue), + &vma); + my_cq->uspace_queue = (u64)resp.ehca_cq_core.ipz_queue.queue; + ehca_mmap_register(my_cq->ehca_cq_core.galpas.user.fw_handle, + ((void**)&resp.ehca_cq_core.galpas.kernel.fw_handle), + &vma); + my_cq->uspace_fwh = (u64)resp.ehca_cq_core.galpas.kernel.fw_handle; + if (ib_copy_to_udata(udata, &resp, sizeof(resp))) { + EDEB_ERR(4, 
"Copy to udata failed."); + goto create_cq_exit4; + } + } + + EDEB_EX(7,"retcode=%p ehca_cq=%p cq_num=%x cq_size=%x", + cq, my_cq, my_cq->cq_number, act_nr_of_entries); + return cq; + + create_cq_exit4: + ipz_queue_dtor(&my_cq->ehca_cq_core.ipz_queue); + + create_cq_exit3: + hipz_rc = hipz_h_destroy_cq(adapter_handle, my_cq, 1); + EDEB(3, "hipz_h_destroy_cq() failed ehca_cq=%p cq_num=%x hipz_rc=%lx", + my_cq, my_cq->cq_number, hipz_rc); + + create_cq_exit2: + /* dereg idr */ + down_write(&ehca_cq_idr_sem); + idr_remove(&ehca_cq_idr, my_cq->token); + up_write(&ehca_cq_idr_sem); + + create_cq_exit1: + /* free cq struct */ + ehca_cq_delete(my_cq); + + create_cq_exit0: + EDEB_EX(7, "An error has occured retcode=%p ", cq); + return cq; +} + +int ehca_destroy_cq(struct ib_cq *cq) +{ + u64 hipz_rc = H_Success; + int retcode = 0; + struct ehca_cq *my_cq = NULL; + int cq_num = 0; + struct ib_device *device = NULL; + struct ehca_shca *shca = NULL; + struct ipz_adapter_handle adapter_handle; + + EHCA_CHECK_CQ(cq); + my_cq = container_of(cq, struct ehca_cq, ib_cq); + cq_num = my_cq->cq_number; + device = cq->device; + EHCA_CHECK_DEVICE(device); + shca = container_of(device, struct ehca_shca, ib_device); + adapter_handle = shca->ipz_hca_handle; + EDEB_EN(7, "ehca_cq=%p cq_num=%x", + my_cq, my_cq->cq_number); + + down_write(&ehca_cq_idr_sem); + idr_remove(&ehca_cq_idr, my_cq->token); + up_write(&ehca_cq_idr_sem); + + /* un-mmap if vma alloc */ + if (my_cq->uspace_queue!=0) { + struct ehca_cq_core *cq_core = &my_cq->ehca_cq_core; + retcode = ehca_munmap(my_cq->uspace_queue, + cq_core->ipz_queue.queue_length); + retcode = ehca_munmap(my_cq->uspace_fwh, 4096); + } + + hipz_rc = hipz_h_destroy_cq(adapter_handle, my_cq, 0); + if (hipz_rc == H_R_STATE) { + /* cq in err: read err data and destroy it forcibly */ + EDEB(4, "ehca_cq=%p cq_num=%x ressource=%lx in err state. 
" + "Try to delete it forcibly.", + my_cq, my_cq->cq_number, my_cq->ipz_cq_handle.handle); + ehca_error_data(shca, my_cq->ipz_cq_handle.handle); + hipz_rc = hipz_h_destroy_cq(adapter_handle, my_cq, 1); + if (hipz_rc == H_Success) { + EDEB(4, "ehca_cq=%p cq_num=%x deleted successfully.", + my_cq, my_cq->cq_number); + } + } + if (hipz_rc != H_Success) { + EDEB_ERR(4,"hipz_h_destroy_cq() failed " + "hipz_rc=%lx ehca_cq=%p cq_num=%x", + hipz_rc, my_cq, my_cq->cq_number); + retcode = ehca2ib_return_code(hipz_rc); + goto destroy_cq_exit0;/*@TODO*/ + } + ipz_queue_dtor(&my_cq->ehca_cq_core.ipz_queue); + ehca_cq_delete(my_cq); + + destroy_cq_exit0: + EDEB_EX(7, "ehca_cq=%p cq_num=%x retcode=%x ", + my_cq, cq_num, retcode); + return retcode; +} + +int ehca_resize_cq(struct ib_cq *cq, int cqe) +{ + int retcode = 0; + struct ehca_cq *my_cq = NULL; + + if (unlikely(NULL == cq)) { + EDEB_ERR(4, "cq is NULL"); + return -EFAULT; + } + + my_cq = container_of(cq, struct ehca_cq, ib_cq); + EDEB_EN(7, "ehca_cq=%p cq_num=%x", + my_cq, my_cq->cq_number); + /*TODO proper resize still needs to be done*/ + if (cqe > cq->cqe) { + retcode = -EINVAL; + } + EDEB_EX(7, "ehca_cq=%p cq_num=%x", + my_cq, my_cq->cq_number); + return retcode; +} + +/* eof ehca_cq.c */ From rolandd at cisco.com Sat Feb 18 11:57:57 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:57 -0800 Subject: [PATCH 20/22] ehca userspace verbs In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005757.13620.13628.stgit@localhost.localdomain> From: Roland Dreier --- drivers/infiniband/hw/ehca/ehca_uverbs.c | 376 ++++++++++++++++++++++++++++++ 1 files changed, 376 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_uverbs.c b/drivers/infiniband/hw/ehca/ehca_uverbs.c new file mode 100644 index 0000000..f813e9c --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_uverbs.c @@ 
-0,0 +1,376 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * userspace support verbs + * + * Authors: Heiko J Schick + * Christoph Raisch + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ * + * $Id: ehca_uverbs.c,v 1.29 2006/02/06 10:17:34 schickhj Exp $ + */ + +#undef DEB_PREFIX +#define DEB_PREFIX "uver" + +#include "ehca_kernel.h" +#include "ehca_tools.h" +#include "ehca_classes.h" +#include "ehca_iverbs.h" +#include "ehca_eq.h" +#include "ehca_mrmw.h" + +#include "hcp_sense.h" /* TODO: later via hipz_* header file */ +#include "hcp_if.h" /* TODO: later via hipz_* header file */ + +struct ib_ucontext *ehca_alloc_ucontext(struct ib_device *device, + struct ib_udata *udata) +{ + struct ehca_ucontext *my_context = NULL; + EHCA_CHECK_ADR_P(device); + EDEB_EN(7, "device=%p name=%s", device, device->name); + my_context = kmalloc(sizeof *my_context, GFP_KERNEL); + if (NULL == my_context) { + EDEB_ERR(4, "Out of memory device=%p", device); + return ERR_PTR(-ENOMEM); + } + memset(my_context, 0, sizeof(*my_context)); + EDEB_EX(7, "device=%p ucontext=%p", device, my_context); + return &my_context->ib_ucontext; +} + +int ehca_dealloc_ucontext(struct ib_ucontext *context) +{ + struct ehca_ucontext *my_context = NULL; + EHCA_CHECK_ADR(context); + EDEB_EN(7, "ucontext=%p", context); + my_context = container_of(context, struct ehca_ucontext, ib_ucontext); + kfree(my_context); + EDEB_EN(7, "ucontext=%p", context); + return 0; +} + +struct page *ehca_nopage(struct vm_area_struct *vma, + unsigned long address, int *type) +{ + struct page *mypage = 0; + u64 fileoffset = vma->vm_pgoff << PAGE_SHIFT; + u32 idr_handle = fileoffset >> 32; + u32 q_type = (fileoffset >> 28) & 0xF; /* CQ, QP,... 
*/ + u32 rsrc_type = (fileoffset >> 24) & 0xF; /* sq,rq,cmnd_window */ + + EDEB_EN(7, + "vm_start=%lx vm_end=%lx vm_page_prot=%lx vm_fileoff=%lx", + vma->vm_start, vma->vm_end, vma->vm_page_prot, fileoffset); + + + if (q_type == 1) { /* CQ */ + struct ehca_cq *cq; + + down_read(&ehca_cq_idr_sem); + cq = idr_find(&ehca_cq_idr, idr_handle); + up_read(&ehca_cq_idr_sem); + + /* make sure this mmap really belongs to the authorized user */ + if (cq == 0) { + EDEB_ERR(4, "cq is NULL ret=NOPAGE_SIGBUS"); + return NOPAGE_SIGBUS; + } + if (rsrc_type == 2) { + void *vaddr; + EDEB(6, "cq=%p cq queuearea", cq); + vaddr = address - vma->vm_start + + cq->ehca_cq_core.ipz_queue.queue; + EDEB(6, "queue=%p vaddr=%p", + cq->ehca_cq_core.ipz_queue.queue, vaddr); + mypage = vmalloc_to_page(vaddr); + } + } else if (q_type == 2) { /* QP */ + struct ehca_qp *qp; + + down_read(&ehca_qp_idr_sem); + qp = idr_find(&ehca_qp_idr, idr_handle); + up_read(&ehca_qp_idr_sem); + + /* make sure this mmap really belongs to the authorized user */ + if (qp == NULL) { + EDEB_ERR(4, "qp is NULL ret=NOPAGE_SIGBUS"); + return NOPAGE_SIGBUS; + } + if (rsrc_type == 2) { /* rqueue */ + void *vaddr; + EDEB(6, "qp=%p qp rqueuearea", qp); + vaddr = address - vma->vm_start + + qp->ehca_qp_core.ipz_rqueue.queue; + EDEB(6, "rqueue=%p vaddr=%p", + qp->ehca_qp_core.ipz_rqueue.queue, vaddr); + mypage = vmalloc_to_page(vaddr); + } else if (rsrc_type == 3) { /* squeue */ + void *vaddr; + EDEB(6, "qp=%p qp squeuearea", qp); + vaddr = address - vma->vm_start + + qp->ehca_qp_core.ipz_squeue.queue; + EDEB(6, "squeue=%p vaddr=%p", + qp->ehca_qp_core.ipz_squeue.queue, vaddr); + mypage = vmalloc_to_page(vaddr); + } + } + if (mypage == 0) { + EDEB_ERR(4, "Invalid page adr==NULL ret=NOPAGE_SIGBUS"); + return NOPAGE_SIGBUS; + } + get_page(mypage); + EDEB_EX(7, "page adr=%p", mypage); + return mypage; +} + +static struct vm_operations_struct ehcau_vm_ops = { + .nopage = ehca_nopage, +}; + +/* TODO: better error output messages !!! 
+ NO RETURN WITHOUT ERROR + */ +int ehca_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) +{ + u64 fileoffset = vma->vm_pgoff << PAGE_SHIFT; + + + u32 idr_handle = fileoffset >> 32; + u32 q_type = (fileoffset >> 28) & 0xF; /* CQ, QP,... */ + u32 rsrc_type = (fileoffset >> 24) & 0xF; /* sq,rq,cmnd_window */ + u32 ret = -EFAULT; /* assume the worst */ + u64 vsize = 0; /* must be calculated/set below */ + u64 physical = 0; /* must be calculated/set below */ + + EDEB_EN(7, "vm_start=%lx vm_end=%lx vm_page_prot=%lx vm_fileoff=%lx", + vma->vm_start, vma->vm_end, vma->vm_page_prot, fileoffset); + + if (q_type == 1) { /* CQ */ + struct ehca_cq *cq; + + down_read(&ehca_cq_idr_sem); + cq = idr_find(&ehca_cq_idr, idr_handle); + up_read(&ehca_cq_idr_sem); + + /* make sure this mmap really belongs to the authorized user */ + if (cq == 0) + return -EINVAL; + if (cq->ib_cq.uobject == 0) + return -EINVAL; + if (cq->ib_cq.uobject->context != context) + return -EINVAL; + if (rsrc_type == 1) { /* galpa fw handle */ + EDEB(6, "cq=%p cq triggerarea", cq); + vma->vm_flags |= VM_RESERVED; + vsize = vma->vm_end - vma->vm_start; + if (vsize != 4096) { + EDEB_ERR(4, "invalid vsize=%lx", + vma->vm_end - vma->vm_start); + ret = -EINVAL; + goto mmap_exit0; + } + + physical = cq->ehca_cq_core.galpas.user.fw_handle; + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + vma->vm_flags |= VM_IO | VM_RESERVED; + + EDEB(6, "vsize=%lx physical=%lx", vsize, + physical); + ret = + remap_pfn_range(vma, vma->vm_start, + physical >> PAGE_SHIFT, vsize, + vma->vm_page_prot); + if (ret != 0) { + EDEB_ERR(4, + "Error: remap_pfn_range() returned %x!", + ret); + ret = -ENOMEM; + } + goto mmap_exit0; + } else if (rsrc_type == 2) { /* cq queue_addr */ + EDEB(6, "cq=%p cq q_addr", cq); + /* vma->vm_page_prot = + * pgprot_noncached(vma->vm_page_prot); */ + vma->vm_flags |= VM_RESERVED; + vma->vm_ops = &ehcau_vm_ops; + ret = 0; + goto mmap_exit0; + } else { + EDEB_ERR(6, "bad resource type %x", 
rsrc_type); + ret = -EINVAL; + goto mmap_exit0; + } + } else if (q_type == 2) { /* QP */ + struct ehca_qp *qp; + + down_read(&ehca_qp_idr_sem); + qp = idr_find(&ehca_qp_idr, idr_handle); + up_read(&ehca_qp_idr_sem); + + /* make sure this mmap really belongs to the authorized user */ + if (qp == NULL || qp->ib_qp.uobject == NULL || + qp->ib_qp.uobject->context != context) { + EDEB(6, "qp=%p, uobject=%p, context=%p", + qp, qp->ib_qp.uobject, qp->ib_qp.uobject->context); + ret = -EINVAL; + goto mmap_exit0; + } + if (rsrc_type == 1) { /* galpa fw handle */ + EDEB(6, "qp=%p qp triggerarea", qp); + vma->vm_flags |= VM_RESERVED; + vsize = vma->vm_end - vma->vm_start; + if (vsize != 4096) { + EDEB_ERR(4, "invalid vsize=%lx", + vma->vm_end - vma->vm_start); + ret = -EINVAL; + goto mmap_exit0; + } + + physical = qp->ehca_qp_core.galpas.user.fw_handle; + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + vma->vm_flags |= VM_IO | VM_RESERVED; + + EDEB(6, "vsize=%lx physical=%lx", vsize, + physical); + ret = + remap_pfn_range(vma, vma->vm_start, + physical >> PAGE_SHIFT, vsize, + vma->vm_page_prot); + if (ret != 0) { + EDEB_ERR(4, + "Error: remap_pfn_range() returned %x!", + ret); + ret = -ENOMEM; + } + goto mmap_exit0; + } else if (rsrc_type == 2) { /* qp rqueue_addr */ + EDEB(6, "qp=%p qp rqueue_addr", qp); + vma->vm_flags |= VM_RESERVED; + vma->vm_ops = &ehcau_vm_ops; + ret = 0; + goto mmap_exit0; + } else if (rsrc_type == 3) { /* qp squeue_addr */ + EDEB(6, "qp=%p qp squeue_addr", qp); + vma->vm_flags |= VM_RESERVED; + vma->vm_ops = &ehcau_vm_ops; + ret = 0; + goto mmap_exit0; + } else { + EDEB_ERR(4, "bad resource type %x", + rsrc_type); + ret = -EINVAL; + goto mmap_exit0; + } + } else { + EDEB_ERR(4, "bad queue type %x", q_type); + ret = -EINVAL; + goto mmap_exit0; + } + + mmap_exit0: + EDEB_EX(7, "ret=%x", ret); + return ret; +} + +int ehca_mmap_nopage(u64 foffset,u64 length,void ** mapped,struct vm_area_struct ** vma) +{ + down_write(&current->mm->mmap_sem); +
*mapped=(void*) + do_mmap(NULL,0, + length, + PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, + foffset); + up_write(&current->mm->mmap_sem); + if (*mapped) { + *vma = find_vma(current->mm,(u64)*mapped); + if (*vma) { + (*vma)->vm_flags |= VM_RESERVED; + (*vma)->vm_ops = &ehcau_vm_ops; + } else { + EDEB_ERR(4,"couldn't find queue vma queue=%p", + *mapped); + } + } else { + EDEB_ERR(4,"couldn't create mmap length=%lx",length); + } + EDEB(7,"mapped=%p",*mapped); + return 0; +} + +int ehca_mmap_register(u64 physical,void ** mapped,struct vm_area_struct ** vma) +{ + int ret; + unsigned long vsize; + ehca_mmap_nopage(0,4096,mapped,vma); + (*vma)->vm_flags |= VM_RESERVED; + vsize = (*vma)->vm_end - (*vma)->vm_start; + if (vsize != 4096) { + EDEB_ERR(4, "invalid vsize=%lx", + (*vma)->vm_end - (*vma)->vm_start); + ret = -EINVAL; + return ret; + } + + (*vma)->vm_page_prot = pgprot_noncached((*vma)->vm_page_prot); + (*vma)->vm_flags |= VM_IO | VM_RESERVED; + + EDEB(6, "vsize=%lx physical=%lx", vsize, + physical); + ret = + remap_pfn_range((*vma), (*vma)->vm_start, + physical >> PAGE_SHIFT, vsize, + (*vma)->vm_page_prot); + if (ret != 0) { + EDEB_ERR(4, + "Error: remap_pfn_range() returned %x!", + ret); + ret = -ENOMEM; + } + return ret; + +} + +int ehca_munmap(unsigned long addr, size_t len) { + int ret=0; + struct mm_struct *mm = current->mm; + if (mm!=0) { + down_write(&mm->mmap_sem); + ret = do_munmap(mm, addr, len); + up_write(&mm->mmap_sem); + } + return ret; +} + +/* eof ehca_uverbs.c */ From rolandd at cisco.com Sat Feb 18 11:57:54 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:57:54 -0800 Subject: [PATCH 19/22] ehca memory regions In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005754.13620.41418.stgit@localhost.localdomain> From: Roland Dreier Nearly all the inline functions in ehca_mrmw.h look too big to be inlined.
Why can't they just be static functions in ehca_mrmw.c? --- drivers/infiniband/hw/ehca/ehca_mrmw.c | 1711 ++++++++++++++++++++++++++++++++ drivers/infiniband/hw/ehca/ehca_mrmw.h | 739 ++++++++++++++ 2 files changed, 2450 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c new file mode 100644 index 0000000..d756082 --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c @@ -0,0 +1,1711 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * MR/MW functions + * + * Authors: Dietmar Decker + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_mrmw.c,v 1.86 2006/02/07 07:51:13 decker Exp $ + */ + +#undef DEB_PREFIX +#define DEB_PREFIX "mrmw" + +#include "ehca_kernel.h" +#include "ehca_iverbs.h" +#include "hcp_if.h" +#include "ehca_mrmw.h" + +extern int ehca_use_hp_mr; + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +struct ib_mr *ehca_get_dma_mr(struct ib_pd *pd, int mr_access_flags) +{ + struct ib_mr *ib_mr; + int retcode = 0; + struct ehca_mr *e_maxmr = 0; + struct ehca_pd *e_pd; + struct ehca_shca *shca; + + EDEB_EN(7, "pd=%p mr_access_flags=%x", pd, mr_access_flags); + + EHCA_CHECK_PD_P(pd); + e_pd = container_of(pd, struct ehca_pd, ib_pd); + shca = container_of(pd->device, struct ehca_shca, ib_device); + + if (shca->maxmr) { + e_maxmr = ehca_mr_new(); + if (!e_maxmr) { + EDEB_ERR(4, "out of memory"); + ib_mr = ERR_PTR(-ENOMEM); + goto get_dma_mr_exit0; + } + + retcode = ehca_reg_maxmr(shca, e_maxmr, + (u64 *)KERNELBASE, + mr_access_flags, e_pd, + &e_maxmr->ib.ib_mr.lkey, + &e_maxmr->ib.ib_mr.rkey); + if (retcode != 0) { + ib_mr = ERR_PTR(retcode); + goto get_dma_mr_exit0; + } + ib_mr = &e_maxmr->ib.ib_mr; + } else { + EDEB_ERR(4, "no internal max-MR exist!"); + ib_mr = ERR_PTR(-EINVAL); + goto get_dma_mr_exit0; + } + + get_dma_mr_exit0: + if (IS_ERR(ib_mr) == 0) + EDEB_EX(7, "ib_mr=%p lkey=%x rkey=%x", + ib_mr, ib_mr->lkey, ib_mr->rkey); + else 
+ EDEB_EX(4, "rc=%lx pd=%p mr_access_flags=%x ", + PTR_ERR(ib_mr), pd, mr_access_flags); + return (ib_mr); +} /* end ehca_get_dma_mr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +struct ib_mr *ehca_reg_phys_mr(struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start) +{ + struct ib_mr *ib_mr = 0; + int retcode = 0; + struct ehca_mr *e_mr = 0; + struct ehca_shca *shca = 0; + struct ehca_pd *e_pd = 0; + u64 size = 0; + struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0}; + u32 num_pages_mr = 0; + + EDEB_EN(7, "pd=%p phys_buf_array=%p num_phys_buf=%x " + "mr_access_flags=%x iova_start=%p", pd, phys_buf_array, + num_phys_buf, mr_access_flags, iova_start); + + EHCA_CHECK_PD_P(pd); + if ((num_phys_buf <= 0) || ehca_adr_bad(phys_buf_array)) { + EDEB_ERR(4, "bad input values: num_phys_buf=%x " + "phys_buf_array=%p", num_phys_buf, phys_buf_array); + ib_mr = ERR_PTR(-EINVAL); + goto reg_phys_mr_exit0; + } + if (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) || + ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE))) { + /* Remote Write Access requires Local Write Access */ + /* Remote Atomic Access requires Local Write Access */ + EDEB_ERR(4, "bad input values: mr_access_flags=%x", + mr_access_flags); + ib_mr = ERR_PTR(-EINVAL); + goto reg_phys_mr_exit0; + } + + /* check physical buffer list and calculate size */ + retcode = ehca_mr_chk_buf_and_calc_size(phys_buf_array, num_phys_buf, + iova_start, &size); + if (retcode != 0) { + ib_mr = ERR_PTR(retcode); + goto reg_phys_mr_exit0; + } + if ((size == 0) || + ((0xFFFFFFFFFFFFFFFF - size) < (u64)iova_start)) { + EDEB_ERR(4, "bad input values: size=%lx iova_start=%p", + size, iova_start); + ib_mr = ERR_PTR(-EINVAL); + goto reg_phys_mr_exit0; + } + + e_pd = 
container_of(pd, struct ehca_pd, ib_pd); + shca = container_of(pd->device, struct ehca_shca, ib_device); + + e_mr = ehca_mr_new(); + if (!e_mr) { + EDEB_ERR(4, "out of memory"); + ib_mr = ERR_PTR(-ENOMEM); + goto reg_phys_mr_exit0; + } + + /* determine number of MR pages */ + /* pagesize currently hardcoded to 4k ... TODO.. */ + num_pages_mr = + ((((u64)iova_start % PAGE_SIZE) + size + + PAGE_SIZE - 1) / PAGE_SIZE); + + /* register MR on HCA */ + if (ehca_mr_is_maxmr(size, iova_start)) { + e_mr->flags |= EHCA_MR_FLAG_MAXMR; + retcode = ehca_reg_maxmr(shca, e_mr, iova_start, + mr_access_flags, e_pd, + &e_mr->ib.ib_mr.lkey, + &e_mr->ib.ib_mr.rkey); + if (retcode != 0) { + ib_mr = ERR_PTR(retcode); + goto reg_phys_mr_exit1; + } + } else { + pginfo.type = EHCA_MR_PGI_PHYS; + pginfo.num_pages = num_pages_mr; + pginfo.num_phys_buf = num_phys_buf; + pginfo.phys_buf_array = phys_buf_array; + + retcode = ehca_reg_mr(shca, e_mr, iova_start, size, + mr_access_flags, e_pd, &pginfo, + &e_mr->ib.ib_mr.lkey, + &e_mr->ib.ib_mr.rkey); + if (retcode != 0) { + ib_mr = ERR_PTR(retcode); + goto reg_phys_mr_exit1; + } + } + + /* successful registration of all pages */ + ib_mr = &e_mr->ib.ib_mr; + goto reg_phys_mr_exit0; + + reg_phys_mr_exit1: + ehca_mr_delete(e_mr); + reg_phys_mr_exit0: + if (IS_ERR(ib_mr) == 0) + EDEB_EX(7, "ib_mr=%p lkey=%x rkey=%x", + ib_mr, ib_mr->lkey, ib_mr->rkey); + else + EDEB_EX(4, "rc=%lx pd=%p phys_buf_array=%p " + "num_phys_buf=%x mr_access_flags=%x iova_start=%p", + PTR_ERR(ib_mr), pd, phys_buf_array, + num_phys_buf, mr_access_flags, iova_start); + return (ib_mr); +} /* end ehca_reg_phys_mr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd, + struct ib_umem *region, + int mr_access_flags, + struct ib_udata *udata) +{ + struct ib_mr *ib_mr = 0; + struct ehca_mr *e_mr = 0; + struct ehca_shca *shca = 0; 
+ struct ehca_pd *e_pd = 0; + struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0}; + int retcode = 0; + u32 num_pages_mr = 0; + + EDEB_EN(7, "pd=%p region=%p mr_access_flags=%x udata=%p", + pd, region, mr_access_flags, udata); + + EHCA_CHECK_PD_P(pd); + if (ehca_adr_bad(region)) { + EDEB_ERR(4, "bad input values: region=%p", region); + ib_mr = ERR_PTR(-EINVAL); + goto reg_user_mr_exit0; + } + if (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) || + ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE))) { + /* Remote Write Access requires Local Write Access */ + /* Remote Atomic Access requires Local Write Access */ + EDEB_ERR(4, "bad input values: mr_access_flags=%x", + mr_access_flags); + ib_mr = ERR_PTR(-EINVAL); + goto reg_user_mr_exit0; + } + EDEB(7, "user_base=%lx virt_base=%lx length=%lx offset=%x page_size=%x " + "chunk_list.next=%p", + region->user_base, region->virt_base, region->length, + region->offset, region->page_size, region->chunk_list.next); + if (region->page_size != PAGE_SIZE) { + /* @TODO large page support */ + EDEB_ERR(4, "large pages not supported, region->page_size=%x", + region->page_size); + ib_mr = ERR_PTR(-EINVAL); + goto reg_user_mr_exit0; + } + + if ((region->length == 0) || + ((0xFFFFFFFFFFFFFFFF - region->length) < region->virt_base)) { + EDEB_ERR(4, "bad input values: length=%lx virt_base=%lx", + region->length, region->virt_base); + ib_mr = ERR_PTR(-EINVAL); + goto reg_user_mr_exit0; + } + + e_pd = container_of(pd, struct ehca_pd, ib_pd); + shca = container_of(pd->device, struct ehca_shca, ib_device); + + e_mr = ehca_mr_new(); + if (!e_mr) { + EDEB_ERR(4, "out of memory"); + ib_mr = ERR_PTR(-ENOMEM); + goto reg_user_mr_exit0; + } + + /* determine number of MR pages */ + /* pagesize currently hardcoded to 4k ...TODO... 
+ num_pages_mr = + (((region->virt_base % PAGE_SIZE) + region->length + + PAGE_SIZE - 1) / PAGE_SIZE); + + /* register MR on HCA */ + pginfo.type = EHCA_MR_PGI_USER; + pginfo.num_pages = num_pages_mr; + pginfo.region = region; + pginfo.next_chunk = list_prepare_entry(pginfo.next_chunk, + (&region->chunk_list), + list); + + retcode = ehca_reg_mr(shca, e_mr, (u64 *)region->virt_base, + region->length, mr_access_flags, e_pd, &pginfo, + &e_mr->ib.ib_mr.lkey, &e_mr->ib.ib_mr.rkey); + if (retcode != 0) { + ib_mr = ERR_PTR(retcode); + goto reg_user_mr_exit1; + } + + /* successful registration of all pages */ + ib_mr = &e_mr->ib.ib_mr; + goto reg_user_mr_exit0; + + reg_user_mr_exit1: + ehca_mr_delete(e_mr); + reg_user_mr_exit0: + if (IS_ERR(ib_mr) == 0) + EDEB_EX(7, "ib_mr=%p lkey=%x rkey=%x", + ib_mr, ib_mr->lkey, ib_mr->rkey); + else + EDEB_EX(4, "rc=%lx pd=%p region=%p mr_access_flags=%x " + "udata=%p", + PTR_ERR(ib_mr), pd, region, mr_access_flags, udata); + return (ib_mr); +} /* end ehca_reg_user_mr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_rereg_phys_mr(struct ib_mr *mr, + int mr_rereg_mask, + struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start) +{ + int retcode = 0; + struct ehca_shca *shca = 0; + struct ehca_mr *e_mr = 0; + u64 new_size = 0; + u64 *new_start = 0; + u32 new_acl = 0; + struct ehca_pd *new_pd = 0; + u32 tmp_lkey = 0; + u32 tmp_rkey = 0; + unsigned long sl_flags; + u64 num_pages_mr = 0; + struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0}; + + EDEB_EN(7, "mr=%p mr_rereg_mask=%x pd=%p phys_buf_array=%p " + "num_phys_buf=%x mr_access_flags=%x iova_start=%p", + mr, mr_rereg_mask, pd, phys_buf_array, num_phys_buf, + mr_access_flags, iova_start); + + if (!(mr_rereg_mask & IB_MR_REREG_TRANS)) { + /*@TODO not supported, because PHYP rereg hCall needs pages*/ +
/*@TODO: We will follow this with Tom ....*/ + EDEB_ERR(4, "rereg without IB_MR_REREG_TRANS not supported yet," + " mr_rereg_mask=%x", mr_rereg_mask); + retcode = -EINVAL; + goto rereg_phys_mr_exit0; + } + + EHCA_CHECK_MR(mr); + e_mr = container_of(mr, struct ehca_mr, ib.ib_mr); + if (mr_rereg_mask & IB_MR_REREG_PD) { + EHCA_CHECK_PD(pd); + } + + if ((mr_rereg_mask & + ~(IB_MR_REREG_TRANS | IB_MR_REREG_PD | IB_MR_REREG_ACCESS)) || + (mr_rereg_mask == 0)) { + retcode = -EINVAL; + goto rereg_phys_mr_exit0; + } + + shca = container_of(mr->device, struct ehca_shca, ib_device); + + /* check other parameters */ + if (e_mr == shca->maxmr) { + /* should be impossible, however reject to be sure */ + EDEB_ERR(3, "rereg internal max-MR impossible, mr=%p " + "shca->maxmr=%p mr->lkey=%x", + mr, shca->maxmr, mr->lkey); + retcode = -EINVAL; + goto rereg_phys_mr_exit0; + } + if (mr_rereg_mask & IB_MR_REREG_TRANS) { /* transl., i.e. addr/size */ + if (e_mr->flags & EHCA_MR_FLAG_FMR) { + EDEB_ERR(4, "not supported for FMR, mr=%p flags=%x", + mr, e_mr->flags); + retcode = -EINVAL; + goto rereg_phys_mr_exit0; + } + if (ehca_adr_bad(phys_buf_array) || num_phys_buf <= 0) { + EDEB_ERR(4, "bad input values: mr_rereg_mask=%x " + "phys_buf_array=%p num_phys_buf=%x", + mr_rereg_mask, phys_buf_array, num_phys_buf); + retcode = -EINVAL; + goto rereg_phys_mr_exit0; + } + } + if ((mr_rereg_mask & IB_MR_REREG_ACCESS) && /* change ACL */ + (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) || + ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)))) { + /* Remote Write Access requires Local Write Access */ + /* Remote Atomic Access requires Local Write Access */ + EDEB_ERR(4, "bad input values: mr_rereg_mask=%x " + "mr_access_flags=%x", mr_rereg_mask, mr_access_flags); + retcode = -EINVAL; + goto rereg_phys_mr_exit0; + } + + /* set requested values dependent on rereg request */ + spin_lock_irqsave(&e_mr->mrlock, 
sl_flags); /* get lock @TODO for MR*/ + new_start = e_mr->start; /* new == old address */ + new_size = e_mr->size; /* new == old length */ + new_acl = e_mr->acl; /* new == old access control */ + new_pd = container_of(mr->pd,struct ehca_pd,ib_pd); /*new == old PD*/ + + if (mr_rereg_mask & IB_MR_REREG_TRANS) { + new_start = iova_start; /* change address */ + /* check physical buffer list and calculate size */ + retcode = ehca_mr_chk_buf_and_calc_size(phys_buf_array, + num_phys_buf, + iova_start, &new_size); + if (retcode != 0) + goto rereg_phys_mr_exit1; + if ((new_size == 0) || + ((0xFFFFFFFFFFFFFFFF - new_size) < (u64)iova_start)) { + EDEB_ERR(4, "bad input values: new_size=%lx " + "iova_start=%p", new_size, iova_start); + retcode = -EINVAL; + goto rereg_phys_mr_exit1; + } + num_pages_mr = ((((u64)new_start % PAGE_SIZE) + + new_size + PAGE_SIZE - 1) / PAGE_SIZE); + pginfo.type = EHCA_MR_PGI_PHYS; + pginfo.num_pages = num_pages_mr; + pginfo.num_phys_buf = num_phys_buf; + pginfo.phys_buf_array = phys_buf_array; + } + if (mr_rereg_mask & IB_MR_REREG_ACCESS) + new_acl = mr_access_flags; + if (mr_rereg_mask & IB_MR_REREG_PD) + new_pd = container_of(pd, struct ehca_pd, ib_pd); + + EDEB(7, "mr=%p new_start=%p new_size=%lx new_acl=%x new_pd=%p " + "num_pages_mr=%lx", + e_mr, new_start, new_size, new_acl, new_pd, num_pages_mr); + + retcode = ehca_rereg_mr(shca, e_mr, new_start, new_size, new_acl, + new_pd, &pginfo, &tmp_lkey, &tmp_rkey); + if (retcode != 0) + goto rereg_phys_mr_exit1; + + /* successful reregistration */ + if (mr_rereg_mask & IB_MR_REREG_PD) + mr->pd = pd; + mr->lkey = tmp_lkey; + mr->rkey = tmp_rkey; + + rereg_phys_mr_exit1: + spin_unlock_irqrestore(&e_mr->mrlock, sl_flags); /* free spin lock */ + rereg_phys_mr_exit0: + if (retcode == 0) + EDEB_EX(7, "mr=%p mr_rereg_mask=%x pd=%p phys_buf_array=%p " + "num_phys_buf=%x mr_access_flags=%x iova_start=%p", + mr, mr_rereg_mask, pd, phys_buf_array, num_phys_buf, + mr_access_flags, iova_start); + else + 
EDEB_EX(4, "retcode=%x mr=%p mr_rereg_mask=%x pd=%p " + "phys_buf_array=%p num_phys_buf=%x mr_access_flags=%x " + "iova_start=%p", + retcode, mr, mr_rereg_mask, pd, phys_buf_array, + num_phys_buf, mr_access_flags, iova_start); + + return (retcode); +} /* end ehca_rereg_phys_mr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_query_mr(struct ib_mr *mr, struct ib_mr_attr *mr_attr) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_shca *shca = 0; + struct ehca_mr *e_mr = 0; + struct ipz_pd fwpd; /* Firmware PD */ + u32 access_ctrl = 0; + u64 tmp_remote_size = 0; + u64 tmp_remote_len = 0; + + unsigned long sl_flags; + + EDEB_EN(7, "mr=%p mr_attr=%p", mr, mr_attr); + + EHCA_CHECK_MR(mr); + e_mr = container_of(mr, struct ehca_mr, ib.ib_mr); + if (ehca_adr_bad(mr_attr)) { + EDEB_ERR(4, "bad input values: mr_attr=%p", mr_attr); + retcode = -EINVAL; + goto query_mr_exit0; + } + if ((e_mr->flags & EHCA_MR_FLAG_FMR)) { + EDEB_ERR(4, "not supported for FMR, mr=%p e_mr=%p " + "e_mr->flags=%x", mr, e_mr, e_mr->flags); + retcode = -EINVAL; + goto query_mr_exit0; + } + + shca = container_of(mr->device, struct ehca_shca, ib_device); + memset(mr_attr, 0, sizeof(struct ib_mr_attr)); + spin_lock_irqsave(&e_mr->mrlock, sl_flags); /* get spin lock @TODO?? 
*/ + + rc = hipz_h_query_mr(shca->ipz_hca_handle, &e_mr->pf, + &e_mr->ipz_mr_handle, &mr_attr->size, + &mr_attr->device_virt_addr, &tmp_remote_size, + &tmp_remote_len, &access_ctrl, &fwpd, + &mr_attr->lkey, &mr_attr->rkey); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_mr_query failed, rc=%lx mr=%p " + "hca_hndl=%lx mr_hndl=%lx lkey=%x", + rc, mr, shca->ipz_hca_handle.handle, + e_mr->ipz_mr_handle.handle, mr->lkey); + retcode = ehca_mrmw_map_rc_query_mr(rc); + goto query_mr_exit1; + } + ehca_mrmw_reverse_map_acl(&access_ctrl, &mr_attr->mr_access_flags); + mr_attr->pd = mr->pd; + + query_mr_exit1: + spin_unlock_irqrestore(&e_mr->mrlock, sl_flags); /* free spin lock */ + query_mr_exit0: + if (retcode == 0) + EDEB_EX(7, "pd=%p device_virt_addr=%lx size=%lx " + "mr_access_flags=%x lkey=%x rkey=%x", + mr_attr->pd, mr_attr->device_virt_addr, + mr_attr->size, mr_attr->mr_access_flags, + mr_attr->lkey, mr_attr->rkey); + else + EDEB_EX(4, "retcode=%x mr=%p mr_attr=%p", retcode, mr, mr_attr); + return (retcode); +} /* end ehca_query_mr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_dereg_mr(struct ib_mr *mr) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_shca *shca = 0; + struct ehca_mr *e_mr = 0; + + EDEB_EN(7, "mr=%p", mr); + + EHCA_CHECK_MR(mr); + e_mr = container_of(mr, struct ehca_mr, ib.ib_mr); + shca = container_of(mr->device, struct ehca_shca, ib_device); + + if ((e_mr->flags & EHCA_MR_FLAG_FMR)) { + EDEB_ERR(4, "not supported for FMR, mr=%p e_mr=%p " + "e_mr->flags=%x", mr, e_mr, e_mr->flags); + retcode = -EINVAL; + goto dereg_mr_exit0; + } else if (e_mr == shca->maxmr) { + /* should be impossible, however reject to be sure */ + EDEB_ERR(3, "dereg internal max-MR impossible, mr=%p " + "shca->maxmr=%p mr->lkey=%x", + mr, shca->maxmr, mr->lkey); + retcode = -EINVAL; + goto dereg_mr_exit0; + } + + /*@TODO: BUSY: MR still has bound 
window(s) */ + rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, &e_mr->pf, + &e_mr->ipz_mr_handle); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_free_mr failed, rc=%lx shca=%p e_mr=%p" + " hca_hndl=%lx mr_hndl=%lx mr->lkey=%x", + rc, shca, e_mr, shca->ipz_hca_handle.handle, + e_mr->ipz_mr_handle.handle, mr->lkey); + retcode = ehca_mrmw_map_rc_free_mr(rc); + goto dereg_mr_exit0; + } + + /* successful deregistration */ + ehca_mr_delete(e_mr); + + dereg_mr_exit0: + if (retcode == 0) + EDEB_EX(7, ""); + else + EDEB_EX(4, "retcode=%x mr=%p", retcode, mr); + return (retcode); +} /* end ehca_dereg_mr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +struct ib_mw *ehca_alloc_mw(struct ib_pd *pd) +{ + struct ib_mw *ib_mw = 0; + u64 rc = H_Success; + struct ehca_shca *shca = 0; + struct ehca_mw *e_mw = 0; + struct ehca_pd *e_pd = 0; + + EDEB_EN(7, "pd=%p", pd); + + EHCA_CHECK_PD_P(pd); + e_pd = container_of(pd, struct ehca_pd, ib_pd); + shca = container_of(pd->device, struct ehca_shca, ib_device); + + e_mw = ehca_mw_new(); + if (!e_mw) { + ib_mw = ERR_PTR(-ENOMEM); + goto alloc_mw_exit0; + } + + rc = hipz_h_alloc_resource_mw(shca->ipz_hca_handle, &e_mw->pf, + &shca->pf, e_pd->fw_pd, + &e_mw->ipz_mw_handle, &e_mw->ib_mw.rkey); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_mw_allocate failed, rc=%lx shca=%p " + "hca_hndl=%lx mw=%p", rc, shca, + shca->ipz_hca_handle.handle, e_mw); + ib_mw = ERR_PTR(ehca_mrmw_map_rc_alloc(rc)); + goto alloc_mw_exit1; + } + /* save R_Key in local copy */ + /*@TODO????? 
mw->rkey = *rkey_p; */ + + /* successful MW allocation */ + ib_mw = &e_mw->ib_mw; + goto alloc_mw_exit0; + + alloc_mw_exit1: + ehca_mw_delete(e_mw); + alloc_mw_exit0: + if (IS_ERR(ib_mw) == 0) + EDEB_EX(7, "ib_mw=%p rkey=%x", ib_mw, ib_mw->rkey); + else + EDEB_EX(4, "rc=%lx pd=%p", PTR_ERR(ib_mw), pd); + return (ib_mw); +} /* end ehca_alloc_mw() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_bind_mw(struct ib_qp *qp, + struct ib_mw *mw, + struct ib_mw_bind *mw_bind) +{ + int retcode = 0; + + /*@TODO: not supported up to now */ + EDEB_ERR(4, "bind MW currently not supported by HCAD"); + retcode = -EPERM; + goto bind_mw_exit0; + + bind_mw_exit0: + if (retcode == 0) + EDEB_EX(7, "qp=%p mw=%p mw_bind=%p", qp, mw, mw_bind); + else + EDEB_EX(4, "rc=%x qp=%p mw=%p mw_bind=%p", + retcode, qp, mw, mw_bind); + return (retcode); +} /* end ehca_bind_mw() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_dealloc_mw(struct ib_mw *mw) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_shca *shca = 0; + struct ehca_mw *e_mw = 0; + + EDEB_EN(7, "mw=%p", mw); + + EHCA_CHECK_MW(mw); + e_mw = container_of(mw, struct ehca_mw, ib_mw); + shca = container_of(mw->device, struct ehca_shca, ib_device); + + rc = hipz_h_free_resource_mw(shca->ipz_hca_handle, &e_mw->pf, + &e_mw->ipz_mw_handle); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_free_mw failed, rc=%lx shca=%p mw=%p " + "rkey=%x hca_hndl=%lx mw_hndl=%lx", + rc, shca, mw, mw->rkey, shca->ipz_hca_handle.handle, + e_mw->ipz_mw_handle.handle); + retcode = ehca_mrmw_map_rc_free_mw(rc); + goto dealloc_mw_exit0; + } + /* successful deallocation */ + ehca_mw_delete(e_mw); + + dealloc_mw_exit0: + if (retcode == 0) + EDEB_EX(7, ""); + else + EDEB_EX(4, "retcode=%x mw=%p", retcode, mw); + return 
(retcode); +} /* end ehca_dealloc_mw() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +struct ib_fmr *ehca_alloc_fmr(struct ib_pd *pd, + int mr_access_flags, + struct ib_fmr_attr *fmr_attr) +{ + struct ib_fmr *ib_fmr = 0; + struct ehca_shca *shca = 0; + struct ehca_mr *e_fmr = 0; + int retcode = 0; + struct ehca_pd *e_pd = 0; + u32 tmp_lkey = 0; + u32 tmp_rkey = 0; + struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0}; + + EDEB_EN(7, "pd=%p mr_access_flags=%x fmr_attr=%p", + pd, mr_access_flags, fmr_attr); + + EHCA_CHECK_PD_P(pd); + if (ehca_adr_bad(fmr_attr)) { + EDEB_ERR(4, "bad input values: fmr_attr=%p", fmr_attr); + ib_fmr = ERR_PTR(-EINVAL); + goto alloc_fmr_exit0; + } + + EDEB(7, "max_pages=%x max_maps=%x page_shift=%x", + fmr_attr->max_pages, fmr_attr->max_maps, fmr_attr->page_shift); + + /* check other parameters */ + if (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) || + ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) && + !(mr_access_flags & IB_ACCESS_LOCAL_WRITE))) { + /* Remote Write Access requires Local Write Access */ + /* Remote Atomic Access requires Local Write Access */ + EDEB_ERR(4, "bad input values: mr_access_flags=%x", + mr_access_flags); + ib_fmr = ERR_PTR(-EINVAL); + goto alloc_fmr_exit0; + } + if (mr_access_flags & IB_ACCESS_MW_BIND) { + EDEB_ERR(4, "bad input values: mr_access_flags=%x", + mr_access_flags); + ib_fmr = ERR_PTR(-EINVAL); + goto alloc_fmr_exit0; + } + if ((fmr_attr->max_pages == 0) || (fmr_attr->max_maps == 0)) { + EDEB_ERR(4, "bad input values: fmr_attr->max_pages=%x " + "fmr_attr->max_maps=%x fmr_attr->page_shift=%x", + fmr_attr->max_pages, fmr_attr->max_maps, + fmr_attr->page_shift); + ib_fmr = ERR_PTR(-EINVAL); + goto alloc_fmr_exit0; + } + if ((1 << fmr_attr->page_shift) != PAGE_SIZE) { + /* pagesize currently hardcoded to 4k ... 
*/ + EDEB_ERR(4, "unsupported fmr_attr->page_shift=%x", + fmr_attr->page_shift); + ib_fmr = ERR_PTR(-EINVAL); + goto alloc_fmr_exit0; + } + + e_pd = container_of(pd, struct ehca_pd, ib_pd); + shca = container_of(pd->device, struct ehca_shca, ib_device); + + e_fmr = ehca_mr_new(); + if (e_fmr == 0) { + ib_fmr = ERR_PTR(-ENOMEM); + goto alloc_fmr_exit0; + } + e_fmr->flags |= EHCA_MR_FLAG_FMR; + + /* register MR on HCA */ + retcode = ehca_reg_mr(shca, e_fmr, 0, + fmr_attr->max_pages * PAGE_SIZE, + mr_access_flags, e_pd, &pginfo, + &tmp_lkey, &tmp_rkey); + if (retcode != 0) { + ib_fmr = ERR_PTR(retcode); + goto alloc_fmr_exit1; + } + + /* successful registration of all pages */ + e_fmr->fmr_page_size = 1 << fmr_attr->page_shift; + e_fmr->fmr_max_pages = fmr_attr->max_pages; /* pagesize hardcoded 4k */ + e_fmr->fmr_max_maps = fmr_attr->max_maps; + e_fmr->fmr_map_cnt = 0; + ib_fmr = &e_fmr->ib.ib_fmr; + goto alloc_fmr_exit0; + + alloc_fmr_exit1: + ehca_mr_delete(e_fmr); + alloc_fmr_exit0: + if (IS_ERR(ib_fmr) == 0) + EDEB_EX(7, "ib_fmr=%p tmp_lkey=%x tmp_rkey=%x", + ib_fmr, tmp_lkey, tmp_rkey); + else + EDEB_EX(4, "rc=%lx pd=%p mr_access_flags=%x " + "fmr_attr=%p", PTR_ERR(ib_fmr), pd, + mr_access_flags, fmr_attr); + return (ib_fmr); +} /* end ehca_alloc_fmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_map_phys_fmr(struct ib_fmr *fmr, + u64 *page_list, + int list_len, + u64 iova) +{ + int retcode = 0; + struct ehca_shca *shca = 0; + struct ehca_mr *e_fmr = 0; + struct ehca_pd *e_pd = 0; + struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0}; + u32 tmp_lkey = 0; + u32 tmp_rkey = 0; + /*@TODO unsigned long sl_flags; */ + + EDEB_EN(7, "fmr=%p page_list=%p list_len=%x iova=%lx", + fmr, page_list, list_len, iova); + + EHCA_CHECK_FMR(fmr); + e_fmr = container_of(fmr, struct ehca_mr, ib.ib_fmr); + shca = container_of(fmr->device, struct ehca_shca, 
ib_device); + e_pd = container_of(fmr->pd, struct ehca_pd, ib_pd); + + if (!(e_fmr->flags & EHCA_MR_FLAG_FMR)) { + EDEB_ERR(4, "not a FMR, e_fmr=%p e_fmr->flags=%x", + e_fmr, e_fmr->flags); + retcode = -EINVAL; + goto map_phys_fmr_exit0; + } + retcode = ehca_fmr_check_page_list(e_fmr, page_list, list_len); + if (retcode != 0) + goto map_phys_fmr_exit0; + if (iova % PAGE_SIZE) { + /* only whole-numbered pages */ + EDEB_ERR(4, "bad iova, iova=%lx", iova); + retcode = -EINVAL; + goto map_phys_fmr_exit0; + } + if (e_fmr->fmr_map_cnt >= e_fmr->fmr_max_maps) { + /* HCAD does not limit the maps, however trace this anyway */ + EDEB(6, "map limit exceeded, fmr=%p e_fmr->fmr_map_cnt=%x " + "e_fmr->fmr_max_maps=%x", + fmr, e_fmr->fmr_map_cnt, e_fmr->fmr_max_maps); + } + + pginfo.type = EHCA_MR_PGI_FMR; + pginfo.num_pages = list_len; + pginfo.page_list = page_list; + + /* @TODO spin_lock_irqsave(&e_fmr->mrlock, sl_flags); */ + + retcode = ehca_rereg_mr(shca, e_fmr, (u64 *)iova, + list_len * PAGE_SIZE, + e_fmr->acl, e_pd, &pginfo, + &tmp_lkey, &tmp_rkey); + if (retcode != 0) { + /* @TODO spin_unlock_irqrestore(&fmr->mrlock, sl_flags); */ + goto map_phys_fmr_exit0; + } + /* successful reregistration */ + e_fmr->fmr_map_cnt++; + /* @TODO spin_unlock_irqrestore(&fmr->mrlock, sl_flags); */ + + e_fmr->ib.ib_fmr.lkey = tmp_lkey; + e_fmr->ib.ib_fmr.rkey = tmp_rkey; + + map_phys_fmr_exit0: + if (retcode == 0) + EDEB_EX(7, "lkey=%x rkey=%x", + e_fmr->ib.ib_fmr.lkey, e_fmr->ib.ib_fmr.rkey); + else + EDEB_EX(4, "retcode=%x fmr=%p page_list=%p list_len=%x " + "iova=%lx", + retcode, fmr, page_list, list_len, iova); + return (retcode); +} /* end ehca_map_phys_fmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_unmap_fmr(struct list_head *fmr_list) +{ + int retcode = 0; + struct ib_fmr *ib_fmr; + struct ehca_shca *shca = 0; + struct ehca_shca *prev_shca = 0; + struct 
ehca_mr *e_fmr = 0; + u32 num_fmr = 0; + u32 unmap_fmr_cnt = 0; + /* @TODO unsigned long sl_flags; */ + + EDEB_EN(7, "fmr_list=%p", fmr_list); + + /* check all FMR belong to same SHCA, and check internal flag */ + list_for_each_entry(ib_fmr, fmr_list, list) { + prev_shca = shca; + shca = container_of(ib_fmr->device, struct ehca_shca, + ib_device); + EHCA_CHECK_FMR(ib_fmr); + e_fmr = container_of(ib_fmr, struct ehca_mr, ib.ib_fmr); + if ((shca != prev_shca) && (prev_shca != 0)) { + EDEB_ERR(4, "SHCA mismatch, shca=%p prev_shca=%p " + "e_fmr=%p", shca, prev_shca, e_fmr); + retcode = -EINVAL; + goto unmap_fmr_exit0; + } + if (!(e_fmr->flags & EHCA_MR_FLAG_FMR)) { + EDEB_ERR(4, "not a FMR, e_fmr=%p e_fmr->flags=%x", + e_fmr, e_fmr->flags); + retcode = -EINVAL; + goto unmap_fmr_exit0; + } + num_fmr++; + } + + /* loop over all FMRs to unmap */ + list_for_each_entry(ib_fmr, fmr_list, list) { + unmap_fmr_cnt++; + e_fmr = container_of(ib_fmr, struct ehca_mr, ib.ib_fmr); + shca = container_of(ib_fmr->device, struct ehca_shca, + ib_device); + /*@TODO??? spin_lock_irqsave(&fmr->mrlock, sl_flags); */ + retcode = ehca_unmap_one_fmr(shca, e_fmr); + /*@TODO???? 
spin_unlock_irqrestore(&fmr->mrlock, sl_flags); */ + if (retcode != 0) { + /* unmap failed, stop unmapping of rest of FMRs */ + EDEB_ERR(4, "unmap of one FMR failed, stop rest, " + "e_fmr=%p num_fmr=%x unmap_fmr_cnt=%x lkey=%x", + e_fmr, num_fmr, unmap_fmr_cnt, + e_fmr->ib.ib_fmr.lkey); + goto unmap_fmr_exit0; + } + } + + unmap_fmr_exit0: + if (retcode == 0) + EDEB_EX(7, "num_fmr=%x", num_fmr); + else + EDEB_EX(4, "retcode=%x fmr_list=%p num_fmr=%x unmap_fmr_cnt=%x", + retcode, fmr_list, num_fmr, unmap_fmr_cnt); + return (retcode); +} /* end ehca_unmap_fmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_dealloc_fmr(struct ib_fmr *fmr) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_shca *shca = 0; + struct ehca_mr *e_fmr = 0; + + EDEB_EN(7, "fmr=%p", fmr); + + EHCA_CHECK_FMR(fmr); + e_fmr = container_of(fmr, struct ehca_mr, ib.ib_fmr); + shca = container_of(fmr->device, struct ehca_shca, ib_device); + + if (!(e_fmr->flags & EHCA_MR_FLAG_FMR)) { + EDEB_ERR(4, "not a FMR, e_fmr=%p e_fmr->flags=%x", + e_fmr, e_fmr->flags); + retcode = -EINVAL; + goto free_fmr_exit0; + } + + rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, &e_fmr->pf, + &e_fmr->ipz_mr_handle); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_free_mr failed, rc=%lx e_fmr=%p " + "hca_hndl=%lx fmr_hndl=%lx fmr->lkey=%x", + rc, e_fmr, shca->ipz_hca_handle.handle, + e_fmr->ipz_mr_handle.handle, fmr->lkey); + /* propagate the mapped error, as ehca_dereg_mr() does */ + retcode = ehca_mrmw_map_rc_free_mr(rc); + goto free_fmr_exit0; + } + /* successful deregistration */ + ehca_mr_delete(e_fmr); + + free_fmr_exit0: + if (retcode == 0) + EDEB_EX(7, ""); + else + EDEB_EX(4, "retcode=%x fmr=%p", retcode, fmr); + return (retcode); +} /* end ehca_dealloc_fmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_reg_mr(struct ehca_shca *shca, 
+ struct ehca_mr *e_mr, + u64 *iova_start, + u64 size, + int acl, + struct ehca_pd *e_pd, + struct ehca_mr_pginfo *pginfo, + u32 *lkey, + u32 *rkey) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_pfmr *pfmr = &e_mr->pf; + u32 hipz_acl = 0; + + EDEB_EN(7, "shca=%p e_mr=%p iova_start=%p size=%lx acl=%x e_pd=%p " + "pginfo=%p num_pages=%lx", shca, e_mr, iova_start, size, acl, + e_pd, pginfo, pginfo->num_pages); + + ehca_mrmw_map_acl(acl, &hipz_acl); + ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl); + if (ehca_use_hp_mr == 1) + hipz_acl |= 0x00000001; + + rc = hipz_h_alloc_resource_mr(shca->ipz_hca_handle, pfmr, &shca->pf, + (u64)iova_start, size, hipz_acl, + e_pd->fw_pd, &e_mr->ipz_mr_handle, + lkey, rkey); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_alloc_mr failed, rc=%lx hca_hndl=%lx " + "mr_hndl=%lx", rc, shca->ipz_hca_handle.handle, + e_mr->ipz_mr_handle.handle); + retcode = ehca_mrmw_map_rc_alloc(rc); + goto ehca_reg_mr_exit0; + } + + retcode = ehca_reg_mr_rpages(shca, e_mr, pginfo); + if (retcode != 0) + goto ehca_reg_mr_exit1; + + /* successful registration */ + e_mr->num_pages = pginfo->num_pages; + e_mr->start = iova_start; + e_mr->size = size; + e_mr->acl = acl; + goto ehca_reg_mr_exit0; + + ehca_reg_mr_exit1: + rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, pfmr, + &e_mr->ipz_mr_handle); + if (rc != H_Success) { + EDEB(1, "rc=%lx shca=%p e_mr=%p iova_start=%p " + "size=%lx acl=%x e_pd=%p lkey=%x pginfo=%p num_pages=%lx", + rc, shca, e_mr, iova_start, size, acl, + e_pd, *lkey, pginfo, pginfo->num_pages); + ehca_catastrophic("internal error in ehca_reg_mr, " + "not recoverable"); + } + ehca_reg_mr_exit0: + if (retcode == 0) + EDEB_EX(7, "retcode=%x lkey=%x rkey=%x", retcode, *lkey, *rkey); + else + EDEB_EX(4, "retcode=%x shca=%p e_mr=%p iova_start=%p " + "size=%lx acl=%x e_pd=%p pginfo=%p num_pages=%lx", + retcode, shca, e_mr, iova_start, + size, acl, e_pd, pginfo, pginfo->num_pages); + return (retcode); +} /* end ehca_reg_mr() */ + 
+/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_reg_mr_rpages(struct ehca_shca *shca, + struct ehca_mr *e_mr, + struct ehca_mr_pginfo *pginfo) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_pfmr *pfmr = &e_mr->pf; + u32 rnum = 0; + u64 rpage = 0; + u32 i; + u64 *kpage = 0; + + EDEB_EN(7, "shca=%p e_mr=%p pginfo=%p num_pages=%lx", + shca, e_mr, pginfo, pginfo->num_pages); + + kpage = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (kpage == 0) { + EDEB_ERR(4, "kpage alloc failed"); + retcode = -ENOMEM; + goto ehca_reg_mr_rpages_exit0; + } + memset(kpage, 0, PAGE_SIZE); + + /* max 512 pages per shot */ + for (i = 0; i < ((pginfo->num_pages + 512 - 1) / 512); i++) { + + if (i == ((pginfo->num_pages + 512 - 1) / 512) - 1) { + rnum = pginfo->num_pages % 512; /* last shot */ + if (rnum == 0) + rnum = 512; /* last shot is full */ + } else + rnum = 512; + + if (rnum > 1) { + retcode = ehca_set_pagebuf(e_mr, pginfo, rnum, kpage); + if (retcode) { + EDEB_ERR(4, "ehca_set_pagebuf bad rc, " + "retcode=%x rnum=%x kpage=%p", + retcode, rnum, kpage); + retcode = -EFAULT; + goto ehca_reg_mr_rpages_exit1; + } + rpage = ehca_kv_to_g(kpage); + if (rpage == 0) { + EDEB_ERR(4, "kpage=%p i=%x", kpage, i); + retcode = -EFAULT; + goto ehca_reg_mr_rpages_exit1; + } + } else { /* rnum==1 */ + retcode = ehca_set_pagebuf_1(e_mr, pginfo, &rpage); + if (retcode) { + EDEB_ERR(4, "ehca_set_pagebuf_1 bad rc, " + "retcode=%x i=%x", retcode, i); + retcode = -EFAULT; + goto ehca_reg_mr_rpages_exit1; + } + } + + EDEB(9, "i=%x rnum=%x rpage=%lx", i, rnum, rpage); + + rc = hipz_h_register_rpage_mr(shca->ipz_hca_handle, + &e_mr->ipz_mr_handle, pfmr, + &shca->pf, + 0, /* pagesize hardcoded to 4k */ + 0, rpage, rnum); + + if (i == ((pginfo->num_pages + 512 - 1) / 512) - 1) { + /* check for 'registration complete'==H_Success */ + /* and for 'page registered'==H_PAGE_REGISTERED */ + if (rc != 
H_Success) { + EDEB_ERR(4, "last hipz_reg_rpage_mr failed, " + "rc=%lx e_mr=%p i=%x hca_hndl=%lx " + "mr_hndl=%lx lkey=%x", rc, e_mr, i, + shca->ipz_hca_handle.handle, + e_mr->ipz_mr_handle.handle, + e_mr->ib.ib_mr.lkey); + retcode = ehca_mrmw_map_rc_rrpg_last(rc); + break; + } else + retcode = 0; + } else if (rc != H_PAGE_REGISTERED) { + EDEB_ERR(4, "hipz_reg_rpage_mr failed, rc=%lx e_mr=%p " + "i=%x lkey=%x hca_hndl=%lx mr_hndl=%lx", + rc, e_mr, i, e_mr->ib.ib_mr.lkey, + shca->ipz_hca_handle.handle, + e_mr->ipz_mr_handle.handle); + retcode = ehca_mrmw_map_rc_rrpg_notlast(rc); + break; + } else + retcode = 0; + } /* end for(i) */ + + + ehca_reg_mr_rpages_exit1: + kfree(kpage); + ehca_reg_mr_rpages_exit0: + if (retcode == 0) + EDEB_EX(7, "retcode=%x", retcode); + else + EDEB_EX(4, "retcode=%x shca=%p e_mr=%p pginfo=%p " + "num_pages=%lx", + retcode, shca, e_mr, pginfo, pginfo->num_pages); + return (retcode); +} /* end ehca_reg_mr_rpages() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +inline int ehca_rereg_mr_rereg1(struct ehca_shca *shca, + struct ehca_mr *e_mr, + u64 *iova_start, + u64 size, + u32 acl, + struct ehca_pd *e_pd, + struct ehca_mr_pginfo *pginfo, + u32 *lkey, + u32 *rkey) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_pfmr *pfmr = &e_mr->pf; + u64 iova_start_out = 0; + u32 hipz_acl = 0; + u64 *kpage = 0; + u64 rpage = 0; + struct ehca_mr_pginfo pginfo_save; + + EDEB_EN(7, "shca=%p e_mr=%p iova_start=%p size=%lx acl=%x " + "e_pd=%p pginfo=%p num_pages=%lx", shca, e_mr, + iova_start, size, acl, e_pd, pginfo, pginfo->num_pages); + + ehca_mrmw_map_acl(acl, &hipz_acl); + ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl); + + kpage = kmalloc(PAGE_SIZE, GFP_KERNEL); + if (kpage == 0) { + EDEB_ERR(4, "kpage alloc failed"); + retcode = -ENOMEM; + goto ehca_rereg_mr_rereg1_exit0; + } + memset(kpage, 0, PAGE_SIZE); + + pginfo_save = *pginfo; + 
retcode = ehca_set_pagebuf(e_mr, pginfo, pginfo->num_pages, kpage); + if (retcode != 0) { + EDEB_ERR(4, "set pagebuf failed, e_mr=%p pginfo=%p type=%x " + "num_pages=%lx kpage=%p", + e_mr, pginfo, pginfo->type, pginfo->num_pages, kpage); + goto ehca_rereg_mr_rereg1_exit1; + } + rpage = ehca_kv_to_g(kpage); + if (rpage == 0) { + EDEB_ERR(4, "kpage=%p", kpage); + retcode = -EFAULT; + goto ehca_rereg_mr_rereg1_exit1; + } + rc = hipz_h_reregister_pmr(shca->ipz_hca_handle, pfmr, &shca->pf, + &e_mr->ipz_mr_handle, (u64)iova_start, + size, hipz_acl, e_pd->fw_pd, rpage, + &iova_start_out, lkey, rkey); + if (rc != H_Success) { + /* reregistration unsuccessful, */ + /* try it again with the 3 hCalls, */ + /* e.g. this is required in case H_MR_CONDITION */ + /* (MW bound or MR is shared) */ + EDEB(6, "hipz_h_reregister_pmr failed (Rereg1), rc=%lx " + "e_mr=%p", rc, e_mr); + *pginfo = pginfo_save; + retcode = -EAGAIN; + } else if ((u64 *)iova_start_out != iova_start) { + EDEB_ERR(4, "PHYP changed iova_start in rereg_pmr, " + "iova_start=%p iova_start_out=%lx e_mr=%p " + "mr_handle=%lx lkey=%x", iova_start, iova_start_out, + e_mr, e_mr->ipz_mr_handle.handle, e_mr->ib.ib_mr.lkey); + retcode = -EFAULT; + } else { + /* successful reregistration */ + /* note: start and start_out are identical for eServer HCAs */ + e_mr->num_pages = pginfo->num_pages; + e_mr->start = iova_start; + e_mr->size = size; + e_mr->acl = acl; + } + + ehca_rereg_mr_rereg1_exit1: + kfree(kpage); + ehca_rereg_mr_rereg1_exit0: + if ((retcode == 0) || (retcode == -EAGAIN)) + EDEB_EX(7, "retcode=%x rc=%lx lkey=%x rkey=%x pginfo=%p " + "num_pages=%lx", + retcode, rc, *lkey, *rkey, pginfo, pginfo->num_pages); + else + EDEB_EX(4, "retcode=%x rc=%lx lkey=%x rkey=%x pginfo=%p " + "num_pages=%lx", + retcode, rc, *lkey, *rkey, pginfo, pginfo->num_pages); + return (retcode); +} /* end ehca_rereg_mr_rereg1() */ + +/*----------------------------------------------------------------------*/ 
+/*----------------------------------------------------------------------*/ + +int ehca_rereg_mr(struct ehca_shca *shca, + struct ehca_mr *e_mr, + u64 *iova_start, + u64 size, + int acl, + struct ehca_pd *e_pd, + struct ehca_mr_pginfo *pginfo, + u32 *lkey, + u32 *rkey) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_pfmr *pfmr = &e_mr->pf; + int Rereg1Hcall = TRUE; /* TRUE: use hipz_h_reregister_pmr directly */ + int Rereg3Hcall = FALSE; /* TRUE: use 3 hipz calls for reregistration */ + struct ehca_bridge_handle save_bridge; + + EDEB_EN(7, "shca=%p e_mr=%p iova_start=%p size=%lx acl=%x " + "e_pd=%p pginfo=%p num_pages=%lx", shca, e_mr, + iova_start, size, acl, e_pd, pginfo, pginfo->num_pages); + + /* first determine reregistration hCall(s) */ + if ((pginfo->num_pages > 512) || (e_mr->num_pages > 512) || + (pginfo->num_pages > e_mr->num_pages)) { + EDEB(7, "Rereg3 case, pginfo->num_pages=%lx " + "e_mr->num_pages=%x", pginfo->num_pages, e_mr->num_pages); + Rereg1Hcall = FALSE; + Rereg3Hcall = TRUE; + } + + if (e_mr->flags & EHCA_MR_FLAG_MAXMR) { /* check for max-MR */ + Rereg1Hcall = FALSE; + Rereg3Hcall = TRUE; + e_mr->flags &= ~EHCA_MR_FLAG_MAXMR; + EDEB(4, "Rereg MR for max-MR! 
e_mr=%p", e_mr); + } + + if (Rereg1Hcall) { + retcode = ehca_rereg_mr_rereg1(shca, e_mr, iova_start, size, + acl, e_pd, pginfo, lkey, rkey); + if (retcode != 0) { + if (retcode == -EAGAIN) + Rereg3Hcall = TRUE; + else + goto ehca_rereg_mr_exit0; + } + } + + if (Rereg3Hcall) { + struct ehca_mr save_mr; + + /* first deregister old MR */ + rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, pfmr, + &e_mr->ipz_mr_handle); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_free_mr failed, rc=%lx e_mr=%p " + "hca_hndl=%lx mr_hndl=%lx mr->lkey=%x", + rc, e_mr, shca->ipz_hca_handle.handle, + e_mr->ipz_mr_handle.handle, + e_mr->ib.ib_mr.lkey); + retcode = ehca_mrmw_map_rc_free_mr(rc); + goto ehca_rereg_mr_exit0; + } + /* clean ehca_mr_t, without changing struct ib_mr and lock */ + save_bridge = pfmr->bridge; + save_mr = *e_mr; + ehca_mr_deletenew(e_mr); + + /* set some MR values */ + e_mr->flags = save_mr.flags; + pfmr->bridge = save_bridge; + e_mr->fmr_page_size = save_mr.fmr_page_size; + e_mr->fmr_max_pages = save_mr.fmr_max_pages; + e_mr->fmr_max_maps = save_mr.fmr_max_maps; + e_mr->fmr_map_cnt = save_mr.fmr_map_cnt; + + retcode = ehca_reg_mr(shca, e_mr, iova_start, size, acl, + e_pd, pginfo, lkey, rkey); + if (retcode != 0) { + u32 offset = (u64)(&e_mr->flags) - (u64)e_mr; + memcpy(&e_mr->flags, &(save_mr.flags), + sizeof(struct ehca_mr) - offset); + goto ehca_rereg_mr_exit0; + } + } + + ehca_rereg_mr_exit0: + if (retcode == 0) + EDEB_EX(7, "retcode=%x shca=%p e_mr=%p iova_start=%p size=%lx " + "acl=%x e_pd=%p pginfo=%p num_pages=%lx lkey=%x " + "rkey=%x Rereg1Hcall=%x Rereg3Hcall=%x", + retcode, shca, e_mr, iova_start, size, acl, e_pd, + pginfo, pginfo->num_pages, *lkey, *rkey, Rereg1Hcall, + Rereg3Hcall); + else + EDEB_EX(4, "retcode=%x shca=%p e_mr=%p iova_start=%p size=%lx " + "acl=%x e_pd=%p pginfo=%p num_pages=%lx lkey=%x " + "rkey=%x Rereg1Hcall=%x Rereg3Hcall=%x", + retcode, shca, e_mr, iova_start, size, acl, e_pd, + pginfo, pginfo->num_pages, *lkey, *rkey, 
Rereg1Hcall, + Rereg3Hcall); + + return (retcode); +} /* end ehca_rereg_mr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_unmap_one_fmr(struct ehca_shca *shca, + struct ehca_mr *e_fmr) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_pfmr *pfmr = &e_fmr->pf; + int Rereg1Hcall = TRUE; /* TRUE: use hipz_mr_reregister directly */ + int Rereg3Hcall = FALSE; /* TRUE: use 3 hipz calls for unmapping */ + struct ehca_bridge_handle save_bridge; + struct ehca_pd *e_pd = 0; + struct ehca_mr save_fmr; + u32 tmp_lkey = 0; + u32 tmp_rkey = 0; + struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0}; + + EDEB_EN(7, "shca=%p e_fmr=%p", shca, e_fmr); + + /* first check if reregistration hCall can be used for unmap */ + if (e_fmr->fmr_max_pages > 512) { + Rereg1Hcall = FALSE; + Rereg3Hcall = TRUE; + } + + e_pd = container_of(e_fmr->ib.ib_fmr.pd, struct ehca_pd, ib_pd); + + if (Rereg1Hcall) { + /* note: after using rereg hcall with len=0, */ + /* rereg hcall must be used again for registering pages */ + u64 start_out = 0; + rc = hipz_h_reregister_pmr(shca->ipz_hca_handle, pfmr, + &shca->pf, &e_fmr->ipz_mr_handle, 0, + 0, 0, e_pd->fw_pd, 0, &start_out, + &tmp_lkey, &tmp_rkey); + if (rc != H_Success) { + /* should not happen, because length checked above, */ + /* FMRs are not shared and no MW bound to FMRs */ + EDEB_ERR(4, "hipz_reregister_pmr failed (Rereg1), " + "rc=%lx e_fmr=%p hca_hndl=%lx mr_hndl=%lx " + "lkey=%x", rc, e_fmr, + shca->ipz_hca_handle.handle, + e_fmr->ipz_mr_handle.handle, + e_fmr->ib.ib_fmr.lkey); + Rereg3Hcall = TRUE; + } else { + /* successful reregistration */ + e_fmr->start = 0; + e_fmr->size = 0; + } + } + + if (Rereg3Hcall) { + struct ehca_mr save_mr; + + /* first free old FMR */ + rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, pfmr, + &e_fmr->ipz_mr_handle); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_free_mr failed, rc=%lx 
e_fmr=%p " + "hca_hndl=%lx mr_hndl=%lx lkey=%x", rc, e_fmr, + shca->ipz_hca_handle.handle, + e_fmr->ipz_mr_handle.handle, + e_fmr->ib.ib_fmr.lkey); + retcode = ehca_mrmw_map_rc_free_mr(rc); + goto ehca_unmap_one_fmr_exit0; + } + /* clean ehca_mr_t, without changing lock */ + save_bridge = pfmr->bridge; + save_fmr = *e_fmr; + ehca_mr_deletenew(e_fmr); + + /* set some MR values */ + e_fmr->flags = save_fmr.flags; + pfmr->bridge = save_bridge; + e_fmr->fmr_page_size = save_fmr.fmr_page_size; + e_fmr->fmr_max_pages = save_fmr.fmr_max_pages; + e_fmr->fmr_max_maps = save_fmr.fmr_max_maps; + e_fmr->fmr_map_cnt = save_fmr.fmr_map_cnt; + e_fmr->acl = save_fmr.acl; + + pginfo.type = EHCA_MR_PGI_FMR; + pginfo.num_pages = 0; + retcode = ehca_reg_mr(shca, e_fmr, 0, + (e_fmr->fmr_max_pages * + e_fmr->fmr_page_size), + e_fmr->acl, e_pd, &pginfo, &tmp_lkey, + &tmp_rkey); + if (retcode != 0) { + /* restore from save_fmr, the copy taken above */ + u32 offset = (u64)(&e_fmr->flags) - (u64)e_fmr; + memcpy(&e_fmr->flags, &(save_fmr.flags), + sizeof(struct ehca_mr) - offset); + goto ehca_unmap_one_fmr_exit0; + } + } + + ehca_unmap_one_fmr_exit0: + EDEB_EX(7, "retcode=%x tmp_lkey=%x tmp_rkey=%x fmr_max_pages=%x " + "Rereg1Hcall=%x Rereg3Hcall=%x", retcode, tmp_lkey, tmp_rkey, + e_fmr->fmr_max_pages, Rereg1Hcall, Rereg3Hcall); + return (retcode); +} /* end ehca_unmap_one_fmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_reg_smr(struct ehca_shca *shca, + struct ehca_mr *e_origmr, + struct ehca_mr *e_newmr, + u64 *iova_start, + int acl, + struct ehca_pd *e_pd, + u32 *lkey, + u32 *rkey) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_pfmr *pfmr = &e_newmr->pf; + u32 hipz_acl = 0; + + EDEB_EN(7,"shca=%p e_origmr=%p e_newmr=%p iova_start=%p acl=%x e_pd=%p", + shca, e_origmr, e_newmr, iova_start, acl, e_pd); + + ehca_mrmw_map_acl(acl, &hipz_acl); + ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl); + + rc = 
hipz_h_register_smr(shca->ipz_hca_handle, pfmr, &e_origmr->pf, + &shca->pf, &e_origmr->ipz_mr_handle, + (u64)iova_start, hipz_acl, e_pd->fw_pd, + &e_newmr->ipz_mr_handle, lkey, rkey); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_reg_smr failed, rc=%lx shca=%p e_origmr=%p " + "e_newmr=%p iova_start=%p acl=%x e_pd=%p hca_hndl=%lx " + "mr_hndl=%lx lkey=%x", rc, shca, e_origmr, e_newmr, + iova_start, acl, e_pd, shca->ipz_hca_handle.handle, + e_origmr->ipz_mr_handle.handle, + e_origmr->ib.ib_mr.lkey); + retcode = ehca_mrmw_map_rc_reg_smr(rc); + goto ehca_reg_smr_exit0; + } + /* successful registration */ + e_newmr->num_pages = e_origmr->num_pages; + e_newmr->start = iova_start; + e_newmr->size = e_origmr->size; + e_newmr->acl = acl; + goto ehca_reg_smr_exit0; + + ehca_reg_smr_exit0: + if (retcode == 0) + EDEB_EX(7, "retcode=%x lkey=%x rkey=%x", + retcode, *lkey, *rkey); + else + EDEB_EX(4, "retcode=%x shca=%p e_origmr=%p e_newmr=%p " + "iova_start=%p acl=%x e_pd=%p", retcode, + shca, e_origmr, e_newmr, iova_start, acl, e_pd); + return (retcode); +} /* end ehca_reg_smr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_reg_internal_maxmr( + struct ehca_shca *shca, + struct ehca_pd *e_pd, + struct ehca_mr **e_maxmr) +{ + int retcode = 0; + struct ehca_mr *e_mr = 0; + u64 *iova_start = 0; + u64 size_maxmr = 0; + struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0}; + struct ib_phys_buf ib_pbuf; + u32 num_pages_mr = 0; + + EDEB_EN(7, "shca=%p e_pd=%p e_maxmr=%p", shca, e_pd, e_maxmr); + + if (ehca_adr_bad(shca) || ehca_adr_bad(e_pd) || ehca_adr_bad(e_maxmr)) { + EDEB_ERR(4, "bad input values: shca=%p e_pd=%p e_maxmr=%p", + shca, e_pd, e_maxmr); + retcode = -EINVAL; + goto ehca_reg_internal_maxmr_exit0; + } + + e_mr = ehca_mr_new(); + if (!e_mr) { + EDEB_ERR(4, "out of memory"); + retcode = -ENOMEM; + goto ehca_reg_internal_maxmr_exit0; + } + e_mr->flags 
|= EHCA_MR_FLAG_MAXMR; + + /* register internal max-MR on HCA */ + size_maxmr = (u64)high_memory - PAGE_OFFSET; + EDEB(9, "high_memory=%p PAGE_OFFSET=%lx", high_memory, PAGE_OFFSET); + iova_start = (u64 *)KERNELBASE; + ib_pbuf.addr = 0; + ib_pbuf.size = size_maxmr; + num_pages_mr = + ((((u64)iova_start % PAGE_SIZE) + size_maxmr + + PAGE_SIZE - 1) / PAGE_SIZE); + + pginfo.type = EHCA_MR_PGI_PHYS; + pginfo.num_pages = num_pages_mr; + pginfo.num_phys_buf = 1; + pginfo.phys_buf_array = &ib_pbuf; + + retcode = ehca_reg_mr(shca, e_mr, iova_start, size_maxmr, 0, e_pd, + &pginfo, &e_mr->ib.ib_mr.lkey, + &e_mr->ib.ib_mr.rkey); + if (retcode != 0) { + EDEB_ERR(4, "reg of internal max MR failed, e_mr=%p " + "iova_start=%p size_maxmr=%lx num_pages_mr=%x", + e_mr, iova_start, size_maxmr, num_pages_mr); + goto ehca_reg_internal_maxmr_exit1; + } + + /* successful registration of all pages */ + e_mr->ib.ib_mr.device = e_pd->ib_pd.device; + e_mr->ib.ib_mr.pd = &e_pd->ib_pd; + e_mr->ib.ib_mr.uobject = NULL; + atomic_inc(&(e_pd->ib_pd.usecnt)); + atomic_set(&(e_mr->ib.ib_mr.usecnt), 0); + *e_maxmr = e_mr; + goto ehca_reg_internal_maxmr_exit0; + + ehca_reg_internal_maxmr_exit1: + ehca_mr_delete(e_mr); + ehca_reg_internal_maxmr_exit0: + if (retcode == 0) + EDEB_EX(7, "*e_maxmr=%p lkey=%x rkey=%x", + *e_maxmr, (*e_maxmr)->ib.ib_mr.lkey, + (*e_maxmr)->ib.ib_mr.rkey); + else + EDEB_EX(4, "retcode=%x shca=%p e_pd=%p e_maxmr=%p", + retcode, shca, e_pd, e_maxmr); + return (retcode); +} /* end ehca_reg_internal_maxmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_reg_maxmr(struct ehca_shca *shca, + struct ehca_mr *e_newmr, + u64 *iova_start, + int acl, + struct ehca_pd *e_pd, + u32 *lkey, + u32 *rkey) +{ + int retcode = 0; + u64 rc = H_Success; + struct ehca_pfmr *pfmr = &e_newmr->pf; + struct ehca_mr *e_origmr = shca->maxmr; + u32 hipz_acl = 0; + + EDEB_EN(7,"shca=%p 
e_origmr=%p e_newmr=%p iova_start=%p acl=%x e_pd=%p", + shca, e_origmr, e_newmr, iova_start, acl, e_pd); + + ehca_mrmw_map_acl(acl, &hipz_acl); + ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl); + + rc = hipz_h_register_smr(shca->ipz_hca_handle, pfmr, &e_origmr->pf, + &shca->pf, &e_origmr->ipz_mr_handle, + (u64)iova_start, hipz_acl, e_pd->fw_pd, + &e_newmr->ipz_mr_handle, lkey, rkey); + if (rc != H_Success) { + EDEB_ERR(4, "hipz_reg_smr failed, rc=%lx e_origmr=%p " + "hca_hndl=%lx mr_hndl=%lx lkey=%x", + rc, e_origmr, shca->ipz_hca_handle.handle, + e_origmr->ipz_mr_handle.handle, + e_origmr->ib.ib_mr.lkey); + retcode = ehca_mrmw_map_rc_reg_smr(rc); + goto ehca_reg_maxmr_exit0; + } + /* successful registration */ + e_newmr->num_pages = e_origmr->num_pages; + e_newmr->start = iova_start; + e_newmr->size = e_origmr->size; + e_newmr->acl = acl; + + ehca_reg_maxmr_exit0: + EDEB_EX(7, "retcode=%x lkey=%x rkey=%x", retcode, *lkey, *rkey); + return (retcode); +} /* end ehca_reg_maxmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +int ehca_dereg_internal_maxmr(struct ehca_shca *shca) +{ + int retcode = 0; + struct ehca_mr *e_maxmr = 0; + struct ib_pd *ib_pd = 0; + + EDEB_EN(7, "shca=%p shca->maxmr=%p", shca, shca->maxmr); + + if (shca->maxmr == 0) { + EDEB_ERR(4, "bad call, shca=%p", shca); + retcode = -EINVAL; + goto ehca_dereg_internal_maxmr_exit0; + } + + e_maxmr = shca->maxmr; + ib_pd = e_maxmr->ib.ib_mr.pd; + shca->maxmr = 0; /* remove internal max-MR indication from SHCA */ + + retcode = ehca_dereg_mr(&e_maxmr->ib.ib_mr); + if (retcode != 0) { + EDEB_ERR(3, "dereg internal max-MR failed, " + "retcode=%x e_maxmr=%p shca=%p lkey=%x", + retcode, e_maxmr, shca, e_maxmr->ib.ib_mr.lkey); + shca->maxmr = e_maxmr; + goto ehca_dereg_internal_maxmr_exit0; + } + + atomic_dec(&ib_pd->usecnt); + + ehca_dereg_internal_maxmr_exit0: + if (retcode == 0) + EDEB_EX(7, ""); + else 
+ EDEB_EX(4, "retcode=%x shca=%p shca->maxmr=%p", + retcode, shca, shca->maxmr); + return (retcode); +} /* end ehca_dereg_internal_maxmr() */ diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.h b/drivers/infiniband/hw/ehca/ehca_mrmw.h new file mode 100644 index 0000000..4df4b5b --- /dev/null +++ b/drivers/infiniband/hw/ehca/ehca_mrmw.h @@ -0,0 +1,739 @@ +/* + * IBM eServer eHCA Infiniband device driver for Linux on POWER + * + * MR/MW declarations and inline functions + * + * Authors: Dietmar Decker + * + * Copyright (c) 2005 IBM Corporation + * + * All rights reserved. + * + * This source code is distributed under a dual license of GPL v2.0 and OpenIB + * BSD. + * + * OpenIB BSD License + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * $Id: ehca_mrmw.h,v 1.59 2006/02/06 10:17:34 schickhj Exp $ + */ + +#ifndef _EHCA_MRMW_H_ +#define _EHCA_MRMW_H_ + +#undef DEB_PREFIX +#define DEB_PREFIX "mrmw" + +#include "hipz_structs.h" + + +int ehca_reg_mr(struct ehca_shca *shca, + struct ehca_mr *e_mr, + u64 *iova_start, + u64 size, + int acl, + struct ehca_pd *e_pd, + struct ehca_mr_pginfo *pginfo, + u32 *lkey, /**addr & ~PAGE_MASK)) { + EDEB_ERR(4, "iova_start/addr mismatch, iova_start=%p " + "pbuf->addr=%lx pbuf->size=%lx", + iova_start, pbuf->addr, pbuf->size); + return (-EINVAL); + } + if (((pbuf->addr + pbuf->size) % PAGE_SIZE) && + (num_phys_buf > 1)) { + EDEB_ERR(4, "addr/size mismatch in 1st buf, pbuf->addr=%lx " + "pbuf->size=%lx", pbuf->addr, pbuf->size); + return (-EINVAL); + } + + for (i = 0; i < num_phys_buf; i++) { + if ((i > 0) && (pbuf->addr % PAGE_SIZE)) { + EDEB_ERR(4, "bad address, i=%x pbuf->addr=%lx " + "pbuf->size=%lx", i, pbuf->addr, pbuf->size); + return (-EINVAL); + } + if (((i > 0) && /* not 1st */ + (i < (num_phys_buf - 1)) && /* not last */ + (pbuf->size % PAGE_SIZE)) || (pbuf->size == 0)) { + EDEB_ERR(4, "bad size, i=%x pbuf->size=%lx", + i, pbuf->size); + return (-EINVAL); + } + size_count += pbuf->size; + pbuf++; + } + + *size = size_count; + return (0); +} /* end ehca_mr_chk_buf_and_calc_size() */ + +/*----------------------------------------------------------------------*/ 
+/*----------------------------------------------------------------------*/ + +/** @brief check page list of map FMR verb for validness +*/ +static inline int ehca_fmr_check_page_list( + struct ehca_mr *e_fmr, /** e_fmr->fmr_max_pages)) { + EDEB_ERR(4, "bad list_len, list_len=%x e_fmr->fmr_max_pages=%x " + "fmr=%p", list_len, e_fmr->fmr_max_pages, e_fmr); + return (-EINVAL); + } + + /* each page must be aligned */ + page = page_list; + for (i = 0; i < list_len; i++) { + if (*page % PAGE_SIZE) { + EDEB_ERR(4, "bad page, i=%x *page=%lx page=%p " + "fmr=%p", i, *page, page, e_fmr); + return (-EINVAL); + } + page++; + } + + return (0); +} /* end ehca_fmr_check_page_list() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +/** @brief setup page buffer from page info + */ +static inline int ehca_set_pagebuf(struct ehca_mr *e_mr, + struct ehca_mr_pginfo *pginfo, + u32 number, + u64 *kpage) /**type, pginfo->num_pages, pginfo->next_buf, + pginfo->next_page, number, kpage, pginfo->page_count, + pginfo->next_listelem, pginfo->region, pginfo->next_chunk, + pginfo->next_nmap); + + if (pginfo->type == EHCA_MR_PGI_PHYS) { + /* loop over desired phys_buf_array entries */ + while (i < number) { + pbuf = pginfo->phys_buf_array + pginfo->next_buf; + numpg = ((pbuf->size + PAGE_SIZE - 1) / PAGE_SIZE); + while (pginfo->next_page < numpg) { + /* sanity check */ + if (pginfo->page_count >= pginfo->num_pages) { + EDEB_ERR(4, "page_count >= num_pages, " + "page_count=%lx num_pages=%lx " + "i=%x", pginfo->page_count, + pginfo->num_pages, i); + retcode = -EFAULT; + goto ehca_set_pagebuf_exit0; + } + *kpage = phys_to_abs((pbuf->addr & PAGE_MASK) + + (pginfo->next_page * + PAGE_SIZE)); + if ((*kpage == 0) && (pbuf->addr != 0)) { + EDEB_ERR(4, "pbuf->addr=%lx" + " pbuf->size=%lx" + " next_page=%lx", + pbuf->addr, pbuf->size, + pginfo->next_page); + retcode = -EFAULT; + goto 
ehca_set_pagebuf_exit0; + } + (pginfo->next_page)++; + (pginfo->page_count)++; + kpage++; + i++; + if (i >= number) break; + } + if (pginfo->next_page >= numpg) { + (pginfo->next_buf)++; + pginfo->next_page = 0; + } + } + } else if (pginfo->type == EHCA_MR_PGI_USER) { + /* loop over desired chunk entries */ + /* (@TODO: add support for large pages) */ + chunk = pginfo->next_chunk; + prev_chunk = pginfo->next_chunk; + list_for_each_entry_continue(chunk, + (&(pginfo->region->chunk_list)), + list) { + EDEB(9, "chunk->page_list[0]=%lx", + (u64)sg_dma_address(&chunk->page_list[0])); + for (i = pginfo->next_nmap; i < chunk->nmap; i++) { + pgaddr = ( page_to_pfn(chunk->page_list[i].page) + << PAGE_SHIFT ); + *kpage = phys_to_abs(pgaddr); + EDEB(9,"pgaddr=%lx *kpage=%lx", pgaddr, *kpage); + if (*kpage == 0) { + EDEB_ERR(4, "chunk->page_list[i]=%lx" + " i=%x mr=%p", + (u64)sg_dma_address( + &chunk->page_list[i]), + i, e_mr); + retcode = -EFAULT; + goto ehca_set_pagebuf_exit0; + } + (pginfo->page_count)++; + (pginfo->next_nmap)++; + kpage++; + j++; + if (j >= number) break; + } + if ( (pginfo->next_nmap >= chunk->nmap) && + (j >= number) ) { + pginfo->next_nmap = 0; + prev_chunk = chunk; + break; + } else if (pginfo->next_nmap >= chunk->nmap) { + pginfo->next_nmap = 0; + prev_chunk = chunk; + } else if (j >= number) + break; + else + prev_chunk = chunk; + } + pginfo->next_chunk = + list_prepare_entry(prev_chunk, + (&(pginfo->region->chunk_list)), + list); + } else if (pginfo->type == EHCA_MR_PGI_FMR) { + /* loop over desired page_list entries */ + fmrlist = pginfo->page_list + pginfo->next_listelem; + for (i = 0; i < number; i++) { + *kpage = phys_to_abs(*fmrlist); + if (*kpage == 0) { + EDEB_ERR(4, "*fmrlist=%lx fmrlist=%p" + " next_listelem=%lx", *fmrlist, + fmrlist, pginfo->next_listelem); + retcode = -EFAULT; + goto ehca_set_pagebuf_exit0; + } + (pginfo->next_listelem)++; + (pginfo->page_count)++; + fmrlist++; + kpage++; + } + } else { + EDEB_ERR(4, "bad 
pginfo->type=%x", pginfo->type); + retcode = -EFAULT; + goto ehca_set_pagebuf_exit0; + } + + ehca_set_pagebuf_exit0: + if (retcode == 0) + EDEB_EX(7, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx " + "next_buf=%lx next_page=%lx number=%x kpage=%p " + "page_count=%lx i=%x next_listelem=%lx region=%p " + "next_chunk=%p next_nmap=%lx", + retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages, + pginfo->next_buf, pginfo->next_page, number, kpage, + pginfo->page_count, i, pginfo->next_listelem, + pginfo->region, pginfo->next_chunk, pginfo->next_nmap); + else + EDEB_EX(4, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx " + "next_buf=%lx next_page=%lx number=%x kpage=%p " + "page_count=%lx i=%x next_listelem=%lx region=%p " + "next_chunk=%p next_nmap=%lx", + retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages, + pginfo->next_buf, pginfo->next_page, number, kpage, + pginfo->page_count, i, pginfo->next_listelem, + pginfo->region, pginfo->next_chunk, pginfo->next_nmap); + return (retcode); +} /* end ehca_set_pagebuf() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +/** @brief setup 1 page from page info page buffer + */ +static inline int ehca_set_pagebuf_1(struct ehca_mr *e_mr, + struct ehca_mr_pginfo *pginfo, + u64 *rpage) /**type, pginfo->num_pages, pginfo->next_buf, + pginfo->next_page, rpage, pginfo->page_count, + pginfo->next_listelem, pginfo->region, pginfo->next_chunk, + pginfo->next_nmap); + + if (pginfo->type == EHCA_MR_PGI_PHYS) { + /* sanity check */ + if (pginfo->page_count >= pginfo->num_pages) { + EDEB_ERR(4, "page_count >= num_pages, " + "page_count=%lx num_pages=%lx", + pginfo->page_count, pginfo->num_pages); + retcode = -EFAULT; + goto ehca_set_pagebuf_1_exit0; + } + tmp_pbuf = pginfo->phys_buf_array + pginfo->next_buf; + *rpage = phys_to_abs(((tmp_pbuf->addr & PAGE_MASK) + + (pginfo->next_page * PAGE_SIZE))); + if ((*rpage == 0) && 
(tmp_pbuf->addr != 0)) { + EDEB_ERR(4, "tmp_pbuf->addr=%lx" + " tmp_pbuf->size=%lx next_page=%lx", + tmp_pbuf->addr, tmp_pbuf->size, + pginfo->next_page); + retcode = -EFAULT; + goto ehca_set_pagebuf_1_exit0; + } + (pginfo->next_page)++; + (pginfo->page_count)++; + if (pginfo->next_page >= tmp_pbuf->size / PAGE_SIZE) { + (pginfo->next_buf)++; + pginfo->next_page = 0; + } + } else if (pginfo->type == EHCA_MR_PGI_USER) { + chunk = pginfo->next_chunk; + prev_chunk = pginfo->next_chunk; + list_for_each_entry_continue(chunk, + (&(pginfo->region->chunk_list)), + list) { + pgaddr = ( page_to_pfn(chunk->page_list[ + pginfo->next_nmap].page) + << PAGE_SHIFT ); + *rpage = phys_to_abs(pgaddr); + EDEB(9,"pgaddr=%lx *rpage=%lx", pgaddr, *rpage); + if (*rpage == 0) { + EDEB_ERR(4, "chunk->page_list[]=%lx next_nmap=%lx " + "mr=%p", (u64)sg_dma_address( + &chunk->page_list[ + pginfo->next_nmap]), + pginfo->next_nmap, e_mr); + retcode = -EFAULT; + goto ehca_set_pagebuf_1_exit0; + } + (pginfo->page_count)++; + (pginfo->next_nmap)++; + if (pginfo->next_nmap >= chunk->nmap) { + pginfo->next_nmap = 0; + prev_chunk = chunk; + } + break; + } + pginfo->next_chunk = + list_prepare_entry(prev_chunk, + (&(pginfo->region->chunk_list)), + list); + } else if (pginfo->type == EHCA_MR_PGI_FMR) { + tmp_fmrlist = pginfo->page_list + pginfo->next_listelem; + *rpage = phys_to_abs(*tmp_fmrlist); + if (*rpage == 0) { + EDEB_ERR(4, "*tmp_fmrlist=%lx tmp_fmrlist=%p" + " next_listelem=%lx", *tmp_fmrlist, + tmp_fmrlist, pginfo->next_listelem); + retcode = -EFAULT; + goto ehca_set_pagebuf_1_exit0; + } + (pginfo->next_listelem)++; + (pginfo->page_count)++; + } else { + EDEB_ERR(4, "bad pginfo->type=%x", pginfo->type); + retcode = -EFAULT; + goto ehca_set_pagebuf_1_exit0; + } + + ehca_set_pagebuf_1_exit0: + if (retcode == 0) + EDEB_EX(7, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx " + "next_buf=%lx next_page=%lx rpage=%p page_count=%lx " + "next_listelem=%lx region=%p next_chunk=%p " + 
"next_nmap=%lx", + retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages, + pginfo->next_buf, pginfo->next_page, rpage, + pginfo->page_count, pginfo->next_listelem, + pginfo->region, pginfo->next_chunk, pginfo->next_nmap); + else + EDEB_EX(4, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx " + "next_buf=%lx next_page=%lx rpage=%p page_count=%lx " + "next_listelem=%lx region=%p next_chunk=%p " + "next_nmap=%lx", + retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages, + pginfo->next_buf, pginfo->next_page, rpage, + pginfo->page_count, pginfo->next_listelem, + pginfo->region, pginfo->next_chunk, pginfo->next_nmap); + return (retcode); +} /* end ehca_set_pagebuf_1() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ + +/** @brief check MR if it is a max-MR, i.e. uses whole memory + in case it's a max-MR TRUE is returned, else FALSE +*/ +static inline int ehca_mr_is_maxmr(u64 size, + u64 *iova_start) +{ + /* a MR is treated as max-MR only if it fits following: */ + if ((size == ((u64)high_memory - PAGE_OFFSET)) && + (iova_start == (void*)KERNELBASE)) { + EDEB(6, "this is a max-MR"); + return (TRUE); + } else + return (FALSE); +} /* end ehca_mr_is_maxmr() */ + +/*----------------------------------------------------------------------*/ +/*----------------------------------------------------------------------*/ +/** @brief map access control for MR/MW. + This routine is used for MR and MW. 
+*/ +static inline void ehca_mrmw_map_acl(int ib_acl, /**flags) - (u64)mr; + memset(&mr->flags, 0, sizeof(*mr) - offset); +} /* end ehca_mr_deletenew() */ + +#endif /*_EHCA_MRMW_H_*/ From rolandd at cisco.com Sat Feb 18 11:58:02 2006 From: rolandd at cisco.com (Roland Dreier) Date: Fri, 17 Feb 2006 16:58:02 -0800 Subject: [PATCH 22/22] ehca Makefile/Kconfig changes In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> Message-ID: <20060218005801.13620.38625.stgit@localhost.localdomain> From: Roland Dreier --- drivers/infiniband/Kconfig | 2 ++ drivers/infiniband/Makefile | 1 + drivers/infiniband/hw/ehca/Kbuild | 8 ++++++++ drivers/infiniband/hw/ehca/Kconfig | 6 ++++++ 4 files changed, 17 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig index bdf0891..2b3ad03 100644 --- a/drivers/infiniband/Kconfig +++ b/drivers/infiniband/Kconfig @@ -31,6 +31,8 @@ config INFINIBAND_USER_ACCESS source "drivers/infiniband/hw/mthca/Kconfig" +source "drivers/infiniband/hw/ehca/Kconfig" + source "drivers/infiniband/ulp/ipoib/Kconfig" source "drivers/infiniband/ulp/srp/Kconfig" diff --git a/drivers/infiniband/Makefile b/drivers/infiniband/Makefile index a43fb34..eb7788f 100644 --- a/drivers/infiniband/Makefile +++ b/drivers/infiniband/Makefile @@ -1,4 +1,5 @@ obj-$(CONFIG_INFINIBAND) += core/ obj-$(CONFIG_INFINIBAND_MTHCA) += hw/mthca/ +obj-$(CONFIG_INFINIBAND_EHCA) += hw/ehca/ obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/ obj-$(CONFIG_INFINIBAND_SRP) += ulp/srp/ diff --git a/drivers/infiniband/hw/ehca/Kbuild b/drivers/infiniband/hw/ehca/Kbuild new file mode 100644 index 0000000..7b610b1 --- /dev/null +++ b/drivers/infiniband/hw/ehca/Kbuild @@ -0,0 +1,8 @@ +obj-$(CONFIG_INFINIBAND_EHCA) += hcad_mod.o + +hcad_mod-objs = ehca_main.o ehca_hca.o ipz_pt_fn.o ehca_classes.o ehca_av.o \ + ehca_pd.o ehca_mrmw.o ehca_cq.o ehca_sqp.o ehca_qp.o hcp_sense.o \ + 
ehca_eq.o ehca_irq.o hcp_phyp.o ehca_mcast.o ehca_reqs.o \ + ehca_uverbs.o + +CFLAGS +=-DP_SERIES -DEHCA_USE_HCALL -DEHCA_USE_HCALL_KERNEL diff --git a/drivers/infiniband/hw/ehca/Kconfig b/drivers/infiniband/hw/ehca/Kconfig new file mode 100644 index 0000000..b875649 --- /dev/null +++ b/drivers/infiniband/hw/ehca/Kconfig @@ -0,0 +1,6 @@ +config INFINIBAND_EHCA + tristate "eHCA support" + depends on IBMEBUS && INFINIBAND + ---help--- + This is a low level device driver for the IBM + GX based Host channel adapters (HCAs) \ No newline at end of file From greg at kroah.com Sat Feb 18 12:54:13 2006 From: greg at kroah.com (Greg KH) Date: Fri, 17 Feb 2006 17:54:13 -0800 Subject: [PATCH 04/22] OF adapter probing In-Reply-To: <20060218005712.13620.82908.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005712.13620.82908.stgit@localhost.localdomain> Message-ID: <20060218015413.GA17653@kroah.com> On Fri, Feb 17, 2006 at 04:57:14PM -0800, Roland Dreier wrote: > +int hipz_count_adapters(void) > +{ > + int num = 0; > + struct device_node *dn = NULL; > + > + EDEB_EN(7, ""); > + > + while ((dn = of_find_node_by_name(dn, "lhca"))) { > + num++; > + } The { } are not needed here. > + > + of_node_put(dn); > + > + if (num == 0) { > + EDEB_ERR(4, "No lhca node name was found in the" > + " Open Firmware device tree."); > + return -ENODEV; > + } > + > + EDEB(6, " ... 
found %x adapter(s)", num); > + > + EDEB_EX(7, "num=%x", num); > + > + return num; > +} > + > +int hipz_probe_adapters(char **adapter_list) > +{ > + int ret = 0; > + int num = 0; > + struct device_node *dn = NULL; > + char *loc; > + > + EDEB_EN(7, "adapter_list=%p", adapter_list); > + > + while ((dn = of_find_node_by_name(dn, "lhca"))) { > + loc = get_property(dn, "ibm,loc-code", NULL); > + if (loc == NULL) { > + EDEB_ERR(4, "No ibm,loc-code property for" > + " lhca Open Firmware device tree node."); > + ret = -ENODEV; > + goto probe_adapters0; > + } > + > + adapter_list[num] = loc; > + EDEB(6, " ... found adapter[%x] with loc-code: %s", num, loc); > + num++; > + } > + > + probe_adapters0: > + of_node_put(dn); Please use tabs everywhere. Hm, wait, that's a label. Put it where it belongs, over on the left please. thanks, greg k-h From greg at kroah.com Sat Feb 18 12:58:08 2006 From: greg at kroah.com (Greg KH) Date: Fri, 17 Feb 2006 17:58:08 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: <20060218005707.13620.20538.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> Message-ID: <20060218015808.GB17653@kroah.com> On Fri, Feb 17, 2006 at 04:57:07PM -0800, Roland Dreier wrote: > From: Roland Dreier > > This is a very large file with way too much code for a .h file. > The functions look too big to be inlined also. Is there any way > for this code to move to a .c file? Roland, your comments are fine, but what about the original author's descriptions of what each patch are? Come on, IBM allows developers to post code to lkml, just look at the archives for proof. For them to use a proxy like this is very strange, and also, there is no Signed-off-by: record from the original authors, which is not ok. And why aren't you using the standard firmware interface in the kernel? 
> +#ifndef CONFIG_PPC64
> +#ifndef Z_SERIES
> +#warning "included with wrong target, this is a p file"
> +#endif
> +#endif

It's a "p" file? What's that? Is this even needed?

thanks,

greg k-h

From rdreier at cisco.com Sat Feb 18 13:04:56 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 18:04:56 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218015808.GB17653@kroah.com> (Greg KH's message of "Fri, 17 Feb 2006 17:58:08 -0800")
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com>
Message-ID:

    Greg> Roland, your comments are fine, but what about the original
    Greg> author's descriptions of what each patch are?

This is actually me breaking up a giant driver into pieces small
enough to post to lkml without hitting the 100 KB limit.

This is just an RFC -- I assume the driver is going to get merged in
the end as one big git changeset with a changelog like "add driver for
IBM eHCA InfiniBand adapters".

    Greg> Come on, IBM allows developers to post code to lkml, just
    Greg> look at the archives for proof. For them to use a proxy
    Greg> like this is very strange, and also, there is no
    Greg> Signed-off-by: record from the original authors, which is
    Greg> not ok.

Well, the eHCA guys tell me that they can't post patches to lkml.
You're right that the final merge will have to have an IBM
Signed-off-by: line but as I said this is just an RFC. There are many
reasons beyond patch format issues that make this stuff unmergeable
as-is.

    Greg> And why aren't you using the standard firmware interface in
    Greg> the kernel?

This is actually stuff to talk to the firmware that sits below the
kernel on IBM ppc64 machines, not an interface to load device firmware
from userspace.

 - R.
From apgo at patchbomb.org Sat Feb 18 21:08:49 2006 From: apgo at patchbomb.org (Arthur Othieno) Date: Sat, 18 Feb 2006 05:08:49 -0500 Subject: [PATCH] powerpc: ARCH=powerpc build fix for CONFIG_SYSVIPC=n || CONFIG_SYSCTL=n Message-ID: <20060218100849.GA1869@krypton> When using a default config generated by just `make menuconfig' (ie. none of arch/powerpc/configs/*), linking .tmp_vmlinux1 barfs with: arch/powerpc/kernel/built-in.o: In function `.sys_call_table': : undefined reference to `.compat_sys_ipc' arch/powerpc/kernel/built-in.o: In function `.sys_call_table': : undefined reference to `.compat_sys_sysctl' make: *** [.tmp_vmlinux1] Error 1 These are wrapped around #ifdef CONFIG_{SYSVIPC,SYSCTL} respectively. Fixup to just return -ENOSYS when CONFIG_SYSVIPC=n || CONFIG_SYSCTL=n. Signed-off-by: Arthur Othieno --- Paulus, any chance this can go in before 2.6.16 is Out There(tm) ? arch/powerpc/kernel/sys_ppc32.c | 17 ++++++++++++++--- 1 files changed, 14 insertions(+), 3 deletions(-) 122877b2f58236c61f87797c2908a9ab1e3e451d diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index 475249d..272beb3 100644 --- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first, return -ENOSYS; } -#endif +#else +long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr, + u32 fifth) +{ + return -ENOSYS; +} +#endif /* CONFIG_SYSVIPC */ /* Note: it is necessary to treat out_fd and in_fd as unsigned ints, * with the corresponding cast to a signed int to insure that the @@ -818,7 +824,6 @@ asmlinkage long compat_sys_umask(u32 mas return sys_umask((int)mask); } -#ifdef CONFIG_SYSCTL struct __sysctl_args32 { u32 name; int nlen; @@ -829,6 +834,7 @@ struct __sysctl_args32 { u32 __unused[4]; }; +#ifdef CONFIG_SYSCTL asmlinkage long compat_sys_sysctl(struct __sysctl_args32 __user *args) { struct __sysctl_args32 tmp; @@ -868,7 +874,12 @@ asmlinkage long 
compat_sys_sysctl(struct } return error; } -#endif +#else +asmlinkage long compat_sys_sysctl(struct __sysctl_args32 __user *args) +{ + return -ENOSYS; +} +#endif /* CONFIG_SYSCTL */ unsigned long compat_sys_mmap2(unsigned long addr, size_t len, unsigned long prot, unsigned long flags, -- 1.1.5 From heiko.carstens at de.ibm.com Sat Feb 18 21:59:36 2006 From: heiko.carstens at de.ibm.com (Heiko Carstens) Date: Sat, 18 Feb 2006 11:59:36 +0100 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: <20060218015808.GB17653@kroah.com> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> Message-ID: <20060218105936.GD9216@osiris.boeblingen.de.ibm.com> > Come on, IBM allows developers to post code to lkml, just look at the > archives for proof. For them to use a proxy like this is very strange, Things aren't always that easy at IBM. You should know best :) Heiko From hch at infradead.org Sat Feb 18 23:17:53 2006 From: hch at infradead.org (Christoph Hellwig) Date: Sat, 18 Feb 2006 12:17:53 +0000 Subject: [PATCH 01/22] Add powerpc-specific clear_cacheline(), which just compiles to "dcbz". In-Reply-To: <20060218005704.13620.88286.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005704.13620.88286.stgit@localhost.localdomain> Message-ID: <20060218121753.GC911@infradead.org> On Fri, Feb 17, 2006 at 04:57:04PM -0800, Roland Dreier wrote: > From: Roland Dreier > > This is horribly non-portable. Yes. If this is needed it should go to an asm/ header, not in a driver. From hch at infradead.org Sat Feb 18 23:19:13 2006 From: hch at infradead.org (Christoph Hellwig) Date: Sat, 18 Feb 2006 12:19:13 +0000 Subject: [PATCH 02/22] Firmware interface code for IB device. 
In-Reply-To: <20060218005707.13620.20538.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
Message-ID: <20060218121913.GD911@infradead.org>

On Fri, Feb 17, 2006 at 04:57:07PM -0800, Roland Dreier wrote:
> From: Roland Dreier
>
> This is a very large file with way too much code for a .h file.
> The functions look too big to be inlined also. Is there any way
> for this code to move to a .c file?
> ---
>
> drivers/infiniband/hw/ehca/hcp_if.h | 2022 +++++++++++++++++++++++++++++++++++

> +#include "ehca_tools.h"
> +#include "hipz_structs.h"
> +#include "ehca_classes.h"
> +
> +#ifndef EHCA_USE_HCALL
> +#include "hcz_queue.h"
> +#include "hcz_mrmw.h"
> +#include "hcz_emmio.h"
> +#include "sim_prom.h"
> +#endif
> +#include "hipz_fns.h"
> +#include "hcp_sense.h"
> +#include "ehca_irq.h"
> +
> +#ifndef CONFIG_PPC64
> +#ifndef Z_SERIES
> +#warning "included with wrong target, this is a p file"
> +#endif
> +#endif
> +
> +#ifdef EHCA_USE_HCALL
> +
> +#ifndef EHCA_USERDRIVER
> +#include "hcp_phyp.h"
> +#else
> +#include "testbench/hcallbridge.h"
> +#endif
> +#endif

the ifdefs should all go away and the build system should make sure
it's only built for the right platforms.

From hch at infradead.org Sat Feb 18 23:20:11 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:20:11 +0000
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To:
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com>
Message-ID: <20060218122011.GE911@infradead.org>

On Fri, Feb 17, 2006 at 06:04:56PM -0800, Roland Dreier wrote:
> Greg> Roland, your comments are fine, but what about the original
> Greg> author's descriptions of what each patch are?
>
> This is actually me breaking up a giant driver into pieces small
> enough to post to lkml without hitting the 100 KB limit.
>
> This is just an RFC -- I assume the driver is going to get merged in
> the end as one big git changeset with a changelog like "add driver for
> IBM eHCA InfiniBand adapters".
>
> Greg> Come on, IBM allows developers to post code to lkml, just
> Greg> look at the archives for proof. For them to use a proxy
> Greg> like this is very strange, and also, there is no
> Greg> Signed-off-by: record from the original authors, which is
> Greg> not ok.
>
> Well, the eHCA guys tell me that they can't post patches to lkml.

Then they lie. And not posting to lkml is a good reason not to merge
an otherwise perfect driver. (which this one is far from)

From hch at infradead.org Sat Feb 18 23:23:17 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:23:17 +0000
Subject: [PATCH 03/22] pHype specific stuff
In-Reply-To: <20060218005709.13620.77409.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005709.13620.77409.stgit@localhost.localdomain>
Message-ID: <20060218122317.GF911@infradead.org>

> +u64 hipz_galpa_load(struct h_galpa galpa, u32 offset)
> +{
> +	u64 addr = galpa.fw_handle + offset;
> +	u64 out;
> +	EDEB_EN(7, "addr=%lx offset=%x ", addr, offset);
> +	out = *(u64 *) addr;

why does this cast a u64 to a pointer?

> +#ifndef EHCA_USERDRIVER
> +inline static int hcall_map_page(u64 physaddr, u64 * mapaddr)
> +{
> +	*mapaddr = (u64)(ioremap(physaddr, 4096));
> +
> +	EDEB(7, "ioremap physaddr=%lx mapaddr=%lx", physaddr, *mapaddr);
> +	return 0;

ioremap returns void __iomem * and casting that to any integer type is
wrong.

> +inline static int hcall_unmap_page(u64 mapaddr)
> +{
> +	EDEB(7, "mapaddr=%lx", mapaddr);
> +	iounmap((void *)(mapaddr));
> +	return 0;

ditto for iounmap and casting back.

guys, please run this driver through sparse, thanks.
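
To illustrate the ioremap point above, here is a minimal user-space sketch (not code from the patch) of keeping the mapping pointer-typed end to end instead of round-tripping it through a u64. Everything in it is an assumption made for illustration: fake_ioremap(), fake_regs, map_page_sketch() and load64_sketch() are hypothetical names, __iomem is defined away so the file builds outside the kernel, and in a real driver the mapping would come from ioremap() and be read through the kernel's MMIO accessors rather than a plain dereference.

```c
#include <stddef.h>
#include <stdint.h>

/* In the kernel, __iomem is an annotation checked by sparse. It is
 * defined to nothing here only so this sketch compiles in user space. */
#ifndef __iomem
#define __iomem
#endif

/* Hypothetical 4096-byte register window standing in for device memory. */
uint64_t fake_regs[512];

/* Hypothetical stand-in for ioremap(): returns a pointer, or NULL when
 * the requested length does not fit the fake window. */
void __iomem *fake_ioremap(uint64_t physaddr, size_t len)
{
	(void)physaddr;
	if (len > sizeof(fake_regs))
		return NULL;
	return fake_regs;
}

/* Pointer-typed variant of the map helper: the result stays a
 * void __iomem * and a failed mapping is reported to the caller. */
int map_page_sketch(uint64_t physaddr, void __iomem **mapaddr)
{
	void __iomem *p = fake_ioremap(physaddr, 4096);

	if (p == NULL)
		return -1;
	*mapaddr = p;
	return 0;
}

/* Load a 64-bit value at a byte offset from the mapped base. The base
 * is a pointer throughout, so no integer-to-pointer cast is needed. */
uint64_t load64_sketch(const void __iomem *base, uint32_t offset)
{
	const volatile uint64_t *reg =
		(const volatile uint64_t *)((const char *)base + offset);

	return *reg;
}
```

With the annotation carried on the types, a checker such as sparse has something to verify at each call site, which is not possible once the address has been flattened into a u64.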
> +	/* if phype returns LongBusyXXX,
> +	 * we retry several times, but not forever */
> +	for (i = 0; i < 5; i++) {
> +		__asm__ __volatile__("mr 3,%10\n"
> +				     "mr 4,%11\n"
> +				     "mr 5,%12\n"

assembly code under drivers/ is not acceptable. please create and for
it or something similar.

From hch at infradead.org Sat Feb 18 23:29:10 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:29:10 +0000
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218122631.GA30535@granada.merseine.nu>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
Message-ID: <20060218122910.GA1521@infradead.org>

On Sat, Feb 18, 2006 at 02:26:31PM +0200, Muli Ben-Yehuda wrote:
> I don't speak for IBM or the authors, but there are perfectly
> reasonable reasons to ask someone else to post a patch on your behalf
> - including but not limited to only being able to use Lotus Notes
> with one's IBM email. I'm sure you've all seen the travesties that
> Notes inflicts on inline patches.

sure. and there are free webmail accounts that take about 10 minutes
to set up as well as various people offering shell access to linux
machines if you ask nicely. so this really is not an issue.

I think this is more about ibm politics (especially in boeblingen)
sometimes making it pretty hard to post things. But that doesn't mean
it's impossible, it just means they didn't try hard enough.

From arjan at infradead.org Sat Feb 18 23:32:35 2006
From: arjan at infradead.org (Arjan van de Ven)
Date: Sat, 18 Feb 2006 13:32:35 +0100
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218122631.GA30535@granada.merseine.nu>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
Message-ID: <1140265955.4035.19.camel@laptopd505.fenrus.org>

On Sat, 2006-02-18 at 14:26 +0200, Muli Ben-Yehuda wrote:
> On Sat, Feb 18, 2006 at 12:20:11PM +0000, Christoph Hellwig wrote:
>
> > > Well, the eHCA guys tell me that they can't post patches to lkml.
> >
> > Then they lie. And not posting to lkml is a good reason not to merge
> > an otherwise perfect driver. (which this one is far from)
>
> I don't speak for IBM or the authors, but there are perfectly
> reasonable reasons to ask someone else to post a patch on your behalf
> - including but not limited to only being able to use Lotus Notes
> with one's IBM email. I'm sure you've all seen the travesties that
> Notes inflicts on inline patches.

there are ways around that with webmail etc.

The bigger issue is: if people can't be bothered to do those steps,
why would they be bothered to do this for maintenance and bugfixes etc
etc? Basically it's now already a de-facto unmaintained driver....

From info at schihei.de Sat Feb 18 23:46:10 2006
From: info at schihei.de (Heiko J Schick)
Date: Sat, 18 Feb 2006 13:46:10 +0100
Subject: [openib-general] [PATCH 04/22] OF adapter probing
In-Reply-To: <20060218005712.13620.82908.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005712.13620.82908.stgit@localhost.localdomain>
Message-ID:

Hello Roland,

sorry, this file is not used anymore. The functions

int hipz_count_adapters(void);
int hipz_probe_adapters(char **adapter_list);
u64 hipz_get_adapter_handle(char *adapter);

are nowadays handled by the IBMEBUS [1] bus device driver.
[1]: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/ linux-2.6.git;a=commit;h=d7a301033f1990188f65abf4fe8e5b90ef0e3888 Regards, Heiko On Feb 18, 2006, at 1:57 AM, Roland Dreier wrote: > From: Roland Dreier > > hipz_probe_adapters() looks a little funny -- it seems to bail out > of all the remaining adapters if one of them isn't quite right. > --- > > drivers/infiniband/hw/ehca/hcp_sense.c | 144 +++++++++++++++++++++ > +++++++++++ > drivers/infiniband/hw/ehca/hcp_sense.h | 136 +++++++++++++++++++++ > +++++++++ > 2 files changed, 280 insertions(+), 0 deletions(-) > > diff --git a/drivers/infiniband/hw/ehca/hcp_sense.c b/drivers/ > infiniband/hw/ehca/hcp_sense.c > new file mode 100644 > index 0000000..83fa4a3 > --- /dev/null > +++ b/drivers/infiniband/hw/ehca/hcp_sense.c > @@ -0,0 +1,144 @@ > +/* > + * IBM eServer eHCA Infiniband device driver for Linux on POWER > + * > + * ehca detection and query code for POWER > + * > + * Authors: Heiko J Schick > + * > + * Copyright (c) 2005 IBM Corporation > + * > + * All rights reserved. > + * > + * This source code is distributed under a dual license of GPL > v2.0 and OpenIB > + * BSD. > + * > + * OpenIB BSD License > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following > conditions are met: > + * > + * Redistributions of source code must retain the above copyright > notice, this > + * list of conditions and the following disclaimer. > + * > + * Redistributions in binary form must reproduce the above > copyright notice, > + * this list of conditions and the following disclaimer in the > documentation > + * and/or other materials > + * provided with the distribution. 
> + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND > CONTRIBUTORS "AS IS" > + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > LIMITED TO, THE > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A > PARTICULAR PURPOSE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR > CONTRIBUTORS BE > + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, > EXEMPLARY, OR > + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, > PROCUREMENT OF > + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR > + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF > LIABILITY, WHETHER > + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR > OTHERWISE) > + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF > ADVISED OF THE > + * POSSIBILITY OF SUCH DAMAGE. > + * > + * $Id: hcp_sense.c,v 1.10 2006/02/06 10:17:34 schickhj Exp $ > + */ > + > +#define DEB_PREFIX "snse" > + > +#include "ehca_kernel.h" > +#include "ehca_tools.h" > + > +int hipz_count_adapters(void) > +{ > + int num = 0; > + struct device_node *dn = NULL; > + > + EDEB_EN(7, ""); > + > + while ((dn = of_find_node_by_name(dn, "lhca"))) { > + num++; > + } > + > + of_node_put(dn); > + > + if (num == 0) { > + EDEB_ERR(4, "No lhca node name was found in the" > + " Open Firmware device tree."); > + return -ENODEV; > + } > + > + EDEB(6, " ... 
found %x adapter(s)", num); > + > + EDEB_EX(7, "num=%x", num); > + > + return num; > +} > + > +int hipz_probe_adapters(char **adapter_list) > +{ > + int ret = 0; > + int num = 0; > + struct device_node *dn = NULL; > + char *loc; > + > + EDEB_EN(7, "adapter_list=%p", adapter_list); > + > + while ((dn = of_find_node_by_name(dn, "lhca"))) { > + loc = get_property(dn, "ibm,loc-code", NULL); > + if (loc == NULL) { > + EDEB_ERR(4, "No ibm,loc-code property for" > + " lhca Open Firmware device tree node."); > + ret = -ENODEV; > + goto probe_adapters0; > + } > + > + adapter_list[num] = loc; > + EDEB(6, " ... found adapter[%x] with loc-code: %s", num, loc); > + num++; > + } > + > + probe_adapters0: > + of_node_put(dn); > + > + EDEB_EX(7, "ret=%x", ret); > + > + return ret; > +} > + > +u64 hipz_get_adapter_handle(char *adapter) > +{ > + struct device_node *dn = NULL; > + char *loc; > + u64 *u64data = NULL; > + u64 ret = 0; > + > + EDEB_EN(7, "adapter=%p", adapter); > + > + while ((dn = of_find_node_by_name(dn, "lhca"))) { > + loc = get_property(dn, "ibm,loc-code", NULL); > + if (loc == NULL) { > + EDEB_ERR(4, "No ibm,loc-code property for" > + " lhca Open Firmware device tree node."); > + goto get_adapter_handle0; > + } > + > + if (strcmp(loc, adapter) == 0) { > + u64data = > + (u64 *) get_property(dn, "ibm,hca-handle", NULL); > + break; > + } > + } > + > + if (u64data == NULL) { > + EDEB_ERR(4, "No ibm,hca-handle property for" > + " lhca Open Firmware device tree node with" > + " ibm,loc-code: %s.", adapter); > + goto get_adapter_handle0; > + } > + > + ret = *u64data; > + > + get_adapter_handle0: > + of_node_put(dn); > + > + EDEB_EX(7, "ret=%lx",ret); > + > + return ret; > +} > diff --git a/drivers/infiniband/hw/ehca/hcp_sense.h b/drivers/ > infiniband/hw/ehca/hcp_sense.h > new file mode 100644 > index 0000000..a49040b > --- /dev/null > +++ b/drivers/infiniband/hw/ehca/hcp_sense.h > @@ -0,0 +1,136 @@ > +/* > + * IBM eServer eHCA Infiniband device driver for Linux on POWER > 
+ * > + * ehca detection and query code for POWER > + * > + * Authors: Heiko J Schick > + * > + * Copyright (c) 2005 IBM Corporation > + * > + * All rights reserved. > + * > + * This source code is distributed under a dual license of GPL > v2.0 and OpenIB > + * BSD. > + * > + * OpenIB BSD License > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following > conditions are met: > + * > + * Redistributions of source code must retain the above copyright > notice, this > + * list of conditions and the following disclaimer. > + * > + * Redistributions in binary form must reproduce the above > copyright notice, > + * this list of conditions and the following disclaimer in the > documentation > + * and/or other materials > + * provided with the distribution. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND > CONTRIBUTORS "AS IS" > + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > LIMITED TO, THE > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A > PARTICULAR PURPOSE > + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR > CONTRIBUTORS BE > + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, > EXEMPLARY, OR > + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, > PROCUREMENT OF > + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR > + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF > LIABILITY, WHETHER > + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR > OTHERWISE) > + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF > ADVISED OF THE > + * POSSIBILITY OF SUCH DAMAGE. 
> + * > + * $Id: hcp_sense.h,v 1.11 2006/02/06 10:17:34 schickhj Exp $ > + */ > + > +#ifndef HCP_SENSE_H > +#define HCP_SENSE_H > + > +int hipz_count_adapters(void); > +int hipz_probe_adapters(char **adapter_list); > +u64 hipz_get_adapter_handle(char *adapter); > + > +/* query hca response block */ > +struct query_hca_rblock { > + u32 cur_reliable_dg; > + u32 cur_qp; > + u32 cur_cq; > + u32 cur_eq; > + u32 cur_mr; > + u32 cur_mw; > + u32 cur_ee_context; > + u32 cur_mcast_grp; > + u32 cur_qp_attached_mcast_grp; > + u32 reserved1; > + u32 cur_ipv6_qp; > + u32 cur_eth_qp; > + u32 cur_hp_mr; > + u32 reserved2[3]; > + u32 max_rd_domain; > + u32 max_qp; > + u32 max_cq; > + u32 max_eq; > + u32 max_mr; > + u32 max_hp_mr; > + u32 max_mw; > + u32 max_mrwpte; > + u32 max_special_mrwpte; > + u32 max_rd_ee_context; > + u32 max_mcast_grp; > + u32 max_qps_attached_all_mcast_grp; > + u32 max_qps_attached_mcast_grp; > + u32 max_raw_ipv6_qp; > + u32 max_raw_ethy_qp; > + u32 internal_clock_frequency; > + u32 max_pd; > + u32 max_ah; > + u32 max_cqe; > + u32 max_wqes_wq; > + u32 max_partitions; > + u32 max_rr_ee_context; > + u32 max_rr_qp; > + u32 max_rr_hca; > + u32 max_act_wqs_ee_context; > + u32 max_act_wqs_qp; > + u32 max_sge; > + u32 max_sge_rd; > + u32 memory_page_size_supported; > + u64 max_mr_size; > + u32 local_ca_ack_delay; > + u32 num_ports; > + u32 vendor_id; > + u32 vendor_part_id; > + u32 hw_ver; > + u64 node_guid; > + u64 hca_cap_indicators; > + u32 data_counter_register_size; > + u32 max_shared_rq; > + u32 max_isns_eq; > + u32 max_neq; > +} __attribute__ ((packed)); > + > +/* query port response block */ > +struct query_port_rblock { > + u32 state; > + u32 bad_pkey_cntr; > + u32 lmc; > + u32 lid; > + u32 subnet_timeout; > + u32 qkey_viol_cntr; > + u32 sm_sl; > + u32 sm_lid; > + u32 capability_mask; > + u32 init_type_reply; > + u32 pkey_tbl_len; > + u32 gid_tbl_len; > + u64 gid_prefix; > + u32 port_nr; > + u16 pkey_entries[16]; > + u8 reserved1[32]; > + u32 trent_size; > 
+ u32 trbuf_size; > + u64 max_msg_sz; > + u32 max_mtu; > + u32 vl_cap; > + u8 reserved2[1900]; > + u64 guid_entries[255]; > +} __attribute__ ((packed)); > + > +#endif > _______________________________________________ > openib-general mailing list > openib-general at openib.org > http://openib.org/mailman/listinfo/openib-general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/ > openib-general > From mulix at mulix.org Sat Feb 18 23:26:31 2006 From: mulix at mulix.org (Muli Ben-Yehuda) Date: Sat, 18 Feb 2006 14:26:31 +0200 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: <20060218122011.GE911@infradead.org> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> Message-ID: <20060218122631.GA30535@granada.merseine.nu> On Sat, Feb 18, 2006 at 12:20:11PM +0000, Christoph Hellwig wrote: > > Well, the eHCA guys tell me that they can't post patches to lkml. > > Then they lie. And not posting to lkml is a good reason not to merge > an otherwise perfect driver. (which this one is far from) I don't speak for IBM or the authors, but there are perfectly reasonable reasons to ask someone else to post a patch on your behalf - including but not limited to to only being able to use Lotus Notes with one's IBM email. I'm sure you've all seen the travesties that Notes inflicts on inline patches. Cheers, Muli -- Muli Ben-Yehuda http://www.mulix.org | http://mulix.livejournal.com/ From mostrows at watson.ibm.com Sun Feb 19 01:24:44 2006 From: mostrows at watson.ibm.com (Michal Ostrowski) Date: Sat, 18 Feb 2006 09:24:44 -0500 Subject: [PATCH] Fix race condition in hvc console. Message-ID: tty_schedule_flip() would schedule a thread that would call flush_to_ldisc(). 
If tty_buffer_request_room() gets called prior to that thread running -- which is likely in this loop in hvc_poll(), it would set the active flag in the tty buffer and consequently flush_to_ldisc() would ignore it. The result is that input on the hvc console is not processed. This fix calls tty_flip_buffer_push (and flags the tty as "low_latency"). The push to the ldisc thus happens synchronously. Signed-off-by: Michal Ostrowski --- drivers/char/hvc_console.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) 1d719e2972f0c02d62a428aa84ca60793ad79666 diff --git a/drivers/char/hvc_console.c b/drivers/char/hvc_console.c index 1994a92..67f368f 100644 --- a/drivers/char/hvc_console.c +++ b/drivers/char/hvc_console.c @@ -335,6 +335,8 @@ static int hvc_open(struct tty_struct *t } /* else count == 0 */ tty->driver_data = hp; + tty->low_latency = 1; /* Makes flushes to ldisc synchronous. */ + hp->tty = tty; /* Save for request_irq outside of spin_lock. */ irq = hp->irq; @@ -633,9 +635,6 @@ static int hvc_poll(struct hvc_struct *h tty_insert_flip_char(tty, buf[i], 0); } - if (count) - tty_schedule_flip(tty); - /* * Account for the total amount read in one loop, and if above * 64 bytes, we do a quick schedule loop to let the tty grok @@ -656,6 +655,10 @@ static int hvc_poll(struct hvc_struct *h bail: spin_unlock_irqrestore(&hp->lock, flags); + if (read_total) { + tty_flip_buffer_push(tty); + } + return poll_mask; } -- 1.1.4.g0b63-dirty From sid at us.ibm.com Sun Feb 19 01:51:43 2006 From: sid at us.ibm.com (Sidney Manning) Date: Sat, 18 Feb 2006 08:51:43 -0600 Subject: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator In-Reply-To: <20060217183254.GA3951@lst.de> Message-ID: This patch is not intended for mainline inclusion. It is intended to cover up an assembler bug that is unique to the cross toolchain we use to compile the kernel for the simulator, "Fatal error: Neither Power nor PowerPC opcodes were selected." 
The build was selecting both -mno-altivec and -maltivec, and that combination was the cause of the above build-time error; -mcellppu overrode all of that. Sidney Manning -- IBM-STI Design Center Austin, TX sid at us.ibm.com -- (512) 838-1125, TL/678-1125 Christoph Hellwig, 02/17/2006 12:32 PM. To: Utz Bacher. cc: linuxppc64-dev at ozlabs.org, Sidney Manning/Austin/IBM at IBMUS, arndb at de.ibm.com. Subject: Re: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator > + > +ifneq ($(CROSS_COMPILE),) > +cpu-as-$(CONFIG_PPC_CELL) += -Wa,-mcellppu > +endif the CROSS_COMPILE setting is wrong. cross-compilation should not affect selection of assembler flags. > + > cpu-as-$(CONFIG_PPC64BRIDGE) += -Wa,-mppc64bridge > cpu-as-$(CONFIG_4xx) += -Wa,-m405 > cpu-as-$(CONFIG_6xx) += -Wa,-maltivec From hch at lst.de Sun Feb 19 02:09:20 2006 From: hch at lst.de (Christoph Hellwig) Date: Sat, 18 Feb 2006 16:09:20 +0100 Subject: [openib-general] [PATCH 08/22] Generic ehca headers In-Reply-To: <20060218005723.13620.10389.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005723.13620.10389.stgit@localhost.localdomain> Message-ID: <20060218150920.GA23817@lst.de> On Fri, Feb 17, 2006 at 04:57:23PM -0800, Roland Dreier wrote: > From: Roland Dreier > > The defines of TRUE and FALSE look rather useless. Why are they needed? > > What is struct ehca_cache for? It doesn't seem to be used anywhere. > > ehca_kv_to_g() looks completely horrible. The whole idea of using > vmalloc()ed kernel memory to do DMA seems unacceptable to me. When you want to do scatter-gather dma on kernel-virtual contiguous areas, allocate the pages individually and map them into kva using vmap(). Then dma can be performed using dma_map_page, or in case you have lots of pages dma_map_sg after creating an S/G list.
From rdreier at cisco.com Sun Feb 19 03:02:33 2006 From: rdreier at cisco.com (Roland Dreier) Date: Sat, 18 Feb 2006 08:02:33 -0800 Subject: [openib-general] [PATCH 04/22] OF adapter probing In-Reply-To: (Heiko J. Schick's message of "Sat, 18 Feb 2006 13:46:10 +0100") References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005712.13620.82908.stgit@localhost.localdomain> Message-ID: Heiko> Hello Roland, sorry, this file is not used anymore. The Heiko> functions OK, please delete it from the svn tree. Thanks, Roland From rdreier at cisco.com Sun Feb 19 03:32:28 2006 From: rdreier at cisco.com (Roland Dreier) Date: Sat, 18 Feb 2006 08:32:28 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: <1140265955.4035.19.camel@laptopd505.fenrus.org> (Arjan van de Ven's message of "Sat, 18 Feb 2006 13:32:35 +0100") References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> Message-ID: Arjan> The bigger issue is: if people can't be bothered to do Arjan> those steps, why would they be bothered to do this for Arjan> maintenance and bugfixes etc etc? Basically it's now Arjan> already a de-facto unmaintained driver.... I don't think that's really a fair statement. The IBM people have been active and responsive in maintaining their driver in the openib.org svn tree. However, they asked me to post their driver for review because it would be difficult for them to do it. IBM people: can you clarify the restrictions you have? Why do you feel you can't post your own driver for review? Will you be able to post smaller patches to lkml in the future if the driver is merged?
Thanks, Roland From arjan at infradead.org Sun Feb 19 04:02:42 2006 From: arjan at infradead.org (Arjan van de Ven) Date: Sat, 18 Feb 2006 18:02:42 +0100 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> Message-ID: <1140282163.6514.7.camel@laptopd505.fenrus.org> On Sat, 2006-02-18 at 08:32 -0800, Roland Dreier wrote: > Arjan> The bigger issue is: if people can't be bothered to do > Arjan> those steps, why would they be bothered to do this for > Arjan> maintenance and bugfixes etc etc? Basically it's now > Arjan> already a de-facto unmaintained driver.... > > I don't think that's really a fair statement. It's a concern at least; if they're just having trouble posting really big files that's one thing.. if they're not allowed to post at all that's another. > IBM people: can you clarify the restrictions you have? Why do you > feel you can't post your own driver for review? Will you be able to > post smaller patches to lkml in the future if the driver is merged? And can you respond to questions and user questions on lkml? From greg at kroah.com Sun Feb 19 05:15:09 2006 From: greg at kroah.com (Greg KH) Date: Sat, 18 Feb 2006 10:15:09 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device. 
In-Reply-To: References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> Message-ID: <20060218181509.GA892@kroah.com> On Sat, Feb 18, 2006 at 08:32:28AM -0800, Roland Dreier wrote: > Arjan> The bigger issue is: if people can't be bothered to do > Arjan> those steps, why would they be bothered to do this for > Arjan> maintenance and bugfixes etc etc? Basically it's now > Arjan> already a de-facto unmaintained driver.... > > I don't think that's really a fair statement. The IBM people have > been active and responsive in maintaining their driver in the > openib.org svn tree. However, they asked me to post their driver for > review because it would be difficult for them to do it. Checking stuff into a private svn tree is vastly different from posting to lkml in public. In fact, it looks like the svn tree is so far ahead of the in-kernel stuff, that most people are just using it instead of the in-kernel code. I know at least one company has asked a distro to just accept the svn snapshot over the in-kernel IB code, which makes me wonder if the in-kernel stuff is even useful to people? Why have it, if companies insist on using the out-of-tree stuff instead? thanks, greg k-h From hch at infradead.org Sun Feb 19 05:19:32 2006 From: hch at infradead.org (Christoph Hellwig) Date: Sat, 18 Feb 2006 18:19:32 +0000 Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218181509.GA892@kroah.com> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> <20060218181509.GA892@kroah.com> Message-ID: <20060218181932.GA6410@infradead.org> On Sat, Feb 18, 2006 at 10:15:09AM -0800, Greg KH wrote: > On Sat, Feb 18, 2006 at 08:32:28AM -0800, Roland Dreier wrote: > > Arjan> The bigger issue is: if people can't be bothered to do > > Arjan> those steps, why would they be bothered to do this for > > Arjan> maintenance and bugfixes etc etc? Basically it's now > > Arjan> already a de-facto unmaintained driver.... > > > > I don't think that's really a fair statement. The IBM people have > > been active and responsive in maintaining their driver in the > > openib.org svn tree. However, they asked me to post their driver for > > review because it would be difficult for them to do it. > > Checking stuff into a private svn tree is vastly different from posting > to lkml in public. In fact, it looks like the svn tree is so far ahead > of the in-kernel stuff, that most people are just using it instead of > the in-kernel code. > > I know at least one company has asked a distro to just accept the svn > snapshot over the in-kernel IB code, which makes me wonder if the > in-kernel stuff is even useful to people? Why have it, if companies > insist on using the out-of-tree stuff instead? The openib tree isn't private. It's mostly just a staging area for development. Any company that wants it included into a distro release is completely clueless. From rdreier at cisco.com Sun Feb 19 05:52:58 2006 From: rdreier at cisco.com (Roland Dreier) Date: Sat, 18 Feb 2006 10:52:58 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218181509.GA892@kroah.com> (Greg KH's message of "Sat, 18 Feb 2006 10:15:09 -0800") References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> <20060218181509.GA892@kroah.com> Message-ID: Greg> Checking stuff into a private svn tree is vastly different Greg> from posting to lkml in public. In fact, it looks like the Greg> svn tree is so far ahead of the in-kernel stuff, that most Greg> people are just using it instead of the in-kernel code. It's not a private svn tree -- the IBM ehca development is available to anyone via svn at https://openib.org/svn/gen2/trunk/src/linux-kernel/infiniband/hw/ehca Greg> I know at least one company has asked a distro to just Greg> accept the svn snapshot over the in-kernel IB code, which Greg> makes me wonder if the in-kernel stuff is even useful to Greg> people? Why have it, if companies insist on using the Greg> out-of-tree stuff instead? The IB driver stack is still in its early stages, so although I'm pushing for things to be merged as fast as possible, the unfortunate fact is that lots of things that people want to use (including the IBM ehca driver) are not upstream and are not ready to go upstream yet. But that doesn't mean we should give up on merging them. Distro politics are just distro politics -- and there will always be pressure on distros to ship stuff that's not upstream yet. - R. From greg at kroah.com Sun Feb 19 06:53:27 2006 From: greg at kroah.com (Greg KH) Date: Sat, 18 Feb 2006 11:53:27 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device. 
In-Reply-To: References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> <20060218181509.GA892@kroah.com> Message-ID: <20060218195327.GA1382@kroah.com> On Sat, Feb 18, 2006 at 10:52:58AM -0800, Roland Dreier wrote: > Greg> Checking stuff into a private svn tree is vastly different > Greg> from posting to lkml in public. In fact, it looks like the > Greg> svn tree is so far ahead of the in-kernel stuff, that most > Greg> people are just using it instead of the in-kernel code. > > It's not a private svn tree -- the IBM ehca development is available > to anyone via svn at https://openib.org/svn/gen2/trunk/src/linux-kernel/infiniband/hw/ehca Sorry, I didn't mean to say "private", but rather, "separate". Doing kernel development in a separate development tree from the mainline kernel is very problematic, as has been documented many times in the past. > Distro politics are just distro politics -- and there will always be > pressure on distros to ship stuff that's not upstream yet. Luckily the distros know better than to accept this anymore, as they have been burned too many times in the past... thanks, greg k-h From rdreier at cisco.com Sun Feb 19 08:31:52 2006 From: rdreier at cisco.com (Roland Dreier) Date: Sat, 18 Feb 2006 13:31:52 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218195327.GA1382@kroah.com> (Greg KH's message of "Sat, 18 Feb 2006 11:53:27 -0800") References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005707.13620.20538.stgit@localhost.localdomain> <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> <20060218181509.GA892@kroah.com> <20060218195327.GA1382@kroah.com> Message-ID: Greg> Sorry, I didn't mean to say "private", but rather, Greg> "separate". Doing kernel development in a separate Greg> development tree from the mainline kernel is very Greg> problematic, as has been documented many times in the past. As a general rule I agree with that. However, the openib svn tree we're talking about is not some project that is off in space never merging with the kernel; as Christoph said, it's really just a staging area for stuff that isn't ready for upstream yet. Perhaps it would be more politically correct to use git to develop kernel code, but in the end that's really just a technical difference that shouldn't matter. Roland> Distro politics are just distro politics -- and there will Roland> always be pressure on distros to ship stuff that's not Roland> upstream yet. Greg> Luckily the distros know better than to accept this anymore, Greg> as they have been burned too many times in the past... OK, that's great. But now I don't understand your original point. You say there are people putting pressure on distros to ship what's in openib svn rather than the upstream kernel, but if the distros are going to ignore them, what does it matter? And this thread started with me trying to help the IBM people make progress towards merging a big chunk of that svn tree upstream. That should make you happy, right? - R.
From sfr at canb.auug.org.au Sun Feb 19 10:23:35 2006 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Sun, 19 Feb 2006 10:23:35 +1100 Subject: [PATCH] Fix race condition in hvc console. In-Reply-To: References: Message-ID: <20060219102335.37cda813.sfr@canb.auug.org.au> Hi Michal, On Sat, 18 Feb 2006 09:24:44 -0500 Michal Ostrowski wrote: > > + if (read_total) { > + tty_flip_buffer_push(tty); > + } A small nit: please don't add these unnecessary '{}' pairs. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From greg at kroah.com Sun Feb 19 10:29:34 2006 From: greg at kroah.com (Greg KH) Date: Sat, 18 Feb 2006 15:29:34 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: References: <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> <20060218181509.GA892@kroah.com> <20060218195327.GA1382@kroah.com> Message-ID: <20060218232934.GA2624@kroah.com> On Sat, Feb 18, 2006 at 01:31:52PM -0800, Roland Dreier wrote: > Greg> Sorry, I didn't mean to say "private", but rather, > Greg> "separate". Doing kernel development in a separate > Greg> development tree from the mainline kernel is very > Greg> problematic, as has been documented many times in the past. > > As a general rule I agree with that. However, the openib svn tree > we're talking about is not some project that is off in space never > merging with the kernel; as Christoph said, it's really just a staging > area for stuff that isn't ready for upstream yet. > > Perhaps it would be more politically correct to use git to develop > kernel code, but in the end that's really just a technical difference > that shouldn't matter. Yes, that doesn't matter. But it seems that the svn tree is vastly different from the in-kernel code. So much so that some companies feel that the in-kernel stuff just isn't worth running at all.
> Roland> Distro politics are just distro politics -- and there will > Roland> always be pressure on distros to ship stuff that's not > Roland> upstream yet. > > Greg> Luckily the distros know better than to accept this anymore, > Greg> as they have been burned too many times in the past... > > OK, that's great. But now I don't understand your original point. > You say there are people putting pressure on distros to ship what's in > openib svn rather than the upstream kernel, but if the distros are > going to ignore them, what does it matter? It takes a _lot_ of effort to ignore them, as it's very difficult to do so. Especially when companies try to play the different distros off of each other, but that's not an issue that the mainline kernel developers need to worry about :) > And this thread started with me trying to help the IBM people make > progress towards merging a big chunk of that svn tree upstream. That > should make you happy, right? Yes, that does make me happy. But it doesn't make me happy to see IBM not being able to participate in kernel development by posting and defending their own code to lkml. I thought IBM knew better than that... thanks, greg k-h From rdreier at cisco.com Sun Feb 19 11:09:31 2006 From: rdreier at cisco.com (Roland Dreier) Date: Sat, 18 Feb 2006 16:09:31 -0800 Subject: [PATCH 02/22] Firmware interface code for IB device. In-Reply-To: <20060218232934.GA2624@kroah.com> (Greg KH's message of "Sat, 18 Feb 2006 15:29:34 -0800") References: <20060218015808.GB17653@kroah.com> <20060218122011.GE911@infradead.org> <20060218122631.GA30535@granada.merseine.nu> <1140265955.4035.19.camel@laptopd505.fenrus.org> <20060218181509.GA892@kroah.com> <20060218195327.GA1382@kroah.com> <20060218232934.GA2624@kroah.com> Message-ID: Greg> Yes, that doesn't matter. But it seems that the svn tree is Greg> vastly different from the in-kernel code. So much so that Greg> some companies feel that the in-kernel stuff just isn't Greg> worth running at all. 
I don't want to belabor this issue... but the svn tree is not vastly different than what's in the kernel. It has some things that aren't upstream yet, and which are important to some people. For example, the IBM ehca driver we're talking about, as well as the PathScale driver, SDP (sockets direct protocol), etc. It just takes time for this new code to get to the point where both the developers of the new stuff feel it's ready to be merged, and the kernel community agrees that it should be merged. Greg> Yes, that does make me happy. But it doesn't make me happy Greg> to see IBM not being able to participate in kernel Greg> development by posting and defending their own code to lkml. Greg> I thought IBM knew better than that... Agreed. But let's not get sidetracked on that internal IBM issue. The ehca developers have assured me that they can and will participate in the thread reviewing their driver. It seems like it's better for me to help them work around their internal problems by acting as a proxy, than for me to delay merging their driver just because someone in IBM management is clueless. - R. From paulus at samba.org Sun Feb 19 22:52:31 2006 From: paulus at samba.org (Paul Mackerras) Date: Sun, 19 Feb 2006 22:52:31 +1100 Subject: [PATCH] powerpc: ARCH=powerpc build fix for CONFIG_SYSVIPC=n || CONFIG_SYSCTL=n In-Reply-To: <20060218100849.GA1869@krypton> References: <20060218100849.GA1869@krypton> Message-ID: <17400.23551.904754.47979@cargo.ozlabs.ibm.com> Arthur Othieno writes: > --- a/arch/powerpc/kernel/sys_ppc32.c > +++ b/arch/powerpc/kernel/sys_ppc32.c > @@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first, > > return -ENOSYS; > } > -#endif > +#else > +long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr, > + u32 fifth) > +{ > + return -ENOSYS; > +} > +#endif /* CONFIG_SYSVIPC */ Can't we just add a couple of cond_syscall lines to kernel/sys_ni.c instead? Paul. 
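For readers unfamiliar with the cond_syscall idea Paul refers to, the following is a minimal user-space sketch of the underlying weak-symbol fallback, assuming GCC or Clang on an ELF target. The names demo_cond_syscall and compat_sys_ipc_demo are hypothetical and chosen only for illustration; the in-kernel macro is defined per architecture and may well be implemented differently, so treat this as an analogy rather than the kernel's code.

```c
/* User-space sketch of the weak-symbol fallback behind cond_syscall().
 * Assumptions: GCC or Clang on an ELF target; demo_cond_syscall and
 * compat_sys_ipc_demo are hypothetical names, not kernel symbols. */
#include <errno.h>

/* Stand-in for the kernel's sys_ni_syscall(): always "not implemented". */
long sys_ni_syscall(void)
{
	return -ENOSYS;
}

/* Define `name` as a weak alias of sys_ni_syscall. A strong definition of
 * `name` in another object file would take precedence at link time. */
#define demo_cond_syscall(name) \
	long name(void) __attribute__((weak, alias("sys_ni_syscall")))

/* No strong implementation is linked in here, so calls to
 * compat_sys_ipc_demo() end up in sys_ni_syscall(). */
demo_cond_syscall(compat_sys_ipc_demo);
```

With this arrangement no #else stub is needed in the file that conditionally provides the real function: when the real implementation is configured out, the weak alias supplies the -ENOSYS fallback.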
From laforge at gnumonks.org Sun Feb 19 22:45:32 2006 From: laforge at gnumonks.org (Harald Welte) Date: Sun, 19 Feb 2006 12:45:32 +0100 Subject: PowerMac11,2 sound questions Message-ID: <20060219114532.GA30498@sunbeam.de.gnumonks.org> Hi! Since I recently got a Quad G5, and paulus/benh were too fast for me to hack on the fan control, I was looking for something else that is missing. Apparently there is no sound support for those machines yet. Apple seems to call the sound architecture of those boxes 'onyx', and a quick look at http://darwinsource.opendarwin.org/10.4.5.ppc/AppleOnboardAudio-256.2.5/AppleOnboardAudio/ revealed that all onyx specific bits are not present in the source code :( Does anyone have more information on what needs to be done / what is missing for getting sound support on those devices? [yes, I'm well aware of the long-standing i2s/infrastructure/ubuntu-bounty/... discussion, but that's not what I'm asking about] Thanks! -- - Harald Welte http://gnumonks.org/ ============================================================================ "Privacy in residential applications is a desirable marketing option." (ETSI EN 300 175-7 Ch. A6) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060219/336495f4/attachment.pgp From benh at kernel.crashing.org Mon Feb 20 09:14:53 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 20 Feb 2006 09:14:53 +1100 Subject: PowerMac11,2 sound questions In-Reply-To: <20060219114532.GA30498@sunbeam.de.gnumonks.org> References: <20060219114532.GA30498@sunbeam.de.gnumonks.org> Message-ID: <1140387293.32374.39.camel@localhost.localdomain> On Sun, 2006-02-19 at 12:45 +0100, Harald Welte wrote: > Hi! 
> > Since I recently got a Quad G5, and paulus/benh were too fast for me to > hack on the fan control, I was looking for something else that is > missing. > > Apparently there is no sound support for those machines yet. Apple > seems to call the sound architecture of those boxes 'onyx', and a quick > look at > http://darwinsource.opendarwin.org/10.4.5.ppc/AppleOnboardAudio-256.2.5/AppleOnboardAudio/ > revealed that all onyx specific bits are not present in the source code > :( > > Does anyone have more information on what needs to be done / what is > missing for getting sound support on those devices? > > [yes, I'm well aware of the long-standing > i2s/infrastructure/ubuntu-bounty/... discussion, but that's not what I'm > asking about] Those machines have dual codecs (Onyx + Topaz). Onyx is a PCM3052 (TI afaik) and I have a spec. Topaz is a CS84xx (Darwin knows at least 3 models, CS8406, CS8416, CS8420), spec available online. The main issue right now is that the current driver can't properly handle multiple codecs and multiple i2s busses, along with all the various bits & pieces that are already barely working and need serious rework. I've had plans for some time to rewrite the sound driver (at least for newer architectures based on layoutID) but didn't have time yet to seriously begin work on it. Among the things that need to be done is proper usage of platform-do-* functions for things like GPIO manipulations (Ben Collins did some work on that already), better "objectisation" of the whole driver so we can properly instantiate sound busses (I think up to 2 i2s busses can be used, maybe 3) and codecs, with a generic callback system for things like clock changes etc... (when using digital inputs, the bus clocking and other codecs must adapt to changes to the digital input clock) etc... Ben.
From huangjq at cn.ibm.com Mon Feb 20 12:14:54 2006 From: huangjq at cn.ibm.com (Jin Qi Huang) Date: Mon, 20 Feb 2006 09:14:54 +0800 Subject: Kernel oops then panic when perform a soft reset on ppc64 box Message-ID: Hi all, When I perform a soft reset on HMC console to a ppc64 box, the kernel oops then panic, here is the procedure to reproduce it: 1. machine hardware environment: # cat /proc/cpuinfo processor : 0 cpu : POWER4 (gp) clock : 1002.296504MHz revision : 3.2 processor : 1 cpu : POWER4 (gp) clock : 1002.296504MHz revision : 3.2 timebase : 125287063 machine : CHRP IBM,7028-6C4 2. machine software environment: # uname -a Linux mcptest4 2.6.5-279 #2 SMP Thu Feb 9 21:21:11 UTC 2006 ppc64 ppc64 ppc64 GNU/Linux 3. on HMC console perform a soft reset: $ chsysstate -m plinuxt4 -r lpar -n lpar1 -o reset 4. on the HMC virtual terminal give the kernel oops and panic message: Oops: System Reset, sig: 0 [#1] SMP NR_CPUS=32 PSERIES LPAR NIP: C000000000013B5C XER: 0000000020000000 LR: C000000000013B9C REGS: c00000000053fad0 TRAP: 0100 Not tainted (2.6.5-279 ) MSR: 8000000000009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11 TASK: c0000000005d3a20[0] 'swapper' THREAD: c00000000053c000 CPU: 0 GPR00: 0000000000000010 C00000000053FD50 C00000000071EAB8 C0000000BB1CD800 GPR04: 0000000000000007 0000000000000000 C00000000053FC30 0000000000000000 GPR08: 0000000000000000 0000000000000000 C00000000071D008 C00000000053C000 GPR12: 0000000042004028 C000000000541000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000230000 0000000000000000 0000000000000000 0000000003A00000 GPR24: C000000000541000 C00000000071D008 C000000000539AF0 0000000000008000 GPR28: 0000000000000010 0000000000000008 C00000000053C000 C00000000053C010 NIP [c000000000013b5c] .default_idle+0x64/0xac LR [c000000000013b9c] .default_idle+0xa4/0xac Call Trace: [c00000000053fd50] [c000000000013b9c] .default_idle+0xa4/0xac (unreliable) [c00000000053fde0] 
[c00000000001398c] .cpu_idle+0x38/0x50
[c00000000053fe50] [c00000000000c49c] .rest_init+0x64/0x7c
[c00000000053fed0] [c0000000004ee5dc] .start_kernel+0x2b4/0x330
[c00000000053ff90] [c00000000000c394] .__setup_cpu_power3+0x0/0x4
<0>Fatal exception: panic in 5 seconds
et, sig: 0 [#2]
SMP NR_CPUS=32 PSERIES LPAR
NIP: C000000000013B5C XER: 0000000020000000 LR: C000000000013B9C
REGS: c0000000bff07b80 TRAP: 0100 Not tainted (2.6.5-279 )
MSR: 8000000000009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK: c00000000397c9b0[0] 'swapper' THREAD: c0000000bff04000 CPU: 1
GPR00: 0000000000000010 C0000000BFF07E00 C00000000071EAB8 C0000000BC181000
GPR04: 0000000000000007 0000000000000000 C0000000BFF07CE0 0000000000000000
GPR08: 0000000000000000 0000000000000000 C00000000071D008 C0000000BFF04000
GPR12: 0000000044004028 C000000000543000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000C00000 0000000000000000 0000000000000001
GPR24: 0000000000000001 0000000000000010 0000000000000568 000000000000041C
GPR28: 0000000000000010 0000000000000008 C0000000BFF04000 C0000000BFF04010
NIP [c000000000013b5c] .default_idle+0x64/0xac
LR [c000000000013b9c] .default_idle+0xa4/0xac
Call Trace:
[c0000000bff07e00] [c000000000013b9c] .default_idle+0xa4/0xac (unreliable)
[c0000000bff07e90] [c00000000001398c] .cpu_idle+0x38/0x50
[c0000000bff07f00] [c00000000003ed78] .start_secondary+0x148/0x1a8
[c0000000bff07f90] [c00000000000c03c] .enable_64b_mode+0x0/0x28
<0>Fatal exception: panic in 5 seconds
Kernel panic: Fatal exception
In idle task - not syncing

From the kernel code, when the user performs a soft reset it raises a system reset exception, which invokes the exception handler SystemResetException and then goes on to die. Must the system die when it receives a soft reset? Thanks!

--
Regards,
Jin Qi Huang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060220/a7024a8b/attachment.htm From paulus at samba.org Mon Feb 20 13:03:31 2006 From: paulus at samba.org (Paul Mackerras) Date: Mon, 20 Feb 2006 13:03:31 +1100 Subject: Kernel oops then panic when perform a soft reset on ppc64 box In-Reply-To: References: Message-ID: <17401.9075.295712.950980@cargo.ozlabs.ibm.com> Jin Qi Huang writes: > When I perform a soft reset on HMC console to a ppc64 box, the kernel oops > then panic, here is the procedure to reproduce it: That's normal, what did you expect it to do? Paul. From huangjq at cn.ibm.com Mon Feb 20 13:34:16 2006 From: huangjq at cn.ibm.com (Jin Qi Huang) Date: Mon, 20 Feb 2006 10:34:16 +0800 Subject: Kernel oops then panic when perform a soft reset on ppc64 box In-Reply-To: <17401.9075.295712.950980@cargo.ozlabs.ibm.com> Message-ID: Hi Paul, Would you please give me some detailed information about what happens when we perform a soft reset and why the system must go to die? I am a youngster to POWER architecture, thanks! -- Regards, Jin Qi Huang Paul Mackerras 2006-02-20 10:03 To Jin Qi Huang/China/Contr/IBM at IBMCN cc linuxppc64-dev at ozlabs.org Subject Re: Kernel oops then panic when perform a soft reset on ppc64 box Jin Qi Huang writes: > When I perform a soft reset on HMC console to a ppc64 box, the kernel oops > then panic, here is the procedure to reproduce it: That's normal, what did you expect it to do? Paul. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060220/9fd43edc/attachment.htm From david at gibson.dropbear.id.au Mon Feb 20 14:05:56 2006 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 20 Feb 2006 14:05:56 +1100 Subject: powerpc: Fixup for STRICT_MM_TYPECHECKS Message-ID: <20060220030556.GC24457@localhost.localdomain> Paulus, please apply (for post 2.6.16, I guess). 
Currently ARCH=powerpc will not compile when STRICT_MM_TYPECHECKS is turned on and CONFIG_64K_PAGES is turned off. The patch below corrects the problem. Signed-off-by: David Gibson Index: working-2.6/include/asm-powerpc/pgtable-4k.h =================================================================== --- working-2.6.orig/include/asm-powerpc/pgtable-4k.h 2006-01-16 13:02:29.000000000 +1100 +++ working-2.6/include/asm-powerpc/pgtable-4k.h 2006-02-20 13:53:57.000000000 +1100 @@ -62,9 +62,14 @@ /* shift to put page number into pte */ #define PTE_RPN_SHIFT (17) -#define __real_pte(e,p) ((real_pte_t)(e)) -#define __rpte_to_pte(r) (r) -#define __rpte_to_hidx(r,index) (pte_val((r)) >> 12) +#ifdef STRICT_MM_TYPECHECKS +#define __real_pte(e,p) ((real_pte_t){(e)}) +#define __rpte_to_pte(r) ((r).pte) +#else +#define __real_pte(e,p) (e) +#define __rpte_to_pte(r) (__pte(r)) +#endif +#define __rpte_to_hidx(r,index) (pte_val(__rpte_to_pte(r)) >> 12) #define pte_iterate_hashed_subpages(rpte, psize, va, index, shift) \ do { \ -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From sfr at canb.auug.org.au Mon Feb 20 15:32:26 2006 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Mon, 20 Feb 2006 15:32:26 +1100 Subject: [PATCH] Fix compile for CONFIG_SYSVIPC=n or CONFIG_SYSCTL=n In-Reply-To: <17400.23551.904754.47979@cargo.ozlabs.ibm.com> References: <20060218100849.GA1869@krypton> <17400.23551.904754.47979@cargo.ozlabs.ibm.com> Message-ID: <20060220153226.30ee4b13.sfr@canb.auug.org.au> The compat syscalls are added to sys_ni.c since they are not defined if the above CONFIG options are off. Also, nfs would not build with CONFIG_SYSCTL off. Noticed by Arthur Othieno. 
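A minimal sketch of why NFS fails to build with CONFIG_SYSCTL off, assuming a caller that consumes the return value of nfs_register_sysctl(). A `do { } while(0)` stub is a statement and cannot appear where a value is expected, whereas a stub that expands to 0 can. The HAVE_SYSCTL switch and the init_fs()/exit_fs() callers are hypothetical stand-ins, not taken from the patch:

```c
/* HAVE_SYSCTL is a hypothetical stand-in for CONFIG_SYSCTL. */
#ifdef HAVE_SYSCTL
int nfs_register_sysctl(void);
void nfs_unregister_sysctl(void);
#else
/* Expression stub: valid in "err = nfs_register_sysctl();". */
#define nfs_register_sysctl()   0
/* Statement stub: fine here because no value is consumed. */
#define nfs_unregister_sysctl() do { } while (0)
#endif

/* Hypothetical caller that checks the registration result. */
static int init_fs(void)
{
    int err = nfs_register_sysctl();

    if (err)
        return err;
    return 0;
}

/* Hypothetical caller that only needs the side effect. */
static void exit_fs(void)
{
    nfs_unregister_sysctl();
}
```

With the old `do { } while(0)` definition of nfs_register_sysctl(), the assignment in init_fs() would not compile; with the `0` stub it does, and init_fs() simply reports success.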
Signed-off-by: Stephen Rothwell --- include/linux/nfs_fs.h | 2 +- kernel/sys_ni.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletions(-) On Sun, 19 Feb 2006 22:52:31 +1100 Paul Mackerras wrote: > > Arthur Othieno writes: > > > --- a/arch/powerpc/kernel/sys_ppc32.c > > +++ b/arch/powerpc/kernel/sys_ppc32.c > > @@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first, > > > > return -ENOSYS; > > } > > -#endif > > +#else > > +long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr, > > + u32 fifth) > > +{ > > + return -ENOSYS; > > +} > > +#endif /* CONFIG_SYSVIPC */ > > Can't we just add a couple of cond_syscall lines to kernel/sys_ni.c > instead? Linus, can we have this applied for 2.6.16. It presumably affects sparc64 (at least for CONFIG_SYSVIPC) as well as powerpc. The NFS fix would affect all architectures, I think? This has been compile tested with the CONFIG options on and off for powerpc. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ c1a27bc400a1412c7c758775bb695e8b98d1c0c3 diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 547d649..b4dc6e2 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -398,7 +398,7 @@ extern struct inode_operations nfs_symli extern int nfs_register_sysctl(void); extern void nfs_unregister_sysctl(void); #else -#define nfs_register_sysctl() do { } while(0) +#define nfs_register_sysctl() 0 #define nfs_unregister_sysctl() do { } while(0) #endif diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index 17313b9..1067090 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -104,6 +104,8 @@ cond_syscall(sys_setreuid16); cond_syscall(sys_setuid16); cond_syscall(sys_vm86old); cond_syscall(sys_vm86); +cond_syscall(compat_sys_ipc); +cond_syscall(compat_sys_sysctl); /* arch-specific weak syscall entries */ cond_syscall(sys_pciconfig_read); -- 1.2.1 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060220/8d8b7dad/attachment.pgp From michael at ellerman.id.au Mon Feb 20 19:07:31 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Mon, 20 Feb 2006 19:07:31 +1100 Subject: [PATCH] powerpc: Initialise hvlpevent_queue.lock correctly Message-ID: <20060220080757.74C78679F6@ozlabs.org> When I changed the hvlpevent_queue code to use a spinlock instead of a custom atomic (719d1cd86780c156f954fc34f34481adac197aec) I didn't initialise the lock anywhere, oops. Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/iseries/lpevents.c | 2 ++ 1 files changed, 2 insertions(+) Index: iseries/arch/powerpc/platforms/iseries/lpevents.c =================================================================== --- iseries.orig/arch/powerpc/platforms/iseries/lpevents.c +++ iseries/arch/powerpc/platforms/iseries/lpevents.c @@ -184,6 +184,8 @@ void setup_hvlpevent_queue(void) { void *eventStack; + spin_lock_init(&hvlpevent_queue.lock); + /* Allocate a page for the Event Stack. */ eventStack = alloc_bootmem_pages(LpEventStackSize); memset(eventStack, 0, LpEventStackSize); From apgo at patchbomb.org Tue Feb 21 00:53:15 2006 From: apgo at patchbomb.org (Arthur Othieno) Date: Mon, 20 Feb 2006 08:53:15 -0500 Subject: [PATCH] Fix compile for CONFIG_SYSVIPC=n or CONFIG_SYSCTL=n In-Reply-To: <20060220153226.30ee4b13.sfr@canb.auug.org.au> References: <20060218100849.GA1869@krypton> <17400.23551.904754.47979@cargo.ozlabs.ibm.com> <20060220153226.30ee4b13.sfr@canb.auug.org.au> Message-ID: <20060220135315.GA24943@krypton> On Mon, Feb 20, 2006 at 03:32:26PM +1100, Stephen Rothwell wrote: > The compat syscalls are added to sys_ni.c since they are not defined > if the above CONFIG options are off. Also, nfs would not build with > CONFIG_SYSCTL off. > > Noticed by Arthur Othieno. 
> > Signed-off-by: Stephen Rothwell Looks good, thanks ;-) Acked-by: Arthur Othieno > --- > > include/linux/nfs_fs.h | 2 +- > kernel/sys_ni.c | 2 ++ > 2 files changed, 3 insertions(+), 1 deletions(-) > > On Sun, 19 Feb 2006 22:52:31 +1100 Paul Mackerras wrote: > > > > Arthur Othieno writes: > > > > > --- a/arch/powerpc/kernel/sys_ppc32.c > > > +++ b/arch/powerpc/kernel/sys_ppc32.c > > > @@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first, > > > > > > return -ENOSYS; > > > } > > > -#endif > > > +#else > > > +long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr, > > > + u32 fifth) > > > +{ > > > + return -ENOSYS; > > > +} > > > +#endif /* CONFIG_SYSVIPC */ > > > > Can't we just add a couple of cond_syscall lines to kernel/sys_ni.c > > instead? > > Linus, can we have this applied for 2.6.16. It presumably affects sparc64 > (at least for CONFIG_SYSVIPC) as well as powerpc. The NFS fix would > affect all architectures, I think? > > This has been compile tested with the CONFIG options on and off for powerpc. 
> > -- > Cheers, > Stephen Rothwell sfr at canb.auug.org.au > http://www.canb.auug.org.au/~sfr/ > > c1a27bc400a1412c7c758775bb695e8b98d1c0c3 > diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h > index 547d649..b4dc6e2 100644 > --- a/include/linux/nfs_fs.h > +++ b/include/linux/nfs_fs.h > @@ -398,7 +398,7 @@ extern struct inode_operations nfs_symli > extern int nfs_register_sysctl(void); > extern void nfs_unregister_sysctl(void); > #else > -#define nfs_register_sysctl() do { } while(0) > +#define nfs_register_sysctl() 0 > #define nfs_unregister_sysctl() do { } while(0) > #endif > > diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c > index 17313b9..1067090 100644 > --- a/kernel/sys_ni.c > +++ b/kernel/sys_ni.c > @@ -104,6 +104,8 @@ cond_syscall(sys_setreuid16); > cond_syscall(sys_setuid16); > cond_syscall(sys_vm86old); > cond_syscall(sys_vm86); > +cond_syscall(compat_sys_ipc); > +cond_syscall(compat_sys_sysctl); > > /* arch-specific weak syscall entries */ > cond_syscall(sys_pciconfig_read); > -- > 1.2.1 From anton at samba.org Tue Feb 21 01:59:05 2006 From: anton at samba.org (Anton Blanchard) Date: Tue, 21 Feb 2006 01:59:05 +1100 Subject: [PATCH 01/22] Add powerpc-specific clear_cacheline(), which just compiles to "dcbz". In-Reply-To: <20060218005704.13620.88286.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005704.13620.88286.stgit@localhost.localdomain> Message-ID: <20060220145904.GA19895@krispykreme> Hi, > This is horribly non-portable. How much of a performance difference > does it make? How does it do on ppc64 systems where the cacheline > size is not 32? Yes, if anything we should catch cacheline aligned, multiple cacheline sized zeroing in memset. 
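A rough sketch of that idea in portable C, assuming the cache line size is passed in at run time rather than hard-coded to 32. The names zero_region() and zero_line() are hypothetical, and zero_line() uses memset() as a stand-in for a real "dcbz"; this is an illustration of the check, not the kernel's memset:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for one "dcbz": zero exactly one cache line.
 * Real code would issue the instruction; this sketch uses memset(). */
static void zero_line(void *p, size_t line_size)
{
    memset(p, 0, line_size);
}

/* Zero len bytes at buf.
 * Returns 1 if the per-line fast path was taken, 0 for the generic path. */
static int zero_region(void *buf, size_t len, size_t line_size)
{
    uintptr_t addr = (uintptr_t)buf;

    /* Fast path only when the start is line aligned and the length is a
     * non-zero whole number of lines. */
    if (line_size != 0 && len != 0 &&
        addr % line_size == 0 && len % line_size == 0) {
        unsigned char *p = buf;
        size_t off;

        for (off = 0; off < len; off += line_size)
            zero_line(p + off, line_size);
        return 1;
    }

    /* Anything else falls back to the ordinary byte-wise zeroing. */
    memset(buf, 0, len);
    return 0;
}
```

Keeping the check inside the generic routine means callers need no architecture-specific helper, and a machine with 128-byte lines only changes the line_size argument.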
Anton From anton at samba.org Tue Feb 21 02:09:53 2006 From: anton at samba.org (Anton Blanchard) Date: Tue, 21 Feb 2006 02:09:53 +1100 Subject: [PATCH 03/22] pHype specific stuff In-Reply-To: <20060218005709.13620.77409.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005709.13620.77409.stgit@localhost.localdomain> Message-ID: <20060220150953.GB19895@krispykreme> Hi, > +inline static u32 getLongBusyTimeSecs(int longBusyRetCode) > +{ > + switch (longBusyRetCode) { > + case H_LongBusyOrder1msec: > + return 1; > + case H_LongBusyOrder10msec: > + return 10; > + case H_LongBusyOrder100msec: > + return 100; > + case H_LongBusyOrder1sec: > + return 1000; > + case H_LongBusyOrder10sec: > + return 10000; > + case H_LongBusyOrder100sec: > + return 100000; > + default: > + return 1; > + } /* eof switch */ > +} Since this actually returns milliseconds it might be worth making it obvious in the function name. Also no need to use studly caps for the function name and variable. We will fix the studly caps H_LongBusy* stuff another day :) > +inline static long plpar_hcall_7arg_7ret(unsigned long opcode, > +inline static long plpar_hcall_9arg_9ret(unsigned long opcode, These belong in arch/powerpc/platforms/pseries/hvCall.S Anton From anton at samba.org Tue Feb 21 02:12:15 2006 From: anton at samba.org (Anton Blanchard) Date: Tue, 21 Feb 2006 02:12:15 +1100 Subject: [PATCH 07/22] Hypercall definitions In-Reply-To: <20060218005721.13620.84990.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005721.13620.84990.stgit@localhost.localdomain> Message-ID: <20060220151215.GC19895@krispykreme> Hi, > Do these defines belong in the ehca driver, or should they be put > somewhere in generic hypercall support? 
Agreed, I think they should go into include/asm-powerpc/hvcall.h Anton From anton at samba.org Tue Feb 21 02:22:13 2006 From: anton at samba.org (Anton Blanchard) Date: Tue, 21 Feb 2006 02:22:13 +1100 Subject: [PATCH 21/22] ehca main file In-Reply-To: <20060218005759.13620.10968.stgit@localhost.localdomain> References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005759.13620.10968.stgit@localhost.localdomain> Message-ID: <20060220152213.GD19895@krispykreme> Hi, > What is ehca_show_flightrecorder() trying to do that snprintf() is > not fast enough? If you need to pass a binary structure back to > userspace (with a kernel address in it??) then sysfs is not the right > place to put it. Look at debugfs; or relayfs might make the most > sense for your flightrecorder stuff. I agree debugfs or relayfs would be better suited. Of course as the driver matures this form of debug is probably not required at all. > +#include "hcp_sense.h" /* TODO: later via hipz_* header file */ > +#include "hcp_if.h" /* TODO: later via hipz_* header file */ I count 88 TODOs in the driver, it would be nice to get rid of some of them like the two above, so we can concentrate on the important TODOs :) > +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12) > +#define EHCA_RESOURCE_ATTR_H(name) \ > +static ssize_t ehca_show_##name(struct device *dev, \ > + struct device_attribute *attr, \ > + char *buf) > +#else > +#define EHCA_RESOURCE_ATTR_H(name) \ > +static ssize_t ehca_show_##name(struct device *dev, \ > + char *buf) > +#endif No need for kernel version ifdefs. 
Anton From rdreier at cisco.com Tue Feb 21 03:52:55 2006 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 20 Feb 2006 08:52:55 -0800 Subject: [PATCH 21/22] ehca main file In-Reply-To: <20060220152213.GD19895@krispykreme> (Anton Blanchard's message of "Tue, 21 Feb 2006 02:22:13 +1100") References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005759.13620.10968.stgit@localhost.localdomain> <20060220152213.GD19895@krispykreme> Message-ID: Anton> No need for kernel version ifdefs. Sorry, I tried to strip these out before posting the patch, but I missed one. Anyway, totally agree on the ifdefs and I will be double-extra-sure that the final version doesn't include them. - R. From rdreier at cisco.com Tue Feb 21 03:55:24 2006 From: rdreier at cisco.com (Roland Dreier) Date: Mon, 20 Feb 2006 08:55:24 -0800 Subject: [PATCH 00/22] [RFC] IBM eHCA InfiniBand adapter driver In-Reply-To: (Christoph Raisch's message of "Mon, 20 Feb 2006 16:06:19 +0100") References: Message-ID: Christoph> I guess posting 22 new patch files (diff against NIL) Christoph> each week is sort of a DoS attack on the mailing list Christoph> and we'll end up in peoples spam folders pretty Christoph> quickly... So what's the recomended way to proceed Christoph> here? I don't think there's any other way to proceed. For each version, you should carefully note down the feedback that you received and how you are responding to each suggestion, and include that with the patch file. But it's too much to expect for people to keep context for a patch under review, so even though it generates a lot of email, I think that including the whole series is the only way to go. Perhaps the list admins disagree with me though ;) - R. 
From arndb at de.ibm.com Tue Feb 21 04:26:25 2006
From: arndb at de.ibm.com (Arnd Bergmann)
Date: Mon, 20 Feb 2006 18:26:25 +0100
Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator
In-Reply-To: <17397.35761.56383.60273@cargo.ozlabs.ibm.com>
References: <17397.35761.56383.60273@cargo.ozlabs.ibm.com>
Message-ID: <200602201826.25489.arndb@de.ibm.com>

On Friday 17 February 2006 09:39, Paul Mackerras wrote:
> Utz Bacher writes:
>
> > +#ifndef CONFIG_PPC_SYSTEMSIM
> >                       noctty = 1;
> > +#endif
>
> Why is this awful hack necessary?

It's not. It's just a workaround to boot systemsim without any sort of /sbin/init logic that sets ctty.

I actually thought we had removed that hack earlier.

Arnd <><

From RAISCH at de.ibm.com Tue Feb 21 02:06:19 2006
From: RAISCH at de.ibm.com (Christoph Raisch)
Date: Mon, 20 Feb 2006 16:06:19 +0100
Subject: [PATCH 00/22] [RFC] IBM eHCA InfiniBand adapter driver
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID:

Roland,
as you already stated we really have a problem that we're not able to send
"large" pieces of code to the kernel mailing list.
It's perfectly ok for us to send patches to the openib.org mailing list and
svn.
This is something we still try to resolve with legal.
So thank you Roland for acting as a proxy here...
We have the ok to contribute to any ehca related discussion on kernel
mailing-list and ppc64-mailing list, and are absolutely willing to do so!

Adding a new driver for a complex new hardware isn't the regular linux
development case, especially if there's no base code in linux kernel to
patch against...
In our case this patch resulted in 22 postings.
Some people already noticed that there's still quite some road ahead of
us... but we're absolutely willing to work on that, and we had to start at some
place.
Some comments will result in modifications to all files.
I guess posting 22 new patch files (diff against NIL) each week is sort of
a DoS attack on the mailing list and we'll end up in people's spam folders
pretty quickly...
So what's the recommended way to proceed here?


Gruss / Regards . . . Christoph Raisch

christoph raisch, HCAD teamlead

Roland Dreier wrote on 18.02.2006 01:55:32:

> Here's a series of patches that add an InfiniBand adapter driver
> for IBM eHCA hardware. Please look it over with an eye towards issues
> that need to be addressed before merging this upstream.
>

From arnd at arndb.de Tue Feb 21 05:32:31 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 20 Feb 2006 19:32:31 +0100
Subject: [openib-general] Re: [PATCH 21/22] ehca main file
In-Reply-To: <43FA7677.3040901@de.ibm.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060220152213.GD19895@krispykreme> <43FA7677.3040901@de.ibm.com>
Message-ID: <200602201932.31739.arnd@arndb.de>

On Tuesday 21 February 2006 03:09, Heiko J Schick wrote:
> >>+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
> >>+#define EHCA_RESOURCE_ATTR_H(name) \
> >>+static ssize_t ehca_show_##name(struct device *dev, \
> >>+ struct device_attribute *attr, \
> >>+ char *buf)
> >>+#else
> >>+#define EHCA_RESOURCE_ATTR_H(name) \
> >>+static ssize_t ehca_show_##name(struct device *dev, \
> >>+ char *buf)
> >>+#endif
> >
> >
> > No need for kernel version ifdefs.
>
> The point is that our module have to run on Linux 2.6.5-7.244 (SuSE SLES 9 SP3), too.
> This was the reason why we've included the ifdefs. We can change the ifdefs to
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(2.6.5) to mark that this code is used for
> Linux 2.6.5 compatibility.
That only makes sense as long as you have a common source code for both that also is under your control. As soon as the driver enters the mainline kernel, it is no longer helpful to have these checks in it, because other people will start making changes to the driver that you don't want to have in the 2.6.5 version. You cannot avoid forking the code in the long term, but fortunately the need to backport fixes to the old version should also decrease over time. Arnd <>< From spoole at lanl.gov Tue Feb 21 04:43:51 2006 From: spoole at lanl.gov (Stephen Poole) Date: Mon, 20 Feb 2006 10:43:51 -0700 Subject: [openib-general] Re: [PATCH 00/22] [RFC] IBM eHCA InfiniBand adapter driver In-Reply-To: References: Message-ID: If every open source company was being sued for $3B I think many companies would be a bit timid. :-) IBM has been working this issue at all levels. It will happen when IBM Legal has figured out all of the necessary paths in order to cover any potential law suits. Unfortunately, the open source path has been muddied by some folks. Steve... At 4:06 PM +0100 2/20/06, Christoph Raisch wrote: >Roland, >as you already stated we really have a problem that we're not able to send >"large" pieces of code to the kernel mailing list. >It's perfectly ok for us to send patches to the openib.org mailing list and >svn. >This is something we still try to resolve with legal. >So thank you Roland for acting as a proxy here... >We have the ok to contribute to any ehca related discussion on kernel >mailing-list and ppc64-mailing list, and are absolutely willing to do so! > >Adding a new driver for a complex new hardware isn't the regular linux >develpment case, especially if there's no base code in linux kernel to >patch against... >In our case this patch resulted in 22 postings. >Some people already noticed that there's still quite some road ahead of >us... but we're abolutely willing to work that, and we had to start at some >place. 
>Some coments will result in modifications to all files.
>I guess posting 22 new patch files (diff against NIL) each week is sort of
>a DoS attack on the mailing list and we'll end up in peoples spam folders
>pretty quickly...
>So what's the recomended way to proceed here?
>
>
>Gruss / Regards . . . Christoph Raisch
>
>christoph raisch, HCAD teamlead
>
>Roland Dreier wrote on 18.02.2006 01:55:32:
>
>> Here's a series of patches that add an InfiniBand adapter driver
>> for IBM eHCA hardware. Please look it over with an eye towards issues
>> that need to be addressed before merging this upstream.
>>
>
>_______________________________________________
>openib-general mailing list
>openib-general at openib.org
>http://openib.org/mailman/listinfo/openib-general
>
>To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

--
Steve Poole (spoole at lanl.gov) Office: 505.665.9662
Los Alamos National Laboratory Cell: 505.699.3807
CCN - Special Projects / Advanced Development Fax: 505.665.7793
P.O. Box 1663, MS B255
Los Alamos, NM. 87545
03149801S

From schihei at de.ibm.com Tue Feb 21 13:09:59 2006
From: schihei at de.ibm.com (Heiko J Schick)
Date: Tue, 21 Feb 2006 03:09:59 +0100
Subject: [openib-general] Re: [PATCH 21/22] ehca main file
In-Reply-To: <20060220152213.GD19895@krispykreme>
References: <20060218005532.13620.79663.stgit@localhost.localdomain> <20060218005759.13620.10968.stgit@localhost.localdomain> <20060220152213.GD19895@krispykreme>
Message-ID: <43FA7677.3040901@de.ibm.com>

Hello Anton,

thanks for your help!

>>+#include "hcp_sense.h" /* TODO: later via hipz_* header file */
>>+#include "hcp_if.h" /* TODO: later via hipz_* header file */
>
>
> I count 88 TODOs in the driver, it would be nice to get rid of some of
> them like the two above, so we can concentrate on the important TODOs :)

We will remove the TODOs as soon as possible.
>>+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
>>+#define EHCA_RESOURCE_ATTR_H(name) \
>>+static ssize_t ehca_show_##name(struct device *dev, \
>>+ struct device_attribute *attr, \
>>+ char *buf)
>>+#else
>>+#define EHCA_RESOURCE_ATTR_H(name) \
>>+static ssize_t ehca_show_##name(struct device *dev, \
>>+ char *buf)
>>+#endif
>
>
> No need for kernel version ifdefs.

The point is that our module has to run on Linux 2.6.5-7.244 (SuSE SLES 9 SP3), too. This was the reason why we've included the ifdefs. We can change the ifdefs to #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,5) to mark that this code is used for Linux 2.6.5 compatibility.

Regards,
Heiko

From utz.bacher at de.ibm.com Tue Feb 21 04:33:39 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Mon, 20 Feb 2006 18:33:39 +0100
Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator
In-Reply-To: <200602201826.25489.arndb@de.ibm.com>
Message-ID:

Arnd Bergmann wrote on 20.02.2006 18:26:25:
> On Friday 17 February 2006 09:39, Paul Mackerras wrote:
> > Utz Bacher writes:
> >
> > > +#ifndef CONFIG_PPC_SYSTEMSIM
> > > noctty = 1;
> > > +#endif
> >
> > Why is this awful hack necessary?
>
>
> It's not. It's just a workaround to boot systemsim without
> any sort of /sbin/init logic that sets ctty.
>
> I actually though we had removed that hack earlier.

The idea was to keep the system simulator environment very small, no login etc. It shouldn't really go into a proper kernel. What we failed to point out (probably for all four patches), and I take the blame for that, is that this is meant for building ready-to-use packages for such environments, as our friends at http://www.bsc.es/ do; it is not something that should go anywhere else, such as the kernel, and it should eventually disappear.

Utz

:wq
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060220/bda07ca7/attachment.htm From ahuja at austin.ibm.com Tue Feb 21 10:03:46 2006 From: ahuja at austin.ibm.com (Manish Ahuja) Date: Mon, 20 Feb 2006 17:03:46 -0600 Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics In-Reply-To: <20060216091027.GA826@localhost.localdomain> References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com> <20060214183259.28a6a501.sfr@canb.auug.org.au> <43F40312.2020800@austin.ibm.com> <20060216091027.GA826@localhost.localdomain> Message-ID: <43FA4AD2.3090503@austin.ibm.com> David Gibson wrote: >On Wed, Feb 15, 2006 at 10:44:02PM -0600, Manish Ahuja wrote: >[snip] > > >>>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c >>>>=================================================================== >>>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c 2005-12-18 >>>>16:36:54.000000000 -0800 >>>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c 2006-01-17 >>>>21:20:25.000000000 -0800 >>>>@@ -243,6 +243,7 @@ >>>> struct thread_struct *new_thread, *old_thread; >>>> unsigned long flags; >>>> struct task_struct *last; >>>>+ struct paca_struct *lpaca; >>>> >>>> >>>> >>>> >>>This could have been declared below (near pd) >>> >>> >>Yes... But it seems fine there.. >> >> > >Actually, I've been trying to get rid of lpaca locals everywhere. >Using get_paca() directly is barely more verbose, and usually clearer. > > > I can change it accordingly.. -Manish From sharada at in.ibm.com Tue Feb 21 16:44:49 2006 From: sharada at in.ibm.com (R Sharada) Date: Tue, 21 Feb 2006 11:14:49 +0530 Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear Message-ID: <20060221054448.GA1695@in.ibm.com> Hello, kexec on Power4 (non-lpar) was breaking because of a spinlock recursion problem in native_hpte_clear. This patch fixes the recursion by changing the call to tlbie() in native_hpte_clear to call __tlbie() (as per Milton's suggestion). 
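The recursion can be illustrated with a small user-space sketch, assuming a non-recursive lock and a pair of helpers where one takes the lock and the other expects the caller to hold it already. All names here (toy_lock, demo_tlbie, demo_tlbie_nolock, demo_clear_all_*) are hypothetical stand-ins; the toy lock counts a recursive acquire instead of spinning, so the sketch can be run safely:

```c
/* Toy non-recursive lock: a second acquire while held is recorded as an
 * error rather than hanging the way a real spinlock would. */
struct toy_lock {
    int held;
    int recursion_errors;
};

static struct toy_lock demo_tlbie_lock;
static int demo_flush_count;

static void toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        l->recursion_errors++;  /* a real spinlock would deadlock here */
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = 0;
}

/* Unlocked worker: the caller must already hold demo_tlbie_lock. */
static void demo_tlbie_nolock(unsigned long va)
{
    (void)va;
    demo_flush_count++;
}

/* Locking wrapper for callers that do not hold the lock. */
static void demo_tlbie(unsigned long va)
{
    toy_lock_acquire(&demo_tlbie_lock);
    demo_tlbie_nolock(va);
    toy_lock_release(&demo_tlbie_lock);
}

/* Broken bulk clear: holds the lock and calls the locking wrapper. */
static void demo_clear_all_broken(int n)
{
    int i;

    toy_lock_acquire(&demo_tlbie_lock);
    for (i = 0; i < n; i++)
        demo_tlbie(i * 4096UL);
    toy_lock_release(&demo_tlbie_lock);
}

/* Fixed bulk clear: holds the lock and calls the unlocked worker. */
static void demo_clear_all_fixed(int n)
{
    int i;

    toy_lock_acquire(&demo_tlbie_lock);
    for (i = 0; i < n; i++)
        demo_tlbie_nolock(i * 4096UL);
    toy_lock_release(&demo_tlbie_lock);
}

/* Reset the counters between experiments. */
static void demo_reset(void)
{
    demo_tlbie_lock.held = 0;
    demo_tlbie_lock.recursion_errors = 0;
    demo_flush_count = 0;
}
```

Running the broken variant records at least one recursive acquire, while the fixed variant records none and still performs one flush per entry.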
native_hpte_clear and slot2va still do not support clearing of large pages (>4K pages). I do not know the large page support code well enough to fix that at the moment. If any one has any ideas or can help fix the hpte_clear code to add support for large pages, that would be appreciated. With this patch, I am able to kexec boot on Power4 non-lpar. Please review, provide comments, and consider for acceptance Thanks and Regards, Sharada native_hpte_clear has a spin_lock recursion problem with the native_tlbie_lock being called twice. Fixing the tlbie() call in native_hpte_clear to call __tlbie(). It still supports only 4K pages for now. Signed-off-by: R Sharada --- diff -puN arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear arch/powerpc/mm/hash_native_64.c --- linux-2.6.16-rc4-tlbie/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear 2006-02-20 22:01:49.000000000 +0530 +++ linux-2.6.16-rc4-tlbie-sharada/arch/powerpc/mm/hash_native_64.c 2006-02-20 22:05:31.000000000 +0530 @@ -383,6 +383,7 @@ static void native_hpte_clear(void) hpte_t *hptep = htab_address; unsigned long hpte_v; unsigned long pteg_count; + unsigned long va; pteg_count = htab_hash_mask + 1; @@ -405,10 +406,12 @@ static void native_hpte_clear(void) if (hpte_v & HPTE_V_VALID) { hptep->v = 0; - tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K, 0); + va = slot2va(hpte_v, slot); + __tlbie(va, MMU_PAGE_4K); } } + asm volatile("eieio; tlbsync; ptesync":::"memory"); spin_unlock(&native_tlbie_lock); local_irq_restore(flags); } _ From michael at ellerman.id.au Tue Feb 21 17:22:55 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 21 Feb 2006 17:22:55 +1100 Subject: [PATCH] powerpc: Only calculate htab_size in one place for kexec Message-ID: <20060221062320.6EFB5679F5@ozlabs.org> For kexec we need to know the size of the htab. Currently we calculate the size once in the htab code, and then twice more in the kexec code, once using htab_hash_mask and once using ppc64_pft_size. 
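A small worked example of the two kexec-side formulas, assuming a 64 MB hash table (an illustrative figure, not taken from this post). The helper names are hypothetical; the formulas and the 0x80 group size mirror the expressions the patch touches. It shows that both agree when the log2 size is populated, and that the second collapses to 1 byte if that value is left at zero:

```c
#define HASH_GROUP_SIZE 0x80UL  /* bytes per hash group, as in the patch */

/* Size derived from the hash mask: (mask + 1) groups of 128 bytes. */
static unsigned long htab_size_from_mask(unsigned long hash_mask)
{
    return (hash_mask + 1) * HASH_GROUP_SIZE;
}

/* Size derived from the log2 table size: 1 << pft_size. */
static unsigned long htab_size_from_pft(unsigned int pft_size)
{
    return 1UL << pft_size;
}

/*
 * Worked numbers for a 64 MB table:
 *   size in bytes = 1 << 26             = 67108864
 *   group count   = 67108864 / 128      = 524288
 *   hash mask     = 524288 - 1          = 524287
 *   from mask     = (524287 + 1) * 128  = 67108864  (matches)
 *   from pft = 0  = 1 << 0              = 1         (does not match)
 */
```

Computing the byte size once and reusing it everywhere removes the possibility of the two derivations disagreeing.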
On some machines the ppc64_pft_size calculation is broken because ppc64_pft_size is not set. So we need to fix the second calculation, but better still we should just calculate the size once and use it everywhere else. Tested on Power5 LPAR, Power4 non-LPAR and Power3. Kexec is broken on some non-LPAR machines without this, so I think it should go upstream for 2.6.16. Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/machine_kexec_64.c | 10 +++------- arch/powerpc/mm/hash_utils_64.c | 3 ++- include/asm-powerpc/mmu.h | 1 + 3 files changed, 6 insertions(+), 8 deletions(-) Index: to-merge/arch/powerpc/kernel/machine_kexec_64.c =================================================================== --- to-merge.orig/arch/powerpc/kernel/machine_kexec_64.c +++ to-merge/arch/powerpc/kernel/machine_kexec_64.c @@ -26,8 +26,6 @@ #include #include -#define HASH_GROUP_SIZE 0x80 /* size of each hash group, asm/mmu.h */ - int default_machine_kexec_prepare(struct kimage *image) { int i; @@ -61,7 +59,7 @@ int default_machine_kexec_prepare(struct */ if (htab_address) { low = __pa(htab_address); - high = low + (htab_hash_mask + 1) * HASH_GROUP_SIZE; + high = low + htab_size_bytes; for (i = 0; i < image->nr_segments; i++) { begin = image->segment[i].mem; @@ -294,7 +292,7 @@ void default_machine_kexec(struct kimage } /* Values we need to export to the second kernel via the device tree. 
*/ -static unsigned long htab_base, htab_size, kernel_end; +static unsigned long htab_base, kernel_end; static struct property htab_base_prop = { .name = "linux,htab-base", @@ -305,7 +303,7 @@ static struct property htab_base_prop = static struct property htab_size_prop = { .name = "linux,htab-size", .length = sizeof(unsigned long), - .value = (unsigned char *)&htab_size, + .value = (unsigned char *)&htab_size_bytes, }; static struct property kernel_end_prop = { @@ -331,8 +329,6 @@ static void __init export_htab_values(vo htab_base = __pa(htab_address); prom_add_property(node, &htab_base_prop); - - htab_size = 1UL << ppc64_pft_size; prom_add_property(node, &htab_size_prop); out: Index: to-merge/arch/powerpc/mm/hash_utils_64.c =================================================================== --- to-merge.orig/arch/powerpc/mm/hash_utils_64.c +++ to-merge/arch/powerpc/mm/hash_utils_64.c @@ -88,6 +88,7 @@ static unsigned long _SDR1; struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; hpte_t *htab_address; +unsigned long htab_size_bytes; unsigned long htab_hash_mask; int mmu_linear_psize = MMU_PAGE_4K; int mmu_virtual_psize = MMU_PAGE_4K; @@ -399,7 +400,7 @@ void create_section_mapping(unsigned lon void __init htab_initialize(void) { - unsigned long table, htab_size_bytes; + unsigned long table; unsigned long pteg_count; unsigned long mode_rw; unsigned long base = 0, size = 0; Index: to-merge/include/asm-powerpc/mmu.h =================================================================== --- to-merge.orig/include/asm-powerpc/mmu.h +++ to-merge/include/asm-powerpc/mmu.h @@ -112,6 +112,7 @@ typedef struct { } hpte_t; extern hpte_t *htab_address; +extern unsigned long htab_size_bytes; extern unsigned long htab_hash_mask; /* From johnrose at austin.ibm.com Wed Feb 22 07:55:41 2006 From: johnrose at austin.ibm.com (John Rose) Date: Tue, 21 Feb 2006 14:55:41 -0600 Subject: [PATCH 2/2] Fix dynamic PCI probe regression Message-ID: 
<1140555341.24859.15.camel@sinatra.austin.ibm.com> Some hotplug driver functions were migrated to the kernel for use by EEH in the following set of changes: http://tinyurl.com/qke9r Previously, the PCI Hotplug module had been changed to use the new OFDT-based PCI probe when appropriate: http://tinyurl.com/jy4jl When rpaphp_pci_config_slot() was moved from the rpaphp driver to the new kernel function pcibios_add_pci_devices(), the OFDT-based probe stuff was dropped. This patch restores it. Signed-off-by: John Rose diff -puN arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress arch/powerpc/platforms/pseries/pci_dlpar.c --- 2_6_linus/arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress 2006-02-21 14:54:10.000000000 -0600 +++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/pci_dlpar.c 2006-02-21 14:54:10.000000000 -0600 @@ -106,6 +106,8 @@ pcibios_fixup_new_pci_devices(struct pci } } } + + eeh_add_device_tree_late(bus); } EXPORT_SYMBOL_GPL(pcibios_fixup_new_pci_devices); @@ -114,7 +116,6 @@ pcibios_pci_config_bridge(struct pci_dev { u8 sec_busno; struct pci_bus *child_bus; - struct pci_dev *child_dev; /* Get busno of downstream bus */ pci_read_config_byte(dev, PCI_SECONDARY_BUS, &sec_busno); @@ -129,10 +130,6 @@ pcibios_pci_config_bridge(struct pci_dev pci_scan_child_bus(child_bus); - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - /* Fixup new pci devices without touching bus struct */ pcibios_fixup_new_pci_devices(child_bus, 0); @@ -160,18 +157,25 @@ pcibios_add_pci_devices(struct pci_bus * eeh_add_device_tree_early(dn); - /* pci_scan_slot should find all children */ - slotno = PCI_SLOT(PCI_DN(dn->child)->devfn); - num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); - if (num) { - pcibios_fixup_new_pci_devices(bus, 1); - pci_bus_add_devices(bus); - } + if (_machine == PLATFORM_PSERIES_LPAR) { + /* use ofdt-based probe */ + of_scan_bus(dn, bus); + if (!list_empty(&bus->devices)) { + 
pcibios_fixup_new_pci_devices(bus, 0); + pci_bus_add_devices(bus); + } + } else { + /* use legacy probe */ + slotno = PCI_SLOT(PCI_DN(dn->child)->devfn); + num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + if (num) { + pcibios_fixup_new_pci_devices(bus, 1); + pci_bus_add_devices(bus); + } - list_for_each_entry(dev, &bus->devices, bus_list) { - eeh_add_device_late (dev); - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) - pcibios_pci_config_bridge(dev); + list_for_each_entry(dev, &bus->devices, bus_list) + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + pcibios_pci_config_bridge(dev); } } EXPORT_SYMBOL_GPL(pcibios_add_pci_devices); diff -puN arch/powerpc/platforms/pseries/eeh.c~reorg_regress arch/powerpc/platforms/pseries/eeh.c --- 2_6_linus/arch/powerpc/platforms/pseries/eeh.c~reorg_regress 2006-02-21 14:54:10.000000000 -0600 +++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/eeh.c 2006-02-21 14:54:10.000000000 -0600 @@ -917,6 +917,20 @@ void eeh_add_device_late(struct pci_dev pci_addr_cache_insert_device (dev); } +void eeh_add_device_tree_late(struct pci_bus *bus) +{ + struct pci_dev *dev; + + list_for_each_entry(dev, &bus->devices, bus_list) { + eeh_add_device_late(dev); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + struct pci_bus *subbus = dev->subordinate; + if (subbus) + eeh_add_device_tree_late(subbus); + } + } +} + /** * eeh_remove_device - undo EEH setup for the indicated pci device * @dev: pci device to be removed diff -puN include/asm-powerpc/eeh.h~reorg_regress include/asm-powerpc/eeh.h --- 2_6_linus/include/asm-powerpc/eeh.h~reorg_regress 2006-02-21 14:54:10.000000000 -0600 +++ 2_6_linus-johnrose/include/asm-powerpc/eeh.h 2006-02-21 14:54:10.000000000 -0600 @@ -27,6 +27,7 @@ #include struct pci_dev; +struct pci_bus; struct device_node; #ifdef CONFIG_EEH @@ -51,7 +52,7 @@ int eeh_dn_check_failure(struct device_n void __init pci_addr_cache_build(void); void eeh_add_device_tree_early(struct device_node *); -void eeh_add_device_late(struct pci_dev *); 
+void eeh_add_device_tree_late(struct pci_bus *);
 
 /**
  * eeh_remove_bus_device - undo EEH for device & children.
@@ -92,10 +93,10 @@ static inline int eeh_dn_check_failure(s
 
 static inline void pci_addr_cache_build(void) { }
 
-static inline void eeh_add_device_late(struct pci_dev *dev) { }
-
 static inline void eeh_add_device_tree_early(struct device_node *dn) { }
 
+static inline void eeh_add_device_tree_late(struct pci_bus *bus) { }
+
 static inline void eeh_remove_bus_device(struct pci_dev *dev) { }
 #define EEH_POSSIBLE_ERROR(val, type) (0)
 #define EEH_IO_ERROR_VALUE(size) (-1UL)
_

From linas at austin.ibm.com Wed Feb 22 08:14:02 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 21 Feb 2006 15:14:02 -0600
Subject: [PATCH 1/2] EEH cleanups
In-Reply-To: <1140555218.24859.11.camel@sinatra.austin.ibm.com>
References: <1140555218.24859.11.camel@sinatra.austin.ibm.com>
Message-ID: <20060221211402.GD26339@austin.ibm.com>

Hi,

On Tue, Feb 21, 2006 at 02:53:38PM -0600, John Rose was heard to remark:
> This patch removes unnecessary exports, marks functions as static when
> possible, and simplifies some list-related code.
>
> Signed-off-by: John Rose

Looks reasonable to me; I have one request, though. The patch removes
the following documentation from eeh.h; can you copy this over to eeh.c?
(what's there now is shorter and has a typo.)

> /**
> - * eeh_remove_device - undo EEH setup for the indicated pci device
> - * @dev: pci device to be removed
> - *
> - * This routine should be called when a device is removed from
> - * a running system (e.g. by hotplug or dlpar). It unregisters
> - * the PCI device from the EEH subsystem. I/O errors affecting
> - * this device will no longer be detected after this call; thus,
> - * i/o errors affecting this slot may leave this device unusable.
> - */

I won't be here tomorrow to ack anything revised then, so I'll just
ack now:

Acked-by: Linas Vepstas

--linas

From johnrose at austin.ibm.com Wed Feb 22 08:21:45 2006
From: johnrose at austin.ibm.com (John Rose)
Date: Tue, 21 Feb 2006 15:21:45 -0600
Subject: [PATCH 1/2] EEH cleanups
In-Reply-To: <20060221211402.GD26339@austin.ibm.com>
References: <1140555218.24859.11.camel@sinatra.austin.ibm.com> <20060221211402.GD26339@austin.ibm.com>
Message-ID: <1140556904.24859.18.camel@sinatra.austin.ibm.com>

This patch removes unnecessary exports, marks functions as static when
possible, and simplifies some list-related code.

Signed-off-by: John Rose
Acked-by: Linas Vepstas

diff -puN arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups arch/powerpc/platforms/pseries/eeh.c
--- 2_6_linus/arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups	2006-02-21 15:20:08.000000000 -0600
+++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/eeh.c	2006-02-21 15:24:32.000000000 -0600
@@ -409,8 +409,6 @@ dn_unlock:
 	return rc;
 }
 
-EXPORT_SYMBOL_GPL(eeh_dn_check_failure);
-
 /**
  * eeh_check_failure - check if all 1's data is due to EEH slot freeze
  * @token i/o token, should be address in the form 0xA....
@@ -865,7 +863,7 @@ void __init eeh_init(void)
  * on the CEC architecture, type of the device, on earlier boot
  * command-line arguments & etc.
*/ -void eeh_add_device_early(struct device_node *dn) +static void eeh_add_device_early(struct device_node *dn) { struct pci_controller *phb; struct eeh_early_enable_info info; @@ -882,7 +880,6 @@ void eeh_add_device_early(struct device_ info.buid_lo = BUID_LO(phb->buid); early_enable_eeh(dn, &info); } -EXPORT_SYMBOL_GPL(eeh_add_device_early); void eeh_add_device_tree_early(struct device_node *dn) { @@ -919,16 +916,18 @@ void eeh_add_device_late(struct pci_dev pci_addr_cache_insert_device (dev); } -EXPORT_SYMBOL_GPL(eeh_add_device_late); -/** - * eeh_remove_device - undo EEH setup for the indicated pci device - * @dev: pci device to be removed - * - * This routine should be when a device is removed from a running - * system (e.g. by hotplug or dlpar). - */ -void eeh_remove_device(struct pci_dev *dev) + /** + * eeh_remove_device - undo EEH setup for the indicated pci device + * @dev: pci device to be removed + * + * This routine should be called when a device is removed from + * a running system (e.g. by hotplug or dlpar). It unregisters + * the PCI device from the EEH subsystem. I/O errors affecting + * this device will no longer be detected after this call; thus, + * i/o errors affecting this slot may leave this device unusable. 
+ */ +static void eeh_remove_device(struct pci_dev *dev) { struct device_node *dn; if (!dev || !eeh_subsystem_enabled) @@ -944,21 +943,16 @@ void eeh_remove_device(struct pci_dev *d PCI_DN(dn)->pcidev = NULL; pci_dev_put (dev); } -EXPORT_SYMBOL_GPL(eeh_remove_device); void eeh_remove_bus_device(struct pci_dev *dev) { + struct pci_bus *bus = dev->subordinate; + struct pci_dev *child, *tmp; + eeh_remove_device(dev); - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { - struct pci_bus *bus = dev->subordinate; - struct list_head *ln; - if (!bus) - return; - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev) - eeh_remove_bus_device(pdev); - } + if (bus && dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + list_for_each_entry_safe(child, tmp, &bus->devices, bus_list) + eeh_remove_bus_device(child); } } EXPORT_SYMBOL_GPL(eeh_remove_bus_device); diff -puN include/asm-powerpc/eeh.h~eeh_cleanups include/asm-powerpc/eeh.h --- 2_6_linus/include/asm-powerpc/eeh.h~eeh_cleanups 2006-02-21 15:20:08.000000000 -0600 +++ 2_6_linus-johnrose/include/asm-powerpc/eeh.h 2006-02-21 15:24:32.000000000 -0600 @@ -50,33 +50,11 @@ unsigned long eeh_check_failure(const vo int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev); void __init pci_addr_cache_build(void); -/** - * eeh_add_device_early - * eeh_add_device_late - * - * Perform eeh initialization for devices added after boot. - * Call eeh_add_device_early before doing any i/o to the - * device (including config space i/o). Call eeh_add_device_late - * to finish the eeh setup for this device. - */ -void eeh_add_device_early(struct device_node *); void eeh_add_device_tree_early(struct device_node *); void eeh_add_device_late(struct pci_dev *); /** - * eeh_remove_device - undo EEH setup for the indicated pci device - * @dev: pci device to be removed - * - * This routine should be called when a device is removed from - * a running system (e.g. by hotplug or dlpar). 
It unregisters - * the PCI device from the EEH subsystem. I/O errors affecting - * this device will no longer be detected after this call; thus, - * i/o errors affecting this slot may leave this device unusable. - */ -void eeh_remove_device(struct pci_dev *); - -/** - * eeh_remove_device_recursive - undo EEH for device & children. + * eeh_remove_bus_device - undo EEH for device & children. * @dev: pci device to be removed * * As above, this removes the device; it also removes child @@ -114,12 +92,8 @@ static inline int eeh_dn_check_failure(s static inline void pci_addr_cache_build(void) { } -static inline void eeh_add_device_early(struct device_node *dn) { } - static inline void eeh_add_device_late(struct pci_dev *dev) { } -static inline void eeh_remove_device(struct pci_dev *dev) { } - static inline void eeh_add_device_tree_early(struct device_node *dn) { } static inline void eeh_remove_bus_device(struct pci_dev *dev) { } _ From linas at austin.ibm.com Wed Feb 22 08:29:07 2006 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 21 Feb 2006 15:29:07 -0600 Subject: [PATCH 2/2] Fix dynamic PCI probe regression In-Reply-To: <1140555341.24859.15.camel@sinatra.austin.ibm.com> References: <1140555341.24859.15.camel@sinatra.austin.ibm.com> Message-ID: <20060221212907.GE26339@austin.ibm.com> On Tue, Feb 21, 2006 at 02:55:41PM -0600, John Rose was heard to remark: > > When rpaphp_pci_config_slot() was moved from the rpaphp driver to the > new kernel function pcibios_add_pci_devices(), the OFDT-based probe > stuff was dropped. This patch restores it. I did that. Sorry. I think I even know how/why; but I'll spare you the convoluted excuse. The ofdt logic flow certainly looks cleaner now. > Signed-off-by: John Rose I haven't tested this patch, but after reading it, it looks good to me. 
So: Acked-by: Linas Vepstas --linas From johnrose at austin.ibm.com Wed Feb 22 07:53:38 2006 From: johnrose at austin.ibm.com (John Rose) Date: Tue, 21 Feb 2006 14:53:38 -0600 Subject: [PATCH 1/2] EEH cleanups Message-ID: <1140555218.24859.11.camel@sinatra.austin.ibm.com> This patch removes unnecessary exports, marks functions as static when possible, and simplifies some list-related code. Signed-off-by: John Rose diff -puN arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups arch/powerpc/platforms/pseries/eeh.c --- 2_6_linus/arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups 2006-02-21 14:40:43.000000000 -0600 +++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/eeh.c 2006-02-21 14:55:34.000000000 -0600 @@ -409,8 +409,6 @@ dn_unlock: return rc; } -EXPORT_SYMBOL_GPL(eeh_dn_check_failure); - /** * eeh_check_failure - check if all 1's data is due to EEH slot freeze * @token i/o token, should be address in the form 0xA.... @@ -865,7 +863,7 @@ void __init eeh_init(void) * on the CEC architecture, type of the device, on earlier boot * command-line arguments & etc. */ -void eeh_add_device_early(struct device_node *dn) +static void eeh_add_device_early(struct device_node *dn) { struct pci_controller *phb; struct eeh_early_enable_info info; @@ -882,7 +880,6 @@ void eeh_add_device_early(struct device_ info.buid_lo = BUID_LO(phb->buid); early_enable_eeh(dn, &info); } -EXPORT_SYMBOL_GPL(eeh_add_device_early); void eeh_add_device_tree_early(struct device_node *dn) { @@ -919,7 +916,6 @@ void eeh_add_device_late(struct pci_dev pci_addr_cache_insert_device (dev); } -EXPORT_SYMBOL_GPL(eeh_add_device_late); /** * eeh_remove_device - undo EEH setup for the indicated pci device @@ -928,7 +924,7 @@ EXPORT_SYMBOL_GPL(eeh_add_device_late); * This routine should be when a device is removed from a running * system (e.g. by hotplug or dlpar). 
*/ -void eeh_remove_device(struct pci_dev *dev) +static void eeh_remove_device(struct pci_dev *dev) { struct device_node *dn; if (!dev || !eeh_subsystem_enabled) @@ -944,21 +940,16 @@ void eeh_remove_device(struct pci_dev *d PCI_DN(dn)->pcidev = NULL; pci_dev_put (dev); } -EXPORT_SYMBOL_GPL(eeh_remove_device); void eeh_remove_bus_device(struct pci_dev *dev) { + struct pci_bus *bus = dev->subordinate; + struct pci_dev *child, *tmp; + eeh_remove_device(dev); - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { - struct pci_bus *bus = dev->subordinate; - struct list_head *ln; - if (!bus) - return; - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev) - eeh_remove_bus_device(pdev); - } + if (bus && dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + list_for_each_entry_safe(child, tmp, &bus->devices, bus_list) + eeh_remove_bus_device(child); } } EXPORT_SYMBOL_GPL(eeh_remove_bus_device); diff -puN include/asm-powerpc/eeh.h~eeh_cleanups include/asm-powerpc/eeh.h --- 2_6_linus/include/asm-powerpc/eeh.h~eeh_cleanups 2006-02-21 14:40:43.000000000 -0600 +++ 2_6_linus-johnrose/include/asm-powerpc/eeh.h 2006-02-21 14:55:34.000000000 -0600 @@ -50,33 +50,11 @@ unsigned long eeh_check_failure(const vo int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev); void __init pci_addr_cache_build(void); -/** - * eeh_add_device_early - * eeh_add_device_late - * - * Perform eeh initialization for devices added after boot. - * Call eeh_add_device_early before doing any i/o to the - * device (including config space i/o). Call eeh_add_device_late - * to finish the eeh setup for this device. 
- */ -void eeh_add_device_early(struct device_node *); void eeh_add_device_tree_early(struct device_node *); void eeh_add_device_late(struct pci_dev *); /** - * eeh_remove_device - undo EEH setup for the indicated pci device - * @dev: pci device to be removed - * - * This routine should be called when a device is removed from - * a running system (e.g. by hotplug or dlpar). It unregisters - * the PCI device from the EEH subsystem. I/O errors affecting - * this device will no longer be detected after this call; thus, - * i/o errors affecting this slot may leave this device unusable. - */ -void eeh_remove_device(struct pci_dev *); - -/** - * eeh_remove_device_recursive - undo EEH for device & children. + * eeh_remove_bus_device - undo EEH for device & children. * @dev: pci device to be removed * * As above, this removes the device; it also removes child @@ -114,12 +92,8 @@ static inline int eeh_dn_check_failure(s static inline void pci_addr_cache_build(void) { } -static inline void eeh_add_device_early(struct device_node *dn) { } - static inline void eeh_add_device_late(struct pci_dev *dev) { } -static inline void eeh_remove_device(struct pci_dev *dev) { } - static inline void eeh_add_device_tree_early(struct device_node *dn) { } static inline void eeh_remove_bus_device(struct pci_dev *dev) { } _ From ahuja at austin.ibm.com Wed Feb 22 10:07:15 2006 From: ahuja at austin.ibm.com (Manish Ahuja) Date: Tue, 21 Feb 2006 17:07:15 -0600 Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics In-Reply-To: <20060216091027.GA826@localhost.localdomain> References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com> <20060214183259.28a6a501.sfr@canb.auug.org.au> <43F40312.2020800@austin.ibm.com> <20060216091027.GA826@localhost.localdomain> Message-ID: <43FB9D23.8070207@austin.ibm.com> Added entry and exit points to system_call path. Got rid of lpaca variables. 
David Gibson wrote:

>On Wed, Feb 15, 2006 at 10:44:02PM -0600, Manish Ahuja wrote:
>[snip]
>
>>>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
>>>>===================================================================
>>>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c 2005-12-18 16:36:54.000000000 -0800
>>>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c 2006-01-17 21:20:25.000000000 -0800
>>>>@@ -243,6 +243,7 @@
>>>> struct thread_struct *new_thread, *old_thread;
>>>> unsigned long flags;
>>>> struct task_struct *last;
>>>>+ struct paca_struct *lpaca;
>>>>
>>>This could have been declared below (near pd)
>>>
>>Yes... But it seems fine there..
>>
>
>Actually, I've been trying to get rid of lpaca locals everywhere.
>Using get_paca() directly is barely more verbose, and usually clearer.
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cpu-acct.txt
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060221/6d21af2b/attachment.txt

From sfr at ozlabs.org Wed Feb 22 15:04:26 2006
From: sfr at ozlabs.org (Stephen Rothwell)
Date: Wed, 22 Feb 2006 15:04:26 +1100
Subject: Yahoo addresses delayed
Message-ID: <20060222150426.755ceb91.sfr@ozlabs.org>

Hi all,

This is just an email to let you all know that if you are subscribed to
any of these lists using a Yahoo email address, your copies of posts
will be delayed as someone has reported ozlabs.org to Yahoo as a spam
site!

As far as I know there has been no (or very little) spam through these
lists as they are set to member post only. If you do see spam on these
lists, please report it to abuse at ozlabs.org (and not Yahoo or
spamcop etc) so that we can try to fix the problem.

--
Cheers,
Stephen Rothwell sfr at ozlabs.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060222/41e08504/attachment.pgp

From paulus at samba.org Wed Feb 22 22:35:30 2006
From: paulus at samba.org (Paul Mackerras)
Date: Wed, 22 Feb 2006 22:35:30 +1100
Subject: [PATCH] Accurate task and cpu time accounting
Message-ID: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>

Here is a patch that implements accurate task and cpu time accounting
for 64-bit powerpc kernels. Instead of accounting a whole jiffy of
time to a task on a timer interrupt because that task happened to be
running at the time, we now account time in units of timebase ticks
according to the actual time spent in user mode and kernel mode.

To do this we read either the PURR (processor utilization of resources
register) on POWER5 machines or the timebase on other machines on each
entry to the kernel from usermode, each exit to usermode, on
transitions between process context, hard irq context and soft irq
context in kernel mode, and on context switches.

On POWER5 systems with shared-processor logical partitioning we also
read both the PURR and the timebase at each timer interrupt in order to
determine how much time has been taken by the hypervisor to run other
partitions ("steal" time).

This is all based quite heavily on what s390 does, and it uses the
generic interfaces that were added by the s390 developers, i.e.
account_system_time(), account_user_time(), etc.

This patch doesn't add any new interfaces between the kernel and
userspace, and doesn't change the units in which time is reported to
userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(),
times(), etc. Internally the various task and cpu times are stored in
timebase units, but they are converted to USER_HZ units (1/100th of a
second) when reported to userspace.
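As a rough illustration of that conversion, the sketch below scales a timebase tick count to USER_HZ units with a 0.64 fixed-point factor, which is how the comments in the time.c part of this patch describe the __cputime_*_factor values. It is a userspace approximation under stated assumptions: the patch computes its factors with div128_by_32(), whereas this sketch substitutes the GCC/Clang `unsigned __int128` extension, and the function names cputime_factor() and ticks_to_clock_t() are hypothetical.

```c
#include <stdint.h>

/* Reporting unit seen by userspace: 1/100th of a second. */
#define USER_HZ 100

/*
 * Build a 0.64 fixed-point factor: floor(units_per_sec * 2^64 / tb_ticks_per_sec).
 * Assumes units_per_sec < tb_ticks_per_sec so the result fits in 64 bits.
 * Hypothetical helper; uses the compiler's 128-bit integer extension.
 */
static uint64_t cputime_factor(uint64_t units_per_sec, uint64_t tb_ticks_per_sec)
{
	unsigned __int128 scaled = (unsigned __int128)units_per_sec << 64;

	return (uint64_t)(scaled / tb_ticks_per_sec);
}

/*
 * Convert timebase ticks to clock_t units: the high 64 bits of the
 * 128-bit product ticks * factor. The fractional part is discarded.
 */
static uint64_t ticks_to_clock_t(uint64_t ticks, uint64_t factor)
{
	return (uint64_t)(((unsigned __int128)ticks * factor) >> 64);
}
```

Because only the reported value is truncated and the tick counts themselves are kept at full resolution, rounding happens once per report rather than once per accounting event.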
Some precision is therefore lost but there should not be any
accumulating error, since the internal accumulation is at full
precision.

All of this is conditional on CONFIG_VIRT_CPU_ACCOUNTING. If that is
not set, we do tick-based approximate accounting as before.

Paul.

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 80d114a..707d079 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -238,6 +238,21 @@ config PPC_STD_MMU_32
 	def_bool y
 	depends on PPC_STD_MMU && PPC32
 
+config VIRT_CPU_ACCOUNTING
+	bool "Deterministic task and CPU time accounting"
+	depends on PPC64
+	default y
+	help
+	  Select this option to enable more accurate task and CPU time
+	  accounting. This is done by reading a CPU counter on each
+	  kernel entry and exit and on transitions within the kernel
+	  between system, softirq and hardirq state, so there is a
+	  small performance impact. This also enables accounting of
+	  stolen time on logically-partitioned systems running on
+	  IBM POWER5-based machines.
+
+	  If in doubt, say Y here.
+ config SMP depends on PPC_STD_MMU bool "Symmetric multi-processing support" diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 840aad4..18810ac 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -137,6 +137,9 @@ int main(void) DEFINE(PACAEMERGSP, offsetof(struct paca_struct, emergency_sp)); DEFINE(PACALPPACAPTR, offsetof(struct paca_struct, lppaca_ptr)); DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id)); + DEFINE(PACA_STARTPURR, offsetof(struct paca_struct, startpurr)); + DEFINE(PACA_USER_TIME, offsetof(struct paca_struct, user_time)); + DEFINE(PACA_SYSTEM_TIME, offsetof(struct paca_struct, system_time)); DEFINE(LPPACASRR0, offsetof(struct lppaca, saved_srr0)); DEFINE(LPPACASRR1, offsetof(struct lppaca, saved_srr1)); diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 388f861..df918f7 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -63,6 +63,7 @@ system_call_common: std r12,_MSR(r1) std r0,GPR0(r1) std r10,GPR1(r1) + ACCOUNT_CPU_USER_ENTRY(r10, r11) std r2,GPR2(r1) std r3,GPR3(r1) std r4,GPR4(r1) @@ -170,8 +171,9 @@ syscall_error_cont: stdcx. r0,0,r1 /* to clear the reservation */ andi. 
r6,r8,MSR_PR ld r4,_LINK(r1) - beq- 1f /* only restore r13 if */ - ld r13,GPR13(r1) /* returning to usermode */ + beq- 1f + ACCOUNT_CPU_USER_EXIT(r11, r12) + ld r13,GPR13(r1) /* only restore r13 if returning to usermode */ 1: ld r2,GPR2(r1) li r12,MSR_RI andc r11,r10,r12 @@ -538,6 +540,7 @@ restore: * userspace */ beq 1f + ACCOUNT_CPU_USER_EXIT(r3, r4) REST_GPR(13, r1) 1: ld r3,_CTR(r1) diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S index 2b03a09..40c813b 100644 --- a/arch/powerpc/kernel/head_64.S +++ b/arch/powerpc/kernel/head_64.S @@ -283,6 +283,7 @@ exception_marker: std r10,0(r1); /* make stack chain pointer */ \ std r0,GPR0(r1); /* save r0 in stackframe */ \ std r10,GPR1(r1); /* save r1 in stackframe */ \ + ACCOUNT_CPU_USER_ENTRY(r9, r10); \ std r2,GPR2(r1); /* save r2 in stackframe */ \ SAVE_4GPRS(3, r1); /* save r3 - r6 in stackframe */ \ SAVE_2GPRS(7, r1); /* save r7, r8 in stackframe */ \ @@ -858,6 +859,14 @@ fast_exception_return: ld r11,_NIP(r1) andi. r3,r12,MSR_RI /* check if RI is set */ beq- unrecov_fer + +#ifdef CONFIG_VIRT_CPU_ACCOUNTING + andi. 
r3,r12,MSR_PR + beq 2f + ACCOUNT_CPU_USER_EXIT(r3, r4) +2: +#endif + ld r3,_CCR(r1) ld r4,_LINK(r1) ld r5,_CTR(r1) diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index d1fffce..dea05b4 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -394,10 +394,24 @@ void irq_ctx_init(void) } } +static inline void do_softirq_onstack(void) +{ + struct thread_info *curtp, *irqtp; + + curtp = current_thread_info(); + irqtp = softirq_ctx[smp_processor_id()]; + irqtp->task = curtp->task; + call_do_softirq(irqtp); + irqtp->task = NULL; +} + +#else +#define do_softirq_onstack() __do_softirq() +#endif /* CONFIG_IRQSTACKS */ + void do_softirq(void) { unsigned long flags; - struct thread_info *curtp, *irqtp; if (in_interrupt()) return; @@ -405,19 +419,17 @@ void do_softirq(void) local_irq_save(flags); if (local_softirq_pending()) { - curtp = current_thread_info(); - irqtp = softirq_ctx[smp_processor_id()]; - irqtp->task = curtp->task; - call_do_softirq(irqtp); - irqtp->task = NULL; + account_system_vtime(current); + local_bh_disable(); + do_softirq_onstack(); + account_system_vtime(current); + __local_bh_enable(); } local_irq_restore(flags); } EXPORT_SYMBOL(do_softirq); -#endif /* CONFIG_IRQSTACKS */ - static int __init setup_noirqdistrib(char *str) { distribute_irqs = 0; diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 5770399..aeede05 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -330,6 +330,11 @@ struct task_struct *__switch_to(struct t #endif local_irq_save(flags); + + account_system_vtime(current); + account_process_vtime(current); + calculate_steal_time(); + last = _switch(old_thread, new_thread); local_irq_restore(flags); diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index 13595a6..805eaed 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -541,7 +541,7 
@@ int __devinit start_secondary(void *unus smp_ops->take_timebase(); if (system_state > SYSTEM_BOOTING) - per_cpu(last_jiffy, cpu) = get_tb(); + snapshot_timebase(); spin_lock(&call_lock); cpu_set(cpu, cpu_online_map); @@ -573,6 +573,8 @@ void __init smp_cpus_done(unsigned int m set_cpus_allowed(current, old_mask); + snapshot_timebases(); + dump_numa_cpu_topology(); } diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 2a7ddc5..8a57a38 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -51,6 +51,7 @@ #include #include #include +#include #include #include @@ -135,6 +136,220 @@ unsigned long tb_last_stamp; */ DEFINE_PER_CPU(unsigned long, last_jiffy); +#ifdef CONFIG_VIRT_CPU_ACCOUNTING +/* + * Factors for converting from cputime_t (timebase ticks) to + * jiffies, milliseconds, seconds, and clock_t (1/USER_HZ seconds). + * These are all stored as 0.64 fixed-point binary fractions. + */ +u64 __cputime_jiffies_factor; +u64 __cputime_msec_factor; +u64 __cputime_sec_factor; +u64 __cputime_clockt_factor; + +static void calc_cputime_factors(void) +{ + struct div_result res; + + div128_by_32(HZ, 0, tb_ticks_per_sec, &res); + __cputime_jiffies_factor = res.result_low; + div128_by_32(1000, 0, tb_ticks_per_sec, &res); + __cputime_msec_factor = res.result_low; + div128_by_32(1, 0, tb_ticks_per_sec, &res); + __cputime_sec_factor = res.result_low; + div128_by_32(USER_HZ, 0, tb_ticks_per_sec, &res); + __cputime_clockt_factor = res.result_low; +} + +/* + * Read the PURR on systems that have it, otherwise the timebase. + */ +static u64 read_purr(void) +{ + if (cpu_has_feature(CPU_FTR_PURR)) + return mfspr(SPRN_PURR); + return mftb(); +} + +/* + * Account time for a transition between system, hard irq + * or soft irq state. 
+ */ +void account_system_vtime(struct task_struct *tsk) +{ + u64 now, delta; + unsigned long flags; + + local_irq_save(flags); + now = read_purr(); + delta = now - get_paca()->startpurr; + get_paca()->startpurr = now; + if (!in_interrupt()) { + delta += get_paca()->system_time; + get_paca()->system_time = 0; + } + account_system_time(tsk, 0, delta); + local_irq_restore(flags); +} + +/* + * Transfer the user and system times accumulated in the paca + * by the exception entry and exit code to the generic process + * user and system time records. + * Must be called with interrupts disabled. + */ +void account_process_vtime(struct task_struct *tsk) +{ + cputime_t utime; + + utime = get_paca()->user_time; + get_paca()->user_time = 0; + account_user_time(tsk, utime); +} + +static void account_process_time(struct pt_regs *regs) +{ + int cpu = smp_processor_id(); + + account_process_vtime(current); + run_local_timers(); + if (rcu_pending(cpu)) + rcu_check_callbacks(cpu, user_mode(regs)); + scheduler_tick(); + run_posix_cpu_timers(current); +} + +#ifdef CONFIG_PPC_SPLPAR +/* + * Stuff for accounting stolen time. + */ +struct cpu_purr_data { + int initialized; /* thread is running */ + u64 tb0; /* timebase at origin time */ + u64 purr0; /* PURR at origin time */ + u64 tb; /* last TB value read */ + u64 purr; /* last PURR value read */ + u64 stolen; /* stolen time so far */ + spinlock_t lock; +}; + +static DEFINE_PER_CPU(struct cpu_purr_data, cpu_purr_data); + +static void snapshot_tb_and_purr(void *data) +{ + struct cpu_purr_data *p = &__get_cpu_var(cpu_purr_data); + + p->tb0 = mftb(); + p->purr0 = mfspr(SPRN_PURR); + p->tb = p->tb0; + p->purr = 0; + wmb(); + p->initialized = 1; +} + +/* + * Called during boot when all cpus have come up. 
+ */ +void snapshot_timebases(void) +{ + int cpu; + + if (!cpu_has_feature(CPU_FTR_PURR)) + return; + for_each_cpu(cpu) + spin_lock_init(&per_cpu(cpu_purr_data, cpu).lock); + on_each_cpu(snapshot_tb_and_purr, NULL, 0, 1); +} + +void calculate_steal_time(void) +{ + u64 tb, purr, t0; + s64 stolen; + struct cpu_purr_data *p0, *pme, *phim; + int cpu; + + if (!cpu_has_feature(CPU_FTR_PURR)) + return; + cpu = smp_processor_id(); + pme = &per_cpu(cpu_purr_data, cpu); + if (!pme->initialized) + return; /* this can happen in early boot */ + p0 = &per_cpu(cpu_purr_data, cpu & ~1); + phim = &per_cpu(cpu_purr_data, cpu ^ 1); + spin_lock(&p0->lock); + tb = mftb(); + purr = mfspr(SPRN_PURR) - pme->purr0; + if (!phim->initialized || !cpu_online(cpu ^ 1)) { + stolen = (tb - pme->tb) - (purr - pme->purr); + } else { + t0 = pme->tb0; + if (phim->tb0 < t0) + t0 = phim->tb0; + stolen = phim->tb - t0 - phim->purr - purr - p0->stolen; + } + if (stolen > 0) { + account_steal_time(current, stolen); + p0->stolen += stolen; + } + pme->tb = tb; + pme->purr = purr; + spin_unlock(&p0->lock); +} + +/* + * Must be called before the cpu is added to the online map when + * a cpu is being brought up at runtime. + */ +static void snapshot_purr(void) +{ + int cpu; + u64 purr; + struct cpu_purr_data *p0, *pme, *phim; + unsigned long flags; + + if (!cpu_has_feature(CPU_FTR_PURR)) + return; + cpu = smp_processor_id(); + pme = &per_cpu(cpu_purr_data, cpu); + p0 = &per_cpu(cpu_purr_data, cpu & ~1); + phim = &per_cpu(cpu_purr_data, cpu ^ 1); + spin_lock_irqsave(&p0->lock, flags); + pme->tb = pme->tb0 = mftb(); + purr = mfspr(SPRN_PURR); + if (!phim->initialized) { + pme->purr = 0; + pme->purr0 = purr; + } else { + /* set p->purr and p->purr0 for no change in p0->stolen */ + pme->purr = phim->tb - phim->tb0 - phim->purr - p0->stolen; + pme->purr0 = purr - pme->purr; + } + pme->initialized = 1; + spin_unlock_irqrestore(&p0->lock, flags); +} + +#endif /* CONFIG_PPC_SPLPAR */ + +#else /* ! 
CONFIG_VIRT_CPU_ACCOUNTING */ +#define calc_cputime_factors() +#define account_process_time(regs) update_process_times(user_mode(regs)) +#define calculate_steal_time() do { } while (0) +#endif + +#if !(defined(CONFIG_VIRT_CPU_ACCOUNTING) && defined(CONFIG_PPC_SPLPAR)) +#define snapshot_purr() do { } while (0) +#endif + +/* + * Called when a cpu comes up after the system has finished booting, + * i.e. as a result of a hotplug cpu action. + */ +void snapshot_timebase(void) +{ + __get_cpu_var(last_jiffy) = get_tb(); + snapshot_purr(); +} + void __delay(unsigned long loops) { unsigned long start; @@ -382,6 +597,7 @@ static void iSeries_tb_recal(void) new_tb_ticks_per_jiffy, sign, tick_diff ); tb_ticks_per_jiffy = new_tb_ticks_per_jiffy; tb_ticks_per_sec = new_tb_ticks_per_sec; + calc_cputime_factors(); div128_by_32( XSEC_PER_SEC, 0, tb_ticks_per_sec, &divres ); do_gtod.tb_ticks_per_sec = tb_ticks_per_sec; tb_to_xs = divres.result_low; @@ -430,6 +646,7 @@ void timer_interrupt(struct pt_regs * re irq_enter(); profile_tick(CPU_PROFILING, regs); + calculate_steal_time(); #ifdef CONFIG_PPC_ISERIES get_lppaca()->int_dword.fields.decr_int = 0; @@ -451,7 +668,7 @@ void timer_interrupt(struct pt_regs * re * is the case. */ if (!cpu_is_offline(cpu)) - update_process_times(user_mode(regs)); + account_process_time(regs); /* * No need to check whether cpu is offline here; boot_cpuid @@ -706,6 +923,7 @@ void __init time_init(void) tb_ticks_per_sec = ppc_tb_freq; tb_ticks_per_usec = ppc_tb_freq / 1000000; tb_to_us = mulhwu_scale_factor(ppc_tb_freq, 1000000); + calc_cputime_factors(); /* * Calculate the length of each tick in ns. 
It will not be diff --git a/include/asm-powerpc/cputable.h b/include/asm-powerpc/cputable.h index 6421054..f74d0ed 100644 --- a/include/asm-powerpc/cputable.h +++ b/include/asm-powerpc/cputable.h @@ -117,6 +117,7 @@ extern void do_cpu_ftr_fixups(unsigned l #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0000100000000000) #define CPU_FTR_PAUSE_ZERO ASM_CONST(0x0000200000000000) +#define CPU_FTR_PURR ASM_CONST(0x0000400000000000) #else /* ensure on 32b processors the flags are available for compiling but * don't do anything */ @@ -132,6 +133,7 @@ extern void do_cpu_ftr_fixups(unsigned l #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0) #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0) #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0) +#define CPU_FTR_PURR ASM_CONST(0x0) #endif #ifndef __ASSEMBLY__ @@ -313,7 +315,7 @@ enum { CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_MMCRA | CPU_FTR_SMT | CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | - CPU_FTR_MMCRA_SIHV, + CPU_FTR_MMCRA_SIHV | CPU_FTR_PURR, CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | @@ -326,7 +328,7 @@ enum { #ifdef __powerpc64__ CPU_FTRS_POWER3 | CPU_FTRS_RS64 | CPU_FTRS_POWER4 | CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | CPU_FTRS_CELL | - CPU_FTR_CI_LARGE_PAGE | + CPU_FTR_CI_LARGE_PAGE | CPU_FTR_PURR | #else #if CLASSIC_PPC CPU_FTRS_PPC601 | CPU_FTRS_603 | CPU_FTRS_604 | CPU_FTRS_740_NOTAU | diff --git a/include/asm-powerpc/cputime.h b/include/asm-powerpc/cputime.h index 6d68ad7..a21185d 100644 --- a/include/asm-powerpc/cputime.h +++ b/include/asm-powerpc/cputime.h @@ -1 +1,203 @@ +/* + * Definitions for measuring cputime on powerpc machines. + * + * Copyright (C) 2006 Paul Mackerras, IBM Corp. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * If we have CONFIG_VIRT_CPU_ACCOUNTING, we measure cpu time in + * the same units as the timebase. Otherwise we measure cpu time + * in jiffies using the generic definitions. + */ + +#ifndef __POWERPC_CPUTIME_H +#define __POWERPC_CPUTIME_H + +#ifndef CONFIG_VIRT_CPU_ACCOUNTING #include +#else + +#include +#include +#include +#include +#include + +typedef u64 cputime_t; +typedef u64 cputime64_t; + +#define cputime_zero ((cputime_t)0) +#define cputime_max ((~((cputime_t)0) >> 1) - 1) +#define cputime_add(__a, __b) ((__a) + (__b)) +#define cputime_sub(__a, __b) ((__a) - (__b)) +#define cputime_div(__a, __n) ((__a) / (__n)) +#define cputime_halve(__a) ((__a) >> 1) +#define cputime_eq(__a, __b) ((__a) == (__b)) +#define cputime_gt(__a, __b) ((__a) > (__b)) +#define cputime_ge(__a, __b) ((__a) >= (__b)) +#define cputime_lt(__a, __b) ((__a) < (__b)) +#define cputime_le(__a, __b) ((__a) <= (__b)) + +#define cputime64_zero ((cputime64_t)0) +#define cputime64_add(__a, __b) ((__a) + (__b)) +#define cputime_to_cputime64(__ct) (__ct) + +#ifdef __KERNEL__ + +/* + * Convert cputime <-> jiffies + */ +extern u64 __cputime_jiffies_factor; + +static inline unsigned long cputime_to_jiffies(const cputime_t ct) +{ + return mulhdu(ct, __cputime_jiffies_factor); +} + +static inline cputime_t jiffies_to_cputime(const unsigned long jif) +{ + cputime_t ct; + unsigned long sec; + + /* have to be a little careful about overflow */ + ct = jif % HZ; + sec = jif / HZ; + if (ct) { + ct *= tb_ticks_per_sec; + do_div(ct, HZ); + } + if (sec) + ct += (cputime_t) sec * tb_ticks_per_sec; + return ct; +} + +static inline u64 cputime64_to_jiffies64(const cputime_t ct) +{ + return mulhdu(ct, __cputime_jiffies_factor); +} + +/* + * Convert 
cputime <-> milliseconds + */ +extern u64 __cputime_msec_factor; + +static inline unsigned long cputime_to_msecs(const cputime_t ct) +{ + return mulhdu(ct, __cputime_msec_factor); +} + +static inline cputime_t msecs_to_cputime(const unsigned long ms) +{ + cputime_t ct; + unsigned long sec; + + /* have to be a little careful about overflow */ + ct = ms % 1000; + sec = ms / 1000; + if (ct) { + ct *= tb_ticks_per_sec; + do_div(ct, 1000); + } + if (sec) + ct += (cputime_t) sec * tb_ticks_per_sec; + return ct; +} + +/* + * Convert cputime <-> seconds + */ +extern u64 __cputime_sec_factor; + +static inline unsigned long cputime_to_secs(const cputime_t ct) +{ + return mulhdu(ct, __cputime_sec_factor); +} + +static inline cputime_t secs_to_cputime(const unsigned long sec) +{ + return (cputime_t) sec * tb_ticks_per_sec; +} + +/* + * Convert cputime <-> timespec + */ +static inline void cputime_to_timespec(const cputime_t ct, struct timespec *p) +{ + u64 x = ct; + unsigned int frac; + + frac = do_div(x, tb_ticks_per_sec); + p->tv_sec = x; + x = (u64) frac * 1000000000; + do_div(x, tb_ticks_per_sec); + p->tv_nsec = x; +} + +static inline cputime_t timespec_to_cputime(const struct timespec *p) +{ + cputime_t ct; + + ct = (u64) p->tv_nsec * tb_ticks_per_sec; + do_div(ct, 1000000000); + return ct + (u64) p->tv_sec * tb_ticks_per_sec; +} + +/* + * Convert cputime <-> timeval + */ +static inline void cputime_to_timeval(const cputime_t ct, struct timeval *p) +{ + u64 x = ct; + unsigned int frac; + + frac = do_div(x, tb_ticks_per_sec); + p->tv_sec = x; + x = (u64) frac * 1000000; + do_div(x, tb_ticks_per_sec); + p->tv_usec = x; +} + +static inline cputime_t timeval_to_cputime(const struct timeval *p) +{ + cputime_t ct; + + ct = (u64) p->tv_usec * tb_ticks_per_sec; + do_div(ct, 1000000); + return ct + (u64) p->tv_sec * tb_ticks_per_sec; +} + +/* + * Convert cputime <-> clock_t (units of 1/USER_HZ seconds) + */ +extern u64 __cputime_clockt_factor; + +static inline unsigned long 
cputime_to_clock_t(const cputime_t ct) +{ + return mulhdu(ct, __cputime_clockt_factor); +} + +static inline cputime_t clock_t_to_cputime(const unsigned long clk) +{ + cputime_t ct; + unsigned long sec; + + /* have to be a little careful about overflow */ + ct = clk % USER_HZ; + sec = clk / USER_HZ; + if (ct) { + ct *= tb_ticks_per_sec; + do_div(ct, USER_HZ); + } + if (sec) + ct += (cputime_t) sec * tb_ticks_per_sec; + return ct; +} + +#define cputime64_to_clock_t(ct) cputime_to_clock_t((cputime_t)(ct)) + +#endif /* __KERNEL__ */ +#endif /* CONFIG_VIRT_CPU_ACCOUNTING */ +#endif /* __POWERPC_CPUTIME_H */ diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h index 8eb7e85..51f87d9 100644 --- a/include/asm-powerpc/irq.h +++ b/include/asm-powerpc/irq.h @@ -479,6 +479,10 @@ extern int distribute_irqs; struct irqaction; struct pt_regs; +#define __ARCH_HAS_DO_SOFTIRQ + +extern void __do_softirq(void); + #ifdef CONFIG_IRQSTACKS /* * Per-cpu stacks for handling hard and soft interrupts. 
@@ -491,8 +495,6 @@ extern void call_do_softirq(struct threa extern int call___do_IRQ(int irq, struct pt_regs *regs, struct thread_info *tp); -#define __ARCH_HAS_DO_SOFTIRQ - #else #define irq_ctx_init() diff --git a/include/asm-powerpc/paca.h b/include/asm-powerpc/paca.h index c9add8f..4cd1a95 100644 --- a/include/asm-powerpc/paca.h +++ b/include/asm-powerpc/paca.h @@ -96,6 +96,11 @@ struct paca_struct { u64 saved_r1; /* r1 save for RTAS calls */ u64 saved_msr; /* MSR saved here by enter_rtas */ u8 proc_enabled; /* irq soft-enable flag */ + + /* Stuff for accurate time accounting */ + u64 user_time; /* accumulated usermode TB ticks */ + u64 system_time; /* accumulated system TB ticks */ + u64 startpurr; /* PURR/TB value snapshot */ }; extern struct paca_struct paca[]; diff --git a/include/asm-powerpc/ppc_asm.h b/include/asm-powerpc/ppc_asm.h index ab8688d..dd1c0a9 100644 --- a/include/asm-powerpc/ppc_asm.h +++ b/include/asm-powerpc/ppc_asm.h @@ -15,6 +15,48 @@ #define SZL (BITS_PER_LONG/8) /* + * Stuff for accurate CPU time accounting. + * These macros handle transitions between user and system state + * in exception entry and exit and accumulate time to the + * user_time and system_time fields in the paca. + */ + +#ifndef CONFIG_VIRT_CPU_ACCOUNTING +#define ACCOUNT_CPU_USER_ENTRY(ra, rb) +#define ACCOUNT_CPU_USER_EXIT(ra, rb) +#else +#define ACCOUNT_CPU_USER_ENTRY(ra, rb) \ + beq 2f; /* if from kernel mode */ \ +BEGIN_FTR_SECTION; \ + mfspr ra,SPRN_PURR; /* get processor util. reg */ \ +END_FTR_SECTION_IFSET(CPU_FTR_PURR); \ +BEGIN_FTR_SECTION; \ + mftb ra; /* or get TB if no PURR */ \ +END_FTR_SECTION_IFCLR(CPU_FTR_PURR); \ + ld rb,PACA_STARTPURR(r13); \ + std ra,PACA_STARTPURR(r13); \ + subf rb,rb,ra; /* subtract start value */ \ + ld ra,PACA_USER_TIME(r13); \ + add ra,ra,rb; /* add on to user time */ \ + std ra,PACA_USER_TIME(r13); \ +2: + +#define ACCOUNT_CPU_USER_EXIT(ra, rb) \ +BEGIN_FTR_SECTION; \ + mfspr ra,SPRN_PURR; /* get processor util. 
reg */ \ +END_FTR_SECTION_IFSET(CPU_FTR_PURR); \ +BEGIN_FTR_SECTION; \ + mftb ra; /* or get TB if no PURR */ \ +END_FTR_SECTION_IFCLR(CPU_FTR_PURR); \ + ld rb,PACA_STARTPURR(r13); \ + std ra,PACA_STARTPURR(r13); \ + subf rb,rb,ra; /* subtract start value */ \ + ld ra,PACA_SYSTEM_TIME(r13); \ + add ra,ra,rb; /* add on to system time */ \ + std ra,PACA_SYSTEM_TIME(r13); +#endif + +/* * Macros for storing registers into and loading registers from * exception frames. */ diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h index d9bf536..41b7a5b 100644 --- a/include/asm-powerpc/system.h +++ b/include/asm-powerpc/system.h @@ -424,5 +424,9 @@ static inline void create_function_call( create_branch(addr, func_addr, BRANCH_SET_LINK); } +#ifdef CONFIG_VIRT_CPU_ACCOUNTING +extern void account_system_vtime(struct task_struct *); +#endif + #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_SYSTEM_H */ diff --git a/include/asm-powerpc/time.h b/include/asm-powerpc/time.h index baddc9a..912118d 100644 --- a/include/asm-powerpc/time.h +++ b/include/asm-powerpc/time.h @@ -41,6 +41,7 @@ extern time_t last_rtc_update; extern void generic_calibrate_decr(void); extern void wakeup_decrementer(void); +extern void snapshot_timebase(void); /* Some sane defaults: 125 MHz timebase, 1GHz processor */ extern unsigned long ppc_proc_freq; @@ -221,5 +222,19 @@ struct cpu_usage { DECLARE_PER_CPU(struct cpu_usage, cpu_usage_array); +#ifdef CONFIG_VIRT_CPU_ACCOUNTING +extern void account_process_vtime(struct task_struct *tsk); +#else +#define account_process_vtime(tsk) do { } while (0) +#endif + +#if defined(CONFIG_VIRT_CPU_ACCOUNTING) && defined(CONFIG_PPC_SPLPAR) +extern void calculate_steal_time(void); +extern void snapshot_timebases(void); +#else +#define calculate_steal_time() do { } while (0) +#define snapshot_timebases() do { } while (0) +#endif + #endif /* __KERNEL__ */ #endif /* __PPC64_TIME_H */ From olh at suse.de Thu Feb 23 00:35:51 2006 From: olh at suse.de (Olaf Hering)
Date: Wed, 22 Feb 2006 14:35:51 +0100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
Message-ID: <20060222133551.GA30355@suse.de>

On Wed, Feb 22, Paul Mackerras wrote:

> All of this is conditional on CONFIG_VIRT_CPU_ACCOUNTING. If that is
> not set, we do tick-based approximate accounting as before.

arch/powerpc/kernel/process.c: In function '__switch_to':
arch/powerpc/kernel/process.c:335: error: implicit declaration of function 'account_process_vtime'
arch/powerpc/kernel/process.c:336: error: implicit declaration of function 'calculate_steal_time'
make[1]: *** [arch/powerpc/kernel/process.o] Error 1

This change fixes it. But it will not link 32bit:

kernel/built-in.o(.text+0xbd88): In function `irq_exit':
: undefined reference to `do_softirq'
kernel/built-in.o(.text+0xbde0): In function `local_bh_enable':
: undefined reference to `do_softirq'
kernel/built-in.o(.text+0xbe60): In function `ksoftirqd':
: undefined reference to `do_softirq'
net/built-in.o(.text+0xd6fc): In function `netif_rx_ni':
: undefined reference to `do_softirq'

I think the placement of __ARCH_HAS_DO_SOFTIRQ needs adjustment, or the
code must be moved out of CONFIG_PPC64 in arch/powerpc/kernel/irq.c.

Index: linux-2.6.15/arch/powerpc/kernel/process.c =================================================================== --- linux-2.6.15.orig/arch/powerpc/kernel/process.c +++ linux-2.6.15/arch/powerpc/kernel/process.c @@ -49,8 +49,8 @@ #include #ifdef CONFIG_PPC64 #include -#include #endif +#include extern unsigned long _get_SP(void); From sharada at in.ibm.com Thu Feb 23 03:13:08 2006 From: sharada at in.ibm.com (R Sharada) Date: Wed, 22 Feb 2006 21:43:08 +0530 Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear In-Reply-To: <20060221054448.GA1695@in.ibm.com> References: <20060221054448.GA1695@in.ibm.com> Message-ID: <20060222161308.GA6356@in.ibm.com> Ok, I realized I did not have to add the extra variables and could have done it cleaner. Also, Michael suggested adding a comment why we were replacing the call to tlbie() with __tlbie(). So, here is a revised version. With this fix, I am able to kexec and kdump boot successfully on p630 non-lpar mode running 2.6.16-rc4. Since without this fix kexec is currently broken on non-lpar, please consider for inclusion in 2.6.16 Thanks and Regards, Sharada native_hpte_clear has a spinlock recursion problem with the native_tlbie_lock being called twice, once in native_hpte_clear() and once within tlbie(). Fix the problem by changing the call to tlbie() in native_hpte_clear() to __tlbie(). It still supports only 4k pages for now. Signed-off-by: R Sharada --- diff -puN arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear arch/powerpc/mm/hash_native_64.c --- linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear 2006-02-22 21:22:42.000000000 +0530 +++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c 2006-02-22 21:26:25.000000000 +0530 @@ -403,12 +403,16 @@ static void native_hpte_clear(void) */ hpte_v = hptep->v; + /* tlbie() takes the native_tlbie_lock. 
hence change the + * tlbie() call here to __tlbie() + */ if (hpte_v & HPTE_V_VALID) { hptep->v = 0; - tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K, 0); + __tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K); } } + asm volatile("eieio; tlbsync; ptesync":::"memory"); spin_unlock(&native_tlbie_lock); local_irq_restore(flags); } _ From geoffrey.levand at am.sony.com Thu Feb 23 09:50:26 2006 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Wed, 22 Feb 2006 14:50:26 -0800 Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear In-Reply-To: <20060222161308.GA6356@in.ibm.com> References: <20060222161308.GA6356@in.ibm.com> Message-ID: <43FCEAB2.2020702@am.sony.com> R Sharada wrote: > linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear > 2006-02-22 21:22:42.000000000 +0530 > +++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c > 2006-02-22 21:26:25.000000000 +0530 > @@ -403,12 +403,16 @@ static void native_hpte_clear(void) > */ > hpte_v = hptep->v; > > + /* tlbie() takes the native_tlbie_lock. hence change the > + * tlbie() call here to __tlbie() > + */ Once the patch is applied, the tlbie() call disappears and you have comment that doesn't make sense in the new context. Maybe you should reconsider the wording. 
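
For readers following the thread, here is a minimal user-space sketch of the locking pattern under discussion. It assumes a toy non-recursive lock in place of native_tlbie_lock, and every demo_* name is hypothetical, not a kernel API. It only illustrates why a function that already holds the lock has to call the raw helper rather than the locking wrapper.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for native_tlbie_lock: a toy, non-recursive lock. */
static bool demo_lock_held;
/* Counts simulated invalidations so the effect is observable. */
static int demo_invalidations;

void demo_lock(void)
{
	/* Taking the lock twice on one path is the "spinlock recursion" bug. */
	assert(!demo_lock_held);
	demo_lock_held = true;
}

void demo_unlock(void)
{
	assert(demo_lock_held);
	demo_lock_held = false;
}

/* Raw helper, in the role of __tlbie(): caller must already hold the lock. */
void demo_invalidate_locked(void)
{
	assert(demo_lock_held);
	demo_invalidations++;
}

/* Locking wrapper, in the role of tlbie(): takes the lock itself. */
void demo_invalidate(void)
{
	demo_lock();
	demo_invalidate_locked();
	demo_unlock();
}

/* Bulk clear, in the role of native_hpte_clear(): holds the lock for the loop. */
int demo_clear_all(int nslots)
{
	int i;

	demo_lock();
	for (i = 0; i < nslots; i++)
		demo_invalidate_locked();	/* demo_invalidate() here would trip the assert */
	demo_unlock();
	return demo_invalidations;
}
```

Swapping demo_invalidate_locked() for demo_invalidate() inside the loop makes demo_lock() fire its assertion on the first iteration, which is the toy equivalent of the recursion the patch removes.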
>  		if (hpte_v & HPTE_V_VALID) {
>  			hptep->v = 0;
> -			tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K, 0);
> +			__tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K);
>  		}
>  	}
>
> +	asm volatile("eieio; tlbsync; ptesync":::"memory");
>  	spin_unlock(&native_tlbie_lock);
>  	local_irq_restore(flags);
>  }

From michael at ellerman.id.au Thu Feb 23 10:49:18 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 23 Feb 2006 10:49:18 +1100
Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear
In-Reply-To: <43FCEAB2.2020702@am.sony.com>
References: <20060222161308.GA6356@in.ibm.com> <43FCEAB2.2020702@am.sony.com>
Message-ID: <200602231049.23107.michael@ellerman.id.au>

On Thu, 23 Feb 2006 09:50, Geoff Levand wrote:
> R Sharada wrote:
> > linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear
> > 2006-02-22 21:22:42.000000000 +0530
> > +++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c
> > 2006-02-22 21:26:25.000000000 +0530
> > @@ -403,12 +403,16 @@ static void native_hpte_clear(void)
> >  			 */
> >  			hpte_v = hptep->v;
> >
> > +			/* tlbie() takes the native_tlbie_lock. hence change the
> > +			 * tlbie() call here to __tlbie()
> > +			 */
>
> Once the patch is applied, the tlbie() call disappears and you have
> comment that doesn't make sense in the new context. Maybe you should
> reconsider the wording.

Yeah I agree with Geoff. The point is that we already hold the tlbie lock, so
we can't call something that takes it again.

cheers

-- 
Michael Ellerman
IBM OzLabs

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From pj at sgi.com Thu Feb 23 11:50:09 2006
From: pj at sgi.com (Paul Jackson)
Date: Wed, 22 Feb 2006 16:50:09 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was: msi support)
In-Reply-To: <20060203202742.1e514fcc.akpm@osdl.org>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
	<20060203201441.194be500.pj@sgi.com>
	<20060203202531.27d685fa.akpm@osdl.org>
	<20060203202742.1e514fcc.akpm@osdl.org>
Message-ID: <20060222165009.6493e6a1.pj@sgi.com>

On Feb 3, Andrew wrote:
> Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
> git-ia64.patch when gregkh-pci-altix-msi-support.patch

Is it time to reinsert that patch?

My ia64 sn build fails again, complaining:

===========================
  CC      arch/ia64/sn/pci/tioce_provider.o
arch/ia64/sn/pci/tioce_provider.c:720:46: macro "ATE_MAKE" requires 3 arguments, but only 2 given
===========================

Your broken-out/series file (2.6.16-rc4-mm1) has the lines:

  # Need this when gregkh-pci-altix-msi-support.patch comes back
  #gregkh-pci-altix-msi-support-git-ia64-fix.patch

I guess that is this patch below, which fixes my sn build just fine.

Holler if you need it as a proper patch.

--- 2.6.16-rc4-mm1.orig/arch/ia64/sn/pci/tioce_provider.c	2006-02-22 16:21:52.054985166 -0800
+++ 2.6.16-rc4-mm1/arch/ia64/sn/pci/tioce_provider.c	2006-02-22 16:31:21.594755653 -0800
@@ -717,7 +717,7 @@ tioce_reserve_m32(struct tioce_kernel *c
 	while (ate_index <= last_ate) {
 		u64 ate;
 
-		ate = ATE_MAKE(0xdeadbeef, ps);
+		ate = ATE_MAKE(0xdeadbeef, ps, 0);
 		ce_kern->ce_ate3240_shadow[ate_index] = ate;
 		tioce_mmr_storei(ce_kern, &ce_mmr->ce_ure_ate3240[ate_index],
 				 ate);

-- 
I won't rest till it's the best ...
Programmer, Linux Scalability Paul Jackson 1.925.600.0401 From akpm at osdl.org Thu Feb 23 12:01:42 2006 From: akpm at osdl.org (Andrew Morton) Date: Wed, 22 Feb 2006 17:01:42 -0800 Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was: msi support) In-Reply-To: <20060222165009.6493e6a1.pj@sgi.com> References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com> <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com> <20060203201441.194be500.pj@sgi.com> <20060203202531.27d685fa.akpm@osdl.org> <20060203202742.1e514fcc.akpm@osdl.org> <20060222165009.6493e6a1.pj@sgi.com> Message-ID: <20060222170142.497eaac3.akpm@osdl.org> Paul Jackson wrote: > > Your broken-out/series file (2.6.16-rc4-mm1) has the lines: > > # Need this when gregkh-pci-altix-msi-support.patch comes back > #gregkh-pci-altix-msi-support-git-ia64-fix.patch Bah. I resurrected it, thanks. From kelly at au1.ibm.com Thu Feb 23 14:32:59 2006 From: kelly at au1.ibm.com (Kelly Daly) Date: Thu, 23 Feb 2006 14:32:59 +1100 Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries Message-ID: <200602231432.59472.kelly@au.ibm.com> disable OProfile in Kconfig for iSeries to prevent hangs. OProfile was not originally intended to work with legacy iSeries. Signed-off-by: Kelly Daly --- hi Paulus, could you push this up to the 2.6.16 release please? 
K diff -urpN linux-2.6.15.4/arch/powerpc/oprofile/Kconfig linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig --- linux-2.6.15.4/arch/powerpc/oprofile/Kconfig 2006-02-10 18:22:48.000000000 +1100 +++ linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig 2006-02-23 13:09:00.000000000 +1100 @@ -1,4 +1,5 @@ config PROFILING + depends on !PPC_ISERIES bool "Profiling support (EXPERIMENTAL)" help Say Y here to enable the extended profiling support mechanisms used From paulus at samba.org Thu Feb 23 15:12:52 2006 From: paulus at samba.org (Paul Mackerras) Date: Thu, 23 Feb 2006 15:12:52 +1100 Subject: [PATCH] Accurate task and cpu time accounting In-Reply-To: <20060222133551.GA30355@suse.de> References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> <20060222133551.GA30355@suse.de> Message-ID: <17405.13892.370606.476003@cargo.ozlabs.ibm.com> Olaf Hering writes: > This change fixes it. But it will not link 32bit: I didn't notice that stuff was inside an #ifdef CONFIG_PPC64 block. Easily fixed... Paul. diff -urN a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c --- a/arch/powerpc/kernel/irq.c 2006-02-22 09:44:33.000000000 +1100 +++ b/arch/powerpc/kernel/irq.c 2006-02-23 15:10:51.000000000 +1100 @@ -371,6 +371,7 @@ return NO_IRQ; } +#endif /* CONFIG_PPC64 */ #ifdef CONFIG_IRQSTACKS struct thread_info *softirq_ctx[NR_CPUS]; @@ -430,6 +431,7 @@ } EXPORT_SYMBOL(do_softirq); +#ifdef CONFIG_PPC64 static int __init setup_noirqdistrib(char *str) { distribute_irqs = 0; From sharada at in.ibm.com Thu Feb 23 15:59:39 2006 From: sharada at in.ibm.com (R Sharada) Date: Thu, 23 Feb 2006 10:29:39 +0530 Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear In-Reply-To: <200602231049.23107.michael@ellerman.id.au> References: <20060222161308.GA6356@in.ibm.com> <43FCEAB2.2020702@am.sony.com> <200602231049.23107.michael@ellerman.id.au> Message-ID: <20060223045939.GA2151@in.ibm.com> Would something to this effect be more appropriate? 
/* we already hold the native_tlbie_lock before getting here. So, cannot * take it back again. So call raw __tlbie() in here */ Thanks and Regards, Sharada On Thu, Feb 23, 2006 at 10:49:18AM +1100, Michael Ellerman wrote: > On Thu, 23 Feb 2006 09:50, Geoff Levand wrote: > > R Sharada wrote: > > > linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear > > > 2006-02-22 21:22:42.000000000 +0530 > > > +++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c > > > 2006-02-22 21:26:25.000000000 +0530 > > > @@ -403,12 +403,16 @@ static void native_hpte_clear(void) > > > */ > > > hpte_v = hptep->v; > > > > > > + /* tlbie() takes the native_tlbie_lock. hence change the > > > + * tlbie() call here to __tlbie() > > > + */ > > > > Once the patch is applied, the tlbie() call disappears and you have > > comment that doesn't make sense in the new context. Maybe you should > > reconsider the wording. > > Yeah I agree with Geoff. The point is that we already hold the tlbie lock, so > we can't call something that takes it again. > > cheers > > -- > Michael Ellerman > IBM OzLabs > > wwweb: http://michael.ellerman.id.au > phone: +61 2 6212 1183 (tie line 70 21183) > > We do not inherit the earth from our ancestors, > we borrow it from our children. - S.M.A.R.T Person From michael at ellerman.id.au Thu Feb 23 16:55:22 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 23 Feb 2006 16:55:22 +1100 Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries In-Reply-To: <200602231432.59472.kelly@au.ibm.com> References: <200602231432.59472.kelly@au.ibm.com> Message-ID: <200602231655.26244.michael@ellerman.id.au> On Thu, 23 Feb 2006 14:32, Kelly Daly wrote: > disable OProfile in Kconfig for iSeries to prevent hangs. OProfile was not > originally intended to work with legacy iSeries. 
>
> diff -urpN linux-2.6.15.4/arch/powerpc/oprofile/Kconfig
> linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig ---
> linux-2.6.15.4/arch/powerpc/oprofile/Kconfig 2006-02-10 18:22:48.000000000
> +1100 +++ linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig 2006-02-23
> 13:09:00.000000000 +1100 @@ -1,4 +1,5 @@
>  config PROFILING
> +	depends on !PPC_ISERIES

We've been trying to avoid !ISERIES compile time checks because they're a
barrier to the mythical combined kernel. I haven't looked at the oprofile
code, but is there an easy way to turn this into a
firmware_has_feature(ISERIES) check?

cheers

-- 
Michael Ellerman
IBM OzLabs

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From arnd at arndb.de Thu Feb 23 20:39:45 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 23 Feb 2006 10:39:45 +0100
Subject: [FYI/PATCH 0/2] what is left on cell
Message-ID: <200602231039.45554.arnd@arndb.de>

On the way to getting everything together for a new binary kernel
on bsc.es and other distributors, we found two more patches to
be missing. These are not for inclusion in the mainline kernel,
but needed for now.

From arnd at arndb.de Thu Feb 23 20:46:52 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 23 Feb 2006 10:46:52 +0100
Subject: [FYI/PATCH 2/2] fix previous interrupt controller rework patch
In-Reply-To: <200602231039.45554.arnd@arndb.de>
References: <200602231039.45554.arnd@arndb.de>
Message-ID: <200602231046.52479.arnd@arndb.de>

This fixes a bug in the patch at
http://patchwork.ozlabs.org/linuxppc64/patch?id=4188.

I still haven't received feedback on the implementation itself
of that patch, but for now let's assume that we do it that way.
I'll submit a fixed patch for the interrupt controller rework
for inclusion in 2.6.17-rc then.

--- linux-2.6.16-rc1.orig/arch/powerpc/platforms/cell/spider-pic.c
+++ linux-2.6.16-rc1/arch/powerpc/platforms/cell/spider-pic.c
@@ -196,10 +196,11 @@ void spider_init_IRQ(void)
 
 		if (strstr(compatible, "CBEA,platform-spider-pic"))
 			spider_reg = *(long *)get_property(dn,"reg", NULL);
-		else {
+		else if (strstr(compatible, "sti,platform-spider-pic")) {
 			spider_init_IRQ_hardcoded();
 			return;
-		}
+		} else
+			continue;
 
 		if (!spider_reg)
 			printk("interrupt controller does not have reg property !\n");

From arnd at arndb.de Thu Feb 23 20:41:14 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 23 Feb 2006 10:41:14 +0100
Subject: [FYI/PATCH 1/2] small hacks for running on BPA hardware, v4
In-Reply-To: <200602231039.45554.arnd@arndb.de>
References: <200602231039.45554.arnd@arndb.de>
Message-ID: <200602231041.14418.arnd@arndb.de>

The things done in here are workarounds for deficiencies in the
firmware that will be solved there in later releases.

Signed-off-by: Arnd Bergmann Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/Makefile =================================================================== --- linux-2.6.16-rc.orig/arch/powerpc/platforms/cell/Makefile +++ linux-2.6.16-rc/arch/powerpc/platforms/cell/Makefile @@ -1,5 +1,5 @@ obj-y += interrupt.o iommu.o setup.o spider-pic.o -obj-y += pervasive.o +obj-y += pervasive.o pci.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SPU_FS) += spufs/ spu-base.o Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/pci.c =================================================================== --- /dev/null +++ linux-2.6.16-rc/arch/powerpc/platforms/cell/pci.c @@ -0,0 +1,82 @@ +/* + * Cell specific PCI code + * + * Copyright (C) 2005 IBM Corporation, + Arnd Bergmann + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include + +#include +#include +#include + +#include "interrupt.h" + +void __init cell_final_fixup(void) +{ + struct pci_dev *dev = NULL; + + //phbs_remap_io(); + + for_each_pci_dev(dev) { + // FIXME: fix IRQ numbers for devices on second south bridge + } +} + +static void fixup_spider_ipci_irq(struct pci_dev* dev) +{ + int irq_node_offset; + pr_debug("fixup for %04x:%04x at %02x.%1x: ", dev->vendor, dev->device, + PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn)); + switch (dev->devfn) { + case PCI_DEVFN(3,0): + /* ethernet */ + dev->irq = 8; + break; + case PCI_DEVFN(5,0): + /* OHCI 0 */ + dev->irq = 10; + break; + case PCI_DEVFN(6,0): + /* OHCI 1 */ + dev->irq = 11; + break; + case PCI_DEVFN(5,1): + /* EHCI 0 */ + dev->irq = 10; + break; + case PCI_DEVFN(6,1): + /* EHCI 1 */ + dev->irq = 11; + break; + } + + irq_node_offset = IIC_NODE_STRIDE * (pci_domain_nr(dev->bus)-1); + dev->irq += irq_node_offset; + + pr_debug("irq %0x\n", dev->irq); +} + +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA_2, + PCI_DEVICE_ID_TOSHIBA_SPIDER_NET, fixup_spider_ipci_irq); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA_2, + PCI_DEVICE_ID_TOSHIBA_SPIDER_OHCI, fixup_spider_ipci_irq); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA_2, + PCI_DEVICE_ID_TOSHIBA_SPIDER_EHCI, fixup_spider_ipci_irq); Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/setup.c =================================================================== --- linux-2.6.16-rc.orig/arch/powerpc/platforms/cell/setup.c +++ linux-2.6.16-rc/arch/powerpc/platforms/cell/setup.c @@ -58,6 +58,7 @@ #else #define DBG(fmt...) 
#endif +extern void cell_final_fixup(void); void cell_show_cpuinfo(struct seq_file *m) { @@ -297,6 +298,7 @@ struct machdep_calls __initdata cell_md .setup_arch = cell_setup_arch, .init_early = cell_init_early, .show_cpuinfo = cell_show_cpuinfo, + .pcibios_fixup = cell_final_fixup, .restart = rtas_restart, .power_off = rtas_power_off, .halt = rtas_halt, Index: linux-2.6.16-rc/include/linux/pci_ids.h =================================================================== --- linux-2.6.16-rc.orig/include/linux/pci_ids.h +++ linux-2.6.16-rc/include/linux/pci_ids.h @@ -1367,6 +1367,8 @@ #define PCI_DEVICE_ID_TOSHIBA_TC35815CF 0x0030 #define PCI_DEVICE_ID_TOSHIBA_TC86C001_MISC 0x0108 #define PCI_DEVICE_ID_TOSHIBA_SPIDER_NET 0x01b3 +#define PCI_DEVICE_ID_TOSHIBA_SPIDER_OHCI 0x01b6 +#define PCI_DEVICE_ID_TOSHIBA_SPIDER_EHCI 0x01b5 #define PCI_VENDOR_ID_RICOH 0x1180 #define PCI_DEVICE_ID_RICOH_RL5C465 0x0465 Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/spu_base.c =================================================================== --- linux-2.6.16-rc.orig/arch/powerpc/platforms/cell/spu_base.c +++ linux-2.6.16-rc/arch/powerpc/platforms/cell/spu_base.c @@ -534,6 +534,10 @@ static void __iomem * __init map_spe_pro prop = p; + /* FIXME: Firmware bug */ + if (strcmp (name, "priv2") == 0 && prop->len < 0x20000) + return ioremap(prop->address, 0x20000); + return ioremap(prop->address, prop->len); } From paulus at samba.org Thu Feb 23 21:35:02 2006 From: paulus at samba.org (Paul Mackerras) Date: Thu, 23 Feb 2006 21:35:02 +1100 Subject: [patch] powerpc: native atomic_add_unless In-Reply-To: <20060121112536.GA27505@wotan.suse.de> References: <20060121112536.GA27505@wotan.suse.de> Message-ID: <17405.36822.685629.515591@cargo.ozlabs.ibm.com> Nick Piggin writes: > atomic_add_unless (atomic_inc_not_zero) is used in several hot paths in the > vfs and I'm planning some uses in the memory manager, so it should be as > small and fast as possible. 
> > Joel had a good suggestion to save a register but all bugs are mine. > > Comments? The implementation looks OK. I would be interested to know if this actually makes any measurable difference though. Paul. From olof at lixom.net Fri Feb 24 03:42:13 2006 From: olof at lixom.net (Olof Johansson) Date: Thu, 23 Feb 2006 08:42:13 -0800 Subject: [PATCH] Accurate task and cpu time accounting In-Reply-To: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> Message-ID: <20060223164213.GB4674@pb15.lixom.net> Hi, On Wed, Feb 22, 2006 at 10:35:30PM +1100, Paul Mackerras wrote: > #ifndef __ASSEMBLY__ > @@ -313,7 +315,7 @@ enum { > CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | > CPU_FTR_MMCRA | CPU_FTR_SMT | > CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | > - CPU_FTR_MMCRA_SIHV, > + CPU_FTR_MMCRA_SIHV | CPU_FTR_PURR, > CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | > CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | > CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | > @@ -326,7 +328,7 @@ enum { > #ifdef __powerpc64__ > CPU_FTRS_POWER3 | CPU_FTRS_RS64 | CPU_FTRS_POWER4 | > CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | CPU_FTRS_CELL | > - CPU_FTR_CI_LARGE_PAGE | > + CPU_FTR_CI_LARGE_PAGE | CPU_FTR_PURR | Is this second change really needed (this is the setting of CPU_FTRS_POSSIBLE)? It already includes CPU_FTRS_POWER5, which has the bit set by the first change. Only case I can see where it's mandated to include there is when the bit is set based on device-tree contents, right? CPU_FTR_CI_LARGE_PAGE seems to be a weird case, it's checked only in one location (hash code that enables 64K pages) but never actually set anywhere in current sources. It never seems to have been. 
-Olof From stevewin at us.ibm.com Fri Feb 24 03:37:54 2006 From: stevewin at us.ibm.com (Stephen Winiecki) Date: Thu, 23 Feb 2006 11:37:54 -0500 Subject: Maple boot hang when SMP not configured and KEXEC configured Message-ID: Using maple_defconfig with latest 2.6.16 prepatch versions, when SMP is not configured the kernel hangs in smp_release_cpus() in kernel/setup_64.c ... returning from prom_init Page orders: linear mapping = 24, others = 12 Found initrd at 0xc0000000018fa000:0xc000000001b2c3c5 DART: table not allocated, using direct DMA Found legacy serial port 0 for /ht at 0/isa at 4/serial at 3f8 port=3f8, taddr=f40003f8, irq=ffffffffffffffff, clk=1843200, speed=115200 Found legacy serial port 1 for /ht at 0/isa at 4/serial at 2f8 port=2f8, taddr=f40002f8, irq=ffffffffffffffff, clk=1843200, speed=115200 -> smp_release_cpus() #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC) void smp_release_cpus(void) { ... Unconfiguring KEXEC does allow the boot to complete successfully Steve Winiecki -------------- next part -------------- An HTML attachment was scrubbed... URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060223/9826bd33/attachment.htm From haren at us.ibm.com Fri Feb 24 05:53:36 2006 From: haren at us.ibm.com (Haren Myneni) Date: Thu, 23 Feb 2006 10:53:36 -0800 Subject: Maple boot hang when SMP not configured and KEXEC configured In-Reply-To: References: Message-ID: <43FE04B0.6000004@us.ibm.com> Stephen Winiecki wrote: > Using maple_defconfig with latest 2.6.16 prepatch versions, when SMP > is not configured the kernel hangs in smp_release_cpus() in > kernel/setup_64.c > > ... 
> returning from prom_init > Page orders: linear mapping = 24, others = 12 > Found initrd at 0xc0000000018fa000:0xc000000001b2c3c5 > DART: table not allocated, using direct DMA > Found legacy serial port 0 for /ht at 0/isa at 4/serial at 3f8 > port=3f8, taddr=f40003f8, irq=ffffffffffffffff, clk=1843200, speed=115200 > Found legacy serial port 1 for /ht at 0/isa at 4/serial at 2f8 > port=2f8, taddr=f40002f8, irq=ffffffffffffffff, clk=1843200, speed=115200 > -> smp_release_cpus() > > > > #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC) > void smp_release_cpus(void) > { > ... > > Unconfiguring KEXEC does allow the boot to complete successfully > For UP kernels even KEXEC is enabled, this function should not get executed. Please check whether your kernel has the following patch. http://ozlabs.org/pipermail/linuxppc64-dev/2006-February/008064.html > > Steve Winiecki > >------------------------------------------------------------------------ > >_______________________________________________ >Linuxppc64-dev mailing list >Linuxppc64-dev at ozlabs.org >https://ozlabs.org/mailman/listinfo/linuxppc64-dev > > From paulus at samba.org Fri Feb 24 09:07:57 2006 From: paulus at samba.org (Paul Mackerras) Date: Fri, 24 Feb 2006 09:07:57 +1100 Subject: [PATCH] Accurate task and cpu time accounting In-Reply-To: <20060223164213.GB4674@pb15.lixom.net> References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> <20060223164213.GB4674@pb15.lixom.net> Message-ID: <17406.12861.222238.773579@cargo.ozlabs.ibm.com> Olof Johansson writes: > Is this second change really needed (this is the setting of > CPU_FTRS_POSSIBLE)? It already includes CPU_FTRS_POWER5, which has the > bit set by the first change. Good point. I'll take that bit out. Paul. 
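The redundancy discussed in this exchange can be sketched in a few lines of C. This is only an illustration, assuming made-up bit positions and `EX_`-prefixed names that are not the kernel's: once a per-CPU feature mask contains a bit, OR-ing that bit again into a combined "possible" mask changes nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only; the real
 * CPU_FTR_* values are defined by the kernel and are not shown here. */
enum {
	EX_FTR_SMT           = 1u << 0,
	EX_FTR_MMCRA_SIHV    = 1u << 1,
	EX_FTR_PURR          = 1u << 2,
	EX_FTR_CI_LARGE_PAGE = 1u << 3,
};

/* Per-CPU mask: after the first hunk, the POWER5 set carries PURR. */
#define EX_FTRS_POWER5 \
	(EX_FTR_SMT | EX_FTR_MMCRA_SIHV | EX_FTR_PURR)

/* Combined mask that relies on the per-CPU set to supply PURR. */
#define EX_FTRS_POSSIBLE \
	(EX_FTRS_POWER5 | EX_FTR_CI_LARGE_PAGE)

/* Combined mask that also names PURR explicitly, as the second hunk did. */
#define EX_FTRS_POSSIBLE_EXPLICIT \
	(EX_FTRS_POWER5 | EX_FTR_CI_LARGE_PAGE | EX_FTR_PURR)

/* Return nonzero if every bit of 'bit' is set in 'mask'. */
static inline int ex_has_feature(uint32_t mask, uint32_t bit)
{
	return (mask & bit) == bit;
}

/* Both spellings produce the same mask, so the explicit bit is redundant. */
static inline void ex_self_check(void)
{
	assert(EX_FTRS_POSSIBLE == EX_FTRS_POSSIBLE_EXPLICIT);
	assert(ex_has_feature(EX_FTRS_POSSIBLE, EX_FTR_PURR));
}
```

The explicit bit would only matter for a feature that no per-CPU set includes, such as one set at run time from device-tree contents.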
From sfr at canb.auug.org.au Fri Feb 24 10:16:44 2006 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 24 Feb 2006 10:16:44 +1100 Subject: [PATCH] change compat shmget size arg to signed Message-ID: <20060224101644.548b0c24.sfr@canb.auug.org.au> Hi Olaf, > change second arg (the 'size') to signed to handle a size of -1. > ltp test shmget02 fails. This patch fixes it. > Oddly, we see the failure only on a POWER4 LPAR with 4.6G ram. > > Signed-off-by: Olaf Hering > > arch/powerpc/kernel/sys_ppc32.c | 2 +- > 1 files changed, 1 insertion(+), 1 deletion(-) > > Index: linux-2.6.16-rc4-olh/arch/powerpc/kernel/sys_ppc32.c > =================================================================== > --- linux-2.6.16-rc4-olh.orig/arch/powerpc/kernel/sys_ppc32.c > +++ linux-2.6.16-rc4-olh/arch/powerpc/kernel/sys_ppc32.c > @@ -429,7 +429,7 @@ long compat_sys_ipc(u32 call, u32 first, > return sys_shmdt(compat_ptr(ptr)); > case SHMGET: > /* sign extend key_t */ > - return sys_shmget((int)first, second, third); > + return sys_shmget((int)first, (int)second, third); > case SHMCTL: > /* sign extend shmid */ > return compat_sys_shmctl((int)first, second, compat_ptr(ptr)); Does the ltp test fail on a standard kernel(where SHMMAX is 0x2000000), or only on a SLES kernel (where SHMMAX is ULONG_MAX)? -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From olh at suse.de Fri Feb 24 10:27:17 2006 From: olh at suse.de (Olaf Hering) Date: Fri, 24 Feb 2006 00:27:17 +0100 Subject: [PATCH] change compat shmget size arg to signed In-Reply-To: <20060224101644.548b0c24.sfr@canb.auug.org.au> References: <20060224101644.548b0c24.sfr@canb.auug.org.au> Message-ID: <20060223232717.GB29454@suse.de> On Fri, Feb 24, Stephen Rothwell wrote: > Does the ltp test fail on a standard kernel(where SHMMAX is 0x2000000), or > only on a SLES kernel (where SHMMAX is ULONG_MAX)? It fails with SLES9 and SLES10. SLES9 has 0x2000000 as default. 
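What the `(int)` cast in the patch changes can be shown with a small stand-alone sketch. It assumes a 64-bit native handler that receives the size as a 64-bit unsigned value; the `ex_` helpers and the limit check are hypothetical stand-ins, not kernel code. It makes no attempt to explain the reported test failure: with a limit of 0x2000000, both forms of the value exceed the limit.

```c
#include <assert.h>
#include <stdint.h>

/* Without the cast: the 32-bit argument is zero-extended to 64 bits. */
static inline uint64_t ex_size_zero_extended(uint32_t second)
{
	return (uint64_t)second;
}

/* With the (int) cast: the argument is first converted to a signed
 * 32-bit value and then sign-extended. Converting an out-of-range
 * value to int32_t is implementation-defined in C; this sketch assumes
 * the usual two's-complement wrap. */
static inline uint64_t ex_size_sign_extended(uint32_t second)
{
	return (uint64_t)(int64_t)(int32_t)second;
}

/* Hypothetical upper-bound check against a configured maximum. */
static inline int ex_exceeds_limit(uint64_t size, uint64_t limit)
{
	return size > limit;
}

/* A 32-bit caller passing -1 yields two different 64-bit sizes. */
static inline void ex_self_check(void)
{
	assert(ex_size_zero_extended(0xffffffffu) == 0x00000000ffffffffULL);
	assert(ex_size_sign_extended(0xffffffffu) == 0xffffffffffffffffULL);
}
```

Small positive sizes are unaffected by the cast; only values with the top bit set differ between the two conversions.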
From kelly.daly at gmail.com Fri Feb 24 11:00:05 2006 From: kelly.daly at gmail.com (Kelly Daly) Date: Fri, 24 Feb 2006 11:00:05 +1100 Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries In-Reply-To: <200602231655.26244.michael@ellerman.id.au> References: <200602231432.59472.kelly@au.ibm.com> <200602231655.26244.michael@ellerman.id.au> Message-ID: <9ffa56aa0602231600i214d95c6lb7bfdbf9de67d494@mail.gmail.com> Hey Michael, I will definitely look into doing it the way that you have mentioned. In the interim, however, this is a good solution to stop the hanging problem. Cheers, Kelly On 2/23/06, Michael Ellerman wrote: > > On Thu, 23 Feb 2006 14:32, Kelly Daly wrote: > > disable OProfile in Kconfig for iSeries to prevent hangs. OProfile was > not > > originally intended to work with legacy iSeries. > > > > diff -urpN linux-2.6.15.4/arch/powerpc/oprofile/Kconfig > > linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig --- > > linux-2.6.15.4/arch/powerpc/oprofile/Kconfig 2006-02-10 18:22: > 48.000000000 > > +1100 +++ linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig 2006-02-23 > > 13:09:00.000000000 +1100 @@ -1,4 +1,5 @@ > > config PROFILING > > + depends on !PPC_ISERIES > > We've been trying to avoid !ISERIES compile time checks because they're a > barrier to the mythical combined kernel. I haven't looked at the oprofile > code, but is there an easy way to turn this into a > firmware_has_feature(ISERIES) check? > > cheers > > -- > Michael Ellerman > IBM OzLabs > > wwweb: http://michael.ellerman.id.au > phone: +61 2 6212 1183 (tie line 70 21183) > > We do not inherit the earth from our ancestors, > we borrow it from our children. - S.M.A.R.T Person > > > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060224/11bd6d32/attachment.htm From sfr at canb.auug.org.au Fri Feb 24 11:12:42 2006 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 24 Feb 2006 11:12:42 +1100 Subject: [PATCH] change compat shmget size arg to signed In-Reply-To: <20060223232717.GB29454@suse.de> References: <20060224101644.548b0c24.sfr@canb.auug.org.au> <20060223232717.GB29454@suse.de> Message-ID: <20060224111242.08f14bd9.sfr@canb.auug.org.au> On Fri, 24 Feb 2006 00:27:17 +0100 Olaf Hering wrote: > > On Fri, Feb 24, Stephen Rothwell wrote: > > > Does the ltp test fail on a standard kernel(where SHMMAX is 0x2000000), or > > only on a SLES kernel (where SHMMAX is ULONG_MAX)? > > It fails with SLES9 and SLES10. SLES9 has 0x2000000 as default. So what was shm_ctlmax set to when the test was run. I am trying to figure out why this test: if (size < SHMMIN || size > shm_ctlmax) return -EINVAL; Doesn't return -EINVAL for size == 0xffffffff if shm_ctlmax is 0x2000000? -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060224/a14ef0ea/attachment.pgp From benh at kernel.crashing.org Fri Feb 24 14:32:43 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 24 Feb 2006 14:32:43 +1100 Subject: [PATCH] Accurate task and cpu time accounting In-Reply-To: <20060223164213.GB4674@pb15.lixom.net> References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> <20060223164213.GB4674@pb15.lixom.net> Message-ID: <1140751963.8264.70.camel@localhost.localdomain> > CPU_FTR_CI_LARGE_PAGE seems to be a weird case, it's checked only in > one location (hash code that enables 64K pages) but never actually set > anywhere in current sources. It never seems to have been. 
Yup, because it's not yet clear when to set it... There is a new device-tree property being architected but I haven't yet had a chance to test a machine with a firmware that provides it. Setting it based on the PVR would cause problems in the case of machines mixing multiple CPU revisions. In general, we have a problem with our cputable model vs. IBM's CPU feature model. The PAPR architecture considers that any feature not explicitly exposed in the device-tree should not be used (like altivec for example). We currently use the PVR for almost everything, however. We need to be able to properly identify a PAPR machine early and clear out a load of feature bits from what was provided by the table, and then only set back the bits that are advertised by the various device-tree properties defined by IBM. In addition, we need to make sure we don't break bare-metal in the process. To make things difficult, identifying a PAPR machine is a bit dodgy since they are still specified as simply having "chrp" in / device_type ... Ben. From npiggin at suse.de Fri Feb 24 14:47:19 2006 From: npiggin at suse.de (Nick Piggin) Date: Fri, 24 Feb 2006 04:47:19 +0100 Subject: [patch] powerpc: native atomic_add_unless In-Reply-To: <17405.36822.685629.515591@cargo.ozlabs.ibm.com> References: <20060121112536.GA27505@wotan.suse.de> <17405.36822.685629.515591@cargo.ozlabs.ibm.com> Message-ID: <20060224034719.GB19281@wotan.suse.de> On Thu, Feb 23, 2006 at 09:35:02PM +1100, Paul Mackerras wrote: > Nick Piggin writes: > > > atomic_add_unless (atomic_inc_not_zero) is used in several hot paths in the > > vfs and I'm planning some uses in the memory manager, so it should be as > > small and fast as possible. > > > > Joel had a good suggestion to save a register but all bugs are mine. > > > > Comments? > > The implementation looks OK. I would be interested to know if this > actually makes any measurable difference though.
> I tried to microbenchmark it in userspace but couldn't get significant results for a single thread. When the cacheline is not hot or there is some contention, I hoped the native version might result in fewer coherency protocol operations. There are fewer branches and it should use less I cache too. All things that are difficult to test in microbenchmarks, unfortunately. From galak at kernel.crashing.org Sat Feb 25 03:34:30 2006 From: galak at kernel.crashing.org (Kumar Gala) Date: Fri, 24 Feb 2006 10:34:30 -0600 (CST) Subject: Membership stats (Was: Re: merge these lists?) In-Reply-To: <20060208110718.57e9f9f5.sfr@canb.auug.org.au> Message-ID: On Wed, 8 Feb 2006, Stephen Rothwell wrote: > On Wed, 8 Feb 2006 11:01:50 +1100 Stephen Rothwell wrote: > > > > Yes, "a sysadmin" could do that. However, those that are > > subscribed with different addresses on each list will end > > up subscribed twice and those who have changed their preferences on > > the abandoned list will have to fix them as well. > > Just for interest: > > members of linuxppc-dev 473 > members of linuxppc64-dev 264 > common 98 > > But, as I said, "common" above does not count those who have different > addresses subscribed to each list. Where did we leave off with this? I was about to request that marc.theaimsgroup.com start archiving some of the ppc lists but figured doing it after we merged lists would be better.
- kumar From johnrose at austin.ibm.com Sat Feb 25 04:34:23 2006 From: johnrose at austin.ibm.com (John Rose) Date: Fri, 24 Feb 2006 11:34:23 -0600 Subject: [PATCH] fix dynamic PCI probe regression Message-ID: <1140802463.17752.3.camel@sinatra.austin.ibm.com> Hi Paul- Some hotplug driver functions were migrated to the kernel for use by EEH in the following set of changes: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2bf6a8fa21570f37fd1789610da30f70a05ac5e3 Previously, the PCI Hotplug module had been changed to use the new OFDT-based PCI probe when appropriate: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5fa80fcdca9d20d30c9ecec30d4dbff4ed93a5c6 When rpaphp_pci_config_slot() was moved from the rpaphp driver to the new kernel function pcibios_add_pci_devices(), the OFDT-based probe stuff was dropped. This patch restores it. Please apply if appropriate. Thanks- John Signed-off-by: John Rose diff -puN arch/powerpc/platforms/pseries/eeh.c~reorg_regress arch/powerpc/platforms/pseries/eeh.c --- 2_6_linus_2/arch/powerpc/platforms/pseries/eeh.c~reorg_regress 2006-02-24 11:04:10.000000000 -0600 +++ 2_6_linus_2-johnrose/arch/powerpc/platforms/pseries/eeh.c 2006-02-24 11:04:10.000000000 -0600 @@ -893,6 +893,20 @@ void eeh_add_device_tree_early(struct de } EXPORT_SYMBOL_GPL(eeh_add_device_tree_early); +void eeh_add_device_tree_late(struct pci_bus *bus) +{ + struct pci_dev *dev; + + list_for_each_entry(dev, &bus->devices, bus_list) { + eeh_add_device_late(dev); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + struct pci_bus *subbus = dev->subordinate; + if (subbus) + eeh_add_device_tree_late(subbus); + } + } +} + /** * eeh_add_device_late - perform EEH initialization for the indicated pci device * @dev: pci device for which to set up EEH diff -puN arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress arch/powerpc/platforms/pseries/pci_dlpar.c ---
2_6_linus_2/arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress 2006-02-24 11:04:10.000000000 -0600 +++ 2_6_linus_2-johnrose/arch/powerpc/platforms/pseries/pci_dlpar.c 2006-02-24 11:04:10.000000000 -0600 @@ -106,6 +106,8 @@ pcibios_fixup_new_pci_devices(struct pci } } } + + eeh_add_device_tree_late(bus); } EXPORT_SYMBOL_GPL(pcibios_fixup_new_pci_devices); @@ -114,7 +116,6 @@ pcibios_pci_config_bridge(struct pci_dev { u8 sec_busno; struct pci_bus *child_bus; - struct pci_dev *child_dev; /* Get busno of downstream bus */ pci_read_config_byte(dev, PCI_SECONDARY_BUS, &sec_busno); @@ -129,10 +130,6 @@ pcibios_pci_config_bridge(struct pci_dev pci_scan_child_bus(child_bus); - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - /* Fixup new pci devices without touching bus struct */ pcibios_fixup_new_pci_devices(child_bus, 0); @@ -160,18 +157,25 @@ pcibios_add_pci_devices(struct pci_bus * eeh_add_device_tree_early(dn); - /* pci_scan_slot should find all children */ - slotno = PCI_SLOT(PCI_DN(dn->child)->devfn); - num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); - if (num) { - pcibios_fixup_new_pci_devices(bus, 1); - pci_bus_add_devices(bus); - } + if (_machine == PLATFORM_PSERIES_LPAR) { + /* use ofdt-based probe */ + of_scan_bus(dn, bus); + if (!list_empty(&bus->devices)) { + pcibios_fixup_new_pci_devices(bus, 0); + pci_bus_add_devices(bus); + } + } else { + /* use legacy probe */ + slotno = PCI_SLOT(PCI_DN(dn->child)->devfn); + num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + if (num) { + pcibios_fixup_new_pci_devices(bus, 1); + pci_bus_add_devices(bus); + } - list_for_each_entry(dev, &bus->devices, bus_list) { - eeh_add_device_late (dev); - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) - pcibios_pci_config_bridge(dev); + list_for_each_entry(dev, &bus->devices, bus_list) + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) + pcibios_pci_config_bridge(dev); } } EXPORT_SYMBOL_GPL(pcibios_add_pci_devices); diff -puN 
include/asm-powerpc/eeh.h~reorg_regress include/asm-powerpc/eeh.h --- 2_6_linus_2/include/asm-powerpc/eeh.h~reorg_regress 2006-02-24 11:04:10.000000000 -0600 +++ 2_6_linus_2-johnrose/include/asm-powerpc/eeh.h 2006-02-24 11:06:50.000000000 -0600 @@ -27,6 +27,7 @@ #include struct pci_dev; +struct pci_bus; struct device_node; #ifdef CONFIG_EEH @@ -61,7 +62,7 @@ void __init pci_addr_cache_build(void); */ void eeh_add_device_early(struct device_node *); void eeh_add_device_tree_early(struct device_node *); -void eeh_add_device_late(struct pci_dev *); +void eeh_add_device_tree_late(struct pci_bus *); /** * eeh_remove_device - undo EEH setup for the indicated pci device @@ -116,12 +117,12 @@ static inline void pci_addr_cache_build( static inline void eeh_add_device_early(struct device_node *dn) { } -static inline void eeh_add_device_late(struct pci_dev *dev) { } - static inline void eeh_remove_device(struct pci_dev *dev) { } static inline void eeh_add_device_tree_early(struct device_node *dn) { } +static inline void eeh_add_device_tree_late(struct pci_bus *bus) { } + static inline void eeh_remove_bus_device(struct pci_dev *dev) { } #define EEH_POSSIBLE_ERROR(val, type) (0) #define EEH_IO_ERROR_VALUE(size) (-1UL) _ From stevewin at us.ibm.com Sat Feb 25 08:40:46 2006 From: stevewin at us.ibm.com (Stephen Winiecki) Date: Fri, 24 Feb 2006 16:40:46 -0500 Subject: Maple fails to boot current git Message-ID: On Tue, 2006-01-31 at 08:08 -0700, Tom Rini wrote: > On Tue, Jan 31, 2006 at 02:53:11PM +1100, Benjamin Herrenschmidt wrote: > > Well, the RTC problem definitely looks like a bogus or lack of "ranges" > > property or the fact that the parser doesn't recognize "ht" as a PCI > > bus. You may want to try updating prom_parse.c to treat "ht" as a PCI > > bus and see if that helps. 
> > With the following, I get parent bus is pci now, but still: > OF: ** translation for device /ht at 0/isa at 4/rtc at 900 ** > OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4 > OF: translating address: 00000001 00000900 > OF: parent bus is pci (na=3, ns=2) on /ht at 0 > OF: walking ranges... > OF: not found ! > Maple: Unable to translate RTC address > Maple: No device node for RTC, assuming legacy address (0x70) For the record, changing the ISA ranges property does correct the problem translating the addresses for the devices hanging off that bus Old: /isa at 4 ... >> ranges = 00000001 f4000000 00010000 New: /isa at 4 ... >> ranges = 00000001 00000000 f4000000 00000000 00000000 00010000 Output w/ ISA range property change only: ... OF: ** translation for device /ht at 0/isa at 4/rtc at 900 ** OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4 OF: translating address: 00000001 00000900 OF: parent bus is default (na=3, ns=2) on /ht at 0 OF: walking ranges... OF: ISA map, cp=0, s=10000, da=900 OF: parent translation for: f4000000 00000000 00000000 OF: with offset: 900 OF: one level translation: 00000000 00000000 00000900 OF: parent bus is default (na=2, ns=2) on / OF: walking ranges... OF: default map, cp=0, s=400000, da=900 OF: parent translation for: 00000000 f4000000 OF: with offset: 900 OF: one level translation: 00000000 f4000900 OF: reached root node Maple: Found RTC at IO 0x900 ... Fixes similar issues w/ other devices on the bus as well. Note Ben - it looks like adding "ht" as a match in of_bus_pci_match() doesn't help matters - Output w/ ISA range property change and adding "ht" as match in of_bus_pci_match(): OF: of_bus_pci_match with ht OF: ** translation for device /ht at 0/isa at 4/rtc at 900 ** OF: of_bus_pci_match with ht OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4 OF: translating address: 00000001 00000900 OF: of_bus_pci_match with ht OF: parent bus is pci (na=3, ns=2) on /ht at 0 OF: walking ranges... 
OF: ISA map, cp=0, s=10000, da=900 OF: parent translation for: f4000000 00000000 00000000 OF: with offset: 900 OF: one level translation: f4000000 00000000 00000900 OF: of_bus_pci_match with ht OF: parent bus is default (na=2, ns=2) on / OF: walking ranges... OF: not found ! Maple: Unable to translate RTC address Maple: No device node for RTC, assuming legacy address (0x70) Updating the range property can be done via the EPOS(/PIBS for more recent versions) shell using this function: of_change_property(char *nodename, char *propname, char* prop, size_t len) As an example: PIBS $ int val=malloc(24) PIBS $ int *p=val PIBS $ *p=0x0000000100000000 PIBS $ p+=1 PIBS $ *p=0xf400000000000000 PIBS $ p+=8 # Note - there appears to be an anomaly in my PIBS version where ptr arith is only done for the # first addition - check your values using "print p" PIBS $ *p=0x0000000000010000 PIBS $ of_change_property("/ht/isa", "ranges", val, 24) Note also - this range definition does appear to be compatible with older kernels (I booted a 2.6.10 based image w/ no obvious problems) Steve Winiecki -------------- next part -------------- An HTML attachment was scrubbed... URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060224/43cac33a/attachment.htm From pradeep at us.ibm.com Sat Feb 25 09:14:02 2006 From: pradeep at us.ibm.com (Pradeep Satyanarayana) Date: Fri, 24 Feb 2006 14:14:02 -0800 Subject: Problems loading some select modules Message-ID: I was trying to load some Infiniband modules (using modprobe) on a Power5 machine (p570), and I get the following error: WARNING: Error inserting findex (/lib/modules/2.6.16-rc2/kernel/drivers/infiniband/core/findex.ko): Invalid module format Also, in /var/log/messages I see the following error about the same module: kernel: findex: doesn't contain .toc or .stubs. objdump -h findex.ko | grep toc returns nothing.
However, when I tried that on another module I saw the following: objdump -h ib_core.ko | grep toc 16 .toc1 000002b8 0000000000000000 0000000000000000 0000d900 2**0 18 .toc 00000038 0000000000000000 0000000000000000 0000e548 2**3 As expected the ib_core (and several other modules) load properly. Just findex.ko has this problem. I suspected problems with the wrong module being picked up and attempted an insmod of the module by specifying the path; same problem. I was using linux 2.6.16-rc2. This was a Sles9sp2 machine. The gcc version is : gcc version 3.3.3 (SuSE Linux). Identical kernel and Infiniband sources on a RHEL4U3 machine (on a p570 again) have no problems and the modules load properly. On the RedHat machine the gcc version is : gcc version 3.4.5 20051201 (Red Hat 3.4.5-2) Any help with this is much appreciated. Pradeep pradeep at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060224/22a2fb5f/attachment.htm From ericvh at gmail.com Sat Feb 25 09:57:40 2006 From: ericvh at gmail.com (Eric Van Hensbergen) Date: Fri, 24 Feb 2006 16:57:40 -0600 (CST) Subject: [PATCH 0/2] systemsim extensions (not for mainline inclusion) Message-ID: <20060224225740.E0BBF5A8075@localhost.localdomain> What follows is a couple of FYI systemsim patches. They are not necessarily for inclusion in the mainline kernel and will be maintained in /pub/scm/linux/kernel/git/ericvh/systemsim.git on kernel.org.
-eric From ericvh at gmail.com Sat Feb 25 09:58:24 2006 From: ericvh at gmail.com (Eric Van Hensbergen) Date: Fri, 24 Feb 2006 16:58:24 -0600 (CST) Subject: [PATCH 1/2] systemsim: boot hacks for non-standard platforms Message-ID: <20060224225824.D98DE5A8075@localhost.localdomain> >From nobody Mon Sep 17 00:00:00 2001 From: Eric Van Hensbergen Date: Fri Feb 24 16:46:07 2006 -0600 Subject: [PATCH] systemsim: add boot hacks for non-standard platforms When booting on some "experimental platforms" under the IBM Full System Simulator - a certain set of boot hacks are required which differentiate the hardware from standard pSeries systems. This patch adds a config flag which allows you to use these hacks. Signed-off-by: Eric Van Hensbergen --- arch/powerpc/Kconfig | 8 ++++++++ arch/powerpc/platforms/pseries/setup.c | 8 ++++++++ 2 files changed, 16 insertions(+), 0 deletions(-) 6f27df783005ca87d1a27370837070b98798fbeb diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 371043b..592846c 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -384,6 +384,14 @@ config SYSTEMSIM_IDLE significantly reduces the load on the host system when simulating an idle system. +config SYSTEMSIM_BOOT + bool " Boot hacks for non-standard hardware under systemsim" + depends on PPC_SYSTEMSIM + help + Selecting this option will enable boot hacks during setup + to facilitate Linux boots on non-standard hardware under the + IBM Full System Simulator. 
+ config XICS depends on PPC_PSERIES bool diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 9edeca8..ca5d20a 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -171,11 +171,19 @@ static void __init pSeries_setup_mpic(vo /* Setup the openpic driver */ irq_count = NR_IRQS - NUM_ISA_INTERRUPTS - 4; /* leave room for IPIs */ +#ifndef CONFIG_SYSTEMSIM_BOOT pSeries_mpic = mpic_alloc(openpic_addr, MPIC_PRIMARY, 16, 16, irq_count, /* isu size, irq offset, irq count */ NR_IRQS - 4, /* ipi offset */ senses, irq_count, /* sense & sense size */ " MPIC "); +#else /* CONFIG_SYSTEMSIM_BOOT */ + pSeries_mpic = mpic_alloc(openpic_addr, MPIC_PRIMARY, + 0, 0, irq_count, /* isu size, irq offset, irq count */ + NR_IRQS - 4, /* ipi offset */ + senses, irq_count, /* sense & sense size */ + " MPIC "); +#endif /* CONFIG_SYSTEMSIM_BOOT */ } static void pseries_lpar_enable_pmcs(void) -- From ericvh at gmail.com Sat Feb 25 09:58:55 2006 From: ericvh at gmail.com (Eric Van Hensbergen) Date: Fri, 24 Feb 2006 16:58:55 -0600 (CST) Subject: [PATCH 2/2] systemsim: add early debug options for HVC_FSS Message-ID: <20060224225855.B83C45A8075@localhost.localdomain> >From nobody Mon Sep 17 00:00:00 2001 From: Eric Van Hensbergen Date: Fri Feb 24 16:47:36 2006 -0600 Subject: [PATCH] systemsim: add early debug option when using systemsim console This patch adds udbg hooks for early-printk debug when using the IBM Full System Simulator console support. 
Signed-off-by: Eric Van Hensbergen --- arch/powerpc/kernel/udbg.c | 3 +++ drivers/char/Kconfig | 7 +++++++ drivers/char/hvc_fss.c | 15 +++++++++++++++ include/asm-powerpc/udbg.h | 1 + 4 files changed, 26 insertions(+), 0 deletions(-) b4e4add5d57f130a422e68787626f96f311658a0 diff --git a/arch/powerpc/kernel/udbg.c b/arch/powerpc/kernel/udbg.c index 3774e80..66b63ad 100644 --- a/arch/powerpc/kernel/udbg.c +++ b/arch/powerpc/kernel/udbg.c @@ -39,6 +39,9 @@ void __init udbg_early_init(void) #elif defined(CONFIG_PPC_EARLY_DEBUG_MAPLE) /* Maple real mode debug */ udbg_init_maple_realmode(); +#elif defined(CONFIG_PPC_EARLY_DEBUG_FSS) + /* Maple real mode debug */ + udbg_init_fss(); #elif defined(CONFIG_PPC_EARLY_DEBUG_ISERIES) /* For iSeries - hit Ctrl-x Ctrl-x to see the output */ udbg_init_iseries(); diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig index 74f9932..1973869 100644 --- a/drivers/char/Kconfig +++ b/drivers/char/Kconfig @@ -586,6 +586,13 @@ config HVC_FSS IBM Full System Simulator Console device driver which makes use of the HVC_DRIVER front end. 
+config PPC_EARLY_DEBUG_FSS + bool "IBM Full System Simulator Console early debug support" + depends on PPC_SYSTEMSIM + select HVC_FSS + help + Display early debug info over the Full System simulator console + config HVC_RTAS bool "IBM RTAS Console support" depends on PPC_RTAS diff --git a/drivers/char/hvc_fss.c b/drivers/char/hvc_fss.c index e87c03a..84aa34f 100644 --- a/drivers/char/hvc_fss.c +++ b/drivers/char/hvc_fss.c @@ -38,6 +38,7 @@ #include #include #include +#include #include "hvc_console.h" @@ -74,6 +75,20 @@ static int hvc_fss_read_console(uint32_t return got; } +#ifdef CONFIG_PPC_EARLY_DEBUG_FSS +void udbg_fss_real_putc(char c) +{ + callthru3(SIM_WRITE_CONSOLE_CODE, (unsigned long)&c, 1, 1); +} + +void __init udbg_init_fss(void) +{ + udbg_putc = udbg_fss_real_putc; + udbg_getc = NULL; + udbg_getc_poll = NULL; +} +#endif /* CONFIG_PPC_EARLY_DEBUG_FSS */ + static struct hv_ops hvc_fss_get_put_ops = { .get_chars = hvc_fss_read_console, .put_chars = hvc_fss_write_console, diff --git a/include/asm-powerpc/udbg.h b/include/asm-powerpc/udbg.h index 5c4236c..46b100a 100644 --- a/include/asm-powerpc/udbg.h +++ b/include/asm-powerpc/udbg.h @@ -42,6 +42,7 @@ extern void __init udbg_init_pmac_realmo extern void __init udbg_init_maple_realmode(void); extern void __init udbg_init_iseries(void); extern void __init udbg_init_rtas(void); +extern void __init udbg_init_fss(void); #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_UDBG_H */ -- From Luis.Lopez at foxconn.com Sat Feb 25 09:44:54 2006 From: Luis.Lopez at foxconn.com (Luis Lopez-FLL005) Date: Fri, 24 Feb 2006 15:44:54 -0700 Subject: p660 RIO failure Message-ID: <0608F878EEE32846BFEFE4F1221A0E070419A5DC@cuuexm01.mx.efoxconn.com> Hello Did you solve your problem with the following message ? 
Service Processor Firmware Failure
Error code: B1014602
Detail: 6013

SRC
--------------------------------------------------------------
word11: B1014602  word12: 0230005D  word13: 60132014
word14: 00000000  word15: 00000700  word16: 0000A05A
word17: 00000000  word18: 00004000  word19: F444E060

B1014602

I am having this problem with my RS 6000 Server using AIX 4.3.3, I
appreciate any help.

Luis

From benh at kernel.crashing.org Sat Feb 25 16:02:32 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sat, 25 Feb 2006 16:02:32 +1100
Subject: Maple fails to boot current git
In-Reply-To:
References:
Message-ID: <1140843752.24957.14.camel@localhost.localdomain>

> For the record, changing the ISA ranges property does correct the
> problem translating the addresses for the devices hanging off that bus

Yup, we should get the PIBS folks to fix that.

> Note Ben - it looks like adding "ht" as a match in of_bus_pci_match()
> doesn't help matters -

It should still be done for correctness imho

> Updating the range property can be done via the EPOS(/PIBS for more
> recent versions) shell using this function:
>
> of_change_property(char *nodename, char *propname, char* prop, size_t
> len)

Ah good to know, I couldn't remember how to do it.

> As an example:
>
> PIBS $ int val=malloc(24)
> PIBS $ int *p=val
> PIBS $ *p=0x0000000100000000
> PIBS $ p+=1
> PIBS $ *p=0xf400000000000000
> PIBS $ p+=8 # Note - there appears to be an anomaly in my PIBS version
> where ptr arith is only done for the
> # first addition - check your values using "print p"
> PIBS $ *p=0x0000000000010000
> PIBS $ of_change_property("/ht/isa", "ranges", val, 24)
>
> Note also - this range definition does appear to be compatible with
> older kernels (I booted a 2.6.10 based image w/ no obvious problems)

Yup, the previous one was bogus.

Ben.
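As a rough illustration of what a usable "ranges" entry lets an address parser do, here is a minimal C sketch of the containment check. It assumes the 24 bytes written by the PIBS commands above decode as six 32-bit cells (2 child cells, 3 parent cells, 1 size cell), in line with the na/ns values in the debug output earlier in the thread. The struct and function names are hypothetical, and this is not the code in prom_parse.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of one "ranges" entry on the ISA node. */
struct isa_range {
	uint32_t child[2];	/* ISA address: space flag, offset */
	uint32_t parent[3];	/* parent (PCI-style) address cells */
	uint32_t size;		/* length of the window */
};

/*
 * Return true and store the offset into the window if the two-cell ISA
 * address lies inside the range, false otherwise.
 */
static bool isa_range_offset(const struct isa_range *r,
			     const uint32_t addr[2], uint32_t *offset)
{
	if (addr[0] != r->child[0])	/* different address space */
		return false;
	if (addr[1] < r->child[1])	/* below the window */
		return false;
	if (addr[1] - r->child[1] >= r->size)	/* beyond the window */
		return false;

	*offset = addr[1] - r->child[1];
	return true;
}
```

With the cells from the example (child 00000001 00000000, size 00010000), the RTC address 00000001 00000900 falls inside the window at offset 0x900, so a parser walking the ranges would have an entry to match.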
From olh at suse.de Sat Feb 25 19:17:00 2006
From: olh at suse.de (Olaf Hering)
Date: Sat, 25 Feb 2006 09:17:00 +0100
Subject: p660 RIO failure
In-Reply-To: <0608F878EEE32846BFEFE4F1221A0E070419A5DC@cuuexm01.mx.efoxconn.com>
References: <0608F878EEE32846BFEFE4F1221A0E070419A5DC@cuuexm01.mx.efoxconn.com>
Message-ID: <20060225081700.GA13698@suse.de>

On Fri, Feb 24, Luis Lopez-FLL005 wrote:

> I am having this problem with my RS 6000 Server using AIX 4.3.3, I
> appreciate any help.

I sort of solved it by taking everything apart and reassembling it.
It helped for a while.

From olh at suse.de Sat Feb 25 23:34:51 2006
From: olh at suse.de (Olaf Hering)
Date: Sat, 25 Feb 2006 13:34:51 +0100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060222133551.GA30355@suse.de>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> <20060222133551.GA30355@suse.de>
Message-ID: <20060225123451.GA20731@suse.de>

On Wed, Feb 22, Olaf Hering wrote:

> On Wed, Feb 22, Paul Mackerras wrote:
>
> > All of this is conditional on CONFIG_VIRT_CPU_ACCOUNTING. If that is
> > not set, we do tick-based approximate accounting as before.

cpufreq has now unresolved symbols.

WARNING: /var/tmp/kernel-ppc64-2.6.16_rc4_git8-build/lib/modules/2.6.16-rc4-git8-20060225_do_get_xsec-ppc64/kernel/drivers/cpufreq/cpufreq_stats.ko needs unknown symbol __cputime_clockt_factor

From benh at kernel.crashing.org Sun Feb 26 08:09:00 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 26 Feb 2006 08:09:00 +1100
Subject: [PATCH] powerpc: vdso 64bits gettimeofday bug
Message-ID: <1140901740.24957.29.camel@localhost.localdomain>

A bug in the assembly code of the vdso can cause gettimeofday() to hang
or to return incorrect results. The wrong register was used to test for
pending updates of the calibration variables and to create a dependency
for subsequent loads. This fixes it.
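For readers less used to the assembly, the pattern being fixed can be sketched in portable C11 roughly as follows. This is a hedged illustration only: the struct, field and function names are hypothetical, the final re-check of the counter is an assumption based on the usual sequence-counter idiom (the patch hunk itself only shows the first check), and acquire ordering stands in for the xor/add dependency trick used in the vdso.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the calibration data shared with the vdso. */
struct calib {
	atomic_ulong update_count;	/* odd while an update is pending */
	_Atomic uint64_t tb_to_xs;
	_Atomic uint64_t stamp_xsec;
};

/* Writer: make the count odd, change the values, make it even again. */
static void calib_publish(struct calib *c, uint64_t tb_to_xs,
			  uint64_t stamp_xsec)
{
	unsigned long seq =
		atomic_load_explicit(&c->update_count, memory_order_relaxed);

	atomic_store_explicit(&c->update_count, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&c->tb_to_xs, tb_to_xs, memory_order_relaxed);
	atomic_store_explicit(&c->stamp_xsec, stamp_xsec, memory_order_relaxed);
	atomic_store_explicit(&c->update_count, seq + 2, memory_order_release);
}

/* Reader: retry until a stable, even count brackets the loads. */
static void calib_read(struct calib *c, uint64_t *tb_to_xs,
		       uint64_t *stamp_xsec)
{
	unsigned long seq;

	do {
		/* Test the value that was just loaded, not another variable. */
		do {
			seq = atomic_load_explicit(&c->update_count,
						   memory_order_acquire);
		} while (seq & 1);

		*tb_to_xs = atomic_load_explicit(&c->tb_to_xs,
						 memory_order_relaxed);
		*stamp_xsec = atomic_load_explicit(&c->stamp_xsec,
						   memory_order_relaxed);

		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&c->update_count,
				      memory_order_relaxed) != seq);
}
```

In this model, testing an unrelated value instead of `seq` could either spin forever or let a reader run while an update is pending, which would be consistent with the hang and the incorrect results described above.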
Signed-off-by: Benjamin Herrenschmidt --- Might be worth applying to the stable series too and/or distro kernels 2.6.15 and later --- linux-work.orig/arch/powerpc/kernel/vdso64/gettimeofday.S 2006-02-26 08:02:57.000000000 +1100 +++ linux-work/arch/powerpc/kernel/vdso64/gettimeofday.S 2006-02-26 08:04:23.000000000 +1100 @@ -225,9 +225,9 @@ .cfi_startproc /* check for update count & load values */ 1: ld r8,CFG_TB_UPDATE_COUNT(r3) - andi. r0,r4,1 /* pending update ? loop */ + andi. r0,r8,1 /* pending update ? loop */ bne- 1b - xor r0,r4,r4 /* create dependency */ + xor r0,r8,r8 /* create dependency */ add r3,r3,r0 /* Get TB & offset it */ From benh at kernel.crashing.org Sun Feb 26 08:29:17 2006 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 26 Feb 2006 08:29:17 +1100 Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries In-Reply-To: <200602231432.59472.kelly@au.ibm.com> References: <200602231432.59472.kelly@au.ibm.com> Message-ID: <1140902958.24957.38.camel@localhost.localdomain> On Thu, 2006-02-23 at 14:32 +1100, Kelly Daly wrote: > disable OProfile in Kconfig for iSeries to prevent hangs. OProfile was not originally intended to work with legacy iSeries. What is hanging exactly ? There should be no problem using oprofile timer based sampling at least on iseries... Ben. 
From benh at kernel.crashing.org Sun Feb 26 08:36:14 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 26 Feb 2006 08:36:14 +1100
Subject: [PATCH] powerpc: Fix runlatch performance issues
In-Reply-To: <200602240502.k1O52ExR009703@hera.kernel.org>
References: <200602240502.k1O52ExR009703@hera.kernel.org>
Message-ID: <1140903374.24957.43.camel@localhost.localdomain>

On Fri, 2006-02-24 at 05:02 +0000, Linux Kernel Mailing List wrote:
> commit cb2c9b2741346eb23b177187a51ff5abf08295bd
> tree 31433b46f96a00e22ca7e8402fd0bfe1fea3408d
> parent 47f78a49206b7f9b0d283ba46a2a5a6ee1796472
> author Anton Blanchard Mon, 13 Feb 2006 14:48:35 +1100
> committer Paul Mackerras Fri, 24 Feb 2006 11:36:31 +1100
>
> [PATCH] powerpc: Fix runlatch performance issues
>
> The runlatch SPR can take a lot of time to write. My original runlatch
> code would set it on every exception entry even though most of the time
> this was not required. It would also continually set it in the idle
> loop, which is an issue on an SMT capable processor.
>
> Now we cache the runlatch value in a threadinfo bit, and only check for
> it in decrementer and hardware interrupt exceptions as well as the idle
> loop. Boot on POWER3, POWER5 and iseries, and compile tested on pmac32.

I very much dislike the unconditional bl to C code in the exception
path. Can you at least wrap it in asm cpu feature conditionals on
CPU_FTR_CTRL so that it gets NOP'ed out on CPUs without a runlatch ? In
addition, we should probably not set that feature bit from the cputable
but from the platform code so that it's only set on machines where it's
useful, thus causing the code to be NOP'ed out on G5s and other bare
metal stuff that don't care about the runlatch no ?

Ben.
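To make the two ideas in this exchange concrete, here is a small userspace model in C. It is only a sketch: every name in it is hypothetical, the expensive SPR write is replaced by a counter, and the feature gate is an ordinary run-time test, whereas the suggestion above is to have the code patched out (NOP'ed) on CPUs where the feature bit is not set.

```c
#include <stdbool.h>

/* All names below are hypothetical; this is a userspace model only. */
struct thread_model {
	bool runlatch_cached;	/* models the threadinfo bit from the commit */
};

static bool cpu_has_ctrl;		/* models the CPU_FTR_CTRL feature bit */
static unsigned long spr_writes;	/* counts the modelled SPR writes */

/* Stand-in for the slow write to the runlatch SPR. */
static void write_runlatch_spr(bool on)
{
	(void)on;
	spr_writes++;
}

/* Turn the run latch on, skipping the write when it is already on. */
static void runlatch_on(struct thread_model *t)
{
	if (!cpu_has_ctrl)		/* nothing to do without a runlatch */
		return;
	if (t->runlatch_cached)		/* already set, avoid the slow write */
		return;
	write_runlatch_spr(true);
	t->runlatch_cached = true;
}

/* Turn the run latch off, e.g. on entry to the idle loop. */
static void runlatch_off(struct thread_model *t)
{
	if (!cpu_has_ctrl)
		return;
	if (!t->runlatch_cached)
		return;
	write_runlatch_spr(false);
	t->runlatch_cached = false;
}
```

In this model, repeated calls on an already-running thread cost only a flag test, and a platform that never sets the feature bit never reaches the write at all.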
From paulus at samba.org Mon Feb 27 14:13:24 2006
From: paulus at samba.org (Paul Mackerras)
Date: Mon, 27 Feb 2006 14:13:24 +1100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060225123451.GA20731@suse.de>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> <20060222133551.GA30355@suse.de> <20060225123451.GA20731@suse.de>
Message-ID: <17410.28244.72197.129637@cargo.ozlabs.ibm.com>

Olaf Hering writes:

> WARNING:
> /var/tmp/kernel-ppc64-2.6.16_rc4_git8-build/lib/modules/2.6.16-rc4-git8-20060225_do_get_xsec-ppc64/kernel/drivers/cpufreq/cpufreq_stats.ko
> needs unknown symbol __cputime_clockt_factor

We need to export that and a few others, or else move the conversion
functions in cputime.h to arch/powerpc/kernel/time.c so that they are
out of line.

Paul.

From sfr at canb.auug.org.au Mon Feb 27 16:03:37 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 27 Feb 2006 16:03:37 +1100
Subject: [PATCH] Signal hadnling fix for 2.4
Message-ID: <20060227160337.65610906.sfr@canb.auug.org.au>

Hi Marcelo,

While investigating a bug report about a 64bit application that crashed in
malloc, Paul Mackerras noticed that sys_rt_sigreturn's return value was
"int". It needs to be "long" or else the return value of a syscall that
is interrupted by a signal will be truncated to 32 bits and then sign
extended. This causes, e.g., mmap's return value to be corrupted if it is
returning an address above 2^31 (which is what caused a SEGV in malloc).
This problem obviously only affects 64 bit processes.

Signed-off-by: Stephen Rothwell
---

Please apply for 2.4.33, this patch is against 2.4.33-pre2.
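The truncation described above can be reproduced in plain userspace C. The sketch below is an illustration only: the function names are hypothetical and it models the effect of an "int" return type on a 64-bit value, not the actual kernel return path.

```c
#include <stdint.h>

/*
 * Models a return path declared as "int": the 64-bit result is cut down
 * to 32 bits and then sign-extended when widened back to 64 bits.
 * (Converting an out-of-range value to int32_t is implementation-defined
 * in C; common compilers wrap modulo 2^32, which is what is assumed here.)
 */
static int64_t return_via_int(uint64_t result)
{
	int32_t truncated = (int32_t)(uint32_t)result;

	return (int64_t)truncated;
}

/* Models the fixed path, where the full 64-bit value is preserved. */
static int64_t return_via_long(uint64_t result)
{
	return (int64_t)result;
}
```

An address such as 0x80000000 has bit 31 set, so after the round trip through a 32-bit signed value it comes back with the upper 32 bits filled with ones, while an address with only higher bits set simply loses them.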
-- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linux/arch/ppc64/kernel/signal.c linux-sfr/arch/ppc64/kernel/signal.c --- linux/arch/ppc64/kernel/signal.c 2006-02-24 17:37:08.000000000 +1100 +++ linux-sfr/arch/ppc64/kernel/signal.c 2006-02-27 11:05:07.000000000 +1100 @@ -332,7 +332,7 @@ } -asmlinkage int +asmlinkage long sys_rt_sigreturn(unsigned long r3, unsigned long r4, unsigned long r5, unsigned long r6, unsigned long r7, unsigned long r8, struct pt_regs *regs) From clumens at redhat.com Tue Feb 28 02:41:37 2006 From: clumens at redhat.com (Chris Lumens) Date: Mon, 27 Feb 2006 10:41:37 -0500 Subject: [PATCH] Conditionalize debugging printks Message-ID: <20060227154137.GF17260@exeter.boston.redhat.com> All the debugging output I'm seeing in the log files on my G5 means relatively little to me, so this patch gives opportunity to turn it off. It looks like relatively new code, which is why I didn't just default to turning everything off. - Chris Signed-off-by: Chris Lumens --- arch/powerpc/platforms/powermac/pfunc_base.c | 6 ++++++ arch/powerpc/platforms/powermac/pfunc_core.c | 7 +++++++ 2 files changed, 13 insertions(+), 0 deletions(-) a192d232af68676eee3488d734bf334acce05453 diff --git a/arch/powerpc/platforms/powermac/pfunc_base.c b/arch/powerpc/platforms/powermac/pfunc_base.c index 4ffd2a9..8ea5bc0 100644 --- a/arch/powerpc/platforms/powermac/pfunc_base.c +++ b/arch/powerpc/platforms/powermac/pfunc_base.c @@ -9,7 +9,13 @@ #include #include +#define DEBUG + +#ifdef DEBUG #define DBG(fmt...) printk(fmt) +#else +#define DBG(fmt...) 
+#endif static irqreturn_t macio_gpio_irq(int irq, void *data, struct pt_regs *regs) { diff --git a/arch/powerpc/platforms/powermac/pfunc_core.c b/arch/powerpc/platforms/powermac/pfunc_core.c index 356a739..215d267 100644 --- a/arch/powerpc/platforms/powermac/pfunc_core.c +++ b/arch/powerpc/platforms/powermac/pfunc_core.c @@ -17,10 +17,17 @@ #include /* Debug */ +#define DEBUG + #define LOG_PARSE(fmt...) #define LOG_ERROR(fmt...) printk(fmt) #define LOG_BLOB(t,b,c) + +#ifdef DEBUG #define DBG(fmt...) printk(fmt) +#else +#define DBG(fmt...) +#endif /* Command numbers */ #define PMF_CMD_LIST 0 -- 1.2.3 From sonny at burdell.org Tue Feb 28 06:31:57 2006 From: sonny at burdell.org (Sonny Rao) Date: Mon, 27 Feb 2006 14:31:57 -0500 Subject: [PATCH] powerpc: Fix runlatch performance issues In-Reply-To: <1140903374.24957.43.camel@localhost.localdomain> References: <200602240502.k1O52ExR009703@hera.kernel.org> <1140903374.24957.43.camel@localhost.localdomain> Message-ID: <20060227193157.GA22165@kevlar.burdell.org> On Sun, Feb 26, 2006 at 08:36:14AM +1100, Benjamin Herrenschmidt wrote: > On Fri, 2006-02-24 at 05:02 +0000, Linux Kernel Mailing List wrote: > > commit cb2c9b2741346eb23b177187a51ff5abf08295bd > > tree 31433b46f96a00e22ca7e8402fd0bfe1fea3408d > > parent 47f78a49206b7f9b0d283ba46a2a5a6ee1796472 > > author Anton Blanchard Mon, 13 Feb 2006 14:48:35 +1100 > > committer Paul Mackerras Fri, 24 Feb 2006 11:36:31 +1100 > > > > [PATCH] powerpc: Fix runlatch performance issues > > > > The runlatch SPR can take a lot of time to write. My original runlatch > > code would set it on every exception entry even though most of the time > > this was not required. It would also continually set it in the idle > > loop, which is an issue on an SMT capable processor. > > > > Now we cache the runlatch value in a threadinfo bit, and only check for > > it in decrementer and hardware interrupt exceptions as well as the idle > > loop. 
Boot on POWER3, POWER5 and iseries, and compile tested on pmac32. > > I very much dislike the unconditional bl to C code in the exception > path. Can you at least wrap it in asm cpu feature conditionals on > CPU_FTR_CTRL so that it gets NOP'ed out on CPUs without a runlatch ? In > addition, we should probably not set that feature bit from the cputable > but from the platform code so that it's only set on machines where it's > useful, thus causing the code to be NOP'ed out on G5s and other bare > metal stuff that don't care about the runlatch no ? AFAIK, runlatch is orthogonal to paravirtualization vs bare metal issues. All it does is stop the PM_RUN_CYC counter from running while a CPU is idle. This is useful when you want to accurately determine CPI on a given workload and there is any idle time (waiting for I/O, whatever). If we ever release pmcount (pending on perfmon2 api stabilization, I think?) you'll find runlatch is useful even on a G5. Sonny From paulus at samba.org Mon Feb 27 15:43:29 2006 From: paulus at samba.org (Paul Mackerras) Date: Mon, 27 Feb 2006 15:43:29 +1100 Subject: [PATCH] Accurate task and cpu time accounting In-Reply-To: <20060225123451.GA20731@suse.de> References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com> <20060222133551.GA30355@suse.de> <20060225123451.GA20731@suse.de> Message-ID: <17410.33649.46500.544164@cargo.ozlabs.ibm.com> Olaf Hering writes: > cpufreq has now unresolved symbols. This should fix it... (now in powerpc.git) Paul. diff-tree 2cf82c0256b198ae28c465f2c4d7c12c836ea5ea (from f055affb89f587a03f3411c3fd49ef31295c3d48) Author: Paul Mackerras Date: Mon Feb 27 15:41:47 2006 +1100 powerpc: Export variables used in conversions to/from cputime_t The inline cputime_to_foo and foo_to_cputime conversion functions in include/asm-powerpc/cputime.h refer to 5 variables, which need to be exported if those functions are to be usable from modules. 
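The comment in time.c quoted in the diff of this message describes the exported factors as 0.64 fixed-point binary fractions. As a hedged sketch of what a conversion with such a factor looks like, the following C keeps the high 64 bits of a 64x64-bit product. The function names are hypothetical, and whether the inline helpers in cputime.h are written this way is an assumption.

```c
#include <stdint.h>

/* High 64 bits of the 128-bit product a * b, using 32-bit pieces. */
static uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t lo_lo = a_lo * b_lo;
	uint64_t hi_lo = a_hi * b_lo;
	uint64_t lo_hi = a_lo * b_hi;
	uint64_t hi_hi = a_hi * b_hi;

	/* Sum of the middle terms plus the carry out of the low word. */
	uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

	return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/*
 * Scale a tick count by a 0.64 fixed-point factor, i.e. compute
 * ticks * (factor / 2^64), rounding down.
 */
static uint64_t scale_by_factor(uint64_t ticks, uint64_t factor)
{
	return mulhi_u64(ticks, factor);
}
```

For example, a factor of 2^63 represents 0.5, so scaling 1000 ticks by it gives 500. A module that inlines such a helper reads the factor variable directly, which is why the variable itself has to be exported.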
Signed-off-by: Paul Mackerras diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 0b34db2..4f20a5f 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -97,10 +97,11 @@ static unsigned long first_settimeofday unsigned long tb_ticks_per_jiffy; unsigned long tb_ticks_per_usec = 100; /* sane default */ EXPORT_SYMBOL(tb_ticks_per_usec); unsigned long tb_ticks_per_sec; +EXPORT_SYMBOL(tb_ticks_per_sec); /* for cputime_t conversions */ u64 tb_to_xs; unsigned tb_to_us; #define TICKLEN_SCALE (SHIFT_SCALE - 10) u64 last_tick_len; /* units are ns / 2^TICKLEN_SCALE */ @@ -141,13 +142,17 @@ DEFINE_PER_CPU(unsigned long, last_jiffy * Factors for converting from cputime_t (timebase ticks) to * jiffies, milliseconds, seconds, and clock_t (1/USER_HZ seconds). * These are all stored as 0.64 fixed-point binary fractions. */ u64 __cputime_jiffies_factor; +EXPORT_SYMBOL(__cputime_jiffies_factor); u64 __cputime_msec_factor; +EXPORT_SYMBOL(__cputime_msec_factor); u64 __cputime_sec_factor; +EXPORT_SYMBOL(__cputime_sec_factor); u64 __cputime_clockt_factor; +EXPORT_SYMBOL(__cputime_clockt_factor); static void calc_cputime_factors(void) { struct div_result res; From michael at ellerman.id.au Tue Feb 28 14:54:26 2006 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 28 Feb 2006 14:54:26 +1100 Subject: [PATCH] powerpc: iseries: Fix double phys_to_abs bug in htab_bolt_mapping Message-ID: <20060228035450.BB448679F8@ozlabs.org> Before the merge I updated create_pte_mapping() to work for iSeries, by calling iSeries_hpte_bolt_or_insert. (4c55130b2aa93370f1bf52d2304394e91cf8ee39) Later we changed iSeries_hpte_insert to cope with the bolting case, and called that instead from create_pte_mapping() (which was renamed to htab_bolt_mapping) (3c726f8dee6f55e96475574e9f645327e461884c). Unfortunately that change introduced a subtle bug, where we pass an absolute address to iSeries_hpte_insert() where it expects a physical address. 
This leads to us calling phys_to_abs() twice on the physical address, which is seriously bogus. This only causes a problem if the absolute address from the first translation can be looked up again in the chunk_map, which depends on the size and layout of memory. I've seen it fail on one box, but not others. The minimal fix is to pass the physical address to iSeries_hpte_insert(). For 2.6.17 we should make phys_to_abs() BUG if we try to double-translate an address. Signed-off-by: Michael Ellerman --- arch/powerpc/mm/hash_utils_64.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: iseries/arch/powerpc/mm/hash_utils_64.c =================================================================== --- iseries.orig/arch/powerpc/mm/hash_utils_64.c +++ iseries/arch/powerpc/mm/hash_utils_64.c @@ -169,7 +169,7 @@ int htab_bolt_mapping(unsigned long vsta #ifdef CONFIG_PPC_ISERIES if (_machine == PLATFORM_ISERIES_LPAR) ret = iSeries_hpte_insert(hpteg, va, - virt_to_abs(paddr), + __pa(vaddr), tmp_mode, HPTE_V_BOLTED, psize); From paulus at samba.org Tue Feb 28 15:01:34 2006 From: paulus at samba.org (Paul Mackerras) Date: Tue, 28 Feb 2006 15:01:34 +1100 Subject: [PATCH] Signal hadnling fix for 2.4 In-Reply-To: <20060227160337.65610906.sfr@canb.auug.org.au> References: <20060227160337.65610906.sfr@canb.auug.org.au> Message-ID: <17411.51998.642468.642351@cargo.ozlabs.ibm.com> Stephen Rothwell writes: > While investigating a bug report about a 64bit application that crashed in > malloc, Paul Mackerras noticed that sys_rt_sigreturn's return value was > "int". It needs to be "long" or else the return value of a syscall that > is interrupted by a signal will be truncated to 32 bits and then sign > extended. This causes .e.g mmap's return value to be corrupted if it is > returning an address above 2^31 (which is what caused a SEGV in malloc). > This problem obviously only affects 64 bit processes. 
>
> Signed-off-by: Stephen Rothwell

Acked-by: Paul Mackerras

From paulus at samba.org Tue Feb 28 16:06:02 2006
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 28 Feb 2006 16:06:02 +1100
Subject: Problems loading some select modules
In-Reply-To:
References:
Message-ID: <17411.55866.172377.50234@cargo.ozlabs.ibm.com>

Pradeep Satyanarayana writes:

> I was trying to load some Infiniband modules (using modprobe) on Power5
> machine (p570), and I get the following error:
>
> WARNING: Error inserting findex
> (/lib/modules/2.6.16-rc2/kernel/drivers/infiniband/core/findex.ko): Invalid
> module format
>
> Also, in /var/log/messages I see the following error about the same module:
>
> kernel: findex: doesn't contain .toc or .stubs.

Interesting. I don't see findex.c in the kernel sources anywhere. It
could be that a very simple module that only accesses variables on the
stack would not need a toc, and maybe in this case the toolchain doesn't
generate a toc. Could you send me the source of your module plus the
generated findex.ko?

Paul.