From jgarzik at pobox.com Fri Jul 1 00:41:03 2005
From: jgarzik at pobox.com (Jeff Garzik)
Date: Thu, 30 Jun 2005 10:41:03 -0400
Subject: [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver
In-Reply-To: <200506302016.55125.michael@ellerman.id.au>
References: <200506302016.55125.michael@ellerman.id.au>
Message-ID: <42C4047F.1000108@pobox.com>

Michael Ellerman wrote:
> Hi y'all,
>
> The following is a series of patches for the iseries_veth driver.
>
> They're not ready for merging yet, as we need to do more extensive testing.
> However any feedback you have will be greatly appreciated.

Note, make sure to CC me, and also the new netdev list
(netdev at vger.kernel.org).

Jeff

From segher at kernel.crashing.org Fri Jul 1 02:41:03 2005
From: segher at kernel.crashing.org (Segher Boessenkool)
Date: Thu, 30 Jun 2005 18:41:03 +0200
Subject: mmio latency measurements
In-Reply-To: <1120121818.31924.52.camel@gaston>
References: <20050630080439.GD25641@sunbeam.de.gnumonks.org> <1120121818.31924.52.camel@gaston>
Message-ID: <144d58d527f5e870e6a096333bd38791@kernel.crashing.org>

> On ppc64, there is no cycle-counter per se, but a HW timebase that ticks
> at a fixed frequency (independently of the CPU frequency nowadays).

On a 970, and presumably on POWER4 and maybe POWER5 as well, you
*can* get cycle counts -- from the performance monitor counters.
Simply write 0xf00 to MMCR0 and then read the cycle count from PMC1.
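
A rough sketch of how the two PMC1 readings might be turned into a
latency figure. Everything here is an assumption for illustration: the
helper names (pmc_delta, cycles_to_ns, pmc1_count_cycles, pmc1_read)
and the PMC_SKETCH_SUPERVISOR guard are hypothetical, and the SPR
numbers (795 for MMCR0, 787 for PMC1) are the ones I believe the Linux
headers use on 64-bit PowerPC, so check them against the processor
manual. The register accessors only make sense in supervisor mode; the
arithmetic helpers are portable.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The PMCs are 32-bit counters, so take the difference in 32-bit
 * unsigned arithmetic. That stays correct across a single wraparound.
 */
uint32_t pmc_delta(uint32_t start, uint32_t end)
{
	return end - start;
}

/* Convert a cycle count to nanoseconds for a given core clock in Hz. */
uint64_t cycles_to_ns(uint32_t cycles, uint64_t core_hz)
{
	assert(core_hz != 0);	/* guard the division */
	return ((uint64_t)cycles * 1000000000ULL) / core_hz;
}

#if defined(__powerpc64__) && defined(PMC_SKETCH_SUPERVISOR)
/* Hypothetical accessors; mtspr/mfspr on these SPRs trap in user mode. */
static inline void pmc1_count_cycles(void)
{
	unsigned long val = 0xf00;	/* value suggested in the mail above */

	__asm__ volatile("mtspr 795,%0" : : "r"(val));	/* MMCR0 (assumed SPR) */
}

static inline uint32_t pmc1_read(void)
{
	unsigned long val;

	__asm__ volatile("mfspr %0,787" : "=r"(val));	/* PMC1 (assumed SPR) */
	return (uint32_t)val;
}
#endif
```

Typical use would be to read PMC1 before and after the MMIO access and
feed both readings to pmc_delta(), then to cycles_to_ns() with the core
clock of the machine under test.
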
Segher

From linas at austin.ibm.com Fri Jul 1 06:39:31 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Thu, 30 Jun 2005 15:39:31 -0500
Subject: PCI Power management (was: Re: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery
In-Reply-To: <20050629165828.GA73550@muc.de>
References: <20050628235848.GA6376@austin.ibm.com> <1120009619.5133.228.camel@gaston> <20050629155954.GH28499@austin.ibm.com> <20050629165828.GA73550@muc.de>
Message-ID: <20050630203931.GY28499@austin.ibm.com>

On Wed, Jun 29, 2005 at 06:58:29PM +0200, Andi Kleen was heard to remark:
> > Yep, OK. Pushing the timer would in fact break if the device was marked
> > perm disabled.
>
> I think for network drivers you should just write a generic error handler
> (perhaps in net/core/dev.c) that calls the watchdog handler.
> Then all drivers could be easily converted without much code duplication.

Well, there's no watchdog per se in "struct net_device" -- are you
suggesting I add one?

It looks like I can almost create generic handlers for net devices;
looks like calling netdev->stop() is enough to handle the error
detection.

However, a generic bringup would need to call pci_enable_device(),
and net/core/dev.c does not include pci.h so I can't really do it
there. Other than that, a generic recovery routine looks like it might
be possible; I'll have to experiment; it's hard to tell by reading code.

This might be the wrong paradigm, though. The pci error recovery
routines are *almost identical* to the power-management suspend/resume
routines. From what I can tell, the only real difference is that
I want to not actually turn off/on the power.

Thus, the right thing to do might be to split up the
struct pci_dev->suspend() and pci_dev->resume() calls into

suspend()
poweroff()
poweron()
resume()

and then have the generic pci error recovery routines call
suspend/resume only, skipping the poweroff-on calls. Does that
sound good?
I'm not sure I can pull this off without having someone from
the power-management world throw a brick at me.

--linas

From linas at austin.ibm.com Fri Jul 1 07:07:48 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Thu, 30 Jun 2005 16:07:48 -0500
Subject: PCI Power management (was: Re: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery
In-Reply-To: <20050630203931.GY28499@austin.ibm.com>
References: <20050628235848.GA6376@austin.ibm.com> <1120009619.5133.228.camel@gaston> <20050629155954.GH28499@austin.ibm.com> <20050629165828.GA73550@muc.de> <20050630203931.GY28499@austin.ibm.com>
Message-ID: <20050630210748.GZ28499@austin.ibm.com>

Hm, scratch the idea I outline below; it seems like it's not a good idea.
I'm reading the e100, e1000 and the ixgb power management code, and
they go through all sorts of steps I don't need to do for PCI device
reset. There's no clear abstraction that would serve both needs.

On Thu, Jun 30, 2005 at 03:39:31PM -0500, Linas Vepstas was heard to remark:
> On Wed, Jun 29, 2005 at 06:58:29PM +0200, Andi Kleen was heard to remark:
> > > Yep, OK. Pushing the timer would in fact break if the device was marked
> > > perm disabled.
> >
> > I think for network drivers you should just write a generic error handler
> > (perhaps in net/core/dev.c) that calls the watchdog handler.
> > Then all drivers could be easily converted without much code duplication.
>
> Well, there's no watchdog per se in "struct net_device" -- are you
> suggesting I add one?
>
> It looks like I can almost create generic handlers for net devices;
> looks like calling netdev->stop() is enough to handle the error
> detection.
>
> However, a generic bringup would need to call pci_enable_device(),
> and net/core/dev.c does not include pci.h so I can't really do it
> there. Other than that, a generic recovery routine looks like it might
> be possible; I'll have to experiment; it's hard to tell by reading code.
>
> This might be the wrong paradigm, though.
> The pci error recovery
> routines are *almost identical* to the power-management suspend/resume
> routines. From what I can tell, the only real difference is that
> I want to not actually turn off/on the power.
>
> Thus, the right thing to do might be to split up the
> struct pci_dev->suspend() and pci_dev->resume() calls into
>
> suspend()
> poweroff()
> poweron()
> resume()
>
> and then have the generic pci error recovery routines call
> suspend/resume only, skipping the poweroff-on calls. Does that
> sound good?
>
> I'm not sure I can pull this off without having someone from
> the power-management world throw a brick at me.
>
> --linas
>
>

From benh at kernel.crashing.org Fri Jul 1 09:32:43 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 01 Jul 2005 09:32:43 +1000
Subject: PCI Power management (was: Re: [PATCH 4/13]: PCI Err: e100 ethernet driver recovery
In-Reply-To: <20050630203931.GY28499@austin.ibm.com>
References: <20050628235848.GA6376@austin.ibm.com> <1120009619.5133.228.camel@gaston> <20050629155954.GH28499@austin.ibm.com> <20050629165828.GA73550@muc.de> <20050630203931.GY28499@austin.ibm.com>
Message-ID: <1120174364.31924.57.camel@gaston>

On Thu, 2005-06-30 at 15:39 -0500, Linas Vepstas wrote:

> Thus, the right thing to do might be to split up the
> struct pci_dev->suspend() and pci_dev->resume() calls into
>
> suspend()
> poweroff()
> poweron()
> resume()

No. There are very good reasons not to do that split at the pci_dev
level.

> and then have the generic pci error recovery routines call
> suspend/resume only, skipping the poweroff-on calls. Does that
> sound good?
>
> I'm not sure I can pull this off without having someone from
> the power-management world throw a brick at me.

Just keep the error recovery callbacks for now, and we might be able to
provide a generic "helper" doing the watchdog thing (yes, there is a
watchdog in the net core)

Ben.

From michael at ellerman.id.au Fri Jul 1 21:46:14 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 1 Jul 2005 21:46:14 +1000
Subject: Make idle_loop a member of ppc_md
Message-ID: <200507012146.19553.michael@ellerman.id.au>

Currently the idle loop is selected in idle_setup() by consulting
systemcfg->platform and with a few ifdefs as well.

These five patches make idle_loop a member of the ppc_md structure, and
move the selection into the respective platforms' setup_arch().

I wrote this and then changed my mind, and thought we should instead try
to reduce the number of different idle loops. But that looks hard, perhaps
impossible, so this might be as good as it gets.

I've boot tested on iSeries and pSeries LPAR, and compiled defconfig for
iSeries/pSeries/maple/G5.

cheers

--
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From michael at ellerman.id.au Fri Jul 1 21:46:32 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 1 Jul 2005 21:46:32 +1000
Subject: [PATCH 2/5] ppc64: Move iSeries_idle() into iSeries_setup.c
In-Reply-To: <200507012146.19553.michael@ellerman.id.au>
Message-ID: <1120218392.289033.83615061705.qpatch@concordia>

Move iSeries_idle() into iSeries_setup.c, no one else needs to know about it.
Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/iSeries_setup.c | 81 +++++++++++++++++++++++++++++++++++ arch/ppc64/kernel/idle.c | 86 -------------------------------------- 2 files changed, 81 insertions(+), 86 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c @@ -834,6 +834,87 @@ static int __init iSeries_src_init(void) late_initcall(iSeries_src_init); +static unsigned long maxYieldTime = 0; +static unsigned long minYieldTime = 0xffffffffffffffffUL; + +static inline void process_iSeries_events(void) +{ + asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); +} + +static void yield_shared_processor(void) +{ + unsigned long tb; + unsigned long yieldTime; + + HvCall_setEnabledInterrupts(HvCall_MaskIPI | + HvCall_MaskLpEvent | + HvCall_MaskLpProd | + HvCall_MaskTimeout); + + tb = get_tb(); + /* Compute future tb value when yield should expire */ + HvCall_yieldProcessor(HvCall_YieldTimed, tb+tb_ticks_per_jiffy); + + yieldTime = get_tb() - tb; + if (yieldTime > maxYieldTime) + maxYieldTime = yieldTime; + + if (yieldTime < minYieldTime) + minYieldTime = yieldTime; + + /* + * The decrementer stops during the yield. Force a fake decrementer + * here and let the timer_interrupt code sort out the actual time. 
+ */ + get_paca()->lppaca.int_dword.fields.decr_int = 1; + process_iSeries_events(); +} + +static int iSeries_idle(void) +{ + struct paca_struct *lpaca; + long oldval; + + /* ensure iSeries run light will be out when idle */ + ppc64_runlatch_off(); + + lpaca = get_paca(); + + while (1) { + if (lpaca->lppaca.shared_proc) { + if (hvlpevent_is_pending()) + process_iSeries_events(); + if (!need_resched()) + yield_shared_processor(); + } else { + oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + + if (!oldval) { + set_thread_flag(TIF_POLLING_NRFLAG); + + while (!need_resched()) { + HMT_medium(); + if (hvlpevent_is_pending()) + process_iSeries_events(); + HMT_low(); + } + + HMT_medium(); + clear_thread_flag(TIF_POLLING_NRFLAG); + } else { + set_need_resched(); + } + } + + ppc64_runlatch_on(); + schedule(); + ppc64_runlatch_off(); + } + + return 0; +} + #ifndef CONFIG_PCI void __init iSeries_init_IRQ(void) { } #endif Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -39,90 +39,6 @@ extern void power4_idle(void); static int (*idle_loop)(void); -#ifdef CONFIG_PPC_ISERIES -static unsigned long maxYieldTime = 0; -static unsigned long minYieldTime = 0xffffffffffffffffUL; - -static inline void process_iSeries_events(void) -{ - asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); -} - -static void yield_shared_processor(void) -{ - unsigned long tb; - unsigned long yieldTime; - - HvCall_setEnabledInterrupts(HvCall_MaskIPI | - HvCall_MaskLpEvent | - HvCall_MaskLpProd | - HvCall_MaskTimeout); - - tb = get_tb(); - /* Compute future tb value when yield should expire */ - HvCall_yieldProcessor(HvCall_YieldTimed, tb+tb_ticks_per_jiffy); - - yieldTime = get_tb() - tb; - if (yieldTime > maxYieldTime) - maxYieldTime = yieldTime; - - if (yieldTime < minYieldTime) - minYieldTime = yieldTime; - - /* - * The decrementer stops during the yield. 
Force a fake decrementer - * here and let the timer_interrupt code sort out the actual time. - */ - get_paca()->lppaca.int_dword.fields.decr_int = 1; - process_iSeries_events(); -} - -static int iSeries_idle(void) -{ - struct paca_struct *lpaca; - long oldval; - - /* ensure iSeries run light will be out when idle */ - ppc64_runlatch_off(); - - lpaca = get_paca(); - - while (1) { - if (lpaca->lppaca.shared_proc) { - if (hvlpevent_is_pending()) - process_iSeries_events(); - if (!need_resched()) - yield_shared_processor(); - } else { - oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); - - if (!oldval) { - set_thread_flag(TIF_POLLING_NRFLAG); - - while (!need_resched()) { - HMT_medium(); - if (hvlpevent_is_pending()) - process_iSeries_events(); - HMT_low(); - } - - HMT_medium(); - clear_thread_flag(TIF_POLLING_NRFLAG); - } else { - set_need_resched(); - } - } - - ppc64_runlatch_on(); - schedule(); - ppc64_runlatch_off(); - } - - return 0; -} - -#else - int default_idle(void) { long oldval; @@ -305,8 +221,6 @@ int native_idle(void) return 0; } -#endif /* CONFIG_PPC_ISERIES */ - void cpu_idle(void) { BUG_ON(NULL == ppc_md.idle_loop); From michael at ellerman.id.au Fri Jul 1 21:46:32 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 1 Jul 2005 21:46:32 +1000 Subject: [PATCH 1/5] ppc64: Make idle_loop a ppc_md function In-Reply-To: <200507012146.19553.michael@ellerman.id.au> Message-ID: <1120218392.215165.357899678992.qpatch@concordia> This patch adds an idle member to the ppc_md structure and calls it from cpu_idle(). If a platform leaves ppc_md.idle as null it will get the default idle loop default_idle(). 
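
The fallback described above can be modelled in ordinary user-space C.
This is only a sketch under stated assumptions: the struct is cut down
to the one member the patch adds (named idle_loop in the diff, although
the description calls it ppc_md.idle), the idle functions are stand-ins
that return immediately instead of looping forever, setup_arch_idle() is
a hypothetical name for the relevant part of setup_arch(), and assert()
stands in for BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down model of struct machdep_calls: only the new member. */
struct machdep_calls {
	/* Idle loop for this platform, leave empty for default idle loop */
	int (*idle_loop)(void);
};

struct machdep_calls ppc_md;

/* Stand-ins: the real loops never return. */
int default_idle(void) { return 0; }
int native_idle(void) { return 1; }

/* Mirrors the check the patch adds after ppc_md.setup_arch(). */
void setup_arch_idle(void)
{
	/* Use the default idle loop if the platform hasn't provided one. */
	if (ppc_md.idle_loop == NULL)
		ppc_md.idle_loop = default_idle;
}

/* Mirrors cpu_idle(): refuse to run with no idle loop, then dispatch. */
int cpu_idle(void)
{
	assert(ppc_md.idle_loop != NULL);	/* BUG_ON() in the kernel */
	return ppc_md.idle_loop();
}
```

A platform that sets idle_loop in its machdep_calls initializer keeps
its own loop; one that leaves it NULL ends up in default_idle().
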
Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/idle.c | 8 +++++--- arch/ppc64/kernel/setup.c | 6 +++--- include/asm-ppc64/machdep.h | 5 +++++ 3 files changed, 13 insertions(+), 6 deletions(-) Index: ppc64-2.6/include/asm-ppc64/machdep.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/machdep.h +++ ppc64-2.6/include/asm-ppc64/machdep.h @@ -140,8 +140,13 @@ struct machdep_calls { unsigned long size, pgprot_t vma_prot); + /* Idle loop for this platform, leave empty for default idle loop */ + int (*idle_loop)(void); }; +extern int default_idle(void); +extern int native_idle(void); + extern struct machdep_calls ppc_md; extern char cmd_line[COMMAND_LINE_SIZE]; Index: ppc64-2.6/arch/ppc64/kernel/setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/setup.c +++ ppc64-2.6/arch/ppc64/kernel/setup.c @@ -96,7 +96,6 @@ extern void udbg_init_maple_realmode(voi extern unsigned long klimit; extern void mm_init_ppc64(void); -extern int idle_setup(void); extern void stab_initialize(unsigned long stab); extern void htab_initialize(void); extern void early_init_devtree(void *flat_dt); @@ -1081,8 +1080,9 @@ void __init setup_arch(char **cmdline_p) ppc_md.setup_arch(); - /* Select the correct idle loop for the platform. */ - idle_setup(); + /* Use the default idle loop if the platform hasn't provided one. 
*/ + if (NULL == ppc_md.idle_loop) + ppc_md.idle_loop = default_idle; paging_init(); ppc64_boot_msg(0x15, "Setup Done"); Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -33,6 +33,7 @@ #include #include #include +#include extern void power4_idle(void); @@ -122,7 +123,7 @@ static int iSeries_idle(void) #else -static int default_idle(void) +int default_idle(void) { long oldval; unsigned int cpu = smp_processor_id(); @@ -288,7 +289,7 @@ static int shared_idle(void) #endif /* CONFIG_PPC_PSERIES */ -static int native_idle(void) +int native_idle(void) { while(1) { /* check CPU type here */ @@ -308,7 +309,8 @@ static int native_idle(void) void cpu_idle(void) { - idle_loop(); + BUG_ON(NULL == ppc_md.idle_loop); + ppc_md.idle_loop(); } int powersave_nap; From michael at ellerman.id.au Fri Jul 1 21:46:32 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 1 Jul 2005 21:46:32 +1000 Subject: [PATCH 5/5] ppc64: Remove obsolete idle_setup() In-Reply-To: <200507012146.19553.michael@ellerman.id.au> Message-ID: <1120218392.499521.155402682754.qpatch@concordia> Now that the idle loop is configured by each platform we don't need idle_setup() anymore. 
Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/idle.c | 41 ----------------------------------------- 1 files changed, 41 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -37,8 +37,6 @@ extern void power4_idle(void); -static int (*idle_loop)(void); - int default_idle(void) { long oldval; @@ -127,42 +125,3 @@ register_powersave_nap_sysctl(void) } __initcall(register_powersave_nap_sysctl); #endif - -int idle_setup(void) -{ - /* - * Move that junk to each platform specific file, eventually define - * a pSeries_idle for shared processor stuff - */ -#ifdef CONFIG_PPC_ISERIES - idle_loop = iSeries_idle; - return 1; -#else - idle_loop = default_idle; -#endif -#ifdef CONFIG_PPC_PSERIES - if (systemcfg->platform & PLATFORM_PSERIES) { - if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) { - if (get_paca()->lppaca.shared_proc) { - printk(KERN_INFO "Using shared processor idle loop\n"); - idle_loop = shared_idle; - } else { - printk(KERN_INFO "Using dedicated idle loop\n"); - idle_loop = dedicated_idle; - } - } else { - printk(KERN_INFO "Using default idle loop\n"); - idle_loop = default_idle; - } - } -#endif /* CONFIG_PPC_PSERIES */ -#ifndef CONFIG_PPC_ISERIES - if (systemcfg->platform == PLATFORM_POWERMAC || - systemcfg->platform == PLATFORM_MAPLE) { - printk(KERN_INFO "Using native/NAP idle loop\n"); - idle_loop = native_idle; - } -#endif /* CONFIG_PPC_ISERIES */ - - return 1; -} From michael at ellerman.id.au Fri Jul 1 21:46:32 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 1 Jul 2005 21:46:32 +1000 Subject: [PATCH 4/5] ppc64: Fixup platforms for new ppc_md.idle In-Reply-To: <200507012146.19553.michael@ellerman.id.au> Message-ID: <1120218392.425320.222568985943.qpatch@concordia> This patch fixes up iSeries, pSeries, pmac and maple to set the correct idle function for each platform. 
Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/iSeries_setup.c | 1 + arch/ppc64/kernel/maple_setup.c | 3 +++ arch/ppc64/kernel/pSeries_setup.c | 18 ++++++++++++++++++ arch/ppc64/kernel/pmac_setup.c | 5 ++++- 4 files changed, 26 insertions(+), 1 deletion(-) Index: ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c @@ -940,5 +940,6 @@ void __init iSeries_early_setup(void) ppc_md.get_rtc_time = iSeries_get_rtc_time; ppc_md.calibrate_decr = iSeries_calibrate_decr; ppc_md.progress = iSeries_progress; + ppc_md.idle_loop = iSeries_idle; } Index: ppc64-2.6/arch/ppc64/kernel/maple_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/maple_setup.c +++ ppc64-2.6/arch/ppc64/kernel/maple_setup.c @@ -177,6 +177,8 @@ void __init maple_setup_arch(void) #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif + + printk(KERN_INFO "Using native/NAP idle loop\n"); } /* @@ -297,4 +299,5 @@ struct machdep_calls __initdata maple_md .get_rtc_time = maple_get_rtc_time, .calibrate_decr = generic_calibrate_decr, .progress = maple_progress, + .idle_loop = native_idle, }; Index: ppc64-2.6/arch/ppc64/kernel/pmac_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pmac_setup.c +++ ppc64-2.6/arch/ppc64/kernel/pmac_setup.c @@ -186,6 +186,8 @@ void __init pmac_setup_arch(void) #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif + + printk(KERN_INFO "Using native/NAP idle loop\n"); } #ifdef CONFIG_SCSI @@ -507,5 +509,6 @@ struct machdep_calls __initdata pmac_md .calibrate_decr = pmac_calibrate_decr, .feature_call = pmac_do_feature_call, .progress = pmac_progress, - .check_legacy_ioport = pmac_check_legacy_ioport + .check_legacy_ioport = pmac_check_legacy_ioport, + .idle_loop = native_idle, }; Index: 
ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c @@ -19,6 +19,7 @@ #undef DEBUG #include +#include #include #include #include @@ -82,6 +83,9 @@ int fwnmi_active; /* TRUE if an FWNMI h extern void pSeries_system_reset_exception(struct pt_regs *regs); extern int pSeries_machine_check_exception(struct pt_regs *regs); +static int shared_idle(void); +static int dedicated_idle(void); + static volatile void __iomem * chrp_int_ack_special; struct mpic *pSeries_mpic; @@ -229,6 +233,20 @@ static void __init pSeries_setup_arch(vo if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) vpa_init(boot_cpuid); + + /* Choose an idle loop */ + if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) { + if (get_paca()->lppaca.shared_proc) { + printk(KERN_INFO "Using shared processor idle loop\n"); + ppc_md.idle_loop = shared_idle; + } else { + printk(KERN_INFO "Using dedicated idle loop\n"); + ppc_md.idle_loop = dedicated_idle; + } + } else { + printk(KERN_INFO "Using default idle loop\n"); + ppc_md.idle_loop = default_idle; + } } static int __init pSeries_init_panel(void) From michael at ellerman.id.au Fri Jul 1 21:46:32 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 1 Jul 2005 21:46:32 +1000 Subject: [PATCH 3/5] ppc64: Move pSeries idle functions into pSeries_setup.c In-Reply-To: <200507012146.19553.michael@ellerman.id.au> Message-ID: <1120218392.355354.309061134660.qpatch@concordia> dedicated_idle() and shared_idle() are only used by pSeries, so move them into pSeries_setup.c Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/idle.c | 131 -------------------------------------- arch/ppc64/kernel/pSeries_setup.c | 127 ++++++++++++++++++++++++++++++++++++ 2 files changed, 127 insertions(+), 131 deletions(-) Index: ppc64-2.6/arch/ppc64/kernel/idle.c 
=================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -74,137 +74,6 @@ int default_idle(void) return 0; } -#ifdef CONFIG_PPC_PSERIES - -DECLARE_PER_CPU(unsigned long, smt_snooze_delay); - -int dedicated_idle(void) -{ - long oldval; - struct paca_struct *lpaca = get_paca(), *ppaca; - unsigned long start_snooze; - unsigned long *smt_snooze_delay = &__get_cpu_var(smt_snooze_delay); - unsigned int cpu = smp_processor_id(); - - ppaca = &paca[cpu ^ 1]; - - while (1) { - /* - * Indicate to the HV that we are idle. Now would be - * a good time to find other work to dispatch. - */ - lpaca->lppaca.idle = 1; - - oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); - if (!oldval) { - set_thread_flag(TIF_POLLING_NRFLAG); - start_snooze = __get_tb() + - *smt_snooze_delay * tb_ticks_per_usec; - while (!need_resched() && !cpu_is_offline(cpu)) { - /* - * Go into low thread priority and possibly - * low power mode. - */ - HMT_low(); - HMT_very_low(); - - if (*smt_snooze_delay == 0 || - __get_tb() < start_snooze) - continue; - - HMT_medium(); - - if (!(ppaca->lppaca.idle)) { - local_irq_disable(); - - /* - * We are about to sleep the thread - * and so wont be polling any - * more. - */ - clear_thread_flag(TIF_POLLING_NRFLAG); - - /* - * SMT dynamic mode. Cede will result - * in this thread going dormant, if the - * partner thread is still doing work. - * Thread wakes up if partner goes idle, - * an interrupt is presented, or a prod - * occurs. Returning from the cede - * enables external interrupts. - */ - if (!need_resched()) - cede_processor(); - else - local_irq_enable(); - } else { - /* - * Give the HV an opportunity at the - * processor, since we are not doing - * any work. 
- */ - poll_pending(); - } - } - - clear_thread_flag(TIF_POLLING_NRFLAG); - } else { - set_need_resched(); - } - - HMT_medium(); - lpaca->lppaca.idle = 0; - schedule(); - if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) - cpu_die(); - } - return 0; -} - -static int shared_idle(void) -{ - struct paca_struct *lpaca = get_paca(); - unsigned int cpu = smp_processor_id(); - - while (1) { - /* - * Indicate to the HV that we are idle. Now would be - * a good time to find other work to dispatch. - */ - lpaca->lppaca.idle = 1; - - while (!need_resched() && !cpu_is_offline(cpu)) { - local_irq_disable(); - - /* - * Yield the processor to the hypervisor. We return if - * an external interrupt occurs (which are driven prior - * to returning here) or if a prod occurs from another - * processor. When returning here, external interrupts - * are enabled. - * - * Check need_resched() again with interrupts disabled - * to avoid a race. - */ - if (!need_resched()) - cede_processor(); - else - local_irq_enable(); - } - - HMT_medium(); - lpaca->lppaca.idle = 0; - schedule(); - if (cpu_is_offline(smp_processor_id()) && - system_state == SYSTEM_RUNNING) - cpu_die(); - } - - return 0; -} - -#endif /* CONFIG_PPC_PSERIES */ - int native_idle(void) { while(1) { Index: ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c @@ -418,6 +418,133 @@ static int __init pSeries_probe(int plat return 1; } +DECLARE_PER_CPU(unsigned long, smt_snooze_delay); + +int dedicated_idle(void) +{ + long oldval; + struct paca_struct *lpaca = get_paca(), *ppaca; + unsigned long start_snooze; + unsigned long *smt_snooze_delay = &__get_cpu_var(smt_snooze_delay); + unsigned int cpu = smp_processor_id(); + + ppaca = &paca[cpu ^ 1]; + + while (1) { + /* + * Indicate to the HV that we are idle. Now would be + * a good time to find other work to dispatch. 
+ */ + lpaca->lppaca.idle = 1; + + oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + if (!oldval) { + set_thread_flag(TIF_POLLING_NRFLAG); + start_snooze = __get_tb() + + *smt_snooze_delay * tb_ticks_per_usec; + while (!need_resched() && !cpu_is_offline(cpu)) { + /* + * Go into low thread priority and possibly + * low power mode. + */ + HMT_low(); + HMT_very_low(); + + if (*smt_snooze_delay == 0 || + __get_tb() < start_snooze) + continue; + + HMT_medium(); + + if (!(ppaca->lppaca.idle)) { + local_irq_disable(); + + /* + * We are about to sleep the thread + * and so wont be polling any + * more. + */ + clear_thread_flag(TIF_POLLING_NRFLAG); + + /* + * SMT dynamic mode. Cede will result + * in this thread going dormant, if the + * partner thread is still doing work. + * Thread wakes up if partner goes idle, + * an interrupt is presented, or a prod + * occurs. Returning from the cede + * enables external interrupts. + */ + if (!need_resched()) + cede_processor(); + else + local_irq_enable(); + } else { + /* + * Give the HV an opportunity at the + * processor, since we are not doing + * any work. + */ + poll_pending(); + } + } + + clear_thread_flag(TIF_POLLING_NRFLAG); + } else { + set_need_resched(); + } + + HMT_medium(); + lpaca->lppaca.idle = 0; + schedule(); + if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) + cpu_die(); + } + return 0; +} + +static int shared_idle(void) +{ + struct paca_struct *lpaca = get_paca(); + unsigned int cpu = smp_processor_id(); + + while (1) { + /* + * Indicate to the HV that we are idle. Now would be + * a good time to find other work to dispatch. + */ + lpaca->lppaca.idle = 1; + + while (!need_resched() && !cpu_is_offline(cpu)) { + local_irq_disable(); + + /* + * Yield the processor to the hypervisor. We return if + * an external interrupt occurs (which are driven prior + * to returning here) or if a prod occurs from another + * processor. When returning here, external interrupts + * are enabled. 
+ * + * Check need_resched() again with interrupts disabled + * to avoid a race. + */ + if (!need_resched()) + cede_processor(); + else + local_irq_enable(); + } + + HMT_medium(); + lpaca->lppaca.idle = 0; + schedule(); + if (cpu_is_offline(smp_processor_id()) && + system_state == SYSTEM_RUNNING) + cpu_die(); + } + + return 0; +} + struct machdep_calls __initdata pSeries_md = { .probe = pSeries_probe, .setup_arch = pSeries_setup_arch, From kernel at 0x100.com Sat Jul 2 00:09:59 2005 From: kernel at 0x100.com (Yuta SATOH) Date: Fri, 01 Jul 2005 23:09:59 +0900 Subject: Brand new iMac G5 In-Reply-To: <1118871572.5986.231.camel@gaston> References: <1118871572.5986.231.camel@gaston> Message-ID: <20050701230521.4F4D.KERNEL@0x100.com> Hello, I received the report that the network device of a brand new iMacG5 functioned on the kernel which applied your patch. [1] If possible, please merge it into a kernel. Thank you. [1] http://bugs.gentoo.org/94263 Benjamin Herrenschmidt wrote: > Ok, the patch was missing a bit, here's a fixed version > > Index: linux-work/drivers/net/sungem.c > =================================================================== > --- linux-work.orig/drivers/net/sungem.c 2005-05-02 10:48:28.000000000 +1000 > +++ linux-work/drivers/net/sungem.c 2005-06-14 10:17:38.000000000 +1000 > @@ -3078,7 +3078,9 @@ > gp->phy_mii.dev = dev; > gp->phy_mii.mdio_read = _phy_read; > gp->phy_mii.mdio_write = _phy_write; > - > +#ifdef CONFIG_PPC_PMAC > + gp->phy_mii.platform_data = gp->of_node; > +#endif > /* By default, we start with autoneg */ > gp->want_autoneg = 1; > > Index: linux-work/drivers/net/sungem_phy.c > =================================================================== > --- linux-work.orig/drivers/net/sungem_phy.c 2005-05-02 10:48:28.000000000 +1000 > +++ linux-work/drivers/net/sungem_phy.c 2005-06-16 07:38:37.000000000 +1000 > @@ -32,6 +32,10 @@ > #include > #include > > +#ifdef CONFIG_PPC_PMAC > +#include > +#endif > + > #include "sungem_phy.h" > > /* Link 
modes of the BCM5400 PHY */ > @@ -281,10 +285,12 @@ > static int bcm5421_init(struct mii_phy* phy) > { > u16 data; > - int rev; > + unsigned int id; > > - rev = phy_read(phy, MII_PHYSID2) & 0x000f; > - if (rev == 0) { > + id = (phy_read(phy, MII_PHYSID1) << 16 | phy_read(phy, MII_PHYSID2)); > + > + /* Revision 0 of 5421 needs some fixups */ > + if (id == 0x002060e0) { > /* This is borrowed from MacOS > */ > phy_write(phy, 0x18, 0x1007); > @@ -297,21 +303,28 @@ > data = phy_read(phy, 0x15); > phy_write(phy, 0x15, data | 0x0200); > } > -#if 0 > - /* This has to be verified before I enable it */ > - /* Enable automatic low-power */ > - phy_write(phy, 0x1c, 0x9002); > - phy_write(phy, 0x1c, 0xa821); > - phy_write(phy, 0x1c, 0x941d); > -#endif > - return 0; > -} > > -static int bcm5421k2_init(struct mii_phy* phy) > -{ > - /* Init code borrowed from OF */ > - phy_write(phy, 4, 0x01e1); > - phy_write(phy, 9, 0x0300); > + /* Pick up some init code from OF for K2 version */ > + if ((id & 0xfffffff0) == 0x002062e0) { > + phy_write(phy, 4, 0x01e1); > + phy_write(phy, 9, 0x0300); > + } > + > + /* Check if we can enable automatic low power */ > +#ifdef CONFIG_PPC_PMAC > + if (phy->platform_data) { > + struct device_node *np = of_get_parent(phy->platform_data); > + int can_low_power = 1; > + if (np == NULL || get_property(np, "no-autolowpower", NULL)) > + can_low_power = 0; > + if (can_low_power) { > + /* Enable automatic low-power */ > + phy_write(phy, 0x1c, 0x9002); > + phy_write(phy, 0x1c, 0xa821); > + phy_write(phy, 0x1c, 0x941d); > + } > + } > +#endif /* CONFIG_PPC_PMAC */ > > return 0; > } > @@ -762,7 +775,7 @@ > > /* Broadcom BCM 5421 built-in K2 */ > static struct mii_phy_ops bcm5421k2_phy_ops = { > - .init = bcm5421k2_init, > + .init = bcm5421_init, > .suspend = bcm5411_suspend, > .setup_aneg = bcm54xx_setup_aneg, > .setup_forced = bcm54xx_setup_forced, > @@ -779,6 +792,25 @@ > .ops = &bcm5421k2_phy_ops > }; > > +/* Broadcom BCM 5462 built-in Vesta */ > +static struct 
mii_phy_ops bcm5462V_phy_ops = { > + .init = bcm5421_init, > + .suspend = bcm5411_suspend, > + .setup_aneg = bcm54xx_setup_aneg, > + .setup_forced = bcm54xx_setup_forced, > + .poll_link = genmii_poll_link, > + .read_link = bcm54xx_read_link, > +}; > + > +static struct mii_phy_def bcm5462V_phy_def = { > + .phy_id = 0x002060d0, > + .phy_id_mask = 0xfffffff0, > + .name = "BCM5462-Vesta", > + .features = MII_GBIT_FEATURES, > + .magic_aneg = 1, > + .ops = &bcm5462V_phy_ops > +}; > + > /* Marvell 88E1101 (Apple seem to deal with 2 different revs, > * I masked out the 8 last bits to get both, but some specs > * would be useful here) --BenH. > @@ -824,6 +856,7 @@ > &bcm5411_phy_def, > &bcm5421_phy_def, > &bcm5421k2_phy_def, > + &bcm5462V_phy_def, > &marvell_phy_def, > &genmii_phy_def, > NULL > Index: linux-work/drivers/net/sungem_phy.h > =================================================================== > --- linux-work.orig/drivers/net/sungem_phy.h 2005-05-02 10:48:28.000000000 +1000 > +++ linux-work/drivers/net/sungem_phy.h 2005-06-14 10:16:14.000000000 +1000 > @@ -43,9 +43,10 @@ > int pause; > > /* Provided by host chip */ > - struct net_device* dev; > + struct net_device *dev; > int (*mdio_read) (struct net_device *dev, int mii_id, int reg); > void (*mdio_write) (struct net_device *dev, int mii_id, int reg, int val); > + void *platform_data; > }; > > /* Pass in a struct mii_phy with dev, mdio_read and mdio_write -- Yuta SATOH
From benh at kernel.crashing.org Sat Jul 2 09:51:03 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 02 Jul 2005 09:51:03 +1000 Subject: Brand new iMac G5 In-Reply-To: <20050701230521.4F4D.KERNEL@0x100.com> References: <1118871572.5986.231.camel@gaston> <20050701230521.4F4D.KERNEL@0x100.com> Message-ID: <1120261863.31924.115.camel@gaston> On Fri, 2005-07-01 at 23:09 +0900, Yuta SATOH wrote: > Hello, > > I received the report that the network device of a brand new iMacG5 > functioned on the kernel which applied your patch. [1] > If possible, please merge it into a kernel. It has been sent upstream already Ben. From grundler at parisc-linux.org Sat Jul 2 18:21:29 2005 From: grundler at parisc-linux.org (Grant Grundler) Date: Sat, 2 Jul 2005 02:21:29 -0600 Subject: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery In-Reply-To: <20050629163408.GI28499@austin.ibm.com> References: <20050628235919.GA6415@austin.ibm.com> <20050629030237.GB71992@muc.de> <20050629163408.GI28499@austin.ibm.com> Message-ID: <20050702082129.GD14091@colo.lackof.org> On Wed, Jun 29, 2005 at 11:34:08AM -0500, Linas Vepstas wrote: ... > requests get replayed, in a fashion similar to what would be needed > after a host reset. In particular, there shouldn't be and (permanent) > file system corruption because any inconsistent state on the disk > would get over-written when the queued reqeusts get re-issued. FS's that require some ordering (journal) should be handling this sort of stuff already. I have the same expectations as Linas does WRT design. FS's that don't, will have the same sort of problems that they would have as if the OS crashed. > FWIW, yes, I have heard of devices that "cheat", and report back that a > transaction is complete, even though it is still pending in firmware > somewhere, either on the host or the disk. Those devices get screwed.
See "Write Cache Enabled" (aka WCE or in HPUX speak "Immediate Reporting"). WCE must be disabled if data corruption can not be tolerated. "Desktop" (ie unix workstations) systems typically have WCE enabled so they look good on (stupid) performance benchmarks. The only devices that lie about WCE have battery backed RAM buffers. (e.g. SCSI RAID *devices* - multi-LUN, dual controller beasts) > No doubt, this will happen to some giant banking customer, It won't happen because of WCE. None of the major HW vendors will sell or support HW with WCE enabled. Exactly for the reasons you point out. grant From olh at suse.de Mon Jul 4 22:02:44 2005 From: olh at suse.de (Olaf Hering) Date: Mon, 4 Jul 2005 14:02:44 +0200 Subject: [PATCH] vdso32, fix link errors after recent toolchain changes Message-ID: <20050704120244.GA10377@suse.de> Patch from amodra at bigpond.net.au, http://sources.redhat.com/bugzilla/show_bug.cgi?id=1042 /usr/bin/ld: arch/ppc64/kernel/vdso32/vdso32.so: The first section in the PT_DYNAMIC segment is not the .dynamic section Signed-off-by: Olaf Hering arch/ppc64/kernel/vdso32/vdso32.lds.S | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: linux-2.6.12/arch/ppc64/kernel/vdso32/vdso32.lds.S =================================================================== --- linux-2.6.12.orig/arch/ppc64/kernel/vdso32/vdso32.lds.S +++ linux-2.6.12/arch/ppc64/kernel/vdso32/vdso32.lds.S @@ -40,9 +40,9 @@ SECTIONS .gcc_except_table : { *(.gcc_except_table) } .fixup : { *(.fixup) } - .got ALIGN(4) : { *(.got.plt) *(.got) } - .dynamic : { *(.dynamic) } :text :dynamic + .got : { *(.got) } + .plt : { *(.plt) } _end = .; __end = .; From mostrows at watson.ibm.com Tue Jul 5 09:36:52 2005 From: mostrows at watson.ibm.com (Michal Ostrowski) Date: Mon, 4 Jul 2005 19:36:52 -0400 Subject: [PATCH] Externally visible buffer for CONFIG_CMDLINE Message-ID: <20050704193652.23980d26@brick.watson.ibm.com> Define a fixed buffer to store the CONFIG_CMDLINE string and the buffer in 
its own section. This allows one to easily locate this buffer in the vmlinux file (using objdump) and then use dd to change the command line. (Allows one to avoid re-building everything to change the command line when using hardware where the only command line is the built-in one.) --- Signed-off-by: Michal Ostrowski 0) strlcpy(cmd_line, p, min(l, COMMAND_LINE_SIZE)); } + #ifdef CONFIG_CMDLINE - if (l == 0 || (l == 1 && (*p) == 0)) - strlcpy(cmd_line, CONFIG_CMDLINE, COMMAND_LINE_SIZE); -#endif /* CONFIG_CMDLINE */ + if (l == 0 || (l == 1 && (*p) == 0)) { + strlcpy(cmd_line, builtin_cmdline, sizeof(builtin_cmdline)); + } +#endif DBG("Command line is: %s\n", cmd_line); -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050704/21299202/attachment.pgp From anton at samba.org Wed Jul 6 02:23:40 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 02:23:40 +1000 Subject: [PATCH] ppc64: use c99 initialisers in cputable code Message-ID: <20050705162340.GH5384@krispykreme> Use c99 initialisers in the cputable code.
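For readers unfamiliar with the idiom, here is a minimal sketch of what the conversion buys. It is not the kernel code: struct cpu_spec_demo, demo_specs and demo_identify_cpu() are hypothetical, cut-down stand-ins, and the lookup loop is only an assumption about how a mask/value table of this shape is typically searched. The point is that each field is named at the point of initialisation, so entries stay correct if struct members are later added or reordered.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the kernel's struct cpu_spec. */
struct cpu_spec_demo {
	uint32_t pvr_mask;
	uint32_t pvr_value;
	const char *cpu_name;
	unsigned int icache_bsize;
	unsigned int dcache_bsize;
};

/* C99 designated initialisers: every value is tied to a field name. */
static const struct cpu_spec_demo demo_specs[] = {
	{	/* PPC970 */
		.pvr_mask	= 0xffff0000,
		.pvr_value	= 0x00390000,
		.cpu_name	= "PPC970",
		.icache_bsize	= 128,
		.dcache_bsize	= 128,
	},
	{	/* default match: a zero mask matches every PVR */
		.pvr_mask	= 0x00000000,
		.pvr_value	= 0x00000000,
		.cpu_name	= "POWER4 (compatible)",
		.icache_bsize	= 128,
		.dcache_bsize	= 128,
	},
};

/* Assumed lookup: return the first entry whose masked PVR matches. */
static const struct cpu_spec_demo *demo_identify_cpu(uint32_t pvr)
{
	size_t i;

	for (i = 0; i < sizeof(demo_specs) / sizeof(demo_specs[0]); i++) {
		if ((pvr & demo_specs[i].pvr_mask) == demo_specs[i].pvr_value)
			return &demo_specs[i];
	}
	return NULL;
}
```

With positional initialisers the same table only works as long as the member order in the struct never changes, which is the maintenance hazard this kind of conversion removes.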
Signed-off-by: Anton Blanchard Index: linux-2.6.git-work/arch/ppc64/kernel/cputable.c =================================================================== --- linux-2.6.git-work.orig/arch/ppc64/kernel/cputable.c 2005-07-03 10:41:00.000000000 +1000 +++ linux-2.6.git-work/arch/ppc64/kernel/cputable.c 2005-07-03 11:15:43.000000000 +1000 @@ -49,160 +49,219 @@ #endif struct cpu_spec cpu_specs[] = { - { /* Power3 */ - 0xffff0000, 0x00400000, "POWER3 (630)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_IABR | CPU_FTR_PMC8, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power3, - COMMON_PPC64_FW - }, - { /* Power3+ */ - 0xffff0000, 0x00410000, "POWER3 (630+)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_IABR | CPU_FTR_PMC8, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power3, - COMMON_PPC64_FW - }, - { /* Northstar */ - 0xffff0000, 0x00330000, "RS64-II (northstar)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_IABR | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power3, - COMMON_PPC64_FW - }, - { /* Pulsar */ - 0xffff0000, 0x00340000, "RS64-III (pulsar)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_IABR | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power3, - COMMON_PPC64_FW - }, - { /* I-star */ - 0xffff0000, 0x00360000, "RS64-III (icestar)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_IABR | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power3, - COMMON_PPC64_FW - }, - { /* S-star */ - 0xffff0000, 0x00370000, "RS64-IV (sstar)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_IABR | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power3, - COMMON_PPC64_FW - }, - { /* Power4 */ - 0xffff0000, 0x00350000, "POWER4 (gp)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | 
CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power4, - COMMON_PPC64_FW - }, - { /* Power4+ */ - 0xffff0000, 0x00380000, "POWER4+ (gq)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power4, - COMMON_PPC64_FW - }, - { /* PPC970 */ - 0xffff0000, 0x00390000, "PPC970", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | - CPU_FTR_CAN_NAP | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64 | PPC_FEATURE_HAS_ALTIVEC_COMP, - 128, 128, - __setup_cpu_ppc970, - COMMON_PPC64_FW - }, - { /* PPC970FX */ - 0xffff0000, 0x003c0000, "PPC970FX", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | - CPU_FTR_CAN_NAP | CPU_FTR_PMC8 | CPU_FTR_MMCRA, - COMMON_USER_PPC64 | PPC_FEATURE_HAS_ALTIVEC_COMP, - 128, 128, - __setup_cpu_ppc970, - COMMON_PPC64_FW - }, - { /* Power5 */ - 0xffff0000, 0x003a0000, "POWER5 (gr)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_MMCRA | CPU_FTR_SMT | - CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | - CPU_FTR_MMCRA_SIHV, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power4, - COMMON_PPC64_FW - }, - { /* Power5 */ - 0xffff0000, 0x003b0000, "POWER5 (gs)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_MMCRA | CPU_FTR_SMT | - CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | - CPU_FTR_MMCRA_SIHV, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power4, - COMMON_PPC64_FW - }, - { /* BE DD1.x */ - 0xffff0000, 0x00700000, "Broadband Engine", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | - CPU_FTR_SMT, - COMMON_USER_PPC64 | PPC_FEATURE_HAS_ALTIVEC_COMP, - 128, 128, - 
__setup_cpu_be, - COMMON_PPC64_FW - }, - { /* default match */ - 0x00000000, 0x00000000, "POWER4 (compatible)", - CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | - CPU_FTR_PPCAS_ARCH_V2, - COMMON_USER_PPC64, - 128, 128, - __setup_cpu_power4, - COMMON_PPC64_FW - } + { /* Power3 */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00400000, + .cpu_name = "POWER3 (630)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | + CPU_FTR_PMC8, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power3, + .firmware_features = COMMON_PPC64_FW, + }, + { /* Power3+ */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00410000, + .cpu_name = "POWER3 (630+)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | + CPU_FTR_PMC8, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power3, + .firmware_features = COMMON_PPC64_FW, + }, + { /* Northstar */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00330000, + .cpu_name = "RS64-II (northstar)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | + CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power3, + .firmware_features = COMMON_PPC64_FW, + }, + { /* Pulsar */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00340000, + .cpu_name = "RS64-III (pulsar)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | + CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power3, + .firmware_features = COMMON_PPC64_FW, + }, + { /* I-star */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00360000, + .cpu_name = "RS64-III (icestar)", + .cpu_features = 
CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | + CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power3, + .firmware_features = COMMON_PPC64_FW, + }, + { /* S-star */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00370000, + .cpu_name = "RS64-IV (sstar)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | + CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power3, + .firmware_features = COMMON_PPC64_FW, + }, + { /* Power4 */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00350000, + .cpu_name = "POWER4 (gp)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power4, + .firmware_features = COMMON_PPC64_FW, + }, + { /* Power4+ */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00380000, + .cpu_name = "POWER4+ (gq)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power4, + .firmware_features = COMMON_PPC64_FW, + }, + { /* PPC970 */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00390000, + .cpu_name = "PPC970", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | + CPU_FTR_CAN_NAP | CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64 | + PPC_FEATURE_HAS_ALTIVEC_COMP, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_ppc970, + .firmware_features = COMMON_PPC64_FW, + }, + { /* PPC970FX */ + 
.pvr_mask = 0xffff0000, + .pvr_value = 0x003c0000, + .cpu_name = "PPC970FX", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | + CPU_FTR_CAN_NAP | CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64 | + PPC_FEATURE_HAS_ALTIVEC_COMP, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_ppc970, + .firmware_features = COMMON_PPC64_FW, + }, + { /* Power5 */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x003a0000, + .cpu_name = "POWER5 (gr)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_MMCRA | CPU_FTR_SMT | + CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | + CPU_FTR_MMCRA_SIHV, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power4, + .firmware_features = COMMON_PPC64_FW, + }, + { /* Power5 */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x003b0000, + .cpu_name = "POWER5 (gs)", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_MMCRA | CPU_FTR_SMT | + CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | + CPU_FTR_MMCRA_SIHV, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power4, + .firmware_features = COMMON_PPC64_FW, + }, + { /* BE DD1.x */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00700000, + .cpu_name = "Broadband Engine", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | + CPU_FTR_SMT, + .cpu_user_features = COMMON_USER_PPC64 | + PPC_FEATURE_HAS_ALTIVEC_COMP, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_be, + .firmware_features = COMMON_PPC64_FW, + }, + { /* default match */ + .pvr_mask = 0x00000000, + .pvr_value = 0x00000000, + .cpu_name = "POWER4 (compatible)", + 
.cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2, + .cpu_user_features = COMMON_USER_PPC64, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_power4, + .firmware_features = COMMON_PPC64_FW, + } }; firmware_feature_t firmware_features_table[FIRMWARE_MAX_FEATURES] = { - {FW_FEATURE_PFT, "hcall-pft"}, - {FW_FEATURE_TCE, "hcall-tce"}, - {FW_FEATURE_SPRG0, "hcall-sprg0"}, - {FW_FEATURE_DABR, "hcall-dabr"}, - {FW_FEATURE_COPY, "hcall-copy"}, - {FW_FEATURE_ASR, "hcall-asr"}, - {FW_FEATURE_DEBUG, "hcall-debug"}, - {FW_FEATURE_PERF, "hcall-perf"}, - {FW_FEATURE_DUMP, "hcall-dump"}, - {FW_FEATURE_INTERRUPT, "hcall-interrupt"}, - {FW_FEATURE_MIGRATE, "hcall-migrate"}, - {FW_FEATURE_PERFMON, "hcall-perfmon"}, - {FW_FEATURE_CRQ, "hcall-crq"}, - {FW_FEATURE_VIO, "hcall-vio"}, - {FW_FEATURE_RDMA, "hcall-rdma"}, - {FW_FEATURE_LLAN, "hcall-lLAN"}, - {FW_FEATURE_BULK, "hcall-bulk"}, - {FW_FEATURE_XDABR, "hcall-xdabr"}, - {FW_FEATURE_MULTITCE, "hcall-multi-tce"}, - {FW_FEATURE_SPLPAR, "hcall-splpar"}, + {FW_FEATURE_PFT, "hcall-pft"}, + {FW_FEATURE_TCE, "hcall-tce"}, + {FW_FEATURE_SPRG0, "hcall-sprg0"}, + {FW_FEATURE_DABR, "hcall-dabr"}, + {FW_FEATURE_COPY, "hcall-copy"}, + {FW_FEATURE_ASR, "hcall-asr"}, + {FW_FEATURE_DEBUG, "hcall-debug"}, + {FW_FEATURE_PERF, "hcall-perf"}, + {FW_FEATURE_DUMP, "hcall-dump"}, + {FW_FEATURE_INTERRUPT, "hcall-interrupt"}, + {FW_FEATURE_MIGRATE, "hcall-migrate"}, + {FW_FEATURE_PERFMON, "hcall-perfmon"}, + {FW_FEATURE_CRQ, "hcall-crq"}, + {FW_FEATURE_VIO, "hcall-vio"}, + {FW_FEATURE_RDMA, "hcall-rdma"}, + {FW_FEATURE_LLAN, "hcall-lLAN"}, + {FW_FEATURE_BULK, "hcall-bulk"}, + {FW_FEATURE_XDABR, "hcall-xdabr"}, + {FW_FEATURE_MULTITCE, "hcall-multi-tce"}, + {FW_FEATURE_SPLPAR, "hcall-splpar"}, }; From anton at samba.org Wed Jul 6 04:36:53 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 04:36:53 +1000 Subject: [PATCH] ppc64: Fix runlatch code to work on pseries 
machines In-Reply-To: <20050705162340.GH5384@krispykreme> References: <20050705162340.GH5384@krispykreme> Message-ID: <20050705183653.GI5384@krispykreme> Not all ppc64 CPUs have the CTRL SPR, so we need a cputable feature for it. Signed-off-by: Anton Blanchard Index: linux-2.6.git-work/include/asm-ppc64/processor.h =================================================================== --- linux-2.6.git-work.orig/include/asm-ppc64/processor.h 2005-07-02 08:20:46.000000000 +1000 +++ linux-2.6.git-work/include/asm-ppc64/processor.h 2005-07-06 01:20:04.000000000 +1000 @@ -20,6 +20,7 @@ #include #include #include +#include /* Machine State Register (MSR) Fields */ #define MSR_SF_LG 63 /* Enable 64 bit mode */ @@ -501,18 +502,22 @@ { unsigned long ctrl; - ctrl = mfspr(SPRN_CTRLF); - ctrl |= CTRL_RUNLATCH; - mtspr(SPRN_CTRLT, ctrl); + if (cpu_has_feature(CPU_FTR_CTRL)) { + ctrl = mfspr(SPRN_CTRLF); + ctrl |= CTRL_RUNLATCH; + mtspr(SPRN_CTRLT, ctrl); + } } static inline void ppc64_runlatch_off(void) { unsigned long ctrl; - ctrl = mfspr(SPRN_CTRLF); - ctrl &= ~CTRL_RUNLATCH; - mtspr(SPRN_CTRLT, ctrl); + if (cpu_has_feature(CPU_FTR_CTRL)) { + ctrl = mfspr(SPRN_CTRLF); + ctrl &= ~CTRL_RUNLATCH; + mtspr(SPRN_CTRLT, ctrl); + } } #endif /* __KERNEL__ */ Index: linux-2.6.git-work/include/asm-ppc64/cputable.h =================================================================== --- linux-2.6.git-work.orig/include/asm-ppc64/cputable.h 2005-07-02 08:20:45.000000000 +1000 +++ linux-2.6.git-work/include/asm-ppc64/cputable.h 2005-07-06 01:20:04.000000000 +1000 @@ -138,6 +138,7 @@ #define CPU_FTR_COHERENT_ICACHE ASM_CONST(0x0000020000000000) #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0000040000000000) #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) +#define CPU_FTR_CTRL ASM_CONST(0x0000100000000000) /* Platform firmware features */ #define FW_FTR_ ASM_CONST(0x0000000000000001) @@ -148,7 +149,7 @@ #define CPU_FTR_PPCAS_ARCH_V2_BASE (CPU_FTR_SLB | \ CPU_FTR_TLBIEL | CPU_FTR_NOEXECUTE 
| \ - CPU_FTR_NODSISRALIGN) + CPU_FTR_NODSISRALIGN | CPU_FTR_CTRL) /* iSeries doesn't support large pages */ #ifdef CONFIG_PPC_ISERIES Index: linux-2.6.git-work/arch/ppc64/kernel/cputable.c =================================================================== --- linux-2.6.git-work.orig/arch/ppc64/kernel/cputable.c 2005-07-03 11:15:43.000000000 +1000 +++ linux-2.6.git-work/arch/ppc64/kernel/cputable.c 2005-07-06 01:21:21.000000000 +1000 @@ -81,7 +81,7 @@ .cpu_name = "RS64-II (northstar)", .cpu_features = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | - CPU_FTR_PMC8 | CPU_FTR_MMCRA, + CPU_FTR_PMC8 | CPU_FTR_MMCRA | CPU_FTR_CTRL, .cpu_user_features = COMMON_USER_PPC64, .icache_bsize = 128, .dcache_bsize = 128, @@ -94,7 +94,7 @@ .cpu_name = "RS64-III (pulsar)", .cpu_features = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | - CPU_FTR_PMC8 | CPU_FTR_MMCRA, + CPU_FTR_PMC8 | CPU_FTR_MMCRA | CPU_FTR_CTRL, .cpu_user_features = COMMON_USER_PPC64, .icache_bsize = 128, .dcache_bsize = 128, @@ -107,7 +107,7 @@ .cpu_name = "RS64-III (icestar)", .cpu_features = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | - CPU_FTR_PMC8 | CPU_FTR_MMCRA, + CPU_FTR_PMC8 | CPU_FTR_MMCRA | CPU_FTR_CTRL, .cpu_user_features = COMMON_USER_PPC64, .icache_bsize = 128, .dcache_bsize = 128, @@ -120,7 +120,7 @@ .cpu_name = "RS64-IV (sstar)", .cpu_features = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_IABR | - CPU_FTR_PMC8 | CPU_FTR_MMCRA, + CPU_FTR_PMC8 | CPU_FTR_MMCRA | CPU_FTR_CTRL, .cpu_user_features = COMMON_USER_PPC64, .icache_bsize = 128, .dcache_bsize = 128, From sonny at burdell.org Wed Jul 6 05:48:04 2005 From: sonny at burdell.org (Sonny Rao) Date: Tue, 5 Jul 2005 15:48:04 -0400 Subject: [PATCH] ppc64: Fix runlatch code to work on pseries machines In-Reply-To: <20050705183653.GI5384@krispykreme> References: <20050705162340.GH5384@krispykreme> <20050705183653.GI5384@krispykreme> 
Message-ID: <20050705194804.GA17587@kevlar.burdell.org> On Wed, Jul 06, 2005 at 04:36:53AM +1000, Anton Blanchard wrote: > > Not all ppc64 CPUs have the CTRL SPR, so we need a cputable feature for it. > > Signed-off-by: Anton Blanchard Forgive my ignorance, but why don't POWER4 and above have this feature, is it related to runlatch? Thanks Sonny From anton at samba.org Wed Jul 6 06:22:43 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 06:22:43 +1000 Subject: [PATCH] ppc64: Fix runlatch code to work on pseries machines In-Reply-To: <20050705194804.GA17587@kevlar.burdell.org> References: <20050705162340.GH5384@krispykreme> <20050705183653.GI5384@krispykreme> <20050705194804.GA17587@kevlar.burdell.org> Message-ID: <20050705202243.GB12786@krispykreme> Hi, > Forgive my ignorance, but why don't POWER4 and above have this > feature, is it related to runlatch? It's a feature of PPC AS v2, so I added the define there: @@ -148,7 +149,7 @@ #define CPU_FTR_PPCAS_ARCH_V2_BASE (CPU_FTR_SLB | \ CPU_FTR_TLBIEL | CPU_FTR_NOEXECUTE | \ - CPU_FTR_NODSISRALIGN) + CPU_FTR_NODSISRALIGN | CPU_FTR_CTRL) Anton From anton at samba.org Wed Jul 6 06:21:46 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 06:21:46 +1000 Subject: Make idle_loop a member of ppc_md In-Reply-To: <200507012146.19553.michael@ellerman.id.au> References: <200507012146.19553.michael@ellerman.id.au> Message-ID: <20050705202146.GA12786@krispykreme> Hi Michael, > Currently the idle loop is selected in idle_setup() by consulting > systemcfg->platform and with a few ifdefs as well. > > These five patches make idle_loop a member of the ppc_md structure, and move > the selection into the respective platforms' setup_arch(). > > I wrote this and then changed my mind, and thought we should instead try and > reduce the number of different idle loops. But that looks hard, perhaps > impossible, so this might be as good as it gets. Looks good to me.
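As a rough illustration of the approach described in the quoted text, the sketch below shows an idle loop chosen once at setup time and stored in a per-machine function-pointer table. All names here (machdep_calls_demo, demo_md, demo_setup_arch and the two demo loops) are hypothetical stand-ins, not the kernel's definitions, and the demo loops return an identifier instead of looping forever so the selection can be checked.

```c
/* Hypothetical stand-in for the ppc_md machine-dependent call table. */
struct machdep_calls_demo {
	int (*idle_loop)(void);
};

static struct machdep_calls_demo demo_md;

/* Real idle loops never return; these return an id for demonstration. */
static int demo_shared_idle(void)
{
	return 1;
}

static int demo_dedicated_idle(void)
{
	return 2;
}

/*
 * Platform setup picks the loop once, based on a platform property
 * (here a plain flag standing in for "shared processor partition").
 */
static void demo_setup_arch(int shared_proc)
{
	if (shared_proc)
		demo_md.idle_loop = demo_shared_idle;
	else
		demo_md.idle_loop = demo_dedicated_idle;
}
```

Generic code can then call demo_md.idle_loop() without knowing which platform it runs on, which is what removes the need for platform checks and ifdefs in one central function.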
I've been meaning to fix up our runlatch handling in the idle loops, so here are a few more patches on top of your series. The previous two patches I sent out need to be applied also: [PATCH] ppc64: use c99 initialisers in cputable code [PATCH] ppc64: Fix runlatch code to work on pseries machines Anton From anton at samba.org Wed Jul 6 06:37:21 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 06:37:21 +1000 Subject: Make idle_loop a member of ppc_md In-Reply-To: <20050705202146.GA12786@krispykreme> References: <200507012146.19553.michael@ellerman.id.au> <20050705202146.GA12786@krispykreme> Message-ID: <20050705203721.GC12786@krispykreme> iSeries idle fixups: - remove min/max yield time, we don't use the values anywhere - separate shared and dedicated idle loops - check need_resched again with irqs off to avoid sleeping with pending work - continually set runlatch off in idle loop, this means we don't need to turn the runlatch off on exception exit and suffer that associated cost for all exceptions.
(A future patch will turn the runlatch on at exception entry) Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 02:26:24.061621784 +1000 +++ foobar2/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 05:49:46.629711734 +1000 @@ -834,9 +834,6 @@ late_initcall(iSeries_src_init); -static unsigned long maxYieldTime = 0; -static unsigned long minYieldTime = 0xffffffffffffffffUL; - static inline void process_iSeries_events(void) { asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); @@ -845,7 +842,6 @@ static void yield_shared_processor(void) { unsigned long tb; - unsigned long yieldTime; HvCall_setEnabledInterrupts(HvCall_MaskIPI | HvCall_MaskLpEvent | @@ -856,13 +852,6 @@ /* Compute future tb value when yield should expire */ HvCall_yieldProcessor(HvCall_YieldTimed, tb+tb_ticks_per_jiffy); - yieldTime = get_tb() - tb; - if (yieldTime > maxYieldTime) - maxYieldTime = yieldTime; - - if (yieldTime < minYieldTime) - minYieldTime = yieldTime; - /* * The decrementer stops during the yield. Force a fake decrementer * here and let the timer_interrupt code sort out the actual time. 
@@ -871,45 +860,62 @@ process_iSeries_events(); } -static int iSeries_idle(void) +static int iseries_shared_idle(void) { - struct paca_struct *lpaca; - long oldval; + while (1) { + while (!need_resched() && !hvlpevent_is_pending()) { + local_irq_disable(); + ppc64_runlatch_off(); + + /* Recheck with irqs off */ + if (!need_resched() && !hvlpevent_is_pending()) + yield_shared_processor(); + + HMT_medium(); + local_irq_enable(); + } + + ppc64_runlatch_on(); + + if (hvlpevent_is_pending()) + process_iSeries_events(); + + schedule(); + } - /* ensure iSeries run light will be out when idle */ - ppc64_runlatch_off(); + return 0; +} - lpaca = get_paca(); +static int iseries_dedicated_idle(void) +{ + struct paca_struct *lpaca = get_paca(); + long oldval; while (1) { - if (lpaca->lppaca.shared_proc) { - if (hvlpevent_is_pending()) - process_iSeries_events(); - if (!need_resched()) - yield_shared_processor(); - } else { - oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + + if (!oldval) { + set_thread_flag(TIF_POLLING_NRFLAG); - if (!oldval) { - set_thread_flag(TIF_POLLING_NRFLAG); + while (!need_resched()) { + ppc64_runlatch_off(); + HMT_low(); - while (!need_resched()) { + if (hvlpevent_is_pending()) { HMT_medium(); - if (hvlpevent_is_pending()) - process_iSeries_events(); - HMT_low(); + ppc64_runlatch_on(); + process_iSeries_events(); } - - HMT_medium(); - clear_thread_flag(TIF_POLLING_NRFLAG); - } else { - set_need_resched(); } + + HMT_medium(); + clear_thread_flag(TIF_POLLING_NRFLAG); + } else { + set_need_resched(); } ppc64_runlatch_on(); schedule(); - ppc64_runlatch_off(); } return 0; @@ -940,6 +946,10 @@ ppc_md.get_rtc_time = iSeries_get_rtc_time; ppc_md.calibrate_decr = iSeries_calibrate_decr; ppc_md.progress = iSeries_progress; - ppc_md.idle_loop = iSeries_idle; + + if (get_paca()->lppaca.shared_proc) + ppc_md.idle_loop = iseries_shared_idle; + else + ppc_md.idle_loop = iseries_dedicated_idle; } From 
anton at samba.org Wed Jul 6 06:43:03 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 06:43:03 +1000 Subject: [PATCH] ppc64: pSeries idle fixups In-Reply-To: <20050705203721.GC12786@krispykreme> References: <200507012146.19553.michael@ellerman.id.au> <20050705202146.GA12786@krispykreme> <20050705203721.GC12786@krispykreme> Message-ID: <20050705204303.GD12786@krispykreme> pSeries idle fixups: - separate out sleep logic in dedicated_idle, it was so far indented that it got squashed against the right side of the screen. - add runlatch support, looping on runlatch disable. Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/pSeries_setup.c 2005-07-06 05:49:51.479133649 +1000 +++ foobar2/arch/ppc64/kernel/pSeries_setup.c 2005-07-06 06:14:22.752007077 +1000 @@ -83,8 +83,8 @@ extern void pSeries_system_reset_exception(struct pt_regs *regs); extern int pSeries_machine_check_exception(struct pt_regs *regs); -static int shared_idle(void); -static int dedicated_idle(void); +static int pseries_shared_idle(void); +static int pseries_dedicated_idle(void); static volatile void __iomem * chrp_int_ack_special; struct mpic *pSeries_mpic; @@ -238,10 +238,10 @@ if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) { if (get_paca()->lppaca.shared_proc) { printk(KERN_INFO "Using shared processor idle loop\n"); - ppc_md.idle_loop = shared_idle; + ppc_md.idle_loop = pseries_shared_idle; } else { printk(KERN_INFO "Using dedicated idle loop\n"); - ppc_md.idle_loop = dedicated_idle; + ppc_md.idle_loop = pseries_dedicated_idle; } } else { printk(KERN_INFO "Using default idle loop\n"); @@ -438,15 +438,47 @@ DECLARE_PER_CPU(unsigned long, smt_snooze_delay); -int dedicated_idle(void) +static inline void dedicated_idle_sleep(unsigned int cpu) +{ + struct paca_struct *ppaca = &paca[cpu ^ 1]; + + /* Only sleep if the other thread is not idle */ 
+ if (!(ppaca->lppaca.idle)) { + local_irq_disable(); + + /* + * We are about to sleep the thread and so wont be polling any + * more. + */ + clear_thread_flag(TIF_POLLING_NRFLAG); + + /* + * SMT dynamic mode. Cede will result in this thread going + * dormant, if the partner thread is still doing work. Thread + * wakes up if partner goes idle, an interrupt is presented, or + * a prod occurs. Returning from the cede enables external + * interrupts. + */ + if (!need_resched()) + cede_processor(); + else + local_irq_enable(); + } else { + /* + * Give the HV an opportunity at the processor, since we are + * not doing any work. + */ + poll_pending(); + } +} + +static int pseries_dedicated_idle(void) { long oldval; - struct paca_struct *lpaca = get_paca(), *ppaca; + struct paca_struct *lpaca = get_paca(); + unsigned int cpu = smp_processor_id(); unsigned long start_snooze; unsigned long *smt_snooze_delay = &__get_cpu_var(smt_snooze_delay); - unsigned int cpu = smp_processor_id(); - - ppaca = &paca[cpu ^ 1]; while (1) { /* @@ -458,9 +490,13 @@ oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); if (!oldval) { set_thread_flag(TIF_POLLING_NRFLAG); + start_snooze = __get_tb() + *smt_snooze_delay * tb_ticks_per_usec; + while (!need_resched() && !cpu_is_offline(cpu)) { + ppc64_runlatch_off(); + /* * Go into low thread priority and possibly * low power mode. @@ -468,60 +504,31 @@ HMT_low(); HMT_very_low(); - if (*smt_snooze_delay == 0 || - __get_tb() < start_snooze) - continue; - - HMT_medium(); - - if (!(ppaca->lppaca.idle)) { - local_irq_disable(); - - /* - * We are about to sleep the thread - * and so wont be polling any - * more. - */ - clear_thread_flag(TIF_POLLING_NRFLAG); - - /* - * SMT dynamic mode. Cede will result - * in this thread going dormant, if the - * partner thread is still doing work. - * Thread wakes up if partner goes idle, - * an interrupt is presented, or a prod - * occurs. Returning from the cede - * enables external interrupts. 
- */ - if (!need_resched()) - cede_processor(); - else - local_irq_enable(); - } else { - /* - * Give the HV an opportunity at the - * processor, since we are not doing - * any work. - */ - poll_pending(); + if (*smt_snooze_delay != 0 && + __get_tb() > start_snooze) { + HMT_medium(); + dedicated_idle_sleep(cpu); } + } + HMT_medium(); clear_thread_flag(TIF_POLLING_NRFLAG); } else { set_need_resched(); } - HMT_medium(); lpaca->lppaca.idle = 0; + ppc64_runlatch_on(); + schedule(); + if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) cpu_die(); } - return 0; } -static int shared_idle(void) +static int pseries_shared_idle(void) { struct paca_struct *lpaca = get_paca(); unsigned int cpu = smp_processor_id(); @@ -535,6 +542,7 @@ while (!need_resched() && !cpu_is_offline(cpu)) { local_irq_disable(); + ppc64_runlatch_off(); /* * Yield the processor to the hypervisor. We return if @@ -550,13 +558,16 @@ cede_processor(); else local_irq_enable(); + + HMT_medium(); } - HMT_medium(); lpaca->lppaca.idle = 0; + ppc64_runlatch_on(); + schedule(); - if (cpu_is_offline(smp_processor_id()) && - system_state == SYSTEM_RUNNING) + + if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) cpu_die(); } From olh at suse.de Wed Jul 6 06:47:36 2005 From: olh at suse.de (Olaf Hering) Date: Tue, 5 Jul 2005 22:47:36 +0200 Subject: [PATCH] allow xmon=nobt to not print a backtrace by default In-Reply-To: <20050531202931.GA14769@suse.de> References: <20050531202931.GA14769@suse.de> Message-ID: <20050705204736.GA31800@suse.de> (untested) xmon does not print a backtrace per default. This is bad on systems with USB keyboard, the most needed info about the crash is lost. print a backtrace during the very first xmon entry. Booting with xmon=nobt disables the autobacktrace functionality. 
Signed-off-by: Olaf Hering arch/ppc64/kernel/setup.c | 4 ++++ arch/ppc64/xmon/xmon.c | 5 +++++ 2 files changed, 9 insertions(+) Index: linux-2.6.12-olh/arch/ppc64/kernel/setup.c =================================================================== --- linux-2.6.12-olh.orig/arch/ppc64/kernel/setup.c +++ linux-2.6.12-olh/arch/ppc64/kernel/setup.c @@ -91,6 +91,8 @@ extern void udbg_init_maple_realmode(voi do { ppc_md.udbg_putc = call_rtas_display_status_delay; } while(0) #endif +extern int xmon_no_auto_backtrace; + /* extern void *stab; */ extern unsigned long klimit; @@ -1318,6 +1320,8 @@ static int __init early_xmon(char *p) { /* ensure xmon is enabled */ if (p) { + if (strncmp(p, "nobt", 4) == 0) + xmon_no_auto_backtrace++; if (strncmp(p, "on", 2) == 0) xmon_init(); if (strncmp(p, "early", 5) != 0) Index: linux-2.6.12-olh/arch/ppc64/xmon/xmon.c =================================================================== --- linux-2.6.12-olh.orig/arch/ppc64/xmon/xmon.c +++ linux-2.6.12-olh/arch/ppc64/xmon/xmon.c @@ -132,11 +132,13 @@ static void csum(void); static void bootcmds(void); void dump_segments(void); static void symbol_lookup(void); +static void xmon_show_stack(unsigned long sp, unsigned long lr, unsigned long pc); static void xmon_print_symbol(unsigned long address, const char *mid, const char *after); static const char *getvecname(unsigned long vec); static void debug_trace(void); +int xmon_no_auto_backtrace; extern int print_insn_powerpc(unsigned long, unsigned long, int); extern void printf(const char *fmt, ...); @@ -768,6 +770,9 @@ cmds(struct pt_regs *excp) last_cmd = NULL; xmon_regs = excp; + if (!xmon_no_auto_backtrace++) + xmon_show_stack(excp->gpr[1], excp->link, excp->nip); + for(;;) { #ifdef CONFIG_SMP printf("%x:", smp_processor_id()); From anton at samba.org Wed Jul 6 06:46:15 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 06:46:15 +1000 Subject: [PATCH] ppc64: idle fixups In-Reply-To: <20050705204303.GD12786@krispykreme> 
References: <200507012146.19553.michael@ellerman.id.au> <20050705202146.GA12786@krispykreme> <20050705203721.GC12786@krispykreme> <20050705204303.GD12786@krispykreme> Message-ID: <20050705204615.GE12786@krispykreme> - remove some unnecessary includes - add runlatch support - no need to use raw_smp_processor_id any more, current preempt debug logic checks for processes that are bound to one cpu. Signed-off-by: Anton Blanchard Index: linux-2.6.git-work/arch/ppc64/kernel/idle.c =================================================================== --- linux-2.6.git-work.orig/arch/ppc64/kernel/idle.c 2005-07-02 08:24:55.000000000 +1000 +++ linux-2.6.git-work/arch/ppc64/kernel/idle.c 2005-07-06 01:50:08.000000000 +1000 @@ -20,18 +20,12 @@ #include #include #include -#include #include -#include #include #include -#include #include #include -#include -#include -#include #include #include @@ -49,7 +43,8 @@ set_thread_flag(TIF_POLLING_NRFLAG); while (!need_resched() && !cpu_is_offline(cpu)) { - barrier(); + ppc64_runlatch_off(); + /* * Go into low thread priority and possibly * low power mode. 
@@ -64,6 +59,7 @@ set_need_resched(); } + ppc64_runlatch_on(); schedule(); if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) cpu_die(); @@ -74,17 +70,22 @@ int native_idle(void) { - while(1) { - /* check CPU type here */ + while (1) { + ppc64_runlatch_off(); + if (!need_resched()) power4_idle(); - if (need_resched()) + + if (need_resched()) { + ppc64_runlatch_on(); schedule(); + } - if (cpu_is_offline(raw_smp_processor_id()) && + if (cpu_is_offline(smp_processor_id()) && system_state == SYSTEM_RUNNING) cpu_die(); } + return 0; } From anton at samba.org Wed Jul 6 08:49:51 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 08:49:51 +1000 Subject: [PATCH] ppc64: fix compile warning In-Reply-To: <20050705204615.GE12786@krispykreme> References: <200507012146.19553.michael@ellerman.id.au> <20050705202146.GA12786@krispykreme> <20050705203721.GC12786@krispykreme> <20050705204303.GD12786@krispykreme> <20050705204615.GE12786@krispykreme> Message-ID: <20050705224951.GK12786@krispykreme> Fix a compile warning introduced by the previous patches. Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 07:09:32.039334942 +1000 +++ foobar2/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 07:14:29.159334906 +1000 @@ -888,7 +888,6 @@ static int iseries_dedicated_idle(void) { - struct paca_struct *lpaca = get_paca(); long oldval; while (1) { From anton at samba.org Wed Jul 6 09:12:43 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 6 Jul 2005 09:12:43 +1000 Subject: [PATCH] ppc64: Turn runlatch on in exception entry In-Reply-To: <20050705183653.GI5384@krispykreme> References: <20050705162340.GH5384@krispykreme> <20050705183653.GI5384@krispykreme> Message-ID: <20050705231243.GM12786@krispykreme> Enable the runlatch at the start of each exception. 
Unfortunately we are out of space in the 0x300 handler, so I added it a bit later. The SPR write is fairly expensive, perhaps we should cache the runlatch state in the paca and avoid the write when possible. We dont need to turn the runlatch off, we do that in the idle loop. Better to take the hit in the idle loop than for each exception exit. Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/head.S =================================================================== --- foobar2.orig/arch/ppc64/kernel/head.S 2005-07-06 07:28:11.576663962 +1000 +++ foobar2/arch/ppc64/kernel/head.S 2005-07-06 07:35:41.567944291 +1000 @@ -308,6 +308,7 @@ label##_pSeries: \ HMT_MEDIUM; \ mtspr SPRG1,r13; /* save r13 */ \ + RUNLATCH_ON(r13); \ EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common) #define STD_EXCEPTION_ISERIES(n, label, area) \ @@ -315,6 +316,7 @@ label##_iSeries: \ HMT_MEDIUM; \ mtspr SPRG1,r13; /* save r13 */ \ + RUNLATCH_ON(r13); \ EXCEPTION_PROLOG_ISERIES_1(area); \ EXCEPTION_PROLOG_ISERIES_2; \ b label##_common @@ -324,6 +326,7 @@ label##_iSeries: \ HMT_MEDIUM; \ mtspr SPRG1,r13; /* save r13 */ \ + RUNLATCH_ON(r13); \ EXCEPTION_PROLOG_ISERIES_1(PACA_EXGEN); \ lbz r10,PACAPROCENABLED(r13); \ cmpwi 0,r10,0; \ @@ -393,6 +396,7 @@ _machine_check_pSeries: HMT_MEDIUM mtspr SPRG1,r13 /* save r13 */ + RUNLATCH_ON(r13) EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common) . 
= 0x300 @@ -419,6 +423,7 @@ data_access_slb_pSeries: HMT_MEDIUM mtspr SPRG1,r13 + RUNLATCH_ON(r13) mfspr r13,SPRG3 /* get paca address into r13 */ std r9,PACA_EXSLB+EX_R9(r13) /* save r9 - r12 */ std r10,PACA_EXSLB+EX_R10(r13) @@ -439,6 +444,7 @@ instruction_access_slb_pSeries: HMT_MEDIUM mtspr SPRG1,r13 + RUNLATCH_ON(r13) mfspr r13,SPRG3 /* get paca address into r13 */ std r9,PACA_EXSLB+EX_R9(r13) /* save r9 - r12 */ std r10,PACA_EXSLB+EX_R10(r13) @@ -464,6 +470,7 @@ .globl system_call_pSeries system_call_pSeries: HMT_MEDIUM + RUNLATCH_ON(r9) mr r9,r13 mfmsr r10 mfspr r13,SPRG3 @@ -707,11 +714,13 @@ system_reset_fwnmi: HMT_MEDIUM mtspr SPRG1,r13 /* save r13 */ + RUNLATCH_ON(r13) EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common) .globl machine_check_fwnmi machine_check_fwnmi: HMT_MEDIUM mtspr SPRG1,r13 /* save r13 */ + RUNLATCH_ON(r13) EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common) /* @@ -848,6 +857,7 @@ .align 7 .globl data_access_common data_access_common: + RUNLATCH_ON(r10) /* It wont fit in the 0x300 handler */ mfspr r10,DAR std r10,PACA_EXGEN+EX_DAR(r13) mfspr r10,DSISR Index: foobar2/include/asm-ppc64/processor.h =================================================================== --- foobar2.orig/include/asm-ppc64/processor.h 2005-07-06 07:28:11.577663885 +1000 +++ foobar2/include/asm-ppc64/processor.h 2005-07-06 07:30:34.878399039 +1000 @@ -524,6 +524,15 @@ #endif /* __ASSEMBLY__ */ +#ifdef __KERNEL__ +#define RUNLATCH_ON(REG) \ +BEGIN_FTR_SECTION \ + mfspr (REG),SPRN_CTRLF; \ + ori (REG),(REG),CTRL_RUNLATCH; \ + mtspr SPRN_CTRLT,(REG); \ +END_FTR_SECTION_IFSET(CPU_FTR_CTRL) +#endif + /* * Number of entries in the SLB. If this ever changes we should handle * it with a use a cpu feature fixup. 
From benh at kernel.crashing.org Wed Jul 6 09:54:15 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 06 Jul 2005 09:54:15 +1000 Subject: [PATCH] vdso32, fix link errors after recent toolchain changes In-Reply-To: <20050704120244.GA10377@suse.de> References: <20050704120244.GA10377@suse.de> Message-ID: <1120607655.31924.175.camel@gaston> On Mon, 2005-07-04 at 14:02 +0200, Olaf Hering wrote: > Patch from amodra at bigpond.net.au, http://sources.redhat.com/bugzilla/show_bug.cgi?id=1042 > > /usr/bin/ld: arch/ppc64/kernel/vdso32/vdso32.so: The first section in the PT_DYNAMIC segment is not the .dynamic section > > Signed-off-by: Olaf Hering Acked-by: Benjamin Herrenschmidt From michael at ellerman.id.au Wed Jul 6 12:41:13 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 6 Jul 2005 12:41:13 +1000 Subject: [PATCH] ppc64: Be consistent about printing which idle loop we're using In-Reply-To: <20050705224951.GK12786@krispykreme> References: <200507012146.19553.michael@ellerman.id.au> <20050705204615.GE12786@krispykreme> <20050705224951.GK12786@krispykreme> Message-ID: <200507061241.30135.michael@ellerman.id.au> Not sure if we really need this, but it was handy to know which iSeries loop I was testing. Be consistent about printing which idle loop we're using, with this patch we cover all cases. 
Signed-off-by: Michael Ellerman --- arch/ppc64/kernel/iSeries_setup.c | 7 +++++-- arch/ppc64/kernel/setup.c | 4 +++- 2 files changed, 8 insertions(+), 3 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -946,9 +946,12 @@ void __init iSeries_early_setup(void) ppc_md.calibrate_decr = iSeries_calibrate_decr; ppc_md.progress = iSeries_progress; - if (get_paca()->lppaca.shared_proc) + if (get_paca()->lppaca.shared_proc) { ppc_md.idle_loop = iseries_shared_idle; - else + printk(KERN_INFO "Using shared processor idle loop\n"); + } else { ppc_md.idle_loop = iseries_dedicated_idle; + printk(KERN_INFO "Using dedicated idle loop\n"); + } } Index: work/arch/ppc64/kernel/setup.c =================================================================== --- work.orig/arch/ppc64/kernel/setup.c +++ work/arch/ppc64/kernel/setup.c @@ -1081,8 +1081,10 @@ void __init setup_arch(char **cmdline_p) ppc_md.setup_arch(); /* Use the default idle loop if the platform hasn't provided one. */ - if (NULL == ppc_md.idle_loop) + if (NULL == ppc_md.idle_loop) { ppc_md.idle_loop = default_idle; + printk(KERN_INFO "Using default idle loop\n"); + } paging_init(); ppc64_boot_msg(0x15, "Setup Done"); From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 14:53:06 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 13:53:06 +0900 Subject: [PATCH 2.6.13-rc1 01/10] IOCHK interface for I/O error handling/detecting Message-ID: <42CB63B2.6000505@jp.fujitsu.com> Hi all, The followings are updated version of patches I've posted to implement IOCHK interface for I/O error handling/detecting. The abstraction of patches hasn't changed, so please refer archives if you need, e.g.: http://lwn.net/Articles/139240/ Tony, how do you think about applying my patches to your tree? 
Thanks, H.Seto [This is 1 of 10 patches, "iochk-01-generic.patch"] - It defines: a pair of function : iochk_clear and iochk_read a function for init : iochk_init type of control var : iocookie and describe "no-ops" as its "generic" action. - HAVE_ARCH_IOMAP_CHECK allows us to change whole definition of these functions and type from generic one to specific one. See next patch (2 of 10). Changes from previous one for 2.6.11.11: - reform default "nop" functions in static inline style. - I don't mind using EXPORT_SYMBOL_GPL but keep them as before. Does anyone worry about this? Signed-off-by: Hidetoshi Seto --- drivers/pci/pci.c | 2 ++ include/asm-generic/iomap.h | 32 ++++++++++++++++++++++++++++++++ lib/iomap.c | 6 ++++++ 3 files changed, 40 insertions(+) Index: linux-2.6.13-rc1/lib/iomap.c =================================================================== --- linux-2.6.13-rc1.orig/lib/iomap.c +++ linux-2.6.13-rc1/lib/iomap.c @@ -230,3 +230,9 @@ void pci_iounmap(struct pci_dev *dev, vo } EXPORT_SYMBOL(pci_iomap); EXPORT_SYMBOL(pci_iounmap); + +#ifndef HAVE_ARCH_IOMAP_CHECK +/* Since generic funcs are inlined and defined in header, just export */ +EXPORT_SYMBOL(iochk_clear); +EXPORT_SYMBOL(iochk_read); +#endif Index: linux-2.6.13-rc1/include/asm-generic/iomap.h =================================================================== --- linux-2.6.13-rc1.orig/include/asm-generic/iomap.h +++ linux-2.6.13-rc1/include/asm-generic/iomap.h @@ -65,4 +65,36 @@ struct pci_dev; extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max); extern void pci_iounmap(struct pci_dev *dev, void __iomem *); +/* + * IOMAP_CHECK provides additional interfaces for drivers to detect + * some IO errors, supports drivers having ability to recover errors. + * + * All works around iomap-check depends on the design of "iocookie" + * structure. Every architecture owning its iomap-check is free to + * define the actual design of iocookie to fit its special style. 
+ */ +#ifndef HAVE_ARCH_IOMAP_CHECK +/* Dummy definition of default iocookie */ +typedef int iocookie; +#endif + +/* + * Clear/Read iocookie to check IO error while using iomap. + * + * Note that default iochk_clear-read pair interfaces don't have + * any effective error check, but some high-reliable platforms + * would provide useful information to you. + * And note that some action may be limited (ex. irq-unsafe) + * between the pair depend on the facility of the platform. + */ +#ifdef HAVE_ARCH_IOMAP_CHECK +extern void iochk_init(void); +extern void iochk_clear(iocookie *cookie, struct pci_dev *dev); +extern int iochk_read(iocookie *cookie); +#else +static inline void iochk_init(void) {} +static inline void iochk_clear(iocookie *cookie, struct pci_dev *dev) {} +static inline int iochk_read(iocookie *cookie) { return 0; } +#endif + #endif Index: linux-2.6.13-rc1/drivers/pci/pci.c =================================================================== --- linux-2.6.13-rc1.orig/drivers/pci/pci.c +++ linux-2.6.13-rc1/drivers/pci/pci.c @@ -767,6 +767,8 @@ static int __devinit pci_init(void) while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { pci_fixup_device(pci_fixup_final, dev); } + + iochk_init(); return 0; } From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:00:22 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 14:00:22 +0900 Subject: [PATCH 2.6.13-rc1 02/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <42CB6566.8090804@jp.fujitsu.com> [This is 2 of 10 patches, "iochk-02-ia64.patch"] - Add "config IOMAP_CHECK" to change definitions from generic to specific. - Defines ia64 version of: iochk_clear, iochk_read, iochk_init, and iocookie But they are no-ops yet. See next patch (3 of 10). Changes from previous one for 2.6.11.11: - simplify define of iocookie structure. 
Signed-off-by: Hidetoshi Seto --- arch/ia64/Kconfig | 13 +++++++++++++ arch/ia64/lib/Makefile | 1 + arch/ia64/lib/iomap_check.c | 30 ++++++++++++++++++++++++++++++ include/asm-ia64/io.h | 13 +++++++++++++ 4 files changed, 57 insertions(+) Index: linux-2.6.13-rc1/arch/ia64/lib/Makefile =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/Makefile +++ linux-2.6.13-rc1/arch/ia64/lib/Makefile @@ -16,6 +16,7 @@ lib-$(CONFIG_MCKINLEY) += copy_page_mck. lib-$(CONFIG_PERFMON) += carta_random.o lib-$(CONFIG_MD_RAID5) += xor.o lib-$(CONFIG_HAVE_DEC_LOCK) += dec_and_lock.o +lib-$(CONFIG_IOMAP_CHECK) += iomap_check.o AFLAGS___divdi3.o = AFLAGS___udivdi3.o = -DUNSIGNED Index: linux-2.6.13-rc1/arch/ia64/Kconfig =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/Kconfig +++ linux-2.6.13-rc1/arch/ia64/Kconfig @@ -413,6 +413,19 @@ config PCI_DOMAINS bool default PCI +config IOMAP_CHECK + bool "Support iochk interfaces for IO error detection." + depends on PCI && EXPERIMENTAL + ---help--- + Saying Y provides iochk infrastructure for "RAS-aware" drivers + to detect and recover some IO errors, which strongly required by + some of very-high-reliable systems. + The implementation of this infrastructure is highly depend on arch, + bus system, chipset and so on. + Currentry, very few drivers on few arch actually implements this. + + If you don't know what to do here, say N. 
+ source "drivers/pci/Kconfig" source "drivers/pci/hotplug/Kconfig" Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- /dev/null +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -0,0 +1,30 @@ +/* + * File: iomap_check.c + * Purpose: Implement the IA64 specific iomap recovery interfaces + */ + +#include + +void iochk_init(void); +void iochk_clear(iocookie *cookie, struct pci_dev *dev); +int iochk_read(iocookie *cookie); + +void iochk_init(void) +{ + /* setup */ +} + +void iochk_clear(iocookie *cookie, struct pci_dev *dev) +{ + /* register device etc. */ +} + +int iochk_read(iocookie *cookie) +{ + /* check error etc. */ + + return 0; +} + +EXPORT_SYMBOL(iochk_read); +EXPORT_SYMBOL(iochk_clear); Index: linux-2.6.13-rc1/include/asm-ia64/io.h =================================================================== --- linux-2.6.13-rc1.orig/include/asm-ia64/io.h +++ linux-2.6.13-rc1/include/asm-ia64/io.h @@ -70,6 +70,19 @@ extern unsigned int num_io_spaces; #include #include #include + +#ifdef CONFIG_IOMAP_CHECK + +/* ia64 iocookie */ +typedef struct { + int dummy; +} iocookie; + +/* Enable ia64 iochk - See arch/ia64/lib/iomap_check.c */ +#define HAVE_ARCH_IOMAP_CHECK + +#endif /* CONFIG_IOMAP_CHECK */ + #include /* From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:04:14 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 14:04:14 +0900 Subject: [PATCH 2.6.13-rc1 03/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <42CB664E.1050003@jp.fujitsu.com> [This is 3 of 10 patches, "iochk-03-register.patch"] - Implement ia64 version of basic codes: iochk_clear, iochk_read, iochk_init, and iocookie The direction is: - Have a "now in check" global list, "iochk_devices", for future use. - Take a lock, "iochk_lock", to protect the global list. 
- iochk_clear packs *dev into iocookie, and add it to the global list. After all prepared, clear error-flag in cookie to start io-critical-session. - iochk_read checks error-flag and device's status register. After removing iocookie from list, return the result. This is too simple. We need more codes... See next (4 of 10). Changes from previous one for 2.6.11.11: - trivial coding style fix. Signed-off-by: Hidetoshi Seto --- arch/ia64/lib/iomap_check.c | 55 ++++++++++++++++++++++++++++++++++++++++++-- include/asm-ia64/io.h | 5 +++- 2 files changed, 57 insertions(+), 3 deletions(-) Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -4,24 +4,75 @@ */ #include +#include +#include void iochk_init(void); void iochk_clear(iocookie *cookie, struct pci_dev *dev); int iochk_read(iocookie *cookie); +struct list_head iochk_devices; +DEFINE_SPINLOCK(iochk_lock); /* all works are excluded on this lock */ + +static int have_error(struct pci_dev *dev); + void iochk_init(void) { /* setup */ + INIT_LIST_HEAD(&iochk_devices); } void iochk_clear(iocookie *cookie, struct pci_dev *dev) { - /* register device etc. */ + unsigned long flag; + + INIT_LIST_HEAD(&(cookie->list)); + + cookie->dev = dev; + + spin_lock_irqsave(&iochk_lock, flag); + list_add(&cookie->list, &iochk_devices); + spin_unlock_irqrestore(&iochk_lock, flag); + + cookie->error = 0; } int iochk_read(iocookie *cookie) { - /* check error etc. 
*/ + unsigned long flag; + int ret = 0; + + spin_lock_irqsave(&iochk_lock, flag); + if (cookie->error || have_error(cookie->dev)) + ret = 1; + list_del(&cookie->list); + spin_unlock_irqrestore(&iochk_lock, flag); + + return ret; +} + +static int have_error(struct pci_dev *dev) +{ + u16 status; + + /* check status */ + switch (dev->hdr_type) { + case PCI_HEADER_TYPE_NORMAL: /* 0 */ + pci_read_config_word(dev, PCI_STATUS, &status); + break; + case PCI_HEADER_TYPE_BRIDGE: /* 1 */ + pci_read_config_word(dev, PCI_SEC_STATUS, &status); + break; + case PCI_HEADER_TYPE_CARDBUS: /* 2 */ + return 0; /* FIX ME */ + default: + BUG(); + } + + if ( (status & PCI_STATUS_REC_TARGET_ABORT) + || (status & PCI_STATUS_REC_MASTER_ABORT) + || (status & PCI_STATUS_DETECTED_PARITY) ) + return 1; return 0; } Index: linux-2.6.13-rc1/include/asm-ia64/io.h =================================================================== --- linux-2.6.13-rc1.orig/include/asm-ia64/io.h +++ linux-2.6.13-rc1/include/asm-ia64/io.h @@ -72,10 +72,13 @@ extern unsigned int num_io_spaces; #include #ifdef CONFIG_IOMAP_CHECK +#include /* ia64 iocookie */ typedef struct { - int dummy; + struct list_head list; + struct pci_dev *dev; /* target device */ + unsigned long error; /* error flag */ } iocookie; /* Enable ia64 iochk - See arch/ia64/lib/iomap_check.c */ From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:07:39 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 14:07:39 +0900 Subject: [PATCH 2.6.13-rc1 04/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <42CB671B.5000604@jp.fujitsu.com> [This is 4 of 10 patches, "iochk-04-register_bridge.patch"] - Since there could be a (PCI-)bus-error, some kind of error cannot detected on the device but on its hosting bridge. So, it is also required to check the bridge's register. 
In other words, to check a bus-error correctly, we need to check both end of the bus, device and its host bridge. OK, but often bridges are shared by multiple devices, right? So we need care to handle it... Yes, see next (5 of 10). Changes from previous one for 2.6.11.11: - trivial coding style fix. Signed-off-by: Hidetoshi Seto --- arch/ia64/lib/iomap_check.c | 20 +++++++++++++++++++- include/asm-ia64/io.h | 1 + 2 files changed, 20 insertions(+), 1 deletion(-) Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -14,6 +14,7 @@ int iochk_read(iocookie *cookie); struct list_head iochk_devices; DEFINE_SPINLOCK(iochk_lock); /* all works are excluded on this lock */ +static struct pci_dev *search_host_bridge(struct pci_dev *dev); static int have_error(struct pci_dev *dev); void iochk_init(void) @@ -29,6 +30,7 @@ void iochk_clear(iocookie *cookie, struc INIT_LIST_HEAD(&(cookie->list)); cookie->dev = dev; + cookie->host = search_host_bridge(dev); spin_lock_irqsave(&iochk_lock, flag); list_add(&cookie->list, &iochk_devices); @@ -43,7 +45,8 @@ int iochk_read(iocookie *cookie) int ret = 0; spin_lock_irqsave(&iochk_lock, flag); - if (cookie->error || have_error(cookie->dev)) + if ( cookie->error || have_error(cookie->dev) + || (cookie->host && have_error(cookie->host)) ) ret = 1; list_del(&cookie->list); spin_unlock_irqrestore(&iochk_lock, flag); @@ -51,6 +54,21 @@ int iochk_read(iocookie *cookie) return ret; } +struct pci_dev *search_host_bridge(struct pci_dev *dev) +{ + struct pci_bus *pbus; + + /* there is no bridge */ + if (!dev->bus->self) + return NULL; + + /* find root bus bridge */ + for (pbus = dev->bus; pbus->parent && pbus->parent->self; + pbus = pbus->parent); + + return pbus->self; +} + static int have_error(struct pci_dev *dev) { u16 status; Index: linux-2.6.13-rc1/include/asm-ia64/io.h 
=================================================================== --- linux-2.6.13-rc1.orig/include/asm-ia64/io.h +++ linux-2.6.13-rc1/include/asm-ia64/io.h @@ -78,6 +78,7 @@ extern unsigned int num_io_spaces; typedef struct { struct list_head list; struct pci_dev *dev; /* target device */ + struct pci_dev *host; /* hosting bridge */ unsigned long error; /* error flag */ } iocookie; From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:11:42 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 14:11:42 +0900 Subject: [PATCH 2.6.13-rc1 05/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <42CB680E.2010103@jp.fujitsu.com> [This is 5 of 10 patches, "iochk-05-check_bridge.patch"] - Consider three devices, A, B, and C, placed under the same host bridge H. After A and B have checked in (= passed iochk_clear, doing some I/Os, not yet calling iochk_read), C is now going to check in and has just entered iochk_clear, but C finds that H indicates an error. It means that A or B hit a bus error, but there is no data telling which one actually hit the error. So, C should notify the error to both A and B, and clear H's status to start its own I/Os. If there are only two devices, it becomes simpler. It is clear that if one finds a bridge error while the other is checked in, the error can only be the other's. Well, the work concerning registers (devices and bridges) is almost in shape. So, from the next patch, I'll move to a deeper phase and implement more arch-specific code... see next (6 of 10).
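The A/B/C scenario above can be modelled in plain userspace C. This is only a sketch under stated assumptions: the `sim_` names, the fixed-size session table (standing in for the "iochk_devices" list) and the single `error` bit per bridge are all hypothetical, and there is no locking and no real PCI access. It just shows a newcomer handing a pending bridge error to the sessions already checked in under that bridge.

```c
#include <stddef.h>

#define SIM_MAX_SESSIONS 8

/* Hypothetical host bridge: one sticky error bit shared by its devices. */
struct sim_bridge {
	int error;
};

/* Hypothetical cookie: the bridge the device sits under, plus its flag. */
struct sim_cookie {
	struct sim_bridge *host;
	int error;
};

/* Stand-in for the global "now in check" list. */
static struct sim_cookie *sim_sessions[SIM_MAX_SESSIONS];

/* Mark every checked-in session under this bridge as failed. */
static void sim_notify(struct sim_bridge *bridge)
{
	size_t i;

	for (i = 0; i < SIM_MAX_SESSIONS; i++)
		if (sim_sessions[i] && sim_sessions[i]->host == bridge)
			sim_sessions[i]->error = 1;
}

/* Check in: hand a pending bridge error to earlier sessions, then join. */
static int sim_clear(struct sim_cookie *c, struct sim_bridge *host)
{
	size_t i;

	c->host = host;
	c->error = 0;

	if (host->error) {
		sim_notify(host);	/* the error belongs to someone else */
		host->error = 0;	/* start our own I/O from a clean state */
	}

	for (i = 0; i < SIM_MAX_SESSIONS; i++) {
		if (!sim_sessions[i]) {
			sim_sessions[i] = c;
			return 0;
		}
	}
	return -1;	/* table full (a limit of this sketch only) */
}

/* Check out: report our flag or a still-pending bridge error, then leave. */
static int sim_read(struct sim_cookie *c)
{
	int ret = (c->error || c->host->error) ? 1 : 0;
	size_t i;

	for (i = 0; i < SIM_MAX_SESSIONS; i++)
		if (sim_sessions[i] == c)
			sim_sessions[i] = NULL;
	return ret;
}
```

With A and B checked in, setting the bridge error and then checking in C leaves A and B flagged and C clean, which is the behaviour the description asks for.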
Changes from previous one for 2.6.11.11: - (non) Signed-off-by: Hidetoshi Seto --- arch/ia64/lib/iomap_check.c | 45 ++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 45 insertions(+) Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -17,6 +17,9 @@ DEFINE_SPINLOCK(iochk_lock); /* all work static struct pci_dev *search_host_bridge(struct pci_dev *dev); static int have_error(struct pci_dev *dev); +void notify_bridge_error(struct pci_dev *bridge); +void clear_bridge_error(struct pci_dev *bridge); + void iochk_init(void) { /* setup */ @@ -33,6 +36,11 @@ void iochk_clear(iocookie *cookie, struc cookie->host = search_host_bridge(dev); spin_lock_irqsave(&iochk_lock, flag); + if (cookie->host && have_error(cookie->host)) { + /* someone under my bridge causes error... */ + notify_bridge_error(cookie->host); + clear_bridge_error(cookie->host); + } list_add(&cookie->list, &iochk_devices); spin_unlock_irqrestore(&iochk_lock, flag); @@ -95,5 +103,42 @@ static int have_error(struct pci_dev *de return 0; } +void notify_bridge_error(struct pci_dev *bridge) +{ + iocookie *cookie; + + if (list_empty(&iochk_devices)) + return; + + /* notify error to all transactions using this host bridge */ + if (bridge) { + /* local notify, ex. Parity, Abort etc. 
*/ + list_for_each_entry(cookie, &iochk_devices, list) { + if (cookie->host == bridge) + cookie->error = 1; + } + } +} + +void clear_bridge_error(struct pci_dev *bridge) +{ + u16 status = ( PCI_STATUS_REC_TARGET_ABORT + | PCI_STATUS_REC_MASTER_ABORT + | PCI_STATUS_DETECTED_PARITY ); + + /* clear bridge status */ + switch (bridge->hdr_type) { + case PCI_HEADER_TYPE_NORMAL: /* 0 */ + pci_write_config_word(bridge, PCI_STATUS, status); + break; + case PCI_HEADER_TYPE_BRIDGE: /* 1 */ + pci_write_config_word(bridge, PCI_SEC_STATUS, status); + break; + case PCI_HEADER_TYPE_CARDBUS: /* 2 */ + default: + BUG(); + } +} + EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:14:07 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 14:14:07 +0900 Subject: [PATCH 2.6.13-rc1 06/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <42CB689F.6040208@jp.fujitsu.com> [This is 6 of 10 patches, "iochk-06-mcanotify.patch"] - This is a headache: When ia64 hits a hardware problem, the OS can request SAL (System Abstraction Layer: ia64 firmware) to gather system status by calling the SAL_GET_STATE_INFO procedure. However (depending on the SAL implementation for the platform, hopefully), while gathering, SAL also checks every host bridge and its status, and after that, resets the state... So we should take care of this reset by SAL. Handling MCA (Machine Check Abort) is one of the situations we should take care of. MCA was originally designed as a critical interruption, so when an MCA comes, SAL gathers system status without the OS's order, before the OS gets control. So since the states of the bridges are already reset on entry to the MCA handler, the OS should notify the "loss of state" to all "checked-in" contexts, by marking their error flag, iocookie->error.
There would be better way if OS can know the bridge state from data which SAL gathered, but in the meanwhile, I just do simple way. PCI-parity error is one of MCA causes, is it OK? Next, "data poisoning" helps us... see next (7 of 10). Changes from previous one for 2.6.11.11: - (non) Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca.c | 13 +++++++++++++ arch/ia64/lib/iomap_check.c | 7 ++++++- 2 files changed, 19 insertions(+), 1 deletion(-) Index: linux-2.6.13-rc1/arch/ia64/kernel/mca.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/kernel/mca.c +++ linux-2.6.13-rc1/arch/ia64/kernel/mca.c @@ -77,6 +77,11 @@ #include #include +#ifdef CONFIG_IOMAP_CHECK +#include +extern void notify_bridge_error(struct pci_dev *bridge); +#endif + #if defined(IA64_MCA_DEBUG_INFO) # define IA64_MCA_DEBUG(fmt...) printk(fmt) #else @@ -893,6 +898,14 @@ ia64_mca_ucmc_handler(void) sal_log_record_header_t *rh = IA64_LOG_CURR_BUFFER(SAL_INFO_TYPE_MCA); rh->severity = sal_log_severity_corrected; ia64_sal_clear_state_info(SAL_INFO_TYPE_MCA); + +#ifdef CONFIG_IOMAP_CHECK + /* + * SAL already reads and clears error bits on bridge registers, + * so we should have all running transactions to retry. + */ + notify_bridge_error(0); +#endif } /* * Wakeup all the processors which are spinning in the rendezvous Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -111,7 +111,12 @@ void notify_bridge_error(struct pci_dev return; /* notify error to all transactions using this host bridge */ - if (bridge) { + if (!bridge) { + /* global notify, ex. MCA */ + list_for_each_entry(cookie, &iochk_devices, list) { + cookie->error = 1; + } + } else { /* local notify, ex. Parity, Abort etc. 
*/ list_for_each_entry(cookie, &iochk_devices, list) { if (cookie->host == bridge) From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:17:21 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 14:17:21 +0900 Subject: [PATCH 2.6.13-rc1 07/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <42CB6961.2060508@jp.fujitsu.com> [This is 7 of 10 patches, "iochk-07-poison.patch"] - When bus-error occur on write, write data is broken on the bus, so target device gets broken data. There are 2 way for such device to take: - send PERR(Parity Error) to host, expecting immediate panic. - mark status register as error, expecting its driver to read it and decide to retry. So it is not difficult for drivers to recover from error on write if it can take latter way, and if it don't worry about taking time to wait completion of write. - When bus-error occur on read, read data is broken on the bus, so host bridge gets broken data. There are 2 way for such bridge to take: - send BERR(Bus Error) to host, expecting immediate panic. - mark data as "poisoned" and throw it to destination, expecting panic if system touched it but cannot stop data pollution. Former is traditional way, latter is modern way, called "data poisoning". The important difference is whether OS can get a chance to recover from the error. Usually, sending BERR doesn't tell us "where it comes", "who it orders", so we cannot do anything except panic. In the other hand, poisoned data will reach its destination and will cause a error on there again. Yes, destination is "where who lives". Well, the idea is quite simple: "driver checks read data, and recover if it was poisoned." Checking all read at once (ex. take a memo of all read addresses touched after iochk_clear and check them all in iochk_read) does not make sense. Practical way is check each read, keep its result, and read it at end. 
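The "check each read, keep its result, and read it at end" idea can be sketched in plain userspace C. This is only an illustration and not the ia64 implementation: `mock_cookie`, `mock_clear`, `mock_read`, `mock_check` and the `MOCK_POISON` sentinel are hypothetical stand-ins, and real poisoned data raises an MCA when consumed rather than being recognisable by value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel standing in for "poisoned" data. */
#define MOCK_POISON 0xFFFFFFFFu

struct mock_cookie {
	int error;	/* sticky flag, set by any failed read */
};

/* Start of a checked section: forget earlier errors. */
static void mock_clear(struct mock_cookie *c)
{
	assert(c != NULL);
	c->error = 0;
}

/* One checked read: return the value, record the result in the cookie. */
static uint32_t mock_read(struct mock_cookie *c, const uint32_t *reg)
{
	uint32_t val = *reg;

	if (val == MOCK_POISON)
		c->error = 1;
	return val;
}

/* End of the section: report whether any read in between failed. */
static int mock_check(const struct mock_cookie *c)
{
	return c->error;
}
```

The flag is sticky on purpose: a later good read must not hide an earlier bad one, so the caller only has to look once, at the end of the section.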
Touching poisoned data become a MCA, so now it directly means a system down. But since the MCA tells us "where it happens", we can recover it...? All right, let's see next (8 of 10). Changes from previous one for 2.6.11.11: - move barrier function macro into gcc_inirin.h. - could anyone write same barrier for intel compiler? Tony or David, could you help me? Signed-off-by: Hidetoshi Seto --- include/asm-ia64/gcc_intrin.h | 16 +++++++ include/asm-ia64/io.h | 96 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 112 insertions(+) Index: linux-2.6.13-rc1/include/asm-ia64/io.h =================================================================== --- linux-2.6.13-rc1.orig/include/asm-ia64/io.h +++ linux-2.6.13-rc1/include/asm-ia64/io.h @@ -189,6 +189,8 @@ __ia64_mk_io_addr (unsigned long port) * during optimization, which is why we use "volatile" pointers. */ +#ifdef CONFIG_IOMAP_CHECK + static inline unsigned int ___ia64_inb (unsigned long port) { @@ -197,6 +199,8 @@ ___ia64_inb (unsigned long port) ret = *addr; __ia64_mf_a(); + ia64_mca_barrier(ret); + return ret; } @@ -208,6 +212,8 @@ ___ia64_inw (unsigned long port) ret = *addr; __ia64_mf_a(); + ia64_mca_barrier(ret); + return ret; } @@ -219,9 +225,48 @@ ___ia64_inl (unsigned long port) ret = *addr; __ia64_mf_a(); + ia64_mca_barrier(ret); + + return ret; +} + +#else /* CONFIG_IOMAP_CHECK */ + +static inline unsigned int +___ia64_inb (unsigned long port) +{ + volatile unsigned char *addr = __ia64_mk_io_addr(port); + unsigned char ret; + + ret = *addr; + __ia64_mf_a(); + return ret; +} + +static inline unsigned int +___ia64_inw (unsigned long port) +{ + volatile unsigned short *addr = __ia64_mk_io_addr(port); + unsigned short ret; + + ret = *addr; + __ia64_mf_a(); return ret; } +static inline unsigned int +___ia64_inl (unsigned long port) +{ + volatile unsigned int *addr = __ia64_mk_io_addr(port); + unsigned int ret; + + ret = *addr; + __ia64_mf_a(); + return ret; +} + +#endif /* CONFIG_IOMAP_CHECK */ + static 
inline void ___ia64_outb (unsigned char val, unsigned long port) { @@ -338,6 +383,55 @@ __outsl (unsigned long port, const void * a good idea). Writes are ok though for all existing ia64 platforms (and * hopefully it'll stay that way). */ + +#ifdef CONFIG_IOMAP_CHECK + +static inline unsigned char +___ia64_readb (const volatile void __iomem *addr) +{ + unsigned char val; + + val = *(volatile unsigned char __force *)addr; + ia64_mca_barrier(val); + + return val; +} + +static inline unsigned short +___ia64_readw (const volatile void __iomem *addr) +{ + unsigned short val; + + val = *(volatile unsigned short __force *)addr; + ia64_mca_barrier(val); + + return val; +} + +static inline unsigned int +___ia64_readl (const volatile void __iomem *addr) +{ + unsigned int val; + + val = *(volatile unsigned int __force *) addr; + ia64_mca_barrier(val); + + return val; +} + +static inline unsigned long +___ia64_readq (const volatile void __iomem *addr) +{ + unsigned long val; + + val = *(volatile unsigned long __force *) addr; + ia64_mca_barrier(val); + + return val; +} + +#else /* CONFIG_IOMAP_CHECK */ + static inline unsigned char ___ia64_readb (const volatile void __iomem *addr) { @@ -362,6 +456,8 @@ ___ia64_readq (const volatile void __iom return *(volatile unsigned long __force *) addr; } +#endif /* CONFIG_IOMAP_CHECK */ + static inline void __writeb (unsigned char val, volatile void __iomem *addr) { Index: linux-2.6.13-rc1/include/asm-ia64/gcc_intrin.h =================================================================== --- linux-2.6.13-rc1.orig/include/asm-ia64/gcc_intrin.h +++ linux-2.6.13-rc1/include/asm-ia64/gcc_intrin.h @@ -598,4 +598,20 @@ do { \ :: "r"((x)) : "p6", "p7", "memory"); \ } while (0) +/* + * Some I/O bridges may poison the data read, instead of + * signaling a BERR. The consummation of poisoned data + * triggers a MCA, which tells us the polluted address. 
+ * Note that the read operation by itself does not consume
+ * the bad data, you have to do something with it, e.g.:
+ *
+ *	ld.8	r9=[r10];;	// r10 == I/O address
+ *	add.8	r8=r9,0;;	// fake operation
+ */
+#define ia64_mca_barrier(val)					\
+({								\
+	register unsigned long gr8 asm("r8");			\
+	asm volatile ("add %0=%1,r0" : "=r"(gr8) : "r"(val));	\
+})
+
 #endif /* _ASM_IA64_GCC_INTRIN_H */

From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:18:53 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Wed, 06 Jul 2005 14:18:53 +0900
Subject: [PATCH 2.6.13-rc1 08/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com>
References: <42CB63B2.6000505@jp.fujitsu.com>
Message-ID: <42CB69BD.1090607@jp.fujitsu.com>

[This is 8 of 10 patches, "iochk-08-mcadrv.patch"]

- Touching poisoned data raises an MCA, which is currently treated as a
  fatal error and brings the system down. But since the MCA tells us a
  physical address ("where it happened"), we can take some action to
  survive.

  If the address lies within a resource of a "check-in" device, it is
  guaranteed that its driver will call iochk_read in the very near
  future, and that the driver has both the ability and the
  responsibility to recover from the error.

  So if it was a "check-in" address, all the OS has to do is mark the
  "check-in" devices and resume normal work. The driver will soon
  notice the error and handle it properly.

  Note: We can identify the affected device, but because of the SAL
  behavior (mentioned in 6 of 10), we need to mark all "check-in"
  devices. To be fixed in the future, if possible.
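For illustration, the decision described above might look like the following userspace sketch. It is a simplification under stated assumptions: `mock_resource`, `mock_dev`, `addr_in_check` and `try_recover` are hypothetical names, and the sketch marks the devices in the same function, whereas in this series the marking is done through notify_bridge_error().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one PCI resource window (inclusive bounds). */
struct mock_resource {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for a "check-in" device and its error flag. */
struct mock_dev {
	const struct mock_resource *res;
	size_t nres;
	int error;
};

/* Return 1 if addr falls inside any resource of any checked-in device. */
static int addr_in_check(const struct mock_dev *devs, size_t ndevs,
			 uint64_t addr)
{
	size_t d, r;

	for (d = 0; d < ndevs; d++)
		for (r = 0; r < devs[d].nres; r++)
			if (devs[d].res[r].start <= addr &&
			    addr <= devs[d].res[r].end)
				return 1;
	return 0;
}

/*
 * Decide whether the MCA can be left to the drivers.  Because SAL has
 * already reset the bridge state, every checked-in device is marked,
 * not only the one that owns the address.
 */
static int try_recover(struct mock_dev *devs, size_t ndevs, uint64_t addr)
{
	size_t d;

	assert(devs != NULL || ndevs == 0);
	if (addr == 0 || !addr_in_check(devs, ndevs, addr))
		return 0;	/* unknown address: not recoverable here */
	for (d = 0; d < ndevs; d++)
		devs[d].error = 1;
	return 1;
}
```

An address of 0 is treated as "no valid target identifier", mirroring the return convention of get_target_identifier() in the patch.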
Changes from previous one for 2.6.11.11: - (non) Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca_drv.c | 84 ++++++++++++++++++++++++++++++++++++++++++++ arch/ia64/lib/iomap_check.c | 1 2 files changed, 85 insertions(+) Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -147,3 +147,4 @@ void clear_bridge_error(struct pci_dev * EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); +EXPORT_SYMBOL(iochk_devices); /* for MCA driver */ Index: linux-2.6.13-rc1/arch/ia64/kernel/mca_drv.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/kernel/mca_drv.c +++ linux-2.6.13-rc1/arch/ia64/kernel/mca_drv.c @@ -35,6 +35,12 @@ #include "mca_drv.h" +#ifdef CONFIG_IOMAP_CHECK +#include +#include +extern struct list_head iochk_devices; +#endif + /* max size of SAL error record (default) */ static int sal_rec_max = 10000; @@ -377,6 +383,79 @@ is_mca_global(peidx_table_t *peidx, pal_ return MCA_IS_GLOBAL; } +#ifdef CONFIG_IOMAP_CHECK + +/** + * get_target_identifier - get address of target_identifier + * @peidx: pointer of index of processor error section + * + * Return value: + * addr if valid / 0 if not valid + */ +static u64 get_target_identifier(peidx_table_t *peidx) +{ + sal_log_mod_error_info_t *smei; + + smei = peidx_bus_check(peidx, 0); + if (smei->valid.target_identifier) + return (smei->target_identifier); + return 0; +} + +/** + * offending_addr_in_check - Check if the addr is in checking resource. 
+ * @addr: address offending this MCA + * + * Return value: + * 1 if in / 0 if out + */ +static int offending_addr_in_check(u64 addr) +{ + int i; + struct pci_dev *tdev; + iocookie *cookie; + + if (list_empty(&iochk_devices)) + return 0; + + list_for_each_entry(cookie, &iochk_devices, list) { + tdev = cookie->dev; + for (i = 0; i < PCI_ROM_RESOURCE; i++) { + if (tdev->resource[i].start <= addr + && addr <= tdev->resource[i].end) + return 1; + if ((tdev->resource[i].flags + & (PCI_BASE_ADDRESS_SPACE|PCI_BASE_ADDRESS_MEM_TYPE_MASK)) + == (PCI_BASE_ADDRESS_SPACE_MEMORY|PCI_BASE_ADDRESS_MEM_TYPE_64)) + i++; + } + } + return 0; +} + +/** + * pci_error_recovery - Check if MCA occur on transaction in iochk. + * @peidx: pointer of index of processor error section + * + * Return value: + * 1 if error could be cought in driver / 0 if not + */ +static int pci_error_recovery(peidx_table_t *peidx) +{ + u64 addr; + + addr = get_target_identifier(peidx); + if (!addr) + return 0; + + if (offending_addr_in_check(addr)) + return 1; + + return 0; +} + +#endif /* CONFIG_IOMAP_CHECK */ + /** * recover_from_read_error - Try to recover the errors which type are "read"s. * @slidx: pointer of index of SAL error record @@ -399,6 +478,11 @@ recover_from_read_error(slidx_table_t *s if (!pbci->tv) return 0; +#ifdef CONFIG_IOMAP_CHECK + if (pci_error_recovery(peidx)) + return 1; +#endif + /* * cpu read or memory-mapped io read * From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:20:15 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Wed, 06 Jul 2005 14:20:15 +0900 Subject: [PATCH 2.6.13-rc1 09/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <42CB6A0F.4080304@jp.fujitsu.com> [This is 9 of 10 patches, "iochk-09-cpeh.patch"] - SAL behavior doesn't affect only MCA. There are other chances to call SAL_GET_STATE_INFO, that's when CMC, CPE, and INIT is happen. 

  - CMC (Corrected Machine Check) is for non-fatal, processor-local
    errors. Fortunately, calling SAL_GET_STATE_INFO for CMC only
    collects data from the processor that issued it, without touching
    any bridge or its status. So this is safe.

  - CPE (Corrected Platform Error) is for non-fatal, platform-related
    errors. Even though it says "corrected", calling the SAL procedure
    for CPE touches every bridge on the platform and "corrects" the
    bridge status, which is bad for iochk.

  - INIT is a kind of system reset request, as far as I know. So
    restarting from INIT is out of scope, and iochk after INIT is not
    required at this time.

  In short, only MCA and CPE have the problem with SAL behavior. One
  difference from MCA is that for CPE, SAL does not gather data before
  the OS actually requests it.

  MCA:
    1) SAL gathers data and keeps it internally
    2) OS gets control
    3) if OS requests it, SAL returns the data gathered at the beginning

  CPE:
    1) OS gets control
    2) OS sends a request to SAL
    3) SAL gathers data and returns it to OS

  Therefore, we can make the CPE handler take care of the bridge
  states by checking them before calling the SAL procedure.
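The ordering for CPE can be sketched as follows. This is a userspace model only: `mock_bridge`, `mock_cookie` and the `mock_*` functions are hypothetical, and `mock_sal_get_state_info` merely models the side effect described above (bridge status being reset while SAL gathers data).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a host bridge status and a checked-in cookie. */
struct mock_bridge { int status_error; };
struct mock_cookie { struct mock_bridge *host; int error; };

/* Models the SAL side effect: every bridge status is reset. */
static void mock_sal_get_state_info(struct mock_bridge *bridges, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		bridges[i].status_error = 0;
}

/* Copy pending bridge errors into the cookies before they can be lost. */
static void mock_save_bridge_error(struct mock_cookie *cookies, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!cookies[i].error && cookies[i].host->status_error)
			cookies[i].error = 1;
}

/* CPE handler order: save first, then let SAL gather (and clear). */
static void mock_cpe_handler(struct mock_bridge *bridges, size_t nb,
			     struct mock_cookie *cookies, size_t nc)
{
	assert(bridges != NULL && cookies != NULL);
	mock_save_bridge_error(cookies, nc);
	mock_sal_get_state_info(bridges, nb);
}
```

Swapping the two calls in `mock_cpe_handler` would lose the error: the bridge status would already be cleared by the time the cookies are examined.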
Changes from previous one for 2.6.11.11: - (non) Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca.c | 21 +++++++++++++++++++++ arch/ia64/lib/iomap_check.c | 17 +++++++++++++++++ 2 files changed, 38 insertions(+) Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -19,6 +19,7 @@ static int have_error(struct pci_dev *de void notify_bridge_error(struct pci_dev *bridge); void clear_bridge_error(struct pci_dev *bridge); +void save_bridge_error(void); void iochk_init(void) { @@ -145,6 +146,22 @@ void clear_bridge_error(struct pci_dev * } } +void save_bridge_error(void) +{ + iocookie *cookie; + + if (list_empty(&iochk_devices)) + return; + + /* mark devices if its root bus bridge have errors */ + list_for_each_entry(cookie, &iochk_devices, list) { + if (cookie->error) + continue; + if (have_error(cookie->host)) + notify_bridge_error(cookie->host); + } +} + EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); EXPORT_SYMBOL(iochk_devices); /* for MCA driver */ Index: linux-2.6.13-rc1/arch/ia64/kernel/mca.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/kernel/mca.c +++ linux-2.6.13-rc1/arch/ia64/kernel/mca.c @@ -80,6 +80,8 @@ #ifdef CONFIG_IOMAP_CHECK #include extern void notify_bridge_error(struct pci_dev *bridge); +extern void save_bridge_error(void); +extern spinlock_t iochk_lock; #endif #if defined(IA64_MCA_DEBUG_INFO) @@ -288,11 +290,30 @@ ia64_mca_cpe_int_handler (int cpe_irq, v IA64_MCA_DEBUG("%s: received interrupt vector = %#x on CPU %d\n", __FUNCTION__, cpe_irq, smp_processor_id()); +#ifndef CONFIG_IOMAP_CHECK + /* SAL spec states this should run w/ interrupts enabled */ local_irq_enable(); /* Get the CPE error record and log it */ ia64_mca_log_sal_error_record(SAL_INFO_TYPE_CPE); +#else + /* + * Because SAL_GET_STATE_INFO for 
CPE might clear bridge states
+	 * in process of gathering error information from the system,
+	 * we should check the states before clearing it.
+	 * While OS and SAL are handling bridge status, we have to protect
+	 * the states from changing by any other I/Os running simultaneously,
+	 * so this should be handled w/ lock and interrupts disabled.
+	 */
+	spin_lock(&iochk_lock);
+	save_bridge_error();
+	ia64_mca_log_sal_error_record(SAL_INFO_TYPE_CPE);
+	spin_unlock(&iochk_lock);
+
+	/* Rests can go w/ interrupt enabled as usual */
+	local_irq_enable();
+#endif
 
 	spin_lock(&cpe_history_lock);
 	if (!cpe_poll_enabled && cpe_vector >= 0) {

From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 15:21:15 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Wed, 06 Jul 2005 14:21:15 +0900
Subject: [PATCH 2.6.13-rc1 10/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com>
References: <42CB63B2.6000505@jp.fujitsu.com>
Message-ID: <42CB6A4B.9000906@jp.fujitsu.com>

[This is 10 of 10 patches, "iochk-10-rwlock.patch"]

- If a read access (e.g. readX/inX) causes an error while SAL is
  gathering system data on another processor, a bridge error status
  could be set and then cleared again in a blink.

  In the case of MCA, thanks to the rz_always flag, all MCAs are
  handled as global, so every processor except one is paused during
  handling. But a CPE, like any other interrupt, has to be handled
  alongside all other active processors.

  Therefore, to avoid losing the status this way, mutual exclusion
  between read accesses and SAL_GET_STATE_INFO is required. To achieve
  this, I changed the control lock from a spinlock to an rwlock.
  There may be a better way; if so, this part should be replaced.
Changes from previous one for 2.6.11.11: - (non) Signed-off-by: Hidetoshi Seto --- arch/ia64/kernel/mca.c | 6 +++--- arch/ia64/lib/iomap_check.c | 11 ++++++----- include/asm-ia64/io.h | 24 ++++++++++++++++++++++++ 3 files changed, 33 insertions(+), 8 deletions(-) Index: linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/lib/iomap_check.c +++ linux-2.6.13-rc1/arch/ia64/lib/iomap_check.c @@ -12,7 +12,7 @@ void iochk_clear(iocookie *cookie, struc int iochk_read(iocookie *cookie); struct list_head iochk_devices; -DEFINE_SPINLOCK(iochk_lock); /* all works are excluded on this lock */ +DEFINE_RWLOCK(iochk_lock); /* all works are excluded on this lock */ static struct pci_dev *search_host_bridge(struct pci_dev *dev); static int have_error(struct pci_dev *dev); @@ -36,14 +36,14 @@ void iochk_clear(iocookie *cookie, struc cookie->dev = dev; cookie->host = search_host_bridge(dev); - spin_lock_irqsave(&iochk_lock, flag); + write_lock_irqsave(&iochk_lock, flag); if (cookie->host && have_error(cookie->host)) { /* someone under my bridge causes error... 
*/ notify_bridge_error(cookie->host); clear_bridge_error(cookie->host); } list_add(&cookie->list, &iochk_devices); - spin_unlock_irqrestore(&iochk_lock, flag); + write_unlock_irqrestore(&iochk_lock, flag); cookie->error = 0; } @@ -53,12 +53,12 @@ int iochk_read(iocookie *cookie) unsigned long flag; int ret = 0; - spin_lock_irqsave(&iochk_lock, flag); + write_lock_irqsave(&iochk_lock, flag); if ( cookie->error || have_error(cookie->dev) || (cookie->host && have_error(cookie->host)) ) ret = 1; list_del(&cookie->list); - spin_unlock_irqrestore(&iochk_lock, flag); + write_unlock_irqrestore(&iochk_lock, flag); return ret; } @@ -162,6 +162,7 @@ void save_bridge_error(void) } } +EXPORT_SYMBOL(iochk_lock); EXPORT_SYMBOL(iochk_read); EXPORT_SYMBOL(iochk_clear); EXPORT_SYMBOL(iochk_devices); /* for MCA driver */ Index: linux-2.6.13-rc1/include/asm-ia64/io.h =================================================================== --- linux-2.6.13-rc1.orig/include/asm-ia64/io.h +++ linux-2.6.13-rc1/include/asm-ia64/io.h @@ -73,6 +73,7 @@ extern unsigned int num_io_spaces; #ifdef CONFIG_IOMAP_CHECK #include +#include /* ia64 iocookie */ typedef struct { @@ -82,6 +83,8 @@ typedef struct { unsigned long error; /* error flag */ } iocookie; +extern rwlock_t iochk_lock; /* see arch/ia64/lib/iomap_check.c */ + /* Enable ia64 iochk - See arch/ia64/lib/iomap_check.c */ #define HAVE_ARCH_IOMAP_CHECK @@ -196,10 +199,13 @@ ___ia64_inb (unsigned long port) { volatile unsigned char *addr = __ia64_mk_io_addr(port); unsigned char ret; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); ret = *addr; __ia64_mf_a(); ia64_mca_barrier(ret); + read_unlock_irqrestore(&iochk_lock,flags); return ret; } @@ -209,10 +215,13 @@ ___ia64_inw (unsigned long port) { volatile unsigned short *addr = __ia64_mk_io_addr(port); unsigned short ret; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); ret = *addr; __ia64_mf_a(); ia64_mca_barrier(ret); + read_unlock_irqrestore(&iochk_lock,flags); 
return ret; } @@ -222,10 +231,13 @@ ___ia64_inl (unsigned long port) { volatile unsigned int *addr = __ia64_mk_io_addr(port); unsigned int ret; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); ret = *addr; __ia64_mf_a(); ia64_mca_barrier(ret); + read_unlock_irqrestore(&iochk_lock,flags); return ret; } @@ -390,9 +402,12 @@ static inline unsigned char ___ia64_readb (const volatile void __iomem *addr) { unsigned char val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned char __force *)addr; ia64_mca_barrier(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } @@ -401,9 +416,12 @@ static inline unsigned short ___ia64_readw (const volatile void __iomem *addr) { unsigned short val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned short __force *)addr; ia64_mca_barrier(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } @@ -412,9 +430,12 @@ static inline unsigned int ___ia64_readl (const volatile void __iomem *addr) { unsigned int val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned int __force *) addr; ia64_mca_barrier(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } @@ -423,9 +444,12 @@ static inline unsigned long ___ia64_readq (const volatile void __iomem *addr) { unsigned long val; + unsigned long flags; + read_lock_irqsave(&iochk_lock,flags); val = *(volatile unsigned long __force *) addr; ia64_mca_barrier(val); + read_unlock_irqrestore(&iochk_lock,flags); return val; } Index: linux-2.6.13-rc1/arch/ia64/kernel/mca.c =================================================================== --- linux-2.6.13-rc1.orig/arch/ia64/kernel/mca.c +++ linux-2.6.13-rc1/arch/ia64/kernel/mca.c @@ -81,7 +81,7 @@ #include extern void notify_bridge_error(struct pci_dev *bridge); extern void save_bridge_error(void); -extern spinlock_t iochk_lock; +extern rwlock_t iochk_lock; #endif #if defined(IA64_MCA_DEBUG_INFO) @@ 
-306,10 +306,10 @@ ia64_mca_cpe_int_handler (int cpe_irq, v * the states from changing by any other I/Os running simultaneously, * so this should be handled w/ lock and interrupts disabled. */ - spin_lock(&iochk_lock); + write_lock(&iochk_lock); save_bridge_error(); ia64_mca_log_sal_error_record(SAL_INFO_TYPE_CPE); - spin_unlock(&iochk_lock); + write_unlock(&iochk_lock); /* Rests can go w/ interrupt enabled as usual */ local_irq_enable(); From yoshfuji at linux-ipv6.org Wed Jul 6 16:26:27 2005 From: yoshfuji at linux-ipv6.org (YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=) Date: Wed, 06 Jul 2005 15:26:27 +0900 (JST) Subject: [PATCH 2.6.13-rc1 01/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <20050706.152627.68274440.yoshfuji@linux-ipv6.org> In article <42CB63B2.6000505 at jp.fujitsu.com> (at Wed, 06 Jul 2005 13:53:06 +0900), Hidetoshi Seto says: > Index: linux-2.6.13-rc1/lib/iomap.c > =================================================================== > --- linux-2.6.13-rc1.orig/lib/iomap.c > +++ linux-2.6.13-rc1/lib/iomap.c > @@ -230,3 +230,9 @@ void pci_iounmap(struct pci_dev *dev, vo > } > EXPORT_SYMBOL(pci_iomap); > EXPORT_SYMBOL(pci_iounmap); > + > +#ifndef HAVE_ARCH_IOMAP_CHECK > +/* Since generic funcs are inlined and defined in header, just export */ > +EXPORT_SYMBOL(iochk_clear); > +EXPORT_SYMBOL(iochk_read); > +#endif > Index: linux-2.6.13-rc1/include/asm-generic/iomap.h > =================================================================== > --- linux-2.6.13-rc1.orig/include/asm-generic/iomap.h > +++ linux-2.6.13-rc1/include/asm-generic/iomap.h : > + */ > +#ifdef HAVE_ARCH_IOMAP_CHECK > +extern void iochk_init(void); > +extern void iochk_clear(iocookie *cookie, struct pci_dev *dev); > +extern int iochk_read(iocookie *cookie); > +#else > +static inline void iochk_init(void) {} > +static inline void iochk_clear(iocookie *cookie, 
 struct pci_dev *dev) {}
> +static inline int iochk_read(iocookie *cookie) { return 0; }
> +#endif
> +
>  #endif

It looks strange to me.
You cannot export "static inline" functions.
You can export iochk_{init,clear,read} only
if HAVE_ARCH_IOMAP_CHECK is defined.

--yoshfuji

From seto.hidetoshi at jp.fujitsu.com Wed Jul 6 20:15:02 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Wed, 06 Jul 2005 19:15:02 +0900
Subject: [PATCH 2.6.13-rc1 01/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <20050706.152627.68274440.yoshfuji@linux-ipv6.org>
References: <42CB63B2.6000505@jp.fujitsu.com> <20050706.152627.68274440.yoshfuji@linux-ipv6.org>
Message-ID: <42CBAF26.3070002@jp.fujitsu.com>

YOSHIFUJI Hideaki wrote:
>>Index: linux-2.6.13-rc1/lib/iomap.c
>>===================================================================
>>--- linux-2.6.13-rc1.orig/lib/iomap.c
>>+++ linux-2.6.13-rc1/lib/iomap.c
>>@@ -230,3 +230,9 @@ void pci_iounmap(struct pci_dev *dev, vo
>> }
>> EXPORT_SYMBOL(pci_iomap);
>> EXPORT_SYMBOL(pci_iounmap);
>>+
>>+#ifndef HAVE_ARCH_IOMAP_CHECK
>>+/* Since generic funcs are inlined and defined in header, just export */
>>+EXPORT_SYMBOL(iochk_clear);
>>+EXPORT_SYMBOL(iochk_read);
>>+#endif
:
> It looks strange to me.
> You cannot export "static inline" functions.
> You can export iochk_{init,clear,read} only
> if HAVE_ARCH_IOMAP_CHECK is defined.

Oh yes, it felt strange to me too.

I'm not sure whether it caused a compile error (it seemed to build
fine), but if you are right (I suppose so) I should remove them.

Fortunately, dropping these exports doesn't affect any of the
following patches in this series.
If this is indeed wrong, please replace only patch 01 with the new
one below and reuse the remaining patches 02 to 10 as they are.
Thanks, H.Seto --- drivers/pci/pci.c | 2 ++ include/asm-generic/iomap.h | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 34 insertions(+) Index: linux-2.6.13-rc1/include/asm-generic/iomap.h =================================================================== --- linux-2.6.13-rc1.orig/include/asm-generic/iomap.h +++ linux-2.6.13-rc1/include/asm-generic/iomap.h @@ -65,4 +65,36 @@ struct pci_dev; extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max); extern void pci_iounmap(struct pci_dev *dev, void __iomem *); +/* + * IOMAP_CHECK provides additional interfaces for drivers to detect + * some IO errors, supports drivers having ability to recover errors. + * + * All works around iomap-check depends on the design of "iocookie" + * structure. Every architecture owning its iomap-check is free to + * define the actual design of iocookie to fit its special style. + */ +#ifndef HAVE_ARCH_IOMAP_CHECK +/* Dummy definition of default iocookie */ +typedef int iocookie; +#endif + +/* + * Clear/Read iocookie to check IO error while using iomap. + * + * Note that default iochk_clear-read pair interfaces don't have + * any effective error check, but some high-reliable platforms + * would provide useful information to you. + * And note that some action may be limited (ex. irq-unsafe) + * between the pair depend on the facility of the platform. 
+ */ +#ifdef HAVE_ARCH_IOMAP_CHECK +extern void iochk_init(void); +extern void iochk_clear(iocookie *cookie, struct pci_dev *dev); +extern int iochk_read(iocookie *cookie); +#else +static inline void iochk_init(void) {} +static inline void iochk_clear(iocookie *cookie, struct pci_dev *dev) {} +static inline int iochk_read(iocookie *cookie) { return 0; } +#endif + #endif Index: linux-2.6.13-rc1/drivers/pci/pci.c =================================================================== --- linux-2.6.13-rc1.orig/drivers/pci/pci.c +++ linux-2.6.13-rc1/drivers/pci/pci.c @@ -767,6 +767,8 @@ static int __devinit pci_init(void) while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { pci_fixup_device(pci_fixup_final, dev); } + + iochk_init(); return 0; } From anton at samba.org Thu Jul 7 01:55:22 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 01:55:22 +1000 Subject: [PATCH 8/14] hvc_console: Statically initialize the vtermnos array In-Reply-To: <20050706155457.GX12786@krispykreme> References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> <20050706155457.GX12786@krispykreme> Message-ID: <20050706155522.GY12786@krispykreme> From: Milton Miller Statically initialize the vtermnos array. Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-static-init drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-static-init 2005-02-08 00:43:04.837874744 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 00:43:04.847873158 -0600 @@ -144,8 +144,8 @@ struct hvc_struct *hvc_get_by_index(int * console interfaces but can still be used as a tty device. This has to be * static because kmalloc will not work during early console init. 
*/ -static uint32_t vtermnos[MAX_NR_HVC_CONSOLES]; - +static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = + {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -213,10 +213,6 @@ struct console hvc_con_driver = { /* Early console initialization. Preceeds driver initialization. */ static int __init hvc_console_init(void) { - int i; - - for (i=0; i References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> <20050706155457.GX12786@krispykreme> <20050706155522.GY12786@krispykreme> Message-ID: <20050706155636.GZ12786@krispykreme> From: Milton Miller Check if a vterm was registered before accepting it as a console. Check that a slot hasn't been probed with a tty in hvc_instantiate(). Check that a slot hasn't been free'ed when handing out console device. 
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-checks drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-checks 2005-02-08 14:11:57.635129665 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 14:12:39.456812517 -0600 @@ -192,12 +192,21 @@ void hvc_console_print(struct console *c static struct tty_driver *hvc_console_device(struct console *c, int *index) { + if (vtermnos[c->index] == -1) + return NULL; + *index = c->index; return hvc_driver; } static int __init hvc_console_setup(struct console *co, char *options) { + if (co->index < 0 || co->index >= MAX_NR_HVC_CONSOLES) + return -ENODEV; + + if (vtermnos[co->index] == -1) + return -ENODEV; + return 0; } @@ -227,12 +236,21 @@ console_initcall(hvc_console_init); */ int hvc_instantiate(uint32_t vtermno, int index) { + struct hvc_struct *hp; + if (index < 0 || index >= MAX_NR_HVC_CONSOLES) return -1; if (vtermnos[index] != -1) return -1; + /* make sure no no tty has been registerd in this index */ + hp = hvc_get_by_index(index); + if (hp) { + kobject_put(&hp->kobj); + return -1; + } + vtermnos[index] = vtermno; /* reserve all indices upto and including this index */ _ From anton at samba.org Thu Jul 7 01:51:55 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 01:51:55 +1000 Subject: [PATCH 2/14] hvc_console: Match vio and console devices using vterm numbers In-Reply-To: <20050706155112.GR12786@krispykreme> References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> Message-ID: <20050706155155.GS12786@krispykreme> From: Milton Miller Use the vterm numbers to match the vio devices being probed with the indices already allocated via the console initcall function hvc_find_vtys. The old code required hvc_find_vtys to "guess" the matching devices the vio subsystem would find and its probe order. 
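
A minimal userspace sketch of the matching described above, assuming a small fixed table: `MOCK_MAX_CONSOLES`, `MOCK_NO_VTERM`, `mock_vtermnos`, `mock_last`, `mock_instantiate` and `mock_probe_index` are hypothetical names modelled on the driver's vtermnos[], last_hvc, hvc_instantiate() and hvc_probe(), and the "slot already taken" check follows the later patches in this series rather than this one.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_MAX_CONSOLES 4	/* hypothetical; the driver uses MAX_NR_HVC_CONSOLES */
#define MOCK_NO_VTERM ((uint32_t)-1)

/* Slots claimed by early console discovery; all empty at start. */
static uint32_t mock_vtermnos[MOCK_MAX_CONSOLES] = {
	MOCK_NO_VTERM, MOCK_NO_VTERM, MOCK_NO_VTERM, MOCK_NO_VTERM
};
static int mock_last = -1;

/* Early console registration: claim a slot for a vterm number. */
static int mock_instantiate(uint32_t vtermno, int index)
{
	if (index < 0 || index >= MOCK_MAX_CONSOLES)
		return -1;
	if (mock_vtermnos[index] != MOCK_NO_VTERM)
		return -1;
	mock_vtermnos[index] = vtermno;
	if (mock_last < index)
		mock_last = index;	/* reserve indices up to this one */
	return 0;
}

/* Probe time: reuse the slot with the same vterm number, else count on. */
static int mock_probe_index(uint32_t vtermno)
{
	int i;

	assert(vtermno != MOCK_NO_VTERM);
	for (i = 0; i < MOCK_MAX_CONSOLES; i++)
		if (mock_vtermnos[i] == vtermno)
			return i;
	return ++mock_last;	/* no matching slot: just use a counter */
}
```

Because the lookup is by vterm number, the result no longer depends on the order in which the vio subsystem happens to probe the devices.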
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-matching drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-matching 2005-02-08 00:42:37.166135407 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 00:42:37.177133662 -0600 @@ -102,10 +102,11 @@ static struct list_head hvc_structs = LI static DEFINE_SPINLOCK(hvc_structs_lock); /* - * This value is used to associate a tty->index value to a hvc_struct based - * upon order of exposure via hvc_probe(). + * This value is used to assign a tty->index value to a hvc_struct based + * upon order of exposure via hvc_probe(), when we can not match it to + * a console canidate registered with hvc_instantiate(). */ -static int hvc_count = -1; +static int last_hvc = -1; /* * Do not call this function with either the hvc_strucst_lock or the hvc_struct @@ -224,9 +225,10 @@ static int __init hvc_console_init(void) console_initcall(hvc_console_init); /* - * hvc_instantiate() is an early console discovery method which locates consoles - * prior to the vio subsystem discovering them. Hotplugged vty adapters do NOT - * get an hvc_instantiate() callback since the appear after early console init. + * hvc_instantiate() is an early console discovery method which locates + * consoles * prior to the vio subsystem discovering them. Hotplugged + * vty adapters do NOT get an hvc_instantiate() callback since they + * appear after early console init. */ int hvc_instantiate(uint32_t vtermno, int index) { @@ -237,6 +239,11 @@ int hvc_instantiate(uint32_t vtermno, in return -1; vtermnos[index] = vtermno; + + /* reserve all indices upto and including this index */ + if (last_hvc < index) + last_hvc = index; + return 0; } @@ -697,6 +704,7 @@ static int __devinit hvc_probe( const struct vio_device_id *id) { struct hvc_struct *hp; + int i; /* probed with invalid parameters. 
*/ if (!dev || !id) @@ -717,7 +725,21 @@ static int __devinit hvc_probe( spin_lock_init(&hp->lock); spin_lock(&hvc_structs_lock); - hp->index = ++hvc_count; + + /* + * find index to use: + * see if this vterm id matches one registered for console. + */ + for (i=0; i < MAX_NR_HVC_CONSOLES; i++) + if (vtermnos[i] == hp->vtermno) + break; + + /* no matching slot, just use a counter */ + if (i >= MAX_NR_HVC_CONSOLES) + i = ++last_hvc; + + hp->index = i; + list_add_tail(&(hp->next), &hvc_structs); spin_unlock(&hvc_structs_lock); _ From anton at samba.org Thu Jul 7 01:51:12 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 01:51:12 +1000 Subject: [PATCH 1/14] hvc_console: Rearrange code In-Reply-To: <20050706155020.GQ12786@krispykreme> References: <20050706155020.GQ12786@krispykreme> Message-ID: <20050706155112.GR12786@krispykreme> From: Milton Miller Rearrange the code in drivers/char/hvc_console.c to make future patches smaller. No actual code changes, just ordering of the functions in the file. 
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-move-code drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-move-code 2005-02-07 22:26:54.047486440 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 00:08:42.243553235 -0600 @@ -61,16 +61,21 @@ */ #define HVC_ALLOC_TTY_ADAPTERS 8 -static struct tty_driver *hvc_driver; -#ifdef CONFIG_MAGIC_SYSRQ -static int sysrq_pressed; -#endif - #define N_OUTBUF 16 #define N_INBUF 16 #define __ALIGNED__ __attribute__((__aligned__(8))) +static struct tty_driver *hvc_driver; +static struct task_struct *hvc_task; + +/* Picks up late kicks after list walk but before schedule() */ +static int hvc_kicked; + +#ifdef CONFIG_MAGIC_SYSRQ +static int sysrq_pressed; +#endif + struct hvc_struct { spinlock_t lock; int index; @@ -97,6 +102,41 @@ static struct list_head hvc_structs = LI static DEFINE_SPINLOCK(hvc_structs_lock); /* + * This value is used to associate a tty->index value to a hvc_struct based + * upon order of exposure via hvc_probe(). + */ +static int hvc_count = -1; + +/* + * Do not call this function with either the hvc_strucst_lock or the hvc_struct + * lock held. If successful, this function increments the kobject reference + * count against the target hvc_struct so it should be released when finished. + */ +struct hvc_struct *hvc_get_by_index(int index) +{ + struct hvc_struct *hp; + unsigned long flags; + + spin_lock(&hvc_structs_lock); + + list_for_each_entry(hp, &hvc_structs, next) { + spin_lock_irqsave(&hp->lock, flags); + if (hp->index == index) { + kobject_get(&hp->kobj); + spin_unlock_irqrestore(&hp->lock, flags); + spin_unlock(&hvc_structs_lock); + return hp; + } + spin_unlock_irqrestore(&hp->lock, flags); + } + hp = NULL; + + spin_unlock(&hvc_structs_lock); + return hp; +} + + +/* * Initial console vtermnos for console API usage prior to full console * initialization. 
Any vty adapter outside this range will not have usable * console interfaces but can still be used as a tty device. This has to be @@ -107,16 +147,98 @@ static uint32_t vtermnos[MAX_NR_HVC_CONS /* Used for accounting purposes */ static int num_vterms = 0; -static struct task_struct *hvc_task; +/* + * Console APIs, NOT TTY. These APIs are available immediately when + * hvc_console_setup() finds adapters. + */ + +void hvc_console_print(struct console *co, const char *b, unsigned count) +{ + char c[16] __ALIGNED__; + unsigned i = 0, n = 0; + int r, donecr = 0; + + /* Console access attempt outside of acceptable console range. */ + if (co->index >= MAX_NR_HVC_CONSOLES) + return; + + /* This console adapter was removed so it is not useable. */ + if (vtermnos[co->index] < 0) + return; + + while (count > 0 || i > 0) { + if (count > 0 && i < sizeof(c)) { + if (b[n] == '\n' && !donecr) { + c[i++] = '\r'; + donecr = 1; + } else { + c[i++] = b[n++]; + donecr = 0; + --count; + } + } else { + r = hvc_put_chars(vtermnos[co->index], c, i); + if (r < 0) { + /* throw away chars on error */ + i = 0; + } else if (r > 0) { + i -= r; + if (i > 0) + memmove(c, c+r, i); + } + } + } +} + +static struct tty_driver *hvc_console_device(struct console *c, int *index) +{ + *index = c->index; + return hvc_driver; +} + +static int __init hvc_console_setup(struct console *co, char *options) +{ + return 0; +} + +struct console hvc_con_driver = { + .name = "hvc", + .write = hvc_console_print, + .device = hvc_console_device, + .setup = hvc_console_setup, + .flags = CON_PRINTBUFFER, + .index = -1, +}; + +/* Early console initialization. Preceeds driver initialization. */ +static int __init hvc_console_init(void) +{ + int i; + + for (i=0; iindex value to a hvc_struct based - * upon order of exposure via hvc_probe(). + * hvc_instantiate() is an early console discovery method which locates consoles + * prior to the vio subsystem discovering them. 
Hotplugged vty adapters do NOT + * get an hvc_instantiate() callback since the appear after early console init. */ -static int hvc_count = -1; +int hvc_instantiate(uint32_t vtermno, int index) +{ + if (index < 0 || index >= MAX_NR_HVC_CONSOLES) + return -1; -/* Picks up late kicks after list walk but before schedule() */ -static int hvc_kicked; + if (vtermnos[index] != -1) + return -1; + + vtermnos[index] = vtermno; + return 0; +} /* Wake the sleeping khvcd */ static void hvc_kick(void) @@ -141,34 +263,6 @@ static void hvc_unthrottle(struct tty_st } /* - * Do not call this function with either the hvc_strucst_lock or the hvc_struct - * lock held. If successful, this function increments the kobject reference - * count against the target hvc_struct so it should be released when finished. - */ -struct hvc_struct *hvc_get_by_index(int index) -{ - struct hvc_struct *hp; - unsigned long flags; - - spin_lock(&hvc_structs_lock); - - list_for_each_entry(hp, &hvc_structs, next) { - spin_lock_irqsave(&hp->lock, flags); - if (hp->index == index) { - kobject_get(&hp->kobj); - spin_unlock_irqrestore(&hp->lock, flags); - spin_unlock(&hvc_structs_lock); - return hp; - } - spin_unlock_irqrestore(&hp->lock, flags); - } - hp = NULL; - - spin_unlock(&hvc_structs_lock); - return hp; -} - -/* * The TTY interface won't be used until after the vio layer has exposed the vty * adapter to the kernel. */ @@ -577,14 +671,6 @@ static struct tty_operations hvc_ops = { .chars_in_buffer = hvc_chars_in_buffer, }; -char hvc_driver_name[] = "hvc_console"; - -static struct vio_device_id hvc_driver_table[] __devinitdata= { - {"serial", "hvterm1"}, - { NULL, } -}; -MODULE_DEVICE_TABLE(vio, hvc_driver_table); - /* callback when the kboject ref count reaches zero. 
*/ static void destroy_hvc_struct(struct kobject *kobj) { @@ -674,6 +760,14 @@ static int __devexit hvc_remove(struct v return 0; } +char hvc_driver_name[] = "hvc_console"; + +static struct vio_device_id hvc_driver_table[] __devinitdata= { + {"serial", "hvterm1"}, + { NULL, } +}; +MODULE_DEVICE_TABLE(vio, hvc_driver_table); + static struct vio_driver hvc_vio_driver = { .name = hvc_driver_name, .id_table = hvc_driver_table, @@ -721,6 +815,7 @@ int __init hvc_init(void) return rc; } +module_init(hvc_init); /* This isn't particularily necessary due to this being a console driver but it * is nice to be thorough */ @@ -733,99 +828,4 @@ static void __exit hvc_exit(void) /* return tty_struct instances allocated in hvc_init(). */ put_tty_driver(hvc_driver); } - -/* - * Console APIs, NOT TTY. These APIs are available immediately when - * hvc_console_setup() finds adapters. - */ - -/* - * hvc_instantiate() is an early console discovery method which locates consoles - * prior to the vio subsystem discovering them. Hotplugged vty adapters do NOT - * get an hvc_instantiate() callback since the appear after early console init. - */ -int hvc_instantiate(uint32_t vtermno, int index) -{ - if (index < 0 || index >= MAX_NR_HVC_CONSOLES) - return -1; - - if (vtermnos[index] != -1) - return -1; - - vtermnos[index] = vtermno; - return 0; -} - -void hvc_console_print(struct console *co, const char *b, unsigned count) -{ - char c[16] __ALIGNED__; - unsigned i = 0, n = 0; - int r, donecr = 0; - - /* Console access attempt outside of acceptable console range. */ - if (co->index >= MAX_NR_HVC_CONSOLES) - return; - - /* This console adapter was removed so it is not useable. 
*/ - if (vtermnos[co->index] < 0) - return; - - while (count > 0 || i > 0) { - if (count > 0 && i < sizeof(c)) { - if (b[n] == '\n' && !donecr) { - c[i++] = '\r'; - donecr = 1; - } else { - c[i++] = b[n++]; - donecr = 0; - --count; - } - } else { - r = hvc_put_chars(vtermnos[co->index], c, i); - if (r < 0) { - /* throw away chars on error */ - i = 0; - } else if (r > 0) { - i -= r; - if (i > 0) - memmove(c, c+r, i); - } - } - } -} - -static struct tty_driver *hvc_console_device(struct console *c, int *index) -{ - *index = c->index; - return hvc_driver; -} - -static int __init hvc_console_setup(struct console *co, char *options) -{ - return 0; -} - -struct console hvc_con_driver = { - .name = "hvc", - .write = hvc_console_print, - .device = hvc_console_device, - .setup = hvc_console_setup, - .flags = CON_PRINTBUFFER, - .index = -1, -}; - -/* Early console initialization. Preceeds driver initialization. */ -static int __init hvc_console_init(void) -{ - int i; - - for (i=0; i References: <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> <20050706155457.GX12786@krispykreme> <20050706155522.GY12786@krispykreme> <20050706155636.GZ12786@krispykreme> <20050706155836.GA12786@krispykreme> Message-ID: <20050706160116.GB12786@krispykreme> From: Milton Miller Remove all the vio device driver code from hvc_console.c This will allow us to separate hvsi, hvc, and allow hvc_console to be used without the ppc64 vio layer. 
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/Makefile~hvc-console-split-vio drivers/char/Makefile --- gr_work_small/drivers/char/Makefile~hvc-console-split-vio 2005-05-31 21:30:10.000000000 -0500 +++ gr_work_small-miltonm/drivers/char/Makefile 2005-05-31 21:30:11.000000000 -0500 @@ -40,7 +40,7 @@ obj-$(CONFIG_N_HDLC) += n_hdlc.o obj-$(CONFIG_AMIGA_BUILTIN_SERIAL) += amiserial.o obj-$(CONFIG_SX) += sx.o generic_serial.o obj-$(CONFIG_RIO) += rio/ generic_serial.o -obj-$(CONFIG_HVC_CONSOLE) += hvc_console.o hvsi.o +obj-$(CONFIG_HVC_CONSOLE) += hvc_console.o hvc_vio.o hvsi.o obj-$(CONFIG_RAW_DRIVER) += raw.o obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o obj-$(CONFIG_MMTIMER) += mmtimer.o diff -puN drivers/char/hvc_console.c~hvc-console-split-vio drivers/char/hvc_console.c --- gr_work_small/drivers/char/hvc_console.c~hvc-console-split-vio 2005-05-31 21:30:10.000000000 -0500 +++ gr_work_small-miltonm/drivers/char/hvc_console.c 2005-06-01 19:59:35.081931524 -0500 @@ -41,7 +41,6 @@ #include #include #include -#include #define HVC_MAJOR 229 #define HVC_MINOR 0 @@ -90,7 +89,6 @@ struct hvc_struct { int irq; struct list_head next; struct kobject kobj; /* ref count & hvc_struct lifetime */ - struct vio_dev *vdev; }; /* dynamic list of hvc_struct instances */ @@ -279,6 +277,7 @@ int hvc_instantiate(uint32_t vtermno, in return 0; } +EXPORT_SYMBOL(hvc_instantiate); /* Wake the sleeping khvcd */ static void hvc_kick(void) @@ -738,26 +737,19 @@ static struct kobj_type hvc_kobj_type = .release = destroy_hvc_struct, }; -static int __devinit hvc_probe( - struct vio_dev *dev, - const struct vio_device_id *id) +struct hvc_struct __devinit *hvc_alloc(uint32_t vtermno, int irq) { struct hvc_struct *hp; int i; - /* probed with invalid parameters. 
*/ - if (!dev || !id) - return -EPERM; - hp = kmalloc(sizeof(*hp), GFP_KERNEL); if (!hp) - return -ENOMEM; + return ERR_PTR(-ENOMEM); memset(hp, 0x00, sizeof(*hp)); - hp->vtermno = dev->unit_address; - hp->vdev = dev; - hp->vdev->dev.driver_data = hp; - hp->irq = dev->irq; + + hp->vtermno = vtermno; + hp->irq = irq; kobject_init(&hp->kobj); hp->kobj.ktype = &hvc_kobj_type; @@ -782,12 +774,12 @@ static int __devinit hvc_probe( list_add_tail(&(hp->next), &hvc_structs); spin_unlock(&hvc_structs_lock); - return 0; + return hp; } +EXPORT_SYMBOL(hvc_alloc); -static int __devexit hvc_remove(struct vio_dev *dev) +int __devexit hvc_remove(struct hvc_struct *hp) { - struct hvc_struct *hp = dev->dev.driver_data; unsigned long flags; struct kobject *kobjp; struct tty_struct *tty; @@ -820,28 +812,12 @@ static int __devexit hvc_remove(struct v tty_hangup(tty); return 0; } - -char hvc_driver_name[] = "hvc_console"; - -static struct vio_device_id hvc_driver_table[] __devinitdata= { - {"serial", "hvterm1"}, - { NULL, } -}; -MODULE_DEVICE_TABLE(vio, hvc_driver_table); - -static struct vio_driver hvc_vio_driver = { - .name = hvc_driver_name, - .id_table = hvc_driver_table, - .probe = hvc_probe, - .remove = hvc_remove, -}; +EXPORT_SYMBOL(hvc_remove); /* Driver initialization. Follow console initialization. This is where the TTY * interfaces start to become available. */ int __init hvc_init(void) { - int rc; - /* We need more than hvc_count adapters due to hotplug additions. */ hvc_driver = alloc_tty_driver(HVC_ALLOC_TTY_ADAPTERS); if (!hvc_driver) @@ -870,10 +846,7 @@ int __init hvc_init(void) return -EIO; } - /* Register as a vio device to receive callbacks */ - rc = vio_register_driver(&hvc_vio_driver); - - return rc; + return 0; } module_init(hvc_init); @@ -884,7 +857,6 @@ static void __exit hvc_exit(void) { kthread_stop(hvc_task); - vio_unregister_driver(&hvc_vio_driver); tty_unregister_driver(hvc_driver); /* return tty_struct instances allocated in hvc_init(). 
*/ put_tty_driver(hvc_driver); diff -puN /dev/null drivers/char/hvc_vio.c --- /dev/null 2005-03-30 21:59:48.274369186 -0600 +++ gr_work_small-miltonm/drivers/char/hvc_vio.c 2005-06-01 19:59:35.082931365 -0500 @@ -0,0 +1,125 @@ +/* + * vio driver interface to hvc_console.c + * + * This code was moved here to allow the remaing code to be reused as a + * generic polling mode with semi-reliable transport driver core to the + * console and tty subsystems. + * + * + * Copyright (C) 2001 Anton Blanchard , IBM + * Copyright (C) 2001 Paul Mackerras , IBM + * Copyright (C) 2004 Benjamin Herrenschmidt , IBM Corp. + * Copyright (C) 2004 IBM Corporation + * + * Additional Author(s): + * Ryan S. Arnold + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include +#include +#include + +char hvc_driver_name[] = "hvc_console"; + +static struct vio_device_id hvc_driver_table[] __devinitdata = { + {"serial", "hvterm1"}, + { NULL, } +}; +MODULE_DEVICE_TABLE(vio, hvc_driver_table); + +static int __devinit hvc_vio_probe(struct vio_dev *vdev, + const struct vio_device_id *id) +{ + struct hvc_struct *hp; + + /* probed with invalid parameters. 
*/ + if (!vdev || !id) + return -EPERM; + + hp = hvc_alloc(vdev->unit_address, vdev->irq); + if (IS_ERR(hp)) + return PTR_ERR(hp); + dev_set_drvdata(&vdev->dev, hp); + + return 0; +} + +static int __devexit hvc_vio_remove(struct vio_dev *vdev) +{ + struct hvc_struct *hp = dev_get_drvdata(&vdev->dev); + + return hvc_remove(hp); +} + +static struct vio_driver hvc_vio_driver = { + .name = hvc_driver_name, + .id_table = hvc_driver_table, + .probe = hvc_vio_probe, + .remove = hvc_vio_remove, + .driver = { + .owner = THIS_MODULE, + } +}; + +static int hvc_vio_init(void) +{ + int rc; + + /* Register as a vio device to receive callbacks */ + rc = vio_register_driver(&hvc_vio_driver); + + return rc; +} +module_init(hvc_vio_init); /* after drivers/char/hvc_console.c */ + +static void hvc_vio_exit(void) +{ + vio_unregister_driver(&hvc_vio_driver); +} +module_exit(hvc_vio_exit); + +/* the device tree order defines our numbering */ +static int hvc_find_vtys(void) +{ + struct device_node *vty; + int num_found = 0; + + for (vty = of_find_node_by_name(NULL, "vty"); vty != NULL; + vty = of_find_node_by_name(vty, "vty")) { + uint32_t *vtermno; + + /* We have statically defined space for only a certain number + * of console adapters. 
+ */ + if (num_found >= MAX_NR_HVC_CONSOLES) + break; + + vtermno = (uint32_t *)get_property(vty, "reg", NULL); + if (!vtermno) + continue; + + if (device_is_compatible(vty, "hvterm1")) { + hvc_instantiate(*vtermno, num_found); + ++num_found; + } + } + + return num_found; +} +console_initcall(hvc_find_vtys); diff -puN include/asm-ppc64/hvconsole.h~hvc-console-split-vio include/asm-ppc64/hvconsole.h --- gr_work_small/include/asm-ppc64/hvconsole.h~hvc-console-split-vio 2005-05-31 21:30:10.000000000 -0500 +++ gr_work_small-miltonm/include/asm-ppc64/hvconsole.h 2005-06-01 19:59:35.083931206 -0500 @@ -29,9 +29,16 @@ */ #define MAX_NR_HVC_CONSOLES 16 +/* implemented by a low level driver */ extern int hvc_get_chars(uint32_t vtermno, char *buf, int count); extern int hvc_put_chars(uint32_t vtermno, const char *buf, int count); -/* Register a vterm and a slot index for use as a console */ +struct hvc_struct; + +/* Register a vterm and a slot index for use as a console (console_init) */ extern int hvc_instantiate(uint32_t vtermno, int index); +/* register a vterm for hvc tty operation (module_init or hotplug add) */ +extern struct hvc_struct * __devinit hvc_alloc(uint32_t vtermno, int irq); +/* remove a vterm from hvc tty operation (modele_exit or hotplug remove) */ +extern int __devexit hvc_remove(struct hvc_struct *hp); #endif /* _PPC64_HVCONSOLE_H */ diff -puN arch/ppc64/kernel/hvconsole.c~hvc-console-split-vio arch/ppc64/kernel/hvconsole.c --- gr_work_small/arch/ppc64/kernel/hvconsole.c~hvc-console-split-vio 2005-05-31 21:49:47.000000000 -0500 +++ gr_work_small-miltonm/arch/ppc64/kernel/hvconsole.c 2005-06-01 19:59:34.879830465 -0500 @@ -27,7 +27,6 @@ #include #include #include -#include /** * hvc_get_chars - retrieve characters from firmware for denoted vterm adatper @@ -88,35 +87,3 @@ int hvc_put_chars(uint32_t vtermno, cons } EXPORT_SYMBOL(hvc_put_chars); - -/* - * We hope/assume that the first vty found corresponds to the first console - * device. 
- */
-static int hvc_find_vtys(void)
-{
-	struct device_node *vty;
-	int num_found = 0;
-
-	for (vty = of_find_node_by_name(NULL, "vty"); vty != NULL;
-			vty = of_find_node_by_name(vty, "vty")) {
-		uint32_t *vtermno;
-
-		/* We have statically defined space for only a certain number of
-		 * console adapters. */
-		if (num_found >= MAX_NR_HVC_CONSOLES)
-			break;
-
-		vtermno = (uint32_t *)get_property(vty, "reg", NULL);
-		if (!vtermno)
-			continue;
-
-		if (device_is_compatible(vty, "hvterm1")) {
-			hvc_instantiate(*vtermno, num_found);
-			++num_found;
-		}
-	}
-
-	return num_found;
-}
-console_initcall(hvc_find_vtys);
_

From anton at samba.org Thu Jul 7 01:53:18 2005
From: anton at samba.org (Anton Blanchard)
Date: Thu, 7 Jul 2005 01:53:18 +1000
Subject: [PATCH 4/14] hvc_console: MAGIC_SYSRQ should only be on console channel
In-Reply-To: <20050706155241.GT12786@krispykreme>
References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme>
Message-ID: <20050706155318.GU12786@krispykreme>

From: Milton Miller

Guard the MAGIC_SYSRQ ^O to be just on the console channel. Make the other channels more transparent.
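As an illustration of the guard described above, here is a small userspace sketch of the input filter. It is not the driver code: struct chan, filter_input() and the last_sysrq field are hypothetical names, the per-channel sysrq_pressed flag is a simplification (the driver keeps a single static flag), and recording the key in last_sysrq merely stands in for the call to handle_sysrq().

```c
#include <assert.h>
#include <stddef.h>

#define SYSRQ_KEY '\x0f'	/* ^O, the escape character used by the driver */

/* Hypothetical per-channel state for this sketch. */
struct chan {
	int is_console;		/* stands in for hp->index == hvc_con_driver.index */
	int sysrq_pressed;	/* set after ^O, cleared by the next character */
	char last_sysrq;	/* stand-in for the effect of handle_sysrq() */
};

/*
 * Copy n input bytes to out, diverting "^O <key>" pairs only when the
 * channel is the console.  Returns the number of bytes passed through.
 * out must have room for at least n bytes.
 */
static size_t filter_input(struct chan *ch, const char *in, size_t n, char *out)
{
	size_t i, kept = 0;

	assert(ch && in && out);

	for (i = 0; i < n; i++) {
		if (ch->is_console) {
			if (in[i] == SYSRQ_KEY) {
				ch->sysrq_pressed = 1;
				continue;
			} else if (ch->sysrq_pressed) {
				ch->last_sysrq = in[i];
				ch->sysrq_pressed = 0;
				continue;
			}
		}
		/* every other byte goes to the tty untouched */
		out[kept++] = in[i];
	}
	return kept;
}
```

Feeding the four bytes 'a', ^O, 't', 'b' through a console channel passes only 'a' and 'b' on and treats 't' as the SysRq command; the same bytes on a non-console channel pass through unchanged, which is what makes those channels transparent to binary data.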
Signed-off-by: Milton Miller
Signed-off-by: Anton Blanchard

diff -puN drivers/char/hvc_console.c~hvc-console-sysrq drivers/char/hvc_console.c
--- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-sysrq 2005-02-08 02:09:27.502022024 -0600
+++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 02:32:09.105245274 -0600
@@ -584,14 +584,17 @@ static int hvc_poll(struct hvc_struct *h
 		}
 		for (i = 0; i < n; ++i) {
 #ifdef CONFIG_MAGIC_SYSRQ
-			/* Handle the SysRq Hack */
-			if (buf[i] == '\x0f') {	/* ^O -- should support a sequence */
-				sysrq_pressed = 1;
-				continue;
-			} else if (sysrq_pressed) {
-				handle_sysrq(buf[i], NULL, tty);
-				sysrq_pressed = 0;
-				continue;
+			if (hp->index == hvc_con_driver.index) {
+				/* Handle the SysRq Hack */
+				/* XXX should support a sequence */
+				if (buf[i] == '\x0f') {	/* ^O */
+					sysrq_pressed = 1;
+					continue;
+				} else if (sysrq_pressed) {
+					handle_sysrq(buf[i], NULL, tty);
+					sysrq_pressed = 0;
+					continue;
+				}
 			}
 #endif /* CONFIG_MAGIC_SYSRQ */
 			tty_insert_flip_char(tty, buf[i], 0);
_

From anton at samba.org Thu Jul 7 01:54:22 2005
From: anton at samba.org (Anton Blanchard)
Date: Thu, 7 Jul 2005 01:54:22 +1000
Subject: [PATCH 6/14] hvc_console: Add missing include
In-Reply-To: <20050706155355.GV12786@krispykreme>
References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme>
Message-ID: <20050706155422.GW12786@krispykreme>

From: Milton Miller

hvc_console checks MAGIC_SYSRQ and XMON config vars.
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-config-h drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-config-h 2005-02-08 00:42:54.520912122 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 00:43:21.879836268 -0600 @@ -22,6 +22,7 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ +#include #include #include #include _ From anton at samba.org Thu Jul 7 02:02:09 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 02:02:09 +1000 Subject: [PATCH 12/14] hvc_console: Register ops when setting up hvc_console In-Reply-To: <20050706160116.GB12786@krispykreme> References: <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> <20050706155457.GX12786@krispykreme> <20050706155522.GY12786@krispykreme> <20050706155636.GZ12786@krispykreme> <20050706155836.GA12786@krispykreme> <20050706160116.GB12786@krispykreme> Message-ID: <20050706160209.GC12786@krispykreme> From: Milton Miller When registering the hvc console port, register a list of ops (read and write) to go with it, instead of calling fixed function names. This allows different ports to encode the data differently. Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-hooked drivers/char/hvc_console.c --- gr_work_small/drivers/char/hvc_console.c~hvc-console-hooked 2005-05-31 21:59:59.410444389 -0500 +++ gr_work_small-miltonm/drivers/char/hvc_console.c 2005-05-31 21:59:59.432440906 -0500 @@ -85,6 +85,7 @@ struct hvc_struct { char outbuf[N_OUTBUF] __ALIGNED__; int n_outbuf; uint32_t vtermno; + struct hv_ops *ops; int irq_requested; int irq; struct list_head next; @@ -142,6 +143,7 @@ struct hvc_struct *hvc_get_by_index(int * console interfaces but can still be used as a tty device. 
This has to be * static because kmalloc will not work during early console init. */ +static struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; @@ -154,14 +156,14 @@ void hvc_console_print(struct console *c { char c[16] __ALIGNED__; unsigned i = 0, n = 0; - int r, donecr = 0; + int r, donecr = 0, index = co->index; /* Console access attempt outside of acceptable console range. */ - if (co->index >= MAX_NR_HVC_CONSOLES) + if (index >= MAX_NR_HVC_CONSOLES) return; /* This console adapter was removed so it is not useable. */ - if (vtermnos[co->index] < 0) + if (vtermnos[index] < 0) return; while (count > 0 || i > 0) { @@ -175,7 +177,7 @@ void hvc_console_print(struct console *c --count; } } else { - r = hvc_put_chars(vtermnos[co->index], c, i); + r = cons_ops[index]->put_chars(vtermnos[index], c, i); if (r < 0) { /* throw away chars on error */ i = 0; @@ -245,7 +247,7 @@ console_initcall(hvc_console_init); * vty adapters do NOT get an hvc_instantiate() callback since they * appear after early console init. 
*/ -int hvc_instantiate(uint32_t vtermno, int index) +int hvc_instantiate(uint32_t vtermno, int index, struct hv_ops *ops) { struct hvc_struct *hp; @@ -263,6 +265,7 @@ int hvc_instantiate(uint32_t vtermno, in } vtermnos[index] = vtermno; + cons_ops[index] = ops; /* reserve all indices upto and including this index */ if (last_hvc < index) @@ -466,7 +469,7 @@ static void hvc_push(struct hvc_struct * { int n; - n = hvc_put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf); + n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf); if (n <= 0) { if (n == 0) return; @@ -604,7 +607,7 @@ static int hvc_poll(struct hvc_struct *h break; } - n = hvc_get_chars(hp->vtermno, buf, count); + n = hp->ops->get_chars(hp->vtermno, buf, count); if (n <= 0) { /* Hangup the tty when disconnected from host */ if (n == -EPIPE) { @@ -737,7 +740,8 @@ static struct kobj_type hvc_kobj_type = .release = destroy_hvc_struct, }; -struct hvc_struct __devinit *hvc_alloc(uint32_t vtermno, int irq) +struct hvc_struct __devinit *hvc_alloc(uint32_t vtermno, int irq, + struct hv_ops *ops) { struct hvc_struct *hp; int i; @@ -750,6 +754,7 @@ struct hvc_struct __devinit *hvc_alloc(u hp->vtermno = vtermno; hp->irq = irq; + hp->ops = ops; kobject_init(&hp->kobj); hp->kobj.ktype = &hvc_kobj_type; diff -puN drivers/char/hvc_vio.c~hvc-console-hooked drivers/char/hvc_vio.c --- gr_work_small/drivers/char/hvc_vio.c~hvc-console-hooked 2005-05-31 21:59:59.414443756 -0500 +++ gr_work_small-miltonm/drivers/char/hvc_vio.c 2005-05-31 22:01:10.628964170 -0500 @@ -43,6 +43,11 @@ static struct vio_device_id hvc_driver_t }; MODULE_DEVICE_TABLE(vio, hvc_driver_table); +static struct hv_ops hvc_get_put_ops = { + .get_chars = hvc_get_chars, + .put_chars = hvc_put_chars, +}; + static int __devinit hvc_vio_probe(struct vio_dev *vdev, const struct vio_device_id *id) { @@ -52,7 +57,7 @@ static int __devinit hvc_vio_probe(struc if (!vdev || !id) return -EPERM; - hp = hvc_alloc(vdev->unit_address, vdev->irq); + hp = 
hvc_alloc(vdev->unit_address, vdev->irq, &hvc_get_put_ops); if (IS_ERR(hp)) return PTR_ERR(hp); dev_set_drvdata(&vdev->dev, hp); @@ -115,7 +120,7 @@ static int hvc_find_vtys(void) continue; if (device_is_compatible(vty, "hvterm1")) { - hvc_instantiate(*vtermno, num_found); + hvc_instantiate(*vtermno, num_found, &hvc_get_put_ops); ++num_found; } } diff -puN include/asm-ppc64/hvconsole.h~hvc-console-hooked include/asm-ppc64/hvconsole.h --- gr_work_small/include/asm-ppc64/hvconsole.h~hvc-console-hooked 2005-05-31 21:59:59.419442964 -0500 +++ gr_work_small-miltonm/include/asm-ppc64/hvconsole.h 2005-05-31 21:59:59.433440748 -0500 @@ -30,15 +30,20 @@ #define MAX_NR_HVC_CONSOLES 16 /* implemented by a low level driver */ +struct hv_ops { + int (*get_chars)(uint32_t vtermno, char *buf, int count); + int (*put_chars)(uint32_t vtermno, const char *buf, int count); +}; extern int hvc_get_chars(uint32_t vtermno, char *buf, int count); extern int hvc_put_chars(uint32_t vtermno, const char *buf, int count); struct hvc_struct; /* Register a vterm and a slot index for use as a console (console_init) */ -extern int hvc_instantiate(uint32_t vtermno, int index); +extern int hvc_instantiate(uint32_t vtermno, int index, struct hv_ops *ops); /* register a vterm for hvc tty operation (module_init or hotplug add) */ -extern struct hvc_struct * __devinit hvc_alloc(uint32_t vtermno, int irq); +extern struct hvc_struct * __devinit hvc_alloc(uint32_t vtermno, int irq, + struct hv_ops *ops); /* remove a vterm from hvc tty operation (modele_exit or hotplug remove) */ extern int __devexit hvc_remove(struct hvc_struct *hp); #endif /* _PPC64_HVCONSOLE_H */ _ From anton at samba.org Thu Jul 7 01:54:57 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 01:54:57 +1000 Subject: [PATCH 7/14] hvc_console: remove num_vterms and some dead code In-Reply-To: <20050706155422.GW12786@krispykreme> References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> 
<20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> Message-ID: <20050706155457.GX12786@krispykreme> From: Milton Miller num_vterms hasn't been used since the hotplug support went in. Also, remove a dead code line from a list_for_each_entry conversion. Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-remove-num-vterms drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-remove-num-vterms 2005-02-08 02:32:19.610182226 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 02:32:19.620180641 -0600 @@ -146,8 +146,6 @@ struct hvc_struct *hvc_get_by_index(int */ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES]; -/* Used for accounting purposes */ -static int num_vterms = 0; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -219,7 +217,7 @@ static int __init hvc_console_init(void) for (i=0; i References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> Message-ID: <20050706155241.GT12786@krispykreme> From: Milton Miller Have the hvc console code try to pull characters immediately when receiving an interrupt, and kick the poll thread only if the immediate poll indicates it needed a call back to do more work. Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-irq-nodelay drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-irq-nodelay 2005-02-08 00:42:40.386023479 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 00:42:40.396021893 -0600 @@ -254,13 +254,17 @@ static void hvc_kick(void) wake_up_process(hvc_task); } +static int hvc_poll(struct hvc_struct *hp); + /* * NOTE: This API isn't used if the console adapter doesn't support interrupts. 
* In this case the console is poll driven. */ static irqreturn_t hvc_handle_interrupt(int irq, void *dev_instance, struct pt_regs *regs) { - hvc_kick(); + /* if hvc_poll request a repoll, then kick the hvcd thread */ + if (hvc_poll(dev_instance)) + hvc_kick(); return IRQ_HANDLED; } @@ -598,8 +602,8 @@ static int hvc_poll(struct hvc_struct *h /* * Account for the total amount read in one loop, and if above - * 64 bytes, we do a quick schedule loop to let the tty grok the - * data and eventually throttle us. + * 64 bytes, we do a quick schedule loop to let the tty grok + * the data and eventually throttle us. */ read_total += n; if (read_total >= 64) { _ From anton at samba.org Thu Jul 7 02:03:08 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 02:03:08 +1000 Subject: [PATCH 14/14] hvc_console: Use hvc_get_chars in hvsi code In-Reply-To: <20050706160238.GD12786@krispykreme> References: <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> <20050706155457.GX12786@krispykreme> <20050706155522.GY12786@krispykreme> <20050706155636.GZ12786@krispykreme> <20050706155836.GA12786@krispykreme> <20050706160116.GB12786@krispykreme> <20050706160209.GC12786@krispykreme> <20050706160238.GD12786@krispykreme> Message-ID: <20050706160307.GE12786@krispykreme> Subject: [PATCH] hvc_console: Use hvc_get_chars in hvsi code From: Milton Miller Now that hvc_get_chars doesn't strip NULs, hvsi doesn't have to duplicate it. 
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvsi.c~hvsi-use-get-chars drivers/char/hvsi.c --- gr_dbg/drivers/char/hvsi.c~hvsi-use-get-chars 2005-01-07 04:57:00.914220357 -0600 +++ gr_dbg-miltonm/drivers/char/hvsi.c 2005-01-07 04:59:06.556385126 -0600 @@ -291,15 +291,13 @@ static void dump_packet(uint8_t *packet) dump_hex(packet, header->len); } -/* can't use hvc_get_chars because that strips CRs */ static int hvsi_read(struct hvsi_struct *hp, char *buf, int count) { unsigned long got; - if (plpar_hcall(H_GET_TERM_CHAR, hp->vtermno, 0, 0, 0, &got, - (unsigned long *)buf, (unsigned long *)buf+1) == H_Success) - return got; - return 0; + got = hvc_get_chars(hp->vtermno, buf, count); + + return got; } static void hvsi_recv_control(struct hvsi_struct *hp, uint8_t *packet, _ From anton at samba.org Thu Jul 7 01:53:55 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 01:53:55 +1000 Subject: [PATCH 5/14] hvc_console: Unregister the console in the exit routine. In-Reply-To: <20050706155318.GU12786@krispykreme> References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> Message-ID: <20050706155355.GV12786@krispykreme> From: Milton Miller Be thorough in our exit routine, since it says it is there to be so. Unregistering without registering is safe (checked in 2.6.10). 
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN drivers/char/hvc_console.c~hvc-console-unregister drivers/char/hvc_console.c --- gr_work_udbg/drivers/char/hvc_console.c~hvc-console-unregister 2005-02-08 02:32:12.656217822 -0600 +++ gr_work_udbg-miltonm/drivers/char/hvc_console.c 2005-02-08 02:32:12.666216237 -0600 @@ -846,8 +846,9 @@ int __init hvc_init(void) } module_init(hvc_init); -/* This isn't particularily necessary due to this being a console driver but it - * is nice to be thorough */ +/* This isn't particularily necessary due to this being a console driver + * but it is nice to be thorough. + */ static void __exit hvc_exit(void) { kthread_stop(hvc_task); @@ -856,5 +857,6 @@ static void __exit hvc_exit(void) tty_unregister_driver(hvc_driver); /* return tty_struct instances allocated in hvc_init(). */ put_tty_driver(hvc_driver); + unregister_console(&hvc_con_driver); } module_exit(hvc_exit); _ From anton at samba.org Thu Jul 7 02:02:38 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 02:02:38 +1000 Subject: [PATCH 13/14] hvc_console: Separate the NUL character filtering from get_hvc_chars In-Reply-To: <20050706160209.GC12786@krispykreme> References: <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> <20050706155457.GX12786@krispykreme> <20050706155522.GY12786@krispykreme> <20050706155636.GZ12786@krispykreme> <20050706155836.GA12786@krispykreme> <20050706160116.GB12786@krispykreme> <20050706160209.GC12786@krispykreme> Message-ID: <20050706160238.GD12786@krispykreme> From: Milton Miller Separate the NUL character filtering from get_hvc_chars. 
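For anyone reading along without the tree handy, the filtering being moved can be modelled in userspace roughly as below. This is only a sketch under assumptions: strip_nul_after_cr is a made-up name (it is not a kernel function), and it works on a buffer and length handed in by the caller instead of asking the hypervisor for characters.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of the workaround: drop a NUL byte that directly
 * follows a '\r' and return the new length of buf.
 * strip_nul_after_cr is a hypothetical name used only for this sketch.
 */
static size_t strip_nul_after_cr(char *buf, size_t got)
{
	size_t i;

	for (i = 1; i < got; ++i) {
		if (buf[i] == 0 && buf[i - 1] == '\r') {
			--got;
			/* Close the gap unless the NUL was the last byte. */
			if (i < got)
				memmove(&buf[i], &buf[i + 1], got - i);
		}
	}
	return got;
}
```

Feeding it the six bytes 'a', '\r', NUL, 'b', '\r', NUL should leave the four bytes 'a', '\r', 'b', '\r'. A NUL that does not follow a '\r' is left alone.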
Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/hvconsole.c~hvc-console-nul-filter arch/ppc64/kernel/hvconsole.c --- gr_work_small/arch/ppc64/kernel/hvconsole.c~hvc-console-nul-filter 2005-05-31 22:06:09.000000000 -0500 +++ gr_work_small-miltonm/arch/ppc64/kernel/hvconsole.c 2005-06-01 19:58:25.373370091 -0500 @@ -41,29 +41,14 @@ int hvc_get_chars(uint32_t vtermno, char unsigned long got; if (plpar_hcall(H_GET_TERM_CHAR, vtermno, 0, 0, 0, &got, - (unsigned long *)buf, (unsigned long *)buf+1) == H_Success) { - /* - * Work around a HV bug where it gives us a null - * after every \r. -- paulus - */ - if (got > 0) { - int i; - for (i = 1; i < got; ++i) { - if (buf[i] == 0 && buf[i-1] == '\r') { - --got; - if (i < got) - memmove(&buf[i], &buf[i+1], - got - i); - } - } - } + (unsigned long *)buf, (unsigned long *)buf+1) == H_Success) return got; - } return 0; } EXPORT_SYMBOL(hvc_get_chars); + /** * hvc_put_chars: send characters to firmware for denoted vterm adapter * @vtermno: The vtermno or unit_address of the adapter from which the data diff -puN drivers/char/hvc_vio.c~hvc-console-nul-filter drivers/char/hvc_vio.c --- gr_work_small/drivers/char/hvc_vio.c~hvc-console-nul-filter 2005-05-31 22:06:09.000000000 -0500 +++ gr_work_small-miltonm/drivers/char/hvc_vio.c 2005-05-31 22:09:23.000000000 -0500 @@ -43,8 +43,30 @@ static struct vio_device_id hvc_driver_t }; MODULE_DEVICE_TABLE(vio, hvc_driver_table); +static int filtered_get_chars(uint32_t vtermno, char *buf, int count) +{ + unsigned long got; + int i; + + got = hvc_get_chars(vtermno, buf, count); + + /* + * Work around a HV bug where it gives us a null + * after every \r. 
-- paulus + */ + for (i = 1; i < got; ++i) { + if (buf[i] == 0 && buf[i-1] == '\r') { + --got; + if (i < got) + memmove(&buf[i], &buf[i+1], + got - i); + } + } + return got; +} + static struct hv_ops hvc_get_put_ops = { - .get_chars = hvc_get_chars, + .get_chars = filtered_get_chars, .put_chars = hvc_put_chars, }; _ From anton at samba.org Thu Jul 7 01:50:20 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 01:50:20 +1000 Subject: hvc_console patches Message-ID: <20050706155020.GQ12786@krispykreme> Hi, Milton Miller has done a lot of work to clean up our hvc_console code. One of the important things the following patch series does is separate the VIO layer from the hvc_console code. With the VIO-specific code removed, any ppc64 platform, or even any architecture, can use hvc_console as a generic polling console. You simply have to supply a get_chars and put_chars method and hvc_console does the rest of the work. You can even use it for an interrupt driven console. It's great. And here it comes. Anton From anton at samba.org Thu Jul 7 01:58:36 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 01:58:36 +1000 Subject: [PATCH 10/14] hvc_console: Separate hvc_console and vio code In-Reply-To: <20050706155636.GZ12786@krispykreme> References: <20050706155020.GQ12786@krispykreme> <20050706155112.GR12786@krispykreme> <20050706155155.GS12786@krispykreme> <20050706155241.GT12786@krispykreme> <20050706155318.GU12786@krispykreme> <20050706155355.GV12786@krispykreme> <20050706155422.GW12786@krispykreme> <20050706155457.GX12786@krispykreme> <20050706155522.GY12786@krispykreme> <20050706155636.GZ12786@krispykreme> Message-ID: <20050706155836.GA12786@krispykreme> From: Milton Miller Separate the console setup routines of the hvc_console and the vio layer. Remove the call to find_init_vty from hvc_console.c. Fail the setup routine if the console doesn't exist, but register the console again when the specified channel is instantiated.
This scheme maintains the print buffer semantics while eliminating callout and call back for the console code. Signed-off-by: Milton Miller Signed-off-by: Anton Blanchard diff -puN arch/ppc64/kernel/hvconsole.c~hvc-console-seperate-setup arch/ppc64/kernel/hvconsole.c --- gr_work_small/arch/ppc64/kernel/hvconsole.c~hvc-console-seperate-setup 2005-05-31 21:12:30.000000000 -0500 +++ gr_work_small-miltonm/arch/ppc64/kernel/hvconsole.c 2005-06-01 20:01:51.205950334 -0500 @@ -93,7 +93,7 @@ EXPORT_SYMBOL(hvc_put_chars); * We hope/assume that the first vty found corresponds to the first console * device. */ -int hvc_find_vtys(void) +static int hvc_find_vtys(void) { struct device_node *vty; int num_found = 0; @@ -119,3 +119,4 @@ int hvc_find_vtys(void) return num_found; } +console_initcall(hvc_find_vtys); diff -puN drivers/char/hvc_console.c~hvc-console-seperate-setup drivers/char/hvc_console.c --- gr_work_small/drivers/char/hvc_console.c~hvc-console-seperate-setup 2005-05-31 21:12:30.000000000 -0500 +++ gr_work_small-miltonm/drivers/char/hvc_console.c 2005-06-01 20:01:51.197951605 -0500 @@ -219,10 +219,23 @@ struct console hvc_con_driver = { .index = -1, }; -/* Early console initialization. Preceeds driver initialization. */ +/* + * Early console initialization. Preceeds driver initialization. + * + * (1) we are first, and the user specified another driver + * -- index will remain -1 + * (2) we are first and the user specified no driver + * -- index will be set to 0, then we will fail setup. + * (3) we are first and the user specified our driver + * -- index will be set to user specified driver, and we will fail + * (4) we are after driver, and this initcall will register us + * -- if the user didn't specify a driver then the console will match + * + * Note that for cases 2 and 3, we will match later when the io driver + * calls hvc_instantiate() and call register again. 
+ */ static int __init hvc_console_init(void) { - hvc_find_vtys(); register_console(&hvc_con_driver); return 0; } @@ -257,6 +270,13 @@ int hvc_instantiate(uint32_t vtermno, in if (last_hvc < index) last_hvc = index; + /* if this index is what the user requested, then register + * now (setup won't fail at this point). It's ok to just + * call register again if previously .setup failed. + */ + if (index == hvc_con_driver.index) + register_console(&hvc_con_driver); + return 0; } diff -puN include/asm-ppc64/hvconsole.h~hvc-console-seperate-setup include/asm-ppc64/hvconsole.h --- gr_work_small/include/asm-ppc64/hvconsole.h~hvc-console-seperate-setup 2005-05-31 21:12:30.000000000 -0500 +++ gr_work_small-miltonm/include/asm-ppc64/hvconsole.h 2005-06-01 20:01:51.204950493 -0500 @@ -32,9 +32,6 @@ extern int hvc_get_chars(uint32_t vtermno, char *buf, int count); extern int hvc_put_chars(uint32_t vtermno, const char *buf, int count); -/* Early discovery of console adapters. */ -extern int hvc_find_vtys(void); - -/* Implemented by a console driver */ +/* Register a vterm and a slot index for use as a console */ extern int hvc_instantiate(uint32_t vtermno, int index); #endif /* _PPC64_HVCONSOLE_H */ _ From anton at samba.org Thu Jul 7 04:45:53 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:45:53 +1000 Subject: [PATCH 1/10] ppc64: Make idle_loop a ppc_md function In-Reply-To: <20050706184447.GG12786@krispykreme> References: <20050706184447.GG12786@krispykreme> Message-ID: <20050706184553.GH12786@krispykreme> From: Michael Ellerman This patch adds an idle_loop member to the ppc_md structure and calls it from cpu_idle(). If a platform leaves ppc_md.idle_loop as NULL, it will get the default idle loop, default_idle().
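The hook-with-fallback arrangement described here can be tried out in userspace with something like the sketch below. It is an illustration only: every name ending in _model, plus select_idle_loop and run_idle, is invented for the example. The stand-in loops return immediately so the code can actually be run, and a NULL hook returns -1 here, whereas the patch uses BUG_ON() for that case.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct machdep_calls; only the idle hook is modelled. */
struct machdep_calls_model {
	int (*idle_loop)(void);
};

static struct machdep_calls_model md_model;

/* Stand-ins for real idle loops; they return at once so the sketch terminates. */
static int default_idle_model(void) { return 0; }
static int platform_idle_model(void) { return 1; }

/* Models the setup_arch() hunk: fall back only if the platform left the hook empty. */
static void select_idle_loop(void)
{
	if (md_model.idle_loop == NULL)
		md_model.idle_loop = default_idle_model;
}

/* Models cpu_idle(): call through the hook, reporting -1 for an unset hook. */
static int run_idle(void)
{
	if (md_model.idle_loop == NULL)
		return -1;
	return md_model.idle_loop();
}
```

A platform that wants its own loop would assign md_model.idle_loop = platform_idle_model before select_idle_loop() runs, and the fallback then leaves that choice alone.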
Signed-off-by: Michael Ellerman Signed-off-by: Anton Blanchard Index: ppc64-2.6/include/asm-ppc64/machdep.h =================================================================== --- ppc64-2.6.orig/include/asm-ppc64/machdep.h +++ ppc64-2.6/include/asm-ppc64/machdep.h @@ -140,8 +140,13 @@ struct machdep_calls { unsigned long size, pgprot_t vma_prot); + /* Idle loop for this platform, leave empty for default idle loop */ + int (*idle_loop)(void); }; +extern int default_idle(void); +extern int native_idle(void); + extern struct machdep_calls ppc_md; extern char cmd_line[COMMAND_LINE_SIZE]; Index: ppc64-2.6/arch/ppc64/kernel/setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/setup.c +++ ppc64-2.6/arch/ppc64/kernel/setup.c @@ -96,7 +96,6 @@ extern void udbg_init_maple_realmode(voi extern unsigned long klimit; extern void mm_init_ppc64(void); -extern int idle_setup(void); extern void stab_initialize(unsigned long stab); extern void htab_initialize(void); extern void early_init_devtree(void *flat_dt); @@ -1081,8 +1080,9 @@ void __init setup_arch(char **cmdline_p) ppc_md.setup_arch(); - /* Select the correct idle loop for the platform. */ - idle_setup(); + /* Use the default idle loop if the platform hasn't provided one. 
*/ + if (NULL == ppc_md.idle_loop) + ppc_md.idle_loop = default_idle; paging_init(); ppc64_boot_msg(0x15, "Setup Done"); Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -33,6 +33,7 @@ #include #include #include +#include extern void power4_idle(void); @@ -122,7 +123,7 @@ static int iSeries_idle(void) #else -static int default_idle(void) +int default_idle(void) { long oldval; unsigned int cpu = smp_processor_id(); @@ -288,7 +289,7 @@ static int shared_idle(void) #endif /* CONFIG_PPC_PSERIES */ -static int native_idle(void) +int native_idle(void) { while(1) { /* check CPU type here */ @@ -308,7 +309,8 @@ static int native_idle(void) void cpu_idle(void) { - idle_loop(); + BUG_ON(NULL == ppc_md.idle_loop); + ppc_md.idle_loop(); } int powersave_nap; From anton at samba.org Thu Jul 7 04:46:56 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:46:56 +1000 Subject: [PATCH 3/10] ppc64: Move pSeries idle functions into pSeries_setup.c In-Reply-To: <20050706184616.GI12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> Message-ID: <20050706184655.GJ12786@krispykreme> From: Michael Ellerman dedicated_idle() and shared_idle() are only used by pSeries, so move them into pSeries_setup.c Signed-off-by: Michael Ellerman Signed-off-by: Anton Blanchard Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -74,137 +74,6 @@ int default_idle(void) return 0; } -#ifdef CONFIG_PPC_PSERIES - -DECLARE_PER_CPU(unsigned long, smt_snooze_delay); - -int dedicated_idle(void) -{ - long oldval; - struct paca_struct *lpaca = get_paca(), *ppaca; - unsigned long start_snooze; - unsigned long *smt_snooze_delay 
= &__get_cpu_var(smt_snooze_delay); - unsigned int cpu = smp_processor_id(); - - ppaca = &paca[cpu ^ 1]; - - while (1) { - /* - * Indicate to the HV that we are idle. Now would be - * a good time to find other work to dispatch. - */ - lpaca->lppaca.idle = 1; - - oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); - if (!oldval) { - set_thread_flag(TIF_POLLING_NRFLAG); - start_snooze = __get_tb() + - *smt_snooze_delay * tb_ticks_per_usec; - while (!need_resched() && !cpu_is_offline(cpu)) { - /* - * Go into low thread priority and possibly - * low power mode. - */ - HMT_low(); - HMT_very_low(); - - if (*smt_snooze_delay == 0 || - __get_tb() < start_snooze) - continue; - - HMT_medium(); - - if (!(ppaca->lppaca.idle)) { - local_irq_disable(); - - /* - * We are about to sleep the thread - * and so wont be polling any - * more. - */ - clear_thread_flag(TIF_POLLING_NRFLAG); - - /* - * SMT dynamic mode. Cede will result - * in this thread going dormant, if the - * partner thread is still doing work. - * Thread wakes up if partner goes idle, - * an interrupt is presented, or a prod - * occurs. Returning from the cede - * enables external interrupts. - */ - if (!need_resched()) - cede_processor(); - else - local_irq_enable(); - } else { - /* - * Give the HV an opportunity at the - * processor, since we are not doing - * any work. - */ - poll_pending(); - } - } - - clear_thread_flag(TIF_POLLING_NRFLAG); - } else { - set_need_resched(); - } - - HMT_medium(); - lpaca->lppaca.idle = 0; - schedule(); - if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) - cpu_die(); - } - return 0; -} - -static int shared_idle(void) -{ - struct paca_struct *lpaca = get_paca(); - unsigned int cpu = smp_processor_id(); - - while (1) { - /* - * Indicate to the HV that we are idle. Now would be - * a good time to find other work to dispatch. 
- */ - lpaca->lppaca.idle = 1; - - while (!need_resched() && !cpu_is_offline(cpu)) { - local_irq_disable(); - - /* - * Yield the processor to the hypervisor. We return if - * an external interrupt occurs (which are driven prior - * to returning here) or if a prod occurs from another - * processor. When returning here, external interrupts - * are enabled. - * - * Check need_resched() again with interrupts disabled - * to avoid a race. - */ - if (!need_resched()) - cede_processor(); - else - local_irq_enable(); - } - - HMT_medium(); - lpaca->lppaca.idle = 0; - schedule(); - if (cpu_is_offline(smp_processor_id()) && - system_state == SYSTEM_RUNNING) - cpu_die(); - } - - return 0; -} - -#endif /* CONFIG_PPC_PSERIES */ - int native_idle(void) { while(1) { Index: ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c @@ -418,6 +418,133 @@ static int __init pSeries_probe(int plat return 1; } +DECLARE_PER_CPU(unsigned long, smt_snooze_delay); + +int dedicated_idle(void) +{ + long oldval; + struct paca_struct *lpaca = get_paca(), *ppaca; + unsigned long start_snooze; + unsigned long *smt_snooze_delay = &__get_cpu_var(smt_snooze_delay); + unsigned int cpu = smp_processor_id(); + + ppaca = &paca[cpu ^ 1]; + + while (1) { + /* + * Indicate to the HV that we are idle. Now would be + * a good time to find other work to dispatch. + */ + lpaca->lppaca.idle = 1; + + oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + if (!oldval) { + set_thread_flag(TIF_POLLING_NRFLAG); + start_snooze = __get_tb() + + *smt_snooze_delay * tb_ticks_per_usec; + while (!need_resched() && !cpu_is_offline(cpu)) { + /* + * Go into low thread priority and possibly + * low power mode. 
+ */ + HMT_low(); + HMT_very_low(); + + if (*smt_snooze_delay == 0 || + __get_tb() < start_snooze) + continue; + + HMT_medium(); + + if (!(ppaca->lppaca.idle)) { + local_irq_disable(); + + /* + * We are about to sleep the thread + * and so wont be polling any + * more. + */ + clear_thread_flag(TIF_POLLING_NRFLAG); + + /* + * SMT dynamic mode. Cede will result + * in this thread going dormant, if the + * partner thread is still doing work. + * Thread wakes up if partner goes idle, + * an interrupt is presented, or a prod + * occurs. Returning from the cede + * enables external interrupts. + */ + if (!need_resched()) + cede_processor(); + else + local_irq_enable(); + } else { + /* + * Give the HV an opportunity at the + * processor, since we are not doing + * any work. + */ + poll_pending(); + } + } + + clear_thread_flag(TIF_POLLING_NRFLAG); + } else { + set_need_resched(); + } + + HMT_medium(); + lpaca->lppaca.idle = 0; + schedule(); + if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) + cpu_die(); + } + return 0; +} + +static int shared_idle(void) +{ + struct paca_struct *lpaca = get_paca(); + unsigned int cpu = smp_processor_id(); + + while (1) { + /* + * Indicate to the HV that we are idle. Now would be + * a good time to find other work to dispatch. + */ + lpaca->lppaca.idle = 1; + + while (!need_resched() && !cpu_is_offline(cpu)) { + local_irq_disable(); + + /* + * Yield the processor to the hypervisor. We return if + * an external interrupt occurs (which are driven prior + * to returning here) or if a prod occurs from another + * processor. When returning here, external interrupts + * are enabled. + * + * Check need_resched() again with interrupts disabled + * to avoid a race. 
+ */ + if (!need_resched()) + cede_processor(); + else + local_irq_enable(); + } + + HMT_medium(); + lpaca->lppaca.idle = 0; + schedule(); + if (cpu_is_offline(smp_processor_id()) && + system_state == SYSTEM_RUNNING) + cpu_die(); + } + + return 0; +} + struct machdep_calls __initdata pSeries_md = { .probe = pSeries_probe, .setup_arch = pSeries_setup_arch, From anton at samba.org Thu Jul 7 04:48:39 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:48:39 +1000 Subject: [PATCH 5/10] ppc64: Remove obsolete idle_setup() In-Reply-To: <20050706184726.GK12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> <20050706184655.GJ12786@krispykreme> <20050706184726.GK12786@krispykreme> Message-ID: <20050706184839.GL12786@krispykreme> From: Michael Ellerman Now that the idle loop is configured by each platform we don't need idle_setup() anymore. Signed-off-by: Michael Ellerman Signed-off-by: Anton Blanchard Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -37,8 +37,6 @@ extern void power4_idle(void); -static int (*idle_loop)(void); - int default_idle(void) { long oldval; @@ -127,42 +125,3 @@ register_powersave_nap_sysctl(void) } __initcall(register_powersave_nap_sysctl); #endif - -int idle_setup(void) -{ - /* - * Move that junk to each platform specific file, eventually define - * a pSeries_idle for shared processor stuff - */ -#ifdef CONFIG_PPC_ISERIES - idle_loop = iSeries_idle; - return 1; -#else - idle_loop = default_idle; -#endif -#ifdef CONFIG_PPC_PSERIES - if (systemcfg->platform & PLATFORM_PSERIES) { - if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) { - if (get_paca()->lppaca.shared_proc) { - printk(KERN_INFO "Using shared processor idle loop\n"); - idle_loop = shared_idle; - } else { - printk(KERN_INFO "Using 
dedicated idle loop\n"); - idle_loop = dedicated_idle; - } - } else { - printk(KERN_INFO "Using default idle loop\n"); - idle_loop = default_idle; - } - } -#endif /* CONFIG_PPC_PSERIES */ -#ifndef CONFIG_PPC_ISERIES - if (systemcfg->platform == PLATFORM_POWERMAC || - systemcfg->platform == PLATFORM_MAPLE) { - printk(KERN_INFO "Using native/NAP idle loop\n"); - idle_loop = native_idle; - } -#endif /* CONFIG_PPC_ISERIES */ - - return 1; -} From anton at samba.org Thu Jul 7 04:50:37 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:50:37 +1000 Subject: [PATCH 10/10] ppc64: Be consistent about printing which idle loop we're using In-Reply-To: <20050706185015.GP12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> <20050706184655.GJ12786@krispykreme> <20050706184726.GK12786@krispykreme> <20050706184839.GL12786@krispykreme> <20050706184906.GM12786@krispykreme> <20050706184937.GN12786@krispykreme> <20050706184953.GO12786@krispykreme> <20050706185015.GP12786@krispykreme> Message-ID: <20050706185037.GQ12786@krispykreme> From: Michael Ellerman Not sure if we really need this, but it was handy to know which iSeries loop I was testing. Be consistent about printing which idle loop we're using; with this patch we cover all cases.
Signed-off-by: Michael Ellerman Signed-off-by: Anton Blanchard Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -946,9 +946,12 @@ void __init iSeries_early_setup(void) ppc_md.calibrate_decr = iSeries_calibrate_decr; ppc_md.progress = iSeries_progress; - if (get_paca()->lppaca.shared_proc) + if (get_paca()->lppaca.shared_proc) { ppc_md.idle_loop = iseries_shared_idle; - else + printk(KERN_INFO "Using shared processor idle loop\n"); + } else { ppc_md.idle_loop = iseries_dedicated_idle; + printk(KERN_INFO "Using dedicated idle loop\n"); + } } Index: work/arch/ppc64/kernel/setup.c =================================================================== --- work.orig/arch/ppc64/kernel/setup.c +++ work/arch/ppc64/kernel/setup.c @@ -1081,8 +1081,10 @@ void __init setup_arch(char **cmdline_p) ppc_md.setup_arch(); /* Use the default idle loop if the platform hasn't provided one. 
*/ - if (NULL == ppc_md.idle_loop) + if (NULL == ppc_md.idle_loop) { ppc_md.idle_loop = default_idle; + printk(KERN_INFO "Using default idle loop\n"); + } paging_init(); ppc64_boot_msg(0x15, "Setup Done"); From anton at samba.org Thu Jul 7 04:49:06 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:49:06 +1000 Subject: [PATCH 6/10] ppc64: iSeries idle fixups In-Reply-To: <20050706184839.GL12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> <20050706184655.GJ12786@krispykreme> <20050706184726.GK12786@krispykreme> <20050706184839.GL12786@krispykreme> Message-ID: <20050706184906.GM12786@krispykreme> - remove min/max yield time, we don't use the values anywhere - separate shared and dedicated idle loops - check need_resched again with irqs off to avoid sleeping with pending work - continually set runlatch off in idle loop, this means we don't need to turn the runlatch off on exception exit and suffer that associated cost for all exceptions.
(A future patch will turn the runlatch on at exception entry) Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 02:26:24.061621784 +1000 +++ foobar2/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 05:49:46.629711734 +1000 @@ -834,9 +834,6 @@ late_initcall(iSeries_src_init); -static unsigned long maxYieldTime = 0; -static unsigned long minYieldTime = 0xffffffffffffffffUL; - static inline void process_iSeries_events(void) { asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); @@ -845,7 +842,6 @@ static void yield_shared_processor(void) { unsigned long tb; - unsigned long yieldTime; HvCall_setEnabledInterrupts(HvCall_MaskIPI | HvCall_MaskLpEvent | @@ -856,13 +852,6 @@ /* Compute future tb value when yield should expire */ HvCall_yieldProcessor(HvCall_YieldTimed, tb+tb_ticks_per_jiffy); - yieldTime = get_tb() - tb; - if (yieldTime > maxYieldTime) - maxYieldTime = yieldTime; - - if (yieldTime < minYieldTime) - minYieldTime = yieldTime; - /* * The decrementer stops during the yield. Force a fake decrementer * here and let the timer_interrupt code sort out the actual time. 
@@ -871,45 +860,62 @@ process_iSeries_events(); } -static int iSeries_idle(void) +static int iseries_shared_idle(void) { - struct paca_struct *lpaca; - long oldval; + while (1) { + while (!need_resched() && !hvlpevent_is_pending()) { + local_irq_disable(); + ppc64_runlatch_off(); + + /* Recheck with irqs off */ + if (!need_resched() && !hvlpevent_is_pending()) + yield_shared_processor(); + + HMT_medium(); + local_irq_enable(); + } + + ppc64_runlatch_on(); + + if (hvlpevent_is_pending()) + process_iSeries_events(); + + schedule(); + } - /* ensure iSeries run light will be out when idle */ - ppc64_runlatch_off(); + return 0; +} - lpaca = get_paca(); +static int iseries_dedicated_idle(void) +{ + struct paca_struct *lpaca = get_paca(); + long oldval; while (1) { - if (lpaca->lppaca.shared_proc) { - if (hvlpevent_is_pending()) - process_iSeries_events(); - if (!need_resched()) - yield_shared_processor(); - } else { - oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + + if (!oldval) { + set_thread_flag(TIF_POLLING_NRFLAG); - if (!oldval) { - set_thread_flag(TIF_POLLING_NRFLAG); + while (!need_resched()) { + ppc64_runlatch_off(); + HMT_low(); - while (!need_resched()) { + if (hvlpevent_is_pending()) { HMT_medium(); - if (hvlpevent_is_pending()) - process_iSeries_events(); - HMT_low(); + ppc64_runlatch_on(); + process_iSeries_events(); } - - HMT_medium(); - clear_thread_flag(TIF_POLLING_NRFLAG); - } else { - set_need_resched(); } + + HMT_medium(); + clear_thread_flag(TIF_POLLING_NRFLAG); + } else { + set_need_resched(); } ppc64_runlatch_on(); schedule(); - ppc64_runlatch_off(); } return 0; @@ -940,6 +946,10 @@ ppc_md.get_rtc_time = iSeries_get_rtc_time; ppc_md.calibrate_decr = iSeries_calibrate_decr; ppc_md.progress = iSeries_progress; - ppc_md.idle_loop = iSeries_idle; + + if (get_paca()->lppaca.shared_proc) + ppc_md.idle_loop = iseries_shared_idle; + else + ppc_md.idle_loop = iseries_dedicated_idle; } From 
anton at samba.org Thu Jul 7 04:50:15 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:50:15 +1000 Subject: [PATCH 9/10] ppc64: fix compile warning In-Reply-To: <20050706184953.GO12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> <20050706184655.GJ12786@krispykreme> <20050706184726.GK12786@krispykreme> <20050706184839.GL12786@krispykreme> <20050706184906.GM12786@krispykreme> <20050706184937.GN12786@krispykreme> <20050706184953.GO12786@krispykreme> Message-ID: <20050706185015.GP12786@krispykreme> Subject: [PATCH] ppc64: fix compile warning Fix a compile warning introduced by the previous patches. Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 07:09:32.039334942 +1000 +++ foobar2/arch/ppc64/kernel/iSeries_setup.c 2005-07-06 07:14:29.159334906 +1000 @@ -888,7 +888,6 @@ static int iseries_dedicated_idle(void) { - struct paca_struct *lpaca = get_paca(); long oldval; while (1) { From anton at samba.org Thu Jul 7 04:44:48 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:44:48 +1000 Subject: idle loop patches Message-ID: <20050706184447.GG12786@krispykreme> Hi, Michael Ellerman has cleaned up our idle loop mess. Patches to follow. 
BTW they depend on the following patches posted earlier: [PATCH] ppc64: use c99 initialisers in cputable code [PATCH] ppc64: Fix runlatch code to work on pseries machines Anton From anton at samba.org Thu Jul 7 04:49:37 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:49:37 +1000 Subject: [PATCH 7/10] ppc64: pSeries idle fixups In-Reply-To: <20050706184906.GM12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> <20050706184655.GJ12786@krispykreme> <20050706184726.GK12786@krispykreme> <20050706184839.GL12786@krispykreme> <20050706184906.GM12786@krispykreme> Message-ID: <20050706184937.GN12786@krispykreme> - separate out sleep logic in dedicated_idle, it was so far indented that it got squashed against the right side of the screen. - add runlatch support, looping on runlatch disable. Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/pSeries_setup.c 2005-07-06 05:49:51.479133649 +1000 +++ foobar2/arch/ppc64/kernel/pSeries_setup.c 2005-07-06 06:14:22.752007077 +1000 @@ -83,8 +83,8 @@ extern void pSeries_system_reset_exception(struct pt_regs *regs); extern int pSeries_machine_check_exception(struct pt_regs *regs); -static int shared_idle(void); -static int dedicated_idle(void); +static int pseries_shared_idle(void); +static int pseries_dedicated_idle(void); static volatile void __iomem * chrp_int_ack_special; struct mpic *pSeries_mpic; @@ -238,10 +238,10 @@ if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) { if (get_paca()->lppaca.shared_proc) { printk(KERN_INFO "Using shared processor idle loop\n"); - ppc_md.idle_loop = shared_idle; + ppc_md.idle_loop = pseries_shared_idle; } else { printk(KERN_INFO "Using dedicated idle loop\n"); - ppc_md.idle_loop = dedicated_idle; + ppc_md.idle_loop = pseries_dedicated_idle; } } else { 
printk(KERN_INFO "Using default idle loop\n"); @@ -438,15 +438,47 @@ DECLARE_PER_CPU(unsigned long, smt_snooze_delay); -int dedicated_idle(void) +static inline void dedicated_idle_sleep(unsigned int cpu) +{ + struct paca_struct *ppaca = &paca[cpu ^ 1]; + + /* Only sleep if the other thread is not idle */ + if (!(ppaca->lppaca.idle)) { + local_irq_disable(); + + /* + * We are about to sleep the thread and so wont be polling any + * more. + */ + clear_thread_flag(TIF_POLLING_NRFLAG); + + /* + * SMT dynamic mode. Cede will result in this thread going + * dormant, if the partner thread is still doing work. Thread + * wakes up if partner goes idle, an interrupt is presented, or + * a prod occurs. Returning from the cede enables external + * interrupts. + */ + if (!need_resched()) + cede_processor(); + else + local_irq_enable(); + } else { + /* + * Give the HV an opportunity at the processor, since we are + * not doing any work. + */ + poll_pending(); + } +} + +static int pseries_dedicated_idle(void) { long oldval; - struct paca_struct *lpaca = get_paca(), *ppaca; + struct paca_struct *lpaca = get_paca(); + unsigned int cpu = smp_processor_id(); unsigned long start_snooze; unsigned long *smt_snooze_delay = &__get_cpu_var(smt_snooze_delay); - unsigned int cpu = smp_processor_id(); - - ppaca = &paca[cpu ^ 1]; while (1) { /* @@ -458,9 +490,13 @@ oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); if (!oldval) { set_thread_flag(TIF_POLLING_NRFLAG); + start_snooze = __get_tb() + *smt_snooze_delay * tb_ticks_per_usec; + while (!need_resched() && !cpu_is_offline(cpu)) { + ppc64_runlatch_off(); + /* * Go into low thread priority and possibly * low power mode. @@ -468,60 +504,31 @@ HMT_low(); HMT_very_low(); - if (*smt_snooze_delay == 0 || - __get_tb() < start_snooze) - continue; - - HMT_medium(); - - if (!(ppaca->lppaca.idle)) { - local_irq_disable(); - - /* - * We are about to sleep the thread - * and so wont be polling any - * more. 
- */ - clear_thread_flag(TIF_POLLING_NRFLAG); - - /* - * SMT dynamic mode. Cede will result - * in this thread going dormant, if the - * partner thread is still doing work. - * Thread wakes up if partner goes idle, - * an interrupt is presented, or a prod - * occurs. Returning from the cede - * enables external interrupts. - */ - if (!need_resched()) - cede_processor(); - else - local_irq_enable(); - } else { - /* - * Give the HV an opportunity at the - * processor, since we are not doing - * any work. - */ - poll_pending(); + if (*smt_snooze_delay != 0 && + __get_tb() > start_snooze) { + HMT_medium(); + dedicated_idle_sleep(cpu); } + } + HMT_medium(); clear_thread_flag(TIF_POLLING_NRFLAG); } else { set_need_resched(); } - HMT_medium(); lpaca->lppaca.idle = 0; + ppc64_runlatch_on(); + schedule(); + if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) cpu_die(); } - return 0; } -static int shared_idle(void) +static int pseries_shared_idle(void) { struct paca_struct *lpaca = get_paca(); unsigned int cpu = smp_processor_id(); @@ -535,6 +542,7 @@ while (!need_resched() && !cpu_is_offline(cpu)) { local_irq_disable(); + ppc64_runlatch_off(); /* * Yield the processor to the hypervisor. 
We return if @@ -550,13 +558,16 @@ cede_processor(); else local_irq_enable(); + + HMT_medium(); } - HMT_medium(); lpaca->lppaca.idle = 0; + ppc64_runlatch_on(); + schedule(); - if (cpu_is_offline(smp_processor_id()) && - system_state == SYSTEM_RUNNING) + + if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) cpu_die(); } From anton at samba.org Thu Jul 7 04:46:16 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:46:16 +1000 Subject: [PATCH 2/10] ppc64: Move iSeries_idle() into iSeries_setup.c In-Reply-To: <20050706184553.GH12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> Message-ID: <20050706184616.GI12786@krispykreme> From: Michael Ellerman Move iSeries_idle() into iSeries_setup.c, no one else needs to know about it. Signed-off-by: Michael Ellerman Signed-off-by: Anton Blanchard Index: ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c @@ -834,6 +834,87 @@ static int __init iSeries_src_init(void) late_initcall(iSeries_src_init); +static unsigned long maxYieldTime = 0; +static unsigned long minYieldTime = 0xffffffffffffffffUL; + +static inline void process_iSeries_events(void) +{ + asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); +} + +static void yield_shared_processor(void) +{ + unsigned long tb; + unsigned long yieldTime; + + HvCall_setEnabledInterrupts(HvCall_MaskIPI | + HvCall_MaskLpEvent | + HvCall_MaskLpProd | + HvCall_MaskTimeout); + + tb = get_tb(); + /* Compute future tb value when yield should expire */ + HvCall_yieldProcessor(HvCall_YieldTimed, tb+tb_ticks_per_jiffy); + + yieldTime = get_tb() - tb; + if (yieldTime > maxYieldTime) + maxYieldTime = yieldTime; + + if (yieldTime < minYieldTime) + minYieldTime = yieldTime; + + /* + * The decrementer stops during the yield. 
Force a fake decrementer + * here and let the timer_interrupt code sort out the actual time. + */ + get_paca()->lppaca.int_dword.fields.decr_int = 1; + process_iSeries_events(); +} + +static int iSeries_idle(void) +{ + struct paca_struct *lpaca; + long oldval; + + /* ensure iSeries run light will be out when idle */ + ppc64_runlatch_off(); + + lpaca = get_paca(); + + while (1) { + if (lpaca->lppaca.shared_proc) { + if (hvlpevent_is_pending()) + process_iSeries_events(); + if (!need_resched()) + yield_shared_processor(); + } else { + oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); + + if (!oldval) { + set_thread_flag(TIF_POLLING_NRFLAG); + + while (!need_resched()) { + HMT_medium(); + if (hvlpevent_is_pending()) + process_iSeries_events(); + HMT_low(); + } + + HMT_medium(); + clear_thread_flag(TIF_POLLING_NRFLAG); + } else { + set_need_resched(); + } + } + + ppc64_runlatch_on(); + schedule(); + ppc64_runlatch_off(); + } + + return 0; +} + #ifndef CONFIG_PCI void __init iSeries_init_IRQ(void) { } #endif Index: ppc64-2.6/arch/ppc64/kernel/idle.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/idle.c +++ ppc64-2.6/arch/ppc64/kernel/idle.c @@ -39,90 +39,6 @@ extern void power4_idle(void); static int (*idle_loop)(void); -#ifdef CONFIG_PPC_ISERIES -static unsigned long maxYieldTime = 0; -static unsigned long minYieldTime = 0xffffffffffffffffUL; - -static inline void process_iSeries_events(void) -{ - asm volatile ("li 0,0x5555; sc" : : : "r0", "r3"); -} - -static void yield_shared_processor(void) -{ - unsigned long tb; - unsigned long yieldTime; - - HvCall_setEnabledInterrupts(HvCall_MaskIPI | - HvCall_MaskLpEvent | - HvCall_MaskLpProd | - HvCall_MaskTimeout); - - tb = get_tb(); - /* Compute future tb value when yield should expire */ - HvCall_yieldProcessor(HvCall_YieldTimed, tb+tb_ticks_per_jiffy); - - yieldTime = get_tb() - tb; - if (yieldTime > maxYieldTime) - maxYieldTime = yieldTime; - - if (yieldTime < 
minYieldTime) - minYieldTime = yieldTime; - - /* - * The decrementer stops during the yield. Force a fake decrementer - * here and let the timer_interrupt code sort out the actual time. - */ - get_paca()->lppaca.int_dword.fields.decr_int = 1; - process_iSeries_events(); -} - -static int iSeries_idle(void) -{ - struct paca_struct *lpaca; - long oldval; - - /* ensure iSeries run light will be out when idle */ - ppc64_runlatch_off(); - - lpaca = get_paca(); - - while (1) { - if (lpaca->lppaca.shared_proc) { - if (hvlpevent_is_pending()) - process_iSeries_events(); - if (!need_resched()) - yield_shared_processor(); - } else { - oldval = test_and_clear_thread_flag(TIF_NEED_RESCHED); - - if (!oldval) { - set_thread_flag(TIF_POLLING_NRFLAG); - - while (!need_resched()) { - HMT_medium(); - if (hvlpevent_is_pending()) - process_iSeries_events(); - HMT_low(); - } - - HMT_medium(); - clear_thread_flag(TIF_POLLING_NRFLAG); - } else { - set_need_resched(); - } - } - - ppc64_runlatch_on(); - schedule(); - ppc64_runlatch_off(); - } - - return 0; -} - -#else - int default_idle(void) { long oldval; @@ -305,8 +221,6 @@ int native_idle(void) return 0; } -#endif /* CONFIG_PPC_ISERIES */ - void cpu_idle(void) { BUG_ON(NULL == ppc_md.idle_loop); From anton at samba.org Thu Jul 7 04:49:53 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:49:53 +1000 Subject: [PATCH 8/10] ppc64: idle fixups In-Reply-To: <20050706184937.GN12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> <20050706184655.GJ12786@krispykreme> <20050706184726.GK12786@krispykreme> <20050706184839.GL12786@krispykreme> <20050706184906.GM12786@krispykreme> <20050706184937.GN12786@krispykreme> Message-ID: <20050706184953.GO12786@krispykreme> - remove some unnecessary includes - add runlatch support - no need to use raw_smp_processor_id any more, current preempt debug logic checks for processes that are bound to one 
cpu. Signed-off-by: Anton Blanchard Index: linux-2.6.git-work/arch/ppc64/kernel/idle.c =================================================================== --- linux-2.6.git-work.orig/arch/ppc64/kernel/idle.c 2005-07-02 08:24:55.000000000 +1000 +++ linux-2.6.git-work/arch/ppc64/kernel/idle.c 2005-07-06 01:50:08.000000000 +1000 @@ -20,18 +20,12 @@ #include #include #include -#include #include -#include #include #include -#include #include #include -#include -#include -#include #include #include @@ -49,7 +43,8 @@ set_thread_flag(TIF_POLLING_NRFLAG); while (!need_resched() && !cpu_is_offline(cpu)) { - barrier(); + ppc64_runlatch_off(); + /* * Go into low thread priority and possibly * low power mode. @@ -64,6 +59,7 @@ set_need_resched(); } + ppc64_runlatch_on(); schedule(); if (cpu_is_offline(cpu) && system_state == SYSTEM_RUNNING) cpu_die(); @@ -74,17 +70,22 @@ int native_idle(void) { - while(1) { - /* check CPU type here */ + while (1) { + ppc64_runlatch_off(); + if (!need_resched()) power4_idle(); - if (need_resched()) + + if (need_resched()) { + ppc64_runlatch_on(); schedule(); + } - if (cpu_is_offline(raw_smp_processor_id()) && + if (cpu_is_offline(smp_processor_id()) && system_state == SYSTEM_RUNNING) cpu_die(); } + return 0; } From anton at samba.org Thu Jul 7 04:47:26 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 04:47:26 +1000 Subject: [PATCH 4/10] ppc64: Fixup platforms for new ppc_md.idle In-Reply-To: <20050706184655.GJ12786@krispykreme> References: <20050706184447.GG12786@krispykreme> <20050706184553.GH12786@krispykreme> <20050706184616.GI12786@krispykreme> <20050706184655.GJ12786@krispykreme> Message-ID: <20050706184726.GK12786@krispykreme> From: Michael Ellerman This patch fixes up iSeries, pSeries, pmac and maple to set the correct idle function for each platform. 
Signed-off-by: Michael Ellerman Signed-off-by: Anton Blanchard Index: ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/iSeries_setup.c +++ ppc64-2.6/arch/ppc64/kernel/iSeries_setup.c @@ -940,5 +940,6 @@ void __init iSeries_early_setup(void) ppc_md.get_rtc_time = iSeries_get_rtc_time; ppc_md.calibrate_decr = iSeries_calibrate_decr; ppc_md.progress = iSeries_progress; + ppc_md.idle_loop = iSeries_idle; } Index: ppc64-2.6/arch/ppc64/kernel/maple_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/maple_setup.c +++ ppc64-2.6/arch/ppc64/kernel/maple_setup.c @@ -177,6 +177,8 @@ void __init maple_setup_arch(void) #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif + + printk(KERN_INFO "Using native/NAP idle loop\n"); } /* @@ -297,4 +299,5 @@ struct machdep_calls __initdata maple_md .get_rtc_time = maple_get_rtc_time, .calibrate_decr = generic_calibrate_decr, .progress = maple_progress, + .idle_loop = native_idle, }; Index: ppc64-2.6/arch/ppc64/kernel/pmac_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pmac_setup.c +++ ppc64-2.6/arch/ppc64/kernel/pmac_setup.c @@ -186,6 +186,8 @@ void __init pmac_setup_arch(void) #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif + + printk(KERN_INFO "Using native/NAP idle loop\n"); } #ifdef CONFIG_SCSI @@ -507,5 +509,6 @@ struct machdep_calls __initdata pmac_md .calibrate_decr = pmac_calibrate_decr, .feature_call = pmac_do_feature_call, .progress = pmac_progress, - .check_legacy_ioport = pmac_check_legacy_ioport + .check_legacy_ioport = pmac_check_legacy_ioport, + .idle_loop = native_idle, }; Index: ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- ppc64-2.6.orig/arch/ppc64/kernel/pSeries_setup.c +++ 
ppc64-2.6/arch/ppc64/kernel/pSeries_setup.c @@ -19,6 +19,7 @@ #undef DEBUG #include +#include #include #include #include @@ -82,6 +83,9 @@ int fwnmi_active; /* TRUE if an FWNMI h extern void pSeries_system_reset_exception(struct pt_regs *regs); extern int pSeries_machine_check_exception(struct pt_regs *regs); +static int shared_idle(void); +static int dedicated_idle(void); + static volatile void __iomem * chrp_int_ack_special; struct mpic *pSeries_mpic; @@ -229,6 +233,20 @@ static void __init pSeries_setup_arch(vo if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) vpa_init(boot_cpuid); + + /* Choose an idle loop */ + if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) { + if (get_paca()->lppaca.shared_proc) { + printk(KERN_INFO "Using shared processor idle loop\n"); + ppc_md.idle_loop = shared_idle; + } else { + printk(KERN_INFO "Using dedicated idle loop\n"); + ppc_md.idle_loop = dedicated_idle; + } + } else { + printk(KERN_INFO "Using default idle loop\n"); + ppc_md.idle_loop = default_idle; + } } static int __init pSeries_init_panel(void) From ntl at pobox.com Thu Jul 7 05:11:03 2005 From: ntl at pobox.com (Nathan Lynch) Date: Wed, 6 Jul 2005 14:11:03 -0500 Subject: Help text for memory model config section? Message-ID: <20050706191103.GE1581@otto> Doing a make oldconfig on 2.6.13-rc2: Memory model 1. Flat Memory (FLATMEM_MANUAL) (NEW) > 2. Discontigious Memory (DISCONTIGMEM_MANUAL) (NEW) 3. Sparse Memory (SPARSEMEM_MANUAL) (NEW) choice[1-3]: ? Sorry, no help available for this option yet. :( Can someone please whip up some help text for this? I don't know whether I should pick 2 or 3 for my numa config. 
Also, the correct spelling is "discontiguous" :) Nathan From anton at samba.org Thu Jul 7 05:09:57 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 05:09:57 +1000 Subject: [PATCH] ppc64: silence perfmon exception warnings Message-ID: <20050706190957.GS12786@krispykreme> We dont need to use the PERFMON exception on POWER5, in fact the firmware returns an error. Due to this just remove the warning. Also now that we have proper runlatch support we can remove the bootup hack. Signed-off-by: Anton Blanchard Index: foobar2/arch/ppc64/kernel/sysfs.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/sysfs.c 2005-07-06 09:37:32.435185908 +1000 +++ foobar2/arch/ppc64/kernel/sysfs.c 2005-07-06 14:39:14.198111618 +1000 @@ -112,7 +112,6 @@ unsigned long hid0; #ifdef CONFIG_PPC_PSERIES unsigned long set, reset; - int ret; #endif /* CONFIG_PPC_PSERIES */ /* Only need to enable them once */ @@ -145,11 +144,7 @@ case PLATFORM_PSERIES_LPAR: set = 1UL << 63; reset = 0; - ret = plpar_hcall_norets(H_PERFMON, set, reset); - if (ret) - printk(KERN_ERR "H_PERFMON call on cpu %u " - "returned %d\n", - smp_processor_id(), ret); + plpar_hcall_norets(H_PERFMON, set, reset); break; #endif /* CONFIG_PPC_PSERIES */ @@ -161,13 +156,6 @@ /* instruct hypervisor to maintain PMCs */ if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) get_paca()->lppaca.pmcregs_in_use = 1; - - /* - * On SMT machines we have to set the run latch in the ctrl register - * in order to make PMC6 spin. - */ - if (cpu_has_feature(CPU_FTR_SMT)) - ppc64_runlatch_on(); #endif /* CONFIG_PPC_PSERIES */ } From jschopp at austin.ibm.com Thu Jul 7 05:24:03 2005 From: jschopp at austin.ibm.com (Joel Schopp) Date: Wed, 06 Jul 2005 14:24:03 -0500 Subject: Help text for memory model config section? 
In-Reply-To: <20050706191103.GE1581@otto> References: <20050706191103.GE1581@otto> Message-ID: <42CC2FD3.3080805@austin.ibm.com> > Doing a make oldconfig on 2.6.13-rc2: > > Memory model > 1. Flat Memory (FLATMEM_MANUAL) (NEW) > >>2. Discontigious Memory (DISCONTIGMEM_MANUAL) (NEW) > > 3. Sparse Memory (SPARSEMEM_MANUAL) (NEW) > choice[1-3]: ? > > Sorry, no help available for this option yet. Sorry about the no help part. We'll work on that. > I don't know > whether I should pick 2 or 3 for my numa config. Well, 3 is designed to replace 2 and should be superior in every regard. 2 is there because it is a proven quantity, and 3 isn't yet. > > Also, the correct spelling is "discontiguous" :) > We'll work on that too. From anton at samba.org Thu Jul 7 05:32:38 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 7 Jul 2005 05:32:38 +1000 Subject: Help text for memory model config section? In-Reply-To: <42CC2FD3.3080805@austin.ibm.com> References: <20050706191103.GE1581@otto> <42CC2FD3.3080805@austin.ibm.com> Message-ID: <20050706193238.GT12786@krispykreme> Hi Joel, While I remember, I noticed this: /* * SECTION_SIZE_BITS 2^N: how big each section will be * MAX_PHYSADDR_BITS 2^N: how much physical address space we * have * MAX_PHYSMEM_BITS 2^N: how much memory we can have in that * space */ #define SECTION_SIZE_BITS 24 #define MAX_PHYSADDR_BITS 38 #define MAX_PHYSMEM_BITS 36 At the moment our LMBs are 16MB in size, so SECTION_SIZE_BITS looks ok. However we can have up to 2TB with our current setup (41 bits). On a shared processor box we may end up with all memory in a single node, so I think it should look like: #define SECTION_SIZE_BITS 24 #define MAX_PHYSADDR_BITS 41 #define MAX_PHYSMEM_BITS 41 But I wonder if some structures in sparse.c will grow too large with this change. In the ppc64 numa code we decided to allocate the memory lookup table (which contains a node id per 16MB region) at runtime. Otherwise we had a 256kB array in all kernels. 
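The sizing concern above comes down to simple arithmetic: the number of sections is 2^(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS). A minimal sketch of that calculation follows; the helper names and the `entry_bytes` parameter are illustrative assumptions (the thread does not state the size of a per-section entry in sparse.c), not kernel code.

```c
#include <assert.h>

/* Sections needed to cover 2^physmem_bits bytes of physical memory
 * when each section spans 2^section_bits bytes. */
unsigned long long nr_sections(unsigned physmem_bits, unsigned section_bits)
{
    assert(physmem_bits >= section_bits);
    assert(physmem_bits - section_bits < 64);
    return 1ULL << (physmem_bits - section_bits);
}

/* Size of a statically sized table with one entry per section.
 * entry_bytes is a parameter because the real entry size is not
 * given in the discussion (assumption for illustration only). */
unsigned long long table_bytes(unsigned physmem_bits, unsigned section_bits,
                               unsigned entry_bytes)
{
    return nr_sections(physmem_bits, section_bits) * entry_bytes;
}
```

With 16MB sections (24 bits), 36 bits of physical memory gives 4096 sections, 41 bits (2TB) gives 131072, and 44 bits (16TB) gives 1048576, so any statically sized per-section array grows by the same factor.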
FYI we might be going to 16TB soon. Anton From kravetz at us.ibm.com Thu Jul 7 05:53:10 2005 From: kravetz at us.ibm.com (Mike Kravetz) Date: Wed, 6 Jul 2005 12:53:10 -0700 Subject: [Lhms-devel] Re: Help text for memory model config section? In-Reply-To: <20050706193238.GT12786@krispykreme> References: <20050706191103.GE1581@otto> <42CC2FD3.3080805@austin.ibm.com> <20050706193238.GT12786@krispykreme> Message-ID: <20050706195310.GA3965@w-mikek2.ibm.com> On Thu, Jul 07, 2005 at 05:32:38AM +1000, Anton Blanchard wrote: > At the moment our LMBs are 16MB in size, so SECTION_SIZE_BITS looks ok. I think someone once told me that LMB size is 'adjusted' based on the amount of memory in the system? The more memory, the bigger the LMBs. But, it was possible to have LMBs as small as 16MB. Does that sound correct? As a data point, my OpenPower 720 with 8GB of memory has 32MB LMBs. -- Mike From haveblue at us.ibm.com Thu Jul 7 06:22:08 2005 From: haveblue at us.ibm.com (Dave Hansen) Date: Wed, 06 Jul 2005 13:22:08 -0700 Subject: [Lhms-devel] Re: Help text for memory model config section? In-Reply-To: <20050706193238.GT12786@krispykreme> References: <20050706191103.GE1581@otto> <42CC2FD3.3080805@austin.ibm.com> <20050706193238.GT12786@krispykreme> Message-ID: <1120681328.5741.24.camel@localhost> On Thu, 2005-07-07 at 05:32 +1000, Anton Blanchard wrote: > But I wonder if some structures in sparse.c will grow too large with this > change. In the ppc64 numa code we decided to allocate the memory lookup > table (which contains a node id per 16MB region) at runtime. Otherwise > we had a 256kB array in all kernels. It's not a big deal to go and free up the bits of that which we don't use, especially if there's not going to be any memory hotplug going on. However, if you want memory hotplug, I think 256k at boot-time in .bss is a pretty small price to pay if it gets you up anywhere from 256MB to 2TB of memory. 
-- Dave

From anton at samba.org Thu Jul 7 10:40:07 2005
From: anton at samba.org (Anton Blanchard)
Date: Thu, 7 Jul 2005 10:40:07 +1000
Subject: [Lhms-devel] Re: Help text for memory model config section?
In-Reply-To: <20050706195310.GA3965@w-mikek2.ibm.com>
References: <20050706191103.GE1581@otto> <42CC2FD3.3080805@austin.ibm.com> <20050706193238.GT12786@krispykreme> <20050706195310.GA3965@w-mikek2.ibm.com>
Message-ID: <20050707004007.GB14803@krispykreme>

Hi Mike,

> I think someone once told me that LMB size is 'adjusted' based on the
> amount of memory in the system? The more memory, the bigger the LMBs.
> But, it was possible to have LMBs as small as 16MB. Does that sound
> correct?
>
> As a data point, my OpenPower 720 with 8GB of memory has 32MB LMBs.

Definitely, we would expect the big memory layouts to have 256MB LMBs.
However it looks like sparsemem uses compile-time constants to work out
the layout of the structures.

Anton

From anton at samba.org Thu Jul 7 10:38:15 2005
From: anton at samba.org (Anton Blanchard)
Date: Thu, 7 Jul 2005 10:38:15 +1000
Subject: [Lhms-devel] Re: Help text for memory model config section?
In-Reply-To: <1120681328.5741.24.camel@localhost>
References: <20050706191103.GE1581@otto> <42CC2FD3.3080805@austin.ibm.com> <20050706193238.GT12786@krispykreme> <1120681328.5741.24.camel@localhost>
Message-ID: <20050707003814.GA14803@krispykreme>

Hi,

> It's not a big deal to go and free up the bits of that which we don't
> use, especially if there's not going to be any memory hotplug going on.

Yep, we have tried to keep the bloat of the generic kernel with NUMA
down so it's acceptable to a G5 user. It would be nice to only allocate
that 256kB when required.

Seeing that sparsemem and the ppc64 NUMA code keep a similar structure
indexed by address, does it make sense to consolidate that and keep the
numa lookup info in the sparsemem structure?
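The idea of allocating the lookup table only when required can be illustrated in userland. The sketch below sizes a node-id table from the memory actually present rather than from a compile-time maximum, and indexes it by 16MB region as described in the thread. All names here (`struct node_table`, `node_table_init`, and so on) and the one-byte entry width are hypothetical; this is not the ppc64 NUMA code or the sparsemem implementation.

```c
#include <stdint.h>
#include <stdlib.h>

#define REGION_SHIFT 24 /* one entry per 16MB region, as in the thread */

struct node_table {
    uint8_t *node;     /* node id per region; 1-byte width is an assumption */
    size_t nr_regions; /* number of entries actually allocated */
};

/* Allocate a zeroed table covering [0, max_addr). Returns 0 on success. */
int node_table_init(struct node_table *t, uint64_t max_addr)
{
    uint64_t mask = (1ULL << REGION_SHIFT) - 1;

    /* Round up to whole regions without overflowing max_addr. */
    t->nr_regions = (size_t)((max_addr >> REGION_SHIFT) +
                             ((max_addr & mask) != 0));
    t->node = calloc(t->nr_regions ? t->nr_regions : 1, sizeof(*t->node));
    if (!t->node) {
        t->nr_regions = 0;
        return -1;
    }
    return 0;
}

/* Record that every region starting in [start, end) belongs to node nid. */
void node_table_set(struct node_table *t, uint64_t start, uint64_t end,
                    uint8_t nid)
{
    uint64_t r;

    for (r = start >> REGION_SHIFT;
         r < (end >> REGION_SHIFT) && r < t->nr_regions; r++)
        t->node[r] = nid;
}

/* Return the node id for addr, or -1 if addr is outside the table. */
int node_table_lookup(const struct node_table *t, uint64_t addr)
{
    uint64_t r = addr >> REGION_SHIFT;

    return r < t->nr_regions ? t->node[r] : -1;
}

void node_table_free(struct node_table *t)
{
    free(t->node);
    t->node = NULL;
    t->nr_regions = 0;
}
```

For an 8GB machine this allocates 512 entries instead of a table sized for the architectural maximum; the trade-off raised in the thread is that memory hotplug would then need a way to grow the table.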
> However, if you want memory hotplug, I think 256k at boot-time in .bss
> is a pretty small price to pay if it gets you up anywhere from 256MB to
> 2TB of memory.

I tested sparsemem for the 16TB case:

 #define SECTION_SIZE_BITS 24
-#define MAX_PHYSADDR_BITS 38
-#define MAX_PHYSMEM_BITS 36
+#define MAX_PHYSADDR_BITS 44
+#define MAX_PHYSMEM_BITS 44

   text    data     bss      dec     hex filename
5337004 3017328  579744  8934076  8852bc 1/vmlinux
5337032 3017328 8935584 17289944 107d2d8 2/vmlinux

So bumping the bits from 38 to 44 increased the kernel to ~17MB.

Anton

From david at gibson.dropbear.id.au Thu Jul 7 15:55:54 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Thu, 7 Jul 2005 15:55:54 +1000
Subject: RFC: Hugepage COW
Message-ID: <20050707055554.GC11246@localhost.localdomain>

Now that the hugepage code has been consolidated across the
architectures, it becomes much easier to implement copy-on-write.
Hugepage COW is of limited utility in itself; however, it is
essentially a prerequisite for any of a number of methods of allowing
userland programs to automatically use hugepages without code changes,
e.g. hugepage malloc() libraries, implicit hugepage mmap(), hugepage
ELF segments. For certain applications (particularly enormous HPC
FORTRAN programs), these can result in a large performance
improvement.

Thoughts? Flames?

This patch implements copy-on-write for hugepages, thus allowing
MAP_PRIVATE|MAP_WRITE mappings of hugetlbfs. Because the pool of
hugepages is limited, a write to a MAP_PRIVATE hugepage region may
result in a SIGBUS, if a new hugepage cannot be allocated. This patch
is currently broken on sparc64, sh and sh64 (anything with
ARCH_HAS_SETCLEAR_HUGE_PTE) - that will need to be fixed, obviously.
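From userland, the kind of mapping this patch is meant to permit would look roughly like the sketch below: a private, writable mmap() of a file on hugetlbfs ("MAP_WRITE" in the description presumably refers to PROT_WRITE combined with MAP_PRIVATE). The helper name `map_private_huge` is hypothetical, and the caller is assumed to pass a path on a mounted hugetlbfs and a length that is a multiple of the huge page size; neither is checked here.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of a hugetlbfs file privately and writable.
 * Returns the mapping, or NULL if the file cannot be opened or the
 * kernel refuses the mapping. Hypothetical helper for illustration. */
void *map_private_huge(const char *path, size_t len)
{
    void *p;
    int fd = open(path, O_CREAT | O_RDWR, 0600);

    if (fd < 0)
        return NULL;

    /* MAP_PRIVATE + PROT_WRITE: writes must not reach the file, so the
     * kernel has to give this process its own copy of a page on write. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : p;
}
```

As the description notes, a successful mmap() is not a guarantee: if no free hugepage is available when a write needs a private copy, the process receives SIGBUS at the time of the write, so a careful caller would install a SIGBUS handler or reserve enough hugepages in advance.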
Index: working-2.6/mm/hugetlb.c =================================================================== --- working-2.6.orig/mm/hugetlb.c 2005-06-23 10:10:30.000000000 +1000 +++ working-2.6/mm/hugetlb.c 2005-06-23 11:35:22.000000000 +1000 @@ -253,11 +253,12 @@ .nopage = hugetlb_nopage, }; -static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page) +static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page, + int writable) { pte_t entry; - if (vma->vm_flags & VM_WRITE) { + if (writable) { entry = pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot))); } else { @@ -276,6 +277,9 @@ struct page *ptepage; unsigned long addr = vma->vm_start; unsigned long end = vma->vm_end; + int cow; + + cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE; while (addr < end) { dst_pte = huge_pte_alloc(dst, addr); @@ -283,6 +287,10 @@ goto nomem; src_pte = huge_pte_offset(src, addr); BUG_ON(!src_pte || pte_none(*src_pte)); /* prefaulted */ + + if (cow) + huge_ptep_set_wrprotect(src, addr, src_pte); + entry = *src_pte; ptepage = pte_page(entry); get_page(ptepage); @@ -334,6 +342,7 @@ struct mm_struct *mm = current->mm; unsigned long addr; int ret = 0; + int writable; WARN_ON(!is_vm_hugetlb_page(vma)); BUG_ON(vma->vm_start & ~HPAGE_MASK); @@ -342,6 +351,7 @@ hugetlb_prefault_arch_hook(mm); spin_lock(&mm->page_table_lock); + for (addr = vma->vm_start; addr < vma->vm_end; addr += HPAGE_SIZE) { unsigned long idx; pte_t *pte = huge_pte_alloc(mm, addr); @@ -369,17 +379,35 @@ ret = -ENOMEM; goto out; } - ret = add_to_page_cache(page, mapping, idx, GFP_ATOMIC); - if (! ret) { + + /* This is a new page, all full of zeroes. If + * we're MAP_SHARED, the page needs to go into + * the page cache. If it's MAP_PRIVATE it + * might as well be made "anonymous" now or + * we'll just have to copy it on the first + * write. 
*/ + if (vma->vm_flags & VM_SHARED) { + ret = add_to_page_cache(page, mapping, idx, + GFP_ATOMIC); + if (ret) { + hugetlb_put_quota(mapping); + free_huge_page(page); + goto out; + } + unlock_page(page); - } else { - hugetlb_put_quota(mapping); - free_huge_page(page); - goto out; } + + writable = vma->vm_flags & VM_WRITE; + } else { + /* Existing page in page cache. Can only + * allow writes if mapping is both writable + * and shared */ + writable = (vma->vm_flags & VM_SHARED) + && (vma->vm_flags & VM_WRITE); } add_mm_counter(mm, rss, HPAGE_SIZE / PAGE_SIZE); - set_huge_pte_at(mm, addr, pte, make_huge_pte(vma, page)); + set_huge_pte_at(mm, addr, pte, make_huge_pte(vma, page, writable)); } out: spin_unlock(&mm->page_table_lock); @@ -433,3 +461,91 @@ return i; } + +static int hugepage_cow(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, pte_t *ptep, pte_t pte) +{ + struct page *old_page, *new_page; + int i; + + old_page = pte_page(*ptep); + + /* If no-one else is actually using this page, avoid the copy + * and just make the page writable */ + if (!TestSetPageLocked(old_page)) { + int avoidcopy = (page_count(old_page) == 1); + unlock_page(old_page); + if (avoidcopy) { + set_huge_ptep_writable(vma, address, ptep); + spin_unlock(&mm->page_table_lock); + return VM_FAULT_MINOR; + } + } + + page_cache_get(old_page); + + spin_unlock(&mm->page_table_lock); + + new_page = alloc_huge_page(); + + if (! new_page) { + page_cache_release(old_page); + + /* Logically this is OOM, not a SIGBUS, but an OOM + * could cause the kernel to go killing other + * processes which won't help the hugepage situation + * at all (?) 
*/ + return VM_FAULT_SIGBUS; + } + + for (i = 0; i < HPAGE_SIZE/PAGE_SIZE; i++) + copy_user_highpage(new_page + i, old_page + i, + address + i*PAGE_SIZE); + + spin_lock(&mm->page_table_lock); + + ptep = huge_pte_offset(mm, address & HPAGE_MASK); + if (pte_same(*ptep, pte)) { + /* Break COW */ + set_huge_pte_at(mm, address, ptep, + make_huge_pte(vma, new_page, 1)); + + /* Make the old page be freed below */ + new_page = old_page; + } + page_cache_release(new_page); + page_cache_release(old_page); + spin_unlock(&mm->page_table_lock); + return VM_FAULT_MINOR; +} + +int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long address, int write_access) +{ + pte_t *ptep; + int rc = VM_FAULT_SIGBUS; + + spin_lock(&mm->page_table_lock); + + ptep = huge_pte_offset(mm, address & HPAGE_MASK); + + if ( (! ptep) || pte_none(*ptep)) + goto fail; + + rc = VM_FAULT_MINOR; + + if (! (write_access && !pte_write(*ptep))) { + printk(KERN_WARNING "Unexpected hugepte fault (wr=%d hugepte=%08lx\n", + write_access, pte_val(*ptep)); + goto fail; + } + + /* The only faults we should actually get are COWs */ + /* this drops the page_table_lock */ + return hugepage_cow(mm, vma, address, ptep, *ptep); + + fail: + spin_unlock(&mm->page_table_lock); + + return rc; +} Index: working-2.6/mm/memory.c =================================================================== --- working-2.6.orig/mm/memory.c 2005-06-23 10:10:30.000000000 +1000 +++ working-2.6/mm/memory.c 2005-06-23 11:28:53.000000000 +1000 @@ -2019,7 +2019,7 @@ inc_page_state(pgfault); if (is_vm_hugetlb_page(vma)) - return VM_FAULT_SIGBUS; /* mapping truncation does this. 
*/ + return hugetlb_fault(mm, vma, address, write_access); /* * We need the page table lock to synchronize with kswapd Index: working-2.6/include/linux/hugetlb.h =================================================================== --- working-2.6.orig/include/linux/hugetlb.h 2005-06-23 10:10:30.000000000 +1000 +++ working-2.6/include/linux/hugetlb.h 2005-06-23 11:32:20.000000000 +1000 @@ -25,6 +25,8 @@ unsigned long hugetlb_total_pages(void); struct page *alloc_huge_page(void); void free_huge_page(struct page *); +int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct * vma, + unsigned long address, int write_access); extern unsigned long max_huge_pages; extern const unsigned long hugetlb_zero, hugetlb_infinity; @@ -57,11 +59,26 @@ #ifndef ARCH_HAS_SETCLEAR_HUGE_PTE #define set_huge_pte_at(mm, addr, ptep, pte) set_pte_at(mm, addr, ptep, pte) #define huge_ptep_get_and_clear(mm, addr, ptep) ptep_get_and_clear(mm, addr, ptep) +#define huge_ptep_set_wrprotect(mm, addr, ptep) ptep_set_wrprotect(mm, addr, ptep) +static inline void set_huge_ptep_writable(struct vm_area_struct *vma, + unsigned long address, + pte_t *ptep) +{ + pte_t entry; + + entry = pte_mkwrite(pte_mkdirty(*ptep)); + ptep_set_access_flags(vma, address, ptep, entry, 1); + update_mmu_cache(vma, address, entry); +} #else void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte); pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep); +void huge_ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, + pte_t *ptep); +void set_huge_ptep_writable(struct vm_struct *vm, unsigned long address, + pte_t *ptep); #endif #ifndef ARCH_HAS_HUGETLB_PREFAULT_HOOK Index: working-2.6/fs/hugetlbfs/inode.c =================================================================== --- working-2.6.orig/fs/hugetlbfs/inode.c 2005-06-23 10:10:30.000000000 +1000 +++ working-2.6/fs/hugetlbfs/inode.c 2005-06-23 11:28:54.000000000 +1000 @@ -52,9 +52,6 @@ loff_t len, 
vma_len; int ret; - if ((vma->vm_flags & (VM_MAYSHARE | VM_WRITE)) == VM_WRITE) - return -EINVAL; - if (vma->vm_pgoff & (HPAGE_SIZE / PAGE_SIZE - 1)) return -EINVAL; -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From avi at argo.co.il Thu Jul 7 16:49:27 2005 From: avi at argo.co.il (Avi Kivity) Date: Thu, 07 Jul 2005 09:49:27 +0300 Subject: RFC: Hugepage COW In-Reply-To: <20050707055554.GC11246@localhost.localdomain> References: <20050707055554.GC11246@localhost.localdomain> Message-ID: <1120718967.2989.7.camel@blast.q> On Thu, 2005-07-07 at 15:55 +1000, David Gibson wrote: > MAP_PRIVATE|MAP_WRITE mappings of hugetlbfs. Because the pool of > hugepages is limited, a write to a MAP_PRIVATE hugepage region may > result in a SIGBUS, if a new hugepage cannot be allocated. This patch in that case you might allocate regular pages for the new copy. From olh at suse.de Thu Jul 7 19:19:55 2005 From: olh at suse.de (Olaf Hering) Date: Thu, 7 Jul 2005 11:19:55 +0200 Subject: p660 RIO failure Message-ID: <20050707091955.GA18318@suse.de> Any idea what the "Detail:" code means here? It is not listed in the 380566.pdf, the cables appear to be ok. 1. 
07/07/2005 08:35:27 Service Processor Firmware Failure Error code: B1014602 Detail: 6013 SRC -------------------------------------------------------------- word11: B1014602 word12: 0230005D word13: 60132014 word14: 00000000 word15: 00000700 word16: 0000A05A word17: 00000000 word18: 00004000 word19: F444E060 B1014602
From david at gibson.dropbear.id.au Thu Jul 7 19:24:25 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 7 Jul 2005 19:24:25 +1000 Subject: RFC: Hugepage COW In-Reply-To: <1120718967.2989.7.camel@blast.q> References: <20050707055554.GC11246@localhost.localdomain> <1120718967.2989.7.camel@blast.q> Message-ID: <20050707092425.GA10044@localhost.localdomain> On Thu, Jul 07, 2005 at 09:49:27AM +0300, Avi Kivity wrote: > On Thu, 2005-07-07 at 15:55 +1000, David Gibson wrote: > > > MAP_PRIVATE|MAP_WRITE mappings of hugetlbfs. Because the pool of > > hugepages is limited, a write to a MAP_PRIVATE hugepage region may > > result in a SIGBUS, if a new hugepage cannot be allocated. This patch > > in that case you might allocate regular pages for the new copy. That's not necessarily possible. On some archs - ppc64 for one - the mmu has to be set up for hugepages on a granularity greater than the hugepage size. So you can just arbitrarily substitute normal pages for hugepages. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From sfr at canb.auug.org.au Thu Jul 7 22:53:25 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 7 Jul 2005 22:53:25 +1000 Subject: RFC: Hugepage COW In-Reply-To: <20050707092425.GA10044@localhost.localdomain> References: <20050707055554.GC11246@localhost.localdomain> <1120718967.2989.7.camel@blast.q> <20050707092425.GA10044@localhost.localdomain> Message-ID: <20050707225325.082ced8f.sfr@canb.auug.org.au> On Thu, 7 Jul 2005 19:24:25 +1000 David Gibson wrote: > > That's not necessarily possible.
On some archs - ppc64 for one - > the mmu has to be set up for hugepages on a granularity greater than > the hugepage size. So you can just arbitrarily substitute normal ^^^ presumably you meant "cannot" > pages for hugepages. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050707/ffe42c09/attachment.pgp From haveblue at us.ibm.com Fri Jul 8 01:12:31 2005 From: haveblue at us.ibm.com (Dave Hansen) Date: Thu, 07 Jul 2005 08:12:31 -0700 Subject: [Lhms-devel] Re: Help text for memory model config section? In-Reply-To: <20050707004007.GB14803@krispykreme> References: <20050706191103.GE1581@otto> <42CC2FD3.3080805@austin.ibm.com> <20050706193238.GT12786@krispykreme> <20050706195310.GA3965@w-mikek2.ibm.com> <20050707004007.GB14803@krispykreme> Message-ID: <1120749151.5829.24.camel@localhost> On Thu, 2005-07-07 at 10:40 +1000, Anton Blanchard wrote: > > I think someone once told me that LMB size is 'adjusted' based on the > > amount of memory in the system? The more memory, the bigger the LMBs. > > But, it was possible to have LMBs as small as 16MB. Does that sound > > correct? > > > > As a data point, my OpenPower 720 with 8GB of memory has 32MB LMBs. > > Definitely, we would expect the big memory layouts to have 256MB LMBs. > However it looks like sparsemem uses compile time constants to work out > the layout of the structures. It does, but the root reason for this is that we encode the section number into the page->flags, which are statically allocated. However, we could probably include a global, variable section shift to globally increase the section size. But, I don't think it's really worth the complexity. 
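As a rough illustration of the point about page->flags (a userspace sketch only, not the kernel's sparsemem code; demo_page, the 10-bit field width and the 16MB section size are all assumptions made up for the example): both the position of the section number inside the flags word and the pfn-to-section shift are compile-time constants here, which is why the section size cannot simply follow whatever LMB size is found at boot.

```c
#include <assert.h>
#include <limits.h>

/* All names and widths below are hypothetical, chosen for illustration. */
#define DEMO_PAGE_SHIFT        12UL   /* assumed 4KB base pages */
#define DEMO_SECTION_SIZE_BITS 24UL   /* assumed 16MB sections */
#define DEMO_SECTIONS_WIDTH    10UL   /* assumed: up to 1024 sections */

#define DEMO_FLAGS_BITS     (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_SECTIONS_PGOFF (DEMO_FLAGS_BITS - DEMO_SECTIONS_WIDTH)
#define DEMO_SECTIONS_MASK  ((1UL << DEMO_SECTIONS_WIDTH) - 1)

struct demo_page {
	unsigned long flags;	/* low bits: ordinary flags; top bits: section */
};

/* Map a page frame number to its section with a constant shift. */
static unsigned long demo_pfn_to_section(unsigned long pfn)
{
	return pfn >> (DEMO_SECTION_SIZE_BITS - DEMO_PAGE_SHIFT);
}

/* Store the section number in the top bits, leaving the other bits alone. */
static void demo_set_page_section(struct demo_page *page, unsigned long section)
{
	assert(section <= DEMO_SECTIONS_MASK);
	page->flags &= ~(DEMO_SECTIONS_MASK << DEMO_SECTIONS_PGOFF);
	page->flags |= section << DEMO_SECTIONS_PGOFF;
}

/* Recover the section number from the flags word. */
static unsigned long demo_page_to_section(const struct demo_page *page)
{
	return (page->flags >> DEMO_SECTIONS_PGOFF) & DEMO_SECTIONS_MASK;
}
```

A variable section shift would replace DEMO_SECTION_SIZE_BITS with a value read at run time, while the field inside flags would keep its fixed width.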
The ia64 folks require a two-level table instead of a flat array, and I think we'll just do that for ppc64 as well. -- Dave From brking at us.ibm.com Thu Jul 7 23:28:56 2005 From: brking at us.ibm.com (Brian King) Date: Thu, 07 Jul 2005 08:28:56 -0500 Subject: [PATCH 3/13]: PCI Err: IPR scsi device driver recovery In-Reply-To: <20050628235839.GA6362@austin.ibm.com> References: <20050628235839.GA6362@austin.ibm.com> Message-ID: <42CD2E18.40608@us.ibm.com> Linas Vepstas wrote: > +/** This routine is called when the PCI bus has permanently > + * failed. This routine should purge all pending I/O and > + * shut down the device driver (close and unload). > + * XXX Needs to be implemented. > + */ > +static void ipr_eeh_perm_failure (struct pci_dev *pdev) > +{ > +#if 0 // XXXXXXXXXXXXXXXXXXXXXXX > + ipr_cmd->job_step = ipr_reset_shutdown_ioa; > + rc = IPR_RC_JOB_CONTINUE; > +#endif > +} What were your plans here? What can the device driver rely on here? Are interrupts disabled? Will pci config accesses all fail? Should the driver attempt to talk to the pci adapter at all, or should it simply clean up after it? -- Brian King eServer Storage I/O IBM Linux Technology Center From greg at kroah.com Fri Jul 8 04:41:02 2005 From: greg at kroah.com (Greg KH) Date: Thu, 7 Jul 2005 11:41:02 -0700 Subject: [PATCH 2.6.13-rc1 01/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB63B2.6000505@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> Message-ID: <20050707184102.GC14726@kroah.com> On Wed, Jul 06, 2005 at 01:53:06PM +0900, Hidetoshi Seto wrote: > Hi all, > > The followings are updated version of patches I've posted to > implement IOCHK interface for I/O error handling/detecting. > > The abstraction of patches hasn't changed, so please refer > archives if you need, e.g.: http://lwn.net/Articles/139240/ How about the issue of tying this into the other pci error reporting infrastructure that is being worked on? 
thanks, greg k-h From benh at kernel.crashing.org Fri Jul 8 08:27:18 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 08 Jul 2005 08:27:18 +1000 Subject: [PATCH 2.6.13-rc1 01/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <20050707184102.GC14726@kroah.com> References: <42CB63B2.6000505@jp.fujitsu.com> <20050707184102.GC14726@kroah.com> Message-ID: <1120775239.31924.262.camel@gaston> On Thu, 2005-07-07 at 11:41 -0700, Greg KH wrote: > On Wed, Jul 06, 2005 at 01:53:06PM +0900, Hidetoshi Seto wrote: > > Hi all, > > > > The followings are updated version of patches I've posted to > > implement IOCHK interface for I/O error handling/detecting. > > > > The abstraction of patches hasn't changed, so please refer > > archives if you need, e.g.: http://lwn.net/Articles/139240/ > > How about the issue of tying this into the other pci error reporting > infrastructure that is being worked on? The other infrastructure is for asynchronous reporting and recovery. We still need synchronous detection & reporting. So this is a bit different. However, it would be nice if Hidetoshi's work could be adapted a bit so that 1) naming is a bit more consistent with the other stuff (pcierr_* maybe) and 2) the error "token" is the same. The latter is especially important if we start adding ways to query the error token to know what the error precisely was etc... There is no reason to have 2 different ways of representing error details.
Ben From david at gibson.dropbear.id.au Fri Jul 8 12:44:31 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 8 Jul 2005 12:44:31 +1000 Subject: RFC: Hugepage COW In-Reply-To: <20050707225325.082ced8f.sfr@canb.auug.org.au> References: <20050707055554.GC11246@localhost.localdomain> <1120718967.2989.7.camel@blast.q> <20050707092425.GA10044@localhost.localdomain> <20050707225325.082ced8f.sfr@canb.auug.org.au> Message-ID: <20050708024431.GB30761@localhost.localdomain> On Thu, Jul 07, 2005 at 10:53:25PM +1000, Stephen Rothwell wrote: > On Thu, 7 Jul 2005 19:24:25 +1000 David Gibson wrote: > > > > That's not necessarily possible. On some archs - ppc64 for one - > > the mmu has to be set up for hugepages on a granularity greater than > > the hugepage size. So you can just arbitrarily substitute normal > ^^^ > presumably you meant "cannot" Oops.. yes, indeed. > > pages for hugepages. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050708/bc83f7ce/attachment.pgp From dmosberger at gmail.com Fri Jul 8 14:37:15 2005 From: dmosberger at gmail.com (david mosberger) Date: Thu, 7 Jul 2005 21:37:15 -0700 Subject: [PATCH 2.6.13-rc1 07/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB6961.2060508@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> <42CB6961.2060508@jp.fujitsu.com> Message-ID: On 7/5/05, Hidetoshi Seto wrote: > - could anyone write same barrier for intel compiler? > Tony or David, could you help me? I think it might be best to make ia64_mca_barrier() a proper subroutine written in assembly code. 
Yes, that costs some time, but we're talking about wasting 1,000+ cycles just to consume the value read via readX(), so the call-overhead is actually overlapped and completely trivial. --david From david at gibson.dropbear.id.au Fri Jul 8 14:46:54 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 8 Jul 2005 14:46:54 +1000 Subject: [PPC64] Kill bitfields in ppc64 hash code Message-ID: <20050708044653.GC30761@localhost.localdomain> Andrew, please apply: This patch removes the use of bitfield types from the ppc64 hash table manipulation code. Signed-off-by: David Gibson arch/ppc64/kernel/iSeries_htab.c | 50 +++++-------- arch/ppc64/kernel/pSeries_lpar.c | 49 ++++-------- arch/ppc64/mm/hash_low.S | 8 -- arch/ppc64/mm/hash_native.c | 129 +++++++++++++++------------------- arch/ppc64/mm/hash_utils.c | 16 ++-- arch/ppc64/mm/hugetlbpage.c | 16 ++-- arch/ppc64/mm/init.c | 7 + include/asm-ppc64/iSeries/HvCallHpt.h | 11 +- include/asm-ppc64/machdep.h | 6 - include/asm-ppc64/mmu.h | 83 +++++++-------------- 10 files changed, 157 insertions(+), 218 deletions(-) Index: working-2.6/arch/ppc64/kernel/iSeries_htab.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/iSeries_htab.c 2005-06-08 15:37:37.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/iSeries_htab.c 2005-07-08 13:06:14.000000000 +1000 @@ -38,11 +38,12 @@ } static long iSeries_hpte_insert(unsigned long hpte_group, unsigned long va, - unsigned long prpn, int secondary, - unsigned long hpteflags, int bolted, int large) + unsigned long prpn, unsigned long vflags, + unsigned long rflags) { long slot; - HPTE lhpte; + hpte_t lhpte; + int secondary = 0; /* * The hypervisor tries both primary and secondary. @@ -50,13 +51,13 @@ * it means we have already tried both primary and secondary, * so we return failure immediately. 
*/ - if (secondary) + if (vflags & HPTE_V_SECONDARY) return -1; iSeries_hlock(hpte_group); slot = HvCallHpt_findValid(&lhpte, va >> PAGE_SHIFT); - BUG_ON(lhpte.dw0.dw0.v); + BUG_ON(lhpte.v & HPTE_V_VALID); if (slot == -1) { /* No available entry found in either group */ iSeries_hunlock(hpte_group); @@ -64,19 +65,13 @@ } if (slot < 0) { /* MSB set means secondary group */ + vflags |= HPTE_V_SECONDARY; secondary = 1; slot &= 0x7fffffffffffffff; } - lhpte.dw1.dword1 = 0; - lhpte.dw1.dw1.rpn = physRpn_to_absRpn(prpn); - lhpte.dw1.flags.flags = hpteflags; - - lhpte.dw0.dword0 = 0; - lhpte.dw0.dw0.avpn = va >> 23; - lhpte.dw0.dw0.h = secondary; - lhpte.dw0.dw0.bolted = bolted; - lhpte.dw0.dw0.v = 1; + lhpte.v = (va >> 23) << HPTE_V_AVPN_SHIFT | vflags | HPTE_V_VALID; + lhpte.r = (physRpn_to_absRpn(prpn) << HPTE_R_RPN_SHIFT) | rflags; /* Now fill in the actual HPTE */ HvCallHpt_addValidate(slot, secondary, &lhpte); @@ -89,19 +84,17 @@ static unsigned long iSeries_hpte_getword0(unsigned long slot) { unsigned long dword0; - HPTE hpte; + hpte_t hpte; HvCallHpt_get(&hpte, slot); - dword0 = hpte.dw0.dword0; - - return dword0; + return hpte.v; } static long iSeries_hpte_remove(unsigned long hpte_group) { unsigned long slot_offset; int i; - HPTE lhpte; + unsigned long hpte_v; /* Pick a random slot to start at */ slot_offset = mftb() & 0x7; @@ -109,10 +102,9 @@ iSeries_hlock(hpte_group); for (i = 0; i < HPTES_PER_GROUP; i++) { - lhpte.dw0.dword0 = - iSeries_hpte_getword0(hpte_group + slot_offset); + hpte_v = iSeries_hpte_getword0(hpte_group + slot_offset); - if (!lhpte.dw0.dw0.bolted) { + if (! 
(hpte_v & HPTE_V_BOLTED)) { HvCallHpt_invalidateSetSwBitsGet(hpte_group + slot_offset, 0, 0); iSeries_hunlock(hpte_group); @@ -137,13 +129,13 @@ static long iSeries_hpte_updatepp(unsigned long slot, unsigned long newpp, unsigned long va, int large, int local) { - HPTE hpte; + hpte_t hpte; unsigned long avpn = va >> 23; iSeries_hlock(slot); HvCallHpt_get(&hpte, slot); - if ((hpte.dw0.dw0.avpn == avpn) && (hpte.dw0.dw0.v)) { + if ((HPTE_V_AVPN_VAL(hpte.v) == avpn) && (hpte.v & HPTE_V_VALID)) { /* * Hypervisor expects bits as NPPP, which is * different from how they are mapped in our PP. @@ -167,7 +159,7 @@ */ static long iSeries_hpte_find(unsigned long vpn) { - HPTE hpte; + hpte_t hpte; long slot; /* @@ -177,7 +169,7 @@ * 0x80000000xxxxxxxx : Entry found in secondary group, slot x */ slot = HvCallHpt_findValid(&hpte, vpn); - if (hpte.dw0.dw0.v) { + if (hpte.v & HPTE_V_VALID) { if (slot < 0) { slot &= 0x7fffffffffffffff; slot = -slot; @@ -212,7 +204,7 @@ static void iSeries_hpte_invalidate(unsigned long slot, unsigned long va, int large, int local) { - HPTE lhpte; + unsigned long hpte_v; unsigned long avpn = va >> 23; unsigned long flags; @@ -220,9 +212,9 @@ iSeries_hlock(slot); - lhpte.dw0.dword0 = iSeries_hpte_getword0(slot); + hpte_v = iSeries_hpte_getword0(slot); - if ((lhpte.dw0.dw0.avpn == avpn) && lhpte.dw0.dw0.v) + if ((HPTE_V_AVPN_VAL(hpte_v) == avpn) && (hpte_v & HPTE_V_VALID)) HvCallHpt_invalidateSetSwBitsGet(slot, 0, 0); iSeries_hunlock(slot); Index: working-2.6/arch/ppc64/kernel/pSeries_lpar.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/pSeries_lpar.c 2005-06-08 15:37:37.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/pSeries_lpar.c 2005-07-08 13:06:14.000000000 +1000 @@ -277,31 +277,20 @@ long pSeries_lpar_hpte_insert(unsigned long hpte_group, unsigned long va, unsigned long prpn, - int secondary, unsigned long hpteflags, - int bolted, int large) + unsigned long vflags, unsigned long 
rflags) { unsigned long arpn = physRpn_to_absRpn(prpn); unsigned long lpar_rc; unsigned long flags; unsigned long slot; - HPTE lhpte; + unsigned long hpte_v, hpte_r; unsigned long dummy0, dummy1; - /* Fill in the local HPTE with absolute rpn, avpn and flags */ - lhpte.dw1.dword1 = 0; - lhpte.dw1.dw1.rpn = arpn; - lhpte.dw1.flags.flags = hpteflags; - - lhpte.dw0.dword0 = 0; - lhpte.dw0.dw0.avpn = va >> 23; - lhpte.dw0.dw0.h = secondary; - lhpte.dw0.dw0.bolted = bolted; - lhpte.dw0.dw0.v = 1; - - if (large) { - lhpte.dw0.dw0.l = 1; - lhpte.dw0.dw0.avpn &= ~0x1UL; - } + hpte_v = ((va >> 23) << HPTE_V_AVPN_SHIFT) | vflags | HPTE_V_VALID; + if (vflags & HPTE_V_LARGE) + hpte_v &= ~(1UL << HPTE_V_AVPN_SHIFT); + + hpte_r = (arpn << HPTE_R_RPN_SHIFT) | rflags; /* Now fill in the actual HPTE */ /* Set CEC cookie to 0 */ @@ -312,11 +301,11 @@ flags = 0; /* XXX why is this here? - Anton */ - if (hpteflags & (_PAGE_GUARDED|_PAGE_NO_CACHE)) - lhpte.dw1.flags.flags &= ~_PAGE_COHERENT; + if (rflags & (_PAGE_GUARDED|_PAGE_NO_CACHE)) + hpte_r &= ~_PAGE_COHERENT; - lpar_rc = plpar_hcall(H_ENTER, flags, hpte_group, lhpte.dw0.dword0, - lhpte.dw1.dword1, &slot, &dummy0, &dummy1); + lpar_rc = plpar_hcall(H_ENTER, flags, hpte_group, hpte_v, + hpte_r, &slot, &dummy0, &dummy1); if (unlikely(lpar_rc == H_PTEG_Full)) return -1; @@ -332,7 +321,7 @@ /* Because of iSeries, we have to pass down the secondary * bucket bit here as well */ - return (slot & 7) | (secondary << 3); + return (slot & 7) | (!!(vflags & HPTE_V_SECONDARY) << 3); } static DEFINE_SPINLOCK(pSeries_lpar_tlbie_lock); @@ -427,22 +416,18 @@ unsigned long hash; unsigned long i, j; long slot; - union { - unsigned long dword0; - Hpte_dword0 dw0; - } hpte_dw0; - Hpte_dword0 dw0; + unsigned long hpte_v; hash = hpt_hash(vpn, 0); for (j = 0; j < 2; j++) { slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; for (i = 0; i < HPTES_PER_GROUP; i++) { - hpte_dw0.dword0 = pSeries_lpar_hpte_getword0(slot); - dw0 = hpte_dw0.dw0; + hpte_v = 
pSeries_lpar_hpte_getword0(slot); - if ((dw0.avpn == (vpn >> 11)) && dw0.v && - (dw0.h == j)) { + if ((HPTE_V_AVPN_VAL(hpte_v) == (vpn >> 11)) + && (hpte_v & HPTE_V_VALID) + && (!!(hpte_v & HPTE_V_SECONDARY) == j)) { /* HPTE matches */ if (j) slot = -slot; Index: working-2.6/arch/ppc64/mm/hash_low.S =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_low.S 2005-06-08 15:46:23.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_low.S 2005-07-08 13:06:14.000000000 +1000 @@ -170,9 +170,7 @@ /* Call ppc_md.hpte_insert */ ld r7,STK_PARM(r4)(r1) /* Retreive new pp bits */ mr r4,r29 /* Retreive va */ - li r6,0 /* primary slot */ - li r8,0 /* not bolted and not large */ - li r9,0 + li r6,0 /* no vflags */ _GLOBAL(htab_call_hpte_insert1) bl . /* Will be patched by htab_finish_init() */ cmpdi 0,r3,0 @@ -192,9 +190,7 @@ /* Call ppc_md.hpte_insert */ ld r7,STK_PARM(r4)(r1) /* Retreive new pp bits */ mr r4,r29 /* Retreive va */ - li r6,1 /* secondary slot */ - li r8,0 /* not bolted and not large */ - li r9,0 + li r6,HPTE_V_SECONDARY@l /* secondary slot */ _GLOBAL(htab_call_hpte_insert2) bl . 
/* Will be patched by htab_finish_init() */ cmpdi 0,r3,0 Index: working-2.6/arch/ppc64/mm/hash_native.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_native.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_native.c 2005-07-08 13:43:15.000000000 +1000 @@ -27,9 +27,9 @@ static DEFINE_SPINLOCK(native_tlbie_lock); -static inline void native_lock_hpte(HPTE *hptep) +static inline void native_lock_hpte(hpte_t *hptep) { - unsigned long *word = &hptep->dw0.dword0; + unsigned long *word = &hptep->v; while (1) { if (!test_and_set_bit(HPTE_LOCK_BIT, word)) @@ -39,32 +39,28 @@ } } -static inline void native_unlock_hpte(HPTE *hptep) +static inline void native_unlock_hpte(hpte_t *hptep) { - unsigned long *word = &hptep->dw0.dword0; + unsigned long *word = &hptep->v; asm volatile("lwsync":::"memory"); clear_bit(HPTE_LOCK_BIT, word); } long native_hpte_insert(unsigned long hpte_group, unsigned long va, - unsigned long prpn, int secondary, - unsigned long hpteflags, int bolted, int large) + unsigned long prpn, unsigned long vflags, + unsigned long rflags) { unsigned long arpn = physRpn_to_absRpn(prpn); - HPTE *hptep = htab_address + hpte_group; - Hpte_dword0 dw0; - HPTE lhpte; + hpte_t *hptep = htab_address + hpte_group; + unsigned long hpte_v, hpte_r; int i; for (i = 0; i < HPTES_PER_GROUP; i++) { - dw0 = hptep->dw0.dw0; - - if (!dw0.v) { + if (! (hptep->v & HPTE_V_VALID)) { /* retry with lock held */ native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; - if (!dw0.v) + if (! 
(hptep->v & HPTE_V_VALID)) break; native_unlock_hpte(hptep); } @@ -75,56 +71,45 @@ if (i == HPTES_PER_GROUP) return -1; - lhpte.dw1.dword1 = 0; - lhpte.dw1.dw1.rpn = arpn; - lhpte.dw1.flags.flags = hpteflags; - - lhpte.dw0.dword0 = 0; - lhpte.dw0.dw0.avpn = va >> 23; - lhpte.dw0.dw0.h = secondary; - lhpte.dw0.dw0.bolted = bolted; - lhpte.dw0.dw0.v = 1; - - if (large) { - lhpte.dw0.dw0.l = 1; - lhpte.dw0.dw0.avpn &= ~0x1UL; - } - - hptep->dw1.dword1 = lhpte.dw1.dword1; + hpte_v = (va >> 23) << HPTE_V_AVPN_SHIFT | vflags | HPTE_V_VALID; + if (vflags & HPTE_V_LARGE) + hpte_v &= ~(1UL << HPTE_V_AVPN_SHIFT); + hpte_r = (arpn << HPTE_R_RPN_SHIFT) | rflags; + hptep->r = hpte_r; /* Guarantee the second dword is visible before the valid bit */ __asm__ __volatile__ ("eieio" : : : "memory"); - /* * Now set the first dword including the valid bit * NOTE: this also unlocks the hpte */ - hptep->dw0.dword0 = lhpte.dw0.dword0; + hptep->v = hpte_v; __asm__ __volatile__ ("ptesync" : : : "memory"); - return i | (secondary << 3); + return i | (!!(vflags & HPTE_V_SECONDARY) << 3); } static long native_hpte_remove(unsigned long hpte_group) { - HPTE *hptep; - Hpte_dword0 dw0; + hpte_t *hptep; int i; int slot_offset; + unsigned long hpte_v; /* pick a random entry to start at */ slot_offset = mftb() & 0x7; for (i = 0; i < HPTES_PER_GROUP; i++) { hptep = htab_address + hpte_group + slot_offset; - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; - if (dw0.v && !dw0.bolted) { + if ((hpte_v & HPTE_V_VALID) && !(hpte_v & HPTE_V_BOLTED)) { /* retry with lock held */ native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; - if (dw0.v && !dw0.bolted) + hpte_v = hptep->v; + if ((hpte_v & HPTE_V_VALID) + && !(hpte_v & HPTE_V_BOLTED)) break; native_unlock_hpte(hptep); } @@ -137,15 +122,15 @@ return -1; /* Invalidate the hpte. 
NOTE: this also unlocks it */ - hptep->dw0.dword0 = 0; + hptep->v = 0; return i; } -static inline void set_pp_bit(unsigned long pp, HPTE *addr) +static inline void set_pp_bit(unsigned long pp, hpte_t *addr) { unsigned long old; - unsigned long *p = &addr->dw1.dword1; + unsigned long *p = &addr->r; __asm__ __volatile__( "1: ldarx %0,0,%3\n\ @@ -163,11 +148,11 @@ */ static long native_hpte_find(unsigned long vpn) { - HPTE *hptep; + hpte_t *hptep; unsigned long hash; unsigned long i, j; long slot; - Hpte_dword0 dw0; + unsigned long hpte_v; hash = hpt_hash(vpn, 0); @@ -175,10 +160,11 @@ slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; for (i = 0; i < HPTES_PER_GROUP; i++) { hptep = htab_address + slot; - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; - if ((dw0.avpn == (vpn >> 11)) && dw0.v && - (dw0.h == j)) { + if ((HPTE_V_AVPN_VAL(hpte_v) == (vpn >> 11)) + && (hpte_v & HPTE_V_VALID) + && ( !!(hpte_v & HPTE_V_SECONDARY) == j)) { /* HPTE matches */ if (j) slot = -slot; @@ -195,20 +181,21 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp, unsigned long va, int large, int local) { - HPTE *hptep = htab_address + slot; - Hpte_dword0 dw0; + hpte_t *hptep = htab_address + slot; + unsigned long hpte_v; unsigned long avpn = va >> 23; int ret = 0; if (large) - avpn &= ~0x1UL; + avpn &= ~1; native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; /* Even if we miss, we need to invalidate the TLB */ - if ((dw0.avpn != avpn) || !dw0.v) { + if ((HPTE_V_AVPN_VAL(hpte_v) != avpn) + || !(hpte_v & HPTE_V_VALID)) { native_unlock_hpte(hptep); ret = -1; } else { @@ -244,7 +231,7 @@ { unsigned long vsid, va, vpn, flags = 0; long slot; - HPTE *hptep; + hpte_t *hptep; int lock_tlbie = !cpu_has_feature(CPU_FTR_LOCKLESS_TLBIE); vsid = get_kernel_vsid(ea); @@ -269,26 +256,27 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long va, int large, int local) { - HPTE *hptep = htab_address + slot; - Hpte_dword0 dw0; + hpte_t *hptep = htab_address 
+ slot; + unsigned long hpte_v; unsigned long avpn = va >> 23; unsigned long flags; int lock_tlbie = !cpu_has_feature(CPU_FTR_LOCKLESS_TLBIE); if (large) - avpn &= ~0x1UL; + avpn &= ~1; local_irq_save(flags); native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; /* Even if we miss, we need to invalidate the TLB */ - if ((dw0.avpn != avpn) || !dw0.v) { + if ((HPTE_V_AVPN_VAL(hpte_v) != avpn) + || !(hpte_v & HPTE_V_VALID)) { native_unlock_hpte(hptep); } else { /* Invalidate the hpte. NOTE: this also unlocks it */ - hptep->dw0.dword0 = 0; + hptep->v = 0; } /* Invalidate the tlb */ @@ -315,8 +303,8 @@ static void native_hpte_clear(void) { unsigned long slot, slots, flags; - HPTE *hptep = htab_address; - Hpte_dword0 dw0; + hpte_t *hptep = htab_address; + unsigned long hpte_v; unsigned long pteg_count; pteg_count = htab_hash_mask + 1; @@ -336,11 +324,11 @@ * running, right? and for crash dump, we probably * don't want to wait for a maybe bad cpu. */ - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; - if (dw0.v) { - hptep->dw0.dword0 = 0; - tlbie(slot2va(dw0.avpn, dw0.l, dw0.h, slot), dw0.l); + if (hpte_v & HPTE_V_VALID) { + hptep->v = 0; + tlbie(slot2va(hpte_v, slot), hpte_v & HPTE_V_LARGE); } } @@ -353,8 +341,8 @@ { unsigned long vsid, vpn, va, hash, secondary, slot, flags, avpn; int i, j; - HPTE *hptep; - Hpte_dword0 dw0; + hpte_t *hptep; + unsigned long hpte_v; struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch); /* XXX fix for large ptes */ @@ -390,14 +378,15 @@ native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; /* Even if we miss, we need to invalidate the TLB */ - if ((dw0.avpn != avpn) || !dw0.v) { + if ((HPTE_V_AVPN_VAL(hpte_v) != avpn) + || !(hpte_v & HPTE_V_VALID)) { native_unlock_hpte(hptep); } else { /* Invalidate the hpte. 
NOTE: this also unlocks it */ - hptep->dw0.dword0 = 0; + hptep->v = 0; } j++; Index: working-2.6/arch/ppc64/mm/hash_utils.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_utils.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_utils.c 2005-07-08 13:06:14.000000000 +1000 @@ -75,8 +75,8 @@ extern unsigned long dart_tablebase; #endif /* CONFIG_U3_DART */ -HPTE *htab_address; -unsigned long htab_hash_mask; +hpte_t *htab_address; +unsigned long htab_hash_mask; extern unsigned long _SDR1; @@ -97,11 +97,15 @@ unsigned long addr; unsigned int step; unsigned long tmp_mode; + unsigned long vflags; - if (large) + if (large) { step = 16*MB; - else + vflags = HPTE_V_BOLTED | HPTE_V_LARGE; + } else { step = 4*KB; + vflags = HPTE_V_BOLTED; + } for (addr = start; addr < end; addr += step) { unsigned long vpn, hash, hpteg; @@ -129,12 +133,12 @@ if (systemcfg->platform & PLATFORM_LPAR) ret = pSeries_lpar_hpte_insert(hpteg, va, virt_to_abs(addr) >> PAGE_SHIFT, - 0, tmp_mode, 1, large); + vflags, tmp_mode); else #endif /* CONFIG_PPC_PSERIES */ ret = native_hpte_insert(hpteg, va, virt_to_abs(addr) >> PAGE_SHIFT, - 0, tmp_mode, 1, large); + vflags, tmp_mode); if (ret == -1) { ppc64_terminate_msg(0x20, "create_pte_mapping"); Index: working-2.6/arch/ppc64/mm/hugetlbpage.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hugetlbpage.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hugetlbpage.c 2005-07-08 13:06:14.000000000 +1000 @@ -583,7 +583,7 @@ pte_t *ptep; unsigned long va, vpn; pte_t old_pte, new_pte; - unsigned long hpteflags, prpn; + unsigned long rflags, prpn; long slot; int err = 1; @@ -626,9 +626,9 @@ old_pte = *ptep; new_pte = old_pte; - hpteflags = 0x2 | (! (pte_val(new_pte) & _PAGE_RW)); + rflags = 0x2 | (! 
(pte_val(new_pte) & _PAGE_RW)); /* _PAGE_EXEC -> HW_NO_EXEC since it's inverted */ - hpteflags |= ((pte_val(new_pte) & _PAGE_EXEC) ? 0 : HW_NO_EXEC); + rflags |= ((pte_val(new_pte) & _PAGE_EXEC) ? 0 : HW_NO_EXEC); /* Check if pte already has an hpte (case 2) */ if (unlikely(pte_val(old_pte) & _PAGE_HASHPTE)) { @@ -641,7 +641,7 @@ slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; slot += (pte_val(old_pte) & _PAGE_GROUP_IX) >> 12; - if (ppc_md.hpte_updatepp(slot, hpteflags, va, 1, local) == -1) + if (ppc_md.hpte_updatepp(slot, rflags, va, 1, local) == -1) pte_val(old_pte) &= ~_PAGE_HPTEFLAGS; } @@ -661,10 +661,10 @@ /* Add in WIMG bits */ /* XXX We should store these in the pte */ - hpteflags |= _PAGE_COHERENT; + rflags |= _PAGE_COHERENT; - slot = ppc_md.hpte_insert(hpte_group, va, prpn, 0, - hpteflags, 0, 1); + slot = ppc_md.hpte_insert(hpte_group, va, prpn, + HPTE_V_LARGE, rflags); /* Primary is full, try the secondary */ if (unlikely(slot == -1)) { @@ -672,7 +672,7 @@ hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL; slot = ppc_md.hpte_insert(hpte_group, va, prpn, - 1, hpteflags, 0, 1); + HPTE_V_LARGE | HPTE_V_SECONDARY, rflags); if (slot == -1) { if (mftb() & 0x1) hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL; Index: working-2.6/arch/ppc64/mm/init.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/init.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/init.c 2005-07-08 13:49:35.000000000 +1000 @@ -180,9 +180,10 @@ hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP); /* Panic if a pte grpup is full */ - if (ppc_md.hpte_insert(hpteg, va, pa >> PAGE_SHIFT, 0, - _PAGE_NO_CACHE|_PAGE_GUARDED|PP_RWXX, - 1, 0) == -1) { + if (ppc_md.hpte_insert(hpteg, va, pa >> PAGE_SHIFT, + HPTE_V_BOLTED, + _PAGE_NO_CACHE|_PAGE_GUARDED|PP_RWXX) + == -1) { panic("map_io_page: could not insert mapping"); } } Index: working-2.6/include/asm-ppc64/machdep.h 
=================================================================== --- working-2.6.orig/include/asm-ppc64/machdep.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/machdep.h 2005-07-08 13:06:14.000000000 +1000 @@ -53,10 +53,8 @@ long (*hpte_insert)(unsigned long hpte_group, unsigned long va, unsigned long prpn, - int secondary, - unsigned long hpteflags, - int bolted, - int large); + unsigned long vflags, + unsigned long rflags); long (*hpte_remove)(unsigned long hpte_group); void (*flush_hash_range)(unsigned long context, unsigned long number, Index: working-2.6/include/asm-ppc64/mmu.h =================================================================== --- working-2.6.orig/include/asm-ppc64/mmu.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/mmu.h 2005-07-08 13:48:40.000000000 +1000 @@ -60,6 +60,22 @@ #define HPTES_PER_GROUP 8 +#define HPTE_V_AVPN_SHIFT 7 +#define HPTE_V_AVPN ASM_CONST(0xffffffffffffff80) +#define HPTE_V_AVPN_VAL(x) (((x) & HPTE_V_AVPN) >> HPTE_V_AVPN_SHIFT) +#define HPTE_V_BOLTED ASM_CONST(0x0000000000000010) +#define HPTE_V_LOCK ASM_CONST(0x0000000000000008) +#define HPTE_V_LARGE ASM_CONST(0x0000000000000004) +#define HPTE_V_SECONDARY ASM_CONST(0x0000000000000002) +#define HPTE_V_VALID ASM_CONST(0x0000000000000001) + +#define HPTE_R_PP0 ASM_CONST(0x8000000000000000) +#define HPTE_R_TS ASM_CONST(0x4000000000000000) +#define HPTE_R_RPN_SHIFT 12 +#define HPTE_R_RPN ASM_CONST(0x3ffffffffffff000) +#define HPTE_R_FLAGS ASM_CONST(0x00000000000003ff) +#define HPTE_R_PP ASM_CONST(0x0000000000000003) + /* Values for PP (assumes Ks=0, Kp=1) */ /* pp0 will always be 0 for linux */ #define PP_RWXX 0 /* Supervisor read/write, User none */ @@ -69,54 +85,13 @@ #ifndef __ASSEMBLY__ -/* Hardware Page Table Entry */ -typedef struct { - unsigned long avpn:57; /* vsid | api == avpn */ - unsigned long : 2; /* Software use */ - unsigned long bolted: 1; /* HPTE is "bolted" */ - unsigned long lock: 1; /* lock on pSeries SMP 
*/ - unsigned long l: 1; /* Virtual page is large (L=1) or 4 KB (L=0) */ - unsigned long h: 1; /* Hash function identifier */ - unsigned long v: 1; /* Valid (v=1) or invalid (v=0) */ -} Hpte_dword0; - -typedef struct { - unsigned long pp0: 1; /* Page protection bit 0 */ - unsigned long ts: 1; /* Tag set bit */ - unsigned long rpn: 50; /* Real page number */ - unsigned long : 2; /* Reserved */ - unsigned long ac: 1; /* Address compare */ - unsigned long r: 1; /* Referenced */ - unsigned long c: 1; /* Changed */ - unsigned long w: 1; /* Write-thru cache mode */ - unsigned long i: 1; /* Cache inhibited */ - unsigned long m: 1; /* Memory coherence required */ - unsigned long g: 1; /* Guarded */ - unsigned long n: 1; /* No-execute */ - unsigned long pp: 2; /* Page protection bits 1:2 */ -} Hpte_dword1; - -typedef struct { - char padding[6]; /* padding */ - unsigned long : 6; /* padding */ - unsigned long flags: 10; /* HPTE flags */ -} Hpte_dword1_flags; - typedef struct { - union { - unsigned long dword0; - Hpte_dword0 dw0; - } dw0; - - union { - unsigned long dword1; - Hpte_dword1 dw1; - Hpte_dword1_flags flags; - } dw1; -} HPTE; + unsigned long v; + unsigned long r; +} hpte_t; -extern HPTE * htab_address; -extern unsigned long htab_hash_mask; +extern hpte_t *htab_address; +extern unsigned long htab_hash_mask; static inline unsigned long hpt_hash(unsigned long vpn, int large) { @@ -181,18 +156,18 @@ asm volatile("ptesync": : :"memory"); } -static inline unsigned long slot2va(unsigned long avpn, unsigned long large, - unsigned long secondary, unsigned long slot) +static inline unsigned long slot2va(unsigned long hpte_v, unsigned long slot) { + unsigned long avpn = HPTE_V_AVPN_VAL(hpte_v); unsigned long va; va = avpn << 23; - if (!large) { + if (! 
(hpte_v & HPTE_V_LARGE)) { unsigned long vpi, pteg; pteg = slot / HPTES_PER_GROUP; - if (secondary) + if (hpte_v & HPTE_V_SECONDARY) pteg = ~pteg; vpi = ((va >> 28) ^ pteg) & htab_hash_mask; @@ -219,11 +194,11 @@ extern long pSeries_lpar_hpte_insert(unsigned long hpte_group, unsigned long va, unsigned long prpn, - int secondary, unsigned long hpteflags, - int bolted, int large); + unsigned long vflags, + unsigned long rflags); extern long native_hpte_insert(unsigned long hpte_group, unsigned long va, - unsigned long prpn, int secondary, - unsigned long hpteflags, int bolted, int large); + unsigned long prpn, + unsigned long vflags, unsigned long rflags); #endif /* __ASSEMBLY__ */ Index: working-2.6/include/asm-ppc64/iSeries/HvCallHpt.h =================================================================== --- working-2.6.orig/include/asm-ppc64/iSeries/HvCallHpt.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/iSeries/HvCallHpt.h 2005-07-08 13:22:29.000000000 +1000 @@ -77,27 +77,26 @@ return compressedStatus; } -static inline u64 HvCallHpt_findValid(HPTE *hpte, u64 vpn) +static inline u64 HvCallHpt_findValid(hpte_t *hpte, u64 vpn) { return HvCall3Ret16(HvCallHptFindValid, hpte, vpn, 0, 0); } -static inline u64 HvCallHpt_findNextValid(HPTE *hpte, u32 hpteIndex, +static inline u64 HvCallHpt_findNextValid(hpte_t *hpte, u32 hpteIndex, u8 bitson, u8 bitsoff) { return HvCall3Ret16(HvCallHptFindNextValid, hpte, hpteIndex, bitson, bitsoff); } -static inline void HvCallHpt_get(HPTE *hpte, u32 hpteIndex) +static inline void HvCallHpt_get(hpte_t *hpte, u32 hpteIndex) { HvCall2Ret16(HvCallHptGet, hpte, hpteIndex, 0); } -static inline void HvCallHpt_addValidate(u32 hpteIndex, u32 hBit, HPTE *hpte) +static inline void HvCallHpt_addValidate(u32 hpteIndex, u32 hBit, hpte_t *hpte) { - HvCall4(HvCallHptAddValidate, hpteIndex, hBit, (*((u64 *)hpte)), - (*(((u64 *)hpte)+1))); + HvCall4(HvCallHptAddValidate, hpteIndex, hBit, hpte->v, hpte->r); } #endif /* 
_HVCALLHPT_H */

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/people/dgibson

From seto.hidetoshi at jp.fujitsu.com  Fri Jul  8 15:44:49 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Fri, 08 Jul 2005 14:44:49 +0900
Subject: [PATCH 2.6.13-rc1 07/10] IOCHK interface for I/O error
	handling/detecting
In-Reply-To:
References: <42CB63B2.6000505@jp.fujitsu.com>
	<42CB6961.2060508@jp.fujitsu.com>
Message-ID: <42CE12D0.7060601@jp.fujitsu.com>

david mosberger wrote:
>> - could anyone write same barrier for intel compiler?
>> Tony or David, could you help me?
>
> I think it might be best to make ia64_mca_barrier() a proper
> subroutine written in assembly code. Yes, that costs some time, but
> we're talking about wasting 1,000+ cycles just to consume the value
> read via readX(), so the call-overhead is actually overlapped and
> completely trivial.

Yes, of course speed matters, but in some situations data integrity
is more important than speed. (It sounds crazy, but it is a fact.)

I'm not familiar with assembly code for the Intel compiler.
So David, could you write another ia64_mca_barrier() macro, or a
proper subroutine, instead?

Thanks,
H.Seto

From david at gibson.dropbear.id.au  Fri Jul  8 15:58:55 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Fri, 8 Jul 2005 15:58:55 +1000
Subject: [PPC64] Kill bitfields in ppc64 hash code
In-Reply-To: <20050708044653.GC30761@localhost.localdomain>
References: <20050708044653.GC30761@localhost.localdomain>
Message-ID: <20050708055855.GD30761@localhost.localdomain>

On Fri, Jul 08, 2005 at 02:46:54PM +1000, David Gibson wrote:
> Andrew, please apply:

Ahem.  Or perhaps the version which builds on iSeries too.

This patch removes the use of bitfield types from the ppc64 hash table
manipulation code.
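
As a rough illustration of the approach, the sketch below builds and tests the first HPTE doubleword with shifts and masks instead of bitfield members. The HPTE_V_* values are copied from the mmu.h hunk of this patch; the helpers make_hpte_v() and hpte_v_matches() are hypothetical names made up for this example, and uint64_t stands in for the kernel's 64-bit unsigned long. It is only a userspace sketch under those assumptions, not code from the patch.

```c
#include <stdint.h>

/* Values as defined in the mmu.h hunk of the patch. */
#define HPTE_V_AVPN_SHIFT	7
#define HPTE_V_AVPN		UINT64_C(0xffffffffffffff80)
#define HPTE_V_AVPN_VAL(x)	(((x) & HPTE_V_AVPN) >> HPTE_V_AVPN_SHIFT)
#define HPTE_V_BOLTED		UINT64_C(0x0000000000000010)
#define HPTE_V_LARGE		UINT64_C(0x0000000000000004)
#define HPTE_V_SECONDARY	UINT64_C(0x0000000000000002)
#define HPTE_V_VALID		UINT64_C(0x0000000000000001)

/*
 * Hypothetical helper: compose the first doubleword from a virtual
 * address and a set of HPTE_V_* flags, the way the insert routines
 * in the patch do it inline.
 */
static uint64_t make_hpte_v(uint64_t va, uint64_t vflags)
{
	return ((va >> 23) << HPTE_V_AVPN_SHIFT) | vflags | HPTE_V_VALID;
}

/*
 * Hypothetical helper: true if the entry is valid and its AVPN
 * matches the given virtual address (small pages only).
 */
static int hpte_v_matches(uint64_t hpte_v, uint64_t va)
{
	return HPTE_V_AVPN_VAL(hpte_v) == (va >> 23)
		&& (hpte_v & HPTE_V_VALID) != 0;
}
```

With explicit masks the bit positions no longer depend on how the compiler lays out bitfields, and the same constants can be used from assembly, which is presumably why hash_low.S can load HPTE_V_SECONDARY directly.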
Signed-off-by: David Gibson Index: working-2.6/arch/ppc64/kernel/iSeries_htab.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/iSeries_htab.c 2005-06-08 15:37:37.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/iSeries_htab.c 2005-07-08 15:57:53.000000000 +1000 @@ -38,11 +38,12 @@ } static long iSeries_hpte_insert(unsigned long hpte_group, unsigned long va, - unsigned long prpn, int secondary, - unsigned long hpteflags, int bolted, int large) + unsigned long prpn, unsigned long vflags, + unsigned long rflags) { long slot; - HPTE lhpte; + hpte_t lhpte; + int secondary = 0; /* * The hypervisor tries both primary and secondary. @@ -50,13 +51,13 @@ * it means we have already tried both primary and secondary, * so we return failure immediately. */ - if (secondary) + if (vflags & HPTE_V_SECONDARY) return -1; iSeries_hlock(hpte_group); slot = HvCallHpt_findValid(&lhpte, va >> PAGE_SHIFT); - BUG_ON(lhpte.dw0.dw0.v); + BUG_ON(lhpte.v & HPTE_V_VALID); if (slot == -1) { /* No available entry found in either group */ iSeries_hunlock(hpte_group); @@ -64,19 +65,13 @@ } if (slot < 0) { /* MSB set means secondary group */ + vflags |= HPTE_V_VALID; secondary = 1; slot &= 0x7fffffffffffffff; } - lhpte.dw1.dword1 = 0; - lhpte.dw1.dw1.rpn = physRpn_to_absRpn(prpn); - lhpte.dw1.flags.flags = hpteflags; - - lhpte.dw0.dword0 = 0; - lhpte.dw0.dw0.avpn = va >> 23; - lhpte.dw0.dw0.h = secondary; - lhpte.dw0.dw0.bolted = bolted; - lhpte.dw0.dw0.v = 1; + lhpte.v = (va >> 23) << HPTE_V_AVPN_SHIFT | vflags | HPTE_V_VALID; + lhpte.r = (physRpn_to_absRpn(prpn) << HPTE_R_RPN_SHIFT) | rflags; /* Now fill in the actual HPTE */ HvCallHpt_addValidate(slot, secondary, &lhpte); @@ -88,20 +83,17 @@ static unsigned long iSeries_hpte_getword0(unsigned long slot) { - unsigned long dword0; - HPTE hpte; + hpte_t hpte; HvCallHpt_get(&hpte, slot); - dword0 = hpte.dw0.dword0; - - return dword0; + return hpte.v; } static long 
iSeries_hpte_remove(unsigned long hpte_group) { unsigned long slot_offset; int i; - HPTE lhpte; + unsigned long hpte_v; /* Pick a random slot to start at */ slot_offset = mftb() & 0x7; @@ -109,10 +101,9 @@ iSeries_hlock(hpte_group); for (i = 0; i < HPTES_PER_GROUP; i++) { - lhpte.dw0.dword0 = - iSeries_hpte_getword0(hpte_group + slot_offset); + hpte_v = iSeries_hpte_getword0(hpte_group + slot_offset); - if (!lhpte.dw0.dw0.bolted) { + if (! (hpte_v & HPTE_V_BOLTED)) { HvCallHpt_invalidateSetSwBitsGet(hpte_group + slot_offset, 0, 0); iSeries_hunlock(hpte_group); @@ -137,13 +128,13 @@ static long iSeries_hpte_updatepp(unsigned long slot, unsigned long newpp, unsigned long va, int large, int local) { - HPTE hpte; + hpte_t hpte; unsigned long avpn = va >> 23; iSeries_hlock(slot); HvCallHpt_get(&hpte, slot); - if ((hpte.dw0.dw0.avpn == avpn) && (hpte.dw0.dw0.v)) { + if ((HPTE_V_AVPN_VAL(hpte.v) == avpn) && (hpte.v & HPTE_V_VALID)) { /* * Hypervisor expects bits as NPPP, which is * different from how they are mapped in our PP. 
@@ -167,7 +158,7 @@ */ static long iSeries_hpte_find(unsigned long vpn) { - HPTE hpte; + hpte_t hpte; long slot; /* @@ -177,7 +168,7 @@ * 0x80000000xxxxxxxx : Entry found in secondary group, slot x */ slot = HvCallHpt_findValid(&hpte, vpn); - if (hpte.dw0.dw0.v) { + if (hpte.v & HPTE_V_VALID) { if (slot < 0) { slot &= 0x7fffffffffffffff; slot = -slot; @@ -212,7 +203,7 @@ static void iSeries_hpte_invalidate(unsigned long slot, unsigned long va, int large, int local) { - HPTE lhpte; + unsigned long hpte_v; unsigned long avpn = va >> 23; unsigned long flags; @@ -220,9 +211,9 @@ iSeries_hlock(slot); - lhpte.dw0.dword0 = iSeries_hpte_getword0(slot); + hpte_v = iSeries_hpte_getword0(slot); - if ((lhpte.dw0.dw0.avpn == avpn) && lhpte.dw0.dw0.v) + if ((HPTE_V_AVPN_VAL(hpte_v) == avpn) && (hpte_v & HPTE_V_VALID)) HvCallHpt_invalidateSetSwBitsGet(slot, 0, 0); iSeries_hunlock(slot); Index: working-2.6/arch/ppc64/kernel/pSeries_lpar.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/pSeries_lpar.c 2005-06-08 15:37:37.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/pSeries_lpar.c 2005-07-08 13:06:14.000000000 +1000 @@ -277,31 +277,20 @@ long pSeries_lpar_hpte_insert(unsigned long hpte_group, unsigned long va, unsigned long prpn, - int secondary, unsigned long hpteflags, - int bolted, int large) + unsigned long vflags, unsigned long rflags) { unsigned long arpn = physRpn_to_absRpn(prpn); unsigned long lpar_rc; unsigned long flags; unsigned long slot; - HPTE lhpte; + unsigned long hpte_v, hpte_r; unsigned long dummy0, dummy1; - /* Fill in the local HPTE with absolute rpn, avpn and flags */ - lhpte.dw1.dword1 = 0; - lhpte.dw1.dw1.rpn = arpn; - lhpte.dw1.flags.flags = hpteflags; - - lhpte.dw0.dword0 = 0; - lhpte.dw0.dw0.avpn = va >> 23; - lhpte.dw0.dw0.h = secondary; - lhpte.dw0.dw0.bolted = bolted; - lhpte.dw0.dw0.v = 1; - - if (large) { - lhpte.dw0.dw0.l = 1; - lhpte.dw0.dw0.avpn &= ~0x1UL; - } + hpte_v = ((va >> 23) << 
HPTE_V_AVPN_SHIFT) | vflags | HPTE_V_VALID; + if (vflags & HPTE_V_LARGE) + hpte_v &= ~(1UL << HPTE_V_AVPN_SHIFT); + + hpte_r = (arpn << HPTE_R_RPN_SHIFT) | rflags; /* Now fill in the actual HPTE */ /* Set CEC cookie to 0 */ @@ -312,11 +301,11 @@ flags = 0; /* XXX why is this here? - Anton */ - if (hpteflags & (_PAGE_GUARDED|_PAGE_NO_CACHE)) - lhpte.dw1.flags.flags &= ~_PAGE_COHERENT; + if (rflags & (_PAGE_GUARDED|_PAGE_NO_CACHE)) + hpte_r &= ~_PAGE_COHERENT; - lpar_rc = plpar_hcall(H_ENTER, flags, hpte_group, lhpte.dw0.dword0, - lhpte.dw1.dword1, &slot, &dummy0, &dummy1); + lpar_rc = plpar_hcall(H_ENTER, flags, hpte_group, hpte_v, + hpte_r, &slot, &dummy0, &dummy1); if (unlikely(lpar_rc == H_PTEG_Full)) return -1; @@ -332,7 +321,7 @@ /* Because of iSeries, we have to pass down the secondary * bucket bit here as well */ - return (slot & 7) | (secondary << 3); + return (slot & 7) | (!!(vflags & HPTE_V_SECONDARY) << 3); } static DEFINE_SPINLOCK(pSeries_lpar_tlbie_lock); @@ -427,22 +416,18 @@ unsigned long hash; unsigned long i, j; long slot; - union { - unsigned long dword0; - Hpte_dword0 dw0; - } hpte_dw0; - Hpte_dword0 dw0; + unsigned long hpte_v; hash = hpt_hash(vpn, 0); for (j = 0; j < 2; j++) { slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; for (i = 0; i < HPTES_PER_GROUP; i++) { - hpte_dw0.dword0 = pSeries_lpar_hpte_getword0(slot); - dw0 = hpte_dw0.dw0; + hpte_v = pSeries_lpar_hpte_getword0(slot); - if ((dw0.avpn == (vpn >> 11)) && dw0.v && - (dw0.h == j)) { + if ((HPTE_V_AVPN_VAL(hpte_v) == (vpn >> 11)) + && (hpte_v & HPTE_V_VALID) + && (!!(hpte_v & HPTE_V_SECONDARY) == j)) { /* HPTE matches */ if (j) slot = -slot; Index: working-2.6/arch/ppc64/mm/hash_low.S =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_low.S 2005-06-08 15:46:23.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_low.S 2005-07-08 13:06:14.000000000 +1000 @@ -170,9 +170,7 @@ /* Call ppc_md.hpte_insert */ ld r7,STK_PARM(r4)(r1) /* 
Retreive new pp bits */ mr r4,r29 /* Retreive va */ - li r6,0 /* primary slot */ - li r8,0 /* not bolted and not large */ - li r9,0 + li r6,0 /* no vflags */ _GLOBAL(htab_call_hpte_insert1) bl . /* Will be patched by htab_finish_init() */ cmpdi 0,r3,0 @@ -192,9 +190,7 @@ /* Call ppc_md.hpte_insert */ ld r7,STK_PARM(r4)(r1) /* Retreive new pp bits */ mr r4,r29 /* Retreive va */ - li r6,1 /* secondary slot */ - li r8,0 /* not bolted and not large */ - li r9,0 + li r6,HPTE_V_SECONDARY at l /* secondary slot */ _GLOBAL(htab_call_hpte_insert2) bl . /* Will be patched by htab_finish_init() */ cmpdi 0,r3,0 Index: working-2.6/arch/ppc64/mm/hash_native.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_native.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_native.c 2005-07-08 13:43:15.000000000 +1000 @@ -27,9 +27,9 @@ static DEFINE_SPINLOCK(native_tlbie_lock); -static inline void native_lock_hpte(HPTE *hptep) +static inline void native_lock_hpte(hpte_t *hptep) { - unsigned long *word = &hptep->dw0.dword0; + unsigned long *word = &hptep->v; while (1) { if (!test_and_set_bit(HPTE_LOCK_BIT, word)) @@ -39,32 +39,28 @@ } } -static inline void native_unlock_hpte(HPTE *hptep) +static inline void native_unlock_hpte(hpte_t *hptep) { - unsigned long *word = &hptep->dw0.dword0; + unsigned long *word = &hptep->v; asm volatile("lwsync":::"memory"); clear_bit(HPTE_LOCK_BIT, word); } long native_hpte_insert(unsigned long hpte_group, unsigned long va, - unsigned long prpn, int secondary, - unsigned long hpteflags, int bolted, int large) + unsigned long prpn, unsigned long vflags, + unsigned long rflags) { unsigned long arpn = physRpn_to_absRpn(prpn); - HPTE *hptep = htab_address + hpte_group; - Hpte_dword0 dw0; - HPTE lhpte; + hpte_t *hptep = htab_address + hpte_group; + unsigned long hpte_v, hpte_r; int i; for (i = 0; i < HPTES_PER_GROUP; i++) { - dw0 = hptep->dw0.dw0; - - if (!dw0.v) { + if (! 
(hptep->v & HPTE_V_VALID)) { /* retry with lock held */ native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; - if (!dw0.v) + if (! (hptep->v & HPTE_V_VALID)) break; native_unlock_hpte(hptep); } @@ -75,56 +71,45 @@ if (i == HPTES_PER_GROUP) return -1; - lhpte.dw1.dword1 = 0; - lhpte.dw1.dw1.rpn = arpn; - lhpte.dw1.flags.flags = hpteflags; - - lhpte.dw0.dword0 = 0; - lhpte.dw0.dw0.avpn = va >> 23; - lhpte.dw0.dw0.h = secondary; - lhpte.dw0.dw0.bolted = bolted; - lhpte.dw0.dw0.v = 1; - - if (large) { - lhpte.dw0.dw0.l = 1; - lhpte.dw0.dw0.avpn &= ~0x1UL; - } - - hptep->dw1.dword1 = lhpte.dw1.dword1; + hpte_v = (va >> 23) << HPTE_V_AVPN_SHIFT | vflags | HPTE_V_VALID; + if (vflags & HPTE_V_LARGE) + va &= ~(1UL << HPTE_V_AVPN_SHIFT); + hpte_r = (arpn << HPTE_R_RPN_SHIFT) | rflags; + hptep->r = hpte_r; /* Guarantee the second dword is visible before the valid bit */ __asm__ __volatile__ ("eieio" : : : "memory"); - /* * Now set the first dword including the valid bit * NOTE: this also unlocks the hpte */ - hptep->dw0.dword0 = lhpte.dw0.dword0; + hptep->v = hpte_v; __asm__ __volatile__ ("ptesync" : : : "memory"); - return i | (secondary << 3); + return i | (!!(vflags & HPTE_V_SECONDARY) << 3); } static long native_hpte_remove(unsigned long hpte_group) { - HPTE *hptep; - Hpte_dword0 dw0; + hpte_t *hptep; int i; int slot_offset; + unsigned long hpte_v; /* pick a random entry to start at */ slot_offset = mftb() & 0x7; for (i = 0; i < HPTES_PER_GROUP; i++) { hptep = htab_address + hpte_group + slot_offset; - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; - if (dw0.v && !dw0.bolted) { + if ((hpte_v & HPTE_V_VALID) && !(hpte_v & HPTE_V_BOLTED)) { /* retry with lock held */ native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; - if (dw0.v && !dw0.bolted) + hpte_v = hptep->v; + if ((hpte_v & HPTE_V_VALID) + && !(hpte_v & HPTE_V_BOLTED)) break; native_unlock_hpte(hptep); } @@ -137,15 +122,15 @@ return -1; /* Invalidate the hpte. 
NOTE: this also unlocks it */ - hptep->dw0.dword0 = 0; + hptep->v = 0; return i; } -static inline void set_pp_bit(unsigned long pp, HPTE *addr) +static inline void set_pp_bit(unsigned long pp, hpte_t *addr) { unsigned long old; - unsigned long *p = &addr->dw1.dword1; + unsigned long *p = &addr->r; __asm__ __volatile__( "1: ldarx %0,0,%3\n\ @@ -163,11 +148,11 @@ */ static long native_hpte_find(unsigned long vpn) { - HPTE *hptep; + hpte_t *hptep; unsigned long hash; unsigned long i, j; long slot; - Hpte_dword0 dw0; + unsigned long hpte_v; hash = hpt_hash(vpn, 0); @@ -175,10 +160,11 @@ slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; for (i = 0; i < HPTES_PER_GROUP; i++) { hptep = htab_address + slot; - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; - if ((dw0.avpn == (vpn >> 11)) && dw0.v && - (dw0.h == j)) { + if ((HPTE_V_AVPN_VAL(hpte_v) == (vpn >> 11)) + && (hpte_v & HPTE_V_VALID) + && ( !!(hpte_v & HPTE_V_SECONDARY) == j)) { /* HPTE matches */ if (j) slot = -slot; @@ -195,20 +181,21 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp, unsigned long va, int large, int local) { - HPTE *hptep = htab_address + slot; - Hpte_dword0 dw0; + hpte_t *hptep = htab_address + slot; + unsigned long hpte_v; unsigned long avpn = va >> 23; int ret = 0; if (large) - avpn &= ~0x1UL; + avpn &= ~1; native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; /* Even if we miss, we need to invalidate the TLB */ - if ((dw0.avpn != avpn) || !dw0.v) { + if ((HPTE_V_AVPN_VAL(hpte_v) != avpn) + || !(hpte_v & HPTE_V_VALID)) { native_unlock_hpte(hptep); ret = -1; } else { @@ -244,7 +231,7 @@ { unsigned long vsid, va, vpn, flags = 0; long slot; - HPTE *hptep; + hpte_t *hptep; int lock_tlbie = !cpu_has_feature(CPU_FTR_LOCKLESS_TLBIE); vsid = get_kernel_vsid(ea); @@ -269,26 +256,27 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long va, int large, int local) { - HPTE *hptep = htab_address + slot; - Hpte_dword0 dw0; + hpte_t *hptep = htab_address 
+ slot; + unsigned long hpte_v; unsigned long avpn = va >> 23; unsigned long flags; int lock_tlbie = !cpu_has_feature(CPU_FTR_LOCKLESS_TLBIE); if (large) - avpn &= ~0x1UL; + avpn &= ~1; local_irq_save(flags); native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; /* Even if we miss, we need to invalidate the TLB */ - if ((dw0.avpn != avpn) || !dw0.v) { + if ((HPTE_V_AVPN_VAL(hpte_v) != avpn) + || !(hpte_v & HPTE_V_VALID)) { native_unlock_hpte(hptep); } else { /* Invalidate the hpte. NOTE: this also unlocks it */ - hptep->dw0.dword0 = 0; + hptep->v = 0; } /* Invalidate the tlb */ @@ -315,8 +303,8 @@ static void native_hpte_clear(void) { unsigned long slot, slots, flags; - HPTE *hptep = htab_address; - Hpte_dword0 dw0; + hpte_t *hptep = htab_address; + unsigned long hpte_v; unsigned long pteg_count; pteg_count = htab_hash_mask + 1; @@ -336,11 +324,11 @@ * running, right? and for crash dump, we probably * don't want to wait for a maybe bad cpu. */ - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; - if (dw0.v) { - hptep->dw0.dword0 = 0; - tlbie(slot2va(dw0.avpn, dw0.l, dw0.h, slot), dw0.l); + if (hpte_v & HPTE_V_VALID) { + hptep->v = 0; + tlbie(slot2va(hpte_v, slot), hpte_v & HPTE_V_LARGE); } } @@ -353,8 +341,8 @@ { unsigned long vsid, vpn, va, hash, secondary, slot, flags, avpn; int i, j; - HPTE *hptep; - Hpte_dword0 dw0; + hpte_t *hptep; + unsigned long hpte_v; struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch); /* XXX fix for large ptes */ @@ -390,14 +378,15 @@ native_lock_hpte(hptep); - dw0 = hptep->dw0.dw0; + hpte_v = hptep->v; /* Even if we miss, we need to invalidate the TLB */ - if ((dw0.avpn != avpn) || !dw0.v) { + if ((HPTE_V_AVPN_VAL(hpte_v) != avpn) + || !(hpte_v & HPTE_V_VALID)) { native_unlock_hpte(hptep); } else { /* Invalidate the hpte. 
NOTE: this also unlocks it */ - hptep->dw0.dword0 = 0; + hptep->v = 0; } j++; Index: working-2.6/arch/ppc64/mm/hash_utils.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_utils.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_utils.c 2005-07-08 13:06:14.000000000 +1000 @@ -75,8 +75,8 @@ extern unsigned long dart_tablebase; #endif /* CONFIG_U3_DART */ -HPTE *htab_address; -unsigned long htab_hash_mask; +hpte_t *htab_address; +unsigned long htab_hash_mask; extern unsigned long _SDR1; @@ -97,11 +97,15 @@ unsigned long addr; unsigned int step; unsigned long tmp_mode; + unsigned long vflags; - if (large) + if (large) { step = 16*MB; - else + vflags = HPTE_V_BOLTED | HPTE_V_LARGE; + } else { step = 4*KB; + vflags = HPTE_V_BOLTED; + } for (addr = start; addr < end; addr += step) { unsigned long vpn, hash, hpteg; @@ -129,12 +133,12 @@ if (systemcfg->platform & PLATFORM_LPAR) ret = pSeries_lpar_hpte_insert(hpteg, va, virt_to_abs(addr) >> PAGE_SHIFT, - 0, tmp_mode, 1, large); + vflags, tmp_mode); else #endif /* CONFIG_PPC_PSERIES */ ret = native_hpte_insert(hpteg, va, virt_to_abs(addr) >> PAGE_SHIFT, - 0, tmp_mode, 1, large); + vflags, tmp_mode); if (ret == -1) { ppc64_terminate_msg(0x20, "create_pte_mapping"); Index: working-2.6/arch/ppc64/mm/hugetlbpage.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hugetlbpage.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hugetlbpage.c 2005-07-08 13:06:14.000000000 +1000 @@ -583,7 +583,7 @@ pte_t *ptep; unsigned long va, vpn; pte_t old_pte, new_pte; - unsigned long hpteflags, prpn; + unsigned long rflags, prpn; long slot; int err = 1; @@ -626,9 +626,9 @@ old_pte = *ptep; new_pte = old_pte; - hpteflags = 0x2 | (! (pte_val(new_pte) & _PAGE_RW)); + rflags = 0x2 | (! 
(pte_val(new_pte) & _PAGE_RW)); /* _PAGE_EXEC -> HW_NO_EXEC since it's inverted */ - hpteflags |= ((pte_val(new_pte) & _PAGE_EXEC) ? 0 : HW_NO_EXEC); + rflags |= ((pte_val(new_pte) & _PAGE_EXEC) ? 0 : HW_NO_EXEC); /* Check if pte already has an hpte (case 2) */ if (unlikely(pte_val(old_pte) & _PAGE_HASHPTE)) { @@ -641,7 +641,7 @@ slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; slot += (pte_val(old_pte) & _PAGE_GROUP_IX) >> 12; - if (ppc_md.hpte_updatepp(slot, hpteflags, va, 1, local) == -1) + if (ppc_md.hpte_updatepp(slot, rflags, va, 1, local) == -1) pte_val(old_pte) &= ~_PAGE_HPTEFLAGS; } @@ -661,10 +661,10 @@ /* Add in WIMG bits */ /* XXX We should store these in the pte */ - hpteflags |= _PAGE_COHERENT; + rflags |= _PAGE_COHERENT; - slot = ppc_md.hpte_insert(hpte_group, va, prpn, 0, - hpteflags, 0, 1); + slot = ppc_md.hpte_insert(hpte_group, va, prpn, + HPTE_V_LARGE, rflags); /* Primary is full, try the secondary */ if (unlikely(slot == -1)) { @@ -672,7 +672,7 @@ hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL; slot = ppc_md.hpte_insert(hpte_group, va, prpn, - 1, hpteflags, 0, 1); + HPTE_V_LARGE, rflags); if (slot == -1) { if (mftb() & 0x1) hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL; Index: working-2.6/arch/ppc64/mm/init.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/init.c 2005-07-06 10:30:13.000000000 +1000 +++ working-2.6/arch/ppc64/mm/init.c 2005-07-08 13:49:35.000000000 +1000 @@ -180,9 +180,10 @@ hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP); /* Panic if a pte grpup is full */ - if (ppc_md.hpte_insert(hpteg, va, pa >> PAGE_SHIFT, 0, - _PAGE_NO_CACHE|_PAGE_GUARDED|PP_RWXX, - 1, 0) == -1) { + if (ppc_md.hpte_insert(hpteg, va, pa >> PAGE_SHIFT, + HPTE_V_BOLTED, + _PAGE_NO_CACHE|_PAGE_GUARDED|PP_RWXX) + == -1) { panic("map_io_page: could not insert mapping"); } } Index: working-2.6/include/asm-ppc64/machdep.h 
=================================================================== --- working-2.6.orig/include/asm-ppc64/machdep.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/machdep.h 2005-07-08 13:06:14.000000000 +1000 @@ -53,10 +53,8 @@ long (*hpte_insert)(unsigned long hpte_group, unsigned long va, unsigned long prpn, - int secondary, - unsigned long hpteflags, - int bolted, - int large); + unsigned long vflags, + unsigned long rflags); long (*hpte_remove)(unsigned long hpte_group); void (*flush_hash_range)(unsigned long context, unsigned long number, Index: working-2.6/include/asm-ppc64/mmu.h =================================================================== --- working-2.6.orig/include/asm-ppc64/mmu.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/mmu.h 2005-07-08 13:48:40.000000000 +1000 @@ -60,6 +60,22 @@ #define HPTES_PER_GROUP 8 +#define HPTE_V_AVPN_SHIFT 7 +#define HPTE_V_AVPN ASM_CONST(0xffffffffffffff80) +#define HPTE_V_AVPN_VAL(x) (((x) & HPTE_V_AVPN) >> HPTE_V_AVPN_SHIFT) +#define HPTE_V_BOLTED ASM_CONST(0x0000000000000010) +#define HPTE_V_LOCK ASM_CONST(0x0000000000000008) +#define HPTE_V_LARGE ASM_CONST(0x0000000000000004) +#define HPTE_V_SECONDARY ASM_CONST(0x0000000000000002) +#define HPTE_V_VALID ASM_CONST(0x0000000000000001) + +#define HPTE_R_PP0 ASM_CONST(0x8000000000000000) +#define HPTE_R_TS ASM_CONST(0x4000000000000000) +#define HPTE_R_RPN_SHIFT 12 +#define HPTE_R_RPN ASM_CONST(0x3ffffffffffff000) +#define HPTE_R_FLAGS ASM_CONST(0x00000000000003ff) +#define HPTE_R_PP ASM_CONST(0x0000000000000003) + /* Values for PP (assumes Ks=0, Kp=1) */ /* pp0 will always be 0 for linux */ #define PP_RWXX 0 /* Supervisor read/write, User none */ @@ -69,54 +85,13 @@ #ifndef __ASSEMBLY__ -/* Hardware Page Table Entry */ -typedef struct { - unsigned long avpn:57; /* vsid | api == avpn */ - unsigned long : 2; /* Software use */ - unsigned long bolted: 1; /* HPTE is "bolted" */ - unsigned long lock: 1; /* lock on pSeries SMP 
*/ - unsigned long l: 1; /* Virtual page is large (L=1) or 4 KB (L=0) */ - unsigned long h: 1; /* Hash function identifier */ - unsigned long v: 1; /* Valid (v=1) or invalid (v=0) */ -} Hpte_dword0; - -typedef struct { - unsigned long pp0: 1; /* Page protection bit 0 */ - unsigned long ts: 1; /* Tag set bit */ - unsigned long rpn: 50; /* Real page number */ - unsigned long : 2; /* Reserved */ - unsigned long ac: 1; /* Address compare */ - unsigned long r: 1; /* Referenced */ - unsigned long c: 1; /* Changed */ - unsigned long w: 1; /* Write-thru cache mode */ - unsigned long i: 1; /* Cache inhibited */ - unsigned long m: 1; /* Memory coherence required */ - unsigned long g: 1; /* Guarded */ - unsigned long n: 1; /* No-execute */ - unsigned long pp: 2; /* Page protection bits 1:2 */ -} Hpte_dword1; - -typedef struct { - char padding[6]; /* padding */ - unsigned long : 6; /* padding */ - unsigned long flags: 10; /* HPTE flags */ -} Hpte_dword1_flags; - typedef struct { - union { - unsigned long dword0; - Hpte_dword0 dw0; - } dw0; - - union { - unsigned long dword1; - Hpte_dword1 dw1; - Hpte_dword1_flags flags; - } dw1; -} HPTE; + unsigned long v; + unsigned long r; +} hpte_t; -extern HPTE * htab_address; -extern unsigned long htab_hash_mask; +extern hpte_t *htab_address; +extern unsigned long htab_hash_mask; static inline unsigned long hpt_hash(unsigned long vpn, int large) { @@ -181,18 +156,18 @@ asm volatile("ptesync": : :"memory"); } -static inline unsigned long slot2va(unsigned long avpn, unsigned long large, - unsigned long secondary, unsigned long slot) +static inline unsigned long slot2va(unsigned long hpte_v, unsigned long slot) { + unsigned long avpn = HPTE_V_AVPN_VAL(hpte_v); unsigned long va; va = avpn << 23; - if (!large) { + if (! 
(hpte_v & HPTE_V_LARGE)) { unsigned long vpi, pteg; pteg = slot / HPTES_PER_GROUP; - if (secondary) + if (hpte_v & HPTE_V_SECONDARY) pteg = ~pteg; vpi = ((va >> 28) ^ pteg) & htab_hash_mask; @@ -219,11 +194,11 @@ extern long pSeries_lpar_hpte_insert(unsigned long hpte_group, unsigned long va, unsigned long prpn, - int secondary, unsigned long hpteflags, - int bolted, int large); + unsigned long vflags, + unsigned long rflags); extern long native_hpte_insert(unsigned long hpte_group, unsigned long va, - unsigned long prpn, int secondary, - unsigned long hpteflags, int bolted, int large); + unsigned long prpn, + unsigned long vflags, unsigned long rflags); #endif /* __ASSEMBLY__ */ Index: working-2.6/include/asm-ppc64/iSeries/HvCallHpt.h =================================================================== --- working-2.6.orig/include/asm-ppc64/iSeries/HvCallHpt.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/iSeries/HvCallHpt.h 2005-07-08 13:22:29.000000000 +1000 @@ -77,27 +77,26 @@ return compressedStatus; } -static inline u64 HvCallHpt_findValid(HPTE *hpte, u64 vpn) +static inline u64 HvCallHpt_findValid(hpte_t *hpte, u64 vpn) { return HvCall3Ret16(HvCallHptFindValid, hpte, vpn, 0, 0); } -static inline u64 HvCallHpt_findNextValid(HPTE *hpte, u32 hpteIndex, +static inline u64 HvCallHpt_findNextValid(hpte_t *hpte, u32 hpteIndex, u8 bitson, u8 bitsoff) { return HvCall3Ret16(HvCallHptFindNextValid, hpte, hpteIndex, bitson, bitsoff); } -static inline void HvCallHpt_get(HPTE *hpte, u32 hpteIndex) +static inline void HvCallHpt_get(hpte_t *hpte, u32 hpteIndex) { HvCall2Ret16(HvCallHptGet, hpte, hpteIndex, 0); } -static inline void HvCallHpt_addValidate(u32 hpteIndex, u32 hBit, HPTE *hpte) +static inline void HvCallHpt_addValidate(u32 hpteIndex, u32 hBit, hpte_t *hpte) { - HvCall4(HvCallHptAddValidate, hpteIndex, hBit, (*((u64 *)hpte)), - (*(((u64 *)hpte)+1))); + HvCall4(HvCallHptAddValidate, hpteIndex, hBit, hpte->v, hpte->r); } #endif /* 
_HVCALLHPT_H */ Index: working-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/iSeries_setup.c 2005-07-08 11:30:47.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/iSeries_setup.c 2005-07-08 15:58:23.000000000 +1000 @@ -503,7 +503,7 @@ /* Fill in the hashed page table hash mask */ num_ptegs = hptSizePages * - (PAGE_SIZE / (sizeof(HPTE) * HPTES_PER_GROUP)); + (PAGE_SIZE / (sizeof(hpte_t) * HPTES_PER_GROUP)); htab_hash_mask = num_ptegs - 1; /* @@ -618,25 +618,23 @@ static void iSeries_make_pte(unsigned long va, unsigned long pa, int mode) { - HPTE local_hpte, rhpte; + hpte_t local_hpte, rhpte; unsigned long hash, vpn; long slot; vpn = va >> PAGE_SHIFT; hash = hpt_hash(vpn, 0); - local_hpte.dw1.dword1 = pa | mode; - local_hpte.dw0.dword0 = 0; - local_hpte.dw0.dw0.avpn = va >> 23; - local_hpte.dw0.dw0.bolted = 1; /* bolted */ - local_hpte.dw0.dw0.v = 1; + local_hpte.r = pa | mode; + local_hpte.v = ((va >> 23) << HPTE_V_AVPN_SHIFT) + | HPTE_V_BOLTED | HPTE_V_VALID; slot = HvCallHpt_findValid(&rhpte, vpn); if (slot < 0) { /* Must find space in primary group */ panic("hash_page: hpte already exists\n"); } - HvCallHpt_addValidate(slot, 0, (HPTE *)&local_hpte ); + HvCallHpt_addValidate(slot, 0, &local_hpte); } /* @@ -646,7 +644,7 @@ { unsigned long pa; unsigned long mode_rw = _PAGE_ACCESSED | _PAGE_COHERENT | PP_RWXX; - HPTE hpte; + hpte_t hpte; for (pa = saddr; pa < eaddr ;pa += PAGE_SIZE) { unsigned long ea = (unsigned long)__va(pa); @@ -659,7 +657,7 @@ if (!in_kernel_text(ea)) mode_rw |= HW_NO_EXEC; - if (hpte.dw0.dw0.v) { + if (hpte.v & HPTE_V_VALID) { /* HPTE exists, so just bolt it */ HvCallHpt_setSwBits(slot, 0x10, 0); /* And make sure the pp bits are correct */ -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/people/dgibson

From sfr at canb.auug.org.au  Fri Jul  8 16:20:01 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Fri, 8 Jul 2005 16:20:01 +1000
Subject: [PPC64] Kill bitfields in ppc64 hash code
In-Reply-To: <20050708055855.GD30761@localhost.localdomain>
References: <20050708044653.GC30761@localhost.localdomain>
	<20050708055855.GD30761@localhost.localdomain>
Message-ID: <20050708162001.48a1f460.sfr@canb.auug.org.au>

On Fri, 8 Jul 2005 15:58:55 +1000 David Gibson wrote:
>
> On Fri, Jul 08, 2005 at 02:46:54PM +1000, David Gibson wrote:
> > Andrew, please apply:
>
> Ahem.  Or perhaps the version which builds on iSeries too.
>
> This patch removes the use of bitfield types from the ppc64 hash table
> manipulation code.
>
> Signed-off-by: David Gibson

Looks good to me for iSeries (built and booted).

Acked-by: Stephen Rothwell

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au  Fri Jul  8 17:04:54 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Fri, 8 Jul 2005 17:04:54 +1000
Subject: [PATCH] make dma_addr_t 64 bits
Message-ID: <20050708170454.3ac0c79e.sfr@canb.auug.org.au>

Hi all,

There has been a need expressed for dma_addr_t to be 64 bits on PPC64.
This patch does that. I have built it for pSeries and iSeries and booted a virtual only iSeries partition. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruN linus/include/asm-ppc64/scatterlist.h linus-dma64/include/asm-ppc64/scatterlist.h --- linus/include/asm-ppc64/scatterlist.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-dma64/include/asm-ppc64/scatterlist.h 2005-07-08 16:45:07.000000000 +1000 @@ -19,7 +19,7 @@ unsigned int length; /* For TCE support */ - u32 dma_address; + dma_addr_t dma_address; u32 dma_length; }; diff -ruN linus/include/asm-ppc64/types.h linus-dma64/include/asm-ppc64/types.h --- linus/include/asm-ppc64/types.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-dma64/include/asm-ppc64/types.h 2005-07-08 16:41:08.000000000 +1000 @@ -63,7 +63,7 @@ typedef __vector128 vector128; -typedef u32 dma_addr_t; +typedef u64 dma_addr_t; typedef u64 dma64_addr_t; typedef struct {
From seto.hidetoshi at jp.fujitsu.com Fri Jul 8 22:22:17 2005 From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto) Date: Fri, 08 Jul 2005 21:22:17 +0900 Subject: [PATCH 2.6.13-rc1 01/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <1120775239.31924.262.camel@gaston> References: <42CB63B2.6000505@jp.fujitsu.com> <20050707184102.GC14726@kroah.com> <1120775239.31924.262.camel@gaston> Message-ID: <42CE6FF9.4040505@jp.fujitsu.com> Benjamin Herrenschmidt wrote: > On Thu, 2005-07-07 at 11:41 -0700, Greg KH wrote: >>How about the issue of tying this into the other pci error reporting >>infrastructure that is being worked on? > > The other infrastructure is for asynchronous reporting and recovery. > We still need synchronous detection & reporting. So this is a bit > different. The interesting point is that it seems that both could use for reporting. > However, it would be nice if Hidetoshi's work could be adapted a bit so > that 1) naming is a bit more consistent with the other stuff (pcierr_* > maybe) and 2) the error "token" is the same. The later is especially > important if we start adding ways to query the error token to know what > the error precisely was etc... There is no reason to have 2 different > ways of representing error details. The naming doesn't really matter. iochk_* is just my preference. However it would be worth a try to move generic codes from iomap.* (historical home) to pci.* and rename iochk_* to pcierr_*. Well, I'd like to use this opportunity to sort out my thoughts... Now iochk_read() returns a boolean value, whether there was a error or not. Of course I agree that it would be more useful if it can return the detail of the error. A quick solution is return "token" instead of boolean value 1.
-extern int iochk_read(iocookie *cookie);
+extern token iochk_read(iocookie *cookie);

The token should be a pointer or a bitmask with the proper flags set,
never 0. So this still works:

    if (iochk_read(cookie))
        return -EIO;

and now this will also work:

    if ((token = iochk_read(cookie)) != 0) {
        switch (severity(token)) { ... }
    }

The error "token", tentatively named "pci_error_token" in the document you
posted, is for now defined in Linas's recent patch under another name:

    enum pci_channel_state {
        pci_channel_io_normal = 0,   /* I/O channel is in normal state */
        pci_channel_io_frozen = 1,   /* I/O to channel is blocked */
        pci_channel_io_perm_failure, /* pci card is dead */
    };

Of course this will not be enough in the near future.
I already agree with what you said a few months ago:

> The token should be an opaque type with accessors. You could define a
> pci_error_get_severity(token) to return the severity. The idea is to
> define accessors which return an error when the data requested isn't
> present in the error info. The actual content of the token is to be
> defined. I was thinking about a type plus a union. I was hoping Seto
> could provide something here ...

For example:

    /* offsets */
    enum {
        pcierr_io_frozen = 0,   /* I/O to channel is blocked */
        pcierr_io_perm_failure, /* pci card is dead */
        pcierr_severity_valid,  /* 1:valid */
        pcierr_severity,        /* 0:non-fatal 1:fatal */
    };

    #define pcierr_perm_failure(token) \
        test_bit(pcierr_io_perm_failure, token)

    int pcierr_get_severity(token)
    {
        if (test_bit(pcierr_severity_valid, token)) {
            return test_bit(pcierr_severity, token);
        }
        return -1;
    }

Objections?
Is there a better place than include/linux/pci.h for a group of
definitions like the above?

By the way, if the token says frozen but not perm_failure, does that mean
"recovery in progress"?
Are there any special states that synchronous detection has to report?
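The accessor idea above can be tried out in user space with a rough,
self-contained sketch like the following. It is only an illustration under
stated assumptions: the type name pcierr_token_t and the helper
token_test_bit() are made up here (the latter stands in for the kernel's
test_bit()), and the token is assumed to be a plain bitmask where 0 means
"no error".

```c
/* Bit offsets inside the token, following the enum sketched above. */
enum {
    pcierr_io_frozen = 0,   /* I/O to channel is blocked */
    pcierr_io_perm_failure, /* pci card is dead */
    pcierr_severity_valid,  /* 1: severity bit below is meaningful */
    pcierr_severity,        /* 0: non-fatal, 1: fatal */
};

/* Hypothetical token type: a bitmask, 0 meaning "no error seen". */
typedef unsigned long pcierr_token_t;

/* User-space stand-in for the kernel's test_bit(); returns 0 or 1. */
static int token_test_bit(int nr, pcierr_token_t token)
{
    return (int)((token >> nr) & 1UL);
}

/* Returns 1 if the token marks the card as permanently failed. */
static int pcierr_perm_failure(pcierr_token_t token)
{
    return token_test_bit(pcierr_io_perm_failure, token);
}

/*
 * Returns 0 (non-fatal) or 1 (fatal) when the severity is recorded,
 * and -1 when the token carries no severity information.
 */
static int pcierr_get_severity(pcierr_token_t token)
{
    if (token_test_bit(pcierr_severity_valid, token))
        return token_test_bit(pcierr_severity, token);
    return -1;
}
```

With this layout a caller can still write "if (token) return -EIO;" and
only consult the accessors when it wants the details; an accessor that
finds its "valid" bit clear reports -1 instead of guessing.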
Thanks, H.Seto From michael at ellerman.id.au Fri Jul 8 22:45:06 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Fri, 8 Jul 2005 22:45:06 +1000 Subject: Platform numbers Message-ID: <200507082245.11330.michael@ellerman.id.au> Hi y'all, Are the platform numbers in asm-ppc64/processor.h private to Linux? Or do firmware/hardware know about them? For pseries/pmac/maple it looks like we just assign them based on OF properties. For iSeries we check in the naca. I don't see where BPA gets assigned? The reason I ask is PLATFORM_MAPLE is currently 0x0500 which means we can't (easily) do any bit mask based trickery on platform numbers. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From jimix at watson.ibm.com Sat Jul 9 02:31:40 2005 From: jimix at watson.ibm.com (Jimi Xenidis) Date: Fri, 8 Jul 2005 12:31:40 -0400 Subject: Platform numbers In-Reply-To: <5bd4830d5a631d2e617753459d90c66b@us.ibm.com> References: <200507082245.11330.michael@ellerman.id.au> <5bd4830d5a631d2e617753459d90c66b@us.ibm.com> Message-ID: <17102.43628.650356.928630@kitch0.watson.ibm.com> >>>>> "HB" == Hollis Blanchard writes: HB> On Jul 8, 2005, at 7:45 AM, Michael Ellerman wrote: >> >> Are the platform numbers in asm-ppc64/processor.h private to Linux? Or >> do >> firmware/hardware know about them? AFAICT they are private to Linux. -JX -- "I got an idea, an idea so smart my head would explode if I even began to know what I was talking about."
-- Peter Griffin (Family Guy) From olof at lixom.net Sat Jul 9 02:52:35 2005 From: olof at lixom.net (Olof Johansson) Date: Fri, 8 Jul 2005 11:52:35 -0500 Subject: Platform numbers In-Reply-To: <200507082245.11330.michael@ellerman.id.au> References: <200507082245.11330.michael@ellerman.id.au> Message-ID: <20050708165235.GA12039@austin.ibm.com> On Fri, Jul 08, 2005 at 10:45:06PM +1000, Michael Ellerman wrote: > Hi y'all, > > Are the platform numbers in asm-ppc64/processor.h private to Linux? Or do > firmware/hardware know about them? > > For pseries/pmac/maple it looks like we just assign them based on OF > properties. For iSeries we check in the naca. I don't see where BPA gets > assigned? > > The reason I ask is PLATFORM_MAPLE is currently 0x0500 which means we can't > (easily) do any bit mask based trickery on platform numbers. They are just local to Linux and you can change them if you see a need to. Only thing to think about is the LPAR bit (0x1). Only challenge with going with bitmap is that the number of possible platforms are quite a bit lower, I'm not sure how quickly that will hurt us. We'll have other, bigger, headaches before we run out of platform bits anyway. -Olof From hollisb at us.ibm.com Sat Jul 9 02:01:28 2005 From: hollisb at us.ibm.com (Hollis Blanchard) Date: Fri, 8 Jul 2005 11:01:28 -0500 Subject: Platform numbers In-Reply-To: <200507082245.11330.michael@ellerman.id.au> References: <200507082245.11330.michael@ellerman.id.au> Message-ID: <5bd4830d5a631d2e617753459d90c66b@us.ibm.com> On Jul 8, 2005, at 7:45 AM, Michael Ellerman wrote: > > Are the platform numbers in asm-ppc64/processor.h private to Linux? Or > do > firmware/hardware know about them? > > For pseries/pmac/maple it looks like we just assign them based on OF > properties. For iSeries we check in the naca. I don't see where BPA > gets > assigned? 
> > The reason I ask is PLATFORM_MAPLE is currently 0x0500 which means we > can't > (easily) do any bit mask based trickery on platform numbers. What did you have in mind? The low bit is already used to indicate LPAR, which is something we're using for Xen/PPC development (i.e. #define PLATFORM_MAPLE_LPAR 0x0501). -- Hollis Blanchard IBM Linux Technology Center From olof at lixom.net Sat Jul 9 05:47:25 2005 From: olof at lixom.net (Olof Johansson) Date: Fri, 8 Jul 2005 14:47:25 -0500 Subject: [PATCH] make dma_addr_t 64 bits In-Reply-To: <20050708170454.3ac0c79e.sfr@canb.auug.org.au> References: <20050708170454.3ac0c79e.sfr@canb.auug.org.au> Message-ID: <20050708194725.GB12039@austin.ibm.com> On Fri, Jul 08, 2005 at 05:04:54PM +1000, Stephen Rothwell wrote: > Hi all, > > There has been a need expressed for dma_addr_t to be 64 bits on PPC64. > This patch does that. I have built it for pSeries and iSeries and booted > a virtual only iSeries partition. I've kicked it off across a range of machines here. JS20 seems happy, running on a few POWER4 and POWER5 machines right now. -Olof From arndb at onlinehome.de Sat Jul 9 06:01:13 2005 From: arndb at onlinehome.de (Arnd Bergmann ) Date: Fri, 08 Jul 2005 22:01:13 +0200 Subject: [PATCH] make dma_addr_t 64 bits Message-ID: <10982849.1120852873264.JavaMail.servlet@kundenserver> sfr at canb.auug.org.au wrote: >There has been a need expressed for dma_addr_t to be 64 bits on PPC64. >This patch does that. I have built it for pSeries and iSeries and booted >a virtual only iSeries partition. I know that this patch breaks our new spidernet driver for the Cell Blade, but it is trivially fixable (dma_addr_t was incorrectly used to describe part of a HW structure, needs to be u32 instead). Maybe you should check if other code is broken in a similar way. 
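To make the breakage concrete, here is a small stand-alone sketch of what
happens when dma_addr_t is used inside a structure the hardware reads. The
descriptor layouts and the helper descr_set_buf() are invented for
illustration and are not the real spidernet definitions; dma_addr_t is
typedef'd locally to the 64-bit width the patch introduces.

```c
#include <stddef.h>
#include <stdint.h>

/* dma_addr_t as it looks after the patch: 64 bits wide. */
typedef uint64_t dma_addr_t;

/*
 * Hypothetical descriptor that (incorrectly) uses dma_addr_t for a
 * field the device expects to be 32 bits. Once dma_addr_t grows,
 * every following field moves and the device reads garbage.
 */
struct hw_descr_wrong {
    dma_addr_t buf_addr;        /* was 4 bytes, now 8 */
    uint32_t   buf_size;        /* offset moves from 4 to 8 */
    uint32_t   next_descr_addr;
};

/* Same descriptor with the width the hardware defines spelled out. */
struct hw_descr_fixed {
    uint32_t buf_addr;
    uint32_t buf_size;
    uint32_t next_descr_addr;
};

/*
 * Store a DMA address into the fixed-width field. Returns 0 on
 * success, -1 if the address does not fit in the 32 bits available.
 */
static int descr_set_buf(struct hw_descr_fixed *d, dma_addr_t addr,
                         uint32_t len)
{
    if (addr > UINT32_MAX)
        return -1;
    d->buf_addr = (uint32_t)addr;
    d->buf_size = len;
    return 0;
}
```

A quick way to hunt for similar cases would be to look for dma_addr_t
members in structures that are handed to a device, as opposed to
driver-private bookkeeping, where the wider type is harmless.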
Arnd <>< From olh at suse.de Sat Jul 9 06:57:52 2005 From: olh at suse.de (Olaf Hering) Date: Fri, 8 Jul 2005 22:57:52 +0200 Subject: p620 hangs instantiating rtas at 0x00000000deadbeef In-Reply-To: <20050209150654.GA16640@suse.de> References: <20050209150654.GA16640@suse.de> Message-ID: <20050708205752.GA32069@suse.de> On Wed, Feb 09, Olaf Hering wrote: > > Current Linus tree hangs on p620, xmon does not trigger. > rc3 was already broken. > And 2.6.10 doesnt work either... linux-2.6.9-rc2-bk9 works, linux-2.6.9-rc2-bk10, see attached logs. I got a little bit further with a prom_claim patch: BOOTP S = 1 FILE: orange Load Addr=0x4000 Max Size=0xbfc000 FINAL Packet Count = 2817 FINAL File Size = 1441999 bytes. zImage starting: loaded at 0x400000 Allocating 0x6ca000 bytes for kernel ... gunzipping (0x2100000 <- 0x407000:0x54ed0a)...done 0x3b0068 bytes 0xd5d8 bytes of heap consumed, max in use 0xa264 OF stdout device is: /pci at fff7f09000/isa at 10/serial at i3f8 klimit=0xc0000000005ca000 offset=0xbffffffffdef0000 command line: root_addr_cells: 0000000000000002 root_size_cells: 0000000000000002 scanning memory: node /memory at 0 : 0000000000000000 0000000100000000 memory layout at init: memory_limit : 0000000000000000 (16 MB aligned) alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 0000000100000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 Booting CPU hw index = 0x0000000000000000 Looking for displays found display : /pci at fff7f0a000/pci at b,4/display at 1, opening ... done starting prom_initialize_tce_table alloc_down(0000000000400000, 0000000000800000, (high)) DDD -> 00000000ff800000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000ff800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000 node = 0x0000000000cc7350 base = 0x00000000ff800000 size = 0x0000000000400000 opening PHB /pci at fff7f09000... 
done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000ff400000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000ff400000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b node = 0x0000000000cd8530 base = 0x00000000ff400000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000ff000000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000ff000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b,2 node = 0x0000000000cdc5c8 base = 0x00000000ff000000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b,2... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fec00000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fec00000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b,4 node = 0x0000000000ce0a58 base = 0x00000000fec00000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b,4... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fe800000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fe800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b,6 node = 0x0000000000ce4ee8 base = 0x00000000fe800000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b,6... done alloc_down(0000000000400000, 0000000000800000, (high)) DDD -> 00000000fe000000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fe000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000 node = 0x0000000000ce97b0 base = 0x00000000fe000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000... 
done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fdc00000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fdc00000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b node = 0x0000000000cec6f0 base = 0x00000000fdc00000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fd800000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fd800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b,2 node = 0x0000000000cf0b08 base = 0x00000000fd800000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,2... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fd400000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fd400000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b,4 node = 0x0000000000cf4f98 base = 0x00000000fd400000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,4... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fd000000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fd000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b,6 node = 0x0000000000cf9428 base = 0x00000000fd000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,6... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fcc00000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fcc00000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c node = 0x0000000000cfd8b8 base = 0x00000000fcc00000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c... 
done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fc800000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fc800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c,2 node = 0x0000000000d01d58 base = 0x00000000fc800000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,2... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fc400000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fc400000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c,4 node = 0x0000000000d061f8 base = 0x00000000fc400000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,4... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fc000000 alloc_bottom : 00000000026de000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fc000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c,6 node = 0x0000000000d0a698 base = 0x00000000fc000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,6... done ending prom_initialize_tce_table prom_instantiate_rtas: start... rtas_node: 0000000000cb5020 alloc_down(00000000000a7000, 0000000000001000, (low)) trying: 0x000000003ff59000 trying: 0x000000003fe59000 DDD -> 000000003fe59000 alloc_bottom : 00000000026de000 alloc_top : 000000003fe59000 alloc_top_hi : 00000000fc000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 instantiating rtas at 0x000000003fe59000 ... done rtas base = 0x000000003fe59000 rtas entry = 0x000000003fe59900 rtas size = 0x00000000000a7000 prom_instantiate_rtas: end... prom_hold_cpus: start... 
1) spinloop = 0x0000000000000008 1) *spinloop = 0x0000000000000000 1) acknowledge = 0x0000000000000010 1) *acknowledge = 0x0000000000000000 1) secondary_hold = 0x0000000000000060 cpuid = 0x0000000000000000 cpu hw idx = 0x0000000000000000 0000000000000000 : boot cpu 0000000000000000 cpuid = 0x0000000000000001 cpu hw idx = 0x0000000000000002 0000000000000001 : starting cpu hw idx 0000000000000002... done cpuid = 0x0000000000000002 cpu hw idx = 0x0000000000000004 0000000000000002 : starting cpu hw idx 0000000000000004... done cpuid = 0x0000000000000003 cpu hw idx = 0x0000000000000006 0000000000000003 : starting cpu hw idx 0000000000000006... done prom_hold_cpus: end... copying OF device tree ... starting device tree allocs at 00000000026de000 alloc_up(0000000000100000, 0000000000001000) trying: 0x00000000026de000 trying: 0x00000000027de000 UUU -> 00000000027de000 alloc_bottom : 00000000027de000 alloc_top : 000000003fe59000 alloc_top_hi : 00000000fc000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 Building dt strings... Building dt structure... reserved memory map: 00000000fc000000 - 0000000004000000 000000003fe59000 - 00000000000a7000 00000000027de000 - 0000000000012000 Device tree strings 0x00000000027df000 -> 0x00000000027e01cd Device tree struct 0x00000000027e1000 -> 0x00000000027f0000 Calling quiesce ... returning from prom_init ->dt_header_start=0x00000000027de000 ->phys=0x0000000002110000 Hello World ! 
<- pSeries_init_early() -> finish_device_tree <- finish_device_tree firmware_features = 0x0 <- setup_system() -> smp_init_pSeries() <- smp_init_pSeries() phb0: IO 0x0 -> 0xfffff phb0: MEM 0xfe80000000 -> 0xfebfffffff phb0 io_base_phys 0xfeffe00000 io_base_virt 0xd000010000000000 phb1: IO 0x0 -> 0xfffff phb1: MEM 0xff00000000 -> 0xff3fffffff phb1 io_base_phys 0xfefff00000 io_base_virt 0xd000010000100000 Starting Linux PPC64 2.6.13-rc2 ----------------------------------------------------- ppc64_pft_size = 0x1a ppc64_debug_switch = 0x0 ppc64_interrupt_controller = 0x2 systemcfg = 0xc0000000002a4000 systemcfg->platform = 0x100 systemcfg->processorCount = 0x4 systemcfg->physicalMemorySize = 0x100000000 ppc64_caches.dcache_line_size = 0x80 ppc64_caches.icache_line_size = 0x80 htab_address = 0xc0000000f8000000 htab_hash_mask = 0x7ffff ----------------------------------------------------- [boot]0100 MM Init [boot]0100 MM Init Done Linux version 2.6.13-rc2 (olaf at pomegranate) (gcc version 3.3.3 (SuSE Linux)) #7 SMP Fri Jul 8 22:47:36 CEST 2005 [boot]0012 Setup Arch Top of RAM: 0x100000000, Total RAM: 0x100000000 Memory hole size: 0MB Syscall map setup, 226 32 bits and 200 64 bits syscalls No ramdisk, default root is /dev/sda2 EEH: No capable adapters found PPC64 nvram contains 262144 bytes Using default idle loop [boot]0015 Setup Done Built 1 zonelists Kernel command line: panic=1 [boot]0020 XICS Init Kernel panic - not syncing: map_io_page: could not insert mapping <3>Badness in smp_call_function at /home/olaf/kernel/olh/orange/linux-2.6.13-rc2-olh/arch/ppc64/kernel/smp.c:240 Call Trace: [c0000000002a3aa0] [c00000000033af30] 0xc00000000033af30 (unreliable) [c0000000002a3b50] [c000000000046edc] .panic+0x8c/0x1f8 [c0000000002a3bf0] [c000000000035c58] .__ioremap_com+0x218/0x2d4 [c0000000002a3cc0] [c000000000035efc] .__ioremap+0xd8/0x104 [c0000000002a3d60] [c000000000032d44] .xics_init_IRQ+0x37c/0x548 [c0000000002a3e50] [c0000000002732dc] .init_IRQ+0x6c/0x84 
[c0000000002a3ed0] [c00000000026a660] .start_kernel+0x148/0x2f0 [c0000000002a3f90] [c00000000000bd08] .__setup_cpu_power3+0x0/0x4 R -------------- next part -------------- arch/ppc64/kernel/bpa_iommu.c | 2 +- arch/ppc64/kernel/bpa_setup.c | 2 +- arch/ppc64/kernel/eeh.c | 2 +- arch/ppc64/kernel/iSeries_setup.c | 2 +- arch/ppc64/kernel/iSeries_smp.c | 2 +- arch/ppc64/kernel/lmb.c | 2 +- arch/ppc64/kernel/lparcfg.c | 2 +- arch/ppc64/kernel/maple_time.c | 2 +- arch/ppc64/kernel/module.c | 2 +- arch/ppc64/kernel/mpic.c | 2 +- arch/ppc64/kernel/nvram.c | 2 +- arch/ppc64/kernel/pSeries_setup.c | 2 +- arch/ppc64/kernel/pSeries_smp.c | 2 +- arch/ppc64/kernel/pci.c | 2 +- arch/ppc64/kernel/pmac_feature.c | 2 +- arch/ppc64/kernel/pmac_low_i2c.c | 2 +- arch/ppc64/kernel/pmac_setup.c | 2 +- arch/ppc64/kernel/pmac_smp.c | 2 +- arch/ppc64/kernel/pmac_time.c | 2 +- arch/ppc64/kernel/prom.c | 2 +- arch/ppc64/kernel/prom_init.c | 11 +++++++---- arch/ppc64/kernel/ras.c | 2 +- arch/ppc64/kernel/rtasd.c | 2 +- arch/ppc64/kernel/setup.c | 2 +- arch/ppc64/kernel/smp.c | 2 +- arch/ppc64/kernel/vdso.c | 2 +- 26 files changed, 32 insertions(+), 29 deletions(-) Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/bpa_iommu.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/bpa_iommu.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/bpa_iommu.c @@ -19,7 +19,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/bpa_setup.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/bpa_setup.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/bpa_setup.c @@ -12,7 +12,7 @@ * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later version. 
*/ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/eeh.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/eeh.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/eeh.c @@ -35,7 +35,7 @@ #include #include "pci.h" -#undef DEBUG +#define DEBUG /** Overview: * EEH, or "Extended Error Handling" is a PCI bridge technology for Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/iSeries_setup.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/iSeries_setup.c @@ -16,7 +16,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/iSeries_smp.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/iSeries_smp.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/iSeries_smp.c @@ -12,7 +12,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/lmb.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/lmb.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/lmb.c @@ -22,7 +22,7 @@ struct lmb lmb; -#undef DEBUG +#define DEBUG void lmb_dump_all(void) { Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/lparcfg.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/lparcfg.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/lparcfg.c @@ -39,7 +39,7 @@ #define MODULE_VERS "1.6" #define MODULE_NAME "lparcfg" -/* #define LPARCFG_DEBUG */ +#define LPARCFG_DEBUG /* find a better place for this function... 
*/ void log_plpar_hcall_return(unsigned long rc, char *tag) Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/maple_time.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/maple_time.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/maple_time.c @@ -11,7 +11,7 @@ * */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/module.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/module.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/module.c @@ -30,7 +30,7 @@ Using a magic allocator which places modules within 32MB solves this, and makes other things simpler. Anton? --RR. */ -#if 0 +#if 1 #define DEBUGP printk #else #define DEBUGP(fmt , ...) Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/mpic.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/mpic.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/mpic.c @@ -12,7 +12,7 @@ * for more details. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/nvram.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/nvram.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/nvram.c @@ -33,7 +33,7 @@ #include #include -#undef DEBUG_NVRAM +#define DEBUG_NVRAM static int nvram_scan_partitions(void); static int nvram_setup_partition(void); Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pSeries_setup.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pSeries_setup.c @@ -16,7 +16,7 @@ * bootup setup stuff.. 
*/ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pSeries_smp.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pSeries_smp.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pSeries_smp.c @@ -12,7 +12,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pci.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pci.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pci.c @@ -11,7 +11,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_feature.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pmac_feature.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_feature.c @@ -41,7 +41,7 @@ #include #include -#undef DEBUG_FEATURE +#define DEBUG_FEATURE #ifdef DEBUG_FEATURE #define DBG(fmt...) printk(KERN_DEBUG fmt) Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_low_i2c.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pmac_low_i2c.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_low_i2c.c @@ -16,7 +16,7 @@ * properties parser */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_setup.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pmac_setup.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_setup.c @@ -23,7 +23,7 @@ * bootup setup stuff.. 
*/ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_smp.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pmac_smp.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_smp.c @@ -22,7 +22,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_time.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/pmac_time.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/pmac_time.c @@ -32,7 +32,7 @@ #include #include -#undef DEBUG +#define DEBUG #ifdef DEBUG #define DBG(x...) printk(x) Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/prom.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/prom.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/prom.c @@ -15,7 +15,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/prom_init.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/prom_init.c @@ -15,7 +15,7 @@ * 2 of the License, or (at your option) any later version. 
*/ -#undef DEBUG_PROM +#define DEBUG_PROM #include #include @@ -257,9 +257,12 @@ static int __init call_prom(const char * static unsigned int __init prom_claim(unsigned long virt, unsigned long size, unsigned long align) { - return (unsigned int)call_prom("claim", 3, 1, + unsigned int ret = (unsigned int)call_prom("claim", 3, 1, (prom_arg_t)virt, (prom_arg_t)size, (prom_arg_t)align); + if ((unsigned int)0xdeadbeef == ret) + ret = PROM_ERROR; + return ret; } static void __init prom_print(const char *msg) @@ -665,7 +668,7 @@ static unsigned long __init alloc_up(uns return 0; RELOC(alloc_bottom) = addr; - prom_debug(" -> %x\n", addr); + prom_debug(" UUU -> %x\n", addr); prom_debug(" alloc_bottom : %x\n", RELOC(alloc_bottom)); prom_debug(" alloc_top : %x\n", RELOC(alloc_top)); prom_debug(" alloc_top_hi : %x\n", RELOC(alloc_top_high)); @@ -726,7 +729,7 @@ static unsigned long __init alloc_down(u RELOC(alloc_top) = addr; bail: - prom_debug(" -> %x\n", addr); + prom_debug(" DDD -> %x\n", addr); prom_debug(" alloc_bottom : %x\n", RELOC(alloc_bottom)); prom_debug(" alloc_top : %x\n", RELOC(alloc_top)); prom_debug(" alloc_top_hi : %x\n", RELOC(alloc_top_high)); Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/ras.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/ras.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/ras.c @@ -73,7 +73,7 @@ static irqreturn_t ras_epow_interrupt(in static irqreturn_t ras_error_interrupt(int irq, void *dev_id, struct pt_regs * regs); -/* #define DEBUG */ +#define DEBUG static void request_ras_irqs(struct device_node *np, char *propname, irqreturn_t (*handler)(int, void *, struct pt_regs *), Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/rtasd.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/rtasd.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/rtasd.c @@ -28,7 +28,7 @@ #include #include -#if 0 +#if 1 #define DEBUG(A...) 
printk(KERN_ERR A) #else #define DEBUG(A...) Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/setup.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/setup.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/setup.c @@ -10,7 +10,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/smp.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/smp.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/smp.c @@ -15,7 +15,7 @@ * 2 of the License, or (at your option) any later version. */ -#undef DEBUG +#define DEBUG #include #include Index: linux-2.6.13-rc2-olh/arch/ppc64/kernel/vdso.c =================================================================== --- linux-2.6.13-rc2-olh.orig/arch/ppc64/kernel/vdso.c +++ linux-2.6.13-rc2-olh/arch/ppc64/kernel/vdso.c @@ -36,7 +36,7 @@ #include #include -#undef DEBUG +#define DEBUG #ifdef DEBUG #define DBG(fmt...) printk(fmt) -------------- next part -------------- BOOTP S = 1 FILE: orange Load Addr=0x4000 Max Size=0xbfc000 FINAL Packet Count = 3306 FINAL File Size = 1692258 bytes. 
zImage starting: loaded at 0x400000 gunzipping (0x2100000 <- 0x407000:0x568ee6)...done 4374954 bytes 56392 bytes of heap consumed, max in use 42296 klimit=0xc0000000004a4000 offset=0xbffffffffdef0000 ->mem=0x00000000025b4000 birec_verify: r6=0x00000000025b4000 tag=0x0000000000001010 last=0x00000000025b4031 last_tag=0x0000000000001011 first=0x00000000025b4000 bi_recs=0x00000000025b4000 new mem=0x00000000025b403d Booting CPU hw index = 0x0000000000000000 prom_dump_lmb: memory.cnt = 0x0000000000000001 memory.size = 0x0000000100000000 memory.region[0x0000000000000000].base = 0x0000000000000000 .physbase = 0x0000000000000000 .size = 0x0000000100000000 reserved.cnt = 0x0000000000000001 reserved.size = 0x0000000000000000 reserved.region[0x0000000000000000 ].base = 0x0000000000000000 .physbase = 0x0000000000000000 .size = 0x0000000000000000 bi: 0x0000000000001010 bi: 0x0000000000001013 bi: 0x0000000000001016 Looking for displays OF stdout is : /pci at fff7f09000/isa at 10/serial at i3f8 found display : /pci at fff7f0a000/pci at b,4/display at 1 Opening displays... opening display : /pci at fff7f0a000/pci at b,4/display at 1... done prom_instantiate_rtas: start... instantiating rtas at 0x000000003ff59000... done rtas->base = 0x000000003ff59000 rtas->entry = 0x000000003ff59900 rtas->size = 0x00000000000a7000 prom_instantiate_rtas: end... prom_initialize_naca: start... systemcfg->processorCount = 0x0000000000000004 systemcfg->physicalMemorySize = 0x0000000100000000 naca->pftSize = 0x000000000000001a systemcfg->dCacheL1LineSize = 0x0000000000000080 opening PHB /pci at fff7f09000/pci at b,4... done TCE table: 0x0000000000000004 node = 0x0000000000ce4ee8 base = 0xc0000000fec00000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b,6... done TCE table: 0x0000000000000005 node = 0x0000000000ce97b0 base = 0xc0000000fe800000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000... 
done TCE table: 0x0000000000000006 node = 0x0000000000cec6f0 base = 0xc0000000fe400000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b... done TCE table: 0x0000000000000007 node = 0x0000000000cf0b08 base = 0xc0000000fe000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,2... done TCE table: 0x0000000000000008 node = 0x0000000000cf4f98 base = 0xc0000000fdc00000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,4... done TCE table: 0x0000000000000009 node = 0x0000000000cf9428 base = 0xc0000000fd800000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,6... done TCE table: 0x000000000000000a node = 0x0000000000cfd8b8 base = 0xc0000000fd400000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c... done TCE table: 0x000000000000000b node = 0x0000000000d01d58 base = 0xc0000000fd000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,2... done TCE table: 0x000000000000000c node = 0x0000000000d061f8 base = 0xc0000000fcc00000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,4... done TCE table: 0x000000000000000d node = 0x0000000000d0a698 base = 0xc0000000fc800000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,6... done ending prom_initialize_tce_table Calling quiesce ... returning from prom_init -------------- next part -------------- BOOTP S = 1 FILE: orange Load Addr=0x4000 Max Size=0xbfc000 FINAL Packet Count = 2633 FINAL File Size = 1347842 bytes. zImage starting: loaded at 0x400000 Allocating 0x5a8000 bytes for kernel ... trying: 0x01400000 trying: 0x01500000 trying: 0x01600000 trying: 0x01700000 trying: 0x01800000 trying: 0x01900000 trying: 0x01a00000 trying: 0x01b00000 trying: 0x01c00000 trying: 0x01d00000 trying: 0x01e00000 trying: 0x01f00000 trying: 0x02000000 trying: 0x02100000 gunzipping (0x2100000 <- 0x407000:0x537701)...done 0x388ca0 bytes 0xd054 bytes of heap consumed, max in use 0x% ... 
skipping 0x10000 bytes of ELF header kernel: entry addr = 0x2110000 a1 = 0x0, a2 = 0x0, prom = 0xc1e030, bi_recs = 0x0, OF stdout device is: /pci at fff7f09000/isa at 10/serial at i3f8 klimit=0xc0000000004a8000 offset=0xbffffffffdef0000 command line: root_addr_cells: 0000000000000002 root_size_cells: 0000000000000002 scanning memory: node /memory at 0 : 0000000000000000 0000000100000000 memory layout at init: alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 0000000100000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 Booting CPU hw index = 0x0000000000000000 Looking for displays found display : /pci at fff7f0a000/pci at b,4/display at 1, opening ... done starting prom_initialize_tce_table alloc_down(0000000000400000, 0000000000800000, (high)) DDD -> 00000000ff800000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000ff800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000 node = 0x0000000000cc7350 base = 0xc0000000ff800000 size = 0x0000000000400000 opening PHB /pci at fff7f09000... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000ff400000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000ff400000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b node = 0x0000000000cd8530 base = 0xc0000000ff400000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000ff000000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000ff000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b,2 node = 0x0000000000cdc5c8 base = 0xc0000000ff000000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b,2... 
done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fec00000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fec00000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b,4 node = 0x0000000000ce0a58 base = 0xc0000000fec00000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b,4... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fe800000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fe800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f09000/pci at b,6 node = 0x0000000000ce4ee8 base = 0xc0000000fe800000 size = 0x0000000000400000 opening PHB /pci at fff7f09000/pci at b,6... done alloc_down(0000000000400000, 0000000000800000, (high)) DDD -> 00000000fe000000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fe000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000 node = 0x0000000000ce97b0 base = 0xc0000000fe000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fdc00000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fdc00000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b node = 0x0000000000cec6f0 base = 0xc0000000fdc00000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fd800000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fd800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b,2 node = 0x0000000000cf0b08 base = 0xc0000000fd800000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,2... 
done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fd400000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fd400000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b,4 node = 0x0000000000cf4f98 base = 0xc0000000fd400000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,4... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fd000000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fd000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at b,6 node = 0x0000000000cf9428 base = 0xc0000000fd000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at b,6... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fcc00000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fcc00000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c node = 0x0000000000cfd8b8 base = 0xc0000000fcc00000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fc800000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fc800000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c,2 node = 0x0000000000d01d58 base = 0xc0000000fc800000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,2... done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fc400000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fc400000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c,4 node = 0x0000000000d061f8 base = 0xc0000000fc400000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,4... 
done alloc_down(0000000000400000, 0000000000400000, (high)) DDD -> 00000000fc000000 alloc_bottom : 00000000025bc000 alloc_top : 0000000040000000 alloc_top_hi : 00000000fc000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 TCE table: /pci at fff7f0a000/pci at c,6 node = 0x0000000000d0a698 base = 0xc0000000fc000000 size = 0x0000000000400000 opening PHB /pci at fff7f0a000/pci at c,6... done ending prom_initialize_tce_table prom_instantiate_rtas: start... alloc_down(00000000000a7000, 0000000000001000, (low)) trying: 0x000000003ff59000 DDD -> 00000000deadbeef alloc_bottom : 00000000025bc000 alloc_top : 00000000deadbeef alloc_top_hi : 00000000fc000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 instantiating rtas at 0x00000000deadbeef... failed prom_hold_cpus: start... 1) spinloop = 0x0000000000000008 1) *spinloop = 0x0000000000000000 1) acknowledge = 0x0000000000000010 1) *acknowledge = 0x0000000000000000 1) secondary_hold = 0x0000000000000060 cpuid = 0x0000000000000000 cpu hw idx = 0x0000000000000000 0000000000000000 : boot cpu 0000000000000000 cpuid = 0x0000000000000001 cpu hw idx = 0x0000000000000002 0000000000000001 : starting cpu hw idx 0000000000000002... done cpuid = 0x0000000000000002 cpu hw idx = 0x0000000000000004 0000000000000002 : starting cpu hw idx 0000000000000004... done cpuid = 0x0000000000000003 cpu hw idx = 0x0000000000000006 0000000000000003 : starting cpu hw idx 0000000000000006... done prom_hold_cpus: end... copying OF device tree ... starting device tree allocs at 00000000025bc000 alloc_up(0000000000100000, 0000000000001000) trying: 0x00000000025bc000 trying: 0x00000000026bc000 UUU -> 00000000026bc000 alloc_bottom : 00000000026bc000 alloc_top : 00000000deadbeef alloc_top_hi : 00000000fc000000 rmo_top : 0000000040000000 ram_top : 0000000100000000 Building dt strings... Building dt structure... 
reserved memory map: 0000000000000000 - 00000000004a8000 00000000fc000000 - 0000000004000000 00000000026bc000 - 0000000000012000 Device tree strings 0x00000000026bd000 -> 0x00000000026be1c7 Device tree struct 0x00000000026bf000 -> 0x00000000026ce000 Calling quiesce ... returning from prom_init ->dt_header_start=0x00000000026bc000 ->phys=0x0000000002110000 From paulus at samba.org Sat Jul 9 10:43:08 2005 From: paulus at samba.org (Paul Mackerras) Date: Sat, 9 Jul 2005 10:43:08 +1000 Subject: Platform numbers In-Reply-To: <200507082245.11330.michael@ellerman.id.au> References: <200507082245.11330.michael@ellerman.id.au> Message-ID: <17103.7580.905282.562351@cargo.ozlabs.ibm.com> Michael Ellerman writes: > The reason I ask is PLATFORM_MAPLE is currently 0x0500 which means we can't > (easily) do any bit mask based trickery on platform numbers. Yes, that's bogus. It should be 0x800, or better yet, we get rid of the platform numbers entirely - Ben H had some ideas about that. Paul. From sfr at canb.auug.org.au Sat Jul 9 11:51:41 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Sat, 9 Jul 2005 11:51:41 +1000 Subject: [PATCH] make dma_addr_t 64 bits In-Reply-To: <20050708194725.GB12039@austin.ibm.com> References: <20050708170454.3ac0c79e.sfr@canb.auug.org.au> <20050708194725.GB12039@austin.ibm.com> Message-ID: <20050709115141.38c0018c.sfr@canb.auug.org.au> On Fri, 8 Jul 2005 14:47:25 -0500 Olof Johansson wrote: > > On Fri, Jul 08, 2005 at 05:04:54PM +1000, Stephen Rothwell wrote: > > Hi all, > > > > There has been a need expressed for dma_addr_t to be 64 bits on PPC64. > > This patch does that. I have built it for pSeries and iSeries and booted > > a virtual only iSeries partition. > > I've kicked it off across a range of machines here. JS20 seems happy, > running on a few POWER4 and POWER5 machines right now. Great, thanks. 
-- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050709/8c34ab12/attachment.pgp From benh at kernel.crashing.org Sat Jul 9 13:45:23 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 09 Jul 2005 13:45:23 +1000 Subject: Platform numbers In-Reply-To: <20050708165235.GA12039@austin.ibm.com> References: <200507082245.11330.michael@ellerman.id.au> <20050708165235.GA12039@austin.ibm.com> Message-ID: <1120880723.31924.303.camel@gaston> On Fri, 2005-07-08 at 11:52 -0500, Olof Johansson wrote: > On Fri, Jul 08, 2005 at 10:45:06PM +1000, Michael Ellerman wrote: > > Hi y'all, > > > > Are the platform numbers in asm-ppc64/processor.h private to Linux? Or do > > firmware/hardware know about them? > > > > For pseries/pmac/maple it looks like we just assign them based on OF > > properties. For iSeries we check in the naca. I don't see where BPA gets > > assigned? > > > > The reason I ask is PLATFORM_MAPLE is currently 0x0500 which means we can't > > (easily) do any bit mask based trickery on platform numbers. > > They are just local to Linux and you can change them if you see a need > to. Only thing to think about is the LPAR bit (0x1). No, they are not local. Think about kexec, and firmwares that directly pass a flattened device-tree > Only challenge with going with bitmap is that the number of possible > platforms are quite a bit lower, I'm not sure how quickly that will hurt > us. We'll have other, bigger, headaches before we run out of platform > bits anyway. Yes, but I think we should kill the platform number. We should replace it with a "HV type" (native, iseries, rpa, xen, ...) and have the ppc_md.probe() function use the device-tree to identify the platform. 
There are bits & pieces here or there that will need to be fixed for that approach to work though. Like gross hacks in the interrupt tree parsing. Ben. From michael at ellerman.id.au Sat Jul 9 15:00:09 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sat, 9 Jul 2005 15:00:09 +1000 Subject: Platform numbers In-Reply-To: <1120880723.31924.303.camel@gaston> References: <200507082245.11330.michael@ellerman.id.au> <20050708165235.GA12039@austin.ibm.com> <1120880723.31924.303.camel@gaston> Message-ID: <200507091500.16075.michael@ellerman.id.au> On Sat, 9 Jul 2005 13:45, Benjamin Herrenschmidt wrote: > On Fri, 2005-07-08 at 11:52 -0500, Olof Johansson wrote: > > On Fri, Jul 08, 2005 at 10:45:06PM +1000, Michael Ellerman wrote: > > > Hi y'all, > > > > > > Are the platform numbers in asm-ppc64/processor.h private to Linux? Or > > > do firmware/hardware know about them? > > > > > > For pseries/pmac/maple it looks like we just assign them based on OF > > > properties. For iSeries we check in the naca. I don't see where BPA > > > gets assigned? > > > > > > The reason I ask is PLATFORM_MAPLE is currently 0x0500 which means we > > > can't (easily) do any bit mask based trickery on platform numbers. > > > > They are just local to Linux and you can change them if you see a need > > to. Only thing to think about is the LPAR bit (0x1). > > No, they are not local. Think about kexec, and firmwares that directly > pass a flattened device-tree Ok. But AIUI there isn't any firmware yet which does that? So we could change the MAPLE number now and get away with it? > > Only challenge with going with bitmap is that the number of possible > > platforms are quite a bit lower, I'm not sure how quickly that will hurt > > us. We'll have other, bigger, headaches before we run out of platform > > bits anyway. > > Yes, but I think we should kill the platform number. We should replace > it with a "HV type" (native, iseries, rpa, xen, ...) 
and have the > ppc_md.probe() function use the device-tree to identify the platform. > > There are bits & pieces here or there that will need to be fixed for > that approach to work though. Like gross hacks in the interrupt tree > parsing. We'd still have the problem that there's little bits of code outside of ppc_md functions which uses the platform number to work out which platform it's on. How would it check what platform it was on without the number? I also see we #define _machine (systemcfg->platform) for drivers? cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050709/d2c69984/attachment.pgp From benh at kernel.crashing.org Sat Jul 9 15:36:52 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 09 Jul 2005 15:36:52 +1000 Subject: Platform numbers In-Reply-To: <200507091500.16075.michael@ellerman.id.au> References: <200507082245.11330.michael@ellerman.id.au> <20050708165235.GA12039@austin.ibm.com> <1120880723.31924.303.camel@gaston> <200507091500.16075.michael@ellerman.id.au> Message-ID: <1120887413.31924.310.camel@gaston> > > No, they are not local. Think about kexec, and firmwares that directly > > pass a flattened device-tree > > Ok. But AIUI there isn't any firmware yet which does that? There is kexec already though, but changing Maple for now is probably ok. > So we could change > the MAPLE number now and get away with it? Yes. > > > Only challenge with going with bitmap is that the number of possible > > > platforms are quite a bit lower, I'm not sure how quickly that will hurt > > > us. 
We'll have other, bigger, headaches before we run out of platform > > > bits anyway. > > > > Yes, but I think we should kill the platform number. We should replace > > it with a "HV type" (native, iseries, rpa, xen, ...) and have the > > ppc_md.probe() function use the device-tree to identify the platform. > > > > There are bits & pieces here or there that will need to be fixed for > > that approach to work though. Like gross hacks in the interrupt tree > > parsing. > > We'd still have the problem that there's little bits of code outside of ppc_md > functions which uses the platform number to work out which platform it's on. > How would it check what platform it was on without the number? That is the "bits & pieces" I'm talking about :) In most cases, testing the platform number shouldn't be needed though. > I also see we #define _machine (systemcfg->platform) for drivers? Yes, for compatibility with ppc32 stuff that uses _machine, but again, it should be possible to fix all of that by proper use of the device-tree instead and/or adding things in ppc_md. Ben. From miltonm at bga.com Sun Jul 10 00:26:10 2005 From: miltonm at bga.com (Milton Miller) Date: Sat, 9 Jul 2005 09:26:10 -0500 Subject: Platform numbers Message-ID: Yes kexec prepares device trees however it only copies other trees, it doesn't generate them from scratch. milton From olh at suse.de Sun Jul 10 00:55:58 2005 From: olh at suse.de (Olaf Hering) Date: Sat, 9 Jul 2005 16:55:58 +0200 Subject: p620 hangs instantiating rtas at 0x00000000deadbeef In-Reply-To: <20050708205752.GA32069@suse.de> References: <20050209150654.GA16640@suse.de> <20050708205752.GA32069@suse.de> Message-ID: <20050709145558.GA11458@suse.de> On Fri, Jul 08, Olaf Hering wrote: > On Wed, Feb 09, Olaf Hering wrote: > > > > > Current Linus tree hangs on p620, xmon does not trigger. > > rc3 was already broken. > > And 2.6.10 doesnt work either... > > linux-2.6.9-rc2-bk9 works, linux-2.6.9-rc2-bk10, see attached logs. 
The truth is hidden in the changes between 2.6.10-bk11 and -bk12. http://penguinppc.org/~olaf/rs64-breakage/ Everything up to 2.6.9-rc2-bk1 worked ok, -rc2-bk2 was broken due to the vsid changes. This change fixed it, up to 2.6.9-rc2-bk9 http://ozlabs.org/pipermail/linuxppc64-dev/2004-September/002278.html Then prom_claim started to return 0xdeadbeef in -bk10. Workaround is to turn this value into PROM_ERROR. This helps up to 2.6.10-bk11, -bk12 is stuck after kicking other cpus. From olh at suse.de Sun Jul 10 03:03:01 2005 From: olh at suse.de (Olaf Hering) Date: Sat, 9 Jul 2005 19:03:01 +0200 Subject: p620 hangs instantiating rtas at 0x00000000deadbeef In-Reply-To: <20050709145558.GA11458@suse.de> References: <20050209150654.GA16640@suse.de> <20050708205752.GA32069@suse.de> <20050709145558.GA11458@suse.de> Message-ID: <20050709170301.GA11940@suse.de> On Sat, Jul 09, Olaf Hering wrote: > On Fri, Jul 08, Olaf Hering wrote: > > > On Wed, Feb 09, Olaf Hering wrote: > > > > > > > > Current Linus tree hangs on p620, xmon does not trigger. > > > rc3 was already broken. > > > And 2.6.10 doesn't work either... > > > > linux-2.6.9-rc2-bk9 works, linux-2.6.9-rc2-bk10, see attached logs. > > The truth is hidden in the changes between 2.6.10-bk11 and -bk12. > > http://penguinppc.org/~olaf/rs64-breakage/ this patch breaks RS64 systems: http://linux.bkbits.net:8080/linux-2.6/cset at 41e0534bAOSttXpLl_HqAfBWGcXKJA http://penguinppc.org/~olaf/rs64-breakage/ contains a series of patches, 'quilt push 19' will give a still working kernel, apply 2 more patches and it hangs. I should probably look at the 'Pre-POWER4' bits mentioned in the commit message.
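For reference, the prom_claim workaround discussed in this thread boils down to mapping the firmware's 0xdeadbeef return onto the ordinary error value before anything treats it as an address. Below is a minimal standalone sketch of that idea, not the kernel code itself. The names PROM_ERROR_SKETCH, CLAIM_POISON, sanitize_claim_result() and claim_succeeded() are made up for illustration, and the assumption that the error sentinel is -1 truncated to 32 bits should be checked against prom_init.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for the kernel's PROM_ERROR, truncated to 32 bits. */
#define PROM_ERROR_SKETCH ((uint32_t)-1)

/* Hypothetical name for the poison value seen in the failing boot log. */
#define CLAIM_POISON 0xdeadbeefu

/* Map the firmware's poison return onto the normal error value;
 * every other value is passed through unchanged. */
static uint32_t sanitize_claim_result(uint32_t ret)
{
    return (ret == CLAIM_POISON) ? PROM_ERROR_SKETCH : ret;
}

/* Nonzero when the sanitized claim result is usable as an address. */
static int claim_succeeded(uint32_t ret)
{
    return sanitize_claim_result(ret) != PROM_ERROR_SKETCH;
}
```

With a check like this in place, a caller such as alloc_down() would presumably treat the 0xdeadbeef case as a failed claim and move on to the next candidate address instead of recording the poison value as alloc_top.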
From olh at suse.de Mon Jul 11 05:35:15 2005 From: olh at suse.de (Olaf Hering) Date: Sun, 10 Jul 2005 19:35:15 +0000 Subject: [PATCH 7/82] remove linux/version.h include from arch/ppc64 In-Reply-To: <20050710193508.0.PmFpst2252.2247.olh@nectarine.suse.de> Message-ID: <20050710193515.7.NnOUBt2443.2247.olh@nectarine.suse.de> changing CONFIG_LOCALVERSION rebuilds too much, for no apparent reason. use system_utsname for progress and debug header Signed-off-by: Olaf Hering arch/ppc64/kernel/btext.c | 1 - arch/ppc64/kernel/pSeries_setup.c | 4 ++-- arch/ppc64/kernel/prom.c | 1 - arch/ppc64/kernel/prom_init.c | 1 - arch/ppc64/kernel/setup.c | 4 ++-- arch/ppc64/kernel/vio.c | 1 - 6 files changed, 4 insertions(+), 8 deletions(-) Index: linux-2.6.13-rc2-mm1/arch/ppc64/kernel/btext.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/arch/ppc64/kernel/btext.c +++ linux-2.6.13-rc2-mm1/arch/ppc64/kernel/btext.c @@ -7,7 +7,6 @@ #include #include #include -#include #include #include Index: linux-2.6.13-rc2-mm1/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/arch/ppc64/kernel/pSeries_setup.c +++ linux-2.6.13-rc2-mm1/arch/ppc64/kernel/pSeries_setup.c @@ -37,7 +37,7 @@ #include #include #include -#include +#include #include #include #include @@ -253,7 +253,7 @@ static int __init pSeries_init_panel(voi { /* Manually leave the kernel version on the panel.
*/ ppc_md.progress("Linux ppc64\n", 0); - ppc_md.progress(UTS_RELEASE, 0); + ppc_md.progress(system_utsname.version, 0); return 0; } Index: linux-2.6.13-rc2-mm1/arch/ppc64/kernel/prom.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/arch/ppc64/kernel/prom.c +++ linux-2.6.13-rc2-mm1/arch/ppc64/kernel/prom.c @@ -22,7 +22,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/arch/ppc64/kernel/prom_init.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/arch/ppc64/kernel/prom_init.c +++ linux-2.6.13-rc2-mm1/arch/ppc64/kernel/prom_init.c @@ -22,7 +22,6 @@ #include #include #include -#include #include #include #include Index: linux-2.6.13-rc2-mm1/arch/ppc64/kernel/setup.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/arch/ppc64/kernel/setup.c +++ linux-2.6.13-rc2-mm1/arch/ppc64/kernel/setup.c @@ -25,7 +25,7 @@ #include #include #include -#include +#include #include #include #include @@ -653,7 +653,7 @@ void __init setup_system(void) smp_release_cpus(); #endif /* defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) */ - printk("Starting Linux PPC64 %s\n", UTS_RELEASE); + printk("Starting Linux PPC64 %s\n", system_utsname.version); printk("-----------------------------------------------------\n"); printk("ppc64_pft_size = 0x%lx\n", ppc64_pft_size); Index: linux-2.6.13-rc2-mm1/arch/ppc64/kernel/vio.c =================================================================== --- linux-2.6.13-rc2-mm1.orig/arch/ppc64/kernel/vio.c +++ linux-2.6.13-rc2-mm1/arch/ppc64/kernel/vio.c @@ -14,7 +14,6 @@ #include #include -#include #include #include #include From moilanen at austin.ibm.com Tue Jul 12 00:09:37 2005 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Mon, 11 Jul 2005 09:09:37 -0500 Subject: [PATCH] make dma_addr_t 64 bits In-Reply-To: 
<10982849.1120852873264.JavaMail.servlet@kundenserver>
References: <10982849.1120852873264.JavaMail.servlet@kundenserver> Message-ID: <20050711090937.1c2d1444.moilanen@austin.ibm.com> > >There has been a need expressed for dma_addr_t to be 64 bits on PPC64. > >This patch does that. I have built it for pSeries and iSeries and booted > >a virtual only iSeries partition. > > I know that this patch breaks our new spidernet driver for the Cell Blade, > but it is trivially fixable (dma_addr_t was incorrectly used to describe > part of a HW structure, needs to be u32 instead). Maybe you should check > if other code is broken in a similar way. I wouldn't be surprised if there are a few more drivers that make assumptions on the dma_addr size. This probably needs a wide test base. Jake From michael at ellerman.id.au Tue Jul 12 17:07:30 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 12 Jul 2005 17:07:30 +1000 Subject: [RFC/PATCH] ppc64: Move ppc64_enable_pmcs() logic into a ppc_md function Message-ID: <200507121707.31061.michael@ellerman.id.au> This patch adds an enable_pmcs entry to the ppc_md structure, and moves code from arch/ppc64/kernel/sysfs.c into the various platform files. There should be no functional changes. The logic for pSeries seems a little confused. Are the platform check and the firmware_features check equivalent? We also call power4_enable_pmcs() unconditionally on pSeries. It seems to work on power3, but not power5. Maybe it should be called something else? And therefore probably not be in op_model_power4.c?
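To make the shape of the change easier to see outside the kernel tree, here is a rough userspace sketch of the dispatch pattern described above: a per-platform hook in a table of function pointers, called at most once and skipped when the platform leaves it empty. Everything in it is illustrative. struct machdep_calls_sketch, md_sketch, enable_pmcs_once() and fake_platform_enable_pmcs() are hypothetical names, and a plain static flag stands in for the per-cpu pmcs_enabled variable.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the one ppc_md member this patch adds. */
struct machdep_calls_sketch {
    void (*enable_pmcs)(void);  /* may be left NULL by a platform */
};

static struct machdep_calls_sketch md_sketch;

/* Stand-in for the per-cpu "already enabled" flag. */
static char pmcs_enabled_sketch;

/* Counts hook invocations so the once-only behaviour can be observed. */
static int enable_calls;

/* Hypothetical platform hook; a real one would program the hardware. */
static void fake_platform_enable_pmcs(void)
{
    enable_calls++;
}

/* Generic entry point: only the first call does any work, and the
 * platform hook is invoked only if one was registered. */
static void enable_pmcs_once(void)
{
    if (pmcs_enabled_sketch)
        return;
    pmcs_enabled_sketch = 1;

    if (md_sketch.enable_pmcs)
        md_sketch.enable_pmcs();
}
```

A platform that has nothing to do simply leaves the pointer NULL, which is how the "XXX Implement for iseries" stub can go away without an #ifdef in the generic code.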
Signed-off-by: Michael Ellerman arch/ppc64/kernel/iSeries_setup.c | 2 + arch/ppc64/kernel/pSeries_setup.c | 18 ++++++++++ arch/ppc64/kernel/pmac_setup.c | 6 +++ arch/ppc64/kernel/sysfs.c | 56 ++-------------------------------- arch/ppc64/oprofile/op_model_power4.c | 21 ++++++++++++ include/asm-ppc64/machdep.h | 5 +++ 6 files changed, 56 insertions(+), 52 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -946,6 +946,8 @@ void __init iSeries_early_setup(void) ppc_md.calibrate_decr = iSeries_calibrate_decr; ppc_md.progress = iSeries_progress; + /* XXX Implement enable_pmcs for iSeries */ + if (get_paca()->lppaca.shared_proc) { ppc_md.idle_loop = iseries_shared_idle; printk(KERN_INFO "Using shared processor idle loop\n"); Index: work/arch/ppc64/kernel/pSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/pSeries_setup.c +++ work/arch/ppc64/kernel/pSeries_setup.c @@ -574,6 +574,23 @@ static int pseries_shared_idle(void) return 0; } +static void pSeries_enable_pmcs(void) +{ + power4_enable_pmcs(); + + if (systemcfg->platform & PLATFORM_LPAR) { + unsigned long set, reset; + + set = 1UL << 63; + reset = 0; + plpar_hcall_norets(H_PERFMON, set, reset); + } + + /* instruct hypervisor to maintain PMCs */ + if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) + get_paca()->lppaca.pmcregs_in_use = 1; +} + struct machdep_calls __initdata pSeries_md = { .probe = pSeries_probe, .setup_arch = pSeries_setup_arch, @@ -595,4 +612,5 @@ struct machdep_calls __initdata pSeries_ .check_legacy_ioport = pSeries_check_legacy_ioport, .system_reset_exception = pSeries_system_reset_exception, .machine_check_exception = pSeries_machine_check_exception, + .enable_pmcs = pSeries_enable_pmcs, }; Index: work/arch/ppc64/kernel/pmac_setup.c 
=================================================================== --- work.orig/arch/ppc64/kernel/pmac_setup.c +++ work/arch/ppc64/kernel/pmac_setup.c @@ -489,6 +489,11 @@ static int __init pmac_probe(int platfor return 1; } +static void pmac_enable_pmcs(void) +{ + power4_enable_pmcs(); +} + struct machdep_calls __initdata pmac_md = { #ifdef CONFIG_HOTPLUG_CPU .cpu_die = generic_mach_cpu_die, @@ -511,4 +516,5 @@ struct machdep_calls __initdata pmac_md .progress = pmac_progress, .check_legacy_ioport = pmac_check_legacy_ioport, .idle_loop = native_idle, + .enable_pmcs = pmac_enable_pmcs, }; Index: work/arch/ppc64/kernel/sysfs.c =================================================================== --- work.orig/arch/ppc64/kernel/sysfs.c +++ work/arch/ppc64/kernel/sysfs.c @@ -100,6 +100,8 @@ static int __init setup_smt_snooze_delay } __setup("smt-snooze-delay=", setup_smt_snooze_delay); +#endif /* CONFIG_PPC_MULTIPLATFORM */ + /* * Enabling PMCs will slow partition context switch times so we only do * it the first time we write to the PMCs. 
@@ -109,65 +111,15 @@ static DEFINE_PER_CPU(char, pmcs_enabled void ppc64_enable_pmcs(void) { - unsigned long hid0; -#ifdef CONFIG_PPC_PSERIES - unsigned long set, reset; -#endif /* CONFIG_PPC_PSERIES */ - /* Only need to enable them once */ if (__get_cpu_var(pmcs_enabled)) return; __get_cpu_var(pmcs_enabled) = 1; - switch (systemcfg->platform) { - case PLATFORM_PSERIES: - case PLATFORM_POWERMAC: - hid0 = mfspr(HID0); - hid0 |= 1UL << (63 - 20); - - /* POWER4 requires the following sequence */ - asm volatile( - "sync\n" - "mtspr %1, %0\n" - "mfspr %0, %1\n" - "mfspr %0, %1\n" - "mfspr %0, %1\n" - "mfspr %0, %1\n" - "mfspr %0, %1\n" - "mfspr %0, %1\n" - "isync" : "=&r" (hid0) : "i" (HID0), "0" (hid0): - "memory"); - break; - -#ifdef CONFIG_PPC_PSERIES - case PLATFORM_PSERIES_LPAR: - set = 1UL << 63; - reset = 0; - plpar_hcall_norets(H_PERFMON, set, reset); - break; -#endif /* CONFIG_PPC_PSERIES */ - - default: - break; - } - -#ifdef CONFIG_PPC_PSERIES - /* instruct hypervisor to maintain PMCs */ - if (cur_cpu_spec->firmware_features & FW_FEATURE_SPLPAR) - get_paca()->lppaca.pmcregs_in_use = 1; -#endif /* CONFIG_PPC_PSERIES */ + if (ppc_md.enable_pmcs) + ppc_md.enable_pmcs(); } - -#else - -/* PMC stuff */ -void ppc64_enable_pmcs(void) -{ - /* XXX Implement for iseries */ -} -#endif /* CONFIG_PPC_MULTIPLATFORM */ - EXPORT_SYMBOL(ppc64_enable_pmcs); /* XXX convert to rusty's on_one_cpu */ Index: work/arch/ppc64/oprofile/op_model_power4.c =================================================================== --- work.orig/arch/ppc64/oprofile/op_model_power4.c +++ work/arch/ppc64/oprofile/op_model_power4.c @@ -83,6 +83,27 @@ static void power4_reg_setup(struct op_c mmcr0_val |= MMCR0_PROBLEM_DISABLE; } +void power4_enable_pmcs(void) +{ + unsigned long hid0; + + hid0 = mfspr(HID0); + hid0 |= 1UL << (63 - 20); + + /* POWER4 requires the following sequence */ + asm volatile( + "sync\n" + "mtspr %1, %0\n" + "mfspr %0, %1\n" + "mfspr %0, %1\n" + "mfspr %0, %1\n" + "mfspr %0, 
%1\n" + "mfspr %0, %1\n" + "mfspr %0, %1\n" + "isync" : "=&r" (hid0) : "i" (HID0), "0" (hid0): + "memory"); +} + extern void ppc64_enable_pmcs(void); static void power4_cpu_setup(void *unused) Index: work/include/asm-ppc64/machdep.h =================================================================== --- work.orig/include/asm-ppc64/machdep.h +++ work/include/asm-ppc64/machdep.h @@ -142,11 +142,16 @@ struct machdep_calls { /* Idle loop for this platform, leave empty for default idle loop */ int (*idle_loop)(void); + + /* Function to enable pmcs for this platform, called once per cpu. */ + void (*enable_pmcs)(void); }; extern int default_idle(void); extern int native_idle(void); +extern void power4_enable_pmcs(void); + extern struct machdep_calls ppc_md; extern char cmd_line[COMMAND_LINE_SIZE]; From sfr at canb.auug.org.au Tue Jul 12 17:36:55 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 12 Jul 2005 17:36:55 +1000 Subject: [PATCH 0/4] ppc64: split platform specific parts of vio.c out Message-ID: <20050712173655.387d5110.sfr@canb.auug.org.au> Hi all, This series of patches just splits the i/pSeries parts out of vio.c in order to allow some cleanups and to leave us with a more "generic" vio implementation that may be of use to some other architectures or platforms. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From sfr at canb.auug.org.au Tue Jul 12 17:40:17 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 12 Jul 2005 17:40:17 +1000 Subject: [PATCH 1/4] ppc64: split iSeries specific parts out of vio.c In-Reply-To: <20050712173655.387d5110.sfr@canb.auug.org.au> References: <20050712173655.387d5110.sfr@canb.auug.org.au> Message-ID: <20050712174017.19e26cc3.sfr@canb.auug.org.au> Hi all, This patch splits the iSeries specific parts out of vio.c. 
Signed-off-by: Stephen Rothwell --- arch/ppc64/kernel/Makefile | 4 - arch/ppc64/kernel/iSeries_vio.c | 133 ++++++++++++++++++++++++++++++++++++ arch/ppc64/kernel/vio.c | 147 ++++++---------------------------------- include/asm-ppc64/vio.h | 7 + 4 files changed, 166 insertions(+), 125 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus/arch/ppc64/kernel/Makefile linus-vio-init.1/arch/ppc64/kernel/Makefile --- linus/arch/ppc64/kernel/Makefile 2005-06-27 16:08:00.000000000 +1000 +++ linus-vio-init.1/arch/ppc64/kernel/Makefile 2005-06-27 18:00:35.000000000 +1000 @@ -50,7 +50,9 @@ obj-$(CONFIG_LPARCFG) += lparcfg.o obj-$(CONFIG_HVC_CONSOLE) += hvconsole.o obj-$(CONFIG_BOOTX_TEXT) += btext.o obj-$(CONFIG_HVCS) += hvcserver.o -obj-$(CONFIG_IBMVIO) += vio.o + +vio-obj-$(CONFIG_PPC_ISERIES) += iSeries_vio.o +obj-$(CONFIG_IBMVIO) += vio.o $(vio-obj-y) obj-$(CONFIG_XICS) += xics.o obj-$(CONFIG_MPIC) += mpic.o diff -ruNp linus/arch/ppc64/kernel/iSeries_vio.c linus-vio-init.1/arch/ppc64/kernel/iSeries_vio.c --- linus/arch/ppc64/kernel/iSeries_vio.c 1970-01-01 10:00:00.000000000 +1000 +++ linus-vio-init.1/arch/ppc64/kernel/iSeries_vio.c 2005-06-24 17:31:30.000000000 +1000 @@ -0,0 +1,133 @@ +/* + * IBM PowerPC iSeries Virtual I/O Infrastructure Support. + * + * Copyright (c) 2005 Stephen Rothwell, IBM Corp. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +struct device *iSeries_vio_dev = &vio_bus_device.dev; +EXPORT_SYMBOL(iSeries_vio_dev); + +static struct iommu_table veth_iommu_table; +static struct iommu_table vio_iommu_table; + +void __init iommu_vio_init(void) +{ + struct iommu_table *t; + struct iommu_table_cb cb; + unsigned long cbp; + unsigned long itc_entries; + + cb.itc_busno = 255; /* Bus 255 is the virtual bus */ + cb.itc_virtbus = 0xff; /* Ask for virtual bus */ + + cbp = virt_to_abs(&cb); + HvCallXm_getTceTableParms(cbp); + + itc_entries = cb.itc_size * PAGE_SIZE / sizeof(union tce_entry); + veth_iommu_table.it_size = itc_entries / 2; + veth_iommu_table.it_busno = cb.itc_busno; + veth_iommu_table.it_offset = cb.itc_offset; + veth_iommu_table.it_index = cb.itc_index; + veth_iommu_table.it_type = TCE_VB; + veth_iommu_table.it_blocksize = 1; + + t = iommu_init_table(&veth_iommu_table); + + if (!t) + printk("Virtual Bus VETH TCE table failed.\n"); + + vio_iommu_table.it_size = itc_entries - veth_iommu_table.it_size; + vio_iommu_table.it_busno = cb.itc_busno; + vio_iommu_table.it_offset = cb.itc_offset + + veth_iommu_table.it_size; + vio_iommu_table.it_index = cb.itc_index; + vio_iommu_table.it_type = TCE_VB; + vio_iommu_table.it_blocksize = 1; + + t = iommu_init_table(&vio_iommu_table); + + if (!t) + printk("Virtual Bus VIO TCE table failed.\n"); +} + +/** + * vio_register_device: - Register a new vio device. + * @voidev: The device to register. 
+ */ +static struct vio_dev *__init vio_register_device_iseries(char *type, + uint32_t unit_num) +{ + struct vio_dev *viodev; + + /* allocate a vio_dev for this node */ + viodev = kmalloc(sizeof(struct vio_dev), GFP_KERNEL); + if (!viodev) + return NULL; + memset(viodev, 0, sizeof(struct vio_dev)); + + snprintf(viodev->dev.bus_id, BUS_ID_SIZE, "%s%d", type, unit_num); + + return vio_register_device_common(viodev, viodev->dev.bus_id, type, + unit_num, &vio_iommu_table); +} + +void __init probe_bus_iseries(void) +{ + HvLpIndexMap vlan_map; + struct vio_dev *viodev; + int i; + + /* there is only one of each of these */ + vio_register_device_iseries("viocons", 0); + vio_register_device_iseries("vscsi", 0); + + vlan_map = HvLpConfig_getVirtualLanIndexMap(); + for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++) { + if ((vlan_map & (0x8000 >> i)) == 0) + continue; + viodev = vio_register_device_iseries("vlan", i); + /* veth is special and has it own iommu_table */ + viodev->iommu_table = &veth_iommu_table; + } + for (i = 0; i < HVMAXARCHITECTEDVIRTUALDISKS; i++) + vio_register_device_iseries("viodasd", i); + for (i = 0; i < HVMAXARCHITECTEDVIRTUALCDROMS; i++) + vio_register_device_iseries("viocd", i); + for (i = 0; i < HVMAXARCHITECTEDVIRTUALTAPES; i++) + vio_register_device_iseries("viotape", i); +} + +/** + * vio_bus_init_iseries: - Initialize the iSeries virtual IO bus + */ +static int __init vio_bus_init_iseries(void) +{ + int err; + + err = vio_bus_init(); + if (err == 0) { + vio_bus_device.iommu_table = &vio_iommu_table; + iSeries_vio_dev = &vio_bus_device.dev; + probe_bus_iseries(); + } + return err; +} + +__initcall(vio_bus_init_iseries); diff -ruNp linus/arch/ppc64/kernel/vio.c linus-vio-init.1/arch/ppc64/kernel/vio.c --- linus/arch/ppc64/kernel/vio.c 2005-06-27 16:08:00.000000000 +1000 +++ linus-vio-init.1/arch/ppc64/kernel/vio.c 2005-06-27 18:00:12.000000000 +1000 @@ -25,10 +25,6 @@ #include #include #include -#include -#include -#include -#include #define 
DBGENTER() pr_debug("%s entered\n", __FUNCTION__) @@ -41,26 +37,14 @@ static const struct vio_device_id *vio_m static struct iommu_table *vio_build_iommu_table(struct vio_dev *); static int vio_num_address_cells; #endif -#ifdef CONFIG_PPC_ISERIES -static struct iommu_table veth_iommu_table; -static struct iommu_table vio_iommu_table; -#endif -static struct vio_dev vio_bus_device = { /* fake "parent" device */ +struct vio_dev vio_bus_device = { /* fake "parent" device */ .name = vio_bus_device.dev.bus_id, .type = "", -#ifdef CONFIG_PPC_ISERIES - .iommu_table = &vio_iommu_table, -#endif .dev.bus_id = "vio", .dev.bus = &vio_bus_type, }; #ifdef CONFIG_PPC_ISERIES -static struct vio_dev *__init vio_register_device_iseries(char *type, - uint32_t unit_num); - -struct device *iSeries_vio_dev = &vio_bus_device.dev; -EXPORT_SYMBOL(iSeries_vio_dev); #define device_is_compatible(a, b) 1 @@ -157,48 +141,6 @@ static const struct vio_device_id * vio_ return NULL; } -#ifdef CONFIG_PPC_ISERIES -void __init iommu_vio_init(void) -{ - struct iommu_table *t; - struct iommu_table_cb cb; - unsigned long cbp; - unsigned long itc_entries; - - cb.itc_busno = 255; /* Bus 255 is the virtual bus */ - cb.itc_virtbus = 0xff; /* Ask for virtual bus */ - - cbp = virt_to_abs(&cb); - HvCallXm_getTceTableParms(cbp); - - itc_entries = cb.itc_size * PAGE_SIZE / sizeof(union tce_entry); - veth_iommu_table.it_size = itc_entries / 2; - veth_iommu_table.it_busno = cb.itc_busno; - veth_iommu_table.it_offset = cb.itc_offset; - veth_iommu_table.it_index = cb.itc_index; - veth_iommu_table.it_type = TCE_VB; - veth_iommu_table.it_blocksize = 1; - - t = iommu_init_table(&veth_iommu_table); - - if (!t) - printk("Virtual Bus VETH TCE table failed.\n"); - - vio_iommu_table.it_size = itc_entries - veth_iommu_table.it_size; - vio_iommu_table.it_busno = cb.itc_busno; - vio_iommu_table.it_offset = cb.itc_offset + - veth_iommu_table.it_size; - vio_iommu_table.it_index = cb.itc_index; - vio_iommu_table.it_type = TCE_VB; - 
vio_iommu_table.it_blocksize = 1; - - t = iommu_init_table(&vio_iommu_table); - - if (!t) - printk("Virtual Bus VIO TCE table failed.\n"); -} -#endif - #ifdef CONFIG_PPC_PSERIES static void probe_bus_pseries(void) { @@ -223,38 +165,10 @@ static void probe_bus_pseries(void) } #endif -#ifdef CONFIG_PPC_ISERIES -static void probe_bus_iseries(void) -{ - HvLpIndexMap vlan_map = HvLpConfig_getVirtualLanIndexMap(); - struct vio_dev *viodev; - int i; - - /* there is only one of each of these */ - vio_register_device_iseries("viocons", 0); - vio_register_device_iseries("vscsi", 0); - - vlan_map = HvLpConfig_getVirtualLanIndexMap(); - for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++) { - if ((vlan_map & (0x8000 >> i)) == 0) - continue; - viodev = vio_register_device_iseries("vlan", i); - /* veth is special and has it own iommu_table */ - viodev->iommu_table = &veth_iommu_table; - } - for (i = 0; i < HVMAXARCHITECTEDVIRTUALDISKS; i++) - vio_register_device_iseries("viodasd", i); - for (i = 0; i < HVMAXARCHITECTEDVIRTUALCDROMS; i++) - vio_register_device_iseries("viocd", i); - for (i = 0; i < HVMAXARCHITECTEDVIRTUALTAPES; i++) - vio_register_device_iseries("viotape", i); -} -#endif - /** * vio_bus_init: - Initialize the virtual IO bus */ -static int __init vio_bus_init(void) +int __init vio_bus_init(void) { int err; @@ -264,25 +178,35 @@ static int __init vio_bus_init(void) return err; } - /* the fake parent of all vio devices, just to give us a nice directory */ + /* the fake parent of all vio devices, just to give us + * a nice directory + */ err = device_register(&vio_bus_device.dev); if (err) { - printk(KERN_WARNING "%s: device_register returned %i\n", __FUNCTION__, - err); + printk(KERN_WARNING "%s: device_register returned %i\n", + __FUNCTION__, err); return err; } + return 0; +} + #ifdef CONFIG_PPC_PSERIES - probe_bus_pseries(); -#endif -#ifdef CONFIG_PPC_ISERIES - probe_bus_iseries(); -#endif +/** + * vio_bus_init_pseries: - Initialize the pSeries virtual IO bus + */ 
+static int __init vio_bus_init_pseries(void) +{ + int err; - return 0; + err = vio_bus_init(); + if (err == 0) + probe_bus_pseries(); + return err; } -__initcall(vio_bus_init); +__initcall(vio_bus_init_pseries); +#endif /* vio_dev refcount hit 0 */ static void __devinit vio_dev_release(struct device *dev) @@ -312,7 +236,7 @@ static ssize_t viodev_show_name(struct d } DEVICE_ATTR(name, S_IRUSR | S_IRGRP | S_IROTH, viodev_show_name, NULL); -static struct vio_dev * __devinit vio_register_device_common( +struct vio_dev * __devinit vio_register_device_common( struct vio_dev *viodev, char *name, char *type, uint32_t unit_address, struct iommu_table *iommu_table) { @@ -408,31 +332,6 @@ struct vio_dev * __devinit vio_register_ EXPORT_SYMBOL(vio_register_device_node); #endif -#ifdef CONFIG_PPC_ISERIES -/** - * vio_register_device: - Register a new vio device. - * @voidev: The device to register. - */ -static struct vio_dev *__init vio_register_device_iseries(char *type, - uint32_t unit_num) -{ - struct vio_dev *viodev; - - DBGENTER(); - - /* allocate a vio_dev for this node */ - viodev = kmalloc(sizeof(struct vio_dev), GFP_KERNEL); - if (!viodev) - return NULL; - memset(viodev, 0, sizeof(struct vio_dev)); - - snprintf(viodev->dev.bus_id, BUS_ID_SIZE, "%s%d", type, unit_num); - - return vio_register_device_common(viodev, viodev->dev.bus_id, type, - unit_num, &vio_iommu_table); -} -#endif - void __devinit vio_unregister_device(struct vio_dev *viodev) { DBGENTER(); diff -ruNp linus/include/asm-ppc64/vio.h linus-vio-init.1/include/asm-ppc64/vio.h --- linus/include/asm-ppc64/vio.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-vio-init.1/include/asm-ppc64/vio.h 2005-06-27 18:00:12.000000000 +1000 @@ -56,6 +56,9 @@ const void * vio_get_attribute(struct vi int vio_get_irq(struct vio_dev *dev); int vio_enable_interrupts(struct vio_dev *dev); int vio_disable_interrupts(struct vio_dev *dev); +extern struct vio_dev * __devinit vio_register_device_common( + struct vio_dev *viodev, char 
*name, char *type, + uint32_t unit_address, struct iommu_table *iommu_table); extern struct dma_mapping_ops vio_dma_ops; @@ -95,9 +98,13 @@ struct vio_dev { struct device dev; }; +extern struct vio_dev vio_bus_device; + static inline struct vio_dev *to_vio_dev(struct device *dev) { return container_of(dev, struct vio_dev, dev); } +extern int vio_bus_init(void); + #endif /* _ASM_VIO_H */ From sfr at canb.auug.org.au Tue Jul 12 17:42:49 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 12 Jul 2005 17:42:49 +1000 Subject: [PATCH 2/4] ppc64: move iSeries vio iommu init In-Reply-To: <20050712173655.387d5110.sfr@canb.auug.org.au> References: <20050712173655.387d5110.sfr@canb.auug.org.au> Message-ID: <20050712174249.26fc8721.sfr@canb.auug.org.au> Hi all, Since the iSeries vio iommu tables cannot be used until after the vio bus has been initialised, move the initialisation of the tables to there. Signed-off-by: Stephen Rothwell --- arch/ppc64/kernel/iSeries_vio.c | 3 ++- arch/ppc64/mm/init.c | 3 --- include/asm-ppc64/iommu.h | 3 --- 3 files changed, 2 insertions(+), 7 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-vio-init.1/arch/ppc64/kernel/iSeries_vio.c linus-vio-init.2/arch/ppc64/kernel/iSeries_vio.c --- linus-vio-init.1/arch/ppc64/kernel/iSeries_vio.c 2005-06-24 17:31:30.000000000 +1000 +++ linus-vio-init.2/arch/ppc64/kernel/iSeries_vio.c 2005-06-26 06:23:28.000000000 +1000 @@ -27,7 +27,7 @@ EXPORT_SYMBOL(iSeries_vio_dev); static struct iommu_table veth_iommu_table; static struct iommu_table vio_iommu_table; -void __init iommu_vio_init(void) +static void __init iommu_vio_init(void) { struct iommu_table *t; struct iommu_table_cb cb; @@ -123,6 +123,7 @@ static int __init vio_bus_init_iseries(v err = vio_bus_init(); if (err == 0) { + iommu_vio_init(); vio_bus_device.iommu_table = &vio_iommu_table; iSeries_vio_dev = &vio_bus_device.dev; probe_bus_iseries(); diff -ruNp 
linus-vio-init.1/arch/ppc64/mm/init.c linus-vio-init.2/arch/ppc64/mm/init.c --- linus-vio-init.1/arch/ppc64/mm/init.c 2005-06-27 16:08:00.000000000 +1000 +++ linus-vio-init.2/arch/ppc64/mm/init.c 2005-06-27 18:01:17.000000000 +1000 @@ -685,9 +685,6 @@ void __init mem_init(void) mem_init_done = 1; -#ifdef CONFIG_PPC_ISERIES - iommu_vio_init(); -#endif /* Initialize the vDSO */ vdso_init(); } diff -ruNp linus-vio-init.1/include/asm-ppc64/iommu.h linus-vio-init.2/include/asm-ppc64/iommu.h --- linus-vio-init.1/include/asm-ppc64/iommu.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-vio-init.2/include/asm-ppc64/iommu.h 2005-06-27 18:01:17.000000000 +1000 @@ -104,9 +104,6 @@ extern void iommu_devnode_init_pSeries(s #ifdef CONFIG_PPC_ISERIES -/* Initializes tables for bio buses */ -extern void __init iommu_vio_init(void); - struct iSeries_Device_Node; /* Creates table for an individual device node */ extern void iommu_devnode_init_iSeries(struct iSeries_Device_Node *dn); From sfr at canb.auug.org.au Tue Jul 12 17:45:27 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 12 Jul 2005 17:45:27 +1000 Subject: [PATCH 3/4] ppc64: make the bus matching function platform specific In-Reply-To: <20050712173655.387d5110.sfr@canb.auug.org.au> References: <20050712173655.387d5110.sfr@canb.auug.org.au> Message-ID: <20050712174527.0a349cb6.sfr@canb.auug.org.au> Hi all, This patch allows us to have a different bus matching function for each platform.
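(Illustration only, not part of the patch: a minimal user-space sketch of the callback pattern described above, in which the generic bus code stores a match function supplied at init time and consults it for every id table entry. All demo_* names and both structures are hypothetical stand-ins, not the kernel's definitions, and the plain strcmp() on the compat string is only an assumed placeholder for the kernel's device_is_compatible().)

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for vio_device_id and vio_dev. */
struct demo_device_id {
	const char *type;
	const char *compat;
};

struct demo_dev {
	const char *type;
	const char *compat;
};

typedef int (*demo_match_fn)(const struct demo_device_id *id,
			     const struct demo_dev *dev);

/* The generic code keeps only a pointer to the platform's matcher. */
static demo_match_fn demo_is_match;

/* Each platform passes its own matcher when it initialises the bus. */
void demo_bus_init(demo_match_fn match_func)
{
	demo_is_match = match_func;
}

/* One platform's rule: the device type must start with the id's type. */
int demo_match_type_only(const struct demo_device_id *id,
			 const struct demo_dev *dev)
{
	return strncmp(dev->type, id->type, strlen(id->type)) == 0;
}

/* Another platform's rule: type prefix plus an (assumed) compat check. */
int demo_match_type_and_compat(const struct demo_device_id *id,
			       const struct demo_dev *dev)
{
	return strncmp(dev->type, id->type, strlen(id->type)) == 0 &&
	       strcmp(dev->compat, id->compat) == 0;
}

/* Generic walk over a NULL-terminated id table using the stored matcher. */
const struct demo_device_id *
demo_match_device(const struct demo_device_id *ids,
		  const struct demo_dev *dev)
{
	if (demo_is_match == NULL)
		return NULL;

	for (; ids->type != NULL; ids++) {
		if (demo_is_match(ids, dev))
			return ids;
	}
	return NULL;
}
```

With this shape the generic table walk needs no platform #ifdefs; a platform calls demo_bus_init() once with its matcher, and demo_match_device() behaves accordingly from then on.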
Signed-off-by: Stephen Rothwell --- arch/ppc64/kernel/iSeries_vio.c | 12 +++++++++++- arch/ppc64/kernel/vio.c | 28 +++++++++++++++++++--------- include/asm-ppc64/vio.h | 3 ++- 3 files changed, 32 insertions(+), 11 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-vio-init.2/arch/ppc64/kernel/iSeries_vio.c linus-vio-init.3/arch/ppc64/kernel/iSeries_vio.c --- linus-vio-init.2/arch/ppc64/kernel/iSeries_vio.c 2005-06-26 06:23:28.000000000 +1000 +++ linus-vio-init.3/arch/ppc64/kernel/iSeries_vio.c 2005-06-26 08:08:55.000000000 +1000 @@ -115,13 +115,23 @@ void __init probe_bus_iseries(void) } /** + * vio_match_device_iseries: - Tell if a iSeries VIO device matches a + * vio_device_id + */ +static int vio_match_device_iseries(const struct vio_device_id *id, + const struct vio_dev *dev) +{ + return strncmp(dev->type, id->type, strlen(id->type)) == 0; +} + +/** * vio_bus_init_iseries: - Initialize the iSeries virtual IO bus */ static int __init vio_bus_init_iseries(void) { int err; - err = vio_bus_init(); + err = vio_bus_init(vio_match_device_iseries); if (err == 0) { iommu_vio_init(); vio_bus_device.iommu_table = &vio_iommu_table; diff -ruNp linus-vio-init.2/arch/ppc64/kernel/vio.c linus-vio-init.3/arch/ppc64/kernel/vio.c --- linus-vio-init.2/arch/ppc64/kernel/vio.c 2005-06-27 18:01:17.000000000 +1000 +++ linus-vio-init.3/arch/ppc64/kernel/vio.c 2005-06-27 16:44:25.000000000 +1000 @@ -44,11 +44,8 @@ struct vio_dev vio_bus_device = { /* fa .dev.bus = &vio_bus_type, }; -#ifdef CONFIG_PPC_ISERIES - -#define device_is_compatible(a, b) 1 - -#endif +static int (*is_match)(const struct vio_device_id *id, + const struct vio_dev *dev); /* convert from struct device to struct vio_dev and pass to driver. 
* dev->driver has already been set by generic code because vio_bus_match @@ -133,8 +130,7 @@ static const struct vio_device_id * vio_ DBGENTER(); while (ids->type) { - if ((strncmp(dev->type, ids->type, strlen(ids->type)) == 0) && - device_is_compatible(dev->dev.platform_data, ids->compat)) + if (is_match(ids, dev)) return ids; ids++; } @@ -168,10 +164,13 @@ static void probe_bus_pseries(void) /** * vio_bus_init: - Initialize the virtual IO bus */ -int __init vio_bus_init(void) +int __init vio_bus_init(int (*match_func)(const struct vio_device_id *id, + const struct vio_dev *dev)) { int err; + is_match = match_func; + err = bus_register(&vio_bus_type); if (err) { printk(KERN_ERR "failed to register VIO bus\n"); @@ -193,13 +192,24 @@ int __init vio_bus_init(void) #ifdef CONFIG_PPC_PSERIES /** + * vio_match_device_pseries: - Tell if a pSeries VIO device matches a + * vio_device_id + */ +static int vio_match_device_pseries(const struct vio_device_id *id, + const struct vio_dev *dev) +{ + return (strncmp(dev->type, id->type, strlen(id->type)) == 0) && + device_is_compatible(dev->dev.platform_data, id->compat); +} + +/** * vio_bus_init_pseries: - Initialize the pSeries virtual IO bus */ static int __init vio_bus_init_pseries(void) { int err; - err = vio_bus_init(); + err = vio_bus_init(vio_match_device_pseries); if (err == 0) probe_bus_pseries(); return err; diff -ruNp linus-vio-init.2/include/asm-ppc64/vio.h linus-vio-init.3/include/asm-ppc64/vio.h --- linus-vio-init.2/include/asm-ppc64/vio.h 2005-06-27 18:01:17.000000000 +1000 +++ linus-vio-init.3/include/asm-ppc64/vio.h 2005-06-27 16:49:40.000000000 +1000 @@ -105,6 +105,7 @@ static inline struct vio_dev *to_vio_dev return container_of(dev, struct vio_dev, dev); } -extern int vio_bus_init(void); +extern int vio_bus_init(int (*is_match)(const struct vio_device_id *id, + const struct vio_dev *dev)); #endif /* _ASM_VIO_H */ From sfr at canb.auug.org.au Tue Jul 12 17:50:26 2005 From: sfr at canb.auug.org.au (Stephen 
Rothwell) Date: Tue, 12 Jul 2005 17:50:26 +1000 Subject: [PATCH 4/4] ppc64: split pSeries specific parts out of vio.c In-Reply-To: <20050712173655.387d5110.sfr@canb.auug.org.au> References: <20050712173655.387d5110.sfr@canb.auug.org.au> Message-ID: <20050712175026.1652e11a.sfr@canb.auug.org.au> Hi all, This patch just splits out the pSeries specific parts of vio.c. Signed-off-by: Stephen Rothwell --- arch/ppc64/kernel/Makefile | 1 arch/ppc64/kernel/iSeries_vio.c | 2 arch/ppc64/kernel/pSeries_vio.c | 266 ++++++++++++++++++++++++++++++++++++ arch/ppc64/kernel/vio.c | 290 +--------------------------------------- include/asm-ppc64/vio.h | 4 5 files changed, 284 insertions(+), 279 deletions(-) -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ diff -ruNp linus-vio-init.3/arch/ppc64/kernel/Makefile linus-vio-init.4/arch/ppc64/kernel/Makefile --- linus-vio-init.3/arch/ppc64/kernel/Makefile 2005-06-27 18:01:47.000000000 +1000 +++ linus-vio-init.4/arch/ppc64/kernel/Makefile 2005-06-27 18:02:36.000000000 +1000 @@ -51,6 +51,7 @@ obj-$(CONFIG_HVC_CONSOLE) += hvconsole.o obj-$(CONFIG_BOOTX_TEXT) += btext.o obj-$(CONFIG_HVCS) += hvcserver.o +vio-obj-$(CONFIG_PPC_PSERIES) += pSeries_vio.o vio-obj-$(CONFIG_PPC_ISERIES) += iSeries_vio.o obj-$(CONFIG_IBMVIO) += vio.o $(vio-obj-y) obj-$(CONFIG_XICS) += xics.o diff -ruNp linus-vio-init.3/arch/ppc64/kernel/iSeries_vio.c linus-vio-init.4/arch/ppc64/kernel/iSeries_vio.c --- linus-vio-init.3/arch/ppc64/kernel/iSeries_vio.c 2005-06-26 08:08:55.000000000 +1000 +++ linus-vio-init.4/arch/ppc64/kernel/iSeries_vio.c 2005-06-26 09:07:02.000000000 +1000 @@ -131,7 +131,7 @@ static int __init vio_bus_init_iseries(v { int err; - err = vio_bus_init(vio_match_device_iseries); + err = vio_bus_init(vio_match_device_iseries, NULL, NULL); if (err == 0) { iommu_vio_init(); vio_bus_device.iommu_table = &vio_iommu_table; diff -ruNp linus-vio-init.3/arch/ppc64/kernel/pSeries_vio.c 
linus-vio-init.4/arch/ppc64/kernel/pSeries_vio.c --- linus-vio-init.3/arch/ppc64/kernel/pSeries_vio.c 1970-01-01 10:00:00.000000000 +1000 +++ linus-vio-init.4/arch/ppc64/kernel/pSeries_vio.c 2005-06-27 17:25:44.000000000 +1000 @@ -0,0 +1,266 @@ +/* + * IBM PowerPC pSeries Virtual I/O Infrastructure Support. + * + * Copyright (c) 2003-2005 IBM Corp. + * Dave Engebretsen engebret at us.ibm.com + * Santiago Leon santil at us.ibm.com + * Hollis Blanchard + * Stephen Rothwell + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +extern struct subsystem devices_subsys; /* needed for vio_find_name() */ + +static void probe_bus_pseries(void) +{ + struct device_node *node_vroot, *of_node; + + node_vroot = find_devices("vdevice"); + if ((node_vroot == NULL) || (node_vroot->child == NULL)) + /* this machine doesn't do virtual IO, and that's ok */ + return; + + /* + * Create struct vio_devices for each virtual device in the device tree. + * Drivers will associate with them later. 
+ */ + for (of_node = node_vroot->child; of_node != NULL; + of_node = of_node->sibling) { + printk(KERN_DEBUG "%s: processing %p\n", __FUNCTION__, of_node); + vio_register_device_node(of_node); + } +} + +/** + * vio_match_device_pseries: - Tell if a pSeries VIO device matches a + * vio_device_id + */ +static int vio_match_device_pseries(const struct vio_device_id *id, + const struct vio_dev *dev) +{ + return (strncmp(dev->type, id->type, strlen(id->type)) == 0) && + device_is_compatible(dev->dev.platform_data, id->compat); +} + +static void vio_release_device_pseries(struct device *dev) +{ + /* XXX free TCE table */ + of_node_put(dev->platform_data); +} + +static ssize_t viodev_show_devspec(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct device_node *of_node = dev->platform_data; + + return sprintf(buf, "%s\n", of_node->full_name); +} +DEVICE_ATTR(devspec, S_IRUSR | S_IRGRP | S_IROTH, viodev_show_devspec, NULL); + +static void vio_unregister_device_pseries(struct vio_dev *viodev) +{ + device_remove_file(&viodev->dev, &dev_attr_devspec); +} + +/** + * vio_bus_init_pseries: - Initialize the pSeries virtual IO bus + */ +static int __init vio_bus_init_pseries(void) +{ + int err; + + err = vio_bus_init(vio_match_device_pseries, + vio_unregister_device_pseries, + vio_release_device_pseries); + if (err == 0) + probe_bus_pseries(); + return err; +} + +__initcall(vio_bus_init_pseries); + +/** + * vio_build_iommu_table: - gets the dma information from OF and + * builds the TCE tree. + * @dev: the virtual device. + * + * Returns a pointer to the built tce tree, or NULL if it can't + * find property. 
+*/ +static struct iommu_table *vio_build_iommu_table(struct vio_dev *dev) +{ + unsigned int *dma_window; + struct iommu_table *newTceTable; + unsigned long offset; + int dma_window_property_size; + + dma_window = (unsigned int *) get_property(dev->dev.platform_data, "ibm,my-dma-window", &dma_window_property_size); + if(!dma_window) { + return NULL; + } + + newTceTable = (struct iommu_table *) kmalloc(sizeof(struct iommu_table), GFP_KERNEL); + + /* There should be some code to extract the phys-encoded offset + using prom_n_addr_cells(). However, according to a comment + on earlier versions, it's always zero, so we don't bother */ + offset = dma_window[1] >> PAGE_SHIFT; + + /* TCE table size - measured in tce entries */ + newTceTable->it_size = dma_window[4] >> PAGE_SHIFT; + /* offset for VIO should always be 0 */ + newTceTable->it_offset = offset; + newTceTable->it_busno = 0; + newTceTable->it_index = (unsigned long)dma_window[0]; + newTceTable->it_type = TCE_VB; + + return iommu_init_table(newTceTable); +} + +/** + * vio_register_device_node: - Register a new vio device. + * @of_node: The OF node for this device. + * + * Creates and initializes a vio_dev structure from the data in + * of_node (dev.platform_data) and adds it to the list of virtual devices. + * Returns a pointer to the created vio_dev or NULL if node has + * NULL device_type or compatible fields. + */ +struct vio_dev * __devinit vio_register_device_node(struct device_node *of_node) +{ + struct vio_dev *viodev; + unsigned int *unit_address; + unsigned int *irq_p; + + /* we need the 'device_type' property, in order to match with drivers */ + if ((NULL == of_node->type)) { + printk(KERN_WARNING + "%s: node %s missing 'device_type'\n", __FUNCTION__, + of_node->name ? of_node->name : ""); + return NULL; + } + + unit_address = (unsigned int *)get_property(of_node, "reg", NULL); + if (!unit_address) { + printk(KERN_WARNING "%s: node %s missing 'reg'\n", __FUNCTION__, + of_node->name ? 
of_node->name : ""); + return NULL; + } + + /* allocate a vio_dev for this node */ + viodev = kmalloc(sizeof(struct vio_dev), GFP_KERNEL); + if (!viodev) { + return NULL; + } + memset(viodev, 0, sizeof(struct vio_dev)); + + viodev->dev.platform_data = of_node_get(of_node); + + viodev->irq = NO_IRQ; + irq_p = (unsigned int *)get_property(of_node, "interrupts", NULL); + if (irq_p) { + int virq = virt_irq_create_mapping(*irq_p); + if (virq == NO_IRQ) { + printk(KERN_ERR "Unable to allocate interrupt " + "number for %s\n", of_node->full_name); + } else + viodev->irq = irq_offset_up(virq); + } + + snprintf(viodev->dev.bus_id, BUS_ID_SIZE, "%x", *unit_address); + + /* register with generic device framework */ + if (vio_register_device_common(viodev, of_node->name, of_node->type, + *unit_address, vio_build_iommu_table(viodev)) + == NULL) { + /* XXX free TCE table */ + kfree(viodev); + return NULL; + } + device_create_file(&viodev->dev, &dev_attr_devspec); + + return viodev; +} +EXPORT_SYMBOL(vio_register_device_node); + +/** + * vio_get_attribute: - get attribute for virtual device + * @vdev: The vio device to get property. + * @which: The property/attribute to be extracted. + * @length: Pointer to length of returned data size (unused if NULL). 
+ * + * Calls prom.c's get_property() to return the value of the + * attribute specified by the preprocessor constant @which +*/ +const void * vio_get_attribute(struct vio_dev *vdev, void* which, int* length) +{ + return get_property(vdev->dev.platform_data, (char*)which, length); +} +EXPORT_SYMBOL(vio_get_attribute); + +/* vio_find_name() - internal because only vio.c knows how we formatted the + * kobject name + * XXX once vio_bus_type.devices is actually used as a kset in + * drivers/base/bus.c, this function should be removed in favor of + * "device_find(kobj_name, &vio_bus_type)" + */ +static struct vio_dev *vio_find_name(const char *kobj_name) +{ + struct kobject *found; + + found = kset_find_obj(&devices_subsys.kset, kobj_name); + if (!found) + return NULL; + + return to_vio_dev(container_of(found, struct device, kobj)); +} + +/** + * vio_find_node - find an already-registered vio_dev + * @vnode: device_node of the virtual device we're looking for + */ +struct vio_dev *vio_find_node(struct device_node *vnode) +{ + uint32_t *unit_address; + char kobj_name[BUS_ID_SIZE]; + + /* construct the kobject name from the device node */ + unit_address = (uint32_t *)get_property(vnode, "reg", NULL); + if (!unit_address) + return NULL; + snprintf(kobj_name, BUS_ID_SIZE, "%x", *unit_address); + + return vio_find_name(kobj_name); +} +EXPORT_SYMBOL(vio_find_node); + +int vio_enable_interrupts(struct vio_dev *dev) +{ + int rc = h_vio_signal(dev->unit_address, VIO_IRQ_ENABLE); + if (rc != H_Success) + printk(KERN_ERR "vio: Error 0x%x enabling interrupts\n", rc); + return rc; +} +EXPORT_SYMBOL(vio_enable_interrupts); + +int vio_disable_interrupts(struct vio_dev *dev) +{ + int rc = h_vio_signal(dev->unit_address, VIO_IRQ_DISABLE); + if (rc != H_Success) + printk(KERN_ERR "vio: Error 0x%x disabling interrupts\n", rc); + return rc; +} +EXPORT_SYMBOL(vio_disable_interrupts); diff -ruNp linus-vio-init.3/arch/ppc64/kernel/vio.c linus-vio-init.4/arch/ppc64/kernel/vio.c --- 
linus-vio-init.3/arch/ppc64/kernel/vio.c 2005-06-27 16:44:25.000000000 +1000 +++ linus-vio-init.4/arch/ppc64/kernel/vio.c 2005-06-27 17:25:38.000000000 +1000 @@ -1,10 +1,11 @@ /* * IBM PowerPC Virtual I/O Infrastructure Support. * - * Copyright (c) 2003 IBM Corp. + * Copyright (c) 2003-2005 IBM Corp. * Dave Engebretsen engebret at us.ibm.com * Santiago Leon santil at us.ibm.com * Hollis Blanchard + * Stephen Rothwell * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -14,29 +15,16 @@ #include #include -#include #include -#include #include #include -#include #include #include -#include #include -#include - -#define DBGENTER() pr_debug("%s entered\n", __FUNCTION__) - -extern struct subsystem devices_subsys; /* needed for vio_find_name() */ static const struct vio_device_id *vio_match_device( const struct vio_device_id *, const struct vio_dev *); -#ifdef CONFIG_PPC_PSERIES -static struct iommu_table *vio_build_iommu_table(struct vio_dev *); -static int vio_num_address_cells; -#endif struct vio_dev vio_bus_device = { /* fake "parent" device */ .name = vio_bus_device.dev.bus_id, .type = "", @@ -46,6 +34,8 @@ struct vio_dev vio_bus_device = { /* fa static int (*is_match)(const struct vio_device_id *id, const struct vio_dev *dev); +static void (*unregister_device_callback)(struct vio_dev *dev); +static void (*release_device_callback)(struct device *dev); /* convert from struct device to struct vio_dev and pass to driver. 
* dev->driver has already been set by generic code because vio_bus_match @@ -57,8 +47,6 @@ static int vio_bus_probe(struct device * const struct vio_device_id *id; int error = -ENODEV; - DBGENTER(); - if (!viodrv->probe) return error; @@ -76,8 +64,6 @@ static int vio_bus_remove(struct device struct vio_dev *viodev = to_vio_dev(dev); struct vio_driver *viodrv = to_vio_driver(dev->driver); - DBGENTER(); - if (viodrv->remove) { return viodrv->remove(viodev); } @@ -127,8 +113,6 @@ EXPORT_SYMBOL(vio_unregister_driver); static const struct vio_device_id * vio_match_device(const struct vio_device_id *ids, const struct vio_dev *dev) { - DBGENTER(); - while (ids->type) { if (is_match(ids, dev)) return ids; @@ -137,39 +121,19 @@ static const struct vio_device_id * vio_ return NULL; } -#ifdef CONFIG_PPC_PSERIES -static void probe_bus_pseries(void) -{ - struct device_node *node_vroot, *of_node; - - node_vroot = find_devices("vdevice"); - if ((node_vroot == NULL) || (node_vroot->child == NULL)) - /* this machine doesn't do virtual IO, and that's ok */ - return; - - vio_num_address_cells = prom_n_addr_cells(node_vroot->child); - - /* - * Create struct vio_devices for each virtual device in the device tree. - * Drivers will associate with them later. 
- */ - for (of_node = node_vroot->child; of_node != NULL; - of_node = of_node->sibling) { - printk(KERN_DEBUG "%s: processing %p\n", __FUNCTION__, of_node); - vio_register_device_node(of_node); - } -} -#endif - /** * vio_bus_init: - Initialize the virtual IO bus */ int __init vio_bus_init(int (*match_func)(const struct vio_device_id *id, - const struct vio_dev *dev)) + const struct vio_dev *dev), + void (*unregister_dev)(struct vio_dev *), + void (*release_dev)(struct device *)) { int err; is_match = match_func; + unregister_device_callback = unregister_dev; + release_device_callback = release_dev; err = bus_register(&vio_bus_type); if (err) { @@ -190,56 +154,14 @@ int __init vio_bus_init(int (*match_func return 0; } -#ifdef CONFIG_PPC_PSERIES -/** - * vio_match_device_pseries: - Tell if a pSeries VIO device matches a - * vio_device_id - */ -static int vio_match_device_pseries(const struct vio_device_id *id, - const struct vio_dev *dev) -{ - return (strncmp(dev->type, id->type, strlen(id->type)) == 0) && - device_is_compatible(dev->dev.platform_data, id->compat); -} - -/** - * vio_bus_init_pseries: - Initialize the pSeries virtual IO bus - */ -static int __init vio_bus_init_pseries(void) -{ - int err; - - err = vio_bus_init(vio_match_device_pseries); - if (err == 0) - probe_bus_pseries(); - return err; -} - -__initcall(vio_bus_init_pseries); -#endif - /* vio_dev refcount hit 0 */ static void __devinit vio_dev_release(struct device *dev) { - DBGENTER(); - -#ifdef CONFIG_PPC_PSERIES - /* XXX free TCE table */ - of_node_put(dev->platform_data); -#endif + if (release_device_callback) + release_device_callback(dev); kfree(to_vio_dev(dev)); } -#ifdef CONFIG_PPC_PSERIES -static ssize_t viodev_show_devspec(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct device_node *of_node = dev->platform_data; - - return sprintf(buf, "%s\n", of_node->full_name); -} -DEVICE_ATTR(devspec, S_IRUSR | S_IRGRP | S_IROTH, viodev_show_devspec, NULL); -#endif - static 
ssize_t viodev_show_name(struct device *dev, struct device_attribute *attr, char *buf) { return sprintf(buf, "%s\n", to_vio_dev(dev)->name); @@ -250,8 +172,6 @@ struct vio_dev * __devinit vio_register_ struct vio_dev *viodev, char *name, char *type, uint32_t unit_address, struct iommu_table *iommu_table) { - DBGENTER(); - viodev->name = name; viodev->type = type; viodev->unit_address = unit_address; @@ -272,197 +192,15 @@ struct vio_dev * __devinit vio_register_ return viodev; } -#ifdef CONFIG_PPC_PSERIES -/** - * vio_register_device_node: - Register a new vio device. - * @of_node: The OF node for this device. - * - * Creates and initializes a vio_dev structure from the data in - * of_node (dev.platform_data) and adds it to the list of virtual devices. - * Returns a pointer to the created vio_dev or NULL if node has - * NULL device_type or compatible fields. - */ -struct vio_dev * __devinit vio_register_device_node(struct device_node *of_node) -{ - struct vio_dev *viodev; - unsigned int *unit_address; - unsigned int *irq_p; - - DBGENTER(); - - /* we need the 'device_type' property, in order to match with drivers */ - if ((NULL == of_node->type)) { - printk(KERN_WARNING - "%s: node %s missing 'device_type'\n", __FUNCTION__, - of_node->name ? of_node->name : ""); - return NULL; - } - - unit_address = (unsigned int *)get_property(of_node, "reg", NULL); - if (!unit_address) { - printk(KERN_WARNING "%s: node %s missing 'reg'\n", __FUNCTION__, - of_node->name ? 
of_node->name : ""); - return NULL; - } - - /* allocate a vio_dev for this node */ - viodev = kmalloc(sizeof(struct vio_dev), GFP_KERNEL); - if (!viodev) { - return NULL; - } - memset(viodev, 0, sizeof(struct vio_dev)); - - viodev->dev.platform_data = of_node_get(of_node); - - viodev->irq = NO_IRQ; - irq_p = (unsigned int *)get_property(of_node, "interrupts", NULL); - if (irq_p) { - int virq = virt_irq_create_mapping(*irq_p); - if (virq == NO_IRQ) { - printk(KERN_ERR "Unable to allocate interrupt " - "number for %s\n", of_node->full_name); - } else - viodev->irq = irq_offset_up(virq); - } - - snprintf(viodev->dev.bus_id, BUS_ID_SIZE, "%x", *unit_address); - - /* register with generic device framework */ - if (vio_register_device_common(viodev, of_node->name, of_node->type, - *unit_address, vio_build_iommu_table(viodev)) - == NULL) { - /* XXX free TCE table */ - kfree(viodev); - return NULL; - } - device_create_file(&viodev->dev, &dev_attr_devspec); - - return viodev; -} -EXPORT_SYMBOL(vio_register_device_node); -#endif - void __devinit vio_unregister_device(struct vio_dev *viodev) { - DBGENTER(); -#ifdef CONFIG_PPC_PSERIES - device_remove_file(&viodev->dev, &dev_attr_devspec); -#endif + if (unregister_device_callback) + unregister_device_callback(viodev); device_remove_file(&viodev->dev, &dev_attr_name); device_unregister(&viodev->dev); } EXPORT_SYMBOL(vio_unregister_device); -#ifdef CONFIG_PPC_PSERIES -/** - * vio_get_attribute: - get attribute for virtual device - * @vdev: The vio device to get property. - * @which: The property/attribute to be extracted. - * @length: Pointer to length of returned data size (unused if NULL). 
- * - * Calls prom.c's get_property() to return the value of the - * attribute specified by the preprocessor constant @which -*/ -const void * vio_get_attribute(struct vio_dev *vdev, void* which, int* length) -{ - return get_property(vdev->dev.platform_data, (char*)which, length); -} -EXPORT_SYMBOL(vio_get_attribute); - -/* vio_find_name() - internal because only vio.c knows how we formatted the - * kobject name - * XXX once vio_bus_type.devices is actually used as a kset in - * drivers/base/bus.c, this function should be removed in favor of - * "device_find(kobj_name, &vio_bus_type)" - */ -static struct vio_dev *vio_find_name(const char *kobj_name) -{ - struct kobject *found; - - found = kset_find_obj(&devices_subsys.kset, kobj_name); - if (!found) - return NULL; - - return to_vio_dev(container_of(found, struct device, kobj)); -} - -/** - * vio_find_node - find an already-registered vio_dev - * @vnode: device_node of the virtual device we're looking for - */ -struct vio_dev *vio_find_node(struct device_node *vnode) -{ - uint32_t *unit_address; - char kobj_name[BUS_ID_SIZE]; - - /* construct the kobject name from the device node */ - unit_address = (uint32_t *)get_property(vnode, "reg", NULL); - if (!unit_address) - return NULL; - snprintf(kobj_name, BUS_ID_SIZE, "%x", *unit_address); - - return vio_find_name(kobj_name); -} -EXPORT_SYMBOL(vio_find_node); - -/** - * vio_build_iommu_table: - gets the dma information from OF and builds the TCE tree. - * @dev: the virtual device. - * - * Returns a pointer to the built tce tree, or NULL if it can't - * find property. 
-*/ -static struct iommu_table * vio_build_iommu_table(struct vio_dev *dev) -{ - unsigned int *dma_window; - struct iommu_table *newTceTable; - unsigned long offset; - int dma_window_property_size; - - dma_window = (unsigned int *) get_property(dev->dev.platform_data, "ibm,my-dma-window", &dma_window_property_size); - if(!dma_window) { - return NULL; - } - - newTceTable = (struct iommu_table *) kmalloc(sizeof(struct iommu_table), GFP_KERNEL); - - /* There should be some code to extract the phys-encoded offset - using prom_n_addr_cells(). However, according to a comment - on earlier versions, it's always zero, so we don't bother */ - offset = dma_window[1] >> PAGE_SHIFT; - - /* TCE table size - measured in tce entries */ - newTceTable->it_size = dma_window[4] >> PAGE_SHIFT; - /* offset for VIO should always be 0 */ - newTceTable->it_offset = offset; - newTceTable->it_busno = 0; - newTceTable->it_index = (unsigned long)dma_window[0]; - newTceTable->it_type = TCE_VB; - - return iommu_init_table(newTceTable); -} - -int vio_enable_interrupts(struct vio_dev *dev) -{ - int rc = h_vio_signal(dev->unit_address, VIO_IRQ_ENABLE); - if (rc != H_Success) { - printk(KERN_ERR "vio: Error 0x%x enabling interrupts\n", rc); - } - return rc; -} -EXPORT_SYMBOL(vio_enable_interrupts); - -int vio_disable_interrupts(struct vio_dev *dev) -{ - int rc = h_vio_signal(dev->unit_address, VIO_IRQ_DISABLE); - if (rc != H_Success) { - printk(KERN_ERR "vio: Error 0x%x disabling interrupts\n", rc); - } - return rc; -} -EXPORT_SYMBOL(vio_disable_interrupts); -#endif - static dma_addr_t vio_map_single(struct device *dev, void *vaddr, size_t size, enum dma_data_direction direction) { @@ -526,8 +264,6 @@ static int vio_bus_match(struct device * const struct vio_device_id *ids = vio_drv->id_table; const struct vio_device_id *found_id; - DBGENTER(); - if (!ids) return 0; diff -ruNp linus-vio-init.3/include/asm-ppc64/vio.h linus-vio-init.4/include/asm-ppc64/vio.h --- 
linus-vio-init.3/include/asm-ppc64/vio.h 2005-06-27 16:49:40.000000000 +1000 +++ linus-vio-init.4/include/asm-ppc64/vio.h 2005-06-27 16:50:44.000000000 +1000 @@ -106,6 +106,8 @@ static inline struct vio_dev *to_vio_dev } extern int vio_bus_init(int (*is_match)(const struct vio_device_id *id, - const struct vio_dev *dev)); + const struct vio_dev *dev), + void (*)(struct vio_dev *), + void (*)(struct device *)); #endif /* _ASM_VIO_H */ From olof at lixom.net Tue Jul 12 22:26:48 2005 From: olof at lixom.net (Olof Johansson) Date: Tue, 12 Jul 2005 07:26:48 -0500 Subject: [PATCH] PPC64: Add 970MP PVR Message-ID: <20050712122647.GA27453@austin.ibm.com> Hi, Add PVR value and tests for 970MP. Also switch to a simpler (but slightly longer) check at init time for simplicity. Signed-off-by: Olof Johansson Index: 2.6/arch/ppc64/kernel/cpu_setup_power4.S =================================================================== --- 2.6.orig/arch/ppc64/kernel/cpu_setup_power4.S 2005-07-11 13:52:39.000000000 -0500 +++ 2.6/arch/ppc64/kernel/cpu_setup_power4.S 2005-07-11 13:54:38.000000000 -0500 @@ -31,10 +31,13 @@ _GLOBAL(__970_cpu_preinit) */ mfspr r0,SPRN_PVR srwi r0,r0,16 - cmpwi cr0,r0,0x39 - cmpwi cr1,r0,0x3c - cror 4*cr0+eq,4*cr0+eq,4*cr1+eq + cmpwi r0,0x39 + beq 1f + cmpwi r0,0x3c + beq 1f + cmpwi r0,0x44 bnelr +1: /* Make sure HID4:rm_ci is off before MMU is turned off, that large * pages are enabled with HID4:61 and clear HID5:DCBZ_size and @@ -133,12 +136,14 @@ _GLOBAL(__save_cpu_setup) /* We only deal with 970 for now */ mfspr r0,SPRN_PVR srwi r0,r0,16 - cmpwi cr0,r0,0x39 - cmpwi cr1,r0,0x3c - cror 4*cr0+eq,4*cr0+eq,4*cr1+eq - bne 1f + cmpwi r0,0x39 + beq 1f + cmpwi r0,0x3c + beq 1f + cmpwi r0,0x44 + bne 2f - /* Save HID0,1,4 and 5 */ +1: /* Save HID0,1,4 and 5 */ mfspr r3,SPRN_HID0 std r3,CS_HID0(r5) mfspr r3,SPRN_HID1 @@ -148,7 +153,7 @@ _GLOBAL(__save_cpu_setup) mfspr r3,SPRN_HID5 std r3,CS_HID5(r5) -1: +2: mtcr r7 blr @@ -165,12 +170,14 @@ _GLOBAL(__restore_cpu_setup) /* 
We only deal with 970 for now */ mfspr r0,SPRN_PVR srwi r0,r0,16 - cmpwi cr0,r0,0x39 - cmpwi cr1,r0,0x3c - cror 4*cr0+eq,4*cr0+eq,4*cr1+eq - bne 1f + cmpwi r0,0x39 + beq 1f + cmpwi r0,0x3c + beq 1f + cmpwi r0,0x44 + bnelr - /* Before accessing memory, we make sure rm_ci is clear */ +1: /* Before accessing memory, we make sure rm_ci is clear */ li r0,0 mfspr r3,SPRN_HID4 rldimi r3,r0,40,23 /* clear bit 23 (rm_ci) */ @@ -223,6 +230,5 @@ _GLOBAL(__restore_cpu_setup) mtspr SPRN_HID5,r3 sync isync -1: blr Index: 2.6/arch/ppc64/kernel/cputable.c =================================================================== --- 2.6.orig/arch/ppc64/kernel/cputable.c 2005-07-11 13:53:06.000000000 -0500 +++ 2.6/arch/ppc64/kernel/cputable.c 2005-07-11 13:54:15.000000000 -0500 @@ -183,6 +183,21 @@ struct cpu_spec cpu_specs[] = { .cpu_setup = __setup_cpu_ppc970, .firmware_features = COMMON_PPC64_FW, }, + { /* PPC970MP */ + .pvr_mask = 0xffff0000, + .pvr_value = 0x00440000, + .cpu_name = "PPC970MP", + .cpu_features = CPU_FTR_SPLIT_ID_CACHE | + CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | + CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_ALTIVEC_COMP | + CPU_FTR_CAN_NAP | CPU_FTR_PMC8 | CPU_FTR_MMCRA, + .cpu_user_features = COMMON_USER_PPC64 | + PPC_FEATURE_HAS_ALTIVEC_COMP, + .icache_bsize = 128, + .dcache_bsize = 128, + .cpu_setup = __setup_cpu_ppc970, + .firmware_features = COMMON_PPC64_FW, + }, { /* Power5 */ .pvr_mask = 0xffff0000, .pvr_value = 0x003a0000, From linas at austin.ibm.com Wed Jul 13 05:51:20 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 12 Jul 2005 14:51:20 -0500 Subject: [PATCH 2.6.13-rc1 03/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB664E.1050003@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> <42CB664E.1050003@jp.fujitsu.com> Message-ID: <20050712195120.GE26607@austin.ibm.com> Hi, Sorry for the late response ... I'm reading the patch, and I'm wondering what about performance and overhead. 
Here's the code that concerns me:

On Wed, Jul 06, 2005 at 02:04:14PM +0900, Hidetoshi Seto was heard to remark:
> [This is 3 of 10 patches, "iochk-03-register.patch"]
>
> - Implement ia64 version of basic codes:
>     iochk_clear, iochk_read, iochk_init, and iocookie
>
>  int iochk_read(iocookie *cookie)
>  {
> +	if (cookie->error || have_error(cookie->dev))

....

> +}
> +
> +static int have_error(struct pci_dev *dev)
> +{
> +	u16 status;
> +
> +	/* check status */
> +	switch (dev->hdr_type) {
> +	case PCI_HEADER_TYPE_NORMAL: /* 0 */
> +		pci_read_config_word(dev, PCI_STATUS, &status);
> +		break;
> +	case PCI_HEADER_TYPE_BRIDGE: /* 1 */
> +		pci_read_config_word(dev, PCI_SEC_STATUS, &status);
> +		break;
> +	}
> +
> +	if ( (status & PCI_STATUS_REC_TARGET_ABORT)
> +		|| (status & PCI_STATUS_REC_MASTER_ABORT)
> +		|| (status & PCI_STATUS_DETECTED_PARITY) )
> +		return 1;
>
>  	return 0;
>  }

Are you assuming that a device driver will use an iochk_read() for
every DMA operation? For every MMIO to the card?

For high performance devices, it seems to me that this will cause
a rather large performance burden, especially if it's envisioned that
all architectures will do something similar.

My concern is that (at least on ppc64) the call pci_read_config_word()
requires a call into "firmware" aka "BIOS", which takes thousands upon
thousands of cpu cycles. There are hundreds of cycles of gratuitous
crud just to get into the firmware, and then lord-knows-what the
firmware does while it's in there; probably doing all sorts of crazy
math to compute bus addresses and other arcane things. I would imagine
that most architectures, including ia64, are similar.

Thus, one wouldn't want to perform an iochk_read() in this way unless
one was already pretty sure that an error had already occurred ...

Am I misunderstanding something?
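For illustration only, the short-circuit in the quoted iochk_read() can be
sketched in user space as below. This is a sketch under stated assumptions,
not the patch's code: struct fake_dev, struct fake_cookie, read_status(),
iochk_read_sketch() and the config_reads counter are hypothetical stand-ins
(none of them is kernel API), and the PCI_STATUS_* values are the standard
PCI status register bits. The counter simply records how often the
"expensive" status read is actually taken.

```c
#include <stdint.h>

/* Standard PCI status register bits (same values as linux/pci_regs.h). */
#define PCI_STATUS_REC_TARGET_ABORT 0x1000
#define PCI_STATUS_REC_MASTER_ABORT 0x2000
#define PCI_STATUS_DETECTED_PARITY  0x8000

/* Hypothetical stand-in for struct pci_dev: just a fake status register. */
struct fake_dev {
	uint16_t status;
};

/* Hypothetical stand-in for the iocookie from the patch. */
struct fake_cookie {
	struct fake_dev *dev;
	int error;		/* set by some other error detector */
};

/* Counts how often the "expensive" config-space read is performed. */
static unsigned long config_reads;

/* Hypothetical stand-in for pci_read_config_word(dev, PCI_STATUS, ...). */
static uint16_t read_status(const struct fake_dev *dev)
{
	config_reads++;
	return dev->status;
}

/* Same three-bit test as the quoted have_error(), folded into one mask. */
static int have_error(const struct fake_dev *dev)
{
	uint16_t status = read_status(dev);

	return (status & (PCI_STATUS_REC_TARGET_ABORT |
			  PCI_STATUS_REC_MASTER_ABORT |
			  PCI_STATUS_DETECTED_PARITY)) != 0;
}

static int iochk_read_sketch(struct fake_cookie *cookie)
{
	/* || short-circuits: no status read once cookie->error is set. */
	if (cookie->error || have_error(cookie->dev))
		return 1;
	return 0;
}
```

In this sketch every call with cookie->error clear still costs one status
read; only after something else has set cookie->error does the || skip the
read. How much that helps on a hot path depends on how often the flag is
already set, which is the open question here.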
--linas

From linas at austin.ibm.com Wed Jul 13 07:14:01 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 12 Jul 2005 16:14:01 -0500
Subject: [PATCH 2.6.13-rc1 07/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42CB6961.2060508@jp.fujitsu.com>
References: <42CB63B2.6000505@jp.fujitsu.com> <42CB6961.2060508@jp.fujitsu.com>
Message-ID: <20050712211401.GF26607@austin.ibm.com>

On Wed, Jul 06, 2005 at 02:17:21PM +0900, Hidetoshi Seto was heard to remark:
>
> Touching poisoned data become a MCA, so now it directly means

Several questions:

Is MCA an exception or fault of some sort, so at some point,
the kernel would catch a fault?

So when you say "Touching poisoned data become a MCA", you mean that
if the CPU attempts to read poisoned data through the pci-to-host
bridge, it will (at some point) catch an exception?

> +	ia64_mca_barrier(ret);

I assume that the point of this barrier is to make sure that the fault,
if any, is delivered before this routine returns?

--linas

From linas at austin.ibm.com Wed Jul 13 08:22:03 2005
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 12 Jul 2005 17:22:03 -0500
Subject: [PATCH 2.6.13-rc1 08/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <42CB69BD.1090607@jp.fujitsu.com>
References: <42CB63B2.6000505@jp.fujitsu.com> <42CB69BD.1090607@jp.fujitsu.com>
Message-ID: <20050712222203.GG26607@austin.ibm.com>

On Wed, Jul 06, 2005 at 02:18:53PM +0900, Hidetoshi Seto was heard to remark:
> +static int pci_error_recovery(peidx_table_t *peidx)

Minor comment:
Maybe a different name for this routine would be good;
this potentially conflicts with generic pci routines.
--linas

From benh at kernel.crashing.org Wed Jul 13 10:07:35 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 13 Jul 2005 10:07:35 +1000
Subject: [PATCH] PPC64: Add 970MP PVR
In-Reply-To: <20050712122647.GA27453@austin.ibm.com>
References: <20050712122647.GA27453@austin.ibm.com>
Message-ID: <1121213256.31924.397.camel@gaston>

On Tue, 2005-07-12 at 07:26 -0500, Olof Johansson wrote:
> Hi,
>
>
> Add PVR value and tests for 970MP. Also switch to a simpler (but
> slightly longer) check at init time for simplicity.

Hrm ... I loved my little game with cror :) Besides, it would teach
something to newbies reading the code :)

BTW. Can you remind me what this MMCRA CPU feature is?

Ben.

From benh at kernel.crashing.org Wed Jul 13 10:18:57 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 13 Jul 2005 10:18:57 +1000
Subject: [PATCH 2.6.13-rc1 03/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <20050712195120.GE26607@austin.ibm.com>
References: <42CB63B2.6000505@jp.fujitsu.com> <42CB664E.1050003@jp.fujitsu.com> <20050712195120.GE26607@austin.ibm.com>
Message-ID: <1121213938.31924.406.camel@gaston>

> Are you assuming that a device driver will use an iochk_read() for
> every DMA operation? For every MMIO to the card?
>
> For high performance devices, it seems to me that this will cause
> a rather large performance burden, especially if it's envisioned that
> all architectures will do something similar.
>
> My concern is that (at least on ppc64) the call pci_read_config_word()
> requires a call into "firmware" aka "BIOS", which takes thousands upon
> thousands of cpu cycles. There are hundreds of cycles of gratuitous
> crud just to get into the firmware, and then lord-knows-what the
> firmware does while it's in there; probably doing all sorts of crazy
> math to compute bus addresses and other arcane things. I would imagine
> that most architectures, including ia64, are similar.
>
> Thus, one wouldn't want to perform an iochk_read() in this way unless
> one was already pretty sure that an error had already occurred ...
>
> Am I misunderstanding something?

I would expect pSeries not to use the "default" error checking (that
tests the status register) but rather use EEH.

Ben.

From seto.hidetoshi at jp.fujitsu.com Wed Jul 13 11:33:12 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Wed, 13 Jul 2005 10:33:12 +0900
Subject: [PATCH 2.6.13-rc1 03/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <20050712195120.GE26607@austin.ibm.com>
References: <42CB63B2.6000505@jp.fujitsu.com> <42CB664E.1050003@jp.fujitsu.com> <20050712195120.GE26607@austin.ibm.com>
Message-ID: <42D46F58.4070900@jp.fujitsu.com>

Linas Vepstas wrote:
> Thus, one wouldn't want to perform an iochk_read() in this way unless
> one was already pretty sure that an error had already occurred ...

If another kind of I/O error detecting system finds an error before
iochk_read() is performed, it can prevent the coming iochk_read() from
spending such crazy cycles in have_error() by marking cookie->error.

>>  int iochk_read(iocookie *cookie)
>>  {
>> +	if (cookie->error || have_error(cookie->dev))

Isn't it enough?

And as Ben said, it seems that ppc64 can have its own special iochk_*,
without calling pci_read_config_word() ;-)

Thanks,
H.Seto

From seto.hidetoshi at jp.fujitsu.com Wed Jul 13 11:36:51 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Wed, 13 Jul 2005 10:36:51 +0900
Subject: [PATCH 2.6.13-rc1 08/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <20050712222203.GG26607@austin.ibm.com>
References: <42CB63B2.6000505@jp.fujitsu.com> <42CB69BD.1090607@jp.fujitsu.com> <20050712222203.GG26607@austin.ibm.com>
Message-ID: <42D47033.6030004@jp.fujitsu.com>

Linas Vepstas wrote:
> On Wed, Jul 06, 2005 at 02:18:53PM +0900, Hidetoshi Seto was heard to remark:
>
>>+static int pci_error_recovery(peidx_table_t *peidx)
>
> Minor comment:
> Maybe a different name for this routine would be good;
> this potentially conflicts with generic pci routines.

Good point. I'll fix it.

Thanks,
H.Seto

From seto.hidetoshi at jp.fujitsu.com Wed Jul 13 12:00:55 2005
From: seto.hidetoshi at jp.fujitsu.com (Hidetoshi Seto)
Date: Wed, 13 Jul 2005 11:00:55 +0900
Subject: [PATCH 2.6.13-rc1 07/10] IOCHK interface for I/O error handling/detecting
In-Reply-To: <20050712211401.GF26607@austin.ibm.com>
References: <42CB63B2.6000505@jp.fujitsu.com> <42CB6961.2060508@jp.fujitsu.com> <20050712211401.GF26607@austin.ibm.com>
Message-ID: <42D475D7.2090307@jp.fujitsu.com>

Linas Vepstas wrote:
> On Wed, Jul 06, 2005 at 02:17:21PM +0900, Hidetoshi Seto was heard to remark:
>
>>Touching poisoned data become a MCA, so now it directly means
>
> Several questions:
>
> Is MCA an exception or fault of some sort, so at some point,
> the kernel would catch a fault?
>
> So when you say "Touching poisoned data become a MCA", you mean that
> if the CPU attempts to read poisoned data through the pci-to-host
> bridge, it will (at some point) catch an exception?

Yes.
More specifically, transferring poisoned data doesn't cause an MCA,
but loading it into a CPU register causes an MCA.
At the end of the load, the CPU checks the data and delivers an MCA if
it was poisoned.

>>+	ia64_mca_barrier(ret);
>
> I assume that the point of this barrier is to make sure that the fault,
> if any, is delivered before this routine returns?

Yes, that's what I'm expecting.

Thanks,
H.Seto

From olof at lixom.net Wed Jul 13 12:11:15 2005
From: olof at lixom.net (Olof Johansson)
Date: Tue, 12 Jul 2005 21:11:15 -0500
Subject: [PATCH] PPC64: Add 970MP PVR
In-Reply-To: <1121213256.31924.397.camel@gaston>
References: <20050712122647.GA27453@austin.ibm.com> <1121213256.31924.397.camel@gaston>
Message-ID: <20050713021115.GB27453@austin.ibm.com>

On Wed, Jul 13, 2005 at 10:07:35AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2005-07-12 at 07:26 -0500, Olof Johansson wrote:
> > Hi,
> >
> >
> > Add PVR value and tests for 970MP. Also switch to a simpler (but
> > slightly longer) check at init time for simplicity.
>
> Hrm ... I loved my little game with cror :) Besides, it would teach
> something to newbies reading the code :)

Heh. It's an execute-once path that I've had to touch a few times
lately. I figured I could spend 5 minutes rewriting it once or 10
minutes scratching my head getting the cror's right every time. We can
live with the extra pathlength. :)

> BTW. Can you remind me what is this MMCRA CPU feature ?

Performance counter stuff. Not sure about the exact usage, check the
books or ask Anton. :)

-Olof

From olh at suse.de Wed Jul 13 23:57:39 2005
From: olh at suse.de (Olaf Hering)
Date: Wed, 13 Jul 2005 15:57:39 +0200
Subject: [PATCH] update ppc64 defconfigs
In-Reply-To: <20050713135626.GB18144@suse.de>
References: <20050713135525.GA18144@suse.de> <20050713135626.GB18144@suse.de>
Message-ID: <20050713135739.GC18144@suse.de>

update defconfig, use new CONFIG_HZ and set it to 100 just for the kicks.
Signed-off-by: Olaf Hering arch/ppc64/configs/g5_defconfig | 369 +++++++++++++++++---------------- arch/ppc64/configs/iSeries_defconfig | 322 ++++++++++++++++------------- arch/ppc64/configs/maple_defconfig | 218 ++++++++++---------- arch/ppc64/configs/pSeries_defconfig | 377 ++++++++++++++++++---------------- arch/ppc64/defconfig | 380 ++++++++++++++++++----------------- 5 files changed, 892 insertions(+), 774 deletions(-) Index: linux-2.6.13-rc3-olh/arch/ppc64/configs/g5_defconfig =================================================================== --- linux-2.6.13-rc3-olh.orig/arch/ppc64/configs/g5_defconfig +++ linux-2.6.13-rc3-olh/arch/ppc64/configs/g5_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.12-rc6 -# Tue Jun 14 16:59:20 2005 +# Linux kernel version: 2.6.13-rc3 +# Wed Jul 13 14:40:34 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -73,12 +73,15 @@ CONFIG_SYSVIPC_COMPAT=y # CONFIG_PPC_ISERIES is not set CONFIG_PPC_MULTIPLATFORM=y # CONFIG_PPC_PSERIES is not set +# CONFIG_PPC_BPA is not set CONFIG_PPC_PMAC=y # CONFIG_PPC_MAPLE is not set CONFIG_PPC=y CONFIG_PPC64=y CONFIG_PPC_OF=y +CONFIG_MPIC=y CONFIG_ALTIVEC=y +CONFIG_KEXEC=y CONFIG_U3_DART=y CONFIG_PPC_PMAC64=y CONFIG_BOOTX_TEXT=y @@ -86,8 +89,24 @@ CONFIG_POWER4_ONLY=y CONFIG_IOMMU_VMERGE=y CONFIG_SMP=y CONFIG_NR_CPUS=2 +CONFIG_ARCH_SELECT_MEMORY_MODEL=y +CONFIG_ARCH_FLATMEM_ENABLE=y +CONFIG_SELECT_MEMORY_MODEL=y +CONFIG_FLATMEM_MANUAL=y +# CONFIG_DISCONTIGMEM_MANUAL is not set +# CONFIG_SPARSEMEM_MANUAL is not set +CONFIG_FLATMEM=y +CONFIG_FLAT_NODE_MEM_MAP=y +# CONFIG_NUMA is not set # CONFIG_SCHED_SMT is not set +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set +# CONFIG_PREEMPT_BKL is not set +CONFIG_HZ_100=y +# CONFIG_HZ_250 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=100 CONFIG_GENERIC_HARDIRQS=y CONFIG_SECCOMP=y CONFIG_ISA_DMA_API=y @@ -117,6 +136,144 @@ CONFIG_PROC_DEVICETREE=y # CONFIG_CMDLINE_BOOL 
is not set # +# Networking +# +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_MMAP is not set +CONFIG_UNIX=y +CONFIG_XFRM=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=m +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is not set +CONFIG_IP_FIB_HASH=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=y +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +CONFIG_INET_TUNNEL=y +CONFIG_IP_TCPDIAG=m +# CONFIG_IP_TCPDIAG_IPV6 is not set +# CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y + +# +# IP: Virtual Server Configuration +# +# CONFIG_IP_VS is not set +# CONFIG_IPV6 is not set +CONFIG_NETFILTER=y +# CONFIG_NETFILTER_DEBUG is not set + +# +# IP: Netfilter Configuration +# +CONFIG_IP_NF_CONNTRACK=m +CONFIG_IP_NF_CT_ACCT=y +CONFIG_IP_NF_CONNTRACK_MARK=y +CONFIG_IP_NF_CT_PROTO_SCTP=m +CONFIG_IP_NF_FTP=m +CONFIG_IP_NF_IRC=m +CONFIG_IP_NF_TFTP=m +CONFIG_IP_NF_AMANDA=m +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_LIMIT=m +CONFIG_IP_NF_MATCH_IPRANGE=m +CONFIG_IP_NF_MATCH_MAC=m +CONFIG_IP_NF_MATCH_PKTTYPE=m +CONFIG_IP_NF_MATCH_MARK=m +CONFIG_IP_NF_MATCH_MULTIPORT=m +CONFIG_IP_NF_MATCH_TOS=m +CONFIG_IP_NF_MATCH_RECENT=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_DSCP=m +CONFIG_IP_NF_MATCH_AH_ESP=m +CONFIG_IP_NF_MATCH_LENGTH=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_MATCH_TCPMSS=m +CONFIG_IP_NF_MATCH_HELPER=m +CONFIG_IP_NF_MATCH_STATE=m +CONFIG_IP_NF_MATCH_CONNTRACK=m +CONFIG_IP_NF_MATCH_OWNER=m +CONFIG_IP_NF_MATCH_ADDRTYPE=m +CONFIG_IP_NF_MATCH_REALM=m +CONFIG_IP_NF_MATCH_SCTP=m +CONFIG_IP_NF_MATCH_COMMENT=m +CONFIG_IP_NF_MATCH_CONNMARK=m +CONFIG_IP_NF_MATCH_HASHLIMIT=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_LOG=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_TARGET_TCPMSS=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_NAT_NEEDED=y +CONFIG_IP_NF_TARGET_MASQUERADE=m 
+CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_SAME=m +CONFIG_IP_NF_NAT_SNMP_BASIC=m +CONFIG_IP_NF_NAT_IRC=m +CONFIG_IP_NF_NAT_FTP=m +CONFIG_IP_NF_NAT_TFTP=m +CONFIG_IP_NF_NAT_AMANDA=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_TOS=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_DSCP=m +CONFIG_IP_NF_TARGET_MARK=m +CONFIG_IP_NF_TARGET_CLASSIFY=m +CONFIG_IP_NF_TARGET_CONNMARK=m +CONFIG_IP_NF_TARGET_CLUSTERIP=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_TARGET_NOTRACK=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m + +# +# SCTP Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +CONFIG_LLC=y +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set +# CONFIG_NET_SCHED is not set +CONFIG_NET_CLS_ROUTE=y + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +# CONFIG_NETPOLL is not set +# CONFIG_NET_POLL_CONTROLLER is not set +# CONFIG_HAMRADIO is not set +# CONFIG_IRDA is not set +# CONFIG_BT is not set + +# # Device Drivers # @@ -218,6 +375,7 @@ CONFIG_IDEDMA_PCI_AUTO=y # CONFIG_BLK_DEV_HPT366 is not set # CONFIG_BLK_DEV_SC1200 is not set # CONFIG_BLK_DEV_PIIX is not set +# CONFIG_BLK_DEV_IT821X is not set # CONFIG_BLK_DEV_NS87415 is not set # CONFIG_BLK_DEV_PDC202XX_OLD is not set # CONFIG_BLK_DEV_PDC202XX_NEW is not set @@ -251,6 +409,7 @@ CONFIG_CHR_DEV_ST=y CONFIG_BLK_DEV_SR=y CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=y +# CONFIG_CHR_DEV_SCH is not set # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs @@ -338,6 +497,8 @@ CONFIG_DM_ZERO=m # Fusion MPT device support # # CONFIG_FUSION is not set +# CONFIG_FUSION_SPI is not set +# CONFIG_FUSION_FC is not set # # IEEE 1394 (FireWire) support @@ -351,6 +512,7 @@ CONFIG_IEEE1394=y CONFIG_IEEE1394_OUI_DB=y CONFIG_IEEE1394_EXTRA_CONFIG_ROMS=y CONFIG_IEEE1394_CONFIG_ROM_IP1394=y +# CONFIG_IEEE1394_EXPORT_FULL_API is not set # # Device Drivers @@ -380,149 +542,13 @@ CONFIG_IEEE1394_RAWIO=y CONFIG_ADB=y CONFIG_ADB_PMU=y CONFIG_PMAC_SMU=y -# CONFIG_PMAC_PBOOK is not set # CONFIG_PMAC_BACKLIGHT is not set # CONFIG_INPUT_ADBHID is not set CONFIG_THERM_PM72=y # -# Networking support -# -CONFIG_NET=y - -# -# Networking options +# Network device support # -CONFIG_PACKET=y -# CONFIG_PACKET_MMAP is not set -CONFIG_UNIX=y -CONFIG_NET_KEY=m -CONFIG_INET=y -CONFIG_IP_MULTICAST=y -# CONFIG_IP_ADVANCED_ROUTER is not set -# CONFIG_IP_PNP is not set -CONFIG_NET_IPIP=y -# CONFIG_NET_IPGRE is not set -# CONFIG_IP_MROUTE is not set -# CONFIG_ARPD is not set -CONFIG_SYN_COOKIES=y -CONFIG_INET_AH=m -CONFIG_INET_ESP=m -CONFIG_INET_IPCOMP=m -CONFIG_INET_TUNNEL=y -CONFIG_IP_TCPDIAG=m -# CONFIG_IP_TCPDIAG_IPV6 is not set - -# -# IP: Virtual Server Configuration -# -# CONFIG_IP_VS is not set -# CONFIG_IPV6 is not set -CONFIG_NETFILTER=y -# CONFIG_NETFILTER_DEBUG is not set - -# -# IP: Netfilter Configuration -# -CONFIG_IP_NF_CONNTRACK=m -CONFIG_IP_NF_CT_ACCT=y -CONFIG_IP_NF_CONNTRACK_MARK=y -CONFIG_IP_NF_CT_PROTO_SCTP=m -CONFIG_IP_NF_FTP=m -CONFIG_IP_NF_IRC=m -CONFIG_IP_NF_TFTP=m -CONFIG_IP_NF_AMANDA=m -CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m 
-CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m -CONFIG_XFRM=y -CONFIG_XFRM_USER=m - -# -# SCTP Configuration (EXPERIMENTAL) -# -# CONFIG_IP_SCTP is not set -# CONFIG_ATM is not set -# CONFIG_BRIDGE is not set -# CONFIG_VLAN_8021Q is not set -# CONFIG_DECNET is not set -CONFIG_LLC=y -# CONFIG_LLC2 is not set -# CONFIG_IPX is not set -# CONFIG_ATALK is not set -# CONFIG_X25 is not set -# CONFIG_LAPB is not set -# CONFIG_NET_DIVERT is not set -# CONFIG_ECONET is not set -# CONFIG_WAN_ROUTER is not set - -# -# QoS and/or fair queueing -# -# CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y - -# -# Network testing -# -# CONFIG_NET_PKTGEN is not set -# CONFIG_NETPOLL is not set -# CONFIG_NET_POLL_CONTROLLER is not set -# CONFIG_HAMRADIO is not set -# CONFIG_IRDA is not set -# CONFIG_BT is not set CONFIG_NETDEVICES=y CONFIG_DUMMY=m CONFIG_BONDING=m @@ -562,6 +588,7 @@ CONFIG_E1000=y # CONFIG_HAMACHI is not set # 
CONFIG_YELLOWFIN is not set # CONFIG_R8169 is not set +# CONFIG_SKGE is not set # CONFIG_SK98LIN is not set CONFIG_TIGON3=m # CONFIG_BNX2 is not set @@ -750,50 +777,19 @@ CONFIG_I2C_KEYWEST=y # CONFIG_I2C_VIAPRO is not set # CONFIG_I2C_VOODOO3 is not set # CONFIG_I2C_PCA_ISA is not set - -# -# Hardware Sensors Chip support -# # CONFIG_I2C_SENSOR is not set -# CONFIG_SENSORS_ADM1021 is not set -# CONFIG_SENSORS_ADM1025 is not set -# CONFIG_SENSORS_ADM1026 is not set -# CONFIG_SENSORS_ADM1031 is not set -# CONFIG_SENSORS_ASB100 is not set -# CONFIG_SENSORS_DS1621 is not set -# CONFIG_SENSORS_FSCHER is not set -# CONFIG_SENSORS_FSCPOS is not set -# CONFIG_SENSORS_GL518SM is not set -# CONFIG_SENSORS_GL520SM is not set -# CONFIG_SENSORS_IT87 is not set -# CONFIG_SENSORS_LM63 is not set -# CONFIG_SENSORS_LM75 is not set -# CONFIG_SENSORS_LM77 is not set -# CONFIG_SENSORS_LM78 is not set -# CONFIG_SENSORS_LM80 is not set -# CONFIG_SENSORS_LM83 is not set -# CONFIG_SENSORS_LM85 is not set -# CONFIG_SENSORS_LM87 is not set -# CONFIG_SENSORS_LM90 is not set -# CONFIG_SENSORS_LM92 is not set -# CONFIG_SENSORS_MAX1619 is not set -# CONFIG_SENSORS_PC87360 is not set -# CONFIG_SENSORS_SMSC47B397 is not set -# CONFIG_SENSORS_SIS5595 is not set -# CONFIG_SENSORS_SMSC47M1 is not set -# CONFIG_SENSORS_VIA686A is not set -# CONFIG_SENSORS_W83781D is not set -# CONFIG_SENSORS_W83L785TS is not set -# CONFIG_SENSORS_W83627HF is not set # -# Other I2C Chip support +# Miscellaneous I2C Chip support # # CONFIG_SENSORS_DS1337 is not set +# CONFIG_SENSORS_DS1374 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set +# CONFIG_SENSORS_PCA9539 is not set # CONFIG_SENSORS_PCF8591 is not set # CONFIG_SENSORS_RTC8564 is not set +# CONFIG_SENSORS_MAX6875 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set @@ -805,6 +801,11 @@ CONFIG_I2C_KEYWEST=y # CONFIG_W1 is not set # +# Hardware Monitoring support +# +# 
CONFIG_HWMON is not set + +# # Misc devices # @@ -911,6 +912,7 @@ CONFIG_USB_DEVICEFS=y CONFIG_USB_EHCI_HCD=y # CONFIG_USB_EHCI_SPLIT_ISO is not set # CONFIG_USB_EHCI_ROOT_HUB_TT is not set +# CONFIG_USB_ISP116X_HCD is not set CONFIG_USB_OHCI_HCD=y # CONFIG_USB_OHCI_BIG_ENDIAN is not set CONFIG_USB_OHCI_LITTLE_ENDIAN=y @@ -950,12 +952,15 @@ CONFIG_THRUSTMASTER_FF=y CONFIG_USB_HIDDEV=y # CONFIG_USB_AIPTEK is not set # CONFIG_USB_WACOM is not set +# CONFIG_USB_ACECAD is not set # CONFIG_USB_KBTAB is not set # CONFIG_USB_POWERMATE is not set # CONFIG_USB_MTOUCH is not set +# CONFIG_USB_ITMTOUCH is not set # CONFIG_USB_EGALAX is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_KEYSPAN_REMOTE is not set # # USB Imaging devices @@ -1071,10 +1076,11 @@ CONFIG_USB_EZUSB=y # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set # CONFIG_USB_SISUSBVGA is not set +# CONFIG_USB_LD is not set # CONFIG_USB_TEST is not set # -# USB ATM/DSL drivers +# USB DSL modem support # # @@ -1093,12 +1099,18 @@ CONFIG_USB_EZUSB=y # CONFIG_INFINIBAND is not set # +# SN Devices +# + +# # File systems # CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y CONFIG_EXT2_FS_SECURITY=y +CONFIG_EXT2_FS_XIP=y +CONFIG_FS_XIP=y CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y @@ -1126,6 +1138,7 @@ CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set +CONFIG_INOTIFY=y # CONFIG_QUOTA is not set CONFIG_DNOTIFY=y CONFIG_AUTOFS_FS=m @@ -1157,7 +1170,6 @@ CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y -# CONFIG_DEVFS_FS is not set CONFIG_DEVPTS_FS_XATTR=y # CONFIG_DEVPTS_FS_SECURITY is not set CONFIG_TMPFS=y @@ -1189,15 +1201,20 @@ CONFIG_CRAMFS=y # CONFIG_NFS_FS=y CONFIG_NFS_V3=y +CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y # CONFIG_NFS_DIRECTIO is not set CONFIG_NFSD=y +CONFIG_NFSD_V2_ACL=y CONFIG_NFSD_V3=y +CONFIG_NFSD_V3_ACL=y 
CONFIG_NFSD_V4=y CONFIG_NFSD_TCP=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_EXPORTFS=y +CONFIG_NFS_ACL_SUPPORT=y +CONFIG_NFS_COMMON=y CONFIG_SUNRPC=y CONFIG_SUNRPC_GSS=y CONFIG_RPCSEC_GSS_KRB5=y Index: linux-2.6.13-rc3-olh/arch/ppc64/configs/iSeries_defconfig =================================================================== --- linux-2.6.13-rc3-olh.orig/arch/ppc64/configs/iSeries_defconfig +++ linux-2.6.13-rc3-olh/arch/ppc64/configs/iSeries_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.12-rc6 -# Tue Jun 14 17:01:28 2005 +# Linux kernel version: 2.6.13-rc3 +# Wed Jul 13 14:43:39 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -80,8 +80,24 @@ CONFIG_IBMVIO=y CONFIG_IOMMU_VMERGE=y CONFIG_SMP=y CONFIG_NR_CPUS=32 +CONFIG_ARCH_SELECT_MEMORY_MODEL=y +CONFIG_ARCH_FLATMEM_ENABLE=y +CONFIG_SELECT_MEMORY_MODEL=y +CONFIG_FLATMEM_MANUAL=y +# CONFIG_DISCONTIGMEM_MANUAL is not set +# CONFIG_SPARSEMEM_MANUAL is not set +CONFIG_FLATMEM=y +CONFIG_FLAT_NODE_MEM_MAP=y +# CONFIG_NUMA is not set # CONFIG_SCHED_SMT is not set +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set +# CONFIG_PREEMPT_BKL is not set +CONFIG_HZ_100=y +# CONFIG_HZ_250 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=100 CONFIG_GENERIC_HARDIRQS=y CONFIG_MSCHUNKS=y CONFIG_LPARCFG=y @@ -110,6 +126,146 @@ CONFIG_PCI_NAMES=y # CONFIG_HOTPLUG_PCI is not set # +# Networking +# +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_MMAP is not set +CONFIG_UNIX=y +CONFIG_XFRM=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=m +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is not set +CONFIG_IP_FIB_HASH=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=y +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +CONFIG_INET_TUNNEL=y +CONFIG_IP_TCPDIAG=m +# CONFIG_IP_TCPDIAG_IPV6 is not set +# 
CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y + +# +# IP: Virtual Server Configuration +# +# CONFIG_IP_VS is not set +# CONFIG_IPV6 is not set +CONFIG_NETFILTER=y +# CONFIG_NETFILTER_DEBUG is not set + +# +# IP: Netfilter Configuration +# +CONFIG_IP_NF_CONNTRACK=m +CONFIG_IP_NF_CT_ACCT=y +CONFIG_IP_NF_CONNTRACK_MARK=y +CONFIG_IP_NF_CT_PROTO_SCTP=m +CONFIG_IP_NF_FTP=m +CONFIG_IP_NF_IRC=m +CONFIG_IP_NF_TFTP=m +CONFIG_IP_NF_AMANDA=m +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_LIMIT=m +CONFIG_IP_NF_MATCH_IPRANGE=m +CONFIG_IP_NF_MATCH_MAC=m +CONFIG_IP_NF_MATCH_PKTTYPE=m +CONFIG_IP_NF_MATCH_MARK=m +CONFIG_IP_NF_MATCH_MULTIPORT=m +CONFIG_IP_NF_MATCH_TOS=m +CONFIG_IP_NF_MATCH_RECENT=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_DSCP=m +CONFIG_IP_NF_MATCH_AH_ESP=m +CONFIG_IP_NF_MATCH_LENGTH=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_MATCH_TCPMSS=m +CONFIG_IP_NF_MATCH_HELPER=m +CONFIG_IP_NF_MATCH_STATE=m +CONFIG_IP_NF_MATCH_CONNTRACK=m +CONFIG_IP_NF_MATCH_OWNER=m +CONFIG_IP_NF_MATCH_ADDRTYPE=m +CONFIG_IP_NF_MATCH_REALM=m +CONFIG_IP_NF_MATCH_SCTP=m +CONFIG_IP_NF_MATCH_COMMENT=m +CONFIG_IP_NF_MATCH_CONNMARK=m +CONFIG_IP_NF_MATCH_HASHLIMIT=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_LOG=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_TARGET_TCPMSS=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_NAT_NEEDED=y +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_SAME=m +CONFIG_IP_NF_NAT_SNMP_BASIC=m +CONFIG_IP_NF_NAT_IRC=m +CONFIG_IP_NF_NAT_FTP=m +CONFIG_IP_NF_NAT_TFTP=m +CONFIG_IP_NF_NAT_AMANDA=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_TOS=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_DSCP=m +CONFIG_IP_NF_TARGET_MARK=m +CONFIG_IP_NF_TARGET_CLASSIFY=m +CONFIG_IP_NF_TARGET_CONNMARK=m +CONFIG_IP_NF_TARGET_CLUSTERIP=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_TARGET_NOTRACK=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m + +# +# SCTP 
Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +CONFIG_LLC=y +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set +# CONFIG_NET_SCHED is not set +CONFIG_NET_CLS_ROUTE=y + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +CONFIG_NETPOLL=y +CONFIG_NETPOLL_RX=y +CONFIG_NETPOLL_TRAP=y +CONFIG_NET_POLL_CONTROLLER=y +# CONFIG_HAMRADIO is not set +# CONFIG_IRDA is not set +# CONFIG_BT is not set + +# # Device Drivers # @@ -184,6 +340,7 @@ CONFIG_CHR_DEV_ST=y CONFIG_BLK_DEV_SR=y CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=y +# CONFIG_CHR_DEV_SCH is not set # # Some SCSI devices (e.g. CD jukebox) support multiple LUNs @@ -260,6 +417,8 @@ CONFIG_DM_ZERO=m # Fusion MPT device support # # CONFIG_FUSION is not set +# CONFIG_FUSION_SPI is not set +# CONFIG_FUSION_FC is not set # # IEEE 1394 (FireWire) support @@ -276,145 +435,8 @@ CONFIG_DM_ZERO=m # # -# Networking support -# -CONFIG_NET=y - -# -# Networking options +# Network device support # -CONFIG_PACKET=y -# CONFIG_PACKET_MMAP is not set -CONFIG_UNIX=y -CONFIG_NET_KEY=m -CONFIG_INET=y -CONFIG_IP_MULTICAST=y -# CONFIG_IP_ADVANCED_ROUTER is not set -# CONFIG_IP_PNP is not set -CONFIG_NET_IPIP=y -# CONFIG_NET_IPGRE is not set -# CONFIG_IP_MROUTE is not set -# CONFIG_ARPD is not set -CONFIG_SYN_COOKIES=y -CONFIG_INET_AH=m -CONFIG_INET_ESP=m -CONFIG_INET_IPCOMP=m -CONFIG_INET_TUNNEL=y -CONFIG_IP_TCPDIAG=m -# CONFIG_IP_TCPDIAG_IPV6 is not set - -# -# IP: Virtual Server Configuration -# -# CONFIG_IP_VS is not set -# CONFIG_IPV6 is not set -CONFIG_NETFILTER=y -# CONFIG_NETFILTER_DEBUG is not set - -# -# IP: Netfilter Configuration -# -CONFIG_IP_NF_CONNTRACK=m -CONFIG_IP_NF_CT_ACCT=y -CONFIG_IP_NF_CONNTRACK_MARK=y 
-CONFIG_IP_NF_CT_PROTO_SCTP=m -CONFIG_IP_NF_FTP=m -CONFIG_IP_NF_IRC=m -CONFIG_IP_NF_TFTP=m -CONFIG_IP_NF_AMANDA=m -CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m -CONFIG_XFRM=y -CONFIG_XFRM_USER=m - -# -# SCTP Configuration (EXPERIMENTAL) -# -# CONFIG_IP_SCTP is not set -# CONFIG_ATM is not set -# CONFIG_BRIDGE is not set -# CONFIG_VLAN_8021Q is not set -# CONFIG_DECNET is not set -CONFIG_LLC=y -# CONFIG_LLC2 is not set -# CONFIG_IPX is not set -# CONFIG_ATALK is not set -# CONFIG_X25 is not set -# 
CONFIG_LAPB is not set -# CONFIG_NET_DIVERT is not set -# CONFIG_ECONET is not set -# CONFIG_WAN_ROUTER is not set - -# -# QoS and/or fair queueing -# -# CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y - -# -# Network testing -# -# CONFIG_NET_PKTGEN is not set -CONFIG_NETPOLL=y -CONFIG_NETPOLL_RX=y -CONFIG_NETPOLL_TRAP=y -CONFIG_NET_POLL_CONTROLLER=y -# CONFIG_HAMRADIO is not set -# CONFIG_IRDA is not set -# CONFIG_BT is not set CONFIG_NETDEVICES=y CONFIG_DUMMY=m CONFIG_BONDING=m @@ -471,6 +493,7 @@ CONFIG_E1000=m # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set # CONFIG_R8169 is not set +# CONFIG_SKGE is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set # CONFIG_TIGON3 is not set @@ -610,6 +633,7 @@ CONFIG_MAX_RAW_DEVS=256 # I2C support # # CONFIG_I2C is not set +# CONFIG_I2C_SENSOR is not set # # Dallas's 1-wire bus @@ -617,6 +641,11 @@ CONFIG_MAX_RAW_DEVS=256 # CONFIG_W1 is not set # +# Hardware Monitoring support +# +# CONFIG_HWMON is not set + +# # Misc devices # @@ -663,12 +692,18 @@ CONFIG_USB_ARCH_HAS_OHCI=y # CONFIG_INFINIBAND is not set # +# SN Devices +# + +# # File systems # CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y CONFIG_EXT2_FS_SECURITY=y +CONFIG_EXT2_FS_XIP=y +CONFIG_FS_XIP=y CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y @@ -700,6 +735,7 @@ CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set +CONFIG_INOTIFY=y # CONFIG_QUOTA is not set CONFIG_DNOTIFY=y CONFIG_AUTOFS_FS=m @@ -731,7 +767,6 @@ CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y -# CONFIG_DEVFS_FS is not set CONFIG_DEVPTS_FS_XATTR=y CONFIG_DEVPTS_FS_SECURITY=y CONFIG_TMPFS=y @@ -763,15 +798,20 @@ CONFIG_CRAMFS=y # CONFIG_NFS_FS=y CONFIG_NFS_V3=y +CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y # CONFIG_NFS_DIRECTIO is not set CONFIG_NFSD=m +CONFIG_NFSD_V2_ACL=y CONFIG_NFSD_V3=y +CONFIG_NFSD_V3_ACL=y CONFIG_NFSD_V4=y 
CONFIG_NFSD_TCP=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_EXPORTFS=m +CONFIG_NFS_ACL_SUPPORT=y +CONFIG_NFS_COMMON=y CONFIG_SUNRPC=y CONFIG_SUNRPC_GSS=y CONFIG_RPCSEC_GSS_KRB5=y Index: linux-2.6.13-rc3-olh/arch/ppc64/configs/maple_defconfig =================================================================== --- linux-2.6.13-rc3-olh.orig/arch/ppc64/configs/maple_defconfig +++ linux-2.6.13-rc3-olh/arch/ppc64/configs/maple_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.12-rc6 -# Tue Jun 14 17:12:48 2005 +# Linux kernel version: 2.6.13-rc3 +# Wed Jul 13 14:46:18 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -73,12 +73,15 @@ CONFIG_SYSVIPC_COMPAT=y # CONFIG_PPC_ISERIES is not set CONFIG_PPC_MULTIPLATFORM=y # CONFIG_PPC_PSERIES is not set +# CONFIG_PPC_BPA is not set # CONFIG_PPC_PMAC is not set CONFIG_PPC_MAPLE=y CONFIG_PPC=y CONFIG_PPC64=y CONFIG_PPC_OF=y +CONFIG_MPIC=y # CONFIG_ALTIVEC is not set +CONFIG_KEXEC=y CONFIG_U3_DART=y CONFIG_MPIC_BROKEN_U3=y CONFIG_BOOTX_TEXT=y @@ -86,8 +89,24 @@ CONFIG_POWER4_ONLY=y CONFIG_IOMMU_VMERGE=y CONFIG_SMP=y CONFIG_NR_CPUS=2 +CONFIG_ARCH_SELECT_MEMORY_MODEL=y +CONFIG_ARCH_FLATMEM_ENABLE=y +CONFIG_SELECT_MEMORY_MODEL=y +CONFIG_FLATMEM_MANUAL=y +# CONFIG_DISCONTIGMEM_MANUAL is not set +# CONFIG_SPARSEMEM_MANUAL is not set +CONFIG_FLATMEM=y +CONFIG_FLAT_NODE_MEM_MAP=y +# CONFIG_NUMA is not set # CONFIG_SCHED_SMT is not set +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set +# CONFIG_PREEMPT_BKL is not set +CONFIG_HZ_100=y +# CONFIG_HZ_250 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=100 CONFIG_GENERIC_HARDIRQS=y CONFIG_SECCOMP=y CONFIG_ISA_DMA_API=y @@ -116,6 +135,71 @@ CONFIG_PROC_DEVICETREE=y # CONFIG_CMDLINE_BOOL is not set # +# Networking +# +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +CONFIG_PACKET_MMAP=y +CONFIG_UNIX=y +# CONFIG_NET_KEY is not set +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is 
not set +CONFIG_IP_FIB_HASH=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +# CONFIG_IP_PNP_BOOTP is not set +# CONFIG_IP_PNP_RARP is not set +# CONFIG_NET_IPIP is not set +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +# CONFIG_SYN_COOKIES is not set +# CONFIG_INET_AH is not set +# CONFIG_INET_ESP is not set +# CONFIG_INET_IPCOMP is not set +# CONFIG_INET_TUNNEL is not set +CONFIG_IP_TCPDIAG=y +# CONFIG_IP_TCPDIAG_IPV6 is not set +# CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y +# CONFIG_IPV6 is not set +# CONFIG_NETFILTER is not set + +# +# SCTP Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set +# CONFIG_NET_SCHED is not set +# CONFIG_NET_CLS_ROUTE is not set + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +# CONFIG_NETPOLL is not set +# CONFIG_NET_POLL_CONTROLLER is not set +# CONFIG_HAMRADIO is not set +# CONFIG_IRDA is not set +# CONFIG_BT is not set + +# # Device Drivers # @@ -213,6 +297,7 @@ CONFIG_BLK_DEV_AMD74XX=y # CONFIG_BLK_DEV_HPT366 is not set # CONFIG_BLK_DEV_SC1200 is not set # CONFIG_BLK_DEV_PIIX is not set +# CONFIG_BLK_DEV_IT821X is not set # CONFIG_BLK_DEV_NS87415 is not set # CONFIG_BLK_DEV_PDC202XX_OLD is not set # CONFIG_BLK_DEV_PDC202XX_NEW is not set @@ -240,6 +325,7 @@ CONFIG_IDEDMA_AUTO=y # # Fusion MPT device support # +# CONFIG_FUSION is not set # # IEEE 1394 (FireWire) support @@ -256,70 +342,8 @@ CONFIG_IDEDMA_AUTO=y # # -# Networking support -# -CONFIG_NET=y - -# -# Networking options +# Network device support # -CONFIG_PACKET=y -CONFIG_PACKET_MMAP=y -CONFIG_UNIX=y -# CONFIG_NET_KEY is not set -CONFIG_INET=y -CONFIG_IP_MULTICAST=y -# 
CONFIG_IP_ADVANCED_ROUTER is not set -CONFIG_IP_PNP=y -CONFIG_IP_PNP_DHCP=y -# CONFIG_IP_PNP_BOOTP is not set -# CONFIG_IP_PNP_RARP is not set -# CONFIG_NET_IPIP is not set -# CONFIG_NET_IPGRE is not set -# CONFIG_IP_MROUTE is not set -# CONFIG_ARPD is not set -# CONFIG_SYN_COOKIES is not set -# CONFIG_INET_AH is not set -# CONFIG_INET_ESP is not set -# CONFIG_INET_IPCOMP is not set -# CONFIG_INET_TUNNEL is not set -CONFIG_IP_TCPDIAG=y -# CONFIG_IP_TCPDIAG_IPV6 is not set -# CONFIG_IPV6 is not set -# CONFIG_NETFILTER is not set - -# -# SCTP Configuration (EXPERIMENTAL) -# -# CONFIG_IP_SCTP is not set -# CONFIG_ATM is not set -# CONFIG_BRIDGE is not set -# CONFIG_VLAN_8021Q is not set -# CONFIG_DECNET is not set -# CONFIG_LLC2 is not set -# CONFIG_IPX is not set -# CONFIG_ATALK is not set -# CONFIG_X25 is not set -# CONFIG_LAPB is not set -# CONFIG_NET_DIVERT is not set -# CONFIG_ECONET is not set -# CONFIG_WAN_ROUTER is not set - -# -# QoS and/or fair queueing -# -# CONFIG_NET_SCHED is not set -# CONFIG_NET_CLS_ROUTE is not set - -# -# Network testing -# -# CONFIG_NET_PKTGEN is not set -# CONFIG_NETPOLL is not set -# CONFIG_NET_POLL_CONTROLLER is not set -# CONFIG_HAMRADIO is not set -# CONFIG_IRDA is not set -# CONFIG_BT is not set CONFIG_NETDEVICES=y # CONFIG_DUMMY is not set # CONFIG_BONDING is not set @@ -376,6 +400,7 @@ CONFIG_E1000=y # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set # CONFIG_R8169 is not set +# CONFIG_SKGE is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set # CONFIG_TIGON3 is not set @@ -543,50 +568,19 @@ CONFIG_I2C_AMD8111=y # CONFIG_I2C_VIAPRO is not set # CONFIG_I2C_VOODOO3 is not set # CONFIG_I2C_PCA_ISA is not set - -# -# Hardware Sensors Chip support -# # CONFIG_I2C_SENSOR is not set -# CONFIG_SENSORS_ADM1021 is not set -# CONFIG_SENSORS_ADM1025 is not set -# CONFIG_SENSORS_ADM1026 is not set -# CONFIG_SENSORS_ADM1031 is not set -# CONFIG_SENSORS_ASB100 is not set -# CONFIG_SENSORS_DS1621 is not set -# 
CONFIG_SENSORS_FSCHER is not set -# CONFIG_SENSORS_FSCPOS is not set -# CONFIG_SENSORS_GL518SM is not set -# CONFIG_SENSORS_GL520SM is not set -# CONFIG_SENSORS_IT87 is not set -# CONFIG_SENSORS_LM63 is not set -# CONFIG_SENSORS_LM75 is not set -# CONFIG_SENSORS_LM77 is not set -# CONFIG_SENSORS_LM78 is not set -# CONFIG_SENSORS_LM80 is not set -# CONFIG_SENSORS_LM83 is not set -# CONFIG_SENSORS_LM85 is not set -# CONFIG_SENSORS_LM87 is not set -# CONFIG_SENSORS_LM90 is not set -# CONFIG_SENSORS_LM92 is not set -# CONFIG_SENSORS_MAX1619 is not set -# CONFIG_SENSORS_PC87360 is not set -# CONFIG_SENSORS_SMSC47B397 is not set -# CONFIG_SENSORS_SIS5595 is not set -# CONFIG_SENSORS_SMSC47M1 is not set -# CONFIG_SENSORS_VIA686A is not set -# CONFIG_SENSORS_W83781D is not set -# CONFIG_SENSORS_W83L785TS is not set -# CONFIG_SENSORS_W83627HF is not set # -# Other I2C Chip support +# Miscellaneous I2C Chip support # # CONFIG_SENSORS_DS1337 is not set +# CONFIG_SENSORS_DS1374 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set +# CONFIG_SENSORS_PCA9539 is not set # CONFIG_SENSORS_PCF8591 is not set # CONFIG_SENSORS_RTC8564 is not set +# CONFIG_SENSORS_MAX6875 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set @@ -598,6 +592,11 @@ CONFIG_I2C_AMD8111=y # CONFIG_W1 is not set # +# Hardware Monitoring support +# +# CONFIG_HWMON is not set + +# # Misc devices # @@ -649,6 +648,7 @@ CONFIG_USB_DEVICEFS=y CONFIG_USB_EHCI_HCD=y CONFIG_USB_EHCI_SPLIT_ISO=y CONFIG_USB_EHCI_ROOT_HUB_TT=y +# CONFIG_USB_ISP116X_HCD is not set CONFIG_USB_OHCI_HCD=y # CONFIG_USB_OHCI_BIG_ENDIAN is not set CONFIG_USB_OHCI_LITTLE_ENDIAN=y @@ -676,12 +676,15 @@ CONFIG_USB_HIDINPUT=y # CONFIG_USB_HIDDEV is not set # CONFIG_USB_AIPTEK is not set # CONFIG_USB_WACOM is not set +# CONFIG_USB_ACECAD is not set # CONFIG_USB_KBTAB is not set # CONFIG_USB_POWERMATE is not set # CONFIG_USB_MTOUCH is not set +# 
CONFIG_USB_ITMTOUCH is not set # CONFIG_USB_EGALAX is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_KEYSPAN_REMOTE is not set # # USB Imaging devices @@ -772,10 +775,11 @@ CONFIG_USB_EZUSB=y # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set # CONFIG_USB_SISUSBVGA is not set +# CONFIG_USB_LD is not set # CONFIG_USB_TEST is not set # -# USB ATM/DSL drivers +# USB DSL modem support # # @@ -794,16 +798,23 @@ CONFIG_USB_EZUSB=y # CONFIG_INFINIBAND is not set # +# SN Devices +# + +# # File systems # CONFIG_EXT2_FS=y # CONFIG_EXT2_FS_XATTR is not set +CONFIG_EXT2_FS_XIP=y +CONFIG_FS_XIP=y CONFIG_EXT3_FS=y # CONFIG_EXT3_FS_XATTR is not set CONFIG_JBD=y # CONFIG_JBD_DEBUG is not set # CONFIG_REISERFS_FS is not set # CONFIG_JFS_FS is not set +CONFIG_FS_POSIX_ACL=y # # XFS support @@ -811,6 +822,7 @@ CONFIG_JBD=y # CONFIG_XFS_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set +CONFIG_INOTIFY=y # CONFIG_QUOTA is not set CONFIG_DNOTIFY=y # CONFIG_AUTOFS_FS is not set @@ -838,7 +850,6 @@ CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y -# CONFIG_DEVFS_FS is not set CONFIG_DEVPTS_FS_XATTR=y # CONFIG_DEVPTS_FS_SECURITY is not set CONFIG_TMPFS=y @@ -870,12 +881,15 @@ CONFIG_CRAMFS=y # CONFIG_NFS_FS=y CONFIG_NFS_V3=y +CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y # CONFIG_NFS_DIRECTIO is not set # CONFIG_NFSD is not set CONFIG_ROOT_NFS=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y +CONFIG_NFS_ACL_SUPPORT=y +CONFIG_NFS_COMMON=y CONFIG_SUNRPC=y CONFIG_SUNRPC_GSS=y CONFIG_RPCSEC_GSS_KRB5=y Index: linux-2.6.13-rc3-olh/arch/ppc64/configs/pSeries_defconfig =================================================================== --- linux-2.6.13-rc3-olh.orig/arch/ppc64/configs/pSeries_defconfig +++ linux-2.6.13-rc3-olh/arch/ppc64/configs/pSeries_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.12-rc6 -# Tue Jun 14 17:13:47 2005 +# Linux 
kernel version: 2.6.13-rc3 +# Wed Jul 13 14:47:54 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -74,13 +74,17 @@ CONFIG_SYSVIPC_COMPAT=y # CONFIG_PPC_ISERIES is not set CONFIG_PPC_MULTIPLATFORM=y CONFIG_PPC_PSERIES=y +# CONFIG_PPC_BPA is not set # CONFIG_PPC_PMAC is not set # CONFIG_PPC_MAPLE is not set CONFIG_PPC=y CONFIG_PPC64=y CONFIG_PPC_OF=y +CONFIG_XICS=y +CONFIG_MPIC=y CONFIG_ALTIVEC=y CONFIG_PPC_SPLPAR=y +CONFIG_KEXEC=y CONFIG_IBMVIO=y # CONFIG_U3_DART is not set # CONFIG_BOOTX_TEXT is not set @@ -88,10 +92,30 @@ CONFIG_IBMVIO=y CONFIG_IOMMU_VMERGE=y CONFIG_SMP=y CONFIG_NR_CPUS=128 +CONFIG_ARCH_SELECT_MEMORY_MODEL=y +CONFIG_ARCH_FLATMEM_ENABLE=y CONFIG_ARCH_DISCONTIGMEM_ENABLE=y +CONFIG_ARCH_DISCONTIGMEM_DEFAULT=y +CONFIG_ARCH_SPARSEMEM_ENABLE=y +CONFIG_SELECT_MEMORY_MODEL=y +# CONFIG_FLATMEM_MANUAL is not set +CONFIG_DISCONTIGMEM_MANUAL=y +# CONFIG_SPARSEMEM_MANUAL is not set +CONFIG_DISCONTIGMEM=y +CONFIG_FLAT_NODE_MEM_MAP=y +CONFIG_NEED_MULTIPLE_NODES=y +CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y +CONFIG_NODES_SPAN_OTHER_NODES=y CONFIG_NUMA=y CONFIG_SCHED_SMT=y +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set +# CONFIG_PREEMPT_BKL is not set +CONFIG_HZ_100=y +# CONFIG_HZ_250 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=100 CONFIG_EEH=y CONFIG_GENERIC_HARDIRQS=y CONFIG_PPC_RTAS=y @@ -132,6 +156,146 @@ CONFIG_PROC_DEVICETREE=y # CONFIG_CMDLINE_BOOL is not set # +# Networking +# +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_MMAP is not set +CONFIG_UNIX=y +CONFIG_XFRM=y +CONFIG_XFRM_USER=m +CONFIG_NET_KEY=m +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is not set +CONFIG_IP_FIB_HASH=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=y +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +CONFIG_INET_TUNNEL=y +CONFIG_IP_TCPDIAG=m +# CONFIG_IP_TCPDIAG_IPV6 is not set 
+# CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y + +# +# IP: Virtual Server Configuration +# +# CONFIG_IP_VS is not set +# CONFIG_IPV6 is not set +CONFIG_NETFILTER=y +# CONFIG_NETFILTER_DEBUG is not set + +# +# IP: Netfilter Configuration +# +CONFIG_IP_NF_CONNTRACK=m +CONFIG_IP_NF_CT_ACCT=y +CONFIG_IP_NF_CONNTRACK_MARK=y +CONFIG_IP_NF_CT_PROTO_SCTP=m +CONFIG_IP_NF_FTP=m +CONFIG_IP_NF_IRC=m +CONFIG_IP_NF_TFTP=m +CONFIG_IP_NF_AMANDA=m +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_LIMIT=m +CONFIG_IP_NF_MATCH_IPRANGE=m +CONFIG_IP_NF_MATCH_MAC=m +CONFIG_IP_NF_MATCH_PKTTYPE=m +CONFIG_IP_NF_MATCH_MARK=m +CONFIG_IP_NF_MATCH_MULTIPORT=m +CONFIG_IP_NF_MATCH_TOS=m +CONFIG_IP_NF_MATCH_RECENT=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_DSCP=m +CONFIG_IP_NF_MATCH_AH_ESP=m +CONFIG_IP_NF_MATCH_LENGTH=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_MATCH_TCPMSS=m +CONFIG_IP_NF_MATCH_HELPER=m +CONFIG_IP_NF_MATCH_STATE=m +CONFIG_IP_NF_MATCH_CONNTRACK=m +CONFIG_IP_NF_MATCH_OWNER=m +CONFIG_IP_NF_MATCH_ADDRTYPE=m +CONFIG_IP_NF_MATCH_REALM=m +CONFIG_IP_NF_MATCH_SCTP=m +CONFIG_IP_NF_MATCH_COMMENT=m +CONFIG_IP_NF_MATCH_CONNMARK=m +CONFIG_IP_NF_MATCH_HASHLIMIT=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_LOG=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_TARGET_TCPMSS=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_NAT_NEEDED=y +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_SAME=m +CONFIG_IP_NF_NAT_SNMP_BASIC=m +CONFIG_IP_NF_NAT_IRC=m +CONFIG_IP_NF_NAT_FTP=m +CONFIG_IP_NF_NAT_TFTP=m +CONFIG_IP_NF_NAT_AMANDA=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_TOS=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_DSCP=m +CONFIG_IP_NF_TARGET_MARK=m +CONFIG_IP_NF_TARGET_CLASSIFY=m +CONFIG_IP_NF_TARGET_CONNMARK=m +CONFIG_IP_NF_TARGET_CLUSTERIP=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_TARGET_NOTRACK=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m + +# +# SCTP 
Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +CONFIG_LLC=y +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set +# CONFIG_NET_SCHED is not set +CONFIG_NET_CLS_ROUTE=y + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +CONFIG_NETPOLL=y +CONFIG_NETPOLL_RX=y +CONFIG_NETPOLL_TRAP=y +CONFIG_NET_POLL_CONTROLLER=y +# CONFIG_HAMRADIO is not set +# CONFIG_IRDA is not set +# CONFIG_BT is not set + +# # Device Drivers # @@ -238,6 +402,7 @@ CONFIG_BLK_DEV_AMD74XX=y # CONFIG_BLK_DEV_HPT366 is not set # CONFIG_BLK_DEV_SC1200 is not set # CONFIG_BLK_DEV_PIIX is not set +# CONFIG_BLK_DEV_IT821X is not set # CONFIG_BLK_DEV_NS87415 is not set # CONFIG_BLK_DEV_PDC202XX_OLD is not set # CONFIG_BLK_DEV_PDC202XX_NEW is not set @@ -267,6 +432,7 @@ CONFIG_CHR_DEV_ST=y CONFIG_BLK_DEV_SR=y CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=y +# CONFIG_CHR_DEV_SCH is not set # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs @@ -352,6 +518,8 @@ CONFIG_DM_MULTIPATH_EMC=m # Fusion MPT device support # # CONFIG_FUSION is not set +# CONFIG_FUSION_SPI is not set +# CONFIG_FUSION_FC is not set # # IEEE 1394 (FireWire) support @@ -368,145 +536,8 @@ CONFIG_DM_MULTIPATH_EMC=m # # -# Networking support -# -CONFIG_NET=y - -# -# Networking options +# Network device support # -CONFIG_PACKET=y -# CONFIG_PACKET_MMAP is not set -CONFIG_UNIX=y -CONFIG_NET_KEY=m -CONFIG_INET=y -CONFIG_IP_MULTICAST=y -# CONFIG_IP_ADVANCED_ROUTER is not set -# CONFIG_IP_PNP is not set -CONFIG_NET_IPIP=y -# CONFIG_NET_IPGRE is not set -# CONFIG_IP_MROUTE is not set -# CONFIG_ARPD is not set -CONFIG_SYN_COOKIES=y -CONFIG_INET_AH=m -CONFIG_INET_ESP=m -CONFIG_INET_IPCOMP=m -CONFIG_INET_TUNNEL=y -CONFIG_IP_TCPDIAG=m -# CONFIG_IP_TCPDIAG_IPV6 is not set - -# -# IP: Virtual Server Configuration -# -# CONFIG_IP_VS is not set -# CONFIG_IPV6 is not set -CONFIG_NETFILTER=y -# CONFIG_NETFILTER_DEBUG is not set - -# -# IP: Netfilter Configuration -# -CONFIG_IP_NF_CONNTRACK=m -CONFIG_IP_NF_CT_ACCT=y -CONFIG_IP_NF_CONNTRACK_MARK=y -CONFIG_IP_NF_CT_PROTO_SCTP=m -CONFIG_IP_NF_FTP=m -CONFIG_IP_NF_IRC=m -CONFIG_IP_NF_TFTP=m -CONFIG_IP_NF_AMANDA=m -CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m -CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_FILTER=m 
-CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m -CONFIG_XFRM=y -CONFIG_XFRM_USER=m - -# -# SCTP Configuration (EXPERIMENTAL) -# -# CONFIG_IP_SCTP is not set -# CONFIG_ATM is not set -# CONFIG_BRIDGE is not set -# CONFIG_VLAN_8021Q is not set -# CONFIG_DECNET is not set -CONFIG_LLC=y -# CONFIG_LLC2 is not set -# CONFIG_IPX is not set -# CONFIG_ATALK is not set -# CONFIG_X25 is not set -# CONFIG_LAPB is not set -# CONFIG_NET_DIVERT is not set -# CONFIG_ECONET is not set -# CONFIG_WAN_ROUTER is not set - -# -# QoS and/or fair queueing -# -# CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y - -# -# Network testing -# -# CONFIG_NET_PKTGEN is not set -CONFIG_NETPOLL=y -CONFIG_NETPOLL_RX=y -CONFIG_NETPOLL_TRAP=y -CONFIG_NET_POLL_CONTROLLER=y -# CONFIG_HAMRADIO is not set -# CONFIG_IRDA is not set -# CONFIG_BT is not set CONFIG_NETDEVICES=y CONFIG_DUMMY=m CONFIG_BONDING=m @@ -566,6 +597,7 @@ CONFIG_E1000=y # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set # CONFIG_R8169 is not set +# CONFIG_SKGE is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y @@ -772,50 +804,19 @@ CONFIG_I2C_ALGOBIT=y # CONFIG_I2C_VIAPRO is not set # CONFIG_I2C_VOODOO3 is not set # CONFIG_I2C_PCA_ISA is not set - -# -# Hardware Sensors Chip 
support -# # CONFIG_I2C_SENSOR is not set -# CONFIG_SENSORS_ADM1021 is not set -# CONFIG_SENSORS_ADM1025 is not set -# CONFIG_SENSORS_ADM1026 is not set -# CONFIG_SENSORS_ADM1031 is not set -# CONFIG_SENSORS_ASB100 is not set -# CONFIG_SENSORS_DS1621 is not set -# CONFIG_SENSORS_FSCHER is not set -# CONFIG_SENSORS_FSCPOS is not set -# CONFIG_SENSORS_GL518SM is not set -# CONFIG_SENSORS_GL520SM is not set -# CONFIG_SENSORS_IT87 is not set -# CONFIG_SENSORS_LM63 is not set -# CONFIG_SENSORS_LM75 is not set -# CONFIG_SENSORS_LM77 is not set -# CONFIG_SENSORS_LM78 is not set -# CONFIG_SENSORS_LM80 is not set -# CONFIG_SENSORS_LM83 is not set -# CONFIG_SENSORS_LM85 is not set -# CONFIG_SENSORS_LM87 is not set -# CONFIG_SENSORS_LM90 is not set -# CONFIG_SENSORS_LM92 is not set -# CONFIG_SENSORS_MAX1619 is not set -# CONFIG_SENSORS_PC87360 is not set -# CONFIG_SENSORS_SMSC47B397 is not set -# CONFIG_SENSORS_SIS5595 is not set -# CONFIG_SENSORS_SMSC47M1 is not set -# CONFIG_SENSORS_VIA686A is not set -# CONFIG_SENSORS_W83781D is not set -# CONFIG_SENSORS_W83L785TS is not set -# CONFIG_SENSORS_W83627HF is not set # -# Other I2C Chip support +# Miscellaneous I2C Chip support # # CONFIG_SENSORS_DS1337 is not set +# CONFIG_SENSORS_DS1374 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set +# CONFIG_SENSORS_PCA9539 is not set # CONFIG_SENSORS_PCF8591 is not set # CONFIG_SENSORS_RTC8564 is not set +# CONFIG_SENSORS_MAX6875 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set @@ -827,6 +828,11 @@ CONFIG_I2C_ALGOBIT=y # CONFIG_W1 is not set # +# Hardware Monitoring support +# +# CONFIG_HWMON is not set + +# # Misc devices # @@ -933,6 +939,7 @@ CONFIG_USB_DEVICEFS=y CONFIG_USB_EHCI_HCD=y # CONFIG_USB_EHCI_SPLIT_ISO is not set # CONFIG_USB_EHCI_ROOT_HUB_TT is not set +# CONFIG_USB_ISP116X_HCD is not set CONFIG_USB_OHCI_HCD=y # CONFIG_USB_OHCI_BIG_ENDIAN is not set 
CONFIG_USB_OHCI_LITTLE_ENDIAN=y @@ -969,12 +976,15 @@ CONFIG_USB_HIDINPUT=y CONFIG_USB_HIDDEV=y # CONFIG_USB_AIPTEK is not set # CONFIG_USB_WACOM is not set +# CONFIG_USB_ACECAD is not set # CONFIG_USB_KBTAB is not set # CONFIG_USB_POWERMATE is not set # CONFIG_USB_MTOUCH is not set +# CONFIG_USB_ITMTOUCH is not set # CONFIG_USB_EGALAX is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_KEYSPAN_REMOTE is not set # # USB Imaging devices @@ -1026,10 +1036,11 @@ CONFIG_USB_MON=y # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set # CONFIG_USB_SISUSBVGA is not set +# CONFIG_USB_LD is not set # CONFIG_USB_TEST is not set # -# USB ATM/DSL drivers +# USB DSL modem support # # @@ -1046,18 +1057,25 @@ CONFIG_USB_MON=y # InfiniBand support # CONFIG_INFINIBAND=m +CONFIG_INFINIBAND_USER_VERBS=m CONFIG_INFINIBAND_MTHCA=m # CONFIG_INFINIBAND_MTHCA_DEBUG is not set CONFIG_INFINIBAND_IPOIB=m # CONFIG_INFINIBAND_IPOIB_DEBUG is not set # +# SN Devices +# + +# # File systems # CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y CONFIG_EXT2_FS_SECURITY=y +CONFIG_EXT2_FS_XIP=y +CONFIG_FS_XIP=y CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y @@ -1089,6 +1107,7 @@ CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set +CONFIG_INOTIFY=y # CONFIG_QUOTA is not set CONFIG_DNOTIFY=y CONFIG_AUTOFS_FS=m @@ -1120,7 +1139,6 @@ CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y -# CONFIG_DEVFS_FS is not set CONFIG_DEVPTS_FS_XATTR=y CONFIG_DEVPTS_FS_SECURITY=y CONFIG_TMPFS=y @@ -1152,15 +1170,20 @@ CONFIG_CRAMFS=y # CONFIG_NFS_FS=y CONFIG_NFS_V3=y +CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y # CONFIG_NFS_DIRECTIO is not set CONFIG_NFSD=y +CONFIG_NFSD_V2_ACL=y CONFIG_NFSD_V3=y +CONFIG_NFSD_V3_ACL=y CONFIG_NFSD_V4=y CONFIG_NFSD_TCP=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_EXPORTFS=y +CONFIG_NFS_ACL_SUPPORT=y +CONFIG_NFS_COMMON=y 
CONFIG_SUNRPC=y CONFIG_SUNRPC_GSS=y CONFIG_RPCSEC_GSS_KRB5=y Index: linux-2.6.13-rc3-olh/arch/ppc64/defconfig =================================================================== --- linux-2.6.13-rc3-olh.orig/arch/ppc64/defconfig +++ linux-2.6.13-rc3-olh/arch/ppc64/defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.12-rc5-git9 -# Sun Jun 5 09:26:47 2005 +# Linux kernel version: 2.6.13-rc3 +# Wed Jul 13 14:37:07 2005 # CONFIG_64BIT=y CONFIG_MMU=y @@ -73,13 +73,18 @@ CONFIG_SYSVIPC_COMPAT=y # CONFIG_PPC_ISERIES is not set CONFIG_PPC_MULTIPLATFORM=y CONFIG_PPC_PSERIES=y +CONFIG_PPC_BPA=y CONFIG_PPC_PMAC=y CONFIG_PPC_MAPLE=y CONFIG_PPC=y CONFIG_PPC64=y CONFIG_PPC_OF=y +CONFIG_XICS=y +CONFIG_MPIC=y +CONFIG_BPA_IIC=y CONFIG_ALTIVEC=y CONFIG_PPC_SPLPAR=y +CONFIG_KEXEC=y CONFIG_IBMVIO=y CONFIG_U3_DART=y CONFIG_MPIC_BROKEN_U3=y @@ -89,10 +94,30 @@ CONFIG_BOOTX_TEXT=y CONFIG_IOMMU_VMERGE=y CONFIG_SMP=y CONFIG_NR_CPUS=32 +CONFIG_ARCH_SELECT_MEMORY_MODEL=y +CONFIG_ARCH_FLATMEM_ENABLE=y CONFIG_ARCH_DISCONTIGMEM_ENABLE=y +CONFIG_ARCH_DISCONTIGMEM_DEFAULT=y +CONFIG_ARCH_SPARSEMEM_ENABLE=y +CONFIG_SELECT_MEMORY_MODEL=y +# CONFIG_FLATMEM_MANUAL is not set +CONFIG_DISCONTIGMEM_MANUAL=y +# CONFIG_SPARSEMEM_MANUAL is not set +CONFIG_DISCONTIGMEM=y +CONFIG_FLAT_NODE_MEM_MAP=y +CONFIG_NEED_MULTIPLE_NODES=y +CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y +CONFIG_NODES_SPAN_OTHER_NODES=y # CONFIG_NUMA is not set # CONFIG_SCHED_SMT is not set +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set +# CONFIG_PREEMPT_BKL is not set +CONFIG_HZ_100=y +# CONFIG_HZ_250 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=100 CONFIG_EEH=y CONFIG_GENERIC_HARDIRQS=y CONFIG_PPC_RTAS=y @@ -133,6 +158,146 @@ CONFIG_PROC_DEVICETREE=y # CONFIG_CMDLINE_BOOL is not set # +# Networking +# +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_MMAP is not set +CONFIG_UNIX=y +CONFIG_XFRM=y +CONFIG_XFRM_USER=m 
+CONFIG_NET_KEY=m +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is not set +CONFIG_IP_FIB_HASH=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=y +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +CONFIG_SYN_COOKIES=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +CONFIG_INET_TUNNEL=y +# CONFIG_IP_TCPDIAG is not set +# CONFIG_IP_TCPDIAG_IPV6 is not set +# CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y + +# +# IP: Virtual Server Configuration +# +# CONFIG_IP_VS is not set +# CONFIG_IPV6 is not set +CONFIG_NETFILTER=y +# CONFIG_NETFILTER_DEBUG is not set + +# +# IP: Netfilter Configuration +# +CONFIG_IP_NF_CONNTRACK=m +CONFIG_IP_NF_CT_ACCT=y +CONFIG_IP_NF_CONNTRACK_MARK=y +CONFIG_IP_NF_CT_PROTO_SCTP=m +CONFIG_IP_NF_FTP=m +CONFIG_IP_NF_IRC=m +CONFIG_IP_NF_TFTP=m +CONFIG_IP_NF_AMANDA=m +CONFIG_IP_NF_QUEUE=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_LIMIT=m +CONFIG_IP_NF_MATCH_IPRANGE=m +CONFIG_IP_NF_MATCH_MAC=m +CONFIG_IP_NF_MATCH_PKTTYPE=m +CONFIG_IP_NF_MATCH_MARK=m +CONFIG_IP_NF_MATCH_MULTIPORT=m +CONFIG_IP_NF_MATCH_TOS=m +CONFIG_IP_NF_MATCH_RECENT=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_DSCP=m +CONFIG_IP_NF_MATCH_AH_ESP=m +CONFIG_IP_NF_MATCH_LENGTH=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_MATCH_TCPMSS=m +CONFIG_IP_NF_MATCH_HELPER=m +CONFIG_IP_NF_MATCH_STATE=m +CONFIG_IP_NF_MATCH_CONNTRACK=m +CONFIG_IP_NF_MATCH_OWNER=m +CONFIG_IP_NF_MATCH_ADDRTYPE=m +CONFIG_IP_NF_MATCH_REALM=m +CONFIG_IP_NF_MATCH_SCTP=m +CONFIG_IP_NF_MATCH_COMMENT=m +CONFIG_IP_NF_MATCH_CONNMARK=m +CONFIG_IP_NF_MATCH_HASHLIMIT=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_LOG=m +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_TARGET_TCPMSS=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_NAT_NEEDED=y +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_SAME=m +CONFIG_IP_NF_NAT_SNMP_BASIC=m +CONFIG_IP_NF_NAT_IRC=m +CONFIG_IP_NF_NAT_FTP=m 
+CONFIG_IP_NF_NAT_TFTP=m +CONFIG_IP_NF_NAT_AMANDA=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_TOS=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_DSCP=m +CONFIG_IP_NF_TARGET_MARK=m +CONFIG_IP_NF_TARGET_CLASSIFY=m +CONFIG_IP_NF_TARGET_CONNMARK=m +CONFIG_IP_NF_TARGET_CLUSTERIP=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_TARGET_NOTRACK=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m + +# +# SCTP Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +CONFIG_LLC=y +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set +# CONFIG_NET_SCHED is not set +CONFIG_NET_CLS_ROUTE=y + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +CONFIG_NETPOLL=y +CONFIG_NETPOLL_RX=y +CONFIG_NETPOLL_TRAP=y +CONFIG_NET_POLL_CONTROLLER=y +# CONFIG_HAMRADIO is not set +# CONFIG_IRDA is not set +# CONFIG_BT is not set + +# # Device Drivers # @@ -239,6 +404,7 @@ CONFIG_BLK_DEV_AMD74XX=y # CONFIG_BLK_DEV_HPT366 is not set # CONFIG_BLK_DEV_SC1200 is not set # CONFIG_BLK_DEV_PIIX is not set +# CONFIG_BLK_DEV_IT821X is not set # CONFIG_BLK_DEV_NS87415 is not set # CONFIG_BLK_DEV_PDC202XX_OLD is not set # CONFIG_BLK_DEV_PDC202XX_NEW is not set @@ -272,6 +438,7 @@ CONFIG_CHR_DEV_ST=y CONFIG_BLK_DEV_SR=y CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=y +# CONFIG_CHR_DEV_SCH is not set # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs @@ -369,6 +536,8 @@ CONFIG_DM_MULTIPATH_EMC=m # Fusion MPT device support # # CONFIG_FUSION is not set +# CONFIG_FUSION_SPI is not set +# CONFIG_FUSION_FC is not set # # IEEE 1394 (FireWire) support @@ -382,6 +551,7 @@ CONFIG_IEEE1394=y # CONFIG_IEEE1394_OUI_DB is not set CONFIG_IEEE1394_EXTRA_CONFIG_ROMS=y CONFIG_IEEE1394_CONFIG_ROM_IP1394=y +# CONFIG_IEEE1394_EXPORT_FULL_API is not set # # Device Drivers @@ -412,151 +582,13 @@ CONFIG_IEEE1394_AMDTP=m CONFIG_ADB=y CONFIG_ADB_PMU=y CONFIG_PMAC_SMU=y -# CONFIG_PMAC_PBOOK is not set # CONFIG_PMAC_BACKLIGHT is not set # CONFIG_INPUT_ADBHID is not set CONFIG_THERM_PM72=y # -# Networking support -# -CONFIG_NET=y - -# -# Networking options +# Network device support # -CONFIG_PACKET=y -# CONFIG_PACKET_MMAP is not set -CONFIG_UNIX=y -CONFIG_NET_KEY=m -CONFIG_INET=y -CONFIG_IP_MULTICAST=y -# CONFIG_IP_ADVANCED_ROUTER is not set -# CONFIG_IP_PNP is not set -CONFIG_NET_IPIP=y -# CONFIG_NET_IPGRE is not set -# CONFIG_IP_MROUTE is not set -# CONFIG_ARPD is not set -CONFIG_SYN_COOKIES=y -CONFIG_INET_AH=m -CONFIG_INET_ESP=m -CONFIG_INET_IPCOMP=m -CONFIG_INET_TUNNEL=y -# CONFIG_IP_TCPDIAG is not set -# CONFIG_IP_TCPDIAG_IPV6 is not set - -# -# IP: Virtual Server Configuration -# -# CONFIG_IP_VS is not set -# CONFIG_IPV6 is not set -CONFIG_NETFILTER=y -# CONFIG_NETFILTER_DEBUG is not set - -# -# IP: Netfilter Configuration -# -CONFIG_IP_NF_CONNTRACK=m -CONFIG_IP_NF_CT_ACCT=y -CONFIG_IP_NF_CONNTRACK_MARK=y -CONFIG_IP_NF_CT_PROTO_SCTP=m -CONFIG_IP_NF_FTP=m -CONFIG_IP_NF_IRC=m -CONFIG_IP_NF_TFTP=m -CONFIG_IP_NF_AMANDA=m -CONFIG_IP_NF_QUEUE=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_MATCH_LIMIT=m -CONFIG_IP_NF_MATCH_IPRANGE=m -CONFIG_IP_NF_MATCH_MAC=m -CONFIG_IP_NF_MATCH_PKTTYPE=m -CONFIG_IP_NF_MATCH_MARK=m -CONFIG_IP_NF_MATCH_MULTIPORT=m -CONFIG_IP_NF_MATCH_TOS=m -CONFIG_IP_NF_MATCH_RECENT=m -CONFIG_IP_NF_MATCH_ECN=m -CONFIG_IP_NF_MATCH_DSCP=m -CONFIG_IP_NF_MATCH_AH_ESP=m -CONFIG_IP_NF_MATCH_LENGTH=m 
-CONFIG_IP_NF_MATCH_TTL=m -CONFIG_IP_NF_MATCH_TCPMSS=m -CONFIG_IP_NF_MATCH_HELPER=m -CONFIG_IP_NF_MATCH_STATE=m -CONFIG_IP_NF_MATCH_CONNTRACK=m -CONFIG_IP_NF_MATCH_OWNER=m -CONFIG_IP_NF_MATCH_ADDRTYPE=m -CONFIG_IP_NF_MATCH_REALM=m -CONFIG_IP_NF_MATCH_SCTP=m -CONFIG_IP_NF_MATCH_COMMENT=m -CONFIG_IP_NF_MATCH_CONNMARK=m -CONFIG_IP_NF_MATCH_HASHLIMIT=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_LOG=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_TARGET_TCPMSS=m -CONFIG_IP_NF_NAT=m -CONFIG_IP_NF_NAT_NEEDED=y -CONFIG_IP_NF_TARGET_MASQUERADE=m -CONFIG_IP_NF_TARGET_REDIRECT=m -CONFIG_IP_NF_TARGET_NETMAP=m -CONFIG_IP_NF_TARGET_SAME=m -CONFIG_IP_NF_NAT_SNMP_BASIC=m -CONFIG_IP_NF_NAT_IRC=m -CONFIG_IP_NF_NAT_FTP=m -CONFIG_IP_NF_NAT_TFTP=m -CONFIG_IP_NF_NAT_AMANDA=m -CONFIG_IP_NF_MANGLE=m -CONFIG_IP_NF_TARGET_TOS=m -CONFIG_IP_NF_TARGET_ECN=m -CONFIG_IP_NF_TARGET_DSCP=m -CONFIG_IP_NF_TARGET_MARK=m -CONFIG_IP_NF_TARGET_CLASSIFY=m -CONFIG_IP_NF_TARGET_CONNMARK=m -CONFIG_IP_NF_TARGET_CLUSTERIP=m -CONFIG_IP_NF_RAW=m -CONFIG_IP_NF_TARGET_NOTRACK=m -CONFIG_IP_NF_ARPTABLES=m -CONFIG_IP_NF_ARPFILTER=m -CONFIG_IP_NF_ARP_MANGLE=m -CONFIG_XFRM=y -CONFIG_XFRM_USER=m - -# -# SCTP Configuration (EXPERIMENTAL) -# -# CONFIG_IP_SCTP is not set -# CONFIG_ATM is not set -# CONFIG_BRIDGE is not set -# CONFIG_VLAN_8021Q is not set -# CONFIG_DECNET is not set -CONFIG_LLC=y -# CONFIG_LLC2 is not set -# CONFIG_IPX is not set -# CONFIG_ATALK is not set -# CONFIG_X25 is not set -# CONFIG_LAPB is not set -# CONFIG_NET_DIVERT is not set -# CONFIG_ECONET is not set -# CONFIG_WAN_ROUTER is not set - -# -# QoS and/or fair queueing -# -# CONFIG_NET_SCHED is not set -CONFIG_NET_CLS_ROUTE=y - -# -# Network testing -# -# CONFIG_NET_PKTGEN is not set -CONFIG_NETPOLL=y -CONFIG_NETPOLL_RX=y -CONFIG_NETPOLL_TRAP=y -CONFIG_NET_POLL_CONTROLLER=y -# CONFIG_HAMRADIO is not set -# CONFIG_IRDA is not set -# CONFIG_BT is not set CONFIG_NETDEVICES=y CONFIG_DUMMY=m CONFIG_BONDING=m @@ -616,6 +648,7 
@@ CONFIG_E1000=y # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set # CONFIG_R8169 is not set +# CONFIG_SKGE is not set # CONFIG_SK98LIN is not set # CONFIG_VIA_VELOCITY is not set CONFIG_TIGON3=y @@ -823,50 +856,19 @@ CONFIG_I2C_KEYWEST=y # CONFIG_I2C_VIAPRO is not set # CONFIG_I2C_VOODOO3 is not set # CONFIG_I2C_PCA_ISA is not set - -# -# Hardware Sensors Chip support -# # CONFIG_I2C_SENSOR is not set -# CONFIG_SENSORS_ADM1021 is not set -# CONFIG_SENSORS_ADM1025 is not set -# CONFIG_SENSORS_ADM1026 is not set -# CONFIG_SENSORS_ADM1031 is not set -# CONFIG_SENSORS_ASB100 is not set -# CONFIG_SENSORS_DS1621 is not set -# CONFIG_SENSORS_FSCHER is not set -# CONFIG_SENSORS_FSCPOS is not set -# CONFIG_SENSORS_GL518SM is not set -# CONFIG_SENSORS_GL520SM is not set -# CONFIG_SENSORS_IT87 is not set -# CONFIG_SENSORS_LM63 is not set -# CONFIG_SENSORS_LM75 is not set -# CONFIG_SENSORS_LM77 is not set -# CONFIG_SENSORS_LM78 is not set -# CONFIG_SENSORS_LM80 is not set -# CONFIG_SENSORS_LM83 is not set -# CONFIG_SENSORS_LM85 is not set -# CONFIG_SENSORS_LM87 is not set -# CONFIG_SENSORS_LM90 is not set -# CONFIG_SENSORS_LM92 is not set -# CONFIG_SENSORS_MAX1619 is not set -# CONFIG_SENSORS_PC87360 is not set -# CONFIG_SENSORS_SMSC47B397 is not set -# CONFIG_SENSORS_SIS5595 is not set -# CONFIG_SENSORS_SMSC47M1 is not set -# CONFIG_SENSORS_VIA686A is not set -# CONFIG_SENSORS_W83781D is not set -# CONFIG_SENSORS_W83L785TS is not set -# CONFIG_SENSORS_W83627HF is not set # -# Other I2C Chip support +# Miscellaneous I2C Chip support # # CONFIG_SENSORS_DS1337 is not set +# CONFIG_SENSORS_DS1374 is not set # CONFIG_SENSORS_EEPROM is not set # CONFIG_SENSORS_PCF8574 is not set +# CONFIG_SENSORS_PCA9539 is not set # CONFIG_SENSORS_PCF8591 is not set # CONFIG_SENSORS_RTC8564 is not set +# CONFIG_SENSORS_MAX6875 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set @@ -878,6 +880,11 @@ CONFIG_I2C_KEYWEST=y # 
CONFIG_W1 is not set # +# Hardware Monitoring support +# +# CONFIG_HWMON is not set + +# # Misc devices # @@ -988,6 +995,7 @@ CONFIG_USB_DEVICEFS=y CONFIG_USB_EHCI_HCD=y # CONFIG_USB_EHCI_SPLIT_ISO is not set # CONFIG_USB_EHCI_ROOT_HUB_TT is not set +# CONFIG_USB_ISP116X_HCD is not set CONFIG_USB_OHCI_HCD=y # CONFIG_USB_OHCI_BIG_ENDIAN is not set CONFIG_USB_OHCI_LITTLE_ENDIAN=y @@ -1024,12 +1032,15 @@ CONFIG_USB_HIDINPUT=y CONFIG_USB_HIDDEV=y # CONFIG_USB_AIPTEK is not set # CONFIG_USB_WACOM is not set +# CONFIG_USB_ACECAD is not set # CONFIG_USB_KBTAB is not set # CONFIG_USB_POWERMATE is not set # CONFIG_USB_MTOUCH is not set +# CONFIG_USB_ITMTOUCH is not set # CONFIG_USB_EGALAX is not set # CONFIG_USB_XPAD is not set # CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_KEYSPAN_REMOTE is not set # # USB Imaging devices @@ -1081,10 +1092,11 @@ CONFIG_USB_PEGASUS=y # CONFIG_USB_PHIDGETSERVO is not set # CONFIG_USB_IDMOUSE is not set # CONFIG_USB_SISUSBVGA is not set +# CONFIG_USB_LD is not set # CONFIG_USB_TEST is not set # -# USB ATM/DSL drivers +# USB DSL modem support # # @@ -1101,18 +1113,25 @@ CONFIG_USB_PEGASUS=y # InfiniBand support # CONFIG_INFINIBAND=m +CONFIG_INFINIBAND_USER_VERBS=m CONFIG_INFINIBAND_MTHCA=m # CONFIG_INFINIBAND_MTHCA_DEBUG is not set CONFIG_INFINIBAND_IPOIB=m # CONFIG_INFINIBAND_IPOIB_DEBUG is not set # +# SN Devices +# + +# # File systems # CONFIG_EXT2_FS=y CONFIG_EXT2_FS_XATTR=y CONFIG_EXT2_FS_POSIX_ACL=y CONFIG_EXT2_FS_SECURITY=y +CONFIG_EXT2_FS_XIP=y +CONFIG_FS_XIP=y CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y @@ -1144,6 +1163,7 @@ CONFIG_XFS_SECURITY=y CONFIG_XFS_POSIX_ACL=y # CONFIG_MINIX_FS is not set # CONFIG_ROMFS_FS is not set +CONFIG_INOTIFY=y # CONFIG_QUOTA is not set CONFIG_DNOTIFY=y CONFIG_AUTOFS_FS=y @@ -1174,7 +1194,6 @@ CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y -# CONFIG_DEVFS_FS is not set CONFIG_DEVPTS_FS_XATTR=y CONFIG_DEVPTS_FS_SECURITY=y 
CONFIG_TMPFS=y @@ -1206,15 +1225,20 @@ CONFIG_CRAMFS=y # CONFIG_NFS_FS=y CONFIG_NFS_V3=y +CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y # CONFIG_NFS_DIRECTIO is not set CONFIG_NFSD=m +CONFIG_NFSD_V2_ACL=y CONFIG_NFSD_V3=y +CONFIG_NFSD_V3_ACL=y CONFIG_NFSD_V4=y CONFIG_NFSD_TCP=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_EXPORTFS=m +CONFIG_NFS_ACL_SUPPORT=y +CONFIG_NFS_COMMON=y CONFIG_SUNRPC=y CONFIG_SUNRPC_GSS=y CONFIG_RPCSEC_GSS_KRB5=y From kumar.gala at freescale.com Thu Jul 14 06:47:04 2005 From: kumar.gala at freescale.com (Kumar Gala) Date: Wed, 13 Jul 2005 15:47:04 -0500 Subject: do_gettimeofday() Message-ID: <1CCFA65E-D79D-4E5D-9980-6170E8CB1616@freescale.com> I was wondering if anyone could explain how it is that the ppc64 do_gettimeofday() code doesn't need xtime_lock. I'm trying to mimic some of the vDSO gettimeofday() work in ppc32 and am having a hard time trying to understand the evolution of time.c in ppc32 vs ppc64. thanks - kumar From linas at austin.ibm.com Thu Jul 14 08:42:44 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 13 Jul 2005 17:42:44 -0500 Subject: [PATCH 2.6.13-rc1 03/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <1121213938.31924.406.camel@gaston> References: <42CB63B2.6000505@jp.fujitsu.com> <42CB664E.1050003@jp.fujitsu.com> <20050712195120.GE26607@austin.ibm.com> <1121213938.31924.406.camel@gaston> Message-ID: <20050713224244.GM26607@austin.ibm.com> Hi, Yes, but ... On Wed, Jul 13, 2005 at 10:18:57AM +1000, Benjamin Herrenschmidt was heard to remark: > > > Are you assuming that a device driver will use an iochk_read() for > > every DMA operation? for every MMIO to the card? > > > > For high performance devices, it seems to me that this will cause > > a rather large performance burden, especially if its envisioned that > > all architectures will do something similar. 
> > > > My concern is that (at least on ppc64) the call pci_read_config_word() > > requires a call into "firmware" aka "BIOS", which takes thousands upon > > thousands of cpu cycles. There are hundreds of cycles of gratuitous > > crud just to get into the firmware, and then lord-knows-what the > > firmware does while it's in there; probably doing all sorts of crazy > > math to compute bus addresses and other arcane things. I would imagine > > that most architectures, including ia64, are similar. > > > > Thus, one wouldn't want to perform an iochk_read() in this way unless > > one was already pretty sure that an error had already occurred ... > > > > Am I misunderstanding something? > > I would expect pSeries not to use the "default" error checking (that > tests the status register) but rather use EEH. OK, it wasn't clear to me if every possible case of the "detected parity error" bit being set on the pci adapter is converted into an EEH error. I had the impression that the adapter can set the bit, but not signal a #PERR, and thus have no EEH event. I am investigating this now. If a given device driver is expecting iochk_read() to catch this situation, then we'd be screwed. --linas From olh at suse.de Fri Jul 15 04:00:16 2005 From: olh at suse.de (Olaf Hering) Date: Thu, 14 Jul 2005 20:00:16 +0200 Subject: [PATCH] hide CONFIG_ADB on ppc64 Message-ID: <20050714180016.GA27392@suse.de> This bites me all day when I use our default config for ppc64. We use a patch to fix the compile errors and provide the CONFIG_MAC_EMUMOUSEBTN functionality (which is behind CONFIG_INPUT_ADBHID). But Benh doesn't like it. http://ozlabs.org/pipermail/linuxppc64-dev/2005-March/003423.html Just hide all the ADB parts from via-pmu on ppc64 instead.
drivers/macintosh/adbhid.c: In function `adbhid_init': drivers/macintosh/adbhid.c:1199: error: `_MACH_chrp' undeclared (first use in this function) drivers/macintosh/adbhid.c:1199: error: (Each undeclared identifier is reported only once drivers/macintosh/adbhid.c:1199: error: for each function it appears in.) drivers/macintosh/Kconfig | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6.13-rc3-olh/drivers/macintosh/Kconfig =================================================================== --- linux-2.6.13-rc3-olh.orig/drivers/macintosh/Kconfig +++ linux-2.6.13-rc3-olh/drivers/macintosh/Kconfig @@ -4,7 +4,7 @@ menu "Macintosh device drivers" config ADB bool "Apple Desktop Bus (ADB) support" - depends on MAC || PPC_PMAC + depends on MAC || (PPC_PMAC && PPC32) help Apple Desktop Bus (ADB) support is for support of devices which are connected to an ADB port. ADB devices tend to have 4 pins. From christoph at lameter.com Fri Jul 15 03:24:33 2005 From: christoph at lameter.com (Christoph Lameter) Date: Thu, 14 Jul 2005 10:24:33 -0700 (PDT) Subject: RFC: Hugepage COW In-Reply-To: <20050707055554.GC11246@localhost.localdomain> References: <20050707055554.GC11246@localhost.localdomain> Message-ID: On Thu, 7 Jul 2005, David Gibson wrote: > Now that the hugepage code has been consolidated across the > architectures, it becomes much easier to implement copy-on-write. > Hugepage COW is of limited utility of itself, however, it is > essentially a prerequisite for any of a number of methods of allowing > userland programs to automatically use hugepages without code changes > e.g. hugepage malloc() libraries, implicit hugepage mmap(), hugepage > ELF segments. For certain applications (particularly enormous HPC > FORTRAN programs), these can result in a large performance > improvement. > > Thoughts? Flames? Great stuff. I am glad that you are cleaning up the hugepages and are making progress improving them. 
What are your thoughts on implementing fault handling for huge pages? From david at gibson.dropbear.id.au Fri Jul 15 11:14:28 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 15 Jul 2005 11:14:28 +1000 Subject: RFC: Hugepage COW In-Reply-To: References: <20050707055554.GC11246@localhost.localdomain> Message-ID: <20050715011428.GC7750@localhost.localdomain> On Thu, Jul 14, 2005 at 10:24:33AM -0700, Christoph Lameter wrote: > On Thu, 7 Jul 2005, David Gibson wrote: > > > Now that the hugepage code has been consolidated across the > > architectures, it becomes much easier to implement copy-on-write. > > Hugepage COW is of limited utility of itself, however, it is > > essentially a prerequisite for any of a number of methods of allowing > > userland programs to automatically use hugepages without code changes > > e.g. hugepage malloc() libraries, implicit hugepage mmap(), hugepage > > ELF segments. For certain applications (particularly enormous HPC > > FORTRAN programs), these can result in a large performance > > improvement. > > > > Thoughts? Flames? > > Great stuff. I am glad that you are cleaning up the hugepages and are > making progress improving them. What are your thoughts on implementing > fault handling for huge pages? Well, the COW patch implements a fault handler, obviously. What specifically where you thinking about? -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_!
http://www.ozlabs.org/people/dgibson From christoph at lameter.com Fri Jul 15 12:00:11 2005 From: christoph at lameter.com (Christoph Lameter) Date: Thu, 14 Jul 2005 19:00:11 -0700 (PDT) Subject: RFC: Hugepage COW In-Reply-To: <20050715011428.GC7750@localhost.localdomain> References: <20050707055554.GC11246@localhost.localdomain> <20050715011428.GC7750@localhost.localdomain> Message-ID: On Fri, 15 Jul 2005, David Gibson wrote: > Well, the COW patch implements a fault handler, obviously. What > specifically where you thinking about? About a fault handler of course and about surrounding scalability issues. I worked on some hugepage related patches last fall. Have you had a look at the work of Ken, Ray and me on the subject? From david at gibson.dropbear.id.au Fri Jul 15 14:21:46 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 15 Jul 2005 14:21:46 +1000 Subject: RFC: Hugepage COW In-Reply-To: References: <20050707055554.GC11246@localhost.localdomain> <20050715011428.GC7750@localhost.localdomain> Message-ID: <20050715042146.GE7750@localhost.localdomain> On Thu, Jul 14, 2005 at 07:00:11PM -0700, Christoph Lameter wrote: > On Fri, 15 Jul 2005, David Gibson wrote: > > > Well, the COW patch implements a fault handler, obviously. What > > specifically where you thinking about? > > About a fault handler of course and about surrounding scalability issues. > I worked on some hugepage related patches last fall. Have you had a look > at the work of Ken, Ray and me on the subject? I'm still not at all sure what you're getting at. Do you mean the demand-allocation patches which were floating around at some point - I gather they're important for doing sensible NUMA allocation of hugepages. They have a small overlap with the COW code, in the fault handler, but not much. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/people/dgibson From christoph at lameter.com Fri Jul 15 14:34:26 2005 From: christoph at lameter.com (Christoph Lameter) Date: Thu, 14 Jul 2005 21:34:26 -0700 (PDT) Subject: RFC: Hugepage COW In-Reply-To: <20050715042146.GE7750@localhost.localdomain> References: <20050707055554.GC11246@localhost.localdomain> <20050715011428.GC7750@localhost.localdomain> <20050715042146.GE7750@localhost.localdomain> Message-ID: On Fri, 15 Jul 2005, David Gibson wrote: > I'm still not at all sure what you're getting at. Do you mean the > demand-allocation patches which were floating around at some point - I > gather they're important for doing sensible NUMA allocation of > hugepages. They have a small overlap with the COW code, in the fault > handler, but not much. Yes I meant that. I do not have time right now but I will be trying to contribute to this if things slow down a bit. Keep me posted. From david at gibson.dropbear.id.au Fri Jul 15 17:19:24 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 15 Jul 2005 17:19:24 +1000 Subject: PATCH: Add memreserve to DTC In-Reply-To: <1121371427.24467.34.camel@cashmere.sps.mot.com> References: <1120859097.8609.15.camel@cashmere.sps.mot.com> <20050711045532.GC32545@localhost.localdomain> <1121116950.15394.14.camel@cashmere.sps.mot.com> <20050712040623.GG3945@localhost.localdomain> <1121371427.24467.34.camel@cashmere.sps.mot.com> Message-ID: <20050715071924.GA16797@localhost.localdomain> On Thu, Jul 14, 2005 at 03:03:47PM -0500, Jon Loeliger wrote: > On Mon, 2005-07-11 at 23:06, David Gibson wrote: > > On Mon, Jul 11, 2005 at 04:22:30PM -0500, Jon Loeliger wrote: > > > On Sun, 2005-07-10 at 23:55, David Gibson wrote: > > > > On Fri, Jul 08, 2005 at 04:44:58PM -0500, Jon Loeliger wrote: > > [snip] > > > > Biggest thing is that rather than passing the tree itself and the > > > > memreserve info about as two parameters all over the place, I'd rather > > > > create a new structure which has both (and later 
can have anything > > > > else that might be needed). > > > > > > If you'd like, I'll do this work. > > > > That would be helpful. You'll need to rediff, though, I merged a > > couple of bugfixes from your patch that weren't directly related to > > the memreserve stuff. > > David, > > Here is an updated version of the patch that obsoletes > the previous one I submitted. I have incorporated all > of your syntactic suggestions except not using the > split-64 values (ie, this still uses 'struct data'). > It primarily merges in the changes that you adopted > from earlier and implements a new structure at the > base of the parse tree to hold both the device tree > and the header information. I called that new structure > 'struct header_tree'. Feel free to dream up something > better. :-) Ok, I've merged this, although I've tweaked things substantially in the process. I did rename "header_tree" to "boot_info", moved some things around, and changed the syntax. Reserve ranges can now be specified either as an address and length: /memreserve/ 10000000 00002000; or as an (inclusive) address range: /memreserve/ 10000000-10001fff; I am a bit worried that those two forms may be hard to distinguish at a glance. Any suggestions for changes to the syntax soon please, I'd really like to keep the source syntax as stable as possible. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_!
http://www.ozlabs.org/people/dgibson From jdl at freescale.com Sat Jul 16 00:30:58 2005 From: jdl at freescale.com (Jon Loeliger) Date: Fri, 15 Jul 2005 09:30:58 -0500 Subject: PATCH: Add memreserve to DTC In-Reply-To: <20050715071924.GA16797@localhost.localdomain> References: <1120859097.8609.15.camel@cashmere.sps.mot.com> <20050711045532.GC32545@localhost.localdomain> <1121116950.15394.14.camel@cashmere.sps.mot.com> <20050712040623.GG3945@localhost.localdomain> <1121371427.24467.34.camel@cashmere.sps.mot.com> <20050715071924.GA16797@localhost.localdomain> Message-ID: <1121437857.24864.12.camel@cashmere.sps.mot.com> On Fri, 2005-07-15 at 02:19, David Gibson wrote: > > Ok, I've merged this, Excellent, thanks! > although I've tweaked things substantially in the process. No problem. > I did rename "header_tree" to "boot_info", moved some Oh, good! > things around, and changed the syntax. Reserve ranges can now be > specified either as an address and length: > > /memreserve/ 10000000 00002000; > > or as an (inclusive) address range: > > /memreserve/ 10000000-10001fff; > > I am a bit worried that those two forms may be hard to distinguish at > a glance. Any sugggestions for changes to the syntax soon please, I'd > really like to keep the source syntax as stable as possible. Oh man. With syntax you can demystify those in any number of ways. Just a matter of what you are wanting. You can always add sugar: /memreserve_block/ 10000000 00002000; /memreserve_range/ 10000000 10001fff; /memreserve/ 10000000 /for/ 2000; // or /size/ ? /memreserve/ 10000000 /through/ 10001fff; /memreserve/ 10000000 00002000; /memreserve/ [10000000, 10001fff]; // or [10000000, 10002000)? Stuff like that maybe? 
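For readers comparing the two /memreserve/ forms discussed in this thread, here is a minimal sketch in C of how both could be normalized to the same (address, size) pair. It is not taken from dtc; struct reserve_entry, reserve_from_length() and reserve_from_range() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical normalized form of one reserve entry: base plus size. */
struct reserve_entry {
	uint64_t address;
	uint64_t size;
};

/* "/memreserve/ 10000000 00002000;" is the address-and-length form. */
struct reserve_entry reserve_from_length(uint64_t addr, uint64_t len)
{
	struct reserve_entry re = { addr, len };
	return re;
}

/*
 * "/memreserve/ 10000000-10001fff;" is the inclusive range form.
 * A range covering the whole 64-bit space would overflow the size;
 * this sketch does not handle that case.
 */
struct reserve_entry reserve_from_range(uint64_t start, uint64_t end)
{
	struct reserve_entry re;

	assert(end >= start);		/* a range must not run backwards */
	re.address = start;
	re.size = end - start + 1;	/* +1 because 'end' is itself reserved */
	return re;
}
```

Under these assumptions the two example lines from the thread describe the same 0x2000-byte region starting at 0x10000000, which is why the forms are easy to confuse when written with bare hex numbers.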
jdl From hollis at penguinppc.org Sat Jul 16 04:42:16 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Fri, 15 Jul 2005 14:42:16 -0400 Subject: OLS 2005 attendees Message-ID: <6d05c36b7a666bf9c04c9125d028e2b9@penguinppc.org> As Ottawa Linux Symposium 2005 approaches, I've made a wiki page listing some of the PowerPC people attending, preseeding it with some people I already know will be there: http://oss.gonicus.de/openpower/index.php/LinuxSymposium2005 If you're going, you can add your info to the list and maybe we can all get together... -Hollis From sfr at canb.auug.org.au Fri Jul 15 13:09:33 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Fri, 15 Jul 2005 13:09:33 +1000 Subject: RFC: IOMMU bypass Message-ID: <20050715130933.492ac904.sfr@canb.auug.org.au> Hi all, We (Anton Blanchard and others) have been trying to figure out the best (or any) way to allow for IOMMU bypass when setting up DMA mappings on particular devices. Our current idea is to hang a structure of pointers to DMA mapping operations off the struct device and inherit it from the device's parent. This would allow for per-bus (rather than per-bus_type) mapping operations and also allow a driver to override the bus's operations for a particular device. Does this make sense? Comments (hopefully constructive) please. Is there a better/simpler/more sensible way to do this?
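To make the idea above concrete, here is a rough user-space sketch in C of per-device DMA operations inherited from a parent. It is only an illustration under stated assumptions: struct toy_device, struct dma_mapping_ops, toy_device_add() and the two mapping functions are invented names, not the kernel's struct device or any proposed API, and the "IOMMU translation" is a made-up offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down table of DMA mapping operations. */
struct dma_mapping_ops {
	const char *name;
	unsigned long (*map)(unsigned long phys);
};

/* Toy stand-in for struct device: a parent link plus an ops pointer. */
struct toy_device {
	struct toy_device *parent;
	const struct dma_mapping_ops *dma_ops;	/* NULL until set or inherited */
};

/* Pretend IOMMU translation: relocate into a made-up bus window. */
unsigned long iommu_map(unsigned long phys)
{
	return phys + 0x80000000UL;
}

/* Bypass: the bus address is simply the physical address. */
unsigned long direct_map(unsigned long phys)
{
	return phys;
}

const struct dma_mapping_ops iommu_ops = { "iommu", iommu_map };
const struct dma_mapping_ops bypass_ops = { "bypass", direct_map };

/*
 * On registration a device inherits its parent's ops, unless a driver
 * has already installed its own (the per-device override).
 */
void toy_device_add(struct toy_device *dev, struct toy_device *parent)
{
	dev->parent = parent;
	if (!dev->dma_ops && parent)
		dev->dma_ops = parent->dma_ops;
}

/* Generic entry point: dispatch through whatever ops the device has. */
unsigned long toy_dma_map(struct toy_device *dev, unsigned long phys)
{
	assert(dev->dma_ops && dev->dma_ops->map);
	return dev->dma_ops->map(phys);
}
```

In this sketch a device that sets dma_ops to &bypass_ops before toy_device_add() keeps the bypass mapping, while its siblings on the same bus pick up the bus's IOMMU ops. Whether inheritance should happen at registration time or be looked up on every mapping call is a design choice the RFC leaves open.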
-- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From grundler at parisc-linux.org Tue Jul 19 05:21:16 2005 From: grundler at parisc-linux.org (Grant Grundler) Date: Mon, 18 Jul 2005 13:21:16 -0600 Subject: [PATCH 2.6.13-rc1 05/10] IOCHK interface for I/O error handling/detecting In-Reply-To: <42CB680E.2010103@jp.fujitsu.com> References: <42CB63B2.6000505@jp.fujitsu.com> <42CB680E.2010103@jp.fujitsu.com> Message-ID: <20050718192116.GB11016@colo.lackof.org> On Wed, Jul 06, 2005 at 02:11:42PM +0900, Hidetoshi Seto wrote: > [This is 5 of 10 patches, "iochk-05-check_bridge.patch"] ... > It means that A or B hits a bus error, but there is no data > which one actually hits the error. So, C should notify the > error to both of A and B, and clear the H's status to start > its own I/Os. > > If there are only two devices, it become more simple. It is > clear if one find a bridge error while another is check-in, > the error is nothing except for another's. Sorry, I don't understand this last paragraph. I don't see how it's more simple with two devices (vs three) if we don't exactly know which device caused the error. I thought one still needed to reset/restart both devices. Is that correct? The devices operate asynchronously from the drivers. Only the driver can tell us for sure if IO was in flight for a particular device and decide that a device could NOT have generated an error. Otherwise, so far, the patches look fine to me.
thanks, grant From santil at us.ibm.com Tue Jul 19 06:05:32 2005 From: santil at us.ibm.com (Santiago Leon) Date: Mon, 18 Jul 2005 15:05:32 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (2/2 - virtual drivers) Message-ID: <42DC0B8C.5070104@us.ibm.com> Part 2/2 of software-suspend-2 for ppc64 (virtual drivers changes)... Comments and suggestions are always welcome... Signed-off-by: Santiago Leon -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ssusp2.1.9.9-ppc64-virtual_drivers.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050718/bec63d7d/attachment.txt From olof at lixom.net Tue Jul 19 08:05:54 2005 From: olof at lixom.net (Olof Johansson) Date: Mon, 18 Jul 2005 17:05:54 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (2/2 - virtual drivers) In-Reply-To: <42DC0B8C.5070104@us.ibm.com> References: <42DC0B8C.5070104@us.ibm.com> Message-ID: <20050718220553.GA11634@austin.ibm.com> Hi Santi, Are the VIO drivers maintained outside of the kernel somewhere as well? If they're not, then there isn't really much use in having driver version numbers, right? -Olof > diff -urN corig/drivers/net/ibmveth.c c/drivers/net/ibmveth.c > --- corig/drivers/net/ibmveth.c 2005-07-14 16:55:44.000000000 -0500 > +++ c/drivers/net/ibmveth.c 2005-07-15 08:54:04.000000000 -0500 > @@ -105,7 +105,7 @@ > > static const char ibmveth_driver_name[] = "ibmveth"; > static const char ibmveth_driver_string[] = "IBM i/pSeries Virtual Ethernet Driver"; > -#define ibmveth_driver_version "1.03" > +#define ibmveth_driver_version "1.04" > > MODULE_AUTHOR("Santiago Leon "); > MODULE_DESCRIPTION("IBM i/pSeries Virtual Ethernet Driver"); [...] 
> diff -urN corig/drivers/scsi/ibmvscsi/ibmvscsi.c c/drivers/scsi/ibmvscsi/ibmvscsi.c > --- corig/drivers/scsi/ibmvscsi/ibmvscsi.c 2005-07-14 16:55:47.000000000 -0500 > +++ c/drivers/scsi/ibmvscsi/ibmvscsi.c 2005-07-15 08:54:20.000000000 -0500 > @@ -87,7 +87,7 @@ > static int init_timeout = 5; > static int max_requests = 50; > > -#define IBMVSCSI_VERSION "1.5.5" > +#define IBMVSCSI_VERSION "1.5.6" > > MODULE_DESCRIPTION("IBM Virtual SCSI"); > MODULE_AUTHOR("Dave Boutcher"); From santil at us.ibm.com Tue Jul 19 09:36:12 2005 From: santil at us.ibm.com (Santiago Leon) Date: Mon, 18 Jul 2005 18:36:12 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (2/2 - virtual drivers) In-Reply-To: <20050718220553.GA11634@austin.ibm.com> References: <42DC0B8C.5070104@us.ibm.com> <20050718220553.GA11634@austin.ibm.com> Message-ID: <42DC3CEC.4050807@us.ibm.com> Olof... We keep version numbers because there used to be some backwards-compatibility issues in these drivers... To make matters worse, the drivers are a little different in the SLES9, RHEL4, and mainline trees (sometimes the distros were hesitant to apply some updating patches)... The latter is something that we definitely want to avoid, and we're working on it... Olof Johansson wrote: > Hi Santi, > > Are the VIO drivers maintained outside of the kernel somewhere as well? > If they're not, then there isn't really much use in having driver > version numbers, right? 
> > > -Olof > > >>diff -urN corig/drivers/net/ibmveth.c c/drivers/net/ibmveth.c >>--- corig/drivers/net/ibmveth.c 2005-07-14 16:55:44.000000000 -0500 >>+++ c/drivers/net/ibmveth.c 2005-07-15 08:54:04.000000000 -0500 >>@@ -105,7 +105,7 @@ >> >> static const char ibmveth_driver_name[] = "ibmveth"; >> static const char ibmveth_driver_string[] = "IBM i/pSeries Virtual Ethernet Driver"; >>-#define ibmveth_driver_version "1.03" >>+#define ibmveth_driver_version "1.04" >> >> MODULE_AUTHOR("Santiago Leon "); >> MODULE_DESCRIPTION("IBM i/pSeries Virtual Ethernet Driver"); > > > [...] > > >>diff -urN corig/drivers/scsi/ibmvscsi/ibmvscsi.c c/drivers/scsi/ibmvscsi/ibmvscsi.c >>--- corig/drivers/scsi/ibmvscsi/ibmvscsi.c 2005-07-14 16:55:47.000000000 -0500 >>+++ c/drivers/scsi/ibmvscsi/ibmvscsi.c 2005-07-15 08:54:20.000000000 -0500 >>@@ -87,7 +87,7 @@ >> static int init_timeout = 5; >> static int max_requests = 50; >> >>-#define IBMVSCSI_VERSION "1.5.5" >>+#define IBMVSCSI_VERSION "1.5.6" >> >> MODULE_DESCRIPTION("IBM Virtual SCSI"); >> MODULE_AUTHOR("Dave Boutcher"); > > -- Santiago A. Leon Power Linux Development IBM Linux Technology Center From santil at us.ibm.com Tue Jul 19 09:41:37 2005 From: santil at us.ibm.com (Santiago Leon) Date: Mon, 18 Jul 2005 18:41:37 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) Message-ID: <42DC3E31.2080100@us.ibm.com> Resending (1/2) because it looks like the mailing list ate it the first time... These patches add support for software-suspend-2 on the ppc64 platform... I have tested and works nicely on a PAPR pSeries partition with virtual console, disk , and network (hvc_console, ibmvscsi, and ibmveth, respectively) hibernating to a file... It applies to the latest ssusp2 development version (2.1.9.9 which applies to the 2.6.12.2 kernel)... Comments and suggestions are always welcome... Signed-off-by: Santiago Leon -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: ssusp2.1.9.9-ppc64-core.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050718/bd49ea10/attachment.txt From ntl at pobox.com Tue Jul 19 09:52:21 2005 From: ntl at pobox.com (Nathan Lynch) Date: Mon, 18 Jul 2005 18:52:21 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) In-Reply-To: <42DC3E31.2080100@us.ibm.com> References: <42DC3E31.2080100@us.ibm.com> Message-ID: <20050718235221.GB17865@otto> Santiago Leon wrote: > +static inline void move_stack_to_nonconflicing_area(void) ^^^^ nonconflicting? From santil at us.ibm.com Tue Jul 19 06:05:24 2005 From: santil at us.ibm.com (Santiago Leon) Date: Mon, 18 Jul 2005 15:05:24 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) Message-ID: <42DC0B84.7060109@us.ibm.com> These patches add support for software-suspend-2 on the ppc64 platform... I have tested and works nicely on a PAPR pSeries partition with virtual console, disk , and network (hvc_console, ibmvscsi, and ibmveth, respectively) hibernating to a file... It applies to the latest ssusp2 development version (2.1.9.9 which applies to the 2.6.12.2 kernel)... Comments and suggestions are always welcome... Signed-off-by: Santiago Leon -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ssusp2.1.9.9-ppc64-core.patch Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050718/ec8cef40/attachment.txt From santil at us.ibm.com Tue Jul 19 10:12:10 2005 From: santil at us.ibm.com (Santiago Leon) Date: Mon, 18 Jul 2005 19:12:10 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) In-Reply-To: <20050718235221.GB17865@otto> References: <42DC3E31.2080100@us.ibm.com> <20050718235221.GB17865@otto> Message-ID: <42DC455A.2090606@us.ibm.com> >>+static inline void move_stack_to_nonconflicing_area(void) > > ^^^^ > nonconflicting? Wow... the i386 and ppc files have the same spelling error... good catch!... -- Santiago A. 
Leon Power Linux Development IBM Linux Technology Center From sfr at canb.auug.org.au Tue Jul 19 10:58:09 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 19 Jul 2005 10:58:09 +1000 Subject: [PATCH] Software Suspend 2 for ppc64 (2/2 - virtual drivers) In-Reply-To: <42DC0B8C.5070104@us.ibm.com> References: <42DC0B8C.5070104@us.ibm.com> Message-ID: <20050719105809.25c7c601.sfr@canb.auug.org.au> On Mon, 18 Jul 2005 15:05:32 -0500 Santiago Leon wrote: > > Part 2/2 of software-suspend-2 for ppc64 (virtual drivers changes)... > > Comments and suggestions are always welcome... > > Signed-off-by: Santiago Leon Please put patches inline - it makes it easier for people to comment ... > diff -urN corig/arch/ppc64/kernel/vio.c c/arch/ppc64/kernel/vio.c > --- corig/arch/ppc64/kernel/vio.c 2005-07-14 16:55:41.000000000 -0500 > +++ c/arch/ppc64/kernel/vio.c 2005-07-14 20:02:51.000000000 -0500 > @@ -632,9 +632,32 @@ > return 0; > } > > +static int vio_device_suspend(struct device * dev, pm_message_t state) > +{ > + struct vio_dev * vio_dev = to_vio_dev(dev); > + struct vio_driver *drv = to_vio_driver(dev->driver); > + > + if (drv && drv->suspend) > + return drv->suspend(vio_dev, state); > + else ^^^^ else not needed. > + return 0; > +} > + > +static int vio_device_resume(struct device * dev) > +{ > + struct vio_dev * vio_dev = to_vio_dev(dev); > + struct vio_driver *drv = to_vio_driver(dev->driver); > + > + if (drv && drv->resume) > + drv->resume(vio_dev); ^ Shouldn't this have "return" here?
> + return 0; > diff -urN corig/drivers/net/ibmveth.c c/drivers/net/ibmveth.c > --- corig/drivers/net/ibmveth.c 2005-07-14 16:55:44.000000000 -0500 > +++ c/drivers/net/ibmveth.c 2005-07-15 08:54:04.000000000 -0500 > static struct vio_driver ibmveth_driver = { > .name = (char *)ibmveth_driver_name, > .id_table = ibmveth_device_table, > .probe = ibmveth_probe, > - .remove = ibmveth_remove > + .remove = ibmveth_remove, > + .suspend = ibmveth_suspend, > + .resume = ibmveth_resume You might as well terminate this with a comma ... > diff -urN corig/include/asm-ppc64/vio.h c/include/asm-ppc64/vio.h > --- corig/include/asm-ppc64/vio.h 2005-07-14 16:55:59.000000000 -0500 > +++ c/include/asm-ppc64/vio.h 2005-07-14 20:00:03.000000000 -0500 > @@ -79,7 +82,7 @@ > > static inline struct vio_driver *to_vio_driver(struct device_driver *drv) > { > - return container_of(drv, struct vio_driver, driver); > + return drv ? container_of(drv, struct vio_driver, driver) : NULL; So who is passing a NULL to to_vio_driver? -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From benh at kernel.crashing.org Tue Jul 19 11:24:54 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 19 Jul 2005 11:24:54 +1000 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) In-Reply-To: <42DC3E31.2080100@us.ibm.com> References: <42DC3E31.2080100@us.ibm.com> Message-ID: <1121736294.14393.40.camel@gaston> Hi!
> > + if (try_to_freeze()) { > + signr = 0; > + if (!signal_pending(current)) > + goto no_signal; > + } > + > + if (freezing(current)) { > + try_to_freeze(); > + signr = 0; > + recalc_sigpending(); > + if (!signal_pending(current)) > + goto no_signal; > + } The above looks a bit weird & redundant ... But then, it might be some subtlety with Nigel's updated refrigerator. > + asm volatile ("std 1,%0" : "=m" (s->sp)); > + asm volatile ("std 2,%0" : "=m" (s->r2)); > + asm volatile ("std 12,%0" : "=m" (s->r[0])); > + asm volatile ("std 13,%0" : "=m" (s->r[1])); .../... > + asm volatile ("mftb 4; std 4,%0": "=m" (s->tb)); > + > + /* Save SPRGs */ > + asm volatile ("mfsprg 4,0; std 4,%0 " : "=m" (s->sprg[0])); > + asm volatile ("mfsprg 4,1; std 4,%0 " : "=m" (s->sprg[1])); > + asm volatile ("mfsprg 4,2; std 4,%0 " : "=m" (s->sprg[2])); > + asm volatile ("mfsprg 4,3; std 4,%0 " : "=m" (s->sprg[3])); > + > + /* Save MSR & SDR1 */ > + asm volatile ("mfmsr 4; std 4,%0" : "=m" (s->msr)); > + asm volatile ("mfsdr1 4; std 4,%0": "=m" (s->sdr1)); > +} The above should be in an assembly .S file. Generally, I don't like the way the processor state saving is done separately from the actual low level asm suspend call. It makes little sense. You assume gcc won't play with registers behind your back, which is fairly unsafe. > +void __smp_suspend_lowlevel(void * data) .../... > + __restore_processor_state(suspend2_saved_contexts + > + _smp_processor_id()); > + local_flush_tlb(); I'm not sure calling local_flush_tlb() here makes much sense. You need to do more than just flushing the current batch here. You actually need to invalidate the entire TLB which is not necessarily easy, and you need to take care of the SLB as well. > + /* > + *Save context and go back to idling. > + * Note that we cannot leave the processor > + * here. It must be able to receive IPIs if > + * the LZF compression driver (eg) does a > + * vfree after compressing the kernel etc > + */ Gack ?
Can you explain the above comment a bit more ? Something is playing with kernel virtual space while CPUs are locked into IPIs and/or that kind of horror ? Doesn't seem like a very sane thing to do. > +static inline void move_stack_to_nonconflicing_area(void) > +{ > + unsigned long old_stack, src; > + > + new_stack_page = > + suspend2_get_nonconflicting_pages(get_order(THREAD_SIZE)); > + > + BUG_ON(!new_stack_page); > + > + /* geting stack address */ > + asm volatile ("std %%r1, %0" : "=m" (old_stack)); > + > + src = old_stack & (~(THREAD_SIZE - 1)); > + > + /* Copy stack */ > + memcpy((void*)new_stack_page, (void*)src, THREAD_SIZE); > + > + new_stack_page += (old_stack - src); > + > + /* switch to new stack */ > + asm volatile ("ld %%r1, %0" : "=m" (new_stack_page)); > + > +} In what context is the above called ? You are likely to die a horrible death when moving the stack around if you take an SLB miss at the wrong time. The kernel is careful about always locking the current kernel stack segment in the SLB, and various bits assume that this remains true at all times. In general, you seem to completely ignore the hash table and SLB. That might work by luck: because your "loader" kernel is the same as your "saved" kernel, you'll eventually end up with proper bolted down hash entries, but it's very dodgy. You are also ignoring the iommu and RTAS; you should pray the firmware will give them back to you at the same place. Among others ... Ben. From anton at samba.org Tue Jul 19 11:22:06 2005 From: anton at samba.org (Anton Blanchard) Date: Tue, 19 Jul 2005 11:22:06 +1000 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) In-Reply-To: <42DC3E31.2080100@us.ibm.com> References: <42DC3E31.2080100@us.ibm.com> Message-ID: <20050719012206.GA3164@krispykreme> Hi, > These patches add support for software-suspend-2 on the ppc64 > platform...
I have tested and works nicely on a PAPR pSeries partition > with virtual console, disk , and network (hvc_console, ibmvscsi, and > ibmveth, respectively) hibernating to a file... It applies to the latest > ssusp2 development version (2.1.9.9 which applies to the 2.6.12.2 > kernel)... Cool. ... > + /* Restore TB */ > + asm volatile ("li 3,0; mttbl 3; \n" > + "lwz 3,%0\n; lwz 4,%1\n" > + "mttbu 3; mttbl 4" : > + "=m" (s->tb[0]), > + "=m" (s->tb[1]) : : "r3"); We can't write to the timebase on a partitioned machine. Also it looks like on a non-partitioned machine we will restore the timebase without synchronising it between cpus. We probably need a callback to reset our timebase code on resume. Not sure what we do about userspace gettimeofday; if you suspend at exactly the wrong time I could imagine your time could go crazy for the first call :) Anton From david at gibson.dropbear.id.au Tue Jul 19 11:17:17 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Tue, 19 Jul 2005 11:17:17 +1000 Subject: PATCH: Add memreserve to DTC In-Reply-To: <1121437857.24864.12.camel@cashmere.sps.mot.com> References: <1120859097.8609.15.camel@cashmere.sps.mot.com> <20050711045532.GC32545@localhost.localdomain> <1121116950.15394.14.camel@cashmere.sps.mot.com> <20050712040623.GG3945@localhost.localdomain> <1121371427.24467.34.camel@cashmere.sps.mot.com> <20050715071924.GA16797@localhost.localdomain> <1121437857.24864.12.camel@cashmere.sps.mot.com> Message-ID: <20050719011717.GG24609@localhost.localdomain> On Fri, Jul 15, 2005 at 09:30:58AM -0500, Jon Loeliger wrote: > On Fri, 2005-07-15 at 02:19, David Gibson wrote: > > > > > Ok, I've merged this, > > Excellent, thanks! > > > although I've tweaked things substantially in the process. > > No problem. > > > I did rename "header_tree" to "boot_info", moved some > > Oh, good! > > > things around, and changed the syntax.
Reserve ranges can now be > > specified either as an address and length: > > > > /memreserve/ 10000000 00002000; > > > > or as an (inclusive) address range: > > > > /memreserve/ 10000000-10001fff; > > > > I am a bit worried that those two forms may be hard to distinguish at > > a glance. Any sugggestions for changes to the syntax soon please, I'd > > really like to keep the source syntax as stable as possible. > > Oh man. With syntax you can demystify those in any number > of ways. Just a matter of what you are wanting. You can > always add sugar: > > /memreserve_block/ 10000000 00002000; > /memreserve_range/ 10000000 10001fff; > > /memreserve/ 10000000 /for/ 2000; // or /size/ ? > /memreserve/ 10000000 /through/ 10001fff; > > /memreserve/ 10000000 00002000; > /memreserve/ [10000000, 10001fff]; // or [10000000, 10002000)? > > Stuff like that maybe? Hrm.. don't really like any of those better than what I have already, I'm afraid. It does occur to me that size > base is going to be a very rare situation, so the value of the numbers themselves will act as a reasonable hint as to which form is in use. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From sfr at canb.auug.org.au Tue Jul 19 13:38:53 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Tue, 19 Jul 2005 13:38:53 +1000 Subject: [PATCH] compat: be more consistent about [ug]id_t Message-ID: <20050719133853.7ce48b06.sfr@canb.auug.org.au> Hi all, When I first wrote the compat layer patches, I was somewhat cavalier about the definition of compat_uid_t and compat_gid_t (or maybe I just misunderstood :-)). This patch makes the compat types much more consistent with the types we are being compatible with and hopefully will fix a few bugs along the way. 
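As an aside on why the width of a uid type matters in a compat layer: when a 32-bit uid has to be stored in a 16-bit field, a plain cast silently wraps, so the value has to be checked first. The sketch below is a user-space illustration only; narrow_uid() is a hypothetical helper and the placeholder value 65534 is an assumption chosen for the example (the patch itself relies on the kernel's SET_UID macro for this job).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placeholder reported when a uid does not fit in 16 bits. */
#define TOY_OVERFLOW_UID 65534u

/* Store a 32-bit uid into a 16-bit field without silently wrapping it. */
uint16_t narrow_uid(uint32_t uid)
{
	if (uid > UINT16_MAX)
		return TOY_OVERFLOW_UID;	/* too big: report the placeholder */
	return (uint16_t)uid;			/* fits: keep the value as-is */
}
```

With a bare cast, uid 100000 would turn into 34464 and could alias some unrelated account, which is the kind of bug a consistent set of type definitions helps to avoid.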
	compat type		type in compat arch
	__compat_[ug]id_t	__kernel_[ug]id_t
	__compat_[ug]id32_t	__kernel_[ug]id32_t
	compat_[ug]id_t		[ug]id_t

The difference is that compat_uid_t is always 32 bits (for the archs we care about) but __compat_uid_t may be 16 bits on some.

Signed-off-by: Stephen Rothwell
---
 arch/mips/kernel/linux32.c   |   16 ++++++++--------
 fs/compat.c                  |   16 ++++++++--------
 include/asm-ia64/compat.h    |   20 ++++++++++----------
 include/asm-mips/compat.h    |   10 ++++++----
 include/asm-parisc/compat.h  |   10 ++++++----
 include/asm-ppc64/compat.h   |   18 ++++++++++--------
 include/asm-s390/compat.h    |   20 ++++++++++----------
 include/asm-sparc64/compat.h |   18 ++++++++++--------
 include/asm-x86_64/compat.h  |   20 ++++++++++----------
 include/linux/compat.h       |    3 +++
 ipc/compat.c                 |   12 ++++++------
 11 files changed, 87 insertions(+), 76 deletions(-)

-- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/
diff -ruNp linus/arch/mips/kernel/linux32.c linus-compat_uid_t/arch/mips/kernel/linux32.c --- linus/arch/mips/kernel/linux32.c 2005-06-27 16:08:00.000000000 +1000 +++ linus-compat_uid_t/arch/mips/kernel/linux32.c 2005-06-27 17:40:08.000000000 +1000 @@ -546,20 +546,20 @@ struct msgbuf32 { s32 mtype; char mtext[ struct ipc_perm32 { key_t key; - compat_uid_t uid; - compat_gid_t gid; - compat_uid_t cuid; - compat_gid_t cgid; + __compat_uid_t uid; + __compat_gid_t gid; + __compat_uid_t cuid; + __compat_gid_t cgid; compat_mode_t mode; unsigned short seq; }; struct ipc64_perm32 { key_t key; - compat_uid_t uid; - compat_gid_t gid; - compat_uid_t cuid; - compat_gid_t cgid; + __compat_uid_t uid; + __compat_gid_t gid; + __compat_uid_t cuid; + __compat_gid_t cgid; compat_mode_t mode; unsigned short seq; unsigned short __pad1; diff -ruNp linus/fs/compat.c linus-compat_uid_t/fs/compat.c --- linus/fs/compat.c 2005-07-13 15:13:18.000000000 +1000 +++ linus-compat_uid_t/fs/compat.c 2005-07-13 16:26:29.000000000 +1000 @@ -720,14 +720,14 @@ compat_sys_io_submit(aio_context_t
ctx_i struct compat_ncp_mount_data { compat_int_t version; compat_uint_t ncp_fd; - compat_uid_t mounted_uid; + __compat_uid_t mounted_uid; compat_pid_t wdog_pid; unsigned char mounted_vol[NCP_VOLNAME_LEN + 1]; compat_uint_t time_out; compat_uint_t retry_count; compat_uint_t flags; - compat_uid_t uid; - compat_gid_t gid; + __compat_uid_t uid; + __compat_gid_t gid; compat_mode_t file_mode; compat_mode_t dir_mode; }; @@ -784,9 +784,9 @@ static void *do_ncp_super_data_conv(void struct compat_smb_mount_data { compat_int_t version; - compat_uid_t mounted_uid; - compat_uid_t uid; - compat_gid_t gid; + __compat_uid_t mounted_uid; + __compat_uid_t uid; + __compat_gid_t gid; compat_mode_t file_mode; compat_mode_t dir_mode; }; @@ -1808,8 +1808,8 @@ struct compat_nfsctl_export { compat_dev_t ex32_dev; compat_ino_t ex32_ino; compat_int_t ex32_flags; - compat_uid_t ex32_anon_uid; - compat_gid_t ex32_anon_gid; + __compat_uid_t ex32_anon_uid; + __compat_gid_t ex32_anon_gid; }; struct compat_nfsctl_fdparm { diff -ruNp linus/include/asm-ia64/compat.h linus-compat_uid_t/include/asm-ia64/compat.h --- linus/include/asm-ia64/compat.h 2005-06-27 16:08:06.000000000 +1000 +++ linus-compat_uid_t/include/asm-ia64/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -13,10 +13,10 @@ typedef s32 compat_time_t; typedef s32 compat_clock_t; typedef s32 compat_key_t; typedef s32 compat_pid_t; -typedef u16 compat_uid_t; -typedef u16 compat_gid_t; -typedef u32 compat_uid32_t; -typedef u32 compat_gid32_t; +typedef u16 __compat_uid_t; +typedef u16 __compat_gid_t; +typedef u32 __compat_uid32_t; +typedef u32 __compat_gid32_t; typedef u16 compat_mode_t; typedef u32 compat_ino_t; typedef u16 compat_dev_t; @@ -50,8 +50,8 @@ struct compat_stat { compat_ino_t st_ino; compat_mode_t st_mode; compat_nlink_t st_nlink; - compat_uid_t st_uid; - compat_gid_t st_gid; + __compat_uid_t st_uid; + __compat_gid_t st_gid; compat_dev_t st_rdev; u16 __pad2; u32 st_size; @@ -120,10 +120,10 @@ typedef u32 compat_sigset_word; 
struct compat_ipc64_perm { compat_key_t key; - compat_uid32_t uid; - compat_gid32_t gid; - compat_uid32_t cuid; - compat_gid32_t cgid; + __compat_uid32_t uid; + __compat_gid32_t gid; + __compat_uid32_t cuid; + __compat_gid32_t cgid; unsigned short mode; unsigned short __pad1; unsigned short seq; diff -ruNp linus/include/asm-mips/compat.h linus-compat_uid_t/include/asm-mips/compat.h --- linus/include/asm-mips/compat.h 2005-06-27 16:08:07.000000000 +1000 +++ linus-compat_uid_t/include/asm-mips/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -15,8 +15,10 @@ typedef s32 compat_clock_t; typedef s32 compat_suseconds_t; typedef s32 compat_pid_t; -typedef s32 compat_uid_t; -typedef s32 compat_gid_t; +typedef u32 __compat_uid_t; +typedef u32 __compat_gid_t; +typedef u32 __compat_uid32_t; +typedef u32 __compat_gid32_t; typedef u32 compat_mode_t; typedef u32 compat_ino_t; typedef u32 compat_dev_t; @@ -52,8 +54,8 @@ struct compat_stat { compat_ino_t st_ino; compat_mode_t st_mode; compat_nlink_t st_nlink; - compat_uid_t st_uid; - compat_gid_t st_gid; + __compat_uid32_t st_uid; + __compat_gid32_t st_gid; compat_dev_t st_rdev; s32 st_pad2[2]; compat_off_t st_size; diff -ruNp linus/include/asm-parisc/compat.h linus-compat_uid_t/include/asm-parisc/compat.h --- linus/include/asm-parisc/compat.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-compat_uid_t/include/asm-parisc/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -13,8 +13,10 @@ typedef s32 compat_ssize_t; typedef s32 compat_time_t; typedef s32 compat_clock_t; typedef s32 compat_pid_t; -typedef u32 compat_uid_t; -typedef u32 compat_gid_t; +typedef u32 __compat_uid_t; +typedef u32 __compat_gid_t; +typedef u32 __compat_uid32_t; +typedef u32 __compat_gid32_t; typedef u16 compat_mode_t; typedef u32 compat_ino_t; typedef u32 compat_dev_t; @@ -67,8 +69,8 @@ struct compat_stat { compat_dev_t st_realdev; u16 st_basemode; u16 st_spareshort; - compat_uid_t st_uid; - compat_gid_t st_gid; + __compat_uid32_t st_uid; + __compat_gid32_t 
st_gid; u32 st_spare4[3]; }; diff -ruNp linus/include/asm-ppc64/compat.h linus-compat_uid_t/include/asm-ppc64/compat.h --- linus/include/asm-ppc64/compat.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-compat_uid_t/include/asm-ppc64/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -13,8 +13,10 @@ typedef s32 compat_ssize_t; typedef s32 compat_time_t; typedef s32 compat_clock_t; typedef s32 compat_pid_t; -typedef u32 compat_uid_t; -typedef u32 compat_gid_t; +typedef u32 __compat_uid_t; +typedef u32 __compat_gid_t; +typedef u32 __compat_uid32_t; +typedef u32 __compat_gid32_t; typedef u32 compat_mode_t; typedef u32 compat_ino_t; typedef u32 compat_dev_t; @@ -48,8 +50,8 @@ struct compat_stat { compat_ino_t st_ino; compat_mode_t st_mode; compat_nlink_t st_nlink; - compat_uid_t st_uid; - compat_gid_t st_gid; + __compat_uid32_t st_uid; + __compat_gid32_t st_gid; compat_dev_t st_rdev; compat_off_t st_size; compat_off_t st_blksize; @@ -144,10 +146,10 @@ static inline void __user *compat_alloc_ */ struct compat_ipc64_perm { compat_key_t key; - compat_uid_t uid; - compat_gid_t gid; - compat_uid_t cuid; - compat_gid_t cgid; + __compat_uid_t uid; + __compat_gid_t gid; + __compat_uid_t cuid; + __compat_gid_t cgid; compat_mode_t mode; unsigned int seq; unsigned int __pad2; diff -ruNp linus/include/asm-s390/compat.h linus-compat_uid_t/include/asm-s390/compat.h --- linus/include/asm-s390/compat.h 2005-06-27 16:08:09.000000000 +1000 +++ linus-compat_uid_t/include/asm-s390/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -13,10 +13,10 @@ typedef s32 compat_ssize_t; typedef s32 compat_time_t; typedef s32 compat_clock_t; typedef s32 compat_pid_t; -typedef u16 compat_uid_t; -typedef u16 compat_gid_t; -typedef u32 compat_uid32_t; -typedef u32 compat_gid32_t; +typedef u16 __compat_uid_t; +typedef u16 __compat_gid_t; +typedef u32 __compat_uid32_t; +typedef u32 __compat_gid32_t; typedef u16 compat_mode_t; typedef u32 compat_ino_t; typedef u16 compat_dev_t; @@ -51,8 +51,8 @@ struct compat_stat 
{ compat_ino_t st_ino; compat_mode_t st_mode; compat_nlink_t st_nlink; - compat_uid_t st_uid; - compat_gid_t st_gid; + __compat_uid_t st_uid; + __compat_gid_t st_gid; compat_dev_t st_rdev; u16 __pad2; u32 st_size; @@ -140,10 +140,10 @@ static inline void __user *compat_alloc_ struct compat_ipc64_perm { compat_key_t key; - compat_uid32_t uid; - compat_gid32_t gid; - compat_uid32_t cuid; - compat_gid32_t cgid; + __compat_uid32_t uid; + __compat_gid32_t gid; + __compat_uid32_t cuid; + __compat_gid32_t cgid; compat_mode_t mode; unsigned short __pad1; unsigned short seq; diff -ruNp linus/include/asm-sparc64/compat.h linus-compat_uid_t/include/asm-sparc64/compat.h --- linus/include/asm-sparc64/compat.h 2005-06-27 16:08:10.000000000 +1000 +++ linus-compat_uid_t/include/asm-sparc64/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -12,8 +12,10 @@ typedef s32 compat_ssize_t; typedef s32 compat_time_t; typedef s32 compat_clock_t; typedef s32 compat_pid_t; -typedef u16 compat_uid_t; -typedef u16 compat_gid_t; +typedef u16 __compat_uid_t; +typedef u16 __compat_gid_t; +typedef u32 __compat_uid32_t; +typedef u32 __compat_gid32_t; typedef u16 compat_mode_t; typedef u32 compat_ino_t; typedef u16 compat_dev_t; @@ -47,8 +49,8 @@ struct compat_stat { compat_ino_t st_ino; compat_mode_t st_mode; compat_nlink_t st_nlink; - compat_uid_t st_uid; - compat_gid_t st_gid; + __compat_uid_t st_uid; + __compat_gid_t st_gid; compat_dev_t st_rdev; compat_off_t st_size; compat_time_t st_atime; @@ -177,10 +179,10 @@ static __inline__ void __user *compat_al struct compat_ipc64_perm { compat_key_t key; - __kernel_uid_t uid; - __kernel_gid_t gid; - __kernel_uid_t cuid; - __kernel_gid_t cgid; + __compat_uid32_t uid; + __compat_gid32_t gid; + __compat_uid32_t cuid; + __compat_gid32_t cgid; unsigned short __pad1; compat_mode_t mode; unsigned short __pad2; diff -ruNp linus/include/asm-x86_64/compat.h linus-compat_uid_t/include/asm-x86_64/compat.h --- linus/include/asm-x86_64/compat.h 2005-06-27 
16:08:10.000000000 +1000 +++ linus-compat_uid_t/include/asm-x86_64/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -14,10 +14,10 @@ typedef s32 compat_ssize_t; typedef s32 compat_time_t; typedef s32 compat_clock_t; typedef s32 compat_pid_t; -typedef u16 compat_uid_t; -typedef u16 compat_gid_t; -typedef u32 compat_uid32_t; -typedef u32 compat_gid32_t; +typedef u16 __compat_uid_t; +typedef u16 __compat_gid_t; +typedef u32 __compat_uid32_t; +typedef u32 __compat_gid32_t; typedef u16 compat_mode_t; typedef u32 compat_ino_t; typedef u16 compat_dev_t; @@ -52,8 +52,8 @@ struct compat_stat { compat_ino_t st_ino; compat_mode_t st_mode; compat_nlink_t st_nlink; - compat_uid_t st_uid; - compat_gid_t st_gid; + __compat_uid_t st_uid; + __compat_gid_t st_gid; compat_dev_t st_rdev; u16 __pad2; u32 st_size; @@ -122,10 +122,10 @@ typedef u32 compat_sigset_ struct compat_ipc64_perm { compat_key_t key; - compat_uid32_t uid; - compat_gid32_t gid; - compat_uid32_t cuid; - compat_gid32_t cgid; + __compat_uid32_t uid; + __compat_gid32_t gid; + __compat_uid32_t cuid; + __compat_gid32_t cgid; unsigned short mode; unsigned short __pad1; unsigned short seq; diff -ruNp linus/include/linux/compat.h linus-compat_uid_t/include/linux/compat.h --- linus/include/linux/compat.h 2005-06-27 16:08:11.000000000 +1000 +++ linus-compat_uid_t/include/linux/compat.h 2005-06-27 17:40:08.000000000 +1000 @@ -18,6 +18,9 @@ #define compat_jiffies_to_clock_t(x) \ (((unsigned long)(x) * COMPAT_USER_HZ) / HZ) +typedef __compat_uid32_t compat_uid_t; +typedef __compat_gid32_t compat_gid_t; + struct rusage; struct compat_itimerspec { diff -ruNp linus/ipc/compat.c linus-compat_uid_t/ipc/compat.c --- linus/ipc/compat.c 2005-07-08 15:18:28.000000000 +1000 +++ linus-compat_uid_t/ipc/compat.c 2005-07-08 15:23:00.000000000 +1000 @@ -42,10 +42,10 @@ struct compat_msgbuf { struct compat_ipc_perm { key_t key; - compat_uid_t uid; - compat_gid_t gid; - compat_uid_t cuid; - compat_gid_t cgid; + __compat_uid_t uid; + 
__compat_gid_t gid;
+	__compat_uid_t cuid;
+	__compat_gid_t cgid;
 	compat_mode_t mode;
 	unsigned short seq;
 };
@@ -174,8 +174,8 @@ static inline int __put_compat_ipc_perm(
 		struct compat_ipc_perm __user *up)
 {
 	int err;
-	compat_uid_t u;
-	compat_gid_t g;
+	__compat_uid_t u;
+	__compat_gid_t g;
 
 	err = __put_user(p->key, &up->key);
 	SET_UID(u, p->uid);

From sfr at canb.auug.org.au Wed Jul 20 01:33:08 2005
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Wed, 20 Jul 2005 01:33:08 +1000
Subject: [PATCH] Re: RFC: IOMMU bypass
In-Reply-To: <20050715130933.492ac904.sfr@canb.auug.org.au>
References: <20050715130933.492ac904.sfr@canb.auug.org.au>
Message-ID: <20050720013308.2770521d.sfr@canb.auug.org.au>

Hi again,

On Fri, 15 Jul 2005 13:09:33 +1000 Stephen Rothwell wrote:
>
> We (Anton Blanchard and others) have been trying to figure out the best
> (or any) way to allow for IOMMU bypass when setting up DMA mappings on
> particular devices. Our current idea is to hang a structure of pointers
> to DMA mapping operations off the struct device and inherit it from the
> device's parent. This would allow for per-bus (rather than per-bus_type)
> mapping operations and also allow a driver to override the bus's
> operations for a particular device.
>
> Does this make sense? Comments (hopefully constructive) please.
>
> Is there a better/simpler/more sensible way to do this?

Just to give you all something concrete to attack^Wcomment on, here is a
preliminary patch with the generic work and PPC64 converted. It actually
helps ppc64 more than some others because we already have two different
"busses": pci and vio.

This has been built on both pSeries and iSeries ppc64 but not yet tested.
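For anyone who wants the inheritance idea in isolation before reading the
patch, here is a rough user-space sketch in C. It is not kernel code: the
two structs are cut-down stand-ins for struct device and struct
dma_mapping_ops, and the names device_register_sketch(), get_dma_ops(),
iommu_ops and bypass_ops (and the mask checks they use) are made up purely
for illustration. It only shows the rule "inherit the parent's ops unless
the device already has its own".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct dma_mapping_ops. */
struct dma_mapping_ops {
	const char *name;
	int (*dma_supported)(unsigned long long mask);
};

/* Hypothetical, cut-down stand-in for struct device. */
struct device {
	struct device *parent;
	struct dma_mapping_ops *dma_mapping_ops;
};

/* Illustrative policy only: translated DMA accepts any mask. */
int iommu_dma_supported(unsigned long long mask)
{
	(void)mask;
	return 1;
}

/* Illustrative policy only: bypass needs a full 64-bit mask. */
int bypass_dma_supported(unsigned long long mask)
{
	return mask == ~0ull;
}

struct dma_mapping_ops iommu_ops = { "iommu", iommu_dma_supported };
struct dma_mapping_ops bypass_ops = { "bypass", bypass_dma_supported };

/*
 * Same rule as the hunk added to drivers/base/core.c: a device that has
 * not set its own ops inherits them from its parent at registration.
 */
void device_register_sketch(struct device *dev, struct device *parent)
{
	dev->parent = parent;
	if (parent && dev->dma_mapping_ops == NULL)
		dev->dma_mapping_ops = parent->dma_mapping_ops;
}

/* Counterpart of __get_dma_ops(); assert() stands in for BUG_ON(). */
struct dma_mapping_ops *get_dma_ops(struct device *dev)
{
	assert(dev->dma_mapping_ops != NULL);
	return dev->dma_mapping_ops;
}
```

A bus device would be set up with iommu_ops, an ordinary child registered
under it picks those up, and a driver that wants bypass sets bypass_ops on
its device before registering so the inherited default is not applied.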
-- 
Cheers,
Stephen Rothwell sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

diff -ruN linus/arch/ppc64/kernel/Makefile linus-dma_bypass.3/arch/ppc64/kernel/Makefile
--- linus/arch/ppc64/kernel/Makefile	2005-06-27 16:08:00.000000000 +1000
+++ linus-dma_bypass.3/arch/ppc64/kernel/Makefile	2005-06-27 17:41:10.000000000 +1000
@@ -5,7 +5,7 @@
 EXTRA_CFLAGS += -mno-minimal-toc
 extra-y := head.o vmlinux.lds
 
-obj-y := setup.o entry.o traps.o irq.o idle.o dma.o \
+obj-y := setup.o entry.o traps.o irq.o idle.o \
 	time.o process.o signal.o syscalls.o misc.o ptrace.o \
 	align.o semaphore.o bitops.o pacaData.o \
 	udbg.o binfmt_elf32.o sys_ppc32.o ioctl32.o \
diff -ruN linus/arch/ppc64/kernel/bpa_iommu.c linus-dma_bypass.3/arch/ppc64/kernel/bpa_iommu.c
--- linus/arch/ppc64/kernel/bpa_iommu.c	2005-06-27 16:08:00.000000000 +1000
+++ linus-dma_bypass.3/arch/ppc64/kernel/bpa_iommu.c	2005-07-19 14:25:11.000000000 +1000
@@ -359,6 +359,11 @@
 	return mask < 0x100000000ull;
 }
 
+static int bpa_set_dma_mask(struct device *dev, u64 mask)
+{
+	return pci_set_dma_mask(to_pci_dev(dev), mask);
+}
+
 void bpa_init_iommu(void)
 {
 	bpa_map_iommu();
@@ -374,4 +379,5 @@
 	pci_dma_ops.map_sg = bpa_map_sg;
 	pci_dma_ops.unmap_sg = bpa_unmap_sg;
 	pci_dma_ops.dma_supported = bpa_dma_supported;
+	pci_dma_ops.set_dma_mask = bpa_set_dma_mask;
 }
diff -ruN linus/arch/ppc64/kernel/dma.c linus-dma_bypass.3/arch/ppc64/kernel/dma.c
--- linus/arch/ppc64/kernel/dma.c	2005-06-27 16:08:00.000000000 +1000
+++ linus-dma_bypass.3/arch/ppc64/kernel/dma.c	1970-01-01 10:00:00.000000000 +1000
@@ -1,151 +0,0 @@
-/*
- * Copyright (C) 2004 IBM Corporation
- *
- * Implements the generic device dma API for ppc64.
Handles - * the pci and vio busses - */ - -#include -#include -/* Include the busses we support */ -#include -#include -#include -#include - -static struct dma_mapping_ops *get_dma_ops(struct device *dev) -{ -#ifdef CONFIG_PCI - if (dev->bus == &pci_bus_type) - return &pci_dma_ops; -#endif -#ifdef CONFIG_IBMVIO - if (dev->bus == &vio_bus_type) - return &vio_dma_ops; -#endif - return NULL; -} - -int dma_supported(struct device *dev, u64 mask) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - return dma_ops->dma_supported(dev, mask); - BUG(); - return 0; -} -EXPORT_SYMBOL(dma_supported); - -int dma_set_mask(struct device *dev, u64 dma_mask) -{ -#ifdef CONFIG_PCI - if (dev->bus == &pci_bus_type) - return pci_set_dma_mask(to_pci_dev(dev), dma_mask); -#endif -#ifdef CONFIG_IBMVIO - if (dev->bus == &vio_bus_type) - return -EIO; -#endif /* CONFIG_IBMVIO */ - BUG(); - return 0; -} -EXPORT_SYMBOL(dma_set_mask); - -void *dma_alloc_coherent(struct device *dev, size_t size, - dma_addr_t *dma_handle, unsigned int __nocast flag) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - return dma_ops->alloc_coherent(dev, size, dma_handle, flag); - BUG(); - return NULL; -} -EXPORT_SYMBOL(dma_alloc_coherent); - -void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, - dma_addr_t dma_handle) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - dma_ops->free_coherent(dev, size, cpu_addr, dma_handle); - else - BUG(); -} -EXPORT_SYMBOL(dma_free_coherent); - -dma_addr_t dma_map_single(struct device *dev, void *cpu_addr, size_t size, - enum dma_data_direction direction) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - return dma_ops->map_single(dev, cpu_addr, size, direction); - BUG(); - return (dma_addr_t)0; -} -EXPORT_SYMBOL(dma_map_single); - -void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size, - enum dma_data_direction direction) -{ - struct 
dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - dma_ops->unmap_single(dev, dma_addr, size, direction); - else - BUG(); -} -EXPORT_SYMBOL(dma_unmap_single); - -dma_addr_t dma_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction direction) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - return dma_ops->map_single(dev, - (page_address(page) + offset), size, direction); - BUG(); - return (dma_addr_t)0; -} -EXPORT_SYMBOL(dma_map_page); - -void dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size, - enum dma_data_direction direction) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - dma_ops->unmap_single(dev, dma_address, size, direction); - else - BUG(); -} -EXPORT_SYMBOL(dma_unmap_page); - -int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, - enum dma_data_direction direction) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - return dma_ops->map_sg(dev, sg, nents, direction); - BUG(); - return 0; -} -EXPORT_SYMBOL(dma_map_sg); - -void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries, - enum dma_data_direction direction) -{ - struct dma_mapping_ops *dma_ops = get_dma_ops(dev); - - if (dma_ops) - dma_ops->unmap_sg(dev, sg, nhwentries, direction); - else - BUG(); -} -EXPORT_SYMBOL(dma_unmap_sg); diff -ruN linus/arch/ppc64/kernel/pci.c linus-dma_bypass.3/arch/ppc64/kernel/pci.c --- linus/arch/ppc64/kernel/pci.c 2005-06-29 11:08:36.000000000 +1000 +++ linus-dma_bypass.3/arch/ppc64/kernel/pci.c 2005-07-19 15:28:32.000000000 +1000 @@ -71,9 +71,6 @@ LIST_HEAD(hose_list); -struct dma_mapping_ops pci_dma_ops; -EXPORT_SYMBOL(pci_dma_ops); - int global_phb_number; /* Global phb counter */ /* Cached ISA bridge dev. 
*/ diff -ruN linus/arch/ppc64/kernel/pci_direct_iommu.c linus-dma_bypass.3/arch/ppc64/kernel/pci_direct_iommu.c --- linus/arch/ppc64/kernel/pci_direct_iommu.c 2005-06-27 16:08:00.000000000 +1000 +++ linus-dma_bypass.3/arch/ppc64/kernel/pci_direct_iommu.c 2005-07-19 15:35:04.000000000 +1000 @@ -83,6 +83,11 @@ return mask < 0x100000000ull; } +static int pci_direct_set_dma_mask(struct device *dev, u64 mask) +{ + return pci_set_dma_mask(to_pci_dev(dev), mask); +} + void __init pci_direct_iommu_init(void) { pci_dma_ops.alloc_coherent = pci_direct_alloc_coherent; @@ -92,4 +97,5 @@ pci_dma_ops.map_sg = pci_direct_map_sg; pci_dma_ops.unmap_sg = pci_direct_unmap_sg; pci_dma_ops.dma_supported = pci_direct_dma_supported; + pci_dma_ops.set_dma_mask = pci_direct_set_dma_mask; } diff -ruN linus/arch/ppc64/kernel/pci_iommu.c linus-dma_bypass.3/arch/ppc64/kernel/pci_iommu.c --- linus/arch/ppc64/kernel/pci_iommu.c 2005-06-27 16:08:00.000000000 +1000 +++ linus-dma_bypass.3/arch/ppc64/kernel/pci_iommu.c 2005-07-19 15:35:26.000000000 +1000 @@ -127,6 +127,11 @@ return 1; } +static int pci_iommu_set_dma_mask(struct device *dev, u64 mask) +{ + return pci_set_dma_mask(to_pci_dev(dev), mask); +} + void pci_iommu_init(void) { pci_dma_ops.alloc_coherent = pci_iommu_alloc_coherent; @@ -136,4 +141,5 @@ pci_dma_ops.map_sg = pci_iommu_map_sg; pci_dma_ops.unmap_sg = pci_iommu_unmap_sg; pci_dma_ops.dma_supported = pci_iommu_dma_supported; + pci_dma_ops.set_dma_mask = pci_iommu_set_dma_mask; } diff -ruN linus/arch/ppc64/kernel/vio.c linus-dma_bypass.3/arch/ppc64/kernel/vio.c --- linus/arch/ppc64/kernel/vio.c 2005-06-27 16:08:00.000000000 +1000 +++ linus-dma_bypass.3/arch/ppc64/kernel/vio.c 2005-07-19 15:45:05.000000000 +1000 @@ -45,6 +45,8 @@ static struct iommu_table veth_iommu_table; static struct iommu_table vio_iommu_table; #endif +static struct bus_type vio_bus_type; +static struct dma_mapping_ops vio_dma_mapping_ops; static struct vio_dev vio_bus_device = { /* fake "parent" device */ .name = 
vio_bus_device.dev.bus_id, .type = "", @@ -53,6 +55,7 @@ #endif .dev.bus_id = "vio", .dev.bus = &vio_bus_type, + .dev.dma_mapping_ops = &vio_dma_mapping_ops, }; #ifdef CONFIG_PPC_ISERIES @@ -600,7 +603,12 @@ return 1; } -struct dma_mapping_ops vio_dma_ops = { +static int vio_set_dma_mask(struct device *dev, u64 mask) +{ + return -EIO; +} + +static struct dma_mapping_ops vio_dma_mapping_ops = { .alloc_coherent = vio_alloc_coherent, .free_coherent = vio_free_coherent, .map_single = vio_map_single, @@ -608,6 +616,7 @@ .map_sg = vio_map_sg, .unmap_sg = vio_unmap_sg, .dma_supported = vio_dma_supported, + .set_dma_mask = vio_set_dma_mask, }; static int vio_bus_match(struct device *dev, struct device_driver *drv) @@ -629,7 +638,7 @@ return 0; } -struct bus_type vio_bus_type = { +static struct bus_type vio_bus_type = { .name = "vio", .match = vio_bus_match, }; diff -ruN linus/drivers/base/core.c linus-dma_bypass.3/drivers/base/core.c --- linus/drivers/base/core.c 2005-07-01 09:58:50.000000000 +1000 +++ linus-dma_bypass.3/drivers/base/core.c 2005-07-19 15:18:22.000000000 +1000 @@ -251,6 +251,9 @@ if (parent) klist_add_tail(&parent->klist_children, &dev->knode_parent); + if (parent && (dev->dma_mapping_ops == NULL)) + dev->dma_mapping_ops = parent->dma_mapping_ops; + /* notify platform of device entry */ if (platform_notify) platform_notify(dev); diff -ruN linus/drivers/pci/probe.c linus-dma_bypass.3/drivers/pci/probe.c --- linus/drivers/pci/probe.c 2005-07-06 21:18:22.000000000 +1000 +++ linus-dma_bypass.3/drivers/pci/probe.c 2005-07-19 15:43:37.000000000 +1000 @@ -9,6 +9,7 @@ #include #include #include +#include #include "pci.h" #define CARDBUS_LATENCY_TIMER 176 /* secondary latency timer */ @@ -22,6 +23,8 @@ LIST_HEAD(pci_devices); +struct dma_mapping_ops pci_dma_ops; + #ifdef HAVE_PCI_LEGACY /** * pci_create_legacy_files - create legacy I/O port and memory files @@ -915,6 +918,8 @@ dev->parent = parent; dev->release = pci_release_bus_bridge_dev; sprintf(dev->bus_id, 
"pci%04x:%02x", pci_domain_nr(b), bus); + if (parent == NULL) + dev->dma_mapping_ops = &pci_dma_ops; error = device_register(dev); if (error) goto dev_reg_err; diff -ruN linus/include/asm-ppc64/dma-mapping.h linus-dma_bypass.3/include/asm-ppc64/dma-mapping.h --- linus/include/asm-ppc64/dma-mapping.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-dma_bypass.3/include/asm-ppc64/dma-mapping.h 2005-07-19 13:47:08.000000000 +1000 @@ -16,25 +16,72 @@ #define DMA_ERROR_CODE (~(dma_addr_t)0x0) -extern int dma_supported(struct device *dev, u64 mask); -extern int dma_set_mask(struct device *dev, u64 dma_mask); -extern void *dma_alloc_coherent(struct device *dev, size_t size, - dma_addr_t *dma_handle, unsigned int __nocast flag); -extern void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, - dma_addr_t dma_handle); -extern dma_addr_t dma_map_single(struct device *dev, void *cpu_addr, - size_t size, enum dma_data_direction direction); -extern void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, - size_t size, enum dma_data_direction direction); -extern dma_addr_t dma_map_page(struct device *dev, struct page *page, +static inline struct dma_mapping_ops *__get_dma_ops(struct device *dev) +{ + BUG_ON(dev->dma_mapping_ops == NULL); + return dev->dma_mapping_ops; +} + +static inline int dma_supported(struct device *dev, u64 mask) +{ + return dev->dma_mapping_ops && + __get_dma_ops(dev)->dma_supported(dev, mask); +} + +static inline int dma_set_mask(struct device *dev, u64 dma_mask) +{ + return __get_dma_ops(dev)->set_dma_mask(dev, dma_mask); +} + +static inline void *dma_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_handle, unsigned int __nocast flag) +{ + return __get_dma_ops(dev)->alloc_coherent(dev, size, dma_handle, flag); +} + +static inline void dma_free_coherent(struct device *dev, size_t size, + void *cpu_addr, dma_addr_t dma_handle) +{ + __get_dma_ops(dev)->free_coherent(dev, size, cpu_addr, dma_handle); +} + +static inline 
dma_addr_t dma_map_single(struct device *dev, void *cpu_addr, + size_t size, enum dma_data_direction direction) +{ + return __get_dma_ops(dev)->map_single(dev, cpu_addr, size, direction); +} + +static inline void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction direction) +{ + __get_dma_ops(dev)->unmap_single(dev, dma_addr, size, direction); +} + +static inline dma_addr_t dma_map_page(struct device *dev, struct page *page, unsigned long offset, size_t size, - enum dma_data_direction direction); -extern void dma_unmap_page(struct device *dev, dma_addr_t dma_address, - size_t size, enum dma_data_direction direction); -extern int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, - enum dma_data_direction direction); -extern void dma_unmap_sg(struct device *dev, struct scatterlist *sg, - int nhwentries, enum dma_data_direction direction); + enum dma_data_direction direction) +{ + return __get_dma_ops(dev)->map_single(dev, + (page_address(page) + offset), size, direction); +} + +static inline void dma_unmap_page(struct device *dev, dma_addr_t dma_address, + size_t size, enum dma_data_direction direction) +{ + __get_dma_ops(dev)->unmap_single(dev, dma_address, size, direction); +} + +static inline int dma_map_sg(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction direction) +{ + return __get_dma_ops(dev)->map_sg(dev, sg, nents, direction); +} + +static inline void dma_unmap_sg(struct device *dev, struct scatterlist *sg, + int nhwentries, enum dma_data_direction direction) +{ + __get_dma_ops(dev)->unmap_sg(dev, sg, nhwentries, direction); +} static inline void dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size, @@ -113,24 +160,4 @@ /* nothing to do */ } -/* - * DMA operations are abstracted for G5 vs. i/pSeries, PCI vs. 
VIO - */ -struct dma_mapping_ops { - void * (*alloc_coherent)(struct device *dev, size_t size, - dma_addr_t *dma_handle, unsigned int __nocast flag); - void (*free_coherent)(struct device *dev, size_t size, - void *vaddr, dma_addr_t dma_handle); - dma_addr_t (*map_single)(struct device *dev, void *ptr, - size_t size, enum dma_data_direction direction); - void (*unmap_single)(struct device *dev, dma_addr_t dma_addr, - size_t size, enum dma_data_direction direction); - int (*map_sg)(struct device *dev, struct scatterlist *sg, - int nents, enum dma_data_direction direction); - void (*unmap_sg)(struct device *dev, struct scatterlist *sg, - int nents, enum dma_data_direction direction); - int (*dma_supported)(struct device *dev, u64 mask); - int (*dac_dma_supported)(struct device *dev, u64 mask); -}; - #endif /* _ASM_DMA_MAPPING_H */ diff -ruN linus/include/asm-ppc64/pci.h linus-dma_bypass.3/include/asm-ppc64/pci.h --- linus/include/asm-ppc64/pci.h 2005-07-13 11:38:01.000000000 +1000 +++ linus-dma_bypass.3/include/asm-ppc64/pci.h 2005-07-19 15:55:23.000000000 +1000 @@ -66,15 +66,15 @@ extern unsigned int pcibios_assign_all_busses(void); -extern struct dma_mapping_ops pci_dma_ops; - /* For DAC DMA, we currently don't support it by default, but * we let the platform override this */ static inline int pci_dac_dma_supported(struct pci_dev *hwdev,u64 mask) { +#if 0 if (pci_dma_ops.dac_dma_supported) return pci_dma_ops.dac_dma_supported(&hwdev->dev, mask); +#endif return 0; } diff -ruN linus/include/asm-ppc64/vio.h linus-dma_bypass.3/include/asm-ppc64/vio.h --- linus/include/asm-ppc64/vio.h 2005-06-27 16:08:08.000000000 +1000 +++ linus-dma_bypass.3/include/asm-ppc64/vio.h 2005-06-27 17:40:50.000000000 +1000 @@ -57,10 +57,6 @@ int vio_enable_interrupts(struct vio_dev *dev); int vio_disable_interrupts(struct vio_dev *dev); -extern struct dma_mapping_ops vio_dma_ops; - -extern struct bus_type vio_bus_type; - struct vio_device_id { char *type; char *compat; diff -ruN 
linus/include/linux/device.h linus-dma_bypass.3/include/linux/device.h --- linus/include/linux/device.h 2005-07-13 11:38:01.000000000 +1000 +++ linus-dma_bypass.3/include/linux/device.h 2005-07-19 13:44:49.000000000 +1000 @@ -45,6 +45,7 @@ struct device_driver; struct class; struct class_device; +struct dma_mapping_ops; struct bus_type { const char * name; @@ -301,6 +302,7 @@ struct dma_coherent_mem *dma_mem; /* internal for coherent mem override */ + struct dma_mapping_ops *dma_mapping_ops; void (*release)(struct device * dev); }; diff -ruN linus/include/linux/dma-mapping.h linus-dma_bypass.3/include/linux/dma-mapping.h --- linus/include/linux/dma-mapping.h 2005-06-27 16:08:11.000000000 +1000 +++ linus-dma_bypass.3/include/linux/dma-mapping.h 2005-07-19 15:54:50.000000000 +1000 @@ -21,6 +21,28 @@ #define DMA_30BIT_MASK 0x000000003fffffffULL #define DMA_29BIT_MASK 0x000000001fffffffULL +struct scatterlist; + +/* + * DMA operations may be defined per bus or per device. + */ +struct dma_mapping_ops { + void * (*alloc_coherent)(struct device *dev, size_t size, + dma_addr_t *dma_handle, unsigned int __nocast flag); + void (*free_coherent)(struct device *dev, size_t size, + void *vaddr, dma_addr_t dma_handle); + dma_addr_t (*map_single)(struct device *dev, void *ptr, + size_t size, enum dma_data_direction direction); + void (*unmap_single)(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction direction); + int (*map_sg)(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction direction); + void (*unmap_sg)(struct device *dev, struct scatterlist *sg, + int nents, enum dma_data_direction direction); + int (*dma_supported)(struct device *dev, u64 mask); + int (*set_dma_mask)(struct device *dev, u64 mask); +}; + #include /* Backwards compat, remove in 2.7.x */ diff -ruN linus/include/linux/pci.h linus-dma_bypass.3/include/linux/pci.h --- linus/include/linux/pci.h 2005-07-06 21:18:22.000000000 +1000 +++ 
linus-dma_bypass.3/include/linux/pci.h	2005-07-19 15:42:03.000000000 +1000
@@ -18,6 +18,7 @@
 #define LINUX_PCI_H
 
 #include
+#include
 
 /*
  * Under PCI, each device has 256 bytes of configuration address space,
@@ -722,6 +723,12 @@
 extern struct list_head pci_root_buses;	/* list of all known PCI buses */
 extern struct list_head pci_devices;	/* list of all devices */
 
+/*
+ * These DMA mapping ops will be applied to all discovered
+ * pci busses and inherited, by default, by all pci devices.
+ */
+extern struct dma_mapping_ops pci_dma_ops;
+
 void pcibios_fixup_bus(struct pci_bus *);
 int pcibios_enable_device(struct pci_dev *, int mask);
 char *pcibios_setup (char *str);

From ncunningham at cyclades.com Thu Jul 21 03:24:59 2005
From: ncunningham at cyclades.com (Nigel Cunningham)
Date: Thu, 21 Jul 2005 03:24:59 +1000
Subject: [Suspend2-devel] [PATCH] Software Suspend 2 for ppc64 (1/2 - core)
In-Reply-To: <42DC0B84.7060109@us.ibm.com>
References: <42DC0B84.7060109@us.ibm.com>
Message-ID: <1121880298.2233.8.camel@localhost>

Hi.

Thanks for working on ppc64 support. I'll wait for your reply to
Stephen's feedback (and an update) before incorporating them.

Regards,

Nigel

On Tue, 2005-07-19 at 06:05, Santiago Leon wrote:
> These patches add support for software-suspend-2 on the ppc64
> platform... I have tested it and it works nicely on a PAPR pSeries
> partition with virtual console, disk, and network (hvc_console,
> ibmvscsi, and ibmveth, respectively) hibernating to a file... It applies
> to the latest ssusp2 development version (2.1.9.9 which applies to the
> 2.6.12.2 kernel)...
>
> Comments and suggestions are always welcome...
> > Signed-off-by: Santiago Leon > > > > > ______________________________________________________________________ > diff -urN corig/arch/ppc64/Kconfig c/arch/ppc64/Kconfig > --- corig/arch/ppc64/Kconfig 2005-07-14 16:55:40.000000000 -0500 > +++ c/arch/ppc64/Kconfig 2005-07-14 15:55:52.000000000 -0500 > @@ -339,6 +339,12 @@ > > source "drivers/pci/Kconfig" > > +menu "Power management options" > + > +source kernel/power/Kconfig > + > +endmenu > + > config HOTPLUG_CPU > bool "Support for hot-pluggable CPUs" > depends on SMP && EXPERIMENTAL && (PPC_PSERIES || PPC_PMAC) > diff -urN corig/arch/ppc64/kernel/Makefile c/arch/ppc64/kernel/Makefile > --- corig/arch/ppc64/kernel/Makefile 2005-07-14 16:55:40.000000000 -0500 > +++ c/arch/ppc64/kernel/Makefile 2005-07-14 15:55:52.000000000 -0500 > @@ -62,6 +62,7 @@ > obj-$(CONFIG_PPC_MAPLE) += smp-tbsync.o > endif > > +obj-$(CONFIG_SUSPEND2) += suspend2.o > obj-$(CONFIG_ALTIVEC) += vecemu.o vector.o > obj-$(CONFIG_KPROBES) += kprobes.o > > diff -urN corig/arch/ppc64/kernel/rtasd.c c/arch/ppc64/kernel/rtasd.c > --- corig/arch/ppc64/kernel/rtasd.c 2005-07-14 16:55:41.000000000 -0500 > +++ c/arch/ppc64/kernel/rtasd.c 2005-07-14 16:11:46.000000000 -0500 > @@ -413,6 +413,8 @@ > /* Drop hotplug lock, and sleep for the specified delay */ > unlock_cpu_hotplug(); > set_current_state(TASK_INTERRUPTIBLE); > + try_to_freeze(); > + > schedule_timeout(delay); > lock_cpu_hotplug(); > > diff -urN corig/arch/ppc64/kernel/setup.c c/arch/ppc64/kernel/setup.c > --- corig/arch/ppc64/kernel/setup.c 2005-07-14 16:55:41.000000000 -0500 > +++ c/arch/ppc64/kernel/setup.c 2005-07-14 15:55:52.000000000 -0500 > @@ -700,6 +700,8 @@ > > EXPORT_SYMBOL(machine_halt); > > +void (*pm_power_off)(void) = machine_power_off; > + > unsigned long ppc_proc_freq; > unsigned long ppc_tb_freq; > > diff -urN corig/arch/ppc64/kernel/signal.c c/arch/ppc64/kernel/signal.c > --- corig/arch/ppc64/kernel/signal.c 2005-07-14 16:55:41.000000000 -0500 > +++ 
c/arch/ppc64/kernel/signal.c 2005-07-14 16:09:16.000000000 -0500 > @@ -534,6 +534,20 @@ > int signr; > struct k_sigaction ka; > > + if (try_to_freeze()) { > + signr = 0; > + if (!signal_pending(current)) > + goto no_signal; > + } > + > + if (freezing(current)) { > + try_to_freeze(); > + signr = 0; > + recalc_sigpending(); > + if (!signal_pending(current)) > + goto no_signal; > + } > + > /* > * If the current thread is 32 bit - invoke the > * 32 bit signal handling code > @@ -552,6 +566,7 @@ > return handle_signal(signr, &ka, &info, oldset, regs); > } > > +no_signal: > if (TRAP(regs) == 0x0C00) { /* System Call! */ > if ((int)regs->result == -ERESTARTNOHAND || > (int)regs->result == -ERESTARTSYS || > diff -urN corig/arch/ppc64/kernel/suspend2.c c/arch/ppc64/kernel/suspend2.c > --- corig/arch/ppc64/kernel/suspend2.c 1969-12-31 18:00:00.000000000 -0600 > +++ c/arch/ppc64/kernel/suspend2.c 2005-07-14 20:01:33.000000000 -0500 > @@ -0,0 +1,164 @@ > +/* > + * Written by Santiago Leon (santil at us.ibm.com) IBM Corp. > + * based on ppc implementation by Hu Gang (hugang at soulinfo.com) > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License > + * as published by the Free Software Foundation; either version > + * 2 of the License, or (at your option) any later version. 
> + */ > + > +#include > +#include > +#include > +#include > + > +#ifdef CONFIG_SMP > +struct saved_context suspend2_saved_contexts[NR_CPUS]; > +#endif > +extern atomic_t suspend_cpu_counter __nosavedata; > + > +inline void __save_processor_state(struct saved_context *s) > +{ > + asm volatile ("std 1,%0" : "=m" (s->sp)); > + asm volatile ("std 2,%0" : "=m" (s->r2)); > + asm volatile ("std 12,%0" : "=m" (s->r[0])); > + asm volatile ("std 13,%0" : "=m" (s->r[1])); > + asm volatile ("std 14,%0" : "=m" (s->r[2])); > + asm volatile ("std 15,%0" : "=m" (s->r[3])); > + asm volatile ("std 16,%0" : "=m" (s->r[4])); > + asm volatile ("std 17,%0" : "=m" (s->r[5])); > + asm volatile ("std 18,%0" : "=m" (s->r[6])); > + asm volatile ("std 19,%0" : "=m" (s->r[7])); > + asm volatile ("std 20,%0" : "=m" (s->r[8])); > + asm volatile ("std 21,%0" : "=m" (s->r[9])); > + asm volatile ("std 22,%0" : "=m" (s->r[10])); > + asm volatile ("std 23,%0" : "=m" (s->r[11])); > + asm volatile ("std 24,%0" : "=m" (s->r[12])); > + asm volatile ("std 25,%0" : "=m" (s->r[13])); > + asm volatile ("std 26,%0" : "=m" (s->r[14])); > + asm volatile ("std 27,%0" : "=m" (s->r[15])); > + asm volatile ("std 28,%0" : "=m" (s->r[16])); > + asm volatile ("std 29,%0" : "=m" (s->r[17])); > + asm volatile ("std 30,%0" : "=m" (s->r[18])); > + asm volatile ("std 31,%0" : "=m" (s->r[19])); > + > + asm volatile ("mftb 4; std 4,%0": "=m" (s->tb)); > + > + /* Save SPRGs */ > + asm volatile ("mfsprg 4,0; std 4,%0 " : "=m" (s->sprg[0])); > + asm volatile ("mfsprg 4,1; std 4,%0 " : "=m" (s->sprg[1])); > + asm volatile ("mfsprg 4,2; std 4,%0 " : "=m" (s->sprg[2])); > + asm volatile ("mfsprg 4,3; std 4,%0 " : "=m" (s->sprg[3])); > + > + /* Save MSR & SDR1 */ > + asm volatile ("mfmsr 4; std 4,%0" : "=m" (s->msr)); > + asm volatile ("mfsdr1 4; std 4,%0": "=m" (s->sdr1)); > +} > + > +inline void __restore_processor_state(struct saved_context *s) > +{ > + /* Restore MSR and SDR1 */ > + asm volatile ("ld 4,%0; mtsdr1 4" : "=m" 
(s->sdr1)); > + asm volatile ("ld 4,%0; mtmsr 4" : "=m" (s->msr)); > + > + /* Restore SPRGs */ > + asm volatile ("ld 4,%0; mtsprg 0,4": "=m" (s->sprg[0])); > + asm volatile ("ld 4,%0; mtsprg 1,4": "=m" (s->sprg[1])); > + asm volatile ("ld 4,%0; mtsprg 2,4": "=m" (s->sprg[2])); > + asm volatile ("ld 4,%0; mtsprg 3,4": "=m" (s->sprg[3])); > + > + /* Restore TB */ > + asm volatile ("li 3,0; mttbl 3; \n" > + "lwz 3,%0\n; lwz 4,%1\n" > + "mttbu 3; mttbl 4" : > + "=m" (s->tb[0]), > + "=m" (s->tb[1]) : : "r3"); > + > + /* Restore the callee-saved registers and return */ > + asm volatile ("ld 12,%0" : "=m" (s->r[0])); > + asm volatile ("ld 13,%0" : "=m" (s->r[1])); > + asm volatile ("ld 14,%0" : "=m" (s->r[2])); > + asm volatile ("ld 15,%0" : "=m" (s->r[3])); > + asm volatile ("ld 16,%0" : "=m" (s->r[4])); > + asm volatile ("ld 17,%0" : "=m" (s->r[5])); > + asm volatile ("ld 18,%0" : "=m" (s->r[6])); > + asm volatile ("ld 19,%0" : "=m" (s->r[7])); > + asm volatile ("ld 20,%0" : "=m" (s->r[8])); > + asm volatile ("ld 21,%0" : "=m" (s->r[9])); > + asm volatile ("ld 22,%0" : "=m" (s->r[10])); > + asm volatile ("ld 23,%0" : "=m" (s->r[11])); > + asm volatile ("ld 24,%0" : "=m" (s->r[12])); > + asm volatile ("ld 25,%0" : "=m" (s->r[13])); > + asm volatile ("ld 26,%0" : "=m" (s->r[14])); > + asm volatile ("ld 27,%0" : "=m" (s->r[15])); > + asm volatile ("ld 28,%0" : "=m" (s->r[16])); > + asm volatile ("ld 29,%0" : "=m" (s->r[17])); > + asm volatile ("ld 30,%0" : "=m" (s->r[18])); > + asm volatile ("ld 31,%0" : "=m" (s->r[19])); > + > + asm volatile ("ld 2,%0" : "=m" (s->r2)); > + asm volatile ("ld 1,%0" : "=m" (s->sp)); > +} > + > + > + > +#ifdef CONFIG_SMP > + > +/* > + * Save and restore processor state for secondary processors. > + * IRQs (and therefore preemption) are already disabled > + * when we enter here (IPI). 
> + */ > +void __smp_suspend_lowlevel(void * data) > +{ > + > + if (test_suspend_state(SUSPEND_NOW_RESUMING)) { > + BUG_ON(!irqs_disabled()); > + atomic_inc(&suspend_cpu_counter); > + /* Only image copied back while we spin in this loop. Our > + * task info should not be looked at while this is happening > + * (which smp_processor_id() will do( */ > + while (test_suspend_state(SUSPEND_FREEZE_SMP)) { > + cpu_relax(); > + barrier(); > + } > + > + while (atomic_read(&suspend_cpu_counter) > + != _smp_processor_id()) { > + cpu_relax(); > + barrier(); > + } > + > + __restore_processor_state(suspend2_saved_contexts + > + _smp_processor_id()); > + local_flush_tlb(); > + atomic_dec(&suspend_cpu_counter); > + } else { /* suspending */ > + BUG_ON(!irqs_disabled()); > + /* > + *Save context and go back to idling. > + * Note that we cannot leave the processor > + * here. It must be able to receive IPIs if > + * the LZF compression driver (eg) does a > + * vfree after compressing the kernel etc > + */ > + while (test_suspend_state(SUSPEND_FREEZE_SMP) && > + (atomic_read(&suspend_cpu_counter) > + != (_smp_processor_id() - 1))) { > + cpu_relax(); > + barrier(); > + } > + __save_processor_state(suspend2_saved_contexts + > + _smp_processor_id()); > + atomic_inc(&suspend_cpu_counter); > + /* Now spin until the atomic copy of the kernel is made. */ > + while (test_suspend_state(SUSPEND_FREEZE_SMP)) { > + cpu_relax(); > + barrier(); > + } > + atomic_dec(&suspend_cpu_counter); > + } > +} > + > +#endif > diff -urN corig/arch/ppc64/kernel/vmlinux.lds.S c/arch/ppc64/kernel/vmlinux.lds.S > --- corig/arch/ppc64/kernel/vmlinux.lds.S 2005-07-14 16:55:41.000000000 -0500 > +++ c/arch/ppc64/kernel/vmlinux.lds.S 2005-07-14 15:55:52.000000000 -0500 > @@ -101,6 +101,13 @@ > > > /* Read/write sections */ > + > + . = ALIGN(4096); > + __nosave_begin = .; > + .data_nosave : { *(.data.nosave) } > + . = ALIGN(4096); > + __nosave_end = .; > + > . 
= ALIGN(16384); > /* The initial task and kernel stack */ > .data.init_task : { > diff -urN corig/arch/ppc64/mm/init.c c/arch/ppc64/mm/init.c > --- corig/arch/ppc64/mm/init.c 2005-07-14 16:55:41.000000000 -0500 > +++ c/arch/ppc64/mm/init.c 2005-07-14 15:55:52.000000000 -0500 > @@ -39,6 +39,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -461,6 +462,7 @@ > addr = (unsigned long)__init_begin; > for (; addr < (unsigned long)__init_end; addr += PAGE_SIZE) { > ClearPageReserved(virt_to_page(addr)); > + ClearPageNosave(virt_to_page(addr)); > set_page_count(virt_to_page(addr), 1); > free_page(addr); > totalram_pages++; > @@ -476,6 +478,7 @@ > printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10); > for (; start < end; start += PAGE_SIZE) { > ClearPageReserved(virt_to_page(start)); > + ClearPageNosave(virt_to_page(start)); > set_page_count(virt_to_page(start), 1); > free_page(start); > totalram_pages++; > @@ -730,8 +733,13 @@ > for_each_pgdat(pgdat) { > for (i = 0; i < pgdat->node_spanned_pages; i++) { > page = pgdat->node_mem_map + i; > + void* addr = pfn_to_kaddr(page_to_pfn(page)); > if (PageReserved(page)) > reservedpages++; > + if (addr >= (void *)&__nosave_begin > + && addr < (void *)&__nosave_end) > + SetPageNosave(virt_to_page(addr)); > + > } > } > > diff -urN corig/include/asm-ppc64/suspend2.h c/include/asm-ppc64/suspend2.h > --- corig/include/asm-ppc64/suspend2.h 1969-12-31 18:00:00.000000000 -0600 > +++ c/include/asm-ppc64/suspend2.h 2005-07-14 20:01:54.000000000 -0500 > @@ -0,0 +1,91 @@ > +/* > + * Written by Santiago Leon (santil at us.ibm.com) IBM Corp. > + * based on ppc implementation by Hu Gang (hugang at soulinfo.com) > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public License > + * as published by the Free Software Foundation; either version > + * 2 of the License, or (at your option) any later version. 
> + */ > + > +/* image of the saved processor states */ > +struct saved_context { > + u32 cr; > + u64 lr, sp, r2; > + u64 r[20]; /* r12 - r31 */ > + u64 sprg[4]; > + u64 msr, sdr1; > + u32 tb[2]; > +}; > + > +inline void __save_processor_state(struct saved_context *s); > +inline void __restore_processor_state(struct saved_context *s); > + > +static struct saved_context suspend_saved_context; > +static unsigned long new_stack_page; > +extern atomic_t suspend_cpu_counter __nosavedata; > + > +static inline void suspend2_save_processor_context(void) > +{ > + __save_processor_state(&suspend_saved_context); > +} > + > +static inline void suspend2_restore_processor_context(void) > +{ > + __restore_processor_state(&suspend_saved_context); > + > + enable_kernel_fp(); > +} > + > +static inline void suspend2_pre_copy(void) > +{ > +} > + > +static inline void suspend2_post_copy(void) > +{ > +} > + > +static inline void suspend2_pre_copyback(void) > +{ > +} > + > +static inline void suspend2_post_copyback(void) > +{ > + /* Get other CPUs to restore their contexts and flush their tlbs. 
*/ > + clear_suspend_state(SUSPEND_FREEZE_SMP); > + > + do { > + cpu_relax(); > + barrier(); > + } while (atomic_read(&suspend_cpu_counter)); > + > +} > + > +static inline void suspend2_flush_caches(void) > +{ > +} > + > + > +static inline void move_stack_to_nonconflicing_area(void) > +{ > + unsigned long old_stack, src; > + > + new_stack_page = > + suspend2_get_nonconflicting_pages(get_order(THREAD_SIZE)); > + > + BUG_ON(!new_stack_page); > + > + /* geting stack address */ > + asm volatile ("std %%r1, %0" : "=m" (old_stack)); > + > + src = old_stack & (~(THREAD_SIZE - 1)); > + > + /* Copy stack */ > + memcpy((void*)new_stack_page, (void*)src, THREAD_SIZE); > + > + new_stack_page += (old_stack - src); > + > + /* switch to new stack */ > + asm volatile ("ld %%r1, %0" : "=m" (new_stack_page)); > + > +} > diff -urN corig/include/asm-ppc64/tlbflush.h c/include/asm-ppc64/tlbflush.h > --- corig/include/asm-ppc64/tlbflush.h 2005-07-14 16:55:59.000000000 -0500 > +++ c/include/asm-ppc64/tlbflush.h 2005-07-14 15:55:52.000000000 -0500 > @@ -39,6 +39,7 @@ > put_cpu_var(ppc64_tlb_batch); > } > > +#define local_flush_tlb() flush_tlb_pending() > #define flush_tlb_mm(mm) flush_tlb_pending() > #define flush_tlb_page(vma, addr) flush_tlb_pending() > #define flush_tlb_page_nohash(vma, addr) do { } while (0) > diff -urN corig/kernel/power/driver_model.c c/kernel/power/driver_model.c > --- corig/kernel/power/driver_model.c 2005-07-14 17:02:22.000000000 -0500 > +++ c/kernel/power/driver_model.c 2005-07-14 16:13:20.000000000 -0500 > @@ -9,6 +9,7 @@ > */ > > #include > +#include > #include "driver_model.h" > #include "power_off.h" > > > > ______________________________________________________________________ > _______________________________________________ > Suspend2-devel mailing list > Suspend2-devel at lists.suspend2.net > http://lists.suspend2.net/mailman/listinfo/suspend2-devel -- Evolution. Enumerate the requirements. Consider the interdependencies. Calculate the probabilities. 
From santil at us.ibm.com Thu Jul 21 04:54:57 2005 From: santil at us.ibm.com (Santiago Leon) Date: Wed, 20 Jul 2005 13:54:57 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) In-Reply-To: <1121736294.14393.40.camel@gaston> References: <42DC3E31.2080100@us.ibm.com> <1121736294.14393.40.camel@gaston> Message-ID: <42DE9E01.3010200@us.ibm.com> Ben... thanks for the comments... >>+ if (try_to_freeze()) { >>+ signr = 0; >>+ if (!signal_pending(current)) >>+ goto no_signal; >>+ } >>+ >>+ if (freezing(current)) { >>+ try_to_freeze(); >>+ signr = 0; >>+ recalc_sigpending(); >>+ if (!signal_pending(current)) >>+ goto no_signal; >>+ } > > > The above looks a bit weird & redundant ... But then, it might be some > subtelty with Nigel's updated refrigerator. This was copied from the ppc version and I couldn't find a reason to change it... >>+ /* Save MSR & SDR1 */ >>+ asm volatile ("mfmsr 4; std 4,%0" : "=m" (s->msr)); >>+ asm volatile ("mfsdr1 4; std 4,%0": "=m" (s->sdr1)); >>+} > > The above should be in an assembly .S file. again, I copied this from the ppc version, but you're right, it doesn't make sense to have a function with all assembly code in a .c file... > Generally, I don't like the > way the processor state saving is done separately from the actual low > level asm suspend call. Do you mean that you don't like the separation between __save_processor_state() and __smp_suspend_lowlevel()?... > It makes little sense. You assume gcc won't play > with registers behind your back, fairly unsafe. yeah... gcc *did* play with the registers on me and I fixed that by adding a constraint to one of the asm calls... pretty silly of me... >>+ __restore_processor_state(suspend2_saved_contexts + >>+ _smp_processor_id()); >>+ local_flush_tlb(); > > > I'm not sure calling local_flush_tlb() here makes much sense. You need > to do more than just flushing the current batch here. Yeah, I'm aware... 
it's just that the thing worked before I started playing with this function (which I copied from i386), so I went ahead and released it to get people to play with it... > You actually need > to invalidate the entire TLB which is not necessarily easy, why is it "not necessarily easy"?... isn't executing a tlbia enough? > and you need to take care of the SLB as well. yep... >>+ /* >>+ *Save context and go back to idling. >>+ * Note that we cannot leave the processor >>+ * here. It must be able to receive IPIs if >>+ * the LZF compression driver (eg) does a >>+ * vfree after compressing the kernel etc >>+ */ > Gack ? Can you explain the above comment a bit more ? Something is > playing with kernel virtual space while CPUs are locked into IPIs and/or > that kind of horror ? Doesn't seem like a very sane thing to do. Sorry, I can't :)... This was copied from i386 with comments and all... >>+static inline void move_stack_to_nonconflicing_area(void) >>+{ >>+ unsigned long old_stack, src; ... > In what context is the above called ? You are likely to die an horrible > death when moving the stack around if you take an SLB miss at the wrong > time. The kernel is careful about always locking the current kernel > stack segment in SLB, various bits make assumption that remains true at > all time. Now that I realize, this function is not called by the latest ssusp2 core (I did the development on an earlier version)... so i guess it's a moot point... > In general, you seem to completely ignore the hash table and SLB. That > might work by luck, because your "loader" kernel is the same as your > "saved" kernel, you'll eventually end up with proper bolted down hash > entries, but it's very dodgy. yeah I know it is luck that it works without messing with the htab and slb... I'll fix that... > You are also ignoring the iommu, Well... when the drivers suspend, they should unmap all their dmaable buffers and structs... 
when they resume, they have to map them, so I'm not sure if it is necessary to do anything else with the iommu... > , and RTAs, yeah, I'll make sure that RTAS gets instantiated again... From santil at us.ibm.com Thu Jul 21 05:06:36 2005 From: santil at us.ibm.com (Santiago Leon) Date: Wed, 20 Jul 2005 14:06:36 -0500 Subject: [PATCH] Software Suspend 2 for ppc64 (2/2 - virtual drivers) In-Reply-To: <20050719105809.25c7c601.sfr@canb.auug.org.au> References: <20050719105809.25c7c601.sfr@canb.auug.org.au> Message-ID: <42DEA0BC.40803@us.ibm.com> >>+ if (drv && drv->suspend) >>+ return drv->suspend(vio_dev, state); >>+ else > > ^^^^ > else not needed. ok, will remove it... >>+ if (drv && drv->resume) >>+ drv->resume(vio_dev); > > ^ > Shouldn't this have "return" here? yes it should... >>+ .suspend = ibmveth_suspend, >>+ .resume = ibmveth_resume > > You might as well terminate this with a comma ... yep... >> static inline struct vio_driver *to_vio_driver(struct device_driver *drv) >> { >>- return container_of(drv, struct vio_driver, driver); >>+ return drv ? container_of(drv, struct vio_driver, driver) : NULL; > > > So who is passing a NULL to to_vio_driver? vio_device_suspend() calls to_vio_driver() for every device in /vdevice, and in there there are devices IBM,sp@4000, nvram@4002, and rtc@4001 that don't have a vio_driver... Thanks for the comments... -- Santiago A. Leon Power Linux Development IBM Linux Technology Center From glikely at gmail.com Thu Jul 21 05:23:29 2005 From: glikely at gmail.com (Grant Likely) Date: Wed, 20 Jul 2005 15:23:29 -0400 Subject: OLS 2005 attendees In-Reply-To: <52642f46a4231ff4111424b0565ddcde@penguinppc.org> References: <52642f46a4231ff4111424b0565ddcde@penguinppc.org> Message-ID: <528646bc050720122361ec0551@mail.gmail.com> For OLS2005 attendees: We are planning to hold a BOF for embedded PPC tomorrow for all who are interested. (Thursday July 21). Meet in the downstairs at 7:00pm Cheers, g. 
From benh at kernel.crashing.org Thu Jul 21 06:22:06 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 21 Jul 2005 06:22:06 +1000 Subject: [PATCH] Software Suspend 2 for ppc64 (1/2 - core) In-Reply-To: <42DE9E01.3010200@us.ibm.com> References: <42DC3E31.2080100@us.ibm.com> <1121736294.14393.40.camel@gaston> <42DE9E01.3010200@us.ibm.com> Message-ID: <1121890927.14393.80.camel@gaston> > > Do you mean that you don't like the separation between > __save_processor_state() and __smp_suspend_lowlevel()?... Yup, I'm not really a fan of it. > > It makes little sense. You assume gcc won't play > > with registers behind your back, fairly unsafe. > > yeah... gcc *did* play with the registers on me and I fixed that by > adding a constraint to one of the asm calls... pretty silly of me... > > >>+ __restore_processor_state(suspend2_saved_contexts + > >>+ _smp_processor_id()); > >>+ local_flush_tlb(); > > > > > > I'm not sure calling local_flush_tlb() here makes much sense. You need > > to do more than just flushing the current batch here. > > Yeah, I'm aware... it's just that the thing worked before I started > playing with this function (which I copied from i386), so I went ahead > and released it to get people to play with it... Ok, you are probably just getting lucky that things are similar between the "loader" kernel and the resumed kernel :) You need to be more careful about it though to be really reliable and you certainly want to force restoring the SLB > > You actually need > > to invalidate the entire TLB which is not necessarily easy, > > why is it "not necessarily easy"?... isn't executing a tlbia enough? tlbia is not implemented on most CPUs > > and you need to take care of the SLB as well. > > yep... > > >>+ /* > >>+ *Save context and go back to idling. > >>+ * Note that we cannot leave the processor > >>+ * here. 
It must be able to receive IPIs if > >>+ * the LZF compression driver (eg) does a > >>+ * vfree after compressing the kernel etc > >>+ */ > > Gack ? Can you explain the above comment a bit more ? Something is > > playing with kernel virtual space while CPUs are locked into IPIs and/or > > that kind of horror ? Doesn't seem like a very sane thing to do. > > Sorry, I can't :)... This was copied from i386 with comments and all... Hrmph... I'll have to poke Nigel then. > >>+static inline void move_stack_to_nonconflicing_area(void) > >>+{ > >>+ unsigned long old_stack, src; > ... > > In what context is the above called ? You are likely to die an horrible > > death when moving the stack around if you take an SLB miss at the wrong > > time. The kernel is careful about always locking the current kernel > > stack segment in SLB, various bits make assumption that remains true at > > all time. > > Now that I realize, this function is not called by the latest ssusp2 > core (I did the development on an earlier version)... so i guess it's a > moot point... Ok, good. > > In general, you seem to completely ignore the hash table and SLB. That > > might work by luck, because your "loader" kernel is the same as your > > "saved" kernel, you'll eventually end up with proper bolted down hash > > entries, but it's very dodgy. > > yeah I know it is luck that it works without messing with the htab and > slb... I'll fix that... > > > You are also ignoring the iommu, > Well... when the drivers suspend, they should unmap all their dmaable > buffers and structs No, not necessarily. At least it hasn't been a requirement so far... Oh, and you may need to work on the interrupt controller too. > ... when they resume, they have to map them, so I'm > not sure if it is necessary to do anything else the iommu... > > > , and RTAs, > yeah, I'll make sure that RTAS gets instantiated again... 
Well, it does get instantiated by the loader kernel, you just have to be extra careful about -where- it gets instantiated.... Or just maybe save/restore the RTAS instance along with the kernel... not sure if that works though, a bit gross but might be ok. Ben. From glikely at gmail.com Thu Jul 21 07:51:21 2005 From: glikely at gmail.com (Grant Likely) Date: Wed, 20 Jul 2005 17:51:21 -0400 Subject: OLS 2005 attendees In-Reply-To: <528646bc050720122361ec0551@mail.gmail.com> References: <52642f46a4231ff4111424b0565ddcde@penguinppc.org> <528646bc050720122361ec0551@mail.gmail.com> Message-ID: <528646bc0507201451271b68d5@mail.gmail.com> Change of plans; we are going to meet 1/2 hour earlier so we can decide as a group whether to head over to the CELinux Forum or just meet amongst ourselves. The CELinux Forum is holding their BOF at Les Suites at 7:00. g. On 7/20/05, Grant Likely wrote: > For OLS2005 attendees: > > We are planning to hold a BOF for embedded PPC tomorrow for all who > are interested. (Thursday July 21). Meet in the downstairs at 7:00pm > > Cheers, > g. > -- "Why do musicians compose symphonies and poets write poems? They do it because life wouldn't have any meaning for them if they didn't. That's why I draw cartoons. It's my life." -- Charles Schulz From david at gibson.dropbear.id.au Thu Jul 21 14:44:14 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 21 Jul 2005 14:44:14 +1000 Subject: Dynamic segment tables Message-ID: <20050721044414.GD30030@localhost.localdomain> This patch changes the kernel to dynamically allocate segment tables, thereby removing 192k (!) of zeroes from the kernel's static data. If no-one can see any problems with it, I'll push to akpm shortly. PPC64 machines before Power4 need a segment table page allocated for each CPU. Currently these are allocated statically in a big array in head.S for all CPUs. 
The segment tables need to be in the first segment (so do_stab_bolted doesn't take a recursive fault on the stab itself), but other than that there are no constraints which require the stabs for the secondary CPUs to be statically allocated. This patch allocates segment tables dynamically during boot, using lmb_alloc() to ensure they are within the first 256M segment. This reduces the kernel image size by 192k... Tested on RS64 iSeries, and POWER3 pSeries. Signed-off-by: David Gibson Index: working-2.6/arch/ppc64/kernel/head.S =================================================================== --- working-2.6.orig/arch/ppc64/kernel/head.S 2005-07-14 10:57:49.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/head.S 2005-07-21 11:46:47.000000000 +1000 @@ -2131,13 +2131,6 @@ swapper_pg_dir: .space 4096 -#ifdef CONFIG_SMP -/* 1 page segment table per cpu (max 48, cpu0 allocated at STAB0_PHYS_ADDR) */ - .globl stab_array -stab_array: - .space 4096 * 48 -#endif - /* * This space gets a copy of optional info passed to us by the bootstrap * Used to pass parameters into the kernel like root=/dev/sda1, etc. 
Index: working-2.6/arch/ppc64/kernel/smp.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/smp.c 2005-07-06 10:30:12.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/smp.c 2005-07-21 11:46:47.000000000 +1000 @@ -65,8 +65,6 @@ static volatile unsigned int cpu_callin_map[NR_CPUS]; -extern unsigned char stab_array[]; - void smp_call_function_interrupt(void); int smt_enabled_at_boot = 1; @@ -492,19 +490,6 @@ paca[cpu].default_decr = tb_ticks_per_jiffy; - if (!cpu_has_feature(CPU_FTR_SLB)) { - void *tmp; - - /* maximum of 48 CPUs on machines with a segment table */ - if (cpu >= 48) - BUG(); - - tmp = &stab_array[PAGE_SIZE * cpu]; - memset(tmp, 0, PAGE_SIZE); - paca[cpu].stab_addr = (unsigned long)tmp; - paca[cpu].stab_real = virt_to_abs(tmp); - } - /* Make sure callin-map entry is 0 (can be leftover a CPU * hotplug */ Index: working-2.6/arch/ppc64/kernel/setup.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/setup.c 2005-07-14 10:57:49.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/setup.c 2005-07-21 11:46:47.000000000 +1000 @@ -1071,6 +1071,8 @@ irqstack_early_init(); emergency_stack_init(); + stabs_alloc(); + /* set up the bootmem stuff with available memory */ do_init_bootmem(); sparse_init(); Index: working-2.6/arch/ppc64/mm/stab.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/stab.c 2005-06-08 15:46:23.000000000 +1000 +++ working-2.6/arch/ppc64/mm/stab.c 2005-07-21 11:46:47.000000000 +1000 @@ -18,6 +18,8 @@ #include #include #include +#include +#include struct stab_entry { unsigned long esid_data; @@ -224,6 +226,39 @@ extern void slb_initialize(void); /* + * Allocate segment tables for secondary CPUs. These must all go in + * the first (bolted) segment, so that do_stab_bolted won't get a + * recursive segment miss on the segment table itself. 
+ */ +void stabs_alloc(void) +{ + int cpu; + + if (cpu_has_feature(CPU_FTR_SLB)) + return; + + for_each_cpu(cpu) { + unsigned long newstab; + + if (cpu == 0) + continue; /* stab for CPU 0 is statically allocated */ + + newstab = lmb_alloc_base(PAGE_SIZE, PAGE_SIZE, 1< References: <20050721044414.GD30030@localhost.localdomain> Message-ID: <1121923959.14393.127.camel@gaston> > + paca[cpu].stab_addr = newstab; > + paca[cpu].stab_real = virt_to_abs(newstab); > + printk(KERN_DEBUG "Segment table for CPU %d at 0x%lx virtual, 0x%lx absolute\n", cpu, paca[cpu].stab_addr, paca[cpu].stab_real); > + } > +} That one could use some 80 cols wrapping :) Ben. From anton at samba.org Fri Jul 22 06:45:59 2005 From: anton at samba.org (Anton Blanchard) Date: Fri, 22 Jul 2005 06:45:59 +1000 Subject: Dynamic segment tables In-Reply-To: <20050721044414.GD30030@localhost.localdomain> References: <20050721044414.GD30030@localhost.localdomain> Message-ID: <20050721204559.GD24373@krispykreme> > This patch changes the kernel to dynamically allocate segment tables, > thereby removing 192k (!) of zeroes from the kernel's static data. If > no-one can see any problems with it, I'll push to akpm shortly. > > PPC64 machines before Power4 need a segment table page allocated for > each CPU. Currently these are allocated statically in a big array in > head.S for all CPUs. The segment tables need to be in the first > segment (so do_stab_bolted doesn't take a recursive fault on the stab > itself), but other than that there are no constraints which require > the stabs for the secondary CPUs to be statically allocated. > > This patch allocates segment tables dynamically during boot, using > lmb_alloc() to ensure they are within the first 256M segment. This > reduces the kernel image size by 192k... Nice work! 
That 192k always annoyed me :) Anton From david at gibson.dropbear.id.au Fri Jul 22 13:10:23 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 22 Jul 2005 13:10:23 +1000 Subject: [PPC64] Dynamically allocate segment tables Message-ID: <20050722031023.GB22596@localhost.localdomain> PPC64 machines before Power4 need a segment table page allocated for each CPU. Currently these are allocated statically in a big array in head.S for all CPUs. The segment tables need to be in the first segment (so do_stab_bolted doesn't take a recursive fault on the stab itself), but other than that there are no constraints which require the stabs for the secondary CPUs to be statically allocated. This patch allocates segment tables dynamically during boot, using lmb_alloc() to ensure they are within the first 256M segment. This reduces the kernel image size by 192k... Tested on RS64 iSeries, POWER3 pSeries, and POWER5. Signed-off-by: David Gibson Index: working-2.6/arch/ppc64/kernel/head.S =================================================================== --- working-2.6.orig/arch/ppc64/kernel/head.S 2005-07-14 10:57:49.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/head.S 2005-07-21 15:23:31.000000000 +1000 @@ -2131,13 +2131,6 @@ swapper_pg_dir: .space 4096 -#ifdef CONFIG_SMP -/* 1 page segment table per cpu (max 48, cpu0 allocated at STAB0_PHYS_ADDR) */ - .globl stab_array -stab_array: - .space 4096 * 48 -#endif - /* * This space gets a copy of optional info passed to us by the bootstrap * Used to pass parameters into the kernel like root=/dev/sda1, etc. 
Index: working-2.6/arch/ppc64/kernel/smp.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/smp.c 2005-07-06 10:30:12.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/smp.c 2005-07-21 14:55:28.000000000 +1000 @@ -65,8 +65,6 @@ static volatile unsigned int cpu_callin_map[NR_CPUS]; -extern unsigned char stab_array[]; - void smp_call_function_interrupt(void); int smt_enabled_at_boot = 1; @@ -492,19 +490,6 @@ paca[cpu].default_decr = tb_ticks_per_jiffy; - if (!cpu_has_feature(CPU_FTR_SLB)) { - void *tmp; - - /* maximum of 48 CPUs on machines with a segment table */ - if (cpu >= 48) - BUG(); - - tmp = &stab_array[PAGE_SIZE * cpu]; - memset(tmp, 0, PAGE_SIZE); - paca[cpu].stab_addr = (unsigned long)tmp; - paca[cpu].stab_real = virt_to_abs(tmp); - } - /* Make sure callin-map entry is 0 (can be leftover a CPU * hotplug */ Index: working-2.6/arch/ppc64/kernel/setup.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/setup.c 2005-07-14 10:57:49.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/setup.c 2005-07-21 15:23:31.000000000 +1000 @@ -1071,6 +1071,8 @@ irqstack_early_init(); emergency_stack_init(); + stabs_alloc(); + /* set up the bootmem stuff with available memory */ do_init_bootmem(); sparse_init(); Index: working-2.6/arch/ppc64/mm/stab.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/stab.c 2005-06-08 15:46:23.000000000 +1000 +++ working-2.6/arch/ppc64/mm/stab.c 2005-07-21 15:23:31.000000000 +1000 @@ -18,6 +18,8 @@ #include #include #include +#include +#include struct stab_entry { unsigned long esid_data; @@ -224,6 +226,39 @@ extern void slb_initialize(void); /* + * Allocate segment tables for secondary CPUs. These must all go in + * the first (bolted) segment, so that do_stab_bolted won't get a + * recursive segment miss on the segment table itself. 
+ */ +void stabs_alloc(void) +{ + int cpu; + + if (cpu_has_feature(CPU_FTR_SLB)) + return; + + for_each_cpu(cpu) { + unsigned long newstab; + + if (cpu == 0) + continue; /* stab for CPU 0 is statically allocated */ + + newstab = lmb_alloc_base(PAGE_SIZE, PAGE_SIZE, 1< References: <20050721044414.GD30030@localhost.localdomain> <1121923959.14393.127.camel@gaston> Message-ID: <20050722031048.GC22596@localhost.localdomain> On Thu, Jul 21, 2005 at 01:32:38AM -0400, Benjamin Herrenschmidt wrote: > > > + paca[cpu].stab_addr = newstab; > > + paca[cpu].stab_real = virt_to_abs(newstab); > > + printk(KERN_DEBUG "Segment table for CPU %d at 0x%lx virtual, 0x%lx absolute\n", cpu, paca[cpu].stab_addr, paca[cpu].stab_real); > > + } > > +} > > That one could use some 80 cols wrapping :) Ok, fixed, along with a couple of minor warnings. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From michael at ellerman.id.au Mon Jul 25 13:53:17 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Mon, 25 Jul 2005 13:53:17 +1000 Subject: [PATCH] ppc64: Add /proc/ppc64/flat-device-tree for debugging purposes Message-ID: <200507251353.17778.michael@ellerman.id.au> This patch adds a /proc/ppc64/flat-device-tree file which exports the flattened device tree as a binary blob. It assumes the device tree is contiguous in memory. I haven't tested the !initial_boot_params case, but it should be ok looking at the code. 
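As a rough illustration of what a userspace reader of /proc/ppc64/flat-device-tree might do with the blob, here is a minimal sketch that checks the first two header words. It assumes the blob begins with the usual flattened device tree header, i.e. a big-endian magic number 0xd00dfeed followed by the totalsize field; the names FDT_MAGIC, fdt_be32 and fdt_blob_size are made up for this example and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed header magic (OF_DT_HEADER in the kernel sources). */
#define FDT_MAGIC 0xd00dfeedu

/* Read a big-endian 32-bit word regardless of host byte order. */
static uint32_t fdt_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: return the totalsize field of a flattened
 * device tree blob, or 0 if the buffer is too short to hold the
 * first two header words or the magic number does not match.
 */
static uint32_t fdt_blob_size(const unsigned char *buf, size_t len)
{
	if (len < 8 || fdt_be32(buf) != FDT_MAGIC)
		return 0;
	return fdt_be32(buf + 4);
}
```

A debugging tool could read the first 8 bytes of the proc file, call fdt_blob_size() on them, and compare the result with the file size reported by the kernel.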
Index: work/arch/ppc64/kernel/proc_ppc64.c =================================================================== --- work.orig/arch/ppc64/kernel/proc_ppc64.c +++ work/arch/ppc64/kernel/proc_ppc64.c @@ -69,6 +69,24 @@ core_initcall(proc_ppc64_create); static int __init proc_ppc64_init(void) { struct proc_dir_entry *pde; +#ifdef CONFIG_PROC_DEVICETREE + extern struct boot_param_header *initial_boot_params; + + pde = create_proc_entry("ppc64/flat-device-tree", + S_IFREG|S_IRUGO, NULL); + if (!pde) + return 1; + pde->nlink = 1; + + if (initial_boot_params) { + pde->data = initial_boot_params; + pde->size = initial_boot_params->totalsize; + } else { + pde->data = pde->size = 0; + } + + pde->proc_fops = &page_map_fops; +#endif pde = create_proc_entry("ppc64/systemcfg", S_IFREG|S_IRUGO, NULL); if (!pde) From david at gibson.dropbear.id.au Mon Jul 25 16:16:35 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 25 Jul 2005 16:16:35 +1000 Subject: [PPC64] Remove another fixed address constraint Message-ID: <20050725061635.GD19817@localhost.localdomain> Presently the LparMap, one of the structures the kernel shares with the legacy iSeries hypervisor, has a fixed offset address in head.S. This patch changes this so the LparMap is a normally initialized structure, without a fixed address. This allows us to use macros to compute some of the values in the structure, which wasn't previously possible because the assembler always uses signed %, which gets the wrong answers for the computations in question. Unfortunately, a gcc bug means that doing this requires another structure (hvReleaseData) to be initialized in asm instead of C, but on the whole the result is cleaner than before. 
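To make the signed versus unsigned remainder point concrete, here is a small sketch of a VSID-style scramble computed both ways. The constants are assumptions chosen for illustration (a 28-bit multiplier, a 36-bit modulus and a 28-bit segment shift) and are not taken from this patch; the ex_ names are hypothetical. The signed variant also relies on the usual two's complement conversion, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Assumed values, for illustration only (not taken from the patch). */
#define EX_VSID_MULTIPLIER 200730139ULL
#define EX_VSID_MODULUS    ((1ULL << 36) - 1)
#define EX_SID_SHIFT       28

/* Unsigned arithmetic, as a C macro on u64 operands would do it. */
static uint64_t ex_vsid_unsigned(uint64_t ea)
{
	uint64_t esid = ea >> EX_SID_SHIFT;

	return (esid * EX_VSID_MULTIPLIER) % EX_VSID_MODULUS;
}

/*
 * Same bit pattern, but the product is reinterpreted as a signed
 * value before taking the remainder. For kernel addresses the
 * product exceeds 2^63, so it becomes negative and the remainder
 * comes out negative as well.
 */
static int64_t ex_vsid_signed(uint64_t ea)
{
	uint64_t esid = ea >> EX_SID_SHIFT;
	int64_t product = (int64_t)(esid * EX_VSID_MULTIPLIER);

	return product % (int64_t)EX_VSID_MODULUS;
}
```

With an effective address of 0xC000000000000000 the two functions disagree, which is the kind of mismatch that makes it preferable to compute these values in C rather than in assembler expressions.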
Signed-off-by: David Gibson Index: working-2.6/include/asm-ppc64/iSeries/LparMap.h =================================================================== --- working-2.6.orig/include/asm-ppc64/iSeries/LparMap.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/iSeries/LparMap.h 2005-07-25 15:28:42.000000000 +1000 @@ -49,19 +49,26 @@ * entry to map the Esid to the Vsid. */ +#define HvEsidsToMap 2 +#define HvRangesToMap 1 + /* Hypervisor initially maps 32MB of the load area */ #define HvPagesToMap 8192 struct LparMap { - u64 xNumberEsids; // Number of ESID/VSID pairs (1) - u64 xNumberRanges; // Number of VA ranges to map (1) - u64 xSegmentTableOffs; // Page number within load area of seg table (0) + u64 xNumberEsids; // Number of ESID/VSID pairs + u64 xNumberRanges; // Number of VA ranges to map + u64 xSegmentTableOffs; // Page number within load area of seg table u64 xRsvd[5]; - u64 xKernelEsid; // Esid used to map kernel load (0x0C00000000) - u64 xKernelVsid; // Vsid used to map kernel load (0x0C00000000) - u64 xPages; // Number of pages to be mapped (8192) - u64 xOffset; // Offset from start of load area (0) - u64 xVPN; // Virtual Page Number (0x000C000000000000) + struct { + u64 xKernelEsid; // Esid used to map kernel load + u64 xKernelVsid; // Vsid used to map kernel load + } xEsids[HvEsidsToMap]; + struct { + u64 xPages; // Number of pages to be mapped + u64 xOffset; // Offset from start of load area + u64 xVPN; // Virtual Page Number + } xRanges[HvRangesToMap]; }; extern struct LparMap xLparMap; Index: working-2.6/include/asm-ppc64/iSeries/HvReleaseData.h =================================================================== --- working-2.6.orig/include/asm-ppc64/iSeries/HvReleaseData.h 2005-07-06 10:30:53.000000000 +1000 +++ working-2.6/include/asm-ppc64/iSeries/HvReleaseData.h 2005-07-25 15:28:42.000000000 +1000 @@ -39,6 +39,11 @@ * know that this PLIC does not support running an OS "that old". 
*/ +#define HVREL_TAGSINACTIVE 0x8000 +#define HVREL_32BIT 0x4000 +#define HVREL_NOSHAREDPROCS 0x2000 +#define HVREL_NOHMT 0x1000 + struct HvReleaseData { u32 xDesc; /* Descriptor "HvRD" ebcdic x00-x03 */ u16 xSize; /* Size of this control block x04-x05 */ @@ -46,11 +51,7 @@ struct naca_struct *xSlicNacaAddr; /* Virt addr of SLIC NACA x08-x0F */ u32 xMsNucDataOffset; /* Offset of Linux Mapping Data x10-x13 */ u32 xRsvd1; /* Reserved x14-x17 */ - u16 xTagsMode:1; /* 0 == tags active, 1 == tags inactive */ - u16 xAddressSize:1; /* 0 == 64-bit, 1 == 32-bit */ - u16 xNoSharedProcs:1; /* 0 == shared procs, 1 == no shared */ - u16 xNoHMT:1; /* 0 == allow HMT, 1 == no HMT */ - u16 xRsvd2:12; /* Reserved x18-x19 */ + u16 xFlags; u16 xVrmIndex; /* VRM Index of OS image x1A-x1B */ u16 xMinSupportedPlicVrmIndex; /* Min PLIC level (soft) x1C-x1D */ u16 xMinCompatablePlicVrmIndex; /* Min PLIC levelP (hard) x1E-x1F */ Index: working-2.6/arch/ppc64/kernel/LparData.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/LparData.c 2005-07-14 10:57:49.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/LparData.c 2005-07-25 15:09:55.000000000 +1000 @@ -33,17 +33,36 @@ * the hypervisor and Linux. */ +/* + * WARNING - magic here + * + * Ok, this is a horrid hack below, but marginally better than the + * alternatives. What we really want is just to initialize + * hvReleaseData in C as in the #if 0 section here. However, gcc + * refuses to believe that (u32)&x is a constant expression, so will + * not allow the xMsNucDataOffset field to be properly initialized. + * So, we declare hvReleaseData in inline asm instead. We use inline + * asm, rather than a .S file, because the assembler won't generate + * the necessary relocation for the LparMap either, unless that symbol + * is declared in the same source file. 
Finally, we put the asm in a + * dummy, attribute-used function, instead of at file scope, because + * file scope asms don't allow contraints. We want to use the "i" + * constraints to put sizeof() and offsetof() expressions in there, + * because including asm/offsets.h in C code then stringifying causes + * all manner of warnings. + */ +#if 0 struct HvReleaseData hvReleaseData = { .xDesc = 0xc8a5d9c4, /* "HvRD" ebcdic */ .xSize = sizeof(struct HvReleaseData), .xVpdAreasPtrOffset = offsetof(struct naca_struct, xItVpdAreas), .xSlicNacaAddr = &naca, /* 64-bit Naca address */ - .xMsNucDataOffset = 0x4800, /* offset of LparMap within loadarea (see head.S) */ - .xTagsMode = 1, /* tags inactive */ - .xAddressSize = 0, /* 64 bit */ - .xNoSharedProcs = 0, /* shared processors */ - .xNoHMT = 0, /* HMT allowed */ - .xRsvd2 = 6, /* TEMP: This allows non-GA driver */ + .xMsNucDataOffset = (u32)((unsigned long)&xLparMap - KERNELBASE), + .xFlags = HVREL_TAGSINACTIVE /* tags inactive */ + /* 64 bit */ + /* shared processors */ + /* HMT allowed */ + | 6, /* TEMP: This allows non-GA driver */ .xVrmIndex = 4, /* We are v5r2m0 */ .xMinSupportedPlicVrmIndex = 3, /* v5r1m0 */ .xMinCompatablePlicVrmIndex = 3, /* v5r1m0 */ @@ -51,6 +70,63 @@ 0xa7, 0x40, 0xf2, 0x4b, 0xf4, 0x4b, 0xf6, 0xf4 }, }; +#endif + + +extern struct HvReleaseData hvReleaseData; + +static void __attribute_used__ hvReleaseData_wrapper(void) +{ + /* This doesn't appear to need any alignment (even 4 byte) */ + asm volatile ( + " lparMapPhys = xLparMap - %3\n" + " .data\n" + " .globl hvReleaseData\n" + "hvReleaseData:\n" + " .long 0xc8a5d9c4\n" /* xDesc */ + /* "HvRD" in ebcdic */ + " .short %0\n" /* xSize */ + " .short %1\n" /* xVpdAreasPtrOffset */ + " .llong naca\n" /* xSlicNacaAddr */ + " .long lparMapPhys\n" /* xMsNucDataOffset */ + " .long 0\n" /* xRsvd1 */ + " .short %2\n" /* xFlags */ + " .short 4\n" /* xVrmIndex - v5r2m0 */ + " .short 3\n" /* xMinSupportedPlicVrmIndex - v5r1m0 */ + " .short 3\n" /* 
xMinCompatablePlicVrmIndex - v5r1m0 */ + " .long 0xd38995a4\n" /* xVrmName */ + " .long 0xa740f24b\n" /* "Linux 2.4.64" ebcdic */ + " .long 0xf44bf6f4\n" + " . = hvReleaseData + %0\n" + " .previous\n" + : : "i"(sizeof(hvReleaseData)), + "i"(offsetof(struct naca_struct, xItVpdAreas)), + "i"(HVREL_TAGSINACTIVE /* tags inactive, 64 bit, */ + /* shared processors, HMT allowed */ + | 6), /* TEMP: This allows non-GA drivers */ + "i"(KERNELBASE) + ); +} + +struct LparMap __attribute__((aligned (16))) xLparMap = { + .xNumberEsids = HvEsidsToMap, + .xNumberRanges = HvRangesToMap, + .xSegmentTableOffs = STAB0_PAGE, + + .xEsids = { + { .xKernelEsid = GET_ESID(KERNELBASE), + .xKernelVsid = KERNEL_VSID(KERNELBASE), }, + { .xKernelEsid = GET_ESID(VMALLOCBASE), + .xKernelVsid = KERNEL_VSID(VMALLOCBASE), }, + }, + + .xRanges = { + { .xPages = HvPagesToMap, + .xOffset = 0, + .xVPN = KERNEL_VSID(KERNELBASE) << (SID_SHIFT - PAGE_SHIFT), + }, + }, +}; extern void system_reset_iSeries(void); extern void machine_check_iSeries(void); Index: working-2.6/include/asm-ppc64/mmu.h =================================================================== --- working-2.6.orig/include/asm-ppc64/mmu.h 2005-07-25 15:28:15.000000000 +1000 +++ working-2.6/include/asm-ppc64/mmu.h 2005-07-25 15:29:40.000000000 +1000 @@ -338,6 +338,9 @@ | (ea >> SID_SHIFT)); } +#define VSID_SCRAMBLE(pvsid) (((pvsid) * VSID_MULTIPLIER) % VSID_MODULUS) +#define KERNEL_VSID(ea) VSID_SCRAMBLE(GET_ESID(ea)) + #endif /* __ASSEMBLY */ #endif /* _PPC64_MMU_H_ */ Index: working-2.6/arch/ppc64/kernel/head.S =================================================================== --- working-2.6.orig/arch/ppc64/kernel/head.S 2005-07-25 15:28:15.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/head.S 2005-07-25 15:31:00.000000000 +1000 @@ -522,36 +522,9 @@ #ifdef CONFIG_PPC_ISERIES .globl naca naca: - .llong itVpdAreas - - /* - * The iSeries LPAR map is at this fixed address - * so that the HvReleaseData structure can address - * it with a 
32-bit offset. - * - * The VSID values below are dependent on the - * VSID generation algorithm. See include/asm/mmu_context.h. - */ - - . = 0x4800 - - .llong 2 /* # ESIDs to be mapped by hypervisor */ - .llong 1 /* # memory ranges to be mapped by hypervisor */ - .llong STAB0_PAGE /* Page # of segment table within load area */ - .llong 0 /* Reserved */ - .llong 0 /* Reserved */ - .llong 0 /* Reserved */ - .llong 0 /* Reserved */ - .llong 0 /* Reserved */ - .llong (KERNELBASE>>SID_SHIFT) - .llong 0x408f92c94 /* KERNELBASE VSID */ - /* We have to list the bolted VMALLOC segment here, too, so that it - * will be restored on shared processor switch */ - .llong (VMALLOCBASE>>SID_SHIFT) - .llong 0xf09b89af5 /* VMALLOCBASE VSID */ - .llong 8192 /* # pages to map (32 MB) */ - .llong 0 /* Offset from start of loadarea to start of map */ - .llong 0x408f92c940000 /* VPN of first page to map */ + .llong itVpdAreas + .llong 0 /* xRamDisk */ + .llong 0 /* xRamDiskSize */ . = 0x6100 -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From ntl at pobox.com Tue Jul 26 00:36:22 2005 From: ntl at pobox.com (Nathan Lynch) Date: Mon, 25 Jul 2005 09:36:22 -0500 Subject: [PATCH] ppc64: Add /proc/ppc64/flat-device-tree for debugging purposes In-Reply-To: <200507251353.17778.michael@ellerman.id.au> References: <200507251353.17778.michael@ellerman.id.au> Message-ID: <20050725143622.GB30720@otto> Michael Ellerman wrote: > This patch adds a /proc/ppc64/flat-device-tree file which exports > the flattened device tree as a binary blob. It assumes the device > tree is contiguous in memory. If this is expressly meant as a debugging tool then I suggest using debugfs instead of adding another entry to /proc/ppc64. 
Nathan From jschopp at austin.ibm.com Tue Jul 26 02:20:21 2005 From: jschopp at austin.ibm.com (Joel Schopp) Date: Mon, 25 Jul 2005 11:20:21 -0500 Subject: [PATCH] ppc64: Add /proc/ppc64/flat-device-tree for debugging purposes In-Reply-To: <200507251353.17778.michael@ellerman.id.au> References: <200507251353.17778.michael@ellerman.id.au> Message-ID: <42E51145.6080404@austin.ibm.com> > This patch adds a /proc/ppc64/flat-device-tree file which exports > the flattened device tree as a binary blob. Another /proc entry, and a binary one at that. I'm sure this is helpful for you, but I think many of the rest of us will want to scream if this ever goes into mainline. From dwg at au1.ibm.com Tue Jul 26 09:51:50 2005 From: dwg at au1.ibm.com (David Gibson) Date: Tue, 26 Jul 2005 09:51:50 +1000 Subject: [PATCH] ppc64: Add /proc/ppc64/flat-device-tree for debugging purposes In-Reply-To: <42E51145.6080404@austin.ibm.com> References: <200507251353.17778.michael@ellerman.id.au> <42E51145.6080404@austin.ibm.com> Message-ID: <20050725235150.GA14904@localhost.localdomain> On Mon, Jul 25, 2005 at 11:20:21AM -0500, Joel Schopp wrote: > >This patch adds a /proc/ppc64/flat-device-tree file which exports > >the flattened device tree as a binary blob. > > Another /proc entry, and a binary one at that. I'm sure this is > helpful for you, but I think many of the rest of us will want to scream > if this ever goes into mainline. It's not intended to go into mainline, it's a debugging tool. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/people/dgibson From michael at ellerman.id.au Tue Jul 26 18:57:22 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:57:22 +1000 Subject: [RFC] Add flattened device tree for iSeries Message-ID: <200507261857.31494.michael@ellerman.id.au> Hi, This series of patches adds a flattened device tree into the iSeries startup code. This already allows us to remove a reasonable amount of iSeries specific code, and hopefully more will go in the future. Most of this is not ready to merge, but just here for early feedback. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From michael at ellerman.id.au Tue Jul 26 18:58:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:39 +1000 Subject: [PATCH 2/19] ppc64: Fix a misleading printk in unflatten_dt_node() In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368319.356982.239413262750.qpatch@concordia> When unflatten_dt_node() fails to find an OF_DT_END_NODE tag while scanning the flat device tree, it prints "Weird tag at start of node"; this should be "Weird tag at end of node".
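For illustration only, the check being described can be modelled in userspace roughly as below. This is a toy sketch, not the code in prom.c: the stream is assumed to hold nothing but begin/end tags (no node names or properties), and walk_node(), walk_stream() and the status enum are hypothetical names invented for the example. It shows why the two failure cases deserve distinct messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag values used by the flattened device tree format. */
#define OF_DT_BEGIN_NODE 0x1
#define OF_DT_END_NODE   0x2

/* Hypothetical result codes, for this sketch only. */
enum walk_status { WALK_OK, WALK_BAD_START, WALK_BAD_END };

/* Walk one node and its children; *p is advanced past what was consumed. */
static enum walk_status walk_node(const uint32_t **p, const uint32_t *end)
{
	if (*p >= end || **p != OF_DT_BEGIN_NODE)
		return WALK_BAD_START;	/* weird tag at start of node */
	(*p)++;

	/* Recurse while the next tag opens a child node. */
	while (*p < end && **p == OF_DT_BEGIN_NODE) {
		enum walk_status st = walk_node(p, end);

		if (st != WALK_OK)
			return st;
	}

	if (*p >= end || **p != OF_DT_END_NODE)
		return WALK_BAD_END;	/* weird tag at end of node */
	(*p)++;
	return WALK_OK;
}

/* Convenience wrapper: walk a whole tag array starting at the root node. */
static enum walk_status walk_stream(const uint32_t *tags, size_t n)
{
	const uint32_t *p = tags;

	return walk_node(&p, tags + n);
}
```

A stream whose root is never closed properly fails with WALK_BAD_END, which corresponds to the message this patch corrects.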
--- arch/ppc64/kernel/prom.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: work/arch/ppc64/kernel/prom.c =================================================================== --- work.orig/arch/ppc64/kernel/prom.c +++ work/arch/ppc64/kernel/prom.c @@ -816,7 +816,7 @@ static unsigned long __init unflatten_dt tag = *((u32 *)(*p)); } if (tag != OF_DT_END_NODE) { - printk("Weird tag at start of node: %x\n", tag); + printk("Weird tag at end of node: %x\n", tag); return mem; } *p += 4; From michael at ellerman.id.au Tue Jul 26 18:58:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:39 +1000 Subject: [PATCH 3/19] ppc64: Make get_property(NULL, "foo") return NULL rather than dieing In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368319.619681.345790155179.qpatch@concordia> In prom.c get_property() will die if you pass it NULL as the node parameter. Some code calling get_property() checks the node is non-NULL, but other code doesn't. The neatest solution is just to make get_property(NULL, "foo") return NULL as if the node exists but just has no "foo" property. Code that cares about the node's existence can check the node explicitly.
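As a rough userspace illustration of the behaviour described here: the structures below are simplified stand-ins with only the fields the lookup needs, not the kernel definitions, and the sample property in the usage note is made up. The point is only that the NULL check comes first, so a missing node and a missing property look the same to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures (assumed layout). */
struct property {
	const char *name;
	int length;
	const void *value;
	struct property *next;
};

struct device_node {
	struct property *properties;
};

/* Look up a property by name; a NULL node behaves like a node with no properties. */
static const void *get_property(const struct device_node *np,
				const char *name, int *lenp)
{
	const struct property *pp;

	if (!np)
		return NULL;

	for (pp = np->properties; pp != NULL; pp = pp->next) {
		if (strcmp(pp->name, name) == 0) {
			if (lenp != NULL)
				*lenp = pp->length;
			return pp->value;
		}
	}
	return NULL;
}
```

With this shape, a caller can write get_property(maybe_null_node, "foo", NULL) and test only the result, while code that needs to distinguish the two cases still checks the node itself.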
--- arch/ppc64/kernel/prom.c | 3 +++ 1 files changed, 3 insertions(+) Index: work/arch/ppc64/kernel/prom.c =================================================================== --- work.orig/arch/ppc64/kernel/prom.c +++ work/arch/ppc64/kernel/prom.c @@ -1760,6 +1760,9 @@ get_property(struct device_node *np, con { struct property *pp; + if (!np) + return NULL; + for (pp = np->properties; pp != 0; pp = pp->next) if (strcmp(pp->name, name) == 0) { if (lenp != 0) From michael at ellerman.id.au Tue Jul 26 19:02:05 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 19:02:05 +1000 Subject: [RFC] Add flattened device tree for iSeries In-Reply-To: <200507261857.31494.michael@ellerman.id.au> References: <200507261857.31494.michael@ellerman.id.au> Message-ID: <200507261902.05980.michael@ellerman.id.au> On Tue, 26 Jul 2005 18:57, Michael Ellerman wrote: > Hi, > > This series of patches adds a flattened device tree into the iSeries > startup code. This already allows us to remove a reasonable amount of > iSeries specific code, and hopefully more will go in the future. > > Most of this is not ready to merge, but just here for early feedback. ps. I should add that this series boots for me on our 4 cpu iSeries box. If anyone else has an iSeries kicking around and a few spare minutes I'd love to hear how you go. Hopefully I haven't broken pSeries etc. :} -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050726/d894ad42/attachment.pgp From michael at ellerman.id.au Tue Jul 26 18:58:42 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:42 +1000 Subject: [PATCH 17/19] ppc64: Remove iSeries cache size setup In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368322.742941.261730659389.qpatch@concordia> Now that we have cache sizes in the device tree we can remove the iSeries specific setup of ppc64_caches and systemcfg. --- arch/ppc64/kernel/iSeries_setup.c | 48 -------------------------------------- 1 files changed, 48 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -73,7 +73,6 @@ extern void hvlog(char *fmt, ...); extern void ppcdbg_initialize(void); static void build_iSeries_Memory_Map(void); -static void setup_iSeries_cache_sizes(void); static void iSeries_bolt_kernel(unsigned long saddr, unsigned long eaddr); static int iseries_shared_idle(void); static int iseries_dedicated_idle(void); @@ -321,12 +320,6 @@ static void __init iSeries_init_early(vo iSeries_recal_titan = HvCallXm_loadTod(); /* - * Cache sizes must be initialized before hpte_init_iSeries is called - * as the later need them for flush_icache_range() - */ - setup_iSeries_cache_sizes(); - - /* * Initialize the hash table management pointers */ hpte_init_iSeries(); @@ -513,47 +506,6 @@ static void __init build_iSeries_Memory_ } /* - * Set up the variables that describe the cache line sizes - * for this machine. 
- */ -static void __init setup_iSeries_cache_sizes(void) -{ - unsigned int i, n; - unsigned int procIx = get_paca()->lppaca.dyn_hv_phys_proc_index; - - systemcfg->icache_size = - ppc64_caches.isize = xIoHriProcessorVpd[procIx].xInstCacheSize * 1024; - systemcfg->icache_line_size = - ppc64_caches.iline_size = - xIoHriProcessorVpd[procIx].xInstCacheOperandSize; - systemcfg->dcache_size = - ppc64_caches.dsize = - xIoHriProcessorVpd[procIx].xDataL1CacheSizeKB * 1024; - systemcfg->dcache_line_size = - ppc64_caches.dline_size = - xIoHriProcessorVpd[procIx].xDataCacheOperandSize; - ppc64_caches.ilines_per_page = PAGE_SIZE / ppc64_caches.iline_size; - ppc64_caches.dlines_per_page = PAGE_SIZE / ppc64_caches.dline_size; - - i = ppc64_caches.iline_size; - n = 0; - while ((i = (i / 2))) - ++n; - ppc64_caches.log_iline_size = n; - - i = ppc64_caches.dline_size; - n = 0; - while ((i = (i / 2))) - ++n; - ppc64_caches.log_dline_size = n; - - printk("D-cache line size = %d\n", - (unsigned int)ppc64_caches.dline_size); - printk("I-cache line size = %d\n", - (unsigned int)ppc64_caches.iline_size); -} - -/* * Create a pte. Used during initialization only. */ static void iSeries_make_pte(unsigned long va, unsigned long pa, From michael at ellerman.id.au Tue Jul 26 18:58:41 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:41 +1000 Subject: [PATCH 13/19] ppc64: Add /system-id, /model and /compatible In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368321.781610.923749769978.qpatch@concordia> Add /system-id, /model and /compatible to the iSeries device tree. This should allow us to remove some iSeries specific code (at least for lparcfg) but I haven't got around to that yet. 
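To make the string layout concrete, here is a small userspace sketch of how a /system-id value of this shape could be assembled. It assumes that strne2a() translates a fixed-length EBCDIC field to ASCII, as its name suggests; ebcdic_to_ascii_n() is a hypothetical stand-in that only handles digits and upper-case letters, build_system_id() is likewise invented for the example, and the field offsets simply mirror those used by dt_model() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strne2a(): convert n EBCDIC bytes to ASCII.
 * Only digits and upper-case letters are handled; anything else becomes '?'.
 */
static void ebcdic_to_ascii_n(char *dst, const unsigned char *src, size_t n)
{
	static const char a_to_i[] = "ABCDEFGHI";
	static const char j_to_r[] = "JKLMNOPQR";
	static const char s_to_z[] = "STUVWXYZ";
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned char c = src[i];

		if (c >= 0xF0 && c <= 0xF9)
			dst[i] = (char)('0' + (c - 0xF0));
		else if (c >= 0xC1 && c <= 0xC9)
			dst[i] = a_to_i[c - 0xC1];
		else if (c >= 0xD1 && c <= 0xD9)
			dst[i] = j_to_r[c - 0xD1];
		else if (c >= 0xE2 && c <= 0xE9)
			dst[i] = s_to_z[c - 0xE2];
		else
			dst[i] = '?';
	}
}

/* "IBM," + 2 chars taken from mfg_id + 2 + 5 chars taken from serial + 1. */
static void build_system_id(char buf[16], const unsigned char *mfg_id,
			    const unsigned char *serial)
{
	memcpy(buf, "IBM,", 4);
	ebcdic_to_ascii_n(buf + 4, mfg_id + 2, 2);
	ebcdic_to_ascii_n(buf + 6, serial + 1, 5);
	buf[11] = '\0';
}
```

The result is always an 11-character string plus terminator, which is why a 16-byte buffer is enough.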
--- arch/ppc64/kernel/iSeries_setup.c | 18 ++++++++++++++++++ 1 files changed, 18 insertions(+) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -58,6 +58,7 @@ #include #include #include +#include #include #include @@ -1056,6 +1057,22 @@ void dt_prop_empty(struct iseries_flat_d dt_prop(dt, name, NULL, 0); } +void dt_model(struct iseries_flat_dt *dt) +{ + char buf[16] = "IBM,"; + + strne2a(buf + 4, xItExtVpdPanel.mfgID + 2, 2); + strne2a(buf + 6, xItExtVpdPanel.systemSerial + 1, 5); + buf[11] = '\0'; + dt_prop_str(dt, "system-id", buf); + + strne2a(buf + 4, xItExtVpdPanel.machineType, 4); + buf[8] = '\0'; + dt_prop_str(dt, "model", buf); + + dt_prop_str(dt, "compatible", "IBM,iSeries"); +} + void build_flat_dt(struct iseries_flat_dt *dt) { u64 tmp[2]; @@ -1066,6 +1083,7 @@ void build_flat_dt(struct iseries_flat_d dt_prop_u32(dt, "#address-cells", 2); dt_prop_u32(dt, "#size-cells", 2); + dt_model(dt); /* /memory */ dt_start_node(dt, "/memory at 0"); From michael at ellerman.id.au Tue Jul 26 18:58:40 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:40 +1000 Subject: [PATCH 7/19] ppc64: Enable /proc/device-tree for iSeries In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368320.463542.50879214517.qpatch@concordia> Allow CONFIG_PROC_DEVICETREE to be enabled on iSeries. 
--- arch/ppc64/Kconfig | 1 - 1 files changed, 1 deletion(-) Index: work/arch/ppc64/Kconfig =================================================================== --- work.orig/arch/ppc64/Kconfig +++ work/arch/ppc64/Kconfig @@ -407,7 +407,6 @@ source "drivers/pci/hotplug/Kconfig" config PROC_DEVICETREE bool "Support for Open Firmware device tree in /proc" - depends on !PPC_ISERIES help This option adds a device-tree directory under /proc which contains an image of the device tree that the kernel copies from Open From michael at ellerman.id.au Tue Jul 26 18:58:43 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:43 +1000 Subject: [PATCH 19/19] ppc64: Use setup_cpu_maps() instead of iSeries specific smp code In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368323.243384.202065854128.qpatch@concordia> Call setup_cpu_maps() on iSeries, which will set up the cpu maps from the device tree. This removes the need for smp_iSeries_numProcs() and makes smp_iSeries_probe() a one-liner.
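The idea can be sketched in userspace like this. It is a simplified model, not the kernel cpumask API: a plain 32-bit mask stands in for cpu_possible_map, an array of flags stands in for the cpu nodes found in the device tree, and every function name below is hypothetical. Once a single setup routine has filled the map, the probe only has to count bits.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 32	/* assumed limit for this sketch */

/* Stand-in for cpu_possible_map: one bit per logical cpu. */
static uint32_t cpu_possible_mask;

/* Fill the mask once from per-cpu flags (standing in for device tree cpu nodes). */
static void setup_cpu_maps_sketch(const int *cpu_present, int n)
{
	int i;

	cpu_possible_mask = 0;
	for (i = 0; i < n && i < NR_CPUS; i++)
		if (cpu_present[i])
			cpu_possible_mask |= (uint32_t)1 << i;
}

/* Count the set bits in a mask. */
static int mask_weight(uint32_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* The probe no longer rescans per-cpu state; it just counts the map. */
static int smp_probe_sketch(void)
{
	return mask_weight(cpu_possible_mask);
}
```

Keeping the scan in one place means the probe and the processor count cannot disagree about which cpus exist.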
--- arch/ppc64/kernel/iSeries_smp.c | 30 +----------------------------- arch/ppc64/kernel/setup.c | 8 +------- 2 files changed, 2 insertions(+), 36 deletions(-) Index: work/arch/ppc64/kernel/iSeries_smp.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_smp.c +++ work/arch/ppc64/kernel/iSeries_smp.c @@ -82,35 +82,9 @@ static void smp_iSeries_message_pass(int } } -static int smp_iSeries_numProcs(void) -{ - unsigned np, i; - - np = 0; - for (i=0; i < NR_CPUS; ++i) { - if (paca[i].lppaca.dyn_proc_status < 2) { - cpu_set(i, cpu_possible_map); - cpu_set(i, cpu_present_map); - cpu_set(i, cpu_sibling_map[i]); - ++np; - } - } - return np; -} - static int smp_iSeries_probe(void) { - unsigned i; - unsigned np = 0; - - for (i=0; i < NR_CPUS; ++i) { - if (paca[i].lppaca.dyn_proc_status < 2) { - /*paca[i].active = 1;*/ - ++np; - } - } - - return np; + return cpus_weight(cpu_possible_map); } static void smp_iSeries_kick_cpu(int nr) @@ -144,6 +118,4 @@ static struct smp_ops_t iSeries_smp_ops void __init smp_init_iSeries(void) { smp_ops = &iSeries_smp_ops; - systemcfg->processorCount = smp_iSeries_numProcs(); } - Index: work/arch/ppc64/kernel/setup.c =================================================================== --- work.orig/arch/ppc64/kernel/setup.c +++ work/arch/ppc64/kernel/setup.c @@ -182,8 +182,6 @@ void __init disable_early_printk(void) early_console_initialized = 0; } -#if defined(CONFIG_PPC_MULTIPLATFORM) && defined(CONFIG_SMP) - static int smt_enabled_cmdline; /* Look for ibm,smt-enabled OF option */ @@ -335,7 +333,6 @@ static void __init setup_cpu_maps(void) systemcfg->processorCount = num_present_cpus(); } -#endif /* defined(CONFIG_PPC_MULTIPLATFORM) && defined(CONFIG_SMP) */ #ifdef CONFIG_PPC_MULTIPLATFORM @@ -637,12 +634,9 @@ void __init setup_system(void) parse_early_param(); -#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) - /* - * iSeries has already initialized the cpu maps at this point. 
- */ setup_cpu_maps(); +#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) /* Release secondary cpus out of their spinloops at 0x60 now that * we can map physical -> logical CPU ids */ From michael at ellerman.id.au Tue Jul 26 18:58:42 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:42 +1000 Subject: [PATCH 14/19] ppc64: Move iSeries memory limit logic into device tree In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368322.8246.827409622930.qpatch@concordia> Move the iSeries specific memory limit logic earlier so we can put it into the device tree. Unfortunately this means we have to parse the command line twice (once in iSeries_early_setup() and once in setup_system()), I might clean that up later. --- arch/ppc64/kernel/iSeries_setup.c | 48 ++++++++++++++++++++++---------------- arch/ppc64/kernel/setup.c | 17 ------------- 2 files changed, 29 insertions(+), 36 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -313,8 +313,6 @@ static void __init iSeries_get_cmdline(v static void __init iSeries_init_early(void) { - extern unsigned long memory_limit; - DBG(" -> iSeries_init_early()\n"); ppcdbg_initialize(); @@ -358,23 +356,6 @@ static void __init iSeries_init_early(vo */ iommu_init_early_iSeries(); - iSeries_get_cmdline(); - - /* Save unparsed command line copy for /proc/cmdline */ - strlcpy(saved_command_line, cmd_line, COMMAND_LINE_SIZE); - - /* Parse early parameters, in particular mem=x */ - parse_early_param(); - - if (memory_limit) { - if (memory_limit < systemcfg->physicalMemorySize) - systemcfg->physicalMemorySize = memory_limit; - else { - printk("Ignoring mem=%lu >= ram_top.\n", memory_limit); - memory_limit = 0; - } - } - /* Initialize machine-dependency vectors */ #ifdef CONFIG_SMP smp_init_iSeries(); @@ -917,6 +898,19 
@@ static int iseries_dedicated_idle(void) return 0; } +static unsigned long iseries_memory_limit; + +static int __init early_parsemem(char *p) +{ + if (!p) + return 0; + + iseries_memory_limit = ALIGN(memparse(p, &p), PAGE_SIZE); + + return 0; +} +early_param("mem", early_parsemem); + #ifndef CONFIG_PCI void __init iSeries_init_IRQ(void) { } #endif @@ -1097,6 +1091,10 @@ void build_flat_dt(struct iseries_flat_d /* /chosen */ dt_start_node(dt, "/chosen"); dt_prop_u32(dt, "linux,platform", PLATFORM_ISERIES_LPAR); + dt_prop_str(dt, "bootargs", cmd_line); + if (iseries_memory_limit) + dt_prop_u64(dt, "linux,memory-limit", iseries_memory_limit); + dt_end_node(dt); dt_end_node(dt); @@ -1116,6 +1114,18 @@ void __init iSeries_early_setup(void) */ build_iSeries_Memory_Map(); + iSeries_get_cmdline(); + + /* Save unparsed command line copy for /proc/cmdline */ + strlcpy(saved_command_line, cmd_line, COMMAND_LINE_SIZE); + + /* Parse early parameters, in particular mem=x */ + parse_early_param(); + + if (iseries_memory_limit && + iseries_memory_limit < systemcfg->physicalMemorySize) + systemcfg->physicalMemorySize = iseries_memory_limit; + build_flat_dt(&iseries_dt); early_init_devtree(&iseries_dt); Index: work/arch/ppc64/kernel/setup.c =================================================================== --- work.orig/arch/ppc64/kernel/setup.c +++ work/arch/ppc64/kernel/setup.c @@ -812,23 +812,6 @@ unsigned long memory_limit; unsigned long tce_alloc_start; unsigned long tce_alloc_end; -#ifdef CONFIG_PPC_ISERIES -/* - * On iSeries we just parse the mem=X option from the command line. 
- * On pSeries it's a bit more complicated, see prom_init_mem() - */ -static int __init early_parsemem(char *p) -{ - if (!p) - return 0; - - memory_limit = ALIGN(memparse(p, &p), PAGE_SIZE); - - return 0; -} -early_param("mem", early_parsemem); -#endif /* CONFIG_PPC_ISERIES */ - #ifdef CONFIG_PPC_MULTIPLATFORM static int __init set_preferred_console(void) { From michael at ellerman.id.au Tue Jul 26 18:58:42 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:42 +1000 Subject: [PATCH 15/19] ppc64: Move iSeries initrd logic into device tree In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368322.249925.69098618987.qpatch@concordia> Move iSeries initrd logic earlier and store the result in the device tree so generic code can do the rest for us. The iSeries code had a "feature" which the generic code lacks, ie. if the initrd is bigger than the configured ram disk size, we make the ram disk size big enough - I've copied the code across but I'm not sure if we want to keep it. 
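The arithmetic involved is small enough to sketch in userspace. The following is an illustration under stated assumptions, not the patch itself: PAGE_SIZE is taken to be 4096, rd_size is treated as a size in KB (which is what the "rd_size * 1024" comparisons in the code imply), and struct initrd_range, initrd_from_pages() and fit_rd_size() are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/* Hypothetical holder for the two values stored in the device tree. */
struct initrd_range {
	uint64_t start;
	uint64_t end;
};

/* Start address plus a size in pages, as with naca.xRamDisk and naca.xRamDiskSize. */
static struct initrd_range initrd_from_pages(uint64_t ram_disk, uint64_t size_pages)
{
	struct initrd_range r = { ram_disk, ram_disk + size_pages * PAGE_SIZE };

	return r;
}

/* Grow the ram disk size (in KB) if the initrd would not fit; never shrink it. */
static unsigned long fit_rd_size(unsigned long rd_size_kb, struct initrd_range r)
{
	uint64_t bytes = r.end - r.start;

	if ((uint64_t)rd_size_kb * 1024 < bytes)
		rd_size_kb = (unsigned long)(bytes / 1024);
	return rd_size_kb;
}
```

For example, a 2048-page initrd with a 4096 KB ram disk configured would bump the ram disk to 8192 KB, while a 16384 KB ram disk would be left untouched.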
--- arch/ppc64/kernel/iSeries_setup.c | 43 ++++++++++---------------------------- arch/ppc64/kernel/setup.c | 8 ++++++- 2 files changed, 19 insertions(+), 32 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -24,7 +24,6 @@ #include #include #include -#include #include #include #include @@ -95,7 +94,6 @@ static unsigned long tbFreqMhzHundreths; int piranha_simulator; -extern int rd_size; /* Defined in drivers/block/rd.c */ extern unsigned long klimit; extern unsigned long embedded_sysmap_start; extern unsigned long embedded_sysmap_end; @@ -319,24 +317,6 @@ static void __init iSeries_init_early(vo ppc64_interrupt_controller = IC_ISERIES; -#if defined(CONFIG_BLK_DEV_INITRD) - /* - * If the init RAM disk has been configured and there is - * a non-zero starting address for it, set it up - */ - if (naca.xRamDisk) { - initrd_start = (unsigned long)__va(naca.xRamDisk); - initrd_end = initrd_start + naca.xRamDiskSize * PAGE_SIZE; - initrd_below_start_ok = 1; // ramdisk in kernel space - ROOT_DEV = Root_RAM0; - if (((rd_size * 1024) / PAGE_SIZE) < naca.xRamDiskSize) - rd_size = (naca.xRamDiskSize * PAGE_SIZE) / 1024; - } else -#endif /* CONFIG_BLK_DEV_INITRD */ - { - /* ROOT_DEV = MKDEV(VIODASD_MAJOR, 1); */ - } - iSeries_recal_tb = get_tb(); iSeries_recal_titan = HvCallXm_loadTod(); @@ -370,17 +350,6 @@ static void __init iSeries_init_early(vo mf_initialized = 1; mb(); - /* If we were passed an initrd, set the ROOT_DEV properly if the values - * look sensible. If not, clear initrd reference. 
- */ -#ifdef CONFIG_BLK_DEV_INITRD - if (initrd_start >= KERNELBASE && initrd_end >= KERNELBASE && - initrd_end > initrd_start) - ROOT_DEV = Root_RAM0; - else - initrd_start = initrd_end = 0; -#endif /* CONFIG_BLK_DEV_INITRD */ - DBG(" <- iSeries_init_early()\n"); } @@ -1067,6 +1036,17 @@ void dt_model(struct iseries_flat_dt *dt dt_prop_str(dt, "compatible", "IBM,iSeries"); } +void dt_initrd(struct iseries_flat_dt *dt) +{ +#ifdef CONFIG_BLK_DEV_INITRD + if (naca.xRamDisk) { + dt_prop_u64(dt, "linux,initrd-start", (u64)naca.xRamDisk); + dt_prop_u64(dt, "linux,initrd-end", + (u64)naca.xRamDisk + naca.xRamDiskSize * PAGE_SIZE); + } +#endif +} + void build_flat_dt(struct iseries_flat_dt *dt) { u64 tmp[2]; @@ -1094,6 +1074,7 @@ void build_flat_dt(struct iseries_flat_d dt_prop_str(dt, "bootargs", cmd_line); if (iseries_memory_limit) dt_prop_u64(dt, "linux,memory-limit", iseries_memory_limit); + dt_initrd(dt); dt_end_node(dt); Index: work/arch/ppc64/kernel/setup.c =================================================================== --- work.orig/arch/ppc64/kernel/setup.c +++ work/arch/ppc64/kernel/setup.c @@ -534,6 +534,7 @@ static void __init check_for_initrd(void { #ifdef CONFIG_BLK_DEV_INITRD u64 *prop; + extern int rd_size; /* Defined in drivers/block/rd.c */ DBG(" -> check_for_initrd()\n"); @@ -557,8 +558,13 @@ static void __init check_for_initrd(void else initrd_start = initrd_end = 0; + /* XXX This came from iSeries code, make sure ram disk is big enough */ + if ((rd_size * 1024) < (initrd_end - initrd_start)) + rd_size = (initrd_end - initrd_start) / 1024; + if (initrd_start) - printk("Found initrd at 0x%lx:0x%lx\n", initrd_start, initrd_end); + printk("Found initrd at 0x%lx:0x%lx\n", initrd_start, + initrd_end); DBG(" <- check_for_initrd()\n"); #endif /* CONFIG_BLK_DEV_INITRD */ From michael at ellerman.id.au Tue Jul 26 18:58:43 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:43 +1000 Subject: [PATCH 18/19] ppc64: Add 
timebase-frequency and clock-frequency to iSeries device tree In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368323.6030.542495015632.qpatch@concordia> Add timebase-frequency and clock-frequency properties to the cpu nodes in the iSeries device tree. This removes the need for iSeries_calibrate_decr() so make generic_calibrate_decr() compile for iSeries and use that instead. This changes a few printks at boot. --- arch/ppc64/kernel/iSeries_setup.c | 83 +++----------------------------------- arch/ppc64/kernel/time.c | 2 2 files changed, 9 insertions(+), 76 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -83,14 +83,6 @@ static void iSeries_pci_final_fixup(void #endif /* Global Variables */ -static unsigned long procFreqHz; -static unsigned long procFreqMhz; -static unsigned long procFreqMhzHundreths; - -static unsigned long tbFreqHz; -static unsigned long tbFreqMhz; -static unsigned long tbFreqMhzHundreths; - int piranha_simulator; extern unsigned long klimit; @@ -576,36 +568,14 @@ static void __init iSeries_setup_arch(vo printk(KERN_INFO "Using dedicated idle loop\n"); } - /* Add an eye catcher and the systemcfg layout version number */ - strcpy(systemcfg->eye_catcher, "SYSTEMCFG:PPC64"); - systemcfg->version.major = SYSTEMCFG_MAJOR; - systemcfg->version.minor = SYSTEMCFG_MINOR; - /* Setup the Lp Event Queue */ setup_hvlpevent_queue(); - /* Compute processor frequency */ - procFreqHz = ((1UL << 34) * 1000000) / - xIoHriProcessorVpd[procIx].xProcFreq; - procFreqMhz = procFreqHz / 1000000; - procFreqMhzHundreths = (procFreqHz / 10000) - (procFreqMhz * 100); - ppc_proc_freq = procFreqHz; - - /* Compute time base frequency */ - tbFreqHz = ((1UL << 32) * 1000000) / - xIoHriProcessorVpd[procIx].xTimeBaseFreq; - tbFreqMhz = tbFreqHz / 1000000; - tbFreqMhzHundreths = (tbFreqHz / 
10000) - (tbFreqMhz * 100); - ppc_tb_freq = tbFreqHz; - printk("Max logical processors = %d\n", itVpdAreas.xSlicMaxLogicalProcs); printk("Max physical processors = %d\n", itVpdAreas.xSlicMaxPhysicalProcs); - printk("Processor frequency = %lu.%02lu\n", procFreqMhz, - procFreqMhzHundreths); - printk("Time base frequency = %lu.%02lu\n", tbFreqMhz, - tbFreqMhzHundreths); + systemcfg->processor = xIoHriProcessorVpd[procIx].xPVR; printk("Processor version = %x\n", systemcfg->processor); } @@ -649,49 +619,6 @@ static void iSeries_halt(void) mf_power_off(); } -/* - * void __init iSeries_calibrate_decr() - * - * Description: - * This routine retrieves the internal processor frequency from the VPD, - * and sets up the kernel timer decrementer based on that value. - * - */ -static void __init iSeries_calibrate_decr(void) -{ - unsigned long cyclesPerUsec; - struct div_result divres; - - /* Compute decrementer (and TB) frequency in cycles/sec */ - cyclesPerUsec = ppc_tb_freq / 1000000; - - /* - * Set the amount to refresh the decrementer by. This - * is the number of decrementer ticks it takes for - * 1/HZ seconds. - */ - tb_ticks_per_jiffy = ppc_tb_freq / HZ; - -#if 0 - /* TEST CODE FOR ADJTIME */ - tb_ticks_per_jiffy += tb_ticks_per_jiffy / 5000; - /* END OF TEST CODE */ -#endif - - /* - * tb_ticks_per_sec = freq; would give better accuracy - * but tb_ticks_per_sec = tb_ticks_per_jiffy*HZ; assures - * that jiffies (and xtime) will match the time returned - * by do_gettimeofday. 
- */ - tb_ticks_per_sec = tb_ticks_per_jiffy * HZ; - tb_ticks_per_usec = cyclesPerUsec; - tb_to_us = mulhwu_scale_factor(ppc_tb_freq, 1000000); - div128_by_32(1024 * 1024, 0, tb_ticks_per_sec, &divres); - tb_to_xs = divres.result_low; - setup_default_decr(); -} - static void __init iSeries_progress(char * st, unsigned short code) { printk("Progress: [%04x] - %s\n", (unsigned)code, st); @@ -849,7 +776,7 @@ struct machdep_calls __initdata iseries_ .get_boot_time = iSeries_get_boot_time, .set_rtc_time = iSeries_set_rtc_time, .get_rtc_time = iSeries_get_rtc_time, - .calibrate_decr = iSeries_calibrate_decr, + .calibrate_decr = generic_calibrate_decr, .progress = iSeries_progress, }; @@ -1033,6 +960,12 @@ void dt_cpus(struct iseries_flat_dt *dt) dt_prop_u32(dt, "d-cache-size", d->xDataL1CacheSizeKB * 1024); dt_prop_u32(dt, "d-cache-line-size", d->xDataCacheOperandSize); + /* magic conversions to Hz copied from old code */ + dt_prop_u32(dt, "clock-frequency", + ((1UL << 34) * 1000000) / d->xProcFreq); + dt_prop_u32(dt, "timebase-frequency", + ((1UL << 32) * 1000000) / d->xTimeBaseFreq); + dt_prop_u32(dt, "reg", i); if (dt->header.boot_cpuid_phys == i) Index: work/arch/ppc64/kernel/time.c =================================================================== --- work.orig/arch/ppc64/kernel/time.c +++ work/arch/ppc64/kernel/time.c @@ -472,7 +472,7 @@ int do_settimeofday(struct timespec *tv) EXPORT_SYMBOL(do_settimeofday); -#if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_MAPLE) || defined(CONFIG_PPC_BPA) +#if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_MAPLE) || defined(CONFIG_PPC_BPA) || defined(CONFIG_PPC_ISERIES) void __init generic_calibrate_decr(void) { struct device_node *cpu; From michael at ellerman.id.au Tue Jul 26 18:58:40 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:40 +1000 Subject: [PATCH 8/19] ppc64: Create a fake flat device tree on iSeries In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: 
<1122368320.735021.509798647893.qpatch@concordia> This patch adds infrastructure for creating a fake flattened device tree on iSeries. This is very much work-in-progress and a bit hacky. We also need to build prom.o for iSeries which means we'll always need it. --- arch/ppc64/kernel/Makefile | 4 - arch/ppc64/kernel/iSeries_setup.c | 131 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 133 insertions(+), 2 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -940,6 +940,135 @@ struct machdep_calls __initdata iseries_ .progress = iSeries_progress, }; +struct blob +{ + unsigned char data[PAGE_SIZE]; + unsigned long next; +}; + +struct iseries_flat_dt +{ + struct boot_param_header header; + u64 reserve_map[2]; + struct blob dt; + struct blob strings; +}; + +struct iseries_flat_dt iseries_dt; + +void dt_init(struct iseries_flat_dt *dt) +{ + unsigned char *dt_addr = (unsigned char *)dt; + + dt->header.off_mem_rsvmap = (unsigned char *)&dt->reserve_map - dt_addr; + dt->header.off_dt_struct = (unsigned char *)&dt->dt - dt_addr; + dt->header.off_dt_strings = (unsigned char *)&dt->strings - dt_addr; + dt->header.totalsize = sizeof(*dt); + + /* There is no notion of hardware cpu id on iSeries */ + dt->header.boot_cpuid_phys = smp_processor_id(); + + dt->dt.next = (unsigned long)&dt->dt.data; + dt->strings.next = (unsigned long)&dt->strings.data; + + dt->header.magic = OF_DT_HEADER; + dt->header.version = 0x2; + dt->header.last_comp_version = 0x2; + + dt->reserve_map[0] = 0; + dt->reserve_map[1] = 0; +} + +void dt_push_u32(struct iseries_flat_dt *dt, u32 value) +{ + *((u32*)dt->dt.next) = value; + dt->dt.next += sizeof(u32); + dt->dt.next = _ALIGN(dt->dt.next, sizeof(u32)); + BUG_ON(dt->dt.next - (unsigned long)dt->dt.data > sizeof(dt->dt.data)); +} + +void dt_push_u64(struct iseries_flat_dt *dt, u64 
value) +{ + *((u64*)dt->dt.next) = value; + dt->dt.next += sizeof(u64); + dt->dt.next = _ALIGN(dt->dt.next, sizeof(u64)); + BUG_ON(dt->dt.next - (unsigned long)dt->dt.data > sizeof(dt->dt.data)); +} + +void dt_push_str(struct blob *blob, char *str) +{ + memcpy((char *)blob->next, str, strlen(str) + 1); + blob->next += strlen(str) + 1; + blob->next = _ALIGN(blob->next, 4); +} + +#define str_offset(dt) (dt->strings.next - (unsigned long)&dt->strings) + +void dt_start_node(struct iseries_flat_dt *dt, char *name) +{ + dt_push_u32(dt, OF_DT_BEGIN_NODE); + dt_push_str(&dt->dt, name); +} + +#define dt_end_node(dt) dt_push_u32(dt, OF_DT_END_NODE) + +void dt_prop(struct iseries_flat_dt *dt, char *name, char *data, int len) +{ + dt_push_u32(dt, OF_DT_PROP); + + /* Length of the data */ + dt_push_u32(dt, len); + + /* The offset of the properties name in the string blob. */ + dt_push_u32(dt, str_offset(dt)); + + /* The actual data. */ + /* For versions below 16 the data is 4 or 8 byte aligned. */ + dt->dt.next = _ALIGN(dt->dt.next, len >= 8 ? 8 : 4); + memcpy((char *)dt->dt.next, data, len); + dt->dt.next += len; + /* and then 4 byte aligned after the data. */ + dt->dt.next = _ALIGN(dt->dt.next, 4); + + /* Put the property name in the string blob. 
*/ + dt_push_str(&dt->strings, name); +} + +void dt_prop_str(struct iseries_flat_dt *dt, char *name, char *data) +{ + dt_prop(dt, name, data, strlen(data) + 1); /* + 1 for NULL */ +} + +void dt_prop_u32(struct iseries_flat_dt *dt, char *name, u32 data) +{ + dt_prop(dt, name, (char *)&data, sizeof(u32)); +} + +void dt_prop_u64(struct iseries_flat_dt *dt, char *name, u64 data) +{ + dt_prop(dt, name, (char *)&data, sizeof(u64)); +} + +void dt_prop_u64_list(struct iseries_flat_dt *dt, char *name, u64 *data, int n) +{ + dt_prop(dt, name, (char *)data, sizeof(u64) * n); +} + +void dt_prop_empty(struct iseries_flat_dt *dt, char *name) +{ + dt_prop(dt, name, NULL, 0); +} + +void build_flat_dt(struct iseries_flat_dt *dt) +{ + dt_init(dt); + + dt_start_node(dt, "/"); + dt_end_node(dt); + + dt_push_u32(dt, OF_DT_END); +} + void __init iSeries_early_setup(void) { iSeries_fixup_klimit(); @@ -952,6 +1081,8 @@ void __init iSeries_early_setup(void) */ build_iSeries_Memory_Map(); + build_flat_dt(&iseries_dt); + /* Bolt mappings for all of memory (or some if we've got a limit) */ iSeries_bolt_kernel(0, systemcfg->physicalMemorySize); } Index: work/arch/ppc64/kernel/Makefile =================================================================== --- work.orig/arch/ppc64/kernel/Makefile +++ work/arch/ppc64/kernel/Makefile @@ -11,7 +11,7 @@ obj-y := setup.o entry.o t udbg.o binfmt_elf32.o sys_ppc32.o ioctl32.o \ ptrace32.o signal32.o rtc.o init_task.o \ lmb.o cputable.o cpu_setup_power4.o idle_power4.o \ - iommu.o sysfs.o vdso.o pmc.o + iommu.o sysfs.o vdso.o pmc.o prom.o obj-y += vdso32/ vdso64/ obj-$(CONFIG_PPC_OF) += of_device.o @@ -27,7 +27,7 @@ obj-$(CONFIG_PPC_ISERIES) += HvCall.o Hv mf.o HvLpEvent.o iSeries_proc.o iSeries_htab.o \ iSeries_iommu.o -obj-$(CONFIG_PPC_MULTIPLATFORM) += nvram.o i8259.o prom_init.o prom.o +obj-$(CONFIG_PPC_MULTIPLATFORM) += nvram.o i8259.o prom_init.o obj-$(CONFIG_PPC_PSERIES) += pSeries_pci.o pSeries_lpar.o pSeries_hvCall.o \ pSeries_nvram.o rtasd.o 
ras.o pSeries_reconfig.o \ From michael at ellerman.id.au Tue Jul 26 18:58:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:39 +1000 Subject: [PATCH 4/19] ppc64: unflatten_device_tree() should check if lmb_alloc() fails In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368319.861142.830513193305.qpatch@concordia> unflatten_device_tree() doesn't check if lmb_alloc() succeeds or not, it should. All it can do is panic, but at least there's an error message (assuming you have some sort of console at that point). --- arch/ppc64/kernel/prom.c | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) Index: work/arch/ppc64/kernel/prom.c =================================================================== --- work.orig/arch/ppc64/kernel/prom.c +++ work/arch/ppc64/kernel/prom.c @@ -847,8 +847,13 @@ void __init unflatten_device_tree(void) DBG(" size is %lx, allocating...\n", size); /* Allocate memory for the expanded device tree */ - mem = (unsigned long)abs_to_virt(lmb_alloc(size, - __alignof__(struct device_node))); + mem = lmb_alloc(size, __alignof__(struct device_node)); + if (!mem) { + DBG("Couldn't allocate memory with lmb_alloc()!\n"); + panic("Couldn't allocate memory with lmb_alloc()!\n"); + } + mem = (unsigned long)abs_to_virt(mem); + DBG(" unflattening...\n", mem); /* Second pass, do actual unflattening */ From michael at ellerman.id.au Tue Jul 26 18:58:40 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:40 +1000 Subject: [PATCH 6/19] ppc64: Build iSeries memory map and bolt mappings earlier In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368320.297166.489107309512.qpatch@concordia> Move the building of the iSeries memory map and the call to iSeries_bolt_kernel() into iSeries_early_setup(). This allows us to make Hypervisor calls much earlier during boot. 
--- arch/ppc64/kernel/iSeries_setup.c | 18 +++++++++--------- 1 files changed, 9 insertions(+), 9 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -355,12 +355,6 @@ static void __init iSeries_init_early(vo */ iommu_init_early_iSeries(); - /* - * Initialize the table which translate Linux physical addresses to - * AS/400 absolute addresses - */ - build_iSeries_Memory_Map(); - iSeries_get_cmdline(); /* Save unparsed command line copy for /proc/cmdline */ @@ -378,9 +372,6 @@ static void __init iSeries_init_early(vo } } - /* Bolt kernel mappings for all of memory (or just a bit if we've got a limit) */ - iSeries_bolt_kernel(0, systemcfg->physicalMemorySize); - lmb_init(); lmb_add(0, systemcfg->physicalMemorySize); lmb_analyze(); @@ -954,5 +945,14 @@ void __init iSeries_early_setup(void) iSeries_fixup_klimit(); ppc_md = iseries_md; + + /* + * Initialize the table which translate Linux physical addresses to + * AS/400 absolute addresses + */ + build_iSeries_Memory_Map(); + + /* Bolt mappings for all of memory (or some if we've got a limit) */ + iSeries_bolt_kernel(0, systemcfg->physicalMemorySize); } From michael at ellerman.id.au Tue Jul 26 18:58:42 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:42 +1000 Subject: [PATCH 16/19] ppc64: Define /cpus in iSeries device tree In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368322.491586.951657804888.qpatch@concordia> Add the /cpus node and nodes for each cpu, as well as cache size properties, reg property and linux,boot-cpu.
--- arch/ppc64/kernel/iSeries_setup.c | 48 +++++++++++++++++++++++++++++++++++++- 1 files changed, 47 insertions(+), 1 deletion(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -1047,6 +1047,51 @@ void dt_initrd(struct iseries_flat_dt *d #endif } +void dt_cpus(struct iseries_flat_dt *dt) +{ + unsigned char buf[32]; + unsigned char *p; + unsigned int i, index; + struct IoHriProcessorVpd *d; + + /* yuck */ + snprintf(buf, 32, "/cpus/PowerPC,%s", cur_cpu_spec->cpu_name); + p = strchr(buf, ' '); + if (!p) p = buf + strlen(buf); + + dt_start_node(dt, "/cpus"); + dt_prop_u32(dt, "#address-cells", 1); + dt_prop_u32(dt, "#size-cells", 0); + + for (i = 0; i < NR_CPUS; i++) { + if (paca[i].lppaca.dyn_proc_status >= 2) + continue; + + snprintf(p, 32 - (p - buf), "@%d", i); + dt_start_node(dt, buf); + + dt_prop_str(dt, "device_type", "cpu"); + + index = paca[i].lppaca.dyn_hv_phys_proc_index; + d = &xIoHriProcessorVpd[index]; + + dt_prop_u32(dt, "i-cache-size", d->xInstCacheSize * 1024); + dt_prop_u32(dt, "i-cache-line-size", d->xInstCacheOperandSize); + + dt_prop_u32(dt, "d-cache-size", d->xDataL1CacheSizeKB * 1024); + dt_prop_u32(dt, "d-cache-line-size", d->xDataCacheOperandSize); + + dt_prop_u32(dt, "reg", i); + + if (dt->header.boot_cpuid_phys == i) + dt_prop_empty(dt, "linux,boot-cpu"); + + dt_end_node(dt); + } + + dt_end_node(dt); +} + void build_flat_dt(struct iseries_flat_dt *dt) { u64 tmp[2]; @@ -1075,9 +1120,10 @@ void build_flat_dt(struct iseries_flat_d if (iseries_memory_limit) dt_prop_u64(dt, "linux,memory-limit", iseries_memory_limit); dt_initrd(dt); - dt_end_node(dt); + dt_cpus(dt); + dt_end_node(dt); dt_push_u32(dt, OF_DT_END); From michael at ellerman.id.au Tue Jul 26 18:58:41 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:41 +1000 Subject: [PATCH 9/19] ppc64: 
Incorporate the iSeries device tree into the bootup sequence In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368321.24640.664081475565.qpatch@concordia> This patch adds a call to early_init_devtree() on the fake device tree, and also changes some ifdefs in setup_system() so we call unflatten_device_tree() on iSeries also. --- arch/ppc64/kernel/iSeries_setup.c | 1 + arch/ppc64/kernel/setup.c | 8 ++------ include/asm-ppc64/prom.h | 1 + 3 files changed, 4 insertions(+), 6 deletions(-) Index: work/arch/ppc64/kernel/setup.c =================================================================== --- work.orig/arch/ppc64/kernel/setup.c +++ work/arch/ppc64/kernel/setup.c @@ -447,6 +447,7 @@ void __init early_setup(unsigned long dt DBG(" <- early_setup()\n"); } +#endif /* CONFIG_PPC_MULTIPLATFORM */ /* * Initialize some remaining members of the ppc64_caches and systemcfg structures @@ -563,8 +564,6 @@ static void __init check_for_initrd(void #endif /* CONFIG_BLK_DEV_INITRD */ } -#endif /* CONFIG_PPC_MULTIPLATFORM */ - /* * Do some initial setup of the system. The parameters are those which * were passed in from the bootloader. @@ -577,9 +576,7 @@ void __init setup_system(void) /* pSeries systems are identified in prom.c via OF. 
*/ if (itLpNaca.xLparInstalled == 1) systemcfg->platform = PLATFORM_ISERIES_LPAR; - - ppc_md.init_early(); -#else /* CONFIG_PPC_ISERIES */ +#endif /* CONFIG_PPC_ISERIES */ /* * Unflatten the device-tree passed by prom_init or kexec @@ -639,7 +636,6 @@ void __init setup_system(void) strlcpy(saved_command_line, cmd_line, COMMAND_LINE_SIZE); parse_early_param(); -#endif /* !CONFIG_PPC_ISERIES */ #if defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) /* Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -1082,6 +1082,7 @@ void __init iSeries_early_setup(void) build_iSeries_Memory_Map(); build_flat_dt(&iseries_dt); + early_init_devtree(&iseries_dt); /* Bolt mappings for all of memory (or some if we've got a limit) */ iSeries_bolt_kernel(0, systemcfg->physicalMemorySize); Index: work/include/asm-ppc64/prom.h =================================================================== --- work.orig/include/asm-ppc64/prom.h +++ work/include/asm-ppc64/prom.h @@ -215,5 +215,6 @@ extern int prom_n_size_cells(struct devi extern int prom_n_intr_cells(struct device_node* np); extern void prom_get_irq_senses(unsigned char *senses, int off, int max); extern void prom_add_property(struct device_node* np, struct property* prop); +extern void early_init_devtree(void *params); #endif /* _PPC64_PROM_H */ From michael at ellerman.id.au Tue Jul 26 18:58:41 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:41 +1000 Subject: [PATCH 12/19] ppc64: Move setup of systemcfg->platform into iSeries device tree In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368321.637334.382616961338.qpatch@concordia> Add /chosen/linux,platform to the device tree so we can remove iSeries specific code in setup_system() to set systemcfg->platform. 
--- arch/ppc64/kernel/iSeries_setup.c | 5 +++++ arch/ppc64/kernel/setup.c | 6 ------ 2 files changed, 5 insertions(+), 6 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -1076,6 +1076,11 @@ void build_flat_dt(struct iseries_flat_d dt_prop_u64_list(dt, "reg", tmp, 2); dt_end_node(dt); + /* /chosen */ + dt_start_node(dt, "/chosen"); + dt_prop_u32(dt, "linux,platform", PLATFORM_ISERIES_LPAR); + dt_end_node(dt); + dt_end_node(dt); dt_push_u32(dt, OF_DT_END); Index: work/arch/ppc64/kernel/setup.c =================================================================== --- work.orig/arch/ppc64/kernel/setup.c +++ work/arch/ppc64/kernel/setup.c @@ -572,12 +572,6 @@ void __init setup_system(void) { DBG(" -> setup_system()\n"); -#ifdef CONFIG_PPC_ISERIES - /* pSeries systems are identified in prom.c via OF. */ - if (itLpNaca.xLparInstalled == 1) - systemcfg->platform = PLATFORM_ISERIES_LPAR; -#endif /* CONFIG_PPC_ISERIES */ - /* * Unflatten the device-tree passed by prom_init or kexec */ From michael at ellerman.id.au Tue Jul 26 18:58:39 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:39 +1000 Subject: [PATCH 1/19] ppc64: Add strne2a() to convert a string from EBCDIC to ASCII In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368319.1797.266563500218.qpatch@concordia> This patch adds strne2a() which converts a string from EBCDIC to ASCII. 
--- arch/ppc64/lib/e2a.c | 10 ++++++++++ include/asm-ppc64/system.h | 2 ++ 2 files changed, 12 insertions(+) Index: work/arch/ppc64/lib/e2a.c =================================================================== --- work.orig/arch/ppc64/lib/e2a.c +++ work/arch/ppc64/lib/e2a.c @@ -105,4 +105,14 @@ unsigned char e2a(unsigned char x) } EXPORT_SYMBOL(e2a); +unsigned char* strne2a(unsigned char *dest, const unsigned char *src, size_t n) +{ + int i; + n = min(n, strlen(src)); + + for (i = 0; i < n; i++) + dest[i] = e2a(src[i]); + + return dest; +} Index: work/include/asm-ppc64/system.h =================================================================== --- work.orig/include/asm-ppc64/system.h +++ work/include/asm-ppc64/system.h @@ -132,6 +132,8 @@ extern int mem_init_done; /* set on boot /* EBCDIC -> ASCII conversion for [0-9A-Z] on iSeries */ extern unsigned char e2a(unsigned char); +extern unsigned char* strne2a(unsigned char *dest, + const unsigned char *src, size_t n); extern struct task_struct *__switch_to(struct task_struct *, struct task_struct *); From benh at kernel.crashing.org Wed Jul 27 00:56:49 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 26 Jul 2005 10:56:49 -0400 Subject: [RFC] Add flattened device tree for iSeries In-Reply-To: <200507261857.31494.michael@ellerman.id.au> References: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122389809.14393.184.camel@gaston> On Tue, 2005-07-26 at 18:57 +1000, Michael Ellerman wrote: > Hi, > > This series of patches adds a flattened device tree into the iSeries startup > code. This already allows us to remove a reasonable amount of iSeries > specific code, and hopefully more will go in the future. > > Most of this is not ready to merge, but just here for early feedback. Cool, but according to the numbering, A lot of your patches never made it to me (maybe it's just the list though). Can you put them online somewhere ? Ben. 
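A note on the strne2a() helper from [PATCH 1/19] above: as posted, it converts min(n, strlen(src)) bytes and does not appear to write a terminating NUL, so the caller has to terminate dest itself. The following is a minimal userspace sketch of a variant that always terminates. The names e2a_sketch() and strne2a_terminated() are hypothetical, the table covers only [0-9A-Z] and space, and mapping everything else to '?' is an assumption of this sketch rather than the behaviour of the kernel's e2a().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Map one EBCDIC byte to ASCII for digits, upper-case letters and space.
 * Anything else becomes '?'; that fallback is an assumption of this
 * sketch, not necessarily what the kernel's e2a() returns.
 */
static unsigned char e2a_sketch(unsigned char x)
{
	if (x >= 0xF0 && x <= 0xF9)	/* '0'..'9' */
		return (unsigned char)('0' + (x - 0xF0));
	if (x >= 0xC1 && x <= 0xC9)	/* 'A'..'I' */
		return (unsigned char)('A' + (x - 0xC1));
	if (x >= 0xD1 && x <= 0xD9)	/* 'J'..'R' */
		return (unsigned char)('J' + (x - 0xD1));
	if (x >= 0xE2 && x <= 0xE9)	/* 'S'..'Z' */
		return (unsigned char)('S' + (x - 0xE2));
	if (x == 0x40)			/* EBCDIC space */
		return ' ';
	return '?';
}

/*
 * Convert at most n bytes of the NUL-terminated EBCDIC string src and
 * always terminate dest, so dest needs room for n + 1 bytes.
 */
static unsigned char *strne2a_terminated(unsigned char *dest,
					 const unsigned char *src, size_t n)
{
	size_t i;

	for (i = 0; i < n && src[i] != '\0'; i++)
		dest[i] = e2a_sketch(src[i]);
	dest[i] = '\0';

	return dest;
}
```

Using size_t for the loop index also avoids the signed/unsigned comparison between "int i" and "size_t n" in the posted version. Whether the in-kernel callers would rather have strncpy-like semantics is a design choice this sketch does not try to settle.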
From ntl at pobox.com Wed Jul 27 01:21:14 2005 From: ntl at pobox.com (Nathan Lynch) Date: Tue, 26 Jul 2005 10:21:14 -0500 Subject: [PATCH 19/19] ppc64: Use setup_cpu_maps() instead of iSeries specific smp code In-Reply-To: <1122368323.243384.202065854128.qpatch@concordia> References: <200507261857.31494.michael@ellerman.id.au> <1122368323.243384.202065854128.qpatch@concordia> Message-ID: <20050726152114.GD30720@otto> Michael Ellerman wrote: > Call setup_cpu_maps() on iSeries which will setup the cpu maps from > the device tree. This removes the need for smp_iSeries_numProcs() > and makes smp_iSeries_probe() a one-liner. Have you checked whether the maxcpus= boot argument works on iSeries? Just curious. Patch looks fine. Nathan From michael at ellerman.id.au Tue Jul 26 18:58:40 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:40 +1000 Subject: [PATCH 5/19] ppc64: Move iSeries ppc_md functions into a machdep_calls struct In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368320.116853.395489071698.qpatch@concordia> Move the iSeries machine specific calls into a machdep_calls struct like other platforms, rather than setting members of ppc_md explicitly. 
--- arch/ppc64/kernel/iSeries_setup.c | 53 ++++++++++++++++++++------------------ 1 files changed, 28 insertions(+), 25 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -75,6 +75,8 @@ extern void ppcdbg_initialize(void); static void build_iSeries_Memory_Map(void); static void setup_iSeries_cache_sizes(void); static void iSeries_bolt_kernel(unsigned long saddr, unsigned long eaddr); +static int iseries_shared_idle(void); +static int iseries_dedicated_idle(void); #ifdef CONFIG_PCI extern void iSeries_pci_final_fixup(void); #else @@ -675,6 +677,14 @@ static void __init iSeries_setup_arch(vo { unsigned procIx = get_paca()->lppaca.dyn_hv_phys_proc_index; + if (get_paca()->lppaca.shared_proc) { + ppc_md.idle_loop = iseries_shared_idle; + printk(KERN_INFO "Using shared processor idle loop\n"); + } else { + ppc_md.idle_loop = iseries_dedicated_idle; + printk(KERN_INFO "Using dedicated idle loop\n"); + } + /* Add an eye catcher and the systemcfg layout version number */ strcpy(systemcfg->eye_catcher, "SYSTEMCFG:PPC64"); systemcfg->version.major = SYSTEMCFG_MAJOR; @@ -922,34 +932,27 @@ static int iseries_dedicated_idle(void) void __init iSeries_init_IRQ(void) { } #endif +struct machdep_calls __initdata iseries_md = { + .setup_arch = iSeries_setup_arch, + .get_cpuinfo = iSeries_get_cpuinfo, + .init_IRQ = iSeries_init_IRQ, + .get_irq = iSeries_get_irq, + .init_early = iSeries_init_early, + .pcibios_fixup = iSeries_pci_final_fixup, + .restart = iSeries_restart, + .power_off = iSeries_power_off, + .halt = iSeries_halt, + .get_boot_time = iSeries_get_boot_time, + .set_rtc_time = iSeries_set_rtc_time, + .get_rtc_time = iSeries_get_rtc_time, + .calibrate_decr = iSeries_calibrate_decr, + .progress = iSeries_progress, +}; + void __init iSeries_early_setup(void) { iSeries_fixup_klimit(); - ppc_md.setup_arch = 
iSeries_setup_arch; - ppc_md.get_cpuinfo = iSeries_get_cpuinfo; - ppc_md.init_IRQ = iSeries_init_IRQ; - ppc_md.get_irq = iSeries_get_irq; - ppc_md.init_early = iSeries_init_early, - - ppc_md.pcibios_fixup = iSeries_pci_final_fixup; - - ppc_md.restart = iSeries_restart; - ppc_md.power_off = iSeries_power_off; - ppc_md.halt = iSeries_halt; - - ppc_md.get_boot_time = iSeries_get_boot_time; - ppc_md.set_rtc_time = iSeries_set_rtc_time; - ppc_md.get_rtc_time = iSeries_get_rtc_time; - ppc_md.calibrate_decr = iSeries_calibrate_decr; - ppc_md.progress = iSeries_progress; - - if (get_paca()->lppaca.shared_proc) { - ppc_md.idle_loop = iseries_shared_idle; - printk(KERN_INFO "Using shared processor idle loop\n"); - } else { - ppc_md.idle_loop = iseries_dedicated_idle; - printk(KERN_INFO "Using dedicated idle loop\n"); - } + ppc_md = iseries_md; } From benh at kernel.crashing.org Wed Jul 27 01:58:30 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 26 Jul 2005 11:58:30 -0400 Subject: [PATCH 15/19] ppc64: Move iSeries initrd logic into device tree In-Reply-To: <1122368322.249925.69098618987.qpatch@concordia> References: <1122368322.249925.69098618987.qpatch@concordia> Message-ID: <1122393511.14393.187.camel@gaston> On Tue, 2005-07-26 at 18:58 +1000, Michael Ellerman wrote: > + /* XXX This came from iSeries code, make sure ram disk is big enough */ > + if ((rd_size * 1024) < (initrd_end - initrd_start)) > + rd_size = (initrd_end - initrd_start) / 1024; > + Will probably not work in practice. initrd is compressed at this point, but the rd_size is the uncompressed max size. Also, we can use initrd_start to pass an initramfs, in which case rd_size is irrelevant. I'd say just ditch it. Ben. 
From michael at ellerman.id.au Tue Jul 26 18:58:41 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:41 +1000 Subject: [PATCH 11/19] ppc64: Move memory setup into iSeries device tree In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368321.364864.719382515520.qpatch@concordia> This patch adds the required nodes to the iSeries device tree to allow early_init_devtree() to do the lmb setup for us. --- arch/ppc64/kernel/iSeries_setup.c | 20 +++++++++++++++----- 1 files changed, 15 insertions(+), 5 deletions(-) Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -374,11 +374,6 @@ static void __init iSeries_init_early(vo } } - lmb_init(); - lmb_add(0, systemcfg->physicalMemorySize); - lmb_analyze(); - lmb_reserve(0, __pa(klimit)); - /* Initialize machine-dependency vectors */ #ifdef CONFIG_SMP smp_init_iSeries(); @@ -1063,9 +1058,24 @@ void dt_prop_empty(struct iseries_flat_d void build_flat_dt(struct iseries_flat_dt *dt) { + u64 tmp[2]; + dt_init(dt); dt_start_node(dt, "/"); + + dt_prop_u32(dt, "#address-cells", 2); + dt_prop_u32(dt, "#size-cells", 2); + + /* /memory */ + dt_start_node(dt, "/memory at 0"); + dt_prop_str(dt, "name", "memory"); + dt_prop_str(dt, "device_type", "memory"); + tmp[0] = 0; + tmp[1] = systemcfg->physicalMemorySize; + dt_prop_u64_list(dt, "reg", tmp, 2); + dt_end_node(dt); + dt_end_node(dt); dt_push_u32(dt, OF_DT_END); From utz.bacher at de.ibm.com Wed Jul 27 05:20:08 2005 From: utz.bacher at de.ibm.com (Utz Bacher) Date: Tue, 26 Jul 2005 21:20:08 +0200 (CEST) Subject: [PATCH] RTAS console driver Message-ID: RTAS console driver This RTAS console driver can be used by all machines that abstract the system console through the {get,put}-term-char interface. 
This code should go into the hvc console driver some time, so it's just for review and early usage on BPA platform systems; it's not intended for inclusion at this time. Signed-off-by: Utz Bacher diff -ruNp linux-2.6.13-rc3-old/drivers/char/Kconfig linux-2.6.13-rc3-new/drivers/char/Kconfig --- linux-2.6.13-rc3-old/drivers/char/Kconfig 2005-07-26 19:52:40.000000000 +0200 +++ linux-2.6.13-rc3-new/drivers/char/Kconfig 2005-07-26 20:00:58.000000000 +0200 @@ -560,6 +560,12 @@ config HVC_CONSOLE console. This driver allows each pSeries partition to have a console which is accessed via the HMC. +config RTASCONS + bool "RTAS firmware console support" + depends on PPC_RTAS + help + RTAS console support. + config HVCS tristate "IBM Hypervisor Virtual Console Server support" depends on PPC_PSERIES diff -ruNp linux-2.6.13-rc3-old/drivers/char/Makefile linux-2.6.13-rc3-new/drivers/char/Makefile --- linux-2.6.13-rc3-old/drivers/char/Makefile 2005-07-26 19:52:40.000000000 +0200 +++ linux-2.6.13-rc3-new/drivers/char/Makefile 2005-07-26 20:06:16.000000000 +0200 @@ -40,6 +40,7 @@ obj-$(CONFIG_N_HDLC) += n_hdlc.o obj-$(CONFIG_AMIGA_BUILTIN_SERIAL) += amiserial.o obj-$(CONFIG_SX) += sx.o generic_serial.o obj-$(CONFIG_RIO) += rio/ generic_serial.o +obj-$(CONFIG_RTASCONS) += rtascons.o obj-$(CONFIG_HVC_CONSOLE) += hvc_console.o hvc_vio.o hvsi.o obj-$(CONFIG_RAW_DRIVER) += raw.o obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o diff -ruNp linux-2.6.13-rc3-old/drivers/char/rtascons.c linux-2.6.13-rc3-new/drivers/char/rtascons.c --- linux-2.6.13-rc3-old/drivers/char/rtascons.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.13-rc3-new/drivers/char/rtascons.c 2005-07-26 20:07:06.000000000 +0200 @@ -0,0 +1,339 @@ +/* + * console driver using RTAS calls + * + * (C) Copyright IBM Corp. 
2005 + * RTAS console driver + * + * Author: Utz Bacher + * + * inspired by drivers/char/hvc_console.c + * written by Anton Blanchard and Paul Mackerras + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +/* The whole driver assumes we only have one RTAS console. This makes + * things pretty easy. */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +#define RTASCONS_MAJOR 229 +#define RTASCONS_MINOR 0 + +#define RTASCONS_SYSRQ_KEY '\x0f' + +#define RTASCONS_PUT_ATTEMPTS 16 +#define RTASCONS_PUT_DELAY 100 +#define RTASCONS_BUFFER_SIZE 4096 + +#define RTASCONS_MAX_POLL 50 +#define RTASCONS_WRITE_ROOM 200 + +#define RTASCONS_TIMEOUT ((HZ + 99) / 100) + + +static struct tty_driver *rtascons_ttydriver; + +static atomic_t rtascons_usecount = ATOMIC_INIT(0); +static struct tty_struct *rtascons_tty; + +static int rtascons_put_char_token; +static int rtascons_get_char_token; + +static spinlock_t rtascons_buffer_lock = SPIN_LOCK_UNLOCKED; +static char rtascons_buffer[RTASCONS_BUFFER_SIZE]; +static int rtascons_buffer_head = 0; +static int rtascons_buffer_used = 0; + +static int +rtascons_get_char(void) +{ + int result; + + if (rtas_call(rtascons_get_char_token, 0, 2, &result)) + result = -1; + + return result; +} + +/* assumes that 
rtascons_buffer_lock is held */ +static void +rtascons_flush_chars(void) +{ + int result; + int attempts = RTASCONS_PUT_ATTEMPTS; + + /* if there is more than one character to be displayed, wait a bit */ + for (; rtascons_buffer_used && attempts; udelay(RTASCONS_PUT_DELAY)) { + attempts--; + result = rtas_call(rtascons_put_char_token, 1, 1, NULL, + rtascons_buffer[rtascons_buffer_head]); + + if (!result) { + rtascons_buffer_head = (rtascons_buffer_head + 1) % + RTASCONS_BUFFER_SIZE; + rtascons_buffer_used--; + } + } +} + +static void +rtascons_put_char(char c) +{ + spin_lock(&rtascons_buffer_lock); + + if (rtascons_buffer_used >= (RTASCONS_BUFFER_SIZE / 2)) + udelay(RTASCONS_PUT_DELAY); /* slow down if buffer tends + to get full */ + + if (rtascons_buffer_used >= RTASCONS_BUFFER_SIZE) + goto out; /* we're loosing characters. */ + + /* enqueue character */ + rtascons_buffer[(rtascons_buffer_head + rtascons_buffer_used) % + RTASCONS_BUFFER_SIZE] = c; + rtascons_buffer_used++; +out: + rtascons_flush_chars(); + + spin_unlock(&rtascons_buffer_lock); +} + +static void +rtascons_print_str(const char *buf, int count) +{ + int i = 0; + while (i < count) { + rtascons_put_char(buf[i]); + if (buf[i] == '\n') + rtascons_put_char('\r'); + i++; + } +} + +static int +rtascons_open(struct tty_struct *tty, struct file *filp) +{ + /* only one console */ + if (tty->index) { + /* close will be called and that decrement */ + atomic_inc(&rtascons_usecount); + return -ENODEV; + } + + if (atomic_inc_return(&rtascons_usecount) == 1) { + rtascons_tty = tty; + } + + tty->driver_data = &rtascons_ttydriver; + + return 0; +} + +static void +rtascons_close(struct tty_struct *tty, struct file * filp) +{ + atomic_dec(&rtascons_usecount); +} + +static void +rtascons_hangup(struct tty_struct *tty) +{ +} + +static int +rtascons_write(struct tty_struct *tty, const unsigned char *buf, int count) +{ + if (!atomic_read(&rtascons_usecount)) + return 0; + + rtascons_print_str(buf, count); + + return count; 
+} + +static int +rtascons_write_room(struct tty_struct *tty) +{ + return RTASCONS_WRITE_ROOM; +} + +static int +rtascons_chars_in_buffer(struct tty_struct *tty) +{ + return 0; +} + +static void +rtascons_poll(void) +{ + int i; + int do_poll = RTASCONS_MAX_POLL; +#ifdef CONFIG_MAGIC_SYSRQ + static int sysrq_pressed = 0; +#endif /* CONFIG_MAGIC_SYSRQ */ + + if (!atomic_read(&rtascons_usecount)) + return; + + while (do_poll--) { + i = rtascons_get_char(); + if (i < 0) + break; + +#ifdef CONFIG_MAGIC_SYSRQ + if (i == RTASCONS_SYSRQ_KEY) { + sysrq_pressed = 1; + continue; + } else if (sysrq_pressed) { + handle_sysrq(i, NULL, rtascons_tty); + sysrq_pressed = 0; + continue; + } +#endif /* CONFIG_MAGIC_SYSRQ */ + + tty_insert_flip_char(rtascons_tty, (unsigned char) i, 0); + } + + tty_flip_buffer_push(rtascons_tty); +} + +#if defined(CONFIG_XMON) && defined(CONFIG_SMP) +extern cpumask_t cpus_in_xmon; +#else +static const cpumask_t cpus_in_xmon = CPU_MASK_NONE; +#endif + +static int +krtasconsd(void *unused) +{ + daemonize("krtasconsd"); + + for (;;) { + if (cpus_empty(cpus_in_xmon)) { + rtascons_poll(); + /* no need for atomic access */ + if (rtascons_buffer_used) { + spin_lock(&rtascons_buffer_lock); + rtascons_flush_chars(); + spin_unlock(&rtascons_buffer_lock); + } + } + + set_current_state(TASK_INTERRUPTIBLE); + schedule_timeout(RTASCONS_TIMEOUT); + } +} + +static struct tty_operations rtascons_ops = { + .open = rtascons_open, + .close = rtascons_close, + .write = rtascons_write, + .hangup = rtascons_hangup, + .write_room = rtascons_write_room, + .chars_in_buffer = rtascons_chars_in_buffer, +}; + +static int __init +rtascons_init(void) +{ + rtascons_ttydriver = alloc_tty_driver(1); + if (!rtascons_ttydriver) + return -ENOMEM; + + rtascons_ttydriver->owner = THIS_MODULE; + rtascons_ttydriver->devfs_name = "rtascons/"; + rtascons_ttydriver->driver_name = "rtascons"; + rtascons_ttydriver->name = "rtascons"; + rtascons_ttydriver->major = RTASCONS_MAJOR; + 
rtascons_ttydriver->minor_start = RTASCONS_MINOR; + rtascons_ttydriver->type = TTY_DRIVER_TYPE_SYSTEM; + rtascons_ttydriver->subtype = SYSTEM_TYPE_CONSOLE; + rtascons_ttydriver->init_termios = tty_std_termios; + rtascons_ttydriver->flags = TTY_DRIVER_REAL_RAW; + tty_set_operations(rtascons_ttydriver, &rtascons_ops); + + if (tty_register_driver(rtascons_ttydriver)) + panic("Couldn't register RTAS console driver\n"); + + kernel_thread(krtasconsd, NULL, CLONE_KERNEL); + + return 0; +} + +static void __exit +rtascons_exit(void) +{ +} + +static void +rtascons_print(struct console *con, const char *buf, unsigned count) +{ + rtascons_print_str(buf, count); +} + +static struct tty_driver *rtascons_device(struct console *con, int *index) +{ + *index = con->index; + return rtascons_ttydriver; +} + +static int __init +rtascons_setup(struct console *con, char *options) +{ + return (con->index); +} + +struct console rtascons_driver = { + .name = "rtas", + .write = rtascons_print, + .device = rtascons_device, + .setup = rtascons_setup, + .flags = CON_PRINTBUFFER, + .index = -1, +}; + +static int __init +rtascons_register(void) +{ + rtascons_put_char_token = rtas_token("put-term-char"); + if (rtascons_put_char_token == -1) + return -EIO; + rtascons_get_char_token = rtas_token("get-term-char"); + if (rtascons_get_char_token == -1) + return -EIO; + + register_console(&rtascons_driver); + return 0; +} + +console_initcall(rtascons_register); + +module_init(rtascons_init); +module_exit(rtascons_exit); From michael at ellerman.id.au Tue Jul 26 18:58:41 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 26 Jul 2005 18:58:41 +1000 Subject: [PATCH 10/19] ppc64: Make some generic irq code compile for iSeries In-Reply-To: <200507261857.31494.michael@ellerman.id.au> Message-ID: <1122368321.169865.178188119708.qpatch@concordia> In order to call finish_device_tree() on iSeries we need virt_irq_create_mapping() linked. We also need to set ppc64_interrupt_controller to !0.
If we want to do interrupt setup via the device tree this code will need some serious work, but it's harmless to have it there as long as the iSeries device tree doesn't invoke it. --- arch/ppc64/kernel/iSeries_irq.c | 2 +- arch/ppc64/kernel/iSeries_setup.c | 2 ++ arch/ppc64/kernel/irq.c | 5 ++++- include/asm-ppc64/processor.h | 1 + 4 files changed, 8 insertions(+), 2 deletions(-) Index: work/arch/ppc64/kernel/iSeries_irq.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_irq.c +++ work/arch/ppc64/kernel/iSeries_irq.c @@ -43,7 +43,7 @@ #include /* This maps virtual irq numbers to real irqs */ -unsigned int virt_irq_to_real_map[NR_IRQS]; +extern unsigned int virt_irq_to_real_map[NR_IRQS]; /* The next available virtual irq number */ /* Note: the pcnet32 driver assumes irq numbers < 2 aren't valid. :( */ Index: work/arch/ppc64/kernel/irq.c =================================================================== --- work.orig/arch/ppc64/kernel/irq.c +++ work/arch/ppc64/kernel/irq.c @@ -353,7 +353,6 @@ void __init init_IRQ(void) irq_ctx_init(); } -#ifndef CONFIG_PPC_ISERIES /* * Virtual IRQ mapping code, used on systems with XICS interrupt controllers. */ @@ -389,6 +388,9 @@ int virt_irq_create_mapping(unsigned int unsigned int virq, first_virq; static int warned; + if (ppc64_interrupt_controller == IC_ISERIES) + return real_irq; /* XXX this is broken for real work */ + if (ppc64_interrupt_controller == IC_OPEN_PIC) return real_irq; /* no mapping for openpic (for now) */ @@ -430,6 +432,7 @@ int virt_irq_create_mapping(unsigned int return NO_IRQ; } +#ifndef CONFIG_PPC_ISERIES /* * In most cases will get a hit on the very first slot checked in the * virt_irq_to_real_map. 
Only when there are a large number of Index: work/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- work.orig/arch/ppc64/kernel/iSeries_setup.c +++ work/arch/ppc64/kernel/iSeries_setup.c @@ -318,6 +318,8 @@ static void __init iSeries_init_early(vo ppcdbg_initialize(); + ppc64_interrupt_controller = IC_ISERIES; + #if defined(CONFIG_BLK_DEV_INITRD) /* * If the init RAM disk has been configured and there is Index: work/include/asm-ppc64/processor.h =================================================================== --- work.orig/include/asm-ppc64/processor.h +++ work/include/asm-ppc64/processor.h @@ -290,6 +290,7 @@ #define IC_OPEN_PIC 1 #define IC_PPC_XIC 2 #define IC_BPA_IIC 3 +#define IC_ISERIES 4 #define XGLUE(a,b) a##b #define GLUE(a,b) XGLUE(a,b) From michael at ellerman.id.au Wed Jul 27 11:55:44 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Wed, 27 Jul 2005 11:55:44 +1000 Subject: [RFC] Add flattened device tree for iSeries In-Reply-To: <1122389809.14393.184.camel@gaston> References: <200507261857.31494.michael@ellerman.id.au> <1122389809.14393.184.camel@gaston> Message-ID: <200507271155.49748.michael@ellerman.id.au> On Wed, 27 Jul 2005 00:56, Benjamin Herrenschmidt wrote: > On Tue, 2005-07-26 at 18:57 +1000, Michael Ellerman wrote: > > Hi, > > > > This series of patches adds a flattened device tree into the iSeries > > startup code. This already allows us to remove a reasonable amount of > > iSeries specific code, and hopefully more will go in the future. > > > > Most of this is not ready to merge, but just here for early feedback. > > Cool, but according to the numbering, A lot of your patches never made > it to me (maybe it's just the list though). Can you put them online > somewhere ? > > Ben. It looks like they took a while to land, but I see them all now. They're up on http://patchwork.ozlabs.org also. 
cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From david at gibson.dropbear.id.au Wed Jul 27 12:04:04 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 27 Jul 2005 12:04:04 +1000 Subject: [PATCH 8/19] ppc64: Create a fake flat device tree on iSeries In-Reply-To: <1122368320.735021.509798647893.qpatch@concordia> References: <200507261857.31494.michael@ellerman.id.au> <1122368320.735021.509798647893.qpatch@concordia> Message-ID: <20050727020404.GA27870@localhost.localdomain> On Tue, Jul 26, 2005 at 06:58:40PM +1000, Michael Ellerman wrote: > This patch adds infrastructure for creating a fake flattened device tree > on iSeries. This is very much work-in-progress and a bit hacky. > > We also need to build prom.o for iSeries which means we'll always need it. > > --- > > arch/ppc64/kernel/Makefile | 4 - > arch/ppc64/kernel/iSeries_setup.c | 131 ++++++++++++++++++++++++++++++++++++++ > 2 files changed, 133 insertions(+), 2 deletions(-) > > Index: work/arch/ppc64/kernel/iSeries_setup.c > =================================================================== > --- work.orig/arch/ppc64/kernel/iSeries_setup.c > +++ work/arch/ppc64/kernel/iSeries_setup.c > @@ -940,6 +940,135 @@ struct machdep_calls __initdata iseries_ > .progress = iSeries_progress, > }; > > +struct blob > +{ { generally goes on the same line as the struct name.
> + unsigned char data[PAGE_SIZE]; > + unsigned long next; > +}; > + > +struct iseries_flat_dt > +{ > + struct boot_param_header header; > + u64 reserve_map[2]; > + struct blob dt; > + struct blob strings; > +}; > + > +struct iseries_flat_dt iseries_dt; > + > +void dt_init(struct iseries_flat_dt *dt) > +{ > + unsigned char *dt_addr = (unsigned char *)dt; > + > + dt->header.off_mem_rsvmap = (unsigned char *)&dt->reserve_map - dt_addr; > + dt->header.off_dt_struct = (unsigned char *)&dt->dt - dt_addr; > + dt->header.off_dt_strings = (unsigned char *)&dt->strings - dt_addr; The offsetof macro would make these a little cleaner. > + dt->header.totalsize = sizeof(*dt); > + > + /* There is no notion of hardware cpu id on iSeries */ > + dt->header.boot_cpuid_phys = smp_processor_id(); > + > + dt->dt.next = (unsigned long)&dt->dt.data; > + dt->strings.next = (unsigned long)&dt->strings.data; > + > + dt->header.magic = OF_DT_HEADER; > + dt->header.version = 0x2; > + dt->header.last_comp_version = 0x2; > + > + dt->reserve_map[0] = 0; > + dt->reserve_map[1] = 0; > +} > + > +void dt_push_u32(struct iseries_flat_dt *dt, u32 value) > +{ > + *((u32*)dt->dt.next) = value; > + dt->dt.next += sizeof(u32); > + dt->dt.next = _ALIGN(dt->dt.next, sizeof(u32)); If you need an ALIGN here at all, surely it should go before you emit the value. > + BUG_ON(dt->dt.next - (unsigned long)dt->dt.data > sizeof(dt->dt.data)); > +} > + > +void dt_push_u64(struct iseries_flat_dt *dt, u64 value) > +{ > + *((u64*)dt->dt.next) = value; > + dt->dt.next += sizeof(u64); > + dt->dt.next = _ALIGN(dt->dt.next, sizeof(u64)); Again, I don't see that u64 aligning, after the value makes any sense. 
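The two review points above (offsetof for the header offsets, and aligning before a value is emitted rather than after) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the patch's code: struct dt_header, struct flat_dt, flat_dt_init(), align_up() and blob_push() are hypothetical names, the header is trimmed to four fields, and next is kept as a byte offset into data[] instead of an absolute address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOB_SIZE 4096

/* Hypothetical stand-in for the patch's struct blob; `next` is a byte
 * offset into data[] here, not an absolute address. */
struct blob {
	unsigned char data[BLOB_SIZE];
	size_t next;
};

/* Hypothetical, trimmed-down header: only the offset fields matter here. */
struct dt_header {
	uint32_t totalsize;
	uint32_t off_dt_struct;
	uint32_t off_dt_strings;
	uint32_t off_mem_rsvmap;
};

struct flat_dt {
	struct dt_header header;
	uint64_t reserve_map[2];
	struct blob dt;
	struct blob strings;
};

/* Fill in the header offsets with offsetof instead of pointer arithmetic. */
static void flat_dt_init(struct flat_dt *dt)
{
	memset(dt, 0, sizeof(*dt));
	dt->header.off_mem_rsvmap = offsetof(struct flat_dt, reserve_map);
	dt->header.off_dt_struct = offsetof(struct flat_dt, dt);
	dt->header.off_dt_strings = offsetof(struct flat_dt, strings);
	dt->header.totalsize = sizeof(*dt);
}

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Align first, then emit, so the value itself starts on its natural
 * boundary. size must be a power of two (4 or 8 here). */
static void blob_push(struct blob *b, const void *val, size_t size)
{
	b->next = align_up(b->next, size);
	assert(b->next + size <= sizeof(b->data));
	memcpy(b->data + b->next, val, size);
	b->next += size;
}

static void blob_push_u32(struct blob *b, uint32_t v)
{
	blob_push(b, &v, sizeof(v));
}

static void blob_push_u64(struct blob *b, uint64_t v)
{
	blob_push(b, &v, sizeof(v));
}
```

With this ordering, pushing a u32 followed by a u64 leaves four bytes of padding before the u64, so the u64 itself starts on an 8-byte boundary. Aligning after the write would instead pad the end of the u32 only when the next write happened to need it.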
> + BUG_ON(dt->dt.next - (unsigned long)dt->dt.data > sizeof(dt->dt.data)); > +} > + > +void dt_push_str(struct blob *blob, char *str) > +{ > + memcpy((char *)blob->next, str, strlen(str) + 1); > + blob->next += strlen(str) + 1; > + blob->next = _ALIGN(blob->next, 4); > +} > + > +#define str_offset(dt) (dt->strings.next - (unsigned long)&dt->strings) > + > +void dt_start_node(struct iseries_flat_dt *dt, char *name) > +{ > + dt_push_u32(dt, OF_DT_BEGIN_NODE); > + dt_push_str(&dt->dt, name); > +} > + > +#define dt_end_node(dt) dt_push_u32(dt, OF_DT_END_NODE) > + > +void dt_prop(struct iseries_flat_dt *dt, char *name, char *data, int len) > +{ > + dt_push_u32(dt, OF_DT_PROP); > + > + /* Length of the data */ > + dt_push_u32(dt, len); > + > + /* The offset of the properties name in the string blob. */ > + dt_push_u32(dt, str_offset(dt)); > + > + /* The actual data. */ > + /* For versions below 16 the data is 4 or 8 byte aligned. */ > + dt->dt.next = _ALIGN(dt->dt.next, len >= 8 ? 8 : 4); > + memcpy((char *)dt->dt.next, data, len); > + dt->dt.next += len; > + /* and then 4 byte aligned after the data. */ > + dt->dt.next = _ALIGN(dt->dt.next, 4); > + > + /* Put the property name in the string blob. */ > + dt_push_str(&dt->strings, name); I think having the str_offset() and adding the name to the string section separated is a little confusing. Better to have a function which adds the string to the string section and returns the appropriate offset (there's one like this in dtc). That also makes the change to add name-combining (if it's worth it) more localised.
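A minimal sketch of the suggested helper, assuming a plain userspace setting: one function appends the name to the string section and returns its offset, so the caller never computes the offset separately. The names struct strtab and strtab_add() are hypothetical and are not taken from the patch or from dtc. The sketch also reuses an existing entry when the same name is added twice, which is one possible reading of "name-combining", and it packs strings without the 4-byte padding that dt_push_str() applies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STRTAB_SIZE 4096

/* Hypothetical string section: NUL-terminated names packed back to back. */
struct strtab {
	char data[STRTAB_SIZE];
	size_t used;	/* number of bytes of data[] in use */
};

/* Return the offset of `name` within the table, appending it if absent. */
static size_t strtab_add(struct strtab *t, const char *name)
{
	size_t len = strlen(name) + 1;	/* include the trailing NUL */
	size_t off = 0;

	/* Walk the existing entries and reuse a matching name. */
	while (off < t->used) {
		if (strcmp(t->data + off, name) == 0)
			return off;
		off += strlen(t->data + off) + 1;
	}

	/* Not found: append at the end and return where it landed. */
	assert(t->used + len <= sizeof(t->data));
	memcpy(t->data + t->used, name, len);
	off = t->used;
	t->used += len;
	return off;
}
```

A property writer would then emit the value returned by strtab_add() directly as the name offset, and the lookup-or-append policy stays in this one function if it later changes.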
> +} > + > +void dt_prop_str(struct iseries_flat_dt *dt, char *name, char *data) > +{ > + dt_prop(dt, name, data, strlen(data) + 1); /* + 1 for NULL */ > +} > + > +void dt_prop_u32(struct iseries_flat_dt *dt, char *name, u32 data) > +{ > + dt_prop(dt, name, (char *)&data, sizeof(u32)); > +} > + > +void dt_prop_u64(struct iseries_flat_dt *dt, char *name, u64 data) > +{ > + dt_prop(dt, name, (char *)&data, sizeof(u64)); > +} > + > +void dt_prop_u64_list(struct iseries_flat_dt *dt, char *name, u64 *data, int n) > +{ > + dt_prop(dt, name, (char *)data, sizeof(u64) * n); > +} > + > +void dt_prop_empty(struct iseries_flat_dt *dt, char *name) > +{ > + dt_prop(dt, name, NULL, 0); > +} > + > +void build_flat_dt(struct iseries_flat_dt *dt) > +{ > + dt_init(dt); > + > + dt_start_node(dt, "/"); > + dt_end_node(dt); > + > + dt_push_u32(dt, OF_DT_END); > +} > + > void __init iSeries_early_setup(void) > { > iSeries_fixup_klimit(); > @@ -952,6 +1081,8 @@ void __init iSeries_early_setup(void) > */ > build_iSeries_Memory_Map(); > > + build_flat_dt(&iseries_dt); > + > /* Bolt mappings for all of memory (or some if we've got a limit) */ > iSeries_bolt_kernel(0, systemcfg->physicalMemorySize); > } > Index: work/arch/ppc64/kernel/Makefile > =================================================================== > --- work.orig/arch/ppc64/kernel/Makefile > +++ work/arch/ppc64/kernel/Makefile > @@ -11,7 +11,7 @@ obj-y := setup.o entry.o t > udbg.o binfmt_elf32.o sys_ppc32.o ioctl32.o \ > ptrace32.o signal32.o rtc.o init_task.o \ > lmb.o cputable.o cpu_setup_power4.o idle_power4.o \ > - iommu.o sysfs.o vdso.o pmc.o > + iommu.o sysfs.o vdso.o pmc.o prom.o > obj-y += vdso32/ vdso64/ > > obj-$(CONFIG_PPC_OF) += of_device.o > @@ -27,7 +27,7 @@ obj-$(CONFIG_PPC_ISERIES) += HvCall.o Hv > mf.o HvLpEvent.o iSeries_proc.o iSeries_htab.o \ > iSeries_iommu.o > > -obj-$(CONFIG_PPC_MULTIPLATFORM) += nvram.o i8259.o prom_init.o prom.o > +obj-$(CONFIG_PPC_MULTIPLATFORM) += nvram.o i8259.o prom_init.o > > 
obj-$(CONFIG_PPC_PSERIES) += pSeries_pci.o pSeries_lpar.o pSeries_hvCall.o \ > pSeries_nvram.o rtasd.o ras.o pSeries_reconfig.o \ > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From david at gibson.dropbear.id.au Wed Jul 27 12:16:42 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 27 Jul 2005 12:16:42 +1000 Subject: [PATCH 14/19] ppc64: Move iSeries memory limit logic into device tree In-Reply-To: <1122368322.8246.827409622930.qpatch@concordia> References: <200507261857.31494.michael@ellerman.id.au> <1122368322.8246.827409622930.qpatch@concordia> Message-ID: <20050727021642.GB27870@localhost.localdomain> On Tue, Jul 26, 2005 at 06:58:42PM +1000, Michael Ellerman wrote: > Move the iSeries specific memory limit logic earlier so we can put it into > the device tree. Unfortunately this means we have to parse the command line > twice (once in iSeries_early_setup() and once in setup_system()), I might clean > that up later. Could you do this more neatly by just clamping the size of the memory range you put in the flat device tree by the parameter value, instead of changing systemcfg at all? -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/people/dgibson From david at gibson.dropbear.id.au Wed Jul 27 12:21:04 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 27 Jul 2005 12:21:04 +1000 Subject: [PATCH 19/19] ppc64: Use setup_cpu_maps() instead of iSeries specific smp code In-Reply-To: <1122368323.243384.202065854128.qpatch@concordia> References: <200507261857.31494.michael@ellerman.id.au> <1122368323.243384.202065854128.qpatch@concordia> Message-ID: <20050727022104.GC27870@localhost.localdomain> On Tue, Jul 26, 2005 at 06:58:43PM +1000, Michael Ellerman wrote: > Call setup_cpu_maps() on iSeries which will setup the cpu maps from the device > tree. This removes the need for smp_iSeries_numProcs() and makes > smp_iSeries_probe() a one-liner. > > --- > > arch/ppc64/kernel/iSeries_smp.c | 30 +----------------------------- > arch/ppc64/kernel/setup.c | 8 +------- > 2 files changed, 2 insertions(+), 36 deletions(-) > > Index: work/arch/ppc64/kernel/iSeries_smp.c > =================================================================== > --- work.orig/arch/ppc64/kernel/iSeries_smp.c > +++ work/arch/ppc64/kernel/iSeries_smp.c > @@ -82,35 +82,9 @@ static void smp_iSeries_message_pass(int > } > } > > -static int smp_iSeries_numProcs(void) > -{ > - unsigned np, i; > - > - np = 0; > - for (i=0; i < NR_CPUS; ++i) { > - if (paca[i].lppaca.dyn_proc_status < 2) { > - cpu_set(i, cpu_possible_map); > - cpu_set(i, cpu_present_map); > - cpu_set(i, cpu_sibling_map[i]); > - ++np; > - } > - } > - return np; > -} > - > static int smp_iSeries_probe(void) > { > - unsigned i; > - unsigned np = 0; > - > - for (i=0; i < NR_CPUS; ++i) { > - if (paca[i].lppaca.dyn_proc_status < 2) { > - /*paca[i].active = 1;*/ > - ++np; > - } > - } > - > - return np; > + return cpus_weight(cpu_possible_map); > } I would be inclined to get rid of this function entirely now, and just replace it with cpus_weight() in the callers. 
> > static void smp_iSeries_kick_cpu(int nr) > @@ -144,6 +118,4 @@ static struct smp_ops_t iSeries_smp_ops > void __init smp_init_iSeries(void) > { > smp_ops = &iSeries_smp_ops; > - systemcfg->processorCount = smp_iSeries_numProcs(); > } > - > Index: work/arch/ppc64/kernel/setup.c > =================================================================== > --- work.orig/arch/ppc64/kernel/setup.c > +++ work/arch/ppc64/kernel/setup.c > @@ -182,8 +182,6 @@ void __init disable_early_printk(void) > early_console_initialized = 0; > } > > -#if defined(CONFIG_PPC_MULTIPLATFORM) && defined(CONFIG_SMP) > - > static int smt_enabled_cmdline; > > /* Look for ibm,smt-enabled OF option */ > @@ -335,7 +333,6 @@ static void __init setup_cpu_maps(void) > > systemcfg->processorCount = num_present_cpus(); > } > -#endif /* defined(CONFIG_PPC_MULTIPLATFORM) && defined(CONFIG_SMP) */ > > > #ifdef CONFIG_PPC_MULTIPLATFORM > @@ -637,12 +634,9 @@ void __init setup_system(void) > > parse_early_param(); > > -#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) > - /* > - * iSeries has already initialized the cpu maps at this point. > - */ > setup_cpu_maps(); > > +#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_ISERIES) > /* Release secondary cpus out of their spinloops at 0x60 now that > * we can map physical -> logical CPU ids > */ > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/people/dgibson From david at gibson.dropbear.id.au Wed Jul 27 15:47:23 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 27 Jul 2005 15:47:23 +1000 Subject: [PPC64] Remove nested feature sections Message-ID: <20050727054723.GH27870@localhost.localdomain> Andrew, please apply: The {BEGIN,END}_FTR_SECTION asm macros used in ppc64 to nop out sections of code at runtime cannot be nested. However, we do nest them in hash_low.S. We get away with it there, because there is nothing between the BEGIN markers for each section. However, that's confusing to someone reading the code. This patch removes the nested ifset and ifclr feature sections, replacing them with a single feature section in the full mask/value form. Signed-off-by: David Gibson Index: working-2.6/arch/ppc64/mm/hash_low.S =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_low.S 2005-07-26 10:36:48.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_low.S 2005-07-26 17:35:49.000000000 +1000 @@ -129,12 +129,10 @@ * code rather than call a C function...) */ BEGIN_FTR_SECTION -BEGIN_FTR_SECTION mr r4,r30 mr r5,r7 bl .hash_page_do_lazy_icache -END_FTR_SECTION_IFSET(CPU_FTR_NOEXECUTE) -END_FTR_SECTION_IFCLR(CPU_FTR_COHERENT_ICACHE) +END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE) /* At this point, r3 contains new PP bits, save them in * place of "access" in the param area (sic) -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/people/dgibson From benh at kernel.crashing.org Thu Jul 28 00:07:13 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 27 Jul 2005 10:07:13 -0400 Subject: [PATCH 8/19] ppc64: Create a fake flat device tree on iSeries In-Reply-To: <20050727020404.GA27870@localhost.localdomain> References: <200507261857.31494.michael@ellerman.id.au> <1122368320.735021.509798647893.qpatch@concordia> <20050727020404.GA27870@localhost.localdomain> Message-ID: <1122473233.18835.14.camel@gaston> > > +struct blob > > +{ > > { generally goes on the same line as the struct name. That one I wouldn't enforce :) After all, we don't like putting { on the same line as a function declaration... I tend to go both ways for structures depending on my mood of the day... > Again, I don't see that u64 aligning, after the value makes any sense. In fact, he probably wants to work on top of my patches that implement version 0x10 of the format and doesn't require the variable alignment thingy anymore... From terry.reynolds2 at us.army.mil Thu Jul 28 02:00:37 2005 From: terry.reynolds2 at us.army.mil (Reynolds, Terry (Contractor-SIMTECH)) Date: Wed, 27 Jul 2005 11:00:37 -0500 Subject: PREEMPT_RT on ppc64? Message-ID: <0D21CBD1298D2C4790E2F2B86D96EC1935940F@amr-ex5.ds.amrdec.army.mil> Hi All, It's been a few months since I saw any posting about preemption on ppc64, can anyone fill me in on the current state of development? TIA! Terry Reynolds Simulation Technologies, INC.
From david at gibson.dropbear.id.au Thu Jul 28 10:47:52 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 28 Jul 2005 10:47:52 +1000 Subject: [PATCH 8/19] ppc64: Create a fake flat device tree on iSeries In-Reply-To: <1122473233.18835.14.camel@gaston> References: <200507261857.31494.michael@ellerman.id.au> <1122368320.735021.509798647893.qpatch@concordia> <20050727020404.GA27870@localhost.localdomain> <1122473233.18835.14.camel@gaston> Message-ID: <20050728004752.GB4543@localhost.localdomain> On Wed, Jul 27, 2005 at 10:07:13AM -0400, Benjamin Herrenschmidt wrote: > > > > +struct blob > > > +{ > > > > { generally goes on the same line as the struct name. > > That one I wouldn't enforce :) After all, we don't like putting { on the > same line as a function declaration... I tend to go both ways for > structures depending on my mood of the day... > > > Again, I don't see that u64 aligning, after the value makes any sense. > > In fact, he probably wants to work on top of my patches that implement > version 0x10 of the format and doesn't require the variable alignment > thingy anymore... As far as I can tell, that particular alignment has nothing to do with the variable property alignment. I think that function's only used for the memreserve information. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From david at gibson.dropbear.id.au Thu Jul 28 13:31:18 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 28 Jul 2005 13:31:18 +1000 Subject: RFC: Kill off addRamDisk Message-ID: <20050728033118.GC4543@localhost.localdomain> Can anyone think of any problems with abolishing addRamDisk - the icky iSeries-specific method of loading an initial ramdisk. If not, I'll push this to Andrew.
iSeries has a home-built method of attaching an initial ramdisk, which relies on an addRamDisk helper program poking the data, location and offset into the vmlinux after build. Both SLES and RHEL now use the normal initramfs mechanisms instead of the iSeries specific initrd on 2.6 kernels, so let's get rid of this obsolete mechanism. Removing this functionality in turn means the NACA, which used to contain the ramdisk parameters no longer needs to have a fixed address. So, this patch removes the fixed NACA from head.S and replaces it with an ordinary C initializer. For good measure, this patch also eliminates an old version of addRamDisk.c which was sitting, unused, in the ppc32 tree. Signed-off-by: David Gibson arch/ppc/boot/utils/addRamDisk.c | 203 ------------------------- arch/ppc64/boot/Makefile | 10 - arch/ppc64/boot/addRamDisk.c | 300 -------------------------------------- arch/ppc64/kernel/LparData.c | 9 + arch/ppc64/kernel/head.S | 18 -- arch/ppc64/kernel/iSeries_setup.c | 38 ---- include/asm-ppc64/naca.h | 10 - 7 files changed, 18 insertions(+), 570 deletions(-) Index: working-2.6/arch/ppc64/boot/Makefile =================================================================== --- working-2.6.orig/arch/ppc64/boot/Makefile 2005-07-28 13:02:04.000000000 +1000 +++ working-2.6/arch/ppc64/boot/Makefile 2005-07-28 13:02:49.000000000 +1000 @@ -52,24 +52,18 @@ src-sec = $(foreach section, $(1), $(patsubst %,$(obj)/kernel-%.c, $(section))) gz-sec = $(foreach section, $(1), $(patsubst %,$(obj)/kernel-%.gz, $(section))) -hostprogs-y := addnote addRamDisk +hostprogs-y := addnote targets += zImage zImage.initrd imagesize.c \ $(patsubst $(obj)/%,%, $(call obj-sec, $(required) $(initrd))) \ $(patsubst $(obj)/%,%, $(call src-sec, $(required) $(initrd))) \ - $(patsubst $(obj)/%,%, $(call gz-sec, $(required) $(initrd))) \ - vmlinux.initrd + $(patsubst $(obj)/%,%, $(call gz-sec, $(required) $(initrd))) extra-y := initrd.o -quiet_cmd_ramdisk = RAMDISK $@ - cmd_ramdisk = 
$(obj)/addRamDisk $(obj)/ramdisk.image.gz $< $@ - quiet_cmd_stripvm = STRIP $@ cmd_stripvm = $(STRIP) -s $< -o $@ vmlinux.strip: vmlinux FORCE $(call if_changed,stripvm) -$(obj)/vmlinux.initrd: vmlinux.strip $(obj)/addRamDisk $(obj)/ramdisk.image.gz FORCE - $(call if_changed,ramdisk) addsection = $(CROSS32OBJCOPY) $(1) \ --add-section=.kernel:$(strip $(patsubst $(obj)/kernel-%.o,%, $(1)))=$(patsubst %.o,%.gz, $(1)) \ Index: working-2.6/arch/ppc64/boot/addRamDisk.c =================================================================== --- working-2.6.orig/arch/ppc64/boot/addRamDisk.c 2005-07-28 13:02:04.000000000 +1000 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,300 +0,0 @@ -#include -#include -#include -#include -#include -#include -#include - -#define ElfHeaderSize (64 * 1024) -#define ElfPages (ElfHeaderSize / 4096) -#define KERNELBASE (0xc000000000000000) - -void get4k(FILE *file, char *buf ) -{ - unsigned j; - unsigned num = fread(buf, 1, 4096, file); - for ( j=num; j<4096; ++j ) - buf[j] = 0; -} - -void put4k(FILE *file, char *buf ) -{ - fwrite(buf, 1, 4096, file); -} - -void death(const char *msg, FILE *fdesc, const char *fname) -{ - fprintf(stderr, msg); - fclose(fdesc); - unlink(fname); - exit(1); -} - -int main(int argc, char **argv) -{ - char inbuf[4096]; - FILE *ramDisk = NULL; - FILE *sysmap = NULL; - FILE *inputVmlinux = NULL; - FILE *outputVmlinux = NULL; - - unsigned i = 0; - unsigned long ramFileLen = 0; - unsigned long ramLen = 0; - unsigned long roundR = 0; - - unsigned long sysmapFileLen = 0; - unsigned long sysmapLen = 0; - unsigned long sysmapPages = 0; - char* ptr_end = NULL; - unsigned long offset_end = 0; - - unsigned long kernelLen = 0; - unsigned long actualKernelLen = 0; - unsigned long round = 0; - unsigned long roundedKernelLen = 0; - unsigned long ramStartOffs = 0; - unsigned long ramPages = 0; - unsigned long roundedKernelPages = 0; - unsigned long hvReleaseData = 0; - u_int32_t eyeCatcher = 0xc8a5d9c4; - unsigned long naca = 
0; - unsigned long xRamDisk = 0; - unsigned long xRamDiskSize = 0; - long padPages = 0; - - - if (argc < 2) { - fprintf(stderr, "Name of RAM disk file missing.\n"); - exit(1); - } - - if (argc < 3) { - fprintf(stderr, "Name of System Map input file is missing.\n"); - exit(1); - } - - if (argc < 4) { - fprintf(stderr, "Name of vmlinux file missing.\n"); - exit(1); - } - - if (argc < 5) { - fprintf(stderr, "Name of vmlinux output file missing.\n"); - exit(1); - } - - - ramDisk = fopen(argv[1], "r"); - if ( ! ramDisk ) { - fprintf(stderr, "RAM disk file \"%s\" failed to open.\n", argv[1]); - exit(1); - } - - sysmap = fopen(argv[2], "r"); - if ( ! sysmap ) { - fprintf(stderr, "System Map file \"%s\" failed to open.\n", argv[2]); - exit(1); - } - - inputVmlinux = fopen(argv[3], "r"); - if ( ! inputVmlinux ) { - fprintf(stderr, "vmlinux file \"%s\" failed to open.\n", argv[3]); - exit(1); - } - - outputVmlinux = fopen(argv[4], "w+"); - if ( ! outputVmlinux ) { - fprintf(stderr, "output vmlinux file \"%s\" failed to open.\n", argv[4]); - exit(1); - } - - - - /* Input Vmlinux file */ - fseek(inputVmlinux, 0, SEEK_END); - kernelLen = ftell(inputVmlinux); - fseek(inputVmlinux, 0, SEEK_SET); - printf("kernel file size = %d\n", kernelLen); - if ( kernelLen == 0 ) { - fprintf(stderr, "You must have a linux kernel specified as argv[3]\n"); - exit(1); - } - - actualKernelLen = kernelLen - ElfHeaderSize; - - printf("actual kernel length (minus ELF header) = %d\n", actualKernelLen); - - round = actualKernelLen % 4096; - roundedKernelLen = actualKernelLen; - if ( round ) - roundedKernelLen += (4096 - round); - printf("Vmlinux length rounded up to a 4k multiple = %ld/0x%lx \n", roundedKernelLen, roundedKernelLen); - roundedKernelPages = roundedKernelLen / 4096; - printf("Vmlinux pages to copy = %ld/0x%lx \n", roundedKernelPages, roundedKernelPages); - - - - /* Input System Map file */ - /* (needs to be processed simply to determine if we need to add pad pages due to the static 
variables not being included in the vmlinux) */ - fseek(sysmap, 0, SEEK_END); - sysmapFileLen = ftell(sysmap); - fseek(sysmap, 0, SEEK_SET); - printf("%s file size = %ld/0x%lx \n", argv[2], sysmapFileLen, sysmapFileLen); - - sysmapLen = sysmapFileLen; - - roundR = 4096 - (sysmapLen % 4096); - if (roundR) { - printf("Rounding System Map file up to a multiple of 4096, adding %ld/0x%lx \n", roundR, roundR); - sysmapLen += roundR; - } - printf("Rounded System Map size is %ld/0x%lx \n", sysmapLen, sysmapLen); - - /* Process the Sysmap file to determine where _end is */ - sysmapPages = sysmapLen / 4096; - /* read the whole file line by line, expect that it doesn't fail */ - while ( fgets(inbuf, 4096, sysmap) ) ; - /* search for _end in the last page of the system map */ - ptr_end = strstr(inbuf, " _end"); - if (!ptr_end) { - fprintf(stderr, "Unable to find _end in the sysmap file \n"); - fprintf(stderr, "inbuf: \n"); - fprintf(stderr, "%s \n", inbuf); - exit(1); - } - printf("Found _end in the last page of the sysmap - backing up 10 characters it looks like %s", ptr_end-10); - /* convert address of _end in system map to hex offset. 
*/ - offset_end = (unsigned int)strtol(ptr_end-10, NULL, 16); - /* calc how many pages we need to insert between the vmlinux and the start of the ram disk */ - padPages = offset_end/4096 - roundedKernelPages; - - /* Check and see if the vmlinux is already larger than _end in System.map */ - if (padPages < 0) { - /* vmlinux is larger than _end - adjust the offset to the start of the embedded ram disk */ - offset_end = roundedKernelLen; - printf("vmlinux is larger than _end indicates it needs to be - offset_end = %lx \n", offset_end); - padPages = 0; - printf("will insert %lx pages between the vmlinux and the start of the ram disk \n", padPages); - } - else { - /* _end is larger than vmlinux - use the offset to _end that we calculated from the system map */ - printf("vmlinux is smaller than _end indicates is needed - offset_end = %lx \n", offset_end); - printf("will insert %lx pages between the vmlinux and the start of the ram disk \n", padPages); - } - - - - /* Input Ram Disk file */ - // Set the offset that the ram disk will be started at. - ramStartOffs = offset_end; /* determined from the input vmlinux file and the system map */ - printf("Ram Disk will start at offset = 0x%lx \n", ramStartOffs); - - fseek(ramDisk, 0, SEEK_END); - ramFileLen = ftell(ramDisk); - fseek(ramDisk, 0, SEEK_SET); - printf("%s file size = %ld/0x%lx \n", argv[1], ramFileLen, ramFileLen); - - ramLen = ramFileLen; - - roundR = 4096 - (ramLen % 4096); - if ( roundR ) { - printf("Rounding RAM disk file up to a multiple of 4096, adding %ld/0x%lx \n", roundR, roundR); - ramLen += roundR; - } - - printf("Rounded RAM disk size is %ld/0x%lx \n", ramLen, ramLen); - ramPages = ramLen / 4096; - printf("RAM disk pages to copy = %ld/0x%lx\n", ramPages, ramPages); - - - - // Copy 64K ELF header - for (i=0; i<(ElfPages); ++i) { - get4k( inputVmlinux, inbuf ); - put4k( outputVmlinux, inbuf ); - } - - /* Copy the vmlinux (as full pages). 
*/ - fseek(inputVmlinux, ElfHeaderSize, SEEK_SET); - for ( i=0; i -#include -#include -#include -#include -#include -#include - -#define ElfHeaderSize (64 * 1024) -#define ElfPages (ElfHeaderSize / 4096) -#define KERNELBASE (0xc0000000) - -void get4k(FILE *file, char *buf ) -{ - unsigned j; - unsigned num = fread(buf, 1, 4096, file); - for ( j=num; j<4096; ++j ) - buf[j] = 0; -} - -void put4k(FILE *file, char *buf ) -{ - fwrite(buf, 1, 4096, file); -} - -void death(const char *msg, FILE *fdesc, const char *fname) -{ - printf(msg); - fclose(fdesc); - unlink(fname); - exit(1); -} - -int main(int argc, char **argv) -{ - char inbuf[4096]; - FILE *ramDisk = NULL; - FILE *inputVmlinux = NULL; - FILE *outputVmlinux = NULL; - unsigned i = 0; - u_int32_t ramFileLen = 0; - u_int32_t ramLen = 0; - u_int32_t roundR = 0; - u_int32_t kernelLen = 0; - u_int32_t actualKernelLen = 0; - u_int32_t round = 0; - u_int32_t roundedKernelLen = 0; - u_int32_t ramStartOffs = 0; - u_int32_t ramPages = 0; - u_int32_t roundedKernelPages = 0; - u_int32_t hvReleaseData = 0; - u_int32_t eyeCatcher = 0xc8a5d9c4; - u_int32_t naca = 0; - u_int32_t xRamDisk = 0; - u_int32_t xRamDiskSize = 0; - if ( argc < 2 ) { - printf("Name of RAM disk file missing.\n"); - exit(1); - } - - if ( argc < 3 ) { - printf("Name of vmlinux file missing.\n"); - exit(1); - } - - if ( argc < 4 ) { - printf("Name of vmlinux output file missing.\n"); - exit(1); - } - - ramDisk = fopen(argv[1], "r"); - if ( ! ramDisk ) { - printf("RAM disk file \"%s\" failed to open.\n", argv[1]); - exit(1); - } - inputVmlinux = fopen(argv[2], "r"); - if ( ! inputVmlinux ) { - printf("vmlinux file \"%s\" failed to open.\n", argv[2]); - exit(1); - } - outputVmlinux = fopen(argv[3], "w+"); - if ( ! 
outputVmlinux ) { - printf("output vmlinux file \"%s\" failed to open.\n", argv[3]); - exit(1); - } - fseek(ramDisk, 0, SEEK_END); - ramFileLen = ftell(ramDisk); - fseek(ramDisk, 0, SEEK_SET); - printf("%s file size = %d\n", argv[1], ramFileLen); - - ramLen = ramFileLen; - - roundR = 4096 - (ramLen % 4096); - if ( roundR ) { - printf("Rounding RAM disk file up to a multiple of 4096, adding %d\n", roundR); - ramLen += roundR; - } - - printf("Rounded RAM disk size is %d\n", ramLen); - fseek(inputVmlinux, 0, SEEK_END); - kernelLen = ftell(inputVmlinux); - fseek(inputVmlinux, 0, SEEK_SET); - printf("kernel file size = %d\n", kernelLen); - if ( kernelLen == 0 ) { - printf("You must have a linux kernel specified as argv[2]\n"); - exit(1); - } - - actualKernelLen = kernelLen - ElfHeaderSize; - - printf("actual kernel length (minus ELF header) = %d\n", actualKernelLen); - - round = actualKernelLen % 4096; - roundedKernelLen = actualKernelLen; - if ( round ) - roundedKernelLen += (4096 - round); - - printf("actual kernel length rounded up to a 4k multiple = %d\n", roundedKernelLen); - - ramStartOffs = roundedKernelLen; - ramPages = ramLen / 4096; - - printf("RAM disk pages to copy = %d\n", ramPages); - - // Copy 64K ELF header - for (i=0; i<(ElfPages); ++i) { - get4k( inputVmlinux, inbuf ); - put4k( outputVmlinux, inbuf ); - } - - roundedKernelPages = roundedKernelLen / 4096; - - fseek(inputVmlinux, ElfHeaderSize, SEEK_SET); - - for ( i=0; i #include #include -#include #include #include #include @@ -510,24 +509,9 @@ mfspr r12,SPRG2 EXCEPTION_PROLOG_PSERIES(PACA_EXSLB, .do_stab_bolted) - - /* Space for the naca. Architected to be located at real address - * NACA_PHYS_ADDR. Various tools rely on this location being fixed. - * The first dword of the naca is required by iSeries LPAR to - * point to itVpdAreas. On pSeries native, this value is not used. - */ - . 
= NACA_PHYS_ADDR - .globl __end_interrupts -__end_interrupts: -#ifdef CONFIG_PPC_ISERIES - .globl naca -naca: - .llong itVpdAreas - .llong 0 /* xRamDisk */ - .llong 0 /* xRamDiskSize */ - . = 0x6100 +#ifdef CONFIG_PPC_ISERIES /*** ISeries-LPAR interrupt handlers ***/ STD_EXCEPTION_ISERIES(0x200, machine_check, PACA_EXMC) Index: working-2.6/arch/ppc64/kernel/iSeries_setup.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/iSeries_setup.c 2005-07-28 10:41:40.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/iSeries_setup.c 2005-07-28 13:11:07.000000000 +1000 @@ -316,23 +316,7 @@ ppcdbg_initialize(); -#if defined(CONFIG_BLK_DEV_INITRD) - /* - * If the init RAM disk has been configured and there is - * a non-zero starting address for it, set it up - */ - if (naca.xRamDisk) { - initrd_start = (unsigned long)__va(naca.xRamDisk); - initrd_end = initrd_start + naca.xRamDiskSize * PAGE_SIZE; - initrd_below_start_ok = 1; // ramdisk in kernel space - ROOT_DEV = Root_RAM0; - if (((rd_size * 1024) / PAGE_SIZE) < naca.xRamDiskSize) - rd_size = (naca.xRamDiskSize * PAGE_SIZE) / 1024; - } else -#endif /* CONFIG_BLK_DEV_INITRD */ - { - /* ROOT_DEV = MKDEV(VIODASD_MAJOR, 1); */ - } + /* ROOT_DEV = MKDEV(VIODASD_MAJOR, 1); */ iSeries_recal_tb = get_tb(); iSeries_recal_titan = HvCallXm_loadTod(); @@ -805,22 +789,12 @@ static void __init iSeries_fixup_klimit(void) { /* - * Change klimit to take into account any ram disk - * that may be included + * Check and see if there was an embedded system map. Change + * klimit to take into account any embedded system map */ - if (naca.xRamDisk) - klimit = KERNELBASE + (u64)naca.xRamDisk + - (naca.xRamDiskSize * PAGE_SIZE); - else { - /* - * No ram disk was included - check and see if there - * was an embedded system map. 
Change klimit to take - * into account any embedded system map - */ - if (embedded_sysmap_end) - klimit = KERNELBASE + ((embedded_sysmap_end + 4095) & - 0xfffffffffffff000); - } + if (embedded_sysmap_end) + klimit = KERNELBASE + ((embedded_sysmap_end + 4095) & + 0xfffffffffffff000); } static int __init iSeries_src_init(void) Index: working-2.6/include/asm-ppc64/naca.h =================================================================== --- working-2.6.orig/include/asm-ppc64/naca.h 2005-05-24 14:12:25.000000000 +1000 +++ working-2.6/include/asm-ppc64/naca.h 2005-07-28 13:17:14.000000000 +1000 @@ -12,20 +12,10 @@ #include -#ifndef __ASSEMBLY__ - struct naca_struct { - /* Kernel only data - undefined for user space */ void *xItVpdAreas; /* VPD Data 0x00 */ - void *xRamDisk; /* iSeries ramdisk 0x08 */ - u64 xRamDiskSize; /* In pages 0x10 */ }; extern struct naca_struct naca; -#endif /* __ASSEMBLY__ */ - -#define NACA_PAGE 0x4 -#define NACA_PHYS_ADDR (NACA_PAGE< Hi all, I've started to work with Linux on power two weeks ago. And it's not a great success in fact. I'm using a PL600 machine with four RS64-IV CPU at 600MHz. I just wanted to do some tests of my work done with the latest kernel (I'm working with kernel 2.6.11.7 and 2.6.12.3). And after several tests and search, I finally found that my problem is a known one: the one described here: http://patchwork.ozlabs.org/linuxppc64/patch?id=1755 (look for "[boot]0020 XICS Init" , I have this kernel panic) I already have applied the proc_claim patch to correct the rtas deadbeef initialization. I'm surprised that there is such a blocking problem in the mainline kernel tree. So I just wanted to know if there are some news or if there is a known work around ? Or may be I missed something ? Anyway, thanks a lot for your help ! 
-- Pierre Peiffer From mostrows at watson.ibm.com Fri Jul 29 00:51:35 2005 From: mostrows at watson.ibm.com (Michal Ostrowski) Date: Thu, 28 Jul 2005 10:51:35 -0400 Subject: [PATCH] ppc64: Detect altivec via firmware on unknown CPUs Message-ID: <20050728105135.63dfe243@brick.watson.ibm.com> Following the application of this patch, if somebody (e.g. me) tries to boot a kernel without CONFIG_ALTIVEC using a root filesystem (and a glibc) that does use altivec, the boot just seems to hang with no errors or explanation. Is there some sane way to report this? -- Michal Ostrowski From olof at lixom.net Fri Jul 29 02:00:01 2005 From: olof at lixom.net (Olof Johansson) Date: Thu, 28 Jul 2005 11:00:01 -0500 Subject: Problem booting latest linux on RS64 In-Reply-To: <42E8A69A.3030404@bull.net> References: <42E8A69A.3030404@bull.net> Message-ID: <20050728160000.GA19403@austin.ibm.com> On Thu, Jul 28, 2005 at 11:34:18AM +0200, Pierre PEIFFER wrote: > I'm surprised that there is such a blocking problem in the mainline > kernel tree. So I just wanted to know if there are some news or if there > is a known work around ? Or may be I missed something ? > > Anyway, thanks a lot for your help ! RS64 isn't a platform that a lot of users seem to use mainline kernels on. Olaf Hering has reported this error in the past (as you found). I have a system here that reproduces it, but unfortunately I have not had the time to sit down and track it down yet. 
-Olof From omkhar at gentoo.org Fri Jul 29 02:01:56 2005 From: omkhar at gentoo.org (Omkhar Arasaratnam) Date: Thu, 28 Jul 2005 12:01:56 -0400 Subject: G5 + 2.6.12 Powers off Underload Message-ID: <42E90174.3080403@gentoo.org> We have some users who are complaining that a G5 running 2.6.12 will power off or freeze under load. They define load as "when the fan comes on". They claim it happens on multiple machines, Fedora and Gentoo. Thoughts? -- Omkhar Arasaratnam Gentoo PPC64 Developer omkhar at gentoo.org / http://dev.gentoo.org/~omkhar From ppc-dev at storix.com Fri Jul 29 05:07:48 2005 From: ppc-dev at storix.com (ppc-dev at storix.com) Date: Thu, 28 Jul 2005 12:07:48 -0700 Subject: Kill off addRamDisk (David Gibson) Message-ID: <200507281207.49195.ppc-dev@storix.com> Just a thought on removing addRamDisk. I know that addRamDisk was designed for iSeries, but there were some uses with pSeries. Right now I use procedures similar to the makefile for creating zImage.initrd images for network boot. This is a clunky process and I would rather use yaboot, but cannot due to kernel size limitations (another story). If you remove it, we should make sure there are still efficient and practical ways for users of 32-bit and 64-bit systems to create zImage.initrd images. Otherwise, it seems harmless to remove. David Huffman Storix, Inc. 
From joezhao at us.ibm.com Fri Jul 29 06:37:21 2005 From: joezhao at us.ibm.com (Joseph H Zhao) Date: Thu, 28 Jul 2005 15:37:21 -0500 Subject: RFC: Kill off addRamDisk (David Gibson) In-Reply-To: <20050728145319.F281C67D8F@ozlabs.org> Message-ID: A couple of questions: The old way of build a bootable image for iSeries involves the following five steps: make vmlinux mkinitrd initrd.vmlinux 2.4.xxx.xxx addRamDisk initrd.vmlinux System.map vmlinux vmlinux.initrd dd if=vmlinux.initrd of=/proc/iSeries/mf/B/vmlinux bs=4096 echo B > /proc/iSeries/mf/side If addRamDisk is obsolete, does this mean that we no longer need to build initrd for iSeries using 'mkinitrd' ? And is the following three steps sufficient for a image to boot ? make vmlinux dd if=vmlinux of=/proc/iSeries/mf/B/vmlinux bs=4096 echo B > /proc/iSeries/mf/side Thanks, Joseph From geoffrey.levand at am.sony.com Fri Jul 29 09:57:46 2005 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Thu, 28 Jul 2005 16:57:46 -0700 Subject: [PATCH 10/11] ppc64: SPU file system In-Reply-To: <200506212334.44066.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212326.18205.arnd@arndb.de> <200506212328.28929.arnd@arndb.de> <200506212334.44066.arnd@arndb.de> Message-ID: <42E970FA.9010805@am.sony.com> Arnd Bergmann wrote: ... > --- linux-cg.orig/mm/memory.c 2005-06-21 22:48:42.154975624 -0400 > +++ linux-cg/mm/memory.c 2005-06-21 22:48:48.780899080 -0400 > @@ -2201,6 +2201,7 @@ unsigned long vmalloc_to_pfn(void * vmal > { > return page_to_pfn(vmalloc_to_page(vmalloc_addr)); > } > +EXPORT_SYMBOL_GPL(handle_mm_fault); > > EXPORT_SYMBOL(vmalloc_to_pfn); > Your change to handle_mm_fault causes problems when I build for Ebony (ppc32). mm/built-in.o(*ABS*+0xfe6822c0): In function `__crc_handle_mm_fault': shmem.c: multiple definition of `__crc_handle_mm_fault' make[1]: *** [.tmp_vmlinux1] Error 1 $ egrep -HRn 'EXPORT_SYMBOL(_GPL)?\(handle_mm_fault' . 
./mm/memory.c:2208:EXPORT_SYMBOL_GPL(handle_mm_fault); ./arch/ppc/kernel/ppc_ksyms.c:329:EXPORT_SYMBOL(handle_mm_fault); /* For MOL */ Also, it seems a definition of the DEFINE_SIMPLE_ATTRIBUTE macro is missing. Did I miss one of your patches? CC fs/spufs/file.o /home/geoff/projects/alp/alp-linux--dev-2-6-12--1.7/fs/spufs/file.c:401: error: parse error before string constant /home/geoff/projects/alp/alp-linux--dev-2-6-12--1.7/fs/spufs/file.c:401: warning: type defaults to `int' in declaration of `DEFINE_SIMPLE_ATTRIBUTE' ... $ egrep -HRn 'DEFINE_SIMPLE_ATTRIBUTE' . ./fs/spufs/file.c:400:DEFINE_SIMPLE_ATTRIBUTE(spufs_signal1_type, spufs_signal1_type_get, ./fs/spufs/file.c:414:DEFINE_SIMPLE_ATTRIBUTE(spufs_signal2_type, spufs_signal2_type_get, ./fs/spufs/file.c:428:DEFINE_SIMPLE_ATTRIBUTE(spufs_ ## name, \ ./fs/spufs/file.c:443:DEFINE_SIMPLE_ATTRIBUTE(spufs_ ## name, \ ./fs/spufs/file.c:458:DEFINE_SIMPLE_ATTRIBUTE(spufs_ ## name, \ -Geoff From david at gibson.dropbear.id.au Fri Jul 29 10:18:33 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 29 Jul 2005 10:18:33 +1000 Subject: Kill off addRamDisk (David Gibson) In-Reply-To: <200507281207.49195.ppc-dev@storix.com> References: <200507281207.49195.ppc-dev@storix.com> Message-ID: <20050729001833.GA28012@localhost.localdomain> On Thu, Jul 28, 2005 at 12:07:48PM -0700, ppc-dev at storix.com wrote: > Just a thought on removing addRamDisk > > I know that addRamDisk was designed for iseries, but there were some uses with > pSeries. Um.. given that addRamDisk pokes its data into a structure which hasn't even existed on pSeries for quite some time now, I think, such usage can be considered obsolete. > Right now I use proceedures similar to the makefile for creating > zImage.initrd images for network boot. This is a clunky process and > I would rather use yaboot but cannot due to kernel size > limitations. (another story). 
If you remove it, we should make sure > there are still efficient and practical ways for users of 32-bit and > 64-bit systems to create zImage.initrd images. Otherwise, it seems > harmless to remove. This doesn't affect the way pSeries zImage.initrd images are created, only the way vmlinux.initrd images were created, which have been iSeries only for a long time now. In any case, initramfs is now preferred to initrd. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From dwg at au1.ibm.com Fri Jul 29 13:10:14 2005 From: dwg at au1.ibm.com (David Gibson) Date: Fri, 29 Jul 2005 13:10:14 +1000 Subject: RFC: Kill off addRamDisk (David Gibson) In-Reply-To: References: <20050728145319.F281C67D8F@ozlabs.org> Message-ID: <20050729031014.GD28012@localhost.localdomain> On Thu, Jul 28, 2005 at 03:37:21PM -0500, Joseph H Zhao wrote: > A couple of questions: > > The old way of build a bootable image for iSeries involves the following > five steps: > > make vmlinux > mkinitrd initrd.vmlinux 2.4.xxx.xxx > addRamDisk initrd.vmlinux System.map vmlinux vmlinux.initrd > dd if=vmlinux.initrd of=/proc/iSeries/mf/B/vmlinux bs=4096 > echo B > /proc/iSeries/mf/side > > > If addRamDisk is obsolete, does this mean that we no longer need to build > initrd for iSeries using 'mkinitrd' ? And is the following three steps > sufficient for a image to boot ? > > make vmlinux > dd if=vmlinux of=/proc/iSeries/mf/B/vmlinux bs=4096 > echo B > /proc/iSeries/mf/side No, you'd still need to mkinitrd (which for a 2.6 kernel actually makes an initramfs, not an initrd). I thought there was a general (i.e. cross-platform) of attaching the initramfs which we could use on iSeries as well. Unfortunately, I've just checked and I'm mistaken - the "ramdisk" stuff for iSeries is also used to find the initramfs, so we can't get rid of addRamDisk (yet, anyway). 
(the standard methods of loading the initramfs appear to rely on a bootloader and/or zImage wrapper, which we don't use on iSeries) However, addRamDisk in the kernel tree actually locates the naca by following the pointer in hvReleaseData, just like the hypervisor, comments in head.S to the contrary notwithstanding. So, the patch below doesn't get rid of addRamDisk, but it does remove the fixed address constraint on the naca. It would be very handy if someone could test this on iSeries, with an initrd/initramfs. --- Comments in head.S suggest that the iSeries naca has a fixed address, because tools expect to find it there. The only tool which appears to access the naca is addRamDisk, but both the in-kernel version and the version used in RHEL and SuSE in fact locate the NACA the same way as the hypervisor does, by following the pointer in the hvReleaseData structure. Since the requirement for a fixed address seems to be obsolete, this patch removes the naca from head.S and replaces it with a normal C initializer. For good measure, it removes old versions of addRamDisk.c and addSystemMap.c which were sitting, unused, in the ppc32 tree. 
Signed-off-by: David Gibson arch/ppc/boot/utils/addRamDisk.c | 203 ------------------------------------- arch/ppc/boot/utils/addSystemMap.c | 186 --------------------------------- arch/ppc64/kernel/LparData.c | 11 ++ arch/ppc64/kernel/head.S | 16 -- 4 files changed, 12 insertions(+), 404 deletions(-) Index: working-2.6/arch/ppc/boot/utils/addRamDisk.c =================================================================== --- working-2.6.orig/arch/ppc/boot/utils/addRamDisk.c 2005-05-24 14:12:22.000000000 +1000 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,203 +0,0 @@ -#include -#include -#include -#include -#include -#include -#include - -#define ElfHeaderSize (64 * 1024) -#define ElfPages (ElfHeaderSize / 4096) -#define KERNELBASE (0xc0000000) - -void get4k(FILE *file, char *buf ) -{ - unsigned j; - unsigned num = fread(buf, 1, 4096, file); - for ( j=num; j<4096; ++j ) - buf[j] = 0; -} - -void put4k(FILE *file, char *buf ) -{ - fwrite(buf, 1, 4096, file); -} - -void death(const char *msg, FILE *fdesc, const char *fname) -{ - printf(msg); - fclose(fdesc); - unlink(fname); - exit(1); -} - -int main(int argc, char **argv) -{ - char inbuf[4096]; - FILE *ramDisk = NULL; - FILE *inputVmlinux = NULL; - FILE *outputVmlinux = NULL; - unsigned i = 0; - u_int32_t ramFileLen = 0; - u_int32_t ramLen = 0; - u_int32_t roundR = 0; - u_int32_t kernelLen = 0; - u_int32_t actualKernelLen = 0; - u_int32_t round = 0; - u_int32_t roundedKernelLen = 0; - u_int32_t ramStartOffs = 0; - u_int32_t ramPages = 0; - u_int32_t roundedKernelPages = 0; - u_int32_t hvReleaseData = 0; - u_int32_t eyeCatcher = 0xc8a5d9c4; - u_int32_t naca = 0; - u_int32_t xRamDisk = 0; - u_int32_t xRamDiskSize = 0; - if ( argc < 2 ) { - printf("Name of RAM disk file missing.\n"); - exit(1); - } - - if ( argc < 3 ) { - printf("Name of vmlinux file missing.\n"); - exit(1); - } - - if ( argc < 4 ) { - printf("Name of vmlinux output file missing.\n"); - exit(1); - } - - ramDisk = fopen(argv[1], "r"); - if ( ! 
ramDisk ) { - printf("RAM disk file \"%s\" failed to open.\n", argv[1]); - exit(1); - } - inputVmlinux = fopen(argv[2], "r"); - if ( ! inputVmlinux ) { - printf("vmlinux file \"%s\" failed to open.\n", argv[2]); - exit(1); - } - outputVmlinux = fopen(argv[3], "w+"); - if ( ! outputVmlinux ) { - printf("output vmlinux file \"%s\" failed to open.\n", argv[3]); - exit(1); - } - fseek(ramDisk, 0, SEEK_END); - ramFileLen = ftell(ramDisk); - fseek(ramDisk, 0, SEEK_SET); - printf("%s file size = %d\n", argv[1], ramFileLen); - - ramLen = ramFileLen; - - roundR = 4096 - (ramLen % 4096); - if ( roundR ) { - printf("Rounding RAM disk file up to a multiple of 4096, adding %d\n", roundR); - ramLen += roundR; - } - - printf("Rounded RAM disk size is %d\n", ramLen); - fseek(inputVmlinux, 0, SEEK_END); - kernelLen = ftell(inputVmlinux); - fseek(inputVmlinux, 0, SEEK_SET); - printf("kernel file size = %d\n", kernelLen); - if ( kernelLen == 0 ) { - printf("You must have a linux kernel specified as argv[2]\n"); - exit(1); - } - - actualKernelLen = kernelLen - ElfHeaderSize; - - printf("actual kernel length (minus ELF header) = %d\n", actualKernelLen); - - round = actualKernelLen % 4096; - roundedKernelLen = actualKernelLen; - if ( round ) - roundedKernelLen += (4096 - round); - - printf("actual kernel length rounded up to a 4k multiple = %d\n", roundedKernelLen); - - ramStartOffs = roundedKernelLen; - ramPages = ramLen / 4096; - - printf("RAM disk pages to copy = %d\n", ramPages); - - // Copy 64K ELF header - for (i=0; i<(ElfPages); ++i) { - get4k( inputVmlinux, inbuf ); - put4k( outputVmlinux, inbuf ); - } - - roundedKernelPages = roundedKernelLen / 4096; - - fseek(inputVmlinux, ElfHeaderSize, SEEK_SET); - - for ( i=0; i -#include -#include -#include -#include - -void xlate( char * inb, char * trb, unsigned len ) -{ - unsigned i; - for ( i=0; i> 4; - char c2 = c & 0xf; - if ( c1 > 9 ) - c1 = c1 + 'A' - 10; - else - c1 = c1 + '0'; - if ( c2 > 9 ) - c2 = c2 + 'A' - 10; - else - c2 = 
c2 + '0'; - *trb++ = c1; - *trb++ = c2; - } - *trb = 0; -} - -#define ElfHeaderSize (64 * 1024) -#define ElfPages (ElfHeaderSize / 4096) -#define KERNELBASE (0xc0000000) - -void get4k( /*istream *inf*/FILE *file, char *buf ) -{ - unsigned j; - unsigned num = fread(buf, 1, 4096, file); - for ( j=num; j<4096; ++j ) - buf[j] = 0; -} - -void put4k( /*ostream *outf*/FILE *file, char *buf ) -{ - fwrite(buf, 1, 4096, file); -} - -int main(int argc, char **argv) -{ - char inbuf[4096]; - FILE *ramDisk = NULL; - FILE *inputVmlinux = NULL; - FILE *outputVmlinux = NULL; - unsigned i = 0; - unsigned long ramFileLen = 0; - unsigned long ramLen = 0; - unsigned long roundR = 0; - unsigned long kernelLen = 0; - unsigned long actualKernelLen = 0; - unsigned long round = 0; - unsigned long roundedKernelLen = 0; - unsigned long ramStartOffs = 0; - unsigned long ramPages = 0; - unsigned long roundedKernelPages = 0; - if ( argc < 2 ) { - printf("Name of System Map file missing.\n"); - exit(1); - } - - if ( argc < 3 ) { - printf("Name of vmlinux file missing.\n"); - exit(1); - } - - if ( argc < 4 ) { - printf("Name of vmlinux output file missing.\n"); - exit(1); - } - - ramDisk = fopen(argv[1], "r"); - if ( ! ramDisk ) { - printf("System Map file \"%s\" failed to open.\n", argv[1]); - exit(1); - } - inputVmlinux = fopen(argv[2], "r"); - if ( ! inputVmlinux ) { - printf("vmlinux file \"%s\" failed to open.\n", argv[2]); - exit(1); - } - outputVmlinux = fopen(argv[3], "w"); - if ( ! 
outputVmlinux ) { - printf("output vmlinux file \"%s\" failed to open.\n", argv[3]); - exit(1); - } - fseek(ramDisk, 0, SEEK_END); - ramFileLen = ftell(ramDisk); - fseek(ramDisk, 0, SEEK_SET); - printf("%s file size = %ld\n", argv[1], ramFileLen); - - ramLen = ramFileLen; - - roundR = 4096 - (ramLen % 4096); - if ( roundR ) { - printf("Rounding System Map file up to a multiple of 4096, adding %ld\n", roundR); - ramLen += roundR; - } - - printf("Rounded System Map size is %ld\n", ramLen); - fseek(inputVmlinux, 0, SEEK_END); - kernelLen = ftell(inputVmlinux); - fseek(inputVmlinux, 0, SEEK_SET); - printf("kernel file size = %ld\n", kernelLen); - if ( kernelLen == 0 ) { - printf("You must have a linux kernel specified as argv[2]\n"); - exit(1); - } - - actualKernelLen = kernelLen - ElfHeaderSize; - - printf("actual kernel length (minus ELF header) = %ld\n", actualKernelLen); - - round = actualKernelLen % 4096; - roundedKernelLen = actualKernelLen; - if ( round ) - roundedKernelLen += (4096 - round); - - printf("actual kernel length rounded up to a 4k multiple = %ld\n", roundedKernelLen); - - ramStartOffs = roundedKernelLen; - ramPages = ramLen / 4096; - - printf("System map pages to copy = %ld\n", ramPages); - - // Copy 64K ELF header - for (i=0; i<(ElfPages); ++i) { - get4k( inputVmlinux, inbuf ); - put4k( outputVmlinux, inbuf ); - } - - - - roundedKernelPages = roundedKernelLen / 4096; - - fseek(inputVmlinux, ElfHeaderSize, SEEK_SET); - - { - for ( i=0; i References: <200506212310.54156.arnd@arndb.de> <200506212334.44066.arnd@arndb.de> <42E970FA.9010805@am.sony.com> Message-ID: <200507291037.38204.arnd@arndb.de> On Freedag 29 Juli 2005 01:57, Geoff Levand wrote: > Arnd Bergmann wrote: > ... 
> > --- linux-cg.orig/mm/memory.c 2005-06-21 22:48:42.154975624 -0400 > > +++ linux-cg/mm/memory.c 2005-06-21 22:48:48.780899080 -0400 > > @@ -2201,6 +2201,7 @@ unsigned long vmalloc_to_pfn(void * vmal > > { > > return page_to_pfn(vmalloc_to_page(vmalloc_addr)); > > } > > +EXPORT_SYMBOL_GPL(handle_mm_fault); > > > > EXPORT_SYMBOL(vmalloc_to_pfn); > > > > Your change to handle_mm_fault causes problems when I build for Ebony (ppc32). > > mm/built-in.o(*ABS*+0xfe6822c0): In function `__crc_handle_mm_fault': > shmem.c: multiple definition of `__crc_handle_mm_fault' > make[1]: *** [.tmp_vmlinux1] Error 1 > > $ egrep -HRn 'EXPORT_SYMBOL(_GPL)?\(handle_mm_fault' . > ./mm/memory.c:2208:EXPORT_SYMBOL_GPL(handle_mm_fault); > ./arch/ppc/kernel/ppc_ksyms.c:329:EXPORT_SYMBOL(handle_mm_fault); /* For MOL */ Right, good catch. The simple solution would be to remove the EXPORT_SYMBOL from arch/ppc/kernel/ppc_ksyms.c. Can anyone from MOL comment on wether it is ok for you to have it exported GPL-only? > Also, it seems a definition of the DEFINE_SIMPLE_ATTRIBUTE macro is missing. > Did I miss one of your patches? It was merged in 2.6.13-rc. An earlier version of the patch that might still work on vanilla 2.6.12 is at http://patchwork.ozlabs.org/linuxppc64/patch?id=835 Arnd <>< From benh at kernel.crashing.org Fri Jul 29 22:12:29 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Fri, 29 Jul 2005 14:12:29 +0200 Subject: [PATCH] ppc64: Detect altivec via firmware on unknown CPUs In-Reply-To: <20050728105135.63dfe243@brick.watson.ibm.com> References: <20050728105135.63dfe243@brick.watson.ibm.com> Message-ID: <1122639150.18835.37.camel@gaston> On Thu, 2005-07-28 at 10:51 -0400, Michal Ostrowski wrote: > Following the application of this patch, if somebody (e.g. me) tries to boot > a kernel without CONFIG_ALTIVEC using a root filesystem (and a glibc) > that does use altivec, the boot just seems to hang with no errors or > explanation. 
Is there some sane way to report this? What happens without this patch? Without CONFIG_ALTIVEC, you get SIGILLs on Altivec instructions. It shouldn't hang ... hrm ... Unless we fubar'ed something like adding the altivec feature bit when CONFIG_ALTIVEC is not set :) I'll have to check. Thanks P.S. Could you file it in osdl bugzilla and assign to me? Ben. From benh at kernel.crashing.org Sun Jul 31 01:50:17 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 30 Jul 2005 17:50:17 +0200 Subject: [PATCH 10/11] ppc64: SPU file system In-Reply-To: <200507291037.38204.arnd@arndb.de> References: <200506212310.54156.arnd@arndb.de> <200506212334.44066.arnd@arndb.de> <42E970FA.9010805@am.sony.com> <200507291037.38204.arnd@arndb.de> Message-ID: <1122738617.18835.71.camel@gaston> On Fri, 2005-07-29 at 10:37 +0200, Arnd Bergmann wrote: > Right, good catch. The simple solution would be to remove the > EXPORT_SYMBOL from arch/ppc/kernel/ppc_ksyms.c. Can anyone from > MOL comment on wether it is ok for you to have it exported > GPL-only? MOL is fully GPL. Yes, it should be GPL only. From benh at kernel.crashing.org Sun Jul 31 01:53:03 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 30 Jul 2005 17:53:03 +0200 Subject: G5 + 2.6.12 Powers off Underload In-Reply-To: <42E90174.3080403@gentoo.org> References: <42E90174.3080403@gentoo.org> Message-ID: <1122738783.18835.74.camel@gaston> On Thu, 2005-07-28 at 12:01 -0400, Omkhar Arasaratnam wrote: > We have some users who are complaining that a G5 running 2.6.12 will > power off or freeze under load. They define load as "when the fan comes > on". They claim it happens on multiple machines, Fedora and Gentoo. > > Thoughts? Did it happen with earlier kernels? If you create a binary /sbin/critical-overtemp, do you see it being run about 30 sec before it powers down? 
You can add some code to the therm driver to see what's going on; it's possible that the heat is ramping up too fast for the fan algorithm to catch up. Above max + 8 °C, my code triggers a shutdown. Ben.
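A minimal sketch of a test helper for the /sbin/critical-overtemp check suggested above, assuming the therm driver simply executes that path with no arguments shortly before shutting down. The function name log_overtemp and any log path passed to it are hypothetical, chosen only for illustration. It appends one timestamped line to a file and forces it to disk, so the record should survive the power-off:

```c
#define _POSIX_C_SOURCE 200809L	/* for fileno() and fsync() */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Append one timestamped record to the file at `path` and flush it to
 * disk. Returns 0 on success, -1 on any failure.
 * The name log_overtemp is hypothetical, not part of any driver API.
 */
int log_overtemp(const char *path)
{
	char stamp[32];
	time_t now = time(NULL);
	struct tm *tm = localtime(&now);
	FILE *f;
	int err = 0;

	if (!tm || strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", tm) == 0)
		return -1;

	f = fopen(path, "a");
	if (!f)
		return -1;

	if (fprintf(f, "%s critical-overtemp invoked\n", stamp) < 0)
		err = -1;
	/* Push the record out of stdio, then out of the page cache. */
	if (fflush(f) != 0 || fsync(fileno(f)) != 0)
		err = -1;
	if (fclose(f) != 0)
		err = -1;

	return err;
}
```

Wrapped in a main() that only calls log_overtemp() with a path such as /var/log/critical-overtemp.log (again, an assumed location) and installed as /sbin/critical-overtemp, a fresh line in the log shortly before each power-off would suggest that the driver's overtemp path is what shuts the machine down; no new line would point elsewhere.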