From apw at shadowen.org Thu Dec 1 04:34:50 2005
From: apw at shadowen.org (Andy Whitcroft)
Date: Wed, 30 Nov 2005 17:34:50 +0000
Subject: [PATCH 1/2] powerpc powermac adb fix dependancy on btext_drawchar
References:
Message-ID: <20051130173450.GA851@shadowen.org>

powerpc: powermac, adb fix dependancy on btext_drawchar

udbg_adb_init() has become dependent on btext_drawchar, even when
BOOTX_TEXT support is not selected. This leads to the error below.
Make the check dependent on BOOTX_TEXT.

  LD      .tmp_vmlinux1
  arch/powerpc/platforms/built-in.o(.toc1+0xa40): undefined reference to `btext_drawchar'

Signed-off-by: Andy Whitcroft
---
diff -upN reference/arch/powerpc/platforms/powermac/udbg_adb.c current/arch/powerpc/platforms/powermac/udbg_adb.c
--- reference/arch/powerpc/platforms/powermac/udbg_adb.c
+++ current/arch/powerpc/platforms/powermac/udbg_adb.c
@@ -171,9 +171,12 @@ int udbg_adb_init(int force_btext)
 	udbg_adb_old_getc_poll = udbg_getc_poll;
 
 	/* Check if our early init was already called */
-	if (udbg_adb_old_putc == udbg_adb_putc ||
-	    udbg_adb_old_putc == btext_drawchar)
+	if (udbg_adb_old_putc == udbg_adb_putc)
 		udbg_adb_old_putc = NULL;
+#ifdef CONFIG_BOOTX_TEXT
+	if (udbg_adb_old_putc == btext_drawchar)
+		udbg_adb_old_putc = NULL;
+#endif
 
 	/* Set ours as output */
 	udbg_putc = udbg_adb_putc;

From apw at shadowen.org Thu Dec 1 04:34:40 2005
From: apw at shadowen.org (Andy Whitcroft)
Date: Wed, 30 Nov 2005 17:34:40 +0000
Subject: [PATCH 0/2] 2.6.15rc3mm1 ppc64 compile problems
References: <20051129203134.13b93f48.akpm@osdl.org>
Message-ID:

Testing 2.6.15-rc3-mm1 seems to have issues on ppc64 systems using the
powerpc architecture. The problems are in the powermac support relating
to the BOOTX_TEXT support.
Following this email are a couple of patches to clean up this build:

powerpc-powermac-adb-fix-dependancy-on-btext_drawchar: fix up a
	dependency problem on BOOTX_TEXT
powerpc-powermac-adb-fix-udbg_adb_use_btext-warning: clean up a
	warning with unused externals

Comments?

-apw

From apw at shadowen.org Thu Dec 1 04:35:01 2005
From: apw at shadowen.org (Andy Whitcroft)
Date: Wed, 30 Nov 2005 17:35:01 +0000
Subject: [PATCH 2/2] powerpc powermac adb fix udbg_adb_use_btext warning
References:
Message-ID: <20051130173501.GA863@shadowen.org>

powerpc: powermac, adb fix udbg_adb_use_btext warning

When compiling without BOOTX_TEXT the following warning is emitted.
Fix up the definition to only be made when required.

  CC      arch/powerpc/platforms/powermac/udbg_adb.o
  .../arch/powerpc/platforms/powermac/udbg_adb.c:41: warning: `udbg_adb_use_btext' defined but not used

Signed-off-by: Andy Whitcroft
---
diff -upN reference/arch/powerpc/platforms/powermac/udbg_adb.c current/arch/powerpc/platforms/powermac/udbg_adb.c
--- reference/arch/powerpc/platforms/powermac/udbg_adb.c
+++ current/arch/powerpc/platforms/powermac/udbg_adb.c
@@ -38,8 +38,6 @@ static enum {
 	input_adb_cuda,
 } input_type = input_adb_none;
 
-static int udbg_adb_use_btext;
-
 int xmon_wants_key, xmon_adb_keycode;
 
 static inline void udbg_adb_poll(void)
@@ -55,6 +53,8 @@ static inline void udbg_adb_poll(void)
 }
 
 #ifdef CONFIG_BOOTX_TEXT
+
+static int udbg_adb_use_btext;
 static int xmon_adb_shiftstate;
 
 static unsigned char xmon_keytab[128] =

From kravetz at us.ibm.com Thu Dec 1 08:47:23 2005
From: kravetz at us.ibm.com (Mike Kravetz)
Date: Wed, 30 Nov 2005 13:47:23 -0800
Subject: [PATCH] updated: Minor numa memory code cleanup
Message-ID: <20051130214723.GC29166@w-mikek2.ibm.com>

Here is an updated version of the patch that panics if no memory is
found as Nathan suggested. I'm still concerned that panic strings
(not just the one added here) at this stage of booting do not show
up on my system.
But, that is an issue separate from this patch.

Combine get_mem_*_cells() routines to avoid multiple memory node lookups.
Added missing of_node_put() call. Changed variable names to help with
some confusion as to meaning.

Signed-off-by: Mike Kravetz

diff -Naupr linux-2.6.15-rc3-git1/arch/powerpc/mm/numa.c linux-2.6.15-rc3-git1.work/arch/powerpc/mm/numa.c
--- linux-2.6.15-rc3-git1/arch/powerpc/mm/numa.c	2005-11-29 03:51:27.000000000 +0000
+++ linux-2.6.15-rc3-git1.work/arch/powerpc/mm/numa.c	2005-11-30 19:53:41.000000000 +0000
@@ -254,29 +254,17 @@ static int __init find_min_common_depth(
 	return depth;
 }
 
-static int __init get_mem_addr_cells(void)
+static void __init get_n_mem_cells(int *n_addr_cells, int *n_size_cells)
 {
 	struct device_node *memory = NULL;
-	int rc;
 
 	memory = of_find_node_by_type(memory, "memory");
 	if (!memory)
-		return 0; /* it won't matter */
+		panic("numa.c: No memory nodes found!");
 
-	rc = prom_n_addr_cells(memory);
-	return rc;
-}
-
-static int __init get_mem_size_cells(void)
-{
-	struct device_node *memory = NULL;
-	int rc;
-
-	memory = of_find_node_by_type(memory, "memory");
-	if (!memory)
-		return 0; /* it won't matter */
-	rc = prom_n_size_cells(memory);
-	return rc;
+	*n_addr_cells = prom_n_addr_cells(memory);
+	*n_size_cells = prom_n_size_cells(memory);
+	of_node_put(memory);
 }
 
 static unsigned long __init read_n_cells(int n, unsigned int **buf)
@@ -386,7 +374,7 @@ static int __init parse_numa_properties(
 {
 	struct device_node *cpu = NULL;
 	struct device_node *memory = NULL;
-	int addr_cells, size_cells;
+	int n_addr_cells, n_size_cells;
 	int max_domain;
 	unsigned long i;
 
@@ -425,8 +413,7 @@ static int __init parse_numa_properties(
 		}
 	}
 
-	addr_cells = get_mem_addr_cells();
-	size_cells = get_mem_size_cells();
+	get_n_mem_cells(&n_addr_cells, &n_size_cells);
 	memory = NULL;
 	while ((memory = of_find_node_by_type(memory, "memory")) != NULL) {
 		unsigned long start;
@@ -443,8 +430,8 @@ static int __init parse_numa_properties(
 		ranges = memory->n_addrs;
 new_range:
 		/* these are order-sensitive, and modify the buffer pointer */
-		start = read_n_cells(addr_cells, &memcell_buf);
-		size = read_n_cells(size_cells, &memcell_buf);
+		start = read_n_cells(n_addr_cells, &memcell_buf);
+		size = read_n_cells(n_size_cells, &memcell_buf);
 
 		numa_domain = of_node_numa_domain(memory);
 

From michael at ellerman.id.au Thu Dec 1 09:33:36 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Wed, 30 Nov 2005 16:33:36 -0600
Subject: [PATCH] updated: Minor numa memory code cleanup
In-Reply-To: <20051130214723.GC29166@w-mikek2.ibm.com>
References: <20051130214723.GC29166@w-mikek2.ibm.com>
Message-ID: <200511301633.42773.michael@ellerman.id.au>

On Wed, 30 Nov 2005 15:47, Mike Kravetz wrote:
> Here is an updated version of the patch that panics if no memory is
> found as Nathan suggested. I'm still concerned that panic strings
> (not just the one added here) at this stage of booting do not show
> up on my system. But, that is an issue separate from this patch.

You probably need to enable one of the EARLY_DEBUG_INIT macros, in
arch/powerpc/kernel/setup_64.c.

I'm guessing you're on some LPAR machine if you're debugging NUMA? If so
you'll want the LPAR debugging. It'll only work if you have a 'hvterm1'
compatible console as your /chosen/linux,stdout-path, and it has to be
vterm 0 (check the reg property).

cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051130/63f3acac/attachment.pgp

From kravetz at us.ibm.com Thu Dec 1 09:49:00 2005
From: kravetz at us.ibm.com (Mike Kravetz)
Date: Wed, 30 Nov 2005 14:49:00 -0800
Subject: [PATCH] updated: Minor numa memory code cleanup
In-Reply-To: <200511301633.42773.michael@ellerman.id.au>
References: <20051130214723.GC29166@w-mikek2.ibm.com> <200511301633.42773.michael@ellerman.id.au>
Message-ID: <20051130224900.GE29166@w-mikek2.ibm.com>

On Wed, Nov 30, 2005 at 04:33:36PM -0600, Michael Ellerman wrote:
> On Wed, 30 Nov 2005 15:47, Mike Kravetz wrote:
> > Here is an updated version of the patch that panics if no memory is
> > found as Nathan suggested. I'm still concerned that panic strings
> > (not just the one added here) at this stage of booting do not show
> > up on my system. But, that is an issue separate from this patch.
>
> You probably need to enable one of the EARLY_DEBUG_INIT macros, in
> arch/powerpc/kernel/setup_64.c.

I was thinking more about debugging production systems in the field
where we may not have the luxury of booting a debug kernel.

Seem to recall a situation in the past where someone ran into a
problem in numa.c that called panic. Didn't get the panic message
displayed on the console. Had them enable xmon, and dig the panic
message out of the console buffer. Sure would be nice if we could
get all those early catastrophic failure messages to the console.
-- 
Mike

From michael at ellerman.id.au Thu Dec 1 10:22:21 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Wed, 30 Nov 2005 17:22:21 -0600
Subject: [PATCH] updated: Minor numa memory code cleanup
In-Reply-To: <20051130224900.GE29166@w-mikek2.ibm.com>
References: <20051130214723.GC29166@w-mikek2.ibm.com> <200511301633.42773.michael@ellerman.id.au> <20051130224900.GE29166@w-mikek2.ibm.com>
Message-ID: <200511301722.26991.michael@ellerman.id.au>

On Wed, 30 Nov 2005 16:49, Mike Kravetz wrote:
> On Wed, Nov 30, 2005 at 04:33:36PM -0600, Michael Ellerman wrote:
> > On Wed, 30 Nov 2005 15:47, Mike Kravetz wrote:
> > > Here is an updated version of the patch that panics if no memory is
> > > found as Nathan suggested. I'm still concerned that panic strings
> > > (not just the one added here) at this stage of booting do not show
> > > up on my system. But, that is an issue separate from this patch.
> >
> > You probably need to enable one of the EARLY_DEBUG_INIT macros, in
> > arch/powerpc/kernel/setup_64.c.
>
> I was thinking more about debugging production systems in the field
> where we may not have the luxury of booting a debug kernel.

Sure, the nature of early debug is that it's trying to tap things that may
or may not be around and/or configured - so it's not enabled by default
because it will cause some machines to not boot. That's just the way it is.

> Seem to recall a situation in the past where someone ran into a
> problem in numa.c that called panic. Didn't get the panic message
> displayed on the console. Had them enable xmon, and dig the panic
> message out of the console buffer. Sure would be nice if we could
> get all those early catastrophic failure messages to the console.

Hmm, I'd have to check the code, but if xmon is working then you should be
able to see the panic. If it's a real problem you could look at generalising
init/main.c's panic_later mechanism, which delays a panic until after the
console is initialised?
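
[Editorial sketch, not part of the original message.] The deferred-panic
idea mentioned above can be modelled in a few lines of user-space C: an
early failure is recorded rather than raised, and raised once the console
can show it. This is only an illustration under stated assumptions. The
names defer_panic(), console_is_up() and report_fatal() are hypothetical
and are not the identifiers used in init/main.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* All names here are hypothetical; this models the idea only. */
static const char *deferred_msg;  /* first failure seen before the console is up */
static const char *reported_msg;  /* what finally reached the "console" */
static int console_ready;

/* Stand-in for panic(): print the message and remember it. */
static void report_fatal(const char *msg)
{
	fprintf(stderr, "Kernel panic: %s\n", msg);
	reported_msg = msg;
}

/* Early boot code calls this instead of panicking immediately. */
static void defer_panic(const char *msg)
{
	assert(msg != NULL);
	if (console_ready)
		report_fatal(msg);
	else if (!deferred_msg)
		deferred_msg = msg;	/* keep only the first failure */
}

/* Called once the console works; returns 1 if a deferred panic fired. */
static int console_is_up(void)
{
	console_ready = 1;
	if (!deferred_msg)
		return 0;
	report_fatal(deferred_msg);
	return 1;
}
```

Keeping only the first message is a design choice in this sketch: later
failures are usually consequences of the first one.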
cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051130/81a6a7a7/attachment.pgp

From j_vvprasad at yahoo.co.in Thu Dec 1 19:48:11 2005
From: j_vvprasad at yahoo.co.in (veera venkata prasad j)
Date: Thu, 1 Dec 2005 00:48:11 -0800 (PST)
Subject: Booting OS on PowerPC
Message-ID: <20051201084811.53930.qmail@web8508.mail.in.yahoo.com>

Hi all,

Can anybody tell me how Linux boots on a PowerPC machine
when Open Firmware is up? To be more precise, what is
the "known-environment" that the OS expects from Open
Firmware?

Regards
Prasad Jvv.

From vatsa at in.ibm.com Fri Dec 2 01:26:19 2005
From: vatsa at in.ibm.com (Srivatsa Vaddagiri)
Date: Thu, 1 Dec 2005 19:56:19 +0530
Subject: [PATCH] NO_IDLE_HZ patch updated to 2.6.15-rc3-mm1
Message-ID: <20051201142619.GA6157@in.ibm.com>

Hello,
	Here's an updated patch to implement NO_IDLE_HZ on PPC64. The patch
is against 2.6.15-rc3-mm1 and has been tested on a Power5 LPAR.

The patches attached are:

boot_cpu_fix.patch -> Lets do_timer be called from any CPU
no_idle_hz.patch   -> Implement tickless idle CPUs for PPC64
debug.patch        -> Debug patch that I used for getting decrementer
                      statistics. We need a cleaner solution if we
                      have to expose those statistics.

Let me know if you have any comments on these patches.
-- Thanks and Regards, Srivatsa Vaddagiri, Linux Technology Center, IBM Software Labs, Bangalore, INDIA - 560017 -------------- next part -------------- Currently xtime/jiffies is updated by only boot CPU which makes it difficult for an idle boot CPU to skip ticks. The patch overcomes this limitation and lets xtime/jiffies be updated from any CPU. Signed-off-by: Srivatsa Vaddagiri --- diff -puN arch/powerpc/kernel/time.c~boot_cpu_fix arch/powerpc/kernel/time.c --- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/time.c~boot_cpu_fix 2005-12-01 13:14:55.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/time.c 2005-12-01 13:21:52.000000000 -0800 @@ -420,6 +420,7 @@ void timer_interrupt(struct pt_regs * re int next_dec; int cpu = smp_processor_id(); unsigned long ticks; + int end_singleshot = 0; #ifdef CONFIG_PPC32 if (atomic_read(&ppc_n_lost_interrupts) != 0) @@ -452,23 +453,29 @@ void timer_interrupt(struct pt_regs * re if (!cpu_is_offline(cpu)) update_process_times(user_mode(regs)); - /* - * No need to check whether cpu is offline here; boot_cpuid - * should have been fixed up by now. 
- */ - if (cpu != boot_cpuid) - continue; - write_seqlock(&xtime_lock); - tb_last_jiffy += tb_ticks_per_jiffy; - tb_last_stamp = per_cpu(last_jiffy, cpu); - timer_recalc_offset(tb_last_jiffy); - do_timer(regs); - timer_sync_xtime(tb_last_jiffy); - timer_check_rtc(); + if (tb_ticks_since(tb_last_stamp) >= tb_ticks_per_jiffy) { + tb_last_jiffy += tb_ticks_per_jiffy; + tb_last_stamp += tb_ticks_per_jiffy; + if (__USE_RTC() && tb_last_stamp >= 1000000000) + tb_last_stamp -= 1000000000; + timer_recalc_offset(tb_last_jiffy); + do_timer(regs); + timer_sync_xtime(tb_last_jiffy); + timer_check_rtc(); + } + if (adjusting_time && (time_adjust == 0)) { + adjusting_time = 0; + end_singleshot = 1; + } write_sequnlock(&xtime_lock); - if (adjusting_time && (time_adjust == 0)) + + if (end_singleshot) { +#ifdef DEBUG_PPC_ADJTIMEX + printk("ppc_adjtimex: ending single shot time_adjust\n"); +#endif ppc_adjtimex(); + } } next_dec = tb_ticks_per_jiffy - ticks; @@ -826,13 +833,6 @@ void ppc_adjtimex(void) if ( time_adjust < 0 ) singleshot_ppm = -singleshot_ppm; } - else { -#ifdef DEBUG_PPC_ADJTIMEX - if ( adjusting_time ) - printk("ppc_adjtimex: ending single shot time_adjust\n"); -#endif - adjusting_time = 0; - } /* Add up all of the frequency adjustments */ delta_freq = time_freq + ltemp + singleshot_ppm; _ -------------- next part -------------- This patch causes idle CPUs to skip timer ticks until the next scheduled event (next_timer_interrupt()) or until some max duration allowed by the decrementer. This helps to conserve power and on virtual partitions using shared processors, allows for efficient CPU utilization. Currently, only few idle routines have been converted over to use this feature. Other idle routine could be converted over later depending on the requirement. 
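
[Editorial sketch, not part of the original message.] The description
above says an idle CPU sleeps until the next scheduled event or until the
longest interval the decrementer allows. A minimal sketch of that clamp
follows, assuming a 32-bit decrementer and the MIN_SKIP threshold of 2
used in the patch. The function name ticks_to_skip() and its parameters
are illustrative only; the patch does this calculation inside
stop_hz_timer(), within a seqlock read loop.

```c
#include <assert.h>
#include <limits.h>

#define MIN_SKIP 2UL	/* threshold taken from the patch */

/*
 * Return how many jiffies an idle CPU may sleep, or 0 if it should keep
 * ticking. "now" and "next_event" are jiffies values; all names here are
 * illustrative.
 */
static unsigned long ticks_to_skip(unsigned long now, unsigned long next_event,
				   unsigned long tb_ticks_per_jiffy)
{
	unsigned long max_skip, delta;

	assert(tb_ticks_per_jiffy != 0);
	/* the decrementer is 32 bits wide, so this bounds the sleep */
	max_skip = UINT_MAX / tb_ticks_per_jiffy;
	/* unsigned subtraction stays correct across jiffies wrap-around */
	delta = next_event - now;

	if (delta < MIN_SKIP)
		return 0;		/* not worth skipping */
	if (delta > max_skip)
		delta = max_skip;	/* wake early and re-evaluate */
	return delta;
}
```

For example, with 1000 timebase ticks per jiffy the sleep is capped at
UINT_MAX / 1000 jiffies no matter how far away the next timer is.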
Signed-off-by : Srivatsa Vaddagiri --- diff -puN arch/powerpc/kernel/time.c~no_idle_hz arch/powerpc/kernel/time.c --- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/time.c~no_idle_hz 2005-12-01 16:06:28.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/time.c 2005-12-01 16:18:52.000000000 -0800 @@ -401,40 +401,13 @@ static void iSeries_tb_recal(void) } #endif -/* - * For iSeries shared processors, we have to let the hypervisor - * set the hardware decrementer. We set a virtual decrementer - * in the lppaca and call the hypervisor if the virtual - * decrementer is less than the current value in the hardware - * decrementer. (almost always the new decrementer value will - * be greater than the current hardware decementer so the hypervisor - * call will not be needed) - */ - -/* - * timer_interrupt - gets called when the decrementer overflows, - * with interrupts disabled. - */ -void timer_interrupt(struct pt_regs * regs) +static void account_ticks(struct pt_regs *regs) { int next_dec; int cpu = smp_processor_id(); unsigned long ticks; int end_singleshot = 0; -#ifdef CONFIG_PPC32 - if (atomic_read(&ppc_n_lost_interrupts) != 0) - do_IRQ(regs); -#endif - - irq_enter(); - - profile_tick(CPU_PROFILING, regs); - -#ifdef CONFIG_PPC_ISERIES - get_paca()->lppaca.int_dword.fields.decr_int = 0; -#endif - while ((ticks = tb_ticks_since(per_cpu(last_jiffy, cpu))) >= tb_ticks_per_jiffy) { /* Update last_jiffy */ @@ -480,6 +453,58 @@ void timer_interrupt(struct pt_regs * re next_dec = tb_ticks_per_jiffy - ticks; set_dec(next_dec); +} + +#ifdef CONFIG_NO_IDLE_HZ +/* Returns 1 if this CPU was set in the mask */ +static inline int clear_hzless_mask(void) +{ + unsigned long cpu = smp_processor_id(); + int rc = 0; + + if (unlikely(cpu_isset(cpu, nohz_cpu_mask))) { + cpu_clear(cpu, nohz_cpu_mask); + rc = 1; + } + + return rc; +} +#else +static inline int clear_hzless_mask(void) { return 0;} +#endif + +/* + * For iSeries shared processors, we have to let the hypervisor + * set 
the hardware decrementer. We set a virtual decrementer + * in the lppaca and call the hypervisor if the virtual + * decrementer is less than the current value in the hardware + * decrementer. (almost always the new decrementer value will + * be greater than the current hardware decementer so the hypervisor + * call will not be needed) + */ + +/* + * timer_interrupt - gets called when the decrementer overflows, + * with interrupts disabled. + */ +void timer_interrupt(struct pt_regs * regs) +{ +#ifdef CONFIG_PPC32 + if (atomic_read(&ppc_n_lost_interrupts) != 0) + do_IRQ(regs); +#endif + + irq_enter(); + + clear_hzless_mask(); + + profile_tick(CPU_PROFILING, regs); + +#ifdef CONFIG_PPC_ISERIES + get_paca()->lppaca.int_dword.fields.decr_int = 0; +#endif + + account_ticks(regs); #ifdef CONFIG_PPC_ISERIES if (hvlpevent_is_pending()) @@ -497,6 +522,72 @@ void timer_interrupt(struct pt_regs * re irq_exit(); } +#ifdef CONFIG_NO_IDLE_HZ + +#define MAX_DEC_COUNT (UINT_MAX) /* Decrementer is 32-bit */ +#define MIN_SKIP 2 +#define MAX_SKIP (MAX_DEC_COUNT/tb_ticks_per_jiffy) + +int sysctl_hz_timer = 1; + +/* Avoid the HZ timer (decrementer) interrupt on this CPU for "some" time. + * This is accomplished by loading the decrementer with some large calculated + * value. The CPU exits this "tickless" state upon the occurence of an + * exception or external interrupt, at which point the decrementer is again + * reprogrammed to restore the timer interrupt frequency (see start_hz_timer). + * Caller has to ensure that the CPU does not exit the "tickless" idle state + * via other means. + * + * Has to be called with interrupts disabled. 
+ */ +void stop_hz_timer(void) +{ + unsigned long cpu = smp_processor_id(), seq, delta; + int next_dec; + + if (sysctl_hz_timer != 0) + return; + + cpu_set(cpu, nohz_cpu_mask); + smp_mb(); + if (rcu_pending(cpu) || local_softirq_pending()) { + cpu_clear(cpu, nohz_cpu_mask); + return; + } + + do { + seq = read_seqbegin(&xtime_lock); + + delta = next_timer_interrupt() - jiffies; + + if (delta < MIN_SKIP) { + cpu_clear(cpu, nohz_cpu_mask); + return; + } + + if (delta > MAX_SKIP) + delta = MAX_SKIP; + + next_dec = tb_last_stamp + delta * tb_ticks_per_jiffy; + + } while (read_seqretry(&xtime_lock, seq)); + + next_dec -= get_tbl(); + set_dec(next_dec); + + return; +} + +/* Take into account skipped ticks and restore the HZ timer frequency */ +void start_hz_timer(struct pt_regs *regs) +{ + if (clear_hzless_mask()) + account_ticks(regs); +} + +#endif /* CONFIG_NO_IDLE_HZ */ + + void wakeup_decrementer(void) { int i; diff -puN arch/powerpc/kernel/irq.c~no_idle_hz arch/powerpc/kernel/irq.c --- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/irq.c~no_idle_hz 2005-12-01 16:06:28.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/irq.c 2005-12-01 16:18:52.000000000 -0800 @@ -59,6 +59,7 @@ #include #include #include +#include #ifdef CONFIG_PPC_ISERIES #include #endif @@ -192,6 +193,8 @@ void do_IRQ(struct pt_regs *regs) irq_enter(); + start_hz_timer(regs); + #ifdef CONFIG_DEBUG_STACKOVERFLOW /* Debugging check for stack overflow: is there less than 2KB free? 
*/ { diff -puN include/asm-powerpc/time.h~no_idle_hz include/asm-powerpc/time.h --- linux-2.6.15-rc3-mm1/include/asm-powerpc/time.h~no_idle_hz 2005-12-01 16:06:39.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/include/asm-powerpc/time.h 2005-12-01 16:06:39.000000000 -0800 @@ -198,6 +198,14 @@ static inline unsigned long tb_ticks_sin return get_tbl() - tstamp; } +#ifdef CONFIG_NO_IDLE_HZ +extern void stop_hz_timer(void); +extern void start_hz_timer(struct pt_regs *); +#else +static inline void stop_hz_timer(void) { } +static inline void start_hz_timer(struct pt_regs *regs) { } +#endif + #define mulhwu(x,y) \ ({unsigned z; asm ("mulhwu %0,%1,%2" : "=r" (z) : "r" (x), "r" (y)); z;}) diff -puN arch/powerpc/Kconfig~no_idle_hz arch/powerpc/Kconfig --- linux-2.6.15-rc3-mm1/arch/powerpc/Kconfig~no_idle_hz 2005-12-01 16:06:28.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/Kconfig 2005-12-01 16:06:28.000000000 -0800 @@ -532,6 +532,12 @@ config HOTPLUG_CPU Say N if you are unsure. +config NO_IDLE_HZ + depends on EXPERIMENTAL && (PPC_PSERIES || PPC_PMAC || PPC_MAPLE) + bool "Skip timer ticks on idle CPUs (EXPERIMENTAL)" + help + Switches the HZ timer interrupts off when a CPU is idle. 
+ config KEXEC bool "kexec system call (EXPERIMENTAL)" depends on PPC_MULTIPLATFORM && EXPERIMENTAL diff -puN kernel/sysctl.c~no_idle_hz kernel/sysctl.c --- linux-2.6.15-rc3-mm1/kernel/sysctl.c~no_idle_hz 2005-12-01 16:06:36.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/kernel/sysctl.c 2005-12-01 16:06:36.000000000 -0800 @@ -542,6 +542,16 @@ static ctl_table kern_table[] = { .extra1 = &minolduid, .extra2 = &maxolduid, }, +#ifdef CONFIG_NO_IDLE_HZ + { + .ctl_name = KERN_HZ_TIMER, + .procname = "hz_timer", + .data = &sysctl_hz_timer, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, +#endif #ifdef CONFIG_ARCH_S390 #ifdef CONFIG_MATHEMU { @@ -553,16 +563,6 @@ static ctl_table kern_table[] = { .proc_handler = &proc_dointvec, }, #endif -#ifdef CONFIG_NO_IDLE_HZ - { - .ctl_name = KERN_HZ_TIMER, - .procname = "hz_timer", - .data = &sysctl_hz_timer, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = &proc_dointvec, - }, -#endif { .ctl_name = KERN_S390_USER_DEBUG_LOGGING, .procname = "userprocess_debug", diff -puN arch/powerpc/platforms/pseries/setup.c~no_idle_hz arch/powerpc/platforms/pseries/setup.c --- linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/setup.c~no_idle_hz 2005-12-01 16:06:28.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/platforms/pseries/setup.c 2005-12-01 16:18:52.000000000 -0800 @@ -461,9 +461,10 @@ static inline void dedicated_idle_sleep( * a prod occurs. Returning from the cede enables external * interrupts. */ - if (!need_resched()) + if (!need_resched()) { + stop_hz_timer(); cede_processor(); - else + } else local_irq_enable(); set_thread_flag(TIF_POLLING_NRFLAG); } else { @@ -553,9 +554,10 @@ static void pseries_shared_idle(void) * Check need_resched() again with interrupts disabled * to avoid a race. 
*/ - if (!need_resched()) + if (!need_resched()) { + stop_hz_timer(); cede_processor(); - else + } else local_irq_enable(); HMT_medium(); diff -puN arch/powerpc/kernel/traps.c~no_idle_hz arch/powerpc/kernel/traps.c --- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/traps.c~no_idle_hz 2005-12-01 16:06:28.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/traps.c 2005-12-01 16:06:28.000000000 -0800 @@ -40,6 +40,7 @@ #include #include #include +#include #ifdef CONFIG_PPC32 #include #endif @@ -889,6 +890,7 @@ void altivec_unavailable_exception(struc #if defined(CONFIG_PPC64) || defined(CONFIG_E500) void performance_monitor_exception(struct pt_regs *regs) { + start_hz_timer(regs); perf_irq(regs); } #endif diff -puN arch/powerpc/kernel/idle_64.c~no_idle_hz arch/powerpc/kernel/idle_64.c --- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/idle_64.c~no_idle_hz 2005-12-01 16:06:28.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/idle_64.c 2005-12-01 16:18:52.000000000 -0800 @@ -66,8 +66,12 @@ void native_idle(void) while (1) { ppc64_runlatch_off(); - if (!need_resched()) - power4_idle(); + local_irq_disable(); + if (!need_resched()) { + stop_hz_timer(); + local_irq_enable(); + power4_idle(); + } if (need_resched()) { ppc64_runlatch_on(); _ -------------- next part -------------- This patch is a quick hack to get decrementer statistics. Not meant for inclusion. 
--- diff -puN arch/powerpc/kernel/time.c~debug arch/powerpc/kernel/time.c --- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/time.c~debug 2005-12-01 16:19:07.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/time.c 2005-12-01 16:19:07.000000000 -0800 @@ -489,6 +489,8 @@ static inline int clear_hzless_mask(void */ void timer_interrupt(struct pt_regs * regs) { + int cpu = smp_processor_id(); + #ifdef CONFIG_PPC32 if (atomic_read(&ppc_n_lost_interrupts) != 0) do_IRQ(regs); @@ -498,6 +500,8 @@ void timer_interrupt(struct pt_regs * re clear_hzless_mask(); + kstat_cpu(cpu).irqs[0]++; + profile_tick(CPU_PROFILING, regs); #ifdef CONFIG_PPC_ISERIES @@ -548,6 +552,9 @@ void stop_hz_timer(void) if (sysctl_hz_timer != 0) return; + if (cpu_isset(cpu, nohz_cpu_mask)) + return; + cpu_set(cpu, nohz_cpu_mask); smp_mb(); if (rcu_pending(cpu) || local_softirq_pending()) { diff -puN arch/powerpc/kernel/idle_64.c~debug arch/powerpc/kernel/idle_64.c --- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/idle_64.c~debug 2005-12-01 16:19:07.000000000 -0800 +++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/idle_64.c 2005-12-01 16:19:07.000000000 -0800 @@ -41,6 +41,11 @@ void default_idle(void) while (!need_resched() && !cpu_is_offline(cpu)) { ppc64_runlatch_off(); + local_irq_disable(); + if (!need_resched()) + stop_hz_timer(); + local_irq_enable(); + /* * Go into low thread priority and possibly * low power mode. 
diff -puN arch/powerpc/kernel/irq.c~debug arch/powerpc/kernel/irq.c
--- linux-2.6.15-rc3-mm1/arch/powerpc/kernel/irq.c~debug	2005-12-01 16:19:07.000000000 -0800
+++ linux-2.6.15-rc3-mm1-root/arch/powerpc/kernel/irq.c	2005-12-01 16:19:07.000000000 -0800
@@ -107,6 +107,10 @@ int show_interrupts(struct seq_file *p, 
 		for_each_online_cpu(j)
 			seq_printf(p, "CPU%d ", j);
 		seq_putc(p, '\n');
+		seq_printf(p, "%3d: ", i);
+		for_each_online_cpu(j)
+			seq_printf(p, "%10u ", kstat_cpu(j).irqs[i]);
+		seq_putc(p, '\n');
 	}
 
 	if (i < NR_IRQS) {
diff -puN arch/powerpc/platforms/pseries/setup.c~debug arch/powerpc/platforms/pseries/setup.c
--- linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/setup.c~debug	2005-12-01 16:19:07.000000000 -0800
+++ linux-2.6.15-rc3-mm1-root/arch/powerpc/platforms/pseries/setup.c	2005-12-01 16:19:07.000000000 -0800
@@ -498,6 +498,11 @@ static void pseries_dedicated_idle(void)
 		while (!need_resched() && !cpu_is_offline(cpu)) {
 			ppc64_runlatch_off();
 
+			local_irq_disable();
+			if (!need_resched())
+				stop_hz_timer();
+			local_irq_enable();
+
 			/*
 			 * Go into low thread priority and possibly
 			 * low power mode.
_

From linas at austin.ibm.com Fri Dec 2 03:57:22 2005
From: linas at austin.ibm.com (linas)
Date: Thu, 1 Dec 2005 10:57:22 -0600
Subject: Booting OS on PowerPC
In-Reply-To: <20051201084811.53930.qmail@web8508.mail.in.yahoo.com>
References: <20051201084811.53930.qmail@web8508.mail.in.yahoo.com>
Message-ID: <20051201165721.GJ31651@austin.ibm.com>

On Thu, Dec 01, 2005 at 12:48:11AM -0800, veera venkata prasad j was heard to remark:
> Can anybody tell me how Linux boots on a PowerPC machine
> when Open Firmware is up? To be more precise, what is
> the "known-environment" that the OS expects from Open
> Firmware?

Can you be more specific? What answer are you looking for?
--linas

From linas at austin.ibm.com Fri Dec 2 11:42:32 2005
From: linas at austin.ibm.com (linas)
Date: Thu, 1 Dec 2005 18:42:32 -0600
Subject: [PATCH] powerpc/pseries: dlpar-add crash on null pointer deref
Message-ID: <20051202004232.GN31651@austin.ibm.com>

Paul, Please apply.

This patch fixes a crash on null-pointer deref during dlpar slot addition.

Signed-off-by: Linas Vepstas

Index: linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/eeh.c
===================================================================
--- linux-2.6.15-rc3-mm1.orig/arch/powerpc/platforms/pseries/eeh.c	2005-12-01 17:30:21.000000000 -0600
+++ linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/eeh.c	2005-12-01 18:18:29.808112099 -0600
@@ -698,7 +698,7 @@
 	int enable;
 	struct pci_dn *pdn = PCI_DN(dn);
 
-	pdn->class_code = *class_code;
+	pdn->class_code = 0;
 	pdn->eeh_mode = 0;
 	pdn->eeh_check_count = 0;
 	pdn->eeh_freeze_count = 0;
@@ -715,6 +715,7 @@
 		pdn->eeh_mode |= EEH_MODE_NOCHECK;
 		return NULL;
 	}
+	pdn->class_code = *class_code;
 
 	/*
 	 * Now decide if we are going to "Disable" EEH checking

From linas at austin.ibm.com Fri Dec 2 11:56:14 2005
From: linas at austin.ibm.com (linas)
Date: Thu, 1 Dec 2005 18:56:14 -0600
Subject: [PATCH 1/2] PCI Hotplug/powerpc: remove duplicated code
Message-ID: <20051202005614.GO31651@austin.ibm.com>

Greg, Please apply!

--linas

The RPAPHP code contains a routine that duplicates some existing code.
This patch removes the rpaphp version of the code.
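
[Editorial sketch, not part of the original message.] The routine removed
by this patch visits every child of a device node recursively and then
handles the node itself. A minimal model of that children-first walk
follows, assuming a node type with only child and sibling links. The names
struct node, walk_tree() and record_node() are illustrative and are not
kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for struct device_node: only the links used here. */
struct node {
	const char *name;
	struct node *child;	/* first child */
	struct node *sibling;	/* next node with the same parent */
};

/* Visit every descendant of dn first, then dn itself. */
static void walk_tree(struct node *dn, void (*visit)(struct node *))
{
	struct node *sib;

	assert(dn != NULL);
	for (sib = dn->child; sib; sib = sib->sibling)
		walk_tree(sib, visit);
	visit(dn);
}

/* Example visitor: remember the order in which nodes were seen. */
static const char *order[8];
static int visited;

static void record_node(struct node *dn)
{
	if (visited < 8)
		order[visited] = dn->name;
	visited++;
}
```

With a root that has two children, the children are recorded first and the
root last.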
Signed-off-by: Linas Vepstas

Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c
===================================================================
--- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpaphp_pci.c	2005-12-01 18:36:40.897900661 -0600
+++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c	2005-12-01 18:51:18.686712139 -0600
@@ -287,18 +287,6 @@
 	return dev;
 }
 
-void rpaphp_eeh_init_nodes(struct device_node *dn)
-{
-	struct device_node *sib;
-
-	for (sib = dn->child; sib; sib = sib->sibling)
-		rpaphp_eeh_init_nodes(sib);
-	eeh_add_device_early(dn);
-	return;
-
-}
-EXPORT_SYMBOL_GPL(rpaphp_eeh_init_nodes);
-
 static void print_slot_pci_funcs(struct pci_bus *bus)
 {
 	struct device_node *dn;
@@ -324,7 +312,7 @@
 	if (!dn)
 		goto exit;
 
-	rpaphp_eeh_init_nodes(dn);
+	eeh_add_device_tree_early(dn);
 	dev = rpaphp_pci_config_slot(bus);
 	if (!dev) {
 		err("%s: can't find any devices.\n", __FUNCTION__);
Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c
===================================================================
--- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpadlpar_core.c	2005-12-01 18:36:40.898900520 -0600
+++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c	2005-12-01 18:51:18.687711999 -0600
@@ -154,7 +154,8 @@
 	struct pci_controller *phb = pdn->phb;
 	struct pci_dev *dev = NULL;
 
-	rpaphp_eeh_init_nodes(dn);
+	eeh_add_device_tree_early(dn);
+
 	/* Add EADS device to PHB bus, adding new entry to bus->devices */
 	dev = of_create_pci_dev(dn, phb->bus, pdn->devfn);
 	if (!dev) {

From linas at austin.ibm.com Fri Dec 2 11:59:58 2005
From: linas at austin.ibm.com (linas)
Date: Thu, 1 Dec 2005 18:59:58 -0600
Subject: [PATCH 2/2] PCI Hotplug/powerpc: more removal of duplicated code
In-Reply-To: <20051202005614.GO31651@austin.ibm.com>
References: <20051202005614.GO31651@austin.ibm.com>
Message-ID: <20051202005957.GP31651@austin.ibm.com>

Greg, Please apply!
John Rose, Please review this code!
--linas The RPAPHP code contains two routines that appear to be gratuitous copies of very similar pci code. In particular, rpaphp_claim_resource ~~ pci_claim_resource (there is a minor, non-functional difference) rpadlpar_claim_one_bus == pcibios_claim_one_bus (the code is identical) This patch removes the rpaphp versions of the code. Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-01 18:51:18.686712139 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c 2005-12-01 18:51:26.357635444 -0600 @@ -62,28 +62,6 @@ } EXPORT_SYMBOL_GPL(rpaphp_find_pci_bus); -int rpaphp_claim_resource(struct pci_dev *dev, int resource) -{ - struct resource *res = &dev->resource[resource]; - struct resource *root = pci_find_parent_resource(dev, res); - char *dtype = resource < PCI_BRIDGE_RESOURCES ? "device" : "bridge"; - int err = -EINVAL; - - if (root != NULL) { - err = request_resource(root, res); - } - - if (err) { - err("PCI: %s region %d of %s %s [%lx:%lx]\n", - root ? 
"Address space collision on" : - "No parent found for", - resource, dtype, pci_name(dev), res->start, res->end); - } - return err; -} - -EXPORT_SYMBOL_GPL(rpaphp_claim_resource); - static int rpaphp_get_sensor_state(struct slot *slot, int *state) { int rc; @@ -177,7 +155,7 @@ if (r->parent || !r->start || !r->flags) continue; - rpaphp_claim_resource(dev, i); + pci_claim_resource(dev, i); } } } Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-12-01 18:51:18.687711999 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c 2005-12-01 18:51:26.358635304 -0600 @@ -112,28 +112,6 @@ return NULL; } -static void rpadlpar_claim_one_bus(struct pci_bus *b) -{ - struct list_head *ld; - struct pci_bus *child_bus; - - for (ld = b->devices.next; ld != &b->devices; ld = ld->next) { - struct pci_dev *dev = pci_dev_b(ld); - int i; - - for (i = 0; i < PCI_NUM_RESOURCES; i++) { - struct resource *r = &dev->resource[i]; - - if (r->parent || !r->start || !r->flags) - continue; - rpaphp_claim_resource(dev, i); - } - } - - list_for_each_entry(child_bus, &b->children, node) - rpadlpar_claim_one_bus(child_bus); -} - static struct pci_dev *dlpar_find_new_dev(struct pci_bus *parent, struct device_node *dev_dn) { @@ -171,7 +149,7 @@ rpaphp_init_new_devs(dev->subordinate); /* Claim new bus resources */ - rpadlpar_claim_one_bus(dev->bus); + pcibios_claim_one_bus(dev->bus); /* ioremap() for child bus, which may or may not succeed */ (void) remap_bus_range(dev->bus); From greg at kroah.com Fri Dec 2 12:07:07 2005 From: greg at kroah.com (Greg KH) Date: Thu, 1 Dec 2005 17:07:07 -0800 Subject: [PATCH 1/2] PCI Hotplug/powerpc: remove duplicated code In-Reply-To: <20051202005614.GO31651@austin.ibm.com> References: <20051202005614.GO31651@austin.ibm.com> Message-ID: <20051202010707.GA29258@kroah.com> On Thu, Dec 01, 2005 at 06:56:14PM 
-0600, linas wrote:
>
> Greg,
> Please apply!

I need an ack from John before I'll apply either of these.

John?

thanks,

greg k-h

From kravetz at us.ibm.com Fri Dec 2 12:22:08 2005
From: kravetz at us.ibm.com (Mike Kravetz)
Date: Thu, 1 Dec 2005 17:22:08 -0800
Subject: [PATCH] powerpc/pseries: dlpar-add crash on null pointer deref
In-Reply-To: <20051202004232.GN31651@austin.ibm.com>
References: <20051202004232.GN31651@austin.ibm.com>
Message-ID: <20051202012208.GB9576@monkey.ibm.com>

On Thu, Dec 01, 2005 at 06:42:32PM -0600, linas wrote:
>
> This patch fixes a crash on null-pointer deref during dlpar slot addition.

Just curious, is this specific to adapters? I experienced a crash when
trying to add CPUs. Haven't debugged it yet. But, was able to
successfully add memory.

--
Mike

From kravetz at us.ibm.com Fri Dec 2 12:28:39 2005
From: kravetz at us.ibm.com (Mike Kravetz)
Date: Thu, 1 Dec 2005 17:28:39 -0800
Subject: [PATCH] numa placement for dynamically added memory
Message-ID: <20051202012839.GA14241@monkey.ibm.com>

This patch places dynamically added memory within the appropriate
numa node. A new routine hot_add_scn_to_nid() replicates most of
the memory scanning code in parse_numa_properties().

I'd appreciate it if Anton or Nathan could take a look. I seem to
break something every time I touch numa.c.
This patch depends on the patch I sent yesterday that hits numa.c http://ozlabs.org/pipermail/linuxppc64-dev/2005-December/006923.html Signed-off-by: Mike Kravetz diff -Naupr linux-2.6.15-rc4.dep/arch/powerpc/mm/mem.c linux-2.6.15-rc4.work/arch/powerpc/mm/mem.c --- linux-2.6.15-rc4.dep/arch/powerpc/mm/mem.c 2005-12-01 06:25:15.000000000 +0000 +++ linux-2.6.15-rc4.work/arch/powerpc/mm/mem.c 2005-12-02 00:11:19.000000000 +0000 @@ -121,11 +121,15 @@ void online_page(struct page *page) */ int __devinit add_memory(u64 start, u64 size) { - struct pglist_data *pgdata = NODE_DATA(0); + struct pglist_data *pgdata; struct zone *zone; + int nid; unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; + nid = hot_add_scn_to_nid(start); + pgdata = NODE_DATA(nid); + start += KERNELBASE; create_section_mapping(start, start + size); diff -Naupr linux-2.6.15-rc4.dep/arch/powerpc/mm/numa.c linux-2.6.15-rc4.work/arch/powerpc/mm/numa.c --- linux-2.6.15-rc4.dep/arch/powerpc/mm/numa.c 2005-12-01 19:46:21.000000000 +0000 +++ linux-2.6.15-rc4.work/arch/powerpc/mm/numa.c 2005-12-02 00:11:19.000000000 +0000 @@ -37,6 +37,7 @@ EXPORT_SYMBOL(node_data); static bootmem_data_t __initdata plat_node_bdata[MAX_NUMNODES]; static int min_common_depth; +static int n_mem_addr_cells, n_mem_size_cells; /* * We need somewhere to store start/end/node for each region until we have @@ -267,7 +268,11 @@ static void __init get_n_mem_cells(int * of_node_put(memory); } +#ifdef CONFIG_MEMORY_HOTPLUG +static unsigned long read_n_cells(int n, unsigned int **buf) +#else static unsigned long __init read_n_cells(int n, unsigned int **buf) +#endif { unsigned long result = 0; @@ -374,7 +379,6 @@ static int __init parse_numa_properties( { struct device_node *cpu = NULL; struct device_node *memory = NULL; - int n_addr_cells, n_size_cells; int max_domain; unsigned long i; @@ -413,7 +417,7 @@ static int __init parse_numa_properties( } } - get_n_mem_cells(&n_addr_cells, &n_size_cells); + 
get_n_mem_cells(&n_mem_addr_cells, &n_mem_size_cells); memory = NULL; while ((memory = of_find_node_by_type(memory, "memory")) != NULL) { unsigned long start; @@ -430,8 +434,8 @@ static int __init parse_numa_properties( ranges = memory->n_addrs; new_range: /* these are order-sensitive, and modify the buffer pointer */ - start = read_n_cells(n_addr_cells, &memcell_buf); - size = read_n_cells(n_size_cells, &memcell_buf); + start = read_n_cells(n_mem_addr_cells, &memcell_buf); + size = read_n_cells(n_mem_size_cells, &memcell_buf); numa_domain = of_node_numa_domain(memory); @@ -717,3 +721,50 @@ static int __init early_numa(char *p) return 0; } early_param("numa", early_numa); + +#ifdef CONFIG_MEMORY_HOTPLUG +/* + * Find the node associated with a hot added memory section. Section + * corresponds to a SPARSEMEM section, not an LMB. It is assumed that + * sections are fully contained within a single LMB. + */ +int hot_add_scn_to_nid(unsigned long scn_addr) +{ + struct device_node *memory = NULL; + + if (!numa_enabled || (min_common_depth < 0)) + return 0; + + while ((memory = of_find_node_by_type(memory, "memory")) != NULL) { + unsigned long start, size; + int numa_domain, ranges; + unsigned int *memcell_buf; + unsigned int len; + + memcell_buf = (unsigned int *)get_property(memory, "reg", &len); + if (!memcell_buf || len <= 0) + continue; + + ranges = memory->n_addrs; /* ranges in cell */ +ha_new_range: + start = read_n_cells(n_mem_addr_cells, &memcell_buf); + size = read_n_cells(n_mem_size_cells, &memcell_buf); + numa_domain = of_node_numa_domain(memory); + + /* Domains not present at boot default to 0 */ + if (!node_online(numa_domain)) + numa_domain = 0; + + if ((scn_addr >= start) && (scn_addr < (start + size))) { + of_node_put(memory); + return numa_domain; + } + + if (--ranges) /* process all ranges in cell */ + goto ha_new_range; + } + + BUG(); /* section address should be found above */ + return 0; +} +#endif /* CONFIG_MEMORY_HOTPLUG */ diff -Naupr 
linux-2.6.15-rc4.dep/include/asm-powerpc/sparsemem.h linux-2.6.15-rc4.work/include/asm-powerpc/sparsemem.h --- linux-2.6.15-rc4.dep/include/asm-powerpc/sparsemem.h 2005-12-01 06:25:15.000000000 +0000 +++ linux-2.6.15-rc4.work/include/asm-powerpc/sparsemem.h 2005-12-01 19:57:03.000000000 +0000 @@ -13,6 +13,11 @@ #ifdef CONFIG_MEMORY_HOTPLUG extern void create_section_mapping(unsigned long start, unsigned long end); +#ifdef CONFIG_NUMA +extern int hot_add_scn_to_nid(unsigned long scn_addr); +#else +#define hot_add_scn_to_nid(scn_addr) (0) +#endif #endif /* CONFIG_MEMORY_HOTPLUG */ #endif /* CONFIG_SPARSEMEM */ From linas at austin.ibm.com Fri Dec 2 13:22:53 2005 From: linas at austin.ibm.com (linas) Date: Thu, 1 Dec 2005 20:22:53 -0600 Subject: [PATCH] powerpc/pseries: dlpar-add crash on null pointer deref In-Reply-To: <20051202012208.GB9576@monkey.ibm.com> References: <20051202004232.GN31651@austin.ibm.com> <20051202012208.GB9576@monkey.ibm.com> Message-ID: <20051202022253.GQ31651@austin.ibm.com> On Thu, Dec 01, 2005 at 05:22:08PM -0800, Mike Kravetz was heard to remark: > On Thu, Dec 01, 2005 at 06:42:32PM -0600, linas wrote: > > > > This patch fixs a crash on null-pointer deref during dlpar slot addition. > > Just curious is this specific to adapters? I experienced a crash when > trying to add CPUs. Haven't debugged it yet. But, was able to > successfully add memory. This should only affect PCI devices. --linas From ntl at pobox.com Fri Dec 2 14:02:30 2005 From: ntl at pobox.com (Nathan Lynch) Date: Thu, 1 Dec 2005 22:02:30 -0500 Subject: [PATCH] numa placement for dynamically added memory In-Reply-To: <20051202012839.GA14241@monkey.ibm.com> References: <20051202012839.GA14241@monkey.ibm.com> Message-ID: <20051202030229.GA7836@localhost.localdomain> Hi Mike- Mike Kravetz wrote: > This patch places dynamically added memory within the appropriate > numa node. A new routine hot_add_scn_to_nid() replicates most of > the memory scanning code in parse_numa_properties(). 
>
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +static unsigned long read_n_cells(int n, unsigned int **buf)
> +#else
>  static unsigned long __init read_n_cells(int n, unsigned int **buf)
> +#endif

Any reason not to use __devinit here? Or maybe look into devising a
macro like __cpuinit for memory hotplug.

> +#ifdef CONFIG_MEMORY_HOTPLUG
> +/*
> + * Find the node associated with a hot added memory section. Section
> + * corresponds to a SPARSEMEM section, not an LMB. It is assumed that
> + * sections are fully contained within a single LMB.
> + */
> +int hot_add_scn_to_nid(unsigned long scn_addr)
> +{
> +	struct device_node *memory = NULL;
> +
> +	if (!numa_enabled || (min_common_depth < 0))
> +		return 0;
> +
> +	while ((memory = of_find_node_by_type(memory, "memory")) != NULL) {
> +		unsigned long start, size;
> +		int numa_domain, ranges;
> +		unsigned int *memcell_buf;
> +		unsigned int len;
> +
> +		memcell_buf = (unsigned int *)get_property(memory, "reg", &len);
> +		if (!memcell_buf || len <= 0)
> +			continue;
> +
> +		ranges = memory->n_addrs;	/* ranges in cell */
> +ha_new_range:
> +		start = read_n_cells(n_mem_addr_cells, &memcell_buf);
> +		size = read_n_cells(n_mem_size_cells, &memcell_buf);
> +		numa_domain = of_node_numa_domain(memory);
> +
> +		/* Domains not present at boot default to 0 */
> +		if (!node_online(numa_domain))
> +			numa_domain = 0;

Nope, 0 is not always a valid node on pSeries lpar. I suggest using
any_online_node(), or revisiting the idea of logical<->physical
mapping of node/domain ids. I tried the latter a few months ago but
I've been working on other stuff lately and haven't been able to
revisit it.

> +#ifdef CONFIG_NUMA
> +extern int hot_add_scn_to_nid(unsigned long scn_addr);
> +#else
> +#define hot_add_scn_to_nid(scn_addr) (0)
> +#endif

Make hot_add_scn_to_nid a static inline in the !CONFIG_NUMA case,
please.
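
For readers following the thread, here is a minimal sketch of what a
static inline stub for the !CONFIG_NUMA case could look like. It is an
illustration only, not the code that was eventually merged: the
prototype is copied from the quoted patch, and CONFIG_NUMA is simply
left undefined here so that the stub is the variant that gets compiled.

```c
/*
 * Sketch only. In the kernel CONFIG_NUMA comes from Kconfig; in this
 * standalone fragment it is left undefined so the stub is selected.
 */
#ifdef CONFIG_NUMA
extern int hot_add_scn_to_nid(unsigned long scn_addr);
#else
/*
 * Without NUMA there is only one node, so every section maps to node 0.
 * Unlike the macro version, the compiler still type-checks the argument
 * and evaluates it exactly once.
 */
static inline int hot_add_scn_to_nid(unsigned long scn_addr)
{
	return 0;
}
#endif
```

The usual argument for the inline form is that a caller passing the
wrong type gets the same diagnostic in both configurations, instead of
building cleanly only when CONFIG_NUMA is off.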
Nathan

From paulus at samba.org Fri Dec 2 15:57:05 2005
From: paulus at samba.org (Paul Mackerras)
Date: Fri, 2 Dec 2005 15:57:05 +1100
Subject: please pull powerpc-merge.git
Message-ID: <17295.54305.921349.174302@cargo.ozlabs.ibm.com>

Linus,

Please pull

git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge.git

There is a fix for a bug in making IOMMU entries on partitioned
pSeries systems when 64k pages are used, and a correction for the help
text of a config option.

Thanks,
Paul.

Michal Ostrowski:
      powerpc/pseries: Fix TCE building with 64k pagesize

Olaf Hering:
      powerpc: correct the NR_CPUS description text

 arch/powerpc/Kconfig                   |    2 +-
 arch/powerpc/platforms/pseries/iommu.c |    9 ++++++---
 2 files changed, 7 insertions(+), 4 deletions(-)

From olof at lixom.net Fri Dec 2 16:09:03 2005
From: olof at lixom.net (Olof Johansson)
Date: Thu, 1 Dec 2005 23:09:03 -0600
Subject: please pull powerpc-merge.git
In-Reply-To: <17295.54305.921349.174302@cargo.ozlabs.ibm.com>
References: <17295.54305.921349.174302@cargo.ozlabs.ibm.com>
Message-ID: <20051202050903.GC13870@pb15.lixom.net>

On Fri, Dec 02, 2005 at 03:57:05PM +1100, Paul Mackerras wrote:

> Michal Ostrowski:
>       powerpc/pseries: Fix TCE building with 64k pagesize

Did I miss this one when it went by on the list, or was it never posted
there?

That's not a good way to do it -- tce_build_pSeriesLP will be called
for 1 64K page, but it will actually insert 16 4K pages. It's definitely
a case for buildmulti.

I suggest the following instead.


Thanks,

Olof

----

Fix adjustment of TCE_PAGE_FACTOR in fallbacks to tce_build_pSeriesLP.
Signed-off-by: Olof Johansson Index: 2.6/arch/powerpc/platforms/pseries/iommu.c =================================================================== --- 2.6.orig/arch/powerpc/platforms/pseries/iommu.c 2005-11-29 09:11:47.000000000 -0600 +++ 2.6/arch/powerpc/platforms/pseries/iommu.c 2005-12-01 23:06:36.000000000 -0600 @@ -147,7 +147,8 @@ static void tce_buildmulti_pSeriesLP(str npages <<= TCE_PAGE_FACTOR; if (npages == 1) - return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr, + return tce_build_pSeriesLP(tbl, tcenum >> TCE_PAGE_FACTOR, + npages >> TCE_PAGE_FACTOR, uaddr, direction); tcep = __get_cpu_var(tce_page); @@ -159,7 +160,8 @@ static void tce_buildmulti_pSeriesLP(str tcep = (void *)__get_free_page(GFP_ATOMIC); /* If allocation fails, fall back to the loop implementation */ if (!tcep) - return tce_build_pSeriesLP(tbl, tcenum, npages, + return tce_build_pSeriesLP(tbl, tcenum >> TCE_PAGE_FACTOR, + npages >> TCE_PAGE_FACTOR, uaddr, direction); __get_cpu_var(tce_page) = tcep; } From olof at lixom.net Fri Dec 2 16:13:55 2005 From: olof at lixom.net (Olof Johansson) Date: Thu, 1 Dec 2005 23:13:55 -0600 Subject: please pull powerpc-merge.git In-Reply-To: <20051202050903.GC13870@pb15.lixom.net> References: <17295.54305.921349.174302@cargo.ozlabs.ibm.com> <20051202050903.GC13870@pb15.lixom.net> Message-ID: <20051202051355.GD13870@pb15.lixom.net> On Thu, Dec 01, 2005 at 11:09:03PM -0600, olof wrote: > That's not a good way to do it -- tce_build_pSeriesLP will be called > for 1 64K page, but it will actually insert 16 4K pages. It's definately > a case for buildmulti. > > I suggest the following instead. ..and I forgot to include the first fix of regular build_pSeriesLP. Crap. New patch: --- Fix adjustment of TCE_PAGE_FACTOR in tce_build_pSeriesLP and fallbacks to it. 
Signed-off-by: Olof Johansson Index: 2.6/arch/powerpc/platforms/pseries/iommu.c =================================================================== --- 2.6.orig/arch/powerpc/platforms/pseries/iommu.c 2005-11-29 09:11:47.000000000 -0600 +++ 2.6/arch/powerpc/platforms/pseries/iommu.c 2005-12-01 23:12:57.000000000 -0600 @@ -109,6 +109,9 @@ static void tce_build_pSeriesLP(struct i u64 rc; union tce_entry tce; + tcenum <<= TCE_PAGE_FACTOR; + npages <<= TCE_PAGE_FACTOR; + tce.te_word = 0; tce.te_rpn = (virt_to_abs(uaddr)) >> TCE_SHIFT; tce.te_rdwr = 1; @@ -147,7 +150,8 @@ static void tce_buildmulti_pSeriesLP(str npages <<= TCE_PAGE_FACTOR; if (npages == 1) - return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr, + return tce_build_pSeriesLP(tbl, tcenum >> TCE_PAGE_FACTOR, + npages >>TCE_PAGE_FACTOR, uaddr, direction); tcep = __get_cpu_var(tce_page); @@ -159,7 +163,8 @@ static void tce_buildmulti_pSeriesLP(str tcep = (void *)__get_free_page(GFP_ATOMIC); /* If allocation fails, fall back to the loop implementation */ if (!tcep) - return tce_build_pSeriesLP(tbl, tcenum, npages, + return tce_build_pSeriesLP(tbl, tcenum >> TCE_PAGE_FACTOR, + npages >> TCE_PAGE_FACTOR, uaddr, direction); __get_cpu_var(tce_page) = tcep; } From paulus at samba.org Fri Dec 2 16:39:58 2005 From: paulus at samba.org (Paul Mackerras) Date: Fri, 2 Dec 2005 16:39:58 +1100 Subject: please pull powerpc-merge.git In-Reply-To: <20051202050903.GC13870@pb15.lixom.net> References: <17295.54305.921349.174302@cargo.ozlabs.ibm.com> <20051202050903.GC13870@pb15.lixom.net> Message-ID: <17295.56878.842081.525602@cargo.ozlabs.ibm.com> Olof Johansson writes: > On Fri, Dec 02, 2005 at 03:57:05PM +1100, Paul Mackerras wrote: > > > Michal Ostrowski: > > powerpc/pseries: Fix TCE building with 64k pagesize > > Did I miss this one when it went by on the list, or was it never posted > there? Michal sent it just to me, for some reason. I convinced myself that it did actually fix a bug, so I sent it on. 
Next time maybe Michal can cc linuxppc64-dev.

> That's not a good way to do it -- tce_build_pSeriesLP will be called
> for 1 64K page, but it will actually insert 16 4K pages. It's definitely
> a case for buildmulti.
>
> I suggest the following instead.

Or better still, we could do:

	if (TCE_PAGE_FACTOR == 0 && npages == 1)
		return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
					   direction);

which will let the whole tce_build_pSeriesLP call get optimized out
when we have 64k pages selected.

Paul.

From olof at lixom.net Fri Dec 2 16:57:21 2005
From: olof at lixom.net (Olof Johansson)
Date: Thu, 1 Dec 2005 23:57:21 -0600
Subject: please pull powerpc-merge.git
In-Reply-To: <17295.56878.842081.525602@cargo.ozlabs.ibm.com>
References: <17295.54305.921349.174302@cargo.ozlabs.ibm.com> <20051202050903.GC13870@pb15.lixom.net> <17295.56878.842081.525602@cargo.ozlabs.ibm.com>
Message-ID: <20051202055721.GG13870@pb15.lixom.net>

On Fri, Dec 02, 2005 at 04:39:58PM +1100, Paul Mackerras wrote:

> > I suggest the following instead.
>
> Or better still, we could do:
>
> 	if (TCE_PAGE_FACTOR == 0 && npages == 1)
> 		return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
> 					   direction);
>
> which will let the whole tce_build_pSeriesLP call get optimized out
> when we have 64k pages selected.

Yep, that's even better. Yet another twist is to do:

	if ((npages << TCE_PAGE_FACTOR) == 1)

Same result, maybe a little easier to read. Patch below if it's to your
taste; if not, go with what you have. :)


-Olof

---

Fix adjustment of TCE_PAGE_FACTOR in fallbacks to tce_build_pSeriesLP.
Signed-off-by: Olof Johansson

Index: 2.6/arch/powerpc/platforms/pseries/iommu.c
===================================================================
--- 2.6.orig/arch/powerpc/platforms/pseries/iommu.c	2005-11-29 09:11:47.000000000 -0600
+++ 2.6/arch/powerpc/platforms/pseries/iommu.c	2005-12-01 23:53:04.000000000 -0600
@@ -109,6 +109,9 @@ static void tce_build_pSeriesLP(struct i
 	u64 rc;
 	union tce_entry tce;
 
+	tcenum <<= TCE_PAGE_FACTOR;
+	npages <<= TCE_PAGE_FACTOR;
+
 	tce.te_word = 0;
 	tce.te_rpn = (virt_to_abs(uaddr)) >> TCE_SHIFT;
 	tce.te_rdwr = 1;
@@ -143,10 +146,8 @@ static void tce_buildmulti_pSeriesLP(str
 	union tce_entry tce, *tcep;
 	long l, limit;
 
-	tcenum <<= TCE_PAGE_FACTOR;
-	npages <<= TCE_PAGE_FACTOR;
-
-	if (npages == 1)
+	/* For performance reasons, only fall back for single TCE insert */
+	if ((npages << TCE_PAGE_FACTOR) == 1)
 		return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
 					   direction);
 
@@ -164,6 +165,9 @@ static void tce_buildmulti_pSeriesLP(str
 		__get_cpu_var(tce_page) = tcep;
 	}
 
+	tcenum <<= TCE_PAGE_FACTOR;
+	npages <<= TCE_PAGE_FACTOR;
+
 	tce.te_word = 0;
 	tce.te_rpn = (virt_to_abs(uaddr)) >> TCE_SHIFT;
 	tce.te_rdwr = 1;

From olof at lixom.net Fri Dec 2 16:59:36 2005
From: olof at lixom.net (Olof Johansson)
Date: Thu, 1 Dec 2005 23:59:36 -0600
Subject: please pull powerpc-merge.git
In-Reply-To: <20051202055721.GG13870@pb15.lixom.net>
References: <17295.54305.921349.174302@cargo.ozlabs.ibm.com> <20051202050903.GC13870@pb15.lixom.net> <17295.56878.842081.525602@cargo.ozlabs.ibm.com> <20051202055721.GG13870@pb15.lixom.net>
Message-ID: <20051202055935.GH13870@pb15.lixom.net>

On Thu, Dec 01, 2005 at 11:57:21PM -0600, olof wrote:

> Yep, that's even better. Yet another twist is to do:
>
> 	if ((npages << TCE_PAGE_FACTOR) == 1)

Never mind, my verification of the above was bad: I tried it with a
constant instead of a variable. GCC isn't smart enough to optimize that
away for variable statements. Go with your solution.
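
For readers following the thread, the two conditions under discussion
can be put side by side in a small standalone sketch. This is an
illustration under stated assumptions, not kernel code: the helper
names are hypothetical, and TCE_PAGE_FACTOR is hard-coded to 4 on the
assumption that a 64K page covers 16 4K TCEs, as described earlier in
the thread.

```c
/*
 * Hypothetical stand-in for the kernel constant: 4 when one 64K page
 * spans 16 4K TCE entries, 0 when page size and TCE size match.
 */
#ifndef TCE_PAGE_FACTOR
#define TCE_PAGE_FACTOR 4
#endif

/*
 * Constant-first form: the test on the compile-time constant comes
 * first, so with a nonzero factor the whole condition folds to false
 * and the guarded fallback call can be discarded by the compiler.
 */
static int single_tce_const_first(long npages)
{
	return TCE_PAGE_FACTOR == 0 && npages == 1;
}

/*
 * Shifted form: same truth value for non-negative npages, but folding
 * it away requires the compiler to prove that a shifted variable can
 * never equal 1, which the thread reports GCC did not do.
 */
static int single_tce_shifted(long npages)
{
	return (npages << TCE_PAGE_FACTOR) == 1;
}
```

Both helpers return the same answer for any non-negative page count;
the difference is only in how easily the compiler can see that the
fallback branch is dead when the factor is nonzero.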
-Olof From mostrows at watson.ibm.com Fri Dec 2 23:54:07 2005 From: mostrows at watson.ibm.com (Michal Ostrowski) Date: Fri, 02 Dec 2005 07:54:07 -0500 Subject: please pull powerpc-merge.git In-Reply-To: <17295.56878.842081.525602@cargo.ozlabs.ibm.com> References: <17295.54305.921349.174302@cargo.ozlabs.ibm.com> <20051202050903.GC13870@pb15.lixom.net> <17295.56878.842081.525602@cargo.ozlabs.ibm.com> Message-ID: <1133528047.8137.80.camel@brick.watson.ibm.com> On Fri, 2005-12-02 at 16:39 +1100, Paul Mackerras wrote: > Olof Johansson writes: > > > On Fri, Dec 02, 2005 at 03:57:05PM +1100, Paul Mackerras wrote: > > > > > Michal Ostrowski: > > > powerpc/pseries: Fix TCE building with 64k pagesize > > > > Did I miss this one when it went by on the list, or was it never posted > > there? > > Michal sent it just to me, for some reason. I convinced myself that > it did actually fix a bug, so I sent it on. Next time maybe Michal > can cc linuxppc64-dev. > Yes... bit of an oops on my part. My original patch fixed a real bug I saw with tce_build_pSeriesLP being called directly, not from tce_buildmulti_pSeriesLP. This was due to the fact that firmware_has_feature(FW_FEATURE_MULTITCE) == 0 (see iommu_init_early_pSeries). -- Michal Ostrowski From johnrose at austin.ibm.com Sat Dec 3 03:46:28 2005 From: johnrose at austin.ibm.com (John Rose) Date: Fri, 02 Dec 2005 10:46:28 -0600 Subject: [PATCH 1/2] PCI Hotplug/powerpc: remove duplicated code In-Reply-To: <20051202005614.GO31651@austin.ibm.com> References: <20051202005614.GO31651@austin.ibm.com> Message-ID: <1133541988.9364.8.camel@sinatra.austin.ibm.com> The RPAPHP code contains a routine that duplicates some existing code. This patch removes the rpaphp version of the code. 
Signed-off-by: Linas Vepstas Acked-by: John Rose From msdemlei at cl.uni-heidelberg.de Sat Dec 3 02:30:50 2005 From: msdemlei at cl.uni-heidelberg.de (Markus Demleitner) Date: Fri, 2 Dec 2005 16:30:50 +0100 Subject: Windfarm/modules trouble Message-ID: <20051202153050.GA1368@victor.cl.uni-heidelberg.de> Hi, I've been trying out the windfarm system in 2.6.15-rc3 on a iMac G5 today. I like the general architecture a lot, but of course with abstraction comes a somewhat steep learning curve, in particular if you (like me) aren't really a kernel guy. So, sorry for not sending useful patches. First off, compiling the stuff statically works, but the fans are more active than they are in OS X or with my hacked simpleTemp. I wanted to find out why, and to save me some rebooting, I tried to compile windfarm as modules. Minor trouble: windfarm_pid.c is missing MODULE_AUTHOR("Benjamin Herrenschmidt "); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("PID algorithm for thermal control"); at its end, so you get "kernel tainted" messages. What is worse, the control loop doesn't run after you say modprobe windfarm_pm81. To get some idea why, I sprinkled windfarm_core with DBG-statements. wf_notify says "No notifiers" if the notifier chain is empty and otherwise prints the list head and next. Here's what happens: Dec 2 15:59:10 miller kernel: windfarm: Initializing for iMacG5 model ID 5 Dec 2 15:59:10 miller kernel: Windfarm compatible, init: 0 Dec 2 15:59:10 miller kernel: wf: Registered control system-fan Dec 2 15:59:10 miller kernel: No notifiers! 
Dec 2 15:59:10 miller kernel: wf: Registered control cpu-fan Dec 2 15:59:10 miller kernel: wf: Registered sensor cpu-temp Dec 2 15:59:10 miller kernel: wf: Registered sensor cpu-current Dec 2 15:59:10 miller kernel: wf: Registered sensor cpu-voltage Dec 2 15:59:10 miller kernel: wf: Registered sensor cpu-power Dec 2 15:59:10 miller kernel: wf: Registered sensor hd-temp (up to here, No notifiers continued, but I've clipped it) Dec 2 15:59:10 miller kernel: Driver register. Dec 2 15:59:10 miller kernel: wf... PROBE....register called... (these are my DBGs from windfarm_pm81:wf_smu_probe and wf_register_client, so that one works, so there's now one function in the notifier chain, wf_smu_notify:) Dec 2 15:59:10 miller kernel: Current chain head: 103638 Dec 2 15:59:10 miller kernel: Current chain next: 0 (these continue throughout, but I've clipped them again) Dec 2 15:59:10 miller kernel: wf: new control cpu-fan detected Dec 2 15:59:10 miller kernel: wf: new control system-fan detected Dec 2 15:59:10 miller kernel: wf: new sensor hd-temp detected Dec 2 15:59:10 miller kernel: wf: new sensor cpu-power detected Dec 2 15:59:10 miller kernel: wf: new sensor cpu-voltage detected Dec 2 15:59:10 miller kernel: wf: new sensor cpu-current detected Dec 2 15:59:10 miller kernel: wf: new sensor cpu-temp detected Dec 2 15:59:10 miller kernel: wf: thread started Dec 2 15:59:10 miller kernel: wf: notify called (this guy comes from wf_thread_func, after time_after_eq) Dec 2 15:59:10 miller kernel: register failed... (and this one now from wf_register_client, after the bail: label. I guess that's where the trouble starts, but I have no idea why this fails) Well, that's it, afterwards one sees the the thread running and call wf_smu_probe, but no pid. Hints, anyone? 
Other issues (I haven't really looked into any of them yet): (1) You cannot unload the windfarm_core once it's loaded because there still remain references into windfarm_smu_sensors: miller$ sudo modprobe windfarm_pm81 miller$ lsmod Module Size Used by windfarm_lm75_sensor 8872 1 windfarm_smu_sensors 10864 4 windfarm_smu_controls 8608 2 windfarm_pm81 18216 0 windfarm_core 20824 4 windfarm_lm75_sensor,windfarm_smu_sensors,windfarm_smu_controls,windfarm_pm81 windfarm_pid 4984 1 windfarm_pm81 [crap clipped] miller$ sudo rmmod windfarm_pm81 windfarm_pid windfarm_smu_controls windfarm_lm75_sensor miller$ lsmod Module Size Used by windfarm_smu_sensors 10864 2 windfarm_core 20824 1 windfarm_smu_sensors Of course, kwindfarm still runs. (2) After that, modprobing windfarm_pm81 again results in an Oops: Unable to handle kernel paging request for data at address 0x17f03280302b8b91 Faulting instruction address: 0xc00000000019b098 Oops: Kernel access of bad area, sig: 11 [#1] PREEMPT SMP NR_CPUS=2 POWERMAC Modules linked in: windfarm_lm75_sensor windfarm_smu_controls windfarm_pm81 windfarm_pid windfarm_smu_sensors windfarm_core cpufreq_powersave cpufreq_conservative cpufreq_ondemand usb_storage NIP: C00000000019B098 LR: C00000000032FF90 CTR: C00000000028E250 REGS: c00000001b233680 TRAP: 0300 Not tainted (2.6.15-rc3) MSR: 9000000000009032 CR: 24002488 XER: 20000000 DAR: 17F03280302B8B91, DSISR: 0000000040000000 TASK = c00000001b19e040[1551] 'modprobe' THREAD: c00000001b230000 CPU: 0 GPR00: C00000000032FFCC C00000001B233900 C0000000004BD4C0 17F03280302B8B91 GPR04: C0000000004B53B8 C000000000EE0568 FFFFFFFFFFFFFFED C000000000408D38 GPR08: C0000000004F1C00 C000000000430DA8 0000000000000000 0000000000000000 GPR12: 0000000024002442 C0000000003F7C00 00000000100170B8 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 00000000100013A4 000000001001DF18 000000001001DC98 GPR24: 0000000000000000 0000000000000000 
0000000000000000 0000000000000000
GPR28: D00000000010ACE0 17F03280302B8B79 C00000001B233A80 17F03280302B8B81
NIP [C00000000019B098] .kref_get+0x0/0x24
Call Trace:
[C00000001B233990] [C00000000022541C] .next_device+0x10/0x38
[C00000001B233A10] [C0000000002254CC] .bus_for_each_dev+0x88/0xcc
[C00000001B233AC0] [C00000000022666C] .driver_attach+0x28/0x40
[C00000001B233B40] [C000000000225C54] .bus_add_driver+0xc8/0x1dc
[C00000001B233BF0] [C000000000226D0C] .driver_register+0x58/0x74
[C00000001B233C80] [C00000000028E9C8] .i2c_add_driver+0x78/0x188
[C00000001B233D10] [D000000000109588] .wf_lm75_sensor_init+0x1c/0x40 [windfarm_lm75_sensor]
[C00000001B233D90] [C0000000000667AC] .sys_init_module+0x2a0/0x4f8
[C00000001B233E30] [C000000000008600] syscall_exit+0x0/0x18
Instruction dump:
7d635b78 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020 7c0c0378
4bffff88 38000001 90030000 4e800020 <80030000> 21200000 7c090114 0b000000
 <6>note: modprobe[1551] exited with preempt_count 1

Cheers,

Markus

From brking at us.ibm.com Sat Dec 3 04:21:19 2005
From: brking at us.ibm.com (Brian King)
Date: Fri, 02 Dec 2005 11:21:19 -0600
Subject: p615 boot hang with current GIT kernel
Message-ID: <4390828F.6030705@us.ibm.com>

I'm having trouble booting the current GIT tree on a p615. Not sure if
it is a .config problem or a real bug...

There are a few boot messages that look a bit strange, and I'm not sure
whether I should worry about them:

Failed to request PCI IO region on PCI domain 0000
EEH: event on unsupported device, rc=0 dn=/pci@400000000110/IBM,sp@1

The system then hangs when trying to talk with the CDROM. I tried
removing the CDROM, and was then able to get to ipr loading, but it
isn't getting any PCI interrupts either... Not sure if this is a PCI
interrupt routing issue or not, but that is my hunch at this point.
I also tried disabling distributing interrupts to all CPUs, but that
didn't help either.

Attached is my boot log and .config.
-- Brian King eServer Storage I/O IBM Linux Technology Center -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: boot_hang_short Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051202/e12396e7/attachment.txt -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: boot-hang.config Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051202/e12396e7/attachment-0001.txt From kravetz at us.ibm.com Sat Dec 3 05:43:33 2005 From: kravetz at us.ibm.com (Mike Kravetz) Date: Fri, 2 Dec 2005 10:43:33 -0800 Subject: [PATCH] numa placement for dynamically added memory In-Reply-To: <20051202030229.GA7836@localhost.localdomain> References: <20051202012839.GA14241@monkey.ibm.com> <20051202030229.GA7836@localhost.localdomain> Message-ID: <20051202184333.GB6927@w-mikek2.ibm.com> On Thu, Dec 01, 2005 at 10:02:30PM -0500, Nathan Lynch wrote: > > +#ifdef CONFIG_MEMORY_HOTPLUG > > +static unsigned long read_n_cells(int n, unsigned int **buf) > > +#else > > static unsigned long __init read_n_cells(int n, unsigned int **buf) > > +#endif > > Any reason not to use __devinit here? Or maybe look into devising a > macro like __cpuinit for memory hotplug. Nope that makes sense. Thanks. > > + /* Domains not present at boot default to 0 */ > > + if (!node_online(numa_domain)) > > + numa_domain = 0; > > Nope, 0 is not always a valid node on pSeries lpar. I suggest using > any_online_node(), or revisiting the idea of logical<->physical > mapping of node/domain ids. I tried the latter a few months ago but > I've been working on other stuff lately and haven't been able to > revisit it. Yeah, I can do that. As a side note, it looks like 0 will always be a valid node in the current code. If we successfully execute parse_numa_properties(), then this code will be run. 
for (i = 0; i <= max_domain; i++) node_set_online(i); If we execute setup_nonnuma() instead, then the following is executed: node_set_online(0); I've previously wondered about the above code in parse_numa_properties(). You seem to confirm that is not the desired behavior. Should this be changed? > > +#ifdef CONFIG_NUMA > > +extern int hot_add_scn_to_nid(unsigned long scn_addr); > > +#else > > +#define hot_add_scn_to_nid(scn_addr) (0) > > +#endif > > Make hot_add_scn_to_nid a static inline in the !CONFIG_NUMA case, > please. OK -- Mike From zarniwhoop at ntlworld.com Sat Dec 3 05:49:37 2005 From: zarniwhoop at ntlworld.com (Ken Moffat) Date: Fri, 2 Dec 2005 18:49:37 +0000 (GMT) Subject: atkbd keys missing Message-ID: Hi, my powermac G5 SMU now runs nicely with 2.6.15-rc4 (ignoring sound, of course) except for one awkward problem - I'm using a PC PS/2 keyboard (108 keys, if I can count) through a kvm switch, then a PS2-to-usb adaptor. This is a British keyboard, and two of the keys don't work (nothing at all shows from them, not even scancodes). One of these keys is the 'wake' key (code 143 on my other boxes) which I don't miss, but the other is the '\' key (or '|' when shifted) which I find somewhat important. On british keyboards we fit this between the left shift and z keys, and it shows up as code 86 on my other boxes. I guess not many people use these keyboards on macs (the standard keyboard is almost american), so I'm not surprised this is broken. I'm at a loss to know where to look to try to fix this - there are no logged messages from atkbd about unknown keys (although atkbd is built in to my config), and the keycode tables in drivers/char/keyboard.c and drivers/input/keyboard/atkbd.c seem to be constant for almost all architectures. Is there a powerpc(64) keyboard driver that I'm overlooking, or has anybody any pointers to where I should be looking, please ? 
Ken -- das eine Mal als Tragödie, das andere Mal als Farce From ntl at pobox.com Sat Dec 3 06:20:54 2005 From: ntl at pobox.com (Nathan Lynch) Date: Fri, 2 Dec 2005 14:20:54 -0500 Subject: [PATCH] numa placement for dynamically added memory In-Reply-To: <20051202184333.GB6927@w-mikek2.ibm.com> References: <20051202012839.GA14241@monkey.ibm.com> <20051202030229.GA7836@localhost.localdomain> <20051202184333.GB6927@w-mikek2.ibm.com> Message-ID: <20051202192054.GB7836@localhost.localdomain> Mike Kravetz wrote: > On Thu, Dec 01, 2005 at 10:02:30PM -0500, Nathan Lynch wrote: > > > + /* Domains not present at boot default to 0 */ > > > + if (!node_online(numa_domain)) > > > + numa_domain = 0; > > > > Nope, 0 is not always a valid node on pSeries lpar. I suggest using > > any_online_node(), or revisiting the idea of logical<->physical > > mapping of node/domain ids. I tried the latter a few months ago but > > I've been working on other stuff lately and haven't been able to > > revisit it. > > Yeah, I can do that. As a side note, it looks like 0 will always be a > valid node in the current code. If we successfully execute > parse_numa_properties(), then this code will be run. > > for (i = 0; i <= max_domain; i++) > node_set_online(i); Yes, the code erroneously assumes that we can just mark nodes 0 through max_domain - 1 online. Explained below. > If we execute setup_nonnuma() instead, then the following is executed: > > node_set_online(0); > > I've previously wondered about the above code in parse_numa_properties(). > You seem to confirm that is not the desired behavior. Should this be > changed? I think so. The fundamental issue is that the numa code does not distinguish between logical node numbers and the identifiers given by the platform in the ibm,associativity properties to denote "affinity domains". This is ok for cases such as larger Power4 machines running without a hypervisor and LPARs on smaller Power5 machines (e.g. just 2 nodes).
But with larger Power5 systems, we're getting into trouble over this. We need to be able to handle situations where the domain numbering as given by the platform doesn't necessarily begin at zero and isn't necessarily continuous -- for example a partition with domains numbered 2, 7, and 9. So I think a logical to "physical" mapping makes sense, similar to what we do for cpus. Nathan From flar at allandria.com Sat Dec 3 06:43:02 2005 From: flar at allandria.com (Brad Boyer) Date: Fri, 2 Dec 2005 11:43:02 -0800 Subject: atkbd keys missing In-Reply-To: References: Message-ID: <20051202194302.GA9023@pants.nu> On Fri, Dec 02, 2005 at 06:49:37PM +0000, Ken Moffat wrote: > Hi, my powermac G5 SMU now runs nicely with 2.6.15-rc4 (ignoring sound, > of course) except for one awkward problem - I'm using a PC PS/2 keyboard > (108 keys, if I can count) through a kvm switch, then a PS2-to-usb > adaptor. This is a British keyboard, and two of the keys don't work > (nothing at all shows from them, not even scancodes). > > I'm at a loss to know where to look to try to fix this - there are no > logged messages from atkbd about unknown keys (although atkbd is built > in to my config), and the keycode tables in drivers/char/keyboard.c and > drivers/input/keyboard/atkbd.c seem to be constant for almost all > architectures. > > Is there a powerpc(64) keyboard driver that I'm overlooking, or has > anybody any pointers to where I should be looking, please ? Since you are running it through a USB adaptor, I would expect it to show up as a USB keyboard to the system. The atkbd driver is for a keyboard directly connected to an AT or PS/2 style port on the system, which actually acts more like a normal serial port. Take a look at usbhid (in drivers/usb/input) to see if that is where it's actually getting handled. You probably have messages in the logs from the USB detection code if that is the case. 
Here's some from one of my boxes: input: USB HID v1.00 Keyboard [Macally Macally iKey ] on usb-0001:02:0b.1-2.2.5.1 The name will probably be the model of the adapter, whereas I have a directly connected USB keyboard in this case. It is possible the adapter is eating the keypress, but it's hard to say without more investigation of the problem. Brad Boyer flar at allandria.com From johnrose at austin.ibm.com Sat Dec 3 07:11:41 2005 From: johnrose at austin.ibm.com (John Rose) Date: Fri, 02 Dec 2005 14:11:41 -0600 Subject: [PATCH 2/2] PCI Hotplug/powerpc: more removal of duplicated code In-Reply-To: <20051202005957.GP31651@austin.ibm.com> References: <20051202005614.GO31651@austin.ibm.com> <20051202005957.GP31651@austin.ibm.com> Message-ID: <1133554301.11039.11.camel@sinatra.austin.ibm.com> The RPAPHP code contains two routines that appear to be gratuitous copies of very similar pci code. In particular, rpaphp_claim_resource ~~ pci_claim_resource (there is a minor, non-functional difference) rpadlpar_claim_one_bus == pcibios_claim_one_bus (the code is identical) This patch removes the rpaphp versions of the code. Signed-off-by: Linas Vepstas Acked-by: John Rose From zarniwhoop at ntlworld.com Sat Dec 3 08:26:39 2005 From: zarniwhoop at ntlworld.com (Ken Moffat) Date: Fri, 2 Dec 2005 21:26:39 +0000 (GMT) Subject: atkbd keys missing In-Reply-To: <20051202194302.GA9023@pants.nu> References: <20051202194302.GA9023@pants.nu> Message-ID: On Fri, 2 Dec 2005, Brad Boyer wrote: >> Is there a powerpc(64) keyboard driver that I'm overlooking, or has >> anybody any pointers to where I should be looking, please ? > > Since you are running it through a USB adaptor, I would expect it to > show up as a USB keyboard to the system. The atkbd driver is for a > keyboard directly connected to an AT or PS/2 style port on the system, > which actually acts more like a normal serial port. Take a look at > usbhid (in drivers/usb/input) to see if that is where it's actually > getting handled. 
You probably have messages in the logs from the USB > detection code if that is the case. Here's some from one of my boxes: > > input: USB HID v1.00 Keyboard [Macally Macally iKey ] on usb-0001:02:0b.1-2.2.5.1 > Brad, thanks for clarifying the role of atkbd. I'll take a look at the usb code, and dig through my logs. Cheers. Ken -- das eine Mal als Tragödie, das andere Mal als Farce From haren at us.ibm.com Sat Dec 3 09:36:53 2005 From: haren at us.ibm.com (Haren Myneni) Date: Fri, 02 Dec 2005 14:36:53 -0800 Subject: compilation error for CONFIG_SMP=n Message-ID: <4390CC85.8030808@us.ibm.com> Getting undeclared symbol `H_SET_ASR' for CONFIG_SMP=n. Thanks Haren -------------- next part -------------- A non-text attachment was scrubbed... Name: UP_compile_error.patch Type: text/x-patch Size: 332 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051202/df342ccf/attachment.bin From benh at kernel.crashing.org Sat Dec 3 09:31:54 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 03 Dec 2005 09:31:54 +1100 Subject: Windfarm/modules trouble In-Reply-To: <20051202153050.GA1368@victor.cl.uni-heidelberg.de> References: <20051202153050.GA1368@victor.cl.uni-heidelberg.de> Message-ID: <1133562714.6100.78.camel@gaston> On Fri, 2005-12-02 at 16:30 +0100, Markus Demleitner wrote: > Hi, > > I've been trying out the windfarm system in 2.6.15-rc3 on a iMac G5 > today. I like the general architecture a lot, but of course with > abstraction comes a somewhat steep learning curve, in particular if > you (like me) aren't really a kernel guy. So, sorry for not sending > useful patches. > > First off, compiling the stuff statically works, but the fans are > more active than they are in OS X or with my hacked simpleTemp. I > wanted to find out why, and to save me some rebooting, I tried to > compile windfarm as modules. Yes, current windfarm has issues being in a module, plus some non-trivial problems with the module refcounting.
I'll look into it. From my experience, the fans are not faster than OS X if you also use something like powernowd to throttle down your CPU speed when idle... Ben. From linas at austin.ibm.com Sat Dec 3 11:55:24 2005 From: linas at austin.ibm.com (linas) Date: Fri, 2 Dec 2005 18:55:24 -0600 Subject: [PATCH] powerpc: export pcibios_fixup_new_pci_devices() Message-ID: <20051203005524.GV31651@austin.ibm.com> Hi Paul, Please apply. --linas There is code in the RPAPHP directory that is identical to this routine; I'll be removing that code in an upcoming patch, but this patch is needed to expose the function to make it callable. Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/pci_dlpar.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/arch/powerpc/platforms/pseries/pci_dlpar.c 2005-12-02 17:30:02.997471195 -0600 +++ linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/pci_dlpar.c 2005-12-02 17:31:37.444176026 -0600 @@ -77,7 +77,7 @@ } /* Must be called before pci_bus_add_devices */ -static void +void pcibios_fixup_new_pci_devices(struct pci_bus *bus, int fix_bus) { struct pci_dev *dev; Index: linux-2.6.15-rc3-mm1/include/asm-powerpc/pci-bridge.h =================================================================== --- linux-2.6.15-rc3-mm1.orig/include/asm-powerpc/pci-bridge.h 2005-12-01 15:17:23.000000000 -0600 +++ linux-2.6.15-rc3-mm1/include/asm-powerpc/pci-bridge.h 2005-12-02 17:34:37.386846527 -0600 @@ -137,6 +137,7 @@ /** Discover new pci devices under this bus, and add them */ void pcibios_add_pci_devices(struct pci_bus * bus); +void pcibios_fixup_new_pci_devices(struct pci_bus *bus, int fix_bus); extern int pcibios_remove_root_bus(struct pci_controller *phb); From linas at austin.ibm.com Sat Dec 3 11:59:52 2005 From: linas at austin.ibm.com (linas) Date: Fri, 2 Dec 2005 18:59:52 -0600 Subject: [PATCH] PCI Error Recovery: documentation Message-ID: 
<20051203005951.GW31651@austin.ibm.com> Greg, Please apply. --linas pci-error-recovery_docs.patch Various PCI bus errors can be signaled by newer PCI controllers. Recovering from those errors requires an infrastructure to notify affected device drivers of the error, and a way of walking through a reset sequence. This patch adds documentation describing the current error recovery proposal. Signed-off-by: Linas Vepstas Documentation/pci-error-recovery.txt | 246 +++++++++++++++++++++++++++++++++++ MAINTAINERS | 7 2 files changed, 253 insertions(+) Index: linux-2.6.14-git10/Documentation/pci-error-recovery.txt =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6.14-git10/Documentation/pci-error-recovery.txt 2005-11-07 17:33:26.920560069 -0600 @@ -0,0 +1,246 @@ + + PCI Error Recovery + ------------------ + May 31, 2005 + + Current document maintainer: + Linas Vepstas + + +Some PCI bus controllers are able to detect certain "hard" PCI errors +on the bus, such as parity errors on the data and address busses, as +well as SERR and PERR errors. These chipsets are then able to disable +I/O to/from the affected device, so that, for example, a bad DMA +address doesn't end up corrupting system memory. These same chipsets +are also able to reset the affected PCI device, and return it to +working condition. This document describes a generic API form +performing error recovery. + +The core idea is that after a PCI error has been detected, there must +be a way for the kernel to coordinate with all affected device drivers +so that the pci card can be made operational again, possibly after +performing a full electrical #RST of the PCI card. The API below +provides a generic API for device drivers to be notified of PCI +errors, and to be notified of, and respond to, a reset sequence. 
+ +Preliminary sketch of API, cut-n-pasted-n-modified email from +Ben Herrenschmidt, circa 5 april 2005 + +The error recovery API support is exposed to the driver in the form of +a structure of function pointers pointed to by a new field in struct +pci_driver. The absence of this pointer in pci_driver denotes an +"non-aware" driver, behaviour on these is platform dependant. +Platforms like ppc64 can try to simulate pci hotplug remove/add. + +The definition of "pci_error_token" is not covered here. It is based on +Seto's work on the synchronous error detection. We still need to define +functions for extracting infos out of an opaque error token. This is +separate from this API. + +This structure has the form: + +struct pci_error_handlers +{ + int (*error_detected)(struct pci_dev *dev, pci_error_token error); + int (*mmio_enabled)(struct pci_dev *dev); + int (*resume)(struct pci_dev *dev); + int (*link_reset)(struct pci_dev *dev); + int (*slot_reset)(struct pci_dev *dev); +}; + +A driver doesn't have to implement all of these callbacks. The +only mandatory one is error_detected(). If a callback is not +implemented, the corresponding feature is considered unsupported. +For example, if mmio_enabled() and resume() aren't there, then the +driver is assumed as not doing any direct recovery and requires +a reset. If link_reset() is not implemented, the card is assumed as +not caring about link resets, in which case, if recover is supported, +the core can try recover (but not slot_reset() unless it really did +reset the slot). If slot_reset() is not supported, link_reset() can +be called instead on a slot reset. + +At first, the call will always be : + + 1) error_detected() + + Error detected. This is sent once after an error has been detected. At +this point, the device might not be accessible anymore depending on the +platform (the slot will be isolated on ppc64). 
The driver may already +have "noticed" the error because of a failing IO, but this is the proper +"synchronisation point", that is, it gives a chance to the driver to +cleanup, waiting for pending stuff (timers, whatever, etc...) to +complete; it can take semaphores, schedule, etc... everything but touch +the device. Within this function and after it returns, the driver +shouldn't do any new IOs. Called in task context. This is sort of a +"quiesce" point. See note about interrupts at the end of this doc. + + Result codes: + - PCIERR_RESULT_CAN_RECOVER: + Driever returns this if it thinks it might be able to recover + the HW by just banging IOs or if it wants to be given + a chance to extract some diagnostic informations (see + below). + - PCIERR_RESULT_NEED_RESET: + Driver returns this if it thinks it can't recover unless the + slot is reset. + - PCIERR_RESULT_DISCONNECT: + Return this if driver thinks it won't recover at all, + (this will detach the driver ? or just leave it + dangling ? to be decided) + +So at this point, we have called error_detected() for all drivers +on the segment that had the error. On ppc64, the slot is isolated. What +happens now typically depends on the result from the drivers. If all +drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would +re-enable IOs on the slot (or do nothing special if the platform doesn't +isolate slots) and call 2). If not and we can reset slots, we go to 4), +if neither, we have a dead slot. If it's an hotplug slot, we might +"simulate" reset by triggering HW unplug/replug though. + +>>> Current ppc64 implementation assumes that a device driver will +>>> *not* schedule or semaphore in this routine; the current ppc64 +>>> implementation uses one kernel thread to notify all devices; +>>> thus, of one device sleeps/schedules, all devices are affected. +>>> Doing better requires complex multi-threaded logic in the error +>>> recovery implementation (e.g. 
waiting for all notification threads +>>> to "join" before proceeding with recovery.) This seems excessively +>>> complex and not worth implementing. + +>>> The current ppc64 implementation doesn't much care if the device +>>> attempts i/o at this point, or not. I/O's will fail, returning +>>> a value of 0xff on read, and writes will be dropped. If the device +>>> driver attempts more than 10K I/O's to a frozen adapter, it will +>>> assume that the device driver has gone into an infinite loop, and +>>> it will panic the the kernel. + + 2) mmio_enabled() + + This is the "early recovery" call. IOs are allowed again, but DMA is +not (hrm... to be discussed, I prefer not), with some restrictions. This +is NOT a callback for the driver to start operations again, only to +peek/poke at the device, extract diagnostic information, if any, and +eventually do things like trigger a device local reset or some such, +but not restart operations. This is sent if all drivers on a segment +agree that they can try to recover and no automatic link reset was +performed by the HW. If the platform can't just re-enable IOs without +a slot reset or a link reset, it doesn't call this callback and goes +directly to 3) or 4). All IOs should be done _synchronously_ from +within this callback, errors triggered by them will be returned via +the normal pci_check_whatever() api, no new error_detected() callback +will be issued due to an error happening here. However, such an error +might cause IOs to be re-blocked for the whole segment, and thus +invalidate the recovery that other devices on the same segment might +have done, forcing the whole segment into one of the next states, +that is link reset or slot reset. + + Result codes: + - PCIERR_RESULT_RECOVERED + Driver returns this if it thinks the device is fully + functionnal and thinks it is ready to start + normal driver operations again. 
There is no + guarantee that the driver will actually be + allowed to proceed, as another driver on the + same segment might have failed and thus triggered a + slot reset on platforms that support it. + + - PCIERR_RESULT_NEED_RESET + Driver returns this if it thinks the device is not + recoverable in it's current state and it needs a slot + reset to proceed. + + - PCIERR_RESULT_DISCONNECT + Same as above. Total failure, no recovery even after + reset driver dead. (To be defined more precisely) + +>>> The current ppc64 implementation does not implement this callback. + + 3) link_reset() + + This is called after the link has been reset. This is typically +a PCI Express specific state at this point and is done whenever a +non-fatal error has been detected that can be "solved" by resetting +the link. This call informs the driver of the reset and the driver +should check if the device appears to be in working condition. +This function acts a bit like 2) mmio_enabled(), in that the driver +is not supposed to restart normal driver I/O operations right away. +Instead, it should just "probe" the device to check it's recoverability +status. If all is right, then the core will call resume() once all +drivers have ack'd link_reset(). + + Result codes: + (identical to mmio_enabled) + +>>> The current ppc64 implementation does not implement this callback. + + 4) slot_reset() + + This is called after the slot has been soft or hard reset by the +platform. A soft reset consists of asserting the adapter #RST line +and then restoring the PCI BARs and PCI configuration header. If the +platform supports PCI hotplug, then it might instead perform a hard +reset by toggling power on the slot off/on. This call gives drivers +the chance to re-initialize the hardware (re-download firmware, etc.), +but drivers shouldn't restart normal I/O processing operations at +this point. 
(See note about interrupts; interrupts aren't guaranteed +to be delivered until the resume() callback has been called). If all +device drivers report success on this callback, the patform will call +resume() to complete the error handling and let the driver restart +normal I/O processing. + +A driver can still return a critical failure for this function if +it can't get the device operational after reset. If the platform +previously tried a soft reset, it migh now try a hard reset (power +cycle) and then call slot_reset() again. It the device still can't +be recovered, there is nothing more that can be done; the platform +will typically report a "permanent failure" in such a case. The +device will be considered "dead" in this case. + + Result codes: + - PCIERR_RESULT_DISCONNECT + Same as above. + +>>> The current ppc64 implementation does not try a power-cycle reset +>>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should. + + 5) resume() + + This is called if all drivers on the segment have returned +PCIERR_RESULT_RECOVERED from one of the 3 prevous callbacks. +That basically tells the driver to restart activity, tht everything +is back and running. No result code is taken into account here. If +a new error happens, it will restart a new error handling process. + +That's it. I think this covers all the possibilities. The way those +callbacks are called is platform policy. A platform with no slot reset +capability for example may want to just "ignore" drivers that can't +recover (disconnect them) and try to let other cards on the same segment +recover. Keep in mind that in most real life cases, though, there will +be only one driver per segment. + +Now, there is a note about interrupts. If you get an interrupt and your +device is dead or has been isolated, there is a problem :) + +After much thinking, I decided to leave that to the platform. 
That is, +the recovery API only precies that: + + - There is no guarantee that interrupt delivery can proceed from any +device on the segment starting from the error detection and until the +restart callback is sent, at which point interrupts are expected to be +fully operational. + + - There is no guarantee that interrupt delivery is stopped, that is, ad +river that gets an interrupts after detecting an error, or that detects +and error within the interrupt handler such that it prevents proper +ack'ing of the interrupt (and thus removal of the source) should just +return IRQ_NOTHANDLED. It's up to the platform to deal with taht +condition, typically by masking the irq source during the duration of +the error handling. It is expected that the platform "knows" which +interrupts are routed to error-management capable slots and can deal +with temporarily disabling that irq number during error processing (this +isn't terribly complex). That means some IRQ latency for other devices +sharing the interrupt, but there is simply no other way. 
High end +platforms aren't supposed to share interrupts between many devices +anyway :) + + +Revised: 31 May 2005 Linas Vepstas Index: linux-2.6.14-git10/MAINTAINERS =================================================================== --- linux-2.6.14-git10.orig/MAINTAINERS 2005-11-07 17:23:59.053340654 -0600 +++ linux-2.6.14-git10/MAINTAINERS 2005-11-07 17:33:26.933558243 -0600 @@ -1899,6 +1899,13 @@ L: linux-abi-devel at lists.sourceforge.net S: Maintained +PCI ERROR RECOVERY +P: Linas Vepstas +M: linas at austin.ibm.com +L: linux-kernel at vger.kernel.org +L: linux-pci at atrey.karlin.mff.cuni.cz +S: Supported + PCI SOUND DRIVERS (ES1370, ES1371 and SONICVIBES) P: Thomas Sailer M: sailer at ife.ee.ethz.ch From greg at kroah.com Sat Dec 3 12:02:53 2005 From: greg at kroah.com (Greg KH) Date: Fri, 2 Dec 2005 17:02:53 -0800 Subject: [PATCH] PCI Error Recovery: documentation In-Reply-To: <20051203005951.GW31651@austin.ibm.com> References: <20051203005951.GW31651@austin.ibm.com> Message-ID: <20051203010253.GA31826@kroah.com> On Fri, Dec 02, 2005 at 06:59:52PM -0600, linas wrote: > +PCI ERROR RECOVERY > +P: Linas Vepstas > +M: linas at austin.ibm.com > +L: linux-kernel at vger.kernel.org > +L: linux-pci at atrey.karlin.mff.cuni.cz > +S: Supported Tab vs space problem here :( Care to redo? thanks, greg k-h From linas at austin.ibm.com Sat Dec 3 12:03:14 2005 From: linas at austin.ibm.com (linas) Date: Fri, 2 Dec 2005 19:03:14 -0600 Subject: [PATCH]: rpaphp: find_bus() -- remove duplicate code Message-ID: <20051203010314.GX31651@austin.ibm.com> John Rose, Please review and sign off, and forward to Greg KH! --linas The function rpaphp_find_pci_bus() has been migrated to pcibios_find_pci_bus() in arch/powerpc/platforms/pseries/pci_dlpar.c This patch removes the old version. 
Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-01 18:51:26.000000000 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c 2005-12-02 14:17:19.834504074 -0600 @@ -32,36 +32,6 @@ #include "../pci.h" /* for pci_add_new_bus */ #include "rpaphp.h" -static struct pci_bus *find_bus_among_children(struct pci_bus *bus, - struct device_node *dn) -{ - struct pci_bus *child = NULL; - struct list_head *tmp; - struct device_node *busdn; - - busdn = pci_bus_to_OF_node(bus); - if (busdn == dn) - return bus; - - list_for_each(tmp, &bus->children) { - child = find_bus_among_children(pci_bus_b(tmp), dn); - if (child) - break; - } - return child; -} - -struct pci_bus *rpaphp_find_pci_bus(struct device_node *dn) -{ - struct pci_dn *pdn = dn->data; - - if (!pdn || !pdn->phb || !pdn->phb->bus) - return NULL; - - return find_bus_among_children(pdn->phb->bus, dn); -} -EXPORT_SYMBOL_GPL(rpaphp_find_pci_bus); - static int rpaphp_get_sensor_state(struct slot *slot, int *state) { int rc; @@ -120,7 +90,7 @@ /* config/unconfig adapter */ *value = slot->state; } else { - bus = rpaphp_find_pci_bus(slot->dn); + bus = pcibios_find_pci_bus(slot->dn); if (bus && !list_empty(&bus->devices)) *value = CONFIGURED; else @@ -369,7 +339,7 @@ struct pci_bus *bus; BUG_ON(!dn); - bus = rpaphp_find_pci_bus(dn); + bus = pcibios_find_pci_bus(dn); if (!bus) { err("%s: no pci_bus for dn %s\n", __FUNCTION__, dn->full_name); goto exit_rc; Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-12-01 18:51:26.000000000 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c 2005-12-02 14:18:23.226614153 -0600 @@ -174,7 +174,7 @@ { struct pci_dev *dev; - if 
(rpaphp_find_pci_bus(dn)) + if (pcibios_find_pci_bus(dn)) return -EINVAL; /* Add pci bus */ @@ -221,7 +221,7 @@ struct pci_dn *pdn; int rc = 0; - if (!rpaphp_find_pci_bus(dn)) + if (!pcibios_find_pci_bus(dn)) return -EINVAL; slot = find_slot(dn); @@ -366,7 +366,7 @@ struct pci_bus *bus; struct slot *slot; - bus = rpaphp_find_pci_bus(dn); + bus = pcibios_find_pci_bus(dn); if (!bus) return -EINVAL; Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp.h =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpaphp.h 2005-12-01 15:14:48.000000000 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp.h 2005-12-02 14:19:24.050084110 -0600 @@ -88,13 +88,10 @@ /* function prototypes */ /* rpaphp_pci.c */ -extern struct pci_bus *rpaphp_find_pci_bus(struct device_node *dn); -extern int rpaphp_claim_resource(struct pci_dev *dev, int resource); extern int rpaphp_enable_pci_slot(struct slot *slot); extern int register_pci_slot(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); extern void rpaphp_init_new_devs(struct pci_bus *bus); -extern void rpaphp_eeh_init_nodes(struct device_node *dn); extern int rpaphp_config_pci_adapter(struct pci_bus *bus); extern int rpaphp_unconfig_pci_adapter(struct pci_bus *bus); From linas at austin.ibm.com Sat Dec 3 12:16:18 2005 From: linas at austin.ibm.com (linas) Date: Fri, 2 Dec 2005 19:16:18 -0600 Subject: [PATCH] PCI Error Recovery: documentation In-Reply-To: <20051203010253.GA31826@kroah.com> References: <20051203005951.GW31651@austin.ibm.com> <20051203010253.GA31826@kroah.com> Message-ID: <20051203011618.GZ31651@austin.ibm.com> On Fri, Dec 02, 2005 at 05:02:53PM -0800, Greg KH was heard to remark: > > Tab vs space problem here :( > Care to redo? Below: pci-error-recovery_docs.patch Various PCI bus errors can be signaled by newer PCI controllers. 
Recovering from those errors requires an infrastructure to notify affected device drivers of the error, and a way of walking through a reset sequence. This patch adds documentation describing the current error recovery proposal. Signed-off-by: Linas Vepstas Documentation/pci-error-recovery.txt | 246 +++++++++++++++++++++++++++++++++++ MAINTAINERS | 7 2 files changed, 253 insertions(+) Index: linux-2.6.15-rc3-mm1/Documentation/pci-error-recovery.txt =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6.15-rc3-mm1/Documentation/pci-error-recovery.txt 2005-12-02 19:12:23.715528104 -0600 @@ -0,0 +1,246 @@ + + PCI Error Recovery + ------------------ + May 31, 2005 + + Current document maintainer: + Linas Vepstas + + +Some PCI bus controllers are able to detect certain "hard" PCI errors +on the bus, such as parity errors on the data and address busses, as +well as SERR and PERR errors. These chipsets are then able to disable +I/O to/from the affected device, so that, for example, a bad DMA +address doesn't end up corrupting system memory. These same chipsets +are also able to reset the affected PCI device, and return it to +working condition. This document describes a generic API form +performing error recovery. + +The core idea is that after a PCI error has been detected, there must +be a way for the kernel to coordinate with all affected device drivers +so that the pci card can be made operational again, possibly after +performing a full electrical #RST of the PCI card. The API below +provides a generic API for device drivers to be notified of PCI +errors, and to be notified of, and respond to, a reset sequence. + +Preliminary sketch of API, cut-n-pasted-n-modified email from +Ben Herrenschmidt, circa 5 april 2005 + +The error recovery API support is exposed to the driver in the form of +a structure of function pointers pointed to by a new field in struct +pci_driver. 
The absence of this pointer in pci_driver denotes an +"non-aware" driver, behaviour on these is platform dependant. +Platforms like ppc64 can try to simulate pci hotplug remove/add. + +The definition of "pci_error_token" is not covered here. It is based on +Seto's work on the synchronous error detection. We still need to define +functions for extracting infos out of an opaque error token. This is +separate from this API. + +This structure has the form: + +struct pci_error_handlers +{ + int (*error_detected)(struct pci_dev *dev, pci_error_token error); + int (*mmio_enabled)(struct pci_dev *dev); + int (*resume)(struct pci_dev *dev); + int (*link_reset)(struct pci_dev *dev); + int (*slot_reset)(struct pci_dev *dev); +}; + +A driver doesn't have to implement all of these callbacks. The +only mandatory one is error_detected(). If a callback is not +implemented, the corresponding feature is considered unsupported. +For example, if mmio_enabled() and resume() aren't there, then the +driver is assumed as not doing any direct recovery and requires +a reset. If link_reset() is not implemented, the card is assumed as +not caring about link resets, in which case, if recover is supported, +the core can try recover (but not slot_reset() unless it really did +reset the slot). If slot_reset() is not supported, link_reset() can +be called instead on a slot reset. + +At first, the call will always be : + + 1) error_detected() + + Error detected. This is sent once after an error has been detected. At +this point, the device might not be accessible anymore depending on the +platform (the slot will be isolated on ppc64). The driver may already +have "noticed" the error because of a failing IO, but this is the proper +"synchronisation point", that is, it gives a chance to the driver to +cleanup, waiting for pending stuff (timers, whatever, etc...) to +complete; it can take semaphores, schedule, etc... everything but touch +the device. 
Within this function and after it returns, the driver +shouldn't do any new IOs. Called in task context. This is sort of a +"quiesce" point. See note about interrupts at the end of this doc. + + Result codes: + - PCIERR_RESULT_CAN_RECOVER: + Driever returns this if it thinks it might be able to recover + the HW by just banging IOs or if it wants to be given + a chance to extract some diagnostic informations (see + below). + - PCIERR_RESULT_NEED_RESET: + Driver returns this if it thinks it can't recover unless the + slot is reset. + - PCIERR_RESULT_DISCONNECT: + Return this if driver thinks it won't recover at all, + (this will detach the driver ? or just leave it + dangling ? to be decided) + +So at this point, we have called error_detected() for all drivers +on the segment that had the error. On ppc64, the slot is isolated. What +happens now typically depends on the result from the drivers. If all +drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would +re-enable IOs on the slot (or do nothing special if the platform doesn't +isolate slots) and call 2). If not and we can reset slots, we go to 4), +if neither, we have a dead slot. If it's an hotplug slot, we might +"simulate" reset by triggering HW unplug/replug though. + +>>> Current ppc64 implementation assumes that a device driver will +>>> *not* schedule or semaphore in this routine; the current ppc64 +>>> implementation uses one kernel thread to notify all devices; +>>> thus, of one device sleeps/schedules, all devices are affected. +>>> Doing better requires complex multi-threaded logic in the error +>>> recovery implementation (e.g. waiting for all notification threads +>>> to "join" before proceeding with recovery.) This seems excessively +>>> complex and not worth implementing. + +>>> The current ppc64 implementation doesn't much care if the device +>>> attempts i/o at this point, or not. I/O's will fail, returning +>>> a value of 0xff on read, and writes will be dropped. 
If the device +>>> driver attempts more than 10K I/O's to a frozen adapter, it will +>>> assume that the device driver has gone into an infinite loop, and +>>> it will panic the kernel. + + 2) mmio_enabled() + + This is the "early recovery" call. IOs are allowed again, but DMA is +not (hrm... to be discussed, I prefer not), with some restrictions. This +is NOT a callback for the driver to start operations again, only to +peek/poke at the device, extract diagnostic information, if any, and +eventually do things like trigger a device local reset or some such, +but not restart operations. This is sent if all drivers on a segment +agree that they can try to recover and no automatic link reset was +performed by the HW. If the platform can't just re-enable IOs without +a slot reset or a link reset, it doesn't call this callback and goes +directly to 3) or 4). All IOs should be done _synchronously_ from +within this callback, errors triggered by them will be returned via +the normal pci_check_whatever() api, no new error_detected() callback +will be issued due to an error happening here. However, such an error +might cause IOs to be re-blocked for the whole segment, and thus +invalidate the recovery that other devices on the same segment might +have done, forcing the whole segment into one of the next states, +that is link reset or slot reset. + + Result codes: + - PCIERR_RESULT_RECOVERED + Driver returns this if it thinks the device is fully + functional and thinks it is ready to start + normal driver operations again. There is no + guarantee that the driver will actually be + allowed to proceed, as another driver on the + same segment might have failed and thus triggered a + slot reset on platforms that support it. + + - PCIERR_RESULT_NEED_RESET + Driver returns this if it thinks the device is not + recoverable in its current state and it needs a slot + reset to proceed. + + - PCIERR_RESULT_DISCONNECT + Same as above.
Total failure, no recovery even after + reset, driver dead. (To be defined more precisely) + +>>> The current ppc64 implementation does not implement this callback. + + 3) link_reset() + + This is called after the link has been reset. This is typically +a PCI Express specific state at this point and is done whenever a +non-fatal error has been detected that can be "solved" by resetting +the link. This call informs the driver of the reset and the driver +should check if the device appears to be in working condition. +This function acts a bit like 2) mmio_enabled(), in that the driver +is not supposed to restart normal driver I/O operations right away. +Instead, it should just "probe" the device to check its recoverability +status. If all is right, then the core will call resume() once all +drivers have ack'd link_reset(). + + Result codes: + (identical to mmio_enabled) + +>>> The current ppc64 implementation does not implement this callback. + + 4) slot_reset() + + This is called after the slot has been soft or hard reset by the +platform. A soft reset consists of asserting the adapter #RST line +and then restoring the PCI BARs and PCI configuration header. If the +platform supports PCI hotplug, then it might instead perform a hard +reset by toggling power on the slot off/on. This call gives drivers +the chance to re-initialize the hardware (re-download firmware, etc.), +but drivers shouldn't restart normal I/O processing operations at +this point. (See note about interrupts; interrupts aren't guaranteed +to be delivered until the resume() callback has been called). If all +device drivers report success on this callback, the platform will call +resume() to complete the error handling and let the driver restart +normal I/O processing. + +A driver can still return a critical failure for this function if +it can't get the device operational after reset.
If the platform +previously tried a soft reset, it might now try a hard reset (power +cycle) and then call slot_reset() again. If the device still can't +be recovered, there is nothing more that can be done; the platform +will typically report a "permanent failure" in such a case. The +device will be considered "dead" in this case. + + Result codes: + - PCIERR_RESULT_DISCONNECT + Same as above. + +>>> The current ppc64 implementation does not try a power-cycle reset +>>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should. + + 5) resume() + + This is called if all drivers on the segment have returned +PCIERR_RESULT_RECOVERED from one of the 3 previous callbacks. +That basically tells the driver to restart activity, that everything +is back and running. No result code is taken into account here. If +a new error happens, it will restart a new error handling process. + +That's it. I think this covers all the possibilities. The way those +callbacks are called is platform policy. A platform with no slot reset +capability for example may want to just "ignore" drivers that can't +recover (disconnect them) and try to let other cards on the same segment +recover. Keep in mind that in most real life cases, though, there will +be only one driver per segment. + +Now, there is a note about interrupts. If you get an interrupt and your +device is dead or has been isolated, there is a problem :) + +After much thinking, I decided to leave that to the platform. That is, +the recovery API only specifies that: + + - There is no guarantee that interrupt delivery can proceed from any +device on the segment starting from the error detection and until the +restart callback is sent, at which point interrupts are expected to be +fully operational.
+ + - There is no guarantee that interrupt delivery is stopped, that is, a +driver that gets an interrupt after detecting an error, or that detects +an error within the interrupt handler such that it prevents proper +ack'ing of the interrupt (and thus removal of the source) should just +return IRQ_NOTHANDLED. It's up to the platform to deal with that +condition, typically by masking the irq source during the duration of +the error handling. It is expected that the platform "knows" which +interrupts are routed to error-management capable slots and can deal +with temporarily disabling that irq number during error processing (this +isn't terribly complex). That means some IRQ latency for other devices +sharing the interrupt, but there is simply no other way. High end +platforms aren't supposed to share interrupts between many devices +anyway :) + + +Revised: 31 May 2005 Linas Vepstas Index: linux-2.6.15-rc3-mm1/MAINTAINERS =================================================================== --- linux-2.6.15-rc3-mm1.orig/MAINTAINERS 2005-12-01 15:17:24.000000000 -0600 +++ linux-2.6.15-rc3-mm1/MAINTAINERS 2005-12-02 19:14:19.126269787 -0600 @@ -1997,6 +1997,13 @@ L: linux-abi-devel at lists.sourceforge.net S: Maintained +PCI ERROR RECOVERY +P: Linas Vepstas +M: linas at austin.ibm.com +L: linux-kernel at vger.kernel.org +L: linux-pci at atrey.karlin.mff.cuni.cz +S: Supported + PCI SOUND DRIVERS (ES1370, ES1371 and SONICVIBES) P: Thomas Sailer M: sailer at ife.ee.ethz.ch From tom_gall at mac.com Sun Dec 4 09:29:42 2005 From: tom_gall at mac.com (Thomas Gall) Date: Sat, 3 Dec 2005 16:29:42 -0600 Subject: power3 / matrox problems on current git Message-ID: <3528CCF2-D2CD-496B-821D-E3714EC885DF@mac.com> Greetings, Trying to work with ben on some vdso support for glibc and I've ran into an interesting problem on my power3 box. 44p-270 rs6000 It would appear things are busticated on 2.6.15 current git.
(Pull as of dec 3, this afternoon ~2pm CST) I suspect it's something related to the matrox card but that's only a theory at this point. Box has 2 gig of memory and G200 matrox card. Screen is all garbled and on the panel I get panic, VFS: can't find r which is probably a reference to the root partition. I've checked the yaboot entry, it has the same root reference as my working 2.6.12 kernel so I'm sure something else isn't quite right. I've tried both with and without video=matroxfb:1280x1024 at 60,memtype: 3 which in the past was required in order to get all 8 megs on the card working (and thus have a working X) Yes I did make sure optimize for POWER4 is off. :-) Appreciate any comments or suggestions, Regards, Tom From tom_gall at mac.com Sun Dec 4 09:57:34 2005 From: tom_gall at mac.com (Thomas Gall) Date: Sat, 3 Dec 2005 16:57:34 -0600 Subject: power3 / matrox problems on current git In-Reply-To: <3528CCF2-D2CD-496B-821D-E3714EC885DF@mac.com> References: <3528CCF2-D2CD-496B-821D-E3714EC885DF@mac.com> Message-ID: <02CADBC9-5D0F-4146-A29F-56F543D664D1@mac.com> On Dec 3, 2005, at 4:29 PM, Thomas Gall wrote: > > > Appreciate any comments or suggestions, As a follow up, I did try both with and without the NUMA settings Flat and Sparse memory models ... no change ... I'll see if I can get a serial cable on it later tonight... Regards, Tom From rsa at us.ibm.com Mon Dec 5 08:12:09 2005 From: rsa at us.ibm.com (Ryan S. Arnold) Date: Sun, 04 Dec 2005 15:12:09 -0600 Subject: [RFC PATCH 1/5] CELL bogus_console port to hvc_console backend driver Message-ID: <43935BA9.5020602@us.ibm.com> This patch removes the old bogus_console.c driver file. Signed-off-by: Ryan S. Arnold -------------- next part -------------- A non-text attachment was scrubbed... 
Name: hvc_fss.1.patch Type: text/x-patch Size: 7878 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051204/895b1270/attachment.bin From rsa at us.ibm.com Mon Dec 5 08:12:16 2005 From: rsa at us.ibm.com (Ryan S. Arnold) Date: Sun, 04 Dec 2005 15:12:16 -0600 Subject: [RFC PATCH 2/5] CELL bogus_console port to hvc_console backend driver Message-ID: <43935BB0.9050306@us.ibm.com> This patch shuffles around some data-type declarations and moves some functions out of include/asm-ppc64/hvconsole.h and into a new drivers/char/hvc_console.h file. Signed-off-by: Ryan S. Arnold -------------- next part -------------- A non-text attachment was scrubbed... Name: hvc_fss.2.patch Type: text/x-patch Size: 6230 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051204/2c07a770/attachment.bin From michael at ellerman.id.au Sun Dec 4 17:28:05 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 17:28:05 Subject: [PATCH] powerpc: Fix compile warning in __eeh_mark_slot() Message-ID: <20051204232819.9FABD6884B@ozlabs.org> Fix a compile warning in the powerpc.git tree: arch/powerpc/platforms/pseries/eeh.c: In function `__eeh_mark_slot': arch/powerpc/platforms/pseries/eeh.c:214: warning: ISO C90 forbids mixed declarations and code Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/eeh.c | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: kexec/arch/powerpc/platforms/pseries/eeh.c =================================================================== --- kexec.orig/arch/powerpc/platforms/pseries/eeh.c +++ kexec/arch/powerpc/platforms/pseries/eeh.c @@ -208,10 +208,10 @@ static void __eeh_mark_slot (struct devi { while (dn) { if (PCI_DN(dn)) { + struct pci_dev *dev = PCI_DN(dn)->pcidev; PCI_DN(dn)->eeh_mode |= mode_flag; /* Mark the pci device driver too */ - struct pci_dev *dev = PCI_DN(dn)->pcidev; if (dev && dev->driver) dev->error_state = pci_channel_io_frozen;
From michael at ellerman.id.au Sun Dec 4 18:39:09 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:09 Subject: [PATCH 0/11] powerpc: Kdump support Message-ID: <1133743149.268607.418162138937.qpush@concordia> This patch series implements basic support for kdump on powerpc, on top of the current powerpc.git tree. Paulus please merge. From michael at ellerman.id.au Sun Dec 4 18:39:12 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:12 Subject: [PATCH 1/11] powerpc: Propagate regs through to machine_crash_shutdown In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003926.BD45B68869@ozlabs.org> Currently machine_crash_shutdown() gets a struct pt_regs, but doesn't pass it through to the ppc_md function, it should. Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/machine_kexec.c | 2 +- include/asm-powerpc/machdep.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Index: kexec/include/asm-powerpc/machdep.h =================================================================== --- kexec.orig/include/asm-powerpc/machdep.h +++ kexec/include/asm-powerpc/machdep.h @@ -222,7 +222,7 @@ struct machdep_calls { * to run successfully. * XXX Should we move this one out of kexec scope? */ - void (*machine_crash_shutdown)(void); + void (*machine_crash_shutdown)(struct pt_regs *regs); /* Called to do what every setup is needed on image and the * reboot code buffer. Returns 0 on success. 
Index: kexec/arch/powerpc/kernel/machine_kexec.c =================================================================== --- kexec.orig/arch/powerpc/kernel/machine_kexec.c +++ kexec/arch/powerpc/kernel/machine_kexec.c @@ -23,7 +23,7 @@ note_buf_t crash_notes[NR_CPUS]; void machine_crash_shutdown(struct pt_regs *regs) { if (ppc_md.machine_crash_shutdown) - ppc_md.machine_crash_shutdown(); + ppc_md.machine_crash_shutdown(regs); } /* From michael at ellerman.id.au Sun Dec 4 18:39:15 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:15 Subject: [PATCH 2/11] powerpc: Add a is_kernel_addr() macro In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003930.04D9768871@ozlabs.org> There's a bunch of code that compares an address with KERNELBASE to see if it's a "kernel address", ie. >= KERNELBASE. The proper test is actually to compare with PAGE_OFFSET, since we're going to change KERNELBASE soon. So replace all of them with an is_kernel_addr() macro that does that. Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/prom_init.c | 2 +- arch/powerpc/kernel/setup-common.c | 2 +- arch/powerpc/mm/slb.c | 6 +++--- arch/powerpc/mm/stab.c | 6 +++--- arch/powerpc/mm/tlb_64.c | 2 +- arch/powerpc/oprofile/op_model_power4.c | 4 ++-- arch/powerpc/oprofile/op_model_rs64.c | 3 +-- arch/powerpc/xmon/xmon.c | 4 ++-- include/asm-powerpc/page.h | 6 ++++++ 9 files changed, 20 insertions(+), 15 deletions(-) Index: kexec/arch/powerpc/mm/stab.c =================================================================== --- kexec.orig/arch/powerpc/mm/stab.c +++ kexec/arch/powerpc/mm/stab.c @@ -122,7 +122,7 @@ static int __ste_allocate(unsigned long unsigned long offset; /* Kernel or user address? */ - if (ea >= KERNELBASE) { + if (is_kernel_addr(ea)) { vsid = get_kernel_vsid(ea); } else { if ((ea >= TASK_SIZE_USER64) || (! 
mm)) @@ -133,7 +133,7 @@ static int __ste_allocate(unsigned long stab_entry = make_ste(get_paca()->stab_addr, GET_ESID(ea), vsid); - if (ea < KERNELBASE) { + if (!is_kernel_addr(ea)) { offset = __get_cpu_var(stab_cache_ptr); if (offset < NR_STAB_CACHE_ENTRIES) __get_cpu_var(stab_cache[offset++]) = stab_entry; @@ -190,7 +190,7 @@ void switch_stab(struct task_struct *tsk entry++, ste++) { unsigned long ea; ea = ste->esid_data & ESID_MASK; - if (ea < KERNELBASE) { + if (!is_kernel_addr(ea)) { ste->esid_data = 0; } } Index: kexec/arch/powerpc/kernel/prom_init.c =================================================================== --- kexec.orig/arch/powerpc/kernel/prom_init.c +++ kexec/arch/powerpc/kernel/prom_init.c @@ -1994,7 +1994,7 @@ static void __init prom_check_initrd(uns if (r3 && r4 && r4 != 0xdeadbeef) { unsigned long val; - RELOC(prom_initrd_start) = (r3 >= KERNELBASE) ? __pa(r3) : r3; + RELOC(prom_initrd_start) = is_kernel_addr(r3) ? __pa(r3) : r3; RELOC(prom_initrd_end) = RELOC(prom_initrd_start) + r4; val = RELOC(prom_initrd_start); Index: kexec/arch/powerpc/kernel/setup-common.c =================================================================== --- kexec.orig/arch/powerpc/kernel/setup-common.c +++ kexec/arch/powerpc/kernel/setup-common.c @@ -319,7 +319,7 @@ void __init check_for_initrd(void) /* If we were passed an initrd, set the ROOT_DEV properly if the values * look sensible. If not, clear initrd reference. 
*/ - if (initrd_start >= KERNELBASE && initrd_end >= KERNELBASE && + if (is_kernel_addr(initrd_start) && is_kernel_addr(initrd_end) && initrd_end > initrd_start) ROOT_DEV = Root_RAM0; else Index: kexec/arch/powerpc/mm/slb.c =================================================================== --- kexec.orig/arch/powerpc/mm/slb.c +++ kexec/arch/powerpc/mm/slb.c @@ -134,14 +134,14 @@ void switch_slb(struct task_struct *tsk, else unmapped_base = TASK_UNMAPPED_BASE_USER64; - if (pc >= KERNELBASE) + if (is_kernel_addr(pc)) return; slb_allocate(pc); if (GET_ESID(pc) == GET_ESID(stack)) return; - if (stack >= KERNELBASE) + if (is_kernel_addr(stack)) return; slb_allocate(stack); @@ -149,7 +149,7 @@ void switch_slb(struct task_struct *tsk, || (GET_ESID(stack) == GET_ESID(unmapped_base))) return; - if (unmapped_base >= KERNELBASE) + if (is_kernel_addr(unmapped_base)) return; slb_allocate(unmapped_base); } Index: kexec/arch/powerpc/oprofile/op_model_power4.c =================================================================== --- kexec.orig/arch/powerpc/oprofile/op_model_power4.c +++ kexec/arch/powerpc/oprofile/op_model_power4.c @@ -252,7 +252,7 @@ static unsigned long get_pc(struct pt_re return (unsigned long)__va(pc); /* Not sure where we were */ - if (pc < KERNELBASE) + if (!is_kernel_addr(pc)) /* function descriptor madness */ return *((unsigned long *)kernel_unknown_bucket); @@ -264,7 +264,7 @@ static int get_kernel(unsigned long pc) int is_kernel; if (!mmcra_has_sihv) { - is_kernel = (pc >= KERNELBASE); + is_kernel = is_kernel_addr(pc); } else { unsigned long mmcra = mfspr(SPRN_MMCRA); is_kernel = ((mmcra & MMCRA_SIPR) == 0); Index: kexec/arch/powerpc/xmon/xmon.c =================================================================== --- kexec.orig/arch/powerpc/xmon/xmon.c +++ kexec/arch/powerpc/xmon/xmon.c @@ -1013,7 +1013,7 @@ static long check_bp_loc(unsigned long a unsigned int instr; addr &= ~3; - if (addr < KERNELBASE) { + if (!is_kernel_addr(addr)) { printf("Breakpoints 
may only be placed at kernel addresses\n"); return 0; } @@ -1064,7 +1064,7 @@ bpt_cmds(void) dabr.address = 0; dabr.enabled = 0; if (scanhex(&dabr.address)) { - if (dabr.address < KERNELBASE) { + if (!is_kernel_addr(dabr.address)) { printf(badaddr); break; } Index: kexec/include/asm-powerpc/page.h =================================================================== --- kexec.orig/include/asm-powerpc/page.h +++ kexec/include/asm-powerpc/page.h @@ -86,6 +86,12 @@ /* to align the pointer to the (next) page boundary */ #define PAGE_ALIGN(addr) _ALIGN(addr, PAGE_SIZE) +/* + * Don't compare things with KERNELBASE or PAGE_OFFSET to test for + * "kernelness", use is_kernel_addr() - it should do what you want. + */ +#define is_kernel_addr(x) ((x) >= PAGE_OFFSET) + #ifndef __ASSEMBLY__ #undef STRICT_MM_TYPECHECKS Index: kexec/arch/powerpc/oprofile/op_model_rs64.c =================================================================== --- kexec.orig/arch/powerpc/oprofile/op_model_rs64.c +++ kexec/arch/powerpc/oprofile/op_model_rs64.c @@ -178,7 +178,6 @@ static void rs64_handle_interrupt(struct int val; int i; unsigned long pc = mfspr(SPRN_SIAR); - int is_kernel = (pc >= KERNELBASE); /* set the PMM bit (see comment below) */ mtmsrd(mfmsr() | MSR_PMM); @@ -187,7 +186,7 @@ static void rs64_handle_interrupt(struct val = ctr_read(i); if (val < 0) { if (ctr[i].enabled) { - oprofile_add_pc(pc, is_kernel, i); + oprofile_add_pc(pc, is_kernel_addr(pc), i); ctr_write(i, reset_value[i]); } else { ctr_write(i, 0); Index: kexec/arch/powerpc/mm/tlb_64.c =================================================================== --- kexec.orig/arch/powerpc/mm/tlb_64.c +++ kexec/arch/powerpc/mm/tlb_64.c @@ -168,7 +168,7 @@ void hpte_update(struct mm_struct *mm, u batch->mm = mm; batch->psize = psize; } - if (addr < KERNELBASE) { + if (!is_kernel_addr(addr)) { vsid = get_vsid(mm->context.id, addr); WARN_ON(vsid == 0); } else From michael at ellerman.id.au Sun Dec 4 18:39:20 2005 From: michael at 
ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:20 Subject: [PATCH 3/11] powerpc: Seperate usage of KERNELBASE and PAGE_OFFSET In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003934.643E26887C@ozlabs.org> This patch seperates usage of KERNELBASE and PAGE_OFFSET. I haven't looked at any of the PPC code, if we ever want to support Kdump on PPC we'll have to do another audit, ditto for iSeries. This patch makes PAGE_OFFSET the constant, it'll always be 0xC * 1 gazillion. To get a physical address from a virtual one you subtract PAGE_OFFSET, _not_ KERNELBASE. KERNELBASE is the virtual address of the start of the kernel, it's often the same as PAGE_OFFSET, but _might not be_. If you want to know something's offset from the start of the kernel you should subtract KERNELBASE. Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/btext.c | 4 ++-- arch/powerpc/kernel/entry_64.S | 4 ++-- arch/powerpc/kernel/lparmap.c | 6 +++--- arch/powerpc/kernel/machine_kexec_64.c | 5 ++--- arch/powerpc/mm/hash_utils_64.c | 6 +++--- arch/powerpc/mm/slb.c | 4 ++-- arch/powerpc/mm/slb_low.S | 6 +++--- arch/powerpc/mm/stab.c | 10 +++++----- include/asm-powerpc/page.h | 2 +- 9 files changed, 23 insertions(+), 24 deletions(-) Index: kexec/arch/powerpc/mm/stab.c =================================================================== --- kexec.orig/arch/powerpc/mm/stab.c +++ kexec/arch/powerpc/mm/stab.c @@ -40,7 +40,7 @@ static int make_ste(unsigned long stab, unsigned long entry, group, old_esid, castout_entry, i; unsigned int global_entry; struct stab_entry *ste, *castout_ste; - unsigned long kernel_segment = (esid << SID_SHIFT) >= KERNELBASE; + unsigned long kernel_segment = (esid << SID_SHIFT) >= PAGE_OFFSET; vsid_data = vsid << STE_VSID_SHIFT; esid_data = esid << SID_SHIFT | STE_ESID_KP | STE_ESID_V; @@ -83,7 +83,7 @@ static int make_ste(unsigned long stab, } /* Dont cast out the first kernel segment */ - if ((castout_ste->esid_data & 
ESID_MASK) != KERNELBASE) + if ((castout_ste->esid_data & ESID_MASK) != PAGE_OFFSET) break; castout_entry = (castout_entry + 1) & 0xf; @@ -251,7 +251,7 @@ void stabs_alloc(void) panic("Unable to allocate segment table for CPU %d.\n", cpu); - newstab += KERNELBASE; + newstab = (unsigned long)__va(newstab); memset((void *)newstab, 0, HW_PAGE_SIZE); @@ -270,11 +270,11 @@ void stabs_alloc(void) */ void stab_initialize(unsigned long stab) { - unsigned long vsid = get_kernel_vsid(KERNELBASE); + unsigned long vsid = get_kernel_vsid(PAGE_OFFSET); unsigned long stabreal; asm volatile("isync; slbia; isync":::"memory"); - make_ste(stab, GET_ESID(KERNELBASE), vsid); + make_ste(stab, GET_ESID(PAGE_OFFSET), vsid); /* Order update */ asm volatile("sync":::"memory"); Index: kexec/arch/powerpc/kernel/machine_kexec_64.c =================================================================== --- kexec.orig/arch/powerpc/kernel/machine_kexec_64.c +++ kexec/arch/powerpc/kernel/machine_kexec_64.c @@ -153,9 +153,8 @@ void kexec_copy_flush(struct kimage *ima * including ones that were in place on the original copy */ for (i = 0; i < nr_segments; i++) - flush_icache_range(ranges[i].mem + KERNELBASE, - ranges[i].mem + KERNELBASE + - ranges[i].memsz); + flush_icache_range((unsigned long)__va(ranges[i].mem), + (unsigned long)__va(ranges[i].mem + ranges[i].memsz)); } #ifdef CONFIG_SMP Index: kexec/arch/powerpc/mm/hash_utils_64.c =================================================================== --- kexec.orig/arch/powerpc/mm/hash_utils_64.c +++ kexec/arch/powerpc/mm/hash_utils_64.c @@ -456,7 +456,7 @@ void __init htab_initialize(void) /* create bolted the linear mapping in the hash table */ for (i=0; i < lmb.memory.cnt; i++) { - base = lmb.memory.region[i].base + KERNELBASE; + base = (unsigned long)__va(lmb.memory.region[i].base); size = lmb.memory.region[i].size; DBG("creating mapping for region: %lx : %lx\n", base, size); @@ -498,8 +498,8 @@ void __init htab_initialize(void) * for either 4K or 
16MB pages. */ if (tce_alloc_start) { - tce_alloc_start += KERNELBASE; - tce_alloc_end += KERNELBASE; + tce_alloc_start = (unsigned long)__va(tce_alloc_start); + tce_alloc_end = (unsigned long)__va(tce_alloc_end); if (base + size >= tce_alloc_start) tce_alloc_start = base + size + 1; Index: kexec/arch/powerpc/mm/slb.c =================================================================== --- kexec.orig/arch/powerpc/mm/slb.c +++ kexec/arch/powerpc/mm/slb.c @@ -75,7 +75,7 @@ static void slb_flush_and_rebolt(void) vflags = SLB_VSID_KERNEL | virtual_llp; ksp_esid_data = mk_esid_data(get_paca()->kstack, 2); - if ((ksp_esid_data & ESID_MASK) == KERNELBASE) + if ((ksp_esid_data & ESID_MASK) == PAGE_OFFSET) ksp_esid_data &= ~SLB_ESID_V; /* We need to do this all in asm, so we're sure we don't touch @@ -213,7 +213,7 @@ void slb_initialize(void) asm volatile("isync":::"memory"); asm volatile("slbmte %0,%0"::"r" (0) : "memory"); asm volatile("isync; slbia; isync":::"memory"); - create_slbe(KERNELBASE, lflags, 0); + create_slbe(PAGE_OFFSET, lflags, 0); /* VMALLOC space has 4K pages always for now */ create_slbe(VMALLOCBASE, vflags, 1); Index: kexec/arch/powerpc/kernel/entry_64.S =================================================================== --- kexec.orig/arch/powerpc/kernel/entry_64.S +++ kexec/arch/powerpc/kernel/entry_64.S @@ -690,7 +690,7 @@ _GLOBAL(enter_rtas) /* Setup our real return addr */ SET_REG_TO_LABEL(r4,.rtas_return_loc) - SET_REG_TO_CONST(r9,KERNELBASE) + SET_REG_TO_CONST(r9,PAGE_OFFSET) sub r4,r4,r9 mtlr r4 @@ -718,7 +718,7 @@ _GLOBAL(enter_rtas) _STATIC(rtas_return_loc) /* relocation is off at this point */ mfspr r4,SPRN_SPRG3 /* Get PACA */ - SET_REG_TO_CONST(r5, KERNELBASE) + SET_REG_TO_CONST(r5, PAGE_OFFSET) sub r4,r4,r5 /* RELOC the PACA base pointer */ mfmsr r6 Index: kexec/arch/powerpc/mm/slb_low.S =================================================================== --- kexec.orig/arch/powerpc/mm/slb_low.S +++ kexec/arch/powerpc/mm/slb_low.S @@ -37,9 
+37,9 @@ _GLOBAL(slb_allocate_realmode) srdi r9,r3,60 /* get region */ srdi r10,r3,28 /* get esid */ - cmpldi cr7,r9,0xc /* cmp KERNELBASE for later use */ + cmpldi cr7,r9,0xc /* cmp PAGE_OFFSET for later use */ - /* r3 = address, r10 = esid, cr7 = <>KERNELBASE */ + /* r3 = address, r10 = esid, cr7 = <> PAGE_OFFSET */ blt cr7,0f /* user or kernel? */ /* kernel address: proto-VSID = ESID */ @@ -166,7 +166,7 @@ _GLOBAL(slb_allocate_user) /* * Finish loading of an SLB entry and return * - * r3 = EA, r10 = proto-VSID, r11 = flags, clobbers r9, cr7 = <>KERNELBASE + * r3 = EA, r10 = proto-VSID, r11 = flags, clobbers r9, cr7 = <> PAGE_OFFSET */ slb_finish_load: ASM_VSID_SCRAMBLE(r10,r9) Index: kexec/arch/powerpc/kernel/lparmap.c =================================================================== --- kexec.orig/arch/powerpc/kernel/lparmap.c +++ kexec/arch/powerpc/kernel/lparmap.c @@ -16,8 +16,8 @@ const struct LparMap __attribute__((__se .xSegmentTableOffs = STAB0_PAGE, .xEsids = { - { .xKernelEsid = GET_ESID(KERNELBASE), - .xKernelVsid = KERNEL_VSID(KERNELBASE), }, + { .xKernelEsid = GET_ESID(PAGE_OFFSET), + .xKernelVsid = KERNEL_VSID(PAGE_OFFSET), }, { .xKernelEsid = GET_ESID(VMALLOCBASE), .xKernelVsid = KERNEL_VSID(VMALLOCBASE), }, }, @@ -25,7 +25,7 @@ const struct LparMap __attribute__((__se .xRanges = { { .xPages = HvPagesToMap, .xOffset = 0, - .xVPN = KERNEL_VSID(KERNELBASE) << (SID_SHIFT - HW_PAGE_SHIFT), + .xVPN = KERNEL_VSID(PAGE_OFFSET) << (SID_SHIFT - HW_PAGE_SHIFT), }, }, }; Index: kexec/include/asm-powerpc/page.h =================================================================== --- kexec.orig/include/asm-powerpc/page.h +++ kexec/include/asm-powerpc/page.h @@ -56,7 +56,7 @@ #define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT) #define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT) -#define __va(x) ((void *)((unsigned long)(x) + KERNELBASE)) +#define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET)) #define __pa(x) ((unsigned long)(x) - 
PAGE_OFFSET) /* Index: kexec/arch/powerpc/kernel/btext.c =================================================================== --- kexec.orig/arch/powerpc/kernel/btext.c +++ kexec/arch/powerpc/kernel/btext.c @@ -60,7 +60,7 @@ int force_printk_to_btext = 0; * * The display is mapped to virtual address 0xD0000000, rather * than 1:1, because some some CHRP machines put the frame buffer - * in the region starting at 0xC0000000 (KERNELBASE). + * in the region starting at 0xC0000000 (PAGE_OFFSET). * This mapping is temporary and will disappear as soon as the * setup done by MMU_Init() is applied. * @@ -71,7 +71,7 @@ int force_printk_to_btext = 0; */ void __init btext_prepare_BAT(void) { - unsigned long vaddr = KERNELBASE + 0x10000000; + unsigned long vaddr = PAGE_OFFSET + 0x10000000; unsigned long addr; unsigned long lowbits; From michael at ellerman.id.au Sun Dec 4 18:39:23 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:23 Subject: [PATCH 4/11] powerpc: Add CONFIG_CRASH_DUMP In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003942.A65C468851@ozlabs.org> This patch adds a Kconfig variable, CONFIG_CRASH_DUMP, which configures the built kernel for use as a Kdump kernel. Currently "all" this involves is changing the value of KERNELBASE to 32 MB. Signed-off-by: Michael Ellerman --- arch/powerpc/Kconfig | 11 +++++++++++ arch/powerpc/kernel/setup_64.c | 3 +++ include/asm-powerpc/page.h | 9 ++++++++- 3 files changed, 22 insertions(+), 1 deletion(-) Index: kexec/arch/powerpc/Kconfig =================================================================== --- kexec.orig/arch/powerpc/Kconfig +++ kexec/arch/powerpc/Kconfig @@ -379,6 +379,17 @@ config CELL_IIC bool default y +config CRASH_DUMP + bool "kernel crash dumps (EXPERIMENTAL)" + depends on PPC_MULTIPLATFORM + depends on EXPERIMENTAL + help + Build a kernel suitable for use as a kdump capture kernel. 
+ The kernel will be linked at a different address than normal, and + so can only be used for Kdump. + + Don't change this unless you know what you are doing. + config IBMVIO depends on PPC_PSERIES || PPC_ISERIES bool Index: kexec/arch/powerpc/kernel/setup_64.c =================================================================== --- kexec.orig/arch/powerpc/kernel/setup_64.c +++ kexec/arch/powerpc/kernel/setup_64.c @@ -504,6 +504,9 @@ void __init setup_system(void) ppc64_caches.iline_size); printk("htab_address = 0x%p\n", htab_address); printk("htab_hash_mask = 0x%lx\n", htab_hash_mask); +#if PHYSICAL_START > 0 + printk("physical_start = 0x%x\n", PHYSICAL_START); +#endif printk("-----------------------------------------------------\n"); mm_init_ppc64(); Index: kexec/include/asm-powerpc/page.h =================================================================== --- kexec.orig/include/asm-powerpc/page.h +++ kexec/include/asm-powerpc/page.h @@ -37,8 +37,15 @@ */ #define PAGE_MASK (~((1 << PAGE_SHIFT) - 1)) +#ifdef CONFIG_CRASH_DUMP +/* Kdump kernel runs at 32 MB, change at your peril. */ +#define PHYSICAL_START 0x2000000 +#else +#define PHYSICAL_START 0x0 +#endif + #define PAGE_OFFSET ASM_CONST(CONFIG_KERNEL_START) -#define KERNELBASE PAGE_OFFSET +#define KERNELBASE (PAGE_OFFSET + PHYSICAL_START) #ifdef CONFIG_DISCONTIGMEM #define page_to_pfn(page) discontigmem_page_to_pfn(page) From michael at ellerman.id.au Sun Dec 4 18:39:33 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:33 Subject: [PATCH 5/11] powerpc: Create a trampoline for the fwnmi vectors In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003947.EA0A068876@ozlabs.org> The fwnmi vectors can be anywhere < 32 MB, so we need to use a trampoline for them. The kdump kernel will register the trampoline addresses, which will then jump up to the real code above 32 MB. 
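To make the address arithmetic concrete, here is a small standalone sketch (not part of the patch) of how the low address handed to firmware relates to the address the handler is linked at. Only PHYSICAL_START = 32 MB and the "__pa(x) - PHYSICAL_START" computation come from this series; the 64-bit PAGE_OFFSET value and the helper names sketch_pa() and fwnmi_low_addr() are assumptions made up for illustration.

```c
#include <stdint.h>

/* Assumed for illustration: a typical ppc64 linear-mapping base. */
#define PAGE_OFFSET    0xC000000000000000ULL
/* From this series: the kdump kernel runs at 32 MB. */
#define PHYSICAL_START 0x2000000ULL

/* Hypothetical stand-in for __pa(): virtual address to physical address. */
static uint64_t sketch_pa(uint64_t vaddr)
{
	return vaddr - PAGE_OFFSET;
}

/*
 * Hypothetical helper: the address registered with firmware.
 * It is the handler's offset from the start of the kernel image,
 * which lands in the low-memory trampoline rather than at the
 * real handler above 32 MB.
 */
static uint64_t fwnmi_low_addr(uint64_t handler_vaddr)
{
	return sketch_pa(handler_vaddr) - PHYSICAL_START;
}
```

For a handler linked at PAGE_OFFSET + PHYSICAL_START + 0x3000, the sketch yields 0x3000, i.e. the same offset from zero that the handler has from the start of the kdump kernel. With PHYSICAL_START defined as 0 (the non-kdump case) the subtraction is a no-op and the firmware is pointed straight at the handler.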
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/head_64.S | 2 ++ arch/powerpc/platforms/pseries/ras.c | 6 ++---- arch/powerpc/platforms/pseries/setup.c | 18 ++++++++++-------- include/asm-powerpc/firmware.h | 6 ++++++ 4 files changed, 20 insertions(+), 12 deletions(-) Index: kexec/arch/powerpc/kernel/head_64.S =================================================================== --- kexec.orig/arch/powerpc/kernel/head_64.S +++ kexec/arch/powerpc/kernel/head_64.S @@ -553,6 +553,7 @@ slb_miss_user_pseries: * Vectors for the FWNMI option. Share common code. */ .globl system_reset_fwnmi + .align 7 system_reset_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ @@ -560,6 +561,7 @@ system_reset_fwnmi: EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common) .globl machine_check_fwnmi + .align 7 machine_check_fwnmi: HMT_MEDIUM mtspr SPRN_SPRG1,r13 /* save r13 */ Index: kexec/arch/powerpc/platforms/pseries/setup.c =================================================================== --- kexec.orig/arch/powerpc/platforms/pseries/setup.c +++ kexec/arch/powerpc/platforms/pseries/setup.c @@ -77,8 +77,6 @@ #endif extern void find_udbg_vterm(void); -extern void system_reset_fwnmi(void); /* from head.S */ -extern void machine_check_fwnmi(void); /* from head.S */ int fwnmi_active; /* TRUE if an FWNMI handler is present */ @@ -104,18 +102,22 @@ void pSeries_show_cpuinfo(struct seq_fil /* Initialize firmware assisted non-maskable interrupts if * the firmware supports this feature. 
- * */ static void __init fwnmi_init(void) { - int ret; + unsigned long system_reset_addr, machine_check_addr; + int ibm_nmi_register = rtas_token("ibm,nmi-register"); if (ibm_nmi_register == RTAS_UNKNOWN_SERVICE) return; - ret = rtas_call(ibm_nmi_register, 2, 1, NULL, - __pa((unsigned long)system_reset_fwnmi), - __pa((unsigned long)machine_check_fwnmi)); - if (ret == 0) + + /* If the kernel's not linked at zero we point the firmware at low + * addresses anyway, and use a trampoline to get to the real code. */ + system_reset_addr = __pa(system_reset_fwnmi) - PHYSICAL_START; + machine_check_addr = __pa(machine_check_fwnmi) - PHYSICAL_START; + + if (0 == rtas_call(ibm_nmi_register, 2, 1, NULL, system_reset_addr, + machine_check_addr)) fwnmi_active = 1; } Index: kexec/include/asm-powerpc/firmware.h =================================================================== --- kexec.orig/include/asm-powerpc/firmware.h +++ kexec/include/asm-powerpc/firmware.h @@ -98,6 +98,12 @@ typedef struct { extern firmware_feature_t firmware_features_table[]; #endif +extern void system_reset_fwnmi(void); +extern void machine_check_fwnmi(void); + +/* This is true if we are using the firmware NMI handler (typically LPAR) */ +extern int fwnmi_active; + #endif /* __ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_FIRMWARE_H */ Index: kexec/arch/powerpc/platforms/pseries/ras.c =================================================================== --- kexec.orig/arch/powerpc/platforms/pseries/ras.c +++ kexec/arch/powerpc/platforms/pseries/ras.c @@ -49,14 +49,12 @@ #include #include #include +#include static unsigned char ras_log_buf[RTAS_ERROR_LOG_MAX]; static DEFINE_SPINLOCK(ras_log_buf_lock); -char mce_data_buf[RTAS_ERROR_LOG_MAX] -; -/* This is true if we are using the firmware NMI handler (typically LPAR) */ -extern int fwnmi_active; +char mce_data_buf[RTAS_ERROR_LOG_MAX]; static int ras_get_sensor_state_token; static int ras_check_exception_token; From michael at ellerman.id.au 
Sun Dec 4 18:39:37 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:37 Subject: [PATCH 6/11] powerpc: Reroute interrupts from 0 + offset to PHYSICAL_START + offset In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003951.3E3CC6887E@ozlabs.org> Regardless of where the kernel's linked we always get interrupts at low addresses. This patch creates a trampoline in the first 3 pages of memory, where interrupts land, and patches those addresses to jump into the real kernel code at PHYSICAL_START. We also need to reserve the trampoline code and a bit more in prom.c Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/Makefile | 1 arch/powerpc/kernel/crash_dump.c | 53 +++++++++++++++++++++++++++++++++++++++ arch/powerpc/kernel/prom.c | 6 +++- arch/powerpc/kernel/setup_64.c | 5 +++ include/asm-powerpc/kdump.h | 13 +++++++++ 5 files changed, 77 insertions(+), 1 deletion(-) Index: kexec/arch/powerpc/kernel/setup_64.c =================================================================== --- kexec.orig/arch/powerpc/kernel/setup_64.c +++ kexec/arch/powerpc/kernel/setup_64.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -260,6 +261,10 @@ void __init early_setup(unsigned long dt } ppc_md = **mach; +#ifdef CONFIG_CRASH_DUMP + kdump_setup(); +#endif + DBG("Found, Initializing memory management...\n"); /* Index: kexec/arch/powerpc/kernel/prom.c =================================================================== --- kexec.orig/arch/powerpc/kernel/prom.c +++ kexec/arch/powerpc/kernel/prom.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include #include @@ -1335,11 +1336,14 @@ void __init early_init_devtree(void *par of_scan_flat_dt(early_init_dt_scan_memory, NULL); lmb_enforce_memory_limit(memory_limit); lmb_analyze(); - lmb_reserve(0, __pa(klimit)); DBG("Phys. mem: %lx\n", lmb_phys_mem_size()); /* Reserve LMB regions used by kernel, initrd, dt, etc... 
*/ + lmb_reserve(PHYSICAL_START, __pa(klimit) - PHYSICAL_START); +#ifdef CONFIG_CRASH_DUMP + lmb_reserve(0, KDUMP_RESERVE_LIMIT); +#endif early_reserve_mem(); DBG("Scanning CPUs ...\n"); Index: kexec/include/asm-powerpc/kdump.h =================================================================== --- /dev/null +++ kexec/include/asm-powerpc/kdump.h @@ -0,0 +1,13 @@ +#ifndef _PPC64_KDUMP_H +#define _PPC64_KDUMP_H + +/* How many bytes to reserve at zero for kdump. The reserve limit should + * be greater or equal to the trampoline's end address. */ +#define KDUMP_RESERVE_LIMIT 0x8000 + +#define KDUMP_TRAMPOLINE_START 0x0100 +#define KDUMP_TRAMPOLINE_END 0x3000 + +extern void kdump_setup(void); + +#endif /* __PPC64_KDUMP_H */ Index: kexec/arch/powerpc/kernel/Makefile =================================================================== --- kexec.orig/arch/powerpc/kernel/Makefile +++ kexec/arch/powerpc/kernel/Makefile @@ -34,6 +34,7 @@ obj-$(CONFIG_IBMVIO) += vio.o obj-$(CONFIG_IBMEBUS) += ibmebus.o obj-$(CONFIG_GENERIC_TBSYNC) += smp-tbsync.o obj64-$(CONFIG_PPC_MULTIPLATFORM) += nvram_64.o +obj-$(CONFIG_CRASH_DUMP) += crash_dump.o ifeq ($(CONFIG_PPC_MERGE),y) Index: kexec/arch/powerpc/kernel/crash_dump.c =================================================================== --- /dev/null +++ kexec/arch/powerpc/kernel/crash_dump.c @@ -0,0 +1,53 @@ +/* + * Routines for doing kexec-based kdump. + * + * Copyright (C) 2005, IBM Corp. + * + * Created by: Michael Ellerman + * + * This source code is licensed under the GNU General Public License, + * Version 2. See the file COPYING for more details. + */ + +#undef DEBUG + +#include +#include +#include + +#ifdef DEBUG +#include +#define DBG(fmt...) udbg_printf(fmt) +#else +#define DBG(fmt...) +#endif + +static void __init create_trampoline(unsigned long addr) +{ + /* The maximum range of a single instruction branch, is the current + * instruction's address + (32 MB - 4) bytes. 
For the trampoline we + * need to branch to current address + 32 MB. So we insert a nop at + * the trampoline address, then the next instruction (+ 4 bytes) + * does a branch to (32 MB - 4). The net effect is that when we + * branch to "addr" we jump to ("addr" + 32 MB). Although it requires + * two instructions it doesn't require any registers. + */ + create_instruction(addr, 0x60000000); /* nop */ + create_branch(addr + 4, addr + PHYSICAL_START, 0); +} + +void __init kdump_setup(void) +{ + unsigned long i; + + DBG(" -> kdump_setup()\n"); + + for (i = KDUMP_TRAMPOLINE_START; i < KDUMP_TRAMPOLINE_END; i += 8) { + create_trampoline(i); + } + + create_trampoline(__pa(system_reset_fwnmi) - PHYSICAL_START); + create_trampoline(__pa(machine_check_fwnmi) - PHYSICAL_START); + + DBG(" <- kdump_setup()\n"); +} From michael at ellerman.id.au Sun Dec 4 18:39:40 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:40 Subject: [PATCH 7/11] powerpc: Fixups for kernel linked at 32 MB In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003954.6E56168802@ozlabs.org> There are a few places where we need to fix things up for the kernel to work if it's linked at 32MB: - platforms/powermac/smp.c To start secondary cpus on pmac we patch the reset vector, which is fine. Except if we're above 32MB we don't have enough bits for an absolute branch, so it needs to be relative. - kernel/head_64.S - A few branches in the cpu hold code need to load the full target address and do a bctr. - after_prom_start needs to load PHYSICAL_START as the dest address, not 0. - The exception prolog needs to load the low word of the target address, not just the low halfword. - Fixup handling of the initial stab address. - kernel/setup_64.c smp_release_cpus() needs to write 1 to the spinloop flag near 0, not 32 MB.
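As an illustration only (not part of the patch), the branch-range constraint behind the smp.c change and the nop-plus-branch trampoline can be sketched in plain C. This assumes the standard PowerPC I-form branch layout (primary opcode 18, a 24-bit LI field holding a signed byte displacement shifted right by two, then the AA and LK bits). The names encode_branch and fits_li_field are hypothetical and are not the kernel's create_branch().

```c
#include <assert.h>
#include <stdint.h>

#define PPC_BRANCH_OPCODE 0x48000000u /* primary opcode 18, I-form */
#define PPC_BRANCH_AA     0x2u        /* absolute-address bit */

/*
 * The LI field holds a signed 26-bit byte value with the low two bits
 * implied zero, so it covers -32 MB .. +(32 MB - 4).
 */
int fits_li_field(int64_t value)
{
	return value >= -0x2000000 && value <= 0x1fffffc && (value & 3) == 0;
}

/*
 * Hypothetical encoder: returns the instruction word for a branch
 * located at `from` that goes to `to`, or 0 if the target cannot be
 * encoded in a single branch instruction.
 */
uint32_t encode_branch(uint64_t from, uint64_t to, int absolute)
{
	int64_t value;

	assert((from & 3) == 0 && (to & 3) == 0);

	value = absolute ? (int64_t)to : (int64_t)to - (int64_t)from;
	if (!fits_li_field(value))
		return 0;

	return PPC_BRANCH_OPCODE | ((uint32_t)value & 0x03fffffcu) |
	       (absolute ? PPC_BRANCH_AA : 0u);
}
```

Under these assumptions an absolute branch to any address at or above 32 MB is not encodable, while a relative branch between two nearby addresses above 32 MB is. A relative branch of exactly 32 MB is also out of range, but one of 32 MB - 4 issued from the following instruction slot fits, which is what the nop-plus-branch pair in create_trampoline() relies on.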
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/head_64.S | 30 ++++++++++++++++++++++++------ arch/powerpc/kernel/setup_64.c | 5 ++++- arch/powerpc/platforms/powermac/smp.c | 16 +++++++--------- include/asm-powerpc/mmu.h | 3 ++- 4 files changed, 37 insertions(+), 17 deletions(-) Index: kexec/arch/powerpc/platforms/powermac/smp.c =================================================================== --- kexec.orig/arch/powerpc/platforms/powermac/smp.c +++ kexec/arch/powerpc/platforms/powermac/smp.c @@ -753,14 +753,15 @@ static int __init smp_core99_probe(void) static void __devinit smp_core99_kick_cpu(int nr) { unsigned int save_vector; - unsigned long new_vector; - unsigned long flags; + unsigned long target, flags; volatile unsigned int *vector = ((volatile unsigned int *)(KERNELBASE+0x100)); if (nr < 0 || nr > 3) return; - if (ppc_md.progress) ppc_md.progress("smp_core99_kick_cpu", 0x346); + + if (ppc_md.progress) + ppc_md.progress("smp_core99_kick_cpu", 0x346); local_irq_save(flags); local_irq_disable(); @@ -768,14 +769,11 @@ static void __devinit smp_core99_kick_cp /* Save reset vector */ save_vector = *vector; - /* Setup fake reset vector that does + /* Setup fake reset vector that does * b __secondary_start_pmac_0 + nr*8 - KERNELBASE */ - new_vector = (unsigned long) __secondary_start_pmac_0 + nr * 8; - *vector = 0x48000002 + new_vector - KERNELBASE; - - /* flush data cache and inval instruction cache */ - flush_icache_range((unsigned long) vector, (unsigned long) vector + 4); + target = (unsigned long) __secondary_start_pmac_0 + nr * 8; + create_branch((unsigned long)vector, target, BRANCH_SET_LINK); /* Put some life in our friend */ pmac_call_feature(PMAC_FTR_RESET_CPU, NULL, nr, 0); Index: kexec/arch/powerpc/kernel/head_64.S =================================================================== --- kexec.orig/arch/powerpc/kernel/head_64.S +++ kexec/arch/powerpc/kernel/head_64.S @@ -154,11 +154,15 @@ _GLOBAL(__secondary_hold) bne 100b #ifdef CONFIG_HMT - 
b .hmt_init + LOADADDR(r4, .hmt_init) + mtctr r4 + bctr #else #ifdef CONFIG_SMP + LOADADDR(r4, .pSeries_secondary_smp_init) + mtctr r4 mr r3,r24 - b .pSeries_secondary_smp_init + bctr #else BUG_OPCODE #endif @@ -200,6 +204,20 @@ exception_marker: #define EX_R3 64 #define EX_LR 72 +/* + * We're short on space and time in the exception prolog, so we can't use + * the normal LOADADDR macro. Normally we just need the low halfword of the + * address, but for Kdump we need the whole low word. + */ +#ifdef CONFIG_CRASH_DUMP +#define LOAD_HANDLER(reg, label) \ + oris r12,r12,(label)@h; /* virt addr of handler ... */ \ + ori r12,r12,(label)@l; /* .. and the rest */ +#else +#define LOAD_HANDLER(reg, label) \ + ori r12,r12,(label)@l; /* virt addr of handler ... */ +#endif + #define EXCEPTION_PROLOG_PSERIES(area, label) \ mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \ std r9,area+EX_R9(r13); /* save r9 - r12 */ \ @@ -212,7 +230,7 @@ exception_marker: clrrdi r12,r13,32; /* get high part of &label */ \ mfmsr r10; \ mfspr r11,SPRN_SRR0; /* save SRR0 */ \ - ori r12,r12,(label)@l; /* virt addr of handler */ \ + LOAD_HANDLER(r12,label) \ ori r10,r10,MSR_IR|MSR_DR|MSR_RI; \ mtspr SPRN_SRR0,r12; \ mfspr r12,SPRN_SRR1; /* and SRR1 */ \ @@ -1348,7 +1366,7 @@ _GLOBAL(do_stab_bolted) * fixed address (the linker can't compute (u64)&initial_stab >> * PAGE_SHIFT). */ - . = STAB0_PHYS_ADDR /* 0x6000 */ + . = STAB0_OFFSET /* 0x6000 */ .globl initial_stab initial_stab: .space 4096 @@ -1553,7 +1571,7 @@ _STATIC(__boot_from_prom) _STATIC(__after_prom_start) /* - * We need to run with __start at physical address 0. + * We need to run with __start at physical address PHYSICAL_START. * This will leave some code in the first 256B of * real memory, which are reserved for software use. 
* The remainder of the first page is loaded with the fixed @@ -1568,7 +1586,7 @@ _STATIC(__after_prom_start) mr r26,r3 SET_REG_TO_CONST(r27,KERNELBASE) - li r3,0 /* target addr */ + LOADADDR(r3, PHYSICAL_START) /* target addr */ // XXX FIXME: Use phys returned by OF (r30) add r4,r27,r26 /* source addr */ Index: kexec/arch/powerpc/kernel/setup_64.c =================================================================== --- kexec.orig/arch/powerpc/kernel/setup_64.c +++ kexec/arch/powerpc/kernel/setup_64.c @@ -314,6 +314,7 @@ void early_setup_secondary(void) void smp_release_cpus(void) { extern unsigned long __secondary_hold_spinloop; + unsigned long *ptr; DBG(" -> smp_release_cpus()\n"); @@ -324,7 +325,9 @@ void smp_release_cpus(void) * This is useless but harmless on iSeries, secondaries are already * waiting on their paca spinloops. */ - __secondary_hold_spinloop = 1; + ptr = (unsigned long *)((unsigned long)&__secondary_hold_spinloop + - PHYSICAL_START); + *ptr = 1; mb(); DBG(" <- smp_release_cpus()\n"); Index: kexec/include/asm-powerpc/mmu.h =================================================================== --- kexec.orig/include/asm-powerpc/mmu.h +++ kexec/include/asm-powerpc/mmu.h @@ -33,7 +33,8 @@ /* Location of cpu0's segment table */ #define STAB0_PAGE 0x6 -#define STAB0_PHYS_ADDR (STAB0_PAGE<<12) +#define STAB0_OFFSET (STAB0_PAGE << 12) +#define STAB0_PHYS_ADDR (STAB0_OFFSET + PHYSICAL_START) #ifndef __ASSEMBLY__ extern char initial_stab[]; From michael at ellerman.id.au Sun Dec 4 18:39:43 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:43 Subject: [PATCH 8/11] powerpc: Add arch dependent basic infrastructure for Kdump. In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205003957.4CD9868887@ozlabs.org> Implementing the machine_crash_shutdown which will be called by crash_kexec (called in case of a panic, sysrq etc.). 
Disable the interrupts, shootdown cpus using debugger IPI and collect regs for all CPUs. elfcorehdr= specifies the location of elf core header stored by the crashed kernel. This command line option will be passed by the kexec-tools to capture kernel. savemaxmem= specifies the actual memory size that the first kernel has and this value will be used for dumping in the capture kernel. This command line option will be passed by the kexec-tools to capture kernel. Signed-off-by: Haren Myneni Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/Makefile | 2 arch/powerpc/kernel/crash.c | 264 ++++++++++++++++++++++++++++++++ arch/powerpc/kernel/crash_dump.c | 20 ++ arch/powerpc/kernel/machine_kexec_64.c | 13 + arch/powerpc/kernel/smp.c | 22 ++ arch/powerpc/kernel/traps.c | 17 +- arch/powerpc/platforms/cell/setup.c | 1 arch/powerpc/platforms/maple/setup.c | 1 arch/powerpc/platforms/powermac/setup.c | 1 arch/powerpc/platforms/pseries/setup.c | 1 arch/powerpc/platforms/pseries/xics.c | 2 include/asm-powerpc/kexec.h | 10 + 12 files changed, 345 insertions(+), 9 deletions(-) Index: kexec/arch/powerpc/kernel/smp.c =================================================================== --- kexec.orig/arch/powerpc/kernel/smp.c +++ kexec/arch/powerpc/kernel/smp.c @@ -75,6 +75,8 @@ void smp_call_function_interrupt(void); int smt_enabled_at_boot = 1; +static void (*crash_ipi_function_ptr)(struct pt_regs *) = NULL; + #ifdef CONFIG_MPIC int __init smp_mpic_probe(void) { @@ -123,11 +125,16 @@ void smp_message_recv(int msg, struct pt /* XXX Do we have to do this? 
*/ set_need_resched(); break; -#ifdef CONFIG_DEBUGGER case PPC_MSG_DEBUGGER_BREAK: + if (crash_ipi_function_ptr) { + crash_ipi_function_ptr(regs); + break; + } +#ifdef CONFIG_DEBUGGER debugger_ipi(regs); break; -#endif +#endif /* CONFIG_DEBUGGER */ + /* FALLTHROUGH */ default: printk("SMP %d: smp_message_recv(): unknown msg %d\n", smp_processor_id(), msg); @@ -147,6 +154,17 @@ void smp_send_debugger_break(int cpu) } #endif +#ifdef CONFIG_KEXEC +void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *)) +{ + crash_ipi_function_ptr = crash_ipi_callback; + if (crash_ipi_callback) { + mb(); + smp_ops->message_pass(MSG_ALL_BUT_SELF, PPC_MSG_DEBUGGER_BREAK); + } +} +#endif + static void stop_this_cpu(void *dummy) { local_irq_disable(); Index: kexec/arch/powerpc/kernel/crash.c =================================================================== --- /dev/null +++ kexec/arch/powerpc/kernel/crash.c @@ -0,0 +1,264 @@ +/* + * Architecture specific (PPC64) functions for kexec based crash dumps. + * + * Copyright (C) 2005, IBM Corp. + * + * Created by: Haren Myneni + * + * This source code is licensed under the GNU General Public License, + * Version 2. See the file COPYING for more details. + * + */ + +#undef DEBUG + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#ifdef DEBUG +#include +#define DBG(fmt...) udbg_printf(fmt) +#else +#define DBG(fmt...) +#endif + +/* This keeps a track of which one is crashing cpu. 
*/ +int crashing_cpu = -1; + +static u32 *append_elf_note(u32 *buf, char *name, unsigned type, void *data, + size_t data_len) +{ + struct elf_note note; + + note.n_namesz = strlen(name) + 1; + note.n_descsz = data_len; + note.n_type = type; + memcpy(buf, &note, sizeof(note)); + buf += (sizeof(note) +3)/4; + memcpy(buf, name, note.n_namesz); + buf += (note.n_namesz + 3)/4; + memcpy(buf, data, note.n_descsz); + buf += (note.n_descsz + 3)/4; + + return buf; +} + +static void final_note(u32 *buf) +{ + struct elf_note note; + + note.n_namesz = 0; + note.n_descsz = 0; + note.n_type = 0; + memcpy(buf, &note, sizeof(note)); +} + +static void crash_save_this_cpu(struct pt_regs *regs, int cpu) +{ + struct elf_prstatus prstatus; + u32 *buf; + + if ((cpu < 0) || (cpu >= NR_CPUS)) + return; + + /* Using ELF notes here is opportunistic. + * I need a well defined structure format + * for the data I pass, and I need tags + * on the data to indicate what information I have + * squirrelled away. ELF notes happen to provide + * all of that that no need to invent something new. + */ + buf = &crash_notes[cpu][0]; + memset(&prstatus, 0, sizeof(prstatus)); + prstatus.pr_pid = current->pid; + elf_core_copy_regs(&prstatus.pr_reg, regs); + buf = append_elf_note(buf, "CORE", NT_PRSTATUS, &prstatus, + sizeof(prstatus)); + final_note(buf); +} + +/* FIXME Merge this with xmon_save_regs ??
*/ +static inline void crash_get_current_regs(struct pt_regs *regs) +{ + unsigned long tmp1, tmp2; + + __asm__ __volatile__ ( + "std 0,0(%2)\n" + "std 1,8(%2)\n" + "std 2,16(%2)\n" + "std 3,24(%2)\n" + "std 4,32(%2)\n" + "std 5,40(%2)\n" + "std 6,48(%2)\n" + "std 7,56(%2)\n" + "std 8,64(%2)\n" + "std 9,72(%2)\n" + "std 10,80(%2)\n" + "std 11,88(%2)\n" + "std 12,96(%2)\n" + "std 13,104(%2)\n" + "std 14,112(%2)\n" + "std 15,120(%2)\n" + "std 16,128(%2)\n" + "std 17,136(%2)\n" + "std 18,144(%2)\n" + "std 19,152(%2)\n" + "std 20,160(%2)\n" + "std 21,168(%2)\n" + "std 22,176(%2)\n" + "std 23,184(%2)\n" + "std 24,192(%2)\n" + "std 25,200(%2)\n" + "std 26,208(%2)\n" + "std 27,216(%2)\n" + "std 28,224(%2)\n" + "std 29,232(%2)\n" + "std 30,240(%2)\n" + "std 31,248(%2)\n" + "mfmsr %0\n" + "std %0, 264(%2)\n" + "mfctr %0\n" + "std %0, 280(%2)\n" + "mflr %0\n" + "std %0, 288(%2)\n" + "bl 1f\n" + "1: mflr %1\n" + "std %1, 256(%2)\n" + "mtlr %0\n" + "mfxer %0\n" + "std %0, 296(%2)\n" + : "=&r" (tmp1), "=&r" (tmp2) + : "b" (regs)); +} + +/* We may have saved_regs from where the error came from + * or it is NULL if via a direct panic(). 
+ */ +static void crash_save_self(struct pt_regs *saved_regs) +{ + struct pt_regs regs; + int cpu; + + cpu = smp_processor_id(); + if (saved_regs) + memcpy(&regs, saved_regs, sizeof(regs)); + else + crash_get_current_regs(&regs); + crash_save_this_cpu(&regs, cpu); +} + +#ifdef CONFIG_SMP +static atomic_t waiting_for_crash_ipi; + +void crash_ipi_callback(struct pt_regs *regs) +{ + int cpu = smp_processor_id(); + + if (cpu == crashing_cpu) + return; + + if (!cpu_online(cpu)) + return; + + if (ppc_md.kexec_cpu_down) + ppc_md.kexec_cpu_down(1, 1); + + local_irq_disable(); + + crash_save_this_cpu(regs, cpu); + atomic_dec(&waiting_for_crash_ipi); + kexec_smp_wait(); + /* NOTREACHED */ +} + +static void crash_kexec_prepare_cpus(void) +{ + unsigned int msecs; + + atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); + + crash_send_ipi(crash_ipi_callback); + smp_wmb(); + + /* + * FIXME: Until we will have the way to stop other CPUSs reliabally, + * the crash CPU will send an IPI and wait for other CPUs to + * respond. If not, proceed the kexec boot even though we failed to + * capture other CPU states. + */ + msecs = 1000000; + while ((atomic_read(&waiting_for_crash_ipi) > 0) && (--msecs > 0)) { + barrier(); + mdelay(1); + } + + /* Would it be better to replace the trap vector here? */ + + /* + * FIXME: In case if we do not get all CPUs, one possibility: ask the + * user to do soft reset such that we get all. + * IPI handler is already set by the panic cpu initially. Therefore, + * all cpus could invoke this handler from die() and the panic CPU + * will call machine_kexec() directly from this handler to do + * kexec boot.
+ */ + if (atomic_read(&waiting_for_crash_ipi)) + printk(KERN_ALERT "done waiting: %d cpus not responding\n", + atomic_read(&waiting_for_crash_ipi)); + /* Leave the IPI callback set */ +} +#else +static void crash_kexec_prepare_cpus(void) +{ + /* + * move the secondarys to us so that we can copy + * the new kernel 0-0x100 safely + * + * do this if kexec in setup.c ? + */ + smp_release_cpus(); +} + +#endif + +void default_machine_crash_shutdown(struct pt_regs *regs) +{ + /* + * This function is only called after the system + * has paniced or is otherwise in a critical state. + * The minimum amount of code to allow a kexec'd kernel + * to run successfully needs to happen here. + * + * In practice this means stopping other cpus in + * an SMP system. + * The kernel is broken so disable interrupts. + */ + local_irq_disable(); + + if (ppc_md.kexec_cpu_down) + ppc_md.kexec_cpu_down(1, 0); + + /* + * Make a note of crashing cpu. Will be used in machine_kexec + * such that another IPI will not be sent. 
+ */ + crashing_cpu = smp_processor_id(); + crash_kexec_prepare_cpus(); + crash_save_self(regs); +} Index: kexec/arch/powerpc/kernel/traps.c =================================================================== --- kexec.orig/arch/powerpc/kernel/traps.c +++ kexec/arch/powerpc/kernel/traps.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include @@ -95,7 +96,7 @@ static DEFINE_SPINLOCK(die_lock); int die(const char *str, struct pt_regs *regs, long err) { - static int die_counter; + static int die_counter, crash_dump_start = 0; int nl = 0; if (debugger(regs)) @@ -156,7 +157,21 @@ int die(const char *str, struct pt_regs print_modules(); show_regs(regs); bust_spinlocks(0); + + if (!crash_dump_start && kexec_should_crash(current)) { + crash_dump_start = 1; + spin_unlock_irq(&die_lock); + crash_kexec(regs); + /* NOTREACHED */ + } spin_unlock_irq(&die_lock); + if (crash_dump_start) + /* + * Only for soft-reset: Other CPUs will be responded to an IPI + * sent by first kexec CPU. + */ + for(;;) + ; if (in_interrupt()) panic("Fatal exception in interrupt"); Index: kexec/arch/powerpc/kernel/machine_kexec_64.c =================================================================== --- kexec.orig/arch/powerpc/kernel/machine_kexec_64.c +++ kexec/arch/powerpc/kernel/machine_kexec_64.c @@ -265,11 +265,18 @@ extern NORET_TYPE void kexec_sequence(vo /* too late to fail here */ void default_machine_kexec(struct kimage *image) { - /* prepare control code if any */ - /* shutdown other cpus into our wait loop and quiesce interrupts */ - kexec_prepare_cpus(); + /* + * If the kexec boot is the normal one, need to shutdown other cpus + * into our wait loop and quiesce interrupts. + * Otherwise, in the case of crashed mode (crashing_cpu >= 0), + * stopping other CPUs and collecting their pt_regs is done before + * using debugger IPI. + */ + + if (crashing_cpu == -1) + kexec_prepare_cpus(); /* switch to a staticly allocated stack. Based on irq stack code. 
* XXX: the task struct will likely be invalid once we do the copy! Index: kexec/include/asm-powerpc/kexec.h =================================================================== --- kexec.orig/include/asm-powerpc/kexec.h +++ kexec/include/asm-powerpc/kexec.h @@ -32,6 +32,8 @@ #ifndef __ASSEMBLY__ +#ifdef CONFIG_KEXEC + #define MAX_NOTE_BYTES 1024 typedef u32 note_buf_t[MAX_NOTE_BYTES / sizeof(u32)]; @@ -41,11 +43,17 @@ extern note_buf_t crash_notes[]; extern void kexec_smp_wait(void); /* get and clear naca physid, wait for master to copy new code to 0 */ extern void __init kexec_setup(void); -#endif +extern int crashing_cpu; +extern void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *)); +#endif /* __powerpc64 __ */ struct kimage; +struct pt_regs; extern void default_machine_kexec(struct kimage *image); extern int default_machine_kexec_prepare(struct kimage *image); +extern void default_machine_crash_shutdown(struct pt_regs *regs); + +#endif /* !CONFIG_KEXEC */ #endif /* ! __ASSEMBLY__ */ #endif /* _ASM_POWERPC_KEXEC_H */ Index: kexec/arch/powerpc/platforms/pseries/xics.c =================================================================== --- kexec.orig/arch/powerpc/platforms/pseries/xics.c +++ kexec/arch/powerpc/platforms/pseries/xics.c @@ -417,7 +417,7 @@ irqreturn_t xics_ipi_action(int irq, voi smp_message_recv(PPC_MSG_MIGRATE_TASK, regs); } #endif -#ifdef CONFIG_DEBUGGER +#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC) if (test_and_clear_bit(PPC_MSG_DEBUGGER_BREAK, &xics_ipi_message[cpu].value)) { mb(); Index: kexec/arch/powerpc/platforms/cell/setup.c =================================================================== --- kexec.orig/arch/powerpc/platforms/cell/setup.c +++ kexec/arch/powerpc/platforms/cell/setup.c @@ -217,5 +217,6 @@ struct machdep_calls __initdata cell_md #ifdef CONFIG_KEXEC .machine_kexec = default_machine_kexec, .machine_kexec_prepare = default_machine_kexec_prepare, + .machine_crash_shutdown = default_machine_crash_shutdown, 
#endif }; Index: kexec/arch/powerpc/platforms/maple/setup.c =================================================================== --- kexec.orig/arch/powerpc/platforms/maple/setup.c +++ kexec/arch/powerpc/platforms/maple/setup.c @@ -282,5 +282,6 @@ struct machdep_calls __initdata maple_md #ifdef CONFIG_KEXEC .machine_kexec = default_machine_kexec, .machine_kexec_prepare = default_machine_kexec_prepare, + .machine_crash_shutdown = default_machine_crash_shutdown, #endif }; Index: kexec/arch/powerpc/platforms/powermac/setup.c =================================================================== --- kexec.orig/arch/powerpc/platforms/powermac/setup.c +++ kexec/arch/powerpc/platforms/powermac/setup.c @@ -771,6 +771,7 @@ struct machdep_calls __initdata pmac_md #ifdef CONFIG_KEXEC .machine_kexec = default_machine_kexec, .machine_kexec_prepare = default_machine_kexec_prepare, + .machine_crash_shutdown = default_machine_crash_shutdown, #endif #endif /* CONFIG_PPC64 */ #ifdef CONFIG_PPC32 Index: kexec/arch/powerpc/platforms/pseries/setup.c =================================================================== --- kexec.orig/arch/powerpc/platforms/pseries/setup.c +++ kexec/arch/powerpc/platforms/pseries/setup.c @@ -629,5 +629,6 @@ struct machdep_calls __initdata pSeries_ .kexec_cpu_down = pseries_kexec_cpu_down, .machine_kexec = default_machine_kexec, .machine_kexec_prepare = default_machine_kexec_prepare, + .machine_crash_shutdown = default_machine_crash_shutdown, #endif }; Index: kexec/arch/powerpc/kernel/crash_dump.c =================================================================== --- kexec.orig/arch/powerpc/kernel/crash_dump.c +++ kexec/arch/powerpc/kernel/crash_dump.c @@ -11,6 +11,8 @@ #undef DEBUG +#include +#include #include #include #include @@ -51,3 +53,21 @@ void __init kdump_setup(void) DBG(" <- kdump_setup()\n"); } + +static int __init parse_elfcorehdr(char *p) +{ + if (p) + elfcorehdr_addr = memparse(p, &p); + + return 0; +} +__setup("elfcorehdr=", parse_elfcorehdr); 
+ +static int __init parse_savemaxmem(char *p) +{ + if (p) + saved_max_pfn = (memparse(p, &p) >> PAGE_SHIFT) - 1; + + return 0; +} +__setup("savemaxmem=", parse_savemaxmem); Index: kexec/arch/powerpc/kernel/Makefile =================================================================== --- kexec.orig/arch/powerpc/kernel/Makefile +++ kexec/arch/powerpc/kernel/Makefile @@ -66,7 +66,7 @@ pci64-$(CONFIG_PPC64) += pci_64.o pci_d obj-$(CONFIG_PCI) += $(pci64-y) kexec-$(CONFIG_PPC64) := machine_kexec_64.o kexec-$(CONFIG_PPC32) := machine_kexec_32.o -obj-$(CONFIG_KEXEC) += machine_kexec.o $(kexec-y) +obj-$(CONFIG_KEXEC) += machine_kexec.o crash.o $(kexec-y) ifeq ($(CONFIG_PPC_ISERIES),y) $(obj)/head_64.o: $(obj)/lparmap.s From michael at ellerman.id.au Sun Dec 4 18:39:48 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:48 Subject: [PATCH 9/11] powerpc: Parse crashkernel= parameter in first kernel In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205004002.7A01B68889@ozlabs.org> This patch adds code to parse and setup the crash kernel resource in the first kernel. PPC64 ignores the @x part, we always run at 32 MB. 
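As an illustration only (not part of the patch), the parsing behaviour described above can be sketched as a userspace C analogue: read the size after "crashkernel=", note whether it is a multiple of 16 MB, and ignore any "@addr" suffix in favour of a fixed 32 MB base. The struct and function names are hypothetical, and parse_size() is only a simplified stand-in for the kernel's memparse() / prom_memparse(), which are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CRASHK_BASE  0x2000000ULL /* kdump kernel always runs at 32 MB */
#define CRASHK_ALIGN 0x1000000ULL /* sizes are expected to be 16 MB aligned */

/* Hypothetical result record for this sketch. */
struct crashk_region {
	unsigned long long base;
	unsigned long long size;
	int size_aligned; /* non-zero if size is a multiple of 16 MB */
	int addr_ignored; /* non-zero if an "@addr" suffix was present */
};

/* Simplified stand-in for memparse(): a number with optional K/M/G suffix. */
unsigned long long parse_size(const char *s, const char **end)
{
	char *e;
	unsigned long long v = strtoull(s, &e, 0);

	switch (*e) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		e++;
		break;
	default:
		break;
	}
	*end = e;
	return v;
}

/* Returns 1 and fills *r if "crashkernel=" is found in cmdline, else 0. */
int parse_crashkernel(const char *cmdline, struct crashk_region *r)
{
	const char *opt = strstr(cmdline, "crashkernel=");

	if (!opt)
		return 0;

	opt += strlen("crashkernel=");
	r->size = parse_size(opt, &opt);
	r->size_aligned = (r->size % CRASHK_ALIGN) == 0;
	r->addr_ignored = (*opt == '@'); /* the address part is not parsed */
	r->base = CRASHK_BASE;
	return 1;
}
```

With this sketch, a command line containing "crashkernel=64M@16M" yields a 64 MB region based at 32 MB with the "@16M" part flagged as ignored, while "crashkernel=100M" is accepted but flagged as not 16 MB aligned.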
Signed-off-by: Haren Myneni Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/prom.c | 11 ++++++++ arch/powerpc/kernel/prom_init.c | 53 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) Index: kexec/arch/powerpc/kernel/prom_init.c =================================================================== --- kexec.orig/arch/powerpc/kernel/prom_init.c +++ kexec/arch/powerpc/kernel/prom_init.c @@ -192,6 +192,11 @@ static unsigned long __initdata alloc_bo static unsigned long __initdata rmo_top; static unsigned long __initdata ram_top; +#ifdef CONFIG_KEXEC +static unsigned long __initdata prom_crashk_base; +static unsigned long __initdata prom_crashk_size; +#endif + static struct mem_map_entry __initdata mem_reserve_map[MEM_RESERVE_MAP_SIZE]; static int __initdata mem_reserve_cnt; @@ -590,6 +595,34 @@ static void __init early_cmdline_parse(v RELOC(prom_memory_limit) = ALIGN(RELOC(prom_memory_limit), 0x1000000); #endif } + +#ifdef CONFIG_KEXEC + /* + * crashkernel=size at addr specifies the location to reserve for + * crash kernel. + */ + opt = strstr(RELOC(prom_cmd_line), RELOC("crashkernel=")); + if (opt) { + opt += 12; + RELOC(prom_crashk_size) = prom_memparse(opt, &opt); + + if (ALIGN(RELOC(prom_crashk_size), 0x1000000) != + RELOC(prom_crashk_size)) { + prom_printf("Warning: crashkernel size is not " + "aligned to 16MB\n"); + } + + /* + * At present, the crash kernel always run at 32MB. + * Just ignore whatever user passed. 
+ */ + RELOC(prom_crashk_base) = 0x2000000; + if (*opt == '@') { + prom_printf("Warning: PPC64 kdump kernel always runs " + "at 32 MB\n"); + } + } +#endif } #ifdef CONFIG_PPC_PSERIES @@ -1011,6 +1044,12 @@ static void __init prom_init_mem(void) prom_printf(" alloc_top_hi : %x\n", RELOC(alloc_top_high)); prom_printf(" rmo_top : %x\n", RELOC(rmo_top)); prom_printf(" ram_top : %x\n", RELOC(ram_top)); +#ifdef CONFIG_KEXEC + if (RELOC(prom_crashk_base)) { + prom_printf(" crashk_base : %x\n", RELOC(prom_crashk_base)); + prom_printf(" crashk_size : %x\n", RELOC(prom_crashk_size)); + } +#endif } @@ -2094,6 +2133,10 @@ unsigned long __init prom_init(unsigned */ prom_init_mem(); +#ifdef CONFIG_KEXEC + if (RELOC(prom_crashk_base)) + reserve_mem(RELOC(prom_crashk_base), RELOC(prom_crashk_size)); +#endif /* * Determine which cpu is actually running right _now_ */ @@ -2150,6 +2193,16 @@ unsigned long __init prom_init(unsigned } #endif +#ifdef CONFIG_KEXEC + if (RELOC(prom_crashk_base)) { + prom_setprop(_prom->chosen, "/chosen", "linux,crashkernel-base", + PTRRELOC(&prom_crashk_base), + sizeof(RELOC(prom_crashk_base))); + prom_setprop(_prom->chosen, "/chosen", "linux,crashkernel-size", + PTRRELOC(&prom_crashk_size), + sizeof(RELOC(prom_crashk_size))); + } +#endif /* * Fixup any known bugs in the device-tree */ Index: kexec/arch/powerpc/kernel/prom.c =================================================================== --- kexec.orig/arch/powerpc/kernel/prom.c +++ kexec/arch/powerpc/kernel/prom.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include @@ -1198,6 +1199,16 @@ static int __init early_init_dt_scan_cho } #endif /* CONFIG_PPC_RTAS */ +#ifdef CONFIG_KEXEC + lprop = (u64*)of_get_flat_dt_prop(node, "linux,crashkernel-base", NULL); + if (lprop) + crashk_res.start = *lprop; + + lprop = (u64*)of_get_flat_dt_prop(node, "linux,crashkernel-size", NULL); + if (lprop) + crashk_res.end = crashk_res.start + *lprop - 1; +#endif + /* break now */ return 1; } From 
michael at ellerman.id.au Sun Dec 4 18:39:51 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:51 Subject: [PATCH 10/11] powerpc: Add arch-dependant copy_oldmem_page In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205004006.1FEB26887B@ozlabs.org> Add arch-dependant copy_oldmem_page. Signed-off-by: Haren Myneni Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/crash_dump.c | 36 ++++++++++++++++++++++++++++++++++++ include/asm-powerpc/kexec.h | 2 ++ kernel/crash_dump.c | 3 +++ 3 files changed, 41 insertions(+) Index: kexec/arch/powerpc/kernel/crash_dump.c =================================================================== --- kexec.orig/arch/powerpc/kernel/crash_dump.c +++ kexec/arch/powerpc/kernel/crash_dump.c @@ -16,6 +16,7 @@ #include #include #include +#include #ifdef DEBUG #include @@ -71,3 +72,38 @@ static int __init parse_savemaxmem(char return 0; } __setup("savemaxmem=", parse_savemaxmem); + +/* + * copy_oldmem_page - copy one page from "oldmem" + * @pfn: page frame number to be copied + * @buf: target memory address for the copy; this can be in kernel address + * space or user address space (see @userbuf) + * @csize: number of bytes to copy + * @offset: offset in bytes into the page (based on pfn) to begin the copy + * @userbuf: if set, @buf is in user address space, use copy_to_user(), + * otherwise @buf is in kernel address space, use memcpy(). + * + * Copy a page from "oldmem". For this page, there is no pte mapped + * in the current kernel. We stitch up a pte, similar to kmap_atomic. 
+ */ +ssize_t copy_oldmem_page(unsigned long pfn, char *buf, + size_t csize, unsigned long offset, int userbuf) +{ + void *vaddr; + + if (!csize) + return 0; + + vaddr = __ioremap(pfn << PAGE_SHIFT, PAGE_SIZE, 0); + + if (userbuf) { + if (copy_to_user((char __user *)buf, (vaddr + offset), csize)) { + iounmap(vaddr); + return -EFAULT; + } + } else + memcpy(buf, (vaddr + offset), csize); + + iounmap(vaddr); + return csize; +} Index: kexec/include/asm-powerpc/kexec.h =================================================================== --- kexec.orig/include/asm-powerpc/kexec.h +++ kexec/include/asm-powerpc/kexec.h @@ -30,6 +30,8 @@ #define KEXEC_ARCH KEXEC_ARCH_PPC #endif +#define HAVE_ARCH_COPY_OLDMEM_PAGE + #ifndef __ASSEMBLY__ #ifdef CONFIG_KEXEC Index: kexec/kernel/crash_dump.c =================================================================== --- kexec.orig/kernel/crash_dump.c +++ kexec/kernel/crash_dump.c @@ -14,10 +14,12 @@ #include #include +#include /* Stores the physical address of elf header of crash image. */ unsigned long long elfcorehdr_addr = ELFCORE_ADDR_MAX; +#ifndef HAVE_ARCH_COPY_OLDMEM_PAGE /** * copy_oldmem_page - copy one page from "oldmem" * @pfn: page frame number to be copied @@ -59,3 +61,4 @@ ssize_t copy_oldmem_page(unsigned long p kfree(page); return csize; } +#endif From michael at ellerman.id.au Sun Dec 4 18:39:55 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 18:39:55 Subject: [PATCH 11/11] powerpc: Add support for "linux, usable-memory" on memory nodes In-Reply-To: <1133743149.268607.418162138937.qpush@concordia> Message-ID: <20051205004009.D5BCE68865@ozlabs.org> Milton has proposed that we should support a "linux,usable-memory" property on memory nodes which describes, in preference to "reg", the regions of memory Linux should use. This facility is required for kdump, to inform the second kernel which memory it should use. 
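As a rough illustration of the lookup order described above (prefer "linux,usable-memory", fall back to "reg"), here is a minimal user-space sketch. It is not the patch's code: struct prop, struct node, get_prop() and memory_ranges() are hypothetical stand-ins for the kernel's device-tree accessors, assumed only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for a device-tree node and its properties. */
struct prop {
	const char *name;
	const unsigned int *cells;
	int len;			/* property length in bytes */
};

struct node {
	const struct prop *props;
	size_t nprops;
};

/* Hypothetical lookup helper: returns the property data or NULL if absent. */
static const unsigned int *get_prop(const struct node *n, const char *name,
				    int *len)
{
	assert(n != NULL && name != NULL);

	for (size_t i = 0; i < n->nprops; i++) {
		if (strcmp(n->props[i].name, name) == 0) {
			if (len)
				*len = n->props[i].len;
			return n->props[i].cells;
		}
	}
	return NULL;
}

/*
 * Pick the ranges a kernel should use for a memory node:
 * "linux,usable-memory" wins when present, otherwise fall back to "reg".
 */
static const unsigned int *memory_ranges(const struct node *n, int *len)
{
	const unsigned int *cells = get_prop(n, "linux,usable-memory", len);

	if (cells == NULL)
		cells = get_prop(n, "reg", len);
	return cells;
}
```

With this shape, a node carrying both properties yields the "linux,usable-memory" ranges, a node carrying only "reg" behaves as before, and a node with neither yields NULL so the caller can skip it.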
Signed-off-by: Haren Myneni Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/prom.c | 9 +++++++-- arch/powerpc/mm/numa.c | 7 ++++++- 2 files changed, 13 insertions(+), 3 deletions(-) Index: kexec/arch/powerpc/kernel/prom.c =================================================================== --- kexec.orig/arch/powerpc/kernel/prom.c +++ kexec/arch/powerpc/kernel/prom.c @@ -567,7 +567,10 @@ static int __init interpret_root_props(s unsigned int *rp; int rpsize = (naddrc + nsizec) * sizeof(unsigned int); - rp = (unsigned int *) get_property(np, "reg", &l); + rp = (unsigned int *) get_property(np, "linux,usable-memory", &l); + if (rp == NULL) + rp = (unsigned int *) get_property(np, "reg", &l); + if (rp != 0 && l >= rpsize) { i = 0; adr = (struct address_range *) (*mem_start); @@ -1275,7 +1278,9 @@ static int __init early_init_dt_scan_mem } else if (strcmp(type, "memory") != 0) return 0; - reg = (cell_t *)of_get_flat_dt_prop(node, "reg", &l); + reg = (cell_t *)of_get_flat_dt_prop(node, "linux,usable-memory", &l); + if (reg == NULL) + reg = (cell_t *)of_get_flat_dt_prop(node, "reg", &l); if (reg == NULL) return 0; Index: kexec/arch/powerpc/mm/numa.c =================================================================== --- kexec.orig/arch/powerpc/mm/numa.c +++ kexec/arch/powerpc/mm/numa.c @@ -423,7 +423,12 @@ static int __init parse_numa_properties( unsigned int *memcell_buf; unsigned int len; - memcell_buf = (unsigned int *)get_property(memory, "reg", &len); + memcell_buf = (unsigned int *)get_property(memory, + "linux,usable-memory", &len); + if (!memcell_buf || len <= 0) + memcell_buf = + (unsigned int *)get_property(memory, "reg", + &len); if (!memcell_buf || len <= 0) continue; From paulus at samba.org Mon Dec 5 15:06:13 2005 From: paulus at samba.org (Paul Mackerras) Date: Mon, 5 Dec 2005 15:06:13 +1100 Subject: compilation error for CONFIG_SMP=n In-Reply-To: <4390CC85.8030808@us.ibm.com> References: <4390CC85.8030808@us.ibm.com> Message-ID: 
<17299.48309.788379.53779@cargo.ozlabs.ibm.com> Haren Myneni writes: > Getting undeclared symbol `H_SET_ASR' for CONFIG_SMP=n. There weren't actually any released pSeries machines that had a hypervisor and a segment table, so I will just take out the code that calls H_SET_ASR instead. Paul. From michael at ellerman.id.au Sun Dec 4 23:07:02 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Sun, 04 Dec 2005 23:07:02 Subject: [PATCH] powerpc: Separate usage of KERNELBASE and PAGE_OFFSET In-Reply-To: <20051205003934.643E26887C@ozlabs.org> Message-ID: <20051205050717.ED89768863@ozlabs.org> This patch separates usage of KERNELBASE and PAGE_OFFSET. I haven't looked at any of the PPC code; if we ever want to support Kdump on PPC we'll have to do another audit. Ditto for iSeries. This patch makes PAGE_OFFSET the constant; it'll always be 0xC * 1 gazillion. To get a physical address from a virtual one you subtract PAGE_OFFSET, _not_ KERNELBASE. KERNELBASE is the virtual address of the start of the kernel; it's often the same as PAGE_OFFSET, but _might not be_. If you want to know something's offset from the start of the kernel you should subtract KERNELBASE.
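To make the distinction concrete, here is a small user-space sketch of the two kinds of arithmetic. The values are assumptions for illustration only: PAGE_OFFSET is taken as the 64-bit 0xC000000000000000, and KERNELBASE is set 32 MB above it purely as a hypothetical case where the two differ. The helper names are invented for this example and are not the kernel's __va()/__pa().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define PAGE_OFFSET 0xC000000000000000ULL	 /* start of the linear mapping */
#define KERNELBASE  (PAGE_OFFSET + 0x2000000ULL) /* hypothetical: kernel image at 32 MB */

/* Virtual to physical: always relative to PAGE_OFFSET. */
static inline uint64_t virt_to_phys_sketch(uint64_t va)
{
	assert(va >= PAGE_OFFSET);
	return va - PAGE_OFFSET;
}

/* Physical to virtual: likewise relative to PAGE_OFFSET. */
static inline uint64_t phys_to_virt_sketch(uint64_t pa)
{
	return pa + PAGE_OFFSET;
}

/* Offset of an address within the kernel image: relative to KERNELBASE. */
static inline uint64_t kernel_image_offset(uint64_t va)
{
	assert(va >= KERNELBASE);
	return va - KERNELBASE;
}
```

Under these assumed values the start of the kernel image has physical address 0x2000000 but image offset 0, which is why subtracting KERNELBASE to obtain a physical address would give the wrong answer once the two constants differ.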
Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/btext.c | 4 ++-- arch/powerpc/kernel/entry_64.S | 4 ++-- arch/powerpc/kernel/lparmap.c | 6 +++--- arch/powerpc/kernel/machine_kexec_64.c | 5 ++--- arch/powerpc/mm/hash_utils_64.c | 6 +++--- arch/powerpc/mm/slb.c | 4 ++-- arch/powerpc/mm/slb_low.S | 6 +++--- arch/powerpc/mm/stab.c | 10 +++++----- include/asm-powerpc/page.h | 2 +- 9 files changed, 23 insertions(+), 24 deletions(-) Index: kexec/arch/powerpc/mm/stab.c =================================================================== --- kexec.orig/arch/powerpc/mm/stab.c +++ kexec/arch/powerpc/mm/stab.c @@ -40,7 +40,7 @@ static int make_ste(unsigned long stab, unsigned long entry, group, old_esid, castout_entry, i; unsigned int global_entry; struct stab_entry *ste, *castout_ste; - unsigned long kernel_segment = (esid << SID_SHIFT) >= KERNELBASE; + unsigned long kernel_segment = (esid << SID_SHIFT) >= PAGE_OFFSET; vsid_data = vsid << STE_VSID_SHIFT; esid_data = esid << SID_SHIFT | STE_ESID_KP | STE_ESID_V; @@ -83,7 +83,7 @@ static int make_ste(unsigned long stab, } /* Dont cast out the first kernel segment */ - if ((castout_ste->esid_data & ESID_MASK) != KERNELBASE) + if ((castout_ste->esid_data & ESID_MASK) != PAGE_OFFSET) break; castout_entry = (castout_entry + 1) & 0xf; @@ -251,7 +251,7 @@ void stabs_alloc(void) panic("Unable to allocate segment table for CPU %d.\n", cpu); - newstab += KERNELBASE; + newstab = (unsigned long)__va(newstab); memset((void *)newstab, 0, HW_PAGE_SIZE); @@ -270,11 +270,11 @@ void stabs_alloc(void) */ void stab_initialize(unsigned long stab) { - unsigned long vsid = get_kernel_vsid(KERNELBASE); + unsigned long vsid = get_kernel_vsid(PAGE_OFFSET); unsigned long stabreal; asm volatile("isync; slbia; isync":::"memory"); - make_ste(stab, GET_ESID(KERNELBASE), vsid); + make_ste(stab, GET_ESID(PAGE_OFFSET), vsid); /* Order update */ asm volatile("sync":::"memory"); Index: kexec/arch/powerpc/kernel/machine_kexec_64.c 
=================================================================== --- kexec.orig/arch/powerpc/kernel/machine_kexec_64.c +++ kexec/arch/powerpc/kernel/machine_kexec_64.c @@ -153,9 +153,8 @@ void kexec_copy_flush(struct kimage *ima * including ones that were in place on the original copy */ for (i = 0; i < nr_segments; i++) - flush_icache_range(ranges[i].mem + KERNELBASE, - ranges[i].mem + KERNELBASE + - ranges[i].memsz); + flush_icache_range((unsigned long)__va(ranges[i].mem), + (unsigned long)__va(ranges[i].mem + ranges[i].memsz)); } #ifdef CONFIG_SMP Index: kexec/arch/powerpc/mm/hash_utils_64.c =================================================================== --- kexec.orig/arch/powerpc/mm/hash_utils_64.c +++ kexec/arch/powerpc/mm/hash_utils_64.c @@ -456,7 +456,7 @@ void __init htab_initialize(void) /* create bolted the linear mapping in the hash table */ for (i=0; i < lmb.memory.cnt; i++) { - base = lmb.memory.region[i].base + KERNELBASE; + base = (unsigned long)__va(lmb.memory.region[i].base); size = lmb.memory.region[i].size; DBG("creating mapping for region: %lx : %lx\n", base, size); @@ -498,8 +498,8 @@ void __init htab_initialize(void) * for either 4K or 16MB pages. 
*/ if (tce_alloc_start) { - tce_alloc_start += KERNELBASE; - tce_alloc_end += KERNELBASE; + tce_alloc_start = (unsigned long)__va(tce_alloc_start); + tce_alloc_end = (unsigned long)__va(tce_alloc_end); if (base + size >= tce_alloc_start) tce_alloc_start = base + size + 1; Index: kexec/arch/powerpc/mm/slb.c =================================================================== --- kexec.orig/arch/powerpc/mm/slb.c +++ kexec/arch/powerpc/mm/slb.c @@ -75,7 +75,7 @@ static void slb_flush_and_rebolt(void) vflags = SLB_VSID_KERNEL | virtual_llp; ksp_esid_data = mk_esid_data(get_paca()->kstack, 2); - if ((ksp_esid_data & ESID_MASK) == KERNELBASE) + if ((ksp_esid_data & ESID_MASK) == PAGE_OFFSET) ksp_esid_data &= ~SLB_ESID_V; /* We need to do this all in asm, so we're sure we don't touch @@ -213,7 +213,7 @@ void slb_initialize(void) asm volatile("isync":::"memory"); asm volatile("slbmte %0,%0"::"r" (0) : "memory"); asm volatile("isync; slbia; isync":::"memory"); - create_slbe(KERNELBASE, lflags, 0); + create_slbe(PAGE_OFFSET, lflags, 0); /* VMALLOC space has 4K pages always for now */ create_slbe(VMALLOCBASE, vflags, 1); Index: kexec/arch/powerpc/kernel/entry_64.S =================================================================== --- kexec.orig/arch/powerpc/kernel/entry_64.S +++ kexec/arch/powerpc/kernel/entry_64.S @@ -690,7 +690,7 @@ _GLOBAL(enter_rtas) /* Setup our real return addr */ SET_REG_TO_LABEL(r4,.rtas_return_loc) - SET_REG_TO_CONST(r9,KERNELBASE) + SET_REG_TO_CONST(r9,PAGE_OFFSET) sub r4,r4,r9 mtlr r4 @@ -718,7 +718,7 @@ _GLOBAL(enter_rtas) _STATIC(rtas_return_loc) /* relocation is off at this point */ mfspr r4,SPRN_SPRG3 /* Get PACA */ - SET_REG_TO_CONST(r5, KERNELBASE) + SET_REG_TO_CONST(r5, PAGE_OFFSET) sub r4,r4,r5 /* RELOC the PACA base pointer */ mfmsr r6 Index: kexec/arch/powerpc/mm/slb_low.S =================================================================== --- kexec.orig/arch/powerpc/mm/slb_low.S +++ kexec/arch/powerpc/mm/slb_low.S @@ -37,9 +37,9 @@ 
_GLOBAL(slb_allocate_realmode) srdi r9,r3,60 /* get region */ srdi r10,r3,28 /* get esid */ - cmpldi cr7,r9,0xc /* cmp KERNELBASE for later use */ + cmpldi cr7,r9,0xc /* cmp PAGE_OFFSET for later use */ - /* r3 = address, r10 = esid, cr7 = <>KERNELBASE */ + /* r3 = address, r10 = esid, cr7 = <> PAGE_OFFSET */ blt cr7,0f /* user or kernel? */ /* kernel address: proto-VSID = ESID */ @@ -166,7 +166,7 @@ _GLOBAL(slb_allocate_user) /* * Finish loading of an SLB entry and return * - * r3 = EA, r10 = proto-VSID, r11 = flags, clobbers r9, cr7 = <>KERNELBASE + * r3 = EA, r10 = proto-VSID, r11 = flags, clobbers r9, cr7 = <> PAGE_OFFSET */ slb_finish_load: ASM_VSID_SCRAMBLE(r10,r9) Index: kexec/arch/powerpc/kernel/lparmap.c =================================================================== --- kexec.orig/arch/powerpc/kernel/lparmap.c +++ kexec/arch/powerpc/kernel/lparmap.c @@ -16,8 +16,8 @@ const struct LparMap __attribute__((__se .xSegmentTableOffs = STAB0_PAGE, .xEsids = { - { .xKernelEsid = GET_ESID(KERNELBASE), - .xKernelVsid = KERNEL_VSID(KERNELBASE), }, + { .xKernelEsid = GET_ESID(PAGE_OFFSET), + .xKernelVsid = KERNEL_VSID(PAGE_OFFSET), }, { .xKernelEsid = GET_ESID(VMALLOCBASE), .xKernelVsid = KERNEL_VSID(VMALLOCBASE), }, }, @@ -25,7 +25,7 @@ const struct LparMap __attribute__((__se .xRanges = { { .xPages = HvPagesToMap, .xOffset = 0, - .xVPN = KERNEL_VSID(KERNELBASE) << (SID_SHIFT - HW_PAGE_SHIFT), + .xVPN = KERNEL_VSID(PAGE_OFFSET) << (SID_SHIFT - HW_PAGE_SHIFT), }, }, }; Index: kexec/include/asm-powerpc/page.h =================================================================== --- kexec.orig/include/asm-powerpc/page.h +++ kexec/include/asm-powerpc/page.h @@ -56,7 +56,7 @@ #define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT) #define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT) -#define __va(x) ((void *)((unsigned long)(x) + KERNELBASE)) +#define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET)) #define __pa(x) ((unsigned long)(x) - PAGE_OFFSET) /* 
Index: kexec/arch/powerpc/kernel/btext.c =================================================================== --- kexec.orig/arch/powerpc/kernel/btext.c +++ kexec/arch/powerpc/kernel/btext.c @@ -60,7 +60,7 @@ int force_printk_to_btext = 0; * * The display is mapped to virtual address 0xD0000000, rather * than 1:1, because some some CHRP machines put the frame buffer - * in the region starting at 0xC0000000 (KERNELBASE). + * in the region starting at 0xC0000000 (PAGE_OFFSET). * This mapping is temporary and will disappear as soon as the * setup done by MMU_Init() is applied. * @@ -71,7 +71,7 @@ int force_printk_to_btext = 0; */ void __init btext_prepare_BAT(void) { - unsigned long vaddr = KERNELBASE + 0x10000000; + unsigned long vaddr = PAGE_OFFSET + 0x10000000; unsigned long addr; unsigned long lowbits; From prenuka at gmail.com Mon Dec 5 19:35:22 2005 From: prenuka at gmail.com (Renuka Pampana) Date: Mon, 5 Dec 2005 14:05:22 +0530 Subject: Linuxppc64-dev Digest, Vol 16, Issue 11 In-Reply-To: <20051205010004.52D1568876@ozlabs.org> References: <20051205010004.52D1568876@ozlabs.org> Message-ID: <9b23fc710512050035i117c7bd7y75a01f487dc74654@mail.gmail.com> Hi, Where can I get the PPC440ep (yosemite) patch for a 64-bit kernel? Can you give me some pointers? Thank you in advance, Renuka On 12/5/05, linuxppc64-dev-request at ozlabs.org wrote: > Send Linuxppc64-dev mailing list submissions to > linuxppc64-dev at ozlabs.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://ozlabs.org/mailman/listinfo/linuxppc64-dev > or, via email, send a message with subject or body 'help' to > linuxppc64-dev-request at ozlabs.org > > You can reach the person managing the list at > linuxppc64-dev-owner at ozlabs.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Linuxppc64-dev digest..." > > > Today's Topics: > > 1. [PATCH 8/11] powerpc: Add arch dependent basic infrastructure > for Kdump. (Michael Ellerman) > 2.
[PATCH 9/11] powerpc: Parse crashkernel= parameter in first > kernel (Michael Ellerman) > 3. [PATCH 10/11] powerpc: Add arch-dependant copy_oldmem_page > (Michael Ellerman) > 4. [PATCH 11/11] powerpc: Add support for "linux, usable-memory" > on memory nodes (Michael Ellerman) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sun, 04 Dec 2005 18:39:43 > From: Michael Ellerman > Subject: [PATCH 8/11] powerpc: Add arch dependent basic infrastructure > for Kdump. > To: , Paul Mackerras > Message-ID: <20051205003957.4CD9868887 at ozlabs.org> > > Implementing the machine_crash_shutdown which will be called by > crash_kexec (called in case of a panic, sysrq etc.). Disable the > interrupts, shootdown cpus using debugger IPI and collect regs > for all CPUs. > > elfcorehdr= specifies the location of elf core header stored by > the crashed kernel. This command line option will be passed by > the kexec-tools to capture kernel. > > savemaxmem= specifies the actual memory size that the first kernel > has and this value will be used for dumping in the capture kernel. > This command line option will be passed by the kexec-tools to > capture kernel. 
> > Signed-off-by: Haren Myneni > Signed-off-by: Michael Ellerman > --- > > arch/powerpc/kernel/Makefile | 2 > arch/powerpc/kernel/crash.c | 264 ++++++++++++++++++++++++++++++++ > arch/powerpc/kernel/crash_dump.c | 20 ++ > arch/powerpc/kernel/machine_kexec_64.c | 13 + > arch/powerpc/kernel/smp.c | 22 ++ > arch/powerpc/kernel/traps.c | 17 +- > arch/powerpc/platforms/cell/setup.c | 1 > arch/powerpc/platforms/maple/setup.c | 1 > arch/powerpc/platforms/powermac/setup.c | 1 > arch/powerpc/platforms/pseries/setup.c | 1 > arch/powerpc/platforms/pseries/xics.c | 2 > include/asm-powerpc/kexec.h | 10 + > 12 files changed, 345 insertions(+), 9 deletions(-) > > Index: kexec/arch/powerpc/kernel/smp.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/smp.c > +++ kexec/arch/powerpc/kernel/smp.c > @@ -75,6 +75,8 @@ void smp_call_function_interrupt(void); > > int smt_enabled_at_boot = 1; > > +static void (*crash_ipi_function_ptr)(struct pt_regs *) = NULL; > + > #ifdef CONFIG_MPIC > int __init smp_mpic_probe(void) > { > @@ -123,11 +125,16 @@ void smp_message_recv(int msg, struct pt > /* XXX Do we have to do this? 
*/ > set_need_resched(); > break; > -#ifdef CONFIG_DEBUGGER > case PPC_MSG_DEBUGGER_BREAK: > + if (crash_ipi_function_ptr) { > + crash_ipi_function_ptr(regs); > + break; > + } > +#ifdef CONFIG_DEBUGGER > debugger_ipi(regs); > break; > -#endif > +#endif /* CONFIG_DEBUGGER */ > + /* FALLTHROUGH */ > default: > printk("SMP %d: smp_message_recv(): unknown msg %d\n", > smp_processor_id(), msg); > @@ -147,6 +154,17 @@ void smp_send_debugger_break(int cpu) > } > #endif > > +#ifdef CONFIG_KEXEC > +void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *)) > +{ > + crash_ipi_function_ptr = crash_ipi_callback; > + if (crash_ipi_callback) { > + mb(); > + smp_ops->message_pass(MSG_ALL_BUT_SELF, PPC_MSG_DEBUGGER_BREAK); > + } > +} > +#endif > + > static void stop_this_cpu(void *dummy) > { > local_irq_disable(); > Index: kexec/arch/powerpc/kernel/crash.c > =================================================================== > --- /dev/null > +++ kexec/arch/powerpc/kernel/crash.c > @@ -0,0 +1,264 @@ > +/* > + * Architecture specific (PPC64) functions for kexec based crash dumps. > + * > + * Copyright (C) 2005, IBM Corp. > + * > + * Created by: Haren Myneni > + * > + * This source code is licensed under the GNU General Public License, > + * Version 2. See the file COPYING for more details. > + * > + */ > + > +#undef DEBUG > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > +#include > +#include > +#include > +#include > + > +#ifdef DEBUG > +#include > +#define DBG(fmt...) udbg_printf(fmt) > +#else > +#define DBG(fmt...) > +#endif > + > +/* This keeps a track of which one is crashing cpu. 
*/ > +int crashing_cpu = -1; > + > +static u32 *append_elf_note(u32 *buf, char *name, unsigned type, void *data, > + size_t data_len) > +{ > + struct elf_note note; > + > + note.n_namesz = strlen(name) + 1; > + note.n_descsz = data_len; > + note.n_type = type; > + memcpy(buf, &note, sizeof(note)); > + buf += (sizeof(note) +3)/4; > + memcpy(buf, name, note.n_namesz); > + buf += (note.n_namesz + 3)/4; > + memcpy(buf, data, note.n_descsz); > + buf += (note.n_descsz + 3)/4; > + > + return buf; > +} > + > +static void final_note(u32 *buf) > +{ > + struct elf_note note; > + > + note.n_namesz = 0; > + note.n_descsz = 0; > + note.n_type = 0; > + memcpy(buf, &note, sizeof(note)); > +} > + > +static void crash_save_this_cpu(struct pt_regs *regs, int cpu) > +{ > + struct elf_prstatus prstatus; > + u32 *buf; > + > + if ((cpu < 0) || (cpu >= NR_CPUS)) > + return; > + > + /* Using ELF notes here is opportunistic. > + * I need a well defined structure format > + * for the data I pass, and I need tags > + * on the data to indicate what information I have > + * squirrelled away. ELF notes happen to provide > + * all of that that no need to invent something new. > + */ > + buf = &crash_notes[cpu][0]; > + memset(&prstatus, 0, sizeof(prstatus)); > + prstatus.pr_pid = current->pid; > + elf_core_copy_regs(&prstatus.pr_reg, regs); > + buf = append_elf_note(buf, "CORE", NT_PRSTATUS, &prstatus, > + sizeof(prstatus)); > + final_note(buf); > +} > + > +/* FIXME Merge this with xmon_save_regs ??
*/ > +static inline void crash_get_current_regs(struct pt_regs *regs) > +{ > + unsigned long tmp1, tmp2; > + > + __asm__ __volatile__ ( > + "std 0,0(%2)\n" > + "std 1,8(%2)\n" > + "std 2,16(%2)\n" > + "std 3,24(%2)\n" > + "std 4,32(%2)\n" > + "std 5,40(%2)\n" > + "std 6,48(%2)\n" > + "std 7,56(%2)\n" > + "std 8,64(%2)\n" > + "std 9,72(%2)\n" > + "std 10,80(%2)\n" > + "std 11,88(%2)\n" > + "std 12,96(%2)\n" > + "std 13,104(%2)\n" > + "std 14,112(%2)\n" > + "std 15,120(%2)\n" > + "std 16,128(%2)\n" > + "std 17,136(%2)\n" > + "std 18,144(%2)\n" > + "std 19,152(%2)\n" > + "std 20,160(%2)\n" > + "std 21,168(%2)\n" > + "std 22,176(%2)\n" > + "std 23,184(%2)\n" > + "std 24,192(%2)\n" > + "std 25,200(%2)\n" > + "std 26,208(%2)\n" > + "std 27,216(%2)\n" > + "std 28,224(%2)\n" > + "std 29,232(%2)\n" > + "std 30,240(%2)\n" > + "std 31,248(%2)\n" > + "mfmsr %0\n" > + "std %0, 264(%2)\n" > + "mfctr %0\n" > + "std %0, 280(%2)\n" > + "mflr %0\n" > + "std %0, 288(%2)\n" > + "bl 1f\n" > + "1: mflr %1\n" > + "std %1, 256(%2)\n" > + "mtlr %0\n" > + "mfxer %0\n" > + "std %0, 296(%2)\n" > + : "=&r" (tmp1), "=&r" (tmp2) > + : "b" (regs)); > +} > + > +/* We may have saved_regs from where the error came from > + * or it is NULL if via a direct panic().
> + */ > +static void crash_save_self(struct pt_regs *saved_regs) > +{ > + struct pt_regs regs; > + int cpu; > + > + cpu = smp_processor_id(); > + if (saved_regs) > + memcpy(&regs, saved_regs, sizeof(regs)); > + else > + crash_get_current_regs(&regs); > + crash_save_this_cpu(&regs, cpu); > +} > + > +#ifdef CONFIG_SMP > +static atomic_t waiting_for_crash_ipi; > + > +void crash_ipi_callback(struct pt_regs *regs) > +{ > + int cpu = smp_processor_id(); > + > + if (cpu == crashing_cpu) > + return; > + > + if (!cpu_online(cpu)) > + return; > + > + if (ppc_md.kexec_cpu_down) > + ppc_md.kexec_cpu_down(1, 1); > + > + local_irq_disable(); > + > + crash_save_this_cpu(regs, cpu); > + atomic_dec(&waiting_for_crash_ipi); > + kexec_smp_wait(); > + /* NOTREACHED */ > +} > + > +static void crash_kexec_prepare_cpus(void) > +{ > + unsigned int msecs; > + > + atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); > + > + crash_send_ipi(crash_ipi_callback); > + smp_wmb(); > + > + /* > + * FIXME: Until we will have the way to stop other CPUSs reliabally, > + * the crash CPU will send an IPI and wait for other CPUs to > + * respond. If not, proceed the kexec boot even though we failed to > + * capture other CPU states. > + */ > + msecs = 1000000; > + while ((atomic_read(&waiting_for_crash_ipi) > 0) && (--msecs > 0)) { > + barrier(); > + mdelay(1); > + } > + > + /* Would it be better to replace the trap vector here? */ > + > + /* > + * FIXME: In case if we do not get all CPUs, one possibility: ask the > + * user to do soft reset such that we get all. > + * IPI handler is already set by the panic cpu initially. Therefore, > + * all cpus could invoke this handler from die() and the panic CPU > + * will call machine_kexec() directly from this handler to do > + * kexec boot.
> + */ > + if (atomic_read(&waiting_for_crash_ipi)) > + printk(KERN_ALERT "done waiting: %d cpus not responding\n", > + atomic_read(&waiting_for_crash_ipi)); > + /* Leave the IPI callback set */ > +} > +#else > +static void crash_kexec_prepare_cpus(void) > +{ > + /* > + * move the secondarys to us so that we can copy > + * the new kernel 0-0x100 safely > + * > + * do this if kexec in setup.c ? > + */ > + smp_release_cpus(); > +} > + > +#endif > + > +void default_machine_crash_shutdown(struct pt_regs *regs) > +{ > + /* > + * This function is only called after the system > + * has paniced or is otherwise in a critical state. > + * The minimum amount of code to allow a kexec'd kernel > + * to run successfully needs to happen here. > + * > + * In practice this means stopping other cpus in > + * an SMP system. > + * The kernel is broken so disable interrupts. > + */ > + local_irq_disable(); > + > + if (ppc_md.kexec_cpu_down) > + ppc_md.kexec_cpu_down(1, 0); > + > + /* > + * Make a note of crashing cpu. Will be used in machine_kexec > + * such that another IPI will not be sent. 
> + */ > + crashing_cpu = smp_processor_id(); > + crash_kexec_prepare_cpus(); > + crash_save_self(regs); > +} > Index: kexec/arch/powerpc/kernel/traps.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/traps.c > +++ kexec/arch/powerpc/kernel/traps.c > @@ -31,6 +31,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -95,7 +96,7 @@ static DEFINE_SPINLOCK(die_lock); > > int die(const char *str, struct pt_regs *regs, long err) > { > - static int die_counter; > + static int die_counter, crash_dump_start = 0; > int nl = 0; > > if (debugger(regs)) > @@ -156,7 +157,21 @@ int die(const char *str, struct pt_regs > print_modules(); > show_regs(regs); > bust_spinlocks(0); > + > + if (!crash_dump_start && kexec_should_crash(current)) { > + crash_dump_start = 1; > + spin_unlock_irq(&die_lock); > + crash_kexec(regs); > + /* NOTREACHED */ > + } > spin_unlock_irq(&die_lock); > + if (crash_dump_start) > + /* > + * Only for soft-reset: Other CPUs will be responded to an IPI > + * sent by first kexec CPU. > + */ > + for(;;) > + ; > > if (in_interrupt()) > panic("Fatal exception in interrupt"); > Index: kexec/arch/powerpc/kernel/machine_kexec_64.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/machine_kexec_64.c > +++ kexec/arch/powerpc/kernel/machine_kexec_64.c > @@ -265,11 +265,18 @@ extern NORET_TYPE void kexec_sequence(vo > /* too late to fail here */ > void default_machine_kexec(struct kimage *image) > { > - > /* prepare control code if any */ > > - /* shutdown other cpus into our wait loop and quiesce interrupts */ > - kexec_prepare_cpus(); > + /* > + * If the kexec boot is the normal one, need to shutdown other cpus > + * into our wait loop and quiesce interrupts. > + * Otherwise, in the case of crashed mode (crashing_cpu >= 0), > + * stopping other CPUs and collecting their pt_regs is done before > + * using debugger IPI. 
> + */ > + > + if (crashing_cpu == -1) > + kexec_prepare_cpus(); > > /* switch to a staticly allocated stack. Based on irq stack code. > * XXX: the task struct will likely be invalid once we do the copy! > Index: kexec/include/asm-powerpc/kexec.h > =================================================================== > --- kexec.orig/include/asm-powerpc/kexec.h > +++ kexec/include/asm-powerpc/kexec.h > @@ -32,6 +32,8 @@ > > #ifndef __ASSEMBLY__ > > +#ifdef CONFIG_KEXEC > + > #define MAX_NOTE_BYTES 1024 > typedef u32 note_buf_t[MAX_NOTE_BYTES / sizeof(u32)]; > > @@ -41,11 +43,17 @@ extern note_buf_t crash_notes[]; > extern void kexec_smp_wait(void); /* get and clear naca physid, wait for > master to copy new code to 0 */ > extern void __init kexec_setup(void); > -#endif > +extern int crashing_cpu; > +extern void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *)); > +#endif /* __powerpc64 __ */ > > struct kimage; > +struct pt_regs; > extern void default_machine_kexec(struct kimage *image); > extern int default_machine_kexec_prepare(struct kimage *image); > +extern void default_machine_crash_shutdown(struct pt_regs *regs); > + > +#endif /* !CONFIG_KEXEC */ > > #endif /* ! 
__ASSEMBLY__ */ > #endif /* _ASM_POWERPC_KEXEC_H */ > Index: kexec/arch/powerpc/platforms/pseries/xics.c > =================================================================== > --- kexec.orig/arch/powerpc/platforms/pseries/xics.c > +++ kexec/arch/powerpc/platforms/pseries/xics.c > @@ -417,7 +417,7 @@ irqreturn_t xics_ipi_action(int irq, voi > smp_message_recv(PPC_MSG_MIGRATE_TASK, regs); > } > #endif > -#ifdef CONFIG_DEBUGGER > +#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC) > if (test_and_clear_bit(PPC_MSG_DEBUGGER_BREAK, > &xics_ipi_message[cpu].value)) { > mb(); > Index: kexec/arch/powerpc/platforms/cell/setup.c > =================================================================== > --- kexec.orig/arch/powerpc/platforms/cell/setup.c > +++ kexec/arch/powerpc/platforms/cell/setup.c > @@ -217,5 +217,6 @@ struct machdep_calls __initdata cell_md > #ifdef CONFIG_KEXEC > .machine_kexec = default_machine_kexec, > .machine_kexec_prepare = default_machine_kexec_prepare, > + .machine_crash_shutdown = default_machine_crash_shutdown, > #endif > }; > Index: kexec/arch/powerpc/platforms/maple/setup.c > =================================================================== > --- kexec.orig/arch/powerpc/platforms/maple/setup.c > +++ kexec/arch/powerpc/platforms/maple/setup.c > @@ -282,5 +282,6 @@ struct machdep_calls __initdata maple_md > #ifdef CONFIG_KEXEC > .machine_kexec = default_machine_kexec, > .machine_kexec_prepare = default_machine_kexec_prepare, > + .machine_crash_shutdown = default_machine_crash_shutdown, > #endif > }; > Index: kexec/arch/powerpc/platforms/powermac/setup.c > =================================================================== > --- kexec.orig/arch/powerpc/platforms/powermac/setup.c > +++ kexec/arch/powerpc/platforms/powermac/setup.c > @@ -771,6 +771,7 @@ struct machdep_calls __initdata pmac_md > #ifdef CONFIG_KEXEC > .machine_kexec = default_machine_kexec, > .machine_kexec_prepare = default_machine_kexec_prepare, > + .machine_crash_shutdown = 
default_machine_crash_shutdown, > #endif > #endif /* CONFIG_PPC64 */ > #ifdef CONFIG_PPC32 > Index: kexec/arch/powerpc/platforms/pseries/setup.c > =================================================================== > --- kexec.orig/arch/powerpc/platforms/pseries/setup.c > +++ kexec/arch/powerpc/platforms/pseries/setup.c > @@ -629,5 +629,6 @@ struct machdep_calls __initdata pSeries_ > .kexec_cpu_down = pseries_kexec_cpu_down, > .machine_kexec = default_machine_kexec, > .machine_kexec_prepare = default_machine_kexec_prepare, > + .machine_crash_shutdown = default_machine_crash_shutdown, > #endif > }; > Index: kexec/arch/powerpc/kernel/crash_dump.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/crash_dump.c > +++ kexec/arch/powerpc/kernel/crash_dump.c > @@ -11,6 +11,8 @@ > > #undef DEBUG > > +#include > +#include > #include > #include > #include > @@ -51,3 +53,21 @@ void __init kdump_setup(void) > > DBG(" <- kdump_setup()\n"); > } > + > +static int __init parse_elfcorehdr(char *p) > +{ > + if (p) > + elfcorehdr_addr = memparse(p, &p); > + > + return 0; > +} > +__setup("elfcorehdr=", parse_elfcorehdr); > + > +static int __init parse_savemaxmem(char *p) > +{ > + if (p) > + saved_max_pfn = (memparse(p, &p) >> PAGE_SHIFT) - 1; > + > + return 0; > +} > +__setup("savemaxmem=", parse_savemaxmem); > Index: kexec/arch/powerpc/kernel/Makefile > =================================================================== > --- kexec.orig/arch/powerpc/kernel/Makefile > +++ kexec/arch/powerpc/kernel/Makefile > @@ -66,7 +66,7 @@ pci64-$(CONFIG_PPC64) += pci_64.o pci_d > obj-$(CONFIG_PCI) += $(pci64-y) > kexec-$(CONFIG_PPC64) := machine_kexec_64.o > kexec-$(CONFIG_PPC32) := machine_kexec_32.o > -obj-$(CONFIG_KEXEC) += machine_kexec.o $(kexec-y) > +obj-$(CONFIG_KEXEC) += machine_kexec.o crash.o $(kexec-y) > > ifeq ($(CONFIG_PPC_ISERIES),y) > $(obj)/head_64.o: $(obj)/lparmap.s > > > ------------------------------ > > Message: 2 > Date: 
Sun, 04 Dec 2005 18:39:48 > From: Michael Ellerman > Subject: [PATCH 9/11] powerpc: Parse crashkernel= parameter in first > kernel > To: , Paul Mackerras > Message-ID: <20051205004002.7A01B68889 at ozlabs.org> > > This patch adds code to parse and setup the crash kernel resource in the > first kernel. PPC64 ignores the @x part, we always run at 32 MB. > > Signed-off-by: Haren Myneni > Signed-off-by: Michael Ellerman > --- > > arch/powerpc/kernel/prom.c | 11 ++++++++ > arch/powerpc/kernel/prom_init.c | 53 ++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 64 insertions(+) > > Index: kexec/arch/powerpc/kernel/prom_init.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/prom_init.c > +++ kexec/arch/powerpc/kernel/prom_init.c > @@ -192,6 +192,11 @@ static unsigned long __initdata alloc_bo > static unsigned long __initdata rmo_top; > static unsigned long __initdata ram_top; > > +#ifdef CONFIG_KEXEC > +static unsigned long __initdata prom_crashk_base; > +static unsigned long __initdata prom_crashk_size; > +#endif > + > static struct mem_map_entry __initdata mem_reserve_map[MEM_RESERVE_MAP_SIZE]; > static int __initdata mem_reserve_cnt; > > @@ -590,6 +595,34 @@ static void __init early_cmdline_parse(v > RELOC(prom_memory_limit) = ALIGN(RELOC(prom_memory_limit), 0x1000000); > #endif > } > + > +#ifdef CONFIG_KEXEC > + /* > + * crashkernel=size at addr specifies the location to reserve for > + * crash kernel. > + */ > + opt = strstr(RELOC(prom_cmd_line), RELOC("crashkernel=")); > + if (opt) { > + opt += 12; > + RELOC(prom_crashk_size) = prom_memparse(opt, &opt); > + > + if (ALIGN(RELOC(prom_crashk_size), 0x1000000) != > + RELOC(prom_crashk_size)) { > + prom_printf("Warning: crashkernel size is not " > + "aligned to 16MB\n"); > + } > + > + /* > + * At present, the crash kernel always run at 32MB. > + * Just ignore whatever user passed. 
> + */ > + RELOC(prom_crashk_base) = 0x2000000; > + if (*opt == '@') { > + prom_printf("Warning: PPC64 kdump kernel always runs " > + "at 32 MB\n"); > + } > + } > +#endif > } > > #ifdef CONFIG_PPC_PSERIES > @@ -1011,6 +1044,12 @@ static void __init prom_init_mem(void) > prom_printf(" alloc_top_hi : %x\n", RELOC(alloc_top_high)); > prom_printf(" rmo_top : %x\n", RELOC(rmo_top)); > prom_printf(" ram_top : %x\n", RELOC(ram_top)); > +#ifdef CONFIG_KEXEC > + if (RELOC(prom_crashk_base)) { > + prom_printf(" crashk_base : %x\n", RELOC(prom_crashk_base)); > + prom_printf(" crashk_size : %x\n", RELOC(prom_crashk_size)); > + } > +#endif > } > > > @@ -2094,6 +2133,10 @@ unsigned long __init prom_init(unsigned > */ > prom_init_mem(); > > +#ifdef CONFIG_KEXEC > + if (RELOC(prom_crashk_base)) > + reserve_mem(RELOC(prom_crashk_base), RELOC(prom_crashk_size)); > +#endif > /* > * Determine which cpu is actually running right _now_ > */ > @@ -2150,6 +2193,16 @@ unsigned long __init prom_init(unsigned > } > #endif > > +#ifdef CONFIG_KEXEC > + if (RELOC(prom_crashk_base)) { > + prom_setprop(_prom->chosen, "/chosen", "linux,crashkernel-base", > + PTRRELOC(&prom_crashk_base), > + sizeof(RELOC(prom_crashk_base))); > + prom_setprop(_prom->chosen, "/chosen", "linux,crashkernel-size", > + PTRRELOC(&prom_crashk_size), > + sizeof(RELOC(prom_crashk_size))); > + } > +#endif > /* > * Fixup any known bugs in the device-tree > */ > Index: kexec/arch/powerpc/kernel/prom.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/prom.c > +++ kexec/arch/powerpc/kernel/prom.c > @@ -29,6 +29,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -1198,6 +1199,16 @@ static int __init early_init_dt_scan_cho > } > #endif /* CONFIG_PPC_RTAS */ > > +#ifdef CONFIG_KEXEC > + lprop = (u64*)of_get_flat_dt_prop(node, "linux,crashkernel-base", NULL); > + if (lprop) > + crashk_res.start = *lprop; > + > + lprop = 
(u64*)of_get_flat_dt_prop(node, "linux,crashkernel-size", NULL); > + if (lprop) > + crashk_res.end = crashk_res.start + *lprop - 1; > +#endif > + > /* break now */ > return 1; > } > > > ------------------------------ > > Message: 3 > Date: Sun, 04 Dec 2005 18:39:51 > From: Michael Ellerman > Subject: [PATCH 10/11] powerpc: Add arch-dependant copy_oldmem_page > To: , Paul Mackerras > Message-ID: <20051205004006.1FEB26887B at ozlabs.org> > > Add arch-dependant copy_oldmem_page. > > Signed-off-by: Haren Myneni > Signed-off-by: Michael Ellerman > --- > > arch/powerpc/kernel/crash_dump.c | 36 ++++++++++++++++++++++++++++++++++++ > include/asm-powerpc/kexec.h | 2 ++ > kernel/crash_dump.c | 3 +++ > 3 files changed, 41 insertions(+) > > Index: kexec/arch/powerpc/kernel/crash_dump.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/crash_dump.c > +++ kexec/arch/powerpc/kernel/crash_dump.c > @@ -16,6 +16,7 @@ > #include > #include > #include > +#include > > #ifdef DEBUG > #include > @@ -71,3 +72,38 @@ static int __init parse_savemaxmem(char > return 0; > } > __setup("savemaxmem=", parse_savemaxmem); > + > +/* > + * copy_oldmem_page - copy one page from "oldmem" > + * @pfn: page frame number to be copied > + * @buf: target memory address for the copy; this can be in kernel address > + * space or user address space (see @userbuf) > + * @csize: number of bytes to copy > + * @offset: offset in bytes into the page (based on pfn) to begin the copy > + * @userbuf: if set, @buf is in user address space, use copy_to_user(), > + * otherwise @buf is in kernel address space, use memcpy(). > + * > + * Copy a page from "oldmem". For this page, there is no pte mapped > + * in the current kernel. We stitch up a pte, similar to kmap_atomic. 
> + */ > +ssize_t copy_oldmem_page(unsigned long pfn, char *buf, > + size_t csize, unsigned long offset, int userbuf) > +{ > + void *vaddr; > + > + if (!csize) > + return 0; > + > + vaddr = __ioremap(pfn << PAGE_SHIFT, PAGE_SIZE, 0); > + > + if (userbuf) { > + if (copy_to_user((char __user *)buf, (vaddr + offset), csize)) { > + iounmap(vaddr); > + return -EFAULT; > + } > + } else > + memcpy(buf, (vaddr + offset), csize); > + > + iounmap(vaddr); > + return csize; > +} > Index: kexec/include/asm-powerpc/kexec.h > =================================================================== > --- kexec.orig/include/asm-powerpc/kexec.h > +++ kexec/include/asm-powerpc/kexec.h > @@ -30,6 +30,8 @@ > #define KEXEC_ARCH KEXEC_ARCH_PPC > #endif > > +#define HAVE_ARCH_COPY_OLDMEM_PAGE > + > #ifndef __ASSEMBLY__ > > #ifdef CONFIG_KEXEC > Index: kexec/kernel/crash_dump.c > =================================================================== > --- kexec.orig/kernel/crash_dump.c > +++ kexec/kernel/crash_dump.c > @@ -14,10 +14,12 @@ > > #include > #include > +#include > > /* Stores the physical address of elf header of crash image. */ > unsigned long long elfcorehdr_addr = ELFCORE_ADDR_MAX; > > +#ifndef HAVE_ARCH_COPY_OLDMEM_PAGE > /** > * copy_oldmem_page - copy one page from "oldmem" > * @pfn: page frame number to be copied > @@ -59,3 +61,4 @@ ssize_t copy_oldmem_page(unsigned long p > kfree(page); > return csize; > } > +#endif > > > ------------------------------ > > Message: 4 > Date: Sun, 04 Dec 2005 18:39:55 > From: Michael Ellerman > Subject: [PATCH 11/11] powerpc: Add support for "linux, usable-memory" > on memory nodes > To: , Paul Mackerras > Message-ID: <20051205004009.D5BCE68865 at ozlabs.org> > > Milton has proposed that we should support a "linux,usable-memory" property > on memory nodes which describes, in preference to "reg", the regions of memory > Linux should use. > > This facility is required for kdump, to inform the second kernel which memory > it should use. 
> > Signed-off-by: Haren Myneni > Signed-off-by: Michael Ellerman > --- > > arch/powerpc/kernel/prom.c | 9 +++++++-- > arch/powerpc/mm/numa.c | 7 ++++++- > 2 files changed, 13 insertions(+), 3 deletions(-) > > Index: kexec/arch/powerpc/kernel/prom.c > =================================================================== > --- kexec.orig/arch/powerpc/kernel/prom.c > +++ kexec/arch/powerpc/kernel/prom.c > @@ -567,7 +567,10 @@ static int __init interpret_root_props(s > unsigned int *rp; > int rpsize = (naddrc + nsizec) * sizeof(unsigned int); > > - rp = (unsigned int *) get_property(np, "reg", &l); > + rp = (unsigned int *) get_property(np, "linux,usable-memory", &l); > + if (rp == NULL) > + rp = (unsigned int *) get_property(np, "reg", &l); > + > if (rp != 0 && l >= rpsize) { > i = 0; > adr = (struct address_range *) (*mem_start); > @@ -1275,7 +1278,9 @@ static int __init early_init_dt_scan_mem > } else if (strcmp(type, "memory") != 0) > return 0; > > - reg = (cell_t *)of_get_flat_dt_prop(node, "reg", &l); > + reg = (cell_t *)of_get_flat_dt_prop(node, "linux,usable-memory", &l); > + if (reg == NULL) > + reg = (cell_t *)of_get_flat_dt_prop(node, "reg", &l); > if (reg == NULL) > return 0; > > Index: kexec/arch/powerpc/mm/numa.c > =================================================================== > --- kexec.orig/arch/powerpc/mm/numa.c > +++ kexec/arch/powerpc/mm/numa.c > @@ -423,7 +423,12 @@ static int __init parse_numa_properties( > unsigned int *memcell_buf; > unsigned int len; > > - memcell_buf = (unsigned int *)get_property(memory, "reg", &len); > + memcell_buf = (unsigned int *)get_property(memory, > + "linux,usable-memory", &len); > + if (!memcell_buf || len <= 0) > + memcell_buf = > + (unsigned int *)get_property(memory, "reg", > + &len); > if (!memcell_buf || len <= 0) > continue; > > > > ------------------------------ > > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > 
https://ozlabs.org/mailman/listinfo/linuxppc64-dev > > > End of Linuxppc64-dev Digest, Vol 16, Issue 11 > ********************************************** > From ericvanhensbergen at us.ibm.com Tue Dec 6 01:45:24 2005 From: ericvanhensbergen at us.ibm.com (Eric V Van hensbergen) Date: Mon, 5 Dec 2005 08:45:24 -0600 Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: Message-ID: rsa at us.ltcfwd.linux.ibm.com wrote on 12/04/2005 03:13:12 PM: > This patch adds the hvc_fss.c driver file. > > Signed-off-by: Ryan S. Arnold > diff -uNr linux-2.6.14-rc5/drivers/char/hvc_fss.c linux-2.6.14-rc5- > cbe-fss/drivers/char/hvc_fss.c > --- linux-2.6.14-rc5/drivers/char/hvc_fss.c 1969-12-31 19:00:00. > 000000000 -0500 > +++ linux-2.6.14-rc5-cbe-fss/drivers/char/hvc_fss.c 2005-12-02 17: > 54:19.243249984 -0500 > @@ -0,0 +1,148 @@ ... > + > +static inline int callthru0(int command) > +{ > + register int c asm ("r3") = command; > + > + asm volatile (".long 0x000EAEB0" : "=r" (c): "r" (c)); > + return((c)); > +} > + > +static inline int callthru3(int command, unsigned long arg1, > unsigned long arg2, unsigned long arg3) > +{ > + register int c asm ("r3") = command; > + register unsigned long a1 asm ("r4") = arg1; > + register unsigned long a2 asm ("r5") = arg2; > + register unsigned long a3 asm ("r6") = arg3; > + > + asm volatile (".long 0x000EAEB0" : "=r" (c): "r" (c), "r" (a1), > "r" (a2), "r" (a3)); > + return((c)); > +} > + Its a relatively small knit-pick, but the callthru functions should probably be kept in a common include. My patch-set has include/asm-powerpc/systemsim.h which includes these definitions. That way we don't have to define the callthru's for every driver which might use them (such as BogusNet or BogusDisk). 
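A minimal sketch of what such a common include could look like, assuming GCC. The two helpers are copied from the quoted hvc_fss.c hunk; the guard name, the host-side stand-ins, SIM_SKETCH_WRITE_CONSOLE and sim_sketch_write() are hypothetical and only illustrate how a driver would share the definitions instead of redefining them. This is not the actual contents of systemsim.h.

```c
/*
 * Sketch of a shared header for the simulator call-through helpers.
 * The guard macro name is an illustrative assumption.
 */
#ifndef SYSTEMSIM_CALLTHRU_SKETCH_H
#define SYSTEMSIM_CALLTHRU_SKETCH_H

#if defined(__powerpc__) || defined(__powerpc64__)

/* From the quoted patch: the magic opcode traps into the simulator. */
static inline int callthru0(int command)
{
	register int c __asm__("r3") = command;

	__asm__ __volatile__(".long 0x000EAEB0" : "=r" (c) : "r" (c));
	return c;
}

static inline int callthru3(int command, unsigned long arg1,
			    unsigned long arg2, unsigned long arg3)
{
	register int c __asm__("r3") = command;
	register unsigned long a1 __asm__("r4") = arg1;
	register unsigned long a2 __asm__("r5") = arg2;
	register unsigned long a3 __asm__("r6") = arg3;

	__asm__ __volatile__(".long 0x000EAEB0"
			     : "=r" (c)
			     : "r" (c), "r" (a1), "r" (a2), "r" (a3));
	return c;
}

#else

/*
 * Assumption: host-side stand-ins so the sketch also builds off-target.
 * They just echo the command code and are not part of any patch.
 */
static inline int callthru0(int command)
{
	return command;
}

static inline int callthru3(int command, unsigned long arg1,
			    unsigned long arg2, unsigned long arg3)
{
	(void)arg1;
	(void)arg2;
	(void)arg3;
	return command;
}

#endif /* __powerpc__ */

/* Hypothetical command code, for illustration only. */
#define SIM_SKETCH_WRITE_CONSOLE 0

/* A driver includes the header and calls the helper; nothing is redefined. */
static inline int sim_sketch_write(const char *buf, unsigned long len)
{
	return callthru3(SIM_SKETCH_WRITE_CONSOLE, (unsigned long)buf, len, 0);
}

#endif /* SYSTEMSIM_CALLTHRU_SKETCH_H */
```

With the helpers in one guarded header, a console, network and disk driver can each include it without carrying their own copy of the inline asm.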
-eric From hollis at penguinppc.org Tue Dec 6 02:35:10 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Mon, 5 Dec 2005 09:35:10 -0600 Subject: [PATCH] powerpc: Separate usage of KERNELBASE and PAGE_OFFSET In-Reply-To: <20051205050717.ED89768863@ozlabs.org> References: <20051205050717.ED89768863@ozlabs.org> Message-ID: <5d89217ec251646b34138f147e73cad6@penguinppc.org> On Dec 4, 2005, at 5:07 PM, Michael Ellerman wrote: > This patch separates usage of KERNELBASE and PAGE_OFFSET. I haven't > looked at > any of the PPC code, if we ever want to support Kdump on PPC we'll > have to do > another audit, ditto for iSeries. (I guess you're trying to say you haven't tested 32-bit support, but saying "PPC" here is rather confusing...) > To get a physical address from a virtual one you subtract PAGE_OFFSET, > _not_ > KERNELBASE. > > KERNELBASE is the virtual address of the start of the kernel, it's > often the > same as PAGE_OFFSET, but _might not be_. > > If you want to know something's offset from the start of the kernel > you should > subtract KERNELBASE. Could you please add these helpful comments to page.h? You might also mention kdump as an example, to help people understand this subtle distinction. -Hollis From michael at ellerman.id.au Tue Dec 6 03:10:43 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Mon, 5 Dec 2005 10:10:43 -0600 Subject: [PATCH] powerpc: Separate usage of KERNELBASE and PAGE_OFFSET In-Reply-To: <5d89217ec251646b34138f147e73cad6@penguinppc.org> References: <20051205050717.ED89768863@ozlabs.org> <5d89217ec251646b34138f147e73cad6@penguinppc.org> Message-ID: <200512051010.50584.michael@ellerman.id.au> On Mon, 5 Dec 2005 09:35, Hollis Blanchard wrote: > On Dec 4, 2005, at 5:07 PM, Michael Ellerman wrote: > > This patch separates usage of KERNELBASE and PAGE_OFFSET. I haven't > > looked at > > any of the PPC code, if we ever want to support Kdump on PPC we'll > > have to do > > another audit, ditto for iSeries. 
> > (I guess you're trying to say you haven't tested 32-bit support, but > saying "PPC" here is rather confusing...) You're right, that's not very clear. What I meant is I haven't audited any of the code in arch/ppc, or any of the 32-bit PPC code in arch/powerpc for usage of KERNELBASE vs PAGE_OFFSET. If we want kdump to work on 32-bit powerpc we'll need to audit that code first. > > To get a physical address from a virtual one you subtract PAGE_OFFSET, > > _not_ > > KERNELBASE. > > > > KERNELBASE is the virtual address of the start of the kernel, it's > > often the > > same as PAGE_OFFSET, but _might not be_. > > > > If you want to know something's offset from the start of the kernel > > you should > > subtract KERNELBASE. > > Could you please add these helpful comments to page.h? You might also > mention kdump as an example, to help people understand this subtle > distinction. Not a bad idea. It doesn't look like paulus has merged them yet (due to my speeling mistakes ;), so I'll just update this patch. cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person From michael at ellerman.id.au Tue Dec 6 03:24:33 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Mon, 05 Dec 2005 10:24:33 -0600 Subject: [PATCH] powerpc: Separate usage of KERNELBASE and PAGE_OFFSET In-Reply-To: <5d89217ec251646b34138f147e73cad6@penguinppc.org> Message-ID: <20051205162449.0DBE26884E@ozlabs.org> This patch separates usage of KERNELBASE and PAGE_OFFSET. 
I haven't looked at any of the PPC code, if we ever want to support Kdump on PPC we'll have to do another audit, ditto for iSeries. This patch makes PAGE_OFFSET the constant, it'll always be 0xC * 1 gazillion. To get a physical address from a virtual one you subtract PAGE_OFFSET, _not_ KERNELBASE. KERNELBASE is the virtual address of the start of the kernel, it's often the same as PAGE_OFFSET, but _might not be_. If you want to know something's offset from the start of the kernel you should subtract KERNELBASE. Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/btext.c | 4 ++-- arch/powerpc/kernel/entry_64.S | 4 ++-- arch/powerpc/kernel/lparmap.c | 6 +++--- arch/powerpc/kernel/machine_kexec_64.c | 5 ++--- arch/powerpc/mm/hash_utils_64.c | 6 +++--- arch/powerpc/mm/slb.c | 4 ++-- arch/powerpc/mm/slb_low.S | 6 +++--- arch/powerpc/mm/stab.c | 10 +++++----- include/asm-powerpc/page.h | 16 +++++++++++++++- 9 files changed, 37 insertions(+), 24 deletions(-) Index: kexec/arch/powerpc/mm/stab.c =================================================================== --- kexec.orig/arch/powerpc/mm/stab.c +++ kexec/arch/powerpc/mm/stab.c @@ -40,7 +40,7 @@ static int make_ste(unsigned long stab, unsigned long entry, group, old_esid, castout_entry, i; unsigned int global_entry; struct stab_entry *ste, *castout_ste; - unsigned long kernel_segment = (esid << SID_SHIFT) >= KERNELBASE; + unsigned long kernel_segment = (esid << SID_SHIFT) >= PAGE_OFFSET; vsid_data = vsid << STE_VSID_SHIFT; esid_data = esid << SID_SHIFT | STE_ESID_KP | STE_ESID_V; @@ -83,7 +83,7 @@ static int make_ste(unsigned long stab, } /* Dont cast out the first kernel segment */ - if ((castout_ste->esid_data & ESID_MASK) != KERNELBASE) + if ((castout_ste->esid_data & ESID_MASK) != PAGE_OFFSET) break; castout_entry = (castout_entry + 1) & 0xf; @@ -251,7 +251,7 @@ void stabs_alloc(void) panic("Unable to allocate segment table for CPU %d.\n", cpu); - newstab += KERNELBASE; + newstab = (unsigned long)__va(newstab); 
memset((void *)newstab, 0, HW_PAGE_SIZE); @@ -270,11 +270,11 @@ void stabs_alloc(void) */ void stab_initialize(unsigned long stab) { - unsigned long vsid = get_kernel_vsid(KERNELBASE); + unsigned long vsid = get_kernel_vsid(PAGE_OFFSET); unsigned long stabreal; asm volatile("isync; slbia; isync":::"memory"); - make_ste(stab, GET_ESID(KERNELBASE), vsid); + make_ste(stab, GET_ESID(PAGE_OFFSET), vsid); /* Order update */ asm volatile("sync":::"memory"); Index: kexec/arch/powerpc/kernel/machine_kexec_64.c =================================================================== --- kexec.orig/arch/powerpc/kernel/machine_kexec_64.c +++ kexec/arch/powerpc/kernel/machine_kexec_64.c @@ -153,9 +153,8 @@ void kexec_copy_flush(struct kimage *ima * including ones that were in place on the original copy */ for (i = 0; i < nr_segments; i++) - flush_icache_range(ranges[i].mem + KERNELBASE, - ranges[i].mem + KERNELBASE + - ranges[i].memsz); + flush_icache_range((unsigned long)__va(ranges[i].mem), + (unsigned long)__va(ranges[i].mem + ranges[i].memsz)); } #ifdef CONFIG_SMP Index: kexec/arch/powerpc/mm/hash_utils_64.c =================================================================== --- kexec.orig/arch/powerpc/mm/hash_utils_64.c +++ kexec/arch/powerpc/mm/hash_utils_64.c @@ -456,7 +456,7 @@ void __init htab_initialize(void) /* create bolted the linear mapping in the hash table */ for (i=0; i < lmb.memory.cnt; i++) { - base = lmb.memory.region[i].base + KERNELBASE; + base = (unsigned long)__va(lmb.memory.region[i].base); size = lmb.memory.region[i].size; DBG("creating mapping for region: %lx : %lx\n", base, size); @@ -498,8 +498,8 @@ void __init htab_initialize(void) * for either 4K or 16MB pages. 
*/ if (tce_alloc_start) { - tce_alloc_start += KERNELBASE; - tce_alloc_end += KERNELBASE; + tce_alloc_start = (unsigned long)__va(tce_alloc_start); + tce_alloc_end = (unsigned long)__va(tce_alloc_end); if (base + size >= tce_alloc_start) tce_alloc_start = base + size + 1; Index: kexec/arch/powerpc/mm/slb.c =================================================================== --- kexec.orig/arch/powerpc/mm/slb.c +++ kexec/arch/powerpc/mm/slb.c @@ -75,7 +75,7 @@ static void slb_flush_and_rebolt(void) vflags = SLB_VSID_KERNEL | virtual_llp; ksp_esid_data = mk_esid_data(get_paca()->kstack, 2); - if ((ksp_esid_data & ESID_MASK) == KERNELBASE) + if ((ksp_esid_data & ESID_MASK) == PAGE_OFFSET) ksp_esid_data &= ~SLB_ESID_V; /* We need to do this all in asm, so we're sure we don't touch @@ -213,7 +213,7 @@ void slb_initialize(void) asm volatile("isync":::"memory"); asm volatile("slbmte %0,%0"::"r" (0) : "memory"); asm volatile("isync; slbia; isync":::"memory"); - create_slbe(KERNELBASE, lflags, 0); + create_slbe(PAGE_OFFSET, lflags, 0); /* VMALLOC space has 4K pages always for now */ create_slbe(VMALLOCBASE, vflags, 1); Index: kexec/arch/powerpc/kernel/entry_64.S =================================================================== --- kexec.orig/arch/powerpc/kernel/entry_64.S +++ kexec/arch/powerpc/kernel/entry_64.S @@ -690,7 +690,7 @@ _GLOBAL(enter_rtas) /* Setup our real return addr */ SET_REG_TO_LABEL(r4,.rtas_return_loc) - SET_REG_TO_CONST(r9,KERNELBASE) + SET_REG_TO_CONST(r9,PAGE_OFFSET) sub r4,r4,r9 mtlr r4 @@ -718,7 +718,7 @@ _GLOBAL(enter_rtas) _STATIC(rtas_return_loc) /* relocation is off at this point */ mfspr r4,SPRN_SPRG3 /* Get PACA */ - SET_REG_TO_CONST(r5, KERNELBASE) + SET_REG_TO_CONST(r5, PAGE_OFFSET) sub r4,r4,r5 /* RELOC the PACA base pointer */ mfmsr r6 Index: kexec/arch/powerpc/mm/slb_low.S =================================================================== --- kexec.orig/arch/powerpc/mm/slb_low.S +++ kexec/arch/powerpc/mm/slb_low.S @@ -37,9 +37,9 @@ 
_GLOBAL(slb_allocate_realmode) srdi r9,r3,60 /* get region */ srdi r10,r3,28 /* get esid */ - cmpldi cr7,r9,0xc /* cmp KERNELBASE for later use */ + cmpldi cr7,r9,0xc /* cmp PAGE_OFFSET for later use */ - /* r3 = address, r10 = esid, cr7 = <>KERNELBASE */ + /* r3 = address, r10 = esid, cr7 = <> PAGE_OFFSET */ blt cr7,0f /* user or kernel? */ /* kernel address: proto-VSID = ESID */ @@ -166,7 +166,7 @@ _GLOBAL(slb_allocate_user) /* * Finish loading of an SLB entry and return * - * r3 = EA, r10 = proto-VSID, r11 = flags, clobbers r9, cr7 = <>KERNELBASE + * r3 = EA, r10 = proto-VSID, r11 = flags, clobbers r9, cr7 = <> PAGE_OFFSET */ slb_finish_load: ASM_VSID_SCRAMBLE(r10,r9) Index: kexec/arch/powerpc/kernel/lparmap.c =================================================================== --- kexec.orig/arch/powerpc/kernel/lparmap.c +++ kexec/arch/powerpc/kernel/lparmap.c @@ -16,8 +16,8 @@ const struct LparMap __attribute__((__se .xSegmentTableOffs = STAB0_PAGE, .xEsids = { - { .xKernelEsid = GET_ESID(KERNELBASE), - .xKernelVsid = KERNEL_VSID(KERNELBASE), }, + { .xKernelEsid = GET_ESID(PAGE_OFFSET), + .xKernelVsid = KERNEL_VSID(PAGE_OFFSET), }, { .xKernelEsid = GET_ESID(VMALLOCBASE), .xKernelVsid = KERNEL_VSID(VMALLOCBASE), }, }, @@ -25,7 +25,7 @@ const struct LparMap __attribute__((__se .xRanges = { { .xPages = HvPagesToMap, .xOffset = 0, - .xVPN = KERNEL_VSID(KERNELBASE) << (SID_SHIFT - HW_PAGE_SHIFT), + .xVPN = KERNEL_VSID(PAGE_OFFSET) << (SID_SHIFT - HW_PAGE_SHIFT), }, }, }; Index: kexec/include/asm-powerpc/page.h =================================================================== --- kexec.orig/include/asm-powerpc/page.h +++ kexec/include/asm-powerpc/page.h @@ -37,6 +37,20 @@ */ #define PAGE_MASK (~((1 << PAGE_SHIFT) - 1)) +/* + * KERNELBASE is the virtual address of the start of the kernel, it's often + * the same as PAGE_OFFSET, but _might not be_. + * + * The kdump dump kernel is one example where KERNELBASE != PAGE_OFFSET. 
+ * + * To get a physical address from a virtual one you subtract PAGE_OFFSET, + * _not_ KERNELBASE. + * + * If you want to know something's offset from the start of the kernel you + * should subtract KERNELBASE. + * + * If you want to test if something's a kernel address, use is_kernel_addr(). + */ #define PAGE_OFFSET ASM_CONST(CONFIG_KERNEL_START) #define KERNELBASE PAGE_OFFSET @@ -56,7 +70,7 @@ #define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT) #define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT) -#define __va(x) ((void *)((unsigned long)(x) + KERNELBASE)) +#define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET)) #define __pa(x) ((unsigned long)(x) - PAGE_OFFSET) /* Index: kexec/arch/powerpc/kernel/btext.c =================================================================== --- kexec.orig/arch/powerpc/kernel/btext.c +++ kexec/arch/powerpc/kernel/btext.c @@ -60,7 +60,7 @@ int force_printk_to_btext = 0; * * The display is mapped to virtual address 0xD0000000, rather * than 1:1, because some some CHRP machines put the frame buffer - * in the region starting at 0xC0000000 (KERNELBASE). + * in the region starting at 0xC0000000 (PAGE_OFFSET). * This mapping is temporary and will disappear as soon as the * setup done by MMU_Init() is applied. * @@ -71,7 +71,7 @@ int force_printk_to_btext = 0; */ void __init btext_prepare_BAT(void) { - unsigned long vaddr = KERNELBASE + 0x10000000; + unsigned long vaddr = PAGE_OFFSET + 0x10000000; unsigned long addr; unsigned long lowbits; From arnd at arndb.de Tue Dec 6 03:33:34 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 5 Dec 2005 17:33:34 +0100 Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: References: Message-ID: <200512051733.35417.arnd@arndb.de> On Maandag 05 Dezember 2005 15:45, Eric V Van hensbergen wrote: > Its a relatively small knit-pick, but the callthru functions should > probably > be kept in a common include. 
My patch-set has > include/asm-powerpc/systemsim.h > which includes these definitions. That way we don't have to define the > callthru's for every driver which might use them (such as BogusNet or > BogusDisk). > That's right. I have already ported the patches to the powerpc.git tree and used your systemsim.h file for that. What are your plans for bringing your patches upstream? The code in there looks pretty good already, but I guess some day you should split it up into smaller patches and submit those. Arnd <>< From rsa at us.ibm.com Tue Dec 6 04:06:27 2005 From: rsa at us.ibm.com (Ryan Arnold) Date: Mon, 05 Dec 2005 11:06:27 -0600 Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: References: Message-ID: <1133802387.10632.8.camel@localhost.localdomain> On Mon, 2005-12-05 at 08:45 -0600, Eric V Van hensbergen wrote: > Its a relatively small knit-pick, but the callthru functions should > probably > be kept in a common include. My patch-set has > include/asm-powerpc/systemsim.h > which includes these definitions. That way we don't have to define the > callthru's for every driver which might use them (such as BogusNet or > BogusDisk). > > -eric Thanks Eric, I did question whether the console driver was the appropriate place for the callthru when I moved the definitions from bogus_console.c. I guess we won't see these definitions moved to an alternate file until Arnd makes his patches available against a more recent kernel? 
-- Ryan Arnold IBM Linux Technology Center From ericvanhensbergen at us.ibm.com Tue Dec 6 04:14:04 2005 From: ericvanhensbergen at us.ibm.com (Eric V Van hensbergen) Date: Mon, 5 Dec 2005 11:14:04 -0600 Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: Message-ID: rsa at us.ltcfwd.linux.ibm.com wrote on 12/05/2005 11:11:11 AM: > On Mon, 2005-12-05 at 08:45 -0600, Eric V Van hensbergen wrote: > > Its a relatively small knit-pick, but the callthru functions should > > probably > > be kept in a common include. My patch-set has > > include/asm-powerpc/systemsim.h > > which includes these definitions. That way we don't have to define the > > callthru's for every driver which might use them (such as BogusNet or > > BogusDisk). > > > > -eric > > Thanks Eric, > > I did question whether the console driver was the appropriate place for > the callthru when I moved the definitions from bogus_console.c. I guess > we won't see these definitions moved to an alternate file until Arnd > makes his patches available against a more recent kernel? > You can get to my (generic systemsim patches) via kernel.org: /pub/scm/linux/kernel/git/ericvh/systemsim.git You should be able to pull the general definitions from there if you want to update your patch. -eric From ericvanhensbergen at us.ibm.com Tue Dec 6 04:17:22 2005 From: ericvanhensbergen at us.ibm.com (Eric V Van hensbergen) Date: Mon, 5 Dec 2005 11:17:22 -0600 Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: Message-ID: Arnd Bergmann wrote on 12/05/2005 10:35:13 AM: > On Maandag 05 Dezember 2005 15:45, Eric V Van hensbergen wrote: > > Its a relatively small knit-pick, but the callthru functions should > > probably > > be kept in a common include. My patch-set has > > include/asm-powerpc/systemsim.h > > which includes these definitions. 
That way we don't have to define the > > callthru's for every driver which might use them (such as BogusNet or > > BogusDisk). > > > > That's right. I have already ported the patches to the powerpc.git tree > and used your systemsim.h file for that. > > What are your plans for bringing your patches upstream? The code in there > looks pretty good already, but I guess some day you should split it up > into smaller patches and submit those. > I suppose if there is sufficient pull I could push them at any time -- I haven't gone down this path because I'm not sure how I feel including simulator drivers in the mainline kernel tree. If the linuxppc64 folks think this is valuable, I'd be happy to clean-up the drivers a bit more and submit a patch. -eric From miltonm at bga.com Tue Dec 6 04:27:07 2005 From: miltonm at bga.com (Milton Miller) Date: Mon, 5 Dec 2005 11:27:07 -0600 Subject: [RFC PATCH 2/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: <43935BB0.9050306@us.ibm.com> References: <43935BB0.9050306@us.ibm.com> Message-ID: <1e4db56531f2c11b786669e793c749d1@bga.com> Hi Ryan. On Dec 4, 2005, at 3:12 PM, Ryan S. Arnold wrote: > This patch shuffles around some data-type declarations and moves some > functions out of include/asm-ppc64/hvconsole.h and into a new > drivers/char/hvc_console.h file. > > Signed-off-by: Ryan S. Arnold > > diff -uNr linux-2.6.14-rc5/drivers/char/hvc_console.c > linux-2.6.14-rc5-cbe-fss/drivers/char/hvc_console.c > --- linux-2.6.14-rc5/drivers/char/hvc_console.c 2005-10-20 > 02:23:05.000000000 -0400 > +++ linux-2.6.14-rc5-cbe-fss/drivers/char/hvc_console.c 2005-12-02 > 17:20:58.095207576 -0500 > @@ -40,7 +40,7 @@ > #include > #include > #include > -#include > +#include "hvc_console.h" > > #define HVC_MAJOR 229 > #define HVC_MINOR 0 > @@ -61,11 +61,6 @@ > */ > #define HVC_ALLOC_TTY_ADAPTERS 8 Above should be in .h file, and consistent with console (or more). 
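One possible reading of this comment, sketched under assumptions: "console" is taken to mean MAX_NR_HVC_CONSOLES (16 in the quoted hvc_console.h), and the adapter count is derived from it in the shared header so the two cannot drift apart. The typedef name and the negative-array-size trick are illustrative; in-kernel code would more likely use BUILD_BUG_ON().

```c
/* Value taken from the quoted hvc_console.h hunk. */
#define MAX_NR_HVC_CONSOLES 16

/*
 * Assumption: size the tty adapter pool from the console limit, so it is
 * never smaller than the number of consoles (it may be made larger).
 */
#define HVC_ALLOC_TTY_ADAPTERS MAX_NR_HVC_CONSOLES

/*
 * Compile-time check: the array size becomes -1, and the build fails,
 * if someone later sets the adapter count below the console count.
 */
typedef char hvc_adapters_cover_consoles[
	(HVC_ALLOC_TTY_ADAPTERS >= MAX_NR_HVC_CONSOLES) ? 1 : -1];
```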
> > -#define N_OUTBUF 16 > -#define N_INBUF 16 > - > -#define __ALIGNED__ __attribute__((__aligned__(8))) > - > static struct tty_driver *hvc_driver; > static struct task_struct *hvc_task; > > @@ -76,22 +71,6 @@ > static int sysrq_pressed; > #endif > > -struct hvc_struct { > - spinlock_t lock; > - int index; > - struct tty_struct *tty; > - unsigned int count; > - int do_wakeup; > - char outbuf[N_OUTBUF] __ALIGNED__; > - int n_outbuf; > - uint32_t vtermno; > - struct hv_ops *ops; > - int irq_requested; > - int irq; > - struct list_head next; > - struct kobject kobj; /* ref count & hvc_struct lifetime */ > -}; > - > /* dynamic list of hvc_struct instances */ > static struct list_head hvc_structs = LIST_HEAD_INIT(hvc_structs); > > @@ -136,7 +115,6 @@ > return hp; > } > > - > /* > * Initial console vtermnos for console API usage prior to full > console > * initialization. Any vty adapter outside this range will not have > usable > @@ -154,6 +132,7 @@ > > void hvc_console_print(struct console *co, const char *b, unsigned > count) > { > + /* This [16] should probably use a #define */ N_OUTBUF perhaps? > char c[16] __ALIGNED__; > unsigned i = 0, n = 0; > int r, donecr = 0, index = co->index; > diff -uNr linux-2.6.14-rc5/drivers/char/hvc_console.h > linux-2.6.14-rc5-cbe-fss/drivers/char/hvc_console.h > --- linux-2.6.14-rc5/drivers/char/hvc_console.h 1969-12-31 > 19:00:00.000000000 -0500 > +++ linux-2.6.14-rc5-cbe-fss/drivers/char/hvc_console.h 2005-12-02 > 17:27:07.733180280 -0500 > @@ -0,0 +1,83 @@ > +/* > + * hvc_console.h > + * Copyright (C) 2005 IBM Corporation > + * > + * Author(s): > + * Ryan S. 
Arnold > + * > + * hvc_console header information: > + * moved here from include/asm-ppc64/hvconsole.h > + * and drivers/char/hvc_console.c > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA > 02111-1307 USA > + */ > + > +#ifndef HVC_CONSOLE_H > +#define HVC_CONSOLE_H > + > +#include > +#include > +#include > + > +/* > + * This is the max number of console adapters that can/will be found > as > + * console devices on first stage console init. Any number beyond > this range > + * can't be used as a console device but is still a valid tty device. > + */ > +#define MAX_NR_HVC_CONSOLES 16 > + > +/* > + * This is a design shortcoming, the number '16' is a vio required > buffer > + * size. This should be changeable per architecture, but hvc_struct > relies > + * upon it and that struct is used by all hvc_console backend > drivers. This > + * needs to be fixed. > + */ > +#define N_OUTBUF 16 > +#define N_INBUF 16 > + > +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long)))) > + A little bit generic for a .h file used by multiple files ... but see next comment. 
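If the macro does stay in a shared header, one way to address the naming concern is a driver-prefixed name. The sketch below assumes GCC; HVC_ALIGNED, HVC_N_OUTBUF, struct hvc_outbuf_sketch and hvc_outbuf_is_aligned() are hypothetical names that do not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefixed replacements for N_OUTBUF and __ALIGNED__. */
#define HVC_N_OUTBUF 16
#define HVC_ALIGNED __attribute__((__aligned__(sizeof(long))))

/* Cut-down stand-in for the buffer part of struct hvc_struct. */
struct hvc_outbuf_sketch {
	int n_outbuf;
	char outbuf[HVC_N_OUTBUF] HVC_ALIGNED;
};

/* Returns 1 when the output buffer starts on a long-sized boundary. */
static int hvc_outbuf_is_aligned(const struct hvc_outbuf_sketch *b)
{
	return ((uintptr_t)b->outbuf % sizeof(long)) == 0;
}
```

A prefixed name keeps the shared header from claiming a generic identifier such as __ALIGNED__ that other code might define differently.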
> +/* implemented by a low level driver */ > +struct hv_ops { > + int (*get_chars)(uint32_t vtermno, char *buf, int count); > + int (*put_chars)(uint32_t vtermno, const char *buf, int count); > +}; > + > +struct hvc_struct { > + spinlock_t lock; > + int index; > + struct tty_struct *tty; > + unsigned int count; > + int do_wakeup; > + char outbuf[N_OUTBUF] __ALIGNED__; > + int n_outbuf; > + uint32_t vtermno; > + struct hv_ops *ops; > + int irq_requested; > + int irq; > + struct list_head next; > + struct kobject kobj; /* ref count & hvc_struct lifetime */ > +}; Why are you putting the full structure definition in the .h file instead of just declaring the struct? It only encourages clients to dig into the structure instead of treating it as magic cookie. > + > +/* Register a vterm and a slot index for use as a console > (console_init) */ > +extern int hvc_instantiate(uint32_t vtermno, int index, struct hv_ops > *ops); > + > +/* register a vterm for hvc tty operation (module_init or hotplug > add) */ > +extern struct hvc_struct * __devinit hvc_alloc(uint32_t vtermno, int > irq, > + struct hv_ops *ops); > +/* remove a vterm from hvc tty operation (modele_exit or hotplug > remove) */ > +extern int __devexit hvc_remove(struct hvc_struct *hp); > + > +#endif // HVC_CONSOLE_H > diff -uNr linux-2.6.14-rc5/include/asm-ppc64/hvconsole.h > linux-2.6.14-rc5-cbe-fss/include/asm-ppc64/hvconsole.h > --- linux-2.6.14-rc5/include/asm-ppc64/hvconsole.h 2005-10-20 > 02:23:05.000000000 -0400 > +++ linux-2.6.14-rc5-cbe-fss/include/asm-ppc64/hvconsole.h 2005-11-14 > 16:24:02.000000000 -0500 > @@ -22,28 +22,7 @@ > #ifndef _PPC64_HVCONSOLE_H > #define _PPC64_HVCONSOLE_H > > -/* > - * This is the max number of console adapters that can/will be found > as > - * console devices on first stage console init. Any number beyond > this range > - * can't be used as a console device but is still a valid tty device. 
> - */ > -#define MAX_NR_HVC_CONSOLES 16 > - > -/* implemented by a low level driver */ > -struct hv_ops { > - int (*get_chars)(uint32_t vtermno, char *buf, int count); > - int (*put_chars)(uint32_t vtermno, const char *buf, int count); > -}; > extern int hvc_get_chars(uint32_t vtermno, char *buf, int count); > extern int hvc_put_chars(uint32_t vtermno, const char *buf, int > count); > > -struct hvc_struct; > - > -/* Register a vterm and a slot index for use as a console > (console_init) */ > -extern int hvc_instantiate(uint32_t vtermno, int index, struct hv_ops > *ops); > -/* register a vterm for hvc tty operation (module_init or hotplug > add) */ > -extern struct hvc_struct * __devinit hvc_alloc(uint32_t vtermno, int > irq, > - struct hv_ops *ops); > -/* remove a vterm from hvc tty operation (modele_exit or hotplug > remove) */ > -extern int __devexit hvc_remove(struct hvc_struct *hp); > #endif /* _PPC64_HVCONSOLE_H */ Did I miss the addition of hvc_console.h to hvc_vio.c (and hvsi)? I'm ok moving the .h but it does constrain the clients to be in drivers/char. milton From miltonm at bga.com Tue Dec 6 04:27:55 2005 From: miltonm at bga.com (Milton Miller) Date: Mon, 5 Dec 2005 11:27:55 -0600 Subject: [RFC PATCH 3/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: <43935BB5.9030302@us.ibm.com> References: <43935BB5.9030302@us.ibm.com> Message-ID: <2b19bee9bd90cfee311d8076b026add4@bga.com> On Dec 4, 2005, at 3:12 PM, Ryan S. Arnold wrote: > This patch modifies the defconfig file for the CELL simulator and > changes the Makefile and Kconfig to add hvc_fss. > > Signed-off-by: Ryan S. 
Arnold > > > diff -uNr linux-2.6.14-rc5/arch/ppc64/configs/cbesim_defconfig > linux-2.6.14-rc5-cbe-fss/arch/ppc64/configs/cbesim_defconfig > --- linux-2.6.14-rc5/arch/ppc64/configs/cbesim_defconfig 2005-11-14 > 12:26:32.000000000 -0500 > +++ > linux-2.6.14-rc5-cbe-fss/arch/ppc64/configs/cbesim_defconfig 2005-11 > -14 15:59:05.000000000 -0500 > @@ -322,7 +322,7 @@ > CONFIG_UNIX98_PTYS=y > # CONFIG_LEGACY_PTYS is not set > # CONFIG_RTASCONS is not set > -CONFIG_BOGUS_CONSOLE=y > +CONFIG_HVC_FSS=y > > # > # IPMI > diff -uNr linux-2.6.14-rc5/drivers/char/Kconfig > linux-2.6.14-rc5-cbe-fss/drivers/char/Kconfig > --- linux-2.6.14-rc5/drivers/char/Kconfig 2005-11-14 > 12:26:32.000000000 -0500 > +++ linux-2.6.14-rc5-cbe-fss/drivers/char/Kconfig 2005-12-02 > 17:44:04.490273872 -0500 > @@ -552,25 +552,37 @@ > > If unsure, say N. > > +config HVC_DRIVER > + bool "PowerPC virtual console front-end support" > + depends on PPC_PSERIES || PPC_BPA || PPC_RTAS > + help > + Users of pSeries machines that want to utilize the hvc console > front-end > + module for their backend console driver should select this option. > + It will automatically be selected if one of the back-end console > drivers > + is selected. > + Lets just keep this hidden -- so take out depends (its all generic code) and just say bool (without any quoted text). The help text could then be made more generic. > config HVC_CONSOLE > bool "pSeries Hypervisor Virtual Console support" > depends on PPC_PSERIES > + select HVC_DRIVER > help > pSeries machines when partitioned support a hypervisor virtual > console. This driver allows each pSeries partition to have a > console > which is accessed via the HMC. > > -config RTASCONS > - bool "RTAS firmware console support" > - depends on PPC_RTAS > - help > - RTAS console support. 
> - > -config BOGUS_CONSOLE > - bool "Simulator bogus console support" > +config HVC_FSS > + bool "IBM Full System Simulator Console support" > depends on PPC_PSERIES || PPC_BPA > + select HVC_DRIVER > + help > + IBM Full System Simulator Console device driver which makes use of > + the HVC_DRIVER front end. > + > +config RTASCONS > + bool "RTAS firmware console support" > + depends on PPC_RTAS > help > - IBM System Simulator bogus console device driver. > + RTAS console support. > > config HVCS > tristate "IBM Hypervisor Virtual Console Server support" > diff -uNr linux-2.6.14-rc5/drivers/char/Makefile > linux-2.6.14-rc5-cbe-fss/drivers/char/Makefile > --- linux-2.6.14-rc5/drivers/char/Makefile 2005-11-14 > 12:26:32.000000000 -0500 > +++ linux-2.6.14-rc5-cbe-fss/drivers/char/Makefile 2005-12-02 > 17:24:12.583189272 -0500 > @@ -41,12 +41,13 @@ > obj-$(CONFIG_SX) += sx.o generic_serial.o > obj-$(CONFIG_RIO) += rio/ generic_serial.o > obj-$(CONFIG_RTASCONS) += rtascons.o > -obj-$(CONFIG_BOGUS_CONSOLE) +=bogus_console.o > -obj-$(CONFIG_HVC_CONSOLE) += hvc_console.o hvc_vio.o hvsi.o > +obj-$(CONFIG_HVC_DRIVER) += hvc_console.o > +obj-$(CONFIG_HVC_CONSOLE) += hvc_vio.o hvsi.o > +obj-$(CONFIG_HVC_FSS) += hvc_fss.o > obj-$(CONFIG_RAW_DRIVER) += raw.o > obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o > obj-$(CONFIG_MMTIMER) += mmtimer.o > -obj-$(CONFIG_VIOCONS) += viocons.o > +obj-$(CONFIG_VIOCONS) += viocons.o > obj-$(CONFIG_VIOTAPE) += viotape.o > obj-$(CONFIG_HVCS) += hvcs.o > obj-$(CONFIG_SGI_MBCS) += mbcs.o milton From miltonm at bga.com Tue Dec 6 04:27:57 2005 From: miltonm at bga.com (Milton Miller) Date: Mon, 5 Dec 2005 11:27:57 -0600 Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: <43935BBF.6080005@us.ibm.com> References: <43935BBF.6080005@us.ibm.com> Message-ID: <5f0ba2ea6429728d231d7baf74a7018d@bga.com> Mostly style and less confusion. On Dec 4, 2005, at 3:12 PM, Ryan S. 
Arnold wrote: > This patch adds the hvc_fss.c driver file. > > Signed-off-by: Ryan S. Arnold > diff -uNr linux-2.6.14-rc5/drivers/char/hvc_fss.c > linux-2.6.14-rc5-cbe-fss/drivers/char/hvc_fss.c > --- linux-2.6.14-rc5/drivers/char/hvc_fss.c 1969-12-31 > 19:00:00.000000000 -0500 > +++ linux-2.6.14-rc5-cbe-fss/drivers/char/hvc_fss.c 2005-12-02 > 17:54:19.243249984 -0500 > @@ -0,0 +1,148 @@ > +/* > + * IBM Full System Simulator driver interface to hvc_console.c > + * > + * (C) Copyright IBM Corporation 2001-2005 > + * Author(s): Maximino Augilar > + * : Ryan S. Arnold > + * > + * inspired by drivers/char/hvc_console.c > + * written by Anton Blanchard and Paul Mackerras > + * > + * Some code is from the IBM Full System Simulator Group in ARL. > + * Author: Patrick Bohrer > + * > + * Much of this code was moved here from the IBM Full System Simulator > + * Bogus console driver in order to reuse the framework provided by > the hvc > + * console driver. Ryan S. Arnold > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA > 02111-1307 USA > + */ > + > +#include > +#include > +#include > +#include "hvc_console.h" > + > +static uint32_t hvc_fss_vtermno = 0; This might be confusing ... its not a terminal number so much as a cookie. Made me review the order of arguments below. 
It could be #define hvc_fss_cookie 0x (unless you expect more channels later). > +struct hvc_struct *hvc_fss_dev; > + > +static inline int callthru0(int command) > +{ > + register int c asm ("r3") = command; > + > + asm volatile (".long 0x000EAEB0" : "=r" (c): "r" (c)); > + return((c)); > +} > + > +static inline int callthru3(int command, unsigned long arg1, unsigned > long arg2, unsigned long arg3) > +{ > + register int c asm ("r3") = command; > + register unsigned long a1 asm ("r4") = arg1; > + register unsigned long a2 asm ("r5") = arg2; > + register unsigned long a3 asm ("r6") = arg3; > + > + asm volatile (".long 0x000EAEB0" : "=r" (c): "r" (c), "r" (a1), "r" > (a2), "r" (a3)); > + return((c)); > +} nit: =&r ? (not sure but I thought that made something input and output) > + > +static inline int hvc_fss_write_console(uint32_t vtermno, const char > *buf, int count) > +{ > + int ret = 0; assigning =0 when unconditionally setting below is redundant. > + ret = callthru3(0, (unsigned long)buf, > + (unsigned long)count, (unsigned long)1); > + if (ret != 0) { > + return (count - ret); /* is this right? */ > + } > + > + /* the calling routine expects to receive the number of bytes sent */ > + return count; > +} > + > +static inline int hvc_fss_read_console(uint32_t vtermno, char *buf, > int count) > +{ > + unsigned long got; > + int c; > + int i; > + > + for (got = 0, i = 0; i < count; i++) { > + Here I would go the other way, and initialize got above, and only put i=0 in the for statement ... I had to look twice to find the initialization. 
> + if (( c = callthru0(60) ) != -1) { > + buf[i] = c; > + ++got; > + } > + else } and else on same line > + break; > + } > + return got; > +} > + > +static struct hv_ops hvc_fss_get_put_ops = { > + .get_chars = hvc_fss_read_console, > + .put_chars = hvc_fss_write_console, > +}; > + > +static int hvc_fss_init(void) > +{ > + /* Register a single device with the driver */ > + struct hvc_struct *hp; > + > + if(!__onsim()) { > + return -1; > + } > + > + if(hvc_fss_dev) { > + return -1; /* This shouldn't happen */ > + } > + > + /* Allocate an hvc_struct for the console device we instantiated > + * earlier. Save off hp so that we can return it on exit */ > + hp = hvc_alloc(hvc_fss_vtermno, NO_IRQ, &hvc_fss_get_put_ops); > + if (IS_ERR(hp)) > + return PTR_ERR(hp); > + hvc_fss_dev = hp; > + return 0; > +} > +module_init(hvc_fss_init); > + > +/* This will tear down the tty portion of the driver */ > +static void __exit hvc_fss_exit(void) > +{ > + struct hvc_struct *hp_safe; > + /* Hopefully this isn't premature */ > + if (!hvc_fss_dev) > + return; > + > + hp_safe = hvc_fss_dev; > + hvc_fss_dev = NULL; > + > + /* Really the fun isn't over until the worker thread breaks down and > the > + * tty cleans up */ > + hvc_remove(hp_safe); > +} > +module_exit(hvc_fss_exit); /* before drivers/char/hvc_console.c */ > + > +/* This will happen prior to module init. There is no tty at this > time? */ > +static int hvc_fss_console_init(void) > +{ > + /* Don't register if we aren't running on the simulator */ > + if (__onsim()) { > + /* Tell the driver we know of one console device. We > + * shouldn't get a collision on the index as long as no-one > + * else instantiates on hardware they don't have. 
*/ > + hvc_instantiate(hvc_fss_vtermno, 0, &hvc_fss_get_put_ops ); > + } > + return 0; > +} > +console_initcall(hvc_fss_console_init); milton From kravetz at us.ibm.com Tue Dec 6 06:25:52 2005 From: kravetz at us.ibm.com (Mike Kravetz) Date: Mon, 5 Dec 2005 11:25:52 -0800 Subject: [PATCH 3/11] powerpc: Seperate usage of KERNELBASE and PAGE_OFFSET In-Reply-To: <20051205003934.643E26887C@ozlabs.org> References: <1133743149.268607.418162138937.qpush@concordia> <20051205003934.643E26887C@ozlabs.org> Message-ID: <20051205192552.GA5535@w-mikek2.ibm.com> On Sun, Dec 04, 2005 at 06:39:20PM +0000, Michael Ellerman wrote: > Index: kexec/arch/powerpc/mm/hash_utils_64.c > =================================================================== > --- kexec.orig/arch/powerpc/mm/hash_utils_64.c > +++ kexec/arch/powerpc/mm/hash_utils_64.c > @@ -456,7 +456,7 @@ void __init htab_initialize(void) > > /* create bolted the linear mapping in the hash table */ > for (i=0; i < lmb.memory.cnt; i++) { > - base = lmb.memory.region[i].base + KERNELBASE; > + base = (unsigned long)__va(lmb.memory.region[i].base); > size = lmb.memory.region[i].size; I think you will want to make a similar change to the routine add_memory() in powerpc/mm/mem.c. This routine was based on htab_initialize's call to htab_bolt_mapping(). 
int __devinit add_memory(u64 start, u64 size) { struct pglist_data *pgdata = NODE_DATA(0); struct zone *zone; unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; start += KERNELBASE; create_section_mapping(start, start + size); /* this should work for most non-highmem platforms */ zone = pgdata->node_zones; return __add_pages(zone, start_pfn, nr_pages); return 0; } -- Mike From kravetz at us.ibm.com Tue Dec 6 07:06:42 2005 From: kravetz at us.ibm.com (Mike Kravetz) Date: Mon, 5 Dec 2005 12:06:42 -0800 Subject: [PATCH] reworked: numa placement for dynamically added memory Message-ID: <20051205200642.GA20613@w-mikek2.ibm.com> Here is a reworked version of the patch with changes suggested by Nathan. Again, this patch depends on: http://ozlabs.org/pipermail/linuxppc64-dev/2005-December/006923.html This patch places dynamically added memory within the appropriate numa node. A new routine hot_add_scn_to_nid() replicates most of the memory scanning code in parse_numa_properties(). Signed-off-by: Mike Kravetz diff -Naupr linux-2.6.15-rc5-git1.dep/arch/powerpc/mm/mem.c linux-2.6.15-rc5-git1.work/arch/powerpc/mm/mem.c --- linux-2.6.15-rc5-git1.dep/arch/powerpc/mm/mem.c 2005-12-04 05:10:42.000000000 +0000 +++ linux-2.6.15-rc5-git1.work/arch/powerpc/mm/mem.c 2005-12-05 19:57:50.000000000 +0000 @@ -114,18 +114,17 @@ void online_page(struct page *page) num_physpages++; } -/* - * This works only for the non-NUMA case. Later, we'll need a lookup - * to convert from real physical addresses to nid, that doesn't use - * pfn_to_nid(). 
- */ int __devinit add_memory(u64 start, u64 size) { - struct pglist_data *pgdata = NODE_DATA(0); + struct pglist_data *pgdata; struct zone *zone; + int nid; unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; + nid = hot_add_scn_to_nid(start); + pgdata = NODE_DATA(nid); + start += KERNELBASE; create_section_mapping(start, start + size); diff -Naupr linux-2.6.15-rc5-git1.dep/arch/powerpc/mm/numa.c linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c --- linux-2.6.15-rc5-git1.dep/arch/powerpc/mm/numa.c 2005-12-05 19:54:24.000000000 +0000 +++ linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c 2005-12-05 19:57:50.000000000 +0000 @@ -37,6 +37,7 @@ EXPORT_SYMBOL(node_data); static bootmem_data_t __initdata plat_node_bdata[MAX_NUMNODES]; static int min_common_depth; +static int n_mem_addr_cells, n_mem_size_cells; /* * We need somewhere to store start/end/node for each region until we have @@ -267,7 +268,7 @@ static void __init get_n_mem_cells(int * of_node_put(memory); } -static unsigned long __init read_n_cells(int n, unsigned int **buf) +static unsigned long __devinit read_n_cells(int n, unsigned int **buf) { unsigned long result = 0; @@ -374,7 +375,6 @@ static int __init parse_numa_properties( { struct device_node *cpu = NULL; struct device_node *memory = NULL; - int n_addr_cells, n_size_cells; int max_domain; unsigned long i; @@ -413,7 +413,7 @@ static int __init parse_numa_properties( } } - get_n_mem_cells(&n_addr_cells, &n_size_cells); + get_n_mem_cells(&n_mem_addr_cells, &n_mem_size_cells); memory = NULL; while ((memory = of_find_node_by_type(memory, "memory")) != NULL) { unsigned long start; @@ -430,8 +430,8 @@ static int __init parse_numa_properties( ranges = memory->n_addrs; new_range: /* these are order-sensitive, and modify the buffer pointer */ - start = read_n_cells(n_addr_cells, &memcell_buf); - size = read_n_cells(n_size_cells, &memcell_buf); + start = read_n_cells(n_mem_addr_cells, &memcell_buf); + size = 
read_n_cells(n_mem_size_cells, &memcell_buf); numa_domain = of_node_numa_domain(memory); @@ -717,3 +717,50 @@ static int __init early_numa(char *p) return 0; } early_param("numa", early_numa); + +#ifdef CONFIG_MEMORY_HOTPLUG +/* + * Find the node associated with a hot added memory section. Section + * corresponds to a SPARSEMEM section, not an LMB. It is assumed that + * sections are fully contained within a single LMB. + */ +int hot_add_scn_to_nid(unsigned long scn_addr) +{ + struct device_node *memory = NULL; + + if (!numa_enabled || (min_common_depth < 0)) + return 0; + + while ((memory = of_find_node_by_type(memory, "memory")) != NULL) { + unsigned long start, size; + int numa_domain, ranges; + unsigned int *memcell_buf; + unsigned int len; + + memcell_buf = (unsigned int *)get_property(memory, "reg", &len); + if (!memcell_buf || len <= 0) + continue; + + ranges = memory->n_addrs; /* ranges in cell */ +ha_new_range: + start = read_n_cells(n_mem_addr_cells, &memcell_buf); + size = read_n_cells(n_mem_size_cells, &memcell_buf); + numa_domain = of_node_numa_domain(memory); + + /* Domains not present at boot default to 0 */ + if (!node_online(numa_domain)) + numa_domain = any_online_node(NODE_MASK_ALL); + + if ((scn_addr >= start) && (scn_addr < (start + size))) { + of_node_put(memory); + return numa_domain; + } + + if (--ranges) /* process all ranges in cell */ + goto ha_new_range; + } + + BUG(); /* section address should be found above */ + return 0; +} +#endif /* CONFIG_MEMORY_HOTPLUG */ diff -Naupr linux-2.6.15-rc5-git1.dep/include/asm-powerpc/sparsemem.h linux-2.6.15-rc5-git1.work/include/asm-powerpc/sparsemem.h --- linux-2.6.15-rc5-git1.dep/include/asm-powerpc/sparsemem.h 2005-12-04 05:10:42.000000000 +0000 +++ linux-2.6.15-rc5-git1.work/include/asm-powerpc/sparsemem.h 2005-12-05 19:57:50.000000000 +0000 @@ -13,6 +13,14 @@ #ifdef CONFIG_MEMORY_HOTPLUG extern void create_section_mapping(unsigned long start, unsigned long end); +#ifdef CONFIG_NUMA +extern int 
hot_add_scn_to_nid(unsigned long scn_addr);
+#else
+static inline int hot_add_scn_to_nid(unsigned long scn_addr)
+{
+	return 0;
+}
+#endif /* CONFIG_NUMA */
 #endif /* CONFIG_MEMORY_HOTPLUG */

 #endif /* CONFIG_SPARSEMEM */

From jdl at freescale.com Tue Dec 6 08:06:48 2005
From: jdl at freescale.com (Jon Loeliger)
Date: Mon, 05 Dec 2005 15:06:48 -0600
Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware
Message-ID: <1133816807.8577.50.camel@cashmere.sps.mot.com>

Folks,

Included below is a proposed Revision 0.5 of the "Booting the Linux/ppc kernel without Open Firmware" document.

This modification primarily extends Revision 0.4 by adding definitions for OF Nodes that cover the System-On-a-Chip features found on PPC parts. It also generalizes some earlier wording that pertained only to PPC64 parts and covers the new, merged PPC 32 and 64 parts together. Finally, minor typos, style consistency and grammar problems were corrected.

Please review this document, primarily Chapter VI, so that we may all proceed with the PPC32/64 merge tree development in a consistent, unified direction.

While some effort has been made to follow standard OF nomenclature, terminology and standards, I confess that the authors of these additions are not experts in this area and may have missed details or key insights, or allowed for glaring errors. Our goal is collective improvement here, so be gentle when you call us stupid. :-)

Thanks,
jdl

Booting the Linux/ppc kernel without Open Firmware
--------------------------------------------------

(c) 2005 Benjamin Herrenschmidt , IBM Corp.
(c) 2005 Becky Bruce , Freescale Semiconductor, FSL SOC and 32-bit additions

May 18, 2005: Rev 0.1
  - Initial draft, no chapter III yet.

May 19, 2005: Rev 0.2
  - Add chapter III and bits & pieces here and there; clarify the fact that a lot of things are optional: the kernel only requires a very small device tree, though providing as complete a tree as possible is encouraged.
May 24, 2005: Rev 0.3
  - Specify that the DT block has to be in RAM
  - Misc fixes
  - Define version 3 and new format version 16 for the DT block (version 16 needs kernel patches, which will be forwarded separately). The string block now has a size, and the full path is replaced by the unit name for more compactness. linux,phandle is made optional; only nodes that are referenced by other nodes need it. The "name" property is now automatically deduced from the unit name.

June 1, 2005: Rev 0.4
  - Correct confusion between OF_DT_END and OF_DT_END_NODE in structure definition.
  - Change version 16 format to always align property data to 4 bytes. Since tokens are already aligned, that means no specific alignment is required between property size and property data. The old style variable alignment would make it impossible to do "simple" insertion of properties using memmove (thanks Milton for noticing). Updated kernel patch as well.
  - Correct a few more alignment constraints
  - Add a chapter about the device-tree compiler and the textual representation of the tree that can be "compiled" by dtc.

November 21, 2005: Rev 0.5
  - Additions/generalizations for 32-bit
  - Changed to reflect the new arch/powerpc structure
  - Added chapter VI

ToDo:
  - Add some definitions of interrupt tree (simple/complex)
  - Add some definitions for pci host bridges
  - Add some common address format examples
  - Add definitions for standard properties and "compatible" names for cells that are not already defined by the existing OF spec.
  - Compare FSL SOC use of PCI to standard and make sure no new node definition required.
  - Add more information about node definitions for SOC devices that currently have no standard, like the FSL CPM.
I - Introduction
================

During the recent development of the Linux/ppc64 kernel, and more specifically, the addition of new platform types outside of the old IBM pSeries/iSeries pair, it was decided to enforce some strict rules regarding the kernel entry and bootloader <-> kernel interfaces, in order to avoid the degeneration that the ppc32 kernel entry point and the way a new platform is added to the kernel had become. The legacy iSeries platform breaks those rules as it predates this scheme, but no new board support will be accepted in the main tree that doesn't follow them properly. In addition, since the advent of the arch/powerpc merged architecture for ppc32 and ppc64, new 32-bit platforms and 32-bit platforms which move into arch/powerpc will be required to use these rules as well.

The main requirement, defined in more detail below, is the presence of a device-tree whose format is modeled on the Open Firmware specification. However, in order to make life easier for embedded board vendors, the kernel doesn't require the device-tree to represent every device in the system and only requires some nodes and properties to be present. This will be described in detail in section III, but, for example, the kernel does not require you to create a node for every PCI device in the system. It is a requirement to have a node for PCI host bridges in order to provide interrupt routing information and memory/IO ranges, among others. It is also recommended to define nodes for on-chip devices and other busses that don't specifically fit in an existing OF specification. This gives the kernel great flexibility in the way it can then probe those and match drivers to devices, without having to hard code all sorts of tables. It also makes it easier for board vendors to do minor hardware upgrades without significantly impacting the kernel code or cluttering it with special cases.
1) Entry point for arch/powerpc ------------------------------- There is one and one single entry point to the kernel, at the start of the kernel image. That entry point supports two calling conventions: a) Boot from Open Firmware. If your firmware is compatible with Open Firmware (IEEE 1275) or provides an OF compatible client interface API (support for "interpret" callback of forth words isn't required), you can enter the kernel with: r5 : OF callback pointer as defined by IEEE 1275 bindings to powerpc. Only the 32 bit client interface is currently supported r3, r4 : address & length of an initrd if any or 0 The MMU is either on or off; the kernel will run the trampoline located in arch/powerpc/kernel/prom_init.c to extract the device-tree and other information from open firmware and build a flattened device-tree as described in b). prom_init() will then re-enter the kernel using the second method. This trampoline code runs in the context of the firmware, which is supposed to handle all exceptions during that time. b) Direct entry with a flattened device-tree block. This entry point is called by a) after the OF trampoline and can also be called directly by a bootloader that does not support the Open Firmware client interface. It is also used by "kexec" to implement "hot" booting of a new kernel from a previous running one. This method is what I will describe in more details in this document, as method a) is simply standard Open Firmware, and thus should be implemented according to the various standard documents defining it and its binding to the PowerPC platform. The entry point definition then becomes: r3 : physical pointer to the device-tree block (defined in chapter II) in RAM r4 : physical pointer to the kernel itself. This is used by the assembly code to properly disable the MMU in case you are entering the kernel with MMU enabled and a non-1:1 mapping. 
r5 : NULL (to differentiate from method a)

Note about SMP entry: Either your firmware puts your other CPUs in some sleep loop or spin loop in ROM where you can get them out via a soft reset or some other means, in which case you don't need to care, or you'll have to enter the kernel with all CPUs. The way to do that with method b) will be described in a later revision of this document.

2) Board support
----------------

64-bit kernels:

Board supports (platforms) are not exclusive config options. An arbitrary set of board supports can be built in a single kernel image. The kernel will "know" what set of functions to use for a given platform based on the content of the device-tree. Thus, you should:

a) add your platform support as a _boolean_ option in arch/powerpc/Kconfig, following the example of PPC_PSERIES, PPC_PMAC and PPC_MAPLE. The latter is probably a good example of a board support to start from.

b) create your main platform file as "arch/powerpc/platforms/myplatform/myboard_setup.c" and add it to the Makefile under the condition of your CONFIG_ option. This file will define a structure of type "ppc_md" containing the various callbacks that the generic code will use to get to your platform specific code.

c) add a reference to your "ppc_md" structure in the "machines" table in arch/powerpc/kernel/setup_64.c if you are a 64-bit platform.

d) request and get assigned a platform number (see PLATFORM_* constants in include/asm-powerpc/processor.h)

32-bit embedded kernels:

Currently, board support is essentially an exclusive config option. The kernel is configured for a single platform. Part of the reason for this is to keep kernels on embedded systems small and efficient; part of this is due to the fact the code is already that way. In the future, a kernel may support multiple platforms, but only if the platforms feature the same core architecture.
A single kernel build cannot support both configurations with Book E and configurations with classic Powerpc architectures. 32-bit embedded platforms that are moved into arch/powerpc using a flattened device tree should adopt the merged tree practice of setting ppc_md up dynamically, even though the kernel is currently built with support for only a single platform at a time. This allows unification of the setup code, and will make it easier to go to a multiple-platform-support model in the future. NOTE: I believe the above will be true once Ben's done with the merge of the boot sequences.... someone speak up if this is wrong! To add a 32-bit embedded platform support, follow the instructions for 64-bit platforms above, with the exception that the Kconfig option should be set up such that the kernel builds exclusively for the platform selected. The processor type for the platform should enable another config option to select the specific board supported. NOTE: If ben doesn't merge the setup files, may need to change this to point to setup_32.c I will describe later the boot process and various callbacks that your platform should implement. II - The DT block format ======================== This chapter defines the actual format of the flattened device-tree passed to the kernel. The actual content of it and kernel requirements are described later. You can find example of code manipulating that format in various places, including arch/powerpc/kernel/prom_init.c which will generate a flattened device-tree from the Open Firmware representation, or the fs2dt utility which is part of the kexec tools which will generate one from a filesystem representation. It is expected that a bootloader like uboot provides a bit more support, that will be discussed later as well. Note: The block has to be in main memory. It has to be accessible in both real mode and virtual mode with no mapping other than main memory. 
If you are writing a simple flash bootloader, it should copy the block to RAM before passing it to the kernel.

1) Header
---------

The kernel is entered with r3 pointing to an area of memory that is roughly described in include/asm-powerpc/prom.h by the structure boot_param_header:

struct boot_param_header
{
        u32     magic;                  /* magic word OF_DT_HEADER */
        u32     totalsize;              /* total size of DT block */
        u32     off_dt_struct;          /* offset to structure */
        u32     off_dt_strings;         /* offset to strings */
        u32     off_mem_rsvmap;         /* offset to memory reserve map */
        u32     version;                /* format version */
        u32     last_comp_version;      /* last compatible version */

        /* version 2 fields below */
        u32     boot_cpuid_phys;        /* Which physical CPU id we're booting on */
        /* version 3 fields below */
        u32     size_dt_strings;        /* size of the strings block */
};

Along with the constants:

/* Definitions used by the flattened device tree */
#define OF_DT_HEADER            0xd00dfeed      /* 4: version, 4: total size */
#define OF_DT_BEGIN_NODE        0x1             /* Start node: full name */
#define OF_DT_END_NODE          0x2             /* End node */
#define OF_DT_PROP              0x3             /* Property: name off, size, content */
#define OF_DT_END               0x9

All values in this header are in big endian format; the various fields in this header are defined more precisely below. All "offset" values are in bytes from the start of the header; that is, from the value of r3.

- magic

  This is a magic value that "marks" the beginning of the device-tree block header. It contains the value 0xd00dfeed and is defined by the constant OF_DT_HEADER.

- totalsize

  This is the total size of the DT block including the header. The "DT" block should enclose all data structures defined in this chapter (which are pointed to by offsets in this header). That is, the device-tree structure, strings, and the memory reserve map.

- off_dt_struct

  This is an offset from the beginning of the header to the start of the "structure" part of the device tree.
     (see 2) device tree)

   - off_dt_strings

     This is an offset from the beginning of the header to the start of the "strings" part of the device-tree

   - off_mem_rsvmap

     This is an offset from the beginning of the header to the start of the reserved memory map. This map is a list of pairs of 64 bit integers. Each pair is a physical address and a size. The list is terminated by an entry of size 0. This map provides the kernel with a list of physical memory areas that are "reserved" and thus not to be used for memory allocations, especially during early initialization. The kernel needs to allocate memory during boot for things like un-flattening the device-tree, allocating an MMU hash table, etc... Those allocations must be done in such a way as to avoid overwriting critical things like, on Open Firmware capable machines, the RTAS instance, or on some pSeries, the TCE tables used for the iommu. Typically, the reserve map should contain _at least_ this DT block itself (header,total_size). If you are passing an initrd to the kernel, you should reserve it as well. You do not need to reserve the kernel image itself. The map should be 64 bit aligned.

   - version

     This is the version of this structure. Version 1 stops here. Version 2 adds an additional field boot_cpuid_phys. Version 3 adds the size of the strings block, allowing the kernel to reallocate it easily at boot and free up the unused flattened structure after expansion. Version 16 introduces a new more "compact" format for the tree itself that is however not backward compatible. You should always generate a structure of the highest version defined at the time of your implementation. Currently that is version 16, unless you explicitly aim at being backward compatible.

   - last_comp_version

     Last compatible version. This indicates down to what version of the DT block you are backward compatible.
     For example, version 2 is backward compatible with version 1 (that is, a kernel built for version 1 will be able to boot with a version 2 format). You should put a 1 in this field if you generate a device tree of version 1 to 3, or 0x10 if you generate a tree of version 0x10 using the new unit name format.

   - boot_cpuid_phys

     This field only exists in headers of version 2 and later. It indicates which physical CPU ID is calling the kernel entry point. This is used, among others, by kexec. If you are on an SMP system, this value should match the content of the "reg" property of the CPU node in the device-tree corresponding to the CPU calling the kernel entry point (see further chapters for more information on the required device-tree contents)

So the typical layout of a DT block (though the various parts don't need to be in that order) looks like this (addresses go from top to bottom):

             ------------------------------
       r3 -> |  struct boot_param_header  |
             ------------------------------
             |      (alignment gap) (*)   |
             ------------------------------
             |      memory reserve map    |
             ------------------------------
             |      (alignment gap)       |
             ------------------------------
             |                            |
             |    device-tree structure   |
             |                            |
             ------------------------------
             |      (alignment gap)       |
             ------------------------------
             |                            |
             |     device-tree strings    |
             |                            |
      -----> ------------------------------
      |
      |
      --- (r3 + totalsize)

  (*) The alignment gaps are not necessarily present; their presence and size are dependent on the various alignment requirements of the individual data blocks.


2) Device tree generalities
---------------------------

This device-tree itself is separated in two different blocks, a structure block and a strings block. Both need to be aligned to a 4 byte boundary.

First, let's quickly describe the device-tree concept before detailing the storage format. This chapter does _not_ describe the detail of the required types of nodes & properties for the kernel, this is done later in chapter III.
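
As a rough illustration of the header rules from 1) and of the 4 byte alignment rule just stated, here is a minimal sketch of a sanity check that a bootloader or a test tool could run on a blob before handing it to the kernel. Note that dt_header_check() and be32_at() are hypothetical names made up for this example, not kernel functions. The field offsets simply follow the order of the fields in struct boot_param_header, and the sketch assumes the blob itself starts at an address that is at least 8 byte aligned, so that checking the offsets is enough to check the addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define OF_DT_HEADER	0xd00dfeedu

/* Byte offsets of the version 1 fields, in struct boot_param_header order */
enum {
	HDR_MAGIC	= 0,
	HDR_TOTALSIZE	= 4,
	HDR_OFF_STRUCT	= 8,
	HDR_OFF_STRINGS	= 12,
	HDR_OFF_RSVMAP	= 16,
	HDR_V1_SIZE	= 28	/* seven u32 fields */
};

/* Read a big endian 32 bit value, whatever the host endianness is */
static uint32_t be32_at(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: returns 0 if the header looks plausible, or a
 * negative value naming the first rule that failed.
 */
static int dt_header_check(const unsigned char *blob, size_t len)
{
	uint32_t totalsize, off_struct, off_strings, off_rsvmap;

	if (len < HDR_V1_SIZE)
		return -1;		/* too short to hold a header */
	if (be32_at(blob + HDR_MAGIC) != OF_DT_HEADER)
		return -2;		/* bad magic */

	totalsize = be32_at(blob + HDR_TOTALSIZE);
	if (totalsize < HDR_V1_SIZE || totalsize > len)
		return -3;		/* block does not fit the buffer */

	off_struct = be32_at(blob + HDR_OFF_STRUCT);
	off_strings = be32_at(blob + HDR_OFF_STRINGS);
	off_rsvmap = be32_at(blob + HDR_OFF_RSVMAP);
	if (off_struct >= totalsize || off_strings >= totalsize ||
	    off_rsvmap >= totalsize)
		return -4;		/* an offset points outside the block */

	if ((off_struct & 3) || (off_strings & 3))
		return -5;		/* structure/strings need 4 byte alignment */
	if (off_rsvmap & 7)
		return -6;		/* reserve map should be 64 bit aligned */

	return 0;
}
```

A caller would typically refuse to boot, or at least complain loudly, on any non-zero return value. The sketch deliberately ignores the version dependent fields (boot_cpuid_phys, size_dt_strings).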

The device-tree layout is strongly inherited from the definition of the Open Firmware IEEE 1275 device-tree. It's basically a tree of nodes, each node having two or more named properties. A property can have a value or not.

It is a tree, so each node has one and only one parent except for the root node which has no parent.

A node has 2 names. The actual node name is generally contained in a property of type "name" in the node property list whose value is a zero terminated string and is mandatory for version 1 to 3 of the format definition (as it is in Open Firmware). Version 0x10 makes it optional as it can generate it from the unit name defined below.

There is also a "unit name" that is used to differentiate nodes with the same name at the same level. It is usually made of the node name, the "@" sign, and a "unit address", whose definition is specific to the bus type the node sits on.

The unit name doesn't exist as a property per se but is included in the device-tree structure. It is typically used to represent "path" in the device-tree. More details about the actual format of these will be below.

The kernel powerpc generic code does not make any formal use of the unit address (though some board support code may do) so the only real requirement here for the unit address is to ensure uniqueness of the node unit name at a given level of the tree. Nodes with no notion of address and no possible sibling of the same name (like /memory or /cpus) may omit the unit address in the context of this specification, or use the "@0" default unit address. The unit name is used to define a node "full path", which is the concatenation of all parent node unit names separated with "/".

The root node doesn't have a defined name, and isn't required to have a name property either if you are using version 3 or earlier of the format. It also has no unit address (no @ symbol followed by a unit address). The root node unit name is thus an empty string. The full path to the root node is "/".
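
To make the naming rules above a bit more concrete, here is a small sketch of how a unit name can be split into its node name and unit address, and how a full path can be built from a parent path and a unit name. Both dt_split_unit_name() and dt_join_path() are hypothetical helpers written for this example only; nothing of that name exists in the kernel.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the node name part of a unit name (everything before the "@",
 * or the whole unit name if there is no "@") into buf, truncating if
 * needed. Returns a pointer to the unit address inside unit_name, or
 * NULL if the unit name has no unit address.
 */
static const char *dt_split_unit_name(const char *unit_name,
				      char *buf, size_t buflen)
{
	const char *at = strchr(unit_name, '@');
	size_t n = at ? (size_t)(at - unit_name) : strlen(unit_name);

	if (buflen > 0) {
		if (n >= buflen)
			n = buflen - 1;
		memcpy(buf, unit_name, n);
		buf[n] = '\0';
	}
	return at ? at + 1 : NULL;
}

/*
 * Build the full path of a child from the full path of its parent and
 * the unit name of the child. The root path "/" already ends with the
 * separator, so none is added in that case. Returns 0 on success, -1
 * if the result does not fit in out.
 */
static int dt_join_path(const char *parent, const char *unit_name,
			char *out, size_t outlen)
{
	size_t plen = strlen(parent);
	size_t ulen = strlen(unit_name);
	size_t need_sep = (plen == 1 && parent[0] == '/') ? 0 : 1;

	if (plen + need_sep + ulen + 1 > outlen)
		return -1;
	memcpy(out, parent, plen);
	if (need_sep)
		out[plen++] = '/';
	memcpy(out + plen, unit_name, ulen + 1);	/* copies the final 0 */
	return 0;
}
```

For instance, joining "/" with "cpus" gives "/cpus", and splitting a unit name such as "memory@0" gives the node name "memory" and the unit address "0".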

Every node which represents an actual device (that is, a node which isn't only a virtual "container" for more nodes, like "/cpus" is) is also required to have a "device_type" property indicating the type of node.

Finally, every node that can be referenced from a property in another node is required to have a "linux,phandle" property. Real open firmware implementations provide a unique "phandle" value for every node that the "prom_init()" trampoline code turns into "linux,phandle" properties. However, this is made optional if the flattened device tree is used directly. An example of a node referencing another node via "phandle" is when laying out the interrupt tree which will be described in a further version of this document.

This "linux,phandle" property is a 32 bit value that uniquely identifies a node. You are free to use whatever values or system of values, internal pointers, or whatever to generate these, the only requirement is that every node for which you provide that property has a unique value for it.

Here is an example of a simple device-tree. In this example, an "o" designates a node followed by the node unit name. Properties are presented with their name followed by their content. "content" represents an ASCII string (zero terminated) value, while <content> represents a 32 bit hexadecimal value. The various nodes in this example will be discussed in a later chapter. At this point, it is only meant to give you an idea of what a device-tree looks like. I have purposefully kept the "name" and "linux,phandle" properties which aren't necessary in order to give you a better idea of what the tree looks like in practice.

  / o device-tree
      |- name = "device-tree"
      |- model = "MyBoardName"
      |- compatible = "MyBoardFamilyName"
      |- #address-cells = <2>
      |- #size-cells = <2>
      |- linux,phandle = <0>
      |
      o cpus
      | | - name = "cpus"
      | | - linux,phandle = <1>
      | | - #address-cells = <1>
      | | - #size-cells = <0>
      | |
      | o PowerPC,970@0
      |   |- name = "PowerPC,970"
      |   |- device_type = "cpu"
      |   |- reg = <0>
      |   |- clock-frequency = <5f5e1000>
      |   |- linux,boot-cpu
      |   |- linux,phandle = <2>
      |
      o memory@0
      | |- name = "memory"
      | |- device_type = "memory"
      | |- reg = <00000000 00000000 00000000 20000000>
      | |- linux,phandle = <3>
      |
      o chosen
        |- name = "chosen"
        |- bootargs = "root=/dev/sda2"
        |- linux,platform = <00000600>
        |- linux,phandle = <4>

This tree is almost a minimal tree. It pretty much contains the minimal set of required nodes and properties to boot a linux kernel; that is, some basic model information at the root, the CPUs, and the physical memory layout. It also includes misc information passed through /chosen, like in this example, the platform type (mandatory) and the kernel command line arguments (optional).

The /cpus/PowerPC,970@0/linux,boot-cpu property is an example of a property without a value. All other properties have a value. The significance of the #address-cells and #size-cells properties will be explained in chapter III which defines precisely the required nodes and properties and their content.


3) Device tree "structure" block

The structure of the device tree is a linearized tree structure. The "OF_DT_BEGIN_NODE" token starts a new node, and the "OF_DT_END_NODE" ends that node definition. Child nodes are simply defined before "OF_DT_END_NODE" (that is nodes within the node). A 'token' is a 32 bit value. The tree has to be "finished" with an OF_DT_END token

Here's the basic structure of a single node:

     * token OF_DT_BEGIN_NODE (that is 0x00000001)
     * for version 1 to 3, this is the node full path as a zero terminated string, starting with "/".
       For version 16 and later, this is the node unit name only (or an empty string for the root node)
     * [align gap to next 4 bytes boundary]
     * for each property:
        * token OF_DT_PROP (that is 0x00000003)
        * 32 bit value of property value size in bytes (or 0 if no value)
        * 32 bit value of offset in string block of property name
        * property value data if any
        * [align gap to next 4 bytes boundary]
     * [child nodes if any]
     * token OF_DT_END_NODE (that is 0x00000002)

So the node content can be summarised as a start token, a full path, a list of properties, a list of child nodes and an end token. Every child node is a full node structure itself as defined above.


4) Device tree "strings" block

In order to save space, property names, which are generally redundant, are stored separately in the "strings" block. This block is simply the whole bunch of zero terminated strings for all property names concatenated together. The device-tree property definitions in the structure block will contain offset values from the beginning of the strings block.


III - Required content of the device tree
=========================================

WARNING: All "linux,*" properties defined in this document apply only to a flattened device-tree. If your platform uses a real implementation of Open Firmware or an implementation compatible with the Open Firmware client interface, those properties will be created by the trampoline code in the kernel's prom_init() file. For example, that's where you'll have to add code to detect your board model and set the platform number. However, when using the flattened device-tree entry point, there is no prom_init() pass, and thus you have to provide those properties yourself.


1) Note about cells and address representation
----------------------------------------------

The general rule is documented in the various Open Firmware documentations. If you choose to describe a bus with the device-tree and there exists an OF bus binding, then you should follow the specification.
However, the kernel does not require every single device or bus to be described by the device tree.

In general, the format of an address for a device is defined by the parent bus type, based on the #address-cells and #size-cells property. In the absence of such a property, the parent's parent values are used, etc... The kernel requires the root node to have those properties defining addresses format for devices directly mapped on the processor bus.

Those 2 properties define 'cells' for representing an address and a size. A "cell" is a 32 bit number. For example, if both contain 2 like the example tree given above, then an address and a size are both composed of 2 cells, and each is a 64 bit number (cells are concatenated and expected to be in big endian format). Another example is the way Apple firmware defines them, with 2 cells for an address and one cell for a size. Most 32-bit implementations should define #address-cells and #size-cells to 1, which represents a 32-bit value. Some 32-bit processors allow for physical addresses greater than 32 bits; these processors should define #address-cells as 2.

"reg" properties are always a tuple of the type "address size" where the number of cells of address and size is specified by the bus #address-cells and #size-cells. When a bus supports various address spaces and other flags relative to a given address allocation (like prefetchable, etc...) those flags are usually added to the top level bits of the physical address. For example, a PCI physical address is made of 3 cells, the bottom two containing the actual address itself while the top cell contains address space indication, flags, and pci bus & device numbers.

For busses that support dynamic allocation, it's accepted practice not to provide the address in "reg" (keep it 0) while still providing a flag indicating the address is dynamically allocated, and then to provide a separate "assigned-addresses" property that contains the fully allocated addresses.
See the PCI OF bindings for details.

In general, a simple bus with no address space bits and no dynamic allocation is preferred if it reflects your hardware, as the existing kernel address parsing functions will work out of the box. If you define a bus type with a more complex address format, including things like address space bits, you'll have to add a bus translator to the prom_parse.c file of the recent kernels for your bus type.

The "reg" property only defines addresses and sizes (if #size-cells is non-0) within a given bus. In order to translate addresses upward (that is into parent bus addresses, and possibly into cpu physical addresses), all busses must contain a "ranges" property. If the "ranges" property is missing at a given level, it's assumed that translation isn't possible. The format of the "ranges" property for a bus is a list of:

        bus address, parent bus address, size

"bus address" is in the format of the bus this bus node is defining, that is, for a PCI bridge, it would be a PCI address. Thus, (bus address, size) defines a range of addresses for child devices. "parent bus address" is in the format of the parent bus of this bus. For example, for a PCI host controller, that would be a CPU address. For a PCI<->ISA bridge, that would be a PCI address. It defines the base address in the parent bus where the beginning of that range is mapped.

For a new 64 bit powerpc board, I recommend either the 2/2 format or Apple's 2/1 format which is slightly more compact since sizes usually fit in a single 32 bit word. New 32 bit powerpc boards should use a 1/1 format, unless the processor supports physical addresses greater than 32-bits, in which case a 2/1 format is recommended.


2) Note about "compatible" properties
-------------------------------------

These properties are optional, but recommended in devices and the root node. The format of a "compatible" property is a list of concatenated zero terminated strings.
They allow a device to express its compatibility with a family of similar devices, in some cases, allowing a single driver to match against several devices regardless of their actual names.


3) Note about "name" properties
-------------------------------

While earlier users of Open Firmware like OldWorld macintoshes tended to use the actual device name for the "name" property, it's nowadays considered a good practice to use a name that is closer to the device class (often equal to device_type). For example, nowadays, ethernet controllers are named "ethernet", with an additional "model" property defining precisely the chip type/model, and a "compatible" property defining the family in case a single driver can drive more than one of these chips. However, the kernel doesn't generally put any restriction on the "name" property; it is simply considered good practice to follow the standard and its evolutions as closely as possible.

Note also that the new format version 16 makes the "name" property optional. If it's absent for a node, then the node's unit name is used to reconstruct the name. That is, the part of the unit name before the "@" sign is used (or the entire unit name if no "@" sign is present).


4) Note about node and property names and character set
-------------------------------------------------------

While open firmware provides more flexible usage of 8859-1, this specification enforces more strict rules. Nodes and properties should be comprised only of ASCII characters 'a' to 'z', '0' to '9', ',', '.', '_', '+', '#', '?', and '-'. Node names additionally allow uppercase characters 'A' to 'Z' (property names should be lowercase. The fact that vendors like Apple don't respect this rule is irrelevant here). Additionally, node and property names should always begin with a character in the range 'a' to 'z' (or 'A' to 'Z' for node names).

The maximum number of characters for both nodes and property names is 31.
In the case of node names, this is only the leftmost part of a unit name (the pure "name" property), it doesn't include the unit address which can extend beyond that limit.


5) Required nodes and properties
--------------------------------

These are all that are currently required. However, it is strongly recommended that you expose PCI host bridges as documented in the PCI binding to open firmware, and your interrupt tree as documented in OF interrupt tree specification.

a) The root node

The root node requires some properties to be present:

    - model : this is your board name/model
    - #address-cells : address representation for "root" devices
    - #size-cells : the size representation for "root" devices

Additionally, some recommended properties are:

    - compatible : the board "family" generally finds its way here, for example, if you have 2 board models with a similar layout, that typically get driven by the same platform code in the kernel, you would use a different "model" property but put a value in "compatible". The kernel doesn't directly use that value (see /chosen/linux,platform for how the kernel chooses a platform type) but it is generally useful.

The root node is also generally where you add additional properties specific to your board like the serial number if any, that sort of thing. It is recommended that if you add any "custom" property whose name may clash with standard defined ones, you prefix them with your vendor name and a comma.

b) The /cpus node

This node is the parent of all individual CPU nodes. It doesn't have any specific requirements, though it's generally good practice to have at least:

        #address-cells = <00000001>
        #size-cells    = <00000000>

This defines that the "address" for a CPU is a single cell, and has no meaningful size. This is not necessary but the kernel will assume that format when reading the "reg" properties of a CPU node, see below

c) The /cpus/* nodes

So under /cpus, you are supposed to create a node for every CPU on the machine.
There is no specific restriction on the name of the CPU, though it's common practice to call it PowerPC,<name>. For example, Apple uses PowerPC,G5 while IBM uses PowerPC,970FX.

Required properties:

    - device_type : has to be "cpu"
    - reg : This is the physical cpu number, it's a single 32 bit cell and is also used as-is as the unit number for constructing the unit name in the full path. For example, with 2 CPUs, you would have the full path:
        /cpus/PowerPC,970FX@0
        /cpus/PowerPC,970FX@1
      (unit addresses do not require leading zeroes)
    - d-cache-line-size : one cell, L1 data cache line size in bytes
    - i-cache-line-size : one cell, L1 instruction cache line size in bytes
    - d-cache-size : one cell, size of L1 data cache in bytes
    - i-cache-size : one cell, size of L1 instruction cache in bytes
    - linux,boot-cpu : Should be defined if this cpu is the boot cpu.

Recommended properties:

    - timebase-frequency : a cell indicating the frequency of the timebase in Hz. This is not directly used by the generic code, but you are welcome to copy/paste the pSeries code for setting the kernel timebase/decrementer calibration based on this value.
    - clock-frequency : a cell indicating the CPU core clock frequency in Hz. A new property will be defined for 64 bit values, but if your frequency is < 4GHz, one cell is enough. Here as well as for the above, the common code doesn't use that property, but you are welcome to re-use the pSeries or Maple one. A future kernel version might provide a common function for this.

You are welcome to add any property you find relevant to your board, like some information about the mechanism used to soft-reset the CPUs. For example, Apple puts the GPIO number for CPU soft reset lines in there as a "soft-reset" property since they start secondary CPUs by soft-resetting them.

d) the /memory node(s)

To define the physical memory layout of your board, you should create one or more memory node(s).
You can either create a single node with all memory ranges in its reg property, or you can create several nodes, as you wish. The unit address (@ part) used for the full path is the address of the first range of memory defined by a given node. If you use a single memory node, this will typically be @0.

Required properties:

    - device_type : has to be "memory"
    - reg : This property contains all the physical memory ranges of your board. It's a list of addresses/sizes concatenated together, with the number of cells of each defined by the #address-cells and #size-cells of the root node. For example, with both of these properties being 2 like in the example given earlier, a 970 based machine with 6GB of RAM could typically have a "reg" property here that looks like:

      00000000 00000000 00000000 80000000
      00000001 00000000 00000001 00000000

      That is a range starting at 0 of 0x80000000 bytes and a range starting at 0x100000000 and of 0x100000000 bytes. You can see that there is no memory covering the IO hole between 2GB and 4GB. Some vendors prefer splitting those ranges into smaller segments, but the kernel doesn't care.

e) The /chosen node

This node is a bit "special". Normally, that's where open firmware puts some variable environment information, like the arguments, or phandle pointers to nodes like the main interrupt controller, or the default input/output devices.

This specification makes a few of these mandatory, but also defines some linux-specific properties that would be normally constructed by the prom_init() trampoline when booting with an OF client interface, but that you have to provide yourself when using the flattened format.

Required properties:

    - linux,platform : This is your platform number as assigned by the architecture maintainers

Recommended properties:

    - bootargs : This zero-terminated string is passed as the kernel command line
    - linux,stdout-path : This is the full path to your standard console device if any.
      Typically, if you have serial devices on your board, you may want to put the full path to the one set as the default console in the firmware here, for the kernel to pick it up as its own default console. If you look at the function set_preferred_console() in arch/ppc64/kernel/setup.c, you'll see that the kernel tries to find out the default console and has knowledge of various types like 8250 serial ports. You may want to extend this function to add your own.
    - interrupt-controller : This is one cell containing a phandle value that matches the "linux,phandle" property of your main interrupt controller node. May be used for interrupt routing.

Note that u-boot creates and fills in the chosen node for platforms that use it.

f) the /soc node

This node is used to represent a system-on-a-chip (SOC) and must be present if the processor is a SOC. The top-level soc node contains information that is global to all devices on the SOC. The node name should contain a unit address for the SOC, which is the base address of the memory-mapped register set for the SOC. The name of an soc node should start with "soc", and the remainder of the name should represent the part number for the soc. For example, the MPC8540's soc node would be called "soc8540".

Required properties:

    - device_type : Should be "soc"
    - ranges : Should be defined as specified in 1) to describe the translation of SOC addresses for memory mapped SOC registers.

Recommended properties:

    - reg : This property defines the address and size of the memory-mapped registers that are used for the SOC node itself. It does not include the child device registers - these will be defined inside each child node. The address specified in the "reg" property should match the unit address of the SOC node.
    - #address-cells : Address representation for "soc" devices. The format of this field may vary depending on whether or not the device registers are memory mapped.
      For memory mapped registers, this field represents the number of cells needed to represent the address of the registers. For SOCs that do not use MMIO, a special address format should be defined that contains enough cells to represent the required information. See 1) above for more details on defining #address-cells.
    - #size-cells : Size representation for "soc" devices
    - #interrupt-cells : Defines the width of cells used to represent interrupts. Typically this value is <2>, which includes a 32-bit number that represents the interrupt number, and a 32-bit number that represents the interrupt sense and level. This field is only needed if the SOC contains an interrupt controller.

The SOC node may contain child nodes for each SOC device that the platform uses. Nodes should not be created for devices which exist on the SOC but are not used by a particular platform. See chapter VI for more information on how to specify devices that are part of an SOC.

Example SOC node for the MPC8540:

        soc8540@e0000000 {
                #address-cells = <1>;
                #size-cells = <1>;
                #interrupt-cells = <2>;
                device_type = "soc";
                ranges = <00000000 e0000000 00100000>;
                reg = ;
        }


IV - "dtc", the device tree compiler
====================================

dtc source code can be found at

WARNING: This version is still in early development stage; the resulting device-tree "blobs" have not yet been validated with the kernel. The current generated block lacks a useful reserve map (it will be fixed to generate an empty one, it's up to the bootloader to fill it up) among others. The error handling needs work, bugs are lurking, etc...

dtc basically takes a device-tree in a given format and outputs a device-tree in another format. The currently supported formats are:

  Input formats:
  -------------

     - "dtb": "blob" format, that is a flattened device-tree block with header all in a binary blob.
     - "dts": "source" format. This is a text file containing a "source" for a device-tree. The format is defined later in this chapter.
     - "fs" format. This is a representation equivalent to the output of /proc/device-tree, that is nodes are directories and properties are files

  Output formats:
  ---------------

     - "dtb": "blob" format
     - "dts": "source" format
     - "asm": assembly language file. This is a file that can be sourced by gas to generate a device-tree "blob". That file can then simply be added to your Makefile. Additionally, the assembly file exports some symbols that can be used.

The syntax of the dtc tool is

    dtc [-I <input-format>] [-O <output-format>]
        [-o output-filename] [-V output_version] input_filename

The "output_version" defines what version of the "blob" format will be generated. Supported versions are 1,2,3 and 16. The default is currently version 3 but that may change in the future to version 16.

Additionally, dtc performs various sanity checks on the tree, like the uniqueness of linux,phandle properties, validity of strings, etc...

The format of the .dts "source" file is "C" like, and supports C and C++ style comments.

/ {
}

The above is the "device-tree" definition. It's the only statement supported currently at the toplevel.

/ {
        /* define a property containing a 0 terminated string */
        property1 = "string_value";

        /* define a property containing a numerical 32 bits value
         * (hexadecimal) */
        property2 = <1234abcd>;

        /* define a property containing 3 numerical 32 bits values
         * (cells) in hexadecimal */
        property3 = <12345678 12345678 deadbeef>;

        /* define a property whose content is an arbitrary array of
         * bytes */
        property4 = [0a 0b 0c 0d de ea ad be ef];

        /* define a child node named "childnode" whose unit name is
         * "childnode@address" */
        childnode@address {
                /* define a property "childprop" of childnode (in this
                 * case, a string) */
                childprop = "hello\n";
        };
};

Nodes can contain other nodes etc... thus defining the hierarchical structure of the tree.

Strings support common escape sequences from C: "\n", "\t", "\r", "\(octal value)", "\x(hex value)".
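
To relate the cell syntax of the source format to what ends up in the blob, here is a minimal sketch of how a list of cells such as <12345678 12345678 deadbeef> maps to bytes, going by the rule from chapter III that cells are 32 bit values concatenated in big endian format. This is only an illustration of the byte layout; dt_encode_cells() is a hypothetical helper and is not taken from the dtc sources.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store n cells into out as big endian bytes, one after the other.
 * out must have room for 4 * n bytes. Returns the number of bytes
 * written, which is also the property value size that would go into
 * the structure block.
 */
static size_t dt_encode_cells(const uint32_t *cells, size_t n,
			      unsigned char *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		out[4 * i + 0] = (unsigned char)(cells[i] >> 24);
		out[4 * i + 1] = (unsigned char)(cells[i] >> 16);
		out[4 * i + 2] = (unsigned char)(cells[i] >> 8);
		out[4 * i + 3] = (unsigned char)(cells[i]);
	}
	return 4 * n;
}
```

With this, the two cells <12345678 deadbeef> become the eight bytes 12 34 56 78 de ad be ef regardless of the endianness of the machine running the tool, which is the same content one could also write by hand with the [ ] byte array syntax.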

It is also suggested that you pipe your source file through cpp (gcc preprocessor) so you can use #include's, #define for constants, etc...

Finally, various options are planned but not yet implemented, like automatic generation of phandles, labels (exported to the asm file so you can point to a property content and change it easily from whatever you link the device-tree with), label or path instead of numeric value in some cells to "point" to a node (replaced by a phandle at compile time), export of reserve map address to the asm file, ability to specify reserve map content at compile time, etc...

We may provide a .h include file with common definitions if that proves useful for some properties (like building PCI properties or interrupt maps) though it may be better to add a notion of struct definitions to the compiler...


V - Recommendations for a bootloader
====================================

Here are some various ideas/recommendations that have been proposed while all this has been defined and implemented.

  - The bootloader may want to be able to use the device-tree itself and may want to manipulate it (to add/edit some properties, like physical memory size or kernel arguments). At this point, 2 choices can be made. Either the bootloader works directly on the flattened format, or the bootloader has its own internal tree representation with pointers (similar to the kernel one) and re-flattens the tree when booting the kernel. The former is a bit more difficult to edit/modify, the latter probably requires a bit more code to handle the tree structure. Note that the structure format has been designed so it's relatively easy to "insert" properties or nodes or delete them by just memmoving things around. It contains no internal offsets or pointers for this purpose.

  - An example of code for iterating nodes & retrieving properties directly from the flattened tree format can be found in the kernel file arch/ppc64/kernel/prom.c, look at the scan_flat_dt() function, its usage in early_init_devtree(), and the corresponding various early_init_dt_scan_*() callbacks. That code can be re-used in a GPL bootloader, and as the author of that code, I would be happy to discuss possible free licensing to any vendor who wishes to integrate all or part of this code into a non-GPL bootloader.


VI - System-on-a-chip devices and nodes
=======================================

Many companies are now starting to develop system-on-a-chip processors, where the processor core (cpu) and many peripheral devices exist on a single piece of silicon. For these SOCs, an SOC node should be used that defines child nodes for the devices that make up the SOC. While platforms are not required to use this model in order to boot the kernel, it is highly encouraged that all SOC implementations define as complete a flat-device-tree as possible to describe the devices on the SOC. This will allow for the genericization of much of the kernel code.


1) Defining child nodes of an SOC
---------------------------------

Each device that is part of an SOC may have its own node entry inside the SOC node. For each device that is included in the SOC, the unit address property represents the address offset for this device's memory-mapped registers in the parent's address space. The parent's address space is defined by the "ranges" property in the top-level soc node. The "reg" property for each node that exists directly under the SOC node should contain the address mapping from the child address space to the parent SOC address space and the size of the device's memory-mapped register file.

For many devices that may exist inside an SOC, there are predefined specifications for the format of the device tree node.
All SOC child nodes should follow these specifications, except where
noted in this document.

See appendix A for an example partial SOC node definition for the
MPC8540.


2) Specifying interrupt information for SOC devices
---------------------------------------------------

Each device that is part of an SOC and which generates interrupts
should have the following properties:

	- interrupt-parent : contains the phandle of the interrupt
	  controller which handles interrupts for this device
	- interrupts : a list of tuples representing the interrupt
	  number and the interrupt sense and level for each interrupt
	  for this device.

This information is used by the kernel to build the interrupt table
for the interrupt controllers in the system.

Sense and level information should be encoded as follows:

   Devices connected to openPIC-compatible controllers should encode
   sense and polarity as follows:

	0 = high to low edge sensitive type enabled
	1 = active low level sensitive type enabled
	2 = low to high edge sensitive type enabled
	3 = active high level sensitive type enabled

   ISA PIC interrupt controllers should adhere to the ISA PIC
   encodings listed below:

	0 = active low level sensitive type enabled
	1 = active high level sensitive type enabled
	2 = high to low edge sensitive type enabled
	3 = low to high edge sensitive type enabled


3) Representing devices without a current OF specification
----------------------------------------------------------

Currently, there are many devices on SOCs that do not have a standard
representation pre-defined as part of the open firmware
specifications, mainly because the boards that contain these SOCs are
not currently booted using open firmware. This section contains
descriptions for the SOC devices for which new nodes have been
defined; this list will expand as more and more SOC-containing
platforms are moved over to use the flattened-device-tree model.

  a) MDIO IO device

  The MDIO is a bus to which the PHY devices are connected.
  For each device that exists on this bus, a child node should be
  created. See the definition of the PHY node below for an example
  of how to define a PHY.

  Required properties:
    - reg : Offset and length of the register set for the device

  Example:

	mdio@24520 {
		reg = <24520 20>;

		ethernet-phy@0 {
			......
		};
	};


  b) Gianfar-compatible ethernet nodes

  Required properties:

    - device_type : Should be "network"
    - model : Model of the device. Can be "TSEC" or "FEC"
    - compatible : Should be "gianfar"
    - reg : Offset and length of the register set for the device
    - address : List of bytes representing the ethernet address of
      this controller
    - interrupts : <a b> where a is the interrupt number and b is a
      field that represents an encoding of the sense and level
      information for the interrupt. This should be encoded based on
      the information in section 2) depending on the type of interrupt
      controller you have.
    - interrupt-parent : the phandle for the interrupt controller that
      services interrupts for this device.
    - phy-handle : The phandle for the PHY connected to this ethernet
      controller.

  Example:

	ethernet@24000 {
		#size-cells = <0>;
		device_type = "network";
		model = "TSEC";
		compatible = "gianfar";
		reg = <24000 1000>;
		address = [ 00 E0 0C 00 73 00 ];
		interrupts = ;
		interrupt-parent = <40000>;
		phy-handle = <2452000>;
	};


  c) PHY nodes

  Required properties:

    - device_type : Should be "ethernet-phy"
    - interrupts : <a b> where a is the interrupt number and b is a
      field that represents an encoding of the sense and level
      information for the interrupt. This should be encoded based on
      the information in section 2) depending on the type of interrupt
      controller you have.
    - interrupt-parent : the phandle for the interrupt controller that
      services interrupts for this device.
    - reg : The ID number for the phy, usually a small integer
    - linux,phandle : phandle for this node; likely referenced by an
      ethernet controller node.
  Example:

	ethernet-phy@0 {
		linux,phandle = <2452000>;
		interrupt-parent = <40000>;
		interrupts = <35 1>;
		reg = <0>;
		device_type = "ethernet-phy";
	};


  d) Interrupt controllers

  Some SOC devices contain interrupt controllers that are different
  from the standard Open PIC specification. The SOC device nodes for
  these types of controllers should be specified just like a standard
  OpenPIC controller. Sense and level information should be encoded
  as specified in section 2) of this chapter for each device that
  specifies an interrupt.

  Example :

	pic@40000 {
		linux,phandle = <40000>;
		clock-frequency = <0>;
		interrupt-controller;
		#address-cells = <0>;
		reg = <40000 40000>;
		built-in;
		compatible = "chrp,open-pic";
		device_type = "open-pic";
		big-endian;
	};


  e) I2C

  Required properties :

    - device_type : Should be "i2c"
    - reg : Offset and length of the register set for the device

  Recommended properties :

    - compatible : Should be "fsl-i2c" for parts compatible with
      Freescale I2C specifications.
    - interrupts : <a b> where a is the interrupt number and b is a
      field that represents an encoding of the sense and level
      information for the interrupt. This should be encoded based on
      the information in section 2) depending on the type of interrupt
      controller you have.
    - interrupt-parent : the phandle for the interrupt controller that
      services interrupts for this device.
    - dfsrr : boolean; if defined, indicates that this I2C device has
      a digital filter sampling rate register
    - fsl5200-clocking : boolean; if defined, indicates that this device
      uses the FSL 5200 clocking mechanism.

  Example :

	i2c@3000 {
		interrupt-parent = <40000>;
		interrupts = <1b 3>;
		reg = <3000 18>;
		device_type = "i2c";
		compatible = "fsl-i2c";
		dfsrr;
	};


  More devices will be defined as this spec matures.
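
  The two sense encodings listed in section 2) of this chapter are
  easy to mix up, so here is a minimal C sketch of how a consumer of
  the "interrupts" property might decode the second cell of each
  tuple. This is not taken from the kernel: the struct and function
  names (soc_irq_sense, decode_openpic_sense, decode_isa_sense) are
  made up for illustration, and only the value tables come from the
  text above.

```c
#include <stdbool.h>

/* Hypothetical decoded form of the sense cell; not a kernel type. */
struct soc_irq_sense {
	bool level;		/* true = level sensitive, false = edge sensitive */
	bool active_high;	/* true = active high level / low to high edge */
};

/*
 * openPIC-compatible encoding from section 2):
 *   0 = high to low edge,  1 = active low level,
 *   2 = low to high edge,  3 = active high level
 * Returns 0 on success, -1 if the cell is outside 0..3.
 */
static int decode_openpic_sense(unsigned int cell, struct soc_irq_sense *out)
{
	if (cell > 3)
		return -1;
	out->level = (cell & 1) != 0;		/* odd values are level */
	out->active_high = (cell & 2) != 0;	/* 2 and 3 are high/rising */
	return 0;
}

/*
 * ISA PIC encoding from section 2):
 *   0 = active low level,  1 = active high level,
 *   2 = high to low edge,  3 = low to high edge
 * Returns 0 on success, -1 if the cell is outside 0..3.
 */
static int decode_isa_sense(unsigned int cell, struct soc_irq_sense *out)
{
	if (cell > 3)
		return -1;
	out->level = (cell & 2) == 0;		/* 0 and 1 are level */
	out->active_high = (cell & 1) != 0;	/* odd values are high/rising */
	return 0;
}
```

  For instance, the PHY example's interrupts = <35 1> under an
  openPIC-compatible parent decodes as level sensitive, active low;
  the same cell value under an ISA PIC parent would mean level
  sensitive, active high, which is why the interrupt-parent has to
  be known before the cell is interpreted.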
Appendix A - Sample SOC node for MPC8540
========================================

Note that the #address-cells and #size-cells for the SoC node
in this example have been explicitly listed; these are likely
not necessary as they are usually the same as the root node.

	soc8540@e0000000 {
		#address-cells = <1>;
		#size-cells = <1>;
		#interrupt-cells = <2>;
		device_type = "soc";
		ranges = <00000000 e0000000 00100000>;
		reg = ;

		mdio@24520 {
			reg = <24520 20>;

			ethernet-phy@0 {
				linux,phandle = <2452000>;
				interrupt-parent = <40000>;
				interrupts = <35 1>;
				reg = <0>;
				device_type = "ethernet-phy";
			};

			ethernet-phy@1 {
				linux,phandle = <2452001>;
				interrupt-parent = <40000>;
				interrupts = <35 1>;
				reg = <1>;
				device_type = "ethernet-phy";
			};

			ethernet-phy@3 {
				linux,phandle = <2452002>;
				interrupt-parent = <40000>;
				interrupts = <35 1>;
				reg = <3>;
				device_type = "ethernet-phy";
			};
		};

		ethernet@24000 {
			#size-cells = <0>;
			device_type = "network";
			model = "TSEC";
			compatible = "gianfar";
			reg = <24000 1000>;
			address = [ 00 E0 0C 00 73 00 ];
			interrupts = ;
			interrupt-parent = <40000>;
			phy-handle = <2452000>;
		};

		ethernet@25000 {
			#address-cells = <1>;
			#size-cells = <0>;
			device_type = "network";
			model = "TSEC";
			compatible = "gianfar";
			reg = <25000 1000>;
			address = [ 00 E0 0C 00 73 01 ];
			interrupts = <13 3 14 3 18 3>;
			interrupt-parent = <40000>;
			phy-handle = <2452001>;
		};

		ethernet@26000 {
			#address-cells = <1>;
			#size-cells = <0>;
			device_type = "network";
			model = "FEC";
			compatible = "gianfar";
			reg = <26000 1000>;
			address = [ 00 E0 0C 00 73 02 ];
			interrupts = <19 3>;
			interrupt-parent = <40000>;
			phy-handle = <2452002>;
		};

		serial@4500 {
			device_type = "serial";
			compatible = "ns16550";
			reg = <4500 100>;
			clock-frequency = <0>;
			interrupts = <1a 3>;
			interrupt-parent = <40000>;
		};

		pic@40000 {
			linux,phandle = <40000>;
			clock-frequency = <0>;
			interrupt-controller;
			#address-cells = <0>;
			reg = <40000 40000>;
			built-in;
			compatible = "chrp,open-pic";
			device_type = "open-pic";
			big-endian;
		};

		i2c@3000 {
			interrupt-parent = <40000>;
			interrupts = <1b 3>;
			reg = <3000 18>;
			device_type = "i2c";
			compatible = "fsl-i2c";
			dfsrr;
		};
	};

From michael at ellerman.id.au Tue Dec 6 08:38:57 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Mon, 5 Dec 2005 15:38:57 -0600
Subject: [PATCH 7/11] powerpc: Fixups for kernel linked at 32 MB
In-Reply-To: <20051205003954.6E56168802@ozlabs.org>
References: <20051205003954.6E56168802@ozlabs.org>
Message-ID: <200512051539.01254.michael@ellerman.id.au>

On Sun, 4 Dec 2005 12:39, Michael Ellerman wrote:
> +#ifdef CONFIG_CRASH_DUMP
> +#define LOAD_HANDLER(reg, label) \
> +	oris r12,r12,(label)@h; /* virt addr of handler ... */ \
> +	ori r12,r12,(label)@l; /* .. and the rest */
> +#else
> +#define LOAD_HANDLER(reg, label) \
> +	ori r12,r12,(label)@l; /* virt addr of handler ... */
> +#endif

Milton just spotted this buglet, we don't actually use reg, oops :}

New patch on the way.

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/4b52804d/attachment.pgp

From michael at ellerman.id.au Tue Dec 6 08:49:00 2005
From: michael at ellerman.id.au (Michael Ellerman)
Date: Mon, 05 Dec 2005 15:49:00 -0600
Subject: [PATCH] powerpc: Fixups for kernel linked at 32 MB
In-Reply-To: <20051205003954.6E56168802@ozlabs.org>
Message-ID: <20051205214914.9BEE16887C@ozlabs.org>

There's a few places where we need to fix things up for the kernel to work
if it's linked at 32MB:

 - platforms/powermac/smp.c
   To start secondary cpus on pmac we patch the reset vector, which is fine.
Except if we're above 32MB we don't have enough bits for an absolute branch, it needs to be relative. - kernel/head_64.S - A few branches in the cpu hold code need to load the full target address and do a bctr. - after_prom_start needs to load PHYSICAL_START as the dest address, not 0. - The exception prolog needs to load the low word of the target address, not just the low halfword. - Fixup handling of the initial stab address. - kernel/setup_64.c smp_release_cpus() needs to write 1 to the spinloop flag near 0, not 32 MB. Signed-off-by: Michael Ellerman arch/powerpc/kernel/head_64.S | 30 ++++++++++++++++++++++++------ arch/powerpc/kernel/setup_64.c | 5 ++++- arch/powerpc/platforms/powermac/smp.c | 16 +++++++--------- include/asm-powerpc/mmu.h | 3 ++- 4 files changed, 37 insertions(+), 17 deletions(-) Index: kexec/arch/powerpc/platforms/powermac/smp.c =================================================================== --- kexec.orig/arch/powerpc/platforms/powermac/smp.c +++ kexec/arch/powerpc/platforms/powermac/smp.c @@ -753,14 +753,15 @@ static int __init smp_core99_probe(void) static void __devinit smp_core99_kick_cpu(int nr) { unsigned int save_vector; - unsigned long new_vector; - unsigned long flags; + unsigned long target, flags; volatile unsigned int *vector = ((volatile unsigned int *)(KERNELBASE+0x100)); if (nr < 0 || nr > 3) return; - if (ppc_md.progress) ppc_md.progress("smp_core99_kick_cpu", 0x346); + + if (ppc_md.progress) + ppc_md.progress("smp_core99_kick_cpu", 0x346); local_irq_save(flags); local_irq_disable(); @@ -768,14 +769,11 @@ static void __devinit smp_core99_kick_cp /* Save reset vector */ save_vector = *vector; - /* Setup fake reset vector that does + /* Setup fake reset vector that does * b __secondary_start_pmac_0 + nr*8 - KERNELBASE */ - new_vector = (unsigned long) __secondary_start_pmac_0 + nr * 8; - *vector = 0x48000002 + new_vector - KERNELBASE; - - /* flush data cache and inval instruction cache */ - flush_icache_range((unsigned long)
vector, (unsigned long) vector + 4); + target = (unsigned long) __secondary_start_pmac_0 + nr * 8; + create_branch((unsigned long)vector, target, BRANCH_SET_LINK); /* Put some life in our friend */ pmac_call_feature(PMAC_FTR_RESET_CPU, NULL, nr, 0); Index: kexec/arch/powerpc/kernel/head_64.S =================================================================== --- kexec.orig/arch/powerpc/kernel/head_64.S +++ kexec/arch/powerpc/kernel/head_64.S @@ -154,11 +154,15 @@ _GLOBAL(__secondary_hold) bne 100b #ifdef CONFIG_HMT - b .hmt_init + LOADADDR(r4, .hmt_init) + mtctr r4 + bctr #else #ifdef CONFIG_SMP + LOADADDR(r4, .pSeries_secondary_smp_init) + mtctr r4 mr r3,r24 - b .pSeries_secondary_smp_init + bctr #else BUG_OPCODE #endif @@ -200,6 +204,20 @@ exception_marker: #define EX_R3 64 #define EX_LR 72 +/* + * We're short on space and time in the exception prolog, so we can't use + * the normal LOADADDR macro. Normally we just need the low halfword of the + * address, but for Kdump we need the whole low word. + */ +#ifdef CONFIG_CRASH_DUMP +#define LOAD_HANDLER(reg, label) \ + oris reg,reg,(label)@h; /* virt addr of handler ... */ \ + ori reg,reg,(label)@l; /* .. and the rest */ +#else +#define LOAD_HANDLER(reg, label) \ + ori reg,reg,(label)@l; /* virt addr of handler ... */ +#endif + #define EXCEPTION_PROLOG_PSERIES(area, label) \ mfspr r13,SPRN_SPRG3; /* get paca address into r13 */ \ std r9,area+EX_R9(r13); /* save r9 - r12 */ \ @@ -212,7 +230,7 @@ exception_marker: clrrdi r12,r13,32; /* get high part of &label */ \ mfmsr r10; \ mfspr r11,SPRN_SRR0; /* save SRR0 */ \ - ori r12,r12,(label)@l; /* virt addr of handler */ \ + LOAD_HANDLER(r12,label) \ ori r10,r10,MSR_IR|MSR_DR|MSR_RI; \ mtspr SPRN_SRR0,r12; \ mfspr r12,SPRN_SRR1; /* and SRR1 */ \ @@ -1348,7 +1366,7 @@ _GLOBAL(do_stab_bolted) * fixed address (the linker can't compute (u64)&initial_stab >> * PAGE_SHIFT). */ - . = STAB0_PHYS_ADDR /* 0x6000 */ + . 
= STAB0_OFFSET /* 0x6000 */ .globl initial_stab initial_stab: .space 4096 @@ -1553,7 +1571,7 @@ _STATIC(__boot_from_prom) _STATIC(__after_prom_start) /* - * We need to run with __start at physical address 0. + * We need to run with __start at physical address PHYSICAL_START. * This will leave some code in the first 256B of * real memory, which are reserved for software use. * The remainder of the first page is loaded with the fixed @@ -1568,7 +1586,7 @@ _STATIC(__after_prom_start) mr r26,r3 SET_REG_TO_CONST(r27,KERNELBASE) - li r3,0 /* target addr */ + LOADADDR(r3, PHYSICAL_START) /* target addr */ // XXX FIXME: Use phys returned by OF (r30) add r4,r27,r26 /* source addr */ Index: kexec/arch/powerpc/kernel/setup_64.c =================================================================== --- kexec.orig/arch/powerpc/kernel/setup_64.c +++ kexec/arch/powerpc/kernel/setup_64.c @@ -314,6 +314,7 @@ void early_setup_secondary(void) void smp_release_cpus(void) { extern unsigned long __secondary_hold_spinloop; + unsigned long *ptr; DBG(" -> smp_release_cpus()\n"); @@ -324,7 +325,9 @@ void smp_release_cpus(void) * This is useless but harmless on iSeries, secondaries are already * waiting on their paca spinloops. 
*/ - __secondary_hold_spinloop = 1; + ptr = (unsigned long *)((unsigned long)&__secondary_hold_spinloop + - PHYSICAL_START); + *ptr = 1; mb(); DBG(" <- smp_release_cpus()\n"); Index: kexec/include/asm-powerpc/mmu.h =================================================================== --- kexec.orig/include/asm-powerpc/mmu.h +++ kexec/include/asm-powerpc/mmu.h @@ -33,7 +33,8 @@ /* Location of cpu0's segment table */ #define STAB0_PAGE 0x6 -#define STAB0_PHYS_ADDR (STAB0_PAGE<<12) +#define STAB0_OFFSET (STAB0_PAGE << 12) +#define STAB0_PHYS_ADDR (STAB0_OFFSET + PHYSICAL_START) #ifndef __ASSEMBLY__ extern char initial_stab[]; From arnd at arndb.de Tue Dec 6 14:52:22 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:22 -0500 Subject: [PATCH 02/14] spufs: fix local store page refcounting References: <20051206035220.097737000@localhost> Message-ID: <20051206040643.452349000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spufs-page-refcnt.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/7a8743a5/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:21 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:21 -0500 Subject: [PATCH 01/14] spufs: Make all exports GPL-only References: <20051206035220.097737000@localhost> Message-ID: <20051206040643.328108000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spufs-export-symbol-gpl.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/a085066d/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:20 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:20 -0500 Subject: [PATCH 00/14] Cell updates for powerpc.git Message-ID: <20051206035220.097737000@localhost> This is my current set of updates related to the cell platforms. 
It includes some spufs updates, most importantly preemption support
for SPUs from Mark Nutter, some platform code updates and a few bug
fixes for the spidernet device driver.

Paul, please apply to the powerpc.git tree, all patches are based
on today's checkout.

	Arnd <><

--

From arnd at arndb.de Tue Dec 6 14:52:23 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 05 Dec 2005 22:52:23 -0500
Subject: [PATCH 03/14] spufs: Fix oops when spufs module is not loaded
References: <20051206035220.097737000@localhost>
Message-ID: <20051206040643.620312000@localhost>

An embedded and charset-unspecified text was scrubbed...
Name: spufs-syscall-oops.diff
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/36b2892e/attachment.txt

From arnd at arndb.de Tue Dec 6 14:52:24 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 05 Dec 2005 22:52:24 -0500
Subject: [PATCH 04/14] spufs: Turn off debugging output
References: <20051206035220.097737000@localhost>
Message-ID: <20051206040643.792016000@localhost>

An embedded and charset-unspecified text was scrubbed...
Name: spufs-no-debug.diff
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/c8555f2a/attachment.txt

From arnd at arndb.de Tue Dec 6 14:52:27 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 05 Dec 2005 22:52:27 -0500
Subject: [PATCH 07/14] spufs: fix mailbox polling
References: <20051206035220.097737000@localhost>
Message-ID: <20051206040644.322664000@localhost>

An embedded and charset-unspecified text was scrubbed...
Name: spufs-mbox-intr.diff
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/4e4f049c/attachment.txt

From arnd at arndb.de Tue Dec 6 14:52:26 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 05 Dec 2005 22:52:26 -0500
Subject: [PATCH 06/14] spufs: Improved SPU preemptability [part 2].
References: <20051206035220.097737000@localhost>
Message-ID: <20051206040644.145607000@localhost>

An embedded and charset-unspecified text was scrubbed...
Name: spu-preempt-3.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/7701d47f/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:33 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:33 -0500 Subject: [PATCH 13/14] spidernet: read firmware from the OF device tree References: <20051206035220.097737000@localhost> Message-ID: <20051206040645.368270000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spidernet-fw-from-dt-2.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/fd9ba499/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:30 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:30 -0500 Subject: [PATCH 10/14] cell: add iommu support for larger memory References: <20051206035220.097737000@localhost> Message-ID: <20051206040644.841367000@localhost> An embedded and charset-unspecified text was scrubbed... Name: iommu-new-firmware-6.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/4865f86a/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:28 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:28 -0500 Subject: [PATCH 08/14] cell: enable pause(0) in cpu_idle References: <20051206035220.097737000@localhost> Message-ID: <20051206040644.495500000@localhost> An embedded and charset-unspecified text was scrubbed... Name: bpa-pmd-add-2.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/69d8875e/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:29 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:29 -0500 Subject: [PATCH 09/14] cell: add platform detection code References: <20051206035220.097737000@localhost> Message-ID: <20051206040644.665121000@localhost> An embedded and charset-unspecified text was scrubbed... 
Name: cell-platform-detect.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/1328ae8f/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:32 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:32 -0500 Subject: [PATCH 12/14] spidernet: check if firmware was loaded correctly References: <20051206035220.097737000@localhost> Message-ID: <20051206040645.193163000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spidernet-programcheck.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/f2494609/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:25 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:25 -0500 Subject: [PATCH 05/14] spufs: Improved SPU preemptability. References: <20051206035220.097737000@localhost> Message-ID: <20051206040644.014463000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spu-preempt-1.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/01b2ace2/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:31 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:31 -0500 Subject: [PATCH 11/14] spidernet: fix Kconfig after BPA->CELL rename References: <20051206035220.097737000@localhost> Message-ID: <20051206040645.065973000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spidernet-with-pci-and-cell.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/f9fb4bf1/attachment.txt From arnd at arndb.de Tue Dec 6 14:52:34 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 05 Dec 2005 22:52:34 -0500 Subject: [PATCH 14/14] spidernet: fix HW structures for 64 bit dma_addr_t References: <20051206035220.097737000@localhost> Message-ID: <20051206040645.538783000@localhost> An embedded and charset-unspecified text was scrubbed... 
Name: spidernet-dma_addr_t-fix-2.diff
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051205/34f8d7a2/attachment.txt

From rsa at us.ibm.com Mon Dec 5 08:12:26 2005
From: rsa at us.ibm.com (Ryan S. Arnold)
Date: Sun, 04 Dec 2005 15:12:26 -0600
Subject: [RFC PATCH 4/5] CELL bogus_console port to hvc_console backend driver
Message-ID: <43935BBA.8090009@us.ibm.com>

This patch sets the preferred_console when running on the simulator.

Signed-off-by: Ryan S. Arnold
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hvc_fss.4.patch
Type: text/x-patch
Size: 1008 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051204/b504f2d3/attachment.bin

From rsa at us.ibm.com Mon Dec 5 08:12:31 2005
From: rsa at us.ibm.com (Ryan S. Arnold)
Date: Sun, 04 Dec 2005 15:12:31 -0600
Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver
Message-ID: <43935BBF.6080005@us.ibm.com>

This patch adds the hvc_fss.c driver file.

Signed-off-by: Ryan S. Arnold
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hvc_fss.5.patch
Type: text/x-patch
Size: 4623 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051204/24ea2a00/attachment.bin

From rsa at us.ibm.com Mon Dec 5 08:12:21 2005
From: rsa at us.ibm.com (Ryan S. Arnold)
Date: Sun, 04 Dec 2005 15:12:21 -0600
Subject: [RFC PATCH 3/5] CELL bogus_console port to hvc_console backend driver
Message-ID: <43935BB5.9030302@us.ibm.com>

This patch modifies the defconfig file for the CELL simulator and
changes the Makefile and Kconfig to add hvc_fss.

Signed-off-by: Ryan S. Arnold
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hvc_fss.3.patch Type: text/x-patch Size: 3072 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051204/67c3f007/attachment.bin From paulus at samba.org Tue Dec 6 11:59:11 2005 From: paulus at samba.org (Paul Mackerras) Date: Tue, 6 Dec 2005 11:59:11 +1100 Subject: [PATCH 12/14] spidernet: check if firmware was loaded correctly In-Reply-To: <20051206040645.193163000@localhost> References: <20051206035220.097737000@localhost> <20051206040645.193163000@localhost> Message-ID: <17300.57951.373636.507621@cargo.ozlabs.ibm.com> Arnd Bergmann writes: > Uploading the device firmware may fail if wrong input data > was provided by the user. This checks for the condition. > > From: Jens.Osterkamp at de.ibm.com > Cc: netdev at vger.kernel.org This one should be sent to Jeff Garzik, along with patches 11, 13 and 14. Paul. From paulus at samba.org Tue Dec 6 11:51:39 2005 From: paulus at samba.org (Paul Mackerras) Date: Tue, 6 Dec 2005 11:51:39 +1100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <20051206040643.452349000@localhost> References: <20051206035220.097737000@localhost> <20051206040643.452349000@localhost> Message-ID: <17300.57499.400458.387421@cargo.ozlabs.ibm.com> Arnd Bergmann writes: > Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/spufs/file.c Remind me again why spufs is under arch/powerpc/ rather than fs/ ? Regards, Paul. From rsa at us.ibm.com Mon Dec 5 08:11:56 2005 From: rsa at us.ibm.com (Ryan S. Arnold) Date: Sun, 04 Dec 2005 15:11:56 -0600 Subject: [RFC PATCH 0/5] CELL bogus_console port to hvc_console backend driver Message-ID: <43935B9C.5020503@us.ibm.com> The following patch-set was created against the 2.6.14-rc5 CBE (cell broadband environment) patch-set provided by Arnd Bergman. The purpose of this patch-set is to port the CELL IBM Full System Simulator bogus_console.c driver to an hvc_console back-end driver, namely hvc_fss.c. 
Our intention is to support binary-compatibility of all hvc_console back-end drivers such that all drivers can be built into the kernel at configuration time but only one back-end (the one that detects that it is running on the right hardware) actually registers the front-end driver at console init. This is a request-for-comments. Please contribute any suggestions especially in-regards-to the Makefile and Kconfig changes. I'm not very experienced with them. I do realize that the current mainline kernel has some significant differences in the hvc_console driver versus the 2.6.14-rc5 kernel. We'll address this when the cell patches go upstream. I've tested these patches on the CELL IBM Full System Simulator and the console works fine. I've not had a chance to test the hvc_vio back-end on ppc64 hardware but I'll do this once I have a chance. I suspect that there will be some udev magic required to get the /dev/hvc0 device to appear on the CELL simulator since there isn't an actual serial device. Using this driver one can actually execute agetty on the console as well. Thanks, Ryan S. Arnold IBM Linux Technology Center From linas at austin.ibm.com Tue Dec 6 12:37:35 2005 From: linas at austin.ibm.com (linas) Date: Mon, 5 Dec 2005 19:37:35 -0600 Subject: [PATCH] powerpc: minor cleanup of void ptr deref Message-ID: <20051206013735.GJ31651@austin.ibm.com> Paul, Please apply. --linas Minor: use macro to perform void pointer deref; this may someday help avoid pointer typecasting errors. 
Signed-off-by: Linas Vepstas -- Index: linux-2.6.15-rc3-mm1/arch/powerpc/platforms/powermac/pci.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/arch/powerpc/platforms/powermac/pci.c 2005-12-01 15:14:41.000000000 -0600 +++ linux-2.6.15-rc3-mm1/arch/powerpc/platforms/powermac/pci.c 2005-12-05 13:52:03.207941067 -0600 @@ -326,7 +326,7 @@ else busdn = hose->arch_data; for (dn = busdn->child; dn; dn = dn->sibling) - if (dn->data && PCI_DN(dn)->devfn == devfn) + if (PCI_DN(dn) && PCI_DN(dn)->devfn == devfn) break; if (dn == NULL) return -1; Index: linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/iommu.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/arch/powerpc/platforms/pseries/iommu.c 2005-12-01 15:14:41.000000000 -0600 +++ linux-2.6.15-rc3-mm1/arch/powerpc/platforms/pseries/iommu.c 2005-12-05 13:52:03.207941067 -0600 @@ -433,7 +433,7 @@ return; } - ppci = pdn->data; + ppci = PCI_DN(pdn); if (!ppci->iommu_table) { /* Bussubno hasn't been copied yet. * Do it now because iommu_table_setparms_lpar needs it. @@ -480,10 +480,10 @@ * an already allocated iommu table is found and use that. 
*/ - while (dn && dn->data && PCI_DN(dn)->iommu_table == NULL) + while (dn && PCI_DN(dn) && PCI_DN(dn)->iommu_table == NULL) dn = dn->parent; - if (dn && dn->data) { + if (dn && PCI_DN(dn)) { PCI_DN(mydn)->iommu_table = PCI_DN(dn)->iommu_table; } else { DBG("iommu_dev_setup_pSeries, dev %p (%s) has no iommu table\n", dev, pci_name(dev)); @@ -494,7 +494,7 @@ { int err = NOTIFY_OK; struct device_node *np = node; - struct pci_dn *pci = np->data; + struct pci_dn *pci = PCI_DN(np); switch (action) { case PSERIES_RECONFIG_REMOVE: @@ -530,7 +530,7 @@ */ dn = pci_device_to_OF_node(dev); - for (pdn = dn; pdn && pdn->data && !PCI_DN(pdn)->iommu_table; + for (pdn = dn; pdn && PCI_DN(pdn) && !PCI_DN(pdn)->iommu_table; pdn = pdn->parent) { dma_window = (unsigned int *) get_property(pdn, "ibm,dma-window", NULL); @@ -549,7 +549,7 @@ DBG("Found DMA window, allocating table\n"); } - pci = pdn->data; + pci = PCI_DN(pdn); if (!pci->iommu_table) { /* iommu_table_setparms_lpar needs bussubno. */ pci->bussubno = pci->phb->bus->number; From hollis at penguinppc.org Tue Dec 6 12:59:03 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Mon, 5 Dec 2005 19:59:03 -0600 Subject: Linuxppc64-dev Digest, Vol 16, Issue 11 In-Reply-To: <9b23fc710512050035i117c7bd7y75a01f487dc74654@mail.gmail.com> References: <20051205010004.52D1568876@ozlabs.org> <9b23fc710512050035i117c7bd7y75a01f487dc74654@mail.gmail.com> Message-ID: <84a13e07288333aff98dfd0c36591b00@penguinppc.org> On Dec 5, 2005, at 2:35 AM, Renuka Pampana wrote: > > Where can i get PPC440ep (yosemite) patch for 64 bit kernel. Can you > give me some pointers to refer. Please do not quote an entire email digest, especially since your post had nothing to do with that at all. The 440EP is not a 64-bit processor, so you will never find support for it in the 64-bit kernel. Instead please see http://penguinppc.org/embedded/ and the resources it points you to. 
-Hollis From arnd at arndb.de Tue Dec 6 21:18:17 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 6 Dec 2005 11:18:17 +0100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <17300.57499.400458.387421@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <20051206040643.452349000@localhost> <17300.57499.400458.387421@cargo.ozlabs.ibm.com> Message-ID: <200512061118.19633.arnd@arndb.de> On Dinsdag 06 Dezember 2005 01:51, Paul Mackerras wrote: > Arnd Bergmann writes: > > > Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/spufs/file.c > > Remind me again why spufs is under arch/powerpc/ rather than fs/ ? We had a discussion about this in August, after the patch at http://patchwork.ozlabs.org/linuxppc64/patch?id=2140 Nobody had voiced any objections against the arch/powerpc location, and Pekka had good reasons against fs/, so I changed it. Arnd <>< From arnd at arndb.de Tue Dec 6 21:23:39 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 6 Dec 2005 11:23:39 +0100 Subject: [PATCH 12/14] spidernet: check if firmware was loaded correctly In-Reply-To: <17300.57951.373636.507621@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <20051206040645.193163000@localhost> <17300.57951.373636.507621@cargo.ozlabs.ibm.com> Message-ID: <200512061123.40059.arnd@arndb.de> On Dinsdag 06 Dezember 2005 01:59, Paul Mackerras wrote: > Arnd Bergmann writes: > > > Uploading the device firmware may fail if wrong input data > > was provided by the user. This checks for the condition. > > > > From: Jens.Osterkamp at de.ibm.com > > Cc: netdev at vger.kernel.org > > This one should be sent to Jeff Garzik, along with patches 11, 13 and > 14. Ok. Jens, is it ok for you if you send the network driver stuff to jgarzik at pobox.com, Cc: netdev at vger.kernel.org yourself in the future? 
Arnd <>< From dwmw2 at redhat.com Tue Dec 6 22:31:17 2005 From: dwmw2 at redhat.com (David Woodhouse) Date: Tue, 06 Dec 2005 11:31:17 +0000 Subject: [RFC PATCH 5/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: References: Message-ID: <1133868677.4136.80.camel@baythorne.infradead.org> On Mon, 2005-12-05 at 11:17 -0600, Eric V Van hensbergen wrote: > I suppose if there is sufficient pull I could push them at any time -- I > haven't gone down this path because I'm not sure how I feel including > simulator drivers in the mainline kernel tree. If the linuxppc64 folks > think this is valuable, I'd be happy to clean-up the drivers a bit more > and submit a patch. They're fairly self-contained and harmless -- I think it would be good to merge them. The bogusdisk driver wants a proper major number assigned, and it wants to look in the device tree for /mambo/bogusdisc nodes (which we can trivially hack the sim's tcl script to provide). -- dwmw2 From penberg at cs.helsinki.fi Tue Dec 6 22:38:28 2005 From: penberg at cs.helsinki.fi (Pekka Enberg) Date: Tue, 06 Dec 2005 13:38:28 +0200 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <200512061118.19633.arnd@arndb.de> References: <20051206035220.097737000@localhost> <20051206040643.452349000@localhost> <17300.57499.400458.387421@cargo.ozlabs.ibm.com> <200512061118.19633.arnd@arndb.de> Message-ID: <1133869108.7968.1.camel@localhost> Hi, On Dinsdag 06 Dezember 2005 01:51, Paul Mackerras wrote: > > Remind me again why spufs is under arch/powerpc/ rather than fs/ ? On Tue, 2005-12-06 at 11:18 +0100, Arnd Bergmann wrote: > We had a discussion about this in August, after the patch > at http://patchwork.ozlabs.org/linuxppc64/patch?id=2140 > > Nobody had voiced any objections against the arch/powerpc location, > and Pekka had good reasons against fs/, so I changed it. It had arch specific hooks which IMHO do not belong into fs/. 
Pekka From ntl at pobox.com Wed Dec 7 02:45:58 2005 From: ntl at pobox.com (Nathan Lynch) Date: Tue, 6 Dec 2005 10:45:58 -0500 Subject: [PATCH] reworked: numa placement for dynamically added memory In-Reply-To: <20051205200642.GA20613@w-mikek2.ibm.com> References: <20051205200642.GA20613@w-mikek2.ibm.com> Message-ID: <20051206154557.GA8901@localhost.localdomain> Mike Kravetz wrote: > Here is a reworked version of the patch with changes suggested by > Nathan. Again, this patch depends on: > http://ozlabs.org/pipermail/linuxppc64-dev/2005-December/006923.html > > This patch places dynamically added memory within the appropriate > numa node. A new routine hot_add_scn_to_nid() replicates most of > the memory scanning code in parse_numa_properties(). Changes look good to me, thanks. Nathan From miltonm at bga.com Wed Dec 7 03:40:00 2005 From: miltonm at bga.com (Milton Miller) Date: Tue, 6 Dec 2005 10:40:00 -0600 Subject: Booting OS on PowerPC Message-ID: On Thu Dec 1 19:48:11 EST 2005, veera venkata prasad j wrote: > Can anybody tell me how Linux boots on a PowerPC machine > when Open Firmware is up. To be more precise, what is > the "known-environment" that the OS expects from Open > Firmware. It expects (1) to be in 32-bit mode, (2) r3 and r4 to be 0, (3) r5 to be a pointer to the client interface callback, (4) r1 to be a usable stack, and (5) the image to be loaded according to the ELF header of the zImage wrapper. More information can be found at penguinppc.org, in the CHRP (Common Hardware Reference Platform, outdated) and relevant Open Firmware specifications, and by reading prom.c in the source.
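The entry conditions listed above can be sketched as a small user-space C check. This is only an illustration under stated assumptions: struct of_entry_state and of_entry_check() are hypothetical names that do not exist in the kernel, and condition (5), the ELF loading of the zImage wrapper, is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of what the firmware hands over at kernel entry. */
struct of_entry_state {
	bool     mode_64bit; /* false: the CPU is running in 32-bit mode */
	uint32_t r1;         /* stack pointer, must be usable */
	uint32_t r3;         /* expected to be 0 */
	uint32_t r4;         /* expected to be 0 */
	uint32_t r5;         /* address of the client interface callback */
};

/* Return NULL when the state matches the list above, else a short reason. */
static const char *of_entry_check(const struct of_entry_state *s)
{
	if (s->mode_64bit)
		return "not in 32-bit mode";
	if (s->r3 != 0 || s->r4 != 0)
		return "r3/r4 not zero";
	if (s->r5 == 0)
		return "no client interface callback in r5";
	if (s->r1 == 0)
		return "no usable stack in r1";
	return NULL;
}
```

The real checks, such as they are, live in the early boot code (prom.c and friends) and are not written this way; the point is only to show which registers carry which meaning.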
If you are looking to skip openfirmware, see the draft document in the current thread on this mailing list RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware milton From arnd at arndb.de Wed Dec 7 05:49:30 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 6 Dec 2005 19:49:30 +0100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <1133869108.7968.1.camel@localhost> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> Message-ID: <200512061949.33482.arnd@arndb.de> On Dinsdag 06 Dezember 2005 12:38, Pekka Enberg wrote: > On Dinsdag 06 Dezember 2005 01:51, Paul Mackerras wrote: > > > Remind me again why spufs is under arch/powerpc/ rather than fs/ ? > > On Tue, 2005-12-06 at 11:18 +0100, Arnd Bergmann wrote: > > We had a discussion about this in August, after the patch > > at http://patchwork.ozlabs.org/linuxppc64/patch?id=2140 > > > > Nobody had voiced any objections against the arch/powerpc location, > > and Pekka had good reasons against fs/, so I changed it. > > It had arch specific hooks which IMHO do not belong into fs/. Since the discussion came up again in irc, I looked up the existing file systems. outside of fs/, we have the following file systems. 
find -name \*.c | grep -v ^./fs | xargs grep struct.file_system_type.*= ./arch/ia64/kernel/perfmon.c:static struct file_system_type pfm_fs_type = { ./drivers/infiniband/core/uverbs_main.c:static struct file_system_type uverbs_event_fs = { ./drivers/isdn/capi/capifs.c:static struct file_system_type capifs_fs_type = { ./drivers/misc/ibmasm/ibmasmfs.c:static struct file_system_type ibmasmfs_type = { ./drivers/oprofile/oprofilefs.c:static struct file_system_type oprofilefs_type = { ./drivers/usb/core/inode.c:static struct file_system_type usb_fs_type = { ./drivers/usb/gadget/inode.c:static struct file_system_type gadgetfs_type = { ./ipc/mqueue.c:static struct file_system_type mqueue_fs_type = { ./kernel/cpuset.c:static struct file_system_type cpuset_fs_type = { ./kernel/futex.c:static struct file_system_type futex_fs_type = { ./mm/shmem.c:static struct file_system_type tmpfs_fs_type = { ./mm/tiny-shmem.c:static struct file_system_type tmpfs_fs_type = { ./net/socket.c:static struct file_system_type sock_fs_type = { ./net/sunrpc/rpc_pipe.c:static struct file_system_type rpc_pipe_fs_type = { ./security/inode.c:static struct file_system_type fs_type = { ./security/selinux/selinuxfs.c:static struct file_system_type sel_fs_type = { In fs/, most code deals with actual files stored on a disk or similar, with the exception of: ./fs/binfmt_misc.c:static struct file_system_type bm_fs_type = { ./fs/block_dev.c:static struct file_system_type bd_type = { ./fs/debugfs/inode.c:static struct file_system_type debug_fs_type = { ./fs/devfs/base.c:static struct file_system_type devfs_fs_type = { ./fs/devpts/inode.c:static struct file_system_type devpts_fs_type = { ./fs/eventpoll.c:static struct file_system_type eventpoll_fs_type = { ./fs/hugetlbfs/inode.c:static struct file_system_type hugetlbfs_fs_type = { ./fs/inotify.c:static struct file_system_type inotify_fs_type = { ./fs/openpromfs/inode.c:static struct file_system_type openprom_fs_type = { ./fs/pipe.c:static struct file_system_type 
pipe_fs_type = { ./fs/proc/root.c:static struct file_system_type proc_fs_type = { ./fs/relayfs/inode.c:static struct file_system_type relayfs_fs_type = { ./fs/sysfs/mount.c:static struct file_system_type sysfs_fs_type = { I guess there is no strict rule where these file systems go to, e.g. hugetlbs could just as well live near mm/shmem.c or any of those outside of fs/ could be moved in there. I don't really care where I put spufs, but I would prefer to move the files only one more time at most. Initially, they were in fs/spufs, and I moved them to arch/powerpc/platforms/cell/spufs at Pekkas suggestion. Arnd <>< From mikey at neuling.org Wed Dec 7 05:59:10 2005 From: mikey at neuling.org (Michael Neuling) Date: Tue, 6 Dec 2005 12:59:10 -0600 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <1133816807.8577.50.camel@cashmere.sps.mot.com> References: <1133816807.8577.50.camel@cashmere.sps.mot.com> Message-ID: <20051206125910.9f83d230.mikey@neuling.org> > dtc source code can be found at > > > WARNING: This version is still in early development stage; the > resulting device-tree "blobs" have not yet been validated with the > kernel. This has been done now. We added an insert blob option to the kexec tools so that a blob generated with dtc could be used by a kernel booted with kexec. See: http://lists.osdl.org/pipermail/fastboot/2005-October/002061.html Mikey From penberg at cs.helsinki.fi Wed Dec 7 06:05:47 2005 From: penberg at cs.helsinki.fi (Pekka Enberg) Date: Tue, 06 Dec 2005 21:05:47 +0200 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <200512061949.33482.arnd@arndb.de> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> Message-ID: <1133895947.3279.4.camel@localhost> Hi, On Tue, 2005-12-06 at 19:49 +0100, Arnd Bergmann wrote: > I guess there is no strict rule where these file systems go to, e.g. 
> hugetlbs could just as well live near mm/shmem.c or any of those outside > of fs/ could be moved in there. hugetlbs does not contain architecture specific code so I don't see it as a problem. On Tue, 2005-12-06 at 19:49 +0100, Arnd Bergmann wrote: > I don't really care where I put spufs, but I would prefer to move > the files only one more time at most. > Initially, they were in fs/spufs, and I moved them to > arch/powerpc/platforms/cell/spufs at Pekkas suggestion. I would prefer them to stay in arch/powerpc/. As far as I understand, spufs will never have any use for platforms other than cell, so I really don't see any point in putting it in fs/. Pekka From arnd at arndb.de Wed Dec 7 06:48:55 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 6 Dec 2005 20:48:55 +0100 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <1133816807.8577.50.camel@cashmere.sps.mot.com> References: <1133816807.8577.50.camel@cashmere.sps.mot.com> Message-ID: <200512062048.56131.arnd@arndb.de> On Maandag 05 Dezember 2005 22:06, Jon Loeliger wrote: > Included below is a proposed Revision 0.5 of the > "Booting the Linux/ppc kernel without Open Firmware" > document. This modification primarily extends the > Revision 0.4 by adding definitions for OF Nodes that > cover the System-On-a-Chip features found on PPC parts. > It also generalizes some earlier wording that pertained > to only PPC64 parts and covers the new, merged PPC 32 > and 64 parts together. Finally, minor typos, style > consistency and grammar problems were corrected. A few points are not clear yet, either because I don't understand the document or one it references correctly or because I might have different requirements: - Do we need a way to identify the type of soc bus? There are different standards for this, e.g. PLB4 on PPC440 or the EIB on the Cell BE. 
My initial idea was to have different device-type properties for these, but I now think that device_type = "soc" makes sense for all of them. Maybe we could add a model or compatible property for them. - It does not really belong in this document, but is related anyway: how do you want to represent this in Linux? Currently, most of these would be of_platform_device, but I think it would be good to have a new bus_type for it. The advantage would be that you can see the devices in /sys/devices/soc@xxx/ even if the driver is not loaded and the driver can even be autoloaded by udev. Also, which properties should show up in sysfs? All of them or just those specified in this document or a subset of them? - What do we do with pci root devices? They are often physically connected to the internal CPU bus, so it would make sense to represent them this way in the device tree. Should we add them to the specification here? Would it even work the expected way in Linux? - For some devices, you mandate a model property, for others you don't. Is this intentional? It might be easier to find the right device driver if the match string always contains a model name. - How would I represent nested interrupt controllers? E.g. suppose I have a Cell internal interrupt controller on one SOC bus and an external interrupt controller on another SOC bus but have that deliver interrupts to the first one. - Should it mention nested SOC buses, e.g. a PLB4 bus connected to a PLB5 bus? - The title says 'without Open Firmware', but it should also be allowed to use the same SOC bus layout when using SLOF or some other OF implementation, right? - Also not new in this version, but still: Should there be support for specifying CPUs with multiple SMT threads?
Arnd <>< From jdl at freescale.com Wed Dec 7 07:08:00 2005 From: jdl at freescale.com (Jon Loeliger) Date: Tue, 06 Dec 2005 14:08:00 -0600 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <20051206125910.9f83d230.mikey@neuling.org> References: <20051206125910.9f83d230.mikey@neuling.org> Message-ID: <1133899679.8577.82.camel@cashmere.sps.mot.com> On Tue, 2005-12-06 at 12:59, Michael Neuling wrote: > > dtc source code can be found at > > And on that note, I should probably make people aware that the current form of this document can now be found as part of the DTC tree! > > WARNING: This version is still in early development stage; the > > resulting device-tree "blobs" have not yet been validated with the > > kernel. > > This has been done now. We added an insert blob option to the kexec > tools so that a blob generated with dtc could be used by a kernel > booted with kexec. See: > > http://lists.osdl.org/pipermail/fastboot/2005-October/002061.html > > Mikey OK. So, do we want to have patches (versus the DTC version) sent to this list for changes to this document now too? jdl From paulus at samba.org Wed Dec 7 08:10:18 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 7 Dec 2005 08:10:18 +1100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <1133895947.3279.4.camel@localhost> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> Message-ID: <17301.65082.251692.675360@cargo.ozlabs.ibm.com> Pekka Enberg writes: > I would prefer them to stay in arch/powerpc/. As far as I understand, > spufs will never have any use for platforms other than cell, so I really > don't see any point in putting it in fs/.
The point is that people making changes to the filesystem interfaces will be much more likely to notice and fix stuff that is under fs/ than code that is buried deep under arch/ somewhere. Filesystems should go under fs/ for the sake of long-term maintainability. The fact that it's only used on one architecture is irrelevant - you simply make sure (with the appropriate Kconfig bits) that it's only offered on that architecture. Paul. From penberg at cs.helsinki.fi Wed Dec 7 08:41:38 2005 From: penberg at cs.helsinki.fi (Pekka Enberg) Date: Tue, 06 Dec 2005 23:41:38 +0200 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <17301.65082.251692.675360@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> <17301.65082.251692.675360@cargo.ozlabs.ibm.com> Message-ID: <1133905298.8027.13.camel@localhost> Hi Paul, On Wed, 2005-12-07 at 08:10 +1100, Paul Mackerras wrote: > The point is that people making changes to the filesystem interfaces > will be much more likely to notice and fix stuff that is under fs/ > than code that is buried deep under arch/ somewhere. Filesystems > should go under fs/ for the sake of long-term maintainability. The > fact that it's only used on one architecture is irrelevant - you > simply make sure (with the appropriate Kconfig bits) that it's only > offered on that architecture. I think the fact that it is highly architecture specific is relevant. I have no way of testing spufs changes except on cell, no? And if I am developing on a cell, I probably will notice it in arch/ all the same. So I don't quite buy your maintenance argument. But as Arnd said, there are no clear rules on what kind of filesystems should go into fs/ so please do whatever you must.
Pekka From benh at kernel.crashing.org Wed Dec 7 08:41:15 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 07 Dec 2005 08:41:15 +1100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <1133869108.7968.1.camel@localhost> References: <20051206035220.097737000@localhost> <20051206040643.452349000@localhost> <17300.57499.400458.387421@cargo.ozlabs.ibm.com> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> Message-ID: <1133905276.7168.54.camel@gaston> On Tue, 2005-12-06 at 13:38 +0200, Pekka Enberg wrote: > Hi, > > On Dinsdag 06 Dezember 2005 01:51, Paul Mackerras wrote: > > > Remind me again why spufs is under arch/powerpc/ rather than fs/ ? > > On Tue, 2005-12-06 at 11:18 +0100, Arnd Bergmann wrote: > > We had a discussion about this in August, after the patch > > at http://patchwork.ozlabs.org/linuxppc64/patch?id=2140 > > > > Nobody had voiced any objections against the arch/powerpc location, > > and Pekka had good reasons against fs/, so I changed it. > > It had arch specific hooks which IMHO do not belong into fs/. Hrm... but not being into fs/ makes sure people like viro will "miss" it when fixing all filesystems... Ben. From ntl at pobox.com Wed Dec 7 09:14:34 2005 From: ntl at pobox.com (Nathan Lynch) Date: Tue, 6 Dec 2005 17:14:34 -0500 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <17301.65082.251692.675360@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> <17301.65082.251692.675360@cargo.ozlabs.ibm.com> Message-ID: <20051206221434.GB8901@localhost.localdomain> Paul Mackerras wrote: > Pekka Enberg writes: > > > I would prefer them to stay in arch/powerpc/. 
As far as I understand, > > spufs will never have any use for platforms other than cell, so I really > > don't see any point in putting it in fs/. > > The point is that people making changes to the filesystem interfaces > will be much more likely to notice and fix stuff that is under fs/ > than code that is buried deep under arch/ somewhere. Filesystems > should go under fs/ for the sake of long-term maintainability. The > fact that it's only used on one architecture is irrelevant - you > simply make sure (with the appropriate Kconfig bits) that it's only > offered on that architecture. openpromfs seems to be a precedent here. It makes sense only on sparc and sparc64 but it lives in fs/. From mikey at neuling.org Wed Dec 7 09:18:55 2005 From: mikey at neuling.org (Michael Neuling) Date: Tue, 6 Dec 2005 16:18:55 -0600 Subject: [PATCH 9/11] powerpc: Parse crashkernel= parameter in first kernel In-Reply-To: <20051205004002.7A01B68889@ozlabs.org> References: <1133743149.268607.418162138937.qpush@concordia> <20051205004002.7A01B68889@ozlabs.org> Message-ID: <20051206161855.745bd0be.mikey@neuling.org> > + RELOC(prom_crashk_size) = prom_memparse(opt, &opt); To avoid a compiler warning, this should be: RELOC(prom_crashk_size) = prom_memparse(opt, (const char **)&opt); Mikey From paulus at samba.org Wed Dec 7 09:19:28 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 7 Dec 2005 09:19:28 +1100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <1133905298.8027.13.camel@localhost> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> <17301.65082.251692.675360@cargo.ozlabs.ibm.com> <1133905298.8027.13.camel@localhost> Message-ID: <17302.3696.364669.18755@cargo.ozlabs.ibm.com> Pekka Enberg writes: > I think the fact that it is highly architecture specific is relevant. 
I > have no way of testing spufs changes except on cell, no? And if I am > developing on a cell, I probably will notice it in arch/ all the same. > So I don't quite buy your maintenance argument. Think about someone changing the VFS layer interface and fixing up all the filesystems to accommodate that change. That person is doing some of your work for you, so you want to make it easy for him/her to find your filesystem. That's the sort of thing I was referring to as maintenance. As for changes on the cell-specific side, the people doing those changes will know where to find it, so it isn't a problem having it in fs/. Having it in fs/ also means that it is more likely that people familiar with VFS internals will look through your code and comment on it. I know that can be painful in the short term, but in the long term it will lead to better code. Paul. From arnd at arndb.de Wed Dec 7 09:27:08 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 6 Dec 2005 23:27:08 +0100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <17302.3696.364669.18755@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <1133905298.8027.13.camel@localhost> <17302.3696.364669.18755@cargo.ozlabs.ibm.com> Message-ID: <200512062327.08448.arnd@arndb.de> On Tuesday 06 December 2005 23:19, Paul Mackerras wrote: > Having it in fs/ also means that it is more likely that people > familiar with VFS internals will look through your code and comment on > it. I know that can be painful in the short term, but in the long > term it will lead to better code. Yes, that is an excellent point. How should we proceed to get the code there? Do you want to move the files around in your git tree or do you prefer me to send a full set of patches again and kill the existing copy? Obviously, I'd prefer the former, since it would mean less work for me with the same result.
Arnd <>< From michael at ellerman.id.au Wed Dec 7 10:39:03 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 6 Dec 2005 17:39:03 -0600 Subject: [RFC] Should lmb_alloc() always panic on failure? Message-ID: <200512061739.06783.michael@ellerman.id.au> Hi y'all, Currently lmb_alloc(_base) returns 0 if it can't allocate memory, but a lot of places don't actually check. I was thinking it might be better if it just panicked. The following functions call lmb_alloc() and don't check the return value: finish_device_tree(), rtas_initialize(), irqstack_early_init(), emergency_stack_init(), early_get_page(), MMU_init_hw(), stabs_alloc(), pmac_probe(), alloc_u3_dart_table(). These functions check and panic() or BUG_ON(): unflatten_device_tree(), htab_initialize(), do_init_bootmem(), dart_init(). The only other caller is careful_allocation(), which checks and retries the alloc with different parameters - we could accommodate this with an __lmb_alloc() or similar. What do people think? cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person
From david at gibson.dropbear.id.au Wed Dec 7 11:17:20 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 7 Dec 2005 11:17:20 +1100 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <200512062048.56131.arnd@arndb.de> References: <1133816807.8577.50.camel@cashmere.sps.mot.com> <200512062048.56131.arnd@arndb.de> Message-ID: <20051207001720.GB25533@localhost.localdomain> On Tue, Dec 06, 2005 at 08:48:55PM +0100, Arnd Bergmann wrote: > On Maandag 05 Dezember 2005 22:06, Jon Loeliger wrote: > > Included below is a proposed Revision 0.5 of the > > "Booting the Linux/ppc kernel without Open Firmware" > > document. This modification primarily extends the > > Revision 0.4 by adding definitions for OF Nodes that > > cover the System-On-a-Chip features found on PPC parts. > > It also generalizes some earlier wording that pertained > > to only PPC64 parts and covers the new, merged PPC 32 > > and 64 parts together. Finally, minor typos, style > > consistency and grammar problems were corrected. > > A few points are not clear yet, either because I don't understand the > document or one it references correctly or because I might have > different requirements: All comments below IMHO, and subject to persuasion otherwise. > - Do we need a way to identify the type of soc bus? There are different > standards for this, e.g. PLB4 on PPC440 or the EIB on the Cell BE. > My initial idea was to have different device-type properties for these, > but I now think that device_type = "soc" makes sense for all of them. > Maybe we could add a model or compatible property for them. I think it would be a good idea to have something labelling the specific type of SOC bus, though I'm not immediately sure where.
"model" perhaps, if it rarely has an effect on how to operate the bus. > - It does not really belong in this document, but is related anyway: > how do you want to represent this in Linux? Currently, most of these > would be of_platform_device, but I think it would be good to have > a new bus_type for it. The advantage would be that you can see the > devices in /sys/devices/soc@xxx/ even if the driver is not loaded > and the driver can even be autoloaded by udev. > Also, which properties should show up in sysfs? All of them or just > those specified in this document or a subset of them? I concur - I believe we already have a bus_type for on-chip devices on 4xx. > - What do we do with pci root devices? They are often physically connected > to the internal CPU bus, so it would make sense to represent them > this way in the device tree. Should we add them to the specification > here? Would it even work the expected way in Linux? The host bridges should sit on the soc bus then, as you suggest (just as the PCI busses hang off HyperTransport on the G5). I think you need to refer to the OF docs for how to represent the PCI host bridge and devices themselves. > - For some devices, you mandate a model property, for others you don't. > Is this intentional? It might be easier to find the right device > driver if the match string always contains a model name. You rarely want to match model name to find a device - generally you want to match either on "compatible" or "device_type", or possibly both. > - How would I represent nested interrupt controllers? E.g. suppose I > have a Cell internal interrupt controller on one SOC bus and > an external interrupt controller on another SOC bus but have > that deliver interrupts to the first one. Again, I believe this is in the OF docs - interrupt controllers have an interrupt-parent property IIRC, which gives the phandle of the next interrupt controller up the chain. > - Should it mention nested SOC buses, e.g.
a PLB4 bus connected to a > PLB5 bus? Yes. > - The title says 'without Open Firmware', but it should also be allowed > to use the same SOC bus layout when using SLOF or some other OF > implementation, right? I guess so. > - Also not new in this version, but still: Should there be support for > specifying CPUs with multiple SMT threads? Umm.. maybe. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From geoffrey.levand at am.sony.com Wed Dec 7 11:34:00 2005 From: geoffrey.levand at am.sony.com (Geoff Levand) Date: Tue, 06 Dec 2005 16:34:00 -0800 Subject: [RFC] spufs: wrap spu priveleged register access Message-ID: <43962DF8.2040007@am.sony.com> The current spufs implementation accesses privileged (privilege 1) spu registers directly, which may not be allowed by a hypervisor. This patch adds wrapper functions that can be implemented as needed as either platform specific hypervisor calls or direct register accesses. Included is a sample of support for a fictitious hypervisor. This patch is just to give an idea, please re-write it as you like. It may be a good idea to wrap not only irq_mask/stat but also any other regs, and to remove generic functions like spu_priv1_get64/put64() since each access may be mapped to different hypervisor calls. Arnd mentioned it would be best to arrange for runtime configuration possibly using firmware_has_feature(). 
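A minimal sketch of the runtime-configured wrappers mentioned above, written as user-space C so it compiles on its own. The names spu_irq_mask_get() and spu_irq_mask_set() come from this patch; everything else is an assumption made for illustration: the cut-down struct spu, the spu_priv1_ops table, the two backends and spu_priv1_select() are hypothetical, and plain arrays stand in for both the memory-mapped priv1 registers and the state kept by the fictitious hypervisor.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPU_IRQ_CLASSES 3

/* Cut-down stand-in for struct spu; the real structure is much larger. */
struct spu {
	uint64_t mmio_int_mask[SPU_IRQ_CLASSES]; /* models priv1 int_mask_classN_RW */
	uint64_t hv_int_mask[SPU_IRQ_CLASSES];   /* models state owned by a hypervisor */
};

/* Hypothetical ops table: one entry per wrapped privilege 1 access. */
struct spu_priv1_ops {
	void     (*irq_mask_set)(struct spu *spu, int cls, uint64_t mask);
	uint64_t (*irq_mask_get)(struct spu *spu, int cls);
};

/* Backend 1: direct register access (out_be64/in_be64 in the kernel). */
static void direct_irq_mask_set(struct spu *spu, int cls, uint64_t mask)
{
	spu->mmio_int_mask[cls] = mask;
}

static uint64_t direct_irq_mask_get(struct spu *spu, int cls)
{
	return spu->mmio_int_mask[cls];
}

/* Backend 2: what would be hypervisor calls on a real platform. */
static void hv_irq_mask_set(struct spu *spu, int cls, uint64_t mask)
{
	spu->hv_int_mask[cls] = mask;
}

static uint64_t hv_irq_mask_get(struct spu *spu, int cls)
{
	return spu->hv_int_mask[cls];
}

static const struct spu_priv1_ops direct_ops = {
	.irq_mask_set = direct_irq_mask_set,
	.irq_mask_get = direct_irq_mask_get,
};

static const struct spu_priv1_ops hv_ops = {
	.irq_mask_set = hv_irq_mask_set,
	.irq_mask_get = hv_irq_mask_get,
};

static const struct spu_priv1_ops *priv1_ops = &direct_ops;

/* Chosen once at boot, e.g. from a firmware_has_feature()-style test. */
static void spu_priv1_select(bool under_hypervisor)
{
	priv1_ops = under_hypervisor ? &hv_ops : &direct_ops;
}

/* The wrappers callers use; they never touch spu->priv1 themselves. */
static inline void spu_irq_mask_set(struct spu *spu, int cls, uint64_t mask)
{
	priv1_ops->irq_mask_set(spu, cls, mask);
}

static inline uint64_t spu_irq_mask_get(struct spu *spu, int cls)
{
	return priv1_ops->irq_mask_get(spu, cls);
}
```

With this shape a caller such as the mailbox code keeps writing spu_irq_mask_set(spu, 2, spu_irq_mask_get(spu, 2) & ~0x1) regardless of the backend. The cost is an indirect call per access; compile-time selection would avoid that but gives up the single kernel image.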
Signed-off-by: Masato Noguchi Signed-off-by: Geoff Levand Index: linux-2.6.15-rc4-cell/arch/powerpc/platforms/cell/spu_base.c =================================================================== --- linux-2.6.15-rc4-cell.orig/arch/powerpc/platforms/cell/spu_base.c 2005-12-02 16:26:20.000000000 -0800 +++ linux-2.6.15-rc4-cell/arch/powerpc/platforms/cell/spu_base.c 2005-12-02 16:27:40.000000000 -0800 @@ -141,8 +141,8 @@ /* atomically disable SPU mailbox interrupts */ spin_lock(&spu->register_lock); - out_be64(&spu->priv1->int_mask_class2_RW, - in_be64(&spu->priv1->int_mask_class2_RW) & ~0x1); + spu_irq_mask_set(spu, 2, + spu_irq_mask_get(spu, 2) & ~0x1); spin_unlock(&spu->register_lock); return 0; } @@ -177,8 +177,8 @@ /* atomically disable SPU mailbox interrupts */ spin_lock(&spu->register_lock); - out_be64(&spu->priv1->int_mask_class2_RW, - in_be64(&spu->priv1->int_mask_class2_RW) & ~0x10); + spu_irq_mask_set(spu, 2, + spu_irq_mask_get(spu, 2) & ~0x10); spin_unlock(&spu->register_lock); return 0; } @@ -202,7 +202,7 @@ spu->class_0_pending = 0; - stat = in_be64(&spu->priv1->int_stat_class0_RW); + stat = spu_irq_stat_get(spu, 0); if (stat & 1) /* invalid MFC DMA */ __spu_trap_invalid_dma(spu); @@ -213,7 +213,7 @@ if (stat & 4) /* error on SPU */ __spu_trap_error(spu); - out_be64(&spu->priv1->int_stat_class0_RW, stat); + spu_irq_stat_clear(spu, 0, stat); return 0; } @@ -227,13 +227,13 @@ /* atomically read & clear class1 status. 
*/ spin_lock(&spu->register_lock); - mask = in_be64(&spu->priv1->int_mask_class1_RW); - stat = in_be64(&spu->priv1->int_stat_class1_RW) & mask; - dar = in_be64(&spu->priv1->mfc_dar_RW); - dsisr = in_be64(&spu->priv1->mfc_dsisr_RW); + mask = spu_irq_mask_get(spu, 1); + stat = spu_irq_stat_get(spu, 1) & mask; + dar = spu_priv1_get64(spu, mfc_dar_RW); + dsisr = spu_priv1_get64(spu, mfc_dsisr_RW); if (stat & 2) /* mapping fault */ - out_be64(&spu->priv1->mfc_dsisr_RW, 0UL); - out_be64(&spu->priv1->int_stat_class1_RW, stat); + spu_priv1_set64(spu, mfc_dsisr_RW, 0UL); + spu_irq_stat_clear(spu, 1, stat); spin_unlock(&spu->register_lock); if (stat & 1) /* segment fault */ @@ -259,10 +259,10 @@ unsigned long stat; spu = data; - stat = in_be64(&spu->priv1->int_stat_class2_RW); + stat = spu_irq_stat_get(spu, 2); pr_debug("class 2 interrupt %d, %lx, %lx\n", irq, stat, - in_be64(&spu->priv1->int_mask_class2_RW)); + spu_irq_mask_get(spu, 2)); if (stat & 1) /* PPC core mailbox */ @@ -280,7 +280,7 @@ if (stat & 0x10) /* SPU mailbox threshold */ __spu_trap_spubox(spu); - out_be64(&spu->priv1->int_stat_class2_RW, stat); + spu_irq_stat_clear(spu, 2, stat); return stat ?
IRQ_HANDLED : IRQ_NONE; } @@ -297,21 +297,21 @@ spu_irq_class_0, 0, spu->irq_c0, spu); if (ret) goto out; - out_be64(&spu->priv1->int_mask_class0_RW, 0x7); + spu_irq_mask_set(spu, 0, 0x7); snprintf(spu->irq_c1, sizeof (spu->irq_c1), "spe%02d.1", spu->number); ret = request_irq(irq_base + IIC_CLASS_STRIDE + spu->isrc, spu_irq_class_1, 0, spu->irq_c1, spu); if (ret) goto out1; - out_be64(&spu->priv1->int_mask_class1_RW, 0x3); + spu_irq_mask_set(spu, 1, 0x3); snprintf(spu->irq_c2, sizeof (spu->irq_c2), "spe%02d.2", spu->number); ret = request_irq(irq_base + 2*IIC_CLASS_STRIDE + spu->isrc, spu_irq_class_2, 0, spu->irq_c2, spu); if (ret) goto out2; - out_be64(&spu->priv1->int_mask_class2_RW, 0xe); + spu_irq_mask_set(spu, 2, 0xe); goto out; out2: @@ -373,9 +373,9 @@ static void spu_init_regs(struct spu *spu) { - out_be64(&spu->priv1->int_mask_class0_RW, 0x7); - out_be64(&spu->priv1->int_mask_class1_RW, 0x3); - out_be64(&spu->priv1->int_mask_class2_RW, 0xe); + spu_irq_mask_set(spu, 0, 0x7); + spu_irq_mask_set(spu, 1, 0x3); + spu_irq_mask_set(spu, 2, 0xe); } struct spu *spu_alloc(void) @@ -523,13 +523,11 @@ int spu_run(struct spu *spu) { struct spu_problem __iomem *prob; - struct spu_priv1 __iomem *priv1; struct spu_priv2 __iomem *priv2; u32 status; int ret; prob = spu->problem; - priv1 = spu->priv1; priv2 = spu->priv2; /* Let SPU run. */ @@ -561,7 +559,7 @@ cpu_relax(); out_be64(&priv2->slb_invalidate_all_W, 0); - out_be64(&priv1->tlb_invalidate_entry_W, 0UL); + spu_priv1_set64(spu, tlb_invalidate_entry_W, 0UL); eieio(); /* Check for SPU breakpoint. 
*/ Index: linux-2.6.15-rc4-cell/arch/powerpc/platforms/cell/spufs/hw_ops.c =================================================================== --- linux-2.6.15-rc4-cell.orig/arch/powerpc/platforms/cell/spufs/hw_ops.c 2005-12-02 16:26:20.000000000 -0800 +++ linux-2.6.15-rc4-cell/arch/powerpc/platforms/cell/spufs/hw_ops.c 2005-12-02 16:27:40.000000000 -0800 @@ -62,7 +62,6 @@ { struct spu *spu = ctx->spu; struct spu_problem __iomem *prob = spu->problem; - struct spu_priv1 __iomem *priv1 = spu->priv1; struct spu_priv2 __iomem *priv2 = spu->priv2; int ret; @@ -73,8 +72,8 @@ ret = 4; } else { /* make sure we get woken up by the interrupt */ - out_be64(&priv1->int_mask_class2_RW, - in_be64(&priv1->int_mask_class2_RW) | 0x1); + spu_irq_mask_set(spu, 2, + spu_irq_mask_get(spu, 2) | 0x1); ret = 0; } spin_unlock_irq(&spu->register_lock); @@ -85,7 +84,6 @@ { struct spu *spu = ctx->spu; struct spu_problem __iomem *prob = spu->problem; - struct spu_priv1 __iomem *priv1 = spu->priv1; int ret; spin_lock_irq(&spu->register_lock); @@ -96,8 +94,8 @@ } else { /* make sure we get woken up by the interrupt when space becomes available */ - out_be64(&priv1->int_mask_class2_RW, - in_be64(&priv1->int_mask_class2_RW) | 0x10); + spu_irq_mask_set(spu, 2, + spu_irq_mask_get(spu, 2) | 0x10); ret = 0; } spin_unlock_irq(&spu->register_lock); Index: linux-2.6.15-rc4-cell/arch/powerpc/platforms/cell/spufs/switch.c =================================================================== --- linux-2.6.15-rc4-cell.orig/arch/powerpc/platforms/cell/spufs/switch.c 2005-12-02 16:26:20.000000000 -0800 +++ linux-2.6.15-rc4-cell/arch/powerpc/platforms/cell/spufs/switch.c 2005-12-02 16:27:40.000000000 -0800 @@ -108,8 +108,6 @@ static inline void disable_interrupts(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Save, Step 3: * Restore, Step 2: * Save INT_Mask_class0 in CSA. 
@@ -122,15 +120,15 @@ spin_lock_irq(&spu->register_lock); if (csa) { csa->priv1.int_mask_class0_RW = - in_be64(&priv1->int_mask_class0_RW); + spu_irq_mask_get(spu, 0); csa->priv1.int_mask_class1_RW = - in_be64(&priv1->int_mask_class1_RW); + spu_irq_mask_get(spu, 1); csa->priv1.int_mask_class2_RW = - in_be64(&priv1->int_mask_class2_RW); + spu_irq_mask_get(spu, 2); } - out_be64(&priv1->int_mask_class0_RW, 0UL); - out_be64(&priv1->int_mask_class1_RW, 0UL); - out_be64(&priv1->int_mask_class2_RW, 0UL); + spu_irq_mask_set(spu, 0, 0UL); + spu_irq_mask_set(spu, 1, 0UL); + spu_irq_mask_set(spu, 2, 0UL); eieio(); spin_unlock_irq(&spu->register_lock); } @@ -217,12 +215,10 @@ static inline void save_mfc_sr1(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Save, Step 10: * Save MFC_SR1 in the CSA. */ - csa->priv1.mfc_sr1_RW = in_be64(&priv1->mfc_sr1_RW); + csa->priv1.mfc_sr1_RW = spu_priv1_get64(spu, mfc_sr1_RW); } static inline void save_spu_status(struct spu_state *csa, struct spu *spu) @@ -316,15 +312,13 @@ static inline void issue_mfc_tlbie(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Save, Step 17: * Restore, Step 12. * Restore, Step 48. * Write TLB_Invalidate_Entry[IS,VPN,L,Lp]=0 register. * Then issue a PPE sync instruction. */ - out_be64(&priv1->tlb_invalidate_entry_W, 0UL); + spu_priv1_set64(spu, tlb_invalidate_entry_W, 0UL); mb(); } @@ -434,25 +428,21 @@ static inline void save_mfc_tclass_id(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Save, Step 25: * Save the MFC_TCLASS_ID register in * the CSA. */ - csa->priv1.mfc_tclass_id_RW = in_be64(&priv1->mfc_tclass_id_RW); + csa->priv1.mfc_tclass_id_RW = spu_priv1_get64(spu, mfc_tclass_id_RW); } static inline void set_mfc_tclass_id(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Save, Step 26: * Restore, Step 23. 
* Write the MFC_TCLASS_ID register with * the value 0x10000000. */ - out_be64(&priv1->mfc_tclass_id_RW, 0x10000000); + spu_priv1_set64(spu, mfc_tclass_id_RW, 0x10000000); eieio(); } @@ -482,14 +472,13 @@ static inline void save_mfc_slbs(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; struct spu_priv2 __iomem *priv2 = spu->priv2; int i; /* Save, Step 29: * If MFC_SR1[R]='1', save SLBs in CSA. */ - if (in_be64(&priv1->mfc_sr1_RW) & MFC_STATE1_RELOCATE_MASK) { + if (spu_priv1_get64(spu, mfc_sr1_RW) & MFC_STATE1_RELOCATE_MASK) { csa->priv2.slb_index_W = in_be64(&priv2->slb_index_W); for (i = 0; i < 8; i++) { out_be64(&priv2->slb_index_W, i); @@ -503,8 +492,6 @@ static inline void setup_mfc_sr1(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Save, Step 30: * Restore, Step 18: * Write MFC_SR1 with MFC_SR1[D=0,S=1] and @@ -516,9 +503,9 @@ * MFC_SR1[Pr] bit is not set. * */ - out_be64(&priv1->mfc_sr1_RW, (MFC_STATE1_MASTER_RUN_CONTROL_MASK | - MFC_STATE1_RELOCATE_MASK | - MFC_STATE1_BUS_TLBIE_MASK)); + spu_priv1_set64(spu, mfc_sr1_RW, (MFC_STATE1_MASTER_RUN_CONTROL_MASK | + MFC_STATE1_RELOCATE_MASK | + MFC_STATE1_BUS_TLBIE_MASK)); } static inline void save_spu_npc(struct spu_state *csa, struct spu *spu) @@ -595,16 +582,14 @@ static inline void save_mfc_rag(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Save, Step 38: * Save RA_GROUP_ID register and the * RA_ENABLE reigster in the CSA. 
*/ csa->priv1.resource_allocation_groupID_RW = - in_be64(&priv1->resource_allocation_groupID_RW); + spu_priv1_get64(spu, resource_allocation_groupID_RW); csa->priv1.resource_allocation_enable_RW = - in_be64(&priv1->resource_allocation_enable_RW); + spu_priv1_get64(spu, resource_allocation_enable_RW); } static inline void save_ppu_mb_stat(struct spu_state *csa, struct spu *spu) @@ -722,14 +707,13 @@ static inline void invalidate_slbs(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; struct spu_priv2 __iomem *priv2 = spu->priv2; /* Save, Step 45: * Restore, Step 19: * If MFC_SR1[R]=1, write 0 to SLB_Invalidate_All. */ - if (in_be64(&priv1->mfc_sr1_RW) & MFC_STATE1_RELOCATE_MASK) { + if (spu_priv1_get64(spu, mfc_sr1_RW) & MFC_STATE1_RELOCATE_MASK) { out_be64(&priv2->slb_invalidate_all_W, 0UL); eieio(); } @@ -798,7 +782,6 @@ static inline void enable_interrupts(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; unsigned long class1_mask = CLASS1_ENABLE_SEGMENT_FAULT_INTR | CLASS1_ENABLE_STORAGE_FAULT_INTR; @@ -811,12 +794,12 @@ * (translation) interrupts. 
 	 */
 	spin_lock_irq(&spu->register_lock);
-	out_be64(&priv1->int_stat_class0_RW, ~(0UL));
-	out_be64(&priv1->int_stat_class1_RW, ~(0UL));
-	out_be64(&priv1->int_stat_class2_RW, ~(0UL));
-	out_be64(&priv1->int_mask_class0_RW, 0UL);
-	out_be64(&priv1->int_mask_class1_RW, class1_mask);
-	out_be64(&priv1->int_mask_class2_RW, 0UL);
+	spu_irq_stat_clear(spu, 0, ~(0UL));
+	spu_irq_stat_clear(spu, 1, ~(0UL));
+	spu_irq_stat_clear(spu, 2, ~(0UL));
+	spu_irq_mask_set(spu, 0, 0UL);
+	spu_irq_mask_set(spu, 1, class1_mask);
+	spu_irq_mask_set(spu, 2, 0UL);
 	spin_unlock_irq(&spu->register_lock);
 }
@@ -954,7 +937,6 @@
 static inline void wait_tag_complete(struct spu_state *csa, struct spu *spu)
 {
-	struct spu_priv1 __iomem *priv1 = spu->priv1;
 	struct spu_problem __iomem *prob = spu->problem;
 	u32 mask = MFC_TAGID_TO_TAGMASK(0);
 	unsigned long flags;
@@ -971,14 +953,13 @@
 	POLL_WHILE_FALSE(in_be32(&prob->dma_tagstatus_R) & mask);
 	local_irq_save(flags);
-	out_be64(&priv1->int_stat_class0_RW, ~(0UL));
-	out_be64(&priv1->int_stat_class2_RW, ~(0UL));
+	spu_irq_stat_clear(spu, 0, ~(0UL));
+	spu_irq_stat_clear(spu, 2, ~(0UL));
 	local_irq_restore(flags);
 }
 static inline void wait_spu_stopped(struct spu_state *csa, struct spu *spu)
 {
-	struct spu_priv1 __iomem *priv1 = spu->priv1;
 	struct spu_problem __iomem *prob = spu->problem;
 	unsigned long flags;
@@ -991,8 +972,8 @@
 	POLL_WHILE_TRUE(in_be32(&prob->spu_status_R) & SPU_STATUS_RUNNING);
 	local_irq_save(flags);
-	out_be64(&priv1->int_stat_class0_RW, ~(0UL));
-	out_be64(&priv1->int_stat_class2_RW, ~(0UL));
+	spu_irq_stat_clear(spu, 0, ~(0UL));
+	spu_irq_stat_clear(spu, 2, ~(0UL));
 	local_irq_restore(flags);
 }
@@ -1091,7 +1072,6 @@
 static inline void clear_spu_status(struct spu_state *csa, struct spu *spu)
 {
 	struct spu_problem __iomem *prob = spu->problem;
-	struct spu_priv1 __iomem *priv1 = spu->priv1;
 	/* Restore, Step 10:
 	 * If SPU_Status[R]=0 and SPU_Status[E,L,IS]=1,
@@ -1100,8 +1080,8 @@
 	if (!(in_be32(&prob->spu_status_R) & SPU_STATUS_RUNNING)) {
 		if
(in_be32(&prob->spu_status_R) & SPU_STATUS_ISOLATED_EXIT_STAUTUS) { - out_be64(&priv1->mfc_sr1_RW, - MFC_STATE1_MASTER_RUN_CONTROL_MASK); + spu_priv1_set64(spu, mfc_sr1_RW, + MFC_STATE1_MASTER_RUN_CONTROL_MASK); eieio(); out_be32(&prob->spu_runcntl_RW, SPU_RUNCNTL_RUNNABLE); eieio(); @@ -1112,8 +1092,8 @@ SPU_STATUS_ISOLATED_LOAD_STAUTUS) || (in_be32(&prob->spu_status_R) & SPU_STATUS_ISOLATED_STATE)) { - out_be64(&priv1->mfc_sr1_RW, - MFC_STATE1_MASTER_RUN_CONTROL_MASK); + spu_priv1_set64(spu, mfc_sr1_RW, + MFC_STATE1_MASTER_RUN_CONTROL_MASK); eieio(); out_be32(&prob->spu_runcntl_RW, 0x2); eieio(); @@ -1281,16 +1261,14 @@ static inline void restore_mfc_rag(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Restore, Step 29: * Restore RA_GROUP_ID register and the * RA_ENABLE reigster from the CSA. */ - out_be64(&priv1->resource_allocation_groupID_RW, - csa->priv1.resource_allocation_groupID_RW); - out_be64(&priv1->resource_allocation_enable_RW, - csa->priv1.resource_allocation_enable_RW); + spu_priv1_set64(spu, resource_allocation_groupID_RW, + csa->priv1.resource_allocation_groupID_RW); + spu_priv1_set64(spu, resource_allocation_enable_RW, + csa->priv1.resource_allocation_enable_RW); } static inline void send_restore_code(struct spu_state *csa, struct spu *spu) @@ -1433,8 +1411,6 @@ static inline void clear_interrupts(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Restore, Step 49: * Write INT_MASK_class0 with value of 0. * Write INT_MASK_class1 with value of 0. @@ -1444,12 +1420,12 @@ * Write INT_STAT_class2 with value of -1. 
*/ spin_lock_irq(&spu->register_lock); - out_be64(&priv1->int_mask_class0_RW, 0UL); - out_be64(&priv1->int_mask_class1_RW, 0UL); - out_be64(&priv1->int_mask_class2_RW, 0UL); - out_be64(&priv1->int_stat_class0_RW, ~(0UL)); - out_be64(&priv1->int_stat_class1_RW, ~(0UL)); - out_be64(&priv1->int_stat_class2_RW, ~(0UL)); + spu_irq_mask_set(spu, 0, 0UL); + spu_irq_mask_set(spu, 1, 0UL); + spu_irq_mask_set(spu, 2, 0UL); + spu_irq_stat_clear(spu, 0, ~(0UL)); + spu_irq_stat_clear(spu, 1, ~(0UL)); + spu_irq_stat_clear(spu, 2, ~(0UL)); spin_unlock_irq(&spu->register_lock); } @@ -1546,12 +1522,10 @@ static inline void restore_mfc_tclass_id(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Restore, Step 56: * Restore the MFC_TCLASS_ID register from CSA. */ - out_be64(&priv1->mfc_tclass_id_RW, csa->priv1.mfc_tclass_id_RW); + spu_priv1_set64(spu, mfc_tclass_id_RW, csa->priv1.mfc_tclass_id_RW); eieio(); } @@ -1713,7 +1687,6 @@ static inline void check_ppuint_mb_stat(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; struct spu_priv2 __iomem *priv2 = spu->priv2; u64 dummy = 0UL; @@ -1724,8 +1697,7 @@ if ((csa->prob.mb_stat_R & 0xFF0000) == 0) { dummy = in_be64(&priv2->puint_mb_R); eieio(); - out_be64(&priv1->int_stat_class2_RW, - CLASS2_ENABLE_MAILBOX_INTR); + spu_irq_stat_clear(spu, 2, CLASS2_ENABLE_MAILBOX_INTR); eieio(); } } @@ -1753,12 +1725,10 @@ static inline void restore_mfc_sr1(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Restore, Step 69: * Restore the MFC_SR1 register from CSA. */ - out_be64(&priv1->mfc_sr1_RW, csa->priv1.mfc_sr1_RW); + spu_priv1_set64(spu, mfc_sr1_RW, csa->priv1.mfc_sr1_RW); eieio(); } @@ -1816,15 +1786,13 @@ static inline void reenable_interrupts(struct spu_state *csa, struct spu *spu) { - struct spu_priv1 __iomem *priv1 = spu->priv1; - /* Restore, Step 75: * Re-enable SPU interrupts. 
 	 */
 	spin_lock_irq(&spu->register_lock);
-	out_be64(&priv1->int_mask_class0_RW, csa->priv1.int_mask_class0_RW);
-	out_be64(&priv1->int_mask_class1_RW, csa->priv1.int_mask_class1_RW);
-	out_be64(&priv1->int_mask_class2_RW, csa->priv1.int_mask_class2_RW);
+	spu_irq_mask_set(spu, 0, csa->priv1.int_mask_class0_RW);
+	spu_irq_mask_set(spu, 1, csa->priv1.int_mask_class1_RW);
+	spu_irq_mask_set(spu, 2, csa->priv1.int_mask_class2_RW);
 	spin_unlock_irq(&spu->register_lock);
 }
Index: linux-2.6.15-rc4-cell/include/asm-powerpc/spu.h
===================================================================
--- linux-2.6.15-rc4-cell.orig/include/asm-powerpc/spu.h	2005-12-02 16:26:20.000000000 -0800
+++ linux-2.6.15-rc4-cell/include/asm-powerpc/spu.h	2005-12-02 16:27:40.000000000 -0800
@@ -576,4 +576,64 @@
 	u64 spu_trace_cntl; /* 0x1070 */
 } __attribute__ ((aligned(0x2000)));
 
+
+/* priv1 access */
+
+#ifdef CONFIG_ON_HYPERVISOR_XXXXX
+	/* examples for a fictitious hypervisor */
+
+#include
+
+inline u64 spu_irq_mask_get(struct spu *spu, int cls)
+{
+	u64 __val;
+	hvcall_spu_get_irq_mask(spu->spu_magical_id,
+				cls,
+				&__val);
+	return __val;
+}
+
+#define spu_irq_mask_set(spu, cls, mask) \
+	hvcall_spu_set_irq_mask(spu->spu_magical_id, \
+				cls, \
+				mask);
+
+#define spu_irq_stat_get(spu, cls) \
+	hvcall_spu_get_interrupt_status(spu->spu_magical_id, \
+					cls);
+#define spu_irq_stat_clear(spu, cls, val) \
+	hvcall_spu_clear_interrupt_status(spu->spu_magical_id, \
+					  cls, val);
+
+inline u64 spu_priv1_get64(struct spu *spu, int cls)
+{
+	u64 __val;
+	hvcall_spu_get_priv1(spu->spu_magical_id,
+			     offsetof(struct spu_priv1, reg),
+			     &__val);
+	return __val;
+}
+
+#define spu_priv1_set64(spu, cls, val) \
+	hvcall_spu_set_priv1(spu->spu_magical_id, \
+			     offsetof(struct spu_priv1, reg), \
+			     val);
+
+#else /* CONFIG_ON_HYPERVISOR_XXXXX */
+
+#define spu_irq_mask_get(spu, cls) \
+	spu_priv1_get64(spu, int_mask_class ## cls ## _RW)
+#define spu_irq_mask_set(spu, cls, mask) \
+	spu_priv1_set64(spu, int_mask_class ## cls ## _RW, mask)
+
+#define spu_irq_stat_get(spu, cls) \
+	spu_priv1_get64(spu, int_stat_class ## cls ## _RW)
+#define spu_irq_stat_clear(spu, cls, stat) \
+	spu_priv1_set64(spu, int_stat_class ## cls ## _RW, stat)
+
+#define spu_priv1_get64(spu, reg) in_be64(&(spu)->priv1->reg)
+#define spu_priv1_set64(spu, reg, val) out_be64(&(spu)->priv1->reg, val)
+
+#endif /* CONFIG_ON_HYPERVISOR_XXXXX */
+
 #endif

From haren at us.ibm.com Wed Dec 7 12:53:37 2005
From: haren at us.ibm.com (Haren Myneni)
Date: Tue, 06 Dec 2005 17:53:37 -0800
Subject: [PATCH] Trivial fix in __alloc_bootmem_core() when there is no free page in first node's memory
Message-ID: <439640A1.3030300@us.ibm.com>

Hi,

We are hitting BUG_ON() in __alloc_bootmem_core() when there is no free page available in the first node's memory. For the case of kdump on PPC64 (Power 4 machine), the captured kernel uses two memory regions - memory for TCE tables (tce-base and tce-size at top of RAM and reserved) and the captured kernel memory region (crashk_base and crashk_size). Since we reserve the memory for the first node, we should be returning from __alloc_bootmem_core() to search for the next node (pg_dat).

Currently, find_next_zero_bit() returns the n^th bit (eidx) when there is no free page. Then test_bit() fails, since we initially set 0xff only for the actual size (init_bootmem_core()) even though bdata->node_bootmem_map is rounded up to one page. We hit the BUG_ON after failing to enter the second "for" loop.

Please apply.

Thanks
Haren

Signed-off-by: Haren Myneni

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bootmem_bug_on_fix.patch Type: text/x-patch Size: 413 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051206/8fadadaf/attachment.bin From sfr at canb.auug.org.au Wed Dec 7 13:01:05 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 7 Dec 2005 13:01:05 +1100 Subject: [PATCH] powerpc: fix for "Update OF address parsers" Message-ID: <20051207130105.44a488c0.sfr@canb.auug.org.au> This patch allows iSeries to build again. It just moves pci_address_to_pio outside the #ifdef CONFIG_PPC_MULTIPLATFORM. Signed-off-by: Stephen Rothwell --- arch/powerpc/kernel/pci_64.c | 28 ++++++++++++++-------------- 1 files changed, 14 insertions(+), 14 deletions(-) Built on iSeries and pSeries and booted on iSeries. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ c3485e24b9b4fbd530f28022e6b3f58b206eec74 diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c index 0988222..4eb93fc 100644 --- a/arch/powerpc/kernel/pci_64.c +++ b/arch/powerpc/kernel/pci_64.c @@ -1181,20 +1181,6 @@ void phbs_remap_io(void) remap_bus_range(hose->bus); } -unsigned int pci_address_to_pio(phys_addr_t address) -{ - struct pci_controller *hose, *tmp; - - list_for_each_entry_safe(hose, tmp, &hose_list, list_node) { - if (address >= hose->io_base_phys && - address < (hose->io_base_phys + hose->pci_io_size)) - return (unsigned int)hose->io_base_virt + - (address - hose->io_base_phys); - } - return (unsigned int)-1; -} -EXPORT_SYMBOL_GPL(pci_address_to_pio); - static void __devinit fixup_resource(struct resource *res, struct pci_dev *dev) { struct pci_controller *hose = pci_bus_to_host(dev->bus); @@ -1337,6 +1323,20 @@ struct pci_controller* pci_find_hose_for #endif /* CONFIG_PPC_MULTIPLATFORM */ +unsigned int pci_address_to_pio(phys_addr_t address) +{ + struct pci_controller *hose, *tmp; + + list_for_each_entry_safe(hose, tmp, &hose_list, list_node) { + if (address >= hose->io_base_phys && + address < 
(hose->io_base_phys + hose->pci_io_size)) + return (unsigned int)hose->io_base_virt + + (address - hose->io_base_phys); + } + return (unsigned int)-1; +} +EXPORT_SYMBOL_GPL(pci_address_to_pio); + #define IOBASE_BRIDGE_NUMBER 0 #define IOBASE_MEMORY 1 -- 0.99.9l From viro at ftp.linux.org.uk Wed Dec 7 13:26:10 2005 From: viro at ftp.linux.org.uk (Al Viro) Date: Wed, 7 Dec 2005 02:26:10 +0000 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <17302.3696.364669.18755@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> <17301.65082.251692.675360@cargo.ozlabs.ibm.com> <1133905298.8027.13.camel@localhost> <17302.3696.364669.18755@cargo.ozlabs.ibm.com> Message-ID: <20051207022610.GI27946@ftp.linux.org.uk> On Wed, Dec 07, 2005 at 09:19:28AM +1100, Paul Mackerras wrote: > Think about someone changing the VFS layer interface and fixing up all > the filesystems to accommodate that change. That person is doing some > of your work for you, so you want to make it easy for him/her to find > your filesystem. That's the sort of thing I was referring to as > maintenance. FWIW, I think it's not a serious argument. Interface changes => grep time. And that means grep over the tree anyway. > As for changes on the cell-specific side, the people doing those > changes will know where to find it, so it isn't a problem having it in > fs/. > > Having it in fs/ also means that it is more likely that people > familiar with VFS internals will look through your code and comment on > it. I know that can be painful in the short term, but in the long > term it will lead to better code. That's solved by asking for review... 
As far as I'm concerned, the only thing here that looks like a possible reason to move the entire thing is highly unusual semantics of final close and interesting use of VFS interfaces in spu_create(). I.e. it's not that we have a filesystem there. OTOH, if you go looking for analogs as far as unusual interaction with VFS is concerned... net/unix is unlikely to get moved. From paulus at samba.org Wed Dec 7 13:57:14 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 7 Dec 2005 13:57:14 +1100 Subject: [PATCH 7/11] powerpc: Fixups for kernel linked at 32 MB In-Reply-To: <20051205003954.6E56168802@ozlabs.org> References: <1133743149.268607.418162138937.qpush@concordia> <20051205003954.6E56168802@ozlabs.org> Message-ID: <17302.20362.490309.877127@cargo.ozlabs.ibm.com> Michael Ellerman writes: > There's a few places where we need to fix things up for the kernel to work > if it's linked at 32MB: > > - platforms/powermac/smp.c > To start secondary cpus on pmac we patch the reset vector, which is fine. > Except if we're above 32MB we don't have enough bits for an absolute branch, > it needs to relative. A relative branch at 0x100 is only going to get 0x100 bytes further than an absolute branch, and I don't think that's far enough. Did you consider putting the kdump kernel at 24MB rather than 32MB? That would solve this and other branch issues. Paul. 
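The branch-reach argument in the message above can be checked numerically. The sketch below assumes the PowerPC I-form branch encoding (a 24-bit LI field shifted left by two bits, giving a signed 26-bit byte displacement, roughly +/-32 MB). The helper names rel_branch_reaches and abs_branch_reaches are hypothetical and do not come from the kernel, and addresses are treated as plain 64-bit numbers rather than real or virtual kernel addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed I-form reach: signed 26-bit byte displacement, i.e. 32 MB each way. */
#define BRANCH_RANGE (INT64_C(1) << 25)

/*
 * Relative branch (AA=0): the target is the branch site plus the
 * displacement, so reach is measured from the site.
 */
bool rel_branch_reaches(uint64_t site, uint64_t target)
{
	int64_t disp = (int64_t)(target - site);

	/* Instructions are word aligned, so the low two bits must be zero. */
	if ((target - site) & 3)
		return false;
	return disp >= -BRANCH_RANGE && disp < BRANCH_RANGE;
}

/*
 * Absolute branch (AA=1): the sign-extended displacement is the target
 * itself, so only the lowest (and highest) 32 MB are reachable.
 */
bool abs_branch_reaches(uint64_t target)
{
	int64_t addr = (int64_t)target;

	if (target & 3)
		return false;
	return addr >= -BRANCH_RANGE && addr < BRANCH_RANGE;
}
```

With the patch site at 0x100, rel_branch_reaches() accepts targets below 0x2000100 while abs_branch_reaches() accepts targets below 0x2000000, which is the 0x100-byte difference described above. A target at 24 MB (0x1800000) is within absolute reach, whereas anything at or beyond 32 MB + 0x100 is out of reach for both forms from that site.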
From paulus at samba.org Wed Dec 7 14:15:09 2005 From: paulus at samba.org (Paul Mackerras) Date: Wed, 7 Dec 2005 14:15:09 +1100 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <20051207022610.GI27946@ftp.linux.org.uk> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> <17301.65082.251692.675360@cargo.ozlabs.ibm.com> <1133905298.8027.13.camel@localhost> <17302.3696.364669.18755@cargo.ozlabs.ibm.com> <20051207022610.GI27946@ftp.linux.org.uk> Message-ID: <17302.21437.608048.64857@cargo.ozlabs.ibm.com> Al Viro writes: > FWIW, I think it's not a serious argument. Interface changes => grep time. > And that means grep over the tree anyway. OK, well, where would you prefer the spufs code to go? > That's solved by asking for review... Could you review the spufs code (i.e. the patches posted by Arnd recently to linuxppc64-dev at ozlabs.org) please? Thanks, Paul. From michael at ellerman.id.au Wed Dec 7 15:38:00 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Tue, 6 Dec 2005 22:38:00 -0600 Subject: [PATCH 7/11] powerpc: Fixups for kernel linked at 32 MB In-Reply-To: <17302.20362.490309.877127@cargo.ozlabs.ibm.com> References: <1133743149.268607.418162138937.qpush@concordia> <20051205003954.6E56168802@ozlabs.org> <17302.20362.490309.877127@cargo.ozlabs.ibm.com> Message-ID: <200512062238.03741.michael@ellerman.id.au> On Tue, 6 Dec 2005 20:57, Paul Mackerras wrote: > Michael Ellerman writes: > > There's a few places where we need to fix things up for the kernel to > > work if it's linked at 32MB: > > > > - platforms/powermac/smp.c > > To start secondary cpus on pmac we patch the reset vector, which is > > fine. Except if we're above 32MB we don't have enough bits for an > > absolute branch, it needs to relative. 
> > A relative branch at 0x100 is only going to get 0x100 bytes further > than an absolute branch, and I don't think that's far enough. Except we're patching at KERNELBASE + 0x100, so as long as __secondary_start_pmac_0 - KERNELBASE < 32 MB we should be fine. > Did you consider putting the kdump kernel at 24MB rather than 32MB? > That would solve this and other branch issues. No I didn't, but a few people have mentioned it since. I think it's something we could look at - but I'd rather not change it now. The only other "branch issue" I know of is having to use a no-op in the trampoline, is there anything else? cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051206/30952dec/attachment.pgp From penberg at cs.helsinki.fi Wed Dec 7 19:21:50 2005 From: penberg at cs.helsinki.fi (Pekka Enberg) Date: Wed, 7 Dec 2005 10:21:50 +0200 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <17302.21437.608048.64857@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> <17301.65082.251692.675360@cargo.ozlabs.ibm.com> <1133905298.8027.13.camel@localhost> <17302.3696.364669.18755@cargo.ozlabs.ibm.com> <20051207022610.GI27946@ftp.linux.org.uk> <17302.21437.608048.64857@cargo.ozlabs.ibm.com> Message-ID: <84144f020512070021r38188044x54c0b2491ef4a176@mail.gmail.com> Hi Paul, On 12/7/05, Paul Mackerras wrote: > Could you review the spufs code (i.e. 
the patches posted by Arnd > recently to linuxppc64-dev at ozlabs.org) please? Why not post them to LKML? Pekka From viro at ftp.linux.org.uk Wed Dec 7 21:17:08 2005 From: viro at ftp.linux.org.uk (Al Viro) Date: Wed, 7 Dec 2005 10:17:08 +0000 Subject: [PATCH 02/14] spufs: fix local store page refcounting In-Reply-To: <17302.21437.608048.64857@cargo.ozlabs.ibm.com> References: <20051206035220.097737000@localhost> <200512061118.19633.arnd@arndb.de> <1133869108.7968.1.camel@localhost> <200512061949.33482.arnd@arndb.de> <1133895947.3279.4.camel@localhost> <17301.65082.251692.675360@cargo.ozlabs.ibm.com> <1133905298.8027.13.camel@localhost> <17302.3696.364669.18755@cargo.ozlabs.ibm.com> <20051207022610.GI27946@ftp.linux.org.uk> <17302.21437.608048.64857@cargo.ozlabs.ibm.com> Message-ID: <20051207101708.GJ27946@ftp.linux.org.uk> On Wed, Dec 07, 2005 at 02:15:09PM +1100, Paul Mackerras wrote: > Al Viro writes: > > > FWIW, I think it's not a serious argument. Interface changes => grep time. > > And that means grep over the tree anyway. > > OK, well, where would you prefer the spufs code to go? Up to ppc folks, really - I don't see any serious objections to arch/powerpc/ variants; it could go there, it could go to fs/*. Objections along the lines of "it won't be found" are BS - any interface change is going to start with grep over the entire tree anyway. > > That's solved by asking for review... > > Could you review the spufs code (i.e. the patches posted by Arnd > recently to linuxppc64-dev at ozlabs.org) please? If it's what you have in powerpc.git - see comments on IRC yesterday... 
From Jens.Osterkamp at de.ibm.com Wed Dec 7 20:53:12 2005 From: Jens.Osterkamp at de.ibm.com (Jens Osterkamp) Date: Wed, 7 Dec 2005 10:53:12 +0100 Subject: [PATCH 12/14] spidernet: check if firmware was loaded correctly In-Reply-To: <200512061123.40059.arnd@arndb.de> Message-ID: Arnd Bergmann wrote on 12/06/2005 11:23:39 AM: > On Dinsdag 06 Dezember 2005 01:59, Paul Mackerras wrote: > > Arnd Bergmann writes: > > > > > Uploading the device firmware may fail if wrong input data > > > was provided by the user. This checks for the condition. > > > > > > From: Jens.Osterkamp at de.ibm.com > > > Cc: netdev at vger.kernel.org > > > > This one should be sent to Jeff Garzik, along with patches 11, 13 and > > 14. > > Ok. > > Jens, is it ok for you if you send the network driver stuff to > jgarzik at pobox.com, Cc: netdev at vger.kernel.org yourself in the future? Sure, I will do so for our next updates. Jens From rsa at us.ibm.com Thu Dec 8 02:54:59 2005 From: rsa at us.ibm.com (Ryan Arnold) Date: Wed, 07 Dec 2005 09:54:59 -0600 Subject: [RFC PATCH 3/5] CELL bogus_console port to hvc_console backend driver In-Reply-To: <2b19bee9bd90cfee311d8076b026add4@bga.com> References: <43935BB5.9030302@us.ibm.com> <2b19bee9bd90cfee311d8076b026add4@bga.com> Message-ID: <1133970899.10632.11.camel@localhost.localdomain> On Mon, 2005-12-05 at 11:27 -0600, Milton Miller wrote: > > +config HVC_DRIVER > > + bool "PowerPC virtual console front-end support" > > + depends on PPC_PSERIES || PPC_BPA || PPC_RTAS > > + help > > + Users of pSeries machines that want to utilize the hvc console > > front-end > > + module for their backend console driver should select this option. > > + It will automatically be selected if one of the back-end console > > drivers > > + is selected. > > + > > Lets just keep this hidden -- so take out depends (its all generic code) > and just say bool (without any quoted text). The help text could then > be made more generic. 
Good idea, I wasn't aware that a Kconfig option can remain hidden. -- Ryan Arnold IBM Linux Technology Center From galak at kernel.crashing.org Thu Dec 8 03:54:28 2005 From: galak at kernel.crashing.org (Kumar Gala) Date: Wed, 7 Dec 2005 10:54:28 -0600 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <20051207001720.GB25533@localhost.localdomain> References: <1133816807.8577.50.camel@cashmere.sps.mot.com> <200512062048.56131.arnd@arndb.de> <20051207001720.GB25533@localhost.localdomain> Message-ID: On Dec 6, 2005, at 6:17 PM, David Gibson wrote: > On Tue, Dec 06, 2005 at 08:48:55PM +0100, Arnd Bergmann wrote: >> On Maandag 05 Dezember 2005 22:06, Jon Loeliger wrote: >>> Included below is a proposed Revision 0.5 of the >>> "Booting the Linux/ppc kernel without Open Firmware" >>> document. This modification primarily extends the >>> Revision 0.4 by adding definitions for OF Nodes that >>> cover the System-On-a-Chip features found on PPC parts. >>> It also generalizes some earlier wording that pertained >>> to only PPC64 parts and covers the new, merged PPC 32 >>> and 64 parts together. Finally, minor typos, style >>> consistency and grammar problems were corrected. >> >> A few points are not clear yet, either because I don't understand the >> document or one it references correctly or because I might have >> different requirements: > > All comments below IMHO, and subject to persuasion otherwise. > >> - Do we need a way to identify the type of soc bus? There are >> different >> standards for this, e.g. PLB4 on PPC440 or the EIB on the Cell BE. >> My initial idea was to have different device-type properties for >> these, >> but I now think that device_type = "soc" makes sense for all of >> them. >> Maybe we could add a model or compatible property for them. > > It think it would be a good idea to have something labelling the > specific type of SOC bus, though I'm not immediately sure where. 
> "model" perhaps, if it rarely has an effect on how to operate the bus. I think this should be optional since it rarely has an effect on usage. >> - It does not really belong into this document, but is related >> anyway: >> how do you want to represent this in Linux? Currently, most of >> these >> would be of_platform_device, but I think it would be good to have >> a new bus_type for it. The advantage would be that you can see the >> devices in /sys/devices/soc at xxx/ even if the driver is not loaded >> and the driver can even be autoloaded by udev. >> Also, which properties should show up in sysfs? All of them or just >> those specified in this document or a subset of them? > > I concur - I believe we already have a bus_type for on-chip devices on > 4xx. Not, sure what the 4xx reference is but, we have be using the platform bus in the kernel for "soc" connected devices. I dont see the need to invent a new bus type unless there is a specific reason to. >> - What do we do with pci root devices? They are often physically >> connected >> to the internal CPU bus, so it would make sense to represent them >> this way in the device tree. Should we add them to the >> specification >> here? Would it even work the expected way in Linux? > > The host bridges should sit on the soc bus then, as you suggest (just > as the PCI busses hang off HyperTransport on the G5). I think you > need to refer to the OF docs for how to represent the PCI host bridge > and devices themselves. We need to provide some details on PCI nodes based on the OF docs. Ben and I have talked a little about this. Its mainly about what parts of the OF spec are truly required. We will probably add some additional information that the OF spec doesnt handle for host bridges setup. >> - For some devices, you mandate a model property, for others you >> don't. >> Is this intentional? It might be easier to find the right device >> driver if the match string always contains a model name. 
> > You rarely want to match model name to find a device - generally you > want to match either on "compatible" or "device_type", or possibly > both. > >> - How would I represent nested interrupt controllers? E.g. suppose I >> have a Cell internal interrupt controller on one SOC bus and >> and an external interrupt controller on another SOC bus but have >> that deliver interrupts to the first one. > > Again, I believe this is in the OF docs - interrupt controllers have > an interrupt-parent property IIRC, which gives the phandle of the next > interrupt controller up the chain. Yep, you need to check out the "Interrupt Mapping" OF spec for details. It handles describing the chaining you speak of. However, you will need to provide some "spec" for any properties of the interrupt controllers that you may need. >> - Should it mention nested SOC buses, e.g. a PLB4 bus connected to a >> PLB5 bus? > > Yes. Is there anything special about this? are these PLB4/5 busses software visible? > >> - The title says 'without Open Firmware', but it should also be >> allowed >> to use the same SOC bus layout when using SLOF or some other OF >> implementation, right? > > I guess so. > >> - Also not new in this version, but still: Should there be support >> for >> specifying CPUs with multiple SMT threads? > > Umm.. maybe. - kumar From miltonm at bga.com Thu Dec 8 04:23:39 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 7 Dec 2005 11:23:39 -0600 Subject: [PATCH 03/14] spufs: Fix oops when spufs module is not loaded Message-ID: > - if (try_module_get(spufs_calls.owner)) { > + if (owner && try_module_get(spufs_calls.owner)) { > try_module_get(owner) to avoid the race (twice) milton From miltonm at bga.com Thu Dec 8 04:23:43 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 7 Dec 2005 11:23:43 -0600 Subject: [PATCH 05/14] spufs: Improved SPU preemptability. 
Message-ID: <72ba6abba87023a896c2313797ede940@bga.com> > > This patch makes it easier to preempt an SPU context by > having the scheduler hold ctx->state_sema for much shorter > periods of time. > > As part of this restructuring, the control logic for the "run" > operation is moved from arch/ppc64/kernel/spu_base.c to > fs/spufs/file.c. Of course the base retains "bottom half" file.c moved > handlers for class{0,1} irqs. The new run loop will re-acquire > an SPU if preempted. > > From: Mark Nutter > Signed-off-by: Arnd Bergmann From miltonm at bga.com Thu Dec 8 04:24:21 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 7 Dec 2005 11:24:21 -0600 Subject: [PATCH 12/14] spidernet: check if firmware was loaded correctly Message-ID: <088b3b2b5b1187c6a716102b0201469f@bga.com> On Tue Dec 6 14:52:32 EST 2005, Arnd Bergmann wrote: > Uploading the device firmware may fail if wrong input data > was provided by the user. This checks for the condition. > > From: Jens.Osterkamp at de.ibm.com > Cc: netdev at vger.kernel.org > Signed-off-by: Arnd Bergmann > > Index: linux-2.6.15-rc/drivers/net/spider_net.c > =================================================================== > --- linux-2.6.15-rc.orig/drivers/net/spider_net.c > +++ linux-2.6.15-rc/drivers/net/spider_net.c > @@ -1836,7 +1836,7 @@ spider_net_setup_phy(struct spider_net_c > * spider_net_download_firmware loads the firmware opened by > * spider_net_init_firmware into the adapter. 
> */ > -static void > +static int > spider_net_download_firmware(struct spider_net_card *card, > const struct firmware *firmware) > { > @@ -1857,8 +1857,13 @@ spider_net_download_firmware(struct spid > } > } > > + if (spider_net_read_reg(card, SPIDER_NET_GSINIT)) > + return -EIO; > + > spider_net_write_reg(card, SPIDER_NET_GSINIT, > SPIDER_NET_RUN_SEQ_VALUE); > + > + return 0; > } > > /** > @@ -1909,9 +1914,8 @@ spider_net_init_firmware(struct spider_n > goto out; > } > > - spider_net_download_firmware(card, firmware); > - > - err = 0; > + if (!spider_net_download_firmware(card, firmware)) > + err = 0; Why not assign err to the return of spider_net_download_firmware? > out: > release_firmware(firmware); > > Index: linux-2.6.15-rc/drivers/net/spider_net.h > =================================================================== > --- linux-2.6.15-rc.orig/drivers/net/spider_net.h > +++ linux-2.6.15-rc/drivers/net/spider_net.h > @@ -155,7 +155,7 @@ extern char spider_net_driver_name[]; > /* set this first, then the FRAMENUM_VALUE */ > #define SPIDER_NET_GFXFRAMES_VALUE 0x00000000 > > -#define SPIDER_NET_STOP_SEQ_VALUE 0x00000000 > +#define SPIDER_NET_STOP_SEQ_VALUE 0x007e0000 > #define SPIDER_NET_RUN_SEQ_VALUE 0x0000007e > > #define SPIDER_NET_PHY_CTRL_VALUE 0x00040040 > milton From miltonm at bga.com Thu Dec 8 04:24:45 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 7 Dec 2005 11:24:45 -0600 Subject: [PATCH 10/14] cell: add iommu support for larger memory Message-ID: <0ee9d47e9c94a42075ef44e387650089@bga.com> On Tue Dec 6 14:52:30 EST 2005, Arnd Bergmann wrote: > Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/iommu.c > =================================================================== > --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/iommu.c > +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/iommu.c ... 
> @@ -40,6 +42,7 @@ > #include > #include > #include > +#include > > #include "iommu.h" > > @@ -221,7 +224,7 @@ set_iopt_cache(void __iomem *base, unsig > unsigned long __iomem *tags = base + IOC_PT_CACHE_DIR; > unsigned long __iomem *p = base + IOC_PT_CACHE_REG; > pr_debug("iopt %02lx was v%016lx/t%016lx, store > v%016lx/t%016lx\n", > - index, get_iopt_cache(base, index, &oldtag), oldtag, > val, tag); > + index, get_iopt_cache(base, index, &tag), tag, val, > tag); Assuming get_iopt_cache takes &tag to fill it in, this code is wrong. The order of function argument evaluation is undefined in C, and the compiler can choose to change its order at any time. > - for (address = 0; address < 0x100000000ul; address += > io_page_size) { > - ioste = get_iost_entry(0x10000000000ul, address, > io_page_size); > - if ((address & 0xfffffff) == 0) /* segment start */ > - set_iost_cache(base, address >> 28, ioste); > - index = get_ioc_hash_1way(ioste, address); > + for (real_address = 0, io_address = 0; > + io_address <= map_start + map_size; > + real_address += io_page_size, io_address += io_page_size) > { > + ioste = get_iost_entry(fake_iopt, io_address, > io_page_size); > + if ((real_address & 0xfffffff) == 0) /* segment start > */ > + set_iost_cache(ioc_mmio_base, > + io_address >> 28, ioste); > + index = get_ioc_hash_1way(ioste, io_address); [comment] more magic numbers remain... milton From miltonm at bga.com Thu Dec 8 04:28:31 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 7 Dec 2005 11:28:31 -0600 Subject: [PATCH 08/14] cell: enable pause(0) in cpu_idle Message-ID: <592f37d568f304c5bc5fbad0285c8cb8@bga.com> Hi Arnd. Quite a few comments on this one. On Tue Dec 6 14:52:28 EST 2005, Arnd Bergmann wrote: > This patch enables support for pause(0) power management state > for the Cell Broadband Processor, which is import for power efficient > operation. 
The pervasive infrastructure will in the future enable > us to introduce more functionality specific to the Cell's > pervasive unit. > > From: Maximino Aguilar > Signed-off-by: Arnd Bergmann > > Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile > =================================================================== > --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/Makefile > +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile > @@ -1,4 +1,6 @@ > obj-y += interrupt.o iommu.o setup.o spider-pic.o > +obj-y += pervasive.o > + > obj-$(CONFIG_SMP) += smp.o > obj-$(CONFIG_SPU_FS) += spufs/ spu_base.o > builtin-spufs-$(CONFIG_SPU_FS) += spu_syscalls.o > Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c > =================================================================== > --- /dev/null > +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c > @@ -0,0 +1,147 @@ > +/* > + * CBE Pervasive Monitor and Debug > + * > + * (C) Copyright IBM Corporation 2005 > + * > + * Authors: Maximino Aguilar (maguilar at us.ibm.com) > + * Michael N. Day (mnday at us.ibm.com) > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2, or (at your option) > + * any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
> + */ > + > +#include > +#include > +#include > +#include > +#include > + > +#include > +#include > +#include > +#include > + > +#include "pervasive.h" > + > +struct pmd { > + struct pmd_regs __iomem *regs; > + int power_management_enable; > +}; This name conflicts with the memory management system used throughout the kernel. Please rename. > + > +static DEFINE_PER_CPU(struct pmd, pmd); > + > +void pause_zero(void) > +{ > + unsigned int multi_threading_control; > + unsigned long long machine_state; > + > + /* Reset Thread Run Latch (latch is set in idle.c) */ > + ppc64_runlatch_off(); > + > + if (__get_cpu_var(pmd).power_management_enable) How do you know __get_cpu_var is safe here? because this is only called in the idle loop which is bound? > + { > + /* Disable EE during check for pause */ > + machine_state=mfmsr(); > + machine_state &= ~MSR_EE; > + mtmsrd(machine_state); local_irq_disable() ? > + /* Pause the PU */ > + HMT_low(); > + multi_threading_control = 0; > + mtspr(SPRN_CTRLT,multi_threading_control); > + > + /* Re-enable EE after resuming */ > + machine_state=mfmsr(); > + machine_state |= MSR_EE; > + mtmsrd(machine_state); local_irq_enable() ? > + } > +} > + > +void enable_pause_zero(void * data) > +{ > + unsigned long thread_switch_control; > + unsigned long temp_register; > + struct pmd *pmd; > + > + pmd = &get_cpu_var(pmd); > + > + if (!pmd->regs) > + return; > + > + pr_debug("Power Management: CPU %d\n", smp_processor_id()); > + > + /* Enable Pause(0) control bit */ > + temp_register = in_be64(&pmd->regs->pm_control); > + > + out_be64(&pmd->regs->pm_control, > temp_register|PMD_PAUSE_ZERO_CONTROL); > + > + /* Enable DEC and EE interrupt request */ > + thread_switch_control = mfspr(SPRN_TSC_CELL); > + thread_switch_control |= TSCR_EE_ENABLE | TSCR_EE_BOOST; > + > + if (smp_processor_id()%2) smp_processor_id is software number, and does not necessarily correspond to the hardware thread id. 
Either use the hw version, or better yet, read the PIR (spr 1023?) directly. > + thread_switch_control |= TSC_DEC_ENABLE_1; > + else > + thread_switch_control |= TSC_DEC_ENABLE_0; > + > + mtspr(SPRN_TSC_CELL, thread_switch_control); > + > + pmd->power_management_enable = 1; > + put_cpu_var(pmd); > +} > + > +static struct pmd_regs __iomem *find_pmd_mmio(int cpu) > +{ > + struct device_node *node; > + int node_number = cpu / 2; hmm... so # threads / node hard coded in here ... > + struct pmd_regs __iomem *pmd_mmio_area; > + unsigned long real_address; > + > + for (node = of_find_node_by_type(NULL, "cpu"); node; > + node = of_find_node_by_type(node, "cpu")) { perhaps for (node = NULL; node = of_find(..) ;) or =NULL then while Somewhat long-winded, but ok the way it is. > + if (node_number == *(int *)get_property(node, > "node-id", NULL)) > + break; > + } > + > + if (!node) { > + printk(KERN_WARNING "PMD: CPU %d not found\n", cpu); > + pmd_mmio_area = NULL; > + } else { > + real_address = *(long *)get_property(node, > "pervasive", NULL); > + pr_debug("PMD for CPU %d at %lx\n", cpu, real_address); > + pmd_mmio_area = __ioremap(real_address, 0x1000, > _PAGE_NO_CACHE); > + } > + return pmd_mmio_area; > +} > + > +void __init cell_pervasive_init(void) > +{ > + struct pmd *pmd; > + int cpu; > + > + if (!cpu_has_feature(CPU_FTR_PAUSE_ZERO)) > + return; > + > + for_each_cpu(cpu) { > + pmd = &per_cpu(pmd, cpu); > + pmd->regs = find_pmd_mmio(cpu); > + } O(n**2) find loop, could combine to get O(n) > +} > + > +int __init enable_pause_zero_init(void) > +{ > + on_each_cpu(enable_pause_zero, NULL, 0, 1); > + return 0; > +} > + > +arch_initcall(enable_pause_zero_init); arch_initcall functions should be static > Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h > =================================================================== > --- /dev/null > +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h > @@ -0,0 +1,64 @@ > +/* > + * Cell Pervasive Monitor and Debug 
interface and HW structures > + * > + * (C) Copyright IBM Corporation 2005 > + * > + * Authors: Maximino Aguilar (maguilar at us.ibm.com) > + * David J. Erb (djerb at us.ibm.com) > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2, or (at your option) > + * any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. > + */ > + > + > +#ifndef PERVASIVE_H > +#define PERVASIVE_H > + > +struct pmd_regs { > + u8 pad_0x0000_0x0800[0x0800 - 0x0000]; /* > 0x0000 */ > + > + /* Thermal Sensor Registers */ > + u64 ts_ctsr1; /* > 0x0800 */ > + u64 ts_ctsr2; /* > 0x0808 */ > + u64 ts_mtsr1; /* > 0x0810 */ > + u64 ts_mtsr2; /* > 0x0818 */ > + u64 ts_itr1; /* > 0x0820 */ > + u64 ts_itr2; /* > 0x0828 */ > + u64 ts_gitr; /* > 0x0830 */ > + u64 ts_isr; /* > 0x0838 */ > + u64 ts_imr; /* > 0x0840 */ > + u64 tm_cr1; /* > 0x0848 */ > + u64 tm_cr2; /* > 0x0850 */ > + u64 tm_simr; /* > 0x0858 */ > + u64 tm_tpr; /* > 0x0860 */ > + u64 tm_str1; /* > 0x0868 */ > + u64 tm_str2; /* > 0x0870 */ > + u64 tm_tsr; /* > 0x0878 */ > + > + /* Power Management */ > + u64 pm_control; /* > 0x0880 */ > +#define PMD_PAUSE_ZERO_CONTROL 0x10000 > + u64 pm_status; /* > 0x0888 */ > + > + /* Time Base Register */ > + u64 tbr; /* > 0x0890 */ > + > + u8 pad_0x0898_0x1000 [0x1000 - 0x0898]; /* > 0x0898 */ > +}; > + > +void __init cell_pervasive_init(void); > +void enable_pause_zero(void *); > +void _pause_zero(void); what is the 
single_underscore _pause_zero() ? other functions are either arch_initcall or called by initcall in the same file and can be static. > + > +#endif > Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c > =================================================================== > --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/setup.c > +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c > @@ -49,6 +49,7 @@ > > #include "interrupt.h" > #include "iommu.h" > +#include "pervasive.h" > > #ifdef DEBUG > #define DBG(fmt...) udbg_printf(fmt) > @@ -165,6 +166,7 @@ static void __init cell_setup_arch(void) > init_pci_config_tokens(); > find_and_init_phbs(); > spider_init_IRQ(); > + cell_pervasive_init(); > #ifdef CONFIG_DUMMY_CONSOLE > conswitchp = &dummy_con; > #endif > Index: linux-2.6.15-rc/arch/powerpc/kernel/head_64.S > =================================================================== > --- linux-2.6.15-rc.orig/arch/powerpc/kernel/head_64.S > +++ linux-2.6.15-rc/arch/powerpc/kernel/head_64.S > @@ -383,7 +383,7 @@ label##_common: > \ > .globl __start_interrupts > __start_interrupts: > > - STD_EXCEPTION_PSERIES(0x100, system_reset) > + STD_EXCEPTION_PSERIES(0x100, system_reset_check) > > . = 0x200 > _machine_check_pSeries: > @@ -860,6 +860,31 @@ unrecov_fer: > bl .unrecoverable_exception > b 1b > > +/* This is a new system reset handler for the BE processor. > + * SRR1 stores wake information that must be decoded to determine why > + * the processor was at the system reset handler. > + */ > + > + .align 7 > + .globl system_reset_check_common > +system_reset_check_common: > +BEGIN_FTR_SECTION > + mr r22,r12 /* r12 has SRR1 saved */ > + srwi r22,r22,16 > + andi.
r22,r22,MSR_WAKEMASK > + cmpwi r22,MSR_WAKEEE > + beq 40f > + cmpwi r22,MSR_WAKEDEC > + beq 42f > + cmpwi r22,MSR_WAKEMT > + beq 43f > +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO) > + b system_reset_common > +40: b hardware_interrupt_common > +42: b decrementer_common > +43: EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN); > + b fast_exception_return > + Branches to branches that must be in the same file, within the first 64k, currently within 32k. Just make the conditional branches go directly to the other routines. This could go inline with system_reset_common, except that it would mean breaking apart the STD_EXCEPTION_COMMON macro for it. Space optimization would then be to put the test for WAKEMT after PROLOG_COMMON at the expense of breaking up the tests. > /* > * Here r13 points to the paca, r9 contains the saved CR, > * SRR0 and SRR1 are saved in r11 and r12, > Index: linux-2.6.15-rc/arch/powerpc/kernel/idle_64.c > =================================================================== > --- linux-2.6.15-rc.orig/arch/powerpc/kernel/idle_64.c > +++ linux-2.6.15-rc/arch/powerpc/kernel/idle_64.c > @@ -40,7 +40,8 @@ void default_idle(void) > if (!need_resched()) { > while (!need_resched() && > !cpu_is_offline(cpu)) { > ppc64_runlatch_off(); > - > + if > (cpu_has_feature(CPU_FTR_PAUSE_ZERO)) > + pause_zero(); We have multiple idle loops and ppc_md.idle_loop to avoid junk like this. Assign the idle loop based on the cpu feature. Place it in pervasive.c, then you can make pause_zero static, and it will be inline. All better for power (fewer tests and branches). > /* > * Go into low thread priority and > possibly > * low power mode.
> Index: linux-2.6.15-rc/include/asm-powerpc/cputable.h > =================================================================== > --- linux-2.6.15-rc.orig/include/asm-powerpc/cputable.h > +++ linux-2.6.15-rc/include/asm-powerpc/cputable.h > @@ -106,6 +106,7 @@ extern void do_cpu_ftr_fixups(unsigned l > #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0000040000000000) > #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) > #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0000100000000000) > +#define CPU_FTR_PAUSE_ZERO ASM_CONST(0x0000200000000000) > #else > /* ensure on 32b processors the flags are available for compiling but > * don't do anything */ > @@ -305,7 +306,8 @@ enum { > CPU_FTR_MMCRA_SIHV, > CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | > CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | > - CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT, > + CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | > + CPU_FTR_CTRL | CPU_FTR_PAUSE_ZERO, > CPU_FTRS_COMPATIBLE = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | > CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2, > #endif > Index: linux-2.6.15-rc/include/asm-powerpc/processor.h > =================================================================== > --- linux-2.6.15-rc.orig/include/asm-powerpc/processor.h > +++ linux-2.6.15-rc/include/asm-powerpc/processor.h > @@ -281,6 +281,14 @@ static inline void prefetchw(const void > #define HAVE_ARCH_PICK_MMAP_LAYOUT > #endif > > +#ifdef CONFIG_PPC_CELL > +extern void pause_zero(void); > +#else > +static inline void pause_zero(void) > +{ > +} > +#endif > + and you can stop polluting processor.h with something you only want called in a pinned cpu context from your idle loop. 
> #endif /* __KERNEL__ */ > #endif /* __ASSEMBLY__ */ > #endif /* _ASM_POWERPC_PROCESSOR_H */ > Index: linux-2.6.15-rc/include/asm-powerpc/reg.h > =================================================================== > --- linux-2.6.15-rc.orig/include/asm-powerpc/reg.h > +++ linux-2.6.15-rc/include/asm-powerpc/reg.h > @@ -92,6 +92,15 @@ > #define MSR_RI __MASK(MSR_RI_LG) /* Recoverable > Exception */ > #define MSR_LE __MASK(MSR_LE_LG) /* Little Endian */ > > +/* Wake Events */ > +#define MSR_WAKEMASK 0x0038 > +#define MSR_WAKERESET 0x0038 > +#define MSR_WAKESYSERR 0x0030 > +#define MSR_WAKEEE 0x0020 > +#define MSR_WAKEMT 0x0028 > +#define MSR_WAKEDEC 0x0018 > +#define MSR_WAKETHERM 0x0010 > + > #ifdef CONFIG_PPC64 > #define MSR_ MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_ISF > #define MSR_KERNEL MSR_ | MSR_SF | MSR_HV > @@ -257,9 +266,10 @@ > #define SPRN_HID6 0x3F9 /* BE HID 6 */ > #define HID6_LB (0x0F<<12) /* Concurrent Large Page > Modes */ > #define HID6_DLP (1<<20) /* Disable all large page > modes (4K only) */ > -#define SPRN_TSCR 0x399 /* Thread switch control on BE > */ > -#define SPRN_TTR 0x39A /* Thread switch timeout on BE > */ > -#define TSCR_DEC_ENABLE 0x200000 /* Decrementer > Interrupt */ > +#define SPRN_TSC_CELL 0x399 /* Thread switch control on > Cell */ > +#define SPRN_TTR 0x39A /* Thread switch timeout on > Cell */ > +#define TSC_DEC_ENABLE_0 0x400000 /* Decrementer > Interrupt */ > +#define TSC_DEC_ENABLE_1 0x200000 /* Decrementer > Interrupt */ The prefix should be the name of the register to which they apply and directly under that register. 
> #define TSCR_EE_ENABLE 0x100000 /* External Interrupt > */ > #define TSCR_EE_BOOST 0x080000 /* External Interrupt > Boost */ > #define SPRN_TSC 0x3FD /* Thread switch control on > others */ > milton From miltonm at bga.com Thu Dec 8 04:28:39 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 7 Dec 2005 11:28:39 -0600 Subject: [PATCH 13/14] spidernet: read firmware from the OF device tree Message-ID: <24364349f5d62b1f71eebc4cb2f3b76e@bga.com> On Tue Dec 6 14:52:33 EST 2005, Arnd Bergmann wrote: > request_firmware() is sometimes problematic, especially > in initramfs, reading the firmware from Open Firmware > is much preferrable. > > We still try to get the firmware from the file system > first, in order to support old SLOF releases and to allow > updates of the spidernet firmware without reflashing > the system. > > From: Jens.Osterkamp at de.ibm.com > Cc: netdev at vger.kernel.org > Signed-off-by: Arnd Bergmann > > Index: linux-2.6.15-rc/drivers/net/spider_net.c > =================================================================== > --- linux-2.6.15-rc.orig/drivers/net/spider_net.c > +++ linux-2.6.15-rc/drivers/net/spider_net.c > @@ -1895,16 +1895,27 @@ spider_net_download_firmware(struct spid > static int > spider_net_init_firmware(struct spider_net_card *card) > { > - const struct firmware *firmware; > + struct firmware *firmware; > + struct device_node *dn; > + u8 *fw_prop; > int err = -EIO; > > - if (request_firmware(&firmware, > + if (request_firmware((const struct firmware **)&firmware, > SPIDER_NET_FIRMWARE_NAME, > &card->pdev->dev) < 0) { > if (netif_msg_probe(card)) > pr_err("Couldn't read in sequencer data file > %s.\n", > SPIDER_NET_FIRMWARE_NAME); > - firmware = NULL; > - goto out; > + > + dn = pci_device_to_OF_node(card->pdev); > + if (!dn) > + goto out; > + > + fw_prop = (u8 *)get_property(dn, "firmware", NULL); > + if (!fw_prop) > + goto out; > + > + memcpy(firmware->data, fw_prop, 6 * > SPIDER_NET_FIRMWARE_LEN * sizeof(u32)); > + 
firmware->size = 6 * > SPIDER_NET_FIRMWARE_LEN * > sizeof(u32); > } > > if (firmware->size != 6 * SPIDER_NET_FIRMWARE_LEN * > sizeof(u32)) { > > A person might think that FIRMWARE_LEN was the desired length. Or at least that there would be something defined for that long expression. Also, how about actually getting the size of the property (that third NULL argument is to return that), and checking if it is the desired size? milton From kravetz at us.ibm.com Thu Dec 8 08:07:23 2005 From: kravetz at us.ibm.com (Mike Kravetz) Date: Wed, 7 Dec 2005 13:07:23 -0800 Subject: [PATCH] boot failures on numa if no memory on node Message-ID: <20051207210723.GA8970@monkey.ibm.com> This bug exists in the current code and prevents machines from booting with numa enabled if there is a node that does not contain memory. Workaround is to boot with 'numa=off'. Looks like a simple typo. Signed-off-by: Mike Kravetz diff -Naupr linux-2.6.15-rc5-git1/arch/powerpc/mm/numa.c linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c --- linux-2.6.15-rc5-git1/arch/powerpc/mm/numa.c 2005-12-04 05:10:42.000000000 +0000 +++ linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c 2005-12-07 20:49:23.000000000 +0000 @@ -125,7 +125,7 @@ void __init get_region(unsigned int nid, /* We didnt find a matching region, return start/end as 0 */ if (*start_pfn == -1UL) - start_pfn = 0; + *start_pfn = 0; } static inline void map_cpu_to_node(int cpu, int node) From anton at samba.org Thu Dec 8 08:01:31 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 8 Dec 2005 08:01:31 +1100 Subject: [PATCH] boot failures on numa if no memory on node In-Reply-To: <20051207210723.GA8970@monkey.ibm.com> References: <20051207210723.GA8970@monkey.ibm.com> Message-ID: <20051207210130.GA23641@krispykreme> Hi, Thanks Mike, that stupid bug was my fault. Looks good. Anton > This bug exists in the current code and prevents machines from booting > with numa enabled if there is a node that does not contain memory.
> Workaround is to boot with 'numa=off'. Looks like a simple typo. > > Signed-off-by: Mike Kravetz > > diff -Naupr linux-2.6.15-rc5-git1/arch/powerpc/mm/numa.c linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c > --- linux-2.6.15-rc5-git1/arch/powerpc/mm/numa.c 2005-12-04 05:10:42.000000000 +0000 > +++ linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c 2005-12-07 20:49:23.000000000 +0000 > @@ -125,7 +125,7 @@ void __init get_region(unsigned int nid, > > /* We didnt find a matching region, return start/end as 0 */ > if (*start_pfn == -1UL) > - start_pfn = 0; > + *start_pfn = 0; > } > > static inline void map_cpu_to_node(int cpu, int node) From Jens.Osterkamp at de.ibm.com Thu Dec 8 08:20:52 2005 From: Jens.Osterkamp at de.ibm.com (Jens Osterkamp) Date: Wed, 7 Dec 2005 22:20:52 +0100 Subject: [PATCH 12/14] spidernet: check if firmware was loaded correctly In-Reply-To: <088b3b2b5b1187c6a716102b0201469f@bga.com> Message-ID: Milton Miller wrote on 12/07/2005 06:24:21 PM: > > - spider_net_download_firmware(card, firmware); > > - > > - err = 0; > > + if (!spider_net_download_firmware(card, firmware)) > > + err = 0; > > Why not assign err to the return of spider_net_download_firmware? You are right, I will correct this. Jens From mostrows at watson.ibm.com Thu Dec 8 10:14:43 2005 From: mostrows at watson.ibm.com (Michal Ostrowski) Date: Wed, 07 Dec 2005 18:14:43 -0500 Subject: [PATCH] Fix windfarm model-id table. Message-ID: <1133997283.28136.57.camel@brick.watson.ibm.com> .model_id fields of wf_smu_sys_all_params should match the model ID they are supposed to represent (as commented). Fixes windfarm on iMac 8,1.
Signed-off-by: Michal Ostrowski --- drivers/macintosh/windfarm_pm81.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) 645a833c9e40e76b56052af16c7ba96259a68163 diff --git a/drivers/macintosh/windfarm_pm81.c b/drivers/macintosh/windfarm_pm81.c index 322c74b..80ddf97 100644 --- a/drivers/macintosh/windfarm_pm81.c +++ b/drivers/macintosh/windfarm_pm81.c @@ -207,7 +207,7 @@ static struct wf_smu_sys_fans_param wf_s }, /* Model ID 3 */ { - .model_id = 2, + .model_id = 3, .itarget = 0x350000, .gd = 0x08e00000, .gp = 0x00566666, @@ -219,7 +219,7 @@ static struct wf_smu_sys_fans_param wf_s }, /* Model ID 5 */ { - .model_id = 2, + .model_id = 5, .itarget = 0x3a0000, .gd = 0x15400000, .gp = 0x00233333, -- 0.99.9.GIT From canticle400 at gmail.com Thu Dec 8 14:37:18 2005 From: canticle400 at gmail.com (Dennis Chua) Date: Wed, 7 Dec 2005 22:37:18 -0500 Subject: Linux+PPC64, self-modifying code Message-ID: Hello. Can anyone comment on the feasibility of writing self-modifying code on Linux PPC64? Disregarding the motivations behind this, is it possible for an executable program to - access the instruction opcode of one of its functions. - overwrite/alter the function opcode and to do this all during runtime? Any insight is much appreciated! Thank you. From hollis at penguinppc.org Thu Dec 8 15:10:36 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Wed, 7 Dec 2005 22:10:36 -0600 Subject: Linux+PPC64, self-modifying code In-Reply-To: References: Message-ID: <9a87b582891390d1bbf5d00bee5488d0@penguinppc.org> On Dec 7, 2005, at 9:37 PM, Dennis Chua wrote: > > Can anyone comment on the feasibility of writing self-modifying > code on Linux PPC64? Disregarding the motivations behind this, > is it possible for an executable program to > > - access the instruction opcode of one of its functions.
> - overwrite/alter the function opcode > > and to do this all during runtime? It's quite feasible. Many projects, including the kernel, do this. > Any insight is much appreciated! Thank you. The main trick is that most PowerPC have L1 instruction caches that are incoherent with the L1 data caches. In other words, when you write the new code to memory, it lands in the dcache, and then the icache has stale instructions which it will happily execute. The architected sequence you must execute for self-modifying code is documented, I believe in Book III of the PowerPC Architecture (see http://penguinppc.org/dev/#library). You basically flush the affected memory out of the L1 dcache, sync to make sure all that finished, invalidate the previous icache contents, then isync to discard partially-decoded instructions the processor may have already fetched out of the icache. See the Architecture book for the exact instructions... -Hollis From david at gibson.dropbear.id.au Thu Dec 8 15:25:51 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 8 Dec 2005 15:25:51 +1100 Subject: Linux+PPC64, self-modifying code In-Reply-To: <9a87b582891390d1bbf5d00bee5488d0@penguinppc.org> References: <9a87b582891390d1bbf5d00bee5488d0@penguinppc.org> Message-ID: <20051208042551.GA30681@localhost.localdomain> On Wed, Dec 07, 2005 at 10:10:36PM -0600, Hollis Blanchard wrote: > On Dec 7, 2005, at 9:37 PM, Dennis Chua wrote: > > > > Can anyone comment on the feasibility of writing self-modifying > > code on Linux PPC64? Disregarding the motivations behind this, > > is it possible for an executable program to > > > > - access the instruction opcode of one of its functions. > > - overwrite/alter the function opcode > > > > and to do this all during runtime? > > It's quite feasible. Many projects, including the kernel, do this. > > > Any insight is much appreciated! Thank you. 
> > The main trick is that most PowerPC have L1 instruction caches that are > incoherent with the L1 data caches. In other words, when you write the > new code to memory, it lands in the dcache, and then the icache has > stale instructions which it will happily execute. > > The architected sequence you must execute for self-modifying code is > documented, I believe in Book III of the PowerPC Architecture (see > http://penguinppc.org/dev/#library). You basically flush the affected > memory out of the L1 dcache, sync to make sure all that finished, > invalidate the previous icache contents, then isync to discard > partially-decoded instructions the processor may have already fetched > out of the icache. See the Architecture book for the exact > instructions... I believe dcbst sync icbi isync is what you need. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From hollis at penguinppc.org Thu Dec 8 16:07:50 2005 From: hollis at penguinppc.org (Hollis Blanchard) Date: Wed, 7 Dec 2005 23:07:50 -0600 Subject: Linux+PPC64, self-modifying code In-Reply-To: <9a87b582891390d1bbf5d00bee5488d0@penguinppc.org> References: <9a87b582891390d1bbf5d00bee5488d0@penguinppc.org> Message-ID: <88e1d310bda948bdf8ab8d1c6efcaf76@penguinppc.org> On Dec 7, 2005, at 10:10 PM, Hollis Blanchard wrote: > > The architected sequence you must execute for self-modifying code is > documented, I believe in Book III of the PowerPC Architecture (see > http://penguinppc.org/dev/#library). Correction: Book II, section 1.8 in the version 2.02 architecture PDFs (which are what's on the web). 
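The dcbst / sync / icbi / isync sequence discussed in this thread can be sketched in C roughly as follows. This is only a minimal sketch, assuming GCC-style inline assembly: the names lines_spanned and sync_icache_sketch are illustrative and do not come from the thread or from any library, and the cache line size is passed in by the caller because it differs between processors (the 128 mentioned below is an assumption, not a value stated in the thread). On non-PowerPC builds the sketch falls back to the compiler builtin __builtin___clear_cache so that it still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of cache lines of size `line` touched by [addr, addr + len).
 * Hypothetical helper, used only by this sketch. */
static size_t lines_spanned(uintptr_t addr, size_t len, size_t line)
{
    uintptr_t first, last;

    assert(line != 0 && (line & (line - 1)) == 0); /* must be a power of two */
    if (len == 0)
        return 0;
    first = addr & ~(uintptr_t)(line - 1);
    last = (addr + len - 1) & ~(uintptr_t)(line - 1);
    return (size_t)((last - first) / line) + 1;
}

/* Make instructions just written to [start, start + len) visible to
 * instruction fetch on the current processor. `line` is the cache line
 * size in bytes and is an assumption supplied by the caller. */
static void sync_icache_sketch(void *start, size_t len, size_t line)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    uintptr_t base = (uintptr_t)start & ~(uintptr_t)(line - 1);
    size_t i, n = lines_spanned((uintptr_t)start, len, line);

    /* 1. Write the modified lines back from the data cache. */
    for (i = 0; i < n; i++)
        __asm__ volatile("dcbst 0,%0" : : "r"(base + i * line) : "memory");
    /* 2. Wait until those write-backs have completed. */
    __asm__ volatile("sync" : : : "memory");
    /* 3. Invalidate any stale copies in the instruction cache. */
    for (i = 0; i < n; i++)
        __asm__ volatile("icbi 0,%0" : : "r"(base + i * line) : "memory");
    /* 4. Discard instructions that were already fetched. */
    __asm__ volatile("isync" : : : "memory");
#else
    /* Portable fallback for other targets (GCC/Clang builtin). */
    (void)line;
    __builtin___clear_cache((char *)start, (char *)start + len);
#endif
}
```

A caller would write the new instructions into a writable and executable mapping, then call something like sync_icache_sketch(code, code_len, 128) before jumping to them. Passing a line size smaller than the real one only repeats work, while a larger one would skip lines, so when in doubt a smaller value is the safer guess.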
-Hollis From benh at kernel.crashing.org Thu Dec 8 16:51:44 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 08 Dec 2005 16:51:44 +1100 Subject: [PATCH] powerpc: Fix a huge page bug Message-ID: <1134021104.7168.106.camel@gaston> The 64k pages patch changed the meaning of one argument passed to the low level hash functions (from "large" it became "psize" or page size index), but one of the call sites wasn't properly updated, causing potential random weird problems with huge pages. This fixes it. Signed-off-by: Benjamin Herrenschmidt --- This is a candidate for 2.6.15 Index: linux-work/arch/powerpc/mm/hugetlbpage.c =================================================================== --- linux-work.orig/arch/powerpc/mm/hugetlbpage.c 2005-11-28 11:04:49.000000000 +1100 +++ linux-work/arch/powerpc/mm/hugetlbpage.c 2005-12-08 16:42:41.000000000 +1100 @@ -703,7 +703,8 @@ int hash_huge_page(struct mm_struct *mm, slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; slot += (old_pte & _PAGE_F_GIX) >> 12; - if (ppc_md.hpte_updatepp(slot, rflags, va, 1, local) == -1) + if (ppc_md.hpte_updatepp(slot, rflags, va, mmu_huge_psize, + local) == -1) old_pte &= ~_PAGE_HPTEFLAGS; } From benh at kernel.crashing.org Thu Dec 8 16:53:34 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Thu, 08 Dec 2005 16:53:34 +1100 Subject: [PATCH] powerpc: Remove debug code in hash path Message-ID: <1134021215.7168.109.camel@gaston> Some debug code wasn't properly removed from the initial 64k pages patch, and while it's harmless, it's also significantly slowing down a very hot code path, thus it should really be removed.
Signed-off-by: Benjamin Herrenschmidt --- This is a candidate for 2.6.15 Index: linux-work/arch/powerpc/platforms/pseries/lpar.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/pseries/lpar.c 2005-11-09 11:49:03.000000000 +1100 +++ linux-work/arch/powerpc/platforms/pseries/lpar.c 2005-12-08 16:43:40.000000000 +1100 @@ -298,18 +298,6 @@ long pSeries_lpar_hpte_insert(unsigned l if (!(vflags & HPTE_V_BOLTED)) DBG_LOW(" hpte_v=%016lx, hpte_r=%016lx\n", hpte_v, hpte_r); -#if 1 - { - int i; - for (i=0;i<8;i++) { - unsigned long w0, w1; - plpar_pte_read(0, hpte_group, &w0, &w1); - BUG_ON (HPTE_V_COMPARE(hpte_v, w0) - && (w0 & HPTE_V_VALID)); - } - } -#endif - /* Now fill in the actual HPTE */ /* Set CEC cookie to 0 */ /* Zero page = 0 */ From paulus at samba.org Thu Dec 8 17:18:58 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 8 Dec 2005 17:18:58 +1100 Subject: [PATCH 00/14] Cell updates for powerpc.git In-Reply-To: <20051206035220.097737000@localhost> References: <20051206035220.097737000@localhost> Message-ID: <17303.53330.215784.545952@cargo.ozlabs.ibm.com> Arnd Bergmann writes: > This is my current set of updates related to the cell platforms. I have put patches 1..7 and 9 into the powerpc.git tree. Please address Milton's comments on patches 3, 5, 8 and 10. Thanks, Paul. From paulus at samba.org Thu Dec 8 17:22:52 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 8 Dec 2005 17:22:52 +1100 Subject: patches in powerpc.git tree Message-ID: <17303.53564.160668.376061@cargo.ozlabs.ibm.com> Here is a list of patches currently in the powerpc.git tree that aren't already in Linus' tree. Paul. 
Adrian Bunk: PPC_PREP: remove unneeded exports Andy Whitcroft: powerpc: powermac adb fix dependency on btext_drawchar powerpc: powermac adb fix udbg_adb_use_btext warning powerpc32: clean up available memory models powerpc32: fix definition of distribute_irqs Arnd Bergmann: spufs: The SPU file system, base spufs: cooperative scheduler support spufs: Make all exports GPL-only spufs: fix local store page refcounting spufs: Fix oops when spufs module is not loaded spufs: Turn off debugging output spufs: Improved SPU preemptability. spufs: Improved SPU preemptability [part 2]. spufs: fix mailbox polling cell: add platform detection code Benjamin Herrenschmidt: powerpc: Merge align.c (#2) powerpc: Add OF address parsing code (#2) powerpc: serial port discovery (#2) powerpc: Unify udbg (#2) powerpc: Add back support for booting from BootX (#2) powerpc: convert macio_asic to use prom_parse powerpc: Fix g5 build with xmon powerpc: More serial probe fixes (#2) powerpc: udbg updates powerpc: Update OF address parsers powerpc: Fix a huge page bug powerpc: Remove debug code in hash path David Gibson: powerpc: Remove imalloc.h powerpc: Make hugepage mappings respect hint addresses is_aligned_hugepage_range() cleanup powerpc: Remove ItLpRegSave area from the paca powerpc: Remove some unneeded fields from the paca David Woodhouse: syscall entry/exit revamp ppc64 syscall_exit_work: call the save_nvgprs function, not its descriptor. 
powerpc: serial port discovery: cope with broken firmware Save NVGPRS in 32-bit signal frame Fix code that saves NVGPRS in 32-bit signal frame ppc: Make ARCH=ppc build again with new syscall path Heiko J Schick: powerpc: IBMEBUS bus support Hugh Dickins: mm: powerpc ptlock comments mm: powerpc init_mm without ptlock Kumar Gala: powerpc: moved ipic code to arch/powerpc powerpc: Add support for building uImages powerpc: Fix suboptimal uImage target linas: powerpc/pseries: dlpar-add crash on null pointer deref powerpc: minor cleanup of void ptr deref Linas Vepstas: powerpc: PCI hotplug common code elimination powerpc: make pcibios_claim_one_bus available to other code powerpc: migrate common PCI hotplug code PCI Error Recovery: header file patch powerpc: PCI Error Recovery: PPC64 core recovery routines powerpc: Split out PCI address cache to its own file powerpc: Add "partitionable endpoint" support powerpc: remove bogus printk powerpc: Remove duplicate code powerpc: bugfix: fill in uninitialized field powerpc: Use PE configuration address consistently powerpc: set up the RTAS token just like the rest of them. powerpc: Don't continue with PCI Error recovery if slot reset failed. 
powerpc: handle multifunction PCI devices properly powerpc: IOMMU: don't ioremap null addresses powerpc: Save device BARs much earlier in the boot sequence powerpc: get rid of per_cpu EEH counters Marcelo Tosatti: ppc32: m8xx watchdog update powerpc/8xx: Fix m8xx_wdt issues Mark Nutter: spufs: switchable spu contexts kernel-side context switch code for spufs spufs: add spu-side context switch code Michael Ellerman: powerpc: Merge kexec powerpc: Propagate regs through to machine_crash_shutdown powerpc: Add a is_kernel_addr() macro powerpc: Separate usage of KERNELBASE and PAGE_OFFSET powerpc: Add CONFIG_CRASH_DUMP powerpc: Create a trampoline for the fwnmi vectors powerpc: Reroute interrupts from 0 + offset to PHYSICAL_START + offset powerpc: Fixups for kernel linked at 32 MB powerpc: Add arch dependent basic infrastructure for Kdump. powerpc: Parse crashkernel= parameter in first kernel powerpc: Add arch-dependent copy_oldmem_page powerpc: Add support for "linux,usable-memory" on memory nodes Michal Ostrowski: powerpc/pseries: Fix TCE building with 64k pagesize Mike Kravetz: Remove SPAN_OTHER_NODES config definition powerpc: Minor numa memory code cleanup powerpc: Minor numa memory code cleanup powerpc: numa placement for dynamically added memory powerpc/pseries: boot failures on numa if no memory on node Olaf Hering: powerpc: correct the NR_CPUS description text Olof Johansson: powerpc: remove redundant code in stab init Otavio Salvador: ppc: removed unused variable i from code. 
Paul Mackerras: powerpc: Update __NR_syscalls to account for SPU syscalls ppc: remove duplicate bseip.h powerpc: Fix up some compile errors in the PCI error recovery code powerpc/pseries: Optimize IOMMU setup ppc: Build in all three of powermac, PREP and CHRP support Revert "powerpc: Minor numa memory code cleanup" powerpc: Fix typo in head_64.S Stephen Rothwell: powerpc: remove arch/powerpc/include hack for 64 bit powerpc: cleanup iseries irq.c powerpc: use end_IRQ for iseries irqs powerpc: partly merge iseries do_IRQ powerpc: reduce include in irq.c powerpc: more iseries irq work powerpc: fix for "Update OF address parsers" From sfr at canb.auug.org.au Thu Dec 8 17:48:20 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 8 Dec 2005 17:48:20 +1100 Subject: patches in powerpc.git tree In-Reply-To: <17303.53564.160668.376061@cargo.ozlabs.ibm.com> References: <17303.53564.160668.376061@cargo.ozlabs.ibm.com> Message-ID: <20051208174820.728a4a7b.sfr@canb.auug.org.au> On Thu, 8 Dec 2005 17:22:52 +1100 Paul Mackerras wrote: > > Stephen Rothwell: > powerpc: remove arch/powerpc/include hack for 64 bit That one is in Linus' tree ... -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From dwmw2 at infradead.org Thu Dec 8 19:25:19 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Thu, 08 Dec 2005 09:25:19 +0100 Subject: instantiate_rtas on Cell sim fails... Message-ID: <1134030320.19711.51.camel@localhost.localdomain> We never used to check for != 0; we used to check for == PROM_ERROR instead. And on mambo we get 1, not 0. This makes it work again, but is it the sim at fault, or the kernel?
--- linux-2.6.14/arch/powerpc/kernel/prom_init.c~ 2005-12-07 23:33:20.000000000 +0100 +++ linux-2.6.14/arch/powerpc/kernel/prom_init.c 2005-12-07 23:33:38.000000000 +0100 @@ -1051,7 +1051,7 @@ static void __init prom_instantiate_rtas if (call_prom_ret("call-method", 3, 2, &entry, ADDR("instantiate-rtas"), - rtas_inst, base) != 0 + rtas_inst, base) == PROM_ERROR || entry == 0) { prom_printf(" failed\n"); return; -- dwmw2 From dwmw2 at infradead.org Thu Dec 8 21:31:10 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Thu, 08 Dec 2005 11:31:10 +0100 Subject: [RFC PATCH 6/5] CELL rtas console port to hvc_console backend driver In-Reply-To: <43935B9C.5020503@us.ibm.com> References: <43935B9C.5020503@us.ibm.com> Message-ID: <1134037870.19711.57.camel@localhost.localdomain> --- linux-2.6.14/drivers/char/hvc_rtas.c~ 2005-12-07 18:12:59.000000000 +0100 +++ linux-2.6.14/drivers/char/hvc_rtas.c 2005-12-07 18:15:39.000000000 +0100 @@ -0,0 +1,161 @@ +/* + * IBM RTAS driver interface to hvc_console.c + * + * (C) Copyright IBM Corporation 2001-2005 + * (C) Copyright Red Hat, Inc. 2005 + * + * Author(s): Maximino Augilar + * : Ryan S. Arnold + * : Utz Bacher + * : David Woodhouse + * + * inspired by drivers/char/hvc_console.c + * written by Anton Blanchard and Paul Mackerras + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include +#include +#include +#include +#include "hvc_console.h" + +static uint32_t hvc_rtas_vtermno = 0; +struct hvc_struct *hvc_rtas_dev; + +#define RTASCONS_PUT_ATTEMPTS 16 + +static int rtascons_put_char_token = -1; +static int rtascons_get_char_token = -1; +static int rtascons_put_delay; + +static inline int hvc_rtas_write_console(uint32_t vtermno, const char *buf, int count) +{ + int result = 0; + int attempts = RTASCONS_PUT_ATTEMPTS; + int done = 0; + + /* if there is more than one character to be displayed, wait a bit */ + for (; done < count && attempts; udelay(rtascons_put_delay)) { + attempts--; + result = rtas_call(rtascons_put_char_token, 1, 1, NULL, buf[done]); + + if (!result) { + attempts = RTASCONS_PUT_ATTEMPTS; + done++; + } + } + /* the calling routine expects to receive the number of bytes sent */ + return done?:result; +} + +static inline int rtascons_get_char(void) +{ + int result; + + if (rtas_call(rtascons_get_char_token, 0, 2, &result)) + result = -1; + + return result; +} + +static int hvc_rtas_read_console(uint32_t vtermno, char *buf, int count) +{ + unsigned long got; + int c; + int i; + + for (got = 0, i = 0; i < count; i++) { + + if (( c = rtascons_get_char() ) != -1) { + buf[i] = c; + ++got; + } + else + break; + } + return got; +} + +static struct hv_ops hvc_rtas_get_put_ops = { + .get_chars = hvc_rtas_read_console, + .put_chars = hvc_rtas_write_console, +}; + +static int hvc_rtas_init(void) +{ + struct hvc_struct *hp; + + if (rtascons_put_char_token == -1) + rtascons_put_char_token = rtas_token("put-term-char"); + if (rtascons_put_char_token == -1) + return -EIO; + + if (rtascons_get_char_token == -1) + rtascons_get_char_token = rtas_token("get-term-char"); + if (rtascons_get_char_token == -1) + 
return -EIO; + + if (__onsim()) + rtascons_put_delay = 0; + else + rtascons_put_delay = 100; + + BUG_ON(hvc_rtas_dev); + + /* Allocate an hvc_struct for the console device we instantiated + * earlier. Save off hp so that we can return it on exit */ + hp = hvc_alloc(hvc_rtas_vtermno, NO_IRQ, &hvc_rtas_get_put_ops); + if (IS_ERR(hp)) + return PTR_ERR(hp); + hvc_rtas_dev = hp; + return 0; +} +module_init(hvc_rtas_init); + +/* This will tear down the tty portion of the driver */ +static void __exit hvc_rtas_exit(void) +{ + struct hvc_struct *hp_safe; + /* Hopefully this isn't premature */ + if (!hvc_rtas_dev) + return; + + hp_safe = hvc_rtas_dev; + hvc_rtas_dev = NULL; + + /* Really the fun isn't over until the worker thread breaks down and the + * tty cleans up */ + hvc_remove(hp_safe); +} +module_exit(hvc_rtas_exit); /* before drivers/char/hvc_console.c */ + +/* This will happen prior to module init. There is no tty at this time? */ +static int hvc_rtas_console_init(void) +{ + rtascons_put_char_token = rtas_token("put-term-char"); + if (rtascons_put_char_token == -1) + return -EIO; + rtascons_get_char_token = rtas_token("get-term-char"); + if (rtascons_get_char_token == -1) + return -EIO; + + hvc_instantiate(hvc_rtas_vtermno, 0, &hvc_rtas_get_put_ops ); + return 0; +} +console_initcall(hvc_rtas_console_init); --- linux-2.6.14/drivers/char/Makefile~ 2005-12-07 17:47:05.000000000 +0100 +++ linux-2.6.14/drivers/char/Makefile 2005-12-07 18:12:07.000000000 +0100 @@ -43,6 +43,7 @@ obj-$(CONFIG_RIO) += rio/ generic_seria obj-$(CONFIG_HVC_DRIVER) += hvc_console.o obj-$(CONFIG_HVC_CONSOLE) += hvc_vio.o hvsi.o obj-$(CONFIG_HVC_FSS) += hvc_fss.o +obj-$(CONFIG_HVC_RTAS) += hvc_rtas.o obj-$(CONFIG_RAW_DRIVER) += raw.o obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o obj-$(CONFIG_MMTIMER) += mmtimer.o --- linux-2.6.14/drivers/char/Kconfig~ 2005-12-07 17:47:05.000000000 +0100 +++ linux-2.6.14/drivers/char/Kconfig 2005-12-07 18:17:14.000000000 +0100 @@ -575,6 +575,13 @@ config HVC_FSS 
IBM Full System Simulator Console device driver which makes use of the HVC_DRIVER front end. +config HVC_RTAS + bool "IBM RTAS Console support" + depends on PPC_RTAS + select HVC_DRIVER + help + IBM Console device driver which makes use of RTAS + config HVCS tristate "IBM Hypervisor Virtual Console Server support" depends on PPC_PSERIES -- dwmw2 From paulus at samba.org Thu Dec 8 22:48:45 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 8 Dec 2005 22:48:45 +1100 Subject: instantiate_rtas on Cell sim fails... In-Reply-To: <1134030320.19711.51.camel@localhost.localdomain> References: <1134030320.19711.51.camel@localhost.localdomain> Message-ID: <17304.7581.151784.735540@cargo.ozlabs.ibm.com> David Woodhouse writes: > We never used to check for != 0; we used to check for == PROM_ERROR > instead. And on mambo we get 1, not 0. This makes it work again, but is > it the sim at fault, or the kernel? > > --- linux-2.6.14/arch/powerpc/kernel/prom_init.c~ 2005-12-07 23:33:20.000000000 +0100 > +++ linux-2.6.14/arch/powerpc/kernel/prom_init.c 2005-12-07 23:33:38.000000000 +0100 > @@ -1051,7 +1051,7 @@ static void __init prom_instantiate_rtas > > if (call_prom_ret("call-method", 3, 2, &entry, > ADDR("instantiate-rtas"), > - rtas_inst, base) != 0 > + rtas_inst, base) == PROM_ERROR The call-method function is supposed to execute the named method inside a catch. The first return value, which is what call_prom_ret returns, is the result from catch. Catch returns false (i.e. 0) if there was no throw call, or a non-zero error code if an error was signalled with throw. This is from IEEE 1275. So I think that != 0 is correct and thus sim is at fault, unless of course the forth code for instantiate-rtas is in fact calling throw for some reason, in which case we need to find out what error the sim firmware is detecting. Paul. 
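The catch/throw semantics described above can be restated as a small decision function. This is a sketch only: the enum and the function name `classify_instantiate_rtas` are hypothetical and do not appear in prom_init.c, and the only rule it encodes is the one stated in the message, namely that the first return value of call-method is the catch result, which is 0 when nothing was thrown and a non-zero error code otherwise.

```c
#include <stdint.h>

/* Possible outcomes of the "instantiate-rtas" call-method invocation.
 * Names are illustrative, not kernel identifiers. */
enum rtas_inst_status {
    RTAS_INST_OK,        /* catch returned 0 and an entry point came back */
    RTAS_INST_THROWN,    /* catch returned a non-zero throw code */
    RTAS_INST_NO_ENTRY   /* no throw, but the entry point is 0 */
};

/* catch_result: first return cell of call-method (result of catch).
 * entry:        second return cell, the RTAS entry point. */
static enum rtas_inst_status
classify_instantiate_rtas(uint32_t catch_result, uint32_t entry)
{
    if (catch_result != 0)
        return RTAS_INST_THROWN;   /* firmware signalled an error */
    if (entry == 0)
        return RTAS_INST_NO_ENTRY; /* nothing usable was instantiated */
    return RTAS_INST_OK;
}
```

Under this reading, the value 1 reported on the simulator lands in the RTAS_INST_THROWN case, which is why the `!= 0` test treats it as a failure.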
From olof at lixom.net Fri Dec 9 12:40:17 2005 From: olof at lixom.net (Olof Johansson) Date: Thu, 8 Dec 2005 19:40:17 -0600 Subject: [PATCH] powerpc: Set cache info defaults Message-ID: <20051209014017.GD1082@pb15.lixom.net> Hi, I would like to see this in 2.6.15, please apply. --- Cache info is setup by walking the device tree in initialize_cache_info(). However, icache_flush_range might be called before that, in slb_initialize()->patch_slb_encoding, which modifies the load immediate instructions used with SLB fault code. Not only that, but depending on memory layout, we might take SLB faults during unflatten_device_tree. So that fault will load an SLB entry that might not contain the right LLP flags for the segment. Either we can walk the flattened device tree to setup cache info, or we can pick the known defaults that are known to work. Doing it in the flattened device tree is hairier since we need to know the machine type to know what property to look for, etc, etc. For now, it's just easier to go with the defaults. Worst thing that happens from it is that we might waste a few cycles doing too small dcbst/icbi increments. Signed-off-by: Olof Johansson Index: 2.6/arch/powerpc/kernel/setup_64.c =================================================================== --- 2.6.orig/arch/powerpc/kernel/setup_64.c 2005-12-08 17:17:59.000000000 -0600 +++ 2.6/arch/powerpc/kernel/setup_64.c 2005-12-08 19:35:15.000000000 -0600 @@ -106,7 +106,15 @@ int boot_cpuid_phys = 0; dev_t boot_dev; u64 ppc64_pft_size; -struct ppc64_caches ppc64_caches; +/* Pick defaults since we might want to patch instructions + * before we've read this from the device tree. 
+ */ +struct ppc64_caches ppc64_caches = { + .dline_size = 0x80, + .log_dline_size = 7, + .iline_size = 0x80, + .log_iline_size = 7 +}; EXPORT_SYMBOL_GPL(ppc64_caches); /* From michael at ellerman.id.au Fri Dec 9 12:57:20 2005 From: michael at ellerman.id.au (Michael Ellerman) Date: Thu, 8 Dec 2005 19:57:20 -0600 Subject: [PATCH] powerpc: Set cache info defaults In-Reply-To: <20051209014017.GD1082@pb15.lixom.net> References: <20051209014017.GD1082@pb15.lixom.net> Message-ID: <200512081957.24861.michael@ellerman.id.au> On Thu, 8 Dec 2005 19:40, Olof Johansson wrote: > Cache info is setup by walking the device tree in initialize_cache_info(). > However, icache_flush_range might be called before that, in > slb_initialize()->patch_slb_encoding, which modifies the load immediate > instructions used with SLB fault code. > > Not only that, but depending on memory layout, we might take SLB faults > during unflatten_device_tree. So that fault will load an SLB entry that > might not contain the right LLP flags for the segment. > > Either we can walk the flattened device tree to setup cache info, or > we can pick the known defaults that are known to work. Doing it in the > flattened device tree is hairier since we need to know the machine type > to know what property to look for, etc, etc. > > For now, it's just easier to go with the defaults. Worst thing that > happens from it is that we might waste a few cycles doing too small > dcbst/icbi increments. This is cool. I had to hand-code the sync in one of my kdump patches exactly because it was too early to call flush_icache_range(). And I got it wrong the first time :P cheers -- Michael Ellerman IBM OzLabs email: michael:ellerman.id.au inmsg: mpe:jabber.org wwweb: http://michael.ellerman.id.au phone: +61 2 6212 1183 (tie line 70 21183) We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person
From tom_gall at vnet.ibm.com Fri Dec 9 13:13:54 2005 From: tom_gall at vnet.ibm.com (Tom Gall) Date: Thu, 8 Dec 2005 20:13:54 -0600 (CST) Subject: [PATCH] vDSO for ppc/ppc64 submission Message-ID: Greetings, Enclosed is the patch for ppc/ppc64 vDSO support in glibc plus changes to use the vDSO implementations of __vdso_get_tbfreq, __vdso_clock_gettime, __vdso_clock_getres and __vdso_gettimeofday found in the 2.6.15 kernel written by Ben Herrenschmidt. Comments/Complaints/Suggestions of course are most welcome. Regards, Tom 2005-12-08 Steven Munroe Tom Gall * elf/rtld.c (dl_main): Initialize l_local_scope for sysinfo_map. * sysdeps/powerpc/elf/libc-start.c: Move this. * sysdeps/unix/sysv/linux/powerpc/libc-start.c: To here. * sysdeps/powerpc/powerpc32/dl-start.S: add _dl_main_dispatch * sysdeps/powerpc/powerpc32/hp-timing.h: New file. * sysdeps/powerpc/Versions: add __vdso_ symbols * sysdeps/unix/sysv/linux/clock_getres.c: add INTERNAL_VSYSCALL defined by default to INTERNAL_SYSCALL and INLINE_VSYSCALL defined by default to INLINE_SYSCALL * sysdeps/unix/sysv/linux/clock_gettime.c: add INTERNAL_VSYSCALL defined by default to INTERNAL_SYSCALL and INLINE_VSYSCALL defined by default to INLINE_SYSCALL * sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h: new file * sysdeps/unix/sysv/linux/powerpc/clock_getres.c: New file. * sysdeps/unix/sysv/linux/powerpc/clock_gettime.c: New file. * sysdeps/unix/sysv/linux/powerpc/dl-vdso.c: New file. * sysdeps/unix/sysv/linux/powerpc/dl-vdso.h: New file. * sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c: use vDSO / format * sysdeps/unix/sysv/linux/powerpc/gettimeofday.c: New file. * sysdeps/unix/sysv/linux/powerpc/Makefile: Add routines += dl-vdso.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h: new INLINE_VDSOCALL, INTERNAL_VDSOCALL_SIMPLE, INLINE_VDSOCALL_NO_SYSCALL_FALLBACK, INLINE_VDSOCALL_SIMPLE and INTERNAL_VDSOCALL macros * sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h: new INLINE_VDSOCALL, INTERNAL_VDSOCALL_SIMPLE, INLINE_VDSOCALL_NO_SYSCALL_FALLBACK, INLINE_VDSOCALL_SIMPLE and INTERNAL_VDSOCALL macros diff -uNr libc.orig/elf/rtld.c libc/elf/rtld.c --- libc.orig/elf/rtld.c 2005-12-05 21:18:26.000000000 -0500 +++ libc/elf/rtld.c 2005-12-05 21:20:43.000000000 -0500 @@ -1296,6 +1296,13 @@ elf_get_dynamic_info (l, dyn_temp); _dl_setup_hash (l); l->l_relocated = 1; + /* Initialize l_local_scope to contain just this map. This allows + the use of dl_lookup_symbol_x to resolve symbols within the vdso. + So we create a single entry list pointing to l_real as its only + element */ + + l->l_local_scope[0]->r_nlist = 1; + l->l_local_scope[0]->r_list = &l->l_real; /* Now that we have the info handy, use the DSO image's soname so this object can be looked up by name. Note that we do not diff -uNr libc.orig/sysdeps/powerpc/elf/libc-start.c libc/sysdeps/powerpc/elf/libc-start.c --- libc.orig/sysdeps/powerpc/elf/libc-start.c 2005-12-05 21:18:27.000000000 -0500 +++ libc/sysdeps/powerpc/elf/libc-start.c 1969-12-31 19:00:00.000000000 -0500 @@ -1,99 +0,0 @@ -/* Copyright (C) 1998,2000,2001,2002,2003,2004 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details.
- - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, write to the Free - Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA - 02111-1307 USA. */ - -#include -#include -#include -#include -#include - -extern int __cache_line_size; -weak_extern (__cache_line_size) - -/* The main work is done in the generic function. */ -#define LIBC_START_MAIN generic_start_main -#define LIBC_START_DISABLE_INLINE -#define LIBC_START_MAIN_AUXVEC_ARG -#define MAIN_AUXVEC_ARG -#include - - -struct startup_info -{ - void *__unbounded sda_base; - int (*main) (int, char **, char **, void *); - int (*init) (int, char **, char **, void *); - void (*fini) (void); -}; - - -int -/* GKM FIXME: GCC: this should get __BP_ prefix by virtue of the - BPs in the arglist of startup_info.main and startup_info.init. */ -BP_SYM (__libc_start_main) (int argc, char *__unbounded *__unbounded ubp_av, - char *__unbounded *__unbounded ubp_ev, - ElfW(auxv_t) *__unbounded auxvec, - void (*rtld_fini) (void), - struct startup_info *__unbounded stinfo, - char *__unbounded *__unbounded stack_on_entry) -{ -#if __BOUNDED_POINTERS__ - char **argv; -#else -# define argv ubp_av -#endif - - /* the PPC SVR4 ABI says that the top thing on the stack will - be a NULL pointer, so if not we assume that we're being called - as a statically-linked program by Linux... */ - if (*stack_on_entry != NULL) - { - char *__unbounded *__unbounded temp; - /* ...in which case, we have argc as the top thing on the - stack, followed by argv (NULL-terminated), envp (likewise), - and the auxilary vector. */ - /* 32/64-bit agnostic load from stack */ - argc = *(long int *__unbounded) stack_on_entry; - ubp_av = stack_on_entry + 1; - ubp_ev = ubp_av + argc + 1; -#ifdef HAVE_AUX_VECTOR - temp = ubp_ev; - while (*temp != NULL) - ++temp; - auxvec = (ElfW(auxv_t) *)++temp; -#endif - rtld_fini = NULL; - } - - /* Initialize the __cache_line_size variable from the aux vector. 
*/ - for (ElfW(auxv_t) *av = auxvec; av->a_type != AT_NULL; ++av) - switch (av->a_type) - { - case AT_DCACHEBSIZE: - { - int *cls = & __cache_line_size; - if (cls != NULL) - *cls = av->a_un.a_val; - } - break; - } - - return generic_start_main (stinfo->main, argc, ubp_av, auxvec, - stinfo->init, stinfo->fini, rtld_fini, - stack_on_entry); -} diff -uNr libc.orig/sysdeps/powerpc/powerpc32/dl-start.S libc/sysdeps/powerpc/powerpc32/dl-start.S --- libc.orig/sysdeps/powerpc/powerpc32/dl-start.S 2005-12-05 21:18:27.000000000 -0500 +++ libc/sysdeps/powerpc/powerpc32/dl-start.S 2005-12-05 21:20:43.000000000 -0500 @@ -98,6 +98,7 @@ Take the opportunity to clear LR, so anyone who accidentally returns from _start gets SEGV. Also clear the next few words of the stack. */ +ENTRY(_dl_main_dispatch) li r31,0 stw r31,0(r1) mtlr r31 diff -uNr libc.orig/sysdeps/powerpc/powerpc32/hp-timing.h libc/sysdeps/powerpc/powerpc32/hp-timing.h --- libc.orig/sysdeps/powerpc/powerpc32/hp-timing.h 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/powerpc/powerpc32/hp-timing.h 2005-12-05 21:20:43.000000000 -0500 @@ -0,0 +1,83 @@ +/* High precision, low overhead timing functions. Generic version. + Copyright (C) 1998, 2000 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Ulrich Drepper , 1998. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. */ + +#ifndef _HP_TIMING_H +#define _HP_TIMING_H 1 + + +/* There are no generic definitions for the times. We could write something + using the `gettimeofday' system call where available but the overhead of + the system call might be too high. + + In case a platform supports timers in the hardware the following macros + and types must be defined: + + - HP_TIMING_AVAIL: test for availability. + + - HP_TIMING_INLINE: this macro is non-zero if the functionality is not + implemented using function calls but instead uses some inlined code + which might simply consist of a few assembler instructions. We have to + know this since we might want to use the macros here in places where we + cannot make function calls. + + - hp_timing_t: This is the type for variables used to store the time + values. + + - HP_TIMING_ZERO: clear `hp_timing_t' object. + + - HP_TIMING_NOW: place timestamp for current time in variable given as + parameter. + + - HP_TIMING_DIFF_INIT: do whatever is necessary to be able to use the + HP_TIMING_DIFF macro. + + - HP_TIMING_DIFF: compute difference between two times and store it + in a third. Source and destination might overlap. + + - HP_TIMING_ACCUM: add time difference to another variable. This might + be a bit more complicated to implement for some platforms as the + operation should be thread-safe and 64bit arithmetic on 32bit platforms + is not. + + - HP_TIMING_ACCUM_NT: this is the variant for situations where we know + there are no threads involved. + + - HP_TIMING_PRINT: write decimal representation of the timing value into + the given string. This operation need not be inline even though + HP_TIMING_INLINE is specified. + +*/ + +/* Provide dummy definitions. 
*/ +#define HP_TIMING_AVAIL (0) +#define HP_TIMING_INLINE (0) +typedef unsigned long long int hp_timing_t; +#define HP_TIMING_ZERO(Var) +#define HP_TIMING_NOW(var) +#define HP_TIMING_DIFF_INIT() +#define HP_TIMING_DIFF(Diff, Start, End) +#define HP_TIMING_ACCUM(Sum, Diff) +#define HP_TIMING_ACCUM_NT(Sum, Diff) +#define HP_TIMING_PRINT(Buf, Len, Val) + +/* Since this implementation is not available we tell the user about it. */ +#define HP_TIMING_NONAVAIL 1 + +#endif /* hp-timing.h */ diff -uNr libc.orig/sysdeps/powerpc/Versions libc/sysdeps/powerpc/Versions --- libc.orig/sysdeps/powerpc/Versions 2005-12-05 21:18:27.000000000 -0500 +++ libc/sysdeps/powerpc/Versions 2005-12-05 21:20:43.000000000 -0500 @@ -13,5 +13,8 @@ GLIBC_PRIVATE { __novmx__libc_longjmp; __novmx__libc_siglongjmp; __vmx__libc_longjmp; __vmx__libc_siglongjmp; + __vdso_get_tbfreq; + __vdso_clock_gettime; + __vdso_clock_getres; } } diff -uNr libc.orig/sysdeps/unix/sysv/linux/clock_getres.c libc/sysdeps/unix/sysv/linux/clock_getres.c --- libc.orig/sysdeps/unix/sysv/linux/clock_getres.c 2005-12-05 21:18:30.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/clock_getres.c 2005-12-07 13:54:09.000000000 -0500 @@ -24,9 +24,16 @@ #include "kernel-features.h" +#ifndef INTERNAL_VSYSCALL +#define INTERNAL_VSYSCALL INTERNAL_SYSCALL +#endif + +#ifndef INLINE_VSYSCALL +#define INLINE_VSYSCALL INLINE_SYSCALL +#endif #define SYSCALL_GETRES \ - retval = INLINE_SYSCALL (clock_getres, 2, clock_id, res); \ + retval = INLINE_VSYSCALL (clock_getres, 2, clock_id, res); \ break #ifdef __ASSUME_POSIX_TIMERS @@ -109,7 +116,7 @@ if (!__libc_missing_posix_cpu_timers) { INTERNAL_SYSCALL_DECL (err); - int r = INTERNAL_SYSCALL (clock_getres, err, 2, clock_id, res); + int r = INTERNAL_VSYSCALL (clock_getres, err, 2, clock_id, res); if (!INTERNAL_SYSCALL_ERROR_P (r, err)) return 0; @@ -128,7 +135,7 @@ { /* Check whether the kernel supports CPU clocks at all. If not, record it for the future. 
*/ - r = INTERNAL_SYSCALL (clock_getres, err, 2, + r = INTERNAL_VSYSCALL (clock_getres, err, 2, MAKE_PROCESS_CPUCLOCK (0, CPUCLOCK_SCHED), NULL); if (INTERNAL_SYSCALL_ERROR_P (r, err)) diff -uNr libc.orig/sysdeps/unix/sysv/linux/clock_gettime.c libc/sysdeps/unix/sysv/linux/clock_gettime.c --- libc.orig/sysdeps/unix/sysv/linux/clock_gettime.c 2005-12-05 21:18:30.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/clock_gettime.c 2005-12-07 13:52:13.000000000 -0500 @@ -23,9 +23,16 @@ #include "kernel-posix-cpu-timers.h" #include "kernel-features.h" +#ifndef INTERNAL_VSYSCALL +#define INTERNAL_VSYSCALL INTERNAL_SYSCALL +#endif + +#ifndef INLINE_VSYSCALL +#define INLINE_VSYSCALL INLINE_SYSCALL +#endif #define SYSCALL_GETTIME \ - retval = INLINE_SYSCALL (clock_gettime, 2, clock_id, tp); \ + retval = INLINE_VSYSCALL (clock_gettime, 2, clock_id, tp); \ break #ifdef __ASSUME_POSIX_TIMERS @@ -108,7 +115,7 @@ if (!__libc_missing_posix_cpu_timers) { INTERNAL_SYSCALL_DECL (err); - int r = INTERNAL_SYSCALL (clock_gettime, err, 2, clock_id, tp); + int r = INTERNAL_VSYSCALL (clock_gettime, err, 2, clock_id, tp); if (!INTERNAL_SYSCALL_ERROR_P (r, err)) return 0; @@ -127,7 +134,7 @@ { /* Check whether the kernel supports CPU clocks at all. If not, record it for the future. */ - r = INTERNAL_SYSCALL (clock_getres, err, 2, + r = INTERNAL_VSYSCALL (clock_getres, err, 2, MAKE_PROCESS_CPUCLOCK (0, CPUCLOCK_SCHED), NULL); if (INTERNAL_SYSCALL_ERROR_P (r, err)) diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h libc/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h --- libc.orig/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/bits/libc-vdso.h 2005-12-05 21:20:43.000000000 -0500 @@ -0,0 +1,36 @@ +/* Resolved function pointers to VDSO functions. + Copyright (C) 2005 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. */ + + +#ifndef _LIBC_VDSO_H +#define _LIBC_VDSO_H + +#ifdef SHARED + +extern void *__vdso_gettimeofday; + +extern void *__vdso_clock_gettime; + +extern void *__vdso_clock_getres; + +extern void *__vdso_get_tbfreq; + +#endif + +#endif /* _LIBC_VDSO_H */ diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/clock_getres.c libc/sysdeps/unix/sysv/linux/powerpc/clock_getres.c --- libc.orig/sysdeps/unix/sysv/linux/powerpc/clock_getres.c 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/clock_getres.c 2005-12-07 16:44:14.000000000 -0500 @@ -0,0 +1,25 @@ +/* clock_getres -- Get the resolution of a POSIX clockid_t. Linux version. + Copyright (C) 2005 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. */ + +#include + +#define INTERNAL_VSYSCALL INTERNAL_VDSOCALL_SIMPLE +#define INLINE_VSYSCALL INLINE_VDSOCALL + +#include diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/clock_gettime.c libc/sysdeps/unix/sysv/linux/powerpc/clock_gettime.c --- libc.orig/sysdeps/unix/sysv/linux/powerpc/clock_gettime.c 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/clock_gettime.c 2005-12-07 16:44:35.000000000 -0500 @@ -0,0 +1,25 @@ +/* clock_gettime -- Get current time from a POSIX clockid_t. Linux version. + Copyright (C) 2005 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. 
*/ + +#include + +#define INTERNAL_VSYSCALL INTERNAL_VDSOCALL_SIMPLE +#define INLINE_VSYSCALL INLINE_VDSOCALL + +#include diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/dl-vdso.c libc/sysdeps/unix/sysv/linux/powerpc/dl-vdso.c --- libc.orig/sysdeps/unix/sysv/linux/powerpc/dl-vdso.c 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/dl-vdso.c 2005-12-05 21:20:43.000000000 -0500 @@ -0,0 +1,59 @@ +/* ELF symbol resolve functions for VDSO objects. + Copyright (C) 2005 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. */ + +#include "config.h" +#include +#include + +void *internal_function +_dl_vdso_vsym (const char *name, const char *version) +{ + ElfW (Sym) wsym; + const ElfW (Sym) * ref = &wsym; + struct link_map *map = GLRO (dl_sysinfo_map); + void *value = NULL; + struct r_found_version vers; + lookup_t result; + + + if (map != NULL) + { + /* Use a WEAK REF so we don't error out if the symbol is not found. */ + memset (&wsym, 0, sizeof (ElfW (Sym))); + wsym.st_info = (unsigned char) ELFW (ST_INFO (STB_WEAK, STT_NOTYPE)); + /* Compute hash value to the version string. 
*/ + vers.name = version; + vers.hidden = 1; + vers.hash = _dl_elf_hash (version); + /* We don't have a specific file where the symbol can be found. */ + vers.filename = NULL; + + /* Search the scope of the vdso map. */ + result = GLRO (dl_lookup_symbol_x) (name, map, &ref, + map->l_local_scope, + &vers, 0, 0, NULL); + + if (ref != NULL) + { + value = DL_SYMBOL_ADDRESS (result, ref); + + } + } + return value; +} diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/dl-vdso.h libc/sysdeps/unix/sysv/linux/powerpc/dl-vdso.h --- libc.orig/sysdeps/unix/sysv/linux/powerpc/dl-vdso.h 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/dl-vdso.h 2005-12-05 21:20:43.000000000 -0500 @@ -0,0 +1,29 @@ +/* ELF symbol resolve functions for VDSO objects. + Copyright (C) 2005 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. */ + +#ifndef _DL_VDSO_H +#define _DL_VDSO_H + +/* Functions for resolving symbols in the VDSO link map. 
*/ + +extern void * +_dl_vdso_vsym (const char *name, const char *version) + internal_function attribute_hidden; + +#endif /* dl-vdso.h */ diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c libc/sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c --- libc.orig/sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c 2005-12-05 21:18:30.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/get_clockfreq.c 2005-12-05 21:20:43.000000000 -0500 @@ -22,14 +22,15 @@ #include #include #include - +#include +#include hp_timing_t __get_clockfreq (void) { /* We read the information from the /proc filesystem. /proc/cpuinfo contains at least one line like: - timebase : 33333333 + timebase : 33333333 We search for this line and convert the number into an integer. */ static hp_timing_t timebase_freq; hp_timing_t result = 0L; @@ -38,68 +39,76 @@ if (timebase_freq != 0) return timebase_freq; - int fd = open ("/proc/cpuinfo", O_RDONLY); - if (__builtin_expect (fd != -1, 1)) + /* if we can use the vDSO to obtain the timebase even better */ +#ifdef SHARED + timebase_freq = INLINE_VDSOCALL_SIMPLE (get_tbfreq, 0); + if (timebase_freq == 0) +#endif { - /* The timebase will be in the 1st 1024 bytes for systems with up - to 8 processors. If the first read returns less then 1024 - bytes read, we have the whole cpuinfo and can start the scan. - Otherwise we will have to read more to insure we have the - timebase value in the scan. */ - char buf[1024]; - ssize_t n; + int fd = open ("/proc/cpuinfo", O_RDONLY); - n = read (fd, buf, sizeof (buf)); - if (n == sizeof (buf)) + if (__builtin_expect (fd != -1, 1)) { - /* We are here because the 1st read returned exactly sizeof - (buf) bytes. This implies that we are not at EOF and may - not have read the timebase value yet. So we need to read - more bytes until we know we have EOF. We copy the lower - half of buf to the upper half and read sizeof (buf)/2 - bytes into the lower half of buf and repeat until we - reach EOF. 
We can assume that the timebase will be in - the last 512 bytes of cpuinfo, so two 512 byte half_bufs - will be sufficient to contain the timebase and will - handle the case where the timebase spans the half_buf - boundry. */ - const ssize_t half_buf = sizeof (buf) / 2; - while (n >= half_buf) + /* The timebase will be in the 1st 1024 bytes for systems with up + to 8 processors. If the first read returns less then 1024 + bytes read, we have the whole cpuinfo and can start the scan. + Otherwise we will have to read more to insure we have the + timebase value in the scan. */ + char buf[1024]; + ssize_t n; + + n = read (fd, buf, sizeof (buf)); + if (n == sizeof (buf)) { - memcpy (buf, buf + half_buf, half_buf); - n = read (fd, buf + half_buf, half_buf); + /* We are here because the 1st read returned exactly sizeof + (buf) bytes. This implies that we are not at EOF and may + not have read the timebase value yet. So we need to read + more bytes until we know we have EOF. We copy the lower + half of buf to the upper half and read sizeof (buf)/2 + bytes into the lower half of buf and repeat until we + reach EOF. We can assume that the timebase will be in + the last 512 bytes of cpuinfo, so two 512 byte half_bufs + will be sufficient to contain the timebase and will + handle the case where the timebase spans the half_buf + boundry. */ + const ssize_t half_buf = sizeof (buf) / 2; + while (n >= half_buf) + { + memcpy (buf, buf + half_buf, half_buf); + n = read (fd, buf + half_buf, half_buf); + } + if (n >= 0) + n += half_buf; } - if (n >= 0) - n += half_buf; - } - - if (__builtin_expect (n, 1) > 0) - { - char *mhz = memmem (buf, n, "timebase", 7); - if (__builtin_expect (mhz != NULL, 1)) + if (__builtin_expect (n, 1) > 0) { - char *endp = buf + n; + char *mhz = memmem (buf, n, "timebase", 7); - /* Search for the beginning of the string. 
*/ - while (mhz < endp && (*mhz < '0' || *mhz > '9') && *mhz != '\n') - ++mhz; - - while (mhz < endp && *mhz != '\n') + if (__builtin_expect (mhz != NULL, 1)) { - if (*mhz >= '0' && *mhz <= '9') + char *endp = buf + n; + + /* Search for the beginning of the string. */ + while (mhz < endp && (*mhz < '0' || *mhz > '9') + && *mhz != '\n') + ++mhz; + + while (mhz < endp && *mhz != '\n') { - result *= 10; - result += *mhz - '0'; - } + if (*mhz >= '0' && *mhz <= '9') + { + result *= 10; + result += *mhz - '0'; + } - ++mhz; + ++mhz; + } } + timebase_freq = result; } - timebase_freq = result; + close (fd); } - close (fd); } - return timebase_freq; } diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c libc/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c --- libc.orig/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/gettimeofday.c 2005-12-07 14:02:19.000000000 -0500 @@ -0,0 +1,41 @@ +/* Copyright (C) 2005 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. 
*/ + +#include +#include +#include +#include +#include +#include + +#undef __gettimeofday +#include + +/* Get the current time of day and timezone information, + putting it into *TV and *TZ. If TZ is NULL, *TZ is not filled. + Returns 0 on success, -1 on errors. */ + +int +__gettimeofday (tv, tz) + struct timeval *tv; + struct timezone *tz; +{ + return INLINE_VDSOCALL (gettimeofday, 2, CHECK_1 (tv), CHECK_1 (tz)); +} + +INTDEF (__gettimeofday) weak_alias (__gettimeofday, gettimeofday) diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/libc-start.c libc/sysdeps/unix/sysv/linux/powerpc/libc-start.c --- libc.orig/sysdeps/unix/sysv/linux/powerpc/libc-start.c 1969-12-31 19:00:00.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/libc-start.c 2005-12-05 21:20:43.000000000 -0500 @@ -0,0 +1,130 @@ +/* Copyright (C) 1998,2000,2001,2002,2003,2004,2005 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA + 02111-1307 USA. */ + +#include +#include +#include +#include +#include + +extern int __cache_line_size; +weak_extern (__cache_line_size) +/* The main work is done in the generic function. 
*/ +#define LIBC_START_MAIN generic_start_main +#define LIBC_START_DISABLE_INLINE +#define LIBC_START_MAIN_AUXVEC_ARG +#define MAIN_AUXVEC_ARG +#define INIT_MAIN_ARGS +#include + +struct startup_info + { + void *__unbounded sda_base; + int (*main) (int, char **, char **, void *); + int (*init) (int, char **, char **, void *); + void (*fini) (void); + }; + + +#ifdef SHARED +#include +#include +#undef __gettimeofday +#undef __clock_gettime +#undef __clock_getres +#include + +void *__vdso_gettimeofday; +void *__vdso_clock_gettime; +void *__vdso_clock_getres; +void *__vdso_get_tbfreq; + +static inline void _libc_vdso_platform_setup (void) + { + __vdso_gettimeofday = _dl_vdso_vsym ("__kernel_gettimeofday", + "LINUX_2.6.15"); + + __vdso_clock_gettime = _dl_vdso_vsym ("__kernel_clock_gettime", + "LINUX_2.6.15"); + + __vdso_clock_getres = _dl_vdso_vsym ("__kernel_clock_getres", + "LINUX_2.6.15"); + + __vdso_get_tbfreq = _dl_vdso_vsym ("__kernel_vdso_get_tbfreq", + "LINUX_2.6.15"); + } +#endif + +int +/* GKM FIXME: GCC: this should get __BP_ prefix by virtue of the + BPs in the arglist of startup_info.main and startup_info.init. */ + BP_SYM (__libc_start_main) (int argc, char *__unbounded * __unbounded ubp_av, + char *__unbounded * __unbounded ubp_ev, + ElfW (auxv_t) * __unbounded auxvec, + void (*rtld_fini) (void), + struct startup_info * __unbounded stinfo, + char *__unbounded * __unbounded stack_on_entry) +{ +#if __BOUNDED_POINTERS__ + char **argv; +#else +# define argv ubp_av +#endif + + /* the PPC SVR4 ABI says that the top thing on the stack will + be a NULL pointer, so if not we assume that we're being called + as a statically-linked program by Linux... */ + if (*stack_on_entry != NULL) + { + char *__unbounded * __unbounded temp; + /* ...in which case, we have argc as the top thing on the + stack, followed by argv (NULL-terminated), envp (likewise), + and the auxilary vector. 
*/ + /* 32/64-bit agnostic load from stack */ + argc = *(long int *__unbounded) stack_on_entry; + ubp_av = stack_on_entry + 1; + ubp_ev = ubp_av + argc + 1; +#ifdef HAVE_AUX_VECTOR + temp = ubp_ev; + while (*temp != NULL) + ++temp; + auxvec = (ElfW (auxv_t) *)++ temp; +#endif + rtld_fini = NULL; + } + + /* Initialize the __cache_line_size variable from the aux vector. */ + for (ElfW (auxv_t) * av = auxvec; av->a_type != AT_NULL; ++av) + switch (av->a_type) + { + case AT_DCACHEBSIZE: + { + int *cls = &__cache_line_size; + if (cls != NULL) + *cls = av->a_un.a_val; + } + break; + } +#ifdef SHARED + /* Resolve and initialize function pointers for VDSO functions. */ + _libc_vdso_platform_setup (); +#endif + return generic_start_main (stinfo->main, argc, ubp_av, auxvec, + stinfo->init, stinfo->fini, rtld_fini, + stack_on_entry); +} diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/Makefile libc/sysdeps/unix/sysv/linux/powerpc/Makefile --- libc.orig/sysdeps/unix/sysv/linux/powerpc/Makefile 2005-12-05 21:18:30.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/Makefile 2005-12-05 21:20:43.000000000 -0500 @@ -2,3 +2,8 @@ ifeq ($(subdir),rt) librt-routines += rt-sysdep endif + +ifeq ($(subdir),misc) +routines += dl-vdso +endif + diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h libc/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h --- libc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h 2005-12-05 21:18:30.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h 2005-12-08 16:00:17.300339776 -0500 @@ -54,6 +54,139 @@ # include +# undef INLINE_VDSOCALL +#ifdef SHARED +# define INLINE_VDSOCALL(name, nr, args...) 
\ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret = 0; \ + \ + if ( __vdso_ ## name !=NULL) \ + sc_ret = INTERNAL_VDSOCALL (__vdso_ ## name, sc_err, nr, args); \ + if (( __vdso_ ## name == NULL ) || (sc_ret == ENOSYS)) \ + sc_ret = INTERNAL_SYSCALL (name, sc_err, nr, args); \ + if (INTERNAL_SYSCALL_ERROR_P (sc_ret, sc_err)) \ + { \ + __set_errno (INTERNAL_SYSCALL_ERRNO (sc_ret, sc_err)); \ + sc_ret = -1L; \ + } \ + sc_ret; \ + }) +#else +# define INLINE_VDSOCALL(name, nr, args...) \ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret; \ + \ + sc_ret = INTERNAL_SYSCALL (name, sc_err, nr, args); \ + if (INTERNAL_SYSCALL_ERROR_P (sc_ret, sc_err)) \ + { \ + __set_errno (INTERNAL_SYSCALL_ERRNO (sc_ret, sc_err)); \ + sc_ret = -1L; \ + } \ + sc_ret; \ + }) +#endif + +# undef INTERNAL_VDSOCALL_SIMPLE +#ifdef SHARED +# define INTERNAL_VDSOCALL_SIMPLE(name, err, nr, args...) \ + ({ \ + long int v_ret = 0; \ + \ + if ( __vdso_ ## name !=NULL) \ + v_ret = INTERNAL_VDSOCALL (__vdso_ ## name, err, nr, args); \ + if (( __vdso_ ## name == NULL ) || (v_ret == ENOSYS)) \ + v_ret = INTERNAL_SYSCALL (name, err, nr, args); \ + v_ret; \ + }) +#else +# define INTERNAL_VDSOCALL_SIMPLE(name, err, nr, args...) \ + ({ \ + long int v_ret; \ + \ + v_ret = INTERNAL_SYSCALL (name, err, nr, args); \ + v_ret; \ + }) +#endif + +# undef INLINE_VDSOCALL_NO_SYSCALL_FALLBACK +# define INLINE_VDSOCALL_NO_SYSCALL_FALLBACK(name, nr, args...) \ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret=0; \ + \ + if (__vdso_ ## name !=NULL) \ + { \ + sc_ret = INTERNAL_VDSOCALL (__vdso_ ## name, sc_err, nr, args); \ + } \ + else \ + { \ + sc_ret = ENOSYS; \ + } \ + if (INTERNAL_SYSCALL_ERROR_P (sc_ret, sc_err)) \ + { \ + __set_errno (INTERNAL_SYSCALL_ERRNO (sc_ret, sc_err)); \ + sc_ret = -1L; \ + } \ + sc_ret; \ + }) + +# undef INLINE_VDSOCALL_SIMPLE +# define INLINE_VDSOCALL_SIMPLE(name, nr, args...) 
\ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret=0; \ + \ + if (__vdso_ ## name !=NULL) \ + { \ + sc_ret = INTERNAL_VDSOCALL (__vdso_ ## name, sc_err, nr, args); \ + } \ + else \ + { \ + sc_ret = ENOSYS; \ + } \ + sc_ret; \ + }) + +/* Define a macro which expands inline into the wrapper code for a VDSO + call. This use is for internal calls that do not need to handle errors + normally. It will never touch errno. + On powerpc a system call basically clobbers the same registers like a + function call, with the exception of LR (which is needed for the + "sc; bnslr+" sequence) and CR (where only CR0.SO is clobbered to signal + an error return status). */ + +# undef INTERNAL_VDSOCALL +# define INTERNAL_VDSOCALL_NCS(funcptr, err, nr, args...) \ + ({ \ + register void *r0 __asm__ ("r0"); \ + register long int r3 __asm__ ("r3"); \ + register long int r4 __asm__ ("r4"); \ + register long int r5 __asm__ ("r5"); \ + register long int r6 __asm__ ("r6"); \ + register long int r7 __asm__ ("r7"); \ + register long int r8 __asm__ ("r8"); \ + register long int r9 __asm__ ("r9"); \ + register long int r10 __asm__ ("r10"); \ + register long int r11 __asm__ ("r11"); \ + register long int r12 __asm__ ("r12"); \ + LOADARGS_##nr(funcptr, args); \ + __asm__ __volatile__ \ + ("mtctr %0\n\t" \ + "bctrl\n\t" \ + "mfcr %0" \ + : "=&r" (r0), \ + "=&r" (r3), "=&r" (r4), "=&r" (r5), "=&r" (r6), "=&r" (r7), \ + "=&r" (r8), "=&r" (r9), "=&r" (r10), "=&r" (r11), "=&r" (r12) \ + : ASM_INPUT_##nr \ + : "cr0", "ctr", "lr", "memory"); \ + err = (long int)r0; \ + (int) r3; \ + }) +# define INTERNAL_VDSOCALL(name, err, nr, args...) \ + INTERNAL_VDSOCALL_NCS (name, err, nr, ##args) + # undef INLINE_SYSCALL # define INLINE_SYSCALL(name, nr, args...) 
\ ({ \ diff -uNr libc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h libc/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h --- libc.orig/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h 2005-12-05 21:18:30.000000000 -0500 +++ libc/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h 2005-12-08 16:00:42.708338736 -0500 @@ -66,7 +66,144 @@ #define ASM_TYPE_DIRECTIVE(name,typearg) .type name,typearg; #define ASM_SIZE_DIRECTIVE(name) .size name,.-name -#endif /* __ASSEMBLER__ */ +#endif /* __ASSEMBLER__ */ + +/* This version is for kernels that implement system calls that + behave like function calls as far as register saving. + It falls back to the syscall in the case that the vDSO doesn't + exist or fails for ENOSYS */ + +# undef INLINE_VDSOCALL +#ifdef SHARED +# define INLINE_VDSOCALL(name, nr, args...) \ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret = 0; \ + \ + if ( __vdso_ ## name !=NULL) \ + sc_ret = INTERNAL_VDSOCALL (__vdso_ ## name, sc_err, nr, args); \ + if (( __vdso_ ## name == NULL ) || (sc_ret == ENOSYS)) \ + sc_ret = INTERNAL_SYSCALL (name, sc_err, nr, args); \ + if (INTERNAL_SYSCALL_ERROR_P (sc_ret, sc_err)) \ + { \ + __set_errno (INTERNAL_SYSCALL_ERRNO (sc_ret, sc_err)); \ + sc_ret = -1L; \ + } \ + sc_ret; \ + }) +#else +# define INLINE_VDSOCALL(name, nr, args...) \ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret; \ + \ + sc_ret = INTERNAL_SYSCALL (name, sc_err, nr, args); \ + if (INTERNAL_SYSCALL_ERROR_P (sc_ret, sc_err)) \ + { \ + __set_errno (INTERNAL_SYSCALL_ERRNO (sc_ret, sc_err)); \ + sc_ret = -1L; \ + } \ + sc_ret; \ + }) +#endif + +# undef INTERNAL_VDSOCALL_SIMPLE +#ifdef SHARED +# define INTERNAL_VDSOCALL_SIMPLE(name, err, nr, args...) 
\ + ({ \ + long int v_ret = 0; \ + \ + if ( __vdso_ ## name !=NULL) \ + v_ret = INTERNAL_VDSOCALL (__vdso_ ## name, err, nr, args); \ + if (( __vdso_ ## name == NULL ) || (v_ret == ENOSYS)) \ + v_ret = INTERNAL_SYSCALL (name, err, nr, args); \ + v_ret; \ + }) +#else +# define INTERNAL_VDSOCALL_SIMPLE(name, err, nr, args...) \ + ({ \ + long int v_ret; \ + \ + v_ret = INTERNAL_SYSCALL (name, err, nr, args); \ + v_ret; \ + }) +#endif + + +/* This version does not fail back to a syscall as the previous + version does */ +# undef INLINE_VDSOCALL_NO_SYSCALL_FALLBACK +# define INLINE_VDSOCALL_NO_SYSCALL_FALLBACK(name, nr, args...) \ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret = 0; \ + if (__vdso_ ## name !=NULL) \ + { \ + sc_ret = INTERNAL_VDSOCALL (__vdso_ ## name, sc_err, nr, args); \ + } \ + else \ + { \ + sc_ret = ENOSYS; \ + } \ + if (INTERNAL_SYSCALL_ERROR_P (sc_ret, sc_err)) \ + { \ + __set_errno (INTERNAL_SYSCALL_ERRNO (sc_ret, sc_err)); \ + sc_ret = -1L; \ + } \ + sc_ret; \ + }) + +/* This version is for internal uses when there is no desire + to set errno */ +# undef INLINE_VDSOCALL_SIMPLE +# define INLINE_VDSOCALL_SIMPLE(name, nr, args...) \ + ({ \ + INTERNAL_SYSCALL_DECL (sc_err); \ + long int sc_ret=0; \ + \ + if ( __vdso_ ## name !=NULL) \ + { \ + sc_ret = INTERNAL_VDSOCALL (__vdso_ ## name, sc_err, nr, args); \ + } \ + else \ + { \ + sc_ret = ENOSYS; \ + } \ + sc_ret; \ + }) + +/* Define a macro which expands inline into the wrapper code for a system + call. This use is for internal calls that do not need to handle errors + normally. It will never touch errno. This returns just what the kernel + gave back in the non-error (CR0.SO cleared) case, otherwise (CR0.SO set) + the negation of the return value in the kernel gets reverted. */ + +#define INTERNAL_VDSOCALL_NCS(funcptr, err, nr, args...) 
\
+  ({ \
+    register void *r0 __asm__ ("r0"); \
+    register long int r3 __asm__ ("r3"); \
+    register long int r4 __asm__ ("r4"); \
+    register long int r5 __asm__ ("r5"); \
+    register long int r6 __asm__ ("r6"); \
+    register long int r7 __asm__ ("r7"); \
+    register long int r8 __asm__ ("r8"); \
+    LOADARGS_##nr(funcptr, args); \
+    __asm__ __volatile__ \
+      ("mtctr %0\n\t" \
+       "bctrl\n\t" \
+       "mfcr %0\n\t" \
+       "0:" \
+       : "=&r" (r0), \
+         "=&r" (r3), "=&r" (r4), "=&r" (r5), \
+         "=&r" (r6), "=&r" (r7), "=&r" (r8) \
+       : ASM_INPUT_##nr \
+       : "r9", "r10", "r11", "r12", \
+         "cr0", "ctr", "lr", "memory"); \
+    err = (long int)r0; \
+    (int) r3; \
+  })
+#define INTERNAL_VDSOCALL(name, err, nr, args...) \
+  INTERNAL_VDSOCALL_NCS (name, err, nr, ##args)
 #undef INLINE_SYSCALL

From david at gibson.dropbear.id.au Fri Dec 9 13:31:55 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Fri, 9 Dec 2005 13:31:55 +1100
Subject: powerpc: Fix SLB flushing path in hugepage
Message-ID: <20051209023155.GA6517@localhost.localdomain>

Andrew, Paulus, please apply and forward upstream. This is a
potentially serious bug which should be fixed before 2.6.15.

On ppc64, when opening a new hugepage region, we need to make sure any
old normal-page SLBs for the area are flushed on all CPUs. There was
a bug in this logic - after putting the new hugepage area masks into
the thread structure, we copied it into the paca (read by the SLB miss
handler) only on one CPU, not on all. This could cause incorrect SLB
entries to be loaded when a multithreaded program was running
simultaneously on several CPUs.

This patch corrects the error, copying the context information into
the PACA on all CPUs using the mm in question before flushing any
existing SLB entries.

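The fix described above can be modeled in plain C. This is a minimal user-space sketch, not the kernel code: struct mm_ctx, struct cpu_state, flush_segments(), open_areas() and stale_cpus() are hypothetical stand-ins (for mm->context, the paca, the per-CPU flush callback and on_each_cpu() respectively), and an ordinary loop replaces the real cross-CPU call.

```c
#include <stddef.h>

#define NCPUS 4

/* Hypothetical stand-in for mm->context (only the area mask matters here). */
struct mm_ctx {
    unsigned short htlb_areas;
};

/* Hypothetical stand-in for the paca: a per-CPU copy of the context. */
struct cpu_state {
    struct mm_ctx *current_mm;    /* mm this CPU is running, or NULL */
    unsigned short cached_areas;  /* copy read by the (modeled) miss handler */
};

/* Argument passed to every CPU, mirroring struct slb_flush_info. */
struct flush_info {
    struct mm_ctx *mm;
    unsigned short newareas;
};

static struct cpu_state cpus[NCPUS];

/* Per-CPU callback: refresh the local copy only if this CPU runs the mm. */
static void flush_segments(struct cpu_state *cpu, const struct flush_info *fi)
{
    if (cpu->current_mm != fi->mm)
        return;
    cpu->cached_areas = cpu->current_mm->htlb_areas;
    /* The real callback would now invalidate SLB entries for fi->newareas. */
}

/* Open new areas, then run the callback on every CPU (assumed broadcast). */
static void open_areas(struct mm_ctx *mm, unsigned short newareas)
{
    struct flush_info fi = { mm, newareas };
    int i;

    mm->htlb_areas |= newareas;
    for (i = 0; i < NCPUS; i++)
        flush_segments(&cpus[i], &fi);
}

/* Count CPUs running mm whose cached copy no longer matches the mm. */
static int stale_cpus(const struct mm_ctx *mm)
{
    int i, n = 0;

    for (i = 0; i < NCPUS; i++)
        if (cpus[i].current_mm == mm &&
            cpus[i].cached_areas != mm->htlb_areas)
            n++;
    return n;
}
```

In this model the pre-patch behaviour corresponds to updating cached_areas for the calling CPU only, which would leave stale_cpus() non-zero whenever a second CPU runs the same mm.
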
Signed-off-by: David Gibson

Index: working-2.6/arch/powerpc/mm/hugetlbpage.c
===================================================================
--- working-2.6.orig/arch/powerpc/mm/hugetlbpage.c	2005-12-09 13:13:26.000000000 +1100
+++ working-2.6/arch/powerpc/mm/hugetlbpage.c	2005-12-09 13:15:56.000000000 +1100
@@ -133,43 +133,64 @@ pte_t huge_ptep_get_and_clear(struct mm_
 	return __pte(old);
 }
+struct slb_flush_info {
+	struct mm_struct *mm;
+	u16 newareas;
+};
+
 static void flush_low_segments(void *parm)
 {
-	u16 areas = (unsigned long) parm;
+	struct slb_flush_info *fi = parm;
 	unsigned long i;
-	asm volatile("isync" : : : "memory");
+	BUILD_BUG_ON((sizeof(fi->newareas)*8) != NUM_LOW_AREAS);
+
+	if (current->mm != fi->mm)
+		return;
+
+
+	/* Only need to do anything if this CPU is working in the same
+	 * mm as the one which has changed */
-	BUILD_BUG_ON((sizeof(areas)*8) != NUM_LOW_AREAS);
+	/* update the paca copy of the context struct */
+	get_paca()->context = current->mm->context;
+	asm volatile("isync" : : : "memory");
 	for (i = 0; i < NUM_LOW_AREAS; i++) {
-		if (! (areas & (1U << i)))
+		if (! (fi->newareas & (1U << i)))
 			continue;
 		asm volatile("slbie %0"
 			     : : "r" ((i << SID_SHIFT) | SLBIE_C));
 	}
-
 	asm volatile("isync" : : : "memory");
 }
 static void flush_high_segments(void *parm)
 {
-	u16 areas = (unsigned long) parm;
+	struct slb_flush_info *fi = parm;
 	unsigned long i, j;
-	asm volatile("isync" : : : "memory");
-	BUILD_BUG_ON((sizeof(areas)*8) != NUM_HIGH_AREAS);
+	BUILD_BUG_ON((sizeof(fi->newareas)*8) != NUM_HIGH_AREAS);
+
+	if (current->mm != fi->mm)
+		return;
+	/* Only need to do anything if this CPU is working in the same
+	 * mm as the one which has changed */
+
+	/* update the paca copy of the context struct */
+	get_paca()->context = current->mm->context;
+
+	asm volatile("isync" : : : "memory");
 	for (i = 0; i < NUM_HIGH_AREAS; i++) {
-		if (! (areas & (1U << i)))
+		if (! (fi->newareas & (1U << i)))
 			continue;
 		for (j = 0; j < (1UL << (HTLB_AREA_SHIFT-SID_SHIFT)); j++)
 			asm volatile("slbie %0"
 				     :: "r" (((i << HTLB_AREA_SHIFT)
-					 + (j << SID_SHIFT)) | SLBIE_C));
+					      + (j << SID_SHIFT)) | SLBIE_C));
 	}
-
 	asm volatile("isync" : : : "memory");
 }
@@ -214,6 +235,7 @@ static int prepare_high_area_for_htlb(st
 static int open_low_hpage_areas(struct mm_struct *mm, u16 newareas)
 {
 	unsigned long i;
+	struct slb_flush_info fi;
 	BUILD_BUG_ON((sizeof(newareas)*8) != NUM_LOW_AREAS);
 	BUILD_BUG_ON((sizeof(mm->context.low_htlb_areas)*8) != NUM_LOW_AREAS);
@@ -229,19 +251,20 @@ static int open_low_hpage_areas(struct m
 	mm->context.low_htlb_areas |= newareas;
-	/* update the paca copy of the context struct */
-	get_paca()->context = mm->context;
-
 	/* the context change must make it to memory before the flush,
 	 * so that further SLB misses do the right thing. */
 	mb();
-	on_each_cpu(flush_low_segments, (void *)(unsigned long)newareas, 0, 1);
+
+	fi.mm = mm;
+	fi.newareas = newareas;
+	on_each_cpu(flush_low_segments, &fi, 0, 1);
 	return 0;
 }
 static int open_high_hpage_areas(struct mm_struct *mm, u16 newareas)
 {
+	struct slb_flush_info fi;
 	unsigned long i;
 	BUILD_BUG_ON((sizeof(newareas)*8) != NUM_HIGH_AREAS);
@@ -265,7 +288,10 @@ static int open_high_hpage_areas(struct
 	/* the context change must make it to memory before the flush,
 	 * so that further SLB misses do the right thing. */
 	mb();
-	on_each_cpu(flush_high_segments, (void *)(unsigned long)newareas, 0, 1);
+
+	fi.mm = mm;
+	fi.newareas = newareas;
+	on_each_cpu(flush_high_segments, &fi, 0, 1);
 	return 0;
 }

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you. NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Fri Dec 9 14:20:52 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 9 Dec 2005 14:20:52 +1100 Subject: powerpc: Add missing icache flushes for hugepages Message-ID: <20051209032051.GA11744@localhost.localdomain> Andrew, Paulus, please apply and forward upstream. This is a real bug (though a rarely triggered one) and should be fixed before 2.6.15. On most powerpc CPUs, the dcache and icache are not coherent so between writing and executing a page, the caches must be flushed. Userspace programs assume pages given to them by the kernel are icache clean, so we must do this flush between the kernel clearing a page and it being mapped into userspace for execute. We were not doing this for hugepages, this patch corrects the situation. We use the same lazy mechanism as we use for normal pages, delaying the flush until userspace actually attempts to execute from the page in question. Tested on G5 Signed-off-by: David Gibson Index: working-2.6/arch/powerpc/mm/hash_utils_64.c =================================================================== --- working-2.6.orig/arch/powerpc/mm/hash_utils_64.c 2005-12-09 14:02:52.000000000 +1100 +++ working-2.6/arch/powerpc/mm/hash_utils_64.c 2005-12-09 14:03:09.000000000 +1100 @@ -601,7 +601,7 @@ int hash_page(unsigned long ea, unsigned /* Handle hugepage regions */ if (unlikely(in_hugepage_area(mm->context, ea))) { DBG_LOW(" -> huge page !\n"); - return hash_huge_page(mm, access, ea, vsid, local); + return hash_huge_page(mm, access, ea, vsid, local, trap); } /* Get PTE and page size from page tables */ Index: working-2.6/arch/powerpc/mm/hugetlbpage.c =================================================================== --- working-2.6.orig/arch/powerpc/mm/hugetlbpage.c 2005-12-09 14:03:09.000000000 +1100 +++ working-2.6/arch/powerpc/mm/hugetlbpage.c 2005-12-09 14:03:09.000000000 +1100 @@ -685,8 +685,36 @@ unsigned long hugetlb_get_unmapped_area( return 
-ENOMEM; } +/* + * Called by asm hashtable.S for doing lazy icache flush + */ +static unsigned int hash_huge_page_do_lazy_icache(unsigned long rflags, + pte_t pte, int trap) +{ + struct page *page; + int i; + + if (!pfn_valid(pte_pfn(pte))) + return rflags; + + page = pte_page(pte); + + /* page is dirty */ + if (!test_bit(PG_arch_1, &page->flags) && !PageReserved(page)) { + if (trap == 0x400) { + for (i = 0; i < (HPAGE_SIZE / PAGE_SIZE); i++) + __flush_dcache_icache(page_address(page+i)); + set_bit(PG_arch_1, &page->flags); + } else { + rflags |= HPTE_R_N; + } + } + return rflags; +} + int hash_huge_page(struct mm_struct *mm, unsigned long access, - unsigned long ea, unsigned long vsid, int local) + unsigned long ea, unsigned long vsid, int local, + unsigned long trap) { pte_t *ptep; unsigned long old_pte, new_pte; @@ -737,6 +765,11 @@ int hash_huge_page(struct mm_struct *mm, rflags = 0x2 | (!(new_pte & _PAGE_RW)); /* _PAGE_EXEC -> HW_NO_EXEC since it's inverted */ rflags |= ((new_pte & _PAGE_EXEC) ? 
0 : HPTE_R_N); + if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) + /* No CPU has hugepages but lacks no execute, so we + * don't need to worry about that case */ + rflags = hash_huge_page_do_lazy_icache(rflags, __pte(old_pte), + trap); /* Check if pte already has an hpte (case 2) */ if (unlikely(old_pte & _PAGE_HASHPTE)) { Index: working-2.6/include/asm-powerpc/mmu.h =================================================================== --- working-2.6.orig/include/asm-powerpc/mmu.h 2005-12-09 14:02:53.000000000 +1100 +++ working-2.6/include/asm-powerpc/mmu.h 2005-12-09 14:03:09.000000000 +1100 @@ -221,7 +221,8 @@ extern int __hash_page_64K(unsigned long unsigned int local); struct mm_struct; extern int hash_huge_page(struct mm_struct *mm, unsigned long access, - unsigned long ea, unsigned long vsid, int local); + unsigned long ea, unsigned long vsid, int local, + unsigned long trap); extern void htab_finish_init(void); extern int htab_bolt_mapping(unsigned long vstart, unsigned long vend, -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From aswathavijay at gmail.com Fri Dec 9 16:27:00 2005 From: aswathavijay at gmail.com (Vijayakumar Ramalingam) Date: Fri, 9 Dec 2005 10:57:00 +0530 Subject: Can we able to boot a 32-bit PPC440EP target with 64-bit kernel? Message-ID: <5f87992f0512082127u1466d609rd2ed74ce8bdea843@mail.gmail.com> Hi all, Is it possible to boot a 32-bit PPC440EP board which has 64-bit support with 64-bit Linux kernel 2.6.13-2. If so how can be done. Please give me some points over it. Thanks in Advance Regards Vijay -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/6e4aade8/attachment.htm From david at gibson.dropbear.id.au Fri Dec 9 16:45:17 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 9 Dec 2005 16:45:17 +1100 Subject: powerpc: Fix SLB flushing path in hugepage In-Reply-To: <20051209023155.GA6517@localhost.localdomain> References: <20051209023155.GA6517@localhost.localdomain> Message-ID: <20051209054517.GE11744@localhost.localdomain> On Fri, Dec 09, 2005 at 01:31:55PM +1100, David Gibson wrote: > Andrew, Paulus, please apply and forward upstream. This is a > potentially serious bug which should be fixed before 2.6.15. Bother, two problems with that patch. First I was working on top of some other patches, so it has some fuzz when applied to mainline. Second, I was using current->mm when I should have been using current->active_mm. Please apply the corrected version below instead. powerpc: Fix SLB flushing path in hugepage On ppc64, when opening a new hugepage region, we need to make sure any old normal-page SLBs for the area are flushed on all CPUs. There was a bug in this logic - after putting the new hugepage area masks into the thread structure, we copied it into the paca (read by the SLB miss handler) only on one CPU, not on all. This could cause incorrect SLB entries to be loaded when a multithreaded program was running simultaneously on several CPUs. This patch corrects the error, copying the context information into the PACA on all CPUs using the mm in question before flushing any existing SLB entries. 
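The fix described above can be sketched in plain userspace C, as an illustration only. Every name below (`fake_mm`, `fake_cpu`, `fake_on_each_cpu`, `flush_cb`, `open_areas`) is a hypothetical stand-in, not a kernel interface: the caller packs the mm and the new area mask into one structure, and each CPU compares its own active mm against it before refreshing its private copy of the context and flushing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCPUS 4

/* Stand-in for the per-mm context holding the hugepage area mask. */
struct fake_mm {
	uint16_t low_htlb_areas;
};

/* Stand-in for per-CPU state: the mm the CPU is running, its private
 * copy of the area mask (the role the paca plays in the description),
 * and a counter of simulated segment flushes. */
struct fake_cpu {
	struct fake_mm *active_mm;
	uint16_t paca_areas;
	unsigned int flushes;
};

/* Argument handed to every CPU, mirroring the mm + mask pair. */
struct flush_info {
	struct fake_mm *mm;
	uint16_t newareas;
};

static struct fake_cpu cpus[NCPUS];

/* Runs once per CPU: only CPUs working in the changed mm refresh
 * their private copy and flush the newly opened areas. */
static void flush_cb(struct fake_cpu *cpu, void *parm)
{
	struct flush_info *fi = parm;
	unsigned int i;

	if (cpu->active_mm != fi->mm)
		return;

	cpu->paca_areas = cpu->active_mm->low_htlb_areas;

	for (i = 0; i < sizeof(fi->newareas) * 8; i++)
		if (fi->newareas & (1U << i))
			cpu->flushes++;	/* one simulated flush per area */
}

/* Hypothetical replacement for a cross-CPU call: visit every CPU. */
static void fake_on_each_cpu(void (*fn)(struct fake_cpu *, void *), void *parm)
{
	for (int i = 0; i < NCPUS; i++)
		fn(&cpus[i], parm);
}

/* Update the shared mask first, then tell every CPU about it. */
static void open_areas(struct fake_mm *mm, uint16_t newareas)
{
	struct flush_info fi;

	assert(mm != NULL);
	mm->low_htlb_areas |= newareas;

	fi.mm = mm;
	fi.newareas = newareas;
	fake_on_each_cpu(flush_cb, &fi);
}
```

In this sketch a CPU running a different mm is left untouched, while every CPU running the changed mm picks up the new mask, which is the behaviour the description says was missing when only the calling CPU updated its copy.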
Signed-off-by: David Gibson Index: working-2.6/arch/powerpc/mm/hugetlbpage.c =================================================================== --- working-2.6.orig/arch/powerpc/mm/hugetlbpage.c 2005-12-09 16:16:36.000000000 +1100 +++ working-2.6/arch/powerpc/mm/hugetlbpage.c 2005-12-09 16:44:51.000000000 +1100 @@ -148,43 +148,63 @@ int is_aligned_hugepage_range(unsigned l return 0; } +struct slb_flush_info { + struct mm_struct *mm; + u16 newareas; +}; + static void flush_low_segments(void *parm) { - u16 areas = (unsigned long) parm; + struct slb_flush_info *fi = parm; unsigned long i; - asm volatile("isync" : : : "memory"); + BUILD_BUG_ON((sizeof(fi->newareas)*8) != NUM_LOW_AREAS); + + if (current->active_mm != fi->mm) + return; - BUILD_BUG_ON((sizeof(areas)*8) != NUM_LOW_AREAS); + /* Only need to do anything if this CPU is working in the same + * mm as the one which has changed */ + + /* update the paca copy of the context struct */ + get_paca()->context = current->active_mm->context; + asm volatile("isync" : : : "memory"); for (i = 0; i < NUM_LOW_AREAS; i++) { - if (! (areas & (1U << i))) + if (! (fi->newareas & (1U << i))) continue; asm volatile("slbie %0" : : "r" ((i << SID_SHIFT) | SLBIE_C)); } - asm volatile("isync" : : : "memory"); } static void flush_high_segments(void *parm) { - u16 areas = (unsigned long) parm; + struct slb_flush_info *fi = parm; unsigned long i, j; - asm volatile("isync" : : : "memory"); - BUILD_BUG_ON((sizeof(areas)*8) != NUM_HIGH_AREAS); + BUILD_BUG_ON((sizeof(fi->newareas)*8) != NUM_HIGH_AREAS); + + if (current->active_mm != fi->mm) + return; + + /* Only need to do anything if this CPU is working in the same + * mm as the one which has changed */ + /* update the paca copy of the context struct */ + get_paca()->context = current->active_mm->context; + + asm volatile("isync" : : : "memory"); for (i = 0; i < NUM_HIGH_AREAS; i++) { - if (! (areas & (1U << i))) + if (! 
(fi->newareas & (1U << i))) continue; for (j = 0; j < (1UL << (HTLB_AREA_SHIFT-SID_SHIFT)); j++) asm volatile("slbie %0" :: "r" (((i << HTLB_AREA_SHIFT) - + (j << SID_SHIFT)) | SLBIE_C)); + + (j << SID_SHIFT)) | SLBIE_C)); } - asm volatile("isync" : : : "memory"); } @@ -229,6 +249,7 @@ static int prepare_high_area_for_htlb(st static int open_low_hpage_areas(struct mm_struct *mm, u16 newareas) { unsigned long i; + struct slb_flush_info fi; BUILD_BUG_ON((sizeof(newareas)*8) != NUM_LOW_AREAS); BUILD_BUG_ON((sizeof(mm->context.low_htlb_areas)*8) != NUM_LOW_AREAS); @@ -244,19 +265,20 @@ static int open_low_hpage_areas(struct m mm->context.low_htlb_areas |= newareas; - /* update the paca copy of the context struct */ - get_paca()->context = mm->context; - /* the context change must make it to memory before the flush, * so that further SLB misses do the right thing. */ mb(); - on_each_cpu(flush_low_segments, (void *)(unsigned long)newareas, 0, 1); + + fi.mm = mm; + fi.newareas = newareas; + on_each_cpu(flush_low_segments, &fi, 0, 1); return 0; } static int open_high_hpage_areas(struct mm_struct *mm, u16 newareas) { + struct slb_flush_info fi; unsigned long i; BUILD_BUG_ON((sizeof(newareas)*8) != NUM_HIGH_AREAS); @@ -280,7 +302,10 @@ static int open_high_hpage_areas(struct /* the context change must make it to memory before the flush, * so that further SLB misses do the right thing. */ mb(); - on_each_cpu(flush_high_segments, (void *)(unsigned long)newareas, 0, 1); + + fi.mm = mm; + fi.newareas = newareas; + on_each_cpu(flush_high_segments, &fi, 0, 1); return 0; } -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/~dgibson From paulus at samba.org Fri Dec 9 19:54:47 2005 From: paulus at samba.org (Paul Mackerras) Date: Fri, 9 Dec 2005 19:54:47 +1100 Subject: please pull powerpc-merge.git Message-ID: <17305.18007.327733.878347@cargo.ozlabs.ibm.com> Linus, Please pull git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge.git Thanks, Paul. Benjamin Herrenschmidt: powerpc: Fix a huge page bug powerpc: Remove debug code in hash path David Gibson: powerpc: Add missing icache flushes for hugepages powerpc: Fix SLB flushing path in hugepage Michal Ostrowski: powerpc/pseries: Fix TCE building with 64k pagesize Fix windfarm model-id table Mike Kravetz: powerpc/pseries: boot failures on numa if no memory on node Olaf Hering: powerpc: correct the NR_CPUS description text Olof Johansson: powerpc: remove redundant code in stab init powerpc: Set cache info defaults Paul Mackerras: powerpc/pseries: Optimize IOMMU setup ppc: Build in all three of powermac, PREP and CHRP support arch/powerpc/Kconfig | 2 - arch/powerpc/kernel/setup_64.c | 10 +++ arch/powerpc/mm/hash_utils_64.c | 2 - arch/powerpc/mm/hugetlbpage.c | 95 ++++++++++++++++++++++++++------ arch/powerpc/mm/numa.c | 2 - arch/powerpc/mm/stab.c | 7 -- arch/powerpc/platforms/pseries/iommu.c | 11 ++-- arch/powerpc/platforms/pseries/lpar.c | 12 ---- arch/ppc/Kconfig | 6 +- drivers/macintosh/windfarm_pm81.c | 4 + include/asm-powerpc/mmu.h | 3 + 11 files changed, 104 insertions(+), 50 deletions(-) From arnd at arndb.de Sat Dec 10 02:30:27 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 9 Dec 2005 16:30:27 +0100 Subject: [PATCH 10/14] cell: add iommu support for larger memory In-Reply-To: <0ee9d47e9c94a42075ef44e387650089@bga.com> References: <0ee9d47e9c94a42075ef44e387650089@bga.com> Message-ID: <200512091630.28303.arnd@arndb.de> On Wednesday 07 December 2005 18:24, Milton Miller wrote: > > @@ -221,7 +224,7 @@ set_iopt_cache(void __iomem *base, unsig > >
unsigned long __iomem *tags = base + IOC_PT_CACHE_DIR; > > unsigned long __iomem *p = base + IOC_PT_CACHE_REG; > > pr_debug("iopt %02lx was v%016lx/t%016lx, store > > v%016lx/t%016lx\n", > > - index, get_iopt_cache(base, index, &oldtag), oldtag, > > val, tag); > > + index, get_iopt_cache(base, index, &tag), tag, val, > > tag); > > Assuming get_iopt_cache takes &tag to fill it in, this code is wrong. > The order of function argument evaluation is undefined in C, and the > compiler can choose to change its order at any time. Good catch. The old code simply got a compile error with pr_debug enabled; the new code would do actual harm. I'll just remove the line completely. > > + for (real_address = 0, io_address = 0; > > + io_address <= map_start + map_size; > > + real_address += io_page_size, io_address += io_page_size) > > { > > + ioste = get_iost_entry(fake_iopt, io_address, > > io_page_size); > > + if ((real_address & 0xfffffff) == 0) /* segment start > > */ > > + set_iost_cache(ioc_mmio_base, > > + io_address >> 28, ioste); > > + index = get_ioc_hash_1way(ioste, io_address); > > [comment] more magic numbers remain... Ok, I'll try to make that more readable. Arnd <>< From arnd at arndb.de Sat Dec 10 02:31:28 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 9 Dec 2005 16:31:28 +0100 Subject: [PATCH 03/14] spufs: Fix oops when spufs module is not loaded In-Reply-To: References: Message-ID: <200512091631.28725.arnd@arndb.de> On Wednesday 07 December 2005 18:23, Milton Miller wrote: > > - if (try_module_get(spufs_calls.owner)) { > > + if (owner && try_module_get(spufs_calls.owner)) { > > > > try_module_get(owner) to avoid the race (twice) > Twice? AFAICS, I got it right once and wrong in the other place. I'll fix that one up, thanks.
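The point behind "try_module_get(owner)" can be sketched in userspace C, as an illustration only. The types and helpers below (`fake_module`, `fake_try_module_get`, `calls_table`, `pin_owner`) are hypothetical stand-ins, not the spufs code: the shared pointer is read once into a local, and that same local is both tested and passed on, so the value that was checked is the value that gets pinned even if the shared field changes in between.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a module with a reference count and a liveness flag. */
struct fake_module {
	int refcount;
	int live;
};

/* Stand-in for a table whose owner may be set or cleared elsewhere. */
struct calls_table {
	struct fake_module *owner;
};

/* Hypothetical analogue of taking a module reference: succeeds only
 * while the module is live. Callers must pass a non-NULL pointer. */
static int fake_try_module_get(struct fake_module *m)
{
	assert(m != NULL);
	if (!m->live)
		return 0;
	m->refcount++;
	return 1;
}

/* Read the shared pointer once, then test and use that snapshot.
 * Re-reading c->owner for the second use would reopen the window in
 * which it could have become NULL after the check. */
static struct fake_module *pin_owner(struct calls_table *c)
{
	struct fake_module *owner = c->owner;	/* single read */

	if (owner && fake_try_module_get(owner))
		return owner;
	return NULL;
}
```

The sketch is single-threaded, so it cannot show the race itself; it only shows the shape of the fix, in which no second read of the shared field happens after the NULL test.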
Arnd <>< From arnd at arndb.de Sat Dec 10 03:10:02 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 9 Dec 2005 17:10:02 +0100 Subject: [PATCH 08/14] cell: enable pause(0) in cpu_idle In-Reply-To: <592f37d568f304c5bc5fbad0285c8cb8@bga.com> References: <592f37d568f304c5bc5fbad0285c8cb8@bga.com> Message-ID: <200512091710.03350.arnd@arndb.de> On Wednesday 07 December 2005 18:28, Milton Miller wrote: > Hi Arnd. Quite a few comments on this one. Unfortunately, I can't test my changes here, because I'm still on second generation hardware, while the code is written for the current DD3.x CPUs. I hope Max can help me with some of this. I'll send an updated patch for this to address your comments. It will also have an update to not use pause(0) at all on DD2.x hardware. > > + > > +struct pmd { > > + struct pmd_regs __iomem *regs; > > + int power_management_enable; > > +}; > > This name conflicts with the memory management system used throughout > the kernel. Please rename. Ok, I'll change to 'struct cbe_pervasive'. > > + if (__get_cpu_var(pmd).power_management_enable) > > How do you know __get_cpu_var is safe here? because this is only > called in the idle loop which is bound? > > > + { > > + /* Disable EE during check for pause */ > > + machine_state=mfmsr(); > > + machine_state &= ~MSR_EE; > > + mtmsrd(machine_state); > local_irq_disable() ? > > > + /* Pause the PU */ > > + HMT_low(); > > + multi_threading_control = 0; > > + mtspr(SPRN_CTRLT,multi_threading_control); > > + > > + /* Re-enable EE after resuming */ > > + machine_state=mfmsr(); > > + machine_state |= MSR_EE; > > + mtmsrd(machine_state); > local_irq_enable() ? I'll put local_irq_{en,dis}able outside of the if(), that should take care of both problems.
> > + /* Enable DEC and EE interrupt request */ > > + thread_switch_control = mfspr(SPRN_TSC_CELL); > > + thread_switch_control |= TSCR_EE_ENABLE | TSCR_EE_BOOST; > > + > > + if (smp_processor_id()%2) > > smp_processor_id is software number, and does not necessarily > correspond to the hardware thread id. Either use the hw > version, or better yet, read the PIR (spr 1023?) directly. Hmm, who sets up the PIR? It would definitely be better to use that for getting reliable information (e.g. when not using SMT), but I'm not sure if PIR is available or correct on our boards. > > +static struct pmd_regs __iomem *find_pmd_mmio(int cpu) > > +{ > > + struct device_node *node; > > + int node_number = cpu / 2; > > hmm... so # threads / node hard coded in here ... Yes, it really shouldn't, but we couldn't find a good way around this. The association between logical CPUs, nodes and threads is done through the ibm,ppc-interrupt-server#s property of a CPU device node, right? But how can I get back from a logical CPU number to the thread number? > > + struct pmd_regs __iomem *pmd_mmio_area; > > + unsigned long real_address; > > + > > + for (node = of_find_node_by_type(NULL, "cpu"); node; > > + node = of_find_node_by_type(node, "cpu")) { > > perhaps > for (node = NULL; node = of_find(..) ;) or =NULL then while > > Somewhat long-winded, but ok the way it is. I find Max' version more readable, even if it is a bit longer. 
> > + if (node_number == *(int *)get_property(node, > > "node-id", NULL)) > > + break; > > + } > > + > > + if (!node) { > > + printk(KERN_WARNING "PMD: CPU %d not found\n", cpu); > > + pmd_mmio_area = NULL; > > + } else { > > + real_address = *(long *)get_property(node, > > "pervasive", NULL); > > + pr_debug("PMD for CPU %d at %lx\n", cpu, real_address); > > + pmd_mmio_area = __ioremap(real_address, 0x1000, > > _PAGE_NO_CACHE); > > + } > > + return pmd_mmio_area; > > +} > > + > > +void __init cell_pervasive_init(void) > > +{ > > + struct pmd *pmd; > > + int cpu; > > + > > + if (!cpu_has_feature(CPU_FTR_PAUSE_ZERO)) > > + return; > > + > > + for_each_cpu(cpu) { > > + pmd = &per_cpu(pmd, cpu); > > + pmd->regs = find_pmd_mmio(cpu); > > + } > > O(n**2) find loop, could combine to get O(n) For the current generation, we won't have systems with more than four CPUs so it should not matter at all. > > +arch_initcall(enable_pause_zero_init); > > arch_initcall functions should be static right. > > + > > +void __init cell_pervasive_init(void); > > +void enable_pause_zero(void *); > > +void _pause_zero(void); > > what is the single_underscore _pause_zero() ? It's left over from an earlier version of the patch and will go away. > other functions are either arch_initcall or called by initcall > in the same file and can be static. ok. > > +BEGIN_FTR_SECTION > > + mr r22,r12 /* r12 has SRR1 saved */ > > + srwi r22,r22,16 > > + andi. r22,r22,MSR_WAKEMASK > > + cmpwi r22,MSR_WAKEEE > > + beq 40f > > + cmpwi r22,MSR_WAKEDEC > > + beq 42f > > + cmpwi r22,MSR_WAKEMT > > + beq 43f > > +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO) > > + b system_reset_common > > +40: b hardware_interrupt_common > > +42: b decrementer_common > > +43: EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN); > > + b fast_exception_return > > + > > Branches to branches that must be in the same file, within the first > 64k, currently within 32k. Just make the conditional branches > directly to the other routines. ok.
> This could go inline with system_reset_common, except that it would > mean breaking apart the STD_EXCEPTION_COMMON macro for it. Space > optimization would then be to put the test for WAKEMT after > PROLOG_COMMON at the expense of breaking up the tests. Max, do you want to try doing this? > We have multiple idle loops and ppc_md.idle_loop to avoid junk like > this. Assign the idle-loop based on the cpu feature. Place it in > persavisive.c, then you can make pause_zero static, and it will be > inline. All better for power (fewer tests and branches). ok. > > > > +#ifdef CONFIG_PPC_CELL > > +extern void pause_zero(void); > > +#else > > +static inline void pause_zero(void) > > +{ > > +} > > +#endif > > + > > and you can stop polluting processor.h with something you only want > called in a pinned cpu context from your idle loop. ok. > > -#define SPRN_TSCR 0x399 /* Thread switch control on BE > > */ > > -#define SPRN_TTR 0x39A /* Thread switch timeout on BE > > */ > > -#define TSCR_DEC_ENABLE 0x200000 /* Decrementer > > Interrupt */ > > +#define SPRN_TSC_CELL 0x399 /* Thread switch control on > > Cell */ > > +#define SPRN_TTR 0x39A /* Thread switch timeout on > > Cell */ > > +#define TSC_DEC_ENABLE_0 0x400000 /* Decrementer > > Interrupt */ > > +#define TSC_DEC_ENABLE_1 0x200000 /* Decrementer > > Interrupt */ > > The prefix should be the name of the register to which they apply and > directly under that register. Ok. Thanks for reviewing this. 
Arnd <>< From miltonm at bga.com Sat Dec 10 03:21:11 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 9 Dec 2005 10:21:11 -0600 Subject: [RFC PATCH 6/5] CELL rtas console port to hvc_console backend driver In-Reply-To: <1134037870.19711.57.camel@localhost.localdomain> References: <43935B9C.5020503@us.ibm.com> <1134037870.19711.57.camel@localhost.localdomain> Message-ID: On Dec 8, 2005, at 4:31 AM, David Woodhouse wrote: > --- linux-2.6.14/drivers/char/hvc_rtas.c~ 2005-12-07 > 18:12:59.000000000 +0100 > +++ linux-2.6.14/drivers/char/hvc_rtas.c 2005-12-07 18:15:39.000000000 > +0100 > @@ -0,0 +1,161 @@ > +/* > + * IBM RTAS driver interface to hvc_console.c > + * > + * (C) Copyright IBM Corporation 2001-2005 > + * (C) Copyright Red Hat, Inc. 2005 > + * > + * Author(s): Maximino Augilar > + * : Ryan S. Arnold > + * : Utz Bacher > + * : David Woodhouse > + * > + * inspired by drivers/char/hvc_console.c > + * written by Anton Blanchard and Paul Mackerras > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. 
> + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA > 02111-1307 USA > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include "hvc_console.h" > + > +static uint32_t hvc_rtas_vtermno = 0; > +struct hvc_struct *hvc_rtas_dev; > + > +#define RTASCONS_PUT_ATTEMPTS 16 > + > +static int rtascons_put_char_token = -1; > +static int rtascons_get_char_token = -1; > +static int rtascons_put_delay; rtas.h includes #define RTAS_UNKNOWN_SERVICE (-1) how about using that? > + > +static inline int hvc_rtas_write_console(uint32_t vtermno, const char > *buf, int count) > +{ > + int result = 0; > + int attempts = RTASCONS_PUT_ATTEMPTS; > + int done = 0; > + > + /* if there is more than one character to be displayed, wait a > bit */ > + for (; done < count && attempts; udelay(rtascons_put_delay)) { > + attempts--; double - ? > + result = rtas_call(rtascons_put_char_token, 1, 1, > NULL, buf[done]); > + > + if (!result) { > + attempts = RTASCONS_PUT_ATTEMPTS; > + done++; > + } > + } The above loop delays after every character, regardless of intermediate success, or having more than one character as indicated. Also, I would move the initial attempts initialization to the first clause of the for. also done. Perhaps move udelay to else of !result and attempts to the loop closure (or just postfix --- on the && to avoid off-by-one). > + /* the calling routine expects to receive the number of bytes sent */ > + return done?:result; Is the result a sane thing to return? Do we know it will be < 0? How about -1 instead. And then result doesn't need a initial assignment (do we care about calling this with count 0?) 
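The loop restructuring suggested in the comments above can be sketched in userspace C, as one possible reading rather than the driver's code. All names here (`fake_put_char`, `fake_delay`, `write_console`, `PUT_ATTEMPTS`, `fail_budget`) are hypothetical; the firmware call and the delay are replaced by simple stubs. The delay happens only after a failed put, the attempt counter is reset after each success, and a fixed -1 is returned when nothing was sent.

```c
#include <assert.h>

#define PUT_ATTEMPTS 16

/* Test scaffolding standing in for the firmware call and udelay(). */
static int fail_budget;		/* number of upcoming puts that fail */
static int delays;		/* how many times we backed off */
static char sink[64];		/* characters accepted so far */
static int sink_len;

/* Returns 0 on success and -1 on failure, like a firmware status. */
static int fake_put_char(char c)
{
	if (fail_budget > 0) {
		fail_budget--;
		return -1;
	}
	if (sink_len >= (int)sizeof(sink))
		return -1;
	sink[sink_len++] = c;
	return 0;
}

static void fake_delay(void)
{
	delays++;
}

/* Send up to count bytes. Back off only after a failed put, and give
 * up after PUT_ATTEMPTS consecutive failures on the same character.
 * Returns the number of bytes sent, or -1 if none were sent. */
static int write_console(const char *buf, int count)
{
	int attempts = PUT_ATTEMPTS;
	int done = 0;

	assert(count >= 0);

	while (done < count && attempts > 0) {
		if (fake_put_char(buf[done]) == 0) {
			attempts = PUT_ATTEMPTS;	/* fresh budget per char */
			done++;
		} else {
			attempts--;
			fake_delay();
		}
	}

	return done ? done : -1;
}
```

Note that in this sketch a call with count 0 also returns -1; whether that case matters is the open question raised in the review, and a caller that cares could special-case it.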
> +} > + > +static inline int rtascons_get_char(void) > +{ > + int result; > + > + if (rtas_call(rtascons_get_char_token, 0, 2, &result)) > + result = -1; > + > + return result; > +} > + > +static int hvc_rtas_read_console(uint32_t vtermno, char *buf, int > count) > +{ > + unsigned long got; > + int c; > + int i; > + > + for (got = 0, i = 0; i < count; i++) { > + > + if (( c = rtascons_get_char() ) != -1) { > + buf[i] = c; > + ++got; > + } > + else > + break; > + } > + return got; > +} > + > +static struct hv_ops hvc_rtas_get_put_ops = { > + .get_chars = hvc_rtas_read_console, > + .put_chars = hvc_rtas_write_console, > +}; > + > +static int hvc_rtas_init(void) > +{ > + struct hvc_struct *hp; > + > + if (rtascons_put_char_token == -1) > + rtascons_put_char_token = rtas_token("put-term-char"); > + if (rtascons_put_char_token == -1) > + return -EIO; > + > + if (rtascons_get_char_token == -1) > + rtascons_get_char_token = rtas_token("get-term-char"); > + if (rtascons_get_char_token == -1) > + return -EIO; RTAS_UNKNOWN_SERVICE > + > + if (__onsim()) > + rtascons_put_delay = 0; > + else > + rtascons_put_delay = 100; > + hmm... should this delay be a (writable) module parameter? currently we get no delay until the tty subsystem initializes, which could lead to a lossy console during boot (albeit a faster boot). > + BUG_ON(hvc_rtas_dev); > + > + /* Allocate an hvc_struct for the console device we instantiated > + * earlier. 
Save off hp so that we can return it on exit */ > + hp = hvc_alloc(hvc_rtas_vtermno, NO_IRQ, &hvc_rtas_get_put_ops); > + if (IS_ERR(hp)) > + return PTR_ERR(hp); > + hvc_rtas_dev = hp; > + return 0; > +} > +module_init(hvc_rtas_init); > + > +/* This will tear down the tty portion of the driver */ > +static void __exit hvc_rtas_exit(void) > +{ > + struct hvc_struct *hp_safe; > + /* Hopefully this isn't premature */ > + if (!hvc_rtas_dev) > + return; > + > + hp_safe = hvc_rtas_dev; > + hvc_rtas_dev = NULL; > + There seems to be a lot of defensive programming around the storing of this cookie (test for null on init, move to safe, etc). Since there is no interrupt callback involved I don't see the need. > + /* Really the fun isn't over until the worker thread breaks down and > the > + * tty cleans up */ > + hvc_remove(hp_safe); > +} > +module_exit(hvc_rtas_exit); /* before drivers/char/hvc_console.c */ > + > +/* This will happen prior to module init. There is no tty at this > time? */ > +static int hvc_rtas_console_init(void) > +{ > + rtascons_put_char_token = rtas_token("put-term-char"); > + if (rtascons_put_char_token == -1) > + return -EIO; > + rtascons_get_char_token = rtas_token("get-term-char"); > + if (rtascons_get_char_token == -1) > + return -EIO; RTAS_UNKNOWN_SERVICE > + > + hvc_instantiate(hvc_rtas_vtermno, 0, &hvc_rtas_get_put_ops ); > + return 0; > +} Since (I assume) there is no cookie in the device tree, define a magic number instead of this never-set static variable. The purpose is to match the driver data between console and tty init. If all drivers use the same number there will be confusion if two drivers think they can init. 
> +console_initcall(hvc_rtas_console_init); > --- linux-2.6.14/drivers/char/Makefile~ 2005-12-07 17:47:05.000000000 > +0100 > +++ linux-2.6.14/drivers/char/Makefile 2005-12-07 18:12:07.000000000 > +0100 > @@ -43,6 +43,7 @@ obj-$(CONFIG_RIO) += rio/ generic_seria > obj-$(CONFIG_HVC_DRIVER) += hvc_console.o > obj-$(CONFIG_HVC_CONSOLE) += hvc_vio.o hvsi.o > obj-$(CONFIG_HVC_FSS) += hvc_fss.o > +obj-$(CONFIG_HVC_RTAS) += hvc_rtas.o > obj-$(CONFIG_RAW_DRIVER) += raw.o > obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o > obj-$(CONFIG_MMTIMER) += mmtimer.o > --- linux-2.6.14/drivers/char/Kconfig~ 2005-12-07 17:47:05.000000000 > +0100 > +++ linux-2.6.14/drivers/char/Kconfig 2005-12-07 18:17:14.000000000 > +0100 > @@ -575,6 +575,13 @@ config HVC_FSS > IBM Full System Simulator Console device driver which makes use of > the HVC_DRIVER front end. > > +config HVC_RTAS > + bool "IBM RTAS Console support" > + depends on PPC_RTAS > + select HVC_DRIVER > + help > + IBM Console device driver which makes use of RTAS > + > config HVCS > tristate "IBM Hypervisor Virtual Console Server support" > depends on PPC_PSERIES > > -- > dwmw2 > milton From miltonm at bga.com Sat Dec 10 03:59:25 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 9 Dec 2005 10:59:25 -0600 Subject: [PATCH 03/14] spufs: Fix oops when spufs module is not loaded In-Reply-To: <200512091631.28725.arnd@arndb.de> References: <200512091631.28725.arnd@arndb.de> Message-ID: <36813c4dfc4eaae066f569015a9cb4eb@bga.com> On Dec 9, 2005, at 9:31 AM, Arnd Bergmann wrote: > On Middeweken 07 Dezember 2005 18:23, Milton Miller wrote: >>> - if (try_module_get(spufs_calls.owner)) { >>> + if (owner && try_module_get(spufs_calls.owner)) { >>> >> >> try_module_get(owner) to avoid the race (twice) >> > > Twice? AFAICS, I got it right once and wrong in the other place. > I'll fix that one up, thanks. > > Arnd <>< > > Guess I read the second hunk too quickly. 
milton From miltonm at bga.com Sat Dec 10 04:09:13 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 9 Dec 2005 11:09:13 -0600 Subject: [PATCH 08/14] cell: enable pause(0) in cpu_idle In-Reply-To: <200512091710.03350.arnd@arndb.de> References: <592f37d568f304c5bc5fbad0285c8cb8@bga.com> <200512091710.03350.arnd@arndb.de> Message-ID: On Dec 9, 2005, at 10:10 AM, Arnd Bergmann wrote: > On Middeweken 07 Dezember 2005 18:28, Milton Miller wrote: >> Hi Arnd. Quite a few comments on this one. > > Unfortunately, I can't test my changes here, because I'm still on > second generation hardware, while the code is written for the current > DD3.x CPUs. I hope Max can help me with some of this. > > I'll send an updated patch for this to address your comments. It > will also have an update to not use pause(0) at all on DD2.x hardware. ... >>> + /* Enable DEC and EE interrupt request */ >>> + thread_switch_control = mfspr(SPRN_TSC_CELL); >>> + thread_switch_control |= TSCR_EE_ENABLE | TSCR_EE_BOOST; >>> + >>> + if (smp_processor_id()%2) >> >> smp_processor_id is software number, and does not necessarily >> correspond to the hardware thread id. Either use the hw >> version, or better yet, read the PIR (spr 1023?) directly. > > Hmm, who sets up the PIR? It would definitely be better to use > that for getting reliable information (e.g. when not using SMT), > but I'm not sure if PIR is available or correct on our boards. Since this assumption is throughout the code, I won't insist using PIR. > >>> +static struct pmd_regs __iomem *find_pmd_mmio(int cpu) >>> +{ >>> + struct device_node *node; >>> + int node_number = cpu / 2; >> >> hmm... so # threads / node hard coded in here ... > > Yes, it really shouldn't, but we couldn't find a good way > around this. The association between logical CPUs, nodes and > threads is done through the ibm,ppc-interrupt-server#s > property of a CPU device node, right? > But how can I get back from a logical CPU number to the > thread number? 
Use hard_smp_processor_id() in smp.h, which gives you a number retrieved from the device tree and used to start the thread. > > Thanks for reviewing this. > You're welcome. milton From galak at kernel.crashing.org Sat Dec 10 05:17:19 2005 From: galak at kernel.crashing.org (Kumar Gala) Date: Fri, 9 Dec 2005 12:17:19 -0600 Subject: Can we able to boot a 32-bit PPC440EP target with 64-bit kernel? In-Reply-To: <5f87992f0512082127u1466d609rd2ed74ce8bdea843@mail.gmail.com> References: <5f87992f0512082127u1466d609rd2ed74ce8bdea843@mail.gmail.com> Message-ID: <341411FF-0918-4EA3-9149-3DEF740D8251@kernel.crashing.org> On Dec 8, 2005, at 11:27 PM, Vijayakumar Ramalingam wrote: > Hi all, > > Is it possible to boot a 32-bit PPC440EP board which has 64- > bit support with 64-bit Linux kernel 2.6.13-2. > > If so how can be done. Please give me some points over it. What do you mean by a PPC440EP with 64-bit support? Odds are that it's not feasible to do what you're asking. - kumar From arnd at arndb.de Sat Dec 10 05:04:15 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:15 +0100 Subject: [PATCH 1/8] spufs: fix module refcount race References: <20051209180414.872465000@localhost> Message-ID: <20051209182053.486222000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spufs-modcount-fix.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/0d808e29/attachment.txt From arnd at arndb.de Sat Dec 10 05:04:14 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:14 +0100 Subject: [PATCH 0/8] Re: Cell updates for powerpc.git Message-ID: <20051209180414.872465000@localhost> On Thursday 08 December 2005 07:18, Paul Mackerras wrote: > > I have put patches 1..7 and 9 into the powerpc.git tree. Please > address Milton's comments on patches 3, 5, 8 and 10. Milton's comment for patch 5 was about the patch comment; I don't think I can address that any more.
Here come updated versions of patches 8 and 10, a fix for patch 3, and a number of other patches that have come up in the last week. All patches apply on top of today's powerpc.git tree. More patches will be coming to address the concerns of Al Viro about spufs. Thanks, Arnd <>< From arnd at arndb.de Sat Dec 10 05:04:20 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:20 +0100 Subject: [PATCH 6/8] cell: add iommu support for larger memory References: <20051209180414.872465000@localhost> Message-ID: <20051209182054.342686000@localhost> An embedded and charset-unspecified text was scrubbed... Name: iommu-new-firmware-7.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/05f5fada/attachment.txt From arnd at arndb.de Sat Dec 10 05:04:16 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:16 +0100 Subject: [PATCH 2/8] spufs: trivial compile fix References: <20051209180414.872465000@localhost> Message-ID: <20051209182053.654088000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spufs-build-fix.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/c3591d22/attachment.txt From arnd at arndb.de Sat Dec 10 05:04:18 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:18 +0100 Subject: [PATCH 4/8] spufs: clear dsisr on CLASS1[Mf] exception References: <20051209180414.872465000@localhost> Message-ID: <20051209182053.991411000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spufs-clear-dsisr.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/3cf2aa15/attachment.txt From arnd at arndb.de Sat Dec 10 05:04:19 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:19 +0100 Subject: [PATCH 5/8] cell: enable pause(0) in cpu_idle References: <20051209180414.872465000@localhost> Message-ID: <20051209182054.167727000@localhost> An embedded and charset-unspecified text was scrubbed...
Name: bpa-pmd-add-3.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/56007b5b/attachment.txt From arnd at arndb.de Sat Dec 10 05:21:44 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 9 Dec 2005 19:21:44 +0100 Subject: [PATCH 8/8] powerpc: fix large nvram access References: <20051209180414.872465000@localhost> Message-ID: <200512091921.45285.arnd@arndb.de> An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/650a14b6/attachment.txt From arnd at arndb.de Sat Dec 10 05:04:17 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:17 +0100 Subject: [PATCH 3/8] spufs: fix hexdump format References: <20051209180414.872465000@localhost> Message-ID: <20051209182053.821119000@localhost> An embedded and charset-unspecified text was scrubbed... Name: spufs-fix-hexdump.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/d3c7ddb6/attachment.txt From arnd at arndb.de Sat Dec 10 05:04:21 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 09 Dec 2005 19:04:21 +0100 Subject: [PATCH 7/8] cell: disable legacy i/o area References: <20051209180414.872465000@localhost> Message-ID: <20051209182054.513792000@localhost> An embedded and charset-unspecified text was scrubbed... Name: no-legacy-io.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051209/ae8ef127/attachment.txt From jschopp at austin.ibm.com Sat Dec 10 10:06:00 2005 From: jschopp at austin.ibm.com (Joel Schopp) Date: Fri, 09 Dec 2005 17:06:00 -0600 Subject: [PATCH] boot failures on numa if no memory on node In-Reply-To: <20051207210723.GA8970@monkey.ibm.com> References: <20051207210723.GA8970@monkey.ibm.com> Message-ID: <439A0DD8.8020303@austin.ibm.com> > This bug exists in the current code and prevents machines from booting > with numa enabled if there is a node that does not contain memory. > Workaround is to boot with 'numa=off'. 
Looks like a simple type. There is some irony to misspelling typo. > > Signed-off-by: Mike Kravetz > > diff -Naupr linux-2.6.15-rc5-git1/arch/powerpc/mm/numa.c linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c > --- linux-2.6.15-rc5-git1/arch/powerpc/mm/numa.c 2005-12-04 05:10:42.000000000 +0000 > +++ linux-2.6.15-rc5-git1.work/arch/powerpc/mm/numa.c 2005-12-07 20:49:23.000000000 +0000 > @@ -125,7 +125,7 @@ void __init get_region(unsigned int nid, > > /* We didnt find a matching region, return start/end as 0 */ > if (*start_pfn == -1UL) > - start_pfn = 0; > + *start_pfn = 0; Good catch. Acked-by: Joel Schopp From dwmw2 at infradead.org Sat Dec 10 11:39:05 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Sat, 10 Dec 2005 01:39:05 +0100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() Message-ID: <1134175145.19711.167.camel@localhost.localdomain> This should prevent the crashes on PowerMac and Cell when parport_pc pokes at non-existent ports... Signed-off-by: David Woodhouse --- linux-2.6.14/drivers/parport/parport_pc.c~ 2005-10-28 02:02:08.000000000 +0200 +++ linux-2.6.14/drivers/parport/parport_pc.c 2005-12-10 01:33:19.000000000 +0100 @@ -1417,6 +1417,11 @@ static void __devinit winbond_check(int { int devid,devrev,oldid,x_devid,x_devrev,x_oldid; +#ifdef CONFIG_PPC_MERGE + if (check_legacy_ioport(io)) + return; +#endif + if (!request_region(io, 3, __FUNCTION__)) return; @@ -1451,6 +1456,11 @@ static void __devinit winbond_check2(int { int devid,devrev,oldid,x_devid,x_devrev,x_oldid; +#ifdef CONFIG_PPC_MERGE + if (check_legacy_ioport(io)) + return; +#endif + if (!request_region(io, 3, __FUNCTION__)) return; @@ -1484,6 +1494,10 @@ static void __devinit smsc_check(int io, { int id,rev,oldid,oldrev,x_id,x_rev,x_oldid,x_oldrev; +#ifdef CONFIG_PPC_MERGE + if (check_legacy_ioport(io)) + return; +#endif if (!request_region(io, 3, __FUNCTION__)) return; @@ -2154,6 +2168,12 @@ struct parport *parport_pc_probe_port (u struct resource *ECR_res = NULL; struct 
resource *EPP_res = NULL; +#ifdef CONFIG_PPC_MERGE + if (check_legacy_ioport(base)) + goto out1; + if (base_hi && check_legacy_ioport(base_hi)) + goto out1; +#endif ops = kmalloc(sizeof (struct parport_operations), GFP_KERNEL); if (!ops) goto out1; -- dwmw2 From benh at kernel.crashing.org Sat Dec 10 16:52:29 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 10 Dec 2005 16:52:29 +1100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <1134175145.19711.167.camel@localhost.localdomain> References: <1134175145.19711.167.camel@localhost.localdomain> Message-ID: <1134193949.6989.8.camel@gaston> On Sat, 2005-12-10 at 01:39 +0100, David Woodhouse wrote: > This should prevent the crashes on PowerMac and Cell when parport_pc > pokes at non-existent ports... Mikey has a better patch afaik Ben. From mikey at neuling.org Sat Dec 10 20:54:22 2005 From: mikey at neuling.org (Michael Neuling) Date: Sat, 10 Dec 2005 20:54:22 +1100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <1134193949.6989.8.camel@gaston> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> Message-ID: <20051210205422.4660d802.mikey@neuling.org> > > This should prevent the crashes on PowerMac and Cell when parport_pc > > pokes at non-existent ports... > > Mikey has a better patch afaik I've attached this here. This patch is rebased to Paulus' current powerpc.git tree. It compiles but is not tested (although it was tested on an earlier version). There is also a pcspkr patch for a similar problem which I'll repost in the coming days. Mikey Patch stops parport from accessing non-existent ports.
Signed-off-by: Michael Neuling include/asm-powerpc/parport.h | 28 ++++++++++++++++++++++++++-- 1 files changed, 26 insertions(+), 2 deletions(-) Index: linux-2.6-powerpc.nobackup/include/asm-powerpc/parport.h =================================================================== --- linux-2.6-powerpc.nobackup.orig/include/asm-powerpc/parport.h +++ linux-2.6-powerpc.nobackup/include/asm-powerpc/parport.h @@ -9,10 +9,34 @@ #ifndef _ASM_POWERPC_PARPORT_H #define _ASM_POWERPC_PARPORT_H -static int __devinit parport_pc_find_isa_ports (int autoirq, int autodma); +#include + +extern struct parport *parport_pc_probe_port (unsigned long int base, + unsigned long int base_hi, + int irq, int dma, + struct pci_dev *dev); + static int __devinit parport_pc_find_nonpci_ports (int autoirq, int autodma) { - return parport_pc_find_isa_ports (autoirq, autodma); + struct device_node *np; + u32 *prop; + u32 io1, io2; + int propsize; + int count = 0; + for (np = NULL; (np = of_find_compatible_node(np, + "parallel", + "pnpPNP,400")) != NULL;) { + prop = (u32 *)get_property(np, "reg", &propsize); + if (!prop || propsize > 6*sizeof(u32)) + continue; + io1 = prop[1]; io2 = prop[2]; + prop = (u32 *)get_property(np, "interrupts", NULL); + if (!prop) + continue; + if (parport_pc_probe_port(io1, io2, prop[0], autodma, NULL) != NULL) + count++; + } + return count; } #endif /* !(_ASM_POWERPC_PARPORT_H) */ From olh at suse.de Sat Dec 10 21:08:17 2005 From: olh at suse.de (Olaf Hering) Date: Sat, 10 Dec 2005 11:08:17 +0100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <20051210205422.4660d802.mikey@neuling.org> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> Message-ID: <20051210100817.GA30931@suse.de> On Sat, Dec 10, Michael Neuling wrote: > Patch stops parport from accessing non existant ports. 
The node on Pegasos looks like that: /proc/device-tree/pci at 80000000/isa at C/lpt at i3BC: name "lpt" device_type "lpt" reg 00000001 000003bc 00000008 interrupts 00000007 00000000 clock-frequency 00000000 linux,phandle 0fc5bae0 (264616672) -- short story of a lazy sysadmin: alias appserv=wotan From dwmw2 at infradead.org Sun Dec 11 03:00:36 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Sat, 10 Dec 2005 17:00:36 +0100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <20051210205422.4660d802.mikey@neuling.org> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> Message-ID: <1134230437.19711.173.camel@localhost.localdomain> On Sat, 2005-12-10 at 20:54 +1100, Michael Neuling wrote: > I've attched this here. This patched is rebased to Paulus current > powerpc.git tree. It compiles but is not tested (although it was > tested on an earlier verision). Thanks. This doesn't seem to cover the SuperIO probes though; they possibly still need check_legacy_ioport() (which also needs to be exported). Admittedly, it would be better to have a separate superio PNP driver which handles configuration of all the devices on these chips rather than having partial superio code in individual (parport,8250,IrDA,...) drivers, but that's one for later. -- dwmw2 From miltonm at bga.com Sun Dec 11 03:24:58 2005 From: miltonm at bga.com (Milton Miller) Date: Sat, 10 Dec 2005 10:24:58 -0600 Subject: [PATCH 5/8] cell: enable pause(0) in cpu_idle Message-ID: Much better, Just a few more things. One cut and paste error, the rest are more cleanups and probes of the idle code now it is in one place to see. On Sat Dec 10 05:04:19 EST 2005, Arnd Bergmann wrote: > This patch enables support for pause(0) power management state > for the Cell Broadband Processor, which is import for power efficient > operation. 
The pervasive infrastructure will in the future enable > us to introduce more functionality specific to the Cell's > pervasive unit. > > This version contains fixes for a few problems found by > Milton Miller and a fix to keep running on Cell BE DD2 > CPUs, where the hardware feature is broken. > > From: Maximino Aguilar > Signed-off-by: Arnd Bergmann ... > +#include "pervasive.h" > + > +struct cbe_pervasive { > + struct pmd_regs __iomem *regs; > + int power_management_enable; > +}; > + > +static DEFINE_PER_CPU(struct cbe_pervasive, cbe_pervasive); > + > +static void pause_zero(void) > +{ > + unsigned int multi_threading_control; > + > + /* Reset Thread Run Latch (latch is set in idle.c) */ runlatch_off is both here and in pause_zero_idle. It is turned on below in pause_zero_idle. > + ppc64_runlatch_off(); > + > + local_irq_disable(); > + if (__get_cpu_var(cbe_pervasive).power_management_enable) { > + /* Pause the PU */ > + HMT_low(); > + multi_threading_control = 0; > + mtspr(SPRN_CTRLT,multi_threading_control); What is the purpose of the stack variable set to 0 every loop and only used as an argument to mtspr? > + } > + local_irq_disable(); oops, local_irq_enable() > +} > + > +static void pause_zero_idle(void) > +{ > + set_thread_flag(TIF_POLLING_NRFLAG); > + So your cpu keeps running, just slowly. (This flag says don't bother to ipi the processor when there is work to do, it is polling waiting for work. If you set this when you are sleeping then you will incur wakeup latency until the next decrementer tick.) > + while (1) { > + if (!need_resched()) { > + while (!need_resched()) { > + ppc64_runlatch_off(); > + pause_zero(); > + /* > + * Go into low thread priority and possibly > + * low power mode. > + */ > + HMT_low(); > + HMT_very_low(); Does your cpu support very_low? Can it be disabled without disabling low? If not, only one of these is needed. The generic has both because some older cpus did not have very_low.
This code results in a periodic bump up to low on cpus that support both. And you set HMT_low in pauze_zero. > + } > + > + HMT_medium(); > + } > + > + ppc64_runlatch_on(); > + preempt_enable_no_resched(); > + schedule(); > + preempt_disable(); > + } > +} > Rest looks good. Thanks. milton From arnd at arndb.de Sun Dec 11 06:00:20 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Sat, 10 Dec 2005 20:00:20 +0100 Subject: [PATCH 5/8] cell: enable pause(0) in cpu_idle In-Reply-To: References: Message-ID: <200512102000.20304.arnd@arndb.de> On Saturday 10 December 2005 17:24, Milton Miller wrote: > Much better, Just a few more things. One cut and paste error, the rest > are more cleanups and probes of the idle code now it is in one place to > see. Ok. Paulus, please ignore this patch again. > ... > > > +#include "pervasive.h" > > + > > +struct cbe_pervasive { > > + struct pmd_regs __iomem *regs; > > + int power_management_enable; > > +}; > > + > > +static DEFINE_PER_CPU(struct cbe_pervasive, cbe_pervasive); > > + > > +static void pause_zero(void) > > +{ > > + unsigned int multi_threading_control; > > + > > + /* Reset Thread Run Latch (latch is set in idle.c) */ > > runlatch_off is both here and in pause zero idle. It is turned on > below in pause_zero_idle. ok, I will fix. > > + ppc64_runlatch_off(); > > + > > + local_irq_disable(); > > + if (__get_cpu_var(cbe_pervasive).power_management_enable) { > > + /* Pause the PU */ > > + HMT_low(); > > + multi_threading_control = 0; > > + mtspr(SPRN_CTRLT,multi_threading_control); > > What is the purpose of the stack variable set to 0 every loop and > only used as an argument to mtspr? I can't see any. Max? > > + } > > + local_irq_disable(); > > oops, local_irq_enable() Ok. That makes it obvious that my test on the DD3 board actually did not get here, so I also have another bug that disables all of the code in here.
> > +} > > + > > +static void pause_zero_idle(void) > > +{ > > + set_thread_flag(TIF_POLLING_NRFLAG); > > + > > So your cpu keeps running, just slowly. (This flag says don't bother > to ipi the processor when there is work to do, it is polling waiting > for work. If you set this when you are sleeping then you will incurr > wakeup latency until the next decrementer tick. Ok, I'll investigate further. This sounds like a serious problem, but I would expect to have observed some real performance hits as a result of it. > > + while (1) { > > + if (!need_resched()) { > > + while (!need_resched()) { > > + ppc64_runlatch_off(); > > + pause_zero(); > > + /* > > + * Go into low thread priority and > > possibly > > + * low power mode. > > + */ > > + HMT_low(); > > + HMT_very_low(); > > Does your cpu support very_low? Can it be disabled without disabling > low? If not, only one of these is needed. The generic has both > because some older cpus did not have very_low. This code results in a > periodic bump up to low on cpus that support both. > > And you set HMT_low in pauze_zero. Hmm, I wonder if that might be one of the causes of the performance problems we see on DD2 hardware (which can't do pauze_zero). Toggling between HMT_low and HMT_very_low on one thread might cause problems in the pipeline for the other thread in theory. Which document would be the one telling me if HMT_very_low is supported? Arnd <>< From segher at kernel.crashing.org Sun Dec 11 07:53:12 2005 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Sat, 10 Dec 2005 21:53:12 +0100 Subject: [PATCH 5/8] cell: enable pause(0) in cpu_idle In-Reply-To: <200512102000.20304.arnd@arndb.de> References: <200512102000.20304.arnd@arndb.de> Message-ID: <3d7062aa9b1cfe44f2f4f218d4ee61ea@kernel.crashing.org> >> Does your cpu support very_low? Can it be disabled without disabling >> low? If not, only one of these is needed. The generic has both >> because some older cpus did not have very_low. 
This code results in a >> periodic bump up to low on cpus that support both. >> >> And you set HMT_low in pauze_zero. > > Hmm, I wonder if that might be one of the causes of the performance > problems > we see on DD2 hardware (which can't do pauze_zero). Toggling between > HMT_low > and HMT_very_low on one thread might cause problems in the pipeline > for the > other thread in theory. > Which document would be the one telling me if HMT_very_low is > supported? CBEA claims to be compatible with the PowerPC 2.02 arch, which defines "very low" -- although it only says "it provides a hint to the processor", so your CPU might very well just ignore it. I grepped and grepped, but couldn't find anything about the thread priority in any of the Cell docs (except for lots of stuff relating to the interrupt controller ;-) ) You'll just have to experiment I guess (or ask the chip guys...) Segher From benh at kernel.crashing.org Sun Dec 11 08:48:16 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 11 Dec 2005 08:48:16 +1100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <20051210100817.GA30931@suse.de> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> <20051210100817.GA30931@suse.de> Message-ID: <1134251296.6989.18.camel@gaston> On Sat, 2005-12-10 at 11:08 +0100, Olaf Hering wrote: > On Sat, Dec 10, Michael Neuling wrote: > > > Patch stops parport from accessing non existant ports. > > The node on Pegasos looks like that: Well, then they should fix it :) I think we have enough workarounds for pegasos firmware "issues" so far. Ben. 
From benh at kernel.crashing.org Sun Dec 11 08:49:54 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 11 Dec 2005 08:49:54 +1100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <1134230437.19711.173.camel@localhost.localdomain> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> <1134230437.19711.173.camel@localhost.localdomain> Message-ID: <1134251394.6989.21.camel@gaston> On Sat, 2005-12-10 at 17:00 +0100, David Woodhouse wrote: > On Sat, 2005-12-10 at 20:54 +1100, Michael Neuling wrote: > > I've attched this here. This patched is rebased to Paulus current > > powerpc.git tree. It compiles but is not tested (although it was > > tested on an earlier verision). > > Thanks. This doesn't seem to cover the SuperIO probes though; they > possibly still need check_legacy_ioport() (which also needs to be > exported). How so ? > Admittedly, it would be better to have a separate superio PNP driver > which handles configuration of all the devices on these chips rather > than having partial superio code in individual (parport,8250,IrDA,...) > drivers, but that's one for later. Well, we expect the vaerious individual ISA devices to show up in the device-tree, if not, they are not accessible. That covers SuperIOs. Ben. 
From dwmw2 at infradead.org Sun Dec 11 10:36:39 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Sun, 11 Dec 2005 00:36:39 +0100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <1134251394.6989.21.camel@gaston> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> <1134230437.19711.173.camel@localhost.localdomain> <1134251394.6989.21.camel@gaston> Message-ID: <1134257800.19711.213.camel@localhost.localdomain> On Sun, 2005-12-11 at 08:49 +1100, Benjamin Herrenschmidt wrote: > On Sat, 2005-12-10 at 17:00 +0100, David Woodhouse wrote: > > On Sat, 2005-12-10 at 20:54 +1100, Michael Neuling wrote: > > > I've attched this here. This patched is rebased to Paulus current > > > powerpc.git tree. It compiles but is not tested (although it was > > > tested on an earlier verision). > > > > Thanks. This doesn't seem to cover the SuperIO probes though; they > > possibly still need check_legacy_ioport() (which also needs to be > > exported). > > How so ? I mean that smsc_check() and winbond_check() functions in parport_pc.c will still crash the machine, surely? > > Admittedly, it would be better to have a separate superio PNP driver > > which handles configuration of all the devices on these chips rather > > than having partial superio code in individual (parport,8250,IrDA,...) > > drivers, but that's one for later. > > Well, we expect the vaerious individual ISA devices to show up in the > device-tree, if not, they are not accessible. That covers SuperIOs. It's more than just whether the device exists or not. For some devices, part of the configuration is done in the SuperIO registers. 
In particular, most of the UARTs on these chips can go up to 460800 or 921600 baud, but those modes need to be enabled in the SuperIO (generally not in the registers of the UART itself, except in the NatSemi case where it _also_ needs to be enabled in the SuperIO regs to let you have access to that bank of UART regs). I think IrDA ports do FIR mode in a similar fashion too. But what I mean by the above quote is that for platforms where we _do_ have actual control of the hardware and we want to poke at superio, it should be a superio pnp/bus driver rather than a scattering of superio probes through various other (8250/parport/irda/floppy) places. -- dwmw2 From dwmw2 at infradead.org Sun Dec 11 10:50:42 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Sun, 11 Dec 2005 00:50:42 +0100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <1134251296.6989.18.camel@gaston> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> <20051210100817.GA30931@suse.de> <1134251296.6989.18.camel@gaston> Message-ID: <1134258643.19711.215.camel@localhost.localdomain> On Sun, 2005-12-11 at 08:48 +1100, Benjamin Herrenschmidt wrote: > Well, then they should fix it :) I think we have enough workarounds for > pegasos firmware "issues" so far. SLOF port, anyone? 
:) -- dwmw2 From benh at kernel.crashing.org Sun Dec 11 10:49:48 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 11 Dec 2005 10:49:48 +1100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <1134257800.19711.213.camel@localhost.localdomain> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> <1134230437.19711.173.camel@localhost.localdomain> <1134251394.6989.21.camel@gaston> <1134257800.19711.213.camel@localhost.localdomain> Message-ID: <1134258589.6989.41.camel@gaston> > I mean that smsc_check() and winbond_check() functions in parport_pc.c > will still crash the machine, surely? Oh, I haven't seen these... Dammit, can we have a single driver written on x86 that isn't a pile of horse shit ? > It's more than just whether the device exists or not. For some devices, > part of the configuration is done in the SuperIO registers. In > particular, most of the UARTs on these chips can go up to 460800 or > 921600 baud, but those modes need to be enabled in the SuperIO > (generally not in the registers of the UART itself, except in the > NatSemi case where it _also_ needs to be enabled in the SuperIO regs to > let you have access to that bank of UART regs). > > I think IrDA ports do FIR mode in a similar fashion too. > > But what I mean by the above quote is that for platforms where we _do_ > have actual control of the hardware and we want to poke at superio, it > should be a superio pnp/bus driver rather than a scattering of superio > probes through various other (8250/parport/irda/floppy) places. Just put that code in #ifndef CONFIG_PPC then Ben. 
From dwmw2 at infradead.org Sun Dec 11 11:12:44 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Sun, 11 Dec 2005 01:12:44 +0100 Subject: [PATCH] Make parport_pc use check_legacy_ioport() In-Reply-To: <1134258589.6989.41.camel@gaston> References: <1134175145.19711.167.camel@localhost.localdomain> <1134193949.6989.8.camel@gaston> <20051210205422.4660d802.mikey@neuling.org> <1134230437.19711.173.camel@localhost.localdomain> <1134251394.6989.21.camel@gaston> <1134257800.19711.213.camel@localhost.localdomain> <1134258589.6989.41.camel@gaston> Message-ID: <1134259964.19711.228.camel@localhost.localdomain> On Sun, 2005-12-11 at 10:49 +1100, Benjamin Herrenschmidt wrote: > Just put that code in #ifndef CONFIG_PPC then CONFIG_PPC_OF perhaps; embedded boards where we actually have control of the hardware might still want it, but we ought to be able to trust OpenFirmware to set the machine up optimally.... I think. For Winbond chips where we flip a bit in the superio regs to change the baud_base to 921600, that's OK because OF can just do that and change the appropriate property in the device-tree too. This assumes a kernel which is actually new enough to be looking for that property; older kernels which just assume baud_base is 115200 will screw up. For NatSemi chips where the bit in the superio regs merely enables access to the extra bank of uart regs in which the high-speed stuff is enabled, that's fine too -- the 8250 driver already detects those _if_ they're enabled. The only one I'm not sure about is the SMSC chips with the 'magic multipliers' -- setting a divisor of 0x8001 or 0x8002 to get 115200*2 and 115200*4 respectively. I'm not entirely sure how the kernel can tell _either_ that it's an SMSC chip or that the magic multipliers are actually enabled on it, without looking at the SuperIO regs for itself. That covers what we need from SuperIO for the UARTs on such chips -- I'm less sure about what we need for parport, IrDA, etc. 
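The divisor arithmetic behind the SMSC 'magic multipliers' can be sketched as follows. This is only an illustration, assuming the usual 8250 relation baud = baud_base / divisor with a baud_base of 115200, and assuming 0x8001 and 0x8002 are the only magic values. The function name and the magic_enabled flag are hypothetical; how the kernel would learn the value of that flag without reading the SuperIO regs is exactly the open question above.

```c
#include <assert.h>

/* Assumed standard PC UART clock after the fixed /16 prescaler. */
#define SKETCH_BAUD_BASE 115200UL

/*
 * Hypothetical helper: return the line rate produced by a 16-bit divisor
 * latch value. magic_enabled models "the SuperIO has the SMSC magic
 * multipliers switched on", which the 8250 driver cannot see on its own.
 * Returns 0 for a divisor of 0, which is not a usable setting.
 */
static unsigned long sketch_smsc_baud(unsigned int divisor, int magic_enabled)
{
	/* The divisor latch is two 8-bit registers, DLL and DLM. */
	assert(divisor <= 0xffff);

	if (magic_enabled && divisor == 0x8001)
		return SKETCH_BAUD_BASE * 2;	/* 230400 */
	if (magic_enabled && divisor == 0x8002)
		return SKETCH_BAUD_BASE * 4;	/* 460800 */
	if (divisor == 0)
		return 0;

	/* Ordinary 8250 behaviour: plain integer division of the base rate. */
	return SKETCH_BAUD_BASE / divisor;
}
```

The sketch also shows why the flag matters: with the multipliers disabled, writing 0x8001 is just a very large ordinary divisor and yields a rate of a few baud rather than 230400.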
-- dwmw2 From segher at kernel.crashing.org Sun Dec 11 07:04:13 2005 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Sat, 10 Dec 2005 21:04:13 +0100 Subject: Booting OS on PowerPC In-Reply-To: References: Message-ID: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org> >> Can any body tell me how Linux boot on PowerPC machine >> when Open Firmware is up. To be more preciese, what is >> the "known-environment" that the OS expect from Open >> Firmware. > > It expects to be (1) in 32-bit mode, (2) r3, r4 0, (3) r5 is a pointer > to the client interface callback, (4) r1 is a usable stack, (5) image > is loaded according to elf-header of zImage wrapper. Not quite right, but close... 32-bit mode: if the ELF file OF loaded is 32-bit, it will be started in 32-bit mode; if instead it is a 64-bit ELF file, it will be started in 64-bit mode. Analogous for other binary formats. GPR1 points to an initialized stack, at least 32kB in size. GPR2 is zero. GPR3 and GPR4 are reserved for the platform binding. GPR5 is the client interface entry point. GPR6 and GPR7 are the address and the length of an array of bytes, the use of which is platform dependent, typically some boot parameters (although the normal "kernel command line string" is passed in the "bootargs" property in /chosen, instead). Segher From paulus at samba.org Sun Dec 11 22:00:53 2005 From: paulus at samba.org (Paul Mackerras) Date: Sun, 11 Dec 2005 22:00:53 +1100 Subject: Booting OS on PowerPC In-Reply-To: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org> References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org> Message-ID: <17308.1765.869950.545612@cargo.ozlabs.ibm.com> Segher Boessenkool writes: > > It expects to be (1) in 32-bit mode, (2) r3, r4 0, (3) r5 is a pointer > > to the client interface callback, (4) r1 is a usable stack, (5) image > > is loaded according to elf-header of zImage wrapper. > > Not quite right, but close... Milton's reply was correct for pSeries (PAPR) machines. 
Paul. From segher at kernel.crashing.org Sun Dec 11 22:40:46 2005 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Sun, 11 Dec 2005 12:40:46 +0100 Subject: Booting OS on PowerPC In-Reply-To: <17308.1765.869950.545612@cargo.ozlabs.ibm.com> References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org> <17308.1765.869950.545612@cargo.ozlabs.ibm.com> Message-ID: >>> It expects to be (1) in 32-bit mode, (2) r3, r4 0, (3) r5 is a >>> pointer >>> to the client interface callback, (4) r1 is a usable stack, (5) image >>> is loaded according to elf-header of zImage wrapper. >> >> Not quite right, but close... > > Milton's reply was correct for pSeries (PAPR) machines. Perhaps I should apologize for making it sound like his answer was fully incorrect; that was unintended, I meant more something like "incomplete". Anyway... The question didn't say PAPR, it even said PowerPC :-) But I just checked, and the PAPR also doesn't require GPR3 and GPR4 to be zero. I also checked the kernel (the converged arch/powerpc one); GPR6 and GPR7 are unused. GPR3 and GPR4 are used for passing an initrd, which is used by yaboot (at least by my ancient 1.3.12 copy). This mechanism is a bit fragile (relying on GPR4 never to be 0xdeadbeef when GPR3 and GPR4 are not an initrd; and the real vs. virtual hack); perhaps yaboot should set the linux,initrd-* properties in /chosen itself? So indeed, if you're writing a new bootloader / client interface, it is safest to set GPR3 and GPR4 to 0. 
Apologies for not double checking every documentation and all the source code, Segher From paulus at samba.org Mon Dec 12 09:20:16 2005 From: paulus at samba.org (Paul Mackerras) Date: Mon, 12 Dec 2005 09:20:16 +1100 Subject: Booting OS on PowerPC In-Reply-To: References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org> <17308.1765.869950.545612@cargo.ozlabs.ibm.com> Message-ID: <17308.42528.789943.633949@cargo.ozlabs.ibm.com> Segher Boessenkool writes: > Perhaps I should apologize for making it sound like his answer > was fully incorrect; that was unintended, I meant more something > like "incomplete". > > Anyway... The question didn't say PAPR, it even said PowerPC :-) > But I just checked, and the PAPR also doesn't require GPR3 and > GPR4 to be zero. It does, in section C.10.2. Anyway, I was just trying to clarify which domain Milton's answer related to, and that it was correct for that domain. Part of the confusion comes from the fact that we don't have a single transfer of control (i.e. from OF to the kernel); we have at least two (OF -> yaboot -> kernel or OF -> zImage -> kernel) and possibly three (OF -> yaboot -> zImage -> kernel), and slightly different conventions can apply at each transfer. > I also checked the kernel (the converged arch/powerpc one); GPR6 > and GPR7 are unused. GPR3 and GPR4 are used for passing an initrd, > which is used by yaboot (at least by my ancient 1.3.12 copy). This > mechanism is a bit fragile (relying on GPR4 never to be 0xdeadbeef > when GPR3 and GPR4 are not an initrd; and the real vs. virtual hack); > perhaps yaboot should set the linux,initrd-* properties in /chosen > itself? That would be a good idea. Paul. 
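For readers following the GPR3/GPR4 discussion, here is a minimal sketch of the initrd hand-off as described in this thread. It assumes GPR3 carries the initrd start address and GPR4 its size, that 0xdeadbeef in GPR4 means "no initrd", and that an address at or above a kernel base is virtual and must be converted to real. The names, the base value and the struct are hypothetical and are not taken from prom_init.c.

```c
#include <assert.h>

/* Illustrative values only; the real kernel base depends on the platform. */
#define SKETCH_NO_INITRD  0xdeadbeefUL
#define SKETCH_KERNELBASE 0xc0000000UL

struct sketch_initrd {
	unsigned long start;	/* real address of the first byte, 0 if none */
	unsigned long end;	/* real address one past the last byte */
};

/* The "real vs. virtual hack": accept either kind of address from the loader. */
static unsigned long sketch_to_real(unsigned long addr)
{
	return addr >= SKETCH_KERNELBASE ? addr - SKETCH_KERNELBASE : addr;
}

/* Decide whether the entry registers describe an initrd, and where it is. */
static struct sketch_initrd sketch_initrd_from_regs(unsigned long r3,
						    unsigned long r4)
{
	struct sketch_initrd rd = { 0, 0 };

	/* Zeroed registers or the sentinel in r4 both mean "nothing passed". */
	if (r3 != 0 && r4 != 0 && r4 != SKETCH_NO_INITRD) {
		rd.start = sketch_to_real(r3);
		rd.end = rd.start + r4;
		assert(rd.end > rd.start);	/* size must not wrap around */
	}
	return rd;
}
```

The fragility mentioned above is visible here: any caller that leaves arbitrary non-zero values in both registers is taken to be passing an initrd, which is why zeroing GPR3 and GPR4 is the safe choice for a new bootloader, and why passing the range through properties in /chosen would be more robust.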
From miltonm at bga.com Mon Dec 12 11:18:08 2005 From: miltonm at bga.com (Milton Miller) Date: Sun, 11 Dec 2005 18:18:08 -0600 Subject: Booting OS on PowerPC In-Reply-To: <17308.42528.789943.633949@cargo.ozlabs.ibm.com> References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org> <17308.1765.869950.545612@cargo.ozlabs.ibm.com> <17308.42528.789943.633949@cargo.ozlabs.ibm.com> Message-ID: <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com> On Dec 11, 2005, at 4:20 PM, Paul Mackerras wrote: >> I also checked the kernel (the converged arch/powerpc one); GPR6 >> and GPR7 are unused. GPR3 and GPR4 are used for passing an initrd, >> which is used by yaboot (at least by my ancient 1.3.12 copy). This >> mechanism is a bit fragile (relying on GPR4 never to be 0xdeadbeef >> when GPR3 and GPR4 are not an initrd; and the real vs. virtual hack); >> perhaps yaboot should set the linux,initrd-* properties in /chosen >> itself? > > That would be a good idea. > Except I am guessing that our "don't overwrite the initrd with the device tree" logic might break. In other words, prom_init.c might need updating. milton From paulus at samba.org Mon Dec 12 14:41:57 2005 From: paulus at samba.org (Paul Mackerras) Date: Mon, 12 Dec 2005 14:41:57 +1100 Subject: please pull powerpc-merge.git Message-ID: <17308.61829.788560.203752@cargo.ozlabs.ibm.com> Linus, Please pull git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge.git There are a handful of commits there that all should go in for 2.6.15. Thanks, Paul. 
arch/powerpc/Kconfig | 2 - arch/powerpc/kernel/setup_64.c | 10 +++ arch/powerpc/mm/hash_utils_64.c | 2 - arch/powerpc/mm/hugetlbpage.c | 95 ++++++++++++++++++++++++----- arch/powerpc/mm/numa.c | 2 - arch/powerpc/mm/stab.c | 7 -- arch/powerpc/platforms/powermac/feature.c | 21 +++++- arch/powerpc/platforms/pseries/iommu.c | 11 ++- arch/powerpc/platforms/pseries/lpar.c | 12 ---- arch/ppc/Kconfig | 6 +- arch/ppc/kernel/smp.c | 4 + arch/ppc/platforms/pmac_feature.c | 20 +++++- drivers/macintosh/windfarm_pm81.c | 4 + include/asm-powerpc/mmu.h | 3 + 14 files changed, 139 insertions(+), 60 deletions(-) Benjamin Herrenschmidt: powerpc: Fix a huge page bug powerpc: Remove debug code in hash path powerpc: Fix clock spreading setting on some powermacs David Gibson: powerpc: Add missing icache flushes for hugepages powerpc: Fix SLB flushing path in hugepage Johannes Berg: ppc32: set smp_tb_synchronized on UP with SMP kernel Michal Ostrowski: powerpc/pseries: Fix TCE building with 64k pagesize Fix windfarm model-id table Mike Kravetz: powerpc/pseries: boot failures on numa if no memory on node Olaf Hering: powerpc: correct the NR_CPUS description text Olof Johansson: powerpc: remove redundant code in stab init powerpc: Set cache info defaults Paul Mackerras: powerpc/pseries: Optimize IOMMU setup ppc: Build in all three of powermac, PREP and CHRP support From segher at kernel.crashing.org Mon Dec 12 21:14:03 2005 From: segher at kernel.crashing.org (Segher Boessenkool) Date: Mon, 12 Dec 2005 11:14:03 +0100 Subject: Booting OS on PowerPC In-Reply-To: <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com> References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org> <17308.1765.869950.545612@cargo.ozlabs.ibm.com> <17308.42528.789943.633949@cargo.ozlabs.ibm.com> <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com> Message-ID: <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org> > Except I am guessing that our "don't overwrite the initrd with the > device tree" logic might break. 
> In other words, prom_init.c might
> need updating.

Yeah -- there needs to be a transition period where prom_init.c
accepts both ways, with a warning to update yaboot if needed.

On the other hand, do people find external initrd's useful at
all still, or is everyone using built-in initramfs? If so, we
could just deprecate initrd completely.

Segher

From olh at suse.de Mon Dec 12 23:09:20 2005
From: olh at suse.de (Olaf Hering)
Date: Mon, 12 Dec 2005 13:09:20 +0100
Subject: Booting OS on PowerPC
In-Reply-To: <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org>
References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org>
 <17308.1765.869950.545612@cargo.ozlabs.ibm.com>
 <17308.42528.789943.633949@cargo.ozlabs.ibm.com>
 <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com>
 <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org>
Message-ID: <20051212120920.GC12877@suse.de>

On Mon, Dec 12, Segher Boessenkool wrote:

> On the other hand, do people find external initrd's useful at
> all still, or is everyone using built-in initramfs? If so, we
> could just deprecate initrd completely.

You mean the initrd should be put into a .init.ramfs section?
Can objcopy safely add and remove such a section at any time?

--
short story of a lazy sysadmin:
 alias appserv=wotan

From miltonm at bga.com Tue Dec 13 03:41:07 2005
From: miltonm at bga.com (Milton Miller)
Date: Mon, 12 Dec 2005 10:41:07 -0600
Subject: Booting OS on PowerPC
In-Reply-To: <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org>
References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org>
 <17308.1765.869950.545612@cargo.ozlabs.ibm.com>
 <17308.42528.789943.633949@cargo.ozlabs.ibm.com>
 <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com>
 <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org>
Message-ID: <7afa30f0a08d75b3d50555516c88b969@bga.com>

On Dec 12, 2005, at 4:14 AM, Segher Boessenkool wrote:

>> Except I am guessing that our "don't overwrite the initrd with the
>> device tree" logic might break.
>> In other words, prom_init.c might
>> need updating.
>
> Yeah -- there needs to be a transition period where prom_init.c
> accepts both ways, with a warning to update yaboot if needed.
>
> On the other hand, do people find external initrd's useful at
> all still, or is everyone using built-in initramfs? If so, we
> could just deprecate initrd completely.

Perhaps you didn't realize that external initramfs pieces are passed
to the kernel via the initrd mechanism?

One of the key features of initramfs is that it can be built from
smaller pieces. The kernel puts them all together, then looks at the
result. I use this feature daily to piece together a base layout with
customizations for the use of the machine. I even include an rc file
that is specific to a given machine.

Perhaps a better question is: does anyone build a custom initramfs
into their kernel, beyond those replacing the built-in functionality
(prepare_namespace) with userspace?

milton

From miltonm at bga.com Tue Dec 13 03:45:11 2005
From: miltonm at bga.com (Milton Miller)
Date: Mon, 12 Dec 2005 10:45:11 -0600
Subject: Booting OS on PowerPC
In-Reply-To: <20051212120920.GC12877@suse.de>
References: <748d9cf2d69508bb9fc7b5b4881fb1a0@kernel.crashing.org>
 <17308.1765.869950.545612@cargo.ozlabs.ibm.com>
 <17308.42528.789943.633949@cargo.ozlabs.ibm.com>
 <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com>
 <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org>
 <20051212120920.GC12877@suse.de>
Message-ID:

On Dec 12, 2005, at 6:09 AM, Olaf Hering wrote:

> On Mon, Dec 12, Segher Boessenkool wrote:
>
>> On the other hand, do people find external initrd's useful at
>> all still, or is everyone using built-in initramfs? If so, we
>> could just deprecate initrd completely.
>
> You mean the initrd should be put into a .init.ramfs section?
> Can objcopy safely add and remove such a section at any time?
>

Any time before link, sure.
But to actually boot a kernel, you would need to rerun the link, as all
the references to locations past the beginning of the initramfs would be
broken and would need a pass of ld to fix [1].

So the bottom-line answer: NO.

milton

[1] unless of course the new section is exactly the same size as the one
being replaced.

From miltonm at bga.com Tue Dec 13 03:52:00 2005
From: miltonm at bga.com (Milton Miller)
Date: Mon, 12 Dec 2005 10:52:00 -0600
Subject: Booting OS on PowerPC
In-Reply-To: <200512121217.46799.rene@exactcode.de>
References: <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com>
 <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org>
 <200512121217.46799.rene@exactcode.de>
Message-ID:

On Dec 12, 2005, at 5:17 AM, René Rebe wrote:
>
> I prefer external initrd's since they are easier to regenerate
> e.g. on u/dev update and so on. I rather would not like to see
> that support be removed - I hope other distributors agree ,-)
>

External initrd? or externally loaded initramfs?

I personally find the modularity of the separate cpio pieces to be
a big win; it means I don't have to keep rebuilding an image, I can
just replace the piece that changed.

The only downside to initramfs vs initrd I have observed is that you
don't have a nice "unmount the fs and everything disappears" cleanup.

[ There are a few gotchas --- symlinks are there once established,
files can only be overwritten and extended, not replaced, but for clean
packages this is not an issue ]

milton

From brian.jewell at themis.com Tue Dec 13 03:35:19 2005
From: brian.jewell at themis.com (brian jewell)
Date: Mon, 12 Dec 2005 08:35:19 -0800
Subject: Validation and/or benchmarking the PPC64
Message-ID:

Hi,

The company that I work for is building a PPC970-based single board
computer product. One of the issues to resolve is providing a means of
doing hardware validation (e.g., QA) on the SBC before it is released.
I was wondering if anyone knows of any Linux-based test tools that might
be used to exercise the hardware on the PPC970 board? The hardware design
is based largely on the IBM Maple reference board - only the footprint is
much smaller.

Thanks in advance for any suggestions.

--Brian Jewell
--Themis Computer

From linas at austin.ibm.com Tue Dec 13 04:28:28 2005
From: linas at austin.ibm.com (linas)
Date: Mon, 12 Dec 2005 11:28:28 -0600
Subject: [PATCH]: powerpc: hugepage compile break in latest 2.6.15-rc5-mm2
Message-ID: <20051212172828.GC10037@austin.ibm.com>

Hugepage doesn't compile out-of-the-box on linux-2.6.15-rc5-mm2.
This patch fixes the compile breakage, but the patch authors
may want to double-check that the rest of the patch series
has applied correctly.
Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/arch/powerpc/mm/hugetlbpage.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/arch/powerpc/mm/hugetlbpage.c 2005-12-12 10:54:50.430595723 -0600 +++ linux-2.6.15-rc5-mm2/arch/powerpc/mm/hugetlbpage.c 2005-12-12 11:12:08.621525777 -0600 @@ -280,7 +280,6 @@ { struct slb_flush_info fi; unsigned long i; - struct slb_flush_info fi; BUILD_BUG_ON((sizeof(newareas)*8) != NUM_HIGH_AREAS); BUILD_BUG_ON((sizeof(mm->context.high_htlb_areas)*8) @@ -621,7 +620,6 @@ { int lastshift; u16 areamask, curareas; - struct vm_area_struct *vma; if (HPAGE_SHIFT == 0) return -EINVAL; From anton at samba.org Tue Dec 13 04:51:58 2005 From: anton at samba.org (Anton Blanchard) Date: Tue, 13 Dec 2005 04:51:58 +1100 Subject: [PATCH] ppc64: HVC init race Message-ID: <20051212175158.GH23641@krispykreme> From: Michael Neuling I've been hitting a crash on boot where tty_open is being called before the hvc console driver setup is complete. Below patch fixes this problem. Thanks to benh for his help on this. Signed-off-by: Michael Neuling Acked-by: Anton Blanchard --- Index: gr_work/drivers/char/hvc_console.c =================================================================== --- gr_work.orig/drivers/char/hvc_console.c 2005-12-05 15:02:20.348550735 -0600 +++ gr_work/drivers/char/hvc_console.c 2005-12-05 15:06:38.336028262 -0600 @@ -823,34 +823,38 @@ * interfaces start to become available. */ int __init hvc_init(void) { + struct tty_driver *drv; + /* We need more than hvc_count adapters due to hotplug additions. 
*/ - hvc_driver = alloc_tty_driver(HVC_ALLOC_TTY_ADAPTERS); - if (!hvc_driver) + drv = alloc_tty_driver(HVC_ALLOC_TTY_ADAPTERS); + if (!drv) return -ENOMEM; - hvc_driver->owner = THIS_MODULE; - hvc_driver->devfs_name = "hvc/"; - hvc_driver->driver_name = "hvc"; - hvc_driver->name = "hvc"; - hvc_driver->major = HVC_MAJOR; - hvc_driver->minor_start = HVC_MINOR; - hvc_driver->type = TTY_DRIVER_TYPE_SYSTEM; - hvc_driver->init_termios = tty_std_termios; - hvc_driver->flags = TTY_DRIVER_REAL_RAW; - tty_set_operations(hvc_driver, &hvc_ops); + drv->owner = THIS_MODULE; + drv->devfs_name = "hvc/"; + drv->driver_name = "hvc"; + drv->name = "hvc"; + drv->major = HVC_MAJOR; + drv->minor_start = HVC_MINOR; + drv->type = TTY_DRIVER_TYPE_SYSTEM; + drv->init_termios = tty_std_termios; + drv->flags = TTY_DRIVER_REAL_RAW; + tty_set_operations(drv, &hvc_ops); /* Always start the kthread because there can be hotplug vty adapters * added later. */ hvc_task = kthread_run(khvcd, NULL, "khvcd"); if (IS_ERR(hvc_task)) { panic("Couldn't create kthread for console.\n"); - put_tty_driver(hvc_driver); + put_tty_driver(drv); return -EIO; } - if (tty_register_driver(hvc_driver)) + if (tty_register_driver(drv)) panic("Couldn't register hvc console driver\n"); + mb(); + hvc_driver = drv; return 0; } module_init(hvc_init); From rene at exactcode.de Mon Dec 12 22:17:42 2005 From: rene at exactcode.de (=?iso-8859-1?q?Ren=E9_Rebe?=) Date: Mon, 12 Dec 2005 12:17:42 +0100 Subject: Booting OS on PowerPC In-Reply-To: <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org> References: <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com> <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org> Message-ID: <200512121217.46799.rene@exactcode.de> Hi, On Monday 12 December 2005 11:14, Segher Boessenkool wrote: > > Except I am guessing that our "don't overwrite the initrd with the > > device tree" logic might break. In other words, prom_init.c might > > need updating. 
>
> Yeah -- there needs to be a transition period where prom_init.c
> accepts both ways, with a warning to update yaboot if needed.
>
> On the other hand, do people find external initrd's useful at
> all still, or is everyone using built-in initramfs? If so, we
> could just deprecate initrd completely.

I prefer external initrd's since they are easier to regenerate
e.g. on u/dev update and so on. I rather would not like to see
that support be removed - I hope other distributors agree ,-)

Yours,

--
René Rebe - Rubensstr. 64 - 12157 Berlin (Europe / Germany)
http://www.exactcode.de | http://www.t2-project.org
+49 (0)30 255 897 45

From anton at samba.org Tue Dec 13 06:56:47 2005
From: anton at samba.org (Anton Blanchard)
Date: Tue, 13 Dec 2005 06:56:47 +1100
Subject: [PATCH] ppc64: Add NUMA cpu summary at boot
Message-ID: <20051212195647.GI23641@krispykreme>

We used to print a NUMA cpu summary at boot before the hotplug cpu code
was added. This has been useful for catching machine configuration as well
as firmware bugs in the past. This patch restores that functionality.
An example of the output is: Node 0 CPUs: 0-7 Node 1 CPUs: 8-15 Signed-off-by: Anton Blanchard --- Index: build/arch/powerpc/mm/numa.c =================================================================== --- build.orig/arch/powerpc/mm/numa.c 2005-12-13 06:19:08.000000000 +1100 +++ build/arch/powerpc/mm/numa.c 2005-12-13 06:22:30.000000000 +1100 @@ -497,7 +497,41 @@ node_set_online(0); } -static void __init dump_numa_topology(void) +void __init dump_numa_cpu_topology(void) +{ + unsigned int node; + unsigned int cpu, count; + + if (min_common_depth == -1 || !numa_enabled) + return; + + for_each_online_node(node) { + printk(KERN_INFO "Node %d CPUs:", node); + + count = 0; + /* + * If we used a CPU iterator here we would miss printing + * the holes in the cpumap. + */ + for (cpu = 0; cpu < NR_CPUS; cpu++) { + if (cpu_isset(cpu, numa_cpumask_lookup_table[node])) { + if (count == 0) + printk(" %u", cpu); + ++count; + } else { + if (count > 1) + printk("-%u", cpu - 1); + count = 0; + } + } + + if (count > 1) + printk("-%u", NR_CPUS - 1); + printk("\n"); + } +} + +static void __init dump_numa_memory_topology(void) { unsigned int node; unsigned int count; @@ -529,7 +563,6 @@ printk("-0x%lx", i); printk("\n"); } - return; } /* @@ -591,7 +624,7 @@ if (parse_numa_properties()) setup_nonnuma(); else - dump_numa_topology(); + dump_numa_memory_topology(); register_cpu_notifier(&ppc64_numa_nb); Index: build/arch/powerpc/kernel/smp.c =================================================================== --- build.orig/arch/powerpc/kernel/smp.c 2005-11-20 15:01:36.000000000 +1100 +++ build/arch/powerpc/kernel/smp.c 2005-12-13 06:22:51.000000000 +1100 @@ -31,6 +31,7 @@ #include #include #include +#include #include #include @@ -554,6 +555,8 @@ smp_ops->setup_cpu(boot_cpuid); set_cpus_allowed(current, old_mask); + + dump_numa_cpu_topology(); } #ifdef CONFIG_HOTPLUG_CPU Index: build/include/asm-powerpc/topology.h =================================================================== --- 
build.orig/include/asm-powerpc/topology.h 2005-11-20 15:01:36.000000000 +1100 +++ build/include/asm-powerpc/topology.h 2005-12-13 06:22:30.000000000 +1100 @@ -55,8 +55,12 @@ .nr_balance_failed = 0, \ } +extern void __init dump_numa_cpu_topology(void); + #else +static inline void dump_numa_cpu_topology(void) {} + #include #endif /* CONFIG_NUMA */ From anton at samba.org Tue Dec 13 07:45:33 2005 From: anton at samba.org (Anton Blanchard) Date: Tue, 13 Dec 2005 07:45:33 +1100 Subject: [PATCH] powerpc: Dont set 32bit cputable bits on 64bit Message-ID: <20051212204532.GJ23641@krispykreme> Milton and I were looking at the cputable code and it looks like we can set spurious bits on 64bit. Signed-off-by: Anton Blanchard --- Index: build/include/asm-powerpc/cputable.h =================================================================== --- build.orig/include/asm-powerpc/cputable.h 2005-11-20 07:31:45.000000000 +1100 +++ build/include/asm-powerpc/cputable.h 2005-12-13 07:39:34.000000000 +1100 @@ -311,6 +311,11 @@ #endif CPU_FTRS_POSSIBLE = +#ifdef __powerpc64__ + CPU_FTRS_POWER3 | CPU_FTRS_RS64 | CPU_FTRS_POWER4 | + CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | CPU_FTRS_CELL | + CPU_FTR_CI_LARGE_PAGE | +#else #if CLASSIC_PPC CPU_FTRS_PPC601 | CPU_FTRS_603 | CPU_FTRS_604 | CPU_FTRS_740_NOTAU | CPU_FTRS_740 | CPU_FTRS_750 | CPU_FTRS_750FX1 | @@ -344,14 +349,14 @@ #ifdef CONFIG_E500 CPU_FTRS_E500 | CPU_FTRS_E500_2 | #endif -#ifdef __powerpc64__ - CPU_FTRS_POWER3 | CPU_FTRS_RS64 | CPU_FTRS_POWER4 | - CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | CPU_FTRS_CELL | - CPU_FTR_CI_LARGE_PAGE | -#endif +#endif /* __powerpc64__ */ 0, CPU_FTRS_ALWAYS = +#ifdef __powerpc64__ + CPU_FTRS_POWER3 & CPU_FTRS_RS64 & CPU_FTRS_POWER4 & + CPU_FTRS_PPC970 & CPU_FTRS_POWER5 & CPU_FTRS_CELL & +#else #if CLASSIC_PPC CPU_FTRS_PPC601 & CPU_FTRS_603 & CPU_FTRS_604 & CPU_FTRS_740_NOTAU & CPU_FTRS_740 & CPU_FTRS_750 & CPU_FTRS_750FX1 & @@ -385,10 +390,7 @@ #ifdef CONFIG_E500 CPU_FTRS_E500 & CPU_FTRS_E500_2 & #endif -#ifdef 
__powerpc64__ - CPU_FTRS_POWER3 & CPU_FTRS_RS64 & CPU_FTRS_POWER4 & - CPU_FTRS_PPC970 & CPU_FTRS_POWER5 & CPU_FTRS_CELL & -#endif +#endif /* __powerpc64__ */ CPU_FTRS_POSSIBLE, }; From anton at samba.org Tue Dec 13 07:56:54 2005 From: anton at samba.org (Anton Blanchard) Date: Tue, 13 Dec 2005 07:56:54 +1100 Subject: [PATCH] powerpc: Remove old comment in head.S Message-ID: <20051212205654.GK23641@krispykreme> Remove a comment in head.S which is no longer relevant. Signed-off-by: Anton Blanchard --- Index: gr_work/arch/powerpc/kernel/head_64.S =================================================================== --- gr_work.orig/arch/powerpc/kernel/head_64.S 2005-12-05 21:15:02.577178123 -0600 +++ gr_work/arch/powerpc/kernel/head_64.S 2005-12-05 21:16:31.656616438 -0600 @@ -1855,7 +1855,7 @@ mulli r13,r27,PACA_SIZE /* Calculate vaddr of right paca */ add r13,r13,r24 /* for this processor. */ add r13,r13,r26 /* convert to physical addr */ - mtspr SPRN_SPRG3,r13 /* PPPBBB: Temp... -Peter */ + mtspr SPRN_SPRG3,r13 /* Do very early kernel initializations, including initial hash table, * stab and slb setup before we turn on relocation. */ From olof at lixom.net Tue Dec 13 08:00:55 2005 From: olof at lixom.net (Olof Johansson) Date: Mon, 12 Dec 2005 15:00:55 -0600 Subject: [PATCH] powerpc: Dont set 32bit cputable bits on 64bit In-Reply-To: <20051212204532.GJ23641@krispykreme> References: <20051212204532.GJ23641@krispykreme> Message-ID: <20051212210055.GA7603@pb15.lixom.net> On Tue, Dec 13, 2005 at 07:45:33AM +1100, Anton Blanchard wrote: > > Milton and I were looking at the cputable code and it looks like we can > set spurious bits on 64bit. 
>
> Signed-off-by: Anton Blanchard
> ---
>
> Index: build/include/asm-powerpc/cputable.h
> ===================================================================
> --- build.orig/include/asm-powerpc/cputable.h 2005-11-20 07:31:45.000000000 +1100
> +++ build/include/asm-powerpc/cputable.h 2005-12-13 07:39:34.000000000 +1100
> @@ -311,6 +311,11 @@
>  #endif
>
>  	CPU_FTRS_POSSIBLE =
> +#ifdef __powerpc64__

I know they were just moved, but should we use CONFIG_PPC64 instead to
keep it similar with the other CONFIG_* tests?

-Olof

From benh at kernel.crashing.org Tue Dec 13 08:22:24 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Tue, 13 Dec 2005 08:22:24 +1100
Subject: Booting OS on PowerPC
In-Reply-To:
References: <05b51d23974a87a1f2bb96f0b4fea0cb@bga.com>
 <2086a7d9f0192a0bca79e17680c2ba1e@kernel.crashing.org>
 <200512121217.46799.rene@exactcode.de>
Message-ID: <1134422545.6989.123.camel@gaston>

On Mon, 2005-12-12 at 10:52 -0600, Milton Miller wrote:
> On Dec 12, 2005, at 5:17 AM, René Rebe wrote:
> >
> > I prefer external initrd's since they are easier to regenerate
> > e.g. on u/dev update and so on. I rather would not like to see
> > that support be removed - I hope other distributors agree ,-)
> >
>
> External initrd? or externally loaded initramfs?

Another reason why I'll veto attempts to remove support for external
initrd or initramfs is that there are still recurring issues with old
machines when it comes to loading really large images.

Ben.

From anton at samba.org Tue Dec 13 08:29:40 2005
From: anton at samba.org (Anton Blanchard)
Date: Tue, 13 Dec 2005 08:29:40 +1100
Subject: [PATCH] powerpc: Dont set 32bit cputable bits on 64bit
In-Reply-To: <20051212210055.GA7603@pb15.lixom.net>
References: <20051212204532.GJ23641@krispykreme>
 <20051212210055.GA7603@pb15.lixom.net>
Message-ID: <20051212212940.GL23641@krispykreme>

> I know they were just moved, but should we use CONFIG_PPC64 instead to
> keep it similar with the other CONFIG_* tests?
I originally thought that then I noticed __powerpc64__ is pretty widespread, in fact it outnumbers the number of CONFIG_PPC64 uses in include/asm-powerpc :) Either way Im OK. Anton From olof at lixom.net Tue Dec 13 08:33:14 2005 From: olof at lixom.net (Olof Johansson) Date: Mon, 12 Dec 2005 15:33:14 -0600 Subject: [PATCH] powerpc: Dont set 32bit cputable bits on 64bit In-Reply-To: <20051212212940.GL23641@krispykreme> References: <20051212204532.GJ23641@krispykreme> <20051212210055.GA7603@pb15.lixom.net> <20051212212940.GL23641@krispykreme> Message-ID: <20051212213314.GB7603@pb15.lixom.net> On Tue, Dec 13, 2005 at 08:29:40AM +1100, Anton Blanchard wrote: > > > I know they were just moved, but should we use CONFIG_PPC64 instead to > > keep it similar with the other CONFIG_* tests? > > I originally thought that then I noticed __powerpc64__ is pretty > widespread, in fact it outnumbers the number of CONFIG_PPC64 uses in > include/asm-powerpc :) Either way Im OK. Interesting. Ok, nevermind then. :) -Olof From arnd at arndb.de Tue Dec 13 09:28:56 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Mon, 12 Dec 2005 23:28:56 +0100 Subject: [PATCH] powerpc: Don't use CONFIG_PPC64 in user-visible header files In-Reply-To: <20051212213314.GB7603@pb15.lixom.net> References: <20051212204532.GJ23641@krispykreme> <20051212212940.GL23641@krispykreme> <20051212213314.GB7603@pb15.lixom.net> Message-ID: <200512122328.56724.arnd@arndb.de> I have checked the usage of CONFIG_PPC64 vs. __powerpc64__ and found that at least io.h and vdso_datapage.h get it wrong by using CONFIG_PPC64 outside of #ifdef __KERNEL__. This will cause any user space application that (probably without a good reason) includes these files gets the ppc32 version even if compiling for 64 bits. I originally argued that we should only use __powerpc64__ in header files for just this reason. 
The other files that do use CONFIG_PPC64 outside of __KERNEL__ are dma-mapping.h, mmu_context.h, mmu.h, pci-bridge.h, pgalloc.h, pgtable.h, pmc.h, ppc_asm.h and spinlock.h. However, including one of these files in user space is already impossible, so the only thing that might help there would be to wrap them completely in #ifdef __KERNEL__. Signed-off-by: Arnd Bergmann --- Am Montag 12 Dezember 2005 22:33 schrieb Olof Johansson: > > ? > > > > > I know they were just moved, but should we use CONFIG_PPC64 instead to > > > keep it similar with the other CONFIG_* tests? > > > > I originally thought that then I noticed __powerpc64__ is pretty > > widespread, in fact it outnumbers the number of CONFIG_PPC64 uses in > > include/asm-powerpc :) Either way Im OK. > > Interesting. Ok, nevermind then. :) diff --git a/include/asm-powerpc/io.h b/include/asm-powerpc/io.h index 48938d8..c8d075a 100644 --- a/include/asm-powerpc/io.h +++ b/include/asm-powerpc/io.h @@ -8,7 +8,7 @@ * 2 of the License, or (at your option) any later version. 
*/ -#ifndef CONFIG_PPC64 +#ifndef __powerpc64__ #include #else diff --git a/include/asm-powerpc/vdso_datapage.h b/include/asm-powerpc/vdso_datapage.h index 411832d..f8fece7 100644 --- a/include/asm-powerpc/vdso_datapage.h +++ b/include/asm-powerpc/vdso_datapage.h @@ -45,7 +45,7 @@ * So here is the ppc64 backward compatible version */ -#ifdef CONFIG_PPC64 +#ifdef __powerpc64__ struct vdso_data { __u8 eye_catcher[16]; /* Eyecatcher: SYSTEMCFG:PPC64 0x00 */ From benh at kernel.crashing.org Tue Dec 13 09:50:01 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 09:50:01 +1100 Subject: [PATCH] powerpc: Don't use CONFIG_PPC64 in user-visible header files In-Reply-To: <200512122328.56724.arnd@arndb.de> References: <20051212204532.GJ23641@krispykreme> <20051212212940.GL23641@krispykreme> <20051212213314.GB7603@pb15.lixom.net> <200512122328.56724.arnd@arndb.de> Message-ID: <1134427802.6989.128.camel@gaston> On Mon, 2005-12-12 at 23:28 +0100, Arnd Bergmann wrote: > I have checked the usage of CONFIG_PPC64 vs. __powerpc64__ and found that > at least io.h and vdso_datapage.h get it wrong by using CONFIG_PPC64 outside > of #ifdef __KERNEL__. This will cause any user space application that > (probably without a good reason) includes these files gets the ppc32 version > even if compiling for 64 bits. They should not to that. > I originally argued that we should only use __powerpc64__ in header files for > just this reason. Userspace has no business including vdso_data.h at least and I fail to see what business it would have including io.h > The other files that do use CONFIG_PPC64 outside of > __KERNEL__ are dma-mapping.h, mmu_context.h, mmu.h, pci-bridge.h, > pgalloc.h, pgtable.h, pmc.h, ppc_asm.h and spinlock.h. However, including > one of these files in user space is already impossible, so the only thing > that might help there would be to wrap them completely in #ifdef __KERNEL__. 
>
> Signed-off-by: Arnd Bergmann
>
> ---
>
> On Monday 12 December 2005 22:33, Olof Johansson wrote:
> > >
> > > > I know they were just moved, but should we use CONFIG_PPC64 instead to
> > > > keep it similar with the other CONFIG_* tests?
> > >
> > > I originally thought that then I noticed __powerpc64__ is pretty
> > > widespread, in fact it outnumbers the number of CONFIG_PPC64 uses in
> > > include/asm-powerpc :) Either way Im OK.
> >
> > Interesting. Ok, nevermind then. :)
>
> diff --git a/include/asm-powerpc/io.h b/include/asm-powerpc/io.h
> index 48938d8..c8d075a 100644
> --- a/include/asm-powerpc/io.h
> +++ b/include/asm-powerpc/io.h
> @@ -8,7 +8,7 @@
>   * 2 of the License, or (at your option) any later version.
>   */
>
> -#ifndef CONFIG_PPC64
> +#ifndef __powerpc64__
>  #include
>  #else
>
> diff --git a/include/asm-powerpc/vdso_datapage.h b/include/asm-powerpc/vdso_datapage.h
> index 411832d..f8fece7 100644
> --- a/include/asm-powerpc/vdso_datapage.h
> +++ b/include/asm-powerpc/vdso_datapage.h
> @@ -45,7 +45,7 @@
>   * So here is the ppc64 backward compatible version
>   */
>
> -#ifdef CONFIG_PPC64
> +#ifdef __powerpc64__
>
>  struct vdso_data {
>  	__u8 eye_catcher[16]; /* Eyecatcher: SYSTEMCFG:PPC64 0x00 */

From arnd at arndb.de Tue Dec 13 10:37:58 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Tue, 13 Dec 2005 00:37:58 +0100
Subject: [PATCH] powerpc: Don't use CONFIG_PPC64 in user-visible header files
In-Reply-To: <1134427802.6989.128.camel@gaston>
References: <20051212204532.GJ23641@krispykreme>
 <200512122328.56724.arnd@arndb.de>
 <1134427802.6989.128.camel@gaston>
Message-ID: <200512130037.58332.arnd@arndb.de>

On Monday 12 December 2005 23:50, Benjamin Herrenschmidt wrote:

> They should not do that.

Of course they should not.
But if they did such things on 2.6.14 it might have
worked and if it fails in 2.6.15 that is a regression.

> > I originally argued that we should only use __powerpc64__ in header files
> > for just this reason.
>
> Userspace has no business including vdso_data.h at least and I fail to
> see what business it would have including io.h

Both these files have sections marked #ifdef __KERNEL__ in them, so the
assumption is that the other parts are actually written in a way that they
don't hurt if they get included.

Traditionally, asm/io.h gets included on i386 in order to do nasty things with
iopl() and direct device access from user space. The way that
include/asm-powerpc/io.h is written allows you to call the
{in,out}_{be,le}{8,16,32,64} on /dev/mem or similar files. While this is
probably a bad idea, I can see why someone might have started using
it and it should not just stop working.

The comment in asm/vdso_data.h suggests that it actually is meant to
be a user space ABI header (even though it is a misguided attempt at doing
so). An application including it to get the expected ppc64 data structure
layout will be silently broken by relying on CONFIG_PPC64.

	Arnd <><

From benh at kernel.crashing.org Tue Dec 13 11:01:06 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Tue, 13 Dec 2005 11:01:06 +1100
Subject: [PATCH] powerpc: Don't use CONFIG_PPC64 in user-visible header files
In-Reply-To: <200512130037.58332.arnd@arndb.de>
References: <20051212204532.GJ23641@krispykreme>
 <200512122328.56724.arnd@arndb.de>
 <1134427802.6989.128.camel@gaston>
 <200512130037.58332.arnd@arndb.de>
Message-ID: <1134432067.6989.137.camel@gaston>

On Tue, 2005-12-13 at 00:37 +0100, Arnd Bergmann wrote:
> On Monday 12 December 2005 23:50, Benjamin Herrenschmidt wrote:
> > They should not do that.
>
> Of course they should not. But if they did such things on 2.6.14 it might have
> worked and if it fails in 2.6.15 that is a regression.

No, it's not. They shouldn't do it, period.
It's not a regression to break a bogus/forbidden behaviour. > Both these files have sections marked #ifdef __KERNEL__ in them, so the > assumption is that the other parts are actually written in a way that they > don't hurt if they get included. Yah, historical from when systemcfg still was sort-of supported. > Traditionally, asm/io.h gets included on i386 in order to do nasty things with > iopl() and direct device access from user space. The way that > include/asm-powerpc/io.h is written allows you to call the > {in,out}_{be,le}{8,16,32,64} on /dev/mem or similar files. While this is > probably a bad idea, I can see why someone might have started using > it and it should not just stop working. Good, that way they get a chance to fix their code. > The comment in asm/vdso_data.h suggests that it actually is meant to > be a user space ABI header (even though a misguided attempt at doing > so). An application including it to get the expected ppc64 data structure > layout will be silently broken by relying on CONFIG_PPC64. It used to be and I'm changing that so it's not anymore. Ben. 
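
Editorial sketch of the macro distinction argued over in this thread. It is a minimal illustration under stated assumptions, not code from any kernel header: CONFIG_PPC64 is assumed to be defined only by the kernel build, while __powerpc64__ is predefined by the compiler when targeting 64-bit PowerPC. All demo_* names and the DEMO_* macros are hypothetical.

```c
#include <stddef.h>

/* Set by the compiler itself, so it is visible in user space builds too. */
#if defined(__powerpc64__)
#define DEMO_COMPILER_PPC64 1
#else
#define DEMO_COMPILER_PPC64 0
#endif

/* Set by the kernel configuration; an ordinary user space compile never defines it. */
#if defined(CONFIG_PPC64)
#define DEMO_KCONFIG_PPC64 1
#else
#define DEMO_KCONFIG_PPC64 0
#endif

/* Hypothetical structure whose layout depends on the 64-bit test. */
struct demo_data {
#if DEMO_COMPILER_PPC64
	unsigned long long stamp;	/* 64-bit layout */
#else
	unsigned int stamp;		/* 32-bit layout */
#endif
};

/* 1 if the kernel-config macro was visible to this translation unit. */
int demo_header_sees_kconfig(void)
{
	return DEMO_KCONFIG_PPC64;
}

/* 1 if the compiler reports a 64-bit PowerPC target. */
int demo_compiler_says_ppc64(void)
{
	return DEMO_COMPILER_PPC64;
}

/* Size of the field selected by the compiler-provided macro. */
size_t demo_stamp_size(void)
{
	return sizeof(((struct demo_data *)0)->stamp);
}
```

Built as a plain user space program, demo_header_sees_kconfig() returns 0 on every target, so a header keyed on CONFIG_PPC64 would pick its 32-bit branch there even on ppc64. Whether user space should include such headers at all is the separate point Ben makes above.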
From aswathavijay at gmail.com Tue Dec 13 16:32:58 2005
From: aswathavijay at gmail.com (Vijayakumar Ramalingam)
Date: Tue, 13 Dec 2005 11:02:58 +0530
Subject: Error while compiling the kernel 2.6.7
Message-ID: <5f87992f0512122132k11e51daela6e7f574f1dc758a@mail.gmail.com>

Hi all,

While I am compiling the linux kernel 2.6.7 with base features on x86 machine, it is not getting compiled and throwing the following error,

ld -m elf_i386 -T /usr/src/linux-2.4.20-8/arch/i386/vmlinux.lds -e stext arch/i386/kernel/head.o arch/i386/kernel/init_task.o init/main.o init/version.o init/do_mounts.o --start-group arch/i386/kernel/kernel.o arch/i386/mm/mm.o kernel/kernel.o mm/mm.o fs/fs.o ipc/ipc.o drivers/char/char.o drivers/block/block.o drivers/misc/misc.o drivers/net/net.o drivers/char/drm/drm.o drivers/net/fc/fc.o drivers/net/appletalk/appletalk.o drivers/net/tokenring/tr.o drivers/net/wan/wan.o drivers/atm/atm.o drivers/ide/idedriver.o drivers/cdrom/driver.o drivers/pci/driver.o drivers/net/pcmcia/pcmcia_net.o drivers/net/wireless/wireless_net.o drivers/pnp/pnp.o drivers/video/video.o drivers/media/media.o drivers/md/mddev.o drivers/isdn/vmlinux-obj.o net/network.o crypto/crypto.o /usr/src/linux-2.4.20-8/arch/i386/lib/lib.a /usr/src/linux-2.4.20-8/lib/lib.a /usr/src/linux-2.4.20-8/arch/i386/lib/lib.a --end-group -o .tmp_vmlinux1
kernel/kernel.o(.text+0x13191): In function `use_init_fs_context':
: undefined reference to `set_fs_root'
kernel/kernel.o(.text+0x131a5): In function `use_init_fs_context':
: undefined reference to `set_fs_pwd'
kernel/kernel.o(.text+0x13353): In function `schedule_task':
: undefined reference to `queue_task'
fs/fs.o(.text+0x67b): In function `sys_chdir':
: undefined reference to `set_fs_pwd'
fs/fs.o(.text+0x720): In function `sys_fchdir':
: undefined reference to `set_fs_pwd'
fs/fs.o(.text+0x7d2): In function `sys_chroot':
: undefined reference to `set_fs_root'
fs/fs.o(.text+0x16182): In function `d_prune_aliases':
: undefined reference to `d_drop'
fs/fs.o(.text+0x16ac0): In function `d_delete':
: undefined reference to `d_drop'
fs/fs.o(.text+0x1b720): In function `chroot_fs_refs':
: undefined reference to `set_fs_pwd'
fs/fs.o(.text+0x1b740): In function `chroot_fs_refs':
: undefined reference to `set_fs_root'
fs/fs.o(.text+0x25485): In function `pid_base_revalidate':
: undefined reference to `d_drop'
fs/fs.o(.text.init+0x940): In function `init_mount_tree':
: undefined reference to `set_fs_pwd'
fs/fs.o(.text.init+0x95f): In function `init_mount_tree':
: undefined reference to `set_fs_root'
drivers/char/char.o(.text+0x37c2): In function `flush_to_ldisc':
: undefined reference to `queue_task'
drivers/char/char.o(.text+0x38a2): In function `tty_flip_buffer_push':
: undefined reference to `queue_task'
drivers/char/char.o(.text+0x8bbe): In function `batch_entropy_store':
: undefined reference to `queue_task'
drivers/char/char.o(.text+0x14a74): In function `rs_sched_event':
: undefined reference to `queue_task'
drivers/block/block.o(.text+0x3b7): In function `generic_plug_device':
: undefined reference to `queue_task'
drivers/block/block.o(.text+0x3b9c): more undefined references to `queue_task' follow
make[1]: *** [kallsyms] Error 1
make[1]: Leaving directory `/usr/src/linux-2.4.20-8'
make: *** [vmlinux] Error 2

Can anyone tell me what is the main cause for the above mentioned error and how can I rectify it?

Thanks in advance

Regards
Vijay

From olof at lixom.net Tue Dec 13 16:49:53 2005
From: olof at lixom.net (Olof Johansson)
Date: Mon, 12 Dec 2005 21:49:53 -0800
Subject: Error while compiling the kernel 2.6.7
In-Reply-To: <5f87992f0512122132k11e51daela6e7f574f1dc758a@mail.gmail.com>
References: <5f87992f0512122132k11e51daela6e7f574f1dc758a@mail.gmail.com>
Message-ID: <20051213054953.GA8091@pb15.lixom.net>

On Tue, Dec 13, 2005 at 11:02:58AM +0530, Vijayakumar Ramalingam wrote:
> Hi all,
> While I am compiling the linux kernel 2.6.7 with base features on x86
> machine, it is not getting compiled and throwing the following error,

1. This is a mailing list used to discuss Linux on 64-bit PowerPC, not 32-bit x86.

2. Your email indicated that you're trying to build something in /usr/src/linux-2.4.20-8; either you're picking bad names for your source trees, or you are posting misleading information.

3. That being said, 2.6.7 is by now a very old kernel and you are unlikely to find anyone willing to help you with it. Chances would be better for 2.6.14 or so, if the same problem even still exists.

4. See http://kernelnewbies.org/faq/ for basic information on how to build a kernel, and other information on kernelnewbies.org for pointers on where to go for help. This list is the wrong place.

-Olof From benh at kernel.crashing.org Tue Dec 13 17:46:23 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 17:46:23 +1100 Subject: [PATCH] powerpc: Add pmac32 defconfig for ARCH=powerpc Message-ID: <1134456383.6989.176.camel@gaston> This adds a defconfig for PowerMac with ARCH=powerpc Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/powerpc/configs/pmac32_defconfig =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-work/arch/powerpc/configs/pmac32_defconfig 2005-12-13 17:24:16.000000000 +1100 @@ -0,0 +1,1729 @@ +# +# Automatically generated make config: don't edit +# Linux kernel version: 2.6.15-rc5 +# Tue Dec 13 17:24:05 2005 +# +# CONFIG_PPC64 is not set +CONFIG_PPC32=y +CONFIG_PPC_MERGE=y +CONFIG_MMU=y +CONFIG_GENERIC_HARDIRQS=y +CONFIG_RWSEM_XCHGADD_ALGORITHM=y +CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_PPC=y +CONFIG_EARLY_PRINTK=y +CONFIG_GENERIC_NVRAM=y +CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y +CONFIG_ARCH_MAY_HAVE_PC_FDC=y + +# +# Processor support +# +CONFIG_6xx=y +# CONFIG_PPC_52xx is not set +# CONFIG_PPC_82xx is not set +# CONFIG_PPC_83xx is not set +# CONFIG_40x is not set +# CONFIG_44x is not set +# CONFIG_8xx is not set +# CONFIG_E200 is not set +# CONFIG_E500 is not set +CONFIG_PPC_FPU=y +CONFIG_ALTIVEC=y +CONFIG_PPC_STD_MMU=y +CONFIG_PPC_STD_MMU_32=y +# CONFIG_SMP is not set + +# +# Code maturity level options +# +CONFIG_EXPERIMENTAL=y +CONFIG_CLEAN_COMPILE=y +CONFIG_BROKEN_ON_SMP=y +CONFIG_INIT_ENV_ARG_LIMIT=32 + +# +# General setup +# +CONFIG_LOCALVERSION="" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_SWAP=y +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +# CONFIG_BSD_PROCESS_ACCT is not set +CONFIG_SYSCTL=y +# CONFIG_AUDIT is not set +CONFIG_HOTPLUG=y +CONFIG_KOBJECT_UEVENT=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_INITRAMFS_SOURCE="" +# CONFIG_EMBEDDED is not set +CONFIG_KALLSYMS=y +# CONFIG_KALLSYMS_ALL is not 
set +# CONFIG_KALLSYMS_EXTRA_PASS is not set +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_EPOLL=y +CONFIG_SHMEM=y +CONFIG_CC_ALIGN_FUNCTIONS=0 +CONFIG_CC_ALIGN_LABELS=0 +CONFIG_CC_ALIGN_LOOPS=0 +CONFIG_CC_ALIGN_JUMPS=0 +# CONFIG_TINY_SHMEM is not set +CONFIG_BASE_SMALL=0 + +# +# Loadable module support +# +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODULE_FORCE_UNLOAD=y +CONFIG_OBSOLETE_MODPARM=y +# CONFIG_MODVERSIONS is not set +# CONFIG_MODULE_SRCVERSION_ALL is not set +CONFIG_KMOD=y + +# +# Block layer +# +CONFIG_LBD=y + +# +# IO Schedulers +# +CONFIG_IOSCHED_NOOP=y +CONFIG_IOSCHED_AS=y +CONFIG_IOSCHED_DEADLINE=y +CONFIG_IOSCHED_CFQ=y +CONFIG_DEFAULT_AS=y +# CONFIG_DEFAULT_DEADLINE is not set +# CONFIG_DEFAULT_CFQ is not set +# CONFIG_DEFAULT_NOOP is not set +CONFIG_DEFAULT_IOSCHED="anticipatory" + +# +# Platform support +# +CONFIG_PPC_MULTIPLATFORM=y +# CONFIG_PPC_ISERIES is not set +# CONFIG_EMBEDDED6xx is not set +# CONFIG_APUS is not set +# CONFIG_PPC_CHRP is not set +CONFIG_PPC_PMAC=y +CONFIG_PPC_OF=y +CONFIG_MPIC=y +# CONFIG_PPC_RTAS is not set +# CONFIG_MMIO_NVRAM is not set +# CONFIG_CRASH_DUMP is not set +CONFIG_PPC_MPC106=y +# CONFIG_GENERIC_TBSYNC is not set +CONFIG_CPU_FREQ=y +CONFIG_CPU_FREQ_TABLE=y +# CONFIG_CPU_FREQ_DEBUG is not set +CONFIG_CPU_FREQ_STAT=y +# CONFIG_CPU_FREQ_STAT_DETAILS is not set +CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y +# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set +CONFIG_CPU_FREQ_GOV_PERFORMANCE=y +CONFIG_CPU_FREQ_GOV_POWERSAVE=y +CONFIG_CPU_FREQ_GOV_USERSPACE=y +# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set +# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set +CONFIG_CPU_FREQ_PMAC=y +CONFIG_PPC601_SYNC_FIX=y +# CONFIG_TAU is not set +# CONFIG_WANT_EARLY_SERIAL is not set + +# +# Kernel options +# +# CONFIG_HIGHMEM is not set +# CONFIG_HZ_100 is not set +CONFIG_HZ_250=y +# CONFIG_HZ_1000 is not set +CONFIG_HZ=250 +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set +# CONFIG_PREEMPT 
is not set +CONFIG_BINFMT_ELF=y +CONFIG_BINFMT_MISC=m +# CONFIG_KEXEC is not set +CONFIG_ARCH_FLATMEM_ENABLE=y +CONFIG_SELECT_MEMORY_MODEL=y +CONFIG_FLATMEM_MANUAL=y +# CONFIG_DISCONTIGMEM_MANUAL is not set +# CONFIG_SPARSEMEM_MANUAL is not set +CONFIG_FLATMEM=y +CONFIG_FLAT_NODE_MEM_MAP=y +# CONFIG_SPARSEMEM_STATIC is not set +CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_PROC_DEVICETREE=y +# CONFIG_CMDLINE_BOOL is not set +CONFIG_PM=y +# CONFIG_PM_LEGACY is not set +CONFIG_PM_DEBUG=y +CONFIG_SOFTWARE_SUSPEND=y +CONFIG_PM_STD_PARTITION="" +CONFIG_SECCOMP=y +CONFIG_ISA_DMA_API=y + +# +# Bus options +# +CONFIG_GENERIC_ISA_DMA=y +# CONFIG_PPC_I8259 is not set +CONFIG_PPC_INDIRECT_PCI=y +CONFIG_PCI=y +CONFIG_PCI_DOMAINS=y +CONFIG_PCI_LEGACY_PROC=y +# CONFIG_PCI_DEBUG is not set + +# +# PCCARD (PCMCIA/CardBus) support +# +CONFIG_PCCARD=m +# CONFIG_PCMCIA_DEBUG is not set +CONFIG_PCMCIA=m +CONFIG_PCMCIA_LOAD_CIS=y +CONFIG_PCMCIA_IOCTL=y +CONFIG_CARDBUS=y + +# +# PC-card bridges +# +CONFIG_YENTA=m +# CONFIG_PD6729 is not set +# CONFIG_I82092 is not set +CONFIG_PCCARD_NONSTATIC=m + +# +# PCI Hotplug Support +# +# CONFIG_HOTPLUG_PCI is not set + +# +# Advanced setup +# +# CONFIG_ADVANCED_OPTIONS is not set + +# +# Default settings for advanced configuration options are used +# +CONFIG_HIGHMEM_START=0xfe000000 +CONFIG_LOWMEM_SIZE=0x30000000 +CONFIG_KERNEL_START=0xc0000000 +CONFIG_TASK_SIZE=0x80000000 +CONFIG_BOOT_LOAD=0x00800000 + +# +# Networking +# +CONFIG_NET=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_MMAP is not set +CONFIG_UNIX=y +# CONFIG_NET_KEY is not set +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +# CONFIG_IP_ADVANCED_ROUTER is not set +CONFIG_IP_FIB_HASH=y +# CONFIG_IP_PNP is not set +# CONFIG_NET_IPIP is not set +# CONFIG_NET_IPGRE is not set +# CONFIG_IP_MROUTE is not set +# CONFIG_ARPD is not set +CONFIG_SYN_COOKIES=y +# CONFIG_INET_AH is not set +# CONFIG_INET_ESP is not set +# CONFIG_INET_IPCOMP is not set +# CONFIG_INET_TUNNEL is not set 
+CONFIG_INET_DIAG=y +CONFIG_INET_TCP_DIAG=y +# CONFIG_TCP_CONG_ADVANCED is not set +CONFIG_TCP_CONG_BIC=y + +# +# IP: Virtual Server Configuration +# +# CONFIG_IP_VS is not set +# CONFIG_IPV6 is not set +CONFIG_NETFILTER=y +# CONFIG_NETFILTER_DEBUG is not set + +# +# Core Netfilter Configuration +# +# CONFIG_NETFILTER_NETLINK is not set + +# +# IP: Netfilter Configuration +# +CONFIG_IP_NF_CONNTRACK=m +# CONFIG_IP_NF_CT_ACCT is not set +# CONFIG_IP_NF_CONNTRACK_MARK is not set +# CONFIG_IP_NF_CONNTRACK_EVENTS is not set +# CONFIG_IP_NF_CT_PROTO_SCTP is not set +CONFIG_IP_NF_FTP=m +CONFIG_IP_NF_IRC=m +CONFIG_IP_NF_NETBIOS_NS=m +CONFIG_IP_NF_TFTP=m +CONFIG_IP_NF_AMANDA=m +CONFIG_IP_NF_PPTP=m +# CONFIG_IP_NF_QUEUE is not set +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_LIMIT=m +CONFIG_IP_NF_MATCH_IPRANGE=m +CONFIG_IP_NF_MATCH_MAC=m +CONFIG_IP_NF_MATCH_PKTTYPE=m +CONFIG_IP_NF_MATCH_MARK=m +CONFIG_IP_NF_MATCH_MULTIPORT=m +CONFIG_IP_NF_MATCH_TOS=m +CONFIG_IP_NF_MATCH_RECENT=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_DSCP=m +CONFIG_IP_NF_MATCH_AH_ESP=m +CONFIG_IP_NF_MATCH_LENGTH=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_MATCH_TCPMSS=m +CONFIG_IP_NF_MATCH_HELPER=m +CONFIG_IP_NF_MATCH_STATE=m +CONFIG_IP_NF_MATCH_CONNTRACK=m +CONFIG_IP_NF_MATCH_OWNER=m +# CONFIG_IP_NF_MATCH_ADDRTYPE is not set +# CONFIG_IP_NF_MATCH_REALM is not set +# CONFIG_IP_NF_MATCH_SCTP is not set +CONFIG_IP_NF_MATCH_DCCP=m +# CONFIG_IP_NF_MATCH_COMMENT is not set +# CONFIG_IP_NF_MATCH_HASHLIMIT is not set +CONFIG_IP_NF_MATCH_STRING=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +# CONFIG_IP_NF_TARGET_LOG is not set +CONFIG_IP_NF_TARGET_ULOG=m +CONFIG_IP_NF_TARGET_TCPMSS=m +# CONFIG_IP_NF_TARGET_NFQUEUE is not set +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_NAT_NEEDED=y +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_SAME=m +CONFIG_IP_NF_NAT_SNMP_BASIC=m +CONFIG_IP_NF_NAT_IRC=m +CONFIG_IP_NF_NAT_FTP=m +CONFIG_IP_NF_NAT_TFTP=m 
+CONFIG_IP_NF_NAT_AMANDA=m +CONFIG_IP_NF_NAT_PPTP=m +# CONFIG_IP_NF_MANGLE is not set +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_TARGET_NOTRACK=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m + +# +# DCCP Configuration (EXPERIMENTAL) +# +CONFIG_IP_DCCP=m +CONFIG_INET_DCCP_DIAG=m + +# +# DCCP CCIDs Configuration (EXPERIMENTAL) +# +CONFIG_IP_DCCP_CCID3=m +CONFIG_IP_DCCP_TFRC_LIB=m + +# +# DCCP Kernel Hacking +# +# CONFIG_IP_DCCP_DEBUG is not set +# CONFIG_IP_DCCP_UNLOAD_HACK is not set + +# +# SCTP Configuration (EXPERIMENTAL) +# +# CONFIG_IP_SCTP is not set +# CONFIG_ATM is not set +# CONFIG_BRIDGE is not set +# CONFIG_VLAN_8021Q is not set +# CONFIG_DECNET is not set +# CONFIG_LLC2 is not set +# CONFIG_IPX is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_NET_DIVERT is not set +# CONFIG_ECONET is not set +# CONFIG_WAN_ROUTER is not set + +# +# QoS and/or fair queueing +# +# CONFIG_NET_SCHED is not set + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +# CONFIG_HAMRADIO is not set +CONFIG_IRDA=m + +# +# IrDA protocols +# +CONFIG_IRLAN=m +CONFIG_IRNET=m +CONFIG_IRCOMM=m +# CONFIG_IRDA_ULTRA is not set + +# +# IrDA options +# +CONFIG_IRDA_CACHE_LAST_LSAP=y +CONFIG_IRDA_FAST_RR=y +# CONFIG_IRDA_DEBUG is not set + +# +# Infrared-port device drivers +# + +# +# SIR device drivers +# +CONFIG_IRTTY_SIR=m + +# +# Dongle support +# +# CONFIG_DONGLE is not set + +# +# Old SIR device drivers +# +# CONFIG_IRPORT_SIR is not set + +# +# Old Serial dongle support +# + +# +# FIR device drivers +# +# CONFIG_USB_IRDA is not set +# CONFIG_SIGMATEL_FIR is not set +# CONFIG_NSC_FIR is not set +# CONFIG_WINBOND_FIR is not set +# CONFIG_TOSHIBA_FIR is not set +# CONFIG_SMC_IRCC_FIR is not set +# CONFIG_ALI_FIR is not set +# CONFIG_VLSI_FIR is not set +# CONFIG_VIA_FIR is not set +CONFIG_BT=m +CONFIG_BT_L2CAP=m +CONFIG_BT_SCO=m +CONFIG_BT_RFCOMM=m +CONFIG_BT_RFCOMM_TTY=y +CONFIG_BT_BNEP=m 
+CONFIG_BT_BNEP_MC_FILTER=y +CONFIG_BT_BNEP_PROTO_FILTER=y +CONFIG_BT_HIDP=m + +# +# Bluetooth device drivers +# +CONFIG_BT_HCIUSB=m +# CONFIG_BT_HCIUSB_SCO is not set +# CONFIG_BT_HCIUART is not set +CONFIG_BT_HCIBCM203X=m +# CONFIG_BT_HCIBPA10X is not set +CONFIG_BT_HCIBFUSB=m +# CONFIG_BT_HCIDTL1 is not set +# CONFIG_BT_HCIBT3C is not set +# CONFIG_BT_HCIBLUECARD is not set +# CONFIG_BT_HCIBTUART is not set +# CONFIG_BT_HCIVHCI is not set +CONFIG_IEEE80211=m +# CONFIG_IEEE80211_DEBUG is not set +CONFIG_IEEE80211_CRYPT_WEP=m +CONFIG_IEEE80211_CRYPT_CCMP=m +CONFIG_IEEE80211_CRYPT_TKIP=m + +# +# Device Drivers +# + +# +# Generic Driver Options +# +# CONFIG_STANDALONE is not set +CONFIG_PREVENT_FIRMWARE_BUILD=y +CONFIG_FW_LOADER=m +# CONFIG_DEBUG_DRIVER is not set + +# +# Connector - unified userspace <-> kernelspace linker +# +CONFIG_CONNECTOR=y +CONFIG_PROC_EVENTS=y + +# +# Memory Technology Devices (MTD) +# +# CONFIG_MTD is not set + +# +# Parallel port support +# +# CONFIG_PARPORT is not set + +# +# Plug and Play support +# + +# +# Block devices +# +# CONFIG_BLK_DEV_FD is not set +CONFIG_MAC_FLOPPY=y +# CONFIG_BLK_CPQ_DA is not set +# CONFIG_BLK_CPQ_CISS_DA is not set +# CONFIG_BLK_DEV_DAC960 is not set +# CONFIG_BLK_DEV_UMEM is not set +# CONFIG_BLK_DEV_COW_COMMON is not set +CONFIG_BLK_DEV_LOOP=y +# CONFIG_BLK_DEV_CRYPTOLOOP is not set +# CONFIG_BLK_DEV_NBD is not set +# CONFIG_BLK_DEV_SX8 is not set +CONFIG_BLK_DEV_UB=m +CONFIG_BLK_DEV_RAM=y +CONFIG_BLK_DEV_RAM_COUNT=16 +CONFIG_BLK_DEV_RAM_SIZE=4096 +CONFIG_BLK_DEV_INITRD=y +# CONFIG_CDROM_PKTCDVD is not set +# CONFIG_ATA_OVER_ETH is not set + +# +# ATA/ATAPI/MFM/RLL support +# +CONFIG_IDE=y +CONFIG_BLK_DEV_IDE=y + +# +# Please see Documentation/ide.txt for help/info on IDE drives +# +# CONFIG_BLK_DEV_IDE_SATA is not set +CONFIG_BLK_DEV_IDEDISK=y +# CONFIG_IDEDISK_MULTI_MODE is not set +CONFIG_BLK_DEV_IDECS=m +CONFIG_BLK_DEV_IDECD=y +# CONFIG_BLK_DEV_IDETAPE is not set +CONFIG_BLK_DEV_IDEFLOPPY=y 
+CONFIG_BLK_DEV_IDESCSI=y +# CONFIG_IDE_TASK_IOCTL is not set + +# +# IDE chipset support/bugfixes +# +# CONFIG_IDE_GENERIC is not set +CONFIG_BLK_DEV_IDEPCI=y +CONFIG_IDEPCI_SHARE_IRQ=y +# CONFIG_BLK_DEV_OFFBOARD is not set +CONFIG_BLK_DEV_GENERIC=y +# CONFIG_BLK_DEV_OPTI621 is not set +CONFIG_BLK_DEV_SL82C105=y +CONFIG_BLK_DEV_IDEDMA_PCI=y +# CONFIG_BLK_DEV_IDEDMA_FORCED is not set +CONFIG_IDEDMA_PCI_AUTO=y +# CONFIG_IDEDMA_ONLYDISK is not set +# CONFIG_BLK_DEV_AEC62XX is not set +# CONFIG_BLK_DEV_ALI15X3 is not set +# CONFIG_BLK_DEV_AMD74XX is not set +# CONFIG_BLK_DEV_CMD64X is not set +# CONFIG_BLK_DEV_TRIFLEX is not set +# CONFIG_BLK_DEV_CY82C693 is not set +# CONFIG_BLK_DEV_CS5520 is not set +# CONFIG_BLK_DEV_CS5530 is not set +# CONFIG_BLK_DEV_HPT34X is not set +# CONFIG_BLK_DEV_HPT366 is not set +# CONFIG_BLK_DEV_SC1200 is not set +# CONFIG_BLK_DEV_PIIX is not set +# CONFIG_BLK_DEV_IT821X is not set +# CONFIG_BLK_DEV_NS87415 is not set +# CONFIG_BLK_DEV_PDC202XX_OLD is not set +CONFIG_BLK_DEV_PDC202XX_NEW=y +# CONFIG_PDC202XX_FORCE is not set +# CONFIG_BLK_DEV_SVWKS is not set +# CONFIG_BLK_DEV_SIIMAGE is not set +# CONFIG_BLK_DEV_SLC90E66 is not set +# CONFIG_BLK_DEV_TRM290 is not set +# CONFIG_BLK_DEV_VIA82CXXX is not set +CONFIG_BLK_DEV_IDE_PMAC=y +CONFIG_BLK_DEV_IDE_PMAC_ATA100FIRST=y +CONFIG_BLK_DEV_IDEDMA_PMAC=y +CONFIG_BLK_DEV_IDE_PMAC_BLINK=y +# CONFIG_IDE_ARM is not set +CONFIG_BLK_DEV_IDEDMA=y +# CONFIG_IDEDMA_IVB is not set +CONFIG_IDEDMA_AUTO=y +# CONFIG_BLK_DEV_HD is not set + +# +# SCSI device support +# +# CONFIG_RAID_ATTRS is not set +CONFIG_SCSI=y +CONFIG_SCSI_PROC_FS=y + +# +# SCSI support type (disk, tape, CD-ROM) +# +CONFIG_BLK_DEV_SD=y +CONFIG_CHR_DEV_ST=y +# CONFIG_CHR_DEV_OSST is not set +CONFIG_BLK_DEV_SR=y +CONFIG_BLK_DEV_SR_VENDOR=y +CONFIG_CHR_DEV_SG=y +# CONFIG_CHR_DEV_SCH is not set + +# +# Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs +# +# CONFIG_SCSI_MULTI_LUN is not set +CONFIG_SCSI_CONSTANTS=y +# CONFIG_SCSI_LOGGING is not set + +# +# SCSI Transport Attributes +# +CONFIG_SCSI_SPI_ATTRS=y +# CONFIG_SCSI_FC_ATTRS is not set +# CONFIG_SCSI_ISCSI_ATTRS is not set +# CONFIG_SCSI_SAS_ATTRS is not set + +# +# SCSI low-level drivers +# +# CONFIG_ISCSI_TCP is not set +# CONFIG_BLK_DEV_3W_XXXX_RAID is not set +# CONFIG_SCSI_3W_9XXX is not set +# CONFIG_SCSI_ACARD is not set +# CONFIG_SCSI_AACRAID is not set +CONFIG_SCSI_AIC7XXX=m +CONFIG_AIC7XXX_CMDS_PER_DEVICE=253 +CONFIG_AIC7XXX_RESET_DELAY_MS=15000 +CONFIG_AIC7XXX_DEBUG_ENABLE=y +CONFIG_AIC7XXX_DEBUG_MASK=0 +CONFIG_AIC7XXX_REG_PRETTY_PRINT=y +CONFIG_SCSI_AIC7XXX_OLD=m +# CONFIG_SCSI_AIC79XX is not set +# CONFIG_SCSI_DPT_I2O is not set +# CONFIG_MEGARAID_NEWGEN is not set +# CONFIG_MEGARAID_LEGACY is not set +# CONFIG_MEGARAID_SAS is not set +# CONFIG_SCSI_SATA is not set +# CONFIG_SCSI_BUSLOGIC is not set +# CONFIG_SCSI_DMX3191D is not set +# CONFIG_SCSI_EATA is not set +# CONFIG_SCSI_FUTURE_DOMAIN is not set +# CONFIG_SCSI_GDTH is not set +# CONFIG_SCSI_IPS is not set +# CONFIG_SCSI_INITIO is not set +# CONFIG_SCSI_INIA100 is not set +CONFIG_SCSI_SYM53C8XX_2=y +CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=0 +CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 +CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 +# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set +# CONFIG_SCSI_IPR is not set +# CONFIG_SCSI_QLOGIC_FC is not set +# CONFIG_SCSI_QLOGIC_1280 is not set +CONFIG_SCSI_QLA2XXX=y +# CONFIG_SCSI_QLA21XX is not set +# CONFIG_SCSI_QLA22XX is not set +# CONFIG_SCSI_QLA2300 is not set +# CONFIG_SCSI_QLA2322 is not set +# CONFIG_SCSI_QLA6312 is not set +# CONFIG_SCSI_QLA24XX is not set +# CONFIG_SCSI_LPFC is not set +# CONFIG_SCSI_DC395x is not set +# CONFIG_SCSI_DC390T is not set +# CONFIG_SCSI_NSP32 is not set +# CONFIG_SCSI_DEBUG is not set +CONFIG_SCSI_MESH=y +CONFIG_SCSI_MESH_SYNC_RATE=5 +CONFIG_SCSI_MESH_RESET_DELAY_MS=1000 +CONFIG_SCSI_MAC53C94=y 
+ +# +# PCMCIA SCSI adapter support +# +# CONFIG_PCMCIA_AHA152X is not set +# CONFIG_PCMCIA_FDOMAIN is not set +# CONFIG_PCMCIA_NINJA_SCSI is not set +# CONFIG_PCMCIA_QLOGIC is not set +# CONFIG_PCMCIA_SYM53C500 is not set + +# +# Multi-device support (RAID and LVM) +# +CONFIG_MD=y +CONFIG_BLK_DEV_MD=m +CONFIG_MD_LINEAR=m +CONFIG_MD_RAID0=m +CONFIG_MD_RAID1=m +# CONFIG_MD_RAID10 is not set +CONFIG_MD_RAID5=m +CONFIG_MD_RAID6=m +CONFIG_MD_MULTIPATH=m +CONFIG_MD_FAULTY=m +CONFIG_BLK_DEV_DM=m +CONFIG_DM_CRYPT=m +# CONFIG_DM_SNAPSHOT is not set +# CONFIG_DM_MIRROR is not set +# CONFIG_DM_ZERO is not set +# CONFIG_DM_MULTIPATH is not set + +# +# Fusion MPT device support +# +# CONFIG_FUSION is not set +# CONFIG_FUSION_SPI is not set +# CONFIG_FUSION_FC is not set +# CONFIG_FUSION_SAS is not set + +# +# IEEE 1394 (FireWire) support +# +CONFIG_IEEE1394=m + +# +# Subsystem Options +# +# CONFIG_IEEE1394_VERBOSEDEBUG is not set +# CONFIG_IEEE1394_OUI_DB is not set +CONFIG_IEEE1394_EXTRA_CONFIG_ROMS=y +CONFIG_IEEE1394_CONFIG_ROM_IP1394=y +# CONFIG_IEEE1394_EXPORT_FULL_API is not set + +# +# Device Drivers +# +# CONFIG_IEEE1394_PCILYNX is not set +CONFIG_IEEE1394_OHCI1394=m + +# +# Protocol Drivers +# +CONFIG_IEEE1394_VIDEO1394=m +CONFIG_IEEE1394_SBP2=m +# CONFIG_IEEE1394_SBP2_PHYS_DMA is not set +CONFIG_IEEE1394_ETH1394=m +CONFIG_IEEE1394_DV1394=m +CONFIG_IEEE1394_RAWIO=m +# CONFIG_IEEE1394_CMP is not set + +# +# I2O device support +# +# CONFIG_I2O is not set + +# +# Macintosh device drivers +# +CONFIG_ADB=y +CONFIG_ADB_CUDA=y +CONFIG_ADB_PMU=y +CONFIG_PMAC_APM_EMU=y +CONFIG_PMAC_MEDIABAY=y +CONFIG_PMAC_BACKLIGHT=y +CONFIG_INPUT_ADBHID=y +CONFIG_MAC_EMUMOUSEBTN=y +CONFIG_THERM_WINDTUNNEL=m +CONFIG_THERM_ADT746X=m +# CONFIG_WINDFARM is not set +# CONFIG_ANSLCD is not set + +# +# Network device support +# +CONFIG_NETDEVICES=y +# CONFIG_DUMMY is not set +# CONFIG_BONDING is not set +# CONFIG_EQUALIZER is not set +# CONFIG_TUN is not set + +# +# ARCnet devices +# +# CONFIG_ARCNET 
is not set + +# +# PHY device support +# +# CONFIG_PHYLIB is not set + +# +# Ethernet (10 or 100Mbit) +# +CONFIG_NET_ETHERNET=y +CONFIG_MII=y +CONFIG_MACE=y +# CONFIG_MACE_AAUI_PORT is not set +CONFIG_BMAC=y +# CONFIG_HAPPYMEAL is not set +CONFIG_SUNGEM=y +# CONFIG_CASSINI is not set +# CONFIG_NET_VENDOR_3COM is not set + +# +# Tulip family network device support +# +# CONFIG_NET_TULIP is not set +# CONFIG_HP100 is not set +CONFIG_NET_PCI=y +CONFIG_PCNET32=y +# CONFIG_AMD8111_ETH is not set +# CONFIG_ADAPTEC_STARFIRE is not set +# CONFIG_B44 is not set +# CONFIG_FORCEDETH is not set +# CONFIG_DGRS is not set +# CONFIG_EEPRO100 is not set +# CONFIG_E100 is not set +# CONFIG_FEALNX is not set +# CONFIG_NATSEMI is not set +# CONFIG_NE2K_PCI is not set +# CONFIG_8139CP is not set +# CONFIG_8139TOO is not set +# CONFIG_SIS900 is not set +# CONFIG_EPIC100 is not set +# CONFIG_SUNDANCE is not set +# CONFIG_TLAN is not set +# CONFIG_VIA_RHINE is not set + +# +# Ethernet (1000 Mbit) +# +# CONFIG_ACENIC is not set +# CONFIG_DL2K is not set +# CONFIG_E1000 is not set +# CONFIG_NS83820 is not set +# CONFIG_HAMACHI is not set +# CONFIG_YELLOWFIN is not set +# CONFIG_R8169 is not set +# CONFIG_SIS190 is not set +# CONFIG_SKGE is not set +# CONFIG_SK98LIN is not set +# CONFIG_VIA_VELOCITY is not set +# CONFIG_TIGON3 is not set +# CONFIG_BNX2 is not set +# CONFIG_MV643XX_ETH is not set + +# +# Ethernet (10000 Mbit) +# +# CONFIG_CHELSIO_T1 is not set +# CONFIG_IXGB is not set +# CONFIG_S2IO is not set + +# +# Token Ring devices +# +# CONFIG_TR is not set + +# +# Wireless LAN (non-hamradio) +# +CONFIG_NET_RADIO=y + +# +# Obsolete Wireless cards support (pre-802.11) +# +# CONFIG_STRIP is not set +# CONFIG_PCMCIA_WAVELAN is not set +# CONFIG_PCMCIA_NETWAVE is not set + +# +# Wireless 802.11 Frequency Hopping cards support +# +# CONFIG_PCMCIA_RAYCS is not set + +# +# Wireless 802.11b ISA/PCI cards support +# +# CONFIG_IPW2100 is not set +# CONFIG_IPW2200 is not set +# CONFIG_AIRO is 
not set +CONFIG_HERMES=m +CONFIG_APPLE_AIRPORT=m +# CONFIG_PLX_HERMES is not set +# CONFIG_TMD_HERMES is not set +# CONFIG_NORTEL_HERMES is not set +# CONFIG_PCI_HERMES is not set +# CONFIG_ATMEL is not set + +# +# Wireless 802.11b Pcmcia/Cardbus cards support +# +# CONFIG_PCMCIA_HERMES is not set +# CONFIG_PCMCIA_SPECTRUM is not set +# CONFIG_AIRO_CS is not set +# CONFIG_PCMCIA_WL3501 is not set + +# +# Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support +# +CONFIG_PRISM54=m +# CONFIG_HOSTAP is not set +CONFIG_NET_WIRELESS=y + +# +# PCMCIA network device support +# +# CONFIG_NET_PCMCIA is not set + +# +# Wan interfaces +# +# CONFIG_WAN is not set +# CONFIG_FDDI is not set +# CONFIG_HIPPI is not set +CONFIG_PPP=y +CONFIG_PPP_MULTILINK=y +# CONFIG_PPP_FILTER is not set +CONFIG_PPP_ASYNC=y +CONFIG_PPP_SYNC_TTY=m +CONFIG_PPP_DEFLATE=y +CONFIG_PPP_BSDCOMP=m +# CONFIG_PPP_MPPE is not set +# CONFIG_PPPOE is not set +# CONFIG_SLIP is not set +# CONFIG_NET_FC is not set +# CONFIG_SHAPER is not set +# CONFIG_NETCONSOLE is not set +# CONFIG_NETPOLL is not set +# CONFIG_NET_POLL_CONTROLLER is not set + +# +# ISDN subsystem +# +# CONFIG_ISDN is not set + +# +# Telephony Support +# +# CONFIG_PHONE is not set + +# +# Input device support +# +CONFIG_INPUT=y + +# +# Userland interfaces +# +CONFIG_INPUT_MOUSEDEV=y +CONFIG_INPUT_MOUSEDEV_PSAUX=y +CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 +CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 +# CONFIG_INPUT_JOYDEV is not set +# CONFIG_INPUT_TSDEV is not set +CONFIG_INPUT_EVDEV=y +# CONFIG_INPUT_EVBUG is not set + +# +# Input Device Drivers +# +CONFIG_INPUT_KEYBOARD=y +# CONFIG_KEYBOARD_ATKBD is not set +# CONFIG_KEYBOARD_SUNKBD is not set +# CONFIG_KEYBOARD_LKKBD is not set +# CONFIG_KEYBOARD_XTKBD is not set +# CONFIG_KEYBOARD_NEWTON is not set +CONFIG_INPUT_MOUSE=y +# CONFIG_MOUSE_PS2 is not set +# CONFIG_MOUSE_SERIAL is not set +# CONFIG_MOUSE_VSXXXAA is not set +# CONFIG_INPUT_JOYSTICK is not set +# CONFIG_INPUT_TOUCHSCREEN is not set +# CONFIG_INPUT_MISC 
is not set + +# +# Hardware I/O ports +# +CONFIG_SERIO=y +# CONFIG_SERIO_I8042 is not set +# CONFIG_SERIO_SERPORT is not set +# CONFIG_SERIO_PCIPS2 is not set +# CONFIG_SERIO_RAW is not set +# CONFIG_GAMEPORT is not set + +# +# Character devices +# +CONFIG_VT=y +CONFIG_VT_CONSOLE=y +CONFIG_HW_CONSOLE=y +# CONFIG_SERIAL_NONSTANDARD is not set + +# +# Serial drivers +# +CONFIG_SERIAL_8250=m +# CONFIG_SERIAL_8250_CS is not set +CONFIG_SERIAL_8250_NR_UARTS=4 +# CONFIG_SERIAL_8250_EXTENDED is not set + +# +# Non-8250 serial port support +# +CONFIG_SERIAL_CORE=m +# CONFIG_SERIAL_PMACZILOG is not set +# CONFIG_SERIAL_JSM is not set +CONFIG_UNIX98_PTYS=y +CONFIG_LEGACY_PTYS=y +CONFIG_LEGACY_PTY_COUNT=256 + +# +# IPMI +# +# CONFIG_IPMI_HANDLER is not set + +# +# Watchdog Cards +# +# CONFIG_WATCHDOG is not set +CONFIG_NVRAM=y +CONFIG_GEN_RTC=y +# CONFIG_GEN_RTC_X is not set +# CONFIG_DTLK is not set +# CONFIG_R3964 is not set +# CONFIG_APPLICOM is not set + +# +# Ftape, the floppy tape device driver +# +CONFIG_AGP=m +CONFIG_AGP_UNINORTH=m +CONFIG_DRM=m +# CONFIG_DRM_TDFX is not set +CONFIG_DRM_R128=m +CONFIG_DRM_RADEON=m +# CONFIG_DRM_MGA is not set +# CONFIG_DRM_SIS is not set +# CONFIG_DRM_VIA is not set +# CONFIG_DRM_SAVAGE is not set + +# +# PCMCIA character devices +# +# CONFIG_SYNCLINK_CS is not set +# CONFIG_CARDMAN_4000 is not set +# CONFIG_CARDMAN_4040 is not set +# CONFIG_RAW_DRIVER is not set + +# +# TPM devices +# +# CONFIG_TCG_TPM is not set +# CONFIG_TELCLOCK is not set + +# +# I2C support +# +CONFIG_I2C=y +CONFIG_I2C_CHARDEV=m + +# +# I2C Algorithms +# +CONFIG_I2C_ALGOBIT=y +# CONFIG_I2C_ALGOPCF is not set +# CONFIG_I2C_ALGOPCA is not set + +# +# I2C Hardware Bus support +# +# CONFIG_I2C_ALI1535 is not set +# CONFIG_I2C_ALI1563 is not set +# CONFIG_I2C_ALI15X3 is not set +# CONFIG_I2C_AMD756 is not set +# CONFIG_I2C_AMD8111 is not set +# CONFIG_I2C_I801 is not set +# CONFIG_I2C_I810 is not set +# CONFIG_I2C_PIIX4 is not set +CONFIG_I2C_KEYWEST=m +# 
CONFIG_I2C_MPC is not set +# CONFIG_I2C_NFORCE2 is not set +# CONFIG_I2C_PARPORT_LIGHT is not set +# CONFIG_I2C_PROSAVAGE is not set +# CONFIG_I2C_SAVAGE4 is not set +# CONFIG_SCx200_ACB is not set +# CONFIG_I2C_SIS5595 is not set +# CONFIG_I2C_SIS630 is not set +# CONFIG_I2C_SIS96X is not set +# CONFIG_I2C_STUB is not set +# CONFIG_I2C_VIA is not set +# CONFIG_I2C_VIAPRO is not set +# CONFIG_I2C_VOODOO3 is not set +# CONFIG_I2C_PCA_ISA is not set + +# +# Miscellaneous I2C Chip support +# +# CONFIG_SENSORS_DS1337 is not set +# CONFIG_SENSORS_DS1374 is not set +# CONFIG_SENSORS_EEPROM is not set +# CONFIG_SENSORS_PCF8574 is not set +# CONFIG_SENSORS_PCA9539 is not set +# CONFIG_SENSORS_PCF8591 is not set +# CONFIG_SENSORS_RTC8564 is not set +# CONFIG_SENSORS_M41T00 is not set +# CONFIG_SENSORS_MAX6875 is not set +# CONFIG_RTC_X1205_I2C is not set +# CONFIG_I2C_DEBUG_CORE is not set +# CONFIG_I2C_DEBUG_ALGO is not set +# CONFIG_I2C_DEBUG_BUS is not set +# CONFIG_I2C_DEBUG_CHIP is not set + +# +# Dallas's 1-wire bus +# +# CONFIG_W1 is not set + +# +# Hardware Monitoring support +# +# CONFIG_HWMON is not set +# CONFIG_HWMON_VID is not set + +# +# Misc devices +# + +# +# Multimedia Capabilities Port drivers +# + +# +# Multimedia devices +# +# CONFIG_VIDEO_DEV is not set + +# +# Digital Video Broadcasting Devices +# +# CONFIG_DVB is not set + +# +# Graphics support +# +CONFIG_FB=y +CONFIG_FB_CFB_FILLRECT=y +CONFIG_FB_CFB_COPYAREA=y +CONFIG_FB_CFB_IMAGEBLIT=y +CONFIG_FB_MACMODES=y +CONFIG_FB_MODE_HELPERS=y +CONFIG_FB_TILEBLITTING=y +# CONFIG_FB_CIRRUS is not set +# CONFIG_FB_PM2 is not set +# CONFIG_FB_CYBER2000 is not set +CONFIG_FB_OF=y +CONFIG_FB_CONTROL=y +CONFIG_FB_PLATINUM=y +CONFIG_FB_VALKYRIE=y +CONFIG_FB_CT65550=y +# CONFIG_FB_ASILIANT is not set +CONFIG_FB_IMSTT=y +# CONFIG_FB_VGA16 is not set +# CONFIG_FB_S1D13XXX is not set +CONFIG_FB_NVIDIA=y +CONFIG_FB_NVIDIA_I2C=y +# CONFIG_FB_RIVA is not set +CONFIG_FB_MATROX=y +CONFIG_FB_MATROX_MILLENIUM=y 
+CONFIG_FB_MATROX_MYSTIQUE=y +# CONFIG_FB_MATROX_G is not set +# CONFIG_FB_MATROX_I2C is not set +# CONFIG_FB_MATROX_MULTIHEAD is not set +# CONFIG_FB_RADEON_OLD is not set +CONFIG_FB_RADEON=y +CONFIG_FB_RADEON_I2C=y +# CONFIG_FB_RADEON_DEBUG is not set +CONFIG_FB_ATY128=y +CONFIG_FB_ATY=y +CONFIG_FB_ATY_CT=y +# CONFIG_FB_ATY_GENERIC_LCD is not set +# CONFIG_FB_ATY_XL_INIT is not set +CONFIG_FB_ATY_GX=y +# CONFIG_FB_SAVAGE is not set +# CONFIG_FB_SIS is not set +# CONFIG_FB_NEOMAGIC is not set +# CONFIG_FB_KYRO is not set +CONFIG_FB_3DFX=y +# CONFIG_FB_3DFX_ACCEL is not set +# CONFIG_FB_VOODOO1 is not set +# CONFIG_FB_CYBLA is not set +# CONFIG_FB_TRIDENT is not set +# CONFIG_FB_VIRTUAL is not set + +# +# Console display driver support +# +# CONFIG_VGA_CONSOLE is not set +CONFIG_DUMMY_CONSOLE=y +CONFIG_FRAMEBUFFER_CONSOLE=y +# CONFIG_FRAMEBUFFER_CONSOLE_ROTATION is not set +# CONFIG_FONTS is not set +CONFIG_FONT_8x8=y +CONFIG_FONT_8x16=y + +# +# Logo configuration +# +CONFIG_LOGO=y +CONFIG_LOGO_LINUX_MONO=y +CONFIG_LOGO_LINUX_VGA16=y +CONFIG_LOGO_LINUX_CLUT224=y +# CONFIG_BACKLIGHT_LCD_SUPPORT is not set + +# +# Sound +# +CONFIG_SOUND=m +CONFIG_DMASOUND_PMAC=m +CONFIG_DMASOUND=m + +# +# Advanced Linux Sound Architecture +# +CONFIG_SND=m +CONFIG_SND_TIMER=m +CONFIG_SND_PCM=m +CONFIG_SND_HWDEP=m +CONFIG_SND_RAWMIDI=m +CONFIG_SND_SEQUENCER=m +CONFIG_SND_SEQ_DUMMY=m +CONFIG_SND_OSSEMUL=y +CONFIG_SND_MIXER_OSS=m +CONFIG_SND_PCM_OSS=m +CONFIG_SND_SEQUENCER_OSS=y +# CONFIG_SND_VERBOSE_PRINTK is not set +# CONFIG_SND_DEBUG is not set +CONFIG_SND_GENERIC_DRIVER=y + +# +# Generic devices +# +CONFIG_SND_DUMMY=m +# CONFIG_SND_VIRMIDI is not set +# CONFIG_SND_MTPAV is not set +# CONFIG_SND_SERIAL_U16550 is not set +# CONFIG_SND_MPU401 is not set + +# +# PCI devices +# +# CONFIG_SND_ALI5451 is not set +# CONFIG_SND_ATIIXP is not set +# CONFIG_SND_ATIIXP_MODEM is not set +# CONFIG_SND_AU8810 is not set +# CONFIG_SND_AU8820 is not set +# CONFIG_SND_AU8830 is not set +# 
CONFIG_SND_AZT3328 is not set +# CONFIG_SND_BT87X is not set +# CONFIG_SND_CS46XX is not set +# CONFIG_SND_CS4281 is not set +# CONFIG_SND_EMU10K1 is not set +# CONFIG_SND_EMU10K1X is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_KORG1212 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set +# CONFIG_SND_HDSP is not set +# CONFIG_SND_HDSPM is not set +# CONFIG_SND_TRIDENT is not set +# CONFIG_SND_YMFPCI is not set +# CONFIG_SND_AD1889 is not set +# CONFIG_SND_ALS4000 is not set +# CONFIG_SND_CMIPCI is not set +# CONFIG_SND_ENS1370 is not set +# CONFIG_SND_ENS1371 is not set +# CONFIG_SND_ES1938 is not set +# CONFIG_SND_ES1968 is not set +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_FM801 is not set +# CONFIG_SND_ICE1712 is not set +# CONFIG_SND_ICE1724 is not set +# CONFIG_SND_INTEL8X0 is not set +# CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_VIA82XX is not set +# CONFIG_SND_VIA82XX_MODEM is not set +# CONFIG_SND_VX222 is not set +# CONFIG_SND_HDA_INTEL is not set + +# +# ALSA PowerMac devices +# +CONFIG_SND_POWERMAC=m +# CONFIG_SND_POWERMAC_AUTO_DRC is not set + +# +# USB devices +# +CONFIG_SND_USB_AUDIO=m +# CONFIG_SND_USB_USX2Y is not set + +# +# PCMCIA devices +# + +# +# Open Sound System +# +# CONFIG_SOUND_PRIME is not set + +# +# USB support +# +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB_ARCH_HAS_OHCI=y +CONFIG_USB=y +# CONFIG_USB_DEBUG is not set + +# +# Miscellaneous USB options +# +CONFIG_USB_DEVICEFS=y +# CONFIG_USB_BANDWIDTH is not set +CONFIG_USB_DYNAMIC_MINORS=y +# CONFIG_USB_SUSPEND is not set +# CONFIG_USB_OTG is not set + +# +# USB Host Controller Drivers +# +# CONFIG_USB_EHCI_HCD is not set +# CONFIG_USB_ISP116X_HCD is not set +CONFIG_USB_OHCI_HCD=y +# CONFIG_USB_OHCI_BIG_ENDIAN is not set +CONFIG_USB_OHCI_LITTLE_ENDIAN=y +# CONFIG_USB_UHCI_HCD is not set +# CONFIG_USB_SL811_HCD is not set + 
+# +# USB Device Class drivers +# +# CONFIG_OBSOLETE_OSS_USB_DRIVER is not set +CONFIG_USB_ACM=m +CONFIG_USB_PRINTER=m + +# +# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support' +# + +# +# may also be needed; see USB_STORAGE Help for more information +# +# CONFIG_USB_STORAGE is not set + +# +# USB Input Devices +# +CONFIG_USB_HID=y +CONFIG_USB_HIDINPUT=y +# CONFIG_HID_FF is not set +# CONFIG_USB_HIDDEV is not set +# CONFIG_USB_AIPTEK is not set +# CONFIG_USB_WACOM is not set +# CONFIG_USB_ACECAD is not set +# CONFIG_USB_KBTAB is not set +# CONFIG_USB_POWERMATE is not set +# CONFIG_USB_MTOUCH is not set +# CONFIG_USB_ITMTOUCH is not set +# CONFIG_USB_EGALAX is not set +# CONFIG_USB_YEALINK is not set +# CONFIG_USB_XPAD is not set +# CONFIG_USB_ATI_REMOTE is not set +# CONFIG_USB_KEYSPAN_REMOTE is not set +CONFIG_USB_APPLETOUCH=y + +# +# USB Imaging devices +# +# CONFIG_USB_MDC800 is not set +# CONFIG_USB_MICROTEK is not set + +# +# USB Multimedia devices +# +# CONFIG_USB_DABUSB is not set + +# +# Video4Linux support is needed for USB Multimedia device support +# + +# +# USB Network Adapters +# +# CONFIG_USB_CATC is not set +# CONFIG_USB_KAWETH is not set +# CONFIG_USB_PEGASUS is not set +# CONFIG_USB_RTL8150 is not set +CONFIG_USB_USBNET=m +CONFIG_USB_NET_AX8817X=m +CONFIG_USB_NET_CDCETHER=m +# CONFIG_USB_NET_GL620A is not set +CONFIG_USB_NET_NET1080=m +# CONFIG_USB_NET_PLUSB is not set +# CONFIG_USB_NET_RNDIS_HOST is not set +# CONFIG_USB_NET_CDC_SUBSET is not set +CONFIG_USB_NET_ZAURUS=m +# CONFIG_USB_ZD1201 is not set +CONFIG_USB_MON=y + +# +# USB port drivers +# + +# +# USB Serial Converter support +# +CONFIG_USB_SERIAL=m +# CONFIG_USB_SERIAL_GENERIC is not set +# CONFIG_USB_SERIAL_AIRPRIME is not set +# CONFIG_USB_SERIAL_ANYDATA is not set +# CONFIG_USB_SERIAL_BELKIN is not set +# CONFIG_USB_SERIAL_WHITEHEAT is not set +# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set +# CONFIG_USB_SERIAL_CP2101 is not set +# CONFIG_USB_SERIAL_CYPRESS_M8 is not set +# 
CONFIG_USB_SERIAL_EMPEG is not set +# CONFIG_USB_SERIAL_FTDI_SIO is not set +CONFIG_USB_SERIAL_VISOR=m +CONFIG_USB_SERIAL_IPAQ=m +# CONFIG_USB_SERIAL_IR is not set +# CONFIG_USB_SERIAL_EDGEPORT is not set +# CONFIG_USB_SERIAL_EDGEPORT_TI is not set +# CONFIG_USB_SERIAL_GARMIN is not set +# CONFIG_USB_SERIAL_IPW is not set +CONFIG_USB_SERIAL_KEYSPAN_PDA=m +CONFIG_USB_SERIAL_KEYSPAN=m +CONFIG_USB_SERIAL_KEYSPAN_MPR=y +CONFIG_USB_SERIAL_KEYSPAN_USA28=y +CONFIG_USB_SERIAL_KEYSPAN_USA28X=y +CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y +CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y +CONFIG_USB_SERIAL_KEYSPAN_USA19=y +CONFIG_USB_SERIAL_KEYSPAN_USA18X=y +CONFIG_USB_SERIAL_KEYSPAN_USA19W=y +CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y +CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y +CONFIG_USB_SERIAL_KEYSPAN_USA49W=y +CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y +# CONFIG_USB_SERIAL_KLSI is not set +# CONFIG_USB_SERIAL_KOBIL_SCT is not set +# CONFIG_USB_SERIAL_MCT_U232 is not set +# CONFIG_USB_SERIAL_PL2303 is not set +# CONFIG_USB_SERIAL_HP4X is not set +# CONFIG_USB_SERIAL_SAFE is not set +# CONFIG_USB_SERIAL_TI is not set +# CONFIG_USB_SERIAL_CYBERJACK is not set +# CONFIG_USB_SERIAL_XIRCOM is not set +# CONFIG_USB_SERIAL_OPTION is not set +# CONFIG_USB_SERIAL_OMNINET is not set +CONFIG_USB_EZUSB=y + +# +# USB Miscellaneous drivers +# +# CONFIG_USB_EMI62 is not set +# CONFIG_USB_EMI26 is not set +# CONFIG_USB_AUERSWALD is not set +# CONFIG_USB_RIO500 is not set +# CONFIG_USB_LEGOTOWER is not set +# CONFIG_USB_LCD is not set +# CONFIG_USB_LED is not set +# CONFIG_USB_CYTHERM is not set +# CONFIG_USB_PHIDGETKIT is not set +# CONFIG_USB_PHIDGETSERVO is not set +# CONFIG_USB_IDMOUSE is not set +# CONFIG_USB_LD is not set +# CONFIG_USB_TEST is not set + +# +# USB DSL modem support +# + +# +# USB Gadget Support +# +# CONFIG_USB_GADGET is not set + +# +# MMC/SD Card support +# +# CONFIG_MMC is not set + +# +# InfiniBand support +# +# CONFIG_INFINIBAND is not set + +# +# SN Devices +# + +# +# File systems +# 
+CONFIG_EXT2_FS=y +# CONFIG_EXT2_FS_XATTR is not set +# CONFIG_EXT2_FS_XIP is not set +CONFIG_EXT3_FS=y +CONFIG_EXT3_FS_XATTR=y +# CONFIG_EXT3_FS_POSIX_ACL is not set +# CONFIG_EXT3_FS_SECURITY is not set +CONFIG_JBD=y +# CONFIG_JBD_DEBUG is not set +CONFIG_FS_MBCACHE=y +# CONFIG_REISERFS_FS is not set +# CONFIG_JFS_FS is not set +# CONFIG_FS_POSIX_ACL is not set +# CONFIG_XFS_FS is not set +# CONFIG_MINIX_FS is not set +# CONFIG_ROMFS_FS is not set +CONFIG_INOTIFY=y +# CONFIG_QUOTA is not set +CONFIG_DNOTIFY=y +# CONFIG_AUTOFS_FS is not set +# CONFIG_AUTOFS4_FS is not set +CONFIG_FUSE_FS=m + +# +# CD-ROM/DVD Filesystems +# +CONFIG_ISO9660_FS=y +CONFIG_JOLIET=y +CONFIG_ZISOFS=y +CONFIG_ZISOFS_FS=y +CONFIG_UDF_FS=m +CONFIG_UDF_NLS=y + +# +# DOS/FAT/NT Filesystems +# +CONFIG_FAT_FS=m +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=m +CONFIG_FAT_DEFAULT_CODEPAGE=437 +CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1" +# CONFIG_NTFS_FS is not set + +# +# Pseudo filesystems +# +CONFIG_PROC_FS=y +CONFIG_PROC_KCORE=y +CONFIG_SYSFS=y +CONFIG_TMPFS=y +# CONFIG_HUGETLB_PAGE is not set +CONFIG_RAMFS=y +CONFIG_RELAYFS_FS=m + +# +# Miscellaneous filesystems +# +# CONFIG_ADFS_FS is not set +# CONFIG_AFFS_FS is not set +CONFIG_HFS_FS=m +CONFIG_HFSPLUS_FS=m +# CONFIG_BEFS_FS is not set +# CONFIG_BFS_FS is not set +# CONFIG_EFS_FS is not set +# CONFIG_CRAMFS is not set +# CONFIG_VXFS_FS is not set +# CONFIG_HPFS_FS is not set +# CONFIG_QNX4FS_FS is not set +# CONFIG_SYSV_FS is not set +# CONFIG_UFS_FS is not set + +# +# Network File Systems +# +CONFIG_NFS_FS=y +# CONFIG_NFS_V3 is not set +# CONFIG_NFS_V4 is not set +# CONFIG_NFS_DIRECTIO is not set +CONFIG_NFSD=y +# CONFIG_NFSD_V3 is not set +# CONFIG_NFSD_TCP is not set +CONFIG_LOCKD=y +CONFIG_EXPORTFS=y +CONFIG_NFS_COMMON=y +CONFIG_SUNRPC=y +# CONFIG_RPCSEC_GSS_KRB5 is not set +# CONFIG_RPCSEC_GSS_SPKM3 is not set +CONFIG_SMB_FS=m +# CONFIG_SMB_NLS_DEFAULT is not set +# CONFIG_CIFS is not set +# CONFIG_NCP_FS is not set +# CONFIG_CODA_FS is not set +# 
CONFIG_AFS_FS is not set +# CONFIG_9P_FS is not set + +# +# Partition Types +# +CONFIG_PARTITION_ADVANCED=y +# CONFIG_ACORN_PARTITION is not set +# CONFIG_OSF_PARTITION is not set +# CONFIG_AMIGA_PARTITION is not set +# CONFIG_ATARI_PARTITION is not set +CONFIG_MAC_PARTITION=y +CONFIG_MSDOS_PARTITION=y +# CONFIG_BSD_DISKLABEL is not set +# CONFIG_MINIX_SUBPARTITION is not set +# CONFIG_SOLARIS_X86_PARTITION is not set +# CONFIG_UNIXWARE_DISKLABEL is not set +# CONFIG_LDM_PARTITION is not set +# CONFIG_SGI_PARTITION is not set +# CONFIG_ULTRIX_PARTITION is not set +# CONFIG_SUN_PARTITION is not set +# CONFIG_EFI_PARTITION is not set + +# +# Native Language Support +# +CONFIG_NLS=y +CONFIG_NLS_DEFAULT="iso8859-1" +CONFIG_NLS_CODEPAGE_437=m +# CONFIG_NLS_CODEPAGE_737 is not set +# CONFIG_NLS_CODEPAGE_775 is not set +# CONFIG_NLS_CODEPAGE_850 is not set +# CONFIG_NLS_CODEPAGE_852 is not set +# CONFIG_NLS_CODEPAGE_855 is not set +# CONFIG_NLS_CODEPAGE_857 is not set +# CONFIG_NLS_CODEPAGE_860 is not set +# CONFIG_NLS_CODEPAGE_861 is not set +# CONFIG_NLS_CODEPAGE_862 is not set +# CONFIG_NLS_CODEPAGE_863 is not set +# CONFIG_NLS_CODEPAGE_864 is not set +# CONFIG_NLS_CODEPAGE_865 is not set +# CONFIG_NLS_CODEPAGE_866 is not set +# CONFIG_NLS_CODEPAGE_869 is not set +# CONFIG_NLS_CODEPAGE_936 is not set +# CONFIG_NLS_CODEPAGE_950 is not set +# CONFIG_NLS_CODEPAGE_932 is not set +# CONFIG_NLS_CODEPAGE_949 is not set +# CONFIG_NLS_CODEPAGE_874 is not set +# CONFIG_NLS_ISO8859_8 is not set +# CONFIG_NLS_CODEPAGE_1250 is not set +# CONFIG_NLS_CODEPAGE_1251 is not set +# CONFIG_NLS_ASCII is not set +CONFIG_NLS_ISO8859_1=m +# CONFIG_NLS_ISO8859_2 is not set +# CONFIG_NLS_ISO8859_3 is not set +# CONFIG_NLS_ISO8859_4 is not set +# CONFIG_NLS_ISO8859_5 is not set +# CONFIG_NLS_ISO8859_6 is not set +# CONFIG_NLS_ISO8859_7 is not set +# CONFIG_NLS_ISO8859_9 is not set +# CONFIG_NLS_ISO8859_13 is not set +# CONFIG_NLS_ISO8859_14 is not set +# CONFIG_NLS_ISO8859_15 is not set +# 
CONFIG_NLS_KOI8_R is not set +# CONFIG_NLS_KOI8_U is not set +CONFIG_NLS_UTF8=m + +# +# Library routines +# +CONFIG_CRC_CCITT=y +CONFIG_CRC16=y +CONFIG_CRC32=y +# CONFIG_LIBCRC32C is not set +CONFIG_ZLIB_INFLATE=y +CONFIG_ZLIB_DEFLATE=y +CONFIG_TEXTSEARCH=y +CONFIG_TEXTSEARCH_KMP=m +CONFIG_TEXTSEARCH_BM=m +CONFIG_TEXTSEARCH_FSM=m + +# +# Instrumentation Support +# +CONFIG_PROFILING=y +CONFIG_OPROFILE=y + +# +# Kernel hacking +# +# CONFIG_PRINTK_TIME is not set +CONFIG_DEBUG_KERNEL=y +# CONFIG_MAGIC_SYSRQ is not set +CONFIG_LOG_BUF_SHIFT=14 +CONFIG_DETECT_SOFTLOCKUP=y +# CONFIG_SCHEDSTATS is not set +# CONFIG_DEBUG_SLAB is not set +# CONFIG_DEBUG_SPINLOCK is not set +# CONFIG_DEBUG_SPINLOCK_SLEEP is not set +# CONFIG_DEBUG_KOBJECT is not set +# CONFIG_DEBUG_INFO is not set +# CONFIG_DEBUG_FS is not set +# CONFIG_DEBUG_VM is not set +# CONFIG_RCU_TORTURE_TEST is not set +CONFIG_DEBUGGER=y +CONFIG_XMON=y +CONFIG_XMON_DEFAULT=y +# CONFIG_BDI_SWITCH is not set +CONFIG_BOOTX_TEXT=y + +# +# Security options +# +# CONFIG_KEYS is not set +# CONFIG_SECURITY is not set + +# +# Cryptographic options +# +CONFIG_CRYPTO=y +# CONFIG_CRYPTO_HMAC is not set +# CONFIG_CRYPTO_NULL is not set +# CONFIG_CRYPTO_MD4 is not set +# CONFIG_CRYPTO_MD5 is not set +# CONFIG_CRYPTO_SHA1 is not set +# CONFIG_CRYPTO_SHA256 is not set +# CONFIG_CRYPTO_SHA512 is not set +# CONFIG_CRYPTO_WP512 is not set +# CONFIG_CRYPTO_TGR192 is not set +# CONFIG_CRYPTO_DES is not set +# CONFIG_CRYPTO_BLOWFISH is not set +# CONFIG_CRYPTO_TWOFISH is not set +# CONFIG_CRYPTO_SERPENT is not set +CONFIG_CRYPTO_AES=m +# CONFIG_CRYPTO_CAST5 is not set +# CONFIG_CRYPTO_CAST6 is not set +# CONFIG_CRYPTO_TEA is not set +CONFIG_CRYPTO_ARC4=m +# CONFIG_CRYPTO_KHAZAD is not set +# CONFIG_CRYPTO_ANUBIS is not set +# CONFIG_CRYPTO_DEFLATE is not set +CONFIG_CRYPTO_MICHAEL_MIC=m +# CONFIG_CRYPTO_CRC32C is not set +# CONFIG_CRYPTO_TEST is not set + +# +# Hardware crypto devices +# From benh at kernel.crashing.org Tue Dec 13 
17:48:35 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 17:48:35 +1100 Subject: [PATCH] powerpc: Fix platinumfb for some modes Message-ID: <1134456515.6989.180.camel@gaston> The platinumfb driver used only on some powermacs has an issue with some video modes & limited VRAM. This patch fixes it. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/drivers/video/platinumfb.h =================================================================== --- linux-work.orig/drivers/video/platinumfb.h 2005-09-22 14:07:19.000000000 +1000 +++ linux-work/drivers/video/platinumfb.h 2005-11-25 17:17:17.000000000 +1100 @@ -158,7 +158,9 @@ /* 832x624, 75Hz (13) */ static struct platinum_regvals platinum_reg_init_13 = { 0x70, - { 864, 1680, 3360 }, /* MacOS does 1680 instead of 1696 to fit 16bpp in 1MB */ + { 864, 1680, 3344 }, /* MacOS does 1680 instead of 1696 to fit 16bpp in 1MB, + * and we use 3344 instead of 3360 to fit in 2Mb + */ { 0xff0, 4, 0, 0, 0, 0, 0x299, 0, 0, 0x21e, 0x120, 0x10, 0x23f, 0x1f, 0x25, 0x37, 0x8a, 0x22a, 0x23e, 0x536, 0x534, 4, 9, 0x52, Index: linux-work/drivers/video/platinumfb.c =================================================================== --- linux-work.orig/drivers/video/platinumfb.c 2005-11-10 08:20:23.000000000 +1100 +++ linux-work/drivers/video/platinumfb.c 2005-11-25 17:17:17.000000000 +1100 @@ -138,13 +138,15 @@ init = platinum_reg_init[pinfo->vmode-1]; - if (pinfo->vmode == 13 && pinfo->cmode > 0) - offset = 0x10; + if ((pinfo->vmode == VMODE_832_624_75) && (pinfo->cmode > CMODE_8)) + offset = 0x10; + info->screen_base = pinfo->frame_buffer + init->fb_offset + offset; info->fix.smem_start = (pinfo->frame_buffer_phys) + init->fb_offset + offset; info->fix.visual = (pinfo->cmode == CMODE_8) ? 
FB_VISUAL_PSEUDOCOLOR : FB_VISUAL_DIRECTCOLOR; - info->fix.line_length = vmode_attrs[pinfo->vmode-1].hres * (1<<pinfo->cmode) + offset; + info->fix.line_length = vmode_attrs[pinfo->vmode-1].hres * (1<<pinfo->cmode) + + offset; printk("line_length: %x\n", info->fix.line_length); return 0; } @@ -221,7 +223,9 @@ static inline int platinum_vram_reqd(int video_mode, int color_mode) { return vmode_attrs[video_mode-1].vres * - (vmode_attrs[video_mode-1].hres * (1< CMODE_8)) ? 0x10 : 0x20) + 0x1000; } #define STORE_D2(a, d) { \ From benh at kernel.crashing.org Tue Dec 13 18:01:21 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 18:01:21 +1100 Subject: [PATCH] powerpc: Remove device_node addrs/n_addr Message-ID: <1134457283.6989.185.camel@gaston> The pre-parsed addrs/n_addrs fields in struct device_node are finally gone. Remove the dodgy heuristics that did that parsing at boot and remove the fields themselves since we now have a good replacement with the new OF parsing code. This patch also fixes a bunch of drivers to use the new code instead, so that at least pmac32, pseries, iseries and g5 defconfigs build. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/powerpc/kernel/prom.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/prom.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/kernel/prom.c 2005-12-13 17:51:20.000000000 +1100 @@ -57,21 +57,6 @@ #define DBG(fmt...) 
#endif -struct pci_reg_property { - struct pci_address addr; - u32 size_hi; - u32 size_lo; -}; - -struct isa_reg_property { - u32 space; - u32 address; - u32 size; -}; - - -typedef int interpret_func(struct device_node *, unsigned long *, - int, int, int); static int __initdata dt_root_addr_cells; static int __initdata dt_root_size_cells; @@ -410,237 +395,19 @@ static int __devinit finish_node_interru return 0; } -static int __devinit interpret_pci_props(struct device_node *np, - unsigned long *mem_start, - int naddrc, int nsizec, - int measure_only) -{ - struct address_range *adr; - struct pci_reg_property *pci_addrs; - int i, l, n_addrs; - - pci_addrs = (struct pci_reg_property *) - get_property(np, "assigned-addresses", &l); - if (!pci_addrs) - return 0; - - n_addrs = l / sizeof(*pci_addrs); - - adr = prom_alloc(n_addrs * sizeof(*adr), mem_start); - if (!adr) - return -ENOMEM; - - if (measure_only) - return 0; - - np->addrs = adr; - np->n_addrs = n_addrs; - - for (i = 0; i < n_addrs; i++) { - adr[i].space = pci_addrs[i].addr.a_hi; - adr[i].address = pci_addrs[i].addr.a_lo | - ((u64)pci_addrs[i].addr.a_mid << 32); - adr[i].size = pci_addrs[i].size_lo; - } - - return 0; -} - -static int __init interpret_dbdma_props(struct device_node *np, - unsigned long *mem_start, - int naddrc, int nsizec, - int measure_only) -{ - struct reg_property32 *rp; - struct address_range *adr; - unsigned long base_address; - int i, l; - struct device_node *db; - - base_address = 0; - if (!measure_only) { - for (db = np->parent; db != NULL; db = db->parent) { - if (!strcmp(db->type, "dbdma") && db->n_addrs != 0) { - base_address = db->addrs[0].address; - break; - } - } - } - - rp = (struct reg_property32 *) get_property(np, "reg", &l); - if (rp != 0 && l >= sizeof(struct reg_property32)) { - i = 0; - adr = (struct address_range *) (*mem_start); - while ((l -= sizeof(struct reg_property32)) >= 0) { - if (!measure_only) { - adr[i].space = 2; - adr[i].address = rp[i].address + base_address; 
- adr[i].size = rp[i].size; - } - ++i; - } - np->addrs = adr; - np->n_addrs = i; - (*mem_start) += i * sizeof(struct address_range); - } - - return 0; -} - -static int __init interpret_macio_props(struct device_node *np, - unsigned long *mem_start, - int naddrc, int nsizec, - int measure_only) -{ - struct reg_property32 *rp; - struct address_range *adr; - unsigned long base_address; - int i, l; - struct device_node *db; - - base_address = 0; - if (!measure_only) { - for (db = np->parent; db != NULL; db = db->parent) { - if (!strcmp(db->type, "mac-io") && db->n_addrs != 0) { - base_address = db->addrs[0].address; - break; - } - } - } - - rp = (struct reg_property32 *) get_property(np, "reg", &l); - if (rp != 0 && l >= sizeof(struct reg_property32)) { - i = 0; - adr = (struct address_range *) (*mem_start); - while ((l -= sizeof(struct reg_property32)) >= 0) { - if (!measure_only) { - adr[i].space = 2; - adr[i].address = rp[i].address + base_address; - adr[i].size = rp[i].size; - } - ++i; - } - np->addrs = adr; - np->n_addrs = i; - (*mem_start) += i * sizeof(struct address_range); - } - - return 0; -} - -static int __init interpret_isa_props(struct device_node *np, - unsigned long *mem_start, - int naddrc, int nsizec, - int measure_only) -{ - struct isa_reg_property *rp; - struct address_range *adr; - int i, l; - - rp = (struct isa_reg_property *) get_property(np, "reg", &l); - if (rp != 0 && l >= sizeof(struct isa_reg_property)) { - i = 0; - adr = (struct address_range *) (*mem_start); - while ((l -= sizeof(struct isa_reg_property)) >= 0) { - if (!measure_only) { - adr[i].space = rp[i].space; - adr[i].address = rp[i].address; - adr[i].size = rp[i].size; - } - ++i; - } - np->addrs = adr; - np->n_addrs = i; - (*mem_start) += i * sizeof(struct address_range); - } - - return 0; -} - -static int __init interpret_root_props(struct device_node *np, - unsigned long *mem_start, - int naddrc, int nsizec, - int measure_only) -{ - struct address_range *adr; - int i, l; - 
unsigned int *rp; - int rpsize = (naddrc + nsizec) * sizeof(unsigned int); - - rp = (unsigned int *) get_property(np, "linux,usable-memory", &l); - if (rp == NULL) - rp = (unsigned int *) get_property(np, "reg", &l); - - if (rp != 0 && l >= rpsize) { - i = 0; - adr = (struct address_range *) (*mem_start); - while ((l -= rpsize) >= 0) { - if (!measure_only) { - adr[i].space = 0; - adr[i].address = rp[naddrc - 1]; - adr[i].size = rp[naddrc + nsizec - 1]; - } - ++i; - rp += naddrc + nsizec; - } - np->addrs = adr; - np->n_addrs = i; - (*mem_start) += i * sizeof(struct address_range); - } - - return 0; -} - static int __devinit finish_node(struct device_node *np, unsigned long *mem_start, - interpret_func *ifunc, - int naddrc, int nsizec, int measure_only) { struct device_node *child; - int *ip, rc = 0; - - /* get the device addresses and interrupts */ - if (ifunc != NULL) - rc = ifunc(np, mem_start, naddrc, nsizec, measure_only); - if (rc) - goto out; + int rc = 0; rc = finish_node_interrupts(np, mem_start, measure_only); if (rc) goto out; - /* Look for #address-cells and #size-cells properties. 
*/ - ip = (int *) get_property(np, "#address-cells", NULL); - if (ip != NULL) - naddrc = *ip; - ip = (int *) get_property(np, "#size-cells", NULL); - if (ip != NULL) - nsizec = *ip; - - if (!strcmp(np->name, "device-tree") || np->parent == NULL) - ifunc = interpret_root_props; - else if (np->type == 0) - ifunc = NULL; - else if (!strcmp(np->type, "pci") || !strcmp(np->type, "vci")) - ifunc = interpret_pci_props; - else if (!strcmp(np->type, "dbdma")) - ifunc = interpret_dbdma_props; - else if (!strcmp(np->type, "mac-io") || ifunc == interpret_macio_props) - ifunc = interpret_macio_props; - else if (!strcmp(np->type, "isa")) - ifunc = interpret_isa_props; - else if (!strcmp(np->name, "uni-n") || !strcmp(np->name, "u3")) - ifunc = interpret_root_props; - else if (!((ifunc == interpret_dbdma_props - || ifunc == interpret_macio_props) - && (!strcmp(np->type, "escc") - || !strcmp(np->type, "media-bay")))) - ifunc = NULL; - for (child = np->child; child != NULL; child = child->sibling) { - rc = finish_node(child, mem_start, ifunc, - naddrc, nsizec, measure_only); + rc = finish_node(child, mem_start, measure_only); if (rc) goto out; } @@ -702,10 +469,10 @@ void __init finish_device_tree(void) * reason and then remove those additional 16 bytes */ size = 16; - finish_node(allnodes, &size, NULL, 0, 0, 1); + finish_node(allnodes, &size, 1); size -= 16; end = start = (unsigned long) __va(lmb_alloc(size, 128)); - finish_node(allnodes, &end, NULL, 0, 0, 0); + finish_node(allnodes, &end, 0); BUG_ON(end != start + size); DBG(" <- finish_device_tree\n"); @@ -1822,7 +1589,6 @@ static void of_node_release(struct kref prop = next; } kfree(node->intrs); - kfree(node->addrs); kfree(node->full_name); kfree(node->data); kfree(node); @@ -1904,9 +1670,7 @@ void of_detach_node(const struct device_ * This should probably be split up into smaller chunks. 
*/ -static int of_finish_dynamic_node(struct device_node *node, - unsigned long *unused1, int unused2, - int unused3, int unused4) +static int of_finish_dynamic_node(struct device_node *node) { struct device_node *parent = of_get_parent(node); int err = 0; @@ -1927,7 +1691,8 @@ static int of_finish_dynamic_node(struct return -ENODEV; /* fix up new node's linux_phandle field */ - if ((ibm_phandle = (unsigned int *)get_property(node, "ibm,phandle", NULL))) + if ((ibm_phandle = (unsigned int *)get_property(node, + "ibm,phandle", NULL))) node->linux_phandle = *ibm_phandle; out: @@ -1942,7 +1707,9 @@ static int prom_reconfig_notifier(struct switch (action) { case PSERIES_RECONFIG_ADD: - err = finish_node(node, NULL, of_finish_dynamic_node, 0, 0, 0); + err = of_finish_dynamic_node(node); + if (!err) + finish_node(node, NULL, 0); if (err < 0) { printk(KERN_ERR "finish_node returned %d\n", err); err = NOTIFY_BAD; @@ -2016,175 +1783,4 @@ int prom_add_property(struct device_node return 0; } -/* I quickly hacked that one, check against spec ! 
*/ -static inline unsigned long -bus_space_to_resource_flags(unsigned int bus_space) -{ - u8 space = (bus_space >> 24) & 0xf; - if (space == 0) - space = 0x02; - if (space == 0x02) - return IORESOURCE_MEM; - else if (space == 0x01) - return IORESOURCE_IO; - else { - printk(KERN_WARNING "prom.c: bus_space_to_resource_flags(), space: %x\n", - bus_space); - return 0; - } -} - -#ifdef CONFIG_PCI -static struct resource *find_parent_pci_resource(struct pci_dev* pdev, - struct address_range *range) -{ - unsigned long mask; - int i; - - /* Check this one */ - mask = bus_space_to_resource_flags(range->space); - for (i=0; i<DEVICE_COUNT_RESOURCE; i++) { - if ((pdev->resource[i].flags & mask) == mask && - pdev->resource[i].start <= range->address && - pdev->resource[i].end > range->address) { - if ((range->address + range->size - 1) > pdev->resource[i].end) { - /* Add better message */ - printk(KERN_WARNING "PCI/OF resource overlap !\n"); - return NULL; - } - break; - } - } - if (i == DEVICE_COUNT_RESOURCE) - return NULL; - return &pdev->resource[i]; -} - -/* - * Request an OF device resource. Currently handles child of PCI devices, - * or other nodes attached to the root node. Ultimately, put some - * link to resources in the OF node. 
- */ -struct resource *request_OF_resource(struct device_node* node, int index, - const char* name_postfix) -{ - struct pci_dev* pcidev; - u8 pci_bus, pci_devfn; - unsigned long iomask; - struct device_node* nd; - struct resource* parent; - struct resource *res = NULL; - int nlen, plen; - - if (index >= node->n_addrs) - goto fail; - - /* Sanity check on bus space */ - iomask = bus_space_to_resource_flags(node->addrs[index].space); - if (iomask & IORESOURCE_MEM) - parent = &iomem_resource; - else if (iomask & IORESOURCE_IO) - parent = &ioport_resource; - else - goto fail; - - /* Find a PCI parent if any */ - nd = node; - pcidev = NULL; - while (nd) { - if (!pci_device_from_OF_node(nd, &pci_bus, &pci_devfn)) - pcidev = pci_find_slot(pci_bus, pci_devfn); - if (pcidev) break; - nd = nd->parent; - } - if (pcidev) - parent = find_parent_pci_resource(pcidev, &node->addrs[index]); - if (!parent) { - printk(KERN_WARNING "request_OF_resource(%s), parent not found\n", - node->name); - goto fail; - } - - res = __request_region(parent, node->addrs[index].address, - node->addrs[index].size, NULL); - if (!res) - goto fail; - nlen = strlen(node->name); - plen = name_postfix ? 
strlen(name_postfix) : 0; - res->name = (const char *)kmalloc(nlen+plen+1, GFP_KERNEL); - if (res->name) { - strcpy((char *)res->name, node->name); - if (plen) - strcpy((char *)res->name+nlen, name_postfix); - } - return res; -fail: - return NULL; -} -EXPORT_SYMBOL(request_OF_resource); -int release_OF_resource(struct device_node *node, int index) -{ - struct pci_dev* pcidev; - u8 pci_bus, pci_devfn; - unsigned long iomask, start, end; - struct device_node* nd; - struct resource* parent; - struct resource *res = NULL; - - if (index >= node->n_addrs) - return -EINVAL; - - /* Sanity check on bus space */ - iomask = bus_space_to_resource_flags(node->addrs[index].space); - if (iomask & IORESOURCE_MEM) - parent = &iomem_resource; - else if (iomask & IORESOURCE_IO) - parent = &ioport_resource; - else - return -EINVAL; - - /* Find a PCI parent if any */ - nd = node; - pcidev = NULL; - while(nd) { - if (!pci_device_from_OF_node(nd, &pci_bus, &pci_devfn)) - pcidev = pci_find_slot(pci_bus, pci_devfn); - if (pcidev) break; - nd = nd->parent; - } - if (pcidev) - parent = find_parent_pci_resource(pcidev, &node->addrs[index]); - if (!parent) { - printk(KERN_WARNING "release_OF_resource(%s), parent not found\n", - node->name); - return -ENODEV; - } - - /* Find us in the parent and its childs */ - res = parent->child; - start = node->addrs[index].address; - end = start + node->addrs[index].size - 1; - while (res) { - if (res->start == start && res->end == end && - (res->flags & IORESOURCE_BUSY)) - break; - if (res->start <= start && res->end >= end) - res = res->child; - else - res = res->sibling; - } - if (!res) - return -ENODEV; - - if (res->name) { - kfree(res->name); - res->name = NULL; - } - release_resource(res); - kfree(res); - - return 0; -} -EXPORT_SYMBOL(release_OF_resource); -#endif /* CONFIG_PCI */ Index: linux-work/include/asm-powerpc/prom.h =================================================================== --- linux-work.orig/include/asm-powerpc/prom.h 2005-12-13 
17:49:22.000000000 +1100 +++ linux-work/include/asm-powerpc/prom.h 2005-12-13 17:51:20.000000000 +1100 @@ -65,49 +65,11 @@ struct boot_param_header typedef u32 phandle; typedef u32 ihandle; -struct address_range { - unsigned long space; - unsigned long address; - unsigned long size; -}; - struct interrupt_info { int line; int sense; /* +ve/-ve logic, edge or level, etc. */ }; -struct pci_address { - u32 a_hi; - u32 a_mid; - u32 a_lo; -}; - -struct isa_address { - u32 a_hi; - u32 a_lo; -}; - -struct isa_range { - struct isa_address isa_addr; - struct pci_address pci_addr; - unsigned int size; -}; - -struct reg_property { - unsigned long address; - unsigned long size; -}; - -struct reg_property32 { - unsigned int address; - unsigned int size; -}; - -struct reg_property64 { - u64 address; - u64 size; -}; - struct property { char *name; int length; @@ -120,8 +82,6 @@ struct device_node { char *type; phandle node; phandle linux_phandle; - int n_addrs; - struct address_range *addrs; int n_intrs; struct interrupt_info *intrs; char *full_name; Index: linux-work/arch/powerpc/platforms/powermac/feature.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/feature.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/feature.c 2005-12-13 17:51:20.000000000 +1100 @@ -1445,20 +1445,55 @@ static long g5_i2s_enable(struct device_ /* Very crude implementation for now */ struct macio_chip *macio = &macio_chips[0]; unsigned long flags; + int cell; + u32 fcrs[3][3] = { + { 0, + K2_FCR1_I2S0_CELL_ENABLE | + K2_FCR1_I2S0_CLK_ENABLE_BIT | K2_FCR1_I2S0_ENABLE, + KL3_I2S0_CLK18_ENABLE + }, + { KL0_SCC_A_INTF_ENABLE, + K2_FCR1_I2S1_CELL_ENABLE | + K2_FCR1_I2S1_CLK_ENABLE_BIT | K2_FCR1_I2S1_ENABLE, + KL3_I2S1_CLK18_ENABLE + }, + { KL0_SCC_B_INTF_ENABLE, + SH_FCR1_I2S2_CELL_ENABLE | + SH_FCR1_I2S2_CLK_ENABLE_BIT | SH_FCR1_I2S2_ENABLE, + SH_FCR3_I2S2_CLK18_ENABLE + }, + }; - if (value == 0) - 
return 0; /* don't disable yet */ + if (macio->type != macio_keylargo2 /* && macio->type != macio_shasta*/) + return -ENODEV; + if (strncmp(node->name, "i2s-", 4)) + return -ENODEV; + cell = node->name[4] - 'a'; + switch(cell) { + case 0: + case 1: + break; +#if 0 + case 2: + if (macio->type == macio_shasta) + break; +#endif + default: + return -ENODEV; + } LOCK(flags); - MACIO_BIS(KEYLARGO_FCR3, KL3_CLK45_ENABLE | KL3_CLK49_ENABLE | - KL3_I2S0_CLK18_ENABLE); - udelay(10); - MACIO_BIS(KEYLARGO_FCR1, K2_FCR1_I2S0_CELL_ENABLE | - K2_FCR1_I2S0_CLK_ENABLE_BIT | K2_FCR1_I2S0_ENABLE); + if (value) { + MACIO_BIC(KEYLARGO_FCR0, fcrs[cell][0]); + MACIO_BIS(KEYLARGO_FCR1, fcrs[cell][1]); + MACIO_BIS(KEYLARGO_FCR3, fcrs[cell][2]); + } else { + MACIO_BIC(KEYLARGO_FCR3, fcrs[cell][2]); + MACIO_BIC(KEYLARGO_FCR1, fcrs[cell][1]); + MACIO_BIS(KEYLARGO_FCR0, fcrs[cell][0]); + } udelay(10); - MACIO_BIC(KEYLARGO_FCR1, K2_FCR1_I2S0_RESET); UNLOCK(flags); - udelay(10); return 0; } @@ -2960,26 +2995,6 @@ pmac_feature_init(void) set_initial_features(); } -int __init pmac_feature_late_init(void) -{ -#if 0 - struct device_node *np; - - /* Request some resources late */ - if (uninorth_node) - request_OF_resource(uninorth_node, 0, NULL); - np = find_devices("hammerhead"); - if (np) - request_OF_resource(np, 0, NULL); - np = find_devices("interrupt-controller"); - if (np) - request_OF_resource(np, 0, NULL); -#endif - return 0; -} - -device_initcall(pmac_feature_late_init); - #if 0 static void dump_HT_speeds(char *name, u32 cfg, u32 frq) { Index: linux-work/arch/powerpc/platforms/powermac/pic.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/pic.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/pic.c 2005-12-13 17:51:20.000000000 +1100 @@ -5,8 +5,8 @@ * in a separate file * * Copyright (C) 1997 Paul Mackerras (paulus at samba.org) - * - * Maintained by Benjamin Herrenschmidt (benh at 
kernel.crashing.org) + * Copyright (C) 2005 Benjamin Herrenschmidt (benh at kernel.crashing.org) + * IBM, Corp. * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -54,12 +54,7 @@ struct pmac_irq_hw { }; /* Default addresses */ -static volatile struct pmac_irq_hw *pmac_irq_hw[4] = { - (struct pmac_irq_hw *) 0xf3000020, - (struct pmac_irq_hw *) 0xf3000010, - (struct pmac_irq_hw *) 0xf4000020, - (struct pmac_irq_hw *) 0xf4000010, -}; +static volatile struct pmac_irq_hw __iomem *pmac_irq_hw[4]; #define GC_LEVEL_MASK 0x3ff00000 #define OHARE_LEVEL_MASK 0x1ff00000 @@ -82,8 +77,7 @@ static unsigned long ppc_lost_interrupts * since it can lose interrupts (see pmac_set_irq_mask). * -- Cort */ -void -__set_lost(unsigned long irq_nr, int nokick) +void __set_lost(unsigned long irq_nr, int nokick) { if (!test_and_set_bit(irq_nr, ppc_lost_interrupts)) { atomic_inc(&ppc_n_lost_interrupts); @@ -92,8 +86,7 @@ __set_lost(unsigned long irq_nr, int nok } } -static void -pmac_mask_and_ack_irq(unsigned int irq_nr) +static void pmac_mask_and_ack_irq(unsigned int irq_nr) { unsigned long bit = 1UL << (irq_nr & 0x1f); int i = irq_nr >> 5; @@ -224,8 +217,7 @@ static irqreturn_t gatwick_action(int cp return IRQ_NONE; } -int -pmac_get_irq(struct pt_regs *regs) +static int pmac_get_irq(struct pt_regs *regs) { int irq; unsigned long bits = 0; @@ -256,34 +248,40 @@ pmac_get_irq(struct pt_regs *regs) /* This routine will fix some missing interrupt values in the device tree * on the gatwick mac-io controller used by some PowerBooks + * + * Walking of OF nodes could use a bit more fixing up here, but it's not + * very important as this is all boot time code on static portions of the + * device-tree. + * + * However, the modifications done to "intrs" will have to be removed and + * replaced with proper updates of the "interrupts" properties or + * AAPL,interrupts, yet to be decided, once the dynamic parsing is there. 
*/ -static void __init -pmac_fix_gatwick_interrupts(struct device_node *gw, int irq_base) +static void __init pmac_fix_gatwick_interrupts(struct device_node *gw, + int irq_base) { struct device_node *node; int count; memset(gatwick_int_pool, 0, sizeof(gatwick_int_pool)); - node = gw->child; count = 0; - while(node) - { + for (node = NULL; (node = of_get_next_child(gw, node)) != NULL;) { /* Fix SCC */ - if (strcasecmp(node->name, "escc") == 0) - if (node->child) { - if (node->child->n_intrs < 3) { - node->child->intrs = &gatwick_int_pool[count]; - count += 3; - } - node->child->n_intrs = 3; - node->child->intrs[0].line = 15+irq_base; - node->child->intrs[1].line = 4+irq_base; - node->child->intrs[2].line = 5+irq_base; - printk(KERN_INFO "irq: fixed SCC on second controller (%d,%d,%d)\n", - node->child->intrs[0].line, - node->child->intrs[1].line, - node->child->intrs[2].line); + if ((strcasecmp(node->name, "escc") == 0) && node->child) { + if (node->child->n_intrs < 3) { + node->child->intrs = &gatwick_int_pool[count]; + count += 3; } + node->child->n_intrs = 3; + node->child->intrs[0].line = 15+irq_base; + node->child->intrs[1].line = 4+irq_base; + node->child->intrs[2].line = 5+irq_base; + printk(KERN_INFO "irq: fixed SCC on gatwick" + " (%d,%d,%d)\n", + node->child->intrs[0].line, + node->child->intrs[1].line, + node->child->intrs[2].line); + } /* Fix media-bay & left SWIM */ if (strcasecmp(node->name, "media-bay") == 0) { struct device_node* ya_node; @@ -292,12 +290,11 @@ pmac_fix_gatwick_interrupts(struct devic node->intrs = &gatwick_int_pool[count++]; node->n_intrs = 1; node->intrs[0].line = 29+irq_base; - printk(KERN_INFO "irq: fixed media-bay on second controller (%d)\n", - node->intrs[0].line); + printk(KERN_INFO "irq: fixed media-bay on gatwick" + " (%d)\n", node->intrs[0].line); ya_node = node->child; - while(ya_node) - { + while(ya_node) { if (strcasecmp(ya_node->name, "floppy") == 0) { if (ya_node->n_intrs < 2) { ya_node->intrs = 
&gatwick_int_pool[count]; @@ -323,7 +320,6 @@ pmac_fix_gatwick_interrupts(struct devic ya_node = ya_node->sibling; } } - node = node->sibling; } if (count > 10) { printk("WARNING !! Gatwick interrupt pool overflow\n"); @@ -338,45 +334,41 @@ pmac_fix_gatwick_interrupts(struct devic * controller. If we find this second ohare, set it up and fix the * interrupt value in the device tree for the ethernet chip. */ -static int __init enable_second_ohare(void) +static void __init enable_second_ohare(struct device_node *np) { unsigned char bus, devfn; unsigned short cmd; - unsigned long addr; - struct device_node *irqctrler = find_devices("pci106b,7"); struct device_node *ether; - if (irqctrler == NULL || irqctrler->n_addrs <= 0) - return -1; - addr = (unsigned long) ioremap(irqctrler->addrs[0].address, 0x40); - pmac_irq_hw[1] = (volatile struct pmac_irq_hw *)(addr + 0x20); - max_irqs = 64; - if (pci_device_from_OF_node(irqctrler, &bus, &devfn) == 0) { - struct pci_controller* hose = pci_find_hose_for_OF_device(irqctrler); - if (!hose) - printk(KERN_ERR "Can't find PCI hose for OHare2 !\n"); - else { - early_read_config_word(hose, bus, devfn, PCI_COMMAND, &cmd); - cmd |= PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER; - cmd &= ~PCI_COMMAND_IO; - early_write_config_word(hose, bus, devfn, PCI_COMMAND, cmd); + /* This code doesn't strictly belong here, it could be part of + * either the PCI initialisation or the feature code. It's kept + * here for historical reasons. + */ + if (pci_device_from_OF_node(np, &bus, &devfn) == 0) { + struct pci_controller* hose = + pci_find_hose_for_OF_device(np); + if (!hose) { + printk(KERN_ERR "Can't find PCI hose for OHare2 !\n"); + return; } + early_read_config_word(hose, bus, devfn, PCI_COMMAND, &cmd); + cmd |= PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER; + cmd &= ~PCI_COMMAND_IO; + early_write_config_word(hose, bus, devfn, PCI_COMMAND, cmd); } /* Fix interrupt for the modem/ethernet combo controller. 
The number - in the device tree (27) is bogus (correct for the ethernet-only - board but not the combo ethernet/modem board). - The real interrupt is 28 on the second controller -> 28+32 = 60. - */ - ether = find_devices("pci1011,14"); + * in the device tree (27) is bogus (correct for the ethernet-only + * board but not the combo ethernet/modem board). + * The real interrupt is 28 on the second controller -> 28+32 = 60. + */ + ether = of_find_node_by_name(NULL, "pci1011,14"); if (ether && ether->n_intrs > 0) { ether->intrs[0].line = 60; printk(KERN_INFO "irq: Fixed ethernet IRQ to %d\n", ether->intrs[0].line); } - - /* Return the interrupt number of the cascade */ - return irqctrler->intrs[0].line; + of_node_put(ether); } #ifdef CONFIG_XMON @@ -394,189 +386,233 @@ static struct irqaction gatwick_cascade_ .mask = CPU_MASK_NONE, .name = "cascade", }; -#endif /* CONFIG_PPC32 */ -static int pmac_u3_cascade(struct pt_regs *regs, void *data) +static void __init pmac_pic_probe_oldstyle(void) { - return mpic_get_one_irq((struct mpic *)data, regs); -} - -void __init pmac_pic_init(void) -{ - struct device_node *irqctrler = NULL; - struct device_node *irqctrler2 = NULL; - struct device_node *np; -#ifdef CONFIG_PPC32 int i; - unsigned long addr; int irq_cascade = -1; -#endif - struct mpic *mpic1, *mpic2; + struct device_node *master = NULL; + struct device_node *slave = NULL; + u8 __iomem *addr; + struct resource r; - /* We first try to detect Apple's new Core99 chipset, since mac-io - * is quite different on those machines and contains an IBM MPIC2. 
- */ - np = find_type_devices("open-pic"); - while (np) { - if (np->parent && !strcmp(np->parent->name, "u3")) - irqctrler2 = np; - else - irqctrler = np; - np = np->next; - } - if (irqctrler != NULL && irqctrler->n_addrs > 0) { - unsigned char senses[128]; - - printk(KERN_INFO "PowerMac using OpenPIC irq controller at 0x%08x\n", - (unsigned int)irqctrler->addrs[0].address); - pmac_call_feature(PMAC_FTR_ENABLE_MPIC, irqctrler, 0, 0); - - prom_get_irq_senses(senses, 0, 128); - mpic1 = mpic_alloc(irqctrler->addrs[0].address, - MPIC_PRIMARY | MPIC_WANTS_RESET, - 0, 0, 128, 252, senses, 128, " OpenPIC "); - BUG_ON(mpic1 == NULL); - mpic_init(mpic1); - - if (irqctrler2 != NULL && irqctrler2->n_intrs > 0 && - irqctrler2->n_addrs > 0) { - printk(KERN_INFO "Slave OpenPIC at 0x%08x hooked on IRQ %d\n", - (u32)irqctrler2->addrs[0].address, - irqctrler2->intrs[0].line); - - pmac_call_feature(PMAC_FTR_ENABLE_MPIC, irqctrler2, 0, 0); - prom_get_irq_senses(senses, 128, 128 + 124); - - /* We don't need to set MPIC_BROKEN_U3 here since we don't have - * hypertransport interrupts routed to it - */ - mpic2 = mpic_alloc(irqctrler2->addrs[0].address, - MPIC_BIG_ENDIAN | MPIC_WANTS_RESET, - 0, 128, 124, 0, senses, 124, - " U3-MPIC "); - BUG_ON(mpic2 == NULL); - mpic_init(mpic2); - mpic_setup_cascade(irqctrler2->intrs[0].line, - pmac_u3_cascade, mpic2); - } -#if defined(CONFIG_XMON) && defined(CONFIG_PPC32) - { - struct device_node* pswitch; - int nmi_irq; - - pswitch = find_devices("programmer-switch"); - if (pswitch && pswitch->n_intrs) { - nmi_irq = pswitch->intrs[0].line; - mpic_irq_set_priority(nmi_irq, 9); - setup_irq(nmi_irq, &xmon_action); - } - } -#endif /* defined(CONFIG_XMON) && defined(CONFIG_PPC32) */ - return; - } - irqctrler = NULL; + /* Set our get_irq function */ + ppc_md.get_irq = pmac_get_irq; -#ifdef CONFIG_PPC32 - /* Get the level/edge settings, assume if it's not - * a Grand Central nor an OHare, then it's an Heathrow - * (or Paddington). 
+ /* + * Find the interrupt controller type & node */ - ppc_md.get_irq = pmac_get_irq; - if (find_devices("gc")) + + if ((master = of_find_node_by_name(NULL, "gc")) != NULL) { + max_irqs = max_real_irqs = 32; level_mask[0] = GC_LEVEL_MASK; - else if (find_devices("ohare")) { + } else if ((master = of_find_node_by_name(NULL, "ohare")) != NULL) { + max_irqs = max_real_irqs = 32; level_mask[0] = OHARE_LEVEL_MASK; + /* We might have a second cascaded ohare */ - level_mask[1] = OHARE_LEVEL_MASK; - } else { + slave = of_find_node_by_name(NULL, "pci106b,7"); + if (slave) { + max_irqs = 64; + level_mask[1] = OHARE_LEVEL_MASK; + enable_second_ohare(slave); + } + } else if ((master = of_find_node_by_name(NULL, "mac-io")) != NULL) { + max_irqs = max_real_irqs = 64; level_mask[0] = HEATHROW_LEVEL_MASK; level_mask[1] = 0; + /* We might have a second cascaded heathrow */ - level_mask[2] = HEATHROW_LEVEL_MASK; - level_mask[3] = 0; - } + slave = of_find_node_by_name(master, "mac-io"); - /* - * G3 powermacs and 1999 G3 PowerBooks have 64 interrupts, - * 1998 G3 Series PowerBooks have 128, - * other powermacs have 32. - * The combo ethernet/modem card for the Powerstar powerbooks - * (2400/3400/3500, ohare based) has a second ohare chip - * effectively making a total of 64. 
- */ - max_irqs = max_real_irqs = 32; - irqctrler = find_devices("mac-io"); - if (irqctrler) - { - max_real_irqs = 64; - if (irqctrler->next) + /* Check ordering of master & slave */ + if (device_is_compatible(master, "gatwick")) { + struct device_node *tmp; + BUG_ON(slave == NULL); + tmp = master; + master = slave; + slave = tmp; + } + + /* We found a slave */ + if (slave) { max_irqs = 128; - else - max_irqs = 64; + level_mask[2] = HEATHROW_LEVEL_MASK; + level_mask[3] = 0; + pmac_fix_gatwick_interrupts(slave, max_real_irqs); + } } + BUG_ON(master == NULL); + + /* Set the handler for the main PIC */ for ( i = 0; i < max_real_irqs ; i++ ) irq_desc[i].handler = &pmac_pic; - /* get addresses of first controller */ - if (irqctrler) { - if (irqctrler->n_addrs > 0) { - addr = (unsigned long) - ioremap(irqctrler->addrs[0].address, 0x40); - for (i = 0; i < 2; ++i) - pmac_irq_hw[i] = (volatile struct pmac_irq_hw*) - (addr + (2 - i) * 0x10); - } + /* Get addresses of first controller if we have a node for it */ + BUG_ON(of_address_to_resource(master, 0, &r)); - /* get addresses of second controller */ - irqctrler = irqctrler->next; - if (irqctrler && irqctrler->n_addrs > 0) { - addr = (unsigned long) - ioremap(irqctrler->addrs[0].address, 0x40); - for (i = 2; i < 4; ++i) - pmac_irq_hw[i] = (volatile struct pmac_irq_hw*) - (addr + (4 - i) * 0x10); - irq_cascade = irqctrler->intrs[0].line; - if (device_is_compatible(irqctrler, "gatwick")) - pmac_fix_gatwick_interrupts(irqctrler, max_real_irqs); - } - } else { - /* older powermacs have a GC (grand central) or ohare at - f3000000, with interrupt control registers at f3000020. 
*/ - addr = (unsigned long) ioremap(0xf3000000, 0x40); - pmac_irq_hw[0] = (volatile struct pmac_irq_hw *) (addr + 0x20); + /* Map interrupts of primary controller */ + addr = (u8 __iomem *) ioremap(r.start, 0x40); + i = 0; + pmac_irq_hw[i++] = (volatile struct pmac_irq_hw __iomem *) + (addr + 0x20); + if (max_real_irqs > 32) + pmac_irq_hw[i++] = (volatile struct pmac_irq_hw __iomem *) + (addr + 0x10); + of_node_put(master); + + printk(KERN_INFO "irq: Found primary Apple PIC %s for %d irqs\n", + master->full_name, max_real_irqs); + + /* Map interrupts of cascaded controller */ + if (slave && !of_address_to_resource(slave, 0, &r)) { + addr = (u8 __iomem *)ioremap(r.start, 0x40); + pmac_irq_hw[i++] = (volatile struct pmac_irq_hw __iomem *) + (addr + 0x20); + if (max_irqs > 64) + pmac_irq_hw[i++] = + (volatile struct pmac_irq_hw __iomem *) + (addr + 0x10); + irq_cascade = slave->intrs[0].line; + + printk(KERN_INFO "irq: Found slave Apple PIC %s for %d irqs" + " cascade: %d\n", slave->full_name, + max_irqs - max_real_irqs, irq_cascade); } - - /* PowerBooks 3400 and 3500 can have a second controller in a second - ohare chip, on the combo ethernet/modem card */ - if (machine_is_compatible("AAPL,3400/2400") - || machine_is_compatible("AAPL,3500")) - irq_cascade = enable_second_ohare(); + of_node_put(slave); /* disable all interrupts in all controllers */ for (i = 0; i * 32 < max_irqs; ++i) out_le32(&pmac_irq_hw[i]->enable, 0); + /* mark level interrupts */ for (i = 0; i < max_irqs; i++) if (level_mask[i >> 5] & (1UL << (i & 0x1f))) irq_desc[i].status = IRQ_LEVEL; - /* get interrupt line of secondary interrupt controller */ - if (irq_cascade >= 0) { - printk(KERN_INFO "irq: secondary controller on irq %d\n", - (int)irq_cascade); + /* Setup handlers for secondary controller and hook cascade irq*/ + if (slave) { for ( i = max_real_irqs ; i < max_irqs ; i++ ) irq_desc[i].handler = &gatwick_pic; setup_irq(irq_cascade, &gatwick_cascade_action); } - printk("System has %d possible 
interrupts\n", max_irqs); - if (max_irqs != max_real_irqs) - printk(KERN_DEBUG "%d interrupts on main controller\n", - max_real_irqs); - + printk(KERN_INFO "irq: System has %d possible interrupts\n", max_irqs); #ifdef CONFIG_XMON setup_irq(20, &xmon_action); -#endif /* CONFIG_XMON */ -#endif /* CONFIG_PPC32 */ +#endif +} +#endif /* CONFIG_PPC32 */ + +static int pmac_u3_cascade(struct pt_regs *regs, void *data) +{ + return mpic_get_one_irq((struct mpic *)data, regs); +} + +static void __init pmac_pic_setup_mpic_nmi(struct mpic *mpic) +{ +#if defined(CONFIG_XMON) && defined(CONFIG_PPC32) + struct device_node* pswitch; + int nmi_irq; + + pswitch = of_find_node_by_name(NULL, "programmer-switch"); + if (pswitch && pswitch->n_intrs) { + nmi_irq = pswitch->intrs[0].line; + mpic_irq_set_priority(nmi_irq, 9); + setup_irq(nmi_irq, &xmon_action); + } + of_node_put(pswitch); +#endif /* defined(CONFIG_XMON) && defined(CONFIG_PPC32) */ +} + +static int __init pmac_pic_probe_mpic(void) +{ + struct mpic *mpic1, *mpic2; + struct device_node *np, *master = NULL, *slave = NULL; + unsigned char senses[128]; + struct resource r; + + /* We can have up to 2 MPICs cascaded */ + for (np = NULL; (np = of_find_node_by_type(np, "open-pic")) + != NULL;) { + if (master == NULL && + get_property(np, "interrupt-parent", NULL) != NULL) + master = of_node_get(np); + else if (slave == NULL) + slave = of_node_get(np); + if (master && slave) + break; + } + + /* Check for bogus setups */ + if (master == NULL && slave != NULL) { + master = slave; + slave = NULL; + } + + /* Not found, default to good old pmac pic */ + if (master == NULL) + return -ENODEV; + + /* Set master handler */ + ppc_md.get_irq = mpic_get_irq; + + /* Setup master */ + BUG_ON(of_address_to_resource(master, 0, &r)); + pmac_call_feature(PMAC_FTR_ENABLE_MPIC, master, 0, 0); + prom_get_irq_senses(senses, 0, 128); + mpic1 = mpic_alloc(r.start, MPIC_PRIMARY | MPIC_WANTS_RESET, + 0, 0, 128, 252, senses, 128, " OpenPIC "); + BUG_ON(mpic1 == 
NULL); + mpic_init(mpic1); + + /* Install NMI if any */ + pmac_pic_setup_mpic_nmi(mpic1); + + of_node_put(master); + + /* No slave, let's go out */ + if (slave == NULL || slave->n_intrs < 1) + return 0; + + /* Setup slave, failures are non-fatal */ + if (of_address_to_resource(slave, 0, &r)) { + printk(KERN_ERR "Can't get address of MPIC %s\n", + slave->full_name); + return 0; + } + pmac_call_feature(PMAC_FTR_ENABLE_MPIC, slave, 0, 0); + prom_get_irq_senses(senses, 128, 128 + 124); + + /* We don't need to set MPIC_BROKEN_U3 here since we don't have + * hypertransport interrupts routed to it, at least not on currently + * supported machines, that may change. + */ + mpic2 = mpic_alloc(r.start, MPIC_BIG_ENDIAN | MPIC_WANTS_RESET, + 0, 128, 124, 0, senses, 124, " U3-MPIC "); + if (mpic2 == NULL) { + printk(KERN_ERR "Can't create slave MPIC %s\n", + slave->full_name); + return 0; + } + mpic_init(mpic2); + mpic_setup_cascade(slave->intrs[0].line, pmac_u3_cascade, mpic2); + + of_node_put(slave); + return 0; +} + + +void __init pmac_pic_init(void) +{ + /* We first try to detect Apple's new Core99 chipset, since mac-io + * is quite different on those machines and contains an IBM MPIC2. 
+ */ + if (pmac_pic_probe_mpic() == 0) + return; + +#ifdef CONFIG_PPC32 + pmac_pic_probe_oldstyle(); +#endif } #if defined(CONFIG_PM) && defined(CONFIG_PPC32) Index: linux-work/drivers/macintosh/via-cuda.c =================================================================== --- linux-work.orig/drivers/macintosh/via-cuda.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/macintosh/via-cuda.c 2005-12-13 17:51:20.000000000 +1100 @@ -193,10 +193,6 @@ static int __init via_cuda_start(void) if (via == NULL) return -ENODEV; -#ifdef CONFIG_PPC - request_OF_resource(vias, 0, NULL); -#endif - if (request_irq(CUDA_IRQ, cuda_interrupt, 0, "ADB", cuda_interrupt)) { printk(KERN_ERR "cuda_init: can't get irq %d\n", CUDA_IRQ); return -EAGAIN; Index: linux-work/drivers/macintosh/via-pmu.c =================================================================== --- linux-work.orig/drivers/macintosh/via-pmu.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/macintosh/via-pmu.c 2005-12-13 17:51:20.000000000 +1100 @@ -298,7 +298,7 @@ static struct backlight_controller pmu_b int __init find_via_pmu(void) { - phys_addr_t taddr; + u64 taddr; u32 *reg; if (via != 0) @@ -337,7 +337,7 @@ int __init find_via_pmu(void) else if (device_is_compatible(vias->parent, "Keylargo") || device_is_compatible(vias->parent, "K2-Keylargo")) { struct device_node *gpiop; - phys_addr_t gaddr = 0; + u64 gaddr = OF_BAD_ADDR; pmu_kind = PMU_KEYLARGO_BASED; pmu_has_adb = (find_type_devices("adb") != NULL); @@ -352,7 +352,7 @@ int __init find_via_pmu(void) reg = (u32 *)get_property(gpiop, "reg", NULL); if (reg) gaddr = of_translate_address(gpiop, reg); - if (gaddr != 0) + if (gaddr != OF_BAD_ADDR) gpio_reg = ioremap(gaddr, 0x10); } if (gpio_reg == NULL) @@ -479,9 +479,6 @@ static int __init via_pmu_dev_init(void) if (vias == NULL) return -ENODEV; -#ifndef CONFIG_PPC64 - request_OF_resource(vias, 0, NULL); -#endif #ifdef CONFIG_PMAC_BACKLIGHT /* Enable backlight */ 
register_backlight_controller(&pmu_backlight_controller, NULL, "pmu"); Index: linux-work/arch/powerpc/platforms/powermac/nvram.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/nvram.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/nvram.c 2005-12-13 17:51:20.000000000 +1100 @@ -514,7 +514,7 @@ static void core99_nvram_sync(void) #endif } -static int __init core99_nvram_setup(struct device_node *dp) +static int __init core99_nvram_setup(struct device_node *dp, unsigned long addr) { int i; u32 gen_bank0, gen_bank1; @@ -528,7 +528,7 @@ static int __init core99_nvram_setup(str printk(KERN_ERR "nvram: can't allocate ram image\n"); return -ENOMEM; } - nvram_data = ioremap(dp->addrs[0].address, NVRAM_SIZE*2); + nvram_data = ioremap(addr, NVRAM_SIZE*2); nvram_naddrs = 1; /* Make sure we get the correct case */ DBG("nvram: Checking bank 0...\n"); @@ -570,34 +570,48 @@ static int __init core99_nvram_setup(str int __init pmac_nvram_init(void) { struct device_node *dp; + struct resource r1, r2; + unsigned int s1 = 0, s2 = 0; int err = 0; nvram_naddrs = 0; - dp = find_devices("nvram"); + dp = of_find_node_by_name(NULL, "nvram"); if (dp == NULL) { printk(KERN_ERR "Can't find NVRAM device\n"); return -ENODEV; } - nvram_naddrs = dp->n_addrs; + + /* Try to obtain an address */ + if (of_address_to_resource(dp, 0, &r1) == 0) { + nvram_naddrs = 1; + s1 = (r1.end - r1.start) + 1; + if (of_address_to_resource(dp, 1, &r2) == 0) { + nvram_naddrs = 2; + s2 = (r2.end - r2.start) + 1; + } + } + is_core_99 = device_is_compatible(dp, "nvram,flash"); - if (is_core_99) - err = core99_nvram_setup(dp); + if (is_core_99) { + err = core99_nvram_setup(dp, r1.start); + goto bail; + } + #ifdef CONFIG_PPC32 - else if (_machine == _MACH_chrp && nvram_naddrs == 1) { - nvram_data = ioremap(dp->addrs[0].address + isa_mem_base, - dp->addrs[0].size); + if (_machine == _MACH_chrp && nvram_naddrs == 1) { 
+ nvram_data = ioremap(r1.start, s1); nvram_mult = 1; ppc_md.nvram_read_val = direct_nvram_read_byte; ppc_md.nvram_write_val = direct_nvram_write_byte; } else if (nvram_naddrs == 1) { - nvram_data = ioremap(dp->addrs[0].address, dp->addrs[0].size); - nvram_mult = (dp->addrs[0].size + NVRAM_SIZE - 1) / NVRAM_SIZE; + nvram_data = ioremap(r1.start, s1); + nvram_mult = (s1 + NVRAM_SIZE - 1) / NVRAM_SIZE; ppc_md.nvram_read_val = direct_nvram_read_byte; ppc_md.nvram_write_val = direct_nvram_write_byte; } else if (nvram_naddrs == 2) { - nvram_addr = ioremap(dp->addrs[0].address, dp->addrs[0].size); - nvram_data = ioremap(dp->addrs[1].address, dp->addrs[1].size); + nvram_addr = ioremap(r1.start, s1); + nvram_data = ioremap(r2.start, s2); ppc_md.nvram_read_val = indirect_nvram_read_byte; ppc_md.nvram_write_val = indirect_nvram_write_byte; } else if (nvram_naddrs == 0 && sys_ctrler == SYS_CTRLER_PMU) { @@ -606,13 +620,15 @@ int __init pmac_nvram_init(void) ppc_md.nvram_read_val = pmu_nvram_read_byte; ppc_md.nvram_write_val = pmu_nvram_write_byte; #endif /* CONFIG_ADB_PMU */ - } -#endif - else { + } else { printk(KERN_ERR "Incompatible type of NVRAM\n"); - return -ENXIO; + err = -ENXIO; } - lookup_partitions(); +#endif /* CONFIG_PPC32 */ +bail: + of_node_put(dp); + if (err == 0) + lookup_partitions(); return err; } Index: linux-work/arch/powerpc/platforms/powermac/pci.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/pci.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/pci.c 2005-12-13 17:51:20.000000000 +1100 @@ -285,15 +285,13 @@ static struct pci_ops chaos_pci_ops = }; static void __init setup_chaos(struct pci_controller *hose, - struct reg_property *addr) + struct resource *addr) { /* assume a `chaos' bridge */ hose->ops = &chaos_pci_ops; - hose->cfg_addr = ioremap(addr->address + 0x800000, 0x1000); - hose->cfg_data = ioremap(addr->address + 0xc00000, 0x1000); + 
hose->cfg_addr = ioremap(addr->start + 0x800000, 0x1000); + hose->cfg_data = ioremap(addr->start + 0xc00000, 0x1000); } -#else -#define setup_chaos(hose, addr) #endif /* CONFIG_PPC32 */ #ifdef CONFIG_PPC64 @@ -356,9 +354,11 @@ static unsigned long u3_ht_cfg_access(st /* For now, we don't self probe U3 HT bridge */ if (PCI_SLOT(devfn) == 0) return 0; - return ((unsigned long)hose->cfg_data) + U3_HT_CFA0(devfn, offset); + return ((unsigned long)hose->cfg_data) + + U3_HT_CFA0(devfn, offset); } else - return ((unsigned long)hose->cfg_data) + U3_HT_CFA1(bus, devfn, offset); + return ((unsigned long)hose->cfg_data) + + U3_HT_CFA1(bus, devfn, offset); } static int u3_ht_read_config(struct pci_bus *bus, unsigned int devfn, @@ -532,7 +532,8 @@ static void __init init_p2pbridge(void) } if (early_read_config_word(hose, bus, devfn, PCI_BRIDGE_CONTROL, &val) < 0) { - printk(KERN_ERR "init_p2pbridge: couldn't read bridge control\n"); + printk(KERN_ERR "init_p2pbridge: couldn't read bridge" + " control\n"); return; } val &= ~PCI_BRIDGE_CTL_MASTER_ABORT; @@ -576,36 +577,38 @@ static void __init fixup_nec_usb2(void) continue; early_read_config_dword(hose, bus, devfn, 0xe4, &data); if (data & 1UL) { - printk("Found NEC PD720100A USB2 chip with disabled EHCI, fixing up...\n"); + printk("Found NEC PD720100A USB2 chip with disabled" + " EHCI, fixing up...\n"); data &= ~1UL; early_write_config_dword(hose, bus, devfn, 0xe4, data); - early_write_config_byte(hose, bus, devfn | 2, PCI_INTERRUPT_LINE, + early_write_config_byte(hose, bus, + devfn | 2, PCI_INTERRUPT_LINE, nec->intrs[0].line); } } } static void __init setup_bandit(struct pci_controller *hose, - struct reg_property *addr) + struct resource *addr) { hose->ops = &macrisc_pci_ops; - hose->cfg_addr = ioremap(addr->address + 0x800000, 0x1000); - hose->cfg_data = ioremap(addr->address + 0xc00000, 0x1000); + hose->cfg_addr = ioremap(addr->start + 0x800000, 0x1000); + hose->cfg_data = ioremap(addr->start + 0xc00000, 0x1000);
init_bandit(hose); } static int __init setup_uninorth(struct pci_controller *hose, - struct reg_property *addr) + struct resource *addr) { pci_assign_all_buses = 1; has_uninorth = 1; hose->ops = &macrisc_pci_ops; - hose->cfg_addr = ioremap(addr->address + 0x800000, 0x1000); - hose->cfg_data = ioremap(addr->address + 0xc00000, 0x1000); + hose->cfg_addr = ioremap(addr->start + 0x800000, 0x1000); + hose->cfg_data = ioremap(addr->start + 0xc00000, 0x1000); /* We "know" that the bridge at f2000000 has the PCI slots. */ - return addr->address == 0xf2000000; + return addr->start == 0xf2000000; } -#endif +#endif /* CONFIG_PPC32 */ #ifdef CONFIG_PPC64 static void __init setup_u3_agp(struct pci_controller* hose) @@ -722,7 +725,7 @@ static void __init setup_u3_ht(struct pc hose->mem_resources[cur-1].end = res->start - 1; } } -#endif +#endif /* CONFIG_PPC64 */ /* * We assume that if we have a G3 powermac, we have one bridge called @@ -733,24 +736,17 @@ static int __init add_bridge(struct devi { int len; struct pci_controller *hose; -#ifdef CONFIG_PPC32 - struct reg_property *addr; -#endif + struct resource rsrc; char *disp_name; int *bus_range; - int primary = 1; + int primary = 1, has_address = 0; DBG("Adding PCI host bridge %s\n", dev->full_name); -#ifdef CONFIG_PPC32 - /* XXX fix this */ - addr = (struct reg_property *) get_property(dev, "reg", &len); - if (addr == NULL || len < sizeof(*addr)) { - printk(KERN_WARNING "Can't use %s: no address\n", - dev->full_name); - return -ENODEV; - } -#endif + /* Fetch host bridge registers address */ + has_address = (of_address_to_resource(dev, 0, &rsrc) == 0); + + /* Get bus range if any */ bus_range = (int *) get_property(dev, "bus-range", &len); if (bus_range == NULL || len < 2 * sizeof(int)) { printk(KERN_WARNING "Can't get bus-range for %s, assume" @@ -770,6 +766,8 @@ static int __init add_bridge(struct devi hose->last_busno = bus_range ?
bus_range[1] : 0xff; disp_name = NULL; + + /* 64 bits only bridges */ #ifdef CONFIG_PPC64 if (device_is_compatible(dev, "u3-agp")) { setup_u3_agp(hose); @@ -782,25 +780,30 @@ static int __init add_bridge(struct devi } printk(KERN_INFO "Found %s PCI host bridge. Firmware bus number: %d->%d\n", disp_name, hose->first_busno, hose->last_busno); -#else +#endif /* CONFIG_PPC64 */ + + /* 32 bits only bridges */ +#ifdef CONFIG_PPC32 if (device_is_compatible(dev, "uni-north")) { - primary = setup_uninorth(hose, addr); + primary = setup_uninorth(hose, &rsrc); disp_name = "UniNorth"; } else if (strcmp(dev->name, "pci") == 0) { /* XXX assume this is a mpc106 (grackle) */ setup_grackle(hose); disp_name = "Grackle (MPC106)"; } else if (strcmp(dev->name, "bandit") == 0) { - setup_bandit(hose, addr); + setup_bandit(hose, &rsrc); disp_name = "Bandit"; } else if (strcmp(dev->name, "chaos") == 0) { - setup_chaos(hose, addr); + setup_chaos(hose, &rsrc); disp_name = "Chaos"; primary = 0; } - printk(KERN_INFO "Found %s PCI host bridge at 0x%08lx. Firmware bus number: %d->%d\n", - disp_name, addr->address, hose->first_busno, hose->last_busno); -#endif + printk(KERN_INFO "Found %s PCI host bridge at 0x%08lx. 
" + "Firmware bus number: %d->%d\n", + disp_name, rsrc.start, hose->first_busno, hose->last_busno); +#endif /* CONFIG_PPC32 */ + DBG(" ->Hose at 0x%p, cfg_addr=0x%p,cfg_data=0x%p\n", hose, hose->cfg_addr, hose->cfg_data); @@ -814,8 +817,7 @@ static int __init add_bridge(struct devi return 0; } -static void __init -pcibios_fixup_OF_interrupts(void) +static void __init pcibios_fixup_OF_interrupts(void) { struct pci_dev* dev = NULL; @@ -835,8 +837,7 @@ pcibios_fixup_OF_interrupts(void) } } -void __init -pmac_pcibios_fixup(void) +void __init pmac_pcibios_fixup(void) { /* Fixup interrupts according to OF tree */ pcibios_fixup_OF_interrupts(); Index: linux-work/arch/powerpc/platforms/powermac/pmac.h =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/pmac.h 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/pmac.h 2005-12-13 17:51:20.000000000 +1100 @@ -42,10 +42,6 @@ extern void pmac_ide_init_hwif_ports(hw_ unsigned long data_port, unsigned long ctrl_port, int *irq); extern int pmac_nvram_init(void); - -extern struct hw_interrupt_type pmac_pic; - -void pmac_pic_init(void); -int pmac_get_irq(struct pt_regs *regs); +extern void pmac_pic_init(void); #endif /* __PMAC_H__ */ Index: linux-work/arch/powerpc/platforms/powermac/setup.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/setup.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/setup.c 2005-12-13 17:51:20.000000000 +1100 @@ -75,7 +75,6 @@ #include #include #include -#include #include #include @@ -751,7 +750,7 @@ struct machdep_calls __initdata pmac_md .init_early = pmac_init_early, .show_cpuinfo = pmac_show_cpuinfo, .init_IRQ = pmac_pic_init, - .get_irq = mpic_get_irq, /* changed later */ + .get_irq = NULL, /* changed later */ .pcibios_fixup = pmac_pcibios_fixup, .restart = pmac_restart, .power_off = pmac_power_off, 
Index: linux-work/arch/powerpc/platforms/powermac/time.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/time.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/time.c 2005-12-13 17:51:20.000000000 +1100 @@ -258,15 +258,20 @@ int __init via_calibrate_decr(void) volatile unsigned char __iomem *via; int count = VIA_TIMER_FREQ_6 / 100; unsigned int dstart, dend; + struct resource rsrc; - vias = find_devices("via-cuda"); + vias = of_find_node_by_name(NULL, "via-cuda"); if (vias == 0) - vias = find_devices("via-pmu"); + vias = of_find_node_by_name(NULL, "via-pmu"); if (vias == 0) - vias = find_devices("via"); - if (vias == 0 || vias->n_addrs == 0) + vias = of_find_node_by_name(NULL, "via"); + if (vias == 0 || of_address_to_resource(vias, 0, &rsrc)) return 0; - via = ioremap(vias->addrs[0].address, vias->addrs[0].size); + via = ioremap(rsrc.start, rsrc.end - rsrc.start + 1); + if (via == NULL) { + printk(KERN_ERR "Failed to map VIA for timer calibration !\n"); + return 0; + } /* set timer 1 for continuous interrupts */ out_8(&via[ACR], (via[ACR] & ~T1MODE) | T1MODE_CONT); Index: linux-work/drivers/macintosh/macio_asic.c =================================================================== --- linux-work.orig/drivers/macintosh/macio_asic.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/macintosh/macio_asic.c 2005-12-13 17:51:20.000000000 +1100 @@ -256,42 +256,42 @@ static int macio_resource_quirks(struct { if (res->flags & IORESOURCE_MEM) { /* Grand Central has too large resource 0 on some machines */ - if (index == 0 && !strcmp(np->name, "gc")) { - np->addrs[0].size = 0x20000; + if (index == 0 && !strcmp(np->name, "gc")) res->end = res->start + 0x1ffff; - } + /* Airport has bogus resource 2 */ if (index >= 2 && !strcmp(np->name, "radio")) return 1; + +#ifndef CONFIG_PPC64 /* DBDMAs may have bogus sizes */ - if ((res->start & 0x0001f000) == 0x00008000) { - 
np->addrs[index].size = 0x100; + if ((res->start & 0x0001f000) == 0x00008000) res->end = res->start + 0xff; - } +#endif /* CONFIG_PPC64 */ + /* ESCC parent eats child resources. We could have added a * level of hierarchy, but I don't really feel the need * for it */ if (!strcmp(np->name, "escc")) return 1; + /* ESCC has bogus resources >= 3 */ if (index >= 3 && !(strcmp(np->name, "ch-a") && strcmp(np->name, "ch-b"))) return 1; + /* Media bay has too many resources, keep only first one */ if (index > 0 && !strcmp(np->name, "media-bay")) return 1; + /* Some older IDE resources have bogus sizes */ if (!(strcmp(np->name, "IDE") && strcmp(np->name, "ATA") && strcmp(np->type, "ide") && strcmp(np->type, "ata"))) { - if (index == 0 && np->addrs[0].size > 0x1000) { - np->addrs[0].size = 0x1000; + if (index == 0 && (res->end - res->start) > 0xfff) res->end = res->start + 0xfff; - } - if (index == 1 && np->addrs[1].size > 0x100) { - np->addrs[1].size = 0x100; + if (index == 1 && (res->end - res->start) > 0xff) res->end = res->start + 0xff; - } } } return 0; @@ -349,7 +349,7 @@ static void macio_setup_resources(struct /* Currently, we consider failure as harmless, this may * change in the future, once I've found all the device * tree bugs in older machines & worked around them -l */ + */ if (insert_resource(parent_res, res)) { printk(KERN_WARNING "Can't request resource " "%d for MacIO device %s\n", Index: linux-work/drivers/macintosh/mediabay.c =================================================================== --- linux-work.orig/drivers/macintosh/mediabay.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/macintosh/mediabay.c 2005-12-13 17:51:20.000000000 +1100 @@ -647,6 +647,7 @@ static int __devinit media_bay_attach(st struct media_bay_info* bay; u32 __iomem *regbase; struct device_node *ofnode; + unsigned long base; int i; ofnode = mdev->ofdev.node; @@ -656,10 +657,11 @@ static int __devinit media_bay_attach(st if (macio_request_resources(mdev, "media-bay")) 
return -EBUSY; /* Media bay registers are located at the beginning of the - * mac-io chip, we get the parent address for now (hrm...) + * mac-io chip, for now, we trick and align down the first + * resource passed in */ - regbase = (u32 __iomem *) - ioremap(ofnode->parent->addrs[0].address, 0x100); + base = macio_resource_start(mdev, 0) & 0xffff0000u; + regbase = (u32 __iomem *)ioremap(base, 0x100); if (regbase == NULL) { macio_release_resources(mdev); return -ENOMEM; Index: linux-work/drivers/ide/ppc/pmac.c =================================================================== --- linux-work.orig/drivers/ide/ppc/pmac.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/ide/ppc/pmac.c 2005-12-13 17:51:20.000000000 +1100 @@ -1271,7 +1271,7 @@ static int pmac_ide_setup_device(pmac_ide_hwif_t *pmif, ide_hwif_t *hwif) { struct device_node *np = pmif->node; - int *bidp, i; + int *bidp; pmif->cable_80 = 0; pmif->broken_dma = pmif->broken_dma_warn = 0; @@ -1430,7 +1430,7 @@ pmac_ide_macio_attach(struct macio_dev * pmif = &pmac_ide[i]; hwif = &ide_hwifs[i]; - if (mdev->ofdev.node->n_addrs == 0) { + if (macio_resource_count(mdev) == 0) { printk(KERN_WARNING "ide%d: no address for %s\n", i, mdev->ofdev.node->full_name); return -ENXIO; Index: linux-work/drivers/scsi/mac53c94.c =================================================================== --- linux-work.orig/drivers/scsi/mac53c94.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/scsi/mac53c94.c 2005-12-13 17:51:20.000000000 +1100 @@ -432,11 +432,12 @@ static int mac53c94_probe(struct macio_d struct Scsi_Host *host; void *dma_cmd_space; unsigned char *clkprop; - int proplen; + int proplen, rc = -ENODEV; if (macio_resource_count(mdev) != 2 || macio_irq_count(mdev) != 2) { - printk(KERN_ERR "mac53c94: expected 2 addrs and intrs (got %d/%d)\n", - node->n_addrs, node->n_intrs); + printk(KERN_ERR "mac53c94: expected 2 addrs and intrs" + " (got %d/%d)\n", + macio_resource_count(mdev), macio_irq_count(mdev)); 
return -ENODEV; } @@ -448,6 +449,7 @@ static int mac53c94_probe(struct macio_d host = scsi_host_alloc(&mac53c94_template, sizeof(struct fsc_state)); if (host == NULL) { printk(KERN_ERR "mac53c94: couldn't register host"); + rc = -ENOMEM; goto out_release; } @@ -486,6 +488,7 @@ static int mac53c94_probe(struct macio_d if (dma_cmd_space == 0) { printk(KERN_ERR "mac53c94: couldn't allocate dma " "command space for %s\n", node->full_name); + rc = -ENOMEM; goto out_free; } state->dma_cmds = (struct dbdma_cmd *)DBDMA_ALIGN(dma_cmd_space); @@ -495,18 +498,21 @@ static int mac53c94_probe(struct macio_d mac53c94_init(state); - if (request_irq(state->intr, do_mac53c94_interrupt, 0, "53C94", state)) { + if (request_irq(state->intr, do_mac53c94_interrupt, 0, "53C94",state)) { printk(KERN_ERR "mac53C94: can't get irq %d for %s\n", state->intr, node->full_name); goto out_free_dma; } - /* XXX FIXME: handle failure */ - scsi_add_host(host, &mdev->ofdev.dev); - scsi_scan_host(host); + rc = scsi_add_host(host, &mdev->ofdev.dev); + if (rc != 0) + goto out_release_irq; + scsi_scan_host(host); return 0; + out_release_irq: + free_irq(state->intr, state); out_free_dma: kfree(state->dma_cmd_space); out_free: @@ -518,7 +524,7 @@ static int mac53c94_probe(struct macio_d out_release: macio_release_resources(mdev); - return -ENODEV; + return rc; } static int mac53c94_remove(struct macio_dev *mdev) Index: linux-work/drivers/scsi/mesh.c =================================================================== --- linux-work.orig/drivers/scsi/mesh.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/scsi/mesh.c 2005-12-13 17:51:20.000000000 +1100 @@ -1869,7 +1869,8 @@ static int mesh_probe(struct macio_dev * if (macio_resource_count(mdev) != 2 || macio_irq_count(mdev) != 2) { printk(KERN_ERR "mesh: expected 2 addrs and 2 intrs" - " (got %d,%d)\n", mesh->n_addrs, mesh->n_intrs); + " (got %d,%d)\n", macio_resource_count(mdev), + macio_irq_count(mdev)); return -ENODEV; } Index: 
linux-work/drivers/video/offb.c =================================================================== --- linux-work.orig/drivers/video/offb.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/video/offb.c 2005-12-13 17:51:20.000000000 +1100 @@ -223,6 +223,7 @@ static int offb_blank(int blank, struct int __init offb_init(void) { struct device_node *dp = NULL, *boot_disp = NULL; + #if defined(CONFIG_BOOTX_TEXT) && defined(CONFIG_PPC32) struct device_node *macos_display = NULL; #endif @@ -234,60 +235,54 @@ int __init offb_init(void) if (boot_infos != 0) { unsigned long addr = (unsigned long) boot_infos->dispDeviceBase; + u32 *addrp; + u64 daddr, dsize; + unsigned int flags; + /* find the device node corresponding to the macos display */ while ((dp = of_find_node_by_type(dp, "display"))) { int i; - /* - * Grrr... It looks like the MacOS ATI driver - * munges the assigned-addresses property (but - * the AAPL,address value is OK). - */ - if (strncmp(dp->name, "ATY,", 4) == 0 - && dp->n_addrs == 1) { - unsigned int *ap = - (unsigned int *) get_property(dp, - "AAPL,address", - NULL); - if (ap != NULL) { - dp->addrs[0].address = *ap; - dp->addrs[0].size = 0x01000000; - } - } /* - * The LTPro on the Lombard powerbook has no addresses - * on the display nodes, they are on their parent. + * Look for an AAPL,address property first. */ - if (dp->n_addrs == 0 - && device_is_compatible(dp, "ATY,264LTPro")) { - int na; - unsigned int *ap = (unsigned int *) - get_property(dp, "AAPL,address", &na); - if (ap != 0) - for (na /= sizeof(unsigned int); - na > 0; --na, ++ap) - if (*ap <= addr - && addr < - *ap + 0x1000000) - goto foundit; + unsigned int na; + unsigned int *ap = + (unsigned int *)get_property(dp, "AAPL,address", + &na); + if (ap != 0) { + for (na /= sizeof(unsigned int); na > 0; + --na, ++ap) + if (*ap <= addr && + addr < *ap + 0x1000000) { + macos_display = dp; + goto foundit; + } } /* * See if the display address is in one of the address * ranges for this display. 
*/ - for (i = 0; i < dp->n_addrs; ++i) { - if (dp->addrs[i].address <= addr - && addr < - dp->addrs[i].address + - dp->addrs[i].size) + i = 0; + for (;;) { + addrp = of_get_address(dp, i++, &dsize, &flags); + if (addrp == NULL) break; + if (!(flags & IORESOURCE_MEM)) + continue; + daddr = of_translate_address(dp, addrp); + if (daddr == OF_BAD_ADDR) + continue; + if (daddr <= addr && addr < (daddr + dsize)) { + macos_display = dp; + goto foundit; + } } - if (i < dp->n_addrs) { - foundit: + foundit: + if (macos_display) { printk(KERN_INFO "MacOS display is %s\n", dp->full_name); - macos_display = dp; break; } } @@ -326,8 +321,10 @@ static void __init offb_init_nodriver(st int *pp, i; unsigned int len; int width = 640, height = 480, depth = 8, pitch; - unsigned int rsize, *up; - unsigned long address = 0; + unsigned int flags, rsize, *up; + u64 address = OF_BAD_ADDR; + u32 *addrp; + u64 asize; if ((pp = (int *) get_property(dp, "depth", &len)) != NULL && len == sizeof(int)) @@ -363,7 +360,7 @@ static void __init offb_init_nodriver(st break; } if (pdev) { - for (i = 0; i < 6 && address == 0; i++) { + for (i = 0; i < 6 && address == OF_BAD_ADDR; i++) { if ((pci_resource_flags(pdev, i) & IORESOURCE_MEM) && (pci_resource_len(pdev, i) >= rsize)) @@ -374,27 +371,33 @@ static void __init offb_init_nodriver(st } #endif /* CONFIG_PCI */ - if (address == 0 && - (up = (unsigned *) get_property(dp, "address", &len)) != NULL && - len == sizeof(unsigned)) - address = (u_long) * up; - if (address == 0) { - for (i = 0; i < dp->n_addrs; ++i) - if (dp->addrs[i].size >= - pitch * height * depth / 8) - break; - if (i >= dp->n_addrs) { + /* This one is dodgy, we may drop it ... 
*/ + if (address == OF_BAD_ADDR && + (up = (unsigned *) get_property(dp, "address", &len)) != NULL && + len == sizeof(unsigned int)) + address = (u64) * up; + + if (address == OF_BAD_ADDR) { + for (i = 0; (addrp = of_get_address(dp, i, &asize, &flags)) + != NULL; i++) { + if (!(flags & IORESOURCE_MEM)) + continue; + if (asize >= pitch * height * depth / 8) + break; + } + if (addrp == NULL) { printk(KERN_ERR "no framebuffer address found for %s\n", dp->full_name); return; } - - address = (u_long) dp->addrs[i].address; - -#ifdef CONFIG_PPC64 - address += ((struct pci_dn *)dp->data)->phb->pci_mem_offset; -#endif + address = of_translate_address(dp, addrp); + if (address == OF_BAD_ADDR) { + printk(KERN_ERR + "can't translate framebuffer address for %s\n", + dp->full_name); + return; + } /* kludge for valkyrie */ if (strcmp(dp->name, "valkyrie") == 0) @@ -459,7 +462,9 @@ static void __init offb_init_fb(const ch par->cmap_type = cmap_unknown; if (depth == 8) { - /* XXX kludge for ati */ + + /* Palette hacks disabled for now */ +#if 0 if (dp && !strncmp(name, "ATY,Rage128", 11)) { unsigned long regbase = dp->addrs[2].address; par->cmap_adr = ioremap(regbase, 0x1FFF); @@ -490,6 +495,7 @@ static void __init offb_init_fb(const ch par->cmap_adr = ioremap(regbase + 0x6000, 0x1000); par->cmap_type = cmap_gxt2000; } +#endif fix->visual = par->cmap_adr ? 
FB_VISUAL_PSEUDOCOLOR : FB_VISUAL_STATIC_PSEUDOCOLOR; } else Index: linux-work/arch/powerpc/kernel/btext.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/btext.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/kernel/btext.c 2005-12-13 17:51:20.000000000 +1100 @@ -211,8 +211,6 @@ int __init btext_find_display(int allow_ struct device_node *np = NULL; int rc = -ENODEV; - printk("trying to initialize btext ...\n"); - name = (char *)get_property(of_chosen, "linux,stdout-path", NULL); if (name != NULL) { np = of_find_node_by_path(name); Index: linux-work/drivers/serial/pmac_zilog.c =================================================================== --- linux-work.orig/drivers/serial/pmac_zilog.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/serial/pmac_zilog.c 2005-12-13 17:51:20.000000000 +1100 @@ -1431,11 +1431,14 @@ static int __init pmz_init_port(struct u char name[1]; } *slots; int len; + struct resource r_ports, r_rxdma, r_txdma; /* * Request & map chip registers */ - uap->port.mapbase = np->addrs[0].address; + if (of_address_to_resource(np, 0, &r_ports)) + return -ENODEV; + uap->port.mapbase = r_ports.start; uap->port.membase = ioremap(uap->port.mapbase, 0x1000); uap->control_reg = uap->port.membase; @@ -1445,16 +1448,20 @@ static int __init pmz_init_port(struct u * Request & map DBDMA registers */ #ifdef HAS_DBDMA - if (np->n_addrs >= 3 && np->n_intrs >= 3) + if (of_address_to_resource(np, 1, &r_txdma) == 0 && + of_address_to_resource(np, 2, &r_rxdma) == 0) uap->flags |= PMACZILOG_FLAG_HAS_DMA; +#else + memset(&r_txdma, 0, sizeof(struct resource)); + memset(&r_rxdma, 0, sizeof(struct resource)); #endif if (ZS_HAS_DMA(uap)) { - uap->tx_dma_regs = ioremap(np->addrs[np->n_addrs - 2].address, 0x1000); + uap->tx_dma_regs = ioremap(r_txdma.start, 0x100); if (uap->tx_dma_regs == NULL) { uap->flags &= ~PMACZILOG_FLAG_HAS_DMA; goto no_dma; } - uap->rx_dma_regs = 
ioremap(np->addrs[np->n_addrs - 1].address, 0x1000); + uap->rx_dma_regs = ioremap(r_rxdma.start, 0x100); if (uap->rx_dma_regs == NULL) { iounmap(uap->tx_dma_regs); uap->tx_dma_regs = NULL; Index: linux-work/arch/powerpc/kernel/pci_64.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/pci_64.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/kernel/pci_64.c 2005-12-13 17:51:20.000000000 +1100 @@ -896,6 +896,25 @@ static void __devinit pci_process_ISA_OF unsigned long phb_io_base_phys, void __iomem * phb_io_base_virt) { + /* Remove these asap */ + + struct pci_address { + u32 a_hi; + u32 a_mid; + u32 a_lo; + }; + + struct isa_address { + u32 a_hi; + u32 a_lo; + }; + + struct isa_range { + struct isa_address isa_addr; + struct pci_address pci_addr; + unsigned int size; + }; + struct isa_range *range; unsigned long pci_addr; unsigned int isa_addr; @@ -1330,8 +1349,9 @@ unsigned int pci_address_to_pio(phys_add list_for_each_entry_safe(hose, tmp, &hose_list, list_node) { if (address >= hose->io_base_phys && address < (hose->io_base_phys + hose->pci_io_size)) - return (unsigned int)hose->io_base_virt + - (address - hose->io_base_phys); + return (unsigned int) + ((unsigned long)hose->io_base_virt + + (address - hose->io_base_phys)); } return (unsigned int)-1; } Index: linux-work/sound/ppc/pmac.c =================================================================== --- linux-work.orig/sound/ppc/pmac.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/sound/ppc/pmac.c 2005-12-13 17:51:20.000000000 +1100 @@ -817,21 +817,17 @@ static int snd_pmac_free(pmac_t *chip) iounmap(chip->playback.dma); if (chip->capture.dma) iounmap(chip->capture.dma); -#ifndef CONFIG_PPC64 + if (chip->node) { int i; - for (i = 0; i < 3; i++) { - if (chip->of_requested & (1 << i)) { - if (chip->is_k2) - release_OF_resource(chip->node->parent, - i); - else - release_OF_resource(chip->node, i); - } + if (chip->requested & (1 << i)) 
+ release_mem_region(chip->rsrc[i].start, + chip->rsrc[i].end - + chip->rsrc[i].start + 1); } } -#endif /* CONFIG_PPC64 */ + if (chip->pdev) pci_dev_put(chip->pdev); kfree(chip); @@ -1005,6 +1001,11 @@ static int __init snd_pmac_detect(pmac_t chip->can_byte_swap = 0; /* FIXME: check this */ chip->control_mask = MASK_IEPC | 0x11;/* disable IEE */ break; + default: + printk(KERN_ERR "snd: Unknown layout ID 0x%x\n", + layout_id); + return -ENODEV; + } } prop = (unsigned int *)get_property(sound, "device-id", NULL); @@ -1186,46 +1187,69 @@ int __init snd_pmac_new(snd_card_t *card } np = chip->node; + chip->requested = 0; if (chip->is_k2) { - if (np->parent->n_addrs < 2 || np->n_intrs < 3) { + static char *rnames[] = { + "Sound Control", "Sound DMA" }; + if (np->n_intrs < 3) { err = -ENODEV; goto __error; } - for (i = 0; i < 2; i++) { -#ifndef CONFIG_PPC64 - static char *name[2] = { "- Control", "- DMA" }; - if (! request_OF_resource(np->parent, i, name[i])) { - snd_printk(KERN_ERR "pmac: can't request resource %d!\n", i); + for (i = 0; i < 2; i ++) { + if (of_address_to_resource(np->parent, i, + &chip->rsrc[i])) { + printk(KERN_ERR "snd: can't translate rsrc " + " %d (%s)\n", i, rnames[i]); + err = -ENODEV; + goto __error; + } + if (request_mem_region(chip->rsrc[i].start, + chip->rsrc[i].end - + chip->rsrc[i].start + 1, + rnames[i]) == NULL) { + printk(KERN_ERR "snd: can't request rsrc " + " %d (%s: 0x%08lx:%08lx)\n", + i, rnames[i], chip->rsrc[i].start, + chip->rsrc[i].end); err = -ENODEV; goto __error; } - chip->of_requested |= (1 << i); -#endif /* CONFIG_PPC64 */ - ctrl_addr = np->parent->addrs[0].address; - txdma_addr = np->parent->addrs[1].address; - rxdma_addr = txdma_addr + 0x100; + chip->requested |= (1 << i); } - + ctrl_addr = chip->rsrc[0].start; + txdma_addr = chip->rsrc[1].start; + rxdma_addr = txdma_addr + 0x100; } else { - if (np->n_addrs < 3 || np->n_intrs < 3) { + static char *rnames[] = { + "Sound Control", "Sound Tx DMA", "Sound Rx DMA" }; + if 
(np->n_intrs < 3) { err = -ENODEV; goto __error; } - - for (i = 0; i < 3; i++) { -#ifndef CONFIG_PPC64 - static char *name[3] = { "- Control", "- Tx DMA", "- Rx DMA" }; - if (! request_OF_resource(np, i, name[i])) { - snd_printk(KERN_ERR "pmac: can't request resource %d!\n", i); + for (i = 0; i < 3; i ++) { + if (of_address_to_resource(np->parent, i, + &chip->rsrc[i])) { + printk(KERN_ERR "snd: can't translate rsrc " + " %d (%s)\n", i, rnames[i]); + err = -ENODEV; + goto __error; + } + if (request_mem_region(chip->rsrc[i].start, + chip->rsrc[i].end - + chip->rsrc[i].start + 1, + rnames[i]) == NULL) { + printk(KERN_ERR "snd: can't request rsrc " + " %d (%s: 0x%08lx:%08lx)\n", + i, rnames[i], chip->rsrc[i].start, + chip->rsrc[i].end); err = -ENODEV; goto __error; } - chip->of_requested |= (1 << i); -#endif /* CONFIG_PPC64 */ - ctrl_addr = np->addrs[0].address; - txdma_addr = np->addrs[1].address; - rxdma_addr = np->addrs[2].address; + chip->requested |= (1 << i); } + ctrl_addr = chip->rsrc[0].start; + txdma_addr = chip->rsrc[1].start; + rxdma_addr = chip->rsrc[2].start; } chip->awacs = ioremap(ctrl_addr, 0x1000); @@ -1277,9 +1301,11 @@ int __init snd_pmac_new(snd_card_t *card } else if (chip->is_pbook_G3) { struct device_node* mio; for (mio = chip->node->parent; mio; mio = mio->parent) { - if (strcmp(mio->name, "mac-io") == 0 - && mio->n_addrs > 0) { - chip->macio_base = ioremap(mio->addrs[0].address, 0x40); + if (strcmp(mio->name, "mac-io") == 0) { + struct resource r; + if (of_address_to_resource(mio, 0, &r) == 0) + chip->macio_base = + ioremap(r.start, 0x40); break; } } Index: linux-work/sound/ppc/pmac.h =================================================================== --- linux-work.orig/sound/ppc/pmac.h 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/sound/ppc/pmac.h 2005-12-13 17:51:20.000000000 +1100 @@ -122,7 +122,8 @@ struct snd_pmac { unsigned int initialized : 1; unsigned int feature_is_set : 1; - unsigned int of_requested; + unsigned int requested; + 
struct resource rsrc[3]; int num_freqs; int *freq_table; Index: linux-work/include/asm-powerpc/keylargo.h =================================================================== --- linux-work.orig/include/asm-powerpc/keylargo.h 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/include/asm-powerpc/keylargo.h 2005-12-13 17:51:20.000000000 +1100 @@ -232,10 +232,12 @@ #define K2_FCR1_I2S0_RESET 0x00000800 #define K2_FCR1_I2S0_CLK_ENABLE_BIT 0x00001000 #define K2_FCR1_I2S0_ENABLE 0x00002000 - #define K2_FCR1_PCI1_CLK_ENABLE 0x00004000 #define K2_FCR1_FW_CLK_ENABLE 0x00008000 #define K2_FCR1_FW_RESET_N 0x00010000 +#define K2_FCR1_I2S1_CELL_ENABLE 0x00020000 +#define K2_FCR1_I2S1_CLK_ENABLE_BIT 0x00080000 +#define K2_FCR1_I2S1_ENABLE 0x00100000 #define K2_FCR1_GMAC_CLK_ENABLE 0x00400000 #define K2_FCR1_GMAC_POWER_DOWN 0x00800000 #define K2_FCR1_GMAC_RESET_N 0x01000000 @@ -246,3 +248,9 @@ #define K2_FCR1_UATA_RESET_N 0x40000000 #define K2_FCR1_UATA_CHOOSE_CLK66 0x80000000 +/* Shasta definitions */ +#define SH_FCR1_I2S2_CELL_ENABLE 0x00000010 +#define SH_FCR1_I2S2_CLK_ENABLE_BIT 0x00000040 +#define SH_FCR1_I2S2_ENABLE 0x00000080 +#define SH_FCR3_I2S2_CLK18_ENABLE 0x00008000 + Index: linux-work/drivers/video/platinumfb.c =================================================================== --- linux-work.orig/drivers/video/platinumfb.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/video/platinumfb.c 2005-12-13 17:51:20.000000000 +1100 @@ -69,6 +69,8 @@ struct fb_info_platinum { unsigned long total_vram; int clktype; int dactype; + + struct resource rsrc_fb, rsrc_reg; }; /* @@ -97,9 +99,6 @@ static int platinum_var_to_par(struct fb * Interface used by the world */ -int platinumfb_init(void); -int platinumfb_setup(char*); - static struct fb_ops platinumfb_ops = { .owner = THIS_MODULE, .fb_check_var = platinumfb_check_var, @@ -485,7 +484,7 @@ static int platinum_var_to_par(struct fb /* * Parse user speficied options (`video=platinumfb:') */ -int __init 
platinumfb_setup(char *options) +static int __init platinumfb_setup(char *options) { char *this_opt; @@ -526,19 +525,15 @@ int __init platinumfb_setup(char *option #define invalidate_cache(addr) #endif -static int __devinit platinumfb_probe(struct of_device* odev, const struct of_device_id *match) +static int __devinit platinumfb_probe(struct of_device* odev, + const struct of_device_id *match) { struct device_node *dp = odev->node; struct fb_info *info; struct fb_info_platinum *pinfo; - unsigned long addr, size; volatile __u8 *fbuffer; - int i, bank0, bank1, bank2, bank3, rc; + int bank0, bank1, bank2, bank3, rc; - if (dp->n_addrs != 2) { - printk(KERN_ERR "expecting 2 address for platinum (got %d)", dp->n_addrs); - return -ENXIO; - } printk(KERN_INFO "platinumfb: Found Apple Platinum video hardware\n"); info = framebuffer_alloc(sizeof(*pinfo), &odev->dev); @@ -546,27 +541,40 @@ static int __devinit platinumfb_probe(st return -ENOMEM; pinfo = info->par; - /* Map in frame buffer and registers */ - for (i = 0; i < dp->n_addrs; ++i) { - addr = dp->addrs[i].address; - size = dp->addrs[i].size; - /* Let's assume we can request either all or nothing */ - if (!request_mem_region(addr, size, "platinumfb")) { - framebuffer_release(info); - return -ENXIO; - } - if (size >= 0x400000) { - /* frame buffer - map only 4MB */ - pinfo->frame_buffer_phys = addr; - pinfo->frame_buffer = __ioremap(addr, 0x400000, _PAGE_WRITETHRU); - pinfo->base_frame_buffer = pinfo->frame_buffer; - } else { - /* registers */ - pinfo->platinum_regs_phys = addr; - pinfo->platinum_regs = ioremap(addr, size); - } + if (of_address_to_resource(dp, 0, &pinfo->rsrc_reg) || + of_address_to_resource(dp, 1, &pinfo->rsrc_fb)) { + printk(KERN_ERR "platinumfb: Can't get resources\n"); + framebuffer_release(info); + return -ENXIO; + } + if (!request_mem_region(pinfo->rsrc_reg.start, + pinfo->rsrc_reg.start - + pinfo->rsrc_reg.end + 1, + "platinumfb registers")) { + framebuffer_release(info); + return -ENXIO; + } + 
if (!request_mem_region(pinfo->rsrc_fb.start, + pinfo->rsrc_fb.start + - pinfo->rsrc_fb.end + 1, + "platinumfb framebuffer")) { + release_mem_region(pinfo->rsrc_reg.start, + pinfo->rsrc_reg.end - + pinfo->rsrc_reg.start + 1); + framebuffer_release(info); + return -ENXIO; } + /* frame buffer - map only 4MB */ + pinfo->frame_buffer_phys = pinfo->rsrc_fb.start; + pinfo->frame_buffer = __ioremap(pinfo->rsrc_fb.start, 0x400000, + _PAGE_WRITETHRU); + pinfo->base_frame_buffer = pinfo->frame_buffer; + + /* registers */ + pinfo->platinum_regs_phys = pinfo->rsrc_reg.start; + pinfo->platinum_regs = ioremap(pinfo->rsrc_reg.start, 0x1000); + pinfo->cmap_regs_phys = 0xf301b000; /* XXX not in prom? */ request_mem_region(pinfo->cmap_regs_phys, 0x1000, "platinumfb cmap"); pinfo->cmap_regs = ioremap(pinfo->cmap_regs_phys, 0x1000); @@ -628,18 +636,16 @@ static int __devexit platinumfb_remove(s { struct fb_info *info = dev_get_drvdata(&odev->dev); struct fb_info_platinum *pinfo = info->par; - struct device_node *dp = odev->node; - unsigned long addr, size; - int i; unregister_framebuffer (info); /* Unmap frame buffer and registers */ - for (i = 0; i < dp->n_addrs; ++i) { - addr = dp->addrs[i].address; - size = dp->addrs[i].size; - release_mem_region(addr, size); - } + release_mem_region(pinfo->rsrc_fb.start, + pinfo->rsrc_fb.end - + pinfo->rsrc_fb.start + 1); + release_mem_region(pinfo->rsrc_reg.start, + pinfo->rsrc_reg.end - + pinfo->rsrc_reg.start + 1); iounmap(pinfo->frame_buffer); iounmap(pinfo->platinum_regs); release_mem_region(pinfo->cmap_regs_phys, 0x1000); @@ -666,7 +672,7 @@ static struct of_platform_driver platinu .remove = platinumfb_remove, }; -int __init platinumfb_init(void) +static int __init platinumfb_init(void) { #ifndef MODULE char *option = NULL; @@ -680,7 +686,7 @@ int __init platinumfb_init(void) return 0; } -void __exit platinumfb_exit(void) +static void __exit platinumfb_exit(void) { of_unregister_driver(&platinum_driver); } Index: 
linux-work/drivers/video/controlfb.c =================================================================== --- linux-work.orig/drivers/video/controlfb.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/video/controlfb.c 2005-12-13 17:51:20.000000000 +1100 @@ -133,12 +133,6 @@ static int controlfb_mmap(struct fb_info static int controlfb_set_par (struct fb_info *info); static int controlfb_check_var (struct fb_var_screeninfo *var, struct fb_info *info); -/* - * inititialization - */ -int control_init(void); -void control_setup(char *); - /******************** Prototypes for internal functions **********************/ static void set_control_clock(unsigned char *params); @@ -550,9 +544,46 @@ static void control_set_hardware(struct /* - * Called from fbmem.c for probing & initializing + * Parse user speficied options (`video=controlfb:') */ -int __init control_init(void) +static void __init control_setup(char *options) +{ + char *this_opt; + + if (!options || !*options) + return; + + while ((this_opt = strsep(&options, ",")) != NULL) { + if (!strncmp(this_opt, "vmode:", 6)) { + int vmode = simple_strtoul(this_opt+6, NULL, 0); + if (vmode > 0 && vmode <= VMODE_MAX && + control_mac_modes[vmode - 1].m[1] >= 0) + default_vmode = vmode; + } else if (!strncmp(this_opt, "cmode:", 6)) { + int depth = simple_strtoul(this_opt+6, NULL, 0); + switch (depth) { + case CMODE_8: + case CMODE_16: + case CMODE_32: + default_cmode = depth; + break; + case 8: + default_cmode = CMODE_8; + break; + case 15: + case 16: + default_cmode = CMODE_16; + break; + case 24: + case 32: + default_cmode = CMODE_32; + break; + } + } + } +} + +static int __init control_init(void) { struct device_node *dp; char *option = NULL; @@ -651,15 +682,16 @@ static void __init find_vram_size(struct static int __init control_of_init(struct device_node *dp) { struct fb_info_control *p; - unsigned long addr; - int i; + struct resource fb_res, reg_res; if (control_fb) { printk(KERN_ERR "controlfb: only one 
control is supported\n"); return -ENXIO; } - if(dp->n_addrs != 2) { - printk(KERN_ERR "expecting 2 address for control (got %d)", dp->n_addrs); + + if (of_pci_address_to_resource(dp, 2, &fb_res) || + of_pci_address_to_resource(dp, 1, &reg_res)) { + printk(KERN_ERR "can't get 2 addresses for control\n"); return -ENXIO; } p = kmalloc(sizeof(*p), GFP_KERNEL); @@ -669,18 +701,12 @@ static int __init control_of_init(struct memset(p, 0, sizeof(*p)); /* Map in frame buffer and registers */ - for (i = 0; i < dp->n_addrs; ++i) { - addr = dp->addrs[i].address; - if (dp->addrs[i].size >= 0x800000) { - p->fb_orig_base = addr; - p->fb_orig_size = dp->addrs[i].size; - /* use the big-endian aperture (??) */ - p->frame_buffer_phys = addr + 0x800000; - } else { - p->control_regs_phys = addr; - p->control_regs_size = dp->addrs[i].size; - } - } + p->fb_orig_base = fb_res.start; + p->fb_orig_size = fb_res.end - fb_res.start + 1; + /* use the big-endian aperture (??) */ + p->frame_buffer_phys = fb_res.start + 0x800000; + p->control_regs_phys = reg_res.start; + p->control_regs_size = reg_res.end - reg_res.start + 1; if (!p->fb_orig_base || !request_mem_region(p->fb_orig_base,p->fb_orig_size,"controlfb")) { @@ -1059,43 +1085,3 @@ static void control_cleanup(void) } -/* - * Parse user speficied options (`video=controlfb:') - */ -void __init control_setup(char *options) -{ - char *this_opt; - - if (!options || !*options) - return; - - while ((this_opt = strsep(&options, ",")) != NULL) { - if (!strncmp(this_opt, "vmode:", 6)) { - int vmode = simple_strtoul(this_opt+6, NULL, 0); - if (vmode > 0 && vmode <= VMODE_MAX && - control_mac_modes[vmode - 1].m[1] >= 0) - default_vmode = vmode; - } else if (!strncmp(this_opt, "cmode:", 6)) { - int depth = simple_strtoul(this_opt+6, NULL, 0); - switch (depth) { - case CMODE_8: - case CMODE_16: - case CMODE_32: - default_cmode = depth; - break; - case 8: - default_cmode = CMODE_8; - break; - case 15: - case 16: - default_cmode = CMODE_16; - break; - case
24: - case 32: - default_cmode = CMODE_32; - break; - } - } - } -} - Index: linux-work/sound/oss/dmasound/dmasound_awacs.c =================================================================== --- linux-work.orig/sound/oss/dmasound/dmasound_awacs.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/sound/oss/dmasound/dmasound_awacs.c 2005-12-13 17:51:20.000000000 +1100 @@ -125,6 +125,7 @@ static int awacs_rate_index; static int awacs_subframe; static struct device_node* awacs_node; static struct device_node* i2s_node; +static struct resource awacs_rsrc[3]; static char awacs_name[64]; static int awacs_revision; @@ -667,9 +668,12 @@ static void PMacIrqCleanup(void) iounmap(awacs_txdma); iounmap(awacs_rxdma); - release_OF_resource(awacs_node, 0); - release_OF_resource(awacs_node, 1); - release_OF_resource(awacs_node, 2); + release_mem_region(awacs_rsrc[0].start, + awacs_rsrc[0].end - awacs_rsrc[0].start + 1); + release_mem_region(awacs_rsrc[1].start, + awacs_rsrc[1].end - awacs_rsrc[1].start + 1); + release_mem_region(awacs_rsrc[2].start, + awacs_rsrc[2].end - awacs_rsrc[2].start + 1); kfree(awacs_tx_cmd_space); kfree(awacs_rx_cmd_space); @@ -2863,46 +2867,58 @@ printk("dmasound_pmac: couldn't find a C * other info if necessary (early AWACS we want to read chip ids) */ - if (io->n_addrs < 3 || io->n_intrs < 3) { + if (of_get_address(io, 2, NULL, NULL) == NULL || io->n_intrs < 3) { /* OK - maybe we need to use the 'awacs' node (on earlier * machines). 
- */ + */ if (awacs_node) { io = awacs_node ; - if (io->n_addrs < 3 || io->n_intrs < 3) { - printk("dmasound_pmac: can't use %s" - " (%d addrs, %d intrs)\n", - io->full_name, io->n_addrs, io->n_intrs); + if (of_get_address(io, 2, NULL, NULL) == NULL || + io->n_intrs < 3) { + printk("dmasound_pmac: can't use %s\n", + io->full_name); return -ENODEV; } - } else { - printk("dmasound_pmac: can't use %s (%d addrs, %d intrs)\n", - io->full_name, io->n_addrs, io->n_intrs); - } + } else + printk("dmasound_pmac: can't use %s\n", io->full_name); } - if (!request_OF_resource(io, 0, NULL)) { + if (of_address_to_resource(io, 0, &awacs_rsrc[0]) || + request_mem_region(awacs_rsrc[0].start, + awacs_rsrc[0].end - awacs_rsrc[0].start + 1, + " (IO)") == NULL) { printk(KERN_ERR "dmasound: can't request IO resource !\n"); return -ENODEV; } - if (!request_OF_resource(io, 1, " (tx dma)")) { - release_OF_resource(io, 0); - printk(KERN_ERR "dmasound: can't request TX DMA resource !\n"); + if (of_address_to_resource(io, 1, &awacs_rsrc[1]) || + request_mem_region(awacs_rsrc[1].start, + awacs_rsrc[1].end - awacs_rsrc[1].start + 1, + " (tx dma)") == NULL) { + release_mem_region(awacs_rsrc[0].start, + awacs_rsrc[0].end - awacs_rsrc[0].start + 1); + printk(KERN_ERR "dmasound: can't request Tx DMA resource !\n"); return -ENODEV; } - - if (!request_OF_resource(io, 2, " (rx dma)")) { - release_OF_resource(io, 0); - release_OF_resource(io, 1); - printk(KERN_ERR "dmasound: can't request RX DMA resource !\n"); + if (of_address_to_resource(io, 2, &awacs_rsrc[2]) || + request_mem_region(awacs_rsrc[2].start, + awacs_rsrc[2].end - awacs_rsrc[2].start + 1, + " (rx dma)") == NULL) { + release_mem_region(awacs_rsrc[0].start, + awacs_rsrc[0].end - awacs_rsrc[0].start + 1); + release_mem_region(awacs_rsrc[1].start, + awacs_rsrc[1].end - awacs_rsrc[1].start + 1); + printk(KERN_ERR "dmasound: can't request Rx DMA resource !\n"); return -ENODEV; } awacs_beep_dev = input_allocate_device(); if (!awacs_beep_dev) { - 
release_OF_resource(io, 0); - release_OF_resource(io, 1); - release_OF_resource(io, 2); + release_mem_region(awacs_rsrc[0].start, + awacs_rsrc[0].end - awacs_rsrc[0].start + 1); + release_mem_region(awacs_rsrc[1].start, + awacs_rsrc[1].end - awacs_rsrc[1].start + 1); + release_mem_region(awacs_rsrc[2].start, + awacs_rsrc[2].end - awacs_rsrc[2].start + 1); printk(KERN_ERR "dmasound: can't allocate input device !\n"); return -ENOMEM; } @@ -2916,11 +2932,11 @@ printk("dmasound_pmac: couldn't find a C /* all OF versions I've seen use this value */ if (i2s_node) - i2s = ioremap(io->addrs[0].address, 0x1000); + i2s = ioremap(awacs_rsrc[0].start, 0x1000); else - awacs = ioremap(io->addrs[0].address, 0x1000); - awacs_txdma = ioremap(io->addrs[1].address, 0x100); - awacs_rxdma = ioremap(io->addrs[2].address, 0x100); + awacs = ioremap(awacs_rsrc[0].start, 0x1000); + awacs_txdma = ioremap(awacs_rsrc[1].start, 0x100); + awacs_rxdma = ioremap(awacs_rsrc[2].start, 0x100); /* first of all make sure that the chip is powered up....*/ pmac_call_feature(PMAC_FTR_SOUND_CHIP_ENABLE, io, 0, 1); @@ -3083,9 +3099,10 @@ printk("dmasound_pmac: Awacs/Screamer Co struct device_node* mio; macio_base = NULL; for (mio = io->parent; mio; mio = mio->parent) { - if (strcmp(mio->name, "mac-io") == 0 - && mio->n_addrs > 0) { - macio_base = ioremap(mio->addrs[0].address, 0x40); + if (strcmp(mio->name, "mac-io") == 0) { + struct resource r; + if (of_address_to_resource(mio, 0, &r) == 0) + macio_base = ioremap(r.start, 0x40); break; } } Index: linux-work/drivers/video/valkyriefb.c =================================================================== --- linux-work.orig/drivers/video/valkyriefb.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/video/valkyriefb.c 2005-12-13 17:51:20.000000000 +1100 @@ -342,19 +342,19 @@ int __init valkyriefb_init(void) #else /* ppc (!CONFIG_MAC) */ { struct device_node *dp; + struct resource r; - dp = find_devices("valkyrie"); + dp = of_find_node_by_name(NULL, 
"valkyrie"); if (dp == 0) return 0; - if (dp->n_addrs != 1) { - printk(KERN_ERR "expecting 1 address for valkyrie (got %d)\n", - dp->n_addrs); + if (of_address_to_resource(dp, 0, &r)) { + printk(KERN_ERR "can't find address for valkyrie\n"); return 0; } - frame_buffer_phys = dp->addrs[0].address; - cmap_regs_phys = dp->addrs[0].address+0x304000; + frame_buffer_phys = r.start; + cmap_regs_phys = r.start + 0x304000; flags = _PAGE_WRITETHRU; } #endif /* ppc (!CONFIG_MAC) */ Index: linux-work/drivers/block/swim3.c =================================================================== --- linux-work.orig/drivers/block/swim3.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/drivers/block/swim3.c 2005-12-13 17:51:20.000000000 +1100 @@ -1083,23 +1083,33 @@ static int swim3_add_device(struct devic { struct device_node *mediabay; struct floppy_state *fs = &floppy_states[floppy_count]; + struct resource res_reg, res_dma; - if (swim->n_addrs < 2) - { - printk(KERN_INFO "swim3: expecting 2 addrs (n_addrs:%d, n_intrs:%d)\n", - swim->n_addrs, swim->n_intrs); + if (of_address_to_resource(swim, 0, &res_reg) || + of_address_to_resource(swim, 1, &res_dma)) { + printk(KERN_ERR "swim3: Can't get addresses\n"); return -EINVAL; } - - if (swim->n_intrs < 2) - { - printk(KERN_INFO "swim3: expecting 2 intrs (n_addrs:%d, n_intrs:%d)\n", - swim->n_addrs, swim->n_intrs); + if (request_mem_region(res_reg.start, res_reg.end - res_reg.start + 1, + " (reg)") == NULL) { + printk(KERN_ERR "swim3: Can't request register space\n"); + return -EINVAL; + } + if (request_mem_region(res_dma.start, res_dma.end - res_dma.start + 1, + " (dma)") == NULL) { + release_mem_region(res_reg.start, + res_reg.end - res_reg.start + 1); + printk(KERN_ERR "swim3: Can't request DMA space\n"); return -EINVAL; } - if (!request_OF_resource(swim, 0, NULL)) { - printk(KERN_INFO "swim3: can't request IO resource !\n"); + if (swim->n_intrs < 2) { + printk(KERN_INFO "swim3: expecting 2 intrs (n_intrs:%d)\n", + swim->n_intrs); + 
release_mem_region(res_reg.start, + res_reg.end - res_reg.start + 1); + release_mem_region(res_dma.start, + res_dma.end - res_dma.start + 1); return -EINVAL; } @@ -1110,10 +1120,8 @@ static int swim3_add_device(struct devic memset(fs, 0, sizeof(*fs)); spin_lock_init(&fs->lock); fs->state = idle; - fs->swim3 = (struct swim3 __iomem *) - ioremap(swim->addrs[0].address, 0x200); - fs->dma = (struct dbdma_regs __iomem *) - ioremap(swim->addrs[1].address, 0x200); + fs->swim3 = (struct swim3 __iomem *)ioremap(res_reg.start, 0x200); + fs->dma = (struct dbdma_regs __iomem *)ioremap(res_dma.start, 0x200); fs->swim3_intr = swim->intrs[0].line; fs->dma_intr = swim->intrs[1].line; fs->cur_cyl = -1; Index: linux-work/arch/powerpc/kernel/prom_init.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/prom_init.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/kernel/prom_init.c 2005-12-13 17:51:20.000000000 +1100 @@ -558,7 +558,8 @@ unsigned long prom_memparse(const char * static void __init early_cmdline_parse(void) { struct prom_t *_prom = &RELOC(prom); - char *opt, *p; + const char *opt; + char *p; int l = 0; RELOC(prom_cmd_line[0]) = 0; Index: linux-work/arch/powerpc/kernel/rtas_pci.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/rtas_pci.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/kernel/rtas_pci.c 2005-12-13 17:51:20.000000000 +1100 @@ -188,39 +188,19 @@ int is_python(struct device_node *dev) return 0; } -static int get_phb_reg_prop(struct device_node *dev, - unsigned int addr_size_words, - struct reg_property64 *reg) +static void python_countermeasures(struct device_node *dev) { - unsigned int *ui_ptr = NULL, len; - - /* Found a PHB, now figure out where his registers are mapped. 
*/ - ui_ptr = (unsigned int *)get_property(dev, "reg", &len); - if (ui_ptr == NULL) - return 1; - - if (addr_size_words == 1) { - reg->address = ((struct reg_property32 *)ui_ptr)->address; - reg->size = ((struct reg_property32 *)ui_ptr)->size; - } else { - *reg = *((struct reg_property64 *)ui_ptr); - } - - return 0; -} - -static void python_countermeasures(struct device_node *dev, - unsigned int addr_size_words) -{ - struct reg_property64 reg_struct; + struct resource registers; void __iomem *chip_regs; volatile u32 val; - if (get_phb_reg_prop(dev, addr_size_words, &reg_struct)) + if (of_address_to_resource(dev, 0, &registers)) { + printk(KERN_ERR "Can't get address for Python workarounds !\n"); return; + } /* Python's register file is 1 MB in size. */ - chip_regs = ioremap(reg_struct.address & ~(0xfffffUL), 0x100000); + chip_regs = ioremap(registers.start & ~(0xfffffUL), 0x100000); /* * Firmware doesn't always clear this bit which is critical @@ -301,11 +281,10 @@ static int phb_set_bus_ranges(struct dev } static int __devinit setup_phb(struct device_node *dev, - struct pci_controller *phb, - unsigned int addr_size_words) + struct pci_controller *phb) { if (is_python(dev)) - python_countermeasures(dev, addr_size_words); + python_countermeasures(dev); if (phb_set_bus_ranges(dev, phb)) return 1; @@ -320,8 +299,8 @@ unsigned long __init find_and_init_phbs( { struct device_node *node; struct pci_controller *phb; - unsigned int root_size_cells = 0; unsigned int index; + unsigned int root_size_cells = 0; unsigned int *opprop = NULL; struct device_node *root = of_find_node_by_path("/"); @@ -343,10 +322,11 @@ unsigned long __init find_and_init_phbs( phb = pcibios_alloc_controller(node); if (!phb) continue; - setup_phb(node, phb, root_size_cells); + setup_phb(node, phb); pci_process_bridge_OF_ranges(phb, node, 0); pci_setup_phb_io(phb, index == 0); #ifdef CONFIG_PPC_PSERIES + /* XXX This code need serious fixing ... 
--BenH */ if (ppc64_interrupt_controller == IC_OPEN_PIC && pSeries_mpic) { int addr = root_size_cells * (index + 2) - 1; mpic_assign_isu(pSeries_mpic, index, opprop[addr]); @@ -381,22 +361,17 @@ unsigned long __init find_and_init_phbs( struct pci_controller * __devinit init_phb_dynamic(struct device_node *dn) { - struct device_node *root = of_find_node_by_path("/"); - unsigned int root_size_cells = 0; struct pci_controller *phb; int primary; - root_size_cells = prom_n_size_cells(root); - primary = list_empty(&hose_list); phb = pcibios_alloc_controller(dn); if (!phb) return NULL; - setup_phb(dn, phb, root_size_cells); + setup_phb(dn, phb); pci_process_bridge_OF_ranges(phb, dn, primary); pci_setup_phb_io_dynamic(phb, primary); - of_node_put(root); pci_devs_phb_init_dynamic(phb); scan_phb(phb); Index: linux-work/arch/powerpc/mm/numa.c =================================================================== --- linux-work.orig/arch/powerpc/mm/numa.c 2005-12-13 17:49:22.000000000 +1100 +++ linux-work/arch/powerpc/mm/numa.c 2005-12-13 17:51:20.000000000 +1100 @@ -432,7 +432,8 @@ static int __init parse_numa_properties( if (!memcell_buf || len <= 0) continue; - ranges = memory->n_addrs; + /* ranges in cell */ + ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells); new_range: /* these are order-sensitive, and modify the buffer pointer */ start = read_n_cells(n_mem_addr_cells, &memcell_buf); @@ -746,7 +747,8 @@ int hot_add_scn_to_nid(unsigned long scn if (!memcell_buf || len <= 0) continue; - ranges = memory->n_addrs; /* ranges in cell */ + /* ranges in cell */ + ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells); ha_new_range: start = read_n_cells(n_mem_addr_cells, &memcell_buf); size = read_n_cells(n_mem_size_cells, &memcell_buf); From benh at kernel.crashing.org Tue Dec 13 18:04:29 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 18:04:29 +1100 Subject: [PATCH] powerpc: Update MPIC workarounds Message-ID: 
<1134457469.6989.189.camel@gaston> From: Segher Boessenkool Cleanup the MPIC IO-APIC workarounds, make them a bit more generic, smaller and faster. Signed-off-by: Segher Boessenkool Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/powerpc/sysdev/mpic.c =================================================================== --- linux-work.orig/arch/powerpc/sysdev/mpic.c 2005-12-06 16:17:43.000000000 +1100 +++ linux-work/arch/powerpc/sysdev/mpic.c 2005-12-07 13:30:45.000000000 +1100 @@ -175,57 +175,57 @@ static inline int mpic_is_ht_interrupt(s return mpic->fixups[source_no].base != NULL; } + static inline void mpic_apic_end_irq(struct mpic *mpic, unsigned int source_no) { struct mpic_irq_fixup *fixup = &mpic->fixups[source_no]; - u32 tmp; spin_lock(&mpic->fixup_lock); - writeb(0x11 + 2 * fixup->irq, fixup->base); - tmp = readl(fixup->base + 2); - writel(tmp | 0x80000000ul, fixup->base + 2); - /* config writes shouldn't be posted but let's be safe ... */ - (void)readl(fixup->base + 2); + writeb(0x11 + 2 * fixup->irq, fixup->base + 2); + writel(fixup->data, fixup->base + 4); spin_unlock(&mpic->fixup_lock); } -static void __init mpic_amd8111_read_irq(struct mpic *mpic, u8 __iomem *devbase) +static void __init mpic_scan_ioapic(struct mpic *mpic, u8 __iomem *devbase) { - int i, irq; + int i, irq, n; u32 tmp; + u8 pos; - printk(KERN_INFO "mpic: - Workarounds on AMD 8111 @ %p\n", devbase); + for (pos = readb(devbase + 0x34); pos; pos = readb(devbase + pos + 1)) { + u8 id = readb(devbase + pos); - for (i=0; i < 24; i++) { - writeb(0x10 + 2*i, devbase + 0xf2); - tmp = readl(devbase + 0xf4); - if ((tmp & 0x1) || !(tmp & 0x20)) - continue; - irq = (tmp >> 16) & 0xff; - mpic->fixups[irq].irq = i; - mpic->fixups[irq].base = devbase + 0xf2; + if (id == 0x08) { + id = readb(devbase + pos + 3); + if (id == 0x80) + break; + } } -} - -static void __init mpic_amd8131_read_irq(struct mpic *mpic, u8 __iomem *devbase) -{ - int i, irq; - u32 tmp; + if (pos == 0) + return; + + 
printk(KERN_INFO "mpic: - Workarounds @ %p, pos = 0x%02x\n", devbase, pos); - printk(KERN_INFO "mpic: - Workarounds on AMD 8131 @ %p\n", devbase); + devbase += pos; - for (i=0; i < 4; i++) { - writeb(0x10 + 2*i, devbase + 0xba); - tmp = readl(devbase + 0xbc); - if ((tmp & 0x1) || !(tmp & 0x20)) + writeb(0x01, devbase + 2); + n = (readl(devbase + 4) >> 16) & 0xff; + + for (i = 0; i <= n; i++) { + writeb(0x10 + 2 * i, devbase + 2); + tmp = readl(devbase + 4); + if ((tmp & 0x21) != 0x20) continue; irq = (tmp >> 16) & 0xff; mpic->fixups[irq].irq = i; - mpic->fixups[irq].base = devbase + 0xba; + mpic->fixups[irq].base = devbase; + writeb(0x11 + 2 * i, devbase + 2); + mpic->fixups[irq].data = readl(devbase + 4) | 0x80000000; } } + static void __init mpic_scan_ioapics(struct mpic *mpic) { unsigned int devfn; @@ -241,21 +241,19 @@ static void __init mpic_scan_ioapics(str /* Init spinlock */ spin_lock_init(&mpic->fixup_lock); - /* Map u3 config space. We assume all IO-APICs are on the primary bus - * and slot will never be above "0xf" so we only need to map 32k + /* Map U3 config space. We assume all IO-APICs are on the primary bus + * so we only need to map 64kB. */ - cfgspace = (unsigned char __iomem *)ioremap(0xf2000000, 0x8000); + cfgspace = ioremap(0xf2000000, 0x10000); BUG_ON(cfgspace == NULL); /* Now we scan all slots. 
We do a very quick scan, we read the header type, * vendor ID and device ID only, that's plenty enough */ - for (devfn = 0; devfn < PCI_DEVFN(0x10,0); devfn ++) { + for (devfn = 0; devfn < 0x100; devfn++) { u8 __iomem *devbase = cfgspace + (devfn << 8); u8 hdr_type = readb(devbase + PCI_HEADER_TYPE); u32 l = readl(devbase + PCI_VENDOR_ID); - u16 vendor_id, device_id; - int multifunc = 0; DBG("devfn %x, l: %x\n", devfn, l); @@ -264,21 +262,11 @@ static void __init mpic_scan_ioapics(str l == 0x0000ffff || l == 0xffff0000) goto next; - /* Check if it's a multifunction device (only really used - * to function 0 though - */ - multifunc = !!(hdr_type & 0x80); - vendor_id = l & 0xffff; - device_id = (l >> 16) & 0xffff; - - /* If a known device, go to fixup setup code */ - if (vendor_id == PCI_VENDOR_ID_AMD && device_id == 0x7460) - mpic_amd8111_read_irq(mpic, devbase); - if (vendor_id == PCI_VENDOR_ID_AMD && device_id == 0x7450) - mpic_amd8131_read_irq(mpic, devbase); + mpic_scan_ioapic(mpic, devbase); + next: /* next device, if function 0 */ - if ((PCI_FUNC(devfn) == 0) && !multifunc) + if (PCI_FUNC(devfn) == 0 && (hdr_type & 0x80) == 0) devfn += 7; } } Index: linux-work/include/asm-powerpc/mpic.h =================================================================== --- linux-work.orig/include/asm-powerpc/mpic.h 2005-11-24 17:18:48.000000000 +1100 +++ linux-work/include/asm-powerpc/mpic.h 2005-12-07 10:32:51.000000000 +1100 @@ -117,7 +117,8 @@ typedef int (*mpic_cascade_t)(struct pt_ struct mpic_irq_fixup { u8 __iomem *base; - unsigned int irq; + u32 data; + unsigned int irq; }; #endif /* CONFIG_MPIC_BROKEN_U3 */ From benh at kernel.crashing.org Tue Dec 13 18:09:16 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 18:09:16 +1100 Subject: [PATCH] PCI: Export pci_cfg_space_size Message-ID: <1134457757.6989.195.camel@gaston> The powerpc PCI code sets up the PCI tree without doing config space accesses in most cases, from the firmware tree. 
However, it still wants to call pci_cfg_space_size() under some conditions, thus it needs to be made non-static (though I don't see a point to export it to modules). Signed-off-by: Benjamin Herrenschmidt --- Greg: I don't remember if I already sent you that patch or not, if not, then heh, here it is :) The powerpc patch that requires this one will be going next to the powerpc git tree and -mm, with a 2.6.16 target. Let me know if there is any objection. Paul: You need that for the new G5 support patch that I'm sending next. Index: linux-work/drivers/pci/probe.c =================================================================== --- linux-work.orig/drivers/pci/probe.c 2005-11-24 17:18:45.000000000 +1100 +++ linux-work/drivers/pci/probe.c 2005-12-08 10:15:41.000000000 +1100 @@ -678,7 +678,7 @@ static void pci_release_dev(struct devic * reading the dword at 0x100 which must either be 0 or a valid extended * capability header. */ -static int pci_cfg_space_size(struct pci_dev *dev) +int pci_cfg_space_size(struct pci_dev *dev) { int pos; u32 status; Index: linux-work/include/linux/pci.h =================================================================== --- linux-work.orig/include/linux/pci.h 2005-11-30 10:43:15.000000000 +1100 +++ linux-work/include/linux/pci.h 2005-12-08 10:16:19.000000000 +1100 @@ -515,6 +515,7 @@ int pci_scan_bridge(struct pci_bus *bus, void pci_walk_bus(struct pci_bus *top, void (*cb)(struct pci_dev *, void *), void *userdata); +int pci_cfg_space_size(struct pci_dev *dev); /* kmem_cache style wrapper around pci_alloc_consistent() */ From benh at kernel.crashing.org Tue Dec 13 18:12:51 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 18:12:51 +1100 Subject: [PATCH] nvidiafb: Fixes for new G5 Message-ID: <1134457971.6989.198.camel@gaston> This patch is not for inclusion in anything but powerpc git tree and only temporarily as it's being dealt with separately for upstream merge and is already in -mm. 
It fixes an issue in nvidiafb for the G5s. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/drivers/video/nvidia/nv_hw.c =================================================================== --- linux-work.orig/drivers/video/nvidia/nv_hw.c 2005-11-24 17:18:46.000000000 +1100 +++ linux-work/drivers/video/nvidia/nv_hw.c 2005-12-07 19:14:52.000000000 +1100 @@ -848,7 +848,7 @@ void NVCalcStateExt(struct nvidia_par *p int width, int hDisplaySize, int height, int dotClock, int flags) { - int pixelDepth, VClk; + int pixelDepth, VClk = 0; /* * Save mode parameters. */ @@ -938,17 +938,26 @@ void NVLoadStateExt(struct nvidia_par *p if (par->Architecture == NV_ARCH_04) { NV_WR32(par->PFB, 0x0200, state->config); - } else if ((par->Chipset & 0xfff0) == 0x0090) { - for (i = 0; i < 15; i++) { - NV_WR32(par->PFB, 0x0600 + (i * 0x10), 0); - NV_WR32(par->PFB, 0x0604 + (i * 0x10), par->FbMapSize - 1); - } - } else { + } else if ((par->Architecture < NV_ARCH_40) || + (par->Chipset & 0xfff0) == 0x0040) { for (i = 0; i < 8; i++) { NV_WR32(par->PFB, 0x0240 + (i * 0x10), 0); - NV_WR32(par->PFB, 0x0244 + (i * 0x10), par->FbMapSize - 1); + NV_WR32(par->PFB, 0x0244 + (i * 0x10), + par->FbMapSize - 1); } - } + } else { + int regions = 12; + + if (((par->Chipset & 0xfff0) == 0x0090) || + ((par->Chipset & 0xfff0) == 0x01D0) || + ((par->Chipset & 0xfff0) == 0x0290)) + regions = 15; + for(i = 0; i < regions; i++) { + NV_WR32(par->PFB, 0x0600 + (i * 0x10), 0); + NV_WR32(par->PFB, 0x0604 + (i * 0x10), + par->FbMapSize - 1); + } + } if (par->Architecture >= NV_ARCH_40) { NV_WR32(par->PRAMIN, 0x0000 * 4, 0x80000010); @@ -1182,11 +1191,17 @@ void NVLoadStateExt(struct nvidia_par *p NV_WR32(par->PGRAPH, 0x0608, 0xFFFFFFFF); } else { if (par->Architecture >= NV_ARCH_40) { + u32 tmp; + NV_WR32(par->PGRAPH, 0x0084, 0x401287c0); NV_WR32(par->PGRAPH, 0x008C, 0x60de8051); NV_WR32(par->PGRAPH, 0x0090, 0x00008000); NV_WR32(par->PGRAPH, 0x0610, 0x00be3c5f); + tmp = NV_RD32(par->REGS, 0x1540) & 0xff; + 
for(i = 0; tmp && !(tmp & 1); tmp >>= 1, i++); + NV_WR32(par->PGRAPH, 0x5000, i); + if ((par->Chipset & 0xfff0) == 0x0040) { NV_WR32(par->PGRAPH, 0x09b0, 0x83280fff); @@ -1211,6 +1226,7 @@ void NVLoadStateExt(struct nvidia_par *p 0xffff7fff); break; case 0x00C0: + case 0x0120: NV_WR32(par->PGRAPH, 0x0828, 0x007596ff); NV_WR32(par->PGRAPH, 0x082C, @@ -1245,6 +1261,7 @@ void NVLoadStateExt(struct nvidia_par *p 0x00100000); break; case 0x0090: + case 0x0290: NV_WR32(par->PRAMDAC, 0x0608, NV_RD32(par->PRAMDAC, 0x0608) | 0x00100000); @@ -1310,14 +1327,44 @@ void NVLoadStateExt(struct nvidia_par *p } } - if ((par->Chipset & 0xfff0) == 0x0090) { - for (i = 0; i < 60; i++) - NV_WR32(par->PGRAPH, 0x0D00 + i, - NV_RD32(par->PFB, 0x0600 + i)); + if ((par->Architecture < NV_ARCH_40) || + ((par->Chipset & 0xfff0) == 0x0040)) { + for (i = 0; i < 32; i++) { + NV_WR32(par->PGRAPH, 0x0900 + i*4, + NV_RD32(par->PFB, 0x0240 +i*4)); + NV_WR32(par->PGRAPH, 0x6900 + i*4, + NV_RD32(par->PFB, 0x0240 +i*4)); + } } else { - for (i = 0; i < 32; i++) - NV_WR32(par->PGRAPH, 0x0900 + i, - NV_RD32(par->PFB, 0x0240 + i)); + if (((par->Chipset & 0xfff0) == 0x0090) || + ((par->Chipset & 0xfff0) == 0x01D0) || + ((par->Chipset & 0xfff0) == 0x0290)) { + for (i = 0; i < 60; i++) { + NV_WR32(par->PGRAPH, + 0x0D00 + i*4, + NV_RD32(par->PFB, + 0x0600 + i*4)); + NV_WR32(par->PGRAPH, + 0x6900 + i*4, + NV_RD32(par->PFB, + 0x0600 + i*4)); + } + } else { + for (i = 0; i < 48; i++) { + NV_WR32(par->PGRAPH, + 0x0900 + i*4, + NV_RD32(par->PFB, + 0x0600 + i*4)); + if(((par->Chipset & 0xfff0) + != 0x0160) && + ((par->Chipset & 0xfff0) + != 0x0220)) + NV_WR32(par->PGRAPH, + 0x6900 + i*4, + NV_RD32(par->PFB, + 0x0600 + i*4)); + } + } } if (par->Architecture >= NV_ARCH_40) { @@ -1338,7 +1385,9 @@ void NVLoadStateExt(struct nvidia_par *p NV_WR32(par->PGRAPH, 0x0868, par->FbMapSize - 1); } else { - if((par->Chipset & 0xfff0) == 0x0090) { + if ((par->Chipset & 0xfff0) == 0x0090 || + (par->Chipset & 0xfff0) == 0x01D0 || + 
(par->Chipset & 0xfff0) == 0x0290) { NV_WR32(par->PGRAPH, 0x0DF0, NV_RD32(par->PFB, 0x0200)); NV_WR32(par->PGRAPH, 0x0DF4, Index: linux-work/drivers/video/nvidia/nv_setup.c =================================================================== --- linux-work.orig/drivers/video/nvidia/nv_setup.c 2005-11-24 17:18:46.000000000 +1100 +++ linux-work/drivers/video/nvidia/nv_setup.c 2005-12-07 19:09:01.000000000 +1100 @@ -285,7 +285,6 @@ static void nv10GetConfig(struct nvidia_ par->CrystalFreqKHz = 27000; } - par->CursorStart = (par->RamAmountKBytes - 96) * 1024; par->CURSOR = NULL; /* can't set this here */ par->MinVClockFreqKHz = 12000; par->MaxVClockFreqKHz = par->twoStagePLL ? 400000 : 350000; @@ -382,6 +381,8 @@ void NVCommonSetup(struct fb_info *info) case 0x0146: case 0x0147: case 0x0148: + case 0x0098: + case 0x0099: mobile = 1; break; default: Index: linux-work/drivers/video/nvidia/nvidia.c =================================================================== --- linux-work.orig/drivers/video/nvidia/nvidia.c 2005-11-24 17:18:46.000000000 +1100 +++ linux-work/drivers/video/nvidia/nvidia.c 2005-12-07 18:52:36.000000000 +1100 @@ -1485,6 +1485,8 @@ static u32 __devinit nvidia_get_arch(str case 0x0210: case 0x0220: case 0x0230: + case 0x0290: + case 0x0390: arch = NV_ARCH_40; break; case 0x0020: /* TNT, TNT2 */ @@ -1581,10 +1583,15 @@ static int __devinit nvidiafb_probe(stru if (par->FbMapSize > 64 * 1024 * 1024) par->FbMapSize = 64 * 1024 * 1024; - par->FbUsableSize = par->FbMapSize - (128 * 1024); + if(par->Architecture >= NV_ARCH_40) + par->FbUsableSize = par->FbMapSize - (560 * 1024); + else + par->FbUsableSize = par->FbMapSize - (128 * 1024); par->ScratchBufferSize = (par->Architecture < NV_ARCH_10) ? 
8 * 1024 : 16 * 1024; par->ScratchBufferStart = par->FbUsableSize - par->ScratchBufferSize; + par->CursorStart = par->FbUsableSize + (32 * 1024); + info->screen_base = ioremap(nvidiafb_fix.smem_start, par->FbMapSize); info->screen_size = par->FbUsableSize; nvidiafb_fix.smem_len = par->RamAmountKBytes * 1024; From benh at kernel.crashing.org Tue Dec 13 18:17:47 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 13 Dec 2005 18:17:47 +1100 Subject: [PATCH] powerpc: Experimental support for new G5 Macs Message-ID: <1134458268.6989.202.camel@gaston> This adds some very basic support for the new machines, including the Quad G5 (tested), and other new dual-core-based machines and iMac G5 iSight (untested). This is still experimental! If you have more than 2GB of RAM, you MUST boot with mem=2G to limit it as the iommu/DART code for the new bridge chip is not there yet (hopefully tomorrow). There is no thermal control yet, there is no proper handling of MSIs, etc., but it boots, and I have all 4 cores up on my machine.
Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/powerpc/platforms/powermac/feature.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/feature.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/feature.c 2005-12-13 18:13:11.000000000 +1100 @@ -101,7 +101,8 @@ static const char *macio_names[] = "Keylargo", "Pangea", "Intrepid", - "K2" + "K2", + "Shasta", }; @@ -119,7 +120,7 @@ static const char *macio_names[] = static struct device_node *uninorth_node; static u32 __iomem *uninorth_base; static u32 uninorth_rev; -static int uninorth_u3; +static int uninorth_maj; static void __iomem *u3_ht; /* @@ -1399,8 +1400,15 @@ static long g5_fw_enable(struct device_n static long g5_mpic_enable(struct device_node *node, long param, long value) { unsigned long flags; + struct device_node *parent = of_get_parent(node); + int is_u3; - if (node->parent == NULL || strcmp(node->parent->name, "u3")) + if (parent == NULL) + return 0; + is_u3 = strcmp(parent->name, "u3") == 0 || + strcmp(parent->name, "u4") == 0; + of_node_put(parent); + if (!is_u3) return 0; LOCK(flags); @@ -1464,7 +1472,7 @@ static long g5_i2s_enable(struct device_ }, }; - if (macio->type != macio_keylargo2 /* && macio->type != macio_shasta*/) + if (macio->type != macio_keylargo2 && macio->type != macio_shasta) return -ENODEV; if (strncmp(node->name, "i2s-", 4)) return -ENODEV; @@ -1473,11 +1481,9 @@ static long g5_i2s_enable(struct device_ case 0: case 1: break; -#if 0 case 2: if (macio->type == macio_shasta) break; -#endif default: return -ENODEV; } @@ -1508,7 +1514,7 @@ static long g5_reset_cpu(struct device_n struct device_node *np; macio = &macio_chips[0]; - if (macio->type != macio_keylargo2) + if (macio->type != macio_keylargo2 && macio->type != macio_shasta) return -ENODEV; np = find_path_device("/cpus"); @@ -1547,7 +1553,8 @@ static long g5_reset_cpu(struct device_n */ void 
g5_phy_disable_cpu1(void) { - UN_OUT(U3_API_PHY_CONFIG_1, 0); + if (uninorth_maj == 3) + UN_OUT(U3_API_PHY_CONFIG_1, 0); } #endif /* CONFIG_POWER4 */ @@ -2462,6 +2469,14 @@ static struct pmac_mb_def pmac_mb_defs[] PMAC_TYPE_POWERMAC_G5_U3L, g5_features, 0, }, + { "PowerMac11,2", "PowerMac G5 Dual Core", + PMAC_TYPE_POWERMAC_G5_U3L, g5_features, + 0, + }, + { "PowerMac12,1", "iMac G5 (iSight)", + PMAC_TYPE_POWERMAC_G5_U3L, g5_features, + 0, + }, { "RackMac3,1", "XServe G5", PMAC_TYPE_XSERVE_G5, g5_features, 0, @@ -2574,6 +2589,11 @@ static int __init probe_motherboard(void pmac_mb.model_name = "Unknown K2-based"; pmac_mb.features = g5_features; break; + case macio_shasta: + pmac_mb.model_id = PMAC_TYPE_UNKNOWN_SHASTA; + pmac_mb.model_name = "Unknown Shasta-based"; + pmac_mb.features = g5_features; + break; #endif /* CONFIG_POWER4 */ default: return -ENODEV; @@ -2651,7 +2671,12 @@ static void __init probe_uninorth(void) /* Locate G5 u3 */ if (uninorth_node == NULL) { uninorth_node = of_find_node_by_name(NULL, "u3"); - uninorth_u3 = 1; + uninorth_maj = 3; + } + /* Locate G5 u4 */ + if (uninorth_node == NULL) { + uninorth_node = of_find_node_by_name(NULL, "u4"); + uninorth_maj = 4; } if (uninorth_node == NULL) return; @@ -2664,12 +2689,13 @@ static void __init probe_uninorth(void) return; uninorth_base = ioremap(address, 0x40000); uninorth_rev = in_be32(UN_REG(UNI_N_VERSION)); - if (uninorth_u3) + if (uninorth_maj == 3 || uninorth_maj == 4) u3_ht = ioremap(address + U3_HT_CONFIG_BASE, 0x1000); - printk(KERN_INFO "Found %s memory controller & host bridge," - " revision: %d\n", uninorth_u3 ? "U3" : "UniNorth", - uninorth_rev); + printk(KERN_INFO "Found %s memory controller & host bridge" + " @ 0x%08x revision: 0x%02x\n", uninorth_maj == 3 ? "U3" : + uninorth_maj == 4 ? 
"U4" : "UniNorth", + (unsigned int)address, uninorth_rev); printk(KERN_INFO "Mapped at 0x%08lx\n", (unsigned long)uninorth_base); /* Set the arbitrer QAck delay according to what Apple does @@ -2677,7 +2703,8 @@ static void __init probe_uninorth(void) if (uninorth_rev < 0x11) { actrl = UN_IN(UNI_N_ARB_CTRL) & ~UNI_N_ARB_CTRL_QACK_DELAY_MASK; actrl |= ((uninorth_rev < 3) ? UNI_N_ARB_CTRL_QACK_DELAY105 : - UNI_N_ARB_CTRL_QACK_DELAY) << UNI_N_ARB_CTRL_QACK_DELAY_SHIFT; + UNI_N_ARB_CTRL_QACK_DELAY) << + UNI_N_ARB_CTRL_QACK_DELAY_SHIFT; UN_OUT(UNI_N_ARB_CTRL, actrl); } @@ -2685,7 +2712,8 @@ static void __init probe_uninorth(void) * revs 1.5 to 2.O and Pangea. Seem to toggle the UniN Maxbus/PCI * memory timeout */ - if ((uninorth_rev >= 0x11 && uninorth_rev <= 0x24) || uninorth_rev == 0xc0) + if ((uninorth_rev >= 0x11 && uninorth_rev <= 0x24) || + uninorth_rev == 0xc0) UN_OUT(0x2160, UN_IN(0x2160) & 0x00ffffff); } @@ -2736,12 +2764,14 @@ static void __init probe_one_macio(const node->full_name); return; } - if (type == macio_keylargo) { + if (type == macio_keylargo || type == macio_keylargo2) { u32 *did = (u32 *)get_property(node, "device-id", NULL); if (*did == 0x00000025) type = macio_pangea; if (*did == 0x0000003e) type = macio_intrepid; + if (*did == 0x0000004f) + type = macio_shasta; } macio_chips[i].of_node = node; macio_chips[i].type = type; @@ -2840,7 +2870,8 @@ set_initial_features(void) } #ifdef CONFIG_POWER4 - if (macio_chips[0].type == macio_keylargo2) { + if (macio_chips[0].type == macio_keylargo2 || + macio_chips[0].type == macio_shasta) { #ifndef CONFIG_SMP /* On SMP machines running UP, we have the second CPU eating * bus cycles. We need to take it off the bus. 
This is done Index: linux-work/arch/powerpc/platforms/powermac/pic.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/pic.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/pic.c 2005-12-13 18:13:11.000000000 +1100 @@ -524,18 +524,56 @@ static void __init pmac_pic_setup_mpic_n #endif /* defined(CONFIG_XMON) && defined(CONFIG_PPC32) */ } +static struct mpic * __init pmac_setup_one_mpic(struct device_node *np, + int master) +{ + unsigned char senses[128]; + int offset = master ? 0 : 128; + int count = master ? 128 : 124; + const char *name = master ? " MPIC 1 " : " MPIC 2 "; + struct resource r; + struct mpic *mpic; + unsigned int flags = master ? MPIC_PRIMARY : 0; + int rc; + + rc = of_address_to_resource(np, 0, &r); + if (rc) + return NULL; + + pmac_call_feature(PMAC_FTR_ENABLE_MPIC, np, 0, 0); + + prom_get_irq_senses(senses, offset, offset + count); + + flags |= MPIC_WANTS_RESET; + if (get_property(np, "big-endian", NULL)) + flags |= MPIC_BIG_ENDIAN; + + /* Primary Big Endian means HT interrupts. This is quite dodgy + * but works until I find a better way + */ + if (master && (flags & MPIC_BIG_ENDIAN)) + flags |= MPIC_BROKEN_U3; + + mpic = mpic_alloc(r.start, flags, 0, offset, count, master ? 
252 : 0, + senses, count, name); + if (mpic == NULL) + return NULL; + + mpic_init(mpic); + + return mpic; + } + static int __init pmac_pic_probe_mpic(void) { struct mpic *mpic1, *mpic2; struct device_node *np, *master = NULL, *slave = NULL; - unsigned char senses[128]; - struct resource r; /* We can have up to 2 MPICs cascaded */ for (np = NULL; (np = of_find_node_by_type(np, "open-pic")) != NULL;) { if (master == NULL && - get_property(np, "interrupt-parent", NULL) != NULL) + get_property(np, "interrupts", NULL) == NULL) master = of_node_get(np); else if (slave == NULL) slave = of_node_get(np); @@ -557,13 +595,8 @@ static int __init pmac_pic_probe_mpic(vo ppc_md.get_irq = mpic_get_irq; /* Setup master */ - BUG_ON(of_address_to_resource(master, 0, &r)); - pmac_call_feature(PMAC_FTR_ENABLE_MPIC, master, 0, 0); - prom_get_irq_senses(senses, 0, 128); - mpic1 = mpic_alloc(r.start, MPIC_PRIMARY | MPIC_WANTS_RESET, - 0, 0, 128, 252, senses, 128, " OpenPIC "); + mpic1 = pmac_setup_one_mpic(master, 1); BUG_ON(mpic1 == NULL); - mpic_init(mpic1); /* Install NMI if any */ pmac_pic_setup_mpic_nmi(mpic1); @@ -574,27 +607,12 @@ static int __init pmac_pic_probe_mpic(vo if (slave == NULL || slave->n_intrs < 1) return 0; - /* Setup slave, failures are non-fatal */ - if (of_address_to_resource(slave, 0, &r)) { - printk(KERN_ERR "Can't get address of MPIC %s\n", - slave->full_name); - return 0; - } - pmac_call_feature(PMAC_FTR_ENABLE_MPIC, slave, 0, 0); - prom_get_irq_senses(senses, 128, 128 + 124); - - /* We don't need to set MPIC_BROKEN_U3 here since we don't have - * hypertransport interrupts routed to it, at least not on currently - * supported machines, that may change. 
- */ - mpic2 = mpic_alloc(r.start, MPIC_BIG_ENDIAN | MPIC_WANTS_RESET, - 0, 128, 124, 0, senses, 124, " U3-MPIC "); + mpic2 = pmac_setup_one_mpic(slave, 0); if (mpic2 == NULL) { - printk(KERN_ERR "Can't create slave MPIC %s\n", - slave->full_name); + printk(KERN_ERR "Failed to setup slave MPIC\n"); + of_node_put(slave); return 0; } - mpic_init(mpic2); mpic_setup_cascade(slave->intrs[0].line, pmac_u3_cascade, mpic2); of_node_put(slave); Index: linux-work/include/asm-powerpc/pmac_feature.h =================================================================== --- linux-work.orig/include/asm-powerpc/pmac_feature.h 2005-11-24 17:18:48.000000000 +1100 +++ linux-work/include/asm-powerpc/pmac_feature.h 2005-12-13 18:13:11.000000000 +1100 @@ -121,6 +121,7 @@ #define PMAC_TYPE_IMAC_G5 0x152 /* iMac G5 */ #define PMAC_TYPE_XSERVE_G5 0x153 /* Xserve G5 */ #define PMAC_TYPE_UNKNOWN_K2 0x19f /* Any other K2 based */ +#define PMAC_TYPE_UNKNOWN_SHASTA 0x19e /* Any other Shasta based */ /* * Motherboard flags @@ -341,6 +342,7 @@ enum { macio_pangea, macio_intrepid, macio_keylargo2, + macio_shasta, }; struct macio_chip Index: linux-work/arch/powerpc/Kconfig =================================================================== --- linux-work.orig/arch/powerpc/Kconfig 2005-12-13 15:03:21.000000000 +1100 +++ linux-work/arch/powerpc/Kconfig 2005-12-13 18:13:11.000000000 +1100 @@ -300,6 +300,7 @@ config PPC_PMAC64 bool depends on PPC_PMAC && POWER4 select U3_DART + select MPIC_BROKEN_U3 select GENERIC_TBSYNC default y Index: linux-work/arch/powerpc/kernel/prom.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/prom.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/kernel/prom.c 2005-12-13 18:13:11.000000000 +1100 @@ -298,6 +298,16 @@ static int __devinit finish_node_interru int i, j, n, sense; unsigned int *irq, virq; struct device_node *ic; + int trace = 0; + + //#define TRACE(fmt...) 
do { if (trace) { printk(fmt); mdelay(1000); } } while(0) +#define TRACE(fmt...) + + if (!strcmp(np->name, "smu-doorbell")) + trace = 1; + + TRACE("Finishing SMU doorbell ! num_interrupt_controllers = %d\n", + num_interrupt_controllers); if (num_interrupt_controllers == 0) { /* @@ -332,11 +342,12 @@ static int __devinit finish_node_interru } ints = (unsigned int *) get_property(np, "interrupts", &intlen); + TRACE("ints=%p, intlen=%d\n", ints, intlen); if (ints == NULL) return 0; intrcells = prom_n_intr_cells(np); intlen /= intrcells * sizeof(unsigned int); - + TRACE("intrcells=%d, new intlen=%d\n", intrcells, intlen); np->intrs = prom_alloc(intlen * sizeof(*(np->intrs)), mem_start); if (!np->intrs) return -ENOMEM; @@ -347,6 +358,7 @@ static int __devinit finish_node_interru intrcount = 0; for (i = 0; i < intlen; ++i, ints += intrcells) { n = map_interrupt(&irq, &ic, np, ints, intrcells); + TRACE("map, irq=%d, ic=%p, n=%d\n", irq, ic, n); if (n <= 0) continue; @@ -357,6 +369,7 @@ static int __devinit finish_node_interru np->intrs[intrcount].sense = map_isa_senses[sense]; } else { virq = virt_irq_create_mapping(irq[0]); + TRACE("virq=%d\n", virq); #ifdef CONFIG_PPC64 if (virq == NO_IRQ) { printk(KERN_CRIT "Could not allocate interrupt" @@ -366,6 +379,12 @@ static int __devinit finish_node_interru #endif np->intrs[intrcount].line = irq_offset_up(virq); sense = (n > 1)? 
(irq[1] & 3): 1; + + /* Apple uses bits in there in a different way, let's + * only keep the real sense bit on macs + */ + if (_machine == PLATFORM_POWERMAC) + sense &= 0x1; np->intrs[intrcount].sense = map_mpic_senses[sense]; } @@ -375,12 +394,13 @@ static int __devinit finish_node_interru char *name = get_property(ic->parent, "name", NULL); if (name && !strcmp(name, "u3")) np->intrs[intrcount].line += 128; - else if (!(name && !strcmp(name, "mac-io"))) + else if (!(name && (!strcmp(name, "mac-io") || + !strcmp(name, "u4")))) /* ignore other cascaded controllers, such as the k2-sata-root */ break; } -#endif +#endif /* CONFIG_PPC64 */ if (n > 2) { printk("hmmm, got %d intr cells for %s:", n, np->full_name); Index: linux-work/arch/powerpc/platforms/powermac/pci.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/pci.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/pci.c 2005-12-13 18:13:12.000000000 +1100 @@ -1,7 +1,7 @@ /* * Support for PCI bridges found on Power Macintoshes. 
* - * Copyright (C) 2003 Benjamin Herrenschmuidt (benh at kernel.crashing.org) + * Copyright (C) 2003-2005 Benjamin Herrenschmuidt (benh at kernel.crashing.org) * Copyright (C) 1997 Paul Mackerras (paulus at samba.org) * * This program is free software; you can redistribute it and/or @@ -25,7 +25,7 @@ #include #include #ifdef CONFIG_PPC64 -#include +//#include #include #endif @@ -44,6 +44,7 @@ static int add_bridge(struct device_node static int has_uninorth; #ifdef CONFIG_PPC64 static struct pci_controller *u3_agp; +static struct pci_controller *u4_pcie; static struct pci_controller *u3_ht; #endif /* CONFIG_PPC64 */ @@ -97,11 +98,8 @@ static void __init fixup_bus_range(struc /* Lookup the "bus-range" property for the hose */ bus_range = (int *) get_property(bridge, "bus-range", &len); - if (bus_range == NULL || len < 2 * sizeof(int)) { - printk(KERN_WARNING "Can't get bus-range for %s\n", - bridge->full_name); + if (bus_range == NULL || len < 2 * sizeof(int)) return; - } bus_range[1] = fixup_one_level_bus_range(bridge->child, bus_range[1]); } @@ -128,14 +126,14 @@ static void __init fixup_bus_range(struc */ #define MACRISC_CFA0(devfn, off) \ - ((1 << (unsigned long)PCI_SLOT(dev_fn)) \ - | (((unsigned long)PCI_FUNC(dev_fn)) << 8) \ - | (((unsigned long)(off)) & 0xFCUL)) + ((1 << (unsigned int)PCI_SLOT(dev_fn)) \ + | (((unsigned int)PCI_FUNC(dev_fn)) << 8) \ + | (((unsigned int)(off)) & 0xFCUL)) #define MACRISC_CFA1(bus, devfn, off) \ - ((((unsigned long)(bus)) << 16) \ - |(((unsigned long)(devfn)) << 8) \ - |(((unsigned long)(off)) & 0xFCUL) \ + ((((unsigned int)(bus)) << 16) \ + |(((unsigned int)(devfn)) << 8) \ + |(((unsigned int)(off)) & 0xFCUL) \ |1UL) static unsigned long macrisc_cfg_access(struct pci_controller* hose, @@ -168,7 +166,8 @@ static int macrisc_read_config(struct pc hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = macrisc_cfg_access(hose, 
bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -199,7 +198,8 @@ static int macrisc_write_config(struct p hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = macrisc_cfg_access(hose, bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -234,12 +234,13 @@ static struct pci_ops macrisc_pci_ops = /* * Verify that a specific (bus, dev_fn) exists on chaos */ -static int -chaos_validate_dev(struct pci_bus *bus, int devfn, int offset) +static int chaos_validate_dev(struct pci_bus *bus, int devfn, int offset) { struct device_node *np; u32 *vendor, *device; + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; np = pci_busdev_to_OF_node(bus, devfn); if (np == NULL) return PCIBIOS_DEVICE_NOT_FOUND; @@ -341,10 +342,10 @@ static int u3_ht_skip_device(struct pci_ } #define U3_HT_CFA0(devfn, off) \ - ((((unsigned long)devfn) << 8) | offset) + ((((unsigned int)devfn) << 8) | offset) #define U3_HT_CFA1(bus, devfn, off) \ (U3_HT_CFA0(devfn, off) \ - + (((unsigned long)bus) << 16) \ + + (((unsigned int)bus) << 16) \ + 0x01000000UL) static unsigned long u3_ht_cfg_access(struct pci_controller* hose, @@ -370,7 +371,8 @@ static int u3_ht_read_config(struct pci_ hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = u3_ht_cfg_access(hose, bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -419,7 +421,8 @@ static int u3_ht_write_config(struct pci hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = u3_ht_cfg_access(hose, bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -459,6 +462,112 @@ static struct pci_ops u3_ht_pci_ops = u3_ht_read_config, u3_ht_write_config }; + +#define U4_PCIE_CFA0(devfn, 
off) \ + ((1 << ((unsigned int)PCI_SLOT(dev_fn))) \ + | (((unsigned int)PCI_FUNC(dev_fn)) << 8) \ + | ((((unsigned int)(off)) >> 8) << 28) \ + | (((unsigned int)(off)) & 0xfcU)) + +#define U4_PCIE_CFA1(bus, devfn, off) \ + ((((unsigned int)(bus)) << 16) \ + |(((unsigned int)(devfn)) << 8) \ + | ((((unsigned int)(off)) >> 8) << 28) \ + |(((unsigned int)(off)) & 0xfcU) \ + |1UL) + +static unsigned long u4_pcie_cfg_access(struct pci_controller* hose, + u8 bus, u8 dev_fn, int offset) +{ + unsigned int caddr; + + if (bus == hose->first_busno) { + caddr = U4_PCIE_CFA0(dev_fn, offset); + } else + caddr = U4_PCIE_CFA1(bus, dev_fn, offset); + + /* Uninorth will return garbage if we don't read back the value ! */ + do { + out_le32(hose->cfg_addr, caddr); + } while (in_le32(hose->cfg_addr) != caddr); + + offset &= 0x03; + return ((unsigned long)hose->cfg_data) + offset; +} + +static int u4_pcie_read_config(struct pci_bus *bus, unsigned int devfn, + int offset, int len, u32 *val) +{ + struct pci_controller *hose; + unsigned long addr; + + hose = pci_bus_to_host(bus); + if (hose == NULL) + return PCIBIOS_DEVICE_NOT_FOUND; + if (offset >= 0x1000) + return PCIBIOS_BAD_REGISTER_NUMBER; + addr = u4_pcie_cfg_access(hose, bus->number, devfn, offset); + if (!addr) + return PCIBIOS_DEVICE_NOT_FOUND; + /* + * Note: the caller has already checked that offset is + * suitably aligned and that len is 1, 2 or 4. 
+ */ + switch (len) { + case 1: + *val = in_8((u8 *)addr); + break; + case 2: + *val = in_le16((u16 *)addr); + break; + default: + *val = in_le32((u32 *)addr); + break; + } + return PCIBIOS_SUCCESSFUL; +} + +static int u4_pcie_write_config(struct pci_bus *bus, unsigned int devfn, + int offset, int len, u32 val) +{ + struct pci_controller *hose; + unsigned long addr; + + hose = pci_bus_to_host(bus); + if (hose == NULL) + return PCIBIOS_DEVICE_NOT_FOUND; + if (offset >= 0x1000) + return PCIBIOS_BAD_REGISTER_NUMBER; + addr = u4_pcie_cfg_access(hose, bus->number, devfn, offset); + if (!addr) + return PCIBIOS_DEVICE_NOT_FOUND; + /* + * Note: the caller has already checked that offset is + * suitably aligned and that len is 1, 2 or 4. + */ + switch (len) { + case 1: + out_8((u8 *)addr, val); + (void) in_8((u8 *)addr); + break; + case 2: + out_le16((u16 *)addr, val); + (void) in_le16((u16 *)addr); + break; + default: + out_le32((u32 *)addr, val); + (void) in_le32((u32 *)addr); + break; + } + return PCIBIOS_SUCCESSFUL; +} + +static struct pci_ops u4_pcie_pci_ops = +{ + u4_pcie_read_config, + u4_pcie_write_config +}; + #endif /* CONFIG_PPC64 */ #ifdef CONFIG_PPC32 @@ -628,15 +737,36 @@ static void __init setup_u3_agp(struct p hose->ops = &macrisc_pci_ops; hose->cfg_addr = ioremap(0xf0000000 + 0x800000, 0x1000); hose->cfg_data = ioremap(0xf0000000 + 0xc00000, 0x1000); - u3_agp = hose; } +static void __init setup_u4_pcie(struct pci_controller* hose) +{ + /* We currently only implement the "non-atomic" config space, to + * be optimised later. + */ + hose->ops = &u4_pcie_pci_ops; + hose->cfg_addr = ioremap(0xf0000000 + 0x800000, 0x1000); + hose->cfg_data = ioremap(0xf0000000 + 0xc00000, 0x1000); + + /* The bus contains a bridge from root -> device, we need to + * make it visible on bus 0 so that we pick the right type + * of config cycles. If we didn't, we would have to force all + * config cycles to be type 1.
So we override the "bus-range" + * property here + */ + hose->first_busno = 0x00; + hose->last_busno = 0xff; + u4_pcie = hose; +} + static void __init setup_u3_ht(struct pci_controller* hose) { struct device_node *np = (struct device_node *)hose->arch_data; + struct pci_controller *other = NULL; int i, cur; + hose->ops = &u3_ht_pci_ops; /* We hard code the address because of the different size of @@ -670,11 +800,20 @@ static void __init setup_u3_ht(struct pc u3_ht = hose; - if (u3_agp == NULL) { - DBG("U3 has no AGP, using full resource range\n"); + if (u3_agp != NULL) + other = u3_agp; + else if (u4_pcie != NULL) + other = u4_pcie; + + if (other == NULL) { + DBG("U3/4 has no AGP/PCIE, using full resource range\n"); return; } + /* Fixup bus range vs. PCIE */ + if (u4_pcie) + hose->last_busno = u4_pcie->first_busno - 1; + /* We "remove" the AGP resources from the resources allocated to HT, * that is we create "holes". However, that code does assumptions * that so far happen to be true (cross fingers...), typically that @@ -682,7 +821,7 @@ static void __init setup_u3_ht(struct pc */ cur = 0; for (i=0; i<3; i++) { - struct resource *res = &u3_agp->mem_resources[i]; + struct resource *res = &other->mem_resources[i]; if (res->flags != IORESOURCE_MEM) continue; /* We don't care about "fine" resources */ @@ -777,9 +916,13 @@ static int __init add_bridge(struct devi setup_u3_ht(hose); disp_name = "U3-HT"; primary = 1; + } else if (device_is_compatible(dev, "u4-pcie")) { + setup_u4_pcie(hose); + disp_name = "U4-PCIE"; + primary = 0; } - printk(KERN_INFO "Found %s PCI host bridge. Firmware bus number: %d->%d\n", - disp_name, hose->first_busno, hose->last_busno); + printk(KERN_INFO "Found %s PCI host bridge. 
Firmware bus number:" + " %d->%d\n", disp_name, hose->first_busno, hose->last_busno); #endif /* CONFIG_PPC64 */ /* 32 bits only bridges */ @@ -900,6 +1043,8 @@ void __init pmac_pci_init(void) pci_setup_phb_io(u3_ht, 1); if (u3_agp) pci_setup_phb_io(u3_agp, 0); + if (u4_pcie) + pci_setup_phb_io(u4_pcie, 0); /* * On ppc64, fixup the IO resources on our host bridges as @@ -912,7 +1057,8 @@ void __init pmac_pci_init(void) /* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We * assume there is no P2P bridge on the AGP bus, which should be a - * safe assumptions hopefully. + * safe assumptions for now. We should do something better in the + * future though */ if (u3_agp) { struct device_node *np = u3_agp->arch_data; @@ -920,7 +1066,6 @@ void __init pmac_pci_init(void) for (np = np->child; np; np = np->sibling) PCI_DN(np)->busno = 0xf0; } - /* pmac_check_ht_link(); */ /* Tell pci.c to not use the common resource allocation mechanism */ @@ -1127,7 +1272,8 @@ void pmac_pci_fixup_pciata(struct pci_de good: pci_read_config_byte(dev, PCI_CLASS_PROG, &progif); if ((progif & 5) != 5) { - printk(KERN_INFO "Forcing PCI IDE into native mode: %s\n", pci_name(dev)); + printk(KERN_INFO "Forcing PCI IDE into native mode: %s\n", + pci_name(dev)); (void) pci_write_config_byte(dev, PCI_CLASS_PROG, progif|5); if (pci_read_config_byte(dev, PCI_CLASS_PROG, &progif) || (progif & 5) != 5) @@ -1153,7 +1299,8 @@ static void fixup_k2_sata(struct pci_dev for (i = 0; i < 6; i++) { dev->resource[i].start = dev->resource[i].end = 0; dev->resource[i].flags = 0; - pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, 0); + pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, + 0); } } else { pci_read_config_word(dev, PCI_COMMAND, &cmd); @@ -1162,7 +1309,8 @@ static void fixup_k2_sata(struct pci_dev for (i = 0; i < 5; i++) { dev->resource[i].start = dev->resource[i].end = 0; dev->resource[i].flags = 0; - pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, 0); + 
pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, + 0); } } } Index: linux-work/arch/powerpc/sysdev/mpic.c =================================================================== --- linux-work.orig/arch/powerpc/sysdev/mpic.c 2005-12-13 18:02:03.000000000 +1100 +++ linux-work/arch/powerpc/sysdev/mpic.c 2005-12-13 18:13:12.000000000 +1100 @@ -13,6 +13,9 @@ */ #undef DEBUG +#undef DEBUG_IPI +#undef DEBUG_IRQ +#undef DEBUG_LOW #include #include @@ -168,35 +171,86 @@ static void __init mpic_test_broken_ipi( /* Test if an interrupt is sourced from HyperTransport (used on broken U3s) * to force the edge setting on the MPIC and do the ack workaround. */ -static inline int mpic_is_ht_interrupt(struct mpic *mpic, unsigned int source_no) +static inline int mpic_is_ht_interrupt(struct mpic *mpic, unsigned int source) { - if (source_no >= 128 || !mpic->fixups) + if (source >= 128 || !mpic->fixups) return 0; - return mpic->fixups[source_no].base != NULL; + return mpic->fixups[source].base != NULL; } -static inline void mpic_apic_end_irq(struct mpic *mpic, unsigned int source_no) +static inline void mpic_ht_end_irq(struct mpic *mpic, unsigned int source) { - struct mpic_irq_fixup *fixup = &mpic->fixups[source_no]; + struct mpic_irq_fixup *fixup = &mpic->fixups[source]; - spin_lock(&mpic->fixup_lock); - writeb(0x11 + 2 * fixup->irq, fixup->base + 2); - writel(fixup->data, fixup->base + 4); - spin_unlock(&mpic->fixup_lock); + if (fixup->applebase) { + unsigned int soff = (fixup->index >> 3) & ~3; + unsigned int mask = 1U << (fixup->index & 0x1f); + writel(mask, fixup->applebase + soff); + } else { + spin_lock(&mpic->fixup_lock); + writeb(0x11 + 2 * fixup->index, fixup->base + 2); + writel(fixup->data, fixup->base + 4); + spin_unlock(&mpic->fixup_lock); + } } +static void mpic_startup_ht_interrupt(struct mpic *mpic, unsigned int source, + unsigned int irqflags) +{ + struct mpic_irq_fixup *fixup = &mpic->fixups[source]; + unsigned long flags; + u32 tmp; + + if (fixup->base == 
NULL) + return; + + DBG("startup_ht_interrupt(%u, %u) index: %d\n", + source, irqflags, fixup->index); + spin_lock_irqsave(&mpic->fixup_lock, flags); + /* Enable and configure */ + writeb(0x10 + 2 * fixup->index, fixup->base + 2); + tmp = readl(fixup->base + 4); + tmp &= ~(0x23U); + if (irqflags & IRQ_LEVEL) + tmp |= 0x22; + writel(tmp, fixup->base + 4); + spin_unlock_irqrestore(&mpic->fixup_lock, flags); +} + +static void mpic_shutdown_ht_interrupt(struct mpic *mpic, unsigned int source, + unsigned int irqflags) +{ + struct mpic_irq_fixup *fixup = &mpic->fixups[source]; + unsigned long flags; + u32 tmp; + + if (fixup->base == NULL) + return; + + DBG("shutdown_ht_interrupt(%u, %u)\n", source, irqflags); + + /* Disable */ + spin_lock_irqsave(&mpic->fixup_lock, flags); + writeb(0x10 + 2 * fixup->index, fixup->base + 2); + tmp = readl(fixup->base + 4); + tmp &= ~1U; + writel(tmp, fixup->base + 4); + spin_unlock_irqrestore(&mpic->fixup_lock, flags); +} -static void __init mpic_scan_ioapic(struct mpic *mpic, u8 __iomem *devbase) +static void __init mpic_scan_ht_pic(struct mpic *mpic, u8 __iomem *devbase, + unsigned int devfn, u32 vdid) { int i, irq, n; + u8 __iomem *base; u32 tmp; u8 pos; - for (pos = readb(devbase + 0x34); pos; pos = readb(devbase + pos + 1)) { - u8 id = readb(devbase + pos); - - if (id == 0x08) { + for (pos = readb(devbase + PCI_CAPABILITY_LIST); pos != 0; + pos = readb(devbase + pos + PCI_CAP_LIST_NEXT)) { + u8 id = readb(devbase + pos + PCI_CAP_LIST_ID); + if (id == PCI_CAP_ID_HT_IRQCONF) { id = readb(devbase + pos + 3); if (id == 0x80) break; @@ -205,33 +259,41 @@ static void __init mpic_scan_ioapic(stru if (pos == 0) return; - printk(KERN_INFO "mpic: - Workarounds @ %p, pos = 0x%02x\n", devbase, pos); - - devbase += pos; - - writeb(0x01, devbase + 2); - n = (readl(devbase + 4) >> 16) & 0xff; + base = devbase + pos; + writeb(0x01, base + 2); + n = (readl(base + 4) >> 16) & 0xff; + + printk(KERN_INFO "mpic: - HT:%02x.%x [0x%02x] vendor %04x device 
%04x" + " has %d irqs\n", + devfn >> 3, devfn & 0x7, pos, vdid & 0xffff, vdid >> 16, n + 1); for (i = 0; i <= n; i++) { - writeb(0x10 + 2 * i, devbase + 2); - tmp = readl(devbase + 4); - if ((tmp & 0x21) != 0x20) - continue; + writeb(0x10 + 2 * i, base + 2); + tmp = readl(base + 4); irq = (tmp >> 16) & 0xff; - mpic->fixups[irq].irq = i; - mpic->fixups[irq].base = devbase; - writeb(0x11 + 2 * i, devbase + 2); - mpic->fixups[irq].data = readl(devbase + 4) | 0x80000000; + DBG("HT PIC index 0x%x, irq 0x%x, tmp: %08x\n", i, irq, tmp); + /* mask it , will be unmasked later */ + tmp |= 0x1; + writel(tmp, base + 4); + mpic->fixups[irq].index = i; + mpic->fixups[irq].base = base; + /* Apple HT PIC has a non-standard way of doing EOIs */ + if ((vdid & 0xffff) == 0x106b) + mpic->fixups[irq].applebase = devbase + 0x60; + else + mpic->fixups[irq].applebase = NULL; + writeb(0x11 + 2 * i, base + 2); + mpic->fixups[irq].data = readl(base + 4) | 0x80000000; } } -static void __init mpic_scan_ioapics(struct mpic *mpic) +static void __init mpic_scan_ht_pics(struct mpic *mpic) { unsigned int devfn; u8 __iomem *cfgspace; - printk(KERN_INFO "mpic: Setting up IO-APICs workarounds for U3\n"); + printk(KERN_INFO "mpic: Setting up HT PICs workarounds for U3/U4\n"); /* Allocate fixups array */ mpic->fixups = alloc_bootmem(128 * sizeof(struct mpic_irq_fixup)); @@ -247,13 +309,14 @@ static void __init mpic_scan_ioapics(str cfgspace = ioremap(0xf2000000, 0x10000); BUG_ON(cfgspace == NULL); - /* Now we scan all slots. We do a very quick scan, we read the header type, - * vendor ID and device ID only, that's plenty enough + /* Now we scan all slots. 
We do a very quick scan, we read the header + * type, vendor ID and device ID only, that's plenty enough */ for (devfn = 0; devfn < 0x100; devfn++) { u8 __iomem *devbase = cfgspace + (devfn << 8); u8 hdr_type = readb(devbase + PCI_HEADER_TYPE); u32 l = readl(devbase + PCI_VENDOR_ID); + u16 s; DBG("devfn %x, l: %x\n", devfn, l); @@ -261,8 +324,12 @@ static void __init mpic_scan_ioapics(str if (l == 0xffffffff || l == 0x00000000 || l == 0x0000ffff || l == 0xffff0000) goto next; + /* Check if is supports capability lists */ + s = readw(devbase + PCI_STATUS); + if (!(s & PCI_STATUS_CAP_LIST)) + goto next; - mpic_scan_ioapic(mpic, devbase); + mpic_scan_ht_pic(mpic, devbase, devfn, l); next: /* next device, if function 0 */ @@ -363,6 +430,31 @@ static void mpic_enable_irq(unsigned int break; } } while(mpic_irq_read(src, MPIC_IRQ_VECTOR_PRI) & MPIC_VECPRI_MASK); + +#ifdef CONFIG_MPIC_BROKEN_U3 + if (mpic->flags & MPIC_BROKEN_U3) { + unsigned int src = irq - mpic->irq_offset; + if (mpic_is_ht_interrupt(mpic, src) && + (irq_desc[irq].status & IRQ_LEVEL)) + mpic_ht_end_irq(mpic, src); + } +#endif /* CONFIG_MPIC_BROKEN_U3 */ +} + +static unsigned int mpic_startup_irq(unsigned int irq) +{ +#ifdef CONFIG_MPIC_BROKEN_U3 + struct mpic *mpic = mpic_from_irq(irq); + unsigned int src = irq - mpic->irq_offset; + + if (mpic_is_ht_interrupt(mpic, src)) + mpic_startup_ht_interrupt(mpic, src, irq_desc[irq].status); + +#endif /* CONFIG_MPIC_BROKEN_U3 */ + + mpic_enable_irq(irq); + + return 0; } static void mpic_disable_irq(unsigned int irq) @@ -386,12 +478,27 @@ static void mpic_disable_irq(unsigned in } while(!(mpic_irq_read(src, MPIC_IRQ_VECTOR_PRI) & MPIC_VECPRI_MASK)); } +static void mpic_shutdown_irq(unsigned int irq) +{ +#ifdef CONFIG_MPIC_BROKEN_U3 + struct mpic *mpic = mpic_from_irq(irq); + unsigned int src = irq - mpic->irq_offset; + + if (mpic_is_ht_interrupt(mpic, src)) + mpic_shutdown_ht_interrupt(mpic, src, irq_desc[irq].status); + +#endif /* CONFIG_MPIC_BROKEN_U3 */ + + 
mpic_disable_irq(irq); +} + static void mpic_end_irq(unsigned int irq) { struct mpic *mpic = mpic_from_irq(irq); +#ifdef DEBUG_IRQ DBG("%s: end_irq: %d\n", mpic->name, irq); - +#endif /* We always EOI on end_irq() even for edge interrupts since that * should only lower the priority, the MPIC should have properly * latched another edge interrupt coming in anyway @@ -400,8 +507,9 @@ static void mpic_end_irq(unsigned int ir #ifdef CONFIG_MPIC_BROKEN_U3 if (mpic->flags & MPIC_BROKEN_U3) { unsigned int src = irq - mpic->irq_offset; - if (mpic_is_ht_interrupt(mpic, src)) - mpic_apic_end_irq(mpic, src); + if (mpic_is_ht_interrupt(mpic, src) && + (irq_desc[irq].status & IRQ_LEVEL)) + mpic_ht_end_irq(mpic, src); } #endif /* CONFIG_MPIC_BROKEN_U3 */ @@ -482,6 +590,8 @@ struct mpic * __init mpic_alloc(unsigned mpic->name = name; mpic->hc_irq.typename = name; + mpic->hc_irq.startup = mpic_startup_irq; + mpic->hc_irq.shutdown = mpic_shutdown_irq; mpic->hc_irq.enable = mpic_enable_irq; mpic->hc_irq.disable = mpic_disable_irq; mpic->hc_irq.end = mpic_end_irq; @@ -650,10 +760,10 @@ void __init mpic_init(struct mpic *mpic) mpic->irq_count = mpic->num_sources; #ifdef CONFIG_MPIC_BROKEN_U3 - /* Do the ioapic fixups on U3 broken mpic */ + /* Do the HT PIC fixups on U3 broken mpic */ DBG("MPIC flags: %x\n", mpic->flags); if ((mpic->flags & MPIC_BROKEN_U3) && (mpic->flags & MPIC_PRIMARY)) - mpic_scan_ioapics(mpic); + mpic_scan_ht_pics(mpic); #endif /* CONFIG_MPIC_BROKEN_U3 */ for (i = 0; i < mpic->num_sources; i++) { @@ -840,7 +950,9 @@ void mpic_send_ipi(unsigned int ipi_no, BUG_ON(mpic == NULL); +#ifdef DEBUG_IPI DBG("%s: send_ipi(ipi_no: %d)\n", mpic->name, ipi_no); +#endif mpic_cpu_write(MPIC_CPU_IPI_DISPATCH_0 + ipi_no * 0x10, mpic_physmask(cpu_mask & cpus_addr(cpu_online_map)[0])); @@ -851,19 +963,28 @@ int mpic_get_one_irq(struct mpic *mpic, u32 irq; irq = mpic_cpu_read(MPIC_CPU_INTACK) & MPIC_VECPRI_VECTOR_MASK; +#ifdef DEBUG_LOW DBG("%s: get_one_irq(): %d\n", mpic->name, irq); 
- +#endif if (mpic->cascade && irq == mpic->cascade_vec) { +#ifdef DEBUG_LOW DBG("%s: cascading ...\n", mpic->name); +#endif irq = mpic->cascade(regs, mpic->cascade_data); mpic_eoi(mpic); return irq; } if (unlikely(irq == MPIC_VEC_SPURRIOUS)) return -1; - if (irq < MPIC_VEC_IPI_0) + if (irq < MPIC_VEC_IPI_0) { +#ifdef DEBUG_IRQ + DBG("%s: irq %d\n", mpic->name, irq + mpic->irq_offset); +#endif return irq + mpic->irq_offset; + } +#ifdef DEBUG_IPI DBG("%s: ipi %d !\n", mpic->name, irq - MPIC_VEC_IPI_0); +#endif return irq - MPIC_VEC_IPI_0 + mpic->ipi_offset; } Index: linux-work/include/linux/pci_regs.h =================================================================== --- linux-work.orig/include/linux/pci_regs.h 2005-11-24 17:18:49.000000000 +1100 +++ linux-work/include/linux/pci_regs.h 2005-12-13 18:13:12.000000000 +1100 @@ -196,6 +196,7 @@ #define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */ #define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */ #define PCI_CAP_ID_PCIX 0x07 /* PCI-X */ +#define PCI_CAP_ID_HT_IRQCONF 0x08 /* HyperTransport IRQ Configuration */ #define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */ #define PCI_CAP_ID_EXP 0x10 /* PCI Express */ #define PCI_CAP_ID_MSIX 0x11 /* MSI-X */ Index: linux-work/arch/powerpc/kernel/pci_64.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/pci_64.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/kernel/pci_64.c 2005-12-13 18:13:12.000000000 +1100 @@ -34,7 +34,7 @@ #ifdef DEBUG #include -#define DBG(fmt...) udbg_printf(fmt) +#define DBG(fmt...) printk(fmt) #else #define DBG(fmt...) 
#endif @@ -323,6 +323,7 @@ static void pci_parse_of_addrs(struct de addrs = (u32 *) get_property(node, "assigned-addresses", &proplen); if (!addrs) return; + DBG(" parse addresses (%d bytes) @ %p\n", proplen, addrs); for (; proplen >= 20; proplen -= 20, addrs += 5) { flags = pci_parse_of_flags(addrs[0]); if (!flags) @@ -332,6 +333,9 @@ static void pci_parse_of_addrs(struct de if (!size) continue; i = addrs[0] & 0xff; + DBG(" base: %llx, size: %llx, i: %x\n", + (unsigned long long)base, (unsigned long long)size, i); + if (PCI_BASE_ADDRESS_0 <= i && i <= PCI_BASE_ADDRESS_5) { res = &dev->resource[(i - PCI_BASE_ADDRESS_0) >> 2]; } else if (i == dev->rom_base_reg) { @@ -362,6 +366,8 @@ struct pci_dev *of_create_pci_dev(struct if (type == NULL) type = ""; + DBG(" create device, devfn: %x, type: %s\n", devfn, type); + memset(dev, 0, sizeof(struct pci_dev)); dev->bus = bus; dev->sysdata = node; @@ -375,12 +381,14 @@ struct pci_dev *of_create_pci_dev(struct dev->subsystem_vendor = get_int_prop(node, "subsystem-vendor-id", 0); dev->subsystem_device = get_int_prop(node, "subsystem-id", 0); - dev->cfg_size = 256; /*pci_cfg_space_size(dev);*/ + dev->cfg_size = pci_cfg_space_size(dev); sprintf(pci_name(dev), "%04x:%02x:%02x.%d", pci_domain_nr(bus), dev->bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn)); dev->class = get_int_prop(node, "class-code", 0); + DBG(" class: 0x%x\n", dev->class); + dev->current_state = 4; /* unknown power state */ if (!strcmp(type, "pci")) { @@ -402,6 +410,8 @@ struct pci_dev *of_create_pci_dev(struct pci_parse_of_addrs(node, dev); + DBG(" adding to system ...\n"); + pci_device_add(dev, bus); /* XXX pci_scan_msi_device(dev); */ @@ -418,15 +428,21 @@ void __devinit of_scan_bus(struct device int reglen, devfn; struct pci_dev *dev; + DBG("of_scan_bus(%s) bus no %d... 
\n", node->full_name, bus->number); + while ((child = of_get_next_child(node, child)) != NULL) { + DBG(" * %s\n", child->full_name); reg = (u32 *) get_property(child, "reg", &reglen); if (reg == NULL || reglen < 20) continue; devfn = (reg[0] >> 8) & 0xff; + /* create a new pci_dev for this device */ dev = of_create_pci_dev(child, bus, devfn); if (!dev) continue; + DBG("dev header type: %x\n", dev->hdr_type); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE || dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) of_scan_pci_bridge(child, dev); @@ -446,16 +462,18 @@ void __devinit of_scan_pci_bridge(struct unsigned int flags; u64 size; + DBG("of_scan_pci_bridge(%s)\n", node->full_name); + /* parse bus-range property */ busrange = (u32 *) get_property(node, "bus-range", &len); if (busrange == NULL || len != 8) { - printk(KERN_ERR "Can't get bus-range for PCI-PCI bridge %s\n", + printk(KERN_DEBUG "Can't get bus-range for PCI-PCI bridge %s\n", node->full_name); return; } ranges = (u32 *) get_property(node, "ranges", &len); if (ranges == NULL) { - printk(KERN_ERR "Can't get ranges for PCI-PCI bridge %s\n", + printk(KERN_DEBUG "Can't get ranges for PCI-PCI bridge %s\n", node->full_name); return; } @@ -509,10 +527,13 @@ void __devinit of_scan_pci_bridge(struct } sprintf(bus->name, "PCI Bus %04x:%02x", pci_domain_nr(bus), bus->number); + DBG(" bus name: %s\n", bus->name); mode = PCI_PROBE_NORMAL; if (ppc_md.pci_probe_mode) mode = ppc_md.pci_probe_mode(bus); + DBG(" probe mode: %d\n", mode); + if (mode == PCI_PROBE_DEVTREE) of_scan_bus(node, bus); else if (mode == PCI_PROBE_NORMAL) @@ -528,6 +549,8 @@ void __devinit scan_phb(struct pci_contr int i, mode; struct resource *res; + DBG("Scanning PHB %s\n", node ? 
node->full_name : ""); + bus = pci_create_bus(NULL, hose->first_busno, hose->ops, node); if (bus == NULL) { printk(KERN_ERR "Failed to create bus for PCI domain %04x\n", @@ -552,8 +575,9 @@ void __devinit scan_phb(struct pci_contr mode = PCI_PROBE_NORMAL; #ifdef CONFIG_PPC_MULTIPLATFORM - if (ppc_md.pci_probe_mode) + if (node && ppc_md.pci_probe_mode) mode = ppc_md.pci_probe_mode(bus); + DBG(" probe mode: %d\n", mode); if (mode == PCI_PROBE_DEVTREE) { bus->subordinate = hose->last_busno; of_scan_bus(node, bus); @@ -842,8 +866,7 @@ pgprot_t pci_phys_mem_access_prot(struct * Returns a negative error code on failure, zero on success. */ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, - enum pci_mmap_state mmap_state, - int write_combine) + enum pci_mmap_state mmap_state, int write_combine) { unsigned long offset = vma->vm_pgoff << PAGE_SHIFT; struct resource *rp; Index: linux-work/arch/powerpc/kernel/udbg.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/udbg.c 2005-12-06 16:17:43.000000000 +1100 +++ linux-work/arch/powerpc/kernel/udbg.c 2005-12-13 18:13:12.000000000 +1100 @@ -110,10 +110,12 @@ static int early_console_initialized; void __init disable_early_printk(void) { +#if 1 if (!early_console_initialized) return; unregister_console(&udbg_console); early_console_initialized = 0; +#endif } /* called by setup_system */ Index: linux-work/arch/powerpc/platforms/powermac/setup.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/setup.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/setup.c 2005-12-13 18:13:12.000000000 +1100 @@ -345,7 +345,7 @@ void __init pmac_setup_arch(void) #ifdef CONFIG_SMP /* Check for Core99 */ - if (find_devices("uni-n") || find_devices("u3")) + if (find_devices("uni-n") || find_devices("u3") || find_devices("u4")) smp_ops = &core99_smp_ops; #ifdef 
CONFIG_PPC32 else @@ -733,10 +733,11 @@ static int pmac_pci_probe_mode(struct pc struct device_node *node = bus->sysdata; /* We need to use normal PCI probing for the AGP bus, - since the device for the AGP bridge isn't in the tree. */ - if (bus->self == NULL && device_is_compatible(node, "u3-agp")) + * since the device for the AGP bridge isn't in the tree. + */ + if (bus->self == NULL && (device_is_compatible(node, "u3-agp") || + device_is_compatible(node, "u4-pcie"))) return PCI_PROBE_NORMAL; - return PCI_PROBE_DEVTREE; } #endif Index: linux-work/arch/powerpc/platforms/powermac/smp.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/smp.c 2005-12-13 15:03:21.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/smp.c 2005-12-13 18:13:12.000000000 +1100 @@ -361,7 +361,6 @@ static void __init psurge_dual_sync_tb(i set_dec(tb_ticks_per_jiffy); /* XXX fixme */ set_tb(0, 0); - last_jiffy_stamp(cpu_nr) = 0; if (cpu_nr > 0) { mb(); @@ -429,15 +428,62 @@ struct smp_ops_t psurge_smp_ops = { }; #endif /* CONFIG_PPC32 - actually powersurge support */ +/* + * Core 99 and later support + */ + +static void (*pmac_tb_freeze)(int freeze); +static unsigned long timebase; +static int tb_req; + +static void smp_core99_give_timebase(void) +{ + unsigned long flags; + + local_irq_save(flags); + + while(!tb_req) + barrier(); + tb_req = 0; + (*pmac_tb_freeze)(1); + mb(); + timebase = get_tb(); + mb(); + while (timebase) + barrier(); + mb(); + (*pmac_tb_freeze)(0); + mb(); + + local_irq_restore(flags); +} + + +static void __devinit smp_core99_take_timebase(void) +{ + unsigned long flags; + + local_irq_save(flags); + + tb_req = 1; + mb(); + while (!timebase) + barrier(); + mb(); + set_tb(timebase >> 32, timebase & 0xffffffff); + timebase = 0; + mb(); + set_dec(tb_ticks_per_jiffy/2); + + local_irq_restore(flags); +} + #ifdef CONFIG_PPC64 /* * G5s enable/disable the timebase via an i2c-connected clock chip. 
*/ static struct device_node *pmac_tb_clock_chip_host; static u8 pmac_tb_pulsar_addr; -static void (*pmac_tb_freeze)(int freeze); -static DEFINE_SPINLOCK(timebase_lock); -static unsigned long timebase; static void smp_core99_cypress_tb_freeze(int freeze) { @@ -447,7 +493,8 @@ static void smp_core99_cypress_tb_freeze /* Strangely, the device-tree says address is 0xd2, but darwin * accesses 0xd0 ... */ - pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_combined); + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, + pmac_low_i2c_mode_combined); rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, 0xd0 | pmac_low_i2c_read, 0x81, &data, 1); @@ -475,7 +522,8 @@ static void smp_core99_pulsar_tb_freeze( u8 data; int rc; - pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_combined); + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, + pmac_low_i2c_mode_combined); rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, pmac_tb_pulsar_addr | pmac_low_i2c_read, 0x2e, &data, 1); @@ -496,54 +544,14 @@ static void smp_core99_pulsar_tb_freeze( } } - -static void smp_core99_give_timebase(void) -{ - /* Open i2c bus for synchronous access */ - if (pmac_low_i2c_open(pmac_tb_clock_chip_host, 0)) - panic("Can't open i2c for TB sync !\n"); - - spin_lock(&timebase_lock); - (*pmac_tb_freeze)(1); - mb(); - timebase = get_tb(); - spin_unlock(&timebase_lock); - - while (timebase) - barrier(); - - spin_lock(&timebase_lock); - (*pmac_tb_freeze)(0); - spin_unlock(&timebase_lock); - - /* Close i2c bus */ - pmac_low_i2c_close(pmac_tb_clock_chip_host); -} - - -static void __devinit smp_core99_take_timebase(void) -{ - while (!timebase) - barrier(); - spin_lock(&timebase_lock); - set_tb(timebase >> 32, timebase & 0xffffffff); - timebase = 0; - spin_unlock(&timebase_lock); -} - -static void __init smp_core99_setup(int ncpus) +static void __init smp_core99_setup_i2c_hwsync(int ncpus) { struct device_node *cc = NULL; struct device_node *p; + const char *name = NULL; u32 *reg; int ok; - /* HW 
sync only on these platforms */ - if (!machine_is_compatible("PowerMac7,2") && - !machine_is_compatible("PowerMac7,3") && - !machine_is_compatible("RackMac3,1")) - return; - /* Look for the clock chip */ while ((cc = of_find_node_by_name(cc, "i2c-hwclock")) != NULL) { p = of_get_parent(cc); @@ -561,114 +569,64 @@ static void __init smp_core99_setup(int if (device_is_compatible(cc, "pulsar-legacy-slewing")) { pmac_tb_freeze = smp_core99_pulsar_tb_freeze; pmac_tb_pulsar_addr = 0xd2; - printk(KERN_INFO "Timebase clock is Pulsar chip\n"); + name = "Pulsar"; } else if (device_is_compatible(cc, "cy28508")) { pmac_tb_freeze = smp_core99_cypress_tb_freeze; - printk(KERN_INFO "Timebase clock is Cypress chip\n"); + name = "Cypress"; } break; case 0xd4: pmac_tb_freeze = smp_core99_pulsar_tb_freeze; pmac_tb_pulsar_addr = 0xd4; - printk(KERN_INFO "Timebase clock is Pulsar chip\n"); + name = "Pulsar"; break; } - if (pmac_tb_freeze != NULL) { - pmac_tb_clock_chip_host = of_get_parent(cc); - of_node_put(cc); + if (pmac_tb_freeze != NULL) break; - } } - if (pmac_tb_freeze == NULL) { - smp_ops->give_timebase = smp_generic_give_timebase; - smp_ops->take_timebase = smp_generic_take_timebase; + if (pmac_tb_freeze != NULL) { + struct device_node *p = of_get_parent(cc); + of_node_put(cc); + while(p && strcmp(p->type, "i2c")) { + cc = of_get_parent(p); + of_node_put(p); + p = cc; + } + if (p == NULL) + goto no_i2c_sync; + /* Open i2c bus for synchronous access */ + if (pmac_low_i2c_open(p, 0)) { + printk(KERN_ERR "Failed top open i2c bus %s for clock" + " sync, fallback to software sync !\n", + p->full_name); + of_node_put(p); + goto no_i2c_sync; + } + pmac_tb_clock_chip_host = p; + printk(KERN_INFO "Processor timebase sync using %s i2c clock\n", + name); + return; } + no_i2c_sync: + pmac_tb_freeze = NULL; } -/* nothing to do here, caches are already set up by service processor */ -static inline void __devinit core99_init_caches(int cpu) -{ -} +#endif /* CONFIG_PPC64 */ -#else /* 
CONFIG_PPC64 */ /* - * SMP G4 powermacs use a GPIO to enable/disable the timebase. + * SMP G4 and newer G5 use a GPIO to enable/disable the timebase. */ static unsigned int core99_tb_gpio; /* Timebase freeze GPIO */ -static unsigned int pri_tb_hi, pri_tb_lo; -static unsigned int pri_tb_stamp; - -/* not __init, called in sleep/wakeup code */ -void smp_core99_give_timebase(void) +static void smp_core99_gpio_tb_freeze(int freeze) { - unsigned long flags; - unsigned int t; - - /* wait for the secondary to be in take_timebase */ - for (t = 100000; t > 0 && !sec_tb_reset; --t) - udelay(10); - if (!sec_tb_reset) { - printk(KERN_WARNING "Timeout waiting sync on second CPU\n"); - return; - } - - /* freeze the timebase and read it */ - /* disable interrupts so the timebase is disabled for the - shortest possible time */ - local_irq_save(flags); - pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 4); + if (freeze) + pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 4); + else + pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 0); pmac_call_feature(PMAC_FTR_READ_GPIO, NULL, core99_tb_gpio, 0); - mb(); - pri_tb_hi = get_tbu(); - pri_tb_lo = get_tbl(); - pri_tb_stamp = last_jiffy_stamp(smp_processor_id()); - mb(); - - /* tell the secondary we're ready */ - sec_tb_reset = 2; - mb(); - - /* wait for the secondary to have taken it */ - /* note: can't use udelay here, since it needs the timebase running */ - for (t = 10000000; t > 0 && sec_tb_reset; --t) - barrier(); - if (sec_tb_reset) - /* XXX BUG_ON here? 
*/ - printk(KERN_WARNING "Timeout waiting sync(2) on second CPU\n"); - - /* Now, restart the timebase by leaving the GPIO to an open collector */ - pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 0); - pmac_call_feature(PMAC_FTR_READ_GPIO, NULL, core99_tb_gpio, 0); - local_irq_restore(flags); -} - -/* not __init, called in sleep/wakeup code */ -void smp_core99_take_timebase(void) -{ - unsigned long flags; - - /* tell the primary we're here */ - sec_tb_reset = 1; - mb(); - - /* wait for the primary to set pri_tb_hi/lo */ - while (sec_tb_reset < 2) - mb(); - - /* set our stuff the same as the primary */ - local_irq_save(flags); - set_dec(1); - set_tb(pri_tb_hi, pri_tb_lo); - last_jiffy_stamp(smp_processor_id()) = pri_tb_stamp; - mb(); - - /* tell the primary we're done */ - sec_tb_reset = 0; - mb(); - local_irq_restore(flags); } /* L2 and L3 cache settings to pass from CPU0 to CPU1 on G4 cpus */ @@ -677,6 +635,7 @@ volatile static long int core99_l3_cache static void __devinit core99_init_caches(int cpu) { +#ifndef CONFIG_PPC64 if (!cpu_has_feature(CPU_FTR_L2CR)) return; @@ -702,30 +661,80 @@ static void __devinit core99_init_caches _set_L3CR(core99_l3_cache); printk("CPU%d: L3CR set to %lx\n", cpu, core99_l3_cache); } +#endif /* !CONFIG_PPC64 */ } static void __init smp_core99_setup(int ncpus) { - struct device_node *cpu; - u32 *tbprop = NULL; - int i; - - core99_tb_gpio = KL_GPIO_TB_ENABLE; /* default value */ - cpu = of_find_node_by_type(NULL, "cpu"); - if (cpu != NULL) { - tbprop = (u32 *)get_property(cpu, "timebase-enable", NULL); - if (tbprop) - core99_tb_gpio = *tbprop; - of_node_put(cpu); - } - - /* XXX should get this from reg properties */ - for (i = 1; i < ncpus; ++i) - smp_hw_index[i] = i; - powersave_nap = 0; -} +#ifdef CONFIG_PPC64 + + /* i2c based HW sync on some G5s */ + if (machine_is_compatible("PowerMac7,2") || + machine_is_compatible("PowerMac7,3") || + machine_is_compatible("RackMac3,1")) + smp_core99_setup_i2c_hwsync(ncpus); + + /* 
GPIO based HW sync on recent G5s */ + if (pmac_tb_freeze == NULL) { + struct device_node *np = + of_find_node_by_name(NULL, "timebase-enable"); + u32 *reg = (u32 *)get_property(np, "reg", NULL); + + if (np && reg && !strcmp(np->type, "gpio")) { + core99_tb_gpio = *reg; + if (core99_tb_gpio < 0x50) + core99_tb_gpio += 0x50; + pmac_tb_freeze = smp_core99_gpio_tb_freeze; + printk(KERN_INFO "Processor timebase sync using" + " GPIO 0x%02x\n", core99_tb_gpio); + } + } + +#else /* CONFIG_PPC64 */ + + /* GPIO based HW sync on ppc32 Core99 */ + if (pmac_tb_freeze == NULL && !machine_is_compatible("MacRISC4")) { + struct device_node *cpu; + u32 *tbprop = NULL; + + core99_tb_gpio = KL_GPIO_TB_ENABLE; /* default value */ + cpu = of_find_node_by_type(NULL, "cpu"); + if (cpu != NULL) { + tbprop = (u32 *)get_property(cpu, "timebase-enable", + NULL); + if (tbprop) + core99_tb_gpio = *tbprop; + of_node_put(cpu); + } + pmac_tb_freeze = smp_core99_gpio_tb_freeze; + printk(KERN_INFO "Processor timebase sync using" + " GPIO 0x%02x\n", core99_tb_gpio); + } + +#endif /* CONFIG_PPC64 */ + + /* No timebase sync, fallback to software */ + if (pmac_tb_freeze == NULL) { + smp_ops->give_timebase = smp_generic_give_timebase; + smp_ops->take_timebase = smp_generic_take_timebase; + printk(KERN_INFO "Processor timebase sync using software\n"); + } + +#ifndef CONFIG_PPC64 + { + int i; + + /* XXX should get this from reg properties */ + for (i = 1; i < ncpus; ++i) + smp_hw_index[i] = i; + } #endif + /* 32 bits SMP can't NAP */ + if (!machine_is_compatible("MacRISC4")) + powersave_nap = 0; +} + static int __init smp_core99_probe(void) { struct device_node *cpus; @@ -803,17 +812,25 @@ static void __devinit smp_core99_setup_c mpic_setup_this_cpu(); if (cpu_nr == 0) { -#ifdef CONFIG_POWER4 +#ifdef CONFIG_PPC64 extern void g5_phy_disable_cpu1(void); + /* Close i2c bus if it was used for tb sync */ + if (pmac_tb_clock_chip_host) { + pmac_low_i2c_close(pmac_tb_clock_chip_host); + pmac_tb_clock_chip_host = 
NULL; + } + /* If we didn't start the second CPU, we must take * it off the bus */ if (machine_is_compatible("MacRISC4") && num_online_cpus() < 2) g5_phy_disable_cpu1(); -#endif /* CONFIG_POWER4 */ - if (ppc_md.progress) ppc_md.progress("core99_setup_cpu 0 done", 0x349); +#endif /* CONFIG_PPC64 */ + + if (ppc_md.progress) + ppc_md.progress("core99_setup_cpu 0 done", 0x349); } } Index: linux-work/arch/powerpc/sysdev/u3_iommu.c =================================================================== --- linux-work.orig/arch/powerpc/sysdev/u3_iommu.c 2005-11-24 17:21:41.000000000 +1100 +++ linux-work/arch/powerpc/sysdev/u3_iommu.c 2005-12-13 18:13:12.000000000 +1100 @@ -282,7 +282,7 @@ void iommu_init_early_u3(void) /* Find the DART in the device-tree */ dn = of_find_compatible_node(NULL, "dart", "u3-dart"); if (dn == NULL) - return; + goto bail; /* Setup low level TCE operations for the core IOMMU code */ ppc_md.tce_build = dart_build; @@ -290,20 +290,23 @@ void iommu_init_early_u3(void) ppc_md.tce_flush = dart_flush; /* Initialize the DART HW */ - if (dart_init(dn)) { - /* If init failed, use direct iommu and null setup functions */ - ppc_md.iommu_dev_setup = iommu_dev_setup_null; - ppc_md.iommu_bus_setup = iommu_bus_setup_null; - - /* Setup pci_dma ops */ - pci_direct_iommu_init(); - } else { + if (dart_init(dn) == 0) { ppc_md.iommu_dev_setup = iommu_dev_setup_u3; ppc_md.iommu_bus_setup = iommu_bus_setup_u3; /* Setup pci_dma ops */ pci_iommu_init(); + + return; } + + bail: + /* If init failed, use direct iommu and null setup functions */ + ppc_md.iommu_dev_setup = iommu_dev_setup_null; + ppc_md.iommu_bus_setup = iommu_bus_setup_null; + + /* Setup pci_dma ops */ + pci_direct_iommu_init(); } Index: linux-work/drivers/ide/ppc/pmac.c =================================================================== --- linux-work.orig/drivers/ide/ppc/pmac.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/drivers/ide/ppc/pmac.c 2005-12-13 18:13:12.000000000 +1100 @@ -1686,7 +1686,7 @@ 
pmac_ide_probe(void) #else macio_register_driver(&pmac_ide_macio_driver); pci_register_driver(&pmac_ide_pci_driver); -#endif +#endif } #ifdef CONFIG_BLK_DEV_IDEDMA_PMAC Index: linux-work/drivers/macintosh/smu.c =================================================================== --- linux-work.orig/drivers/macintosh/smu.c 2005-11-24 17:18:43.000000000 +1100 +++ linux-work/drivers/macintosh/smu.c 2005-12-13 18:13:12.000000000 +1100 @@ -53,7 +53,7 @@ #undef DEBUG_SMU #ifdef DEBUG_SMU -#define DPRINTK(fmt, args...) do { udbg_printf(KERN_DEBUG fmt , ##args); } while (0) +#define DPRINTK(fmt, args...) do { printk(KERN_DEBUG fmt , ##args); } while (0) #else #define DPRINTK(fmt, args...) do { } while (0) #endif @@ -909,10 +909,13 @@ static struct smu_sdbp_header *smu_creat struct property *prop; /* First query the partition info */ + DPRINTK("SMU: Query partition infos ... (irq=%d)\n", smu->db_irq); smu_queue_simple(&cmd, SMU_CMD_PARTITION_COMMAND, 2, smu_done_complete, &comp, SMU_CMD_PARTITION_LATEST, id); wait_for_completion(&comp); + DPRINTK("SMU: done, status: %d, reply_len: %d\n", + cmd.cmd.status, cmd.cmd.reply_len); /* Partition doesn't exist (or other error) */ if (cmd.cmd.status != 0 || cmd.cmd.reply_len != 6) @@ -975,6 +978,8 @@ struct smu_sdbp_header *__smu_get_sdb_pa sprintf(pname, "sdb-partition-%02x", id); + DPRINTK("smu_get_sdb_partition(%02x)\n", id); + if (interruptible) { int rc; rc = down_interruptible(&smu_part_access); @@ -986,6 +991,7 @@ struct smu_sdbp_header *__smu_get_sdb_pa part = (struct smu_sdbp_header *)get_property(smu->of_node, pname, size); if (part == NULL) { + DPRINTK("trying to extract from SMU ...\n"); part = smu_create_sdb_partition(id); if (part != NULL && size) *size = part->len << 2; Index: linux-work/include/asm-powerpc/mpic.h =================================================================== --- linux-work.orig/include/asm-powerpc/mpic.h 2005-12-13 18:02:03.000000000 +1100 +++ linux-work/include/asm-powerpc/mpic.h 2005-12-13 
18:13:12.000000000 +1100 @@ -117,8 +117,9 @@ typedef int (*mpic_cascade_t)(struct pt_ struct mpic_irq_fixup { u8 __iomem *base; + u8 __iomem *applebase; u32 data; - unsigned int irq; + unsigned int index; }; #endif /* CONFIG_MPIC_BROKEN_U3 */ Index: linux-work/include/asm/hardirq.h =================================================================== --- linux-work.orig/include/asm/hardirq.h 2005-11-24 17:18:48.000000000 +1100 +++ linux-work/include/asm/hardirq.h 2005-12-13 18:13:12.000000000 +1100 @@ -11,13 +11,10 @@ */ typedef struct { unsigned int __softirq_pending; /* set_bit is used on this */ - unsigned int __last_jiffy_stamp; } ____cacheline_aligned irq_cpustat_t; #include /* Standard mappings for irq_cpustat_t above */ -#define last_jiffy_stamp(cpu) __IRQ_STAT((cpu), __last_jiffy_stamp) - static inline void ack_bad_irq(int irq) { printk(KERN_CRIT "illegal vector %d received!\n", irq); From olh at suse.de Tue Dec 13 19:21:31 2005 From: olh at suse.de (Olaf Hering) Date: Tue, 13 Dec 2005 09:21:31 +0100 Subject: [PATCH] powerpc: Don't use CONFIG_PPC64 in user-visible header files In-Reply-To: <1134432067.6989.137.camel@gaston> References: <20051212204532.GJ23641@krispykreme> <200512122328.56724.arnd@arndb.de> <1134427802.6989.128.camel@gaston> <200512130037.58332.arnd@arndb.de> <1134432067.6989.137.camel@gaston> Message-ID: <20051213082131.GA14838@suse.de> On Tue, Dec 13, Benjamin Herrenschmidt wrote: > On Tue, 2005-12-13 at 00:37 +0100, Arnd Bergmann wrote: > > Am Montag 12 Dezember 2005 23:50 schrieb Benjamin Herrenschmidt: > > > They should not to that. > > > > Of course they should not. But if they did such things on 2.6.14 it might have > > worked and if it fails in 2.6.15 that is a regression. > > No, it's not. They shouldn't do it, period. It's not a regression to > break a bogus/forbidden behaviour. asm-ppc/ had the good/bad feature that everything was inside #ifdef __KERNEL__. Can we have that back please? 
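For illustration, the kind of guard being asked for looks roughly like the sketch below. This is a minimal example under stated assumptions: EXAMPLE_ABI_VERSION, EXAMPLE_INTERNAL_FLAG and example_internal_visible() are made-up names for this note and are not taken from asm-powerpc or asm-ppc.

```c
/* Hypothetical header body, inlined here so the sketch stands alone. */

/* User-visible part: safe to expose to userspace builds. */
#define EXAMPLE_ABI_VERSION 1

#ifdef __KERNEL__
/* Kernel-only part: userspace builds do not define __KERNEL__,
 * so this definition is hidden from them. */
#define EXAMPLE_INTERNAL_FLAG 0x80
#endif /* __KERNEL__ */

/* Returns 1 when the kernel-only definition is visible in this build,
 * 0 otherwise. */
static int example_internal_visible(void)
{
#ifdef EXAMPLE_INTERNAL_FLAG
	return 1;
#else
	return 0;
#endif
}
```

Built as an ordinary userspace program (no -D__KERNEL__), only the first definition is seen, which is the property the old asm-ppc/ headers had.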
-- 
short story of a lazy sysadmin:
 alias appserv=wotan

From olof at lixom.net Wed Dec 14 03:58:07 2005
From: olof at lixom.net (Olof Johansson)
Date: Tue, 13 Dec 2005 08:58:07 -0800
Subject: [PATCH] powerpc: Update MPIC workarounds
In-Reply-To: <1134457469.6989.189.camel@gaston>
References: <1134457469.6989.189.camel@gaston>
Message-ID: <20051213165807.GA7468@pb15.lixom.net>

On Tue, Dec 13, 2005 at 06:04:29PM +1100, Benjamin Herrenschmidt wrote:
> From: Segher Boessenkool
>
> Cleanup the MPIC IO-APIC workarounds, make them a bit more generic,
> smaller and faster.

I really don't like all the hand-coded constants in this code. They're
all over the place, and there are no descriptions of what they are there
for. Lots of hardcoded offsets, etc.

Since this is a cleanup, wouldn't it be a good time to use symbolic
constants and/or comment them up a bit?

> Index: linux-work/arch/powerpc/sysdev/mpic.c
> ===================================================================
> --- linux-work.orig/arch/powerpc/sysdev/mpic.c	2005-12-06 16:17:43.000000000 +1100
> +++ linux-work/arch/powerpc/sysdev/mpic.c	2005-12-07 13:30:45.000000000 +1100
> @@ -175,57 +175,57 @@ static inline int mpic_is_ht_interrupt(s
>  	return mpic->fixups[source_no].base != NULL;
>  }
>
> +
>  static inline void mpic_apic_end_irq(struct mpic *mpic, unsigned int source_no)
>  {
>  	struct mpic_irq_fixup *fixup = &mpic->fixups[source_no];
> -	u32 tmp;
>
>  	spin_lock(&mpic->fixup_lock);
> -	writeb(0x11 + 2 * fixup->irq, fixup->base);
> -	tmp = readl(fixup->base + 2);
> -	writel(tmp | 0x80000000ul, fixup->base + 2);
> -	/* config writes shouldn't be posted but let's be safe ... */
> -	(void)readl(fixup->base + 2);
> +	writeb(0x11 + 2 * fixup->irq, fixup->base + 2);
> +	writel(fixup->data, fixup->base + 4);

This seems like a functional change: Previous code wrote at base, new at
base+2?

-Olof

From linas at austin.ibm.com Wed Dec 14 06:42:58 2005
From: linas at austin.ibm.com (linas)
Date: Tue, 13 Dec 2005 13:42:58 -0600
Subject: [PATCH 0/9] powerpc: Assorted PCI hotplug RPAPHP/DLPAR cleanup
Message-ID: <20051213194258.GD10037@austin.ibm.com>

Paul, John Rose,

The following nine patches perform an assortment of cleanup, mostly
in the drivers/pci/hotplug directory, for the powerpc-related
RPA-compliant pci hotplug and dlpar code. It's mostly a removal of
some duplicated code, with some restructuring to go along with it to
make it more readable ("less baroque" is the term I used in the patch
descriptions).

Paul, Many of the other patches rely on the first patch being applied
to the arch/powerpc/platforms/pseries tree. Please apply.

John Rose, All of the other patches are in your neck of the woods.
Please review and forward to GregKH.

I've been testing on only one machine so far, a late-model power5 box.

--linas

From linas at austin.ibm.com Wed Dec 14 06:46:36 2005
From: linas at austin.ibm.com (linas)
Date: Tue, 13 Dec 2005 13:46:36 -0600
Subject: [PATCH 1/9] powerpc: export PCI fixup routine
In-Reply-To: <20051213194258.GD10037@austin.ibm.com>
References: <20051213194258.GD10037@austin.ibm.com>
Message-ID: <20051213194636.GE10037@austin.ibm.com>

Hi Paul, Please apply.
--linas

There is code in the RPAPHP directory that is identical to this routine;
I'll be removing that code in an upcoming patch, but this patch is needed
to expose the function to make it callable.

Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/arch/powerpc/platforms/pseries/pci_dlpar.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/arch/powerpc/platforms/pseries/pci_dlpar.c 2005-12-12 10:54:50.530581660 -0600 +++ linux-2.6.15-rc5-mm2/arch/powerpc/platforms/pseries/pci_dlpar.c 2005-12-12 11:42:49.483694131 -0600 @@ -77,7 +77,7 @@ } /* Must be called before pci_bus_add_devices */ -static void +void pcibios_fixup_new_pci_devices(struct pci_bus *bus, int fix_bus) { struct pci_dev *dev; Index: linux-2.6.15-rc5-mm2/include/asm-powerpc/pci-bridge.h =================================================================== --- linux-2.6.15-rc5-mm2.orig/include/asm-powerpc/pci-bridge.h 2005-12-12 10:54:55.693855475 -0600 +++ linux-2.6.15-rc5-mm2/include/asm-powerpc/pci-bridge.h 2005-12-12 11:42:49.484693990 -0600 @@ -137,6 +137,7 @@ /** Discover new pci devices under this bus, and add them */ void pcibios_add_pci_devices(struct pci_bus * bus); +void pcibios_fixup_new_pci_devices(struct pci_bus *bus, int fix_bus); extern int pcibios_remove_root_bus(struct pci_controller *phb); From linas at austin.ibm.com Wed Dec 14 06:48:27 2005 From: linas at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 13:48:27 -0600 Subject: [PATCH 2/9] powerpc/PCI hotplug: remove rpaphp_find_bus() In-Reply-To: <20051213194258.GD10037@austin.ibm.com> References: <20051213194258.GD10037@austin.ibm.com> Message-ID: <20051213194827.GF10037@austin.ibm.com> John Rose, Please review and sign off, and forward to Greg KH! --linas The function rpaphp_find_pci_bus() has been migrated to pcibios_find_pci_bus() in arch/powerpc/platforms/pseries/pci_dlpar.c This patch removes the old version. 
Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-01 18:51:26.000000000 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp_pci.c 2005-12-02 14:17:19.834504074 -0600 @@ -32,36 +32,6 @@ #include "../pci.h" /* for pci_add_new_bus */ #include "rpaphp.h" -static struct pci_bus *find_bus_among_children(struct pci_bus *bus, - struct device_node *dn) -{ - struct pci_bus *child = NULL; - struct list_head *tmp; - struct device_node *busdn; - - busdn = pci_bus_to_OF_node(bus); - if (busdn == dn) - return bus; - - list_for_each(tmp, &bus->children) { - child = find_bus_among_children(pci_bus_b(tmp), dn); - if (child) - break; - } - return child; -} - -struct pci_bus *rpaphp_find_pci_bus(struct device_node *dn) -{ - struct pci_dn *pdn = dn->data; - - if (!pdn || !pdn->phb || !pdn->phb->bus) - return NULL; - - return find_bus_among_children(pdn->phb->bus, dn); -} -EXPORT_SYMBOL_GPL(rpaphp_find_pci_bus); - static int rpaphp_get_sensor_state(struct slot *slot, int *state) { int rc; @@ -120,7 +90,7 @@ /* config/unconfig adapter */ *value = slot->state; } else { - bus = rpaphp_find_pci_bus(slot->dn); + bus = pcibios_find_pci_bus(slot->dn); if (bus && !list_empty(&bus->devices)) *value = CONFIGURED; else @@ -369,7 +339,7 @@ struct pci_bus *bus; BUG_ON(!dn); - bus = rpaphp_find_pci_bus(dn); + bus = pcibios_find_pci_bus(dn); if (!bus) { err("%s: no pci_bus for dn %s\n", __FUNCTION__, dn->full_name); goto exit_rc; Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-12-01 18:51:26.000000000 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpadlpar_core.c 2005-12-02 14:18:23.226614153 -0600 @@ -174,7 +174,7 @@ { struct pci_dev *dev; - if 
(rpaphp_find_pci_bus(dn)) + if (pcibios_find_pci_bus(dn)) return -EINVAL; /* Add pci bus */ @@ -221,7 +221,7 @@ struct pci_dn *pdn; int rc = 0; - if (!rpaphp_find_pci_bus(dn)) + if (!pcibios_find_pci_bus(dn)) return -EINVAL; slot = find_slot(dn); @@ -366,7 +366,7 @@ struct pci_bus *bus; struct slot *slot; - bus = rpaphp_find_pci_bus(dn); + bus = pcibios_find_pci_bus(dn); if (!bus) return -EINVAL; Index: linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp.h =================================================================== --- linux-2.6.15-rc3-mm1.orig/drivers/pci/hotplug/rpaphp.h 2005-12-01 15:14:48.000000000 -0600 +++ linux-2.6.15-rc3-mm1/drivers/pci/hotplug/rpaphp.h 2005-12-02 14:19:24.050084110 -0600 @@ -88,13 +88,10 @@ /* function prototypes */ /* rpaphp_pci.c */ -extern struct pci_bus *rpaphp_find_pci_bus(struct device_node *dn); -extern int rpaphp_claim_resource(struct pci_dev *dev, int resource); extern int rpaphp_enable_pci_slot(struct slot *slot); extern int register_pci_slot(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); extern void rpaphp_init_new_devs(struct pci_bus *bus); -extern void rpaphp_eeh_init_nodes(struct device_node *dn); extern int rpaphp_config_pci_adapter(struct pci_bus *bus); extern int rpaphp_unconfig_pci_adapter(struct pci_bus *bus); From linas at austin.ibm.com Wed Dec 14 06:49:52 2005 From: linas at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 13:49:52 -0600 Subject: [PATCH 3/9] powerpc/PCI hotplug: remove rpaphp_fixup_new_pci_devices() In-Reply-To: <20051213194258.GD10037@austin.ibm.com> References: <20051213194258.GD10037@austin.ibm.com> Message-ID: <20051213194952.GG10037@austin.ibm.com> John Rose, Please review and sign off, and forward to Greg KH! --linas The function rpaphp_fixup_new_pci_devices() has been migrated to pcibios_fixup_new_pci_devices() in arch/powerpc/platforms/pseries/pci_dlpar.c This patch removes the old version. 
Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 11:43:23.000000000 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 15:30:46.720738713 -0600 @@ -101,140 +101,6 @@ return rc; } -/* Must be called before pci_bus_add_devices */ -void rpaphp_fixup_new_pci_devices(struct pci_bus *bus, int fix_bus) -{ - struct pci_dev *dev; - - list_for_each_entry(dev, &bus->devices, bus_list) { - /* - * Skip already-present devices (which are on the - * global device list.) - */ - if (list_empty(&dev->global_list)) { - int i; - - /* Need to setup IOMMU tables */ - ppc_md.iommu_dev_setup(dev); - - if(fix_bus) - pcibios_fixup_device_resources(dev, bus); - pci_read_irq_line(dev); - for (i = 0; i < PCI_NUM_RESOURCES; i++) { - struct resource *r = &dev->resource[i]; - - if (r->parent || !r->start || !r->flags) - continue; - pci_claim_resource(dev, i); - } - } - } -} - -static void rpaphp_eeh_add_bus_device(struct pci_bus *bus) -{ - struct pci_dev *dev; - - list_for_each_entry(dev, &bus->devices, bus_list) { - eeh_add_device_late(dev); - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { - struct pci_bus *subbus = dev->subordinate; - if (subbus) - rpaphp_eeh_add_bus_device (subbus); - } - } -} - -static int rpaphp_pci_config_bridge(struct pci_dev *dev) -{ - u8 sec_busno; - struct pci_bus *child_bus; - struct pci_dev *child_dev; - - dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev)); - - /* get busno of downstream bus */ - pci_read_config_byte(dev, PCI_SECONDARY_BUS, &sec_busno); - - /* add to children of PCI bridge dev->bus */ - child_bus = pci_add_new_bus(dev->bus, dev, sec_busno); - if (!child_bus) { - err("%s: could not add second bus\n", __FUNCTION__); - return -EIO; - } - sprintf(child_bus->name, "PCI Bus #%02x", child_bus->number); - /* do pci_scan_child_bus */ - 
pci_scan_child_bus(child_bus); - - list_for_each_entry(child_dev, &child_bus->devices, bus_list) { - eeh_add_device_late(child_dev); - } - - /* fixup new pci devices without touching bus struct */ - rpaphp_fixup_new_pci_devices(child_bus, 0); - - /* Make the discovered devices available */ - pci_bus_add_devices(child_bus); - return 0; -} - -void rpaphp_init_new_devs(struct pci_bus *bus) -{ - rpaphp_fixup_new_pci_devices(bus, 0); - rpaphp_eeh_add_bus_device(bus); -} -EXPORT_SYMBOL_GPL(rpaphp_init_new_devs); - -/***************************************************************************** - rpaphp_pci_config_slot() will configure all devices under the - given slot->dn and return the the first pci_dev. - *****************************************************************************/ -static struct pci_dev * -rpaphp_pci_config_slot(struct pci_bus *bus) -{ - struct device_node *dn = pci_bus_to_OF_node(bus); - struct pci_dev *dev = NULL; - int slotno; - int num; - - dbg("Enter %s: dn=%s bus=%s\n", __FUNCTION__, dn->full_name, bus->name); - if (!dn || !dn->child) - return NULL; - - if (_machine == PLATFORM_PSERIES_LPAR) { - of_scan_bus(dn, bus); - if (list_empty(&bus->devices)) { - err("%s: No new device found\n", __FUNCTION__); - return NULL; - } - - rpaphp_init_new_devs(bus); - pci_bus_add_devices(bus); - dev = list_entry(&bus->devices, struct pci_dev, bus_list); - } else { - slotno = PCI_SLOT(PCI_DN(dn->child)->devfn); - - /* pci_scan_slot should find all children */ - num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); - if (num) { - rpaphp_fixup_new_pci_devices(bus, 1); - pci_bus_add_devices(bus); - } - if (list_empty(&bus->devices)) { - err("%s: No new device found\n", __FUNCTION__); - return NULL; - } - list_for_each_entry(dev, &bus->devices, bus_list) { - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) - rpaphp_pci_config_bridge(dev); - - rpaphp_eeh_add_bus_device(bus); - } - } - - return dev; -} - static void print_slot_pci_funcs(struct pci_bus *bus) { struct device_node 
*dn; @@ -253,19 +119,13 @@ int rpaphp_config_pci_adapter(struct pci_bus *bus) { struct device_node *dn = pci_bus_to_OF_node(bus); - struct pci_dev *dev; int rc = -ENODEV; dbg("Entry %s: slot[%s]\n", __FUNCTION__, dn->full_name); if (!dn) goto exit; - eeh_add_device_tree_early(dn); - dev = rpaphp_pci_config_slot(bus); - if (!dev) { - err("%s: can't find any devices.\n", __FUNCTION__); - goto exit; - } + pcibios_add_pci_devices(bus); print_slot_pci_funcs(bus); rc = 0; exit: Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpadlpar_core.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-12-12 11:43:23.000000000 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpadlpar_core.c 2005-12-12 15:30:45.409923028 -0600 @@ -146,7 +146,7 @@ dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) of_scan_pci_bridge(dn, dev); - rpaphp_init_new_devs(dev->subordinate); + pcibios_fixup_new_pci_devices(dev->subordinate,0); /* Claim new bus resources */ pcibios_claim_one_bus(dev->bus); Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp.h 2005-12-12 15:30:44.528047030 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h 2005-12-12 15:31:06.243993515 -0600 @@ -91,7 +91,6 @@ extern int rpaphp_enable_pci_slot(struct slot *slot); extern int register_pci_slot(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); -extern void rpaphp_init_new_devs(struct pci_bus *bus); extern int rpaphp_config_pci_adapter(struct pci_bus *bus); extern int rpaphp_unconfig_pci_adapter(struct pci_bus *bus); From linas at austin.ibm.com Wed Dec 14 06:51:44 2005 From: linas at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 13:51:44 -0600 Subject: [PATCH 4/9] powerpc/PCI hotplug: merge config_pci_adapter In-Reply-To: <20051213194258.GD10037@austin.ibm.com> 
References: <20051213194258.GD10037@austin.ibm.com>
Message-ID: <20051213195144.GH10037@austin.ibm.com>

John Rose, Please review and sign off, and forward to Greg KH!
--linas

Remove general baroqueness. The function rpaphp_config_pci_adapter()
is really just one line of code, once all the dbg printks are removed.
And it's called in only one place. So replace the call by the one line.

Signed-off-by: Linas Vepstas

Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c
===================================================================
--- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_pci.c	2005-12-12 15:30:46.720738713 -0600
+++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c	2005-12-12 15:33:32.646407144 -0600
@@ -116,24 +116,6 @@
 	return;
 }
 
-int rpaphp_config_pci_adapter(struct pci_bus *bus)
-{
-	struct device_node *dn = pci_bus_to_OF_node(bus);
-	int rc = -ENODEV;
-
-	dbg("Entry %s: slot[%s]\n", __FUNCTION__, dn->full_name);
-	if (!dn)
-		goto exit;
-
-	pcibios_add_pci_devices(bus);
-	print_slot_pci_funcs(bus);
-	rc = 0;
-exit:
-	dbg("Exit %s: rc=%d\n", __FUNCTION__, rc);
-	return rc;
-}
-EXPORT_SYMBOL_GPL(rpaphp_config_pci_adapter);
-
 static void rpaphp_eeh_remove_bus_device(struct pci_dev *dev)
 {
 	eeh_remove_device(dev);
@@ -224,11 +206,8 @@
 	if (slot->hotplug_slot->info->adapter_status == NOT_CONFIGURED) {
 		dbg("%s CONFIGURING pci adapter in slot[%s]\n",
 			__FUNCTION__, slot->name);
-		if (rpaphp_config_pci_adapter(slot->bus)) {
-			err("%s: CONFIG pci adapter failed\n", __FUNCTION__);
-			goto exit_rc;
-		}
-
+		pcibios_add_pci_devices(slot->bus);
+
 	} else if (slot->hotplug_slot->info->adapter_status != CONFIGURED) {
 		err("%s: slot[%s]'s adapter_status is NOT_VALID.\n",
 			__FUNCTION__, slot->name);
@@ -273,16 +252,10 @@
 	/* if slot is not empty, enable the adapter */
 	if (state == PRESENT) {
 		dbg("%s : slot[%s] is occupied.\n", __FUNCTION__, slot->name);
-		retval = rpaphp_config_pci_adapter(slot->bus);
-		if (!retval) {
-			slot->state = CONFIGURED;
-			info("%s: 
devices in slot[%s] configured\n", + pcibios_add_pci_devices(slot->bus); + slot->state = CONFIGURED; + info("%s: devices in slot[%s] configured\n", __FUNCTION__, slot->name); - } else { - slot->state = NOT_CONFIGURED; - dbg("%s: no pci_dev struct for adapter in slot[%s]\n", - __FUNCTION__, slot->name); - } } else if (state == EMPTY) { dbg("%s : slot[%s] is empty\n", __FUNCTION__, slot->name); slot->state = EMPTY; Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp.h 2005-12-12 15:33:30.472712804 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h 2005-12-12 15:33:52.278646492 -0600 @@ -92,7 +92,6 @@ extern int register_pci_slot(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); -extern int rpaphp_config_pci_adapter(struct pci_bus *bus); extern int rpaphp_unconfig_pci_adapter(struct pci_bus *bus); /* rpaphp_core.c */ From linas at austin.ibm.com Wed Dec 14 06:53:55 2005 From: linas at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 13:53:55 -0600 Subject: [PATCH 5/9] powerpc/PCI hotplug: remove remove_bus_device() In-Reply-To: <20051213194258.GD10037@austin.ibm.com> References: <20051213194258.GD10037@austin.ibm.com> Message-ID: <20051213195355.GI10037@austin.ibm.com> John Rose, Please review and sign off, and forward to Greg KH! --linas The function rpaphp_eeh_remove_bus_device() is a dupe of eeh_remove_bus_device(). Remove it. 
Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 15:33:32.646407144 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 15:34:35.763531664 -0600 @@ -116,30 +116,12 @@ return; } -static void rpaphp_eeh_remove_bus_device(struct pci_dev *dev) -{ - eeh_remove_device(dev); - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { - struct pci_bus *bus = dev->subordinate; - struct list_head *ln; - if (!bus) - return; - for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) { - struct pci_dev *pdev = pci_dev_b(ln); - if (pdev) - rpaphp_eeh_remove_bus_device(pdev); - } - - } - return; -} - int rpaphp_unconfig_pci_adapter(struct pci_bus *bus) { struct pci_dev *dev, *tmp; list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) { - rpaphp_eeh_remove_bus_device(dev); + eeh_remove_bus_device(dev); pci_remove_bus_device(dev); } return 0; From linas at austin.ibm.com Wed Dec 14 06:56:06 2005 From: linas at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 13:56:06 -0600 Subject: [PATCH 6/9] powerpc/PCI hotplug: de-convolute rpaphp_unconfig_pci_adapter() In-Reply-To: <20051213194258.GD10037@austin.ibm.com> References: <20051213194258.GD10037@austin.ibm.com> Message-ID: <20051213195606.GJ10037@austin.ibm.com> John Rose, Please review and sign off, and forward to Greg KH! --linas Remove general baroqueness. The function rpaphp_unconfig_pci_adapter() is really just three lines of code, once all the dbg printks are removed. And it's called in only one place. So replace the call with those three lines.
Also, provide proper semaphore locking in the affected function disable_slot() Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpadlpar_core.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-12-12 15:33:31.903511608 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpadlpar_core.c 2005-12-12 16:06:52.787105399 -0600 @@ -380,7 +380,11 @@ return -EIO; } } else { - rpaphp_unconfig_pci_adapter(bus); + struct pci_dev *dev, *tmp; + list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) { + eeh_remove_bus_device(dev); + pci_remove_bus_device(dev); + } } if (unmap_bus_range(bus)) { Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 15:33:31.903511608 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 16:07:46.806506500 -0600 @@ -412,27 +412,31 @@ return retval; } -static int disable_slot(struct hotplug_slot *hotplug_slot) +static int __disable_slot(struct slot *slot) { - int retval = -EINVAL; - struct slot *slot = (struct slot *)hotplug_slot->private; + struct pci_dev *dev, *tmp; - dbg("%s - Entry: slot[%s]\n", __FUNCTION__, slot->name); + if (slot->state == NOT_CONFIGURED) + return -EINVAL; - if (slot->state == NOT_CONFIGURED) { - dbg("%s: %s is already disabled\n", __FUNCTION__, slot->name); - goto exit; + list_for_each_entry_safe(dev, tmp, &slot->bus->devices, bus_list) { + eeh_remove_bus_device(dev); + pci_remove_bus_device(dev); } + + slot->state = NOT_CONFIGURED; + return 0; +} + +static int disable_slot(struct hotplug_slot *hotplug_slot) +{ + struct slot *slot = (struct slot *)hotplug_slot->private; + int retval; - dbg("DISABLING SLOT %s\n", slot->name); down(&rpaphp_sem); - retval = rpaphp_unconfig_pci_adapter(slot->bus); + retval = __disable_slot (slot); 
up(&rpaphp_sem); - slot->state = NOT_CONFIGURED; - info("%s: devices in slot[%s] unconfigured.\n", __FUNCTION__, - slot->name); -exit: - dbg("%s - Exit: rc[%d]\n", __FUNCTION__, retval); + return retval; } Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 15:34:35.763531664 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 16:06:53.403018759 -0600 @@ -116,18 +116,6 @@ return; } -int rpaphp_unconfig_pci_adapter(struct pci_bus *bus) -{ - struct pci_dev *dev, *tmp; - - list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) { - eeh_remove_bus_device(dev); - pci_remove_bus_device(dev); - } - return 0; -} -EXPORT_SYMBOL_GPL(rpaphp_unconfig_pci_adapter); - static int setup_pci_hotplug_slot_info(struct slot *slot) { dbg("%s Initilize the PCI slot's hotplug->info structure ...\n", Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp.h 2005-12-12 15:33:52.278646492 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h 2005-12-12 16:06:53.403018759 -0600 @@ -92,8 +92,6 @@ extern int register_pci_slot(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); -extern int rpaphp_unconfig_pci_adapter(struct pci_bus *bus); - /* rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); extern int rpaphp_remove_slot(struct slot *slot); From linas at austin.ibm.com Wed Dec 14 07:00:12 2005 From: linas at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 14:00:12 -0600 Subject: [PATCH 7/9] powerpc/PCI hotplug: merge rpaphp_enable_pci_slot() In-Reply-To: <20051213194258.GD10037@austin.ibm.com> References: <20051213194258.GD10037@austin.ibm.com> Message-ID: <20051213200012.GK10037@austin.ibm.com> John Rose, Please review 
and sign off, and forward to Greg KH! --linas Remove general baroqueness. The function rpaphp_enable_pci_slot() has a fairly simple logic structure, once all of the debug printk's are removed. It's called from only one place, and that place also has a very simple structure once the printk's are removed. Merge the two together. Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 16:07:46.806506500 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 16:07:59.535715874 -0600 @@ -393,22 +393,40 @@ cleanup_slots(); } -static int enable_slot(struct hotplug_slot *hotplug_slot) +static int __enable_slot(struct slot *slot) { - int retval = 0; - struct slot *slot = (struct slot *)hotplug_slot->private; + int state; + int retval; + + if (slot->state == CONFIGURED) + return 0; - if (slot->state == CONFIGURED) { - dbg("%s: %s is already enabled\n", __FUNCTION__, slot->name); - goto exit; + retval = rpaphp_get_sensor_state(slot, &state); + if (retval) + return retval; + + if (state == PRESENT) { + pcibios_add_pci_devices(slot->bus); + slot->state = CONFIGURED; + } else if (state == EMPTY) { + slot->state = EMPTY; + } else { + err("%s: slot[%s] is in invalid state\n", __FUNCTION__, slot->name); + slot->state = NOT_VALID; + return -EINVAL; } + return 0; +} + +static int enable_slot(struct hotplug_slot *hotplug_slot) +{ + int retval; + struct slot *slot = (struct slot *)hotplug_slot->private; - dbg("ENABLING SLOT %s\n", slot->name); down(&rpaphp_sem); - retval = rpaphp_enable_pci_slot(slot); + retval = __enable_slot(slot); up(&rpaphp_sem); -exit: - dbg("%s - Exit: rc[%d]\n", __FUNCTION__, retval); + return retval; } Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c =================================================================== ---
linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 16:06:53.403018759 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 16:07:59.536715733 -0600 @@ -32,7 +32,7 @@ #include "../pci.h" /* for pci_add_new_bus */ #include "rpaphp.h" -static int rpaphp_get_sensor_state(struct slot *slot, int *state) +int rpaphp_get_sensor_state(struct slot *slot, int *state) { int rc; int setlevel; @@ -211,31 +211,3 @@ return rc; } -int rpaphp_enable_pci_slot(struct slot *slot) -{ - int retval = 0, state; - - retval = rpaphp_get_sensor_state(slot, &state); - if (retval) - goto exit; - dbg("%s: sensor state[%d]\n", __FUNCTION__, state); - /* if slot is not empty, enable the adapter */ - if (state == PRESENT) { - dbg("%s : slot[%s] is occupied.\n", __FUNCTION__, slot->name); - pcibios_add_pci_devices(slot->bus); - slot->state = CONFIGURED; - info("%s: devices in slot[%s] configured\n", - __FUNCTION__, slot->name); - } else if (state == EMPTY) { - dbg("%s : slot[%s] is empty\n", __FUNCTION__, slot->name); - slot->state = EMPTY; - } else { - err("%s: slot[%s] is in invalid state\n", __FUNCTION__, - slot->name); - slot->state = NOT_VALID; - retval = -EINVAL; - } -exit: - dbg("%s - Exit: rc[%d]\n", __FUNCTION__, retval); - return retval; -} Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp.h 2005-12-12 16:06:53.403018759 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h 2005-12-12 16:07:59.536715733 -0600 @@ -91,6 +91,7 @@ extern int rpaphp_enable_pci_slot(struct slot *slot); extern int register_pci_slot(struct slot *slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); +extern int rpaphp_get_sensor_state(struct slot *slot, int *state); /* rpaphp_core.c */ extern int rpaphp_add_slot(struct device_node *dn); From linas at austin.ibm.com Wed Dec 14 07:02:06 2005 From: linas 
at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 14:02:06 -0600 Subject: [PATCH 8/9] powerpc/PCI hotplug: cleanup: add prefix In-Reply-To: <20051213194258.GD10037@austin.ibm.com> References: <20051213194258.GD10037@austin.ibm.com> Message-ID: <20051213200206.GL10037@austin.ibm.com> John Rose, Please review and sign off, and forward to Greg KH! --linas Minor cleanup. Add the prefix rpaphp_* to several generic-sounding routines. Remove rpaphp_remove_slot(), which is a one-liner. Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 16:07:59.535715874 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 16:10:22.642584658 -0600 @@ -196,11 +196,6 @@ return 0; } -int rpaphp_remove_slot(struct slot *slot) -{ - return deregister_slot(slot); -} - static int get_children_props(struct device_node *dn, int **drc_indexes, int **drc_names, int **drc_types, int **drc_power_domains) { @@ -307,13 +302,15 @@ return 0; } -/**************************************************************** +/** + * rpaphp_add_slot -- add hotplug or dlpar slot + * * rpaphp not only registers PCI hotplug slots(HOTPLUG), * but also logical DR slots(EMBEDDED). * HOTPLUG slot: An adapter can be physically added/removed. * EMBEDDED slot: An adapter can be logically removed/added * from/to a partition with the slot. 
- ***************************************************************/ + */ int rpaphp_add_slot(struct device_node *dn) { struct slot *slot; @@ -344,7 +341,7 @@ dbg("Found drc-index:0x%x drc-name:%s drc-type:%s\n", indexes[i + 1], name, type); - retval = register_pci_slot(slot); + retval = rpaphp_register_pci_slot(slot); } } exit: @@ -462,6 +459,5 @@ module_exit(rpaphp_exit); EXPORT_SYMBOL_GPL(rpaphp_add_slot); -EXPORT_SYMBOL_GPL(rpaphp_remove_slot); EXPORT_SYMBOL_GPL(rpaphp_slot_head); EXPORT_SYMBOL_GPL(rpaphp_get_drc_props); Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 16:07:59.536715733 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_pci.c 2005-12-12 16:12:00.376835840 -0600 @@ -198,7 +198,7 @@ return -EINVAL; } -int register_pci_slot(struct slot *slot) +int rpaphp_register_pci_slot(struct slot *slot) { int rc = -EINVAL; @@ -206,7 +206,7 @@ goto exit_rc; if (setup_pci_slot(slot)) goto exit_rc; - rc = register_slot(slot); + rc = rpaphp_register_slot(slot); exit_rc: return rc; } Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpadlpar_core.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-12-12 16:06:52.787105399 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpadlpar_core.c 2005-12-12 16:08:04.272049609 -0600 @@ -227,7 +227,7 @@ slot = find_slot(dn); if (slot) { /* Remove hotplug slot */ - if (rpaphp_remove_slot(slot)) { + if (rpaphp_deregister_slot(slot)) { printk(KERN_ERR "%s: unable to remove hotplug slot %s\n", __FUNCTION__, drc_name); @@ -373,7 +373,7 @@ slot = find_slot(dn); if (slot) { /* Remove hotplug slot */ - if (rpaphp_remove_slot(slot)) { + if (rpaphp_deregister_slot(slot)) { printk(KERN_ERR "%s: unable to remove hotplug slot %s\n", __FUNCTION__, drc_name); Index: 
linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_slot.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_slot.c 2005-12-12 16:06:52.787105399 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_slot.c 2005-12-12 16:08:04.272049609 -0600 @@ -35,16 +35,16 @@ static ssize_t location_read_file (struct hotplug_slot *php_slot, char *buf) { - char *value; - int retval = -ENOENT; + char *value; + int retval = -ENOENT; struct slot *slot = (struct slot *)php_slot->private; if (!slot) return retval; - value = slot->location; - retval = sprintf (buf, "%s\n", value); - return retval; + value = slot->location; + retval = sprintf (buf, "%s\n", value); + return retval; } static struct hotplug_slot_attribute hotplug_slot_attr_location = { @@ -137,7 +137,7 @@ return 0; } -int deregister_slot(struct slot *slot) +int rpaphp_deregister_slot(struct slot *slot) { int retval = 0; struct hotplug_slot *php_slot = slot->hotplug_slot; @@ -160,7 +160,7 @@ return retval; } -int register_slot(struct slot *slot) +int rpaphp_register_slot(struct slot *slot) { int retval; @@ -169,7 +169,7 @@ slot->power_domain, slot->type); /* should not try to register the same slot twice */ if (is_registered(slot)) { /* should't be here */ - err("register_slot: slot[%s] is already registered\n", slot->name); + err("rpaphp_register_slot: slot[%s] is already registered\n", slot->name); rpaphp_release_slot(slot->hotplug_slot); return -EAGAIN; } Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp.h 2005-12-12 16:07:59.536715733 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp.h 2005-12-12 16:08:04.272049609 -0600 @@ -89,7 +89,7 @@ /* rpaphp_pci.c */ extern int rpaphp_enable_pci_slot(struct slot *slot); -extern int register_pci_slot(struct slot *slot); +extern int rpaphp_register_pci_slot(struct slot 
*slot); extern int rpaphp_get_pci_adapter_status(struct slot *slot, int is_init, u8 * value); extern int rpaphp_get_sensor_state(struct slot *slot, int *state); @@ -102,8 +102,8 @@ /* rpaphp_slot.c */ extern void dealloc_slot_struct(struct slot *slot); extern struct slot *alloc_slot_struct(struct device_node *dn, int drc_index, char *drc_name, int power_domain); -extern int register_slot(struct slot *slot); -extern int deregister_slot(struct slot *slot); +extern int rpaphp_register_slot(struct slot *slot); +extern int rpaphp_deregister_slot(struct slot *slot); extern int rpaphp_get_power_status(struct slot *slot, u8 * value); extern int rpaphp_set_attention_status(struct slot *slot, u8 status); From linas at austin.ibm.com Wed Dec 14 07:03:40 2005 From: linas at austin.ibm.com (linas) Date: Tue, 13 Dec 2005 14:03:40 -0600 Subject: [PATCH 9/9] powerpc/PCI hotplug: minor cleanup forward decls In-Reply-To: <20051213194258.GD10037@austin.ibm.com> References: <20051213194258.GD10037@austin.ibm.com> Message-ID: <20051213200340.GM10037@austin.ibm.com> John Rose, Please review and sign off, and forward to Greg KH! --linas Minor cleanup. Move structure initializer to bottom of file, this allows elimination of eyeball-strain-inducing forward declarations. 
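To illustrate the reordering described above, here is a minimal standalone sketch, not taken from the patch. The names demo_slot_ops, demo_enable_slot, demo_disable_slot and demo_ops are hypothetical and only loosely modelled on struct hotplug_slot_ops; the point is simply that a designated initializer placed after the callback definitions can refer to them without any forward declarations.

```c
#include <stddef.h>

/* Hypothetical ops table, loosely modelled on struct hotplug_slot_ops. */
struct demo_slot_ops {
	int (*enable_slot)(int *state);
	int (*disable_slot)(int *state);
};

/* The callbacks are defined first ... */
static int demo_enable_slot(int *state)
{
	if (state == NULL)
		return -1;
	*state = 1;	/* mark the slot as configured */
	return 0;
}

static int demo_disable_slot(int *state)
{
	if (state == NULL)
		return -1;
	*state = 0;	/* mark the slot as not configured */
	return 0;
}

/* ... so the initializer at the bottom needs no forward declarations. */
static const struct demo_slot_ops demo_ops = {
	.enable_slot  = demo_enable_slot,
	.disable_slot = demo_disable_slot,
};
```

Had demo_ops been placed at the top of the file instead, each callback would first need a prototype, which is the clutter the patch removes.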
Signed-off-by: Linas Vepstas Index: linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c =================================================================== --- linux-2.6.15-rc5-mm2.orig/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 16:10:22.642584658 -0600 +++ linux-2.6.15-rc5-mm2/drivers/pci/hotplug/rpaphp_core.c 2005-12-12 16:12:12.515128263 -0600 @@ -56,25 +56,6 @@ module_param(debug, bool, 0644); -static int enable_slot(struct hotplug_slot *slot); -static int disable_slot(struct hotplug_slot *slot); -static int set_attention_status(struct hotplug_slot *slot, u8 value); -static int get_power_status(struct hotplug_slot *slot, u8 * value); -static int get_attention_status(struct hotplug_slot *slot, u8 * value); -static int get_adapter_status(struct hotplug_slot *slot, u8 * value); -static int get_max_bus_speed(struct hotplug_slot *hotplug_slot, enum pci_bus_speed *value); - -struct hotplug_slot_ops rpaphp_hotplug_slot_ops = { - .owner = THIS_MODULE, - .enable_slot = enable_slot, - .disable_slot = disable_slot, - .set_attention_status = set_attention_status, - .get_power_status = get_power_status, - .get_attention_status = get_attention_status, - .get_adapter_status = get_adapter_status, - .get_max_bus_speed = get_max_bus_speed, -}; - static int rpaphp_get_attention_status(struct slot *slot) { return slot->hotplug_slot->info->attention_status; @@ -455,6 +436,17 @@ return retval; } +struct hotplug_slot_ops rpaphp_hotplug_slot_ops = { + .owner = THIS_MODULE, + .enable_slot = enable_slot, + .disable_slot = disable_slot, + .set_attention_status = set_attention_status, + .get_power_status = get_power_status, + .get_attention_status = get_attention_status, + .get_adapter_status = get_adapter_status, + .get_max_bus_speed = get_max_bus_speed, +}; + module_init(rpaphp_init); module_exit(rpaphp_exit); From gregkh at suse.de Wed Dec 14 06:48:48 2005 From: gregkh at suse.de (gregkh at suse.de) Date: Tue, 13 Dec 2005 11:48:48 -0800 Subject: patch 
pci-export-pci_cfg_space_size.patch added to gregkh-2.6 tree In-Reply-To: <1134457757.6989.195.camel@gaston> Message-ID: <1EmG8m-7MQ-00@press.kroah.org> This is a note to let you know that I've just added the patch titled Subject: [PATCH] PCI: Export pci_cfg_space_size to my gregkh-2.6 tree. Its filename is pci-export-pci_cfg_space_size.patch This tree can be found at http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/ Patches currently in gregkh-2.6 which might be from benh at kernel.crashing.org are i2c/i2c-drop-driver-owner-and-name-04-macintosh.patch pci/pci-export-pci_cfg_space_size.patch >From benh at kernel.crashing.org Mon Dec 12 23:16:21 2005 Subject: [PATCH] PCI: Export pci_cfg_space_size From: Benjamin Herrenschmidt To: Greg KH , Paul Mackerras Cc: linux-pci , linuxppc64-dev , linuxppc-dev list Date: Tue, 13 Dec 2005 18:09:16 +1100 Message-Id: <1134457757.6989.195.camel at gaston> The powerpc PCI code sets up the PCI tree without doing config space accesses in most cases, from the firmware tree. However, it still wants to call pci_cfg_space_size() under some conditions, thus it needs to be made non-static (though I don't see a point to export it to modules). Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Greg Kroah-Hartman --- drivers/pci/probe.c | 2 +- include/linux/pci.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) --- gregkh-2.6.orig/drivers/pci/probe.c +++ gregkh-2.6/drivers/pci/probe.c @@ -717,7 +717,7 @@ static void pci_release_dev(struct devic * reading the dword at 0x100 which must either be 0 or a valid extended * capability header. 
*/ -static int pci_cfg_space_size(struct pci_dev *dev) +int pci_cfg_space_size(struct pci_dev *dev) { int pos; u32 status; --- gregkh-2.6.orig/include/linux/pci.h +++ gregkh-2.6/include/linux/pci.h @@ -514,6 +514,7 @@ int pci_scan_bridge(struct pci_bus *bus, void pci_walk_bus(struct pci_bus *top, void (*cb)(struct pci_dev *, void *), void *userdata); +int pci_cfg_space_size(struct pci_dev *dev); /* kmem_cache style wrapper around pci_alloc_consistent() */ From benh at kernel.crashing.org Wed Dec 14 07:23:28 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 14 Dec 2005 07:23:28 +1100 Subject: [PATCH] powerpc: Update MPIC workarounds In-Reply-To: <20051213165807.GA7468@pb15.lixom.net> References: <1134457469.6989.189.camel@gaston> <20051213165807.GA7468@pb15.lixom.net> Message-ID: <1134505409.6458.8.camel@gaston> On Tue, 2005-12-13 at 08:58 -0800, Olof Johansson wrote: > On Tue, Dec 13, 2005 at 06:04:29PM +1100, Benjamin Herrenschmidt wrote: > > From: Segher Boessenkool > > > > Cleanup the MPIC IO-APIC workarounds, make them a bit more generic, > > smaller and faster. > > I really don't like all the hand-coded constants in this code. They're > all over the place, and there's no descriptions of what they are there > for. Lots of hardcoded offsets, etc. Since this is a cleanup, wouldn't > it be a good time to use symbolic constants and/or comment them up a > bit? Heh, welcome to segher world :) I'll add more comments & symbolic constants in the other patch that applies on top of this one.
> > Index: linux-work/arch/powerpc/sysdev/mpic.c > > =================================================================== > > --- linux-work.orig/arch/powerpc/sysdev/mpic.c 2005-12-06 16:17:43.000000000 +1100 > > +++ linux-work/arch/powerpc/sysdev/mpic.c 2005-12-07 13:30:45.000000000 +1100 > > @@ -175,57 +175,57 @@ static inline int mpic_is_ht_interrupt(s > > return mpic->fixups[source_no].base != NULL; > > } > > > > + > > static inline void mpic_apic_end_irq(struct mpic *mpic, unsigned int source_no) > > { > > struct mpic_irq_fixup *fixup = &mpic->fixups[source_no]; > > - u32 tmp; > > > > spin_lock(&mpic->fixup_lock); > > - writeb(0x11 + 2 * fixup->irq, fixup->base); > > - tmp = readl(fixup->base + 2); > > - writel(tmp | 0x80000000ul, fixup->base + 2); > > - /* config writes shouldn't be posted but let's be safe ... */ > > - (void)readl(fixup->base + 2); > > + writeb(0x11 + 2 * fixup->irq, fixup->base + 2); > > + writel(fixup->data, fixup->base + 4); > > This seems like a functional change: Previous code wrote at base, new at > base+2? Hrm... I have to double check but I think the previous code hard coded the bases for the 2 known APICs (incuding the +2) while the new code properly scans the PCI capabilities and thus gets the capability base instead, which is better (but needs +2 / +4). I suppose I could pre-offset by 2 at discovery time to avoid an addition but I'm not sure it's worth it. Ben. From mjw at us.ibm.com Wed Dec 14 09:19:41 2005 From: mjw at us.ibm.com (Mike Wolf) Date: Tue, 13 Dec 2005 16:19:41 -0600 Subject: [PATCH] powerpc: radeon and CONFIG_PM Message-ID: <439F48FD.1090600@us.ltcfwd.linux.ibm.com> The following comment is in drivers/video/aty/radeon_pm.c /* Check if we can power manage on suspend/resume. We can do * D2 on M6, M7 and M9, and we can resume from D3 cold a few other * "Mac" cards, but that's all. We need more infos about what the * BIOS does tho. Right now, all this PM stuff is pmac-only for that * reason. 
--BenH */ but it didn't check that CONFIG_PPC_PMAC was selected with CONFIG_PM. This results in build errors when CONFIG_PM is selected and pseries is built. Signed-off-by: Mike Wolf ====================================================================== --- a/drivers/video/aty/radeon_pm.c 2005-12-08 14:17:45.000000000 +0800 +++ b/drivers/video/aty/radeon_pm.c 2005-12-08 14:35:57.000000000 +0800 @@ -2734,7 +2734,7 @@ * BIOS does tho. Right now, all this PM stuff is pmac-only for that * reason. --BenH */ -#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) +#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && defined(CONFIG_PPC_PMAC) if (_machine == _MACH_Pmac && rinfo->of_node) { if (rinfo->is_mobility && rinfo->pm_reg && rinfo->family <= CHIP_FAMILY_RV250) @@ -2778,12 +2778,12 @@ OUTREG(TV_DAC_CNTL, INREG(TV_DAC_CNTL) | 0x07000000); #endif } -#endif /* defined(CONFIG_PM) && defined(CONFIG_PPC_OF) */ +#endif /* defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && defined(CONFIG_PPC_PMAC) */ } void radeonfb_pm_exit(struct radeonfb_info *rinfo) { -#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) +#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && defined(CONFIG_PPC_PMAC) if (rinfo->pm_mode != radeon_pm_none) pmac_set_early_video_resume(NULL, NULL); #endif From benh at kernel.crashing.org Wed Dec 14 13:10:10 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 14 Dec 2005 13:10:10 +1100 Subject: [PATCH] powerpc: Experimental support for new G5 Macs (#2) Message-ID: <1134526211.6458.42.camel@gaston> This adds some very basic support for the new machines, including the Quad G5 (tested), and other new dual core based machines and iMac G5 iSight (untested). This is still experimental! There is no thermal control yet, there is no proper handling of MSIs, etc., but it boots, and I have all 4 cores up on my machine.
Compared to the previous version of this patch, this one adds DART IOMMU support for the U4 chipset and thus should work fine on setups with more than 2Gb of RAM. Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/powerpc/platforms/powermac/feature.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/feature.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/feature.c 2005-12-13 18:13:11.000000000 +1100 @@ -101,7 +101,8 @@ static const char *macio_names[] = "Keylargo", "Pangea", "Intrepid", - "K2" + "K2", + "Shasta", }; @@ -119,7 +120,7 @@ static const char *macio_names[] = static struct device_node *uninorth_node; static u32 __iomem *uninorth_base; static u32 uninorth_rev; -static int uninorth_u3; +static int uninorth_maj; static void __iomem *u3_ht; /* @@ -1399,8 +1400,15 @@ static long g5_fw_enable(struct device_n static long g5_mpic_enable(struct device_node *node, long param, long value) { unsigned long flags; + struct device_node *parent = of_get_parent(node); + int is_u3; - if (node->parent == NULL || strcmp(node->parent->name, "u3")) + if (parent == NULL) + return 0; + is_u3 = strcmp(parent->name, "u3") == 0 || + strcmp(parent->name, "u4") == 0; + of_node_put(parent); + if (!is_u3) return 0; LOCK(flags); @@ -1464,7 +1472,7 @@ static long g5_i2s_enable(struct device_ }, }; - if (macio->type != macio_keylargo2 /* && macio->type != macio_shasta*/) + if (macio->type != macio_keylargo2 && macio->type != macio_shasta) return -ENODEV; if (strncmp(node->name, "i2s-", 4)) return -ENODEV; @@ -1473,11 +1481,9 @@ static long g5_i2s_enable(struct device_ case 0: case 1: break; -#if 0 case 2: if (macio->type == macio_shasta) break; -#endif default: return -ENODEV; } @@ -1508,7 +1514,7 @@ static long g5_reset_cpu(struct device_n struct device_node *np; macio = &macio_chips[0]; - if (macio->type != macio_keylargo2) + if (macio->type != macio_keylargo2 && 
macio->type != macio_shasta) return -ENODEV; np = find_path_device("/cpus"); @@ -1547,7 +1553,8 @@ static long g5_reset_cpu(struct device_n */ void g5_phy_disable_cpu1(void) { - UN_OUT(U3_API_PHY_CONFIG_1, 0); + if (uninorth_maj == 3) + UN_OUT(U3_API_PHY_CONFIG_1, 0); } #endif /* CONFIG_POWER4 */ @@ -2462,6 +2469,14 @@ static struct pmac_mb_def pmac_mb_defs[] PMAC_TYPE_POWERMAC_G5_U3L, g5_features, 0, }, + { "PowerMac11,2", "PowerMac G5 Dual Core", + PMAC_TYPE_POWERMAC_G5_U3L, g5_features, + 0, + }, + { "PowerMac12,1", "iMac G5 (iSight)", + PMAC_TYPE_POWERMAC_G5_U3L, g5_features, + 0, + }, { "RackMac3,1", "XServe G5", PMAC_TYPE_XSERVE_G5, g5_features, 0, @@ -2574,6 +2589,11 @@ static int __init probe_motherboard(void pmac_mb.model_name = "Unknown K2-based"; pmac_mb.features = g5_features; break; + case macio_shasta: + pmac_mb.model_id = PMAC_TYPE_UNKNOWN_SHASTA; + pmac_mb.model_name = "Unknown Shasta-based"; + pmac_mb.features = g5_features; + break; #endif /* CONFIG_POWER4 */ default: return -ENODEV; @@ -2651,7 +2671,12 @@ static void __init probe_uninorth(void) /* Locate G5 u3 */ if (uninorth_node == NULL) { uninorth_node = of_find_node_by_name(NULL, "u3"); - uninorth_u3 = 1; + uninorth_maj = 3; + } + /* Locate G5 u4 */ + if (uninorth_node == NULL) { + uninorth_node = of_find_node_by_name(NULL, "u4"); + uninorth_maj = 4; } if (uninorth_node == NULL) return; @@ -2664,12 +2689,13 @@ static void __init probe_uninorth(void) return; uninorth_base = ioremap(address, 0x40000); uninorth_rev = in_be32(UN_REG(UNI_N_VERSION)); - if (uninorth_u3) + if (uninorth_maj == 3 || uninorth_maj == 4) u3_ht = ioremap(address + U3_HT_CONFIG_BASE, 0x1000); - printk(KERN_INFO "Found %s memory controller & host bridge," - " revision: %d\n", uninorth_u3 ? "U3" : "UniNorth", - uninorth_rev); + printk(KERN_INFO "Found %s memory controller & host bridge" + " @ 0x%08x revision: 0x%02x\n", uninorth_maj == 3 ? "U3" : + uninorth_maj == 4 ? 
"U4" : "UniNorth", + (unsigned int)address, uninorth_rev); printk(KERN_INFO "Mapped at 0x%08lx\n", (unsigned long)uninorth_base); /* Set the arbitrer QAck delay according to what Apple does @@ -2677,7 +2703,8 @@ static void __init probe_uninorth(void) if (uninorth_rev < 0x11) { actrl = UN_IN(UNI_N_ARB_CTRL) & ~UNI_N_ARB_CTRL_QACK_DELAY_MASK; actrl |= ((uninorth_rev < 3) ? UNI_N_ARB_CTRL_QACK_DELAY105 : - UNI_N_ARB_CTRL_QACK_DELAY) << UNI_N_ARB_CTRL_QACK_DELAY_SHIFT; + UNI_N_ARB_CTRL_QACK_DELAY) << + UNI_N_ARB_CTRL_QACK_DELAY_SHIFT; UN_OUT(UNI_N_ARB_CTRL, actrl); } @@ -2685,7 +2712,8 @@ static void __init probe_uninorth(void) * revs 1.5 to 2.O and Pangea. Seem to toggle the UniN Maxbus/PCI * memory timeout */ - if ((uninorth_rev >= 0x11 && uninorth_rev <= 0x24) || uninorth_rev == 0xc0) + if ((uninorth_rev >= 0x11 && uninorth_rev <= 0x24) || + uninorth_rev == 0xc0) UN_OUT(0x2160, UN_IN(0x2160) & 0x00ffffff); } @@ -2736,12 +2764,14 @@ static void __init probe_one_macio(const node->full_name); return; } - if (type == macio_keylargo) { + if (type == macio_keylargo || type == macio_keylargo2) { u32 *did = (u32 *)get_property(node, "device-id", NULL); if (*did == 0x00000025) type = macio_pangea; if (*did == 0x0000003e) type = macio_intrepid; + if (*did == 0x0000004f) + type = macio_shasta; } macio_chips[i].of_node = node; macio_chips[i].type = type; @@ -2840,7 +2870,8 @@ set_initial_features(void) } #ifdef CONFIG_POWER4 - if (macio_chips[0].type == macio_keylargo2) { + if (macio_chips[0].type == macio_keylargo2 || + macio_chips[0].type == macio_shasta) { #ifndef CONFIG_SMP /* On SMP machines running UP, we have the second CPU eating * bus cycles. We need to take it off the bus. 
This is done Index: linux-work/arch/powerpc/platforms/powermac/pic.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/pic.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/pic.c 2005-12-13 18:13:11.000000000 +1100 @@ -524,18 +524,56 @@ static void __init pmac_pic_setup_mpic_n #endif /* defined(CONFIG_XMON) && defined(CONFIG_PPC32) */ } +static struct mpic * __init pmac_setup_one_mpic(struct device_node *np, + int master) +{ + unsigned char senses[128]; + int offset = master ? 0 : 128; + int count = master ? 128 : 124; + const char *name = master ? " MPIC 1 " : " MPIC 2 "; + struct resource r; + struct mpic *mpic; + unsigned int flags = master ? MPIC_PRIMARY : 0; + int rc; + + rc = of_address_to_resource(np, 0, &r); + if (rc) + return NULL; + + pmac_call_feature(PMAC_FTR_ENABLE_MPIC, np, 0, 0); + + prom_get_irq_senses(senses, offset, offset + count); + + flags |= MPIC_WANTS_RESET; + if (get_property(np, "big-endian", NULL)) + flags |= MPIC_BIG_ENDIAN; + + /* Primary Big Endian means HT interrupts. This is quite dodgy + * but works until I find a better way + */ + if (master && (flags & MPIC_BIG_ENDIAN)) + flags |= MPIC_BROKEN_U3; + + mpic = mpic_alloc(r.start, flags, 0, offset, count, master ? 
252 : 0, + senses, count, name); + if (mpic == NULL) + return NULL; + + mpic_init(mpic); + + return mpic; + } + static int __init pmac_pic_probe_mpic(void) { struct mpic *mpic1, *mpic2; struct device_node *np, *master = NULL, *slave = NULL; - unsigned char senses[128]; - struct resource r; /* We can have up to 2 MPICs cascaded */ for (np = NULL; (np = of_find_node_by_type(np, "open-pic")) != NULL;) { if (master == NULL && - get_property(np, "interrupt-parent", NULL) != NULL) + get_property(np, "interrupts", NULL) == NULL) master = of_node_get(np); else if (slave == NULL) slave = of_node_get(np); @@ -557,13 +595,8 @@ static int __init pmac_pic_probe_mpic(vo ppc_md.get_irq = mpic_get_irq; /* Setup master */ - BUG_ON(of_address_to_resource(master, 0, &r)); - pmac_call_feature(PMAC_FTR_ENABLE_MPIC, master, 0, 0); - prom_get_irq_senses(senses, 0, 128); - mpic1 = mpic_alloc(r.start, MPIC_PRIMARY | MPIC_WANTS_RESET, - 0, 0, 128, 252, senses, 128, " OpenPIC "); + mpic1 = pmac_setup_one_mpic(master, 1); BUG_ON(mpic1 == NULL); - mpic_init(mpic1); /* Install NMI if any */ pmac_pic_setup_mpic_nmi(mpic1); @@ -574,27 +607,12 @@ static int __init pmac_pic_probe_mpic(vo if (slave == NULL || slave->n_intrs < 1) return 0; - /* Setup slave, failures are non-fatal */ - if (of_address_to_resource(slave, 0, &r)) { - printk(KERN_ERR "Can't get address of MPIC %s\n", - slave->full_name); - return 0; - } - pmac_call_feature(PMAC_FTR_ENABLE_MPIC, slave, 0, 0); - prom_get_irq_senses(senses, 128, 128 + 124); - - /* We don't need to set MPIC_BROKEN_U3 here since we don't have - * hypertransport interrupts routed to it, at least not on currently - * supported machines, that may change. 
- */ - mpic2 = mpic_alloc(r.start, MPIC_BIG_ENDIAN | MPIC_WANTS_RESET, - 0, 128, 124, 0, senses, 124, " U3-MPIC "); + mpic2 = pmac_setup_one_mpic(slave, 0); if (mpic2 == NULL) { - printk(KERN_ERR "Can't create slave MPIC %s\n", - slave->full_name); + printk(KERN_ERR "Failed to setup slave MPIC\n"); + of_node_put(slave); return 0; } - mpic_init(mpic2); mpic_setup_cascade(slave->intrs[0].line, pmac_u3_cascade, mpic2); of_node_put(slave); Index: linux-work/include/asm-powerpc/pmac_feature.h =================================================================== --- linux-work.orig/include/asm-powerpc/pmac_feature.h 2005-11-24 17:18:48.000000000 +1100 +++ linux-work/include/asm-powerpc/pmac_feature.h 2005-12-13 18:13:11.000000000 +1100 @@ -121,6 +121,7 @@ #define PMAC_TYPE_IMAC_G5 0x152 /* iMac G5 */ #define PMAC_TYPE_XSERVE_G5 0x153 /* Xserve G5 */ #define PMAC_TYPE_UNKNOWN_K2 0x19f /* Any other K2 based */ +#define PMAC_TYPE_UNKNOWN_SHASTA 0x19e /* Any other Shasta based */ /* * Motherboard flags @@ -341,6 +342,7 @@ enum { macio_pangea, macio_intrepid, macio_keylargo2, + macio_shasta, }; struct macio_chip Index: linux-work/arch/powerpc/Kconfig =================================================================== --- linux-work.orig/arch/powerpc/Kconfig 2005-12-13 15:03:21.000000000 +1100 +++ linux-work/arch/powerpc/Kconfig 2005-12-13 18:13:11.000000000 +1100 @@ -300,6 +300,7 @@ config PPC_PMAC64 bool depends on PPC_PMAC && POWER4 select U3_DART + select MPIC_BROKEN_U3 select GENERIC_TBSYNC default y Index: linux-work/arch/powerpc/kernel/prom.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/prom.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/kernel/prom.c 2005-12-13 18:13:11.000000000 +1100 @@ -298,6 +298,16 @@ static int __devinit finish_node_interru int i, j, n, sense; unsigned int *irq, virq; struct device_node *ic; + int trace = 0; + + //#define TRACE(fmt...) 
do { if (trace) { printk(fmt); mdelay(1000); } } while(0) +#define TRACE(fmt...) + + if (!strcmp(np->name, "smu-doorbell")) + trace = 1; + + TRACE("Finishing SMU doorbell ! num_interrupt_controllers = %d\n", + num_interrupt_controllers); if (num_interrupt_controllers == 0) { /* @@ -332,11 +342,12 @@ static int __devinit finish_node_interru } ints = (unsigned int *) get_property(np, "interrupts", &intlen); + TRACE("ints=%p, intlen=%d\n", ints, intlen); if (ints == NULL) return 0; intrcells = prom_n_intr_cells(np); intlen /= intrcells * sizeof(unsigned int); - + TRACE("intrcells=%d, new intlen=%d\n", intrcells, intlen); np->intrs = prom_alloc(intlen * sizeof(*(np->intrs)), mem_start); if (!np->intrs) return -ENOMEM; @@ -347,6 +358,7 @@ static int __devinit finish_node_interru intrcount = 0; for (i = 0; i < intlen; ++i, ints += intrcells) { n = map_interrupt(&irq, &ic, np, ints, intrcells); + TRACE("map, irq=%d, ic=%p, n=%d\n", irq, ic, n); if (n <= 0) continue; @@ -357,6 +369,7 @@ static int __devinit finish_node_interru np->intrs[intrcount].sense = map_isa_senses[sense]; } else { virq = virt_irq_create_mapping(irq[0]); + TRACE("virq=%d\n", virq); #ifdef CONFIG_PPC64 if (virq == NO_IRQ) { printk(KERN_CRIT "Could not allocate interrupt" @@ -366,6 +379,12 @@ static int __devinit finish_node_interru #endif np->intrs[intrcount].line = irq_offset_up(virq); sense = (n > 1)? 
(irq[1] & 3): 1; + + /* Apple uses bits in there in a different way, let's + * only keep the real sense bit on macs + */ + if (_machine == PLATFORM_POWERMAC) + sense &= 0x1; np->intrs[intrcount].sense = map_mpic_senses[sense]; } @@ -375,12 +394,13 @@ static int __devinit finish_node_interru char *name = get_property(ic->parent, "name", NULL); if (name && !strcmp(name, "u3")) np->intrs[intrcount].line += 128; - else if (!(name && !strcmp(name, "mac-io"))) + else if (!(name && (!strcmp(name, "mac-io") || + !strcmp(name, "u4")))) /* ignore other cascaded controllers, such as the k2-sata-root */ break; } -#endif +#endif /* CONFIG_PPC64 */ if (n > 2) { printk("hmmm, got %d intr cells for %s:", n, np->full_name); Index: linux-work/arch/powerpc/platforms/powermac/pci.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/pci.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/pci.c 2005-12-13 18:13:12.000000000 +1100 @@ -1,7 +1,7 @@ /* * Support for PCI bridges found on Power Macintoshes. 
* - * Copyright (C) 2003 Benjamin Herrenschmuidt (benh at kernel.crashing.org) + * Copyright (C) 2003-2005 Benjamin Herrenschmuidt (benh at kernel.crashing.org) * Copyright (C) 1997 Paul Mackerras (paulus at samba.org) * * This program is free software; you can redistribute it and/or @@ -25,7 +25,7 @@ #include #include #ifdef CONFIG_PPC64 -#include +//#include #include #endif @@ -44,6 +44,7 @@ static int add_bridge(struct device_node static int has_uninorth; #ifdef CONFIG_PPC64 static struct pci_controller *u3_agp; +static struct pci_controller *u4_pcie; static struct pci_controller *u3_ht; #endif /* CONFIG_PPC64 */ @@ -97,11 +98,8 @@ static void __init fixup_bus_range(struc /* Lookup the "bus-range" property for the hose */ bus_range = (int *) get_property(bridge, "bus-range", &len); - if (bus_range == NULL || len < 2 * sizeof(int)) { - printk(KERN_WARNING "Can't get bus-range for %s\n", - bridge->full_name); + if (bus_range == NULL || len < 2 * sizeof(int)) return; - } bus_range[1] = fixup_one_level_bus_range(bridge->child, bus_range[1]); } @@ -128,14 +126,14 @@ static void __init fixup_bus_range(struc */ #define MACRISC_CFA0(devfn, off) \ - ((1 << (unsigned long)PCI_SLOT(dev_fn)) \ - | (((unsigned long)PCI_FUNC(dev_fn)) << 8) \ - | (((unsigned long)(off)) & 0xFCUL)) + ((1 << (unsigned int)PCI_SLOT(dev_fn)) \ + | (((unsigned int)PCI_FUNC(dev_fn)) << 8) \ + | (((unsigned int)(off)) & 0xFCUL)) #define MACRISC_CFA1(bus, devfn, off) \ - ((((unsigned long)(bus)) << 16) \ - |(((unsigned long)(devfn)) << 8) \ - |(((unsigned long)(off)) & 0xFCUL) \ + ((((unsigned int)(bus)) << 16) \ + |(((unsigned int)(devfn)) << 8) \ + |(((unsigned int)(off)) & 0xFCUL) \ |1UL) static unsigned long macrisc_cfg_access(struct pci_controller* hose, @@ -168,7 +166,8 @@ static int macrisc_read_config(struct pc hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = macrisc_cfg_access(hose, 
bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -199,7 +198,8 @@ static int macrisc_write_config(struct p hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = macrisc_cfg_access(hose, bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -234,12 +234,13 @@ static struct pci_ops macrisc_pci_ops = /* * Verify that a specific (bus, dev_fn) exists on chaos */ -static int -chaos_validate_dev(struct pci_bus *bus, int devfn, int offset) +static int chaos_validate_dev(struct pci_bus *bus, int devfn, int offset) { struct device_node *np; u32 *vendor, *device; + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; np = pci_busdev_to_OF_node(bus, devfn); if (np == NULL) return PCIBIOS_DEVICE_NOT_FOUND; @@ -341,10 +342,10 @@ static int u3_ht_skip_device(struct pci_ } #define U3_HT_CFA0(devfn, off) \ - ((((unsigned long)devfn) << 8) | offset) + ((((unsigned int)devfn) << 8) | offset) #define U3_HT_CFA1(bus, devfn, off) \ (U3_HT_CFA0(devfn, off) \ - + (((unsigned long)bus) << 16) \ + + (((unsigned int)bus) << 16) \ + 0x01000000UL) static unsigned long u3_ht_cfg_access(struct pci_controller* hose, @@ -370,7 +371,8 @@ static int u3_ht_read_config(struct pci_ hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = u3_ht_cfg_access(hose, bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -419,7 +421,8 @@ static int u3_ht_write_config(struct pci hose = pci_bus_to_host(bus); if (hose == NULL) return PCIBIOS_DEVICE_NOT_FOUND; - + if (offset >= 0x100) + return PCIBIOS_BAD_REGISTER_NUMBER; addr = u3_ht_cfg_access(hose, bus->number, devfn, offset); if (!addr) return PCIBIOS_DEVICE_NOT_FOUND; @@ -459,6 +462,112 @@ static struct pci_ops u3_ht_pci_ops = u3_ht_read_config, u3_ht_write_config }; + +#define U4_PCIE_CFA0(devfn, 
off) \ + ((1 << ((unsigned int)PCI_SLOT(dev_fn))) \ + | (((unsigned int)PCI_FUNC(dev_fn)) << 8) \ + | ((((unsigned int)(off)) >> 8) << 28) \ + | (((unsigned int)(off)) & 0xfcU)) + +#define U4_PCIE_CFA1(bus, devfn, off) \ + ((((unsigned int)(bus)) << 16) \ + |(((unsigned int)(devfn)) << 8) \ + | ((((unsigned int)(off)) >> 8) << 28) \ + |(((unsigned int)(off)) & 0xfcU) \ + |1UL) + +static unsigned long u4_pcie_cfg_access(struct pci_controller* hose, + u8 bus, u8 dev_fn, int offset) +{ + unsigned int caddr; + + if (bus == hose->first_busno) { + caddr = U4_PCIE_CFA0(dev_fn, offset); + } else + caddr = U4_PCIE_CFA1(bus, dev_fn, offset); + + /* Uninorth will return garbage if we don't read back the value ! */ + do { + out_le32(hose->cfg_addr, caddr); + } while (in_le32(hose->cfg_addr) != caddr); + + offset &= 0x03; + return ((unsigned long)hose->cfg_data) + offset; +} + +static int u4_pcie_read_config(struct pci_bus *bus, unsigned int devfn, + int offset, int len, u32 *val) +{ + struct pci_controller *hose; + unsigned long addr; + + hose = pci_bus_to_host(bus); + if (hose == NULL) + return PCIBIOS_DEVICE_NOT_FOUND; + if (offset >= 0x1000) + return PCIBIOS_BAD_REGISTER_NUMBER; + addr = u4_pcie_cfg_access(hose, bus->number, devfn, offset); + if (!addr) + return PCIBIOS_DEVICE_NOT_FOUND; + /* + * Note: the caller has already checked that offset is + * suitably aligned and that len is 1, 2 or 4. 
+ */ + switch (len) { + case 1: + *val = in_8((u8 *)addr); + break; + case 2: + *val = in_le16((u16 *)addr); + break; + default: + *val = in_le32((u32 *)addr); + break; + } + return PCIBIOS_SUCCESSFUL; +} + +static int u4_pcie_write_config(struct pci_bus *bus, unsigned int devfn, + int offset, int len, u32 val) +{ + struct pci_controller *hose; + unsigned long addr; + + hose = pci_bus_to_host(bus); + if (hose == NULL) + return PCIBIOS_DEVICE_NOT_FOUND; + if (offset >= 0x1000) + return PCIBIOS_BAD_REGISTER_NUMBER; + addr = u4_pcie_cfg_access(hose, bus->number, devfn, offset); + if (!addr) + return PCIBIOS_DEVICE_NOT_FOUND; + /* + * Note: the caller has already checked that offset is + * suitably aligned and that len is 1, 2 or 4. + */ + switch (len) { + case 1: + out_8((u8 *)addr, val); + (void) in_8((u8 *)addr); + break; + case 2: + out_le16((u16 *)addr, val); + (void) in_le16((u16 *)addr); + break; + default: + out_le32((u32 *)addr, val); + (void) in_le32((u32 *)addr); + break; + } + return PCIBIOS_SUCCESSFUL; +} + +static struct pci_ops u4_pcie_pci_ops = +{ + u4_pcie_read_config, + u4_pcie_write_config +}; + #endif /* CONFIG_PPC64 */ #ifdef CONFIG_PPC32 @@ -628,15 +737,36 @@ static void __init setup_u3_agp(struct p hose->ops = &macrisc_pci_ops; hose->cfg_addr = ioremap(0xf0000000 + 0x800000, 0x1000); hose->cfg_data = ioremap(0xf0000000 + 0xc00000, 0x1000); - u3_agp = hose; } +static void __init setup_u4_pcie(struct pci_controller* hose) +{ + /* We currently only implement the "non-atomic" config space, to + * be optimised later. + */ + hose->ops = &u4_pcie_pci_ops; + hose->cfg_addr = ioremap(0xf0000000 + 0x800000, 0x1000); + hose->cfg_data = ioremap(0xf0000000 + 0xc00000, 0x1000); + + /* The bus contains a bridge from root -> device, we need to + * make it visible on bus 0 so that we pick the right type + * of config cycles. If we didn't, we would have to force all + * config cycles to be type 1. 
So we override the "bus-range" + * property here + */ + hose->first_busno = 0x00; + hose->last_busno = 0xff; + u4_pcie = hose; +} + static void __init setup_u3_ht(struct pci_controller* hose) { struct device_node *np = (struct device_node *)hose->arch_data; + struct pci_controller *other = NULL; int i, cur; + hose->ops = &u3_ht_pci_ops; /* We hard code the address because of the different size of @@ -670,11 +800,20 @@ static void __init setup_u3_ht(struct pc u3_ht = hose; - if (u3_agp == NULL) { - DBG("U3 has no AGP, using full resource range\n"); + if (u3_agp != NULL) + other = u3_agp; + else if (u4_pcie != NULL) + other = u4_pcie; + + if (other == NULL) { + DBG("U3/4 has no AGP/PCIE, using full resource range\n"); return; } + /* Fixup bus range vs. PCIE */ + if (u4_pcie) + hose->last_busno = u4_pcie->first_busno - 1; + /* We "remove" the AGP resources from the resources allocated to HT, * that is we create "holes". However, that code does assumptions * that so far happen to be true (cross fingers...), typically that @@ -682,7 +821,7 @@ static void __init setup_u3_ht(struct pc */ cur = 0; for (i=0; i<3; i++) { - struct resource *res = &u3_agp->mem_resources[i]; + struct resource *res = &other->mem_resources[i]; if (res->flags != IORESOURCE_MEM) continue; /* We don't care about "fine" resources */ @@ -777,9 +916,13 @@ static int __init add_bridge(struct devi setup_u3_ht(hose); disp_name = "U3-HT"; primary = 1; + } else if (device_is_compatible(dev, "u4-pcie")) { + setup_u4_pcie(hose); + disp_name = "U4-PCIE"; + primary = 0; } - printk(KERN_INFO "Found %s PCI host bridge. Firmware bus number: %d->%d\n", - disp_name, hose->first_busno, hose->last_busno); + printk(KERN_INFO "Found %s PCI host bridge. 
Firmware bus number:" + " %d->%d\n", disp_name, hose->first_busno, hose->last_busno); #endif /* CONFIG_PPC64 */ /* 32 bits only bridges */ @@ -900,6 +1043,8 @@ void __init pmac_pci_init(void) pci_setup_phb_io(u3_ht, 1); if (u3_agp) pci_setup_phb_io(u3_agp, 0); + if (u4_pcie) + pci_setup_phb_io(u4_pcie, 0); /* * On ppc64, fixup the IO resources on our host bridges as @@ -912,7 +1057,8 @@ void __init pmac_pci_init(void) /* Fixup the PCI<->OF mapping for U3 AGP due to bus renumbering. We * assume there is no P2P bridge on the AGP bus, which should be a - * safe assumptions hopefully. + * safe assumptions for now. We should do something better in the + * future though */ if (u3_agp) { struct device_node *np = u3_agp->arch_data; @@ -920,7 +1066,6 @@ void __init pmac_pci_init(void) for (np = np->child; np; np = np->sibling) PCI_DN(np)->busno = 0xf0; } - /* pmac_check_ht_link(); */ /* Tell pci.c to not use the common resource allocation mechanism */ @@ -1127,7 +1272,8 @@ void pmac_pci_fixup_pciata(struct pci_de good: pci_read_config_byte(dev, PCI_CLASS_PROG, &progif); if ((progif & 5) != 5) { - printk(KERN_INFO "Forcing PCI IDE into native mode: %s\n", pci_name(dev)); + printk(KERN_INFO "Forcing PCI IDE into native mode: %s\n", + pci_name(dev)); (void) pci_write_config_byte(dev, PCI_CLASS_PROG, progif|5); if (pci_read_config_byte(dev, PCI_CLASS_PROG, &progif) || (progif & 5) != 5) @@ -1153,7 +1299,8 @@ static void fixup_k2_sata(struct pci_dev for (i = 0; i < 6; i++) { dev->resource[i].start = dev->resource[i].end = 0; dev->resource[i].flags = 0; - pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, 0); + pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, + 0); } } else { pci_read_config_word(dev, PCI_COMMAND, &cmd); @@ -1162,7 +1309,8 @@ static void fixup_k2_sata(struct pci_dev for (i = 0; i < 5; i++) { dev->resource[i].start = dev->resource[i].end = 0; dev->resource[i].flags = 0; - pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, 0); + 
pci_write_config_dword(dev, PCI_BASE_ADDRESS_0 + 4 * i, + 0); } } } Index: linux-work/arch/powerpc/sysdev/mpic.c =================================================================== --- linux-work.orig/arch/powerpc/sysdev/mpic.c 2005-12-13 18:02:03.000000000 +1100 +++ linux-work/arch/powerpc/sysdev/mpic.c 2005-12-13 18:13:12.000000000 +1100 @@ -13,6 +13,9 @@ */ #undef DEBUG +#undef DEBUG_IPI +#undef DEBUG_IRQ +#undef DEBUG_LOW #include #include @@ -168,35 +171,86 @@ static void __init mpic_test_broken_ipi( /* Test if an interrupt is sourced from HyperTransport (used on broken U3s) * to force the edge setting on the MPIC and do the ack workaround. */ -static inline int mpic_is_ht_interrupt(struct mpic *mpic, unsigned int source_no) +static inline int mpic_is_ht_interrupt(struct mpic *mpic, unsigned int source) { - if (source_no >= 128 || !mpic->fixups) + if (source >= 128 || !mpic->fixups) return 0; - return mpic->fixups[source_no].base != NULL; + return mpic->fixups[source].base != NULL; } -static inline void mpic_apic_end_irq(struct mpic *mpic, unsigned int source_no) +static inline void mpic_ht_end_irq(struct mpic *mpic, unsigned int source) { - struct mpic_irq_fixup *fixup = &mpic->fixups[source_no]; + struct mpic_irq_fixup *fixup = &mpic->fixups[source]; - spin_lock(&mpic->fixup_lock); - writeb(0x11 + 2 * fixup->irq, fixup->base + 2); - writel(fixup->data, fixup->base + 4); - spin_unlock(&mpic->fixup_lock); + if (fixup->applebase) { + unsigned int soff = (fixup->index >> 3) & ~3; + unsigned int mask = 1U << (fixup->index & 0x1f); + writel(mask, fixup->applebase + soff); + } else { + spin_lock(&mpic->fixup_lock); + writeb(0x11 + 2 * fixup->index, fixup->base + 2); + writel(fixup->data, fixup->base + 4); + spin_unlock(&mpic->fixup_lock); + } } +static void mpic_startup_ht_interrupt(struct mpic *mpic, unsigned int source, + unsigned int irqflags) +{ + struct mpic_irq_fixup *fixup = &mpic->fixups[source]; + unsigned long flags; + u32 tmp; + + if (fixup->base == 
NULL) + return; + + DBG("startup_ht_interrupt(%u, %u) index: %d\n", + source, irqflags, fixup->index); + spin_lock_irqsave(&mpic->fixup_lock, flags); + /* Enable and configure */ + writeb(0x10 + 2 * fixup->index, fixup->base + 2); + tmp = readl(fixup->base + 4); + tmp &= ~(0x23U); + if (irqflags & IRQ_LEVEL) + tmp |= 0x22; + writel(tmp, fixup->base + 4); + spin_unlock_irqrestore(&mpic->fixup_lock, flags); +} + +static void mpic_shutdown_ht_interrupt(struct mpic *mpic, unsigned int source, + unsigned int irqflags) +{ + struct mpic_irq_fixup *fixup = &mpic->fixups[source]; + unsigned long flags; + u32 tmp; + + if (fixup->base == NULL) + return; + + DBG("shutdown_ht_interrupt(%u, %u)\n", source, irqflags); + + /* Disable */ + spin_lock_irqsave(&mpic->fixup_lock, flags); + writeb(0x10 + 2 * fixup->index, fixup->base + 2); + tmp = readl(fixup->base + 4); + tmp &= ~1U; + writel(tmp, fixup->base + 4); + spin_unlock_irqrestore(&mpic->fixup_lock, flags); +} -static void __init mpic_scan_ioapic(struct mpic *mpic, u8 __iomem *devbase) +static void __init mpic_scan_ht_pic(struct mpic *mpic, u8 __iomem *devbase, + unsigned int devfn, u32 vdid) { int i, irq, n; + u8 __iomem *base; u32 tmp; u8 pos; - for (pos = readb(devbase + 0x34); pos; pos = readb(devbase + pos + 1)) { - u8 id = readb(devbase + pos); - - if (id == 0x08) { + for (pos = readb(devbase + PCI_CAPABILITY_LIST); pos != 0; + pos = readb(devbase + pos + PCI_CAP_LIST_NEXT)) { + u8 id = readb(devbase + pos + PCI_CAP_LIST_ID); + if (id == PCI_CAP_ID_HT_IRQCONF) { id = readb(devbase + pos + 3); if (id == 0x80) break; @@ -205,33 +259,41 @@ static void __init mpic_scan_ioapic(stru if (pos == 0) return; - printk(KERN_INFO "mpic: - Workarounds @ %p, pos = 0x%02x\n", devbase, pos); - - devbase += pos; - - writeb(0x01, devbase + 2); - n = (readl(devbase + 4) >> 16) & 0xff; + base = devbase + pos; + writeb(0x01, base + 2); + n = (readl(base + 4) >> 16) & 0xff; + + printk(KERN_INFO "mpic: - HT:%02x.%x [0x%02x] vendor %04x device 
%04x" + " has %d irqs\n", + devfn >> 3, devfn & 0x7, pos, vdid & 0xffff, vdid >> 16, n + 1); for (i = 0; i <= n; i++) { - writeb(0x10 + 2 * i, devbase + 2); - tmp = readl(devbase + 4); - if ((tmp & 0x21) != 0x20) - continue; + writeb(0x10 + 2 * i, base + 2); + tmp = readl(base + 4); irq = (tmp >> 16) & 0xff; - mpic->fixups[irq].irq = i; - mpic->fixups[irq].base = devbase; - writeb(0x11 + 2 * i, devbase + 2); - mpic->fixups[irq].data = readl(devbase + 4) | 0x80000000; + DBG("HT PIC index 0x%x, irq 0x%x, tmp: %08x\n", i, irq, tmp); + /* mask it , will be unmasked later */ + tmp |= 0x1; + writel(tmp, base + 4); + mpic->fixups[irq].index = i; + mpic->fixups[irq].base = base; + /* Apple HT PIC has a non-standard way of doing EOIs */ + if ((vdid & 0xffff) == 0x106b) + mpic->fixups[irq].applebase = devbase + 0x60; + else + mpic->fixups[irq].applebase = NULL; + writeb(0x11 + 2 * i, base + 2); + mpic->fixups[irq].data = readl(base + 4) | 0x80000000; } } -static void __init mpic_scan_ioapics(struct mpic *mpic) +static void __init mpic_scan_ht_pics(struct mpic *mpic) { unsigned int devfn; u8 __iomem *cfgspace; - printk(KERN_INFO "mpic: Setting up IO-APICs workarounds for U3\n"); + printk(KERN_INFO "mpic: Setting up HT PICs workarounds for U3/U4\n"); /* Allocate fixups array */ mpic->fixups = alloc_bootmem(128 * sizeof(struct mpic_irq_fixup)); @@ -247,13 +309,14 @@ static void __init mpic_scan_ioapics(str cfgspace = ioremap(0xf2000000, 0x10000); BUG_ON(cfgspace == NULL); - /* Now we scan all slots. We do a very quick scan, we read the header type, - * vendor ID and device ID only, that's plenty enough + /* Now we scan all slots. 
We do a very quick scan, we read the header + * type, vendor ID and device ID only, that's plenty enough */ for (devfn = 0; devfn < 0x100; devfn++) { u8 __iomem *devbase = cfgspace + (devfn << 8); u8 hdr_type = readb(devbase + PCI_HEADER_TYPE); u32 l = readl(devbase + PCI_VENDOR_ID); + u16 s; DBG("devfn %x, l: %x\n", devfn, l); @@ -261,8 +324,12 @@ static void __init mpic_scan_ioapics(str if (l == 0xffffffff || l == 0x00000000 || l == 0x0000ffff || l == 0xffff0000) goto next; + /* Check if is supports capability lists */ + s = readw(devbase + PCI_STATUS); + if (!(s & PCI_STATUS_CAP_LIST)) + goto next; - mpic_scan_ioapic(mpic, devbase); + mpic_scan_ht_pic(mpic, devbase, devfn, l); next: /* next device, if function 0 */ @@ -363,6 +430,31 @@ static void mpic_enable_irq(unsigned int break; } } while(mpic_irq_read(src, MPIC_IRQ_VECTOR_PRI) & MPIC_VECPRI_MASK); + +#ifdef CONFIG_MPIC_BROKEN_U3 + if (mpic->flags & MPIC_BROKEN_U3) { + unsigned int src = irq - mpic->irq_offset; + if (mpic_is_ht_interrupt(mpic, src) && + (irq_desc[irq].status & IRQ_LEVEL)) + mpic_ht_end_irq(mpic, src); + } +#endif /* CONFIG_MPIC_BROKEN_U3 */ +} + +static unsigned int mpic_startup_irq(unsigned int irq) +{ +#ifdef CONFIG_MPIC_BROKEN_U3 + struct mpic *mpic = mpic_from_irq(irq); + unsigned int src = irq - mpic->irq_offset; + + if (mpic_is_ht_interrupt(mpic, src)) + mpic_startup_ht_interrupt(mpic, src, irq_desc[irq].status); + +#endif /* CONFIG_MPIC_BROKEN_U3 */ + + mpic_enable_irq(irq); + + return 0; } static void mpic_disable_irq(unsigned int irq) @@ -386,12 +478,27 @@ static void mpic_disable_irq(unsigned in } while(!(mpic_irq_read(src, MPIC_IRQ_VECTOR_PRI) & MPIC_VECPRI_MASK)); } +static void mpic_shutdown_irq(unsigned int irq) +{ +#ifdef CONFIG_MPIC_BROKEN_U3 + struct mpic *mpic = mpic_from_irq(irq); + unsigned int src = irq - mpic->irq_offset; + + if (mpic_is_ht_interrupt(mpic, src)) + mpic_shutdown_ht_interrupt(mpic, src, irq_desc[irq].status); + +#endif /* CONFIG_MPIC_BROKEN_U3 */ + + 
mpic_disable_irq(irq); +} + static void mpic_end_irq(unsigned int irq) { struct mpic *mpic = mpic_from_irq(irq); +#ifdef DEBUG_IRQ DBG("%s: end_irq: %d\n", mpic->name, irq); - +#endif /* We always EOI on end_irq() even for edge interrupts since that * should only lower the priority, the MPIC should have properly * latched another edge interrupt coming in anyway @@ -400,8 +507,9 @@ static void mpic_end_irq(unsigned int ir #ifdef CONFIG_MPIC_BROKEN_U3 if (mpic->flags & MPIC_BROKEN_U3) { unsigned int src = irq - mpic->irq_offset; - if (mpic_is_ht_interrupt(mpic, src)) - mpic_apic_end_irq(mpic, src); + if (mpic_is_ht_interrupt(mpic, src) && + (irq_desc[irq].status & IRQ_LEVEL)) + mpic_ht_end_irq(mpic, src); } #endif /* CONFIG_MPIC_BROKEN_U3 */ @@ -482,6 +590,8 @@ struct mpic * __init mpic_alloc(unsigned mpic->name = name; mpic->hc_irq.typename = name; + mpic->hc_irq.startup = mpic_startup_irq; + mpic->hc_irq.shutdown = mpic_shutdown_irq; mpic->hc_irq.enable = mpic_enable_irq; mpic->hc_irq.disable = mpic_disable_irq; mpic->hc_irq.end = mpic_end_irq; @@ -650,10 +760,10 @@ void __init mpic_init(struct mpic *mpic) mpic->irq_count = mpic->num_sources; #ifdef CONFIG_MPIC_BROKEN_U3 - /* Do the ioapic fixups on U3 broken mpic */ + /* Do the HT PIC fixups on U3 broken mpic */ DBG("MPIC flags: %x\n", mpic->flags); if ((mpic->flags & MPIC_BROKEN_U3) && (mpic->flags & MPIC_PRIMARY)) - mpic_scan_ioapics(mpic); + mpic_scan_ht_pics(mpic); #endif /* CONFIG_MPIC_BROKEN_U3 */ for (i = 0; i < mpic->num_sources; i++) { @@ -840,7 +950,9 @@ void mpic_send_ipi(unsigned int ipi_no, BUG_ON(mpic == NULL); +#ifdef DEBUG_IPI DBG("%s: send_ipi(ipi_no: %d)\n", mpic->name, ipi_no); +#endif mpic_cpu_write(MPIC_CPU_IPI_DISPATCH_0 + ipi_no * 0x10, mpic_physmask(cpu_mask & cpus_addr(cpu_online_map)[0])); @@ -851,19 +963,28 @@ int mpic_get_one_irq(struct mpic *mpic, u32 irq; irq = mpic_cpu_read(MPIC_CPU_INTACK) & MPIC_VECPRI_VECTOR_MASK; +#ifdef DEBUG_LOW DBG("%s: get_one_irq(): %d\n", mpic->name, irq); 
- +#endif if (mpic->cascade && irq == mpic->cascade_vec) { +#ifdef DEBUG_LOW DBG("%s: cascading ...\n", mpic->name); +#endif irq = mpic->cascade(regs, mpic->cascade_data); mpic_eoi(mpic); return irq; } if (unlikely(irq == MPIC_VEC_SPURRIOUS)) return -1; - if (irq < MPIC_VEC_IPI_0) + if (irq < MPIC_VEC_IPI_0) { +#ifdef DEBUG_IRQ + DBG("%s: irq %d\n", mpic->name, irq + mpic->irq_offset); +#endif return irq + mpic->irq_offset; + } +#ifdef DEBUG_IPI DBG("%s: ipi %d !\n", mpic->name, irq - MPIC_VEC_IPI_0); +#endif return irq - MPIC_VEC_IPI_0 + mpic->ipi_offset; } Index: linux-work/include/linux/pci_regs.h =================================================================== --- linux-work.orig/include/linux/pci_regs.h 2005-11-24 17:18:49.000000000 +1100 +++ linux-work/include/linux/pci_regs.h 2005-12-13 18:13:12.000000000 +1100 @@ -196,6 +196,7 @@ #define PCI_CAP_ID_MSI 0x05 /* Message Signalled Interrupts */ #define PCI_CAP_ID_CHSWP 0x06 /* CompactPCI HotSwap */ #define PCI_CAP_ID_PCIX 0x07 /* PCI-X */ +#define PCI_CAP_ID_HT_IRQCONF 0x08 /* HyperTransport IRQ Configuration */ #define PCI_CAP_ID_SHPC 0x0C /* PCI Standard Hot-Plug Controller */ #define PCI_CAP_ID_EXP 0x10 /* PCI Express */ #define PCI_CAP_ID_MSIX 0x11 /* MSI-X */ Index: linux-work/arch/powerpc/kernel/pci_64.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/pci_64.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/kernel/pci_64.c 2005-12-13 18:13:12.000000000 +1100 @@ -34,7 +34,7 @@ #ifdef DEBUG #include -#define DBG(fmt...) udbg_printf(fmt) +#define DBG(fmt...) printk(fmt) #else #define DBG(fmt...) 
#endif @@ -323,6 +323,7 @@ static void pci_parse_of_addrs(struct de addrs = (u32 *) get_property(node, "assigned-addresses", &proplen); if (!addrs) return; + DBG(" parse addresses (%d bytes) @ %p\n", proplen, addrs); for (; proplen >= 20; proplen -= 20, addrs += 5) { flags = pci_parse_of_flags(addrs[0]); if (!flags) @@ -332,6 +333,9 @@ static void pci_parse_of_addrs(struct de if (!size) continue; i = addrs[0] & 0xff; + DBG(" base: %llx, size: %llx, i: %x\n", + (unsigned long long)base, (unsigned long long)size, i); + if (PCI_BASE_ADDRESS_0 <= i && i <= PCI_BASE_ADDRESS_5) { res = &dev->resource[(i - PCI_BASE_ADDRESS_0) >> 2]; } else if (i == dev->rom_base_reg) { @@ -362,6 +366,8 @@ struct pci_dev *of_create_pci_dev(struct if (type == NULL) type = ""; + DBG(" create device, devfn: %x, type: %s\n", devfn, type); + memset(dev, 0, sizeof(struct pci_dev)); dev->bus = bus; dev->sysdata = node; @@ -375,12 +381,14 @@ struct pci_dev *of_create_pci_dev(struct dev->subsystem_vendor = get_int_prop(node, "subsystem-vendor-id", 0); dev->subsystem_device = get_int_prop(node, "subsystem-id", 0); - dev->cfg_size = 256; /*pci_cfg_space_size(dev);*/ + dev->cfg_size = pci_cfg_space_size(dev); sprintf(pci_name(dev), "%04x:%02x:%02x.%d", pci_domain_nr(bus), dev->bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn)); dev->class = get_int_prop(node, "class-code", 0); + DBG(" class: 0x%x\n", dev->class); + dev->current_state = 4; /* unknown power state */ if (!strcmp(type, "pci")) { @@ -402,6 +410,8 @@ struct pci_dev *of_create_pci_dev(struct pci_parse_of_addrs(node, dev); + DBG(" adding to system ...\n"); + pci_device_add(dev, bus); /* XXX pci_scan_msi_device(dev); */ @@ -418,15 +428,21 @@ void __devinit of_scan_bus(struct device int reglen, devfn; struct pci_dev *dev; + DBG("of_scan_bus(%s) bus no %d... 
\n", node->full_name, bus->number); + while ((child = of_get_next_child(node, child)) != NULL) { + DBG(" * %s\n", child->full_name); reg = (u32 *) get_property(child, "reg", &reglen); if (reg == NULL || reglen < 20) continue; devfn = (reg[0] >> 8) & 0xff; + /* create a new pci_dev for this device */ dev = of_create_pci_dev(child, bus, devfn); if (!dev) continue; + DBG("dev header type: %x\n", dev->hdr_type); + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE || dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) of_scan_pci_bridge(child, dev); @@ -446,16 +462,18 @@ void __devinit of_scan_pci_bridge(struct unsigned int flags; u64 size; + DBG("of_scan_pci_bridge(%s)\n", node->full_name); + /* parse bus-range property */ busrange = (u32 *) get_property(node, "bus-range", &len); if (busrange == NULL || len != 8) { - printk(KERN_ERR "Can't get bus-range for PCI-PCI bridge %s\n", + printk(KERN_DEBUG "Can't get bus-range for PCI-PCI bridge %s\n", node->full_name); return; } ranges = (u32 *) get_property(node, "ranges", &len); if (ranges == NULL) { - printk(KERN_ERR "Can't get ranges for PCI-PCI bridge %s\n", + printk(KERN_DEBUG "Can't get ranges for PCI-PCI bridge %s\n", node->full_name); return; } @@ -509,10 +527,13 @@ void __devinit of_scan_pci_bridge(struct } sprintf(bus->name, "PCI Bus %04x:%02x", pci_domain_nr(bus), bus->number); + DBG(" bus name: %s\n", bus->name); mode = PCI_PROBE_NORMAL; if (ppc_md.pci_probe_mode) mode = ppc_md.pci_probe_mode(bus); + DBG(" probe mode: %d\n", mode); + if (mode == PCI_PROBE_DEVTREE) of_scan_bus(node, bus); else if (mode == PCI_PROBE_NORMAL) @@ -528,6 +549,8 @@ void __devinit scan_phb(struct pci_contr int i, mode; struct resource *res; + DBG("Scanning PHB %s\n", node ? 
node->full_name : ""); + bus = pci_create_bus(NULL, hose->first_busno, hose->ops, node); if (bus == NULL) { printk(KERN_ERR "Failed to create bus for PCI domain %04x\n", @@ -552,8 +575,9 @@ void __devinit scan_phb(struct pci_contr mode = PCI_PROBE_NORMAL; #ifdef CONFIG_PPC_MULTIPLATFORM - if (ppc_md.pci_probe_mode) + if (node && ppc_md.pci_probe_mode) mode = ppc_md.pci_probe_mode(bus); + DBG(" probe mode: %d\n", mode); if (mode == PCI_PROBE_DEVTREE) { bus->subordinate = hose->last_busno; of_scan_bus(node, bus); @@ -842,8 +866,7 @@ pgprot_t pci_phys_mem_access_prot(struct * Returns a negative error code on failure, zero on success. */ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, - enum pci_mmap_state mmap_state, - int write_combine) + enum pci_mmap_state mmap_state, int write_combine) { unsigned long offset = vma->vm_pgoff << PAGE_SHIFT; struct resource *rp; Index: linux-work/arch/powerpc/kernel/udbg.c =================================================================== --- linux-work.orig/arch/powerpc/kernel/udbg.c 2005-12-06 16:17:43.000000000 +1100 +++ linux-work/arch/powerpc/kernel/udbg.c 2005-12-13 18:13:12.000000000 +1100 @@ -110,10 +110,12 @@ static int early_console_initialized; void __init disable_early_printk(void) { +#if 1 if (!early_console_initialized) return; unregister_console(&udbg_console); early_console_initialized = 0; +#endif } /* called by setup_system */ Index: linux-work/arch/powerpc/platforms/powermac/setup.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/setup.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/setup.c 2005-12-14 11:42:58.000000000 +1100 @@ -345,7 +345,7 @@ void __init pmac_setup_arch(void) #ifdef CONFIG_SMP /* Check for Core99 */ - if (find_devices("uni-n") || find_devices("u3")) + if (find_devices("uni-n") || find_devices("u3") || find_devices("u4")) smp_ops = &core99_smp_ops; #ifdef 
CONFIG_PPC32 else @@ -635,7 +635,7 @@ static void __init pmac_init_early(void) /* Setup interrupt mapping options */ ppc64_interrupt_controller = IC_OPEN_PIC; - iommu_init_early_u3(); + iommu_init_early_dart(); #endif } @@ -711,7 +711,7 @@ static int __init pmac_probe(int platfor * occupies having to be broken up so the DART itself is not * part of the cacheable linar mapping */ - alloc_u3_dart_table(); + alloc_dart_table(); #endif #ifdef CONFIG_PMAC_SMU @@ -733,10 +733,11 @@ static int pmac_pci_probe_mode(struct pc struct device_node *node = bus->sysdata; /* We need to use normal PCI probing for the AGP bus, - since the device for the AGP bridge isn't in the tree. */ - if (bus->self == NULL && device_is_compatible(node, "u3-agp")) + * since the device for the AGP bridge isn't in the tree. + */ + if (bus->self == NULL && (device_is_compatible(node, "u3-agp") || + device_is_compatible(node, "u4-pcie"))) return PCI_PROBE_NORMAL; - return PCI_PROBE_DEVTREE; } #endif Index: linux-work/arch/powerpc/platforms/powermac/smp.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/powermac/smp.c 2005-12-13 15:03:21.000000000 +1100 +++ linux-work/arch/powerpc/platforms/powermac/smp.c 2005-12-13 18:13:12.000000000 +1100 @@ -361,7 +361,6 @@ static void __init psurge_dual_sync_tb(i set_dec(tb_ticks_per_jiffy); /* XXX fixme */ set_tb(0, 0); - last_jiffy_stamp(cpu_nr) = 0; if (cpu_nr > 0) { mb(); @@ -429,15 +428,62 @@ struct smp_ops_t psurge_smp_ops = { }; #endif /* CONFIG_PPC32 - actually powersurge support */ +/* + * Core 99 and later support + */ + +static void (*pmac_tb_freeze)(int freeze); +static unsigned long timebase; +static int tb_req; + +static void smp_core99_give_timebase(void) +{ + unsigned long flags; + + local_irq_save(flags); + + while(!tb_req) + barrier(); + tb_req = 0; + (*pmac_tb_freeze)(1); + mb(); + timebase = get_tb(); + mb(); + while (timebase) + barrier(); + mb(); + (*pmac_tb_freeze)(0); + mb(); + + 
local_irq_restore(flags); +} + + +static void __devinit smp_core99_take_timebase(void) +{ + unsigned long flags; + + local_irq_save(flags); + + tb_req = 1; + mb(); + while (!timebase) + barrier(); + mb(); + set_tb(timebase >> 32, timebase & 0xffffffff); + timebase = 0; + mb(); + set_dec(tb_ticks_per_jiffy/2); + + local_irq_restore(flags); +} + #ifdef CONFIG_PPC64 /* * G5s enable/disable the timebase via an i2c-connected clock chip. */ static struct device_node *pmac_tb_clock_chip_host; static u8 pmac_tb_pulsar_addr; -static void (*pmac_tb_freeze)(int freeze); -static DEFINE_SPINLOCK(timebase_lock); -static unsigned long timebase; static void smp_core99_cypress_tb_freeze(int freeze) { @@ -447,7 +493,8 @@ static void smp_core99_cypress_tb_freeze /* Strangely, the device-tree says address is 0xd2, but darwin * accesses 0xd0 ... */ - pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_combined); + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, + pmac_low_i2c_mode_combined); rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, 0xd0 | pmac_low_i2c_read, 0x81, &data, 1); @@ -475,7 +522,8 @@ static void smp_core99_pulsar_tb_freeze( u8 data; int rc; - pmac_low_i2c_setmode(pmac_tb_clock_chip_host, pmac_low_i2c_mode_combined); + pmac_low_i2c_setmode(pmac_tb_clock_chip_host, + pmac_low_i2c_mode_combined); rc = pmac_low_i2c_xfer(pmac_tb_clock_chip_host, pmac_tb_pulsar_addr | pmac_low_i2c_read, 0x2e, &data, 1); @@ -496,54 +544,14 @@ static void smp_core99_pulsar_tb_freeze( } } - -static void smp_core99_give_timebase(void) -{ - /* Open i2c bus for synchronous access */ - if (pmac_low_i2c_open(pmac_tb_clock_chip_host, 0)) - panic("Can't open i2c for TB sync !\n"); - - spin_lock(&timebase_lock); - (*pmac_tb_freeze)(1); - mb(); - timebase = get_tb(); - spin_unlock(&timebase_lock); - - while (timebase) - barrier(); - - spin_lock(&timebase_lock); - (*pmac_tb_freeze)(0); - spin_unlock(&timebase_lock); - - /* Close i2c bus */ - pmac_low_i2c_close(pmac_tb_clock_chip_host); -} - - 
-static void __devinit smp_core99_take_timebase(void) -{ - while (!timebase) - barrier(); - spin_lock(&timebase_lock); - set_tb(timebase >> 32, timebase & 0xffffffff); - timebase = 0; - spin_unlock(&timebase_lock); -} - -static void __init smp_core99_setup(int ncpus) +static void __init smp_core99_setup_i2c_hwsync(int ncpus) { struct device_node *cc = NULL; struct device_node *p; + const char *name = NULL; u32 *reg; int ok; - /* HW sync only on these platforms */ - if (!machine_is_compatible("PowerMac7,2") && - !machine_is_compatible("PowerMac7,3") && - !machine_is_compatible("RackMac3,1")) - return; - /* Look for the clock chip */ while ((cc = of_find_node_by_name(cc, "i2c-hwclock")) != NULL) { p = of_get_parent(cc); @@ -561,114 +569,64 @@ static void __init smp_core99_setup(int if (device_is_compatible(cc, "pulsar-legacy-slewing")) { pmac_tb_freeze = smp_core99_pulsar_tb_freeze; pmac_tb_pulsar_addr = 0xd2; - printk(KERN_INFO "Timebase clock is Pulsar chip\n"); + name = "Pulsar"; } else if (device_is_compatible(cc, "cy28508")) { pmac_tb_freeze = smp_core99_cypress_tb_freeze; - printk(KERN_INFO "Timebase clock is Cypress chip\n"); + name = "Cypress"; } break; case 0xd4: pmac_tb_freeze = smp_core99_pulsar_tb_freeze; pmac_tb_pulsar_addr = 0xd4; - printk(KERN_INFO "Timebase clock is Pulsar chip\n"); + name = "Pulsar"; break; } - if (pmac_tb_freeze != NULL) { - pmac_tb_clock_chip_host = of_get_parent(cc); - of_node_put(cc); + if (pmac_tb_freeze != NULL) break; - } } - if (pmac_tb_freeze == NULL) { - smp_ops->give_timebase = smp_generic_give_timebase; - smp_ops->take_timebase = smp_generic_take_timebase; + if (pmac_tb_freeze != NULL) { + struct device_node *p = of_get_parent(cc); + of_node_put(cc); + while(p && strcmp(p->type, "i2c")) { + cc = of_get_parent(p); + of_node_put(p); + p = cc; + } + if (p == NULL) + goto no_i2c_sync; + /* Open i2c bus for synchronous access */ + if (pmac_low_i2c_open(p, 0)) { + printk(KERN_ERR "Failed top open i2c bus %s for clock" + " sync, 
fallback to software sync !\n", + p->full_name); + of_node_put(p); + goto no_i2c_sync; + } + pmac_tb_clock_chip_host = p; + printk(KERN_INFO "Processor timebase sync using %s i2c clock\n", + name); + return; } + no_i2c_sync: + pmac_tb_freeze = NULL; } -/* nothing to do here, caches are already set up by service processor */ -static inline void __devinit core99_init_caches(int cpu) -{ -} +#endif /* CONFIG_PPC64 */ -#else /* CONFIG_PPC64 */ /* - * SMP G4 powermacs use a GPIO to enable/disable the timebase. + * SMP G4 and newer G5 use a GPIO to enable/disable the timebase. */ static unsigned int core99_tb_gpio; /* Timebase freeze GPIO */ -static unsigned int pri_tb_hi, pri_tb_lo; -static unsigned int pri_tb_stamp; - -/* not __init, called in sleep/wakeup code */ -void smp_core99_give_timebase(void) +static void smp_core99_gpio_tb_freeze(int freeze) { - unsigned long flags; - unsigned int t; - - /* wait for the secondary to be in take_timebase */ - for (t = 100000; t > 0 && !sec_tb_reset; --t) - udelay(10); - if (!sec_tb_reset) { - printk(KERN_WARNING "Timeout waiting sync on second CPU\n"); - return; - } - - /* freeze the timebase and read it */ - /* disable interrupts so the timebase is disabled for the - shortest possible time */ - local_irq_save(flags); - pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 4); + if (freeze) + pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 4); + else + pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 0); pmac_call_feature(PMAC_FTR_READ_GPIO, NULL, core99_tb_gpio, 0); - mb(); - pri_tb_hi = get_tbu(); - pri_tb_lo = get_tbl(); - pri_tb_stamp = last_jiffy_stamp(smp_processor_id()); - mb(); - - /* tell the secondary we're ready */ - sec_tb_reset = 2; - mb(); - - /* wait for the secondary to have taken it */ - /* note: can't use udelay here, since it needs the timebase running */ - for (t = 10000000; t > 0 && sec_tb_reset; --t) - barrier(); - if (sec_tb_reset) - /* XXX BUG_ON here? 
*/ - printk(KERN_WARNING "Timeout waiting sync(2) on second CPU\n"); - - /* Now, restart the timebase by leaving the GPIO to an open collector */ - pmac_call_feature(PMAC_FTR_WRITE_GPIO, NULL, core99_tb_gpio, 0); - pmac_call_feature(PMAC_FTR_READ_GPIO, NULL, core99_tb_gpio, 0); - local_irq_restore(flags); -} - -/* not __init, called in sleep/wakeup code */ -void smp_core99_take_timebase(void) -{ - unsigned long flags; - - /* tell the primary we're here */ - sec_tb_reset = 1; - mb(); - - /* wait for the primary to set pri_tb_hi/lo */ - while (sec_tb_reset < 2) - mb(); - - /* set our stuff the same as the primary */ - local_irq_save(flags); - set_dec(1); - set_tb(pri_tb_hi, pri_tb_lo); - last_jiffy_stamp(smp_processor_id()) = pri_tb_stamp; - mb(); - - /* tell the primary we're done */ - sec_tb_reset = 0; - mb(); - local_irq_restore(flags); } /* L2 and L3 cache settings to pass from CPU0 to CPU1 on G4 cpus */ @@ -677,6 +635,7 @@ volatile static long int core99_l3_cache static void __devinit core99_init_caches(int cpu) { +#ifndef CONFIG_PPC64 if (!cpu_has_feature(CPU_FTR_L2CR)) return; @@ -702,30 +661,80 @@ static void __devinit core99_init_caches _set_L3CR(core99_l3_cache); printk("CPU%d: L3CR set to %lx\n", cpu, core99_l3_cache); } +#endif /* !CONFIG_PPC64 */ } static void __init smp_core99_setup(int ncpus) { - struct device_node *cpu; - u32 *tbprop = NULL; - int i; - - core99_tb_gpio = KL_GPIO_TB_ENABLE; /* default value */ - cpu = of_find_node_by_type(NULL, "cpu"); - if (cpu != NULL) { - tbprop = (u32 *)get_property(cpu, "timebase-enable", NULL); - if (tbprop) - core99_tb_gpio = *tbprop; - of_node_put(cpu); - } - - /* XXX should get this from reg properties */ - for (i = 1; i < ncpus; ++i) - smp_hw_index[i] = i; - powersave_nap = 0; -} +#ifdef CONFIG_PPC64 + + /* i2c based HW sync on some G5s */ + if (machine_is_compatible("PowerMac7,2") || + machine_is_compatible("PowerMac7,3") || + machine_is_compatible("RackMac3,1")) + smp_core99_setup_i2c_hwsync(ncpus); + + /* 
GPIO based HW sync on recent G5s */ + if (pmac_tb_freeze == NULL) { + struct device_node *np = + of_find_node_by_name(NULL, "timebase-enable"); + u32 *reg = (u32 *)get_property(np, "reg", NULL); + + if (np && reg && !strcmp(np->type, "gpio")) { + core99_tb_gpio = *reg; + if (core99_tb_gpio < 0x50) + core99_tb_gpio += 0x50; + pmac_tb_freeze = smp_core99_gpio_tb_freeze; + printk(KERN_INFO "Processor timebase sync using" + " GPIO 0x%02x\n", core99_tb_gpio); + } + } + +#else /* CONFIG_PPC64 */ + + /* GPIO based HW sync on ppc32 Core99 */ + if (pmac_tb_freeze == NULL && !machine_is_compatible("MacRISC4")) { + struct device_node *cpu; + u32 *tbprop = NULL; + + core99_tb_gpio = KL_GPIO_TB_ENABLE; /* default value */ + cpu = of_find_node_by_type(NULL, "cpu"); + if (cpu != NULL) { + tbprop = (u32 *)get_property(cpu, "timebase-enable", + NULL); + if (tbprop) + core99_tb_gpio = *tbprop; + of_node_put(cpu); + } + pmac_tb_freeze = smp_core99_gpio_tb_freeze; + printk(KERN_INFO "Processor timebase sync using" + " GPIO 0x%02x\n", core99_tb_gpio); + } + +#endif /* CONFIG_PPC64 */ + + /* No timebase sync, fallback to software */ + if (pmac_tb_freeze == NULL) { + smp_ops->give_timebase = smp_generic_give_timebase; + smp_ops->take_timebase = smp_generic_take_timebase; + printk(KERN_INFO "Processor timebase sync using software\n"); + } + +#ifndef CONFIG_PPC64 + { + int i; + + /* XXX should get this from reg properties */ + for (i = 1; i < ncpus; ++i) + smp_hw_index[i] = i; + } #endif + /* 32 bits SMP can't NAP */ + if (!machine_is_compatible("MacRISC4")) + powersave_nap = 0; +} + static int __init smp_core99_probe(void) { struct device_node *cpus; @@ -803,17 +812,25 @@ static void __devinit smp_core99_setup_c mpic_setup_this_cpu(); if (cpu_nr == 0) { -#ifdef CONFIG_POWER4 +#ifdef CONFIG_PPC64 extern void g5_phy_disable_cpu1(void); + /* Close i2c bus if it was used for tb sync */ + if (pmac_tb_clock_chip_host) { + pmac_low_i2c_close(pmac_tb_clock_chip_host); + pmac_tb_clock_chip_host = 
NULL; + } + /* If we didn't start the second CPU, we must take * it off the bus */ if (machine_is_compatible("MacRISC4") && num_online_cpus() < 2) g5_phy_disable_cpu1(); -#endif /* CONFIG_POWER4 */ - if (ppc_md.progress) ppc_md.progress("core99_setup_cpu 0 done", 0x349); +#endif /* CONFIG_PPC64 */ + + if (ppc_md.progress) + ppc_md.progress("core99_setup_cpu 0 done", 0x349); } } Index: linux-work/arch/powerpc/sysdev/u3_iommu.c =================================================================== --- linux-work.orig/arch/powerpc/sysdev/u3_iommu.c 2005-11-24 17:21:41.000000000 +1100 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,327 +0,0 @@ -/* - * arch/powerpc/sysdev/u3_iommu.c - * - * Copyright (C) 2004 Olof Johansson , IBM Corporation - * - * Based on pSeries_iommu.c: - * Copyright (C) 2001 Mike Corrigan & Dave Engebretsen, IBM Corporation - * Copyright (C) 2004 Olof Johansson , IBM Corporation - * - * Dynamic DMA mapping support, Apple U3 & IBM CPC925 "DART" iommu. - * - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
- * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "dart.h" - -extern int iommu_force_on; - -/* Physical base address and size of the DART table */ -unsigned long dart_tablebase; /* exported to htab_initialize */ -static unsigned long dart_tablesize; - -/* Virtual base address of the DART table */ -static u32 *dart_vbase; - -/* Mapped base address for the dart */ -static unsigned int *dart; - -/* Dummy val that entries are set to when unused */ -static unsigned int dart_emptyval; - -static struct iommu_table iommu_table_u3; -static int iommu_table_u3_inited; -static int dart_dirty; - -#define DBG(...) - -static inline void dart_tlb_invalidate_all(void) -{ - unsigned long l = 0; - unsigned int reg; - unsigned long limit; - - DBG("dart: flush\n"); - - /* To invalidate the DART, set the DARTCNTL_FLUSHTLB bit in the - * control register and wait for it to clear. - * - * Gotcha: Sometimes, the DART won't detect that the bit gets - * set. If so, clear it and set it again. - */ - - limit = 0; - -retry: - reg = in_be32((unsigned int *)dart+DARTCNTL); - reg |= DARTCNTL_FLUSHTLB; - out_be32((unsigned int *)dart+DARTCNTL, reg); - - l = 0; - while ((in_be32((unsigned int *)dart+DARTCNTL) & DARTCNTL_FLUSHTLB) && - l < (1L<it_base) + index; - - /* On U3, all memory is contigous, so we can move this - * out of the loop. 
- */ - while (npages--) { - rpn = virt_to_abs(uaddr) >> DART_PAGE_SHIFT; - - *(dp++) = DARTMAP_VALID | (rpn & DARTMAP_RPNMASK); - - rpn++; - uaddr += DART_PAGE_SIZE; - } - - dart_dirty = 1; -} - - -static void dart_free(struct iommu_table *tbl, long index, long npages) -{ - unsigned int *dp; - - /* We don't worry about flushing the TLB cache. The only drawback of - * not doing it is that we won't catch buggy device drivers doing - * bad DMAs, but then no 32-bit architecture ever does either. - */ - - DBG("dart: free at: %lx, %lx\n", index, npages); - - index <<= DART_PAGE_FACTOR; - npages <<= DART_PAGE_FACTOR; - - dp = ((unsigned int *)tbl->it_base) + index; - - while (npages--) - *(dp++) = dart_emptyval; -} - - -static int dart_init(struct device_node *dart_node) -{ - unsigned int regword; - unsigned int i; - unsigned long tmp; - - if (dart_tablebase == 0 || dart_tablesize == 0) { - printk(KERN_INFO "U3-DART: table not allocated, using direct DMA\n"); - return -ENODEV; - } - - /* Make sure nothing from the DART range remains in the CPU cache - * from a previous mapping that existed before the kernel took - * over - */ - flush_dcache_phys_range(dart_tablebase, dart_tablebase + dart_tablesize); - - /* Allocate a spare page to map all invalid DART pages. We need to do - * that to work around what looks like a problem with the HT bridge - * prefetching into invalid pages and corrupting data - */ - tmp = lmb_alloc(DART_PAGE_SIZE, DART_PAGE_SIZE); - if (!tmp) - panic("U3-DART: Cannot allocate spare page!"); - dart_emptyval = DARTMAP_VALID | ((tmp >> DART_PAGE_SHIFT) & DARTMAP_RPNMASK); - - /* Map in DART registers. 
FIXME: Use device node to get base address */ - dart = ioremap(DART_BASE, 0x7000); - if (dart == NULL) - panic("U3-DART: Cannot map registers!"); - - /* Set initial control register contents: table base, - * table size and enable bit - */ - regword = DARTCNTL_ENABLE | - ((dart_tablebase >> DART_PAGE_SHIFT) << DARTCNTL_BASE_SHIFT) | - (((dart_tablesize >> DART_PAGE_SHIFT) & DARTCNTL_SIZE_MASK) - << DARTCNTL_SIZE_SHIFT); - dart_vbase = ioremap(virt_to_abs(dart_tablebase), dart_tablesize); - - /* Fill initial table */ - for (i = 0; i < dart_tablesize/4; i++) - dart_vbase[i] = dart_emptyval; - - /* Initialize DART with table base and enable it. */ - out_be32((unsigned int *)dart, regword); - - /* Invalidate DART to get rid of possible stale TLBs */ - dart_tlb_invalidate_all(); - - printk(KERN_INFO "U3/CPC925 DART IOMMU initialized\n"); - - return 0; -} - -static void iommu_table_u3_setup(void) -{ - iommu_table_u3.it_busno = 0; - iommu_table_u3.it_offset = 0; - /* it_size is in number of entries */ - iommu_table_u3.it_size = (dart_tablesize / sizeof(u32)) >> DART_PAGE_FACTOR; - - /* Initialize the common IOMMU code */ - iommu_table_u3.it_base = (unsigned long)dart_vbase; - iommu_table_u3.it_index = 0; - iommu_table_u3.it_blocksize = 1; - iommu_init_table(&iommu_table_u3); - - /* Reserve the last page of the DART to avoid possible prefetch - * past the DART mapped area - */ - set_bit(iommu_table_u3.it_size - 1, iommu_table_u3.it_map); -} - -static void iommu_dev_setup_u3(struct pci_dev *dev) -{ - struct device_node *dn; - - /* We only have one iommu table on the mac for now, which makes - * things simple. 
Setup all PCI devices to point to this table - * - * We must use pci_device_to_OF_node() to make sure that - * we get the real "final" pointer to the device in the - * pci_dev sysdata and not the temporary PHB one - */ - dn = pci_device_to_OF_node(dev); - - if (dn) - PCI_DN(dn)->iommu_table = &iommu_table_u3; -} - -static void iommu_bus_setup_u3(struct pci_bus *bus) -{ - struct device_node *dn; - - if (!iommu_table_u3_inited) { - iommu_table_u3_inited = 1; - iommu_table_u3_setup(); - } - - dn = pci_bus_to_OF_node(bus); - - if (dn) - PCI_DN(dn)->iommu_table = &iommu_table_u3; -} - -static void iommu_dev_setup_null(struct pci_dev *dev) { } -static void iommu_bus_setup_null(struct pci_bus *bus) { } - -void iommu_init_early_u3(void) -{ - struct device_node *dn; - - /* Find the DART in the device-tree */ - dn = of_find_compatible_node(NULL, "dart", "u3-dart"); - if (dn == NULL) - return; - - /* Setup low level TCE operations for the core IOMMU code */ - ppc_md.tce_build = dart_build; - ppc_md.tce_free = dart_free; - ppc_md.tce_flush = dart_flush; - - /* Initialize the DART HW */ - if (dart_init(dn)) { - /* If init failed, use direct iommu and null setup functions */ - ppc_md.iommu_dev_setup = iommu_dev_setup_null; - ppc_md.iommu_bus_setup = iommu_bus_setup_null; - - /* Setup pci_dma ops */ - pci_direct_iommu_init(); - } else { - ppc_md.iommu_dev_setup = iommu_dev_setup_u3; - ppc_md.iommu_bus_setup = iommu_bus_setup_u3; - - /* Setup pci_dma ops */ - pci_iommu_init(); - } -} - - -void __init alloc_u3_dart_table(void) -{ - /* Only reserve DART space if machine has more than 2GB of RAM - * or if requested with iommu=on on cmdline. - */ - if (lmb_end_of_DRAM() <= 0x80000000ull && !iommu_force_on) - return; - - /* 512 pages (2MB) is max DART tablesize. */ - dart_tablesize = 1UL << 21; - /* 16MB (1 << 24) alignment. 
We allocate a full 16Mb chuck since we - * will blow up an entire large page anyway in the kernel mapping - */ - dart_tablebase = (unsigned long) - abs_to_virt(lmb_alloc_base(1UL<<24, 1UL<<24, 0x80000000L)); - - printk(KERN_INFO "U3-DART allocated at: %lx\n", dart_tablebase); -} Index: linux-work/drivers/ide/ppc/pmac.c =================================================================== --- linux-work.orig/drivers/ide/ppc/pmac.c 2005-12-13 17:51:20.000000000 +1100 +++ linux-work/drivers/ide/ppc/pmac.c 2005-12-13 18:13:12.000000000 +1100 @@ -1686,7 +1686,7 @@ pmac_ide_probe(void) #else macio_register_driver(&pmac_ide_macio_driver); pci_register_driver(&pmac_ide_pci_driver); -#endif +#endif } #ifdef CONFIG_BLK_DEV_IDEDMA_PMAC Index: linux-work/drivers/macintosh/smu.c =================================================================== --- linux-work.orig/drivers/macintosh/smu.c 2005-11-24 17:18:43.000000000 +1100 +++ linux-work/drivers/macintosh/smu.c 2005-12-13 18:13:12.000000000 +1100 @@ -53,7 +53,7 @@ #undef DEBUG_SMU #ifdef DEBUG_SMU -#define DPRINTK(fmt, args...) do { udbg_printf(KERN_DEBUG fmt , ##args); } while (0) +#define DPRINTK(fmt, args...) do { printk(KERN_DEBUG fmt , ##args); } while (0) #else #define DPRINTK(fmt, args...) do { } while (0) #endif @@ -909,10 +909,13 @@ static struct smu_sdbp_header *smu_creat struct property *prop; /* First query the partition info */ + DPRINTK("SMU: Query partition infos ... 
(irq=%d)\n", smu->db_irq); smu_queue_simple(&cmd, SMU_CMD_PARTITION_COMMAND, 2, smu_done_complete, &comp, SMU_CMD_PARTITION_LATEST, id); wait_for_completion(&comp); + DPRINTK("SMU: done, status: %d, reply_len: %d\n", + cmd.cmd.status, cmd.cmd.reply_len); /* Partition doesn't exist (or other error) */ if (cmd.cmd.status != 0 || cmd.cmd.reply_len != 6) @@ -975,6 +978,8 @@ struct smu_sdbp_header *__smu_get_sdb_pa sprintf(pname, "sdb-partition-%02x", id); + DPRINTK("smu_get_sdb_partition(%02x)\n", id); + if (interruptible) { int rc; rc = down_interruptible(&smu_part_access); @@ -986,6 +991,7 @@ struct smu_sdbp_header *__smu_get_sdb_pa part = (struct smu_sdbp_header *)get_property(smu->of_node, pname, size); if (part == NULL) { + DPRINTK("trying to extract from SMU ...\n"); part = smu_create_sdb_partition(id); if (part != NULL && size) *size = part->len << 2; Index: linux-work/include/asm-powerpc/mpic.h =================================================================== --- linux-work.orig/include/asm-powerpc/mpic.h 2005-12-13 18:02:03.000000000 +1100 +++ linux-work/include/asm-powerpc/mpic.h 2005-12-13 18:13:12.000000000 +1100 @@ -117,8 +117,9 @@ typedef int (*mpic_cascade_t)(struct pt_ struct mpic_irq_fixup { u8 __iomem *base; + u8 __iomem *applebase; u32 data; - unsigned int irq; + unsigned int index; }; #endif /* CONFIG_MPIC_BROKEN_U3 */ Index: linux-work/include/asm/hardirq.h =================================================================== --- linux-work.orig/include/asm/hardirq.h 2005-11-24 17:18:48.000000000 +1100 +++ linux-work/include/asm/hardirq.h 2005-12-13 18:13:12.000000000 +1100 @@ -11,13 +11,10 @@ */ typedef struct { unsigned int __softirq_pending; /* set_bit is used on this */ - unsigned int __last_jiffy_stamp; } ____cacheline_aligned irq_cpustat_t; #include /* Standard mappings for irq_cpustat_t above */ -#define last_jiffy_stamp(cpu) __IRQ_STAT((cpu), __last_jiffy_stamp) - static inline void ack_bad_irq(int irq) { printk(KERN_CRIT "illegal vector %d 
received!\n", irq); Index: linux-work/arch/powerpc/platforms/maple/setup.c =================================================================== --- linux-work.orig/arch/powerpc/platforms/maple/setup.c 2005-12-13 15:03:21.000000000 +1100 +++ linux-work/arch/powerpc/platforms/maple/setup.c 2005-12-14 11:43:10.000000000 +1100 @@ -195,7 +195,7 @@ static void __init maple_init_early(void /* Setup interrupt mapping options */ ppc64_interrupt_controller = IC_OPEN_PIC; - iommu_init_early_u3(); + iommu_init_early_dart(); DBG(" <- maple_init_early\n"); } @@ -257,7 +257,7 @@ static int __init maple_probe(int platfo * occupies having to be broken up so the DART itself is not * part of the cacheable linar mapping */ - alloc_u3_dart_table(); + alloc_dart_table(); return 1; } Index: linux-work/arch/powerpc/sysdev/Makefile =================================================================== --- linux-work.orig/arch/powerpc/sysdev/Makefile 2005-11-24 17:21:41.000000000 +1100 +++ linux-work/arch/powerpc/sysdev/Makefile 2005-12-14 11:01:34.000000000 +1100 @@ -4,6 +4,6 @@ obj-$(CONFIG_PPC_I8259) += i8259.o obj-$(CONFIG_PPC_MPC106) += grackle.o obj-$(CONFIG_BOOKE) += dcr.o obj-$(CONFIG_40x) += dcr.o -obj-$(CONFIG_U3_DART) += u3_iommu.o +obj-$(CONFIG_U3_DART) += dart_iommu.o obj-$(CONFIG_MMIO_NVRAM) += mmio_nvram.o obj-$(CONFIG_83xx) += ipic.o Index: linux-work/arch/powerpc/sysdev/dart.h =================================================================== --- linux-work.orig/arch/powerpc/sysdev/dart.h 2005-11-24 17:21:41.000000000 +1100 +++ linux-work/arch/powerpc/sysdev/dart.h 2005-12-14 12:52:12.000000000 +1100 @@ -20,29 +20,44 @@ #define _POWERPC_SYSDEV_DART_H -/* physical base of DART registers */ -#define DART_BASE 0xf8033000UL - /* Offset from base to control register */ -#define DARTCNTL 0 +#define DART_CNTL 0 + /* Offset from base to exception register */ -#define DARTEXCP 0x10 +#define DART_EXCP_U3 0x10 /* Offset from base to TLB tag registers */ -#define DARTTAG 0x1000 +#define 
DART_TAGS_U3 0x1000 +/* U4 registers */ +#define DART_BASE_U4 0x10 +#define DART_SIZE_U4 0x20 +#define DART_EXCP_U4 0x30 +#define DART_TAGS_U4 0x1000 /* Control Register fields */ -/* base address of table (pfn) */ -#define DARTCNTL_BASE_MASK 0xfffff -#define DARTCNTL_BASE_SHIFT 12 +/* U3 registers */ +#define DART_CNTL_U3_BASE_MASK 0xfffff +#define DART_CNTL_U3_BASE_SHIFT 12 +#define DART_CNTL_U3_FLUSHTLB 0x400 +#define DART_CNTL_U3_ENABLE 0x200 +#define DART_CNTL_U3_SIZE_MASK 0x1ff +#define DART_CNTL_U3_SIZE_SHIFT 0 + +/* U4 registers */ +#define DART_BASE_U4_BASE_MASK 0xffffff +#define DART_BASE_U4_BASE_SHIFT 0 +#define DART_CNTL_U4_FLUSHTLB 0x20000000 +#define DART_CNTL_U4_ENABLE 0x80000000 +#define DART_SIZE_U4_SIZE_MASK 0x1fff +#define DART_SIZE_U4_SIZE_SHIFT 0 + +#define DART_REG(r) (dart + ((r) >> 2)) +#define DART_IN(r) (in_be32(DART_REG(r))) +#define DART_OUT(r,v) (out_be32(DART_REG(r), (v))) -#define DARTCNTL_FLUSHTLB 0x400 -#define DARTCNTL_ENABLE 0x200 /* size of table in pages */ -#define DARTCNTL_SIZE_MASK 0x1ff -#define DARTCNTL_SIZE_SHIFT 0 /* DART table fields */ Index: linux-work/arch/powerpc/sysdev/dart_iommu.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-work/arch/powerpc/sysdev/dart_iommu.c 2005-12-14 13:07:03.000000000 +1100 @@ -0,0 +1,350 @@ +/* + * arch/powerpc/sysdev/dart_iommu.c + * + * Copyright (C) 2004 Olof Johansson , IBM Corporation + * Copyright (C) 2005 Benjamin Herrenschmidt , + * IBM Corporation + * + * Based on pSeries_iommu.c: + * Copyright (C) 2001 Mike Corrigan & Dave Engebretsen, IBM Corporation + * Copyright (C) 2004 Olof Johansson , IBM Corporation + * + * Dynamic DMA mapping support, Apple U3, U4 & IBM CPC925 "DART" iommu. 
+ * + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "dart.h" + +extern int iommu_force_on; + +/* Physical base address and size of the DART table */ +unsigned long dart_tablebase; /* exported to htab_initialize */ +static unsigned long dart_tablesize; + +/* Virtual base address of the DART table */ +static u32 *dart_vbase; + +/* Mapped base address for the dart */ +static unsigned int *__iomem dart; + +/* Dummy val that entries are set to when unused */ +static unsigned int dart_emptyval; + +static struct iommu_table iommu_table_dart; +static int iommu_table_dart_inited; +static int dart_dirty; +static int dart_is_u4; + +#define DBG(...) + +static inline void dart_tlb_invalidate_all(void) +{ + unsigned long l = 0; + unsigned int reg, inv_bit; + unsigned long limit; + + DBG("dart: flush\n"); + + /* To invalidate the DART, set the DARTCNTL_FLUSHTLB bit in the + * control register and wait for it to clear. + * + * Gotcha: Sometimes, the DART won't detect that the bit gets + * set. If so, clear it and set it again. + */ + + limit = 0; + + inv_bit = dart_is_u4 ? 
DART_CNTL_U4_FLUSHTLB : DART_CNTL_U3_FLUSHTLB; +retry: + l = 0; + reg = DART_IN(DART_CNTL); + reg |= inv_bit; + DART_OUT(DART_CNTL, reg); + + while ((DART_IN(DART_CNTL) & inv_bit) && l < (1L << limit)) + l++; + if (l == (1L << limit)) { + if (limit < 4) { + limit++; + reg = DART_IN(DART_CNTL); + reg &= ~inv_bit; + DART_OUT(DART_CNTL, reg); + goto retry; + } else + panic("DART: TLB did not flush after waiting a long " + "time. Buggy U3 ?"); + } +} + +static void dart_flush(struct iommu_table *tbl) +{ + if (dart_dirty) + dart_tlb_invalidate_all(); + dart_dirty = 0; +} + +static void dart_build(struct iommu_table *tbl, long index, + long npages, unsigned long uaddr, + enum dma_data_direction direction) +{ + unsigned int *dp; + unsigned int rpn; + + DBG("dart: build at: %lx, %lx, addr: %x\n", index, npages, uaddr); + + index <<= DART_PAGE_FACTOR; + npages <<= DART_PAGE_FACTOR; + + dp = ((unsigned int*)tbl->it_base) + index; + + /* On U3, all memory is contigous, so we can move this + * out of the loop. + */ + while (npages--) { + rpn = virt_to_abs(uaddr) >> DART_PAGE_SHIFT; + + *(dp++) = DARTMAP_VALID | (rpn & DARTMAP_RPNMASK); + + rpn++; + uaddr += DART_PAGE_SIZE; + } + + dart_dirty = 1; +} + + +static void dart_free(struct iommu_table *tbl, long index, long npages) +{ + unsigned int *dp; + + /* We don't worry about flushing the TLB cache. The only drawback of + * not doing it is that we won't catch buggy device drivers doing + * bad DMAs, but then no 32-bit architecture ever does either. 
+ */ + + DBG("dart: free at: %lx, %lx\n", index, npages); + + index <<= DART_PAGE_FACTOR; + npages <<= DART_PAGE_FACTOR; + + dp = ((unsigned int *)tbl->it_base) + index; + + while (npages--) + *(dp++) = dart_emptyval; +} + + +static int dart_init(struct device_node *dart_node) +{ + unsigned int i; + unsigned long tmp, base, size; + struct resource r; + + if (dart_tablebase == 0 || dart_tablesize == 0) { + printk(KERN_INFO "DART: table not allocated, using " + "direct DMA\n"); + return -ENODEV; + } + + if (of_address_to_resource(dart_node, 0, &r)) + panic("DART: can't get register base ! "); + + /* Make sure nothing from the DART range remains in the CPU cache + * from a previous mapping that existed before the kernel took + * over + */ + flush_dcache_phys_range(dart_tablebase, + dart_tablebase + dart_tablesize); + + /* Allocate a spare page to map all invalid DART pages. We need to do + * that to work around what looks like a problem with the HT bridge + * prefetching into invalid pages and corrupting data + */ + tmp = lmb_alloc(DART_PAGE_SIZE, DART_PAGE_SIZE); + if (!tmp) + panic("DART: Cannot allocate spare page!"); + dart_emptyval = DARTMAP_VALID | ((tmp >> DART_PAGE_SHIFT) & + DARTMAP_RPNMASK); + + /* Map in DART registers */ + dart = ioremap(r.start, r.end - r.start + 1); + if (dart == NULL) + panic("DART: Cannot map registers!"); + + /* Map in DART table */ + dart_vbase = ioremap(virt_to_abs(dart_tablebase), dart_tablesize); + + /* Fill initial table */ + for (i = 0; i < dart_tablesize/4; i++) + dart_vbase[i] = dart_emptyval; + + /* Initialize DART with table base and enable it. 
*/ + base = dart_tablebase >> DART_PAGE_SHIFT; + size = dart_tablesize >> DART_PAGE_SHIFT; + if (dart_is_u4) { + BUG_ON(size & ~DART_SIZE_U4_SIZE_MASK); + DART_OUT(DART_BASE_U4, base); + DART_OUT(DART_SIZE_U4, size); + DART_OUT(DART_CNTL, DART_CNTL_U4_ENABLE); + } else { + BUG_ON(size & ~DART_CNTL_U3_SIZE_MASK); + DART_OUT(DART_CNTL, + DART_CNTL_U3_ENABLE | + (base << DART_CNTL_U3_BASE_SHIFT) | + (size << DART_CNTL_U3_SIZE_SHIFT)); + } + + /* Invalidate DART to get rid of possible stale TLBs */ + dart_tlb_invalidate_all(); + + printk(KERN_INFO "DART IOMMU initialized for %s type chipset\n", + dart_is_u4 ? "U4" : "U3"); + + return 0; +} + +static void iommu_table_dart_setup(void) +{ + iommu_table_dart.it_busno = 0; + iommu_table_dart.it_offset = 0; + /* it_size is in number of entries */ + iommu_table_dart.it_size = (dart_tablesize / sizeof(u32)) >> DART_PAGE_FACTOR; + + /* Initialize the common IOMMU code */ + iommu_table_dart.it_base = (unsigned long)dart_vbase; + iommu_table_dart.it_index = 0; + iommu_table_dart.it_blocksize = 1; + iommu_init_table(&iommu_table_dart); + + /* Reserve the last page of the DART to avoid possible prefetch + * past the DART mapped area + */ + set_bit(iommu_table_dart.it_size - 1, iommu_table_dart.it_map); +} + +static void iommu_dev_setup_dart(struct pci_dev *dev) +{ + struct device_node *dn; + + /* We only have one iommu table on the mac for now, which makes + * things simple. 
Setup all PCI devices to point to this table + * + * We must use pci_device_to_OF_node() to make sure that + * we get the real "final" pointer to the device in the + * pci_dev sysdata and not the temporary PHB one + */ + dn = pci_device_to_OF_node(dev); + + if (dn) + PCI_DN(dn)->iommu_table = &iommu_table_dart; +} + +static void iommu_bus_setup_dart(struct pci_bus *bus) +{ + struct device_node *dn; + + if (!iommu_table_dart_inited) { + iommu_table_dart_inited = 1; + iommu_table_dart_setup(); + } + + dn = pci_bus_to_OF_node(bus); + + if (dn) + PCI_DN(dn)->iommu_table = &iommu_table_dart; +} + +static void iommu_dev_setup_null(struct pci_dev *dev) { } +static void iommu_bus_setup_null(struct pci_bus *bus) { } + +void iommu_init_early_dart(void) +{ + struct device_node *dn; + + /* Find the DART in the device-tree */ + dn = of_find_compatible_node(NULL, "dart", "u3-dart"); + if (dn == NULL) { + dn = of_find_compatible_node(NULL, "dart", "u4-dart"); + if (dn == NULL) + goto bail; + dart_is_u4 = 1; + } + + /* Setup low level TCE operations for the core IOMMU code */ + ppc_md.tce_build = dart_build; + ppc_md.tce_free = dart_free; + ppc_md.tce_flush = dart_flush; + + /* Initialize the DART HW */ + if (dart_init(dn) == 0) { + ppc_md.iommu_dev_setup = iommu_dev_setup_dart; + ppc_md.iommu_bus_setup = iommu_bus_setup_dart; + + /* Setup pci_dma ops */ + pci_iommu_init(); + + return; + } + + bail: + /* If init failed, use direct iommu and null setup functions */ + ppc_md.iommu_dev_setup = iommu_dev_setup_null; + ppc_md.iommu_bus_setup = iommu_bus_setup_null; + + /* Setup pci_dma ops */ + pci_direct_iommu_init(); +} + + +void __init alloc_dart_table(void) +{ + /* Only reserve DART space if machine has more than 2GB of RAM + * or if requested with iommu=on on cmdline. + */ + if (lmb_end_of_DRAM() <= 0x80000000ull && !iommu_force_on) + return; + + /* 512 pages (2MB) is max DART tablesize. */ + dart_tablesize = 1UL << 21; + /* 16MB (1 << 24) alignment. 
We allocate a full 16Mb chuck since we + * will blow up an entire large page anyway in the kernel mapping + */ + dart_tablebase = (unsigned long) + abs_to_virt(lmb_alloc_base(1UL<<24, 1UL<<24, 0x80000000L)); + + printk(KERN_INFO "DART table allocated at: %lx\n", dart_tablebase); +} Index: linux-work/include/asm-powerpc/iommu.h =================================================================== --- linux-work.orig/include/asm-powerpc/iommu.h 2005-11-24 17:21:41.000000000 +1100 +++ linux-work/include/asm-powerpc/iommu.h 2005-12-14 12:54:36.000000000 +1100 @@ -56,7 +56,7 @@ struct device_node; /* Walks all buses and creates iommu tables */ extern void iommu_setup_pSeries(void); -extern void iommu_setup_u3(void); +extern void iommu_setup_dart(void); /* Frees table for an individual device node */ extern void iommu_free_table(struct device_node *dn); @@ -104,7 +104,7 @@ extern void iommu_unmap_single(struct io extern void iommu_init_early_pSeries(void); extern void iommu_init_early_iSeries(void); -extern void iommu_init_early_u3(void); +extern void iommu_init_early_dart(void); #ifdef CONFIG_PCI extern void pci_iommu_init(void); @@ -113,6 +113,6 @@ extern void pci_direct_iommu_init(void); static inline void pci_iommu_init(void) { } #endif -extern void alloc_u3_dart_table(void); +extern void alloc_dart_table(void); #endif /* _ASM_IOMMU_H */ From mikey at neuling.org Wed Dec 14 14:54:43 2005 From: mikey at neuling.org (Michael Neuling) Date: Wed, 14 Dec 2005 14:54:43 +1100 Subject: [PATCH] powerpc: radeon and CONFIG_PM In-Reply-To: <439F48FD.1090600@us.ltcfwd.linux.ibm.com> References: <439F48FD.1090600@us.ltcfwd.linux.ibm.com> Message-ID: <20051214145443.0025e8f0.mikey@neuling.org> Mike, You may want to repost this patch as it's been line wrapped. Mikey Mike Wolf wrote: > The following comment is in drivers/video/aty/radeon_pm.c > > /* Check if we can power manage on suspend/resume. 
We can do > * D2 on M6, M7 and M9, and we can resume from D3 cold a few other > * "Mac" cards, but that's all. We need more infos about what the > * BIOS does tho. Right now, all this PM stuff is pmac-only for that > * reason. --BenH > */ > > but it didn't check that CONFIG_PPC_PMAC was selected with CONFIG_PM. This > results in build errors when CONFIG_PM is selected and pseries is built. > > Signed-off-by: Mike Wolf > > ====================================================================== > > > --- a/drivers/video/aty/radeon_pm.c 2005-12-08 14:17:45.000000000 +0800 > +++ b/drivers/video/aty/radeon_pm.c 2005-12-08 14:35:57.000000000 +0800 > @@ -2734,7 +2734,7 @@ > * BIOS does tho. Right now, all this PM stuff is pmac-only for that > * reason. --BenH > */ > -#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) > +#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && > defined(CONFIG_PPC_PMAC) > if (_machine == _MACH_Pmac && rinfo->of_node) { > if (rinfo->is_mobility && rinfo->pm_reg && > rinfo->family <= CHIP_FAMILY_RV250) > @@ -2778,12 +2778,12 @@ > OUTREG(TV_DAC_CNTL, INREG(TV_DAC_CNTL) | 0x07000000); > #endif > } > -#endif /* defined(CONFIG_PM) && defined(CONFIG_PPC_OF) */ > +#endif /* defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && > defined(CONFIG_PPC_PMAC) */ > } > > void radeonfb_pm_exit(struct radeonfb_info *rinfo) > { > -#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) > +#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && > defined(CONFIG_PPC_PMAC) > if (rinfo->pm_mode != radeon_pm_none) > pmac_set_early_video_resume(NULL, NULL); > #endif From david at gibson.dropbear.id.au Wed Dec 14 16:08:40 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 14 Dec 2005 16:08:40 +1100 Subject: powerpc: Replace VMALLOCBASE with VMALLOC_START Message-ID: <20051214050840.GB14178@localhost.localdomain>
Paulus, a small cleanup, please apply to the powerpc tree. On ppc64, we independently define VMALLOCBASE and VMALLOC_START to be the same thing: the start of the vmalloc() area at 0xd000000000000000. VMALLOC_START is used much more widely, including in generic code, so this patch gets rid of the extraneous VMALLOCBASE. This does require moving the definitions of region IDs from page_64.h to pgtable.h, but they don't clearly belong in the former rather than the latter, anyway. While we're moving them, clean up the definitions of the REGION_IDs: - Abolish REGION_SIZE, it was only used once, to define REGION_MASK anyway - Define the specific region ids in terms of the REGION_ID() macro. - Define KERNEL_REGION_ID in terms of PAGE_OFFSET rather than KERNELBASE. It amounts to the same thing, but conceptually this is about the region of the linear mapping (which starts at PAGE_OFFSET) rather than of the kernel text itself (which is at KERNELBASE). Signed-off-by: David Gibson Index: working-2.6/arch/powerpc/kernel/lparmap.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/lparmap.c 2005-12-14 15:40:53.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/lparmap.c 2005-12-14 15:47:03.000000000 +1100 @@ -18,8 +18,8 @@ const struct LparMap __attribute__((__se .xEsids = { { .xKernelEsid = GET_ESID(PAGE_OFFSET), .xKernelVsid = KERNEL_VSID(PAGE_OFFSET), }, - { .xKernelEsid = GET_ESID(VMALLOCBASE), - .xKernelVsid = KERNEL_VSID(VMALLOCBASE), }, + { .xKernelEsid = GET_ESID(VMALLOC_START), + .xKernelVsid = KERNEL_VSID(VMALLOC_START), }, }, .xRanges = { Index: working-2.6/arch/powerpc/mm/slb.c =================================================================== --- working-2.6.orig/arch/powerpc/mm/slb.c 2005-12-14 15:40:53.000000000 +1100 +++ working-2.6/arch/powerpc/mm/slb.c 2005-12-14 15:52:47.000000000 +1100 @@ -87,8 +87,8 @@ static void slb_flush_and_rebolt(void) /* Slot 2 - kernel stack */ "slbmte %2,%3\n" "isync" - :: 
"r"(mk_vsid_data(VMALLOCBASE, vflags)), - "r"(mk_esid_data(VMALLOCBASE, 1)), + :: "r"(mk_vsid_data(VMALLOC_START, vflags)), + "r"(mk_esid_data(VMALLOC_START, 1)), "r"(mk_vsid_data(ksp_esid_data, lflags)), "r"(ksp_esid_data) : "memory"); @@ -216,7 +216,7 @@ void slb_initialize(void) create_slbe(PAGE_OFFSET, lflags, 0); /* VMALLOC space has 4K pages always for now */ - create_slbe(VMALLOCBASE, vflags, 1); + create_slbe(VMALLOC_START, vflags, 1); /* We don't bolt the stack for the time being - we're in boot, * so the stack is in the bolted segment. By the time it goes Index: working-2.6/include/asm-powerpc/page_64.h =================================================================== --- working-2.6.orig/include/asm-powerpc/page_64.h 2005-11-29 13:51:33.000000000 +1100 +++ working-2.6/include/asm-powerpc/page_64.h 2005-12-14 15:49:23.000000000 +1100 @@ -25,16 +25,6 @@ */ #define PAGE_FACTOR (PAGE_SHIFT - HW_PAGE_SHIFT) -#define REGION_SIZE 4UL -#define REGION_SHIFT 60UL -#define REGION_MASK (((1UL<> REGION_SHIFT) -#define KERNEL_REGION_ID (KERNELBASE >> REGION_SHIFT) -#define USER_REGION_ID (0UL) -#define REGION_ID(ea) (((unsigned long)(ea)) >> REGION_SHIFT) - /* Segment size */ #define SID_SHIFT 28 #define SID_MASK 0xfffffffffUL Index: working-2.6/include/asm-powerpc/pgtable.h =================================================================== --- working-2.6.orig/include/asm-powerpc/pgtable.h 2005-11-23 15:56:35.000000000 +1100 +++ working-2.6/include/asm-powerpc/pgtable.h 2005-12-14 15:53:37.000000000 +1100 @@ -58,6 +58,17 @@ struct mm_struct; #define IMALLOC_END (VMALLOC_START + PGTABLE_RANGE) /* + * Region IDs + */ +#define REGION_SHIFT 60UL +#define REGION_MASK (0xfUL << REGION_SHIFT) +#define REGION_ID(ea) (((unsigned long)(ea)) >> REGION_SHIFT) + +#define VMALLOC_REGION_ID (REGION_ID(VMALLOC_START)) +#define KERNEL_REGION_ID (REGION_ID(PAGE_OFFSET)) +#define USER_REGION_ID (0UL) + +/* * Common bits in a linux-style PTE. 
These match the bits in the * (hardware-defined) PowerPC PTE as closely as possible. Additional * bits may be defined in pgtable-*.h -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From mjw at us.ibm.com Thu Dec 15 03:22:50 2005 From: mjw at us.ibm.com (Mike Wolf) Date: Wed, 14 Dec 2005 10:22:50 -0600 Subject: [PATCH] powerpc: radeon and CONFIG_PM Message-ID: <43A046DA.6080600@us.ltcfwd.linux.ibm.com> The previous patch was line-wrapped, so I'm going to try this again. The following comment is in drivers/video/aty/radeon_pm.c: /* Check if we can power manage on suspend/resume. We can do * D2 on M6, M7 and M9, and we can resume from D3 cold a few other * "Mac" cards, but that's all. We need more infos about what the * BIOS does tho. Right now, all this PM stuff is pmac-only for that * reason. --BenH */ but the code didn't check that CONFIG_PPC_PMAC was selected along with CONFIG_PM. This results in build errors when CONFIG_PM is selected and pseries is built. Signed-off-by: Mike Wolf ======================================================================================= --- a/drivers/video/aty/radeon_pm.c 2005-12-08 14:17:45.000000000 +0800 +++ b/drivers/video/aty/radeon_pm.c 2005-12-08 14:35:57.000000000 +0800 @@ -2734,7 +2734,7 @@ * BIOS does tho. Right now, all this PM stuff is pmac-only for that * reason.
--BenH */ -#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) +#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && defined(CONFIG_PPC_PMAC) if (_machine == _MACH_Pmac && rinfo->of_node) { if (rinfo->is_mobility && rinfo->pm_reg && rinfo->family <= CHIP_FAMILY_RV250) @@ -2778,12 +2778,12 @@ OUTREG(TV_DAC_CNTL, INREG(TV_DAC_CNTL) | 0x07000000); #endif } -#endif /* defined(CONFIG_PM) && defined(CONFIG_PPC_OF) */ +#endif /* defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && defined(CONFIG_PPC_PMAC) */ } void radeonfb_pm_exit(struct radeonfb_info *rinfo) { -#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) +#if defined(CONFIG_PM) && defined(CONFIG_PPC_OF) && defined(CONFIG_PPC_PMAC) if (rinfo->pm_mode != radeon_pm_none) pmac_set_early_video_resume(NULL, NULL); #endif From arnd at arndb.de Thu Dec 15 05:10:18 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Wed, 14 Dec 2005 19:10:18 +0100 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: <200512102000.20304.arnd@arndb.de> References: <200512102000.20304.arnd@arndb.de> Message-ID: <200512141910.19121.arnd@arndb.de> This patch enables support for the pause(0) power management state of the Cell Broadband Processor, which is important for power-efficient operation. The pervasive infrastructure will in the future enable us to introduce more functionality specific to the Cell's pervasive unit. This version contains more changes according to comments from Milton. More importantly, it now also works on DD2 hardware, after I fixed a bug in the initialization sequence. From: Maximino Aguilar Signed-off-by: Arnd Bergmann --- Paul, please merge this once Milton and Max have both acknowledged the contents. Max, I haven't gotten any reply from you on this patch so far. Please tell us if the changes I have made to your code look right!
Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/Makefile +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile @@ -1,4 +1,6 @@ obj-y += interrupt.o iommu.o setup.o spider-pic.o +obj-y += pervasive.o + obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SPU_FS) += spufs/ spu_base.o builtin-spufs-$(CONFIG_SPU_FS) += spu_syscalls.o Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c @@ -0,0 +1,192 @@ +/* + * CBE Pervasive Monitor and Debug + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * Michael N. Day (mnday at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#undef DEBUG + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pervasive.h" + +struct cbe_pervasive { + struct pmd_regs __iomem *regs; + unsigned int thread; +}; + +/* can't use per_cpu from setup_arch */ +static struct cbe_pervasive cbe_pervasive[NR_CPUS]; + +static void __init cbe_enable_pause_zero(void) +{ + unsigned long thread_switch_control; + unsigned long temp_register; + struct cbe_pervasive *p; + int thread; + + p = &cbe_pervasive[get_cpu()]; + + if (!cbe_pervasive->regs) + return; + + pr_debug("Power Management: CPU %d\n", smp_processor_id()); + + /* Enable Pause(0) control bit */ + temp_register = in_be64(&p->regs->pm_control); + + out_be64(&p->regs->pm_control, + temp_register|PMD_PAUSE_ZERO_CONTROL); + + /* Enable DEC and EE interrupt request */ + thread_switch_control = mfspr(SPRN_TSC_CELL); + thread_switch_control |= TSC_CELL_EE_ENABLE | TSC_CELL_EE_BOOST; + + switch ((mfspr(SPRN_CTRLF) & CTRL_CT)) { + case CTRL_CT0: + thread_switch_control |= TSC_CELL_DEC_ENABLE_0; + thread = 0; + break; + case CTRL_CT1: + thread_switch_control |= TSC_CELL_DEC_ENABLE_1; + thread = 1; + break; + default: + printk(KERN_WARNING "%s: unknown configuration\n", + __FUNCTION__); + thread = -1; + break; + } + + if (p->thread != thread) + printk(KERN_WARNING "%s: device tree inconsistant, " + "cpu %i: %d/%d\n", __FUNCTION__, + smp_processor_id(), + p->thread, thread); + + mtspr(SPRN_TSC_CELL, thread_switch_control); + + put_cpu(); +} + +static void cbe_idle(void) +{ + unsigned long ctrl; + + cbe_enable_pause_zero(); + + while (1) { + if (!need_resched()) { + while (!need_resched()) { + /* go into low thread priority */ + HMT_low(); + + /* go into low power mode */ + local_irq_disable(); + ctrl = mfspr(SPRN_CTRLF); + ctrl &= ~(CTRL_RUNLATCH | CTRL_TE); + mtspr(SPRN_CTRLT, ctrl); + local_irq_enable(); + } + /* restore thread prio */ + HMT_medium(); + } + + ppc64_runlatch_on(); + 
preempt_enable_no_resched(); + schedule(); + preempt_disable(); + } +} + +static int __init cbe_find_pmd_mmio(int cpu, struct cbe_pervasive *p) +{ + struct device_node *node; + unsigned int *int_servers; + char *addr; + unsigned long real_address; + unsigned int size; + + struct pmd_regs __iomem *pmd_mmio_area; + int hardid, thread; + int proplen; + + pmd_mmio_area = NULL; + hardid = get_hard_smp_processor_id(cpu); + for (node = NULL; (node = of_find_node_by_type(node, "cpu"));) { + int_servers = (void *) get_property(node, + "ibm,ppc-interrupt-server#s", &proplen); + if (!int_servers) { + printk(KERN_WARNING "CPU device misses " + "ibm,ppc-interrupt-server#s property"); + continue; + } + for (thread = 0; thread < proplen / sizeof (int); thread++) { + if (hardid == int_servers[thread]) { + addr = get_property(node, "pervasive", NULL); + goto found; + } + } + } + + printk(KERN_WARNING "%s: CPU %d not found\n", __FUNCTION__, cpu); + return -EINVAL; + +found: + real_address = *(unsigned long*) addr; + addr += sizeof (unsigned long); + size = *(unsigned int*) addr; + + pr_debug("pervasive area for CPU %d at %lx, size %x\n", + cpu, real_address, size); + p->regs = __ioremap(real_address, size, _PAGE_NO_CACHE); + p->thread = thread; + return 0; +} + +void __init cell_pervasive_init(void) +{ + struct cbe_pervasive *p; + int cpu; + int ret; + + if (!cpu_has_feature(CPU_FTR_PAUSE_ZERO)) + return; + + for_each_cpu(cpu) { + p = &cbe_pervasive[cpu]; + ret = cbe_find_pmd_mmio(cpu, p); + if (ret) + return; + } + + ppc_md.idle_loop = cbe_idle; +} Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h @@ -0,0 +1,62 @@ +/* + * Cell Pervasive Monitor and Debug interface and HW structures + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * David J. 
Erb (djerb at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + + +#ifndef PERVASIVE_H +#define PERVASIVE_H + +struct pmd_regs { + u8 pad_0x0000_0x0800[0x0800 - 0x0000]; /* 0x0000 */ + + /* Thermal Sensor Registers */ + u64 ts_ctsr1; /* 0x0800 */ + u64 ts_ctsr2; /* 0x0808 */ + u64 ts_mtsr1; /* 0x0810 */ + u64 ts_mtsr2; /* 0x0818 */ + u64 ts_itr1; /* 0x0820 */ + u64 ts_itr2; /* 0x0828 */ + u64 ts_gitr; /* 0x0830 */ + u64 ts_isr; /* 0x0838 */ + u64 ts_imr; /* 0x0840 */ + u64 tm_cr1; /* 0x0848 */ + u64 tm_cr2; /* 0x0850 */ + u64 tm_simr; /* 0x0858 */ + u64 tm_tpr; /* 0x0860 */ + u64 tm_str1; /* 0x0868 */ + u64 tm_str2; /* 0x0870 */ + u64 tm_tsr; /* 0x0878 */ + + /* Power Management */ + u64 pm_control; /* 0x0880 */ +#define PMD_PAUSE_ZERO_CONTROL 0x10000 + u64 pm_status; /* 0x0888 */ + + /* Time Base Register */ + u64 tbr; /* 0x0890 */ + + u8 pad_0x0898_0x1000 [0x1000 - 0x0898]; /* 0x0898 */ +}; + +void __init cell_pervasive_init(void); + +#endif Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/setup.c +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c @@ -49,6 +49,7 @@ #include "interrupt.h" #include "iommu.h" +#include "pervasive.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -165,6 +166,7 @@ static void __init cell_setup_arch(void) init_pci_config_tokens(); find_and_init_phbs(); spider_init_IRQ(); + cell_pervasive_init(); #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif Index: linux-2.6.15-rc/arch/powerpc/kernel/head_64.S =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/head_64.S +++ linux-2.6.15-rc/arch/powerpc/kernel/head_64.S @@ -401,7 +401,7 @@ label##_common: \ .globl __start_interrupts __start_interrupts: - STD_EXCEPTION_PSERIES(0x100, system_reset) + STD_EXCEPTION_PSERIES(0x100, system_reset_check) . = 0x200 _machine_check_pSeries: @@ -880,6 +880,28 @@ unrecov_fer: bl .unrecoverable_exception b 1b +/* This is a new system reset handler for the BE processor. + * SRR1 stores wake information that must be decoded to determine why + * the processor was at the system reset handler. + */ + + .align 7 + .globl system_reset_check_common +system_reset_check_common: +BEGIN_FTR_SECTION + mr r22,r12 /* r12 has SRR1 saved */ + srwi r22,r22,16 + andi. 
r22,r22,MSR_WAKEMASK + cmpwi r22,MSR_WAKEEE + beq hardware_interrupt_common + cmpwi r22,MSR_WAKEDEC + beq decrementer_common + cmpwi r22,MSR_WAKEMT + bne system_reset_common +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO) + EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN); + b fast_exception_return + /* * Here r13 points to the paca, r9 contains the saved CR, * SRR0 and SRR1 are saved in r11 and r12, Index: linux-2.6.15-rc/include/asm-powerpc/cputable.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/cputable.h +++ linux-2.6.15-rc/include/asm-powerpc/cputable.h @@ -106,6 +106,7 @@ extern void do_cpu_ftr_fixups(unsigned l #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0000040000000000) #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0000100000000000) +#define CPU_FTR_PAUSE_ZERO ASM_CONST(0x0000200000000000) #else /* ensure on 32b processors the flags are available for compiling but * don't do anything */ @@ -305,7 +306,8 @@ enum { CPU_FTR_MMCRA_SIHV, CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | - CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT, + CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | + CPU_FTR_CTRL | CPU_FTR_PAUSE_ZERO, CPU_FTRS_COMPATIBLE = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2, #endif Index: linux-2.6.15-rc/include/asm-powerpc/reg.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/reg.h +++ linux-2.6.15-rc/include/asm-powerpc/reg.h @@ -92,6 +92,15 @@ #define MSR_RI __MASK(MSR_RI_LG) /* Recoverable Exception */ #define MSR_LE __MASK(MSR_LE_LG) /* Little Endian */ +/* Wake Events */ +#define MSR_WAKEMASK 0x0038 +#define MSR_WAKERESET 0x0038 +#define MSR_WAKESYSERR 0x0030 +#define MSR_WAKEEE 0x0020 +#define MSR_WAKEMT 0x0028 +#define MSR_WAKEDEC 0x0018 +#define MSR_WAKETHERM 0x0010 + 
#ifdef CONFIG_PPC64 #define MSR_ MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_ISF #define MSR_KERNEL MSR_ | MSR_SF | MSR_HV @@ -145,6 +154,10 @@ #define SPRN_CTR 0x009 /* Count Register */ #define SPRN_CTRLF 0x088 #define SPRN_CTRLT 0x098 +#define CTRL_CT 0xc0000000 /* current thread */ +#define CTRL_CT0 0x80000000 /* thread 0 */ +#define CTRL_CT1 0x40000000 /* thread 1 */ +#define CTRL_TE 0x00c00000 /* thread enable */ #define CTRL_RUNLATCH 0x1 #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ #define DABR_TRANSLATION (1UL << 2) @@ -257,11 +270,11 @@ #define SPRN_HID6 0x3F9 /* BE HID 6 */ #define HID6_LB (0x0F<<12) /* Concurrent Large Page Modes */ #define HID6_DLP (1<<20) /* Disable all large page modes (4K only) */ -#define SPRN_TSCR 0x399 /* Thread switch control on BE */ -#define SPRN_TTR 0x39A /* Thread switch timeout on BE */ -#define TSCR_DEC_ENABLE 0x200000 /* Decrementer Interrupt */ -#define TSCR_EE_ENABLE 0x100000 /* External Interrupt */ -#define TSCR_EE_BOOST 0x080000 /* External Interrupt Boost */ +#define SPRN_TSC_CELL 0x399 /* Thread switch control on Cell */ +#define TSC_CELL_DEC_ENABLE_0 0x400000 /* Decrementer Interrupt */ +#define TSC_CELL_DEC_ENABLE_1 0x200000 /* Decrementer Interrupt */ +#define TSC_CELL_EE_ENABLE 0x100000 /* External Interrupt */ +#define TSC_CELL_EE_BOOST 0x080000 /* External Interrupt Boost */ #define SPRN_TSC 0x3FD /* Thread switch control on others */ #define SPRN_TST 0x3FC /* Thread switch timeout on others */ #if !defined(SPRN_IAC1) && !defined(SPRN_IAC2) Index: linux-2.6.15-rc/arch/powerpc/kernel/cputable.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/cputable.c +++ linux-2.6.15-rc/arch/powerpc/kernel/cputable.c @@ -273,7 +273,7 @@ struct cpu_spec cpu_specs[] = { .oprofile_model = &op_model_power4, #endif }, - { /* BE DD1.x */ + { /* Cell Broadband Engine */ .pvr_mask = 0xffff0000, .pvr_value = 0x00700000, .cpu_name = "Cell Broadband Engine", 
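For readers following the head_64.S hunk of this patch, here is a minimal C sketch of the decision that system_reset_check_common makes in assembly when CPU_FTR_PAUSE_ZERO is set: shift the saved SRR1 right by 16, mask it with MSR_WAKEMASK, and branch on the result. The MSR_WAKE* values are copied from the reg.h hunk. The enum and the function names srr1_wake_field() and classify_wake() are hypothetical; they exist only for this illustration and are not part of the patch.

```c
#include <stdint.h>

/* Wake-event codes, copied from the reg.h hunk of the patch. */
#define MSR_WAKEMASK   0x0038
#define MSR_WAKERESET  0x0038
#define MSR_WAKESYSERR 0x0030
#define MSR_WAKEEE     0x0020
#define MSR_WAKEMT     0x0028
#define MSR_WAKEDEC    0x0018
#define MSR_WAKETHERM  0x0010

/* Hypothetical labels for the four paths taken by the assembly. */
enum wake_action {
	WAKE_ACT_EXTERNAL_IRQ,	/* beq hardware_interrupt_common */
	WAKE_ACT_DECREMENTER,	/* beq decrementer_common */
	WAKE_ACT_RESUME,	/* falls through to fast_exception_return */
	WAKE_ACT_SYSTEM_RESET	/* bne system_reset_common */
};

/*
 * Extract the wake field the way the handler does: srwi shifts the
 * low 32 bits right by 16, andi. then masks with MSR_WAKEMASK.
 */
static inline unsigned int srr1_wake_field(uint64_t srr1)
{
	return ((uint32_t)srr1 >> 16) & MSR_WAKEMASK;
}

/* Map the wake field onto the branch the handler would take. */
static inline enum wake_action classify_wake(uint64_t srr1)
{
	switch (srr1_wake_field(srr1)) {
	case MSR_WAKEEE:
		return WAKE_ACT_EXTERNAL_IRQ;
	case MSR_WAKEDEC:
		return WAKE_ACT_DECREMENTER;
	case MSR_WAKEMT:
		return WAKE_ACT_RESUME;
	default:
		/* reset, system error, thermal and anything else */
		return WAKE_ACT_SYSTEM_RESET;
	}
}
```

For example, classify_wake(0x00200000) gives WAKE_ACT_EXTERNAL_IRQ, because (0x00200000 >> 16) & 0x38 is 0x20, which is MSR_WAKEEE. The cast to uint32_t is there to mirror srwi, which only looks at the low word of the register.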
From miltonm at bga.com Thu Dec 15 07:25:41 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 14 Dec 2005 14:25:41 -0600 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: References: Message-ID: <048f09f7206105d6082904ae787cf9c4@bga.com> I should have mentioned this before: Does cbe_enable_pause_zero need locking between the two threads of a given cpu? milton From maguilar at us.ibm.com Thu Dec 15 07:13:02 2005 From: maguilar at us.ibm.com (Max Aguilar) Date: Wed, 14 Dec 2005 14:13:02 -0600 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: <200512141910.19121.arnd@arndb.de> Message-ID: This looks good to me, thanks for looking over the code. Max Aguilar Linux Kernel/Bootloader/Bring-Up STI Design Center (512) 838-5704 T/L 678-5704 maguilar at us.ibm.com Arnd Bergmann on 12/14/2005 12:10:18 PM To: linuxppc64-dev at ozlabs.org cc: Milton Miller , Paul Mackerras , Max Aguilar/Austin/IBM at IBMUS Subject: [PATCH ] cell: enable pause(0) in cpu_idle This patch enables support for the pause(0) power management state of the Cell Broadband Processor, which is important for power-efficient operation. The pervasive infrastructure will in the future enable us to introduce more functionality specific to the Cell's pervasive unit. This version contains more changes according to comments from Milton. More importantly, it now also works on DD2 hardware, after I fixed a bug in the initialization sequence. From: Maximino Aguilar Signed-off-by: Arnd Bergmann --- Paul, please merge this once Milton and Max have both acknowledged the contents. Max, I haven't gotten any reply from you on this patch so far. Please tell us if the changes I have made to your code look right!
Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/Makefile +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile @@ -1,4 +1,6 @@ obj-y += interrupt.o iommu.o setup.o spider-pic.o +obj-y += pervasive.o + obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SPU_FS) += spufs/ spu_base.o builtin-spufs-$(CONFIG_SPU_FS) += spu_syscalls.o Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c @@ -0,0 +1,192 @@ +/* + * CBE Pervasive Monitor and Debug + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * Michael N. Day (mnday at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#undef DEBUG + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pervasive.h" + +struct cbe_pervasive { + struct pmd_regs __iomem *regs; + unsigned int thread; +}; + +/* can't use per_cpu from setup_arch */ +static struct cbe_pervasive cbe_pervasive[NR_CPUS]; + +static void __init cbe_enable_pause_zero(void) +{ + unsigned long thread_switch_control; + unsigned long temp_register; + struct cbe_pervasive *p; + int thread; + + p = &cbe_pervasive[get_cpu()]; + + if (!cbe_pervasive->regs) + return; + + pr_debug("Power Management: CPU %d\n", smp_processor_id()); + + /* Enable Pause(0) control bit */ + temp_register = in_be64(&p->regs->pm_control); + + out_be64(&p->regs->pm_control, + temp_register|PMD_PAUSE_ZERO_CONTROL); + + /* Enable DEC and EE interrupt request */ + thread_switch_control = mfspr(SPRN_TSC_CELL); + thread_switch_control |= TSC_CELL_EE_ENABLE | TSC_CELL_EE_BOOST; + + switch ((mfspr(SPRN_CTRLF) & CTRL_CT)) { + case CTRL_CT0: + thread_switch_control |= TSC_CELL_DEC_ENABLE_0; + thread = 0; + break; + case CTRL_CT1: + thread_switch_control |= TSC_CELL_DEC_ENABLE_1; + thread = 1; + break; + default: + printk(KERN_WARNING "%s: unknown configuration\n", + __FUNCTION__); + thread = -1; + break; + } + + if (p->thread != thread) + printk(KERN_WARNING "%s: device tree inconsistant, " + "cpu %i: %d/%d\n", __FUNCTION__, + smp_processor_id(), + p->thread, thread); + + mtspr(SPRN_TSC_CELL, thread_switch_control); + + put_cpu(); +} + +static void cbe_idle(void) +{ + unsigned long ctrl; + + cbe_enable_pause_zero(); + + while (1) { + if (!need_resched()) { + while (!need_resched()) { + /* go into low thread priority */ + HMT_low(); + + /* go into low power mode */ + local_irq_disable(); + ctrl = mfspr(SPRN_CTRLF); + ctrl &= ~(CTRL_RUNLATCH | CTRL_TE); + mtspr(SPRN_CTRLT, ctrl); + local_irq_enable(); + } + /* restore thread prio */ + HMT_medium(); + } + + ppc64_runlatch_on(); + 
preempt_enable_no_resched(); + schedule(); + preempt_disable(); + } +} + +static int __init cbe_find_pmd_mmio(int cpu, struct cbe_pervasive *p) +{ + struct device_node *node; + unsigned int *int_servers; + char *addr; + unsigned long real_address; + unsigned int size; + + struct pmd_regs __iomem *pmd_mmio_area; + int hardid, thread; + int proplen; + + pmd_mmio_area = NULL; + hardid = get_hard_smp_processor_id(cpu); + for (node = NULL; (node = of_find_node_by_type(node, "cpu"));) { + int_servers = (void *) get_property(node, + "ibm,ppc-interrupt-server#s", &proplen); + if (!int_servers) { + printk(KERN_WARNING "CPU device misses " + "ibm,ppc-interrupt-server#s property"); + continue; + } + for (thread = 0; thread < proplen / sizeof (int); thread++) { + if (hardid == int_servers[thread]) { + addr = get_property(node, "pervasive", NULL); + goto found; + } + } + } + + printk(KERN_WARNING "%s: CPU %d not found\n", __FUNCTION__, cpu); + return -EINVAL; + +found: + real_address = *(unsigned long*) addr; + addr += sizeof (unsigned long); + size = *(unsigned int*) addr; + + pr_debug("pervasive area for CPU %d at %lx, size %x\n", + cpu, real_address, size); + p->regs = __ioremap(real_address, size, _PAGE_NO_CACHE); + p->thread = thread; + return 0; +} + +void __init cell_pervasive_init(void) +{ + struct cbe_pervasive *p; + int cpu; + int ret; + + if (!cpu_has_feature(CPU_FTR_PAUSE_ZERO)) + return; + + for_each_cpu(cpu) { + p = &cbe_pervasive[cpu]; + ret = cbe_find_pmd_mmio(cpu, p); + if (ret) + return; + } + + ppc_md.idle_loop = cbe_idle; +} Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h @@ -0,0 +1,62 @@ +/* + * Cell Pervasive Monitor and Debug interface and HW structures + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * David J. 
Erb (djerb at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + + +#ifndef PERVASIVE_H +#define PERVASIVE_H + +struct pmd_regs { + u8 pad_0x0000_0x0800[0x0800 - 0x0000]; /* 0x0000 */ + + /* Thermal Sensor Registers */ + u64 ts_ctsr1; /* 0x0800 */ + u64 ts_ctsr2; /* 0x0808 */ + u64 ts_mtsr1; /* 0x0810 */ + u64 ts_mtsr2; /* 0x0818 */ + u64 ts_itr1; /* 0x0820 */ + u64 ts_itr2; /* 0x0828 */ + u64 ts_gitr; /* 0x0830 */ + u64 ts_isr; /* 0x0838 */ + u64 ts_imr; /* 0x0840 */ + u64 tm_cr1; /* 0x0848 */ + u64 tm_cr2; /* 0x0850 */ + u64 tm_simr; /* 0x0858 */ + u64 tm_tpr; /* 0x0860 */ + u64 tm_str1; /* 0x0868 */ + u64 tm_str2; /* 0x0870 */ + u64 tm_tsr; /* 0x0878 */ + + /* Power Management */ + u64 pm_control; /* 0x0880 */ +#define PMD_PAUSE_ZERO_CONTROL 0x10000 + u64 pm_status; /* 0x0888 */ + + /* Time Base Register */ + u64 tbr; /* 0x0890 */ + + u8 pad_0x0898_0x1000 [0x1000 - 0x0898]; /* 0x0898 */ +}; + +void __init cell_pervasive_init(void); + +#endif Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/setup.c +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c @@ -49,6 +49,7 @@ #include "interrupt.h" #include "iommu.h" +#include "pervasive.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -165,6 +166,7 @@ static void __init cell_setup_arch(void) init_pci_config_tokens(); find_and_init_phbs(); spider_init_IRQ(); + cell_pervasive_init(); #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif Index: linux-2.6.15-rc/arch/powerpc/kernel/head_64.S =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/head_64.S +++ linux-2.6.15-rc/arch/powerpc/kernel/head_64.S @@ -401,7 +401,7 @@ label##_common: \ .globl __start_interrupts __start_interrupts: - STD_EXCEPTION_PSERIES(0x100, system_reset) + STD_EXCEPTION_PSERIES(0x100, system_reset_check) . = 0x200 _machine_check_pSeries: @@ -880,6 +880,28 @@ unrecov_fer: bl .unrecoverable_exception b 1b +/* This is a new system reset handler for the BE processor. + * SRR1 stores wake information that must be decoded to determine why + * the processor was at the system reset handler. + */ + + .align 7 + .globl system_reset_check_common +system_reset_check_common: +BEGIN_FTR_SECTION + mr r22,r12 /* r12 has SRR1 saved */ + srwi r22,r22,16 + andi. 
r22,r22,MSR_WAKEMASK + cmpwi r22,MSR_WAKEEE + beq hardware_interrupt_common + cmpwi r22,MSR_WAKEDEC + beq decrementer_common + cmpwi r22,MSR_WAKEMT + bne system_reset_common +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO) + EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN); + b fast_exception_return + /* * Here r13 points to the paca, r9 contains the saved CR, * SRR0 and SRR1 are saved in r11 and r12, Index: linux-2.6.15-rc/include/asm-powerpc/cputable.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/cputable.h +++ linux-2.6.15-rc/include/asm-powerpc/cputable.h @@ -106,6 +106,7 @@ extern void do_cpu_ftr_fixups(unsigned l #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0000040000000000) #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0000100000000000) +#define CPU_FTR_PAUSE_ZERO ASM_CONST(0x0000200000000000) #else /* ensure on 32b processors the flags are available for compiling but * don't do anything */ @@ -305,7 +306,8 @@ enum { CPU_FTR_MMCRA_SIHV, CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | - CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT, + CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | + CPU_FTR_CTRL | CPU_FTR_PAUSE_ZERO, CPU_FTRS_COMPATIBLE = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2, #endif Index: linux-2.6.15-rc/include/asm-powerpc/reg.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/reg.h +++ linux-2.6.15-rc/include/asm-powerpc/reg.h @@ -92,6 +92,15 @@ #define MSR_RI __MASK(MSR_RI_LG) /* Recoverable Exception */ #define MSR_LE __MASK(MSR_LE_LG) /* Little Endian */ +/* Wake Events */ +#define MSR_WAKEMASK 0x0038 +#define MSR_WAKERESET 0x0038 +#define MSR_WAKESYSERR 0x0030 +#define MSR_WAKEEE 0x0020 +#define MSR_WAKEMT 0x0028 +#define MSR_WAKEDEC 0x0018 +#define MSR_WAKETHERM 0x0010 + 
#ifdef CONFIG_PPC64 #define MSR_ MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_ISF #define MSR_KERNEL MSR_ | MSR_SF | MSR_HV @@ -145,6 +154,10 @@ #define SPRN_CTR 0x009 /* Count Register */ #define SPRN_CTRLF 0x088 #define SPRN_CTRLT 0x098 +#define CTRL_CT 0xc0000000 /* current thread */ +#define CTRL_CT0 0x80000000 /* thread 0 */ +#define CTRL_CT1 0x40000000 /* thread 1 */ +#define CTRL_TE 0x00c00000 /* thread enable */ #define CTRL_RUNLATCH 0x1 #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ #define DABR_TRANSLATION (1UL << 2) @@ -257,11 +270,11 @@ #define SPRN_HID6 0x3F9 /* BE HID 6 */ #define HID6_LB (0x0F<<12) /* Concurrent Large Page Modes */ #define HID6_DLP (1<<20) /* Disable all large page modes (4K only) */ -#define SPRN_TSCR 0x399 /* Thread switch control on BE */ -#define SPRN_TTR 0x39A /* Thread switch timeout on BE */ -#define TSCR_DEC_ENABLE 0x200000 /* Decrementer Interrupt */ -#define TSCR_EE_ENABLE 0x100000 /* External Interrupt */ -#define TSCR_EE_BOOST 0x080000 /* External Interrupt Boost */ +#define SPRN_TSC_CELL 0x399 /* Thread switch control on Cell */ +#define TSC_CELL_DEC_ENABLE_0 0x400000 /* Decrementer Interrupt */ +#define TSC_CELL_DEC_ENABLE_1 0x200000 /* Decrementer Interrupt */ +#define TSC_CELL_EE_ENABLE 0x100000 /* External Interrupt */ +#define TSC_CELL_EE_BOOST 0x080000 /* External Interrupt Boost */ #define SPRN_TSC 0x3FD /* Thread switch control on others */ #define SPRN_TST 0x3FC /* Thread switch timeout on others */ #if !defined(SPRN_IAC1) && !defined(SPRN_IAC2) Index: linux-2.6.15-rc/arch/powerpc/kernel/cputable.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/cputable.c +++ linux-2.6.15-rc/arch/powerpc/kernel/cputable.c @@ -273,7 +273,7 @@ struct cpu_spec cpu_specs[] = { .oprofile_model = &op_model_power4, #endif }, - { /* BE DD1.x */ + { /* Cell Broadband Engine */ .pvr_mask = 0xffff0000, .pvr_value = 0x00700000, .cpu_name = "Cell Broadband Engine", 
From paulus at samba.org Thu Dec 15 11:53:50 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 15 Dec 2005 11:53:50 +1100 Subject: [PATCH]: powerpc: hugepage compile break in latest 2.6.15-rc5-mm2 In-Reply-To: <20051212172828.GC10037@austin.ibm.com> References: <20051212172828.GC10037@austin.ibm.com> Message-ID: <17312.48798.189816.509391@cargo.ozlabs.ibm.com> linas writes: > Hugepage doesn't compile out-of-the-box on linux-2.6.15-rc5-mm2. > This patch fixes the compile breakage, but the patch authors may want > to double-check that the rest of the patch series has applied correctly. ... > { > struct slb_flush_info fi; > unsigned long i; > - struct slb_flush_info fi; Hmmm, the powerpc.git tree doesn't seem to have this duplicate declaration... Paul. From paulus at samba.org Thu Dec 15 12:48:20 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 15 Dec 2005 12:48:20 +1100 Subject: [PATCH] powerpc: radeon and CONFIG_PM In-Reply-To: <43A046DA.6080600@us.ltcfwd.linux.ibm.com> References: <43A046DA.6080600@us.ltcfwd.linux.ibm.com> Message-ID: <17312.52068.907243.380290@cargo.ozlabs.ibm.com> Mike Wolf writes: > Previous patch was line wrapped, so I'm going to try this again. Still line-wrapped I'm afraid, and all the tabs have turned to spaces. Paul.
From dwg at au1.ibm.com Thu Dec 15 13:17:26 2005 From: dwg at au1.ibm.com (David Gibson) Date: Thu, 15 Dec 2005 13:17:26 +1100 Subject: [PATCH]: powerpc: hugepage compile break in latest 2.6.15-rc5-mm2 In-Reply-To: <17312.48798.189816.509391@cargo.ozlabs.ibm.com> References: <20051212172828.GC10037@austin.ibm.com> <17312.48798.189816.509391@cargo.ozlabs.ibm.com> Message-ID: <20051215021726.GA14228@localhost.localdomain> On Thu, Dec 15, 2005 at 11:53:50AM +1100, Paul Mackerras wrote: > linas writes: > > > Hugepage doesn't compile out-of-the-box on linux-2.6.15-rc5-mm2. > > This patch fixes the compile breakage, but the patch authors may want > > to double-check that the rest of the patch series has applied correctly. > > ... > > > { > > struct slb_flush_info fi; > > unsigned long i; > > - struct slb_flush_info fi; > > Hmmm, the powerpc.git tree doesn't seem to have this duplicate > declaration... It wouldn't, it's in -mm only, and I don't know how it got there, the patch which added the line is correct. I think it must be patch getting horribly confused by something further down the series. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From anton at samba.org Thu Dec 15 11:57:18 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 15 Dec 2005 11:57:18 +1100 Subject: [RFC] Should lmb_alloc() always panic on failure? In-Reply-To: <200512061739.06783.michael@ellerman.id.au> References: <200512061739.06783.michael@ellerman.id.au> Message-ID: <20051215005718.GA4053@krispykreme> Hi, > Currently lmb_alloc(_base) returns 0 if it can't allocate memory, but > a lot of places don't actually check. I was thinking it might be > better if it just panicked. Sounds reasonable to me.
> The only other caller is careful_allocation(), which checks and retries the > alloc with different parameters - we could accomodate this with an > __lmb_alloc() or similar. Yeah in that case we are attempting to get node local memory but if that fails we can go off node. We sometimes have to create 0 size zones (eg if there is PCI or CPUs and no memory in the node), so that case can hit. Anton From paulus at samba.org Thu Dec 15 17:43:59 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 15 Dec 2005 17:43:59 +1100 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: <200512141910.19121.arnd@arndb.de> References: <200512102000.20304.arnd@arndb.de> <200512141910.19121.arnd@arndb.de> Message-ID: <17313.4271.637459.688763@cargo.ozlabs.ibm.com> Arnd Bergmann writes: > This patch enables support for pause(0) power management state > for the Cell Broadband Processor, which is import for power efficient > operation. The pervasive infrastructure will in the future enable > us to introduce more functionality specific to the Cell's > pervasive unit. I put this in, but then took a closer look at this and reverted the commit: > +/* This is a new system reset handler for the BE processor. > + * SRR1 stores wake information that must be decoded to determine why > + * the processor was at the system reset handler. > + */ > + > + .align 7 > + .globl system_reset_check_common > +system_reset_check_common: > +BEGIN_FTR_SECTION > + mr r22,r12 /* r12 has SRR1 saved */ > + srwi r22,r22,16 > + andi. r22,r22,MSR_WAKEMASK > + cmpwi r22,MSR_WAKEEE > + beq hardware_interrupt_common > + cmpwi r22,MSR_WAKEDEC > + beq decrementer_common > + cmpwi r22,MSR_WAKEMT > + bne system_reset_common Why are you trashing r22 here? It hasn't been saved. You probably should use r10 instead, which has been saved. > +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO) > + EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN); > + b fast_exception_return This is a change in behaviour from before the patch. 
Instead of doing the EXCEPTION_PROLOG_COMMON and then just returning, you should branch to system_reset_common. Paul. From miltonm at bga.com Fri Dec 16 02:26:34 2005 From: miltonm at bga.com (Milton Miller) Date: Thu, 15 Dec 2005 09:26:34 -0600 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: <17313.4271.637459.688763@cargo.ozlabs.ibm.com> References: <200512102000.20304.arnd@arndb.de> <200512141910.19121.arnd@arndb.de> <17313.4271.637459.688763@cargo.ozlabs.ibm.com> Message-ID: <3a267747328be451274c52ca5721c5ec@bga.com> On Dec 15, 2005, at 12:43 AM, Paul Mackerras wrote: > Arnd Bergmann writes: > >> This patch enables support for pause(0) power management state >> for the Cell Broadband Processor, which is import for power efficient >> operation. The pervasive infrastructure will in the future enable >> us to introduce more functionality specific to the Cell's >> pervasive unit. > > I put this in, but then took a closer look at this and reverted the > commit: > >> +/* This is a new system reset handler for the BE processor. >> + * SRR1 stores wake information that must be decoded to determine why >> + * the processor was at the system reset handler. >> + */ >> + >> + .align 7 >> + .globl system_reset_check_common >> +system_reset_check_common: >> +BEGIN_FTR_SECTION >> + mr r22,r12 /* r12 has SRR1 saved */ >> + srwi r22,r22,16 >> + andi. r22,r22,MSR_WAKEMASK >> + cmpwi r22,MSR_WAKEEE >> + beq hardware_interrupt_common >> + cmpwi r22,MSR_WAKEDEC >> + beq decrementer_common >> + cmpwi r22,MSR_WAKEMT >> + bne system_reset_common > > Why are you trashing r22 here? It hasn't been saved. You probably > should use r10 instead, which has been saved. 
I'm just guessing here that they missed the update on http://git.kernel.org/git/?p=linux/kernel/git/torvalds/old-2.6- bkcvs.git;a=commitdiff;h=d847914c00b6cef7ad40816a192828cc55fee30d [PATCH] ppc64: Optimize exception/syscall entry/exit and I missed it because I just looked at the patch and have looked at the old code too much. > >> +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO) >> + EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN); >> + b fast_exception_return > > This is a change in behaviour from before the patch. Instead of doing > the EXCEPTION_PROLOG_COMMON and then just returning, you should branch > to system_reset_common. > Actually, this is not the case, as this is only for the MSR_WAKEMT case -- they put a bne to the old handler. I'll raise the question from my first review again: would you rather see the STD_EXCEPTION_COMMON macro split for system_reset_common or leave this extra branch to the second handler? I didn't push because this exception should be rare for most processors. milton From arnd at arndb.de Fri Dec 16 02:51:52 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Thu, 15 Dec 2005 16:51:52 +0100 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: <3a267747328be451274c52ca5721c5ec@bga.com> References: <17313.4271.637459.688763@cargo.ozlabs.ibm.com> <3a267747328be451274c52ca5721c5ec@bga.com> Message-ID: <200512151651.54407.arnd@arndb.de> On Dunnersdag 15 Dezember 2005 16:26, Milton Miller wrote: > On Dec 15, 2005, at 12:43 AM, Paul Mackerras wrote: > > Why are you trashing r22 here? It hasn't been saved. You probably > > should use r10 instead, which has been saved. > > I'm just guessing here that they missed the update on > > http://git.kernel.org/git/?p=linux/kernel/git/torvalds/old-2.6- > bkcvs.git;a=commitdiff;h=d847914c00b6cef7ad40816a192828cc55fee30d > [PATCH] ppc64: Optimize exception/syscall entry/exit > > and I missed it because I just looked at the patch and have looked at > the old code too much. Yes, that's right. 
I never really looked at that code because I did not understand all of what Max did there, and Max probably did not follow the commits. I've fixed it now, just need to test it again.

> >> +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO)
> >> +	EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN);
> >> +	b fast_exception_return
> >
> > This is a change in behaviour from before the patch. Instead of doing
> > the EXCEPTION_PROLOG_COMMON and then just returning, you should branch
> > to system_reset_common.
> >
>
> Actually, this is not the case, as this is only for the MSR_WAKEMT
> case -- they put a bne to the old handler.

No, The change that Paul is referring to is for the case where CPU_FTR_PAUSE_ZERO is not set. I broke this when I tried to simplify the branches according to your comments. I now have in there

BEGIN_FTR_SECTION
	mr	r10,r12		/* r12 has SRR1 saved */
	srwi	r10,r10,16
	andi.	r10,r10,MSR_WAKEMASK
	cmpwi	r10,MSR_WAKEEE
	beq	hardware_interrupt_common
	cmpwi	r10,MSR_WAKEDEC
	beq	decrementer_common
	cmpwi	r10,MSR_WAKEMT
	bne	system_reset_common
	EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN);
	b	fast_exception_return
END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO)
	b	system_reset_common

Is there a simpler way to express this?

	Arnd

From miltonm at bga.com Fri Dec 16 05:30:33 2005
From: miltonm at bga.com (Milton Miller)
Date: Thu, 15 Dec 2005 12:30:33 -0600
Subject: [PATCH ] cell: enable pause(0) in cpu_idle
In-Reply-To: <200512151651.54407.arnd@arndb.de>
References: <17313.4271.637459.688763@cargo.ozlabs.ibm.com> <3a267747328be451274c52ca5721c5ec@bga.com> <200512151651.54407.arnd@arndb.de>
Message-ID: <2ad4a31d7613e4496d1f71bc642dab84@bga.com>

On Dec 15, 2005, at 9:51 AM, Arnd Bergmann wrote:

> On Dunnersdag 15 Dezember 2005 16:26, Milton Miller wrote:
>> On Dec 15, 2005, at 12:43 AM, Paul Mackerras wrote:
>
>>>> +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO)
>>>> +	EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN);
>>>> +	b fast_exception_return
>>>
>>> This is a change in behaviour from before the patch. Instead of doing
>>> the EXCEPTION_PROLOG_COMMON and then just returning, you should branch
>>> to system_reset_common.
>>>
>>
>> Actually, this is not the case, as this is only for the MSR_WAKEMT
>> case -- they put a bne to the old handler.
>
> No, The change that Paul is referring to is for the case where
> CPU_FTR_PAUSE_ZERO is not set. I broke this when I tried to simplify
> the branches according to your comments. I now have in there
>
> BEGIN_FTR_SECTION
> 	mr	r10,r12		/* r12 has SRR1 saved */
> 	srwi	r10,r10,16
> 	andi.	r10,r10,MSR_WAKEMASK
> 	cmpwi	r10,MSR_WAKEEE
> 	beq	hardware_interrupt_common
> 	cmpwi	r10,MSR_WAKEDEC
> 	beq	decrementer_common
> 	cmpwi	r10,MSR_WAKEMT
> 	bne	system_reset_common
> 	EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN);
> 	b	fast_exception_return
> END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO)
> 	b	system_reset_common
>
> Is there a simpler way to express this?

Original patch:
>>>> +/* Wake Events */
>>>> +#define MSR_WAKEMASK 0x0038
>>>> +#define MSR_WAKERESET 0x0038
>>>> +#define MSR_WAKESYSERR 0x0030
>>>> +#define MSR_WAKEEE 0x0020
>>>> +#define MSR_WAKEMT 0x0028
>>>> +#define MSR_WAKEDEC 0x0018
>>>> +#define MSR_WAKETHERM 0x0010

You could move the unconditional branch to system_reset_common to the front under IFCLR (in case EXCEPTION_PROLOG wants to use it in the future) at the cost of adding one nop to your path.

You are looking for 3 values out of 8, and they have few bits in common to do a tree branch. You could rotate left 25, move to cr0, then write the branch tree; not sure if it would be faster. It would save the andi. and repeated compares writing the cr field, which might help if it is not renamed. (I assume you want EE and DEC to be fastest).

Oh, I missed the right shift 16 in front of that, might help more. Actually, this means the constants are wrong. Add the 16 bits of zero at the end and use @h in the asm? Or rename from MSR_ namespace (and I would guess they are actually SRR1 only).

Regardless, there is no reason to do mr then shift as two operations.

milton

From miltonm at bga.com Fri Dec 16 06:48:01 2005
From: miltonm at bga.com (Milton Miller)
Date: Thu, 15 Dec 2005 13:48:01 -0600
Subject: [PATCH ] cell: enable pause(0) in cpu_idle
In-Reply-To: <2ad4a31d7613e4496d1f71bc642dab84@bga.com>
References: <17313.4271.637459.688763@cargo.ozlabs.ibm.com> <3a267747328be451274c52ca5721c5ec@bga.com> <200512151651.54407.arnd@arndb.de> <2ad4a31d7613e4496d1f71bc642dab84@bga.com>
Message-ID: <599700579ca4e14f343df28127501de0@bga.com>

Segher and I were talking on #mklinux about optimising this. Below is what we came up with.

On Dec 15, 2005, at 12:30 PM, Milton Miller wrote:
>>
>> BEGIN_FTR_SECTION
>> 	mr	r10,r12		/* r12 has SRR1 saved */
>> 	srwi	r10,r10,16
>> 	andi.	r10,r10,MSR_WAKEMASK
>> 	cmpwi	r10,MSR_WAKEEE
>> 	beq	hardware_interrupt_common
>> 	cmpwi	r10,MSR_WAKEDEC
>> 	beq	decrementer_common
>> 	cmpwi	r10,MSR_WAKEMT
>> 	bne	system_reset_common
>> 	EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN);
>> 	b	fast_exception_return
>> END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO)
>> 	b	system_reset_common
>>
>> Is there a simpler way to express this?
>
> Original patch:
>>>>> +/* Wake Events */
>>>>> +#define MSR_WAKEMASK 0x0038
>>>>> +#define MSR_WAKERESET 0x0038
>>>>> +#define MSR_WAKESYSERR 0x0030
>>>>> +#define MSR_WAKEEE 0x0020
>>>>> +#define MSR_WAKEMT 0x0028
>>>>> +#define MSR_WAKEDEC 0x0018
>>>>> +#define MSR_WAKETHERM 0x0010

Since we are checking 3 consecutive values, make these be in the range 0-7

shift right 19 (decimal)
rlwinm a,b,13,29,31
sub 3 recording, beq, cmp unsigned 2, blt beq
but move the cmp an insn out

lets see if I can get this all right....

BEGIN_FTR_SECTION
	b	system_reset_common
END_FTR_SECTION_IFCLR(CPU_FTR_PAUSE_ZERO)
#if ((WAKE_MT - WAKE_DEC) != 2) || ((WAKE_EE - WAKE_DEC) != 1)
#error This optimization is broken
#endif
	rlwinm	r10,r12,32-MSR_WAKE_LG,MSR_WAKE_MASK
	subi.	r10,r10,WAKE_DEC
	cmplwi	cr7,r10,WAKE_MT-WAKE_DEC
	beq	decrementer_common
	blt	cr7,hardware_interrupt_common
	bne	cr7,system_reset_common
	EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN);
	b	fast_exception_return

From paulus at samba.org Fri Dec 16 08:48:36 2005
From: paulus at samba.org (Paul Mackerras)
Date: Fri, 16 Dec 2005 08:48:36 +1100
Subject: [PATCH ] cell: enable pause(0) in cpu_idle
In-Reply-To: <3a267747328be451274c52ca5721c5ec@bga.com>
References: <200512102000.20304.arnd@arndb.de> <200512141910.19121.arnd@arndb.de> <17313.4271.637459.688763@cargo.ozlabs.ibm.com> <3a267747328be451274c52ca5721c5ec@bga.com>
Message-ID: <17313.58548.899471.558166@cargo.ozlabs.ibm.com>

Milton Miller writes:

> >> +END_FTR_SECTION_IFSET(CPU_FTR_PAUSE_ZERO)
> >> +	EXCEPTION_PROLOG_COMMON(0x100, PACA_EXGEN);
> >> +	b fast_exception_return
> >
> > This is a change in behaviour from before the patch. Instead of doing
> > the EXCEPTION_PROLOG_COMMON and then just returning, you should branch
> > to system_reset_common.
> >
>
> Actually, this is not the case, as this is only for the MSR_WAKEMT
> case -- they put a bne to the old handler.

No, if we don't have the CPU_FTR_PAUSE_ZERO feature bit set, we will just do the EXCEPTION_PROLOG_COMMON and then branch to fast_exception_return. That's what I was worried about - what this code would do on CPUs other than cell.

> I'll raise the question from my first review again: would you rather
> see the STD_EXCEPTION_COMMON macro split for system_reset_common or
> leave this extra branch to the second handler?

Actually, why not test SRR1 (regs->msr) in C code, and call do_IRQ or timer_interrupt from there, or just return? No changes to head_64.S needed. If we have been asleep then presumably a few extra nanoseconds on waking isn't going to make a difference.

Paul.
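
For readers following the thread, here is a rough C sketch of the two ideas discussed above, under stated assumptions: decoding the wake reason from the saved SRR1 in C, as suggested in the last message, and the range-check trick for the three consecutive encodings from the message before it. The MSR_WAKE* values are taken from the posted patch. The enum, the function names and the WAKE_* shorthand values are hypothetical and appear in no posted patch; a real handler would presumably call do_IRQ() or timer_interrupt() rather than return a code.

```c
#include <assert.h>
#include <stdint.h>

/* Wake-event encodings from the posted patch; valid after SRR1 >> 16. */
#define MSR_WAKEMASK	0x0038
#define MSR_WAKEEE	0x0020
#define MSR_WAKEMT	0x0028
#define MSR_WAKEDEC	0x0018

/* Hypothetical result codes, for illustration only. */
enum wake_action {
	WAKE_ACT_RESET,		/* take the normal system reset path */
	WAKE_ACT_EXTERNAL,	/* external interrupt: would call do_IRQ() */
	WAKE_ACT_DECREMENTER,	/* decrementer: would call timer_interrupt() */
	WAKE_ACT_RETURN,	/* thread wakeup: nothing to do, just return */
};

/* Plain decode: mask the field, then compare against each value. */
static enum wake_action wake_decode(uint64_t srr1)
{
	switch ((srr1 >> 16) & MSR_WAKEMASK) {
	case MSR_WAKEEE:
		return WAKE_ACT_EXTERNAL;
	case MSR_WAKEDEC:
		return WAKE_ACT_DECREMENTER;
	case MSR_WAKEMT:
		return WAKE_ACT_RETURN;
	default:
		return WAKE_ACT_RESET;
	}
}

/*
 * Range-check variant: shift right by 19 so the field lands in 0..7,
 * where DEC, EE and MT are the consecutive values 3, 4 and 5.
 * These names are assumptions modelled on the asm in the thread.
 */
#define WAKE_LG		19
#define WAKE_DEC	(MSR_WAKEDEC >> 3)
#define WAKE_EE		(MSR_WAKEEE >> 3)
#define WAKE_MT		(MSR_WAKEMT >> 3)

static_assert(WAKE_EE - WAKE_DEC == 1 && WAKE_MT - WAKE_DEC == 2,
	      "this optimization needs consecutive encodings");

static enum wake_action wake_decode_range(uint64_t srr1)
{
	/* Unsigned subtraction: fields below WAKE_DEC wrap to large values. */
	unsigned int idx = (unsigned int)((srr1 >> WAKE_LG) & 7) - WAKE_DEC;

	if (idx == 0)
		return WAKE_ACT_DECREMENTER;
	if (idx < 2)
		return WAKE_ACT_EXTERNAL;
	if (idx == 2)
		return WAKE_ACT_RETURN;
	return WAKE_ACT_RESET;
}
```

Both functions should agree on every value of the 3-bit field; the second only replaces three equality tests with one subtraction and one unsigned comparison, which is the point of the asm version.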
From david at gibson.dropbear.id.au Fri Dec 16 14:49:25 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 16 Dec 2005 14:49:25 +1100 Subject: powerpc: Fix iSeries bug in VMALLOCBASE/VMALLOC_START consolidation Message-ID: <20051216034925.GB1448@localhost.localdomain> Paulus, please apply to powerpc tree. Oops, forgot to compile the VMALLOCBASE/VMALLOC_START patch on iSeries. VMALLOC_START is defined in pgtable.h whereas previously VMALLOCBASE was previously defined in page.h. lparmap.c needs to be updated appropriately: Booted on iSeries RS64 (now). Signed-off-by: David Gibson Index: working-2.6/arch/powerpc/kernel/lparmap.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/lparmap.c 2005-12-16 13:05:21.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/lparmap.c 2005-12-16 14:45:45.000000000 +1100 @@ -7,7 +7,7 @@ * 2 of the License, or (at your option) any later version. */ #include -#include +#include #include const struct LparMap __attribute__((__section__(".text"))) xLparMap = { -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Fri Dec 16 15:35:50 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 16 Dec 2005 15:35:50 +1100 Subject: powerpc: Cleanup LOADADDR etc. asm macros Message-ID: <20051216043550.GB9121@localhost.localdomain> Paulus, what do you think of this patch? This patch consolidates the variety of macros used for loading 32 or 64-bit constants in assembler (LOADADDR, LOADBASE, SET_REG_TO_*). The idea is to make the set of macros consistent across 32 and 64 bit and to make it more obvious which is the appropriate one to use in a given situation. The new macros and their semantics are described in the comments in ppc_asm.h. 
In the process, we change several places that were unnecessarily using immediate loads on ppc64 to use the GOT/TOC. Likewise we cleanup a couple of places where we were clumsily subtracting PAGE_OFFSET with asm instructions to use assemble-time arithmetic or the toreal() macro instead. Signed-off-by: David Gibson Index: working-2.6/include/asm-powerpc/ppc_asm.h =================================================================== --- working-2.6.orig/include/asm-powerpc/ppc_asm.h 2005-12-16 14:43:17.000000000 +1100 +++ working-2.6/include/asm-powerpc/ppc_asm.h 2005-12-16 15:00:29.000000000 +1100 @@ -155,52 +155,57 @@ n: #endif /* - * LOADADDR( rn, name ) - * loads the address of 'name' into 'rn' + * LOAD_REG_IMMEDIATE(rn, expr) + * Loads the value of the constant expression 'expr' into register 'rn' + * using immediate instructions only. Use this when it's important not + * to reference other data (i.e. on ppc64 when the TOC pointer is not + * valid). * - * LOADBASE( rn, name ) - * loads the address (possibly without the low 16 bits) of 'name' into 'rn' - * suitable for base+disp addressing + * LOAD_REG_ADDR(rn, name) + * Loads the address of label 'name' into register 'rn'. Use this when + * you don't particularly need immediate instructions only, but you need + * the whole address in one register (e.g. it's a structure address and + * you want to access various offsets within it). On ppc32 this is + * identical to LOAD_REG_IMMEDIATE. + * + * LOAD_REG_ADDRBASE(rn, name) + * ADDROFF(name) + * LOAD_REG_ADDRBASE loads part of the address of label 'name' into + * register 'rn'. ADDROFF(name) returns the remainder of the address as + * a constant expression. ADDROFF(name) is a signed expression < 16 bits + * in size, so is suitable for use directly as an offset in load and store + * instructions. 
Use this when loading/storing a single word or less as: + * LOAD_REG_ADDRBASE(rX, name) + * ld rY,ADDROFF(name)(rX) */ #ifdef __powerpc64__ -#define LOADADDR(rn,name) \ - lis rn,name##@highest; \ - ori rn,rn,name##@higher; \ - rldicr rn,rn,32,31; \ - oris rn,rn,name##@h; \ - ori rn,rn,name##@l - -#define LOADBASE(rn,name) \ - ld rn,name@got(r2) - -#define OFF(name) 0 - -#define SET_REG_TO_CONST(reg, value) \ - lis reg,(((value)>>48)&0xFFFF); \ - ori reg,reg,(((value)>>32)&0xFFFF); \ - rldicr reg,reg,32,31; \ - oris reg,reg,(((value)>>16)&0xFFFF); \ - ori reg,reg,((value)&0xFFFF); - -#define SET_REG_TO_LABEL(reg, label) \ - lis reg,(label)@highest; \ - ori reg,reg,(label)@higher; \ - rldicr reg,reg,32,31; \ - oris reg,reg,(label)@h; \ - ori reg,reg,(label)@l; +#define LOAD_REG_IMMEDIATE(reg,expr) \ + lis (reg),(expr)@highest; \ + ori (reg),(reg),(expr)@higher; \ + rldicr (reg),(reg),32,31; \ + oris (reg),(reg),(expr)@h; \ + ori (reg),(reg),(expr)@l; + +#define LOAD_REG_ADDR(reg,name) \ + ld (reg),name@got(r2) + +#define LOAD_REG_ADDRBASE(reg,name) LOAD_REG_ADDR(reg,name) +#define ADDROFF(name) 0 /* offsets for stack frame layout */ #define LRSAVE 16 #else /* 32-bit */ -#define LOADADDR(rn,name) \ - lis rn,name@ha; \ - addi rn,rn,name@l - -#define LOADBASE(rn,name) \ - lis rn,name@ha -#define OFF(name) name@l +#define LOAD_REG_IMMEDIATE(reg,expr) \ + lis (reg),(expr)@ha; \ + addi (reg),(reg),(expr)@l; + +#define LOAD_REG_ADDR(reg,name) SET_REG_IMMEDIATE(reg, name) + +#define LOAD_REG_ADDRBASE(reg, name) \ + lis (reg),name@ha +#define ADDROFF name@l /* offsets for stack frame layout */ #define LRSAVE 4 Index: working-2.6/arch/powerpc/kernel/entry_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/entry_64.S 2005-12-16 14:43:17.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/entry_64.S 2005-12-16 14:49:45.000000000 +1100 @@ -689,9 +689,8 @@ _GLOBAL(enter_rtas) std
r6,PACASAVEDMSR(r13) /* Setup our real return addr */ - SET_REG_TO_LABEL(r4,.rtas_return_loc) - SET_REG_TO_CONST(r9,PAGE_OFFSET) - sub r4,r4,r9 + LOAD_REG_ADDR(r4,.rtas_return_loc) + toreal(r4) mtlr r4 li r0,0 @@ -706,7 +705,7 @@ _GLOBAL(enter_rtas) sync /* disable interrupts so SRR0/1 */ mtmsrd r0 /* don't get trashed */ - SET_REG_TO_LABEL(r4,rtas) + LOAD_REG_ADDR(r4, rtas) ld r5,RTASENTRY(r4) /* get the rtas->entry value */ ld r4,RTASBASE(r4) /* get the rtas->base value */ @@ -718,8 +717,7 @@ _GLOBAL(enter_rtas) _STATIC(rtas_return_loc) /* relocation is off at this point */ mfspr r4,SPRN_SPRG3 /* Get PACA */ - SET_REG_TO_CONST(r5, PAGE_OFFSET) - sub r4,r4,r5 /* RELOC the PACA base pointer */ + toreal(r4) mfmsr r6 li r0,MSR_RI @@ -728,7 +726,7 @@ _STATIC(rtas_return_loc) mtmsrd r6 ld r1,PACAR1(r4) /* Restore our SP */ - LOADADDR(r3,.rtas_restore_regs) + LOAD_REG_IMMEDIATE(r3,.rtas_restore_regs) ld r4,PACASAVEDMSR(r4) /* Restore our MSR */ mtspr SPRN_SRR0,r3 Index: working-2.6/arch/powerpc/kernel/head_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/head_64.S 2005-12-16 14:43:17.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/head_64.S 2005-12-16 14:51:07.000000000 +1100 @@ -154,12 +154,12 @@ _GLOBAL(__secondary_hold) bne 100b #ifdef CONFIG_HMT - LOADADDR(r4, .hmt_init) + SET_REG_IMMEDIATE(r4, .hmt_init) mtctr r4 bctr #else #ifdef CONFIG_SMP - LOADADDR(r4, .pSeries_secondary_smp_init) + LOAD_REG_IMMEDIATE(r4, .pSeries_secondary_smp_init) mtctr r4 mr r3,r24 bctr @@ -205,9 +205,10 @@ exception_marker: #define EX_LR 72 /* - * We're short on space and time in the exception prolog, so we can't use - * the normal LOADADDR macro. Normally we just need the low halfword of the - * address, but for Kdump we need the whole low word. + * We're short on space and time in the exception prolog, so we can't + * use the normal SET_REG_IMMEDIATE macro. 
Normally we just need the + * low halfword of the address, but for Kdump we need the whole low + * word. */ #ifdef CONFIG_CRASH_DUMP #define LOAD_HANDLER(reg, label) \ @@ -713,7 +714,7 @@ system_reset_iSeries: lbz r23,PACAPROCSTART(r13) /* Test if this processor * should start */ sync - LOADADDR(r3,current_set) + LOAD_REG_IMMEDIATE(r3,current_set) sldi r28,r24,3 /* get current_set[cpu#] */ ldx r3,r3,r28 addi r1,r3,THREAD_SIZE @@ -746,8 +747,8 @@ iSeries_secondary_smp_loop: decrementer_iSeries_masked: li r11,1 stb r11,PACALPPACA+LPPACADECRINT(r13) - LOADBASE(r12,tb_ticks_per_jiffy) - lwz r12,OFF(tb_ticks_per_jiffy)(r12) + LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) + lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) mtspr SPRN_DEC,r12 /* fall through */ @@ -1412,7 +1413,7 @@ _GLOBAL(pSeries_secondary_smp_init) * physical cpu id in r24, we need to search the pacas to find * which logical id maps to our physical one. */ - LOADADDR(r13, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r13, paca) /* Get base vaddr of paca array */ li r5,0 /* logical cpu id */ 1: lhz r6,PACAHWCPUID(r13) /* Load HW procid from paca */ cmpw r6,r24 /* Compare to our id */ @@ -1446,8 +1447,8 @@ _GLOBAL(pSeries_secondary_smp_init) #ifdef CONFIG_PPC_ISERIES _STATIC(__start_initialization_iSeries) /* Clear out the BSS */ - LOADADDR(r11,__bss_stop) - LOADADDR(r8,__bss_start) + LOAD_REG_IMMEDIATE(r11,__bss_stop) + LOAD_REG_IMMEDIATE(r8,__bss_start) sub r11,r11,r8 /* bss size */ addi r11,r11,7 /* round up to an even double word */ rldicl. 
r11,r11,61,3 /* shift right by 3 */ @@ -1458,17 +1459,17 @@ _STATIC(__start_initialization_iSeries) 3: stdu r0,8(r8) bdnz 3b 4: - LOADADDR(r1,init_thread_union) + LOAD_REG_IMMEDIATE(r1,init_thread_union) addi r1,r1,THREAD_SIZE li r0,0 stdu r0,-STACK_FRAME_OVERHEAD(r1) - LOADADDR(r3,cpu_specs) - LOADADDR(r4,cur_cpu_spec) + LOAD_REG_IMMEDIATE(r3,cpu_specs) + LOAD_REG_IMMEDIATE(r4,cur_cpu_spec) li r5,0 bl .identify_cpu - LOADADDR(r2,__toc_start) + LOAD_REG_IMMEDIATE(r2,__toc_start) addi r2,r2,0x4000 addi r2,r2,0x4000 @@ -1526,7 +1527,7 @@ _GLOBAL(__start_initialization_multiplat li r24,0 /* Switch off MMU if not already */ - LOADADDR(r4, .__after_prom_start - KERNELBASE) + LOAD_REG_IMMEDIATE(r4, .__after_prom_start - KERNELBASE) add r4,r4,r30 bl .__mmu_off b .__after_prom_start @@ -1545,7 +1546,7 @@ _STATIC(__boot_from_prom) /* put a relocation offset into r3 */ bl .reloc_offset - LOADADDR(r2,__toc_start) + LOAD_REG_IMMEDIATE(r2,__toc_start) addi r2,r2,0x4000 addi r2,r2,0x4000 @@ -1584,9 +1585,9 @@ _STATIC(__after_prom_start) */ bl .reloc_offset mr r26,r3 - SET_REG_TO_CONST(r27,KERNELBASE) + LOAD_REG_IMMEDIATE(r27, KERNELBASE) - LOADADDR(r3, PHYSICAL_START) /* target addr */ + LOAD_REG_IMMEDIATE(r3, PHYSICAL_START) /* target addr */ // XXX FIXME: Use phys returned by OF (r30) add r4,r27,r26 /* source addr */ @@ -1594,7 +1595,7 @@ _STATIC(__after_prom_start) /* i.e. where we are running */ /* the source addr */ - LOADADDR(r5,copy_to_here) /* # bytes of memory to copy */ + LOAD_REG_IMMEDIATE(r5,copy_to_here) /* # bytes of memory to copy */ sub r5,r5,r27 li r6,0x100 /* Start offset, the first 0x100 */ @@ -1604,11 +1605,11 @@ _STATIC(__after_prom_start) /* this includes the code being */ /* executed here. 
*/ - LOADADDR(r0, 4f) /* Jump to the copy of this code */ + LOAD_REG_IMMEDIATE(r0, 4f) /* Jump to the copy of this code */ mtctr r0 /* that we just made/relocated */ bctr -4: LOADADDR(r5,klimit) +4: LOAD_REG_IMMEDIATE(r5,klimit) add r5,r5,r26 ld r5,0(r5) /* get the value of klimit */ sub r5,r5,r27 @@ -1690,7 +1691,7 @@ _GLOBAL(pmac_secondary_start) mtmsrd r3 /* RI on */ /* Set up a paca value for this processor. */ - LOADADDR(r4, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r4, paca) /* Get base vaddr of paca array */ mulli r13,r24,PACA_SIZE /* Calculate vaddr of right paca */ add r13,r13,r4 /* for this processor. */ mtspr SPRN_SPRG3,r13 /* Save vaddr of paca in SPRG3 */ @@ -1727,7 +1728,7 @@ _GLOBAL(__secondary_start) bl .early_setup_secondary /* Initialize the kernel stack. Just a repeat for iSeries. */ - LOADADDR(r3,current_set) + LOAD_REG_ADDR(r3, current_set) sldi r28,r24,3 /* get current_set[cpu#] */ ldx r1,r3,r28 addi r1,r1,THREAD_SIZE-STACK_FRAME_OVERHEAD @@ -1738,8 +1739,8 @@ _GLOBAL(__secondary_start) mtlr r7 /* enable MMU and jump to start_secondary */ - LOADADDR(r3,.start_secondary_prolog) - SET_REG_TO_CONST(r4, MSR_KERNEL) + LOAD_REG_ADDR(r3, .start_secondary_prolog) + LOAD_REG_IMMEDIATE(r4, MSR_KERNEL) #ifdef DO_SOFT_DISABLE ori r4,r4,MSR_EE #endif @@ -1788,8 +1789,8 @@ _STATIC(start_here_multiplatform) * be detached from the kernel completely. Besides, we need * to clear it now for kexec-style entry. */ - LOADADDR(r11,__bss_stop) - LOADADDR(r8,__bss_start) + LOAD_REG_IMMEDIATE(r11,__bss_stop) + LOAD_REG_IMMEDIATE(r8,__bss_start) sub r11,r11,r8 /* bss size */ addi r11,r11,7 /* round up to an even double word */ rldicl. r11,r11,61,3 /* shift right by 3 */ @@ -1827,7 +1828,7 @@ _STATIC(start_here_multiplatform) /* up the htab. This is done because we have relocated the */ /* kernel but are still running in real mode. 
*/ - LOADADDR(r3,init_thread_union) + LOAD_REG_IMMEDIATE(r3,init_thread_union) add r3,r3,r26 /* set up a stack pointer (physical address) */ @@ -1836,14 +1837,14 @@ _STATIC(start_here_multiplatform) stdu r0,-STACK_FRAME_OVERHEAD(r1) /* set up the TOC (physical address) */ - LOADADDR(r2,__toc_start) + LOAD_REG_IMMEDIATE(r2,__toc_start) addi r2,r2,0x4000 addi r2,r2,0x4000 add r2,r2,r26 - LOADADDR(r3,cpu_specs) + LOAD_REG_IMMEDIATE(r3, cpu_specs) add r3,r3,r26 - LOADADDR(r4,cur_cpu_spec) + LOAD_REG_IMMEDIATE(r4,cur_cpu_spec) add r4,r4,r26 mr r5,r26 bl .identify_cpu @@ -1859,11 +1860,11 @@ _STATIC(start_here_multiplatform) * nowhere it can be initialized differently before we reach this * code */ - LOADADDR(r27, boot_cpuid) + LOAD_REG_IMMEDIATE(r27, boot_cpuid) add r27,r27,r26 lwz r27,0(r27) - LOADADDR(r24, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r24, paca) /* Get base vaddr of paca array */ mulli r13,r27,PACA_SIZE /* Calculate vaddr of right paca */ add r13,r13,r24 /* for this processor. */ add r13,r13,r26 /* convert to physical addr */ @@ -1876,8 +1877,8 @@ _STATIC(start_here_multiplatform) mr r3,r31 bl .early_setup - LOADADDR(r3,.start_here_common) - SET_REG_TO_CONST(r4, MSR_KERNEL) + LOAD_REG_IMMEDIATE(r3, .start_here_common) + LOAD_REG_IMMEDIATE(r4, MSR_KERNEL) mtspr SPRN_SRR0,r3 mtspr SPRN_SRR1,r4 rfid @@ -1891,7 +1892,7 @@ _STATIC(start_here_common) /* The following code sets up the SP and TOC now that we are */ /* running with translation enabled. 
*/ - LOADADDR(r3,init_thread_union) + LOAD_REG_IMMEDIATE(r3,init_thread_union) /* set up the stack */ addi r1,r3,THREAD_SIZE @@ -1904,16 +1905,16 @@ _STATIC(start_here_common) li r3,0 bl .do_cpu_ftr_fixups - LOADADDR(r26, boot_cpuid) + LOAD_REG_IMMEDIATE(r26, boot_cpuid) lwz r26,0(r26) - LOADADDR(r24, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r24, paca) /* Get base vaddr of paca array */ mulli r13,r26,PACA_SIZE /* Calculate vaddr of right paca */ add r13,r13,r24 /* for this processor. */ mtspr SPRN_SPRG3,r13 /* ptr to current */ - LOADADDR(r4,init_task) + LOAD_REG_IMMEDIATE(r4, init_task) std r4,PACACURRENT(r13) /* Load the TOC */ @@ -1936,7 +1937,7 @@ _STATIC(start_here_common) _GLOBAL(hmt_init) #ifdef CONFIG_HMT - LOADADDR(r5, hmt_thread_data) + LOAD_REG_IMMEDIATE(r5, hmt_thread_data) mfspr r7,SPRN_PVR srwi r7,r7,16 cmpwi r7,0x34 /* Pulsar */ @@ -1957,7 +1958,7 @@ _GLOBAL(hmt_init) b 101f __hmt_secondary_hold: - LOADADDR(r5, hmt_thread_data) + LOAD_REG_IMMEDIATE(r5, hmt_thread_data) clrldi r5,r5,4 li r7,0 mfspr r6,SPRN_PIR @@ -1985,7 +1986,7 @@ __hmt_secondary_hold: #ifdef CONFIG_HMT _GLOBAL(hmt_start_secondary) - LOADADDR(r4,__hmt_secondary_hold) + LOAD_REG_IMMEDIATE(r4,__hmt_secondary_hold) clrldi r4,r4,4 mtspr SPRN_NIADORM, r4 mfspr r4, SPRN_MSRDORM Index: working-2.6/arch/powerpc/kernel/misc_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/misc_64.S 2005-12-16 14:43:17.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/misc_64.S 2005-12-16 14:49:45.000000000 +1100 @@ -39,7 +39,7 @@ _GLOBAL(reloc_offset) mflr r0 bl 1f 1: mflr r3 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r3,r4,r3 mtlr r0 blr @@ -51,7 +51,7 @@ _GLOBAL(add_reloc_offset) mflr r0 bl 1f 1: mflr r5 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r5,r4,r5 add r3,r3,r5 mtlr r0 @@ -498,15 +498,15 @@ _GLOBAL(identify_cpu) */ _GLOBAL(do_cpu_ftr_fixups) /* Get CPU 0 features */ - LOADADDR(r6,cur_cpu_spec) + 
LOAD_REG_IMMEDIATE(r6,cur_cpu_spec)
  sub r6,r6,r3
  ld r4,0(r6)
  sub r4,r4,r3
  ld r4,CPU_SPEC_FEATURES(r4)
  /* Get the fixup table */
- LOADADDR(r6,__start___ftr_fixup)
+ LOAD_REG_IMMEDIATE(r6,__start___ftr_fixup)
  sub r6,r6,r3
- LOADADDR(r7,__stop___ftr_fixup)
+ LOAD_REG_IMMEDIATE(r7,__stop___ftr_fixup)
  sub r7,r7,r3
  /* Do the fixup */
 1: cmpld r6,r7
Index: working-2.6/arch/powerpc/kernel/cpu_setup_power4.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/cpu_setup_power4.S 2005-12-16 14:43:17.000000000 +1100
+++ working-2.6/arch/powerpc/kernel/cpu_setup_power4.S 2005-12-16 14:49:45.000000000 +1100
@@ -130,7 +130,7 @@ _GLOBAL(__save_cpu_setup)
  mfcr r7
  /* Get storage ptr */
- LOADADDR(r5,cpu_state_storage)
+ LOAD_REG_IMMEDIATE(r5,cpu_state_storage)
  /* We only deal with 970 for now */
  mfspr r0,SPRN_PVR
@@ -164,7 +164,7 @@ _GLOBAL(__restore_cpu_setup)
  /* Get storage ptr (FIXME when using anton reloc as we
  * are running with translation disabled here */
- LOADADDR(r5,cpu_state_storage)
+ LOAD_REG_IMMEDIATE(r5,cpu_state_storage)
  /* We only deal with 970 for now */
  mfspr r0,SPRN_PVR
Index: working-2.6/arch/powerpc/kernel/entry_32.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/entry_32.S 2005-12-16 14:43:17.000000000 +1100
+++ working-2.6/arch/powerpc/kernel/entry_32.S 2005-12-16 14:49:45.000000000 +1100
@@ -988,7 +988,7 @@ _GLOBAL(enter_rtas)
  stwu r1,-INT_FRAME_SIZE(r1)
  mflr r0
  stw r0,INT_FRAME_SIZE+4(r1)
- LOADADDR(r4, rtas)
+ LOAD_REG_ADDR(r4, rtas)
  lis r6,1f@ha /* physical return address for rtas */
  addi r6,r6,1f@l
  tophys(r6,r6)
Index: working-2.6/arch/powerpc/kernel/misc_32.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/misc_32.S 2005-12-16 14:43:17.000000000 +1100
+++ working-2.6/arch/powerpc/kernel/misc_32.S 2005-12-16 14:49:45.000000000 +1100
@@ -68,7 +68,7 @@
_GLOBAL(reloc_offset) mflr r0 bl 1f 1: mflr r3 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r3,r4,r3 mtlr r0 blr @@ -80,7 +80,7 @@ _GLOBAL(add_reloc_offset) mflr r0 bl 1f 1: mflr r5 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r5,r4,r5 add r3,r3,r5 mtlr r0 Index: working-2.6/arch/powerpc/kernel/fpu.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/fpu.S 2005-12-16 14:43:17.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/fpu.S 2005-12-16 14:49:45.000000000 +1100 @@ -39,9 +39,9 @@ _GLOBAL(load_up_fpu) * to another. Instead we call giveup_fpu in switch_to. */ #ifndef CONFIG_SMP - LOADBASE(r3, last_task_used_math) + LOAD_REG_ADDRBASE(r3, last_task_used_math) toreal(r3) - PPC_LL r4,OFF(last_task_used_math)(r3) + PPC_LL r4,ADDROFF(last_task_used_math)(r3) PPC_LCMPI 0,r4,0 beq 1f toreal(r4) @@ -77,7 +77,7 @@ _GLOBAL(load_up_fpu) #ifndef CONFIG_SMP subi r4,r5,THREAD fromreal(r4) - PPC_STL r4,OFF(last_task_used_math)(r3) + PPC_STL r4,ADDROFF(last_task_used_math)(r3) #endif /* CONFIG_SMP */ /* restore registers and return */ /* we haven't used ctr or xer or lr */ @@ -113,8 +113,8 @@ _GLOBAL(giveup_fpu) 1: #ifndef CONFIG_SMP li r5,0 - LOADBASE(r4,last_task_used_math) - PPC_STL r5,OFF(last_task_used_math)(r4) + LOAD_REG_ADDRBASE(r4,last_task_used_math) + PPC_STL r5,ADDROFF(last_task_used_math)(r4) #endif /* CONFIG_SMP */ blr Index: working-2.6/arch/powerpc/kernel/idle_power4.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/idle_power4.S 2005-12-16 14:43:17.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/idle_power4.S 2005-12-16 14:49:45.000000000 +1100 @@ -38,14 +38,14 @@ END_FTR_SECTION_IFCLR(CPU_FTR_CAN_NAP) /* We must dynamically check for the NAP feature as it * can be cleared by CPU init after the fixups are done */ - LOADBASE(r3,cur_cpu_spec) - ld r4,OFF(cur_cpu_spec)(r3) + LOAD_REG_ADDRBASE(r3,cur_cpu_spec) + ld 
r4,ADDROFF(cur_cpu_spec)(r3) ld r4,CPU_SPEC_FEATURES(r4) andi. r0,r4,CPU_FTR_CAN_NAP beqlr /* Now check if user or arch enabled NAP mode */ - LOADBASE(r3,powersave_nap) - lwz r4,OFF(powersave_nap)(r3) + LOAD_REG_ADDRBASE(r3,powersave_nap) + lwz r4,ADDROFF(powersave_nap)(r3) cmpwi 0,r4,0 beqlr -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Fri Dec 16 16:10:37 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Fri, 16 Dec 2005 16:10:37 +1100 Subject: powerpc: Cleanup LOADADDR etc. asm macros In-Reply-To: <20051216043550.GB9121@localhost.localdomain> References: <20051216043550.GB9121@localhost.localdomain> Message-ID: <20051216051037.GA10825@localhost.localdomain> On Fri, Dec 16, 2005 at 03:35:50PM +1100, David Gibson wrote: > Paulus, what do you think of this patch? Ahem. Perhaps this version, which doesn't induce build errors on ppc32. powerpc: Cleanup LOADADDR etc. asm macros This patch consolidates the variety of macros used for loading 32 or 64-bit constants in assembler (LOADADDR, LOADBASE, SET_REG_TO_*). The idea is to make the set of macros consistent across 32 and 64 bit and to make it more obvious which is the appropriate one to use in a given situation. The new macros and their semantics are described in the comments in ppc_asm.h. In the process, we change several places that were unnecessarily using immediate loads on ppc64 to use the GOT/TOC. Likewise we cleanup a couple of places where we were clumsily subtracting PAGE_OFFSET with asm instructions to use assemble-time arithmetic or the toreal() macro instead. 
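As a rough illustration of the immediate-load sequences this patch consolidates (not part of the patch itself), the sketch below models them in plain C. It assumes the usual meaning of the assembler operators: @highest, @higher, @h and @l select successive 16-bit pieces of a value, and @ha is the high half adjusted for the sign extension that addi applies to @l. The function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit pieces, as the @highest/@higher/@h/@l operators select them. */
static uint64_t part_highest(uint64_t v) { return (v >> 48) & 0xffff; }
static uint64_t part_higher(uint64_t v)  { return (v >> 32) & 0xffff; }
static uint64_t part_h(uint64_t v)       { return (v >> 16) & 0xffff; }
static uint64_t part_l(uint64_t v)       { return v & 0xffff; }

/* @ha: high half, adjusted so that adding the sign-extended @l yields v. */
static uint32_t part_ha(uint32_t v)      { return ((v + 0x8000u) >> 16) & 0xffff; }

/* Model of the five-instruction ppc64 LOAD_REG_IMMEDIATE sequence. */
uint64_t load_reg_immediate64(uint64_t v)
{
    uint64_t r = part_highest(v) << 16;     /* lis: value lands in bits 16..31 */
    if (r & 0x80000000ull)
        r |= 0xffffffff00000000ull;         /* lis sign-extends to 64 bits */
    r |= part_higher(v);                    /* ori */
    r <<= 32;                               /* rldicr r,r,32,31 */
    r |= part_h(v) << 16;                   /* oris */
    r |= part_l(v);                         /* ori */
    return r;
}

/*
 * Model of the two-instruction ppc32 sequence. The same arithmetic
 * explains LOAD_REG_ADDRBASE + ADDROFF on ppc32: the base register
 * holds name@ha << 16 and the load/store displacement supplies name@l.
 */
uint32_t load_reg_immediate32(uint32_t v)
{
    uint32_t r = part_ha(v) << 16;          /* lis reg,expr@ha */
    int32_t lo = (int32_t)(v & 0xffff);
    if (lo >= 0x8000)
        lo -= 0x10000;                      /* addi sign-extends its immediate */
    r += (uint32_t)lo;                      /* addi reg,reg,expr@l */
    return r;
}
```

Under these assumptions both functions return their argument unchanged for every input, which is the point: the pieces recombine into the original constant without touching memory, whereas the GOT-based LOAD_REG_ADDR on ppc64 needs a valid TOC pointer in r2.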
Signed-off-by: David Gibson

Index: working-2.6/include/asm-powerpc/ppc_asm.h
===================================================================
--- working-2.6.orig/include/asm-powerpc/ppc_asm.h 2005-11-23 15:56:35.000000000 +1100
+++ working-2.6/include/asm-powerpc/ppc_asm.h 2005-12-16 16:07:14.000000000 +1100
@@ -155,52 +155,56 @@ n:
 #endif
 /*
- * LOADADDR( rn, name )
- * loads the address of 'name' into 'rn'
+ * LOAD_REG_IMMEDIATE(rn, expr)
+ * Loads the value of the constant expression 'expr' into register 'rn'
+ * using immediate instructions only. Use this when it's important not
+ * to reference other data (i.e. on ppc64 when the TOC pointer is not
+ * valid).
  *
- * LOADBASE( rn, name )
- * loads the address (possibly without the low 16 bits) of 'name' into 'rn'
- * suitable for base+disp addressing
+ * LOAD_REG_ADDR(rn, name)
+ * Loads the address of label 'name' into register 'rn'. Use this when
+ * you don't particularly need immediate instructions only, but you need
+ * the whole address in one register (e.g. it's a structure address and
+ * you want to access various offsets within it). On ppc32 this is
+ * identical to LOAD_REG_IMMEDIATE.
+ *
+ * LOAD_REG_ADDRBASE(rn, name)
+ * ADDROFF(name)
+ * LOAD_REG_ADDRBASE loads part of the address of label 'name' into
+ * register 'rn'. ADDROFF(name) returns the remainder of the address as
+ * a constant expression. ADDROFF(name) is a signed expression < 16 bits
+ * in size, so is suitable for use directly as an offset in load and store
+ * instructions. Use this when loading/storing a single word or less as:
+ * LOAD_REG_ADDRBASE(rX, name)
+ * ld rY,ADDROFF(name)(rX)
  */
 #ifdef __powerpc64__
-#define LOADADDR(rn,name) \
- lis rn,name##@highest; \
- ori rn,rn,name##@higher; \
- rldicr rn,rn,32,31; \
- oris rn,rn,name##@h; \
- ori rn,rn,name##@l
-
-#define LOADBASE(rn,name) \
- ld rn,name@got(r2)
-
-#define OFF(name) 0
-
-#define SET_REG_TO_CONST(reg, value) \
- lis reg,(((value)>>48)&0xFFFF); \
- ori reg,reg,(((value)>>32)&0xFFFF); \
- rldicr reg,reg,32,31; \
- oris reg,reg,(((value)>>16)&0xFFFF); \
- ori reg,reg,((value)&0xFFFF);
-
-#define SET_REG_TO_LABEL(reg, label) \
- lis reg,(label)@highest; \
- ori reg,reg,(label)@higher; \
- rldicr reg,reg,32,31; \
- oris reg,reg,(label)@h; \
- ori reg,reg,(label)@l;
+#define LOAD_REG_IMMEDIATE(reg,expr) \
+ lis (reg),(expr)@highest; \
+ ori (reg),(reg),(expr)@higher; \
+ rldicr (reg),(reg),32,31; \
+ oris (reg),(reg),(expr)@h; \
+ ori (reg),(reg),(expr)@l;
+
+#define LOAD_REG_ADDR(reg,name) \
+ ld (reg),name@got(r2)
+
+#define LOAD_REG_ADDRBASE(reg,name) LOAD_REG_ADDR(reg,name)
+#define ADDROFF(name) 0
 /* offsets for stack frame layout */
 #define LRSAVE 16
 #else /* 32-bit */
-#define LOADADDR(rn,name) \
- lis rn,name@ha; \
- addi rn,rn,name@l
-#define LOADBASE(rn,name) \
- lis rn,name@ha
+#define LOAD_REG_IMMEDIATE(reg,expr) \
+ lis (reg),(expr)@ha; \
+ addi (reg),(reg),(expr)@l;
+
+#define LOAD_REG_ADDR(reg,name) LOAD_REG_IMMEDIATE(reg, name)
-#define OFF(name) name@l
+#define LOAD_REG_ADDRBASE(reg, name) lis (reg),name@ha
+#define ADDROFF(name) name@l
 /* offsets for stack frame layout */
 #define LRSAVE 4
Index: working-2.6/arch/powerpc/kernel/entry_64.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/entry_64.S 2005-12-16 13:05:21.000000000 +1100
+++ working-2.6/arch/powerpc/kernel/entry_64.S 2005-12-16 15:51:35.000000000 +1100
@@ -689,9 +689,8 @@ _GLOBAL(enter_rtas)
  std
r6,PACASAVEDMSR(r13) /* Setup our real return addr */ - SET_REG_TO_LABEL(r4,.rtas_return_loc) - SET_REG_TO_CONST(r9,PAGE_OFFSET) - sub r4,r4,r9 + LOAD_REG_ADDR(r4,.rtas_return_loc) + toreal(r4) mtlr r4 li r0,0 @@ -706,7 +705,7 @@ _GLOBAL(enter_rtas) sync /* disable interrupts so SRR0/1 */ mtmsrd r0 /* don't get trashed */ - SET_REG_TO_LABEL(r4,rtas) + LOAD_REG_ADDR(r4, rtas) ld r5,RTASENTRY(r4) /* get the rtas->entry value */ ld r4,RTASBASE(r4) /* get the rtas->base value */ @@ -718,8 +717,7 @@ _GLOBAL(enter_rtas) _STATIC(rtas_return_loc) /* relocation is off at this point */ mfspr r4,SPRN_SPRG3 /* Get PACA */ - SET_REG_TO_CONST(r5, PAGE_OFFSET) - sub r4,r4,r5 /* RELOC the PACA base pointer */ + toreal(r4) mfmsr r6 li r0,MSR_RI @@ -728,7 +726,7 @@ _STATIC(rtas_return_loc) mtmsrd r6 ld r1,PACAR1(r4) /* Restore our SP */ - LOADADDR(r3,.rtas_restore_regs) + LOAD_REG_IMMEDIATE(r3,.rtas_restore_regs) ld r4,PACASAVEDMSR(r4) /* Restore our MSR */ mtspr SPRN_SRR0,r3 Index: working-2.6/arch/powerpc/kernel/head_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/head_64.S 2005-12-16 13:05:21.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/head_64.S 2005-12-16 15:51:35.000000000 +1100 @@ -154,12 +154,12 @@ _GLOBAL(__secondary_hold) bne 100b #ifdef CONFIG_HMT - LOADADDR(r4, .hmt_init) + SET_REG_IMMEDIATE(r4, .hmt_init) mtctr r4 bctr #else #ifdef CONFIG_SMP - LOADADDR(r4, .pSeries_secondary_smp_init) + LOAD_REG_IMMEDIATE(r4, .pSeries_secondary_smp_init) mtctr r4 mr r3,r24 bctr @@ -205,9 +205,10 @@ exception_marker: #define EX_LR 72 /* - * We're short on space and time in the exception prolog, so we can't use - * the normal LOADADDR macro. Normally we just need the low halfword of the - * address, but for Kdump we need the whole low word. + * We're short on space and time in the exception prolog, so we can't + * use the normal SET_REG_IMMEDIATE macro. 
Normally we just need the + * low halfword of the address, but for Kdump we need the whole low + * word. */ #ifdef CONFIG_CRASH_DUMP #define LOAD_HANDLER(reg, label) \ @@ -713,7 +714,7 @@ system_reset_iSeries: lbz r23,PACAPROCSTART(r13) /* Test if this processor * should start */ sync - LOADADDR(r3,current_set) + LOAD_REG_IMMEDIATE(r3,current_set) sldi r28,r24,3 /* get current_set[cpu#] */ ldx r3,r3,r28 addi r1,r3,THREAD_SIZE @@ -746,8 +747,8 @@ iSeries_secondary_smp_loop: decrementer_iSeries_masked: li r11,1 stb r11,PACALPPACA+LPPACADECRINT(r13) - LOADBASE(r12,tb_ticks_per_jiffy) - lwz r12,OFF(tb_ticks_per_jiffy)(r12) + LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) + lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) mtspr SPRN_DEC,r12 /* fall through */ @@ -1412,7 +1413,7 @@ _GLOBAL(pSeries_secondary_smp_init) * physical cpu id in r24, we need to search the pacas to find * which logical id maps to our physical one. */ - LOADADDR(r13, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r13, paca) /* Get base vaddr of paca array */ li r5,0 /* logical cpu id */ 1: lhz r6,PACAHWCPUID(r13) /* Load HW procid from paca */ cmpw r6,r24 /* Compare to our id */ @@ -1446,8 +1447,8 @@ _GLOBAL(pSeries_secondary_smp_init) #ifdef CONFIG_PPC_ISERIES _STATIC(__start_initialization_iSeries) /* Clear out the BSS */ - LOADADDR(r11,__bss_stop) - LOADADDR(r8,__bss_start) + LOAD_REG_IMMEDIATE(r11,__bss_stop) + LOAD_REG_IMMEDIATE(r8,__bss_start) sub r11,r11,r8 /* bss size */ addi r11,r11,7 /* round up to an even double word */ rldicl. 
r11,r11,61,3 /* shift right by 3 */ @@ -1458,17 +1459,17 @@ _STATIC(__start_initialization_iSeries) 3: stdu r0,8(r8) bdnz 3b 4: - LOADADDR(r1,init_thread_union) + LOAD_REG_IMMEDIATE(r1,init_thread_union) addi r1,r1,THREAD_SIZE li r0,0 stdu r0,-STACK_FRAME_OVERHEAD(r1) - LOADADDR(r3,cpu_specs) - LOADADDR(r4,cur_cpu_spec) + LOAD_REG_IMMEDIATE(r3,cpu_specs) + LOAD_REG_IMMEDIATE(r4,cur_cpu_spec) li r5,0 bl .identify_cpu - LOADADDR(r2,__toc_start) + LOAD_REG_IMMEDIATE(r2,__toc_start) addi r2,r2,0x4000 addi r2,r2,0x4000 @@ -1526,7 +1527,7 @@ _GLOBAL(__start_initialization_multiplat li r24,0 /* Switch off MMU if not already */ - LOADADDR(r4, .__after_prom_start - KERNELBASE) + LOAD_REG_IMMEDIATE(r4, .__after_prom_start - KERNELBASE) add r4,r4,r30 bl .__mmu_off b .__after_prom_start @@ -1545,7 +1546,7 @@ _STATIC(__boot_from_prom) /* put a relocation offset into r3 */ bl .reloc_offset - LOADADDR(r2,__toc_start) + LOAD_REG_IMMEDIATE(r2,__toc_start) addi r2,r2,0x4000 addi r2,r2,0x4000 @@ -1584,9 +1585,9 @@ _STATIC(__after_prom_start) */ bl .reloc_offset mr r26,r3 - SET_REG_TO_CONST(r27,KERNELBASE) + LOAD_REG_IMMEDIATE(r27, KERNELBASE) - LOADADDR(r3, PHYSICAL_START) /* target addr */ + LOAD_REG_IMMEDIATE(r3, PHYSICAL_START) /* target addr */ // XXX FIXME: Use phys returned by OF (r30) add r4,r27,r26 /* source addr */ @@ -1594,7 +1595,7 @@ _STATIC(__after_prom_start) /* i.e. where we are running */ /* the source addr */ - LOADADDR(r5,copy_to_here) /* # bytes of memory to copy */ + LOAD_REG_IMMEDIATE(r5,copy_to_here) /* # bytes of memory to copy */ sub r5,r5,r27 li r6,0x100 /* Start offset, the first 0x100 */ @@ -1604,11 +1605,11 @@ _STATIC(__after_prom_start) /* this includes the code being */ /* executed here. 
*/ - LOADADDR(r0, 4f) /* Jump to the copy of this code */ + LOAD_REG_IMMEDIATE(r0, 4f) /* Jump to the copy of this code */ mtctr r0 /* that we just made/relocated */ bctr -4: LOADADDR(r5,klimit) +4: LOAD_REG_IMMEDIATE(r5,klimit) add r5,r5,r26 ld r5,0(r5) /* get the value of klimit */ sub r5,r5,r27 @@ -1690,7 +1691,7 @@ _GLOBAL(pmac_secondary_start) mtmsrd r3 /* RI on */ /* Set up a paca value for this processor. */ - LOADADDR(r4, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r4, paca) /* Get base vaddr of paca array */ mulli r13,r24,PACA_SIZE /* Calculate vaddr of right paca */ add r13,r13,r4 /* for this processor. */ mtspr SPRN_SPRG3,r13 /* Save vaddr of paca in SPRG3 */ @@ -1727,7 +1728,7 @@ _GLOBAL(__secondary_start) bl .early_setup_secondary /* Initialize the kernel stack. Just a repeat for iSeries. */ - LOADADDR(r3,current_set) + LOAD_REG_ADDR(r3, current_set) sldi r28,r24,3 /* get current_set[cpu#] */ ldx r1,r3,r28 addi r1,r1,THREAD_SIZE-STACK_FRAME_OVERHEAD @@ -1738,8 +1739,8 @@ _GLOBAL(__secondary_start) mtlr r7 /* enable MMU and jump to start_secondary */ - LOADADDR(r3,.start_secondary_prolog) - SET_REG_TO_CONST(r4, MSR_KERNEL) + LOAD_REG_ADDR(r3, .start_secondary_prolog) + LOAD_REG_IMMEDIATE(r4, MSR_KERNEL) #ifdef DO_SOFT_DISABLE ori r4,r4,MSR_EE #endif @@ -1788,8 +1789,8 @@ _STATIC(start_here_multiplatform) * be detached from the kernel completely. Besides, we need * to clear it now for kexec-style entry. */ - LOADADDR(r11,__bss_stop) - LOADADDR(r8,__bss_start) + LOAD_REG_IMMEDIATE(r11,__bss_stop) + LOAD_REG_IMMEDIATE(r8,__bss_start) sub r11,r11,r8 /* bss size */ addi r11,r11,7 /* round up to an even double word */ rldicl. r11,r11,61,3 /* shift right by 3 */ @@ -1827,7 +1828,7 @@ _STATIC(start_here_multiplatform) /* up the htab. This is done because we have relocated the */ /* kernel but are still running in real mode. 
*/ - LOADADDR(r3,init_thread_union) + LOAD_REG_IMMEDIATE(r3,init_thread_union) add r3,r3,r26 /* set up a stack pointer (physical address) */ @@ -1836,14 +1837,14 @@ _STATIC(start_here_multiplatform) stdu r0,-STACK_FRAME_OVERHEAD(r1) /* set up the TOC (physical address) */ - LOADADDR(r2,__toc_start) + LOAD_REG_IMMEDIATE(r2,__toc_start) addi r2,r2,0x4000 addi r2,r2,0x4000 add r2,r2,r26 - LOADADDR(r3,cpu_specs) + LOAD_REG_IMMEDIATE(r3, cpu_specs) add r3,r3,r26 - LOADADDR(r4,cur_cpu_spec) + LOAD_REG_IMMEDIATE(r4,cur_cpu_spec) add r4,r4,r26 mr r5,r26 bl .identify_cpu @@ -1859,11 +1860,11 @@ _STATIC(start_here_multiplatform) * nowhere it can be initialized differently before we reach this * code */ - LOADADDR(r27, boot_cpuid) + LOAD_REG_IMMEDIATE(r27, boot_cpuid) add r27,r27,r26 lwz r27,0(r27) - LOADADDR(r24, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r24, paca) /* Get base vaddr of paca array */ mulli r13,r27,PACA_SIZE /* Calculate vaddr of right paca */ add r13,r13,r24 /* for this processor. */ add r13,r13,r26 /* convert to physical addr */ @@ -1876,8 +1877,8 @@ _STATIC(start_here_multiplatform) mr r3,r31 bl .early_setup - LOADADDR(r3,.start_here_common) - SET_REG_TO_CONST(r4, MSR_KERNEL) + LOAD_REG_IMMEDIATE(r3, .start_here_common) + LOAD_REG_IMMEDIATE(r4, MSR_KERNEL) mtspr SPRN_SRR0,r3 mtspr SPRN_SRR1,r4 rfid @@ -1891,7 +1892,7 @@ _STATIC(start_here_common) /* The following code sets up the SP and TOC now that we are */ /* running with translation enabled. 
*/ - LOADADDR(r3,init_thread_union) + LOAD_REG_IMMEDIATE(r3,init_thread_union) /* set up the stack */ addi r1,r3,THREAD_SIZE @@ -1904,16 +1905,16 @@ _STATIC(start_here_common) li r3,0 bl .do_cpu_ftr_fixups - LOADADDR(r26, boot_cpuid) + LOAD_REG_IMMEDIATE(r26, boot_cpuid) lwz r26,0(r26) - LOADADDR(r24, paca) /* Get base vaddr of paca array */ + LOAD_REG_IMMEDIATE(r24, paca) /* Get base vaddr of paca array */ mulli r13,r26,PACA_SIZE /* Calculate vaddr of right paca */ add r13,r13,r24 /* for this processor. */ mtspr SPRN_SPRG3,r13 /* ptr to current */ - LOADADDR(r4,init_task) + LOAD_REG_IMMEDIATE(r4, init_task) std r4,PACACURRENT(r13) /* Load the TOC */ @@ -1936,7 +1937,7 @@ _STATIC(start_here_common) _GLOBAL(hmt_init) #ifdef CONFIG_HMT - LOADADDR(r5, hmt_thread_data) + LOAD_REG_IMMEDIATE(r5, hmt_thread_data) mfspr r7,SPRN_PVR srwi r7,r7,16 cmpwi r7,0x34 /* Pulsar */ @@ -1957,7 +1958,7 @@ _GLOBAL(hmt_init) b 101f __hmt_secondary_hold: - LOADADDR(r5, hmt_thread_data) + LOAD_REG_IMMEDIATE(r5, hmt_thread_data) clrldi r5,r5,4 li r7,0 mfspr r6,SPRN_PIR @@ -1985,7 +1986,7 @@ __hmt_secondary_hold: #ifdef CONFIG_HMT _GLOBAL(hmt_start_secondary) - LOADADDR(r4,__hmt_secondary_hold) + LOAD_REG_IMMEDIATE(r4,__hmt_secondary_hold) clrldi r4,r4,4 mtspr SPRN_NIADORM, r4 mfspr r4, SPRN_MSRDORM Index: working-2.6/arch/powerpc/kernel/misc_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/misc_64.S 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/misc_64.S 2005-12-16 15:51:35.000000000 +1100 @@ -39,7 +39,7 @@ _GLOBAL(reloc_offset) mflr r0 bl 1f 1: mflr r3 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r3,r4,r3 mtlr r0 blr @@ -51,7 +51,7 @@ _GLOBAL(add_reloc_offset) mflr r0 bl 1f 1: mflr r5 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r5,r4,r5 add r3,r3,r5 mtlr r0 @@ -498,15 +498,15 @@ _GLOBAL(identify_cpu) */ _GLOBAL(do_cpu_ftr_fixups) /* Get CPU 0 features */ - LOADADDR(r6,cur_cpu_spec) + 
LOAD_REG_IMMEDIATE(r6,cur_cpu_spec)
  sub r6,r6,r3
  ld r4,0(r6)
  sub r4,r4,r3
  ld r4,CPU_SPEC_FEATURES(r4)
  /* Get the fixup table */
- LOADADDR(r6,__start___ftr_fixup)
+ LOAD_REG_IMMEDIATE(r6,__start___ftr_fixup)
  sub r6,r6,r3
- LOADADDR(r7,__stop___ftr_fixup)
+ LOAD_REG_IMMEDIATE(r7,__stop___ftr_fixup)
  sub r7,r7,r3
  /* Do the fixup */
 1: cmpld r6,r7
Index: working-2.6/arch/powerpc/kernel/cpu_setup_power4.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/cpu_setup_power4.S 2005-11-23 15:56:22.000000000 +1100
+++ working-2.6/arch/powerpc/kernel/cpu_setup_power4.S 2005-12-16 15:51:35.000000000 +1100
@@ -130,7 +130,7 @@ _GLOBAL(__save_cpu_setup)
  mfcr r7
  /* Get storage ptr */
- LOADADDR(r5,cpu_state_storage)
+ LOAD_REG_IMMEDIATE(r5,cpu_state_storage)
  /* We only deal with 970 for now */
  mfspr r0,SPRN_PVR
@@ -164,7 +164,7 @@ _GLOBAL(__restore_cpu_setup)
  /* Get storage ptr (FIXME when using anton reloc as we
  * are running with translation disabled here */
- LOADADDR(r5,cpu_state_storage)
+ LOAD_REG_IMMEDIATE(r5,cpu_state_storage)
  /* We only deal with 970 for now */
  mfspr r0,SPRN_PVR
Index: working-2.6/arch/powerpc/kernel/entry_32.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/entry_32.S 2005-12-16 13:05:21.000000000 +1100
+++ working-2.6/arch/powerpc/kernel/entry_32.S 2005-12-16 15:51:35.000000000 +1100
@@ -988,7 +988,7 @@ _GLOBAL(enter_rtas)
  stwu r1,-INT_FRAME_SIZE(r1)
  mflr r0
  stw r0,INT_FRAME_SIZE+4(r1)
- LOADADDR(r4, rtas)
+ LOAD_REG_ADDR(r4, rtas)
  lis r6,1f@ha /* physical return address for rtas */
  addi r6,r6,1f@l
  tophys(r6,r6)
Index: working-2.6/arch/powerpc/kernel/misc_32.S
===================================================================
--- working-2.6.orig/arch/powerpc/kernel/misc_32.S 2005-12-16 13:05:21.000000000 +1100
+++ working-2.6/arch/powerpc/kernel/misc_32.S 2005-12-16 15:51:35.000000000 +1100
@@ -68,7 +68,7 @@
_GLOBAL(reloc_offset) mflr r0 bl 1f 1: mflr r3 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r3,r4,r3 mtlr r0 blr @@ -80,7 +80,7 @@ _GLOBAL(add_reloc_offset) mflr r0 bl 1f 1: mflr r5 - LOADADDR(r4,1b) + LOAD_REG_IMMEDIATE(r4,1b) subf r5,r4,r5 add r3,r3,r5 mtlr r0 Index: working-2.6/arch/powerpc/kernel/fpu.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/fpu.S 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/fpu.S 2005-12-16 15:51:35.000000000 +1100 @@ -39,9 +39,9 @@ _GLOBAL(load_up_fpu) * to another. Instead we call giveup_fpu in switch_to. */ #ifndef CONFIG_SMP - LOADBASE(r3, last_task_used_math) + LOAD_REG_ADDRBASE(r3, last_task_used_math) toreal(r3) - PPC_LL r4,OFF(last_task_used_math)(r3) + PPC_LL r4,ADDROFF(last_task_used_math)(r3) PPC_LCMPI 0,r4,0 beq 1f toreal(r4) @@ -77,7 +77,7 @@ _GLOBAL(load_up_fpu) #ifndef CONFIG_SMP subi r4,r5,THREAD fromreal(r4) - PPC_STL r4,OFF(last_task_used_math)(r3) + PPC_STL r4,ADDROFF(last_task_used_math)(r3) #endif /* CONFIG_SMP */ /* restore registers and return */ /* we haven't used ctr or xer or lr */ @@ -113,8 +113,8 @@ _GLOBAL(giveup_fpu) 1: #ifndef CONFIG_SMP li r5,0 - LOADBASE(r4,last_task_used_math) - PPC_STL r5,OFF(last_task_used_math)(r4) + LOAD_REG_ADDRBASE(r4,last_task_used_math) + PPC_STL r5,ADDROFF(last_task_used_math)(r4) #endif /* CONFIG_SMP */ blr Index: working-2.6/arch/powerpc/kernel/idle_power4.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/idle_power4.S 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/idle_power4.S 2005-12-16 15:51:35.000000000 +1100 @@ -38,14 +38,14 @@ END_FTR_SECTION_IFCLR(CPU_FTR_CAN_NAP) /* We must dynamically check for the NAP feature as it * can be cleared by CPU init after the fixups are done */ - LOADBASE(r3,cur_cpu_spec) - ld r4,OFF(cur_cpu_spec)(r3) + LOAD_REG_ADDRBASE(r3,cur_cpu_spec) + ld 
r4,ADDROFF(cur_cpu_spec)(r3)
 	ld	r4,CPU_SPEC_FEATURES(r4)
 	andi.	r0,r4,CPU_FTR_CAN_NAP
 	beqlr
 	/* Now check if user or arch enabled NAP mode */
-	LOADBASE(r3,powersave_nap)
-	lwz	r4,OFF(powersave_nap)(r3)
+	LOAD_REG_ADDRBASE(r3,powersave_nap)
+	lwz	r4,ADDROFF(powersave_nap)(r3)
 	cmpwi	0,r4,0
 	beqlr

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

From matthew at wil.cx  Sat Dec 17 00:22:24 2005
From: matthew at wil.cx (Matthew Wilcox)
Date: Fri, 16 Dec 2005 06:22:24 -0700
Subject: typedefs and structs [was Re: [PATCH 16/42]: PCI: PCI Error reporting callbacks]
In-Reply-To: <200512161509.01580.vda@ilport.com.ua>
References: <20051103235918.GA25616@mail.gnucash.org> <1131412273.14381.142.camel@localhost.localdomain> <17263.64754.79733.651186@cse.unsw.edu.au> <200512161509.01580.vda@ilport.com.ua>
Message-ID: <20051216132224.GD2361@parisc-linux.org>

On Fri, Dec 16, 2005 at 03:09:01PM +0200, Denis Vlasenko wrote:
>
> Forward decl for typedef works too:
>
> typedef struct foo foo_t;
>
> is ok even before struct foo is defined. Not sure that standards
> allow this, but gcc does.

Forward declarations of typedefs don't work in at least one case where
forward declarations of structs do:

$ cat foo.c
typedef struct foo foo_t;
typedef struct foo foo_t;
$ gcc -Wall -o foo.o -c foo.c
foo.c:2: error: redefinition of typedef 'foo_t'
foo.c:1: error: previous declaration of 'foo_t' was here

and if you don't believe we do that, take another look at our headers
sometime.
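
For illustration only, here is a minimal sketch of the struct side of that
comparison. It assumes nothing beyond standard C; the helper names
foo_is_null() and foo_value() are made up for the example and do not come
from any kernel header. A struct tag can be forward-declared repeatedly, and
pointers to it can be used before the full definition is visible. Note that
the typedef error quoted above is what compilers of that time reported; C11
later allowed repeating an identical typedef, so the duplicate typedef is
kept in a comment rather than compiled.

```c
#include <assert.h>
#include <stddef.h>

/* A struct tag may be forward-declared any number of times. */
struct foo;
struct foo;

/*
 * The typedef equivalent from the mail would be:
 *     typedef struct foo foo_t;
 *     typedef struct foo foo_t;
 * which pre-C11 compilers rejected as a redefinition.
 */

/* Hypothetical helper: only needs a pointer, so the incomplete type is enough. */
int foo_is_null(const struct foo *p)
{
	return p == NULL;
}

/* The complete definition can follow later in the translation unit. */
struct foo {
	int value;
};

/* Hypothetical helper: dereferences the pointer, so it needs the full definition. */
int foo_value(const struct foo *p)
{
	assert(p != NULL);
	return p->value;
}
```

A header that only passes struct foo pointers around could therefore carry
the bare "struct foo;" line and skip the #include of the defining header.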
From vda at ilport.com.ua  Sat Dec 17 00:09:01 2005
From: vda at ilport.com.ua (Denis Vlasenko)
Date: Fri, 16 Dec 2005 15:09:01 +0200
Subject: typedefs and structs [was Re: [PATCH 16/42]: PCI: PCI Error reporting callbacks]
In-Reply-To: <17263.64754.79733.651186@cse.unsw.edu.au>
References: <20051103235918.GA25616@mail.gnucash.org> <1131412273.14381.142.camel@localhost.localdomain> <17263.64754.79733.651186@cse.unsw.edu.au>
Message-ID: <200512161509.01580.vda@ilport.com.ua>

On Tuesday 08 November 2005 03:18, Neil Brown wrote:
> On Monday November 7, rostedt at goodmis.org wrote:
> >
> > This was for the simple reason, too many developers were passing
> > structures by value instead of by reference, just because they were
> > using a type that they didn't realize was a structure.  And to make
> > things worse, these structures started to get bigger.
> >
>
> Another reason for not using typedefs is that if you do, and you want
> to refer to the structure in some other include file, you have to
> #include the include file that defines the structure.
> If you don't use typedefs, you can just say:
>
>    struct foo;

Forward decl for typedef works too:

typedef struct foo foo_t;

is ok even before struct foo is defined. Not sure that standards
allow this, but gcc does.

> and the compiler will happily wait for the complete definition later
> (providing it doesn't need the size in the meanwhile).
> So avoiding typedef means that you can sometimes avoid excess
> #includes, which means faster compiling.
--
vda

From arnd at arndb.de  Sat Dec 17 08:08:31 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Fri, 16 Dec 2005 22:08:31 +0100
Subject: [PATCH ] cell: enable pause(0) in cpu_idle
In-Reply-To: <17313.58548.899471.558166@cargo.ozlabs.ibm.com>
References: <3a267747328be451274c52ca5721c5ec@bga.com> <17313.58548.899471.558166@cargo.ozlabs.ibm.com>
Message-ID: <200512162208.34832.arnd@arndb.de>

This patch enables support for the pause(0) power management state
on the Cell Broadband Processor, which is important for power-efficient
operation. The pervasive infrastructure will in the future enable
us to introduce more functionality specific to the Cell's
pervasive unit.

From: Maximino Aguilar
Signed-off-by: Arnd Bergmann

---

On Thursday 15 December 2005 22:48, Paul Mackerras wrote:
> Actually, why not test SRR1 (regs->msr) in C code, and call do_IRQ or
> timer_interrupt from there, or just return?  No changes to head_64.S
> needed.  If we have been asleep then presumably a few extra
> nanoseconds on waking isn't going to make a difference.
>

Ok, tried that now, seems to work fine. I discovered that there is already
a ppc_md callback for this purpose, but I changed it to not panic
automatically after returning from that.
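
As a rough user-space sketch of the idea discussed above (decoding the wake
reason from SRR1 in C rather than in assembly), the following mirrors the
switch in cbe_system_reset_exception() from this patch. The SRR1_* values are
copied from the reg.h hunk of the patch; the enum, classify_wakeup() and the
self-test function are hypothetical names invented for the example and are
not part of the kernel.

```c
#include <assert.h>

/* Wake-reason field of SRR1; values copied from the reg.h hunk of the patch. */
#define SRR1_WAKEMASK	0x00380000UL	/* reason for wakeup */
#define SRR1_WAKEEE	0x00200000UL	/* external interrupt */
#define SRR1_WAKEMT	0x00280000UL	/* mtctrl */
#define SRR1_WAKEDEC	0x00180000UL	/* decrementer interrupt */

/* Hypothetical result type standing in for the real handler calls. */
enum wake_action {
	WAKE_DO_IRQ,	/* the patch calls do_IRQ(regs) here */
	WAKE_TIMER,	/* the patch calls timer_interrupt(regs) here */
	WAKE_NOTHING,	/* mtctrl wakeup, no action required */
	WAKE_DIE,	/* anything else is treated as a real system reset */
};

/* Hypothetical helper: same case structure as the handler in the patch. */
enum wake_action classify_wakeup(unsigned long msr)
{
	switch (msr & SRR1_WAKEMASK) {
	case SRR1_WAKEEE:
		return WAKE_DO_IRQ;
	case SRR1_WAKEDEC:
		return WAKE_TIMER;
	case SRR1_WAKEMT:
		return WAKE_NOTHING;
	default:
		return WAKE_DIE;
	}
}

/* Small self-check of the mapping; bits outside the mask are ignored. */
void classify_wakeup_selftest(void)
{
	assert(classify_wakeup(SRR1_WAKEEE) == WAKE_DO_IRQ);
	assert(classify_wakeup(SRR1_WAKEDEC | 0x1UL) == WAKE_TIMER);
	assert(classify_wakeup(SRR1_WAKEMT) == WAKE_NOTHING);
	assert(classify_wakeup(SRR1_WAKEMASK) == WAKE_DIE);
}
```

The default case covers the remaining encodings (system reset, system error,
thermal), which is why the generic handler in traps.c has to return after the
callback instead of falling through to die().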
Arnd <>< Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/Makefile +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile @@ -1,4 +1,6 @@ obj-y += interrupt.o iommu.o setup.o spider-pic.o +obj-y += pervasive.o + obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SPU_FS) += spufs/ spu_base.o builtin-spufs-$(CONFIG_SPU_FS) += spu_syscalls.o Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c @@ -0,0 +1,216 @@ +/* + * CBE Pervasive Monitor and Debug + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * Michael N. Day (mnday at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#undef DEBUG + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pervasive.h" + +static spinlock_t cbe_pervasive_lock; +struct cbe_pervasive { + struct pmd_regs __iomem *regs; + unsigned int thread; +}; + +/* can't use per_cpu from setup_arch */ +static struct cbe_pervasive cbe_pervasive[NR_CPUS]; + +static void __init cbe_enable_pause_zero(void) +{ + unsigned long thread_switch_control; + unsigned long temp_register; + struct cbe_pervasive *p; + int thread; + + p = &cbe_pervasive[get_cpu()]; + spin_lock_irq(&cbe_pervasive_lock); + + if (!cbe_pervasive->regs) + goto out; + + pr_debug("Power Management: CPU %d\n", smp_processor_id()); + + /* Enable Pause(0) control bit */ + temp_register = in_be64(&p->regs->pm_control); + + out_be64(&p->regs->pm_control, + temp_register|PMD_PAUSE_ZERO_CONTROL); + + /* Enable DEC and EE interrupt request */ + thread_switch_control = mfspr(SPRN_TSC_CELL); + thread_switch_control |= TSC_CELL_EE_ENABLE | TSC_CELL_EE_BOOST; + + switch ((mfspr(SPRN_CTRLF) & CTRL_CT)) { + case CTRL_CT0: + thread_switch_control |= TSC_CELL_DEC_ENABLE_0; + thread = 0; + break; + case CTRL_CT1: + thread_switch_control |= TSC_CELL_DEC_ENABLE_1; + thread = 1; + break; + default: + printk(KERN_WARNING "%s: unknown configuration\n", + __FUNCTION__); + thread = -1; + break; + } + + if (p->thread != thread) + printk(KERN_WARNING "%s: device tree inconsistant, " + "cpu %i: %d/%d\n", __FUNCTION__, + smp_processor_id(), + p->thread, thread); + + mtspr(SPRN_TSC_CELL, thread_switch_control); + +out: + spin_unlock_irq(&cbe_pervasive_lock); + put_cpu(); +} + +static void cbe_idle(void) +{ + unsigned long ctrl; + + cbe_enable_pause_zero(); + + while (1) { + if (!need_resched()) { + while (!need_resched()) { + /* go into low thread priority */ + HMT_low(); + + /* go into low power mode */ + local_irq_disable(); + ctrl = mfspr(SPRN_CTRLF); + ctrl &= ~(CTRL_RUNLATCH | CTRL_TE); + 
mtspr(SPRN_CTRLT, ctrl); + local_irq_enable(); + } + /* restore thread prio */ + HMT_medium(); + } + + ppc64_runlatch_on(); + preempt_enable_no_resched(); + schedule(); + preempt_disable(); + } +} + +void cbe_system_reset_exception(struct pt_regs *regs) +{ + switch (regs->msr & SRR1_WAKEMASK) { + case SRR1_WAKEEE: + do_IRQ(regs); + break; + case SRR1_WAKEDEC: + timer_interrupt(regs); + break; + case SRR1_WAKEMT: + /* no action required */ + break; + default: + die("System Reset", regs, SIGABRT); + } +} + +static int __init cbe_find_pmd_mmio(int cpu, struct cbe_pervasive *p) +{ + struct device_node *node; + unsigned int *int_servers; + char *addr; + unsigned long real_address; + unsigned int size; + + struct pmd_regs __iomem *pmd_mmio_area; + int hardid, thread; + int proplen; + + pmd_mmio_area = NULL; + hardid = get_hard_smp_processor_id(cpu); + for (node = NULL; (node = of_find_node_by_type(node, "cpu"));) { + int_servers = (void *) get_property(node, + "ibm,ppc-interrupt-server#s", &proplen); + if (!int_servers) { + printk(KERN_WARNING "CPU device misses " + "ibm,ppc-interrupt-server#s property"); + continue; + } + for (thread = 0; thread < proplen / sizeof (int); thread++) { + if (hardid == int_servers[thread]) { + addr = get_property(node, "pervasive", NULL); + goto found; + } + } + } + + printk(KERN_WARNING "%s: CPU %d not found\n", __FUNCTION__, cpu); + return -EINVAL; + +found: + real_address = *(unsigned long*) addr; + addr += sizeof (unsigned long); + size = *(unsigned int*) addr; + + pr_debug("pervasive area for CPU %d at %lx, size %x\n", + cpu, real_address, size); + p->regs = __ioremap(real_address, size, _PAGE_NO_CACHE); + p->thread = thread; + return 0; +} + +void __init cell_pervasive_init(void) +{ + struct cbe_pervasive *p; + int cpu; + int ret; + + spin_lock_init(&cbe_pervasive_lock); + + if (!cpu_has_feature(CPU_FTR_PAUSE_ZERO)) + return; + + for_each_cpu(cpu) { + p = &cbe_pervasive[cpu]; + ret = cbe_find_pmd_mmio(cpu, p); + if (ret) + return; + } 
+ + ppc_md.idle_loop = cbe_idle; + ppc_md.system_reset_exception = cbe_system_reset_exception; +} Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h @@ -0,0 +1,62 @@ +/* + * Cell Pervasive Monitor and Debug interface and HW structures + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * David J. Erb (djerb at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#ifndef PERVASIVE_H +#define PERVASIVE_H + +struct pmd_regs { + u8 pad_0x0000_0x0800[0x0800 - 0x0000]; /* 0x0000 */ + + /* Thermal Sensor Registers */ + u64 ts_ctsr1; /* 0x0800 */ + u64 ts_ctsr2; /* 0x0808 */ + u64 ts_mtsr1; /* 0x0810 */ + u64 ts_mtsr2; /* 0x0818 */ + u64 ts_itr1; /* 0x0820 */ + u64 ts_itr2; /* 0x0828 */ + u64 ts_gitr; /* 0x0830 */ + u64 ts_isr; /* 0x0838 */ + u64 ts_imr; /* 0x0840 */ + u64 tm_cr1; /* 0x0848 */ + u64 tm_cr2; /* 0x0850 */ + u64 tm_simr; /* 0x0858 */ + u64 tm_tpr; /* 0x0860 */ + u64 tm_str1; /* 0x0868 */ + u64 tm_str2; /* 0x0870 */ + u64 tm_tsr; /* 0x0878 */ + + /* Power Management */ + u64 pm_control; /* 0x0880 */ +#define PMD_PAUSE_ZERO_CONTROL 0x10000 + u64 pm_status; /* 0x0888 */ + + /* Time Base Register */ + u64 tbr; /* 0x0890 */ + + u8 pad_0x0898_0x1000 [0x1000 - 0x0898]; /* 0x0898 */ +}; + +void __init cell_pervasive_init(void); + +#endif Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/setup.c +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c @@ -49,6 +49,7 @@ #include "interrupt.h" #include "iommu.h" +#include "pervasive.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -165,6 +166,7 @@ static void __init cell_setup_arch(void) init_pci_config_tokens(); find_and_init_phbs(); spider_init_IRQ(); + cell_pervasive_init(); #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif Index: linux-2.6.15-rc/include/asm-powerpc/cputable.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/cputable.h +++ linux-2.6.15-rc/include/asm-powerpc/cputable.h @@ -106,6 +106,7 @@ extern void do_cpu_ftr_fixups(unsigned l #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0000040000000000) #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0000100000000000) +#define CPU_FTR_PAUSE_ZERO ASM_CONST(0x0000200000000000) #else /* ensure on 32b processors the flags are available for compiling but * don't do anything */ @@ -305,7 +306,8 @@ enum { CPU_FTR_MMCRA_SIHV, CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | - CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT, + CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | + CPU_FTR_CTRL | CPU_FTR_PAUSE_ZERO, CPU_FTRS_COMPATIBLE = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2, #endif Index: linux-2.6.15-rc/include/asm-powerpc/reg.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/reg.h +++ linux-2.6.15-rc/include/asm-powerpc/reg.h @@ -145,6 +145,10 @@ #define SPRN_CTR 0x009 /* Count Register */ #define SPRN_CTRLF 0x088 #define SPRN_CTRLT 0x098 +#define CTRL_CT 0xc0000000 /* current thread */ +#define CTRL_CT0 0x80000000 /* thread 0 */ +#define CTRL_CT1 0x40000000 /* thread 1 */ +#define CTRL_TE 0x00c00000 /* thread enable */ #define CTRL_RUNLATCH 0x1 #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ #define DABR_TRANSLATION (1UL << 2) @@ -257,11 +261,11 @@ #define SPRN_HID6 0x3F9 /* BE HID 6 */ #define HID6_LB (0x0F<<12) /* 
Concurrent Large Page Modes */ #define HID6_DLP (1<<20) /* Disable all large page modes (4K only) */ -#define SPRN_TSCR 0x399 /* Thread switch control on BE */ -#define SPRN_TTR 0x39A /* Thread switch timeout on BE */ -#define TSCR_DEC_ENABLE 0x200000 /* Decrementer Interrupt */ -#define TSCR_EE_ENABLE 0x100000 /* External Interrupt */ -#define TSCR_EE_BOOST 0x080000 /* External Interrupt Boost */ +#define SPRN_TSC_CELL 0x399 /* Thread switch control on Cell */ +#define TSC_CELL_DEC_ENABLE_0 0x400000 /* Decrementer Interrupt */ +#define TSC_CELL_DEC_ENABLE_1 0x200000 /* Decrementer Interrupt */ +#define TSC_CELL_EE_ENABLE 0x100000 /* External Interrupt */ +#define TSC_CELL_EE_BOOST 0x080000 /* External Interrupt Boost */ #define SPRN_TSC 0x3FD /* Thread switch control on others */ #define SPRN_TST 0x3FC /* Thread switch timeout on others */ #if !defined(SPRN_IAC1) && !defined(SPRN_IAC2) @@ -375,6 +379,14 @@ #define SPRN_SPRG7 0x117 /* Special Purpose Register General 7 */ #define SPRN_SRR0 0x01A /* Save/Restore Register 0 */ #define SPRN_SRR1 0x01B /* Save/Restore Register 1 */ +#define SRR1_WAKEMASK 0x00380000 /* reason for wakeup */ +#define SRR1_WAKERESET 0x00380000 /* System reset */ +#define SRR1_WAKESYSERR 0x00300000 /* System error */ +#define SRR1_WAKEEE 0x00200000 /* External interrupt */ +#define SRR1_WAKEMT 0x00280000 /* mtctrl */ +#define SRR1_WAKEDEC 0x00180000 /* Decrementer interrupt */ +#define SRR1_WAKETHERM 0x00100000 /* Thermal management interrupt */ + #ifndef SPRN_SVR #define SPRN_SVR 0x11E /* System Version Register */ #endif Index: linux-2.6.15-rc/arch/powerpc/kernel/cputable.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/cputable.c +++ linux-2.6.15-rc/arch/powerpc/kernel/cputable.c @@ -273,7 +273,7 @@ struct cpu_spec cpu_specs[] = { .oprofile_model = &op_model_power4, #endif }, - { /* BE DD1.x */ + { /* Cell Broadband Engine */ .pvr_mask = 0xffff0000, .pvr_value = 
0x00700000,
 		.cpu_name		= "Cell Broadband Engine",
Index: linux-2.6.15-rc/arch/powerpc/kernel/traps.c
===================================================================
--- linux-2.6.15-rc.orig/arch/powerpc/kernel/traps.c
+++ linux-2.6.15-rc/arch/powerpc/kernel/traps.c
@@ -230,8 +230,10 @@ void _exception(int signr, struct pt_reg
 void system_reset_exception(struct pt_regs *regs)
 {
 	/* See if any machine dependent calls */
-	if (ppc_md.system_reset_exception)
+	if (ppc_md.system_reset_exception) {
 		ppc_md.system_reset_exception(regs);
+		return;
+	}
 
 	die("System Reset", regs, SIGABRT);

From arnd at arndb.de  Sat Dec 17 08:43:46 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Fri, 16 Dec 2005 22:43:46 +0100
Subject: powerpc: sanitize header files for user space includes
In-Reply-To: <20051213082131.GA14838@suse.de>
References: <20051212204532.GJ23641@krispykreme> <1134432067.6989.137.camel@gaston> <20051213082131.GA14838@suse.de>
Message-ID: <200512162243.51740.arnd@arndb.de>

include/asm-ppc/ had #ifdef __KERNEL__ in all header files that
are not meant for use by user space; include/asm-powerpc does
not have this yet.

This patch gets us a lot closer to that. There are a few cases
where I was not sure, so I left them out. I have verified
that no CONFIG_* symbols are used outside of __KERNEL__
any more and that there are no obvious compile errors when
including any of the headers in user space libraries.

Signed-off-by: Arnd Bergmann

---

On Tuesday 13 December 2005 09:21, Olaf Hering wrote:
> asm-ppc/ had the good/bad feature that everything was inside #ifdef
> __KERNEL__. Can we have that back please?
> I just gave this a try: include/asm-powerpc/ppc_asm.h | 3 +++ include/asm-powerpc/abs_addr.h | 2 ++ include/asm-powerpc/agp.h | 2 ++ include/asm-powerpc/asm-compat.h | 3 ++- include/asm-powerpc/bootx.h | 5 +++++ include/asm-powerpc/bug.h | 2 ++ include/asm-powerpc/checksum.h | 2 ++ include/asm-powerpc/compat.h | 2 ++ include/asm-powerpc/cputable.h | 1 - include/asm-powerpc/current.h | 2 ++ include/asm-powerpc/delay.h | 2 ++ include/asm-powerpc/dma-mapping.h | 2 ++ include/asm-powerpc/dma.h | 2 ++ include/asm-powerpc/eeh.h | 2 ++ include/asm-powerpc/eeh_event.h | 2 ++ include/asm-powerpc/elf.h | 3 +++ include/asm-powerpc/floppy.h | 2 ++ include/asm-powerpc/grackle.h | 5 +++++ include/asm-powerpc/hardirq.h | 2 ++ include/asm-powerpc/heathrow.h | 5 +++++ include/asm-powerpc/hvcall.h | 2 ++ include/asm-powerpc/hvconsole.h | 2 ++ include/asm-powerpc/hvcserver.h | 2 ++ include/asm-powerpc/i8259.h | 2 ++ include/asm-powerpc/ibmebus.h | 2 ++ include/asm-powerpc/io.h | 6 +----- include/asm-powerpc/iommu.h | 2 ++ include/asm-powerpc/kdebug.h | 2 ++ include/asm-powerpc/kexec.h | 2 ++ include/asm-powerpc/keylargo.h | 5 +++++ include/asm-powerpc/kprobes.h | 2 ++ include/asm-powerpc/lmb.h | 2 ++ include/asm-powerpc/lppaca.h | 2 ++ include/asm-powerpc/macio.h | 2 ++ include/asm-powerpc/mmu.h | 2 ++ include/asm-powerpc/mmu_context.h | 2 ++ include/asm-powerpc/mmzone.h | 2 ++ include/asm-powerpc/module.h | 2 ++ include/asm-powerpc/mpic.h | 2 ++ include/asm-powerpc/numnodes.h | 2 ++ include/asm-powerpc/nvram.h | 4 ++++ include/asm-powerpc/of_device.h | 2 ++ include/asm-powerpc/ohare.h | 6 ++++++ include/asm-powerpc/oprofile_impl.h | 2 ++ include/asm-powerpc/pSeries_reconfig.h | 2 ++ include/asm-powerpc/paca.h | 2 ++ include/asm-powerpc/page_32.h | 2 ++ include/asm-powerpc/page_64.h | 2 ++ include/asm-powerpc/param.h | 2 -- include/asm-powerpc/parport.h | 2 ++ include/asm-powerpc/pci-bridge.h | 2 ++ include/asm-powerpc/pgalloc.h | 2 ++ include/asm-powerpc/pgtable-64k.h | 6 ++++++ 
include/asm-powerpc/pgtable.h | 2 ++ include/asm-powerpc/pmac_low_i2c.h | 2 ++ include/asm-powerpc/pmc.h | 2 ++ include/asm-powerpc/ppc-pci.h | 2 ++ include/asm-powerpc/processor.h | 4 ++-- include/asm-powerpc/rtas.h | 2 ++ include/asm-powerpc/seccomp.h | 4 ++++ include/asm-powerpc/sections.h | 2 ++ include/asm-powerpc/signal.h | 7 +++++-- include/asm-powerpc/smu.h | 22 ++++++++++++---------- include/asm-powerpc/sparsemem.h | 2 ++ include/asm-powerpc/spinlock.h | 2 ++ include/asm-powerpc/spu.h | 3 +++ include/asm-powerpc/spu_csa.h | 3 +-- include/asm-powerpc/synch.h | 4 ++-- include/asm-powerpc/system.h | 3 +-- include/asm-powerpc/tce.h | 2 ++ include/asm-powerpc/tlb.h | 2 ++ include/asm-powerpc/topology.h | 2 ++ include/asm-powerpc/udbg.h | 2 ++ include/asm-powerpc/vdso_datapage.h | 2 ++ include/asm-powerpc/vio.h | 2 ++ 75 files changed, 183 insertions(+), 29 deletions(-) Index: powerpc-2.6/include/asm-powerpc/abs_addr.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/abs_addr.h +++ powerpc-2.6/include/asm-powerpc/abs_addr.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_ABS_ADDR_H #define _ASM_POWERPC_ABS_ADDR_H +#ifdef __KERNEL__ #include @@ -70,4 +71,5 @@ static inline unsigned long phys_to_abs( #define iseries_hv_addr(virtaddr) \ (0x8000000000000000 | virt_to_abs(virtaddr)) +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_ABS_ADDR_H */ Index: powerpc-2.6/include/asm-powerpc/agp.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/agp.h +++ powerpc-2.6/include/asm-powerpc/agp.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_AGP_H #define _ASM_POWERPC_AGP_H +#ifdef __KERNEL__ #include @@ -18,4 +19,5 @@ #define free_gatt_pages(table, order) \ free_pages((unsigned long)(table), (order)) +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_AGP_H */ Index: powerpc-2.6/include/asm-powerpc/asm-compat.h =================================================================== --- 
powerpc-2.6.orig/include/asm-powerpc/asm-compat.h +++ powerpc-2.6/include/asm-powerpc/asm-compat.h @@ -1,7 +1,6 @@ #ifndef _ASM_POWERPC_ASM_COMPAT_H #define _ASM_POWERPC_ASM_COMPAT_H -#include #include #ifdef __ASSEMBLY__ @@ -41,6 +40,7 @@ #endif +#ifdef __KERNEL__ #ifdef CONFIG_IBM405_ERR77 /* Erratum #77 on the 405 means we need a sync or dcbt before every * stwcx. The old ATOMIC_SYNC_FIX covered some but not all of this. @@ -51,5 +51,6 @@ #define PPC405_ERR77(ra,rb) #define PPC405_ERR77_SYNC #endif +#endif #endif /* _ASM_POWERPC_ASM_COMPAT_H */ Index: powerpc-2.6/include/asm-powerpc/bootx.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/bootx.h +++ powerpc-2.6/include/asm-powerpc/bootx.h @@ -9,6 +9,8 @@ #ifndef __ASM_BOOTX_H__ #define __ASM_BOOTX_H__ +#include + #ifdef macintosh #include #include "linux_type_defs.h" @@ -122,6 +124,7 @@ typedef struct boot_infos } boot_infos_t; +#ifdef __KERNEL__ /* (*) The format of the colormap is 256 * 3 * 2 bytes. Each color index * is represented by 3 short words containing a 16 bits (unsigned) color * component. 
Later versions may contain the gamma table for direct-color @@ -159,6 +162,8 @@ struct bootx_dt_node { extern void bootx_init(unsigned long r4, unsigned long phys); +#endif /* __KERNEL__ */ + #ifdef macintosh #pragma options align=reset #endif Index: powerpc-2.6/include/asm-powerpc/bug.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/bug.h +++ powerpc-2.6/include/asm-powerpc/bug.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_BUG_H #define _ASM_POWERPC_BUG_H +#ifdef __KERNEL__ #include /* @@ -67,4 +68,5 @@ struct bug_entry *find_bug(unsigned long #include +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_BUG_H */ Index: powerpc-2.6/include/asm-powerpc/checksum.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/checksum.h +++ powerpc-2.6/include/asm-powerpc/checksum.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_CHECKSUM_H #define _ASM_POWERPC_CHECKSUM_H +#ifdef __KERNEL__ /* * This program is free software; you can redistribute it and/or @@ -129,4 +130,5 @@ static inline unsigned long csum_tcpudp_ } #endif +#endif /* __KERNEL__ */ #endif Index: powerpc-2.6/include/asm-powerpc/compat.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/compat.h +++ powerpc-2.6/include/asm-powerpc/compat.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_COMPAT_H #define _ASM_POWERPC_COMPAT_H +#ifdef __KERNEL__ /* * Architecture specific compatibility types */ @@ -202,4 +203,5 @@ struct compat_shmid64_ds { compat_ulong_t __unused6; }; +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_COMPAT_H */ Index: powerpc-2.6/include/asm-powerpc/cputable.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/cputable.h +++ powerpc-2.6/include/asm-powerpc/cputable.h @@ -1,7 +1,6 @@ #ifndef __ASM_POWERPC_CPUTABLE_H #define __ASM_POWERPC_CPUTABLE_H -#include #include #define PPC_FEATURE_32 
0x80000000 Index: powerpc-2.6/include/asm-powerpc/current.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/current.h +++ powerpc-2.6/include/asm-powerpc/current.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_CURRENT_H #define _ASM_POWERPC_CURRENT_H +#ifdef __KERNEL__ /* * This program is free software; you can redistribute it and/or @@ -24,4 +25,5 @@ register struct task_struct *current asm #endif +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_CURRENT_H */ Index: powerpc-2.6/include/asm-powerpc/delay.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/delay.h +++ powerpc-2.6/include/asm-powerpc/delay.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_DELAY_H #define _ASM_POWERPC_DELAY_H +#ifdef __KERNEL__ /* * Copyright 1996, Paul Mackerras. @@ -16,4 +17,5 @@ extern void __delay(unsigned long loops); extern void udelay(unsigned long usecs); +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_DELAY_H */ Index: powerpc-2.6/include/asm-powerpc/dma-mapping.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/dma-mapping.h +++ powerpc-2.6/include/asm-powerpc/dma-mapping.h @@ -6,6 +6,7 @@ */ #ifndef _ASM_DMA_MAPPING_H #define _ASM_DMA_MAPPING_H +#ifdef __KERNEL__ #include #include @@ -282,4 +283,5 @@ struct dma_mapping_ops { int (*dac_dma_supported)(struct device *dev, u64 mask); }; +#endif /* __KERNEL__ */ #endif /* _ASM_DMA_MAPPING_H */ Index: powerpc-2.6/include/asm-powerpc/dma.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/dma.h +++ powerpc-2.6/include/asm-powerpc/dma.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_DMA_H #define _ASM_POWERPC_DMA_H +#ifdef __KERNEL__ /* * Defines for using and allocating dma channels. 
@@ -387,4 +388,5 @@ extern int isa_dma_bridge_buggy; #endif /* !defined(CONFIG_PPC_ISERIES) || defined(CONFIG_PCI) */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_DMA_H */ Index: powerpc-2.6/include/asm-powerpc/eeh.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/eeh.h +++ powerpc-2.6/include/asm-powerpc/eeh.h @@ -19,6 +19,7 @@ #ifndef _PPC64_EEH_H #define _PPC64_EEH_H +#ifdef __KERNEL__ #include #include @@ -375,4 +376,5 @@ static inline void eeh_insl_ns(unsigned eeh_check_failure((void __iomem *)(port), *(u32*)buf); } +#endif /* __KERNEL__ */ #endif /* _PPC64_EEH_H */ Index: powerpc-2.6/include/asm-powerpc/eeh_event.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/eeh_event.h +++ powerpc-2.6/include/asm-powerpc/eeh_event.h @@ -20,6 +20,7 @@ #ifndef ASM_PPC64_EEH_EVENT_H #define ASM_PPC64_EEH_EVENT_H +#ifdef __KERNEL__ /** EEH event -- structure holding pci controller data that describes * a change in the isolation status of a PCI slot. 
A pointer @@ -52,4 +53,5 @@ int eeh_send_failure_event (struct devic /* Main recovery function */ void handle_eeh_events (struct eeh_event *); +#endif /* __KERNEL__ */ #endif /* ASM_PPC64_EEH_EVENT_H */ Index: powerpc-2.6/include/asm-powerpc/elf.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/elf.h +++ powerpc-2.6/include/asm-powerpc/elf.h @@ -1,7 +1,10 @@ #ifndef _ASM_POWERPC_ELF_H #define _ASM_POWERPC_ELF_H +#ifdef __KERNEL__ #include /* for task_struct */ +#endif + #include #include #include Index: powerpc-2.6/include/asm-powerpc/floppy.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/floppy.h +++ powerpc-2.6/include/asm-powerpc/floppy.h @@ -9,6 +9,7 @@ */ #ifndef __ASM_POWERPC_FLOPPY_H #define __ASM_POWERPC_FLOPPY_H +#ifdef __KERNEL__ #include #include @@ -102,4 +103,5 @@ static int FDC2 = -1; #define EXTRA_FLOPPY_PARAMS +#endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_FLOPPY_H */ Index: powerpc-2.6/include/asm-powerpc/grackle.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/grackle.h +++ powerpc-2.6/include/asm-powerpc/grackle.h @@ -1,3 +1,6 @@ +#ifndef _ASM_POWERPC_GRACKLE_H +#define _ASM_POWERPC_GRACKLE_H +#ifdef __KERNEL__ /* * Functions for setting up and using a MPC106 northbridge */ @@ -5,3 +8,5 @@ #include extern void setup_grackle(struct pci_controller *hose); +#endif /* __KERNEL__ */ +#endif /* _ASM_POWERPC_GRACKLE_H */ Index: powerpc-2.6/include/asm-powerpc/hardirq.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/hardirq.h +++ powerpc-2.6/include/asm-powerpc/hardirq.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_HARDIRQ_H #define _ASM_POWERPC_HARDIRQ_H +#ifdef __KERNEL__ #include #include @@ -21,4 +22,5 @@ static inline void ack_bad_irq(int irq) BUG(); } +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_HARDIRQ_H */ 
Index: powerpc-2.6/include/asm-powerpc/heathrow.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/heathrow.h +++ powerpc-2.6/include/asm-powerpc/heathrow.h @@ -1,3 +1,6 @@ +#ifndef _ASM_POWERPC_HEATHROW_H +#define _ASM_POWERPC_HEATHROW_H +#ifdef __KERNEL__ /* * heathrow.h: definitions for using the "Heathrow" I/O controller chip. * @@ -60,3 +63,5 @@ /* Looks like Heathrow has some sort of GPIOs as well... */ #define HRW_GPIO_MODEM_RESET 0x6d +#endif /* __KERNEL__ */ +#endif /* _ASM_POWERPC_HEATHROW_H */ Index: powerpc-2.6/include/asm-powerpc/hvcall.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/hvcall.h +++ powerpc-2.6/include/asm-powerpc/hvcall.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_HVCALL_H #define _ASM_POWERPC_HVCALL_H +#ifdef __KERNEL__ #define HVSC .long 0x44000022 @@ -170,4 +171,5 @@ long plpar_hcall_4out(unsigned long opco unsigned long *out4); #endif /* __ASSEMBLY__ */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_HVCALL_H */ Index: powerpc-2.6/include/asm-powerpc/hvconsole.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/hvconsole.h +++ powerpc-2.6/include/asm-powerpc/hvconsole.h @@ -21,6 +21,7 @@ #ifndef _PPC64_HVCONSOLE_H #define _PPC64_HVCONSOLE_H +#ifdef __KERNEL__ /* * This is the max number of console adapters that can/will be found as @@ -46,4 +47,5 @@ extern struct hvc_struct * __devinit hvc struct hv_ops *ops); /* remove a vterm from hvc tty operation (modele_exit or hotplug remove) */ extern int __devexit hvc_remove(struct hvc_struct *hp); +#endif /* __KERNEL__ */ #endif /* _PPC64_HVCONSOLE_H */ Index: powerpc-2.6/include/asm-powerpc/hvcserver.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/hvcserver.h +++ powerpc-2.6/include/asm-powerpc/hvcserver.h @@ -21,6 +21,7 @@ #ifndef _PPC64_HVCSERVER_H
#define _PPC64_HVCSERVER_H +#ifdef __KERNEL__ #include @@ -54,4 +55,5 @@ extern int hvcs_register_connection(uint uint32_t p_partition_ID, uint32_t p_unit_address); extern int hvcs_free_connection(uint32_t unit_address); +#endif /* __KERNEL__ */ #endif /* _PPC64_HVCSERVER_H */ Index: powerpc-2.6/include/asm-powerpc/i8259.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/i8259.h +++ powerpc-2.6/include/asm-powerpc/i8259.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_I8259_H #define _ASM_POWERPC_I8259_H +#ifdef __KERNEL__ #include @@ -9,4 +10,5 @@ extern void i8259_init(unsigned long int extern int i8259_irq(struct pt_regs *regs); extern int i8259_irq_cascade(struct pt_regs *regs, void *unused); +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_I8259_H */ Index: powerpc-2.6/include/asm-powerpc/ibmebus.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/ibmebus.h +++ powerpc-2.6/include/asm-powerpc/ibmebus.h @@ -37,6 +37,7 @@ #ifndef _ASM_EBUS_H #define _ASM_EBUS_H +#ifdef __KERNEL__ #include #include @@ -80,4 +81,5 @@ static inline struct ibmebus_dev *to_ibm } +#endif /* __KERNEL__ */ #endif /* _ASM_IBMEBUS_H */ Index: powerpc-2.6/include/asm-powerpc/io.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/io.h +++ powerpc-2.6/include/asm-powerpc/io.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_IO_H #define _ASM_POWERPC_IO_H +#ifdef __KERNEL__ /* * This program is free software; you can redistribute it and/or @@ -186,7 +187,6 @@ extern void _outsl_ns(volatile u32 __iom #define IO_SPACE_LIMIT ~(0UL) -#ifdef __KERNEL__ extern int __ioremap_explicit(unsigned long p_addr, unsigned long v_addr, unsigned long size, unsigned long flags); extern void __iomem *__ioremap(unsigned long address, unsigned long size, @@ -256,8 +256,6 @@ static inline void * phys_to_virt(unsign */ #define BIO_VMERGE_BOUNDARY 0 -#endif /* 
__KERNEL__ */ - static inline void iosync(void) { __asm__ __volatile__ ("sync" : : : "memory"); @@ -405,8 +403,6 @@ static inline void out_be64(volatile uns #include #endif -#ifdef __KERNEL__ - /** * check_signature - find BIOS signatures * @io_addr: mmio address to check Index: powerpc-2.6/include/asm-powerpc/iommu.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/iommu.h +++ powerpc-2.6/include/asm-powerpc/iommu.h @@ -20,6 +20,7 @@ #ifndef _ASM_IOMMU_H #define _ASM_IOMMU_H +#ifdef __KERNEL__ #include #include @@ -115,4 +116,5 @@ static inline void pci_iommu_init(void) extern void alloc_dart_table(void); +#endif /* __KERNEL__ */ #endif /* _ASM_IOMMU_H */ Index: powerpc-2.6/include/asm-powerpc/kdebug.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/kdebug.h +++ powerpc-2.6/include/asm-powerpc/kdebug.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_KDEBUG_H #define _ASM_POWERPC_KDEBUG_H +#ifdef __KERNEL__ /* nearly identical to x86_64/i386 code */ @@ -39,4 +40,5 @@ static inline int notify_die(enum die_va return notifier_call_chain(&powerpc_die_chain, val, &args); } +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_KDEBUG_H */ Index: powerpc-2.6/include/asm-powerpc/kexec.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/kexec.h +++ powerpc-2.6/include/asm-powerpc/kexec.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_KEXEC_H #define _ASM_POWERPC_KEXEC_H +#ifdef __KERNEL__ /* * Maximum page that is mapped directly into kernel memory. @@ -58,4 +59,5 @@ extern void default_machine_crash_shutdo #endif /* !CONFIG_KEXEC */ #endif /* ! 
__ASSEMBLY__ */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_KEXEC_H */ Index: powerpc-2.6/include/asm-powerpc/keylargo.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/keylargo.h +++ powerpc-2.6/include/asm-powerpc/keylargo.h @@ -1,3 +1,6 @@ +#ifndef _ASM_POWERPC_KEYLARGO_H +#define _ASM_POWERPC_KEYLARGO_H +#ifdef __KERNEL__ /* * keylargo.h: definitions for using the "KeyLargo" I/O controller chip. * @@ -254,3 +257,5 @@ #define SH_FCR1_I2S2_ENABLE 0x00000080 #define SH_FCR3_I2S2_CLK18_ENABLE 0x00008000 +#endif /* __KERNEL__ */ +#endif /* _ASM_POWERPC_KEYLARGO_H */ Index: powerpc-2.6/include/asm-powerpc/kprobes.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/kprobes.h +++ powerpc-2.6/include/asm-powerpc/kprobes.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_KPROBES_H #define _ASM_POWERPC_KPROBES_H +#ifdef __KERNEL__ /* * Kernel Probes (KProbes) * @@ -78,4 +79,5 @@ static inline int kprobe_exceptions_noti return 0; } #endif +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_KPROBES_H */ Index: powerpc-2.6/include/asm-powerpc/lmb.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/lmb.h +++ powerpc-2.6/include/asm-powerpc/lmb.h @@ -1,5 +1,6 @@ #ifndef _PPC64_LMB_H #define _PPC64_LMB_H +#ifdef __KERNEL__ /* * Definitions for talking to the Open Firmware PROM on @@ -78,4 +79,5 @@ lmb_end_pfn(struct lmb_region *type, uns lmb_size_pages(type, region_nr); } +#endif /* __KERNEL__ */ #endif /* _PPC64_LMB_H */ Index: powerpc-2.6/include/asm-powerpc/lppaca.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/lppaca.h +++ powerpc-2.6/include/asm-powerpc/lppaca.h @@ -18,6 +18,7 @@ */ #ifndef _ASM_POWERPC_LPPACA_H #define _ASM_POWERPC_LPPACA_H +#ifdef __KERNEL__ //============================================================================= // @@ 
-128,4 +129,5 @@ struct lppaca { u8 pmc_save_area[256]; // PMC interrupt Area x00-xFF }; +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_LPPACA_H */ Index: powerpc-2.6/include/asm-powerpc/macio.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/macio.h +++ powerpc-2.6/include/asm-powerpc/macio.h @@ -1,5 +1,6 @@ #ifndef __MACIO_ASIC_H__ #define __MACIO_ASIC_H__ +#ifdef __KERNEL__ #include @@ -137,4 +138,5 @@ struct macio_driver extern int macio_register_driver(struct macio_driver *); extern void macio_unregister_driver(struct macio_driver *); +#endif /* __KERNEL__ */ #endif /* __MACIO_ASIC_H__ */ Index: powerpc-2.6/include/asm-powerpc/mmu.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/mmu.h +++ powerpc-2.6/include/asm-powerpc/mmu.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_MMU_H_ #define _ASM_POWERPC_MMU_H_ +#ifdef __KERNEL__ #ifndef CONFIG_PPC64 #include @@ -402,4 +403,5 @@ typedef unsigned long phys_addr_t; #endif /* __ASSEMBLY */ #endif /* CONFIG_PPC64 */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_MMU_H_ */ Index: powerpc-2.6/include/asm-powerpc/mmu_context.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/mmu_context.h +++ powerpc-2.6/include/asm-powerpc/mmu_context.h @@ -1,5 +1,6 @@ #ifndef __ASM_POWERPC_MMU_CONTEXT_H #define __ASM_POWERPC_MMU_CONTEXT_H +#ifdef __KERNEL__ #ifndef CONFIG_PPC64 #include @@ -86,4 +87,5 @@ static inline void activate_mm(struct mm } #endif /* CONFIG_PPC64 */ +#endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_MMU_CONTEXT_H */ Index: powerpc-2.6/include/asm-powerpc/mmzone.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/mmzone.h +++ powerpc-2.6/include/asm-powerpc/mmzone.h @@ -6,6 +6,7 @@ */ #ifndef _ASM_MMZONE_H_ #define _ASM_MMZONE_H_ +#ifdef __KERNEL__ #include @@ -47,4 +48,5 @@ extern 
unsigned long max_pfn; extern int __init early_pfn_to_nid(unsigned long pfn); #endif +#endif /* __KERNEL__ */ #endif /* _ASM_MMZONE_H_ */ Index: powerpc-2.6/include/asm-powerpc/module.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/module.h +++ powerpc-2.6/include/asm-powerpc/module.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_MODULE_H #define _ASM_POWERPC_MODULE_H +#ifdef __KERNEL__ /* * This program is free software; you can redistribute it and/or @@ -74,4 +75,5 @@ struct exception_table_entry; void sort_ex_table(struct exception_table_entry *start, struct exception_table_entry *finish); +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_MODULE_H */ Index: powerpc-2.6/include/asm-powerpc/mpic.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/mpic.h +++ powerpc-2.6/include/asm-powerpc/mpic.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_MPIC_H #define _ASM_POWERPC_MPIC_H +#ifdef __KERNEL__ #include @@ -286,4 +287,5 @@ extern int mpic_get_irq(struct pt_regs * /* global mpic for pSeries */ extern struct mpic *pSeries_mpic; +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_MPIC_H */ Index: powerpc-2.6/include/asm-powerpc/numnodes.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/numnodes.h +++ powerpc-2.6/include/asm-powerpc/numnodes.h @@ -1,7 +1,9 @@ #ifndef _ASM_POWERPC_MAX_NUMNODES_H #define _ASM_POWERPC_MAX_NUMNODES_H +#ifdef __KERNEL__ /* Max 16 Nodes */ #define NODES_SHIFT 4 +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_MAX_NUMNODES_H */ Index: powerpc-2.6/include/asm-powerpc/nvram.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/nvram.h +++ powerpc-2.6/include/asm-powerpc/nvram.h @@ -55,6 +55,7 @@ struct nvram_header { char name[12]; }; +#ifdef __KERNEL__ struct nvram_partition { struct list_head partition; struct nvram_header header; 
@@ -69,6 +70,7 @@ extern struct nvram_partition *nvram_fin extern int pSeries_nvram_init(void); extern int mmio_nvram_init(void); +#endif /* __KERNEL__ */ /* PowerMac specific nvram stuffs */ @@ -78,6 +80,7 @@ enum { pmac_nvram_NR /* MacOS Name Registry partition */ }; +#ifdef __KERNEL__ /* Return partition offset in nvram */ extern int pmac_get_partition(int partition); @@ -91,6 +94,7 @@ extern void nvram_sync(void); /* Normal access to NVRAM */ extern unsigned char nvram_read_byte(int i); extern void nvram_write_byte(unsigned char c, int i); +#endif /* Some offsets in XPRAM */ #define PMAC_XPRAM_MACHINE_LOC 0xe4 Index: powerpc-2.6/include/asm-powerpc/of_device.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/of_device.h +++ powerpc-2.6/include/asm-powerpc/of_device.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_OF_DEVICE_H #define _ASM_POWERPC_OF_DEVICE_H +#ifdef __KERNEL__ #include #include @@ -61,4 +62,5 @@ extern struct of_device *of_platform_dev struct device *parent); extern void of_release_dev(struct device *dev); +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_OF_DEVICE_H */ Index: powerpc-2.6/include/asm-powerpc/ohare.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/ohare.h +++ powerpc-2.6/include/asm-powerpc/ohare.h @@ -1,3 +1,6 @@ +#ifndef _ASM_POWERPC_OHARE_H +#define _ASM_POWERPC_OHARE_H +#ifdef __KERNEL__ /* * ohare.h: definitions for using the "O'Hare" I/O controller chip. * @@ -46,3 +49,6 @@ * Contributed by Harry Eaton. 
*/ #define STARMAX_FEATURES 0xbeff7a + +#endif /* __KERNEL__ */ +#endif /* _ASM_POWERPC_OHARE_H */ Index: powerpc-2.6/include/asm-powerpc/oprofile_impl.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/oprofile_impl.h +++ powerpc-2.6/include/asm-powerpc/oprofile_impl.h @@ -11,6 +11,7 @@ #ifndef _ASM_POWERPC_OPROFILE_IMPL_H #define _ASM_POWERPC_OPROFILE_IMPL_H +#ifdef __KERNEL__ #define OP_MAX_COUNTER 8 @@ -120,4 +121,5 @@ static inline void ctr_write(unsigned in } #endif /* __powerpc64__ */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_OPROFILE_IMPL_H */ Index: powerpc-2.6/include/asm-powerpc/pSeries_reconfig.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/pSeries_reconfig.h +++ powerpc-2.6/include/asm-powerpc/pSeries_reconfig.h @@ -1,5 +1,6 @@ #ifndef _PPC64_PSERIES_RECONFIG_H #define _PPC64_PSERIES_RECONFIG_H +#ifdef __KERNEL__ #include @@ -22,4 +23,5 @@ static inline int pSeries_reconfig_notif static inline void pSeries_reconfig_notifier_unregister(struct notifier_block *nb) { } #endif /* CONFIG_PPC_PSERIES */ +#endif /* __KERNEL__ */ #endif /* _PPC64_PSERIES_RECONFIG_H */ Index: powerpc-2.6/include/asm-powerpc/paca.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/paca.h +++ powerpc-2.6/include/asm-powerpc/paca.h @@ -14,6 +14,7 @@ */ #ifndef _ASM_POWERPC_PACA_H #define _ASM_POWERPC_PACA_H +#ifdef __KERNEL__ #include #include @@ -110,4 +111,5 @@ struct paca_struct { extern struct paca_struct paca[]; +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_PACA_H */ Index: powerpc-2.6/include/asm-powerpc/page_32.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/page_32.h +++ powerpc-2.6/include/asm-powerpc/page_32.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_PAGE_32_H #define _ASM_POWERPC_PAGE_32_H +#ifdef __KERNEL__ #define 
VM_DATA_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS32 @@ -37,4 +38,5 @@ extern __inline__ int get_order(unsigned #endif /* __ASSEMBLY__ */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_PAGE_32_H */ Index: powerpc-2.6/include/asm-powerpc/page_64.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/page_64.h +++ powerpc-2.6/include/asm-powerpc/page_64.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_PAGE_64_H #define _ASM_POWERPC_PAGE_64_H +#ifdef __KERNEL__ /* * Copyright (C) 2001 PPC64 Team, IBM Corp @@ -170,4 +171,5 @@ extern unsigned int HPAGE_SHIFT; #include +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_PAGE_64_H */ Index: powerpc-2.6/include/asm-powerpc/param.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/param.h +++ powerpc-2.6/include/asm-powerpc/param.h @@ -1,8 +1,6 @@ #ifndef _ASM_POWERPC_PARAM_H #define _ASM_POWERPC_PARAM_H -#include - #ifdef __KERNEL__ #define HZ CONFIG_HZ /* internal kernel timer frequency */ #define USER_HZ 100 /* for user interfaces in "ticks" */ Index: powerpc-2.6/include/asm-powerpc/parport.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/parport.h +++ powerpc-2.6/include/asm-powerpc/parport.h @@ -8,6 +8,7 @@ #ifndef _ASM_POWERPC_PARPORT_H #define _ASM_POWERPC_PARPORT_H +#ifdef __KERNEL__ static int __devinit parport_pc_find_isa_ports (int autoirq, int autodma); static int __devinit parport_pc_find_nonpci_ports (int autoirq, int autodma) @@ -15,4 +16,5 @@ static int __devinit parport_pc_find_non return parport_pc_find_isa_ports (autoirq, autodma); } +#endif /* __KERNEL__ */ #endif /* !(_ASM_POWERPC_PARPORT_H) */ Index: powerpc-2.6/include/asm-powerpc/pci-bridge.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/pci-bridge.h +++ powerpc-2.6/include/asm-powerpc/pci-bridge.h @@ -1,5 +1,6 @@ #ifndef 
_ASM_POWERPC_PCI_BRIDGE_H #define _ASM_POWERPC_PCI_BRIDGE_H +#ifdef __KERNEL__ #ifndef CONFIG_PPC64 #include @@ -173,4 +174,5 @@ static inline unsigned long pci_address_ #define PCI_PROBE_DEVTREE 1 /* Instantiate from device tree */ #endif /* CONFIG_PPC64 */ +#endif /* __KERNEL__ */ #endif Index: powerpc-2.6/include/asm-powerpc/pgalloc.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/pgalloc.h +++ powerpc-2.6/include/asm-powerpc/pgalloc.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_PGALLOC_H #define _ASM_POWERPC_PGALLOC_H +#ifdef __KERNEL__ #ifndef CONFIG_PPC64 #include @@ -153,4 +154,5 @@ extern void pgtable_free_tlb(struct mmu_ #define check_pgt_cache() do { } while (0) #endif /* CONFIG_PPC64 */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_PGALLOC_H */ Index: powerpc-2.6/include/asm-powerpc/pgtable-64k.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/pgtable-64k.h +++ powerpc-2.6/include/asm-powerpc/pgtable-64k.h @@ -1,3 +1,7 @@ +#ifndef _ASM_POWERPC_PGTABLE_64K_H +#define _ASM_POWERPC_PGTABLE_64K_H +#ifdef __KERNEL__ + #include @@ -88,3 +92,5 @@ #endif /* __ASSEMBLY__ */ +#endif /* __KERNEL__ */ +#endif /* _ASM_POWERPC_PGTABLE_64K_H */ Index: powerpc-2.6/include/asm-powerpc/pgtable.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/pgtable.h +++ powerpc-2.6/include/asm-powerpc/pgtable.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_PGTABLE_H #define _ASM_POWERPC_PGTABLE_H +#ifdef __KERNEL__ #ifndef CONFIG_PPC64 #include @@ -532,4 +533,5 @@ void pgtable_cache_init(void); #endif /* __ASSEMBLY__ */ #endif /* CONFIG_PPC64 */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_PGTABLE_H */ Index: powerpc-2.6/include/asm-powerpc/pmac_low_i2c.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/pmac_low_i2c.h +++ 
powerpc-2.6/include/asm-powerpc/pmac_low_i2c.h @@ -11,6 +11,7 @@ */ #ifndef __PMAC_LOW_I2C_H__ #define __PMAC_LOW_I2C_H__ +#ifdef __KERNEL__ /* i2c mode (based on the platform functions format) */ enum { @@ -40,4 +41,5 @@ int pmac_low_i2c_setmode(struct device_n int pmac_low_i2c_xfer(struct device_node *np, u8 addrdir, u8 subaddr, u8 *data, int len); +#endif /* __KERNEL__ */ #endif /* __PMAC_LOW_I2C_H__ */ Index: powerpc-2.6/include/asm-powerpc/pmc.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/pmc.h +++ powerpc-2.6/include/asm-powerpc/pmc.h @@ -18,6 +18,7 @@ */ #ifndef _POWERPC_PMC_H #define _POWERPC_PMC_H +#ifdef __KERNEL__ #include @@ -44,4 +45,5 @@ void dump_pmcs(void); extern struct op_powerpc_model op_model_fsl_booke; #endif +#endif /* __KERNEL__ */ #endif /* _POWERPC_PMC_H */ Index: powerpc-2.6/include/asm-powerpc/ppc-pci.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/ppc-pci.h +++ powerpc-2.6/include/asm-powerpc/ppc-pci.h @@ -8,6 +8,7 @@ */ #ifndef _ASM_POWERPC_PPC_PCI_H #define _ASM_POWERPC_PPC_PCI_H +#ifdef __KERNEL__ #include #include @@ -114,4 +115,5 @@ struct device_node * find_device_pe(stru #endif +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_PPC_PCI_H */ Index: powerpc-2.6/include/asm-powerpc/processor.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/processor.h +++ powerpc-2.6/include/asm-powerpc/processor.h @@ -10,7 +10,6 @@ * 2 of the License, or (at your option) any later version. 
*/ -#include #include #ifndef __ASSEMBLY__ @@ -50,6 +49,7 @@ #define _CHRP_IBM 0x05 /* IBM chrp, the longtrail and longtrail 2 */ #define _CHRP_Pegasos 0x06 /* Genesi/bplan's Pegasos and Pegasos2 */ +#ifdef __KERNEL__ #define platform_is_pseries() (_machine == PLATFORM_PSERIES || \ _machine == PLATFORM_PSERIES_LPAR) #define platform_is_lpar() (!!(_machine & PLATFORM_LPAR)) @@ -81,7 +81,7 @@ extern unsigned char ucBoardRevMaj, ucBo #else #define _machine 0 #endif /* CONFIG_PPC_MULTIPLATFORM */ - +#endif /* __KERNEL__ */ /* * Default implementation of macro that returns current * instruction pointer ("program counter"). Index: powerpc-2.6/include/asm-powerpc/rtas.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/rtas.h +++ powerpc-2.6/include/asm-powerpc/rtas.h @@ -1,5 +1,6 @@ #ifndef _POWERPC_RTAS_H #define _POWERPC_RTAS_H +#ifdef __KERNEL__ #include #include @@ -229,4 +230,5 @@ extern unsigned long rtas_rmo_buf; #define GLOBAL_INTERRUPT_QUEUE 9005 +#endif /* __KERNEL__ */ #endif /* _POWERPC_RTAS_H */ Index: powerpc-2.6/include/asm-powerpc/seccomp.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/seccomp.h +++ powerpc-2.6/include/asm-powerpc/seccomp.h @@ -1,6 +1,10 @@ #ifndef _ASM_POWERPC_SECCOMP_H +#define _ASM_POWERPC_SECCOMP_H +#ifdef __KERNEL__ #include +#endif + #include #define __NR_seccomp_read __NR_read Index: powerpc-2.6/include/asm-powerpc/sections.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/sections.h +++ powerpc-2.6/include/asm-powerpc/sections.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_SECTIONS_H #define _ASM_POWERPC_SECTIONS_H +#ifdef __KERNEL__ #include @@ -17,4 +18,5 @@ static inline int in_kernel_text(unsigne #endif +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_SECTIONS_H */ Index: powerpc-2.6/include/asm-powerpc/signal.h 
=================================================================== --- powerpc-2.6.orig/include/asm-powerpc/signal.h +++ powerpc-2.6/include/asm-powerpc/signal.h @@ -2,10 +2,13 @@ #define _ASM_POWERPC_SIGNAL_H #include -#include #define _NSIG 64 -#define _NSIG_BPW BITS_PER_LONG +#ifdef __powerpc64__ +#define _NSIG_BPW 64 +#else +#define _NSIG_BPW 32 +#endif #define _NSIG_WORDS (_NSIG / _NSIG_BPW) typedef unsigned long old_sigset_t; /* at least 32 bits */ Index: powerpc-2.6/include/asm-powerpc/smu.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/smu.h +++ powerpc-2.6/include/asm-powerpc/smu.h @@ -4,9 +4,11 @@ /* * Definitions for talking to the SMU chip in newer G5 PowerMacs */ - +#ifdef __KERNEL__ #include #include +#endif +#include /* * Known SMU commands @@ -487,8 +489,8 @@ struct smu_sdbp_slotspow { #define SMU_SDB_SENSORTREE_ID 0x25 struct smu_sdbp_sensortree { - u8 model_id; - u8 unknown[3]; + __u8 model_id; + __u8 unknown[3]; }; /* This partition contains CPU thermal control PID informations. 
So far @@ -498,13 +500,13 @@ struct smu_sdbp_sensortree { #define SMU_SDB_CPUPIDDATA_ID 0x17 struct smu_sdbp_cpupiddata { - u8 unknown1; - u8 target_temp_delta; - u8 unknown2; - u8 history_len; - s16 power_adj; - u16 max_power; - s32 gp,gr,gd; + __u8 unknown1; + __u8 target_temp_delta; + __u8 unknown2; + __u8 history_len; + __s16 power_adj; + __u16 max_power; + __s32 gp,gr,gd; }; Index: powerpc-2.6/include/asm-powerpc/sparsemem.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/sparsemem.h +++ powerpc-2.6/include/asm-powerpc/sparsemem.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_SPARSEMEM_H #define _ASM_POWERPC_SPARSEMEM_H 1 +#ifdef __KERNEL__ #ifdef CONFIG_SPARSEMEM /* @@ -25,4 +26,5 @@ static inline int hot_add_scn_to_nid(uns #endif /* CONFIG_SPARSEMEM */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_SPARSEMEM_H */ Index: powerpc-2.6/include/asm-powerpc/spinlock.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/spinlock.h +++ powerpc-2.6/include/asm-powerpc/spinlock.h @@ -1,5 +1,6 @@ #ifndef __ASM_SPINLOCK_H #define __ASM_SPINLOCK_H +#ifdef __KERNEL__ /* * Simple spin lock operations. 
@@ -266,4 +267,5 @@ static __inline__ void __raw_write_unloc rw->lock = 0; } +#endif /* __KERNEL__ */ #endif /* __ASM_SPINLOCK_H */ Index: powerpc-2.6/include/asm-powerpc/spu.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/spu.h +++ powerpc-2.6/include/asm-powerpc/spu.h @@ -22,6 +22,8 @@ #ifndef _SPU_H #define _SPU_H +#ifdef __KERNEL__ + #include #include #include @@ -578,4 +580,5 @@ struct spu_priv1 { u64 spu_trace_cntl; /* 0x1070 */ } __attribute__ ((aligned(0x2000))); +#endif /* __KERNEL__ */ #endif Index: powerpc-2.6/include/asm-powerpc/spu_csa.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/spu_csa.h +++ powerpc-2.6/include/asm-powerpc/spu_csa.h @@ -22,6 +22,7 @@ #ifndef _SPU_CSA_H_ #define _SPU_CSA_H_ +#ifdef __KERNEL__ /* * Total number of 128-bit registers. @@ -89,8 +90,6 @@ struct spu_lscsa { unsigned char ls[LS_SIZE]; }; -#ifdef __KERNEL__ - /* * struct spu_problem_collapsed - condensed problem state area, w/o pads. 
*/ Index: powerpc-2.6/include/asm-powerpc/synch.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/synch.h +++ powerpc-2.6/include/asm-powerpc/synch.h @@ -1,7 +1,6 @@ #ifndef _ASM_POWERPC_SYNCH_H #define _ASM_POWERPC_SYNCH_H - -#include +#ifdef __KERNEL__ #ifdef __powerpc64__ #define __SUBARCH_HAS_LWSYNC @@ -47,5 +46,6 @@ static inline void isync(void) #define isync_on_smp() __asm__ __volatile__("": : :"memory") #endif +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_SYNCH_H */ Index: powerpc-2.6/include/asm-powerpc/system.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/system.h +++ powerpc-2.6/include/asm-powerpc/system.h @@ -4,7 +4,6 @@ #ifndef _ASM_POWERPC_SYSTEM_H #define _ASM_POWERPC_SYSTEM_H -#include #include #include @@ -42,6 +41,7 @@ #define set_mb(var, value) do { var = value; mb(); } while (0) #define set_wmb(var, value) do { var = value; wmb(); } while (0) +#ifdef __KERNEL__ #ifdef CONFIG_SMP #define smp_mb() mb() #define smp_rmb() rmb() @@ -54,7 +54,6 @@ #define smp_read_barrier_depends() do { } while(0) #endif /* CONFIG_SMP */ -#ifdef __KERNEL__ struct task_struct; struct pt_regs; Index: powerpc-2.6/include/asm-powerpc/tce.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/tce.h +++ powerpc-2.6/include/asm-powerpc/tce.h @@ -20,6 +20,7 @@ #ifndef _ASM_POWERPC_TCE_H #define _ASM_POWERPC_TCE_H +#ifdef __KERNEL__ /* * Tces come in two formats, one for the virtual bus and a different @@ -61,4 +62,5 @@ union tce_entry { }; +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_TCE_H */ Index: powerpc-2.6/include/asm-powerpc/tlb.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/tlb.h +++ powerpc-2.6/include/asm-powerpc/tlb.h @@ -11,6 +11,7 @@ */ #ifndef _ASM_POWERPC_TLB_H #define _ASM_POWERPC_TLB_H +#ifdef __KERNEL__ #include 
#ifndef __powerpc64__ @@ -67,4 +68,5 @@ static inline void __tlb_remove_tlb_entr } #endif +#endif /* __KERNEL__ */ #endif /* __ASM_POWERPC_TLB_H */ Index: powerpc-2.6/include/asm-powerpc/topology.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/topology.h +++ powerpc-2.6/include/asm-powerpc/topology.h @@ -1,5 +1,6 @@ #ifndef _ASM_POWERPC_TOPOLOGY_H #define _ASM_POWERPC_TOPOLOGY_H +#ifdef __KERNEL__ #include @@ -65,4 +66,5 @@ static inline void dump_numa_cpu_topolog #endif /* CONFIG_NUMA */ +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_TOPOLOGY_H */ Index: powerpc-2.6/include/asm-powerpc/udbg.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/udbg.h +++ powerpc-2.6/include/asm-powerpc/udbg.h @@ -9,6 +9,7 @@ #ifndef _ASM_POWERPC_UDBG_H #define _ASM_POWERPC_UDBG_H +#ifdef __KERNEL__ #include #include @@ -34,4 +35,5 @@ extern void udbg_scc_init(int force_scc) extern int udbg_adb_init(int force_btext); extern void udbg_adb_init_early(void); +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_UDBG_H */ Index: powerpc-2.6/include/asm-powerpc/vdso_datapage.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/vdso_datapage.h +++ powerpc-2.6/include/asm-powerpc/vdso_datapage.h @@ -1,5 +1,6 @@ #ifndef _VDSO_DATAPAGE_H #define _VDSO_DATAPAGE_H +#ifdef __KERNEL__ /* * Copyright (C) 2002 Peter Bergner , IBM @@ -105,4 +106,5 @@ extern struct vdso_data *vdso_data; #endif /* __ASSEMBLY__ */ +#endif /* __KERNEL__ */ #endif /* _SYSTEMCFG_H */ Index: powerpc-2.6/include/asm-powerpc/vio.h =================================================================== --- powerpc-2.6.orig/include/asm-powerpc/vio.h +++ powerpc-2.6/include/asm-powerpc/vio.h @@ -13,6 +13,7 @@ #ifndef _ASM_POWERPC_VIO_H #define _ASM_POWERPC_VIO_H +#ifdef __KERNEL__ #include #include @@ -103,4 +104,5 @@ static inline struct vio_dev *to_vio_dev 
return container_of(dev, struct vio_dev, dev); } +#endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_VIO_H */ diff --git a/include/asm-powerpc/ppc_asm.h b/include/asm-powerpc/ppc_asm.h index c27baa0..0dc798d 100644 --- a/include/asm-powerpc/ppc_asm.h +++ b/include/asm-powerpc/ppc_asm.h @@ -94,6 +94,7 @@ #define RFDI .long 0x4c00004e /* rfdi instruction */ #define RFMCI .long 0x4c00004c /* rfmci instruction */ +#ifdef __KERNEL__ #ifdef CONFIG_PPC64 #define XGLUE(a,b) a##b @@ -325,6 +326,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_601) #define CLR_TOP32(r) #endif +#endif /* __KERNEL__ */ + /* The boring bits... */ /* Condition Register Bit Fields */ From arnd at arndb.de Sat Dec 17 08:45:27 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Fri, 16 Dec 2005 22:45:27 +0100 Subject: [PATCH] powerpc: fix two build warnings Message-ID: <200512162245.29111.arnd@arndb.de> Building the arch/powerpc tree currently gives me two warnings with gcc-4.0: arch/powerpc/mm/imalloc.c: In function '__im_get_area': arch/powerpc/mm/imalloc.c:225: warning: 'tmp' may be used uninitialized in this function arch/powerpc/mm/hugetlbpage.c: In function 'hugetlb_get_unmapped_area': arch/powerpc/mm/hugetlbpage.c:608: warning: unused variable 'vma' both fixes are trivial. 
Signed-off-by: Arnd Bergmann Index: powerpc-2.6/arch/powerpc/mm/imalloc.c =================================================================== --- powerpc-2.6.orig/arch/powerpc/mm/imalloc.c +++ powerpc-2.6/arch/powerpc/mm/imalloc.c @@ -107,6 +107,7 @@ static int im_region_status(unsigned lon if (v_addr < (unsigned long) tmp->addr + tmp->size) break; + *vm = NULL; if (tmp) { if (im_region_overlaps(v_addr, size, tmp)) return IM_REGION_OVERLAP; @@ -127,7 +128,6 @@ static int im_region_status(unsigned lon } } - *vm = NULL; return IM_REGION_UNUSED; } diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 7283ff1..7370f9f 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -605,7 +605,6 @@ unsigned long hugetlb_get_unmapped_area( { int lastshift; u16 areamask, curareas; - struct vm_area_struct *vma; if (HPAGE_SHIFT == 0) return -EINVAL; From paulus at samba.org Sat Dec 17 09:22:37 2005 From: paulus at samba.org (Paul Mackerras) Date: Sat, 17 Dec 2005 09:22:37 +1100 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: <200512162208.34832.arnd@arndb.de> References: <3a267747328be451274c52ca5721c5ec@bga.com> <17313.58548.899471.558166@cargo.ozlabs.ibm.com> <200512162208.34832.arnd@arndb.de> Message-ID: <17315.15917.56310.132848@cargo.ozlabs.ibm.com> Arnd Bergmann writes: > Ok, tried that now, seems to work fine. I discovered that > there is already a ppc_md callback for this purpose, but > I changed it to not panic automatically after returning from > that. That looks reasonable, but unfortunately as a side-effect you have changed what happens on pSeries. Perhaps the best way around that would be to change ppc_md.system_reset_exception to return 1 if it has handled the exception, or 0 if it hasn't, and then make the code in system_reset_exception do: if (ppc_md.system_reset_exception) if (ppc_md.system_reset_exception(regs)) return; with the obvious changes to machdep.h and pSeries_system_reset_exception. 
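[Editor's sketch, not part of the original message: a minimal illustration of the return-value convention suggested above, assuming the platform hook returns 1 when it has handled the exception and 0 when it has not. Only ppc_md.system_reset_exception and system_reset_exception() are names from the thread. The struct layouts, generic_fallback(), the two sample handlers and check_dispatch() are hypothetical stand-ins so the fragment compiles outside the kernel.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct pt_regs { unsigned long nip; };

struct machdep_calls {
	/* Assumed convention: return 1 if handled, 0 if not. */
	int (*system_reset_exception)(struct pt_regs *regs);
};

static struct machdep_calls ppc_md;
static int fallback_calls;	/* how often the generic path ran */

/* Stand-in for whatever the generic code does when nobody handles it. */
static void generic_fallback(struct pt_regs *regs)
{
	(void)regs;
	fallback_calls++;
}

void system_reset_exception(struct pt_regs *regs)
{
	/* Let the platform try first; stop only if it claims the exception. */
	if (ppc_md.system_reset_exception)
		if (ppc_md.system_reset_exception(regs))
			return;

	generic_fallback(regs);
}

/* Two hypothetical platform hooks, one for each return value. */
static int handler_handled(struct pt_regs *regs)  { (void)regs; return 1; }
static int handler_declined(struct pt_regs *regs) { (void)regs; return 0; }

/* Usage example: exercise the three cases the convention distinguishes. */
void check_dispatch(void)
{
	struct pt_regs regs = { 0 };

	fallback_calls = 0;

	ppc_md.system_reset_exception = NULL;		/* no platform hook */
	system_reset_exception(&regs);
	assert(fallback_calls == 1);

	ppc_md.system_reset_exception = handler_handled;
	system_reset_exception(&regs);
	assert(fallback_calls == 1);			/* hook handled it */

	ppc_md.system_reset_exception = handler_declined;
	system_reset_exception(&regs);
	assert(fallback_calls == 2);			/* fell through */
}
```

[With this shape, a platform that wants the previous behaviour can have its hook return 0 and let the generic path run, while a platform that fully handles the event returns 1.]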
Thanks, Paul. From kravetz at us.ibm.com Sat Dec 17 09:30:35 2005 From: kravetz at us.ibm.com (Mike Kravetz) Date: Fri, 16 Dec 2005 14:30:35 -0800 Subject: [PATCH] powerpc: don't add memory to empty node/zone Message-ID: <20051216223035.GA9316@monkey.ibm.com> The system will oops if an attempt is made to add memory to an empty node/zone. This patch prevents adding memory to an empty node. The code to dynamically add a node/zone is non-trivial. This patch is temporary and will be removed when the ability to dynamically add a node/zone is complete. Signed-off-by: Mike Kravetz diff -Naupr powerpc-2.6/arch/powerpc/mm/numa.c powerpc-2.6.work/arch/powerpc/mm/numa.c --- powerpc-2.6/arch/powerpc/mm/numa.c 2005-12-15 20:57:29.000000000 +0000 +++ powerpc-2.6.work/arch/powerpc/mm/numa.c 2005-12-16 22:14:30.000000000 +0000 @@ -766,13 +766,15 @@ early_param("numa", early_numa); int hot_add_scn_to_nid(unsigned long scn_addr) { struct device_node *memory = NULL; + nodemask_t nodes; + int numa_domain = 0; if (!numa_enabled || (min_common_depth < 0)) - return 0; + return numa_domain; while ((memory = of_find_node_by_type(memory, "memory")) != NULL) { unsigned long start, size; - int numa_domain, ranges; + int ranges; unsigned int *memcell_buf; unsigned int len; @@ -793,14 +795,21 @@ ha_new_range: if ((scn_addr >= start) && (scn_addr < (start + size))) { of_node_put(memory); - return numa_domain; + goto got_numa_domain; } if (--ranges) /* process all ranges in cell */ goto ha_new_range; } - BUG(); /* section address should be found above */ - return 0; + + /* Temporary code to ensure that returned node is not empty */ +got_numa_domain: + nodes_setall(nodes); + while (NODE_DATA(numa_domain)->node_spanned_pages == 0) { + node_clear(numa_domain, nodes); + numa_domain = any_online_node(nodes); + } + return numa_domain; } #endif /* CONFIG_MEMORY_HOTPLUG */ From arndb at de.ibm.com Sat Dec 17 11:10:31 2005 From: arndb at de.ibm.com (Arnd Bergmann) Date: Sat, 17 Dec 2005 01:10:31 +0100 
Subject: [RFC PATCH 0/3] HVC console drivers Message-ID: <20051217001031.456315000@localhost> This is a new version of the HVC console patches. I have tried to address all of Milton's comments and also some of my own. Also, the series is now split into three functionally separate patches. Code in here was written mostly not by me but by at least the following people (alphabetically): Anton Blanchard Arnd Bergmann Charles Lefurgy David Woodhouse Eric Van Hensbergen Max Aguilar Milton Miller Patrick Bohrer Paul Mackerras Ryan S. Arnold Utz Bacher Please review! Arnd <>< From arndb at de.ibm.com Sat Dec 17 11:10:33 2005 From: arndb at de.ibm.com (Arnd Bergmann) Date: Sat, 17 Dec 2005 01:10:33 +0100 Subject: [RFC PATCH 2/3] Add a hvc backend for systemsim References: <20051217001031.456315000@localhost> Message-ID: <20051217002255.774056000@localhost> An embedded and charset-unspecified text was scrubbed... Name: hvc-console-fss.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051217/7f6617fc/attachment.txt From arndb at de.ibm.com Sat Dec 17 11:10:34 2005 From: arndb at de.ibm.com (Arnd Bergmann) Date: Sat, 17 Dec 2005 01:10:34 +0100 Subject: [RFC PATCH 3/3] add hvc backend for rtas References: <20051217001031.456315000@localhost> Message-ID: <20051217002255.944161000@localhost> An embedded and charset-unspecified text was scrubbed... Name: hvc-console-rtas.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051217/63a4897c/attachment.txt From arndb at de.ibm.com Sat Dec 17 11:10:32 2005 From: arndb at de.ibm.com (Arnd Bergmann) Date: Sat, 17 Dec 2005 01:10:32 +0100 Subject: [RFC PATCH 1/3] move hvc_console headers around References: <20051217001031.456315000@localhost> Message-ID: <20051217002255.601962000@localhost> An embedded and charset-unspecified text was scrubbed...
Name: hvc-console-rework.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051217/bdbddcf3/attachment.txt From dwmw2 at infradead.org Sat Dec 17 11:36:34 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Sat, 17 Dec 2005 00:36:34 +0000 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <20051217002255.944161000@localhost> References: <20051217001031.456315000@localhost> <20051217002255.944161000@localhost> Message-ID: <1134779794.7104.183.camel@pmac.infradead.org> On Sat, 2005-12-17 at 01:10 +0100, Arnd Bergmann wrote: > Current Cell hardware is using the console through a set > of rtas calls. This driver is needed to get console > output on those boards. Does this work with zmodem transfers now? -- dwmw2 From arnd at arndb.de Sat Dec 17 11:47:08 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Sat, 17 Dec 2005 01:47:08 +0100 Subject: [PATCH ] cell: enable pause(0) in cpu_idle In-Reply-To: <17315.15917.56310.132848@cargo.ozlabs.ibm.com> References: <200512162208.34832.arnd@arndb.de> <17315.15917.56310.132848@cargo.ozlabs.ibm.com> Message-ID: <200512170147.10755.arnd@arndb.de> This patch enables support for the pause(0) power management state for the Cell Broadband Processor, which is important for power-efficient operation. The pervasive infrastructure will in the future enable us to introduce more functionality specific to the Cell's pervasive unit. From: Maximino Aguilar Signed-off-by: Arnd Bergmann --- On Friday 16 December 2005 23:22, Paul Mackerras wrote: > That looks reasonable, but unfortunately as a side-effect you have > changed what happens on pSeries. Perhaps the best way around that > would be to change ppc_md.system_reset_exception to return 1 if it has > handled the exception, or 0 if it hasn't, and then make the code in > system_reset_exception do: > > if (ppc_md.system_reset_exception) > if (ppc_md.system_reset_exception(regs)) > return; > > with the obvious changes to machdep.h and > pSeries_system_reset_exception.
Ok. Since that function had a broken extern declaration, I fixed that as well by moving it to a proper header file. I have test-built this for pSeries and run on Cell to check the functionality. arch/powerpc/kernel/cputable.c | 2 arch/powerpc/kernel/traps.c | 6 arch/powerpc/platforms/cell/Makefile | 2 arch/powerpc/platforms/cell/pervasive.c | 217 ++++++++++++++++++++++++++++++++ arch/powerpc/platforms/cell/pervasive.h | 62 +++++++++ arch/powerpc/platforms/cell/setup.c | 2 arch/powerpc/platforms/pseries/ras.c | 5 arch/powerpc/platforms/pseries/ras.h | 9 + arch/powerpc/platforms/pseries/setup.c | 4 include/asm-powerpc/cputable.h | 4 include/asm-powerpc/machdep.h | 2 include/asm-powerpc/reg.h | 22 ++- 12 files changed, 323 insertions(+), 14 deletions(-) Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/Makefile +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile @@ -1,4 +1,6 @@ obj-y += interrupt.o iommu.o setup.o spider-pic.o +obj-y += pervasive.o + obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SPU_FS) += spufs/ spu_base.o builtin-spufs-$(CONFIG_SPU_FS) += spu_syscalls.o Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c @@ -0,0 +1,217 @@ +/* + * CBE Pervasive Monitor and Debug + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * Michael N. Day (mnday at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#undef DEBUG + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pervasive.h" + +static spinlock_t cbe_pervasive_lock; +struct cbe_pervasive { + struct pmd_regs __iomem *regs; + unsigned int thread; +}; + +/* can't use per_cpu from setup_arch */ +static struct cbe_pervasive cbe_pervasive[NR_CPUS]; + +static void __init cbe_enable_pause_zero(void) +{ + unsigned long thread_switch_control; + unsigned long temp_register; + struct cbe_pervasive *p; + int thread; + + p = &cbe_pervasive[get_cpu()]; + spin_lock_irq(&cbe_pervasive_lock); + + if (!cbe_pervasive->regs) + goto out; + + pr_debug("Power Management: CPU %d\n", smp_processor_id()); + + /* Enable Pause(0) control bit */ + temp_register = in_be64(&p->regs->pm_control); + + out_be64(&p->regs->pm_control, + temp_register|PMD_PAUSE_ZERO_CONTROL); + + /* Enable DEC and EE interrupt request */ + thread_switch_control = mfspr(SPRN_TSC_CELL); + thread_switch_control |= TSC_CELL_EE_ENABLE | TSC_CELL_EE_BOOST; + + switch ((mfspr(SPRN_CTRLF) & CTRL_CT)) { + case CTRL_CT0: + thread_switch_control |= TSC_CELL_DEC_ENABLE_0; + thread = 0; + break; + case CTRL_CT1: + thread_switch_control |= TSC_CELL_DEC_ENABLE_1; + thread = 1; + break; + default: + printk(KERN_WARNING "%s: unknown configuration\n", + __FUNCTION__); + thread = -1; + break; + } + + if (p->thread != thread) + printk(KERN_WARNING "%s: device tree inconsistant, " + "cpu %i: %d/%d\n", __FUNCTION__, + smp_processor_id(), + p->thread, thread); 
+ + mtspr(SPRN_TSC_CELL, thread_switch_control); + +out: + spin_unlock_irq(&cbe_pervasive_lock); + put_cpu(); +} + +static void cbe_idle(void) +{ + unsigned long ctrl; + + cbe_enable_pause_zero(); + + while (1) { + if (!need_resched()) { + while (!need_resched()) { + /* go into low thread priority */ + HMT_low(); + + /* go into low power mode */ + local_irq_disable(); + ctrl = mfspr(SPRN_CTRLF); + ctrl &= ~(CTRL_RUNLATCH | CTRL_TE); + mtspr(SPRN_CTRLT, ctrl); + local_irq_enable(); + } + /* restore thread prio */ + HMT_medium(); + } + + ppc64_runlatch_on(); + preempt_enable_no_resched(); + schedule(); + preempt_disable(); + } +} + +int cbe_system_reset_exception(struct pt_regs *regs) +{ + switch (regs->msr & SRR1_WAKEMASK) { + case SRR1_WAKEEE: + do_IRQ(regs); + break; + case SRR1_WAKEDEC: + timer_interrupt(regs); + break; + case SRR1_WAKEMT: + /* no action required */ + break; + default: + return 0; /* do system reset */ + } + return 1; /* everything handled */ +} + +static int __init cbe_find_pmd_mmio(int cpu, struct cbe_pervasive *p) +{ + struct device_node *node; + unsigned int *int_servers; + char *addr; + unsigned long real_address; + unsigned int size; + + struct pmd_regs __iomem *pmd_mmio_area; + int hardid, thread; + int proplen; + + pmd_mmio_area = NULL; + hardid = get_hard_smp_processor_id(cpu); + for (node = NULL; (node = of_find_node_by_type(node, "cpu"));) { + int_servers = (void *) get_property(node, + "ibm,ppc-interrupt-server#s", &proplen); + if (!int_servers) { + printk(KERN_WARNING "CPU device misses " + "ibm,ppc-interrupt-server#s property"); + continue; + } + for (thread = 0; thread < proplen / sizeof (int); thread++) { + if (hardid == int_servers[thread]) { + addr = get_property(node, "pervasive", NULL); + goto found; + } + } + } + + printk(KERN_WARNING "%s: CPU %d not found\n", __FUNCTION__, cpu); + return -EINVAL; + +found: + real_address = *(unsigned long*) addr; + addr += sizeof (unsigned long); + size = *(unsigned int*) addr; + + 
pr_debug("pervasive area for CPU %d at %lx, size %x\n", + cpu, real_address, size); + p->regs = __ioremap(real_address, size, _PAGE_NO_CACHE); + p->thread = thread; + return 0; +} + +void __init cell_pervasive_init(void) +{ + struct cbe_pervasive *p; + int cpu; + int ret; + + spin_lock_init(&cbe_pervasive_lock); + + if (!cpu_has_feature(CPU_FTR_PAUSE_ZERO)) + return; + + for_each_cpu(cpu) { + p = &cbe_pervasive[cpu]; + ret = cbe_find_pmd_mmio(cpu, p); + if (ret) + return; + } + + ppc_md.idle_loop = cbe_idle; + ppc_md.system_reset_exception = cbe_system_reset_exception; +} Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h @@ -0,0 +1,62 @@ +/* + * Cell Pervasive Monitor and Debug interface and HW structures + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * David J. Erb (djerb at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#ifndef PERVASIVE_H +#define PERVASIVE_H + +struct pmd_regs { + u8 pad_0x0000_0x0800[0x0800 - 0x0000]; /* 0x0000 */ + + /* Thermal Sensor Registers */ + u64 ts_ctsr1; /* 0x0800 */ + u64 ts_ctsr2; /* 0x0808 */ + u64 ts_mtsr1; /* 0x0810 */ + u64 ts_mtsr2; /* 0x0818 */ + u64 ts_itr1; /* 0x0820 */ + u64 ts_itr2; /* 0x0828 */ + u64 ts_gitr; /* 0x0830 */ + u64 ts_isr; /* 0x0838 */ + u64 ts_imr; /* 0x0840 */ + u64 tm_cr1; /* 0x0848 */ + u64 tm_cr2; /* 0x0850 */ + u64 tm_simr; /* 0x0858 */ + u64 tm_tpr; /* 0x0860 */ + u64 tm_str1; /* 0x0868 */ + u64 tm_str2; /* 0x0870 */ + u64 tm_tsr; /* 0x0878 */ + + /* Power Management */ + u64 pm_control; /* 0x0880 */ +#define PMD_PAUSE_ZERO_CONTROL 0x10000 + u64 pm_status; /* 0x0888 */ + + /* Time Base Register */ + u64 tbr; /* 0x0890 */ + + u8 pad_0x0898_0x1000 [0x1000 - 0x0898]; /* 0x0898 */ +}; + +void __init cell_pervasive_init(void); + +#endif Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/setup.c +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c @@ -49,6 +49,7 @@ #include "interrupt.h" #include "iommu.h" +#include "pervasive.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -165,6 +166,7 @@ static void __init cell_setup_arch(void) init_pci_config_tokens(); find_and_init_phbs(); spider_init_IRQ(); + cell_pervasive_init(); #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif Index: linux-2.6.15-rc/include/asm-powerpc/cputable.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/cputable.h +++ linux-2.6.15-rc/include/asm-powerpc/cputable.h @@ -106,6 +106,7 @@ extern void do_cpu_ftr_fixups(unsigned l #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0000040000000000) #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0000100000000000) +#define CPU_FTR_PAUSE_ZERO ASM_CONST(0x0000200000000000) #else /* ensure on 32b processors the flags are available for compiling but * don't do anything */ @@ -305,7 +306,8 @@ enum { CPU_FTR_MMCRA_SIHV, CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | - CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT, + CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | + CPU_FTR_CTRL | CPU_FTR_PAUSE_ZERO, CPU_FTRS_COMPATIBLE = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2, #endif Index: linux-2.6.15-rc/include/asm-powerpc/reg.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/reg.h +++ linux-2.6.15-rc/include/asm-powerpc/reg.h @@ -145,6 +145,10 @@ #define SPRN_CTR 0x009 /* Count Register */ #define SPRN_CTRLF 0x088 #define SPRN_CTRLT 0x098 +#define CTRL_CT 0xc0000000 /* current thread */ +#define CTRL_CT0 0x80000000 /* thread 0 */ +#define CTRL_CT1 0x40000000 /* thread 1 */ +#define CTRL_TE 0x00c00000 /* thread enable */ #define CTRL_RUNLATCH 0x1 #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ #define DABR_TRANSLATION (1UL << 2) @@ -257,11 +261,11 @@ #define SPRN_HID6 0x3F9 /* BE HID 6 */ #define HID6_LB (0x0F<<12) /* 
Concurrent Large Page Modes */ #define HID6_DLP (1<<20) /* Disable all large page modes (4K only) */ -#define SPRN_TSCR 0x399 /* Thread switch control on BE */ -#define SPRN_TTR 0x39A /* Thread switch timeout on BE */ -#define TSCR_DEC_ENABLE 0x200000 /* Decrementer Interrupt */ -#define TSCR_EE_ENABLE 0x100000 /* External Interrupt */ -#define TSCR_EE_BOOST 0x080000 /* External Interrupt Boost */ +#define SPRN_TSC_CELL 0x399 /* Thread switch control on Cell */ +#define TSC_CELL_DEC_ENABLE_0 0x400000 /* Decrementer Interrupt */ +#define TSC_CELL_DEC_ENABLE_1 0x200000 /* Decrementer Interrupt */ +#define TSC_CELL_EE_ENABLE 0x100000 /* External Interrupt */ +#define TSC_CELL_EE_BOOST 0x080000 /* External Interrupt Boost */ #define SPRN_TSC 0x3FD /* Thread switch control on others */ #define SPRN_TST 0x3FC /* Thread switch timeout on others */ #if !defined(SPRN_IAC1) && !defined(SPRN_IAC2) @@ -375,6 +379,14 @@ #define SPRN_SPRG7 0x117 /* Special Purpose Register General 7 */ #define SPRN_SRR0 0x01A /* Save/Restore Register 0 */ #define SPRN_SRR1 0x01B /* Save/Restore Register 1 */ +#define SRR1_WAKEMASK 0x00380000 /* reason for wakeup */ +#define SRR1_WAKERESET 0x00380000 /* System reset */ +#define SRR1_WAKESYSERR 0x00300000 /* System error */ +#define SRR1_WAKEEE 0x00200000 /* External interrupt */ +#define SRR1_WAKEMT 0x00280000 /* mtctrl */ +#define SRR1_WAKEDEC 0x00180000 /* Decrementer interrupt */ +#define SRR1_WAKETHERM 0x00100000 /* Thermal management interrupt */ + #ifndef SPRN_SVR #define SPRN_SVR 0x11E /* System Version Register */ #endif Index: linux-2.6.15-rc/arch/powerpc/kernel/cputable.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/cputable.c +++ linux-2.6.15-rc/arch/powerpc/kernel/cputable.c @@ -273,7 +273,7 @@ struct cpu_spec cpu_specs[] = { .oprofile_model = &op_model_power4, #endif }, - { /* BE DD1.x */ + { /* Cell Broadband Engine */ .pvr_mask = 0xffff0000, .pvr_value = 
0x00700000, .cpu_name = "Cell Broadband Engine", Index: linux-2.6.15-rc/arch/powerpc/kernel/traps.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/traps.c +++ linux-2.6.15-rc/arch/powerpc/kernel/traps.c @@ -230,8 +230,10 @@ void _exception(int signr, struct pt_reg void system_reset_exception(struct pt_regs *regs) { /* See if any machine dependent calls */ - if (ppc_md.system_reset_exception) - ppc_md.system_reset_exception(regs); + if (ppc_md.system_reset_exception) { + if (ppc_md.system_reset_exception(regs)) + return; + } die("System Reset", regs, SIGABRT); Index: linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.h =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.h @@ -0,0 +1,9 @@ +#ifndef _PSERIES_RAS_H +#define _PSERIES_RAS_H + +struct pt_regs; + +extern int pSeries_system_reset_exception(struct pt_regs *regs); +extern int pSeries_machine_check_exception(struct pt_regs *regs); + +#endif /* _PSERIES_RAS_H */ Index: linux-2.6.15-rc/arch/powerpc/platforms/pseries/setup.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/pseries/setup.c +++ linux-2.6.15-rc/arch/powerpc/platforms/pseries/setup.c @@ -69,6 +69,7 @@ #include #include "plpar_wrappers.h" +#include "ras.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -80,9 +81,6 @@ extern void find_udbg_vterm(void); int fwnmi_active; /* TRUE if an FWNMI handler is present */ -extern void pSeries_system_reset_exception(struct pt_regs *regs); -extern int pSeries_machine_check_exception(struct pt_regs *regs); - static void pseries_shared_idle(void); static void pseries_dedicated_idle(void); Index: linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/pseries/ras.c +++ linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.c @@ -51,6 +51,8 @@ #include #include +#include "ras.h" + static unsigned char ras_log_buf[RTAS_ERROR_LOG_MAX]; static DEFINE_SPINLOCK(ras_log_buf_lock); @@ -278,7 +280,7 @@ static void fwnmi_release_errinfo(void) printk("FWNMI: nmi-interlock failed: %d\n", ret); } -void pSeries_system_reset_exception(struct pt_regs *regs) +int pSeries_system_reset_exception(struct pt_regs *regs) { if (fwnmi_active) { struct rtas_error_log *errhdr = fwnmi_get_errinfo(regs); @@ -287,6 +289,7 @@ void pSeries_system_reset_exception(stru } fwnmi_release_errinfo(); } + return 0; /* need to perform reset */ } /* Index: linux-2.6.15-rc/include/asm-powerpc/machdep.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/machdep.h +++ linux-2.6.15-rc/include/asm-powerpc/machdep.h @@ -134,7 +134,7 @@ struct machdep_calls { void (*nvram_sync)(void); /* Exception handlers */ - void (*system_reset_exception)(struct pt_regs *regs); + int (*system_reset_exception)(struct pt_regs *regs); int (*machine_check_exception)(struct pt_regs *regs); /* Motherboard/chipset features. 
This is a kind of general purpose From dwmw2 at infradead.org Sat Dec 17 12:02:10 2005 From: dwmw2 at infradead.org (David Woodhouse) Date: Sat, 17 Dec 2005 01:02:10 +0000 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <200512170153.47144.arndb@de.ibm.com> References: <20051217001031.456315000@localhost> <20051217002255.944161000@localhost> <1134779794.7104.183.camel@pmac.infradead.org> <200512170153.47144.arndb@de.ibm.com> Message-ID: <1134781330.7104.190.camel@pmac.infradead.org> On Sat, 2005-12-17 at 01:53 +0100, Arnd Bergmann wrote: > No, I think we just lose too many characters in the polling loop > when reading from the serial port. No idea how to get that working. Hardware flow control ought to fix that, perhaps? But there's no substitute for having real access to the hardware and having a proper UART driver. -- dwmw2 From benh at kernel.crashing.org Sat Dec 17 11:57:49 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sat, 17 Dec 2005 11:57:49 +1100 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <20051217002255.944161000@localhost> References: <20051217001031.456315000@localhost> <20051217002255.944161000@localhost> Message-ID: <1134781070.6102.23.camel@gaston> > +#define hvc_rtas_cookie 0x67781e15 What is the point of these "cookies" ? Magic numbers are evil ! Ben. From miltonm at bga.com Sat Dec 17 15:44:16 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 16 Dec 2005 22:44:16 -0600 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <1134781070.6102.23.camel@gaston> References: <20051217001031.456315000@localhost> <20051217002255.944161000@localhost> <1134781070.6102.23.camel@gaston> Message-ID: <1c2320460fadceafaf909562accc17a4@bga.com> On Dec 16, 2005, at 6:57 PM, Benjamin Herrenschmidt wrote: > >> +#define hvc_rtas_cookie 0x67781e15 > > What is the point of these "cookies" ? Magic numbers are evil ! > > Ben. 
The cookie is actually the hypervisor-defined vterm number retrieved from the device tree for the case of multiple vterm devices managed by the hardware. This is used to correlate the console selected by device tree scan with the tty instantiated by vio device scanning in the hvc driver. While the cookie could be zero, defining it non-zero per driver means that if two console drivers both register with the core, we will guarantee that /dev/console output is sent to the same driver as the kernel printks. The other driver will get assigned a minor number above the last minor requested during the first scan. So while not strictly necessary in these single-channel drivers, it is very necessary for multi-channel drivers like the hvc vio driver. milton From miltonm at bga.com Sat Dec 17 15:48:17 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 16 Dec 2005 22:48:17 -0600 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <200512170153.47144.arndb@de.ibm.com> References: <20051217001031.456315000@localhost> <20051217002255.944161000@localhost> <1134779794.7104.183.camel@pmac.infradead.org> <200512170153.47144.arndb@de.ibm.com> Message-ID: On Dec 16, 2005, at 6:53 PM, Arnd Bergmann wrote: > On Saturday 17 December 2005 01:36, David Woodhouse wrote: >> Does this work with zmodem transfers now? >> > No, I think we just lose too many characters in the polling loop > when reading from the serial port. No idea how to get that working. > > Arnd <>< > > You would probably have better luck if (1) you made the ^O MAGIC_SYSRQ_TRIGGER processing conditional on being the console, and did not select this as the console (or disabled it in the config), and (2) you changed the inter-character delay to apply only on failure. I have sent files via xmodem over another hvc_driver client which had an effective throughput of less than 1200 baud.
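Point (2), delaying only after a failed write, can be sketched in user-space C roughly as follows. This is a minimal illustration under stated assumptions: demo_put_fn, demo_flaky_put, demo_dead_put, demo_delay and demo_write are hypothetical names, the fake devices stand in for the firmware put-character call, and demo_delay stands in for udelay().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical low-level hook: returns 0 if the character was accepted. */
typedef int (*demo_put_fn)(char c);

int demo_calls;    /* how many times the fake device was asked */
int demo_delays;   /* how many times the writer had to back off */

/* Fake device that rejects every other request, starting with the first. */
int demo_flaky_put(char c)
{
    (void)c;
    demo_calls++;
    return (demo_calls % 2) ? -1 : 0;
}

/* Fake device that never accepts anything. */
int demo_dead_put(char c)
{
    (void)c;
    demo_calls++;
    return -1;
}

/* Stands in for udelay(); here it only counts invocations. */
void demo_delay(void)
{
    demo_delays++;
}

/*
 * Write up to count characters, backing off only after a failed put.
 * Returns the number of characters the device actually accepted.
 */
int demo_write(const char *buf, int count, demo_put_fn put, int max_attempts)
{
    int attempts = max_attempts;
    int done = 0;

    assert(buf != NULL && put != NULL && max_attempts > 0);

    while (done < count && attempts) {
        if (put(buf[done]) == 0) {
            /* Progress: no delay, and the retry budget is refilled. */
            attempts = max_attempts;
            done++;
            continue;
        }
        /* Failure: spend one attempt, then wait before retrying. */
        attempts--;
        demo_delay();
    }
    return done;
}
```

The retry budget is refilled on every success, so only a run of consecutive failures makes the writer give up, and a device that accepts every character is never delayed at all.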
milton From arndb at de.ibm.com Sat Dec 17 11:53:44 2005 From: arndb at de.ibm.com (Arnd Bergmann) Date: Sat, 17 Dec 2005 01:53:44 +0100 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <1134779794.7104.183.camel@pmac.infradead.org> References: <20051217001031.456315000@localhost> <20051217002255.944161000@localhost> <1134779794.7104.183.camel@pmac.infradead.org> Message-ID: <200512170153.47144.arndb@de.ibm.com> On Saturday 17 December 2005 01:36, David Woodhouse wrote: > Does this work with zmodem transfers now? > No, I think we just lose too many characters in the polling loop when reading from the serial port. No idea how to get that working. Arnd <>< From miltonm at bga.com Sat Dec 17 16:02:12 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 16 Dec 2005 23:02:12 -0600 Subject: [RFC PATCH 1/3] move hvc_console headers around In-Reply-To: <20051217002255.601962000@localhost> References: <20051217001031.456315000@localhost> <20051217002255.601962000@localhost> Message-ID: <858c3c666b9ac1fd8ffd01709426114e@bga.com> On Dec 16, 2005, at 6:10 PM, Arnd Bergmann wrote: > This patch shuffles around some data-type declarations and moves some > functions out of include/asm-ppc64/hvconsole.h and into a new > drivers/char/hvc_console.h file. > > Signed-off-by: "Ryan S. Arnold" > Signed-off-by: Arnd Bergmann > > Index: linux-2.6.15-rc/drivers/char/hvc_console.c > =================================================================== > --- linux-2.6.15-rc.orig/drivers/char/hvc_console.c > +++ linux-2.6.15-rc/drivers/char/hvc_console.c > @@ -39,8 +39,10 @@ > #include > #include > #include > + > #include > -#include > + > +#include "hvc_console.h" > > #define HVC_MAJOR 229 > #define HVC_MINOR 0 > @@ -61,10 +63,17 @@ > */ > #define HVC_ALLOC_TTY_ADAPTERS 8 This patch misses the requested cleanup of moving this define for the total number of tty devices registered to the .h and sanitizing that against the number of consoles.
Currently we can recognise the first 16 devices in the device tree and choose any one of them as the kernel console, but we only allow the first 8 to have a tty (and hence allow /dev/console to be opened). > > + > +/* > + * This is a design shortcoming, the number '16' is a vio required > buffer > + * size. This should be changeable per architecture, but hvc_struct > relies > + * upon it and that struct is used by all hvc_console backend > drivers. This > + * needs to be fixed. > + */ This is a bit strong. vio requires inbuf to be at least 16, and will process up to 16 in outbound. They could be bigger; it would only cause the hvc_driver to loop. Outbound could be smaller, but it would reduce the efficiency. > #define N_OUTBUF 16 > #define N_INBUF 16 > > -#define __ALIGNED__ __attribute__((__aligned__(8))) > +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long)))) > > static struct tty_driver *hvc_driver; > static struct task_struct *hvc_task; > @@ -154,7 +163,7 @@ static uint32_t vtermnos[MAX_NR_HVC_CONS > > void hvc_console_print(struct console *co, const char *b, unsigned > count) > { > - char c[16] __ALIGNED__; > + char c[N_OUTBUF] __ALIGNED__; > unsigned i = 0, n = 0; > int r, donecr = 0, index = co->index; > > Index: linux-2.6.15-rc/drivers/char/hvc_console.h > =================================================================== > --- /dev/null > +++ linux-2.6.15-rc/drivers/char/hvc_console.h > @@ -0,0 +1,58 @@ > +/* > + * hvc_console.h > + * Copyright (C) 2005 IBM Corporation > + * > + * Author(s): > + * Ryan S. Arnold > + * > + * hvc_console header information: > + * moved here from include/asm-powerpc/hvconsole.h > + * and drivers/char/hvc_console.c > + * > + * This program is free software; you can redistribute it and/or > modify > + * it under the terms of the GNU General Public License as published > by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version.
> + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA > 02111-1307 USA > + */ > + > +#ifndef HVC_CONSOLE_H > +#define HVC_CONSOLE_H > + > +#include > +#include > +#include > + Since we removed the actual definition of the struct, these includes should go back to hvc_console.c (and any other file that needs them). > +/* > + * This is the max number of console adapters that can/will be found > as > + * console devices on first stage console init. Any number beyond > this range > + * can't be used as a console device but is still a valid tty device. > + */ > +#define MAX_NR_HVC_CONSOLES 16 > + > +/* implemented by a low level driver */ > +struct hv_ops { > + int (*get_chars)(uint32_t vtermno, char *buf, int count); > + int (*put_chars)(uint32_t vtermno, const char *buf, int count); > +}; > + > +struct hvc_struct; > + > +/* Register a vterm and a slot index for use as a console > (console_init) */ > +extern int hvc_instantiate(uint32_t vtermno, int index, struct hv_ops > *ops); > + > +/* register a vterm for hvc tty operation (module_init or hotplug > add) */ > +extern struct hvc_struct * __devinit hvc_alloc(uint32_t vtermno, int > irq, > + struct hv_ops *ops); > +/* remove a vterm from hvc tty operation (modele_exit or hotplug > remove) */ > +extern int __devexit hvc_remove(struct hvc_struct *hp); > + > +#endif // HVC_CONSOLE_H > Index: linux-2.6.15-rc/include/asm-powerpc/hvconsole.h > =================================================================== > --- linux-2.6.15-rc.orig/include/asm-powerpc/hvconsole.h > +++
linux-2.6.15-rc/include/asm-powerpc/hvconsole.h > @@ -22,28 +22,7 @@ > #ifndef _PPC64_HVCONSOLE_H > #define _PPC64_HVCONSOLE_H > > -/* > - * This is the max number of console adapters that can/will be found > as > - * console devices on first stage console init. Any number beyond > this range > - * can't be used as a console device but is still a valid tty device. > - */ > -#define MAX_NR_HVC_CONSOLES 16 > - > -/* implemented by a low level driver */ > -struct hv_ops { > - int (*get_chars)(uint32_t vtermno, char *buf, int count); > - int (*put_chars)(uint32_t vtermno, const char *buf, int count); > -}; > extern int hvc_get_chars(uint32_t vtermno, char *buf, int count); > extern int hvc_put_chars(uint32_t vtermno, const char *buf, int > count); > > -struct hvc_struct; > - > -/* Register a vterm and a slot index for use as a console > (console_init) */ > -extern int hvc_instantiate(uint32_t vtermno, int index, struct hv_ops > *ops); > -/* register a vterm for hvc tty operation (module_init or hotplug > add) */ > -extern struct hvc_struct * __devinit hvc_alloc(uint32_t vtermno, int > irq, > - struct hv_ops *ops); > -/* remove a vterm from hvc tty operation (modele_exit or hotplug > remove) */ > -extern int __devexit hvc_remove(struct hvc_struct *hp); > #endif /* _PPC64_HVCONSOLE_H */ > Index: linux-2.6.15-rc/drivers/char/Kconfig > =================================================================== > --- linux-2.6.15-rc.orig/drivers/char/Kconfig > +++ linux-2.6.15-rc/drivers/char/Kconfig > @@ -552,9 +552,19 @@ config TIPAR > > If unsure, say N. > > +config HVC_DRIVER > + bool > + help > + Users of pSeries machines that want to utilize the hvc console > front-end > + module for their backend console driver should select this option. > + It will automatically be selected if one of the back-end console > drivers > + is selected. 
> + > + > config HVC_CONSOLE > bool "pSeries Hypervisor Virtual Console support" > depends on PPC_PSERIES > + select HVC_DRIVER > help > pSeries machines when partitioned support a hypervisor virtual > console. This driver allows each pSeries partition to have a > console > Index: linux-2.6.15-rc/drivers/char/Makefile > =================================================================== > --- linux-2.6.15-rc.orig/drivers/char/Makefile > +++ linux-2.6.15-rc/drivers/char/Makefile > @@ -40,11 +40,12 @@ obj-$(CONFIG_N_HDLC) += n_hdlc.o > obj-$(CONFIG_AMIGA_BUILTIN_SERIAL) += amiserial.o > obj-$(CONFIG_SX) += sx.o generic_serial.o > obj-$(CONFIG_RIO) += rio/ generic_serial.o > -obj-$(CONFIG_HVC_CONSOLE) += hvc_console.o hvc_vio.o hvsi.o > +obj-$(CONFIG_HVC_DRIVER) += hvc_console.o > +obj-$(CONFIG_HVC_CONSOLE) += hvc_vio.o hvsi.o > obj-$(CONFIG_RAW_DRIVER) += raw.o > obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o > obj-$(CONFIG_MMTIMER) += mmtimer.o > -obj-$(CONFIG_VIOCONS) += viocons.o > +obj-$(CONFIG_VIOCONS) += viocons.o > obj-$(CONFIG_VIOTAPE) += viotape.o > obj-$(CONFIG_HVCS) += hvcs.o > obj-$(CONFIG_SGI_MBCS) += mbcs.o > Index: linux-2.6.15-rc/drivers/char/hvc_vio.c > =================================================================== > --- linux-2.6.15-rc.orig/drivers/char/hvc_vio.c > +++ linux-2.6.15-rc/drivers/char/hvc_vio.c > @@ -31,10 +31,13 @@ > > #include > #include > + > #include > #include > #include > > +#include "hvc_console.h" > + > char hvc_driver_name[] = "hvc_console"; > > static struct vio_device_id hvc_driver_table[] __devinitdata = { > > -- > From miltonm at bga.com Sat Dec 17 16:18:06 2005 From: miltonm at bga.com (Milton Miller) Date: Fri, 16 Dec 2005 23:18:06 -0600 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <20051217002255.944161000@localhost> References: <20051217001031.456315000@localhost> <20051217002255.944161000@localhost> Message-ID: <3cb5e7ee12f0e85af50802ec119caa92@bga.com> On Dec 16, 2005, at 6:10 PM, Arnd Bergmann 
wrote:
> Current Cell hardware is using the console through a set
> of rtas calls. This driver is needed to get console
> output on those boards.

> +#define RTASCONS_PUT_ATTEMPTS 16
> +
> +static int rtascons_put_char_token = RTAS_UNKNOWN_SERVICE;
> +static int rtascons_get_char_token = RTAS_UNKNOWN_SERVICE;
> +static int rtascons_put_delay = 100;
> +module_param(rtascons_put_delay, int, 0644);
> +
> +static inline int hvc_rtas_write_console(uint32_t vtermno, const char
> *buf, int count)
> +{
> +	int attempts = RTASCONS_PUT_ATTEMPTS;
> +	int result;
> +	int done;
> +
> +	/* if there is more than one character to be displayed, wait a bit */
> +	for (done = 0; done < count && attempts; udelay(rtascons_put_delay))
> {
> +		attempts--;
> +		result = rtas_call(rtascons_put_char_token, 1, 1, NULL, buf[done]);
> +
> +		if (!result) {
> +			attempts = RTASCONS_PUT_ATTEMPTS;
> +			done++;
> +		}
> +	}
> +	/* the calling routine expects to receive the number of bytes sent */
> +	return done ?: 0;

This reduces to return done.

> +}

Do we really know that we need a delay after every character written?
Does firmware not tell us when it fails, so that we could skip the delay
when we make progress? I would move the delay to the bottom of the loop
and continue on success after the other processing.

Should we make RTASCONS_PUT_ATTEMPTS another module parameter?

> +
> +static inline int rtascons_get_char(void)
> +{
> +	int result;
> +
> +	if (rtas_call(rtascons_get_char_token, 0, 2, &result))
> +		result = -1;
> +
> +	return result;
> +}

This function doesn't hide much ...

> +
> +static int hvc_rtas_read_console(uint32_t vtermno, char *buf, int
> count)
> +{
> +	unsigned long got;
> +	int c;
> +	int i;
> +
> +	for (got = 0, i = 0; i < count; i++) {
> +
> +		if (( c = rtascons_get_char() ) != -1) {

And the assignment in the if statement doesn't make this more readable.
I would just expand rtascons_get_char here: the if would check the return
of the function and c (already the correct size) would be set as a side
effect.

> +			buf[i] = c;
> +			++got;
> +		}
> +		else
> +			break;
> +	}

Reversing the sense of the if will eliminate the need for braces on the
if, as the else can just be the remainder of the loop.

> +	return got;
> +}
> +
>

From miltonm at bga.com Sat Dec 17 17:01:22 2005
From: miltonm at bga.com (Milton Miller)
Date: Sat, 17 Dec 2005 00:01:22 -0600
Subject: [PATCH ] cell: enable pause(0) in cpu_idle
In-Reply-To: <200512170147.10755.arnd@arndb.de>
References: <200512162208.34832.arnd@arndb.de> <17315.15917.56310.132848@cargo.ozlabs.ibm.com> <200512170147.10755.arnd@arndb.de>
Message-ID:

On Dec 16, 2005, at 6:47 PM, Arnd Bergmann wrote:
> This patch enables support for pause(0) power management state
> for the Cell Broadband Processor, which is import for power efficient
> operation. The pervasive infrastructure will in the future enable
> us to introduce more functionality specific to the Cell's
> pervasive unit.

Not to be a pain, but a few more comments, mostly on things that have
changed.

> +
> +static spinlock_t cbe_pervasive_lock;
>
= SPIN_LOCK_INIT

Initialize statically allocated locks statically.

> +	p = &cbe_pervasive[get_cpu()];
> +	spin_lock_irq(&cbe_pervasive_lock);
>

Since you are going atomic here, you could move the assignment inside
and use __get_cpu()

> +out:
> +	spin_unlock_irq(&cbe_pervasive_lock);
> +	put_cpu();

and then remove the put_cpu here.

> +}
> +
> +static void cbe_idle(void)
> +{
> +	unsigned long ctrl;
> +
> +	cbe_enable_pause_zero();

I'm guessing you don't need to do all of that work each time. Maybe
set a flag in your pervasive structure to say that you did the work on
this cpu.
> +
> +	while (1) {
> +		if (!need_resched()) {

You should disable interrupts here

> +			while (!need_resched()) {
> +				/* go into low thread priority */
> +				HMT_low();
> +
> +				/* go into low power mode */
> +				local_irq_disable();
> +				ctrl = mfspr(SPRN_CTRLF);
> +				ctrl &= ~(CTRL_RUNLATCH | CTRL_TE);
> +				mtspr(SPRN_CTRLT, ctrl);
> +				local_irq_enable();
> +			}

and not enable until here ...

so that you don't take an interrupt that sets need_resched() between
your check and your sleep. (Your sleep implicitly enables the
interrupts to do the work).

> +			/* restore thread prio */
> +			HMT_medium();
> +		}
> +
> +		ppc64_runlatch_on();
> +		preempt_enable_no_resched();
> +		schedule();
> +		preempt_disable();
> +	}
> +}
> +

> +	for (node = NULL; (node = of_find_node_by_type(node, "cpu"));) {
> +		int_servers = (void *) get_property(node,
> +				"ibm,ppc-interrupt-server#s", &proplen);
> +		if (!int_servers) {
> +			printk(KERN_WARNING "CPU device misses "
> +				"ibm,ppc-interrupt-server#s property");
> +			continue;

%s, node->full_name

> +
> +void __init cell_pervasive_init(void)
> +{
> +	struct cbe_pervasive *p;
> +	int cpu;
> +	int ret;
> +
> +	spin_lock_init(&cbe_pervasive_lock);
> +

Initialize statically instead of at runtime, as mentioned above.

From arnd at arndb.de Sat Dec 17 22:28:19 2005
From: arnd at arndb.de (Arnd Bergmann)
Date: Sat, 17 Dec 2005 12:28:19 +0100
Subject: [PATCH ] cell: enable pause(0) in cpu_idle
In-Reply-To:
References: <200512170147.10755.arnd@arndb.de>
Message-ID: <200512171228.21578.arnd@arndb.de>

On Saturday 17 December 2005 07:01, Milton Miller wrote:
>
> > +
> > +static spinlock_t cbe_pervasive_lock;
> >
> = SPIN_LOCK_INIT
>
> Initialize statically allocated locks statically.

No, there are efforts to remove SPIN_LOCK_INIT from the kernel, although
I could not find out what exactly the reason is. I looked this up again
and found that

static DEFINE_SPINLOCK(cbe_pervasive_lock);

would be the preferred way to write this now.
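For readers following the static-versus-runtime initialization point, here is a userspace analogy only, not kernel code: a minimal sketch assuming POSIX threads, where PTHREAD_MUTEX_INITIALIZER plays roughly the role that DEFINE_SPINLOCK() plays in the kernel. The names pervasive_lock, enabled_count and count_enable() are hypothetical and do not come from the patch.

```c
#include <pthread.h>

/*
 * Userspace analogy, not kernel code: the lock is initialized at
 * compile time, so no init call has to run before its first use.
 * All names here are hypothetical.
 */
static pthread_mutex_t pervasive_lock = PTHREAD_MUTEX_INITIALIZER;
static int enabled_count;

/* Bump the counter under the lock and return the new value. */
int count_enable(void)
{
	int count;

	pthread_mutex_lock(&pervasive_lock);
	count = ++enabled_count;
	pthread_mutex_unlock(&pervasive_lock);
	return count;
}
```

The design point is the same in both worlds: a statically initialized lock cannot be used before it is set up, which removes an ordering dependency on an init function.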
> > +	p = &cbe_pervasive[get_cpu()];
> > +	spin_lock_irq(&cbe_pervasive_lock);
> >
>
> Since you are going atomic here, you could move the assignment inside
> and use __get_cpu()
>

Yes.

> > +out:
> > +	spin_unlock_irq(&cbe_pervasive_lock);
> > +	put_cpu();
>
> and then remove the put_cpu here.
>
> > +}
> > +
> > +static void cbe_idle(void)
> > +{
> > +	unsigned long ctrl;
> > +
> > +	cbe_enable_pause_zero();
>
> I'm guessing you don't need to do all of that work each time. Maybe
> set a flag in you pervasive structure to say that you did the work on
> this cpu.

I moved this here when I discovered that the idle function is called
only once on every CPU and is never left, only scheduled away.

> > +
> > +	while (1) {
> > +		if (!need_resched()) {
>
> you shoud disable interrupts here
> > +			while (!need_resched()) {
> > +				/* go into low thread priority */
> > +				HMT_low();
> > +
> > +				/* go into low power mode */
> > +				local_irq_disable();
> > +				ctrl = mfspr(SPRN_CTRLF);
> > +				ctrl &= ~(CTRL_RUNLATCH | CTRL_TE);
> > +				mtspr(SPRN_CTRLT, ctrl);
> > +				local_irq_enable();
> > +			}
> and not enable until here ...
>
> so that you don't take an interrupt that sets need_resched() between
> your check and your sleep. (Your sleep impliclity enables the
> interrupts to do the work).

Ah, yes. That was the reason we have this obscure method of waking up
in the first place.
	Arnd <><

From arnd at arndb.de Sat Dec 17 22:38:37 2005
From: arnd at arndb.de (arnd at arndb.de)
Date: Sat, 17 Dec 2005 12:38:37 +0100
Subject: [RFC PATCH 1/3] move hvc_console headers around
In-Reply-To: <858c3c666b9ac1fd8ffd01709426114e@bga.com>
References: <20051217001031.456315000@localhost> <20051217002255.601962000@localhost> <858c3c666b9ac1fd8ffd01709426114e@bga.com>
Message-ID: <200512171238.38780.arnd@arndb.de>

On Saturday 17 December 2005 06:02, Milton Miller wrote:
>
> On Dec 16, 2005, at 6:10 PM, Arnd Bergmann wrote:
> >   */
> >  #define HVC_ALLOC_TTY_ADAPTERS	8
>
> This patch misses the requested cleanup of moving this define for the
> total number of tty devices registered to the .h

What's the point of moving it to a header file if no other file needs
to know the value?

> and sanitizing that against the number of consoles.

> Currently we can recoginse the first 16 devices in the device tree and
> choose any one of them as the kernel console, but we only allow the
> first 8 to have a tty (and hence allow /dev/console to be opened).

I understood the problem, but did not want to make functional changes to
the hvc_vio driver in my cleanup of the cleanup. What do you suggest as
a fix for this? Should the number be limited at all?

> >
> > +
> > +/*
> > + * This is a design shortcoming, the number '16' is a vio required
> > buffer
> > + * size. This should be changeable per architecture, but hvc_struct
> > relies
> > + * upon it and that struct is used by all hvc_console backend
> > drivers. This
> > + * needs to be fixed.
> > + */
>
> This is a bit strong. vio requires inbuf to be at least 16, and will
> process upto 16 in outbound. They could be bigger, it will only cause
> the hvc_driver to loop. Outbound couuld be smaller, but it would
> reduce the efficency.
ok

> > +#ifndef HVC_CONSOLE_H
> > +#define HVC_CONSOLE_H
> > +
> > +#include
> > +#include
> > +#include
> > +
>
> Since we removed thie actual definition of the struct, these includes
> should go back to hvc_console.c (and any other file that needs them).

Correct. I missed that when moving the struct again.

	Arnd <><

From benh at kernel.crashing.org Sun Dec 18 08:59:21 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 18 Dec 2005 08:59:21 +1100
Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware
In-Reply-To: <200512062048.56131.arnd@arndb.de>
References: <1133816807.8577.50.camel@cashmere.sps.mot.com> <200512062048.56131.arnd@arndb.de>
Message-ID: <1134856762.6102.54.camel@gaston>

> - Do we need a way to identify the type of soc bus? There are different
>   standards for this, e.g. PLB4 on PPC440 or the EIB on the Cell BE.
>   My initial idea was to have different device-type properties for these,
>   but I now think that device_type = "soc" makes sense for all of them.
>   Maybe we could add a model or compatible property for them.

That would be a good idea.

Also, it might be useful to add a "clock-frequency" to it for processors
where it makes sense. One of the things we are passing from uboot
currently is the list of clock frequencies for PLB/OPB/PCI/... we need
to replace this with appropriate nodes and their respective
"clock-frequency" properties.

> - It does not really belong into this document, but is related anyway:
>   how do you want to represent this in Linux? Currently, most of these
>   would be of_platform_device, but I think it would be good to have
>   a new bus_type for it. The advantage would be that you can see the
>   devices in /sys/devices/soc@xxx/ even if the driver is not loaded
>   and the driver can even be autoloaded by udev.
>   Also, which properties should show up in sysfs? All of them or just
>   those specified in this document or a subset of them?
If we go that way, we also need to have the SOC type optionally take
part in the matching. That is, the driver matching information should be
based on model & compatible like OF does, thus we could recommend
something like:

 - Define a unique SOC name per SOC bus type/family, for example,
ppc4xxPLB, etc... This goes into /soc/model.

 - Optionally, use compatible for similar busses. For example, if you
have a new rev of that PLB that is similar but has extensions called
PLB2, you can have model be ppc4xxPLB2 and compatible containing
ppc4xxPLB.

 - Define that the "model" property of a device under /soc is of the
form "socname,devicename"... For example, EMAC would be "ppc4xxPLB,emac".
The same rule applies with compatible (this one could be compatible, among
others, with "ppc4xxPLB,emac" and have model "ppc4xxPLB2,emac").

> - What do we do with pci root devices? They are often physically connected
>   to the internal CPU bus, so it would make sense to represent them
>   this way in the device tree. Should we add them to the specification
>   here? Would it even work the expected way in Linux?

They are generally below the root of the tree, though they don't have to
be. Linux shouldn't care as there is no generic code to instantiate
them, it's platform specific. I may change that in the future, but
this isn't the case yet. The only rule is their name should be "pci".

> - For some devices, you mandate a model property, for others you don't.
>   Is this intentional? It might be easier to find the right device
>   driver if the match string always contains a model name.
>
> - How would I represent nested interrupt controllers? E.g. suppose I
>   have a Cell internal interrupt controller on one SOC bus and
>   and an external interrupt controller on another SOC bus but have
>   that deliver interrupts to the first one.
Read OF interrupt binding :) Typically, nodes contain either an
interrupt-parent or a parent device with interrupt routing info (like a
PCI bridge) which points to their actual parent controller. If it's a
nested controller, it will itself have an interrupt parent and
"interrupts" property to link it to its parent controller.

> - Should it mention nested SOC buses, e.g. a PLB4 bus connected to a
>   PLB5 bus?
>
Do we have many of these horrors in real life?

> - The title says 'without Open Firmware', but it should also be allowed
>   to use the same SOC bus layout when using SLOF or some other OF
>   implementation, right?

Yes, in fact, this document does cover open firmware as well. It defines
the flattened tree format, but doesn't exclude open firmware, and then
defines the subset of OF required by the kernel.

> - Also not new in this version, but still: Should there be support for
>   specifying CPUs with multiple SMT threads?

We need to think about this...

Ben.

From miltonm at bga.com Sun Dec 18 10:29:07 2005
From: miltonm at bga.com (Milton Miller)
Date: Sat, 17 Dec 2005 17:29:07 -0600
Subject: [RFC PATCH 1/3] move hvc_console headers around
In-Reply-To: <200512171238.38780.arnd@arndb.de>
References: <20051217001031.456315000@localhost> <20051217002255.601962000@localhost> <858c3c666b9ac1fd8ffd01709426114e@bga.com> <200512171238.38780.arnd@arndb.de>
Message-ID: <5a15bb50c2b3b29a62ba8f9bf49d5383@bga.com>

On Dec 17, 2005, at 5:38 AM, arnd at arndb.de wrote:

> On Saturday 17 December 2005 06:02, Milton Miller wrote:
>>
>> On Dec 16, 2005, at 6:10 PM, Arnd Bergmann wrote:
>>>   */
>>>  #define HVC_ALLOC_TTY_ADAPTERS	8
>>
>> This patch misses the requested cleanup of moving this define for the
>> total number of tty devices registered to the .h
>
> What's the point of moving it to a header file if no other file needs
> to know the value?

That argument applies equally well to both defines.

>> and sanitizing that against the number of consoles.
>
>> Currently we can recoginse the first 16 devices in the device tree and
>> choose any one of them as the kernel console, but we only allow the
>> first 8 to have a tty (and hence allow /dev/console to be opened).
>
> I understood the problem, but did not want to make functional changes
> to
> the hvc_vio driver in my cleanup of the cleanup. What do you suggest as
> a fix for this? Should the number be limited at all?

Since one controls static allocations and the other a runtime
allocation, having them visible makes sense.

I would propose that, since no one has apparently complained, 8 would be
enough for both. That could be a naive position. Perhaps the change
should be a separate patch.

milton

From segher at kernel.crashing.org Sun Dec 18 17:35:32 2005
From: segher at kernel.crashing.org (Segher Boessenkool)
Date: Sun, 18 Dec 2005 07:35:32 +0100
Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware
In-Reply-To: <1134856762.6102.54.camel@gaston>
References: <1133816807.8577.50.camel@cashmere.sps.mot.com> <200512062048.56131.arnd@arndb.de> <1134856762.6102.54.camel@gaston>
Message-ID: <3a8648f28b591dee596d6cb195f1cad1@kernel.crashing.org>

>> - Do we need a way to identify the type of soc bus? There are
>> different
>>   standards for this, e.g. PLB4 on PPC440 or the EIB on the Cell BE.
>>   My initial idea was to have different device-type properties for
>> these,
>>   but I now think that device_type = "soc" makes sense for all of
>> them.
>>   Maybe we could add a model or compatible property for them.
>
> That would be a good idea.

"device_type" is what defines the (OF) programming interface. As not
all SoC busses are identical for this, they should not have the same
device_type. If you do want all of those semi-transparent SoC busses
to be found by one wildcard, you can add it to the "compatible"
property.

> Also, it might be useful to ass a "clock-frequency" to it for
> processors
> where it makes sense.

Certainly.
> One of the things we are passing from uboot > currently is the list of clock frequencies for PLB/OPB/PCI/... we need > to replace this with appropriate nodes and their respective > "clock-frequency" properties > >> - It does not really belong into this document, but is related anyway: >> how do you want to represent this in Linux? Currently, most of these >> would be of_platform_device, but I think it would be good to have >> a new bus_type for it. The advantage would be that you can see the >> devices in /sys/devices/soc at xxx/ even if the driver is not loaded >> and the driver can even be autoloaded by udev. >> Also, which properties should show up in sysfs? All of them or just >> those specified in this document or a subset of them? All that make sense. > If we go that way, we also need to have the SOC type take optionally > part in the matching. That is, the driver matching infos should be > based > on model & compatible like OF does, thus we could recommend something > like: > > - Define a unique SOC name per SOC bus type/family, for example, > ppc4xxPLB, etc... This goes into /soc/model. "model" should be a string that is the "official" vendor name for the device. > - Optionally, use compatible for similar busses. For example, if you > have a new rev of that PLB that is similar but has extensions called > PLB2, you can have model be ppc4xxPLB2 and compatible containing > ppc4xxPLB. "compatible" does not contain alternatives for "model"; it contains alternatives for "name". > - Define that the "model" property of a device under /soc is of the > form "socname,devicename"... For example, EMAC would be > ppc4xxPLB,emac", > Same rule applies with compatible (this one could be compatible, among > others, with "ppc4xxPLB,emac" and model "ppc4xxPLB2,emac". > >> - What do we do with pci root devices? They are often physically >> connected >> to the internal CPU bus, so it would make sense to represent them >> this way in the device tree. 
Should we add them to the specification
>>   here? Would it even work the expected way in Linux?
>
> They are generally below the root of the tree, they don't have to
> though. Linux shouldn't care as there is no generic code to instanciate
> them, it's platform specific. I may change that in the future though
> but
> this isn't the case yet. The only rule is their name should be "pci"

No. Their name can be whatever is required. The "device_type" should
be "pci", for conventional PCI busses; and it should be whatever is
defined by the appropriate OF binding for newer, mostly PCI-compatible,
busses (like HT, PCIe, PCI-X, etc.)

>> - For some devices, you mandate a model property, for others you
>> don't.
>>   Is this intentional? It might be easier to find the right device
>>   driver if the match string always contains a model name.

It is up to a device's parent bus to find the correct driver; for
the parent bus, device_type and/or compatible are normally enough
to do the matching. "model" is useful to disambiguate sometimes,
but it normally is _too_ exact to do useful driver matching.

>> - How would I represent nested interrupt controllers? E.g. suppose I
>>   have a Cell internal interrupt controller on one SOC bus and
>>   and an external interrupt controller on another SOC bus but have
>>   that deliver interrupts to the first one.
>
> Read OF interrupt binding :) Typically, nodes contain either an
> interrupt-parent or a parent device with interrupt routing info (like a
> PCI bridge) which points to their actual parent controller. If it's a
> nested controller, it will itself have an interrupt parent and
> "interrupts" property to link it to its parent controller.

Interrupts are evil evil evil as always ;-)

>> - Should it mention nested SOC buses, e.g. a PLB4 bus connected to a
>>   PLB5 bus?
>>
> Do we have many of these horrors in real life ?
Yes, almost every SoC has at least two busses; e.g., you often see a high-speed coherent "system" bus, and a lower-speed non-coherent I/O bus connected to it. But there are lots of variations to this theme. >> - The title says 'without Open Firmware', but it should also be >> allowed >> to use the same SOC bus layout when using SLOF or some other OF >> implementation, right? > > Yes, in fact, this document does cover open firmware as well. It > defines > the flattened tree format, but doesn't exclude open firmware, and then > defines the subset of OF required by the kernel. > >> - Also not new in this version, but still: Should there be support for >> specifying CPUs with multiple SMT threads? > > We need to think about this... SMT threads should not be represented as separate CPUs. But some CPU resources that are described in a CPU node are non-shared between SMT threads; we need to find a way to describe those. The biggest problem is interrupts (as always); the unit-id for a "cpu" node in OF is the IPI number of that CPU, but on SMT, IPIs are per thread. Segher From benh at kernel.crashing.org Sun Dec 18 18:18:32 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Sun, 18 Dec 2005 18:18:32 +1100 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <3a8648f28b591dee596d6cb195f1cad1@kernel.crashing.org> References: <1133816807.8577.50.camel@cashmere.sps.mot.com> <200512062048.56131.arnd@arndb.de> <1134856762.6102.54.camel@gaston> <3a8648f28b591dee596d6cb195f1cad1@kernel.crashing.org> Message-ID: <1134890313.6102.78.camel@gaston> > No. Their name can be whatever is required. The "device_type" should > be "pci", for conventional PCI busses; and it should be whatever is > defined by the appropriate OF binding for newer, mostly PCI-comnpatible, > busses (like HT, PCIe, PCI-X, etc.) 
Yes, but the recommended practice is still to call them "pci" :) The
device_type of "pci" defines the fact that it's providing a PCI bus, but
is not unique to PCI hosts; p2p bridges share it. It's a convention,
however, for a pci host bridge to be named "pci" while a p2p bridge is
named "pci-bridge".

As for HT, PCI-X, PCI-E etc... I don't think there are many bindings
around, but I would recommend sticking strictly to the PCI one, only
adding something to either model or compatible. We may want to
standardize some additional things that aren't in the binding, like a
pci-family (pci, pci-e, pci-x, ht) property, or an extended-config-space
to advertise support for config space > 4096, etc.

At this point, I would really like to find the remains of the OF
working group and kick that back into life to properly define those
things.

> It is up to a device's parent bus to find the correct driver; for
> the parent bus, device_type and/or compatible are normally enough
> to do the matching. "model" is useful to disambiguate sometimes,
> but it normally is _too_ exact to do useful driver matching.

Except that OF platform devices don't really have a parent bus; they
expose a bus_type structure that can be used to match any device node in
the OF tree. There is not, and there will not be, a 1:1 relationship
between the OF device-tree and the linux one, so we must make
compromises.

> Interrupts are evil evil evil as always ;-)

Yah, and I need to design something smart on the linux side to back up my
promise of not requiring device-nodes per PCI device, since that means
not requiring nodes for p2p bridges either, and thus implementing a
generic interrupt mapping algorithm that works both with full OF
parsing and with a partial one, doing standard swizzling for
bridges without a node (and with a platform hook to override that
optionally). On my todo list but not done yet.
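The "standard swizzling" referred to here is, as far as I understand it, the conventional PCI-to-PCI bridge rule: each time an INTx signal crosses a bridge, the pin is rotated by the device (slot) number on the bridge's secondary bus. A minimal sketch in C, assuming pins INTA..INTD are encoded as 1..4; the names swizzle_pin() and swizzle_path() are hypothetical and this is not the mapping code Ben describes, which did not exist yet.

```c
#include <assert.h>

/*
 * Rotate an INTx pin across one PCI-to-PCI bridge.
 * pin:  1..4 for INTA..INTD, as seen on the bridge's secondary bus
 * slot: device number of the interrupting device on that bus
 * Returns the pin as seen on the bridge's primary side.
 */
unsigned int swizzle_pin(unsigned int pin, unsigned int slot)
{
	assert(pin >= 1 && pin <= 4);
	return ((pin - 1 + slot) % 4) + 1;
}

/*
 * Apply the rotation once per bridge that has no device node.
 * slots[0] is the device's own slot, slots[1] the slot of the bridge
 * above it on the next bus up, and so on for `depth` levels.
 */
unsigned int swizzle_path(unsigned int pin, const unsigned int *slots,
			  int depth)
{
	int i;

	for (i = 0; i < depth; i++)
		pin = swizzle_pin(pin, slots[i]);
	return pin;
}
```

A platform hook, as suggested above, would simply replace swizzle_pin() for boards whose wiring does not follow the convention.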
> Yes, almost every SoC has at least two busses; e.g., you often see
> a high-speed coherent "system" bus, and a lower-speed non-coherent
> I/O bus connected to it. But there are lots of variations to this
> theme.

That shouldn't be a problem anyway. Just cascade them and don't forget
the "ranges" property :)

> SMT threads should not be represented as separate CPUs. But some
> CPU resources that are described in a CPU node are non-shared between
> SMT threads; we need to find a way to describe those.
>
> The biggest problem is interrupts (as always); the unit-id for a
> "cpu" node in OF is the IPI number of that CPU, but on SMT, IPIs
> are per thread.

Yes, that's a problem

Ben.

From benh at kernel.crashing.org Mon Dec 19 11:24:53 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 19 Dec 2005 11:24:53 +1100
Subject: [PATCH] powerpc: g5 thermal overtemp bug
Message-ID: <1134951893.6102.126.camel@gaston>

The g5 thermal control for liquid cooled machines has a small bug: when
the temperature gets too high, it boosts all fans to the max, but
incorrectly sets the liquid pump to the min instead of the max speed,
thus causing the overtemp condition not to clear and the machine to shut
down after a while. This fixes it to set the pumps to max speed instead.
This problem might explain some of the reports of random shutdowns that
some g5 users have been reporting in the past.

Many thanks to Marcus Rothe for spending a lot of time trying various
patches & sending logs before I found that typo. Note that overtemp
handling is still not perfect and the machine might still shut down;
this patch should reduce if not eliminate such occurrences in "normal"
conditions with high load. I'll implement better handling with proper
slowing down of the CPUs later.
Signed-off-by: Benjamin Herrenschmidt Index: linux-work/drivers/macintosh/therm_pm72.c =================================================================== --- linux-work.orig/drivers/macintosh/therm_pm72.c 2005-11-10 08:20:14.000000000 +1100 +++ linux-work/drivers/macintosh/therm_pm72.c 2005-12-19 11:20:39.000000000 +1100 @@ -933,7 +933,7 @@ if (state0->overtemp > 0) { state0->rpm = state0->mpu.rmaxn_exhaust_fan; state0->intake_rpm = intake = state0->mpu.rmaxn_intake_fan; - pump = state0->pump_min; + pump = state0->pump_max; goto do_set_fans; } From benh at kernel.crashing.org Mon Dec 19 16:49:07 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Mon, 19 Dec 2005 16:49:07 +1100 Subject: [PATCH] powerpc: Fix g5 DART init Message-ID: <1134971349.6162.22.camel@gaston> The patch enabling the new G5's with U4 broke initialization of the DART driver, causing it to trigger a BUG_ON for a case that is actually valid. This patch fixes it: Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/powerpc/sysdev/dart_iommu.c =================================================================== --- linux-work.orig/arch/powerpc/sysdev/dart_iommu.c 2005-12-19 16:13:38.000000000 +1100 +++ linux-work/arch/powerpc/sysdev/dart_iommu.c 2005-12-19 16:20:43.000000000 +1100 @@ -216,12 +216,12 @@ static int dart_init(struct device_node base = dart_tablebase >> DART_PAGE_SHIFT; size = dart_tablesize >> DART_PAGE_SHIFT; if (dart_is_u4) { - BUG_ON(size & ~DART_SIZE_U4_SIZE_MASK); + size &= DART_SIZE_U4_SIZE_MASK; DART_OUT(DART_BASE_U4, base); DART_OUT(DART_SIZE_U4, size); DART_OUT(DART_CNTL, DART_CNTL_U4_ENABLE); } else { - BUG_ON(size & ~DART_CNTL_U3_SIZE_MASK); + size &= DART_CNTL_U3_SIZE_MASK; DART_OUT(DART_CNTL, DART_CNTL_U3_ENABLE | (base << DART_CNTL_U3_BASE_SHIFT) | From david at gibson.dropbear.id.au Mon Dec 19 16:44:10 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Mon, 19 Dec 2005 16:44:10 +1100 Subject: [RFC] powerpc: Merge 32/64 cacheflush 
code
Message-ID: <20051219054410.GB13285@localhost.localdomain>

Paulus et al, I think the patch below is roughly the right way to go,
but it needs much more review and testing (at present it's been
cursorily tested on ppc64 pSeries only).

This patch merges the cache flushing code for 32 and 64 bit powerpc
machines. This means the ppc64_caches mechanism for determining
correct cache sizes at runtime is ported to 32-bit, and is thus
renamed as 'powerpc_caches'. The merged cache flushing functions go
in new file arch/powerpc/kernel/cache.S.

Previously, the ppc32 version of flush_dcache_range() did a writeback
and invalidate of the given cache lines (dcbf) whereas the ppc64
version did just a writeback (dcbst). In general, there's no
consistent meaning of "flush" as one or the other, so this patch also
renames the dcache flushing functions less ambiguously. The new names
are:
	wback_dcache_range()
		- previously flush_dcache_range() on ppc64 and
		  clean_dcache_range() on ppc32
	wback_inval_dcache_range()
		- previously flush_inval_dcache_range() on ppc64 and
		  flush_dcache_range() on ppc32
	invalidate_dcache_range()
		- didn't previously exist on ppc64, unchanged on ppc32

Finally we also clean up the initialization of the powerpc_caches
structure from the old ppc64 specific version. We remove a pointless
loop, and remove a dependence on _machine.
arch/powerpc/kernel/Makefile | 2 arch/powerpc/kernel/align.c | 2 arch/powerpc/kernel/asm-offsets.c | 12 - arch/powerpc/kernel/cache.S | 224 +++++++++++++++++++++++++++++++++++++ arch/powerpc/kernel/misc_32.S | 123 -------------------- arch/powerpc/kernel/misc_64.S | 182 ------------------------------ arch/powerpc/kernel/ppc_ksyms.c | 2 arch/powerpc/kernel/setup-common.c | 89 ++++++++++++++ arch/powerpc/kernel/setup_32.c | 33 +---- arch/powerpc/kernel/setup_64.c | 123 +------------------- arch/powerpc/kernel/vdso.c | 8 - arch/powerpc/sysdev/dart_iommu.c | 2 arch/ppc/8xx_io/cs4218_tdm.c | 4 arch/ppc/8xx_io/enet.c | 4 arch/ppc/8xx_io/fec.c | 4 arch/ppc/kernel/dma-mapping.c | 6 arch/ppc/kernel/misc.S | 8 - arch/ppc/kernel/ppc_ksyms.c | 2 drivers/char/agp/uninorth-agp.c | 12 - drivers/macintosh/smu.c | 4 drivers/net/fec.c | 4 drivers/serial/mpsc.c | 12 - include/asm-powerpc/asm-compat.h | 2 include/asm-powerpc/cache.h | 9 - include/asm-powerpc/cacheflush.h | 13 -- include/asm-powerpc/page_64.h | 4 include/asm-ppc/io.h | 4 27 files changed, 395 insertions(+), 499 deletions(-) Index: working-2.6/include/asm-powerpc/cache.h =================================================================== --- working-2.6.orig/include/asm-powerpc/cache.h 2005-11-23 15:56:35.000000000 +1100 +++ working-2.6/include/asm-powerpc/cache.h 2005-12-19 15:47:10.000000000 +1100 @@ -21,8 +21,8 @@ #define SMP_CACHE_BYTES L1_CACHE_BYTES #define L1_CACHE_SHIFT_MAX 7 /* largest L1 which this arch supports */ -#if defined(__powerpc64__) && !defined(__ASSEMBLY__) -struct ppc64_caches { +#ifndef __ASSEMBLY__ +struct powerpc_caches { u32 dsize; /* L1 d-cache size */ u32 dline_size; /* L1 d-cache line size */ u32 log_dline_size; @@ -33,8 +33,9 @@ struct ppc64_caches { u32 ilines_per_page; }; -extern struct ppc64_caches ppc64_caches; -#endif /* __powerpc64__ && ! __ASSEMBLY__ */ +extern struct powerpc_caches powerpc_caches; +extern void initialize_cache_info(void); +#endif /* ! 
__ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_CACHE_H */ Index: working-2.6/arch/powerpc/kernel/Makefile =================================================================== --- working-2.6.orig/arch/powerpc/kernel/Makefile 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/Makefile 2005-12-19 15:47:10.000000000 +1100 @@ -13,7 +13,7 @@ endif obj-y := semaphore.o cputable.o ptrace.o syscalls.o \ irq.o align.o signal_32.o pmc.o vdso.o \ - prom_parse.o + prom_parse.o cache.o obj-y += vdso32/ obj-$(CONFIG_PPC64) += setup_64.o binfmt_elf32.o sys_ppc32.o \ signal_64.o ptrace32.o systbl.o \ Index: working-2.6/arch/powerpc/kernel/align.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/align.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/align.c 2005-12-19 15:47:10.000000000 +1100 @@ -231,7 +231,7 @@ static int emulate_dcbz(struct pt_regs * int i, size; #ifdef __powerpc64__ - size = ppc64_caches.dline_size; + size = powerpc_caches.dline_size; #else size = L1_CACHE_BYTES; #endif Index: working-2.6/arch/powerpc/kernel/asm-offsets.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/asm-offsets.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/asm-offsets.c 2005-12-19 15:47:10.000000000 +1100 @@ -99,13 +99,13 @@ int main(void) DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); #endif /* CONFIG_PPC32 */ + DEFINE(DCACHEL1LINESIZE, offsetof(struct powerpc_caches, dline_size)); + DEFINE(DCACHEL1LOGLINESIZE, offsetof(struct powerpc_caches, log_dline_size)); + DEFINE(DCACHEL1LINESPERPAGE, offsetof(struct powerpc_caches, dlines_per_page)); + DEFINE(ICACHEL1LINESIZE, offsetof(struct powerpc_caches, iline_size)); + DEFINE(ICACHEL1LOGLINESIZE, offsetof(struct powerpc_caches, log_iline_size)); + DEFINE(ICACHEL1LINESPERPAGE, offsetof(struct powerpc_caches, ilines_per_page)); #ifdef 
CONFIG_PPC64 - DEFINE(DCACHEL1LINESIZE, offsetof(struct ppc64_caches, dline_size)); - DEFINE(DCACHEL1LOGLINESIZE, offsetof(struct ppc64_caches, log_dline_size)); - DEFINE(DCACHEL1LINESPERPAGE, offsetof(struct ppc64_caches, dlines_per_page)); - DEFINE(ICACHEL1LINESIZE, offsetof(struct ppc64_caches, iline_size)); - DEFINE(ICACHEL1LOGLINESIZE, offsetof(struct ppc64_caches, log_iline_size)); - DEFINE(ICACHEL1LINESPERPAGE, offsetof(struct ppc64_caches, ilines_per_page)); DEFINE(PLATFORM_LPAR, PLATFORM_LPAR); /* paca */ Index: working-2.6/arch/powerpc/kernel/cache.S =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ working-2.6/arch/powerpc/kernel/cache.S 2005-12-19 16:09:14.000000000 +1100 @@ -0,0 +1,224 @@ +/* + * arch/powerpc/kernel/cache.S + * + * Cache-flushing functions. + * Copyright (C) 2005 David Gibson + * Based on earlier code: + * Copyright (C) 1995-1996 Gary Thomas (gdt at linuxppc.org) + * Largely rewritten by Cort Dougan (cort at cs.nmt.edu) + * and Paul Mackerras. + * Adapted for iSeries by Mike Corrigan (mikejc at us.ibm.com) + * PPC64 updates by Dave Engebretsen (engebret at us.ibm.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + */ + +#include +#include +#include +#include +#include +#include +#include + +/* + * Write any modified data cache blocks out to memory + * and invalidate the corresponding instruction cache blocks. 
+ * + * flush_icache_range(unsigned long start, unsigned long stop) + * + * flush all bytes from start through stop-1 inclusive + */ +_KPROBE(__flush_icache_range) +BEGIN_FTR_SECTION + blr /* for 601, do nothing */ +END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) +/* + * Flush the data cache to memory + * + * Different systems have different cache line sizes + * and in some cases i-cache and d-cache line sizes differ from + * each other. + */ + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10)/* Get cache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get dcache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +1: dcbst 0,r6 + add r6,r6,r7 + bdnz 1b + sync + +/* Now invalidate the instruction cache */ + lwz r7,ICACHEL1LINESIZE(r10) /* Get Icache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 + lwz r9,ICACHEL1LOGLINESIZE(r10) /* Get icache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +2: icbi 0,r6 + add r6,r6,r7 + bdnz 2b + sync /* additional sync needed on g4 */ + isync + blr + +/* + * Flush a particular page from the data cache to RAM. + * Note: this is necessary because the instruction cache does *not* + * snoop from the data cache. 
+ * + * void __flush_dcache_icache(void *page) + */ +_GLOBAL(__flush_dcache_icache) +BEGIN_FTR_SECTION + blr /* for 601, do nothing */ +END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) + /* Flush the dcache */ + LOAD_REG_ADDR(r7, powerpc_caches) + PPC_CLRRLI r3,r3,PAGE_SHIFT /* Page align */ + lwz r4,DCACHEL1LINESPERPAGE(r7) /* Get dcache lines per page */ + lwz r5,DCACHEL1LINESIZE(r7) /* Get dcache line size */ + mr r6,r3 + mtctr r4 +0: dcbst 0,r6 + add r6,r6,r5 + bdnz 0b + sync + + /* Now invalidate the icache */ + lwz r4,ICACHEL1LINESPERPAGE(r7) /* Get icache lines per page */ + lwz r5,ICACHEL1LINESIZE(r7) /* Get icache line size */ + mtctr r4 +1: icbi 0,r3 + add r3,r3,r5 + bdnz 1b + sync /* additional sync needed on g4 */ + isync + blr + +/* + * Like above, but only do the D-cache. + * + * wback_dcache_range(unsigned long start, unsigned long stop) + * + * writeback all bytes from start to stop-1 inclusive + */ +_GLOBAL(wback_dcache_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get dcache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +0: dcbst 0,r6 + add r6,r6,r7 + bdnz 0b + sync + blr + +/* + * wback_inval_dcache_range(unsigned long start, unsigned long stop) + * + * writeback and invalidate all bytes from start to stop-1 inclusive + */ +_GLOBAL(wback_inval_dcache_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10)/* Get log-2 of dcache line size */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + sync /* FIXME: is this necessary? 
*/ + isync /* FIXME: is this necessary? */ + mtctr r8 +0: dcbf 0,r6 + add r6,r6,r7 + bdnz 0b + sync + isync /* FIXME: is this necessary? */ + blr + +/* + * Like above, but invalidate the D-cache. This is used by the 8xx + * to invalidate the cache so the PPC core doesn't get stale data + * from the CPM (no cache snooping here :-). + * + * invalidate_dcache_range(unsigned long start, unsigned long stop) + */ +_GLOBAL(invalidate_dcache_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10)/* Get log-2 of dcache line size */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +0: dcbi 0,r6 + add r6,r6,r7 + bdnz 0b + sync + blr + +#ifdef CONFIG_U3_DART +/* + * Like above, but works on non-mapped physical addresses. + * Use only for non-LPAR setups ! It also assumes real mode + * is cacheable. Used for flushing out the DART before using + * it as uncacheable memory + * + * wback_dcache_phys_range(unsigned long start, unsigned long stop) + * + * writeback all bytes from start to stop-1 inclusive + */ +_GLOBAL(wback_dcache_phys_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get log-2 of dcache line size */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? 
*/ + mfmsr r5 /* Disable MMU Data Relocation */ + ori r0,r5,MSR_DR + xori r0,r0,MSR_DR + sync + mtmsr r0 + sync + isync + mtctr r8 +0: dcbst 0,r6 + add r6,r6,r7 + bdnz 0b + sync + isync + mtmsr r5 /* Re-enable MMU Data Relocation */ + sync + isync + blr +#endif /* CONFIG_U3_DART */ \ No newline at end of file Index: working-2.6/arch/powerpc/kernel/misc_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/misc_64.S 2005-12-19 15:47:10.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/misc_64.S 2005-12-19 16:02:21.000000000 +1100 @@ -142,188 +142,6 @@ _GLOBAL(call_with_mmu_off) mtspr SPRN_SRR1,r0 rfid - - .section ".toc","aw" -PPC64_CACHES: - .tc ppc64_caches[TC],ppc64_caches - .section ".text" - -/* - * Write any modified data cache blocks out to memory - * and invalidate the corresponding instruction cache blocks. - * - * flush_icache_range(unsigned long start, unsigned long stop) - * - * flush all bytes from start through stop-1 inclusive - */ - -_KPROBE(__flush_icache_range) - -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - * and in some cases i-cache and d-cache line sizes differ from - * each other. - */ - ld r10,PPC64_CACHES@toc(r2) - lwz r7,DCACHEL1LINESIZE(r10)/* Get cache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get log-2 of cache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -1: dcbst 0,r6 - add r6,r6,r7 - bdnz 1b - sync - -/* Now invalidate the instruction cache */ - - lwz r7,ICACHEL1LINESIZE(r10) /* Get Icache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 - lwz r9,ICACHEL1LOGLINESIZE(r10) /* Get log-2 of Icache line size */ - srw.
r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -2: icbi 0,r6 - add r6,r6,r7 - bdnz 2b - isync - blr - .previous .text -/* - * Like above, but only do the D-cache. - * - * flush_dcache_range(unsigned long start, unsigned long stop) - * - * flush all bytes from start to stop-1 inclusive - */ -_GLOBAL(flush_dcache_range) - -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - */ - ld r10,PPC64_CACHES@toc(r2) - lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get log-2 of dcache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -0: dcbst 0,r6 - add r6,r6,r7 - bdnz 0b - sync - blr - -/* - * Like above, but works on non-mapped physical addresses. - * Use only for non-LPAR setups ! It also assumes real mode - * is cacheable. Used for flushing out the DART before using - * it as uncacheable memory - * - * flush_dcache_phys_range(unsigned long start, unsigned long stop) - * - * flush all bytes from start to stop-1 inclusive - */ -_GLOBAL(flush_dcache_phys_range) - ld r10,PPC64_CACHES@toc(r2) - lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get log-2 of dcache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do?
*/ - mfmsr r5 /* Disable MMU Data Relocation */ - ori r0,r5,MSR_DR - xori r0,r0,MSR_DR - sync - mtmsr r0 - sync - isync - mtctr r8 -0: dcbst 0,r6 - add r6,r6,r7 - bdnz 0b - sync - isync - mtmsr r5 /* Re-enable MMU Data Relocation */ - sync - isync - blr - -_GLOBAL(flush_inval_dcache_range) - ld r10,PPC64_CACHES@toc(r2) - lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10)/* Get log-2 of dcache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - sync - isync - mtctr r8 -0: dcbf 0,r6 - add r6,r6,r7 - bdnz 0b - sync - isync - blr - - -/* - * Flush a particular page from the data cache to RAM. - * Note: this is necessary because the instruction cache does *not* - * snoop from the data cache. - * - * void __flush_dcache_icache(void *page) - */ -_GLOBAL(__flush_dcache_icache) -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - */ - -/* Flush the dcache */ - ld r7,PPC64_CACHES@toc(r2) - clrrdi r3,r3,PAGE_SHIFT /* Page align */ - lwz r4,DCACHEL1LINESPERPAGE(r7) /* Get # dcache lines per page */ - lwz r5,DCACHEL1LINESIZE(r7) /* Get dcache line size */ - mr r6,r3 - mtctr r4 -0: dcbst 0,r6 - add r6,r6,r5 - bdnz 0b - sync - -/* Now invalidate the icache */ - - lwz r4,ICACHEL1LINESPERPAGE(r7) /* Get # icache lines per page */ - lwz r5,ICACHEL1LINESIZE(r7) /* Get icache line size */ - mtctr r4 -1: icbi 0,r3 - add r3,r3,r5 - bdnz 1b - isync - blr - /* * I/O string operations * Index: working-2.6/arch/powerpc/kernel/setup_64.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/setup_64.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/setup_64.c 2005-12-19 15:47:10.000000000 +1100 @@ -103,25 +103,6 @@ int boot_cpuid_phys = 0; dev_t boot_dev;
u64 ppc64_pft_size; -/* Pick defaults since we might want to patch instructions - * before we've read this from the device tree. - */ -struct ppc64_caches ppc64_caches = { - .dline_size = 0x80, - .log_dline_size = 7, - .iline_size = 0x80, - .log_iline_size = 7 -}; -EXPORT_SYMBOL_GPL(ppc64_caches); - -/* - * These are used in binfmt_elf.c to put aux entries on the stack - * for each elf executable being started. - */ -int dcache_bsize; -int icache_bsize; -int ucache_bsize; - /* The main machine-dep calls structure */ struct machdep_calls ppc_md; @@ -345,81 +326,6 @@ void smp_release_cpus(void) #endif /* CONFIG_SMP || CONFIG_KEXEC */ /* - * Initialize some remaining members of the ppc64_caches and systemcfg - * structures - * (at least until we get rid of them completely). This is mostly some - * cache informations about the CPU that will be used by cache flush - * routines and/or provided to userland - */ -static void __init initialize_cache_info(void) -{ - struct device_node *np; - unsigned long num_cpus = 0; - - DBG(" -> initialize_cache_info()\n"); - - for (np = NULL; (np = of_find_node_by_type(np, "cpu"));) { - num_cpus += 1; - - /* We're assuming *all* of the CPUs have the same - * d-cache and i-cache sizes... -Peter - */ - - if ( num_cpus == 1 ) { - u32 *sizep, *lsizep; - u32 size, lsize; - const char *dc, *ic; - - /* Then read cache informations */ - if (_machine == PLATFORM_POWERMAC) { - dc = "d-cache-block-size"; - ic = "i-cache-block-size"; - } else { - dc = "d-cache-line-size"; - ic = "i-cache-line-size"; - } - - size = 0; - lsize = cur_cpu_spec->dcache_bsize; - sizep = (u32 *)get_property(np, "d-cache-size", NULL); - if (sizep != NULL) - size = *sizep; - lsizep = (u32 *) get_property(np, dc, NULL); - if (lsizep != NULL) - lsize = *lsizep; - if (sizep == 0 || lsizep == 0) - DBG("Argh, can't find dcache properties ! 
" - "sizep: %p, lsizep: %p\n", sizep, lsizep); - - ppc64_caches.dsize = size; - ppc64_caches.dline_size = lsize; - ppc64_caches.log_dline_size = __ilog2(lsize); - ppc64_caches.dlines_per_page = PAGE_SIZE / lsize; - - size = 0; - lsize = cur_cpu_spec->icache_bsize; - sizep = (u32 *)get_property(np, "i-cache-size", NULL); - if (sizep != NULL) - size = *sizep; - lsizep = (u32 *)get_property(np, ic, NULL); - if (lsizep != NULL) - lsize = *lsizep; - if (sizep == 0 || lsizep == 0) - DBG("Argh, can't find icache properties ! " - "sizep: %p, lsizep: %p\n", sizep, lsizep); - - ppc64_caches.isize = size; - ppc64_caches.iline_size = lsize; - ppc64_caches.log_iline_size = __ilog2(lsize); - ppc64_caches.ilines_per_page = PAGE_SIZE / lsize; - } - } - - DBG(" <- initialize_cache_info()\n"); -} - - -/* * Do some initial setup of the system. The parameters are those which * were passed in from the bootloader. */ @@ -437,14 +343,13 @@ void __init setup_system(void) #endif /* - * Fill the ppc64_caches & systemcfg structures with informations - * retreived from the device-tree. Need to be called before + * Fill the powerpc_caches structure with information + * retreived from the device-tree. Needs to be called before * finish_device_tree() since the later requires some of the - * informations filled up here to properly parse the interrupt - * tree. - * It also sets up the cache line sizes which allows to call - * routines like flush_icache_range (used by the hash init - * later on). + * information filled up here to properly parse the interrupt + * tree. It also sets up the cache line sizes which allows to + * call routines like flush_icache_range (used by the hash + * init later on). 
*/ initialize_cache_info(); @@ -514,10 +419,10 @@ void __init setup_system(void) ppc64_interrupt_controller); printk("platform = 0x%x\n", _machine); printk("physicalMemorySize = 0x%lx\n", lmb_phys_mem_size()); - printk("ppc64_caches.dcache_line_size = 0x%x\n", - ppc64_caches.dline_size); - printk("ppc64_caches.icache_line_size = 0x%x\n", - ppc64_caches.iline_size); + printk("powerpc_caches.dcache_line_size = 0x%x\n", + powerpc_caches.dline_size); + printk("powerpc_caches.icache_line_size = 0x%x\n", + powerpc_caches.iline_size); printk("htab_address = 0x%p\n", htab_address); printk("htab_hash_mask = 0x%lx\n", htab_hash_mask); #if PHYSICAL_START > 0 @@ -597,14 +502,6 @@ void __init setup_arch(char **cmdline_p) *cmdline_p = cmd_line; - /* - * Set cache line size based on type of cpu as a default. - * Systems with OF can look in the properties on the cpu node(s) - * for a possibly more accurate value. - */ - dcache_bsize = ppc64_caches.dline_size; - icache_bsize = ppc64_caches.iline_size; - /* reboot on panic */ panic_timeout = 180; Index: working-2.6/arch/powerpc/kernel/vdso.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/vdso.c 2005-11-29 16:23:57.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/vdso.c 2005-12-19 15:47:10.000000000 +1100 @@ -671,10 +671,10 @@ void __init vdso_init(void) vdso_data->processor = mfspr(SPRN_PVR); vdso_data->platform = _machine; vdso_data->physicalMemorySize = lmb_phys_mem_size(); - vdso_data->dcache_size = ppc64_caches.dsize; - vdso_data->dcache_line_size = ppc64_caches.dline_size; - vdso_data->icache_size = ppc64_caches.isize; - vdso_data->icache_line_size = ppc64_caches.iline_size; + vdso_data->dcache_size = powerpc_caches.dsize; + vdso_data->dcache_line_size = powerpc_caches.dline_size; + vdso_data->icache_size = powerpc_caches.isize; + vdso_data->icache_line_size = powerpc_caches.iline_size; /* * Calculate the size of the 64 bits vDSO Index: 
working-2.6/include/asm-powerpc/page_64.h =================================================================== --- working-2.6.orig/include/asm-powerpc/page_64.h 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/include/asm-powerpc/page_64.h 2005-12-19 15:47:10.000000000 +1100 @@ -40,8 +40,8 @@ static __inline__ void clear_page(void * { unsigned long lines, line_size; - line_size = ppc64_caches.dline_size; - lines = ppc64_caches.dlines_per_page; + line_size = powerpc_caches.dline_size; + lines = powerpc_caches.dlines_per_page; __asm__ __volatile__( "mtctr %1 # clear_page\n\ Index: working-2.6/arch/powerpc/kernel/setup-common.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/setup-common.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/setup-common.c 2005-12-19 16:12:55.000000000 +1100 @@ -466,3 +466,92 @@ static int __init early_xmon(char *p) } early_param("xmon", early_xmon); #endif + +/* Pick defaults since we might want to patch instructions + * before we've read this from the device tree. + */ +struct powerpc_caches powerpc_caches = { + .dline_size = L1_CACHE_BYTES, + .log_dline_size = L1_CACHE_SHIFT, + .iline_size = L1_CACHE_BYTES, + .log_iline_size = L1_CACHE_SHIFT, +}; +EXPORT_SYMBOL_GPL(powerpc_caches); + +/* + * These are used in binfmt_elf.c to put aux entries on the stack + * for each elf executable being started. + */ +int dcache_bsize; +int icache_bsize; +int ucache_bsize; + +/* + * Initialize the powerpc_caches structure. This is some cache + * informations about the CPU that will be used by cache flush + * routines and/or provided to userland + */ +void __init initialize_cache_info(void) +{ + struct device_node *np; + u32 *sizep, *lsizep; + u32 size, lsize; + + DBG(" -> initialize_cache_info()\n"); + + /* We're assuming *all* of the CPUs have the same d-cache and + * i-cache sizes... 
-Peter + */ + np = of_find_node_by_type(NULL, "cpu"); + BUG_ON(!np); + + size = 0; + lsize = cur_cpu_spec->dcache_bsize; + sizep = (u32 *)get_property(np, "d-cache-size", NULL); + if (sizep) + size = *sizep; + lsizep = (u32 *) get_property(np, "d-cache-line-size", NULL); + if (! lsizep) + lsizep = (u32 *) get_property(np, "d-cache-block-size", NULL); + if (lsizep) + lsize = *lsizep; + if (!sizep || !lsizep) + DBG("Argh, can't find dcache properties! " + "sizep: %p, lsizep: %p\n", sizep, lsizep); + + powerpc_caches.dsize = size; + powerpc_caches.dline_size = lsize; + powerpc_caches.log_dline_size = __ilog2(lsize); + powerpc_caches.dlines_per_page = PAGE_SIZE / lsize; + + size = 0; + lsize = cur_cpu_spec->icache_bsize; + sizep = (u32 *)get_property(np, "i-cache-size", NULL); + if (sizep) + size = *sizep; + lsizep = (u32 *)get_property(np, "i-cache-line-size", NULL); + if (! lsizep) + lsizep = (u32 *) get_property(np, "i-cache-block-size", NULL); + if (lsizep) + lsize = *lsizep; + if (!sizep || !lsizep) + DBG("Argh, can't find icache properties ! " + "sizep: %p, lsizep: %p\n", sizep, lsizep); + + powerpc_caches.isize = size; + powerpc_caches.iline_size = lsize; + powerpc_caches.log_iline_size = __ilog2(lsize); + powerpc_caches.ilines_per_page = PAGE_SIZE / lsize; + + /* + * Set cache line size based on type of cpu as a default. + * Systems with OF can look in the properties on the cpu node(s) + * for a possibly more accurate value. + */ + dcache_bsize = powerpc_caches.dline_size; + icache_bsize = powerpc_caches.iline_size; + if (! 
cpu_has_feature(CPU_FTR_SPLIT_ID_CACHE)) + ucache_bsize = dcache_bsize; + + DBG(" <- initialize_cache_info()\n"); +} Index: working-2.6/arch/powerpc/kernel/setup_32.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/setup_32.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/setup_32.c 2005-12-19 15:47:10.000000000 +1100 @@ -89,14 +89,6 @@ struct machdep_calls ppc_md; EXPORT_SYMBOL(ppc_md); /* - * These are used in binfmt_elf.c to put aux entries on the stack - * for each elf executable being started. - */ -int dcache_bsize; -int icache_bsize; -int ucache_bsize; - -/* * We're called here very early in the boot. We determine the machine * type and call the appropriate low-level setup functions. * -- Cort @@ -294,6 +286,18 @@ void __init setup_arch(char **cmdline_p) loops_per_jiffy = 500000000 / HZ; unflatten_device_tree(); + + /* + * Fill the powerpc_caches structure with information + * retreived from the device-tree. Needs to be called before + * finish_device_tree() since the later requires some of the + * information filled up here to properly parse the interrupt + * tree. It also sets up the cache line sizes which allows to + * call routines like flush_icache_range (used by the hash + * init later on). + */ + initialize_cache_info(); + check_for_initrd(); if (ppc_md.init_early) @@ -324,19 +328,6 @@ void __init setup_arch(char **cmdline_p) } #endif - /* - * Set cache line size based on type of cpu as a default. - * Systems with OF can look in the properties on the cpu node(s) - * for a possibly more accurate value. 
- */ - if (cpu_has_feature(CPU_FTR_SPLIT_ID_CACHE)) { - dcache_bsize = cur_cpu_spec->dcache_bsize; - icache_bsize = cur_cpu_spec->icache_bsize; - ucache_bsize = 0; - } else - ucache_bsize = dcache_bsize = icache_bsize - = cur_cpu_spec->dcache_bsize; - /* reboot on panic */ panic_timeout = 180; Index: working-2.6/arch/powerpc/kernel/misc_32.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/misc_32.S 2005-12-19 15:47:10.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/misc_32.S 2005-12-19 16:07:23.000000000 +1100 @@ -511,129 +511,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_C blr /* - * Write any modified data cache blocks out to memory - * and invalidate the corresponding instruction cache blocks. - * This is a no-op on the 601. - * - * flush_icache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(__flush_icache_range) -BEGIN_FTR_SECTION - blr /* for 601, do nothing */ -END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - mr r6,r3 -1: dcbst 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbst's to get to ram */ - mtctr r4 -2: icbi 0,r6 - addi r6,r6,L1_CACHE_BYTES - bdnz 2b - sync /* additional sync needed on g4 */ - isync - blr -/* - * Write any modified data cache blocks out to memory. - * Does not invalidate the corresponding cache lines (especially for - * any corresponding instruction cache). - * - * clean_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(clean_dcache_range) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbst 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbst's to get to ram */ - blr - -/* - * Write any modified data cache blocks out to memory and invalidate them. - * Does not invalidate the corresponding instruction cache blocks. 
- * - * flush_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(flush_dcache_range) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbf 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbst's to get to ram */ - blr - -/* - * Like above, but invalidate the D-cache. This is used by the 8xx - * to invalidate the cache so the PPC core doesn't get stale data - * from the CPM (no cache snooping here :-). - * - * invalidate_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(invalidate_dcache_range) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbi 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbi's to get to ram */ - blr - -/* - * Flush a particular page from the data cache to RAM. - * Note: this is necessary because the instruction cache does *not* - * snoop from the data cache. - * This is a no-op on the 601 which has a unified cache. - * - * void __flush_dcache_icache(void *page) - */ -_GLOBAL(__flush_dcache_icache) -BEGIN_FTR_SECTION - blr /* for 601, do nothing */ -END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) - rlwinm r3,r3,0,0,19 /* Get page base address */ - li r4,4096/L1_CACHE_BYTES /* Number of lines in a page */ - mtctr r4 - mr r6,r3 -0: dcbst 0,r3 /* Write line to ram */ - addi r3,r3,L1_CACHE_BYTES - bdnz 0b - sync - mtctr r4 -1: icbi 0,r6 - addi r6,r6,L1_CACHE_BYTES - bdnz 1b - sync - isync - blr - -/* * Flush a particular page from the data cache to RAM, identified * by its physical address. 
We turn off the MMU so we can just use * the physical address (this may be a highmem page without a kernel Index: working-2.6/arch/powerpc/kernel/ppc_ksyms.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/ppc_ksyms.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/ppc_ksyms.c 2005-12-19 15:47:10.000000000 +1100 @@ -165,7 +165,7 @@ EXPORT_SYMBOL(flush_tlb_page); EXPORT_SYMBOL(_tlbie); #endif EXPORT_SYMBOL(__flush_icache_range); -EXPORT_SYMBOL(flush_dcache_range); +EXPORT_SYMBOL(wback_dcache_range); #ifdef CONFIG_SMP EXPORT_SYMBOL(smp_call_function); Index: working-2.6/arch/ppc/kernel/dma-mapping.c =================================================================== --- working-2.6.orig/arch/ppc/kernel/dma-mapping.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/ppc/kernel/dma-mapping.c 2005-12-19 15:47:10.000000000 +1100 @@ -210,7 +210,7 @@ __dma_alloc_coherent(size_t size, dma_ad { unsigned long kaddr = (unsigned long)page_address(page); memset(page_address(page), 0, size); - flush_dcache_range(kaddr, kaddr + size); + wback_inval_dcache_range(kaddr, kaddr + size); } /* @@ -375,10 +375,10 @@ void __dma_sync(void *vaddr, size_t size invalidate_dcache_range(start, end); break; case DMA_TO_DEVICE: /* writeback only */ - clean_dcache_range(start, end); + wback_dcache_range(start, end); break; case DMA_BIDIRECTIONAL: /* writeback and invalidate */ - flush_dcache_range(start, end); + wback_inval_dcache_range(start, end); break; } } Index: working-2.6/arch/ppc/kernel/misc.S =================================================================== --- working-2.6.orig/arch/ppc/kernel/misc.S 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/ppc/kernel/misc.S 2005-12-19 16:20:41.000000000 +1100 @@ -527,9 +527,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_C * Does not invalidate the corresponding cache lines (especially for * any corresponding instruction cache). 
* - * clean_dcache_range(unsigned long start, unsigned long stop) + * wback_dcache_range(unsigned long start, unsigned long stop) */ -_GLOBAL(clean_dcache_range) +_GLOBAL(wback_dcache_range) li r5,L1_CACHE_BYTES-1 andc r3,r3,r5 subf r4,r3,r4 @@ -548,9 +548,9 @@ _GLOBAL(clean_dcache_range) * Write any modified data cache blocks out to memory and invalidate them. * Does not invalidate the corresponding instruction cache blocks. * - * flush_dcache_range(unsigned long start, unsigned long stop) + * wback_inval_dcache_range(unsigned long start, unsigned long stop) */ -_GLOBAL(flush_dcache_range) +_GLOBAL(wback_inval_dcache_range) li r5,L1_CACHE_BYTES-1 andc r3,r3,r5 subf r4,r3,r4 Index: working-2.6/include/asm-powerpc/cacheflush.h =================================================================== --- working-2.6.orig/include/asm-powerpc/cacheflush.h 2005-11-23 15:56:35.000000000 +1100 +++ working-2.6/include/asm-powerpc/cacheflush.h 2005-12-19 16:09:03.000000000 +1100 @@ -44,15 +44,12 @@ extern void flush_dcache_icache_page(str extern void __flush_dcache_icache_phys(unsigned long physaddr); #endif /* CONFIG_PPC32 && !CONFIG_BOOKE */ -extern void flush_dcache_range(unsigned long start, unsigned long stop); -#ifdef CONFIG_PPC32 -extern void clean_dcache_range(unsigned long start, unsigned long stop); +extern void wback_dcache_range(unsigned long start, unsigned long stop); +extern void wback_inval_dcache_range(unsigned long start, unsigned long stop); extern void invalidate_dcache_range(unsigned long start, unsigned long stop); -#endif /* CONFIG_PPC32 */ -#ifdef CONFIG_PPC64 -extern void flush_inval_dcache_range(unsigned long start, unsigned long stop); -extern void flush_dcache_phys_range(unsigned long start, unsigned long stop); -#endif +#ifdef CONFIG_U3_DART +extern void wback_dcache_phys_range(unsigned long start, unsigned long stop); +#endif /* CONFIG_U3_DART */ #define copy_to_user_page(vma, page, vaddr, dst, src, len) \ do { \ Index: 
working-2.6/include/asm-ppc/io.h =================================================================== --- working-2.6.orig/include/asm-ppc/io.h 2005-11-23 15:56:36.000000000 +1100 +++ working-2.6/include/asm-ppc/io.h 2005-12-19 16:20:02.000000000 +1100 @@ -550,9 +550,9 @@ extern void pci_iounmap(struct pci_dev * #define dma_cache_inv(_start,_size) \ invalidate_dcache_range(_start, (_start + _size)) #define dma_cache_wback(_start,_size) \ - clean_dcache_range(_start, (_start + _size)) + wback_dcache_range(_start, (_start + _size)) #define dma_cache_wback_inv(_start,_size) \ - flush_dcache_range(_start, (_start + _size)) + wback_inval_dcache_range(_start, (_start + _size)) #else Index: working-2.6/drivers/macintosh/smu.c =================================================================== --- working-2.6.orig/drivers/macintosh/smu.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/drivers/macintosh/smu.c 2005-12-19 15:47:10.000000000 +1100 @@ -127,7 +127,7 @@ static void smu_start_cmd(void) /* Flush command and data to RAM */ faddr = (unsigned long)smu->cmd_buf; fend = faddr + smu->cmd_buf->length + 2; - flush_inval_dcache_range(faddr, fend); + wback_inval_dcache_range(faddr, fend); /* This isn't exactly a DMA mapping here, I suspect * the SMU is actually communicating with us via i2c to the @@ -176,7 +176,7 @@ static irqreturn_t smu_db_intr(int irq, * reply lenght (it's only 2 cache lines anyway) */ faddr = (unsigned long)smu->cmd_buf; - flush_inval_dcache_range(faddr, faddr + 256); + wback_inval_dcache_range(faddr, faddr + 256); /* Now check ack */ ack = (~cmd->cmd) & 0xff; Index: working-2.6/arch/ppc/kernel/ppc_ksyms.c =================================================================== --- working-2.6.orig/arch/ppc/kernel/ppc_ksyms.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/ppc/kernel/ppc_ksyms.c 2005-12-19 15:47:10.000000000 +1100 @@ -181,7 +181,7 @@ EXPORT_SYMBOL(kernel_thread); EXPORT_SYMBOL(flush_instruction_cache); EXPORT_SYMBOL(giveup_fpu); 
EXPORT_SYMBOL(__flush_icache_range); -EXPORT_SYMBOL(flush_dcache_range); +EXPORT_SYMBOL(wback_inval_dcache_range); EXPORT_SYMBOL(flush_icache_user_range); EXPORT_SYMBOL(flush_dcache_page); EXPORT_SYMBOL(flush_tlb_kernel_range); Index: working-2.6/include/asm-powerpc/asm-compat.h =================================================================== --- working-2.6.orig/include/asm-powerpc/asm-compat.h 2005-11-23 15:56:35.000000000 +1100 +++ working-2.6/include/asm-powerpc/asm-compat.h 2005-12-19 15:47:10.000000000 +1100 @@ -26,6 +26,7 @@ #define PPC_LLARX stringify_in_c(ldarx) #define PPC_STLCX stringify_in_c(stdcx.) #define PPC_CNTLZL stringify_in_c(cntlzd) +#define PPC_CLRRLI stringify_in_c(clrrdi) #else /* 32-bit */ @@ -38,6 +39,7 @@ #define PPC_LLARX stringify_in_c(lwarx) #define PPC_STLCX stringify_in_c(stwcx.) #define PPC_CNTLZL stringify_in_c(cntlzw) +#define PPC_CLRRLI stringify_in_c(clrrwi) #endif Index: working-2.6/arch/powerpc/sysdev/dart_iommu.c =================================================================== --- working-2.6.orig/arch/powerpc/sysdev/dart_iommu.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/powerpc/sysdev/dart_iommu.c 2005-12-19 16:03:07.000000000 +1100 @@ -187,7 +187,7 @@ static int dart_init(struct device_node * from a previous mapping that existed before the kernel took * over */ - flush_dcache_phys_range(dart_tablebase, + wback_dcache_phys_range(dart_tablebase, dart_tablebase + dart_tablesize); /* Allocate a spare page to map all invalid DART pages. We need to do Index: working-2.6/arch/ppc/8xx_io/enet.c =================================================================== --- working-2.6.orig/arch/ppc/8xx_io/enet.c 2005-10-25 11:59:53.000000000 +1000 +++ working-2.6/arch/ppc/8xx_io/enet.c 2005-12-19 16:21:19.000000000 +1100 @@ -239,8 +239,8 @@ scc_enet_start_xmit(struct sk_buff *skb, /* Push the data cache so the CPM does not get stale memory * data. 
*/ - flush_dcache_range((unsigned long)(skb->data), - (unsigned long)(skb->data + skb->len)); + wback_inval_dcache_range((unsigned long)(skb->data), + (unsigned long)(skb->data + skb->len)); spin_lock_irq(&cep->lock); Index: working-2.6/arch/ppc/8xx_io/fec.c =================================================================== --- working-2.6.orig/arch/ppc/8xx_io/fec.c 2005-10-25 11:59:53.000000000 +1000 +++ working-2.6/arch/ppc/8xx_io/fec.c 2005-12-19 16:21:39.000000000 +1100 @@ -387,8 +387,8 @@ fec_enet_start_xmit(struct sk_buff *skb, /* Push the data cache so the CPM does not get stale memory * data. */ - flush_dcache_range((unsigned long)skb->data, - (unsigned long)skb->data + skb->len); + wback_inval_dcache_range((unsigned long)skb->data, + (unsigned long)skb->data + skb->len); /* disable interrupts while triggering transmit */ spin_lock_irq(&fep->lock); Index: working-2.6/drivers/char/agp/uninorth-agp.c =================================================================== --- working-2.6.orig/drivers/char/agp/uninorth-agp.c 2005-11-23 15:56:23.000000000 +1100 +++ working-2.6/drivers/char/agp/uninorth-agp.c 2005-12-19 16:21:50.000000000 +1100 @@ -157,12 +157,12 @@ static int uninorth_insert_memory(struct for (i = 0, j = pg_start; i < mem->page_count; i++, j++) { agp_bridge->gatt_table[j] = cpu_to_le32((mem->memory[i] & 0xFFFFF000UL) | 0x1UL); - flush_dcache_range((unsigned long)__va(mem->memory[i]), + wback_dcache_range((unsigned long)__va(mem->memory[i]), (unsigned long)__va(mem->memory[i])+0x1000); } (void)in_le32((volatile u32*)&agp_bridge->gatt_table[pg_start]); mb(); - flush_dcache_range((unsigned long)&agp_bridge->gatt_table[pg_start], + wback_dcache_range((unsigned long)&agp_bridge->gatt_table[pg_start], (unsigned long)&agp_bridge->gatt_table[pg_start + mem->page_count]); uninorth_tlbflush(mem); @@ -195,11 +195,11 @@ static int u3_insert_memory(struct agp_m for (i = 0; i < mem->page_count; i++) { gp[i] = (mem->memory[i] >> PAGE_SHIFT) | 0x80000000UL; - 
flush_dcache_range((unsigned long)__va(mem->memory[i]), + wback_dcache_range((unsigned long)__va(mem->memory[i]), (unsigned long)__va(mem->memory[i])+0x1000); } mb(); - flush_dcache_range((unsigned long)gp, (unsigned long) &gp[i]); + wback_dcache_range((unsigned long)gp, (unsigned long) &gp[i]); uninorth_tlbflush(mem); return 0; @@ -218,7 +218,7 @@ int u3_remove_memory(struct agp_memory * for (i = 0; i < mem->page_count; ++i) gp[i] = 0; mb(); - flush_dcache_range((unsigned long)gp, (unsigned long) &gp[i]); + wback_dcache_range((unsigned long)gp, (unsigned long) &gp[i]); uninorth_tlbflush(mem); return 0; @@ -412,7 +412,7 @@ static int uninorth_create_gatt_table(st for (i = 0; i < num_entries; i++) bridge->gatt_table[i] = 0; - flush_dcache_range((unsigned long)table, (unsigned long)table_end); + wback_dcache_range((unsigned long)table, (unsigned long)table_end); return 0; } Index: working-2.6/drivers/net/fec.c =================================================================== --- working-2.6.orig/drivers/net/fec.c 2005-11-23 15:56:25.000000000 +1100 +++ working-2.6/drivers/net/fec.c 2005-12-19 16:19:42.000000000 +1100 @@ -361,8 +361,8 @@ fec_enet_start_xmit(struct sk_buff *skb, /* Push the data cache so the CPM does not get stale memory * data. 
*/ - flush_dcache_range((unsigned long)skb->data, - (unsigned long)skb->data + skb->len); + wback_inval_dcache_range((unsigned long)skb->data, + (unsigned long)skb->data + skb->len); spin_lock_irq(&fep->lock); Index: working-2.6/arch/ppc/8xx_io/cs4218_tdm.c =================================================================== --- working-2.6.orig/arch/ppc/8xx_io/cs4218_tdm.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/ppc/8xx_io/cs4218_tdm.c 2005-12-19 16:22:24.000000000 +1100 @@ -1235,8 +1235,8 @@ static void CS_Play(void) bdp = &tx_base[i]; bdp->cbd_datlen = count; - flush_dcache_range((ulong)sound_buffers[i], - (ulong)(sound_buffers[i] + count)); + wback_inval_dcache_range((ulong)sound_buffers[i], + (ulong)(sound_buffers[i] + count)); if (++i >= sq.max_count) i = 0; Index: working-2.6/drivers/serial/mpsc.c =================================================================== --- working-2.6.orig/drivers/serial/mpsc.c 2005-11-23 15:56:27.000000000 +1100 +++ working-2.6/drivers/serial/mpsc.c 2005-12-19 16:23:48.000000000 +1100 @@ -685,7 +685,7 @@ mpsc_init_rings(struct mpsc_port_info *p DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)pi->dma_region, + wback_inval_dcache_range((ulong)pi->dma_region, (ulong)pi->dma_region + MPSC_DMA_ALLOC_SIZE); #endif @@ -851,7 +851,7 @@ next_frame: dma_cache_sync((void *)rxre, MPSC_RXRE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)rxre, + wback_inval_dcache_range((ulong)rxre, (ulong)rxre + MPSC_RXRE_SIZE); #endif @@ -896,7 +896,7 @@ mpsc_setup_tx_desc(struct mpsc_port_info dma_cache_sync((void *) txre, MPSC_TXRE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)txre, + 
wback_inval_dcache_range((ulong)txre, (ulong)txre + MPSC_TXRE_SIZE); #endif @@ -945,7 +945,7 @@ mpsc_copy_tx_data(struct mpsc_port_info dma_cache_sync((void *) bp, MPSC_TXBE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)bp, + wback_inval_dcache_range((ulong)bp, (ulong)bp + MPSC_TXBE_SIZE); #endif mpsc_setup_tx_desc(pi, i, 1); @@ -1405,8 +1405,8 @@ mpsc_console_write(struct console *co, c dma_cache_sync((void *) bp, MPSC_TXBE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)bp, - (ulong)bp + MPSC_TXBE_SIZE); + wback_inval_dcache_range((ulong)bp, + (ulong)bp + MPSC_TXBE_SIZE); #endif mpsc_setup_tx_desc(pi, i, 0); pi->txr_head = (pi->txr_head + 1) & (MPSC_TXR_ENTRIES - 1); -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From miltonm at bga.com Tue Dec 20 01:52:47 2005 From: miltonm at bga.com (Milton Miller) Date: Mon, 19 Dec 2005 08:52:47 -0600 Subject: [RFC] powerpc: Merge 32/64 cacheflush code Message-ID: On Mon Dec 19 16:44:10 EST 2005, David Gibson wrote: > +extern void wback_dcache_range(unsigned long start, unsigned long > stop); > +extern void wback_inval_dcache_range(unsigned long start, unsigned > long stop); I think that while we are here we should change the arguments to be pointers (void *). The assembly doesn't care, and almost all of the users are casting from pointer to unsigned long at the call site, with dart being the exception. The instruction cache flush should also change.
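The cast pattern described here can be illustrated with a small user-space sketch. Everything in it is an assumption made for illustration: `lines_spanned`, `wback_range_ul` and `wback_range_ptr` are hypothetical stand-ins, not the kernel functions, and the line size is passed in by hand rather than read from a cache-info structure. The arithmetic only mirrors the round-down and round-up steps of the start/stop assembly under discussion, and assumes a power-of-two line size and stop >= start.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel function): count the cache lines
 * covered by the byte range [start, stop). The start address is rounded
 * down to a line boundary, then the length is rounded up to whole lines.
 * Assumes stop >= start. */
static unsigned long lines_spanned(uintptr_t start, uintptr_t stop,
                                   unsigned long line_size)
{
    unsigned long mask = line_size - 1;

    /* The mask trick is only valid for power-of-two line sizes. */
    assert(line_size != 0 && (line_size & mask) == 0);

    uintptr_t first = start & ~(uintptr_t)mask;  /* round low to line boundary */
    return (unsigned long)((stop - first + mask) / line_size);
}

/* Integer-style interface: callers that hold pointers must cast. */
static unsigned long wback_range_ul(unsigned long start, unsigned long stop,
                                    unsigned long line_size)
{
    return lines_spanned(start, stop, line_size);
}

/* Pointer-style interface: the cast happens once, inside the callee. */
static unsigned long wback_range_ptr(const void *start, const void *stop,
                                     unsigned long line_size)
{
    return lines_spanned((uintptr_t)start, (uintptr_t)stop, line_size);
}
```

With the pointer form, a caller holding a buffer `buf` of `len` bytes could write `wback_range_ptr(buf, buf + len, 128)` with no casts, whereas the integer form needs `(unsigned long)` on both arguments. Both produce the same line count.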
milton From galak at kernel.crashing.org Tue Dec 20 02:30:37 2005 From: galak at kernel.crashing.org (Kumar Gala) Date: Mon, 19 Dec 2005 09:30:37 -0600 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <20051219054410.GB13285@localhost.localdomain> References: <20051219054410.GB13285@localhost.localdomain> Message-ID: On Dec 18, 2005, at 11:44 PM, David Gibson wrote: > Paulus et al, I think the patch below is roughly the right way to go, > but it needs much more review and testing (at present it's been > cursorily tested on ppc64 pSeries only). > > This patch merges the cache flushing code for 32 and 64 bit powerpc > machines. This means the ppc64_caches mechanism for determining > correct cache sizes at runtime is ported to 32-bit, and is thus > renamed as 'powerpc_caches'. The merged cache flushing functions go > in new file arch/powerpc/kernel/cache.S. Why don't we just use the cache line information in the cputable? Why the introduction of this new powerpc_caches structure? - kumar From olh at suse.de Tue Dec 20 02:44:23 2005 From: olh at suse.de (Olaf Hering) Date: Mon, 19 Dec 2005 16:44:23 +0100 Subject: Subject: powerpc: sanitize header files for user space includes In-Reply-To: <200512162243.51740.arnd@arndb.de> References: <20051212204532.GJ23641@krispykreme> <1134432067.6989.137.camel@gaston> <20051213082131.GA14838@suse.de> <200512162243.51740.arnd@arndb.de> Message-ID: <20051219154423.GB13963@suse.de> On Fri, Dec 16, Arnd Bergmann wrote: > include/asm-ppc/ had #ifdef __KERNEL__ in all header files that > are not meant for use by user space, include/asm-powerpc does > not have this yet. Looks good. And now the same for asm-i386...
-- short story of a lazy sysadmin: alias appserv=wotan From rsa at us.ibm.com Tue Dec 20 03:01:24 2005 From: rsa at us.ibm.com (Ryan Arnold) Date: Mon, 19 Dec 2005 10:01:24 -0600 Subject: [RFC PATCH 2/3] Add a hvc backend for systemsim In-Reply-To: <20051217002255.774056000@localhost> References: <20051217001031.456315000@localhost> <20051217002255.774056000@localhost> Message-ID: <1135008084.7044.4.camel@localhost.localdomain> On Sat, 2005-12-17 at 01:10 +0100, Arnd Bergmann wrote: > +static int hvc_fss_write_console(uint32_t vtermno, const char *buf, int count) > +{ > + int ret; > + ret = callthru3(SIM_WRITE_CONSOLE_CODE, (unsigned long)buf, count, 1); > + if (ret != 0) > + return (count - ret); /* is this right? */ > + > + /* the calling routine expects to receive the number of bytes sent */ > + return count; > +} > + Greetings Arnd, I added the question "is this right?" because I didn't have documentation on the exact return value of callthru3(SIM_WRITE_CONSOLE_CODE...). The return value certainly isn't the number of bytes sent, hence the reason for the 'return count'. I suspect that it returns the number of bytes NOT sent. Could you verify this and then remove the comment? Ryan S. Arnold IBM Linux Technology Center From galak at gate.crashing.org Tue Dec 20 07:49:56 2005 From: galak at gate.crashing.org (Kumar Gala) Date: Mon, 19 Dec 2005 14:49:56 -0600 (CST) Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <1134856762.6102.54.camel@gaston> Message-ID: On Sun, 18 Dec 2005, Benjamin Herrenschmidt wrote: > > > - Do we need a way to identify the type of soc bus? There are different > > standards for this, e.g. PLB4 on PPC440 or the EIB on the Cell BE. > > My initial idea was to have different device-type properties for these, > > but I now think that device_type = "soc" makes sense for all of them. > > Maybe we could add a model or compatible property for them. > > That would be a good idea.
> > Also, it might be useful to add a "clock-frequency" to it for processors > where it makes sense. One of the things we are passing from uboot > currently is the list of clock frequencies for PLB/OPB/PCI/... we need > to replace this with appropriate nodes and their respective > "clock-frequency" properties > > > - It does not really belong in this document, but is related anyway: > > how do you want to represent this in Linux? Currently, most of these > > would be of_platform_device, but I think it would be good to have > > a new bus_type for it. The advantage would be that you can see the > > devices in /sys/devices/soc@xxx/ even if the driver is not loaded > > and the driver can even be autoloaded by udev. > > Also, which properties should show up in sysfs? All of them or just > > those specified in this document or a subset of them? I'm still in favor of just leaving these devices as straight platform devices. Unless there is something bus-specific that each device on the bus conforms to, I don't see any reason to create a new bus type. > If we go that way, we also need to have the SOC type take optionally > part in the matching. That is, the driver matching info should be based > on model & compatible like OF does, thus we could recommend something > like: > > - Define a unique SOC name per SOC bus type/family, for example, > ppc4xxPLB, etc... This goes into /soc/model. > > - Optionally, use compatible for similar busses. For example, if you > have a new rev of that PLB that is similar but has extensions called > PLB2, you can have model be ppc4xxPLB2 and compatible containing > ppc4xxPLB. > > - Define that the "model" property of a device under /soc is of the > form "socname,devicename"... For example, EMAC would be "ppc4xxPLB,emac". > Same rule applies with compatible (this one could be compatible, among > others, with "ppc4xxPLB,emac" and model "ppc4xxPLB2,emac"). > > > - What do we do with pci root devices?
They are often physically connected > > to the internal CPU bus, so it would make sense to represent them > > this way in the device tree. Should we add them to the specification > > here? Would it even work the expected way in Linux? - kumar From david at gibson.dropbear.id.au Tue Dec 20 10:54:55 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Tue, 20 Dec 2005 10:54:55 +1100 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: References: <20051219054410.GB13285@localhost.localdomain> Message-ID: <20051219235455.GB29993@localhost.localdomain> On Mon, Dec 19, 2005 at 09:30:37AM -0600, Kumar Gala wrote: > > On Dec 18, 2005, at 11:44 PM, David Gibson wrote: > > >Paulus et al, I think the patch below is roughly the right way to go, > >but it needs much more review and testing (at present it's been > >cursorily tested on ppc64 pSeries only). > > > >This patch merges the cache flushing code for 32 and 64 bit powerpc > >machines. This means the ppc64_caches mechanism for determining > >correct cache sizes at runtime is ported to 32-bit, and is thus > >renamed as 'powerpc_caches'. The merged cache flushing functions go > >in new file arch/powerpc/kernel/cache.S. > > Why don't we just use the cache line information in the cputable? Why > the introduction of this new powerpc_caches structure? Because the device tree can override the information from the cputable. Oh, and the structure is only new for ppc32. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_!
http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Tue Dec 20 12:06:17 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Tue, 20 Dec 2005 12:06:17 +1100 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: References: Message-ID: <20051220010617.GC29993@localhost.localdomain> On Mon, Dec 19, 2005 at 08:52:47AM -0600, Milton Miller wrote: > On Mon Dec 19 16:44:10 EST 2005, David Gibson wrote: > > >+extern void wback_dcache_range(unsigned long start, unsigned long > >stop); > >+extern void wback_inval_dcache_range(unsigned long start, unsigned > >long stop); > > I think that while we are here we should change the arguments to be > pointers (void *). The assembly doesn't care, and almost all of the > users are casting from pointer to unsigned long at the call site, with > dart being the exception. The instruction cache flush should also > change. True. And while we're at that, the dcache flushing functions are almost invariably called as *_dcache_range(start, start+length), so how about changing them to take start and length instead of start and end? However, flush_icache_range() is called from generic code, so I don't want to change its interface. Revised patch below: powerpc: Merge 32/64 cacheflush code This patch merges the cache flushing code for 32 and 64 bit powerpc machines. This means the ppc64_caches mechanism for determining correct cache sizes at runtime is ported to 32-bit, and is thus renamed as 'powerpc_caches'. The merged cache flushing functions go in new file arch/powerpc/kernel/cache.S. Previously, the ppc32 version of flush_dcache_range() did a writeback and invalidate of the given cache lines (dcbf) whereas the ppc64 version did just a writeback (dcbst). In general, there's no consistent meaning of "flush" as one or the other, so this patch also renames the dcache flushing functions less ambiguously.
The new names are: wback_dcache_range() - previously flush_dcache_range() on ppc64 and clean_dcache_range() on ppc32 wback_inval_dcache_range() - previously flush_inval_dcache_range() on ppc64 and flush_dcache_range on ppc32 invalidate_dcache_range() - didn't previously exist on ppc64, unchanged on ppc32 Finally we also cleanup the initialization of the powerpc_caches structure from the old ppc64 specific version. We remove a pointless loop, and remove a dependence on _machine. arch/powerpc/kernel/Makefile | 2 arch/powerpc/kernel/align.c | 2 arch/powerpc/kernel/asm-offsets.c | 12 - arch/powerpc/kernel/cache.S | 229 +++++++++++++++++++++++++++++++++++++ arch/powerpc/kernel/misc_32.S | 123 ------------------- arch/powerpc/kernel/misc_64.S | 182 ----------------------------- arch/powerpc/kernel/ppc_ksyms.c | 2 arch/powerpc/kernel/setup-common.c | 89 ++++++++++++++ arch/powerpc/kernel/setup_32.c | 33 +---- arch/powerpc/kernel/setup_64.c | 123 +------------------ arch/powerpc/kernel/vdso.c | 8 - arch/powerpc/sysdev/dart_iommu.c | 3 arch/ppc/8xx_io/cs4218_tdm.c | 8 - arch/ppc/8xx_io/enet.c | 3 arch/ppc/8xx_io/fec.c | 3 arch/ppc/kernel/dma-mapping.c | 16 -- arch/ppc/kernel/misc.S | 19 +-- arch/ppc/kernel/ppc_ksyms.c | 2 drivers/char/agp/uninorth-agp.c | 23 +-- drivers/macintosh/smu.c | 9 - drivers/net/fec.c | 3 drivers/serial/mpsc.c | 34 +---- include/asm-powerpc/asm-compat.h | 2 include/asm-powerpc/cache.h | 9 - include/asm-powerpc/cacheflush.h | 15 -- include/asm-powerpc/page_64.h | 4 include/asm-ppc/io.h | 6 27 files changed, 420 insertions(+), 544 deletions(-) Index: working-2.6/include/asm-powerpc/cache.h =================================================================== --- working-2.6.orig/include/asm-powerpc/cache.h 2005-11-23 15:56:35.000000000 +1100 +++ working-2.6/include/asm-powerpc/cache.h 2005-12-20 12:03:10.000000000 +1100 @@ -21,8 +21,8 @@ #define SMP_CACHE_BYTES L1_CACHE_BYTES #define L1_CACHE_SHIFT_MAX 7 /* largest L1 which this arch supports */ -#if 
defined(__powerpc64__) && !defined(__ASSEMBLY__) -struct ppc64_caches { +#ifndef __ASSEMBLY__ +struct powerpc_caches { u32 dsize; /* L1 d-cache size */ u32 dline_size; /* L1 d-cache line size */ u32 log_dline_size; @@ -33,8 +33,9 @@ struct ppc64_caches { u32 ilines_per_page; }; -extern struct ppc64_caches ppc64_caches; -#endif /* __powerpc64__ && ! __ASSEMBLY__ */ +extern struct powerpc_caches powerpc_caches; +extern void initialize_cache_info(void); +#endif /* ! __ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_CACHE_H */ Index: working-2.6/arch/powerpc/kernel/Makefile =================================================================== --- working-2.6.orig/arch/powerpc/kernel/Makefile 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/Makefile 2005-12-20 12:03:10.000000000 +1100 @@ -13,7 +13,7 @@ endif obj-y := semaphore.o cputable.o ptrace.o syscalls.o \ irq.o align.o signal_32.o pmc.o vdso.o \ - prom_parse.o + prom_parse.o cache.o obj-y += vdso32/ obj-$(CONFIG_PPC64) += setup_64.o binfmt_elf32.o sys_ppc32.o \ signal_64.o ptrace32.o systbl.o \ Index: working-2.6/arch/powerpc/kernel/align.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/align.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/align.c 2005-12-20 12:03:10.000000000 +1100 @@ -231,7 +231,7 @@ static int emulate_dcbz(struct pt_regs * int i, size; #ifdef __powerpc64__ - size = ppc64_caches.dline_size; + size = powerpc_caches.dline_size; #else size = L1_CACHE_BYTES; #endif Index: working-2.6/arch/powerpc/kernel/asm-offsets.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/asm-offsets.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/asm-offsets.c 2005-12-20 12:03:10.000000000 +1100 @@ -99,13 +99,13 @@ int main(void) DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); #endif /* CONFIG_PPC32 */ + 
DEFINE(DCACHEL1LINESIZE, offsetof(struct powerpc_caches, dline_size)); + DEFINE(DCACHEL1LOGLINESIZE, offsetof(struct powerpc_caches, log_dline_size)); + DEFINE(DCACHEL1LINESPERPAGE, offsetof(struct powerpc_caches, dlines_per_page)); + DEFINE(ICACHEL1LINESIZE, offsetof(struct powerpc_caches, iline_size)); + DEFINE(ICACHEL1LOGLINESIZE, offsetof(struct powerpc_caches, log_iline_size)); + DEFINE(ICACHEL1LINESPERPAGE, offsetof(struct powerpc_caches, ilines_per_page)); #ifdef CONFIG_PPC64 - DEFINE(DCACHEL1LINESIZE, offsetof(struct ppc64_caches, dline_size)); - DEFINE(DCACHEL1LOGLINESIZE, offsetof(struct ppc64_caches, log_dline_size)); - DEFINE(DCACHEL1LINESPERPAGE, offsetof(struct ppc64_caches, dlines_per_page)); - DEFINE(ICACHEL1LINESIZE, offsetof(struct ppc64_caches, iline_size)); - DEFINE(ICACHEL1LOGLINESIZE, offsetof(struct ppc64_caches, log_iline_size)); - DEFINE(ICACHEL1LINESPERPAGE, offsetof(struct ppc64_caches, ilines_per_page)); DEFINE(PLATFORM_LPAR, PLATFORM_LPAR); /* paca */ Index: working-2.6/arch/powerpc/kernel/cache.S =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ working-2.6/arch/powerpc/kernel/cache.S 2005-12-20 12:03:30.000000000 +1100 @@ -0,0 +1,229 @@ +/* + * arch/powerpc/kernel/cache.S + * + * Cache-flushing functions. + * Copyright (C) 2005 David Gibson + * Based on earlier code: + * Copyright (C) 1995-1996 Gary Thomas (gdt at linuxppc.org) + * Largely rewritten by Cort Dougan (cort at cs.nmt.edu) + * and Paul Mackerras. + * Adapted for iSeries by Mike Corrigan (mikejc at us.ibm.com) + * PPC64 updates by Dave Engebretsen (engebret at us.ibm.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ * + */ + +#include +#include +#include +#include +#include +#include +#include + +/* + * Write any modified data cache blocks out to memory + * and invalidate the corresponding instruction cache blocks. + * + * flush_icache_range(unsigned long start, unsigned long stop) + * + * flush all bytes from start through stop-1 inclusive + */ +_KPROBE(__flush_icache_range) +BEGIN_FTR_SECTION + blr /* for 601, do nothing */ +END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) +/* + * Flush the data cache to memory + * + * Different systems have different cache line sizes + * and in some cases i-cache and d-cache line sizes differ from + * each other. + */ + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10)/* Get cache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get dcache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +1: dcbst 0,r6 + add r6,r6,r7 + bdnz 1b + sync + +/* Now invalidate the instruction cache */ + lwz r7,ICACHEL1LINESIZE(r10) /* Get Icache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + subf r8,r6,r4 /* compute length */ + add r8,r8,r5 + lwz r9,ICACHEL1LOGLINESIZE(r10) /* Get icache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +2: icbi 0,r6 + add r6,r6,r7 + bdnz 2b + sync /* additional sync needed on g4 */ + isync + blr + +/* + * Flush a particular page from the data cache to RAM. + * Note: this is necessary because the instruction cache does *not* + * snoop from the data cache. 
+ * + * void __flush_dcache_icache(void *page) + */ +_GLOBAL(__flush_dcache_icache) +BEGIN_FTR_SECTION + blr /* for 601, do nothing */ +END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) + /* Flush the dcache */ + LOAD_REG_ADDR(r7, powerpc_caches) + PPC_CLRRLI r3,r3,PAGE_SHIFT /* Page align */ + lwz r4,DCACHEL1LINESPERPAGE(r7) /* Get dcache lines per page */ + lwz r5,DCACHEL1LINESIZE(r7) /* Get dcache line size */ + mr r6,r3 + mtctr r4 +0: dcbst 0,r6 + add r6,r6,r5 + bdnz 0b + sync + + /* Now invalidate the icache */ + lwz r4,ICACHEL1LINESPERPAGE(r7) /* Get icache lines per page */ + lwz r5,ICACHEL1LINESIZE(r7) /* Get icache line size */ + mtctr r4 +1: icbi 0,r3 + add r3,r3,r5 + bdnz 1b + sync /* additional sync needed on g4 */ + isync + blr + +/* + * Like above, but only do the D-cache. + * + * wback_dcache_range(void *start, unsigned long len) + * + * writeback all bytes from start to stop-1 inclusive + */ +_GLOBAL(wback_dcache_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + and r8,r3,r5 /* get cacheline offset of start */ + add r8,r8,r4 /* add length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get dcache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +0: dcbst 0,r6 + add r6,r6,r7 + bdnz 0b + sync + blr + +/* + * wback_inval_dcache_range(void *start, unsigned long len) + * + * writeback and invalidate all bytes from start to stop-1 inclusive + */ +_GLOBAL(wback_inval_dcache_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + and r8,r3,r5 /* get cacheline offset of start */ + add r8,r8,r4 /* add length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10)/* Get dcache line shift */ + srw. 
r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + sync /* FIXME: is this necessary? */ + isync /* FIXME: is this necessary? */ + mtctr r8 +0: dcbf 0,r6 + add r6,r6,r7 + bdnz 0b + sync + isync /* FIXME: is this necessary? */ + blr + +/* + * Like above, but invalidate the D-cache. This is used by the 8xx + * to invalidate the cache so the PPC core doesn't get stale data + * from the CPM (no cache snooping here :-). + * + * invalidate_dcache_range(void *start, unsigned long stop) + */ +_GLOBAL(invalidate_dcache_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + /* FIXME: should BUG() on non-aligned parameters instead */ + and r8,r3,r5 /* get cacheline offset of start */ + add r8,r8,r4 /* add length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10)/* Get dcache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? */ + mtctr r8 +0: dcbi 0,r6 + add r6,r6,r7 + bdnz 0b + sync + blr + +#ifdef CONFIG_U3_DART +/* + * Like above, but works on non-mapped physical addresses. + * Use only for non-LPAR setups ! It also assumes real mode + * is cacheable. Used for flushing out the DART before using + * it as uncacheable memory + * + * wback_dcache_phys_range(unsigned long start, unsigned long len) + * + * writeback all bytes from start to stop-1 inclusive + */ +_GLOBAL(wback_dcache_phys_range) + LOAD_REG_ADDR(r10, powerpc_caches) + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ + addi r5,r7,-1 + andc r6,r3,r5 /* round low to line bdy */ + and r8,r3,r5 /* get cacheline offset of start */ + add r8,r8,r4 /* add length */ + add r8,r8,r5 /* ensure we get enough */ + lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get dcache line shift */ + srw. r8,r8,r9 /* compute line count */ + beqlr /* nothing to do? 
*/ + mfmsr r5 /* Disable MMU Data Relocation */ + ori r0,r5,MSR_DR + xori r0,r0,MSR_DR + sync + mtmsr r0 + sync + isync + mtctr r8 +0: dcbst 0,r6 + add r6,r6,r7 + bdnz 0b + sync + isync + mtmsr r5 /* Re-enable MMU Data Relocation */ + sync + isync + blr +#endif /* CONFIG_U3_DART */ \ No newline at end of file Index: working-2.6/arch/powerpc/kernel/misc_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/misc_64.S 2005-12-20 12:03:09.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/misc_64.S 2005-12-20 12:03:10.000000000 +1100 @@ -142,188 +142,6 @@ _GLOBAL(call_with_mmu_off) mtspr SPRN_SRR1,r0 rfid - - .section ".toc","aw" -PPC64_CACHES: - .tc ppc64_caches[TC],ppc64_caches - .section ".text" - -/* - * Write any modified data cache blocks out to memory - * and invalidate the corresponding instruction cache blocks. - * - * flush_icache_range(unsigned long start, unsigned long stop) - * - * flush all bytes from start through stop-1 inclusive - */ - -_KPROBE(__flush_icache_range) - -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - * and in some cases i-cache and d-cache line sizes differ from - * each other. - */ - ld r10,PPC64_CACHES at toc(r2) - lwz r7,DCACHEL1LINESIZE(r10)/* Get cache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get log-2 of cache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -1: dcbst 0,r6 - add r6,r6,r7 - bdnz 1b - sync - -/* Now invalidate the instruction cache */ - - lwz r7,ICACHEL1LINESIZE(r10) /* Get Icache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 - lwz r9,ICACHEL1LOGLINESIZE(r10) /* Get log-2 of Icache line size */ - srw. 
r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -2: icbi 0,r6 - add r6,r6,r7 - bdnz 2b - isync - blr - .previous .text -/* - * Like above, but only do the D-cache. - * - * flush_dcache_range(unsigned long start, unsigned long stop) - * - * flush all bytes from start to stop-1 inclusive - */ -_GLOBAL(flush_dcache_range) - -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - */ - ld r10,PPC64_CACHES at toc(r2) - lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get log-2 of dcache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - mtctr r8 -0: dcbst 0,r6 - add r6,r6,r7 - bdnz 0b - sync - blr - -/* - * Like above, but works on non-mapped physical addresses. - * Use only for non-LPAR setups ! It also assumes real mode - * is cacheable. Used for flushing out the DART before using - * it as uncacheable memory - * - * flush_dcache_phys_range(unsigned long start, unsigned long stop) - * - * flush all bytes from start to stop-1 inclusive - */ -_GLOBAL(flush_dcache_phys_range) - ld r10,PPC64_CACHES at toc(r2) - lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get log-2 of dcache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? 
*/ - mfmsr r5 /* Disable MMU Data Relocation */ - ori r0,r5,MSR_DR - xori r0,r0,MSR_DR - sync - mtmsr r0 - sync - isync - mtctr r8 -0: dcbst 0,r6 - add r6,r6,r7 - bdnz 0b - sync - isync - mtmsr r5 /* Re-enable MMU Data Relocation */ - sync - isync - blr - -_GLOBAL(flush_inval_dcache_range) - ld r10,PPC64_CACHES@toc(r2) - lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ - addi r5,r7,-1 - andc r6,r3,r5 /* round low to line bdy */ - subf r8,r6,r4 /* compute length */ - add r8,r8,r5 /* ensure we get enough */ - lwz r9,DCACHEL1LOGLINESIZE(r10)/* Get log-2 of dcache line size */ - srw. r8,r8,r9 /* compute line count */ - beqlr /* nothing to do? */ - sync - isync - mtctr r8 -0: dcbf 0,r6 - add r6,r6,r7 - bdnz 0b - sync - isync - blr - - -/* - * Flush a particular page from the data cache to RAM. - * Note: this is necessary because the instruction cache does *not* - * snoop from the data cache. - * - * void __flush_dcache_icache(void *page) - */ -_GLOBAL(__flush_dcache_icache) -/* - * Flush the data cache to memory - * - * Different systems have different cache line sizes - */ - -/* Flush the dcache */ - ld r7,PPC64_CACHES@toc(r2) - clrrdi r3,r3,PAGE_SHIFT /* Page align */ - lwz r4,DCACHEL1LINESPERPAGE(r7) /* Get # dcache lines per page */ - lwz r5,DCACHEL1LINESIZE(r7) /* Get dcache line size */ - mr r6,r3 - mtctr r4 -0: dcbst 0,r6 - add r6,r6,r5 - bdnz 0b - sync - -/* Now invalidate the icache */ - - lwz r4,ICACHEL1LINESPERPAGE(r7) /* Get # icache lines per page */ - lwz r5,ICACHEL1LINESIZE(r7) /* Get icache line size */ - mtctr r4 -1: icbi 0,r3 - add r3,r3,r5 - bdnz 1b - isync - blr - /* * I/O string operations * Index: working-2.6/arch/powerpc/kernel/setup_64.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/setup_64.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/setup_64.c 2005-12-20 12:03:10.000000000 +1100 @@ -103,25 +103,6 @@ int boot_cpuid_phys = 0; dev_t boot_dev;
u64 ppc64_pft_size; -/* Pick defaults since we might want to patch instructions - * before we've read this from the device tree. - */ -struct ppc64_caches ppc64_caches = { - .dline_size = 0x80, - .log_dline_size = 7, - .iline_size = 0x80, - .log_iline_size = 7 -}; -EXPORT_SYMBOL_GPL(ppc64_caches); - -/* - * These are used in binfmt_elf.c to put aux entries on the stack - * for each elf executable being started. - */ -int dcache_bsize; -int icache_bsize; -int ucache_bsize; - /* The main machine-dep calls structure */ struct machdep_calls ppc_md; @@ -345,81 +326,6 @@ void smp_release_cpus(void) #endif /* CONFIG_SMP || CONFIG_KEXEC */ /* - * Initialize some remaining members of the ppc64_caches and systemcfg - * structures - * (at least until we get rid of them completely). This is mostly some - * cache informations about the CPU that will be used by cache flush - * routines and/or provided to userland - */ -static void __init initialize_cache_info(void) -{ - struct device_node *np; - unsigned long num_cpus = 0; - - DBG(" -> initialize_cache_info()\n"); - - for (np = NULL; (np = of_find_node_by_type(np, "cpu"));) { - num_cpus += 1; - - /* We're assuming *all* of the CPUs have the same - * d-cache and i-cache sizes... -Peter - */ - - if ( num_cpus == 1 ) { - u32 *sizep, *lsizep; - u32 size, lsize; - const char *dc, *ic; - - /* Then read cache informations */ - if (_machine == PLATFORM_POWERMAC) { - dc = "d-cache-block-size"; - ic = "i-cache-block-size"; - } else { - dc = "d-cache-line-size"; - ic = "i-cache-line-size"; - } - - size = 0; - lsize = cur_cpu_spec->dcache_bsize; - sizep = (u32 *)get_property(np, "d-cache-size", NULL); - if (sizep != NULL) - size = *sizep; - lsizep = (u32 *) get_property(np, dc, NULL); - if (lsizep != NULL) - lsize = *lsizep; - if (sizep == 0 || lsizep == 0) - DBG("Argh, can't find dcache properties ! 
" - "sizep: %p, lsizep: %p\n", sizep, lsizep); - - ppc64_caches.dsize = size; - ppc64_caches.dline_size = lsize; - ppc64_caches.log_dline_size = __ilog2(lsize); - ppc64_caches.dlines_per_page = PAGE_SIZE / lsize; - - size = 0; - lsize = cur_cpu_spec->icache_bsize; - sizep = (u32 *)get_property(np, "i-cache-size", NULL); - if (sizep != NULL) - size = *sizep; - lsizep = (u32 *)get_property(np, ic, NULL); - if (lsizep != NULL) - lsize = *lsizep; - if (sizep == 0 || lsizep == 0) - DBG("Argh, can't find icache properties ! " - "sizep: %p, lsizep: %p\n", sizep, lsizep); - - ppc64_caches.isize = size; - ppc64_caches.iline_size = lsize; - ppc64_caches.log_iline_size = __ilog2(lsize); - ppc64_caches.ilines_per_page = PAGE_SIZE / lsize; - } - } - - DBG(" <- initialize_cache_info()\n"); -} - - -/* * Do some initial setup of the system. The parameters are those which * were passed in from the bootloader. */ @@ -437,14 +343,13 @@ void __init setup_system(void) #endif /* - * Fill the ppc64_caches & systemcfg structures with informations - * retreived from the device-tree. Need to be called before + * Fill the powerpc_caches structure with information + * retreived from the device-tree. Needs to be called before * finish_device_tree() since the later requires some of the - * informations filled up here to properly parse the interrupt - * tree. - * It also sets up the cache line sizes which allows to call - * routines like flush_icache_range (used by the hash init - * later on). + * information filled up here to properly parse the interrupt + * tree. It also sets up the cache line sizes which allows to + * call routines like flush_icache_range (used by the hash + * init later on). 
*/ initialize_cache_info(); @@ -514,10 +419,10 @@ void __init setup_system(void) ppc64_interrupt_controller); printk("platform = 0x%x\n", _machine); printk("physicalMemorySize = 0x%lx\n", lmb_phys_mem_size()); - printk("ppc64_caches.dcache_line_size = 0x%x\n", - ppc64_caches.dline_size); - printk("ppc64_caches.icache_line_size = 0x%x\n", - ppc64_caches.iline_size); + printk("powerpc_caches.dcache_line_size = 0x%x\n", + powerpc_caches.dline_size); + printk("powerpc_caches.icache_line_size = 0x%x\n", + powerpc_caches.iline_size); printk("htab_address = 0x%p\n", htab_address); printk("htab_hash_mask = 0x%lx\n", htab_hash_mask); #if PHYSICAL_START > 0 @@ -597,14 +502,6 @@ void __init setup_arch(char **cmdline_p) *cmdline_p = cmd_line; - /* - * Set cache line size based on type of cpu as a default. - * Systems with OF can look in the properties on the cpu node(s) - * for a possibly more accurate value. - */ - dcache_bsize = ppc64_caches.dline_size; - icache_bsize = ppc64_caches.iline_size; - /* reboot on panic */ panic_timeout = 180; Index: working-2.6/arch/powerpc/kernel/vdso.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/vdso.c 2005-11-29 16:23:57.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/vdso.c 2005-12-20 12:03:10.000000000 +1100 @@ -671,10 +671,10 @@ void __init vdso_init(void) vdso_data->processor = mfspr(SPRN_PVR); vdso_data->platform = _machine; vdso_data->physicalMemorySize = lmb_phys_mem_size(); - vdso_data->dcache_size = ppc64_caches.dsize; - vdso_data->dcache_line_size = ppc64_caches.dline_size; - vdso_data->icache_size = ppc64_caches.isize; - vdso_data->icache_line_size = ppc64_caches.iline_size; + vdso_data->dcache_size = powerpc_caches.dsize; + vdso_data->dcache_line_size = powerpc_caches.dline_size; + vdso_data->icache_size = powerpc_caches.isize; + vdso_data->icache_line_size = powerpc_caches.iline_size; /* * Calculate the size of the 64 bits vDSO Index: 
working-2.6/include/asm-powerpc/page_64.h =================================================================== --- working-2.6.orig/include/asm-powerpc/page_64.h 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/include/asm-powerpc/page_64.h 2005-12-20 12:03:10.000000000 +1100 @@ -40,8 +40,8 @@ static __inline__ void clear_page(void * { unsigned long lines, line_size; - line_size = ppc64_caches.dline_size; - lines = ppc64_caches.dlines_per_page; + line_size = powerpc_caches.dline_size; + lines = powerpc_caches.dlines_per_page; __asm__ __volatile__( "mtctr %1 # clear_page\n\ Index: working-2.6/arch/powerpc/kernel/setup-common.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/setup-common.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/setup-common.c 2005-12-20 12:03:10.000000000 +1100 @@ -466,3 +466,92 @@ static int __init early_xmon(char *p) } early_param("xmon", early_xmon); #endif + +/* Pick defaults since we might want to patch instructions + * before we've read this from the device tree. + */ +struct powerpc_caches powerpc_caches = { + .dline_size = L1_CACHE_BYTES, + .log_dline_size = L1_CACHE_SHIFT, + .iline_size = L1_CACHE_BYTES, + .log_iline_size = L1_CACHE_SHIFT, +}; +EXPORT_SYMBOL_GPL(powerpc_caches); + +/* + * These are used in binfmt_elf.c to put aux entries on the stack + * for each elf executable being started. + */ +int dcache_bsize; +int icache_bsize; +int ucache_bsize; + +/* + * Initialize the powerpc_caches structure. This is some cache + * informations about the CPU that will be used by cache flush + * routines and/or provided to userland + */ +void __init initialize_cache_info(void) +{ + struct device_node *np; + u32 *sizep, *lsizep; + u32 size, lsize; + + DBG(" -> initialize_cache_info()\n"); + + /* We're assuming *all* of the CPUs have the same d-cache and + * i-cache sizes... 
-Peter + */ + np = of_find_node_by_type(NULL, "cpu"); + BUG_ON(!np); + + size = 0; + lsize = cur_cpu_spec->dcache_bsize; + sizep = (u32 *)get_property(np, "d-cache-size", NULL); + if (sizep) + size = *sizep; + lsizep = (u32 *) get_property(np, "d-cache-line-size", NULL); + if (! lsizep) + lsizep = (u32 *) get_property(np, "d-cache-block-size", NULL); + if (lsizep) + lsize = *lsizep; + if (!sizep || !lsizep) + DBG("Argh, can't find dcache properties! " + "sizep: %p, lsizep: %p\n", sizep, lsizep); + + powerpc_caches.dsize = size; + powerpc_caches.dline_size = lsize; + powerpc_caches.log_dline_size = __ilog2(lsize); + powerpc_caches.dlines_per_page = PAGE_SIZE / lsize; + + size = 0; + lsize = cur_cpu_spec->icache_bsize; + sizep = (u32 *)get_property(np, "i-cache-size", NULL); + if (sizep) + size = *sizep; + lsizep = (u32 *)get_property(np, "i-cache-line-size", NULL); + if (! lsizep) + lsizep = (u32 *) get_property(np, "i-cache-block-size", NULL); + if (lsizep) + lsize = *lsizep; + if (!sizep || !lsizep) + DBG("Argh, can't find icache properties ! " + "sizep: %p, lsizep: %p\n", sizep, lsizep); + + powerpc_caches.isize = size; + powerpc_caches.iline_size = lsize; + powerpc_caches.log_iline_size = __ilog2(lsize); + powerpc_caches.ilines_per_page = PAGE_SIZE / lsize; + + /* + * Set cache line size based on type of cpu as a default. + * Systems with OF can look in the properties on the cpu node(s) + * for a possibly more accurate value. + */ + dcache_bsize = powerpc_caches.dline_size; + icache_bsize = powerpc_caches.iline_size; + if (! 
cpu_has_feature(CPU_FTR_SPLIT_ID_CACHE)) + ucache_bsize = dcache_bsize; + + DBG(" <- initialize_cache_info()\n"); +} Index: working-2.6/arch/powerpc/kernel/setup_32.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/setup_32.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/setup_32.c 2005-12-20 12:03:10.000000000 +1100 @@ -89,14 +89,6 @@ struct machdep_calls ppc_md; EXPORT_SYMBOL(ppc_md); /* - * These are used in binfmt_elf.c to put aux entries on the stack - * for each elf executable being started. - */ -int dcache_bsize; -int icache_bsize; -int ucache_bsize; - -/* * We're called here very early in the boot. We determine the machine * type and call the appropriate low-level setup functions. * -- Cort @@ -294,6 +286,18 @@ void __init setup_arch(char **cmdline_p) loops_per_jiffy = 500000000 / HZ; unflatten_device_tree(); + + /* + * Fill the powerpc_caches structure with information + * retrieved from the device-tree. Needs to be called before + * finish_device_tree() since the latter requires some of the + * information filled up here to properly parse the interrupt + * tree. It also sets up the cache line sizes, which allows + * calling routines like flush_icache_range (used by the hash + * init later on). + */ + initialize_cache_info(); + check_for_initrd(); if (ppc_md.init_early) @@ -324,19 +328,6 @@ void __init setup_arch(char **cmdline_p) } #endif - /* - * Set cache line size based on type of cpu as a default. - * Systems with OF can look in the properties on the cpu node(s) - * for a possibly more accurate value.
- */ - if (cpu_has_feature(CPU_FTR_SPLIT_ID_CACHE)) { - dcache_bsize = cur_cpu_spec->dcache_bsize; - icache_bsize = cur_cpu_spec->icache_bsize; - ucache_bsize = 0; - } else - ucache_bsize = dcache_bsize = icache_bsize - = cur_cpu_spec->dcache_bsize; - /* reboot on panic */ panic_timeout = 180; Index: working-2.6/arch/powerpc/kernel/misc_32.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/misc_32.S 2005-12-20 12:03:09.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/misc_32.S 2005-12-20 12:03:10.000000000 +1100 @@ -511,129 +511,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_C blr /* - * Write any modified data cache blocks out to memory - * and invalidate the corresponding instruction cache blocks. - * This is a no-op on the 601. - * - * flush_icache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(__flush_icache_range) -BEGIN_FTR_SECTION - blr /* for 601, do nothing */ -END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - mr r6,r3 -1: dcbst 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbst's to get to ram */ - mtctr r4 -2: icbi 0,r6 - addi r6,r6,L1_CACHE_BYTES - bdnz 2b - sync /* additional sync needed on g4 */ - isync - blr -/* - * Write any modified data cache blocks out to memory. - * Does not invalidate the corresponding cache lines (especially for - * any corresponding instruction cache). - * - * clean_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(clean_dcache_range) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbst 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbst's to get to ram */ - blr - -/* - * Write any modified data cache blocks out to memory and invalidate them. - * Does not invalidate the corresponding instruction cache blocks. 
- * - * flush_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(flush_dcache_range) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbf 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbst's to get to ram */ - blr - -/* - * Like above, but invalidate the D-cache. This is used by the 8xx - * to invalidate the cache so the PPC core doesn't get stale data - * from the CPM (no cache snooping here :-). - * - * invalidate_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(invalidate_dcache_range) - li r5,L1_CACHE_BYTES-1 - andc r3,r3,r5 - subf r4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbi 0,r3 - addi r3,r3,L1_CACHE_BYTES - bdnz 1b - sync /* wait for dcbi's to get to ram */ - blr - -/* - * Flush a particular page from the data cache to RAM. - * Note: this is necessary because the instruction cache does *not* - * snoop from the data cache. - * This is a no-op on the 601 which has a unified cache. - * - * void __flush_dcache_icache(void *page) - */ -_GLOBAL(__flush_dcache_icache) -BEGIN_FTR_SECTION - blr /* for 601, do nothing */ -END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_CACHE) - rlwinm r3,r3,0,0,19 /* Get page base address */ - li r4,4096/L1_CACHE_BYTES /* Number of lines in a page */ - mtctr r4 - mr r6,r3 -0: dcbst 0,r3 /* Write line to ram */ - addi r3,r3,L1_CACHE_BYTES - bdnz 0b - sync - mtctr r4 -1: icbi 0,r6 - addi r6,r6,L1_CACHE_BYTES - bdnz 1b - sync - isync - blr - -/* * Flush a particular page from the data cache to RAM, identified * by its physical address. 
We turn off the MMU so we can just use * the physical address (this may be a highmem page without a kernel Index: working-2.6/arch/powerpc/kernel/ppc_ksyms.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/ppc_ksyms.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/ppc_ksyms.c 2005-12-20 12:03:10.000000000 +1100 @@ -165,7 +165,7 @@ EXPORT_SYMBOL(flush_tlb_page); EXPORT_SYMBOL(_tlbie); #endif EXPORT_SYMBOL(__flush_icache_range); -EXPORT_SYMBOL(flush_dcache_range); +EXPORT_SYMBOL(wback_dcache_range); #ifdef CONFIG_SMP EXPORT_SYMBOL(smp_call_function); Index: working-2.6/arch/ppc/kernel/dma-mapping.c =================================================================== --- working-2.6.orig/arch/ppc/kernel/dma-mapping.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/ppc/kernel/dma-mapping.c 2005-12-20 12:03:30.000000000 +1100 @@ -207,11 +207,8 @@ __dma_alloc_coherent(size_t size, dma_ad * Invalidate any data that might be lurking in the * kernel direct-mapped region for device DMA. */ - { - unsigned long kaddr = (unsigned long)page_address(page); - memset(page_address(page), 0, size); - flush_dcache_range(kaddr, kaddr + size); - } + memset(page_address(page), 0, size); + wback_inval_dcache_range(page_address(page), size); /* * Allocate a virtual address in the consistent mapping region. 
@@ -365,20 +362,17 @@ core_initcall(dma_alloc_init); */ void __dma_sync(void *vaddr, size_t size, int direction) { - unsigned long start = (unsigned long)vaddr; - unsigned long end = start + size; - switch (direction) { case DMA_NONE: BUG(); case DMA_FROM_DEVICE: /* invalidate only */ - invalidate_dcache_range(start, end); + invalidate_dcache_range(vaddr, size); break; case DMA_TO_DEVICE: /* writeback only */ - clean_dcache_range(start, end); + wback_dcache_range(vaddr, size); break; case DMA_BIDIRECTIONAL: /* writeback and invalidate */ - flush_dcache_range(start, end); + wback_inval_dcache_range(vaddr, size); break; } } Index: working-2.6/arch/ppc/kernel/misc.S =================================================================== --- working-2.6.orig/arch/ppc/kernel/misc.S 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/ppc/kernel/misc.S 2005-12-20 12:03:30.000000000 +1100 @@ -527,12 +527,13 @@ END_FTR_SECTION_IFCLR(CPU_FTR_SPLIT_ID_C * Does not invalidate the corresponding cache lines (especially for * any corresponding instruction cache). * - * clean_dcache_range(unsigned long start, unsigned long stop) + * wback_dcache_range(void *start, unsigned long len) */ -_GLOBAL(clean_dcache_range) +_GLOBAL(wback_dcache_range) li r5,L1_CACHE_BYTES-1 + and r6,r3,r5 andc r3,r3,r5 - subf r4,r3,r4 + add r4,r4,r6 add r4,r4,r5 srwi. r4,r4,L1_CACHE_SHIFT beqlr @@ -548,12 +549,13 @@ _GLOBAL(clean_dcache_range) * Write any modified data cache blocks out to memory and invalidate them. * Does not invalidate the corresponding instruction cache blocks. * - * flush_dcache_range(unsigned long start, unsigned long stop) + * wback_inval_dcache_range(void *start, unsigned long len) */ -_GLOBAL(flush_dcache_range) +_GLOBAL(wback_inval_dcache_range) li r5,L1_CACHE_BYTES-1 + and r6,r3,r5 andc r3,r3,r5 - subf r4,r3,r4 + add r4,r4,r6 add r4,r4,r5 srwi.
r4,r4,L1_CACHE_SHIFT beqlr @@ -570,12 +572,13 @@ _GLOBAL(flush_dcache_range) * to invalidate the cache so the PPC core doesn't get stale data * from the CPM (no cache snooping here :-). * - * invalidate_dcache_range(unsigned long start, unsigned long stop) + * invalidate_dcache_range(void *start, unsigned long len) */ _GLOBAL(invalidate_dcache_range) li r5,L1_CACHE_BYTES-1 + and r6,r3,r5 andc r3,r3,r5 - subf r4,r3,r4 + add r4,r4,r6 add r4,r4,r5 srwi. r4,r4,L1_CACHE_SHIFT beqlr Index: working-2.6/include/asm-powerpc/cacheflush.h =================================================================== --- working-2.6.orig/include/asm-powerpc/cacheflush.h 2005-11-23 15:56:35.000000000 +1100 +++ working-2.6/include/asm-powerpc/cacheflush.h 2005-12-20 12:03:30.000000000 +1100 @@ -44,15 +44,12 @@ extern void flush_dcache_icache_page(str extern void __flush_dcache_icache_phys(unsigned long physaddr); #endif /* CONFIG_PPC32 && !CONFIG_BOOKE */ -extern void flush_dcache_range(unsigned long start, unsigned long stop); -#ifdef CONFIG_PPC32 -extern void clean_dcache_range(unsigned long start, unsigned long stop); -extern void invalidate_dcache_range(unsigned long start, unsigned long stop); -#endif /* CONFIG_PPC32 */ -#ifdef CONFIG_PPC64 -extern void flush_inval_dcache_range(unsigned long start, unsigned long stop); -extern void flush_dcache_phys_range(unsigned long start, unsigned long stop); -#endif +extern void wback_dcache_range(void *start, unsigned long len); +extern void wback_inval_dcache_range(void *start, unsigned long len); +extern void invalidate_dcache_range(void *start, unsigned long len); +#ifdef CONFIG_U3_DART +extern void wback_dcache_phys_range(unsigned long start, unsigned long len); +#endif /* CONFIG_U3_DART */ #define copy_to_user_page(vma, page, vaddr, dst, src, len) \ do { \ Index: working-2.6/include/asm-ppc/io.h =================================================================== --- working-2.6.orig/include/asm-ppc/io.h 2005-11-23 15:56:36.000000000 +1100
+++ working-2.6/include/asm-ppc/io.h 2005-12-20 12:03:30.000000000 +1100 @@ -548,11 +548,11 @@ extern void pci_iounmap(struct pci_dev * #ifdef CONFIG_NOT_COHERENT_CACHE #define dma_cache_inv(_start,_size) \ - invalidate_dcache_range(_start, (_start + _size)) + invalidate_dcache_range((void *)_start, _size) #define dma_cache_wback(_start,_size) \ - clean_dcache_range(_start, (_start + _size)) + wback_dcache_range((void *)_start, _size) #define dma_cache_wback_inv(_start,_size) \ - flush_dcache_range(_start, (_start + _size)) + wback_inval_dcache_range((void *)_start, _size) #else Index: working-2.6/drivers/macintosh/smu.c =================================================================== --- working-2.6.orig/drivers/macintosh/smu.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/drivers/macintosh/smu.c 2005-12-20 12:03:30.000000000 +1100 @@ -100,7 +100,6 @@ static DECLARE_MUTEX(smu_part_access); static void smu_start_cmd(void) { - unsigned long faddr, fend; struct smu_cmd *cmd; if (list_empty(&smu->cmd_list)) @@ -125,9 +124,7 @@ static void smu_start_cmd(void) memcpy(smu->cmd_buf->data, cmd->data_buf, cmd->data_len); /* Flush command and data to RAM */ - faddr = (unsigned long)smu->cmd_buf; - fend = faddr + smu->cmd_buf->length + 2; - flush_inval_dcache_range(faddr, fend); + wback_inval_dcache_range(smu->cmd_buf, smu->cmd_buf->length + 2); /* This isn't exactly a DMA mapping here, I suspect * the SMU is actually communicating with us via i2c to the @@ -166,7 +163,6 @@ static irqreturn_t smu_db_intr(int irq, goto bail; if (rc == 0) { - unsigned long faddr; int reply_len; u8 ack; @@ -175,8 +171,7 @@ static irqreturn_t smu_db_intr(int irq, * flush the entire buffer for now as we haven't read the * reply lenght (it's only 2 cache lines anyway) */ - faddr = (unsigned long)smu->cmd_buf; - flush_inval_dcache_range(faddr, faddr + 256); + wback_inval_dcache_range(smu->cmd_buf, 256); /* Now check ack */ ack = (~cmd->cmd) & 0xff; Index: 
working-2.6/arch/ppc/kernel/ppc_ksyms.c =================================================================== --- working-2.6.orig/arch/ppc/kernel/ppc_ksyms.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/ppc/kernel/ppc_ksyms.c 2005-12-20 12:03:10.000000000 +1100 @@ -181,7 +181,7 @@ EXPORT_SYMBOL(kernel_thread); EXPORT_SYMBOL(flush_instruction_cache); EXPORT_SYMBOL(giveup_fpu); EXPORT_SYMBOL(__flush_icache_range); -EXPORT_SYMBOL(flush_dcache_range); +EXPORT_SYMBOL(wback_inval_dcache_range); EXPORT_SYMBOL(flush_icache_user_range); EXPORT_SYMBOL(flush_dcache_page); EXPORT_SYMBOL(flush_tlb_kernel_range); Index: working-2.6/include/asm-powerpc/asm-compat.h =================================================================== --- working-2.6.orig/include/asm-powerpc/asm-compat.h 2005-11-23 15:56:35.000000000 +1100 +++ working-2.6/include/asm-powerpc/asm-compat.h 2005-12-20 12:03:10.000000000 +1100 @@ -26,6 +26,7 @@ #define PPC_LLARX stringify_in_c(ldarx) #define PPC_STLCX stringify_in_c(stdcx.) #define PPC_CNTLZL stringify_in_c(cntlzd) +#define PPC_CLRRLI stringify_in_c(clrrdi) #else /* 32-bit */ @@ -38,6 +39,7 @@ #define PPC_LLARX stringify_in_c(lwarx) #define PPC_STLCX stringify_in_c(stwcx.) #define PPC_CNTLZL stringify_in_c(cntlzw) +#define PPC_CLRRLI stringify_in_c(clrrwi) #endif Index: working-2.6/arch/powerpc/sysdev/dart_iommu.c =================================================================== --- working-2.6.orig/arch/powerpc/sysdev/dart_iommu.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/powerpc/sysdev/dart_iommu.c 2005-12-20 12:03:30.000000000 +1100 @@ -187,8 +187,7 @@ static int dart_init(struct device_node * from a previous mapping that existed before the kernel took * over */ - flush_dcache_phys_range(dart_tablebase, - dart_tablebase + dart_tablesize); + wback_dcache_phys_range(dart_tablebase, dart_tablesize); /* Allocate a spare page to map all invalid DART pages. 
We need to do * that to work around what looks like a problem with the HT bridge Index: working-2.6/arch/ppc/8xx_io/enet.c =================================================================== --- working-2.6.orig/arch/ppc/8xx_io/enet.c 2005-10-25 11:59:53.000000000 +1000 +++ working-2.6/arch/ppc/8xx_io/enet.c 2005-12-20 12:03:30.000000000 +1100 @@ -239,8 +239,7 @@ scc_enet_start_xmit(struct sk_buff *skb, /* Push the data cache so the CPM does not get stale memory * data. */ - flush_dcache_range((unsigned long)(skb->data), - (unsigned long)(skb->data + skb->len)); + wback_inval_dcache_range(skb->data, skb->len); spin_lock_irq(&cep->lock); Index: working-2.6/arch/ppc/8xx_io/fec.c =================================================================== --- working-2.6.orig/arch/ppc/8xx_io/fec.c 2005-10-25 11:59:53.000000000 +1000 +++ working-2.6/arch/ppc/8xx_io/fec.c 2005-12-20 12:03:30.000000000 +1100 @@ -387,8 +387,7 @@ fec_enet_start_xmit(struct sk_buff *skb, /* Push the data cache so the CPM does not get stale memory * data. 
*/ - flush_dcache_range((unsigned long)skb->data, - (unsigned long)skb->data + skb->len); + wback_inval_dcache_range(skb->data, skb->len); /* disable interrupts while triggering transmit */ spin_lock_irq(&fep->lock); Index: working-2.6/drivers/char/agp/uninorth-agp.c =================================================================== --- working-2.6.orig/drivers/char/agp/uninorth-agp.c 2005-11-23 15:56:23.000000000 +1100 +++ working-2.6/drivers/char/agp/uninorth-agp.c 2005-12-20 12:03:30.000000000 +1100 @@ -157,13 +157,12 @@ static int uninorth_insert_memory(struct for (i = 0, j = pg_start; i < mem->page_count; i++, j++) { agp_bridge->gatt_table[j] = cpu_to_le32((mem->memory[i] & 0xFFFFF000UL) | 0x1UL); - flush_dcache_range((unsigned long)__va(mem->memory[i]), - (unsigned long)__va(mem->memory[i])+0x1000); + wback_dcache_range(__va(mem->memory[i]), 0x1000); } (void)in_le32((volatile u32*)&agp_bridge->gatt_table[pg_start]); mb(); - flush_dcache_range((unsigned long)&agp_bridge->gatt_table[pg_start], - (unsigned long)&agp_bridge->gatt_table[pg_start + mem->page_count]); + wback_dcache_range(&agp_bridge->gatt_table[pg_start], + mem->page_count*sizeof(*(agp_bridge->gatt_table))); uninorth_tlbflush(mem); return 0; @@ -195,11 +194,10 @@ static int u3_insert_memory(struct agp_m for (i = 0; i < mem->page_count; i++) { gp[i] = (mem->memory[i] >> PAGE_SHIFT) | 0x80000000UL; - flush_dcache_range((unsigned long)__va(mem->memory[i]), - (unsigned long)__va(mem->memory[i])+0x1000); + wback_dcache_range(__va(mem->memory[i]), 0x1000); } mb(); - flush_dcache_range((unsigned long)gp, (unsigned long) &gp[i]); + wback_dcache_range(gp, i*sizeof(*gp)); uninorth_tlbflush(mem); return 0; @@ -218,7 +216,7 @@ int u3_remove_memory(struct agp_memory * for (i = 0; i < mem->page_count; ++i) gp[i] = 0; mb(); - flush_dcache_range((unsigned long)gp, (unsigned long) &gp[i]); + wback_dcache_range(gp, i*sizeof(*gp)); uninorth_tlbflush(mem); return 0; @@ -365,7 +363,7 @@ static int 
agp_uninorth_resume(struct pc static int uninorth_create_gatt_table(struct agp_bridge_data *bridge) { char *table; - char *table_end; + unsigned long table_size; int size; int page_order; int num_entries; @@ -400,9 +398,10 @@ static int uninorth_create_gatt_table(st if (table == NULL) return -ENOMEM; - table_end = table + ((PAGE_SIZE * (1 << page_order)) - 1); + table_size= (PAGE_SIZE * (1 << page_order)) - 1; - for (page = virt_to_page(table); page <= virt_to_page(table_end); page++) + for (page = virt_to_page(table); + page <= virt_to_page(table + table_size); page++) SetPageReserved(page); bridge->gatt_table_real = (u32 *) table; @@ -412,7 +411,7 @@ static int uninorth_create_gatt_table(st for (i = 0; i < num_entries; i++) bridge->gatt_table[i] = 0; - flush_dcache_range((unsigned long)table, (unsigned long)table_end); + wback_dcache_range(table, table_size); return 0; } Index: working-2.6/drivers/net/fec.c =================================================================== --- working-2.6.orig/drivers/net/fec.c 2005-11-23 15:56:25.000000000 +1100 +++ working-2.6/drivers/net/fec.c 2005-12-20 12:03:30.000000000 +1100 @@ -361,8 +361,7 @@ fec_enet_start_xmit(struct sk_buff *skb, /* Push the data cache so the CPM does not get stale memory * data. 
*/ - flush_dcache_range((unsigned long)skb->data, - (unsigned long)skb->data + skb->len); + wback_inval_dcache_range(skb->data, skb->len); spin_lock_irq(&fep->lock); Index: working-2.6/arch/ppc/8xx_io/cs4218_tdm.c =================================================================== --- working-2.6.orig/arch/ppc/8xx_io/cs4218_tdm.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/ppc/8xx_io/cs4218_tdm.c 2005-12-20 12:03:30.000000000 +1100 @@ -1235,8 +1235,7 @@ static void CS_Play(void) bdp = &tx_base[i]; bdp->cbd_datlen = count; - flush_dcache_range((ulong)sound_buffers[i], - (ulong)(sound_buffers[i] + count)); + wback_inval_dcache_range(sound_buffers[i], count); if (++i >= sq.max_count) i = 0; @@ -1334,9 +1333,8 @@ cs4218_tdm_rx_intr(void *devid) /* Invalidate the data cache range for this buffer. */ - invalidate_dcache_range( - (uint)(sound_read_buffers[read_sq.rear]), - (uint)(sound_read_buffers[read_sq.rear] + read_sq.block_size)); + invalidate_dcache_range(sound_read_buffers[read_sq.rear], + read_sq.block_size); /* Make buffer available again and move on. 
*/ Index: working-2.6/drivers/serial/mpsc.c =================================================================== --- working-2.6.orig/drivers/serial/mpsc.c 2005-11-23 15:56:27.000000000 +1100 +++ working-2.6/drivers/serial/mpsc.c 2005-12-20 12:03:30.000000000 +1100 @@ -308,8 +308,7 @@ mpsc_sdma_start_tx(struct mpsc_port_info dma_cache_sync((void *) txre, MPSC_TXRE_SIZE, DMA_FROM_DEVICE); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - invalidate_dcache_range((ulong)txre, - (ulong)txre + MPSC_TXRE_SIZE); + invalidate_dcache_range(txre, MPSC_TXRE_SIZE); #endif if (be32_to_cpu(txre->cmdstat) & SDMA_DESC_CMDSTAT_O) { @@ -685,8 +684,8 @@ mpsc_init_rings(struct mpsc_port_info *p DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)pi->dma_region, - (ulong)pi->dma_region + MPSC_DMA_ALLOC_SIZE); + wback_inval_dcache_range(pi->dma_region, + MPSC_DMA_ALLOC_SIZE); #endif return; @@ -758,8 +757,7 @@ mpsc_rx_intr(struct mpsc_port_info *pi, dma_cache_sync((void *)rxre, MPSC_RXRE_SIZE, DMA_FROM_DEVICE); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - invalidate_dcache_range((ulong)rxre, - (ulong)rxre + MPSC_RXRE_SIZE); + invalidate_dcache_range(rxre, MPSC_RXRE_SIZE); #endif /* @@ -782,8 +780,7 @@ mpsc_rx_intr(struct mpsc_port_info *pi, dma_cache_sync((void *) bp, MPSC_RXBE_SIZE, DMA_FROM_DEVICE); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - invalidate_dcache_range((ulong)bp, - (ulong)bp + MPSC_RXBE_SIZE); + invalidate_dcache_range(bp, MPSC_RXBE_SIZE); #endif /* @@ -851,8 +848,7 @@ next_frame: dma_cache_sync((void *)rxre, MPSC_RXRE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - 
flush_dcache_range((ulong)rxre, - (ulong)rxre + MPSC_RXRE_SIZE); + wback_inval_dcache_range(rxre, MPSC_RXRE_SIZE); #endif /* Advance to next descriptor */ @@ -862,8 +858,7 @@ next_frame: dma_cache_sync((void *)rxre, MPSC_RXRE_SIZE, DMA_FROM_DEVICE); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - invalidate_dcache_range((ulong)rxre, - (ulong)rxre + MPSC_RXRE_SIZE); + invalidate_dcache_range(rxre, MPSC_RXRE_SIZE); #endif rc = 1; @@ -896,8 +891,7 @@ mpsc_setup_tx_desc(struct mpsc_port_info dma_cache_sync((void *) txre, MPSC_TXRE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)txre, - (ulong)txre + MPSC_TXRE_SIZE); + wback_inval_dcache_range(txre, MPSC_TXRE_SIZE); #endif return; @@ -945,8 +939,7 @@ mpsc_copy_tx_data(struct mpsc_port_info dma_cache_sync((void *) bp, MPSC_TXBE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)bp, - (ulong)bp + MPSC_TXBE_SIZE); + wback_inval_dcache_range(bp, MPSC_TXBE_SIZE); #endif mpsc_setup_tx_desc(pi, i, 1); @@ -970,8 +963,7 @@ mpsc_tx_intr(struct mpsc_port_info *pi) dma_cache_sync((void *) txre, MPSC_TXRE_SIZE, DMA_FROM_DEVICE); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - invalidate_dcache_range((ulong)txre, - (ulong)txre + MPSC_TXRE_SIZE); + invalidate_dcache_range(txre, MPSC_TXRE_SIZE); #endif while (!(be32_to_cpu(txre->cmdstat) & SDMA_DESC_CMDSTAT_O)) { @@ -989,8 +981,7 @@ mpsc_tx_intr(struct mpsc_port_info *pi) DMA_FROM_DEVICE); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - invalidate_dcache_range((ulong)txre, - (ulong)txre + MPSC_TXRE_SIZE); + invalidate_dcache_range(txre, MPSC_TXRE_SIZE); 
#endif } @@ -1405,8 +1396,7 @@ mpsc_console_write(struct console *co, c dma_cache_sync((void *) bp, MPSC_TXBE_SIZE, DMA_BIDIRECTIONAL); #if defined(CONFIG_PPC32) && !defined(CONFIG_NOT_COHERENT_CACHE) if (pi->cache_mgmt) /* GT642[46]0 Res #COMM-2 */ - flush_dcache_range((ulong)bp, - (ulong)bp + MPSC_TXBE_SIZE); + wback_inval_dcache_range(bp, MPSC_TXBE_SIZE); #endif mpsc_setup_tx_desc(pi, i, 0); pi->txr_head = (pi->txr_head + 1) & (MPSC_TXR_ENTRIES - 1); -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From benh at kernel.crashing.org Tue Dec 20 14:23:34 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 20 Dec 2005 14:23:34 +1100 Subject: [PATCH] macintosh: don't store i2c_add_driver() return if no further processing done In-Reply-To: <20051220031439.GB24700@krypton> References: <20051220031439.GB24700@krypton> Message-ID: <1135049014.10035.64.camel@gaston> On Mon, 2005-12-19 at 22:14 -0500, Arthur Othieno wrote: > therm_pm72.c and windfarm_lm75_sensor.c both store the return from > i2c_add_driver() but do no further processing on the result. Simply > return what i2c_add_driver() did, instead. 
> > Signed-off-by: Arthur Othieno Acked-by: Benjamin Herrenschmidt > > drivers/macintosh/therm_pm72.c | 7 +------ > drivers/macintosh/windfarm_lm75_sensor.c | 7 +------ > 2 files changed, 2 insertions(+), 12 deletions(-) > > 1066f48b47e6d216b41bc58064b7c791d05f4a44 > diff --git a/drivers/macintosh/therm_pm72.c b/drivers/macintosh/therm_pm72.c > index 3fc8cdd..a112eed 100644 > --- a/drivers/macintosh/therm_pm72.c > +++ b/drivers/macintosh/therm_pm72.c > @@ -1988,18 +1988,13 @@ static void fcu_lookup_fans(struct devic > > static int fcu_of_probe(struct of_device* dev, const struct of_device_id *match) > { > - int rc; > - > state = state_detached; > > /* Lookup the fans in the device tree */ > fcu_lookup_fans(dev->node); > > /* Add the driver */ > - rc = i2c_add_driver(&therm_pm72_driver); > - if (rc < 0) > - return rc; > - return 0; > + return i2c_add_driver(&therm_pm72_driver); > } > > static int fcu_of_remove(struct of_device* dev) > diff --git a/drivers/macintosh/windfarm_lm75_sensor.c b/drivers/macintosh/windfarm_lm75_sensor.c > index a0a41ad..c62ed68 100644 > --- a/drivers/macintosh/windfarm_lm75_sensor.c > +++ b/drivers/macintosh/windfarm_lm75_sensor.c > @@ -240,12 +240,7 @@ static int wf_lm75_detach(struct i2c_cli > > static int __init wf_lm75_sensor_init(void) > { > - int rc; > - > - rc = i2c_add_driver(&wf_lm75_driver); > - if (rc < 0) > - return rc; > - return 0; > + return i2c_add_driver(&wf_lm75_driver); > } > > static void __exit wf_lm75_sensor_exit(void) From a.othieno at bluewin.ch Tue Dec 20 14:14:39 2005 From: a.othieno at bluewin.ch (Arthur Othieno) Date: Mon, 19 Dec 2005 22:14:39 -0500 Subject: [PATCH] macintosh: don't store i2c_add_driver() return if no further processing done Message-ID: <20051220031439.GB24700@krypton> therm_pm72.c and windfarm_lm75_sensor.c both store the return from i2c_add_driver() but do no further processing on the result. Simply return what i2c_add_driver() did, instead. 
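[Editorial sketch, not part of the original patch.] The simplification described above can be checked in isolation. The following is a minimal standalone sketch, assuming the callee only ever returns 0 on success or a negative errno on failure; `fake_add_driver()`, `probe_old()` and `probe_new()` are hypothetical names used purely for illustration and are not the real `i2c_add_driver()` or driver code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for i2c_add_driver(): returns whatever it is told to. */
static int fake_add_driver(int result)
{
	return result;
}

/* Old form: store the result, pass errors through, otherwise return 0. */
static int probe_old(int result)
{
	int rc;

	rc = fake_add_driver(result);
	if (rc < 0)
		return rc;
	return 0;
}

/* New form: hand the callee's return value straight back to the caller. */
static int probe_new(int result)
{
	return fake_add_driver(result);
}
```

The two forms agree for 0 and for negative values. They would differ only if the callee ever returned a positive value, which the old form maps to 0 and the new form passes through; the patch relies on that not happening.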
Signed-off-by: Arthur Othieno --- drivers/macintosh/therm_pm72.c | 7 +------ drivers/macintosh/windfarm_lm75_sensor.c | 7 +------ 2 files changed, 2 insertions(+), 12 deletions(-) 1066f48b47e6d216b41bc58064b7c791d05f4a44 diff --git a/drivers/macintosh/therm_pm72.c b/drivers/macintosh/therm_pm72.c index 3fc8cdd..a112eed 100644 --- a/drivers/macintosh/therm_pm72.c +++ b/drivers/macintosh/therm_pm72.c @@ -1988,18 +1988,13 @@ static void fcu_lookup_fans(struct devic static int fcu_of_probe(struct of_device* dev, const struct of_device_id *match) { - int rc; - state = state_detached; /* Lookup the fans in the device tree */ fcu_lookup_fans(dev->node); /* Add the driver */ - rc = i2c_add_driver(&therm_pm72_driver); - if (rc < 0) - return rc; - return 0; + return i2c_add_driver(&therm_pm72_driver); } static int fcu_of_remove(struct of_device* dev) diff --git a/drivers/macintosh/windfarm_lm75_sensor.c b/drivers/macintosh/windfarm_lm75_sensor.c index a0a41ad..c62ed68 100644 --- a/drivers/macintosh/windfarm_lm75_sensor.c +++ b/drivers/macintosh/windfarm_lm75_sensor.c @@ -240,12 +240,7 @@ static int wf_lm75_detach(struct i2c_cli static int __init wf_lm75_sensor_init(void) { - int rc; - - rc = i2c_add_driver(&wf_lm75_driver); - if (rc < 0) - return rc; - return 0; + return i2c_add_driver(&wf_lm75_driver); } static void __exit wf_lm75_sensor_exit(void) -- 0.99.9n From paulus at samba.org Tue Dec 20 16:47:41 2005 From: paulus at samba.org (Paul Mackerras) Date: Tue, 20 Dec 2005 16:47:41 +1100 Subject: please pull powerpc-merge.git Message-ID: <17319.39677.706288.200088@cargo.ozlabs.ibm.com> Linus, Please pull git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge.git There are 3 commits in there: a defconfig update and two other fixes that need to go in 2.6.15. I have included the patch for the two fixes (but not the boring defconfig update :). Thanks, Paul. 
arch/powerpc/configs/cell_defconfig | 7 ++++--- arch/powerpc/configs/g5_defconfig | 9 +++++---- arch/powerpc/configs/iseries_defconfig | 7 ++++--- arch/powerpc/configs/maple_defconfig | 10 +++++----- arch/powerpc/configs/ppc64_defconfig | 7 ++++--- arch/powerpc/configs/pseries_defconfig | 7 ++++--- arch/powerpc/kernel/entry_64.S | 4 ++-- arch/ppc/platforms/85xx/mpc85xx_cds_common.c | 3 ++- 8 files changed, 30 insertions(+), 24 deletions(-) Edson Seabra: powerpc: CPM2 interrupt handler failure after 100,000 interrupts Paul Mackerras: powerpc: correct register usage in 64-bit syscall exit path powerpc: update defconfigs diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 2d22bf0..bce33a3 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -183,8 +183,8 @@ syscall_exit_trace_cont: ld r13,GPR13(r1) /* returning to usermode */ 1: ld r2,GPR2(r1) li r12,MSR_RI - andc r10,r10,r12 - mtmsrd r10,1 /* clear MSR.RI */ + andc r11,r10,r12 + mtmsrd r11,1 /* clear MSR.RI */ ld r1,GPR1(r1) mtlr r4 mtcr r5 diff --git a/arch/ppc/platforms/85xx/mpc85xx_cds_common.c b/arch/ppc/platforms/85xx/mpc85xx_cds_common.c index d8991b8..5e8cc5e 100644 --- a/arch/ppc/platforms/85xx/mpc85xx_cds_common.c +++ b/arch/ppc/platforms/85xx/mpc85xx_cds_common.c @@ -130,10 +130,11 @@ mpc85xx_cds_show_cpuinfo(struct seq_file } #ifdef CONFIG_CPM2 -static void cpm2_cascade(int irq, void *dev_id, struct pt_regs *regs) +static irqreturn_t cpm2_cascade(int irq, void *dev_id, struct pt_regs *regs) { while((irq = cpm2_get_irq(regs)) >= 0) __do_IRQ(irq, regs); + return IRQ_HANDLED; } static struct irqaction cpm2_irqaction = { From arnd at arndb.de Tue Dec 20 21:18:03 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 20 Dec 2005 11:18:03 +0100 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: References: Message-ID: <200512201118.05404.arnd@arndb.de> On Maandag 19 Dezember 2005 21:49, Kumar Gala wrote: > I'm still 
in favor of just leaving these devices as straight platform > devices. Unless there is something that is bus specific that each device > on the bus conforms to I don't see any reason to create a new bus type. How do platform devices work with module autoloading? What I'm interested in is to have stuff like the Fedora installer or kernels with modular drivers 'just work' because they can use the same way to load their modules that is already used for PCI devices. AFAICS, that requires at least two things: - The device needs to be created when the bus is probed, i.e. of_device_register cannot be called from inside the driver module_init() function. - The bus type needs to create a modalias file so user space can do the matching with the OF device table in the modules. Both of these should be a lot easier to implement with a special bus type that creates entries in sysfs for a subset of the OF device tree. The alternative would be to represent all of the device tree in /sys/devices, but IMHO that should better be part of /sys/firmware with symlinks to the linux internal device tree representation. Arnd <>< From arnd at arndb.de Tue Dec 20 23:14:03 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 20 Dec 2005 13:14:03 +0100 Subject: [PATCH, version 6] cell: enable pause(0) in cpu_idle In-Reply-To: <200512171228.21578.arnd@arndb.de> References: <200512171228.21578.arnd@arndb.de> Message-ID: <200512201314.12932.arnd@arndb.de> This patch enables support for the pause(0) power management state for the Cell Broadband Processor, which is important for power-efficient operation. The pervasive infrastructure will in the future enable us to introduce more functionality specific to the Cell's pervasive unit. 
From: Maximino Aguilar Signed-off-by: Arnd Bergmann Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/Makefile +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/Makefile @@ -1,4 +1,6 @@ obj-y += interrupt.o iommu.o setup.o spider-pic.o +obj-y += pervasive.o + obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SPU_FS) += spufs/ spu_base.o builtin-spufs-$(CONFIG_SPU_FS) += spu_syscalls.o Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.c @@ -0,0 +1,223 @@ +/* + * CBE Pervasive Monitor and Debug + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * Michael N. Day (mnday at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + +#undef DEBUG + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "pervasive.h" + +static DEFINE_SPINLOCK(cbe_pervasive_lock); +struct cbe_pervasive { + struct pmd_regs __iomem *regs; + unsigned int thread; +}; + +/* can't use per_cpu from setup_arch */ +static struct cbe_pervasive cbe_pervasive[NR_CPUS]; + +static void __init cbe_enable_pause_zero(void) +{ + unsigned long thread_switch_control; + unsigned long temp_register; + struct cbe_pervasive *p; + int thread; + + spin_lock_irq(&cbe_pervasive_lock); + p = &cbe_pervasive[smp_processor_id()]; + + if (!cbe_pervasive->regs) + goto out; + + pr_debug("Power Management: CPU %d\n", smp_processor_id()); + + /* Enable Pause(0) control bit */ + temp_register = in_be64(&p->regs->pm_control); + + out_be64(&p->regs->pm_control, + temp_register|PMD_PAUSE_ZERO_CONTROL); + + /* Enable DEC and EE interrupt request */ + thread_switch_control = mfspr(SPRN_TSC_CELL); + thread_switch_control |= TSC_CELL_EE_ENABLE | TSC_CELL_EE_BOOST; + + switch ((mfspr(SPRN_CTRLF) & CTRL_CT)) { + case CTRL_CT0: + thread_switch_control |= TSC_CELL_DEC_ENABLE_0; + thread = 0; + break; + case CTRL_CT1: + thread_switch_control |= TSC_CELL_DEC_ENABLE_1; + thread = 1; + break; + default: + printk(KERN_WARNING "%s: unknown configuration\n", + __FUNCTION__); + thread = -1; + break; + } + + if (p->thread != thread) + printk(KERN_WARNING "%s: device tree inconsistant, " + "cpu %i: %d/%d\n", __FUNCTION__, + smp_processor_id(), + p->thread, thread); + + mtspr(SPRN_TSC_CELL, thread_switch_control); + +out: + spin_unlock_irq(&cbe_pervasive_lock); +} + +static void cbe_idle(void) +{ + unsigned long ctrl; + + cbe_enable_pause_zero(); + + while (1) { + if (!need_resched()) { + local_irq_disable(); + while (!need_resched()) { + /* go into low thread priority */ + HMT_low(); + + /* atomically disable thread execution + * and runlatch. 
+ * External and Decrementer exceptions + * are still handled when the thread + * is disabled but now enter in + * cbe_system_reset_exception() */ + ctrl = mfspr(SPRN_CTRLF); + ctrl &= ~(CTRL_RUNLATCH | CTRL_TE); + mtspr(SPRN_CTRLT, ctrl); + } + /* restore thread prio */ + HMT_medium(); + local_irq_enable(); + } + + /* turn runlatch on again before scheduling the + * process we just woke up */ + ppc64_runlatch_on(); + + preempt_enable_no_resched(); + schedule(); + preempt_disable(); + } +} + +int cbe_system_reset_exception(struct pt_regs *regs) +{ + switch (regs->msr & SRR1_WAKEMASK) { + case SRR1_WAKEEE: + do_IRQ(regs); + break; + case SRR1_WAKEDEC: + timer_interrupt(regs); + break; + case SRR1_WAKEMT: + /* no action required */ + break; + default: + return 0; /* do system reset */ + } + return 1; /* everything handled */ +} + +static int __init cbe_find_pmd_mmio(int cpu, struct cbe_pervasive *p) +{ + struct device_node *node; + unsigned int *int_servers; + char *addr; + unsigned long real_address; + unsigned int size; + + struct pmd_regs __iomem *pmd_mmio_area; + int hardid, thread; + int proplen; + + pmd_mmio_area = NULL; + hardid = get_hard_smp_processor_id(cpu); + for (node = NULL; (node = of_find_node_by_type(node, "cpu"));) { + int_servers = (void *) get_property(node, + "ibm,ppc-interrupt-server#s", &proplen); + if (!int_servers) { + printk(KERN_WARNING "%s misses " + "ibm,ppc-interrupt-server#s property", + node->full_name); + continue; + } + for (thread = 0; thread < proplen / sizeof (int); thread++) { + if (hardid == int_servers[thread]) { + addr = get_property(node, "pervasive", NULL); + goto found; + } + } + } + + printk(KERN_WARNING "%s: CPU %d not found\n", __FUNCTION__, cpu); + return -EINVAL; + +found: + real_address = *(unsigned long*) addr; + addr += sizeof (unsigned long); + size = *(unsigned int*) addr; + + pr_debug("pervasive area for CPU %d at %lx, size %x\n", + cpu, real_address, size); + p->regs = __ioremap(real_address, size, 
_PAGE_NO_CACHE); + p->thread = thread; + return 0; +} + +void __init cell_pervasive_init(void) +{ + struct cbe_pervasive *p; + int cpu; + int ret; + + if (!cpu_has_feature(CPU_FTR_PAUSE_ZERO)) + return; + + for_each_cpu(cpu) { + p = &cbe_pervasive[cpu]; + ret = cbe_find_pmd_mmio(cpu, p); + if (ret) + return; + } + + ppc_md.idle_loop = cbe_idle; + ppc_md.system_reset_exception = cbe_system_reset_exception; +} Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/pervasive.h @@ -0,0 +1,62 @@ +/* + * Cell Pervasive Monitor and Debug interface and HW structures + * + * (C) Copyright IBM Corporation 2005 + * + * Authors: Maximino Aguilar (maguilar at us.ibm.com) + * David J. Erb (djerb at us.ibm.com) + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+ */ + + +#ifndef PERVASIVE_H +#define PERVASIVE_H + +struct pmd_regs { + u8 pad_0x0000_0x0800[0x0800 - 0x0000]; /* 0x0000 */ + + /* Thermal Sensor Registers */ + u64 ts_ctsr1; /* 0x0800 */ + u64 ts_ctsr2; /* 0x0808 */ + u64 ts_mtsr1; /* 0x0810 */ + u64 ts_mtsr2; /* 0x0818 */ + u64 ts_itr1; /* 0x0820 */ + u64 ts_itr2; /* 0x0828 */ + u64 ts_gitr; /* 0x0830 */ + u64 ts_isr; /* 0x0838 */ + u64 ts_imr; /* 0x0840 */ + u64 tm_cr1; /* 0x0848 */ + u64 tm_cr2; /* 0x0850 */ + u64 tm_simr; /* 0x0858 */ + u64 tm_tpr; /* 0x0860 */ + u64 tm_str1; /* 0x0868 */ + u64 tm_str2; /* 0x0870 */ + u64 tm_tsr; /* 0x0878 */ + + /* Power Management */ + u64 pm_control; /* 0x0880 */ +#define PMD_PAUSE_ZERO_CONTROL 0x10000 + u64 pm_status; /* 0x0888 */ + + /* Time Base Register */ + u64 tbr; /* 0x0890 */ + + u8 pad_0x0898_0x1000 [0x1000 - 0x0898]; /* 0x0898 */ +}; + +void __init cell_pervasive_init(void); + +#endif Index: linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/cell/setup.c +++ linux-2.6.15-rc/arch/powerpc/platforms/cell/setup.c @@ -49,6 +49,7 @@ #include "interrupt.h" #include "iommu.h" +#include "pervasive.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -165,6 +166,7 @@ static void __init cell_setup_arch(void) init_pci_config_tokens(); find_and_init_phbs(); spider_init_IRQ(); + cell_pervasive_init(); #ifdef CONFIG_DUMMY_CONSOLE conswitchp = &dummy_con; #endif Index: linux-2.6.15-rc/include/asm-powerpc/cputable.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/cputable.h +++ linux-2.6.15-rc/include/asm-powerpc/cputable.h @@ -106,6 +106,7 @@ extern void do_cpu_ftr_fixups(unsigned l #define CPU_FTR_LOCKLESS_TLBIE ASM_CONST(0x0000040000000000) #define CPU_FTR_MMCRA_SIHV ASM_CONST(0x0000080000000000) #define CPU_FTR_CI_LARGE_PAGE ASM_CONST(0x0000100000000000) +#define CPU_FTR_PAUSE_ZERO ASM_CONST(0x0000200000000000) #else /* ensure on 32b processors the flags are available for compiling but * don't do anything */ @@ -305,7 +306,8 @@ enum { CPU_FTR_MMCRA_SIHV, CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 | - CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT, + CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | + CPU_FTR_CTRL | CPU_FTR_PAUSE_ZERO, CPU_FTRS_COMPATIBLE = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB | CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2, #endif Index: linux-2.6.15-rc/include/asm-powerpc/reg.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/reg.h +++ linux-2.6.15-rc/include/asm-powerpc/reg.h @@ -145,6 +145,10 @@ #define SPRN_CTR 0x009 /* Count Register */ #define SPRN_CTRLF 0x088 #define SPRN_CTRLT 0x098 +#define CTRL_CT 0xc0000000 /* current thread */ +#define CTRL_CT0 0x80000000 /* thread 0 */ +#define CTRL_CT1 0x40000000 /* thread 1 */ +#define CTRL_TE 0x00c00000 /* thread enable */ #define CTRL_RUNLATCH 0x1 #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ #define DABR_TRANSLATION (1UL << 2) @@ -257,11 +261,11 @@ #define SPRN_HID6 0x3F9 /* BE HID 6 */ #define HID6_LB (0x0F<<12) /* 
Concurrent Large Page Modes */ #define HID6_DLP (1<<20) /* Disable all large page modes (4K only) */ -#define SPRN_TSCR 0x399 /* Thread switch control on BE */ -#define SPRN_TTR 0x39A /* Thread switch timeout on BE */ -#define TSCR_DEC_ENABLE 0x200000 /* Decrementer Interrupt */ -#define TSCR_EE_ENABLE 0x100000 /* External Interrupt */ -#define TSCR_EE_BOOST 0x080000 /* External Interrupt Boost */ +#define SPRN_TSC_CELL 0x399 /* Thread switch control on Cell */ +#define TSC_CELL_DEC_ENABLE_0 0x400000 /* Decrementer Interrupt */ +#define TSC_CELL_DEC_ENABLE_1 0x200000 /* Decrementer Interrupt */ +#define TSC_CELL_EE_ENABLE 0x100000 /* External Interrupt */ +#define TSC_CELL_EE_BOOST 0x080000 /* External Interrupt Boost */ #define SPRN_TSC 0x3FD /* Thread switch control on others */ #define SPRN_TST 0x3FC /* Thread switch timeout on others */ #if !defined(SPRN_IAC1) && !defined(SPRN_IAC2) @@ -375,6 +379,14 @@ #define SPRN_SPRG7 0x117 /* Special Purpose Register General 7 */ #define SPRN_SRR0 0x01A /* Save/Restore Register 0 */ #define SPRN_SRR1 0x01B /* Save/Restore Register 1 */ +#define SRR1_WAKEMASK 0x00380000 /* reason for wakeup */ +#define SRR1_WAKERESET 0x00380000 /* System reset */ +#define SRR1_WAKESYSERR 0x00300000 /* System error */ +#define SRR1_WAKEEE 0x00200000 /* External interrupt */ +#define SRR1_WAKEMT 0x00280000 /* mtctrl */ +#define SRR1_WAKEDEC 0x00180000 /* Decrementer interrupt */ +#define SRR1_WAKETHERM 0x00100000 /* Thermal management interrupt */ + #ifndef SPRN_SVR #define SPRN_SVR 0x11E /* System Version Register */ #endif Index: linux-2.6.15-rc/arch/powerpc/kernel/cputable.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/cputable.c +++ linux-2.6.15-rc/arch/powerpc/kernel/cputable.c @@ -273,7 +273,7 @@ struct cpu_spec cpu_specs[] = { .oprofile_model = &op_model_power4, #endif }, - { /* BE DD1.x */ + { /* Cell Broadband Engine */ .pvr_mask = 0xffff0000, .pvr_value = 
0x00700000, .cpu_name = "Cell Broadband Engine", Index: linux-2.6.15-rc/arch/powerpc/kernel/traps.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/kernel/traps.c +++ linux-2.6.15-rc/arch/powerpc/kernel/traps.c @@ -230,8 +230,10 @@ void _exception(int signr, struct pt_reg void system_reset_exception(struct pt_regs *regs) { /* See if any machine dependent calls */ - if (ppc_md.system_reset_exception) - ppc_md.system_reset_exception(regs); + if (ppc_md.system_reset_exception) { + if (ppc_md.system_reset_exception(regs)) + return; + } die("System Reset", regs, SIGABRT); Index: linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.h =================================================================== --- /dev/null +++ linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.h @@ -0,0 +1,9 @@ +#ifndef _PSERIES_RAS_H +#define _PSERIES_RAS_H + +struct pt_regs; + +extern int pSeries_system_reset_exception(struct pt_regs *regs); +extern int pSeries_machine_check_exception(struct pt_regs *regs); + +#endif /* _PSERIES_RAS_H */ Index: linux-2.6.15-rc/arch/powerpc/platforms/pseries/setup.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/pseries/setup.c +++ linux-2.6.15-rc/arch/powerpc/platforms/pseries/setup.c @@ -69,6 +69,7 @@ #include #include "plpar_wrappers.h" +#include "ras.h" #ifdef DEBUG #define DBG(fmt...) 
udbg_printf(fmt) @@ -80,9 +81,6 @@ extern void find_udbg_vterm(void); int fwnmi_active; /* TRUE if an FWNMI handler is present */ -extern void pSeries_system_reset_exception(struct pt_regs *regs); -extern int pSeries_machine_check_exception(struct pt_regs *regs); - static void pseries_shared_idle(void); static void pseries_dedicated_idle(void); Index: linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.c =================================================================== --- linux-2.6.15-rc.orig/arch/powerpc/platforms/pseries/ras.c +++ linux-2.6.15-rc/arch/powerpc/platforms/pseries/ras.c @@ -51,6 +51,8 @@ #include #include +#include "ras.h" + static unsigned char ras_log_buf[RTAS_ERROR_LOG_MAX]; static DEFINE_SPINLOCK(ras_log_buf_lock); @@ -278,7 +280,7 @@ static void fwnmi_release_errinfo(void) printk("FWNMI: nmi-interlock failed: %d\n", ret); } -void pSeries_system_reset_exception(struct pt_regs *regs) +int pSeries_system_reset_exception(struct pt_regs *regs) { if (fwnmi_active) { struct rtas_error_log *errhdr = fwnmi_get_errinfo(regs); @@ -287,6 +289,7 @@ void pSeries_system_reset_exception(stru } fwnmi_release_errinfo(); } + return 0; /* need to perform reset */ } /* Index: linux-2.6.15-rc/include/asm-powerpc/machdep.h =================================================================== --- linux-2.6.15-rc.orig/include/asm-powerpc/machdep.h +++ linux-2.6.15-rc/include/asm-powerpc/machdep.h @@ -134,7 +134,7 @@ struct machdep_calls { void (*nvram_sync)(void); /* Exception handlers */ - void (*system_reset_exception)(struct pt_regs *regs); + int (*system_reset_exception)(struct pt_regs *regs); int (*machine_check_exception)(struct pt_regs *regs); /* Motherboard/chipset features. 
This is a kind of general purpose From galak at kernel.crashing.org Wed Dec 21 01:00:41 2005 From: galak at kernel.crashing.org (Kumar Gala) Date: Tue, 20 Dec 2005 08:00:41 -0600 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <20051219235455.GB29993@localhost.localdomain> References: <20051219054410.GB13285@localhost.localdomain> <20051219235455.GB29993@localhost.localdomain> Message-ID: <5648700F-9B2F-49F0-ADA8-278328C4CF17@kernel.crashing.org> On Dec 19, 2005, at 5:54 PM, David Gibson wrote: > On Mon, Dec 19, 2005 at 09:30:37AM -0600, Kumar Gala wrote: >> >> On Dec 18, 2005, at 11:44 PM, David Gibson wrote: >> >>> Paulus et al, I think the patch below is roughly the right way to >>> go, >>> but it needs much more review and testing (at present it's been >>> cursorily tested on ppc64 pSeries only). >>> >>> This patch merges the cache flushing code for 32 and 64 bit powerpc >>> machines. This means the ppc64_caches mechanism for determining >>> correct cache sizes at runtime is ported to 32-bit, and is thus >>> renamed as 'powerpc_caches'. The merged cache flushing functions go >>> in new file arch/powerpc/kernel/cache.S. >> >> Why dont we just use the cache line information in the cputable? Why >> the introduction of this new powerpc_caches structure? > > Because the device tree can override the information from the > cputable. Oh, and the structure is only new for ppc32. If the device tree overrides the cputable should we not believe it? I guess I dont understand why we need the same information in multiple places? 
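[Editorial sketch, not part of the original message.] The precedence being debated here (a per-CPU-type default that firmware may override) can be written down in a few lines. This is only an illustration under stated assumptions: `toy_cpu_spec`, its `dcache_bsize` field and `toy_dcache_line_size()` are hypothetical stand-ins, not the kernel's actual cputable or device-tree interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU-type default, standing in for a cputable entry. */
struct toy_cpu_spec {
	unsigned int dcache_bsize;	/* default data-cache line size in bytes */
};

/*
 * Pick the data-cache line size: prefer the firmware-supplied value when a
 * property was found (non-NULL and non-zero), otherwise fall back to the
 * table default.
 */
static unsigned int toy_dcache_line_size(const struct toy_cpu_spec *spec,
					 const unsigned int *dt_prop)
{
	if (dt_prop != NULL && *dt_prop != 0)
		return *dt_prop;
	return spec->dcache_bsize;
}
```

With this shape the table value is only a fallback, which is one possible reading of "the device tree can override the information from the cputable"; whether the resolved value then needs its own structure is exactly the open question in the thread.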
- kumar From pnasrat at redhat.com Wed Dec 21 03:08:05 2005 From: pnasrat at redhat.com (Paul Nasrat) Date: Tue, 20 Dec 2005 16:08:05 +0000 Subject: yaboot mailing lists - new home Message-ID: <1135094885.8082.7.camel@enki.eridu> Following the hardware failure on penguinppc.org I've managed to arrange hosting thanks to the kind administrators at ozlabs https://ozlabs.org/mailman/listinfo/yaboot-devel https://ozlabs.org/mailman/listinfo/yaboot-users I haven't pushed any content there yet, but I think we were aiming at a test release shortly so having the lists back will help. Paul From arndb at de.ibm.com Wed Dec 21 01:50:42 2005 From: arndb at de.ibm.com (Arnd Bergmann) Date: Tue, 20 Dec 2005 15:50:42 +0100 Subject: [RFC PATCH 3/3] add hvc backend for rtas In-Reply-To: <1c2320460fadceafaf909562accc17a4@bga.com> References: <20051217001031.456315000@localhost> <1134781070.6102.23.camel@gaston> <1c2320460fadceafaf909562accc17a4@bga.com> Message-ID: <200512201550.45324.arndb@de.ibm.com> On Sünnavend 17 Dezember 2005 05:44, Milton Miller wrote: > While the cookie could be zero, defining it non-zero per driver means > that if two console drivers both register with the core, we will > guarantee that /dev/console output is sent to the same driver as the > kernel printks. The other driver will get assigned a minor number > above the last minor requested during the first scan. How about using separate number spaces for each type of console backend? AFAICS, this little change should do. I'll send it as part of the updated set of console patches. Arnd <>< --- linux-2.6.15-rc.orig/drivers/char/hvc_console.c +++ linux-2.6.15-rc/drivers/char/hvc_console.c @@ -766,7 +766,8 @@ struct hvc_struct __devinit *hvc_alloc(u * see if this vterm id matches one registered for console. 
*/ for (i=0; i < MAX_NR_HVC_CONSOLES; i++) - if (vtermnos[i] == hp->vtermno) + if (vtermnos[i] == hp->vtermno && + cons_ops[i] == hp->ops) break; /* no matching slot, just use a counter */ From galak at gate.crashing.org Wed Dec 21 04:26:08 2005 From: galak at gate.crashing.org (Kumar Gala) Date: Tue, 20 Dec 2005 11:26:08 -0600 (CST) Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: <200512201118.05404.arnd@arndb.de> Message-ID: On Tue, 20 Dec 2005, Arnd Bergmann wrote: > On Maandag 19 Dezember 2005 21:49, Kumar Gala wrote: > > I'm still in favor of just leaving these devices as straight platform > > devices. Unless there is something that is bus specific that each device > > on the bus conforms to I don't see any reason to create a new bus type. > > How do platform devices work with module autoloading? What I'm interested > in is to have stuff like the Fedora installer or kernels with modular > drivers 'just work' because they can use the same way to load their > modules that is already used for PCI devices. > > AFAICS, that requires at least two things: > - The device needs to be created when the bus is probed, i.e. > of_device_register cannot be called from inside the driver > module_init() function. > This is already handled by the platform device in the kernel. > - The bus type needs to create a modalias file so user space can > do the matching with the OF device table in the modules. Seems like a simple thing to add to platform device. > Both of these should be a lot easier to implement with a special > bus type that creates entries in sysfs for a subset of the OF > device tree. I still don't see what a new bus type gets us. I'm going to have to have specific code to parse and build and register my devices. If that code ends up registering a platform device or a newflatOF device I don't see any real difference. 
> The alternative would be to represent all of the device tree > in /sys/devices, but IMHO that should better be part of > /sys/firmware with symlinks to the linux internal device tree > representation. Today I have: /sys/devices/platform/ fsl-gianfar.1 fsl-i2c.1 fsl-i2c.2 fsl-sec2.1 fsl-usb2-dr.1 fsl-usb2-mph.1 serial8250 serial8250.0 - kumar From mgreer at mvista.com Wed Dec 21 05:27:49 2005 From: mgreer at mvista.com (Mark A. Greer) Date: Tue, 20 Dec 2005 11:27:49 -0700 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <20051220010617.GC29993@localhost.localdomain> References: <20051220010617.GC29993@localhost.localdomain> Message-ID: <20051220182749.GD5647@mag.az.mvista.com> On Tue, Dec 20, 2005 at 12:06:17PM +1100, David Gibson wrote: > Previously, the ppc32 version of flush_dcache_range() did a writeback > and invalidate of the given cache lines (dcbf) whereas the ppc64 > version did just a writeback (dcbst). In general, there's no > consistent meaning of "flush" as one or the other, so this patch also > renames the dcache flushing functions less ambiguously. The new names > are: > > wback_dcache_range() - previously flush_dcache_range() on > ppc64 and clean_dcache_range() on ppc32 > > wback_inval_dcache_range() - previously > flush_inval_dcache_range() on ppc64 and flush_dcache_range on ppc32 I agree about the inconsistent meaning of 'flush' but I find 'wback' distracting b/c it also refers to a type of cache/cache mode. It makes me think that there's another set of routines for writethru or something like that. I realize the caches are in writeback mode but the point is that it sends my brain down a different path than what is really meant. Could we just define 'flush' to mean "push the cached data/instns back into memory but not invalidate" and still call them 'flush'? Or use 'push' or something else that doesn't also refer to a cache mode? Maybe it's just me... 
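[Editorial sketch, not part of the original message.] Whatever names are chosen, the three operations under discussion differ only in what happens to a dirty line and whether the line stays valid. The toy model below tries to make that concrete with one cache line in front of one word of memory. All `toy_*` names are hypothetical; the mapping to dcbst, dcbf and invalidate-only in the comments follows the description quoted above, and nothing here is real kernel or hardware code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-line "cache" in front of a single word of "memory". */
struct toy_line {
	bool valid;	/* line holds a copy of memory */
	bool dirty;	/* cached copy is newer than memory */
	int  data;	/* cached value */
};

static int toy_memory;

/* CPU store: allocate the line and mark it dirty (writeback-mode cache). */
static void toy_store(struct toy_line *l, int v)
{
	l->data = v;
	l->valid = true;
	l->dirty = true;
}

/* Writeback only (dcbst-like): push dirty data out, keep the line valid. */
static void toy_wback(struct toy_line *l)
{
	if (l->valid && l->dirty) {
		toy_memory = l->data;
		l->dirty = false;
	}
}

/* Writeback and invalidate (dcbf-like): push dirty data out, drop the line. */
static void toy_wback_inval(struct toy_line *l)
{
	toy_wback(l);
	l->valid = false;
}

/* Invalidate only: drop the line without writing; dirty data is lost. */
static void toy_inval(struct toy_line *l)
{
	l->valid = false;
	l->dirty = false;
}
```

In this model "flush" in the ppc64 sense corresponds to `toy_wback()` and in the ppc32 sense to `toy_wback_inval()`, which is the ambiguity the proposed renaming is meant to remove.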
Mark From arnd at arndb.de Wed Dec 21 05:58:36 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 20 Dec 2005 19:58:36 +0100 Subject: RFC: Rev 0.5 Booting the Linux/ppc kernel without Open Firmware In-Reply-To: References: Message-ID: <200512201958.37850.arnd@arndb.de> On Tuesday 20 December 2005 18:26, Kumar Gala wrote: > > > AFAICS, that requires at least two things: > > - The device needs to be created when the bus is probed, i.e. > > of_device_register can not be called from inside the driver > > module_init() function. > > > > This is already handled by the platform device in the kernel. OK, I was a bit misinformed about how platform devices currently work; for some reason I thought that there was only a single entry point for register_device and register_driver, which is untrue. Sorry for the confusion here. > I still don't see what a new bus type gets us. I'm going to have to have > specific code to parse and build and register my devices. If that code > ends up registering a platform device or a newflatOF device I don't see any > real difference. After looking into it a bit more, I found that we already have two different types of platform devices here: the standard struct platform_device and the of_platform_device that is currently used only on powermac and (maybe surprisingly) is not based on a platform_device at all but on of_device. Using the basic platform_device has the big disadvantage that you can't access the device_node information (unless we implement the device::firmware_data infrastructure for powerpc, which looks increasingly appealing). > > The alternative would be to represent all of the device tree > > in /sys/devices, but IMHO that should better be part of > > /sys/firmware with symlinks to the linux internal device tree > > representation.
> > Today I have: > /sys/devices/platform/ > fsl-gianfar.1 > fsl-i2c.1 > fsl-i2c.2 > fsl-sec2.1 > fsl-usb2-dr.1 > fsl-usb2-mph.1 > serial8250 > serial8250.0 The problem I see with these is that currently they are all created by platform specific code or driver specific code that knows exactly that these devices exist on the hardware. When moving to the device tree, you need some criteria to decide which devices to add and which not. One way to do this would be to only add devices that are a direct child of an SOC bus. If you would simply add every single device_node as a platform_device, you end up with a huge number of entries in /sys/devices/platform/ that are either unused completely or already represented elsewhere like /sys/devices/pci*/*. One idea I just had (forgive me if that is bullshit, it's getting late here) is to really do the /sys/firmware stuff by embedding a kobject in every device_node. The existing bus types that already know about device_node (pci, vio, macio, of_platform, ...) get converted to use that (e.g. pci stores the device_node in pci_dev:dev:firmware_data instead of pci_dev:sysdata) and create the appropriate symlink in sysfs. When an of_platform_driver registers, we can still give it all devices because we can create the of_platform_device on the fly by doing the match on the device_node first. Arnd <>< From olof at lixom.net Wed Dec 21 08:45:25 2005 From: olof at lixom.net (Olof Johansson) Date: Tue, 20 Dec 2005 15:45:25 -0600 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <20051220204530.GA26351@suse.de> References: <20051220204530.GA26351@suse.de> Message-ID: <20051220214525.GB7428@pb15.lixom.net> On Tue, Dec 20, 2005 at 09:45:30PM +0100, Olaf Hering wrote: > The connection of ttyS0 to /dev/console doesnt seem to work anymore mit > 2.6.15-rc5+6 on a POWER4 p630 in fullsystempartition mode, no HMC > connected. It works with 2.6.14.4. > I tested 2.6.15-rc6 arch/powerpc/configs/ppc64_defconfig. 
It seems to have been broken a while: According to test.kernel.org (last machine in the matrix is an SMP mode p650), it broke between 2.6.14-git2 and 2.6.14-git3. Console output can be found in: http://test.kernel.org/15622/debug/console.log for the failed one http://test.kernel.org/15530/debug/console.log for the successful one -Olof From olh at suse.de Wed Dec 21 09:09:32 2005 From: olh at suse.de (Olaf Hering) Date: Tue, 20 Dec 2005 23:09:32 +0100 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <20051220214525.GB7428@pb15.lixom.net> References: <20051220204530.GA26351@suse.de> <20051220214525.GB7428@pb15.lixom.net> Message-ID: <20051220220932.GA29092@suse.de> On Tue, Dec 20, Olof Johansson wrote: > On Tue, Dec 20, 2005 at 09:45:30PM +0100, Olaf Hering wrote: > > The connection of ttyS0 to /dev/console doesnt seem to work anymore mit > > 2.6.15-rc5+6 on a POWER4 p630 in fullsystempartition mode, no HMC > > connected. It works with 2.6.14.4. > > I tested 2.6.15-rc6 arch/powerpc/configs/ppc64_defconfig. > > It seems to have been broken a while: According to test.kernel.org (last > machine in the matrix is an SMP mode p650), it broke between 2.6.14-git2 > and 2.6.14-git3. Console output can be found in: > I remember someone mentioned that a 43p 150 did not boot if the keyboard is connected. Will try that tomorrow. The git2-3 diff is huge, so maybe this hint helps. -- short story of a lazy sysadmin: alias appserv=wotan From galak at gate.crashing.org Wed Dec 21 09:16:26 2005 From: galak at gate.crashing.org (Kumar Gala) Date: Tue, 20 Dec 2005 16:16:26 -0600 (CST) Subject: [PATCH] powerpc: Loosen udbg_probe_uart_speed sanity checking Message-ID: The checking of the baudrate in udbg_probe_uart_speed was too tight and would cause reporting back of the default baud rate in cases where the computed speed was valid. 
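[Editorial sketch, not part of the original message.] The speed calculation and relaxed sanity check in the patch that follows can be modelled in user space. toy_probe_speed() and TOY_DEFAULT_SPEED are hypothetical names, and the zero-divisor guard is an assumption added only so the model cannot divide by zero; it is not in the patch:

```c
#include <assert.h>

#define TOY_DEFAULT_SPEED 9600u	/* fallback value used by the patch */

/* Hypothetical user-space model of the baud-rate computation. */
static unsigned int toy_probe_speed(unsigned int clock,
				    unsigned int prescaler,
				    unsigned int divisor)
{
	unsigned int speed;

	/* Guard added for this model only (assumption, not in the patch). */
	if (prescaler == 0 || divisor == 0)
		return TOY_DEFAULT_SPEED;

	speed = (clock / prescaler) / (divisor * 16);

	/* Old check: speed < 9600 || speed > 115200.
	 * Relaxed check: only reject values above clock / 16.
	 * speed is unsigned, so a "speed < 0" test could never be true
	 * and is left out of the model. */
	if (speed > clock / 16)
		speed = TOY_DEFAULT_SPEED;

	return speed;
}
```

With a 14745600 Hz clock and a divisor of 4 the model returns 230400, which the old 9600..115200 window would have replaced with the default. Note that, for nonzero prescaler and divisor and no overflow, the computed speed can never exceed clock / 16, so in this model the relaxed upper bound does not reject anything.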
Signed-off-by: Kumar Gala --- commit 8bec94b3bb35c0273fbab5a6aa477ec71a4d1fab tree c2dde72aa75a3d807a918556e97b21723c10267b parent a86f866f7b31e01c729ee7498228c547a51d8514 author Kumar Gala Tue, 20 Dec 2005 16:17:13 -0600 committer Kumar Gala Tue, 20 Dec 2005 16:17:13 -0600 arch/powerpc/kernel/udbg_16550.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kernel/udbg_16550.c b/arch/powerpc/kernel/udbg_16550.c index e58c048..7541bf4 100644 --- a/arch/powerpc/kernel/udbg_16550.c +++ b/arch/powerpc/kernel/udbg_16550.c @@ -137,7 +137,7 @@ unsigned int udbg_probe_uart_speed(void speed = (clock / prescaler) / (divisor * 16); /* sanity check */ - if (speed < 9600 || speed > 115200) + if (speed < 0 || speed > (clock / 16)) speed = 9600; return speed; From galak at gate.crashing.org Wed Dec 21 09:16:52 2005 From: galak at gate.crashing.org (Kumar Gala) Date: Tue, 20 Dec 2005 16:16:52 -0600 (CST) Subject: [PATCH] powerpc: Add the ability to handle SOC ports in legacy_serial Message-ID: Add the ability to configure and initialize legacy 8250 serial ports on an SOC bus. Also fix an issue where we would not configure any serial ports if "linux,stdout-path" was not found.
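[Editorial sketch, not part of the original message.] The second fix described above, continuing the port scan when "linux,stdout-path" is absent, can be sketched as a small user-space model. struct toy_node, toy_find_ports() and the paths used with it are hypothetical stand-ins, not the kernel's device-tree API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device-tree serial node. */
struct toy_node {
	const char *path;
};

/* Registers every node and returns how many were registered.
 * *console receives the index of the node matching stdout_path,
 * or -1 when there is no stdout path or no node matches it. */
static int toy_find_ports(const struct toy_node *nodes, int n,
			  const char *stdout_path, int *console)
{
	int i, count = 0;

	*console = -1;

	/* A missing stdout path no longer aborts the scan; it only
	 * means that no port can be selected as the console. */
	for (i = 0; i < n; i++) {
		if (stdout_path != NULL &&
		    strcmp(nodes[i].path, stdout_path) == 0)
			*console = count;
		count++;
	}

	return count;
}
```

The model mirrors the patch's split of responsibilities: ports are always discovered, and the "no console found" case is reported separately by the later check on legacy_serial_console.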
Signed-off-by: Kumar Gala --- commit adb69b3e888de037f4e58cc505443e624d498596 tree 7be9bb626b8d2c773dd88a1d3e17a5f56995075f parent 8bec94b3bb35c0273fbab5a6aa477ec71a4d1fab author Kumar Gala Tue, 20 Dec 2005 16:20:09 -0600 committer Kumar Gala Tue, 20 Dec 2005 16:20:09 -0600 arch/powerpc/kernel/legacy_serial.c | 62 +++++++++++++++++++++++++++++------ 1 files changed, 52 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c index d179ec5..59164ba 100644 --- a/arch/powerpc/kernel/legacy_serial.c +++ b/arch/powerpc/kernel/legacy_serial.c @@ -36,7 +36,8 @@ static int legacy_serial_console = -1; static int __init add_legacy_port(struct device_node *np, int want_index, int iotype, phys_addr_t base, - phys_addr_t taddr, unsigned long irq) + phys_addr_t taddr, unsigned long irq, + unsigned int flags) { u32 *clk, *spd, clock = BASE_BAUD * 16; int index; @@ -90,7 +91,7 @@ static int __init add_legacy_port(struct legacy_serial_ports[index].iotype = iotype; legacy_serial_ports[index].uartclk = clock; legacy_serial_ports[index].irq = irq; - legacy_serial_ports[index].flags = ASYNC_BOOT_AUTOCONF; + legacy_serial_ports[index].flags = flags; legacy_serial_infos[index].taddr = taddr; legacy_serial_infos[index].np = of_node_get(np); legacy_serial_infos[index].clock = clock; @@ -107,6 +108,32 @@ static int __init add_legacy_port(struct return index; } +static int __init add_legacy_soc_port(struct device_node *np, + struct device_node *soc_dev) +{ + phys_addr_t addr; + u32 *addrp; + unsigned int flags = UPF_BOOT_AUTOCONF | UPF_SKIP_TEST | UPF_SHARE_IRQ; + + /* We only support ports that have a clock frequency properly + * encoded in the device-tree. 
+ */ + if (get_property(np, "clock-frequency", NULL) == NULL) + return -1; + + /* Get the address */ + addrp = of_get_address(soc_dev, 0, NULL, NULL); + if (addrp == NULL) + return -1; + + addr = of_translate_address(soc_dev, addrp); + + /* Add port, irq will be dealt with later. We passed a translated + * IO port value. It will be fixed up later along with the irq + */ + return add_legacy_port(np, -1, UPIO_MEM, addr, addr, NO_IRQ, flags); +} + static int __init add_legacy_isa_port(struct device_node *np, struct device_node *isa_bridge) { @@ -137,7 +164,7 @@ static int __init add_legacy_isa_port(st taddr = of_translate_address(np, reg); /* Add port, irq will be dealt with later */ - return add_legacy_port(np, index, UPIO_PORT, reg[1], taddr, NO_IRQ); + return add_legacy_port(np, index, UPIO_PORT, reg[1], taddr, NO_IRQ, UPF_BOOT_AUTOCONF); } @@ -204,7 +231,7 @@ static int __init add_legacy_pci_port(st /* Add port, irq will be dealt with later. We passed a translated * IO port value. It will be fixed up later along with the irq */ - return add_legacy_port(np, index, iotype, base, addr, NO_IRQ); + return add_legacy_port(np, index, iotype, base, addr, NO_IRQ, UPF_BOOT_AUTOCONF); } /* @@ -218,7 +245,7 @@ static int __init add_legacy_pci_port(st */ void __init find_legacy_serial_ports(void) { - struct device_node *np, *stdout; + struct device_node *np, *stdout = NULL; char *path; int index; @@ -226,13 +253,23 @@ void __init find_legacy_serial_ports(voi /* Now find out if one of these is out firmware console */ path = (char *)get_property(of_chosen, "linux,stdout-path", NULL); - if (path == NULL) { + if (path != NULL) { + stdout = of_find_node_by_path(path); + if (stdout) + DBG("stdout is %s\n", stdout->full_name); + } else { DBG(" no linux,stdout-path !\n"); - return; } - stdout = of_find_node_by_path(path); - if (stdout) { - DBG("stdout is %s\n", stdout->full_name); + + /* First fill our array with SOC ports */ + for (np = NULL; (np = of_find_compatible_node(np, 
"serial", "ns16550")) != NULL;) { + struct device_node *soc = of_get_parent(np); + if (soc && !strcmp(soc->type, "soc")) { + index = add_legacy_soc_port(np, np); + if (index >= 0 && np == stdout) + legacy_serial_console = index; + } + of_node_put(soc); } /* First fill our array with ISA ports */ @@ -437,6 +474,11 @@ static int __init check_legacy_serial_co DBG(" of_chosen is NULL !\n"); return -ENODEV; } + + if (legacy_serial_console < 0) { + DBG(" legacy_serial_console not found !\n"); + return -ENODEV; + } /* We are getting a weird phandle from OF ... */ /* ... So use the full path instead */ name = (char *)get_property(of_chosen, "linux,stdout-path", NULL); From galak at gate.crashing.org Wed Dec 21 09:37:07 2005 From: galak at gate.crashing.org (Kumar Gala) Date: Tue, 20 Dec 2005 16:37:07 -0600 (CST) Subject: [PATCH] powerpc: added a udbg_progress Message-ID: Added a common udbg_progress for use by ppc_md.progress() Signed-off-by: Kumar Gala --- commit 7b69a37a4e811e72e82896ab38e6ae04542455f5 tree 923809875e727ab5d0dd0579e45da9201dd86653 parent adb69b3e888de037f4e58cc505443e624d498596 author Kumar Gala Tue, 20 Dec 2005 16:40:09 -0600 committer Kumar Gala Tue, 20 Dec 2005 16:40:09 -0600 arch/powerpc/kernel/udbg.c | 6 ++++++ arch/powerpc/platforms/powermac/setup.c | 8 +------- include/asm-powerpc/udbg.h | 1 + 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kernel/udbg.c b/arch/powerpc/kernel/udbg.c index 9567d94..558c1ce 100644 --- a/arch/powerpc/kernel/udbg.c +++ b/arch/powerpc/kernel/udbg.c @@ -90,6 +90,12 @@ void udbg_printf(const char *fmt, ...) 
va_end(args); } +void __init udbg_progress(char *s, unsigned short hex) +{ + udbg_puts(s); + udbg_puts("\n"); +} + /* * Early boot console based on udbg */ diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c index 1daa5a0..e5a5bdb 100644 --- a/arch/powerpc/platforms/powermac/setup.c +++ b/arch/powerpc/platforms/powermac/setup.c @@ -639,12 +639,6 @@ static void __init pmac_init_early(void) #endif } -static void __init pmac_progress(char *s, unsigned short hex) -{ - udbg_puts(s); - udbg_puts("\n"); -} - /* * pmac has no legacy IO, anything calling this function has to * fail or bad things will happen @@ -763,7 +757,7 @@ struct machdep_calls __initdata pmac_md .calibrate_decr = pmac_calibrate_decr, .feature_call = pmac_do_feature_call, .check_legacy_ioport = pmac_check_legacy_ioport, - .progress = pmac_progress, + .progress = udbg_progress, #ifdef CONFIG_PPC64 .pci_probe_mode = pmac_pci_probe_mode, .idle_loop = native_idle, diff --git a/include/asm-powerpc/udbg.h b/include/asm-powerpc/udbg.h index 58cdc88..a3390b0 100644 --- a/include/asm-powerpc/udbg.h +++ b/include/asm-powerpc/udbg.h @@ -23,6 +23,7 @@ extern int udbg_read(char *buf, int bufl extern void register_early_udbg_console(void); extern void udbg_printf(const char *fmt, ...); +extern void udbg_progress(char *s, unsigned short hex); extern void udbg_init_uart(void __iomem *comport, unsigned int speed, unsigned int clock); From david at gibson.dropbear.id.au Wed Dec 21 10:21:54 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Wed, 21 Dec 2005 10:21:54 +1100 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <20051220182749.GD5647@mag.az.mvista.com> References: <20051220010617.GC29993@localhost.localdomain> <20051220182749.GD5647@mag.az.mvista.com> Message-ID: <20051220232154.GB19042@localhost.localdomain> On Tue, Dec 20, 2005 at 11:27:49AM -0700, Mark A. 
Greer wrote: > On Tue, Dec 20, 2005 at 12:06:17PM +1100, David Gibson wrote: > > > Previously, the ppc32 version of flush_dcache_range() did a writeback > > and invalidate of the given cache lines (dcbf) whereas the ppc64 > > version did just a writeback (dcbst). In general, there's no > > consistent meaning of "flush" as one or the other, so this patch also > > renames the dcache flushing functions less ambiguously. The new names > > are: > > > > wback_dcache_range() - previously flush_dcache_range() on > > ppc64 and clean_dcache_range() on ppc32 > > > > wback_inval_dcache_range() - previously > > flush_inval_dcache_range() on ppc64 and flush_dcache_range on ppc32 > > I agree about the inconsistent meaning of 'flush' but I find 'wback' > distracting b/c it also refers to a type of cache/cache mode. > It makes me think that there's another set of routines for writethru or > something like that. I realize the caches are in writeback mode but the > point is that it sends my brain down a different path than what is really > meant. I see what you mean, but I can't think of a better term. We could use "clean", but I don't like that very much either. > Could we just define 'flush' to mean "push the cached data/instns back > into memory but not invalidate" and still call them 'flush'? Or use > 'push' or something else that doesn't also refer to a cache mode? I don't think there's any way to make such a definition obvious enough that someone won't make the same mistake at some later point. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From mgreer at mvista.com Wed Dec 21 10:31:36 2005 From: mgreer at mvista.com (Mark A.
Greer) Date: Tue, 20 Dec 2005 16:31:36 -0700 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <20051220232154.GB19042@localhost.localdomain> References: <20051220010617.GC29993@localhost.localdomain> <20051220182749.GD5647@mag.az.mvista.com> <20051220232154.GB19042@localhost.localdomain> Message-ID: <20051220233136.GA8686@mag.az.mvista.com> On Wed, Dec 21, 2005 at 10:21:54AM +1100, David Gibson wrote: > On Tue, Dec 20, 2005 at 11:27:49AM -0700, Mark A. Greer wrote: > > On Tue, Dec 20, 2005 at 12:06:17PM +1100, David Gibson wrote: > > > > > Previously, the ppc32 version of flush_dcache_range() did a writeback > > > and invalidate of the given cache lines (dcbf) whereas the ppc64 > > > version did just a writeback (dcbst). In general, there's no > > > consistent meaning of "flush" as one or the other, so this patch also > > > renames the dcache flushing functions less ambiguously. The new names > > > are: > > > > > > wback_dcache_range() - previously flush_dcache_range() on > > > ppc64 and clean_dcache_range() on ppc32 > > > > > > wback_inval_dcache_range() - previously > > > flush_inval_dcache_range() on ppc64 and flush_dcache_range on ppc32 > > > > I agree about the inconsistent meaning of 'flush' but I find 'wback' > > distracting b/c it also refers to a type of cache/cache mode. > > It makes me think that there's another set of routines for writethru or > > something like that. I realize the caches are in writeback mode but the > > point is that it sends my brain down a different path than what is really > > meant. > > I see what you mean, but I can't think of a better term. Me either. :( > We could use > "clean", but I don't like that very much either. I can't either, really. The only ones that come to mind are "push" or "write"...write_inval_dcache_range()? Not sure if that's better or worse. > > Could we just define 'flush' to mean "push the cached data/instns back > > into memory but not invalidate" and still call them 'flush'? 
Or use > > 'push' or something else that doesn't also refer to a cache mode? > > I don't think there's any way to make such a definition obvious enough > that someone won't make the same mistake at some later point. I agree. It's not a big deal. I figured I'd say something before it was too late and maybe someone would come up w/ a better term. Mark From galak at gate.crashing.org Wed Dec 21 13:45:27 2005 From: galak at gate.crashing.org (Kumar Gala) Date: Tue, 20 Dec 2005 20:45:27 -0600 (CST) Subject: [PATCH] powerpc: Call find_legacy_serial_ports() if we enable CONFIG_SERIAL_8250 Message-ID: In setup_arch and setup_system call find_legacy_serial_ports() if we build in support for 8250 serial ports instead of basing it on PPC_MULTIPLATFORM. Signed-off-by: Kumar Gala --- commit 399e10aa1c6a3ca64e8e8e8cea7289e9908cc62c tree 3e5320b8a97a37c4d010186aabf8a95e5be264b4 parent 7b69a37a4e811e72e82896ab38e6ae04542455f5 author Kumar Gala Tue, 20 Dec 2005 20:48:36 -0600 committer Kumar Gala Tue, 20 Dec 2005 20:48:36 -0600 arch/powerpc/kernel/setup_32.c | 2 +- arch/powerpc/kernel/setup_64.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c index 79d434f..e5d285a 100644 --- a/arch/powerpc/kernel/setup_32.c +++ b/arch/powerpc/kernel/setup_32.c @@ -299,7 +299,7 @@ void __init setup_arch(char **cmdline_p) if (ppc_md.init_early) ppc_md.init_early(); -#ifdef CONFIG_PPC_MULTIPLATFORM +#ifdef CONFIG_SERIAL_8250 find_legacy_serial_ports(); #endif finish_device_tree(); diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 419e0b9..98e9f05 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -472,7 +472,7 @@ void __init setup_system(void) * hash table management for us, thus ioremap works.
We do that early * so that further code can be debugged */ -#ifdef CONFIG_PPC_MULTIPLATFORM +#ifdef CONFIG_SERIAL_8250 find_legacy_serial_ports(); #endif From sfr at canb.auug.org.au Wed Dec 21 16:13:30 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Wed, 21 Dec 2005 16:13:30 +1100 Subject: [PATCH] powerpc: Call find_legacy_serial_ports() if we enable CONFIG_SERIAL_8250 In-Reply-To: References: Message-ID: <20051221161330.1a23384b.sfr@canb.auug.org.au> On Tue, 20 Dec 2005 20:45:27 -0600 (CST) Kumar Gala wrote: > > In setup_arch and setup_system call find_legacy_serial_ports() if we > build in support for 8250 serial ports instead of basing it on PPC_MULTIPLATFORM. > > Signed-off-by: Kumar Gala > > --- > commit 399e10aa1c6a3ca64e8e8e8cea7289e9908cc62c > tree 3e5320b8a97a37c4d010186aabf8a95e5be264b4 > parent 7b69a37a4e811e72e82896ab38e6ae04542455f5 > author Kumar Gala Tue, 20 Dec 2005 20:48:36 -0600 > committer Kumar Gala Tue, 20 Dec 2005 20:48:36 -0600 > > arch/powerpc/kernel/setup_32.c | 2 +- > arch/powerpc/kernel/setup_64.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) You should probably change the Makefile as well to build legacy_serial.c in the CONFIG_SERIAL_8250 case. -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/
From olh at suse.de Wed Dec 21 21:41:38 2005 From: olh at suse.de (Olaf Hering) Date: Wed, 21 Dec 2005 11:41:38 +0100 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <20051220220932.GA29092@suse.de> References: <20051220204530.GA26351@suse.de> <20051220214525.GB7428@pb15.lixom.net> <20051220220932.GA29092@suse.de> Message-ID: <20051221104138.GA17580@suse.de> On Tue, Dec 20, Olaf Hering wrote: > I remember someone mentioned that a 43p 150 did not boot if the keyboard > is connected. Will try that tomorrow. The git2-3 diff is huge, so maybe > this hint helps. Yes, removing the keyboard helps. -- short story of a lazy sysadmin: alias appserv=wotan From olh at suse.de Thu Dec 22 01:23:41 2005 From: olh at suse.de (Olaf Hering) Date: Wed, 21 Dec 2005 15:23:41 +0100 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <20051220214525.GB7428@pb15.lixom.net> References: <20051220204530.GA26351@suse.de> <20051220214525.GB7428@pb15.lixom.net> Message-ID: <20051221142341.GA23639@suse.de> On Tue, Dec 20, Olof Johansson wrote: > On Tue, Dec 20, 2005 at 09:45:30PM +0100, Olaf Hering wrote: > > The connection of ttyS0 to /dev/console doesnt seem to work anymore mit > > 2.6.15-rc5+6 on a POWER4 p630 in fullsystempartition mode, no HMC > > connected. It works with 2.6.14.4. > > I tested 2.6.15-rc6 arch/powerpc/configs/ppc64_defconfig. > > It seems to have been broken a while: According to test.kernel.org (last > machine in the matrix is an SMP mode p650), it broke between 2.6.14-git2 > and 2.6.14-git3. Console output can be found in: my bisect result doesnt make much sense.
The bad one is 12a39407f021fd17d5f9d33d78bddb005bd106fb good ones are .git/refs/bisect/good-0b360adbdb54d5b98b78d57ba0916bc4b8871968 .git/refs/bisect/good-1480d0a31db62b9803f829cc0e5cc71935ffe3cc .git/refs/bisect/good-9f75e1eff3edb2bb07349b94c28f4f2a6c66ca43 .git/refs/bisect/good-ecc81e0f719f566b75b222b8aef64c8b809b2e29 If I only knew how to export the tree state at these points... -- short story of a lazy sysadmin: alias appserv=wotan From galak at kernel.crashing.org Thu Dec 22 02:31:19 2005 From: galak at kernel.crashing.org (Kumar Gala) Date: Wed, 21 Dec 2005 09:31:19 -0600 Subject: [PATCH] powerpc: Call find_legacy_serial_ports() if we enable CONFIG_SERIAL_8250 In-Reply-To: <20051221161330.1a23384b.sfr@canb.auug.org.au> References: <20051221161330.1a23384b.sfr@canb.auug.org.au> Message-ID: <0D8F6A7E-0DA9-40ED-A55E-364921AE64E5@kernel.crashing.org> On Dec 20, 2005, at 11:13 PM, Stephen Rothwell wrote: > On Tue, 20 Dec 2005 20:45:27 -0600 (CST) Kumar Gala > wrote: >> >> In setup_arch and setup_system call find_legacy_serial_ports() if we >> build in support for 8250 serial ports instead of basing it on >> PPC_MULTIPLATFORM. >> >> Signed-off-by: Kumar Gala >> >> --- >> commit 399e10aa1c6a3ca64e8e8e8cea7289e9908cc62c >> tree 3e5320b8a97a37c4d010186aabf8a95e5be264b4 >> parent 7b69a37a4e811e72e82896ab38e6ae04542455f5 >> author Kumar Gala Tue, 20 Dec 2005 >> 20:48:36 -0600 >> committer Kumar Gala Tue, 20 Dec 2005 >> 20:48:36 -0600 >> >> arch/powerpc/kernel/setup_32.c | 2 +- >> arch/powerpc/kernel/setup_64.c | 2 +- >> 2 files changed, 2 insertions(+), 2 deletions(-) > > You should probably change the Makefile as well to build > legacy_serial.c > in the CONFIG_SERIAL_8250 case. Good call, was meaning to update the Makefile as well. I'll send an updated patch with the Makefile change. 
- kumar From galak at gate.crashing.org Thu Dec 22 02:27:13 2005 From: galak at gate.crashing.org (Kumar Gala) Date: Wed, 21 Dec 2005 09:27:13 -0600 (CST) Subject: [PATCH][UPDATE] powerpc: Call find_legacy_serial_ports() if we enable CONFIG_SERIAL_8250 Message-ID: In setup_arch and setup_system call find_legacy_serial_ports() if we build in support for 8250 serial ports instead of basing it on PPC_MULTIPLATFORM. Signed-off-by: Kumar Gala --- commit 27b4ecf949b83a0aa79640b2d124d65ffd4db0cd tree 91c52973d0469e67fd20cd38650c216e3e26755d parent 785912139d1f9480683e9fb359c81ce903a60fed author Kumar Gala Wed, 21 Dec 2005 09:29:51 -0600 committer Kumar Gala Wed, 21 Dec 2005 09:29:51 -0600 arch/powerpc/kernel/Makefile | 3 +-- arch/powerpc/kernel/setup_32.c | 2 +- arch/powerpc/kernel/setup_64.c | 2 +- 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 5bdc5fa..a852b37 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -56,8 +56,7 @@ obj-$(CONFIG_BOOTX_TEXT) += btext.o obj-$(CONFIG_6xx) += idle_6xx.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_KPROBES) += kprobes.o -obj-$(CONFIG_PPC_MULTIPLATFORM) += legacy_serial.o -obj-$(CONFIG_PPC_MULTIPLATFORM) += udbg_16550.o +obj-$(CONFIG_SERIAL_8250) += legacy_serial.o udbg_16550.o module-$(CONFIG_PPC64) += module_64.o obj-$(CONFIG_MODULES) += $(module-y) diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c index 79d434f..e5d285a 100644 --- a/arch/powerpc/kernel/setup_32.c +++ b/arch/powerpc/kernel/setup_32.c @@ -299,7 +299,7 @@ void __init setup_arch(char **cmdline_p) if (ppc_md.init_early) ppc_md.init_early(); -#ifdef CONFIG_PPC_MULTIPLATFORM +#ifdef CONFIG_SERIAL_8250 find_legacy_serial_ports(); #endif finish_device_tree(); diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 419e0b9..98e9f05 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ 
-472,7 +472,7 @@ void __init setup_system(void) * hash table management for us, thus ioremap works. We do that early * so that further code can be debugged */ -#ifdef CONFIG_PPC_MULTIPLATFORM +#ifdef CONFIG_SERIAL_8250 find_legacy_serial_ports(); #endif From miltonm at bga.com Thu Dec 22 04:16:01 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 21 Dec 2005 11:16:01 -0600 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <20051220010617.GC29993@localhost.localdomain> References: <20051220010617.GC29993@localhost.localdomain> Message-ID: <485df2ab5f0f0aba53755871a3e12b2c@bga.com> On Dec 19, 2005, at 7:06 PM, David Gibson wrote: > On Mon, Dec 19, 2005 at 08:52:47AM -0600, Milton Miller wrote: >> On Mon Dec 19 16:44:10 EST 2005, David Gibson wrote: >> >>> +extern void wback_dcache_range(unsigned long start, unsigned long >>> stop); >>> +extern void wback_inval_dcache_range(unsigned long start, unsigned >>> long stop); >> >> I think that while we are here we should change the arguments to be >> pointers (void *). The assembly doesn't care, and almost all of the >> users are casting from pointer to unsigned long at the call site, with >> dart being the exception. The instruction cache flush should also >> change. > > True. And while we're at that, the dcache flushing functions are > almost invariably called as *_dcache_range(start, start+length), so > how about changing them to take start and length instead of start and > end. However, flush_icache_range() is called from generic code, so I > don't want to change its interface. > The information from the above paragraph should be in the change log. And if we are not going to change icache, then I would propose we not use 'range' on the dcache side; it is confusing. How about renaming the start, length ones to _region? > Revised patch below: > > powerpc: Merge 32/64 cacheflush code > > This patch merges the cache flushing code for 32 and 64 bit powerpc > machines.
This means the ppc64_caches mechanism for determining > correct cache sizes at runtime is ported to 32-bit, and is thus > renamed as 'powerpc_caches'. The merged cache flushing functions go > in new file arch/powerpc/kernel/cache.S. > > Previously, the ppc32 version of flush_dcache_range() did a writeback > and invalidate of the given cache lines (dcbf) whereas the ppc64 > version did just a writeback (dcbst). In general, there's no > consistent meaning of "flush" as one or the other, so this patch also > renames the dcache flushing functions less ambiguously. The new names > are: > > wback_dcache_range() - previously flush_dcache_range() on > ppc64 and clean_dcache_range() on ppc32 > > wback_inval_dcache_range() - previously > flush_inval_dcache_range() on ppc64 and flush_dcache_range on ppc32 > > invalidate_dcache_range() - didn't previously exist on ppc64, > unchanged on ppc32 > > Finally we also cleanup the initialization of the powerpc_caches > structure from the old ppc64 specific version. We remove a pointless > loop, and remove a dependence on _machine. ... > +/* > + * Flush a particular page from the data cache to RAM. > + * Note: this is necessary because the instruction cache does *not* > + * snoop from the data cache. > + * > + * void __flush_dcache_icache(void *page) When I see *page I think struct page *page, even though this is void. How about page_address? ... > +/* > + * Like above, but only do the D-cache. > + * > + * wback_dcache_range(void *start, unsigned long len) > + * > + * writeback all bytes from start to stop-1 inclusive > + */ > +_GLOBAL(wback_dcache_range) > + LOAD_REG_ADDR(r10, powerpc_caches) > + lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ > + addi r5,r7,-1 > + andc r6,r3,r5 /* round low to line bdy */ > + and r8,r3,r5 /* get cacheline offset of start */ Does the above term ever make a difference since we round up?
> + add r8,r8,r4 /* add length */ > + add r8,r8,r5 /* ensure we get enough */ > + lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get dcache line shift */ Moving this load up might help some processors > + srw. r8,r8,r9 /* compute line count */ So its really unsigned int len (or did you want a PPC_ macro here?) > + beqlr /* nothing to do? */ > + mtctr r8 > +0: dcbst 0,r6 > + add r6,r6,r7 > + bdnz 0b > + sync > + blr > + > milton From olh at suse.de Thu Dec 22 04:56:28 2005 From: olh at suse.de (Olaf Hering) Date: Wed, 21 Dec 2005 18:56:28 +0100 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <20051220214525.GB7428@pb15.lixom.net> References: <20051220204530.GA26351@suse.de> <20051220214525.GB7428@pb15.lixom.net> Message-ID: <20051221175628.GA29363@suse.de> On Tue, Dec 20, Olof Johansson wrote: > On Tue, Dec 20, 2005 at 09:45:30PM +0100, Olaf Hering wrote: > > The connection of ttyS0 to /dev/console doesnt seem to work anymore mit > > 2.6.15-rc5+6 on a POWER4 p630 in fullsystempartition mode, no HMC > > connected. It works with 2.6.14.4. > > I tested 2.6.15-rc6 arch/powerpc/configs/ppc64_defconfig. > > It seems to have been broken a while: According to test.kernel.org (last > machine in the matrix is an SMP mode p650), it broke between 2.6.14-git2 > and 2.6.14-git3. Console output can be found in: > > http://test.kernel.org/15622/debug/console.log for the failed one > http://test.kernel.org/15530/debug/console.log for the successful one I finally managed to find the culprit. good: 25635c71e44111a6bd48f342e144e2fc02d0a314 bad: f9bd170a87948a9e077149b70fb192c563770fdf ... powerpc: Merge i8259.c into arch/powerpc/sysdev This changes the parameters for i8259_init so that it takes two parameters: a physical address for generating an interrupt acknowledge cycle, and an interrupt number offset. i8259_init now sets the irq_desc[] for its interrupts; all the callers were doing this, and that code is gone now. 
This also defines a CONFIG_PPC_I8259 symbol to select i8259.o for inclusion, and makes the platforms that need it select that symbol. ... --- good.log 2005-12-21 18:45:30.268293213 +0100 +++ bad.log 2005-12-21 18:44:45.381519395 +0100 @@ -38,7 +38,7 @@ boot stdout isn't a display ! trying /pci at 400000000112/pci at 2,6/pci at 1/display at 0 ... result: 0 -Starting Linux PPC64 #18 SMP Wed Dec 21 18:27:31 CET 2005 +Starting Linux PPC64 #19 SMP Wed Dec 21 18:35:08 CET 2005 ----------------------------------------------------- ppc64_pft_size = 0x1b ppc64_debug_switch = 0x0 @@ -54,7 +54,7 @@ ----------------------------------------------------- [boot]0100 MM Init [boot]0100 MM Init Done -Linux version 2.6.14-rc5 (olaf at pomegranate) (gcc version 3.3.3 (SuSE Linux)) #18 SMP Wed Dec 21 18:27:31 CET 2005 +Linux version 2.6.14-rc5 (olaf at pomegranate) (gcc version 3.3.3 (SuSE Linux)) #19 SMP Wed Dec 21 18:35:08 CET 2005 [boot]0012 Setup Arch Syscall map setup, 241 32 bits and 221 64 bits syscalls EEH: PCI Enhanced I/O Error Handling Enabled @@ -68,7 +68,7 @@ [boot]0020 XICS Init [boot]0021 XICS Done PID hash table entries: 4096 (order: 12, 131072 bytes) -time_init: decrementer frequency = 181.700926 MHz +time_init: decrementer frequency = 181.701073 MHz time_init: processor frequency = 1453.000000 MHz Found initrd at 0xc000000004535000:0xc000000004667a34 firmware_features = 0x0 @@ -76,7 +76,7 @@ boot stdout isn't a display ! trying /pci at 400000000112/pci at 2,6/pci at 1/display at 0 ... 
result: 0 -Starting Linux PPC64 #18 SMP Wed Dec 21 18:27:31 CET 2005 +Starting Linux PPC64 #19 SMP Wed Dec 21 18:35:08 CET 2005 ----------------------------------------------------- ppc64_pft_size = 0x1b ppc64_debug_switch = 0x0 @@ -92,7 +92,7 @@ ----------------------------------------------------- [boot]0100 MM Init [boot]0100 MM Init Done -Linux version 2.6.14-rc5 (olaf at pomegranate) (gcc version 3.3.3 (SuSE Linux)) #18 SMP Wed Dec 21 18:27:31 CET 2005 +Linux version 2.6.14-rc5 (olaf at pomegranate) (gcc version 3.3.3 (SuSE Linux)) #19 SMP Wed Dec 21 18:35:08 CET 2005 [boot]0012 Setup Arch Syscall map setup, 241 32 bits and 221 64 bits syscalls EEH: PCI Enhanced I/O Error Handling Enabled @@ -110,7 +110,7 @@ [boot]0020 XICS Init [boot]0021 XICS Done PID hash table entries: 4096 (order: 12, 131072 bytes) -time_init: decrementer frequency = 181.700926 MHz +time_init: decrementer frequency = 181.701073 MHz time_init: processor frequency = 1453.000000 MHz Console: colour dummy device 80x25 Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes) @@ -130,6 +130,7 @@ Freeing initrd memory: 1226k freed NET: Registered protocol family 16 PCI: Probing PCI hardware +Failed to request PCI IO region on PCI domain 0000 Using INTC for W82c105 IDE controller. IOMMU table initialized, virtual merging enabled mapping IO 3fd30000000 -> d000080000000000, size: 100000 @@ -234,8 +235,22 @@ target1:0:0: Beginning Domain Validation target1:0:0: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 31) target1:0:0: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31) - target1:0:0: Domain Validation skipping write tests + 1:0:0:0: ABORT operation started. + 1:0:0:0: ABORT operation timed-out. + 1:0:0:0: DEVICE RESET operation started. + 1:0:0:0: DEVICE RESET operation complete. + target1:0:0: control msgout: c. +sym1: TARGET 0 has been reset. + 1:0:0:0: ABORT operation started. + 1:0:0:0: ABORT operation complete. + 1:0:0:0: BUS RESET operation started. 
+ 1:0:0:0: BUS RESET operation complete. +sym1: SCSI BUS reset detected. +sym1: SCSI BUS has been reset. target1:0:0: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 31) + target1:0:0: Wide Transfers Fail + target1:0:0: Domain Validation skipping write tests + target1:0:0: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 31) target1:0:0: Ending Domain Validation ipr: IBM Power RAID SCSI Device Driver version: 2.0.14 (May 2, 2005) vio_register_driver: driver ibmvscsi registering @@ -284,26 +299,25 @@ hub 4-0:1.0: 1 port detected usbcore: registered new driver hiddev usbcore: registered new driver usbhid -/home/olaf/kernel/git/ps2-ppc64/linux-2.6-25635c71e44111a6bd48f342e144e2fc02d0a314/drivers/usb/input/hid-core.c: v2.6:USB HID core driver +/home/olaf/kernel/git/ps2-ppc64/linux-2.6-f9bd170a87948a9e077149b70fb192c563770fdf/drivers/usb/input/hid-core.c: v2.6:USB HID core driver mice: PS/2 mouse device common for all mice i2c /dev entries driver md: linear personality registered as nr 1 +atkbd.c: keyboard reset failed on isa0060/serio1 md: raid0 personality registered as nr 2 md: raid1 personality registered as nr 3 md: raid10 personality registered as nr 9 md: raid5 personality registered as nr 4 raid5: measuring checksumming speed - 8regs : 3434.000 MB/sec - 8regs_prefetch: 3071.000 MB/sec -atkbd.c: keyboard reset failed on isa0060/serio1 - 32regs : 4735.000 MB/sec - 32regs_prefetch: 3721.000 MB/sec -raid5: using function: 32regs (4735.000 MB/sec) + 8regs : 54242.000 MB/sec + 8regs_prefetch: 208256.000 MB/sec + 32regs : 1.000 MB/sec + 32regs_prefetch: 1.000 MB/sec +raid5: using function: 8regs_prefetch (208256.000 MB/sec) md: md driver 0.90.2 MAX_MD_DEVS=256, MD_SB_DISKS=27 md: bitmap version 3.39 device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel at redhat.com oprofile: using ppc64/power4 performance monitoring. 
-input: AT Raw Set 2 keyboard on isa0060/serio0 NET: Registered protocol family 2 IP route cache hash table entries: 524288 (order: 10, 4194304 bytes) TCP established hash table entries: 1048576 (order: 12, 16777216 bytes) @@ -316,56 +330,12 @@ NET: Registered protocol family 1 NET: Registered protocol family 17 Freeing unused kernel memory: 372k freed - running (1:1) /init --login - -creating device nodes .[: [0-9]*: bad number -[: [0-9]*: bad number -[: [0-9]*: bad n 0:0:15:0: phase change 6-7 9 at 400503a8 resid=7. -umber -0:0:15:0: neither page 0x83 nor 0x80 supported +input: AT Raw Set 2 keyboard on isa0060/serio0 + 0:0:15:0: phase change 6-7 9 at 400503a8 resid=7. st0: Block limits 1 - 16777215 bytes. -[: [0-9]*: bad number -1:0:0:0: neither page 0x83 nor 0x80 supported -[: [0-9]*: bad number -1:0:0:0: neither page 0x83 nor 0x80 supported -[: [0-9]*: bad number -1:0:0:0: ioctl failed: 6 -1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0. -[: [0-9]*: bad number -1:0:0:0: ioctl failed: 6 -1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0. -[: [0-9]*: bad number -1:0:0:0: ioctl failed: 6 -1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0. -[: [0-9]*: bad number -1:0:0:0: neither page 0x83 nor 0x80 supported -[: [0-9]*: bad number -1:0:0:0: ioctl failed: 6 -1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0. -[: [0-9]*: bad number -1:0:0:0: ioctl failed: 6 -1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0. -[: [0-9]*: bad number -1:0:0:0: ioctl failed: 6 -1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0. -[: [0-9]*: bad number -[: [0-9]*: bad number -[: [0-9]*: bad number -[: [0-9]*: bad number -[: [0-9]*: bad number -[: [0-9]*: bad number -[: [0-9]*: bad number -.. 
-mount -o ro /deReiserFS: sda4: found reiserfs format "3.6" with standard journal -v/sda4 +ReiserFS: sda4: found reiserfs format "3.6" with standard journal ReiserFS: sda4: using ordered data mode ReiserFS: sda4: journal params: device sda4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 ReiserFS: sda4: checking transaction log (sda4) ReiserFS: sda4: Using r5 hash to sort names -mknod: `/dev/fb0': File exists -mknod: `/dev/fb1': File exists -(none):/# /sbin/reboot -f -md: stopping all md devices. -Restarting system. -. -- short story of a lazy sysadmin: alias appserv=wotan From miltonm at bga.com Thu Dec 22 07:21:08 2005 From: miltonm at bga.com (Milton Miller) Date: Wed, 21 Dec 2005 14:21:08 -0600 Subject: console on POWER4 not working with 2.6.15 Message-ID: <4dbd49fd28881ccde9e47a8c45cb0221@bga.com> On Thu Dec 22 04:56:28 EST 2005, Olaf Hering wrote: > I finally managed to find the culprit. > > good: 25635c71e44111a6bd48f342e144e2fc02d0a314 > bad: f9bd170a87948a9e077149b70fb192c563770fdf > > ... > powerpc: Merge i8259.c into arch/powerpc/sysdev > > This changes the parameters for i8259_init so that it takes two > parameters: a physical address for generating an interrupt > acknowledge cycle, and an interrupt number offset. i8259_init > now sets the irq_desc[] for its interrupts; all the callers > were doing this, and that code is gone now. This also defines > a CONFIG_PPC_I8259 symbol to select i8259.o for inclusion, and > makes the platforms that need it select that symbol. ... > PCI: Probing PCI hardware > +Failed to request PCI IO region on PCI domain 0000 That caught my eye. It turns out that xics calls the 8259 init at arch_initcall before the pci has probed the first bus. The ppc64 version did not request the resources it was using, but the combined version picked this up from the ppc side. 
If we defer the 8259 init to subsys_initcall, or even look for it after we setup the first io resource from the pci code, it may solve this problem. I noticed the commit did not touch xics.c. Looking further I see that the move of the 8259 irq_desc setup code into i8259_init undoes the combined xics_8259_pic descriptors. Perhaps this should be made more like the mpic code? Just detect the cascade, call the cascaded irq controller, eoi, and return? From there I think xics_mask_and_ack can disappear or become empty. Also, is there a reason we are not looking for a intack pci bridge property? Ok enough for someone on vacation. milton From sfr at canb.auug.org.au Thu Dec 22 10:23:03 2005 From: sfr at canb.auug.org.au (Stephen Rothwell) Date: Thu, 22 Dec 2005 10:23:03 +1100 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <20051221142341.GA23639@suse.de> References: <20051220204530.GA26351@suse.de> <20051220214525.GB7428@pb15.lixom.net> <20051221142341.GA23639@suse.de> Message-ID: <20051222102303.001abf38.sfr@canb.auug.org.au> On Wed, 21 Dec 2005 15:23:41 +0100 Olaf Hering wrote: > > If I only knew how to export the tree state at these points... git tar-tree ? -- Cheers, Stephen Rothwell sfr at canb.auug.org.au http://www.canb.auug.org.au/~sfr/ From brian.jewell at themis.com Thu Dec 22 11:23:06 2005 From: brian.jewell at themis.com (brian jewell) Date: Wed, 21 Dec 2005 16:23:06 -0800 Subject: Accessing NVRAM on the Maple board Message-ID: Hi, I need to be able to access NVRAM from Linux on the Maple PPC Evaluation board. There is what appears to be a device driver that provides this capability in arch/ppc64/kernel, in a file called "nvram.c".
Does anyone know where I can obtain some documentation on nvram.c? Thanks for the help. --Brian Jewell --Themis Computer From drdavew at austin.rr.com Thu Dec 22 11:28:55 2005 From: drdavew at austin.rr.com (Dave Willoughby) Date: Wed, 21 Dec 2005 18:28:55 -0600 Subject: Accessing NVRAM on the Maple board In-Reply-To: References: Message-ID: <75787748-0F0B-47E8-AE18-D3A7B65D9BCF@austin.rr.com> I'm not sure the Maple board has what I think of as NVRAM, the way Apple or pSeries PowerPC computers have NVRAM. The Maple firmware has a PIBS firmware prompt that allows setting boot configuration values if that's what you are looking for. Dave Willoughby On Dec 21, 2005, at 6:23 PM, brian jewell wrote: > Hi, > > I need to be able to access NVRAM from Linux on the Maple PPC > Evaluation > board. > > There is what appears to be a device driver that provides this > capability in > arch/ppc64/kernel, in a file called "nvram.c". > > Does anyone know where I can obtain some documentation on nvram.c? > > Thanks for the help. > > --Brian Jewell > --Themis Computer From khem at mvista.com Thu Dec 22 12:08:52 2005 From: khem at mvista.com (Khem Raj) Date: Wed, 21 Dec 2005 17:08:52 -0800 Subject: GCC 3.3.6: ICE in emit_move_insn, at expr.c:3198 Message-ID: <43A9FCA4.3070104@mvista.com> Hello While compiling gcc 3.3.6 for powerpc64 I encountered an ICE in GCC cross build.
/usr/include/bits/string2.h: In function `__strpbrk_c2': /usr/include/bits/string2.h:1041: internal compiler error: in emit_move_insn, at expr.c:3198 Please submit a full bug report, with preprocessed source if appropriate. See for instructions. Here is a short example that I reduced. char* foo ( char *s ) { return s; } -- Khem Raj MontaVista Software, Inc. www.mvista.com From david at gibson.dropbear.id.au Thu Dec 22 14:31:39 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 22 Dec 2005 14:31:39 +1100 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <485df2ab5f0f0aba53755871a3e12b2c@bga.com> References: <20051220010617.GC29993@localhost.localdomain> <485df2ab5f0f0aba53755871a3e12b2c@bga.com> Message-ID: <20051222033139.GB9475@localhost.localdomain> On Wed, Dec 21, 2005 at 11:16:01AM -0600, Milton Miller wrote: > > On Dec 19, 2005, at 7:06 PM, David Gibson wrote: > > >On Mon, Dec 19, 2005 at 08:52:47AM -0600, Milton Miller wrote: > >>On Mon Dec 19 16:44:10 EST 2005, David Gibson wrote: > >> > >>>+extern void wback_dcache_range(unsigned long start, unsigned long > >>>stop); > >>>+extern void wback_inval_dcache_range(unsigned long start, unsigned > >>>long stop); > >> > >>I think that while we are here we should change the arguments to be > >>pointers (void *). The assembly doesn't care, and almost all of the > >>users are casting from pointer to unsigned long at the call site, with > >>dart being the exception. The instruction cache flush should also > >>change. > > > >True. And while we're at that, the dcache flushing functions are > >almost invariably called as *_dcache_range(start, start+length), so > >how about changing them to take start and length instead of start and > >end. However, flush_icache_range() is called from generic code, so I > >don't want to change its interface. > > The information from the above paragraph should be in the change log. Duly added.
> And if we are not going to change icache, then I would propose we not > use range on the dcache side, it is confusing. How about renaming the > start, length ones to _region? Hrm.. yes. Paulus, do you have a preferred approach? > >Revised patch below: > > > >powerpc: Merge 32/64 cacheflush code > > > >This patch merges the cache flushing code for 32 and 64 bit powerpc > >machines. This means the ppc64_caches mechanism for determining > >correct cache sizes at runtime is ported to 32-bit, and is thus > >renamed as 'powerpc_caches'. The merged cache flushing functions go > >in new file arch/powerpc/kernel/cache.S. > > > >Previously, the ppc32 version of flush_dcache_range() did a writeback > >and invalidate of the given cache lines (dcbf) whereas the ppc64 > >version did just a writeback (dcbst). In general, there's no > >consistent meaning of "flush" as one or the other, so this patch also > >renames the dcache flushing functions less ambiguously. The new names > >are: > > > > wback_dcache_range() - previously flush_dcache_range() on > >ppc64 and clean_dcache_range() on ppc32 > > > > wback_inval_dcache_range() - previously > >flush_inval_dcache_range() on ppc64 and flush_dcache_range on ppc32 > > > > invalidate_dcache_range() - didn't previously exist on ppc64, > >unchanged on ppc32 > > > >Finally we also cleanup the initialization of the powerpc_caches > >structure from the old ppc64 specific version. We remove a pointless > >loop, and remove a dependence on _machine. > > ... > > >+/* > >+ * Flush a particular page from the data cache to RAM. > >+ * Note: this is necessary because the instruction cache does *not* > >+ * snoop from the data cache. > >+ * > >+ * void __flush_dcache_icache(void *page) > > When I see *page I think struct page *page .... even though this is > void, how about page_address ? Good point, though not my addition (that's copied verbatim from the existing misc_64.S).
Changed to page_va, as in the prototype in cacheflush.h > >+/* > >+ * Like above, but only do the D-cache. > >+ * > >+ * wback_dcache_range(void *start, unsigned long len) > >+ * > >+ * writeback all bytes from start to stop-1 inclusive > >+ */ > >+_GLOBAL(wback_dcache_range) > >+ LOAD_REG_ADDR(r10, powerpc_caches) > >+ lwz r7,DCACHEL1LINESIZE(r10) /* Get dcache line size */ > >+ addi r5,r7,-1 > >+ andc r6,r3,r5 /* round low to line bdy */ > >+ and r8,r3,r5 /* get cacheline offset of start */ > > Does the above term ever make a difference since we round up? Yes. Consider wback_dcache_range((void*)0x7f, 2). > >+ add r8,r8,r4 /* add length */ > >+ add r8,r8,r5 /* ensure we get enough */ > >+ lwz r9,DCACHEL1LOGLINESIZE(r10) /* Get dcache line shift */ > > Moving this load up might help some processors I'd rather make such a change as a separate patch, this code path is identical to the existing ppc64 code. > >+ srw. r8,r8,r9 /* compute line count */ > > So its really unsigned int len (or did you want a PPC_ macro here?) Hmm.. good point. The old versions also had this, though there it was more subtle - a difference of pointers being truncated to 32bits. I can see no reason not to use an srd. on 64-bit here, PPC_SRL macro duly added. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Thu Dec 22 14:46:31 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 22 Dec 2005 14:46:31 +1100 Subject: powerpc: Remove lppaca structure from the PACA Message-ID: <20051222034631.GC9475@localhost.localdomain> Paulus, for your consideration.. At present the lppaca - the structure shared with the iSeries hypervisor and phyp is contained within the PACA, our own low-level per-cpu structure. This doesn't have to be so, the patch below removes it, making a separate array of lppaca structures. 
This saves approximately 500*NR_CPUS bytes of image size and kernel memory, because we don't need aligning gap between the Linux and hypervisor portions of every PACA. On the other hand it means an extra level of dereference in many accesses to the lppaca. The patch also gets rid of several places where we assign the paca address to a local variable for no particular reason. Does this seem like a good idea or not? Signed-off-by: David Gibson Index: working-2.6/arch/powerpc/kernel/asm-offsets.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/asm-offsets.c 2005-12-22 14:26:50.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/asm-offsets.c 2005-12-22 14:37:28.000000000 +1100 @@ -135,7 +135,7 @@ int main(void) DEFINE(PACA_EXMC, offsetof(struct paca_struct, exmc)); DEFINE(PACA_EXSLB, offsetof(struct paca_struct, exslb)); DEFINE(PACAEMERGSP, offsetof(struct paca_struct, emergency_sp)); - DEFINE(PACALPPACA, offsetof(struct paca_struct, lppaca)); + DEFINE(PACALPPACAPTR, offsetof(struct paca_struct, lppaca_ptr)); DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id)); DEFINE(LPPACASRR0, offsetof(struct lppaca, saved_srr0)); Index: working-2.6/arch/powerpc/kernel/paca.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/paca.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/paca.c 2005-12-22 14:36:59.000000000 +1100 @@ -25,6 +25,28 @@ * field correctly */ extern unsigned long __toc_start; +/* + * iSeries structure which the hypervisor knows about - this structure + * should not cross a page boundary. The vpa_init/register_vpa call + * is now known to fail if the lppaca structure crosses a page + * boundary. The lppaca is also used on POWER5 pSeries boxes. 
The + * lppaca is 640 bytes long, and cannot readily change since the + * hypervisor knows its layout, so a 1kB alignment will suffice to + * ensure that it doesn't cross a page boundary. + */ +struct lppaca lppaca[] = { + [0 ... (NR_CPUS-1)] = { + .desc = 0xd397d781, /* "LpPa" */ + .size = sizeof(struct lppaca), + .dyn_proc_status = 2, + .decr_val = 0x00ff0000, + .fpregs_in_use = 1, + .end_of_quantum = 0xfffffffffffffffful, + .slb_count = 64, + .vmxregs_in_use = 0, + }, +}; + /* The Paca is an array with one entry per processor. Each contains an * lppaca, which contains the information shared between the * hypervisor and Linux. @@ -35,27 +57,17 @@ extern unsigned long __toc_start; * processor (not thread). */ #define PACA_INIT_COMMON(number, start, asrr, asrv) \ + .lppaca_ptr = &lppaca[number], \ .lock_token = 0x8000, \ .paca_index = (number), /* Paca Index */ \ .kernel_toc = (unsigned long)(&__toc_start) + 0x8000UL, \ .stab_real = (asrr), /* Real pointer to segment table */ \ .stab_addr = (asrv), /* Virt pointer to segment table */ \ .cpu_start = (start), /* Processor start */ \ - .hw_cpu_id = 0xffff, \ - .lppaca = { \ - .desc = 0xd397d781, /* "LpPa" */ \ - .size = sizeof(struct lppaca), \ - .dyn_proc_status = 2, \ - .decr_val = 0x00ff0000, \ - .fpregs_in_use = 1, \ - .end_of_quantum = 0xfffffffffffffffful, \ - .slb_count = 64, \ - .vmxregs_in_use = 0, \ - }, \ + .hw_cpu_id = 0xffff, #ifdef CONFIG_PPC_ISERIES #define PACA_INIT_ISERIES(number) \ - .lppaca_ptr = &paca[number].lppaca, \ .reg_save_ptr = &iseries_reg_save[number], #define PACA_INIT(number) \ Index: working-2.6/include/asm-powerpc/lppaca.h =================================================================== --- working-2.6.orig/include/asm-powerpc/lppaca.h 2005-12-21 10:59:36.000000000 +1100 +++ working-2.6/include/asm-powerpc/lppaca.h 2005-12-22 14:36:59.000000000 +1100 @@ -29,7 +29,9 @@ //---------------------------------------------------------------------------- #include -struct lppaca { +/* The 
Hypervisor barfs if the lppaca crosses a page boundary. A 1k + * alignment is sufficient to prevent this */ +struct __attribute__((__aligned__(0x400))) lppaca { //============================================================================= // CACHE_LINE_1 0x0000 - 0x007F Contains read-only data // NOTE: The xDynXyz fields are fields that will be dynamically changed by @@ -129,5 +131,7 @@ struct lppaca { u8 pmc_save_area[256]; // PMC interrupt Area x00-xFF }; +extern struct lppaca lppaca[]; + #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_LPPACA_H */ Index: working-2.6/include/asm-powerpc/paca.h =================================================================== --- working-2.6.orig/include/asm-powerpc/paca.h 2005-12-21 10:59:36.000000000 +1100 +++ working-2.6/include/asm-powerpc/paca.h 2005-12-22 14:37:28.000000000 +1100 @@ -23,6 +23,7 @@ register struct paca_struct *local_paca asm("r13"); #define get_paca() local_paca +#define get_lppaca() (get_paca()->lppaca_ptr) struct task_struct; @@ -94,19 +95,6 @@ struct paca_struct { u64 saved_r1; /* r1 save for RTAS calls */ u64 saved_msr; /* MSR saved here by enter_rtas */ u8 proc_enabled; /* irq soft-enable flag */ - - /* - * iSeries structure which the hypervisor knows about - - * this structure should not cross a page boundary. - * The vpa_init/register_vpa call is now known to fail if the - * lppaca structure crosses a page boundary. - * The lppaca is also used on POWER5 pSeries boxes. - * The lppaca is 640 bytes long, and cannot readily change - * since the hypervisor knows its layout, so a 1kB - * alignment will suffice to ensure that it doesn't - * cross a page boundary. 
- */ - struct lppaca lppaca __attribute__((__aligned__(0x400))); }; extern struct paca_struct paca[]; Index: working-2.6/arch/powerpc/platforms/pseries/lpar.c =================================================================== --- working-2.6.orig/arch/powerpc/platforms/pseries/lpar.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/powerpc/platforms/pseries/lpar.c 2005-12-22 14:36:59.000000000 +1100 @@ -254,11 +254,11 @@ out: void vpa_init(int cpu) { int hwcpu = get_hard_smp_processor_id(cpu); - unsigned long vpa = __pa(&paca[cpu].lppaca); + unsigned long vpa = __pa(&lppaca[cpu]); long ret; if (cpu_has_feature(CPU_FTR_ALTIVEC)) - paca[cpu].lppaca.vmxregs_in_use = 1; + lppaca[cpu].vmxregs_in_use = 1; ret = register_vpa(hwcpu, vpa); Index: working-2.6/arch/powerpc/kernel/lparcfg.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/lparcfg.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/lparcfg.c 2005-12-22 14:36:59.000000000 +1100 @@ -55,15 +55,13 @@ static unsigned long get_purr(void) { unsigned long sum_purr = 0; int cpu; - struct paca_struct *lpaca; for_each_cpu(cpu) { - lpaca = paca + cpu; - sum_purr += lpaca->lppaca.emulated_time_base; + sum_purr += lppaca[cpu].emulated_time_base; #ifdef PURR_DEBUG printk(KERN_INFO "get_purr for cpu (%d) has value (%ld) \n", - cpu, lpaca->lppaca.emulated_time_base); + cpu, lppaca[cpu].emulated_time_base); #endif } return sum_purr; @@ -79,12 +77,11 @@ static int lparcfg_data(struct seq_file unsigned long pool_id, lp_index; int shared, entitled_capacity, max_entitled_capacity; int processors, max_processors; - struct paca_struct *lpaca = get_paca(); unsigned long purr = get_purr(); seq_printf(m, "%s %s \n", MODULE_NAME, MODULE_VERS); - shared = (int)(lpaca->lppaca_ptr->shared_proc); + shared = (int)(get_lppaca()->shared_proc); seq_printf(m, "serial_number=%c%c%c%c%c%c%c\n", e2a(xItExtVpdPanel.mfgID[2]), e2a(xItExtVpdPanel.mfgID[3]), @@ 
-402,7 +399,7 @@ static int lparcfg_data(struct seq_file (h_resource >> 0 * 8) & 0xffff); /* pool related entries are apropriate for shared configs */ - if (paca[0].lppaca.shared_proc) { + if (lppaca[0].shared_proc) { h_pic(&pool_idle_time, &pool_procs); @@ -451,7 +448,7 @@ static int lparcfg_data(struct seq_file seq_printf(m, "partition_potential_processors=%d\n", partition_potential_processors); - seq_printf(m, "shared_processor_mode=%d\n", paca[0].lppaca.shared_proc); + seq_printf(m, "shared_processor_mode=%d\n", lppaca[0].shared_proc); return 0; } Index: working-2.6/arch/powerpc/platforms/pseries/setup.c =================================================================== --- working-2.6.orig/arch/powerpc/platforms/pseries/setup.c 2005-12-19 14:18:25.000000000 +1100 +++ working-2.6/arch/powerpc/platforms/pseries/setup.c 2005-12-22 14:36:59.000000000 +1100 @@ -192,7 +192,7 @@ static void pseries_lpar_enable_pmcs(voi /* instruct hypervisor to maintain PMCs */ if (firmware_has_feature(FW_FEATURE_SPLPAR)) - get_paca()->lppaca.pmcregs_in_use = 1; + get_lppaca()->pmcregs_in_use = 1; } static void __init pSeries_setup_arch(void) @@ -236,7 +236,7 @@ static void __init pSeries_setup_arch(vo /* Choose an idle loop */ if (firmware_has_feature(FW_FEATURE_SPLPAR)) { vpa_init(boot_cpuid); - if (get_paca()->lppaca.shared_proc) { + if (get_lppaca()->shared_proc) { printk(KERN_INFO "Using shared processor idle loop\n"); ppc_md.idle_loop = pseries_shared_idle; } else { @@ -443,10 +443,10 @@ DECLARE_PER_CPU(unsigned long, smt_snooz static inline void dedicated_idle_sleep(unsigned int cpu) { - struct paca_struct *ppaca = &paca[cpu ^ 1]; + struct lppaca *plppaca = &lppaca[cpu ^ 1]; /* Only sleep if the other thread is not idle */ - if (!(ppaca->lppaca.idle)) { + if (!(plppaca->idle)) { local_irq_disable(); /* @@ -479,7 +479,6 @@ static inline void dedicated_idle_sleep( static void pseries_dedicated_idle(void) { - struct paca_struct *lpaca = get_paca(); unsigned int cpu = 
smp_processor_id(); unsigned long start_snooze; unsigned long *smt_snooze_delay = &__get_cpu_var(smt_snooze_delay); @@ -490,7 +489,7 @@ static void pseries_dedicated_idle(void) * Indicate to the HV that we are idle. Now would be * a good time to find other work to dispatch. */ - lpaca->lppaca.idle = 1; + get_lppaca()->idle = 1; if (!need_resched()) { start_snooze = get_tb() + @@ -517,7 +516,7 @@ static void pseries_dedicated_idle(void) HMT_medium(); } - lpaca->lppaca.idle = 0; + get_lppaca()->idle = 0; ppc64_runlatch_on(); preempt_enable_no_resched(); @@ -531,7 +530,6 @@ static void pseries_dedicated_idle(void) static void pseries_shared_idle(void) { - struct paca_struct *lpaca = get_paca(); unsigned int cpu = smp_processor_id(); while (1) { @@ -539,7 +537,7 @@ static void pseries_shared_idle(void) * Indicate to the HV that we are idle. Now would be * a good time to find other work to dispatch. */ - lpaca->lppaca.idle = 1; + get_lppaca()->idle = 1; while (!need_resched() && !cpu_is_offline(cpu)) { local_irq_disable(); @@ -563,7 +561,7 @@ static void pseries_shared_idle(void) HMT_medium(); } - lpaca->lppaca.idle = 0; + get_lppaca()->idle = 0; ppc64_runlatch_on(); preempt_enable_no_resched(); @@ -587,7 +585,7 @@ static void pseries_kexec_cpu_down(int c { /* Don't risk a hypervisor call if we're crashing */ if (!crash_shutdown) { - unsigned long vpa = __pa(&get_paca()->lppaca); + unsigned long vpa = __pa(get_lppaca()); if (unregister_vpa(hard_smp_processor_id(), vpa)) { printk("VPA deregistration of cpu %u (hw_cpu_id %d) " Index: working-2.6/arch/powerpc/kernel/entry_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/entry_64.S 2005-12-22 14:26:50.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/entry_64.S 2005-12-22 14:36:59.000000000 +1100 @@ -511,7 +511,8 @@ restore: cmpdi 0,r5,0 beq 4f /* Check for pending interrupts (iSeries) */ - ld r3,PACALPPACA+LPPACAANYINT(r13) + ld r3,PACALPPACAPTR(r13) + ld 
r3,LPPACAANYINT(r3) cmpdi r3,0 beq+ 4f /* skip do_IRQ if no interrupts */ Index: working-2.6/arch/powerpc/kernel/head_64.S =================================================================== --- working-2.6.orig/arch/powerpc/kernel/head_64.S 2005-12-22 14:26:50.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/head_64.S 2005-12-22 14:36:59.000000000 +1100 @@ -255,8 +255,9 @@ exception_marker: #define EXCEPTION_PROLOG_ISERIES_2 \ mfmsr r10; \ - ld r11,PACALPPACA+LPPACASRR0(r13); \ - ld r12,PACALPPACA+LPPACASRR1(r13); \ + ld r12,PACALPPACAPTR(r13); \ + ld r11,LPPACASRR0(r12); \ + ld r12,LPPACASRR1(r12); \ ori r10,r10,MSR_RI; \ mtmsrd r10,1 @@ -635,7 +636,8 @@ data_access_slb_iSeries: std r12,PACA_EXSLB+EX_R12(r13) mfspr r10,SPRN_SPRG1 std r10,PACA_EXSLB+EX_R13(r13) - ld r12,PACALPPACA+LPPACASRR1(r13); + ld r12,PACALPPACAPTR(r13) + ld r12,LPPACASRR1(r12) b .slb_miss_realmode STD_EXCEPTION_ISERIES(0x400, instruction_access, PACA_EXGEN) @@ -645,7 +647,8 @@ instruction_access_slb_iSeries: mtspr SPRN_SPRG1,r13 /* save r13 */ mfspr r13,SPRN_SPRG3 /* get paca address into r13 */ std r3,PACA_EXSLB+EX_R3(r13) - ld r3,PACALPPACA+LPPACASRR0(r13) /* get SRR0 value */ + ld r3,PACALPPACAPTR(r13) + ld r3,LPPACASRR0(r3) /* get SRR0 value */ std r9,PACA_EXSLB+EX_R9(r13) mfcr r9 #ifdef __DISABLED__ @@ -657,7 +660,8 @@ instruction_access_slb_iSeries: std r12,PACA_EXSLB+EX_R12(r13) mfspr r10,SPRN_SPRG1 std r10,PACA_EXSLB+EX_R13(r13) - ld r12,PACALPPACA+LPPACASRR1(r13); + ld r12,PACALPPACAPTR(r13) + ld r12,LPPACASRR1(r12) b .slb_miss_realmode #ifdef __DISABLED__ @@ -746,7 +750,8 @@ iSeries_secondary_smp_loop: .globl decrementer_iSeries_masked decrementer_iSeries_masked: li r11,1 - stb r11,PACALPPACA+LPPACADECRINT(r13) + ld r12,PACALPPACAPTR(r13) + stb r11,LPPACADECRINT(r12) LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy) lwz r12,ADDROFF(tb_ticks_per_jiffy)(r12) mtspr SPRN_DEC,r12 @@ -755,8 +760,9 @@ decrementer_iSeries_masked: .globl hardware_interrupt_iSeries_masked 
hardware_interrupt_iSeries_masked: mtcrf 0x80,r9 /* Restore regs */ - ld r11,PACALPPACA+LPPACASRR0(r13) - ld r12,PACALPPACA+LPPACASRR1(r13) + ld r12,PACALPPACAPTR(r13) + ld r11,LPPACASRR0(r12) + ld r12,LPPACASRR1(r12) mtspr SPRN_SRR0,r11 mtspr SPRN_SRR1,r12 ld r9,PACA_EXGEN+EX_R9(r13) @@ -995,7 +1001,8 @@ _GLOBAL(slb_miss_realmode) ld r3,PACA_EXSLB+EX_R3(r13) lwz r9,PACA_EXSLB+EX_CCR(r13) /* get saved CR */ #ifdef CONFIG_PPC_ISERIES - ld r11,PACALPPACA+LPPACASRR0(r13) /* get SRR0 value */ + ld r11,PACALPPACAPTR(r13) + ld r11,LPPACASRR0(r11) /* get SRR0 value */ #endif /* CONFIG_PPC_ISERIES */ mtlr r10 Index: working-2.6/arch/powerpc/kernel/irq.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/irq.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/irq.c 2005-12-22 14:36:59.000000000 +1100 @@ -238,14 +238,10 @@ void do_IRQ(struct pt_regs *regs) irq_exit(); #ifdef CONFIG_PPC_ISERIES - { - struct paca_struct *lpaca = get_paca(); - - if (lpaca->lppaca.int_dword.fields.decr_int) { - lpaca->lppaca.int_dword.fields.decr_int = 0; - /* Signal a fake decrementer interrupt */ - timer_interrupt(regs); - } + if (get_lppaca()->int_dword.fields.decr_int) { + get_lppaca()->int_dword.fields.decr_int = 0; + /* Signal a fake decrementer interrupt */ + timer_interrupt(regs); } #endif } Index: working-2.6/arch/powerpc/kernel/time.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/time.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/time.c 2005-12-22 14:36:59.000000000 +1100 @@ -431,7 +431,7 @@ void timer_interrupt(struct pt_regs * re profile_tick(CPU_PROFILING, regs); #ifdef CONFIG_PPC_ISERIES - get_paca()->lppaca.int_dword.fields.decr_int = 0; + get_lppaca()->int_dword.fields.decr_int = 0; #endif while ((ticks = tb_ticks_since(per_cpu(last_jiffy, cpu))) Index: working-2.6/arch/powerpc/lib/locks.c 
=================================================================== --- working-2.6.orig/arch/powerpc/lib/locks.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/lib/locks.c 2005-12-22 14:36:59.000000000 +1100 @@ -28,15 +28,13 @@ void __spin_yield(raw_spinlock_t *lock) { unsigned int lock_value, holder_cpu, yield_count; - struct paca_struct *holder_paca; lock_value = lock->slock; if (lock_value == 0) return; holder_cpu = lock_value & 0xffff; BUG_ON(holder_cpu >= NR_CPUS); - holder_paca = &paca[holder_cpu]; - yield_count = holder_paca->lppaca.yield_count; + yield_count = lppaca[holder_cpu].yield_count; if ((yield_count & 1) == 0) return; /* virtual cpu is currently running */ rmb(); @@ -60,15 +58,13 @@ void __rw_yield(raw_rwlock_t *rw) { int lock_value; unsigned int holder_cpu, yield_count; - struct paca_struct *holder_paca; lock_value = rw->lock; if (lock_value >= 0) return; /* no write lock at present */ holder_cpu = lock_value & 0xffff; BUG_ON(holder_cpu >= NR_CPUS); - holder_paca = &paca[holder_cpu]; - yield_count = holder_paca->lppaca.yield_count; + yield_count = lppaca[holder_cpu].yield_count; if ((yield_count & 1) == 0) return; /* virtual cpu is currently running */ rmb(); Index: working-2.6/arch/powerpc/platforms/iseries/irq.c =================================================================== --- working-2.6.orig/arch/powerpc/platforms/iseries/irq.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/platforms/iseries/irq.c 2005-12-22 14:36:59.000000000 +1100 @@ -339,14 +339,12 @@ int __init iSeries_allocate_IRQ(HvBusNum */ int iSeries_get_irq(struct pt_regs *regs) { - struct paca_struct *lpaca; /* -2 means ignore this interrupt */ int irq = -2; - lpaca = get_paca(); #ifdef CONFIG_SMP - if (lpaca->lppaca.int_dword.fields.ipi_cnt) { - lpaca->lppaca.int_dword.fields.ipi_cnt = 0; + if (get_lppaca()->int_dword.fields.ipi_cnt) { + get_lppaca()->int_dword.fields.ipi_cnt = 0; iSeries_smp_message_recv(regs); } #endif /* CONFIG_SMP */ 
Index: working-2.6/arch/powerpc/platforms/iseries/setup.c =================================================================== --- working-2.6.orig/arch/powerpc/platforms/iseries/setup.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/platforms/iseries/setup.c 2005-12-22 14:36:59.000000000 +1100 @@ -547,7 +547,7 @@ static unsigned long __init build_iSerie */ static void __init iSeries_setup_arch(void) { - if (get_paca()->lppaca.shared_proc) { + if (get_lppaca()->shared_proc) { ppc_md.idle_loop = iseries_shared_idle; printk(KERN_INFO "Using shared processor idle loop\n"); } else { @@ -656,7 +656,7 @@ static void yield_shared_processor(void) * The decrementer stops during the yield. Force a fake decrementer * here and let the timer_interrupt code sort out the actual time. */ - get_paca()->lppaca.int_dword.fields.decr_int = 1; + get_lppaca()->int_dword.fields.decr_int = 1; process_iSeries_events(); } @@ -883,7 +883,7 @@ void dt_cpus(struct iseries_flat_dt *dt) dt_prop_u32(dt, "#size-cells", 0); for (i = 0; i < NR_CPUS; i++) { - if (paca[i].lppaca.dyn_proc_status >= 2) + if (lppaca[i].dyn_proc_status >= 2) continue; snprintf(p, 32 - (p - buf), "@%d", i); @@ -891,7 +891,7 @@ void dt_cpus(struct iseries_flat_dt *dt) dt_prop_str(dt, "device_type", "cpu"); - index = paca[i].lppaca.dyn_hv_phys_proc_index; + index = lppaca[i].dyn_hv_phys_proc_index; d = &xIoHriProcessorVpd[index]; dt_prop_u32(dt, "i-cache-size", d->xInstCacheSize * 1024); Index: working-2.6/arch/powerpc/platforms/iseries/smp.c =================================================================== --- working-2.6.orig/arch/powerpc/platforms/iseries/smp.c 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/platforms/iseries/smp.c 2005-12-22 14:36:59.000000000 +1100 @@ -91,7 +91,7 @@ static void smp_iSeries_kick_cpu(int nr) BUG_ON((nr < 0) || (nr >= NR_CPUS)); /* Verify that our partition has a processor nr */ - if (paca[nr].lppaca.dyn_proc_status >= 2) + if (lppaca[nr].dyn_proc_status 
>= 2) return; /* The processor is currently spinning, waiting Index: working-2.6/include/asm-powerpc/spinlock.h =================================================================== --- working-2.6.orig/include/asm-powerpc/spinlock.h 2005-12-21 10:59:36.000000000 +1100 +++ working-2.6/include/asm-powerpc/spinlock.h 2005-12-22 14:36:59.000000000 +1100 @@ -80,7 +80,7 @@ static int __inline__ __raw_spin_trylock #if defined(CONFIG_PPC_SPLPAR) || defined(CONFIG_PPC_ISERIES) /* We only yield to the hypervisor if we are in shared processor mode */ -#define SHARED_PROCESSOR (get_paca()->lppaca.shared_proc) +#define SHARED_PROCESSOR (get_lppaca()->shared_proc) extern void __spin_yield(raw_spinlock_t *lock); extern void __rw_yield(raw_rwlock_t *lock); #else /* SPLPAR || ISERIES */ Index: working-2.6/include/asm-powerpc/time.h =================================================================== --- working-2.6.orig/include/asm-powerpc/time.h 2005-11-23 15:56:36.000000000 +1100 +++ working-2.6/include/asm-powerpc/time.h 2005-12-22 14:36:59.000000000 +1100 @@ -175,11 +175,10 @@ static inline void set_dec(int val) set_dec_cpu6(val); #else #ifdef CONFIG_PPC_ISERIES - struct paca_struct *lpaca = get_paca(); int cur_dec; - if (lpaca->lppaca.shared_proc) { - lpaca->lppaca.virtual_decr = val; + if (get_lppaca()->shared_proc) { + get_lppaca()->virtual_decr = val; cur_dec = get_dec(); if (cur_dec > val) HvCall_setVirtualDecr(); Index: working-2.6/arch/powerpc/platforms/iseries/misc.S =================================================================== --- working-2.6.orig/arch/powerpc/platforms/iseries/misc.S 2005-11-23 15:56:22.000000000 +1100 +++ working-2.6/arch/powerpc/platforms/iseries/misc.S 2005-12-22 14:36:59.000000000 +1100 @@ -44,7 +44,8 @@ _GLOBAL(local_irq_restore) /* Check pending interrupts */ /* A decrementer, IPI or PMC interrupt may have occurred * while we were in the hypervisor (which enables) */ - ld r4,PACALPPACA+LPPACAANYINT(r13) + ld r4,PACALPPACAPTR(r13) + ld 
r4,LPPACAANYINT(r4) cmpdi r4,0 beqlr -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From david at gibson.dropbear.id.au Thu Dec 22 17:42:35 2005 From: david at gibson.dropbear.id.au (David Gibson) Date: Thu, 22 Dec 2005 17:42:35 +1100 Subject: RFC: Use bitmaps to track free/user SLB slots Message-ID: <20051222064235.GE9475@localhost.localdomain> This needs way more testing and thought before being considered for merging, but here it is in case people are interested. It implements a new, possibly superior approach to managing SLB entries. Currently, when we take an SLB miss, we just use round-robin to find a slot to put the new entry into - the slot located may or may not already contain a useful translation. When we take an SLB miss on a user address we record its address in a cache of up to 16 entries. On context switch, if the cache hasn't overflowed we use it to just flush the user entries, rather than flushing the whole SLB. With this patch, instead of maintaining the cache and round-robin pointer, we keep a bitmap of free SLB slots, and a bitmap of SLB slots containing user entries. When we take an SLB miss, we find a free slot from the bitmap (using cntlzd) rather than using round robin. We fall back to round robin if we use all free slots (though we do this by manipulating the bitmap, avoiding the need for a separate round robin counter). The SLB miss handler clears the relevant bit in the free slots bitmap and updates the relevant bit in the user slots bitmap. On context switch, we use the user slots bitmap to flush just those slots containing user entries, and those slots are then added to the free slots bitmap. The idea, obviously, is to try to reduce the number of SLB misses by making better use of free SLB slots. 
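For readers who find the assembly below hard to follow, here is a rough user-space model of the bookkeeping described above, written in C rather than taken from the patch. The names slb_model, slot_bit, first_slot and friends are made up for illustration, NUM_BOLTED = 3 is an assumed value, and the loop in first_slot merely stands in for cntlzd. It only tracks the two bitmaps; it does not touch a real SLB.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BOLTED 3	/* assumed count of bolted slots, never handed out */

/* Slot 0 is the most significant bit, matching what cntlzd returns. */
struct slb_model {
	uint64_t free_bitmap;	/* slots holding no useful translation */
	uint64_t user_bitmap;	/* slots holding user entries */
};

static uint64_t slot_bit(unsigned int slot)
{
	assert(slot < 64);
	return (UINT64_C(1) << 63) >> slot;
}

/* Portable stand-in for cntlzd: index of the first set bit from the top. */
static unsigned int first_slot(uint64_t bitmap)
{
	unsigned int slot = 0;

	assert(bitmap != 0);
	while (!(bitmap & (UINT64_C(1) << 63))) {
		bitmap <<= 1;
		slot++;
	}
	return slot;
}

static void slb_model_init(struct slb_model *s)
{
	s->free_bitmap = UINT64_MAX >> NUM_BOLTED;
	s->user_bitmap = 0;
}

/* SLB miss: pick a slot, then update both bitmaps. */
static unsigned int slb_model_miss(struct slb_model *s, int is_user)
{
	unsigned int slot = first_slot(s->free_bitmap);
	uint64_t bit = slot_bit(slot);

	s->free_bitmap &= ~bit;
	s->user_bitmap &= ~bit;

	if (s->free_bitmap == 0) {
		/* No free slots left: mark only the next slot as "free",
		 * which turns the bitmap into a round-robin pointer. */
		s->free_bitmap = bit >> 1;
		if (s->free_bitmap == 0)	/* wrapped past slot 63 */
			s->free_bitmap = slot_bit(NUM_BOLTED);
	}

	if (is_user)
		s->user_bitmap |= bit;
	return slot;
}

/* Context switch: user slots get flushed and become free again. */
static uint64_t slb_model_switch(struct slb_model *s)
{
	uint64_t flushed = s->user_bitmap;

	s->free_bitmap |= flushed;
	s->user_bitmap = 0;
	return flushed;
}
```

In this model the free bitmap is never zero after a miss, so the next miss always finds a slot; once every slot is in use, the single remaining bit walks through the non-bolted slots in order, which is the round-robin fallback mentioned above.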
My preliminary tests (on POWER5 LPAR) seem to indicate that this has essentially no effect (delta<1ns) on the time for a user SLB miss (the cost of the bitmap manipulation is the same as that for maintaining the old slb cache). Time for kernel SLB misses is probably slightly increased; not measured, but I think it should be delta<~5ns. Context switch time may be increased slightly; also not measured yet, but I think it should be <0.5us at most and quite likely negligible in comparison to the rest of a context switch. I've no idea what the impact on SLB miss rates for various workloads might be. Index: working-2.6/arch/powerpc/mm/slb_low.S =================================================================== --- working-2.6.orig/arch/powerpc/mm/slb_low.S 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/mm/slb_low.S 2005-12-22 16:55:05.000000000 +1100 @@ -192,17 +192,8 @@ slb_finish_load: beq 3f #endif /* CONFIG_PPC_ISERIES */ - ld r10,PACASTABRR(r13) - addi r10,r10,1 - /* use a cpu feature mask if we ever change our slb size */ - cmpldi r10,SLB_NUM_ENTRIES - - blt+ 4f - li r10,SLB_NUM_BOLTED - -4: - std r10,PACASTABRR(r13) - + ld r9,PACASLBFREEBITMAP(r13) + cntlzd r10,r9 3: rldimi r3,r10,0,36 /* r3= EA[0:35] | entry */ oris r10,r3,SLB_ESID_V@h /* r3 |= SLB_ESID_V */ @@ -215,26 +206,46 @@ slb_finish_load: */ slbmte r11,r10 - /* we're done for kernel addresses */ - crclr 4*cr0+eq /* set result to "success" */ - bgelr cr7 + ld r3,PACASLBUSERBITMAP(r13) + + li r11,1 + sldi r11,r11,63 /* r11 = 0x8000000000000000 */ + srd r11,r11,r10 + + andc.
r9,r9,r11 + andc r3,r3,r11 + bne 7f - /* Update the slb cache */ - lhz r3,PACASLBCACHEPTR(r13) /* offset = paca->slb_cache_ptr */ - cmpldi r3,SLB_CACHE_ENTRIES - bge 1f - - /* still room in the slb cache */ - sldi r11,r3,1 /* r11 = offset * sizeof(u16) */ - rldicl r10,r10,36,28 /* get low 16 bits of the ESID */ - add r11,r11,r13 /* r11 = (u16 *)paca + offset */ - sth r10,PACASLBCACHE(r11) /* paca->slb_cache[offset] = esid */ - addi r3,r3,1 /* offset++ */ - b 2f -1: /* offset >= SLB_CACHE_ENTRIES */ - li r3,SLB_CACHE_ENTRIES+1 -2: - sth r3,PACASLBCACHEPTR(r13) /* paca->slb_cache_ptr = offset */ + srdi. r9,r11,1 + bne 7f + + li r9,1 + rotrdi r9,r9,SLB_NUM_BOLTED+1 + +7: bge cr7,6f + or r3,r3,r11 + +6: std r9,PACASLBFREEBITMAP(r13) + std r3,PACASLBUSERBITMAP(r13) crclr 4*cr0+eq /* set result to "success" */ blr +/* void slb_flush_user_slots(u64 slots) */ +_GLOBAL(slb_flush_user_slots) + li r6,-1 + srdi r6,r6,1 /* r6 = 0x7fffffffffffffff */ + +1: + cmpldi r3,0 + beqlr /* Nothing left, we're done */ + + cntlzd r4,r3 + slbmfee r5,r4 + /* V bit from slbmfee becomes class bit for slbie, since user + * SLBEs have the class bit set */ + slbie r5 + + srd r7,r6,r4 /* r7 = bits we still care about */ + and r3,r3,r7 + + b 1b Index: working-2.6/include/asm-powerpc/paca.h =================================================================== --- working-2.6.orig/include/asm-powerpc/paca.h 2005-12-22 16:30:32.000000000 +1100 +++ working-2.6/include/asm-powerpc/paca.h 2005-12-22 16:56:33.000000000 +1100 @@ -83,15 +83,14 @@ struct paca_struct { #endif /* CONFIG_PPC_64K_PAGES */ mm_context_t context; - u16 slb_cache[SLB_CACHE_ENTRIES]; - u16 slb_cache_ptr; + u64 slb_free_bitmap; + u64 slb_user_bitmap; /* * then miscellaneous read-write fields */ struct task_struct *__current; /* Pointer to current */ u64 kstack; /* Saved Kernel stack addr */ - u64 stab_rr; /* stab/slb round-robin counter */ u64 saved_r1; /* r1 save for RTAS calls */ u64 saved_msr; /* MSR saved here by enter_rtas */ u8 
proc_enabled; /* irq soft-enable flag */ Index: working-2.6/arch/powerpc/kernel/asm-offsets.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/asm-offsets.c 2005-12-22 16:30:32.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/asm-offsets.c 2005-12-22 16:56:39.000000000 +1100 @@ -117,12 +117,11 @@ int main(void) DEFINE(PACASAVEDMSR, offsetof(struct paca_struct, saved_msr)); DEFINE(PACASTABREAL, offsetof(struct paca_struct, stab_real)); DEFINE(PACASTABVIRT, offsetof(struct paca_struct, stab_addr)); - DEFINE(PACASTABRR, offsetof(struct paca_struct, stab_rr)); DEFINE(PACAR1, offsetof(struct paca_struct, saved_r1)); DEFINE(PACATOC, offsetof(struct paca_struct, kernel_toc)); DEFINE(PACAPROCENABLED, offsetof(struct paca_struct, proc_enabled)); - DEFINE(PACASLBCACHE, offsetof(struct paca_struct, slb_cache)); - DEFINE(PACASLBCACHEPTR, offsetof(struct paca_struct, slb_cache_ptr)); + DEFINE(PACASLBFREEBITMAP, offsetof(struct paca_struct, slb_free_bitmap)); + DEFINE(PACASLBUSERBITMAP, offsetof(struct paca_struct, slb_user_bitmap)); DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id)); #ifdef CONFIG_PPC_64K_PAGES DEFINE(PACAPGDIR, offsetof(struct paca_struct, pgdir)); Index: working-2.6/arch/powerpc/mm/slb.c =================================================================== --- working-2.6.orig/arch/powerpc/mm/slb.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/mm/slb.c 2005-12-22 16:57:10.000000000 +1100 @@ -32,6 +32,7 @@ extern void slb_allocate_realmode(unsigned long ea); extern void slb_allocate_user(unsigned long ea); +extern void slb_flush_user_slots(u64 slots); static void slb_allocate(unsigned long ea) { @@ -92,35 +93,27 @@ static void slb_flush_and_rebolt(void) "r"(mk_vsid_data(ksp_esid_data, lflags)), "r"(ksp_esid_data) : "memory"); + + get_paca()->slb_user_bitmap = 0; } /* Flush all user entries from the segment table of the current processor. 
*/ void switch_slb(struct task_struct *tsk, struct mm_struct *mm) { - unsigned long offset = get_paca()->slb_cache_ptr; - unsigned long esid_data = 0; + u64 slots = get_paca()->slb_user_bitmap; unsigned long pc = KSTK_EIP(tsk); unsigned long stack = KSTK_ESP(tsk); unsigned long unmapped_base; - if (offset <= SLB_CACHE_ENTRIES) { - int i; - asm volatile("isync" : : : "memory"); - for (i = 0; i < offset; i++) { - esid_data = ((unsigned long)get_paca()->slb_cache[i] - << SID_SHIFT) | SLBIE_C; - asm volatile("slbie %0" : : "r" (esid_data)); - } - asm volatile("isync" : : : "memory"); - } else { - slb_flush_and_rebolt(); - } + slb_flush_user_slots(slots); /* Workaround POWER5 < DD2.1 issue */ - if (offset == 1 || offset > SLB_CACHE_ENTRIES) - asm volatile("slbie %0" : : "r" (esid_data)); + if (slots) { + asm volatile("slbie %0" : : "r" (0xa000000000000000)); + get_paca()->slb_free_bitmap |= slots; + } - get_paca()->slb_cache_ptr = 0; + get_paca()->slb_user_bitmap = 0; get_paca()->context = mm->context; #ifdef CONFIG_PPC_64K_PAGES get_paca()->pgdir = mm->pgd; @@ -225,6 +218,4 @@ void slb_initialize(void) asm volatile("isync":::"memory"); } #endif /* CONFIG_PPC_ISERIES */ - - get_paca()->stab_rr = SLB_NUM_BOLTED; } Index: working-2.6/arch/powerpc/kernel/paca.c =================================================================== --- working-2.6.orig/arch/powerpc/kernel/paca.c 2005-12-22 16:30:32.000000000 +1100 +++ working-2.6/arch/powerpc/kernel/paca.c 2005-12-22 16:30:32.000000000 +1100 @@ -64,7 +64,8 @@ struct lppaca lppaca[] = { .stab_real = (asrr), /* Real pointer to segment table */ \ .stab_addr = (asrv), /* Virt pointer to segment table */ \ .cpu_start = (start), /* Processor start */ \ - .hw_cpu_id = 0xffff, + .hw_cpu_id = 0xffff, \ + .slb_free_bitmap = (-1UL >> SLB_NUM_BOLTED), #ifdef CONFIG_PPC_ISERIES #define PACA_INIT_ISERIES(number) \ Index: working-2.6/arch/powerpc/mm/stab.c =================================================================== --- 
working-2.6.orig/arch/powerpc/mm/stab.c 2005-12-19 14:18:24.000000000 +1100 +++ working-2.6/arch/powerpc/mm/stab.c 2005-12-22 16:58:39.000000000 +1100 @@ -28,6 +28,7 @@ struct stab_entry { }; #define NR_STAB_CACHE_ENTRIES 8 +DEFINE_PER_CPU(unsigned long, stab_rr); DEFINE_PER_CPU(long, stab_cache_ptr); DEFINE_PER_CPU(long, stab_cache[NR_STAB_CACHE_ENTRIES]); @@ -70,7 +71,7 @@ static int make_ste(unsigned long stab, * Could not find empty entry, pick one with a round robin selection. * Search all entries in the two groups. */ - castout_entry = get_paca()->stab_rr; + castout_entry = __get_cpu_var(stab_rr); for (i = 0; i < 16; i++) { if (castout_entry < 8) { global_entry = (esid & 0x1f) << 3; @@ -89,7 +90,7 @@ static int make_ste(unsigned long stab, castout_entry = (castout_entry + 1) & 0xf; } - get_paca()->stab_rr = (castout_entry + 1) & 0xf; + __get_cpu_var(stab_rr) = (castout_entry + 1) & 0xf; /* Modify the old entry to the new value. */ -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson From paulus at samba.org Thu Dec 22 21:49:34 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 22 Dec 2005 21:49:34 +1100 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <20051221175628.GA29363@suse.de> References: <20051220204530.GA26351@suse.de> <20051220214525.GB7428@pb15.lixom.net> <20051221175628.GA29363@suse.de> Message-ID: <17322.33982.22166.437385@cargo.ozlabs.ibm.com> Olaf Hering writes: > I finally managed to find the culprit. > > good: 25635c71e44111a6bd48f342e144e2fc02d0a314 > bad: f9bd170a87948a9e077149b70fb192c563770fdf > > ... > powerpc: Merge i8259.c into arch/powerpc/sysdev > > This changes the parameters for i8259_init so that it takes two > parameters: a physical address for generating an interrupt > acknowledge cycle, and an interrupt number offset. 
i8259_init > now sets the irq_desc[] for its interrupts; all the callers > were doing this, and that code is gone now. This also defines > a CONFIG_PPC_I8259 symbol to select i8259.o for inclusion, and > makes the platforms that need it select that symbol. Try this patch... it fixes things on the p630 at work. Paul. diff -urN linux-2.6/arch/powerpc/platforms/pseries/xics.c powerpc-merge/arch/powerpc/platforms/pseries/xics.c --- linux-2.6/arch/powerpc/platforms/pseries/xics.c 2005-11-14 10:33:54.000000000 +1100 +++ powerpc-merge/arch/powerpc/platforms/pseries/xics.c 2005-12-22 13:17:53.000000000 +1100 @@ -48,11 +48,6 @@ .set_affinity = xics_set_affinity }; -static struct hw_interrupt_type xics_8259_pic = { - .typename = " XICS/8259", - .ack = xics_mask_and_ack_irq, -}; - /* This is used to map real irq numbers to virtual */ static struct radix_tree_root irq_map = RADIX_TREE_INIT(GFP_ATOMIC); @@ -367,12 +362,7 @@ /* for sanity, this had better be < NR_IRQS - 16 */ if (vec == xics_irq_8259_cascade_real) { irq = i8259_irq(regs); - if (irq == -1) { - /* Spurious cascaded interrupt. 
Still must ack xics */ - xics_end_irq(irq_offset_up(xics_irq_8259_cascade)); - - irq = -1; - } + xics_end_irq(irq_offset_up(xics_irq_8259_cascade)); } else if (vec == XICS_IRQ_SPURIOUS) { irq = -1; } else { @@ -542,6 +532,7 @@ xics_irq_8259_cascade_real = *ireg; xics_irq_8259_cascade = virt_irq_create_mapping(xics_irq_8259_cascade_real); + i8259_init(0, 0); of_node_put(np); } @@ -565,12 +556,7 @@ #endif /* CONFIG_SMP */ } - xics_8259_pic.enable = i8259_pic.enable; - xics_8259_pic.disable = i8259_pic.disable; - xics_8259_pic.end = i8259_pic.end; - for (i = 0; i < 16; ++i) - get_irq_desc(i)->handler = &xics_8259_pic; - for (; i < NR_IRQS; ++i) + for (i = irq_offset_value(); i < NR_IRQS; ++i) get_irq_desc(i)->handler = &xics_pic; xics_setup_cpu(); @@ -590,7 +576,6 @@ no_action, 0, "8259 cascade", NULL)) printk(KERN_ERR "xics_setup_i8259: couldn't get 8259 " "cascade\n"); - i8259_init(0, 0); } return 0; } From paulus at samba.org Thu Dec 22 22:46:45 2005 From: paulus at samba.org (Paul Mackerras) Date: Thu, 22 Dec 2005 22:46:45 +1100 Subject: please pull powerpc-merge.git Message-ID: <17322.37413.147381.320624@cargo.ozlabs.ibm.com> Linus, Please pull git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge.git There is one commit there, which fixes a bug which was preventing the serial port (or any ISA device) from working on POWER4 systems running without a hypervisor. arch/powerpc/platforms/pseries/xics.c | 21 +++------------------ 1 files changed, 3 insertions(+), 18 deletions(-) commit 8b1af56b29b9b81538b4d0d4fd9515618618ead1 Author: Paul Mackerras Date: Thu Dec 22 21:55:37 2005 +1100 powerpc: Fix i8259 cascade on pSeries with XICS interrupt controller It turns out that commit f9bd170a87948a9e077149b70fb192c563770fdf broke the cascade from XICS to i8259 on pSeries machines; specifically we ended up not ever doing the EOI on the XICS for the cascade. 
The result was that interrupts from the serial ports (and presumably any other devices using ISA interrupts) didn't get through. This fixes it and also simplifies the code, by doing the EOI on the XICS in the xics_get_irq routine after reading and acking the interrupt on the i8259. Signed-off-by: Paul Mackerras diff --git a/arch/powerpc/platforms/pseries/xics.c b/arch/powerpc/platforms/pseries/xics.c index 72ac180..0377dec 100644 --- a/arch/powerpc/platforms/pseries/xics.c +++ b/arch/powerpc/platforms/pseries/xics.c @@ -48,11 +48,6 @@ static struct hw_interrupt_type xics_pic .set_affinity = xics_set_affinity }; -static struct hw_interrupt_type xics_8259_pic = { - .typename = " XICS/8259", - .ack = xics_mask_and_ack_irq, -}; - /* This is used to map real irq numbers to virtual */ static struct radix_tree_root irq_map = RADIX_TREE_INIT(GFP_ATOMIC); @@ -367,12 +362,7 @@ int xics_get_irq(struct pt_regs *regs) /* for sanity, this had better be < NR_IRQS - 16 */ if (vec == xics_irq_8259_cascade_real) { irq = i8259_irq(regs); - if (irq == -1) { - /* Spurious cascaded interrupt. 
Still must ack xics */ - xics_end_irq(irq_offset_up(xics_irq_8259_cascade)); - - irq = -1; - } + xics_end_irq(irq_offset_up(xics_irq_8259_cascade)); } else if (vec == XICS_IRQ_SPURIOUS) { irq = -1; } else { @@ -542,6 +532,7 @@ nextnode: xics_irq_8259_cascade_real = *ireg; xics_irq_8259_cascade = virt_irq_create_mapping(xics_irq_8259_cascade_real); + i8259_init(0, 0); of_node_put(np); } @@ -565,12 +556,7 @@ nextnode: #endif /* CONFIG_SMP */ } - xics_8259_pic.enable = i8259_pic.enable; - xics_8259_pic.disable = i8259_pic.disable; - xics_8259_pic.end = i8259_pic.end; - for (i = 0; i < 16; ++i) - get_irq_desc(i)->handler = &xics_8259_pic; - for (; i < NR_IRQS; ++i) + for (i = irq_offset_value(); i < NR_IRQS; ++i) get_irq_desc(i)->handler = &xics_pic; xics_setup_cpu(); @@ -590,7 +576,6 @@ static int __init xics_setup_i8259(void) no_action, 0, "8259 cascade", NULL)) printk(KERN_ERR "xics_setup_i8259: couldn't get 8259 " "cascade\n"); - i8259_init(0, 0); } return 0; } From olh at suse.de Thu Dec 22 23:04:24 2005 From: olh at suse.de (Olaf Hering) Date: Thu, 22 Dec 2005 13:04:24 +0100 Subject: console on POWER4 not working with 2.6.15 In-Reply-To: <17322.33982.22166.437385@cargo.ozlabs.ibm.com> References: <20051220204530.GA26351@suse.de> <20051220214525.GB7428@pb15.lixom.net> <20051221175628.GA29363@suse.de> <17322.33982.22166.437385@cargo.ozlabs.ibm.com> Message-ID: <20051222120424.GA24475@suse.de> On Thu, Dec 22, Paul Mackerras wrote: > Olaf Hering writes: > > > I finally managed to find the culprit. > > > > good: 25635c71e44111a6bd48f342e144e2fc02d0a314 > > bad: f9bd170a87948a9e077149b70fb192c563770fdf > > > > ... > > powerpc: Merge i8259.c into arch/powerpc/sysdev > > > > This changes the parameters for i8259_init so that it takes two > > parameters: a physical address for generating an interrupt > > acknowledge cycle, and an interrupt number offset.
i8259_init > > now sets the irq_desc[] for its interrupts; all the callers > > were doing this, and that code is gone now. This also defines > > a CONFIG_PPC_I8259 symbol to select i8259.o for inclusion, and > > makes the platforms that need it select that symbol. > > Try this patch... it fixes things on the p630 at work. This fixes it also for me. Thanks. -- short story of a lazy sysadmin: alias appserv=wotan From olof at lixom.net Fri Dec 23 02:31:57 2005 From: olof at lixom.net (Olof Johansson) Date: Thu, 22 Dec 2005 09:31:57 -0600 Subject: RFC: Use bitmaps to track free/user SLB slots In-Reply-To: <20051222064235.GE9475@localhost.localdomain> References: <20051222064235.GE9475@localhost.localdomain> Message-ID: <20051222153156.GA24601@pb15.lixom.net> On Thu, Dec 22, 2005 at 05:42:35PM +1100, David Gibson wrote: > This needs way more testing and thought before being considered for > merging, but here it is in case people are interested. It implements > a new, possibly superior approach to managing SLB entries. [...] > My preliminary tests (on POWER5 LPAR) seem to indicate that this has > essentially no effect (delta<1ns) on the time for a user SLB miss (the > cost of the bitmap manipulation is the same as that for maintaining > the old slb cache). Time for kernel SLB misses is probably slightly > increased; not measured, but I think it should be delta<~5ns. Context > switch time may be increased slightly; also not measured yet, but I > think it should be <0.5us at most and quite likely negligible in > comparison to the rest of a context switch. I've no idea what the > impact on SLB miss rates for various workloads might be. So, essentially what you're saying is that it's more complex than the older one, slightly slower in the execution path and it has no proven benefit? Until we see numbers where the old code causes too many SLB misses, and this patch reduces the miss rate, this is just unneeded complexity and over-engineering.
But I'd be happy to be proven wrong on this. I'll have a read-through of the code as well, but I need some coffee before that, especially given the lack of comments in the new assembly. :-) -Olof From galak at kernel.crashing.org Fri Dec 23 02:52:51 2005 From: galak at kernel.crashing.org (Kumar Gala) Date: Thu, 22 Dec 2005 09:52:51 -0600 Subject: [RFC] powerpc: Merge 32/64 cacheflush code In-Reply-To: <5648700F-9B2F-49F0-ADA8-278328C4CF17@kernel.crashing.org> References: <20051219054410.GB13285@localhost.localdomain> <20051219235455.GB29993@localhost.localdomain> <5648700F-9B2F-49F0-ADA8-278328C4CF17@kernel.crashing.org> Message-ID: <2E963BF2-D5EA-40BF-9614-1F2A686F6C00@kernel.crashing.org> David, Never saw a response to my query. - k On Dec 20, 2005, at 8:00 AM, Kumar Gala wrote: > > On Dec 19, 2005, at 5:54 PM, David Gibson wrote: > >> On Mon, Dec 19, 2005 at 09:30:37AM -0600, Kumar Gala wrote: >>> >>> On Dec 18, 2005, at 11:44 PM, David Gibson wrote: >>> >>>> Paulus et al, I think the patch below is roughly the right way >>>> to go, >>>> but it needs much more review and testing (at present it's been >>>> cursorily tested on ppc64 pSeries only). >>>> >>>> This patch merges the cache flushing code for 32 and 64 bit powerpc >>>> machines. This means the ppc64_caches mechanism for determining >>>> correct cache sizes at runtime is ported to 32-bit, and is thus >>>> renamed as 'powerpc_caches'. The merged cache flushing >>>> functions go >>>> in new file arch/powerpc/kernel/cache.S. >>> >>> Why dont we just use the cache line information in the cputable? >>> Why >>> the introduction of this new powerpc_caches structure? >> >> Because the device tree can override the information from the >> cputable. Oh, and the structure is only new for ppc32. > > If the device tree overrides the cputable should we not believe > it? I guess I dont understand why we need the same information in > multiple places? 
> > - kumar > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc64-dev From gregkh at suse.de Fri Dec 23 07:22:59 2005 From: gregkh at suse.de (Greg KH) Date: Thu, 22 Dec 2005 12:22:59 -0800 Subject: [PATCH 0/3] msi abstractions and support for altix In-Reply-To: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> Message-ID: <20051222202259.GA4959@suse.de> On Thu, Dec 22, 2005 at 02:15:44PM -0600, Mark Maule wrote: > Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer I'll wait for Resend #3 based on my previous comments before considering adding it to my kernel trees:) thanks, greg k-h From gregkh at suse.de Fri Dec 23 07:34:15 2005 From: gregkh at suse.de (Greg KH) Date: Thu, 22 Dec 2005 12:34:15 -0800 Subject: [PATCH 0/3] msi abstractions and support for altix In-Reply-To: <20051222202627.GI17552@sgi.com> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> <20051222202259.GA4959@suse.de> <20051222202627.GI17552@sgi.com> Message-ID: <20051222203415.GA28240@suse.de> On Thu, Dec 22, 2005 at 02:26:27PM -0600, Mark Maule wrote: > On Thu, Dec 22, 2005 at 12:22:59PM -0800, Greg KH wrote: > > On Thu, Dec 22, 2005 at 02:15:44PM -0600, Mark Maule wrote: > > > Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer > > > > I'll wait for Resend #3 based on my previous comments before considering > > adding it to my kernel trees:) > > > > Resend #2 includes the correction to the irq_vector[] declaration, and I > responded to the question about setting irq_vector[0] if that's what you > mean ... Sorry, but I missed that last response. Why do you set the [0] value in a #ifdef now? 
thanks,

greg k-h

From matthew at wil.cx Fri Dec 23 07:50:23 2005
From: matthew at wil.cx (Matthew Wilcox)
Date: Thu, 22 Dec 2005 13:50:23 -0700
Subject: [PATCH 0/3] msi abstractions and support for altix
In-Reply-To: <20051222203824.GJ17552@sgi.com>
References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> <20051222202259.GA4959@suse.de> <20051222202627.GI17552@sgi.com> <20051222203415.GA28240@suse.de> <20051222203824.GJ17552@sgi.com>
Message-ID: <20051222205023.GK2361@parisc-linux.org>

On Thu, Dec 22, 2005 at 02:38:24PM -0600, Mark Maule wrote:
> Because on ia64 IA64_FIRST_DEVICE_VECTOR and IA64_LAST_DEVICE_VECTOR
> (from which MSI FIRST_DEVICE_VECTOR/LAST_DEVICE_VECTOR are derived) are not
> constants. They are now global variables (see change to asm-ia64/hw_irq.h)
> to allow the platform to override them. Altix uses a reduced range of
> vectors for devices, and this change was necessary to make assign_irq_vector()
> to work on altix.

To be honest, I think this is just adding a third layer of paper over the crack in the wall. The original code assumed x86; the ia64 port added enough emulation to make it look like x86 and now altix fixes a couple of assumptions.

I say: bleh. What we actually need is an interface provided by the architecture that allocates a new irq. I have a hankering to implement MSI on PA-RISC but haven't found the time ...

From maule at sgi.com Fri Dec 23 07:15:44 2005
From: maule at sgi.com (Mark Maule)
Date: Thu, 22 Dec 2005 14:15:44 -0600 (CST)
Subject: [PATCH 0/3] msi abstractions and support for altix
Message-ID: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com>

Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer

Patch set to abstract portions of the MSI core so that it can be used on architectures which don't use standard interrupt controllers.
Changes from last version based on review comments:

+ Change uintXX_t to uXX
+ Change _callouts to _ops
+ Renamed the _generic routines to _apic and moved them to a new file msi-apic.c
+ Have each msi_arch_init() routine call msi_register() with the desired msi ops for that platform.
+ Moved msi_address, msi_data, and related defs out of msi.h and into msi-apic.c, replaced by shifts/masks.
+ Rolled msi-arch-init.patch and msi-callouts.patch into a single msi-ops.patch

Note: I don't have ia32 or non-altix ia64 gear with MSI capable cards. Could someone give this a sanity check to make sure I didn't break the redefinition of the former msi_address/msi_data bits?

Mark

1/3 msi-ops.patch
Add an msi_arch_init() hook which can be used to perform platform specific setup prior to msi use.

Define a set of msi ops to implement the platform-specific tasks:

    setup - set up plumbing to get a vector directed at a default cpu, and return the corresponding MSI bus address and data.
    teardown - inverse of msi_setup
    target - retarget a vector to a given cpu

Define the routine msi_register() called from msi_arch_init() to set the desired ops.

Move a bunch of apic-specific code out of the msi core .h/.c and into a new msi-apic.c file.

2/3 ia64-per-platform-device-vector.patch
For the ia64 arch, allow per-platform definitions of IA64_FIRST_DEVICE_VECTOR and IA64_LAST_DEVICE_VECTOR.

3/3 msi-altix.patch
Altix specific callouts to implement MSI.

From maule at sgi.com Fri Dec 23 07:15:49 2005
From: maule at sgi.com (Mark Maule)
Date: Thu, 22 Dec 2005 14:15:49 -0600 (CST)
Subject: [PATCH 1/3] msi vector targeting abstractions
In-Reply-To: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com>
References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com>
Message-ID: <20051222201657.2019.69251.48815@lnx-maule.americas.sgi.com>

Abstract portions of the MSI core for platforms that do not use standard APIC interrupt controllers.
This is implemented through a new arch-specific msi setup routine, and a set of msi ops which can be set on a per platform basis. Signed-off-by: Mark Maule Index: msi/drivers/pci/msi.c =================================================================== --- msi.orig/drivers/pci/msi.c 2005-12-13 12:22:42.784269607 -0600 +++ msi/drivers/pci/msi.c 2005-12-21 22:59:09.200800164 -0600 @@ -23,8 +23,6 @@ #include "pci.h" #include "msi.h" -#define MSI_TARGET_CPU first_cpu(cpu_online_map) - static DEFINE_SPINLOCK(msi_lock); static struct msi_desc* msi_desc[NR_IRQS] = { [0 ... NR_IRQS-1] = NULL }; static kmem_cache_t* msi_cachep; @@ -40,6 +38,15 @@ u8 irq_vector[NR_IRQ_VECTORS] = { FIRST_DEVICE_VECTOR , 0 }; #endif +static struct msi_ops *msi_ops; + +int +msi_register(struct msi_ops *ops) +{ + msi_ops = ops; + return 0; +} + static void msi_cache_ctor(void *p, kmem_cache_t *cache, unsigned long flags) { memset(p, 0, NR_IRQS * sizeof(struct msi_desc)); @@ -92,7 +99,7 @@ static void set_msi_affinity(unsigned int vector, cpumask_t cpu_mask) { struct msi_desc *entry; - struct msg_address address; + u32 address_hi, address_lo; unsigned int irq = vector; unsigned int dest_cpu = first_cpu(cpu_mask); @@ -108,28 +115,36 @@ if (!(pos = pci_find_capability(entry->dev, PCI_CAP_ID_MSI))) return; + pci_read_config_dword(entry->dev, msi_upper_address_reg(pos), + &address_hi); pci_read_config_dword(entry->dev, msi_lower_address_reg(pos), - &address.lo_address.value); - address.lo_address.value &= MSI_ADDRESS_DEST_ID_MASK; - address.lo_address.value |= (cpu_physical_id(dest_cpu) << - MSI_TARGET_CPU_SHIFT); - entry->msi_attrib.current_cpu = cpu_physical_id(dest_cpu); + &address_lo); + + msi_ops->target(vector, dest_cpu, &address_hi, &address_lo); + + pci_write_config_dword(entry->dev, msi_upper_address_reg(pos), + address_hi); pci_write_config_dword(entry->dev, msi_lower_address_reg(pos), - address.lo_address.value); + address_lo); set_native_irq_info(irq, cpu_mask); break; } case 
PCI_CAP_ID_MSIX: { - int offset = entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE + - PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET; + int offset_hi = + entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE + + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET; + int offset_lo = + entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE + + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET; + + address_hi = readl(entry->mask_base + offset_hi); + address_lo = readl(entry->mask_base + offset_lo); - address.lo_address.value = readl(entry->mask_base + offset); - address.lo_address.value &= MSI_ADDRESS_DEST_ID_MASK; - address.lo_address.value |= (cpu_physical_id(dest_cpu) << - MSI_TARGET_CPU_SHIFT); - entry->msi_attrib.current_cpu = cpu_physical_id(dest_cpu); - writel(address.lo_address.value, entry->mask_base + offset); + msi_ops->target(vector, dest_cpu, &address_hi, &address_lo); + + writel(address_hi, entry->mask_base + offset_hi); + writel(address_lo, entry->mask_base + offset_lo); set_native_irq_info(irq, cpu_mask); break; } @@ -249,30 +264,6 @@ .set_affinity = set_msi_irq_affinity }; -static void msi_data_init(struct msg_data *msi_data, - unsigned int vector) -{ - memset(msi_data, 0, sizeof(struct msg_data)); - msi_data->vector = (u8)vector; - msi_data->delivery_mode = MSI_DELIVERY_MODE; - msi_data->level = MSI_LEVEL_MODE; - msi_data->trigger = MSI_TRIGGER_MODE; -} - -static void msi_address_init(struct msg_address *msi_address) -{ - unsigned int dest_id; - unsigned long dest_phys_id = cpu_physical_id(MSI_TARGET_CPU); - - memset(msi_address, 0, sizeof(struct msg_address)); - msi_address->hi_address = (u32)0; - dest_id = (MSI_ADDRESS_HEADER << MSI_ADDRESS_HEADER_SHIFT); - msi_address->lo_address.u.dest_mode = MSI_PHYSICAL_MODE; - msi_address->lo_address.u.redirection_hint = MSI_REDIRECTION_HINT_MODE; - msi_address->lo_address.u.dest_id = dest_id; - msi_address->lo_address.value |= (dest_phys_id << MSI_TARGET_CPU_SHIFT); -} - static int msi_free_vector(struct pci_dev* dev, int vector, int reassign); static int 
assign_msi_vector(void) { @@ -367,6 +358,20 @@ return status; } + if ((status = msi_arch_init()) < 0) { + pci_msi_enable = 0; + printk(KERN_WARNING + "PCI: MSI arch init failed. MSI disabled.\n"); + return status; + } + + if (! msi_ops) { + printk(KERN_WARNING + "PCI: MSI ops not registered. MSI disabled.\n"); + status = -EINVAL; + return status; + } + if ((status = msi_cache_init()) < 0) { pci_msi_enable = 0; printk(KERN_WARNING "PCI: MSI cache init failed\n"); @@ -510,9 +515,11 @@ **/ static int msi_capability_init(struct pci_dev *dev) { + int status; struct msi_desc *entry; - struct msg_address address; - struct msg_data data; + u32 address_lo; + u32 address_hi; + u32 data; int pos, vector; u16 control; @@ -539,23 +546,26 @@ entry->mask_base = (void __iomem *)(long)msi_mask_bits_reg(pos, is_64bit_address(control)); } + /* Configure MSI capability structure */ + status = msi_ops->setup(dev, vector, + &address_hi, + &address_lo, + &data); + if (status < 0) { + kmem_cache_free(msi_cachep, entry); + return status; + } /* Replace with MSI handler */ irq_handler_init(PCI_CAP_ID_MSI, vector, entry->msi_attrib.maskbit); - /* Configure MSI capability structure */ - msi_address_init(&address); - msi_data_init(&data, vector); - entry->msi_attrib.current_cpu = ((address.lo_address.u.dest_id >> - MSI_TARGET_CPU_SHIFT) & MSI_TARGET_CPU_MASK); - pci_write_config_dword(dev, msi_lower_address_reg(pos), - address.lo_address.value); + + pci_write_config_dword(dev, msi_lower_address_reg(pos), address_lo); if (is_64bit_address(control)) { pci_write_config_dword(dev, - msi_upper_address_reg(pos), address.hi_address); - pci_write_config_word(dev, - msi_data_reg(pos, 1), *((u32*)&data)); + msi_upper_address_reg(pos), address_hi); + pci_write_config_word(dev, msi_data_reg(pos, 1), data); } else - pci_write_config_word(dev, - msi_data_reg(pos, 0), *((u32*)&data)); + pci_write_config_word(dev, msi_data_reg(pos, 0), data); + if (entry->msi_attrib.maskbit) { unsigned int maskbits, temp; /* 
All MSIs are unmasked by default, Mask them all */ @@ -590,13 +600,15 @@ struct msix_entry *entries, int nvec) { struct msi_desc *head = NULL, *tail = NULL, *entry = NULL; - struct msg_address address; - struct msg_data data; + u32 address_hi; + u32 address_lo; + u32 data; int vector, pos, i, j, nr_entries, temp = 0; u32 phys_addr, table_offset; u16 control; u8 bir; void __iomem *base; + int status; pos = pci_find_capability(dev, PCI_CAP_ID_MSIX); /* Request & Map MSI-X table region */ @@ -643,18 +655,20 @@ /* Replace with MSI-X handler */ irq_handler_init(PCI_CAP_ID_MSIX, vector, 1); /* Configure MSI-X capability structure */ - msi_address_init(&address); - msi_data_init(&data, vector); - entry->msi_attrib.current_cpu = - ((address.lo_address.u.dest_id >> - MSI_TARGET_CPU_SHIFT) & MSI_TARGET_CPU_MASK); - writel(address.lo_address.value, + status = msi_ops->setup(dev, vector, + &address_hi, + &address_lo, + &data); + if (status < 0) + break; + + writel(address_lo, base + j * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET); - writel(address.hi_address, + writel(address_hi, base + j * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET); - writel(*(u32*)&data, + writel(data, base + j * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_DATA_OFFSET); attach_msi_entry(entry, vector); @@ -789,6 +803,8 @@ void __iomem *base; unsigned long flags; + msi_ops->teardown(vector); + spin_lock_irqsave(&msi_lock, flags); entry = msi_desc[vector]; if (!entry || entry->dev != dev) { Index: msi/include/asm-i386/msi.h =================================================================== --- msi.orig/include/asm-i386/msi.h 2005-12-13 12:22:42.785246074 -0600 +++ msi/include/asm-i386/msi.h 2005-12-21 23:18:10.602753071 -0600 @@ -12,4 +12,11 @@ #define LAST_DEVICE_VECTOR 232 #define MSI_TARGET_CPU_SHIFT 12 +static inline int msi_arch_init(void) +{ + extern struct msi_ops msi_apic_ops; + msi_register(&msi_apic_ops); + return 0; +} + #endif /* ASM_MSI_H */ Index: msi/include/asm-x86_64/msi.h 
=================================================================== --- msi.orig/include/asm-x86_64/msi.h 2005-12-13 12:22:42.786222541 -0600 +++ msi/include/asm-x86_64/msi.h 2005-12-21 23:18:20.733573956 -0600 @@ -13,4 +13,11 @@ #define LAST_DEVICE_VECTOR 232 #define MSI_TARGET_CPU_SHIFT 12 +static inline int msi_arch_init(void) +{ + extern struct msi_ops msi_apic_ops; + msi_register(&msi_apic_ops); + return 0; +} + #endif /* ASM_MSI_H */ Index: msi/include/asm-ia64/machvec.h =================================================================== --- msi.orig/include/asm-ia64/machvec.h 2005-12-13 12:22:42.786222541 -0600 +++ msi/include/asm-ia64/machvec.h 2005-12-21 23:18:32.231445414 -0600 @@ -74,6 +74,7 @@ typedef unsigned short ia64_mv_readw_relaxed_t (const volatile void __iomem *); typedef unsigned int ia64_mv_readl_relaxed_t (const volatile void __iomem *); typedef unsigned long ia64_mv_readq_relaxed_t (const volatile void __iomem *); +typedef int ia64_mv_msi_init_t (void); static inline void machvec_noop (void) @@ -146,6 +147,7 @@ # define platform_readw_relaxed ia64_mv.readw_relaxed # define platform_readl_relaxed ia64_mv.readl_relaxed # define platform_readq_relaxed ia64_mv.readq_relaxed +# define platform_msi_init ia64_mv.msi_init # endif /* __attribute__((__aligned__(16))) is required to make size of the @@ -194,6 +196,7 @@ ia64_mv_readw_relaxed_t *readw_relaxed; ia64_mv_readl_relaxed_t *readl_relaxed; ia64_mv_readq_relaxed_t *readq_relaxed; + ia64_mv_msi_init_t *msi_init; } __attribute__((__aligned__(16))); /* align attrib?
see above comment */ #define MACHVEC_INIT(name) \ @@ -238,6 +241,7 @@ platform_readw_relaxed, \ platform_readl_relaxed, \ platform_readq_relaxed, \ + platform_msi_init, \ } extern struct ia64_machine_vector ia64_mv; @@ -386,5 +390,9 @@ #ifndef platform_readq_relaxed # define platform_readq_relaxed __ia64_readq_relaxed #endif +#ifndef platform_msi_init +# define platform_msi_init { extern struct msi_ops msi_apic_ops; \ + msi_register(&msi_apic_ops); return 0; } +#endif #endif /* _ASM_IA64_MACHVEC_H */ Index: msi/include/asm-ia64/machvec_sn2.h =================================================================== --- msi.orig/include/asm-ia64/machvec_sn2.h 2005-12-13 12:22:42.787199008 -0600 +++ msi/include/asm-ia64/machvec_sn2.h 2005-12-13 16:09:49.257035213 -0600 @@ -71,6 +71,7 @@ extern ia64_mv_dma_sync_sg_for_device sn_dma_sync_sg_for_device; extern ia64_mv_dma_mapping_error sn_dma_mapping_error; extern ia64_mv_dma_supported sn_dma_supported; +extern ia64_mv_msi_init_t sn_msi_init; /* * This stuff has dual use! @@ -120,6 +121,7 @@ #define platform_dma_sync_sg_for_device sn_dma_sync_sg_for_device #define platform_dma_mapping_error sn_dma_mapping_error #define platform_dma_supported sn_dma_supported +#define platform_msi_init sn_msi_init #include Index: msi/include/asm-ia64/msi.h =================================================================== --- msi.orig/include/asm-ia64/msi.h 2005-12-13 12:22:42.787199008 -0600 +++ msi/include/asm-ia64/msi.h 2005-12-13 16:09:49.268752815 -0600 @@ -14,4 +14,6 @@ #define ack_APIC_irq ia64_eoi #define MSI_TARGET_CPU_SHIFT 4 +static inline int msi_arch_init(void) { return platform_msi_init(); } + #endif /* ASM_MSI_H */ Index: msi/arch/ia64/sn/pci/Makefile =================================================================== --- msi.orig/arch/ia64/sn/pci/Makefile 2005-12-13 12:22:42.788175474 -0600 +++ msi/arch/ia64/sn/pci/Makefile 2005-12-13 16:09:49.296093887 -0600 @@ -3,8 +3,9 @@ # License. 
See the file "COPYING" in the main directory of this archive # for more details. # -# Copyright (C) 2000-2004 Silicon Graphics, Inc. All Rights Reserved. +# Copyright (C) 2000-2005 Silicon Graphics, Inc. All Rights Reserved. # # Makefile for the sn pci general routines. obj-y := pci_dma.o tioca_provider.o tioce_provider.o pcibr/ +obj-$(CONFIG_PCI_MSI) += msi.o Index: msi/arch/ia64/sn/pci/msi.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ msi/arch/ia64/sn/pci/msi.c 2005-12-21 22:59:02.713172526 -0600 @@ -0,0 +1,18 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2005 Silicon Graphics, Inc. All Rights Reserved. + */ + +#include + +int +sn_msi_init(void) +{ + /* + * return error until MSI is supported on altix platforms + */ + return -EINVAL; +} Index: msi/drivers/pci/Makefile =================================================================== --- msi.orig/drivers/pci/Makefile 2005-12-21 16:08:58.222841575 -0600 +++ msi/drivers/pci/Makefile 2005-12-21 16:10:32.848440390 -0600 @@ -23,7 +23,7 @@ obj-$(CONFIG_PPC64) += setup-bus.o obj-$(CONFIG_MIPS) += setup-bus.o setup-irq.o obj-$(CONFIG_X86_VISWS) += setup-irq.o -obj-$(CONFIG_PCI_MSI) += msi.o +obj-$(CONFIG_PCI_MSI) += msi.o msi-apic.o # # ACPI Related PCI FW Functions Index: msi/drivers/pci/msi-apic.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ msi/drivers/pci/msi-apic.c 2005-12-22 11:09:37.022232088 -0600 @@ -0,0 +1,102 @@ +/* + * MSI hooks for standard x86 apic + */ + +#include + +#include "msi.h" + +/* + * Shifts for APIC-based data + */ + +#define MSI_DATA_VECTOR_SHIFT 0 +#define MSI_DATA_VECTOR(v) (((u8)v) << MSI_DATA_VECTOR_SHIFT) + +#define MSI_DATA_DELIVERY_SHIFT 8 +#define MSI_DATA_DELIVERY_FIXED (0 << 
MSI_DATA_DELIVERY_SHIFT) +#define MSI_DATA_DELIVERY_LOWPRI (1 << MSI_DATA_DELIVERY_SHIFT) + +#define MSI_DATA_LEVEL_SHIFT 14 +#define MSI_DATA_LEVEL_DEASSERT (0 << MSI_DATA_LEVEL_SHIFT) +#define MSI_DATA_LEVEL_ASSERT (1 << MSI_DATA_LEVEL_SHIFT) + +#define MSI_DATA_TRIGGER_SHIFT 15 +#define MSI_DATA_TRIGGER_EDGE (0 << MSI_DATA_TRIGGER_SHIFT) +#define MSI_DATA_TRIGGER_LEVEL (1 << MSI_DATA_TRIGGER_SHIFT) + +/* + * Shift/mask fields for APIC-based bus address + */ + +#define MSI_ADDR_HEADER 0xfee00000 + +#define MSI_ADDR_DESTID_MASK 0xfff0000f +#define MSI_ADDR_DESTID_CPU(cpu) ((cpu) << MSI_TARGET_CPU_SHIFT) + +#define MSI_ADDR_DESTMODE_SHIFT 2 +#define MSI_ADDR_DESTMODE_PHYS (0 << MSI_ADDR_DESTMODE_SHIFT) +#define MSI_ADDR_DESTMODE_LOGIC (1 << MSI_ADDR_DESTMODE_SHIFT) + +#define MSI_ADDR_REDIRECTION_SHIFT 3 +#define MSI_ADDR_REDIRECTION_CPU (0 << MSI_ADDR_REDIRECTION_SHIFT) +#define MSI_ADDR_REDIRECTION_LOWPRI (1 << MSI_ADDR_REDIRECTION_SHIFT) + + +static void +msi_target_apic(unsigned int vector, + unsigned int dest_cpu, + u32 *address_hi, /* in/out */ + u32 *address_lo) /* in/out */ +{ + u32 addr = *address_lo; + + addr &= MSI_ADDR_DESTID_MASK; + addr |= MSI_ADDR_DESTID_CPU(cpu_physical_id(dest_cpu)); + + *address_lo = addr; +} + +static int +msi_setup_apic(struct pci_dev *pdev, /* unused in generic */ + unsigned int vector, + u32 *address_hi, + u32 *address_lo, + u32 *data) +{ + unsigned long dest_phys_id; + + dest_phys_id = cpu_physical_id(first_cpu(cpu_online_map)); + + *address_hi = 0; + *address_lo = MSI_ADDR_HEADER | + MSI_ADDR_DESTMODE_PHYS | + MSI_ADDR_REDIRECTION_CPU | + MSI_ADDR_DESTID_CPU(dest_phys_id); + + *data = MSI_DATA_TRIGGER_EDGE | + MSI_DATA_LEVEL_ASSERT | + MSI_DATA_DELIVERY_FIXED | + MSI_DATA_VECTOR(vector); + + return 0; +} + +static void +msi_teardown_apic(unsigned int vector) +{ + return; /* no-op */ +} + +/* + * Generic callouts used on most archs/platforms. 
Override with + * msi_register_callouts() + */ + +struct msi_ops msi_apic_ops = { + .setup = msi_setup_apic, + .teardown = msi_teardown_apic, +#ifdef CONFIG_SMP + .target = msi_target_apic, +#endif +}; Index: msi/drivers/pci/msi.h =================================================================== --- msi.orig/drivers/pci/msi.h 2005-12-21 16:08:58.222841575 -0600 +++ msi/drivers/pci/msi.h 2005-12-21 23:06:48.292305590 -0600 @@ -69,67 +69,6 @@ #define msix_mask(address) (address | PCI_MSIX_FLAGS_BITMASK) #define msix_is_pending(address) (address & PCI_MSIX_FLAGS_PENDMASK) -/* - * MSI Defined Data Structures - */ -#define MSI_ADDRESS_HEADER 0xfee -#define MSI_ADDRESS_HEADER_SHIFT 12 -#define MSI_ADDRESS_HEADER_MASK 0xfff000 -#define MSI_ADDRESS_DEST_ID_MASK 0xfff0000f -#define MSI_TARGET_CPU_MASK 0xff -#define MSI_DELIVERY_MODE 0 -#define MSI_LEVEL_MODE 1 /* Edge always assert */ -#define MSI_TRIGGER_MODE 0 /* MSI is edge sensitive */ -#define MSI_PHYSICAL_MODE 0 -#define MSI_LOGICAL_MODE 1 -#define MSI_REDIRECTION_HINT_MODE 0 - -struct msg_data { -#if defined(__LITTLE_ENDIAN_BITFIELD) - __u32 vector : 8; - __u32 delivery_mode : 3; /* 000b: FIXED | 001b: lowest prior */ - __u32 reserved_1 : 3; - __u32 level : 1; /* 0: deassert | 1: assert */ - __u32 trigger : 1; /* 0: edge | 1: level */ - __u32 reserved_2 : 16; -#elif defined(__BIG_ENDIAN_BITFIELD) - __u32 reserved_2 : 16; - __u32 trigger : 1; /* 0: edge | 1: level */ - __u32 level : 1; /* 0: deassert | 1: assert */ - __u32 reserved_1 : 3; - __u32 delivery_mode : 3; /* 000b: FIXED | 001b: lowest prior */ - __u32 vector : 8; -#else -#error "Bitfield endianness not defined! 
Check your byteorder.h" -#endif -} __attribute__ ((packed)); - -struct msg_address { - union { - struct { -#if defined(__LITTLE_ENDIAN_BITFIELD) - __u32 reserved_1 : 2; - __u32 dest_mode : 1; /*0:physic | 1:logic */ - __u32 redirection_hint: 1; /*0: dedicated CPU - 1: lowest priority */ - __u32 reserved_2 : 4; - __u32 dest_id : 24; /* Destination ID */ -#elif defined(__BIG_ENDIAN_BITFIELD) - __u32 dest_id : 24; /* Destination ID */ - __u32 reserved_2 : 4; - __u32 redirection_hint: 1; /*0: dedicated CPU - 1: lowest priority */ - __u32 dest_mode : 1; /*0:physic | 1:logic */ - __u32 reserved_1 : 2; -#else -#error "Bitfield endianness not defined! Check your byteorder.h" -#endif - }u; - __u32 value; - }lo_address; - __u32 hi_address; -} __attribute__ ((packed)); - struct msi_desc { struct { __u8 type : 5; /* {0: unused, 5h:MSI, 11h:MSI-X} */ @@ -138,7 +77,7 @@ __u8 reserved: 1; /* reserved */ __u8 entry_nr; /* specific enabled entry */ __u8 default_vector; /* default pre-assigned vector */ - __u8 current_cpu; /* current destination cpu */ + __u8 unused; /* formerly unused destination cpu*/ }msi_attrib; struct { Index: msi/include/linux/pci.h =================================================================== --- msi.orig/include/linux/pci.h 2005-12-21 16:08:58.223818043 -0600 +++ msi/include/linux/pci.h 2005-12-21 16:10:32.847463922 -0600 @@ -478,6 +478,16 @@ u16 entry; /* driver uses to specify entry, OS writes */ }; +struct msi_ops { + int (*setup) (struct pci_dev *pdev, unsigned int vector, + u32 *addr_hi, u32 *addr_lo, u32 *data); + void (*teardown) (unsigned int vector); +#ifdef CONFIG_SMP + void (*target) (unsigned int vector, unsigned int cpu, + u32 *addr_hi, u32 *addr_lo); +#endif +}; + #ifndef CONFIG_PCI_MSI static inline void pci_scan_msi_device(struct pci_dev *dev) {} static inline int pci_enable_msi(struct pci_dev *dev) {return -1;} @@ -486,6 +496,7 @@ struct msix_entry *entries, int nvec) {return -1;} static inline void pci_disable_msix(struct pci_dev 
*dev) {} static inline void msi_remove_pci_irq_vectors(struct pci_dev *dev) {} +static inline int msi_register(struct msi_ops *ops) {return -1;} #else extern void pci_scan_msi_device(struct pci_dev *dev); extern int pci_enable_msi(struct pci_dev *dev); @@ -494,6 +505,7 @@ struct msix_entry *entries, int nvec); extern void pci_disable_msix(struct pci_dev *dev); extern void msi_remove_pci_irq_vectors(struct pci_dev *dev); +extern int msi_register(struct msi_ops *ops); #endif extern void pci_block_user_cfg_access(struct pci_dev *dev); From maule at sgi.com Fri Dec 23 07:38:24 2005 From: maule at sgi.com (Mark Maule) Date: Thu, 22 Dec 2005 14:38:24 -0600 Subject: [PATCH 0/3] msi abstractions and support for altix In-Reply-To: <20051222203415.GA28240@suse.de> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> <20051222202259.GA4959@suse.de> <20051222202627.GI17552@sgi.com> <20051222203415.GA28240@suse.de> Message-ID: <20051222203824.GJ17552@sgi.com> On Thu, Dec 22, 2005 at 12:34:15PM -0800, Greg KH wrote: > On Thu, Dec 22, 2005 at 02:26:27PM -0600, Mark Maule wrote: > > On Thu, Dec 22, 2005 at 12:22:59PM -0800, Greg KH wrote: > > > On Thu, Dec 22, 2005 at 02:15:44PM -0600, Mark Maule wrote: > > > > Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer > > > > > > I'll wait for Resend #3 based on my previous comments before considering > > > adding it to my kernel trees:) > > > > > > > Resend #2 includes the correction to the irq_vector[] declaration, and I > > responded to the question about setting irq_vector[0] if that's what you > > mean ... > > Sorry, but I missed that last response. Why do you set the [0] value in > a #ifdef now? Because on ia64 IA64_FIRST_DEVICE_VECTOR and IA64_LAST_DEVICE_VECTOR (from which MSI FIRST_DEVICE_VECTOR/LAST_DEVICE_VECTOR are derived) are not constants. They are now global variables (see change to asm-ia64/hw_irq.h) to allow the platform to override them.
Altix uses a reduced range of vectors for devices, and this change was necessary to make assign_irq_vector() work on altix. Mark From maule at SGI.com Fri Dec 23 07:26:27 2005 From: maule at SGI.com (Mark Maule) Date: Thu, 22 Dec 2005 14:26:27 -0600 Subject: [PATCH 0/3] msi abstractions and support for altix In-Reply-To: <20051222202259.GA4959@suse.de> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> <20051222202259.GA4959@suse.de> Message-ID: <20051222202627.GI17552@sgi.com> On Thu, Dec 22, 2005 at 12:22:59PM -0800, Greg KH wrote: > On Thu, Dec 22, 2005 at 02:15:44PM -0600, Mark Maule wrote: > > Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer > > I'll wait for Resend #3 based on my previous comments before considering > adding it to my kernel trees:) > Resend #2 includes the correction to the irq_vector[] declaration, and I responded to the question about setting irq_vector[0] if that's what you mean ... Mark From maule at SGI.com Fri Dec 23 07:15:57 2005 From: maule at SGI.com (Mark Maule) Date: Thu, 22 Dec 2005 14:15:57 -0600 (CST) Subject: [PATCH 2/3] per-platform IA64_{FIRST, LAST}_DEVICE_VECTOR definitions In-Reply-To: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> Message-ID: <20051222201705.2019.59377.24060@lnx-maule.americas.sgi.com> Abstract IA64_FIRST_DEVICE_VECTOR/IA64_LAST_DEVICE_VECTOR since SN platforms use a subset of the IA64 range. Implement this by making the above macros global variables which the platform can override in its setup code. Also add a reserve_irq_vector() routine used by SN to mark a vector as in-use when it wasn't allocated through assign_irq_vector().
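The mechanism described above can be modeled in user space roughly as follows. This is a sketch under assumptions, not the kernel code: set_device_vector_range, reserve_vector_model and free_vector_model are hypothetical names, the default range 0x30..0xe7 is taken from the hw_irq.h hunk in this patch, and the bitmap is indexed by the offset from the first device vector, which is my reading of how assign_irq_vector() hands out vectors. The bitmap is sized for the default (maximum) range so that a platform may narrow the range at init time without resizing anything.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Default ia64 device vector range, as given in the hw_irq.h hunk. */
#define DEF_FIRST_DEVICE_VECTOR	0x30
#define DEF_LAST_DEVICE_VECTOR	0xe7
#define MAX_DEVICE_VECTORS \
	(DEF_LAST_DEVICE_VECTOR - DEF_FIRST_DEVICE_VECTOR + 1)
#define BITS_PER_WORD	((int)(sizeof(unsigned long) * CHAR_BIT))

/* Runtime-overridable bounds; they start out as the defaults. */
static int first_device_vector = DEF_FIRST_DEVICE_VECTOR;
static int last_device_vector = DEF_LAST_DEVICE_VECTOR;

/* One bit per vector, sized for the widest range a platform could use. */
static unsigned long
vector_mask[(MAX_DEVICE_VECTORS + BITS_PER_WORD - 1) / BITS_PER_WORD];

/*
 * Platform override. The new range must lie inside the default range and
 * must be set before any vector is reserved, since bit positions are
 * relative to first_device_vector.
 */
int set_device_vector_range(int first, int last)
{
	if (first < DEF_FIRST_DEVICE_VECTOR ||
	    last > DEF_LAST_DEVICE_VECTOR || first > last)
		return -EINVAL;
	first_device_vector = first;
	last_device_vector = last;
	return 0;
}

/*
 * Mark a vector as in use. Returns -EINVAL if it is outside the platform
 * range, otherwise the previous state of the bit (0 = was free, 1 = was
 * already taken), like a test-and-set.
 */
int reserve_vector_model(int vector)
{
	int pos, old;

	if (vector < first_device_vector || vector > last_device_vector)
		return -EINVAL;

	pos = vector - first_device_vector;
	old = (int)((vector_mask[pos / BITS_PER_WORD] >>
		     (pos % BITS_PER_WORD)) & 1UL);
	vector_mask[pos / BITS_PER_WORD] |= 1UL << (pos % BITS_PER_WORD);
	return old;
}

/* Release a vector that was previously reserved. */
void free_vector_model(int vector)
{
	int pos;

	assert(vector >= first_device_vector &&
	       vector <= last_device_vector);
	pos = vector - first_device_vector;
	vector_mask[pos / BITS_PER_WORD] &= ~(1UL << (pos % BITS_PER_WORD));
}
```

A caller that gets 1 back from reserve_vector_model() knows the vector was already claimed, which is the case SN needs to handle when several interrupts share one vector. The example range used when trying this out is arbitrary; the real SN2 bounds are not shown in this thread.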
Signed-off-by: Mark Maule Index: msi/arch/ia64/kernel/irq_ia64.c =================================================================== --- msi.orig/arch/ia64/kernel/irq_ia64.c 2005-12-21 22:59:09.199823700 -0600 +++ msi/arch/ia64/kernel/irq_ia64.c 2005-12-22 14:10:01.012860466 -0600 @@ -46,6 +46,10 @@ #define IRQ_DEBUG 0 +/* These can be overridden in platform_irq_init */ +int ia64_first_device_vector = IA64_DEF_FIRST_DEVICE_VECTOR; +int ia64_last_device_vector = IA64_DEF_LAST_DEVICE_VECTOR; + /* default base addr of IPI table */ void __iomem *ipi_base_addr = ((void __iomem *) (__IA64_UNCACHED_OFFSET | IA64_IPI_DEFAULT_BASE_ADDR)); @@ -60,7 +64,7 @@ }; EXPORT_SYMBOL(isa_irq_to_vector_map); -static unsigned long ia64_vector_mask[BITS_TO_LONGS(IA64_NUM_DEVICE_VECTORS)]; +static unsigned long ia64_vector_mask[BITS_TO_LONGS(IA64_MAX_DEVICE_VECTORS)]; int assign_irq_vector (int irq) @@ -89,6 +93,17 @@ printk(KERN_WARNING "%s: double free!\n", __FUNCTION__); } +int +reserve_irq_vector (int vector) +{ + if (vector < IA64_FIRST_DEVICE_VECTOR || + vector > IA64_LAST_DEVICE_VECTOR) + return -EINVAL; + + return test_and_set_bit(vector - IA64_FIRST_DEVICE_VECTOR, + ia64_vector_mask); +} + #ifdef CONFIG_SMP # define IS_RESCHEDULE(vec) (vec == IA64_IPI_RESCHEDULE) #else Index: msi/arch/ia64/sn/kernel/irq.c =================================================================== --- msi.orig/arch/ia64/sn/kernel/irq.c 2005-12-21 22:59:09.199823700 -0600 +++ msi/arch/ia64/sn/kernel/irq.c 2005-12-22 14:10:01.024578027 -0600 @@ -203,6 +203,9 @@ int i; irq_desc_t *base_desc = irq_desc; + ia64_first_device_vector = IA64_SN2_FIRST_DEVICE_VECTOR; + ia64_last_device_vector = IA64_SN2_LAST_DEVICE_VECTOR; + for (i = 0; i < NR_IRQS; i++) { if (base_desc[i].handler == &no_irq_type) { base_desc[i].handler = &irq_type_sn; @@ -287,6 +290,7 @@ /* link it into the sn_irq[irq] list */ spin_lock(&sn_irq_info_lock); list_add_rcu(&sn_irq_info->list, sn_irq_lh[sn_irq_info->irq_irq]); +
reserve_irq_vector(sn_irq_info->irq_irq); spin_unlock(&sn_irq_info_lock); (void)register_intr_pda(sn_irq_info); @@ -310,8 +314,11 @@ spin_lock(&sn_irq_info_lock); list_del_rcu(&sn_irq_info->list); spin_unlock(&sn_irq_info_lock); + if (list_empty(sn_irq_lh[sn_irq_info->irq_irq])) + free_irq_vector(sn_irq_info->irq_irq); call_rcu(&sn_irq_info->rcu, sn_irq_info_free); pci_dev_put(pci_dev); + } static inline void Index: msi/include/asm-ia64/hw_irq.h =================================================================== --- msi.orig/include/asm-ia64/hw_irq.h 2005-12-21 22:59:09.200800164 -0600 +++ msi/include/asm-ia64/hw_irq.h 2005-12-22 14:10:01.046060224 -0600 @@ -47,9 +47,19 @@ #define IA64_CMC_VECTOR 0x1f /* corrected machine-check interrupt vector */ /* * Vectors 0x20-0x2f are reserved for legacy ISA IRQs. + * Use vectors 0x30-0xe7 as the default device vector range for ia64. + * Platforms may choose to reduce this range in platform_irq_setup, but the + * platform range must fall within + * [IA64_DEF_FIRST_DEVICE_VECTOR..IA64_DEF_LAST_DEVICE_VECTOR] */ -#define IA64_FIRST_DEVICE_VECTOR 0x30 -#define IA64_LAST_DEVICE_VECTOR 0xe7 +extern int ia64_first_device_vector; +extern int ia64_last_device_vector; + +#define IA64_DEF_FIRST_DEVICE_VECTOR 0x30 +#define IA64_DEF_LAST_DEVICE_VECTOR 0xe7 +#define IA64_FIRST_DEVICE_VECTOR ia64_first_device_vector +#define IA64_LAST_DEVICE_VECTOR ia64_last_device_vector +#define IA64_MAX_DEVICE_VECTORS (IA64_DEF_LAST_DEVICE_VECTOR - IA64_DEF_FIRST_DEVICE_VECTOR + 1) #define IA64_NUM_DEVICE_VECTORS (IA64_LAST_DEVICE_VECTOR - IA64_FIRST_DEVICE_VECTOR + 1) #define IA64_MCA_RENDEZ_VECTOR 0xe8 /* MCA rendez interrupt */ @@ -83,6 +93,7 @@ extern int assign_irq_vector (int irq); /* allocate a free vector */ extern void free_irq_vector (int vector); +extern int reserve_irq_vector (int vector); extern void ia64_send_ipi (int cpu, int vector, int delivery_mode, int redirect); extern void register_percpu_irq (ia64_vector vec, struct irqaction 
*action); Index: msi/drivers/pci/msi.c =================================================================== --- msi.orig/drivers/pci/msi.c 2005-12-21 22:59:09.200800164 -0600 +++ msi/drivers/pci/msi.c 2005-12-22 14:10:19.301044796 -0600 @@ -35,7 +35,7 @@ #ifndef CONFIG_X86_IO_APIC int vector_irq[NR_VECTORS] = { [0 ... NR_VECTORS - 1] = -1}; -u8 irq_vector[NR_IRQ_VECTORS] = { FIRST_DEVICE_VECTOR , 0 }; +u8 irq_vector[NR_IRQ_VECTORS]; #endif static struct msi_ops *msi_ops; @@ -377,6 +377,11 @@ printk(KERN_WARNING "PCI: MSI cache init failed\n"); return status; } + +#ifndef CONFIG_X86_IO_APIC + irq_vector[0] = FIRST_DEVICE_VECTOR; +#endif + last_alloc_vector = assign_irq_vector(AUTO_ASSIGN); if (last_alloc_vector < 0) { pci_msi_enable = 0; From maule at SGI.com Fri Dec 23 07:16:02 2005 From: maule at SGI.com (Mark Maule) Date: Thu, 22 Dec 2005 14:16:02 -0600 (CST) Subject: [PATCH 2/3] altix: msi support In-Reply-To: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> Message-ID: <20051222201710.2019.3976.52779@lnx-maule.americas.sgi.com> MSI callouts for altix. Involves a fair amount of code reorg in sn irq.c code as well as adding some extensions to the altix PCI provider abstraction. Signed-off-by: Mark Maule Index: linux-2.6.14/arch/ia64/sn/pci/msi.c =================================================================== --- linux-2.6.14.orig/arch/ia64/sn/pci/msi.c 2005-12-21 22:37:02.978262311 -0600 +++ linux-2.6.14/arch/ia64/sn/pci/msi.c 2005-12-21 22:38:24.090685912 -0600 @@ -6,13 +6,205 @@ * Copyright (C) 2005 Silicon Graphics, Inc. All Rights Reserved.
*/ -#include +#include +#include +#include + +#include + +#include +#include +#include +#include +#include + +struct sn_msi_info { + u64 pci_addr; + struct sn_irq_info *sn_irq_info; +}; + +static struct sn_msi_info *sn_msi_info; + +static void +sn_msi_teardown(unsigned int vector) +{ + nasid_t nasid; + int widget; + struct pci_dev *pdev; + struct pcidev_info *sn_pdev; + struct sn_irq_info *sn_irq_info; + struct pcibus_bussoft *bussoft; + struct sn_pcibus_provider *provider; + + sn_irq_info = sn_msi_info[vector].sn_irq_info; + if (sn_irq_info == NULL || sn_irq_info->irq_int_bit >= 0) + return; + + sn_pdev = (struct pcidev_info *)sn_irq_info->irq_pciioinfo; + pdev = sn_pdev->pdi_linux_pcidev; + provider = SN_PCIDEV_BUSPROVIDER(pdev); + + (*provider->dma_unmap)(pdev, + sn_msi_info[vector].pci_addr, + PCI_DMA_FROMDEVICE); + sn_msi_info[vector].pci_addr = 0; + + bussoft = SN_PCIDEV_BUSSOFT(pdev); + nasid = NASID_GET(bussoft->bs_base); + widget = (nasid & 1) ? + TIO_SWIN_WIDGETNUM(bussoft->bs_base) : + SWIN_WIDGETNUM(bussoft->bs_base); + + sn_intr_free(nasid, widget, sn_irq_info); + sn_msi_info[vector].sn_irq_info = NULL; + + return; +} int -sn_msi_init(void) +sn_msi_setup(struct pci_dev *pdev, unsigned int vector, + u32 *addr_hi, u32 *addr_lo, u32 *data) { + int widget; + int status; + nasid_t nasid; + u64 bus_addr; + struct sn_irq_info *sn_irq_info; + struct pcibus_bussoft *bussoft = SN_PCIDEV_BUSSOFT(pdev); + struct sn_pcibus_provider *provider = SN_PCIDEV_BUSPROVIDER(pdev); + + if (bussoft == NULL) + return -EINVAL; + + if (provider == NULL || provider->dma_map_consistent == NULL) + return -EINVAL; + + /* + * Set up the vector plumbing. Let the prom (via sn_intr_alloc) + * decide which cpu to direct this msi at by default. + */ + + nasid = NASID_GET(bussoft->bs_base); + widget = (nasid & 1) ? + TIO_SWIN_WIDGETNUM(bussoft->bs_base) : + SWIN_WIDGETNUM(bussoft->bs_base); + + sn_irq_info = kzalloc(sizeof(struct sn_irq_info), GFP_KERNEL); + if (! 
sn_irq_info) + return -ENOMEM; + + status = sn_intr_alloc(nasid, widget, sn_irq_info, vector, -1, -1); + if (status) { + kfree(sn_irq_info); + return -ENOMEM; + } + + sn_irq_info->irq_int_bit = -1; /* mark this as an MSI irq */ + sn_irq_fixup(pdev, sn_irq_info); + + /* Prom probably should fill these in, but doesn't ... */ + sn_irq_info->irq_bridge_type = bussoft->bs_asic_type; + sn_irq_info->irq_bridge = (void *)bussoft->bs_base; + /* - * return error until MSI is supported on altix platforms + * Map the xio address into bus space */ - return -EINVAL; + bus_addr = (*provider->dma_map_consistent)(pdev, + sn_irq_info->irq_xtalkaddr, + sizeof(sn_irq_info->irq_xtalkaddr), + SN_DMA_MSI|SN_DMA_ADDR_XIO); + if (! bus_addr) { + sn_intr_free(nasid, widget, sn_irq_info); + kfree(sn_irq_info); + return -ENOMEM; + } + + sn_msi_info[vector].sn_irq_info = sn_irq_info; + sn_msi_info[vector].pci_addr = bus_addr; + + *addr_hi = (u32)(bus_addr >> 32); + *addr_lo = (u32)(bus_addr & 0x00000000ffffffff); + + /* + * In the SN platform, bit 16 is a "send vector" bit which + * must be present in order to move the vector through the system. 
+ */ + *data = 0x100 + (unsigned int)vector; + +#ifdef CONFIG_SMP + set_irq_affinity_info((vector & 0xff), sn_irq_info->irq_cpuid, 0); +#endif + + return 0; +} + +static void +sn_msi_target(unsigned int vector, unsigned int cpu, + u32 *addr_hi, u32 *addr_lo) +{ + int slice; + nasid_t nasid; + u64 bus_addr; + struct pci_dev *pdev; + struct pcidev_info *sn_pdev; + struct sn_irq_info *sn_irq_info; + struct sn_irq_info *new_irq_info; + struct sn_pcibus_provider *provider; + + sn_irq_info = sn_msi_info[vector].sn_irq_info; + if (sn_irq_info == NULL || sn_irq_info->irq_int_bit >= 0) + return; + + /* + * Release XIO resources for the old MSI PCI address + */ + + sn_pdev = (struct pcidev_info *)sn_irq_info->irq_pciioinfo; + pdev = sn_pdev->pdi_linux_pcidev; + provider = SN_PCIDEV_BUSPROVIDER(pdev); + + bus_addr = (u64)(*addr_hi) << 32 | (u64)(*addr_lo); + (*provider->dma_unmap)(pdev, bus_addr, PCI_DMA_FROMDEVICE); + sn_msi_info[vector].pci_addr = 0; + + nasid = cpuid_to_nasid(cpu); + slice = cpuid_to_slice(cpu); + + new_irq_info = sn_retarget_vector(sn_irq_info, nasid, slice); + sn_msi_info[vector].sn_irq_info = new_irq_info; + if (new_irq_info == NULL) + return; + + /* + * Map the xio address into bus space + */ + + bus_addr = (*provider->dma_map_consistent)(pdev, + new_irq_info->irq_xtalkaddr, + sizeof(new_irq_info->irq_xtalkaddr), + SN_DMA_MSI|SN_DMA_ADDR_XIO); + + sn_msi_info[vector].pci_addr = bus_addr; + *addr_hi = (u32)(bus_addr >> 32); + *addr_lo = (u32)(bus_addr & 0x00000000ffffffff); +} + +struct msi_ops sn_msi_ops = { + .setup = sn_msi_setup, + .teardown = sn_msi_teardown, +#ifdef CONFIG_SMP + .target = sn_msi_target, +#endif +}; + +int +sn_msi_init(void) +{ + sn_msi_info = + kzalloc(sizeof(struct sn_msi_info) * NR_VECTORS, GFP_KERNEL); + if (! 
sn_msi_info) + return -ENOMEM; + + msi_register(&sn_msi_ops); + return 0; } Index: linux-2.6.14/arch/ia64/sn/kernel/io_init.c =================================================================== --- linux-2.6.14.orig/arch/ia64/sn/kernel/io_init.c 2005-12-21 22:35:23.436286704 -0600 +++ linux-2.6.14/arch/ia64/sn/kernel/io_init.c 2005-12-21 22:38:24.091685671 -0600 @@ -51,7 +51,7 @@ */ static dma_addr_t -sn_default_pci_map(struct pci_dev *pdev, unsigned long paddr, size_t size) +sn_default_pci_map(struct pci_dev *pdev, unsigned long paddr, size_t size, int type) { return 0; } Index: linux-2.6.14/arch/ia64/sn/kernel/irq.c =================================================================== --- linux-2.6.14.orig/arch/ia64/sn/kernel/irq.c 2005-12-21 22:37:58.120953676 -0600 +++ linux-2.6.14/arch/ia64/sn/kernel/irq.c 2005-12-21 22:38:24.092685429 -0600 @@ -25,11 +25,11 @@ int sn_force_interrupt_flag = 1; extern int sn_ioif_inited; -static struct list_head **sn_irq_lh; +struct list_head **sn_irq_lh; static spinlock_t sn_irq_info_lock = SPIN_LOCK_UNLOCKED; /* non-IRQ lock */ -static inline uint64_t sn_intr_alloc(nasid_t local_nasid, int local_widget, - u64 sn_irq_info, +uint64_t sn_intr_alloc(nasid_t local_nasid, int local_widget, + struct sn_irq_info *sn_irq_info, int req_irq, nasid_t req_nasid, int req_slice) { @@ -39,12 +39,13 @@ SAL_CALL_NOLOCK(ret_stuff, (u64) SN_SAL_IOIF_INTERRUPT, (u64) SAL_INTR_ALLOC, (u64) local_nasid, - (u64) local_widget, (u64) sn_irq_info, (u64) req_irq, + (u64) local_widget, __pa(sn_irq_info), (u64) req_irq, (u64) req_nasid, (u64) req_slice); + return ret_stuff.status; } -static inline void sn_intr_free(nasid_t local_nasid, int local_widget, +void sn_intr_free(nasid_t local_nasid, int local_widget, struct sn_irq_info *sn_irq_info) { struct ia64_sal_retval ret_stuff; @@ -113,73 +114,91 @@ static void sn_irq_info_free(struct rcu_head *head); -static void sn_set_affinity_irq(unsigned int irq, cpumask_t mask) +struct sn_irq_info 
*sn_retarget_vector(struct sn_irq_info *sn_irq_info, + nasid_t nasid, int slice) { - struct sn_irq_info *sn_irq_info, *sn_irq_info_safe; - int cpuid, cpuphys; + int vector; + int cpuphys; + int64_t bridge; + int local_widget, status; + nasid_t local_nasid; + struct sn_irq_info *new_irq_info; + struct sn_pcibus_provider *pci_provider; - cpuid = first_cpu(mask); - cpuphys = cpu_physical_id(cpuid); + new_irq_info = kmalloc(sizeof(struct sn_irq_info), GFP_ATOMIC); + if (new_irq_info == NULL) + return NULL; - list_for_each_entry_safe(sn_irq_info, sn_irq_info_safe, - sn_irq_lh[irq], list) { - uint64_t bridge; - int local_widget, status; - nasid_t local_nasid; - struct sn_irq_info *new_irq_info; - struct sn_pcibus_provider *pci_provider; - - new_irq_info = kmalloc(sizeof(struct sn_irq_info), GFP_ATOMIC); - if (new_irq_info == NULL) - break; - memcpy(new_irq_info, sn_irq_info, sizeof(struct sn_irq_info)); - - bridge = (uint64_t) new_irq_info->irq_bridge; - if (!bridge) { - kfree(new_irq_info); - break; /* irq is not a device interrupt */ - } + memcpy(new_irq_info, sn_irq_info, sizeof(struct sn_irq_info)); - local_nasid = NASID_GET(bridge); + bridge = (uint64_t) new_irq_info->irq_bridge; + if (!bridge) { + kfree(new_irq_info); + return NULL; /* irq is not a device interrupt */ + } - if (local_nasid & 1) - local_widget = TIO_SWIN_WIDGETNUM(bridge); - else - local_widget = SWIN_WIDGETNUM(bridge); - - /* Free the old PROM new_irq_info structure */ - sn_intr_free(local_nasid, local_widget, new_irq_info); - /* Update kernels new_irq_info with new target info */ - unregister_intr_pda(new_irq_info); - - /* allocate a new PROM new_irq_info struct */ - status = sn_intr_alloc(local_nasid, local_widget, - __pa(new_irq_info), irq, - cpuid_to_nasid(cpuid), - cpuid_to_slice(cpuid)); - - /* SAL call failed */ - if (status) { - kfree(new_irq_info); - break; - } + local_nasid = NASID_GET(bridge); - new_irq_info->irq_cpuid = cpuid; - register_intr_pda(new_irq_info); + if (local_nasid & 1) + 
local_widget = TIO_SWIN_WIDGETNUM(bridge); + else + local_widget = SWIN_WIDGETNUM(bridge); - pci_provider = sn_pci_provider[new_irq_info->irq_bridge_type]; - if (pci_provider && pci_provider->target_interrupt) - (pci_provider->target_interrupt)(new_irq_info); - - spin_lock(&sn_irq_info_lock); - list_replace_rcu(&sn_irq_info->list, &new_irq_info->list); - spin_unlock(&sn_irq_info_lock); - call_rcu(&sn_irq_info->rcu, sn_irq_info_free); + vector = sn_irq_info->irq_irq; + /* Free the old PROM new_irq_info structure */ + sn_intr_free(local_nasid, local_widget, new_irq_info); + /* Update kernels new_irq_info with new target info */ + unregister_intr_pda(new_irq_info); + + /* allocate a new PROM new_irq_info struct */ + status = sn_intr_alloc(local_nasid, local_widget, + new_irq_info, vector, + nasid, slice); + + /* SAL call failed */ + if (status) { + kfree(new_irq_info); + return NULL; + } + + cpuphys = nasid_slice_to_cpuid(nasid, slice); + new_irq_info->irq_cpuid = cpuphys; + register_intr_pda(new_irq_info); + + pci_provider = sn_pci_provider[new_irq_info->irq_bridge_type]; + + /* + * If this represents a line interrupt, target it. If it's + * an msi (irq_int_bit < 0), it's already targeted. 
+ */ + if (new_irq_info->irq_int_bit >= 0 && + pci_provider && pci_provider->target_interrupt) + (pci_provider->target_interrupt)(new_irq_info); + + spin_lock(&sn_irq_info_lock); + list_replace_rcu(&sn_irq_info->list, &new_irq_info->list); + spin_unlock(&sn_irq_info_lock); + call_rcu(&sn_irq_info->rcu, sn_irq_info_free); #ifdef CONFIG_SMP - set_irq_affinity_info((irq & 0xff), cpuphys, 0); + set_irq_affinity_info((vector & 0xff), cpuphys, 0); #endif - } + + return new_irq_info; +} + +static void sn_set_affinity_irq(unsigned int irq, cpumask_t mask) +{ + struct sn_irq_info *sn_irq_info, *sn_irq_info_safe; + nasid_t nasid; + int slice; + + nasid = cpuid_to_nasid(first_cpu(mask)); + slice = cpuid_to_slice(first_cpu(mask)); + + list_for_each_entry_safe(sn_irq_info, sn_irq_info_safe, + sn_irq_lh[irq], list) + (void)sn_retarget_vector(sn_irq_info, nasid, slice); } struct hw_interrupt_type irq_type_sn = { @@ -441,5 +460,4 @@ INIT_LIST_HEAD(sn_irq_lh[i]); } - } Index: linux-2.6.14/arch/ia64/sn/pci/pci_dma.c =================================================================== --- linux-2.6.14.orig/arch/ia64/sn/pci/pci_dma.c 2005-12-21 22:35:23.446284291 -0600 +++ linux-2.6.14/arch/ia64/sn/pci/pci_dma.c 2005-12-21 22:38:24.092685429 -0600 @@ -11,7 +11,7 @@ #include #include -#include +#include #include #include #include @@ -113,7 +113,8 @@ * resources. 
*/ - *dma_handle = provider->dma_map_consistent(pdev, phys_addr, size); + *dma_handle = provider->dma_map_consistent(pdev, phys_addr, size, + SN_DMA_ADDR_PHYS); if (!*dma_handle) { printk(KERN_ERR "%s: out of ATEs\n", __FUNCTION__); free_pages((unsigned long)cpuaddr, get_order(size)); @@ -176,7 +177,7 @@ BUG_ON(dev->bus != &pci_bus_type); phys_addr = __pa(cpu_addr); - dma_addr = provider->dma_map(pdev, phys_addr, size); + dma_addr = provider->dma_map(pdev, phys_addr, size, SN_DMA_ADDR_PHYS); if (!dma_addr) { printk(KERN_ERR "%s: out of ATEs\n", __FUNCTION__); return 0; @@ -260,7 +261,8 @@ for (i = 0; i < nhwentries; i++, sg++) { phys_addr = SG_ENT_PHYS_ADDRESS(sg); sg->dma_address = provider->dma_map(pdev, - phys_addr, sg->length); + phys_addr, sg->length, + SN_DMA_ADDR_PHYS); if (!sg->dma_address) { printk(KERN_ERR "%s: out of ATEs\n", __FUNCTION__); Index: linux-2.6.14/arch/ia64/sn/pci/pcibr/pcibr_dma.c =================================================================== --- linux-2.6.14.orig/arch/ia64/sn/pci/pcibr/pcibr_dma.c 2005-12-21 22:35:23.445284532 -0600 +++ linux-2.6.14/arch/ia64/sn/pci/pcibr/pcibr_dma.c 2005-12-21 22:38:24.093685188 -0600 @@ -41,7 +41,7 @@ static dma_addr_t pcibr_dmamap_ate32(struct pcidev_info *info, - uint64_t paddr, size_t req_size, uint64_t flags) + uint64_t paddr, size_t req_size, uint64_t flags, int dma_flags) { struct pcidev_info *pcidev_info = info->pdi_host_pcidev_info; @@ -81,9 +81,12 @@ if (IS_PCIX(pcibus_info)) ate_flags &= ~(PCI32_ATE_PREF); - xio_addr = - IS_PIC_SOFT(pcibus_info) ? PHYS_TO_DMA(paddr) : - PHYS_TO_TIODMA(paddr); + if (SN_DMA_ADDRTYPE(dma_flags) == SN_DMA_ADDR_PHYS) + xio_addr = IS_PIC_SOFT(pcibus_info) ? 
PHYS_TO_DMA(paddr) : + PHYS_TO_TIODMA(paddr); + else + xio_addr = paddr; + offset = IOPGOFF(xio_addr); ate = ate_flags | (xio_addr - offset); @@ -91,6 +94,13 @@ if (IS_PIC_SOFT(pcibus_info)) { ate |= (pcibus_info->pbi_hub_xid << PIC_ATE_TARGETID_SHFT); } + + /* + * If we're mapping for MSI, set the MSI bit in the ATE + */ + if (dma_flags & SN_DMA_MSI) + ate |= PCI32_ATE_MSI; + ate_write(pcibus_info, ate_index, ate_count, ate); /* @@ -105,20 +115,27 @@ if (pcibus_info->pbi_devreg[internal_device] & PCIBR_DEV_SWAP_DIR) ATE_SWAP_ON(pci_addr); + return pci_addr; } static dma_addr_t pcibr_dmatrans_direct64(struct pcidev_info * info, uint64_t paddr, - uint64_t dma_attributes) + uint64_t dma_attributes, int dma_flags) { struct pcibus_info *pcibus_info = (struct pcibus_info *) ((info->pdi_host_pcidev_info)->pdi_pcibus_info); uint64_t pci_addr; /* Translate to Crosstalk View of Physical Address */ - pci_addr = (IS_PIC_SOFT(pcibus_info) ? PHYS_TO_DMA(paddr) : - PHYS_TO_TIODMA(paddr)) | dma_attributes; + if (SN_DMA_ADDRTYPE(dma_flags) == SN_DMA_ADDR_PHYS) + pci_addr = IS_PIC_SOFT(pcibus_info) ? + PHYS_TO_DMA(paddr) : + PHYS_TO_TIODMA(paddr) | dma_attributes; + else + pci_addr = IS_PIC_SOFT(pcibus_info) ? + paddr : + paddr | dma_attributes; /* Handle Bus mode */ if (IS_PCIX(pcibus_info)) @@ -130,7 +147,9 @@ ((uint64_t) pcibus_info-> pbi_hub_xid << PIC_PCI64_ATTR_TARG_SHFT); } else - pci_addr |= TIOCP_PCI64_CMDTYPE_MEM; + pci_addr |= (dma_flags & SN_DMA_MSI) ? 
+ TIOCP_PCI64_CMDTYPE_MSI : + TIOCP_PCI64_CMDTYPE_MEM; /* If PCI mode, func zero uses VCHAN0, every other func uses VCHAN1 */ if (!IS_PCIX(pcibus_info) && PCI_FUNC(info->pdi_linux_pcidev->devfn)) @@ -142,7 +161,7 @@ static dma_addr_t pcibr_dmatrans_direct32(struct pcidev_info * info, - uint64_t paddr, size_t req_size, uint64_t flags) + uint64_t paddr, size_t req_size, uint64_t flags, int dma_flags) { struct pcidev_info *pcidev_info = info->pdi_host_pcidev_info; @@ -158,8 +177,14 @@ return 0; } - xio_addr = IS_PIC_SOFT(pcibus_info) ? PHYS_TO_DMA(paddr) : - PHYS_TO_TIODMA(paddr); + if (dma_flags & SN_DMA_MSI) + return 0; + + if (SN_DMA_ADDRTYPE(dma_flags) == SN_DMA_ADDR_PHYS) + xio_addr = IS_PIC_SOFT(pcibus_info) ? PHYS_TO_DMA(paddr) : + PHYS_TO_TIODMA(paddr); + else + xio_addr = paddr; xio_base = pcibus_info->pbi_dir_xbase; offset = xio_addr - xio_base; @@ -331,7 +356,7 @@ */ dma_addr_t -pcibr_dma_map(struct pci_dev * hwdev, unsigned long phys_addr, size_t size) +pcibr_dma_map(struct pci_dev * hwdev, unsigned long phys_addr, size_t size, int dma_flags) { dma_addr_t dma_handle; struct pcidev_info *pcidev_info = SN_PCIDEV_INFO(hwdev); @@ -348,11 +373,11 @@ */ dma_handle = pcibr_dmatrans_direct64(pcidev_info, phys_addr, - PCI64_ATTR_PREF); + PCI64_ATTR_PREF, dma_flags); } else { /* Handle 32-63 bit cards via direct mapping */ dma_handle = pcibr_dmatrans_direct32(pcidev_info, phys_addr, - size, 0); + size, 0, dma_flags); if (!dma_handle) { /* * It is a 32 bit card and we cannot do direct mapping, @@ -360,7 +385,8 @@ */ dma_handle = pcibr_dmamap_ate32(pcidev_info, phys_addr, - size, PCI32_ATE_PREF); + size, PCI32_ATE_PREF, + dma_flags); } } @@ -369,18 +395,18 @@ dma_addr_t pcibr_dma_map_consistent(struct pci_dev * hwdev, unsigned long phys_addr, - size_t size) + size_t size, int dma_flags) { dma_addr_t dma_handle; struct pcidev_info *pcidev_info = SN_PCIDEV_INFO(hwdev); if (hwdev->dev.coherent_dma_mask == ~0UL) { dma_handle = pcibr_dmatrans_direct64(pcidev_info, 
phys_addr, - PCI64_ATTR_BAR); + PCI64_ATTR_BAR, dma_flags); } else { dma_handle = (dma_addr_t) pcibr_dmamap_ate32(pcidev_info, phys_addr, size, - PCI32_ATE_BAR); + PCI32_ATE_BAR, dma_flags); } return dma_handle; Index: linux-2.6.14/arch/ia64/sn/pci/tioca_provider.c =================================================================== --- linux-2.6.14.orig/arch/ia64/sn/pci/tioca_provider.c 2005-12-21 22:35:23.446284291 -0600 +++ linux-2.6.14/arch/ia64/sn/pci/tioca_provider.c 2005-12-21 22:38:24.093685188 -0600 @@ -515,11 +515,17 @@ * use the GART mapped mode. */ static uint64_t -tioca_dma_map(struct pci_dev *pdev, uint64_t paddr, size_t byte_count) +tioca_dma_map(struct pci_dev *pdev, uint64_t paddr, size_t byte_count, int dma_flags) { uint64_t mapaddr; /* + * Not supported for now ... + */ + if (dma_flags & SN_DMA_MSI) + return 0; + + /* * If card is 64 or 48 bit addresable, use a direct mapping. 32 * bit direct is so restrictive w.r.t. where the memory resides that * we don't use it even though CA has some support. Index: linux-2.6.14/arch/ia64/sn/pci/tioce_provider.c =================================================================== --- linux-2.6.14.orig/arch/ia64/sn/pci/tioce_provider.c 2005-12-21 22:35:23.445284532 -0600 +++ linux-2.6.14/arch/ia64/sn/pci/tioce_provider.c 2005-12-21 22:38:24.094684947 -0600 @@ -52,7 +52,8 @@ (ATE_PAGE((start)+(len)-1, pagesize) - ATE_PAGE(start, pagesize) + 1) #define ATE_VALID(ate) ((ate) & (1UL << 63)) -#define ATE_MAKE(addr, ps) (((addr) & ~ATE_PAGEMASK(ps)) | (1UL << 63)) +#define ATE_MAKE(addr, ps, msi) \ + (((addr) & ~ATE_PAGEMASK(ps)) | (1UL << 63) | ((msi)?(1UL << 62):0)) /* * Flavors of ate-based mapping supported by tioce_alloc_map() @@ -78,15 +79,17 @@ * * 63 - must be 1 to indicate d64 mode to CE hardware * 62 - barrier bit ... controlled with tioce_dma_barrier() - * 61 - 0 since this is not an MSI transaction + * 61 - msi bit ... 
specified through dma_flags * 60:54 - reserved, MBZ */ static uint64_t -tioce_dma_d64(unsigned long ct_addr) +tioce_dma_d64(unsigned long ct_addr, int dma_flags) { uint64_t bus_addr; bus_addr = ct_addr | (1UL << 63); + if (dma_flags & SN_DMA_MSI) + bus_addr |= (1UL << 61); return bus_addr; } @@ -143,7 +146,7 @@ */ static uint64_t tioce_alloc_map(struct tioce_kernel *ce_kern, int type, int port, - uint64_t ct_addr, int len) + uint64_t ct_addr, int len, int dma_flags) { int i; int j; @@ -152,6 +155,7 @@ int entries; int nates; int pagesize; + int msi_capable, msi_wanted; uint64_t *ate_shadow; uint64_t *ate_reg; uint64_t addr; @@ -173,6 +177,7 @@ ate_reg = ce_mmr->ce_ure_ate3240; pagesize = ce_kern->ce_ate3240_pagesize; bus_base = TIOCE_M32_MIN; + msi_capable = 1; break; case TIOCE_ATE_M40: first = 0; @@ -181,6 +186,7 @@ ate_reg = ce_mmr->ce_ure_ate40; pagesize = MB(64); bus_base = TIOCE_M40_MIN; + msi_capable = 0; break; case TIOCE_ATE_M40S: /* @@ -193,11 +199,16 @@ ate_reg = ce_mmr->ce_ure_ate3240; pagesize = GB(16); bus_base = TIOCE_M40S_MIN; + msi_capable = 0; break; default: return 0; } + msi_wanted = dma_flags & SN_DMA_MSI; + if (msi_wanted && !msi_capable) + return 0; + nates = ATE_NPAGES(ct_addr, len, pagesize); if (nates > entries) return 0; @@ -226,7 +237,7 @@ for (j = 0; j < nates; j++) { uint64_t ate; - ate = ATE_MAKE(addr, pagesize); + ate = ATE_MAKE(addr, pagesize, msi_wanted); ate_shadow[i + j] = ate; writeq(ate, &ate_reg[i + j]); addr += pagesize; @@ -253,7 +264,7 @@ * Map @paddr into 32-bit bus space of the CE associated with @pcidev_info. 
*/ static uint64_t -tioce_dma_d32(struct pci_dev *pdev, uint64_t ct_addr) +tioce_dma_d32(struct pci_dev *pdev, uint64_t ct_addr, int dma_flags) { int dma_ok; int port; @@ -263,6 +274,9 @@ uint64_t ct_lower; dma_addr_t bus_addr; + if (dma_flags & SN_DMA_MSI) + return 0; + ct_upper = ct_addr & ~0x3fffffffUL; ct_lower = ct_addr & 0x3fffffffUL; @@ -387,7 +401,7 @@ */ static uint64_t tioce_do_dma_map(struct pci_dev *pdev, uint64_t paddr, size_t byte_count, - int barrier) + int barrier, int dma_flags) { unsigned long flags; uint64_t ct_addr; @@ -403,15 +417,18 @@ if (dma_mask < 0x7fffffffUL) return 0; - ct_addr = PHYS_TO_TIODMA(paddr); + if (SN_DMA_ADDRTYPE(dma_flags) == SN_DMA_ADDR_PHYS) + ct_addr = PHYS_TO_TIODMA(paddr); + else + ct_addr = paddr; /* * If the device can generate 64 bit addresses, create a D64 map. - * Since this should never fail, bypass the rest of the checks. */ if (dma_mask == ~0UL) { - mapaddr = tioce_dma_d64(ct_addr); - goto dma_map_done; + mapaddr = tioce_dma_d64(ct_addr, dma_flags); + if (mapaddr) + goto dma_map_done; } pcidev_to_tioce(pdev, NULL, &ce_kern, &port); @@ -454,18 +471,22 @@ if (byte_count > MB(64)) { mapaddr = tioce_alloc_map(ce_kern, TIOCE_ATE_M40S, - port, ct_addr, byte_count); + port, ct_addr, byte_count, + dma_flags); if (!mapaddr) mapaddr = tioce_alloc_map(ce_kern, TIOCE_ATE_M40, -1, - ct_addr, byte_count); + ct_addr, byte_count, + dma_flags); } else { mapaddr = tioce_alloc_map(ce_kern, TIOCE_ATE_M40, -1, - ct_addr, byte_count); + ct_addr, byte_count, + dma_flags); if (!mapaddr) mapaddr = tioce_alloc_map(ce_kern, TIOCE_ATE_M40S, - port, ct_addr, byte_count); + port, ct_addr, byte_count, + dma_flags); } } @@ -473,7 +494,7 @@ * 32-bit direct is the next mode to try */ if (!mapaddr && dma_mask >= 0xffffffffUL) - mapaddr = tioce_dma_d32(pdev, ct_addr); + mapaddr = tioce_dma_d32(pdev, ct_addr, dma_flags); /* * Last resort, try 32-bit ATE-based map. 
@@ -481,12 +502,12 @@ if (!mapaddr) mapaddr = tioce_alloc_map(ce_kern, TIOCE_ATE_M32, -1, ct_addr, - byte_count); + byte_count, dma_flags); spin_unlock_irqrestore(&ce_kern->ce_lock, flags); dma_map_done: - if (mapaddr & barrier) + if (mapaddr && barrier) mapaddr = tioce_dma_barrier(mapaddr, 1); return mapaddr; @@ -502,9 +523,9 @@ * in the address. */ static uint64_t -tioce_dma(struct pci_dev *pdev, uint64_t paddr, size_t byte_count) +tioce_dma(struct pci_dev *pdev, uint64_t paddr, size_t byte_count, int dma_flags) { - return tioce_do_dma_map(pdev, paddr, byte_count, 0); + return tioce_do_dma_map(pdev, paddr, byte_count, 0, dma_flags); } /** @@ -516,9 +537,9 @@ * Simply call tioce_do_dma_map() to create a map with the barrier bit set * in the address. */ static uint64_t -tioce_dma_consistent(struct pci_dev *pdev, uint64_t paddr, size_t byte_count) +tioce_dma_consistent(struct pci_dev *pdev, uint64_t paddr, size_t byte_count, int dma_flags) { - return tioce_do_dma_map(pdev, paddr, byte_count, 1); + return tioce_do_dma_map(pdev, paddr, byte_count, 1, dma_flags); } /** Index: linux-2.6.14/include/asm-ia64/sn/intr.h =================================================================== --- linux-2.6.14.orig/include/asm-ia64/sn/intr.h 2005-12-21 22:35:23.446284291 -0600 +++ linux-2.6.14/include/asm-ia64/sn/intr.h 2005-12-21 22:38:24.094684947 -0600 @@ -3,13 +3,14 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (C) 1992 - 1997, 2000-2004 Silicon Graphics, Inc. All rights reserved. + * Copyright (C) 1992 - 1997, 2000-2005 Silicon Graphics, Inc. All rights reserved. 
*/ #ifndef _ASM_IA64_SN_INTR_H #define _ASM_IA64_SN_INTR_H #include +#include #define SGI_UART_VECTOR (0xe9) @@ -40,6 +41,7 @@ int irq_cpuid; /* kernel logical cpuid */ int irq_irq; /* the IRQ number */ int irq_int_bit; /* Bridge interrupt pin */ + /* <0 means MSI */ uint64_t irq_xtalkaddr; /* xtalkaddr IRQ is sent to */ int irq_bridge_type;/* pciio asic type (pciio.h) */ void *irq_bridge; /* bridge generating irq */ @@ -53,6 +55,12 @@ }; extern void sn_send_IPI_phys(int, long, int, int); +extern uint64_t sn_intr_alloc(nasid_t, int, + struct sn_irq_info *, + int, nasid_t, int); +extern void sn_intr_free(nasid_t, int, struct sn_irq_info *); +extern struct sn_irq_info *sn_retarget_vector(struct sn_irq_info *, nasid_t, int); +extern struct list_head **sn_irq_lh; #define CPU_VECTOR_TO_IRQ(cpuid,vector) (vector) Index: linux-2.6.14/include/asm-ia64/sn/pcibr_provider.h =================================================================== --- linux-2.6.14.orig/include/asm-ia64/sn/pcibr_provider.h 2005-12-21 22:35:23.446284291 -0600 +++ linux-2.6.14/include/asm-ia64/sn/pcibr_provider.h 2005-12-21 22:38:24.094684947 -0600 @@ -3,7 +3,7 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (C) 1992-1997,2000-2004 Silicon Graphics, Inc. All rights reserved. + * Copyright (C) 1992-1997,2000-2005 Silicon Graphics, Inc. All rights reserved. 
*/ #ifndef _ASM_IA64_SN_PCI_PCIBR_PROVIDER_H #define _ASM_IA64_SN_PCI_PCIBR_PROVIDER_H @@ -55,6 +55,7 @@ #define PCI32_ATE_V (0x1 << 0) #define PCI32_ATE_CO (0x1 << 1) #define PCI32_ATE_PREC (0x1 << 2) +#define PCI32_ATE_MSI (0x1 << 2) #define PCI32_ATE_PREF (0x1 << 3) #define PCI32_ATE_BAR (0x1 << 4) #define PCI32_ATE_ADDR_SHFT 12 @@ -129,8 +130,8 @@ extern int pcibr_init_provider(void); extern void *pcibr_bus_fixup(struct pcibus_bussoft *, struct pci_controller *); -extern dma_addr_t pcibr_dma_map(struct pci_dev *, unsigned long, size_t); -extern dma_addr_t pcibr_dma_map_consistent(struct pci_dev *, unsigned long, size_t); +extern dma_addr_t pcibr_dma_map(struct pci_dev *, unsigned long, size_t, int type); +extern dma_addr_t pcibr_dma_map_consistent(struct pci_dev *, unsigned long, size_t, int type); extern void pcibr_dma_unmap(struct pci_dev *, dma_addr_t, int); /* Index: linux-2.6.14/include/asm-ia64/sn/pcibus_provider_defs.h =================================================================== --- linux-2.6.14.orig/include/asm-ia64/sn/pcibus_provider_defs.h 2005-12-21 22:35:23.446284291 -0600 +++ linux-2.6.14/include/asm-ia64/sn/pcibus_provider_defs.h 2005-12-21 22:38:24.095684706 -0600 @@ -3,7 +3,7 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (C) 1992 - 1997, 2000-2004 Silicon Graphics, Inc. All rights reserved. + * Copyright (C) 1992 - 1997, 2000-2005 Silicon Graphics, Inc. All rights reserved. 
*/ #ifndef _ASM_IA64_SN_PCI_PCIBUS_PROVIDER_H #define _ASM_IA64_SN_PCI_PCIBUS_PROVIDER_H @@ -45,13 +45,24 @@ */ struct sn_pcibus_provider { - dma_addr_t (*dma_map)(struct pci_dev *, unsigned long, size_t); - dma_addr_t (*dma_map_consistent)(struct pci_dev *, unsigned long, size_t); + dma_addr_t (*dma_map)(struct pci_dev *, unsigned long, size_t, int flags); + dma_addr_t (*dma_map_consistent)(struct pci_dev *, unsigned long, size_t, int flags); void (*dma_unmap)(struct pci_dev *, dma_addr_t, int); void * (*bus_fixup)(struct pcibus_bussoft *, struct pci_controller *); void (*force_interrupt)(struct sn_irq_info *); void (*target_interrupt)(struct sn_irq_info *); }; +/* + * Flags used by the map interfaces + * bits 3:0 specify the format of the passed-in address + * bit 4 specifies that the address is to be used for MSI + */ + +#define SN_DMA_ADDRTYPE(x) ((x) & 0xf) +#define SN_DMA_ADDR_PHYS 1 /* address is phys memory */ +#define SN_DMA_ADDR_XIO 2 /* address is an xio address. */ +#define SN_DMA_MSI 0x10 /* Bus address is to be used for MSI */ + extern struct sn_pcibus_provider *sn_pci_provider[]; #endif /* _ASM_IA64_SN_PCI_PCIBUS_PROVIDER_H */ Index: linux-2.6.14/include/asm-ia64/sn/tiocp.h =================================================================== --- linux-2.6.14.orig/include/asm-ia64/sn/tiocp.h 2005-12-21 22:35:23.446284291 -0600 +++ linux-2.6.14/include/asm-ia64/sn/tiocp.h 2005-12-21 22:38:24.095684706 -0600 @@ -3,13 +3,14 @@ * License. See the file "COPYING" in the main directory of this archive * for more details. * - * Copyright (C) 2003-2004 Silicon Graphics, Inc. All rights reserved. + * Copyright (C) 2003-2005 Silicon Graphics, Inc. All rights reserved. 
*/ #ifndef _ASM_IA64_SN_PCI_TIOCP_H #define _ASM_IA64_SN_PCI_TIOCP_H #define TIOCP_HOST_INTR_ADDR 0x003FFFFFFFFFFFFFUL #define TIOCP_PCI64_CMDTYPE_MEM (0x1ull << 60) +#define TIOCP_PCI64_CMDTYPE_MSI (0x3ull << 60) /***************************************************************************** From greg at kroah.com Fri Dec 23 08:44:46 2005 From: greg at kroah.com (Greg KH) Date: Thu, 22 Dec 2005 13:44:46 -0800 Subject: [PATCH 0/3] msi abstractions and support for altix In-Reply-To: <20051222203824.GJ17552@sgi.com> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> <20051222202259.GA4959@suse.de> <20051222202627.GI17552@sgi.com> <20051222203415.GA28240@suse.de> <20051222203824.GJ17552@sgi.com> Message-ID: <20051222214446.GC14978@kroah.com> On Thu, Dec 22, 2005 at 02:38:24PM -0600, Mark Maule wrote: > On Thu, Dec 22, 2005 at 12:34:15PM -0800, Greg KH wrote: > > On Thu, Dec 22, 2005 at 02:26:27PM -0600, Mark Maule wrote: > > > On Thu, Dec 22, 2005 at 12:22:59PM -0800, Greg KH wrote: > > > > On Thu, Dec 22, 2005 at 02:15:44PM -0600, Mark Maule wrote: > > > > > Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer > > > > > > > > I'll wait for Resend #3 based on my previous comments before considering > > > > adding it to my kernel trees:) > > > > > > > > > > Resend #2 includes the correction to the irq_vector[] declaration, and I > > > responded to the question about setting irq_vector[0] if that's what you > > > mean ... > > > > Sorry, but I missed that last response. Why do you set the [0] value in > > a #ifdef now? > > Because on ia64 IA64_FIRST_DEVICE_VECTOR and IA64_LAST_DEVICE_VECTOR > (from which MSI FIRST_DEVICE_VECTOR/LAST_DEVICE_VECTOR are derived) are not > constants. The are now global variables (see change to asm-ia64/hw_irq.h) > to allow the platform to override them. Altix uses a reduced range of > vectors for devices, and this change was necessary to make assign_irq_vector() > to work on altix. 
I'm with Matthew on this one, that's not a real fix for this. What
would PPC64 do in this case?

thanks,

greg k-h

From david at gibson.dropbear.id.au Fri Dec 23 09:45:36 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Fri, 23 Dec 2005 09:45:36 +1100
Subject: RFC: Use bitmaps to track free/user SLB slots
In-Reply-To: <20051222153156.GA24601@pb15.lixom.net>
References: <20051222064235.GE9475@localhost.localdomain> <20051222153156.GA24601@pb15.lixom.net>
Message-ID: <20051222224536.GA11853@localhost.localdomain>

On Thu, Dec 22, 2005 at 09:31:57AM -0600, Olof Johansson wrote:
> On Thu, Dec 22, 2005 at 05:42:35PM +1100, David Gibson wrote:
> > This needs way more testing and thought before being considered for
> > merging, but here it is in case people are interested. It implements
> > a new, possibly superior approach to managing SLB entries.
>
> [...]
>
> > My preliminary tests (on POWER5 LPAR) seem to indicate that this has
> > essentially no effect (delta<1ns) on the time for a user SLB miss (the
> > cost of the bitmap manipulation is the same as that for maintaining
> > the old slb cache). Time for kernel SLB misses is probably slightly
> > increased; not measured, but I think it should be delta<~5ns. Context
> > switch time may be increased slightly; also not measured yet, but I
> > think it should be <0.5us at most and quite likely negligible in
> > comparison to the rest of a context switch. I've no idea what the
> > impact on SLB miss rates for various workloads might be.
>
> So, essentially what you're saying is that it's more complex than the
> older one, slightly slower in the execution path and it has no proven
> benefit?

Yes. Well, except I don't think it's really more complex than the
various cases of dealing with not-full-cache, just-full-cache and
overfull-cache we have now.

> Until we see numbers where the old code causes too many SLB misses, and
> this patch reduces the miss rate, this is just unneeded complexity and
> over-engineering.
> But I'd be happy to be proven wrong on this.

We already know we get a heap of SLB misses with some workloads
(database, mostly, IIRC). Whether this will help significantly I don't
know, but since it's one of extremely few things I can think of which
might conceivably reduce the SLB miss rate, I think it's worth testing.
Bear in mind that a single SLB miss is around 200 cycles.

> I'll have a read-through of the code as well, but I need some coffee
> before that, especially given the lack of comments in the new assembly.
> :-)
>
>
> -Olof
>

--
David Gibson                   | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
                               | _way_ _around_!
http://www.ozlabs.org/~dgibson

From benh at kernel.crashing.org Fri Dec 23 12:17:26 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 23 Dec 2005 12:17:26 +1100
Subject: Accessing NVRAM on the Maple board
In-Reply-To: <75787748-0F0B-47E8-AE18-D3A7B65D9BCF@austin.rr.com>
References: <75787748-0F0B-47E8-AE18-D3A7B65D9BCF@austin.rr.com>
Message-ID: <1135300646.10035.229.camel@gaston>

On Wed, 2005-12-21 at 18:28 -0600, Dave Willoughby wrote:
> I'm not sure the Maple board has what I think of as NVRAM, the way
> Apple or pSeries
> PowerPC computers have NVRAM.
>
> The Maple firmware has a PIBS firmware prompt that allows setting
> boot configuration values
> if that's what you are looking for.

It also does have an nvram but it's mostly used for communication
between the service processor and the host OS ...

Ben.

From oozey at web.de Fri Dec 23 23:33:24 2005
From: oozey at web.de (Jan Schukat)
Date: Fri, 23 Dec 2005 13:33:24 +0100
Subject: ppc32 on Xserve?
Message-ID: <877247003@web.de>

Hello,
I got myself an Xserve G5 Single Processor, and want to put a Debian on it, since that is what I have most experience with (and software and performance).
But I still have pretty little experience with ppc hardware, so that could become a catch.
I have a Debian on my PowerBook and have a pretty customized (minimum components, no initrd, no sound, no graphics etc., as much as possible compiled in) own kernel running there.
But my time to fiddle on my Xserve is somewhat limited, since it is in an office where I have a permanent connection and own IP. So I want to prepare as much as possible before going there. And the main thing to prepare is to have a proper kernel package that won't let me down when I put it on there.

So here finally my two questions:

1. I haven't found a kernel_config as a place to start with customizing, can anyone be so kind and point me to one?
2. Since I only have 1GB memory, I figured it would be a waste to use 64bit. But do the SATA/RAID/Thermal/Network drivers work with 32bit? I won't need wireless.

Finally, if anyone has anything additional to say, like things I should take care of and that someone inexperienced with the hardware can easily break, that would be appreciated too.

Once I have my System running there, I will definitely put up a page with my experiences. And I'm also willing to be (ab)used for newest patch testing ;)

Regards

Jan

From miltonm at bga.com Sat Dec 24 02:01:54 2005
From: miltonm at bga.com (Milton Miller)
Date: Fri, 23 Dec 2005 09:01:54 -0600
Subject: ppc32 on Xserve?
Message-ID:

On Fri Dec 23 23:33:24 EST 2005, Jan Schukat wrote:
>
> Hello,
> I got myself an Xserve G5 Single Processor, and want to put a Debian
> on it, since that is what I have most experience with (and software and
> performance).
> But I still have pretty little experience with ppc hardware, so that
> could become a catch.
> I have a Debian on my PowerBook and have a pretty customized (minimum
> components, no initrd, no sound, no graphics etc., as much as
> possible compiled in) own kernel running there.
> But my time to fiddle on my Xserve is somewhat limited, since it is in an
> office where I have a permanent connection and own IP.
> So I want to
> prepare as much as possible before going there. And the main thing to
> prepare is to have a proper kernel package that won't let me
> down when I put it on there.
>
> So here finally my two questions:
>
> 1. I haven't found a kernel_config as a place to start with
> customizing, can anyone be so kind and point me to one?

I would guess the ARCH=powerpc g5_defconfig should be a good starting
place. (This assumes something like recent upstream 2.6.15-rc6 or so.)

> 2. Since I only have 1GB memory, I figured it would be a waste to use
> 64bit. But do the SATA/RAID/Thermal/Network drivers work with 32bit?
> I won't need wireless.

The current 64 bit PowerPC architecture has several significant
differences for kernel programming compared to the 32 bit one. While
this does not affect drivers, it does affect the mmu code and the early
assembly, and even the order in which we call some of the early bringup
functions. For that reason, to run on a G5 you need to use a 64 bit
kernel.[1]

You should still be able to use the 32 bit userland you are familiar
with. Current Debian includes the biarch compiler that should allow you
to compile 64 bit kernels.

Since you said your fiddle time is limited, to check your configuration,
you might consider building a test kernel with similar features (just
changing the cpu type and checking that the mac feature stuff is
re-enabled) on your PowerBook.

[1] Yes, there was support put back in briefly to run g5 in ARCH=ppc,
but that has been unmaintained and is expected to disappear during the
merge into powerpc.

> Finally, if anyone has anything additional to say, like things I
> should take care of and that someone inexperienced with the hardware can easily
> break, that would be appreciated too.

I don't have a lot of experience with G5 processors, just other 64 bit
ones. Therefore I can't comment on system issues. And while I have used
Debian quite a bit, I have let others administer it.
One utility you might look for is the "ppc" utility that changes the uname returned to be ppc instead of ppc64. Sometimes that lets packages install that otherwise think they are not on the right architecture. Unfornately I am not sure where to obtain it. > Once I have my System running there, I will definetely put up a page > with my experiences. And I'm also willing to be (ab)used for newest > patch testing ;) > > Regards > > Jan Hope this helps, milton From maule at sgi.com Sat Dec 24 02:32:15 2005 From: maule at sgi.com (Mark Maule) Date: Fri, 23 Dec 2005 09:32:15 -0600 Subject: [PATCH 0/3] msi abstractions and support for altix In-Reply-To: <20051222214446.GC14978@kroah.com> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> <20051222202259.GA4959@suse.de> <20051222202627.GI17552@sgi.com> <20051222203415.GA28240@suse.de> <20051222203824.GJ17552@sgi.com> <20051222214446.GC14978@kroah.com> Message-ID: <20051223153215.GA11935@sgi.com> On Thu, Dec 22, 2005 at 01:44:46PM -0800, Greg KH wrote: > On Thu, Dec 22, 2005 at 02:38:24PM -0600, Mark Maule wrote: > > On Thu, Dec 22, 2005 at 12:34:15PM -0800, Greg KH wrote: > > > On Thu, Dec 22, 2005 at 02:26:27PM -0600, Mark Maule wrote: > > > > On Thu, Dec 22, 2005 at 12:22:59PM -0800, Greg KH wrote: > > > > > On Thu, Dec 22, 2005 at 02:15:44PM -0600, Mark Maule wrote: > > > > > > Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer > > > > > > > > > > I'll wait for Resend #3 based on my previous comments before considering > > > > > adding it to my kernel trees:) > > > > > > > > > > > > > Resend #2 includes the correction to the irq_vector[] declaration, and I > > > > responded to the question about setting irq_vector[0] if that's what you > > > > mean ... > > > > > > Sorry, but I missed that last response. Why do you set the [0] value in > > > a #ifdef now? 
> > > > Because on ia64 IA64_FIRST_DEVICE_VECTOR and IA64_LAST_DEVICE_VECTOR > > (from which MSI FIRST_DEVICE_VECTOR/LAST_DEVICE_VECTOR are derived) are not > > constants. The are now global variables (see change to asm-ia64/hw_irq.h) > > to allow the platform to override them. Altix uses a reduced range of > > vectors for devices, and this change was necessary to make assign_irq_vector() > > to work on altix. > > I'm with Matthew on this one, that's not a real fix for this. What > would PPC64 do in this case? Using the existing framework, wouldn't PPC just define it's own assign_irq_vector and {FIRST,LAST}_DEVICE_VECTOR and handle it however it wants under the covers? I agree that this is not a great solution, but it's what the existing framework allowed. I'm willing to pursue a more general vector allocation scheme, but I suspect that'll take some time. Is this issue going to hold up forward progress of this patchset? IMO, this set is a major step in generalizing the MSI code and I think the vector generalizing code would best be handled by a separate effort. 
Mark Mark From greg at kroah.com Sat Dec 24 03:32:21 2005 From: greg at kroah.com (Greg KH) Date: Fri, 23 Dec 2005 08:32:21 -0800 Subject: [PATCH 0/3] msi abstractions and support for altix In-Reply-To: <20051223153215.GA11935@sgi.com> References: <20051222201651.2019.37913.96422@lnx-maule.americas.sgi.com> <20051222202259.GA4959@suse.de> <20051222202627.GI17552@sgi.com> <20051222203415.GA28240@suse.de> <20051222203824.GJ17552@sgi.com> <20051222214446.GC14978@kroah.com> <20051223153215.GA11935@sgi.com> Message-ID: <20051223163221.GA13018@kroah.com> On Fri, Dec 23, 2005 at 09:32:15AM -0600, Mark Maule wrote: > On Thu, Dec 22, 2005 at 01:44:46PM -0800, Greg KH wrote: > > On Thu, Dec 22, 2005 at 02:38:24PM -0600, Mark Maule wrote: > > > On Thu, Dec 22, 2005 at 12:34:15PM -0800, Greg KH wrote: > > > > On Thu, Dec 22, 2005 at 02:26:27PM -0600, Mark Maule wrote: > > > > > On Thu, Dec 22, 2005 at 12:22:59PM -0800, Greg KH wrote: > > > > > > On Thu, Dec 22, 2005 at 02:15:44PM -0600, Mark Maule wrote: > > > > > > > Resend #2: including linuxppc64-dev and linux-pci as well as PCI maintainer > > > > > > > > > > > > I'll wait for Resend #3 based on my previous comments before considering > > > > > > adding it to my kernel trees:) > > > > > > > > > > > > > > > > Resend #2 includes the correction to the irq_vector[] declaration, and I > > > > > responded to the question about setting irq_vector[0] if that's what you > > > > > mean ... > > > > > > > > Sorry, but I missed that last response. Why do you set the [0] value in > > > > a #ifdef now? > > > > > > Because on ia64 IA64_FIRST_DEVICE_VECTOR and IA64_LAST_DEVICE_VECTOR > > > (from which MSI FIRST_DEVICE_VECTOR/LAST_DEVICE_VECTOR are derived) are not > > > constants. The are now global variables (see change to asm-ia64/hw_irq.h) > > > to allow the platform to override them. Altix uses a reduced range of > > > vectors for devices, and this change was necessary to make assign_irq_vector() > > > to work on altix. 
> > > > I'm with Matthew on this one, that's not a real fix for this. What > > would PPC64 do in this case? > > Using the existing framework, wouldn't PPC just define it's own > assign_irq_vector and {FIRST,LAST}_DEVICE_VECTOR and handle it however it > wants under the covers? > > I agree that this is not a great solution, but it's what the existing framework > allowed. I'm willing to pursue a more general vector allocation scheme, but > I suspect that'll take some time. > > Is this issue going to hold up forward progress of this patchset? IMO, this > set is a major step in generalizing the MSI code and I think the vector > generalizing code would best be handled by a separate effort. I don't know, let's see what the ppc64 developers say. If they are happy with this implementation, then it might be ok... Ben? thanks, greg k-h From zarniwhoop at ntlworld.com Sat Dec 24 11:45:48 2005 From: zarniwhoop at ntlworld.com (Ken Moffat) Date: Sat, 24 Dec 2005 00:45:48 +0000 (GMT) Subject: ppc32 on Xserve? In-Reply-To: References: Message-ID: On Fri, 23 Dec 2005, Milton Miller wrote: > > One utility you might look for is the "ppc" utility that changes the uname > returned to be ppc instead of ppc64. Sometimes that lets packages install > that otherwise think they are not on the right architecture. Unfornately I > am not sure where to obtain it. > That sounds like 'linux32' which I grabbed from my local debian mirror - on my G5 with fully 32-bit userspace (non-debian, built from source), everything seems to work as if it was just a ppc (I had to wrap agetty, and then gdm, to call them from linux32, otherwise configure scripts tried to build a 64-bit system). But, you might not need it - I couldn't boot debian on my G5 SMU, but ubuntu seemed to give me a fully 32-bit userspace and a uname of ppc. 
Ken
-- 
das eine Mal als Tragödie, das andere Mal als Farce

From segher at kernel.crashing.org Sun Dec 25 13:05:47 2005
From: segher at kernel.crashing.org (Segher Boessenkool)
Date: Sun, 25 Dec 2005 03:05:47 +0100
Subject: [RFC] powerpc: Merge 32/64 cacheflush code
In-Reply-To: <2E963BF2-D5EA-40BF-9614-1F2A686F6C00@kernel.crashing.org>
References: <20051219054410.GB13285@localhost.localdomain> <20051219235455.GB29993@localhost.localdomain> <5648700F-9B2F-49F0-ADA8-278328C4CF17@kernel.crashing.org> <2E963BF2-D5EA-40BF-9614-1F2A686F6C00@kernel.crashing.org>
Message-ID: <3a76be5afd0badadd869ddd1b1695ec0@kernel.crashing.org>

>>>> Why dont we just use the cache line information in the cputable?
>>>> Why the introduction of this new powerpc_caches structure?
>>>
>>> Because the device tree can override the information from the
>>> cputable. Oh, and the structure is only new for ppc32.
>>
>> If the device tree overrides the cputable should we not believe it?
>> I guess I dont understand why we need the same information in
>> multiple places?

In an ideal world, you won't need the information in the cputable at
all -- not just for cache info, but for everything (except perhaps the
human-readable CPU name). All this information should be provided by
the device tree.

In the real world, this information might be wrong / missing. Best
would be for the kernel to require the info in the device tree, and
have some fixups for the bad cases.

This same strategy can be applied to everything device-tree related,
btw...
Segher From amodra at bigpond.net.au Tue Dec 27 11:03:08 2005 From: amodra at bigpond.net.au (Alan Modra) Date: Tue, 27 Dec 2005 10:33:08 +1030 Subject: GCC 3.3.6: ICE in emit_move_insn, at expr.c:3198 In-Reply-To: <43A9FCA4.3070104@mvista.com> References: <43A9FCA4.3070104@mvista.com> Message-ID: <20051227000308.GA2770@bubble.grove.modra.org> On Wed, Dec 21, 2005 at 05:08:52PM -0800, Khem Raj wrote: > While compiling gcc 3.3.6 for powerpc64 I encountered an ICE in GCC > cross build. Why are you trying to use an old compiler? Did you apply my patchset to the 3.3 sources? (It may be out of date as it is 2 years since I last updated it.) I recommend using a newer compiler for powerpc64. If you must use 3.3 for some reason, then at least use 3.3-hammer. -- Alan Modra IBM OzLabs - Linux Technology Centre From brian.jewell at themis.com Wed Dec 28 09:38:55 2005 From: brian.jewell at themis.com (brian jewell) Date: Tue, 27 Dec 2005 14:38:55 -0800 Subject: Accessing NVRAM on the Maple board References: <75787748-0F0B-47E8-AE18-D3A7B65D9BCF@austin.rr.com> <1135300646.10035.229.camel@gaston> Message-ID: <000801c60b36$521b7af0$e2010a0a@themis.com> Benjamin, I think I need to reword my question... The company I work for is building a small footprint PPC970 board, based on the Maple reference design, that uses a proprietary service processor. I need to be able to read NVRAM from Linux on our PPC970 board. There is a driver in /arch/ppc64/kernel called "nvram.c" that looks like it would provide the capability of reading NVRAM. I was wondering if anyone knew anything about this driver, such as if there is any documentation on how to use it? Thanks for your reply. 
--Brian ----- Original Message ----- From: "Benjamin Herrenschmidt" To: "Dave Willoughby" Cc: "brian jewell" ; Sent: Thursday, December 22, 2005 5:17 PM Subject: Re: Accessing NVRAM on the Maple board > On Wed, 2005-12-21 at 18:28 -0600, Dave Willoughby wrote: >> I'm not sure the Maple board has what I think of as NVRAM, the way >> Apple or pSeries >> PowerPC computers have NVRAM. >> >> The Maple firmware has a PIBS firmware prompt that allows setting >> boot configuration values >> if that's what you are looking for. > > It also does have an nvram but it's mostly used for communication > between the service processor and the host OS ... > > Ben. > > From benh at kernel.crashing.org Wed Dec 28 10:21:47 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Wed, 28 Dec 2005 10:21:47 +1100 Subject: Accessing NVRAM on the Maple board In-Reply-To: <000801c60b36$521b7af0$e2010a0a@themis.com> References: <75787748-0F0B-47E8-AE18-D3A7B65D9BCF@austin.rr.com> <1135300646.10035.229.camel@gaston> <000801c60b36$521b7af0$e2010a0a@themis.com> Message-ID: <1135725707.4780.56.camel@localhost.localdomain> On Tue, 2005-12-27 at 14:38 -0800, brian jewell wrote: > Benjamin, > > I think I need to reword my question... > > The company I work for is building a small footprint PPC970 board, based on > the Maple reference design, that uses a proprietary service processor. I > need to be able to read NVRAM from Linux on our PPC970 board. > > There is a driver in /arch/ppc64/kernel called "nvram.c" that looks > like it would provide the capability of reading NVRAM. I was wondering if > anyone knew anything about this driver, such as if there is any > documentation on how to use it? What kernel are you using ? arch/ppc64 is gone on recent kernels... The powerpc architecture provides indeed a generic nvram access driver (nowadays, it's in arch/powerpc/kernel/nvram_64.c), which is just a read/write interface via /dev/nvram that also implements open firmware partitions. 
If you don't want to use that format, you may need to modify the driver slightly to get you raw access. The driver itself doesn't have any code for the nvram access itself, it goes through a pair of platform callbacks that your platform code will have to provide for doing the actual reads & writes to the nvram. Ben. From sven.luther at wanadoo.fr Wed Dec 28 23:12:33 2005 From: sven.luther at wanadoo.fr (Sven Luther) Date: Wed, 28 Dec 2005 13:12:33 +0100 Subject: ppc32 on Xserve? In-Reply-To: <877247003@web.de> References: <877247003@web.de> Message-ID: <20051228121232.GA14557@localhost.localdomain> On Fri, Dec 23, 2005 at 01:33:24PM +0100, Jan Schukat wrote: > Hello, > I got myself an Xserve G5 Single Processor, and want to put a Debian on it, since that is what I have most experience with (and sofware and performance). > But I still have pretty little experience with ppc hardware, so that could become a catch. > I have a Debian on my PowerBook and have a pretty customized (minimum components, no initrd, no sound, no graphics etc., as much as possible compiled in) own kernel running there. > But my time to fiddle on my Xserve is somewhat limited, since it is an office where I have a permanent connection and own IP. So I want to prepare as much as possible before going there. And the main thing to prepare is have a prepare a proper kernel-package, that won't let me down when I put it on there. > > So here finally my two questions: > > 1. I haven't found a kernel_config as a place to start with customizing, can anyone be so kind and point me to one? > 2. Since I only have 1GB memory, I figured it would be a waste to use 64bit. But do the SATA/RAID/Thermal/Netweork drivers work with 32bit? I won't need wireless. > > Finally, if anyone has anything additional to say, like things I should take care of and that inexperienced with the hardware an easily break, that would be appreciated too. 
> Once I have my System running there, I will definetely put up a page with my experiences. And I'm also willing to be (ab)used for newest patch testing ;)

The install64 target of the debian-installer (etch beta1 or sid daily
builds probably) should install just fine. I would forget about the
32bit kernel for this, although using a 64bit kernel doesn't mean you
have to use a 64bit userland, and indeed Debian is (still) 32bit only
for the powerpc userland.

Friendly,

Sven Luther

From sven.luther at wanadoo.fr Thu Dec 29 08:58:25 2005
From: sven.luther at wanadoo.fr (Sven Luther)
Date: Wed, 28 Dec 2005 22:58:25 +0100
Subject: ARCH=powerpc 64bit 2.6.15-rc7 build error : ld: drivers/built-in.o section .init.text exceeds stub group size
Message-ID: <20051228215825.GA28650@localhost.localdomain>

Hi, ...

While trying to build the Debian 2.6.15-rc7 ARCH=powerpc 64bit kernel,
I get the following:

  LD      init/built-in.o
  LD      .tmp_vmlinux1
ld: drivers/built-in.o section .init.text exceeds stub group size
ld: arch/powerpc/kernel/built-in.o section .init.text exceeds stub group size
ld: net/built-in.o section .text exceeds stub group size
ld: drivers/built-in.o section .text exceeds stub group size
ld: block/built-in.o section .text exceeds stub group size
ld: security/built-in.o section .text exceeds stub group size
ld: ipc/built-in.o section .text exceeds stub group size
ld: fs/built-in.o section .text exceeds stub group size
ld: mm/built-in.o section .text exceeds stub group size
ld: kernel/built-in.o section .text exceeds stub group size
ld: arch/powerpc/platforms/built-in.o section .text exceeds stub group size
ld: arch/powerpc/kernel/built-in.o section .text exceeds stub group size
ld: arch/powerpc/kernel/head_64.o section .text exceeds stub group size
  KSYM    .tmp_kallsyms1.S
  AS      .tmp_kallsyms1.o
  LD      .tmp_vmlinux2
ld: drivers/built-in.o section .init.text exceeds stub group size
ld: arch/powerpc/kernel/built-in.o section .init.text exceeds stub group size
ld: net/built-in.o section .text exceeds stub group size
ld: drivers/built-in.o section .text exceeds stub group size
ld: block/built-in.o section .text exceeds stub group size
ld: security/built-in.o section .text exceeds stub group size
ld: ipc/built-in.o section .text exceeds stub group size
ld: fs/built-in.o section .text exceeds stub group size
ld: mm/built-in.o section .text exceeds stub group size
ld: kernel/built-in.o section .text exceeds stub group size
ld: arch/powerpc/platforms/built-in.o section .text exceeds stub group size
ld: arch/powerpc/kernel/built-in.o section .text exceeds stub group size
ld: arch/powerpc/kernel/head_64.o section .text exceeds stub group size
  KSYM    .tmp_kallsyms2.S
  AS      .tmp_kallsyms2.o
  LD      vmlinux
ld: drivers/built-in.o section .init.text exceeds stub group size
ld: arch/powerpc/kernel/built-in.o section .init.text exceeds stub group size
ld: net/built-in.o section .text exceeds stub group size
ld: drivers/built-in.o section .text exceeds stub group size
ld: block/built-in.o section .text exceeds stub group size
ld: security/built-in.o section .text exceeds stub group size
ld: ipc/built-in.o section .text exceeds stub group size
ld: fs/built-in.o section .text exceeds stub group size
ld: mm/built-in.o section .text exceeds stub group size
ld: kernel/built-in.o section .text exceeds stub group size
ld: arch/powerpc/platforms/built-in.o section .text exceeds stub group size
ld: arch/powerpc/kernel/built-in.o section .text exceeds stub group size
ld: arch/powerpc/kernel/head_64.o section .text exceeds stub group size
  SYSMAP  System.map
  SYSMAP  .tmp_System.map

I never saw those before, so I wonder if these are harmless messages
coming from the debug symbols which are too big or something, or if I
should worry about them.

Any help on this would be very welcome.
Friendly,

Sven Luther

From anton at samba.org Thu Dec 29 10:46:29 2005
From: anton at samba.org (Anton Blanchard)
Date: Thu, 29 Dec 2005 10:46:29 +1100
Subject: [PATCH] ppc64: htab_initialize_secondary cannot be marked __init
Message-ID: <20051228234629.GA18479@krispykreme>

Sonny has noticed hotplug CPU on ppc64 is broken in 2.6.15-*. One of the
problems is that htab_initialize_secondary is called when a cpu is being
brought up, but it is marked __init.

Signed-off-by: Anton Blanchard
---

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index a33583f..a606504 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -514,7 +514,7 @@ void __init htab_initialize(void)
 #undef KB
 #undef MB
 
-void __init htab_initialize_secondary(void)
+void htab_initialize_secondary(void)
 {
 	if (!platform_is_lpar())
 		mtspr(SPRN_SDR1, _SDR1);

From paulus at samba.org Thu Dec 29 11:11:51 2005
From: paulus at samba.org (Paul Mackerras)
Date: Thu, 29 Dec 2005 11:11:51 +1100
Subject: [PATCH] ppc64: htab_initialize_secondary cannot be marked __init
In-Reply-To: <20051228234629.GA18479@krispykreme>
References: <20051228234629.GA18479@krispykreme>
Message-ID: <17331.10695.861398.383827@cargo.ozlabs.ibm.com>

Anton Blanchard writes:
>
> Sonny has noticed hotplug CPU on ppc64 is broken in 2.6.15-*. One of the
> problems is that htab_initialize_secondary is called when a cpu is being
> brought up, but it is marked __init.
>
> Signed-off-by: Anton Blanchard

Acked-by: Paul Mackerras

From brian.jewell at themis.com Thu Dec 29 11:59:28 2005
From: brian.jewell at themis.com (brian jewell)
Date: Wed, 28 Dec 2005 16:59:28 -0800
Subject: Accessing NVRAM on the Maple board
In-Reply-To: <1135725707.4780.56.camel@localhost.localdomain>
Message-ID: 

Ben,

I am not "bleeding edge"; I'm using 2.6.14.4.

Is there any configuration required for using nvram.c (nvram_64.c)? I see the function names in the kernel map file.
But when I do an "open" on /dev/nvram, I get "No such device", even though the /dev/nvram device file is there, and I have set the permissions to "666". Do I need to include anything in my .config? I am basically building my kernel with the Maple default configuration file. Any help is appreciated. regards, --Brian -----Original Message----- From: Benjamin Herrenschmidt [mailto:benh at kernel.crashing.org] Sent: Tuesday, December 27, 2005 3:22 PM To: brian jewell Cc: Dave Willoughby; linuxppc64-dev at ozlabs.org Subject: Re: Accessing NVRAM on the Maple board On Tue, 2005-12-27 at 14:38 -0800, brian jewell wrote: > Benjamin, > > I think I need to reword my question... > > The company I work for is building a small footprint PPC970 board, based on > the Maple reference design, that uses a proprietary service processor. I > need to be able to read NVRAM from Linux on our PPC970 board. > > There is a driver in /arch/ppc64/kernel called "nvram.c" that looks > like it would provide the capability of reading NVRAM. I was wondering if > anyone knew anything about this driver, such as if there is any > documentation on how to use it? What kernel are you using ? arch/ppc64 is gone on recent kernels... The powerpc architecture provides indeed a generic nvram access driver (nowadays, it's in arch/powerpc/kernel/nvram_64.c), which is just a read/write interface via /dev/nvram that also implements open firmware partitions. If you don't want to use that format, you may need to modify the driver slightly to get you raw access. The driver itself doesn't have any code for the nvram access itself, it goes through a pair of platform callbacks that your platform code will have to provide for doing the actual reads & writes to the nvram. Ben. 
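
The "pair of platform callbacks" arrangement described above can be modelled in plain userspace C. This is only a sketch under stated assumptions: the struct, the function names and the fake 16-byte store are hypothetical and are not the kernel's actual interface. It illustrates one plausible reason an open or read can fail with "No such device" even though the device node exists: the generic front end has no platform hooks to call.

```c
/* Userspace model only; every name below is hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Stand-in for the hooks a platform would register with a generic driver. */
struct nvram_platform_ops {
    ssize_t (*size)(void);
    ssize_t (*read)(char *buf, size_t count, size_t offset);
};

/* Generic front end: it has no hardware access code of its own. */
static ssize_t generic_nvram_read(const struct nvram_platform_ops *ops,
                                  char *buf, size_t count, size_t offset)
{
    ssize_t size;

    assert(buf != NULL);

    /* No hooks registered: nothing to talk to, so report ENODEV. */
    if (ops == NULL || ops->size == NULL || ops->read == NULL)
        return -ENODEV;

    size = ops->size();
    if (size <= 0)
        return -ENODEV;
    if (offset >= (size_t)size)
        return 0;                       /* read at or past the end */
    if (count > (size_t)size - offset)
        count = (size_t)size - offset;  /* clamp to what is left */

    return ops->read(buf, count, offset);
}

/* Example "board support": a fake 16-byte nvram backed by an array. */
static char fake_nvram[16] = "board-nvram";

static ssize_t board_nvram_size(void)
{
    return (ssize_t)sizeof(fake_nvram);
}

static ssize_t board_nvram_read(char *buf, size_t count, size_t offset)
{
    memcpy(buf, fake_nvram + offset, count);
    return (ssize_t)count;
}

/* A platform that registers nothing, and one that registers both hooks. */
static const struct nvram_platform_ops no_hooks = { NULL, NULL };
static const struct nvram_platform_ops board_hooks = {
    board_nvram_size, board_nvram_read
};
```

With `no_hooks` every read returns `-ENODEV`; with `board_hooks` the same front end works unchanged, which is the point of keeping the access code in the platform layer.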
From benh at kernel.crashing.org Thu Dec 29 12:16:34 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 29 Dec 2005 12:16:34 +1100
Subject: Accessing NVRAM on the Maple board
In-Reply-To: 
References: 
Message-ID: <1135818994.4635.46.camel@localhost.localdomain>

On Wed, 2005-12-28 at 16:59 -0800, brian jewell wrote:
> Ben,
>
> I am not "bleeding edge"; I'm using 2.6.14.4.
>
> Is there any configuration required for using nvram.c (nvram_64.c)? I see
> the function names in the kernel map file. But when I do an "open" on
> /dev/nvram, I get "No such device", even though the /dev/nvram device file
> is there, and I have set the permissions to "666".
>
> Do I need to include anything in my .config? I am basically building my
> kernel with the Maple default configuration file.

As I told you earlier, Maple uses nvram for communication with the
service processor, and thus the maple support code doesn't expose the
nvram to userland. You'll have to write your own board support code if
you diverge from maple in that area (which should be fairly simple,
provided you correctly identify your board from the firmware).

Also, the current ppc64 nvram driver is designed to work with chrp-like
partitioned nvrams. You may have to change that in the code.

Ben.

From anton at samba.org Thu Dec 29 13:56:04 2005
From: anton at samba.org (Anton Blanchard)
Date: Thu, 29 Dec 2005 13:56:04 +1100
Subject: [PATCH] ppc64: per_cpu data optimisations
Message-ID: <20051229025604.GB18479@krispykreme>

Hi,

The current ppc64 per cpu data implementation is quite slow. eg:

        lhz     11,18(13)               /* smp_processor_id() */
        ld      9,.LC63-.LCTOC1(30)     /* per_cpu__variable_name */
        ld      8,.LC61-.LCTOC1(30)     /* __per_cpu_offset */
        sldi    11,11,3                 /* form index into __per_cpu_offset */
        mr      10,9
        ldx     9,11,8                  /* __per_cpu_offset[smp_processor_id()] */
        ldx     0,10,9                  /* load per cpu data */

5 loads for something that is supposed to be fast, pretty awful.
One reason for the large number of loads is that we have to synthesize
2 64bit constants (per_cpu__variable_name and __per_cpu_offset).

By putting __per_cpu_offset into the paca we can avoid the 2 loads
associated with it:

        ld      11,56(13)               /* paca->data_offset */
        ld      9,.LC59-.LCTOC1(30)     /* per_cpu__variable_name */
        ldx     0,9,11                  /* load per cpu data */

Unfortunately this patch exposes a bug that has been in ppc64 gcc
forever:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25572

Basically we can trash r30 - thanks to Alan Modra for fixing it. I have
implemented the workaround he suggested, changing RELOC_HIDE to use =r
instead of =g.

Longer term we should be able to do even better than 3 loads. If
per_cpu__variable_name wasn't a 64bit constant and paca->data_offset was
in a register we could cut it down to one load. A suggestion from Rusty
is to use gcc's __thread extension here. In order to do this we would
need to free up r13 (the __thread register and where the paca currently
is). So far I've had a few unsuccessful attempts at doing that :)

The patch also allocates per cpu memory node local on NUMA machines.
This patch from Rusty has been sitting in my queue _forever_ but stalled
when I hit the compiler bug. Sorry about that.

Finally I also only allocate per cpu data for possible cpus, which comes
straight out of the x86-64 port. On a pseries kernel (with NR_CPUS ==
128) and 4 possible cpus we see some nice gains:

             total       used       free     shared    buffers     cached
Mem:       4012228     212860    3799368          0          0     162424

             total       used       free     shared    buffers     cached
Mem:       4016200     212984    3803216          0          0     162424

A saving of 3.75MB. Quite nice for smaller machines. Note: we now have
to be careful of per cpu users that touch data for !possible cpus.

At this stage it might be worth making the NUMA and possible cpu
optimisations generic, but per cpu init is done so early we have to be
careful that all architectures have their possible map set up correctly.
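
The offset scheme described above can be modelled in userspace C. This is a minimal sketch, assuming a made-up two-field template and a cut-down "paca" array; all names prefixed with model_ are hypothetical, malloc stands in for the node-local boot allocator, and the integer/pointer round trip relies on the usual flat address space rather than strict ISO C guarantees. It shows the idea only: each CPU caches the distance between its private copy and the template, so an access is the template address plus one cached offset.

```c
/* Userspace model only; not the kernel code from the patch. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_NR_CPUS 4

/* Template copy of all "per cpu" variables (stands in for .data.percpu). */
struct percpu_template {
    long counter;
    int  flags;
};
static struct percpu_template percpu_template = { 100, 7 };

/* Hypothetical, cut-down paca: only the field the patch description adds. */
struct model_paca {
    intptr_t data_offset;   /* this CPU's copy minus the template address */
};
static struct model_paca model_paca[MODEL_NR_CPUS];

/* Give every modelled CPU a private copy and record its offset. */
static int model_setup_per_cpu_areas(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        char *ptr = malloc(sizeof(percpu_template));

        if (ptr == NULL)
            return -1;
        memcpy(ptr, &percpu_template, sizeof(percpu_template));
        model_paca[cpu].data_offset =
            (intptr_t)ptr - (intptr_t)&percpu_template;
    }
    return 0;
}

/* Template address plus cached offset; no lookup in a separate offset table. */
static long *model_per_cpu_counter(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return (long *)((intptr_t)&percpu_template.counter +
                    model_paca[cpu].data_offset);
}
```

After `model_setup_per_cpu_areas()` succeeds, writes through `model_per_cpu_counter(1)` change only CPU 1's copy; the template and the other copies keep their initial value.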
Signed-off-by: Anton Blanchard --- Index: linux-2.6/arch/powerpc/kernel/setup_64.c =================================================================== --- linux-2.6.orig/arch/powerpc/kernel/setup_64.c 2005-12-29 13:34:36.000000000 +1100 +++ linux-2.6/arch/powerpc/kernel/setup_64.c 2005-12-29 13:34:48.000000000 +1100 @@ -33,6 +33,7 @@ #include #include #include +#include #include #include #include @@ -851,3 +852,28 @@ if (ppc_md.cpu_die) ppc_md.cpu_die(); } + +#ifdef CONFIG_SMP +void __init setup_per_cpu_areas(void) +{ + int i; + unsigned long size; + char *ptr; + + /* Copy section for each CPU (we discard the original) */ + size = ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES); +#ifdef CONFIG_MODULES + if (size < PERCPU_ENOUGH_ROOM) + size = PERCPU_ENOUGH_ROOM; +#endif + + for_each_cpu(i) { + ptr = alloc_bootmem_node(NODE_DATA(cpu_to_node(i)), size); + if (!ptr) + panic("Cannot allocate cpu data for CPU %d\n", i); + + paca[i].data_offset = ptr - __per_cpu_start; + memcpy(ptr, __per_cpu_start, __per_cpu_end - __per_cpu_start); + } +} +#endif Index: linux-2.6/include/asm-powerpc/paca.h =================================================================== --- linux-2.6.orig/include/asm-powerpc/paca.h 2005-12-29 13:34:36.000000000 +1100 +++ linux-2.6/include/asm-powerpc/paca.h 2005-12-29 13:34:48.000000000 +1100 @@ -64,6 +64,7 @@ u64 stab_real; /* Absolute address of segment table */ u64 stab_addr; /* Virtual address of segment table */ void *emergency_sp; /* pointer to emergency stack */ + u64 data_offset; /* per cpu data offset */ s16 hw_cpu_id; /* Physical processor number */ u8 cpu_start; /* At startup, processor spins until */ /* this becomes non-zero. 
*/ Index: linux-2.6/include/asm-powerpc/percpu.h =================================================================== --- linux-2.6.orig/include/asm-powerpc/percpu.h 2005-12-29 13:34:36.000000000 +1100 +++ linux-2.6/include/asm-powerpc/percpu.h 2005-12-29 13:34:48.000000000 +1100 @@ -1 +1,59 @@ +#ifndef _ASM_POWERPC_PERCPU_H_ +#define _ASM_POWERPC_PERCPU_H_ +#ifdef __powerpc64__ +#include + +/* + * Same as asm-generic/percpu.h, except that we store the per cpu offset + * in the paca. Based on the x86-64 implementation. + */ + +#ifdef CONFIG_SMP + +#include + +#define __per_cpu_offset(cpu) (paca[cpu].data_offset) +#define __my_cpu_offset() get_paca()->data_offset + +/* Separate out the type, so (int[3], foo) works. */ +#define DEFINE_PER_CPU(type, name) \ + __attribute__((__section__(".data.percpu"))) __typeof__(type) per_cpu__##name + +/* var is in discarded region: offset to particular copy we want */ +#define per_cpu(var, cpu) (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset(cpu))) +#define __get_cpu_var(var) (*RELOC_HIDE(&per_cpu__##var, __my_cpu_offset())) + +/* A macro to avoid #include hell... */ +#define percpu_modcopy(pcpudst, src, size) \ +do { \ + unsigned int __i; \ + for (__i = 0; __i < NR_CPUS; __i++) \ + if (cpu_possible(__i)) \ + memcpy((pcpudst)+__per_cpu_offset(__i), \ + (src), (size)); \ +} while (0) + +extern void setup_per_cpu_areas(void); + +#else /* ! 
SMP */ + +static inline void setup_per_cpu_areas(void) { } + +#define DEFINE_PER_CPU(type, name) \ + __typeof__(type) per_cpu__##name + +#define per_cpu(var, cpu) (*((void)(cpu), &per_cpu__##var)) +#define __get_cpu_var(var) per_cpu__##var + +#endif /* SMP */ + +#define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name + +#define EXPORT_PER_CPU_SYMBOL(var) EXPORT_SYMBOL(per_cpu__##var) +#define EXPORT_PER_CPU_SYMBOL_GPL(var) EXPORT_SYMBOL_GPL(per_cpu__##var) + +#else #include +#endif + +#endif /* _ASM_POWERPC_PERCPU_H_ */ Index: linux-2.6/include/linux/compiler-gcc.h =================================================================== --- linux-2.6.orig/include/linux/compiler-gcc.h 2005-12-29 13:34:36.000000000 +1100 +++ linux-2.6/include/linux/compiler-gcc.h 2005-12-29 13:34:48.000000000 +1100 @@ -13,5 +13,5 @@ shouldn't recognize the original var, and make assumptions about it */ #define RELOC_HIDE(ptr, off) \ ({ unsigned long __ptr; \ - __asm__ ("" : "=g"(__ptr) : "0"(ptr)); \ + __asm__ ("" : "=r"(__ptr) : "0"(ptr)); \ (typeof(ptr)) (__ptr + (off)); }) From anton at samba.org Thu Dec 29 21:51:31 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 29 Dec 2005 21:51:31 +1100 Subject: [PATCH] ppc64: Fix oprofile when compiled as a module Message-ID: <20051229105131.GC18479@krispykreme> My recent changes to oprofile broke it when built as a module. Fix it by using an enum instead of a function pointer. This way we still retain the oprofile configuration in the cputable. 
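
The enum-instead-of-pointer idea can be sketched in standalone C. This is an illustrative model, not the patch: the struct layouts and the names prefixed with model_ or OP_ are hypothetical. The table holds only a plain enum value, so the code that owns the table needs no symbol from the code that owns the model descriptors; the consumer maps the value back to a descriptor with a switch.

```c
/* Standalone model; names are hypothetical, not the kernel's. */
#include <stddef.h>

/* Stand-ins for the model descriptors that would live in the module. */
struct op_model {
    const char *name;
};
static struct op_model model_rs64   = { "rs64" };
static struct op_model model_power4 = { "power4" };

/* Plain value stored in the CPU table; 0 means "no profiling support". */
enum oprofile_type {
    OP_INVALID = 0,
    OP_RS64    = 1,
    OP_POWER4  = 2
};

/* Table entry: a string and an enum, no pointer into the module. */
struct cpu_entry {
    const char *oprofile_cpu_type;
    enum oprofile_type oprofile_type;
};

/* Consumer side: map the enum back to a descriptor, or NULL if unknown. */
static struct op_model *select_model(const struct cpu_entry *cpu)
{
    if (cpu->oprofile_cpu_type == NULL)
        return NULL;

    switch (cpu->oprofile_type) {
    case OP_RS64:
        return &model_rs64;
    case OP_POWER4:
        return &model_power4;
    default:
        return NULL;    /* a real caller would report an error here */
    }
}
```

An entry with an unset type falls through to the default case because `OP_INVALID` is zero, so table entries that never mention profiling are rejected without extra code.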
Signed-off-by: Anton Blanchard --- Index: build/arch/powerpc/kernel/cputable.c =================================================================== --- build.orig/arch/powerpc/kernel/cputable.c 2005-12-29 20:50:47.000000000 +1100 +++ build/arch/powerpc/kernel/cputable.c 2005-12-29 20:53:49.000000000 +1100 @@ -78,10 +78,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power3, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/power3", - .oprofile_model = &op_model_rs64, -#endif + .oprofile_type = RS64, }, { /* Power3+ */ .pvr_mask = 0xffff0000, @@ -93,10 +91,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power3, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/power3", - .oprofile_model = &op_model_rs64, -#endif + .oprofile_type = RS64, }, { /* Northstar */ .pvr_mask = 0xffff0000, @@ -108,10 +104,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power3, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/rs64", - .oprofile_model = &op_model_rs64, -#endif + .oprofile_type = RS64, }, { /* Pulsar */ .pvr_mask = 0xffff0000, @@ -123,10 +117,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power3, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/rs64", - .oprofile_model = &op_model_rs64, -#endif + .oprofile_type = RS64, }, { /* I-star */ .pvr_mask = 0xffff0000, @@ -138,10 +130,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power3, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/rs64", - .oprofile_model = &op_model_rs64, -#endif + .oprofile_type = RS64, }, { /* S-star */ .pvr_mask = 0xffff0000, @@ -153,10 +143,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power3, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/rs64", - .oprofile_model = &op_model_rs64, -#endif + .oprofile_type = RS64, }, { /* Power4 */ .pvr_mask = 0xffff0000, @@ -168,10 +156,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power4, -#ifdef CONFIG_OPROFILE 
.oprofile_cpu_type = "ppc64/power4", - .oprofile_model = &op_model_rs64, -#endif + .oprofile_type = POWER4, }, { /* Power4+ */ .pvr_mask = 0xffff0000, @@ -183,10 +169,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_power4, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/power4", - .oprofile_model = &op_model_power4, -#endif + .oprofile_type = POWER4, }, { /* PPC970 */ .pvr_mask = 0xffff0000, @@ -199,10 +183,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_ppc970, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/970", - .oprofile_model = &op_model_power4, -#endif + .oprofile_type = POWER4, }, #endif /* CONFIG_PPC64 */ #if defined(CONFIG_PPC64) || defined(CONFIG_POWER4) @@ -221,10 +203,8 @@ .dcache_bsize = 128, .num_pmcs = 8, .cpu_setup = __setup_cpu_ppc970, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/970", - .oprofile_model = &op_model_power4, -#endif + .oprofile_type = POWER4, }, #endif /* defined(CONFIG_PPC64) || defined(CONFIG_POWER4) */ #ifdef CONFIG_PPC64 @@ -238,10 +218,8 @@ .icache_bsize = 128, .dcache_bsize = 128, .cpu_setup = __setup_cpu_ppc970, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/970", - .oprofile_model = &op_model_power4, -#endif + .oprofile_type = POWER4, }, { /* Power5 GR */ .pvr_mask = 0xffff0000, @@ -253,10 +231,8 @@ .dcache_bsize = 128, .num_pmcs = 6, .cpu_setup = __setup_cpu_power4, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/power5", - .oprofile_model = &op_model_power4, -#endif + .oprofile_type = POWER4, }, { /* Power5 GS */ .pvr_mask = 0xffff0000, @@ -268,10 +244,8 @@ .dcache_bsize = 128, .num_pmcs = 6, .cpu_setup = __setup_cpu_power4, -#ifdef CONFIG_OPROFILE .oprofile_cpu_type = "ppc64/power5", - .oprofile_model = &op_model_power4, -#endif + .oprofile_type = POWER4, }, { /* BE DD1.x */ .pvr_mask = 0xffff0000, Index: build/arch/powerpc/oprofile/common.c =================================================================== --- build.orig/arch/powerpc/oprofile/common.c 
2005-12-29 20:50:47.000000000 +1100 +++ build/arch/powerpc/oprofile/common.c 2005-12-29 20:53:49.000000000 +1100 @@ -167,9 +167,20 @@ ops->cpu_type = cpu_type; #else /* __powerpc64__ */ - if (!cur_cpu_spec->oprofile_model || !cur_cpu_spec->oprofile_cpu_type) + if (!cur_cpu_spec->oprofile_cpu_type) return -ENODEV; - model = cur_cpu_spec->oprofile_model; + + switch (cur_cpu_spec->oprofile_type) { + case RS64: + model = &op_model_rs64; + break; + case POWER4: + model = &op_model_power4; + break; + default: + return -ENODEV; + } + model->num_counters = cur_cpu_spec->num_pmcs; ops->cpu_type = cur_cpu_spec->oprofile_cpu_type; Index: build/include/asm-powerpc/cputable.h =================================================================== --- build.orig/include/asm-powerpc/cputable.h 2005-12-29 20:50:47.000000000 +1100 +++ build/include/asm-powerpc/cputable.h 2005-12-29 21:33:01.000000000 +1100 @@ -28,10 +28,15 @@ * via the mkdefs mechanism. */ struct cpu_spec; -struct op_powerpc_model; typedef void (*cpu_setup_t)(unsigned long offset, struct cpu_spec* spec); +enum powerpc_oprofile_type { + INVALID = 0, + RS64 = 1, + POWER4 = 2, +}; + struct cpu_spec { /* CPU is matched via (PVR & pvr_mask) == pvr_value */ unsigned int pvr_mask; @@ -57,7 +62,7 @@ char *oprofile_cpu_type; /* Processor specific oprofile operations */ - struct op_powerpc_model *oprofile_model; + enum powerpc_oprofile_type oprofile_type; }; extern struct cpu_spec *cur_cpu_spec; From anton at samba.org Thu Dec 29 22:09:11 2005 From: anton at samba.org (Anton Blanchard) Date: Thu, 29 Dec 2005 22:09:11 +1100 Subject: [PATCH] ppc64: POWER5+ oprofile support Message-ID: <20051229110911.GD18479@krispykreme> POWER5+ adds new PMU groups and as such needs to be treated differently by oprofile userspace. Change it to report itself as power5+. 
Signed-off-by: Anton Blanchard --- Index: build/arch/powerpc/kernel/cputable.c =================================================================== --- build.orig/arch/powerpc/kernel/cputable.c 2005-12-29 20:53:49.000000000 +1100 +++ build/arch/powerpc/kernel/cputable.c 2005-12-29 22:06:19.000000000 +1100 @@ -237,14 +237,14 @@ { /* Power5 GS */ .pvr_mask = 0xffff0000, .pvr_value = 0x003b0000, - .cpu_name = "POWER5 (gs)", + .cpu_name = "POWER5+ (gs)", .cpu_features = CPU_FTRS_POWER5, .cpu_user_features = COMMON_USER_POWER5_PLUS, .icache_bsize = 128, .dcache_bsize = 128, .num_pmcs = 6, .cpu_setup = __setup_cpu_power4, - .oprofile_cpu_type = "ppc64/power5", + .oprofile_cpu_type = "ppc64/power5+", .oprofile_type = POWER4, }, { /* BE DD1.x */ From amodra at bigpond.net.au Thu Dec 29 22:38:05 2005 From: amodra at bigpond.net.au (Alan Modra) Date: Thu, 29 Dec 2005 22:08:05 +1030 Subject: ARCH=powerpc 64bit 2.6.15-rc7 build error : ld: drivers/built-in.o section .init.text exceeds stub group size In-Reply-To: <20051228215825.GA28650@localhost.localdomain> References: <20051228215825.GA28650@localhost.localdomain> Message-ID: <20051229113805.GB655@bubble.grove.modra.org> On Wed, Dec 28, 2005 at 10:58:25PM +0100, Sven Luther wrote: > I never saw those before, so i wonder if this are harmless messages coming They are harmless. I made ld a little too paranoid on 2005-09-19, and cured the paranoia on 2005-11-18. If you update to current binutils the warnings should disappear, I think -- Alan Modra IBM OzLabs - Linux Technology Centre From ak at suse.de Thu Dec 29 22:54:34 2005 From: ak at suse.de (Andi Kleen) Date: Thu, 29 Dec 2005 12:54:34 +0100 Subject: [PATCH] ppc64: per_cpu data optimisations In-Reply-To: <20051229025604.GB18479@krispykreme> References: <20051229025604.GB18479@krispykreme> Message-ID: <20051229115434.GH11515@wotan.suse.de> > 5 loads for something that is supposed to be fast, pretty awful. 
One > reason for the large number of loads is that we have to synthesize 2 > 64bit constants (per_cpu__variable_name and __per_cpu_offset). It will probably not help you very much because most code seems to use int cpu = get_cpu(); per_cpu(...., cpu); put_cpu(); instead of the faster get_cpu(); __get_per_cpu(...); put_cpu(); With the cpu argument there is no fast way to take the local CPU shortcut :/ It would probably be a good idea to go through the fast paths and change them over to the second pattern. > Longer term we should be able to do even better than 3 loads. > If per_cpu__variable_name wasn't a 64bit constant and paca->data_offset > was in a register we could cut it down to one load. A suggestion from > Rusty is to use gcc's __thread extension here. In order to do this we > would need to free up r13 (the __thread register and where the paca > currently is). So far I've had a few unsuccessful attempts at doing that :) I tried it at some point on x86-64, but gave up because the ELF relocations for this are hopelessly user space specific hacks and it was just impossible to use them for anything else. Also you become very glibc/binutils specific and I think it would be a bad thing to reach glibc state in the kernel where you always need the latest toolkit to build it. > > At this stage it might be worth making the NUMA and possible cpu > optimisations generic, but per cpu init is done so early we have to be > careful that all architectures have their possible map set up correctly. It's quite complicated to do it anyway - I'm just going through it with Kiran. One problem is that sched_init() accesses per cpu variables really early, so you have ugly ordering problems. That is why Kiran's patch has to bootstrap it with a "boot time" per cpu area and then relocate it later. Quite ugly.
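To make the difference between the two access patterns discussed above more concrete, here is a rough userspace model in C. It is only a sketch and not kernel code: `per_cpu_offset`, `local_offset`, `counter_of()` and `local_counter()` are hypothetical names, with `local_offset` standing in for a value such as paca->data_offset that is assumed to be already at hand.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* One copy of every "per-CPU variable" exists for each CPU. */
struct percpu_area {
    long counter;
};

static struct percpu_area areas[NR_CPUS];

/* Hypothetical stand-in for an offset table: byte offset of each CPU's area. */
static size_t per_cpu_offset[NR_CPUS];

/* Hypothetical stand-in for an offset that is already held for the local CPU. */
static size_t local_offset;

/* Fill the offset table and remember the offset of the CPU we "run" on. */
static void setup(int this_cpu)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        per_cpu_offset[cpu] = (size_t)cpu * sizeof(struct percpu_area);
    local_offset = per_cpu_offset[this_cpu];
}

/* Pattern 1: explicit cpu argument, so the offset table has to be read. */
static long *counter_of(int cpu)
{
    return (long *)((char *)&areas[0].counter + per_cpu_offset[cpu]);
}

/* Pattern 2: local CPU only, no table lookup is needed. */
static long *local_counter(void)
{
    return (long *)((char *)&areas[0].counter + local_offset);
}
```

Both helpers reach the same storage when the cpu argument names the local CPU; the second simply skips the table lookup, which is the saving the message refers to.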
-Andi From sven.luther at wanadoo.fr Thu Dec 29 23:06:35 2005 From: sven.luther at wanadoo.fr (Sven Luther) Date: Thu, 29 Dec 2005 13:06:35 +0100 Subject: ARCH=powerpc 64bit 2.6.15-rc7 build error : ld: drivers/built-in.o section .init.text exceeds stub group size In-Reply-To: <20051229113805.GB655@bubble.grove.modra.org> References: <20051228215825.GA28650@localhost.localdomain> <20051229113805.GB655@bubble.grove.modra.org> Message-ID: <20051229120635.GB24600@localhost.localdomain> On Thu, Dec 29, 2005 at 10:08:05PM +1030, Alan Modra wrote: > On Wed, Dec 28, 2005 at 10:58:25PM +0100, Sven Luther wrote: > > I never saw those before, so i wonder if this are harmless messages coming > > They are harmless. I made ld a little too paranoid on 2005-09-19, and > cured the paranoia on 2005-11-18. If you update to current binutils the > warnings should disappear, I think Cool, this had me worried, thanks for the info. Friendly, Sven Luther From mostrows at watson.ibm.com Fri Dec 30 03:49:55 2005 From: mostrows at watson.ibm.com (Michal Ostrowski) Date: Thu, 29 Dec 2005 11:49:55 -0500 Subject: [PATCH] __KERNEL__ guard for linux/sched.h Message-ID: <1135874995.11717.396.camel@brick.watson.ibm.com> Including linux/sched.h must be done within #ifdef __KERNEL__. Failure to do so results in inability to build user-space apps with these headers; failures occur in including , which requires . 
Signed-off-by: Michal Ostrowski --- include/asm-powerpc/elf.h | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) 197d7542f54d97ff77cbfb2714b7d954cb0615aa diff --git a/include/asm-powerpc/elf.h b/include/asm-powerpc/elf.h index 3dcd65e..adb5cec 100644 --- a/include/asm-powerpc/elf.h +++ b/include/asm-powerpc/elf.h @@ -1,7 +1,6 @@ #ifndef _ASM_POWERPC_ELF_H #define _ASM_POWERPC_ELF_H -#include <linux/sched.h> /* for task_struct */ #include #include #include @@ -175,6 +174,7 @@ typedef elf_vrreg_t elf_vrregset_t32[ELF #define ELF_ET_DYN_BASE (0x08000000) #ifdef __KERNEL__ +#include <linux/sched.h> /* for task_struct */ /* Common routine for both 32-bit and 64-bit processes */ static inline void ppc_elf_core_copy_regs(elf_gregset_t elf_regs, -- 0.99.9.GIT From linas at austin.ibm.com Fri Dec 30 06:14:42 2005 From: linas at austin.ibm.com (linas) Date: Thu, 29 Dec 2005 13:14:42 -0600 Subject: [PATCH] Small fix in eeh definitions when CONFIG_EEH not enabled In-Reply-To: <43B1FF55.1050703@us.ibm.com> References: <43B1FF55.1050703@us.ibm.com> Message-ID: <20051229191441.GX10037@austin.ibm.com> Haren, technically, you should place a Signed-off-by: on this patch. On Tue, Dec 27, 2005 at 06:58:29PM -0800, Haren Myneni was heard to remark: > > Undefined symbols (eeh_add_device_tree_early and eeh_remove_bus_device) > when EEH is not enabled. This small patch will fix this.
> > Thanks > Haren Acked-by: Linas Vepstas > --- ppc64git/include/asm-powerpc/eeh.h.orig 2005-12-31 22:36:06.000000000 -0800 > +++ ppc64git/include/asm-powerpc/eeh.h 2005-12-31 22:39:28.000000000 -0800 > @@ -120,6 +120,9 @@ static inline void eeh_add_device_late(s > > static inline void eeh_remove_device(struct pci_dev *dev) { } > > +static inline void eeh_add_device_tree_early(struct device_node *dn) { } > + > +static inline void eeh_remove_bus_device(struct pci_dev *dev) { } > #define EEH_POSSIBLE_ERROR(val, type) (0) > #define EEH_IO_ERROR_VALUE(size) (-1UL) > #endif /* CONFIG_EEH */ From galak at kernel.crashing.org Sat Dec 31 02:49:18 2005 From: galak at kernel.crashing.org (Kumar Gala) Date: Fri, 30 Dec 2005 09:49:18 -0600 Subject: [PATCH] ppc64: Fix oprofile when compiled as a module In-Reply-To: <20051229105131.GC18479@krispykreme> References: <20051229105131.GC18479@krispykreme> Message-ID: <53736FD9-AE1B-48EE-A7A8-CAB2A8F86326@kernel.crashing.org> Anton, Come on, if you are going to "fix" things don't break ppc32 :) Anyway, can you extend your patch to cover the op_model_7450 & op_model_fsl_booke which are now in the powerpc.git tree. - kumar On Dec 29, 2005, at 4:51 AM, Anton Blanchard wrote: > > My recent changes to oprofile broke it when built as a module. Fix > it by > using an enum instead of a function pointer. This way we still retain > the oprofile configuration in the cputable.
> > Signed-off-by: Anton Blanchard > --- > > Index: build/arch/powerpc/kernel/cputable.c > =================================================================== > --- build.orig/arch/powerpc/kernel/cputable.c 2005-12-29 > 20:50:47.000000000 +1100 > +++ build/arch/powerpc/kernel/cputable.c 2005-12-29 > 20:53:49.000000000 +1100 > @@ -78,10 +78,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power3, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/power3", > - .oprofile_model = &op_model_rs64, > -#endif > + .oprofile_type = RS64, > }, > { /* Power3+ */ > .pvr_mask = 0xffff0000, > @@ -93,10 +91,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power3, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/power3", > - .oprofile_model = &op_model_rs64, > -#endif > + .oprofile_type = RS64, > }, > { /* Northstar */ > .pvr_mask = 0xffff0000, > @@ -108,10 +104,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power3, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/rs64", > - .oprofile_model = &op_model_rs64, > -#endif > + .oprofile_type = RS64, > }, > { /* Pulsar */ > .pvr_mask = 0xffff0000, > @@ -123,10 +117,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power3, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/rs64", > - .oprofile_model = &op_model_rs64, > -#endif > + .oprofile_type = RS64, > }, > { /* I-star */ > .pvr_mask = 0xffff0000, > @@ -138,10 +130,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power3, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/rs64", > - .oprofile_model = &op_model_rs64, > -#endif > + .oprofile_type = RS64, > }, > { /* S-star */ > .pvr_mask = 0xffff0000, > @@ -153,10 +143,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power3, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/rs64", > - .oprofile_model = &op_model_rs64, > -#endif > + .oprofile_type = RS64, > }, > { /* 
Power4 */ > .pvr_mask = 0xffff0000, > @@ -168,10 +156,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power4, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/power4", > - .oprofile_model = &op_model_rs64, > -#endif > + .oprofile_type = POWER4, > }, > { /* Power4+ */ > .pvr_mask = 0xffff0000, > @@ -183,10 +169,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_power4, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/power4", > - .oprofile_model = &op_model_power4, > -#endif > + .oprofile_type = POWER4, > }, > { /* PPC970 */ > .pvr_mask = 0xffff0000, > @@ -199,10 +183,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_ppc970, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/970", > - .oprofile_model = &op_model_power4, > -#endif > + .oprofile_type = POWER4, > }, > #endif /* CONFIG_PPC64 */ > #if defined(CONFIG_PPC64) || defined(CONFIG_POWER4) > @@ -221,10 +203,8 @@ > .dcache_bsize = 128, > .num_pmcs = 8, > .cpu_setup = __setup_cpu_ppc970, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/970", > - .oprofile_model = &op_model_power4, > -#endif > + .oprofile_type = POWER4, > }, > #endif /* defined(CONFIG_PPC64) || defined(CONFIG_POWER4) */ > #ifdef CONFIG_PPC64 > @@ -238,10 +218,8 @@ > .icache_bsize = 128, > .dcache_bsize = 128, > .cpu_setup = __setup_cpu_ppc970, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/970", > - .oprofile_model = &op_model_power4, > -#endif > + .oprofile_type = POWER4, > }, > { /* Power5 GR */ > .pvr_mask = 0xffff0000, > @@ -253,10 +231,8 @@ > .dcache_bsize = 128, > .num_pmcs = 6, > .cpu_setup = __setup_cpu_power4, > -#ifdef CONFIG_OPROFILE > .oprofile_cpu_type = "ppc64/power5", > - .oprofile_model = &op_model_power4, > -#endif > + .oprofile_type = POWER4, > }, > { /* Power5 GS */ > .pvr_mask = 0xffff0000, > @@ -268,10 +244,8 @@ > .dcache_bsize = 128, > .num_pmcs = 6, > .cpu_setup = __setup_cpu_power4, > -#ifdef CONFIG_OPROFILE > 
.oprofile_cpu_type = "ppc64/power5", > - .oprofile_model = &op_model_power4, > -#endif > + .oprofile_type = POWER4, > }, > { /* BE DD1.x */ > .pvr_mask = 0xffff0000, > Index: build/arch/powerpc/oprofile/common.c > =================================================================== > --- build.orig/arch/powerpc/oprofile/common.c 2005-12-29 > 20:50:47.000000000 +1100 > +++ build/arch/powerpc/oprofile/common.c 2005-12-29 > 20:53:49.000000000 +1100 > @@ -167,9 +167,20 @@ > > ops->cpu_type = cpu_type; > #else /* __powerpc64__ */ > - if (!cur_cpu_spec->oprofile_model || !cur_cpu_spec- > >oprofile_cpu_type) > + if (!cur_cpu_spec->oprofile_cpu_type) > return -ENODEV; > - model = cur_cpu_spec->oprofile_model; > + > + switch (cur_cpu_spec->oprofile_type) { > + case RS64: > + model = &op_model_rs64; > + break; > + case POWER4: > + model = &op_model_power4; > + break; > + default: > + return -ENODEV; > + } > + > model->num_counters = cur_cpu_spec->num_pmcs; > > ops->cpu_type = cur_cpu_spec->oprofile_cpu_type; > Index: build/include/asm-powerpc/cputable.h > =================================================================== > --- build.orig/include/asm-powerpc/cputable.h 2005-12-29 > 20:50:47.000000000 +1100 > +++ build/include/asm-powerpc/cputable.h 2005-12-29 > 21:33:01.000000000 +1100 > @@ -28,10 +28,15 @@ > * via the mkdefs mechanism. 
> */ > struct cpu_spec; > -struct op_powerpc_model; > > typedef void (*cpu_setup_t)(unsigned long offset, struct cpu_spec* > spec); > > +enum powerpc_oprofile_type { > + INVALID = 0, > + RS64 = 1, > + POWER4 = 2, > +}; > + > struct cpu_spec { > /* CPU is matched via (PVR & pvr_mask) == pvr_value */ > unsigned int pvr_mask; > @@ -57,7 +62,7 @@ > char *oprofile_cpu_type; > > /* Processor specific oprofile operations */ > - struct op_powerpc_model *oprofile_model; > + enum powerpc_oprofile_type oprofile_type; > }; > > extern struct cpu_spec *cur_cpu_spec; From drepper at redhat.com Fri Dec 30 18:32:41 2005 From: drepper at redhat.com (Ulrich Drepper) Date: Thu, 29 Dec 2005 23:32:41 -0800 Subject: [PATCH] vDSO for ppc/ppc64 submission In-Reply-To: References: Message-ID: <43B4E299.1040607@redhat.com> I've added the patch but had to change quite a lot. It was less work to do it myself than to explain it but please compare the new code with your patch to see what should have been done. A few notes I took: - copyright years were not updated. Roland's upd-copyr.el does this automatically. - still many formatting errors. Including the changelog. - lots of whitespace at end of lines. In one case even after the backslash which is meant to continue a macro definition on the next line. Use whitespace.el in emacs or equivalent tools for your editor - in dl-start.c, there are multiple ENTRY uses. All but the first are not real entry symbols. The ppc ENTRY macro doesn't use cfi_startproc yet but it should and then this is a real problem. - if you copy a generic file and modify it, then at least change the "Generic" in the first line - export symbols or better one table?
In any case, these are Linux specific definitions and therefore the export must be in a Linux specific Versions file - the macros in sysdep.h did the error checking wrong. You cannot just compare the return value. Didn't you look at the INTERNAL_SYSCALL_ERROR_P macro etc? - when using local labels, some of the macros can be optimized - I've renamed some of the sysdep.h macros and removed others which are not needed. There should be two kinds: INTERNAL_* and INLINE_*. The normal and vdso versions should have the same semantics. It's fine to have the INTERNAL_VSYSCALL_NO_SYSCALL_FALLBACK since it's useful. But again, the semantics must match the other INTERNAL_* macros. -- Ulrich Drepper, Red Hat, Inc., 444 Castro St, Mountain View, CA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 251 bytes Desc: OpenPGP digital signature Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051229/47e2bf54/attachment.pgp From anton at samba.org Sat Dec 31 06:25:38 2005 From: anton at samba.org (Anton Blanchard) Date: Sat, 31 Dec 2005 06:25:38 +1100 Subject: [PATCH] ppc64: Fix oprofile when compiled as a module In-Reply-To: <53736FD9-AE1B-48EE-A7A8-CAB2A8F86326@kernel.crashing.org> References: <20051229105131.GC18479@krispykreme> <53736FD9-AE1B-48EE-A7A8-CAB2A8F86326@kernel.crashing.org> Message-ID: <20051230192538.GA26924@krispykreme> > Come on, if you are going to "fix" things don't break ppc32 :) That brings up an interesting point with the new merged world. How many combinations should we be testing? If random changes are going to break either 32bit ppc or powerpc targets, I can see a lot of compiling in our future. > Anyway, can you extend your patch to cover the op_model_7450 & > op_model_fsl_booke which are now in the powerpc.git tree. Someone asked me for a 2.6.15 based patch, but yeah it should go on top of the -git tree too.
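The extension requested above could look roughly like the following standalone sketch. Only RS64, POWER4 and the op_model_* names come from the thread; the enum values G4 and BOOKE, the cut-down struct op_powerpc_model, and the helper select_model() are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for the kernel's struct op_powerpc_model. */
struct op_powerpc_model {
    const char *name;
    int num_counters;
};

enum powerpc_oprofile_type {
    INVALID = 0,
    RS64    = 1,
    POWER4  = 2,
    G4      = 3,    /* assumed name for the 7450 family */
    BOOKE   = 4,    /* assumed name for Freescale Book-E parts */
};

/* Placeholder model instances; the real ones carry function pointers. */
static struct op_powerpc_model op_model_rs64      = { "rs64", 0 };
static struct op_powerpc_model op_model_power4    = { "power4", 0 };
static struct op_powerpc_model op_model_7450      = { "7450", 0 };
static struct op_powerpc_model op_model_fsl_booke = { "fsl_booke", 0 };

/*
 * Map the enum stored in the cpu table to a model. The table itself
 * never has to name a model symbol, only a plain integer constant.
 */
static struct op_powerpc_model *select_model(enum powerpc_oprofile_type type)
{
    switch (type) {
    case RS64:
        return &op_model_rs64;
    case POWER4:
        return &op_model_power4;
    case G4:
        return &op_model_7450;
    case BOOKE:
        return &op_model_fsl_booke;
    default:
        return NULL;    /* the caller would return -ENODEV here */
    }
}
```

In a real tree the 32-bit and 64-bit models are built for different configurations, so the extra cases would most likely need #ifdef guards around them.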
Anton From tom_gall at vnet.ibm.com Sat Dec 31 09:23:19 2005 From: tom_gall at vnet.ibm.com (Tom Gall) Date: Fri, 30 Dec 2005 16:23:19 -0600 (CST) Subject: [PATCH] vDSO for ppc/ppc64 submission In-Reply-To: <43B4E299.1040607@redhat.com> References: <43B4E299.1040607@redhat.com> Message-ID: On Thu, 29 Dec 2005, Ulrich Drepper wrote: > I've added the patch but had to change quite a lot. It was less work to > do it myself than to explain it but please compare the new code with > your patch to see what should have been done. A few notes I took: Thanks. I'll have a look this weekend. Still in the thick of the holiday so time is unfortunately short today. What is currently in CVS head doesn't even build for me, so I'd imagine there's some cleanup ahead. :-/ > > - copyright years were not updated. Roland's upd-copyr.el does this > automatically. I'd be happy to use a version for vim ;-) > - still many formatting errors. Including the changelog. Odd, is indent not implemented to the full set of coding standards? Which brings me to the gnits comment you had made a few weeks back. I'm happy to follow coding standards but they ought to be documented. The only gnits standard I could find online (at least via Google) is at: http://www.amath.washington.edu/~lf/tutorials/autoconf/gnits/gnits.html Though largely, when it comes to source standards, the document just points to the GNU standards document, which is a bit incomplete in my opinion. If gnits is the standard we ask people to aim for then shouldn't that be pointed to at say http://www.gnu.org/software/libc/resources.html ? > - lots of whitespace at end of lines. In one case even after the > backslash which is meant to continue a macro definition on the next > line. Use whitespace.el in emacs or equivalent tools for your editor Very odd, vim and pine don't have a history of adding spaces. Looking back at the patch as it left here, I see just one instance of a space after a backslash.
Regards, Tom From drepper at redhat.com Sat Dec 31 09:42:42 2005 From: drepper at redhat.com (Ulrich Drepper) Date: Fri, 30 Dec 2005 14:42:42 -0800 Subject: [PATCH] vDSO for ppc/ppc64 submission In-Reply-To: References: <43B4E299.1040607@redhat.com> Message-ID: <43B5B7E2.2080609@redhat.com> Tom Gall wrote: > What is currently in CVS head doesn't even build for me, so I'd imagine > there's some cleanup ahead. :-/ The trunk builds fine. Either you checked it out while I was checking in some patches or something is wrong in your environment. >> - still many formatting errors. Including the changelog. > > Odd, is indent not implemented to the full set of coding standards? indent doesn't look into macros AFAIK. -- Ulrich Drepper, Red Hat, Inc., 444 Castro St, Mountain View, CA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 251 bytes Desc: OpenPGP digital signature Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20051230/c29ab877/attachment.pgp