From tinglett at vnet.ibm.com Sat Feb 1 09:47:16 2003
From: tinglett at vnet.ibm.com (Todd Inglett)
Date: Fri, 31 Jan 2003 16:47:16 -0600
Subject: machine check exception
Message-ID: <3E3AFCF4.3010108@vnet.ibm.com>

Randy pointed out that we have a DI exposure on pre-power4 hardware in
the machine check handler. The exception is not synchronous so the
current handler may attempt to send a SIGBUS to a user process when the
kernel was actually at fault. This is bad.

Another point is that it is actually trivial to see via the FWNMI
handler if the machine check was recovered by firmware so we should take
advantage of that.

Here's the function after I reorganized it. I can post a patch later,
but diff doesn't handle the reorg of the code very well :(. Note that
we don't attempt to recover unless we have an fwnmi handler which AFAIK
is always present on power4 and beyond.

-todd

WARNING: this code is untested...please review :)

void MachineCheckException(struct pt_regs *regs)
{
    struct rtas_error_log *errhdr;
    int recoverable;
    siginfo_t info;

    if (fwnmi_active) {
        errhdr = FWNMI_get_errinfo(regs);
        recoverable = errhdr ?
            errhdr->disposition == DISP_FULLY_RECOVERED : 0;
        FWNMI_release_errinfo();
        if (recoverable)
            return;    /* easy recovery */
        else if (regs->msr & MSR_RI) {
            if (user_mode(regs)) {
                /* Only need to kill user process */
                info.si_signo = SIGBUS;
                info.si_errno = 0;
                info.si_code = BUS_ADRERR;
                info.si_addr = (void *)regs->nip;
                /* signal number must match info.si_signo */
                _exception(SIGBUS, &info, regs);
                return;
            } else if (power4_handle_mce(regs)) {
                return;
            }
        }
    }

    if (debugger_fault_handler) {
        debugger_fault_handler(regs);
        return;
    }
    if (debugger)
        debugger(regs);
    console_verbose();
    spin_lock_irq(&die_lock);
    bust_spinlocks(1);
    printk("Machine check in kernel mode.\n");
    printk("Caused by (from SRR1=%lx): ", regs->msr);
    show_regs(regs);
    bust_spinlocks(0);
    spin_unlock_irq(&die_lock);
    panic("Unrecoverable Machine Check");
}

** Sent via the linuxppc64-dev mail list.
See http://lists.linuxppc.org/

From anton at samba.org Mon Feb 3 21:33:45 2003
From: anton at samba.org (Anton Blanchard)
Date: Mon, 3 Feb 2003 21:33:45 +1100
Subject: Hotplug CPU
In-Reply-To: <200301292127.h0TLR8h34442@fleming.austin.ibm.com>
References: <200301292127.h0TLR8h34442@fleming.austin.ibm.com>
Message-ID: <20030203103344.GC23130@krispykreme>

Hi Matt,

> I downloaded the following patches:
>
> cpucontrols.patch, based on 2.5.46
> hotcpu-cpudown.patch, based on 2.5.44
> hotcpu-cpudown-ppc64.patch, based on 2.5.33
>
> These were called "Driverfs CPU controls", "Hotplug CPU Remove Generic
> Code", and "Hotplug CPU Remove for PPC64", respectively.
>
> I am interested in finding out the following:
>
> 1) Are there more recent patches?

Yes, they are the most recent.

> 2) Did I miss a patch somewhere? I do not have the /proc/sys/cpu
> interface on my machine when I patch, fix the parts that don't
> patch, compile and boot the kernel.

My understanding is that things will end up in the sysfs filesystem.
There is already some support for cpu features there (eg cpu
frequency).

> 3) (This one I'll look up soon anyways) Have any of these patches been
> submitted and accepted into the main ppc64 tree?

Not yet.

> 4) Is there any work or code so far for hotplug CPU add?

I think there has been some work on this. I was planning on doing the
ppc64 bits but I've been given several other high priority things to do.

Anton

From fleming at austin.ibm.com Tue Feb 4 03:41:26 2003
From: fleming at austin.ibm.com (fleming at austin.ibm.com)
Date: Mon, 3 Feb 2003 10:41:26 -0600 (CST)
Subject: Hotplug CPU
In-Reply-To: <20030203103344.GC23130@krispykreme> from "Anton Blanchard" at Feb 03, 2003 09:33:45 PM
Message-ID: <200302031641.h13GfRF24584@fleming.austin.ibm.com>

> My understanding is that things will end up in the sysfs filesystem.
> There is already some support for cpu features there (eg cpu
> frequency).

Okay... I have discovered the sysfs interface. I hacked up a boot-time
routine to call register_cpu() for each boot-time cpu. I then have a
/sys/devices/sys/cpu? directory for each CPU in my machine. The name file
correctly has the name as printed by register_cpu. There are two other
files, online and power. 'online' appears empty when I cat the file, and
'power' has just the string "0\n" in it. The show_online and store_online
functions registered with the device do not seem to be called, as the
info that show_online produces is not showing up.

I've read the documentation in sysfs.txt several times and I still don't
understand where the 'online' and 'power' files are coming from, nor
which sysfs file should be calling out to show_online() when I read it.

Lastly and mostly unrelated, at what point during boot will printk() be
successful? It's getting hard to debug when I can't even use printk's
to track execution...

Thanks, and TIA,
Matt
--
fleming at austin.ibm.com

From tinglett at vnet.ibm.com Wed Feb 5 01:50:01 2003
From: tinglett at vnet.ibm.com (Todd Inglett)
Date: Tue, 04 Feb 2003 08:50:01 -0600
Subject: machine check exception
References: <3E3AFCF4.3010108@vnet.ibm.com>
Message-ID: <3E3FD319.1010108@vnet.ibm.com>

Ok, with input from Dave Altobelli I rearranged the code slightly
(sorry, this version is relative to 2.4). He suggested we not kill off
pids 0 and 1 which is a very good idea. Note that we will panic instead
of signaling 0 or 1 which isn't more productive...but perhaps more
obvious as to what happened. He also suggested that we only handle
uncorrectable ECC errors.

Here is the new code. Again, the patch is not terribly readable but I
can send it to anyone who wants to see it. I removed
power4_handle_mce() from the 2.5 code and replaced it with
recover_mce(). The panic path in 2.5 will be slightly different.

-todd

/* See if we can recover from a machine check exception.
 * This is only called on power4 (or above) and only via
 * the Firmware Non-Maskable Interrupts (fwnmi) handler
 * which provides the error analysis for us.
 *
 * Return 1 if corrected (or delivered a signal).
 * Return 0 if there is nothing we can do.
 */
static int recover_mce(struct pt_regs *regs, struct rtas_error_log err)
{
    siginfo_t info;

    if (err.disposition == DISP_FULLY_RECOVERED) {
        /* Platform corrected itself */
        return 1;
    } else if ((regs->msr & MSR_RI) &&
               user_mode(regs) &&
               err.severity == SEVERITY_ERROR_SYNC &&
               err.disposition == DISP_NOT_RECOVERED &&
               err.target == TARGET_MEMORY &&
               err.type == TYPE_ECC_UNCORR &&
               !(current->pid == 0 || current->pid == 1)) {
        /* Kill off a user process with an ECC error */
        info.si_signo = SIGBUS;
        info.si_errno = 0;
        info.si_code = BUS_ECCERR;
        info.si_addr = (void *)regs->nip;
        printk(KERN_ERR "MCE: uncorrectable ecc error for pid %d\n",
               current->pid);
        _exception(SIGBUS, &info, regs);
        return 1;
    }
    return 0;
}

/* Handle a machine check.
 *
 * Note that on Power 4 and beyond Firmware Non-Maskable Interrupts (fwnmi)
 * should be present. If so the handler which called us tells us if the
 * error was recovered (never true if RI=0).
 *
 * On hardware prior to Power 4 these exceptions were asynchronous which
 * means we can't tell exactly where it occurred and so we can't recover.
 *
 * Note that the debugger should test RI=0 and warn the user that system
 * state has been corrupted.
 */
void MachineCheckException(struct pt_regs *regs)
{
    struct rtas_error_log err, *errp;

    if (fwnmi_active) {
        errp = FWNMI_get_errinfo(regs);
        if (errp)
            err = *errp;
        FWNMI_release_errinfo();    /* frees errp */
        if (errp && recover_mce(regs, err))
            return;
    }

    if (debugger_fault_handler) {
        debugger_fault_handler(regs);
        return;
    }
    if (debugger)
        debugger(regs);
    printk("Machine check in kernel mode.\n");
    printk("Caused by (from SRR1=%lx): ", regs->msr);
    show_regs(regs);
#if defined(CONFIG_XMON) || defined(CONFIG_KGDB)
    debugger(regs);
#endif
#ifdef CONFIG_KDB
    if (kdb(KDB_REASON_FAULT, 0, regs))
        return;
#endif
    print_backtrace((unsigned long *)regs->gpr[1]);
    panic("Unrecoverable machine check");
}

From anton at samba.org Tue Feb 11 23:08:38 2003
From: anton at samba.org (Anton Blanchard)
Date: Tue, 11 Feb 2003 23:08:38 +1100
Subject: machine check exception
In-Reply-To: <3E3FD319.1010108@vnet.ibm.com>
References: <3E3AFCF4.3010108@vnet.ibm.com> <3E3FD319.1010108@vnet.ibm.com>
Message-ID: <20030211120838.GA20034@krispykreme>

Hi Todd,

> Ok, with input from Dave Altobelli I rearranged the code slightly
> (sorry, this version is relative to 2.4). He suggested we not kill off
> pids 0 and 1 which is a very good idea. Note that we will panic instead
> of signaling 0 or 1 which isn't more productive...but perhaps more
> obvious as to what happened. He also suggested that we only handle
> uncorrectable ECC errors.
>
> Here is the new code. Again, the patch is not terribly readable but I
> can send it to anyone who wants to see it. I removed
> power4_handle_mce() from the 2.5 code and replaced it with
> recover_mce(). The panic path in 2.5 will be slightly different.

Looks good! I merged it into the 2.5 tree. In fact I'm chasing something
that results in a machine check then a complete lockup (on a p690
running SMP).

Paul just reminded me that we should probably be doing rtas
check-exception when we get a machine check.
I assume this isn't required on a machine with FWNMI since it does the
work for us, right?

Anton

From dalto at us.ibm.com Wed Feb 12 05:47:36 2003
From: dalto at us.ibm.com (David K Altobelli)
Date: Tue, 11 Feb 2003 12:47:36 -0600
Subject: machine check exception
Message-ID:

Anton,

If a machine has FWNMI, signified by the presence of the
"ibm,nmi-register" token, it will not have the "check-exception" token.
LPAR machines are required to support FWNMI.

> Paul just reminded me that we should probably be doing rtas check-exception
> when we get a machine check. I assume this isn't required on a machine
> with FWNMI since it does the work for us, right?

Dave

From dalto at us.ibm.com Wed Feb 12 06:01:13 2003
From: dalto at us.ibm.com (David K Altobelli)
Date: Tue, 11 Feb 2003 13:01:13 -0600
Subject: machine check exception
Message-ID:

Also, we were planning on adding check-exception logic to the machine
check handler to retrieve error information from RTAS (for non-FWNMI
machines). This has a dependency on being able to log fatal errors to
nvram and recover on next reboot.

> Paul just reminded me that we should probably be doing
> rtas check-exception when we get a machine check.
> I assume this isn't required on a machine with FWNMI
> since it does the work for us, right?

From anton at samba.org Wed Feb 12 10:26:06 2003
From: anton at samba.org (Anton Blanchard)
Date: Wed, 12 Feb 2003 10:26:06 +1100
Subject: machine check exception
In-Reply-To: <3E493CDF.40800@vnet.ibm.com>
References: <3E3AFCF4.3010108@vnet.ibm.com> <3E3FD319.1010108@vnet.ibm.com> <20030211120838.GA20034@krispykreme> <3E493CDF.40800@vnet.ibm.com>
Message-ID: <20030211232606.GA24440@krispykreme>

Hi,

> Thanks...these things are next to impossible to test.
> Nice you have a test case (I think :)).

Not so nice from my perspective :)

> BTW, I pulled your bk today (2.5.60) and noticed that I get a
> decrementer before time_init. Some init order change? I haven't had
> time to look just yet. Easy enough to hack around at the moment....

Yep, known bug. Hopefully Linus merged the fix.

--

Date: Tue, 11 Feb 2003 01:19:05 -0800
From: Andrew Morton
To: Linus Torvalds, Anton Blanchard
Subject: sched_init enables interrupts too early

wake_up_forked_process() unconditionally enables interrupts. It is
called from sched_init(). Enabling interrupts that early makes Anton's
ppc64 machine lock up.

I tried going back to just wake_up_process() but the kernel didn't start.

diff -puN kernel/sched.c~sched_init-fix kernel/sched.c
--- 25/kernel/sched.c~sched_init-fix 2003-02-11 01:09:51.000000000 -0800
+++ 25-akpm/kernel/sched.c 2003-02-11 01:12:44.000000000 -0800
@@ -519,7 +519,8 @@ int wake_up_state(task_t *p, unsigned in
  */
 void wake_up_forked_process(task_t * p)
 {
-    runqueue_t *rq = this_rq_lock();
+    unsigned long flags;
+    runqueue_t *rq = task_rq_lock(current, &flags);
 
     p->state = TASK_RUNNING;
     if (!rt_task(p)) {
@@ -535,7 +536,7 @@ void wake_up_forked_process(task_t * p)
     set_task_cpu(p, smp_processor_id());
     activate_task(p, rq);
 
-    rq_unlock(rq);
+    task_rq_unlock(rq, &flags);
 }
 
 /*

From bharring at us.ibm.com Thu Feb 13 05:13:32 2003
From: bharring at us.ibm.com (Bradley Harrington)
Date: Wed, 12 Feb 2003 12:13:32 -0600
Subject: powerpc64-linux-g++ host types
Message-ID:

I'm not sure if this is a user question or a dev question, but I'll try
here first.

We have gotten ELF64 gcc working wonderfully on our build machines, which
exclusively run AIX (4.3 & 5.1). However, some people are clamoring for
C++ now. Has anyone successfully generated an ELF64 g++? The barrier
seems to be the requirement of glibc, which will not configure on AIX.
Any help at all would be greatly appreciated.

Thanks,
Brad

Bradley R. Harrington
bharring at us.ibm.com
pSeries Firmware Development
Tel (512) 838-6625 T/L: 678-6625

From kaena at us.ibm.com Thu Feb 13 06:36:57 2003
From: kaena at us.ibm.com (Kaena Freitas)
Date: Wed, 12 Feb 2003 13:36:57 -0600
Subject: powerpc64-linux-g++ host types
Message-ID:

Hi Bradley,

I think you might be asking the wrong development team. I assume you are
using the GNU tools which run on AIX and are part of the AIX Toolbox for
Linux Applications. I copied David Clissold on this note. David is the
Toolbox owner and can answer your question.

Aloha, Kaena

D. Kaena Freitas
Linux on pSeries Development
kaena at us.ibm.com
Office: (512)838-2676 cell: (512)762-3884

Bradley Harrington/Austin/IBM at IBMUS
Sent by: owner-linuxppc64-dev at lists.linuxppc.org
02/12/2003 12:13 PM
To: linuxppc64-dev at lists.linuxppc.org
cc:
Subject: powerpc64-linux-g++ host types

I'm not sure if this is a user question or a dev question, but I'll try
here first.

We have gotten ELF64 gcc working wonderfully on our build machines, which
exclusively run AIX (4.3 & 5.1). However, some people are clamoring for
C++ now. Has anyone successfully generated an ELF64 g++? The barrier
seems to be the requirement of glibc, which will not configure on AIX.
Any help at all would be greatly appreciated.

Thanks,
Brad

Bradley R. Harrington
bharring at us.ibm.com
pSeries Firmware Development
Tel (512) 838-6625 T/L: 678-6625

From cliss at us.ibm.com Thu Feb 13 07:28:24 2003
From: cliss at us.ibm.com (David Clissold)
Date: Wed, 12 Feb 2003 14:28:24 -0600
Subject: powerpc64-linux-g++ host types
Message-ID:

Actually, I don't think this is an AIX Toolbox g++ question.
This question refers to building ELF64 objects with g++ on AIX,
presumably as a cross-compile; but the gcc and g++ compiler versions in
the AIX Toolbox produce XCOFF binaries, to be run natively on AIX itself.
They can produce 64-bit XCOFF, but they use the AIX native libc.a,
without glibc.

David Clissold, IBM Austin
cliss at us.ibm.com

Kaena Freitas
02/12/2003 01:36 PM
To: Bradley Harrington/Austin/IBM at IBMUS
cc: linuxppc64-dev at lists.linuxppc.org, owner-linuxppc64-dev at lists.linuxppc.org, David Clissold/Austin/IBM at IBMUS
From: Kaena Freitas/Austin/IBM at IBMUS
Subject: Re: powerpc64-linux-g++ host types

Hi Bradley,

I think you might be asking the wrong development team. I assume you are
using the GNU tools which run on AIX and are part of the AIX Toolbox for
Linux Applications. I copied David Clissold on this note. David is the
Toolbox owner and can answer your question.

Aloha, Kaena

D. Kaena Freitas
Linux on pSeries Development
kaena at us.ibm.com
Office: (512)838-2676 cell: (512)762-3884

Bradley Harrington/Austin/IBM at IBMUS
Sent by: owner-linuxppc64-dev at lists.linuxppc.org
02/12/2003 12:13 PM
To: linuxppc64-dev at lists.linuxppc.org
cc:
Subject: powerpc64-linux-g++ host types

I'm not sure if this is a user question or a dev question, but I'll try
here first.

We have gotten ELF64 gcc working wonderfully on our build machines, which
exclusively run AIX (4.3 & 5.1). However, some people are clamoring for
C++ now. Has anyone successfully generated an ELF64 g++? The barrier
seems to be the requirement of glibc, which will not configure on AIX.
Any help at all would be greatly appreciated.

Thanks,
Brad

Bradley R. Harrington
bharring at us.ibm.com
pSeries Firmware Development
Tel (512) 838-6625 T/L: 678-6625

** Sent via the linuxppc64-dev mail list.
See http://lists.linuxppc.org/

From amodra at bigpond.net.au Thu Feb 13 13:24:49 2003
From: amodra at bigpond.net.au (Alan Modra)
Date: Thu, 13 Feb 2003 12:54:49 +1030
Subject: powerpc64-linux-g++ host types
In-Reply-To:
References:
Message-ID: <20030213022449.GQ23622@bubble.sa.bigpond.net.au>

On Wed, Feb 12, 2003 at 12:13:32PM -0600, Bradley Harrington wrote:
> The barrier seems to be the requirement of glibc, which will not configure
> on AIX. Any help at all would be greatly appreciated.

I've attached some instructions I wrote some time ago (they may be a
little stale) for bootstrapping cross-toolchains.

--
Alan Modra
IBM OzLabs - Linux Technology Centre
-------------- next part --------------
PowerPC64 toolchain bootstrapping.

The following assumes you want to build a full powerpc64-linux toolchain
including glibc for use in a cross-toolchain environment, starting from a
system with no powerpc64 tools installed, and that you have the correct
sources available. The example configure lines are for an i686-linux
host installing tools to the default prefix of /usr/local. Adjust to
suit your system, and the location of your binutils, gcc, linux etc.
sources.

1) Build binutils

    mkdir /tmp/binutils
    cd /tmp/binutils
    /src/binutils-current/configure --prefix=/usr/local \
      --build=i686-linux --host=i686-linux --target=powerpc64-linux \
      --disable-nls
    make
    su
    make install
    exit

Note: On a powerpc-linux system it's possible to build a biarch set of
binutils that handles both powerpc-linux and powerpc64-linux. In that
case, you would configure using something like:

    /src/binutils-current/configure --prefix=/usr/local \
      --build=powerpc-linux --host=powerpc-linux --target=powerpc-linux \
      --enable-targets=powerpc64-linux --disable-nls

Besides reducing the number of toolchain binaries on your system, this
has the advantage of enabling some features of ld only available on
native toolchains (ie. with host == target).

2) Build gcc

First make sure your PATH includes /usr/local/bin so you have access to
the binutils you installed in step (1).

Edit /src/gcc-ppc64-3.2/gcc/config/rs6000/linux.h, and replace the line
"#ifdef IN_LIBGCC2" with "#if 0".
We don't have any glibc or kernel headers yet. Alternatively, steal a
copy of the headers from somewhere and install to
/usr/local/powerpc64-linux/include/

    mkdir /tmp/gcc
    cd /tmp/gcc
    /src/gcc-ppc64-3.2/configure --prefix=/usr/local \
      --build=i686-linux --host=i686-linux --target=powerpc64-linux \
      --disable-nls --disable-threads --disable-shared --enable-languages=c
    make
    su
    make install
    exit

3) Build a powerpc64-linux kernel

This step is necessary to generate kernel headers used by glibc and gcc.

4) Build glibc

The changed --prefix and --host below are _not_ typos.

    mkdir /tmp/glibc
    cd /tmp/glibc
    /src/glibc-ppc64/configure --prefix=/usr/local/powerpc64-linux \
      --build=i686-linux --host=powerpc64-linux --target=powerpc64-linux \
      --with-headers=/src/linux-2.4.19/include --without-cvs \
      --enable-add-ons --enable-shared --disable-sanity-checks
    make
    su
    make install
    # Add links to kernel headers, or copy them if you like
    cd /usr/local/powerpc64-linux/include/
    ln -s /src/linux-2.4.19/include/asm-ppc64 asm
    ln -s /src/linux-2.4.19/include/linux linux
    # Make /usr/local/powerpc64-linux writable for the next gcc compile
    # The reason being that --with-headers copies include/ to sys-include/
    # You need --with-headers for gcc's limits.h to be properly generated
    # with #include_next to pick up the "real" limits.h, and to adjust
    # libgcc support now that you have glibc installed.
    chmod 777 /usr/local/powerpc64-linux
    # If your host is powerpc-linux and you intend running powerpc64 apps on
    # the host, link (or copy) powerpc64-linux ld.so to the standard location
    mkdir /lib64
    ln -s ../usr/local/powerpc64-linux/lib/ld.so.1 /lib64/ld64.so.1
    exit

5) Rebuild gcc, with more language support

Undo the change to linux.h you made in step 2.

    rm -rf /tmp/gcc
    mkdir /tmp/gcc
    cd /tmp/gcc
    /src/gcc-ppc64-3.2/configure --prefix=/usr/local \
      --build=i686-linux --host=i686-linux --target=powerpc64-linux \
      --disable-nls --enable-shared --enable-languages=c,c++,f77 \
      --with-headers=/usr/local/powerpc64-linux/include
    make
    su
    make install
    chmod 755 /usr/local/powerpc64-linux
    exit

That's it!

From anton at samba.org Fri Feb 14 13:19:19 2003
From: anton at samba.org (Anton Blanchard)
Date: Fri, 14 Feb 2003 13:19:19 +1100
Subject: lockless gettimeofday
Message-ID: <20030214021919.GA20211@krispykreme>

Hi,

2.5 just got the improved gettimeofday which should prevent any
starvation issues we were seeing in 2.4. It should also be faster since
the rwlock is replaced by two counters; the cost is now 2 extra loads
and 2 read barriers.

With this and sysconfig which allows libc to implement gettimeofday in
userspace I'm looking to remove the kernel lockless gettimeofday. It's
quite complex and the gain this gives in 2.5 is less.

Anton

From gb at clozure.com Mon Feb 17 00:26:11 2003
From: gb at clozure.com (Gary Byers)
Date: Sun, 16 Feb 2003 06:26:11 -0700 (MST)
Subject: 32-bit signal contexts missing dar, dsisr, trap
Message-ID:

Hi.

When setting up the signal frame for a 32-bit signal handler, current
versions (2.5.61 from kernel.org, 2.4.21-pre4 with recent patches from
penguinppc64.org) of the ppc64 kernel neglect to copy the dar, dsisr,
and trap fields (at least) from the 64-bit register context to the
handler's 32-bit context.

Somewhat oddly, in both kernels the functions 'setup_frame32' and
'setup_rt_frame32' copy a slightly different set of registers/state
information: setup_rt_frame32() copies the MQ register and
setup_rt_frame32() doesn't. (I honestly don't know if there are
machines that can run the ppc64 kernel that -have- MQ registers or
not.)

Perhaps this would be clearer if the loop that copies the 32 "real"
GPRs were just extended to copy up to and including PT_RESULT ?

Gary Byers
gb at clozure.com

From bergner at vnet.ibm.com Wed Feb 19 02:30:27 2003
From: bergner at vnet.ibm.com (Peter Bergner)
Date: Tue, 18 Feb 2003 09:30:27 -0600
Subject: 32-bit signal contexts missing dar, dsisr, trap
References:
Message-ID: <3E525193.30001@vnet.ibm.com>

Gary Byers wrote:
> When setting up the signal frame for a 32-bit signal handler, current
> versions (2.5.61 from kernel.org, 2.4.21-pre4 with recent patches from
> penguinppc64.org) of the ppc64 kernel neglect to copy the dar, dsisr,
> and trap fields (at least) from the 64-bit register context to the
> handler's 32-bit context.
>
> Somewhat oddly, in both kernels the functions 'setup_frame32' and
> 'setup_rt_frame32' copy a slightly different set of registers/state
> information: setup_rt_frame32() copies the MQ register and
> setup_rt_frame32() doesn't. (I honestly don't know if there are
> machines that can run the ppc64 kernel that -have- MQ registers or
> not.)
>
> Perhaps this would be clearer if the loop that copies the 32 "real"
> GPRs were just extended to copy up to and including PT_RESULT ?

I'll take a look at this. Thanks.

Peter

** Sent via the linuxppc64-dev mail list.
See http://lists.linuxppc.org/

From paulus at samba.org Wed Feb 19 09:03:27 2003
From: paulus at samba.org (Paul Mackerras)
Date: Wed, 19 Feb 2003 09:03:27 +1100
Subject: 32-bit signal contexts missing dar, dsisr, trap
In-Reply-To:
References:
Message-ID: <15954.44463.598314.453596@argo.ozlabs.ibm.com>

Gary Byers writes:

> When setting up the signal frame for a 32-bit signal handler, current
> versions (2.5.61 from kernel.org, 2.4.21-pre4 with recent patches from
> penguinppc64.org) of the ppc64 kernel neglect to copy the dar, dsisr,
> and trap fields (at least) from the 64-bit register context to the
> handler's 32-bit context.

Hmmm, these fields aren't needed for restoring the state of the
process when the signal handler returns, and the information in them
should mostly be available in a more portable form in the siginfo
struct (for a "real-time" signal handler, at least).

Why do you need dar, dsisr, and trap?

> Somewhat oddly, in both kernels the functions 'setup_frame32' and
> 'setup_rt_frame32' copy a slightly different set of registers/state
> information: setup_rt_frame32() copies the MQ register and
> setup_rt_frame32() doesn't. (I honestly don't know if there are
> machines that can run the ppc64 kernel that -have- MQ registers or
> not.)

No, the 601 is the most recent processor that has an MQ register AFAIK. :)

Paul.

** Sent via the linuxppc64-dev mail list.
See http://lists.linuxppc.org/

From gb at clozure.com Wed Feb 19 10:07:41 2003
From: gb at clozure.com (Gary Byers)
Date: Tue, 18 Feb 2003 16:07:41 -0700 (MST)
Subject: 32-bit signal contexts missing dar, dsisr, trap
In-Reply-To: <15954.44463.598314.453596@argo.ozlabs.ibm.com>
Message-ID:

On Wed, 19 Feb 2003, Paul Mackerras wrote:

> Gary Byers writes:
>
> > When setting up the signal frame for a 32-bit signal handler, current
> > versions (2.5.61 from kernel.org, 2.4.21-pre4 with recent patches from
> > penguinppc64.org) of the ppc64 kernel neglect to copy the dar, dsisr,
> > and trap fields (at least) from the 64-bit register context to the
> > handler's 32-bit context.
>
> Hmmm, these fields aren't needed for restoring the state of the
> process when the signal handler returns, and the information in them
> should mostly be available in a more portable form in the siginfo
> struct (for a "real-time" signal handler, at least).
>
> Why do you need dar, dsisr, and trap?

I have certain stack-like data structures that have write-protected guard
pages at their limits and use a SIGSEGV handler to detect writes to those
guard pages. The handler can respond to a write to a guard page more
reliably if it knows that a DSI caused the SIGSEGV, if it knows that
the fault involved a write, and if it knows the address being written
to. I suppose that it might be possible to reconstruct this information
by disassembling the instruction at the sigcontext's PC, but using the
dar, dsisr, and trap fields of the sigcontext seems far more reliable.

As far as I know, ppc32 SIGSEGV handlers don't receive siginfo arguments;
the dar and dsisr are generally only meaningful after some synchronous,
memory-related exception that will typically raise SIGSEGV.

> Paul.
>
>

Gary Byers
gb at clozure.com

** Sent via the linuxppc64-dev mail list.
See http://lists.linuxppc.org/

From peter at bergner.org Thu Feb 20 06:15:05 2003
From: peter at bergner.org (Peter Bergner)
Date: Wed, 19 Feb 2003 13:15:05 -0600
Subject: 32-bit signal contexts missing dar, dsisr, trap
References:
Message-ID: <3E53D7B9.3010204@bergner.org>

Gary Byers wrote:
> I have certain stack-like data structures that have write-protected guard
> pages at their limits and use a SIGSEGV handler to detect writes to those
> guard pages. The handler can respond to a write to a guard page more
> reliably if it knows that a DSI caused the SIGSEGV, if it knows that
> the fault involved a write, and if it knows the address being written
> to. I suppose that it might be possible to reconstruct this information
> by disassembling the instruction at the sigcontext's PC, but using the
> dar, dsisr, and trap fields of the sigcontext seems far more reliable.

How does the attached patch work for you? It does seem to clean the
code up slightly and it better matches the behaviour seen on 64-bit apps.

> As far as I know, ppc32 SIGSEGV handlers don't receive siginfo arguments;
> the dar and dsisr are generally only meaningful after some synchronous,
> memory-related exception that will typically raise SIGSEGV.

If you set sa_flags = SA_SIGINFO, your SIGSEGV handler can get siginfo
data. I've attached a test program which you can use to test the
attached patch. It seems to work for me.

Peter
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: signal32.diff
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030219/3cd23d5f/attachment.txt
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sigsegv.c
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030219/3cd23d5f/attachment-0001.txt

From gb at clozure.com Fri Feb 21 10:16:31 2003
From: gb at clozure.com (Gary Byers)
Date: Thu, 20 Feb 2003 16:16:31 -0700 (MST)
Subject: 32-bit signal contexts missing dar, dsisr, trap
In-Reply-To: <3E53D7B9.3010204@bergner.org>
Message-ID:

On Wed, 19 Feb 2003, Peter Bergner wrote:

> Gary Byers wrote:
> > I have certain stack-like data structures that have write-protected guard
> > pages at their limits and use a SIGSEGV handler to detect writes to those
> > guard pages. The handler can respond to a write to a guard page more
> > reliably if it knows that a DSI caused the SIGSEGV, if it knows that
> > the fault involved a write, and if it knows the address being written
> > to. I suppose that it might be possible to reconstruct this information
> > by disassembling the instruction at the sigcontext's PC, but using the
> > dar, dsisr, and trap fields of the sigcontext seems far more reliable.
>
> How does the attached patch work for you? It does seem to clean the
> code up slightly and it better matches the behaviour seen on 64-bit apps.

That seems to work fine; thanks!

Gary Byers
gb at clozure.com
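
Note on the thread above: Peter's test program (sigsegv.c) was scrubbed
from the archive, so what follows is not his code. It is only a minimal
sketch of the approach he mentions, assuming a Linux/POSIX system with
mmap() and sigaction(): a SIGSEGV handler installed with SA_SIGINFO reads
the faulting address from si_addr and checks it against a write-protected
guard page. The names probe_guard_write, on_segv, guard_page and the
16-byte offset are hypothetical and chosen purely for illustration. Note
also that POSIX does not list mprotect() as async-signal-safe; calling it
from the handler generally works on Linux but is an assumption here.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *guard_page;                  /* hypothetical guard page */
static size_t guard_len;
static void *volatile fault_addr;         /* address reported in siginfo */
static volatile sig_atomic_t fault_count;

/* SA_SIGINFO handler: record the faulting address, then unprotect the
 * page so the faulting store is restarted and succeeds. */
static void on_segv(int sig, siginfo_t *si, void *ucontext)
{
    char *addr = (char *)si->si_addr;

    (void)sig;
    (void)ucontext;   /* dar/dsisr/trap would live here on ppc */

    /* Not our guard page: bail out instead of faulting forever. */
    if (addr < guard_page || addr >= guard_page + guard_len)
        _exit(3);

    fault_addr = si->si_addr;
    fault_count++;
    if (mprotect(guard_page, guard_len, PROT_READ | PROT_WRITE) != 0)
        _exit(2);
}

/* Returns 1 if exactly one fault was seen at the expected address and
 * the store completed afterwards, 0 if not, -1 on setup failure. */
int probe_guard_write(void)
{
    struct sigaction sa, old;
    volatile char *p;
    long pagesz = sysconf(_SC_PAGESIZE);
    int ok;

    if (pagesz <= 0)
        return -1;
    guard_len = (size_t)pagesz;
    guard_page = mmap(NULL, guard_len, PROT_READ,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guard_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(guard_page, guard_len);
        return -1;
    }

    fault_count = 0;
    fault_addr = NULL;
    p = guard_page + 16;
    *p = 42;                      /* write to read-only page: faults once */

    sigaction(SIGSEGV, &old, NULL);
    ok = (fault_count == 1 &&
          fault_addr == (void *)(guard_page + 16) &&
          *p == 42);
    munmap(guard_page, guard_len);
    return ok;
}
```

A caller would simply invoke probe_guard_write() from main() and check
for a return value of 1. This sketch relies on the portable siginfo path
only; the dar, dsisr and trap fields Gary asked about are specific to the
ppc sigcontext and would have to be read through the handler's third
(ucontext) argument.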