From anton at samba.org  Mon Sep  1 18:01:39 2003
From: anton at samba.org (Anton Blanchard)
Date: Mon, 1 Sep 2003 18:01:39 +1000
Subject: [PATCH] Specify right base address/IRQ for ttyS0{2,3}
In-Reply-To: <Pine.A41.4.44.0308282158110.50410-100000@forte.austin.ibm.com>
References: <3F4D1D4B.6010701@austin.ibm.com> <Pine.A41.4.44.0308282158110.50410-100000@forte.austin.ibm.com>
Message-ID: <20030901080139.GE12245@krispykreme>


Hi,

> Hmm, an interesting observation: When I boot with this patch on a
> 7044-270, it also finds 4 serial ports. One of them is the tablet port
> (which is a serial port). I suppose there is hardware for 4 ports, but
> only 2+1 actually have connectors.
>
> This brings a bit more urgency to doing the "nice" fix: Looking at the
> device tree to find the serial ports, and patch the table early in the
> boot. Detecting them doesn't do much harm, but it's misleading.

On some ppc64 boxes the keyboard controller is present but not terminated.
I've seen it on our SP2 node as well as other machines.

If you try and use it you will just end up with a hot interrupt. The
quick fix (I think from Milton) was to reserve the relevant IO range in
early ppc64 boot so the keyboard driver would fail later on.

Perhaps a similar quick fix would work here?

Anton

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From meissner at suse.de  Mon Sep  1 18:12:37 2003
From: meissner at suse.de (Marcus Meissner)
Date: Mon, 1 Sep 2003 10:12:37 +0200
Subject: possible deadlock in pipes
In-Reply-To: <1062094772.612.156.camel@gaston>
References: <20030820214114.GA20395@suse.de> <20030821215433.GE29476@krispykreme> <20030828110829.GB1482@suse.de> <20030828154248.GB12541@krispykreme> <20030828154542.GA12971@suse.de> <20030828155105.GD12541@krispykreme> <1062094772.612.156.camel@gaston>
Message-ID: <20030901081237.GB26954@suse.de>


On Thu, Aug 28, 2003 at 08:19:33PM +0200, Benjamin Herrenschmidt wrote:
>
> On Thu, 2003-08-28 at 17:51, Anton Blanchard wrote:
> > > Hmm, I tested 2.4.21, but anyway :)
> >
> > Should be similar in this area :) I'm hoping Paul or Dave can pick this
> > change up for 2.4.
>
> I suppose ppc32 is affected as well ?

Yes, judging from a look at arch/ppc/kernel/semaphore.c.

Ciao, Marcus



From anton at samba.org  Tue Sep  2 11:54:28 2003
From: anton at samba.org (Anton Blanchard)
Date: Tue, 2 Sep 2003 11:54:28 +1000
Subject: [PATCH] Specify right base address/IRQ for ttyS0{2,3}
In-Reply-To: <3F4D0B0F.3070701@austin.ibm.com>
References: <3F4D0B0F.3070701@austin.ibm.com>
Message-ID: <20030902015428.GA1941@krispykreme>


Hi Olof,

> Attached patch sets valid default values for the 3rd and 4th serial port on
> pSeries systems.

Thanks, I added it to 2.5 BK.

Anton



From taj.subhani at oracle.com  Wed Sep  3 02:07:56 2003
From: taj.subhani at oracle.com (Mohammed Tajuddin)
Date: Tue, 02 Sep 2003 09:07:56 -0700
Subject: TOC overflow on linuxppc64
References: <3F4E8DDB.C6EAABE5@oracle.com> <20030829011859.GS1320@bubble.sa.bigpond.net.au>
Message-ID: <3F54C05C.62B2812C@oracle.com>


mainline CVS binutils solved the problem. Thanks a lot.

Regards,
TAJ

Alan Modra wrote:
>
> Use mainline CVS binutils.
>
> --
> Alan Modra
> IBM OzLabs - Linux Technology Centre



From lxiep at us.ibm.com  Wed Sep  3 08:36:49 2003
From: lxiep at us.ibm.com (Linda Xie)
Date: Tue, 02 Sep 2003 17:36:49 -0500
Subject: patch for exporting hotplug_slots subsys
Message-ID: <3F551B81.2000109@us.ltcfwd.linux.ibm.com>

Hi Greg,

This patch allows a hotplug controller driver to create a new file in the
hotplug_slots directory, which the driver can use to read or write anything
it wants.

Please see attached file and let me know if you have any problems
with it.

Thanks,

Linda
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hotplug.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030902/8f233395/attachment.txt 

From engebret at vnet.ibm.com  Fri Sep  5 00:05:57 2003
From: engebret at vnet.ibm.com (Dave Engebretsen)
Date: Thu, 04 Sep 2003 09:05:57 -0500
Subject: interrupt stacks
References: <20030828155000.GC12541@krispykreme>
Message-ID: <3F5746C5.3C9C7793@vnet.ibm.com>


It seems that if we are certain no recursion can occur, the stacks
should be removed.  Debugging that problem the first time around was
quite time consuming, so we should be very sure of this fact.

Do you have a patch, or could you just add the code into 2.4 BK to fix
the problems found in xics.c?

Thanks -

Dave.

Anton Blanchard wrote:
>
> Hi,
>
> In 2.4 we have interrupt stacks. It turns out there were a number of
> issues with the xics irq routines, Milton and I weeded them out in 2.5
> over the last few months. There were windows where we could take irqs
> recursively, resulting in excessive stack usage.
>
> In 2.5 interrupt stacks are disabled because the thread_info changes
> broke it. The thread_info by default lives at the bottom of the kernel
> stack, and switching stacks on the fly confuses it greatly.
>
> Our options are:
>
> 1 fix thread_info. Some recent changes pushed by ia64 should allow us to
> move the thread_info into the task_struct.
>
> 2 remove interrupt stacks. We have done very heavy testing of 2.5 in the
> lab and have not seen any stack overflow problems. Now that we can't
> take recursive irqs we shouldn't be able to overflow a task's stack.
>
> I'm leaning towards option 2; if we don't need interrupt stacks then there
> is no need for that complexity. Thoughts?
>
> Anton
>



From hollisb at us.ibm.com  Fri Sep  5 07:07:47 2003
From: hollisb at us.ibm.com (Hollis Blanchard)
Date: Thu, 4 Sep 2003 16:07:47 -0500
Subject: configure help
Message-ID: <D619D44C-DF1B-11D7-BE69-000A95A0560C@us.ibm.com>


Hey, I was just noticing that there are a lot of ppc64 config options
that are undocumented in both 2.4 and 2.5 bk trees. Some examples from
2.5 include MSCHUNKS, RTAS_FLASH, SCANLOG, PPC_RTAS, VIOCONS, and VETH.

If you know what any of these are (especially if you wrote them :),
could you take a moment to add some help text? In 2.5 it goes with the
config option (e.g. in arch/ppc64/Kconfig). In 2.4 it goes into
Documentation/Configure.help.

--
Hollis Blanchard, who's always wondered what MSCHUNKS were...
IBM Linux Technology Center




From kaena at us.ibm.com  Fri Sep  5 07:23:19 2003
From: kaena at us.ibm.com (Kaena Freitas)
Date: Thu, 4 Sep 2003 16:23:19 -0500
Subject: configure help
Message-ID: <OFD5A0650C.73ADEBBC-ON87256D97.00752141@us.ibm.com>


Hollis -

Welcome back to the "oven". You've been away from the farm for much too
long. :-) I'll let more technical guys document the true meaning of these
macros but since you've been playing on your little Intel boxes, the Power
team has been adding things like Virtual I/O, Virtual Ethernet, updating
system firmware, and a whole bunch of neat stuff. :-)

Aloha,

Kaena

D. Kaena Freitas
Linux on Power Leadership Team
kaena at us.ibm.com  Office: (512)838-2676
cell: (512)762-3884

[ hollisb at us.ltcfwd.linux.ibm.com writes: ]
>
> Hey, I was just noticing that there are a lot of ppc64 config options
> that are undocumented in both 2.4 and 2.5 bk trees. Some examples from
> 2.5 include MSCHUNKS, RTAS_FLASH, SCANLOG, PPC_RTAS, VIOCONS, and
> VETH.
>
> If you know what any of these are (especially if you wrote them :),
> could you take a moment to add some help text? In 2.5 it goes with
> the config option (e.g. in arch/ppc64/Kconfig). In 2.4 it goes into
> Documentation/Configure.help.



From taj.subhani at oracle.com  Tue Sep  9 10:03:25 2003
From: taj.subhani at oracle.com (Mohammed Tajuddin)
Date: Mon, 08 Sep 2003 17:03:25 -0700
Subject: how to access PC
References: <OF12786557.32B9372E-ON86256D91.0052BC58-86256D91.00531E3A@us.ibm.com> <20030829162103.A29782@infradead.org>
Message-ID: <3F5D18CD.E04E0B6C@oracle.com>


Hello,

Has anyone tried accessing the program counter on linux powerpc64?  I
notice that struct pt_regs, defined in ptrace.h, does not contain any
placeholder for the program counter, which is defined on mac-osx for example.
Basically I was trying to access PC, PS from within a program.
Appreciate your feedback.


Regards,
TAJ



From pinskia at physics.uc.edu  Tue Sep  9 11:26:34 2003
From: pinskia at physics.uc.edu (Andrew Pinski)
Date: Mon, 8 Sep 2003 18:26:34 -0700
Subject: how to access PC
In-Reply-To: <3F5D18CD.E04E0B6C@oracle.com>
Message-ID: <A6740B58-E264-11D7-A58D-0003939F15BE@physics.uc.edu>


Try this function:
void * __attribute__((noinline)) getpc()
{
	void *t;
	asm("mflr %0":"=r"(t));
	return t;
}
It might not always work because of sibling-call optimizations, but it
should get you started.

Thanks,
Andrew Pinski


On Monday, September 8, 2003, at 05:03 PM, Mohammed Tajuddin wrote:

> Hello,
>
> Has anyone tried accessing the program counter on linux powerpc64?  I
> notice that struct pt_regs, defined in ptrace.h, does not contain any
> placeholder for the program counter, which is defined on mac-osx for example.
> Basically I was trying to access PC, PS from within a program.
> Appreciate your feedback.
>
>
> Regards,
> TAJ
>
>




From rth at redhat.com  Tue Sep  9 16:28:22 2003
From: rth at redhat.com (Richard Henderson)
Date: Mon, 8 Sep 2003 23:28:22 -0700
Subject: how to access PC
In-Reply-To: <A6740B58-E264-11D7-A58D-0003939F15BE@physics.uc.edu>
References: <3F5D18CD.E04E0B6C@oracle.com> <A6740B58-E264-11D7-A58D-0003939F15BE@physics.uc.edu>
Message-ID: <20030909062822.GE14968@redhat.com>


On Mon, Sep 08, 2003 at 06:26:34PM -0700, Andrew Pinski wrote:
> Try this function:
> void * __attribute__((noinline)) getpc()
> {
> 	void *t;
> 	asm("mflr %0":"=r"(t));
> 	return t;
> }

If you're not going to inline, __builtin_return_address(0) will work.
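A minimal sketch of that suggestion, assuming GCC or a compatible compiler
(both the noinline attribute and the builtin are compiler extensions).
show_pc() is a hypothetical helper added only to show a call site; the value
returned is the address the caller resumes at, so it approximates the PC of
the call rather than giving it exactly.

```c
#include <assert.h>
#include <stdio.h>

/* noinline keeps a real call and return, so there is a return address to read. */
__attribute__((noinline)) void *getpc(void)
{
	/* Address the caller resumes at, i.e. just past the call to getpc(). */
	return __builtin_return_address(0);
}

/* Hypothetical helper: print an approximate program counter for this call site. */
void show_pc(void)
{
	void *pc = getpc();

	assert(pc != NULL);
	printf("pc is near %p\n", pc);
}
```

Unlike the mflr version, this does not depend on the link register still
holding the return address when the asm statement executes.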


r~



From paulus at samba.org  Tue Sep  9 17:07:34 2003
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 9 Sep 2003 17:07:34 +1000
Subject: how to access PC
In-Reply-To: <3F5D18CD.E04E0B6C@oracle.com>
References: <OF12786557.32B9372E-ON86256D91.0052BC58-86256D91.00531E3A@us.ibm.com>
	<20030829162103.A29782@infradead.org>
	<3F5D18CD.E04E0B6C@oracle.com>
Message-ID: <16221.31798.126557.517626@cargo.ozlabs.ibm.com>


Mohammed Tajuddin writes:

> Has anyone tried accessing the program counter on linux powerpc64?  I
> notice that struct pt_regs, defined in ptrace.h, does not contain any
> placeholder for the program counter, which is defined on mac-osx for example.

You want the nip field.  NIP stands for Next Instruction Pointer.

> Basically I was trying to access PC, PS from within a program.
> Appreciate your feedback.

And for "PS" you want the msr field, i.e. the Machine State Register.

Paul.



From hollisb at us.ibm.com  Thu Sep 11 02:03:06 2003
From: hollisb at us.ibm.com (Hollis Blanchard)
Date: Wed, 10 Sep 2003 11:03:06 -0500
Subject: gcc 3.3 fix for 2.4
Message-ID: <43E784B8-E3A8-11D7-AC00-000A95A0560C@us.ibm.com>

Anton recently pushed this gcc 3.3 fix to Linus' 2.5 tree. The same fix
is needed for 2.4 (I modified the name to be like HAS_BIARCH).

Also, I think HAS_BIARCH needs to use $(CC) instead of gcc directly.

Could one of you please push this patch to ameslab 2.4? I'll attach it
as well in case it linewraps.

--
Hollis Blanchard
IBM Linux Technology Center

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc-3.3-bss.diff
Type: application/octet-stream
Size: 853 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030910/bbbf919a/attachment.obj 
-------------- next part --------------


===== arch/ppc64/Makefile 1.4 vs edited =====
--- 1.4/arch/ppc64/Makefile	Mon Aug 25 23:47:42 2003
+++ edited/arch/ppc64/Makefile	Wed Sep 10 05:39:06 2003
@@ -19,7 +19,7 @@
  CHECKS		= checks
  endif

-HAS_BIARCH      := $(shell if gcc -m64 -S -o /dev/null -xc /dev/null > /dev/null 2>&1; then echo y; else echo n; fi;)
+HAS_BIARCH      := $(shell if $(CC) -m64 -S -o /dev/null -xc /dev/null > /dev/null 2>&1; then echo y; else echo n; fi;)
  ifeq ($(HAS_BIARCH),y)
  AS              := $(AS) -64
  LD              := $(LD) -m elf64ppc
@@ -33,6 +33,11 @@
  		-mtraceback=full
  CPP		= $(CC) -E $(CFLAGS)

+HAVE_ZERO_BSS := $(shell if $(CC) -fno-zero-initialized-in-bss -S -o /dev/null -xc /dev/null > /dev/null 2>&1; then echo y; else echo n; fi)
+
+ifeq ($(HAVE_ZERO_BSS),y)
+CFLAGS		+= -fno-zero-initialized-in-bss
+endif

  HEAD := arch/ppc64/kernel/head.o

From taj.subhani at oracle.com  Thu Sep 11 04:21:46 2003
From: taj.subhani at oracle.com (Mohammed Tajuddin)
Date: Wed, 10 Sep 2003 11:21:46 -0700
Subject: how to access PC
References: <A6740B58-E264-11D7-A58D-0003939F15BE@physics.uc.edu>
Message-ID: <3F5F6BBA.32CBB690@oracle.com>


Hello,

getcontext defined in sys/ucontext.h fails on ppc64. I am using gcc 3.2;
is that a problem? Am I missing something here? Here is a sample program
I tried.


#include <stdio.h>     /* printf */
#include <ucontext.h>  /* getcontext; pulls in sys/ucontext.h */

int main()
{
  ucontext_t  ucon, * context;
  context = &ucon;

  if ( 0 != getcontext(context)) {
    printf("getcontext failed \n");
  }
  return 0;
}


Regards,
TAJ



From sjmunroe at us.ibm.com  Thu Sep 11 06:02:01 2003
From: sjmunroe at us.ibm.com (Steve Munroe)
Date: Wed, 10 Sep 2003 15:02:01 -0500
Subject: how to access PC
Message-ID: <OFC15D10CC.6576A580-ON86256D9D.006DC2FA-86256D9D.006E0DEE@us.ibm.com>


Mohammed Tajuddin writes

> getcontext defined in sys/ucontext.h fails on ppc64. I am using gcc 3.2;
> is that a problem? Am I missing something here? Here is a sample program
> I tried.

The [get|make|set|swap]context support requires kernel-2.4.21 and glibc
from cvs. Context support involved a sigcontext layout change.



From amodra at bigpond.net.au  Thu Sep 11 11:41:54 2003
From: amodra at bigpond.net.au (Alan Modra)
Date: Thu, 11 Sep 2003 11:11:54 +0930
Subject: gcc 3.3 fix for 2.4
In-Reply-To: <43E784B8-E3A8-11D7-AC00-000A95A0560C@us.ibm.com>
References: <43E784B8-E3A8-11D7-AC00-000A95A0560C@us.ibm.com>
Message-ID: <20030911014153.GG1443@bubble.modra.org>


On Wed, Sep 10, 2003 at 11:03:06AM -0500, Hollis Blanchard wrote:
> Also, I think HAS_BIARCH needs to use $(CC) instead of gcc directly.

Definitely.  Anyone cross-compiling from an x86 box will need this, as
I found within a few minutes of trying to build a kernel on my Athlon.

--
Alan Modra
IBM OzLabs - Linux Technology Centre



From engebret at vnet.ibm.com  Thu Sep 11 23:22:48 2003
From: engebret at vnet.ibm.com (Dave Engebretsen)
Date: Thu, 11 Sep 2003 08:22:48 -0500
Subject: gcc 3.3 fix for 2.4
References: <43E784B8-E3A8-11D7-AC00-000A95A0560C@us.ibm.com>
Message-ID: <3F607728.B45F503C@vnet.ibm.com>


I put this change into 2.4 BK.  Thanks Hollis.

Dave.

Hollis Blanchard wrote:
>
> Anton recently pushed this gcc 3.3 fix to Linus' 2.5 tree. The same fix
> is needed for 2.4 (I modified the name to be like HAS_BIARCH).
>
> Also, I think HAS_BIARCH needs to use $(CC) instead of gcc directly.
>
> Could one of you please push this patch to ameslab 2.4? I'll attach it
> as well in case it linewraps.
>
> --
> Hollis Blanchard
> IBM Linux Technology Center



From hollisb at us.ibm.com  Fri Sep 12 05:26:46 2003
From: hollisb at us.ibm.com (Hollis Blanchard)
Date: Thu, 11 Sep 2003 14:26:46 -0500
Subject: gcc 3.3 fix for 2.4
In-Reply-To: <3F607728.B45F503C@vnet.ibm.com>
Message-ID: <E245A8E4-E48D-11D7-AC00-000A95A0560C@us.ibm.com>

On Thursday, Sep 11, 2003, at 08:22 US/Central, Dave Engebretsen wrote:

> I put this change into 2.4 BK.  Thanks Hollis.

Thanks Dave. :)

I forgot to mention I had to backport some of the 2.5 unistd.h to 2.4,
because gcc 3.3.1 didn't like it. Specifically the error was:
	/home/hollis/source/linuxppc64-2.4/include/asm/unistd.h:442: error: asm-specifier for variable `__sc_4' conflicts with asm clobber list
[varied maybe 50 times]

This patch (which I hadn't really tested yesterday, which is why I
didn't send it then) backports the gcc3-friendly inline asm from 2.5's
unistd.h, apparently originally from Franz Sirl.

With this patch applied, I can build with gcc 3.3.1 (and the resulting
kernel seems to work, though I didn't do anything like run LTP), so if
it's acceptable please apply.

--
Hollis Blanchard
IBM Linux Technology Center
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc3-syscalls.diff
Type: application/octet-stream
Size: 8579 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030911/9db30acb/attachment.obj 

From nathanl at austin.ibm.com  Mon Sep 15 16:17:37 2003
From: nathanl at austin.ibm.com (Nathan Lynch)
Date: Mon, 15 Sep 2003 01:17:37 -0500
Subject: [PATCH] add new OF device tree API (2.6.0-test3)
Message-ID: <3F655981.80601@austin.ibm.com>

Hi-

This is an adaptation of the new Open Firmware device tree traversal API
from ppc32, originally written by Benjamin Herrenschmidt.  This patch is
against 2.6.0-test3, but should apply ok to the latest 2.5 ameslab tree.

These functions are meant to be SMP-safe alternatives to the current set
of query/traversal routines (find_devices, find_type_devices, et al).

Some things which I plan to add within the next few days:
- "porting" arch/ppc64 to the new API
- implementation of reference counting
- support for addition and removal of device nodes
- a /proc-based mechanism for initiating node addition and removal from
userspace

Comments?  I'm especially interested in feedback on whether a rwlock
approach would be sufficient for supporting updates, or whether I should
do something cool like RCU. :)


Nathan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: of_node_api.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030915/6aca6f81/attachment.txt 

From anton at samba.org  Tue Sep 16 18:38:00 2003
From: anton at samba.org (Anton Blanchard)
Date: Tue, 16 Sep 2003 18:38:00 +1000
Subject: ppc64 kernel memset
Message-ID: <20030916083800.GL820@krispykreme>


Hi,

Was running fsx-linux and noticed memset was way up in the profiles:

  46.6797     fsx-linux                gendata
  15.5596     libc-2.3.2.so            (no symbols)
   4.1808     vmlinux                  .memset
   2.6565     vmlinux                  .__copy_tofrom_user

Almost all of that 4.18% is in this:

 4.0229 :c0000000001cc010:       stwu    r4,4(r6)
 0.0035 :c0000000001cc014:       bdnz+   c0000000001cc010

Does anyone feel like doing something a little more optimised? :)
Getting things doubleword aligned then doing std;std;bdnz would
be a good start.
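
The idea can be sketched in portable C as below. This is only an
illustration of the align-then-store-doublewords structure, not the kernel's
assembly routine; memset_dw is a hypothetical name, and the 8-byte memcpy
calls stand in for std instructions (compilers typically turn each one into
a single store).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a C sketch of the structure, not the ppc64 asm. */
void *memset_dw(void *dst, int c, size_t n)
{
	unsigned char *p = dst;
	/* Replicate the fill byte into all 8 bytes of a doubleword. */
	uint64_t pattern = 0x0101010101010101ULL * (unsigned char)c;

	assert(dst != NULL || n == 0);

	/* Head: byte stores until p is doubleword (8-byte) aligned. */
	while (n > 0 && ((uintptr_t)p & 7) != 0) {
		*p++ = (unsigned char)c;
		n--;
	}

	/* Body: two doubleword stores per iteration, mirroring std;std;bdnz. */
	while (n >= 16) {
		memcpy(p, &pattern, 8);
		memcpy(p + 8, &pattern, 8);
		p += 16;
		n -= 16;
	}

	/* Tail: the remaining 0..15 bytes. */
	while (n > 0) {
		*p++ = (unsigned char)c;
		n--;
	}
	return dst;
}
```

Compared with the stwu loop in the profile, the body writes 16 bytes per
loop iteration instead of 4.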

Anton



From hollisb at us.ibm.com  Wed Sep 17 01:46:31 2003
From: hollisb at us.ibm.com (Hollis Blanchard)
Date: Tue, 16 Sep 2003 10:46:31 -0500
Subject: gcc 3.3 fix for 2.4
In-Reply-To: <E245A8E4-E48D-11D7-AC00-000A95A0560C@us.ibm.com>
Message-ID: <F1192FF0-E85C-11D7-A8BC-000A95A0560C@us.ibm.com>

On Thursday, Sep 11, 2003, at 14:26 US/Central, Hollis Blanchard wrote:
>
> I forgot to mention I had to backport some of the 2.5 unistd.h to 2.4,
> because gcc 3.3.1 didn't like it. Specifically the error was:
> 	/home/hollis/source/linuxppc64-2.4/include/asm/unistd.h:442: error: asm-specifier for variable `__sc_4' conflicts with asm clobber list
> [varied maybe 50 times]
>
> This patch (which I hadn't really tested yesterday, which is why I
> didn't send it then) backports the gcc3-friendly inline asm from 2.5's
> unistd.h, apparently originally from Franz Sirl.
>
> With this patch applied, I can build with gcc 3.3.1 (and the resulting
> kernel seems to work, though I didn't do anything like run LTP), so if
> it's acceptable please apply.

Hi Paul, I haven't seen you comment on this patch. Without it, one
cannot build ameslab 2.4 with gcc 3.3.1, and I've been told you're the
person to review it. I know Anton has something similar. Could the 2.4
tree be fixed for gcc 3.3.1?

--
Hollis Blanchard
IBM Linux Technology Center
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc3-syscalls.diff
Type: application/octet-stream
Size: 8579 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030916/8988e664/attachment.obj 

From anton at samba.org  Wed Sep 17 02:09:39 2003
From: anton at samba.org (Anton Blanchard)
Date: Wed, 17 Sep 2003 02:09:39 +1000
Subject: [PATCH] ppc64 kernel 2.6 ide-related patches (revised)
In-Reply-To: <20030912120037.A26892@forte.austin.ibm.com>
References: <20030911112823.A27066@forte.austin.ibm.com> <20030912120037.A26892@forte.austin.ibm.com>
Message-ID: <20030916160939.GP820@krispykreme>


> The patch below, against the kernel-2.5 ppc64 bk of 28 august, allows
> the IDE code to compile and work with IDE disks and CD-ROMs.

Thanks Linas, it's applied. Nice work testing it against an x86 box :)

Anton



From engebret at vnet.ibm.com  Wed Sep 17 04:37:34 2003
From: engebret at vnet.ibm.com (Dave Engebretsen)
Date: Tue, 16 Sep 2003 13:37:34 -0500
Subject: gcc 3.3 fix for 2.4
References: <E245A8E4-E48D-11D7-AC00-000A95A0560C@us.ibm.com>
Message-ID: <3F67586E.7462B43F@vnet.ibm.com>


I put the inline asm change into 2.4 BK.  Thanks again Hollis.

Dave.

Hollis Blanchard wrote:
>
> On Thursday, Sep 11, 2003, at 08:22 US/Central, Dave Engebretsen wrote:
>
> > I put this change into 2.4 BK.  Thanks Hollis.
>
> Thanks Dave. :)
>
> I forgot to mention I had to backport some of the 2.5 unistd.h to 2.4,
> because gcc 3.3.1 didn't like it. Specifically the error was:
>         /home/hollis/source/linuxppc64-2.4/include/asm/unistd.h:442: error: asm-specifier for variable `__sc_4' conflicts with asm clobber list
> [varied maybe 50 times]
>
> This patch (which I hadn't really tested yesterday, which is why I
> didn't send it then) backports the gcc3-friendly inline asm from 2.5's
> unistd.h, apparently originally from Franz Sirl.
>
> With this patch applied, I can build with gcc 3.3.1 (and the resulting
> kernel seems to work, though I didn't do anything like run LTP), so if
> it's acceptable please apply.
>
> --
> Hollis Blanchard
> IBM Linux Technology Center
>
>   ------------------------------------------------------------------------
>                          Name: gcc3-syscalls.diff
>    gcc3-syscalls.diff    Type: unspecified type (application/octet-stream)
>                      Encoding: 7bit



From sri at us.ibm.com  Wed Sep 17 07:16:17 2003
From: sri at us.ibm.com (Sridhar Samudrala)
Date: Tue, 16 Sep 2003 14:16:17 -0700 (PDT)
Subject: [PATCH] Fixes for linux-2.6.0-test4 ppc64 build errors
Message-ID: <Pine.LNX.4.44.0309161405370.2516-100000@localhost.localdomain>


I ran into the following compiler errors and an undefined symbol warning while
building linux-2.6.0-test4 on a PPC64 box with SCTP as a module.

In file included from arch/ppc64/kernel/setup.c:14:
include/linux/module.h:19:23: asm/local.h: No such file or directory
In file included from arch/ppc64/kernel/setup.c:14:
include/linux/module.h:175: syntax error before "local_t"
include/linux/module.h:175: warning: no semicolon at end of struct or union
include/linux/module.h:176: warning: empty declaration
include/linux/module.h:235: field `ref' has incomplete type
include/linux/module.h:267: confused by earlier errors, bailing out
make[1]: *** [arch/ppc64/kernel/setup.o] Error 1
make: *** [arch/ppc64/kernel] Error 2

  CC      arch/ppc64/kernel/pSeries_pci.o
arch/ppc64/kernel/pSeries_pci.c: In function `pcibios_name_device':
arch/ppc64/kernel/pSeries_pci.c:440: structure has no member named `name'
arch/ppc64/kernel/pSeries_pci.c:441: structure has no member named `name'
arch/ppc64/kernel/pSeries_pci.c:441: structure has no member named `name'
arch/ppc64/kernel/pSeries_pci.c:442: structure has no member named `name'
arch/ppc64/kernel/pSeries_pci.c:443: structure has no member named `name'
arch/ppc64/kernel/pSeries_pci.c:444: structure has no member named `name'
arch/ppc64/kernel/pSeries_pci.c:445: structure has no member named `name'
arch/ppc64/kernel/pSeries_pci.c:445: structure has no member named `name'
make[1]: *** [arch/ppc64/kernel/pSeries_pci.o] Error 1
make: *** [arch/ppc64/kernel] Error 2

  CC      arch/ppc64/kernel/eeh.o
arch/ppc64/kernel/eeh.c: In function `eeh_check_failure':
arch/ppc64/kernel/eeh.c:119: structure has no member named `name'

  MODPOST
*** Warning: "paca" [net/sctp/sctp.ko] undefined!

Here is a patch that fixes these issues.

Thanks
Sridhar
------------------------------------------------------------------------------
diff -urN -X dontdiff a/arch/ppc64/kernel/eeh.c b/arch/ppc64/kernel/eeh.c
--- a/arch/ppc64/kernel/eeh.c	2003-08-22 16:57:23.000000000 -0700
+++ b/arch/ppc64/kernel/eeh.c	2003-09-04 14:21:39.000000000 -0700
@@ -116,7 +116,7 @@
 				dn->eeh_config_addr, BUID_HI(dn->phb->buid), BUID_LO(dn->phb->buid));
 		if (ret == 0 && rets[1] == 1 && rets[0] >= 2) {
 			panic("EEH:  MMIO failure (%ld) on device:\n  %s %s\n",
-			      rets[0], pci_name(dev), dev->dev.name);
+			      rets[0], pci_name(dev), dev->dev.bus_id);
 		}
 	}
 	eeh_false_positives++;
diff -urN -X dontdiff a/arch/ppc64/kernel/pSeries_pci.c b/arch/ppc64/kernel/pSeries_pci.c
--- a/arch/ppc64/kernel/pSeries_pci.c	2003-08-22 17:03:21.000000000 -0700
+++ b/arch/ppc64/kernel/pSeries_pci.c	2003-09-04 14:21:02.000000000 -0700
@@ -437,12 +437,12 @@
 		char *loc_code = get_property(dn, "ibm,loc-code", 0);
 		if (loc_code) {
 			int loc_len = strlen(loc_code);
-			if (loc_len < sizeof(dev->dev.name)) {
-				memmove(dev->dev.name+loc_len+1, dev->dev.name,
-					sizeof(dev->dev.name)-loc_len-1);
-				memcpy(dev->dev.name, loc_code, loc_len);
-				dev->dev.name[loc_len] = ' ';
-				dev->dev.name[sizeof(dev->dev.name)-1] = '\0';
+			if (loc_len < sizeof(dev->dev.bus_id)) {
+				memmove(dev->dev.bus_id+loc_len+1, dev->dev.bus_id,
+					sizeof(dev->dev.bus_id)-loc_len-1);
+				memcpy(dev->dev.bus_id, loc_code, loc_len);
+				dev->dev.bus_id[loc_len] = ' ';
+				dev->dev.bus_id[sizeof(dev->dev.bus_id)-1] = '\0';
 			}
 		}
 	}
diff -urN -X dontdiff a/arch/ppc64/kernel/ppc_ksyms.c b/arch/ppc64/kernel/ppc_ksyms.c
--- a/arch/ppc64/kernel/ppc_ksyms.c	2003-08-22 16:53:07.000000000 -0700
+++ b/arch/ppc64/kernel/ppc_ksyms.c	2003-09-08 11:19:06.000000000 -0700
@@ -232,3 +232,4 @@
 #endif

 EXPORT_SYMBOL(tb_ticks_per_usec);
+EXPORT_SYMBOL(paca);
diff -urN -X dontdiff a/include/asm-ppc64/local.h b/include/asm-ppc64/local.h
--- a/include/asm-ppc64/local.h	1969-12-31 16:00:00.000000000 -0800
+++ b/include/asm-ppc64/local.h	2003-09-04 11:55:40.000000000 -0700
@@ -0,0 +1,6 @@
+#ifndef __PPC_LOCAL_H
+#define __PPC_LOCAL_H
+
+#include <asm-generic/local.h>
+
+#endif /* __PPC_LOCAL_H */

-------------------------------------------------------------------------------




From amodra at bigpond.net.au  Thu Sep 18 02:11:59 2003
From: amodra at bigpond.net.au (Alan Modra)
Date: Thu, 18 Sep 2003 01:41:59 +0930
Subject: PowerPC64 alignment of double in structs
Message-ID: <20030917161159.GT3822@bubble.sa.bigpond.net.au>


http://gcc.gnu.org/ml/gcc-patches/2003-09/msg01003.html is a proposal
for fixing struct layout rules on powerpc64-linux-gcc to comply with
the PowerPC64 Linux ABI.  The ABI specifies 4 byte alignment for
doubles, which is a little surprising for someone without an AIX
background, and isn't ideal for speed.  An alternative would be to
change the ABI and gcc (and presumably xlc) to natural alignment.
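
For anyone wanting to see what their compiler currently does, a small probe
along these lines may help. It is only a sketch: struct sample and
show_double_layout are hypothetical names, and _Alignof needs a C11 compiler
(gcc's __alignof__ would be the older spelling). With a char followed by a
double, natural alignment puts the double at offset 8, while a 4-byte
alignment rule puts it at offset 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical struct: a one-byte member followed by a double. */
struct sample {
	char tag;
	double value;
};

/* Hypothetical helper: report where the double lands under the current ABI. */
void show_double_layout(void)
{
	size_t off = offsetof(struct sample, value);

	/* Sanity check: the double lies after tag and inside the struct. */
	assert(off >= sizeof(char));
	assert(off + sizeof(double) <= sizeof(struct sample));

	printf("alignof(double)=%zu offsetof(value)=%zu sizeof(struct sample)=%zu\n",
	       (size_t)_Alignof(double), off, sizeof(struct sample));
}
```

Building the same file with two compilers and comparing the printed offsets
is one quick way to spot a struct layout disagreement.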

I'm interested in opinions..

--
Alan Modra
IBM OzLabs - Linux Technology Centre



From benh at kernel.crashing.org  Thu Sep 18 07:26:14 2003
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 17 Sep 2003 23:26:14 +0200
Subject: PowerPC64 alignment of double in structs
In-Reply-To: <20030917161159.GT3822@bubble.sa.bigpond.net.au>
References: <20030917161159.GT3822@bubble.sa.bigpond.net.au>
Message-ID: <1063833972.600.223.camel@gaston>


On Wed, 2003-09-17 at 18:11, Alan Modra wrote:
> http://gcc.gnu.org/ml/gcc-patches/2003-09/msg01003.html is a proposal
> for fixing struct layout rules on powerpc64-linux-gcc to comply with
> the PowerPC64 Linux ABI.  The ABI specifies 4 byte alignment for
> doubles, which is a little surprising for someone without an AIX
> background, and isn't ideal for speed.  An alternative would be to
> change the ABI and gcc (and presumably xlc) to natural alignment.
>
> I'm interested in opinions..

For what it's worth on ABI matters, my opinion is to enforce strict
alignment (same goes for Altivec). I'd go further and say that not
enforcing alignment by default is completely insane.

Ben.




From sjmunroe at us.ibm.com  Thu Sep 18 07:55:13 2003
From: sjmunroe at us.ibm.com (Steve Munroe)
Date: Wed, 17 Sep 2003 16:55:13 -0500
Subject: PowerPC64 alignment of double in structs
Message-ID: <OF9E32D214.C7228C51-ON86256DA4.007741A0-86256DA4.00786B5C@us.ibm.com>


On Wed, 2003-09-17 at 18:11, Alan Modra wrote:
> http://gcc.gnu.org/ml/gcc-patches/2003-09/msg01003.html is a proposal
> for fixing struct layout rules on powerpc64-linux-gcc to comply with
> the PowerPC64 Linux ABI.  The ABI specifies 4 byte alignment for
> doubles, which is a little surprising for someone without an AIX
> background, and isn't ideal for speed.  An alternative would be to
> change the ABI and gcc (and presumably xlc) to natural alignment.
>
> I'm interested in opinions..

Unfortunately this change might break backward compatibility with existing
libraries. Basically any ABI where the application passes a struct
containing a double to an existing library (or an old application to a new
library) might break. I bumped into this when I wanted to change the
alignment of pthread_mutex_t (etc) to avoid false sharing at the
reservation.

In glibc it may be possible to version these interfaces but it is not
clear the gain is worth the pain. In the case of pthread_mutex_t
alignment, I decided to leave it alone. Even if we can fix glibc, what
about the other libraries?



From amodra at bigpond.net.au  Thu Sep 18 11:35:23 2003
From: amodra at bigpond.net.au (Alan Modra)
Date: Thu, 18 Sep 2003 11:05:23 +0930
Subject: PowerPC64 alignment of double in structs
In-Reply-To: 
 <OF9E32D214.C7228C51-ON86256DA4.007741A0-86256DA4.00786B5C@us.ibm.com>
References: 
 <OF9E32D214.C7228C51-ON86256DA4.007741A0-86256DA4.00786B5C@us.ibm.com>
Message-ID: <20030918013523.GU3822@bubble.sa.bigpond.net.au>


On Wed, Sep 17, 2003 at 04:55:13PM -0500, Steve Munroe wrote:
> On Wed, 2003-09-17 at 18:11, Alan Modra wrote:
> > http://gcc.gnu.org/ml/gcc-patches/2003-09/msg01003.html is a proposal
> > for fixing struct layout rules on powerpc64-linux-gcc to comply with
> > the PowerPC64 Linux ABI.  The ABI specifies 4 byte alignment for
> > doubles, which is a little surprising for someone without an AIX
> > background, and isn't ideal for speed.  An alternative would be to
> > change the ABI and gcc (and presumably xlc) to natural alignment.
> >
> > I'm interested in opinions..
>
> Unfortunately this change might break backward compatibility with existing
> libraries.

Yes, that's clear.  Did you read the gcc mailing list thread?  We're
ABI incompatible *now* with xlc, and the rules xlc uses are not easy
to follow.

gcc-3.4 is an appropriate time to make an ABI change, because in
fixing a bug with the way function args are passed, gcc already has
a small ABI incompatibility with older gcc code.

--
Alan Modra
IBM OzLabs - Linux Technology Centre

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From amodra at bigpond.net.au  Thu Sep 18 14:50:56 2003
From: amodra at bigpond.net.au (Alan Modra)
Date: Thu, 18 Sep 2003 14:20:56 +0930
Subject: PowerPC64 alignment of double in structs
In-Reply-To: <1063833972.600.223.camel@gaston>
References: <20030917161159.GT3822@bubble.sa.bigpond.net.au>
 <1063833972.600.223.camel@gaston>
Message-ID: <20030918045056.GZ3822@bubble.sa.bigpond.net.au>


On Wed, Sep 17, 2003 at 11:26:14PM +0200, Benjamin Herrenschmidt wrote:
> On Wed, 2003-09-17 at 18:11, Alan Modra wrote:
> > http://gcc.gnu.org/ml/gcc-patches/2003-09/msg01003.html is a proposal
> > for fixing struct layout rules on powerpc64-linux-gcc to comply with
> > the PowerPC64 Linux ABI.  The ABI specifies 4 byte alignment for
> > doubles, which is a little surprising for someone without an AIX
> > background, and isn't ideal for speed.  An alternative would be to
> > change the ABI and gcc (and presumably xlc) to natural alignment.
>
> For what it's worth on ABI matters, my opinion is to enforce strict
> alignment (same goes for Altivec). I'd go further saying that not
> enforcing alignment by default is completely insane.

The insanity is due to trying to be compatible with AIX, to aid
migration of apps from AIX to Linux.  The trouble is, we now have
problems migrating apps from other flavours of Linux to PowerPC64
Linux.

An example:  I went looking for places in glibc that might break if
we changed alignment of doubles.  I didn't find anything that would
cause a problem if we changed.  ie. old binaries could be linked with
a new glibc and vice versa without trouble.  However, I did find one
place that's broken currently, and needs natural alignment of doubles
to work..  In malloc/obstack.c:

/* Determine default alignment.  */
struct fooalign {char x; double d;};
# define DEFAULT_ALIGNMENT  \
  ((PTR_INT_TYPE) ((char *) &((struct fooalign *) 0)->d - (char *) 0))

Oops, obstacks only aligned to 4 bytes.
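
The same probe can be written with offsetof, which makes it easy to check
what a given compiler does. This is only a sketch; the two helper names are
hypothetical and are not part of obstack.c.

```c
#include <assert.h>
#include <stddef.h>

/* Same probe struct as obstack.c: the offset of d is the alignment the
 * compiler gives a double that sits inside a struct. */
struct fooalign { char x; double d; };

/* Alignment actually applied to a double struct member on this target. */
static size_t double_member_alignment(void)
{
    return offsetof(struct fooalign, d);
}

/* Hypothetical check: does that alignment match natural (sizeof-based)
 * alignment of double?  With a 4 byte rule for doubles this returns 0. */
static int default_alignment_is_natural(void)
{
    return double_member_alignment() % sizeof(double) == 0;
}
```

With a 4 byte rule for doubles, double_member_alignment() is 4, so anything
that derives its default alignment this way only gets 4 byte alignment.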

--
Alan Modra
IBM OzLabs - Linux Technology Centre

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From sjmunroe at us.ibm.com  Fri Sep 19 00:30:56 2003
From: sjmunroe at us.ibm.com (Steve Munroe)
Date: Thu, 18 Sep 2003 09:30:56 -0500
Subject: PowerPC64 alignment of double in structs
Message-ID: <OF31F5A84D.31D1A943-ON86256DA5.004F2728-86256DA5.004FBE68@us.ibm.com>


On Wed, 2003-09-17 at 23:50, Alan Modra wrote:
> On Wed, Sep 17, 2003 at 11:26:14PM +0200, Benjamin Herrenschmidt wrote:
> > On Wed, 2003-09-17 at 18:11, Alan Modra wrote:
> > > http://gcc.gnu.org/ml/gcc-patches/2003-09/msg01003.html is a proposal
> > > for fixing struct layout rules on powerpc64-linux-gcc to comply with
> > > the PowerPC64 Linux ABI.  The ABI specifies 4 byte alignment for
> > > doubles, which is a little surprising for someone without an AIX
> > > background, and isn't ideal for speed.  An alternative would be to
> > > change the ABI and gcc (and presumably xlc) to natural alignment.
> >
> > For what it's worth on ABI matters, my opinion is to enforce strict
> > alignment (same goes for Altivec). I'd go further saying that not
> > enforcing alignment by default is completely insane.
>
> The insanity is due to trying to be compatible with AIX, to aid
> migration of apps from AIX to Linux.  The trouble is, we now have
> problems migrating apps from other flavours of Linux to PowerPC64
> Linux.
>
> An example:  I went looking for places in glibc that might break if
> we changed alignment of doubles.  I didn't find anything that would
> cause a problem if we changed.  ...

I hate to be a nag but we need to be concerned about all the core
libraries, not just those included by glibc.

> ...  However, I did find one
> place that's broken currently, and needs natural alignment of doubles
> to work..  In malloc/obstack.c:
> ...
> Oops, obstacks only aligned to 4 bytes.

Actually to support VMX we will need obstacks quadword (16 byte) aligned
...
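
As a rough illustration of what quadword alignment means for an allocator,
here is a sketch of the usual round-up arithmetic. The names QUADWORD,
align_up and is_quadword_aligned are made up for this example; obstack has
its own alignment macros.

```c
#include <assert.h>
#include <stdint.h>

#define QUADWORD 16u  /* 16 byte VMX alignment, as stated above */

/* Round addr up to the next multiple of align.
 * align must be a power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Nonzero if addr is suitable for a 16 byte aligned vector access. */
static int is_quadword_aligned(uintptr_t addr)
{
    return (addr & (QUADWORD - 1)) == 0;
}
```

An address that is only 4 byte aligned, such as 0x1004, is moved up to
0x1010 by align_up(0x1004, QUADWORD).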

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From anton at samba.org  Fri Sep 19 16:47:42 2003
From: anton at samba.org (Anton Blanchard)
Date: Fri, 19 Sep 2003 16:47:42 +1000
Subject: [PATCH] Fixes for linux-2.6.0-test4 ppc64 build errors
In-Reply-To: <Pine.LNX.4.44.0309161405370.2516-100000@localhost.localdomain>
References: <Pine.LNX.4.44.0309161405370.2516-100000@localhost.localdomain>
Message-ID: <20030919064742.GC15151@krispykreme>


Hi Sridhar,

> I ran into the following compiler errors and an undefined symbol warning while
> building linux-2.6.0-test4 on a PPC64 box with SCTP as a module.

Thanks, we had most of them except for the paca export. I've added it.

Anton

> diff -urN -X dontdiff a/arch/ppc64/kernel/ppc_ksyms.c b/arch/ppc64/kernel/ppc_ksyms.c
> --- a/arch/ppc64/kernel/ppc_ksyms.c	2003-08-22 16:53:07.000000000 -0700
> +++ b/arch/ppc64/kernel/ppc_ksyms.c	2003-09-08 11:19:06.000000000 -0700
> @@ -232,3 +232,4 @@
>  #endif
>
>  EXPORT_SYMBOL(tb_ticks_per_usec);
> +EXPORT_SYMBOL(paca);

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From anton at samba.org  Sat Sep 20 02:55:43 2003
From: anton at samba.org (Anton Blanchard)
Date: Sat, 20 Sep 2003 02:55:43 +1000
Subject: Kernel panic on a p630
In-Reply-To: <207D6ADFC044A84686D44CA11B297EEA02018AD9@chn-ex02.cvns.corp.covansys.com>
References: <207D6ADFC044A84686D44CA11B297EEA02018AD9@chn-ex02.cvns.corp.covansys.com>
Message-ID: <20030919165543.GH15151@krispykreme>


Hi Sri,

> I noticed quite a few scsi changes from 2.6.0-test4-mm5 to
> 2.6.0-test4-mm6, which carried on. I was forced to switch back to
> test4-mm5 because every time I upgrade and boot up, I get the message:
>
> VFS: Cannot open root device "sda3" or unknown-block (0,0) Please
> append a correct "root=" boot option
>
> The kernel then panics and the machine reboots. Did this happen to you
> for a p630 and do you know the solution?

The symbios controller is currently checking the return value of
pci_set_mwi. Short term fix is below. As Paul suggested, we need a way in
the arch_prepare_mwi code to differentiate between a failure and
"everything is good, no need to set the cacheline/mwi bits".

Anton

diff -puN drivers/scsi/sym53c8xx_2/sym_glue.c~sym2patch drivers/scsi/sym53c8xx_2/sym_glue.c
--- gr15/drivers/scsi/sym53c8xx_2/sym_glue.c~sym2patch	2003-09-06 21:36:59.000000000 -0500
+++ gr15-anton/drivers/scsi/sym53c8xx_2/sym_glue.c	2003-09-06 21:36:59.000000000 -0500
@@ -2287,8 +2287,7 @@ sym53c8xx_pci_init(struct pci_dev *pdev,
 	}

 	if (chip->features & FE_WRIE) {
-		if (pci_set_mwi(pdev))
-			return -1;
+		pci_set_mwi(pdev);
 	}

 	/*

_

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From bergner at vnet.ibm.com  Mon Sep 22 12:14:32 2003
From: bergner at vnet.ibm.com (Peter Bergner)
Date: Sun, 21 Sep 2003 21:14:32 -0500
Subject: PowerPC64 alignment of double in structs
In-Reply-To: <20030918013523.GU3822@bubble.sa.bigpond.net.au>
References: <OF9E32D214.C7228C51-ON86256DA4.007741A0-86256DA4.00786B5C@us.ibm.com> <20030918013523.GU3822@bubble.sa.bigpond.net.au>
Message-ID: <3F6E5B08.5000500@vnet.ibm.com>


Alan Modra wrote:
> On Wed, Sep 17, 2003 at 04:55:13PM -0500, Steve Munroe wrote:
>
>>On Wed, 2003-09-17 at 18:11, Alan Modra wrote:
>>
>>>http://gcc.gnu.org/ml/gcc-patches/2003-09/msg01003.html is a proposal
>>>for fixing struct layout rules on powerpc64-linux-gcc to comply with
>>>the PowerPC64 Linux ABI.  The ABI specifies 4 byte alignment for
>>>doubles, which is a little surprising for someone without an AIX
>>>background, and isn't ideal for speed.  An alternative would be to
>>>change the ABI and gcc (and presumably xlc) to natural alignment.
>>>
>>>I'm interested in opinions..
>>
>>Unfortunately this change might break backward compatibility with existing
>>libraries.
>
>
> Yes, that's clear.  Did you read the gcc mailing list thread?  We're
> ABI incompatible *now* with xlc, and the rules xlc uses are not easy
> to follow.
>
> gcc-3.4 is an appropriate time to make an ABI change, because in
> fixing a bug with the way function args are passed, gcc already has
> a small ABI incompatibility with older gcc code.

Alan, are you saying xlc does or doesn't follow the ABI?  Aren't the "rules"
it uses just what is mandated by the ABI?

Mark, can you confirm what alignment xlc uses for doubles and long doubles?
Do you know whether xlc takes any special liberties with respect to the
PowerPC64 Linux ABI?

Given that xlc is already a shipping product we need to be careful here.
It's one thing to cause an ABI incompatibility between new and old gcc code
because of a bug in gcc.  It would be another thing altogether to create an
incompatibility with xlc code because we changed the ABI underneath them.

We definitely need to come to a consensus before we do anything!


Peter


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From nathanl at austin.ibm.com  Mon Sep 22 16:40:10 2003
From: nathanl at austin.ibm.com (Nathan Lynch)
Date: Mon, 22 Sep 2003 01:40:10 -0500
Subject: Kernel panic on a p630
In-Reply-To: <20030919165543.GH15151@krispykreme>
References: <207D6ADFC044A84686D44CA11B297EEA02018AD9@chn-ex02.cvns.corp.covansys.com> <20030919165543.GH15151@krispykreme>
Message-ID: <3F6E994A.7050004@austin.ibm.com>


Hi Anton-

> The symbios controller is currently checking the return value of
> pci_set_mwi. Short term fix is below. As Paul suggested we need a way in
> the arch_prepare_mwi code to differentiate between a failure and
> everything is good, no need to set the cacheline/mwi bits.

I have tried the workaround (with test5 from ameslab, default config),
and still get the "cannot mount root fs" problem.  I believe the symbios
controller is somehow escaping detection altogether, because I do not
even see the driver initialization messages during bootup, e.g. from a
test3 kernel:

sym.65.1.0: setting PCI_COMMAND_INVALIDATE.
sym0: <1010-66> rev 0x1 on pci bus 65 device 1 function 0 irq 325
sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: SCSI BUS has been reset.
...

These messages do not appear with test5 whether the patch is applied or not.

I booted with initcall_debug, and it looks like sym2_init is called.  I
added a printk to sym2_probe, and that did not show up in the boot messages.

I can see how the pci_set_mwi situation would break things, but I am not
convinced the symbios code is even getting to that point.  Is there
another issue at play here, or am I missing something?

Nathan


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From SVakkalankarao at covansys.com  Mon Sep 22 17:21:42 2003
From: SVakkalankarao at covansys.com (VAKKALANKA  RAO Sridhar)
Date: Mon, 22 Sep 2003 12:51:42 +0530
Subject: Kernel panic on a p630
Message-ID: <207D6ADFC044A84686D44CA11B297EEA02018ADA@chn-ex02.cvns.corp.covansys.com>


Hi Nathan,

It works for me now.

> sym.65.1.0: setting PCI_COMMAND_INVALIDATE.
> sym0: <1010-66> rev 0x1 on pci bus 65 device 1 function 0 irq 325
> sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking
> sym0: SCSI BUS has been reset.


Here is the snapshot of my dmesg for scsi:

++++++++++++++++++++
PCI: Enabling device: (0000:41:01.0), cmd 143
sym0: <1010-66> rev 0x1 at pci 0000:41:01.0 irq 103
sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.1.17a
  Vendor: IBM       Model: IC35L036UCD210-0  Rev: S5BS
  Type:   Direct-Access                      ANSI SCSI revision: 03
sym0:8:0: tagged command queuing enabled, command queue depth 16.
sym0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
  Vendor: IBM       Model: IC35L036UCD210-0  Rev: S5BS
  Type:   Direct-Access                      ANSI SCSI revision: 03
sym0:9:0: tagged command queuing enabled, command queue depth 16.
sym0:9: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
  Vendor: IBM       Model: HSBPD4E  PU3SCSI  Rev: 0015
  Type:   Enclosure                          ANSI SCSI revision: 02
PCI: Enabling device: (0000:41:01.1), cmd 143
sym1: <1010-66> rev 0x1 at pci 0000:41:01.1 irq 104
sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi1 : sym-2.1.17a
PCI: Enabling device: (0001:61:01.0), cmd 143
sym2: <1010-66> rev 0x1 at pci 0001:61:01.0 irq 137
sym2: No NVRAM, ID 7, Fast-80, SE, parity checking
sym2: SCSI BUS has been reset.
scsi2 : sym-2.1.17a
sym2:0: FAST-20 WIDE SCSI 40.0 MB/s ST (50.0 ns, offset 31)
  Vendor: HP        Model: IBM-C568303030!D  Rev: C209
  Type:   Sequential-Access                  ANSI SCSI revision: 02
PCI: Enabling device: (0001:61:01.1), cmd 143
sym3: <1010-66> rev 0x1 at pci 0001:61:01.1 irq 138
sym3: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym3: SCSI BUS has been reset.
scsi3 : sym-2.1.17a
st: Version 20030811, fixed bufsize 32768, s/g segs 256
Attached scsi tape st0 at scsi2, channel 0, id 0, lun 0
st0: try direct i/o: yes, max page reachable by HBA 1048576
SCSI device sda: 71096640 512-byte hdwr sectors (36401 MB)
SCSI device sda: drive cache: write through
 sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 8, lun 0
++++++++++++++++++++

Thanks again
Sri


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From mjstumpf at cotse.net  Tue Sep 23 02:54:35 2003
From: mjstumpf at cotse.net (mjstumpf)
Date: Mon, 22 Sep 2003 12:54:35 -0400 (EDT)
Subject: latest toolchain in binary format?
Message-ID: <bWpzdHVtcGY=.8c196b77901a104faf6941c7a1dead51@1064249675.cotse.net>


I'm getting stung by the TOC-blowing-up problem on the 64 bit toolchain,
but I've found appropriate patches/etc to fix it.  However, try as I
might, I can't get the full toolchain to build.  I've even successfully
built the whole thing, and the compiler appears to work--but
powerpc64-linux-gdb acts screwy.

Are there any precompiled binaries of the toolchain out there?

Maybe on a SuSE developer's personal ftp?  I'll even take alpha-grade at
this point.

Thanks,
Michael


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From olh at suse.de  Tue Sep 23 03:38:26 2003
From: olh at suse.de (Olaf Hering)
Date: Mon, 22 Sep 2003 19:38:26 +0200
Subject: iSeries cmdline wraps
Message-ID: <20030922173826.GA1333@suse.de>


The cmdline should be only one line. I'm not sure about the proc
interface, is this part of the patch ok?
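
For readers skimming the diff, the intended behaviour ("keep only the first
line") can be sketched as a standalone helper. The function name and
signature are hypothetical; the patch below does this inline on cmd_line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Terminate buf at the first newline, or at the existing NUL if that comes
 * first.  Never writes past buf[size - 1].  Returns the resulting length. */
static size_t truncate_at_newline(char *buf, size_t size)
{
    size_t i = 0;

    if (size == 0)
        return 0;
    while (i < size - 1 && buf[i] != '\0' && buf[i] != '\n')
        ++i;
    buf[i] = '\0';
    return i;
}
```

Scanning forward from the start means trailing garbage after the first
newline is dropped, rather than being kept as a second line.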


diff -p -purNX /suse/olh/kernel/kernel_exclude.txt /dev/shm/usr/src/linux-2.4.21-73/arch/ppc64/kernel/iSeries_setup.c linux-2.4.21-73/arch/ppc64/kernel/iSeries_setup.c
--- /dev/shm/usr/src/linux-2.4.21-73/arch/ppc64/kernel/iSeries_setup.c	2003-09-22 11:21:12.000000000 +0200
+++ linux-2.4.21-73/arch/ppc64/kernel/iSeries_setup.c	2003-09-22 19:09:50.000000000 +0200
@@ -378,15 +378,14 @@ iSeries_init(unsigned long r3, unsigned
 				     256,
 				     HvLpDma_Direction_RemoteToLocal );

-		p = q = cmd_line + 255;
-		while( p > cmd_line ) {
-			if ((*p == 0) || (*p == ' ') || (*p == '\n'))
-				--p;
-			else
+		p = cmd_line;
+		q = cmd_line + 255;
+		while( p < q ) {
+			if (!*p || *p == '\n')
 				break;
+			++p;
 		}
-		if ( p < q )
-			*(p+1) = 0;
+		*p = 0;
 	}

 	iSeries_proc_early_init();
diff -p -purNX /suse/olh/kernel/kernel_exclude.txt /dev/shm/usr/src/linux-2.4.21-73/arch/ppc64/kernel/mf_proc.c linux-2.4.21-73/arch/ppc64/kernel/mf_proc.c
--- /dev/shm/usr/src/linux-2.4.21-73/arch/ppc64/kernel/mf_proc.c	2003-09-22 11:21:12.000000000 +0200
+++ linux-2.4.21-73/arch/ppc64/kernel/mf_proc.c	2003-09-22 19:13:38.000000000 +0200
@@ -151,30 +151,25 @@ int proc_mf_dump_cmdline
 	int		len = count;
 	char *p;

+	/* it seems non NULL is the second call == second (unwanted) line */
+	if ( off ) {
+		*eof = 1;
+		return 0;
+	}
+
 	len = mf_getCmdLine(page, &len, (u64)data);

-	p = page + len - 1;
-	while ( p > page ) {
-		if ( (*p == 0) || (*p == ' ') )
-			--p;
-		else
+	p = page + off;
+	while ( len < ( count - 1 ) ) {
+		if ( ! *p || *p == '\n' )
 			break;
-	}
-	if ( *p != '\n' ) {
 		++p;
-		*p = '\n';
+		++len;
 	}
+	*p = '\n';
 	++p;
 	*p = 0;
-	len = p - page;
-
-	len -= off;
-	if (len < count) {
-		*eof = 1;
-		if (len <= 0)
-			return 0;
-	} else
-		len = count;
+	len = p - (page + off);
 	*start = page + off;
 	return len;
 }
--
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From sjmunroe at us.ibm.com  Tue Sep 23 03:48:25 2003
From: sjmunroe at us.ibm.com (Steve Munroe)
Date: Mon, 22 Sep 2003 12:48:25 -0500
Subject: latest toolchain in binary format?
Message-ID: <OFBD0C4365.E6D8787F-ON86256DA9.0061A0C6-86256DA9.0061D26D@us.ibm.com>


What are you trying to build? Have you tried -mminimal-toc?

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From mjstumpf at cotse.net  Tue Sep 23 05:00:12 2003
From: mjstumpf at cotse.net (mjstumpf)
Date: Mon, 22 Sep 2003 15:00:12 -0400 (EDT)
Subject: latest toolchain in binary format?
In-Reply-To: <OFBD0C4365.E6D8787F-ON86256DA9.0061A0C6-86256DA9.0061D26D@us.ibm.com>
References: <OFBD0C4365.E6D8787F-ON86256DA9.0061A0C6-86256DA9.0061D26D@us.ibm.com>
Message-ID: <bWpzdHVtcGY=.76855e28fe35efeeccd2035ca0e47321@1064257212.cotse.net>


> What are you trying to build? Have you tried -mminimal-toc?

I'm trying to build a very large application (8 mb binary executable,
shared, not stripped).

Just tried it.  Still no dice; compiling that way, using SLES 8 + 64 bit
dev kit still results in a bogus error:

lssbfrec.o: In function `lssbfrec':
lssbfrec.c:21: undefined reference to `.LCTOC0'
collect2: ld returned 1 exit status


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From sjmunroe at us.ibm.com  Tue Sep 23 05:13:17 2003
From: sjmunroe at us.ibm.com (Steve Munroe)
Date: Mon, 22 Sep 2003 14:13:17 -0500
Subject: latest toolchain in binary format?
Message-ID: <OF38D56E40.E893C83D-ON86256DA9.00694645-86256DA9.00699745@us.ibm.com>


mjstumpf <mjstumpf at cotse.net> writes

> I'm trying to build a very large application (8 mb binary executable,
> shared, not stripped).
>
> Just tried it.  Still no dice; compiling that way, using SLES 8 + 64 bit
> dev kit still results in a bogus error:

-mminimal-toc does not work at link time; it is a compile time option. Did
you recompile all source to new .o's with -mminimal-toc?


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From mjstumpf at cotse.net  Tue Sep 23 05:18:30 2003
From: mjstumpf at cotse.net (mjstumpf)
Date: Mon, 22 Sep 2003 15:18:30 -0400 (EDT)
Subject: latest toolchain in binary format?
In-Reply-To: <OF38D56E40.E893C83D-ON86256DA9.00694645-86256DA9.00699745@us.ibm.com>
References: <OF38D56E40.E893C83D-ON86256DA9.00694645-86256DA9.00699745@us.ibm.com>
Message-ID: <bWpzdHVtcGY=.13befe05a7f6a7c14c212ce02a755241@1064258310.cotse.net>


> -mminimal-toc does not work at link time; it is a compile time option.
> Did you recompile all source to new .o's with -mminimal-toc?

Yep.  Just double checked; all compilations were redone, and it still
didn't work.


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From sjmunroe at us.ibm.com  Tue Sep 23 05:59:38 2003
From: sjmunroe at us.ibm.com (Steve Munroe)
Date: Mon, 22 Sep 2003 14:59:38 -0500
Subject: latest toolchain in binary format?
Message-ID: <OF33BEA66A.35981B5F-ON86256DA9.006C66BA-86256DA9.006DD577@us.ibm.com>


mjstumpf writes:

> Just tried it.  Still no dice; compiling that way, using SLES 8 + 64 bit
> dev kit still results in a bogus error:
>
> lssbfrec.o: In function `lssbfrec':
> lssbfrec.c:21: undefined reference to `.LCTOC0'
> collect2: ld returned 1 exit status

This is not the usual symptom for TOC overflow. I would expect something
like:

auxl.o(.text+0xdbfa): relocation truncated to fit: R_PPC64_TOC16_DS
.toc+4360

Your error implies that a single object/compile unit is so large it can't
be compiled...

try:

 objdump -h *.o | grep '\.toc'

and add up all the numbers (in hex) from the 3rd column.
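
If adding hex by hand gets tedious, the sum can be automated. The sketch
below assumes the usual `objdump -h` section line shape (index, name, size
in hex, ...); the names sum_toc_sizes, toc_fits and TOC_LIMIT are made up
for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOC_LIMIT 0x10000UL  /* 64KB TOC per executable or shared library */

/* Add up the size column (3rd field, hex) of every ".toc" section line in
 * text shaped like `objdump -h` output.  Lines are separated by '\n';
 * lines that do not parse, or are 256 bytes or longer, are ignored. */
static unsigned long sum_toc_sizes(const char *text)
{
    unsigned long total = 0;

    while (*text) {
        size_t len = strcspn(text, "\n");
        char line[256];
        char name[64];
        unsigned idx;
        unsigned long size;

        if (len < sizeof line) {
            memcpy(line, text, len);
            line[len] = '\0';
            /* Fields: index, section name, size (hex), ... */
            if (sscanf(line, "%u %63s %lx", &idx, name, &size) == 3
                && strcmp(name, ".toc") == 0)
                total += size;
        }
        text += len;
        if (*text == '\n')
            ++text;
    }
    return total;
}

/* Nonzero if the combined TOC still fits in one 64KB TOC. */
static int toc_fits(unsigned long total)
{
    return total <= TOC_LIMIT;
}
```

Feeding it the output of the objdump command above and comparing the result
against TOC_LIMIT shows how far over (or under) the limit the objects are.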

You may have to break this application into a number of shared libraries
(each *.so has its own 64KB TOC). So it would be good to know how close or
far you are from fitting. Also some applications try to prelink all
objects into a single large object and try to build the application from
that. This will not work in your case. And don't even think about linking
-static!

Also I have to ask all the obvious questions like: Is this gcc or xlc?
Is this xlf or g77?

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From hollisb at us.ibm.com  Tue Sep 23 06:51:04 2003
From: hollisb at us.ibm.com (Hollis Blanchard)
Date: Mon, 22 Sep 2003 15:51:04 -0500
Subject: vty device tree patch
Message-ID: <7B94251E-ED3E-11D7-8FD1-000A95A0560C@us.ibm.com>

In the future, the /rtas/ibm,termno device node is going away, to be
replaced with /vdevice/vty nodes. This patch is the minimum necessary
to boot relying on the vty nodes, falling back to "ibm,termno" if
necessary on older systems.

Note that hvc_count will need to be replaced, as we can have multiple
vty nodes containing discontiguous vterm numbers (as opposed to
"ibm,termno", which only expresses vterm numbers as "base, total
number").

At the moment, firmware is supporting the old "ibm,termno" interface as
well as vty nodes, but that is expected to change.

Attached are both 2.4 and 2.5 patches. If there are no comments, these
will be going in soon...

--
Hollis Blanchard
IBM Linux Technology Center
-------------- next part --------------
A non-text attachment was scrubbed...
Name: devtree-vty-2.4-v4.diff
Type: application/octet-stream
Size: 4536 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030922/c28fec37/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: devtree-vty-2.5-v4.diff
Type: application/octet-stream
Size: 5009 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030922/c28fec37/attachment-0001.obj 

From mjstumpf at cotse.net  Tue Sep 23 08:00:27 2003
From: mjstumpf at cotse.net (mjstumpf)
Date: Mon, 22 Sep 2003 18:00:27 -0400 (EDT)
Subject: latest toolchain in binary format?
In-Reply-To: <OF33BEA66A.35981B5F-ON86256DA9.006C66BA-86256DA9.006DD577@us.ibm.com>
References: <OF33BEA66A.35981B5F-ON86256DA9.006C66BA-86256DA9.006DD577@us.ibm.com>
Message-ID: <bWpzdHVtcGY=.673d76595639ff29b429a76c5e7ecfc1@1064268027.cotse.net>


> Your error implies that a single object/compile unit is so large it
> can't be compiled...
>
> try:
>
>  objdump -h *.o | grep '\.toc'
>
> and add up all the numbers (in hex) from the 3rd column.

Ah, I have done this before.  We are way, way over the limit, although
the patches to binutils/gcc head do appear to fix this issue.

> You may have to break this application into a number of shared libraries
> (each *.so has its own 64KB TOC). So it would be good to know how close
> or far you are from fitting. Also some applications try to prelink all
> objects into a single large object and try to build the application from
> that. This will not work in your case. And don't even think about
> linking -static!

We're not prelinking, that I can tell anyway.  Everything is built via
"-c" in gcc to stop after compile, then we're gccing the whole thing
together at the end.  Going to shared libraries really seems kludgy at
best..  This is a C app ported over from old 390 code.  We've successfully
built this app on Linux/390 with gcc.  It is 98% C code with just a little
inline assembly.


> Also I have to ask all the obvious questions like: Is this gcc or xlc?
> Is this xlf or g77?

This is gcc executing on powerpc-linux, it is the /opt/cross (dev64, I
believe was its name) as packaged by SuSE in either SLES 8 or one of its
updates.


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/


From nathanl at austin.ibm.com  Wed Sep 24 06:22:23 2003
From: nathanl at austin.ibm.com (Nathan Lynch)
Date: Tue, 23 Sep 2003 15:22:23 -0500
Subject: [PATCH] add new OF device tree API (2.6.0-test3)
In-Reply-To: <3F655981.80601@austin.ibm.com>
References: <3F655981.80601@austin.ibm.com>
Message-ID: <3F70AB7F.8050408@austin.ibm.com>

Nathan Lynch wrote:
> Some things which I plan to add within the next few days:
> - "porting" arch/ppc64 to the new API
> - implementation of reference counting
> - support for addition and removal of device nodes
> - a /proc-based mechanism for initiating node addition and removal from
> userspace

Attached is a patch which replaces all the uses of the old device tree
API in arch/ppc64.  Patch is against 2.6.0-test5 (cset 1.1328) from
ameslab bk, plus the patch from my previous message.  I've tested this
on a pSeries LPAR.

Nathan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ppc64_use_new_OF_api.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030923/d50b99d5/attachment.txt 

From olof at austin.ibm.com  Wed Sep 24 07:09:46 2003
From: olof at austin.ibm.com (olof at austin.ibm.com)
Date: Tue, 23 Sep 2003 16:09:46 -0500 (CDT)
Subject: [PATCH] [2.4] Remove page_table_lock in hash_page for kernel faults
Message-ID: <Pine.A41.4.44.0309231526420.49238-200000@forte.austin.ibm.com>

Attached patch removes the page_table_lock for vmalloc and io-mapped
regions.

We've seen cases where someone calls vmalloc(), which takes the
page_table_lock without disabling interrupts; an interrupt then arrives,
and the interrupt code faults -> hash_page() -> deadlock. It seems to be
more common when the driver is a module (i.e. in the vmalloc region).

The tricky part is to make the update atomic. I essentially copied Paul
Mackerras's 32-bit code where it checks access and sets the ACCESSED/DIRTY
bits. ppc(32) has it all in assembler, I kept most of it in C to keep
changes smaller. Rewriting selected parts into larger asm-blocks is a
todo-item for the future (it'd shorten path length a bit).
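
For anyone not fluent in ldarx/stdcx., the shape of the update is a
read/check/conditional-store retry loop. Below is a portable sketch of that
shape using C11 atomics. It is not the code in the patch: the flag values
and the function name are hypothetical, and compare-and-swap stands in for
the load-reserve/store-conditional pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the kernel's _PAGE_* bits. */
#define F_PRESENT  0x1u
#define F_RW       0x2u
#define F_ACCESSED 0x4u
#define F_DIRTY    0x8u

/* Check access against the entry and set set_bits atomically.
 * Returns 1 if the access is not permitted (entry left untouched), else 0. */
static int check_and_set(_Atomic uint64_t *entry, uint64_t access,
                         uint64_t set_bits)
{
    uint64_t old = atomic_load(entry);

    for (;;) {
        if (access & ~old)
            return 1;   /* a required bit is missing: let the caller fault */
        /* On failure, old is refreshed with the current value and we loop,
         * which repeats the access check against the fresh entry. */
        if (atomic_compare_exchange_weak(entry, &old, old | set_bits))
            return 0;
    }
}
```

The important property is that the access check is redone whenever the
entry changed underneath us, so no lock is needed around the update.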

We've given this patch a few days of testing on one of the large SPECweb
servers, where we'd otherwise see the deadlock every now and then, and we
haven't seen any problems with it. But more eyes on the code couldn't
hurt.

Comments/questions anyone?


Thanks,

-Olof

Olof Johansson                                        Office: 4E002/905
pSeries Linux Development                             IBM Systems Group
Email: olof at austin.ibm.com                          Phone: 512-838-9858
All opinions are my own and not those of IBM
-------------- next part --------------
===== arch/ppc64/kernel/htab.c 1.8 vs edited =====
--- 1.8/arch/ppc64/kernel/htab.c	Mon Aug 25 23:47:44 2003
+++ edited/arch/ppc64/kernel/htab.c	Tue Sep 23 13:46:05 2003
@@ -310,6 +310,25 @@
 		  1));
 }
 
+#define READ_PTE(addr, ret) __asm__ __volatile__	\
+		("ldarx %0,0,%1"			\
+		: "=r"(ret)				\
+		: "r"(addr)				\
+		: "memory")
+
+/* Returns 0 if store failed due to lost reservation */
+          
+#define WRITE_PTE(addr, val) ({				\
+	long r;						\
+	__asm__ __volatile__ ("				\
+		stdcx.	%1,0,%2\n			\
+		mfcr	%0\n				\
+		rlwinm	%0,%0,3,31,31"			\
+		: "=r" (r)				\
+		: "r"(val), "r"(addr)			\
+		: "cc", "memory");			\
+	r; })
+
 /*
  * Handle a fault by adding an HPTE. If the address can't be determined
  * to be valid via Linux page tables, return 1. If handled return 0
@@ -321,7 +340,9 @@
 	unsigned long newpp, prpn;
 	unsigned long hpteflags, lock_slot;
 	long slot;
-	pte_t old_pte, new_pte;
+	pte_t pte;
+	long set_bits, clear_bits;
+	int shared_mem_area = 0;
 
 	/* Search the Linux page table for a match with va */
 	va = (vsid << 28) | (ea & 0x0fffffff);
@@ -333,26 +354,10 @@
 	 */
 	spin_lock(&hash_table_lock[lock_slot].lock);
 	
-	/* 
-	 * Check the user's access rights to the page.  If access should be
-	 * prevented then send the problem up to do_page_fault.
-	 */
 #ifdef CONFIG_SHARED_MEMORY_ADDRESSING
-	access |= _PAGE_PRESENT;
-	if (unlikely(access & ~(pte_val(*ptep)))) {
-		if(!(((ea >> SMALLOC_EA_SHIFT) == 
+	shared_mem_area = (((ea >> SMALLOC_EA_SHIFT) == 
 		      (SMALLOC_START >> SMALLOC_EA_SHIFT)) &&
-		     ((current->thread.flags) & PPC_FLAG_SHARED))) {
-			spin_unlock(&hash_table_lock[lock_slot].lock);
-			return 1;
-		}
-	}
-#else
-	access |= _PAGE_PRESENT;
-	if (unlikely(access & ~(pte_val(*ptep)))) {
-		spin_unlock(&hash_table_lock[lock_slot].lock);
-		return 1;
-	}
+				((current->thread.flags) & PPC_FLAG_SHARED));
 #endif
 
 	/* 
@@ -360,12 +365,44 @@
 	 * The spinlocks prevent this status from changing
 	 * The hash_table_lock prevents the _PAGE_HASHPTE status
 	 * from changing (RPN, DIRTY and ACCESSED too)
-	 * The page_table_lock prevents the pte from being 
-	 * invalidated or modified
+	 *
+	 * For VMALLOC/IO regions the page_table_lock is not held when
+	 * we're executing here so all PTE updates must be atomic.
 	 */
 
+	set_bits = 0;
+	clear_bits = 0;
+
+	/* _PAGE_RW is set for store accesses */
+	if (access & _PAGE_RW)
+		set_bits |= _PAGE_ACCESSED|_PAGE_DIRTY;
+	else
+		set_bits |= _PAGE_ACCESSED;
+
+	access |= _PAGE_PRESENT;
+
+retry:
+
+	READ_PTE(ptep, pte);
+	/* 
+	 * Check the user's access rights to the page.  If access should be
+	 * prevented then send the problem up to do_page_fault.
+	 */
+
+	if(unlikely(access & ~(pte_val(pte)))) {
+		if(!shared_mem_area) {
+			spin_unlock(&hash_table_lock[lock_slot].lock);
+			return 1;
+		}
+	}
+
+	/* Update the entry. If it's been modified we need to start over. */
+
+	if(!WRITE_PTE(ptep, pte_val(pte) | set_bits))
+		goto retry;
+
 	/*
-	 * At this point, we have a pte (old_pte) which can be used to build
+	 * At this point, we have a pte (pte) which can be used to build
 	 * or update an HPTE. There are 2 cases:
 	 *
 	 * 1. There is a valid (present) pte with no associated HPTE (this is 
@@ -376,56 +413,42 @@
 	 *	page is currently not DIRTY. 
 	 */
 
-	old_pte = *ptep;
-	new_pte = old_pte;
-
-	/* If the attempted access was a store */
-	if (access & _PAGE_RW)
-		pte_val(new_pte) |= _PAGE_ACCESSED | _PAGE_DIRTY;
-	else
-		pte_val(new_pte) |= _PAGE_ACCESSED;
-
-	newpp = computeHptePP(pte_val(new_pte));
+	newpp = computeHptePP(pte_val(pte) | set_bits);
 	
 	/* Check if pte already has an hpte (case 2) */
-	if (unlikely(pte_val(old_pte) & _PAGE_HASHPTE)) {
+	if (unlikely(pte_val(pte) & _PAGE_HASHPTE)) {
 		/* There MIGHT be an HPTE for this pte */
-		unsigned long hash, slot, secondary;
+		unsigned long hash, secondary;
 
 		/* XXX fix large pte flag */
 		hash = hpt_hash(vpn, 0);
-		secondary = (pte_val(old_pte) & _PAGE_SECONDARY) >> 15;
+		secondary = (pte_val(pte) & _PAGE_SECONDARY) >> 15;
 		if (secondary)
 			hash = ~hash;
 		slot = (hash & htab_data.htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (pte_val(old_pte) & _PAGE_GROUP_IX) >> 12;
+		slot += (pte_val(pte) & _PAGE_GROUP_IX) >> 12;
 
 		/* XXX fix large pte flag */
 		if (ppc_md.hpte_updatepp(slot, secondary, 
 					 newpp, va, 0) == -1) {
-			pte_val(old_pte) &= ~_PAGE_HPTEFLAGS;
-		} else {
-			if (!pte_same(old_pte, new_pte)) {
-				*ptep = new_pte;
-			}
+			pte_val(pte) &= ~_PAGE_HPTEFLAGS;
 		}
 	}
 
-	if (likely(!(pte_val(old_pte) & _PAGE_HASHPTE))) {
+	if (likely(!(pte_val(pte) & _PAGE_HASHPTE))) {
 		/* Update the linux pte with the HPTE slot */
-		pte_val(new_pte) &= ~_PAGE_HPTEFLAGS;
-		pte_val(new_pte) |= _PAGE_HASHPTE;
-		prpn = pte_val(old_pte) >> PTE_SHIFT;
+		clear_bits |= _PAGE_HPTEFLAGS;
+		set_bits |= _PAGE_HASHPTE;
+		prpn = pte_val(pte) >> PTE_SHIFT;
 
 		/* copy appropriate flags from linux pte */
-		hpteflags = (pte_val(new_pte) & 0x1f8) | newpp;
+		hpteflags = (((pte_val(pte)&~clear_bits)|set_bits) & 0x1f8 ) | newpp;
 
 		slot = ppc_md.hpte_insert(vpn, prpn, hpteflags, 0, 0);
 
-		pte_val(new_pte) |= ((slot<<12) & 
-				     (_PAGE_GROUP_IX | _PAGE_SECONDARY));
+		set_bits |= ((slot<<12) & (_PAGE_GROUP_IX | _PAGE_SECONDARY));
 
-		*ptep = new_pte;
+		pte_update(ptep, clear_bits, set_bits);
 	}
 
 	spin_unlock(&hash_table_lock[lock_slot].lock);
@@ -444,6 +467,7 @@
 	struct mm_struct *mm;
 	pte_t *ptep;
 	int ret;
+	spinlock_t *lock = NULL;
 
 	/* Check for invalid addresses. */
 	if (!IS_VALID_EA(ea)) return 1;
@@ -451,15 +475,18 @@
  	switch (REGION_ID(ea)) {
 	case USER_REGION_ID:
 		mm = current->mm;
 		if (mm == NULL) return 1;
+		lock = &mm->page_table_lock;
 		vsid = get_vsid(mm->context, ea);
 		break;
 	case IO_REGION_ID:
 		mm = &ioremap_mm;
+		/* no locking for IO regions */
 		vsid = get_kernel_vsid(ea);
 		break;
 	case VMALLOC_REGION_ID:
 		mm = &init_mm;
+		/* no locking for VMALLOC regions */
 		vsid = get_kernel_vsid(ea);
 #ifdef CONFIG_SHARED_MEMORY_ADDRESSING
                 /*
@@ -501,7 +528,8 @@
 	 * Lock the Linux page table to prevent mmap and kswapd
 	 * from modifying entries while we search and update
 	 */
-	spin_lock(&mm->page_table_lock);
+	if(lock)
+		spin_lock(lock);
 
 	ptep = find_linux_pte(pgdir, ea);
 	/*
@@ -515,7 +543,8 @@
 		ret = 1;
 	}
 
-	spin_unlock(&mm->page_table_lock);
+	if(lock)
+		spin_unlock(lock);
 
 	return ret;
 }

From hollisb at us.ibm.com  Thu Sep 25 05:39:57 2003
From: hollisb at us.ibm.com (Hollis Blanchard)
Date: Wed, 24 Sep 2003 14:39:57 -0500
Subject: [PATCH] add new OF device tree API (2.6.0-test3)
In-Reply-To: <3F70AB7F.8050408@austin.ibm.com>
Message-ID: <E109C2DE-EEC6-11D7-898A-000A95A0560C@us.ibm.com>


On Tuesday, Sep 23, 2003, at 15:22 US/Central, Nathan Lynch wrote:

> Nathan Lynch wrote:
>> Some things which I plan to add within the next few days:
>> - "porting" arch/ppc64 to the new API
>> - implementation of reference counting
>> - support for addition and removal of device nodes
>> - a /proc-based mechanism for initiating node addition and removal
>> from
>> userspace
>
> Attached is a patch which replaces all the uses of the old device tree
> API in arch/ppc64.  Patch is against 2.6.0-test5 (cset 1.1328) from
> ameslab bk, plus the patch from my previous message.  I've tested this
> on a pSeries LPAR.

Some of the virtual IO code is very dependent on the device tree. It
would be great if Nathan's code could be reviewed and committed. I
don't want to code to an obsolete interface, and we'll need this new
interface anyway for VIO DLPAR.

--
Hollis Blanchard
IBM Linux Technology Center




From paulus at au1.ibm.com  Thu Sep 25 13:05:23 2003
From: paulus at au1.ibm.com (Paul Mackerras)
Date: Thu, 25 Sep 2003 13:05:23 +1000
Subject: [PATCH] [2.4] Remove page_table_lock in hash_page for kernel faults
In-Reply-To: <Pine.A41.4.44.0309231526420.49238-200000@forte.austin.ibm.com>
References: <Pine.A41.4.44.0309231526420.49238-200000@forte.austin.ibm.com>
Message-ID: <16242.23411.91281.584318@cargo.ozlabs.ibm.com>


Olof Johansson writes:

> Attached patch removes the page_table_lock for vmalloc and io-mapped
> regions.
>
> We've seen cases where someone would call vmalloc(), which takes the
> page_table_lock without disabling interrupts, we get an interrupt, and the
> interrupt code will fault -> hash_page() -> deadlock. It seems to be more
> common when the driver is a module (i.e. in the vmalloc region).
>
> The tricky part is to make the update atomic. I essentially copied Paul
> MacKerras 32-bit code where it checks access and sets the ACCESSED/DIRTY
> bits. ppc(32) has it all in assembler, I kept most of it in C to keep
> changes smaller. Rewriting selected parts into larger asm-blocks is a
> todo-item for the future (it'd shorten path length a bit).
>
> We've given this patch a few days of testing on one of the large SPECweb
> servers, where we'd otherwise see the deadlock every now and then, and we
> haven't seen any problems with it. But more eyes on the code couldn't
> hurt.
>
> Comments/questions anyone?

The basic idea is good but I am worried about the fact that we update
the PTE (the linux PTE) twice, once to set accessed/dirty bits and
once to update the HPTE present/slot-number bits.  Since we haven't
taken the mm->page_table_lock, some other cpu could have zeroed out
the PTE in the meantime.  We then have a race when it goes to do the
corresponding tlb_flush call, which should find and wipe out the
HPTE.

This is going to take a bit more thought, I think.  A first step would
be to define a compare-and-swap operation for the linux PTE (called,
say, CAS_PTE).  We would first read the PTE and check permissions and
set the accessed and/or dirty bits.  Then we do CAS_PTE to update the
pte provided it hasn't changed - if it has we go back and read it
again.  Then we update the hash table and finally do another CAS_PTE
to set the hashtable present/slot-number bits.
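
A minimal user-space sketch of the first step of the sequence above (read,
check permissions, set accessed/dirty, compare-and-swap, retry on change).
It uses C11 atomics in place of a real CAS_PTE. The bit values and the names
cas_pte and pte_mark_access are hypothetical stand-ins for illustration, not
the ppc64 definitions, and the later hash table update and second CAS_PTE
are not shown.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the ppc64 definitions. */
#define PAGE_PRESENT  UINT64_C(0x001)
#define PAGE_RW       UINT64_C(0x004)
#define PAGE_DIRTY    UINT64_C(0x080)
#define PAGE_ACCESSED UINT64_C(0x100)

typedef _Atomic uint64_t pte_t;

/* Stand-in for the proposed CAS_PTE: store newval only if *ptep == oldval. */
static bool cas_pte(pte_t *ptep, uint64_t oldval, uint64_t newval)
{
	return atomic_compare_exchange_strong(ptep, &oldval, newval);
}

/*
 * Check the requested access against the pte and set accessed/dirty
 * without holding a lock. Returns 0 on success, 1 if the fault has to
 * be passed up (the pte lacks one of the requested bits).
 */
static int pte_mark_access(pte_t *ptep, uint64_t access)
{
	for (;;) {
		uint64_t pte = atomic_load(ptep);
		uint64_t set_bits = PAGE_ACCESSED;

		if (access & ~pte)
			return 1;
		if (access & PAGE_RW)
			set_bits |= PAGE_DIRTY;

		/* If the pte changed since it was read, read it again. */
		if (cas_pte(ptep, pte, pte | set_bits))
			return 0;
	}
}
```

The point of the loop is that the new value is only ever derived from the
value that is still in memory at the moment of the swap, so a pte zeroed by
another cpu in between is noticed rather than overwritten.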

Also, I am mildly surprised that we aren't checking _PAGE_USER.  I
guess that's OK although it will mean that an attempted access by a
user process to a page that it doesn't have the right to access will
still result in _PAGE_ACCESSED (and possibly _PAGE_DIRTY) getting
set.  Setting _PAGE_ACCESSED is probably benign but setting
_PAGE_DIRTY could be more serious.

Another problem is that if shared_mem_area is true we seem to be going
ahead and putting in a HPTE even if _PAGE_PRESENT is false, which is
bad.  Not even the kernel can access a page for which _PAGE_PRESENT is
false.  If _PAGE_PRESENT is false, the other bits in the pte (except
for the HPTE present/slot-number bits) can be used for other purposes
such as storing a swap entry.   That is probably more generically a
bug in the shared memory area code though.
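
The two checks discussed above (present and user) can be sketched as a pure
function. This is an illustration under assumptions: the bit values and the
name hpte_insert_allowed are hypothetical, and the real code would fold such
a test into the access mask rather than take booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the ppc64 definitions. */
#define PAGE_PRESENT UINT64_C(0x001)
#define PAGE_USER    UINT64_C(0x002)
#define PAGE_RW      UINT64_C(0x004)

/*
 * Decide whether an HPTE may be inserted for this pte.
 * A non-present pte is rejected for every caller, kernel included,
 * because its remaining bits may hold something else (e.g. a swap entry).
 */
static bool hpte_insert_allowed(uint64_t pte, bool user_access, bool is_store)
{
	if (!(pte & PAGE_PRESENT))
		return false;
	if (user_access && !(pte & PAGE_USER))
		return false;
	if (is_store && !(pte & PAGE_RW))
		return false;
	return true;
}
```

Doing this test before any accessed/dirty bits are set avoids marking a page
dirty on behalf of an access that is going to be refused anyway.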

Paul.



From linas at austin.ibm.com  Fri Sep 26 05:15:53 2003
From: linas at austin.ibm.com (linas at austin.ibm.com)
Date: Thu, 25 Sep 2003 14:15:53 -0500
Subject: [PATCH] add new OF device tree API (2.6.0-test3)
In-Reply-To: <E109C2DE-EEC6-11D7-898A-000A95A0560C@us.ibm.com>; from hollisb@us.ibm.com on Wed, Sep 24, 2003 at 02:39:57PM -0500
References: <3F70AB7F.8050408@austin.ibm.com> <E109C2DE-EEC6-11D7-898A-000A95A0560C@us.ibm.com>
Message-ID: <20030925141553.A40918@forte.austin.ibm.com>


On Wed, Sep 24, 2003 at 02:39:57PM -0500, Hollis Blanchard wrote:
>
> On Tuesday, Sep 23, 2003, at 15:22 US/Central, Nathan Lynch wrote:
>
> > Nathan Lynch wrote:
> >> Some things which I plan to add within the next few days:
> >> - "porting" arch/ppc64 to the new API
> >> - implementation of reference counting
> >> - support for addition and removal of device nodes
> >> - a /proc-based mechanism for initiating node addition and removal
> >> from
> >> userspace
> >
> > Attached is a patch which replaces all the uses of the old device tree
> > API in arch/ppc64.  Patch is against 2.6.0-test5 (cset 1.1328) from
> > ameslab bk, plus the patch from my previous message.  I've tested this
> > on a pSeries LPAR.
>
> Some of the virtual IO code is very dependent on the device tree. It
> would be great if Nathan's code could be reviewed and committed? I
> don't want to code to an obsolete interface, and we'll need this new
> interface anyways for VIO DLPAR.

I don't know if my opinion counts for anything here, but I actually read
the patch and it looked harmless.  But that is because it mostly does
nothing other than replace calls to find_path_device() with calls to
of_find_node_by_path() and insert of_node_put() in reasonable-looking
places.  So what can go wrong?

I didn't actually read of_find_node_by_path(), however, since it's not
in this patch.
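
The lookup/put pairing described above can be modelled in user space as
follows. This is a mock, not the kernel API: mock_node,
mock_find_node_by_path and mock_node_put are hypothetical stand-ins that
only capture the assumption that a successful lookup takes a reference
which the caller must drop.

```c
#include <stddef.h>
#include <string.h>

/* Mock device node with a reference count; not the kernel struct. */
struct mock_node {
	const char *full_name;
	int refcount;
};

static struct mock_node mock_tree[] = {
	{ "/rtas", 0 },
	{ "/chosen", 0 },
};

/* Look up a node by path; on success the caller holds one reference. */
static struct mock_node *mock_find_node_by_path(const char *path)
{
	for (size_t i = 0; i < sizeof(mock_tree) / sizeof(mock_tree[0]); i++) {
		if (strcmp(mock_tree[i].full_name, path) == 0) {
			mock_tree[i].refcount++;
			return &mock_tree[i];
		}
	}
	return NULL;
}

/* Drop a reference taken by a lookup; NULL is tolerated. */
static void mock_node_put(struct mock_node *np)
{
	if (np)
		np->refcount--;
}

/* Typical caller: look up, use, then put before returning. */
static int mock_node_exists(const char *path)
{
	struct mock_node *np = mock_find_node_by_path(path);
	int found = (np != NULL);

	mock_node_put(np);
	return found;
}
```

What can go wrong with such a conversion is mostly an unbalanced count: a
return path that skips the put, or a put on a node that was never looked up.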


--linas



From linas at austin.ibm.com  Fri Sep 26 06:01:21 2003
From: linas at austin.ibm.com (linas at austin.ibm.com)
Date: Thu, 25 Sep 2003 15:01:21 -0500
Subject: [PATCH] Add KDB Modules support
Message-ID: <20030925150121.A52258@forte.austin.ibm.com>


Hi,

Could those of you who maintain KDB-enabled ppc64 trees add the
following to your configs?


--- arch/ppc64/config.in.orig   2003-09-23 14:51:00.000000000 -0500
+++ arch/ppc64/config.in        2003-09-23 14:56:58.000000000 -0500
@@ -258,6 +258,8 @@ bool 'Include xmon kernel debugger' CONF
 bool 'Include kdb kernel debugger' CONFIG_KDB
 bool 'Debug memory allocations' CONFIG_DEBUG_SLAB
 if [ "$CONFIG_KDB" = "y" ]; then
+
+  dep_tristate '  KDB additional modules' CONFIG_KDB_MODULES $CONFIG_KDB
   bool '  KDB off by default' CONFIG_KDB_OFF
   define_bool CONFIG_KALLSYMS y
   define_bool CONFIG_XMON n


If one turns on CONFIG_KDB_MODULES, one gets a number of additional
KDB commands for printing out various structs in a human-readable way.
Given that some of these structs have large offsets to interesting
fields (e.g. some interesting scsi values occur 1920 bytes in), these
prints are far better than trying to read hex dumps and counting
1920 bytes.  I wish I'd known about this some 2-3 weeks ago.
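
To illustrate what finding such a field by hand involves, here is a small
sketch using offsetof. struct demo_dev is made up for this example (its
padding is chosen so that the field lands 1920 bytes in); it is not the real
scsi struct, and queue_depth_from_dump is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Made-up struct: 1920 bytes of other state ahead of the field of interest. */
struct demo_dev {
	unsigned char other_state[1920];
	uint32_t queue_depth;
};

/* Pull the field out of a raw memory dump, as one would from a hex dump. */
static uint32_t queue_depth_from_dump(const unsigned char *dump)
{
	uint32_t v;

	memcpy(&v, dump + offsetof(struct demo_dev, queue_depth), sizeof(v));
	return v;
}
```

A struct-aware print command does the same offset arithmetic for every
field, which is what makes it so much more pleasant than counting bytes.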

Here's what the additional stuff looks like:

[0]kdb> h
[... stuff deleted ... ]
vm              <vaddr>              Display vm_area_struct
dentry          <dentry>             Display interesting dentry stuff
filp            <filp>               Display interesting filp stuff
sh              <vaddr>              Show scsi_host
sd              <vaddr>              Show scsi_device
sc              <vaddr>              Show scsi_cmnd
kiobuf          <vaddr>              Display kiobuf
page            <vaddr>              Display page
inode           <vaddr>              Display inode
bh              <buffer head address Display buffer
inode_pages     <inode *>            Display pages in an inode
req             <vaddr>              dump request struct
rqueue          <vaddr>              dump request queue
memmap                               page table summary

[0]kdb> vm 0xc000000004e30000
struct vm_area_struct at 0xc000000004e30000 for 136 bytes
vm_start = 0x0   vm_end = 0x0
page_prot = 0xc000000000404bf8
Flags:

[0]kdb> page 0xc000000004e30000
struct page at 0xc000000004e30000
  next 0x0000000000000000 prev 0x0000000000000000 addr space 0x0000000000000000)  count 0 flags
  virtual 0x4ba2e8baf8d0b000
  buffers 0xc00000000064a378

[0]kdb> sh 0xc000000004e30000
Scsi_Host at 0xc000000004e30000
next = 0x0000000000000000   host_queue = 0x0000000000000000
ehandler = 0x0000000000000000 eh_wait = 0x0000000000000000  en_notify = 0xc00000
eh_active = 0x0 host_wait = 0xc0000000003b95d0 hostt = 0xc000000000649aa8 host_0
host_failed = 0  extra_bytes = 0  host_no = 1 resetting = 0
max id/lun/channel = [1/0/-1073741824]  this_id = 0
can_queue = 0 cmd_per_lun = -16384  sg_tablesize = 0 u_isa_dma = 1
host_blocked = 1  reverse_ordering = 0

[0]kdb> bh 0xc000000004e30000
buffer_head at 0xc000000004e30000
  next 0x0000000000000000 bno 0 rsec 4294967296 size 0 dev 0x0 rdev 0x0
  count 0 state 0xc000000000404bf8 [Req Mapped New Async Wait_IO Launder JBD Pr0
  b_next_free 0x0000000000000000 b_prev_free 0xffffffff00000000 b_reqnext 0xc008
  b_page 0x0000000000000000 b_this_page 0x0000008b0000008b b_private 0x000000000


[0]kdb> memmap
  Total pages:      524288
  Slab pages:         2908
  Dirty pages:         441
  Locked pages:          0
  Buffer pages:       9035
  0 page count:     503233
  1 page count:      12019
  2 page count:       7428
  3 page count:        743
  4 page count:        292
  5 page count:         20
  6 page count:        218
  7 page count:          3
  high page count:     332


Warning: On my machine asking for req and rqueue hung the machine hard.
req             <vaddr>              dump request struct
rqueue          <vaddr>              dump request queue

I suspect that this may be because the ppc64 KDB is downlevel,
and that this problem is fixed in newer KDBs.

--linas



From olh at suse.de  Fri Sep 26 06:06:08 2003
From: olh at suse.de (Olaf Hering)
Date: Thu, 25 Sep 2003 22:06:08 +0200
Subject: [PATCH] Add KDB Modules support
In-Reply-To: <20030925150121.A52258@forte.austin.ibm.com>
References: <20030925150121.A52258@forte.austin.ibm.com>
Message-ID: <20030925200608.GA30817@suse.de>


 On Thu, Sep 25, linas at austin.ibm.com wrote:

>
> Warning: On my machine asking for req and rqueue hung the machine hard.
> req             <vaddr>              dump request struct
> rqueue          <vaddr>              dump request queue
>
> I suspect that this may be due to the fact that the ppc64 KDB is downlevel,
> and that this problem is fixed in newer KDB's.

Is that fixable for the currently used KDB?
A simple #if 0 might count as a fix.

--
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG



From linas at austin.ibm.com  Fri Sep 26 06:20:19 2003
From: linas at austin.ibm.com (linas at austin.ibm.com)
Date: Thu, 25 Sep 2003 15:20:19 -0500
Subject: [PATCH] Add KDB Modules support
In-Reply-To: <20030925200608.GA30817@suse.de>; from olh@suse.de on Thu, Sep 25, 2003 at 10:06:08PM +0200
References: <20030925150121.A52258@forte.austin.ibm.com> <20030925200608.GA30817@suse.de>
Message-ID: <20030925152019.B52258@forte.austin.ibm.com>


On Thu, Sep 25, 2003 at 10:06:08PM +0200, Olaf Hering wrote:
>  On Thu, Sep 25, linas at austin.ibm.com wrote:
>
> >
> > Warning: On my machine asking for req and rqueue hung the machine hard.
> > req             <vaddr>              dump request struct
> > rqueue          <vaddr>              dump request queue
> >
> > I suspect that this may be due to the fact that the ppc64 KDB is downlevel,
> > and that this problem is fixed in newer KDB's.
>
> Is that fixable for the currently used KDB?
> A simple #if 0 might count as a fix.


I'll investigate.  If you don't hear from me in 48 hours, use the #if 0 below.

--linas

--- kdb/modules/kdbm_pg.c.orig  2003-09-25 15:09:53.000000000 -0500
+++ kdb/modules/kdbm_pg.c       2003-09-25 15:13:28.000000000 -0500
@@ -197,6 +197,8 @@ static int
 kdbm_request(int argc, const char **argv, const char **envp,
        struct pt_regs *regs)
 {
+#if 0   /* currently this locks up KDB hard don't know why */
+
        long    offset=0;
        unsigned long addr;
        int nextarg;
@@ -211,6 +213,7 @@ kdbm_request(int argc, const char **argv
                return diag;

        print_request(addr);
+#endif
        return 0;
 }

@@ -219,6 +222,8 @@ static int
 kdbm_rqueue(int argc, const char **argv, const char **envp,
        struct pt_regs *regs)
 {
+
+#if 0   /* currently this locks up KDB hard don't know why */
        struct request_queue    rq;
        unsigned long addr, head_addr, next;
        long    offset=0;
@@ -252,6 +257,7 @@ kdbm_rqueue(int argc, const char **argv,
        if (i)
                kdb_printf("%d requests found\n", i);

+#endif
        return 0;
 }




From linas at austin.ibm.com  Fri Sep 26 08:35:28 2003
From: linas at austin.ibm.com (linas at austin.ibm.com)
Date: Thu, 25 Sep 2003 17:35:28 -0500
Subject: [PATCH] Add KDB Modules support
In-Reply-To: <20030925152019.B52258@forte.austin.ibm.com>; from linas@austin.ibm.com on Thu, Sep 25, 2003 at 03:20:19PM -0500
References: <20030925150121.A52258@forte.austin.ibm.com> <20030925200608.GA30817@suse.de> <20030925152019.B52258@forte.austin.ibm.com>
Message-ID: <20030925173528.A54132@forte.austin.ibm.com>


On Thu, Sep 25, 2003 at 03:20:19PM -0500, linas at austin.ibm.com wrote:
>
> On Thu, Sep 25, 2003 at 10:06:08PM +0200, Olaf Hering wrote:
> >  On Thu, Sep 25, linas at austin.ibm.com wrote:
> >
> > >
> > > Warning: On my machine asking for req and rqueue hung the machine hard.
> > > req             <vaddr>              dump request struct
> > > rqueue          <vaddr>              dump request queue
> > >
> > > I suspect that this may be due to the fact that the ppc64 KDB is downlevel,
> > > and that this problem is fixed in newer KDB's.
> >
> > Is that fixable for the currently used KDB?
> > A simple #if 0 might count as a fix.

The following seems to fix the hang, making all of the new commands usable:

--- kdb/modules/kdbm_pg.c.orig  2003-09-25 15:09:53.000000000 -0500
+++ kdb/modules/kdbm_pg.c       2003-09-25 17:12:47.000000000 -0500
@@ -244,7 +244,7 @@ kdbm_rqueue(int argc, const char **argv,
        head_addr = addr + offsetof(struct request_queue, queue_head);
        kdb_printf(" request queue: %s\n", next == head_addr ?
                "empty" : "");
-       while (next != head_addr) {
+       while (next && next != head_addr) {
                i++;
                next = print_request(next);
        }
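
A user-space sketch of the guarded walk, for illustration only: list_node
and count_entries are hypothetical, and following next->next stands in for
print_request() returning the next address. The extra test stops the walk
when it reaches a zero link instead of coming back round to the head.

```c
#include <stddef.h>

struct list_node {
	struct list_node *next;
};

/*
 * Count the entries on a circular list whose head is `head`.
 * The walk ends when it returns to the head, or when it hits a NULL
 * link, which a well-formed circular list never contains but
 * uninitialized or unreadable memory may.
 */
static int count_entries(const struct list_node *head)
{
	const struct list_node *next = head->next;
	int i = 0;

	while (next && next != head) {
		i++;
		next = next->next;
	}
	return i;
}
```

Without the NULL test, a zero link would be followed rather than treated as
the end of the walk.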




From olh at suse.de  Fri Sep 26 17:10:00 2003
From: olh at suse.de (Olaf Hering)
Date: Fri, 26 Sep 2003 09:10:00 +0200
Subject: [PATCH] Add KDB Modules support
In-Reply-To: <20030925173528.A54132@forte.austin.ibm.com>
References: <20030925150121.A52258@forte.austin.ibm.com> <20030925200608.GA30817@suse.de> <20030925152019.B52258@forte.austin.ibm.com> <20030925173528.A54132@forte.austin.ibm.com>
Message-ID: <20030926071000.GA7843@suse.de>


 On Thu, Sep 25, linas at austin.ibm.com wrote:

> On Thu, Sep 25, 2003 at 03:20:19PM -0500, linas at austin.ibm.com wrote:
> >
> > On Thu, Sep 25, 2003 at 10:06:08PM +0200, Olaf Hering wrote:
> > >  On Thu, Sep 25, linas at austin.ibm.com wrote:
> > >
> > > >
> > > > Warning: On my machine asking for req and rqueue hung the machine hard.
> > > > req             <vaddr>              dump request struct
> > > > rqueue          <vaddr>              dump request queue
> > > >
> > > > I suspect that this may be due to the fact that the ppc64 KDB is downlevel,
> > > > and that this problem is fixed in newer KDB's.
> > >
> > > Is that fixable for the currently used KDB?
> > > A simple #if 0 might count as a fix.
>
> The following seems to fix the hang, making all of the new commands usable:

thanks Linas.

--
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG



From roland at topspin.com  Sat Sep 27 13:22:48 2003
From: roland at topspin.com (Roland Dreier)
Date: 26 Sep 2003 20:22:48 -0700
Subject: Problem with P2P PCI bridge in pSeries 630
Message-ID: <52n0cqgbjr.fsf@topspin.com>

A non-text attachment was scrubbed...
Name: p630-suse-hca-dmesg.txt.bz2
Type: application/x-bzip
Size: 4900 bytes
Desc: dmesg with HCA installed
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030926/62eab014/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: p630-suse-hca-lspci.txt.bz2
Type: application/x-bzip
Size: 2448 bytes
Desc: lspci -vvv with HCA installed
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030926/62eab014/attachment-0001.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: p630-suse-lspci.txt.bz2
Type: application/x-bzip
Size: 2413 bytes
Desc: lspci -vvv without HCA installed
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030926/62eab014/attachment-0002.bin 

From linas at austin.ibm.com  Tue Sep 30 00:54:06 2003
From: linas at austin.ibm.com (linas at austin.ibm.com)
Date: Mon, 29 Sep 2003 09:54:06 -0500
Subject: Problem with P2P PCI bridge in pSeries 630
In-Reply-To: <52n0cqgbjr.fsf@topspin.com>; from roland@topspin.com on Fri, Sep 26, 2003 at 08:22:48PM -0700
References: <52n0cqgbjr.fsf@topspin.com>
Message-ID: <20030929095405.A22040@forte.austin.ibm.com>


On Fri, Sep 26, 2003 at 08:22:48PM -0700, Roland Dreier wrote:
> I have an IBM pSeries 630 server, and I am attempting to port a driver
> for an InfiniBand HCA to ppc64 Linux.  (I already have the driver
> working on i386, x86_64, ia64 and ppc32) However, I'm having a problem

[ ...]

> be able to cope with the PCI bridge.  The first problem appears around
> when the kernel tries to start the framebuffer (right around when it
> prints its message about leaving prom_init).  Usually it displays four

[...]

> EEH: PCI Enhanced I/O Error Handling Enabled
> PCI: 0062:00.0 pci15b3,5a44 (<unknown type>) has bad status from firmware! (fail-perm)<4>write_OF_bars 0062:00.0 pci15b3,5a44 (<unknown type>): read BAR0 failed
> write_OF_bars 0062:00.0 pci15b3,5a44 (<unknown type>): read BAR1 failed

Before you bang your head too much on the PCI code, please make sure
that you have the latest firmware installed (and install it if you don't).

The OF firmware is involved in the PCI setup, and there have been some
bugs fixed having to do with PCI bridges.  These are 'recent' bugs,
fixed about 6 months ago.

FYI, EEH is a mechanism that takes a PCI slot offline if a PCI
parity error, data error, address error, etc. is detected.  Once the
slot is taken offline, reads from that slot will return 0xff and
writes will never make it to the device.  Currently the kernel will
be forced to panic if an EEH error is detected, but I guess in the
future the device will be hotplug-removed.
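
A sketch of what the all-ones behaviour means for a driver-side check,
assuming that "return 0xff" applies to every byte of a read, so that a
32-bit register reads back as 0xffffffff. read_looks_isolated is a
hypothetical helper, not part of the EEH code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An all-ones value from a 32-bit read is what an isolated slot would
 * return under the assumption above. The same value can also be
 * legitimate register contents, so this is only a hint that the slot
 * state should be checked, not proof of an EEH event.
 */
static bool read_looks_isolated(uint32_t val)
{
	return val == UINT32_C(0xffffffff);
}
```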

There were some bugs when the EEH code walked bridges.  A firmware upgrade
should fix it; if not then you've got a real problem.

--linas




From roland at topspin.com  Tue Sep 30 06:20:38 2003
From: roland at topspin.com (Roland Dreier)
Date: 29 Sep 2003 13:20:38 -0700
Subject: Problem with P2P PCI bridge in pSeries 630
In-Reply-To: <20030929095405.A22040@forte.austin.ibm.com>
References: <52n0cqgbjr.fsf@topspin.com>
	<20030929095405.A22040@forte.austin.ibm.com>
Message-ID: <52y8w72vop.fsf@topspin.com>


    linas> Before you bang your head too much on the PCI code, please
    linas> make sure that you have the latest firmware installed (and
    linas> install it if you don't).

    linas> The OF firmware is involved in the PCI setup, and there
    linas> have been some bugs fixed having to do with PCI bridges.
    linas> These are 'recent' bugs, fixed about 6 months ago.

Thanks for the advice.  I updated my firmware from 3R030528 to
3R030718, but unfortunately the behavior is exactly the same.  I get:

    EEH: PCI Enhanced I/O Error Handling Enabled
    PCI: 0062:00.0 pci15b3,5a44 (<unknown type>) has bad status from firmware! (fail-perm)<4>write_OF_bars 0062:00.0 pci15b3,5a44 (<unknown type>): read BAR0 failed
    write_OF_bars 0062:00.0 pci15b3,5a44 (<unknown type>): read BAR1 failed

in the kernel log, and lspci doesn't see the device behind the bridge.

    linas> There were some bugs when the EEH code walked bridges.  A
    linas> firmware upgrade should fix it; if not then you've got a
    linas> real problem.

I guess I have a real problem :(

 - Roland




From moilanen at austin.ibm.com  Tue Sep 30 06:26:01 2003
From: moilanen at austin.ibm.com (Jake Moilanen)
Date: Mon, 29 Sep 2003 15:26:01 -0500 (CDT)
Subject: Problem with P2P PCI bridge in pSeries 630
In-Reply-To: <52y8w72vop.fsf@topspin.com>
References: <52n0cqgbjr.fsf@topspin.com> <20030929095405.A22040@forte.austin.ibm.com>
 <52y8w72vop.fsf@topspin.com>
Message-ID: <Pine.A41.4.51.0309291524580.32436@wolverines.austin.ibm.com>


I believe you have to enable a Linux-compatible boot as well.  It's
somewhere in the Service Processor menus.

Jake

On Mon, 29 Sep 2003, Roland Dreier wrote:

>
>     linas> Before you bang your head too much on the PCI code, please
>     linas> make sure that you have the latest firmware installed (and
>     linas> install it if you don't).
>
>     linas> The OF firmware is involved in the PCI setup, and there
>     linas> have been some bugs fixed having to do with PCI bridges.
>     linas> These are 'recent' bugs, fixed about 6 months ago.
>
> Thanks for the advice.  I updated my firmware from 3R030528 to
> 3R030718, but unfortunately the behavior is exactly the same.  I get:
>
>     EEH: PCI Enhanced I/O Error Handling Enabled
>     PCI: 0062:00.0 pci15b3,5a44 (<unknown type>) has bad status from firmware! (fail-perm)<4>write_OF_bars 0062:00.0 pci15b3,5a44 (<unknown type>): read BAR0 failed
>     write_OF_bars 0062:00.0 pci15b3,5a44 (<unknown type>): read BAR1 failed
>
> in the kernel log, and lspci doesn't see the device behind the bridge.
>
>     linas> There were some bugs when the EEH code walked bridges.  A
>     linas> firmware upgrade should fix it; if not then you've got a
>     linas> real problem.
>
> I guess I have a real problem :(
>
>  - Roland
>
>
>
>



From roland at topspin.com  Tue Sep 30 06:49:52 2003
From: roland at topspin.com (Roland Dreier)
Date: 29 Sep 2003 13:49:52 -0700
Subject: Problem with P2P PCI bridge in pSeries 630
In-Reply-To: <Pine.A41.4.51.0309291524580.32436@wolverines.austin.ibm.com>
References: <52n0cqgbjr.fsf@topspin.com>
	<20030929095405.A22040@forte.austin.ibm.com>
	<52y8w72vop.fsf@topspin.com>
	<Pine.A41.4.51.0309291524580.32436@wolverines.austin.ibm.com>
Message-ID: <52r81z48wf.fsf@topspin.com>


    Jake> I believe you have to enable a linux compatible boot as
    Jake> well.  It's somewhere in the Service Processor menues.

I can't find anything about Linux boot in my service processor menu.
Do you know where it is?  (I also looked in the SMS menu, and I
couldn't find anything there either)

Thanks,
  Roland



From moilanen at austin.ibm.com  Tue Sep 30 07:21:05 2003
From: moilanen at austin.ibm.com (Jake Moilanen)
Date: Mon, 29 Sep 2003 16:21:05 -0500 (CDT)
Subject: Problem with P2P PCI bridge in pSeries 630
In-Reply-To: <52r81z48wf.fsf@topspin.com>
References: <52n0cqgbjr.fsf@topspin.com> <20030929095405.A22040@forte.austin.ibm.com>
 <52y8w72vop.fsf@topspin.com> <Pine.A41.4.51.0309291524580.32436@wolverines.austin.ibm.com>
 <52r81z48wf.fsf@topspin.com>
Message-ID: <Pine.A41.4.51.0309291618240.32436@wolverines.austin.ibm.com>


I'm not positive this will fix your problem, but this is what you need to
do.  In the SP menus:

Type: 12320

This will bring up additional options.

Next set lpar-mode toggle on.

Go back to the main menu and go to:
-> Power Control -> Boot Mode -> Linux Compatible Mode

I hope this works for you.

Thanks,
Jake


On Mon, 29 Sep 2003, Roland Dreier wrote:

>
>     Jake> I believe you have to enable a linux compatible boot as
>     Jake> well.  It's somewhere in the Service Processor menues.
>
> I can't find anything about Linux boot in my service processor menu.
> Do you know where it is?  (I also looked in the SMS menu, and I
> couldn't find anything there either)
>
> Thanks,
>   Roland
>
>
>



From hollisb at us.ibm.com  Tue Sep 30 07:41:38 2003
From: hollisb at us.ibm.com (Hollis Blanchard)
Date: Mon, 29 Sep 2003 16:41:38 -0500
Subject: [2.5 PATCH] virtual console via /vdevice/vty
Message-ID: <B47F51B3-F2C5-11D7-AC04-000A95A0560C@us.ibm.com>

Last week I committed a patch to ameslab 2.4 that uses the /vdevice/vty
nodes to determine virtual console information. This is the 2.5 version
of that patch. The /rtas/ibm,termno property will be going away in the
future, which really should be ok because all LPAR device trees have
had /vdevice/vty nodes (according to a firmware developer).

With the patch, we also use the OF /chosen/stdout property to determine
where the udbg output should go. This is more correct than before
(hardcoding vterm output), but currently there are two unhandled cases:
a future incompatible vty node (compatible "hvterm-protocol1"), and
serial console to an LPAR, which just needs to be plugged in and tested.

The hvc_count function was also affected; it now just returns
information from the first vty node found. (Now that I think about it,
it may make more sense to keep looking for vty nodes if the first one
is incompatible...) The vterm numbers found in vty nodes are not
necessarily contiguous (different from ibm,termno), so we can only
handle one vterm anyways. (I'm working on replacing hvc_count entirely
but that code is not ready and I don't think I'll have time in the next
two weeks for it.)
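
The "keep looking" variant mentioned above could look roughly like the
following user-space sketch. It is not the hvc_count code: mock_vty and
first_usable_vty are hypothetical, the nodes are a plain array rather than
the device tree, and the supported "compatible" string is passed in by the
caller rather than assumed here.

```c
#include <stddef.h>
#include <string.h>

/* Mock vty node: just the two properties this sketch cares about. */
struct mock_vty {
	const char *compatible;
	unsigned int termno;	/* vterm number; need not be contiguous */
};

/*
 * Return the index of the first vty node the driver can handle, or -1.
 * Unlike stopping at the first node found, this keeps looking past
 * nodes whose "compatible" value does not match.
 */
static int first_usable_vty(const struct mock_vty *vtys, size_t n,
			    const char *supported)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(vtys[i].compatible, supported) == 0)
			return (int)i;
	}
	return -1;
}
```

Because the vterm numbers are not contiguous, the caller would take termno
from the node that was found rather than derive it from the index.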

The hvc_console driver was unaffected by the switch from
/rtas/ibm,termno to /vdevice/vty.

This patch has been tested on an LPAR p630. Please consider for
inclusion.

--
Hollis Blanchard
IBM Linux Technology Center
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vty-2.5.diff
Type: application/octet-stream
Size: 5894 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20030929/90b35056/attachment.obj 

From roland at topspin.com  Tue Sep 30 07:48:16 2003
From: roland at topspin.com (Roland Dreier)
Date: 29 Sep 2003 14:48:16 -0700
Subject: Problem with P2P PCI bridge in pSeries 630
In-Reply-To: <Pine.A41.4.51.0309291618240.32436@wolverines.austin.ibm.com>
References: <52n0cqgbjr.fsf@topspin.com>
	<20030929095405.A22040@forte.austin.ibm.com>
	<52y8w72vop.fsf@topspin.com>
	<Pine.A41.4.51.0309291524580.32436@wolverines.austin.ibm.com>
	<52r81z48wf.fsf@topspin.com>
	<Pine.A41.4.51.0309291618240.32436@wolverines.austin.ibm.com>
Message-ID: <52ekxz4673.fsf@topspin.com>


Thanks... that didn't seem to help either.

I've just spoken to Wolfgang Maier at IBM Austin, and he said that the
problem seems to be that OF can't handle BARs bigger than 64M.
Unfortunately the InfiniBand HCA has a BAR of 128M.  Wolfgang is going
to put me in touch with the appropriate people at Austin.
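
For reference, the size of a memory BAR is conventionally derived from the
value read back after writing all ones to it. The arithmetic can be sketched
in plain C with no hardware access; bar_size_from_readback is a hypothetical
helper and only covers 32-bit memory BARs.

```c
#include <stdint.h>

/* The low 4 bits of a memory BAR are type/prefetch flags, not address bits. */
#define BAR_MEM_FLAG_MASK UINT32_C(0xf)

/* Compute the BAR size from the value read back after writing all ones. */
static uint32_t bar_size_from_readback(uint32_t readback)
{
	uint32_t mask = readback & ~BAR_MEM_FLAG_MASK;

	if (mask == 0)
		return 0;	/* BAR not implemented */
	return ~mask + 1;	/* lowest writable address bit gives the size */
}
```

A 128M BAR reads back with the top five address bits set (0xf8000000 plus
flag bits), while a 64M BAR reads back as 0xfc000000 plus flag bits.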

Thanks for all the help.

 - Roland



From anton at samba.org  Tue Sep 30 17:13:24 2003
From: anton at samba.org (Anton Blanchard)
Date: Tue, 30 Sep 2003 17:13:24 +1000
Subject: Problem with P2P PCI bridge in pSeries 630
In-Reply-To: <52ekxz4673.fsf@topspin.com>
References: <52n0cqgbjr.fsf@topspin.com> <20030929095405.A22040@forte.austin.ibm.com> <52y8w72vop.fsf@topspin.com> <Pine.A41.4.51.0309291524580.32436@wolverines.austin.ibm.com> <52r81z48wf.fsf@topspin.com> <Pine.A41.4.51.0309291618240.32436@wolverines.austin.ibm.com> <52ekxz4673.fsf@topspin.com>
Message-ID: <20030930071324.GH24019@krispykreme>


Hi,

> I've just spoken to Wolfgang Maier at IBM Austin, and he said that the
> problem seems to be that OF can't handle BARs bigger than 64M.
> Unfortunately the InfiniBand HCA has a BAR of 128M.  Wolfgang is going
> to put me in touch with the appropriate people at Austin.

Yep, this is a bug in our OF that we've seen before :(

Anton



From olh at suse.de  Tue Sep 30 17:39:29 2003
From: olh at suse.de (Olaf Hering)
Date: Tue, 30 Sep 2003 09:39:29 +0200
Subject: possible deadlock in pipes
In-Reply-To: <20030821215433.GE29476@krispykreme>
References: <20030820214114.GA20395@suse.de> <20030821215433.GE29476@krispykreme>
Message-ID: <20030930073929.GA8352@suse.de>


 On Fri, Aug 22, Anton Blanchard wrote:

>
> Hi Olaf,
>
> > I see random dead locks in pipe_wait() on power4 (p630, p650).
> > I dont have a simple testcase to trigger it. The userland code its
> > either 32bit or 64bit, new or (very) old.
> > It happens with 2.4.19 and with 2.4.21.
> >
> > This patch seems to fix it, maybe it is only a workaround. Is the 2.4
> > pipe code supposed to work on power4 (enough sync accross cpus etc.)?
>
> How hard is it to hit? I think we should be using set_task_state()
> which includes a memory barrier. I also updated the other open coded
> statements to use __set_task_state. Finally I got rid of some redundant
> wmb()s and added unlikely() to force the slow path out of line
>
> Note in 2.5 we have each way barriers on our atomics that return values
> (like atomic_dec_and_test). It doesnt look like we have that in 2.4.
>
> Anton
>
> ===== arch/ppc64/kernel/semaphore.c 1.2 vs edited =====
> --- 1.2/arch/ppc64/kernel/semaphore.c	Mon Apr  8 15:56:10 2002
> +++ edited/arch/ppc64/kernel/semaphore.c	Fri Aug 22 07:11:26 2003
> @@ -122,10 +120,11 @@
>  			break;
>  		}
>  		schedule();
> -		tsk->state = TASK_INTERRUPTIBLE;
> +		set_task_state(tsk, TASK_INTERRUPTIBLE);
>  	}
> -	tsk->state = TASK_RUNNING;
>  	remove_wait_queue(&sem->wait, &wait);
> +	__set_task_state(tsk, TASK_RUNNING);
> +
>  	wake_up(&sem->wait);
>  	return retval;
>  }

Why did you move the set task state?

--
USB is for mice, FireWire is for men!

sUse lINUX ag, nÜRNBERG
