[Cbe-oss-dev] 2.6.21-rc4-arnd1

Arnd Bergmann arnd at arndb.de
Wed Mar 21 07:29:43 EST 2007


I've just uploaded the 2.6.21-rc4-arnd1 kernel, and an equivalent backport
to the latest stable kernel, 2.6.20-arnd3.

This is still not merged with the ps3 kernel from Geoff Levand, which I
mean to do (really, promised). It does contain a number of important
bug fixes, especially for spufs, so please upgrade if you are running
a -arnd kernel right now.

With a little luck, this will also be the version that our friends
in bsc.es maintain for the next SDK release.

Most of the patches are also meant to go into 2.6.22, I'll send the
patchbomb out for another round of review soon.

	Arnd <><

====== spidernet-ipfrag-nfs.diff ======
Subject: spidernet: Fix problem sending IP fragments

From: Norbert Eicker <n.eicker at fz-juelich.de>
I found out that the spidernet-driver is unable to send fragmented IP 
frames.

Let me just recall the basic structure of "normal" UDP/IP/Ethernet 
frames (that actually work):
 - It starts with the Ethernet header (dest MAC, src MAC, etc.)
 - The next part is occupied by the IP header (version info, length of 
packet, id=0, fragment offset=0, checksum, from / to address, etc.)
 - Then comes the UDP header (src / dest port, length, checksum)
 - Actual payload
 - Ethernet checksum

Now what's different for IP fragments:
 - The IP header has id set to some value (same for all fragments), 
offset is set appropriately (i.e. 0 for first fragment, following 
according to size of other fragments), size is the length of the frame.
 - UDP header is unchanged. I.e. the length field covers the full UDP 
datagram, not just the part within the actual frame! But this is only 
true for the first fragment: all following fragments don't have a valid 
UDP header at all.

The spidernet silicon seems to be quite intelligent: It's able to 
compute (IP / UDP / Ethernet) checksums on the fly and tests if frames 
are conforming to RFC -- at least conforming to RFC on complete frames.

But IP fragments are different as explained above:
I.e. for IP fragments containing part of a UDP datagram it sees 
incompatible length in the headers for IP and UDP in the first frame 
and, thus, skips this frame. But the content *is* correct for IP 
fragments. For all following frames it finds (most probably) no valid 
UDP header at all. But this *is* also correct for IP fragments.

The Linux IP-stack seems to be clever on this point. It expects the 
spidernet to calculate the checksum (since the module claims to be able 
to do so) and marks the skb's for "normal" frames accordingly 
(ip_summed set to CHECKSUM_HW).
But for the IP fragments it does not expect the driver to be capable of 
handling the frames appropriately. Thus all checksums are already 
computed. This is also flagged within the skb (ip_summed set to 
CHECKSUM_NONE).

Unfortunately the spidernet driver ignores these hints. It tries to send 
the IP fragments of UDP datagrams as normal UDP/IP frames. Since they 
have a different structure, the silicon detects them to be not 
"well-formed" and skips them.

The following one-liner against 2.6.21-rc2 changes this behavior. If the 
IP-stack claims to have done the checksumming, the driver should not 
try to checksum (and analyze) the frame but send it as is.

Signed-off-by: Norbert Eicker <n.eicker at fz-juelich.de>
Signed-off-by: Linas Vepstas <linas at austin.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 spider_net.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

====== ipmi_si-check-devicetree-3.diff ======
Subject: ipmi: add support for powerpc of_platform_driver

From: Christian Krafft <krafft at de.ibm.com>
This patch adds support for of_platform_driver to the ipmi_si module.
When loading the module, the driver will be registered to of_platform.
The driver will be probed for all devices with the type ipmi. It supports
devices with the compatible values ipmi-kcs, ipmi-smic and ipmi-bt.
Only ipmi-kcs could be tested.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Acked-by: Heiko J Schick <schihei at de.ibm.com>
Signed-off-by: Corey Minyard <minyard at acm.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 ipmi_si_intf.c |  108 +++++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)

====== ipmi_add_module_device_table.diff ======
Subject: ipmi: add module_device_table to ipmi_si

From: Christian Krafft <krafft at de.ibm.com>
This patch adds MODULE_DEVICE_TABLE to ipmi_si.
This way the module can be autoloaded by the kernel
if a matching device is found.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 ipmi_si_intf.c |    2 ++
 1 file changed, 2 insertions(+)

====== spu-sched-tick-workqueue-is-rearming.diff ======
Subject: use cancel_rearming_delayed_workqueue when stopping spu contexts

From: Christoph Hellwig <hch at lst.de>

The scheduler workqueue may rearm itself and deadlock when we try to stop
it.  Put a flag in place to skip the work if we're tearing down
the context.

Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 sched.c |   23 +++++++++++++++++++++--
 spufs.h |    2 +-
 2 files changed, 22 insertions(+), 3 deletions(-)

====== cbe_thermal-add-reg_to_temp.diff ======
Subject: cbe_thermal: clean up computation of temperature

From: Christian Krafft <krafft at de.ibm.com>
This patch introduces a little function for transforming
register values into temperature.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 cbe_thermal.c |   26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

====== cbe_thermal-throttling-attributes.diff ======
Subject: cbe_thermal: add throttling attributes to cpu and spu nodes

From: Christian Krafft <krafft at de.ibm.com>
This patch adds some attributes to the cpu and spu nodes:
/sys/devices/system/[c|s]pu/[c|s]pu*/thermal/throttle_begin
/sys/devices/system/[c|s]pu/[c|s]pu*/thermal/throttle_end
/sys/devices/system/[c|s]pu/[c|s]pu*/thermal/throttle_full_stop

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 cbe_thermal.c |  155 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 154 insertions(+), 1 deletion(-)

====== cell-add-node-to-cpu-4.diff ======
Subject: cell: add cbe_node_to_cpu function

From: Christian Krafft <krafft at de.ibm.com>

This patch adds code to deal with conversion of
logical cpu to cbe nodes. It removes code that
assumed there were two logical CPUs per CBE.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 arch/powerpc/oprofile/op_model_cell.c  |    1 
 arch/powerpc/platforms/cell/cbe_regs.c |   53 +++++++++----
 arch/powerpc/platforms/cell/cbe_regs.h |    5 +
 include/asm-powerpc/cell-pmu.h         |    5 -
 4 files changed, 45 insertions(+), 19 deletions(-)

====== cbe_cpufreq-use-pmi-3.diff ======
Subject: cell: use pmi in cpufreq driver

From: Christian Krafft <krafft at de.ibm.com>

The new PMI driver was added in order to support
cpufreq on blades that require the frequency to
be controlled by the service processor, so use it
on those.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---

---
diffstat:
 cbe_cpufreq.c |   81 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 80 insertions(+), 1 deletion(-)

====== powerpc-add-of_remap.diff ======
Subject: powerpc: add of_iomap function

From: Christian Krafft <krafft at de.ibm.com>

The of_iomap function maps memory for a given
device_node and returns a pointer to that memory.
This is used in several places, so it makes sense to
make it a separate function.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 arch/powerpc/sysdev/pmi.c  |   19 ++-----------------
 include/asm-powerpc/prom.h |   11 +++++++++++
 2 files changed, 13 insertions(+), 17 deletions(-)

====== cell-support-new-device-tree-layout.diff ======
Subject: cell: add support for proper device-tree

From: Christian Krafft <krafft at de.ibm.com>

This patch adds support for a proper device-tree.
A proper device-tree on cell contains a "be" node
for each CBE, containing nodes for the SPEs and all
the other special devices on it.
Of course the old-school device-tree is still supported.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 cbe_regs.c |  119 ++++++++++++++++++++++++++++++-----------
 1 file changed, 88 insertions(+), 31 deletions(-)

====== spufs-fix-ctx-lifetimes.diff ======
Subject: Clear mapping pointers after last close

From: Christoph Hellwig <hch at lst.de>

Make sure the pointers to various mappings are cleared once the last
user stopped using them.  This avoids accessing freed memory when
tearing down the gang directory as well as optimizing away
pte invalidations if no one uses these.


Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>


---
diffstat:
 context.c |    1 
 file.c    |  146 +++++++++++++++++++++++++++++++++++++++---
 inode.c   |    1 
 spufs.h   |   12 ++-
 4 files changed, 147 insertions(+), 13 deletions(-)

====== spufs-ensure-preempted-threads-are-on-the-runqueue.diff ======
Subject: spu sched: ensure preempted threads are put back on the runqueue
From: Christoph Hellwig <hch at lst.de>

To not lose a spu thread we need to make sure it always gets put back
on the runqueue.  

Signed-off-by: Christoph Hellwig <hch at lst.de>
Acked-by: Jeremy Kerr <jk at ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>



---
diffstat:
 sched.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

====== spufs-add-missing-wakeup-in-find_victim.diff ======
Subject: spu sched: ensure preempted threads are put back on the runqueue, part2

From: Christoph Hellwig <hch at lst.de>
To not lose a spu thread we need to make sure it always gets put back
on the runqueue.  In find_victim as well as in the scheduler tick, as done
in the previous patch.

Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 sched.c |    6 ++++++
 1 file changed, 6 insertions(+)

====== spufs-use-barriers-for-set_bit.diff ======
Subject: spufs: add memory barriers after set_bit

From: Arnd Bergmann <arnd.bergmann at de.ibm.com>
set_bit does not guarantee ordering on powerpc, so using it
for communication between threads requires explicit
mb() calls.

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---
diffstat:
 sched.c |    3 +++
 1 file changed, 3 insertions(+)

====== spusched-remove-from-runqueue-early.diff ======
Subject: remove woken threads from the runqueue early
From: Christoph Hellwig <hch at lst.de>

A single context should only be woken once, and we should not have
more wakeups for a given priority than the number of contexts on
that runqueue position.

Also add some asserts to trap future problems in this area more
easily.


Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 context.c |    2 +
 sched.c   |   44 ++++++++++++++++--------------------------
 2 files changed, 19 insertions(+), 27 deletions(-)

====== spufs-always-release-mapping-lock.diff ======
Subject: fix missing unlock in spufs_signal1_release
From: Christoph Hellwig <hch at lst.de>

Add a missing spin_unlock in spufs_signal1_release.


Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 file.c |    1 +
 1 file changed, 1 insertion(+)

====== spu_base-move-spu-init-channel-out-of-spu-mutex.diff ======
Subject: spu_base: move spu_init_channels out of spu_mutex

From: Christoph Hellwig <hch at lst.de>
There is no reason to execute spu_init_channels under spu_mutex -
after the spu has been taken off the freelist it's ours.

Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---


---
diffstat:
 spu_base.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

====== cell-rtas-ptcal.diff ======
Subject: cell: enable RTAS-based PTCAL for Cell XDR memory

From: Jeremy Kerr <jk at ozlabs.org>
Enable Periodic Recalibration (PTCAL) support for Cell XDR memory,
using the new ibm,cbe-start-ptcal and ibm,cbe-stop-ptcal RTAS calls.

Tested on QS20 and QS21 (by Thomas Huth). It seems that SLOF has
problems disabling PTCAL, at least on QS20; this patch should only be
used once these problems have been addressed.

Signed-off-by: Jeremy Kerr <jk at ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

--
Update: Updated with Michael's feedback, expanded comment.

---
diffstat:
 ras.c |  160 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 160 insertions(+)

====== axon-ram-2.diff ======
Subject: cell: driver for DDR2 memory on AXON

From: Maxim Shchetynin <maxim.shchetynin at de.ibm.com>

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 Kconfig          |    7 
 sysdev/Makefile  |    1 
 sysdev/axonram.c |  498 +++++++++++++++++++++++++++++++++++
 3 files changed, 506 insertions(+)

====== axon-ram-block-config-fix.diff ======
Subject: axonram bugfix

From: Jens Osterkamp <jens at de.ibm.com>
unlink_gendisk is already called in del_gendisk, so we don't need it here.

Signed-off-by: Jens Osterkamp <jens at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---

---
diffstat:
 axonram.c |    1 -
 1 file changed, 1 deletion(-)

====== spufs-export-expand_stack.diff ======
Subject: spufs: export expand_stack

From: Arnd Bergmann <arnd.bergmann at de.ibm.com>
An SPU can create page faults on the stack, which we need to handle
from a loadable module, so export the expand_stack function used
for this.

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 mmap.c |    1 +
 1 file changed, 1 insertion(+)

====== spufs-pagefault-rework.diff ======
Subject: spufs: make spu page faults not block scheduling

From: Arnd Bergmann <arnd.bergmann at de.ibm.com>
Until now, we have always entered the spu page fault handler
with a mutex for the spu context held. This has multiple
bad side-effects:
- it becomes impossible to suspend the context during
  page faults
- if an spu program attempts to access its own mmio
  areas through DMA, we get an immediate livelock when
  the nopage function tries to acquire the same mutex

This patch makes the page fault logic operate on a
struct spu_context instead of a struct spu, and moves it
from spu_base.c to a new file fault.c inside of spufs.

We now also need to copy the dar and dsisr contents
of the last fault into the saved context to have it
accessible in case we schedule out the context before
activating the page fault handler.

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 arch/powerpc/platforms/cell/spu_base.c          |  103 -----
 arch/powerpc/platforms/cell/spufs/Makefile      |    1 
 arch/powerpc/platforms/cell/spufs/backing_ops.c |    6 
 arch/powerpc/platforms/cell/spufs/fault.c       |  192 ++++++++++
 arch/powerpc/platforms/cell/spufs/hw_ops.c      |    9 
 arch/powerpc/platforms/cell/spufs/run.c         |   28 -
 arch/powerpc/platforms/cell/spufs/spufs.h       |    4 
 arch/powerpc/platforms/cell/spufs/switch.c      |    8 
 include/asm-powerpc/mmu.h                       |    1 
 include/asm-powerpc/spu_csa.h                   |    1 
 10 files changed, 224 insertions(+), 129 deletions(-)

====== export-force-siginfo.diff ======
Subject: export force_sig_info

From: Jeremy Kerr <jk at ozlabs.org>
Export force_sig_info for use by modules. This is required to allow
spufs to provide siginfo data for SPE-generated signals.

Signed-off-by: Jeremy Kerr <jk at ozlabs.org>

---
diffstat:
 signal.c |    1 +
 1 file changed, 1 insertion(+)

====== spufs-provide-siginfo-for-SPE-faults.diff ======
Subject: spufs: provide siginfo for SPE faults

From: Jeremy Kerr <jk at ozlabs.org>
This change populates a siginfo struct for SPE application exceptions
(ie, invalid DMAs and illegal instructions).

Tested on an IBM Cell Blade.

Signed-off-by: Jeremy Kerr <jk at ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 fault.c |   32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

====== cell-be_info-2.diff ======
Subject: cell: add per BE structure with info about its SPUs
From: Andre Detsch <adetsch at br.ibm.com>

Addition of a spufs-global "be_info" array. Each entry contains information
about one BE node, namely:
* list of spus (both free and busy spus are in this list);
* list of free spus (replacing the static spu_list from spu_base.c);
* number of spus;
* number of reserved (non-schedulable) spus.

The SPE affinity implementation actually requires only access to one spu per
BE node (since it implements its own pointer to walk through the other spus
of the ring) and the number of schedulable spus (n_spus - non_sched_spus).
However, having this more general structure can be useful for other
functionality, concentrating per-BE statistics and data.

Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---
diffstat:
 arch/powerpc/platforms/cell/spu_base.c    |   26 ++++++----
 arch/powerpc/platforms/cell/spufs/sched.c |    4 +
 include/asm-powerpc/spu.h                 |   10 +++
 3 files changed, 31 insertions(+), 9 deletions(-)

====== cell-spu_indexing-2.diff ======
Subject: cell: add vicinity information on spus
From: Andre Detsch <adetsch at br.ibm.com>

This patch adds affinity data to each spu instance.
A doubly linked list is created, meant to connect the spus
in the physical order they are placed in the BE. SPUs
near to memory should be marked as having memory affinity.
Adjustment of the fields according to FW properties is done
in separate patches, one for CPBW, one for Malta (the patch for
Malta is under testing).

Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---


---
diffstat:
 arch/powerpc/platforms/cell/spu_base.c |    2 ++
 include/asm-powerpc/spu.h              |    3 +++
 2 files changed, 5 insertions(+)

====== cell-spu_indexing_QS20-2.diff ======
Subject: cell: add hardcoded spu vicinity information for QS20
From: Andre Detsch <adetsch at br.ibm.com>

This patch allows the use of spu affinity on QS20, whose
original FW does not provide affinity information.
This is done through two hardcoded arrays, and by reading the reg
property from each spu.

Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---

---
diffstat:
 spu_base.c |   55 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

====== spufs-affinity_create-4.diff ======
Subject: spufs: extension of spu_create to support affinity definition
From: Andre Detsch <adetsch at br.ibm.com>

This patch adds support for additional flags at spu_create, which relate
to the establishment of affinity between contexts and of contexts to memory.
A fourth, optional, parameter is supported. This parameter represents
an affinity neighbor of the context being created, and is used when defining
SPU-SPU affinity.
Affinity is represented as a doubly linked list of spu_contexts.

Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---

---
diffstat:
 arch/powerpc/platforms/cell/spu_syscalls.c   |   17 +
 arch/powerpc/platforms/cell/spufs/context.c  |    1 
 arch/powerpc/platforms/cell/spufs/gang.c     |    4 
 arch/powerpc/platforms/cell/spufs/inode.c    |  132 +++++++++-
 arch/powerpc/platforms/cell/spufs/spufs.h    |   16 +
 arch/powerpc/platforms/cell/spufs/syscalls.c |   32 ++
 include/asm-powerpc/spu.h                    |    8 
 include/linux/syscalls.h                     |    2 
 8 files changed, 195 insertions(+), 17 deletions(-)

====== spufs-affinity_placement-3.diff ======
Subject: cell: add placement computation for scheduling of affinity contexts
From: Andre Detsch <adetsch at br.ibm.com>

This patch provides the spu affinity placement logic for the spufs scheduler.
Each time a gang is going to be scheduled, the placement of a reference
context is defined. The placement of all other contexts with affinity from
the gang is defined based on this reference context location and on a
precomputed displacement offset.

Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---

---
diffstat:
 gang.c  |    4 -
 sched.c |  134 ++++++++++++++++++++++++++++++++++++++++++++
 spufs.h |    6 +
 3 files changed, 143 insertions(+), 1 deletion(-)

====== spufs-affinity_schedulling-2.diff ======
Subject: spufs: integration of SPE affinity with the scheduler
From: Andre Detsch <adetsch at br.ibm.com>

This patch makes the scheduler honor affinity information for each
context being scheduled. If the context has no affinity information,
behaviour is unchanged. If there is affinity information, the context is
scheduled to run on the exact spu recommended by the affinity
placement algorithm.


Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---

---
diffstat:
 spu_base.c    |   19 +++++++++++++++++++
 spufs/sched.c |    4 ++++
 2 files changed, 23 insertions(+)

====== cell-spu_indexing_FW_vicinity-1.diff ======
Subject: cell: indexing of SPUs based on firmware vicinity properties
From: Andre Detsch <adetsch at br.ibm.com>

This patch links spus according to their physical position using
information provided by the firmware through a special vicinity
device-tree property. This property is present in current version
of Malta firmware.

Example of vicinity properties for a node in Malta:

Node:        Vicinity property contains phandles of:
spe at 0        [ spe at 100000 , mic-tm at 50a000 ]
spe at 100000   [ spe at 0      , spe at 200000    ]
spe at 200000   [ spe at 100000 , spe at 300000    ]
spe at 300000   [ spe at 200000 , bif0 at 512000   ]
spe at 80000    [ spe at 180000 , mic-tm at 50a000 ]
spe at 180000   [ spe at 80000  , spe at 280000    ]
spe at 280000   [ spe at 180000 , spe at 380000    ]
spe at 380000   [ spe at 280000 , bif0 at 512000   ]

Only spe@* have a vicinity property (e.g., bif0 at 512000 and
mic-tm at 50a000 do not have it).

Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 spu_base.c |   90 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 89 insertions(+), 1 deletion(-)

====== spufs-affinity-respecting-numa-properties.diff ======
Subject: spufs: affinity now respecting numa properties

From: Andre Detsch <adetsch at br.ibm.com>

Contexts with affinity are now placed honoring NUMA properties,
the same way contexts without affinity already are.

Signed-off-by: Andre Detsch <adetsch at br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 sched.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

====== re-cell-oprofile-spu-profiling-updated-patch-3.diff ======
Subject: Add support to OProfile for profiling Cell BE SPUs

From: Maynard Johnson <mpjohn at us.ibm.com>

This patch updates the existing arch/powerpc/oprofile/op_model_cell.c
to add in the SPU profiling capabilities.  In addition, a 'cell' subdirectory
was added to arch/powerpc/oprofile to hold Cell-specific SPU profiling
code.

Signed-off-by: Carl Love <carll at us.ibm.com>
Signed-off-by: Maynard Johnson <mpjohn at us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 arch/powerpc/configs/cell_defconfig         |    3 
 arch/powerpc/kernel/time.c                  |    1 
 arch/powerpc/oprofile/Kconfig               |    7 
 arch/powerpc/oprofile/Makefile              |    3 
 arch/powerpc/oprofile/cell/pr_util.h        |   90 +
 arch/powerpc/oprofile/cell/spu_profiler.c   |  220 ++++
 arch/powerpc/oprofile/cell/spu_task_sync.c  |  487 +++++++++
 arch/powerpc/oprofile/cell/vma_map.c        |  279 +++++
 arch/powerpc/oprofile/common.c              |   49 
 arch/powerpc/oprofile/op_model_cell.c       |  505 +++++++++-
 arch/powerpc/oprofile/op_model_power4.c     |   11 
 arch/powerpc/oprofile/op_model_rs64.c       |   10 
 arch/powerpc/platforms/cell/spufs/context.c |   20 
 arch/powerpc/platforms/cell/spufs/sched.c   |    8 
 arch/powerpc/platforms/cell/spufs/spufs.h   |    4 
 drivers/oprofile/buffer_sync.c              |    1 
 drivers/oprofile/event_buffer.h             |   20 
 drivers/oprofile/oprof.c                    |   26 
 include/asm-powerpc/oprofile_impl.h         |   10 
 include/asm-powerpc/spu.h                   |   15 
 include/linux/dcookies.h                    |    1 
 include/linux/elf-em.h                      |    3 
 include/linux/oprofile.h                    |   38 
 kernel/hrtimer.c                            |    1 
 24 files changed, 1723 insertions(+), 89 deletions(-)

====== oprofile-spu-cleanup.diff ======
Subject: cleanup spu oprofile code

From: Arnd Bergmann <arnd.bergmann at de.ibm.com>
This cleans up some of the new oprofile code. It's mostly
cosmetic changes, like the way multi-line comments are formatted.
The most significant change is a simplification of the
context-switch record format.

It does mean the oprofile report tool needs to be adapted,
but I'm sure that it pays off in the end.

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---
diffstat:
 cell/spu_task_sync.c |   89 ++++---------
 op_model_cell.c      |  204 +++++++++++++++++--------------
 2 files changed, 147 insertions(+), 146 deletions(-)

====== miscellaneous-fixes-for-spu-profiling-code.diff ======
Subject: Miscellaneous fixes for SPU profiling code

From: Maynard Johnson <maynardj at us.ibm.com>

After applying the "cleanup spu oprofile code" patch posted by Arnd Bergmann
on Feb 26, 2007, I found a few issues that required fixing up:
  -  Bug fix:  Initialize retval in spu_task_sync.c, line 95, otherwise this
         function returns non-zero and OProfile fails.
  -  Remove unused codes in include/linux/oprofile.h
  -  Compile warnings:  Initialize offset and spu_cookie at lines 283 and 284
         in spu_task_sync.c.

Additionally, in a separate email, Arnd pointed out a bug in
spu_task_sync.c:process_context_switch, where we were ignoring invalid
values in the dcookies returned from get_exec_dcookie_and_offset.  This is
fixed in this patch so that we now fail with ENOENT if either cookie is invalid.

Signed-off-by: Maynard Johnson <maynardj at us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 arch/powerpc/oprofile/cell/spu_task_sync.c |   10 +++-
 include/linux/oprofile.h                   |   23 ++++------
 2 files changed, 17 insertions(+), 16 deletions(-)

====== re-add-support-to-oprofile-for-profiling-cell-be-spus-update-2.diff ======
Subject: Enable SPU switch notification to detect currently active SPU tasks.

From: Maynard Johnson <mpjohn at us.ibm.com>
This patch adds to the capability of spu_switch_event_register so that the
caller is also notified of currently active SPU tasks.  It also exports
spu_switch_event_register and spu_switch_event_unregister.

Signed-off-by: Maynard Johnson <mpjohn at us.ibm.com>
Signed-off-by: Carl Love <carll at us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 run.c   |   16 ++++++++++++++--
 sched.c |   30 ++++++++++++++++++++++++++++--
 spufs.h |    2 ++
 3 files changed, 44 insertions(+), 4 deletions(-)

====== cell-oprofile-compile-fix.diff ======
Subject: Fix oprofile compilation
From: Christoph Hellwig <hch at lst.de>

Fix compilation for CONFIG_OPROFILE=y CONFIG_OPROFILE_CELL=n on cell.


Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 common.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

====== this-is-a-hack-to-get_unmapped_area-to-make-the-spe-64k-code-work.diff ======
Subject: This is a hack to get_unmapped_area to make the SPE 64K code work.

From: Benjamin Herrenschmidt <benh at kernel.crashing.org>
(Though it might prove to not have nasty side effects ...)

The basic idea is that if the filesystem's get_unmapped_area was used,
we skip the hugepage check. That assumes that the only filesystems that
provide a g_u_a callback are either hugetlbfs itself, or filesystems
that have arch specific code that "knows" already not to collide with
hugetlbfs.

A proper fix will be done later, basically by removing the hugetlbfs
hacks completely from get_unmapped_area and calling down to the mm
and/or the filesystem g_u_a implementations for MAP_FIXED as well.

(Note that this will still rely on the fact that filesystems that
provide a g_u_a "know" how to return areas that don't collide with
hugetlbfs, thus the base assumption is the same as in this hack.)

Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 mmap.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

====== powerpc-introduce-address-space-slices.diff ======
Subject: powerpc: Introduce address space "slices"

From: Benjamin Herrenschmidt <benh at kernel.crashing.org>
The basic issue is to be able to do what hugetlbfs does but with
different page sizes for some other special filesystems, more
specifically, my need is:

 - Huge pages

 - SPE local store mappings using 64K pages on a 4K base page size
kernel on Cell

 - Some special 4K segments in 64K-page kernels for mapping a dodgy
species of powerpc-specific infiniband hardware that requires 4K MMU
mappings for various reasons I won't explain here.

The main issues are:

 - To maintain/keep track of the page size per "segments" (as we can
only have one page size per segment on powerpc, which are 256MB
divisions of the address space).

 - To make sure special mappings stay within their allotted
"segments" (including MAP_FIXED crap)

 - To make sure everybody else doesn't mmap/brk/grow_stack into a
"segment" that is used for a special mapping

Some of the necessary mechanisms to handle that were present in the
hugetlbfs code, but mostly in ways not suitable for anything else.

The patch addresses these in various ways, described quickly below, that
hijack some of the existing hugetlbfs callbacks.

The ideal solution requires some changes to the generic
get_unmapped_area(), among others, to get rid of the hugetlbfs hacks in
there, and instead, make sure that the fs and mm get_unmapped_area are
also called for MAP_FIXED. We might also need to add an mm callback to
validate a mapping.

I intend to do those changes separately and then adapt this work to use
them.

So what is a slice? Well, I re-used the mechanism formerly used by our
hugetlbfs implementation which divides the address space in
"meta-segments" which I called "slices". The division is done using
256MB slices below 4G, and 1T slices above. Thus the address space is
divided currently into 16 "low" slices and 16 "high" slices. (Special
case: high slice 0 is the area between 4G and 1T).

Doing so significantly simplifies the tracking of segments and avoids
having to keep track of all the 256MB segments in the address space.

While I used the "concepts" of hugetlbfs, I mostly re-implemented
everything in a more generic way and "ported" hugetlbfs to it. 

Slices can have an associated page size, which is encoded in the mmu
context and used by the SLB miss handler to set the segment sizes. The
hash code currently doesn't care; it has a specific check for hugepages,
though I might add a mechanism to provide per-slice hash mapping
functions in the future.

The slice code provides a pair of "generic" get_unmapped_area()
functions (bottom-up and top-down) that should work with any slice size.
There is some trickiness here, so I would appreciate it if people had a
look at the implementation of these and let me know if I got something
wrong.
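To give a feel for the trickiness, here is a toy userspace model (not
the kernel code) of the bottom-up case over the 16 low slices: the
search has to reject any candidate range that crosses into a slice of
an incompatible page size and then skip past it, which is what makes a
naive first-fit insufficient. All names here are illustrative:

```c
#include <stdint.h>

#define SLICE_SHIFT 28			/* 256MB low slices */
#define SLICE_SIZE  (1ULL << SLICE_SHIFT)

/* Return the lowest address >= hint such that [addr, addr + len) lies
 * entirely in slices marked compatible in 'mask' (bit i set = low
 * slice i may hold mappings of the requested page size), or 0 if
 * nothing below 4GB fits. */
static uint64_t slice_find_bottomup(uint16_t mask, uint64_t hint, uint64_t len)
{
	uint64_t addr = hint;

	while (addr + len <= (1ULL << 32)) {
		unsigned int first = (unsigned int)(addr >> SLICE_SHIFT);
		unsigned int last = (unsigned int)((addr + len - 1) >> SLICE_SHIFT);
		unsigned int i, ok = 1;

		for (i = first; i <= last; i++) {
			if (!(mask & (1u << i))) {
				ok = 0;
				break;
			}
		}
		if (ok)
			return addr;
		/* skip past the incompatible slice and retry */
		addr = ((uint64_t)i + 1) << SLICE_SHIFT;
	}
	return 0;
}
```

The real code additionally has to cope with the 1T high slices, existing
VMAs, and MAP_FIXED requests, but the skip-and-retry structure is the
core of it.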

Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

 arch/powerpc/Kconfig                   |    5 
 arch/powerpc/kernel/asm-offsets.c      |   16 
 arch/powerpc/mm/Makefile               |    1 
 arch/powerpc/mm/hash_utils_64.c        |  124 +++---
 arch/powerpc/mm/hugetlbpage.c          |  528 ---------------------------
 arch/powerpc/mm/mmu_context_64.c       |   10 
 arch/powerpc/mm/slb.c                  |   11 
 arch/powerpc/mm/slb_low.S              |   54 +-
 arch/powerpc/mm/slice.c                |  630 +++++++++++++++++++++++++++++++++
 arch/powerpc/platforms/cell/spu_base.c |    9 
 include/asm-powerpc/mmu.h              |   12 
 include/asm-powerpc/paca.h             |    2 
 include/asm-powerpc/page_64.h          |   87 ++--
 13 files changed, 827 insertions(+), 662 deletions(-)

====== powerpc-add-ability-to-4k-kernel-to-hash-in-64k-pages.diff ======
Subject: powerpc: Add ability to 4K kernel to hash in 64K pages

From: Benjamin Herrenschmidt <benh at kernel.crashing.org>
This patch adds the ability for a kernel compiled with a 4K page size
to have special slices containing 64K pages, and to hash the right type
of hash PTEs for them.

Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

 arch/powerpc/Kconfig              |    6 ++++++
 arch/powerpc/mm/hash_low_64.S     |    5 ++++-
 arch/powerpc/mm/hash_utils_64.c   |   36 +++++++++++++++++++++++-------------
 arch/powerpc/mm/tlb_64.c          |   12 +++++++++---
 include/asm-powerpc/pgtable-4k.h  |    6 +++++-
 include/asm-powerpc/pgtable-64k.h |    7 ++++++-
 6 files changed, 53 insertions(+), 19 deletions(-)

====== powerpc-spufs-support-for-64k-ls-mappings-on-4k-kernels.diff ======
Subject: powerpc: spufs support for 64K LS mappings on 4K kernels

From: Benjamin Herrenschmidt <benh at kernel.crashing.org>
This patch adds an option to spufs when the kernel is configured for
4K page to give it the ability to use 64K pages for SPE local store
mappings.

Currently, we are optimistic and try order 4 allocations when creating
contexts. If that fails, the code will fall back to 4K automatically.

Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

 arch/powerpc/platforms/cell/Kconfig             |   15 +
 arch/powerpc/platforms/cell/spufs/Makefile      |    2 
 arch/powerpc/platforms/cell/spufs/context.c     |    4 
 arch/powerpc/platforms/cell/spufs/file.c        |   77 ++++++++--
 arch/powerpc/platforms/cell/spufs/lscsa_alloc.c |  181 ++++++++++++++++++++++++
 arch/powerpc/platforms/cell/spufs/switch.c      |   28 +--
 include/asm-powerpc/spu_csa.h                   |   10 +
 7 files changed, 282 insertions(+), 35 deletions(-)

====== allow-spufs-to-build-as-a-module-with-slices-enabled.diff ======
Subject: Allow spufs to build as a module with slices enabled

From: Michael Ellerman <michael at ellerman.id.au>
The slice code is missing some exports to allow spufs to build as a
module. Add them.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---
 MODPOST 209 modules
WARNING: ".get_slice_psize" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!
WARNING: ".slice_get_unmapped_area" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!

---
 arch/powerpc/mm/slice.c |    3 +++
 1 file changed, 3 insertions(+)

====== 64k-ls-mappings-fix.diff ======
Subject: fix ls store access with 64k mappings

From: Benjamin Herrenschmidt <benh at kernel.crashing.org>
This is also part of the latest patch posted to cbe-oss-dev.

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
---
diffstat:
 hash_utils_64.c |    9 +++++++++
 1 file changed, 9 insertions(+)

====== ipv6-round-robin-stub.diff ======
Subject: Patch ported from LTC bugzilla report 31558

From: sri at us.ibm.com

We saw similar stack traces with RHEL5 on zSeries (bug #28338) and
iSeries (bug #29263).

There definitely seem to be some bugs in the ipv6 fib insertion code.
Red Hat decided to work around this bug by disabling round-robin
routing for ipv6 until the fib management code is fixed. The following
patch does this.

Acked-by: Linas Vepstas <linas at austin.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
 net/ipv6/route.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

====== mm-fix-alloc_bootmem-on-nodes-without-mem.diff ======
Subject: mm: enables booting a NUMA system where some nodes have no memory

From: Christian Krafft <krafft at de.ibm.com>
When booting a NUMA system with nodes that have no memory (e.g. by
limiting memory), bootmem_alloc_core tried to find pages in an
uninitialized bootmem_map, causing a null pointer access. This fix adds
a check so that NULL is returned instead, enabling the caller
(bootmem_alloc_nopanic) to allocate memory on other nodes without
panicking.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 bootmem.c |    4 ++++
 1 file changed, 4 insertions(+)

====== mm-fix-alloc_bootmem-call-after-bootmem-freed-2.diff ======
Subject: mm: fix call to alloc_bootmem after bootmem has been freed

From: Christian Krafft <krafft at de.ibm.com>
In some cases it might happen that alloc_bootmem is being called after
bootmem pages have already been freed. This is because the condition
SYSTEM_BOOTING is still true after bootmem has been freed.

Signed-off-by: Christian Krafft <krafft at de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

====== early_pfn_in_nid-workaround-2.diff ======
Subject: early_pfn_in_nid() called when not early

From: Arnd Bergmann <arnd.bergmann at de.ibm.com>
After a lot of debugging in spufs, I found that a crash we encountered
on Cell was actually caused by a change in the memory management.

The patch that caused it is archived in
http://lkml.org/lkml/2006/11/1/43, and this one has been discussed back
and forth, but I fear that the current version may be broken for all
setups that do memory hotplug with sparsemem and NUMA, at least on
powerpc.

What happens exactly is that the spufs code tries to register the memory
area owned by the SPU as hotplug memory in order to get page structs (we
probably shouldn't do it that way, but that's a separate discussion).

memmap_init_zone now calls early_pfn_valid() and early_pfn_in_nid()
in order to determine if the page struct should be initialized. This
is wrong for two reasons:

- early_pfn_in_nid checks the early_node_map variable to determine
  to which node the hot plugged memory belongs. However, the new
  memory never was part of the early_node_map to start with, so
  it incorrectly returns node zero, and then fails to initialize
  the page struct if we were trying to add it to a nonzero node.
  This is probably not a problem for pseries, but it is for cell.

- both early_pfn_{in,to}_nid and early_node_map are in the __init
  section and may already have been freed at the time we are calling
  memmap_init_zone().

The patch below is not a suggested fix that I want to get into mainline
(checking slab_is_available() is the wrong approach here), but it is a
quick fix that you should apply if you want to run a recent
(post-2.6.18) kernel on the IBM QS20 blade. I'm sorry for not having
reported this earlier, but we were always trying to find the problem in
my own code...

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

---
diffstat:
 page_alloc.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
