marcello kernel & ppc64

linas at austin.ibm.com
Fri Nov 14 09:53:05 EST 2003


On Thu, Nov 13, 2003 at 03:10:02PM -0600, Hollis Blanchard wrote:
> On Thursday, Nov 13, 2003, at 14:38 US/Central, linas at austin.ibm.com
> wrote:
> >
> > Nothing critical per se, it's just that
> > diff -r marcello-2.4.23pre9/arch/ppc64/kernel sles8/arch/ppc64/kernel
> > shows a vast number of differences.  I would be much happier if these
> > two dirs were nearly identical.
>
> Forget SLES8; they're always going to apply hundreds of patches and
> that's their business (take it up with them).

No, I was talking about the ppc64 bits specifically, not about the
general kernel.  I would really expect the ppc64 bits to be nearly
identical across all vendors, since I can assure you that SuSE is
not developing new ppc64 code.

I'm not sure where SuSE pulls their ppc64 code from; I'm guessing it
comes from ameslab.  I'm guessing those two are real close, but I don't
have an ameslab tree anywhere to look at.

=================
Which takes me to another, ahem, politically touchy point, I suspect.
There's a test lab here that pounds the crap out of SuSE kernels.
Speaking from personal experience, 9 out of 10 or 19 out of 20 of
the kernel crashes & hangs that they find are *not* in the ppc64 code,
but are in the generic Linux kernel code.  These are races of
various sorts, missing locks, data corruptions, you name it, I've
seen it.  These get fixed in the SuSE kernel.

And that's what bugs me... I'm perfectly aware that many/most of
these bugs are also in the marcello kernel.  (For example, the latest:
a race condition on setting/resetting of current->need_resched,
which will make a heavily loaded machine go idle.  The setting of
need_resched uses no locks or semaphores of any kind, but is
used as if it were always correct.)  But the patches don't go out
to the LKML because ... well, it can be hard to defend a patch when
the bug was never seen on a marcello kernel.  So I'm a tad concerned
that there's a bit of a disconnect not only at the source code level,
but also at the test level.

(OK, you're right, I need to post the need_resched thing to LKML. This
one, at least, should be easy to explain).
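
As an illustration only, here is a user-space C sketch of the general
kind of lost update described above.  It is not the 2.4 scheduler code
and not the actual bug: the function names and the single-threaded
"replay" of one bad interleaving are assumptions made for the example.
The first function reads and resets a plain flag in two separate steps,
so a request that lands in between is wiped out; the second reads and
resets in one atomic step, so a later request survives.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a per-task "reschedule requested" flag.
 * Each function replays one fixed interleaving of a "scheduler" and a
 * "waker" (e.g. an interrupt) in a single thread, so the outcome is
 * deterministic. */

/* Racy pattern: the flag is read, then reset in a separate step. */
int racy_sequence(void)
{
    int need_resched = 0;

    int snapshot = need_resched;  /* scheduler: sees no request          */
    need_resched = 1;             /* waker: asks for a reschedule        */
    need_resched = 0;             /* scheduler: resets on stale snapshot */

    assert(snapshot == 0);        /* scheduler never noticed the request */
    return need_resched;          /* 0: the request has been lost        */
}

/* Safer pattern: read and reset happen as one atomic exchange. */
int safe_sequence(void)
{
    atomic_int need_resched = 0;

    int snapshot = atomic_exchange(&need_resched, 0); /* read + reset */
    atomic_store(&need_resched, 1);                   /* waker: request */

    assert(snapshot == 0);             /* nothing was pending at first  */
    return atomic_load(&need_resched); /* 1: request is still pending   */
}
```

In the racy replay the machine would go idle with work pending, since
nothing is left to tell the scheduler to run again; in the atomic replay
the next check of the flag still sees the request.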

--linas


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/




