Crash (ext3) during 2.6.29-rc6 boot

Wed Feb 25 12:27:38 EST 2009

On Wed, 25 Feb 2009 05:01:59 am Geert Uytterhoeven wrote:
> On Mon, 23 Feb 2009, Paul Mackerras wrote:
> > Andrew Morton writes:
> > > It looks like we died in ext3_xattr_block_get():
> > > 
> > > 		memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs),
> > > 		       size);
> > > 
> > > Perhaps entry->e_value_offs is no good.  I wonder if the filesystem is
> > > corrupted and this snuck through the defenses.
> > > 
> > > I also wonder if there is enough info in that trace for a ppc person to
> > > be able to determine whether the faulting address is in the source or
> > > destination of the memcpy() (please)?
> > 
> > It appears to have faulted on a load, implicating the source.  The
> > address being referenced (0xc00000003f380000) doesn't look
> > outlandish.  I wonder if this kernel has CONFIG_DEBUG_PAGEALLOC turned
> > on, and what page size is selected?
> 
> I'm seeing a similar thing on PS3, but not in ext3. During early userspace
> setup (udevd), it crashes accessing a 0xc00* address in:
> 
> | NIP setup+0x20/0x130
> | LR copy_user_page+0x18/0x6c
> | Call trace:
> | do_wp_page+0x5b4/0x89c
> | do_page_fault+0x3a8/0x58c
> | handle_page_fault+0x20/0x5c
> 
> I have CONFIG_DEBUG_PAGEALLOC=y. If I disable it, the system boots fine.
> 
> If needed, I can probably bisect this tomorrow. It definitely didn't happen in
> 2.6.29-rc5.

No need to bisect - it was 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556, my
commit that "optimised" 64bit memcpy() for Power6 and Cell.

The bug has been there since -rc1, but copies whose source was 8-byte
aligned would not have hit it. Could that be why you didn't see it in
-rc5?
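
A rough userspace sketch of why source alignment could matter here. It
is only a model, not the actual memcpy() code: it assumes (hypothetically)
that the copy issues 8-byte loads at src, src+8, ... until the length is
covered, and it assumes a 4K page size, which nobody in this thread has
confirmed. The helper names are made up for illustration. With
CONFIG_DEBUG_PAGEALLOC, free pages are unmapped, so a load that strays
into a neighbouring free page faults instead of passing unnoticed.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed page size, for illustration only; the thread does not say
 * which page size is configured. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Nonzero if the source address is 8-byte (doubleword) aligned. */
static int src_is_dword_aligned(uint64_t src)
{
	return (src & 7ULL) == 0;
}

/*
 * Hypothetical model of a doubleword copy loop: 8-byte loads are issued
 * at src, src + 8, ... until len bytes are covered, so the final load
 * may read up to 7 bytes past the end of the source.
 *
 * Returns nonzero if that final load touches a page that holds no valid
 * source byte.
 */
static int overread_crosses_page(uint64_t src, size_t len)
{
	uint64_t loads, last_valid, last_read;

	if (len == 0)
		return 0;

	loads = ((uint64_t)len + 7) / 8;   /* number of 8-byte loads */
	last_valid = src + len - 1;        /* last byte that belongs to the source */
	last_read = src + loads * 8 - 1;   /* last byte the loads actually touch */

	return (last_valid / SKETCH_PAGE_SIZE) != (last_read / SKETCH_PAGE_SIZE);
}
```

In this model a 4-byte copy from 0xc00000003f37fffc (not 8-byte aligned)
reads into the page at 0xc00000003f380000, while the same copy from
0xc00000003f37fff8 (aligned) stays inside its own page. The addresses
are picked to sit next to the one Paul quoted, purely as an example.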

I'll work on a fix now.

Thanks!

Mark