bug in drivers/edac/mpc85xx_edac.c:mpc85xx_mc_check()

Andrew Morton akpm at linux-foundation.org
Wed Apr 29 17:37:04 EST 2009


Let's cc the suitable people.

On Tue, 28 Apr 2009 18:23:42 -0700 "Jeff Haran" <jharan at Brocade.COM> wrote:

> Hi,
> 
> Recent versions of this function contain the following snippets:
> 
>     if (err_detect & DDR_EDE_SBE)
>         edac_mc_handle_ce(mci, pfn, err_addr & PAGE_MASK,
>                   syndrome, row_index, 0, mci->ctl_name);
> 
>     if (err_detect & DDR_EDE_MBE)
>         edac_mc_handle_ue(mci, pfn, err_addr & PAGE_MASK,
>                   row_index, mci->ctl_name);
> 
> I am pretty sure the references to PAGE_MASK should be preceded by a
> tilde, as in:
> 
>     if (err_detect & DDR_EDE_SBE)
>         edac_mc_handle_ce(mci, pfn, err_addr & ~PAGE_MASK,
>                   syndrome, row_index, 0, mci->ctl_name);
> 
>     if (err_detect & DDR_EDE_MBE)
>         edac_mc_handle_ue(mci, pfn, err_addr & ~PAGE_MASK,
>                   row_index, mci->ctl_name);
> 

Could well be.  PAGE_MASK is very easy to get wrong.  I've _never_
trusted my own memory of it and I always have to go back to the
definition when reviewing code :(
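
For reference, a minimal userspace sketch of the masks in question. It
assumes 4 KiB pages and redefines the macros locally in the usual kernel
style; the helper names are made up for illustration. If the third
argument to edac_mc_handle_ce()/edac_mc_handle_ue() is meant to be the
offset within the page, then `& ~PAGE_MASK` is the form that produces it.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, mirroring the usual kernel-style definitions. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))   /* high bits set, low 12 bits clear */

/* Page frame number: the address with the in-page bits shifted out. */
static unsigned long addr_to_pfn(unsigned long addr)
{
    return addr >> PAGE_SHIFT;
}

/* Offset within the page: keep only the low PAGE_SHIFT bits. */
static unsigned long addr_to_page_offset(unsigned long addr)
{
    unsigned long off = addr & ~PAGE_MASK;

    assert(off < PAGE_SIZE);
    return off;
}

/* Page base: what `addr & PAGE_MASK` yields (low bits cleared). */
static unsigned long addr_to_page_base(unsigned long addr)
{
    return addr & PAGE_MASK;
}
```

With an address of 0x12345678, the pfn is 0x12345, the in-page offset is
0x678, and `addr & PAGE_MASK` gives the page base 0x12345000.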

> Much as I would like to submit a tested patch like the rest of the
> world, I find myself in the situation where the only Freescale target
> system I have to test on is running a 3 year old kernel (2.6.14), which
> precedes the introduction of EDAC driver support, at least for
> Freescale. So the best I can do is borrow from the new EDAC driver and
> backport it to the old kernel.
> 
> But I have learned a few things in this process and can thus share what
> I've learned as it may be of help to the EDAC driver developers:
> 
> 1) Before you read the Freescale 8548 CAPTURE_ADDRESS register, you want
> to read CAPTURE_ATTRIBUTES first and make sure the VLD bit (least
> significant bit in the register) is set, or else the data in
> CAPTURE_ADDRESS may not yet be valid.
> 
> 2) When you are done scrubbing the memory with the single bit error, you
> want to write 0 to CAPTURE_ATTRIBUTES so as to clear VLD and thus set up
> the ECC capture logic to capture the next single bit error.
> 
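
The two points above could be sketched roughly as follows. This is a
userspace illustration only: the struct is a stand-in for the memory
controller's registers and not the real 8548 layout, and the names
ecc_capture_regs, read_captured_address() and rearm_capture() are
hypothetical. A real driver would go through the platform's MMIO
accessors rather than plain loads and stores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* VLD is the least significant bit, per the description above. */
#define CAPTURE_ATTR_VLD 0x1u

/* Hypothetical register view for illustration; not the real 8548 layout. */
struct ecc_capture_regs {
    volatile uint32_t capture_attributes;
    volatile uint32_t capture_address;
};

/*
 * Step 1: only trust CAPTURE_ADDRESS once CAPTURE_ATTRIBUTES has VLD set.
 * Returns true and stores the address if valid, false otherwise.
 */
static bool read_captured_address(struct ecc_capture_regs *regs, uint32_t *addr)
{
    assert(regs != NULL && addr != NULL);

    if (!(regs->capture_attributes & CAPTURE_ATTR_VLD))
        return false;

    *addr = regs->capture_address;
    return true;
}

/*
 * Step 2: after scrubbing, write 0 to CAPTURE_ATTRIBUTES to clear VLD so
 * the capture logic can latch the next single-bit error.
 */
static void rearm_capture(struct ecc_capture_regs *regs)
{
    assert(regs != NULL);
    regs->capture_attributes = 0;
}
```

The intended order in an error handler would be: call
read_captured_address(), report and scrub if it returns true, then call
rearm_capture().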




More information about the Linuxppc-dev mailing list