[Skiboot] [PATCH] core/cpu: Initialize all cpu thread areas to avoid invalid memory access.

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Thu Sep 6 15:25:06 AEST 2018


On 09/03/2018 12:25 PM, Oliver wrote:
> On Mon, Sep 3, 2018 at 4:07 PM, Mahesh Jagannath Salgaonkar
> <mahesh at linux.vnet.ibm.com> wrote:
>> On 09/03/2018 11:20 AM, Oliver wrote:
>>> On Sun, Sep 2, 2018 at 3:40 AM, Mahesh J Salgaonkar
>>> <mahesh at linux.vnet.ibm.com> wrote:
>>>> From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>

.../...

>>>
>>> Is this from a cold IPL or MPIPL? Memory should be zeroed before we
>>> enter OPAL so if there is junk in the stack area we might have a data
>>> corruption problem,
>>
>> This is happening during re-IPL (OS reboot). On FSP, per Dean Sanner, OS
>> reboot requests get transformed into an MPIPL. On cold IPL the memory is
>> all zeroed and we don't see this issue during first IPL.
> 
> Right, you should probably mention this is MPIPL specific in the commit message.
> 
> I'm still a little concerned that skiboot is putting junk data in the
> unused cpu_thread structures at all. If it is then I'd expect
> cpu_for_each*() to be broken at runtime even if it boots
> successfully from a cold IPL. I'll take a look at it sometime.

Yesterday we spent some time tracking down the real issue. It turned out that
Hostboot copied 16MB of HDAT data to OPAL memory instead of 8MB. That corrupted
our TCE space and CPU stack.
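
To make the arithmetic of the overrun concrete, here is a minimal C sketch. It
is not Hostboot or skiboot code: the names HDAT_WINDOW_SIZE, copy_overrun() and
clamp_copy_len() are hypothetical, and the 8MB window is an assumption taken
from the sizes mentioned above.

```c
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/* Hypothetical: size of the window assumed to be reserved for HDAT data. */
#define HDAT_WINDOW_SIZE (8 * MB)

/*
 * Return how many bytes of a copy of `len` bytes would land past the end
 * of a destination window of `window` bytes (0 if the copy fits).
 */
static uint64_t copy_overrun(uint64_t window, uint64_t len)
{
	return len > window ? len - window : 0;
}

/*
 * Clamp a requested copy length so it never exceeds the destination
 * window. A real fix would obtain the correct size up front rather than
 * truncate; this only illustrates the bound.
 */
static uint64_t clamp_copy_len(uint64_t window, uint64_t len)
{
	return len > window ? window : len;
}
```

With these assumptions, copy_overrun(HDAT_WINDOW_SIZE, 16 * MB) reports 8MB
written past the window, which is the kind of overrun that would land on
whatever is placed after it.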

Hostboot upstream has the required commit; it needs to be backported to the
relevant FSP builds.


FYI, the HB commit:

commit 3d3d39d62a94da9dc9bc2da73474c9c3400762c4
Author: Mike Baiocchi <mbaiocch at us.ibm.com>
Date:   Thu May 3 09:02:02 2018 -0500

     Get Final HDAT Size from PAYLOAD's SPIRA section


-Vasant


