SQUASHFS errors and OpenBMC hang

Thu Sep 3 08:56:30 AEST 2020

On 9/1/20 4:07 PM, Milton Miller II wrote:
> On September 1, 2020 around 7:36AM in some timezone, Patrick Williams wrote:
>> On Sat, Aug 29, 2020 at 12:40:31AM +0000, Kun Zhao wrote:
>>> I’ve been validating OpenBMC on our POC system for a while, but
>>> starting 2 weeks ago the BMC filesystem sometimes reports failures,
>>> and after that the BMC sometimes hangs after running for a while. It
>>> started to happen on one system and then on another. I tried using a
>>> programmer to re-flash and still see this issue. I tried flashing
>>> back to the very first known-good OpenBMC image we built and still
>>> see the same symptoms. It seems like a SPI ROM failure. But when I
>>> flash back the POC system's original 3rd-party BMC, there is no such
>>> issue at all. Has anyone met similar issues before?
>>
>> Yeah, this does look like a bad SPI NOR.  Have you tried flashing a
>> fresh image to the NOR and then reading it back to confirm all the
>> bits keep their values?  It is possible that the corruption is
>> hitting the other BMC code in a less-important location.
>>
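The read-back check suggested above can be sketched in a few lines of Python. This is only an illustration: it assumes the written image and the dump read back from the NOR are already available as byte strings, and the function name and the 64 KiB block size are hypothetical choices, not anything taken from OpenBMC.

```python
def find_mismatched_blocks(written: bytes, read_back: bytes,
                           block_size: int = 64 * 1024) -> list[int]:
    """Return the start offsets of blocks whose contents differ.

    block_size is an assumed erase-block granularity, used only to
    group the differences into flash-sized regions.
    """
    bad = []
    length = max(len(written), len(read_back))
    for offset in range(0, length, block_size):
        # Slices past the end of a shorter buffer compare as unequal,
        # so a truncated dump is reported as a mismatch too.
        if written[offset:offset + block_size] != read_back[offset:offset + block_size]:
            bad.append(offset)
    return bad
```

An empty result means every block read back identically. Offsets that change between repeated reads of the same chip would point at the part itself rather than at the image.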
>>> [ 3.372932] jffs2: notice: (78) jffs2_get_inode_nodes: Node header CRC failed at 0x3e0aa4. {1985,e002,0000004a,78280c2e}
>>
>> I'm surprised to see anyone using jffs2.  Don't we generally use
>> ubifs in OpenBMC?  Is there a reason you've chosen to use jffs2?
>>
>> I don't necessarily think jffs2 will be better or worse in this
>> particular scenario but we've seen lots of upgrade issues over the
>> years with jffs2.
> The default layout is static partitions with squashfs over mtdblock 
> for the read-only layer and jffs2 for the read-write layer.
>
> The ubifs option is opt-in and the code update supports two images 
> so that a new image is always available.  These options should be 
> orthogonal but in practice are probably tied in the code update 
> repository.
>
> The third option is eMMC support on the sdhci controller.  This
> was prototyped on ast2500 and is in use on the ast2600.
>
> There are some differences in the overlay strategy in the current
> builds, but I will support anyone willing to test merging the new
> limited writable directories from ubifs and emmc into the static mtd
> layout.   This means I'm willing to update the init scripts.
Thank you, Milton, for the comments. Can I install a ubifs image onto a static-partitioned BMC through code update, or do I have to program it directly to the NOR flash?
>>> The BMC debug console shows the same SQUASHFS error as above. Checking
>>> filesystem usage, we could see rwfs usage keep increasing, like this:
>>> root@dgx:~# df
>>> Filesystem     1K-blocks  Used Available Use% Mounted on
>>> dev               212904     0    212904   0% /dev
>>> tmpfs             246728 20172    226556   8% /run
>>> /dev/mtdblock4     22656 22656         0 100% /run/initramfs/ro
>>> /dev/mtdblock5      4096   880      3216  21% /run/initramfs/rw
>>> cow                 4096   880      3216  21% /
>>> tmpfs             246728     8    246720   0% /dev/shm
>>> tmpfs             246728     0    246728   0% /sys/fs/cgroup
>>> tmpfs             246728     0    246728   0% /tmp
>>> tmpfs             246728     8    246720   0% /var/volatile
>>>
>>> and can see more and more ipmid coredump files,
>> This implies to me that we need to adjust the systemd recovery for
>> ipmid.  We shouldn't just keep re-launching the same process over and
>> over after a coredump.  Systemd has some thresholding capability.
>>
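The thresholding idea can be illustrated with a small model. In systemd itself the relevant knobs are the unit settings StartLimitIntervalSec= and StartLimitBurst=. The Python below is only a sliding-window sketch of the concept, not systemd's implementation, and the class name and parameters are hypothetical.

```python
from collections import deque


class RestartLimiter:
    """Allow at most `burst` starts within any `interval_s` seconds."""

    def __init__(self, burst: int, interval_s: float):
        self.burst = burst
        self.interval_s = interval_s
        self._starts = deque()  # timestamps of recent allowed starts

    def allow_start(self, now: float) -> bool:
        # Forget starts that have aged out of the window.
        while self._starts and now - self._starts[0] >= self.interval_s:
            self._starts.popleft()
        if len(self._starts) >= self.burst:
            return False  # too many recent restarts: stop relaunching
        self._starts.append(now)
        return True
```

With burst=3 and interval_s=10, a daemon that crashes once per second is relaunched three times and then left stopped, instead of filling the writable filesystem with coredumps.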
> I've seen problems in the past where the squashfs image was bigger
> than the allotted space and it became partially overwritten by the
> jffs2 writable filesystem.   We added code that tries to catch this
> and have seen such reports but wanted to bring it up.
Do you still have links to those issues?
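The oversize condition described above can be checked offline before flashing. The sketch below is not the check that exists in OpenBMC. It assumes the squashfs 4.0 superblock layout, with the magic at offset 0 and the little-endian 64-bit bytes_used field at offset 40. The function names are hypothetical, and the partition size must come from your own flash layout.

```python
import struct

SQUASHFS_MAGIC = 0x73717368  # bytes "hsqs" read as a little-endian u32


def squashfs_bytes_used(header: bytes) -> int:
    """Return the filesystem size recorded in a squashfs superblock."""
    if len(header) < 48:
        raise ValueError("need at least the first 48 bytes of the image")
    (magic,) = struct.unpack_from("<I", header, 0)
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian squashfs image")
    # Assumed layout: bytes_used is the u64 at offset 40.
    (bytes_used,) = struct.unpack_from("<Q", header, 40)
    return bytes_used


def fits_partition(header: bytes, partition_size: int) -> bool:
    """True if the image fits in the read-only partition."""
    return squashfs_bytes_used(header) <= partition_size
```

To use it, read the first 48 bytes of the image file and pass the size of the read-only partition in bytes. A False result means the tail of the squashfs would land in the space that the read-write filesystem later overwrites.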
>  Also we don't
> support the host accessing the flash controller while linux is up in
> case your host is trying to flash the bmc bios (or even read it
> directly); all data must go through an API such as IPMI or REST.
Do you mean that if the BMC is up and running and I use a tool like socflash to program the BMC directly from the host OS, there will be a problem?
>>> I found the following actions could trigger this failure,
>>>
>>>
>>>   1.  Log in to the BMC debug console remotely over SSH; it shows
>>> this error when triggered:
>>> $ ssh root@<bmc ip>
>>> ssh_exchange_identification: read: Connection reset by peer
>>>
>>>
>>>   2.  Set the BMC MAC address with fw_setenv in the BMC debug console,
>>> reboot the BMC, and run 'ip -a'.
>>
>> I have no idea why this procedure would solve SPI NOR issues.  It
>> doesn't seem connected on the surface.
>>
>>> The code is based on upstream commit 5ddb5fa99ec259 on the master branch.
>>> The flash layout definition is the default openbmc-flash-layout.dtsi.
>>> The SPI ROM is a Macronix MX25L25635F.
>>>
>>> Some questions,
>>>
>>>   1.  Any SPI lock feature enabled in OpenBMC?
>>>   2.  If yes, do I have to unlock the u-boot-env partition before fw_setenv?
>>
>> There is not, to my knowledge, a software SPI lock.  Some machines
>> have a 'golden' NOR which they enable by, in hardware, setting the
>> write-protect input pin on the SPI NOR (with a strapping resistor).
>> Does your machine use this mechanism?  If so, it is possible that
>> you're booting onto the 'wrong' NOR flash in some conditions and a
>> reboot resets the chip-select logic in the SPI controller.  (Usually,
>> you have the watchdog configured to automatically swap the
>> chip-select after some number of boot failures.)
>>
>> -- 
>> Patrick Williams
> Our default is that the os is in control of the flash and we do not
> mark any areas as read-only.

Got it. Thanks for the confirmation.

> milton
> ---
> I speak only for myself.  But I have written or reviewed the layouts 
> and initrd scripting.
>
Kun