boot failure when read-write fs is full

Heyi Guo guoheyi at linux.alibaba.com
Tue Apr 12 15:58:51 AEST 2022


Hi Rohit,

We hit a similar issue. It appears to be caused directly by the failure
of "mkdir -p $upper $work" in obmc-init.sh. Our workaround is the patch
below: instead of removing the directory /run/initramfs/rw/work and
recreating it, we only remove its contents.

This may leave the rwfs read-only when it is full, but it does not
trigger a kernel panic, so we still have a chance to repair it. We are
planning to send the patch to OpenBMC Gerrit, but have not done so yet.

diff --git a/meta-phosphor/recipes-phosphor/initrdscripts/files/obmc-init.sh b/meta-phosphor/recipes-phosphor/initrdscripts/files/obmc-init.sh
index e61ede9111..d4425b56b1 100644
--- a/meta-phosphor/recipes-phosphor/initrdscripts/files/obmc-init.sh
+++ b/meta-phosphor/recipes-phosphor/initrdscripts/files/obmc-init.sh
@@ -411,7 +411,15 @@ HERE
         debug_takeover "$msg"
  fi

-rm -rf $work
+# Empty workdir; do not remove workdir itself for it will fail to recreate it if
+# RWFS is full
+if [ -d $work ]
+then
+    cd $work
+    ls -a | grep -v -E '^\.$|^\.\.$' | xargs rm -rf
+    cd -
+fi
+
  mkdir -p $upper $work

  mount -t overlay -o lowerdir=$rodir,upperdir=$upper,workdir=$work cow /root
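
One caveat on the patch above: the "ls -a | grep | xargs" pipeline splits
names on whitespace, so an entry whose name contains a space or newline
would not be removed cleanly. A possible alternative is sketched below. It
is only a sketch: empty_dir is a hypothetical helper name that is not in
obmc-init.sh, and it assumes the find in the initramfs supports -mindepth
and -maxdepth (GNU find does; a busybox build may or may not, depending on
its configuration).

```shell
#!/bin/sh
# Hypothetical helper (not part of obmc-init.sh): remove everything inside
# a directory, dotfiles included, while keeping the directory itself.
empty_dir() {
    dir=$1
    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0
    # -mindepth 1 skips "$dir" itself; -maxdepth 1 selects only its direct
    # children, and rm -rf then removes each child recursively.
    # Assumption: find supports -mindepth/-maxdepth.
    find "$dir" -mindepth 1 -maxdepth 1 -exec rm -rf -- {} \;
}
```

In the script this would be called as: empty_dir "$work", just before the
existing "mkdir -p $upper $work" line. It also avoids the cd / cd - pair,
so the working directory of the script is never changed.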

Heyi


On 2022/4/12 at 1:51 AM, Rohit Pai wrote:
> Hello All,
>
> Currently I am investigating the boot failures I see on our OpenBMC-based boards when the rw-fs is full.
> The rw-fs can become full for many reasons. One example is overly frequent BMC dump creation, since the dumps are stored in the rw-fs.
>
> I have allocated 16MB for the read-write fs. As part of the init sequence, an overlay file system that combines the ro-fs and the rw-fs is mounted as the root-fs.
>
>
> mount -t overlay -o lowerdir=$rodir,upperdir=$upper,workdir=$work cow /root
>
> The 'mount overlay' command above fails with the error below when the upperdir, which is on the rw-fs, is full.
>
>
> chroot: can't execute '/bin/sh': No such file or directory
>
> Unable to confirm /sbin/init is an executable non-empty file
>
> in merged file system mounted at /root.
>
> Change Root test failed!
>
> Fatal error, triggering kernel panic!
>
> Basically, when the overlayfs mount fails, there is no rootfs mounted on /. So I think the subsequent init sequence fails.
>
> I would be very interested to know whether anyone has thoughts on solving this issue, or has encountered it and already found a solution.
> One solution I have in mind is to catch the failure of the mount overlay command and run a self-clean-up procedure on the rw-fs with a whitelist policy.
> Thanks for any input.
>
> Regards,
> Rohit PAI
>
>
>
>

