[PATCH skiboot/pflash] libflash/libffs: Detect first byte flash corruption

Cédric Le Goater clg at kaod.org
Fri Jun 30 23:34:31 AEST 2017


On 06/29/2017 12:40 AM, Cyril Bur wrote:
> This patch is not for upstream but currently this unknown corruption is
> making life difficult.

From what I have seen on a couple of affected systems, the flash data
itself is not corrupted; only the first byte read back is bogus. After
a reboot, everything returns to normal.

So we might have an issue with the optimized read settings and/or
fast-read. This is difficult to pin down.

Has anyone seen this issue on the openbmc 4.7 kernel?
 
Thanks,

C.


> Perhaps it can be useful at least temporarily.
> 
> Feel free to remove the option to continue and leave just the
> information message.
> 
> Signed-off-by: Cyril Bur <cyril.bur at au1.ibm.com>
> ---
>  libflash/libffs.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/libflash/libffs.c b/libflash/libffs.c
> index 7ae9050e..dce6237e 100644
> --- a/libflash/libffs.c
> +++ b/libflash/libffs.c
> @@ -260,6 +260,35 @@ int ffs_init(uint32_t offset, uint32_t max_size, struct blocklevel_device *bl,
>  
>  	/* Convert and check flash header */
>  	rc = ffs_check_convert_header(&f->hdr, &raw_hdr);
> +	/*
> +	 * A bug on witherspoon of currently unknown origin causes
> +	 * corruption of the first byte of flash. This is by no means
> +	 * a fix and should never go upstream but may prove helpful.
> +	 */
> +	if (rc == FFS_ERR_BAD_MAGIC) {
> +		/*
> +		 * We can set the first byte to what we know it should be
> +		 * and try to check and convert header again. If it
> +		 * succeeds there is a very high chance this is the
> +		 * problem.
> +		 */
> +		*(char *)&raw_hdr = 'P';
> +		rc = ffs_check_convert_header(&f->hdr, &raw_hdr);
> +		if (rc) {
> +			/*
> +			 * Turn it back into BAD_MAGIC, the flash obviously
> +			 * has bigger problems.
> +			 */
> +			rc = FFS_ERR_BAD_MAGIC;
> +		} else {
> +			FL_ERR("FFS: Detected corruption of the first byte of flash.\n");
> +			FL_ERR("     Recommend hard power cycling the entire machine\n");
> +			FL_ERR("     Continuing is dangerous!\n");
> +			FL_ERR("     Continuing in 10 seconds\n");
> +			sleep(10);
> +			rc = 0;
> +		}
> +	}
>  	if (rc) {
>  		FL_INF("FFS: Flash header not found. Code: %d\n", rc);
>  		goto out;
> 
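
For illustration only, the check the patch relies on can be sketched
without ffs_check_convert_header(). This is a minimal standalone
sketch, not the libffs code: classify_magic() and the enum are
hypothetical names, and it assumes the FFS header starts with the four
ASCII bytes "PART" (the patch itself only confirms that the first byte
should be 'P'). It flags the case where only the first byte differs,
which is what the retry in the patch is trying to detect.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: the FFS header begins with the ASCII bytes "PART". */
static const unsigned char ffs_magic[4] = { 'P', 'A', 'R', 'T' };

/* Hypothetical result codes, not part of libffs. */
enum magic_status {
	MAGIC_OK,             /* all four magic bytes match */
	MAGIC_FIRST_BYTE_BAD, /* only byte 0 differs: the suspected bug */
	MAGIC_BAD,            /* anything else: a real header problem */
};

/* Classify the first bytes of a raw header buffer. */
static enum magic_status classify_magic(const unsigned char *hdr, size_t len)
{
	if (len < sizeof(ffs_magic))
		return MAGIC_BAD;

	if (memcmp(hdr, ffs_magic, sizeof(ffs_magic)) == 0)
		return MAGIC_OK;

	/* Bytes 1..3 are intact, so only the first byte read is wrong. */
	if (memcmp(hdr + 1, ffs_magic + 1, sizeof(ffs_magic) - 1) == 0)
		return MAGIC_FIRST_BYTE_BAD;

	return MAGIC_BAD;
}
```

Unlike the patch, this only looks at the magic and does not re-validate
the rest of the header, so a MAGIC_FIRST_BYTE_BAD result is a hint
rather than proof that the remaining header is sane.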


