[Skiboot] [PATCH] opal: Fix an issue where partial LID load causes opal to hang.

Mahesh J Salgaonkar mahesh at linux.vnet.ibm.com
Tue Mar 31 05:20:18 AEDT 2015


From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>

Commit c789772 introduced an asynchronous mechanism to load LID resources on
FSP systems. After this change, some FSP-based systems failed to load and
boot the petitboot kernel.

While fetching a LID resource in multiple chunks, we depend on the return
status from the FSP to tell whether there is more data available to fetch.
As per the FSP mailbox documentation, the fetch command returns status=2 when
more data is pending and status=0 when we have reached end-of-file.
In reality the FSP doesn't behave as documented. It looks like we
always get status=0 irrespective of whether end of file is reached or not.

The old implementation (fsp_sync_msg) relied on a (wlen < chunk) check
to decide whether we had reached end of file.

Ideally, the FSP folks should fix their code as per the documentation. Until
they do, add the old check back here.

Without this patch some systems won't be able to boot into the petitboot kernel.

Signed-off-by: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
---
 hw/fsp/fsp.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)
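
For reviewers, here is a rough standalone sketch of the short-read check
described above. It is not the skiboot code: fake_lid, fake_fetch() and
fetch_lid() are hypothetical names made up for illustration, and the fake
fetch always returns status 0 to mimic the behaviour we observe from the FSP.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a LID held by the FSP. */
struct fake_lid {
	const unsigned char *data;
	size_t size;
};

/*
 * Copy up to 'chunk' bytes starting at 'offset' into 'buf' and store
 * the number of bytes written in *wlen. Always returns 0, mimicking
 * the observed (not the documented) status from the FSP.
 */
static int fake_fetch(const struct fake_lid *lid, size_t offset,
		      void *buf, size_t chunk, size_t *wlen)
{
	size_t avail = offset < lid->size ? lid->size - offset : 0;

	*wlen = avail < chunk ? avail : chunk;
	if (*wlen)
		memcpy(buf, lid->data + offset, *wlen);
	return 0;
}

/*
 * Fetch a LID in chunks of at most 'max_chunk' bytes.
 * Returns 0 with the total length in *length once a short read is
 * seen. Returns -1 on a fetch error, or if the buffer fills up before
 * any short read (this sketch cannot tell whether data was truncated).
 */
static int fetch_lid(const struct fake_lid *lid, unsigned char *buf,
		     size_t bufsz, size_t max_chunk, size_t *length)
{
	size_t done = 0;

	while (done < bufsz) {
		size_t chunk = bufsz - done;
		size_t wlen;

		if (chunk > max_chunk)
			chunk = max_chunk;
		if (fake_fetch(lid, done, buf + done, chunk, &wlen) != 0)
			return -1;
		done += wlen;
		/* Status is useless here, so a short read means end of file. */
		if (wlen < chunk) {
			*length = done;
			return 0;
		}
	}
	return -1;
}
```

Note that when the LID size is an exact multiple of the chunk size, the
sketch needs one extra fetch that writes zero bytes before it sees the
short read and stops.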

diff --git a/hw/fsp/fsp.c b/hw/fsp/fsp.c
index e5a3c1d..c57052f 100644
--- a/hw/fsp/fsp.c
+++ b/hw/fsp/fsp.c
@@ -2156,6 +2156,7 @@ struct fsp_fetch_lid_item {
 	void *buffer;
 	size_t *length;
 	size_t remaining;
+	size_t chunk_requested;
 	struct list_node link;
 	int result;
 };
@@ -2269,7 +2270,20 @@ static void fsp_fetch_lid_complete(struct fsp_msg *msg)
 		return;
 	}
 
-	if (rc == 0)
+	/*
+	 * As per documentation, rc=2 means end of file not reached and
+	 * rc=1 means we reached end of file. But it looks like we always
+	 * get rc=0 irrespective of whether end of file is reached or not.
+	 * The old implementation (fsp_sync_msg) used to rely on
+	 * (wlen < chunk) to decide whether we reached end of file.
+	 *
+	 * Ideally FSP folks should fix their code as per documentation,
+	 * but until they do, add the old check (hack) back here.
+	 *
+	 * Without this hack some systems would load partial lid and won't
+	 * be able to boot into petitboot kernel.
+	 */
+	if (rc == 0 && (wlen < last->chunk_requested))
 		last->result = OPAL_SUCCESS;
 
 	fsp_freemsg(msg);
@@ -2317,6 +2331,7 @@ static void fsp_fetch_lid_next_chunk(struct fsp_fetch_lid_item *last)
 	if (chunk > (PSI_DMA_FETCH_SIZE - boff))
 		chunk = PSI_DMA_FETCH_SIZE - boff;
 	last->bsize = ((boff + chunk) + TCE_MASK) & ~TCE_MASK;
+	last->chunk_requested = chunk;
 
 	prlog(PR_DEBUG, "FSP: Loading Chunk 0x%08x bytes balign=%llx"
 	      " boff=%llx bsize=%llx\n",


