[Skiboot] [PATCH v3 1/6] occ: Wait if OCC GPU presence status not immediately available

Andrew Donnellan andrew.donnellan at au1.ibm.com
Fri Aug 31 14:15:58 AEST 2018


It takes a few seconds for the OCC to set everything up in order to read
GPU presence. At present, we try to kick off OCC initialisation as early as
possible to maximise the time it has to read GPU presence.

Unfortunately that is sometimes not enough, so add a loop in
occ_get_gpu_presence(): the first time we try to get GPU presence, we
retry every 100 ms for up to 2 seconds (20 attempts). Experimentally this
seems to be adequate.
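
For illustration, the retry budget described above can be sketched in
isolation as below. This is a minimal sketch, not the skiboot code: the
struct, the fake_* helpers and the poll_state type are hypothetical
stand-ins so that the example is self-contained, and the wait helper only
records the delay instead of sleeping. The patch itself keeps the retry
count and the "found" flag in function-local statics; the sketch passes
them in explicitly so the behaviour is easier to exercise.

```c
#include <assert.h>
#include <stdbool.h>

#define POLL_INTERVAL_MS 100
#define MAX_RETRIES      20	/* 20 * 100 ms = 2 s worst case */

/* Hypothetical stand-in for the OCC dynamic data; only the two version
 * fields checked by the patch are modelled. */
struct fake_occ_data {
	int major_version;
	int minor_version;
};

/* State that the patch keeps in function-local statics. */
struct poll_state {
	int retries_left;
	bool found;
};

static struct fake_occ_data fake_data;	/* all zero: "not ready yet" */
static int polls_until_ready;		/* simulation knob, 0 = never ready */
static int total_waited_ms;		/* how long we would have slept */

/* Stand-in for time_wait_ms(): records the delay and, after the
 * configured number of waits, makes the fake data look valid. */
void fake_wait_ms(int ms)
{
	total_waited_ms += ms;
	if (polls_until_ready > 0 && --polls_until_ready == 0) {
		fake_data.major_version = 0;
		fake_data.minor_version = 1;
	}
}

/* Same validity test as the patch: version 0.1 or later in the 0.x line. */
bool data_ready(const struct fake_occ_data *d)
{
	return d->major_version == 0 && d->minor_version >= 1;
}

/* Poll until the data is valid or the retry budget is used up. Because
 * the state persists, later calls return immediately either way. */
bool wait_for_presence_data(struct poll_state *st)
{
	while (!st->found && st->retries_left > 0) {
		if (data_ready(&fake_data)) {
			st->found = true;
			break;
		}
		fake_wait_ms(POLL_INTERVAL_MS);
		st->retries_left--;
	}
	return st->found;
}
```

Because the budget lives in persistent state, only the first caller pays
the delay: once the data has been seen, or the 20 retries are exhausted,
subsequent calls fall straight through.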

Fixes: 9b394a32c8ea ("occ: Add support for GPU presence detection")
Signed-off-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat at linux.vnet.ibm.com>

---

v2->v3:
- Remove noisy print (Rashmica/Fred)
---
 hw/occ.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/hw/occ.c b/hw/occ.c
index a55bf8ed4f54..9580bb873639 100644
--- a/hw/occ.c
+++ b/hw/occ.c
@@ -1238,14 +1238,24 @@ exit:
 bool occ_get_gpu_presence(struct proc_chip *chip, int gpu_num)
 {
 	struct occ_dynamic_data *ddata;
+	static int max_retries = 20;
+	static bool found = false;
 
 	assert(gpu_num <= 2);
 
 	ddata = get_occ_dynamic_data(chip);
+	while (!found && max_retries) {
+		if (ddata->major_version == 0 && ddata->minor_version >= 1) {
+			found = true;
+			break;
+		}
+		time_wait_ms(100);
+		max_retries--;
+		ddata = get_occ_dynamic_data(chip);
+	}
 
-	if (ddata->major_version != 0 || ddata->minor_version < 1) {
-		prlog(PR_INFO, "OCC: OCC not reporting GPU slot presence, "
-		      "assuming device is present\n");
+	if (!found) {
+		prlog(PR_INFO, "OCC: No GPU slot presence, assuming GPU present\n");
 		return true;
 	}
 
-- 
git-series 0.9.1


