[Skiboot] [PATCH v2 1/6] occ: Wait if OCC GPU presence status not immediately available
Frederic Barrat
fbarrat at linux.ibm.com
Wed Aug 29 22:44:41 AEST 2018
On 27/08/2018 at 10:55, Andrew Donnellan wrote:
> It takes a few seconds for the OCC to set everything up in order to read
> GPU presence. At present, we try to kick off OCC initialisation as early as
> possible to maximise the time it has to read GPU presence.
>
> Unfortunately sometimes that's not enough, so add a loop in
> occ_get_gpu_presence() so that on the first time we try to get GPU presence
> we keep trying for up to 2 seconds. Experimentally this seems to be
> adequate.
>
> Fixes: 9b394a32c8ea ("occ: Add support for GPU presence detection")
> Signed-off-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
> ---
> hw/occ.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/hw/occ.c b/hw/occ.c
> index a55bf8ed4f54..9fcac3f9581c 100644
> --- a/hw/occ.c
> +++ b/hw/occ.c
> @@ -1238,14 +1238,26 @@ exit:
> bool occ_get_gpu_presence(struct proc_chip *chip, int gpu_num)
> {
> struct occ_dynamic_data *ddata;
> + static int max_retries = 20;
> + static bool found = false;
>
> assert(gpu_num <= 2);
>
> ddata = get_occ_dynamic_data(chip);
> -
> - if (ddata->major_version != 0 || ddata->minor_version < 1) {
> + while (!found && max_retries) {
> + if (ddata->major_version == 0 && ddata->minor_version >= 1) {
> + found = true;
> + break;
> + }
> prlog(PR_INFO, "OCC: OCC not reporting GPU slot presence, "
> - "assuming device is present\n");
> + "waiting\n");
Do we really want to print the same message up to 20 times?
Other than that:
Reviewed-by: Frederic Barrat <fbarrat at linux.vnet.ibm.com>
> + time_wait_ms(100);
> + max_retries--;
> + ddata = get_occ_dynamic_data(chip);
> + }
> +
> + if (!found) {
> + prlog(PR_INFO, "OCC: No GPU slot presence, assuming GPU present\n");
> return true;
> }
>
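For illustration only, here is a minimal standalone sketch of one way the retry loop could log the "waiting" message a single time instead of on every iteration. It is not taken from the patch or from skiboot: `struct fake_occ_data`, `fake_get_data()`, `fake_wait_ms()`, `wait_for_gpu_presence_data()` and the three counters are hypothetical stand-ins so the example compiles on its own. Real code would presumably keep using `get_occ_dynamic_data()`, `time_wait_ms()` and `prlog()` as in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct occ_dynamic_data; only the two
 * version fields checked by the patch are modelled here. */
struct fake_occ_data {
	int major_version;
	int minor_version;
};

/* Simulation hooks (assumptions for this sketch, not skiboot code). */
static int polls_until_ready;	/* reads that still return version 0.0 */
static int log_count;		/* how many "waiting" messages were printed */
static int waited_ms;		/* total simulated delay in milliseconds */

/* Stand-in for re-reading the OCC dynamic data. */
static struct fake_occ_data fake_get_data(void)
{
	struct fake_occ_data d = { 0, 0 };

	if (polls_until_ready > 0)
		polls_until_ready--;
	else
		d.minor_version = 1;	/* OCC now reports GPU presence */
	return d;
}

/* Stand-in for a millisecond delay; it only accumulates the time. */
static void fake_wait_ms(int ms)
{
	waited_ms += ms;
}

/* Poll up to max_retries times, 100ms apart, as the patch does.
 * Returns true once the version check passes, false on timeout.
 * The "waiting" message is printed on the first failed check only. */
static bool wait_for_gpu_presence_data(int max_retries)
{
	struct fake_occ_data d = fake_get_data();
	bool logged = false;

	assert(max_retries >= 0);

	while (max_retries) {
		if (d.major_version == 0 && d.minor_version >= 1)
			return true;
		if (!logged) {
			printf("OCC: OCC not reporting GPU slot presence, waiting\n");
			log_count++;
			logged = true;
		}
		fake_wait_ms(100);
		max_retries--;
		d = fake_get_data();
	}
	return false;
}
```

With 20 retries this keeps the 2 second upper bound described in the commit message while emitting one log line per wait. Whether a single message is preferable to a lower log level for the repeated ones is a judgment call for the patch author.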