[Skiboot] [RFC v2 PATCH 2/6] fast-reboot: parallel memory clearing

Nicholas Piggin npiggin at gmail.com
Tue Jul 3 16:30:26 AEST 2018


On Thu, 28 Jun 2018 12:54:57 +1000
Stewart Smith <stewart at linux.ibm.com> wrote:

> Arbitrarily pick 16GB as the unit of parallelism, and
> split up clearing memory into jobs and schedule them
> node-local to the memory (or on node 0 if we can't
> work that out because it's the memory up to SKIBOOT_BASE)
> 
> This seems to cut at least ~40% time from memory zeroing on
> fast-reboot on a 256GB Boston system.
> 
> Signed-off-by: Stewart Smith <stewart at linux.ibm.com>

[...]

> +		while(l > MEM_REGION_CLEAR_JOB_SIZE) {
> +			job_args[i].s = s+l - MEM_REGION_CLEAR_JOB_SIZE;
> +			job_args[i].e = s+l;
> +			l-=MEM_REGION_CLEAR_JOB_SIZE;
> +			job_args[i].job_name = malloc(sizeof(char)*100);
> +			total+=MEM_REGION_CLEAR_JOB_SIZE;
> +			chip_id = __dt_get_chip_id(r->node);
> +			if (chip_id == -1)
> +				chip_id = 0;
> +			path = dt_get_path(r->node);
> +			snprintf(job_args[i].job_name, 100,
> +				 "clear %s, %s 0x%"PRIx64" len: %"PRIx64" on %d",
> +				 r->name, path,
> +				 job_args[i].s,
> +				 (job_args[i].e - job_args[i].s),
> +				 chip_id);
> +			free(path);
> +			printf("job: %s\n", job_args[i].job_name);
> +			jobs[i] = cpu_queue_job_on_node(chip_id,
> +							job_args[i].job_name,
> +							mem_region_clear_job,
> +							&job_args[i]);

Ahh, the API will return NULL if the job can't be scheduled on the
requested node. So you'll want to fall back to cpu_queue_job() in that
case to ensure the job still runs.
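
Something along these lines should do it (untested sketch, assuming
cpu_queue_job() with a NULL cpu picks any available thread):

	jobs[i] = cpu_queue_job_on_node(chip_id,
					job_args[i].job_name,
					mem_region_clear_job,
					&job_args[i]);
	/* No free thread on that node, fall back to any CPU */
	if (!jobs[i])
		jobs[i] = cpu_queue_job(NULL,
					job_args[i].job_name,
					mem_region_clear_job,
					&job_args[i]);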

Thanks,
Nick
