[Skiboot] [PATCH v2 04/13] core/flash: Make opal_flash_op() actually asynchronous

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Wed Nov 6 17:25:47 AEDT 2019


On 11/4/19 6:59 PM, Deb McLemore wrote:
> From: Cyril Bur <cyril.bur at au1.ibm.com>
> 
> This patch provides a simple (although not particularly efficient)
> asynchronous capability to the opal_flash interface. The advantage of
> this approach is that it doesn't require any changing of blocklevel or
> its backends to provide an asynchronous implementation. This is also the
> disadvantage of this implementation as all it actually does is break the
> work up in chunks that it can performed quickly, but still
> synchronously. Only a backend could provide a more asynchronous
> implementation.
> 
> This solves a problem we have right now with the opal_flash_erase call
> where it can block in Skiboot for around three minutes. This causes a
> variety of problems in Linux due to a processor being gone for a long
> time.
> For example:
> 
> [   98.610043] INFO: rcu_sched detected stalls on CPUs/tasks:
> [   98.610050]         113-...: (1 GPs behind) idle=96f/140000000000000/0 softirq=527/528 fqs=1044
> [   98.610051]         (detected by 112, t=2102 jiffies, g=223, c=222, q=123)
> [   98.610060] Task dump for CPU 113:
> [   98.610062] pflash          R  running task        0  3335   3333 0x00040004
> [   98.610066] Call Trace:
> [   98.610070] [c000001fdd847730] [0000000000000001] 0x1 (unreliable)
> [   98.610076] [c000001fdd847900] [c000000000013854] __switch_to+0x1e8/0x1f4
> [   98.610081] [c000001fdd847960] [c0000000006122c4] __schedule+0x32c/0x874
> [   98.610083] [c000001fdd847a30] [c000001fdd847b40] 0xc000001fdd847b40
> 
> It is for this reason that breaking the work up in smaller chunks solves
> this problem as Skiboot can return the CPU to Linux between chunks to
> avoid Linux getting upset.
> 
> Reported-By: Samuel Mendoza-Jonas <sam at mendozajonas.com>
> Signed-off-by: Cyril Bur <cyril.bur at au1.ibm.com>
> Signed-off-by: Stewart Smith <stewart at linux.ibm.com>
> ---

.../...

> 
> +/*
> + * Called with flash lock held, drop it on async completion
> + */
> +static void flash_poll(struct timer *t __unused, void *data, uint64_t now __unused)
> +{
> +	struct flash *flash = data;
> +	uint64_t offset, buf, len;
> +	int rc;
> +
> +	offset = flash->async.pos;
> +	buf = flash->async.buf;
> +	len = MIN(flash->async.len, flash->block_size*10);
> +	printf("Flash poll op %d len %llu\n", flash->async.op, len);
> +
> +	switch (flash->async.op) {
> +	case FLASH_OP_READ:
> +		rc = blocklevel_raw_read(flash->bl, offset, (void *)buf, len);
> +		break;
> +	case FLASH_OP_WRITE:
> +		rc = blocklevel_raw_write(flash->bl, offset, (void *)buf, len);
> +		break;
> +	case FLASH_OP_ERASE:
> +		rc = blocklevel_erase(flash->bl, offset, len);
> +		break;
> +	default:
> +		assert(0);

Why not just return OPAL_PARAMETER?

-Vasant



More information about the Skiboot mailing list