[Skiboot] [PATCH v2 04/13] core/flash: Make opal_flash_op() actually asynchronous
Vasant Hegde
hegdevasant at linux.vnet.ibm.com
Wed Nov 6 17:25:47 AEDT 2019
On 11/4/19 6:59 PM, Deb McLemore wrote:
> From: Cyril Bur <cyril.bur at au1.ibm.com>
>
> This patch provides a simple (although not particularly efficient)
> asynchronous capability to the opal_flash interface. The advantage of
> this approach is that it doesn't require any changing of blocklevel or
> its backends to provide an asynchronous implementation. This is also the
> disadvantage of this implementation as all it actually does is break the
> work up in chunks that it can performed quickly, but still
> synchronously. Only a backend could provide a more asynchronous
> implementation.
>
> This solves a problem we have right now with the opal_flash_erase call
> where it can block in Skiboot for around three minutes. This causes a
> variety of problems in Linux due to a processor being gone for a long
> time.
> For example:
>
> [ 98.610043] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 98.610050] 113-...: (1 GPs behind) idle=96f/140000000000000/0 softirq=527/528 fqs=1044
> [ 98.610051] (detected by 112, t=2102 jiffies, g=223, c=222, q=123)
> [ 98.610060] Task dump for CPU 113:
> [ 98.610062] pflash R running task 0 3335 3333 0x00040004
> [ 98.610066] Call Trace:
> [ 98.610070] [c000001fdd847730] [0000000000000001] 0x1 (unreliable)
> [ 98.610076] [c000001fdd847900] [c000000000013854] __switch_to+0x1e8/0x1f4
> [ 98.610081] [c000001fdd847960] [c0000000006122c4] __schedule+0x32c/0x874
> [ 98.610083] [c000001fdd847a30] [c000001fdd847b40] 0xc000001fdd847b40
>
> It is for this reason that breaking the work up in smaller chunks solves
> this problem as Skiboot can return the CPU to Linux between chunks to
> avoid Linux getting upset.
>
> Reported-By: Samuel Mendoza-Jonas <sam at mendozajonas.com>
> Signed-off-by: Cyril Bur <cyril.bur at au1.ibm.com>
> Signed-off-by: Stewart Smith <stewart at linux.ibm.com>
> ---
.../...
>
> +/*
> + * Called with flash lock held, drop it on async completion
> + */
> +static void flash_poll(struct timer *t __unused, void *data, uint64_t now __unused)
> +{
> + struct flash *flash = data;
> + uint64_t offset, buf, len;
> + int rc;
> +
> + offset = flash->async.pos;
> + buf = flash->async.buf;
> + len = MIN(flash->async.len, flash->block_size*10);
> + printf("Flash poll op %d len %llu\n", flash->async.op, len);
> +
> + switch (flash->async.op) {
> + case FLASH_OP_READ:
> + rc = blocklevel_raw_read(flash->bl, offset, (void *)buf, len);
> + break;
> + case FLASH_OP_WRITE:
> + rc = blocklevel_raw_write(flash->bl, offset, (void *)buf, len);
> + break;
> + case FLASH_OP_ERASE:
> + rc = blocklevel_erase(flash->bl, offset, len);
> + break;
> + default:
> + assert(0);
Why not just return OPAL_PARAMETER?
-Vasant
More information about the Skiboot
mailing list