[SLOF] [PATCH slof] virtio-serial: Close device completely

Alexey Kardashevskiy aik at ozlabs.ru
Tue Mar 10 13:40:40 AEDT 2020



On 06/03/2020 21:12, Greg Kurz wrote:
> On Fri,  6 Mar 2020 15:39:33 +1100
> Alexey Kardashevskiy <aik at ozlabs.ru> wrote:
> 
>> Linux closes stdout at the end of prom_init which triggers the FW quiesce
>> code which closes the virtio-serial instance. This misses stopping
>> the virtio queues, clearing the "emit" token and other things. However
>> this seemed working for a little longer (until the Linux driver took over)
>> till 0cdb2dd13c3e which moved the VQ descriptors around which caused
>> use-after-free corruption.
>>
>> This adds virtio_queue_term_vq(), cleanup in the forth driver, few checks
>> and reverts emit to hvterm-emit.
>>
>> Fixes: 0cdb2dd13c3e ("virtio: Store queue descriptors in virtio_device")
> 
> SHA1 is wrong.
> 
> https://git.qemu.org/?p=SLOF.git;a=commit;h=300384f3dc68588a051f5737aee3b5eab4dd19e4


Huh. I did not make it up though (I generate the "Fixes" line with a git
macro so it would have complained), 0cdb2dd13c3e is from my local tree,
probably from some rebase.

> 
>> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>> ---
>>
>> Turns out it is close(stdout) what triggers quiesce, not the client
>> interface's "quiesce".
>>
> 
> Yeah I discovered that while trying to debug :)
> 
>> This helps with iommu_platform=off but iommu_platform=on still does not work.
>> Debugging...
>>
>> /home/aik/pbuild/qemu-localhost-ppc64/ppc64-softmmu/qemu-system-ppc64 \
>> -device virtio-serial,id=virtio-serial0,iommu_platform=on,disable-modern=off,disable-legacy=on,ioeventfd=off \
>> -nodefaults \
>> -chardev stdio,id=STDIO0,signal=off,mux=on \
>> -device virtconsole,id=virtconsole0,chardev=STDIO0 \
>> -mon id=MON0,chardev=STDIO0,mode=readline \
>> -nographic \
>> -vga none \
>> -machine pseries,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off,ic-mode=xics \
>> -m 2G \
>> -smp 1 \
>> -bios p/slof/boot_rom.bin \
>> -enable-kvm \
>> -initrd t/le.cpio \
>> -kernel t/vml4120le \
>> -L /home/aik/t/qemu-ppc64-bios/ \
>> -trace events=qemu_trace_events \
>> -d guest_errors \
>> -chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.ssh55056 \
>> -mon chardev=SOCKET0,mode=control
>>
>>
>> ---
>>  lib/libvirtio/virtio-serial.c    |  7 +++++++
>>  board-qemu/slof/virtio-serial.fs | 11 ++++++++++-
>>  2 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/libvirtio/virtio-serial.c b/lib/libvirtio/virtio-serial.c
>> index d1503a44f433..92afb02014b2 100644
>> --- a/lib/libvirtio/virtio-serial.c
>> +++ b/lib/libvirtio/virtio-serial.c
>> @@ -102,6 +102,10 @@ void virtio_serial_shutdown(struct virtio_device *dev)
>>  	/* Quiesce device */
>>  	virtio_set_status(dev, VIRTIO_STAT_FAILED);
>>  
>> +	/* Stop queues */
>> +	virtio_queue_term_vq(dev, &dev->vq[TX_Q], TX_Q);
>> +	virtio_queue_term_vq(dev, &dev->vq[RX_Q], RX_Q);
>> +
> 
> It is certainly appropriate to undo the effects of virtio_queue_init_vq()
> when shutting down indeed, but according to the virtio 1 spec:
> 
> https://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-630001
> 
> =============================================================================
> 3.3.1 Driver Requirements: Device Cleanup
> 
> A driver MUST NOT alter descriptor table entries which have
> been exposed in the available ring (and not marked consumed
> by the device in the used ring) of a live virtqueue.
> 
> A driver MUST NOT decrement the available idx on a live
> virtqueue (ie. there is no way to “unexpose” buffers).
> 
> Thus a driver MUST ensure a virtqueue isn’t live (by device reset)
> before removing exposed buffers. 
> =============================================================================
> 
> So I'm wondering if this should rather be done after virtio_reset_device(),
> like virtionet_term() already does.


virtio_reset_device() just calls virtio_set_status(dev, 0) and we
already have virtio_set_status(dev, VIRTIO_STAT_FAILED) above and I'd
think this has the same effect of stopping the device. Does not matter
in practice I guess, we can move it after virtio_reset_device().


> Also, shouldn't the other virtio drivers do the same ?

We should, one thing at the time :)


> 
>>  	/* Reset device */
>>  	virtio_reset_device(dev);
>>  }
>> @@ -114,6 +118,9 @@ int virtio_serial_putchar(struct virtio_device *dev, char c)
>>  	uint16_t last_used_idx, avail_idx;
>>  	struct vqs *vq = &dev->vq[TX_Q];
>>  
>> +	if (!vq->desc)
>> +		return 0;
>> +
>>  	avail_idx = virtio_modern16_to_cpu(dev, vq->avail->idx);
>>  
>>  	last_used_idx = vq->used->idx;
>> diff --git a/board-qemu/slof/virtio-serial.fs b/board-qemu/slof/virtio-serial.fs
>> index 42ab3e2ccc9c..5293ab1f6341 100644
>> --- a/board-qemu/slof/virtio-serial.fs
>> +++ b/board-qemu/slof/virtio-serial.fs
>> @@ -25,6 +25,8 @@ virtio-setup-vd VALUE virtiodev
>>              close-dev
>>          THEN
>>          FALSE to initialized?
>> +	['] hvterm-emit to emit
> 
> Is it correct to assume hvterm-emit is the value we want to restore ?
>> Also the init function does:
> 
>     ['] virtio-serial-term-emit to emit
>     ['] virtio-serial-term-key  to key
>     ['] virtio-serial-term-key? to key?
> 
> Shouldn't we revert key and key? as well ?


These are all good questions and I do not have answers.

My thinking was that when restoring the "emit", we are most likely in
"quiesce" stage and all we care about at this point is 1) no
use-after-free 2) (with that qemu hack) we can get some traces from the
guest.

Just doing nothing in  virtio-serial-term-emit if "virtiodev 0=" is
enough (and this is what my fix to "write" does).


> 
>> +	0 to virtiodev
>>      THEN
>>  ;
>>  
>> @@ -61,12 +63,18 @@ virtiodev virtio-serial-init drop
>>  : close
>>      open-count 0> IF
>>          open-count 1 - dup to open-count
>> -        0= IF close THEN
>> +        0= IF
>> +	    close
>> +	    FALSE to initialized?
>> +	    ['] hvterm-emit to emit
>> +	    0 to virtiodev
> 
> Maybe f^wworth factoring this out to its own word ?

We already have this word and it is "shutdown", I should have reused
that one.


> 
>> +	THEN
>>      THEN
>>      close


When I see things like above - "close" is unconditionally calling
"close", my head explodes. It looks like a recursion but it is not and
what is this second "close" - I have to guess or debug (this one is from
slof/fs/pci-device.fs). Oh boy...


>>  ;
>>  
>>  : write ( addr len -- actual )
>> +    virtiodev 0= IF nip EXIT THEN
>>      tuck
>>      0 ?DO
>>          dup c@ virtiodev SWAP virtio-serial-putchar
>> @@ -77,6 +85,7 @@ virtiodev virtio-serial-init drop
>>  
>>  : read ( addr len -- actual )
>>      0= IF drop 0 EXIT THEN
>> +    virtiodev 0= IF nip EXIT THEN
>>      virtiodev virtio-serial-haschar 0= IF 0 swap c! -2 EXIT THEN
>>      virtiodev virtio-serial-getchar swap c! 1
>>  ;
> 

-- 
Alexey


More information about the SLOF mailing list