[Lguest] [kvm-devel] [PATCH 3/3] virtio PCI device

Anthony Liguori aliguori at us.ibm.com
Tue Nov 27 06:18:11 EST 2007


Avi Kivity wrote:
> rx and tx are closely related. You rarely have one without the other.
>
> In fact, a turned implementation should have zero kicks or interrupts 
> for bulk transfers. The rx interrupt on the host will process new tx 
> descriptors and fill the guest's rx queue; the guest's transmit 
> function can also check the receive queue. I don't know if that's 
> achievable for Linuz guests currently, but we should aim to make it 
> possible.

ATM, the net driver does a pretty good job of disabling kicks/interrupts 
unless they are needed.  Checking for rx on tx and vice versa is a good 
idea and could further help there.  I'll give it a try this week.

> Another point is that virtio still has a lot of leading zeros in its 
> mileage counter. We need to keep things flexible and learn from others 
> as much as possible, especially when talking about the ABI.

Yes, after thinking about it over holiday, I agree that we should at 
least introduce a virtio-pci feature bitmask.  I'm not inclined to 
attempt to define a hypercall ABI or anything like that right now but 
having the feature bitmask will at least make it possible to do such a 
thing in the future.

>> I'm wary of introducing the notion of hypercalls to this device 
>> because it makes the device VMM specific.  Maybe we could have the 
>> device provide an option ROM that was treated as the device "BIOS" 
>> that we could use for kicking and interrupt acking?  Any idea of how 
>> that would map to Windows?  Are there real PCI devices that use the 
>> option ROM space to provide what's essentially firmware?  
>> Unfortunately, I don't think an option ROM BIOS would map well to 
>> other architectures.
>>
>>   
>
> The BIOS wouldn't work even on x86 because it isn't mapped to the 
> guest address space (at least not consistently), and doesn't know the 
> guest's programming model (16, 32, or 64-bits? segmented or flat?)
>
> Xen uses a hypercall page to abstract these details out. However, I'm 
> not proposing that. Simply indicate that we support hypercalls, and 
> use some layer below to actually send them. It is the responsibility 
> of this layer to detect if hypercalls are present and how to call them.
>
> Hey, I think the best place for it is in paravirt_ops. We can even 
> patch the hypercall instruction inline, and the driver doesn't need to 
> know about it.

Yes, paravirt_ops is attractive for abstracting the hypercall calling 
mechanism but it's still necessary to figure out how hypercalls would be 
identified.  I think it would be necessary to define a virtio specific 
hypercall space and use the virtio device ID to claim subspaces.

For instance, the hypercall number could be (virtio_devid << 16) | (call 
number).  How that translates into a hypercall would then be part of the 
paravirt_ops abstraction.  In KVM, we may have a single virtio hypercall 
where we pass the virtio hypercall number as one of the arguments or 
something like that.

>>>>> Not much of an argument, I know.
>>>>>
>>>>>
>>>>> wrt. number of queues, 8 queues will consume 32 bytes of pci space 
>>>>> if all you store is the ring pfn.
>>>>>             
>>>> You also at least need a num argument which takes you to 48 or 64 
>>>> depending on whether you care about strange formatting.  8 queues 
>>>> may not be enough either.  Eric and I have discussed whether the 9p 
>>>> virtio device should support multiple mounts per-virtio device and 
>>>> if so, whether each one should have it's own queue.  Any devices 
>>>> that supports this sort of multiplexing will very quickly start 
>>>> using a lot of queues.
>>>>         
>>> Make it appear as a pci function?  (though my feeling is that 
>>> multiple mounts should be different devices; we can then hotplug 
>>> mountpoints).
>>>     
>>
>> We may run out of PCI slots though :-/
>>   
>
> Then we can start selling virtio extension chassis.

:-)  Do you know if there is a hard limit on the number of devices on a 
PCI bus?  My concern was that it was limited by something stupid like an 
8-bit identifier.

Regards,

Anthony Liguori




More information about the Lguest mailing list