[PATCH 06/19] KVM: PPC: Book3S HV: add a GET_ESB_FD control to the XIVE native device

Benjamin Herrenschmidt benh at kernel.crashing.org
Mon Feb 11 17:42:52 AEDT 2019


On Mon, 2019-02-11 at 13:38 +1100, David Gibson wrote:
> 
> 1) All in kernel
> 
> The offset always maps directly to guest irq number and the kernel
> somehow binds it either to an IPI or a host irq as necessary.
> Cédric's original code attempts this, but the mechanism of keeping a
> pointer to the VMA can't work.

Why do you need a pointer to the VMA anyway? unmap_mapping_range()
doesn't need a VMA for the unmap part, and faults/mmaps have the VMA.

> But.. remapping the irqs should be sufficiently infrequent that it
> might be ok to consider simply stepping through all the hosting
> process's VMAs to do this.

Which unmap_mapping_range() does for you as I explained previously. You
only need the address space. See how spufs does it (among others).

> 2) Remapped in qemu (using memory regions)
> 
> I _think_ (in hindsight) this is what Cédric's been discussing as the
> alternative in more recent posts.
> 
> Qemu maps the IPI pages at one place and the passthrough IRQ pages
> somewhere else.  The IPIs are mapped into the guest as one memory
> region, then any passthrough IRQ pages are mapped over that using
> overlapping memory regions.
> 
> I don't think this approach will work well, because it could require a
> bunch of separate KVM memory slots, which are fairly scarce.
> 
> 3) Remapped in qemu (using mmap())
> 
> This is the approach I (and I think Paul) have been suggesting in
> contrast to (1).
> 
> Qemu maps the IPI pages and maps those into the guest.  When we need
> to set up a passthrough IRQ, qemu mmap()s its pages directly over the
> IPI pages, and it remains mapped into the guest with the same memory
> region / memslot as the IPIs are already using.  If the passthrough
> device is removed we have to remap the IPI pages back into place.
> 
> 4) Dedicated irq numbers
> 
> We never re-use regular guest irq numbers for passthrough irqs,
> instead we put them somewhere else and keep those mapped to the
> passthrough irq pages.
> 
> I was favouring this approach, but it does mean there will be a guest
> visible difference between kernel_irqchip=on and off which isn't
> great.
> 
> 
> (1) is the most elegant _interface_, but as we've seen it's
> problematic to implement.  Looking at the for_all_vmas() approach
> could be interesting, but otherwise option (3) might be the most
> practical.
> 
> --
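
For reference, the userspace side of option (3) comes down to mmap() with
MAP_FIXED over part of an existing mapping. Below is a minimal sketch of
that mechanism only, not the actual qemu or KVM code. The two temporary
files are hypothetical stand-ins for the IPI and passthrough ESB pages
(the real pages would come from the device fd), and make_backing(),
remap_over() and esb_remap_demo() are names made up for illustration.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for an ESB page source: an unlinked temp file
 * whose every byte is `tag`. A real setup would get its fd from KVM. */
static int make_backing(size_t len, unsigned char tag)
{
    char path[] = "/tmp/esb-sketch-XXXXXX";
    int fd = mkstemp(path);
    unsigned char *p;

    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(p, tag, len);
    munmap(p, len);
    return fd;
}

/* Replace whatever is mapped at [addr, addr + len) with pages of `fd`.
 * MAP_FIXED swaps the range in a single call, so the surrounding
 * mapping (and any memslot covering it) keeps its address. */
static int remap_over(void *addr, size_t len, int fd, off_t off)
{
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, off);
    return p == addr ? 0 : -1;
}

/* Returns 0 if the overlay and the restore both behave as expected. */
int esb_remap_demo(void)
{
    size_t pg = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 4 * pg;                    /* window for 4 "irqs" */
    int ipi_fd = make_backing(len, 0xA1);   /* stand-in IPI pages */
    int pt_fd = make_backing(pg, 0xB2);     /* stand-in passthrough page */
    unsigned char *win = MAP_FAILED;
    unsigned char *slot;
    int rc = -1;

    if (ipi_fd < 0 || pt_fd < 0)
        goto out;
    win = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, ipi_fd, 0);
    if (win == MAP_FAILED)
        goto out;
    slot = win + 2 * pg;                    /* page of "irq 2" */

    /* Passthrough setup: map the passthrough page over irq 2's page. */
    if (remap_over(slot, pg, pt_fd, 0) != 0)
        goto out;
    if (slot[0] != 0xB2 || win[pg] != 0xA1 || win[3 * pg] != 0xA1)
        goto out;

    /* Device removed: put the original IPI page back in place. */
    if (remap_over(slot, pg, ipi_fd, (off_t)(2 * pg)) != 0)
        goto out;
    if (slot[0] != 0xA1)
        goto out;
    rc = 0;
out:
    if (win != MAP_FAILED)
        munmap(win, len);
    if (ipi_fd >= 0)
        close(ipi_fd);
    if (pt_fd >= 0)
        close(pt_fd);
    return rc;
}
```

Only the targeted page changes backing; its neighbours keep pointing at
the original file, which is what lets the IPIs and the passthrough IRQ
share one memory region in this scheme.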
