[PATCH 06/19] KVM: PPC: Book3S HV: add a GET_ESB_FD control to the XIVE native device
Benjamin Herrenschmidt
benh at kernel.crashing.org
Mon Feb 11 17:42:52 AEDT 2019
On Mon, 2019-02-11 at 13:38 +1100, David Gibson wrote:
>
> 1) All in kernel
>
> The offset always maps directly to guest irq number and the kernel
> somehow binds it either to an IPI or a host irq as necessary.
> Cédric's original code attempts this, but the mechanism of keeping a
> pointer to the VMA can't work.
Why do you need a pointer to the VMA anyway? unmap_mapping_range()
doesn't need a VMA for the unmap part, and faults/mmaps have the VMA.
> But... remapping the irqs should be sufficiently infrequent that it
> might be ok to consider simply stepping through all the hosting
> process's VMAs to do this.
Which unmap_mapping_range() does for you as I explained previously. You
only need the address space. See how spufs does it (among others).
> 2) Remapped in qemu (using memory regions)
>
> I _think_ (in hindsight) this is what Cédric's been discussing as the
> alternative in more recent posts.
>
> Qemu maps the IPI pages at one place and the passthrough IRQ pages
> somewhere else. The IPIs are mapped into the guest as one memory
> region, then any passthrough IRQ pages are mapped over that using
> overlapping memory regions.
>
> I don't think this approach will work well, because it could require a
> bunch of separate KVM memory slots, which are fairly scarce.
>
> 3) Remapped in qemu (using mmap())
>
> This is the approach I (and I think Paul) have been suggesting in
> contrast to (1).
>
> Qemu maps the IPI pages and maps those into the guest. When we need
> to set up a passthrough IRQ, qemu mmap()s its pages directly over the
> IPI pages, and it remains mapped into the guest with the same memory
> region / memslot as the IPIs are already using. If the passthrough
> device is removed we have to remap the IPI pages back into place.
>
> 4) Dedicated irq numbers
>
> We never re-use regular guest irq numbers for passthrough irqs,
> instead we put them somewhere else and keep those mapped to the
> passthrough irq pages.
>
> I was favouring this approach, but it does mean there will be a guest
> visible difference between kernel_irqchip=on and off which isn't
> great.
>
>
> (1) is the most elegant _interface_, but as we've seen it's
> problematic to implement. Looking at the for_all_vmas() approach
> could be interesting, but otherwise option (3) might be the most
> practical.
>
> --
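For reference, the userspace side of option (3) above can be sketched
roughly as follows. This is only an illustration under loud assumptions,
not QEMU or KVM code: two temporary files stand in for the IPI and
passthrough ESB pages (in practice these would come from fds handed out
by the kernel), and make_backing(), overlay_page() and
esb_overlay_demo() are hypothetical names made up for the example. The
point is just that mmap() with MAP_FIXED atomically replaces one page
inside an existing mapping, so the virtual address range (and hence the
memslot covering it) does not change.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: temp file of `pages` pages, every byte = `fill`.
 * Stands in for a set of ESB pages; returns an fd, or -1 on error. */
int make_backing(size_t pages, unsigned char fill)
{
    long psz = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    if (!f)
        return -1;
    int fd = dup(fileno(f));    /* keep the (already unlinked) file alive */
    fclose(f);
    if (fd < 0)
        return -1;

    unsigned char *buf = malloc((size_t)psz);
    if (!buf) {
        close(fd);
        return -1;
    }
    memset(buf, fill, (size_t)psz);
    for (size_t i = 0; i < pages; i++) {
        if (write(fd, buf, (size_t)psz) != (ssize_t)psz) {
            free(buf);
            close(fd);
            return -1;
        }
    }
    free(buf);
    return fd;
}

/* Map page `fd_page` of `fd` over page `idx` of the region at `base`.
 * MAP_FIXED replaces whatever was mapped there; the address is kept. */
int overlay_page(unsigned char *base, size_t idx, int fd, size_t fd_page)
{
    long psz = sysconf(_SC_PAGESIZE);
    void *want = base + idx * (size_t)psz;
    void *got = mmap(want, (size_t)psz, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_FIXED, fd, (off_t)(fd_page * psz));
    return got == want ? 0 : -1;
}

/* Walk through option (3): map "IPI" pages, overlay one "passthrough"
 * page, then put the IPI page back. Returns 0 if every step behaved. */
int esb_overlay_demo(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    const size_t npages = 4, irq = 2;
    int rc = -1;

    int ipi_fd = make_backing(npages, 0x11);   /* stand-in: IPI pages */
    int pt_fd = make_backing(1, 0x22);         /* stand-in: passthrough */
    if (ipi_fd < 0 || pt_fd < 0)
        goto out_fds;

    unsigned char *base = mmap(NULL, npages * (size_t)psz,
                               PROT_READ | PROT_WRITE, MAP_SHARED,
                               ipi_fd, 0);
    if (base == MAP_FAILED)
        goto out_fds;

    /* Passthrough set up: overlay its page at the guest irq's slot. */
    if (overlay_page(base, irq, pt_fd, 0) != 0)
        goto out_map;
    if (base[irq * psz] != 0x22 || base[(irq - 1) * psz] != 0x11 ||
        base[(irq + 1) * psz] != 0x11)
        goto out_map;

    /* Passthrough device removed: remap the IPI page back into place. */
    if (overlay_page(base, irq, ipi_fd, irq) != 0)
        goto out_map;
    if (base[irq * psz] != 0x11)
        goto out_map;

    rc = 0;
out_map:
    munmap(base, npages * (size_t)psz);
out_fds:
    if (ipi_fd >= 0)
        close(ipi_fd);
    if (pt_fd >= 0)
        close(pt_fd);
    return rc;
}
```

Calling esb_overlay_demo() from a small main() should return 0 when the
overlay and the restore both take effect. Whether the guest-side
(second-stage) mapping follows such a remap without extra work is a
separate question that this sketch does not address.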