[PATCH 02/10] powerpc/xive: guest exploitation of the XIVE interrupt controller
David Gibson
david at gibson.dropbear.id.au
Thu Aug 10 14:28:49 AEST 2017
On Wed, Aug 09, 2017 at 10:48:48AM +0200, Cédric Le Goater wrote:
> On 08/09/2017 05:53 AM, David Gibson wrote:
> > On Tue, Aug 08, 2017 at 10:56:12AM +0200, Cédric Le Goater wrote:
> >> This is the framework for using XIVE in a PowerVM guest. The support
> >> is very similar to the native one in a much simpler form.
> >>
> >> Instead of OPAL calls, a set of hypervisor calls is used to configure
> >> the interrupt sources and the event/notification queues of the guest:
> >>
> >> - H_INT_GET_SOURCE_INFO
> >>
> >> used to obtain the address of the MMIO page of the Event State
> >> Buffer (PQ bits) entry associated with the source.
> >>
> >> - H_INT_SET_SOURCE_CONFIG
> >>
> >> assigns a source to a "target".
> >>
> >> - H_INT_GET_SOURCE_CONFIG
> >>
> >> determines which "target" and "priority" are assigned to a source.
> >>
> >> - H_INT_GET_QUEUE_INFO
> >>
> >> returns the address of the notification management page associated
> >> with the specified "target" and "priority".
> >>
> >> - H_INT_SET_QUEUE_CONFIG
> >>
> >> sets or resets the event queue for a given "target" and "priority".
> >> It is also used to set the notification config associated with the
> >> queue, only unconditional notification for the moment. Reset is
> >> performed with a queue size of 0 and queueing is disabled in that
> >> case.
> >>
> >> - H_INT_GET_QUEUE_CONFIG
> >>
> >> returns the queue settings for a given "target" and "priority".
> >>
> >> - H_INT_RESET
> >>
> >> resets all of the partition's interrupt exploitation structures to
> >> their initial state, losing all configuration set via the hcalls
> >> H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.
> >>
> >> - H_INT_SYNC
> >>
> >> issues a synchronisation on a source to make sure all
> >> notifications have reached their queue.
> >>
> >> As for XICS, the XIVE interface for the guest is described in the
> >> device tree under the interrupt controller node. A couple of new
> >> properties are specific to XIVE :
> >>
> >> - "reg"
> >>
> >> contains the base address and size of the thread interrupt
> >> management areas (TIMA) for the user level and for the OS level.
> >> Only the OS level is taken into account.
> >>
> >> - "ibm,xive-eq-sizes"
> >>
> >> the size of the event queues.
> >>
> >> - "ibm,xive-lisn-ranges"
> >>
> >> the interrupt numbers ranges assigned to the guest. These are
> >> allocated using a simple bitmap.
> >>
> >> Tested with a QEMU XIVE model for pseries and with the Power
> >> hypervisor
> >>
> >> Signed-off-by: Cédric Le Goater <clg at kaod.org>
> >
> > I don't know XIVE well enough to review in detail, but I've made some
> > comments based on general knowledge.
>
> Thanks for taking a look.
np
[snip]
> >> +/* Cause IPI as setup by the interrupt controller (xics or xive) */
> >> +static void (*ic_cause_ipi)(int cpu);
> >> +
> >> static void smp_pseries_cause_ipi(int cpu)
> >> {
> >> - /* POWER9 should not use this handler */
> >> if (doorbell_try_core_ipi(cpu))
> >> return;
> >>
> >> - icp_ops->cause_ipi(cpu);
> >> + ic_cause_ipi(cpu);
> >
> > Wouldn't it make more sense to change smp_ops->cause_ipi, rather than
> > having a double indirection through smp_ops, then the ic_cause_ipi
> > global?
>
> We need to retain the original setting of smp_ops->cause_ipi
> somewhere as it can be set from xics_smp_probe() to:
>
> icp_ops->cause_ipi
>
> or from xive_smp_probe() to:
>
> xive_cause_ipi()
>
> I am not sure we can do any better?
I don't see why not. You basically have two bits of options: xics vs
xive, and doorbells vs no doorbells. At worst that gives you 4
different versions of ->cause_ipi, and you can work out the right one
in smp_probe(). If the number of combinations got too much larger you
might indeed need some more indirection. But with 4 there's a little
code duplication, but small enough that I think it's preferable to an
extra global and an extra indirection.
Also, will POWER9 always have doorbells? In which case you could
reduce it to 3 options.
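
To make the suggestion concrete, here is a rough user-space sketch of
picking one of the four handlers at probe time. It is not the patch and
not kernel code: every mock_* and sketch_* name is hypothetical, the
mocks only record which path fired, and the "same core" flag is an
assumed stand-in for whatever doorbell_try_core_ipi() really checks.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space sketch only. Every name prefixed with mock_ or sketch_ is
 * hypothetical and stands in for kernel code that is not shown here.
 */
enum sketch_path { PATH_NONE, PATH_DOORBELL, PATH_XICS, PATH_XIVE };

static enum sketch_path last_path;	/* mechanism used for the last IPI */
static bool mock_same_core;		/* assumed outcome of the doorbell attempt */

/* Stand-in for doorbell_try_core_ipi(): true when the doorbell was used. */
static bool mock_doorbell_try_core_ipi(int cpu)
{
	(void)cpu;
	if (!mock_same_core)
		return false;
	last_path = PATH_DOORBELL;
	return true;
}

/* Stand-ins for icp_ops->cause_ipi and xive_cause_ipi(). */
static void mock_xics_cause_ipi(int cpu) { (void)cpu; last_path = PATH_XICS; }
static void mock_xive_cause_ipi(int cpu) { (void)cpu; last_path = PATH_XIVE; }

/* The two doorbell-capable variants: this is the small duplication. */
static void sketch_dbell_xics_cause_ipi(int cpu)
{
	if (mock_doorbell_try_core_ipi(cpu))
		return;
	mock_xics_cause_ipi(cpu);
}

static void sketch_dbell_xive_cause_ipi(int cpu)
{
	if (mock_doorbell_try_core_ipi(cpu))
		return;
	mock_xive_cause_ipi(cpu);
}

struct mock_smp_ops { void (*cause_ipi)(int cpu); };
static struct mock_smp_ops mock_ops;

/* Pick one of the four handlers once, at probe time. */
static void sketch_smp_probe(bool use_xive, bool have_doorbells)
{
	if (use_xive)
		mock_ops.cause_ipi = have_doorbells ?
			sketch_dbell_xive_cause_ipi : mock_xive_cause_ipi;
	else
		mock_ops.cause_ipi = have_doorbells ?
			sketch_dbell_xics_cause_ipi : mock_xics_cause_ipi;
}

/* Probe, send one IPI through the single indirection, report the path. */
enum sketch_path sketch_send(bool use_xive, bool have_doorbells, bool same_core)
{
	sketch_smp_probe(use_xive, have_doorbells);
	mock_same_core = same_core;
	last_path = PATH_NONE;
	mock_ops.cause_ipi(0);
	assert(last_path != PATH_NONE);
	return last_path;
}
```

With this shape the send path goes through a single function pointer
and there is no extra global to keep in sync with it. If doorbells turn
out to be always present with xive, the plain xive entry simply drops
out of the probe.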
[snip]
> >> +static void xive_spapr_update_pending(struct xive_cpu *xc)
> >> +{
> >> + u8 nsr, cppr;
> >> + u16 ack;
> >> +
> >> + /* Perform the acknowledge hypervisor to register cycle */
> >> + ack = be16_to_cpu(__raw_readw(xive_tima + TM_SPC_ACK_OS_REG));
> >
> > Why do you need the raw_readw() + be16_to_cpu + mb, rather than one of
> > the higher level IO helpers?
>
> This is one of the many ways to do MMIOs on the TIMA. This memory
> region defines a set of offsets and sizes for which loads and
> stores have different effects.
>
> See the arch/powerpc/include/asm/xive-regs.h file and
> arch/powerpc/kvm/book3s_xive_template.c for some more usage.
Sure, much like any IO region. My point is, why do you want this kind
of complex combo, rather than, say, an in_be16() or readw_be()?
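
For illustration only, the difference in shape looks roughly like the
user-space sketch below. The mock_* functions are hypothetical stand-ins
that read from an ordinary byte buffer, not the kernel accessors, and
the C11 fence is just a placeholder for whatever ordering the real
helpers provide. The register contents (0x12, 0x34) are made up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Made-up contents of a 16-bit big-endian register. */
static const uint8_t sketch_reg[2] = { 0x12, 0x34 };

/* Stand-in for __raw_readw(): native-endian load, no ordering. */
static uint16_t mock_raw_readw(const void *addr)
{
	uint16_t v;

	memcpy(&v, addr, sizeof(v));
	return v;
}

/* Stand-in for be16_to_cpu(): swap bytes only on a little-endian host. */
static uint16_t mock_be16_to_cpu(uint16_t v)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	if (first == 1)		/* little-endian host */
		return (uint16_t)((v >> 8) | (v << 8));
	return v;
}

/* Stand-in for in_be16(): load, conversion and ordering in one helper. */
static uint16_t mock_in_be16(const void *addr)
{
	uint16_t v = mock_be16_to_cpu(mock_raw_readw(addr));

	atomic_thread_fence(memory_order_seq_cst);	/* placeholder ordering */
	return v;
}

/* The open-coded combo: the caller has to remember all three steps. */
static uint16_t sketch_open_coded_read(const void *addr)
{
	uint16_t v = mock_raw_readw(addr);

	v = mock_be16_to_cpu(v);
	atomic_thread_fence(memory_order_seq_cst);	/* placeholder for mb() */
	return v;
}

/* Self-check: both forms decode the same big-endian value. */
void sketch_check(void)
{
	assert(sketch_open_coded_read(sketch_reg) == 0x1234);
	assert(mock_in_be16(sketch_reg) == 0x1234);
}
```

Both forms return the same value; the helper just keeps the byte-order
conversion and the ordering in one place instead of at every call site.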
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson