[PATCH v3] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

Paul Mackerras paulus at samba.org
Fri Jul 18 09:52:14 EST 2014


On Thu, Jul 17, 2014 at 01:19:57PM +1000, Stewart Smith wrote:

> The POWER8 processor has a Micro Partition Prefetch Engine, which is
> a fancy way of saying "has a way to store and load contents of L2 or
> L2+MRU way of L3 cache". We initiate the storing of the log (list of
> addresses) using the logmpp instruction and start restore by writing
> to a SPR.
> 
> The logmpp instruction takes parameters in a single 64-bit register:
> - starting address of the table to store log of L2/L2+L3 cache contents
>   - 32kB for L2
>   - 128kB for L2+L3
>   - Aligned relative to maximum size of the table (32kB or 128kB)
> - Log control (no-op, L2 only, L2 and L3, abort logout)
> 
> We should abort any ongoing logging before initiating one.

Do we ever want to wait for ongoing logging to finish?

[snip]

> +#if defined(CONFIG_PPC_64K_PAGES)
> +#define MPP_BUFFER_ORDER	0
> +#elif defined(CONFIG_PPC_4K_PAGES)
> +#define MPP_BUFFER_ORDER	4

Why 4 not 3?  You only need 32kB, don't you?

> +static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
> +{
> +	struct kvmppc_vcore *vcore;
> +
> +	vcore = kzalloc(sizeof(struct kvmppc_vcore), GFP_KERNEL);
> +
> +	if (vcore == NULL)
> +		return NULL;
> +
> +	INIT_LIST_HEAD(&vcore->runnable_threads);
> +	spin_lock_init(&vcore->lock);
> +	init_waitqueue_head(&vcore->wq);
> +	vcore->preempt_tb = TB_NIL;
> +	vcore->lpcr = kvm->arch.lpcr;
> +	vcore->first_vcpuid = core * threads_per_core;
> +	vcore->kvm = kvm;

Is there a particular reason why you need to pull this code out into a
separate function?  If so, it would be a little nicer if you did that
in a separate patch, to make it easier to see that the code motion
changes nothing.

> @@ -1590,9 +1645,16 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
>  
>  	srcu_idx = srcu_read_lock(&vc->kvm->srcu);
>  
> +	if (vc->mpp_buffer_is_valid)
> +		ppc_start_restoring_l2_cache(vc);
> +
>  	__kvmppc_vcore_entry();
>  
>  	spin_lock(&vc->lock);
> +
> +	if (vc->mpp_buffer)
> +		ppc_start_saving_l2_cache(vc);

I wonder if we would get better performance improvements if we kicked
this off earlier, for instance before we save all the FP/VSX state and
switch the MMU?  I guess that could be a subsequent patch.

Paul.

