[PATCH v3] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

Stewart Smith stewart at linux.vnet.ibm.com
Fri Jul 18 14:10:14 EST 2014


Paul Mackerras <paulus at samba.org> writes:
> On Thu, Jul 17, 2014 at 01:19:57PM +1000, Stewart Smith wrote:
>
>> The POWER8 processor has a Micro Partition Prefetch Engine, which is
>> a fancy way of saying "has way to store and load contents of L2 or
>> L2+MRU way of L3 cache". We initiate the storing of the log (list of
>> addresses) using the logmpp instruction and start restore by writing
>> to a SPR.
>> 
>> The logmpp instruction takes parameters in a single 64bit register:
>> - starting address of the table to store log of L2/L2+L3 cache contents
>>   - 32kb for L2
>>   - 128kb for L2+L3
>>   - Aligned relative to maximum size of the table (32kb or 128kb)
>> - Log control (no-op, L2 only, L2 and L3, abort logout)
>> 
>> We should abort any ongoing logging before initiating one.
>
> Do we ever want to wait for ongoing logging to finish?

*Probably* not... but as far as I can see the hardware doesn't expose a
way to find out if there is any ongoing logging or a way to be notified
when it's done.
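
For illustration, the table alignment rule and the abort-before-log ordering from the patch description can be sketched in plain userspace C. This is not the patch code: the control enum values, `fake_logmpp()` and `start_l2_log()` are hypothetical stand-ins, and the real logmpp register encoding is not spelled out in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MPP_L2_TABLE_SIZE	(32u * 1024u)	/* 32kB log table, L2 only */
#define MPP_L2L3_TABLE_SIZE	(128u * 1024u)	/* 128kB log table, L2 + L3 */

/* Hypothetical control values; the real bit encoding is an assumption. */
enum mpp_ctl { MPP_CTL_NOP, MPP_CTL_LOG_L2, MPP_CTL_LOG_L2L3, MPP_CTL_ABORT };

/* The table must be aligned to its own (power-of-two) maximum size. */
static bool mpp_table_aligned(uint64_t addr, uint64_t table_size)
{
	assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
	return (addr & (table_size - 1)) == 0;
}

/* Stand-in for issuing logmpp: just records the requested control value. */
struct mpp_trace {
	enum mpp_ctl ops[4];
	unsigned int n;
};

static void fake_logmpp(struct mpp_trace *t, enum mpp_ctl ctl)
{
	assert(t->n < 4);
	t->ops[t->n++] = ctl;
}

/*
 * There is no way to ask whether a log is still in flight, so always
 * abort first and then start the new L2 log.
 */
static bool start_l2_log(struct mpp_trace *t, uint64_t table_addr)
{
	if (!mpp_table_aligned(table_addr, MPP_L2_TABLE_SIZE))
		return false;
	fake_logmpp(t, MPP_CTL_ABORT);
	fake_logmpp(t, MPP_CTL_LOG_L2);
	return true;
}
```

An address such as 0x8000 passes the 32kB check but would fail the 128kB one, so a buffer sized for L2 only cannot simply be reused for L2+L3 logging without checking its alignment again.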

>> +#if defined(CONFIG_PPC_64K_PAGES)
>> +#define MPP_BUFFER_ORDER	0
>> +#elif defined(CONFIG_PPC_4K_PAGES)
>> +#define MPP_BUFFER_ORDER	4
>
> Why 4 not 3?  You only need 32kB, don't you?

Correct, whoops. With 4K pages, order 3 gives 8 pages = 32kB, which is
all the L2 log needs.
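
A quick way to sanity-check the order arithmetic, as a stand-alone userspace sketch (the helper name `order_for_size()` is made up for this example and is not a kernel API):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: smallest order such that
 * (page_size << order) >= bytes, i.e. 2^order pages cover the buffer.
 */
static unsigned int order_for_size(size_t bytes, size_t page_size)
{
	unsigned int order = 0;

	assert(page_size != 0);
	while ((page_size << order) < bytes)
		order++;
	return order;
}
```

For the 32kB L2 table this gives order 3 with 4K pages and order 0 with 64K pages. A 128kB L2+L3 table would need order 5 and order 1 respectively.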

>> +static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
>> +{
>> +	struct kvmppc_vcore *vcore;
>> +
>> +	vcore = kzalloc(sizeof(struct kvmppc_vcore), GFP_KERNEL);
>> +
>> +	if (vcore == NULL)
>> +		return NULL;
>> +
>> +	INIT_LIST_HEAD(&vcore->runnable_threads);
>> +	spin_lock_init(&vcore->lock);
>> +	init_waitqueue_head(&vcore->wq);
>> +	vcore->preempt_tb = TB_NIL;
>> +	vcore->lpcr = kvm->arch.lpcr;
>> +	vcore->first_vcpuid = core * threads_per_core;
>> +	vcore->kvm = kvm;
>
> Is there a particular reason why you need to pull this code out into a
> separate function?  If so, it would be a little nicer if you did that
> in a separate patch, to make it easier to see that the code motion
> changes nothing.

ack, done.

>> @@ -1590,9 +1645,16 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
>>  
>>  	srcu_idx = srcu_read_lock(&vc->kvm->srcu);
>>  
>> +	if (vc->mpp_buffer_is_valid)
>> +		ppc_start_restoring_l2_cache(vc);
>> +
>>  	__kvmppc_vcore_entry();
>>  
>>  	spin_lock(&vc->lock);
>> +
>> +	if (vc->mpp_buffer)
>> +		ppc_start_saving_l2_cache(vc);
>
> I wonder if we would get better performance improvements if we kicked
> this off earlier, for instance before we save all the FP/VSX state and
> switch the MMU?  I guess that could be a subsequent patch.

Possibly, yes. Maybe something to look at in a future patch, along with
whether also doing the L3 save/restore is a benefit.


