[RFC PATCH 1/2] KVM: PPC: Book3S HV: Make virtual processor area registration more robust

Alexander Graf agraf at suse.de
Tue Jan 17 20:27:26 EST 2012



On 17.01.2012, at 06:56, Paul Mackerras <paulus at samba.org> wrote:

> On Mon, Jan 16, 2012 at 02:04:29PM +0100, Alexander Graf wrote:
>> 
>> On 20.12.2011, at 11:22, Paul Mackerras wrote:
> 
>>> @@ -152,6 +152,8 @@ static unsigned long do_h_register_vpa(struct kvm_vcpu *vcpu,
>>>    flags &= 7;
>>>    if (flags == 0 || flags == 4)
>> 
>> This could probably use a new variable name. Also, what do 0 and 4 mean? Constant defines would be nice here.
> 
> Those constants are defined in PAPR as being a subfunction code
> indicating what sort of area and whether it is to be registered or
> unregistered.  I'll make up some names for them.
> 
>> [pasted from real source]
>>>                va = kvmppc_pin_guest_page(kvm, vpa, &nb);
>> 
>> Here you're pinning the page, setting va to that temporarily available address.
> 
> Well, it's not just temporarily available, it's available until we
> unpin it, since we increment the page count, which inhibits migration.
> 
>>>            len = *(unsigned int *)(va + 4);
>> 
>> va + 4 isn't really descriptive. Is this a defined struct? Why not actually define one which you can just read data from? Or at least make this a define too. Reading random numbers in code is barely readable.
> 
> It's not really a struct, at least not one that is used for anything
> else.  PAPR defines that the length of the buffer has to be placed in
> the second 32-bit word at registration time.
> 
>> 
>>> +        free_va = va;
>> 
>> Now free_va is the temporarily available address.
> ...
>>> +        free_va = tvcpu->arch.next_vpa;
>>> +        tvcpu->arch.next_vpa = va;
>> 
>> Now you're setting next_vpa to this temporarily available address? But next_vpa will be used after va is getting free'd, no? Or is that why you have free_va?
> 
> Yes; here we are freeing any previously-set value of next_vpa.  The
> idea of free_va is that it is initially set to va so that we correctly
> unpin va if any error occurs.  But if there is no error, va gets put
> into next_vpa and we free anything that was previously in next_vpa
> instead.
> 
>> 
>> Wouldn't it be easier to just map it every time we actually use it and only shove the GPA around? We could basically save ourselves a lot of the logic here.
> 
> There are fields in the VPA that we really want to be able to access
> from real mode, for instance the fields that indicate whether we need
> to save the FPR and/or VR values.  As far as the DTL is concerned, we
> could in fact use copy_to_user to access it, so it doesn't strictly
> need to be pinned.  We don't currently use the slb_shadow buffer, but
> if we did we would need to access it from real mode, since we would be
> reading it in order to set up guest SLB entries.
> 
> The other thing is that the VPA registration/unregistration is only
> done a few times in the life of the guest, whereas we use the VPAs
> constantly while the guest is running.  So it is more efficient to do
> more of the work at registration time to make it quicker to access the
> VPAs.

The thing I was getting at was not the map during the lifetime, but the map during registration. Currently we have:

1) Set VPA to x
2) Assign feature y to VPA
3) Use VPA

1 and 2 are the slow path, 3 occurs more frequently. So we want 3 to be fast. 1 and 2 don't matter that much wrt performance.

You are currently mapping the VPA at /, which gets you into this map/unmap mess trying to free the previous mapping. If you moved the map to step 2 and only stored the GPA at step 1, all map+unmap operations except for final unmaps would be in one spot, so you wouldn't need to construct this big complex state machine.

I hope that makes it more clear :)

Alex

> 
> I'll send revised patches.  There's a small change I want to make to
> patch 2 to avoid putting a very large stolen time value in the first
> entry that gets put in after the DTL is registered, which can happen
> currently if the DTL gets registered some time after the guest started
> running.
> 
> Paul.


More information about the Linuxppc-dev mailing list