[PATCH kernel v5 10/10] KVM: PPC: VFIO: Add in-kernel acceleration for VFIO

David Gibson david at gibson.dropbear.id.au
Mon Feb 27 12:53:52 AEDT 2017


On Fri, Feb 24, 2017 at 02:43:05PM +1100, Alexey Kardashevskiy wrote:
> On 24/02/17 14:36, David Gibson wrote:
> > On Fri, Feb 24, 2017 at 02:29:14PM +1100, Alexey Kardashevskiy wrote:
> >> On 24/02/17 13:14, David Gibson wrote:
> >>> On Wed, Feb 22, 2017 at 07:21:33PM +1100, Alexey Kardashevskiy
> wrote:
[snip]
> >>>> +static long kvmppc_rm_tce_iommu_unmap(struct kvm *kvm,
> >>>> +		struct iommu_table *tbl, unsigned long entry)
> >>>> +{
> >>>> +	enum dma_data_direction dir = DMA_NONE;
> >>>> +	unsigned long hpa = 0;
> >>>> +	long ret;
> >>>> +
> >>>> +	if (iommu_tce_xchg_rm(tbl, entry, &hpa, &dir))
> >>>> +		return H_HARDWARE;
> >>>
> >>> To avoid a double WARN() (and to make the warnings easier to
> >>> understand) I'd suggest putting a WARN_ON() here, rather than in the
> >>> callers when they receieve an H_HARDWARE.  IIUC this really shouldn't
> >>> ever happen, and it certainly can't be the guest's fault?
> >>
> >>
> >> Makes sense.
> > 
> > I guess it might want WARN_ON_ONCE() to avoid spamming the user with
> > errors for every TCE, though.
> 
> 
> We do not expect this to happen at all :) I can convert all of them to
> _ONCE really as the purpose of WARN_ON is mostly to document what we do not
> expect.

Sure, seems reasonable.

[snip]
> >>>> @@ -220,9 +338,10 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
> >>>>  {
> >>>>  	struct kvmppc_spapr_tce_table *stt;
> >>>>  	long i, ret = H_SUCCESS;
> >>>> -	unsigned long tces, entry, ua = 0;
> >>>> +	unsigned long tces, entry, ua = 0, tce, gpa;
> >>>>  	unsigned long *rmap = NULL;
> >>>>  	bool prereg = false;
> >>>> +	struct kvmppc_spapr_tce_iommu_table *stit;
> >>>>  
> >>>>  	stt = kvmppc_find_table(vcpu->kvm, liobn);
> >>>>  	if (!stt)
> >>>> @@ -287,12 +406,24 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
> >>>>  	}
> >>>>  
> >>>>  	for (i = 0; i < npages; ++i) {
> >>>> -		unsigned long tce = be64_to_cpu(((u64 *)tces)[i]);
> >>>> +		tce = be64_to_cpu(((u64 *)tces)[i]);
> >>>>  
> >>>>  		ret = kvmppc_tce_validate(stt, tce);
> >>>>  		if (ret != H_SUCCESS)
> >>>>  			goto unlock_exit;
> >>>>  
> >>>> +		gpa = tce & ~(TCE_PCI_READ | TCE_PCI_WRITE);
> >>>> +
> >>>> +		list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
> >>>> +			ret = kvmppc_rm_tce_iommu_map(vcpu->kvm,
> >>>> +					stit->tbl, entry + i, gpa,
> >>>> +					iommu_tce_direction(tce));
> >>>> +			if (WARN_ON_ONCE(ret == H_HARDWARE))
> >>>
> >>> I don't think you need the WARN() here - the only H_HARDWARE failure
> >>> path in iommu_map() already includes a WARN().
> >>
> >>
> >> True, I can drop it here.
> >>
> >>
> >>>
> >>>> +				kvmppc_rm_clear_tce(stit->tbl, entry);
> >>>> +			else if (ret != H_SUCCESS)
> >>>> +				goto unlock_exit;
> >>>
> >>> It's also not clear to me why the H_HARDWARE error path clears the
> >>> entry, but the other failure paths don't.  Or why an H_HARDWARE will
> >>> result in continuing to set the rest of the TCEs, but other failures
> >>> won't.
> >>
> >>
> >> The idea was that other failures still have some chance that handling may
> >> succeed in virtual mode or via QEMU, H_HARDWARE is fatal.
> > 
> > Um... yes.. but the logic seems to be backwards for that: on
> > H_HARDWARE you warn and keep going, on other errors you bail out
> > entirely.
> 
> By "fatal" I means fatal for this particular hardware TCE(s), no hope in
> trying this particular TCE in virtual mode.

Ok... still not following why that means the "fatal" error results in
continuing to attempt for the rest of the updated TCEs, whereas the
"non fatal" one bails out.  Especially since the bail out will only go
to virtual mode if ret == H_TOO_HARD, which it isn't clear is the only
possibility.

> >> I am just not sure if H_PARAMETER is what I want to return at [1], to make
> >> the calling code simplier, I could return H_HARDWARE there as well (instead
> >> of H_PARAMETER).
> > 
> > That sounds right, IIUC the gpa to ua translation shouldn't ever
> > fail because of something the guest did. 
> 
> 
> The guest can easily pass bad TCE/GPA which is not in any registered slot.
> So it is rather H_PARAMETER.

Ah, yes.

> > So I'd expect either
> > H_HARDWARE, or H_TOO_HARD (if there's some hope that virtual mode can
> > make the translation when real mode couldn't).
> 
> No, virtual mode uses the exact same helper.

Ok.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20170227/a733ee02/attachment.sig>


More information about the Linuxppc-dev mailing list