[PATCH] powerpc/kvm: Handle transparent hugepage in KVM

Michael Neuling mikey at neuling.org
Thu Jun 20 09:59:22 EST 2013


> >> --- a/arch/powerpc/include/asm/kvm_book3s_64.h
> >> +++ b/arch/powerpc/include/asm/kvm_book3s_64.h
> >> @@ -162,33 +162,40 @@ static inline int hpte_cache_flags_ok(unsigned long ptel, unsigned long io_type)
> >>   * Lock and read a linux PTE.  If it's present and writable, atomically
> >>   * set dirty and referenced bits and return the PTE, otherwise return 0.
> >
> > This is comment still valid now the ldarx/stdcx is gone?  
> 
> In a way yes. Instead of lock and read as it was before, it is now done
> via cmpxchg which still use ldarx/stdcx

OK, maybe you can update to reflect that.

> >> +	pte_t old_pte, new_pte = __pte(0);
> >> +repeat:
> >> +	do {
> >> +		old_pte = pte_val(*ptep);
> >> +		/*
> >> +		 * wait until _PAGE_BUSY is clear then set it atomically
> >> +		 */
> >> +		if (unlikely(old_pte & _PAGE_BUSY))
> >> +			goto repeat;
> >
> > continue here?  Please don't create looping primitives.
> 
> No that would be wrong. (I did that in an earlier version :).We really
> don't want the below cmpxchg to run if we find _PAGE_BUSY.

How about something like this then?

while (1) {
      if (unlikely(old_pte & _PAGE_BUSY))
	    continue;
.....
      if cmpxchg(foo)
      	 break;
}


> 
> >   
> >> +
> >> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >> +		/* If hugepage and is trans splitting return None */
> >> +		if (unlikely(hugepage &&
> >> +			     pmd_trans_splitting(pte_pmd(old_pte))))
> >
> > Comment looks much like the code... seems redundant.
> >
> >> +			return __pte(0);
> >> +#endif
> >>  
> >> -	*p = pte;	/* clears _PAGE_BUSY */
> >> +		/* If pte is not present return None */
> >> +		if (unlikely(!(old_pte & _PAGE_PRESENT)))
> >> +			return __pte(0);
> >>  
> >> -	return pte;
> >> +		new_pte = pte_mkyoung(old_pte);
> >> +		if (writing && pte_write(old_pte))
> >> +			new_pte = pte_mkdirty(new_pte);
> >> +
> >> +	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
> >> +					  old_pte, new_pte));
> >> +	return new_pte;
> >>  }
> >>  
> >> +
> >
> > Whitespace



> >> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> >> index dcf892d..39ae723 100644
> >> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> >> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> >> @@ -150,9 +150,7 @@ static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
> >>  		*pte_sizep = PAGE_SIZE;
> >>  	if (ps > *pte_sizep)
> >>  		return __pte(0);
> >> -	if (!pte_present(*ptep))
> >> -		return __pte(0);
> >> -	return kvmppc_read_update_linux_pte(ptep, writing);
> >> +	return kvmppc_read_update_linux_pte(ptep, writing, shift);
> >
> > 'shift' goes into the new 'hugepage' parameter?  Doesn't seem logical?
> > Can we harmonise the name to make it less confusing?
> >
> 
> it is actually the shift bits represending hugepage size. We set it to 0
> if we don't find hugepage in find_linux_pte_or_hugepte. May be something
> like hugepage_shift is better ?

Sure.

Mikey


More information about the Linuxppc-dev mailing list