[PATCH 3 of 3] [KVM POWERPC] PowerPC 440 KVM implementation

Josh Boyer jwboyer at linux.vnet.ibm.com
Tue Apr 8 12:12:40 EST 2008


On Mon, 07 Apr 2008 15:53:34 -0500
Hollis Blanchard <hollisb at us.ibm.com> wrote:

> Currently supports only PowerPC 440 Linux guests on 440 hosts. (Only tested
> with 440EP "Bamboo" guests so far, but with appropriate userspace support other
> SoC/board combinations should work.)

I haven't reviewed the whole patch yet, but lots of it looks pretty
clean.  A couple comments on some things that made me scratch my head
below.

> Interrupt handling: We use IVPR to hijack the host interrupt vectors while
> running the guest, but hand off decrementer and external interrupts for normal
> guest processing.
> 
> Address spaces: We take advantage of the fact that Linux doesn't use the AS=1
> address space (in host or guest), which gives us virtual address space to use
> for guest mappings. While the guest is running, the host kernel remains mapped
> in AS=0, but the guest can only use AS=1 mappings.
> 
> TLB entries: The TLB entries covering the host linear address space remain
> present while running the guest (which reduces the overhead of lightweight
> exits). We keep three copies of the TLB:
>  - guest TLB: contents of the TLB as the guest sees it
>  - shadow TLB: the TLB that is actually in hardware while guest is running
>  - host TLB: to restore TLB state when context switching guest -> host
> When a TLB miss occurs because a mapping was not present in the shadow TLB, but
> was present in the guest TLB, KVM handles the fault without invoking the guest.
> Large guest pages are backed by multiple 4KB shadow pages through this
> mechanism.
> 
> Instruction emulation: The guest kernel executes at user level, so executing
> privileged instructions trap into KVM, where we decode and emulate them. Future
> performance work will focus on reducing the overhead and frequency of these
> traps.
> 
> IO: MMIO and DCR accesses are emulated by userspace. We use virtio for network
> and block IO, so those drivers must be enabled in the guest. It's possible that
> some qemu device emulation (e.g. e1000 or rtl8139) may also work with little
> effort.
> 
> Signed-off-by: Hollis Blanchard <hollisb at us.ibm.com>

As Olof pointed out, having this in Documentation might be nice.

> diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
> --- a/arch/powerpc/Kconfig.debug
> +++ b/arch/powerpc/Kconfig.debug
> @@ -151,6 +151,7 @@
> 
>  config PPC_EARLY_DEBUG
>  	bool "Early debugging (dangerous)"
> +	depends on !KVM
>  	help
>  	  Say Y to enable some early debugging facilities that may be available
>  	  for your processor/board combination. Those facilities are hacks

Might want to add a brief explanation as to why this doesn't work with
KVM due to the AS conflict.

> diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
> new file mode 100644
> --- /dev/null
> +++ b/arch/powerpc/kvm/44x_tlb.c
> @@ -0,0 +1,224 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
> + *
> + * Copyright IBM Corp. 2007
> + *
> + * Authors: Hollis Blanchard <hollisb at us.ibm.com>
> + */
> +
> +#include <linux/types.h>
> +#include <linux/string.h>
> +#include <linux/kvm_host.h>
> +#include <linux/highmem.h>
> +#include <asm/mmu-44x.h>
> +#include <asm/kvm_ppc.h>
> +
> +#include "44x_tlb.h"
> +
> +#define PPC44x_TLB_USER_PERM_MASK (PPC44x_TLB_UX|PPC44x_TLB_UR|PPC44x_TLB_UW)
> +#define PPC44x_TLB_SUPER_PERM_MASK (PPC44x_TLB_SX|PPC44x_TLB_SR|PPC44x_TLB_SW)
> +
> +static unsigned int kvmppc_tlb_44x_pos;
> +
> +static u32 kvmppc_44x_tlb_shadow_attrib(u32 attrib, int usermode)
> +{
> +	/* XXX remove mask when Linux is fixed */
> +	attrib &= 0xf03f;

What does that mean?  Is this the "44x TLB handler writes to reserved
fields" issue?  If so, could you comment that a little more verbosely?

Or you could just send a patch to fix Linux... ;).

> +
> +	if (!usermode) {
> +		/* Guest is in supervisor mode, so we need to translate guest
> +		 * supervisor permissions into user permissions. */
> +		attrib &= ~PPC44x_TLB_USER_PERM_MASK;
> +		attrib |= (attrib & PPC44x_TLB_SUPER_PERM_MASK) << 3;
> +	}
> +
> +	/* Make sure host can always access this memory. */
> +	attrib |= PPC44x_TLB_SX|PPC44x_TLB_SR|PPC44x_TLB_SW;
> +
> +	return attrib;
> +}
> +

<snip>

> diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
> new file mode 100644
> --- /dev/null
> +++ b/arch/powerpc/kvm/emulate.c
> @@ -0,0 +1,753 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
> + *
> + * Copyright IBM Corp. 2007
> + *
> + * Authors: Hollis Blanchard <hollisb at us.ibm.com>
> + */
> +
> +#include <linux/jiffies.h>
> +#include <linux/timer.h>
> +#include <linux/types.h>
> +#include <linux/string.h>
> +#include <linux/kvm_host.h>
> +
> +#include <asm/dcr.h>
> +#include <asm/time.h>
> +#include <asm/byteorder.h>
> +#include <asm/kvm_ppc.h>

#include <asm/dcr-regs.h>

> +
> +#include "44x_tlb.h"
> +
> +#define DCRN_CPR0_CFGADDR	0xc
> +#define DCRN_CPR0_CFGDATA	0xd

Remove these.

<snip>

> +int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
> +{
> +	u32 inst = vcpu->arch.last_inst;
> +	u32 ea;
> +	int ra;
> +	int rb;
> +	int rc;
> +	int rs;
> +	int rt;
> +	int sprn;
> +	int dcrn;
> +	enum emulation_result emulated = EMULATE_DONE;
> +	int advance = 1;
> +
> +	switch (get_op(inst)) {

<snip>

> +		case 323:                                       /* mfdcr */
> +			dcrn = get_dcrn(inst);
> +			rt = get_rt(inst);
> +
> +			/* emulate some access in kernel */
> +			switch (dcrn) {
> +			case DCRN_CPR0_CFGADDR:
> +				vcpu->arch.gpr[rt] = vcpu->arch.cpr0_cfgaddr;
> +				break;
> +			case DCRN_CPR0_CFGDATA:
> +				local_irq_disable();
> +				mtdcr(DCRN_CPR0_CFGADDR,
> +				      vcpu->arch.cpr0_cfgaddr);
> +				vcpu->arch.gpr[rt] = mfdcr(DCRN_CPR0_CFGDATA);
> +				local_irq_enable();
> +				break;
> +			default:
> +				run->dcr.dcrn = dcrn;
> +				run->dcr.data =  0;
> +				run->dcr.is_write = 0;
> +				vcpu->arch.io_gpr = rt;
> +				vcpu->arch.dcr_needed = 1;
> +				emulated = EMULATE_DO_DCR;
> +			}
> +
> +			break;

<snip>

> +		case 451:                                       /* mtdcr */
> +			dcrn = get_dcrn(inst);
> +			rs = get_rs(inst);
> +
> +			/* emulate some access in kernel */
> +			switch (dcrn) {
> +			case DCRN_CPR0_CFGADDR:
> +				vcpu->arch.cpr0_cfgaddr = vcpu->arch.gpr[rs];
> +				break;
> +			default:
> +				run->dcr.dcrn = dcrn;
> +				run->dcr.data = vcpu->arch.gpr[rs];
> +				run->dcr.is_write = 1;
> +				vcpu->arch.dcr_needed = 1;
> +				emulated = EMULATE_DO_DCR;
> +			}
> +
> +			break;

What exactly are those doing?  I'm confused as to why the CPR0 DCRs are
special cased.  Also, you don't seem to actually be doing much on
storing guest DCRs, so can I assume they aren't really needed at
runtime?

josh



More information about the Linuxppc-dev mailing list