[Patch 2/2] Kexec/Kdump support POWER6

Olof Johansson olof at lixom.net
Wed May 23 01:34:19 EST 2007


Hi,

On Tue, May 22, 2007 at 05:56:36PM +0530, Sachin P. Sant wrote:
> On Power machines supporting VRMA, Kexec/Kdump does not work.
> Hypervisor stores VRMA mapping used by the OS, in the hpte hash
> tables. Make sure these hpte entries are left untouched.

> diff -Naurp linux-2.6.22-rc2-vrma/arch/powerpc/kernel/machine_kexec_64.c linux-2.6.22-rc2-p6/arch/powerpc/kernel/machine_kexec_64.c
> --- linux-2.6.22-rc2-vrma/arch/powerpc/kernel/machine_kexec_64.c	2007-05-21 15:14:58.000000000 +0530
> +++ linux-2.6.22-rc2-p6/arch/powerpc/kernel/machine_kexec_64.c	2007-05-21 15:19:14.000000000 +0530
> @@ -279,6 +279,9 @@ void default_machine_kexec(struct kimage
>  	kexec_stack.thread_info.task = current_thread_info()->task;
>  	kexec_stack.thread_info.flags = 0;
>  
> +	if (have_vrma)
> +		pSeries_find_hpte_vrma();
> +

This will break kexec builds on non-pseries. It's referring to platform
code that might not be built.

>  	/* Some things are best done in assembly.  Finding globals with
>  	 * a toc is easier in C, so pass in what we can.
>  	 */
> diff -Naurp linux-2.6.22-rc2-vrma/arch/powerpc/platforms/pseries/lpar.c linux-2.6.22-rc2-p6/arch/powerpc/platforms/pseries/lpar.c
> --- linux-2.6.22-rc2-vrma/arch/powerpc/platforms/pseries/lpar.c	2007-05-21 15:14:57.000000000 +0530
> +++ linux-2.6.22-rc2-p6/arch/powerpc/platforms/pseries/lpar.c	2007-05-22 15:53:11.000000000 +0530
> @@ -369,6 +369,56 @@ static long pSeries_lpar_hpte_remove(uns
>  	return -1;
>  }
>  
> +unsigned long hpte_vrma_slots[HPTE_V_RMA_NUM];
> +unsigned int num_hpte_vrma_slots = 0;
> +
> +void pSeries_find_hpte_vrma(void)

Does this function find the vrma, or save it away? Seems like the name
is misleading.

> +{
> +	unsigned int step;
> +	unsigned long hash, slot, vaddr;
> +	unsigned long dword0, dummy1, rma_size;
> +	long lpar_rc;
> +	int i;
> +	
> +	/* Get the RMA size */
> +	rma_size = lmb.rmo_size;
> +	
> +	/* Get the VRMA page size */	
> +	step = 1 << ppc64_vrma_page_size;

Is ppc64_vrma_page_size really the size, or the shift? Above would
indicate that it's really a shift value.

> +
> +	vaddr = HPTE_V_RMA_VPN + rma_size;
> +
> +	/* Find hpte's with VRMA mappings */
> +	for (; vaddr >= HPTE_V_RMA_VPN; vaddr -= step) {
> +		hash = hpt_hash(vaddr, mmu_psize_defs[MMU_PAGE_16M].shift);

Why is 16M hardcoded here, when you're taking such great care to read
out the pagesize earlier?

> +		slot = ((hash & htab_hash_mask) * HPTES_PER_GROUP);	
> +
> +		for (i = 0; i < HPTES_PER_GROUP; i++) {
> +			lpar_rc = plpar_pte_read(0, slot, 
> +						&dword0, &dummy1);
> +			if (!lpar_rc && dword0 &&
> +			((dword0 & HPTE_V_MASK) == MAGIC_SKIP_HPTE)) {

Indentation

> +				/* store the hpte */
> +				hpte_vrma_slots[num_hpte_vrma_slots++] = slot;

Here you rely on global exported state (num_hpte_vrma_slots), increasing it without
checking for limits. What happens if this function is ever called twice? Should you
set it to 0 in the beginning of the function and check it against the size of the
hpte_vrma_slots array instead?

> +				break;
> +			}
> +			slot++;
> +		}
> +	}
> +}
> +
> +static inline int check_vrma_slot(int slot)
> +{
> +	int j;
> +
> +	for (j = 0; j < num_hpte_vrma_slots; j++)
> +		if (hpte_vrma_slots[j] == slot) 
> +			return 1;
> +
> +	return 0;
> +
> +}
> +
>  static void pSeries_lpar_hptab_clear(void)
>  {
>  	unsigned long size_bytes = 1UL << ppc64_pft_size;
> @@ -377,8 +427,12 @@ static void pSeries_lpar_hptab_clear(voi
>  	int i;
>  
>  	/* TODO: Use bulk call */
> -	for (i = 0; i < hpte_count; i++)
> +	for (i = 0; i < hpte_count; i++) {
> +		if (have_vrma && check_vrma_slot(i))
> +			/* You don't want to remove this hpte */
> +			continue;
>  		plpar_pte_remove_raw(0, i, 0, &dummy1, &dummy2);
> +	}
>  }
>  
>  /*
> diff -Naurp linux-2.6.22-rc2-vrma/include/asm-powerpc/kexec.h linux-2.6.22-rc2-p6/include/asm-powerpc/kexec.h
> --- linux-2.6.22-rc2-vrma/include/asm-powerpc/kexec.h	2007-05-21 15:14:55.000000000 +0530
> +++ linux-2.6.22-rc2-p6/include/asm-powerpc/kexec.h	2007-05-21 15:19:14.000000000 +0530
> @@ -24,6 +24,8 @@
>  
>  #define KEXEC_CONTROL_CODE_SIZE 4096
>  
> +extern void pSeries_find_hpte_vrma(void);
> +

Same comment as above: This isn't a kexec function as much as a pseries function, so
it should be defined in some other header instead.

>  /* The native architecture */
>  #ifdef __powerpc64__
>  #define KEXEC_ARCH KEXEC_ARCH_PPC64
> diff -Naurp linux-2.6.22-rc2-vrma/include/asm-powerpc/mmu-hash64.h linux-2.6.22-rc2-p6/include/asm-powerpc/mmu-hash64.h
> --- linux-2.6.22-rc2-vrma/include/asm-powerpc/mmu-hash64.h	2007-05-21 15:14:55.000000000 +0530
> +++ linux-2.6.22-rc2-p6/include/asm-powerpc/mmu-hash64.h	2007-05-21 15:23:31.000000000 +0530
> @@ -94,6 +94,11 @@ extern char initial_stab[];
>  #define HPTE_R_C		ASM_CONST(0x0000000000000080)
>  #define HPTE_R_R		ASM_CONST(0x0000000000000100)
>  
> +#define HPTE_V_RMA_VPN         ASM_CONST(0x001FFFFFF0000000)
> +#define HPTE_V_MASK            ASM_CONST(0xc000000000000000)
> +#define MAGIC_SKIP_HPTE        ASM_CONST(0x4000000000000000)
> +#define HPTE_V_RMA_NUM         16

"MAGIC_SKIP_HPTE"? I'm sure there's a proper name for this field in the
PAPR, isn't there? Also, HPTE_V_RMA_NUM isn't a HPTE_V field, it shouldn't
have that prefix. It's not a property of the mmu in the first place.

These should maybe be local defines in the pseries lpar code instead, since it's
more of a lpar<->phyp interface than mmu programming interface.


-Olof



More information about the Linuxppc-dev mailing list