[PATCH] powerpc/eeh: Avoid to handle EEH on a passed Child PE

Gavin Shan gwshan at linux.vnet.ibm.com
Mon Sep 21 21:49:45 AEST 2015

On Mon, Sep 21, 2015 at 05:29:48PM +0800, Wei Yang wrote:
>The current EEH infrastructure avoids handling EEH when a PE is passed
>through to a guest. However, if that PE is a child of the PE that hit the
>EEH error, the host still handles it, which leads to a guest hang. The
>correct behavior is for the host to skip it and let the guest recover.
>This patch avoids handling EEH on a passed-through child PE.

Ok. It's fixing the problem seen by a guest that owns a VF when the VF's PF
hits an EEH error, right? If so, I'm not sure you really tested this code.
Does it work for you?

When the parent PE (PF) is stopped for EEH recovery, it seems impossible
for the child PE to be unaffected and simply escape the error. The
question is how the guest can continue to work after EEH recovery on the
parent PE.

>Signed-off-by: Wei Yang <weiyang at linux.vnet.ibm.com>
> arch/powerpc/kernel/eeh_pe.c | 5 +++++
> 1 file changed, 5 insertions(+)
>diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>index 5cde950..c6d0e9f 100644
>--- a/arch/powerpc/kernel/eeh_pe.c
>+++ b/arch/powerpc/kernel/eeh_pe.c
>@@ -172,6 +172,7 @@ static struct eeh_pe *eeh_pe_next(struct eeh_pe *pe,
>  * callback returns something other than NULL, or no more PEs
>  * to be traversed.
>  */
>+static void *__eeh_pe_get(void *data, void *flag);
> void *eeh_pe_traverse(struct eeh_pe *root,
> 		      eeh_traverse_func fn, void *flag)
> {
>@@ -179,6 +180,8 @@ void *eeh_pe_traverse(struct eeh_pe *root,
> 	void *ret;
> 	for (pe = root; pe; pe = eeh_pe_next(pe, root)) {
>+		if (eeh_pe_passed(pe) && (fn != __eeh_pe_get))
>+			continue;

The code change here seems ugly.

The "flag" argument can be extended to indicate whether pass-through PEs
should be skipped, so that the function calling eeh_pe_traverse() decides
whether to skip them.
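
A rough, untested sketch of that idea as a standalone user-space model, not
kernel code. Everything here is a stand-in made up for illustration: struct
pe replaces struct eeh_pe, the "passed" field replaces eeh_pe_passed(), the
"next" pointer replaces eeh_pe_next(), and struct traverse_args is a
hypothetical wrapper passed through "flag".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct eeh_pe (assumption, not the kernel type). */
struct pe {
	int addr;
	bool passed;		/* models eeh_pe_passed(pe) */
	struct pe *next;	/* models the order given by eeh_pe_next() */
};

typedef void *(*traverse_func)(struct pe *pe, void *data);

/* Hypothetical wrapper carried through the "flag" argument. */
struct traverse_args {
	bool skip_passed;	/* caller decides whether to skip pass-through PEs */
	void *data;		/* the caller's original payload */
};

/* Same loop shape as eeh_pe_traverse(), but the skip policy comes from flag. */
void *pe_traverse(struct pe *root, traverse_func fn, void *flag)
{
	struct traverse_args *args = flag;
	struct pe *pe;
	void *ret;

	assert(fn != NULL && args != NULL);

	for (pe = root; pe; pe = pe->next) {
		if (args->skip_passed && pe->passed)
			continue;
		ret = fn(pe, args->data);
		if (ret)
			return ret;
	}

	return NULL;
}

/* Example callback: count every PE the traversal hands to us. */
void *count_pe(struct pe *pe, void *data)
{
	(void)pe;
	(*(int *)data)++;
	return NULL;
}

/* Walk a chain of three PEs where only the middle one is passed through. */
int count_visited(bool skip_passed)
{
	struct pe c = { .addr = 3, .passed = false, .next = NULL };
	struct pe b = { .addr = 2, .passed = true,  .next = &c };
	struct pe a = { .addr = 1, .passed = false, .next = &b };
	int n = 0;
	struct traverse_args args = { .skip_passed = skip_passed, .data = &n };

	pe_traverse(&a, count_pe, &args);
	return n;
}
```

With this shape, a lookup such as __eeh_pe_get() would pass skip_passed =
false and still see every PE, while the recovery paths would pass true. No
comparison against a specific callback is needed inside the traversal.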

> 		ret = fn(pe, flag);
> 		if (ret) return ret;
> 	}
>@@ -210,6 +213,8 @@ void *eeh_pe_dev_traverse(struct eeh_pe *root,

> 	/* Traverse root PE */
> 	for (pe = root; pe; pe = eeh_pe_next(pe, root)) {
>+		if (eeh_pe_passed(pe))
>+			continue;
> 		eeh_pe_for_each_dev(pe, edev, tmp) {
> 			ret = fn(edev, flag);
> 			if (ret)

