[PATCH v2] powerpc/pseries: Do not initiate shutdown when system is running on UPS

Michael Ellerman mpe at ellerman.id.au
Thu Aug 20 22:57:46 AEST 2020


Vasant Hegde <hegdevasant at linux.vnet.ibm.com> writes:
> As per PAPR we have to look for both EPOW sensor value and event modifier to
> identify type of event and take appropriate action.
>
> Sensor value = 3 (EPOW_SYSTEM_SHUTDOWN) schedule system to be shutdown after
>                   OS defined delay (default 10 mins).
>
> EPOW Event Modifier for sensor value = 3:
>    We have to initiate immediate shutdown for most of the event modifier except
>    value = 2 (system running on UPS).
>
> Checking with firmware document its clear that we have to wait for predefined
> time before initiating shutdown. If power is restored within time we should
> cancel the shutdown process. I think commit 79872e35 accidently enabled
> immediate poweroff for EPOW_SHUTDOWN_ON_UPS event.

It's not that clear to me :)

LoPAPR v1.1 section 10.2.2 includes table 136 "EPOW Action Codes":

  SYSTEM_SHUTDOWN 3
  
  The system must be shut down. An EPOW-aware OS logs the EPOW error
  log information, then schedules the system to be shut down to begin
  after an OS defined delay internal (default is 10 minutes.)

And then in section 10.3.2.2.8 there is table 146 "Platform Event Log
Format, Version 6, EPOW Section", which includes the "EPOW Event
Modifier":

  For EPOW sensor value = 3
  0x01 = Normal system shutdown with no additional delay
  0x02 = Loss of utility power, system is running on UPS/Battery
  0x03 = Loss of system critical functions, system should be shutdown
  0x04 = Ambient temperature too high
  All other values = reserved

There is also section 7.3.6.4 which includes a note saying:

  2. The report that a system needs to be shutdown due to running under
  a UPS would be given by the platform as an EPOW event with EPOW event
  modifier being given as, 0x02 = Loss of utility power, system is
  running on UPS/Battery, as described in section Section 10.3.2.2.8‚
  “Platform Event Log Format, EPOW Section‚” on page 308.


So the only mention of the 10 minutes is in relation to all
SYSTEM_SHUTDOWN events. ie. according to that we should not be doing an
immediate shutdown for any of the events.

> We have user space tool (rtas_errd) on LPAR to monitor for EPOW_SHUTDOWN_ON_UPS.
> Once it gets event it initiates shutdown after predefined time. Also starts
> monitoring for any new EPOW events. If it receives "Power restored" event
> before predefined time it will cancel the shutdown. Otherwise after
> predefined time it will shutdown the system.

What event are you referring to as the "Power restored" event? AFAICS
PAPR just says we "may" receive an EPOW_RESET.

I can't see anything else about what we're supposed to do if power is
restored.

Anyway I'm not opposed to the change, but I don't think it's correct to
say that PAPR defines the behaviour.

Rather we used to implement a certain behaviour, and we have at least
one customer who relies on that old behaviour and dislikes the new
behaviour. It's also generally good to defer decisions like this to
userspace, so that administrators can customise the behaviour.

Anyway I'll massage the change log a bit to incorporate some of the
above and apply it.

cheers

> Fixes: 79872e35 (powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown)
> Cc: stable at vger.kernel.org # v4.0+
> Cc: Tyrel Datwyler <tyreld at linux.ibm.com>
> Cc: Michael Ellerman <mpe at ellerman.id.au>
> Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
> ---
> Changes in v2:
>   - Updated patch description based on mpe, Tyrel comment.
>
> -Vasant
>  arch/powerpc/platforms/pseries/ras.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index f3736fcd98fc..13c86a292c6d 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -184,7 +184,6 @@ static void handle_system_shutdown(char event_modifier)
>  	case EPOW_SHUTDOWN_ON_UPS:
>  		pr_emerg("Loss of system power detected. System is running on"
>  			 " UPS/battery. Check RTAS error log for details\n");
> -		orderly_poweroff(true);
>  		break;
>  
>  	case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:
> -- 
> 2.26.2


More information about the Linuxppc-dev mailing list