[v5] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform

Vipin K Parashar vipin at linux.vnet.ibm.com
Thu Jun 4 05:12:01 AEST 2015


Hi Michael,
               Thanks for review. Responses below

On 06/03/2015 10:43 AM, Michael Ellerman wrote:
> On Mon, 2015-18-05 at 15:18:04 UTC, Vipin K Parashar wrote:
>> This patch adds support for FSP EPOW (Early Power Off Warning) and
> Please spell out the acronyms the first time you use them, including FSP.

Will do.

>
>> DPO (Delayed Power Off) events for PowerNV platform. EPOW events are
>                                      ^
> 				    the

the PowerNV platform.  Will edit.

>
>> generated by SPCN/FSP due to various critical system conditions that
> SPCN?

Will remove SPCN. FSP should be sufficient.

>
>> need system shutdown. Few examples of these conditions are high
>                          ^
> s/need/require/ ?       A few

Agreed.

>
>> ambient temperature or system running on UPS power with low UPS battery.
>> DPO event is generated in response to admin initiated system request.
> Blank line between paragraphs please.

Sure

>
>> 	Upon receipt of EPOW and DPO events host kernel invokes
>                                              ^
> 					    the host kernel

will edit

>> orderly_poweroff for performing graceful system shutdown. System admin
> I like it if you spell functions with a trailing () to make it clear they are
> functions, so this would be "orderly_powerof()".

Agreed.

>
>> can also add systemd service shutdown scripts to perform any specific
>> actions like graceful guest shutdown upon system poweroff. libvirt-guests
>> is systemd service available on recent distros for management of guests
>> at system start/shutdown time.
> This last part about the scripts is not relevant to the kernel patch so just
> leave it out please.

Agreed.

>
>> Signed-off-by: Vipin K Parashar <vipin at linux.vnet.ibm.com>
>> Reviewed-by: Joel Stanley <joel at jms.id.au>
>> Reviewed-by: Vaibhav Jain <vaibhav at linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/include/asm/opal-api.h            |  44 ++++++++
>>   arch/powerpc/include/asm/opal.h                |   3 +-
>>   arch/powerpc/platforms/powernv/opal-power.c    | 147 ++++++++++++++++++++++---
>>   arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
>>   4 files changed, 179 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
>> index 0321a90..90fa364 100644
>> --- a/arch/powerpc/include/asm/opal-api.h
>> +++ b/arch/powerpc/include/asm/opal-api.h
>> @@ -355,6 +355,10 @@ enum opal_msg_type {
>>   	OPAL_MSG_TYPE_MAX,
>>   };
>>   
>> +/* OPAL_MSG_SHUTDOWN parameter values */
>> +#define	SOFT_OFF	0x00
>> +#define	SOFT_REBOOT	0x01
> I don't see this in the skiboot version of opal-api.h ?
>
> They should be kept in sync.
>
> If it's a Linux only define it should go in opal.h

Agreed. Won't add these definitions to opal-api.h as its not present in 
skiboot version of opal-api.h.

>>   struct opal_msg {
>>   	__be32 msg_type;
>>   	__be32 reserved;
>> @@ -730,6 +734,46 @@ struct opal_i2c_request {
>>   	__be64 buffer_ra;		/* Buffer real address */
>>   };
>>   
>> +/*
>> + * EPOW status sharing (OPAL and the host)
>> + *
>> + * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
>> + * with individual elements being 16 bits wide to fetch the system
>> + * wide EPOW status. Each element in the buffer will contain the
>> + * EPOW status in it's bit representation for a particular EPOW sub
>> + * class as defiend here. So multiple detailed EPOW status bits
>> + * specific for any sub class can be represented in a single buffer
>> + * element as it's bit representation.
>> + */
>> +
>> +/* System EPOW type */
>> +enum OpalSysEpow {
>> +	OPAL_SYSEPOW_POWER	= 0,	/* Power EPOW */
>> +	OPAL_SYSEPOW_TEMP	= 1,	/* Temperature EPOW */
>> +	OPAL_SYSEPOW_COOLING	= 2,	/* Cooling EPOW */
>> +	OPAL_SYSEPOW_MAX	= 3,	/* Max EPOW categories */
>> +};
>> +
>> +/* Power EPOW */
>> +enum OpalSysPower {
>> +	OPAL_SYSPOWER_UPS	= 0x0001, /* System on UPS power */
>> +	OPAL_SYSPOWER_CHNG	= 0x0002, /* System power config change */
>> +	OPAL_SYSPOWER_FAIL	= 0x0004, /* System impending power failure */
>> +	OPAL_SYSPOWER_INCL	= 0x0008, /* System incomplete power */
>> +};
>> +
>> +/* Temperature EPOW */
>> +enum OpalSysTemp {
>> +	OPAL_SYSTEMP_AMB	= 0x0001, /* System over ambient temperature */
>> +	OPAL_SYSTEMP_INT	= 0x0002, /* System over internal temperature */
>> +	OPAL_SYSTEMP_HMD	= 0x0004, /* System over ambient humidity */
>> +};
>> +
>> +/* Cooling EPOW */
>> +enum OpalSysCooling {
>> +	OPAL_SYSCOOL_INSF	= 0x0001, /* System insufficient cooling */
>> +};
> I don't see the last three of these enums used at all, so please drop them.

OPAL_SYSPOWER_CHNG / FAIL / INCL, OPAL_SYSTEMP_HMD and OPAL_SYSCOOL_INSF
enums aren't used here but they are part of skiboot version of 
opal-api.h and
thus need to be retained.
          PKVM2.1 uses these enums and thus can't be removed from 
skiboot opal-api.h

>
>>   #endif /* __ASSEMBLY__ */
>>   
>>   #endif /* __OPAL_API_H */
>> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
>> index 042af1a..d30766f 100644
>> --- a/arch/powerpc/include/asm/opal.h
>> +++ b/arch/powerpc/include/asm/opal.h
>> @@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
>>   int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
>>   int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t error_type, uint8_t mask_action);
>>   int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t led_type, uint8_t led_action);
>> -int64_t opal_get_epow_status(__be64 *status);
>> +int64_t opal_get_epow_status(uint16_t *status, uint16_t *length);
> Has the signature of this function really changed or was it just wrong before?
>
> If it's changed how do we know we're running on a version of OPAL that supports
> the two argument version?

__be64 * status is wrong here. OPAL expects two arguments of type __be16 *
and has been unchanged.

>
> The parameter names don't seem very clear either. status is actually a pointer
> to an array of "statuses", and length is the number of entries in that array.
>
> Also you removed the endian annotations but then you pass it __be16 values, so
> that looks incorrect, you should be using __be16 here.

Will correct endian notation and use below names for epow, dpo api
int64_t opal_get_epow_status(__be16 *epow_status, __be16 *num_epow_classes);
int64_t opal_get_dpo_status(__be64 *dpo_timeout);

>
>> +int64_t opal_get_dpo_status(int64_t *dpo_timeout);
> Similarly this should be __be64 AFAICS.

Agreed. Changed api above.

>
>> diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
>> index ac46c2c..581bbd8 100644
>> --- a/arch/powerpc/platforms/powernv/opal-power.c
>> +++ b/arch/powerpc/platforms/powernv/opal-power.c
>> @@ -1,5 +1,5 @@
>>   /*
>> - * PowerNV OPAL power control for graceful shutdown handling
>> + * PowerNV support for OPAL power-control, poweroff events
>>    *
>>    * Copyright 2015 IBM Corp.
>>    *
>> @@ -9,18 +9,87 @@
>>    * 2 of the License, or (at your option) any later version.
>>    */
>>   
>> +#define pr_fmt(fmt)	"OPAL-POWER: "	fmt
> Please don't shout, "opal-power" is fine.

Will keep in small letters.

>
>>   #include <linux/kernel.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/timer.h>
> Don't think you need those?

yes, not needed now.

>
>>   #include <linux/reboot.h>
>> -#include <linux/notifier.h>
>> -
> I think you DO need notifier.h.

Agreed. Won't remove.

>
>> +#include <linux/of.h>
>>   #include <asm/opal.h>
>>   #include <asm/machdep.h>
>>   
>> -#define SOFT_OFF 0x00
>> -#define SOFT_REBOOT 0x01
>> +/* Get EPOW status */
>> +static bool get_epow_status(void)
> This is not a great name, "get" implies it gives you something back, but it
> doesn't it just tells you true or false.
>
> So maybe epow_event_pending() ?

Agreed. Will use better naming.

>
>> +{
>> +	int i;
>> +	u16 num_classes;
>> +	__be16 epow_classes;
> I think this would be cleaner if you just had a single num_classes and you
> endian swap in the one place you use it.

Agreed.

>
>> +	__be16 opal_epow_status[OPAL_SYSEPOW_MAX] = {0};
>> +
>> +	/* Send kernel EPOW classes supported info to OPAL */
>> +	epow_classes = cpu_to_be16(OPAL_SYSEPOW_MAX);
>> +
>> +	/* Get EPOW events information from OPAL */
>> +	opal_get_epow_status(opal_epow_status, &epow_classes);
> This could fail.

Will add a check for API failure.
Though skiboot implementation doesn't have a fail case for this.

>
>> +
>> +	/* Look for EPOW events present */
>> +	num_classes = be16_to_cpu(epow_classes);
>> +	for (i = 0; i < num_classes; i++) {
>> +		if (be16_to_cpu(opal_epow_status[i]))
>> +			return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +/* Process existing EPOW, DPO events */
>> +static void process_existing_poweroff_events(void)
>> +{
>> +	int rc;
>> +	__be64 opal_dpo_timeout;
>>   
>> +	/* Check for DPO event */
>> +	rc = opal_get_dpo_status(&opal_dpo_timeout);
>> +	if (rc != OPAL_WRONG_STATE) {
>> +		pr_info("Existing DPO event detected. Powering off system\n");
>> +		goto poweroff;
>> +	}
>> +
>> +	/* Check for EPOW event */
>> +	if (get_epow_status()) {
>> +		pr_info("Existing EPOW event detected. Powering off system");
>> +		goto poweroff;
>> +	}
>> +	return;
>> +
>> +poweroff:
>> +	orderly_poweroff(true);
> I don't like that much, you shouldn't need to use goto for such simple logic.
>
> Can you create a single function, maybe called event_pending(), and have it
> check both EPOW and DPO and return a bool if there's any kind of event pending.
>
> Then this can just become:
>
>    if (event_pending())
>    	orderly_poweroff(true);

Agreed. Will reorganize this code.

>
>> +}
>> +
>> +/* OPAL EPOW, DPO event notifier */
>> +static int opal_epow_dpo_event(struct notifier_block *nb,
>> +				unsigned long msg_type, void *msg)
>> +{
>> +	switch (msg_type) {
>> +	case OPAL_MSG_EPOW:
>> +		pr_info("EPOW msg received. Powering off system\n");
>> +		break;
>> +	case OPAL_MSG_DPO:
>> +		pr_info("DPO msg received. Powering off system\n");
>> +		break;
>> +	default:
>> +		pr_err("Unknown message type %lu\n", msg_type);
>> +		return 0;
>> +	}
>> +
>> +	orderly_poweroff(true);
>> +	return 0;
>> +}
> Why do we need a separate notifier function? Can't this just be folded into
> opal_power_control_event() ?

Will merge all of them into opal_power_control_event.

>
>> +
>> +/* OPAL power-control events notifier */
>>   static int opal_power_control_event(struct notifier_block *nb,
>> -				    unsigned long msg_type, void *msg)
>> +					unsigned long msg_type, void *msg)
>>   {
>>   	struct opal_msg *power_msg = msg;
>>   	uint64_t type;
>> @@ -29,20 +98,35 @@ static int opal_power_control_event(struct notifier_block *nb,
>>   
>>   	switch (type) {
>>   	case SOFT_REBOOT:
>> -		pr_info("OPAL: reboot requested\n");
>> +		pr_info("Reboot requested\n");
>>   		orderly_reboot();
>>   		break;
>>   	case SOFT_OFF:
>> -		pr_info("OPAL: poweroff requested\n");
>> +		pr_info("Poweroff requested\n");
>>   		orderly_poweroff(true);
>>   		break;
>>   	default:
>> -		pr_err("OPAL: power control type unexpected %016llx\n", type);
>> +		pr_err("Unknown power-control type %llu\n", type);
>>   	}
>>   
>>   	return 0;
>>   }
>>   
>> +/* OPAL EPOW event notifier block */
>> +static struct notifier_block opal_epow_nb = {
>> +	.notifier_call	= opal_epow_dpo_event,
>> +	.next		= NULL,
>> +	.priority	= 0,
>> +};
>> +
>> +/* OPAL DPO event notifier block */
>> +static struct notifier_block opal_dpo_nb = {
>> +	.notifier_call	= opal_epow_dpo_event,
>> +	.next		= NULL,
>> +	.priority	= 0,
>> +};
>> +
>> +/* OPAL power-control event notifier block */
>>   static struct notifier_block opal_power_control_nb = {
>>   	.notifier_call	= opal_power_control_event,
>>   	.next		= NULL,
>> @@ -51,16 +135,49 @@ static struct notifier_block opal_power_control_nb = {
>>   
>>   static int __init opal_power_control_init(void)
>>   {
>> -	int ret;
>> +	int ret, epow_dpo_supported = 0;
> Can you make that a bool and call it "supported".

Will change name to supported. Pasted reorganized code below.

>
>> +	struct device_node *node_epow;
> It's typical to just call it "np".

Agreed. will use "np"

>
>>   
>> +	/* Register OPAL power-control events notifier */
>>   	ret = opal_message_notifier_register(OPAL_MSG_SHUTDOWN,
>> -					     &opal_power_control_nb);
>> -	if (ret) {
>> -		pr_err("%s: Can't register OPAL event notifier (%d)\n",
>> -				__func__, ret);
>> -		return ret;
>> +				&opal_power_control_nb);
>> +	if (ret)
>> +		pr_err("Power-control events notifier registration "
>> +				"failed, ret = %d\n", ret);
> Please don't split the string, and similarly below.

sure

>
>> +
>> +	/* Determine EPOW, DPO support in hardware. */
>> +	node_epow = of_find_node_by_path("/ibm,opal/epow");
>> +	if (node_epow) {
>> +		epow_dpo_supported = of_device_is_compatible(node_epow,
>> +				"ibm,opal-v3-epow");
>> +		of_node_put(node_epow);
>>   	}
>>   
>> +	if (epow_dpo_supported)
>> +		pr_info("OPAL EPOW, DPO support detected.\n");
>> +	else
>> +		return 0;
> Clearer as:
>
> 	if (!supported)
> 		return 0;
>
> 	pr_info("OPAL EPOW, DPO support detected.\n");

Reorganized code as below

         /* Determine OPAL EPOW, DPO support */
         np = of_find_node_by_path("/ibm,opal/epow");
         if (np) {
                 int supported;

                 supported = of_device_is_compatible(np, 
"ibm,opal-v3-epow");
                 of_node_put(np);
                 if (!supported)
                         return 0;
                 pr_info("OPAL EPOW, DPO support detected.\n");
         }

>
>> +
>> +	/* Register EPOW event notifier */
>> +	ret = opal_message_notifier_register(OPAL_MSG_EPOW,
>> +			&opal_epow_nb);
>> +	if (ret)
>> +		pr_err("EPOW event notifier registration failed, "
>> +				"ret = %d\n", ret);
>> +
>> +	/* Register DPO event notifier */
>> +	ret = opal_message_notifier_register(OPAL_MSG_DPO,
>> +			&opal_dpo_nb);
>> +	if (ret)
>> +		pr_err("DPO event notifier registration failed, "
>> +				"ret = %d\n", ret);
>> +
>> +	/* Check for any existing EPOW or DPO events. */
>> +	process_existing_poweroff_events();
>> +
>> +	pr_info("Poweroff events support initialized\n");
>> +
>>   	return 0;
>>   }
>> +
> No extra blank line thanks.

sure

>
>>   machine_subsys_initcall(powernv, opal_power_control_init);
>
> cheers
>



More information about the Linuxppc-dev mailing list