[v5] powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform
Michael Ellerman
mpe at ellerman.id.au
Wed Jun 3 15:13:08 AEST 2015
On Mon, 2015-05-18 at 15:18:04 UTC, Vipin K Parashar wrote:
> This patch adds support for FSP EPOW (Early Power Off Warning) and
Please spell out the acronyms the first time you use them, including FSP.
> DPO (Delayed Power Off) events for PowerNV platform. EPOW events are
s/for PowerNV platform/for the PowerNV platform/
> generated by SPCN/FSP due to various critical system conditions that
SPCN?
> need system shutdown. Few examples of these conditions are high
s/need/require/ ? Also s/Few examples/A few examples/
> ambient temperature or system running on UPS power with low UPS battery.
> DPO event is generated in response to admin initiated system request.
Blank line between paragraphs please.
> Upon receipt of EPOW and DPO events host kernel invokes
s/host kernel/the host kernel/
> orderly_poweroff for performing graceful system shutdown. System admin
I like it if you spell functions with a trailing () to make it clear they are
functions, so this would be "orderly_poweroff()".
> can also add systemd service shutdown scripts to perform any specific
> actions like graceful guest shutdown upon system poweroff. libvirt-guests
> is systemd service available on recent distros for management of guests
> at system start/shutdown time.
This last part about the scripts is not relevant to the kernel patch so just
leave it out please.
>
> Signed-off-by: Vipin K Parashar <vipin at linux.vnet.ibm.com>
> Reviewed-by: Joel Stanley <joel at jms.id.au>
> Reviewed-by: Vaibhav Jain <vaibhav at linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/opal-api.h | 44 ++++++++
> arch/powerpc/include/asm/opal.h | 3 +-
> arch/powerpc/platforms/powernv/opal-power.c | 147 ++++++++++++++++++++++---
> arch/powerpc/platforms/powernv/opal-wrappers.S | 1 +
> 4 files changed, 179 insertions(+), 16 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 0321a90..90fa364 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -355,6 +355,10 @@ enum opal_msg_type {
> OPAL_MSG_TYPE_MAX,
> };
>
> +/* OPAL_MSG_SHUTDOWN parameter values */
> +#define SOFT_OFF 0x00
> +#define SOFT_REBOOT 0x01
I don't see this in the skiboot version of opal-api.h ?
They should be kept in sync.
If it's a Linux only define it should go in opal.h
> struct opal_msg {
> __be32 msg_type;
> __be32 reserved;
> @@ -730,6 +734,46 @@ struct opal_i2c_request {
> __be64 buffer_ra; /* Buffer real address */
> };
>
> +/*
> + * EPOW status sharing (OPAL and the host)
> + *
> + * The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
> + * with individual elements being 16 bits wide to fetch the system
> + * wide EPOW status. Each element in the buffer will contain the
> + * EPOW status in it's bit representation for a particular EPOW sub
> + * class as defiend here. So multiple detailed EPOW status bits
> + * specific for any sub class can be represented in a single buffer
> + * element as it's bit representation.
> + */
> +
> +/* System EPOW type */
> +enum OpalSysEpow {
> + OPAL_SYSEPOW_POWER = 0, /* Power EPOW */
> + OPAL_SYSEPOW_TEMP = 1, /* Temperature EPOW */
> + OPAL_SYSEPOW_COOLING = 2, /* Cooling EPOW */
> + OPAL_SYSEPOW_MAX = 3, /* Max EPOW categories */
> +};
> +
> +/* Power EPOW */
> +enum OpalSysPower {
> + OPAL_SYSPOWER_UPS = 0x0001, /* System on UPS power */
> + OPAL_SYSPOWER_CHNG = 0x0002, /* System power config change */
> + OPAL_SYSPOWER_FAIL = 0x0004, /* System impending power failure */
> + OPAL_SYSPOWER_INCL = 0x0008, /* System incomplete power */
> +};
> +
> +/* Temperature EPOW */
> +enum OpalSysTemp {
> + OPAL_SYSTEMP_AMB = 0x0001, /* System over ambient temperature */
> + OPAL_SYSTEMP_INT = 0x0002, /* System over internal temperature */
> + OPAL_SYSTEMP_HMD = 0x0004, /* System over ambient humidity */
> +};
> +
> +/* Cooling EPOW */
> +enum OpalSysCooling {
> + OPAL_SYSCOOL_INSF = 0x0001, /* System insufficient cooling */
> +};
I don't see the last three of these enums used at all, so please drop them.
> #endif /* __ASSEMBLY__ */
>
> #endif /* __OPAL_API_H */
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 042af1a..d30766f 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -141,7 +141,8 @@ int64_t opal_pci_fence_phb(uint64_t phb_id);
> int64_t opal_pci_reinit(uint64_t phb_id, uint64_t reinit_scope, uint64_t data);
> int64_t opal_pci_mask_pe_error(uint64_t phb_id, uint16_t pe_number, uint8_t error_type, uint8_t mask_action);
> int64_t opal_set_slot_led_status(uint64_t phb_id, uint64_t slot_id, uint8_t led_type, uint8_t led_action);
> -int64_t opal_get_epow_status(__be64 *status);
> +int64_t opal_get_epow_status(uint16_t *status, uint16_t *length);
Has the signature of this function really changed or was it just wrong before?
If it's changed how do we know we're running on a version of OPAL that supports
the two argument version?
The parameter names don't seem very clear either. status is actually a pointer
to an array of "statuses", and length is the number of entries in that array.
Also you removed the endian annotations but then you pass it __be16 values, so
that looks incorrect, you should be using __be16 here.
> +int64_t opal_get_dpo_status(int64_t *dpo_timeout);
Similarly this should be __be64 AFAICS.
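
To illustrate the endian point: the values OPAL hands back are big-endian, so
on a little-endian host every one of them has to be converted before use. A
minimal userspace sketch, assuming stand-in types be16_t/be64_t and helper
names made up for this example (in the kernel you would use __be16/__be64 with
be16_to_cpu()/be64_to_cpu()); the parameter names in the comment are only a
suggestion:

```c
#include <stdint.h>

/* Stand-ins for the kernel's __be16/__be64: the bytes are stored big-endian. */
typedef struct { uint8_t b[2]; } be16_t;
typedef struct { uint8_t b[8]; } be64_t;

/* Portable equivalent of be16_to_cpu(): correct on any host byte order. */
static uint16_t be16_to_host(be16_t v)
{
	return (uint16_t)((v.b[0] << 8) | v.b[1]);
}

/* Portable equivalent of be64_to_cpu(). */
static uint64_t be64_to_host(be64_t v)
{
	uint64_t out = 0;

	for (int i = 0; i < 8; i++)
		out = (out << 8) | v.b[i];
	return out;
}

/* Portable equivalent of cpu_to_be16(). */
static be16_t host_to_be16(uint16_t v)
{
	be16_t out = { { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) } };

	return out;
}

/*
 * Possible shape of the annotated prototypes (names are a proposal only):
 *   int64_t opal_get_epow_status(__be16 *statuses, __be16 *num_entries);
 *   int64_t opal_get_dpo_status(__be64 *dpo_timeout);
 */
```

With the annotations in place sparse can flag any spot where a big-endian
value is used without conversion.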
> diff --git a/arch/powerpc/platforms/powernv/opal-power.c b/arch/powerpc/platforms/powernv/opal-power.c
> index ac46c2c..581bbd8 100644
> --- a/arch/powerpc/platforms/powernv/opal-power.c
> +++ b/arch/powerpc/platforms/powernv/opal-power.c
> @@ -1,5 +1,5 @@
> /*
> - * PowerNV OPAL power control for graceful shutdown handling
> + * PowerNV support for OPAL power-control, poweroff events
> *
> * Copyright 2015 IBM Corp.
> *
> @@ -9,18 +9,87 @@
> * 2 of the License, or (at your option) any later version.
> */
>
> +#define pr_fmt(fmt) "OPAL-POWER: " fmt
Please don't shout, "opal-power" is fine.
> #include <linux/kernel.h>
> +#include <linux/spinlock.h>
> +#include <linux/timer.h>
Don't think you need those?
> #include <linux/reboot.h>
> -#include <linux/notifier.h>
> -
I think you DO need notifier.h.
> +#include <linux/of.h>
> #include <asm/opal.h>
> #include <asm/machdep.h>
>
> -#define SOFT_OFF 0x00
> -#define SOFT_REBOOT 0x01
> +/* Get EPOW status */
> +static bool get_epow_status(void)
This is not a great name, "get" implies it gives you something back, but it
doesn't, it just tells you true or false.
So maybe epow_event_pending() ?
> +{
> + int i;
> + u16 num_classes;
> + __be16 epow_classes;
I think this would be cleaner if you just had a single num_classes and you
endian swap in the one place you use it.
> + __be16 opal_epow_status[OPAL_SYSEPOW_MAX] = {0};
> +
> + /* Send kernel EPOW classes supported info to OPAL */
> + epow_classes = cpu_to_be16(OPAL_SYSEPOW_MAX);
> +
> + /* Get EPOW events information from OPAL */
> + opal_get_epow_status(opal_epow_status, &epow_classes);
This could fail.
> +
> + /* Look for EPOW events present */
> + num_classes = be16_to_cpu(epow_classes);
> + for (i = 0; i < num_classes; i++) {
> + if (be16_to_cpu(opal_epow_status[i]))
> + return true;
> + }
> +
> + return false;
> +}
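
Putting the comments on this function together, something along these lines
is what I have in mind. It is a userspace sketch only: the OPAL call and the
endian helpers are mocked so it can be compiled standalone, the mock_* names
are invented for the example, and treating a failed call as "no event" is an
assumption you should check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define OPAL_SUCCESS     0
#define OPAL_SYSEPOW_MAX 3

/* Portable stand-ins for the kernel helpers of the same name. */
static uint16_t cpu_to_be16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

static uint16_t be16_to_cpu(uint16_t v)
{
	uint8_t b[2];

	memcpy(b, &v, sizeof(b));
	return (uint16_t)((b[0] << 8) | b[1]);
}

/* Mock firmware state (hypothetical, for this example only). */
static int64_t mock_rc;
static uint16_t mock_status[OPAL_SYSEPOW_MAX];

/* Mock of the OPAL call: fills the buffer with big-endian values. */
static int64_t opal_get_epow_status(uint16_t *statuses, uint16_t *num_entries)
{
	uint16_t n = be16_to_cpu(*num_entries);

	if (mock_rc != OPAL_SUCCESS)
		return mock_rc;
	if (n > OPAL_SYSEPOW_MAX)
		n = OPAL_SYSEPOW_MAX;
	for (int i = 0; i < n; i++)
		statuses[i] = cpu_to_be16(mock_status[i]);
	*num_entries = cpu_to_be16(n);
	return OPAL_SUCCESS;
}

static bool epow_event_pending(void)
{
	uint16_t statuses[OPAL_SYSEPOW_MAX] = { 0 };	/* big-endian */
	uint16_t num_classes = cpu_to_be16(OPAL_SYSEPOW_MAX);	/* big-endian */
	int64_t rc;

	rc = opal_get_epow_status(statuses, &num_classes);
	if (rc != OPAL_SUCCESS)
		return false;	/* assumption: a failed call means no event */

	/* Swap in the one place the count is used. */
	for (int i = 0; i < be16_to_cpu(num_classes); i++) {
		if (statuses[i])	/* nonzero in either byte order */
			return true;
	}

	return false;
}
```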
> +
> +/* Process existing EPOW, DPO events */
> +static void process_existing_poweroff_events(void)
> +{
> + int rc;
> + __be64 opal_dpo_timeout;
>
> + /* Check for DPO event */
> + rc = opal_get_dpo_status(&opal_dpo_timeout);
> + if (rc != OPAL_WRONG_STATE) {
> + pr_info("Existing DPO event detected. Powering off system\n");
> + goto poweroff;
> + }
> +
> + /* Check for EPOW event */
> + if (get_epow_status()) {
> + pr_info("Existing EPOW event detected. Powering off system");
> + goto poweroff;
> + }
> + return;
> +
> +poweroff:
> + orderly_poweroff(true);
I don't like that much, you shouldn't need to use goto for such simple logic.
Can you create a single function, maybe called event_pending(), and have it
check both EPOW and DPO and return a bool if there's any kind of event pending.
Then this can just become:
	if (event_pending())
		orderly_poweroff(true);
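
Spelled out a bit more, as a standalone sketch. The two per-event checks and
orderly_poweroff() are mocked here so it compiles in userspace, and the
mock_* variables and dpo_event_pending() are names I made up for the example:

```c
#include <stdbool.h>

/* Mock state standing in for what firmware would report. */
static bool mock_dpo_pending;
static bool mock_epow_pending;
static int poweroff_calls;

/* Hypothetical helpers: each one wraps a single OPAL query. */
static bool dpo_event_pending(void)
{
	return mock_dpo_pending;
}

static bool epow_event_pending(void)
{
	return mock_epow_pending;
}

/* Mock of the kernel function: just counts how often it was requested. */
static void orderly_poweroff(bool force)
{
	(void)force;
	poweroff_calls++;
}

/* True if any kind of poweroff event is pending. */
static bool event_pending(void)
{
	return dpo_event_pending() || epow_event_pending();
}

static void process_existing_poweroff_events(void)
{
	if (event_pending())
		orderly_poweroff(true);
}
```

You lose the separate "DPO" vs "EPOW" pr_info() this way, if you want to keep
those they can go in the individual helpers.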
> +}
> +
> +/* OPAL EPOW, DPO event notifier */
> +static int opal_epow_dpo_event(struct notifier_block *nb,
> + unsigned long msg_type, void *msg)
> +{
> + switch (msg_type) {
> + case OPAL_MSG_EPOW:
> + pr_info("EPOW msg received. Powering off system\n");
> + break;
> + case OPAL_MSG_DPO:
> + pr_info("DPO msg received. Powering off system\n");
> + break;
> + default:
> + pr_err("Unknown message type %lu\n", msg_type);
> + return 0;
> + }
> +
> + orderly_poweroff(true);
> + return 0;
> +}
Why do we need a separate notifier function? Can't this just be folded into
opal_power_control_event() ?
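
Roughly what I mean, as a userspace sketch. The MSG_* values are made up for
the example (the real ones are in opal-api.h), the orderly_*() functions are
mocks, and the shutdown type is passed in already decoded rather than read
out of the opal_msg:

```c
#include <stdint.h>

/* Hypothetical message type values, for this example only. */
enum mock_msg_type { MSG_SHUTDOWN, MSG_EPOW, MSG_DPO, MSG_OTHER };

#define SOFT_OFF	0x00
#define SOFT_REBOOT	0x01

static int poweroff_calls;
static int reboot_calls;

/* Mocks of the kernel functions: just count the requests. */
static void orderly_poweroff(int force)
{
	(void)force;
	poweroff_calls++;
}

static void orderly_reboot(void)
{
	reboot_calls++;
}

/* One callback that could be registered for all three message types. */
static int power_control_event(unsigned long msg_type, uint64_t shutdown_type)
{
	switch (msg_type) {
	case MSG_EPOW:
	case MSG_DPO:
		/* Both events end in a graceful poweroff. */
		orderly_poweroff(1);
		break;
	case MSG_SHUTDOWN:
		if (shutdown_type == SOFT_REBOOT)
			orderly_reboot();
		else if (shutdown_type == SOFT_OFF)
			orderly_poweroff(1);
		break;
	default:
		/* Unknown message type: nothing to do. */
		break;
	}

	return 0;
}
```

That would also let you use a single notifier_block for all three
registrations, assuming the notifier code copes with that, which I haven't
checked.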
> +
> +/* OPAL power-control events notifier */
> static int opal_power_control_event(struct notifier_block *nb,
> - unsigned long msg_type, void *msg)
> + unsigned long msg_type, void *msg)
> {
> struct opal_msg *power_msg = msg;
> uint64_t type;
> @@ -29,20 +98,35 @@ static int opal_power_control_event(struct notifier_block *nb,
>
> switch (type) {
> case SOFT_REBOOT:
> - pr_info("OPAL: reboot requested\n");
> + pr_info("Reboot requested\n");
> orderly_reboot();
> break;
> case SOFT_OFF:
> - pr_info("OPAL: poweroff requested\n");
> + pr_info("Poweroff requested\n");
> orderly_poweroff(true);
> break;
> default:
> - pr_err("OPAL: power control type unexpected %016llx\n", type);
> + pr_err("Unknown power-control type %llu\n", type);
> }
>
> return 0;
> }
>
> +/* OPAL EPOW event notifier block */
> +static struct notifier_block opal_epow_nb = {
> + .notifier_call = opal_epow_dpo_event,
> + .next = NULL,
> + .priority = 0,
> +};
> +
> +/* OPAL DPO event notifier block */
> +static struct notifier_block opal_dpo_nb = {
> + .notifier_call = opal_epow_dpo_event,
> + .next = NULL,
> + .priority = 0,
> +};
> +
> +/* OPAL power-control event notifier block */
> static struct notifier_block opal_power_control_nb = {
> .notifier_call = opal_power_control_event,
> .next = NULL,
> @@ -51,16 +135,49 @@ static struct notifier_block opal_power_control_nb = {
>
> static int __init opal_power_control_init(void)
> {
> - int ret;
> + int ret, epow_dpo_supported = 0;
Can you make that a bool and call it "supported".
> + struct device_node *node_epow;
It's typical to just call it "np".
>
> + /* Register OPAL power-control events notifier */
> ret = opal_message_notifier_register(OPAL_MSG_SHUTDOWN,
> - &opal_power_control_nb);
> - if (ret) {
> - pr_err("%s: Can't register OPAL event notifier (%d)\n",
> - __func__, ret);
> - return ret;
> + &opal_power_control_nb);
> + if (ret)
> + pr_err("Power-control events notifier registration "
> + "failed, ret = %d\n", ret);
Please don't split the string, and similarly below.
> +
> + /* Determine EPOW, DPO support in hardware. */
> + node_epow = of_find_node_by_path("/ibm,opal/epow");
> + if (node_epow) {
> + epow_dpo_supported = of_device_is_compatible(node_epow,
> + "ibm,opal-v3-epow");
> + of_node_put(node_epow);
> }
>
> + if (epow_dpo_supported)
> + pr_info("OPAL EPOW, DPO support detected.\n");
> + else
> + return 0;
Clearer as:
	if (!supported)
		return 0;
	pr_info("OPAL EPOW, DPO support detected.\n");
> +
> + /* Register EPOW event notifier */
> + ret = opal_message_notifier_register(OPAL_MSG_EPOW,
> + &opal_epow_nb);
> + if (ret)
> + pr_err("EPOW event notifier registration failed, "
> + "ret = %d\n", ret);
> +
> + /* Register DPO event notifier */
> + ret = opal_message_notifier_register(OPAL_MSG_DPO,
> + &opal_dpo_nb);
> + if (ret)
> + pr_err("DPO event notifier registration failed, "
> + "ret = %d\n", ret);
> +
> + /* Check for any existing EPOW or DPO events. */
> + process_existing_poweroff_events();
> +
> + pr_info("Poweroff events support initialized\n");
> +
> return 0;
> }
> +
No extra blank line thanks.
> machine_subsys_initcall(powernv, opal_power_control_init);
cheers