[Skiboot] [PATCH] core/ipmi: Fix use-after-free

Oliver O'Halloran oohall at gmail.com
Tue Oct 15 14:37:36 AEDT 2019


On Tue, Oct 15, 2019 at 4:04 AM Vasant Hegde
<hegdevasant at linux.vnet.ibm.com> wrote:
>
> Commit f01cd77 introduced backend poller() for ipmi message. But in some
> corner cases its possible that we endup calling poller() after freeing
> ipmi message.
>
> Thread 1 :
>   ipmi_queue_msg_sync()
>     Waiting for ipmi sync message to complete
>
> Thread 2 :
>   bt_poll() -> ipmi_cmd_done() -> callback handler -> free message
>
> Oliver hit this issue during fast-reboot test with skiboot DEBUG build.
> In debug build we poision the memory after free. That helped us to catch
> this issue.
>
> [  460.295570781,3] ***********************************************
> [  460.295773157,3] Fatal MCE at 0000000030035cb4   .ipmi_queue_msg_sync+0x110  MSR 9000000000201002
> [  460.295887496,3] CFAR : 0000000030035ce8 MSR  : 9000000000000000
> [  460.295956419,3] SRR0 : 0000000030035cb4 SRR1 : 9000000000201002
> [  460.296035015,3] HSRR0: 0000000030012624 HSRR1: 9000000002803002
> [  460.296102413,3] DSISR: 00000008         DAR  : 99999999999999d1
> [  460.296169710,3] LR   : 0000000030035ce4 CTR  : 0000000030002880
> [  460.296248482,3] CR   : 28002422         XER  : 20040000
> [  460.296336621,3] GPR00: 0000000030035ce4 GPR16: 00000000301d36d8
> [  460.296415449,3] GPR01: 0000000031c133d0 GPR17: 00000000300f5cd8
> [  460.296482811,3] GPR02: 0000000030142700 GPR18: 0000000030407ff0
> [  460.296550265,3] GPR03: 0000000000000100 GPR19: 0000000000000000
> [  460.296629041,3] GPR04: 0000000028002424 GPR20: 0000000000000000
> [  460.296696369,3] GPR05: 0000000020040000 GPR21: 0000000030121d73
> [  460.296820977,3] GPR06: c000001fffffd480 GPR22: 0000000030121dd2
> [  460.296888226,3] GPR07: c000001fffffd480 GPR23: 0000000030613400
> [  460.296978218,3] GPR08: 0000000000000001 GPR24: 0000000000000001
> [  460.297056871,3] GPR09: 9999999999999999 GPR25: 0000000031c13960
> [  460.297124647,3] GPR10: 0000000000000000 GPR26: 0000000000000004
> [  460.297203811,3] GPR11: 0000000000000000 GPR27: 0000000000000003
> [  460.297271250,3] GPR12: 0000000028002424 GPR28: 0000000030613400
> [  460.297339026,3] GPR13: 0000000031c10000 GPR29: 0000000030406b50
> [  460.297417605,3] GPR14: 00000000300f58f8 GPR30: 0000000030406b40
> [  460.297485176,3] GPR15: 00000000300f58d8 GPR31: 00000000309249c8
>
> Reported-by: Oliver O'Halloran <oohall at gmail.com>
> Fixes: f01cd77 (ipmi: ensure forward progress on ipmi_queue_msg_sync())
> Cc: skiboot-stable at lists.ozlabs.org # v6.3+
> Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
> ---
>  core/ipmi.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/core/ipmi.c b/core/ipmi.c
> index 566300e13..ab2931bc4 100644
> --- a/core/ipmi.c
> +++ b/core/ipmi.c
> @@ -126,6 +126,8 @@ int ipmi_dequeue_msg(struct ipmi_msg *msg)
>
>  void ipmi_cmd_done(uint8_t cmd, uint8_t netfn, uint8_t cc, struct ipmi_msg *msg)
>  {
> +       bool clr_sync_msg = false;
> +
>         msg->cc = cc;
>         if (msg->cmd != cmd) {
>                 prerror("IPMI: Incorrect cmd 0x%02x in response\n", cmd);
> @@ -138,6 +140,10 @@ void ipmi_cmd_done(uint8_t cmd, uint8_t netfn, uint8_t cc, struct ipmi_msg *msg)
>         }
>         msg->netfn = netfn;
>
> +       if (msg == sync_msg)
> +               clr_sync_msg = true;
> +
> +
>         if (cc != IPMI_CC_NO_ERROR) {
>                 prlog(PR_DEBUG, "IPMI: Got error response. cmd=0x%x, netfn=0x%x,"
>                       " rc=0x%02x\n", msg->cmd, msg->netfn >> 2, msg->cc);
> @@ -151,12 +157,14 @@ void ipmi_cmd_done(uint8_t cmd, uint8_t netfn, uint8_t cc, struct ipmi_msg *msg)
>            completion functions. */
>
>         /* If this is a synchronous message flag that we are done */
> -       if (msg == sync_msg)
> +       if (clr_sync_msg)
>                 sync_msg = NULL;
>  }

Hmm, should have looked at this closer before pushing it through CI
and into master. Oh well.

The clr_sync_msg flag here isn't really necessary. We might have
freed() the memory pointed to be msg, but msg itself is just a local
variable and comparing it with sync_msg is just a pointer comparison
which is perfectly safe. I think we might need a barrier after setting
sync_msg = NULL too to ensure that the update to sync_msg is visible
to other threads. It probably doesn't matter much since the other
threads will just block until they see the update though.

That said, merged to master as
d75e82dbfbb9443efeb3f9a5921ac23605aab469. Thanks for looking into it.


More information about the Skiboot mailing list