[PATCH v2] powernv/elog: Fix the race while processing OPAL error log event.

Oliver O'Halloran oohall at gmail.com
Tue Oct 6 11:25:39 AEDT 2020


On Mon, Oct 5, 2020 at 3:12 PM Mahesh Salgaonkar <mahesh at linux.ibm.com> wrote:
>
> Every error log reported by OPAL is exported to userspace through a sysfs
> interface and notified using kobject_uevent(). The userspace daemon
> (opal_errd) then reads the error log and acknowledges it error log is saved
> safely to disk. Once acknowledged the kernel removes the respective sysfs
> file entry causing respective resources getting released including kobject.
>
> However there are chances where user daemon may already be scanning elog
> entries while new sysfs elog entry is being created by kernel. User daemon
> may read this new entry and ack it even before kernel can notify userspace
> about it through kobject_uevent() call. If that happens then we have a
> potential race between elog_ack_store->kobject_put() and kobject_uevent
> which can lead to use-after-free issue of a kernfs object resulting into a
> kernel crash. This patch fixes this race by protecting a sysfs file
> creation/notification by holding an additional reference count on kobject
> until we safely send kobject_uevent().
>
> Reported-by: Oliver O'Halloran <oohall at gmail.com>
> Signed-off-by: Mahesh Salgaonkar <mahesh at linux.ibm.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
> ---
> Change in v2:
> - Instead of mutex and use extra reference count on kobject to avoid the
>   race.
> ---
>  arch/powerpc/platforms/powernv/opal-elog.c |   15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/opal-elog.c b/arch/powerpc/platforms/powernv/opal-elog.c
> index 62ef7ad995da..230f102e87c0 100644
> --- a/arch/powerpc/platforms/powernv/opal-elog.c
> +++ b/arch/powerpc/platforms/powernv/opal-elog.c
> @@ -222,13 +222,28 @@ static struct elog_obj *create_elog_obj(uint64_t id, size_t size, uint64_t type)
>                 return NULL;
>         }
>
> +       /*
> +        * As soon as sysfs file for this elog is created/activated there is
> +        * chance opal_errd daemon might read and acknowledge this elog before
> +        * kobject_uevent() is called. If that happens then we have a potential
> +        * race between elog_ack_store->kobject_put() and kobject_uevent which
> +        * leads to use-after-free issue of a kernfs object resulting into
> +        * kernel crash. To avoid this race take an additional reference count
> +        * on kobject until we safely send kobject_uevent().
> +        */
> +
> +       kobject_get(&elog->kobj);  /* extra reference count */
>         rc = sysfs_create_bin_file(&elog->kobj, &elog->raw_attr);
>         if (rc) {
> +               kobject_put(&elog->kobj);
> +               /* Drop the extra reference count  */
>                 kobject_put(&elog->kobj);
>                 return NULL;
>         }
>
>         kobject_uevent(&elog->kobj, KOBJ_ADD);
> +       /* Drop the extra reference count  */
> +       kobject_put(&elog->kobj);

Makes sense,

Reviewed-by: Oliver O'Halloran <oohall at gmail.com>

>
>         return elog;

Does the returned value actually get used anywhere? We'd have a
similar use-after-free problem if it does. This should probably return
an error code rather than the object itself.


More information about the Linuxppc-dev mailing list