Infinite looping observed in __offline_pages

Rashmica rashmica.g at gmail.com
Wed Aug 1 11:37:05 AEST 2018



On 26/07/18 04:11, John Allen wrote:
> Hi All,
>
> Under heavy stress and constant memory hot add/remove, I have observed
> the following loop to occasionally loop infinitely:
>
> mm/memory_hotplug.c:__offline_pages
>
> repeat:
>        /* start memory hot removal */
>        ret = -EINTR;
>        if (signal_pending(current))
>                goto failed_removal;
>
>        cond_resched();
>        lru_add_drain_all();
>        drain_all_pages(zone);
>
>        pfn = scan_movable_pages(start_pfn, end_pfn);
>        if (pfn) { /* We have movable pages */
>                ret = do_migrate_range(pfn, end_pfn);
>                goto repeat;
>        }
>

What is CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE set to for you?

I have also observed this when hot removing and adding memory. However,
I have only seen it when my kernel has
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n (when memory is onlined
automatically I do not have this issue), so I assumed that I wasn't
onlining the memory properly...
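
For comparison, with auto-onlining disabled a hot-added block is normally
brought online by writing to its sysfs state file. Below is a minimal
userspace sketch of that step. The helper names (memory_block_state_path,
set_memory_block_state) and the root-directory parameter are made up for
illustration; the only thing taken from the kernel is the
/sys/devices/system/memory/memoryN/state interface.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<root>/memory<N>/state".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int memory_block_state_path(char *buf, size_t len,
                                   const char *root, unsigned int block)
{
        int n = snprintf(buf, len, "%s/memory%u/state", root, block);

        return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write a state string such as "online" or
 * "offline" to one memory block. Returns 0 on success, -1 on error.
 */
static int set_memory_block_state(const char *root, unsigned int block,
                                  const char *state)
{
        char path[256];
        FILE *f;
        int ok;

        if (memory_block_state_path(path, sizeof(path), root, block) != 0)
                return -1;

        f = fopen(path, "w");
        if (!f)
                return -1;

        ok = fputs(state, f) >= 0;
        if (fclose(f) != 0)     /* the kernel reports a rejected write here */
                ok = 0;

        return ok ? 0 : -1;
}
```

Calling set_memory_block_state("/sys/devices/system/memory", 42, "online")
as root would attempt to online block 42; a -1 return means the path could
not be opened or the write was rejected.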

> What appears to be happening in this case is that do_migrate_range
> returns a failure code which is being ignored. The failure is stemming
> from migrate_pages returning "1" which I'm guessing is the result of
> us hitting the following case:
>
> mm/migrate.c: migrate_pages
>
>     default:
>         /*
>          * Permanent failure (-EBUSY, -ENOSYS, etc.):
>          * unlike -EAGAIN case, the failed page is
>          * removed from migration page list and not
>          * retried in the next outer loop.
>          */
>         nr_failed++;
>         break;
>     }
>
> Does a failure in do_migrate_range indicate that the range is
> unmigratable and the loop in __offline_pages should terminate and goto
> failed_removal? Or should we allow a certain number of retries before we
> give up on migrating the range?
>
> This issue was observed on a ppc64le lpar on a 4.18-rc6 kernel.
>
> -John
>
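
To make the "bounded retries" option concrete, here is a small userspace
model of the loop, not a kernel patch. Everything in it is an assumption
for illustration: the offline_ops callbacks only stand in for
scan_movable_pages() and do_migrate_range() and do not match their real
signatures, the toy_* helpers simulate a range with one page that never
migrates, and -EBUSY is just one plausible error to return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for scan_movable_pages()/do_migrate_range(). */
struct offline_ops {
        unsigned long (*scan_movable)(void *ctx);  /* 0: nothing left */
        int (*migrate_range)(void *ctx, unsigned long pfn); /* 0: success */
        void *ctx;
};

/*
 * Keep migrating while movable pages remain, but give up once
 * max_failures consecutive migration attempts have failed, instead of
 * ignoring the return value and looping forever.
 */
static int offline_with_retry_limit(const struct offline_ops *ops,
                                    int max_failures)
{
        int failures = 0;
        unsigned long pfn;

        assert(ops && ops->scan_movable && ops->migrate_range);

        while ((pfn = ops->scan_movable(ops->ctx)) != 0) {
                if (ops->migrate_range(ops->ctx, pfn) == 0) {
                        failures = 0;   /* progress made, reset the budget */
                        continue;
                }
                if (++failures >= max_failures)
                        return -EBUSY;  /* assumed error code */
        }
        return 0;
}

/* Toy range: movable_left pages migrate fine, stuck pages never do. */
struct toy_range {
        int movable_left;
        int stuck;
};

static unsigned long toy_scan(void *ctx)
{
        struct toy_range *r = ctx;

        return (r->movable_left > 0 || r->stuck > 0) ? 1UL : 0UL;
}

static int toy_migrate(void *ctx, unsigned long pfn)
{
        struct toy_range *r = ctx;

        (void)pfn;
        if (r->movable_left > 0) {
                r->movable_left--;
                return 0;
        }
        return 1;       /* mimics a positive "pages not migrated" count */
}
```

With a range that has a stuck page, the unbounded "goto repeat" version
would spin, whereas this model returns an error after max_failures
attempts. Whether a retry limit or an immediate bail-out is the right
policy for __offline_pages is exactly the open question above.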


