[PATCH] powerpc/pseries: fix oops in hotplug memory notifier

Nathan Lynch nathanl at linux.ibm.com
Sat Jun 8 09:39:47 AEST 2019


Tyrel Datwyler <tyreld at linux.vnet.ibm.com> writes:

> On 06/06/2019 10:04 PM, Nathan Lynch wrote:
>> During post-migration device tree updates, we can oops in
>> pseries_update_drconf_memory if the source device tree has an
>> ibm,dynamic-memory-v2 property and the destination has a
>> ibm,dynamic_memory (v1) property. The notifier processes an "update"
>> for the ibm,dynamic-memory property but it's really an add in this
>> scenario. So make sure the old property object is there before
>> dereferencing it.
>> 
>> Signed-off-by: Nathan Lynch <nathanl at linux.ibm.com>
>> ---
>
> Yes, this patch solves the oops, but I worry it just papers over some short
> comings in our code.
>
> After some poking around unless I'm mistaken our memory notifier only handles v1
> versions of the ibm,dynamic-memory property. So, on newer firmware we aren't
> doing any memory fixups if v2 (ibm,dynamic-memory-v2) of the property
> is updated.

It's not clear to me what the notifier is meant to accomplish:

	for (i = 0; i < entries; i++) {
		if ((be32_to_cpu(old_drmem[i].flags) & DRCONF_MEM_ASSIGNED) &&
		    (!(be32_to_cpu(new_drmem[i].flags) & DRCONF_MEM_ASSIGNED))) {
			rc = pseries_remove_memblock(
				be64_to_cpu(old_drmem[i].base_addr),
						     memblock_size);
			break;
		} else if ((!(be32_to_cpu(old_drmem[i].flags) &
			    DRCONF_MEM_ASSIGNED)) &&
			    (be32_to_cpu(new_drmem[i].flags) &
			    DRCONF_MEM_ASSIGNED)) {
			rc = memblock_add(be64_to_cpu(old_drmem[i].base_addr),
					  memblock_size);
			rc = (rc < 0) ? -EINVAL : 0;
			break;
		}
	}

This compares the 'assigned' flag for each LMB in the old vs new
properties and adds or removes the block accordingly. However:

- Migration and PRRNs are specified only to change LMBs' NUMA affinity
  information. This notifier should be a no-op for those scenarios since
  the assigned flags should not change.
- The memory hotplug/DLPAR path has a hack which inhibits the execution
  of the notifier:
     dlpar_memory()
        ...
        rtas_hp_event = true;
        drmem_update_dt()
           of_update_property()
              pseries_memory_notifier()
                 pseries_update_drconf_memory()
                    if (rtas_hp_event) return;

So what's it for? My best guess is that it's a holdover from when more
of the DLPAR work was done in user space. I don't see a purpose for it
now.


> For older PFW if we have source and target that only support v1 we will update
> the memory in response to any update to ibm,dynamic-memory. It also appears to
> be the case if we start with v1 and migrate to a target with newer PFW that
> supports both v1 and v2 that the PFW will continue with v1 on the target and as
> a result we update memory in accordance to a property update to ibm,dynamic-memory.
>
> Now, if we have source and targets that both support v2 after a migration we
> will do no update in response to ibm,dynamic-memory-v2 changes. And then there
> is the case of a source with v2 support migrating to a target with only v1
> support where we observe this oops. The oops is a result of ibm,dynamic-memory
> being a new property that is added and there for no old property date exists.
> However, simply returning in response also has the side effect that we do not
> update memory in response to a device tree update of dynamic memory.
>
> Maybe we are ok with this behavior as I haven't dug deep enough into the memory
> subsystem here to really understand what the memory code is updating, but it is
> concerning that we are doing it in some cases, but not all.

I hope I've made a good case above that the notifier does not do any
useful work, and a counterpart for the v2 format isn't needed.  Do you
agree?

If so, I'll send a patch later to remove the notifier altogether. In the
near term I would still like this minimal fix to go in.



More information about the Linuxppc-dev mailing list