[PATCH] powerpc: Fix device node refcounting

Nathan Lynch nathanl at linux.ibm.com
Fri Feb 10 04:11:05 AEDT 2023


Brian King <brking at linux.vnet.ibm.com> writes:
> On 2/7/23 9:14 AM, Nathan Lynch wrote:
>> Brian King <brking at linux.vnet.ibm.com> writes:
>>> While testing fixes to the hvcs hotplug code, kmemleak was reporting
>>> potential memory leaks. This was tracked down to the struct device_node
>>> object associated with the hvcs device. Looking at the leaked
>>> object in crash showed that the kref in the kobject in the device_node
>>> still had a reference count of 1, so the release function was never
>>> called. This adds an of_node_put() in
>>> pSeries_reconfig_remove_node() to balance the refcounting,
>>> so that we actually free the device_node when it was
>>> allocated in pSeries_reconfig_add_node().
>> 
>> My concern here would be whether the additional put is the right thing
>> to do in all cases. The questions it raises for me are:
>> 
>> - Is it safe for nodes that were present at boot, instead of added
>>   dynamically?
>
> Yes. of_node_release() checks whether OF_DYNAMIC is set. If it is not set,
> the release function is a no-op.

Yes, but to be more specific - does the additional of_node_put() risk
underflowing the refcount on nodes without the OF_DYNAMIC flag? I
suspect it's OK. If it's not, then I would expect to see warnings from
the refcount code when that case is exercised.
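
To make the underflow question concrete, here is a minimal user-space
sketch, not kernel code. The toy_node type and the toy_node_*() helpers
are hypothetical stand-ins for struct device_node, of_node_put() and
of_node_release(). The starting count of 1 is an assumption chosen for
illustration, not a statement about how many references the kernel
actually holds on a boot-time node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct device_node (hypothetical, not the kernel type). */
struct toy_node {
	int refcount;    /* models the kref embedded in the kobject */
	bool dynamic;    /* models the OF_DYNAMIC flag */
	bool released;   /* set when the release path would free the node */
	bool underflow;  /* set when a put is attempted at refcount zero */
};

/* Assumption: a freshly initialized node holds exactly one reference. */
static void toy_node_init(struct toy_node *n, bool dynamic)
{
	assert(n != NULL);
	n->refcount = 1;
	n->dynamic = dynamic;
	n->released = false;
	n->underflow = false;
}

/* Models a release callback that is a no-op for non-dynamic nodes. */
static void toy_node_release(struct toy_node *n)
{
	if (!n->dynamic)
		return;
	n->released = true;
}

/*
 * Models a put: drop one reference and call release when the count hits
 * zero. A put at zero is recorded as an underflow instead of decrementing,
 * which is where refcount debugging would be expected to warn.
 */
static void toy_node_put(struct toy_node *n)
{
	assert(n != NULL);
	if (n->refcount == 0) {
		n->underflow = true;
		return;
	}
	if (--n->refcount == 0)
		toy_node_release(n);
}
```

In this model, a no-op release does not protect a non-dynamic node from
a put at count zero. Whether the real removal path can ever reach that
state is what testing would need to show.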

>
>> - Is it correct for all types of nodes, or is there something specific
>>   to hvcs that leaves a dangling refcount?
>
> I would welcome more testing; I share the same concern. I did some DLPAR
> operations on a virtual ethernet device with the change applied and
> CONFIG_PAGE_POISONING enabled, and did not run into any issues. However, if I do
> a DLPAR remove of a virtual ethernet device without the change and with kmemleak
> enabled, it does not detect any leaked memory.

Seems odd. If the change is generically correct, then without it applied
I would expect kmemleak to flag a leak on removal of any type of
dynamically-added node. On the other hand, if the change is for some
reason not correct for virtual ethernet devices, then I would expect it
to cause complaints from the refcount code and/or allocator debug
facilities. But if I understand correctly, neither of those things is
happening.

