[PATCH 1/9] of: Add warpper function of_find_node_by_name_balanced()
zhangzekun (A)
zhangzekun11 at huawei.com
Fri Feb 7 22:28:23 AEDT 2025
在 2025/2/7 16:24, Oleksij Rempel 写道:
> On Fri, Feb 07, 2025 at 09:31:09AM +0800, Zhang Zekun wrote:
>> There are many drivers use of_find_node_by_name() with a not-NULL
>> device_node pointer, and a number of callers would require a call to
>> of_node_get() before using it. There are also some drivers who forget
>> to call of_node_get() which would cause a ref count leak[1]. So, Add a
>> wraper function for of_find_node_by_name(), drivers may use this function
>> to call of_find_node_by_name() with the refcount already balanced.
>>
>> [1] https://lore.kernel.org/all/20241024015909.58654-1-zhangzekun11@huawei.com/
>
> Hi Zhang Zekun,
>
> thank you for working on this issue!
>
> First of all, let's take a step back and analyze the initial problem.
> Everything following is only my opinion...
>
> The main issue I see is that the current API - of_find_node_by_name -
> modifies the refcount of its input by calling of_node_put(from) as part
> of its search. Typically, a "find" function is expected to treat its
> input as read-only. That is, when you pass an object into such a
> function, you expect its reference count to remain unchanged unless
> ownership is explicitly transferred. In this case, lowering the refcount
> on the input node is counterintuitive and already lead to unexpected
> behavior and subtle bugs.
>
> To address this, the workaround introduces a wrapper function,
> of_find_node_by_name_balanced, which first increments the input’s
> refcount (via of_node_get()) before calling the original function. While
> this "balances" the refcount change, the naming remains problematic from
> my perspective. The "_balanced" suffix isn’t part of our common naming
> conventions (traditions? :)). Most drivers expect that a function
> starting with "find" will not alter the reference count of its input.
> The term "balanced" doesn’t clearly convey that the input's refcount is
> being explicitly managed - it instead obscures the underlying behavior,
> leaving many developers confused about what guarantees the API provides.
>
> In my view, a more natural solution would be to redesign the API so that
> it doesn’t modify the input object’s refcount at all. Instead, it should
> solely increase the refcount of the returned node (if found) for safe
> asynchronous usage. This approach would align with established
> conventions where "find" implies no side effects on inputs or output,
> and a "get" indicates that the output comes with an extra reference. For
> example, a function named of_get_node_by_name would clearly signal that
> only the returned node is subject to a refcount increase while leaving
> the input intact.
>
> Thus, while the current workaround "balances" the reference count, it
> doesn't address the underlying design flaw. The naming still suggests a
> "find" function that should leave the input untouched, which isn’t the
> case here. A redesign of the API - with both the behavior and naming
> aligned to common expectations - would be a clearer and more robust
> solution.
>
> Nevertheless, it is only my POV, and the final decision rests with the
> OpenFirmware framework maintainers.
>
> Best Regards,
> Oleksij
Hi, Oleksij,
Thanks for your review. I think redesign the API would be a fundamental
way to fix this issue, but it would cause impact for current users who
knows the exact functionality of of_find_node_by_name(), which need to
put the "from" before calling it (I can't make the assumption that all
of drivers calling of_find_node_by_name() with a not-NULL "from"
pointer, but not calling of_node_get() before is misuse). The basic idea
for adding a new function is try not to impact current users. For users
who are confused about the "_balanced" suffix,the comments of
of_find_node_by_name() explains why this interface exists.
Thanks,
Zekun
More information about the Linuxppc-dev
mailing list