phandle_cache vs of_detach_node (was Re: [PATCH] powerpc/mobility: Fix node detach/rename problem)
Michael Bringmann
mwb at linux.vnet.ibm.com
Thu Aug 2 03:26:08 AEST 2018
On 08/01/2018 09:26 AM, Michael Ellerman wrote:
> Frank Rowand <frowand.list at gmail.com> writes:
>> On 07/31/18 07:17, Rob Herring wrote:
>>> On Tue, Jul 31, 2018 at 12:34 AM Michael Ellerman <mpe at ellerman.id.au> wrote:
>>>>
>>>> Hi Rob/Frank,
>>>>
>>>> I think we might have a problem with the phandle_cache not interacting
>>>> well with of_detach_node():
>>>
>>> Probably needs a similar fix as this commit did for overlays:
>>>
>>> commit b9952b5218added5577e4a3443969bc20884cea9
>>> Author: Frank Rowand <frank.rowand at sony.com>
>>> Date: Thu Jul 12 14:00:07 2018 -0700
>>>
>>> of: overlay: update phandle cache on overlay apply and remove
>>>
>>> A comment in the review of the patch adding the phandle cache said that
>>> the cache would have to be updated when modules are applied and removed.
>>> This patch implements the cache updates.
>>>
>>> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
>>> of_find_node_by_phandle()")
>>> Reported-by: Alan Tull <atull at kernel.org>
>>> Suggested-by: Alan Tull <atull at kernel.org>
>>> Signed-off-by: Frank Rowand <frank.rowand at sony.com>
>>> Signed-off-by: Rob Herring <robh at kernel.org>
>>
>> Agreed. Sorry about missing the of_detach_node() case.
>>
>>
>>> Really what we need here is an "invalidate phandle" function rather
>>> than free and re-allocate the whole damn cache.
>>
>> The big hammer approach was chosen to avoid the race conditions that
>> would otherwise occur. OF does not have a locking strategy that
>> would be able to protect against the races.
>>
>> We could maybe implement a slightly smaller hammer by (1) disabling
>> the cache, (2) invalidate a phandle entry in the cache, (3) re-enable
>> the cache. That is an off the cuff thought - I would have to look
>> a little bit more carefully to make sure it would work.
>>
>> But I don't see a need to add the complexity of the smaller hammer
>> or the bigger hammer of proper locking _unless_ we start seeing that
>> the cache is being freed and re-allocated frequently. For overlays
>> I don't expect the high frequency because it happens on a per overlay
>> removal basis (not per node removal basis). For of_detach_node() the
>> event _is_ on a per node removal basis. Michael, do you expect node
>> removals to be a frequent event with low latency being important? If
>> so, a rough guess on what the frequency would be?
>
> No in practice we expect them to be fairly rare and in the thousands at
> most I think.
>
> TBH you could just disable the phandle_cache on the first detach and
> that would be fine by me. I don't think we've ever noticed phandle
> lookups being much of a problem for us on our hardware.
I ran a couple of migrations with the calls to
of_free_phandle_cache() ... of_populate_phandle_cache()
bracketing the body of
arch/powerpc/platforms/pseries/mobility.c:post_mobility_fixup()
with a patch like,
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index e245a88..efc9442 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -22,6 +22,9 @@
#include <asm/rtas.h>
#include "pseries.h"
+extern int of_free_phandle_cache(void);
+extern void of_populate_phandle_cache(void);
+
static struct kobject *mobility_kobj;
struct update_props_workarea {
@@ -343,6 +346,8 @@ void post_mobility_fixup(void)
rc = rtas_call(activate_fw_token, 0, 1, NULL);
} while (rtas_busy_delay(rc));
+ of_free_phandle_cache();
+
if (rc)
printk(KERN_ERR "Post-mobility activate-fw failed: %d\n", rc);
@@ -354,6 +359,8 @@ void post_mobility_fixup(void)
/* Possibly switch to a new RFI flush type */
pseries_setup_rfi_flush();
+ of_populate_phandle_cache();
+
return;
}
and did not observe the warnings/crashes that have been
plaguing me. We will need to move their prototypes out
of drivers/of/of_private.h to include/linux/of.h, though.
>
> cheers
>
Michael
--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
mwb at linux.vnet.ibm.com
More information about the Linuxppc-dev
mailing list