OF_DYNAMIC node lifecycle

Grant Likely grant.likely at linaro.org
Thu Jun 26 06:22:16 EST 2014


On Tue, 24 Jun 2014 15:07:05 -0500, Nathan Fontenot <nfont at austin.ibm.com> wrote:
> On 06/23/2014 09:58 AM, Grant Likely wrote:
> > On Thu, 19 Jun 2014 11:33:20 +0300, Pantelis Antoniou <pantelis.antoniou at konsulko.com> wrote:
> >> Hi Grant,
> >>
> >> CCing Thomas Gleixner & Steven Rostedt, since they might have a few
> >> ideas...
> >>
> >> On Jun 18, 2014, at 11:07 PM, Grant Likely wrote:
> >>
> >>> Hi Nathan and Tyrel,
> >>>
> >>> I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and
> >>> I'm hoping you can help me. Right now, pseries seems to be the only
> >>> user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on
> >>> the entire kernel because it requires all DT code to manage reference
> >>> counting with iterating over nodes. Most users simply get it wrong.
> >>> Pantelis did some investigation and found that the reference counts on
> >>> a running kernel are all over the place. I have my doubts that any
> >>> code really gets it right.
> >>>
> >>> The problem is that users need to know when it is appropriate to call
> >>> of_node_get()/of_node_put(). All list traversals that exit early need
> >>> an extra call to of_node_put(), and code that is searching for a node
> >>> in the tree and holding a reference to it needs to call of_node_get().
> >>>
> >>
> >> In hindsight it appears that drivers just can't get the lifecycle right.
> >> So we need to simplify things.
> >>
> >>> I've got a few pseries questions:
> >>> - What are the changes being requested by pseries firmware? Is it only
> >>> CPUs and memory nodes, or does it manipulate things all over the tree?
> >>> - How frequent are the changes? How many changes would be likely over
> >>> the runtime of the system?
> >>> - Are you able to verify that removed nodes are actually able to be
> >>> freed correctly? Do you have any testcases for node removal?
> >>>
> >>> I'm thinking very seriously about changing the locking semantics of DT
> >>> code entirely so that most users never have to worry about
> >>> of_node_get/put at all. If the DT code is switched to use rcu
> >>> primitives for tree iteration (which also means making DT code use
> >>> list_head, something I'm already investigating), then instead of
> >>> trying to figure out of_node_get/put rules, callers could use
> >>> rcu_read_lock()/rcu_read_unlock() to protect the region that is
> >>> searching over nodes, and only call of_node_get() if the node pointer
> >>> is needed outside the rcu read-side lock.
> >>>
> >>> I'd really like to be rid of the node reference counting entirely, but
> >>> I can't figure out a way of doing that safely, so I'd settle for
> >>> making it a lot easier to get correct.
> >>>
> >>
> >> Since we're going about changing things, how about that devtree_lock?
> > 
> > I believe rcu would pretty much eliminate the devtree_lock entirely. All
> > modifiers would need to grab a mutex to ensure there is only one writer
> > at any given time, but readers would have free reign to parse the tree
> > however they like.
> > 
> > DT writers would have to follow some strict rules about how to handle
> > nodes that are removed (ie. don't modify or of_node_put() them until
> > after rcu is syncronized), but the number of writers is very small and
> > we have control of all of them.
> > 
> >> We're using a raw_spinlock and we're always taking the lock with
> >> interrupts disabled.
> >>
> >> If we're going to make DT changes frequently during normal runtime
> >> and not only during boot time, those are bad for any kind of real-time
> >> performance.
> >>
> >> So the question is, do we really have code that access the live tree
> >> during atomic sections?  Is that something we want? Enforcing this
> >> will make our lives easier, and we'll get the change to replace
> >> that spinlock with a mutex.
> > 
> > Yes, I believe the powerpc CPU hotplug code accesses the DT in atomic
> > sections. I cannot put my finger on the exact code however. Nathan might
> > know better. But, if I'm right, the whole problem goes away with RCU.
> 
> I went back through the cpu hotplug code. we do update the DT during cpu
> hotplug but I don't see it happening during atomic sections.
> 
> The code is in arch/powerpc/platforms/pseries/dlpar.c

Great, thanks,

By the way, notifiers currently get sent before any updates are applied
to the tree. I want to change it so that the notifier gets sent
afterwards. Does that work for you? I've looked through all the users
and aside from a stupid block of code in arch/powerpc/kernel/prom.c
which does things that should be done by of_attach_node(), it looks like
everything should be fine.

g.


More information about the Linuxppc-dev mailing list