[RFC PATCH v2 1/1] of: introduce event tracepoints for dynamic device_node lifecyle

Frank Rowand frowand.list at gmail.com
Thu Jan 25 17:58:45 AEDT 2018


On 01/24/18 22:48, Frank Rowand wrote:
> On 01/21/18 06:31, Wolfram Sang wrote:
>> From: Tyrel Datwyler <tyreld at linux.vnet.ibm.com>
>>
>> This patch introduces event tracepoints for tracking a device_nodes
>> reference cycle as well as reconfig notifications generated in response
>> to node/property manipulations.
>>
>> With the recent upstreaming of the refcount API several device_node
>> underflows and leaks have come to my attention in the pseries (DLPAR)
>> dynamic logical partitioning code (ie. POWER speak for hotplugging
>> virtual and physcial resources at runtime such as cpus or IOAs). These
>> tracepoints provide a easy and quick mechanism for validating the
>> reference counting of device_nodes during their lifetime.
>>
>> Further, when pseries lpars are migrated to a different machine we
>> perform a live update of our device tree to bring it into alignment with
>> the configuration of the new machine. The of_reconfig_notify trace point
>> provides a mechanism that can be turned for debuging the device tree
>> modifications with out having to build a custom kernel to get at the
>> DEBUG code introduced by commit 00aa37206e1a54 ("of/reconfig: Add debug
>> output for OF_RECONFIG notifiers").
>>
>> The following trace events are provided: of_node_get, of_node_put,
>> of_node_release, and of_reconfig_notify. These trace points require a
> 
> Please add a note that the of_reconfig_notify trace event is not an
> added bit of debug info, but is instead replacing information that
> was previously available via pr_debug() when DEBUG was defined.

I got a little carried away, "when DEBUG was defined" is extra
un-needed detail for the commit message.


> 
> 
>> kernel built with ftrace support to be enabled. In a typical environment
>> where debugfs is mounted at /sys/kernel/debug the entire set of
>> tracepoints can be set with the following:
>>
>>   echo "of:*" > /sys/kernel/debug/tracing/set_event
>>
>> or
>>
>>   echo 1 > /sys/kernel/debug/tracing/events/of/enable
>>
>> The following shows the trace point data from a DLPAR remove of a cpu
>> from a pseries lpar:
>>
>> cat /sys/kernel/debug/tracing/trace | grep "POWER8 at 10"
>>
>> cpuhp/23-147   [023] ....   128.324827:
>>         of_node_put: refcount=5, dn->full_name=/cpus/PowerPC,POWER8 at 10
>> cpuhp/23-147   [023] ....   128.324829:
>>         of_node_put: refcount=4, dn->full_name=/cpus/PowerPC,POWER8 at 10
>> cpuhp/23-147   [023] ....   128.324829:
>>         of_node_put: refcount=3, dn->full_name=/cpus/PowerPC,POWER8 at 10
>> cpuhp/23-147   [023] ....   128.324831:
>>         of_node_put: refcount=2, dn->full_name=/cpus/PowerPC,POWER8 at 10
>>    drmgr-7284  [009] ....   128.439000:
>>         of_node_put: refcount=1, dn->full_name=/cpus/PowerPC,POWER8 at 10
>>    drmgr-7284  [009] ....   128.439002:
>>         of_reconfig_notify: action=DETACH_NODE, dn->full_name=/cpus/PowerPC,POWER8 at 10,
>>                             prop->name=null, old_prop->name=null
>>    drmgr-7284  [009] ....   128.439015:
>>         of_node_put: refcount=0, dn->full_name=/cpus/PowerPC,POWER8 at 10
>>    drmgr-7284  [009] ....   128.439016:
>>         of_node_release: dn->full_name=/cpus/PowerPC,POWER8 at 10, dn->_flags=4
>>
>> Signed-off-by: Tyrel Datwyler <tyreld at linux.vnet.ibm.com>
> 
> The following belongs in a list of version 2 changes, below the "---" line:
> 
>> [wsa: fixed commit abbrev and one of the sysfs paths in commit desc,
>> removed trailing space and fixed pointer declaration in code]
> 
>> Signed-off-by: Wolfram Sang <wsa+renesas at sang-engineering.com>
>> ---
>>  drivers/of/dynamic.c      | 32 ++++++----------
>>  include/trace/events/of.h | 93 +++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 105 insertions(+), 20 deletions(-)
>>  create mode 100644 include/trace/events/of.h
> 
> mode looks incorrect.  Existing files in include/trace/events/ are -rw-rw----
> 
> 
>> diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
>> index ab988d88704da0..b0d6ab5a35b8c6 100644
>> --- a/drivers/of/dynamic.c
>> +++ b/drivers/of/dynamic.c
>> @@ -21,6 +21,9 @@ static struct device_node *kobj_to_device_node(struct kobject *kobj)
>>  	return container_of(kobj, struct device_node, kobj);
>>  }
>>  
>> +#define CREATE_TRACE_POINTS
>> +#include <trace/events/of.h>
>> +
>>  /**
>>   * of_node_get() - Increment refcount of a node
>>   * @node:	Node to inc refcount, NULL is supported to simplify writing of
>> @@ -30,8 +33,10 @@ static struct device_node *kobj_to_device_node(struct kobject *kobj)
>>   */
>>  struct device_node *of_node_get(struct device_node *node)
>>  {
>> -	if (node)
>> +	if (node) {
>>  		kobject_get(&node->kobj);
>> +		trace_of_node_get(refcount_read(&node->kobj.kref.refcount), node->full_name);
> 
> See the comment from Ron that I mentioned in my previous email.
                       ^^^^
                       Rob, darn it.


> Also, the path has been removed from node->full_name.  Does using it here
> still give all of the information that is desired?  Same for all others uses
> of full_name in this patch.
> 
> The trace point should have a single argument, node.  Accessing the two
> fields can be done in the tracepoint assignment.  Or is there some
> reason that can't be done?  Same for the trace_of_node_put() tracepoint.
> 
> 
>> +	}
>>  	return node;
>>  }
>>  EXPORT_SYMBOL(of_node_get);
>> @@ -43,8 +48,10 @@ EXPORT_SYMBOL(of_node_get);
>>   */
>>  void of_node_put(struct device_node *node)
>>  {
>> -	if (node)
>> +	if (node) {
>> +		trace_of_node_put(refcount_read(&node->kobj.kref.refcount) - 1, node->full_name);
> 
> If this is moved down one line to after kobject_put(), then no need
> to subtract 1.
> 
> 
>>  		kobject_put(&node->kobj);
>> +	}
>>  }
>>  EXPORT_SYMBOL(of_node_put);
>>  
>> @@ -75,24 +82,7 @@ const char *action_names[] = {
>>  int of_reconfig_notify(unsigned long action, struct of_reconfig_data *p)
>>  {
>>  	int rc;
>> -#ifdef DEBUG
>> -	struct of_reconfig_data *pr = p;
>> -
>> -	switch (action) {
>> -	case OF_RECONFIG_ATTACH_NODE:
>> -	case OF_RECONFIG_DETACH_NODE:
>> -		pr_debug("notify %-15s %pOF\n", action_names[action],
>> -			pr->dn);
>> -		break;
>> -	case OF_RECONFIG_ADD_PROPERTY:
>> -	case OF_RECONFIG_REMOVE_PROPERTY:
>> -	case OF_RECONFIG_UPDATE_PROPERTY:
>> -		pr_debug("notify %-15s %pOF:%s\n", action_names[action],
>> -			pr->dn, pr->prop->name);
>> -		break;
>> -
>> -	}
>> -#endif
>> +	trace_of_reconfig_notify(action, p);
>>  	rc = blocking_notifier_call_chain(&of_reconfig_chain, action, p);
>>  	return notifier_to_errno(rc);
>>  }
>> @@ -320,6 +310,8 @@ void of_node_release(struct kobject *kobj)
>>  {
>>  	struct device_node *node = kobj_to_device_node(kobj);
>>  
>> +	trace_of_node_release(node);
>> +
>>  	/* We should never be releasing nodes that haven't been detached. */
>>  	if (!of_node_check_flag(node, OF_DETACHED)) {
>>  		pr_err("ERROR: Bad of_node_put() on %pOF\n", node);
>> diff --git a/include/trace/events/of.h b/include/trace/events/of.h
>> new file mode 100644
>> index 00000000000000..e8b1302a6f0129
>> --- /dev/null
>> +++ b/include/trace/events/of.h
>> @@ -0,0 +1,93 @@
>> +#undef TRACE_SYSTEM
>> +#define TRACE_SYSTEM of
>> +
>> +#if !defined(_TRACE_OF_H) || defined(TRACE_HEADER_MULTI_READ)
>> +#define _TRACE_OF_H
>> +
>> +#include <linux/of.h>
>> +#include <linux/tracepoint.h>
>> +
>> +DECLARE_EVENT_CLASS(of_node_ref_template,
>> +
>> +	TP_PROTO(int refcount, const char *dn_name),
>> +
>> +	TP_ARGS(refcount, dn_name),
>> +
>> +	TP_STRUCT__entry(
>> +		__string(dn_name, dn_name)
>> +		__field(int, refcount)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__assign_str(dn_name, dn_name);
>> +		__entry->refcount = refcount;
>> +	),
>> +
>> +	TP_printk("refcount=%d, dn->full_name=%s",
> 
> For version 1 of the patch, the example trace output was reformatted
> by hand because the lines were so long.  it seems that it is easy to
> create long lines and trace output, so extra effort on keeping the
> lines shorter is well spent.
> 
> A little verbose.  I would prefer to get rid of "refcount=", ",",
> and "dn->full_name=" entirely.
> 
> 
>> +		  __entry->refcount, __get_str(dn_name))
>> +);
>> +
>> +DEFINE_EVENT(of_node_ref_template, of_node_get,
>> +	     TP_PROTO(int refcount, const char *dn_name),
>> +	     TP_ARGS(refcount, dn_name));
>> +
>> +DEFINE_EVENT(of_node_ref_template, of_node_put,
>> +	     TP_PROTO(int refcount, const char *dn_name),
>> +	     TP_ARGS(refcount, dn_name));
>> +
>> +TRACE_EVENT(of_node_release,
>> +
>> +	TP_PROTO(struct device_node *dn),
>> +
>> +	TP_ARGS(dn),
>> +
>> +	TP_STRUCT__entry(
>> +		__string(dn_name, dn->full_name)
>> +		__field(unsigned long, flags)
> 
> Does not follow the alignment conventions for files in this directory.
> 
> Steve: have you dropped the conventions or are they still relevant?
> 
> 
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__assign_str(dn_name, dn->full_name);
>> +		__entry->flags = dn->_flags;
>> +	),
>> +
>> +	TP_printk("dn->full_name=%s, dn->_flags=%lu",
>> +		  __get_str(dn_name), __entry->flags)
> 
> That the first field is the node name should be obvious to the
> person reading the trace, so ditch "dn->full_name=".  One
> could argue that _flags is not so obvious, but if someone
> is looking at why of_node_release() is being called, they
> will either care about flag, in which case it will be obvious
> that the flag value is being printed, or they won't care
> about the flag value and will be ignoring it anyway.  So
> I would ditch "dn->-flags=" also.
> 
> 
>> +);
>> +
>> +#define of_reconfig_action_names \
>> +	{OF_RECONFIG_ATTACH_NODE, "ATTACH_NODE"}, \
>> +	{OF_RECONFIG_DETACH_NODE, "DETACH_NODE"}, \
>> +	{OF_RECONFIG_ADD_PROPERTY, "ADD_PROPERTY"}, \
>> +	{OF_RECONFIG_REMOVE_PROPERTY, "REMOVE_PROPERTY"}, \
>> +	{OF_RECONFIG_UPDATE_PROPERTY, "UPDATE_PROPERTY"}
>> +
>> +TRACE_EVENT(of_reconfig_notify,
>> +
>> +	TP_PROTO(unsigned long action, struct of_reconfig_data *ord),
>> +
>> +	TP_ARGS(action, ord),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(unsigned long, action)
>> +		__string(dn_name, ord->dn->full_name)
>> +		__string(prop_name, ord->prop ? ord->prop->name : "null")
>> +		__string(oldprop_name, ord->old_prop ? ord->old_prop->name : "null")
> 
> Does not follow the alignment conventions for files in this directory.
> 
> 
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->action = action;
>> +		__assign_str(dn_name, ord->dn->full_name);
>> +		__assign_str(prop_name, ord->prop ? ord->prop->name : "null");
>> +		__assign_str(oldprop_name, ord->old_prop ? ord->old_prop->name : "null");
>> +	),
>> +
>> +	TP_printk("action=%s, dn->full_name=%s, prop->name=%s, old_prop->name=%s",
> 
> Isn't old_prop->name the same as prop->name, if old_prop exists?  If so, drop
> it.
> 
> This is more verbose than the pr_debug() that is being replaced.  Please make
> it more compact.
> 
> 
>> +		  __print_symbolic(__entry->action, of_reconfig_action_names),
>> +		  __get_str(dn_name), __get_str(prop_name), __get_str(oldprop_name))
>> +);
>> +
>> +#endif /*	_TRACE_OF_H */
>> +
>> +/* This part must be outside protection */
>> +#include <trace/define_trace.h>
>>
> 
> 



More information about the Linuxppc-dev mailing list