sched/debug: CPU hotplug operation suffers in a large cpu systems

Srikar Dronamraju srikar at linux.vnet.ibm.com
Wed Nov 9 01:51:00 AEDT 2022


* Greg Kroah-Hartman <gregkh at linuxfoundation.org> [2022-11-08 13:24:39]:

> On Tue, Nov 08, 2022 at 03:30:46PM +0530, Vishal Chourasia wrote:

Hi Greg, 

> > 
> > Thanks Greg & Peter for your direction. 
> > 
> > While we pursue the idea of having debugfs based on kernfs, we thought about
> > having a boot time parameter which would disable creating and updating of the
> > sched_domain debugfs files and this would also be useful even when the kernfs
> > solution kicks in, as users who may not care about these debugfs files would
> > benefit from a faster CPU hotplug operation.
> 
> Ick, no, you would be adding a new user/kernel api that you will be
> required to support for the next 20+ years.  Just to get over a
> short-term issue before you solve the problem properly.
> 
> If you really do not want these debugfs files, just disable debugfs from
> your system.  That should be a better short-term solution, right?
> 
> Or better yet, disable SCHED_DEBUG, why can't you do that?

Thanks a lot for your quick inputs.

Disabling CONFIG_SCHED_DEBUG removes a lot more than just the updating of
debugfs files. Information such as /sys/kernel/debug/sched/debug, as well as
system-wide and per-process information, would be lost when that config is
disabled.

Most users would still be using distribution kernels and most distribution
kernels that I know of seem to have CONFIG_SCHED_DEBUG enabled.

Consider a large system with close to 2000 CPUs, where we offline around
1750 of them, for example by running ppc64_cpu --smt=1 on powerpc. Even if we
move to a lower-overhead kernfs-based implementation, we would still be
creating and deleting files for every CPU offline. Most users may not even be
aware of these files. Yet for the sake of a few users who look at these files
once in a while, we end up creating and deleting them for all users. The
overhead increases exponentially with the number of CPUs, and I would expect
the maximum number of CPUs to keep increasing in the future.
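
To make the numbers above concrete, here is a minimal sketch of the
arithmetic. It assumes 8 hardware threads per core (SMT8); that figure and
the helper name cpus_offlined are assumptions for illustration only and are
not stated anywhere in this thread.

```python
def cpus_offlined(total_cpus: int, threads_per_core: int, target_smt: int) -> int:
    """Return how many logical CPUs go offline when the SMT level is lowered.

    threads_per_core is an assumed hardware property (8 here, i.e. SMT8);
    target_smt is the number of threads left online per core, as in
    `ppc64_cpu --smt=1`.
    """
    if total_cpus % threads_per_core != 0:
        raise ValueError("total_cpus must be a multiple of threads_per_core")
    if not 1 <= target_smt <= threads_per_core:
        raise ValueError("target_smt must be between 1 and threads_per_core")
    cores = total_cpus // threads_per_core
    # Each core keeps target_smt threads online and offlines the rest.
    return cores * (threads_per_core - target_smt)


if __name__ == "__main__":
    # 2000 logical CPUs, assumed SMT8, switched to SMT1.
    print(cpus_offlined(2000, 8, 1))
```

Under that assumption, 2000 CPUs correspond to 250 cores, so switching to
SMT1 offlines 7 threads per core, and each of those offline operations
triggers the debugfs file creation and deletion described above.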

Hence our approach was to reduce the overhead for those users who are sure
they don't depend on these files. We would still keep creating the files by
default, so that others who depend on them are not impacted.

> 
> thanks,
> 
> greg k-h

-- 
Thanks and Regards
Srikar Dronamraju

