[PATCH v2 1/2] fadump: reduce memory consumption for capture kernel

Michal Suchánek msuchanek at suse.de
Wed Apr 19 00:18:16 AEST 2017


On Mon, 17 Apr 2017 20:43:02 +0530
Hari Bathini <hbathini at linux.vnet.ibm.com> wrote:

> On Friday 14 April 2017 01:28 AM, Michal Suchánek wrote:
> > On Thu, 13 Apr 2017 01:59:13 +0530
> > Hari Bathini <hbathini at linux.vnet.ibm.com> wrote:
> >  
> >> On Friday 07 April 2017 07:16 PM, Michael Ellerman wrote:  
> >>> Hari Bathini <hbathini at linux.vnet.ibm.com> writes:  
> >>>> On Friday 07 April 2017 07:24 AM, Michael Ellerman wrote:  
> >>>>> My preference would be that the fadump kernel "just works". If
> >>>>> it's using too much memory then the fadump kernel should do
> >>>>> whatever it needs to use less memory, eg. shrinking nr_cpu_ids
> >>>>> etc. Do we actually know *why* the fadump kernel is running out
> >>>>> of memory? Obviously large numbers of CPUs is one of the main
> >>>>> drivers (lots of stacks required). But other than that what is
> >>>>> causing the memory pressure? I would like some data on that
> >>>>> before we proceed.  
> >>>> Almost the same amount of memory in comparison with the memory
> >>>> required to boot the production kernel but that is unwarranted
> >>>> for fadump (dump capture) kernel.  
> >>> That's not data! :)
> >>>
> >>> The dump kernel is booted with *much* less memory than the
> >>> production kernel (that's the whole issue!) and so it doesn't need
> >>> to create struct pages for all that memory, which means it should
> >>> need less memory.
> >>>
> >>> The vfs caches are also sized based on the available memory, so
> >>> they should also shrink in the dump kernel.
> >>>
> >>> I want some actual numbers on what's driving the memory usage.
> >>>
> >>> I tried some of these parameters to see how much memory they would
> >>> save:  
> >> Hi Michael,
> >>
> >> I tried to get data showing that parameters like numa=off and
> >> cgroup_disable=memory matter too, but nr_cpus=1 makes their
> >> effect insignificant. Also, these parameters do not involve much
> >> in the way of early memory reservations, which makes quantifying
> >> the memory saved by each of them that much more difficult. But I
> >> would still like to argue that passing additional parameters to
> >> fadump is better than enforcing nr_cpus=1 in the kernel, for
> >> these reasons:
> >>
> >>     a) With makedumpfile tool supporting multi-threading it would
> >> make sense to leave the choice of how many CPUs to have, to the
> >> user.
> >>
> >>     b) Parameters like udev.children-max=2 can help to reduce the
> >> number of parallel executed events bringing down the memory
> >> pressure on fadump kernel (when it is booted with more than one
> >> CPU).
> >>
> >>     c) Ease of maintainability is better (considering any new
> >> kernel features with some memory to save or stability to gain on
> >> disabling, possible platform supports) with append approach over
> >> enforcing these parameters in the kernel.
> >>
> >>     d) It would give user the flexibility to disable unwanted
> >> kernel features in fadump kernel (numa=off,
> >> cgroup_disable=memory). For every feature enabled in the
> >> production kernel, fadump kernel will have the choice to opt out
> >> of it, provided there is such cmdline option.
> > Hello,  
> 
> Hi Michal,
> 
> > can't the extra parameters be passed in the devicetree?  
> 
> Hmmm.. possible. Without change in f/w, this may not be guaranteed
> though.
> 
> > The docs say that the kernel can tell it's a fadump crash kernel by
> > checking the devicetree ibm,dump-kernel property. Is there any
> > reason  
> 
> This node is exported by firmware
> 
> > more (optional) properties cannot be added?  
> 
> Kernel change seems simple over f/w enhancement..

That certainly looks so when you are a kernel developer and can
implement the change yourself rather than having to convince some
firmware developer that this feature makes sense.

On the other hand, the proposed kernel-only solution introduces a
requirement that the maintainer does not like.

For the platform as a whole, does it make more sense to add a hack to
the kernel, or to enhance the firmware to provide more options for
firmware-assisted dump?
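
To make the firmware option concrete, here is a rough user-space sketch
of how extra parameters might be picked up from an optional devicetree
property. It is only an illustration under stated assumptions: the
property name ibm,dump-kernel is taken from this thread, while
ibm,dump-kernel-args is a hypothetical name that no firmware is known
to export, and the directory layout and the root=/dev/sda2 command line
are placeholders.

```python
import os
import tempfile

# Property name as quoted in this thread; its location in the tree is an assumption.
DUMP_PROP = "ibm,dump-kernel"
# HYPOTHETICAL optional property carrying extra parameters; not a real firmware feature.
EXTRA_ARGS_PROP = "ibm,dump-kernel-args"


def dump_kernel_cmdline(dt_root, base_cmdline):
    """Return the command line to use, given a devicetree directory."""
    if not os.path.exists(os.path.join(dt_root, DUMP_PROP)):
        # Not a dump capture boot: leave the command line alone.
        return base_cmdline
    extra_path = os.path.join(dt_root, EXTRA_ARGS_PROP)
    if not os.path.exists(extra_path):
        # The property is optional; firmware without it changes nothing.
        return base_cmdline
    with open(extra_path, "rb") as f:
        # Devicetree strings are NUL-terminated.
        extra = f.read().rstrip(b"\0").decode("ascii").strip()
    return base_cmdline + " " + extra if extra else base_cmdline


def demo():
    """Build a throwaway fake devicetree directory and query it twice."""
    with tempfile.TemporaryDirectory() as root:
        before = dump_kernel_cmdline(root, "root=/dev/sda2")
        open(os.path.join(root, DUMP_PROP), "wb").close()
        with open(os.path.join(root, EXTRA_ARGS_PROP), "wb") as f:
            f.write(b"nr_cpus=1 numa=off cgroup_disable=memory\0")
        after = dump_kernel_cmdline(root, "root=/dev/sda2")
    return before, after
```

Because the property is optional, a kernel running on firmware that
does not provide it would behave exactly as it does today, which is the
point of asking whether more properties can be added.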

Thanks

Michal

