Large stack usage in fs code (especially for PPC64)
Linus Torvalds
torvalds at linux-foundation.org
Tue Nov 18 08:18:44 EST 2008
On Mon, 17 Nov 2008, Steven Rostedt wrote:
>
> Here's my stack after boot up with CONFIG_IRQSTACKS set. Seems that
> softirqs still use the same stack as the process.
Yes.
> This is still 12K. Kind of big even for a 16K stack.
And while that 1kB+ stack slot for block_read_full_page still stands out
like a sore thumb, I do agree that there are way too many other functions
with big stack frames too.
I do wonder just _what_ it is that causes the stack frames to be so
horrid. For example, you have
  18)     8896     160   .kmem_cache_alloc+0xfc/0x140
and I'm looking at my x86-64 compile, and it has a stack frame of just 8
bytes (!) for local variables plus the save/restore area (which looks like
three registers plus frame pointer plus return address). IOW, if I'm
looking at the code right (so big caveat: I did _not_ do a real stack
dump!) the x86-64 stack cost for that same function is on the order of 48
bytes. Not 160.
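
As a rough sanity check of that 48-byte figure, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate, not a measurement: it assumes every saved slot is one 8-byte word, and the helper name estimate_frame_bytes is made up for illustration.

```python
# Back-of-the-envelope estimate of the x86-64 frame described above.
# Assumption: every saved slot (register, frame pointer, return address)
# occupies one 8-byte machine word.
WORD_BYTES = 8


def estimate_frame_bytes(local_bytes, saved_regs,
                         frame_pointer=True, return_address=True):
    """Locals plus one word per saved register, frame pointer and return address."""
    slots = saved_regs + int(frame_pointer) + int(return_address)
    return local_bytes + slots * WORD_BYTES


# 8 bytes of locals, three saved registers, frame pointer, return address.
x86_64_estimate = estimate_frame_bytes(local_bytes=8, saved_regs=3)

# Size column of the .kmem_cache_alloc entry in the quoted trace.
ppc64_reported = 160

print(x86_64_estimate)                             # 48
print(round(ppc64_reported / x86_64_estimate, 2))  # 3.33
```

So the ppc64 number is a bit over three times the x86-64 estimate, which is where the "factor-of-three+" comes from.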
Where does that factor-of-three+ difference come from? From the numbers, I
suspect ppc64 has a 32-byte stack alignment, which may be part of it, and
I guess the compiler is more eager to use all those extra registers and
will happily have many more callee-saved regs that are actually used.
But that's still a _lot_ of extra stack.
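
To put a number on the alignment guess, here is a small sketch that rounds the 48-byte x86-64 estimate up to a given stack alignment. The 32-byte alignment is only the suspicion voiced above, not a confirmed ABI fact, and align_up is a hypothetical helper written for this illustration.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (must be a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


estimate = 48   # x86-64 frame estimate from the mail
reported = 160  # ppc64 frame size from the quoted trace

for alignment in (16, 32):
    print(alignment, align_up(estimate, alignment))
# 16 48
# 32 64

# Bytes still unaccounted for if 32-byte alignment were the only difference.
print(reported - align_up(estimate, 32))  # 96
```

Under that assumption, alignment alone would only take the frame from 48 to 64 bytes, so most of the gap would have to come from something else, such as extra saved registers or debugging options.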
Of course, you may have things like spinlock debugging etc enabled. Some
of our debugging options do tend to blow things up.
Linus