Large stack usage in fs code (especially for PPC64)

Linus Torvalds torvalds at linux-foundation.org
Tue Nov 18 08:18:44 EST 2008



On Mon, 17 Nov 2008, Steven Rostedt wrote:
> 
> Here's my stack after boot up with CONFIG_IRQSTACKS set. Seems that 
> softirqs still use the same stack as the process.

Yes.

> This is still 12K. Kind of big even for a 16K stack.

And while that 1kB+ stack slot for block_read_full_page still stands out 
like a sore thumb, I do agree that there are way too many other functions 
with big stack frames too.

I do wonder just _what_ it is that causes the stack frames to be so 
horrid. For example, you have

	 18)     8896     160   .kmem_cache_alloc+0xfc/0x140

and I'm looking at my x86-64 compile, and it has a stack frame of just 8 
bytes (!) for local variables plus the save/restore area (which looks like 
three registers plus frame pointer plus return address). IOW, if I'm 
looking at the code right (so big caveat: I did _not_ do a real stack 
dump!) the x86-64 stack cost for that same function is on the order of 48 
bytes. Not 160.
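
The 48-byte estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the count described in the paragraph, assuming 8-byte slots for every saved register, the frame pointer and the return address; the helper name frame_estimate is made up for illustration, and nothing here is measured from a real stack dump.

```python
SLOT = 8  # assumed: bytes per saved register / pointer on x86-64


def frame_estimate(local_bytes, saved_regs):
    """Locals + callee-saved registers + frame pointer + return address."""
    return local_bytes + (saved_regs + 2) * SLOT


# 8 bytes of locals and three saved registers, as read off the x86-64 compile
x86_64 = frame_estimate(local_bytes=8, saved_regs=3)

# frame size reported for .kmem_cache_alloc in the ppc64 trace
ppc64 = 160

print(x86_64)                    # 48
print(round(ppc64 / x86_64, 2))  # 3.33
```

That ratio of roughly 3.3 is the "factor-of-three+" difference in question.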

Where does that factor-of-three+ difference come from? From the numbers, I 
suspect ppc64 has a 32-byte stack alignment, which may be part of it, and 
I guess the compiler is more eager to use all those extra registers and 
will happily have many more callee-saved regs that are actually used.

But that's still a _lot_ of extra stack.
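
To put a rough number on how much alignment alone could explain, here is a small sketch. It assumes the suspected (not confirmed) 32-byte alignment and starts from the 48-byte x86-64 estimate given earlier; round_up is a hypothetical helper written for this illustration, not something taken from the kernel or the compiler.

```python
def round_up(size, align):
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


ALIGN = 32         # assumption: the suspected ppc64 stack alignment
X86_ESTIMATE = 48  # estimated x86-64 cost of the same function
PPC64_FRAME = 160  # frame size from the ppc64 trace

aligned = round_up(X86_ESTIMATE, ALIGN)
print(aligned)                # 64
print(PPC64_FRAME - aligned)  # 96 bytes still unaccounted for
print(PPC64_FRAME % ALIGN)    # 0, so 160 is at least consistent with the guess
```

Under these assumptions, padding can add at most 31 bytes to any one frame, so most of the gap would have to come from something else, such as extra saved registers or debugging options.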

Of course, you may have things like spinlock debugging etc enabled. Some 
of our debugging options do tend to blow things up.

			Linus


