Large stack usage in fs code (especially for PPC64)

Benjamin Herrenschmidt benh at kernel.crashing.org
Tue Nov 18 10:13:16 EST 2008


> Well, it's not unacceptable on good CPU's with 4kB blocks (just an 8-entry 
> array), but as you say:
> 
> > On PPC64 I'm told that the page size is 64K, which makes the above equal 
> > to: 64K / 512 = 128  multiply that by 8 byte words, we have 1024 bytes.
> 
> Yeah. Not good. I think 64kB pages are insane. In fact, I think 32kB 
> pages are insane, and 16kB pages are borderline. I've told people so.
> 
> The ppc people run databases, and they don't care about sane people 
> telling them the big pages suck.

Hehe :-)

Guess who is pushing for larger page sizes nowadays? Embedded
people :-) In fact, we have patches submitted on the list to offer the
option for ... 256K pages on some 44x embedded CPUs :-)

It makes some sort of sense, I suppose, on very static embedded workloads
with neither swap nor demand paging.

> It's made worse by the fact that they 
> also have horribly bad TLB fills on their broken CPU's, and years and 
> years of telling people that the MMU on ppc's are sh*t has only been 
> reacted to with "talk to the hand, we know better".

Who are you talking about here, precisely? I don't think either Paul or
I ever said anything along those lines ... Oh well.

But yeah, our existing server CPUs have pretty poor TLB refills and yes,
64K pages help. And yes, we would like things to be different, but they
aren't.

But there is also pressure to get larger page sizes from the small
embedded field, where CPUs have even poorer TLB refill (basically
software loaded) :-)

> Quite frankly, 64kB pages are INSANE. But yes, in this case they actually 
> cause bugs. With a sane page-size, that *arr[MAX_BUF_PER_PAGE] thing uses 
> 64 bytes, not 1kB.

Come on, the code is crap to allocate that on the stack anyway :-)
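
For reference, the arithmetic behind those numbers can be sketched in C.
This is only an illustration, not the kernel code: it assumes the array
length is the page size divided by a 512-byte sector, as in the quoted
calculation, and the helper names max_buf_per_page() and
arr_stack_bytes() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed smallest block size: one 512-byte sector, as in the quote. */
#define SECTOR_SIZE 512u

/* Hypothetical helper: number of 512-byte buffers that fit in one page,
 * i.e. the length of an array sized like arr[MAX_BUF_PER_PAGE]. */
static size_t max_buf_per_page(size_t page_size)
{
    assert(page_size % SECTOR_SIZE == 0);
    return page_size / SECTOR_SIZE;
}

/* Hypothetical helper: stack bytes taken by an on-stack array holding
 * that many pointers, for a given pointer size in bytes. */
static size_t arr_stack_bytes(size_t page_size, size_t ptr_size)
{
    return max_buf_per_page(page_size) * ptr_size;
}
```

With 8-byte pointers, arr_stack_bytes(4096, 8) is 64 and
arr_stack_bytes(65536, 8) is 1024, matching the figures quoted above. A
256K page on a 32-bit CPU (4-byte pointers, an assumption here) would
come to 2048 bytes.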

> Of course, that would likely mean that FAT etc wouldn't work on ppc64, so 
> I don't think that's a valid model either. But if the 64kB page size is 
> just a "database server crazy-people config option", then maybe it's 
> acceptable.

Well, as I said, embedded folks want that too ...

Cheers,
Ben.



