Large stack usage in fs code (especially for PPC64)
Linus Torvalds
torvalds at linux-foundation.org
Tue Nov 18 10:28:41 EST 2008
On Tue, 18 Nov 2008, Benjamin Herrenschmidt wrote:
>
> Guess who is pushing for larger page sizes nowadays ? Embedded
> people :-) In fact, we have patches submitted on the list to offer the
> option for ... 256K pages on some 44x embedded CPUs :-)
>
> It makes some sort of sense I suppose on very static embedded workloads
> with no swap nor demand paging.
It makes perfect sense for anything that doesn't use any MMU.
The hugepage support seems to cover many of the relevant cases, ie
databases and things like big static mappings (frame buffers etc).
> > It's made worse by the fact that they
> > also have horribly bad TLB fills on their broken CPU's, and years and
> > years of telling people that the MMU on ppc's are sh*t has only been
> > reacted to with "talk to the hand, we know better".
>
> Who are you talking about here precisely ? I don't think either Paul or
> I ever said something nearly around those lines ... Oh well.
Every single time I've complained about it, somebody from IBM has said "..
but but AIX".
This time it was Paul. Sometimes it has been software people who agree,
but point to hardware designers who "know better". If it's not some insane
database person, it's a Fortran program that runs for days.
> But there is also pressure to get larger page sizes from small embedded
> field, where CPUs have even poorer TLB refill (software loaded
> basically) :-)
Yeah, I agree that you _can_ have even worse MMU's. I'm not saying that
PPC64 is absolutely pessimal and cannot be made worse. Software fill is
indeed even worse from a performance angle, despite the fact that it's
really "nice" from a conceptual angle.
Of course, of the sw fill users that remain, many do seem to be ppc.. It's
like the architecture brings out the worst in hardware designers.
> > Quite frankly, 64kB pages are INSANE. But yes, in this case they actually
> > cause bugs. With a sane page-size, that *arr[MAX_BUF_PER_PAGE] thing uses
> > 64 bytes, not 1kB.
>
> Come on, the code is crap to allocate that on the stack anyway :-)
Why? We do actually expect to be able to use stack-space for small
structures. We do it for a lot of things, including stuff like select()
optimistically using arrays allocated on the stack for the common small
case, just because it's, oh, about infinitely faster to do than to use
kmalloc().
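
The stack-first pattern described above might be sketched roughly as follows. This is a user-space illustration, not the kernel's select() code: malloc() stands in for kmalloc(), and the names SMALL_N and sum_copy() are hypothetical, chosen only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold for the "common small case". */
#define SMALL_N 32

/*
 * Copy n ints into a scratch buffer and sum them.
 * Small requests use a fixed array on the stack; only larger ones
 * pay for a heap allocation. *used_heap reports which path was taken.
 * Returns the sum, or -1 if the heap allocation fails.
 */
static long sum_copy(const int *src, size_t n, int *used_heap)
{
    int stack_buf[SMALL_N];     /* small, fixed stack footprint */
    int *buf = stack_buf;
    long total = 0;
    size_t i;

    *used_heap = 0;
    if (n > SMALL_N) {
        /* Uncommon large case: fall back to the allocator. */
        buf = malloc(n * sizeof(*buf));
        if (!buf)
            return -1;
        *used_heap = 1;
    }

    memcpy(buf, src, n * sizeof(*buf));
    for (i = 0; i < n; i++)
        total += buf[i];

    /* Free only what was actually allocated. */
    if (buf != stack_buf)
        free(buf);
    return total;
}
```

The trade-off is that the stack array has to stay small and fixed in size, which is exactly what breaks when its size silently scales with the page size.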
Many of the page cache functions also have the added twist that they get
called from low-memory setups (eg write_whole_page()), and so try to
minimize allocations for that reason too.
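
For reference, the 64 bytes versus 1kB figure quoted earlier can be reproduced with a little arithmetic. The sketch below assumes that MAX_BUF_PER_PAGE works out to the page size divided by 512 (one slot per 512-byte block) and that pointers are 8 bytes, as on a 64-bit kernel; the helper names are hypothetical.

```c
#include <stddef.h>

/* Assumption: one array slot per 512-byte block in a page. */
#define BLOCK_BYTES 512

/* Number of array slots needed for one page of the given size. */
static size_t bufs_per_page(size_t page_size)
{
    return page_size / BLOCK_BYTES;
}

/* Stack bytes used by an array of pointers with one slot per block. */
static size_t arr_stack_bytes(size_t page_size, size_t ptr_size)
{
    return bufs_per_page(page_size) * ptr_size;
}

/*
 * With 8-byte pointers:
 *   4kB pages:   4096 / 512 =   8 slots ->   64 bytes
 *   64kB pages: 65536 / 512 = 128 slots -> 1024 bytes
 */
```

Under those assumptions the array grows 16-fold with the page size, which matters on a kernel stack that is only a few pages deep.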
Linus