fsl booke MM vs. SMP questions

Mon May 28 20:23:27 EST 2007

On Mon, May 28, 2007 at 08:00:21PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2007-05-28 at 17:37 +0800, Liu Dave-r63238 wrote:
> > 
> > BTW, does the x86 processor support broadcasting TLB operations to
> > the system?
> > If it does, why do we use the IPI mechanism on x86? What is the
> > concern?
> 
> I don't think it supports them but then, I don't know for sure.
> 

It does not. However, IA64 (aka Itanic) does. Of course on x86, until
recently, the TLBs were completely flushed (at least the entries mapping
user space) on task switches to a different mm, which automatically
avoids races for single-threaded apps.

> Part of the problem is what your workload is. If you have a lot of small
> and short-lived processes, such as CGIs on a web server, they are
> fairly unlikely to exist on more than one processor, maybe two, during
> their lifetime (there is a strong optimisation to only do a local
> invalidate when the process only ever existed on one processor).
> 
> If you have a massively threaded workload, that is, a given process is
> likely to exist on all processors, then it's also fairly unlikely that
> you start doing a lot of fork()s or that the process is short-lived,
> so it's less of an issue unless you start abusing mmap/munmap
> or mprotect.
> 
> Also, when you have a large number of processors, having broadcast TLB
> invalidations on the bus might become a bottleneck if, at the end of the
> day, you really only want to invalidate one or two siblings. In that
> case, targeted IPIs are probably a better option.
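
To make the trade-off above concrete, here is a rough sketch of how the
choice between a local invalidate, targeted IPIs and a bus broadcast
could be made from the set of CPUs an mm has run on. Every name here
(choose_flush, mm_cpus, ipi_threshold, the enum) is made up for
illustration and is not the kernel's API; the threshold in particular is
an assumption that would have to be tuned per platform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical strategies, for illustration only. */
enum flush_strategy { FLUSH_NONE, FLUSH_LOCAL, FLUSH_IPI, FLUSH_BROADCAST };

/* Count the set bits in a CPU mask (one bit per CPU, at most 64 CPUs). */
static int mask_weight(uint64_t mask)
{
    int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * mm_cpus:       bit N is set if the mm has ever run on CPU N.
 * this_cpu:      the CPU doing the invalidate.
 * ipi_threshold: assumed tunable; at or below this many remote CPUs,
 *                targeted IPIs are preferred over a broadcast.
 */
enum flush_strategy choose_flush(uint64_t mm_cpus, int this_cpu,
                                 int ipi_threshold)
{
    uint64_t others;

    assert(this_cpu >= 0 && this_cpu < 64);
    others = mm_cpus & ~(UINT64_C(1) << this_cpu);

    if (mm_cpus == 0)
        return FLUSH_NONE;       /* mm never ran anywhere */
    if (others == 0)
        return FLUSH_LOCAL;      /* mm only ever ran on this CPU */
    if (mask_weight(others) <= ipi_threshold)
        return FLUSH_IPI;        /* only a few siblings to reach */
    return FLUSH_BROADCAST;      /* mm is spread over many CPUs */
}
```

With a threshold of 2, an mm that ran on CPUs 0 and 2 gets FLUSH_IPI
when flushed from CPU 0, while one that ran on CPUs 0-7 gets
FLUSH_BROADCAST.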

On SMP with a single die and integrated memory controllers (PASemi),
I'd bet that a TLB invalidation broadcast is typically much cheaper
since no external signals are involved (from a hardware point of view
it's not very different from a store to a shared cache line that has
to be invalidated in the caches of the other processors).

	Gabriel