[PATCH 0/5] Microwatt updates
Gabriel Paubert
paubert at iram.es
Sun Mar 2 21:13:03 AEDT 2025
[Sorry, I wanted to reply earlier, but it stayed in my drafts folder for a month]
On Sat, Feb 01, 2025 at 12:22:51PM +1100, Paul Mackerras wrote:
[snipped]
>
> 603 was a looong time ago, I don't recall the details.
>
> Regarding broadcast TLBIEs, the protocols and mechanisms for doing
> that are known to be complex and slow in the IBM Power processors (ask
> Derek Williams about that :). Anton found that in fact doing only
> local TLBIEs and using IPIs gave *better* performance on IBM Power
> systems than using hardware broadcast TLBIEs in many cases (the reason
> being that software knows which other CPUs might have a given TLB
> entry, often quite a small set, whereas hardware doesn't, and has to
> send the invalidation to every CPU and wait for a response from every
> CPU). Add to that, that most other SMP-capable CPU architectures
> don't do broadcast TLB invalidations, Intel x86 for example.
Actually it's coming to x86, at least on the AMD side:
https://lore.kernel.org/all/20250206044346.3810242-1-riel@surriel.com/
with performance numbers which look rather good.
I don't know what it looks like at the level of the hardware protocol,
but implementing it on a single chip/socket is likely relatively simple.
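
To make the trade-off in the quoted paragraph concrete, here is a toy model
(not kernel code) that only counts how many CPUs each approach has to
involve. The function names, the 64-CPU system and the set of CPUs that may
hold the TLB entry are all made-up assumptions for illustration; real
costs also depend on interconnect latency and IPI overhead, which this
sketch ignores.

```python
def broadcast_targets(ncpus, self_cpu):
    """Hardware broadcast: the invalidation goes to every other CPU,
    and the sender waits for a response from each of them."""
    return [cpu for cpu in range(ncpus) if cpu != self_cpu]


def ipi_targets(cpus_with_mm, self_cpu):
    """Software IPIs: only CPUs that software has recorded as possibly
    holding a TLB entry for this address space are interrupted."""
    return sorted(cpu for cpu in cpus_with_mm if cpu != self_cpu)


# Hypothetical example: 64 CPUs, address space seen on CPUs 3, 7 and 12,
# invalidation issued from CPU 3.
ncpus = 64
cpus_with_mm = {3, 7, 12}

print(len(broadcast_targets(ncpus, 3)))    # 63
print(len(ipi_targets(cpus_with_mm, 3)))   # 2
```

The smaller the set of CPUs that may hold the entry is relative to the
machine, the more the targeted approach saves in this model.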
Gabriel
>
> > > the kernel already has code to deal with this. One of the patches in
> > > this series provides a config option to allow platforms to select
> > > unconditionally the behaviour where cross-CPU TLB invalidations are
> > > handled using inter-processor interrupts.
> >
> > Are there plans to broadcast the (SMP cache invalidation) messages?
>
> Cache (i.e. instruction and data cache) - yes, they *are* coherent.
> More precisely, the D caches are write-through, and all I and D caches
> snoop writes to memory (including DMA writes) and invalidate any cache
> lines being written to.
>
> > Will uwatt support some real bus protocol, for example?
>
> "Real" meaning using tri-state bus drivers, like we did in the 90s? :)
>
> > Again, congrats on this great milestone! Does this floating point
> > support do square roots as well (aka "gpopt"; does it do "gfxopt" for
> > that matter, fsel?) fsqrt is kinda tricky to get to work fully
> > correctly :-)
>
> Yes, fsqrt and fsel are implemented in hardware, and are accurate to
> the last bit. Also, the FPU handles denormalized values in hardware
> (both input and output) and implements all exception handling as per
> the ISA, including the trap-enabled overflow cases. Feel free to run
> whatever tests you like and report bugs. But we're getting a bit
> off-topic from the kernel patches. :)
>
> Paul.
>