[PATCH 0/5] Microwatt updates
Paul Mackerras
paulus at ozlabs.org
Sat Feb 1 12:22:51 AEDT 2025
On Fri, Jan 31, 2025 at 10:13:43AM -0600, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Jan 29, 2025 at 09:49:49AM +1100, Paul Mackerras wrote:
> > This patch series updates the kernel support for the Microwatt
> > soft-core and its implementation on FPGA systems, particularly the
> > Digilent Arty A7-100 FPGA development board.
> >
> > Microwatt now supports almost all of the features of the SFFS (Scalar
> > Fixed-point and Floating-point Subset) compliancy subset of Power ISA
> > version 3.1C, including prefixed instructions and the fixed-point hash
> > (ROP mitigation) instructions. It is also now SMP-capable, and a
> > dual-core system will fit on the Arty A7-100 board.
>
> Congratulations!
Thanks!
> > Microwatt does not have broadcast TLB invalidations in SMP systems;
>
> So it isn't *really* SMP. Compare 603 vs. 604. With enough software
Actually, the term "SMP" is about latency to memory, indicating that
all CPUs have access to memory with similar latency. It doesn't say
anything about coherency, either of memory caches or TLBs. So yes,
Microwatt is SMP.
And for the record, the instruction and data caches are coherent,
which is what matters to user-space. Stuff to do with the TLB is not
visible to user-space. And the ISA explicitly says "TLBs are
non-coherent caches of the HTABs and Radix Trees" (Book III section
6.10.1).
> (OS) trickery you can make some things work, but :-) (There have been
> many 603 multiprocessor systems as well, to draw the analogy further
> than wanted :-) )
603 was a looong time ago; I don't recall the details.
Regarding broadcast TLBIEs, the protocols and mechanisms for doing
that are known to be complex and slow in the IBM Power processors (ask
Derek Williams about that :). Anton found that in fact doing only
local TLBIEs and using IPIs gave *better* performance on IBM Power
systems than using hardware broadcast TLBIEs in many cases (the reason
being that software knows which other CPUs might have a given TLB
entry, often quite a small set, whereas hardware doesn't, and has to
send the invalidation to every CPU and wait for a response from every
CPU). On top of that, most other SMP-capable CPU architectures, Intel
x86 for example, don't do broadcast TLB invalidations.
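To make the message-count argument concrete, here is a toy model (not
kernel code). The names ToyCpu, shootdown_with_ipis and
shootdown_with_broadcast are made up for illustration, and the
"mm_cpumask" set merely stands in for whatever software uses to track
which CPUs might hold a given entry.

```python
class ToyCpu:
    """Hypothetical CPU with a private, non-coherent TLB."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.tlb = {}            # (mm_id, virtual page) -> physical page
        self.ipis_received = 0

    def local_invalidate(self, mm_id, vpage):
        # Stand-in for a local-only TLB invalidation.
        self.tlb.pop((mm_id, vpage), None)


def shootdown_with_ipis(cpus, initiator, mm_cpumask, mm_id, vpage):
    """Invalidate locally, then interrupt only CPUs that may hold the entry."""
    cpus[initiator].local_invalidate(mm_id, vpage)
    targets = sorted(mm_cpumask - {initiator})
    for cpu_id in targets:
        cpus[cpu_id].ipis_received += 1
        cpus[cpu_id].local_invalidate(mm_id, vpage)
    return len(targets)          # number of remote CPUs disturbed


def shootdown_with_broadcast(cpus, initiator, mm_id, vpage):
    """Model a hardware broadcast: every other CPU gets the invalidation."""
    for cpu in cpus:
        cpu.local_invalidate(mm_id, vpage)
    return len(cpus) - 1         # number of remote CPUs disturbed


# Assumed scenario: 8 CPUs, but the address space has only run on CPUs 0 and 3.
cpus = [ToyCpu(i) for i in range(8)]
for i in (0, 3):
    cpus[i].tlb[(1, 0x10)] = 0x99

print(shootdown_with_ipis(cpus, 0, {0, 3}, 1, 0x10))   # 1 remote CPU
print(shootdown_with_broadcast(cpus, 0, 1, 0x10))      # 7 remote CPUs
```

The model ignores the cost of taking an interrupt and only counts how
many CPUs have to be contacted, which is the part where software
knowledge helps.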
> > the kernel already has code to deal with this. One of the patches in
> > this series provides a config option to allow platforms to select
> > unconditionally the behaviour where cross-CPU TLB invalidations are
> > handled using inter-processor interrupts.
>
> Are there plans to broadcast the (SMP cache invalidation) messages?
Caches (i.e. the instruction and data caches): yes, they *are* coherent.
More precisely, the D caches are write-through, and all I and D caches
snoop writes to memory (including DMA writes) and invalidate any cache
lines being written to.
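As a rough illustration of that scheme, here is a toy model (not the
actual Microwatt logic). ToyCache and ToySystem are made-up names, each
ToyCache stands for any one I or D cache, and the choice to update the
writer's own line in place is an assumption of the sketch.

```python
class ToyCache:
    """Hypothetical cache that holds clean copies of memory lines."""

    def __init__(self):
        self.lines = {}                  # line address -> cached value

    def snoop_write(self, addr):
        # Another agent wrote this line: drop our copy.
        self.lines.pop(addr, None)


class ToySystem:
    def __init__(self, num_caches):
        self.memory = {}
        self.caches = [ToyCache() for _ in range(num_caches)]

    def read(self, cache_id, addr):
        cache = self.caches[cache_id]
        if addr not in cache.lines:      # miss: refill from memory
            cache.lines[addr] = self.memory.get(addr, 0)
        return cache.lines[addr]

    def write(self, cache_id, addr, value):
        """Write-through store; cache_id=None models a DMA write."""
        self.memory[addr] = value        # memory is always up to date
        for i, cache in enumerate(self.caches):
            if i == cache_id:
                if addr in cache.lines:  # assumption: writer updates in place
                    cache.lines[addr] = value
            else:
                cache.snoop_write(addr)  # everyone else invalidates


system = ToySystem(num_caches=2)
system.write(None, 0x40, 1)              # DMA write
print(system.read(0, 0x40))              # 1
system.write(0, 0x40, 2)                 # store through cache 0
print(system.read(1, 0x40))              # 2, refilled from memory
```

Because stores always reach memory and every other cache drops its
copy, a later read anywhere can only miss and refetch the new value.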
> Will uwatt support some real bus protocol, for example?
"Real" meaning using tri-state bus drivers, like we did in the 90s? :)
> Again, congrats on this great milestone! Does this floating point
> support do square roots as well (aka "gpopt"; does it do "gfxopt" for
> that matter, fsel?) fsqrt is kinda tricky to get to work fully
> correctly :-)
Yes, fsqrt and fsel are implemented in hardware, and are accurate to
the last bit. Also, the FPU handles denormalized values in hardware
(both input and output) and implements all exception handling as per
the ISA, including the trap-enabled overflow cases. Feel free to run
whatever tests you like and report bugs. But we're getting a bit
off-topic from the kernel patches. :)
Paul.