wmb vs mmiowb

Nick Piggin npiggin at suse.de
Wed Aug 29 10:59:04 EST 2007


On Tue, Aug 28, 2007 at 03:56:28PM -0500, Brent Casavant wrote:
> On Fri, 24 Aug 2007, Nick Piggin wrote:
> 
> > And all platforms other than sn2 don't appear to reorder IOs after
> > they leave the CPU, so only sn2 needs to do the mmiowb thing before
> > spin_unlock.
> 
> I'm sure all of the following is already known to most readers, but
> I thought the paragraph above might potentially cause confusion as
> to the nature of the problem mmiowb() is solving on SN2.  So for
> the record...
> 
> SN2 does not reorder IOs issued from a single CPU (that would be
> insane).  Neither does it reorder IOs once they've reached the IO
> fabric (equally insane).  From an individual CPU's perspective, all
> IOs that it issues to a device will arrive at that device in program
> order.

This is why I think mmiowb() is not like a Linux memory barrier.

And I presume that the device would see IOs and regular stores from
a CPU in program order, given the correct wmb()s? (but maybe I'm
wrong... more below).


> (In this entire message, all IOs are assumed to be memory-mapped.)
> 
> The problem mmiowb() helps solve on SN2 is the ordering of IOs issued
> from multiple CPUs to a single device.  That ordering is undefined, as
> IO transactions are not ordered across CPUs.  That is, if CPU A issues
> an IO at time T, and CPU B at time T+1, CPU B's IO may arrive at the
> IO fabric before CPU A's IO, particularly if CPU B happens to be closer
> than CPU A to the target IO bridge on the NUMA network.
> 
> The simplistic method to solve this is a lock around the section
> issuing IOs, thereby ensuring serialization of access to the IO
> device.  However, as SN2 does not enforce an ordering between normal
> memory transactions and memory-mapped IO transactions, you cannot
> be sure that an IO transaction will arrive at the IO fabric "on the
> correct side" of the unlock memory transaction using this scheme.

Hmm. So what if you had the following code executed by a single CPU:

writel(data, ioaddr);
wmb(); 
*mem = 10;

Will the device see the io write before the store to mem?


> Enter mmiowb().
> 
> mmiowb() causes SN2 to drain the pending IOs from the current CPU's
> node.  Once the IOs are drained the CPU can safely unlock a normal
> memory based lock without fear of the unlock's memory write passing
> any outstanding IOs from that CPU.
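
So presumably the fixed version of that scheme looks something like this
(again made-up driver code):

spin_lock(&io_lock);
writel(data, ioaddr);    /* MMIO write heading for the device */
mmiowb();                /* drain this CPU's pending MMIO first */
spin_unlock(&io_lock);   /* unlock store can no longer pass the MMIO */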

mmiowb() needs to come with a disclaimer that it's probably wrong if
called outside a lock, and probably wrong if called between two io
writes (a regular wmb() is what's needed there). I think some drivers
are getting this wrong.
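
To illustrate what I mean (made-up driver code), this usage looks
suspect:

spin_lock(&io_lock);
writel(a, ioaddr);
mmiowb();                /* probably wrong here: a plain wmb() is what's
                            wanted between io writes */
writel(b, ioaddr);
spin_unlock(&io_lock);   /* and no mmiowb() before the unlock */

whereas what's presumably wanted is:

spin_lock(&io_lock);
writel(a, ioaddr);
wmb();                   /* regular wmb() if the two io writes must be ordered */
writel(b, ioaddr);
mmiowb();                /* keep the MMIO from passing the unlock store */
spin_unlock(&io_lock);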


