[PATCH] Document Linux's memory barriers [try #4]

Paul Mackerras paulus at samba.org
Fri Mar 10 10:34:53 EST 2006

David Howells writes:

> +On some systems, I/O writes are not strongly ordered across all CPUs, and so
> +locking should be used, and mmiowb() should be issued prior to unlocking the
> +critical section.

I think we should say more strongly that mmiowb() is required where
MMIO accesses are done under a spinlock, and that if your driver is
missing them then that is a bug.  I don't think it makes sense to say
that mmiowb is required "on some systems".

At least, we should either make that statement, or we should not
require any driver on any platform to use mmiowb explicitly.  (In that
case, the platforms that need it could do something like keep a
per-cpu count of MMIO accesses, which is zeroed in spin_lock and
incremented by read*/write*, and if spin_unlock finds it non-zero, it
does the mmiowb().)
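
A rough sketch of that counting scheme, written as a userspace model
rather than kernel code.  Every model_* name below is hypothetical, a
single global stands in for the per-cpu counter, and nested locks are
ignored:

```c
/* Userspace model only; all model_* names are hypothetical. */

static int mmio_count;		/* stands in for a per-cpu counter */
static int mmiowb_issued;	/* counts barriers, for inspection */

/* Stand-in for the platform's real mmiowb(). */
static void model_mmiowb(void)
{
	mmiowb_issued++;
}

/* Taking the lock zeroes the count of MMIO accesses. */
static void model_spin_lock(void)
{
	mmio_count = 0;
}

/* Every MMIO write (and likewise read) bumps the count. */
static void model_writel(unsigned int val, volatile unsigned int *reg)
{
	*reg = val;
	mmio_count++;
}

/* Unlock issues the barrier only if MMIO was done under the lock. */
static void model_spin_unlock(void)
{
	if (mmio_count != 0)
		model_mmiowb();
	/* the real unlock would follow here */
}
```

With this, a lock/unlock pair with no MMIO in between costs no
barrier, and the driver never has to call mmiowb() itself.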

Also, this section doesn't sound right to me:

> +	writew(ADDR, ctl_reg_3);
> +	writew(DATA, y);
> +	<interrupt>
> +	writew(ADDR, ctl_reg_4);
> +	q = readw(DATA);
> +	</interrupt>
> +
> +If ordering rules are sufficiently relaxed, the write to the data register
> +might happen after the second write to the address register.
> +
> +
> +It must be assumed that accesses done inside an interrupt disabled section may
> +leak outside of it and may interleave with accesses performed in an interrupt
> +and vice versa unless implicit or explicit barriers are used.
> +
> +Normally this won't be a problem because the I/O accesses done inside such
> +sections will include synchronous read operations on strictly ordered I/O
> +registers that form implicit I/O barriers. If this isn't sufficient then an
> +mmiowb() may need to be used explicitly.

There shouldn't be any problem here, because readw/writew _must_
ensure that the device accesses are serialized.  Just saying "if this
isn't sufficient" leaves the reader wondering when it might not be
sufficient or how they would know when it wasn't sufficient, and
introduces doubt where there needn't be any.

Of course, on an SMP system it would be quite possible for the
interrupt to be taken on another CPU, and in that case disabling
interrupts (I assume that by "DISABLE IRQ" you mean
local_irq_disable() or some such) gets you absolutely nothing; you
need to use a spinlock, and then the mmiowb is required.

You may like to include these words describing some of the rules:

* If you have stores to regular memory, followed by an MMIO store, and
  you want the device to see the stores to regular memory at the point
  where it receives the MMIO store, then you need a wmb() between the
  stores to regular memory and the MMIO store.

* If you have PIO or MMIO accesses, and you need to ensure the
  PIO/MMIO accesses don't get reordered with respect to PIO/MMIO
  accesses on another CPU, put the accesses inside a spin-locked
  region, and put an mmiowb() between the last access and the
  spin_unlock.

* smp_wmb() doesn't necessarily do any ordering of MMIO accesses
  vs. other accesses, and in that sense it is weaker than wmb().
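
The first two rules can be illustrated with a small userspace mock.
This is only a sketch: wmb(), mmiowb(), spin_lock(), spin_unlock() and
writel() here are stubs that record the order in which they are called,
not the kernel primitives, and struct desc, start_dma() and the
doorbell register are made up for the example:

```c
#include <string.h>

/* Event trace so the order of operations can be inspected. */
static char trace[64];

static void note(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace)) {
		trace[n] = c;
		trace[n + 1] = '\0';
	}
}

/* Stubs standing in for the kernel primitives (assumptions). */
static void wmb(void)			{ note('W'); }
static void mmiowb(void)		{ note('B'); }
static void spin_lock(int *l)		{ *l = 1; note('L'); }
static void spin_unlock(int *l)		{ *l = 0; note('U'); }

static void writel(unsigned int v, volatile unsigned int *reg)
{
	*reg = v;
	note('M');
}

/* Hypothetical DMA descriptor living in regular memory. */
struct desc {
	unsigned int addr;
	unsigned int len;
};

static void start_dma(int *lock, struct desc *d,
		      volatile unsigned int *doorbell)
{
	/* Stores to regular memory. */
	d->addr = 0x1000;
	d->len = 64;
	note('S');

	/* The device must see the descriptor before the MMIO store. */
	wmb();

	spin_lock(lock);
	writel(1, doorbell);
	/* Order the MMIO against the next CPU to take the lock. */
	mmiowb();
	spin_unlock(lock);
}
```

After one call to start_dma() the trace reads "SWLMBU": memory stores,
wmb(), lock, MMIO store, mmiowb(), unlock.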

