[PATCH/RFC] Hookable IO operations
Segher Boessenkool
segher at kernel.crashing.org
Sun Nov 5 09:33:18 EST 2006
>>> + * for the out_* accessors has the arguments in opposite order
>>> from the usual
>>> + * linux PCI accessors. Unlike those, they take the address first
>>> and the value
>>> + * next.
>>
>> Isn't that more confusing than useful?
>
> That's how they have been singe day 1. I'm not about to change that
> and
> go fix every bit of ppc code that uses them. At least not with this
> patch.
Sure. By just reading the patch I couldn't tell whether this was
new (or a new comment) or not, without cross-referencing stuff :-)
>> Drop the "inline" and you don't need any macro games anymore, and
>> code
>> size drops, and it shouldn't be slower -- the I/O operation itself
>> dominates runtime (but testing is needed, of course).
>
> No, the low level accessors are staying inline and I don't see how
> that
> would prevent macro games... Anyway that doesn't matter much at this
> point.
It would be nice if someone could measure the difference between
inlined and non-inlined though, both in code size and actual
performance.
> Note that I've posted a second version of the patch with simpler
> macros.
Your mails show up in my mailbox really late, I'll look at the
new version.
>> You can also drop the "volatile" -- it has no effect.
>
> Well, it's possible, but I've never quite understood when and when
> not C
> compilers need/want volatile or not in the case of IO accessors :-)
"volatile" on a pointer only has effect at the point where that
pointer is dereferenced (at the C level, asm doesn't count).
"volatile" in a function prototype is meaningless; in a function
definition, it only does something _within_ that function, and
then only where the pointer is dereferenced. It's preferable
for clarity to cast the pointer to the volatile version of its
type _right at the dereference itself_.
> It
> seems to be the rule accross linux that those functions take
> volatile so
> I will continue following it.
... but then there's that, heh. I don't feel up to a bad-use-of-
volatile-crusade right now, so I won't ask you to either :-)
>> "r"(addr) should probably be "b" (and you might want to lose the
>> "m" --
>> it certainly won't have any bad effects if you decide to get rid
>> of the
>> inline).
>
> In which case should "r" be "b" ? Those macros are directly equivalent
> to the previous code that was there and it didn't use "b". If you look
> closely, depending on the type of operations(lbz,lhz,lwz etc.. vs.
> lhbrz, lwbrx etc...), we use %1 or %2 for the address. In the former
> case, we get more efficient code by letting gcc generate the right
> type
> of instruction (update form, indexed form, ... using %U and %X) by
> using
> and "m" input. For the later, we use "r" and we use it for rB which
> can
> safely be r0 afaik so we don't need "b".
>
>>> +#define DEF_MMIO_IN_BE(pfx, size, insn) \
>>> + DEF_MMIO_IN(pfx##size, u##size, __stringify(insn)"%U2%X2 %0,%2")
>>> +#define DEF_MMIO_IN_LE(pfx, size, insn) \
>>> + DEF_MMIO_IN(pfx##size, u##size, __stringify(insn)" %0,0,%1")
>>
>> ... "r" is okay actually, no "b" needed (would be needed if addr
>> was the
>> first arg -- which is very much preferred on most CPUs for
>> performance
>> reasons).
>
> Ah ? But how so ? I mean, there has to be a rB, there is no form of
> these instructions (the LE ones) with only one argument.
In load/store insns of the form
op rD,rA,rB
the rA needs to be "b" (i.e., GPR0 is disallowed) and rB can be "r".
Some CPUs want rA to be a "real" address itself, probably because of
how they do the storage address translation. I don't have details,
but see http://gcc.gnu.org/PR28690 . I can speculate but let's do
that in private, not on this mailing list :-)
> Thus there is
> always a register specified as the second part of the address. So if I
> was using "b" with rA, I would have to spill another register for
> rB and
> put 0 in it.
Quite true; just keep it as-is until you get complaints :-)
>>> +/* Enforce in-order execution of data I/O.
>>> + * No distinction between read/write on PPC; use eieio for all
>>> three.
>>> + */
>>
>> It also prevents write gathering; if not, the _w version wouldn't
>> need eieio.
>
> That was there already, I haven't touched that comment. Maybe I
> should...
Yes exactly.
>>> +config PPC_INDIRECT_IO
>>
>> I find this name a little confusing, it sounds too much like the
>> "PCI indirect I/O".
>
> Any better idea ?
Not sure really... PPC_HOOKED_IO?
>> Get rid of the caps in IoAddress too?
>
> As I said to Arnd, I will do that in a separate patch as I sanitize a
> bit the content of those functions.
Yeah I didn't see that reply yet. Great to hear you'll do this
cleanup, thanks!
And now to review #2, or did you send #3 already :-)
Segher
More information about the Linuxppc-dev
mailing list