[RFC PATCH 0/2] powerpc: CR based local atomic operation implementation

Rusty Russell rusty at rustcorp.com.au
Thu Dec 18 15:18:17 AEDT 2014


David Laight <David.Laight at ACULAB.COM> writes:
> From: Madhavan Srinivasan [mailto:maddy at linux.vnet.ibm.com]
> ...
>> >>> I also wonder if it is possible to inspect the interrupted
>> >>> code to determine the start/end of the RAS block.
>> >>> (Easiest if you assume that there is a single 'write' instruction
>> >>> as the last entry in the block.)
>> >>>
>> >> So each local_* function also has code in the __ex_table section. IIUC,
>> >> __ex_table contains two addresses. So if the return address is found in
>> >> the first column of the __ex_table, use the corresponding address in the
>> >> second column to continue from.
>> >
>> > That really doesn't scale.
>> > I don't know how many thousand address pairs your table will have (and
>> > the ones in each loadable module), but the search isn't going to be cheap.
>> >
>> > If these sequences are restartable then they can only have one write
>> > to memory.
>> >
>> 
>> Maybe, but I see these issues with the instruction decode path:
>> 
>> 1) Decoding an instruction may also cause a fault (in the case of a module),
>> and handling a fault at this stage, on the interrupt exit path,
>> makes me nervous.
>
> It shouldn't be possible to unload a module that is interrupted by
> a hardware interrupt.
> An 'invalid' loadable module can cause an oops/panic anyway.

Yes, the module won't fault (vmalloc memory can be lazily mapped, but
we've already copied the module into there, so it won't happen).

>> 2) The resulting code, with lots of conditions and branches (for opcode
>> decode), will be messy and may be a maintenance issue.
>
> You don't need to decode the instructions.
> Just look for the two specific instructions used as markers.
> This is only really possible with fixed-size instructions.
>
> It might also be that the 'interrupt entry' path is easier to
> modify than the 'interrupt exit' one (fewer code paths) and
> you just need to modify the 'pc' in the stack frame.
> You are only interested in interrupts from kernel space.
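
For concreteness, the marker idea quoted above could look roughly like the
sketch below. It is not taken from the patch: the marker encodings, the
window size and the function name are all hypothetical, and it assumes
fixed-size 32-bit instructions with the single store as the last instruction
before the end marker.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker encodings, chosen only for illustration. */
#define RAS_BEGIN_MARKER 0x60000001u
#define RAS_END_MARKER   0x60000002u
/* Assumed upper bound on the number of instructions in one block. */
#define RAS_MAX_INSNS    8

/*
 * text: instruction words, n: number of words, pc: index of the
 * interrupted instruction.  Returns the index to resume at: the begin
 * marker if pc lies inside an unfinished block, otherwise pc itself.
 */
size_t ras_restart_index(const uint32_t *text, size_t n, size_t pc)
{
    size_t begin = 0;
    int found_begin = 0;

    if (pc >= n)
        return pc;

    /* Scan backwards for the begin marker, within the window. */
    for (size_t k = 0; k <= RAS_MAX_INSNS && k <= pc; k++) {
        uint32_t insn = text[pc - k];

        if (insn == RAS_END_MARKER)
            return pc;          /* block already completed */
        if (insn == RAS_BEGIN_MARKER) {
            begin = pc - k;
            found_begin = 1;
            break;
        }
    }
    if (!found_begin)
        return pc;

    /* Confirm that a matching end marker follows within the window. */
    for (size_t j = pc; j < n && j <= begin + RAS_MAX_INSNS; j++) {
        if (text[j] == RAS_END_MARKER)
            return begin;       /* inside the block: restart it */
    }
    return pc;
}
```

The interrupt entry path would then only need to rewrite the saved pc when
the returned index differs from the interrupted one.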

It's an overoptimization for a case that statistically never happens.
You won't even be able to measure the difference.

The question of bloat remains, but that's also easily measured.  In
practice, I'd guess less than 1k.
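
As a rough illustration of the table approach discussed above, here is a
minimal sketch of a lookup over sorted address pairs. The struct and
function names are hypothetical and the entry layout is an assumption, not
the kernel's actual exception table format. Because the table is kept
sorted, a binary search bounds the lookup at O(log n) comparisons, and the
size cost is simply the entry size times the number of local_* call sites.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: address inside a sequence, and where to restart. */
struct ras_entry {
    uintptr_t insn;     /* address that may be interrupted */
    uintptr_t restart;  /* address to resume from instead */
};

/* Binary search; tbl must be sorted by ascending insn address. */
const struct ras_entry *ras_search(const struct ras_entry *tbl, size_t n,
                                   uintptr_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tbl[mid].insn == addr)
            return &tbl[mid];
        if (tbl[mid].insn < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Returns the restart address if pc is in the table, else pc unchanged. */
uintptr_t ras_fixup_pc(const struct ras_entry *tbl, size_t n, uintptr_t pc)
{
    const struct ras_entry *e = ras_search(tbl, n, pc);

    return e ? e->restart : pc;
}
```

With a few thousand entries this is about a dozen comparisons, and only on
the path where the interrupt came from kernel space.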

Cheers,
Rusty.

