[RFC PATCH v2 00/11] powerpc: "paca->soft_enabled" based local atomic operation implementation
maddy at linux.vnet.ibm.com
Mon Aug 1 05:06:18 AEST 2016
Local atomic operations are fast, highly reentrant per-CPU counters,
used for percpu variable updates. Local atomic operations only guarantee
atomicity of the variable modification wrt the CPU which owns the data, and
they need to be executed in a preemption-safe way.
Here is the design of the patchset. Since local_* operations
only need to be atomic with respect to interrupts (IIUC), we have two options:
either replay the "op" if interrupted, or replay the interrupt after
the "op". The initial patchset implemented the local_* operations
based on CR5, which replays the "op". That patchset had issues when
rewinding the address pointer from an array, which made the slow path
really slow. Since the CR5-based implementation proposed using __ex_table to find
the rewind address, it also raised concerns about the size of __ex_table and vmlinux.
This patchset instead uses Benjamin Herrenschmidt's suggestion of using
arch_local_irq_disable() to soft-disable interrupts (including PMIs).
After finishing the "op", arch_local_irq_restore() is called and any
interrupts that occurred in the meantime are replayed.
The current paca->soft_enabled logic is reversed and the MASKABLE_EXCEPTION_* macros
are extended to support this feature.
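The patch rewrites the current local_* functions to use arch_local_irq_disable().
The base flow for each function is: soft-disable interrupts, perform the "op",
then restore the previous state so that any pending interrupts are replayed.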
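
As a rough illustration of the soft-disable and replay idea (not the kernel
code), here is a minimal userspace C model. Everything in it is hypothetical:
struct cpu_model and the model_* helpers are made-up names, the "interrupt" is
just a function call injected between the load and the store, and the real
arch_local_irq_disable()/arch_local_irq_restore() operate on
paca->soft_enabled rather than on a bool.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct cpu_model {
	bool soft_disabled;	/* models the per-CPU soft-mask state */
	int  pending;		/* interrupts that arrived while masked */
	int  handled;		/* interrupts whose handler has run */
	long counter;		/* the local_t-style per-CPU counter */
};

/* The handler touches the same counter, as a PMI handler might. */
static void model_handle_irq(struct cpu_model *cpu)
{
	cpu->counter++;
	cpu->handled++;
}

/* An interrupt arrives: record it if soft-masked, otherwise run it now. */
static void model_raise_irq(struct cpu_model *cpu)
{
	if (cpu->soft_disabled)
		cpu->pending++;
	else
		model_handle_irq(cpu);
}

/* Soft-disable and return the previous state, like a "flags" value. */
static bool model_irq_disable(struct cpu_model *cpu)
{
	bool old = cpu->soft_disabled;

	cpu->soft_disabled = true;
	return old;
}

/* Restore the previous state; replay pending interrupts if now enabled. */
static void model_irq_restore(struct cpu_model *cpu, bool old)
{
	assert(cpu->pending >= 0);
	cpu->soft_disabled = old;
	if (old)
		return;
	while (cpu->pending > 0) {
		cpu->pending--;
		model_handle_irq(cpu);
	}
}

/* The "op": load, modify, store, bracketed by disable/restore. */
static void model_local_add(struct cpu_model *cpu, long v, bool irq_mid_op)
{
	bool flags = model_irq_disable(cpu);
	long tmp = cpu->counter;		/* load */

	if (irq_mid_op)
		model_raise_irq(cpu);		/* deferred, not run here */
	cpu->counter = tmp + v;			/* store */
	model_irq_restore(cpu, flags);		/* handler replayed here */
}
```

In this model, calling model_local_add(&cpu, 5, true) on a zeroed cpu_model
leaves counter at 6: 5 from the op plus 1 from the replayed handler. If the
handler had run between the load and the store instead, its increment would
have been overwritten by the store.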
The reason for this approach is that the l[w/d]arx/st[w/d]cx.
instruction pair currently used for local_* operations is heavy
on cycle count and does not have a local variant. To
see whether the new implementation helps, I used a modified
version of Rusty's benchmark code on local_t.
Modifications to Rusty's benchmark code:
- Executed only local_t test
Here are the values with the patch.
Time in ns per iteration

Local_t         Without Patch   With Patch

_inc            28              8
_add            28              8
_read           3               3
_add_return     28              7
Currently only asm/local.h has been rewritten, and
the entire change has been tested only on PPC64 (pseries guest).
The first four are cleanup patches which lay the foundation
to make things easier. The fifth patch in the patchset reverses the
current soft_enabled logic; its commit message details the reason and
need for this change. The sixth and seventh patches refactor the __EXCEPTION_PROLOG_1
code to support the addition of a new parameter to the MASKABLE_* macros. The new parameter
gives the possible masking level for the interrupt. The rest of the patches
add support for maskable PMIs and implement local_t using arch_local_irq_*().
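
To make the "masking level" idea a bit more concrete, here is a small C
sketch of one way such a check could look. This is an assumption for
illustration only: the MODEL_MASK_* bit values and model_should_defer() are
hypothetical, and the actual IRQ_DISABLE_LEVEL_* values and the test done in
SOFTEN_TEST are defined by the patches themselves, not by this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the patchset's IRQ_DISABLE_LEVEL_* may differ. */
enum model_mask {
	MODEL_MASK_NONE  = 0,		/* nothing is soft-masked */
	MODEL_MASK_LINUX = 1 << 0,	/* ordinary maskable interrupts */
	MODEL_MASK_PMU   = 1 << 1,	/* performance monitor interrupts */
};

/*
 * Decide whether an interrupt tagged with mask_level should be deferred
 * (recorded for later replay) given the current soft-mask state.
 */
static bool model_should_defer(unsigned int soft_state, unsigned int mask_level)
{
	assert(mask_level != MODEL_MASK_NONE);
	return (soft_state & mask_level) != 0;
}
```

Under these assumptions, a plain interrupt-disable would set only
MODEL_MASK_LINUX, so a PMI would still be delivered, while a local_t
operation would set both bits so that PMIs are deferred as well.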
Since the patchset is experimental, the changes are focused on the pseries and
powernv platforms only. I would really like to get comments on
this approach before extending it to other powerpc platforms.
Testing so far:
- pSeries LPAR (with perf record).
- Ran kernbench with perf record for 24 hours.
- More testing needed.
Changelog RFC v1:
1) Commit messages are improved.
2) Renamed the arch_local_irq_disable_var to soft_irq_set_level as suggested
3) Renamed the LAZY_INTERRUPT* macro to IRQ_DISABLE_LEVEL_* as suggested
4) Extended the MASKABLE_EXCEPTION* macros to support an additional parameter.
5) Each MASKABLE_EXCEPTION_* macro will carry a "mask_level"
6) Logic to decide on jump to maskable_handler in SOFTEN_TEST is now based on
7) __EXCEPTION_PROLOG_1 is factored out to support the "mask_level" parameter.
   This reduced the code changes needed for supporting "mask_level" parameters.
Signed-off-by: Madhavan Srinivasan <maddy at linux.vnet.ibm.com>
Madhavan Srinivasan (11):
Add #defs for paca->soft_enabled flags
Cleanup to use IRQ_DISABLE_LEVEL_* macros for paca->soft_enabled
powerpc: move set_soft_enabled()
powerpc: Use set_soft_enabled api to update paca->soft_enabled
powerpc: reverse the soft_enable logic
powerpc: Avoid using EXCEPTION_PROLOG_1 macro in MASKABLE_*
powerpc: Add new _EXCEPTION_PROLOG_1 macro
powerpc: Add "mask_lvl" paramater to MASKABLE_* macros
powerpc: Add support to mask perf interrupts
powerpc: Support to replay PMIs
powerpc: rewrite local_t using soft_irq
arch/powerpc/include/asm/exception-64s.h | 106 +++++++++++++++++++++----------
arch/powerpc/include/asm/hw_irq.h | 46 ++++++++++++--
arch/powerpc/include/asm/irqflags.h | 8 +--
arch/powerpc/include/asm/kvm_ppc.h | 2 +-
arch/powerpc/include/asm/local.h | 91 ++++++++++++++++++--------
arch/powerpc/kernel/entry_64.S | 16 ++---
arch/powerpc/kernel/exceptions-64s.S | 46 ++++++++++----
arch/powerpc/kernel/head_64.S | 3 +-
arch/powerpc/kernel/idle_power4.S | 3 +-
arch/powerpc/kernel/irq.c | 24 +++----
arch/powerpc/kernel/process.c | 3 +-
arch/powerpc/kernel/setup_64.c | 5 +-
arch/powerpc/kernel/time.c | 4 +-
arch/powerpc/mm/hugetlbpage.c | 2 +-
arch/powerpc/perf/core-book3s.c | 2 +-
15 files changed, 247 insertions(+), 114 deletions(-)