[RFC Linux patch] powerpc: add documentation for HWCAPs

Nicholas Piggin npiggin at gmail.com
Fri May 20 22:06:38 AEST 2022


Excerpts from Michael Ellerman's message of May 20, 2022 7:21 pm:
> Nicholas Piggin via Libc-alpha <libc-alpha at sourceware.org> writes:
>> This takes the arm64 file and adjusts it for powerpc. Feature
>> descriptions are vaguely handwaved by me.
>> ---
> 
> Thanks for attempting to document this.

It was mainly copy and paste from two existing files :)

>> +1. Introduction
>> +---------------
>> +
>> +Some hardware or software features are only available on some CPU
>> +implementations, and/or with certain kernel configurations, but have no
>> +architected discovery mechanism available to userspace code. The kernel
> 
> By "no architected discovery mechanism" you mean nothing in the ISA, but
> I think a reader might not understand that. After all HWCAP is an
> "architected discovery mechanism", architected by the kernel and libc.
> 
> Maybe just say "no other discovery mechanism".

Good point, I reworded that.

>> +Features cannot be probed reliably through other means. When a feature
>> +is not available, attempting to use it may result in unpredictable
>> +behaviour, and is not guaranteed to result in any reliable indication
>> +that the feature is unavailable, such as a SIGILL.
> 
> I'd just drop the "such as a SIGILL", don't give people ideas :)

Yep.

>> +2. hwcap allocation
>> +-------------------
>> +
>> +HWCAPs are allocated as described in Power Architecture 64-Bit ELF V2 ABI
> 
> Are we calling them hwcaps or HWCAPs?

arm64 was mixed. We'll go with HWCAP.

>> +Specification (which will be reflected in the kernel's uapi headers).
>> +
>> +3. The hwcaps exposed in AT_HWCAP
>> +---------------------------------
>> +
>> +PPC_FEATURE_32
>> +    32-bit CPU
>> +
>> +PPC_FEATURE_64
>> +    64-bit CPU (userspace may be running in 32-bit mode).
>> +
>> +PPC_FEATURE_601_INSTR
>> +    The processor is PowerPC 601
> 
> Unused in the kernel since:
>   f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
> 
>> +PPC_FEATURE_HAS_ALTIVEC
>> +    Vector (aka Altivec, VSX) facility is available.
>> +
>> +PPC_FEATURE_HAS_FPU
>> +    Floating point facility is available.
>> +
>> +PPC_FEATURE_HAS_MMU
>> +    Memory management unit is present.
>> +
>> +PPC_FEATURE_HAS_4xxMAC
>> +    ?
> 
> First appeared in v2.4.9.2, as part of "Paul Mackerras: PPC update (big re-org)":
> 
>   https://github.com/mpe/linux-fullhistory/commit/dccd38599dad0588f4fb254c0a188b7c70af02e1
> 
> No extra context I can see.
> 
> I think all our 4xx (40x or 44x) CPUs have that set, so seems like it
> just means "is a 40x or 44x".
> 
>> +PPC_FEATURE_UNIFIED_CACHE
>> +    ?
> 
> Unused in the kernel since:
>   39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)")
> 
>> +PPC_FEATURE_HAS_SPE
>> +    ?
> 
> AFAIK means the CPU supports SPE (Signal Processing Engine) instructions.
> 
> They were documented in ISA v2.07 Book I chapter 8.
> 
> Not to be confused with the Cell SPEs.

Okay.

> 
> I think GCC has dropped support for SPE, so at some point we may want to
> drop the kernel support also, as it will be increasingly untested.
> 
>> +PPC_FEATURE_HAS_EFP_SINGLE
>> +    ?
> 
> Seems to be SPE related, only set on CPUs that also have SPE.

I may have found some docs on it. By the looks of it, these are some
operations additional to the SPE facility.

> 
>> +PPC_FEATURE_HAS_EFP_DOUBLE
>> +    ?
> 
> As above.
> 
>> +PPC_FEATURE_NO_TB
>> +    The timebase facility (mftb instruction) is not available.
>> +
> 
> Unused in the kernel since:
>   f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
> 
>> +PPC_FEATURE_POWER4
>> +    The processor is POWER4.
> 
> We dropped Power4 support in:
> 
>   471d7ff8b51b ("powerpc/64s: Remove POWER4 support")
> 
> But that bit is still set for PPC970/FX/MP.

Ah good catch.

> 
>> +PPC_FEATURE_POWER5
>> +    The processor is POWER5.
>> +
>> +PPC_FEATURE_POWER5_PLUS
>> +    The processor is POWER5+.
>> +
>> +PPC_FEATURE_CELL
>> +    The processor is Cell.
>> +
>> +PPC_FEATURE_BOOKE
>> +    The processor implements the BookE architecture.
>> +
>> +PPC_FEATURE_SMT
>> +    The processor implements SMT.
>> +
>> +PPC_FEATURE_ICACHE_SNOOP
>> +    The processor icache is coherent with the dcache, and instruction storage
>> +    can be made consistent with data storage for the purpose of executing
>> +    instructions with the instruction sequence:
>> +        sync
>> +        icbi (to any address)
>> +        isync
> 
> Where did you get that from, the ISA?

User manuals. I can't find it in the ISA, but arguably it should have some
note or reference to coherent implementations, seeing as all POWER CPUs
have had this for years.

>> +PPC_FEATURE_ARCH_2_05
>> +    The processor supports the v2.05 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE_PA6T
>> +    The processor is PA6T.
>> +
>> +PPC_FEATURE_HAS_DFP
>> +    DFP facility is available.
>> +
>> +PPC_FEATURE_POWER6_EXT
>> +    The processor is POWER6.
>> +
>> +PPC_FEATURE_ARCH_2_06
>> +    The processor supports the v2.06 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE_HAS_VSX
>> +    VSX facility is available.
>> +
>> +PPC_FEATURE_PSERIES_PERFMON_COMPAT
> 
> Explanation in:
>   0f4733147520 ("powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT")
> 
> But AFAIK only oprofile ever used that, not perf, or maybe perfmon2 uses it?

Seems to be the architected PMU events?

> 
>> +PPC_FEATURE_TRUE_LE
>> +    Reserved, do not use
> 
> No it's not reserved, you read the comment wrong :)
> 
> /* Reserved - do not use		0x00000004 */
> #define PPC_FEATURE_TRUE_LE		0x00000002
> #define PPC_FEATURE_PPC_LE		0x00000001
> 
> It's 4 that's reserved.

Ah yep.

> 
>> +PPC_FEATURE_PPC_LE
>> +    Reserved, do not use
> 
> There's some discussion of the two LE properties here:
> 
>   fab5db97e44f ("[PATCH] powerpc: Implement support for setting little-endian mode via prctl")
> 
> But it doesn't really explain the difference.
> 
> And this commit:
> 
>   651d765d0b2c ("[PATCH] Add a prctl to change the endianness of a process.")
> 
> Added the prctl flags:
> 
> # define PR_ENDIAN_LITTLE	1	/* True little endian mode */
> # define PR_ENDIAN_PPC_LITTLE	2	/* "PowerPC" pseudo little endian */
> 
> Which matches my recollection that PPC_LE is somehow not proper little
> endian, but I've forgotten why. Someone older than me will remember :)

I looked it up: it's "address munging", a strange mode that looks like
little endian to one's own loads and stores, but stores to memory in an
entirely different format that doesn't even match the address!
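
For what it's worth, here is a rough model of that munging as I understand
it from the old 32-bit PowerPC manuals: the low three bits of the effective
address are XORed with 7, 6, 4 or 0 for aligned 1, 2, 4 or 8 byte accesses,
and the data itself stays big-endian. The function names are made up for
illustration and nothing here is taken from the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed model of "PowerPC little-endian" mode, aligned accesses only.
 * size must be 1, 2, 4 or 8; the XOR mask is then 7, 6, 4 or 0.
 */
static size_t ppc_le_munge(size_t ea, size_t size)
{
	return ea ^ (8 - size);
}

/* Store value at the munged address, bytes in big-endian order. */
static void ppc_le_store(uint8_t *mem, size_t ea, size_t size, uint64_t value)
{
	size_t addr = ppc_le_munge(ea, size);

	for (size_t i = 0; i < size; i++)
		mem[addr + i] = (uint8_t)(value >> (8 * (size - 1 - i)));
}

/* Load from the munged address, again reading big-endian. */
static uint64_t ppc_le_load(const uint8_t *mem, size_t ea, size_t size)
{
	size_t addr = ppc_le_munge(ea, size);
	uint64_t value = 0;

	for (size_t i = 0; i < size; i++)
		value = (value << 8) | mem[addr + i];

	return value;
}
```

In this model, storing the word 0x11223344 at address 0 lands in bytes 4..7
of the doubleword, yet a byte load from address 0 returns 0x44, just as a
little-endian program expects. Another agent reading the same memory without
the munging sees neither a big-endian nor a little-endian image at address 0.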

> 
>> +3. The hwcaps exposed in AT_HWCAP2
>> +----------------------------------
>> +
>> +PPC_FEATURE2_ARCH_2_07
>> +    The processor supports the v2.07 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE2_HTM
>> +    Transactional Memory feature is available.
>> +
>> +PPC_FEATURE2_DSCR
>> +    DSCR facility is available.
>> +
>> +PPC_FEATURE2_EBB
>> +    EBB facility is available.
>> +
>> +PPC_FEATURE2_ISEL
>> +    isel instruction is available. This is superseded by ARCH_2_07 and
>> +    later.
>> +
>> +PPC_FEATURE2_TAR
>> +    VSX facility is available.
> 
> Typo?
> 
> It means the CPU has the "tar" register. I suspect it's never used.

Yeah typo.

>> +PPC_FEATURE2_VEC_CRYPTO
>> +    v2.07 crypto instructions are available.
>> +
>> +PPC_FEATURE2_HTM_NOSC
>> +    System calls fail if called in a transactional state, see
>> +    Documentation/powerpc/syscall64-abi.rst
>> +
>> +PPC_FEATURE2_ARCH_3_00
>> +    The processor supports the v3.0B / v3.0C userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE2_HAS_IEEE128
>> +    IEEE 128 is available? What instructions/data?
>> +
>> +PPC_FEATURE2_DARN
>> +    darn instruction is available.
>> +
>> +PPC_FEATURE2_SCV
>> +    scv instruction is available.
>> +
>> +PPC_FEATURE2_HTM_NO_SUSPEND
>> +    A limited Transactional Memory facility that does not support suspend is
>> +    available, see Documentation/powerpc/transactional_memory.rst.
>> +
>> +PPC_FEATURE2_ARCH_3_1
>> +    The processor supports the v3.1 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE2_MMA
>> +    MMA facility is available.

How's this?

---
 Documentation/powerpc/elf_hwcaps.rst | 209 +++++++++++++++++++++++++++
 1 file changed, 209 insertions(+)
 create mode 100644 Documentation/powerpc/elf_hwcaps.rst

diff --git a/Documentation/powerpc/elf_hwcaps.rst b/Documentation/powerpc/elf_hwcaps.rst
new file mode 100644
index 000000000000..ac0d8983717b
--- /dev/null
+++ b/Documentation/powerpc/elf_hwcaps.rst
@@ -0,0 +1,209 @@
+.. _elf_hwcaps_index:
+
+==================
+POWERPC ELF HWCAPs
+==================
+
+This document describes the usage and semantics of the powerpc ELF HWCAPs.
+
+
+1. Introduction
+---------------
+
+Some hardware or software features are only available on some CPU
+implementations, and/or with certain kernel configurations, but have no other
+discovery mechanism available to userspace code. The kernel exposes the
+presence of these features to userspace through a set of flags called HWCAPs,
+exposed in the auxiliary vector.
+
+Userspace software can test for features by acquiring the AT_HWCAP or
+AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
+flags are set, e.g.::
+
+	bool floating_point_is_present(void)
+	{
+		unsigned long HWCAPs = getauxval(AT_HWCAP);
+		if (HWCAPs & PPC_FEATURE_HAS_FPU)
+			return true;
+
+		return false;
+	}
+
+Where software relies on a feature described by a HWCAP, it should check the
+relevant HWCAP flag to verify that the feature is present before attempting to
+make use of the feature.
+
+Features should not be probed through other means. When a feature is not
+available, attempting to use it may result in unpredictable behaviour, and
+is not guaranteed to result in any reliable indication that the feature is
+unavailable.
+
+2. HWCAP allocation
+-------------------
+
+HWCAPs are allocated as described in the Power Architecture 64-Bit ELF V2 ABI
+Specification (which will be reflected in the kernel's uapi headers).
+
+3. The HWCAPs exposed in AT_HWCAP
+---------------------------------
+
+PPC_FEATURE_32
+    32-bit CPU
+
+PPC_FEATURE_64
+    64-bit CPU (userspace may be running in 32-bit mode).
+
+PPC_FEATURE_601_INSTR
+    The processor is PowerPC 601.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_HAS_ALTIVEC
+    Vector (aka Altivec, VMX) facility is available.
+
+PPC_FEATURE_HAS_FPU
+    Floating point facility is available.
+
+PPC_FEATURE_HAS_MMU
+    Memory management unit is present.
+
+PPC_FEATURE_HAS_4xxMAC
+    The processor is 40x or 44x family.
+
+PPC_FEATURE_UNIFIED_CACHE
+    The processor has a unified L1 cache for instructions and data, as
+    found in the NXP e200.
+    Unused in the kernel since:
+      39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)")
+
+PPC_FEATURE_HAS_SPE
+    Signal Processing Engine facility is available.
+
+PPC_FEATURE_HAS_EFP_SINGLE
+    Embedded Floating Point single precision operations are available.
+
+PPC_FEATURE_HAS_EFP_DOUBLE
+    Embedded Floating Point double precision operations are available.
+
+PPC_FEATURE_NO_TB
+    The timebase facility (mftb instruction) is not available.
+    This is a 601 specific HWCAP, so if it is known that the processor
+    running is not a 601, via other HWCAPs or other means, it is not
+    required to test this bit before using the timebase.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_POWER4
+    The processor is POWER4 or PPC970/FX/MP.
+    POWER4 support was dropped from the kernel in:
+      471d7ff8b51b ("powerpc/64s: Remove POWER4 support")
+
+PPC_FEATURE_POWER5
+    The processor is POWER5.
+
+PPC_FEATURE_POWER5_PLUS
+    The processor is POWER5+.
+
+PPC_FEATURE_CELL
+    The processor is Cell.
+
+PPC_FEATURE_BOOKE
+    The processor implements the BookE architecture.
+
+PPC_FEATURE_SMT
+    The processor implements SMT.
+
+PPC_FEATURE_ICACHE_SNOOP
+    The processor icache is coherent with the dcache, and instruction storage
+    can be made consistent with data storage for the purpose of executing
+    instructions with the sequence (as described in, e.g., POWER9 Processor
+    User's Manual, 4.6.2.2 Instruction Cache Block Invalidate (icbi)):
+        sync
+        icbi (to any address)
+        isync
+
+PPC_FEATURE_ARCH_2_05
+    The processor supports the v2.05 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_PA6T
+    The processor is PA6T.
+
+PPC_FEATURE_HAS_DFP
+    DFP facility is available.
+
+PPC_FEATURE_POWER6_EXT
+    The processor is POWER6.
+
+PPC_FEATURE_ARCH_2_06
+    The processor supports the v2.06 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_HAS_VSX
+    VSX facility is available.
+
+PPC_FEATURE_PSERIES_PERFMON_COMPAT
+    The processor supports architected PMU events in the range 0xE0-0xFF.
+
+PPC_FEATURE_TRUE_LE
+    The processor supports true little-endian mode.
+
+PPC_FEATURE_PPC_LE
+    The processor supports "PowerPC Little-Endian", that uses address
+    munging to make storage access appear to be little-endian, but the
+    data is stored in a different format that is unsuitable to be
+    accessed by other agents not running in this mode.
+
+4. The HWCAPs exposed in AT_HWCAP2
+----------------------------------
+
+PPC_FEATURE2_ARCH_2_07
+    The processor supports the v2.07 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HTM
+    Transactional Memory feature is available.
+
+PPC_FEATURE2_DSCR
+    DSCR facility is available.
+
+PPC_FEATURE2_EBB
+    EBB facility is available.
+
+PPC_FEATURE2_ISEL
+    isel instruction is available. This is superseded by ARCH_2_07 and
+    later.
+
+PPC_FEATURE2_TAR
+    TAR facility is available.
+
+PPC_FEATURE2_VEC_CRYPTO
+    v2.07 crypto instructions are available.
+
+PPC_FEATURE2_HTM_NOSC
+    System calls fail if called in a transactional state, see
+    Documentation/powerpc/syscall64-abi.rst.
+
+PPC_FEATURE2_ARCH_3_00
+    The processor supports the v3.0B / v3.0C userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HAS_IEEE128
+    IEEE 128 is available? What instructions/data?
+
+PPC_FEATURE2_DARN
+    darn instruction is available.
+
+PPC_FEATURE2_SCV
+    scv instruction is available.
+
+PPC_FEATURE2_HTM_NO_SUSPEND
+    A limited Transactional Memory facility that does not support suspend is
+    available, see Documentation/powerpc/transactional_memory.rst.
+
+PPC_FEATURE2_ARCH_3_1
+    The processor supports the v3.1 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_MMA
+    MMA facility is available.
-- 
2.35.1
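
Since each ARCH bit is also set by processors supporting later
architectures, a program that wants the newest supported level has to test
from newest to oldest. A minimal sketch of that follows. The bit values are
from my reading of the uapi asm/cputable.h and should be treated as
assumptions (real code should use the header), and the EX_ names and
isa_level() are hypothetical.

```c
/* Assumed values; real code should include <asm/cputable.h> instead. */
#define EX_FEATURE_ARCH_2_05	0x00001000UL	/* in AT_HWCAP */
#define EX_FEATURE_ARCH_2_06	0x00000100UL	/* in AT_HWCAP */
#define EX_FEATURE2_ARCH_2_07	0x80000000UL	/* in AT_HWCAP2 */
#define EX_FEATURE2_ARCH_3_00	0x00800000UL	/* in AT_HWCAP2 */
#define EX_FEATURE2_ARCH_3_1	0x00040000UL	/* in AT_HWCAP2 */

/*
 * Return the newest userlevel architecture advertised, encoded as
 * e.g. 207 for v2.07 and 310 for v3.1, or 0 if no ARCH bit is set.
 * Newer levels are tested first because older bits stay set.
 */
static int isa_level(unsigned long hwcap, unsigned long hwcap2)
{
	if (hwcap2 & EX_FEATURE2_ARCH_3_1)
		return 310;
	if (hwcap2 & EX_FEATURE2_ARCH_3_00)
		return 300;
	if (hwcap2 & EX_FEATURE2_ARCH_2_07)
		return 207;
	if (hwcap & EX_FEATURE_ARCH_2_06)
		return 206;
	if (hwcap & EX_FEATURE_ARCH_2_05)
		return 205;

	return 0;
}
```

On a powerpc Linux system this would be called as
isa_level(getauxval(AT_HWCAP), getauxval(AT_HWCAP2)), with getauxval() coming
from <sys/auxv.h>.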


