powerpc Linux scv support and scv system call ABI proposal

Nicholas Piggin npiggin at gmail.com
Wed Jan 29 01:05:40 AEDT 2020


Florian Weimer wrote on January 28, 2020 11:09 pm:
> * Nicholas Piggin:
> 
>> * Proposal is for PPC_FEATURE2_SCV to indicate 'scv 0' support, all other
>>   vectors will return -ENOSYS, and the decision for how to add support for
>>   a new vector deferred until we see the next user.
> 
> Seems reasonable.  We don't have to decide this today.
> 
>> * Proposal is for scv 0 to provide the standard Linux system call ABI with some
>>   differences:
>>
>> - LR is volatile across scv calls. This is necessary for support because the
>>   scv instruction clobbers LR.
> 
> I think we can express this in the glibc system call assembler wrapper
> generators.  The mcount profiling wrappers already have this property.
> 
> But I don't think we are so lucky for the inline system calls.  GCC
> recognizes an "lr" clobber with inline asm (even though it is not
> documented), but it generates rather strange assembler output as a
> result:
> 
> long
> f (long x)
> {
>   long y;
>   asm ("#" : "=r" (y) : "r" (x) : "lr");
>   return y;
> }
> 
> 	.abiversion 2
> 	.section	".text"
> 	.align 2
> 	.p2align 4,,15
> 	.globl f
> 	.type	f, @function
> f:
> .LFB0:
> 	.cfi_startproc
> 	mflr 0
> 	.cfi_register 65, 0
> #APP
>  # 5 "t.c" 1
> 	#
>  # 0 "" 2
> #NO_APP
> 	std 0,16(1)
> 	.cfi_offset 65, 16
> 	ori 2,2,0
> 	ld 0,16(1)
> 	mtlr 0
> 	.cfi_restore 65
> 	blr
> 	.long 0
> 	.byte 0,0,0,1,0,0,0,0
> 	.cfi_endproc
> .LFE0:
> 	.size	f,.-f
> 
> 
> That's with GCC 8.3 at -O2.  I don't understand what the ori is about.

ori 2,2,0 is the group-terminating nop hint for POWER8-type cores,
which had dispatch grouping rules.

> 
> I don't think we can save LR in a regular register around the system
> call, explicitly in the inline asm statement, because we still have to
> generate proper unwinding information using CFI directives, something
> that you cannot do from within the asm statement.
> 
> Supporting this in GCC should not be impossible, but someone who
> actually knows this stuff needs to look at it.

The generated assembler actually seems okay to me. If we compile
something shaped like a syscall, with -mcpu=power9:

long
f (long _r3, long _r4, long _r5, long _r6, long _r7, long _r8, long _r0)
{
  register long r0 asm ("r0") = _r0;
  register long r3 asm ("r3") = _r3;
  register long r4 asm ("r4") = _r4;
  register long r5 asm ("r5") = _r5;
  register long r6 asm ("r6") = _r6;
  register long r7 asm ("r7") = _r7;
  register long r8 asm ("r8") = _r8;

  asm ("# scv" : "+r"(r3) : "r"(r0), "r"(r4), "r"(r5), "r"(r6), "r"(r7), "r"(r8) : "lr", "ctr", "cc", "xer");

  return r3;
}


f:
.LFB0:
        .cfi_startproc
        mflr 0
        std 0,16(1)
        .cfi_offset 65, 16
        mr 0,9
#APP
 # 12 "a.c" 1
        # scv
 # 0 "" 2
#NO_APP
        ld 0,16(1)
        mtlr 0
        .cfi_restore 65
        blr
        .long 0
        .byte 0,0,0,1,0,0,0,0
        .cfi_endproc

That gets the LR save/restore right when we're also using r0.

> 
>> - CR1 and CR5-CR7 are volatile. This matches the C ABI and would allow the
>>   system call exit to avoid restoring the CR register.
> 
> This sounds reasonable, but I don't know what kind of knock-on effects
> this has.  The inline system call wrappers can handle this with minor
> tweaks.

Okay, good. In the end we would of course have to check by tracing code
through the kernel and libc, but I think there's little to no opportunity
to take advantage of the extra CR fields that are currently non-volatile.

mtcr has to write 8 independently renamed registers so it's cracked into
2 insns on POWER9 (and likely to always be a bit troublesome). It's not
much in the scheme of a system call, but while we can tweak the ABI...

> 
>> - Error handling: use of CR0[SO] to indicate error requires a mtcr / mtocr
>>   instruction on the kernel side, and it is currently not implemented well
>>   in glibc, requiring a mfcr (mfocr should be possible and asm goto support
>>   would allow a better implementation). Is it worth continuing this style of
>>   error handling? Or just move to -ve return means error? Using a different
>>   bit would allow the kernel to piggy back the CR return code setting with
>>   a test for the error case exit.
> 
> GCC does not model the condition registers, so for inline system calls,
> we have to produce a value anyway that the subsequent C code can check.
> The assembler syscall wrappers do not need to do this, of course, but
> I'm not sure which category of interfaces is more important.

Right. asm goto can improve this kind of pattern when it's inlined into
the C code which tests the result: it can branch on the flags straight
to the C error handling label, rather than moving the flags into a GPR,
testing the GPR, and branching. However...

> But the kernel uses the -errno convention internally, so I think it
> would make sense to pass this to userspace and not convert back and
> forth.  This would align with what most of the architectures do, and
> also avoids the GCC oddity.

Yes, I would be interested in opinions on this option. It seems like
matching other architectures is a good idea. Maybe there are some
reasons not to.
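
To make the -errno option concrete, here is a rough sketch of what the
libc side could look like if the raw return value alone carried the
error. The names (MAX_ERRNO_ASSUMED, raw_is_error, finish_syscall) are
made up for illustration, and the 4095 bound is assumed to match the
kernel's usual limit on errno values. Nothing here is actual glibc code.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: errno values never exceed 4095, as on other Linux ports. */
#define MAX_ERRNO_ASSUMED 4095UL

/* Nonzero if the raw return value lies in [-4095, -1], i.e. is an error. */
static int raw_is_error(unsigned long raw)
{
    /* Unsigned negation wraps, so this is ULONG_MAX - 4094. */
    return raw >= -MAX_ERRNO_ASSUMED;
}

/* Convert a raw -errno style return into the libc convention:
 * the value on success, or -1 with errno set on failure. */
static long finish_syscall(unsigned long raw)
{
    if (raw_is_error(raw)) {
        int err = (int)(0UL - raw);   /* recover the positive errno */
        assert(err > 0 && err <= 4095);
        errno = err;
        return -1;
    }
    return (long)raw;
}
```

The point is that no condition register bit is involved at all, so an
inline wrapper only needs one unsigned compare on the returned GPR.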

>> - Should this be for 64-bit only? 'scv 1' could be reserved for 32-bit
>>   calls if there was interest in developing an ABI for 32-bit programs.
>>   Marginal benefit in avoiding compat syscall selection.
> 
> We don't have an ELFv2 ABI for 32-bit.  I doubt it makes sense to
> provide an ELFv1 port for this given that it's POWER9-specific.

Okay. There's no reason not to enable this for BE. At least for the
kernel it's no additional work, so it probably remains enabled (unless
there is something really good we could do with the ABI if we exclude
ELFv1, but I don't see anything).

But if glibc only builds this for ELFv2, that's probably reasonable.

> 
> From the glibc perspective, the major question is how we handle run-time
> selection of the system call instruction sequence.  On i386, we use a
> function pointer in the TCB to call an instruction sequence in the vDSO.
> That's problematic from a security perspective.  I expect that on
> POWER9, using a pointer in read-only memory would be equally
> non-attractive due to a similar lack of PC-relative addressing.  We
> could use the HWCAP bit in the TCB, but that would add another (easy to
> predict) conditional branch to every system call.

I would have to defer to glibc devs on this. A conditional branch
should be acceptable, I think: scv improves speed by about as much as
several mispredicted branches cost (about 90 cycles).

> I don't think it matters whether both system call variants use the same
> error convention because we could have different error code extraction
> code on the two branches.

That's one less difficulty.
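
For illustration only, the HWCAP-bit branch with a separate error
extraction on each side might be modelled like this in plain C. The
bit value SKETCH_FEATURE2_SCV is a placeholder assumption (the real
PPC_FEATURE2_SCV value would come from the kernel headers), and
pick_syscall_insn, raw_result and extract_errno are hypothetical names.
The scv branch assumes the -errno convention discussed above, which is
still undecided.

```c
#include <assert.h>

/* Assumption: placeholder bit for the proposed PPC_FEATURE2_SCV flag. */
#define SKETCH_FEATURE2_SCV 0x00100000UL

enum syscall_insn { USE_SC, USE_SCV };

/* Model of what comes back from the kernel: r3 and the CR0[SO] bit. */
struct raw_result {
    unsigned long r3;
    int cr0_so;
};

/* hwcap2 stands for the AT_HWCAP2 word the C library would cache,
 * for example in the TCB.  This is the easy-to-predict branch. */
static enum syscall_insn pick_syscall_insn(unsigned long hwcap2)
{
    return (hwcap2 & SKETCH_FEATURE2_SCV) ? USE_SCV : USE_SC;
}

/* Returns 0 on success, or the positive errno on failure. */
static long extract_errno(enum syscall_insn insn, struct raw_result res)
{
    assert(insn == USE_SC || insn == USE_SCV);
    if (insn == USE_SC)
        /* Existing sc convention: SO set, positive errno in r3. */
        return res.cr0_so ? (long)res.r3 : 0;
    /* Assumed scv convention: r3 in [-4095, -1] means -errno. */
    return res.r3 >= -4095UL ? (long)(0UL - res.r3) : 0;
}
```

Since each branch already knows which instruction it issued, the two
conventions never have to be reconciled in the kernel.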

Thanks,
Nick
