[musl] Powerpc Linux 'scv' system call ABI proposal take 2

Wed Apr 22 01:00:31 AEST 2020

On 21/04/2020 11:39, Rich Felker wrote:
> On Tue, Apr 21, 2020 at 12:28:25PM +0000, David Laight wrote:
>> From: Nicholas Piggin
>>> Sent: 20 April 2020 02:10
>> ...
>>>>> Yes, but does it really matter to optimize this specific usage case
>>>>> for size? glibc, for instance, tries to leverage the syscall mechanism
>>>>> by adding some complex pre-processor asm directives.  It optimizes
>>>>> the syscall code size in most cases.  For instance, kill in static case
>>>>> generates on x86_64:
>>>>>
>>>>> 0000000000000000 <__kill>:
>>>>>    0:   b8 3e 00 00 00          mov    $0x3e,%eax
>>>>>    5:   0f 05                   syscall
>>>>>    7:   48 3d 01 f0 ff ff       cmp    $0xfffffffffffff001,%rax
>>>>>    d:   0f 83 00 00 00 00       jae    13 <__kill+0x13>
>>
>> Hmmm... that cmp + jae is unnecessary here.
> 
> It's not.. Rather the objdump was just mistakenly done without -r so
> it looks like a nop jump rather than a conditional tail call to the
> function that sets errno.
> 

Indeed, the output with -r is:

0000000000000000 <__kill>:
   0:   b8 3e 00 00 00          mov    $0x3e,%eax
   5:   0f 05                   syscall 
   7:   48 3d 01 f0 ff ff       cmp    $0xfffffffffffff001,%rax
   d:   0f 83 00 00 00 00       jae    13 <__kill+0x13>
                        f: R_X86_64_PLT32       __syscall_error-0x4
  13:   c3                      retq   

And for x86_64 __syscall_error is defined as:

0000000000000000 <__syscall_error>:
   0:   48 f7 d8                neg    %rax

0000000000000003 <__syscall_error_1>:
   3:   64 89 04 25 00 00 00    mov    %eax,%fs:0x0
   a:   00
                        7: R_X86_64_TPOFF32     errno
   b:   48 83 c8 ff             or     $0xffffffffffffffff,%rax
   f:   c3                      retq

Unlike musl, glibc lets each architecture define its own error-handling
mechanism: some embed the errno setting in the syscall stub itself, others
branch to a __syscall_error-like function, as x86_64 does.

This is most likely due to glibc's long history.  One of my long-term
plans is to simplify this: get rid of the assembly pre-processor macros,
implement all syscalls in C code, and handle errors in a platform-neutral
way using a tail call (most likely as you do in musl).
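
As a rough sketch of what such a platform-neutral helper could look like
(the name sketch_syscall_ret is made up for illustration and this is not
actual glibc code; it assumes the Linux convention that a raw syscall
returns -errno in the range [-4095, -1] on failure, which is the same
range the cmp/jae pair above tests for):

```c
#include <errno.h>

/* Hypothetical helper: convert a raw Linux syscall return value into the
 * usual libc convention (-1 with errno set on failure). */
long sketch_syscall_ret(unsigned long r)
{
	/* r > -4096UL is equivalent to r >= 0xfffffffffffff001 on a
	 * 64-bit target, i.e. the raw value lies in [-4095, -1]. */
	if (r > -4096UL) {
		errno = (int)-r;   /* the kernel returned -errno */
		return -1;
	}
	return (long)r;
}
```

A C wrapper would then end with "return sketch_syscall_ret(raw_result);",
which the compiler is free to turn into a tail call.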

>> It is also a 32bit offset jump.
>> I also suspect it gets predicted very badly.
> 
> I doubt that. This is a very standard idiom and the size of the offset
> (which is necessarily 32-bit because it has a relocation on it) is
> orthogonal to the condition on the jump.
> 
> FWIW a syscall like kill takes global kernel-side locks to be able to
> address a target process by pid, and the rate of meaningful calls you
> can make to it is very low (since it's bounded by time for target
> process to act on the signal). Trying to optimize it for speed is
> pointless, and even size isn't important locally (although in
> aggregate, lots of wasted small size can add up to more pages = more
> TLB entries = ...).

I agree.  I would rather focus on code simplicity: have a platform-neutral
way to handle errors and let the compiler optimize it, instead of messing
with assembly macros to squeeze out this kind of micro-optimization.

> 
>>>>>   13:   c3                      retq
>>>>>
>>>>> While on musl:
>>>>>
>>>>> 0000000000000000 <kill>:
>>>>>    0:	48 83 ec 08          	sub    $0x8,%rsp
>>>>>    4:	48 63 ff             	movslq %edi,%rdi
>>>>>    7:	48 63 f6             	movslq %esi,%rsi
>>>>>    a:	b8 3e 00 00 00       	mov    $0x3e,%eax
>>>>>    f:	0f 05                	syscall
>>>>>   11:	48 89 c7             	mov    %rax,%rdi
>>>>>   14:	e8 00 00 00 00       	callq  19 <kill+0x19>
>>>>>   19:	5a                   	pop    %rdx
>>>>>   1a:	c3                   	retq
>>>>
>>>> Wow that's some extraordinarily bad codegen going on by gcc... The
>>>> sign-extension is semantically needed and I don't see a good way
>>>> around it (glibc's asm is kinda a hack taking advantage of kernel not
>>>> looking at high bits, I think), but the gratuitous stack adjustment
>>>> and refusal to generate a tail call isn't. I'll see if we can track
>>>> down what's going on and get it fixed.
>>
>> A suitable cast might get rid of the sign extension.
>> Possibly just (unsigned int).
> 
> No, it won't. The problem is that there is no representation of the
> fact that the kernel is only going to inspect the low 32 bits (by
> declaring the kernel-side function as taking an int argument). The
> external kill function receives arguments by the ABI, where the upper
> bits of int args can contain junk, and the asm register constraints
> for syscalls use longs (or rather an abstract syscall-arg type). It
> wouldn't even work to have macro magic detect that the expressions
> passed are ints and use hacks to avoid that, since it's perfectly
> valid to pass an int to a syscall that expects a long argument (e.g.
> offset to mmap), in which case it needs to be sign-extended.
> 
> The only way to avoid this is encoding somewhere the syscall-specific
> knowledge of what arg size the kernel function expects. That's way too
> much redundant effort and too error-prone for the incredibly minuscule
> size benefit you'd get out of it.
> 
> Rich
>
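
To make the point about the (unsigned int) cast concrete, here is a small
sketch (function names are made up for illustration; it assumes an LP64
target where long is 64 bits).  It shows that the two ways of widening an
int to a long-sized syscall argument give different values for negative
inputs, so a generic macro cannot simply pick the zero-extending one:

```c
/* Hypothetical stand-ins for "widen an int to a long-sized syscall
 * argument".  Assumes LP64 (64-bit long). */

/* What a generic syscall macro has to do: sign-extend (movslq). */
long arg_sign_extended(int v)
{
	return (long)v;
}

/* What an (unsigned int) cast would give: zero-extension. */
long arg_zero_extended(int v)
{
	return (long)(unsigned int)v;
}
```

For v = -1 the first returns -1 and the second returns 4294967295.  That
is harmless when the kernel only reads the low 32 bits, but wrong for a
syscall that really takes a long argument.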