[musl] Powerpc Linux 'scv' system call ABI proposal take 2

Nicholas Piggin npiggin at gmail.com
Mon Apr 20 10:46:45 AEST 2020


Excerpts from Adhemerval Zanella's message of April 17, 2020 4:52 am:
> 
> 
> On 16/04/2020 15:31, Rich Felker wrote:
>> On Thu, Apr 16, 2020 at 03:18:42PM -0300, Adhemerval Zanella wrote:
>>>
>>>
>>> On 16/04/2020 14:59, Rich Felker wrote:
>>>> On Thu, Apr 16, 2020 at 02:50:18PM -0300, Adhemerval Zanella wrote:
>>>>>
>>>>>
>>>>> On 16/04/2020 12:37, Rich Felker wrote:
>>>>>> On Thu, Apr 16, 2020 at 11:16:04AM -0300, Adhemerval Zanella wrote:
>>>>>>>> My preference would be that it work just like the i386 AT_SYSINFO
>>>>>>>> where you just replace "int $128" with "call *%%gs:16" and the kernel
>>>>>>>> provides a stub in the vdso that performs either scv or the old
>>>>>>>> mechanism with the same calling convention. Then if the kernel doesn't
>>>>>>>> provide it (because the kernel is too old) libc would have to provide
>>>>>>>> its own stub that uses the legacy method and matches the calling
>>>>>>>> convention of the one the kernel is expected to provide.
>>>>>>>
>>>>>>> What about pthread cancellation and the requirement of checking the
>>>>>>> cancellable syscall anchors in asynchronous cancellation? My plan is
>>>>>>> still to use musl strategy on glibc (BZ#12683) and for i686 it 
>>>>>>> requires to always use old int$128 for program that uses cancellation
>>>>>>> (static case) or just threads (dynamic mode, which should be more
>>>>>>> common on glibc).
>>>>>>>
>>>>>>> Using the i686 strategy of a vDSO bridge symbol would require to always
>>>>>>> fallback to 'sc' to still use the same cancellation strategy (and
>>>>>>> thus defeating this optimization in such cases).
>>>>>>
>>>>>> Yes, I assumed it would be the same, ignoring the new syscall
>>>>>> mechanism for cancellable syscalls. While there are some exceptions,
>>>>>> cancellable syscalls are generally not hot paths but things that are
>>>>>> expected to block and to have significant amounts of work to do in
>>>>>> kernelspace, so saving a few tens of cycles is rather pointless.
>>>>>>
>>>>>> It's possible to do a branch/multiple versions of the syscall asm for
>>>>>> cancellation but would require extending the cancellation handler to
>>>>>> support checking against multiple independent address ranges or using
>>>>>> some alternate markup of them.
>>>>>
>>>>> The main issue is at least for glibc dynamic linking is way more common
>>>>> than static linking and once the program become multithread the fallback
>>>>> will be always used.
>>>>
>>>> I'm not relying on static linking optimizing out the cancellable
>>>> version. I'm talking about how cancellable syscalls are pretty much
>>>> all "heavy" operations to begin with where a few tens of cycles are in
>>>> the realm of "measurement noise" relative to the dominating time
>>>> costs.
>>>
>>> Yes I am aware, but at same time I am not sure how it plays on real world.
>>> For instance, some workloads might issue kernel query syscalls, such as
>>> recv, where buffer copying might not be dominant factor. So I see that if
>>> the idea is optimizing syscall mechanism, we should try to leverage it
>>> as whole in libc.
>> 
>> Have you timed a minimal recv? I'm not assuming buffer copying is the
>> dominant factor. I'm assuming the overhead of all the kernel layers
>> involved is dominant.
> 
> Not really, but reading the advantages of using 'scv' over 'sc' also does
> not outline the real expected gain.  Taking into consideration that this
> should be a micro-optimization (focused on the syscall entry path), I think
> we should use it where possible.

It's around a 90 cycle improvement. Depending on config options and
speculative mitigations in place, this may be roughly 5-20% of a gettid
syscall, which itself probably bears little relationship to what a recv
syscall doing real work would cost; it's easy to swamp it with other
work.
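
To spell out the arithmetic behind those percentages: if 90 cycles is
5-20% of a gettid syscall, the whole syscall costs roughly 450 to 1800
cycles. A minimal sketch of that calculation follows; the helper names
are hypothetical, and the 10000 cycle recv cost mentioned afterwards is
an illustrative assumption, not a measurement.

```c
#include <assert.h>

/* Hypothetical helper: total syscall cost implied when `saved` cycles
 * make up `fraction` (0 < fraction <= 1) of that cost. */
double implied_total_cycles(double saved, double fraction)
{
    assert(fraction > 0.0 && fraction <= 1.0);
    return saved / fraction;
}

/* Hypothetical helper: share of a syscall costing `total` cycles that a
 * saving of `saved` cycles represents. */
double saved_fraction(double saved, double total)
{
    assert(total > 0.0);
    return saved / total;
}
```

implied_total_cycles(90, 0.20) and implied_total_cycles(90, 0.05)
bracket the gettid cost at about 450 and 1800 cycles. For a syscall
assumed to cost 10000 cycles, saved_fraction(90, 10000) comes out under
1%, which is the "swamped by other work" case.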

But it's a pretty big win in terms of how much we try to optimise this
path.
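
For anyone who wants a rough baseline on their own machine, here is a
minimal sketch of timing a gettid loop, assuming Linux and a libc that
provides syscall(). It is not the benchmark behind the numbers above.
The helper names are hypothetical. It reports wall-clock nanoseconds per
call rather than cycles, and the figure includes the loop and the libc
syscall() wrapper overhead, so treat it as an upper bound.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: average wall-clock cost in nanoseconds of one
 * raw gettid syscall, measured over `iters` back-to-back calls. */
double gettid_ns_per_call(long iters)
{
    assert(iters > 0);
    uint64_t start = now_ns();
    for (long i = 0; i < iters; i++)
        syscall(SYS_gettid);  /* raw syscall, so libc cannot cache the result */
    uint64_t end = now_ns();
    return (double)(end - start) / (double)iters;
}
```

Calling gettid_ns_per_call(1000000) and multiplying by the core clock
frequency in GHz gives an approximate cycle count to compare the 90
cycle saving against.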

Thanks,
Nick

