VDSO ELF header
Christophe Leroy
christophe.leroy at csgroup.eu
Sat Mar 27 03:11:40 AEDT 2021
On 26/03/2021 at 16:13, Dmitry Safonov wrote:
> Hello,
>
> On 3/26/21 10:50 AM, Christophe Leroy wrote:
>>
>>
>> On 26/03/2021 at 11:46, Michael Ellerman wrote:
>>> Laurent Dufour <ldufour at linux.ibm.com> writes:
>>>> On 25/03/2021 at 17:56, Laurent Dufour wrote:
>>>>> On 25/03/2021 at 17:46, Christophe Leroy wrote:
>>>>>> On 25/03/2021 at 17:11, Laurent Dufour wrote:
>>>>>>> Since v5.11 and the changes you made to the VDSO code, it is no longer
>>>>>>> exposing the ELF header at the beginning of the VDSO mapping in user space.
>>>>>>>
>>>>>>> This is confusing CRIU which is checking for this ELF header cookie
>>>>>>> (https://github.com/checkpoint-restore/criu/issues/1417).
>>>>>>
>>>>>> How does it work on other architectures?
>>>>>
>>>>> Good question, I'll double check the CRIU code.
>>>>
>>>> On x86, there are 2 VDSO entries:
>>>> 7ffff7fcb000-7ffff7fce000 r--p 00000000 00:00 0 [vvar]
>>>> 7ffff7fce000-7ffff7fcf000 r-xp 00000000 00:00 0 [vdso]
>>>>
>>>> And the VDSO is starting with the ELF header.
>>>>
>>>>>>> I'm not an expert in the loading and ELF parts and, reading the change
>>>>>>> you made, I can't identify how this could work now, as I'm expecting
>>>>>>> the loader to need that ELF header to do the relocation.
>>>>>>
>>>>>> I think the loader is able to find it at the expected place.
>>>>>
>>>>> Actually, it seems the loader relies on the AUX vector AT_SYSINFO_EHDR.
>>>>> I guess CRIU should do the same.
>>>>>
>>>>>>>
>>>>>>> From my investigation it seems that the first bytes of the VDSO area
>>>>>>> are now the vdso_arch_data.
>>>>>>>
>>>>>>> Is the ELF header put somewhere else?
>>>>>>> How could the loader process the VDSO without that ELF header?
>>>>>>>
>>>>>>
>>>>>> Like most other architectures, we now have the data section as the
>>>>>> first page and the text section follows. So you will likely find the
>>>>>> ELF header on the second page.
>>>>
>>>> I'm wondering if the data section you're referring to is the vvar
>>>> section I can see on x86.
>>>
>>> Many of the other architectures have separate vm_special_mapping's for
>>> the data page and the vdso binary, where the former is called "vvar".
>>>
>>> eg, s390:
>>>
>>> static struct vm_special_mapping vvar_mapping = {
>>> 	.name = "[vvar]",
>>> 	.fault = vvar_fault,
>>> };
>>>
>>> static struct vm_special_mapping vdso_mapping = {
>>> 	.name = "[vdso]",
>>> 	.mremap = vdso_mremap,
>>> };
>>>
>>>
>>> I guess we probably should be doing that too.
>>>
>>
>> Dmitry proposed the same, see
>> https://github.com/0x7f454c46/linux/commit/783c7a2532d2219edbcf555cc540eab05f698d2a
>>
>>
>> Discussion at https://github.com/checkpoint-restore/criu/issues/1417
>
> Yeah, I didn't submit it officially to lkml because I couldn't test it
> yet (and I usually don't send untested patches). The VM I have fails to
> kexec and I'm having difficulty getting the serial console working, so I'd
> appreciate it if someone could either pick it up or add a Tested-by.
>
Just to let everyone know: while testing your patch with the selftest, I encountered the following Oops.
However, I also get it without your patch.
root@vgoip:~# ./sigreturn_vdso
test: sigreturn_vdso
tags: git_version:v5.12-rc4-1553-gc31141d460e6
VDSO is at 0x104000-0x10bfff (32768 bytes)
Signal delivered OK with VDSO mapped
VDSO moved to 0x77bf4000-0x77bfbfff (32768 bytes)
Signal delivered OK with VDSO moved
Unmapped VDSO
[ 1855.444371] Kernel attempted to read user page (7ff9ff30) - exploit attempt? (uid: 0)
[ 1855.459404] BUG: Unable to handle kernel data access on read at 0x7ff9ff30
[ 1855.466188] Faulting instruction address: 0xc00111d4
[ 1855.471099] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1855.476428] BE PAGE_SIZE=16K PREEMPT CMPC885
[ 1855.480702] SAF3000 DIE NOTIFICATION
[ 1855.484184] CPU: 0 PID: 362 Comm: sigreturn_vdso Not tainted 5.12.0-rc4-s3k-dev-01553-gc31141d460e6 #4811
[ 1855.493644] NIP: c00111d4 LR: c0005a28 CTR: 00000000
[ 1855.498634] REGS: cadb3dd0 TRAP: 0300 Not tainted (5.12.0-rc4-s3k-dev-01553-gc31141d460e6)
[ 1855.507068] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48000884 XER: 20000000
[ 1855.513866] DAR: 7ff9ff30 DSISR: 88000000
[ 1855.513866] GPR00: c0007788 cadb3e90 c28dc000 7ff9ff30 7ff9ff40 000004e0 7ff9fd50 00000000
[ 1855.513866] GPR08: 00000001 00000001 7ff9ff30 00000000 28000282 1001b7e8 100a0920 00000000
[ 1855.513866] GPR16: 100cac0c 100b0000 102883a4 10289685 100d0000 100d0000 100d0000 100b2e9e
[ 1855.513866] GPR24: ffffffff 102883c8 00000000 7ff9ff38 cadb3f40 cadb3ec8 c28dc000 00000000
[ 1855.552767] NIP [c00111d4] flush_icache_range+0x90/0xb4
[ 1855.557932] LR [c0005a28] handle_signal32+0x1bc/0x1c4
[ 1855.562925] Call Trace:
[ 1855.565332] [cadb3e90] [100d0000] 0x100d0000 (unreliable)
[ 1855.570666] [cadb3ec0] [c0007788] do_notify_resume+0x260/0x314
[ 1855.576432] [cadb3f20] [c000c764] syscall_exit_prepare+0x120/0x184
[ 1855.582542] [cadb3f30] [c00100b4] ret_from_syscall+0xc/0x28
[ 1855.588050] --- interrupt: c00 at 0xfe807f8
[ 1855.592183] NIP: 0fe807f8 LR: 10001048 CTR: c0139378
[ 1855.597174] REGS: cadb3f40 TRAP: 0c00 Not tainted (5.12.0-rc4-s3k-dev-01553-gc31141d460e6)
[ 1855.605607] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 28000282 XER: 20000000
[ 1855.612664]
[ 1855.612664] GPR00: 00000025 7ffa0230 77c09690 00000000 0000000a 28000282 00000001 0ff03a38
[ 1855.612664] GPR08: 0000d032 00000328 c28dc000 00000009 88000282 1001b7e8 100a0920 00000000
[ 1855.612664] GPR16: 100cac0c 100b0000 102883a4 10289685 100d0000 100d0000 100d0000 100b2e9e
[ 1855.612664] GPR24: ffffffff 102883c8 00000000 77bff628 10002358 10010000 1000210c 00008000
[ 1855.648894] NIP [0fe807f8] 0xfe807f8
[ 1855.652426] LR [10001048] 0x10001048
[ 1855.655954] --- interrupt: c00
[ 1855.658969] Instruction dump:
[ 1855.661893] 38630010 7c001fac 38630010 4200fff0 7c0004ac 4c00012c 4e800020 7c001fac
[ 1855.669811] 2c0a0000 38630010 4082ffcc 4bffffe4 <7c00186c> 2c070000 39430010 4082ff8c
[ 1855.677910] ---[ end trace f071a5587092b3aa ]---
[ 1855.682462]
Remapped the stack executable
!! child died by signal 11
failure: sigreturn_vdso
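
As a side note to the AT_SYSINFO_EHDR suggestion quoted above, here is a minimal sketch of how a user-space tool could locate the vDSO ELF header through the auxiliary vector instead of assuming it sits at the start of the [vdso] mapping. The helper names are hypothetical and this is not CRIU's code; it only assumes glibc's getauxval() and the usual /proc/self/maps format.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>

/* Hypothetical helper: address the kernel advertises for the vDSO
 * ELF header via the auxiliary vector, or 0 if there is none. */
static unsigned long vdso_ehdr_from_auxv(void)
{
	return getauxval(AT_SYSINFO_EHDR);
}

/* Hypothetical helper: 1 if the bytes at addr begin with "\177ELF". */
static int has_elf_magic(unsigned long addr)
{
	return addr != 0 && memcmp((const void *)addr, ELFMAG, SELFMAG) == 0;
}

/* Hypothetical helper: start address of the "[vdso]" mapping as listed
 * in /proc/self/maps, or 0 if it cannot be found. */
static unsigned long vdso_start_from_maps(void)
{
	FILE *f = fopen("/proc/self/maps", "r");
	char line[512];
	unsigned long start = 0, end;

	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f)) {
		if (strstr(line, "[vdso]") &&
		    sscanf(line, "%lx-%lx", &start, &end) == 2)
			break;
		start = 0;	/* not the vDSO line, keep looking */
	}
	fclose(f);
	return start;
}
```

Comparing vdso_start_from_maps() with vdso_ehdr_from_auxv() shows whether the header is at the start of the mapping. On x86 the two are expected to match; on a kernel where the data page comes first in the same mapping, they would be expected to differ, and only the auxv address should pass has_elf_magic().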