[PATCH v5 01/10] capabilities: introduce CAP_PERFMON to kernel and user space
Alexey Budankov
alexey.budankov at linux.intel.com
Thu Feb 13 20:05:24 AEDT 2020
On 12.02.2020 20:09, Stephen Smalley wrote:
> On 2/12/20 11:56 AM, Alexey Budankov wrote:
>>
>>
>> On 12.02.2020 18:45, Stephen Smalley wrote:
>>> On 2/12/20 10:21 AM, Stephen Smalley wrote:
>>>> On 2/12/20 8:53 AM, Alexey Budankov wrote:
>>>>> On 12.02.2020 16:32, Stephen Smalley wrote:
>>>>>> On 2/12/20 3:53 AM, Alexey Budankov wrote:
>>>>>>> Hi Stephen,
>>>>>>>
>>>>>>> On 22.01.2020 17:07, Stephen Smalley wrote:
>>>>>>>> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>>>>>>>>>
>>>>>>>>> On 21.01.2020 21:27, Alexey Budankov wrote:
>>>>>>>>>>
>>>>>>>>>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>>>>>>>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>>>>>>>>>>> <alexey.budankov at linux.intel.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>>>>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>>>>>>>>>>
>>>>>>> <SNIP>
>>>>>>>>>>>>>> Introduce CAP_PERFMON capability designed to secure system performance
>>>>>>>>>>>>>
>>>>>>>>>>>>> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.
>>>>>>>>>
>>>>>>>>> So far so good, I suggest using the simplest version for v6:
>>>>>>>>>
>>>>>>>>> static inline bool perfmon_capable(void)
>>>>>>>>> {
>>>>>>>>> return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> It keeps the implementation simple and readable. The implementation is more
>>>>>>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>>>>>>>> privileged process.
>>>>>>>>>
>>>>>>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>>>>>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>>>>>>>> based approach to use perf_event_open system call.
>>>>>>>>
>>>>>>>> I can live with that. We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.
>>>>>>>
>>>>>>> I am trying to reproduce this double logging with CAP_PERFMON.
>>>>>>> I am using the refpolicy version with enabled perf_event tclass [1], in permissive mode.
>>>>>>> When running perf stat -a I am observing this AVC audit messages:
>>>>>>>
>>>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { open } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
>>>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { kernel } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
>>>>>>> type=AVC msg=audit(1581496695.666:8691): avc: denied { cpu } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
>>>>>>> type=AVC msg=audit(1581496695.666:8692): avc: denied { write } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
>>>>>>>
>>>>>>> However there is no capability related messages around. I suppose my refpolicy should
>>>>>>> be modified somehow to observe capability related AVCs.
>>>>>>>
>>>>>>> Could you please comment or clarify on how to enable caps related AVCs in order
>>>>>>> to test the concerned logging.
>>>>>>
>>>>>> The new perfmon permission has to be defined in your policy; you'll have a message in dmesg about "Permission perfmon in class capability2 not defined in policy.". You can either add it to the common cap2 definition in refpolicy/policy/flask/access_vectors and rebuild your policy or extract your base module as CIL, add it there, and insert the updated module.
>>>>>
>>>>> Yes, I already have it like this:
>>>>> common cap2
>>>>> {
>>>>> <------>mac_override<--># unused by SELinux
>>>>> <------>mac_admin
>>>>> <------>syslog
>>>>> <------>wake_alarm
>>>>> <------>block_suspend
>>>>> <------>audit_read
>>>>> <------>perfmon
>>>>> }
>>>>>
>>>>> dmesg stopped reporting perfmon as not defined but audit.log still doesn't report CAP_PERFMON denials.
>>>>> BTW, audit even doesn't report CAP_SYS_ADMIN denials, however perfmon_capable() does check for it.
>>>>
>>>> Some denials may be silenced by dontaudit rules; semodule -DB will strip those and semodule -B will restore them. Other possibility is that the process doesn't have CAP_PERFMON in its effective set and therefore never reaches SELinux at all; denied first by the capability module.
>>>
>>> Also, the fact that your denials are showing up in user_systemd_t suggests that something is off in your policy or userspace/distro; I assume that is a domain type for the systemd --user instance, but your shell and commands shouldn't be running in that domain (user_t would be more appropriate for that).
>>
>> It is user_t for local terminal session:
>> ps -Z
>> LABEL PID TTY TIME CMD
>> user_u:user_r:user_t 11317 pts/9 00:00:00 bash
>> user_u:user_r:user_t 11796 pts/9 00:00:00 ps
>>
>> For local terminal root session:
>> ps -Z
>> LABEL PID TTY TIME CMD
>> user_u:user_r:user_su_t 2926 pts/3 00:00:00 bash
>> user_u:user_r:user_su_t 10995 pts/3 00:00:00 ps
>>
>> For remote ssh session:
>> ps -Z
>> LABEL PID TTY TIME CMD
>> user_u:user_r:user_t 7540 pts/8 00:00:00 ps
>> user_u:user_r:user_systemd_t 8875 pts/8 00:00:00 bash
>
> That's a bug in either your policy or your userspace/distro integration. In any event, unless user_systemd_t is allowed all capability2 permissions by your policy, you should see the denials if CAP_PERFMON is set in the effective capability set of the process.
>
That all seems to be true. After instrumentation, rebuilding and rebooting, in CAP_PERFMON case:
$ getcap perf
perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
$ perf stat -a
type=AVC msg=audit(1581580399.165:784): avc: denied { open } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:785): avc: denied { perfmon } for pid=8859 comm="perf" capability=38 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit(1581580399.165:786): avc: denied { kernel } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:787): avc: denied { cpu } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:788): avc: denied { write } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580408.078:791): avc: denied { read } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
dmesg:
[ 137.877713] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877774] cread_has_capability(CAP_PERFMON) = 0
[ 137.877775] prior avc_audit(CAP_PERFMON)
[ 137.877779] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
[ 137.877784] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877785] cread_has_capability(CAP_PERFMON) = 0
[ 137.877786] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
[ 137.877794] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877795] cread_has_capability(CAP_PERFMON) = 0
[ 137.877796] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
...
in CAP_SYS_ADMIN case:
$ getcap perf
perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep
$ perf stat -a
type=AVC msg=audit(1581580747.928:835): avc: denied { open } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:836): avc: denied { cpu } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:837): avc: denied { kernel } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:838): avc: denied { read } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:839): avc: denied { write } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
...
$ perf record -- ls
...
type=AVC msg=audit(1581580747.930:843): avc: denied { sys_ptrace } for pid=8927 comm="perf" capability=19 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability permissive=1
...
dmesg:
[ 276.714266] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714268] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714269] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714270] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714270] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
[ 276.714287] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714287] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714288] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714288] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714289] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
[ 276.714294] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714295] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714295] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714296] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714296] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
...
in unprivileged case:
$ getcap perf
perf =
$ perf stat -a; perf record -a
...
dmesg:
[ 947.275611] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 947.275613] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 947.275614] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 947.275615] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = -1
[ 947.275636] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 947.275637] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 947.275638] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 947.275638] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = -1
...
So it looks like CAP_PERFMON and CAP_SYS_ADMIN are not ever logged by AVC simultaneously,
in the current LSM and perfmon_capable() implementations.
If perfmon is granted:
perfmon is not logged by capabilities, perfmon is logged by AVC,
no check for sys_admin by perfmon_capable().
If perfmon is not granted but sys_admin is granted:
perfmon is not logged by capabilities, AVC logging is not called for perfmon,
sys_admin is not logged by capabilities, sys_admin is not logged by AVC, for some intended reason?
No caps are granted:
AVC logging is not called either for perfmon or for sys_admin.
BTW, is there a way to may be drop some AV cache so denials would appear in audit in the next AV access?
Well, I guess you have initially mentioned some case similar to this (note that ids are not the same but pids= are):
type=AVC msg=audit(1581580399.165:784): avc: denied { open } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:785): avc: denied { perfmon } for pid=8859 comm="perf" capability=38 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit( . : ): avc: denied { sys_admin } for pid=8859 comm="perf" capability=21 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit(1581580399.165:786): avc: denied { kernel } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:787): avc: denied { cpu } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:788): avc: denied { write } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580408.078:791): avc: denied { read } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
So the message could be like this:
"If audit logs for a process using perf_events related syscalls i.e. perf_event_open(), read(), write(),
ioctl(), mmap() contain denials both for CAP_PERFMON and CAP_SYS_ADMIN capabilities then providing the
process with CAP_PERFMON capability singly is the secure preferred approach to resolve access denials
to performance monitoring and observability operations."
~Alexey
More information about the Linuxppc-dev
mailing list