[PATCH v2 08/12] cachefiles: never get a new anonymous fd if ondemand_id is valid

Mon May 20 19:24:00 AEST 2024

On 5/20/24 5:07 PM, Baokun Li wrote:
> On 2024/5/20 16:43, Jingbo Xu wrote:
>>
>> On 5/15/24 4:45 PM, libaokun at huaweicloud.com wrote:
>>> From: Baokun Li <libaokun1 at huawei.com>
>>>
>>> Now every time the daemon reads an open request, it gets a new
>>> anonymous fd
>>> and ondemand_id. With the introduction of "restore", it is possible
>>> to read
>>> the same open request more than once, and therefore an object can
>>> have more
>>> than one anonymous fd.
>>>
>>> If the anonymous fd is not unique, the following concurrencies will
>>> result
>>> in an fd leak:
>>>
>>>       t1     |         t2         |          t3
>>> ------------------------------------------------------------
>>>   cachefiles_ondemand_init_object
>>>    cachefiles_ondemand_send_req
>>>     REQ_A = kzalloc(sizeof(*req) + data_len)
>>>     wait_for_completion(&REQ_A->done)
>>>              cachefiles_daemon_read
>>>               cachefiles_ondemand_daemon_read
>>>                REQ_A = cachefiles_ondemand_select_req
>>>                cachefiles_ondemand_get_fd
>>>                  load->fd = fd0
>>>                  ondemand_id = object_id0
>>>                                    ------ restore ------
>>>                                    cachefiles_ondemand_restore
>>>                                     // restore REQ_A
>>>                                    cachefiles_daemon_read
>>>                                     cachefiles_ondemand_daemon_read
>>>                                      REQ_A =
>>> cachefiles_ondemand_select_req
>>>                                        cachefiles_ondemand_get_fd
>>>                                          load->fd = fd1
>>>                                          ondemand_id = object_id1
>>>               process_open_req(REQ_A)
>>>               write(devfd, ("copen %u,%llu", msg->msg_id, size))
>>>               cachefiles_ondemand_copen
>>>                xa_erase(&cache->reqs, id)
>>>                complete(&REQ_A->done)
>>>     kfree(REQ_A)
>>>                                    process_open_req(REQ_A)
>>>                                    // copen fails due to no req
>>>                                    // daemon close(fd1)
>>>                                    cachefiles_ondemand_fd_release
>>>                                     // set object closed
>>>   -- umount --
>>>   cachefiles_withdraw_cookie
>>>    cachefiles_ondemand_clean_object
>>>     cachefiles_ondemand_init_close_req
>>>      if (!cachefiles_ondemand_object_is_open(object))
>>>        return -ENOENT;
>>>      // The fd0 is not closed until the daemon exits.
>>>
>>> However, the anonymous fd holds the reference count of the object and
>>> the
>>> object holds the reference count of the cookie. So even though the
>>> cookie
>>> has been relinquished, it will not be unhashed and freed until the
>>> daemon
>>> exits.
>>>
>>> In fscache_hash_cookie(), when the same cookie is found in the hash
>>> list,
>>> if the cookie is set with the FSCACHE_COOKIE_RELINQUISHED bit, then
>>> the new
>>> cookie waits for the old cookie to be unhashed, while the old cookie is
>>> waiting for the leaked fd to be closed, if the daemon does not exit
>>> in time
>>> it will trigger a hung task.
>>>
>>> To avoid this, allocate a new anonymous fd only if no anonymous fd has
>>> been allocated (ondemand_id == 0) or if the previously allocated
>>> anonymous
>>> fd has been closed (ondemand_id == -1). Moreover, returns an error if
>>> ondemand_id is valid, letting the daemon know that the current userland
>>> restore logic is abnormal and needs to be checked.
>>>
>>> Fixes: c8383054506c ("cachefiles: notify the user daemon when looking
>>> up cookie")
>>> Signed-off-by: Baokun Li <libaokun1 at huawei.com>
>> The LOCs of this fix is quite under control.  But still it seems that
>> the worst consequence is that the (potential) malicious daemon gets
>> hung.  No more effect to the system or other processes.  Or does a
>> non-malicious daemon have any chance having the same issue?
> If we enable hung_task_panic, it may cause panic to crash the server.

Then this issue has nothing to do with this patch?  As long as a
malicious daemon doesn't close the anonymous fd after umounting, then I
guess a following attempt of mounting cookie with the same name will
also wait and hung there?

-- 
Thanks,
Jingbo