[PATCH v2 08/12] cachefiles: never get a new anonymous fd if ondemand_id is valid
Jingbo Xu
jefflexu at linux.alibaba.com
Mon May 20 19:24:00 AEST 2024
On 5/20/24 5:07 PM, Baokun Li wrote:
> On 2024/5/20 16:43, Jingbo Xu wrote:
>>
>> On 5/15/24 4:45 PM, libaokun at huaweicloud.com wrote:
>>> From: Baokun Li <libaokun1 at huawei.com>
>>>
>>> Now every time the daemon reads an open request, it gets a new
>>> anonymous fd
>>> and ondemand_id. With the introduction of "restore", it is possible
>>> to read
>>> the same open request more than once, and therefore an object can
>>> have more
>>> than one anonymous fd.
>>>
>>> If the anonymous fd is not unique, the following concurrencies will
>>> result
>>> in an fd leak:
>>>
>>> t1 | t2 | t3
>>> ------------------------------------------------------------
>>> cachefiles_ondemand_init_object
>>> cachefiles_ondemand_send_req
>>> REQ_A = kzalloc(sizeof(*req) + data_len)
>>> wait_for_completion(&REQ_A->done)
>>> cachefiles_daemon_read
>>> cachefiles_ondemand_daemon_read
>>> REQ_A = cachefiles_ondemand_select_req
>>> cachefiles_ondemand_get_fd
>>> load->fd = fd0
>>> ondemand_id = object_id0
>>> ------ restore ------
>>> cachefiles_ondemand_restore
>>> // restore REQ_A
>>> cachefiles_daemon_read
>>> cachefiles_ondemand_daemon_read
>>> REQ_A =
>>> cachefiles_ondemand_select_req
>>> cachefiles_ondemand_get_fd
>>> load->fd = fd1
>>> ondemand_id = object_id1
>>> process_open_req(REQ_A)
>>> write(devfd, ("copen %u,%llu", msg->msg_id, size))
>>> cachefiles_ondemand_copen
>>> xa_erase(&cache->reqs, id)
>>> complete(&REQ_A->done)
>>> kfree(REQ_A)
>>> process_open_req(REQ_A)
>>> // copen fails due to no req
>>> // daemon close(fd1)
>>> cachefiles_ondemand_fd_release
>>> // set object closed
>>> -- umount --
>>> cachefiles_withdraw_cookie
>>> cachefiles_ondemand_clean_object
>>> cachefiles_ondemand_init_close_req
>>> if (!cachefiles_ondemand_object_is_open(object))
>>> return -ENOENT;
>>> // The fd0 is not closed until the daemon exits.
>>>
>>> However, the anonymous fd holds the reference count of the object and
>>> the
>>> object holds the reference count of the cookie. So even though the
>>> cookie
>>> has been relinquished, it will not be unhashed and freed until the
>>> daemon
>>> exits.
>>>
>>> In fscache_hash_cookie(), when the same cookie is found in the hash
>>> list,
>>> if the cookie is set with the FSCACHE_COOKIE_RELINQUISHED bit, then
>>> the new
>>> cookie waits for the old cookie to be unhashed, while the old cookie is
>>> waiting for the leaked fd to be closed, if the daemon does not exit
>>> in time
>>> it will trigger a hung task.
>>>
>>> To avoid this, allocate a new anonymous fd only if no anonymous fd has
>>> been allocated (ondemand_id == 0) or if the previously allocated
>>> anonymous
>>> fd has been closed (ondemand_id == -1). Moreover, returns an error if
>>> ondemand_id is valid, letting the daemon know that the current userland
>>> restore logic is abnormal and needs to be checked.
>>>
>>> Fixes: c8383054506c ("cachefiles: notify the user daemon when looking
>>> up cookie")
>>> Signed-off-by: Baokun Li <libaokun1 at huawei.com>
>> The LOCs of this fix is quite under control. But still it seems that
>> the worst consequence is that the (potential) malicious daemon gets
>> hung. No more effect to the system or other processes. Or does a
>> non-malicious daemon have any chance having the same issue?
> If we enable hung_task_panic, it may cause panic to crash the server.
Then this issue has nothing to do with this patch? As long as a
malicious daemon doesn't close the anonymous fd after umounting, then I
guess a following attempt of mounting cookie with the same name will
also wait and hung there?
--
Thanks,
Jingbo
More information about the Linux-erofs
mailing list