[PATCH v2] erofs: fix incorrect symlink detection in fast symlink

Gao Xiang hsiangkao at linux.alibaba.com
Tue Sep 10 12:18:25 AEST 2024



On 2024/9/10 08:12, Colin Walters wrote:
> 
> 
> On Mon, Sep 9, 2024, at 11:40 AM, Gao Xiang wrote:
>>
>> Just my personal opinion, my understanding of rubustness
>> is stability and security.
>>
>> But whether to check or not check this, it doesn't crash
>> the kernel or deadlock or livelock, so IMHO, it's already
>> rubustness.
> 
> OK, if you're telling me this is already checked elsewhere then I think we can call this end of thread.

I know you ask for an explicit check on symlink i_size, but
I've explained the current kernel behavior:
   - For symlink i_size < PAGE_SIZE (always >= 4096 on Linux),
     it behaves normally for EROFS Linux implementation;

   - For symlink i_size >= PAGE_SIZE, EROFS Linux
     implementation will mark '\0' at PAGE_SIZE - 1 in
     page_get_link() -> nd_terminate_link() so the behavior is also
     deterministic and not harmful to the system stability and security;

In order to verify this, you could also check the EROFS image
(encoded in gzip+base64) with a 16000-byte symlink file below:

H4sICPqj32YCA3Rlc3QuZXJvZnMA7dsvSwNhHMDx350IBotgt0wMwsnegDCYb8FiVbu4LAom
kygbgwURfR1Wm12Lf5JFTIpY9B58GCK2BYV9vvDhee64226sXPlFSBrXHu5f7k4u+isT9X46
GlHm8+9nt5tpHaxOxeS364+Wjh8PXtvP3bn9/nXvpjvq96fPfmpFLObjj7q07lzOxnq9Hubn
eLvaapa/3N8YruVwP59+Rb54IYqoqqqzsd3xZ0uSJGnsSy/GAAAAAAAAAAAAAADA/5bmb89P
I3aXv+YBiuzn/O3azF6zMD8AAADAHzHBP1qf7k2HRABQAAA=

Here the symlink is already recorded in the image (let's not think if
the symlink is reasonable or not for now), but Linux implementation will
just truncate it as a 4095-byte link path, that is the current release
behavior and it has no security issue inside.

In other words, currently i_size >= PAGE_SIZE is an undefined behavior
but Linux just truncates the link path.

So I understand that what you propose here is to check the size
explicitly, which means we need to introduce a formal on-disk hard
limitation, for example:

   - Introduce EROFS_SYMLINK_MAXLEN as 4095;

   - For symlink i_size < EROFS_SYMLINK_MAXLEN, it behaves normally;

   - For symlink i_size >= EROFS_SYMLINK_MAXLEN, it just return
     -EFSCORRUPTED to refuse such inodes;

So that is an added limitation (just like a "don't care" bit into
a "meaningful" bit).

For this case, to be clear I'm totally fine with the limitation,
but I need to decide whether I should make "EROFS_SYMLINK_MAXLEN"
as 4095 or "EROFS_SYMLINK_MAXLEN" as 4096 but also accepts
`link[4095] == '\0'`.

No matter which is finalled selected, it's a new hard on-disk
limitation, which means we cannot change the limitation anymore
(-EFSCORRUPTED) unless some strong reason as a bugfix.

> 
>> Actually, I think EROFS for i_size > PAGE_SIZE, it's an
>> undefined or reserved behavior for now (just like CPU
>> reserved bits or don't care bits), just Linux
>> implementation treats it with PAGE_SIZE-1 trailing '\0',
>> but using erofs dump tool you could still dump large
>> symlinks.
>>
>> Since PATH_MAX is a system-defined constant too, currently
>> Linux PATH_MAX is 4096, but how about other OSes? I've
>> seen some `PATH_MAX 8192` reference but I'm not sure which
>> OS uses this setting.
> 
> Famously GNU Hurd tried to avoid having a PATH_MAX at all:
> https://www.gnu.org/software/hurd/community/gsoc/project_ideas/maxpath.html
> 
> But I am pretty confident in saying that Linux isn't going to try to or really be able to meaningfully change its PATH_MAX in the forseeable future.
> 
> Now we're firmly off into the weeds but since it's kind of an interesting debate: Honestly: who would *want* to ever interact with such long file paths? And I think a much better evolutionary direction is already in flight - operate in terms of directory fds with openat() etc. While it'd be logistically complicated one could imagine a variant of a Unix filesystem which hard required using openat() to access sub-domains where the paths in that sub-domain are limited to the existing PATH_MAX. I guess that kind of already exists in subvolumes/snapshots, and we're effectively enabling this with composefs too.

Yes, but that is just off-topic.  I just need to confirm
that `PATH_MAX >= 4096 is absolutely nonsense for all OSes
on earth`.

If there is a path larger than 4096 in some OS, we maybe
(just maybe) run into an awkward situation:  EROFS can have
some limitation to be used as an _archive format_ on this
OS.

Similar to EXT4->XFS, if XFS is used as an _archive format_
there could be a possiability that a EXT4 symlink cannot be
represented correctly according to its on-disk format.  So
as an _archive format_, XFS is more limited now (Please
don't misunderstand, I'm not saying XFS is not a great
filesystem. I like XFS.)

Thanks,
Gao Xiang

> 
>> But I think it's a filesystem on-disk limitation, but if
>> i_size exceeds that, we return -EOPNOTSUPP or -EFSCORRUPTED?
>> For this symlink case, I tend to return -EFSCORRUPTED but
>> for other similar but complex cases, it could be hard to
>> decide.
> 
> -EFSCORRUPTED is what XFS does at least (in xfs_inactive_symlink()).






More information about the Linux-erofs mailing list