[PATCH v2 1/2] erofs: update on-disk format for xattr name filter

Gao Xiang hsiangkao at linux.alibaba.com
Wed Jul 5 17:51:30 AEST 2023



On 2023/7/5 15:43, Alexander Larsson wrote:
> On Wed, Jul 5, 2023 at 9:25 AM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>>
>>
>>
>> On 2023/7/5 15:04, Jingbo Xu wrote:
>>> The xattr name bloom filter feature is going to be introduced to speed
>>> up negative xattr lookups, e.g. system.posix_acl_[access|default]
>>> lookups when running an "ls -lR" workload.
>>>
>>> The number of common used extended attributes (n) is approximately 30.
>>
>> There are some commonly used extended attributes (n) and the total number
>> of these is 30:
>>
>>>
>>>        trusted.overlay.opaque
>>>        trusted.overlay.redirect
>>>        trusted.overlay.origin
>>>        trusted.overlay.impure
>>>        trusted.overlay.nlink
>>>        trusted.overlay.upper
>>>        trusted.overlay.metacopy
>>>        trusted.overlay.protattr
>>>        user.overlay.opaque
>>>        user.overlay.redirect
>>>        user.overlay.origin
>>>        user.overlay.impure
>>>        user.overlay.nlink
>>>        user.overlay.upper
>>>        user.overlay.metacopy
>>>        user.overlay.protattr
>>>        security.evm
>>>        security.ima
>>>        security.selinux
>>>        security.SMACK64
>>>        security.SMACK64IPIN
>>>        security.SMACK64IPOUT
>>>        security.SMACK64EXEC
>>>        security.SMACK64TRANSMUTE
>>>        security.SMACK64MMAP
>>>        security.apparmor
>>>        security.capability
>>>        system.posix_acl_access
>>>        system.posix_acl_default
>>>        user.mime_type
>>>
>>> Given the number of bits of the bloom filter (m) is 32, the optimal
>>> value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).
>>>
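
The arithmetic above can be checked with a minimal sketch. The helper names are hypothetical (not from the patch), and ln 2 is hard-coded so that nothing from libm is needed:

```c
#include <assert.h>

/* ln(2), hard-coded so the sketch does not depend on libm. */
#define BLOOM_LN2 0.6931471805599453

/* Theoretical optimum k = ln2 * m / n for m filter bits and n names. */
static double bloom_optimal_k(unsigned int m_bits, unsigned int n_names)
{
	assert(n_names > 0);
	return BLOOM_LN2 * (double)m_bits / (double)n_names;
}

/* k must be a positive integer: round to nearest, but never below 1. */
static unsigned int bloom_hash_count(unsigned int m_bits, unsigned int n_names)
{
	unsigned int k = (unsigned int)(bloom_optimal_k(m_bits, n_names) + 0.5);

	return k ? k : 1;
}
```

For m = 32 and n = 30 the optimum is about 0.74, which rounds to a single hash function.
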
>>> The single hash function is implemented as:
>>>
>>>        xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)
>>>
>>> where index represents the index of corresponding predefined short name
>>
>> where `index`...
>>
>>
>>
>>> prefix, while name represents the name string after stripping the above
>>> predefined name prefix.
>>>
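
A rough userspace sketch of this hash step, for anyone generating filters outside the kernel. It assumes that the bit number is taken from the low 5 bits of the hash and that the caller passes the numeric prefix index; neither detail is spelled out in the description above. xxh32() here is a plain reimplementation of the public XXH32 algorithm, and xattr_filter_bit() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EROFS_XATTR_FILTER_SEED 0x25BBE08FU /* from the patch description */
#define EROFS_XATTR_FILTER_BITS 32U         /* filter width (m) */

/* XXH32 primes */
#define XXH_P1 0x9E3779B1U
#define XXH_P2 0x85EBCA77U
#define XXH_P3 0xC2B2AE3DU
#define XXH_P4 0x27D4EB2FU
#define XXH_P5 0x165667B1U

static uint32_t rotl32(uint32_t x, int r)
{
	return (x << r) | (x >> (32 - r));
}

/* Read 4 bytes as a little-endian word, independent of host endianness. */
static uint32_t read32le(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t xxh32_round(uint32_t acc, uint32_t lane)
{
	return rotl32(acc + lane * XXH_P2, 13) * XXH_P1;
}

static uint32_t xxh32(const void *data, size_t len, uint32_t seed)
{
	const uint8_t *p = data;
	const uint8_t *end = p + len;
	uint32_t h;

	if (len >= 16) {
		/* Four accumulators consume 16-byte stripes. */
		uint32_t v1 = seed + XXH_P1 + XXH_P2, v2 = seed + XXH_P2;
		uint32_t v3 = seed, v4 = seed - XXH_P1;

		do {
			v1 = xxh32_round(v1, read32le(p));
			v2 = xxh32_round(v2, read32le(p + 4));
			v3 = xxh32_round(v3, read32le(p + 8));
			v4 = xxh32_round(v4, read32le(p + 12));
			p += 16;
		} while (end - p >= 16);
		h = rotl32(v1, 1) + rotl32(v2, 7) + rotl32(v3, 12) + rotl32(v4, 18);
	} else {
		h = seed + XXH_P5;
	}
	h += (uint32_t)len;
	for (; end - p >= 4; p += 4)	/* remaining 4-byte words */
		h = rotl32(h + read32le(p) * XXH_P3, 17) * XXH_P4;
	for (; p < end; p++)		/* remaining single bytes */
		h = rotl32(h + *p * XXH_P5, 11) * XXH_P1;
	/* final avalanche */
	h ^= h >> 15;
	h *= XXH_P2;
	h ^= h >> 13;
	h *= XXH_P3;
	h ^= h >> 16;
	return h;
}

/* Assumption: the filter bit is the low 5 bits of the hash. */
static unsigned int xattr_filter_bit(unsigned int index, const char *name)
{
	assert(name != NULL);
	return xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index) &
	       (EROFS_XATTR_FILTER_BITS - 1);
}
```

If system.posix_acl_access is encoded as prefix index 2 with an empty remaining name (an assumption about the prefix table, which is not part of this patch), xattr_filter_bit(2, "") gives bit 30, which agrees with the mapping listed below.
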
>>> The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is
>>> used to give a better spread when mapping these 30 extended attributes
>>> into a 32-bit bloom filter as:
>>>
>>>        bit  0: security.ima
>>>        bit  1:
>>>        bit  2: trusted.overlay.nlink
>>>        bit  3:
>>>        bit  4: user.overlay.nlink
>>>        bit  5: trusted.overlay.upper
>>>        bit  6: user.overlay.origin
>>>        bit  7: trusted.overlay.protattr
>>>        bit  8: security.apparmor
>>>        bit  9: user.overlay.protattr
>>>        bit 10: user.overlay.opaque
>>>        bit 11: security.selinux
>>>        bit 12: security.SMACK64TRANSMUTE
>>>        bit 13: security.SMACK64
>>>        bit 14: security.SMACK64MMAP
>>>        bit 15: user.overlay.impure
>>>        bit 16: security.SMACK64IPIN
>>>        bit 17: trusted.overlay.redirect
>>>        bit 18: trusted.overlay.origin
>>>        bit 19: security.SMACK64IPOUT
>>>        bit 20: trusted.overlay.opaque
>>>        bit 21: system.posix_acl_default
>>>        bit 22:
>>>        bit 23: user.mime_type
>>>        bit 24: trusted.overlay.impure
>>>        bit 25: security.SMACK64EXEC
>>>        bit 26: user.overlay.redirect
>>>        bit 27: user.overlay.upper
>>>        bit 28: security.evm
>>>        bit 29: security.capability
>>>        bit 30: system.posix_acl_access
>>>        bit 31: trusted.overlay.metacopy, user.overlay.metacopy
>>>
>>> The h_name_filter field is introduced to the on-disk per-inode xattr
>>> header to place the corresponding xattr name filter, where bit value 1
>>> indicates non-existence for compatibility.
>>>
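
To make the inverted bit sense concrete, here is a minimal sketch with hypothetical helper names. A filter starts as all ones ("nothing present") and each xattr an inode carries clears its bit. A zeroed field, as an image written without the feature would presumably have, then reads as "maybe present" for every name, so lookups simply fall back to the normal path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: filter value for an inode with no xattrs at all. */
#define XATTR_FILTER_EMPTY UINT32_MAX

/* Image-build side: clear the bit of every xattr name the inode carries. */
static uint32_t xattr_filter_add(uint32_t filter, unsigned int bit)
{
	assert(bit < 32);
	return filter & ~(UINT32_C(1) << bit);
}

/* Lookup side: a set bit proves that no xattr hashing to it exists. */
static bool xattr_definitely_absent(uint32_t filter, unsigned int bit)
{
	assert(bit < 32);
	return (filter >> bit) & 1;
}
```

For example, an inode carrying only security.selinux (bit 11 in the table above) keeps bit 30 set, so a system.posix_acl_access lookup can return early without walking the xattr entries.
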
>>> This feature is indicated by EROFS_FEATURE_COMPAT_XATTR_FILTER
>>> compatible feature bit.
>>>
>>> Suggested-by: Alexander Larsson <alexl at redhat.com>
>>> Signed-off-by: Jingbo Xu <jefflexu at linux.alibaba.com>
>>> ---
>>>    fs/erofs/erofs_fs.h | 8 +++++++-
>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
>>> index 2c7b16e340fe..b4b6235fd720 100644
>>> --- a/fs/erofs/erofs_fs.h
>>> +++ b/fs/erofs/erofs_fs.h
>>> @@ -13,6 +13,7 @@
>>>
>>>    #define EROFS_FEATURE_COMPAT_SB_CHKSUM          0x00000001
>>>    #define EROFS_FEATURE_COMPAT_MTIME              0x00000002
>>> +#define EROFS_FEATURE_COMPAT_XATTR_FILTER    0x00000004
>>
>> I'd suggest leaving one reserved byte in the superblock for now
>> (and checking that it is 0), since
>>     1) the xattr filter feature is a compatible feature;
>>     2) I'm not sure whether the implementation might change,
>>
>> so that later implementation changes won't need to touch the compat
>> bits again.
> 
> I would very much like to generate these bloom filters in composefs
> right now, before the composefs v1 format is completely locked down,
> and this should be fully possible given that this is a backwards
> compat change. But this is only possible if it doesn't require a
> feature flag like this that makes old erofs versions not mount the
> image.

EROFS has two types of feature bits:

  1) compat flags, which don't block mounting on old kernels;
  2) incompat flags, which block mounting on old kernels.

Here the bloom filter uses a new compat flag, so old kernels will just
ignore it and mount.  A compat flag just indicates "this image has a
feature, and you may use it or not".

I just meant that the bloom filter internals are fixed for now, so we
might reserve a byte in the on-disk superblock for potential later
changes (if any).  Then we won't need to bother with another new
compat flag.
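
Roughly, as a sketch only: the struct and helper names below are hypothetical, and the reserved byte is just the one suggested above, not an existing on-disk field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EROFS_FEATURE_COMPAT_XATTR_FILTER 0x00000004U /* from the patch */

/* Hypothetical in-memory view of the superblock fields discussed here. */
struct sb_view {
	uint32_t feature_compat;
	uint32_t feature_incompat;
	uint8_t  xattr_filter_reserved; /* the suggested reserved byte */
};

/* Only unknown incompat bits block mounting; compat bits never do. */
static bool sb_can_mount(const struct sb_view *sb, uint32_t known_incompat)
{
	assert(sb != NULL);
	return (sb->feature_incompat & ~known_incompat) == 0;
}

/* Use the filter only if it is advertised and the reserved byte is 0. */
static bool sb_use_xattr_filter(const struct sb_view *sb)
{
	assert(sb != NULL);
	return (sb->feature_compat & EROFS_FEATURE_COMPAT_XATTR_FILTER) &&
	       sb->xattr_filter_reserved == 0;
}
```

With this shape, an image whose filter internals changed later would set the reserved byte to a non-zero value: it still mounts everywhere, and a kernel that only knows the current scheme just skips the filter.
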

Thanks,
Gao Xiang

