[PATCH v2 1/2] erofs: update on-disk format for xattr name filter

Gao Xiang hsiangkao at linux.alibaba.com
Wed Jul 5 18:13:54 AEST 2023



On 2023/7/5 16:12, Alexander Larsson wrote:
> On Wed, Jul 5, 2023 at 9:51 AM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>>
>>
>>
>> On 2023/7/5 15:43, Alexander Larsson wrote:
>>> On Wed, Jul 5, 2023 at 9:25 AM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2023/7/5 15:04, Jingbo Xu wrote:
>>>>> The xattr name bloom filter feature is going to be introduced to speed
>>>>> up negative xattr lookups, e.g. system.posix_acl_[access|default]
>>>>> lookups when running an "ls -lR" workload.
>>>>>
>>>>> The number of commonly used extended attributes (n) is approximately 30.
>>>>
>>>> There are some commonly used extended attributes (n) and the total number
>>>> of these is 30:
>>>>
>>>>>
>>>>>         trusted.overlay.opaque
>>>>>         trusted.overlay.redirect
>>>>>         trusted.overlay.origin
>>>>>         trusted.overlay.impure
>>>>>         trusted.overlay.nlink
>>>>>         trusted.overlay.upper
>>>>>         trusted.overlay.metacopy
>>>>>         trusted.overlay.protattr
>>>>>         user.overlay.opaque
>>>>>         user.overlay.redirect
>>>>>         user.overlay.origin
>>>>>         user.overlay.impure
>>>>>         user.overlay.nlink
>>>>>         user.overlay.upper
>>>>>         user.overlay.metacopy
>>>>>         user.overlay.protattr
>>>>>         security.evm
>>>>>         security.ima
>>>>>         security.selinux
>>>>>         security.SMACK64
>>>>>         security.SMACK64IPIN
>>>>>         security.SMACK64IPOUT
>>>>>         security.SMACK64EXEC
>>>>>         security.SMACK64TRANSMUTE
>>>>>         security.SMACK64MMAP
>>>>>         security.apparmor
>>>>>         security.capability
>>>>>         system.posix_acl_access
>>>>>         system.posix_acl_default
>>>>>         user.mime_type
>>>>>
>>>>> Given the number of bits of the bloom filter (m) is 32, the optimal
>>>>> value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).
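As a quick check of the arithmetic quoted above, here is the standard Bloom filter estimate written out. It assumes uniform, independent hashing, which is the usual textbook model rather than anything stated in the patch:

```latex
k_{\mathrm{opt}} = \frac{m}{n}\,\ln 2
                 = \frac{32}{30}\times 0.693
                 \approx 0.74
```

Since k has to be a positive integer, the nearest usable value is k = 1, i.e. a single hash function.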
>>>>>
>>>>> The single hash function is implemented as:
>>>>>
>>>>>         xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)
>>>>>
>>>>> where index represents the index of corresponding predefined short name
>>>>
>>>> where `index`...
>>>>
>>>>
>>>>
>>>>> prefix, while name represents the name string after stripping the above
>>>>> predefined name prefix.
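A minimal userspace sketch of the hash step described above, for illustration only. The seed value comes from the commit message. Everything else is an assumption: the helper names (`xxh32_ref`, `xattr_filter_bit`) are hypothetical, `xxh32_ref` is a plain reference XXH32 rather than the kernel's own `xxh32()`, and reducing the hash modulo 32 to pick a bit is my guess, since the message does not spell out that step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EROFS_XATTR_FILTER_SEED 0x25BBE08FU /* from the commit message */
#define XATTR_FILTER_BITS       32U         /* m in the commit message */

static uint32_t rotl32(uint32_t x, int r)
{
	return (x << r) | (x >> (32 - r));
}

/* Little-endian 32-bit load, independent of host byte order. */
static uint32_t read32le(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Reference XXH32 (hypothetical helper name, not the kernel function). */
static uint32_t xxh32_ref(const void *data, size_t len, uint32_t seed)
{
	const uint32_t P1 = 0x9E3779B1U, P2 = 0x85EBCA77U, P3 = 0xC2B2AE3DU,
		       P4 = 0x27D4EB2FU, P5 = 0x165667B1U;
	const uint8_t *p = data, *end = p + len;
	uint32_t h;

	if (len >= 16) {
		/* Four accumulators, each consuming 4 bytes per 16-byte stripe. */
		uint32_t v[4] = { seed + P1 + P2, seed + P2, seed, seed - P1 };

		do {
			for (int i = 0; i < 4; i++, p += 4)
				v[i] = rotl32(v[i] + read32le(p) * P2, 13) * P1;
		} while (end - p >= 16);
		h = rotl32(v[0], 1) + rotl32(v[1], 7) +
		    rotl32(v[2], 12) + rotl32(v[3], 18);
	} else {
		h = seed + P5;
	}
	h += (uint32_t)len;
	for (; end - p >= 4; p += 4)	/* remaining 4-byte words */
		h = rotl32(h + read32le(p) * P3, 17) * P4;
	for (; p < end; p++)		/* remaining single bytes */
		h = rotl32(h + *p * P5, 11) * P1;
	h ^= h >> 15; h *= P2;		/* final avalanche */
	h ^= h >> 13; h *= P3;
	h ^= h >> 16;
	return h;
}

/*
 * index: index of the predefined short name prefix.
 * name:  the name with that prefix stripped.
 * ASSUMPTION: the filter bit is the hash reduced modulo the filter width.
 */
static unsigned int xattr_filter_bit(unsigned int index, const char *name)
{
	assert(name != NULL);
	return xxh32_ref(name, strlen(name),
			 EROFS_XATTR_FILTER_SEED + index) % XATTR_FILTER_BITS;
}
```

A caller would pass the prefix index and the stripped name, e.g. `xattr_filter_bit(idx, "selinux")`, where `idx` is whatever index the format assigns to the "security." prefix.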
>>>>>
>>>>> The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is
>>>>> used to give a better spread when mapping these 30 extended attributes
>>>>> into a 32-bit bloom filter as:
>>>>>
>>>>>         bit  0: security.ima
>>>>>         bit  1:
>>>>>         bit  2: trusted.overlay.nlink
>>>>>         bit  3:
>>>>>         bit  4: user.overlay.nlink
>>>>>         bit  5: trusted.overlay.upper
>>>>>         bit  6: user.overlay.origin
>>>>>         bit  7: trusted.overlay.protattr
>>>>>         bit  8: security.apparmor
>>>>>         bit  9: user.overlay.protattr
>>>>>         bit 10: user.overlay.opaque
>>>>>         bit 11: security.selinux
>>>>>         bit 12: security.SMACK64TRANSMUTE
>>>>>         bit 13: security.SMACK64
>>>>>         bit 14: security.SMACK64MMAP
>>>>>         bit 15: user.overlay.impure
>>>>>         bit 16: security.SMACK64IPIN
>>>>>         bit 17: trusted.overlay.redirect
>>>>>         bit 18: trusted.overlay.origin
>>>>>         bit 19: security.SMACK64IPOUT
>>>>>         bit 20: trusted.overlay.opaque
>>>>>         bit 21: system.posix_acl_default
>>>>>         bit 22:
>>>>>         bit 23: user.mime_type
>>>>>         bit 24: trusted.overlay.impure
>>>>>         bit 25: security.SMACK64EXEC
>>>>>         bit 26: user.overlay.redirect
>>>>>         bit 27: user.overlay.upper
>>>>>         bit 28: security.evm
>>>>>         bit 29: security.capability
>>>>>         bit 30: system.posix_acl_access
>>>>>         bit 31: trusted.overlay.metacopy, user.overlay.metacopy
>>>>>
>>>>> The h_name_filter field is introduced to the on-disk per-inode xattr
>>>>> header to place the corresponding xattr name filter, where bit value 1
>>>>> indicates non-existence for compatibility.
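To make the inverted bit sense concrete, here is a small sketch of how such a filter could be built and queried. The function names are hypothetical and this is not the actual patch code. My reading of "for compatibility" is that an image written without the feature leaves the field as zero, so every lookup then falls back to "may exist"; treat that as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XATTR_FILTER_BITS 32U

/*
 * Build a filter value from the bloom bit positions of the xattrs that are
 * present on an inode. A set bit means "no xattr hashing to this bit exists",
 * so start from all ones and clear one bit per present xattr.
 */
static uint32_t xattr_filter_build(const unsigned int *bits, unsigned int n)
{
	uint32_t filter = UINT32_MAX;

	for (unsigned int i = 0; i < n; i++) {
		assert(bits[i] < XATTR_FILTER_BITS);
		filter &= ~(UINT32_C(1) << bits[i]);
	}
	return filter;
}

/*
 * true:  the xattr is definitely absent, the full lookup can be skipped.
 * false: the xattr may exist, fall back to the normal xattr lookup.
 */
static bool xattr_definitely_absent(uint32_t filter, unsigned int bit)
{
	assert(bit < XATTR_FILTER_BITS);
	return (filter & (UINT32_C(1) << bit)) != 0;
}
```

With the table above, an inode carrying only security.selinux (bit 11) and security.capability (bit 29) would let a system.posix_acl_access lookup (bit 30) return early, while an all-zero field never short-circuits anything.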
>>>>>
>>>>> This feature is indicated by the EROFS_FEATURE_COMPAT_XATTR_FILTER
>>>>> compatible feature bit.
>>>>>
>>>>> Suggested-by: Alexander Larsson <alexl at redhat.com>
>>>>> Signed-off-by: Jingbo Xu <jefflexu at linux.alibaba.com>
>>>>> ---
>>>>>     fs/erofs/erofs_fs.h | 8 +++++++-
>>>>>     1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
>>>>> index 2c7b16e340fe..b4b6235fd720 100644
>>>>> --- a/fs/erofs/erofs_fs.h
>>>>> +++ b/fs/erofs/erofs_fs.h
>>>>> @@ -13,6 +13,7 @@
>>>>>
>>>>>     #define EROFS_FEATURE_COMPAT_SB_CHKSUM          0x00000001
>>>>>     #define EROFS_FEATURE_COMPAT_MTIME              0x00000002
>>>>> +#define EROFS_FEATURE_COMPAT_XATTR_FILTER    0x00000004
>>>>
>>>> I'd suggest leaving one reserved byte in the superblock for now
>>>> (and checking that it's 0), since
>>>>      1) the xattr filter feature is a compatible feature;
>>>>      2) I'm not sure if the implementation could be changed later,
>>>>
>>>> so that later implementation changes won't need another compat
>>>> bit.
>>>
>>> I would very much like to generate these bloom filters in composefs
>>> right now, before the composefs v1 format is completely locked down,
>>> and this should be fully possible given that this is a backwards
>>> compat change. But this is only possible if it doesn't require a
>>> feature flag like this that makes old erofs versions not mount the
>>> image.
>>
>> EROFS has two types of feature bits:
>>
>>    1) compat flags, which don't block mounting on old kernels;
>>    2) incompat flags, which will block mounting on old kernels.
>>
>> Here the bloom filter uses a new compat flag, so old kernels will just
>> ignore it and mount.  A compat flag just indicates "this image has a
>> feature, and you can use it or not".
>>
>> Here I just meant that the bloom filter internals are only fixed for now,
>> so we might reserve a byte in the on-disk superblock for potential later
>> changes (if any), and then we won't need to bother with yet another
>> new compat flag.
> 
> Cool. Then we're all good!

:)

> 

