[PATCH v2 1/2] erofs: update on-disk format for xattr name filter

Alexander Larsson alexl at redhat.com
Wed Jul 5 18:12:16 AEST 2023


On Wed, Jul 5, 2023 at 9:51 AM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>
>
>
> On 2023/7/5 15:43, Alexander Larsson wrote:
> > On Wed, Jul 5, 2023 at 9:25 AM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
> >>
> >>
> >>
> >> On 2023/7/5 15:04, Jingbo Xu wrote:
> >>> The xattr name bloom filter feature is going to be introduced to speed
> >>> up the negative xattr lookup, e.g. system.posix_acl_[access|default]
> >>> lookup when running "ls -lR" workload.
> >>>
> >>> The number of common used extended attributes (n) is approximately 30.
> >>
> >> There are some commonly used extended attributes (n) and the total number
> >> of these is 31:
> >>
> >>>
> >>>        trusted.overlay.opaque
> >>>        trusted.overlay.redirect
> >>>        trusted.overlay.origin
> >>>        trusted.overlay.impure
> >>>        trusted.overlay.nlink
> >>>        trusted.overlay.upper
> >>>        trusted.overlay.metacopy
> >>>        trusted.overlay.protattr
> >>>        user.overlay.opaque
> >>>        user.overlay.redirect
> >>>        user.overlay.origin
> >>>        user.overlay.impure
> >>>        user.overlay.nlink
> >>>        user.overlay.upper
> >>>        user.overlay.metacopy
> >>>        user.overlay.protattr
> >>>        security.evm
> >>>        security.ima
> >>>        security.selinux
> >>>        security.SMACK64
> >>>        security.SMACK64IPIN
> >>>        security.SMACK64IPOUT
> >>>        security.SMACK64EXEC
> >>>        security.SMACK64TRANSMUTE
> >>>        security.SMACK64MMAP
> >>>        security.apparmor
> >>>        security.capability
> >>>        system.posix_acl_access
> >>>        system.posix_acl_default
> >>>        user.mime_type
> >>>
> >>> Given the number of bits of the bloom filter (m) is 32, the optimal
> >>> value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).
> >>>
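
For reference, the arithmetic behind k = 1 can be written out as below. This is only a sketch under the usual idealized assumption that the hash spreads names uniformly over the m bits; the symbol j (the number of xattr names actually stored on one inode) and the false-positive estimate are my additions and are not part of the patch description.

```latex
% Optimal number of hash functions for m = 32 bits and n = 30 names
k_{\mathrm{opt}} = \frac{m}{n}\ln 2 = \frac{32}{30}\cdot 0.693 \approx 0.74
\quad\Longrightarrow\quad k = 1 \ \text{(nearest positive integer)}

% With k = 1, idealized false-positive probability for an inode
% carrying j xattr names (j is an assumed symbol, not from the patch)
p_{\mathrm{fp}}(j) = 1 - \left(1 - \frac{1}{m}\right)^{j},
\qquad p_{\mathrm{fp}}(1) = \frac{1}{32} \approx 0.031
```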
> >>> The single hash function is implemented as:
> >>>
> >>>        xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)
> >>>
> >>> where index represents the index of corresponding predefined short name
> >>
> >> where `index`...
> >>
> >>
> >>
> >>> prefix, while name represents the name string after stripping the above
> >>> predefined name prefix.
> >>>
> >>> The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is
> >>> used to give a better spread when mapping these 30 extended attributes
> >>> into the 32-bit bloom filter as:
> >>>
> >>>        bit  0: security.ima
> >>>        bit  1:
> >>>        bit  2: trusted.overlay.nlink
> >>>        bit  3:
> >>>        bit  4: user.overlay.nlink
> >>>        bit  5: trusted.overlay.upper
> >>>        bit  6: user.overlay.origin
> >>>        bit  7: trusted.overlay.protattr
> >>>        bit  8: security.apparmor
> >>>        bit  9: user.overlay.protattr
> >>>        bit 10: user.overlay.opaque
> >>>        bit 11: security.selinux
> >>>        bit 12: security.SMACK64TRANSMUTE
> >>>        bit 13: security.SMACK64
> >>>        bit 14: security.SMACK64MMAP
> >>>        bit 15: user.overlay.impure
> >>>        bit 16: security.SMACK64IPIN
> >>>        bit 17: trusted.overlay.redirect
> >>>        bit 18: trusted.overlay.origin
> >>>        bit 19: security.SMACK64IPOUT
> >>>        bit 20: trusted.overlay.opaque
> >>>        bit 21: system.posix_acl_default
> >>>        bit 22:
> >>>        bit 23: user.mime_type
> >>>        bit 24: trusted.overlay.impure
> >>>        bit 25: security.SMACK64EXEC
> >>>        bit 26: user.overlay.redirect
> >>>        bit 27: user.overlay.upper
> >>>        bit 28: security.evm
> >>>        bit 29: security.capability
> >>>        bit 30: system.posix_acl_access
> >>>        bit 31: trusted.overlay.metacopy, user.overlay.metacopy
> >>>
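
To make the hash-to-bit mapping described above concrete, here is a minimal, self-contained sketch. It uses the standard XXH32 algorithm and the seed quoted in the commit message. Two things are assumptions on my side and not stated in this thread: that the 32-bit hash is reduced to a bit number by masking the low 5 bits, and the numeric value passed as `prefix_index` (the caller has to supply the real predefined prefix index). The names `sketch_xxh32` and `xattr_filter_bit` are hypothetical, and I have not checked that this sketch reproduces the table above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EROFS_XATTR_FILTER_SEED 0x25BBE08Fu /* from the commit message */
#define XATTR_FILTER_BITS       32u         /* m = 32, from the commit message */

/* Standard XXH32 primes. */
#define XXH_P1 0x9E3779B1u
#define XXH_P2 0x85EBCA77u
#define XXH_P3 0xC2B2AE3Du
#define XXH_P4 0x27D4EB2Fu
#define XXH_P5 0x165667B1u

static uint32_t rotl32(uint32_t x, int r)
{
	return (x << r) | (x >> (32 - r));
}

/* XXH32 consumes its input as little-endian 32-bit words. */
static uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t xxh32_round(uint32_t acc, uint32_t in)
{
	acc += in * XXH_P2;
	return rotl32(acc, 13) * XXH_P1;
}

/* Plain XXH32; hypothetical name, stands in for the kernel's xxh32(). */
uint32_t sketch_xxh32(const void *data, size_t len, uint32_t seed)
{
	const unsigned char *p = data;
	const unsigned char *end = p + len;
	uint32_t h;

	if (len >= 16) {
		uint32_t v1 = seed + XXH_P1 + XXH_P2, v2 = seed + XXH_P2;
		uint32_t v3 = seed, v4 = seed - XXH_P1;

		while (end - p >= 16) {	/* four lanes per 16-byte stripe */
			v1 = xxh32_round(v1, read_le32(p));
			v2 = xxh32_round(v2, read_le32(p + 4));
			v3 = xxh32_round(v3, read_le32(p + 8));
			v4 = xxh32_round(v4, read_le32(p + 12));
			p += 16;
		}
		h = rotl32(v1, 1) + rotl32(v2, 7) + rotl32(v3, 12) + rotl32(v4, 18);
	} else {
		h = seed + XXH_P5;
	}
	h += (uint32_t)len;

	while (end - p >= 4) {	/* remaining whole words */
		h += read_le32(p) * XXH_P3;
		h = rotl32(h, 17) * XXH_P4;
		p += 4;
	}
	while (p < end) {	/* remaining bytes */
		h += (uint32_t)*p++ * XXH_P5;
		h = rotl32(h, 11) * XXH_P1;
	}

	h ^= h >> 15;		/* final avalanche */
	h *= XXH_P2;
	h ^= h >> 13;
	h *= XXH_P3;
	h ^= h >> 16;
	return h;
}

/*
 * Bit number for one xattr name. `suffix` is the name with the predefined
 * prefix stripped; `prefix_index` is that prefix's index.
 * ASSUMPTION: the hash is reduced to a bit number with a low-bits mask.
 */
unsigned int xattr_filter_bit(unsigned int prefix_index, const char *suffix)
{
	assert(suffix != NULL);
	return sketch_xxh32(suffix, strlen(suffix),
			    EROFS_XATTR_FILTER_SEED + prefix_index) &
	       (XATTR_FILTER_BITS - 1);
}
```

A caller would compute `xattr_filter_bit(idx, "selinux")` for security.selinux, with `idx` being whatever index the security. prefix has on disk.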
> >>> The h_name_filter field is introduced to the on-disk per-inode xattr
> >>> header to place the corresponding xattr name filter, where bit value 1
> >>> indicates non-existence for compatibility.
> >>>
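
The inverted bit sense ("1 means non-existence") described above is easy to get backwards, so here is a small sketch of how I read it. The presumed reason is that a previously reserved, all-zero field then reads as "every name may exist", which simply falls back to the normal lookup. The function names are hypothetical, and the bit numbers in the usage note are taken from the table in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Build a filter word from the bit numbers of the names present on an
 * inode. Start from all ones ("nothing exists") and clear one bit per name.
 */
uint32_t name_filter_build(const unsigned int *bits, unsigned int count)
{
	uint32_t filter = UINT32_MAX;
	unsigned int i;

	for (i = 0; i < count; i++) {
		assert(bits[i] < 32);
		filter &= ~(UINT32_C(1) << bits[i]);
	}
	return filter;
}

/*
 * A set bit means the name is definitely absent, so the lookup can be
 * skipped. A clear bit only means "maybe present": do the real lookup.
 */
bool name_filter_definitely_absent(uint32_t filter, unsigned int bit)
{
	assert(bit < 32);
	return (filter >> bit) & 1u;
}
```

For an inode carrying only security.selinux (bit 11 in the table), the filter is 0xFFFFF7FF, and a lookup of system.posix_acl_access (bit 30) can be answered negatively without touching the xattr data. A filter of 0 never rules anything out.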
> >>> This feature is indicated by EROFS_FEATURE_COMPAT_XATTR_FILTER
> >>> compatible feature bit.
> >>>
> >>> Suggested-by: Alexander Larsson <alexl at redhat.com>
> >>> Signed-off-by: Jingbo Xu <jefflexu at linux.alibaba.com>
> >>> ---
> >>>    fs/erofs/erofs_fs.h | 8 +++++++-
> >>>    1 file changed, 7 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> >>> index 2c7b16e340fe..b4b6235fd720 100644
> >>> --- a/fs/erofs/erofs_fs.h
> >>> +++ b/fs/erofs/erofs_fs.h
> >>> @@ -13,6 +13,7 @@
> >>>
> >>>    #define EROFS_FEATURE_COMPAT_SB_CHKSUM          0x00000001
> >>>    #define EROFS_FEATURE_COMPAT_MTIME              0x00000002
> >>> +#define EROFS_FEATURE_COMPAT_XATTR_FILTER    0x00000004
> >>
> >> I'd suggest that we leave one reserved byte in the
> >> superblock for now (and check that it's 0), since
> >>     1) the xattr filter feature is a compatible feature;
> >>     2) I'm not sure whether the implementation could change.
> >>
> >> That way, later implementation changes won't need to touch the compat
> >> bits again.
> >
> > I would very much like to generate these bloom filters in composefs
> > right now, before the composefs v1 format is completely locked down,
> > and this should be fully possible given that this is a backwards
> > compat change. But this is only possible if it doesn't require a
> > feature flag like this that makes old erofs versions not mount the
> > image.
>
> EROFS has two types of feature bits:
>
>   1) compat flags, which don't block mounting on old kernels;
>   2) incompat flags, which do block mounting on old kernels.
>
> Here the bloom filter uses a new compat flag, so old kernels will just
> ignore it and mount.  A compat flag just indicates "this image has a
> feature, and you can use it or not".
>
> Here I just meant that the bloom filter internals are fixed for now,
> so we might reserve a byte in the on-disk super block for
> later potential changes (if any), and then we won't need to bother with
> another new compat flag.

Cool. Then we're all good!

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                Red Hat, Inc
       alexl at redhat.com         alexander.larsson at gmail.com


