[PATCH] erofs: don't bother with s_stack_depth increasing for now
Amir Goldstein
amir73il at gmail.com
Sun Jan 4 21:01:10 AEDT 2026
[+fsdevel][+overlayfs]
On Sun, Jan 4, 2026 at 4:56 AM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>
> Hi Amir,
>
> On 2026/1/1 23:52, Amir Goldstein wrote:
> > On Wed, Dec 31, 2025 at 9:42 PM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
> >>
> >> Previously, commit d53cd891f0e4 ("erofs: limit the level of fs stacking
> >> for file-backed mounts") bumped `s_stack_depth` by one to avoid kernel
> >> stack overflow, but it breaks composefs mounts, which need erofs+ovl^2
> >> sometimes (and such setups have already been used in production for quite
> >> a long time) since `s_stack_depth` can be 3 (i.e., FILESYSTEM_MAX_STACK_DEPTH
> >> needs to change from 2 to 3).
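
For readers following the depth arithmetic above, here is a minimal userspace sketch of it. It is only a model, not kernel code: the helper names are hypothetical, and it assumes each stacking layer sits one level above what it is built on, with a mount refused once its depth exceeds FILESYSTEM_MAX_STACK_DEPTH (2, as stated in the description).

```c
#include <stdbool.h>

/* Limit as stated in the patch description. */
#define FILESYSTEM_MAX_STACK_DEPTH 2

/* Hypothetical helper: a stacking layer is one level above its lower fs. */
static int stacked_depth(int lower_depth)
{
	return lower_depth + 1;
}

/* Hypothetical helper: a mount is refused once it exceeds the limit. */
static bool depth_allowed(int depth)
{
	return depth <= FILESYSTEM_MAX_STACK_DEPTH;
}

/*
 * Depth of the top overlayfs in an erofs + ovl^2 setup.
 * erofs_bumps selects whether the file-backed EROFS mount counts as a
 * stacking level of its own (the behaviour introduced by d53cd891f0e4).
 */
static int composefs_top_depth(bool erofs_bumps)
{
	int backing = 0; /* non-stacked backing fs, e.g. XFS */
	int erofs = erofs_bumps ? stacked_depth(backing) : backing;
	int ovl_inner = stacked_depth(erofs);

	return stacked_depth(ovl_inner);
}
```

With the bump, `composefs_top_depth(true)` gives 3, which `depth_allowed()` rejects. Without it, the same setup comes out at 2 and is accepted.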
> >>
> >> After a long discussion on GitHub issues [1] about possible solutions,
> >> it seems there is no need to support nesting file-backed mounts as one
> >> conclusion (especially when increasing FILESYSTEM_MAX_STACK_DEPTH to 3).
> >> So let's disallow this right now, since there is always a way to use
> >> loopback devices as a fallback.
> >>
> >> Then, I started to wonder about an alternative EROFS quick fix to
> >> address the composefs mounts directly for this cycle: since EROFS is the
> >> only fs to support file-backed mounts and other stacked fses will just
> >> bump up `FILESYSTEM_MAX_STACK_DEPTH`, just check that `s_stack_depth`
> >> != 0 and the backing inode is not from EROFS instead.
> >>
> >> At least it works for all known file-backed mount use cases (composefs,
> >> containerd, and Android APEX for some Android vendors), and the fix is
> >> self-contained.
> >>
> >> Let's defer increasing FILESYSTEM_MAX_STACK_DEPTH for now.
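
A rough sketch of the check described above, under one reading of it: a file-backed EROFS mount is refused when the backing fs is itself stacked (`s_stack_depth` != 0) or when the backing inode belongs to EROFS. This is a userspace model, not the patch. `struct model_sb`, its `is_erofs` flag, and `file_backed_mount_allowed()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the superblock fields the check looks at. */
struct model_sb {
	int s_stack_depth; /* 0 for a non-stacked fs */
	bool is_erofs;     /* true if the backing file lives on EROFS */
};

/* Returns true if a file-backed EROFS mount on `backing` is allowed. */
static bool file_backed_mount_allowed(const struct model_sb *backing)
{
	/* Backing file sits on a stacked fs such as overlayfs: refuse. */
	if (backing->s_stack_depth != 0)
		return false;
	/* Backing file sits on EROFS, i.e. a nested file-backed mount: refuse. */
	if (backing->is_erofs)
		return false;
	return true;
}
```

In this model an image file on XFS or ext4 is accepted, while an image file on overlayfs or on another EROFS mount is rejected, and the EROFS mount's own `s_stack_depth` is left alone.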
> >>
> >> Fixes: d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed mounts")
> >> Closes: https://github.com/coreos/fedora-coreos-tracker/issues/2087 [1]
> >> Closes: https://lore.kernel.org/r/CAFHtUiYv4+=+JP_-JjARWjo6OwcvBj1wtYN=z0QXwCpec9sXtg@mail.gmail.com
> >> Cc: Amir Goldstein <amir73il at gmail.com>
> >> Cc: Alexander Larsson <alexl at redhat.com>
> >> Cc: Christian Brauner <brauner at kernel.org>
> >> Cc: Miklos Szeredi <mszeredi at redhat.com>
> >> Signed-off-by: Gao Xiang <hsiangkao at linux.alibaba.com>
> >> ---
> >
> > Acked-by: Amir Goldstein <amir73il at gmail.com>
> >
> > But you forgot to include details of the stack usage analysis you ran
> > with erofs+ovl^2 setup.
> >
> > I am guessing people will want to see this information before relaxing
> > s_stack_depth in this case.
>
> Sorry, I haven't checked email these days. I'm not sure whether posting
> detailed stack traces is useful. How about adding the following
> words:

I didn't mean detailed stack traces, but you ran some tests with the
newly possible setup and measured stack usage < 8K, so I think that is
worth mentioning.

>
> Note: There are some observations while evaluating the erofs + ovl^2
> setup with an XFS backing fs:
>
> - Regular RW workloads traverse only one overlayfs layer regardless of
> the value of FILESYSTEM_MAX_STACK_DEPTH, because `upperdir=` cannot
> point to another overlayfs. Therefore, for pure RW workloads, the
> typical stack is always just:
> overlayfs + upper fs + underlay storage
>
> - For read-only workloads and the copy-up read part (ovl_splice_read),
> the difference can lie in how many overlays are nested.
> The stack just looks like either:
> ovl + ovl [+ erofs] + backing fs + underlay storage
> or
> ovl [+ erofs] + ext4/xfs + underlay storage
>
> - The fs reclaim path should be entered only once, so the writeback
> path will not re-enter.
>
> Sorry about my English, and I'm not sure if it's enough (e.g. FUSE
> passthrough part). I will look for your further inputs (and other
> acks) before sending this patch upstream.
>

I think that most people will have problems understanding this
rationale, not because of the English, but because of the tech ;)
This is a bit too hand-wavy IMO.

> (Also btw, I'm not sure if it's possible to optimize read_iter and
> splice_read stack usage even further in overlayfs, e.g. by just
> recursively handling the real file/path directly in the top overlayfs,
> since the permission check is already done when opening the file.)

Maybe so, but the LSM hook for permission to open is not the same hook
as the one for permission to read/write.

Thanks,
Amir.