Weird EROFS data corruption
Juhyung Park
qkrwngud825 at gmail.com
Mon Dec 4 14:41:30 AEDT 2023
Hi Gao,
On Mon, Dec 4, 2023 at 12:28 PM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>
>
>
> On 2023/12/4 01:32, Juhyung Park wrote:
> > Hi Gao,
>
> ...
>
> >>>
> >>>>
> >>>> What is the difference between these two machines? Just a different CPU, or
> >>>> do they differ in other ways, such as different compilers?
> >>>
> >>> I fully and exclusively control both devices, and the setup is almost the same.
> >>> Same Ubuntu version, kernel/compiler version.
> >>>
> >>> But as I said, on my laptop, the issue happens on kernels that someone
> >>> else (Canonical) built, so I don't think it matters.
> >>
> >> The only thing I could say is that the kernel side, unlike fuse,
> >> optimizes in-place decompression: it reuses the same buffer for
> >> decompression, but with a safe margin (according to the current
> >> lz4 decompression implementation). It shouldn't behave
> >> differently just due to different CPUs. Let me find more clues
> >> later; maybe we should also introduce a way for users to turn
> >> this off if needed.
> >
> > Cool :)
> >
> > I'm comfortable changing and building my own custom kernel for this
> > specific laptop. Feel free to ask me to try out some patches.
>
> Thanks, I need to narrow down this issue:
>
> - First, could you apply the following diff to test if it's still
> reproducible?
>
> diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
> index 021be5feb1bc..40a306628e1a 100644
> --- a/fs/erofs/decompressor.c
> +++ b/fs/erofs/decompressor.c
> @@ -131,7 +131,7 @@ static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx,
>
>          if (rq->inplace_io) {
>                  omargin = PAGE_ALIGN(ctx->oend) - ctx->oend;
> -                if (rq->partial_decoding || !may_inplace ||
> +                if (1 || rq->partial_decoding || !may_inplace ||
>                      omargin < LZ4_DECOMPRESS_INPLACE_MARGIN(rq->inputsize))
>                          goto docopy;
Yup, that fixes it.
With the diff applied, the hash output is identical across all 50 runs.
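For reference, my reading of the condition that the `1 ||` short-circuits
is roughly the sketch below. It is not the kernel code: the `sketch_*`
names and the `force_copy` parameter are made up for illustration, a 4 KiB
page size is assumed, and the margin formula is the `(size >> 8) + 32`
definition as I understand it from upstream lz4.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel uses its own PAGE_SIZE. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Assumed to mirror LZ4_DECOMPRESS_INPLACE_MARGIN(): (size >> 8) + 32. */
static size_t sketch_inplace_margin(size_t inputsize)
{
	return (inputsize >> 8) + 32;
}

/* Round x up to the next page boundary, in the spirit of PAGE_ALIGN(). */
static size_t sketch_page_align(size_t x)
{
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Returns true when the compressed input would be copied to a separate
 * buffer (the "goto docopy" path) instead of being decompressed in place.
 * force_copy is hypothetical and models the "1 ||" added by the test diff.
 */
static bool sketch_must_copy(size_t oend, size_t inputsize,
			     bool partial_decoding, bool may_inplace,
			     bool force_copy)
{
	/* Room left between the end of the output and the end of its page. */
	size_t omargin = sketch_page_align(oend) - oend;

	return force_copy || partial_decoding || !may_inplace ||
	       omargin < sketch_inplace_margin(inputsize);
}
```

With the test diff, `force_copy` is effectively always true, so the
in-place path is never taken, no matter how large the margin is.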
>
> - Could you share the full message about the output of `lscpu`?
Sure:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
Model name: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
BIOS Model name: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz None CPU @ 3.0GHz
BIOS CPU family: 198
CPU family: 6
Model: 140
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
CPU(s) scaling MHz: 60%
CPU max MHz: 4800.0000
CPU min MHz: 400.0000
BogoMIPS: 5990.40
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
       pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
       pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good
       nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni
       pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma
       cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt
       tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
       3dnowprefetch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb stibp
       ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase
       tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdt_a avx512f avx512dq
       rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd
       sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves
       split_lock_detect dtherm ida arat pln pts hwp hwp_notify
       hwp_act_window hwp_epp hwp_pkg_req vnmi avx512vbmi umip pku ospke
       avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme
       avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect
       md_clear ibt flush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 192 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 5 MiB (4 instances)
L3: 12 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerabilities:
Gather data sampling: Vulnerable
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Vulnerable
Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Vulnerable
Srbds: Not affected
Tsx async abort: Not affected
>
> Thanks,
> Gao Xiang