Weird EROFS data corruption

Gao Xiang hsiangkao at linux.alibaba.com
Wed Dec 6 01:34:37 AEDT 2023



On 2023/12/5 22:23, Juhyung Park wrote:
> Hi Gao,
> 
> On Tue, Dec 5, 2023 at 4:32 PM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>>
>> Hi Juhyung,
>>
>> On 2023/12/4 11:41, Juhyung Park wrote:
>>
>> ...
>>>
>>>>
>>>> - Could you share the full message about the output of `lscpu`?
>>>
>>> Sure:
>>>
>>> Architecture:            x86_64
>>>     CPU op-mode(s):        32-bit, 64-bit
>>>     Address sizes:         39 bits physical, 48 bits virtual
>>>     Byte Order:            Little Endian
>>> CPU(s):                  8
>>>     On-line CPU(s) list:   0-7
>>> Vendor ID:               GenuineIntel
>>>     BIOS Vendor ID:        Intel(R) Corporation
>>>     Model name:            11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
>>>       BIOS Model name:     11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz None CPU
>>>                             @ 3.0GHz
>>>       BIOS CPU family:     198
>>>       CPU family:          6
>>>       Model:               140
>>>       Thread(s) per core:  2
>>>       Core(s) per socket:  4
>>>       Socket(s):           1
>>>       Stepping:            1
>>>       CPU(s) scaling MHz:  60%
>>>       CPU max MHz:         4800.0000
>>>       CPU min MHz:         400.0000
>>>       BogoMIPS:            5990.40
>>>       Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mc
>>>                            a cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss
>>>                            ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art
>>>                             arch_perfmon pebs bts rep_good nopl xtopology nonstop_
>>>                            tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes6
>>>                            4 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xt
>>>                            pr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_dead
>>>                            line_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowp
>>>                            refetch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb st
>>>                            ibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_
>>>                            ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
>>>                             rdt_a avx512f avx512dq rdseed adx smap avx512ifma clfl
>>>                            ushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl
>>>                            xsaveopt xsavec xgetbv1 xsaves split_lock_detect dtherm
>>>                             ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
>>>                             hwp_pkg_req vnmi avx512vbmi umip pku ospke avx512_vbmi
>>>                            2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme av
>>>                            x512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2i
>>
>> Sigh, I've been thinking.  FSRM is the most significant difference between
>> our environments.  Could you try only the following diff (without the
>> previous disable patch) to see whether the problem still reproduces?
>>
>> diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
>> index 1b60ae81ecd8..1b52a913233c 100644
>> --- a/arch/x86/lib/memmove_64.S
>> +++ b/arch/x86/lib/memmove_64.S
>> @@ -41,9 +41,7 @@ SYM_FUNC_START(__memmove)
>>    #define CHECK_LEN     cmp $0x20, %rdx; jb 1f
>>    #define MEMMOVE_BYTES movq %rdx, %rcx; rep movsb; RET
>>    .Lmemmove_begin_forward:
>> -       ALTERNATIVE_2 __stringify(CHECK_LEN), \
>> -                     __stringify(CHECK_LEN; MEMMOVE_BYTES), X86_FEATURE_ERMS, \
>> -                     __stringify(MEMMOVE_BYTES), X86_FEATURE_FSRM
>> +       CHECK_LEN
>>
>>          /*
>>           * movsq instruction have many startup latency
> 
> Yup, that also seems to fix it.
> Are we looking at a potential memmove issue?
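
For readers following the diff above, here is a rough Python model of which
forward-copy path the removed ALTERNATIVE_2 line selects.  It is only a sketch
of my reading of the quoted hunk: the function name and the path labels are
made up for illustration, and it assumes the FSRM alternative takes precedence
when both ERMS and FSRM are set.

```python
def forward_copy_path(length, erms, fsrm):
    """Model of the forward-copy dispatch in the quoted __memmove hunk.

    Labels are illustrative, not kernel symbols.
    """
    if fsrm:
        # FSRM alternative: MEMMOVE_BYTES only, no length check.
        return "rep movsb"
    if length < 0x20:
        # CHECK_LEN: cmp $0x20, %rdx; jb 1f
        return "small-copy label 1"
    if erms:
        # ERMS alternative: CHECK_LEN; MEMMOVE_BYTES
        return "rep movsb"
    # Default: CHECK_LEN only, then fall through to the code after it.
    return "fall through"


def patched_forward_copy_path(length):
    # With the test diff applied, only CHECK_LEN remains, which behaves
    # like a CPU with neither ERMS nor FSRM.
    return forward_copy_path(length, erms=False, fsrm=False)
```

Under this model the test diff takes `rep movsb` out of the forward path
entirely, for every length, on both ERMS and FSRM machines.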

I'm still analyzing this behavior and its root cause.  I will also
try to get a recent cloud server with FSRM myself to find more
clues.
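
For anyone who wants to check whether their own machine has FSRM before
trying to reproduce, a minimal sketch follows.  It assumes a Linux system
where /proc/cpuinfo has a "flags" line listing "erms" and "fsrm" (as the
lscpu output above does); the helper names are made up for illustration.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_erms_fsrm(cpuinfo_text):
    """Return (erms, fsrm) booleans for the given cpuinfo text."""
    flags = parse_cpu_flags(cpuinfo_text)
    return ("erms" in flags, "fsrm" in flags)


def read_local_cpuinfo(path="/proc/cpuinfo"):
    # Assumption: running on Linux, where this file exists.
    with open(path) as f:
        return f.read()
```

Calling `has_erms_fsrm(read_local_cpuinfo())` on the affected machine should
report both flags as present.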

Thanks,
Gao Xiang

