[External Mail]Re: [PATCH] erofs: Deadlock caused by kswap work in low memory scenarios

Jianhua1 Hao 郝建华 haojianhua1 at xiaomi.com
Tue Nov 23 21:03:02 AEDT 2021


Hi Jianan and Gao Xiang

We have picked up this patch for testing. Thanks for your reply.



________________________________
Jianhua1 Hao

From: Gao Xiang <xiang at kernel.org>
Date: 2021-11-23 13:16
To: Huang Jianan <huangjianan at oppo.com>; Jianhua1 Hao 郝建华 <haojianhua1 at xiaomi.com>
CC: xiang at kernel.org; linux-erofs <linux-erofs at lists.ozlabs.org>; linux-kernel <linux-kernel at vger.kernel.org>; chao <chao at kernel.org>; guoweichao at oppo.com; guanyuwei at oppo.com; yh at oppo.com; zhangshiming at oppo.com
Subject: [External Mail]Re: [PATCH] erofs: Deadlock caused by kswap work in low memory scenarios


Hi Jianan and Jianhua,

On Tue, Nov 23, 2021 at 11:58:32AM +0800, Huang Jianan wrote:
> On 2021/11/23 10:59, Jianhua1 Hao 郝建华 via Linux-erofs wrote:
> > We also found that it is easy to trigger a deadlock in the kswap
> > scenario. We observed the following deadlock during a stress test
> > under low memory. It is the same issue as "erofs: fix deadlock when
> > shrink erofs slab".
> >
> > Thread A:
> >
> >   erofs_try_to_release_workgroup(grp = 0xFFFFFF87ADFEE610)
> >     erofs_workgroup_try_to_freeze(grp, 1)
> >       // set ref count to EROFS_LOCKED_MAGIC
> >       atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)
> >     xa_erase(&sbi->managed_pslots, grp->index)
> >       // stuck there to wait for xa lock, already held by thread B
> >       xa_lock(xa);
> >
> > Thread B:
> >
> >   erofs_insert_workgroup()
> >     // xa lock is held here
> >     xa_lock(&sbi->managed_pslots);
> >     pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index, NULL, grp, GFP_NOFS);
> >     erofs_workgroup_get(pre) // pre = grp = 0xFFFFFF87ADFEE610
> >       erofs_wait_on_workgroup_freezed(grp);
> >         // wait ref count to be unlocked, which should be done by thread A
> >         atomic_cond_read_relaxed(&grp->refcount, VAL != EROFS_LOCKED_MAGIC);
> >
> > Follow-up fix: do we need to hold the xa lock before freezing the
> > workgroup, because we will operate on the xarray?
> >
> Hi Jianhua,
>
> The fix is in the patch below. Please test it if you are able to.
> https://lore.kernel.org/linux-erofs/YZcJpDs3FKpSfzAE@B-P7TQMD6M-0146/T/#t

Thanks for the report, I had some other work to do just now.

I've pushed out this patch to fixes branch and will send to Linus this
week:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git/commit/?id=deccd444d2844f1e89314dfc3956cccfdb813b65

As Jianan said, I believe this patch can fix your issue, so please give
it a try in advance. Also, it doesn't affect v4.19 and v5.4 LTS; only
v5.10 and v5.15 LTS are impacted.

Thanks for your report!
Gao Xiang
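
The lock-ordering problem in the trace quoted above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: `slot_table`, `table_lock()`, `try_to_release()` and `insert_workgroup()` are hypothetical stand-ins for `sbi->managed_pslots`, the xa lock, and the erofs helpers, and `LOCKED_MAGIC` stands in for `EROFS_LOCKED_MAGIC`. In this model a refcount of 1 means that only the table references the group. The release path takes the lock before freezing, so no thread can hold the lock while waiting on a frozen refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define LOCKED_MAGIC (-1)      /* hypothetical stand-in for EROFS_LOCKED_MAGIC */

struct workgroup {
    atomic_int refcount;       /* 1 means only the table references it */
};

struct slot_table {            /* hypothetical stand-in for sbi->managed_pslots */
    atomic_flag lock;          /* stand-in for the xa lock */
    struct workgroup *slot;    /* a single slot keeps the model small */
};

static void table_lock(struct slot_table *t)
{
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ;                      /* spin until the lock is free */
}

static void table_unlock(struct slot_table *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Freeze succeeds only if the refcount currently equals `expected`. */
static bool try_to_freeze(struct workgroup *grp, int expected)
{
    return atomic_compare_exchange_strong(&grp->refcount, &expected,
                                          LOCKED_MAGIC);
}

/* Release path: lock first, then freeze, then erase under the same lock. */
static bool try_to_release(struct slot_table *t, struct workgroup *grp)
{
    bool released = false;

    table_lock(t);
    if (t->slot == grp && try_to_freeze(grp, 1)) {
        t->slot = NULL;                    /* erase with the lock already held */
        atomic_store(&grp->refcount, 0);   /* unfreeze: table reference dropped */
        released = true;
    }
    table_unlock(t);
    return released;
}

/* Insert path: returns the group occupying the slot, with a reference taken. */
static struct workgroup *insert_workgroup(struct slot_table *t,
                                          struct workgroup *grp)
{
    struct workgroup *pre;

    table_lock(t);
    pre = t->slot;
    if (!pre) {
        t->slot = grp;
        pre = grp;
    }
    /* Freezing only happens under the lock, so the group cannot be frozen here. */
    assert(atomic_load(&pre->refcount) != LOCKED_MAGIC);
    atomic_fetch_add(&pre->refcount, 1);
    table_unlock(t);
    return pre;
}
```

In the reported ordering, the freeze happened before the lock was taken, which let the inserting thread find a frozen group while holding the lock. Moving the freeze under the lock, as modelled here, removes that window.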
