[PATCH RESEND] erofs: fix a race of deduplicated compressed images to avoid loops

Gao Xiang hsiangkao at linux.alibaba.com
Fri May 19 17:12:19 AEST 2023


After heavily stressing EROFS with several images which include a
hand-crafted image of repeated patterns for more than 46 days, I found
two chains could be linked with each other almost simultaneously and
form a loop, so the entire loop won't be submitted to the device.  As a
consequence, the corresponding file pages will remain locked forever.

It can _only_ be observed on data-deduplicated compressed images.  For
example, consider two chains with five pclusters in total:
	Chain 1:  2->3->4->5    -- The tail pcluster is 5;
	Chain 2:  5->1->2       -- The tail pcluster is 2.

Chain 2 could link to Chain 1 due to pcluster 5; and Chain 1 could link
to Chain 2 at the same time due to pcluster 2.  (Note that Chain 2 is
invalid on traditional compressed images without data deduplication.)

Fix this by checking if the tail of a chain is extended after the chain
itself is attached into another chain.  If so, bail out instead.

Fixes: 267f2492c8f7 ("erofs: introduce multi-reference pclusters (fully-referenced)")
Signed-off-by: Gao Xiang <hsiangkao at linux.alibaba.com>
---
RESEND:
 fix commit message.

I plan to stress this patch for a week before upstreaming.
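
For readers who want to follow the race without the kernel sources at
hand, below is a minimal single-threaded sketch of the five-pcluster
example from the commit message.  It is a toy model, not the zdata.c
code: every toy_* name is hypothetical, the cmpxchg() is replaced by a
plain compare-and-store, and the interleaving (both links land before
either walker re-checks its own tail) is simply replayed in order.  It
also assumes one reading of the example, namely that one walker owns
3->4->5 and hooks onto pcluster 2 while the other owns 1->2 and hooks
onto pcluster 5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are the kernel's definitions. */
struct toy_pcl {
	struct toy_pcl *next;
};

static struct toy_pcl toy_tail_marker;	/* plays the role of Z_EROFS_PCLUSTER_TAIL */
#define TOY_TAIL (&toy_tail_marker)

struct toy_frontend {
	struct toy_pcl *owned_head;	/* head of the chain this walker owns */
	struct toy_pcl *tailpcl;	/* tail of that chain, NULL once hooked */
};

/* Step 1 of a type 2 hook: a single-threaded stand-in for the cmpxchg(). */
static bool toy_link(struct toy_frontend *f, struct toy_pcl *pcl)
{
	if (pcl == f->tailpcl || pcl->next != TOY_TAIL)
		return false;
	pcl->next = f->owned_head;
	return true;
}

/* Step 2: with the fix, undo the link if our own tail was extended meanwhile. */
static bool toy_confirm(struct toy_frontend *f, struct toy_pcl *pcl, bool with_fix)
{
	if (with_fix && f->tailpcl && f->tailpcl->next != TOY_TAIL) {
		pcl->next = TOY_TAIL;	/* bail out, stay the owner of our chain */
		return false;
	}
	f->owned_head = TOY_TAIL;
	f->tailpcl = NULL;
	return true;
}

/* Walk from start; not reaching TOY_TAIL within a few steps means a loop. */
static bool toy_has_loop(struct toy_pcl *start)
{
	struct toy_pcl *p = start;

	assert(start != NULL);
	for (int steps = 0; steps < 16; steps++) {
		if (p == TOY_TAIL)
			return false;
		p = p->next;
	}
	return true;
}

/* Replay the worst-case interleaving: both links land before either check. */
static bool toy_race_forms_loop(bool with_fix)
{
	struct toy_pcl p[6];	/* p[1]..p[5] are pclusters 1..5, p[0] is unused */
	struct toy_frontend a = { .owned_head = &p[3], .tailpcl = &p[5] };
	struct toy_frontend b = { .owned_head = &p[1], .tailpcl = &p[2] };

	for (int i = 0; i < 6; i++)
		p[i].next = TOY_TAIL;
	p[3].next = &p[4];	/* walker A owns 3->4->5, tail 5 */
	p[4].next = &p[5];
	p[1].next = &p[2];	/* walker B owns 1->2, tail 2 */

	bool a_linked = toy_link(&a, &p[2]);	/* A hooks onto B's tail */
	bool b_linked = toy_link(&b, &p[5]);	/* B hooks onto A's tail */
	if (a_linked)
		toy_confirm(&a, &p[2], with_fix);
	if (b_linked)
		toy_confirm(&b, &p[5], with_fix);

	return toy_has_loop(&p[2]);
}
```

In this model, toy_race_forms_loop(false) leaves 2->3->4->5->1->2 with
no owner, which mirrors the never-submitted loop described above, while
toy_race_forms_loop(true) makes walker A undo its link and keep its
chain, so the walk from pcluster 2 terminates.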

 fs/erofs/zdata.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 45f21db2303a..88295c73ff90 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -756,13 +756,17 @@ static void z_erofs_try_to_claim_pcluster(struct z_erofs_decompress_frontend *f)
 	 * type 2, link to the end of an existing open chain, be careful
 	 * that its submission is controlled by the original attached chain.
 	 */
-	if (*owned_head != &pcl->next && pcl != f->tailpcl &&
-	    cmpxchg(&pcl->next, Z_EROFS_PCLUSTER_TAIL,
-		    *owned_head) == Z_EROFS_PCLUSTER_TAIL) {
-		*owned_head = Z_EROFS_PCLUSTER_TAIL;
-		f->mode = Z_EROFS_PCLUSTER_HOOKED;
-		f->tailpcl = NULL;
-		return;
+	if (pcl != f->tailpcl && cmpxchg(&pcl->next, Z_EROFS_PCLUSTER_TAIL,
+					*owned_head) == Z_EROFS_PCLUSTER_TAIL) {
+		/* switch to type 3 if our owned chain is attached by others */
+		if (f->tailpcl && f->tailpcl->next != Z_EROFS_PCLUSTER_TAIL) {
+			WRITE_ONCE(pcl->next, Z_EROFS_PCLUSTER_TAIL);
+		} else {
+			*owned_head = Z_EROFS_PCLUSTER_TAIL;
+			f->mode = Z_EROFS_PCLUSTER_HOOKED;
+			f->tailpcl = NULL;
+			return;
+		}
 	}
 	/* type 3, it belongs to a chain, but it isn't the end of the chain */
 	f->mode = Z_EROFS_PCLUSTER_INFLIGHT;
@@ -825,9 +829,6 @@ static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe)
 			goto err_out;
 		}
 	}
-	/* used to check tail merging loop due to corrupted images */
-	if (fe->owned_head == Z_EROFS_PCLUSTER_TAIL)
-		fe->tailpcl = pcl;
 	fe->owned_head = &pcl->next;
 	fe->pcl = pcl;
 	return 0;
@@ -867,14 +868,14 @@ static int z_erofs_collector_begin(struct z_erofs_decompress_frontend *fe)
 
 	if (ret == -EEXIST) {
 		mutex_lock(&fe->pcl->lock);
-		/* used to check tail merging loop due to corrupted images */
-		if (fe->owned_head == Z_EROFS_PCLUSTER_TAIL)
-			fe->tailpcl = fe->pcl;
-
 		z_erofs_try_to_claim_pcluster(fe);
 	} else if (ret) {
 		return ret;
 	}
+
+	/* detect/avoid loop formed out of chain linking (type 2) */
+	if (fe->pcl->next == Z_EROFS_PCLUSTER_TAIL)
+		fe->tailpcl = fe->pcl;
 	z_erofs_bvec_iter_begin(&fe->biter, &fe->pcl->bvset,
 				Z_EROFS_INLINE_BVECS, fe->pcl->vcnt);
 	/* since file-backed online pages are traversed in reverse order */
-- 
2.24.4


