Some new bio merging behaviors in __bio_try_merge_page

Ming Lei ming.lei at redhat.com
Thu Apr 11 18:09:54 AEST 2019


On Thu, Apr 11, 2019 at 03:43:02PM +0800, Gao Xiang wrote:
> 
> 
> On 2019/4/11 15:08, Ming Lei wrote:
> > Hi Gao Xiang,
> > 
> > On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote:
> >> Hi Ming,
> >>
> >> I found a erofs issue after commit 07173c3ec276
> >> ("block: enable multipage bvecs") is merged. It seems that
> >> it tries to merge more physical continuous pages in one iovec.
> >>
> >> However it breaks the current erofs_read_raw_page logic since it uses
> >> nr_iovecs of bio_alloc to limit the maximum number of physical
> > 
> > I believe you can do the limit outside easily, such as by checking
> > how many pages have been added to the bio.
> 
> Yes, I agree that. However, I noticed that bio_add_page is exported as
> EXPORT_SYMBOL. I have no idea how many out-of-tree drivers break as well.

I don't think it is a good behaviour to use bio->bi_max_vecs to limit
max allowed page, you may see the idea from the naming simply...

If there were other such drivers, we may fix it easily, and the following
patch should fix your issue:

diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c
index 526e0dbea5b5..8878f2f2593e 100644
--- a/drivers/staging/erofs/data.c
+++ b/drivers/staging/erofs/data.c
@@ -298,7 +298,7 @@ static inline struct bio *erofs_read_raw_page(struct bio *bio,
 	*last_block = current_block;
 
 	/* shift in advance in case of it followed by too many gaps */
-	if (unlikely(bio->bi_vcnt >= bio->bi_max_vecs)) {
+	if (unlikely(bio->bi_iter.bi_size >= bio->bi_max_vecs * PAGE_SIZE)) {
 		/* err should reassign to 0 after submitting */
 		err = 0;
 		goto submit_bio_out;
> 
> Is there no way to take old behavior into consideration in the block layer as well?

One new interface, such as bio_add_page_limit(), may be helpful for
addressing this issue, but not sure if it is necessary, since it is
more reasonable to put the logic in user side.

> 
> > 
> >> continuous blocks as well. It was practicable since the old
> >> __bio_try_merge_page only tries to merge in the same page.
> >> it is a kAPI behavior change which also affects bio_alloc...
> >>
> >> ...
> >> 231                 err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
> >> 232                 if (unlikely(err))
> >> 233                         goto err_out;
> >> ...
> >> 284                 /* max # of continuous pages */
> >> 285                 if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
> >> 286                         nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
> >> 287                 if (nblocks > BIO_MAX_PAGES)
> >> 288                         nblocks = BIO_MAX_PAGES;
> >> 289
> >> 290                 bio = erofs_grab_bio(sb, blknr, nblocks, sb,
> >> 291                                      read_endio, false);
> > 
> > Previously this bio is allowed to add at most 'nblocks' pages, however,
> > now we are allowed to add at most 'nblocks' io vecs, instead of pages.
> > 
> >> 292                 if (IS_ERR(bio)) {
> >> 293                         err = PTR_ERR(bio);
> >> 294                         bio = NULL;
> >> 295                         goto err_out;
> >> 296                 }
> >> 297         }
> >> 298
> >> 299         err = bio_add_page(bio, page, PAGE_SIZE, 0);
> >> 300         /* out of the extent or bio is full */
> >> 301         if (err < PAGE_SIZE)
> >> 302                 goto submit_bio_retry;
> >> ...
> >>
> >> After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs could
> >> read more beyond what erofs_map_blocks assigns, and out-of-bound data could
> >> be read and it breaks tail-end inline determination.
> > 
> > I don't understand why, could you explain a bit why erofs reads more?
> 
> In current erofs, file could be break in two parts (non tail-end and tail-end blocks).
> Considering this on-disk layout,
>  ________________________________________________________________
> |     non tail-end data              |   meta data               |
> |____________________________________|_(inode) + non-inline data_|
>                           ^          ^
>                           |          |        
>                           bio start  \-- what nr_iovecs indicates as well:
>                                          the end of the non tail-end data.
> 
> Before this commit, it will stop just before the next meta data. However,
> if the new bio merging behavior is introduced, it could add pages more than
> nr_iovecs, thus meta data will be read as the normal data...

I'd suggest you to not re-use bio->bi_max_vecs for this purpose.

> 
> 
> > 
> > The amount depends on how many pages you added to the bio.
> > 
> >>
> >> I can change the logic in erofs. However, out of curiosity, I have no idea
> >> if some other places also are designed like this.
> >>
> >> IMO, it's better to provide a total count which indicates how many real
> >> pages have been added in this bio. some thoughts?
> > 
> > As I mentioned, you may count how many pages added to bio, or you still
> > can get the number via bio_segments(bio).
> 
> It is unsuitable to introduce bio_segments for each bio_add_page...

It isn't necessary, see the above patch.


Thanks,
Ming


More information about the Linux-erofs mailing list