Some new bio merging behaviors in __bio_try_merge_page

Gao Xiang gaoxiang25 at huawei.com
Thu Apr 11 20:20:53 AEST 2019



On 2019/4/11 16:09, Ming Lei wrote:
> On Thu, Apr 11, 2019 at 03:43:02PM +0800, Gao Xiang wrote:
>>
>>
>> On 2019/4/11 15:08, Ming Lei wrote:
>>> Hi Gao Xiang,
>>>
>>> On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote:
>>>> Hi Ming,
>>>>
>>>> I found a erofs issue after commit 07173c3ec276
>>>> ("block: enable multipage bvecs") is merged. It seems that
>>>> it tries to merge more physical continuous pages in one iovec.
>>>>
>>>> However it breaks the current erofs_read_raw_page logic since it uses
>>>> nr_iovecs of bio_alloc to limit the maximum number of physical
>>>
>>> I believe you can do the limit outside easily, such as by checking
>>> how many pages have been added to the bio.
>>
>> Yes, I agree that. However, I noticed that bio_add_page is exported as
>> EXPORT_SYMBOL. I have no idea how many out-of-tree drivers break as well.
> 
> I don't think it is a good behaviour to use bio->bi_max_vecs to limit
> max allowed page, you may see the idea from the naming simply...

I hoped to simplify all these page limits to one number which is
the minimum number among them all then...

> 
> If there were other such drivers, we may fix it easily, and the following
> patch should fix your issue:
> 
> diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c
> index 526e0dbea5b5..8878f2f2593e 100644
> --- a/drivers/staging/erofs/data.c
> +++ b/drivers/staging/erofs/data.c
> @@ -298,7 +298,7 @@ static inline struct bio *erofs_read_raw_page(struct bio *bio,
>  	*last_block = current_block;
>  
>  	/* shift in advance in case of it followed by too many gaps */
> -	if (unlikely(bio->bi_vcnt >= bio->bi_max_vecs)) {
> +	if (unlikely(bio->bi_iter.bi_size >= bio->bi_max_vecs * PAGE_SIZE)) {
>  		/* err should reassign to 0 after submitting */
>  		err = 0;
>  		goto submit_bio_out;

Yeah, it is fine, thanks for your suggestion :)

Thanks,
Gao Xiang

>>
>> Is there no way to take old behavior into consideration in the block layer as well?
> 
> One new interface, such as bio_add_page_limit(), may be helpful for
> addressing this issue, but not sure if it is necessary, since it is
> more reasonable to put the logic in user side.
> 
>>
>>>
>>>> continuous blocks as well. It was practicable since the old
>>>> __bio_try_merge_page only tries to merge in the same page.
>>>> it is a kAPI behavior change which also affects bio_alloc...
>>>>
>>>> ...
>>>> 231                 err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
>>>> 232                 if (unlikely(err))
>>>> 233                         goto err_out;
>>>> ...
>>>> 284                 /* max # of continuous pages */
>>>> 285                 if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
>>>> 286                         nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
>>>> 287                 if (nblocks > BIO_MAX_PAGES)
>>>> 288                         nblocks = BIO_MAX_PAGES;
>>>> 289
>>>> 290                 bio = erofs_grab_bio(sb, blknr, nblocks, sb,
>>>> 291                                      read_endio, false);
>>>
>>> Previously this bio is allowed to add at most 'nblocks' pages, however,
>>> now we are allowed to add at most 'nblocks' io vecs, instead of pages.
>>>
>>>> 292                 if (IS_ERR(bio)) {
>>>> 293                         err = PTR_ERR(bio);
>>>> 294                         bio = NULL;
>>>> 295                         goto err_out;
>>>> 296                 }
>>>> 297         }
>>>> 298
>>>> 299         err = bio_add_page(bio, page, PAGE_SIZE, 0);
>>>> 300         /* out of the extent or bio is full */
>>>> 301         if (err < PAGE_SIZE)
>>>> 302                 goto submit_bio_retry;
>>>> ...
>>>>
>>>> After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs could
>>>> read more beyond what erofs_map_blocks assigns, and out-of-bound data could
>>>> be read and it breaks tail-end inline determination.
>>>
>>> I don't understand why, could you explain a bit why erofs reads more?
>>
>> In current erofs, file could be break in two parts (non tail-end and tail-end blocks).
>> Considering this on-disk layout,
>>  ________________________________________________________________
>> |     non tail-end data              |   meta data               |
>> |____________________________________|_(inode) + non-inline data_|
>>                           ^          ^
>>                           |          |        
>>                           bio start  \-- what nr_iovecs indicates as well:
>>                                          the end of the non tail-end data.
>>
>> Before this commit, it will stop just before the next meta data. However,
>> if the new bio merging behavior is introduced, it could add pages more than
>> nr_iovecs, thus meta data will be read as the normal data...
> 
> I'd suggest you to not re-use bio->bi_max_vecs for this purpose.
> 
>>
>>
>>>
>>> The amount depends on how many pages you added to the bio.
>>>
>>>>
>>>> I can change the logic in erofs. However, out of curiosity, I have no idea
>>>> if some other places also are designed like this.
>>>>
>>>> IMO, it's better to provide a total count which indicates how many real
>>>> pages have been added in this bio. some thoughts?
>>>
>>> As I mentioned, you may count how many pages added to bio, or you still
>>> can get the number via bio_segments(bio).
>>
>> It is unsuitable to introduce bio_segments for each bio_add_page...
> 
> It isn't necessary, see the above patch.
> 
> 
> Thanks,
> Ming
> 


More information about the Linux-erofs mailing list