[PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

Chao Yu chao at kernel.org
Sun Aug 18 22:28:17 AEST 2019


Hi Xiang,

On 2019-8-18 18:52, Gao Xiang wrote:
> Hi Chao,
> 
> On Sun, Aug 18, 2019 at 06:39:52PM +0800, Chao Yu wrote:
>> On 2019-8-18 10:53, Matthew Wilcox wrote:
>>> On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
>>>> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
>>>>> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
>>>>>> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
>>>>>>  		unsigned int nameoff, maxsize;
>>>>>>  
>>>>>>  		dentry_page = read_mapping_page(mapping, i, NULL);
>>>>>> -		if (IS_ERR(dentry_page))
>>>>>> -			continue;
>>>>>> +		if (IS_ERR(dentry_page)) {
>>>>>> +			errln("fail to readdir of logical block %u of nid %llu",
>>>>>> +			      i, EROFS_V(dir)->nid);
>>>>>> +			err = PTR_ERR(dentry_page);
>>>>>> +			break;
>>>>>
>>>>> I don't think you want to use the errno that came back from
>>>>> read_mapping_page() (which is, I think, always going to be -EIO).
>>>>> Rather you want -EFSCORRUPTED, at least if I understand the recent
>>>>> patches to ext2/ext4/f2fs/xfs/...
>>>>
>>>> Thanks for your reply and noticing this. :)
>>>>
>>>> Yes, as I talked with you about read_mapping_page() in a xfs related
>>>> topic earlier, I think I fully understand what returns here.
>>>>
>>>> I actually had some concern about that before sending out this patch.
>>>> You know the status is
>>>>    PG_uptodate is not set and PG_error is set here.
>>>>
>>>> But we cannot know it is actually a disk read error or due to
>>>> corrupted images (due to lack of page flags or some status, and
>>>> I think it could be a waste of page structure space for such
>>>> corrupted image or disk error)...
>>>>
>>>> And some people also like propagate errors from insiders...
>>>> (and they could argue about err = -EFSCORRUPTED as well..)
>>>>
>>>> I'd like hear your suggestion about this after my words above?
>>>> still return -EFSCORRUPTED?
>>>
>>> I don't think it matters whether it's due to a disk error or a corrupted
>>> image.  We can't read the directory entry, so we should probably return
>>> -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
>>> also return -ENOMEM, so it should probably look something like this:
>>>
>>> 		err = 0;
>>> 		if (dentry_page == ERR_PTR(-ENOMEM))
>>> 			err = -ENOMEM;
>>> 		else if (IS_ERR(dentry_page)) {
>>> 			errln("fail to readdir of logical block %u of nid %llu",
>>> 			      i, EROFS_V(dir)->nid);
>>> 			err = -EFSCORRUPTED;
>>
>> Well, if there is real IO error happen under filesystem, we should return -EIO
>> instead of EFSCORRUPTED?
>>
>> The right fix may be that doing sanity check on on-disk blkaddr, and return
>> -EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller
>> erofs_readdir(), IIUC below error info.
> 
> In my thought, I actually don't care what is actually returned
> (In other words, I have no tendency about EFSCORRUPTED / EIO)
> as long as it behaves normal for corrupted image.

I doubt that user can and be willing to handle different error code in there
error path.

> 
> A little concern is that I have no idea whether all user problems
> can handle EUCLEAN properly.

Yes, I can see it's widely used in fileystem, I guess it would be better to
update manual of common fs interfaces to let user be aware of the meaning of it.

> 
> I don't want to limit blkaddr as what ->blocks recorded in
> erofs_super_block since it's already used for our hotpatching
> approach and that field is only used for statfs() for users
> to know its visible size, and block layer will block such
> invalid block access.
> 
> All in all, that is minor. I think we can fix as what Matthew said.

Yeah, as we discuss offline, we need keep flexibility on current version for
android, and maybe we can add a feature to check blkaddr validation later.

Thanks,

> 
> Thanks,
> Gao Xiang
> 
>>
>>> [36297.354090] attempt to access beyond end of device
>>> [36297.354098] loop17: rw=0, want=29887428984, limit=1953128
>>> [36297.354107] attempt to access beyond end of device
>>> [36297.354109] loop17: rw=0, want=29887428480, limit=1953128
>>> [36301.827234] attempt to access beyond end of device
>>> [36301.827243] loop17: rw=0, want=29887428480, limit=1953128
>>
>> Thanks,
>>
>>> 		}
>>>
>>> 		if (err)
>>> 			break;
>>>


More information about the Linux-erofs mailing list