[PATCH V10 09/19] block: introduce bio_bvecs()
Sagi Grimberg
sagi at grimberg.me
Wed Nov 21 15:25:46 AEDT 2018
>> I would like to avoid growing bvec tables and keep everything
>> preallocated. Plus, a bvec_iter operates on a bvec which means
>> we'll need a table there as well... Not liking it so far...
>
> In the case of multiple bios in one request, we can't know how many bvecs
> there are without calling rq_bvecs(), so it may not be suitable to
> preallocate the table. If you have to send the IO request in one send(),
> runtime allocation may be inevitable.
I don't want to do that; I want to work on a single bvec at a time, like
the current implementation does.
> If you don't need to send the IO request in one send(), you may send
> one bio at a time and just use the bio's bvec table directly,
> as in the single bio case in lo_rw_aio().
We'd need some indication that we need to reinit the iter with the
new bvec; today we do:
static inline void nvme_tcp_advance_req(struct nvme_tcp_request *req,
		int len)
{
	req->snd.data_sent += len;
	req->pdu_sent += len;
	iov_iter_advance(&req->snd.iter, len);
	if (!iov_iter_count(&req->snd.iter) &&
	    req->snd.data_sent < req->data_len) {
		req->snd.curr_bio = req->snd.curr_bio->bi_next;
		nvme_tcp_init_send_iter(req);
	}
}
and initialize the send iter. I imagine that now I will need to
switch to the next bvec, and only when I'm on the last one will I
need to move to the next bio...
Do you offer an API for that?
>>> Can this approach avoid your blocking issue? You may see this
>>> example in the 'rq->bio != rq->biotail' branch of lo_rw_aio().
>>
>> This is exactly an example of not ignoring the bios...
>
> Yeah, that is the most common example, given that merging is enabled
> in most cases. If the driver or device doesn't care about merging,
> you can disable it and always get single-bio requests; then the
> bio's bvec table can be reused for send().
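For reference, I take it that's roughly the single bio path of lo_rw_aio()
(just a sketch, with the local names simplified):

	/*
	 * Single-bio request: the bio's bvec table is used directly,
	 * similar to the single bio case in lo_rw_aio().
	 */
	struct bio *bio = rq->bio;
	struct bio_vec *bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
	struct iov_iter iter;

	iov_iter_bvec(&iter, WRITE, bvec, bio_segments(bio), blk_rq_bytes(rq));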
Does bvec_iter span bvecs with your patches? I didn't see that change?
>> I'm not sure how this helps me either. Unless we can set a bvec_iter to
>> span bvecs or have an abstract bio crossing when we re-initialize the
>> bvec_iter I don't see how I can ignore bios completely...
>
> rq_for_each_bvec() will iterate over all bvecs from all bios, so you
> don't need to look at any bio in this req.
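Something like this, I assume (a sketch based on the rq_for_each_bvec()
form in your series; send_one_bvec() is just a placeholder):

	struct req_iterator iter;
	struct bio_vec bv;

	rq_for_each_bvec(bv, rq, iter)
		send_one_bvec(&bv);	/* bv may describe a multi-page segment */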
But I don't need this iteration; I need a transparent API like:
bvec2 = rq_bvec_next(rq, bvec)
This way I can simply always reinit my iter without thinking about how
the request/bios/bvecs are constructed...
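Roughly, the advance path could then look like this (just a sketch;
rq_bvec_next(), snd.rq and snd.curr_bvec don't exist today, they only
illustrate the shape of the API I'm after):

static inline void nvme_tcp_advance_req(struct nvme_tcp_request *req,
		int len)
{
	req->snd.data_sent += len;
	req->pdu_sent += len;
	iov_iter_advance(&req->snd.iter, len);
	if (!iov_iter_count(&req->snd.iter) &&
	    req->snd.data_sent < req->data_len) {
		/* step to the next bvec, wherever it lives in the request */
		req->snd.curr_bvec = rq_bvec_next(req->snd.rq,
						  req->snd.curr_bvec);
		iov_iter_bvec(&req->snd.iter, WRITE, req->snd.curr_bvec, 1,
			      req->snd.curr_bvec->bv_len);
	}
}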
> rq_bvecs() will return how many bvecs there are in this request
> (covering all bios in this req).
Still not very useful given that I don't want to use a table...
>>> So it looks like the nvme-tcp host driver might be the 2nd driver that
>>> benefits from multi-page bvec directly.
>>>
>>> The multi-page bvec V11 has passed my tests and addressed almost
>>> all the comments during review on V10. I removed bio_vecs() in V11,
>>> but it won't be big deal, we can introduce them anytime when there
>>> is the requirement.
>>
>> multipage-bvecs and nvme-tcp are going to conflict, so it would be good
>> to coordinate on this. I think that the nvme-tcp host needs some
>> adjustments, such as how it sets up its bvec_iter. I'm under the
>> impression that the change is rather
>> small and self-contained, but I'm not sure I have the full
>> picture here.
>
> I guess I may not fully understand nvme-tcp's exact requirement on the
> block IO iterator, :-(
They are pretty much listed above. Today nvme-tcp sets an iterator with:
	vec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
	nsegs = bio_segments(bio);
	size = bio->bi_iter.bi_size;
	offset = bio->bi_iter.bi_bvec_done;
	iov_iter_bvec(&req->snd.iter, WRITE, vec, nsegs, size);
and when done, iterate to the next bio and do the same.
With multipage bvec it would be great if we could simply have
something like rq_bvec_next() that would pretty much satisfy
the requirements from the nvme-tcp side...
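In other words, with something like rq_bvec_next() the init above stays
basically the same but works on one (possibly multi-page) bvec at a time,
e.g. (sketch only; snd.curr_bvec is a made-up field):

	struct bio_vec *bv = req->snd.curr_bvec;

	iov_iter_bvec(&req->snd.iter, WRITE, bv, 1, bv->bv_len);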