On Wed, May 21, 2025 at 06:05:14PM -0700, Caleb Sander Mateos wrote: > On Wed, May 21, 2025 at 5:42 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote: > > > > On Wed, May 21, 2025 at 08:58:20AM -0700, Caleb Sander Mateos wrote: > > > On Tue, May 20, 2025 at 7:55 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote: > > > > > > > > UBLK_F_AUTO_BUF_REG requires that the buffer registered automatically > > > > is unregistered in same `io_ring_ctx`, so check it explicitly. > > > > > > > > Meantime return the failure code if io_buffer_unregister_bvec() fails, > > > > then ublk server can handle the failure in consistent way. > > > > > > > > Also force to clear UBLK_IO_FLAG_AUTO_BUF_REG in ublk_io_release() > > > > because ublk_io_release() may be triggered not from handling > > > > UBLK_IO_COMMIT_AND_FETCH_REQ, and from releasing the `io_ring_ctx` > > > > for registering the buffer. > > > > > > > > Fixes: 99c1e4eb6a3f ("ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG") > > > > Reported-by: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx> > > > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > > > --- > > > > drivers/block/ublk_drv.c | 35 +++++++++++++++++++++++++++++++---- > > > > include/uapi/linux/ublk_cmd.h | 3 ++- > > > > 2 files changed, 33 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c > > > > index fcf568b89370..2af6422d6a89 100644 > > > > --- a/drivers/block/ublk_drv.c > > > > +++ b/drivers/block/ublk_drv.c > > > > @@ -84,6 +84,7 @@ struct ublk_rq_data { > > > > > > > > /* for auto-unregister buffer in case of UBLK_F_AUTO_BUF_REG */ > > > > u16 buf_index; > > > > + unsigned long buf_ctx_id; > > > > }; > > > > > > > > struct ublk_uring_cmd_pdu { > > > > @@ -1192,6 +1193,11 @@ static void ublk_auto_buf_reg_fallback(struct request *req, struct ublk_io *io) > > > > refcount_set(&data->ref, 1); > > > > } > > > > > > > > +static unsigned long ublk_uring_cmd_ctx_id(struct io_uring_cmd *cmd) > > > > +{ > > > > + return (unsigned long)(cmd_to_io_kiocb(cmd)->ctx); > > > > > > Is the fact that a struct io_uring_cmd * can be passed to > > > cmd_to_io_kiocb() an io_uring internal implementation detail? Maybe it > > > would be good to add a helper in include/linux/io_uring/cmd.h so ublk > > > isn't depending on io_uring internals. > > > > All this definition is defined in kernel public header, not sure if it is > > big deal to add the helper, especially there is just single user. > > > > But I will do it. > > > > > > > > Also, storing buf_ctx_id as a void * instead would allow this cast to > > > be avoided, but not a big deal. > > > > > > > +} > > > > + > > > > static bool ublk_auto_buf_reg(struct request *req, struct ublk_io *io, > > > > unsigned int issue_flags) > > > > { > > > > @@ -1211,6 +1217,8 @@ static bool ublk_auto_buf_reg(struct request *req, struct ublk_io *io, > > > > } > > > > /* one extra reference is dropped by ublk_io_release */ > > > > refcount_set(&data->ref, 2); > > > > + > > > > + data->buf_ctx_id = ublk_uring_cmd_ctx_id(io->cmd); > > > > /* store buffer index in request payload */ > > > > data->buf_index = pdu->buf.index; > > > > io->flags |= UBLK_IO_FLAG_AUTO_BUF_REG; > > > > @@ -1994,6 +2002,21 @@ static void ublk_io_release(void *priv) > > > > { > > > > struct request *rq = priv; > > > > struct ublk_queue *ubq = rq->mq_hctx->driver_data; > > > > + struct ublk_io *io = &ubq->ios[rq->tag]; > > > > + > > > > + /* > > > > + * In case of UBLK_F_AUTO_BUF_REG, the `io_uring_ctx` for registering > > > > + * this buffer may be released, so we reach here not from handling > > > > + * `UBLK_IO_COMMIT_AND_FETCH_REQ`. > > > > > > What do you mean by this? That the io_uring was closed while a ublk > > > I/O owned by the server still had a registered buffer? > > > > The buffer is registered to `io_ring_ctx A`, which is closed and the buffer > > is used up and un-registered on `io_ring_ctx A`, so this callback is > > triggered, but the io command isn't completed yet, which can be run from > > `io_ring_ctx B` > > > > > > > > > + * > > > > + * Force to clear UBLK_IO_FLAG_AUTO_BUF_REG, so that ublk server > > > > + * still may complete this IO request by issuing uring_cmd from > > > > + * another `io_uring_ctx` in case that the `io_ring_ctx` for > > > > + * registering the buffer is gone > > > > + */ > > > > + if (ublk_support_auto_buf_reg(ubq) && > > > > + (io->flags & UBLK_IO_FLAG_AUTO_BUF_REG)) > > > > + io->flags &= ~UBLK_IO_FLAG_AUTO_BUF_REG; > > > > > > This is racy, since ublk_io_release() can be called on a thread other > > > than the ubq_daemon. > > > > Yeah, it can be true. > > > > > Could we avoid touching io->flags here and > > > instead have ublk_commit_and_fetch() check whether the reference count > > > is already 1? > > > > It is still a little racy because the buffer unregister from another thread > > can happen just after the check immediately. > > True. I think it might be better to just skip the unregister if the > contexts don't match rather than returning -EINVAL. Then there is no > race. If userspace has already closed the old io_uring context, > skipping the unregister is the desired behavior. If userspace hasn't > closed the old io_uring, then that's a userspace bug and they get what > they deserve (a buffer stuck registered). If userspace wants to submit > the UBLK_IO_COMMIT_AND_FETCH_REQ on a different io_uring for some > reason, they can always issue an explicit UBLK_IO_UNREGISTER_IO_BUF on > the old io_uring to unregister the buffer. Sounds good, will add the above words on ublk_commit_and_fetch(). Follows two invariants: - ublk_io_release() is always called once no matter if it is called from any thread context, request can't be completed until ublk_io_release() is called - new UBLK_IO_COMMIT_AND_FETCH_REQ can't be issued until old request is completed & new request comes The merged code should work just fine. But we need to document the feature only works on same io_ring_context. Thanks, Ming