On 4/29/25 01:47, Ming Lei wrote:
On Mon, Apr 28, 2025 at 11:28:30AM +0100, Pavel Begunkov wrote:
On 4/28/25 10:44, Ming Lei wrote:
Extend io_buffer_register_bvec() and io_buffer_unregister_bvec() for
supporting to register/unregister bvec buffer to specified io_uring,
which FD is usually passed from userspace.
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
include/linux/io_uring/cmd.h | 4 ++
io_uring/rsrc.c | 83 +++++++++++++++++++++++++++---------
2 files changed, 67 insertions(+), 20 deletions(-)
diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
index 78fa336a284b..7516fe5cd606 100644
--- a/include/linux/io_uring/cmd.h
+++ b/include/linux/io_uring/cmd.h
@@ -25,6 +25,10 @@ struct io_uring_cmd_data {
...
io_ring_submit_lock(ctx, issue_flags);
- ret = __io_buffer_unregister_bvec(ctx, buf);
+ if (reg)
+ ret = __io_buffer_register_bvec(ctx, buf);
+ else
+ ret = __io_buffer_unregister_bvec(ctx, buf);
io_ring_submit_unlock(ctx, issue_flags);
return ret;
}
+
+static int io_buffer_reg_unreg_bvec(struct io_ring_ctx *ctx,
+ struct io_buf_data *buf,
+ unsigned int issue_flags,
+ bool reg)
+{
+ struct io_ring_ctx *remote_ctx = ctx;
+ struct file *file = NULL;
+ int ret;
+
+ if (buf->has_fd) {
+ file = io_uring_register_get_file(buf->ring_fd, buf->registered_fd);
io_uring_register_get_file() accesses task private data and the request
doesn't control from which task it's executed. IOW, you can't use the
helper here. It can be iowq or sqpoll, but either way nothing is
promised.
Good catch!
Actually ublk uring_cmd is guaranteed to be issued from user context.
We can enhance it by failing buffer register:
if ((current->flags & PF_KTHREAD) || (issue_flags & IO_URING_F_IOWQ))
return -EACCESS;
Can it somehow check that current matches the desired task? That's exactly
the condition where it can go wrong, and that's much better than listing all
corner cases that might change.
Just to avoid confusion, it's not guaranteed by io_uring it'll be run from
the "right" task. If that changes in the future, either the ublk uapi should
be mandating the user to fall back to something else like regular fds, or
ublk will need to handle it somehow.
+ if (IS_ERR(file))
+ return PTR_ERR(file);
+ remote_ctx = file->private_data;
+ if (!remote_ctx)
+ return -EINVAL;
nit: this check is not needed.
OK.
+ }
+
+ if (remote_ctx == ctx) {
+ do_reg_unreg_bvec(ctx, buf, issue_flags, reg);
+ } else {
+ if (!(issue_flags & IO_URING_F_UNLOCKED))
+ mutex_unlock(&ctx->uring_lock);
We shouldn't be dropping the lock in random helpers, for example
it'd be pretty nasty suspending a submission loop with a submission
from another task.
You can try lock first, if fails it'll need a fresh context via
iowq to be task-work'ed into the ring. see msg_ring.c for how
it's done for files.
Looks trylock is better, will take this approach by returning -EAGAIN,
and let ublk driver retry.
Is there a reliable fall back path for the userspace? Hammering the
same thing until it succeeds in never a good option.
--
Pavel Begunkov