Re: [RFC PATCH 3/7] io_uring: support to register bvec buffer to specified io_uring

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 28, 2025 at 05:43:12PM -0700, Caleb Sander Mateos wrote:
> On Mon, Apr 28, 2025 at 2:44 AM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> >
> > Extend io_buffer_register_bvec() and io_buffer_unregister_bvec() for
> > supporting to register/unregister bvec buffer to specified io_uring,
> > which FD is usually passed from userspace.
> >
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > ---
> >  include/linux/io_uring/cmd.h |  4 ++
> >  io_uring/rsrc.c              | 83 +++++++++++++++++++++++++++---------
> >  2 files changed, 67 insertions(+), 20 deletions(-)
> >
> > diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
> > index 78fa336a284b..7516fe5cd606 100644
> > --- a/include/linux/io_uring/cmd.h
> > +++ b/include/linux/io_uring/cmd.h
> > @@ -25,6 +25,10 @@ struct io_uring_cmd_data {
> >
> >  struct io_buf_data {
> >         unsigned short index;
> > +       bool has_fd;
> > +       bool registered_fd;
> > +
> > +       int ring_fd;
> >         struct request *rq;
> >         void (*release)(void *);
> >  };
> > diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
> > index 5f8ab130a573..701dd33fecf7 100644
> > --- a/io_uring/rsrc.c
> > +++ b/io_uring/rsrc.c
> > @@ -969,21 +969,6 @@ static int __io_buffer_register_bvec(struct io_ring_ctx *ctx,
> >         return 0;
> >  }
> >
> > -int io_buffer_register_bvec(struct io_uring_cmd *cmd,
> > -                           struct io_buf_data *buf,
> > -                           unsigned int issue_flags)
> > -{
> > -       struct io_ring_ctx *ctx = cmd_to_io_kiocb(cmd)->ctx;
> > -       int ret;
> > -
> > -       io_ring_submit_lock(ctx, issue_flags);
> > -       ret = __io_buffer_register_bvec(ctx, buf);
> > -       io_ring_submit_unlock(ctx, issue_flags);
> > -
> > -       return ret;
> > -}
> > -EXPORT_SYMBOL_GPL(io_buffer_register_bvec);
> > -
> >  static int __io_buffer_unregister_bvec(struct io_ring_ctx *ctx,
> >                                        struct io_buf_data *buf)
> >  {
> > @@ -1006,19 +991,77 @@ static int __io_buffer_unregister_bvec(struct io_ring_ctx *ctx,
> >         return 0;
> >  }
> >
> > -int io_buffer_unregister_bvec(struct io_uring_cmd *cmd,
> > -                             struct io_buf_data *buf,
> > -                             unsigned int issue_flags)
> > +static inline int do_reg_unreg_bvec(struct io_ring_ctx *ctx,
> > +                                   struct io_buf_data *buf,
> > +                                   unsigned int issue_flags,
> > +                                   bool reg)
> >  {
> > -       struct io_ring_ctx *ctx = cmd_to_io_kiocb(cmd)->ctx;
> >         int ret;
> >
> >         io_ring_submit_lock(ctx, issue_flags);
> > -       ret = __io_buffer_unregister_bvec(ctx, buf);
> > +       if (reg)
> > +               ret = __io_buffer_register_bvec(ctx, buf);
> > +       else
> > +               ret = __io_buffer_unregister_bvec(ctx, buf);
> 
> It feels like unifying __io_buffer_register_bvec() and
> __io_buffer_unregister_bvec() would belong better in the prior patch
> that changes their signatures.

Can you share how to do above in previous patch?

> 
> >         io_ring_submit_unlock(ctx, issue_flags);
> >
> >         return ret;
> >  }
> > +
> > +static int io_buffer_reg_unreg_bvec(struct io_ring_ctx *ctx,
> > +                                   struct io_buf_data *buf,
> > +                                   unsigned int issue_flags,
> > +                                   bool reg)
> > +{
> > +       struct io_ring_ctx *remote_ctx = ctx;
> > +       struct file *file = NULL;
> > +       int ret;
> > +
> > +       if (buf->has_fd) {
> > +               file = io_uring_register_get_file(buf->ring_fd, buf->registered_fd);
> > +               if (IS_ERR(file))
> > +                       return PTR_ERR(file);
> 
> It would be good to avoid the overhead of this lookup and
> reference-counting in the I/O path. Would it be possible to move this
> lookup to when UBLK_IO_FETCH_REQ (and UBLK_IO_COMMIT_AND_FETCH_REQ, if
> it specifies a different ring_fd) is submitted? I guess that might
> require storing an extra io_ring_ctx pointer in struct ublk_io.

Let's start from the flexible way & simple implementation.

Any optimization & improvement can be done as follow-up.

Each command may register buffer to different io_uring context,
it can't be done in UBLK_IO_FETCH_REQ stage, because new IO with same
tag may register buffer to new io_uring context.

But it can be optimized in future for one specific use case with feature
flag.

> 
> > +               remote_ctx = file->private_data;
> > +               if (!remote_ctx)
> > +                       return -EINVAL;
> > +       }
> > +
> > +       if (remote_ctx == ctx) {
> > +               do_reg_unreg_bvec(ctx, buf, issue_flags, reg);
> > +       } else {
> > +               if (!(issue_flags & IO_URING_F_UNLOCKED))
> > +                       mutex_unlock(&ctx->uring_lock);
> > +
> > +               do_reg_unreg_bvec(remote_ctx, buf, IO_URING_F_UNLOCKED, reg);
> > +
> > +               if (!(issue_flags & IO_URING_F_UNLOCKED))
> > +                       mutex_lock(&ctx->uring_lock);
> > +       }
> > +
> > +       if (file)
> > +               fput(file);
> > +
> > +       return ret;
> > +}
> > +
> > +int io_buffer_register_bvec(struct io_uring_cmd *cmd,
> > +                           struct io_buf_data *buf,
> > +                           unsigned int issue_flags)
> 
> If buf->has_fd is set, this struct io_uring_cmd *cmd is unused. Could
> you define separate functions that take a struct io_uring_cmd * vs. a
> ring_fd?

The ring_fd may point to the same io_uring context with 'io_uring_cmd',
we need this information for dealing with io_uring context lock.


Thanks, 
Ming





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux