On Thu, Apr 24, 2025 at 06:19:29PM +0000, Ofer Oshri wrote:
> Hi,
>
> Our code uses a single io_uring per core, which is shared among all block devices - meaning each block device on a core uses the same io_uring.
>

Do I understand correctly that you are using a single io_uring to serve one hw queue of multiple ublk devices?

> Let’s say the size of the io_uring is N. Each block device submits M UBLK_U_IO_FETCH_REQ requests. As a result, with the current implementation, we can only support up to P block devices, where P = N / M. This means that when we attempt to support block device P+1, it will fail due to io_uring exhaustion.
>

Suppose N is the SQ size. Then the number of supported ublk devices can be much bigger than N/M, because any SQE is freed and available again once it has been issued to the kernel; here, the SQE is free for reuse as soon as one UBLK_U_IO_FETCH_REQ uring_cmd has been issued to the ublk driver.

That is to say, you can queue an arbitrary number of uring_cmds with a fixed SQ size, since N is just the submission batch size. But the ublk server implementation needs to flush the queued SQEs if io_uring_get_sqe() returns NULL.

> To address this, we’d like to propose an enhancement to the ublk driver. The idea is inspired by the multi-shot concept, where a single request allows multiple replies.
>
> We propose adding:
>
> 1. A method to register a pool of ublk_io commands.
>
> 2. Introduce a new UBLK_U_IO_FETCH_REQ_MULTISHOT operation, where a pool of ublk_io commands is bound to a block device. Then, upon receiving a new BIO, the ublk driver can select a reply from the pre-registered pool and push it to the io_uring.
>
> 3. Introduce a new UBLK_U_IO_COMMIT_REQ command to explicitly mark the completion of a request. In this case, the ublk driver returns the request to the pool. We can retain the existing UBLK_U_IO_COMMIT_AND_FETCH_REQ command, but for multi-shot scenarios, the “FETCH” operation would simply mean returning the request to the pool.
>
> What are your thoughts on this approach?

I think we need to understand the real problem you want to address before digging into the uring_cmd pool concept.

1) To save memory for lots of ublk devices?

- so far, the main preallocation should come from blk-mq requests, and as Caleb mentioned, the state memory from both ublk and io_uring isn't very big

2) A need to support as many ublk devices as possible in a single io_uring context with limited SQ/CQ size?

- it may not be a big problem, because a fixed SQ size still allows issuing an arbitrary number of uring_cmds

- but the CQ size may limit the number of completed uring_cmds used for notifying incoming ublk requests; is this your problem? Jens has added ring resize via IORING_REGISTER_RESIZE_RINGS:

https://lore.kernel.org/io-uring/20241022021159.820925-1-axboe@xxxxxxxxx/

3) Or some other requirement?

Thanks,
Ming
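
For reference, the flush-on-full pattern mentioned above (submit the queued SQEs when io_uring_get_sqe() returns NULL, then retry) can be sketched roughly as below. This is only a toy model, not the ublk server code and not the liburing API: struct toy_ring, toy_get_sqe(), toy_submit(), get_sqe_or_flush() and queue_fetch_cmds() are hypothetical names made up for illustration, and SQ_SIZE = 8 is an arbitrary choice. In a real server the corresponding calls would be io_uring_get_sqe() and io_uring_submit().

```c
#include <assert.h>
#include <stddef.h>

#define SQ_SIZE 8 /* N: hypothetical SQ depth, chosen only for the example */

/* Toy stand-in for a submission queue; NOT the liburing struct io_uring. */
struct toy_ring {
	int sqes[SQ_SIZE];     /* each slot just records a command id */
	unsigned queued;       /* SQEs prepared but not yet submitted */
	unsigned long issued;  /* total commands handed to the "kernel" */
	unsigned long flushes; /* number of non-empty submits */
};

/* Models io_uring_get_sqe(): returns NULL when every SQ slot is in use. */
static int *toy_get_sqe(struct toy_ring *r)
{
	if (r->queued == SQ_SIZE)
		return NULL;
	return &r->sqes[r->queued++];
}

/* Models io_uring_submit(): issues all queued SQEs and frees their slots. */
static unsigned toy_submit(struct toy_ring *r)
{
	unsigned n = r->queued;

	r->issued += n;
	r->queued = 0;
	if (n)
		r->flushes++;
	return n;
}

/* Get a free SQE, flushing the queued ones first if the SQ is full. */
static int *get_sqe_or_flush(struct toy_ring *r)
{
	int *sqe = toy_get_sqe(r);

	if (!sqe) {
		toy_submit(r);
		sqe = toy_get_sqe(r);
	}
	assert(sqe != NULL); /* a slot is always free right after a flush */
	return sqe;
}

/* Queue `depth` (M) fetch commands for each of `ndevs` devices. */
static unsigned long queue_fetch_cmds(struct toy_ring *r, unsigned ndevs,
				      unsigned depth)
{
	for (unsigned dev = 0; dev < ndevs; dev++) {
		for (unsigned tag = 0; tag < depth; tag++) {
			int *sqe = get_sqe_or_flush(r);

			*sqe = (int)(dev * depth + tag);
		}
	}
	toy_submit(r); /* flush whatever is still queued */
	return r->issued;
}
```

With a zero-initialized ring, queue_fetch_cmds(&ring, 16, 4) issues all 64 commands through the 8-entry queue, although N/M here is only 2. The model covers the SQ side only; it says nothing about CQ sizing.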