Hi,

Our code uses a single io_uring per core, shared among all block devices: every block device on a core uses the same io_uring.

Let’s say the size of the io_uring is N, and each block device submits M UBLK_U_IO_FETCH_REQ requests. With the current implementation we can therefore support at most P block devices, where P = N / M. An attempt to add block device P+1 fails due to io_uring exhaustion.

To address this, we’d like to propose an enhancement to the ublk driver. The idea is inspired by the multi-shot concept, where a single request allows multiple replies.

We propose to:

1. Add a method to register a pool of ublk_io commands.

2. Introduce a new UBLK_U_IO_FETCH_REQ_MULTISHOT operation, which binds a pool of ublk_io commands to a block device. Upon receiving a new BIO, the ublk driver selects a reply from the pre-registered pool and pushes it to the io_uring.

3. Introduce a new UBLK_U_IO_COMMIT_REQ command that explicitly marks the completion of a request. On completion, the ublk driver returns the request to the pool.

We can retain the existing UBLK_U_IO_COMMIT_AND_FETCH_REQ command, but in the multi-shot case the “FETCH” part would simply mean returning the request to the pool.

What are your thoughts on this approach?

Ofer
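P.S. To make the capacity limit concrete, here is a minimal sketch of the P = N / M arithmetic. It is a toy model, not ublk or io_uring code: the function names are hypothetical, and it assumes (as described above) that every outstanding UBLK_U_IO_FETCH_REQ holds one ring slot.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only. Assumption: each outstanding UBLK_U_IO_FETCH_REQ
 * occupies exactly one slot of the shared per-core io_uring.
 * Names here are hypothetical and not part of any ublk API.
 */

/* P = N / M: number of devices that fit when each pins fetch_per_dev slots. */
unsigned max_devices(unsigned ring_entries, unsigned fetch_per_dev)
{
    assert(fetch_per_dev > 0);          /* M must be at least 1 */
    return ring_entries / fetch_per_dev; /* integer division rounds down */
}

/* True if the dev_index-th device (1-based) can still post its M fetches. */
bool device_fits(unsigned ring_entries, unsigned fetch_per_dev,
                 unsigned dev_index)
{
    return dev_index <= max_devices(ring_entries, fetch_per_dev);
}
```

With illustrative values N = 256 and M = 32, max_devices(256, 32) gives P = 8, so device_fits(256, 32, 9) is false: the ninth device is the P+1 case. If a multi-shot fetch needed only one slot per device (M = 1, which is my assumption about how the proposal would be used), the same ring would fit 256 devices.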