On Thu, Jun 12, 2025 at 12:04:39PM +0000, Yoav Cohen wrote:
> Hi Ming,
>
> So I know it's a radical situation, but my only concern is this:
>
> 0) In our application the IO timeout may be set to a few minutes, as
>    the IO goes over the network.
> 1) Let's assume we have 1 queue with QD=1.
> 2) The only IO is in the userspace application, but as we send the IOs
>    over the network it may be stuck due to connectivity issues.
> 3) The user is trying to upgrade/stop the application, so we issue
>    QUIESCE_DEV with an infinite timeout, as we want to ensure it works.
> 4) We are now stuck until the network connection recovers or our
>    datapath somehow issues the COMMIT_AND_FETCH back to the kernel, so
>    that we can get the ABORT later and QUIESCE_DEV can finish.
>
> The problem is that I don't want to wait for this IO until recovery,
> but on the other hand I don't want to complete the IO with an error to
> the user.
> So in this case I guess we can abort the application or something, but
> maybe it would be cleaner if, for QUIESCE_DEV, another SQE per queue
> or something had to be issued, so that the application can be notified
> about it this way.
>
> Anyway, in our case it will be super rare for there to be a queue
> without an idle operation while the network is down, but we try to be
> as complete as possible.
> What do you think?

Hi Yoav,

The multishot approach for fetching io commands should address the issue
in the single queue depth case:

https://github.com/ming1/linux/commits/ublk2-cmd-batch.v3/

With it, there is always one multishot FETCH_IO_CMDS for notifying the
ublk server of new io commands (requests) in a batched way.

Thanks,
Ming