Re: [PATCH 2/3] ublk: add feature UBLK_F_QUIESCE

Ming Lei <ming.lei@xxxxxxxxxx> · Mon, 23 Jun 2025 10:13:33 +0800

On Thu, Jun 12, 2025 at 12:04:39PM +0000, Yoav Cohen wrote:
> 
> Hi Ming,
> 
> So I know it's a radical situation but my only concern is that:
> 
> 0) In our application the IO timeout may be set to a few minutes, as the IO goes over the network.
> 1) Let's assume we have 1 queue with QD=1.
> 2) The only IO is in the userspace application, but as we send the IOs over the network it may be stuck due to connectivity issues.
> 3) The user tries to upgrade/stop the application, so we issue Quiesce_DEV with an infinite timeout as we want to ensure it works.
> 4) We are now stuck until the network connection recovers, or until our datapath somehow issues the COMMIT_AND_FETCH back to the kernel so we can get the ABORT later and QUIESCE_DEV can finish.
> 
> The problem is that I don't want to wait for this IO until recovery, but on the other hand I don't want to complete the IO with an error to the user.
> So in this case I guess we can abort the application or something, but maybe it would be cleaner if Quiesce_DEV required issuing another SQE per queue or something, so we can notify the application about it this way.
> 
> Anyway, in our case it will also be super rare for there to be a queue without an idle operation while the network is down, but we try to be as complete as possible.
> What do you think?

Hi Yoav,

The multishot approach for fetching io commands should address the issue
in the single queue depth (QD=1) case:

https://github.com/ming1/linux/commits/ublk2-cmd-batch.v3/

With it, there is always one multishot FETCH_IO_CMDS for notifying the
ublk server of new io commands (requests) in batches.
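
To make the QD=1 concern concrete, here is a rough toy model in C of why
an always-armed multishot fetch helps. It is only a sketch: `toy_queue`,
`slot_owner` and `can_notify_server()` are made-up names for
illustration, not the ublk UAPI or anything taken from the branch above,
and it assumes the server can only be notified through a command that
the kernel currently holds.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_DEPTH 8

/* Hypothetical model, not kernel code: who currently holds each io slot. */
enum slot_owner {
	SLOT_HELD_BY_KERNEL,	/* per-io fetch command is pending in the kernel */
	SLOT_HELD_BY_SERVER,	/* io was dispatched, server has not committed it yet */
};

struct toy_queue {
	int depth;
	enum slot_owner slots[TOY_MAX_DEPTH];
	bool multishot_fetch_armed;	/* models one always-pending multishot fetch */
};

/*
 * Assumption: the kernel can only notify the server by completing a
 * command it currently holds. Return true if such a command exists.
 */
static bool can_notify_server(const struct toy_queue *q)
{
	assert(q->depth >= 0 && q->depth <= TOY_MAX_DEPTH);

	if (q->multishot_fetch_armed)
		return true;

	for (int i = 0; i < q->depth; i++) {
		if (q->slots[i] == SLOT_HELD_BY_KERNEL)
			return true;
	}
	return false;
}
```

With depth = 1 and the only slot held by the server (the stuck network
IO), can_notify_server() is false without the multishot fetch and true
once it is armed.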

Thanks,
Ming