Hi Yoav,

On Sun, Apr 20, 2025 at 08:57:23AM +0000, Yoav Cohen wrote:
> Hi Ming,
>
> Thank you very much!
> The above seems to match our requirements.
> Just to be sure, do you want me to implement it and issue a patch, or do you plan to add it to your plan?

It would be great if you'd like to implement the feature. Please add
one selftest case (add quiesce command & one test case) together with the
driver change.

Otherwise, I can add it to my todo list.

Thanks,
Ming

>
> Thanks
>
> ________________________________________
> From: Ming Lei <ming.lei@xxxxxxxxxx>
> Sent: Wednesday, April 16, 2025 12:12 PM
> To: Yoav Cohen
> Cc: Uday Shankar; linux-block@xxxxxxxxxxxxxxx; axboe@xxxxxxxxx
> Subject: Re: ublk: Graceful Upgrade of ublk server application
>
> On Wed, Apr 16, 2025 at 08:16:44AM +0000, Yoav Cohen wrote:
> > Hi,
> >
> > The use case is, as you say, to replace the binary (update) without making the bdev disappear.
> > Currently I don't even use user_copy (to avoid one more system call), so the io buffer is also part of the sqe, which prevents me from freeing it from the userspace perspective.
> > So yes, even ABORT_URING_CMD by given tag can be enough.
> > What do you think?
>
> I think the requirement is reasonable, which could be one QUIESCE_DEV command:
>
> - only usable for UBLK_F_USER_RECOVERY
>
> - need ublk server cooperation for handling inflight IO command
>
> - fall back to normal cancel code path in case that io_uring is exiting
>
> The implementation shouldn't be hard:
>
> - mark ubq->canceling as true
>         - freeze request queue
>         - mark ubq->canceling as true
>         - unfreeze request queue
>
> - cancel all uring_cmd with UBLK_IO_RES_ABORT (*)
>         - now there can't be new ublk IO request coming, and ublk server won't
>         send new uring_cmd either
>
>         - the gatekeeper code of __ublk_ch_uring_cmd() should be reliable to prevent
>         any new uring_cmd from malicious application, maybe need audit & refactoring
>         a bit
>
>         - need ublk server to handle UBLK_IO_RES_ABORT correctly: release all
>         kinds of resources, close ublk char device...
>
> - wait until ublk char device is released by checking UB_STATE_OPEN
>
> - now ublk state becomes UBLK_S_DEV_QUIESCED or UBLK_S_DEV_FAIL_IO,
> and userspace can replace the binary and recover device with new
> application via UBLK_CMD_START_USER_RECOVERY & UBLK_CMD_END_USER_RECOVERY
>
> Please let us know if the above works for your requirement.
>
> Thanks,
> Ming
>
--
Ming
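
For reference, the ordering of the quiesce steps quoted above can be
modelled in a few lines of userspace C. This is only a toy sketch of the
sequence, not driver code: every toy_* name is hypothetical, the struct
fields merely stand in for ubq->canceling, the frozen request queue and
UB_STATE_OPEN, and the real locking, io_uring cancellation and the
UBLK_S_DEV_FAIL_IO outcome are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_DEPTH 4

enum toy_dev_state { TOY_DEV_LIVE, TOY_DEV_QUIESCED };
enum toy_cmd_state { TOY_CMD_IDLE, TOY_CMD_PENDING, TOY_CMD_ABORTED };

/* Hypothetical stand-in for the device; not the kernel's struct layout. */
struct toy_ublk_dev {
	bool user_recovery;	/* models UBLK_F_USER_RECOVERY */
	bool canceling;		/* models ubq->canceling */
	bool frozen;		/* models a frozen request queue */
	bool char_dev_open;	/* models UB_STATE_OPEN */
	enum toy_dev_state state;
	enum toy_cmd_state cmds[TOY_QUEUE_DEPTH];
};

void toy_dev_init(struct toy_ublk_dev *d, bool user_recovery)
{
	assert(d != NULL);
	d->user_recovery = user_recovery;
	d->canceling = false;
	d->frozen = false;
	d->char_dev_open = true;
	d->state = TOY_DEV_LIVE;
	for (size_t i = 0; i < TOY_QUEUE_DEPTH; i++)
		d->cmds[i] = TOY_CMD_IDLE;
}

/* Gatekeeper: refuse any new command once canceling is set. */
bool toy_submit_cmd(struct toy_ublk_dev *d, size_t tag)
{
	if (d->canceling || tag >= TOY_QUEUE_DEPTH)
		return false;
	d->cmds[tag] = TOY_CMD_PENDING;
	return true;
}

/* Server side: after seeing the abort result it closes the char device. */
void toy_server_release(struct toy_ublk_dev *d)
{
	d->char_dev_open = false;
}

/* Steps 1 and 2: set canceling under a freeze, then abort pending commands. */
int toy_quiesce_begin(struct toy_ublk_dev *d)
{
	if (!d->user_recovery)
		return -1;	/* only usable with user recovery enabled */

	d->frozen = true;	/* freeze request queue */
	d->canceling = true;	/* no new IO is dispatched from here on */
	d->frozen = false;	/* unfreeze request queue */

	for (size_t i = 0; i < TOY_QUEUE_DEPTH; i++)
		if (d->cmds[i] == TOY_CMD_PENDING)
			d->cmds[i] = TOY_CMD_ABORTED;
	return 0;
}

/* Steps 3 and 4: complete only once the server has released the char device. */
bool toy_quiesce_finish(struct toy_ublk_dev *d)
{
	if (d->char_dev_open)
		return false;	/* still waiting for the server */
	d->state = TOY_DEV_QUIESCED;
	return true;
}
```

In this model a caller runs toy_quiesce_begin(), waits for the server to
call toy_server_release(), and then polls toy_quiesce_finish() until it
returns true; only after that would the binary be replaced and recovery
started.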