On Mon, May 12, 2025 at 01:25:35PM -0700, Caleb Sander Mateos wrote:
> On Fri, May 9, 2025 at 8:06 AM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> >
> > `ublk_queue` is read only for io buffer register/unregister command. Both
> > `ublk_io` and block layer request are read-only for IO buffer register/
> > unregister command.
> >
> > So the two command can be issued from other task contexts.
> >
> > Not same with other three ublk commands, these two are for handling target
> > IO only, we shouldn't limit their issue task context, otherwise it becomes
> > hard for ublk server(backend) to use zero copy feature.
> >
> > Reported-by: Uday Shankar <ushankar@xxxxxxxxxxxxxxx>
> > Closes: https://lore.kernel.org/linux-block/20250410-ublk_task_per_io-v3-2-b811e8f4554a@xxxxxxxxxxxxxxx/
>
> I don't agree that this change obviates the need for per-io tasks.
> Being able to perform zero-copy buffer registration on other threads

You may have misunderstood the concern: this patch isn't for solving
load balancing, it is just for making zero copy easier to use.

Unlike the other uring_cmds (FETCH, COMMIT_AND_FETCH), registering an io
buffer is part of target IO handling, so it shouldn't be limited to the
ubq_daemon context. Uday did mention this point in the above link.

> can't help with spreading the load if the ublk server isn't using
> zero-copy in the first place. And sending I/Os between ublk server
> threads adds cross-CPU synchronization overhead (as Uday points out in
> the commit message for his change). Distributing I/Os among the ublk
> server threads at the point where the blk-mq request is queued seems
> like a natural place to do load balancing, as the request is already
> being sent between CPUs there.

I do agree load balancing should be addressed, together with relaxing
the existing ublk server context limitation. Uday's patch can be a good
start on both the driver and selftest code side.

However, spreading load in a static way may not solve this problem
completely. IMO it may be a transitional solution, but it is fine to
move on with it once the comments are addressed.

We should support dynamic load balancing in future by allowing IO to be
migrated to other contexts at runtime:

- it should be enough to add one per-io spin lock on the driver side;
  there is no contention on it for a good ublk implementation

	https://github.com/ming1/linux/commits/ublk_task_neutral/

- when load isn't balanced or some task contexts are saturated, IO needs
  to migrate to other task contexts

- IO migration should only happen when load isn't balanced, and it won't
  be needed once load becomes balanced, so cross-CPU overhead isn't an
  issue

- the migration logic needs to be triggered from target code, but the
  mechanism can be implemented in the library

Almost all of Uday's selftest code can be reused for the above,
especially the nice ublk_thread abstraction. Maybe it can be named
ublk_task_ctx, and then ring_buf, eventfd and read_mshot notification
can be added to it to support IO migration.

Thanks,
Ming
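
To make the per-io lock and migrate-on-imbalance idea above a bit more concrete, here is a minimal userspace sketch. It is not code from the ublk driver or from the ublk_task_neutral branch. Every name in it (`task_ctx`, `io_slot`, `pick_target`, `migrate_io`, `NR_CTX`) is hypothetical, and a C11 `atomic_flag` stands in for the per-io spin lock. The `threshold` parameter is likewise an assumption about how "load isn't balanced" might be decided.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CTX 4	/* hypothetical number of task contexts */

/* Hypothetical per-task context: only tracks how many IOs it owns. */
struct task_ctx {
	atomic_int inflight;
};

/* Hypothetical per-IO state: a tiny spin lock plus the owning context. */
struct io_slot {
	atomic_flag lock;
	int owner;
};

static void ctx_init(struct task_ctx *ctx)
{
	for (int i = 0; i < NR_CTX; i++)
		atomic_init(&ctx[i].inflight, 0);
}

static void io_init(struct io_slot *io, struct task_ctx *ctx, int owner)
{
	atomic_flag_clear(&io->lock);
	io->owner = owner;
	atomic_fetch_add(&ctx[owner].inflight, 1);
}

/*
 * Return the least loaded context if it is at least `threshold` IOs
 * lighter than `cur`; otherwise return `cur` (no migration needed).
 */
static int pick_target(struct task_ctx *ctx, int cur, int threshold)
{
	int cur_load = atomic_load(&ctx[cur].inflight);
	int best = cur, best_load = cur_load;

	for (int i = 0; i < NR_CTX; i++) {
		int load = atomic_load(&ctx[i].inflight);

		if (load < best_load) {
			best = i;
			best_load = load;
		}
	}
	return (cur_load - best_load >= threshold) ? best : cur;
}

/* Move one IO to `target` under its own lock; true if ownership changed. */
static bool migrate_io(struct io_slot *io, struct task_ctx *ctx, int target)
{
	bool moved = false;

	assert(target >= 0 && target < NR_CTX);

	/* Per-io spin lock: only contended if two contexts touch this IO. */
	while (atomic_flag_test_and_set_explicit(&io->lock, memory_order_acquire))
		;
	if (io->owner != target) {
		atomic_fetch_sub(&ctx[io->owner].inflight, 1);
		atomic_fetch_add(&ctx[target].inflight, 1);
		io->owner = target;
		moved = true;
	}
	atomic_flag_clear_explicit(&io->lock, memory_order_release);
	return moved;
}
```

In this sketch the target code would call `pick_target()` for its current context and only call `migrate_io()` when a different context comes back, so the lock is taken only while load is unbalanced. Notifying the new owner (for example through a ring buffer plus eventfd, as suggested above) is left out.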