Hi! Below is a summary of a discussion about the Workqueue API and cpu isolation considerations. Details and more information are available here: "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND." https://lore.kernel.org/all/20250221112003.1dSuoGyc@xxxxxxxxxxxxx/ === Current situation: problems === Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is set to the housekeeping CPUs, for !WQ_UNBOUND the local CPU is selected. This leads to different scenarios if a work item is scheduled on an isolated CPU where "delay" value is 0 or greater then 0: schedule_delayed_work(, 0); This will be handled by __queue_work() that will queue the work item on the current local (isolated) CPU, while: schedule_delayed_work(, 1); Will move the timer on an housekeeping CPU, and schedule the work there. Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. === Plan and future plans === This patchset is the first stone on a refactoring needed in order to address the points aforementioned; it will have a positive impact also on the cpu isolation, in the long term, moving away percpu workqueue in favor to an unbound model. These are the main steps: 1) API refactoring (that this patch is introducing) - Make more clear and uniform the system wq names, both per-cpu and unbound. This to avoid any possible confusion on what should be used. - Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND, introduced in this patchset and used on all the callers that are not currently using WQ_UNBOUND. WQ_UNBOUND will be removed in a future release cycle. Most users don't need to be per-cpu, because they don't have locality requirements, because of that, a next future step will be make "unbound" the default behavior. 2) Check who really needs to be per-cpu - Remove the WQ_PERCPU flag when is not strictly required. 3) Add a new API (prefer local cpu) - There are users that don't require a local execution, like mentioned above; despite that, local execution yeld to performance gain. This new API will prefer the local execution, without requiring it. === Introduced Changes by this series === 1) [P 1-2] Replace use of system_wq and system_unbound_wq system_wq is a per-CPU workqueue, but his name is not clear. system_unbound_wq is to be used when locality is not required. Because of that, system_wq has been renamed in system_percpu_wq, and system_unbound_wq has been renamed in system_dfl_wq. 2) [P 3] add WQ_PERCPU to remaining alloc_workqueue() users Every alloc_workqueue() caller should use one among WQ_PERCPU or WQ_UNBOUND. This is actually enforced warning if both or none of them are present at the same time. WQ_UNBOUND will be removed in a next release cycle. === For Maintainers === There are prerequisites for this series, already merged in the master branch. The commits are: 128ea9f6ccfb6960293ae4212f4f97165e42222d ("workqueue: Add system_percpu_wq and system_dfl_wq") 930c2ea566aff59e962c50b2421d5fcc3b98b8be ("workqueue: Add new WQ_PERCPU flag") Thanks! Marco Crivellari (3): fs: replace use of system_unbound_wq with system_dfl_wq fs: replace use of system_wq with system_percpu_wq fs: WQ_PERCPU added to alloc_workqueue users fs/afs/callback.c | 4 ++-- fs/afs/main.c | 4 ++-- fs/afs/write.c | 2 +- fs/aio.c | 2 +- fs/bcachefs/btree_write_buffer.c | 2 +- fs/bcachefs/io_read.c | 12 ++++++------ fs/bcachefs/journal_io.c | 2 +- fs/bcachefs/super.c | 10 +++++----- fs/btrfs/async-thread.c | 3 +-- fs/btrfs/block-group.c | 2 +- fs/btrfs/disk-io.c | 2 +- fs/btrfs/extent_map.c | 2 +- fs/btrfs/space-info.c | 4 ++-- fs/btrfs/zoned.c | 2 +- fs/ceph/super.c | 2 +- fs/dlm/lowcomms.c | 2 +- fs/dlm/main.c | 2 +- fs/ext4/mballoc.c | 2 +- fs/fs-writeback.c | 4 ++-- fs/fuse/dev.c | 2 +- fs/fuse/inode.c | 2 +- fs/gfs2/main.c | 5 +++-- fs/gfs2/ops_fstype.c | 6 ++++-- fs/netfs/objects.c | 2 +- fs/netfs/read_collect.c | 2 +- fs/netfs/write_collect.c | 2 +- fs/nfs/namespace.c | 2 +- fs/nfs/nfs4renewd.c | 2 +- fs/nfsd/filecache.c | 2 +- fs/notify/mark.c | 4 ++-- fs/ocfs2/dlm/dlmdomain.c | 3 ++- fs/ocfs2/dlmfs/dlmfs.c | 3 ++- fs/quota/dquot.c | 2 +- fs/smb/client/cifsfs.c | 16 +++++++++++----- fs/smb/server/ksmbd_work.c | 2 +- fs/smb/server/transport_rdma.c | 3 ++- fs/super.c | 3 ++- fs/verity/verify.c | 2 +- fs/xfs/xfs_log.c | 3 +-- fs/xfs/xfs_mru_cache.c | 3 ++- fs/xfs/xfs_super.c | 15 ++++++++------- 41 files changed, 82 insertions(+), 69 deletions(-) -- 2.51.0