fput currently gates whether or not a task can run task_work on the
PF_KTHREAD flag. This excludes kernel threads, which don't usually run
task_work because they never exit to userspace. As a result, the final
fput done from a kthread is punted to a delayed work item instead of
using task_work.

It's perfectly viable to have the final fput done by the kthread itself,
as long as it will actually run the task_work. Add a PF_NO_TASKWORK flag
which is set by default for kernel threads, and gate the task_work fput
on that instead. This enables a kernel thread to clear this flag
temporarily while putting files, as long as it runs its task_work
manually.

This lets users like io_uring run the local task_work when the final
fput of a file is done as part of ring teardown, and hence know that all
files have been properly put, without needing to resort to workqueue
flushing tricks which can deadlock.

No functional changes in this patch.

Cc: Christian Brauner <brauner@xxxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
 fs/file_table.c       | 2 +-
 include/linux/sched.h | 2 +-
 kernel/fork.c         | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index c04ed94cdc4b..e3c3dd1b820d 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -521,7 +521,7 @@ static void __fput_deferred(struct file *file)
 		return;
 	}
 
-	if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
+	if (likely(!in_interrupt() && !(task->flags & PF_NO_TASKWORK))) {
 		init_task_work(&file->f_task_work, ____fput);
 		if (!task_work_add(task, &file->f_task_work, TWA_RESUME))
 			return;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f96ac1982893..349c993fc32b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1736,7 +1736,7 @@ extern struct pid *cad_pid;
 						 * I am cleaning dirty pages from some other bdi. */
 #define PF_KTHREAD		0x00200000	/* I am a kernel thread */
 #define PF_RANDOMIZE		0x00400000	/* Randomize virtual address space */
-#define PF__HOLE__00800000	0x00800000
+#define PF_NO_TASKWORK		0x00800000	/* task doesn't run task_work */
 #define PF__HOLE__01000000	0x01000000
 #define PF__HOLE__02000000	0x02000000
 #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_mask */
diff --git a/kernel/fork.c b/kernel/fork.c
index c4b26cd8998b..8dd0b8a5348d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2261,7 +2261,7 @@ __latent_entropy struct task_struct *copy_process(
 		goto fork_out;
 	p->flags &= ~PF_KTHREAD;
 	if (args->kthread)
-		p->flags |= PF_KTHREAD;
+		p->flags |= PF_KTHREAD | PF_NO_TASKWORK;
 	if (args->user_worker) {
 		/*
 		 * Mark us a user worker, and block any signal that isn't
-- 
2.49.0
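
For illustration only, the pattern the commit message describes (a kernel
thread clearing the flag temporarily, putting files, then running its
task_work by hand) can be modelled in plain userspace C. This is a sketch
under stated assumptions, not kernel code: only the two flag values come
from the patch, while struct model_task, struct model_file, model_fput(),
model_task_work_run() and model_kthread_put_files() are hypothetical
stand-ins for the real task_struct, struct file, fput path and
task_work_run().

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values taken from the patch; everything else is a hypothetical model. */
#define PF_KTHREAD	0x00200000u
#define PF_NO_TASKWORK	0x00800000u

struct model_file {
	int refcount;
	bool released;			/* set once the final put has completed */
	struct model_file *next;	/* link in whichever pending list holds it */
};

struct model_task {
	unsigned int flags;
	struct model_file *task_work;	/* files queued for this task to release */
};

/* Stand-in for the delayed work item used when task_work is unavailable. */
static struct model_file *delayed_list;

/* Drop one reference; on the final put, pick the deferral path by flag. */
static void model_fput(struct model_task *task, struct model_file *f)
{
	if (--f->refcount > 0)
		return;
	if (!(task->flags & PF_NO_TASKWORK)) {
		f->next = task->task_work;	/* task_work path */
		task->task_work = f;
	} else {
		f->next = delayed_list;		/* delayed work path */
		delayed_list = f;
	}
}

/* Run everything queued on the task; returns how many files were released. */
static int model_task_work_run(struct model_task *task)
{
	int n = 0;

	while (task->task_work) {
		struct model_file *f = task->task_work;

		task->task_work = f->next;
		f->released = true;
		n++;
	}
	return n;
}

/* Clear the flag, put the files, run task_work manually, restore the flag. */
static int model_kthread_put_files(struct model_task *task,
				   struct model_file **files, size_t nr)
{
	int released;

	task->flags &= ~PF_NO_TASKWORK;
	for (size_t i = 0; i < nr; i++)
		model_fput(task, files[i]);
	released = model_task_work_run(task);
	task->flags |= PF_NO_TASKWORK;
	return released;
}
```

In this model, a task created with PF_KTHREAD | PF_NO_TASKWORK sends a final
put to delayed_list, matching the unchanged default behaviour. Calling
model_kthread_put_files() instead releases the files synchronously, so the
caller knows they are all put by the time it returns.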