[PATCH 1/5] fs: gate final fput task_work on PF_NO_TASKWORK

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



fput currently gates whether or not a task can run task_work on the
PF_KTHREAD flag, which excludes kernel threads as they don't usually run
task_work as they never exit to userspace. This punts the final fput
done from a kthread to a delayed work item instead of using task_work.

It's perfectly viable to have the final fput done by the kthread itself,
as long as it will actually run the task_work. Add a PF_NO_TASKWORK flag
which is set by default by a kernel thread, and gate the task_work fput
on that instead. This enables a kernel thread to clear this flag
temporarily while putting files, as long as it runs its task_work
manually.

This enables users like io_uring to ensure that when the final fput of a
file is done as part of ring teardown to run the local task_work and
hence know that all files have been properly put, without needing to
resort to workqueue flushing tricks which can deadlock.

No functional changes in this patch.

Cc: Christian Brauner <brauner@xxxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
 fs/file_table.c       | 2 +-
 include/linux/sched.h | 2 +-
 kernel/fork.c         | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index c04ed94cdc4b..e3c3dd1b820d 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -521,7 +521,7 @@ static void __fput_deferred(struct file *file)
 		return;
 	}
 
-	if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
+	if (likely(!in_interrupt() && !(task->flags & PF_NO_TASKWORK))) {
 		init_task_work(&file->f_task_work, ____fput);
 		if (!task_work_add(task, &file->f_task_work, TWA_RESUME))
 			return;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f96ac1982893..349c993fc32b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1736,7 +1736,7 @@ extern struct pid *cad_pid;
 						 * I am cleaning dirty pages from some other bdi. */
 #define PF_KTHREAD		0x00200000	/* I am a kernel thread */
 #define PF_RANDOMIZE		0x00400000	/* Randomize virtual address space */
-#define PF__HOLE__00800000	0x00800000
+#define PF_NO_TASKWORK		0x00800000	/* task doesn't run task_work */
 #define PF__HOLE__01000000	0x01000000
 #define PF__HOLE__02000000	0x02000000
 #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_mask */
diff --git a/kernel/fork.c b/kernel/fork.c
index c4b26cd8998b..8dd0b8a5348d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2261,7 +2261,7 @@ __latent_entropy struct task_struct *copy_process(
 		goto fork_out;
 	p->flags &= ~PF_KTHREAD;
 	if (args->kthread)
-		p->flags |= PF_KTHREAD;
+		p->flags |= PF_KTHREAD | PF_NO_TASKWORK;
 	if (args->user_worker) {
 		/*
 		 * Mark us a user worker, and block any signal that isn't
-- 
2.49.0





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux