Re: [PATCH RFC 3/4] pidfd: improve uapi when task isn't found

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/03, Christian Brauner wrote:
>
> We currently report EINVAL whenever a struct pid has no tasked attached
> anymore thereby conflating two concepts:
>
> (1) The task has already been reaped.
> (2) The caller requested a pidfd for a thread-group leader but the pid
>     actually references a struct pid that isn't used as a thread-group
>     leader.
>
> This is causing issues for non-threaded workloads as in [1].
>
> This patch tries to allow userspace to distinguish between (1) and (2).
> This is racy of course but that shouldn't matter.
>
> Link: https://github.com/systemd/systemd/pull/36982 [1]
> Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>

For this series:

Reviewed-by: Oleg Nesterov <oleg@xxxxxxxxxx>


But I have a couple of cosmetic nits...

>  int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
>  {
> -	bool thread = flags & PIDFD_THREAD;
> +	int err = 0;
>
> -	if (!pid_has_task(pid, thread ? PIDTYPE_PID : PIDTYPE_TGID))
> -		return -EINVAL;
> +	if (!(flags & PIDFD_THREAD)) {
> +		/*
> +		 * If this is struct pid isn't used as a thread-group
> +		 * leader pid but the caller requested to create a
> +		 * thread-group leader pidfd then report ENOENT to the
> +		 * caller as a hint.
> +		 */
> +		if (!pid_has_task(pid, PIDTYPE_TGID))
> +			err = -ENOENT;
> +	}
> +
> +	/*
> +	 * If this wasn't a thread-group leader struct pid or the task
> +	 * got reaped in the meantime report -ESRCH to userspace.
> +	 *
> +	 * This is racy of course. This could've not been a thread-group
> +	 * leader struct pid and we set ENOENT above but in the meantime
> +	 * the task got reaped. Or there was a multi-threaded-exec by a
> +	 * subthread and we were a thread-group leader but now got
> +	 * killed.

The comment about the multi-threaded-exec looks a bit misleading to me.
If this pid is a group-leader-pid and we race with de_thread() which does

		exchange_tids(tsk, leader);
		transfer_pid(leader, tsk, PIDTYPE_TGID);

nothing "bad" can happen, both pid_has_task(PIDTYPE_PID) or
pid_has_task(PIDTYPE_TGID) can't return NULL during (or after) this
transition.

hlists_swap_heads_rcu() or hlist_replace_rcu() can't make
hlist_head->first == NULL during this transition...

Or I misunderstood the comment?

And... the code looks a bit overcomplicated to me, why not simply

	int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
	{
		if (!pid_has_task(pid, PIDTYPE_PID))
			return -ESRCH;

		if (!(flags & PIDFD_THREAD) && !pid_has_task(pid, PIDTYPE_TGID))
			return -ENOENT;

		return __pidfd_prepare(pid, flags, ret);
	}

? Of course, the comments should stay.

But again, this is cosmetic/subjective, please do what you like more.

Oleg.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux