Usually, execve() preserves the effective ids.  Many programs rely on
this to detect setuid/setgid execution and then take precautions, such
as disabling certain features or rejecting certain user input and
environment variables.

However, if NO_NEW_PRIVS is set, effective ids are always reset by
cap_bprm_creds_from_file(), but capabilities are not revoked.  That
means the process looks like it's not setuid/setgid, but has full
capabilities, and is effectively a superuser process.  This breaks
userspace assumptions.

It was argued [1] that this surprising behavior must not change
because programs might rely on it.  Of course, this leaves many
programs vulnerable, but if we decide the behavior must remain, we
should at least document it with a warning.

[1] https://lore.kernel.org/lkml/87h61t7siv.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Signed-off-by: Max Kellermann <max.kellermann@xxxxxxxxx>
---
 Documentation/userspace-api/no_new_privs.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/userspace-api/no_new_privs.rst b/Documentation/userspace-api/no_new_privs.rst
index d060ea217ea1..89b0884991e9 100644
--- a/Documentation/userspace-api/no_new_privs.rst
+++ b/Documentation/userspace-api/no_new_privs.rst
@@ -29,6 +29,12 @@ bits will no longer change the uid or gid; file capabilities will not
 add to the permitted set, and LSMs will not relax constraints after
 execve.
 
+A successful execve call with ``no_new_privs`` will reset the
+effective uid/gid to the real uid/gid, but does not drop capabilities.
+This means that comparing effective and real ids is not a valid method
+to detect setuid/setgid execution; the proper way to do that is
+getauxval(AT_SECURE).
+
 To set ``no_new_privs``, use::
 
     prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
-- 
2.47.2
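
For illustration only (not part of the patch), a minimal userspace sketch of the two detection methods discussed above. The helper names ids_differ() and exec_is_secure() are hypothetical and chosen for this example. It assumes a Linux libc that provides getauxval() in <sys/auxv.h>.

```c
#include <stdbool.h>
#include <sys/auxv.h>
#include <unistd.h>

/*
 * Fragile check (hypothetical helper): infers setuid/setgid execution
 * by comparing real and effective ids.  Per the description above, this
 * can report "false" after an execve() with no_new_privs set, because
 * the effective ids have been reset to the real ids.
 */
static bool ids_differ(void)
{
    return getuid() != geteuid() || getgid() != getegid();
}

/*
 * Recommended check (hypothetical helper): reads the AT_SECURE entry
 * from the auxiliary vector.  A nonzero value means the kernel flagged
 * this execution as requiring secure-mode handling.
 */
static bool exec_is_secure(void)
{
    return getauxval(AT_SECURE) != 0;
}
```

A program that wants to ignore untrusted environment variables would gate that logic on exec_is_secure() rather than on ids_differ().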