Jann Horn <jannh@xxxxxxxxxx> writes: > On Tue, May 13, 2025 at 10:57 PM Kees Cook <kees@xxxxxxxxxx> wrote: >> On May 13, 2025 6:05:45 AM PDT, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: >> >Here is my proposal: *deny* exec of suid/sgid binaries if fs_struct is >> >shared. This will have to be checked for after the execing proc becomes >> >single-threaded ofc. >> >> Unfortunately the above Chrome helper is setuid and uses CLONE_FS. > > Chrome first launches a setuid helper, and then the setuid helper does > CLONE_FS. Mateusz's proposal would not impact this usecase. > > Mateusz is proposing to block the case where a process first does > CLONE_FS, and *then* one of the processes sharing the fs_struct does a > setuid execve(). Linux already downgrades such an execve() to be > non-setuid, which probably means anyone trying to do this will get > hard-to-understand problems. Mateusz' proposal would just turn this > hard-to-debug edgecase, which already doesn't really work, into a > clean error; I think that is a nice improvement even just from the > UAPI standpoint. > > If this change makes it possible to clean up the kernel code a bit, even better. What has brought this to everyone's attention just now? This is the second mention of this code path I have seen this week. AKA: security/commoncap.c:cap_bprm_creds_from_file(...) > ... > /* Don't let someone trace a set[ug]id/setpcap binary with the revised > * credentials unless they have the appropriate permit. > * > * In addition, if NO_NEW_PRIVS, then ensure we get no new privs. > */ > is_setid = __is_setuid(new, old) || __is_setgid(new, old); > > if ((is_setid || __cap_gained(permitted, new, old)) && > ((bprm->unsafe & ~LSM_UNSAFE_PTRACE) || > !ptracer_capable(current, new->user_ns))) { > /* downgrade; they get no more than they had, and maybe less */ > if (!ns_capable(new->user_ns, CAP_SETUID) || > (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)) { > new->euid = new->uid; > new->egid = new->gid; > } > new->cap_permitted = cap_intersect(new->cap_permitted, > old->cap_permitted); > } The actual downgrade is because a ptrace'd executable also takes this path. I have seen it argued rather forcefully that continuing rather than simply failing seems better in the ptrace case. In general I think it can be said this policy is "safe". AKA we don't let a shared fs struct confuse privileged applications. So nothing to panic about. It looks like most of the lsm's also test bprm->unsafe. So I imagine someone could very carefully separate the non-ptrace case from the ptrace case but *shrug*. Perhaps: if ((is_setid || __cap_gained(permitted, new_old)) && ((bprm->unsafe & ~LSM_UNSAFE_PTRACE) || !ptracer_capable(current, new->user_ns))) { + if (!(bprm->unsafe & LSM_UNSAFE_PTRACE)) { + return -EPERM; + } /* downgrade; they get no more than they had, and maybe less */ if (!ns_capable(new->user_ns, CAP_SETUID) || (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)) { new->euid = new->uid; new->egid = new->gid; } new->cap_permitted = cap_intersect(new->cap_permitted, old->cap_permitted); } If that is what you want that doesn't look to scary. I don't think it simplifies anything about fs->in_exec. As fs->in_exec is set when the processing calling exec is the only process that owns the fs_struct. With fs->in_exec just being a flag that doesn't allow another thread to call fork and start sharing the fs_struct during exec. *Shrug* I don't see why anyone would care. It is just a very silly corner case. Eric