On Thu, May 08, 2025 at 01:18:20AM -0700, John Johansen wrote:
> On 5/7/25 23:06, Song Liu wrote:
> > On Wed, May 7, 2025 at 8:37 AM Maxime Bélair
> > <maxime.belair@xxxxxxxxxxxxx> wrote:
> > [...]
> > > >
> > > > These two do not feel like real benefits:
> > > > - One syscall cannot fit all use cases well...
> > >
> > > This syscall is not intended to cover every case, nor to replace existing kernel
> > > interfaces.
> > >
> > > Each LSM can decide which operations it wants to support (if any). For example, when
> > > loading policies, an LSM may choose to allow only policies that further restrict
> > > privileges.
> > >
> > > > - Not working in containers is often not an issue, but a feature.
> > >
> > > Indeed, using this syscall requires appropriate capabilities and will not permit
> > > unprivileged containers to manage policies arbitrarily.
> > >
> > > With this syscall, capability checks remain the responsibility of each LSM.
> > >
> > > For instance, in the AppArmor patch, a profile can be loaded only if
> > > aa_policy_admin_capable() succeeds (which requires CAP_MAC_ADMIN). Moreover, by design,
> > > policies can be loaded only in the current namespace.
> > >
> > > I see this syscall as a middle point between exposing the entire sysfs, creating a large
> > > attack surface, and blocking everything.
> > >
> > > Landlock’s existing syscalls already improve security by allowing processes to further
> > > restrict their ambient rights while adding only a modest attack surface.
> > >
> > > This syscall is a further step in that direction: it lets LSMs add restrictive policies
> > > without requiring exposing every other interface.
> >
> > I don't think a syscall makes the API more secure. If necessary, we can add
>
> It exposes a different attack surface. Requiring mounting of the fs to where it is visible
> in the container, provides attack surface, and requires additional external configuration.

We should also keep in mind that syscalls could be accessible from
everywhere, by everyone, which may increase the attack surface compared
to a privileged filesystem interface.  Adding a second interface may
also introduce issues.  Anyway, I'm definitely not against syscalls, but
I don't see why the filesystem interface would be "less secure" in this
context.

>
> Then there is the whole issue of getting the various LSMs to allow another LSM in the
> stack to be able to manage its own policy.

Right, and it's a similar issue with seccomp policies wrt syscalls.

>
> > permission check to each pseudo file. The downside of the syscall, however,
> > is that all the permission checks are hard-coded in the kernel (except for
>
> The permission checks don't have to be hard coded. Each LSM can define how it handles
> or manages the syscall. The default is that it isn't supported, but if an LSM decides
> to support it, there is no reason that its policy can't determine the use of the
> syscall.