On Tue 29-04-25 21:31:35, Roman Gushchin wrote:
> On Tue, Apr 29, 2025 at 01:46:07PM +0200, Michal Hocko wrote:
> > On Mon 28-04-25 03:36:15, Roman Gushchin wrote:
> > > Introduce bpf_out_of_memory() bpf kfunc, which allows to declare
> > > an out of memory events and trigger the corresponding kernel OOM
> > > handling mechanism.
> > >
> > > It takes a trusted memcg pointer (or NULL for system-wide OOMs)
> > > as an argument, as well as the page order.
> > >
> > > Only one OOM can be declared and handled in the system at once,
> > > so if the function is called in parallel to another OOM handling,
> > > it bails out with -EBUSY.
> >
> > This makes sense for the global OOM handler because concurrent handlers
> > are cooperative. But is this really correct for memcg ooms which could
> > happen for different hierarchies? Currently we do block on oom_lock in
> > that case to make sure one oom doesn't starve others. Do we want the
> > same behavior for custom OOM handlers?
>
> It's a good point and I had similar thoughts when I was working on it.
> But I think it's orthogonal to the customization of the oom handling.
> Even for the existing oom killer it makes no sense to serialize memcg ooms
> in independent memcg subtrees. But I'm worried about the dmesg reporting,
> it can become really messy for 2+ concurrent OOMs.
>
> Also, some memory can be shared, so one OOM can eliminate a need for another
> OOM, even if they look independent.
>
> So my conclusion here is to leave things as they are until we'll get signs
> of real world problems with the (lack of) concurrency between ooms.

How do we learn about that happening though? I do not think we have any
counters to watch to suspect that some oom handlers cannot run.
-- 
Michal Hocko
SUSE Labs