Re: GCC compilation performance under RAM starvation

Hi!

On Sat, Jun 28, 2025 at 09:54:59AM +0900, Simon Richter wrote:
> On 27.06.25 02:01, Krystian Kazmierczak (Nokia) via Gcc-help wrote:
> 
> >What I observe is that when all the compilations running in a certain
> >container cause memory pressure (memory utilization hits the memory
> >limit defined by cgroup v2 for the container, swap is disabled), the
> >CPU load of the cc1 processes increases significantly (the container
> >looks hung) and all compilations may last indefinitely.
> 
> Compiling is pretty much the worst case scenario for a system with paged 
> memory, because the program being compiled is represented as a tree-like 
> data structure, so we have lots of indirection through chains of 
> pointers, and each pointer target has the risk of being on a page that 
> has been paged out.

Or more generally: a compiler has a lot of "active memory".  Not
"paged memory", i.e. data that is just backed by files on disk (like
what was read in from the source files).
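
(As a toy sketch of that pointer chasing, nothing like GCC's actual IR
of course: a tree where every node is a separate allocation, so every
step of the walk is another dereference that can land on a page the
kernel has evicted, i.e. a major page fault.)

#include <stdio.h>
#include <stdlib.h>

/* A binary tree node: two pointers plus a payload.  Each node is a
   separate allocation, so walking the tree chases pointers spread
   over many pages. */
struct node {
    struct node *left, *right;
    long payload;
};

static struct node *build(int depth)
{
    if (depth == 0)
        return NULL;
    struct node *n = malloc(sizeof *n);
    if (!n) {
        perror("malloc");
        exit(1);
    }
    n->payload = depth;
    n->left = build(depth - 1);
    n->right = build(depth - 1);
    return n;
}

/* Every dereference of n, n->left, n->right can touch a page that was
   paged out under memory pressure, so each one can stall on I/O. */
static long sum(struct node *n)
{
    if (!n)
        return 0;
    return n->payload + sum(n->left) + sum(n->right);
}

int main(void)
{
    struct node *root = build(24);  /* ~16M nodes, several hundred MB */
    printf("%ld\n", sum(root));
    return 0;
}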

> So in a memory pressure scenario, when a page is not present, it will 
> have to be fetched from its backing storage, and another page evicted 
> instead. Then, we retrieve the information we actually want, which often 
> is another pointer that we have to follow, and the process repeats.
> 
> Having no swap space does not mean that no swapping occurs: swap space 
> is space for "anonymous" memory that has no other file backing. However, 
> loaded programs do have their executable files as backing, and anything 
> loaded using mmap() is also fair game, so it is possible to evict them 
> -- so the compiler pushes out other processes.

Yup, swap space is "just RAM" in many ways.  But having the
distinction swap <-> RAM often allows the kernel to make better
(heuristic!) tradeoffs.  And, as I have tried to point out throughout
this thread, all of the kernel tuning for this depends on there being
a reasonable amount of swap space.  If there is *no* swap space, there
are more hard edges.
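
(A minimal sketch of how to check which case you are in: read the
SwapTotal line from /proc/meminfo.  "SwapTotal: 0 kB" means the kernel
has no swap at all to play with.)

#include <stdio.h>
#include <string.h>

/* Print the SwapTotal line from /proc/meminfo; "SwapTotal: 0 kB"
   means there is no swap space for anonymous pages to be evicted to. */
int main(void)
{
    char line[256];
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) {
        perror("/proc/meminfo");
        return 1;
    }
    while (fgets(line, sizeof line, f))
        if (!strncmp(line, "SwapTotal:", 10))
            fputs(line, stdout);
    fclose(f);
    return 0;
}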

> Because things are pulled back into memory on-demand, one page for each 
> access, it takes a really long time to recover after memory pressure has 
> been resolved.
> 
> You can avoid that by disabling memory overcommit; this enforces that 
> every allocated memory page has a physical memory location. The downside 
> of that is that on process startup, physical memory needs to be reserved 
> for a copy of all unshared pages of the parent process between the 
> fork() that starts the process, and the execve() that replaces the 
> process image, so it makes starting a new process more expensive and 
> introduces a failure mode where fork() fails because the parent process 
> is too large.

How do you disable memory overcommit?  "Just" don't run programs that
are bigger than you have memory for?
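
(For the record, on Linux strict overcommit is what you get by setting
the vm.overcommit_memory sysctl to 2, with vm.overcommit_ratio or
vm.overcommit_kbytes controlling the commit limit.  A minimal C sketch
of the fork() failure mode described above, assuming such a
configuration; the 4 GiB size is purely illustrative.)

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Allocate and dirty a large block so the parent has a lot of
   unshared anonymous memory, then fork().  With strict overcommit
   (vm.overcommit_memory=2) the kernel must be able to reserve room
   for a full copy of those pages, so fork() can fail with ENOMEM even
   though the child would only call execve() right away. */
int main(void)
{
    size_t sz = (size_t)4 << 30;        /* 4 GiB, purely illustrative */
    char *big = malloc(sz);
    if (!big) {
        perror("malloc");
        return 1;
    }
    memset(big, 1, sz);                 /* make the pages unshared/dirty */

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");                 /* ENOMEM under strict overcommit */
        return 1;
    }
    if (pid == 0) {
        execlp("true", "true", (char *)NULL);
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    free(big);
    return 0;
}

(Spawning via vfork() or posix_spawn() sidesteps that particular
reservation, which is one reason some build tools prefer them.)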

> For traditional Unix, that is usually not a problem (just less 
> efficient) because processes tend to be fairly small, but modern desktop 
> environments usually deal badly with overcommit being disabled, so it's 
> not a general solution. If you have a dedicated machine or VM for 
> compiling, I'd definitely go that route, though.
> 
> The only proper solution can come from the kernel here: this is where 
> resources are directed. GCC can only use what it is assigned.

The only proper solution is to just have enough memory for everything
you run.  Problem solved :-)

(But yes, there is nothing GCC can do to solve this, or even to
ameliorate it.)


Segher


