Re: squashfs can starve/block apps

On Fri, 2025-07-04 at 20:51 +0100, Phillip Lougher wrote:
>
>
> > On 26/06/2025 15:27 BST Joakim Tjernlund (Nokia) <joakim.tjernlund@xxxxxxxxx> wrote:
> >
> >
> > On Thu, 2025-06-26 at 10:09 +0200, Joakim Tjernlund wrote:
> > > We have an app running on a squashfs RFS (XZ compressed) and an appfs also on squashfs.
> > > Whenever we validate an SW update image (stream an image.xz, uncompress it and send it on to /dev/null),
> > > the apps are starved/blocked and make almost no progress, system time in top goes up to 99+%
> > > and the console also becomes unresponsive.
> > >
> > > This feels like the kernel is stuck/busy in a loop and does not let the apps execute.
> > >
>
> I have been away at the Glastonbury festival, hence the delay in replying.  But
> this isn't really anything to do with Squashfs per se, and basic computer
> science theory explains what is going on here.  So I'm surprised no-one else
> has responded.
>
> > > Kernel 5.15.185
> > >
> > > Any ideas/pointers ?
>
> Yes,
>
> > >
> > >  Jocke
> >
> > This will reproduce the stuck behaviour we see:
> >  > cd /tmp (/tmp is a tmpfs)
> >  > wget https://fullimage.xz/
>
> You've identified the cause here.
>
> >
> > So just downloading it to tmpfs will confuse squashfs; it seems to
> > me that squashfs somehow sees the xz compressed pages in the page cache/VFS and
> > tries to do something with them.
>
> But this is the completely wrong conclusion.  Squashfs doesn't "magically"
> see files downloaded into a different filesystem and try to do something
> with them.
>
> What is happening is the system is thrashing, because the page cache doesn't
> have enough remaining space to contain the working set of the running
> application(s).
>
> See Wikipedia article
> https://en.wikipedia.org/wiki/Thrashing_(computer_science)
>
> Tmpfs filesystems (/tmp here) are not backed by physical media, and their
> contents are stored in the page cache.  So in effect, if fullImage.xz takes
> most of the page cache (system RAM), then there is not much space left to store
> the pages of the applications that are running, and they constantly replace
> each other's pages.
>
> To make it easy, imagine we have two processes A and B, and the page cache
> doesn't have enough space to store the pages of both processes at once.
>
> Now:
>
> 1. Process A starts and demand-pages pages into the page cache from the
>    Squashfs root filesystem.  This takes CPU resources to decompress the pages.
>    Process A runs for a while and then gets descheduled.
>
> 2. Process B starts and demand-pages pages into the page cache, replacing
>    Process A's pages.  It runs for a while and then gets descheduled.
>
> 3. Process A restarts and finds all its pages have gone from the page cache, and
>    so it has to re-demand-page the pages back.  This replaces Process B's pages.
>
> 4. Process B restarts and finds all its pages have gone from the page cache ...
>
> In effect the system spends all its time reading pages from the
> Squashfs root filesystem, and doesn't do anything else, and hence it looks
> like it has hung.
>
> This is not a fault with Squashfs, and it will happen with any filesystem
> (ext4 etc) when system memory is too small to contain the working set of
> pages.
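>
> A quick way to confirm that this really is thrashing (rather than a Squashfs
> bug) is to watch the major fault and refault counters while the download
> runs.  A rough sketch, assuming the usual /proc interface (exact counter
> names vary a bit between kernel versions):
>
>   # these counters should climb rapidly if the system is thrashing
>   while sleep 1; do
>       grep -E 'pgmajfault|workingset_refault' /proc/vmstat
>   done
>
> together with "free -m" to see how much RAM the tmpfs download is taking.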
>
> Now, to repeat, what has caused this is the download of that fullImage.xz,
> which has filled most of the page cache (system RAM).  To prevent that
> from happening, there are two obvious solutions:
>
> 1. Split fullImage.xz into pieces and only download one piece at a time
>    (a rough sketch follows below).  This will avoid filling up the page cache
>    and the system thrashing.
>
> 2. Kill all unnecessary applications and processes before downloading
>    fullImage.xz.  In doing that you reduce the working set to the RAM available,
>    which will again prevent thrashing.
>
> Hope that helps.
>
> Phillip

You are absolutely right, the above was low RAM caused by the download filling
the tmpfs (i.e. RAM).  But what threw me off was that I observed the same when
streaming the XZ to /dev/null.

After some digging I found why: some XZ options do not respect the "-0" preset
w.r.t. dict size and reset it back to the default.  Once I changed from
  "-0 --check=crc32 --arm --lzma2=lp=2,lc=2"
to
  "-0 --check=crc32 --lzma2=dict=128KiB"
I got a stable system.

Perhaps xz -l could be improved to include dict size to make this more obvious?
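
For now, the double-verbose list mode looks like it already exposes this
(a quick sketch, the file name is just a placeholder):

  xz --list --verbose --verbose fullimage.xz

The per-block Filters column should then show e.g. --lzma2=dict=128KiB, and
there is a "Memory needed" line for the decompressor.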

 Jocke




