On Mon, 8 Sep 2025, Robert Beckett wrote:

> Hi,
>
> While testing resiliency of encrypted swap using dmcrypt we encounter easily reproducible deadlocks.
> The setup is a simple 1GB encrypted swap file [1] with a little mem chewer program [2] to consume all ram.
>
> Usually the first run will oomkill the memchewer successfully.
> However, after 1-3 runs typically, it will deadlock the machine.
>
> Using softdog and the lockup detectors [3] it looks like the dmcrypt_write thread
> is stuck for over 2 minutes while everything else is waiting on the swap bio limiter [4]
>
> I wondered whether it might be hitting tag exhaustion in blk_mq_get_tag, but adding trace debug and
> enabling the block trace events seems to suggest that generally progress is being made [5].
>
> Also note lockdep doesn't complain.
>
> Looks to me like a soft lockup possibly due to swap out hitting similar or same issue as [4] but
> not self inflicted this time. However, once general memory exhaustion occurs, it seems to result
> in the same issue.
>
> I'm not intimately familiar with the dm and block-mq code, so I'd appreciate any help in
> debugging it further or a fix.
> I guess the main question is: why doesn't it oomkill? oomkill seems like a
> sensible action in this scenario. Any advice on making oomkill more reliable here?
> Would [4] need to be tweaked in any way for swap files vs partition?
>
> Thanks
>
> Bob

Hi

What happens if you lower /sys/module/dm_mod/parameters/swap_bios? Does it
help?

The general problem with swapping to an encrypted device is that for each
swapped-out page, the dm-crypt driver needs to allocate another page that
holds the encrypted data. So, the harder it tries to swap, the more memory
it consumes.

The device mapper stack uses mempools, so it should keep working even under
total memory exhaustion, but perhaps some part of the kernel doesn't use
them and deadlocks.

I could try to reproduce it.

Mikulas
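
For anyone repeating the experiment suggested above, here is a minimal sketch of reading and lowering the swap_bios parameter. Only the sysfs path comes from the message. The helper names, the halving factor, and the floor of 1 are illustrative assumptions, not recommended values, and writing to the real path requires root.

```python
from pathlib import Path

# Path taken from the message above; writing to it requires root.
SWAP_BIOS = Path("/sys/module/dm_mod/parameters/swap_bios")


def read_swap_bios(path: Path = SWAP_BIOS) -> int:
    """Return the current swap_bios value as an integer."""
    return int(path.read_text().strip())


def lower_swap_bios(factor: int = 2, path: Path = SWAP_BIOS) -> tuple:
    """Divide the current value by `factor` (hypothetical choice) and write it back.

    The result is clamped to at least 1; that floor is an assumption made
    for this sketch. Returns (old_value, new_value).
    """
    old = read_swap_bios(path)
    new = max(1, old // factor)
    path.write_text(f"{new}\n")
    return old, new
```

One possible use is to call lower_swap_bios() as root between runs of the memory chewer and note whether the deadlock becomes harder to reproduce. The path argument exists so the helpers can be tried against an ordinary file first.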