Re: [EXTERNAL] Re: EXT4/JBD2 Not Fully Released device after unmount of NVMe-oF Block Device

> On Jun 2, 2025, at 6:29 PM, Theodore Ts'o <tytso@xxxxxxx> wrote:
> 
> On Mon, Jun 02, 2025 at 09:32:18PM +0000, Mitta Sai Chaithanya wrote:
> 
>> However, after the connection is re-established and the device is
>> unmounted from all namespaces, I still observe errors from both ext4
>> and jbd2, especially when the device is disconnected.
> 
> How do you *know* that you've unmounted the device in all namespaces?
> I seem to recall that some process (I think one of the systemd
> daemons, but I could be wrong) was creating a namespace that users
> were not expecting, resulting in the device staying mounted when
> users did not expect it to be.
> 
> The fact that /proc/fs/ext4/<device_name> still exists means that the
> kernel (specifically, the VFS layer) doesn't think that the file
> system can be shut down.  As a result, the VFS layer has not called
> ext4's put_super() and kill_sb() methods.  And so yes, I/O activity
> can still happen, because the file system has not been shut down.
> 
> If you still see /proc/fs/ext4/<device_name>, my suggestion would be
> to grep /proc/*/mounts to see which processes have a namespace
> which still has the device mounted.  I suspect that you will see that
> there is some namespace that you weren't aware of that is keeping the
> ext4 struct super_block object pinned and alive.
> 
>> Another point I would like to mention: I am observing JBD2 errors, especially after the NVMe-oF device has been disconnected, and the logs are below.
> 
> Sure, but that's the effect, not the cause, of the NVMe-oF device
> getting ripped down while the file system is still active.  Which I am
> 99.997% sure is because it is still mounted in some namespace.  The
> other 0.003% chance is that there is some refcount problem in the VFS
> subsystem, and I would suggest that you ask Microsoft's VFS experts
> (such as Christian Brauner, who is one of the VFS maintainers) to take
> a look.  I very much doubt it is a kernel bug, though.

We've definitely seen similar situations where filesystem mounts inside
a namespace kept the mountpoint busy.

Adding debug output to ext4_put_super() that prints the process name
whenever current->comm != "umount" showed that monitoring tools running
in the container held open references on the mountpoint until they
exited and closed their files.
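
For the /proc/*/mounts check suggested above, something along these
lines may help. It is only a rough sketch, not a tested tool: the
function name find_mount_holders and the default device /dev/nvme1n1
are made up for illustration, and it assumes the device appears in the
first field of each mounts line under the same name it was mounted
with. It probably needs to run as root to see every process.

```python
#!/usr/bin/env python3
"""List processes whose mount table still contains a given block device."""
import glob
import os
import sys


def find_mount_holders(device, proc_root="/proc"):
    """Return {pid: (comm, mount_ns, [mountpoints])} for every process whose
    <proc_root>/<pid>/mounts has an entry with `device` in the first field."""
    holders = {}
    for mounts_path in glob.glob(os.path.join(proc_root, "[0-9]*", "mounts")):
        pid_dir = os.path.dirname(mounts_path)
        try:
            with open(mounts_path) as f:
                rows = [line.split() for line in f]
        except OSError:
            continue  # process exited, or we lack permission to read it
        points = [r[1] for r in rows if len(r) >= 2 and r[0] == device]
        if not points:
            continue
        try:
            with open(os.path.join(pid_dir, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            comm = "?"
        try:
            # Symlink target identifies the mount namespace, e.g. mnt:[4026531840]
            mount_ns = os.readlink(os.path.join(pid_dir, "ns", "mnt"))
        except OSError:
            mount_ns = "?"
        holders[int(os.path.basename(pid_dir))] = (comm, mount_ns, points)
    return holders


if __name__ == "__main__":
    # Hypothetical default device; pass the real one as the first argument.
    device = sys.argv[1] if len(sys.argv) > 1 else "/dev/nvme1n1"
    for pid, (comm, ns, points) in sorted(find_mount_holders(device).items()):
        print(f"{pid}\t{comm}\t{ns}\t{','.join(points)}")
```

Processes that share a mount namespace show the same ns value, so any
unexpected value in that column points at the namespace that still has
the device mounted.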

Cheers, Andreas