Hi, Trond,

I'm investigating an issue on our systems running your latest containerized
NFS client teardown patches while Jeff is out. We're no longer seeing the NFS
client get stuck, but I'm debugging what appears to be a reference leak. Jeff
noticed that there are some lingering network namespaces not in use by any
processes after container shutdown. I chased those references through:

net -> nfs_client -> nfs4_pnfs_ds -> nfs4_ff_layout_ds -> nfs4_ff_layout_mirror -> nfs4_ff_layout_segment

What I'm seeing is:

* The nfs4_ff_layout_segment/pnfs_layout_segment has a pls_refcount of 0, but
  hasn't been freed.
* Its pls_layout has already been freed, and the nfs_inode and nfs_server are
  also long gone.
* The segment was on pnfs_layout_hdr->plh_return_segs.

>>> lseg
*(struct pnfs_layout_segment *)0xffff88813147ca00 = {
	.pls_list = (struct list_head){
		.next = (struct list_head *)0xffff8885d49e0f38,
		.prev = (struct list_head *)0xffff888dee919f80,
	},
	.pls_lc_list = (struct list_head){
		.next = (struct list_head *)0xffff88813147ca10,
		.prev = (struct list_head *)0xffff88813147ca10,
	},
	.pls_commits = (struct list_head){
		.next = (struct list_head *)0xffff88813147ca20,
		.prev = (struct list_head *)0xffff88813147ca20,
	},
	.pls_range = (struct pnfs_layout_range){
		.iomode = (u32)1,
		.offset = (u64)0,
		.length = (u64)18446744073709551615,
	},
	.pls_refcount = (refcount_t){
		.refs = (atomic_t){
			.counter = (int)0,
		},
	},
	.pls_seq = (u32)2,
	.pls_flags = (unsigned long)10,
	.pls_layout = (struct pnfs_layout_hdr *)0xffff8885d49e0f00,
}
>>> decode_enum_type_flags(lseg.pls_flags, prog["NFS_LSEG_VALID"].type_)
'NFS_LSEG_ROC|NFS_LSEG_LAYOUTRETURN'
>>> lseg.pls_list.next == lseg.pls_layout.plh_return_segs.address_of_()
True

So my guess is that there were still segments on plh_return_segs when the
pnfs_layout_hdr was freed. I wasn't able to make sense of how the lifetime of
that list is supposed to work.
My next step is to test with WARN_ONCE(!list_empty(&lo->plh_return_segs)) in
the free path of pnfs_put_layout_hdr(). In the meantime, do you have any
ideas?

Thanks,
Omar

P.S. I spotted a separate issue: nfs4_data_server_cache is keyed only on the
socket address, without taking the network namespace into account, which can
result in connections being shared between containers. This leak has a
knock-on effect of pinning dead DS connections in the cache, which other
containers may then try to reuse. Maybe the cache should be split up by netns?