Re: Machine lockup with large d_invalidate()

On Thu, 15 May 2025 at 16:57, Jan Kara <jack@xxxxxxx> wrote:
>
> Hello,
>
> we have a customer who is mounting a directory over NFS (let's call it
> hugedir) with many files (there are several million dentries on its
> d_children list). Now when they do 'mv hugedir hugedir.bak; mkdir hugedir'
> on the server, which invalidates the NFS cache of this directory, NFS
> clients get stuck in d_invalidate() for hours (until the customer lost
> patience).
>
> Now I don't want to discuss the sanity or efficiency of this application
> architecture here, but I share the opinion that it shouldn't take hours to
> invalidate a couple million dentries. Analysis of the crashdump revealed
> that d_invalidate() can have O(n^2) complexity in the number of dentries
> it is invalidating, which leads to impractical times when invalidating
> large numbers of dentries. What happens is the following:
>
> There are several processes accessing the hugedir directory - about 16 in
> the case I was inspecting. When the directory changes on the server, all
> these 16 processes quickly enter d_invalidate() -> shrink_dcache_parent()

The first thing d_invalidate() does is check whether the dentry is already
unhashed and return if so; otherwise it unhashes the dentry. So only the
d_invalidate() caller that won the race for d_lock is going to invoke
shrink_dcache_parent(); the others will return immediately.
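For reference, a trimmed sketch of that prologue as it reads in
fs/dcache.c (exact details vary by kernel version):

void d_invalidate(struct dentry *dentry)
{
	spin_lock(&dentry->d_lock);
	if (d_unhashed(dentry)) {
		/* Somebody else already unhashed it; nothing to do. */
		spin_unlock(&dentry->d_lock);
		return;
	}
	/* We won the race: unhash it ourselves. */
	__d_drop(dentry);
	spin_unlock(&dentry->d_lock);

	/* Negative dentries can be dropped without further checks. */
	if (!dentry->d_inode)
		return;

	/* Only the winner gets this far. */
	shrink_dcache_parent(dentry);
	...
}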

What am I missing?

Thanks,
Miklos



