On 6/4/25 19:56, Tingmao Wang wrote: > On 6/4/25 03:21, Al Viro wrote: >> On Wed, Jun 04, 2025 at 02:12:11AM +0100, Tingmao Wang wrote: >>> On 6/4/25 01:55, Al Viro wrote: >>>> On Wed, Jun 04, 2025 at 01:45:45AM +0100, Tingmao Wang wrote: >>>>> + rename_seqcount = read_seqbegin(&rename_lock); >>>>> + if (rename_seqcount % 2 == 1) { >>>> >>>> Please, describe the condition when that can happen, preferably >>>> along with a reproducer. >>> >>> My understanding is that when a rename is in progress the seqcount is odd, >>> is that correct? >>> >>> If that's the case, then the fs_race_test in patch 2 should act as a >>> reproducer, since it's constantly moving the directory. >>> >>> I can add a comment to explain this, thanks for pointing out. >> >> Please, read through the header declaring those primitives and read the >> documentation it refers to - it's useful for background. > > Ok, so I didn't realize read_seqbegin actually waits for the seqcount to > turn even. I did read the header earlier when following dget_parent but > probably misremembered and mixed raw_seqcount_begin with read_seqbegin. Right, after more careful looking I think what I actually want here is raw_read_seqcount. My apologies. > >> >> What's more, look at the area covered by rename_lock - I seriously suspect >> that you are greatly overestimating it. > > Admittedly "when a rename is in progress" is a vague statement. Looking > at what takes rename_lock in the code, it's only when we actually do > d_move where we take this lock (plus some other places), and the critical > section isn't very large, and does not contain any waits etc. > > If we keep read_seqbegin, then that gives landlock more opportunity to do > a reference-less parent walk, but at the expense that a d_move anywhere, > even if it doesn't affect anything we're currently looking at, will > temporarily block this landlocked application (even if not for very long), > and multiple concurrent renames might cause us to wait for longer (but > probably won't starve us since we just need one "cycle" where rename > seqcount is even). > > Since we can still safely do a parent walk, just needing to take dentry > references on our way, we could simply fallback to that in this situation. > i.e. we can use raw_seqcount_begin and keep the seqcount & 1 check. This will be raw_read_seqcount(&rename_lock.seqcount) > > Now, there is the argument that if d_move is very quick, then it might be > worth waiting for it to finish, and we will fallback to the original > parent walk if the seqcount changes again. I'm not sure which is best, > but I'm inclining towards changing this to raw_seqcount_begin, as this is > purely an optimization, and we do not _need_ to avoid concurrent renames.