On Mon, 2025-04-28 at 13:24 -0700, Jeff Layton wrote:
> Sending this as an RFC as I don't have a reliable reproducer for the
> problem that Omar reported. I'm also not sure this is the best fix for
> the problem. There is probably a case to be made that the real bug is in
> the error handling for pnfs_layoutreturn_before_put_layout_hdr().
>
> My guess is that the issue is that we end up with entries on the
> plh_return_segs list just before the network goes down. That causes the
> LAYOUTRETURN to fail with something that looks retryable, and the lsegs
> on the list aren't freed.
>
> It's possible that we just need to catch ENETUNREACH in the LAYOUTRETURN
> error handling, but I'm not sure I correctly understand the problem. If
> entries are racing onto the list just before the refcount decrement,
> then that wouldn't fix it.
>
> The first patch should fix the issue of the leaked lsegs, and the second
> should let us know if it ever crops up again.
>
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
> Jeff Layton (2):
>       nfs: free leftover lsegs before freeing a layout in pnfs_put_layout_hdr
>       nfs: pr_warn if plh_segs or plh_return_segs are non-empty when freeing
>
>  fs/nfs/pnfs.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> ---
> base-commit: 5bc1018675ec28a8a60d83b378d8c3991faa5a27
> change-id: 20250428-nfs-6-16-87062aa2989d
>
> Best regards,

Trond, Anna, ping? This seems like the right thing to do, but I'd
appreciate a second (and third) set of eyes on this.

Thanks,
--
Jeff Layton <jlayton@xxxxxxxxxx>