On Thu, Sep 04, 2025 at 12:10:00PM -0400, Chuck Lever wrote:
> On 9/4/25 10:42 AM, Mike Snitzer wrote:
> > On Tue, Sep 02, 2025 at 05:27:11PM -0400, Mike Snitzer wrote:
> >> On Tue, Sep 02, 2025 at 05:16:10PM -0400, Chuck Lever wrote:
> >>>
> >>> I am testing with a physically separate client and server, so I believe
> >>> that LOCALIO is not in play. I do see WRITEs. And other workloads (in
> >>> particular "fsx -Z <fname>") show READ traffic and I'm getting the
> >>> new trace point to fire quite a bit, and it is showing misaligned
> >>> READ requests. So it has something to do with dt.
> >>
> >> OK, yeah I figured you weren't doing loopback mount, only thing that
> >> came to mind for you not seeing READ like expected. I haven't had any
> >> problems with dt not driving READs to NFSD...
> >>
> >> You'll certainly need to see READs in order for NFSD's new misaligned
> >> DIO READ handling to get tested.
> >
> > I was doing some additional testing of the v9 changes last night and
> > realized why you weren't seeing any READs come through to NFSD:
> > "flags=direct" must be added to the dt commandline. Otherwise it'll
> > use buffered IO at the client and the READ will be serviced by the
> > client's page cache.
> >
> > But like I said in another reply: when I just use v3 and RDMA (without
> > the intermediary of flexfiles at the client) I'm not able to see the
> > data mismatch with dt...
> >
> > So while its unlikely: does adding "flags=direct" cause dt to fail
> > when NFSD handles the misaligned DIO READ?
> Applied v9.
>
> Multiple successful runs, no failures after adding "flags=direct".
> Some excerpts from the last run show the server is seeing NFS
> READs now:
>
> Filesystem options:
> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,
> fatal_neterrors=none,proto=rdma,port=20049,timeo=600,retrans=2,
> sec=sys,mountaddr=192.168.2.55,mountvers=3,mountproto=tcp,
> local_lock=none,addr=192.168.2.55
>
> nfsd-1342 [004] 463.832928: nfsd_analyze_read_dio: xid=0x89784d89
> fh_hash=0x024204eb offset=0 len=47008 start=0+0 middle=0+47008 end=47008+96
> nfsd-1342 [004] 463.833105: nfsd_analyze_read_dio: xid=0x8a784d89
> fh_hash=0x024204eb offset=47008 len=47008 start=46592+416
> middle=47008+47008 end=94016+192
> nfsd-1342 [004] 463.833185: nfsd_analyze_read_dio: xid=0x8b784d89
> fh_hash=0x024204eb offset=94016 len=47008 start=93696+320
> middle=94016+47008 end=141024+288

OK, thanks for testing!

So yeah, patch 9/9 of v9 does work around the problem relative to
flexfiles+RDMA (though the patch header should really be updated to add
"flags=direct" to the dt command line):
https://lore.kernel.org/linux-nfs/20250903205121.41380-10-snitzer@xxxxxxxxxx/

Is it a tolerable intermediate workaround you'd be OK with?

To be clear, I'm continuing to work the problem (and will be
discussing it with Trond)... but it's a tricky one for sure.

Mike
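
For anyone reading the nfsd_analyze_read_dio lines in the trace excerpt
above, here is a minimal sketch of how the start/middle/end values appear
to relate to offset and len. It is not code from the patch series. It
assumes a 512-byte DIO alignment (inferred only from the numbers in the
trace, not stated in this thread), and split_misaligned_read() is a
hypothetical helper name used purely for illustration.

```python
ALIGN = 512  # ASSUMPTION: alignment inferred from the trace values, not from the patch


def split_misaligned_read(offset, length, align=ALIGN):
    """Hypothetical helper: return (start, middle, end) as (position, bytes) pairs.

    start  = aligned-down position plus the padding bytes before the request
    middle = the request itself (offset, len)
    end    = end of the request plus the padding bytes up to the next boundary
    """
    aligned_start = offset - offset % align        # round the offset down
    read_end = offset + length
    aligned_end = -(-read_end // align) * align    # round the end up
    start = (aligned_start, offset - aligned_start)
    middle = (offset, length)
    end = (read_end, aligned_end - read_end)
    return start, middle, end


# Reproduce the three trace lines quoted above.
for off in (0, 47008, 94016):
    s, m, e = split_misaligned_read(off, 47008)
    print(f"offset={off} len=47008 "
          f"start={s[0]}+{s[1]} middle={m[0]}+{m[1]} end={e[0]}+{e[1]}")
```

Under that assumption the sketch reproduces all three traced requests,
e.g. offset=47008 len=47008 gives start=46592+416 middle=47008+47008
end=94016+192.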