From: Mike Snitzer <snitzer@xxxxxxxxxxxxxxx> Chuck Lever advised that allocating a single start_extra_page, to avoid RDMA corruption on client, definitely shouldn't be needed: "There's nothing I can think of in the RDMA or RPC/RDMA protocols that mandates that the first page offset must always be zero. Moving data at one address on the server to an entirely different address and alignment on the client is exactly what RDMA is supposed to do. It sounds like an implementation omission because the server's upper layers have never needed it before now. If TCP already handles it, I'm guessing it's going to be straightforward to fix." So avoid papering over what seems to be an rpcrdma bug, remove the allocation and use of an extra start_extra_page. With this patch applied ontop of v8 patchset [0], I get the following data mismatch errors at the end [3] when using the NFS RDMA client with reproducer documented in associated patch header since v2 [1]: "Must allocate and use a bounce-buffer page (called 'start_extra_page') if/when expanding the misaligned READ requires reading extra partial page at the start of the READ so that its DIO-aligned. Otherwise that extra page at the start will make its way back to the NFS client and corruption will occur. As found, and then this fix of using an extra page verified, using the 'dt' utility: dt of=/mnt/share1/dt_a.test passes=1 bs=47008 count=2 \ iotype=sequential pattern=iot onerr=abort oncerr=abort see: https://github.com/RobinTMiller/dt.git " I really did try to call attention to this misaligned DIO READ alloc_page hack to make RDMA work, see [2], but I didn't frame it as RDMA specific and definitely should've been clearer on that important detail: "Also, I think its worth calling out this nfsd_complete_misaligned_read_dio function for its remapping/shifting of the READ payload reflected in rqstp->rq_bvec[]." Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx> [0]: https://lore.kernel.org/linux-nfs/20250826185718.5593-1-snitzer@xxxxxxxxxx/ [1]: https://lore.kernel.org/linux-nfs/20250708160619.64800-9-snitzer@xxxxxxxxxx/ [2]: https://lore.kernel.org/linux-nfs/aG2MDVyyCbjTpgOv@xxxxxxxxxx/ [3]: partial output of dt utility that shows NFS client READ data mismatch: ++ COUNT=3 ++ IOSIZE=47008 ++ dt of=/mnt/hs_test/dt_thisisa.test passes=1 bs=47008 count=3 iotype=sequential pattern=iot onerr=abort oncerr=abort dt (j:1 t:1): dt (j:1 t:1): Write Statistics: dt (j:1 t:1): Job Information Reported: Job 1, Thread 1 dt (j:1 t:1): Last IOT seed value used: 0x01010101 dt (j:1 t:1): Total records processed: 3 @ 47008 bytes/record (45.906 Kbytes) dt (j:1 t:1): Total bytes transferred: 141024 (137.719 Kbytes, 0.134 Mbytes) dt (j:1 t:1): Average transfer rates: 1004137 bytes/sec, 980.602 Kbytes/sec, 0.958 Mbytes/sec dt (j:1 t:1): Number I/O's per second: 21.361 dt (j:1 t:1): Number seconds per I/O: 0.0468 (46.81ms) dt (j:1 t:1): Total passes completed: 0/1 dt (j:1 t:1): Total errors detected: 0/1 dt (j:1 t:1): Total elapsed time: 00m00.14s dt (j:1 t:1): Total system time: 00m00.00s dt (j:1 t:1): Total user time: 00m00.00s dt (j:1 t:1): Starting time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): Ending time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): Warning: The bytes written 141024, is less than the data limit 1880320000 requested! dt (j:1 t:1): ERROR: Error number 1 occurred on Sat Aug 30 16:14:08 2025 dt (j:1 t:1): dt (j:1 t:1): Error Number: 1 dt (j:1 t:1): Time of Current Error: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): Read Pass Start Time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): Write Pass Start Time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): Pass Number: 1 dt (j:1 t:1): Pass Elapsed Time: 00m00.10s dt (j:1 t:1): Test Elapsed Time: 00m00.24s dt (j:1 t:1): File Name: /mnt/hs_test/dt_thisisa.test dt (j:1 t:1): File Inode: 1199 (0x4af) dt (j:1 t:1): Directory Inode: 1 (0x1) dt (j:1 t:1): File Size: 1880320000 (0x70136800) dt (j:1 t:1): Operation: miscompare dt (j:1 t:1): Record Number: 2 dt (j:1 t:1): Request Size: 47008 (0xb7a0) dt (j:1 t:1): Block Length: 91 (0x5b) dt (j:1 t:1): I/O Mode: read dt (j:1 t:1): I/O Type: sequential dt (j:1 t:1): File Type: output dt (j:1 t:1): Direct I/O: disabled (caching data) dt (j:1 t:1): Device Size: 512 (0x200) dt (j:1 t:1): Starting File Offset: 47008 (0xb7a0) dt (j:1 t:1): Starting LBA: 91 (0x5b) dt (j:1 t:1): Ending File Offset: 94016 (0x16f40) dt (j:1 t:1): Ending LBA: 182 (0xb6) dt (j:1 t:1): Error File Offset: 47008 (0xb7a0) dt (j:1 t:1): Error Offset Modulos: %8 = 0, %512 = 416, %4096 = 1952 dt (j:1 t:1): Starting Relative Error LBA: 91 (0x5b) dt (j:1 t:1): Relative 4096 byte Error LBA: 11 (0xb) dt (j:1 t:1): Corruption Buffer Index: 0 (byte index into read buffer) dt (j:1 t:1): Corruption Block Index: 0 (byte index in miscompare block) dt (j:1 t:1): dt (j:1 t:1): dt (j:1 t:1): Data compare error at byte 0 in record number 2 dt (j:1 t:1): Relative block number where the error occurred is 91, offset 47008 dt (j:1 t:1): Block expected = 91 (0x0000005b), block found = 1919311731 (0x72665f73), count = 47008 dt (j:1 t:1): The correct data starts at memory address 0x000000003c589000 (marked by asterisk '*') dt (j:1 t:1): Dumping Pattern Buffer (base = 0x3c589000, mismatch offset = 0, limit = 512 bytes): dt (j:1 t:1): / Buffer dt (j:1 t:1): Memory Address / Index dt (j:1 t:1): 0x000000003c589000/ 0 |*5b 00 00 00 5c 01 01 01 5d 02 02 02 5e 03 03 03 "[ \ ] ^ " dt (j:1 t:1): 0x000000003c589010/ 16 | 5f 04 04 04 60 05 05 05 61 06 06 06 62 07 07 07 "_ ` a b " dt (j:1 t:1): 0x000000003c589020/ 32 | 63 08 08 08 64 09 09 09 65 0a 0a 0a 66 0b 0b 0b "c d e f " dt (j:1 t:1): 0x000000003c589030/ 48 | 67 0c 0c 0c 68 0d 0d 0d 69 0e 0e 0e 6a 0f 0f 0f "g h i j " dt (j:1 t:1): 0x000000003c589040/ 64 | 6b 10 10 10 6c 11 11 11 6d 12 12 12 6e 13 13 13 "k l m n " dt (j:1 t:1): 0x000000003c589050/ 80 | 6f 14 14 14 70 15 15 15 71 16 16 16 72 17 17 17 "o p q r " dt (j:1 t:1): 0x000000003c589060/ 96 | 73 18 18 18 74 19 19 19 75 1a 1a 1a 76 1b 1b 1b "s t u v " dt (j:1 t:1): 0x000000003c589070/ 112 | 77 1c 1c 1c 78 1d 1d 1d 79 1e 1e 1e 7a 1f 1f 1f "w x y z " dt (j:1 t:1): 0x000000003c589080/ 128 | 7b 20 20 20 7c 21 21 21 7d 22 22 22 7e 23 23 23 "{ |!!!}"""~###" dt (j:1 t:1): 0x000000003c589090/ 144 | 7f 24 24 24 80 25 25 25 81 26 26 26 82 27 27 27 " $$$ %%% &&& '''" dt (j:1 t:1): 0x000000003c5890a0/ 160 | 83 28 28 28 84 29 29 29 85 2a 2a 2a 86 2b 2b 2b " ((( ))) *** +++" dt (j:1 t:1): 0x000000003c5890b0/ 176 | 87 2c 2c 2c 88 2d 2d 2d 89 2e 2e 2e 8a 2f 2f 2f " ,,, --- ... ///" dt (j:1 t:1): 0x000000003c5890c0/ 192 | 8b 30 30 30 8c 31 31 31 8d 32 32 32 8e 33 33 33 " 000 111 222 333" dt (j:1 t:1): 0x000000003c5890d0/ 208 | 8f 34 34 34 90 35 35 35 91 36 36 36 92 37 37 37 " 444 555 666 777" dt (j:1 t:1): 0x000000003c5890e0/ 224 | 93 38 38 38 94 39 39 39 95 3a 3a 3a 96 3b 3b 3b " 888 999 ::: ;;;" dt (j:1 t:1): 0x000000003c5890f0/ 240 | 97 3c 3c 3c 98 3d 3d 3d 99 3e 3e 3e 9a 3f 3f 3f " <<< === >>> ???" dt (j:1 t:1): 0x000000003c589100/ 256 | 9b 40 40 40 9c 41 41 41 9d 42 42 42 9e 43 43 43 " @@@ AAA BBB CCC" dt (j:1 t:1): 0x000000003c589110/ 272 | 9f 44 44 44 a0 45 45 45 a1 46 46 46 a2 47 47 47 " DDD EEE FFF GGG" dt (j:1 t:1): 0x000000003c589120/ 288 | a3 48 48 48 a4 49 49 49 a5 4a 4a 4a a6 4b 4b 4b " HHH III JJJ KKK" dt (j:1 t:1): 0x000000003c589130/ 304 | a7 4c 4c 4c a8 4d 4d 4d a9 4e 4e 4e aa 4f 4f 4f " LLL MMM NNN OOO" dt (j:1 t:1): 0x000000003c589140/ 320 | ab 50 50 50 ac 51 51 51 ad 52 52 52 ae 53 53 53 " PPP QQQ RRR SSS" dt (j:1 t:1): 0x000000003c589150/ 336 | af 54 54 54 b0 55 55 55 b1 56 56 56 b2 57 57 57 " TTT UUU VVV WWW" dt (j:1 t:1): 0x000000003c589160/ 352 | b3 58 58 58 b4 59 59 59 b5 5a 5a 5a b6 5b 5b 5b " XXX YYY ZZZ [[[" dt (j:1 t:1): 0x000000003c589170/ 368 | b7 5c 5c 5c b8 5d 5d 5d b9 5e 5e 5e ba 5f 5f 5f " \\\ ]]] ^^^ ___" dt (j:1 t:1): 0x000000003c589180/ 384 | bb 60 60 60 bc 61 61 61 bd 62 62 62 be 63 63 63 " ``` aaa bbb ccc" dt (j:1 t:1): 0x000000003c589190/ 400 | bf 64 64 64 c0 65 65 65 c1 66 66 66 c2 67 67 67 " ddd eee fff ggg" dt (j:1 t:1): 0x000000003c5891a0/ 416 | c3 68 68 68 c4 69 69 69 c5 6a 6a 6a c6 6b 6b 6b " hhh iii jjj kkk" dt (j:1 t:1): 0x000000003c5891b0/ 432 | c7 6c 6c 6c c8 6d 6d 6d c9 6e 6e 6e ca 6f 6f 6f " lll mmm nnn ooo" dt (j:1 t:1): 0x000000003c5891c0/ 448 | cb 70 70 70 cc 71 71 71 cd 72 72 72 ce 73 73 73 " ppp qqq rrr sss" dt (j:1 t:1): 0x000000003c5891d0/ 464 | cf 74 74 74 d0 75 75 75 d1 76 76 76 d2 77 77 77 " ttt uuu vvv www" dt (j:1 t:1): 0x000000003c5891e0/ 480 | d3 78 78 78 d4 79 79 79 d5 7a 7a 7a d6 7b 7b 7b " xxx yyy zzz {{{" dt (j:1 t:1): 0x000000003c5891f0/ 496 | d7 7c 7c 7c d8 7d 7d 7d d9 7e 7e 7e da 7f 7f 7f " ||| }}} ~~~ " dt (j:1 t:1): dt (j:1 t:1): The incorrect data starts at memory address 0x000000003c596000 (for Robin's debug! :) dt (j:1 t:1): The incorrect data starts at file offset 000000000000047008 (marked by asterisk '*') dt (j:1 t:1): Dumping Data File offsets (base = 47008, mismatch offset = 0, limit = 512 bytes): dt (j:1 t:1): / Block dt (j:1 t:1): File Offset / Index dt (j:1 t:1): 000000000000047008/ 0 |*73 5f 66 72 65 65 5f 63 6f 6d 6d 69 74 5f 61 72 "s_free_commit_ar" dt (j:1 t:1): 000000000000047024/ 16 | 72 61 79 00 54 43 50 5f 54 49 4d 45 5f 57 41 49 "ray TCP_TIME_WAI" dt (j:1 t:1): 000000000000047040/ 32 | 54 00 42 50 46 5f 50 52 4f 47 5f 54 59 50 45 5f "T BPF_PROG_TYPE_" dt (j:1 t:1): 000000000000047056/ 48 | 43 47 52 4f 55 50 5f 53 59 53 43 54 4c 00 4c 41 "CGROUP_SYSCTL LA" dt (j:1 t:1): 000000000000047072/ 64 | 59 4f 55 54 5f 46 4c 45 58 5f 46 49 4c 45 53 00 "YOUT_FLEX_FILES " dt (j:1 t:1): 000000000000047088/ 80 | 4e 46 53 45 52 52 5f 4a 55 4b 45 42 4f 58 00 72 "NFSERR_JUKEBOX r" dt (j:1 t:1): 000000000000047104/ 96 | 78 5f 63 70 75 5f 72 6d 61 70 00 6d 69 67 72 61 "x_cpu_rmap migra" dt (j:1 t:1): 000000000000047120/ 112 | 74 69 6f 6e 5f 64 69 73 61 62 6c 65 64 00 5f 5f "tion_disabled __" dt (j:1 t:1): 000000000000047136/ 128 | 64 61 74 61 00 6e 64 6f 5f 64 65 6c 5f 73 6c 61 "data ndo_del_sla" dt (j:1 t:1): 000000000000047152/ 144 | 76 65 00 6e 66 73 5f 63 6f 6d 6d 69 74 5f 64 61 "ve nfs_commit_da" dt (j:1 t:1): 000000000000047168/ 160 | 74 61 00 65 78 74 5f 6d 75 74 65 78 00 63 6f 6e "ta ext_mutex con" dt (j:1 t:1): 000000000000047184/ 176 | 6e 65 63 74 5f 63 6f 6f 6b 69 65 00 54 43 50 5f "nect_cookie TCP_" dt (j:1 t:1): 000000000000047200/ 192 | 43 4c 4f 53 45 5f 57 41 49 54 00 6d 65 6d 63 6d "CLOSE_WAIT memcm" dt (j:1 t:1): 000000000000047216/ 208 | 70 00 52 50 4d 5f 52 45 51 5f 53 55 53 50 45 4e "p RPM_REQ_SUSPEN" dt (j:1 t:1): 000000000000047232/ 224 | 44 00 63 72 6d 61 74 63 68 00 63 61 6e 63 65 6c "D crmatch cancel" dt (j:1 t:1): 000000000000047248/ 240 | 5f 66 6f 72 6b 00 70 67 70 72 6f 74 5f 74 00 74 "_fork pgprot_t t" dt (j:1 t:1): 000000000000047264/ 256 | 72 61 63 65 70 6f 69 6e 74 5f 70 74 72 5f 74 00 "racepoint_ptr_t " dt (j:1 t:1): 000000000000047280/ 272 | 66 6f 72 5f 72 65 63 6c 61 69 6d 00 4e 46 53 45 "for_reclaim NFSE" dt (j:1 t:1): 000000000000047296/ 288 | 52 52 5f 42 41 44 43 48 41 52 00 5f 73 6b 62 5f "RR_BADCHAR _skb_" dt (j:1 t:1): 000000000000047312/ 304 | 72 65 66 64 73 74 00 70 68 79 73 69 63 61 6c 5f "refdst physical_" dt (j:1 t:1): 000000000000047328/ 320 | 6c 6f 63 61 74 69 6f 6e 00 6e 75 6d 5f 72 65 71 "location num_req" dt (j:1 t:1): 000000000000047344/ 336 | 73 00 5f 5f 53 43 54 5f 5f 74 70 5f 66 75 6e 63 "s __SCT__tp_func" dt (j:1 t:1): 000000000000047360/ 352 | 5f 70 6e 66 73 5f 6d 64 73 5f 66 61 6c 6c 62 61 "_pnfs_mds_fallba" dt (j:1 t:1): 000000000000047376/ 368 | 63 6b 5f 77 72 69 74 65 5f 64 6f 6e 65 00 74 61 "ck_write_done ta" dt (j:1 t:1): 000000000000047392/ 384 | 73 6b 5f 63 6c 65 61 6e 75 70 00 65 78 70 61 6e "sk_cleanup expan" dt (j:1 t:1): 000000000000047408/ 400 | 64 5f 72 65 61 64 61 68 65 61 64 00 6c 6f 63 6b "d_readahead lock" dt (j:1 t:1): 000000000000047424/ 416 | 5f 6d 61 6e 61 67 65 72 5f 6f 70 65 72 61 74 69 "_manager_operati" dt (j:1 t:1): 000000000000047440/ 432 | 6f 6e 73 00 73 72 63 5f 72 65 67 00 63 72 64 65 "ons src_reg crde" dt (j:1 t:1): 000000000000047456/ 448 | 73 74 72 6f 79 00 63 68 69 6c 64 72 65 6e 5f 6c "stroy children_l" dt (j:1 t:1): 000000000000047472/ 464 | 6f 77 5f 75 73 61 67 65 00 6e 75 6d 5f 76 66 00 "ow_usage num_vf " dt (j:1 t:1): 000000000000047488/ 480 | 73 63 72 61 74 63 68 00 50 49 44 54 59 50 45 5f "scratch PIDTYPE_" dt (j:1 t:1): 000000000000047504/ 496 | 4d 41 58 00 70 72 65 70 61 72 65 5f 77 72 69 74 "MAX prepare_writ" dt (j:1 t:1): dt (j:1 t:1): dt (j:1 t:1): Analyzing IOT Record Data: (Note: Block #'s are relative to start of record!) dt (j:1 t:1): dt (j:1 t:1): IOT block size: 512 dt (j:1 t:1): Total number of blocks: 91 (47008 bytes) dt (j:1 t:1): Current IOT seed value: 0x01010101 (pass 1) dt (j:1 t:1): Range of corrupted blocks: 0 - 90 dt (j:1 t:1): Length of corrupted blocks: 91 (46592 bytes) dt (j:1 t:1): Corrupted blocks file offset: 47008 (LBA 91) dt (j:1 t:1): Number of corrupted blocks: 91 dt (j:1 t:1): Number of good blocks found: 0 dt (j:1 t:1): Number of zero blocks found: 0 dt (j:1 t:1): dt (j:1 t:1): Record #: 2 dt (j:1 t:1): Starting Record Offset: 47008 dt (j:1 t:1): Transfer Count: 47008 (0xb7a0) dt (j:1 t:1): Ending Record Offset: 94016 dt (j:1 t:1): Relative Record Block Range: 91 - 182 dt (j:1 t:1): Read Buffer Address: 0x3c596000 dt (j:1 t:1): Pattern Base Address: 0x3c589000 dt (j:1 t:1): Note: Incorrect data is marked with asterisk '*' dt (j:1 t:1): dt (j:1 t:1): Record Block: 0 (BAD data) dt (j:1 t:1): Record Block Offset: 47008 (LBA 91) dt (j:1 t:1): Record Buffer Index: 0 (0x0) dt (j:1 t:1): Expected Block Number: 91 (0x0000005b) dt (j:1 t:1): Received Block Number: 1919311731 (0x72665f73) dt (j:1 t:1): Received Block Offset: 982687606272 dt (j:1 t:1): dt (j:1 t:1): Byte Expected: address 0x3c589000 Received: address 0x3c596000 dt (j:1 t:1): 0000 0000005b 0101015c 0202025d 0303035e * 72665f73 635f6565 696d6d6f 72615f74 dt (j:1 t:1): 0010 0404045f 05050560 06060661 07070762 * 00796172 5f504354 454d4954 4941575f dt (j:1 t:1): 0020 08080863 09090964 0a0a0a65 0b0b0b66 * 50420054 52505f46 545f474f 5f455059 dt (j:1 t:1): 0030 0c0c0c67 0d0d0d68 0e0e0e69 0f0f0f6a * 4f524743 535f5055 54435359 414c004c dt (j:1 t:1): 0040 1010106b 1111116c 1212126d 1313136e * 54554f59 454c465f 49465f58 0053454c dt (j:1 t:1): 0050 1414146f 15151570 16161671 17171772 * 4553464e 4a5f5252 42454b55 7200584f dt (j:1 t:1): 0060 18181873 19191974 1a1a1a75 1b1b1b76 * 70635f78 6d725f75 6d007061 61726769 dt (j:1 t:1): 0070 1c1c1c77 1d1d1d78 1e1e1e79 1f1f1f7a * 6e6f6974 7369645f 656c6261 5f5f0064 dt (j:1 t:1): 0080 2020207b 2121217c 2222227d 2323237e * 61746164 6f646e00 6c65645f 616c735f dt (j:1 t:1): 0090 2424247f 25252580 26262681 27272782 * 6e006576 635f7366 696d6d6f 61645f74 dt (j:1 t:1): 00a0 28282883 29292984 2a2a2a85 2b2b2b86 * 65006174 6d5f7478 78657475 6e6f6300 dt (j:1 t:1): 00b0 2c2c2c87 2d2d2d88 2e2e2e89 2f2f2f8a * 7463656e 6f6f635f 0065696b 5f504354 dt (j:1 t:1): 00c0 3030308b 3131318c 3232328d 3333338e * 534f4c43 41575f45 6d005449 6d636d65 dt (j:1 t:1): 00d0 3434348f 35353590 36363691 37373792 * 50520070 45525f4d 55535f51 4e455053 dt (j:1 t:1): 00e0 38383893 39393994 3a3a3a95 3b3b3b96 * 72630044 6374616d 61630068 6c65636e dt (j:1 t:1): 00f0 3c3c3c97 3d3d3d98 3e3e3e99 3f3f3f9a * 726f665f 6770006b 746f7270 7400745f dt (j:1 t:1): 0100 4040409b 4141419c 4242429d 4343439e * 65636172 6e696f70 74705f74 00745f72 dt (j:1 t:1): 0110 4444449f 454545a0 464646a1 474747a2 * 5f726f66 6c636572 006d6961 4553464e dt (j:1 t:1): 0120 484848a3 494949a4 4a4a4aa5 4b4b4ba6 * 425f5252 48434441 5f005241 5f626b73 dt (j:1 t:1): 0130 4c4c4ca7 4d4d4da8 4e4e4ea9 4f4f4faa * 64666572 70007473 69737968 5f6c6163 dt (j:1 t:1): 0140 505050ab 515151ac 525252ad 535353ae * 61636f6c 6e6f6974 6d756e00 7165725f dt (j:1 t:1): 0150 545454af 555555b0 565656b1 575757b2 * 5f5f0073 5f544353 5f70745f 636e7566 dt (j:1 t:1): 0160 585858b3 595959b4 5a5a5ab5 5b5b5bb6 * 666e705f 646d5f73 61665f73 61626c6c dt (j:1 t:1): 0170 5c5c5cb7 5d5d5db8 5e5e5eb9 5f5f5fba * 775f6b63 65746972 6e6f645f 61740065 dt (j:1 t:1): 0180 606060bb 616161bc 626262bd 636363be * 635f6b73 6e61656c 65007075 6e617078 dt (j:1 t:1): 0190 646464bf 656565c0 666666c1 676767c2 * 65725f64 68616461 00646165 6b636f6c dt (j:1 t:1): 01a0 686868c3 696969c4 6a6a6ac5 6b6b6bc6 * 6e616d5f 72656761 65706f5f 69746172 dt (j:1 t:1): 01b0 6c6c6cc7 6d6d6dc8 6e6e6ec9 6f6f6fca * 00736e6f 5f637273 00676572 65647263 dt (j:1 t:1): 01c0 707070cb 717171cc 727272cd 737373ce * 6f727473 68630079 72646c69 6c5f6e65 dt (j:1 t:1): 01d0 747474cf 757575d0 767676d1 777777d2 * 755f776f 65676173 6d756e00 0066765f dt (j:1 t:1): 01e0 787878d3 797979d4 7a7a7ad5 7b7b7bd6 * 61726373 00686374 54444950 5f455059 dt (j:1 t:1): 01f0 7c7c7cd7 7d7d7dd8 7e7e7ed9 7f7f7fda * 0058414d 70657270 5f657261 74697277 ... dt (j:1 t:1): Reread data does NOT match previous data or expected data! dt (j:1 t:1): Writing reread data to file dt_thisisa.test-REREAD3-j1t1, from buffer 0x7f12bc004000, 47008 bytes... dt (j:1 t:1): Command line to re-read the corrupted data: dt (j:1 t:1): # dt if=/mnt/hs_test/dt_thisisa.test bs=47008 count=1 offset=47008 pattern=iot disable=retryDC,savecorrupted,trigdefaults dt (j:1 t:1): dt (j:1 t:1): Command line to re-read the data: dt (j:1 t:1): # dt if=/mnt/hs_test/dt_thisisa.test bs=47008 dsize=512 iotype=sequential iodir=forward limit=94016 records=1 pattern=iot disable=retryDC,savecorrupted,trigdefaults dt (j:1 t:1): dt (j:1 t:1): dt (j:1 t:1): Read Statistics: dt (j:1 t:1): Job Information Reported: Job 1, Thread 1 dt (j:1 t:1): Last IOT seed value used: 0x01010101 dt (j:1 t:1): Total records processed: 2 @ 47008 bytes/record (45.906 Kbytes) dt (j:1 t:1): Total bytes transferred: 94016 (91.812 Kbytes, 0.090 Mbytes) dt (j:1 t:1): Average transfer rates: 656857 bytes/sec, 641.462 Kbytes/sec, 0.626 Mbytes/sec dt (j:1 t:1): Number I/O's per second: 13.973 dt (j:1 t:1): Number seconds per I/O: 0.0716 (71.56ms) dt (j:1 t:1): Total passes completed: 1/1 dt (j:1 t:1): Total errors detected: 1/1 dt (j:1 t:1): Total elapsed time: 00m00.15s dt (j:1 t:1): Total system time: 00m00.00s dt (j:1 t:1): Total user time: 00m00.00s dt (j:1 t:1): Starting time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): Ending time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): dt (j:1 t:1): Operating System Information: dt (j:1 t:1): Host name: plsm121c-06.perf.hammer.space (192.168.1.106) dt (j:1 t:1): User name: root dt (j:1 t:1): Process ID: 31703 dt (j:1 t:1): OS information: Linux 6.12.24.17.hs.snitm+ #34 SMP PREEMPT_DYNAMIC Fri Aug 15 22:03:10 UTC 2025 x86_64 dt (j:1 t:1): dt (j:1 t:1): File System Information: dt (j:1 t:1): Mounted from device: 192.168.0.105:/hs_test dt (j:1 t:1): Mounted on directory: /mnt/hs_test dt (j:1 t:1): Filesystem type: nfs4 dt (j:1 t:1): Filesystem options: rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,fatal_neterrors=none,proto=tcp,nconnect=16,port=20491,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.106,local_lock=none,addr=192.168.0.105 dt (j:1 t:1): Filesystem block size: 1048576 dt (j:1 t:1): Filesystem free space: 60990430380032 (58165007.000 Mbytes, 56801.765 Gbytes, 55.470 Tbytes) dt (j:1 t:1): Filesystem total space: 60992310476800 (58166800.000 Mbytes, 56803.516 Gbytes, 55.472 Tbytes) dt (j:1 t:1): dt (j:1 t:1): Total Statistics: dt (j:1 t:1): Output device/file name: /mnt/hs_test/dt_thisisa.test (device type=regular) dt (j:1 t:1): Type of I/O's performed: sequential (forward) dt (j:1 t:1): Job Information Reported: Job 1, Thread 1 dt (j:1 t:1): Data pattern string used: 'IOT Pattern' (blocking is 512 bytes) dt (j:1 t:1): Last IOT seed value used: 0x01010101 dt (j:1 t:1): Total records read: 2 dt (j:1 t:1): Total bytes read: 94016 (91.812 Kbytes, 0.090 Mbytes, 0.000 Gbytes) dt (j:1 t:1): Total records written: 3 dt (j:1 t:1): Total bytes written: 141024 (137.719 Kbytes, 0.134 Mbytes, 0.000 Gbytes) dt (j:1 t:1): Total records processed: 5 @ 47008 bytes/record (45.906 Kbytes) dt (j:1 t:1): Total bytes transferred: 235040 (229.531 Kbytes, 0.224 Mbytes) dt (j:1 t:1): Average transfer rates: 828023 bytes/sec, 808.616 Kbytes/sec, 0.790 Mbytes/sec dt (j:1 t:1): Number I/O's per second: 17.615 dt (j:1 t:1): Number seconds per I/O: 0.0568 (56.77ms) dt (j:1 t:1): Total passes completed: 1/1 dt (j:1 t:1): Total errors detected: 1/1 dt (j:1 t:1): Total elapsed time: 00m00.29s dt (j:1 t:1): Total system time: 00m00.00s dt (j:1 t:1): Total user time: 00m00.00s dt (j:1 t:1): Starting time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): Ending time: Sat Aug 30 16:14:08 2025 dt (j:1 t:1): dt (j:1 t:1): Command line to re-read the data: dt (j:1 t:1): # dt if=/mnt/hs_test/dt_thisisa.test bs=47008 dsize=512 iotype=sequential iodir=forward limit=141024 records=3 pattern=iot dt (j:1 t:1): dt (j:1 t:1): Command Line: dt (j:1 t:1): dt (j:1 t:1): # dt of=/mnt/hs_test/dt_thisisa.test passes=1 bs=47008 count=3 iotype=sequential pattern=iot onerr=abort oncerr=abort dt (j:1 t:1): dt (j:1 t:1): --> Date: September 21st, 2023, Version: 25.05, Author: Robin T. Miller <-- dt (j:1 t:1): dt (j:1 t:1): onerr=abort, so stopping all threads for job 1... dt (j:0 t:0): Job 1 is being stopped (1 thread) dt (j:0 t:0): Program is exiting with status -1... --- fs/nfsd/vfs.c | 25 ++++++------------------- 1 file changed, 6 insertions(+), 19 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index f8975ee262b5c..762d745b1b15d 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1079,13 +1079,11 @@ struct nfsd_read_dio { loff_t end; unsigned long start_extra; unsigned long end_extra; - struct page *start_extra_page; }; static void init_nfsd_read_dio(struct nfsd_read_dio *read_dio) { memset(read_dio, 0, sizeof(*read_dio)); - read_dio->start_extra_page = NULL; } #define NFSD_READ_DIO_MIN_KB (32 << 10) @@ -1121,9 +1119,8 @@ static bool nfsd_analyze_read_dio(struct svc_rqst *rqstp, struct svc_fh *fhp, /* * Any misaligned READ less than NFSD_READ_DIO_MIN_KB won't be expanded - * to be DIO-aligned (this heuristic avoids excess work, like allocating - * start_extra_page, for smaller IO that can generally already perform - * well using buffered IO). + * to be DIO-aligned (this heuristic avoids excess work, for smaller IO + * that can generally already perform well using buffered IO). */ if ((read_dio->start_extra || read_dio->end_extra) && (len < NFSD_READ_DIO_MIN_KB)) { @@ -1131,15 +1128,6 @@ static bool nfsd_analyze_read_dio(struct svc_rqst *rqstp, struct svc_fh *fhp, return false; } - if (read_dio->start_extra) { - read_dio->start_extra_page = alloc_page(GFP_KERNEL); - if (WARN_ONCE(read_dio->start_extra_page == NULL, - "%s: Unable to allocate start_extra_page\n", __func__)) { - init_nfsd_read_dio(read_dio); - return false; - } - } - /* Show original offset and count, and how it was expanded for DIO */ middle_end = read_dio->end - read_dio->end_extra; trace_nfsd_analyze_read_dio(rqstp, fhp, offset, len, @@ -1162,11 +1150,10 @@ static ssize_t nfsd_complete_misaligned_read_dio(struct svc_rqst *rqstp, if (!read_dio->start_extra && !read_dio->end_extra) return host_err; - /* If nfsd_analyze_read_dio() allocated a start_extra_page it must - * be removed from rqstp->rq_bvec[] to avoid returning unwanted data. + /* If nfsd_analyze_read_dio() found start_extra (front-pad) page needed it + * must be removed from rqstp->rq_bvec[] to avoid returning unwanted data. */ - if (read_dio->start_extra_page) { - __free_page(read_dio->start_extra_page); + if (read_dio->start_extra) { *rq_bvec_numpages -= 1; v = *rq_bvec_numpages; memmove(rqstp->rq_bvec, rqstp->rq_bvec + 1, @@ -1276,7 +1263,7 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp, if (read_dio.start_extra) { len = read_dio.start_extra; bvec_set_page(&rqstp->rq_bvec[v], - read_dio.start_extra_page, + *(rqstp->rq_next_page++), len, PAGE_SIZE - len); total -= len; ++v; -- 2.44.0