[PATCH v9 9/9] NFSD: use /end/ of rq_pages for misaligned DIO READ's start_extra page

This commit works around what appears to be a flexfiles+rpcrdma bug.
Chuck Lever clarified that this workaround should not be needed:

    "Yes, the extra page needs to come from rq_pages. But I don't see
    why it should come from the /end/ of rq_pages."

However, when NFSD DIRECT is used for READ and an NFS 4.2 client uses
pNFS flexfiles (and the client gets a layout to use a v3 DS) over RDMA,
it is easy to see a data mismatch when NFSD handles a misaligned DIO
READ. If the same misaligned DIO READ is issued directly to the v3 DS
over RDMA (so flexfiles is _not_ used), no data mismatch occurs.

Therefore, until this bug is found, NFSD must use a 'start_extra' page
from rq_pages that follows the NFS client's requested READ payload
(RDMA memory) whenever expanding the misaligned READ requires reading
an extra partial page at the start of the READ so that it is
DIO-aligned.

Otherwise, if the 'start_extra' page is taken from the beginning of
rq_pages, the pNFS flexfiles client will see data mismatch corruption.
The corruption was found, and this fix of using the end of rq_pages
was verified, using the 'dt' utility:
      dt of=/mnt/share1/dt_a.test passes=1 bs=47008 count=2 \
         iotype=sequential pattern=iot onerr=abort oncerr=abort
    see: https://github.com/RobinTMiller/dt.git
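
For illustration only, the misalignment this reproducer can provoke may
be sketched as below. This is not the NFSD code: the helper names are
hypothetical, and the 4096-byte DIO alignment is an assumption (the
real value depends on the exported filesystem and device). The client
may also split or merge READs, so the offsets NFSD actually sees can
differ from the dt block boundaries.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not from fs/nfsd/vfs.c). 'align' is an assumed
 * DIO alignment and must be a power of two.
 */

/* Bytes that must be read before 'offset' to start on an aligned boundary. */
static inline uint64_t dio_start_extra(uint64_t offset, uint64_t align)
{
	return offset & (align - 1);
}

/* Bytes that must be read past 'offset + len' to end on an aligned boundary. */
static inline uint64_t dio_end_extra(uint64_t offset, uint64_t len,
				     uint64_t align)
{
	uint64_t end = offset + len;

	return (align - (end & (align - 1))) & (align - 1);
}
```

Under those assumptions, the second 47008-byte block (offset 47008)
would need 1952 extra bytes at the start and 192 at the end, so a
'start_extra' partial page is required for that READ.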

Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
---
 fs/nfsd/vfs.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 5b3c6072b6f5c..e9ddeec3c9a32 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1263,7 +1263,7 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			if (read_dio.start_extra) {
 				len = read_dio.start_extra;
 				bvec_set_page(&rqstp->rq_bvec[v],
-					      *(rqstp->rq_next_page++),
+					      NULL, /* set below */
 					      len, PAGE_SIZE - len);
 				total -= len;
 				++v;
@@ -1288,6 +1288,11 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		base = 0;
 	}
 	WARN_ON_ONCE(v > rqstp->rq_maxpages);
+	/* FIXME: having the start_extra page come from the end of
+	 * rq_pages[] works around what seems to be a flexfiles+rpcrdma bug.
+	 */
+	if ((kiocb.ki_flags & IOCB_DIRECT) && read_dio.start_extra)
+		rqstp->rq_bvec[0].bv_page = *(rqstp->rq_next_page++);
 
 	trace_nfsd_read_vector(rqstp, fhp, offset, in_count);
 	iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, in_count);
-- 
2.44.0




