On 7/30/25 4:08 PM, Olga Kornievskaia wrote: > Scott Mayhew discovered a security exploit in NFS over TLS in > tls_alert_recv() due to its assumption it can read data from > the msg iterator's kvec.. > > kTLS implementation splits TLS non-data record payload between > the control message buffer (which includes the type such as TLS > aler or TLS cipher change) and the rest of the payload (say TLS > alert's level/description) which goes into the msg payload buffer. > > This patch proposes to rework how control messages are setup and > used by sock_recvmsg(). > > If no control message structure is setup, kTLS layer will read and > process TLS data record types. As soon as it encounters a TLS control > message, it would return an error. At that point, NFS can setup a > kvec backed msg buffer and read in the control message such as a > TLS alert. Scott found that msg iterator can advance the kvec > pointer as a part of the copy process thus we need to revert the > iterator before calling into the tls_alert_recv. > > Reported-by: Scott Mayhew <smayhew@xxxxxxxxxx> > Fixes: 5e052dda121e ("SUNRPC: Recognize control messages in server-side TCP socket code") > Suggested-by: Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> > Suggested-by: Scott Mayhew <smayhew@xxxxxxxxxx> > Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx> > --- > net/sunrpc/svcsock.c | 43 +++++++++++++++++++++++++++++++++++-------- > 1 file changed, 35 insertions(+), 8 deletions(-) > > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c > index 46c156b121db..e2c5e0e626f9 100644 > --- a/net/sunrpc/svcsock.c > +++ b/net/sunrpc/svcsock.c > @@ -257,20 +257,47 @@ svc_tcp_sock_process_cmsg(struct socket *sock, struct msghdr *msg, > } > > static int > -svc_tcp_sock_recv_cmsg(struct svc_sock *svsk, struct msghdr *msg) > +svc_tcp_sock_recv_cmsg(struct socket *sock, unsigned int *msg_flags) > { > union { > struct cmsghdr cmsg; > u8 buf[CMSG_SPACE(sizeof(u8))]; > } u; > - struct socket *sock = svsk->sk_sock; > + u8 alert[2]; > + struct kvec alert_kvec = { > + .iov_base = alert, > + .iov_len = sizeof(alert), > + }; > + struct msghdr msg = { > + .msg_flags = *msg_flags, > + .msg_control = &u, > + .msg_controllen = sizeof(u), > + }; > + int ret; > + > + iov_iter_kvec(&msg.msg_iter, ITER_DEST, &alert_kvec, 1, > + alert_kvec.iov_len); > + ret = sock_recvmsg(sock, &msg, MSG_DONTWAIT); > + if (ret > 0 && > + tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT) { > + iov_iter_revert(&msg.msg_iter, ret); > + ret = svc_tcp_sock_process_cmsg(sock, &msg, &u.cmsg, -EAGAIN); > + } > + return ret; > +} > + > +static int > +svc_tcp_sock_recvmsg(struct svc_sock *svsk, struct msghdr *msg) > +{ > int ret; > + struct socket *sock = svsk->sk_sock; > > - msg->msg_control = &u; > - msg->msg_controllen = sizeof(u); > ret = sock_recvmsg(sock, msg, MSG_DONTWAIT); > - if (unlikely(msg->msg_controllen != sizeof(u))) > - ret = svc_tcp_sock_process_cmsg(sock, msg, &u.cmsg, ret); > + if (msg->msg_flags & MSG_CTRUNC) { > + msg->msg_flags &= ~(MSG_CTRUNC | MSG_EOR); > + if (ret == 0 || ret == -EIO) > + ret = svc_tcp_sock_recv_cmsg(sock, &msg->msg_flags); > + } > return ret; > } > > @@ -321,7 +348,7 @@ static ssize_t svc_tcp_read_msg(struct svc_rqst *rqstp, size_t buflen, > iov_iter_advance(&msg.msg_iter, seek); > buflen -= seek; > } > - len = svc_tcp_sock_recv_cmsg(svsk, &msg); > + len = svc_tcp_sock_recvmsg(svsk, &msg); > if (len > 0) > svc_flush_bvec(bvec, len, seek); > > @@ -1018,7 +1045,7 @@ static ssize_t svc_tcp_read_marker(struct svc_sock *svsk, > iov.iov_base = ((char *)&svsk->sk_marker) + svsk->sk_tcplen; > iov.iov_len = want; > iov_iter_kvec(&msg.msg_iter, ITER_DEST, &iov, 1, want); > - len = svc_tcp_sock_recv_cmsg(svsk, &msg); > + len = svc_tcp_sock_recvmsg(svsk, &msg); > if (len < 0) > return len; > svsk->sk_tcplen += len; Fwiw, I've already pulled 1/4 into nfsd-testing. But 4/4 might not apply until the others are in the tree. We might want these to go through a single tree. Or, can we delay 4/4 until 6.18 ? -- Chuck Lever