On Tue, May 20, 2025 at 03:59:16PM -0400, cel@xxxxxxxxxx wrote:
> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>
> Engineers at Hammerspace noticed that sometimes mounting with
> "xprtsec=tls" hangs for a minute or so, and then times out, even
> when the NFS server is reachable and responsive.
>
> kTLS shuts off data_ready callbacks if strp->msg_ready is set to
> mitigate data_ready callbacks when a full TLS record is not yet
> ready to be read from the socket.
>
> Normally msg_ready is clear when the first TLS record arrives on
> a socket. However, I observed that sometimes tls_setsockopt() sets
> strp->msg_ready, and that prevents forward progress because
> tls_data_ready() becomes a no-op.
>
> Moreover, Jakub says: "If there's a full record queued at the time
> when [tlshd] passes the socket back to the kernel, it's up to the
> reader to read the already queued data out." So SunRPC cannot
> expect a data_ready call when ingress data is already waiting.
>
> Add an explicit poll after SunRPC's upper transport is set up to
> pick up any data that arrived after the TLS handshake but before
> transport set-up is complete.
>
> Reported-by: Steve Sears <sjs@xxxxxxxxxxxxxxx>
> Suggested-by: Jakub Kacinski <kuba@xxxxxxxxxx>
> Fixes: 75eb6af7acdf ("SUNRPC: Add a TCP-with-TLS RPC transport class")
> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> ---
>  net/sunrpc/xprtsock.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> Mike, can you try this out?

Works well, thanks to you and Jakub for seeing this through!
Tested-by: Mike Snitzer <snitzer@xxxxxxxxxx>
Reviewed-by: Mike Snitzer <snitzer@xxxxxxxxxx>

>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 83cc095846d3..4b10ecf4c265 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2740,6 +2740,11 @@ static void xs_tcp_tls_setup_socket(struct work_struct *work)
>  	}
>  	rpc_shutdown_client(lower_clnt);
>
> +	/* Check for ingress data that arrived before the socket's
> +	 * ->data_ready callback was set up.
> +	 */
> +	xs_poll_check_readable(upper_transport);
> +
>  out_unlock:
>  	current_restore_flags(pflags, PF_MEMALLOC);
>  	upper_transport->clnt = NULL;
> --
> 2.49.0
>
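
For anyone following along, here is a rough userspace sketch of the pattern the patch relies on: a notification hook that only fires for data arriving after it is installed, plus an explicit readability check once set-up finishes. This is not the kernel code. Every demo_* name is hypothetical, and a pipe with poll(2) stands in for the TLS socket and for xs_poll_check_readable().

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for a transport whose reader runs only when asked. */
struct demo_transport {
	int fd;              /* read end of the byte stream */
	bool notify_armed;   /* true once the "data_ready" hook is installed */
	bool read_scheduled; /* true once the reader has been asked to run */
};

/* Edge-style notification: ignored unless the hook is already armed. */
static void demo_data_arrived(struct demo_transport *t)
{
	if (t->notify_armed)
		t->read_scheduled = true;
}

/* Explicit poll: zero-timeout check for data that is already queued. */
static void demo_poll_check_readable(struct demo_transport *t)
{
	struct pollfd pfd = { .fd = t->fd, .events = POLLIN };

	if (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN))
		t->read_scheduled = true;
}

/*
 * Simulate set-up where data lands before the hook is armed.
 * Returns true if the reader ends up scheduled.
 */
static bool demo_setup(bool with_poll)
{
	struct demo_transport t = { .fd = -1 };
	int fds[2];

	if (pipe(fds) != 0)
		return false;
	t.fd = fds[0];

	/* Ingress data arrives while set-up is still in progress. */
	if (write(fds[1], "x", 1) == 1)
		demo_data_arrived(&t);	/* dropped: hook not armed yet */

	t.notify_armed = true;		/* set-up completes */

	if (with_poll)
		demo_poll_check_readable(&t);

	close(fds[0]);
	close(fds[1]);
	return t.read_scheduled;
}
```

In this sketch, demo_setup(false) leaves the reader unscheduled even though a byte is queued, which mirrors the mount hang described above. demo_setup(true) picks the byte up through the explicit check.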