On Thu, 04 Sep 2025, Justin Worrell wrote: > My initial attempt to submit this patch was not formatted correctly. The only change in v2 is me trying to use git send-email correctly. Thanks for being patient with me. The v2 patch is good from a formatting standpoint (i.e. it applies with 'git am') but you're missing a few things: - You need a patch description (I thought what you had in v1 was fine). See https://docs.kernel.org/process/submitting-patches.html#describe-your-changes - You need a Signed-off-by: line. See https://docs.kernel.org/process/submitting-patches.html#developer-s-certificate-of-origin-1-1 You should probably also add: Fixes: cc5d59081fa2 ("sunrpc: fix client side handling of tls alerts") You can also add: Reviewed-and-tested-by: Scott Mayhew <smayhew@xxxxxxxxxx> > > xs_sock_recv_cmsg was failing to call xs_sock_process_cmsg for any cmsg type other than TLS_RECORD_TYPE_ALERT (TLS_RECORD_TYPE_DATA, and other values not handled.) Based on my reading of the previous commit (cc5d5908: sunrpc: fix client side handling of tls alerts), it looks like only iov_iter_revert should be conditional on TLS_RECORD_TYPE_ALERT (but that other cmsg types should still call xs_sock_process_cmsg). On my machine, I was unable to connect (over mtls) to an NFS share hosted on FreeBSD. With this patch applied, I am able to mount the share again. > > Based on analysis by Olga Kornievskaia, while this patch does allow Linux to mount FreeBSD exports over (m)tls -- it does not seem to fully correct all underlying issues. I think I’m a bit over my head to analyze the protocol for deeper issues than what I found (which was just thinking through functional differences between the old and new code, and throwing stuff at the wall till it runs.) > Olga's main concern was why were seeing so many tls_contenttype tracepoints fire where the type was HANDSHAKE. I think I can explain why. Prior to cc5d59081fa2, we'd read in those HANDSHAKE records via a handful of sock_recvmsg() calls, and then we'd discard them (because we only have handling for DATA and ALERT records). But we also had a bug in our handling of ALERT records that could lead to kernel panics. cc5d59081fa2 addressed the kernel panics by setting up a separate 2-byte buffer to read in ALERT records. But we're also receiving those HANDSHAKE records using that 2-byte buffer. That means a lot more sock_recvmsg()... and a lot more tls_get_record_type() (which fires the tls_contenttype tracepoint) calls. Now couple that with the bug that you identified - that we're no longer tossing those HANDSHAKE messages to the side. Instead we treat them as a if they have RPC data. We manage to find what looks like a RPC fragment header, so then we check what we think is the call direction field. It happens to be four-bytes of zeros, so we think it's an RPC call (i.e we think the the server is sending us an RPC call). Because the transport isn't for the backchannel, the client closes the socket. The client keeps retrying the mount, so rinse and repeat a whole bunch of times until the mount eventually fails, and that's a lot of HANDSHAKE tracepoints firing. At least that's what I'm seeing in my environment. Your patch fixes the issue by ensuring that we call xs_sock_process_cmsg() for all records, so now we're discarding those HANDSHAKE records again. We see more HANDSHAKE tracepoints firing, but the number of them makes sense to me. When I look at a packet capture after I perform a mount of my freebsd server, I see that the server sent 2 New Session Ticket messages. Each one has a decrypted payload that is 233 bytes in length. If we're pulling that in using a two-byte buffer, we need to call sock_recvmsg() 117 times. Each time we're actually firing the tls_contenttype tracepoint twice, because we call tls_get_record_type() (which calls the tracepoint) twice - once in xs_sock_recv_cmsg() and again in xs_sock_process_cmsg(). So we should see 234 HANDSHAKE tracepoints. Now double that, because there were 2 New Session Messages, for a total of 468 HANDSHAKE tracepoints... and that's exactly what I see: [root@rawhide ~]# echo 1 >/sys/kernel/debug/tracing/events/handshake/enable [root@rawhide ~]# mount -o v4.2,xprtsec=tls 10.6.54.203:/ /mnt/t [root@rawhide ~]# grep -c HANDSHAKE /sys/kernel/debug/tracing/trace 468 We should probably pull in those HANDSHAKE records more efficiently (maybe pivot back to reading the data into the msghdr->msg_iter when we detect we have a HANDSHAKE record?). At any rate, I think that should be done separately, since your patch fixes a clear bug. -Scott > Obviously it is up to the community and maintainers if stands alone as step in the right direction, or if a full root cause needs to be found and corrected before proceeding with any changes. > > Justin Worrell (1): > call xs_sock_process_cmsg for all cmsg > > net/sunrpc/xprtsock.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > -- > 2.51.0 >