On Tue, Sep 2, 2025 at 7:06 PM Justin Worrell <jworrell@xxxxxxxxx> wrote:
>
>
>
> On 9/2/25 4:11 PM, Olga Kornievskaia wrote:
> > On Tue, Sep 2, 2025 at 4:46 PM Justin Worrell <jworrell@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 9/2/25 11:21 AM, Olga Kornievskaia wrote:
> >>> On Tue, Sep 2, 2025 at 8:27 AM Justin Worrell <jworrell@xxxxxxxxx> wrote:
> >>>>
> >>>> xs_sock_recv_cmsg was failing to call xs_sock_process_cmsg for any cmsg
> >>>> type other than TLS_RECORD_TYPE_ALERT (TLS_RECORD_TYPE_DATA, and other
> >>>> values not handled.) Based on my reading of the previous commit
> >>>> (cc5d5908: sunrpc: fix client side handling of tls alerts), it looks
> >>>> like only iov_iter_revert should be conditional on TLS_RECORD_TYPE_ALERT
> >>>> (but that other cmsg types should still call xs_sock_process_cmsg). On
> >>>> my machine, I was unable to connect (over mtls) to an NFS share hosted
> >>>> on FreeBSD. With this patch applied, I am able to mount the share again.
> >>>
> >>> Thanks for the catch Justin. Indeed, the client fails to return an
> >>> error in case it receives anything other than TLS DATA or TLS ALERT.
> >>> Could you tell what kind of TLS message the FreeBSD server is sending?
> >>> Either a network trace or turning on tls_contentype tracepoint should
> >>> show what type the client has been receiving.
> >>
> >> Hi Olga,
> >>
> >> Unfortunately, I don't know much (anything, really) about Kernel
> >> debugging or the SSL protocol. I do have root on both boxes and am happy
> >> to provide whatever information would help with better understanding the
> >> issue. Could you provide some guidance (even if just where to go to
> >> rtfm) to fetch the requested info? I don't imagine just a tcpdump of the
> >> ciphertext is sufficient. If providing this assistance is too spammy for
> >> the list, it is okay to reach out off-list.
> >
> > Hi Justin,
> >
> > If you can do either of the 2 below that should capture the needed information.
> >
> > For tracepoints (the following is easiest for me, others might prefer
> > usage of trace-cmd), as root, prior to executing the mount command
> > which I believe shows (demonstrates) the problem,
> > echo 1 > /sys/kernel/debug/tracing/events/handshake/tls_contenttype/enable
> > cat /sys/kernel/debug/tracing/trace_pipe (this can be redirected to a
> > file if desired)
> > do the mount with TLS
> > ctrl-c the cat. Provide output of cat command. I hope that should show
> > the types of control messages the client received.
> >
> > Tcpdump is useful if there is a corresponding TLS session key
> > included. Tlshd (the user level daemon that handles the TLS handshake
> > for the kernel NFS) will dump session key material to the location of
> > the SSLKEYLOGFILE environmental variable. So easiest (for me), set an
> > environment variable on the command line. SSLKEYLOGFILE=ssl.log, then
> > on the same shell run manually /usr/sbin/tlshd -s (assuming you
> > stopped the system's tlshd that was running before). Start tcpdump to
> > capture a network trace. Do the mount. Stop the network trace. Provide
> > ssl.log file and network trace (wireshark can decode TLS traffic
> > provided that log file). If it's not appropriate stopping tlshd and
> > running it by hand, then turning on tracepoints might be the way to
> > go.
> >
> > Thank you for your help.
> >
>
> Hi Olga,
>
> I'm not sure if attachments are allowed on this list, will be stripped,
> or if this email will be rejected. Fingers crossed.

Thank you for all the info. It was very useful.

> The tracepoint option produces the following line (sometimes with .l...,
> sometimes with .....) ~3400 times, which was unexpected to me:
> kworker/u40:2-712 [007] .l... 203.225007: tls_contenttype:
> src=192.168.124.204:896 dest=10.1.2.9:2049 HANDSHAKE

This is indeed interesting. It points to something looping over
'receiving' TLS HANDSHAKE type records after the handshake is done.
Something is possibly (still) not right with the code.

> I have attached the output from cat'ing tracepoints as well as the
> tcpdump pcap file and tlshd ssl.log.
>
> All of this is from the VM where I have applied my patch (and the mount
> works). I can provide output for a stock kernel (where the mount command
> hangs) as well if required.
>
> >
> >>
> >>>> ---
> >>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> >>>> --- a/net/sunrpc/xprtsock.c (revision
> >>>> b320789d6883cc00ac78ce83bccbfe7ed58afcf0)
> >>>> +++ b/net/sunrpc/xprtsock.c (date 1756813457481)
> >>>> @@ -407,9 +407,9 @@
> >>>>  	iov_iter_kvec(&msg.msg_iter, ITER_DEST, &alert_kvec, 1,
> >>>>  		      alert_kvec.iov_len);
> >>>>  	ret = sock_recvmsg(sock, &msg, flags);
> >>>> -	if (ret > 0 &&
> >>>> -	    tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT) {
> >>>> -		iov_iter_revert(&msg.msg_iter, ret);
> >>>> +	if (ret > 0) {
> >>>> +		if (tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT)
> >>>> +			iov_iter_revert(&msg.msg_iter, ret);
> >>>>  		ret = xs_sock_process_cmsg(sock, &msg, msg_flags, &u.cmsg,
> >>>>  					   -EAGAIN);
> >>>>  	}
> >>>>
> >>
> >
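
For anyone following the thread, here is a rough user-space model of
the control-flow change in the quoted diff. It is only a sketch and not
kernel code: the enum, the struct, and the function names are made up
for illustration, the two booleans stand in for iov_iter_revert() and
xs_sock_process_cmsg(), and the numeric record types are the
ContentType values from RFC 8446.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for TLS record types (RFC 8446 ContentType). */
enum record_type {
	RECORD_ALERT = 21,
	RECORD_HANDSHAKE = 22,
	RECORD_DATA = 23,
};

/* Records which of the two steps would have run for one received cmsg. */
struct outcome {
	bool reverted;	/* models iov_iter_revert() being called */
	bool processed;	/* models xs_sock_process_cmsg() being called */
};

/* Before the patch: both steps are gated on the record being an alert. */
static struct outcome recv_cmsg_before(int ret, enum record_type type)
{
	struct outcome o = { false, false };

	if (ret > 0 && type == RECORD_ALERT) {
		o.reverted = true;
		o.processed = true;
	}
	return o;
}

/* After the patch: only the revert is gated on the alert type. */
static struct outcome recv_cmsg_after(int ret, enum record_type type)
{
	struct outcome o = { false, false };

	if (ret > 0) {
		if (type == RECORD_ALERT)
			o.reverted = true;
		o.processed = true;
	}
	return o;
}

/* The case from the trace: a HANDSHAKE record arriving after setup. */
static void check_handshake_record(void)
{
	assert(!recv_cmsg_before(1, RECORD_HANDSHAKE).processed);
	assert(recv_cmsg_after(1, RECORD_HANDSHAKE).processed);
	assert(!recv_cmsg_after(1, RECORD_HANDSHAKE).reverted);
}
```

Calling check_handshake_record() from a main() exercises the HANDSHAKE
case seen in the trace: in this model the record is skipped before the
patch and handed to the processing step after it. The model says
nothing about why the record keeps being received.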