On 5/15/25 11:02 AM, Hannes Reinecke wrote:
> On 5/15/25 16:44, Chuck Lever wrote:
>> Resending with linux-nfs and kernel-tls-handshake on Cc
>>
>>
>> On 5/15/25 10:35 AM, Chuck Lever wrote:
>>> Hi -
>>>
>>> I'm troubleshooting an issue where, after a successful handshake, the
>>> kernel TLS socket's data_ready callback is never invoked. I'm able to
>>> reproduce this 100% on an Atom-based system with a Realtek Ethernet
>>> device. But on many other systems, the problem is intermittent or not
>>> reproducible.
>>>
>>> The problem seems to be that strp->msg_ready is already set when
>>> tls_data_ready is called, and that prevents any further processing. I
>>> see that msg_ready is set when the handshake daemon sets the ktls
>>> security parameters, and is then never cleared.
>>>
>>> function: tls_setsockopt
>>> function: do_tls_setsockopt_conf
>>> function: tls_set_device_offload_rx
>>> function: tls_set_sw_offload
>>> function: init_prot_info
>>> function: tls_strp_init
>>> function: tls_sw_strparser_arm
>>> function: tls_strp_check_rcv
>>> function: tls_strp_read_sock
>>> function: tls_strp_load_anchor_with_queue
>>> function: tls_rx_msg_size
>>> function: tls_device_rx_resync_new_rec
>>> function: tls_rx_msg_ready
>>>
>>> For a working system (a VMware guest using a VMXNet device), setsockopt
>>> leaves msg_ready set to zero:
>>>
>>> function: tls_setsockopt
>>> function: do_tls_setsockopt_conf
>>> function: tls_set_device_offload_rx
>>> function: tls_set_sw_offload
>>> function: init_prot_info
>>> function: tls_strp_init
>>> function: tls_sw_strparser_arm
>>> function: tls_strp_check_rcv
>>>
>>> The first tls_data_ready call then handles the waiting ingress data as
>>> expected.
>>>
>>> Any advice is appreciated.
>>>
>>
> I _think_ you are expected to set the callbacks prior to do the tls
> handshake upcall (at least, that's what I'm doing).
> It's not that you can (nor should) receive anything on the socket
> while the handshake is active.
> If it fails you can always reset them to the original callbacks.

It looks to me like the socket callbacks are set up correctly. If I
apply the patch below, which removes the msg_ready short-circuit from
tls_strp_check_rcv (on the tls_data_ready path), everything works as
expected.

diff --git a/net/tls/tls_strp.c b/net/tls/tls_strp.c
index 77e33e1e340e..0440391dc476 100644
--- a/net/tls/tls_strp.c
+++ b/net/tls/tls_strp.c
@@ -537,7 +537,7 @@ static int tls_strp_read_sock(struct tls_strparser *strp)
 
 void tls_strp_check_rcv(struct tls_strparser *strp)
 {
-	if (unlikely(strp->stopped) || strp->msg_ready)
+	if (unlikely(strp->stopped))
 		return;
 
 	if (tls_strp_read_sock(strp) == -ENOMEM)

-- 
Chuck Lever