v2 changes:
- Split out the preload operation to a separate routine from
  tcp_sendmsg_locked() and restrict it from looping over the supplied
  iovec

------

Support write to a listen TCP socket, for immediate transmission on all
later passive connection establishments parented by the listen socket.
On a normal connection, transmission of the data is triggered by the
receipt of the 3rd-ack. On a fastopen (with accepted cookie) connection
the data is sent in the synack packet.

The data preload is done using a sendmsg with a newly-defined flag
(MSG_PRELOAD); the amount of data is limited to a single linear sk_buff.
Note that this definition is the last-but-two bit available if "int"
is 32 bits.

Intent: lower latency for server-first protocols using TCP.
Known cases of this use are SMTP and MySQL.

Measurements:
Packet capture (laptop, loopback, TFO requested) for initial SYN to
first client data packet (5 samples):
- baseline TFO-C     1064 1470 1455 1547 1595 usec
- patched non-TFO     140  150  159  144  153 usec
- patched TFO-C       142  149  149  125  125 usec

Out of scope:
- Client-first protocols
- TLS-on-connect

Testing:
A) packetdrill scripts for
   - normal non-TFO
   - normal TFO
   - synack lost
   - 3rd-ack acks only the SYN
   - 3rd-ack acks partial data
   (NB: packetdrill can only check the data size, not actual content)

B) Application use, running the application testsuite and manual check
   of specific cases via packet capture

C) Daily-driver laptop use (not expected to trigger the feature; only
   regression-test)

D) KASAN/syzkaller
   - enable_syscalls "socket$inet_tcp", "listen", "sendmsg", "accept",
     "read", "write", "close", "syz_emit_ethernet",
     "syz_extract_tcp_res"
   - the coverage seems rather limited; the sendmsg onto a listen socket
     is there, but I am not convinced actual TCP connections are being
     exercised. tcp_minisocks.c is entirely uncovered.
   - A need for limiting iteration in the above sendmsg was found (RCU
     timeouts), hence v2, but no hint of locking problems.
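
Example use (userspace sketch):
The preload described above is a single sendmsg() on the listening
socket. The following is a minimal sketch of how an application might
do that, not code from the series. The helper names
build_preload_msg() and preload_listener() are hypothetical. The
MSG_PRELOAD fallback value of 0 is only a placeholder so that the sketch
compiles against unpatched headers; the real value comes from the
patched include/linux/socket.h. A single iovec is used because v2
restricts the preload from looping over the supplied iovec.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * ASSUMPTION: placeholder only. The real MSG_PRELOAD bit is defined by the
 * patched include/linux/socket.h. With this fallback of 0 the call below is
 * an ordinary sendmsg(), which fails on a listening socket.
 */
#ifndef MSG_PRELOAD
#define MSG_PRELOAD 0
#endif

/* Fill in a msghdr that carries one buffer through a single iovec. */
void build_preload_msg(struct msghdr *msg, struct iovec *iov,
                       const void *data, size_t len)
{
    assert(data != NULL && len > 0);

    memset(msg, 0, sizeof(*msg));
    iov->iov_base = (void *)data;   /* sendmsg() does not modify the data */
    iov->iov_len = len;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}

/*
 * Hypothetical helper: queue `banner` on a socket that is already in the
 * listening state. Returns the sendmsg() result; on failure errno is set.
 * MSG_NOSIGNAL avoids SIGPIPE if the kernel rejects the write.
 */
ssize_t preload_listener(int listen_fd, const char *banner)
{
    struct msghdr msg;
    struct iovec iov;

    build_preload_msg(&msg, &iov, banner, strlen(banner));
    return sendmsg(listen_fd, &msg, MSG_PRELOAD | MSG_NOSIGNAL);
}
```

A server-first application would call preload_listener() once after
listen(), passing its greeting (for SMTP, the 220 banner line), and
should be prepared for the call to fail on kernels without this series.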

Eric: could you expand on your previous comment? If it referred to the
listening socket, tcp_sendmsg_locked() is called with the sk locked.

Jeremy Harris (6):
  tcp: support writing to a socket in listening state
  tcp: copy write-data from listen socket to accept child socket
  tcp: fastopen: add write-data to fastopen synack packet
  tcp: transmit any pending data on receipt of 3rd-ack
  tcp: fastopen: retransmit data when only the SYN of a synack-with-data
    is acked
  tcp: fastopen: extend retransmit-queue trimming to handle linear
    sk_buff

 include/linux/socket.h                        |   1 +
 net/ipv4/tcp.c                                | 115 ++++++++++++++++++
 net/ipv4/tcp_fastopen.c                       |   3 +-
 net/ipv4/tcp_input.c                          |  15 ++-
 net/ipv4/tcp_ipv4.c                           |   4 +-
 net/ipv4/tcp_minisocks.c                      |  58 ++++++++-
 net/ipv4/tcp_output.c                         |  50 +++++++-
 .../perf/trace/beauty/include/linux/socket.h  |   1 +
 tools/perf/trace/beauty/msg_flags.c           |   3 +
 9 files changed, 237 insertions(+), 13 deletions(-)

base-commit: f685204c57e87d2a88b159c7525426d70ee745c9
-- 
2.49.0