Hi Neal,
On 26.05.25 at 15:50, Neal Cardwell wrote:
>> We would very much appreciate it if someone could help us on the
>> following questions:
>> - Why are the remaining segments not sent out immediately, despite
>> TCP_NODELAY?
>> - Is there a way to change this?
>> - If not, do you have better workarounds than injecting a fake ACK
>> pretending to come "from the server" via a raw socket?
>> Actually, we haven't tried this yet, but probably will soon.
> Sounds like you are probably seeing the effects of TCP Small Queues
> (TSQ) limiting the number of skbs queued in various layers of the
> sending machine. See tcp_small_queue_check() for details.
Thank you so much! I compiled v6.15 with a tcp_small_queue_check() that
I patched to always return false and things just worked (again)! Now I
wrote a small module using kretprobe and regs_set_return_value() to
allow us to apply this change a bit more selectively (and without
recompiling the entire kernel). That's probably not optimal for anything
that should be widely deployed, but since we are currently just
experimenting and don't even know what might be actually used later on,
it seems good enough for now.
> Probably with shorter RTTs the incoming ACKs clear skbs from the rtx
> queue, and thus the tcp_small_queue_check() call to
> tcp_rtx_queue_empty_or_single_skb(sk) returns true and
> tcp_small_queue_check() returns false, enabling transmissions.
Honestly, I still don't quite understand why this works the way it does.
We intercept all outgoing (initial) payload segments before we NF_ACCEPT
any of them (i.e., collect all first, then release), so after the
handshake itself there shouldn't be any skb clearing triggered by new
ACKs from our server... Oh well. In any case, it does work, and I'm
happy with that.
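As a reading aid, the decision Neal describes can be modelled in
userspace roughly as follows. This is only a sketch of the logic as
described in this thread, not the kernel source: struct tsq_model, its
field names and tsq_should_throttle() are hypothetical, and the way the
kernel derives the byte limit (pacing rate, sysctls) is left out and
simply passed in as a number.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the per-socket state TSQ looks at. */
struct tsq_model {
    unsigned long wmem_alloc;     /* bytes still held by lower layers   */
    unsigned long limit;          /* TSQ byte budget (assumed as given) */
    unsigned int  rtx_queue_len;  /* skbs sent but not yet ACKed        */
};

/*
 * Returns true when further transmissions would be held back.
 * Mirrors the described behaviour: being over the limit only
 * throttles if the rtx queue holds more than one skb.
 */
bool tsq_should_throttle(const struct tsq_model *s)
{
    if (s->wmem_alloc <= s->limit)
        return false;   /* under budget: keep sending */

    if (s->rtx_queue_len <= 1)
        return false;   /* rtx queue empty or single skb: always send */

    return true;        /* over budget with several unACKed skbs */
}
```

In this model, a socket that is over its budget with several unACKed
skbs is throttled, while the same socket with at most one skb left in
the rtx queue is not. Forcing the real function to return false
corresponds to skipping all three branches.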
Thanks again,
Dennis