On Thu, Apr 3, 2025 at 10:42 AM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > On Thu, Apr 3, 2025 at 5:58 AM Ricardo Cañuelo Navarro <rcn@xxxxxxxxxx> wrote: > > > > Thanks for reviewing, answers below: > > > > On Wed, Apr 02 2025 at 15:40:56, Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > The data send path: > > > > > > sctp_endpoint_lookup_assoc() -> > > > sctp_sendmsg_to_asoc() > > > > > > And the transport removal path: > > > > > > sctp_sf_do_asconf() -> > > > sctp_process_asconf() -> > > > sctp_assoc_rm_peer() > > > > > > are both protected by the same socket lock. > > > > > > Additionally, when a path is removed, sctp_assoc_rm_peer() updates the > > > transport of all existing chunks in the send queues (peer->transmitted > > > and asoc->outqueue.out_chunk_list) to NULL. > > > > > > It will be great if you can reproduce the issue locally and help check > > > how the potential race occurs. > > > > That's true but if there isn't enough space in the send buffer, then > > sctp_sendmsg_to_asoc() will release the lock temporarily. > > > Oh right, I missed that. Thanks. > > > The scenario that the reproducer generates is the following: > > > > Thread A Thread B > > -------------------- -------------------- > > (1) sctp_sendmsg() > > lock_sock() > > sctp_sendmsg_to_asoc() > > sctp_wait_for_sndbuf() > > release_sock() > > sctp_setsockopt(SCTP_SOCKOPT_BINDX_REM) > > lock_sock() > > sctp_setsockopt_bindx() > > sctp_send_asconf_del_ip() > > ... > > release_sock() > > process rcv backlog: > > sctp_do_sm() > > sctp_sf_do_asconf() > > ... > > sctp_assoc_rm_peer() > > lock_sock() > > (2) chunk->transport = transport > > sctp_primitive_SEND() > > ... > > sctp_outq_select_transport() > > *BUG* switch (new_transport->state) > > > > > > Notes: > > ------ > > > > Both threads operate on the same socket. > > > > 1. Here, sctp_endpoint_lookup_assoc() finds and returns an existing > > association and transport. > > > > 2. At this point, `transport` is already deleted. chunk->transport is > > not set to NULL because sctp_assoc_rm_peer() ran _before_ the transport > > was assigned to the chunk. > > > > > We should avoid an extra hashtable lookup on this hot TX path, as it would > > > negatively impact performance. > > > > Good point. I can't really tell the performance impact of the lookup > > here, my experience with the SCTP implementation is very limited. Do you > > have any suggestions or alternatives about how to deal with this? > > > I think the correct approach is to follow how sctp_assoc_rm_peer() > handles this. > > You can use asoc->peer.last_sent_to (which isn't really used elsewhere) > to temporarily store the transport before releasing the socket lock and > sleeping in sctp_sendmsg_to_asoc(). After waking up and reacquiring the > lock, restore the transport back to asoc->peer.last_sent_to. > > Additionally, during an ASCONF update, ensure asoc->peer.last_sent_to > is set to a valid transport if it matches the transport being removed. > > For example: > > in sctp_wait_for_sndbuf(): > > asoc->peer.last_sent_to = *tp; > release_sock(sk); > current_timeo = schedule_timeout(current_timeo); > lock_sock(sk); > *tp = asoc->peer.last_sent_to; > asoc->peer.last_sent_to = NULL; > > in sctp_assoc_rm_peer(): > > if (asoc->peer.last_sent_to == peer) > asoc->peer.last_sent_to = transport; This change introduces a side effect: when multiple threads send data on the same asoc using different daddrs, they may interfere with each other while waiting for buffer space, as each thread updates asoc->peer.last_sent_to. You may consider holding a refcnt to the transport (similar to how the asoc refcnt is held) in sctp_wait_for_sndbuf(), as shown below: @@ -9225,7 +9225,9 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p, pr_debug("%s: asoc:%p, timeo:%ld, msg_len:%zu\n", __func__, asoc, *timeo_p, msg_len); - /* Increment the association's refcnt. */ + /* Increment the transport and association's refcnt. */ + if (t) + sctp_transport_hold(t); sctp_association_hold(asoc); /* Wait on the association specific sndbuf space. */ @@ -9234,7 +9236,7 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p, TASK_INTERRUPTIBLE); if (asoc->base.dead) goto do_dead; - if (!*timeo_p) + if (!*timeo_p || (t && t->dead)) goto do_nonblock; if (sk->sk_err || asoc->state >= SCTP_STATE_SHUTDOWN_PENDING) goto do_error; @@ -9259,7 +9261,9 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p, out: finish_wait(&asoc->wait, &wait); - /* Release the association's refcnt. */ + /* Release the transport and association's refcnt. */ + if (t) + sctp_transport_put(t); sctp_association_put(asoc); You will need to reintroduce the dead bit in struct sctp_transport and set it in sctp_transport_free(). Note this field was previously removed in: commit 47faa1e4c50ec26e6e75dcd1ce53f064bd45f729 Author: Xin Long <lucien.xin@xxxxxxxxx> Date: Fri Jan 22 01:49:09 2016 +0800 sctp: remove the dead field of sctp_transport Thanks.