On Wed, May 21, 2025 at 04:25:00PM +0200, Valentin Schneider wrote: > On 19/05/25 09:40, Nam Cao wrote: > > @@ -136,14 +136,28 @@ struct epitem { > > struct rcu_head rcu; > > }; > > > > - /* List header used to link this structure to the eventpoll ready list */ > > - struct list_head rdllink; > > + /* > > + * Whether this item can be added to the eventpoll ready list. Adding to the list can be > > + * blocked for two reasons: > > + * > > + * 1. This item is already on the list. > > + * 2. A waiter is consuming the ready list and has consumed this item. The waiter therefore > > + * is blocking this item from being added again, preventing seeing the same item twice. > > + * If adding is blocked due to this reason, the waiter will add this item to the list > > + * once consuming is done. > > + */ > > + bool link_locked; > > Nit: IIUC it's not just the readylist, it's anytime the link is used > (e.g. to punt it on a txlist), so how about: > > /* > * Whether epitem.rdllink is currently used in a list. When used, it > * cannot be detached or inserted elsewhere. > * > * It may be in use for two reasons: > * > * 1. This item is on the eventpoll ready list > * 2. This item is being consumed by a waiter and stashed on a temporary > * list. If adding is blocked due to this reason, the waiter will add > * this item to the list once consuming is done. > */ > bool link_used; Acked. > > > > /* > > - * Works together "struct eventpoll"->ovflist in keeping the > > - * single linked chain of items. > > + * Indicate whether this item is ready for consumption. All items on the ready list has this > > + * flag set. Item that should be on the ready list, but cannot be added because of > > + * link_locked (in other words, a waiter is consuming the ready list), also has this flag > > + * set. When a waiter is done consuming, the waiter will add ready items to the ready list. > > */ > > - struct epitem *next; > > + bool ready; > > + > > + /* List header used to link this structure to the eventpoll ready list */ > > + struct llist_node rdllink; > > > > /* The file descriptor information this item refers to */ > > struct epoll_filefd ffd; > > > @@ -361,10 +368,27 @@ static inline int ep_cmp_ffd(struct epoll_filefd *p1, > > (p1->file < p2->file ? -1 : p1->fd - p2->fd)); > > } > > > > -/* Tells us if the item is currently linked */ > > -static inline int ep_is_linked(struct epitem *epi) > > +static void epitem_ready(struct epitem *epi) > > { > > - return !list_empty(&epi->rdllink); > > + /* > > + * Mark it ready, just in case a waiter is blocking this item from going into the ready > > + * list. This will tell the waiter to add this item to the ready list, after the waiter is > > + * finished. > > + */ > > + xchg(&epi->ready, true); > > Perhaps a stupid question, why use an xchg() for .ready and .link_locked > (excepted for that epitem_ready() cmpxchg()) writes when the return value > is always discarded? Wouldn't e.g. smp_store_release() suffice, considering > the reads are smp_load_acquire()? That's me being stupid, smp_store_release() is good enough. > > + > > + /* > > + * If this item is not blocked, add it to the ready list. This item could be blocked for two > > + * reasons: > > + * > > + * 1. It is already on the ready list. Then nothing further is required. > > + * 2. A waiter is consuming the ready list, and has consumed this item. The waiter is now > > + * blocking this item, so that this item won't be seen twice. In this case, the waiter > > + * will add this item to the ready list after the waiter is finished. > > + */ > > + if (!cmpxchg(&epi->link_locked, false, true)) > > + llist_add(&epi->rdllink, &epi->ep->rdllist); > > + > > } > > > > static inline struct eppoll_entry *ep_pwq_from_wait(wait_queue_entry_t *p) > > > @@ -1924,19 +1874,39 @@ static int ep_send_events(struct eventpoll *ep, > > * Trigger mode, we need to insert back inside > > * the ready list, so that the next call to > > * epoll_wait() will check again the events > > - * availability. At this point, no one can insert > > - * into ep->rdllist besides us. The epoll_ctl() > > - * callers are locked out by > > - * ep_send_events() holding "mtx" and the > > - * poll callback will queue them in ep->ovflist. > > + * availability. > > */ > > - list_add_tail(&epi->rdllink, &ep->rdllist); > > + xchg(&epi->ready, true); > > + } > > + } > > + > > + llist_for_each_entry_safe(epi, tmp, txlist.first, rdllink) { > > + /* > > + * We are done iterating. Allow the items we took to be added back to the ready > > + * list. > > + */ > > + xchg(&epi->link_locked, false); > > + > > + /* > > + * In the loop above, we may mark some items ready, and they should be added back. > > + * > > + * Additionally, someone else may also attempt to add the item to the ready list, > > + * but got blocked by us. Add those blocked items now. > > + */ > > + if (smp_load_acquire(&epi->ready)) { > > ep_pm_stay_awake(epi); > > + epitem_ready(epi); > > } > > Isn't this missing a: > > list_del_init(&epi->rdllink); > > AFAICT we're always going to overwrite that link when next marking the item > as ready, but I'd say it's best to exit this with a clean state. That would > have to be before the clearing of link_locked so it doesn't race with a > concurrent epitem_ready() call. To confirm I understand you: there is no functional problem, and your comment is more of a "just to be safe"? Thanks so much for the review, Nam