On 2025-05-27 11:08:36 [+0200], Nam Cao wrote: > --- a/fs/eventpoll.c > +++ b/fs/eventpoll.c > @@ -1867,19 +1704,18 @@ static int ep_send_events(struct eventpoll *ep, > init_poll_funcptr(&pt, NULL); > > mutex_lock(&ep->mtx); > - ep_start_scan(ep, &txlist); > > - /* > - * We can loop without lock because we are passed a task private list. > - * Items cannot vanish during the loop we are holding ep->mtx. > - */ > - list_for_each_entry_safe(epi, tmp, &txlist, rdllink) { > + while (res < maxevents) { > struct wakeup_source *ws; > + struct llist_node *n; > __poll_t revents; > > - if (res >= maxevents) > + n = llist_del_first(&ep->rdllist); > + if (!n) > break; > > + epi = llist_entry(n, struct epitem, rdllink); > + > /* > * Activate ep->ws before deactivating epi->ws to prevent > * triggering auto-suspend here (in case we reactive epi->ws > @@ -1896,21 +1732,30 @@ static int ep_send_events(struct eventpoll *ep, > __pm_relax(ws); > } > > - list_del_init(&epi->rdllink); > - > /* > * If the event mask intersect the caller-requested one, > * deliver the event to userspace. Again, we are holding ep->mtx, > * so no operations coming from userspace can change the item. > */ > revents = ep_item_poll(epi, &pt, 1); > - if (!revents) > + if (!revents) { > + init_llist_node(n); > + > + /* > + * Just in case epi becomes ready after ep_item_poll() above, but before > + * init_llist_node(). Make sure to add it to the ready list, otherwise an > + * event may be lost. > + */ So why not llist_del_first_init() at the top? Wouldn't this avoid the add below? > + if (unlikely(ep_item_poll(epi, &pt, 1))) { > + ep_pm_stay_awake(epi); > + epitem_ready(epi); > + } > continue; > + } > > events = epoll_put_uevent(revents, epi->event.data, events); > if (!events) { > - list_add(&epi->rdllink, &txlist); > - ep_pm_stay_awake(epi); > + llist_add(&epi->rdllink, &ep->rdllist); That epitem_ready() above and this llist_add() add epi back where it was retrieved from. Wouldn't it loop in this case? I think you can avoid the add above and here adding it to txlist would avoid the loop. (It returns NULL if the copy-to-user failed so I am not sure why another retry will change something but the old code did it, too so). > if (!res) > res = -EFAULT; > break; One note: The old code did "list_add() + ep_pm_stay_awake()". Now you do "ep_pm_stay_awake() + epitem_ready()". epitem_ready() adds the item conditionally to the list so you may do ep_pm_stay_awake() without adding it to the list because it already is. Looking through ep_pm_stay_awake() it shouldn't do any harm except incrementing a counter again. Sebastian