On Thu, Jul 10, 2025 at 11:08:18AM +0800, Xi Ruoyao wrote:
> After upgrading my kernel to the recent mainline I've encountered some
> stability issue, like:
>
> - When GDM started gnome-shell, the screen often froze and the only
>   thing I could do was to switch into a VT and reboot.
> - Sometimes gnome-shell started "fine" but then starting an application
>   (like gnome-console) needed to wait for about a minute.
> - Sometimes the system shutdown process hangs waiting for a service to
>   stop.
> - Rarely the system boot process hangs for no obvious reason.
>
> Most strangely in all the cases there are nothing alarming in dmesg or
> system journal.
>
> I'm unsure if this is the culprit but I'm almost sure it's the trigger.
> Maybe there's some race condition in my userspace that the priority
> inversion had happened to hide... but anyway reverting this patch
> seemed to "fix" the issue.
>
> Any thoughts or pointers to diagnose further?

I have been running this new epoll on my work machine for weeks now
without issue, while you seem to reproduce it reliably. I'm guessing that
the problem is on some code path which is dead on my system, but executed
on yours.

I am curious whether GNOME is using some epoll options which are unused on
my system.

I presume you can still access dmesg despite the freeze. Would you mind
running the patch below and letting me know what's in your dmesg? It may
help identify that code path.
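
To pull just the relevant lines out of the log, something like the
following should do. This is only a sketch: the helper name
filter_epoll_debug is made up for illustration, and it assumes the
printk() lines from the patch are the only ones mentioning these three
function names.

```shell
# Keep only the lines emitted by the debug printk() calls in the patch.
# Each of them contains the function name, since the patch prints __func__.
# The match is not anchored because dmesg usually prefixes a timestamp.
filter_epoll_debug() {
    grep -E 'do_epoll_ctl|ep_eventpoll_bp_ioctl|io_epoll_ctl'
}

# Typical use (may need root, depending on kernel.dmesg_restrict):
#   dmesg | filter_epoll_debug > epoll-debug.txt
```

The do_epoll_ctl lines will likely be numerous, so saving them to a file
and attaching it is probably easier than pasting inline.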

Best regards,
Nam

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 895256cd2786..e3dafc48a59a 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -532,6 +532,9 @@ static long ep_eventpoll_bp_ioctl(struct file *file, unsigned int cmd,
 		WRITE_ONCE(ep->busy_poll_usecs, epoll_params.busy_poll_usecs);
 		WRITE_ONCE(ep->busy_poll_budget, epoll_params.busy_poll_budget);
 		WRITE_ONCE(ep->prefer_busy_poll, epoll_params.prefer_busy_poll);
+		printk("%s busy_poll_usecs=%d busy_poll_budget=%d prefer_busy_poll=%d\n",
+			__func__, epoll_params.busy_poll_usecs, epoll_params.busy_poll_budget,
+			epoll_params.prefer_busy_poll);
 		return 0;
 	case EPIOCGPARAMS:
 		memset(&epoll_params, 0, sizeof(epoll_params));
@@ -2120,6 +2123,9 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
 	struct epitem *epi;
 	struct eventpoll *tep = NULL;
 
+	printk("%s: epfd=%d op=%d fd=%d events=0x%x data=0x%llx nonblock=%d\n",
+		__func__, epfd, op, fd, epds->events, epds->data, nonblock);
+
 	CLASS(fd, f)(epfd);
 	if (fd_empty(f))
 		return -EBADF;
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 8d4610246ba0..e9c33c0c8cc5 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -54,6 +54,8 @@ int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
 	int ret;
 	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
 
+	printk("%s flags=0x%x\n", __func__, issue_flags);
+
 	ret = do_epoll_ctl(ie->epfd, ie->op, ie->fd, &ie->event, force_nonblock);
 	if (force_nonblock && ret == -EAGAIN)
 		return -EAGAIN;