On Thu, 2025-07-10 at 08:21 +0200, Nam Cao wrote:
> On Thu, Jul 10, 2025 at 11:08:18AM +0800, Xi Ruoyao wrote:
> > After upgrading my kernel to the recent mainline I've encountered some
> > stability issue, like:
> >
> > - When GDM started gnome-shell, the screen often froze and the only
> >   thing I could do was to switch into a VT and reboot.
> > - Sometimes gnome-shell started "fine" but then starting an application
> >   (like gnome-console) needed to wait for about a minute.
> > - Sometimes the system shutdown process hangs waiting for a service to
> >   stop.
> > - Rarely the system boot process hangs for no obvious reason.
> >
> > Most strangely in all the cases there are nothing alarming in dmesg or
> > system journal.
> >
> > I'm unsure if this is the culprit but I'm almost sure it's the trigger.
> > Maybe there's some race condition in my userspace that the priority
> > inversion had happened to hide... but anyway reverting this patch
> > seemed to "fix" the issue.
> >
> > Any thoughts or pointers to diagnose further?
>
> I have been running this new epoll on my work machine for weeks by now
> without issue, while you seem to reproduce it reliably. I'm guessing that
> the problem is on some code path which is dead on my system, but executed
> on yours.

I also failed to reproduce it in a VM running the latest Fedora Rawhide
(in 3 attempts).

> I am curious if Gnome is using some epoll options which are unused on my
> system.
> I presume you can still access dmesg despite the freeze. Do you mind
> running the below patch, let me know what's in your dmesg? It may help
> identifying that code path.

Attached the system journal (dmesg was truncated due to too many lines).
I guess the relevant part should be between line 6947 ("New session 2 of
user xry111") and line 8022 ("start operation timed out. Terminating").

-- 
Xi Ruoyao <xry111@xxxxxxxxxxx>
Attachment:
log.gz
Description: application/gzip