On Thu, Apr 24, 2025 at 08:54:40AM -0600, Jens Axboe wrote: > On 4/24/25 8:03 AM, Johannes Weiner wrote: > > On Wed, Apr 23, 2025 at 05:37:06PM -0600, Jens Axboe wrote: > >> userfaultfd may use interruptible sleeps to wait on userspace filling > >> a page fault, which works fine if the task can be reliably put to > >> sleeping waiting for that. However, if the task has a normal (ie > >> non-fatal) signal pending, then TASK_INTERRUPTIBLE sleep will simply > >> cause schedule() to be a no-op. > >> > >> For a task that registers a page with userfaultfd and then proceeds > >> to do a write from it, if that task also has a signal pending then > >> it'll essentially busy loop from do_page_fault() -> handle_userfault() > >> until that fault has been filled. Normally it'd be expected that the > >> task would sleep until that happens. Here's a trace from an application > >> doing just that: > >> > >> handle_userfault+0x4b8/0xa00 (P) > >> hugetlb_fault+0xe24/0x1060 > >> handle_mm_fault+0x2bc/0x318 > >> do_page_fault+0x1e8/0x6f0 > > > > Makes sense. There is a fault_signal_pending() check before retrying: > > > > static inline bool fault_signal_pending(vm_fault_t fault_flags, > > struct pt_regs *regs) > > { > > return unlikely((fault_flags & VM_FAULT_RETRY) && > > (fatal_signal_pending(current) || > > (user_mode(regs) && signal_pending(current)))); > > } > > > > Since it's an in-kernel fault, and the signal is non-fatal, it won't > > stop looping until the fault is handled. > > > > This in itself seems a bit sketchy. You have to hope there is no > > dependency between handling the signal -> handling the fault inside > > the userspace components. > > Indeed... But that's generic userfaultfd sketchiness, not really related > to this patch. Definitely, it wasn't meant as an objection to the patch. The bug just highlights a fairly subtle dependency chain between signals and userfault handling that users of the feature might not be aware of. Sorry if I was being unclear.