Re: [PATCH] mm/userfaultfd: prevent busy looping for tasks with signals pending

Johannes Weiner <hannes@xxxxxxxxxxx> · Thu, 24 Apr 2025 11:11:13 -0400

On Thu, Apr 24, 2025 at 08:54:40AM -0600, Jens Axboe wrote:
> On 4/24/25 8:03 AM, Johannes Weiner wrote:
> > On Wed, Apr 23, 2025 at 05:37:06PM -0600, Jens Axboe wrote:
> >> userfaultfd may use interruptible sleeps to wait on userspace filling
> >> a page fault, which works fine if the task can be reliably put to
> >> sleeping waiting for that. However, if the task has a normal (ie
> >> non-fatal) signal pending, then TASK_INTERRUPTIBLE sleep will simply
> >> cause schedule() to be a no-op.
> >>
> >> For a task that registers a page with userfaultfd and then proceeds
> >> to do a write from it, if that task also has a signal pending then
> >> it'll essentially busy loop from do_page_fault() -> handle_userfault()
> >> until that fault has been filled. Normally it'd be expected that the
> >> task would sleep until that happens. Here's a trace from an application
> >> doing just that:
> >>
> >> handle_userfault+0x4b8/0xa00 (P)
> >> hugetlb_fault+0xe24/0x1060
> >> handle_mm_fault+0x2bc/0x318
> >> do_page_fault+0x1e8/0x6f0
> > 
> > Makes sense. There is a fault_signal_pending() check before retrying:
> > 
> > static inline bool fault_signal_pending(vm_fault_t fault_flags,
> >                                         struct pt_regs *regs)
> > {
> >         return unlikely((fault_flags & VM_FAULT_RETRY) &&
> >                         (fatal_signal_pending(current) ||
> >                          (user_mode(regs) && signal_pending(current))));
> > }
> > 
> > Since it's an in-kernel fault, and the signal is non-fatal, it won't
> > stop looping until the fault is handled.
> > 
> > This in itself seems a bit sketchy. You have to hope there is no
> > dependency between handling the signal -> handling the fault inside
> > the userspace components.
> 
> Indeed... But that's generic userfaultfd sketchiness, not really related
> to this patch.

Definitely, it wasn't meant as an objection to the patch. The bug just
highlights a fairly subtle dependency chain between signals and
userfault handling that users of the feature might not be aware of.
Sorry if I was being unclear.