On Mon, Jul 14, 2025 at 07:19:35PM +0900, Masami Hiramatsu wrote:
> On Mon, 14 Jul 2025 11:39:03 +0200
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > On Mon, Jul 14, 2025 at 05:39:15PM +0900, Masami Hiramatsu wrote:
> >
> > > > +	/*
> > > > +	 * Some of the uprobe consumers has changed sp, we can do nothing,
> > > > +	 * just return via iret.
> > > > +	 */
> > >
> > > Do we allow consumers to change the `sp`? It seems dangerous
> > > because consumer needs to know whether it is called from
> > > breakpoint or syscall. Note that it has to set up ax, r11
> > > and cx on the stack correctly only if it is called from syscall,
> > > that is not compatible with breakpoint mode.
> > >
> > > > +	if (regs->sp != sp)
> > > > +		return regs->ax;
> > >
> > > Shouldn't we recover regs->ip? Or in this case does consumer has
> > > to change ip (== return address from trampline) too?
> > >
> > > IMHO, it should not allow to change the `sp` and `ip` directly
> > > in syscall mode. In case of kprobes, kprobe jump optimization
> > > must be disabled explicitly (e.g. setting dummy post_handler)
> > > if the handler changes `ip`.
> > >
> > > Or, even if allowing to modify `sp` and `ip`, it should be helped
> > > by this function, e.g. stack up the dummy regs->ax/r11/cx on the
> > > new stack at the new `regs->sp`. This will allow modifying those
> > > registries transparently as same as breakpoint mode.
> > > In this case, I think we just need to remove above 2 lines.
> >
> > There are two syscall return paths; the 'normal' is sysret and for that
> > you need to undo all things just right.
> >
> > The other is IRET. At which point we can have whatever state we want,
> > including modified SP.
> >
> > See arch/x86/entry/syscall_64.c:do_syscall_64() and
> > arch/x86/entry/entry_64.S:entry_SYSCALL_64
> >
> > The IRET path should return pt_regs as is from an interrupt/exception
> > very much like INT3.
>
> OK, so SYSRET case, we need to follow;
>
> sys_uprobe -> do_syscall_64 -> entry_SYSCALL_64 -> trampoline -> retaddr
>
> But using IRET to return, we can skip returning to trampoline,
>
> sys_uprobe -> do_syscall_64 -> entry_SYSCALL_64 -> regs->ip

The handler gets the original breakpoint address, which is set in:

	regs->ip = ax_r11_cx_ip[3] - 5;

and at the point where we do:

	/*
	 * Some of the uprobe consumers has changed sp, we can do nothing,
	 * just return via iret.
	 */
	if (regs->sp != sp)
		return regs->ax;

the regs->ip value hasn't been restored to the trampoline's return
address, so iret will skip the trampoline.

But perhaps we could add the extra check below to land on the
next instruction?

jirka


---
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 043d826295a3..4318517aa852 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -817,8 +817,12 @@ SYSCALL_DEFINE0(uprobe)
 	 * Some of the uprobe consumers has changed sp, we can do nothing,
 	 * just return via iret.
 	 */
-	if (regs->sp != sp)
+	if (regs->sp != sp) {
+		/* skip the trampoline call */
+		if (ax_r11_cx_ip[3] - 5 == regs->ip)
+			regs->ip += 5;
 		return regs->ax;
+	}
 
 	regs->sp -= sizeof(ax_r11_cx_ip);
 
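
For illustration only, here is a userspace model of the check proposed in the diff above. It is a sketch, not kernel code: struct model_regs, model_uprobe_exit(), the exit_path enum and TRAMP_CALL_SIZE are hypothetical names made up for this example, and the value 5 is assumed to be the length of the call into the trampoline, going by the "skip the trampoline call" comment in the patch. Only the sp/ip decision is modelled; everything else sys_uprobe does is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed length of the call into the trampoline (the "5" in the patch). */
#define TRAMP_CALL_SIZE 5

/* Hypothetical, cut-down stand-in for the kernel's struct pt_regs. */
struct model_regs {
	uint64_t ip;
	uint64_t sp;
	uint64_t ax;
};

/* Which way the model leaves the syscall. */
enum exit_path {
	PATH_TRAMPOLINE,	/* sp untouched: return through the trampoline */
	PATH_IRET,		/* consumer changed sp: return pt_regs as is */
};

/*
 * entry_sp      - value of sp recorded before the consumers ran
 * tramp_retaddr - return address pushed by the call into the trampoline,
 *                 i.e. ax_r11_cx_ip[3] in the patch
 *
 * On entry regs->ip is expected to hold tramp_retaddr - TRAMP_CALL_SIZE
 * (the probed address) unless a consumer has rewritten it.
 */
static enum exit_path model_uprobe_exit(struct model_regs *regs,
					uint64_t entry_sp,
					uint64_t tramp_retaddr)
{
	assert(regs != NULL);

	if (regs->sp != entry_sp) {
		/*
		 * Consumer left ip at the probed address: step over the
		 * call so the return lands on the next instruction.
		 */
		if (tramp_retaddr - TRAMP_CALL_SIZE == regs->ip)
			regs->ip += TRAMP_CALL_SIZE;
		return PATH_IRET;
	}

	/* The rest of the trampoline return path is outside this sketch. */
	return PATH_TRAMPOLINE;
}
```

With a probe at 0x1000 and a trampoline return address of 0x1005, a consumer that only moves sp ends up with ip == 0x1005, while a consumer that also redirects ip (say to 0x2000) keeps its own value.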