On Mon, 7 Jul 2025 11:02:06 +0300
"Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:

> On Sun, Jul 06, 2025 at 10:13:42AM +0100, David Laight wrote:
> > On Thu, 3 Jul 2025 10:13:44 -0700
> > Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >
> > > On 7/1/25 02:58, Kirill A. Shutemov wrote:
> > > > Extract memcpy and memset functions from copy_user_generic() and
> > > > __clear_user().
> > > >
> > > > They can be used as inline memcpy and memset instead of the GCC builtins
> > > > whenever necessary. LASS requires them to handle text_poke.
> > >
> > > Why are we messing with the normal user copy functions? Code reuse is
> > > great, but as you're discovering, the user copy code is highly
> > > specialized and not that easy to reuse for other things.
> > >
> > > Don't we just need a dirt simple chunk of code that does (logically):
> > >
> > > 	stac();
> > > 	asm("rep stosq...");
> > > 	clac();
> > >
> > > Performance doesn't matter for text poking, right? It could be stosq or
> > > anything else that you can inline. It could be a for() loop for all I
> > > care as long as the compiler doesn't transform it into some out-of-line
> > > memset. Right?
> > >
> >
> > It doesn't even really matter if there is an out-of-line memset.
> > All you need to do is 'teach' objtool it isn't a problem.
>
> PeterZ was not a fan of the idea;
>
> https://lore.kernel.org/all/20241029113611.GS14555@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> > Is this for the boot-time asm-alternatives?
>
> Not only boot-time. static_branches are switchable at runtime.
>
> > In that case I wonder why a 'low' address is being used?
> > With LASS enabled using a low address on a live kernel would make it
> > harder for another cpu to leverage the writable code page, but
> > that isn't a requirement of LASS.
>
> Because kernel side of address space is shared across all CPUs and we don't
> want kernel code to be writable to all CPUs

So, as I said, it isn't a requirement for LASS.
Just something that LASS lets you do.
Although I'm sure there will be some odd effect of putting a 'supervisor'
page in the middle of 'user' pages.

Isn't there also (something like) kmap_local_page() that updates the local
page tables but doesn't broadcast the change?

>
> > If it is being used for later instruction patching you need the
> > very careful instruction sequences and cpu synchronisation.
> > In that case I suspect you need to add conditional stac/clac
> > to the existing patching code (and teach objtool it is all ok).
>
> STAC/CLAC is conditional in text poke on LASS presence on the machine.

So just change the code to use byte copy loops with a volatile
destination pointer and all will be fine.

	David
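
For illustration only, the byte copy loops suggested above might look
roughly like the sketch below. The helper names poke_copy_bytes() and
poke_set_bytes() are hypothetical and are not taken from the patch series,
and the stac()/clac() placement is shown only in comments, as an
assumption, so that the code builds outside the kernel.

```c
#include <stddef.h>

/*
 * Hypothetical helpers, not the actual kernel patch.
 *
 * Because the destination is accessed through a volatile-qualified
 * pointer, every byte store has to be performed as written, so the
 * compiler should not replace the loop with a call to an out-of-line
 * memcpy() or memset().
 */
static inline void poke_copy_bytes(volatile void *dst, const void *src,
				   size_t len)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	/* Assumption: in the kernel a conditional stac() would go here. */
	for (i = 0; i < len; i++)
		d[i] = s[i];
	/* Assumption: ...and the matching conditional clac() here. */
}

static inline void poke_set_bytes(volatile void *dst, int c, size_t len)
{
	volatile unsigned char *d = dst;
	size_t i;

	/* Same bracketing as above would apply (assumption). */
	for (i = 0; i < len; i++)
		d[i] = (unsigned char)c;
}
```

The loops are deliberately byte-wide and unoptimised, since performance is
not a concern for text poking.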