Re: [PATCHv9 02/16] x86/alternatives: Disable LASS when patching kernel alternatives

Sohil Mehta <sohil.mehta@xxxxxxxxx> · Thu, 31 Jul 2025 17:15:14 -0700

On 7/28/2025 12:38 PM, David Laight wrote:
>>> ...
>>>
>>> Or just write a byte copy loop in C with (eg) barrier() inside it
>>> to stop gcc converting it to memcpy().
>>>
>>> 	David  
>>
>> Great. It's rep movsb without any of the performance.
> 
> And without the massive setup overhead that dominates short copies.
> Given the rest of the code I'm sure a byte copy loop won't make
> any difference to the overall performance.
>

Wouldn't it be better to introduce a generic mechanism rather than
something customized for this scenario?

PeterZ had suggested that inline memcpy could have more usages:
https://lore.kernel.org/lkml/20241029113611.GS14555@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Is there a concern that GCC might optimize the inline versions into
standard memcpy()/memset() calls? Wouldn't the volatile qualifier on
the asm statements prevent that?

static __always_inline void *__inline_memcpy(void *to, const void *from,
					    size_t len)
{
	void *ret = to;

	asm volatile("rep movsb"
		     : "+D" (to), "+S" (from), "+c" (len)
		     : : "memory");
	return ret;
}

static __always_inline void *__inline_memset(void *s, int v, size_t n)
{
	void *ret = s;

	asm volatile("rep stosb"
		     : "+D" (s), "+c" (n)
		     : "a" ((uint8_t)v)
		     : "memory");
	return ret;
}
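
For comparison, the byte copy loop David suggests could look roughly
like the following. This is only a minimal sketch and is not taken from
the thread: byte_copy() is a hypothetical name, and barrier() is defined
locally as a user-space stand-in for the kernel macro so that the
snippet builds outside the kernel tree with GCC or Clang.

```c
#include <stddef.h>

/* Assumption: user-space stand-in for the kernel's barrier() macro. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical helper: copy len bytes one at a time, return the destination. */
static inline void *byte_copy(void *to, const void *from, size_t len)
{
	unsigned char *d = to;
	const unsigned char *s = from;

	while (len--) {
		*d++ = *s++;
		/*
		 * Compiler barrier inside the loop body. It is intended to
		 * keep the compiler from recognising the loop as a memcpy()
		 * idiom and replacing it with a library call.
		 */
		barrier();
	}
	return to;
}
```

Unlike the rep movsb variant, this does not depend on an x86 string
instruction, but the per-byte barrier also rules out any vectorization.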