On Tue, May 20, 2025 at 03:32:16PM +0100, Usama Arif wrote:
>
>
> On 20/05/2025 15:22, Lorenzo Stoakes wrote:
> > On Tue, May 20, 2025 at 10:08:03PM +0800, Yafang Shao wrote:
> >> On Tue, May 20, 2025 at 9:10 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >>>
> >>> On Tue, May 20, 2025 at 03:25:07PM +0800, Yafang Shao wrote:
> >>>> The challenge we face is that our system administration team doesn't
> >>>> permit enabling THP globally in production by setting it to "madvise"
> >>>> or "always". As a result, we can only experiment with your feature on
> >>>> our test servers at this stage.
> >>>
> >>> That's a you problem.
> >>
> >> perhaps.
> >>
> >>> You need to figure out how to influence your
> >>> sysadmin team to change their mind; whether it's by talking to their
> >>> superiors or persuading them directly.
> >>
> >> I believe that "practicing" matters more than "talking" or "persuading".
> >> I’m surprised your suggestion relies on "talking" ;-)
> >> If I understand correctly, we all agree that "talk is cheap", right?
> >>
> >>> It's not a justification for why
> >>> upstream should take this patch.
> >>
> >> I believe Johannes has clearly explained the challenges the community
> >> is currently facing [0].
> >>
> >> [0]. https://lore.kernel.org/linux-mm/20250430174521.GC2020@xxxxxxxxxxx/
> >
> > (Sorry to interject on your conversation, but :)
> >
> > I don't think anybody denies we have issues in configuring this stuff
> > sensibly. A global-only control isn't going to cut it in the real world it
> > seems.
> >
> > To me as you say yourself, defining the ABI/API here is what really matters,
> > and we're right now inundated with several series all at once (you wait for one
> > bus then 3 come at once... :).
> >
> > So this I think, should be the question.
> >
> > I like the idea of just exposing something like madvise(), which is something
> > we're going to maintain indefinitely.
> >
> > Though any such exposure would in my view need to be opt-in i.e. have a
> > list of MADV_... options that are accepted, as we'd need to very cautiously
> > determine which are safe from this context.
> >
> > Of course then this leads to the whole thing (and I really know very little
> > about BPF internals - obviously happy to understand more) of whether we can just
> > use the madvise() code direct or what locking we can do or how all that works.
> >
> > At any rate, a custom thing that is specific as 'switch mode for mTHP pages of
> > size X to Y' is just something I'd rather us not tie ourselves to.
> >
> >>
> >>
> >> --
> >> Regards
> >>
> >> Yafang
> >
> > What do you think re: bpf vs. something like my proposed process_madvise()
> > extensions or Usama's proposed prctl()?
> >
> > Simpler, but really just using madvise functionality and having a means of
> > defaulting across fork/exec (notwithstanding Jann's concerns in this area).
>
> Unfortunately I think the issue is that neither prctl nor process_madvise would work
> for Yafang's usecase? It's usecase 3 mentioned in [1], i.e.
> global system policy=never, process wants "madvise" policy for itself.
> Will let Yafang confirm.
>
> [1] https://lore.kernel.org/all/13b68fa0-8755-43d8-8504-d181c2d46134@xxxxxxxxx/
>

Yeah I really object to that case. I explicitly said on your series I object to
it, I believe David did too.

Never should mean never.

It's a NACK if that's what this is about unless I'm missing something here.

I agree global settings are not fine-grained enough, but 'sys admins refuse to
do X so we want to ignore what they do' is... really not right at all.