On 5/5/25 11:49, Libo Chen wrote:
>
>
> On 5/5/25 11:27, Chen, Yu C wrote:
>> Hi Michal,
>>
>> On 5/6/2025 1:46 AM, Michal Koutný wrote:
>>> On Mon, May 05, 2025 at 11:03:10PM +0800, "Chen, Yu C" <yu.c.chen@xxxxxxxxx> wrote:
>>>> According to this address,
>>>>    4c 8b af 50 09 00 00    mov    0x950(%rdi),%r13   <--- r13 = p->mm;
>>>>    49 8b bd 98 04 00 00    mov    0x498(%r13),%rdi   <--- p->mm->owner
>>>> It seems that this task to be swapped has NULL mm_struct.
>>>
>>> So it's likely a kernel thread. Does it make sense to NUMA balance
>>> those? (I naïvely think it doesn't, please correct me.) ...
>>>
>>
>> I agree kernel threads are not supposed to be covered by
>> NUMA balance, because currently NUMA balance only considers
>> user pages via VMAs, and one question below:
>>
>>>>  static void __migrate_swap_task(struct task_struct *p, int cpu)
>>>>  {
>>>>          __schedstat_inc(p->stats.numa_task_swapped);
>>>> -        count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
>>>> +        if (p->mm)
>>>> +                count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
>>>
>>> ... proper fix should likely guard this earlier, like the guard in
>>> task_numa_fault() but for the other swapped task.
>> I see. For task swapping in task_numa_compare(),
>> it is triggered when there are no idle CPUs in task A's
>> preferred node.
>> In this case, we choose a task B on A's preferred node,
>> and swap B with A. This helps improve A's Numa locality
>> without introducing the load imbalance between Nodes.
>>

Hi Chenyu

There are two problems here:

1. Many kthreads are pinned, so despite all the effort in task_numa_compare()
   and task_numa_find_cpu(), the swap may not end up happening. I only see a
   check on the source task: cpumask_test_cpu(cpu, env->p->cpus_ptr), but not
   on the dst task.

2. Assuming B is migratable, the swap can potentially make B worse, right? I
   think some kthreads are quite cache-sensitive, and we swap as if their
   locality doesn't matter.

Ideally we probably just want to stay off kthreads: if we cannot find any
other p->mm tasks, just don't swap (?). That sounds like a brand new patch
though.

Libo

>> But B's Numa node preference is not mandatory in
>> current implementation IIUC, because B's load is mainly
>
> hmm, that doesn't seem to be right, can we choose B that
> is not a kthread from A's preferred node?
>
>> considered. That is to say, is it legit to swap a
>> Numa sensitive task A with a non-Numa sensitive kernel
>> thread B? If not, I think we can add kernel thread
>> check in task swap like the guard in
>> task_tick_numa()/task_numa_fault().
>>
>
>
>> thanks,
>> Chenyu
>>
>>>
>>> Michal
>>
>
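
The two concerns raised in the reply above (a swap candidate with a NULL mm,
and an affinity check that covers only the source task) can be sketched as a
single predicate. This is a userspace illustration only, not the actual
kernel change: struct task_struct and struct mm_struct below are simplified
stand-ins, the 64-bit cpus_mask field models cpus_ptr, and the helper names
cpu_allowed() and numa_swap_allowed() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
struct mm_struct {
	int unused;
};

struct task_struct {
	struct mm_struct *mm;	/* NULL for kernel threads */
	uint64_t cpus_mask;	/* bit n set => may run on CPU n (models cpus_ptr) */
};

/* Hypothetical helper modelling cpumask_test_cpu(cpu, p->cpus_ptr). */
static bool cpu_allowed(const struct task_struct *p, int cpu)
{
	if (cpu < 0 || cpu >= 64)
		return false;
	return (p->cpus_mask >> cpu) & UINT64_C(1);
}

/*
 * Hypothetical predicate: may src (running on src_cpu) be swapped with
 * dst (running on dst_cpu)?
 *
 *  - Refuse if either task has no mm, i.e. is a kernel thread.
 *  - Check affinity in both directions, not only for the source task.
 */
static bool numa_swap_allowed(const struct task_struct *src, int src_cpu,
			      const struct task_struct *dst, int dst_cpu)
{
	if (!src->mm || !dst->mm)
		return false;
	if (!cpu_allowed(src, dst_cpu))
		return false;
	if (!cpu_allowed(dst, src_cpu))
		return false;
	return true;
}
```

With such a predicate, a candidate B that is a kthread or is pinned away from
A's CPU is simply skipped, which matches the "if we cannot find any other
p->mm tasks, just don't swap" idea. Where such a check would live in
task_numa_compare() is left open here, as it is in the thread.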