Re: [PATCH v3] sched/numa: add statistics of numa balance task migration


On 5/5/25 11:49, Libo Chen wrote:
> 
> 
> On 5/5/25 11:27, Chen, Yu C wrote:
>> Hi Michal,
>>
>> On 5/6/2025 1:46 AM, Michal Koutný wrote:
>>> On Mon, May 05, 2025 at 11:03:10PM +0800, "Chen, Yu C" <yu.c.chen@xxxxxxxxx> wrote:
>>>> According to this address,
>>>>     4c 8b af 50 09 00 00    mov    0x950(%rdi),%r13  <--- r13 = p->mm;
>>>>     49 8b bd 98 04 00 00    mov    0x498(%r13),%rdi  <--- p->mm->owner
>>>> It seems that this task to be swapped has NULL mm_struct.
>>>
>>> So it's likely a kernel thread. Does it make sense to NUMA balance
>>> those? (I naïvely think it doesn't, please correct me.) ...
>>>
>>
>> I agree kernel threads are not supposed to be covered by
>> NUMA balance, because currently NUMA balance only considers
>> user pages via VMAs, and one question below:
>>
>>>>   static void __migrate_swap_task(struct task_struct *p, int cpu)
>>>>   {
>>>>          __schedstat_inc(p->stats.numa_task_swapped);
>>>> -       count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
>>>> +       if (p->mm)
>>>> +               count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
>>>
>>> ... proper fix should likely guard this earlier, like the guard in
>>> task_numa_fault() but for the other swapped task.
>> I see. For task swapping in task_numa_compare(),
>> it is triggered when there are no idle CPUs in task A's
>> preferred node.
>> In this case, we choose a task B on A's preferred node,
>> and swap B with A. This helps improve A's NUMA locality
>> without introducing load imbalance between nodes.
>>
Hi Chenyu,

There are two problems here:
1. Many kthreads are pinned, so despite all the effort in task_numa_compare()
and task_numa_find_cpu(), the swap may not end up happening. I only see a
check on the source task, cpumask_test_cpu(cpu, env->p->cpus_ptr), but not on
the dst task.
2. Assuming B is migratable, the swap can potentially make things worse for B,
right? I think some kthreads are quite cache-sensitive, and we swap as if their
locality doesn't matter.

Ideally we probably just want to stay off kthreads; if we cannot find any other
tasks with p->mm set, just don't swap (?). That sounds like a brand new patch
though.
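
To make the idea concrete, here is a rough userspace sketch of the candidate
filter, not kernel code and not a proposed patch. struct task_model, struct
mm_model, cpu_allowed(), swap_candidate_ok() and pick_swap_candidate() are all
made-up names for illustration, and a 64-bit mask stands in for cpus_ptr. It
assumes a NULL mm is a good enough marker for a kthread and that a swap needs
both tasks to be allowed on each other's CPU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct mm_model {
	int id;
};

struct task_model {
	const char *comm;
	struct mm_model *mm;	/* NULL models a kernel thread */
	uint64_t cpus_allowed;	/* bit n set: task may run on CPU n */
	int cpu;		/* CPU the task currently runs on */
};

/* Models an affinity test against a 64-CPU mask. */
static bool cpu_allowed(const struct task_model *t, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (t->cpus_allowed >> cpu) & 1u;
}

/*
 * A swap moves src onto dst's CPU and dst onto src's CPU, so both
 * directions have to be permitted. Tasks without an mm are skipped.
 */
static bool swap_candidate_ok(const struct task_model *src,
			      const struct task_model *dst)
{
	if (!dst->mm)
		return false;		/* stay off kthreads */
	if (!cpu_allowed(src, dst->cpu))
		return false;		/* source-side check */
	if (!cpu_allowed(dst, src->cpu))
		return false;		/* dst-side check */
	return true;
}

/* Returns the first acceptable candidate, or NULL meaning "don't swap". */
static const struct task_model *
pick_swap_candidate(const struct task_model *src,
		    const struct task_model *cands, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (swap_candidate_ok(src, &cands[i]))
			return &cands[i];
	}
	return NULL;
}
```

The real selection in task_numa_compare() also weighs load and NUMA scores,
which this sketch leaves out on purpose; it only shows the two filters and the
"no candidate, no swap" fallback.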



Libo 
>> But B's Numa node preference is not mandatory in
>> current implementation IIUC, because B's load is mainly
> 
> hmm, that doesn't seem to be right, can we choose a B that
> is not a kthread from A's preferred node?
> 
>> considered. That is to say, is it legit to swap a
>> Numa sensitive task A with a non-Numa sensitive kernel
>> thread B? If not, I think we can add kernel thread
>> check in task swap like the guard in
>> task_tick_numa()/task_numa_fault().
>>
> 
> 
>> thanks,
>> Chenyu
>>
>>>
>>> Michal
>>
> 




