On 05/05/25 8:33 pm, Chen, Yu C wrote:
On 5/5/2025 2:43 PM, Jain, Ayush wrote:
Hello,
I am hitting a kernel panic on the latest linux-next while running rcutorture tests:
37ff6e9a2ce3 ("Add linux-next specific files for 20250502")
Reverting this patch fixes it:
3b2339eeb032 ("sched-numa-add-statistics-of-numa-balance-task-migration-v3")
https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/kernel/sched/core.c?id=3b2339eeb032627e9329daf70a4ba8cd62c9cc8d
Resolving the RIP from the oops:
$ ./scripts/faddr2line vmlinux __migrate_swap_task+0x2e/0x180
__migrate_swap_task+0x2e/0x180:
count_memcg_events_mm at include/linux/memcontrol.h:987
(inlined by) count_memcg_events_mm at include/linux/memcontrol.h:978
(inlined by) __migrate_swap_task at kernel/sched/core.c:3356
memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
mm->owner -> NULL
Attaching kernel logs below:
[ 1070.635450] rcu-torture: rcu_torture_read_exit: End of episode
[ 1074.047617] BUG: kernel NULL pointer dereference, address: 0000000000000498
Thanks Ayush,
According to this address,
4c 8b af 50 09 00 00 mov 0x950(%rdi),%r13 <--- r13 = p->mm;
49 8b bd 98 04 00 00 mov 0x498(%r13),%rdi <--- p->mm->owner
It seems that the task being swapped has a NULL mm_struct.
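The reasoning from the fault address can be checked with a small userspace sketch. The struct below is a hypothetical stand-in, not the real struct mm_struct: the padding is chosen only so that `owner` sits at 0x498 as in the disassembly above, and the real offset depends on the kernel configuration. It shows that loading `mm->owner` through a NULL `mm` targets address 0x498, which matches the address in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct mm_struct. The padding only exists to
 * place `owner` at offset 0x498, the displacement seen in the disassembly.
 */
struct mock_mm {
	unsigned char pad[0x498];
	void *owner;
};

/* Compile-time check that the mock layout matches the disassembly. */
static_assert(offsetof(struct mock_mm, owner) == 0x498,
	      "owner must sit at offset 0x498 in the mock layout");

/*
 * Address the CPU would try to load for mm->owner, computed without
 * dereferencing the pointer.
 */
static uintptr_t owner_load_address(const struct mock_mm *mm)
{
	return (uintptr_t)mm + offsetof(struct mock_mm, owner);
}
```

With a NULL `mm`, `owner_load_address()` yields 0x498, so the faulting access is consistent with `p->mm` being NULL rather than `mm->owner`.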
Does the following help?
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 96db6947bc92..0cb8cc4d551d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3353,7 +3353,8 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 static void __migrate_swap_task(struct task_struct *p, int cpu)
 {
 	__schedstat_inc(p->stats.numa_task_swapped);
-	count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
+	if (p->mm)
+		count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
 
 	if (task_on_rq_queued(p)) {
 		struct rq *src_rq, *dst_rq;
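The effect of the guard can be illustrated with a minimal userspace sketch. All names here (`mock_mm`, `mock_task`, `mock_count_event_mm`, `mock_migrate_swap_account`) are hypothetical stand-ins and not kernel APIs. It assumes the task with a NULL `mm` is something like a kernel thread, which runs without a user address space: the per-task statistic is always updated, while the per-mm event is only counted when an mm exists.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-mm (memcg) event counter. */
struct mock_mm {
	unsigned long numa_task_swap_events;
};

/* Hypothetical stand-in for struct task_struct; mm may be NULL. */
struct mock_task {
	struct mock_mm *mm;
	unsigned long numa_task_swapped;
};

/* Dereferences mm unconditionally, like the call that faulted. */
static void mock_count_event_mm(struct mock_mm *mm)
{
	mm->numa_task_swap_events++;
}

/* Accounting step with the proposed guard applied. */
static void mock_migrate_swap_account(struct mock_task *p)
{
	p->numa_task_swapped++;		/* per-task stat, always safe */
	if (p->mm)			/* skip tasks without an mm */
		mock_count_event_mm(p->mm);
}
```

Calling `mock_migrate_swap_account()` on a task with `mm == NULL` only bumps the per-task counter and no longer touches the missing mm.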
Hello Chenyu,
This issue is also seen on IBM Power servers.
The proposed fix works fine. Hence,
Tested-by: Venkat Rao Bagalkote <venkat88@xxxxxxxxxxxxx>
Regards,
Venkat.
Hi Andrew,
Could we hold this patch and not merge it for now?
Besides this regression, Libo has another comment related to
this patch, and I'll address it in the next version. Sorry for
the inconvenience.
thanks,
Chenyu