On Thu, 8 May 2025 at 22:15, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Wed, 2025-05-07 at 10:17 -0700, Kumar Kartikeya Dwivedi wrote: > > Prepare a function for use in future patches that can extract the file > > info, line info, and the source line number for a given BPF program > > provided it's program counter. > > > > Only the basename of the file path is provided, given it can be > > excessively long in some cases. > > > > This will be used in later patches to print source info to the BPF > > stream. The source line number is indicated by the return value, and the > > file and line info are provided through out parameters. > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> > > --- > > Hi Kumar, > > I did a silly test for this function by calling it for every ip in the > program at the and of the program load. See patch at the end of the > email. The goal was to compare its output with output of the `bpftool > prog dump jited`. > > Next, I used pyperf600_iter.bpf.o as a guinea pig: > > bpftool prog load <kernel>/tools/testing/selftests/bpf/pyperf600_iter.bpf.o /sys/fs/bpf/dbg-prog > bpftool prog dump jited pinned /sys/fs/bpf/dbg-prog > > Overall, the bpftool output looks coherent to what is shown by printk. > However, I see an off-by-one difference, e.g.: > > // bpftool output > > void * get_thread_state(void * tls_base, PidData * pidData): > bpf_prog_2af5b1ca414a1163_get_thread_state: > ; static void *get_thread_state(void *tls_base, PidData *pidData) > 0: endbr64 > ... > ; bpf_probe_read_user(&key, sizeof(key), (void*)(long)pidData->tls_key_addr); > 1f: movl 4(%rsi), %edx > ... > ; bpf_probe_read_user(&key, sizeof(key), (void*)(long)pidData->tls_key_addr); > 29: movl $4, %esi > ... > ; tls_base + 0x310 + key * 0x10 + 0x08); > 33: movl -12(%rbp), %edi > ... > ; bpf_probe_read_user(&thread_state, sizeof(thread_state), > 52: movl $8, %esi > ... > ; return thread_state; > 5f: movq -8(%rbp), %rax > ... > > // printk > > [ 114.506237] func[2] jited_len=106 > [ 114.506306] ip=0, file='(null)', line='(null)', line_num=-2 > [ 114.506395] ip=1, file='pyperf.h', line='static void *get_thread_state(void *tls_base, PidData *pidData)', line_num=77 > [ 114.506571] ip=20, file='pyperf.h', line='bpf_probe_read_user(&key, sizeof(key), (void*)(long)pidData->tls_key_addr);', line_num=82 > [ 114.506765] ip=34, file='pyperf.h', line='tls_base + 0x310 + key * 0x10 + 0x08);', line_num=84 > [ 114.506919] ip=53, file='pyperf.h', line='bpf_probe_read_user(&thread_state, sizeof(thread_state),', line_num=83 > [ 114.507096] ip=60, file='pyperf.h', line='return thread_state;', line_num=85 > > Note that ip for each printk entry is +1 compared to bpftool output. > > Also, there is a BUG splat from KASAN in the end: > > [ 2.343160] ================================================================== > [ 2.343277] BUG: KASAN: slab-out-of-bounds in bpf_prog_get_file_line (kernel/bpf/core.c:3213) > [ 2.343397] Read of size 4 at addr ffff88810b5ea810 by task veristat/145 > [ 2.343496] > [ 2.343542] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014 > [ 2.343544] Call Trace: > ... > [ 2.343592] bpf_prog_get_file_line (kernel/bpf/core.c:3213) > [ 2.343598] ?bpf_prog_2af5b1ca414a1163_get_thread_state+0x64/0x6a 85 > [ 2.343602] bpf_prog_load (kernel/bpf/syscall.c:3014) > ... > [ 2.343686] > [ 2.346851] Allocated by task 145: > [ 2.346912] kasan_save_track (mm/kasan/common.c:48 mm/kasan/common.c:68) > [ 2.346974] __kasan_kmalloc (mm/kasan/common.c:398) > [ 2.347036] __kvmalloc_node_noprof (mm/slub.c:4342 mm/slub.c:5026) > [ 2.347117] check_btf_info (kernel/bpf/verifier.c:17908 kernel/bpf/verifier.c:18120) > [ 2.347179] bpf_check (kernel/bpf/verifier.c:24004) > [ 2.347240] bpf_prog_load (kernel/bpf/syscall.c:2971) > [ 2.347301] __sys_bpf (kernel/bpf/syscall.c:5897) > [ 2.347363] __x64_sys_bpf (kernel/bpf/syscall.c:5958 kernel/bpf/syscall.c:5956 kernel/bpf/syscall.c:5956) > [ 2.347423] do_syscall_64 (arch/x86/entry/syscall_64.c:0) > [ 2.347484] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) > [ 2.347566] > [ 2.347607] The buggy address belongs to the object at ffff88810b5ea000 > [ 2.347607] which belongs to the cache kmalloc-4k of size 4096 > [ 2.347782] The buggy address is located 0 bytes to the right of > [ 2.347782] allocated 2064-byte region [ffff88810b5ea000, ffff88810b5ea810) > > Am I doing something stupid or there is an issue? > > --- 8< ------------------------------------------- > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c > index 4664ab5e8cc7..467ae79f77a1 100644 > --- a/kernel/bpf/core.c > +++ b/kernel/bpf/core.c > @@ -3188,6 +3188,7 @@ EXPORT_SYMBOL(bpf_stats_enabled_key); > EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_exception); > EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_bulk_tx); > > +__attribute__((optnone)) // to see line numbers after decode_stacktrace > int bpf_prog_get_file_line(struct bpf_prog *prog, unsigned long ip, const char **filep, const char **linep) > { > int idx = -1, insn_start, insn_end, len; > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index 64c3393e8270..d1777b8c5558 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -3001,6 +3001,23 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) > err = bpf_prog_new_fd(prog); > if (err < 0) > bpf_prog_put(prog); > + for (int fidx = 0; fidx < prog->aux->func_cnt; ++fidx) { > + struct bpf_prog *fprog = prog->aux->func[fidx]; > + int line_num, prev_line_num; > + const char *filep, *linep; > + > + prev_line_num = -1; > + printk("func[%d] jited_len=%d\n", fidx, fprog->jited_len); > + for (u32 ip = 0; ip < fprog->jited_len; ++ip) { > + filep = NULL; > + linep = NULL; > + line_num = bpf_prog_get_file_line(fprog, (u64)fprog->bpf_func + ip, &filep, &linep); > + if (line_num != prev_line_num) > + printk("ip=%x, file='%s', line='%s', line_num=%d\n", > + ip, filep, linep, line_num); > + prev_line_num = line_num; > + } > + } > return err; > ------------------------------------------- >8 --- > Thanks for trying it out, the ip slip is probably because we get the return address at runtime so it's always trailing the actual ip of what called into us. I will look into the KASAN error / off-by-one.