On 3/25/25 2:10 AM, Domenico Andreoli wrote:
> Hi,
>
> This a forward of Debian bug report [0] where you can find more
> details. At [1] and [2] you can get the kernel and module to reproduce.
> I could reproduce on both amd64 and arm64 using pahole 1.29.
>
> This is marked as serious severity because it makes the autobuilder hang
> as well [3].
>
> Could you please help?
>
> Regards,
> Domenico

Hi Domenico, thanks for the bug report.

I debugged the hang, and it appears that "abort" handling in case of
a BTF encoding error was overlooked in recent changes to speed up
parallel encoding.

Could you please try the diff below, and check if it resolves the
hang?

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 84122d0..e1ba7bc 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -3459,6 +3459,7 @@ static struct {
 	 */
 	uint32_t next_cu_id;
 	struct list_head jobs;
+	bool abort;
 } cus_processing_queue;
 
 enum job_type {
@@ -3479,6 +3480,7 @@ static void cus_queue__init(void)
 	pthread_cond_init(&cus_processing_queue.job_added, NULL);
 	INIT_LIST_HEAD(&cus_processing_queue.jobs);
 	cus_processing_queue.next_cu_id = 0;
+	cus_processing_queue.abort = false;
 }
 
 static void cus_queue__destroy(void)
@@ -3535,8 +3537,9 @@ static struct cu_processing_job *cus_queue__enqdeq_job(struct cu_processing_job
 		pthread_cond_signal(&cus_processing_queue.job_added);
 	}
 	for (;;) {
+		bool abort = __atomic_load_n(&cus_processing_queue.abort, __ATOMIC_SEQ_CST);
 		job = cus_queue__try_dequeue();
-		if (job)
+		if (job || abort)
 			break;
 		/* No jobs or only steals out of order */
 		pthread_cond_wait(&cus_processing_queue.job_added, &cus_processing_queue.mutex);
@@ -3653,6 +3656,9 @@ static void *dwarf_loader__worker_thread(void *arg)
 
 	while (!stop) {
 		job = cus_queue__enqdeq_job(job);
+		if (!job)
+			goto out_abort;
+
 		switch (job->type) {
 
 		case JOB_DECODE:
@@ -3688,6 +3694,8 @@ static void *dwarf_loader__worker_thread(void *arg)
 
 	return (void *)DWARF_CB_OK;
 out_abort:
+	__atomic_store_n(&cus_processing_queue.abort, true, __ATOMIC_SEQ_CST);
+	pthread_cond_signal(&cus_processing_queue.job_added);
 	return (void *)DWARF_CB_ABORT;
 }
 
@@ -4028,7 +4036,7 @@ static int cus__process_file(struct cus *cus, struct conf_load *conf, int fd,
 
 	/* Process the one or more modules gleaned from this file. */
 	int err = dwfl_getmodules(dwfl, cus__process_dwflmod, &parms, 0);
-	if (err < 0)
+	if (err)
 		return -1;
 
 	// We can't call dwfl_end(dwfl) here, as we keep pointers to strings
-- 
2.48.1

>
>
> The command to succeed:
>
> This simplified (sequential) command succeeds:
>
> cp nvidia-modeset.base.ko nvidia-modeset.ko
> LLVM_OBJCOPY="x86_64-linux-gnu-objcopy" pahole -J --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs --btf_features=distilled_base --btf_base vmlinux nvidia-modeset.ko -j1
> echo $?
>
> producing this output:
> ===== 8< =====
> dwarf_expr: unhandled 0x12 DW_OP_ operation
> Unsupported DW_TAG_reference_type(0x10): type: 0x28172
> Error while encoding BTF.
> 0
> ===== >8 =====
>
>
> While this (parallel) command hangs:
>
> cp nvidia-modeset.base.ko nvidia-modeset.ko
> LLVM_OBJCOPY="x86_64-linux-gnu-objcopy" pahole -J --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs --btf_features=distilled_base --btf_base vmlinux nvidia-modeset.ko -j2
> echo $?
>
> producing this output:
> ===== 8< =====
> dwarf_expr: unhandled 0x12 DW_OP_ operation
> dwarf_expr: unhandled 0x12 DW_OP_ operation
> dwarf_expr: unhandled 0x12 DW_OP_ operation
> dwarf_expr: unhandled 0x12 DW_OP_ operation
> Unsupported DW_TAG_reference_type(0x10): type: 0x28172
> Error while encoding BTF.
> Terminated
> 143
> ===== >8 =====

Please note that even though the sequential command succeeds, the BTF
output is going to be incomplete (and potentially invalid). The
underlying issue is that there is an unhandled DW_TAG in the BTF
encoder. The encoding process exits on errors like this.
It would be nice if you provided all the input (base vmlinux and the
module) that led to this error, so we could investigate.

Thank you!

>
>
> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1100503
> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=1100503;filename=vmlinux.zst;msg=19
> [2] https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=1100503;filename=nvidia-modeset.base.ko.zst;msg=12
> [3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1101262
>
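
For anyone following the hang itself: the general problem is that
workers blocked in pthread_cond_wait() never wake up once the thread
that hit the error has returned. The sketch below illustrates that
pattern in isolation. It is not the pahole code. The names demo_queue,
demo_worker, demo_worker_thread and demo_run are invented for
illustration, and it deliberately takes a simpler route than the diff:
the abort flag is kept under the mutex and the waiters are woken with
pthread_cond_broadcast(), rather than an atomic flag plus
pthread_cond_signal().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_WORKERS 16

/* Hypothetical miniature of a shared job queue (not pahole's struct). */
struct demo_queue {
	pthread_mutex_t mutex;
	pthread_cond_t job_added;
	int pending;	/* jobs waiting to be taken; stays 0 in this demo */
	bool abort;	/* set once any worker hits a fatal error */
};

struct demo_worker {
	struct demo_queue *q;
	bool fail;	/* simulate an encoding error in this worker */
	bool aborted;	/* result: did the worker leave via the abort path? */
};

static void *demo_worker_thread(void *arg)
{
	struct demo_worker *w = arg;
	struct demo_queue *q = w->q;

	pthread_mutex_lock(&q->mutex);
	if (w->fail) {
		/* Publish the abort and wake every waiter before leaving. */
		q->abort = true;
		pthread_cond_broadcast(&q->job_added);
	} else {
		/* Wait for a job, but also stop waiting once abort is set. */
		while (q->pending == 0 && !q->abort)
			pthread_cond_wait(&q->job_added, &q->mutex);
	}
	w->aborted = q->abort;
	pthread_mutex_unlock(&q->mutex);
	return NULL;
}

/*
 * Start nworkers threads on an empty queue; worker 0 fails immediately.
 * Returns how many workers exited through the abort path. Without the
 * abort check in the wait loop, the join below would block forever.
 */
static int demo_run(int nworkers)
{
	struct demo_queue q = { .pending = 0, .abort = false };
	struct demo_worker workers[DEMO_MAX_WORKERS];
	pthread_t threads[DEMO_MAX_WORKERS];
	int i, aborted = 0;

	assert(nworkers > 0 && nworkers <= DEMO_MAX_WORKERS);
	pthread_mutex_init(&q.mutex, NULL);
	pthread_cond_init(&q.job_added, NULL);

	for (i = 0; i < nworkers; i++) {
		int err;

		workers[i] = (struct demo_worker){ .q = &q, .fail = (i == 0) };
		err = pthread_create(&threads[i], NULL, demo_worker_thread,
				     &workers[i]);
		assert(err == 0);
		(void)err;
	}
	for (i = 0; i < nworkers; i++) {
		pthread_join(threads[i], NULL);
		if (workers[i].aborted)
			aborted++;
	}

	pthread_cond_destroy(&q.job_added);
	pthread_mutex_destroy(&q.mutex);
	return aborted;
}
```

Calling demo_run(4) from a small main() (built with -pthread) should
return once all four threads have left through the abort path. Dropping
the "&& !q->abort" test from the wait loop reproduces the kind of hang
described in the report.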