On 7/23/25 3:03 AM, Alan Maguire wrote:
On 23/07/2025 09:39, Alexis Lothoré wrote:
Hi Ihor,
On Wed Jul 23, 2025 at 1:42 AM CEST, Ihor Solodrai wrote:
On 7/16/25 2:04 AM, Alexis Lothoré (eBPF Foundation) wrote:
[...]
cu__finalize(cu, cus, conf);
- if (cus__steal_now(cus, cu, conf) == LSK__STOP_LOADING)
+ if (cus__steal_now(cus, cu, conf) == LSK__ABORT)
goto out_abort;
I think we should handle both LSK__STOP_LOADING and LSK__ABORT here,
otherwise the worker will keep loading DWARF (which is most of the
pahole's work) and passing cu-s to the stealer function.
I guess in many cases it'll complete without errors, but it's just
unnecessary work.
Maybe add a check for LSK__STOP_LOADING here with goto_ok?
Hmmm, I am not sure how to follow you here. Even if cus__steal_now returns
LSK__STOP_LOADING, I need to return DWARF_CB_OK from
cus__merge_and_process_cu, otherwise it will be processed as an error
again. Or are you suggesting just make sure to delete current cu but still
return DWARF_CB_OK when getting LSK__STOP_LOADING, as done below ?
LSK__STOP_LOADING signals that no more input needs to be processed.
So I was thinking we should make sure the workers terminate without
error and return DWARF_CB_OK. Whether we delete a cu or not is not
that important since we are exiting soon anyway.
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -3796,6 +3796,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
Dwarf_Off off = 0, noff;
struct cu *cu = NULL;
size_t cuhl;
+ int res;
while (dwarf_nextcu(dw, off, &noff, &cuhl, NULL, &pointer_size,
&offset_size) == 0) {
@@ -3874,7 +3875,8 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
goto out_abort;
cu__finalize(cu, cus, conf);
- if (cus__steal_now(cus, cu, conf) == LSK__ABORT)
+ res = cus__steal_now(cus, cu, conf);
+ if (res == LSK__STOP_LOADING || res == LSK__ABORT)
goto out_abort;
return 0;
@@ -3882,7 +3884,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
out_abort:
dwarf_cu__delete(cu);
cu__delete(cu);
- return DWARF_CB_ABORT;
+ return res == LSK__STOP_LOADING ? DWARF_CB_OK : DWARF_CB_ABORT;
}
Thanks,
Alexis
Good point; I think ideally we'd like to stop processing in either case
but still distinguish error from early exit. We could perhaps set the
appropriate LSK_ value in dcus (dcus->lsk_status?) to make the
distinction and use it in cus__load_module(), renaming out_error in
dwarf_cus__process_cu() to out_early or something to underline the fact
that it's an early exit not necessarily a failure. So we preserve the
return of CB_ABORT for both failure and early exit, but can still tell
these apart using dcus->lsk_status. Would that work?
Good suggestion. Yes, I think that would work better than just return
DWARF_CB_OK from the worker.
I overlooked the fact that there are many workers, and one of them
exiting with DWARF_CB_OK would not affect the others. But
DWARF_CB_ABORT would via the cus_processing_queue.abort flag.