On Wed, May 14, 2025 at 3:31 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > On 12/05/2025 10:17, Tony Ambardar wrote: > > On Fri, May 09, 2025 at 11:40:47AM -0700, Andrii Nakryiko wrote: > >> On Thu, May 8, 2025 at 6:22 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > >>> > >>> When testing v1 of [1] we noticed that functions with 0-sized structs > >>> as parameters were not part of BTF encoding; this was fixed in v2. > >>> However we need to make sure we handle such zero-sized structs > >>> correctly since they confound the calling convention expectations - > >>> no registers are used for the empty struct so this has knock-on effects > >>> for subsequent register-parameter matching. > >> > >> Do you have a list (or at least an example) of the function we are > >> talking about, just curious to see what's that. > >> > > > > BTW, Alan shared an example in the other pahole patch thread: > > https://lore.kernel.org/dwarves/07d92da1-36f3-44d2-a0a4-cf7dabf278c6@xxxxxxxxxx/ > > > > Yep, the one I came across on x86_64 was > > static int __io_run_local_work(struct io_ring_ctx *ctx, io_tw_token_t > tw, int min_events, int max_events); > > (the io_tw_token_t parameter is a typedef for > > struct io_tw_state { > }; > > > >> The question I have is whether it's safe to assume that regardless of > >> architecture we can assume that zero-sized struct has no effect on > >> register allocation (which would seem logical, but is that true for > >> all ABIs). > >> > >> BTW, while looking at patch #2, I noticed that > >> btf_distill_func_proto() disallows functions returning > >> struct-by-value, which seems overly aggressive, at least for structs > >> of up to 8 bytes. So maybe if we can validate that both cases are not > >> introducing any new quirks across all supported architectures, we can > >> solve both limitations? > >> > > > > Good idea. I'll try and address this and add a return value test. > > > Given pahole (and my related patch) assume pass-by-value for well-sized > > structs, I'd like to see this too. But while the pahole patch works on > > 64/32-bit archs, I noticed from patch #1 that e.g. ___bpf_treg_cnt() > > seems to hard-code a 64-bit register size. Perhaps we can fix that too? > > > > So I think your concern is the assumptions > > > __builtin_choose_expr(sizeof(t) == 8, 1, \ > __builtin_choose_expr(sizeof(t) == 16, 2, \ > > ? We may need arch-specific macros that specify register size that we > can use here, or is there a better way? we know the target architecture, so this shouldn't be hard to define the word size accordingly and use that here? > > >> P.S., oh, and s390x selftest (test_struct_args) isn't happy, please check. > > Yep, working to repro this locally now, thanks. > > Alan