On Fri, Jun 13, 2025 at 8:35 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
>
> When building the selftest with arm64/clang20, the following test failed:
> ...
> subtest_multispec_usdt:PASS:usdt_100_called 0 nsec
> subtest_multispec_usdt:PASS:usdt_100_sum 0 nsec
> subtest_multispec_usdt:FAIL:usdt_300_bad_attach unexpected pointer: 0xaaaad82a2a80
> #469/2 usdt/multispec:FAIL
> #469 usdt:FAIL
>
> But gcc11 built kernel/selftests succeeded. Further debug found clang generated
> code has much less argument pattern after dedup, but gcc generated code has
> a lot more.
>
> Below is the test:usdt_100 stapsdt's with clang20 generated binary:
>
> $ readelf -n usdt.test.o
> Displaying notes found in: .note.stapsdt
>   Owner      Data size   Description
>   stapsdt    0x0000002e  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x0000000000000024, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[x9]
>   stapsdt    0x0000002e  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x000000000000003c, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[x9]
>   ...
>   stapsdt    0x0000002e  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x0000000000000954, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[x9]
>   stapsdt    0x0000002e  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x000000000000096c, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[x8]
>
> Below is the test:usdt_100 stapsdt's with gcc11 generated binary:
>
> $ readelf -n usdt.test.o
> Displaying notes found in: .note.stapsdt
>   Owner      Data size   Description
>   ...
>   stapsdt    0x0000002e  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x000000000000470c, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[sp]
>   stapsdt    0x00000031  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x0000000000004724, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[sp, 4]
>   ...
>   stapsdt    0x00000033  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x000000000000503c, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[sp, 392]
>   stapsdt    0x00000033  NT_STAPSDT (SystemTap probe descriptors)
>     Provider: test
>     Name: usdt_100
>     Location: 0x0000000000005054, Base: 0x0000000000000000, Semaphore: 0x0000000000000006
>     Arguments: -4@[sp, 396]
>
> Considering libbpf dedup of usdt spec's, the clang generated code has 3 spec's, and
> gcc has 100 spec's. Due to this, bpf_program__attach_usdt() failed with gcc but succeeded
> with clang. To fix the test failure for clang generated code, make bpf_program__attach_usdt()
> succeed with necessary macro guards.

This is not the right way. We can just override the
BPF_USDT_MAX_SPEC_CNT #define in the BPF code instead. It's set to 256
by default; seems like we need more due to the unique set of stack
offsets.

But it's kind of surprising that GCC generates such suboptimal code,
where each value is in its own slot on the stack. Look at
trigger_100_usdts(): we just call the same USDT with x + i, where i
goes from 0 to 100. I guess it's because it's debug mode, but still a
bit surprising, IMO.
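For illustration only, here is a toy user-space model of the spec counting discussed above. It is a sketch, not libbpf's implementation: the demo_* names and DEMO_MAX_SPEC_CNT are hypothetical, and it assumes that dedup keys on the argument string alone and that attach fails with -E2BIG once the number of unique specs exceeds the cap. The #ifndef guard mirrors how a default cap can be overridden by defining the macro before the header that uses it is included.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an overridable cap: define it earlier
 * (or via -D) to replace the default of 256.
 */
#ifndef DEMO_MAX_SPEC_CNT
#define DEMO_MAX_SPEC_CNT 256
#endif

#define DEMO_ARG_LEN 24

/* Count distinct argument strings with a simple O(n^2) scan.
 * Assumption: real dedup may key on more than the argument string.
 */
size_t demo_count_unique_specs(const char *const *args, size_t n)
{
	size_t unique = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < i; j++)
			if (strcmp(args[i], args[j]) == 0)
				break;
		if (j == i)	/* no earlier duplicate found */
			unique++;
	}
	return unique;
}

/* Return 0 if the unique specs fit under max_specs, -E2BIG otherwise. */
int demo_attach_check(const char *const *args, size_t n, size_t max_specs)
{
	return demo_count_unique_specs(args, n) > max_specs ? -E2BIG : 0;
}

/* Fill args[] with gcc-like operands, one distinct stack slot per call
 * site. The first entry is written as "[sp, 0]" rather than "[sp]"; it
 * is still unique, which is all this model needs.
 */
void demo_fill_stack_args(char (*buf)[DEMO_ARG_LEN], const char **args, size_t n)
{
	assert(n == 0 || (buf != NULL && args != NULL));

	for (size_t i = 0; i < n; i++) {
		snprintf(buf[i], DEMO_ARG_LEN, "-4@[sp, %zu]", 4 * i);
		args[i] = buf[i];
	}
}
```

In this model, 300 gcc-style call sites (each a distinct stack offset) make demo_attach_check() return -E2BIG under the default cap of 256, while 300 clang-style call sites that share a couple of register operands pass. Lowering the cap below the unique count makes the clang-style case fail as well.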
pw-bot: cr

>
> Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> ---
>  tools/testing/selftests/bpf/prog_tests/usdt.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/usdt.c b/tools/testing/selftests/bpf/prog_tests/usdt.c
> index 495d66414b57..7429029cbd63 100644
> --- a/tools/testing/selftests/bpf/prog_tests/usdt.c
> +++ b/tools/testing/selftests/bpf/prog_tests/usdt.c
> @@ -272,12 +272,19 @@ static void subtest_multispec_usdt(void)
>
>  	/* we'll reuse usdt_100 BPF program for usdt_300 test */
>  	bpf_link__destroy(skel->links.usdt_100);
> +
>  	skel->links.usdt_100 = bpf_program__attach_usdt(skel->progs.usdt_100, -1, "/proc/self/exe",
>  							"test", "usdt_300", NULL);
> +#if __clang__ && defined(__aarch64__)
> +	if (!ASSERT_OK_PTR(skel->links.usdt_100, "usdt_300_bad_attach"))
> +		goto cleanup;
> +	bpf_link__destroy(skel->links.usdt_100);
> +#else
>  	err = -errno;
>  	if (!ASSERT_ERR_PTR(skel->links.usdt_100, "usdt_300_bad_attach"))
>  		goto cleanup;
>  	ASSERT_EQ(err, -E2BIG, "usdt_300_attach_err");
> +#endif
>
>  	/* let's check that there are no "dangling" BPF programs attached due
>  	 * to partial success of the above test:usdt_300 attachment
> --
> 2.47.1
>