Re: [PATCH bpf] bpf/selftests: fix test_tcpnotify_user

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 15, 2025 at 01:17:39PM -0700, Martin KaFai Lau wrote:
> On 8/15/25 8:44 AM, Stanislav Fomichev wrote:
> > On 08/15, Matt Bobrowski wrote:
> > > Based on a bisect, it appears that commit 7ee988770326 ("timers:
> > > Implement the hierarchical pull model") has somehow inadvertently
> > > broken BPF selftest test_tcpnotify_user. The error that is being
> > > generated by this test is as follows:
> > > 
> > > 	FAILED: Wrong stats Expected 10 calls, got 8
> > > 
> > > It looks like the change allows timer functions to be run on CPUs
> > > different from the one they are armed on. The test had pinned itself
> > > to CPU 0, and in the past the retransmit attempts also occurred on CPU
> > > 0. The test had set the max_entries attribute for
> > > BPF_MAP_TYPE_PERF_EVENT_ARRAY to 2 and was calling
> > > bpf_perf_event_output() with BPF_F_CURRENT_CPU, so the entry was
> > > likely to be in range. With the change to allow timers to run on other
> > > CPUs, the current CPU tasked with performing the retransmit might be
> > > bumped and in turn fall out of range, as the event will be filtered
> > > out via __bpf_perf_event_output() using:
> > > 
> > >      if (unlikely(index >= array->map.max_entries))
> > >              return -E2BIG;
> > 
> > [..]
> > 
> > > A possible change would be to explicitly set the max_entries attribute
> > > for perf_event_map in test_tcpnotify_kern.c to a value that's at least
> > > as large as the number of CPUs. As it turns out however, if the field
> > > is left unset, then the BPF selftest library will determine the number
> > > of CPUs available on the underlying system and update the max_entries
> > > attribute accordingly.
> > 
> > nit: the max_entries is set by libbpf in map_set_def_max_entries. 'BPF
> > selftest library' seems a bit vague. But not a reason for respin.
> 
> Fixed the commit message. Thanks. Applied.

ACK. Apologies about the oversimplification there.

> > > A further problem with the test is that it has a thread that continues
> > > running up until the program exits. The main thread cleans up some
> > > LIBBPF data structures, while the other thread continues to use them,
> > > which inevitably will trigger a SIGSEGV. This can be dealt with by
> > > telling the thread to run for as long as necessary and doing a
> > > pthread_join on it before exiting the program.
> 
> Some of the "goto err" seems to have similar problem but ok-ish as long as
> the iptables runs fine. I didn't look why the test needs to start a thread
> at all, so I leave it as is. The CI is not running this test. The test is
> getting rotten overall. It should be moved to test_progs. Probably as a
> subtest in some of the existing sockops test in test_progs.

Agree, this test is archaic and needs should be converted or folded
such that it runs under test_progs. If I have time, I'll perform this
clean up.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux