On Tue, 2025-08-12 at 16:07 +0200, Ilya Leoshkevich wrote:
> The tailcall_bpf2bpf_hierarchy_1 test hangs on s390. Its call graph is
> as follows:
>
> entry()
>     subprog_tail()
>         bpf_tail_call_static(0) -> entry + tail_call_start
>     subprog_tail()
>         bpf_tail_call_static(0) -> entry + tail_call_start
>
> entry() copies its tail call counter to subprog_tail()'s frame, which
> then increments it. However, the incremented result is discarded,
> leading to an astronomically large number of tail calls.
>
> Fix by writing the incremented counter back to entry()'s frame.
>
> Fixes: dd691e847d28 ("s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()")
> Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> ---
>  arch/s390/net/bpf_jit_comp.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> index bb17efe29d65..85695576df6c 100644
> --- a/arch/s390/net/bpf_jit_comp.c
> +++ b/arch/s390/net/bpf_jit_comp.c
> @@ -1790,16 +1790,11 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
>
>  		REG_SET_SEEN(BPF_REG_5);
>  		jit->seen |= SEEN_FUNC;
> +
>  		/*
>  		 * Copy the tail call counter to where the callee expects it.
> -		 *
> -		 * Note 1: The callee can increment the tail call counter, but
> -		 * we do not load it back, since the x86 JIT does not do this
> -		 * either.
> -		 *
> -		 * Note 2: We assume that the verifier does not let us call the
> -		 * main program, which clears the tail call counter on entry.
>  		 */
> +
>  		/* mvc tail_call_cnt(4,%r15),frame_off+tail_call_cnt(%r15) */
>  		_EMIT6(0xd203f000 | offsetof(struct prog_frame, tail_call_cnt),
>  		       0xf000 | (jit->frame_off +
> @@ -1825,6 +1820,17 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
>  		call_r1(jit);
>  		/* lgr %b0,%r2: load return value into %b0 */
>  		EMIT4(0xb9040000, BPF_REG_0, REG_2);
> +
> +		/*
> +		 * Copy the potentially updated tail call counter back.
> +		 */
> +
> +		/* mvc frame_off+tail_call_cnt(4,%r15),tail_call_cnt(%r15) */
> +		_EMIT6(0xd203f000 | (jit->frame_off +
> +				     offsetof(struct prog_frame,
> +					      tail_call_cnt)),
> +		       0xf000 | offsetof(struct prog_frame, tail_call_cnt));
> +
>  		break;
>  	}
>  	case BPF_JMP | BPF_TAIL_CALL: {

Hmm, we need to do this only for BPF_PSEUDO_CALLs: a helper or a kfunc
is unaware of the tail call counter convention and would clobber the
counter with something random, potentially causing a kernel stack
overflow. I will send a v2 and also provide a test that catches this
issue.
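Concretely, something like this (an untested sketch only, not the v2
itself; it assumes, as elsewhere in bpf_jit_insn(), that insn points at
the instruction being translated, and relies on the verifier marking
bpf2bpf calls with BPF_PSEUDO_CALL in insn->src_reg, which helper and
kfunc calls do not carry):

	case BPF_JMP | BPF_CALL:
		...
		call_r1(jit);
		/* lgr %b0,%r2: load return value into %b0 */
		EMIT4(0xb9040000, BPF_REG_0, REG_2);

		/*
		 * Only bpf2bpf (subprog) calls maintain the tail call
		 * counter convention; after a helper or kfunc call the
		 * tail_call_cnt slot holds whatever the callee left
		 * there, so it must not be copied back.
		 */
		if (insn->src_reg == BPF_PSEUDO_CALL) {
			/* mvc frame_off+tail_call_cnt(4,%r15),tail_call_cnt(%r15) */
			_EMIT6(0xd203f000 | (jit->frame_off +
					     offsetof(struct prog_frame,
						      tail_call_cnt)),
			       0xf000 | offsetof(struct prog_frame,
						 tail_call_cnt));
		}
		break;

Whether the forward copy before the call wants the same check can be
settled in the v2.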