Re: [RFC PATCH bpf-next v1 1/2] bpf: Explore PTR_TO_STACK as R0 for bpf_dynptr_slice_rdwr

Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> · Thu, 20 Feb 2025 16:41:04 +0100

On Thu, 20 Feb 2025 at 01:13, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Wed, Feb 19, 2025 at 10:10 AM Kumar Kartikeya Dwivedi
> <memxor@xxxxxxxxx> wrote:
> >
> > On Wed, 19 Feb 2025 at 18:41, Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, Feb 19, 2025 at 4:51 AM Kumar Kartikeya Dwivedi
> > > <memxor@xxxxxxxxx> wrote:
> > > >
> > > > For the bpf_dynptr_slice_rdwr kfunc, the verifier may return a pointer
> > > > to the underlying packet (if the requested slice is linear), or copy out
> > > > the data to the buffer passed into the kfunc. The verifier performs
> > > > symbolic execution assuming the returned value is a PTR_TO_MEM of a
> > > > certain size (passed into the kfunc), and ensures reads and writes are
> > > > within bounds.
> > >
> > > sounds like
> > > check_kfunc_mem_size_reg() -> check_mem_size_reg() ->
> > > check_helper_mem_access()
> > >    case PTR_TO_STACK:
> > >       check_stack_range_initialized()
> > >          clobber = true
> > >              if (clobber) {
> > >                   __mark_reg_unknown(env, &state->stack[spi].spilled_ptr);
> > >
> > > is somehow broken?
> > >
> > > ohh. It might be:
> > > || !is_kfunc_arg_optional(meta->btf, buff_arg)
> > >
> > > This bit is wrong then.
> > > When arg is not-null check_kfunc_mem_size_reg() should be called.
> > > The PTR_TO_STACK abuse is a small subset of issues
> > > if check_kfunc_mem_size_reg() is indeed not called.
> >
> > The condition looks ok to me.
> >
> > The condition to do check_mem_size_reg is !null || !opt.
> > So when it's null, and it's opt, it will be skipped.
> > When it's null, and it's not opt, the check will happen.
> > When arg is not-null, the said function is called, opt does not matter then.
> > So the stack slots are marked misc.
> >
> > In our case we're not passing a NULL pointer in the selftest.
> >
> > The problem occurs once we spill to that slot _after_ the call, and
> > then do a write through returned mem pointer.
> >
> > The final few lines from the selftest do the dirty thing, where r0 is
> > aliasing fp-8, and r1 = 0.
> >
> > + *(u64 *)(r10 - 8) = r8; \
> > + *(u64 *)(r0 + 0) = r1; \
> > + r8 = *(u64 *)(r10 - 8); \
> > + r0 = *(u64 *)(r8 + 0); \
> >
> > The write through r0 must re-mark the stack, but the verifier doesn't
> > know it's pointing to the stack.
> > push_stack was the conceptually cleaner/simpler fix, but it apparently
> > isn't good enough.
> >
> > Remarking the stack on write to PTR_TO_MEM, or invalidating PTR_TO_MEM
> > when r8 is spilled to fp-8 first time after the call are two options.
> > Both have some concerns (first, the misaligned stack access is not
> > caught, second PTR_TO_MEM may outlive stack frame).
>
> Reading the description of the problem my first instinct was to have
> stack slots with associated obj_ref_id for such cases and when
> something writes into that stack slot, invalidate everything with that
> obj_ref_id. So that's probably what you mean by invalidating
> PTR_TO_MEM, right?
>
> Not sure I understand what "PTR_TO_STACK with mem_size" (that Alexei
> mentioned in another email) means, though, so hard to compare.
>

Invalidation is certainly one option. The one Alexei mentioned was
where we discussed bounding how much data can be read through the
PTR_TO_STACK (similar to PTR_TO_MEM), and mark r0 as PTR_TO_STACK.
This ends up combining the constraints of both types of pointers (it
may as well be called PTR_TO_STACK_OR_MEM) without forking paths.

The benefit over the push_stack approach is that we avoid the states
regression for cls_redirect and balancer_ingress.
For the selftest failure, I plan to just silence the error by changing it.

> >
> > I don't recall if there was a hardware/JIT specific reason to care
> > about stack access alignment or not (on some architectures), but
> > otherwise we can over approximately mark at 8-byte granularity for any
> > slot(s) that overlap with the buffer to cover such a case. The second
> > problem is slightly trickier, which makes me lean towards invalidating
> > returned PTR_TO_MEM when stack slot is overwritten or frame is
> > destroyed.