Re: [PATCH bpf-next v4 3/4] bpf: Add kfuncs for read-only string operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 9, 2025 at 2:37 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Fri, 2025-05-09 at 11:20 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > +/**
> > > + * bpf_strchr - Find the first occurrence of a character in a string
> > > + * @s: The string to be searched
> > > + * @c: The character to search for
> > > + *
> > > + * Note that the %NUL-terminator is considered part of the string, and can
> > > + * be searched for.
> > > + *
> > > + * Return:
> > > + * * const char * - Pointer to the first occurrence of @c within @s
> > > + * * %NULL        - @c not found in @s
> > > + * * %-EFAULT     - Cannot read @s
> > > + * * %-E2BIG      - @s too large
> > > + */
> > > +__bpf_kfunc const char *bpf_strchr(const char *s, char c)
> >
> > so let's say we found the character, we return a pointer to it, and
> > that memory goes away (because we never owned it, so we don't really
> > know what and when will happen with it). Question, will verifier allow
> > BPF program to dereference this pointer? If yes, that's a problem. But
> > if not, then I'm not sure there is much point in returning a pointer.
> >
> >
> > I'm just trying to imply that in BPF world integer-based APIs work
> > better/safer, overall? For strings, we can switch any
> > pointer-returning API to position-returning (or negative error) API
> > and it would more or less naturally fit into BPF API surface, no?
>
> Integer based API solves the problem with memory access but is not
> really ergonomic. W/o special logic in verifier the returned int would
> be unbounded, hence the user would have to compare it with string
> length before using.
>
> It looks like some verifier logic is necessary regardless of API being
> integer or pointer based. In any case verifier needs additional rules
> for each pointer type to adjust bounds on the return value or its refobj_id.
>

You can't safely dereference any pointer returned from these APIs,
because the memory might not be there anymore.

For integers, same idea. If you use bpf_probe_read_{kernel,user} to
read data, then verifier doesn't care about the value of integer.

But that's not ergonomic, so in some other thread few days ago I was
proposing that we should add an untyped counterpart to bpf_core_cast()
that would just make any memory accesses performed using
__get_kernel_nofault() semantics. And so then:


const char *str = <random value or we got it from somewhere untrusted>;
int space_idx = bpf_strchr(str, ' ');
if (space_idx < 0)
  return -1; /* bad luck */

const char *s = bpf_mem_cast(str);
char buf[64] = {};

bpf_for(i, 0, space_idx)
    buf[i] = s[i]; /* MAGIC */

bpf_printk("STUFF BEFORE SPACE: %s", buf);


Tbh, when dealing with libc string APIs, I still very frequently
convert resulting pointers into indices, so I don't think it's
actually an API regression to have index-based string APIs





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux