Re: [PATCH bpf-next v7 2/4] bpf: Add kfuncs for read-only string operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 23, 2025 at 6:48 AM Viktor Malik <vmalik@xxxxxxxxxx> wrote:
>
> String operations are commonly used so this exposes the most common ones
> to BPF programs. For now, we limit ourselves to operations which do not
> copy memory around.
>
> Unfortunately, most in-kernel implementations assume that strings are
> %NUL-terminated, which is not necessarily true, and therefore we cannot
> use them directly in the BPF context. Instead, we open-code them using
> __get_kernel_nofault instead of plain dereference to make them safe and
> limit the strings length to XATTR_SIZE_MAX to make sure the functions
> terminate. When __get_kernel_nofault fails, functions return -EFAULT.
> Similarly, when the size bound is reached, the functions return -E2BIG.
> In addition, we return -ERANGE when the passed strings are outside of
> the kernel address space.
>
> Note that thanks to these dynamic safety checks, no other constraints
> are put on the kfunc args (they are marked with the "__ign" suffix to
> skip any verifier checks for them).
>
> All of the functions return integers, including functions which normally
> (in kernel or libc) return pointers to the strings. The reason is that
> since the strings are generally treated as unsafe, the pointers couldn't
> be dereferenced anyways. So, instead, we return an index to the string
> and let user decide what to do with it. This also nicely fits with
> returning various error codes when necessary (see above).
>
> Suggested-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> Signed-off-by: Viktor Malik <vmalik@xxxxxxxxxx>
> ---
>  kernel/bpf/helpers.c | 389 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 389 insertions(+)
>

few more comments below, beside the -ENOENT issue Alexei mentioned earlier

[...]

> +/**
> + * bpf_strrchr - Find the last occurrence of a character in a string
> + * @s__ign: The string to be searched
> + * @c: The character to search for
> + *
> + * Return:
> + * * >=0      - Index of the last occurrence of @c within @s__ign
> + * * %-1      - @c not found in @s__ign
> + * * %-EFAULT - Cannot read @s__ign
> + * * %-E2BIG  - @s__ign is too large
> + * * %-ERANGE - @s__ign is outside of kernel address space
> + */
> +__bpf_kfunc int bpf_strrchr(const char *s__ign, int c)
> +{
> +       char sc;
> +       int i, last = -1;
> +
> +       if (!copy_from_kernel_nofault_allowed(s__ign, 1))
> +               return -ERANGE;
> +
> +       guard(pagefault)();
> +       for (i = 0; i < XATTR_SIZE_MAX; i++) {
> +               __get_kernel_nofault(&sc, s__ign, char, err_out);
> +               if (sc == '\0')
> +                       return last;
> +               if (sc == c)
> +                       last = i;

swap these two ifs, so that bpf_strrchr("blah", 0) will still return a
meaningful result (effectively become bpf_strlen(), of course)? That
should match strrchr's behavior in libc, if I'm reading this
correctly:

"The terminating NUL character is considered to be part of the string."


> +               s__ign++;
> +       }
> +       return -E2BIG;
> +err_out:
> +       return -EFAULT;
> +}
> +

[...]

> +/**
> + * bpf_strspn - Calculate the length of the initial substring of @s__ign which
> + *              only contains letters in @accept__ign
> + * @s__ign: The string to be searched
> + * @accept__ign: The string to search for
> + *
> + * Return:
> + * * >=0      - The length of the initial substring of @s__ign which only
> + *              contains letters from @accept__ign
> + * * %-EFAULT - Cannot read one of the strings
> + * * %-E2BIG  - One of the strings is too large
> + * * %-ERANGE - One of the strings is outside of kernel address space
> + */
> +__bpf_kfunc int bpf_strspn(const char *s__ign, const char *accept__ign)
> +{
> +       char cs, ca;
> +       bool found;
> +       int i, j;
> +
> +       if (!copy_from_kernel_nofault_allowed(s__ign, 1) ||
> +           !copy_from_kernel_nofault_allowed(accept__ign, 1)) {
> +               return -ERANGE;
> +       }
> +
> +       guard(pagefault)();
> +       for (i = 0; i < XATTR_SIZE_MAX; i++) {
> +               __get_kernel_nofault(&cs, s__ign, char, err_out);
> +               if (cs == '\0')
> +                       return i;
> +               found = false;
> +               for (j = 0; j < XATTR_SIZE_MAX; j++) {
> +                       __get_kernel_nofault(&ca, accept__ign + j, char, err_out);
> +                       if (cs == ca) {
> +                               found = true;
> +                               break;
> +                       }
> +                       if (ca == '\0')
> +                               break;
> +               }
> +               if (!found)
> +                       return i;

nit: you shouldn't need "found", just `ca == '\0'` would mean "not
found", I think?

so you'd have a succinct `if (cs == ca || ca == '\0') break;` in the
innermost loop


> +               s__ign++;
> +       }
> +       return -E2BIG;
> +err_out:
> +       return -EFAULT;
> +}
> +

[...]

> +/**
> + * bpf_strnstr - Find the first substring in a length-limited string
> + * @s1__ign: The string to be searched
> + * @s2__ign: The string to search for
> + * @len: the maximum number of characters to search
> + *
> + * Return:
> + * * >=0      - Index of the first character of the first occurrence of @s2__ign
> + *              within the first @len characters of @s1__ign
> + * * %-1      - @s2__ign not found in the first @len characters of @s1__ign
> + * * %-EFAULT - Cannot read one of the strings
> + * * %-E2BIG  - One of the strings is too large
> + * * %-ERANGE - One of the strings is outside of kernel address space
> + */
> +__bpf_kfunc int bpf_strnstr(const char *s1__ign, const char *s2__ign, size_t len)
> +{
> +       char c1, c2;
> +       int i, j;
> +
> +       if (!copy_from_kernel_nofault_allowed(s1__ign, 1) ||
> +           !copy_from_kernel_nofault_allowed(s2__ign, 1)) {
> +               return -ERANGE;
> +       }
> +
> +       guard(pagefault)();
> +       for (i = 0; i < XATTR_SIZE_MAX; i++) {
> +               for (j = 0; i + j < len && j < XATTR_SIZE_MAX; j++) {
> +                       __get_kernel_nofault(&c1, s1__ign + j, char, err_out);

move this after you check `c2 == 0` below? why reading this character
if we are not going to compare it?

> +                       __get_kernel_nofault(&c2, s2__ign + j, char, err_out);
> +                       if (c2 == '\0')
> +                               return i;
> +                       if (c1 == '\0')
> +                               return -1;
> +                       if (c1 != c2)
> +                               break;
> +               }
> +               if (j == XATTR_SIZE_MAX)
> +                       return -E2BIG;
> +               if (i + j == len)
> +                       return -1;
> +               s1__ign++;
> +       }
> +       return -E2BIG;
> +err_out:
> +       return -EFAULT;
> +}
> +

[...]





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux