Re: [PATCH bpf-next v7 2/4] bpf: Add kfuncs for read-only string operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6/23/25 23:29, Andrii Nakryiko wrote:
> On Mon, Jun 23, 2025 at 6:48 AM Viktor Malik <vmalik@xxxxxxxxxx> wrote:
>>
>> String operations are commonly used so this exposes the most common ones
>> to BPF programs. For now, we limit ourselves to operations which do not
>> copy memory around.
>>
>> Unfortunately, most in-kernel implementations assume that strings are
>> %NUL-terminated, which is not necessarily true, and therefore we cannot
>> use them directly in the BPF context. Instead, we open-code them using
>> __get_kernel_nofault instead of plain dereference to make them safe and
>> limit the strings length to XATTR_SIZE_MAX to make sure the functions
>> terminate. When __get_kernel_nofault fails, functions return -EFAULT.
>> Similarly, when the size bound is reached, the functions return -E2BIG.
>> In addition, we return -ERANGE when the passed strings are outside of
>> the kernel address space.
>>
>> Note that thanks to these dynamic safety checks, no other constraints
>> are put on the kfunc args (they are marked with the "__ign" suffix to
>> skip any verifier checks for them).
>>
>> All of the functions return integers, including functions which normally
>> (in kernel or libc) return pointers to the strings. The reason is that
>> since the strings are generally treated as unsafe, the pointers couldn't
>> be dereferenced anyways. So, instead, we return an index to the string
>> and let user decide what to do with it. This also nicely fits with
>> returning various error codes when necessary (see above).
>>
>> Suggested-by: Alexei Starovoitov <ast@xxxxxxxxxx>
>> Signed-off-by: Viktor Malik <vmalik@xxxxxxxxxx>
>> ---
>>  kernel/bpf/helpers.c | 389 +++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 389 insertions(+)
>>
> 
> few more comments below, beside the -ENOENT issue Alexei mentioned earlier
> 
> [...]
> 
>> +/**
>> + * bpf_strrchr - Find the last occurrence of a character in a string
>> + * @s__ign: The string to be searched
>> + * @c: The character to search for
>> + *
>> + * Return:
>> + * * >=0      - Index of the last occurrence of @c within @s__ign
>> + * * %-1      - @c not found in @s__ign
>> + * * %-EFAULT - Cannot read @s__ign
>> + * * %-E2BIG  - @s__ign is too large
>> + * * %-ERANGE - @s__ign is outside of kernel address space
>> + */
>> +__bpf_kfunc int bpf_strrchr(const char *s__ign, int c)
>> +{
>> +       char sc;
>> +       int i, last = -1;
>> +
>> +       if (!copy_from_kernel_nofault_allowed(s__ign, 1))
>> +               return -ERANGE;
>> +
>> +       guard(pagefault)();
>> +       for (i = 0; i < XATTR_SIZE_MAX; i++) {
>> +               __get_kernel_nofault(&sc, s__ign, char, err_out);
>> +               if (sc == '\0')
>> +                       return last;
>> +               if (sc == c)
>> +                       last = i;
> 
> swap these two ifs, so that bpf_strrchr("blah", 0) will still return a
> meaningful result (effectively become bpf_strlen(), of course)? That
> should match strrchr's behavior in libc, if I'm reading this
> correctly:
> 
> "The terminating NUL character is considered to be part of the string."

Yeah, that's correct, nice catch! I'll add a test for the above case.

> 
> 
>> +               s__ign++;
>> +       }
>> +       return -E2BIG;
>> +err_out:
>> +       return -EFAULT;
>> +}
>> +
> 
> [...]
> 
>> +/**
>> + * bpf_strspn - Calculate the length of the initial substring of @s__ign which
>> + *              only contains letters in @accept__ign
>> + * @s__ign: The string to be searched
>> + * @accept__ign: The string to search for
>> + *
>> + * Return:
>> + * * >=0      - The length of the initial substring of @s__ign which only
>> + *              contains letters from @accept__ign
>> + * * %-EFAULT - Cannot read one of the strings
>> + * * %-E2BIG  - One of the strings is too large
>> + * * %-ERANGE - One of the strings is outside of kernel address space
>> + */
>> +__bpf_kfunc int bpf_strspn(const char *s__ign, const char *accept__ign)
>> +{
>> +       char cs, ca;
>> +       bool found;
>> +       int i, j;
>> +
>> +       if (!copy_from_kernel_nofault_allowed(s__ign, 1) ||
>> +           !copy_from_kernel_nofault_allowed(accept__ign, 1)) {
>> +               return -ERANGE;
>> +       }
>> +
>> +       guard(pagefault)();
>> +       for (i = 0; i < XATTR_SIZE_MAX; i++) {
>> +               __get_kernel_nofault(&cs, s__ign, char, err_out);
>> +               if (cs == '\0')
>> +                       return i;
>> +               found = false;
>> +               for (j = 0; j < XATTR_SIZE_MAX; j++) {
>> +                       __get_kernel_nofault(&ca, accept__ign + j, char, err_out);
>> +                       if (cs == ca) {
>> +                               found = true;
>> +                               break;
>> +                       }
>> +                       if (ca == '\0')
>> +                               break;
>> +               }
>> +               if (!found)
>> +                       return i;
> 
> nit: you shouldn't need "found", just `ca == '\0'` would mean "not
> found", I think?
> 
> so you'd have a succinct `if (cs == ca || ca == '\0') break;` in the
> innermost loop

That sounds about right. We'll just need to cover the case when
`j == XATTR_SIZE_MAX` and return -E2BIG in such a case (which should
have been done anyways here and currently is not).

Also, an analogical thing can be done in bpf_strcspn.

> 
> 
>> +               s__ign++;
>> +       }
>> +       return -E2BIG;
>> +err_out:
>> +       return -EFAULT;
>> +}
>> +
> 
> [...]
> 
>> +/**
>> + * bpf_strnstr - Find the first substring in a length-limited string
>> + * @s1__ign: The string to be searched
>> + * @s2__ign: The string to search for
>> + * @len: the maximum number of characters to search
>> + *
>> + * Return:
>> + * * >=0      - Index of the first character of the first occurrence of @s2__ign
>> + *              within the first @len characters of @s1__ign
>> + * * %-1      - @s2__ign not found in the first @len characters of @s1__ign
>> + * * %-EFAULT - Cannot read one of the strings
>> + * * %-E2BIG  - One of the strings is too large
>> + * * %-ERANGE - One of the strings is outside of kernel address space
>> + */
>> +__bpf_kfunc int bpf_strnstr(const char *s1__ign, const char *s2__ign, size_t len)
>> +{
>> +       char c1, c2;
>> +       int i, j;
>> +
>> +       if (!copy_from_kernel_nofault_allowed(s1__ign, 1) ||
>> +           !copy_from_kernel_nofault_allowed(s2__ign, 1)) {
>> +               return -ERANGE;
>> +       }
>> +
>> +       guard(pagefault)();
>> +       for (i = 0; i < XATTR_SIZE_MAX; i++) {
>> +               for (j = 0; i + j < len && j < XATTR_SIZE_MAX; j++) {
>> +                       __get_kernel_nofault(&c1, s1__ign + j, char, err_out);
> 
> move this after you check `c2 == 0` below? why reading this character
> if we are not going to compare it?

Good point, will move it.

> 
>> +                       __get_kernel_nofault(&c2, s2__ign + j, char, err_out);
>> +                       if (c2 == '\0')
>> +                               return i;
>> +                       if (c1 == '\0')
>> +                               return -1;
>> +                       if (c1 != c2)
>> +                               break;
>> +               }
>> +               if (j == XATTR_SIZE_MAX)
>> +                       return -E2BIG;
>> +               if (i + j == len)
>> +                       return -1;
>> +               s1__ign++;
>> +       }
>> +       return -E2BIG;
>> +err_out:
>> +       return -EFAULT;
>> +}
>> +
> 
> [...]

Thanks for the reviews!

I'll wait a couple more days for potential reviews for the other patches
before sending the next revision.

Viktor





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux