On 6/23/25 23:29, Andrii Nakryiko wrote: > On Mon, Jun 23, 2025 at 6:48 AM Viktor Malik <vmalik@xxxxxxxxxx> wrote: >> >> String operations are commonly used so this exposes the most common ones >> to BPF programs. For now, we limit ourselves to operations which do not >> copy memory around. >> >> Unfortunately, most in-kernel implementations assume that strings are >> %NUL-terminated, which is not necessarily true, and therefore we cannot >> use them directly in the BPF context. Instead, we open-code them using >> __get_kernel_nofault instead of plain dereference to make them safe and >> limit the strings length to XATTR_SIZE_MAX to make sure the functions >> terminate. When __get_kernel_nofault fails, functions return -EFAULT. >> Similarly, when the size bound is reached, the functions return -E2BIG. >> In addition, we return -ERANGE when the passed strings are outside of >> the kernel address space. >> >> Note that thanks to these dynamic safety checks, no other constraints >> are put on the kfunc args (they are marked with the "__ign" suffix to >> skip any verifier checks for them). >> >> All of the functions return integers, including functions which normally >> (in kernel or libc) return pointers to the strings. The reason is that >> since the strings are generally treated as unsafe, the pointers couldn't >> be dereferenced anyways. So, instead, we return an index to the string >> and let user decide what to do with it. This also nicely fits with >> returning various error codes when necessary (see above). >> >> Suggested-by: Alexei Starovoitov <ast@xxxxxxxxxx> >> Signed-off-by: Viktor Malik <vmalik@xxxxxxxxxx> >> --- >> kernel/bpf/helpers.c | 389 +++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 389 insertions(+) >> > > few more comments below, beside the -ENOENT issue Alexei mentioned earlier > > [...] > >> +/** >> + * bpf_strrchr - Find the last occurrence of a character in a string >> + * @s__ign: The string to be searched >> + * @c: The character to search for >> + * >> + * Return: >> + * * >=0 - Index of the last occurrence of @c within @s__ign >> + * * %-1 - @c not found in @s__ign >> + * * %-EFAULT - Cannot read @s__ign >> + * * %-E2BIG - @s__ign is too large >> + * * %-ERANGE - @s__ign is outside of kernel address space >> + */ >> +__bpf_kfunc int bpf_strrchr(const char *s__ign, int c) >> +{ >> + char sc; >> + int i, last = -1; >> + >> + if (!copy_from_kernel_nofault_allowed(s__ign, 1)) >> + return -ERANGE; >> + >> + guard(pagefault)(); >> + for (i = 0; i < XATTR_SIZE_MAX; i++) { >> + __get_kernel_nofault(&sc, s__ign, char, err_out); >> + if (sc == '\0') >> + return last; >> + if (sc == c) >> + last = i; > > swap these two ifs, so that bpf_strrchr("blah", 0) will still return a > meaningful result (effectively become bpf_strlen(), of course)? That > should match strrchr's behavior in libc, if I'm reading this > correctly: > > "The terminating NUL character is considered to be part of the string." Yeah, that's correct, nice catch! I'll add a test for the above case. > > >> + s__ign++; >> + } >> + return -E2BIG; >> +err_out: >> + return -EFAULT; >> +} >> + > > [...] > >> +/** >> + * bpf_strspn - Calculate the length of the initial substring of @s__ign which >> + * only contains letters in @accept__ign >> + * @s__ign: The string to be searched >> + * @accept__ign: The string to search for >> + * >> + * Return: >> + * * >=0 - The length of the initial substring of @s__ign which only >> + * contains letters from @accept__ign >> + * * %-EFAULT - Cannot read one of the strings >> + * * %-E2BIG - One of the strings is too large >> + * * %-ERANGE - One of the strings is outside of kernel address space >> + */ >> +__bpf_kfunc int bpf_strspn(const char *s__ign, const char *accept__ign) >> +{ >> + char cs, ca; >> + bool found; >> + int i, j; >> + >> + if (!copy_from_kernel_nofault_allowed(s__ign, 1) || >> + !copy_from_kernel_nofault_allowed(accept__ign, 1)) { >> + return -ERANGE; >> + } >> + >> + guard(pagefault)(); >> + for (i = 0; i < XATTR_SIZE_MAX; i++) { >> + __get_kernel_nofault(&cs, s__ign, char, err_out); >> + if (cs == '\0') >> + return i; >> + found = false; >> + for (j = 0; j < XATTR_SIZE_MAX; j++) { >> + __get_kernel_nofault(&ca, accept__ign + j, char, err_out); >> + if (cs == ca) { >> + found = true; >> + break; >> + } >> + if (ca == '\0') >> + break; >> + } >> + if (!found) >> + return i; > > nit: you shouldn't need "found", just `ca == '\0'` would mean "not > found", I think? > > so you'd have a succinct `if (cs == ca || ca == '\0') break;` in the > innermost loop That sounds about right. We'll just need to cover the case when `j == XATTR_SIZE_MAX` and return -E2BIG in such a case (which should have been done anyways here and currently is not). Also, an analogical thing can be done in bpf_strcspn. > > >> + s__ign++; >> + } >> + return -E2BIG; >> +err_out: >> + return -EFAULT; >> +} >> + > > [...] > >> +/** >> + * bpf_strnstr - Find the first substring in a length-limited string >> + * @s1__ign: The string to be searched >> + * @s2__ign: The string to search for >> + * @len: the maximum number of characters to search >> + * >> + * Return: >> + * * >=0 - Index of the first character of the first occurrence of @s2__ign >> + * within the first @len characters of @s1__ign >> + * * %-1 - @s2__ign not found in the first @len characters of @s1__ign >> + * * %-EFAULT - Cannot read one of the strings >> + * * %-E2BIG - One of the strings is too large >> + * * %-ERANGE - One of the strings is outside of kernel address space >> + */ >> +__bpf_kfunc int bpf_strnstr(const char *s1__ign, const char *s2__ign, size_t len) >> +{ >> + char c1, c2; >> + int i, j; >> + >> + if (!copy_from_kernel_nofault_allowed(s1__ign, 1) || >> + !copy_from_kernel_nofault_allowed(s2__ign, 1)) { >> + return -ERANGE; >> + } >> + >> + guard(pagefault)(); >> + for (i = 0; i < XATTR_SIZE_MAX; i++) { >> + for (j = 0; i + j < len && j < XATTR_SIZE_MAX; j++) { >> + __get_kernel_nofault(&c1, s1__ign + j, char, err_out); > > move this after you check `c2 == 0` below? why reading this character > if we are not going to compare it? Good point, will move it. > >> + __get_kernel_nofault(&c2, s2__ign + j, char, err_out); >> + if (c2 == '\0') >> + return i; >> + if (c1 == '\0') >> + return -1; >> + if (c1 != c2) >> + break; >> + } >> + if (j == XATTR_SIZE_MAX) >> + return -E2BIG; >> + if (i + j == len) >> + return -1; >> + s1__ign++; >> + } >> + return -E2BIG; >> +err_out: >> + return -EFAULT; >> +} >> + > > [...] Thanks for the reviews! I'll wait a couple more days for potential reviews for the other patches before sending the next revision. Viktor