On Fri, Apr 25, 2025 at 2:40 PM Amery Hung <ameryhung@xxxxxxxxx> wrote: > > Task local data provides simple and fast bpf and user space APIs to > exchange per-task data through task local storage map. The user space > first declares task local data using bpf_tld_type_key_var() or > bpf_tld_type_var(). The data is a thread-specific variable which > every thread has its own copy. Then, a bpf_tld_thread_init() needs to > be called for every thread to share the data with bpf. Finally, users > can directly read/write thread local data without bpf syscall. > > For bpf programs to see task local data, every data need to be > initialized once for every new task using bpf_tld_init_var(). Then > bpf programs can retrieve pointers to the data with bpf_tld_lookup(). > > The core of task local storage implementation relies on UPTRs. They > pin user pages to the kernel so that user space can share data with bpf > programs without syscalls. Both data and the metadata used to access > data are pinned via UPTRs. > > A limitation of UPTR makes the implementation of task local data > less trivial than it sounds: memory pinned to UPTR cannot exceed a > page and must not cross the page boundary. In addition, the data > declaration uses __thread identifier and therefore does not have > directly control over the memory allocation. Therefore, several > tricks and checks are used to make it work. > > First, task local data declaration APIs place data in a custom "udata" > section so that data from different compilation units will be contiguous > in the memory and can be pinned using two UPTRs if they are smaller than > one page. > > To avoid each data from spanning across two pages, they are each aligned > to the smallest power of two larget than their sizes. > > As we don't control the memory allocation for data, we need to figure > out the layout of user defined data. This is done by the data > declaration API and bpf_tld_thread_init(). The data declaration API > will insert constructors for all data, and they are used to find the > size and offset of data as well as the beginning and end of the whole > udata section. Then, bpf_tld_thread_init() performs a per-thread check > to make sure no data will cross the page boundary as udata can start at > different offset in a page. > > Note that for umetadata, we directly aligned_alloc() memory for it and > assigned to the UPTR. This is only done once for every process as every > tasks shares the same umetadata. The actual thread-specific data offset > will be adjusted in the bpf program when calling bpf_tld_init_var(). > > Signed-off-by: Amery Hung <ameryhung@xxxxxxxxx> > --- > .../bpf/prog_tests/task_local_data.c | 159 +++++++++++++++ > .../bpf/prog_tests/task_local_data.h | 58 ++++++ > .../selftests/bpf/progs/task_local_data.h | 181 ++++++++++++++++++ > .../selftests/bpf/task_local_data_common.h | 41 ++++ > 4 files changed, 439 insertions(+) > create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.c > create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.h > create mode 100644 tools/testing/selftests/bpf/progs/task_local_data.h > create mode 100644 tools/testing/selftests/bpf/task_local_data_common.h > [...] > +/** > + * @brief bpf_tld_key_type_var() declares a task local data shared with bpf > + * programs. The data will be a thread-specific variable which a user space > + * program can directly read/write, while bpf programs will need to lookup > + * with the string key. > + * > + * @param key The string key a task local data will be associated with. The > + * string will be truncated if the length exceeds TASK_LOCAL_DATA_KEY_LEN > + * @param type The type of the task local data > + * @param var The name of the task local data > + */ > +#define bpf_tld_key_type_var(key, type, var) \ > +__thread type var SEC("udata") __aligned(ROUND_UP_POWER_OF_TWO(sizeof(type))); \ > + \ > +__attribute__((constructor)) \ > +void __bpf_tld_##var##_init(void) \ > +{ \ > + _Static_assert(sizeof(type) < PAGE_SIZE, \ > + "data size must not exceed a page"); \ > + __bpf_tld_var_init(key, &var, sizeof(type)); \ > +} > + > +/** > + * @brief bpf_tld_key_type_var() declares a task local data shared with bpf declares -> defines, declaration would involve extern and no storage allocated (it's only to reference something *defined* in another compilation unit) > + * programs. The data will be a thread-specific variable which a user space > + * program can directly read/write, while bpf programs will need to lookup > + * the data with the string key same as the variable name. > + * > + * @param type The type of the task local data > + * @param var The name of the task local data as well as the name of the > + * key. The key string will be truncated if the length exceeds > + * TASK_LOCAL_DATA_KEY_LEN. > + */ > +#define bpf_tld_type_var(type, var) \ > + bpf_tld_key_type_var(#var, type, var) > + > +/** > + * @brief bpf_tld_thread_init() initializes the task local data for the current > + * thread. All data are undefined from a bpf program's point of view until > + * bpf_tld_thread_init() is called. > + * > + * @return 0 on success; negative error number on failure > + */ > +int bpf_tld_thread_init(void); > + > +#endif > diff --git a/tools/testing/selftests/bpf/progs/task_local_data.h b/tools/testing/selftests/bpf/progs/task_local_data.h > new file mode 100644 > index 000000000000..7358993ee634 > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/task_local_data.h given this is BPF-only header, I'd make it more explicit by using .bpf.h suffix, just like libbpf's usdt.bpf.h [...] > +/** > + * @brief bpf_tld_lookup() lookups the task KV store using the cached offset > + * corresponding to the key. > + * > + * @param tld A pointer to a valid bpf_task_local_data object initialized by bpf_tld_init() > + * @param key The key used to lookup the task KV store. Should be one of the > + * symbols defined in struct task_local_data_offsets, not a string > + * @param size The size of the value. Must be a known constant value > + * @return A pointer to the value corresponding to the key; NULL if the offset > + * if not cached or the size is too big > + */ > +#define bpf_tld_lookup(tld, key, size) __bpf_tld_lookup(tld, (tld)->off_map->key, size) > +__attribute__((unused)) > +static void *__bpf_tld_lookup(struct bpf_task_local_data *tld, short cached_off, int size) I'd probably use some opaque type for these offsets to discourage user application to try to create them, instead of strictly using offsets returned from API. So: typedef struct { int off; } tld_off_t; and then just use tld_off_t everywhere? > +{ > + short page_off, page_idx; > + > + if (!cached_off--) > + return NULL; > + > + page_off = cached_off & ~PAGE_IDX_MASK; > + page_idx = !!(cached_off & PAGE_IDX_MASK); > + > + if (page_idx) { > + return (tld->data_map->udata[1].page && page_off < PAGE_SIZE - size) ? > + (void *)tld->data_map->udata[1].page + page_off : NULL; > + } else { > + return (tld->data_map->udata[0].page && page_off < PAGE_SIZE - size) ? > + (void *)tld->data_map->udata[0].page + page_off : NULL; > + } > +} [...]