* Overview *

Task local data defines an abstract storage type for storing data
specific to each task and provides user space and bpf libraries to
access it. The result is a fast and easy way to share per-task data
between user space and bpf programs. The intended use case is sched_ext,
where user space programs will pass hints to sched_ext bpf programs to
affect task scheduling.

Task local data is built on top of the task local storage map and
UPTR[0] to achieve fast per-task data sharing. UPTR is a special field
type supported in task local storage map values. A user page assigned to
a UPTR is pinned by the kernel when the map is updated. Therefore, user
space programs can update data seen by bpf programs without syscalls.

Additionally, unlike most bpf maps, task local data does not require a
static map value definition. This design is driven by sched_ext, which
would like to allow multiple developers to share a storage without
having to explicitly agree on its layout. While a centralized layout
definition would have worked, the friction of synchronizing it across
different repos is not desirable. Dropping the static definition
simplifies code base management and makes experimenting easier.

In the rest of the cover letter, "task local data" is used to refer to
the abstract storage and TLD is used to denote a single data entry in
the storage.

* Design *

The task local data library provides simple APIs for user space and bpf
through two header files, task_local_data.h and task_local_data.bpf.h,
respectively. The usage is illustrated in the following diagram.

An entry of data in the task local data, a TLD, first needs to be
created in user space by calling tld_create_key() with the size of the
data and a name associated with the data. The function returns an opaque
key object of type tld_key_t, which can be used to locate a TLD. The
same key may be passed to tld_get_data() in different threads, and a
pointer to data specific to the calling thread will be returned.
The pointer will remain valid until the process terminates, so there is
no need to call tld_get_data() again for subsequent accesses.

On the bpf side, programs also use keys to locate TLDs. For every new
task, a bpf program must first fetch the keys and save them for later
use. This is done by calling tld_fetch_key() with the names specified in
tld_create_key(). The keys are saved in a task local storage map,
tld_key_map. The map value type, struct tld_keys, __must__ be defined by
developers. It should contain the keys used in the compilation unit.
Finally, bpf programs can call tld_get_data() to get a pointer to a TLD
that is shared with user space.

┌─ Application ───────────────────────────────────────────────────────┐
│ tld_key_t kx = tld_create_key(fd, "X", sizeof(int));                │
│ ...                           ┌─ library A ────────────────────────┐│
│ int *x = tld_get_data(fd, kx);│ ky = tld_create_key(fd, "Y",       ││
│ if (x) *x = 123;              │                     sizeof(bool)); ││
│                               │ bool *y = tld_get_data(ky);        ││
│                         ┌─────┤ if (y) *y = true;                  ││
│                         │     └────────────────────────────────────┘│
└───────┬─────────────────│───────────────────────────────────────────┘
        V                 V
+ ─ Task local data ─ ─ ─ ─ ─ +  ┌─ sched_ext_ops::init_task ────────┐
| ┌─ tld_data_map ──────────┐ |  │ tld_init_object(task, &tld_obj);  │
| │ BPF Task local storage  │ |  │ tld_fetch_key(&tld_obj, "X", kx); │
| │                         │ |<─┤ tld_fetch_key(&tld_obj, "Y", ky); │
| │ data_page __uptr *data  │ |  └───────────────────────────────────┘
| │ metadata_page __uptr *metadata
| └─────────────────────────┘ |  ┌─ Other sched_ext_ops op ──────────┐
| ┌─ tld_key_map ───────────┐ |  │ tld_init_object(task, &tld_obj);  │
| │ BPF Task local storage  │ |  │ bool *y = tld_get_data(&tld_obj,  ├┐
| │                         │ |<─┤                        ky, 1);    ││
| │ tld_key_t kx;           │ |  │ if (y)                            ││
| │ tld_key_t ky;           │ |  │     /* do something */            ││
| └─────────────────────────┘ |  └┬──────────────────────────────────┘│
+ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ +   └───────────────────────────────────┘

* Implementation *

Task local data defines the storage to be a task
local storage map with two UPTRs, data and metadata. Data points to a
blob of memory for storing TLDs individual to every task. Metadata,
individual to each process and shared by its threads, records the number
of TLDs declared and the metadata of each TLD. The metadata for a TLD
contains the key name and the size of the TLD in data.

    struct data_page {
            char data[PAGE_SIZE];
    };

    struct metadata_page {
            u8 cnt;
            struct metadata data[TLD_DATA_CNT];
    };

Both the user space and bpf APIs follow the same protocol when accessing
task local data. A pointer to a TLD is located by a key. A tld_key_t is
effectively the offset of a TLD in data. To add a TLD, the user space
API, tld_create_key(), loops through metadata->data until an empty slot
is found and updates it. It also adds up the sizes of prior TLDs along
the way to derive the offset. To fetch a key, the bpf API,
tld_fetch_key(), likewise loops through metadata->data until the key
name is found, deriving the offset by adding up sizes in the same way.

The details of task local data operations can be found in patch 1.
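
The offset protocol described above can be sketched in plain user space
C as follows. This is only an illustration under stated assumptions, not
the code in patch 1: every sketch_* name, the SKETCH_* limits, the layout
of the per-TLD metadata entry and the rejection of duplicate names are
hypothetical, and the sketch ignores alignment and concurrent key
creation entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limits, chosen only for this sketch. */
#define SKETCH_PAGE_SIZE 4096
#define SKETCH_DATA_CNT  63
#define SKETCH_NAME_LEN  62

/* Assumed per-TLD metadata: the key name and the size of the TLD. */
struct sketch_metadata {
	char name[SKETCH_NAME_LEN];
	uint16_t size;
};

struct sketch_metadata_page {
	uint8_t cnt;	/* number of TLDs declared so far */
	struct sketch_metadata data[SKETCH_DATA_CNT];
};

struct sketch_data_page {
	char data[SKETCH_PAGE_SIZE];
};

/*
 * Walk the used slots, summing the sizes of prior TLDs, then claim the
 * first empty slot. Returns the offset of the new TLD, or -1 on error.
 */
static int sketch_create_key(struct sketch_metadata_page *meta,
			     const char *name, size_t size)
{
	size_t off = 0;
	int i;

	if (size == 0 || size > SKETCH_PAGE_SIZE ||
	    strlen(name) >= SKETCH_NAME_LEN)
		return -1;

	for (i = 0; i < meta->cnt; i++) {
		/* Assumption: a name may only be registered once. */
		if (strcmp(meta->data[i].name, name) == 0)
			return -1;
		off += meta->data[i].size;
	}

	if (i == SKETCH_DATA_CNT || off + size > SKETCH_PAGE_SIZE)
		return -1;

	strcpy(meta->data[i].name, name);
	meta->data[i].size = (uint16_t)size;
	meta->cnt++;
	return (int)off;
}

/*
 * Walk the same array until the name matches; the sum of the sizes seen
 * before the match is the offset. Returns -1 if the name is not found.
 */
static int sketch_fetch_key(const struct sketch_metadata_page *meta,
			    const char *name)
{
	size_t off = 0;
	int i;

	for (i = 0; i < meta->cnt; i++) {
		if (strcmp(meta->data[i].name, name) == 0)
			return (int)off;
		off += meta->data[i].size;
	}
	return -1;
}

/* A key is just an offset into the per-task data blob. */
static void *sketch_get_data(struct sketch_data_page *page, int key)
{
	if (key < 0 || key >= SKETCH_PAGE_SIZE)
		return NULL;
	return page->data + key;
}
```

With a zeroed sketch_metadata_page, creating "X" with sizeof(int) and
then "Y" with sizeof(char) yields offsets 0 and sizeof(int), and fetching
either name later yields the same offset. That is why both sides can
agree on a TLD's location without sharing a static layout.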

[0] https://lore.kernel.org/bpf/20241023234759.860539-1-martin.lau@xxxxxxxxx/

v3 -> v4
  - API improvements
    - Simplify API
    - Drop string obfuscation
    - Use opaque type for key
    - Better documentation
  - Implementation
    - Switch to dynamic allocation for per-task data
    - Now offer as header-only libraries
    - No TLS map pinning; leave it to users
    - Drop pthread dependency
  - Add more invalid tld_create_key() test
  - Add a race test for tld_create_key()

v3: https://lore.kernel.org/bpf/20250425214039.2919818-1-ameryhung@xxxxxxxxx/

Amery Hung (3):
  selftests/bpf: Introduce task local data
  selftests/bpf: Test basic task local data operations
  selftests/bpf: Test concurrent task local data key creation

 .../bpf/prog_tests/task_local_data.h          | 263 ++++++++++++++++++
 .../bpf/prog_tests/test_task_local_data.c     | 254 +++++++++++++++++
 .../selftests/bpf/progs/task_local_data.bpf.h | 220 +++++++++++++++
 .../bpf/progs/test_task_local_data.c          |  81 ++++++
 4 files changed, 818 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
 create mode 100644 tools/testing/selftests/bpf/progs/task_local_data.bpf.h
 create mode 100644 tools/testing/selftests/bpf/progs/test_task_local_data.c

--
2.47.1