Hi Reinette,

On 3/21/25 18:00, Reinette Chatre wrote:
> Hi Babu,
>
> On 1/30/25 1:20 PM, Babu Moger wrote:
>> "io_alloc" feature is a mechanism that enables direct insertion of data
>> from I/O devices into the L3 cache. By directly caching data from I/O
>> devices rather than first storing the I/O data in DRAM, it reduces the
>> demands on DRAM bandwidth and reduces latency to the processor consuming
>> the I/O data.
>>
>> io_alloc feature uses the highest CLOSID to route the traffic from I/O
>> devices. Provide the interface to modify io_alloc CBMs (Capacity Bit Mask)
>> when feature is enabled.
>>
>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>> ---
>> v3: Minor changes due to changes in resctrl_arch_get_io_alloc_enabled()
>>     and resctrl_io_alloc_closid_get().
>>     Taken care of handling the CBM update when CDP is enabled.
>>     Updated the commit log to make it generic.
>>
>> v2: Added more generic text in documentation.
>> ---
>>  Documentation/arch/x86/resctrl.rst        |  12 ++
>>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c |   2 +-
>>  arch/x86/kernel/cpu/resctrl/internal.h    |   1 +
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 134 +++++++++++++++++++++-
>>  4 files changed, 147 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index 1b67e31d626c..29c8851bcc7f 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -169,6 +169,18 @@ related to allocation:
>>  		When CDP is enabled, io_alloc routes I/O traffic using the highest
>>  		CLOSID allocated for the instruction cache.
>>
>> +"io_alloc_cbm":
>> +		Capacity Bit Masks (CBMs) available to supported IO devices which
>> +		can directly insert cache lines in L3 which can help to reduce the
>> +		latency. CBM can be configured by writing to the interface in the
>> +		following format::
>> +
>> +			L3:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...
>
> This format is dependent on the resource name (not always L3).

Yes. Will remove "L3:"

>
>> +
>> +		When CDP is enabled, L3 control is divided into two separate resources:
>> +		L3CODE and L3DATA. However, the CBM can only be updated on the L3CODE
>> +		resource.
>> +
>>  Memory bandwidth(MB) subdirectory contains the following files
>>  with respect to allocation:
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> index d272dea43924..4dfee0436c1c 100644
>> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> @@ -102,7 +102,7 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
>>   * requires at least two bits set.
>>   * AMD allows non-contiguous bitmasks.
>>   */
>> -static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> +bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>>  {
>>  	unsigned long first_bit, zero_bit, val;
>>  	unsigned int cbm_len = r->cache.cbm_len;
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 07cf8409174d..702f6926bbdf 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -669,4 +669,5 @@ void rdt_staged_configs_clear(void);
>>  bool closid_allocated(unsigned int closid);
>>  int resctrl_find_cleanest_closid(void);
>>  void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid);
>> +bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r);
>>  #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 81b9d8c5dabf..9997cbfc1c19 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1999,6 +1999,137 @@ static int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of,
>>  	return ret;
>>  }
>>
>> +/*
>> + * Read the CBM and check the validity. Make sure CBM is not shared
>> + * with any other exclusive resctrl groups.
>> + */
>> +static int resctrl_io_alloc_parse_cbm(char *buf, struct resctrl_schema *s,
>> +				      struct rdt_ctrl_domain *d)
>> +{
>> +	struct resctrl_staged_config *cfg;
>> +	struct rdt_resource *r = s->res;
>> +	u32 io_alloc_closid;
>> +	u32 cbm_val;
>> +
>> +	cfg = &d->staged_config[s->conf_type];
>> +	if (cfg->have_new_ctrl) {
>> +		rdt_last_cmd_printf("Duplicate domain %d\n", d->hdr.id);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (!cbm_validate(buf, &cbm_val, r))
>> +		return -EINVAL;
>> +
>> +	/*
>> +	 * The CBM may not overlap with other exclusive group.
>> +	 */
>> +	io_alloc_closid = resctrl_io_alloc_closid_get(r, s);
>> +	if (rdtgroup_cbm_overlaps(s, d, cbm_val, io_alloc_closid, true)) {
>> +		rdt_last_cmd_puts("Overlaps with exclusive group\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	cfg->new_ctrl = cbm_val;
>> +	cfg->have_new_ctrl = true;
>> +
>> +	return 0;
>> +}
>
> Could you please reduce amount of duplication with parse_cbm()?

parse_cbm() needs rdtgrp to read 'mode' and 'closid' which is passed in
rdt_parse_data. We can call parse_cbm directly if we add 'mode' and closid
in rdt_parse_data. Will add those changes in next revision.

>
> (for rest of patch, please check that related comments from previous patches
> are addressed here also)

Sure. Will do.

>
> Reinette
>

--
Thanks
Babu Moger
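
For readers following the thread, here is a minimal user-space sketch of the
parsing flow discussed above: split a "<cache_id>=<cbm>;..." line (without the
"L3:" prefix), reject duplicate domains, validate each mask, and stage the
values before committing them. It is not the kernel's parse_cbm() or
cbm_validate(). All sketch_* names, SKETCH_MAX_DOMAINS, and the allow_sparse
flag are hypothetical, and the check against exclusive resctrl groups is left
out because it needs kernel state.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_DOMAINS 16	/* hypothetical limit for this sketch */

/*
 * Hypothetical validity check: the mask must be non-zero, fit in cbm_len
 * bits and, unless sparse masks are allowed, form one contiguous run.
 */
static bool sketch_cbm_valid(uint32_t cbm, unsigned int cbm_len, bool allow_sparse)
{
	uint32_t limit = (cbm_len >= 32) ? UINT32_MAX : ((UINT32_C(1) << cbm_len) - 1);
	uint32_t lowest;

	if (cbm == 0 || (cbm & ~limit))
		return false;
	if (allow_sparse)
		return true;

	/* Adding the lowest set bit clears exactly one contiguous run. */
	lowest = cbm & (~cbm + 1);
	return ((cbm + lowest) & cbm) == 0;
}

/*
 * Parse "<id>=<hex cbm>;<id>=<hex cbm>..." into cbm_by_domain[].
 * Values are staged first so a bad entry leaves the output untouched.
 * Returns 0 on success, -EINVAL on any malformed or invalid entry.
 */
static int sketch_parse_io_alloc_cbm(const char *line, unsigned int cbm_len,
				     bool allow_sparse, uint32_t *cbm_by_domain,
				     size_t ndomains)
{
	uint32_t staged[SKETCH_MAX_DOMAINS];
	bool have_new[SKETCH_MAX_DOMAINS] = { false };
	const char *p = line;
	char *end;
	size_t i;

	if (ndomains > SKETCH_MAX_DOMAINS)
		return -EINVAL;

	for (;;) {
		unsigned long id, val;

		id = strtoul(p, &end, 10);
		if (end == p || *end != '=' || id >= ndomains)
			return -EINVAL;
		if (have_new[id])
			return -EINVAL;	/* duplicate domain */

		p = end + 1;
		val = strtoul(p, &end, 16);
		if (end == p || val > UINT32_MAX)
			return -EINVAL;
		if (!sketch_cbm_valid((uint32_t)val, cbm_len, allow_sparse))
			return -EINVAL;

		staged[id] = (uint32_t)val;
		have_new[id] = true;

		if (*end == ';') {
			p = end + 1;
			continue;
		}
		if (*end == '\0' || *end == '\n')
			break;
		return -EINVAL;
	}

	/* Commit only the domains that were named in the input. */
	for (i = 0; i < ndomains; i++)
		if (have_new[i])
			cbm_by_domain[i] = staged[i];
	return 0;
}
```

Calling sketch_parse_io_alloc_cbm("0=ff;1=f0", 16, false, out, 2) fills out[0]
and out[1]; a repeated domain id or a mask wider than cbm_len returns -EINVAL
and leaves out[] unchanged.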