Hi Babu, On 6/13/25 2:05 PM, Babu Moger wrote: > Resctrl subsystem can support two monitoring modes, "mbm_event" or > "default". In mbm_event mode, monitoring event can only accumulate data > while it is backed by a hardware counter. In "default" mode, resctrl > assumes there is a hardware counter for each event within every CTRL_MON > and MON group. > > Introduce interface to switch between mbm_event and default modes. "Introduce interface" -> "Introduce mbm_assign_mode resctrl file" > > Example: > To list the MBM monitor modes supported: > $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > [mbm_event] > default > > To enable the "mbm_event" monitoring mode: > $ echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > > To enable the "default" monitoring mode: > $ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > > MBM event counters are automatically reset as part of changing the mode. > Clear both architectural and non-architectural event states to prevent > overflow conditions during the next event read. Also clear assignable > counter configuration on all the domains. > > Signed-off-by: Babu Moger <babu.moger@xxxxxxx> > --- ... > --- > Documentation/filesystems/resctrl.rst | 23 +++++++- > fs/resctrl/internal.h | 2 + > fs/resctrl/monitor.c | 27 ++++++++++ > fs/resctrl/rdtgroup.c | 78 +++++++++++++++++++++++++-- > 4 files changed, 126 insertions(+), 4 deletions(-) > > diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst > index cd82c2966ed7..7e62c7fdcefa 100644 > --- a/Documentation/filesystems/resctrl.rst > +++ b/Documentation/filesystems/resctrl.rst > @@ -259,7 +259,9 @@ with the following files: > > "mbm_assign_mode": > The supported monitoring modes. The enclosed brackets indicate which mode > - is enabled. > + is enabled. The MBM events (mbm_total_bytes and/or mbm_local_bytes) associated Since there may be more events in future I think the "(mbm_total_bytes and/or mbm_local_bytes)" can be dropped. > + with counters may reset when "mbm_assign_mode" is changed. > + > :: > > # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > @@ -279,6 +281,15 @@ with the following files: > of counters available is described in the "num_mbm_cntrs" file. Changing the > mode may cause all counters on the resource to reset. > > + Moving to mbm_event mode require users to assign the counters to the events. "Moving to mbm_event mode require" -> "mbm_event counter assignment mode requires" > + Otherwise, the MBM event counters will return 'Unassigned' when read. > + > + The mode is beneficial for AMD platforms that support more CTRL_MON > + and MON groups than available hardware counters. By default, this > + feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth > + Monitoring Counters) capability, ensuring counters remain assigned even > + when the corresponding RMID is not actively used by any processor. > + > "default": > > In default mode, resctrl assumes there is a hardware counter for each > @@ -288,6 +299,16 @@ with the following files: > result in misleading values or display "Unavailable" if no counter is assigned > to the event. > > + * To enable "mbm_event" monitoring mode: > + :: > + > + # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > + > + * To enable "default" monitoring mode: > + :: > + > + # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > + > "num_mbm_cntrs": > The maximum number of counter IDs (total of available and assigned counters) > in each domain when the system supports mbm_event mode. > diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h > index 14d99c723ea5..adc9ff3efdfd 100644 > --- a/fs/resctrl/internal.h > +++ b/fs/resctrl/internal.h > @@ -414,6 +414,8 @@ void resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain * > struct rdtgroup *rdtgrp, struct mon_evt *mevt); > int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d, > struct rdtgroup *rdtgrp, enum resctrl_event_id evtid); > +void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d); > +void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d); > > #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK > int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp); > diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c > index 618c94cd1ad8..504b869570e6 100644 > --- a/fs/resctrl/monitor.c > +++ b/fs/resctrl/monitor.c > @@ -1045,6 +1045,33 @@ static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id) > memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg)); > } > > +/** > + * mbm_cntr_free_all() - Clear all the counter ID configuration details in the > + * domain @d. Called when mbm_assign_mode is changed. > + */ > +void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d) > +{ > + memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs); > +} > + > +/** > + * resctrl_reset_rmid_all() - Reset all non-architecture states for all the > + * supported RMIDs. > + */ > +void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d) struct rdt_resource *r is unused? At this time it seems unnecessary to check if an MBM event belongs to particular resource since at this point I expect only L3 is possible. Even so, to be consistent and robust I think it would make flows easier to understand by always ensureing that mon_evt::rid matches expected resource. > +{ > + u32 idx_limit = resctrl_arch_system_num_rmid_idx(); > + enum resctrl_event_id evt; > + int idx; > + > + for_each_mbm_event_id(evt) { > + if (!resctrl_is_mon_event_enabled(evt)) > + continue; > + idx = MBM_STATE_IDX(evt); > + memset(d->mbm_states[idx], 0, sizeof(struct mbm_state) * idx_limit); sizeof(*d->mbm_states[0])? > + } > +} > + > /** > * resctrl_alloc_config_cntr() - Allocate a counter ID and configure it for the > * event pointed to by @mevt and the resctrl group @rdtgrp within the domain @d. > diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c > index 8c67e0897f25..6bb61fcf8673 100644 > --- a/fs/resctrl/rdtgroup.c > +++ b/fs/resctrl/rdtgroup.c > @@ -1876,6 +1876,77 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of, > return 0; > } > > +static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, > + char *buf, size_t nbytes, loff_t off) > +{ > + struct rdt_resource *r = rdt_kn_parent_priv(of->kn); > + struct rdt_mon_domain *d; > + int ret = 0; > + bool enable; > + > + /* Valid input requires a trailing newline */ > + if (nbytes == 0 || buf[nbytes - 1] != '\n') > + return -EINVAL; > + > + buf[nbytes - 1] = '\0'; > + > + cpus_read_lock(); > + mutex_lock(&rdtgroup_mutex); > + > + rdt_last_cmd_clear(); > + > + if (!strcmp(buf, "default")) { > + enable = 0; > + } else if (!strcmp(buf, "mbm_event")) { > + if (r->mon.mbm_cntr_assignable) { > + enable = 1; > + } else { > + ret = -EINVAL; > + rdt_last_cmd_puts("mbm_event mode is not supported\n"); > + goto write_exit; write_exit -> out_unlock > + } > + } else { > + ret = -EINVAL; > + rdt_last_cmd_puts("Unsupported assign mode\n"); > + goto write_exit; > + } > + > + if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) { > + ret = resctrl_arch_mbm_cntr_assign_set(r, enable); > + if (ret) > + goto write_exit; > + > + /* Update the visibility of BMEC related files */ > + resctrl_bmec_files_show(r, !enable); > + > + /* > + * Initialize the default memory transaction values for > + * total and local events. > + */ > + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) > + resctrl_set_mon_evt_cfg(QOS_L3_MBM_TOTAL_EVENT_ID, > + MAX_EVT_CONFIG_BITS); > + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) > + resctrl_set_mon_evt_cfg(QOS_L3_MBM_LOCAL_EVENT_ID, > + READS_TO_LOCAL_MEM | > + READS_TO_LOCAL_S_MEM | > + NON_TEMP_WRITE_TO_LOCAL_MEM); Nice, yes, this belongs in resctrl fs. > + /* > + * Reset all the non-achitectural RMID state and assignable counters. > + */ > + list_for_each_entry(d, &r->mon_domains, hdr.list) { > + mbm_cntr_free_all(r, d); > + resctrl_reset_rmid_all(r, d); > + } > + } > + > +write_exit: > + mutex_unlock(&rdtgroup_mutex); > + cpus_read_unlock(); > + > + return ret ?: nbytes; > +} > + > static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of, > struct seq_file *s, void *v) > { > @@ -2203,8 +2274,8 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr > struct mon_evt *mevt; > int assign_state; > char domain[10]; > + int ret = 0; > bool found; > - int ret; > > mevt = mbm_get_mon_event_by_name(r, event); > if (!mevt) { > @@ -2249,7 +2320,7 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr > > switch (assign_state) { > case ASSIGN_NONE: > - ret = resctrl_unassign_cntr_event(r, d, rdtgrp, mevt); > + resctrl_unassign_cntr_event(r, d, rdtgrp, mevt); > break; > case ASSIGN_EXCLUSIVE: > ret = resctrl_assign_cntr_event(r, d, rdtgrp, mevt); Two stray hunks? > @@ -2463,9 +2534,10 @@ static struct rftype res_common_files[] = { > }, > { > .name = "mbm_assign_mode", > - .mode = 0444, > + .mode = 0644, > .kf_ops = &rdtgroup_kf_single_ops, > .seq_show = resctrl_mbm_assign_mode_show, > + .write = resctrl_mbm_assign_mode_write, > .fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE, > }, > { Reinette