The resctrl subsystem can support two monitoring modes, "mbm_cntr_assign"
or "default". In mbm_cntr_assign mode, a monitoring event can only
accumulate data while it is backed by a hardware counter. In "default"
mode, resctrl assumes there is a hardware counter for each event within
every CTRL_MON and MON group.

Introduce an interface to switch between mbm_cntr_assign and default modes.

$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_cntr_assign]
default

To enable the "mbm_cntr_assign" monitoring mode:
$ echo "mbm_cntr_assign" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode

To enable the "default" monitoring mode:
$ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode

MBM event counters are automatically reset as part of changing the mode.
Clear both architectural and non-architectural event states to prevent
overflow conditions during the next event read.

Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
---
v12: Fixed the documentation for consistency.
     Introduced mbm_cntr_free_all() and resctrl_reset_rmid_all() to clear
     counters and non-architectural states when monitor mode is changed.
     https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@xxxxxxxxx/

v11: Changed the name of the function rdtgroup_mbm_assign_mode_write() to
     resctrl_mbm_assign_mode_write().
     Rewrote the commit message with context.
     Added a few more details in resctrl.rst about mbm_cntr_assign mode.
     Re-arranged the text in resctrl.rst file.

v10: The call mbm_cntr_reset() has been moved to an earlier patch.
     Minor documentation update.

v9: Fixed extra spaces in user documentation.
    Fixed problem changing the mode to mbm_cntr_assign mode when it is
    not supported. Added extra checks to detect if the system supports it.
    Used the rdtgroup_cntr_id_init to initialize cntr_id.

v8: Reset the internal counters after mbm_cntr_assign mode is changed.
    Renamed rdtgroup_mbm_cntr_reset() to mbm_cntr_reset().
    Updated the documentation to make text generic.

v7: Changed the interface name to mbm_assign_mode.
    Removed the references of ABMC.
    Added the changes to reset global and domain bitmaps.
    Added the changes to reset rmid.

v6: Changed the mode name to mbm_cntr_assign.
    Moved all the FS related code here.
    Added changes to reset mbm_cntr_map and resctrl group counters.

v5: Change log and mode description text correction.

v4: Minor commit text changes. Keep the default to ABMC when supported.
    Fixed comments to reflect changed interface "mbm_mode".

v3: New patch to address the review comments from upstream.
---
 Documentation/arch/x86/resctrl.rst     | 25 ++++++++++-
 arch/x86/kernel/cpu/resctrl/internal.h |  2 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 16 +++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 58 +++++++++++++++++++++++++-
 4 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index ad35c38eed34..05f1852ad8e2 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -259,7 +259,10 @@ with the following files:
 
 "mbm_assign_mode":
 	Reports the list of monitoring modes supported. The enclosed brackets
-	indicate which mode is enabled.
+	indicate which mode is enabled. The MBM events (mbm_total_bytes and/or
+	mbm_local_bytes) associated with counters may reset when "mbm_assign_mode"
+	is changed.
+
 	::
 
 	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
@@ -275,6 +278,16 @@ with the following files:
 	counters available is described in the "num_mbm_cntrs" file. Changing the
 	mode may cause all counters on the resource to reset.
 
+	Moving to mbm_cntr_assign mode requires users to assign the counters to
+	the events. Otherwise, the MBM event counters will return 'Unassigned'
+	when read.
+
+	The mode is beneficial for AMD platforms that support more CTRL_MON
+	and MON groups than available hardware counters. By default, this
+	feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth
+	Monitoring Counters) capability, ensuring counters remain assigned even
+	when the corresponding RMID is not actively used by any processor.
+
 	"default":
 
 	In default mode, resctrl assumes there is a hardware counter for each
@@ -284,6 +297,16 @@ with the following files:
 	misleading values or display "Unavailable" if no counter is assigned
 	to the event.
 
+	* To enable "mbm_cntr_assign" monitoring mode:
+	  ::
+
+	    # echo "mbm_cntr_assign" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
+	* To enable "default" monitoring mode:
+	  ::
+
+	    # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
 "num_mbm_cntrs":
 	The maximum number of monitoring counters (total of available and assigned
 	counters) in each domain when the system supports mbm_cntr_assign mode.
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 2020a2fe7135..2f3a5d78d153 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -599,6 +599,8 @@ int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d
 				struct rdtgroup *rdtgrp, enum resctrl_event_id evtid, u32 evt_cfg);
 int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
 		 enum resctrl_event_id evtid);
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d);
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d);
 
 #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 0c6fd5f6ec19..7f2e1fdfa936 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -610,6 +610,17 @@ static struct mbm_state *get_mbm_state(struct rdt_mon_domain *d, u32 closid,
 	}
 }
 
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+
+	if (resctrl_arch_is_mbm_total_enabled())
+		memset(d->mbm_total, 0, sizeof(struct mbm_state) * idx_limit);
+
+	if (resctrl_arch_is_mbm_local_enabled())
+		memset(d->mbm_local, 0, sizeof(struct mbm_state) * idx_limit);
+}
+
 static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
 	int cpu = smp_processor_id();
@@ -1558,6 +1569,11 @@ static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
 	memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
 }
 
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+	memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs);
+}
+
 /*
  * Allocate a fresh counter and configure the event if not assigned already.
  */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5d9c4c216522..d10cf1e5b914 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1050,6 +1050,61 @@ static ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of,
+					     char *buf, size_t nbytes, loff_t off)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	struct rdt_mon_domain *d;
+	int ret = 0;
+	bool enable;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!strcmp(buf, "default")) {
+		enable = 0;
+	} else if (!strcmp(buf, "mbm_cntr_assign")) {
+		if (r->mon.mbm_cntr_assignable) {
+			enable = 1;
+		} else {
+			ret = -EINVAL;
+			rdt_last_cmd_puts("mbm_cntr_assign mode is not supported\n");
+			goto write_exit;
+		}
+	} else {
+		ret = -EINVAL;
+		rdt_last_cmd_puts("Unsupported assign mode\n");
+		goto write_exit;
+	}
+
+	if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) {
+		ret = resctrl_arch_mbm_cntr_assign_set(r, enable);
+		if (ret)
+			goto write_exit;
+		/*
+		 * Reset all the non-architectural RMID state and assignable counters.
+		 */
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			mbm_cntr_free_all(r, d);
+			resctrl_reset_rmid_all(r, d);
+		}
+	}
+
+write_exit:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -2412,9 +2467,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_assign_mode",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= resctrl_mbm_assign_mode_show,
+		.write		= resctrl_mbm_assign_mode_write,
 		.fflags		= RFTYPE_MON_INFO,
 	},
 	{
-- 
2.34.1
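
For anyone exercising the new interface from a script, a minimal sketch of the user-space side follows. It is not part of this patch. The helper names active_mbm_mode and set_mbm_mode, and the MBM_ASSIGN_MODE_FILE override used for dry runs, are hypothetical. The sketch assumes only what the patch shows: the enabled mode is reported in brackets, and a write must end with a newline.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the bracketed, i.e.
# currently enabled, mode from mbm_assign_mode-style text read on stdin.
active_mbm_mode() {
	sed -n 's/.*\[\([^]]*\)].*/\1/p' | head -n 1
}

# Hypothetical helper: switch to mode $1 unless it is already enabled.
# MBM_ASSIGN_MODE_FILE is an assumed override so the logic can be tried
# against a scratch file instead of the real resctrl mount.
set_mbm_mode() {
	file=${MBM_ASSIGN_MODE_FILE:-/sys/fs/resctrl/info/L3_MON/mbm_assign_mode}
	if [ "$(active_mbm_mode < "$file")" = "$1" ]; then
		return 0
	fi
	# printf supplies the trailing newline that
	# resctrl_mbm_assign_mode_write() insists on.
	printf '%s\n' "$1" > "$file"
}
```

Skipping the write when the requested mode is already enabled is a choice made in this sketch, not something the interface requires. It simply avoids touching the file when no mode change, and so no counter reset, would result.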