Hi Reinette, On 4/11/25 15:52, Reinette Chatre wrote: > Hi Babu, > > On 4/3/25 5:18 PM, Babu Moger wrote: >> Users can create as many monitor groups as RMIDs supported by the hardware. >> However, bandwidth monitoring feature on AMD system only guarantees that >> RMIDs currently assigned to a processor will be tracked by hardware. The >> counters of any other RMIDs which are no longer being tracked will be reset >> to zero. The MBM event counters return "Unavailable" for the RMIDs that are >> not tracked by hardware. So, there can be only limited number of groups >> that can give guaranteed monitoring numbers. With ever changing >> configurations there is no way to definitely know which of these groups are >> being tracked for certain point of time. Users do not have the option to >> monitor a group or set of groups for certain period of time without >> worrying about RMID being reset in between. >> >> The ABMC feature provides an option to the user to assign a hardware >> counter to an RMID, event pair and monitor the bandwidth as long as it is >> assigned. The assigned RMID will be tracked by the hardware until the user >> unassigns it manually. There is no need to worry about counters being reset >> during this period. Additionally, the user can specify a bitmask >> identifying the specific bandwidth types from the given source to track >> with the counter. >> >> Without ABMC enabled, monitoring will work in current mode without >> assignment option. >> >> Linux resctrl subsystem provides the interface to count maximum of two >> memory bandwidth events per group, from a combination of available total >> and local events. Keeping the current interface, users can enable a maximum >> of 2 ABMC counters per group. User will also have the option to enable only >> one counter to the group. If the system runs out of assignable ABMC >> counters, kernel will display an error. Users need to disable an already >> enabled counter to make space for new assignments. > > The above paragraph sounds like it is still talking about the original > global assignment of counters. Ok. Sure. Will update it. > >> >> The feature can be detected via CPUID_Fn80000020_EBX_x00 bit 5. >> Bits Description >> 5 ABMC (Assignable Bandwidth Monitoring Counters) >> >> The feature details are documented in APM listed below [1]. >> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming >> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth >> Monitoring (ABMC). >> >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 >> Signed-off-by: Babu Moger <babu.moger@xxxxxxx> >> --- >> >> Note: Checkpatch checks/warnings are ignored to maintain coding style. >> >> v12: Removed the dependancy on X86_FEATURE_BMEC. > > Considering this removal it is not clear to me how the BMEC and ABMC features > are managed on a platform. Since this dependency existed I assume platforms > that support both ABMC and BMEC exist and after previous discussion [1] > I expected to see that BMEC support will be disabled when ABMC is detected > but I do not see this done in this series. From what I can tell, looking at > patch "x86/resctrl: Detect Assignable Bandwidth Monitoring feature details" > BMEC and ABMC are both detected and enabled while I do not see any > interactions handled. For example, a user modifying the BMEC appears > to have no impact on existing ABMC assigned counters. Could you please clarify > how event configuration works on platforms that support both ABMC and BMEC? They are mutually exclusive. If ABMC is enabled then BMEC should not work. I missed to handle it. Also, I was not very clear at that on how to handle that. Here is my proposal to handle this case. This can be separate patch. diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index d10cf1e5b914..772f2f77faee 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1370,7 +1370,7 @@ static int rdt_mon_features_show(struct kernfs_open_file *of, list_for_each_entry(mevt, &r->mon.evt_list, list) { seq_printf(seq, "%s\n", mevt->name); - if (mevt->configurable) + if (mevt->configurable && !resctrl_arch_mbm_cntr_assign_enabled(r)) seq_printf(seq, "%s_config\n", mevt->name); } @@ -1846,6 +1846,11 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid cpus_read_lock(); mutex_lock(&rdtgroup_mutex); + if (resctrl_arch_mbm_cntr_assign_enabled(r)) { + rdt_last_cmd_puts("Event configuration(BMEC) not supported with mbm_cntr_assign mode\n"); + return -EINVAL; + } + list_for_each_entry(dom, &r->mon_domains, hdr.list) { if (sep) seq_puts(s, ";"); @@ -1865,21 +1870,24 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid static int mbm_total_bytes_config_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) { + int ret; struct rdt_resource *r = of->kn->parent->priv; - mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID); + ret = mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID); - return 0; + return ret; } static int mbm_local_bytes_config_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) { + int ret; + struct rdt_resource *r = of->kn->parent->priv; - mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID); + ret = mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID); - return 0; + return ret; } static void mbm_config_write_domain(struct rdt_resource *r, @@ -1932,6 +1940,11 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid) /* Walking r->domains, ensure it can't race with cpuhp */ lockdep_assert_cpus_held(); + if (resctrl_arch_mbm_cntr_assign_enabled(r)) { + rdt_last_cmd_puts("Event configuration(BMEC) not supported with mbm_cntr_assign mode\n"); + return -EINVAL; + } + next: if (!tok || tok[0] == '\0') return 0; > > Reinette > > [1] https://lore.kernel.org/all/4b66ea1c-2f76-4218-a67b-2232b2be6990@xxxxxxx/ > -- Thanks Babu Moger