Resctrl subsystem can support two monitoring modes, "mbm_event" or "default". In mbm_event mode, monitoring event can only accumulate data while it is backed by a hardware counter. In "default" mode, resctrl assumes there is a hardware counter for each event within every CTRL_MON and MON group. Introduce mbm_assign_mode resctrl file to switch between mbm_event and default modes. Example: To list the MBM monitor modes supported: $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode [mbm_event] default To enable the "mbm_event" counter assignment mode: $ echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode To enable the "default" monitoring mode: $ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode Reset MBM event counters automatically as part of changing the mode. Clear both architectural and non-architectural event states to prevent overflow conditions during the next event read. Clear assignable counter configuration on all the domains. Also, enable auto assignment when switching to "mbm_event" mode. Signed-off-by: Babu Moger Reviewed-by: Reinette Chatre --- v18: Changelog update for imperative mode. Added Reviewed-by tag. v17: Moved resctrl_mbm_assign_mode_write() to fs/resctrl/monitor.c Fixed the event configuration initialization while considering hw support. Enable auto assignment when switching to "mbm_event" mode. v16: Minor changelog update. Minor update in resctrl.rst. Updated resctrl_bmec_files_show() to pass NULL for kn_fs_node. v15: Minor changelog update. Minir user do resctrl.rst update. Fixed stray hunks. v14: Updated the changelog to reflect the change in monitor mode naming. Added the call resctrl_bmec_files_show() to enable/disable files related to BMEC. Added resctrl_set_mon_evt_cfg() to reset event configuration values when mode is changes. v13: Resolved the conflicts due to FS/ARCH restructure. Introduced the new resctrl_init_evt_configuration() to initialize the event modes and configuration values. Added the call to resctrl_bmec_files_show() hide/show BMEC related files. v12: Fixed the documentation for a consistency. Introduced mbm_cntr_free_all() and resctrl_reset_rmid_all() to clear counters and non-architectural states when monitor mode is changed. https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/ v11: Changed the name of the function rdtgroup_mbm_assign_mode_write() to resctrl_mbm_assign_mode_write(). Rewrote the commit message with context. Added few more details in resctrl.rst about mbm_cntr_assign mode. Re-arranged the text in resctrl.rst file. v10: The call mbm_cntr_reset() has been moved to earlier patch. Minor documentation update. v9: Fixed extra spaces in user documentation. Fixed problem changing the mode to mbm_cntr_assign mode when it is not supported. Added extra checks to detect if systems supports it. Used the rdtgroup_cntr_id_init to initialize cntr_id. v8: Reset the internal counters after mbm_cntr_assign mode is changed. Renamed rdtgroup_mbm_cntr_reset() to mbm_cntr_reset() Updated the documentation to make text generic. v7: Changed the interface name to mbm_assign_mode. Removed the references of ABMC. Added the changes to reset global and domain bitmaps. Added the changes to reset rmid. v6: Changed the mode name to mbm_cntr_assign. Moved all the FS related code here. Added changes to reset mbm_cntr_map and resctrl group counters. v5: Change log and mode description text correction. v4: Minor commit text changes. Keep the default to ABMC when supported. Fixed comments to reflect changed interface "mbm_mode". v3: New patch to address the review comments from upstream. --- Documentation/filesystems/resctrl.rst | 22 +++++- fs/resctrl/internal.h | 6 ++ fs/resctrl/monitor.c | 100 ++++++++++++++++++++++++++ fs/resctrl/rdtgroup.c | 7 +- 4 files changed, 131 insertions(+), 4 deletions(-) diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst index f60f6a96cb6b..006d23af66e1 100644 --- a/Documentation/filesystems/resctrl.rst +++ b/Documentation/filesystems/resctrl.rst @@ -259,7 +259,8 @@ with the following files: "mbm_assign_mode": The supported counter assignment modes. The enclosed brackets indicate which mode - is enabled. + is enabled. The MBM events associated with counters may reset when "mbm_assign_mode" + is changed. :: # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode @@ -279,6 +280,15 @@ with the following files: of counters available is described in the "num_mbm_cntrs" file. Changing the mode may cause all counters on the resource to reset. + Moving to mbm_event counter assignment mode requires users to assign the counters + to the events. Otherwise, the MBM event counters will return 'Unassigned' when read. + + The mode is beneficial for AMD platforms that support more CTRL_MON + and MON groups than available hardware counters. By default, this + feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth + Monitoring Counters) capability, ensuring counters remain assigned even + when the corresponding RMID is not actively used by any processor. + "default": In default mode, resctrl assumes there is a hardware counter for each @@ -288,6 +298,16 @@ with the following files: result in misleading values or display "Unavailable" if no counter is assigned to the event. + * To enable "mbm_event" counter assignment mode: + :: + + # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode + + * To enable "default" monitoring mode: + :: + + # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode + "num_mbm_cntrs": The maximum number of counters (total of available and assigned counters) in each domain when the system supports mbm_event mode. diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h index 264f04c7dfba..6938734b14a4 100644 --- a/fs/resctrl/internal.h +++ b/fs/resctrl/internal.h @@ -396,6 +396,12 @@ void *rdt_kn_parent_priv(struct kernfs_node *kn); int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of, struct seq_file *s, void *v); +ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, char *buf, + size_t nbytes, loff_t off); + +void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn, + bool show); + int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of, struct seq_file *s, void *v); int resctrl_available_mbm_cntrs_show(struct kernfs_open_file *of, struct seq_file *s, diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c index d49170247b75..4cf78a9a8807 100644 --- a/fs/resctrl/monitor.c +++ b/fs/resctrl/monitor.c @@ -1080,6 +1080,33 @@ ssize_t resctrl_mbm_assign_on_mkdir_write(struct kernfs_open_file *of, char *buf return ret ?: nbytes; } +/* + * mbm_cntr_free_all() - Clear all the counter ID configuration details in the + * domain @d. Called when mbm_assign_mode is changed. + */ +static void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d) +{ + memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs); +} + +/* + * resctrl_reset_rmid_all() - Reset all non-architecture states for all the + * supported RMIDs. + */ +static void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d) +{ + u32 idx_limit = resctrl_arch_system_num_rmid_idx(); + enum resctrl_event_id evt; + int idx; + + for_each_mbm_event_id(evt) { + if (!resctrl_is_mon_event_enabled(evt)) + continue; + idx = MBM_STATE_IDX(evt); + memset(d->mbm_states[idx], 0, sizeof(*d->mbm_states[0]) * idx_limit); + } +} + /* * rdtgroup_assign_cntr() - Assign/unassign the counter ID for the event, RMID * pair in the domain. @@ -1390,6 +1417,79 @@ int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of, return 0; } +ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, char *buf, + size_t nbytes, loff_t off) +{ + struct rdt_resource *r = rdt_kn_parent_priv(of->kn); + struct rdt_mon_domain *d; + int ret = 0; + bool enable; + + /* Valid input requires a trailing newline */ + if (nbytes == 0 || buf[nbytes - 1] != '\n') + return -EINVAL; + + buf[nbytes - 1] = '\0'; + + cpus_read_lock(); + mutex_lock(&rdtgroup_mutex); + + rdt_last_cmd_clear(); + + if (!strcmp(buf, "default")) { + enable = 0; + } else if (!strcmp(buf, "mbm_event")) { + if (r->mon.mbm_cntr_assignable) { + enable = 1; + } else { + ret = -EINVAL; + rdt_last_cmd_puts("mbm_event mode is not supported\n"); + goto out_unlock; + } + } else { + ret = -EINVAL; + rdt_last_cmd_puts("Unsupported assign mode\n"); + goto out_unlock; + } + + if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) { + ret = resctrl_arch_mbm_cntr_assign_set(r, enable); + if (ret) + goto out_unlock; + + /* Update the visibility of BMEC related files */ + resctrl_bmec_files_show(r, NULL, !enable); + + /* + * Initialize the default memory transaction values for + * total and local events. + */ + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) + mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask; + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) + mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask & + (READS_TO_LOCAL_MEM | + READS_TO_LOCAL_S_MEM | + NON_TEMP_WRITE_TO_LOCAL_MEM); + /* Enable auto assignment when switching to "mbm_event" mode */ + if (enable) + r->mon.mbm_assign_on_mkdir = true; + /* + * Reset all the non-achitectural RMID state and assignable counters. + */ + list_for_each_entry(d, &r->mon_domains, hdr.list) { + mbm_cntr_free_all(r, d); + resctrl_reset_rmid_all(r, d); + } + } + +out_unlock: + mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); + + return ret ?: nbytes; +} + int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of, struct seq_file *s, void *v) { diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c index 0c404a159d45..0320360cd7a6 100644 --- a/fs/resctrl/rdtgroup.c +++ b/fs/resctrl/rdtgroup.c @@ -1807,8 +1807,8 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of, * Don't treat kernfs_find_and_get failure as an error, since this function may * be called regardless of whether BMEC is supported or the event is enabled. */ -static void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn, - bool show) +void resctrl_bmec_files_show(struct rdt_resource *r, struct kernfs_node *l3_mon_kn, + bool show) { struct kernfs_node *kn_config, *mon_kn = NULL; char name[32]; @@ -1985,9 +1985,10 @@ static struct rftype res_common_files[] = { }, { .name = "mbm_assign_mode", - .mode = 0444, + .mode = 0644, .kf_ops = &rdtgroup_kf_single_ops, .seq_show = resctrl_mbm_assign_mode_show, + .write = resctrl_mbm_assign_mode_write, .fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE, }, { -- 2.34.1