Mirroring similar work for software events in commit 6e9fa4131abb
("perf parse-events: Remove non-json software events"), these changes
migrate the legacy hardware and cache events to json. With no hard
coded legacy hardware or cache events, the wildcard, case
insensitivity, etc. behavior is consistent across events. This does,
however, mean events like cycles will wildcard against all PMUs.

A change doing the same was originally posted and merged from:
https://lore.kernel.org/r/20240416061533.921723-10-irogers@xxxxxxxxxx
and reverted by Linus in commit 4f1b067359ac ("Revert "perf
parse-events: Prefer sysfs/JSON hardware events over legacy"") due to
his dislike for the cycles behavior on ARM with perf record. Earlier
patches in this series make perf record event opening failures
non-fatal and hide the cycles event's failure to open on ARM in perf
record, so it is expected the behavior will now be transparent in perf
record on ARM. perf stat with a cycles event will wildcard open the
event on all PMUs.

The change to support legacy events with PMUs was done to clean up
Intel's hybrid PMU implementation. Having sysfs/json events with
increased priority over legacy was requested by Mark Rutland
<mark.rutland@xxxxxxx> to fix Apple-M PMU issues wrt broken legacy
events on that PMU. It is believed the PMU driver is now fixed, but
this has only been confirmed on ARM Juno boards.

It was requested that RISC-V be able to add events to the perf tool
json so the PMU driver didn't need to map legacy events to config
encodings:
https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@xxxxxxxxxxxx/
This patch series achieves this.

A previous series of patches decreasing legacy hardware event
priorities was posted in:
https://lore.kernel.org/lkml/20250416045117.876775-1-irogers@xxxxxxxxxx/
Namhyung Kim <namhyung@xxxxxxxxxx> mentioned that hardware and
software events can be implemented similarly:
https://lore.kernel.org/lkml/aIJmJns2lopxf3EK@xxxxxxxxxx/
and this patch series achieves this.
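
For readers unfamiliar with the "config encodings" mentioned above,
here is a minimal sketch of how a legacy cache event maps to a
PERF_TYPE_HW_CACHE config value, following the layout documented in
include/uapi/linux/perf_event.h (cache id, op id << 8, result id << 16).
It is not code from this series: the function names, the table layout,
the restriction to load/store ops and the lower-casing lookup are all
assumptions made for illustration only.

```python
# Illustrative sketch only; NOT the series' make_legacy_cache.py.
# The numeric ids follow the perf_event_open ABI (perf_hw_cache_id,
# perf_hw_cache_op_id, perf_hw_cache_op_result_id). Everything else
# (names, table shape, lookup helper) is hypothetical.

CACHE_IDS = {
    "l1-dcache": 0,  # PERF_COUNT_HW_CACHE_L1D
    "l1-icache": 1,  # PERF_COUNT_HW_CACHE_L1I
    "llc": 2,        # PERF_COUNT_HW_CACHE_LL
    "dtlb": 3,       # PERF_COUNT_HW_CACHE_DTLB
    "itlb": 4,       # PERF_COUNT_HW_CACHE_ITLB
    "branch": 5,     # PERF_COUNT_HW_CACHE_BPU
    "node": 6,       # PERF_COUNT_HW_CACHE_NODE
}
OP_IDS = {"load": 0, "store": 1}      # prefetch (2) omitted for brevity
RESULT_IDS = {"access": 0, "miss": 1}


def legacy_cache_config(cache, op, result):
    """Compose the config value: id | (op << 8) | (result << 16)."""
    return CACHE_IDS[cache] | (OP_IDS[op] << 8) | (RESULT_IDS[result] << 16)


def event_name(cache, op, result):
    """Access events are '<cache>-<op>s', misses '<cache>-<op>-misses'."""
    if result == "access":
        return f"{cache}-{op}s"
    return f"{cache}-{op}-misses"


def build_table():
    """Map lower-case event names to config values.

    Every combination is generated here; real PMUs only support a subset.
    """
    table = {}
    for cache in CACHE_IDS:
        for op in OP_IDS:
            for result in RESULT_IDS:
                name = event_name(cache, op, result)
                table[name] = legacy_cache_config(cache, op, result)
    return table


def lookup(table, name):
    """Case-insensitive lookup, so 'LLC-loads' and 'llc-loads' both match."""
    return table[name.lower()]


TABLE = build_table()
print(hex(lookup(TABLE, "L1-dcache-load-misses")))
```

With the events described as data like this, the old upper-case
spellings (L1-dcache-load-misses) and the new lower-case ones resolve
to the same encoding, which is the kind of consistency the json
migration is after.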

Note, patch 1 (perf parse-events: Fix legacy cache events if event is
duplicated in a PMU) fixes a function deleted by patch 15 (perf
parse-events: Remove hard coded legacy hardware and cache
parsing). Adding the json exposed an issue when legacy cache (not
legacy hardware) and sysfs/json events exist. The fix is necessary to
keep tests passing through the series. It is also posted for backports
to stable trees.

The perf list behavior includes a lot more information and events. The
before behavior on a hybrid alderlake is:
```
$ perf list hw

List of pre-defined events (to be used in -e or -M):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  ref-cycles                                         [Hardware event]

$ perf list hwcache

List of pre-defined events (to be used in -e or -M):

cache:
  L1-dcache-loads OR cpu_atom/L1-dcache-loads/
  L1-dcache-stores OR cpu_atom/L1-dcache-stores/
  L1-icache-loads OR cpu_atom/L1-icache-loads/
  L1-icache-load-misses OR cpu_atom/L1-icache-load-misses/
  LLC-loads OR cpu_atom/LLC-loads/
  LLC-load-misses OR cpu_atom/LLC-load-misses/
  LLC-stores OR cpu_atom/LLC-stores/
  LLC-store-misses OR cpu_atom/LLC-store-misses/
  dTLB-loads OR cpu_atom/dTLB-loads/
  dTLB-load-misses OR cpu_atom/dTLB-load-misses/
  dTLB-stores OR cpu_atom/dTLB-stores/
  dTLB-store-misses OR cpu_atom/dTLB-store-misses/
  iTLB-load-misses OR cpu_atom/iTLB-load-misses/
  branch-loads OR cpu_atom/branch-loads/
  branch-load-misses OR cpu_atom/branch-load-misses/
  L1-dcache-loads OR cpu_core/L1-dcache-loads/
  L1-dcache-load-misses OR cpu_core/L1-dcache-load-misses/
  L1-dcache-stores OR cpu_core/L1-dcache-stores/
  L1-icache-load-misses OR cpu_core/L1-icache-load-misses/
  LLC-loads OR cpu_core/LLC-loads/
  LLC-load-misses OR cpu_core/LLC-load-misses/
  LLC-stores OR cpu_core/LLC-stores/
  LLC-store-misses OR cpu_core/LLC-store-misses/
  dTLB-loads OR cpu_core/dTLB-loads/
  dTLB-load-misses OR cpu_core/dTLB-load-misses/
  dTLB-stores OR cpu_core/dTLB-stores/
  dTLB-store-misses OR cpu_core/dTLB-store-misses/
  iTLB-load-misses OR cpu_core/iTLB-load-misses/
  branch-loads OR cpu_core/branch-loads/
  branch-load-misses OR cpu_core/branch-load-misses/
  node-loads OR cpu_core/node-loads/
  node-load-misses OR cpu_core/node-load-misses/
```
and after it is:
```
$ perf list hw

legacy hardware:
  branch-instructions
       [Retired branch instructions [This event is an alias of branches]. Unit: cpu_atom]
  branch-misses
       [Mispredicted branch instructions. Unit: cpu_atom]
  branches
       [Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu_atom]
  bus-cycles
       [Bus cycles,which can be different from total cycles. Unit: cpu_atom]
  cache-misses
       [Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu_atom]
  cache-references
       [Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu_atom]
  cpu-cycles
       [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu_atom]
  cycles
       [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu_atom]
  instructions
       [Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu_atom]
  ref-cycles
       [Total cycles; not affected by CPU frequency scaling. Unit: cpu_atom]
  branch-instructions
       [Retired branch instructions [This event is an alias of branches]. Unit: cpu_core]
  branch-misses
       [Mispredicted branch instructions. Unit: cpu_core]
  branches
       [Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu_core]
  bus-cycles
       [Bus cycles,which can be different from total cycles. Unit: cpu_core]
  cache-misses
       [Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu_core]
  cache-references
       [Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu_core]
  cpu-cycles
       [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu_core]
  cycles
       [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu_core]
  instructions
       [Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu_core]
  ref-cycles
       [Total cycles; not affected by CPU frequency scaling. Unit: cpu_core]

$ perf list hwcache

legacy cache:
  branch-load-misses
       [Branch prediction unit read misses. Unit: cpu_atom]
  branch-loads
       [Branch prediction unit read accesses. Unit: cpu_atom]
  dtlb-load-misses
       [Data TLB read misses. Unit: cpu_atom]
  dtlb-loads
       [Data TLB read accesses. Unit: cpu_atom]
  dtlb-store-misses
       [Data TLB write misses. Unit: cpu_atom]
  dtlb-stores
       [Data TLB write accesses. Unit: cpu_atom]
  itlb-load-misses
       [Instruction TLB read misses. Unit: cpu_atom]
  l1-dcache-loads
       [Level 1 data cache read accesses. Unit: cpu_atom]
  l1-dcache-stores
       [Level 1 data cache write accesses. Unit: cpu_atom]
  l1-icache-load-misses
       [Level 1 instruction cache read misses. Unit: cpu_atom]
  l1-icache-loads
       [Level 1 instruction cache read accesses. Unit: cpu_atom]
  llc-load-misses
       [Last level cache read misses. Unit: cpu_atom]
  llc-loads
       [Last level cache read accesses. Unit: cpu_atom]
  llc-store-misses
       [Last level cache write misses. Unit: cpu_atom]
  llc-stores
       [Last level cache write accesses. Unit: cpu_atom]
  branch-load-misses
       [Branch prediction unit read misses. Unit: cpu_core]
  branch-loads
       [Branch prediction unit read accesses. Unit: cpu_core]
  dtlb-load-misses
       [Data TLB read misses. Unit: cpu_core]
  dtlb-loads
       [Data TLB read accesses. Unit: cpu_core]
  dtlb-store-misses
       [Data TLB write misses. Unit: cpu_core]
  dtlb-stores
       [Data TLB write accesses. Unit: cpu_core]
  itlb-load-misses
       [Instruction TLB read misses. Unit: cpu_core]
  l1-dcache-load-misses
       [Level 1 data cache read misses. Unit: cpu_core]
  l1-dcache-loads
       [Level 1 data cache read accesses. Unit: cpu_core]
  l1-dcache-stores
       [Level 1 data cache write accesses. Unit: cpu_core]
  l1-icache-load-misses
       [Level 1 instruction cache read misses. Unit: cpu_core]
  llc-load-misses
       [Last level cache read misses. Unit: cpu_core]
  llc-loads
       [Last level cache read accesses. Unit: cpu_core]
  llc-store-misses
       [Last level cache write misses. Unit: cpu_core]
  llc-stores
       [Last level cache write accesses. Unit: cpu_core]
  node-load-misses
       [Local memory read misses. Unit: cpu_core]
  node-loads
       [Local memory read accesses. Unit: cpu_core]
```

v3: Deprecate the legacy cache events that aren't shown in the
    previous perf list to avoid the perf list output being too verbose.

v2: Additional details to the cover letter. Credit to Vince Weaver
    added to the commit message for the event details. Additional
    patches to clean up perf_pmu new_alias by removing an unused term
    scanner argument and avoid stdio usage.
    https://lore.kernel.org/lkml/20250828163225.3839073-1-irogers@xxxxxxxxxx/

v1: https://lore.kernel.org/lkml/20250828064231.1762997-1-irogers@xxxxxxxxxx/

Ian Rogers (15):
  perf parse-events: Fix legacy cache events if event is duplicated in a PMU
  perf perf_api_probe: Avoid scanning all PMUs, try software PMU first
  perf record: Skip don't fail for events that don't open
  perf jevents: Support copying the source json files to OUTPUT
  perf pmu: Don't eagerly parse event terms
  perf parse-events: Remove unused FILE input argument to scanner
  perf pmu: Use fd rather than FILE from new_alias
  perf pmu: Factor term parsing into a perf_event_attr into a helper
  perf parse-events: Add terms for legacy hardware and cache config values
  perf jevents: Add legacy json terms and default_core event table helper
  perf pmu: Add and use legacy_terms in alias information
  perf jevents: Add legacy-hardware and legacy-cache json
  perf print-events: Remove print_hwcache_events
  perf print-events: Remove print_symbol_events
  perf parse-events: Remove hard coded legacy hardware and cache parsing

 tools/perf/Makefile.perf                    |   21 +-
 tools/perf/arch/x86/util/intel-pt.c         |    2 +-
 tools/perf/builtin-list.c                   |   34 +-
 tools/perf/builtin-record.c                 |   89 +-
 tools/perf/pmu-events/Build                 |   24 +-
 .../arch/common/common/legacy-hardware.json |   72 +
 tools/perf/pmu-events/empty-pmu-events.c    | 2763 ++++++++++++++++-
 tools/perf/pmu-events/jevents.py            |   24 +
 tools/perf/pmu-events/make_legacy_cache.py  |  129 +
 tools/perf/pmu-events/pmu-events.h          |    1 +
 tools/perf/tests/parse-events.c             |    2 +-
 tools/perf/tests/pmu-events.c               |   24 +-
 tools/perf/tests/pmu.c                      |    3 +-
 tools/perf/util/parse-events.c              |  283 +-
 tools/perf/util/parse-events.h              |   16 +-
 tools/perf/util/parse-events.l              |   54 +-
 tools/perf/util/parse-events.y              |  114 +-
 tools/perf/util/perf_api_probe.c            |   27 +-
 tools/perf/util/pmu.c                       |  302 +-
 tools/perf/util/print-events.c              |  112 -
 tools/perf/util/print-events.h              |    4 -
 21 files changed, 3330 insertions(+), 770 deletions(-)
 create mode 100644 tools/perf/pmu-events/arch/common/common/legacy-hardware.json
 create mode 100755 tools/perf/pmu-events/make_legacy_cache.py

-- 
2.51.0.318.gd7df087d1a-goog