Hi Ian,

On Thu, Aug 28, 2025 at 01:59:15PM -0700, Ian Rogers wrote:
> Mirroring similar work for software events in commit 6e9fa4131abb
> ("perf parse-events: Remove non-json software events"). These changes
> migrate the legacy hardware and cache events to json. With no hard
> coded legacy hardware or cache events the wild card, case
> insensitivity, etc. is consistent for events. This does, however, mean
> events like cycles will wild card against all PMUs. A change doing the
> same was originally posted and merged from:
> https://lore.kernel.org/r/20240416061533.921723-10-irogers@xxxxxxxxxx
> and reverted by Linus in commit 4f1b067359ac ("Revert "perf
> parse-events: Prefer sysfs/JSON hardware events over legacy"") due to
> his dislike for the cycles behavior on ARM with perf record. Earlier
> patches in this series make perf record event opening failures
> non-fatal and hide the cycles event's failure to open on ARM in perf
> record, so it is expected the behavior will now be transparent in perf
> record on ARM. perf stat with a cycles event will wildcard open the
> event on all PMUs.
>
> The change to support legacy events with PMUs was done to clean up
> Intel's hybrid PMU implementation. Having sysfs/json events with
> increased priority to legacy was requested by Mark Rutland
> <mark.rutland@xxxxxxx> to fix Apple-M PMU issues wrt broken legacy
> events on that PMU. It is believed the PMU driver is now fixed, but
> this has only been confirmed on ARM Juno boards. It was requested that
> RISC-V be able to add events to the perf tool json so the PMU driver
> didn't need to map legacy events to config encodings:
> https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@xxxxxxxxxxxx/
> This patch series achieves this.
>
> A previous series of patches decreasing legacy hardware event
> priorities was posted in:
> https://lore.kernel.org/lkml/20250416045117.876775-1-irogers@xxxxxxxxxx/
> Namhyung Kim <namhyung@xxxxxxxxxx> mentioned that hardware and
> software events can be implemented similarly:
> https://lore.kernel.org/lkml/aIJmJns2lopxf3EK@xxxxxxxxxx/
> and this patch series achieves this.

Thanks for working on this.  Yeah, I think it'd be easier to handle all
events consistently with JSON.  I expect the sysfs encoding will be used
with higher priority if it comes in the <PMU>/<EVENT>/ format.

>
> Note, patch 1 (perf parse-events: Fix legacy cache events if event is
> duplicated in a PMU) fixes a function deleted by patch 15 (perf
> parse-events: Remove hard coded legacy hardware and cache
> parsing). Adding the json exposed an issue when legacy cache (not
> legacy hardware) and sysfs/json events exist. The fix is necessary to
> keep tests passing through the series. It is also posted for backports
> to stable trees.

Sounds ok.

>
> The perf list behavior includes a lot more information and events. The
> before behavior on a hybrid alderlake is:
> ```
> $ perf list hw
>
> List of pre-defined events (to be used in -e or -M):
>
>   branch-instructions OR branches                    [Hardware event]
>   branch-misses                                      [Hardware event]
>   bus-cycles                                         [Hardware event]
>   cache-misses                                       [Hardware event]
>   cache-references                                   [Hardware event]
>   cpu-cycles OR cycles                               [Hardware event]
>   instructions                                       [Hardware event]
>   ref-cycles                                         [Hardware event]
> $ perf list hwcache
>
> List of pre-defined events (to be used in -e or -M):
>
>
> cache:
>   L1-dcache-loads OR cpu_atom/L1-dcache-loads/
>   L1-dcache-stores OR cpu_atom/L1-dcache-stores/
>   L1-icache-loads OR cpu_atom/L1-icache-loads/
>   L1-icache-load-misses OR cpu_atom/L1-icache-load-misses/
>   LLC-loads OR cpu_atom/LLC-loads/
>   LLC-load-misses OR cpu_atom/LLC-load-misses/
>   LLC-stores OR cpu_atom/LLC-stores/
>   LLC-store-misses OR cpu_atom/LLC-store-misses/
>   dTLB-loads OR cpu_atom/dTLB-loads/
>   dTLB-load-misses OR cpu_atom/dTLB-load-misses/
>   dTLB-stores OR cpu_atom/dTLB-stores/
>   dTLB-store-misses OR cpu_atom/dTLB-store-misses/
>   iTLB-load-misses OR cpu_atom/iTLB-load-misses/
>   branch-loads OR cpu_atom/branch-loads/
>   branch-load-misses OR cpu_atom/branch-load-misses/
>   L1-dcache-loads OR cpu_core/L1-dcache-loads/
>   L1-dcache-load-misses OR cpu_core/L1-dcache-load-misses/
>   L1-dcache-stores OR cpu_core/L1-dcache-stores/
>   L1-icache-load-misses OR cpu_core/L1-icache-load-misses/
>   LLC-loads OR cpu_core/LLC-loads/
>   LLC-load-misses OR cpu_core/LLC-load-misses/
>   LLC-stores OR cpu_core/LLC-stores/
>   LLC-store-misses OR cpu_core/LLC-store-misses/
>   dTLB-loads OR cpu_core/dTLB-loads/
>   dTLB-load-misses OR cpu_core/dTLB-load-misses/
>   dTLB-stores OR cpu_core/dTLB-stores/
>   dTLB-store-misses OR cpu_core/dTLB-store-misses/
>   iTLB-load-misses OR cpu_core/iTLB-load-misses/
>   branch-loads OR cpu_core/branch-loads/
>   branch-load-misses OR cpu_core/branch-load-misses/
>   node-loads OR cpu_core/node-loads/
>   node-load-misses OR cpu_core/node-load-misses/
> ```
> and after it is:
> ```
> $ perf list hw
>
> legacy hardware:
>   branch-instructions
>        [Retired branch instructions [This event is an alias of branches].
>         Unit: cpu_atom]
>   branch-misses
>        [Mispredicted branch instructions. Unit: cpu_atom]
>   branches
>        [Retired branch instructions [This event is an alias of
>         branch-instructions]. Unit: cpu_atom]

A nit.  Can we have one actual event and an alias of it?  I think
'branch-instructions' should be the actual event and 'branches' the
alias.  Then the descriptions would look like

  branch-instructions
       [Retired branch instructions. Unit: cpu_atom]
  ...
  branches
       [This event is an alias of branch-instructions.]

The same goes for 'cycles' and 'cpu-cycles'.

Thanks,
Namhyung

>   bus-cycles
>        [Bus cycles,which can be different from total cycles. Unit: cpu_atom]
>   cache-misses
>        [Cache misses. Usually this indicates Last Level Cache misses; this is
>         intended to be used in conjunction with the
>         PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates.
>         Unit: cpu_atom]
>   cache-references
>        [Cache accesses. Usually this indicates Last Level Cache accesses but
>         this may vary depending on your CPU. This may include prefetches and
>         coherency messages; again this depends on the design of your CPU.
>         Unit: cpu_atom]
>   cpu-cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cycles]. Unit: cpu_atom]
>   cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cpu-cycles]. Unit: cpu_atom]
>   instructions
>        [Retired instructions. Be careful,these can be affected by various
>         issues,most notably hardware interrupt counts. Unit: cpu_atom]
>   ref-cycles
>        [Total cycles; not affected by CPU frequency scaling. Unit: cpu_atom]
>   branch-instructions
>        [Retired branch instructions [This event is an alias of branches].
>         Unit: cpu_core]
>   branch-misses
>        [Mispredicted branch instructions. Unit: cpu_core]
>   branches
>        [Retired branch instructions [This event is an alias of
>         branch-instructions]. Unit: cpu_core]
>   bus-cycles
>        [Bus cycles,which can be different from total cycles. Unit: cpu_core]
>   cache-misses
>        [Cache misses. Usually this indicates Last Level Cache misses; this is
>         intended to be used in conjunction with the
>         PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates.
>         Unit: cpu_core]
>   cache-references
>        [Cache accesses. Usually this indicates Last Level Cache accesses but
>         this may vary depending on your CPU. This may include prefetches and
>         coherency messages; again this depends on the design of your CPU.
>         Unit: cpu_core]
>   cpu-cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cycles]. Unit: cpu_core]
>   cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cpu-cycles]. Unit: cpu_core]
>   instructions
>        [Retired instructions. Be careful,these can be affected by various
>         issues,most notably hardware interrupt counts. Unit: cpu_core]
>   ref-cycles
>        [Total cycles; not affected by CPU frequency scaling. Unit: cpu_core]
> $ perf list hwcache
>
> legacy cache:
>   branch-load-misses
>        [Branch prediction unit read misses. Unit: cpu_atom]
>   branch-loads
>        [Branch prediction unit read accesses. Unit: cpu_atom]
>   dtlb-load-misses
>        [Data TLB read misses. Unit: cpu_atom]
>   dtlb-loads
>        [Data TLB read accesses. Unit: cpu_atom]
>   dtlb-store-misses
>        [Data TLB write misses. Unit: cpu_atom]
>   dtlb-stores
>        [Data TLB write accesses. Unit: cpu_atom]
>   itlb-load-misses
>        [Instruction TLB read misses. Unit: cpu_atom]
>   l1-dcache-loads
>        [Level 1 data cache read accesses. Unit: cpu_atom]
>   l1-dcache-stores
>        [Level 1 data cache write accesses. Unit: cpu_atom]
>   l1-icache-load-misses
>        [Level 1 instruction cache read misses. Unit: cpu_atom]
>   l1-icache-loads
>        [Level 1 instruction cache read accesses. Unit: cpu_atom]
>   llc-load-misses
>        [Last level cache read misses. Unit: cpu_atom]
>   llc-loads
>        [Last level cache read accesses. Unit: cpu_atom]
>   llc-store-misses
>        [Last level cache write misses. Unit: cpu_atom]
>   llc-stores
>        [Last level cache write accesses. Unit: cpu_atom]
>   branch-load-misses
>        [Branch prediction unit read misses. Unit: cpu_core]
>   branch-loads
>        [Branch prediction unit read accesses. Unit: cpu_core]
>   dtlb-load-misses
>        [Data TLB read misses. Unit: cpu_core]
>   dtlb-loads
>        [Data TLB read accesses. Unit: cpu_core]
>   dtlb-store-misses
>        [Data TLB write misses. Unit: cpu_core]
>   dtlb-stores
>        [Data TLB write accesses. Unit: cpu_core]
>   itlb-load-misses
>        [Instruction TLB read misses. Unit: cpu_core]
>   l1-dcache-load-misses
>        [Level 1 data cache read misses. Unit: cpu_core]
>   l1-dcache-loads
>        [Level 1 data cache read accesses. Unit: cpu_core]
>   l1-dcache-stores
>        [Level 1 data cache write accesses. Unit: cpu_core]
>   l1-icache-load-misses
>        [Level 1 instruction cache read misses. Unit: cpu_core]
>   llc-load-misses
>        [Last level cache read misses. Unit: cpu_core]
>   llc-loads
>        [Last level cache read accesses. Unit: cpu_core]
>   llc-store-misses
>        [Last level cache write misses. Unit: cpu_core]
>   llc-stores
>        [Last level cache write accesses. Unit: cpu_core]
>   node-load-misses
>        [Local memory read misses. Unit: cpu_core]
>   node-loads
>        [Local memory read accesses. Unit: cpu_core]
> ```
>
> v3: Deprecate the legacy cache events that aren't shown in the
>     previous perf list to avoid the perf list output being too verbose.
>
> v2: Additional details to the cover letter. Credit to Vince Weaver
>     added to the commit message for the event details. Additional
>     patches to clean up perf_pmu new_alias by removing an unused term
>     scanner argument and avoid stdio usage.
>     https://lore.kernel.org/lkml/20250828163225.3839073-1-irogers@xxxxxxxxxx/
>
> v1: https://lore.kernel.org/lkml/20250828064231.1762997-1-irogers@xxxxxxxxxx/
>
> Ian Rogers (15):
>   perf parse-events: Fix legacy cache events if event is duplicated in a
>     PMU
>   perf perf_api_probe: Avoid scanning all PMUs, try software PMU first
>   perf record: Skip don't fail for events that don't open
>   perf jevents: Support copying the source json files to OUTPUT
>   perf pmu: Don't eagerly parse event terms
>   perf parse-events: Remove unused FILE input argument to scanner
>   perf pmu: Use fd rather than FILE from new_alias
>   perf pmu: Factor term parsing into a perf_event_attr into a helper
>   perf parse-events: Add terms for legacy hardware and cache config
>     values
>   perf jevents: Add legacy json terms and default_core event table
>     helper
>   perf pmu: Add and use legacy_terms in alias information
>   perf jevents: Add legacy-hardware and legacy-cache json
>   perf print-events: Remove print_hwcache_events
>   perf print-events: Remove print_symbol_events
>   perf parse-events: Remove hard coded legacy hardware and cache parsing
>
>  tools/perf/Makefile.perf                      |   21 +-
>  tools/perf/arch/x86/util/intel-pt.c           |    2 +-
>  tools/perf/builtin-list.c                     |   34 +-
>  tools/perf/builtin-record.c                   |   89 +-
>  tools/perf/pmu-events/Build                   |   24 +-
>  .../arch/common/common/legacy-hardware.json   |   72 +
>  tools/perf/pmu-events/empty-pmu-events.c      | 2763 ++++++++++++++++-
>  tools/perf/pmu-events/jevents.py              |   24 +
>  tools/perf/pmu-events/make_legacy_cache.py    |  129 +
>  tools/perf/pmu-events/pmu-events.h            |    1 +
>  tools/perf/tests/parse-events.c               |    2 +-
>  tools/perf/tests/pmu-events.c                 |   24 +-
>  tools/perf/tests/pmu.c                        |    3 +-
>  tools/perf/util/parse-events.c                |  283 +-
>  tools/perf/util/parse-events.h                |   16 +-
>  tools/perf/util/parse-events.l                |   54 +-
>  tools/perf/util/parse-events.y                |  114 +-
>  tools/perf/util/perf_api_probe.c              |   27 +-
>  tools/perf/util/pmu.c                         |  302 +-
>  tools/perf/util/print-events.c                |  112 -
>  tools/perf/util/print-events.h                |    4 -
>  21 files changed, 3330 insertions(+), 770 deletions(-)
>  create mode 100644 tools/perf/pmu-events/arch/common/common/legacy-hardware.json
>  create mode 100755 tools/perf/pmu-events/make_legacy_cache.py
>
> --
> 2.51.0.318.gd7df087d1a-goog
>
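
To make the alias nit above concrete, here is a rough sketch of the
description formatting I have in mind.  It is only an illustration, not
code from perf or jevents.py: the Event class, the alias_of field and the
describe()/render() helpers are all made-up names, and picking
'cpu-cycles' as the actual event for the cycles pair is just an
assumption for the example.

```python
# Hypothetical sketch only: none of these names exist in perf or jevents.py.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    name: str
    desc: str = ""                  # description of an actual event
    alias_of: Optional[str] = None  # set only when this entry is an alias


def describe(ev: Event, unit: str) -> str:
    """Return the bracketed description shown under the event name."""
    if ev.alias_of is not None:
        # An alias only points at the actual event; it repeats no text.
        return f"[This event is an alias of {ev.alias_of}.]"
    return f"[{ev.desc}. Unit: {unit}]"


def render(events: List[Event], unit: str) -> str:
    """Format events in a perf-list-like layout for one PMU."""
    lines = []
    for ev in events:
        lines.append(f"  {ev.name}")
        lines.append(f"       {describe(ev, unit)}")
    return "\n".join(lines)


# Assumption: 'cpu-cycles' is treated as the actual event of the cycles pair.
EVENTS = [
    Event("branch-instructions", "Retired branch instructions"),
    Event("branches", alias_of="branch-instructions"),
    Event("cpu-cycles", "Total cycles"),
    Event("cycles", alias_of="cpu-cycles"),
]

if __name__ == "__main__":
    print(render(EVENTS, "cpu_atom"))
```

With something like this the long description is only printed once per
pair, and the alias entry stays a single short line.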