[6.6-velinux] Intel: Backport PMU support and dependency for CWF platform #64
Open: x56Jason wants to merge 73 commits into openvelinux:6.6-velinux from openvelinux:cwf-pmu-6.6
Conversation
commit d4b5694 upstream. From PMU's perspective, the SPR/GNR server has a similar uarch to the ADL/MTL client p-core. Many functions are shared. However, the shared function name uses the abbreviation of the server product code name, rather than the common uarch code name. Rename these internal shared functions by the common uarch name. Intel-SIG: commit d4b5694 perf/x86/intel: Use the common uarch name for the shared functions Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 0ba0c03 upstream. The SPR and ADL p-core have a similar uarch. Most of the initialization code can be shared. Factor out intel_pmu_init_glc() for the common initialization code. The common part of the ADL p-core will be replaced by the later patch. Intel-SIG: commit 0ba0c03 perf/x86/intel: Factor out the initialization code for SPR Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit d87d221 upstream. From PMU's perspective, the ADL e-core and newer SRF/GRR have a similar uarch. Most of the initialization code can be shared. Factor out intel_pmu_init_grt() for the common initialization code. The common part of the ADL e-core will be replaced by the later patch. Intel-SIG: commit d87d221 perf/x86/intel: Factor out the initialization code for ADL e-core Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 299a5fc upstream. Use the intel_pmu_init_glc() and intel_pmu_init_grt() to replace the duplicate code for ADL. The current code already checks the PERF_X86_EVENT_TOPDOWN flag before invoking the Topdown metrics functions. (The PERF_X86_EVENT_TOPDOWN flag is to indicate the Topdown metric feature, which is only available for the p-core.) Drop the unnecessary adl_set_topdown_event_period() and adl_update_topdown_event(). Intel-SIG: commit 299a5fc perf/x86/intel: Apply the common initialization code for ADL Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit b0560bf upstream. There is a fairly long list of grievances about the current code. The main beefs:

1. hybrid_big_small assumes that the *HARDWARE* (CPUID) provided core types are a bitmap. They are not. If Intel happened to make a core type of 0xff, hilarity would ensue.

2. adl_get_hybrid_cpu_type() is utterly inscrutable. There are precisely zero comments and zero changelog about what it is attempting to do.

According to Kan, adl_get_hybrid_cpu_type() is there because some Alder Lake (ADL) CPUs can do some silly things. Some ADL models are *supposed* to be hybrid CPUs with big and little cores, but there are some SKUs that only have big cores. CPUID(0x1a) on those CPUs does not say that the CPUs are big cores; it apparently just returns 0x0. That confuses perf, which expects to see either 0x40 (Core) or 0x20 (Atom). The perf workaround is to watch for a CPU core saying it is type 0x0. If that happens on an Alder Lake, it calls x86_pmu.get_hybrid_cpu_type() and just assumes that the core is a Core (0x40) CPU.

To fix up the mess, separate out the CPU types and the 'pmu' types. This allows 'hybrid_pmu_type' bitmaps without worrying that some future CPU type will set multiple bits. Since the types are now separate, add a function to glue them back together again, with an actual comment on the situation in the glue function (find_hybrid_pmu_for_cpu()). Also, give ->get_hybrid_cpu_type() a real return type and make it clear that it is overriding the *CPU* type, not the PMU type. Rename cpu_type to pmu_type in the struct x86_hybrid_pmu to reflect the change.
Intel-SIG: commit b0560bf perf/x86/intel: Clean up the hybrid CPU type handling code Backport CWF PMU support and dependency Originally-by: Dave Hansen <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> [jz: fix pmu->cpu_type in hybrid_td_is_visible() due to difference between 6.6 stable branch and vanilla upstream] Signed-off-by: Jason Zeng <[email protected]>
commit 97588df upstream. The current hybrid initialization code isn't well organized and is hard to read. Factor out intel_pmu_init_hybrid() to do the common setup for each hybrid PMU. The PMU-specific capabilities will be updated later, via either hard-coded values (ADL) or CPUID hybrid enumeration (MTL). Split the ADL and MTL initialization code, since they have different uarches. The hard-coded PMU capabilities are not required for MTL either; they can be enumerated by the new leaf 0x23 and the IA32_PERF_CAPABILITIES MSR. The hybrid enumeration of the IA32_PERF_CAPABILITIES MSR is broken on MTL, so use the default value instead. Intel-SIG: commit 97588df perf/x86/intel: Add common intel_pmu_init_hybrid() Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> [jz: resolve conflicts in update_pmu_cap() due to following commit 47a973fd7563 ("perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF") already been backported by stable branch] Signed-off-by: Jason Zeng <[email protected]>
commit 950ecdc upstream. Unnecessary multiplexing is triggered when running an "instructions" event on an MTL:

perf stat -e cpu_core/instructions/,cpu_core/instructions/ -a sleep 1

Performance counter stats for 'system wide':
115,489,000 cpu_core/instructions/ (50.02%)
127,433,777 cpu_core/instructions/ (49.98%)
1.002294504 seconds time elapsed

Linux architectural perf events, e.g., cycles and instructions, usually have dedicated fixed counters. These events also have equivalent events which can be used in the general-purpose counters. The counters are precious. In intel_pmu_check_event_constraints(), perf checks/extends the event constraints of these events, so these events can utilize both fixed counters and general-purpose counters.

The cleanup commit 97588df ("perf/x86/intel: Add common intel_pmu_init_hybrid()") forgot to add intel_pmu_check_event_constraints() into update_pmu_cap(), so the architectural perf events cannot utilize the general-purpose counters. The code to check and update the counters, event constraints and extra_regs is the same among hybrid systems. Move intel_pmu_check_hybrid_pmus() to init_hybrid_pmu(), and remove the duplicate check in update_pmu_cap().

Intel-SIG: commit 950ecdc perf/x86/intel: Fix broken fixed event constraints extension Backport CWF PMU support and dependency Fixes: 97588df ("perf/x86/intel: Add common intel_pmu_init_hybrid()") Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> [jz: resolve context conflicts in update_pmu_cap() due to following commit 47a973fd7563 ("perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF") already been backported by stable branch] Signed-off-by: Jason Zeng <[email protected]>
commit 76db7aa upstream. Sync the new sample type for the branch counters feature. Intel-SIG: commit 76db7aa tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tinghao Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit e8df9d9 upstream. When running the perf-stat command on an Intel hybrid platform, perf-stat reports the following errors:

sudo taskset -c 7 ./perf stat -vvvv -e cpu_atom/instructions/ sleep 1
Opening: cpu/cycles/:HG
------------------------------------------------------------
perf_event_attr:
  type     0 (PERF_TYPE_HARDWARE)
  config   0xa00000000
  disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -16
Performance counter stats for 'sleep 1':
<not counted> cpu_atom/instructions/

It looks like the cpu_atom/instructions/ event can't be enabled on the atom PMU even when the process is pinned on an atom core. Investigation shows that the exclusive_event_init() helper always returns an -EBUSY error during perf event creation. That's strange, since the atom PMU should not be an exclusive PMU. Further investigation shows the issue was introduced by commit: 97588df ("perf/x86/intel: Add common intel_pmu_init_hybrid()"). The commit originally intends to clear the bit PERF_PMU_CAP_AUX_OUTPUT from the PMU capabilities if intel_cap.pebs_output_pt_available is not set, but it incorrectly uses an 'or' operation, which leads to all PMU capability bits being set to 1 except PERF_PMU_CAP_AUX_OUTPUT. With this fix applied on Intel hybrid platforms, the observed issues disappear.

Intel-SIG: commit e8df9d9 perf/x86/intel: Correct incorrect 'or' operation for PMU capabilities Backport CWF PMU support and dependency Fixes: 97588df ("perf/x86/intel: Add common intel_pmu_init_hybrid()") Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 72b8b94 upstream. Sort header files alphabetically. Intel-SIG: commit 72b8b94 powercap: intel_rapl: Sort header files Backport CWF PMU support and dependency Signed-off-by: Zhang Rui <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 575024a upstream. Introduce two new APIs, rapl_package_add_pmu()/rapl_package_remove_pmu(). The RAPL driver can invoke these APIs to expose its supported energy counters via a perf PMU. The new RAPL PMU is fully compatible with the current MSR RAPL PMU, including using the same PMU name and event name/id/unit/scale, etc. For example, use the command below to get the energy consumption, if the power/energy-pkg/ and power/energy-ram/ events are available in the "perf list" output:

perf stat -e power/energy-pkg/ -e power/energy-ram/ FOO

This does not introduce any conflict because TPMI RAPL is currently the only user of these APIs, and it never co-exists with MSR RAPL. Note that RAPL packages can be probed/removed dynamically, and the events supported by each TPMI RAPL device can differ. Thus the RAPL PMU support is done on demand, which means:

1. The PMU is registered only if it is needed by a RAPL package. PMU events for unsupported counters are not exposed.

2. The PMU is unregistered and re-registered when a newly probed RAPL package supports counters that the current PMU does not. For example, on a dual-package system using TPMI RAPL, it is possible that Package 1 behaves as the TPMI domain root and supports the Psys domain. In this case, register the PMU without the Psys event when probing Package 0, and re-register the PMU with the Psys event when probing Package 1.

3. The PMU is unregistered when no registered RAPL package needs it.

Intel-SIG: commit 575024a powercap: intel_rapl: Introduce APIs for PMU support Backport CWF PMU support and dependency Signed-off-by: Zhang Rui <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 963a9ad upstream. Enable RAPL PMU support for TPMI RAPL driver. Intel-SIG: commit 963a9ad powercap: intel_rapl_tpmi: Enable PMU support Backport CWF PMU support and dependency Signed-off-by: Zhang Rui <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 0007f39 upstream. The unit control address of some CXL units may be wrongly calculated under some configurations on an EMR machine. The current implementation only saves the unit control address of the units from the first die, and of the first unit of the rest of the dies. Perf assumed that the units from the other dies have the same offset as the first die, so the unit control address of the rest of the units could be calculated. However, the assumption is wrong, especially for the CXL units.

Introduce an RB tree for each uncore type to save the unit control address and three kinds of ID information (unit ID, PMU ID, and die ID) for all units. The unit ID is a physical ID of a unit. The PMU ID is a logical ID assigned to a unit; the logical IDs start from 0 and must be contiguous, and the physical ID and the logical ID are a 1:1 mapping. The units with the same physical ID in different dies share the same PMU. The die ID indicates which die a unit belongs to.

The RB tree can be searched by two different keys (unit ID, or PMU ID + die ID). During the RB tree setup, the unit ID is used as a key to look up the RB tree, and perf can create/assign a proper PMU ID to the unit. Later, after the RB tree is set up, PMU ID + die ID is used as a key to look up the RB tree to fill the cpumask of a PMU. The latter is used more frequently, so PMU ID + die ID is what unit_less() compares. uncore_find_unit() has to be O(N), but the RB tree setup only occurs once during driver load time, so it should be acceptable.

Compared with the current implementation, more space is required to save the information of all units, but the extra size should be acceptable. For example, on EMR there are at most 221 units; for a 2-socket machine, the extra space is at most ~6KB.
Intel-SIG: commit 0007f39 perf/x86/uncore: Save the unit control address of all units Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit c74443d upstream. The cpumask of some uncore units, e.g., CXL uncore units, may be wrong under some configurations. Perf may access an uncore counter of a non-existent uncore unit. The uncore driver assumes that all uncore units are symmetric among dies. A global cpumask is shared among all uncore PMUs. However, some CXL uncore units may only be available on some dies. A per PMU cpumask is introduced to track the CPU mask of this PMU. The driver searches the unit control RB tree to check whether the PMU is available on a given die, and updates the per PMU cpumask accordingly. Intel-SIG: commit c74443d perf/x86/uncore: Support per PMU cpumask Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Yunying Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 585463f upstream. The box_ids only save the unit ID for the first die. If a unit, e.g., a CXL unit, doesn't exist in the first die, its unit ID cannot be retrieved. The unit control RB tree also stores the unit ID information, so retrieve the unit ID from the unit control RB tree. Intel-SIG: commit 585463f perf/x86/uncore: Retrieve the unit ID from the unit control RB tree Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Yunying Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 80580da upstream. The unit control RB tree has the unit control and unit ID information for all the units. Use it to replace the box_ctls/mmio_offsets to get an accurate unit control address for MMIO uncore units. Intel-SIG: commit 80580da perf/x86/uncore: Apply the unit control RB tree to MMIO uncore units Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Yunying Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit b1d9ea2 upstream. The unit control RB tree has the unit control and unit ID information for all the MSR units. Use them to replace the box_ctl and uncore_msr_box_ctl() to get an accurate unit control address for MSR uncore units. Add intel_generic_uncore_assign_hw_event(), which utilizes the accurate unit control address from the unit control RB tree to calculate the config_base and event_base. The unit id related information should be retrieved from the unit control RB tree as well. Intel-SIG: commit b1d9ea2 perf/x86/uncore: Apply the unit control RB tree to MSR uncore units Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Yunying Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit f76a842 upstream. The unit control RB tree has the unit control and unit ID information for all the PCI units. Use them to replace the box_ctls/pci_offsets to get an accurate unit control address for PCI uncore units. The UPI/M3UPI units in the discovery table are ignored. Please see the commit 65248a9 ("perf/x86/uncore: Add a quirk for UPI on SPR"). Manually allocate a unit control RB tree for UPI/M3UPI. Add cleanup_extra_boxes to release such manual allocation. Intel-SIG: commit f76a842 perf/x86/uncore: Apply the unit control RB tree to PCI uncore units Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Yunying Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 15a4bd5 upstream. The unit control and ID information are retrieved from the unit control RB tree. No one uses the old structure anymore. Remove them. Intel-SIG: commit 15a4bd5 perf/x86/uncore: Cleanup unused unit structure Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Yunying Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit f8a86a9 upstream. Unknown uncore PMON types can be found in both SPR and EMR with HBM or CXL. $ls /sys/devices/ | grep type uncore_type_12_16 uncore_type_12_18 uncore_type_12_2 uncore_type_12_4 uncore_type_12_6 uncore_type_12_8 uncore_type_13_17 uncore_type_13_19 uncore_type_13_3 uncore_type_13_5 uncore_type_13_7 uncore_type_13_9 The unknown PMON types are HBM and CXL PMON. Except for the name, the other information regarding the HBM and CXL PMON counters can be retrieved via the discovery table. Add them into the uncores tables for SPR and EMR. The event config registers for all CXL related units are 8-byte apart. Add SPR_UNCORE_MMIO_OFFS8_COMMON_FORMAT to specially handle it. Intel-SIG: commit f8a86a9 perf/x86/intel/uncore: Support HBM and CXL PMON counters Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Yunying Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit a23eb2f upstream. The current perf assumes that the counters that support PEBS are contiguous. But it's not guaranteed with the new leaf 0x23 introduced. The counters are enumerated with a counter mask. There may be holes in the counter mask for future platforms or in a virtualization environment. Store the PEBS event mask rather than the maximum number of PEBS counters in the x86 PMU structures. Intel-SIG: commit a23eb2f perf/x86/intel: Support the PEBS event mask Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 722e42e upstream. The current perf assumes that both GP and fixed counters are contiguous, but that's not guaranteed on newer Intel platforms or in a virtualization environment. Use a counter mask to replace the number of counters for both the GP and the fixed counters. For the other architectures or older platforms which don't support a counter mask, use GENMASK_ULL(num_counter - 1, 0) instead; there is no functional change for them. The interface to KVM is not changed: the number of counters is still passed to KVM, and can be updated later separately. Intel-SIG: commit 722e42e perf/x86: Support counter mask Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [Dapeng Mi: resolve conflict and amend commit log] Signed-off-by: Dapeng Mi <[email protected]> [jz: resolve conflicts in update_pmu_cap() due to following commit 47a973fd7563 ("perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF") already been backported by stable branch] Signed-off-by: Jason Zeng <[email protected]>
commit a932aa0 upstream. From PMU's perspective, Lunar Lake and Arrow Lake are similar to the previous generation Meteor Lake. Both are hybrid platforms, with e-core and p-core. The key differences include: - The e-core supports 3 new fixed counters - The p-core supports an updated PEBS Data Source format - More GP counters (Updated event constraint table) - New Architectural performance monitoring V6 (New Perfmon MSRs aliasing, umask2, eq). - New PEBS format V6 (Counters Snapshotting group) - New RDPMC metrics clear mode The legacy features, the 3 new fixed counters and updated event constraint table are enabled in this patch. The new PEBS data source format, the architectural performance monitoring V6, the PEBS format V6, and the new RDPMC metrics clear mode are supported in the following patches. Intel-SIG: commit a932aa0 perf/x86: Add Lunar Lake and Arrow Lake support Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [Dapeng Mi: resolve conflict and amend commit log] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 0902624 upstream. The model-specific pebs_latency_data functions of ADL and MTL use the "small" as a postfix to indicate the e-core. The postfix is too generic for a model-specific function. It cannot provide useful information that can directly map it to a specific uarch, which can facilitate the development and maintenance. Use the abbr of the uarch to rename the model-specific functions. Intel-SIG: commit 0902624 perf/x86/intel: Rename model-specific pebs_latency_data functions Backport CWF PMU support and dependency Suggested-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 608f697 upstream. A new PEBS data source format is introduced for the p-core of Lunar Lake. The data source field is extended to 8 bits with new encodings. A new layout is introduced into the union intel_x86_pebs_dse. Introduce the lnl_latency_data() to parse the new format. Enlarge the pebs_data_source[] accordingly to include new encodings. Only the mem load and the mem store events can generate the data source. Introduce INTEL_HYBRID_LDLAT_CONSTRAINT and INTEL_HYBRID_STLAT_CONSTRAINT to mark them. Add two new bits for the new cache-related data src, L2_MHB and MSC. The L2_MHB is short for L2 Miss Handling Buffer, which is similar to LFB (Line Fill Buffer), but to track the L2 Cache misses. The MSC stands for the memory-side cache. Intel-SIG: commit 608f697 perf/x86/intel: Support new data source for Lunar Lake Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Zeng <[email protected]>
commit e8fb5d6 upstream. Different vendors may support different fields in EVENTSEL MSR, such as Intel would introduce new fields umask2 and eq bits in EVENTSEL MSR since Perfmon version 6. However, a fixed mask X86_RAW_EVENT_MASK is used to filter the attr.config. Introduce a new config_mask to record the real supported EVENTSEL bitmask. Only apply it to the existing code now. No functional change. Intel-SIG: commit e8fb5d6 perf/x86: Add config_mask to represent EVENTSEL bitmask Backport CWF PMU support and dependency Co-developed-by: Dapeng Mi <[email protected]> Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit dce0c74 upstream. Two new fields (the unit mask2, and the equal flag) are added in the IA32_PERFEVTSELx MSRs. They can be enumerated by the CPUID.23H.0.EBX. Update the config_mask in x86_pmu and x86_hybrid_pmu for the true layout of the PERFEVTSEL. Expose the new formats into sysfs if they are available. The umask extension reuses the same format attr name "umask" as the previous umask. Add umask2_show to determine/display the correct format for the current machine. Intel-SIG: commit dce0c74 perf/x86/intel: Support PERFEVTSEL extension Backport CWF PMU support and dependency Co-developed-by: Dapeng Mi <[email protected]> Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [Dapeng Mi: resolve conflict and amend commit log] Signed-off-by: Dapeng Mi <[email protected]> [jz: due to following commit 47a973fd75639 ("perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF") has already been backported by stable branch, - resolve context conflict in update_pmu_cap(). - definition for ARCH_PERFMON_EXT_UMASK2 and ARCH_PERFMON_EXT_EQ are not added because upstream commit 47a973fd75639 have removed them] Signed-off-by: Jason Zeng <[email protected]>
commit 149fd47 upstream. The architectural performance monitoring V6 supports a new range of counters' MSRs in the 19xxH address range. They include all the GP counter MSRs, the GP control MSRs, and the fixed counter MSRs. The step between each sibling counter is 4. Add intel_pmu_addr_offset() to calculate the correct offset. Add fixedctr in struct x86_pmu to store the address of the fixed counter 0. It can be used to calculate the rest of the fixed counters. The MSR address of the fixed counter control is not changed. Intel-SIG: commit 149fd47 perf/x86/intel: Support Perfmon MSRs aliasing Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [Dapeng Mi: resolve conflict and amend commit log] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit fa0c1c9 upstream. Currently, Sapphire Rapids and Granite Rapids share the same PMU name, sapphire_rapids, because from the kernel's perspective GNR is similar to SPR; the only key difference is that they support different extra MSRs, so the code path and the PMU name are shared. However, from end users' perspective they are quite different. Besides the extra MSRs, GNR has a newer PEBS format, supports Retire Latency, supports the new CPUID enumeration architecture, doesn't require the load-latency AUX event, has additional TMA Level 1 Architectural Events, etc. The differences can be enumerated by CPUID or the PERF_CAPABILITIES MSR, but they weren't reflected in the model-specific kernel setup, so it is worth having a distinct PMU name for GNR. Intel-SIG: commit fa0c1c9 perf/x86/intel: Add a distinct name for Granite Rapids Backport CWF PMU support and dependency Fixes: a6742cb ("perf/x86/intel: Fix the FRONTEND encoding on GNR and MTL") Suggested-by: Ahmad Yasin <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] [ Yunying Sun: amend commit log ] Signed-off-by: Yunying Sun <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit a5f5e1238f4ff919816f69e77d2537a48911767b upstream.
The code below always unconditionally clears other status bits, such
as the perf metrics overflow bit, once the PEBS buffer overflows:
status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
This is incorrect. The perf metrics overflow bit should be cleared
only when fixed counter 3 is in the PEBS counter group. Otherwise, a
perf metrics overflow could go unhandled.
Intel-SIG: commit a5f5e1238f4f perf/x86/intel: Don't clear perf metrics overflow bit unconditionally
Backport CWF PMU support and dependency
Closes: https://lore.kernel.org/all/[email protected]/
Fixes: 7b2c05a ("perf/x86/intel: Generic support for hardware TopDown metrics")
Signed-off-by: Dapeng Mi <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dapeng Mi <[email protected]>
Signed-off-by: Jason Zeng <[email protected]>
commit 48d66c89dce1e3687174608a5f5c31d5961a9916 upstream. From the PMU's perspective, Clearwater Forest is similar to the previous generation Sierra Forest. The key differences are the ARCH PEBS feature and the new added 3 fixed counters for topdown L1 metrics events. The ARCH PEBS is supported in the following patches. This patch provides support for basic perfmon features and 3 new added fixed counters. Intel-SIG: commit 48d66c89dce1 perf/x86/intel: Add PMU support for Clearwater Forest Backport CWF PMU support and dependency Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [Dapeng Mi: resolve conflict and amend commit log] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 25c623f41438fafc6f63c45e2e141d7bcff78299 upstream. CPUID archPerfmonExt (0x23) leaves are supported to enumerate CPU level's PMU capabilities on non-hybrid processors as well. This patch supports to parse archPerfmonExt leaves on non-hybrid processors. Architectural PEBS leverages archPerfmonExt sub-leaves 0x4 and 0x5 to enumerate the PEBS capabilities as well. This patch is a precursor of the subsequent arch-PEBS enabling patches. Intel-SIG: commit 25c623f41438 perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs Backport CWF PMU support and dependency Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 75aea4b0656ead0facd13d2aae4cb77326e53d2f upstream.
A warning in intel_pmu_lbr_counters_reorder() may be triggered by the
perf command below.
perf record -e "{cpu-clock,cycles/call-graph="lbr"/}" -- sleep 1
It's because the group is mistakenly treated as a branch counters group.
The hw.flags of the leader is used to determine whether a group is a
branch counters group. However, hw.flags is only valid for a hardware
event. The field that stores the flags is a union type; for a software
event, it holds a hrtimer. The corresponding bit may therefore appear
set if the leader is a software event.
For a branch counters group and other groups that have a group flag
(e.g., topdown, PEBS counters snapshotting, and ACR), the leader must
be an X86 event. Check for an X86 event before checking the flag.
The patch only fixes the issue for the branch counter group.
The following patch will fix the other groups.
There may be an alternative way to fix the issue by moving the hw.flags
out of the union type. It should work for now. But it's still possible
that the flags will be used by other types of events later. As long as
that type of event is used as a leader, a similar issue will be
triggered. So the alternative way is dropped.
Intel-SIG: commit 75aea4b0656e perf/x86/intel: Only check the group flag for X86 leader
Backport CWF PMU support and dependency
Fixes: 3374491 ("perf/x86/intel: Support branch counters logging")
Closes: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Luo Gengkun <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dapeng Mi <[email protected]>
[jz: the prototype of is_x86_event() already changed in upstream commit ec980e4facef
("perf/x86/intel: Support auto counter reload") which is already merged]
Signed-off-by: Jason Zeng <[email protected]>
commit e9988ad7b1744991118ac348a804f9395368a284 upstream.
The PEBS counters snapshotting group also requires a group flag in the
leader. The leader must be an X86 event.
Intel-SIG: commit e9988ad7b174 perf/x86/intel: Check the X86 leader for pebs_counter_event_group
Backport CWF PMU support and dependency
Fixes: e02e9b0374c3 ("perf/x86/intel: Support PEBS counters snapshotting")
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dapeng Mi <[email protected]>
Signed-off-by: Jason Zeng <[email protected]>
commit efd448540e6243dbdaf0a7e1bcf49734e73f3c93 upstream. The auto counter reload group also requires a group flag in the leader. The leader must be an X86 event. Intel-SIG: commit efd448540e62 perf/x86/intel: Check the X86 leader for ACR group Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 3e830f657f69ab6a4822d72ec2f364c6d51beef8 upstream. The current is_x86_event has to go through the hybrid_pmus list to find the matched pmu, then check if it's a X86 PMU and a X86 event. It's not necessary. The X86 PMU has a unique type ID on a non-hybrid machine, and a unique capability type. They are good enough to do the check. Intel-SIG: commit 3e830f657f69 perf/x86: Optimize the is_x86_event Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 8ec9497 upstream. To pick up changes from: 608f697 perf/x86/intel: Support new data source for Lunar Lake This should be used to beautify perf syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Intel-SIG: commit 8ec9497 tools/include: Sync uapi/linux/perf.h with the kernel sources Backport CWF PMU support and dependency Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: "Liang, Kan" <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
…l sources
commit ae62977331fcbf5c9a4260c88d9f94450db2d99a upstream.
To pick up the changes in:
c53e14f1ea4a8f8d perf: Extend per event callchain limit to branch stack
Addressing this perf tools build warning:
Warning: Kernel ABI header differences:
diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
Please see tools/include/uapi/README for further details.
Intel-SIG: commit ae62977331fc tools headers: Update the uapi/linux/perf_event.h copy with the kernel sources
Backport CWF PMU support and dependency
Acked-by: Ingo Molnar <[email protected]>
Tested-by: Venkat Rao Bagalkote <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Jason Zeng <[email protected]>
commit 44889ff67cee7b9ee2d305690ce7a5488b137a66 upstream.
When applying a recent commit to the <uapi/linux/perf_event.h> header I noticed that we have accumulated quite a bit of historic noise in this header, so do a bit of spring cleaning:
- Define bitfields in a vertically aligned fashion, like perf_event_mmap_page::capabilities already does. This makes it easier to see the distribution and sizing of bits within a word, at a glance. The following is much more readable:
__u64 cap_bit0 : 1,
cap_bit0_is_deprecated : 1,
cap_user_rdpmc : 1,
cap_user_time : 1,
cap_user_time_zero : 1,
cap_user_time_short : 1,
cap_____res : 58;
Than:
__u64 cap_bit0:1, cap_bit0_is_deprecated:1, cap_user_rdpmc:1, cap_user_time:1, cap_user_time_zero:1, cap_user_time_short:1, cap_____res:58;
So convert all bitfield definitions from the latter style to the former style.
- Fix typos and grammar
- Fix capitalization
- Remove whitespace noise
- Harmonize the definitions of various generations and groups of PERF_MEM_ ABI values.
- Vertically align all definitions and assignments to the same column (48), as the first definition (enum perf_type_id), throughout the entire header.
- And in general make the code and comments to be more in sync with each other and to be more readable overall.
No change in functionality. Copy the changes over to tools/include/uapi/linux/perf_event.h.
Intel-SIG: commit 44889ff67cee perf/uapi: Clean up <uapi/linux/perf_event.h> a bit
Backport CWF PMU support and dependency
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Zeng <[email protected]>
commit cf002dafedd06241175e4dbce39ba90a4b75822c upstream. Starting from the Panther Lake, the discovery table mechanism is also supported in client platforms. The difference is that the portal of the global discovery table is retrieved from an MSR. The layout of discovery tables are the same as the server platforms. Factor out __parse_discovery_table() to parse discover tables. The uncore PMON is Die scope. Need to parse the discovery tables for each die. Intel-SIG: commit cf002dafedd0 perf/x86/intel/uncore: Support MSR portal for discovery tables Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dapeng Mi <[email protected]> Link: https://lore.kernel.org/r/[email protected] [Dapeng Mi: Replace rdmsrq_safe_on_cpu() with rdmsrl_safe_on_cpu()] Signed-off-by: Dapeng Mi <[email protected]> [jz: use rdmsrl_safe_on_cpu() since rdmsrq_safe_on_cpu() not available yet] Signed-off-by: Jason Zeng <[email protected]>
commit fca24bf2b6b619770d7f1222c0284791d7766239 upstream. For a server platform, the MMIO map size is always 0x4000. However, a client platform may have a smaller map size. Make the map size customizable. Intel-SIG: commit fca24bf2b6b6 perf/x86/intel/uncore: Support customized MMIO map size Backport CWF PMU support and dependency Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dapeng Mi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit b0823d5fbacb1c551d793cbfe7af24e0d1fa45ed upstream.
The perf_fuzzer found a hard-lockup crash on a RaptorLake machine:
Oops: general protection fault, maybe for address 0xffff89aeceab400: 0000
CPU: 23 UID: 0 PID: 0 Comm: swapper/23
Tainted: [W]=WARN
Hardware name: Dell Inc. Precision 9660/0VJ762
RIP: 0010:native_read_pmc+0x7/0x40
Code: cc e8 8d a9 01 00 48 89 03 5b cd cc cc cc cc 0f 1f ...
RSP: 000:fffb03100273de8 EFLAGS: 00010046
....
Call Trace:
<TASK>
icl_update_topdown_event+0x165/0x190
? ktime_get+0x38/0xd0
intel_pmu_read_event+0xf9/0x210
__perf_event_read+0xf9/0x210
CPUs 16-23 are E-core CPUs that don't support the perf metrics feature.
The icl_update_topdown_event() should not be invoked on these CPUs.
It's a regression of commit:
f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
The bug introduced by that commit is that the is_topdown_event() function
is mistakenly used to replace the is_topdown_count() call to check if the
topdown functions for the perf metrics feature should be invoked.
Fix it.
Intel-SIG: commit b0823d5fbacb perf/x86/intel: Fix crash in icl_update_topdown_event()
Backport CWF PMU support and dependency
Fixes: f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
Closes: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Vince Weaver <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Vince Weaver <[email protected]>
Cc: [email protected] # v6.15+
Link: https://lore.kernel.org/r/[email protected]
[ omitted PEBS check ]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Jason Zeng <[email protected]>
commit 99bcd91fabada0dbb1d5f0de44532d8008db93c6 upstream.
Currently, using PEBS-via-PT with a sample frequency instead of a
sample period causes a segfault. For example:
BUG: kernel NULL pointer dereference, address: 0000000000000195
<NMI>
? __die_body.cold+0x19/0x27
? page_fault_oops+0xca/0x290
? exc_page_fault+0x7e/0x1b0
? asm_exc_page_fault+0x26/0x30
? intel_pmu_pebs_event_update_no_drain+0x40/0x60
? intel_pmu_pebs_event_update_no_drain+0x32/0x60
intel_pmu_drain_pebs_icl+0x333/0x350
handle_pmi_common+0x272/0x3c0
intel_pmu_handle_irq+0x10a/0x2e0
perf_event_nmi_handler+0x2a/0x50
That happens because intel_pmu_pebs_event_update_no_drain() assumes all the
pebs_enabled bits represent counter indexes, which is not always the case.
In this particular case, bits 60 and 61 are set for PEBS-via-PT purposes.
The behaviour of PEBS-via-PT with sample frequency is questionable because
although a PMI is generated (PEBS_PMI_AFTER_EACH_RECORD), the period is not
adjusted anyway.
Putting that aside, fix intel_pmu_pebs_event_update_no_drain() by passing
the mask of counter bits instead of 'size'. Note, prior to the Fixes
commit, 'size' would be limited to the maximum counter index, so the issue
was not hit.
Intel-SIG: commit 99bcd91fabad perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq
Backport CWF PMU support and dependency
Fixes: 722e42e ("perf/x86: Support counter mask")
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Zeng <[email protected]>
…rs-snapshotting
commit 7da9960b59fb7e590eb8538c9428db55a4ea2d23 upstream.
A counter going backwards may be observed in the PMI handler when
counters-snapshotting some non-precise events in freq mode.
For non-precise events, it's possible that counters-snapshotting
records a positive value for an overflowed PEBS event, and the HW
auto-reload mechanism then resets the counter to 0 immediately,
because pebs_event_reset is cleared in freq mode, which doesn't set
PERF_X86_EVENT_AUTO_RELOAD.
In the PMI handler, 0 is then read rather than the positive value
recorded in the counters-snapshotting record.
The counters-snapshotting case has to be handled specially. Since the
event value has already been updated when processing the
counters-snapshotting record, only the new period needs to be set for
the counter, via x86_pmu_set_period().
Intel-SIG: commit 7da9960b59fb perf/x86/intel/ds: Fix counter backwards of non-precise events counters-snapshotting
Backport CWF PMU support and dependency
Fixes: e02e9b0374c3 ("perf/x86/intel: Support PEBS counters snapshotting")
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jason Zeng <[email protected]>
commit 782cffeec9ad96daa64ffb2d527b2a052fb02552 upstream. According to the latest event list, update the event constraint tables for Lion Cove core. The general rule (the event codes < 0x90 are restricted to counters 0-3.) has been removed. There is no restriction for most of the performance monitoring events. Intel-SIG: commit 782cffeec9ad perf/x86/intel: Fix event constraints for LNC Backport CWF PMU support and dependency Fixes: a932aa0 ("perf/x86: Add Lunar Lake and Arrow Lake support") Reported-by: Amiri Khalil <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Zeng <[email protected]>
commit aa5d2ca7c179c40669edb5e96d931bf9828dea3d upstream. The released OCR and FRONTEND events utilized more bits on Lunar Lake p-core. The corresponding mask in the extra_regs has to be extended to unblock the extra bits. Add a dedicated intel_lnc_extra_regs. Intel-SIG: commit aa5d2ca7c179 perf/x86/intel: Fix bitmask of OCR and FRONTEND events for LNC Backport CWF PMU support and dependency Fixes: a932aa0 ("perf/x86: Add Lunar Lake and Arrow Lake support") Reported-by: Andi Kleen <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Zeng <[email protected]>
…fig_acr()
commit 86aa94cd50b138be0dd872b0779fa3036e641881 upstream.
The MSR offset calculations in intel_pmu_config_acr() are buggy.
To calculate fixed counter MSR addresses in intel_pmu_config_acr(),
the HW counter index "idx" is subtracted by INTEL_PMC_IDX_FIXED.
This leads to the ACR mask value of fixed counters to be incorrectly
saved to the positions of GP counters in acr_cfg_b[], e.g.
For fixed counter 0, its ACR counter mask should be saved to
acr_cfg_b[32], but it's saved to acr_cfg_b[0] incorrectly.
Fix this issue.
[ mingo: Clarified & improved the changelog. ]
Intel-SIG: commit 86aa94cd50b1 perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()
Backport CWF PMU support and dependency
Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Zeng <[email protected]>
commit f6d9883 upstream.
To pick up changes from:
149fd47 perf/x86/intel: Support Perfmon MSRs aliasing
21b362c x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems
4f460bf cpufreq: acpi: move MSR_K7_HWCR_CPB_DIS_BIT into msr-index.h
7ea8193 x86/cpufeatures: Add HWP highest perf change feature flag
78ce84b x86/cpufeatures: Flip the /proc/cpuinfo appearance logic
1beb348 x86/sev: Provide SVSM discovery support
This should be used to beautify x86 syscall arguments and it addresses these tools/perf build warnings:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Please see tools/include/uapi/README for details (it's in the first patch of this series).
Intel-SIG: commit f6d9883 tools/include: Sync x86 headers with the kernel sources
Backport CWF PMU support and dependency
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Signed-off-by: Namhyung Kim <[email protected]>
Conflicts:
tools/arch/x86/include/asm/cpufeatures.h
tools/arch/x86/include/asm/msr-index.h
[jz: only adapt changes in following 3 commits:
149fd47 ("perf/x86/intel: Support Perfmon MSRs aliasing")
21b362c ("x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems")
7ea8193 ("x86/cpufeatures: Add HWP highest perf change feature flag")
because other commits are not backported in velinux]
Signed-off-by: Jason Zeng <[email protected]>
commit f4138de5e41fae1a0b406f0d354a3028dc46bf1f upstream. Also fix some nearby whitespace damage while at it. Intel-SIG: commit f4138de5e41f x86/msr: Standardize on u64 in <asm/msr-index.h> Backport CWF PMU support and dependency Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Juergen Gross <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Xin Li <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 6143374c6dc874d301dc16612aed144bf4544ba upstream.
To pick up the changes from these csets:
159013a7ca18c271 ("x86/its: Enumerate Indirect Target Selection (ITS) bug")
f4138de5e41fae1a ("x86/msr: Standardize on u64 in <asm/msr-index.h>")
ec980e4facef8110 ("perf/x86/intel: Support auto counter reload")
That cause no changes to tooling as it doesn't include a new MSR to be
captured by the tools/perf/trace/beauty/tracepoints/x86_msr.sh script.
Just silences this perf build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Intel-SIG: commit 6143374c6dc8 tools arch x86: Sync the msr-index.h copy with the kernel sources
Backport CWF PMU support and dependency
Cc: Adrian Hunter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/aEtAUg83OQGx8Kay@x1
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Jason Zeng <[email protected]>
commit 43796f30507802d93ead2dc44fc9637f34671a89 upstream.
When running perf_fuzzer on PTL, sometimes the below "unchecked MSR
access error" is seen when accessing IA32_PMC_x_CFG_B MSRs.
[ 55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
[ 55.611280] Call Trace:
[ 55.611282] <TASK>
[ 55.611284] ? intel_pmu_config_acr+0x87/0x160
[ 55.611289] intel_pmu_enable_acr+0x6d/0x80
[ 55.611291] intel_pmu_enable_event+0xce/0x460
[ 55.611293] x86_pmu_start+0x78/0xb0
[ 55.611297] x86_pmu_enable+0x218/0x3a0
[ 55.611300] ? x86_pmu_enable+0x121/0x3a0
[ 55.611302] perf_pmu_enable+0x40/0x50
[ 55.611307] ctx_resched+0x19d/0x220
[ 55.611309] __perf_install_in_context+0x284/0x2f0
[ 55.611311] ? __pfx_remote_function+0x10/0x10
[ 55.611314] remote_function+0x52/0x70
[ 55.611317] ? __pfx_remote_function+0x10/0x10
[ 55.611319] generic_exec_single+0x84/0x150
[ 55.611323] smp_call_function_single+0xc5/0x1a0
[ 55.611326] ? __pfx_remote_function+0x10/0x10
[ 55.611329] perf_install_in_context+0xd1/0x1e0
[ 55.611331] ? __pfx___perf_install_in_context+0x10/0x10
[ 55.611333] __do_sys_perf_event_open+0xa76/0x1040
[ 55.611336] __x64_sys_perf_event_open+0x26/0x30
[ 55.611337] x64_sys_call+0x1d8e/0x20c0
[ 55.611339] do_syscall_64+0x4f/0x120
[ 55.611343] entry_SYSCALL_64_after_hwframe+0x76/0x7e
On PTL, GP counters 0 and 1 don't support the auto counter reload
feature, so a #GP is triggered when trying to write 1 to bit 0 of the
CFG_B MSR, which would enable auto counter reload on GP counter 0.
The root cause of this issue is that the check of the auto counter
reload (ACR) counter mask from user space in the
intel_pmu_acr_late_setup() helper is incorrect. It allows an invalid
ACR counter mask from user space to be set into hw.config1, then
written into the CFG_B MSRs, triggering the MSR access warning.
E.g., a user may create a perf event with an ACR counter mask
(config2=0xcb); with only 1 event created, "cpuc->n_events" is 1.
The correct check condition should be "i + idx >= cpuc->n_events"
instead of "i + idx > cpuc->n_events" (it looks like a typo).
Otherwise, the counter mask is traversed twice and an invalid
"cpuc->assign[1]" bit (bit 0) is set into hw.config1, causing the MSR
access error.
Besides, also check whether the events corresponding to the ACR
counter mask are ACR events; if not, filter out those counter mask
bits. If an event is not an ACR event, it could be scheduled to a HW
counter which doesn't support ACR, and it's invalid to add its counter
index to the ACR counter mask.
Furthermore, remove the WARN_ON_ONCE(), since it's easily triggered (a
user can set any invalid ACR counter mask) and the warning message
could mislead users.
Intel-SIG: commit 43796f305078 perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
Backport CWF PMU support and dependency
Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Zeng <[email protected]>
commit be6b067 upstream. Add a common define to "officially" solidify KVM's split of counters, i.e. to commit to using bits 31:0 to track general purpose counters and bits 63:32 to track fixed counters (which only Intel supports). KVM already bleeds this behavior all over common PMU code, and adding a KVM-defined macro allows clarifying that the value is a _base_, as opposed to the _flag_ that is used to access fixed PMCs via RDPMC (which perf confusingly calls INTEL_PMC_FIXED_RDPMC_BASE). No functional change intended. Intel-SIG: commit be6b067 KVM: x86/pmu: Add common define to capture fixed counters offset Backport CWF PMU support and dependency Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 0e102ce upstream. Several '_mask'-suffixed variables, such as global_ctrl_mask, are defined in the kvm_pmu structure. However, the _mask suffix is ambiguous and misleading, since it's not a real mask with positive logic; on the contrary, it represents the reserved bits of the corresponding MSRs, and these bits should not be accessed. Intel-SIG: commit 0e102ce KVM: x86/pmu: Change ambiguous _mask suffix to _rsvd in kvm_pmu Backport CWF PMU support and dependency Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Dapeng Mi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 75430c4 upstream. Magic numbers are used to manipulate the bit fields of the FIXED_CTR_CTRL MSR. This makes the code difficult to read, so use pre-defined macros to replace these magic numbers. Intel-SIG: commit 75430c4 KVM: x86/pmu: Manipulate FIXED_CTR_CTRL MSR with macros Backport CWF PMU support and dependency Signed-off-by: Dapeng Mi <[email protected]> Link: https://lore.kernel.org/r/[email protected] [sean: drop unnecessary curly braces] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
commit 2676dbf9f4fb7f6739d1207c0f1deaf63124642a ICL_FIXED_0_ADAPTIVE was missing from INTEL_FIXED_BITS_MASK; add it. With the help of this new INTEL_FIXED_BITS_MASK, intel_pmu_enable_fixed() can be optimized: the old fixed counter control bits can be unconditionally cleared with INTEL_FIXED_BITS_MASK, and the new control bits then set based on the new configuration. Intel-SIG: commit 2676dbf9f4fb perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Backport CWF PMU support and dependency Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kan Liang <[email protected]> Tested-by: Yi Lai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jason Zeng <[email protected]>
commit f49e1be19542487921e82b29004908966cb99d7c upstream. With the introduction of Perfmon v6, PMU counters can be discontinuous; for example, of the fixed counters on CWF, only fixed counters 0-3 and 5-7 are supported, and there is no fixed counter 4. To accommodate this change, archPerfmonExt CPUID (0x23) leaves were introduced to enumerate the true view of the counters bitmap. Current perf code already supports the archPerfmonExt CPUID and uses the counters bitmap to enumerate the counters the HW really supports, but x86_pmu_show_pmu_cap() still dumps only the absolute counter number instead of the true-view bitmap; it's outdated and may mislead readers. So dump the counters' true-view bitmap in x86_pmu_show_pmu_cap() and opportunistically change the dump sequence and wording. Intel-SIG: commit f49e1be19542 perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap() Backport CWF PMU support and dependency Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jason Zeng <[email protected]>
commit 4f23fc34cc68812c68c3a3dec15e26e87565f430 upstream. With commit 8ec9497 ("tools/include: Sync uapi/linux/perf.h with the kernel sources"), 'perf mem report' gives an incorrect memory access string. ... 0.02% 1 3644 L5 hit [.] 0x0000000000009b0e mlc [.] 0x00007fce43f59480 ... This occurs because, if no entry exists in mem_lvlnum, perf_mem__lvl_scnprintf will default to 'L%d, lvl', which in this case for PERF_MEM_LVLNUM_L2_MHB is 0x05. Add entries for PERF_MEM_LVLNUM_L2_MHB and PERF_MEM_LVLNUM_MSC to mem_lvlnum, so that the correct strings are printed. ... 0.02% 1 3644 L2 MHB hit [.] 0x0000000000009b0e mlc [.] 0x00007fce43f59480 ... Intel-SIG: commit 4f23fc34cc68 perf mem: Fix printing PERF_MEM_LVLNUM_{L2_MHB|MSC} Backport CWF PMU support and dependency Fixes: 8ec9497 ("tools/include: Sync uapi/linux/perf.h with the kernel sources") Suggested-by: Kan Liang <[email protected]> Signed-off-by: Thomas Falcon <[email protected]> Reviewed-by: Leo Yan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Jason Zeng <[email protected]>
x56Jason added a commit to openvelinux/kernel-intel that referenced this pull request on Nov 10, 2025
…6' into intel-6.6-velinux
== Description
This is to backport PMU core/uncore/tool upstream patches for CWF platform.
== Test
- core PMU perf counting test
```
[root@cwf linux]# tools/perf/perf stat -a sleep 1
Performance counter stats for 'system wide':
491,357.35 msec cpu-clock # 482.195 CPUs utilized
2,471 context-switches # 5.029 /sec
481 cpu-migrations # 0.979 /sec
88 page-faults # 0.179 /sec
650,502,887 cycles # 0.001 GHz
185,129,269 instructions # 0.28 insn per cycle
37,198,246 branches # 75.705 K/sec
216,984 branch-misses # 0.58% of all branches
1.019001085 seconds time elapsed
```
- core PMU perf recording tests (fixed and GP counters) pass.
```
[root@cwf linux]# tools/perf/perf record -e instructions -Iax,bx -b -c 100000 sleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.047 MB perf.data (27 samples) ]
[root@cwf linux]# tools/perf/perf record -e branches -Iax,bx -b -c 10000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.066 MB perf.data (60 samples) ]
```
- uncore devices can be seen in sysfs
```
[root@cwf linux]# ls /sys/devices/* | grep uncore
```
- uncore PMU perf counting tests pass.
```
[root@cwf linux]# tools/perf/perf stat -e uncore_upi/event=0x1/,uncore_cha/event=0x1/,uncore_imc/event=0x1/ -a sleep 1
Performance counter stats for 'system wide':
25,144,619,084 uncore_upi/event=0x1/
109,517,186,568 uncore_cha/event=0x1/
22,178,643,523 uncore_imc/event=0x1/
1.004042980 seconds time elapsed
```
- CWF specific perf event counting test pass.
```
[root@cwf linux]# tools/perf/perf stat -e LONGEST_LAT_CACHE.MISS,LONGEST_LAT_CACHE.REFERENCE -a sleep 1
Performance counter stats for 'system wide':
825,684 LONGEST_LAT_CACHE.MISS
11,596,700 LONGEST_LAT_CACHE.REFERENCE
1.014799623 seconds time elapsed
```
- CWF specific perf event sampling test pass.
```
[root@cwf linux]# tools/perf/perf record -e LONGEST_LAT_CACHE.MISS,LONGEST_LAT_CACHE.REFERENCE -c 10000 -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.852 MB perf.data (1082 samples) ]
```
- GNR TPMI based RAPL PMU events available
```
$ perf list | grep -i energy
power/energy-pkg/ [Kernel PMU event]
power/energy-ram/ [Kernel PMU event]
```