[Intel-SIG] 5.15-ClearWater PMU support for core #82
Open: quanxianwang wants to merge 29 commits into openvelinux:5.15-velinux from quanxianwang:PMU-velinux-kernel-5.15-legacy-core
commit 24d74b9 upstream. AVX-VNNI-INT8 is a new set of instructions in the latest Intel platform Sierra Forest, aimed at giving the platform superior AI capabilities. This instruction multiplies the individual bytes of two signed or unsigned source operands, then adds and accumulates the results into the destination dword element size operand. The bit definition is CPUID.(EAX=7,ECX=1):EDX[bit 4]. AVX-VNNI-INT8 is on a new and sparse CPUID leaf, and no bit on this leaf has a real kernel use case for now. Given that, and to save space for kernel feature bits, move this new leaf to a KVM-only subleaf and add an X86_FEATURE definition for AVX-VNNI-INT8 to direct it to the KVM entry. Advertise AVX-VNNI-INT8 to KVM userspace. This is safe because there are no new VMX controls or additional host enabling required for guests to use this feature. Intel-SIG: commit 24d74b9 KVM: x86: Advertise AVX-VNNI-INT8 CPUID to user space. ClearWater support including CPU model and new ISAs and its dependency Signed-off-by: Jiaxi Chen <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
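The bit definition in this commit log can be illustrated with a small check on a raw register value. This is a minimal sketch, assuming the caller has already executed CPUID with EAX=7, ECX=1 and holds the resulting EDX; the helper name `has_avx_vnni_int8` and the macro name are hypothetical and are not taken from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position quoted in the commit log: CPUID.(EAX=7,ECX=1):EDX[bit 4]. */
#define AVX_VNNI_INT8_EDX_BIT 4

/*
 * Hypothetical helper: edx is the EDX output of CPUID leaf 7, subleaf 1.
 * Returns true when the AVX-VNNI-INT8 bit is set.
 */
static bool has_avx_vnni_int8(uint32_t edx)
{
	return (edx >> AVX_VNNI_INT8_EDX_BIT) & 1u;
}
```

Keeping the check as a pure function of the EDX value makes it usable both on a host and on a CPUID value reported for a guest.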
commit 090e3be upstream. Server product based on the Atom Darkmont core. Intel-SIG: commit 090e3be x86/cpu: Add model number for Intel Clearwater Forest processor. ClearWater support including CPU model and new ISAs and its dependency Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit a0423af92cb31e6fc4f53ef9b6e19fdf08ad4395 upstream. Latest Intel platform Clearwater Forest has introduced new instructions enumerated by CPUIDs of SHA512, SM3, SM4 and AVX-VNNI-INT16. Advertise these CPUIDs to userspace so that guests can query them directly. SHA512, SM3 and SM4 are on an expected-dense CPUID leaf and some other bits on this leaf have kernel usages. Since these three have no real kernel use, hide them from /proc/cpuinfo. These new instructions operate only on xmm and ymm registers and have no new VMX controls, so there is no additional host enabling required for guests to use these instructions, i.e. advertising these CPUIDs to userspace is safe. Intel-SIG: commit a0423af92cb3 x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest. ClearWater support including CPU model and new ISAs and its dependency Tested-by: Jiaan Lu <[email protected]> Tested-by: Xuelian Guo <[email protected]> Signed-off-by: Tao Su <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 8a8a9c9 upstream. This one is the regular laptop CPU. Intel-SIG: commit 8a8a9c9 x86/cpu: Add model number for another Intel Arrow Lake mobile processor. New Intel X86 CPU Family definition Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit a9d0adc upstream. Refactor struct cpuinfo_x86 so that the vendor, family, and model fields are overlaid in a union with a 32-bit field that combines all three (together with a one byte reserved field in the upper byte). This will make it easy, cheap, and reliable to check all three values at once. See https://lore.kernel.org/r/Zgr6kT8oULbnmEXx@agluck-desk3 for why the ordering is (low-to-high bits): (vendor, family, model) [ bp: Move comments over the line, add the backstory about the particular order of the fields. ] Intel-SIG: commit a9d0adc x86/cpu/vfm: Add/initialize x86_vfm field to struct cpuinfo_x86. New Intel X86 CPU Family definition Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit e6dfdc2 upstream. To avoid adding a slew of new macros for each new Intel CPU family switch over from providing CPU model number #defines to a new scheme that encodes vendor, family, and model in a single number. [ bp: s/casted/cast/g ] Intel-SIG: commit e6dfdc2 x86/cpu/vfm: Add new macros to work with (vendor/family/model) values. New Intel X86 CPU Family definition Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
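The idea of encoding vendor, family, and model in a single number can be sketched as follows. This is only an illustration: the function names are hypothetical, and the field positions (model in bits 7:0, family in bits 15:8, vendor in bits 23:16) are an assumption made for the example; the headers touched by this series are the authority on the real layout.

```c
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define VFM_MODEL_SHIFT   0
#define VFM_FAMILY_SHIFT  8
#define VFM_VENDOR_SHIFT  16

/* Pack three 8-bit fields into one 32-bit value (upper byte left as zero). */
static uint32_t vfm_make(uint8_t vendor, uint8_t family, uint8_t model)
{
	return ((uint32_t)vendor << VFM_VENDOR_SHIFT) |
	       ((uint32_t)family << VFM_FAMILY_SHIFT) |
	       ((uint32_t)model  << VFM_MODEL_SHIFT);
}

/* Extract the individual fields again. */
static uint8_t vfm_vendor(uint32_t vfm) { return (vfm >> VFM_VENDOR_SHIFT) & 0xff; }
static uint8_t vfm_family(uint32_t vfm) { return (vfm >> VFM_FAMILY_SHIFT) & 0xff; }
static uint8_t vfm_model(uint32_t vfm)  { return (vfm >> VFM_MODEL_SHIFT)  & 0xff; }
```

With such an encoding, one integer comparison checks vendor, family, and model at once, instead of three separate field comparisons.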
commit f055b62 upstream. New CPU #defines encode vendor and family as well as model. Update the example usage comment in arch/x86/kernel/cpu/match.c Intel-SIG: commit f055b62 x86/cpu/vfm: Update arch/x86/include/asm/intel-family.h. New Intel X86 CPU Family definition Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 744866f upstream. New CPU #defines encode vendor and family as well as model. Update INTEL_CPU_DESC() to work with vendor/family/model. Intel-SIG: commit 744866f x86/cpu: Switch to new Intel CPU model defines. New Intel X86 CPU Family definition Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Link: https://lore.kernel.org/all/20240520224620.9480-34-tony.luck%40intel.com [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 6568fc1 upstream. New CPU #defines encode vendor and family as well as model. Intel-SIG: commit 6568fc1 x86/cpu/intel: Switch to new Intel CPU model defines. New Intel X86 CPU Family definition Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Link: https://lore.kernel.org/all/20240520224620.9480-29-tony.luck%40intel.com [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 34b3fc5 upstream. The outer if () should have been dropped when switching to c->x86_vfm. Fixes: 6568fc1 ("x86/cpu/intel: Switch to new Intel CPU model defines") Intel-SIG: commit 34b3fc5 x86/cpu/intel: Drop stray FAM6 check with new Intel CPU model defines. New Intel X86 CPU Family definition Signed-off-by: Andrew Cooper <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Tony Luck <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit d142df1 upstream. New CPU #defines encode vendor and family as well as model. Intel-SIG: commit d142df1 perf/x86/intel: Switch to new Intel CPU model defines. New Intel X86 CPU Family definition Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Link: https://lore.kernel.org/all/20240520224620.9480-32-tony.luck%40intel.com [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
7b1fb7c to
2d62766
Compare
commit d4b5694 upstream. From PMU's perspective, the SPR/GNR server has a similar uarch to the ADL/MTL client p-core. Many functions are shared. However, the shared function name uses the abbreviation of the server product code name, rather than the common uarch code name. Rename these internal shared functions by the common uarch name. Intel-SIG: commit d4b5694 perf/x86/intel: Use the common uarch name for the shared functions. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 0ba0c03 upstream. The SPR and ADL p-core have a similar uarch. Most of the initialization code can be shared. Factor out intel_pmu_init_glc() for the common initialization code. The common part of the ADL p-core will be replaced by the later patch. Intel-SIG: commit 0ba0c03 perf/x86/intel: Factor out the initialization code for SPR. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit d87d221 upstream. From PMU's perspective, the ADL e-core and newer SRF/GRR have a similar uarch. Most of the initialization code can be shared. Factor out intel_pmu_init_grt() for the common initialization code. The common part of the ADL e-core will be replaced by the later patch. Intel-SIG: commit d87d221 perf/x86/intel: Factor out the initialization code for ADL e-core. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 299a5fc upstream. Use the intel_pmu_init_glc() and intel_pmu_init_grt() to replace the duplicate code for ADL. The current code already checks the PERF_X86_EVENT_TOPDOWN flag before invoking the Topdown metrics functions. (The PERF_X86_EVENT_TOPDOWN flag is to indicate the Topdown metric feature, which is only available for the p-core.) Drop the unnecessary adl_set_topdown_event_period() and adl_update_topdown_event(). Intel-SIG: commit 299a5fc perf/x86/intel: Apply the common initialization code for ADL. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit b0560bf upstream. There is a fairly long list of grievances about the current code. The main beefs: 1. hybrid_big_small assumes that the *HARDWARE* (CPUID) provided core types are a bitmap. They are not. If Intel happened to make a core type of 0xff, hilarity would ensue. 2. adl_get_hybrid_cpu_type() is utterly inscrutable. There are precisely zero comments and zero changelog about what it is attempting to do. According to Kan, the adl_get_hybrid_cpu_type() is there because some Alder Lake (ADL) CPUs can do some silly things. Some ADL models are *supposed* to be hybrid CPUs with big and little cores, but there are some SKUs that only have big cores. CPUID(0x1a) on those CPUs does not say that the CPUs are big cores. It apparently just returns 0x0. It confuses perf because it expects to see either 0x40 (Core) or 0x20 (Atom). The perf workaround for this is to watch for a CPU core saying it is type 0x0. If that happens on an Alder Lake, it calls x86_pmu.get_hybrid_cpu_type() and just assumes that the core is a Core (0x40) CPU. To fix up the mess, separate out the CPU types and the 'pmu' types. This allows 'hybrid_pmu_type' bitmaps without worrying that some future CPU type will set multiple bits. Since the types are now separate, add a function to glue them back together again. Actually comment on the situation in the glue function (find_hybrid_pmu_for_cpu()). Also, give ->get_hybrid_cpu_type() a real return type and make it clear that it is overriding the *CPU* type, not the PMU type. Rename cpu_type to pmu_type in the struct x86_hybrid_pmu to reflect the change. Originally-by: Dave Hansen <[email protected]> Intel-SIG: commit b0560bf perf/x86/intel: Clean up the hybrid CPU type handling code. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
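The separation of hardware CPU types from software PMU types described in this commit log can be sketched as below. This is a simplified illustration, not the kernel's find_hybrid_pmu_for_cpu(): the enum values, the function name `pmu_type_for_cpu`, and the `big_only_quirk` parameter are all hypothetical. Only the 0x40, 0x20, and 0x0 core type values come from the commit log.

```c
#include <stdint.h>

/* CPUID(0x1a) core types quoted in the commit log. These are NOT a bitmap. */
#define CPU_TYPE_NONE 0x00
#define CPU_TYPE_ATOM 0x20
#define CPU_TYPE_CORE 0x40

/* Software-defined PMU types; safe to use as a bitmap (values hypothetical). */
enum pmu_type {
	PMU_TYPE_NONE  = 0,
	PMU_TYPE_SMALL = 1 << 0,
	PMU_TYPE_BIG   = 1 << 1,
};

/*
 * Glue function: map a hardware CPU type to a PMU type.
 * big_only_quirk models the Alder Lake workaround: a SKU with only big
 * cores may report type 0x0, which is then treated as Core (0x40).
 */
static enum pmu_type pmu_type_for_cpu(uint8_t cpu_type, int big_only_quirk)
{
	if (cpu_type == CPU_TYPE_NONE && big_only_quirk)
		cpu_type = CPU_TYPE_CORE;

	switch (cpu_type) {
	case CPU_TYPE_CORE:
		return PMU_TYPE_BIG;
	case CPU_TYPE_ATOM:
		return PMU_TYPE_SMALL;
	default:
		/* Unknown hardware type, e.g. a future 0xff: no PMU matched. */
		return PMU_TYPE_NONE;
	}
}
```

Because the switch compares whole values rather than testing bits, an unexpected core type such as 0xff falls through to "no PMU" instead of matching several PMU types at once.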
commit 97588df upstream. The current hybrid initialization code isn't well organized and is hard to read. Factor out intel_pmu_init_hybrid() to do a common setup for each hybrid PMU. The PMU-specific capability will be updated later via either hard code (ADL) or CPUID hybrid enumeration (MTL). Split the ADL and MTL initialization code, since they have different uarches. The hard-coded PMU capabilities are not required for MTL either. They can be enumerated by the new leaf 0x23 and IA32_PERF_CAPABILITIES MSR. The hybrid enumeration of the IA32_PERF_CAPABILITIES MSR is broken on MTL, so use the default value. Intel-SIG: commit 97588df perf/x86/intel: Add common intel_pmu_init_hybrid(). CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 950ecdc upstream. Unnecessary multiplexing is triggered when running an "instructions" event on an MTL. perf stat -e cpu_core/instructions/,cpu_core/instructions/ -a sleep 1 Performance counter stats for 'system wide': 115,489,000 cpu_core/instructions/ (50.02%) 127,433,777 cpu_core/instructions/ (49.98%) 1.002294504 seconds time elapsed Linux architectural perf events, e.g., cycles and instructions, usually have dedicated fixed counters. These events also have equivalent events which can be used in the general-purpose counters. The counters are precious. In the intel_pmu_check_event_constraints(), perf checks/extends the event constraints of these events. So these events can utilize both fixed counters and general-purpose counters. The following cleanup commit: 97588df ("perf/x86/intel: Add common intel_pmu_init_hybrid()") forgot to add the intel_pmu_check_event_constraints() into update_pmu_cap(). As a result, the architectural perf events cannot utilize the general-purpose counters. The code to check and update the counters, event constraints and extra_regs is the same among hybrid systems. Move intel_pmu_check_hybrid_pmus() to init_hybrid_pmu(), and remove the duplicate check in update_pmu_cap(). Fixes: 97588df ("perf/x86/intel: Add common intel_pmu_init_hybrid()") Intel-SIG: commit 950ecdc perf/x86/intel: Fix broken fixed event constraints extension. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit a23eb2f upstream. The current perf assumes that the counters that support PEBS are contiguous. But it's not guaranteed with the new leaf 0x23 introduced. The counters are enumerated with a counter mask. There may be holes in the counter mask for future platforms or in a virtualization environment. Store the PEBS event mask rather than the maximum number of PEBS counters in the x86 PMU structures. Intel-SIG: commit a23eb2f perf/x86/intel: Support the PEBS event mask. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 722e42e upstream. The current perf assumes that both GP and fixed counters are contiguous. But it's not guaranteed on newer Intel platforms or in a virtualization environment. Use the counter mask to replace the number of counters for both GP and the fixed counters. For the other ARCHs or old platforms which don't support a counter mask, use GENMASK_ULL(num_counter - 1, 0) instead. There is no functional change for them. The interface to KVM is not changed. The number of counters is still passed to KVM. It can be updated later separately. Intel-SIG: commit 722e42e perf/x86: Support counter mask. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
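The difference between a counter count and a counter mask can be shown with a short userspace sketch. The `GENMASK_ULL` macro below is a stand-in written for this example rather than the kernel's definition, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Userspace stand-in for GENMASK_ULL(h, l): bits l..h set, 0 <= l <= h <= 63. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Legacy platforms report only a count (1..64); derive a contiguous mask. */
static uint64_t mask_from_count(unsigned int num_counters)
{
	return GENMASK_ULL(num_counters - 1, 0);
}

/* Number of usable counters: count set bits, so holes are skipped. */
static unsigned int count_counters(uint64_t cntr_mask)
{
	unsigned int n = 0;

	for (; cntr_mask; cntr_mask &= cntr_mask - 1)
		n++;
	return n;
}

/* Highest valid index plus one; larger than the count when the mask has holes. */
static unsigned int max_counter_index(uint64_t cntr_mask)
{
	unsigned int idx = 0;

	while (cntr_mask) {
		idx++;
		cntr_mask >>= 1;
	}
	return idx;
}
```

For a mask such as 0x33 (counters 0, 1, 4, 5), the count is 4 but the highest index plus one is 6, which is why code that loops from 0 to the counter count breaks once holes appear.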
commit a932aa0 upstream. From PMU's perspective, Lunar Lake and Arrow Lake are similar to the previous generation Meteor Lake. Both are hybrid platforms, with e-core and p-core. The key differences include: - The e-core supports 3 new fixed counters - The p-core supports an updated PEBS Data Source format - More GP counters (Updated event constraint table) - New Architectural performance monitoring V6 (New Perfmon MSRs aliasing, umask2, eq). - New PEBS format V6 (Counters Snapshotting group) - New RDPMC metrics clear mode The legacy features, the 3 new fixed counters and updated event constraint table are enabled in this patch. The new PEBS data source format, the architectural performance monitoring V6, the PEBS format V6, and the new RDPMC metrics clear mode are supported in the following patches. Intel-SIG: commit a932aa0 perf/x86: Add Lunar Lake and Arrow Lake support. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 0902624 upstream. The model-specific pebs_latency_data functions of ADL and MTL use the "small" as a postfix to indicate the e-core. The postfix is too generic for a model-specific function. It cannot provide useful information that can directly map it to a specific uarch, which can facilitate the development and maintenance. Use the abbr of the uarch to rename the model-specific functions. Intel-SIG: commit 0902624 perf/x86/intel: Rename model-specific pebs_latency_data functions. CWF PMU core backporting Suggested-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 608f697 upstream. A new PEBS data source format is introduced for the p-core of Lunar Lake. The data source field is extended to 8 bits with new encodings. A new layout is introduced into the union intel_x86_pebs_dse. Introduce the lnl_latency_data() to parse the new format. Enlarge the pebs_data_source[] accordingly to include new encodings. Only the mem load and the mem store events can generate the data source. Introduce INTEL_HYBRID_LDLAT_CONSTRAINT and INTEL_HYBRID_STLAT_CONSTRAINT to mark them. Add two new bits for the new cache-related data src, L2_MHB and MSC. The L2_MHB is short for L2 Miss Handling Buffer, which is similar to LFB (Line Fill Buffer), but to track the L2 Cache misses. The MSC stands for the memory-side cache. Intel-SIG: commit 608f697 perf/x86/intel: Support new data source for Lunar Lake. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit e8fb5d6 upstream. Different vendors may support different fields in EVENTSEL MSR, such as Intel would introduce new fields umask2 and eq bits in EVENTSEL MSR since Perfmon version 6. However, a fixed mask X86_RAW_EVENT_MASK is used to filter the attr.config. Introduce a new config_mask to record the real supported EVENTSEL bitmask. Only apply it to the existing code now. No functional change. Co-developed-by: Dapeng Mi <[email protected]> Intel-SIG: commit e8fb5d6 perf/x86: Add config_mask to represent EVENTSEL bitmask. CWF PMU core backporting Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit dce0c74 upstream. Two new fields (the unit mask2, and the equal flag) are added in the IA32_PERFEVTSELx MSRs. They can be enumerated by the CPUID.23H.0.EBX. Update the config_mask in x86_pmu and x86_hybrid_pmu for the true layout of the PERFEVTSEL. Expose the new formats into sysfs if they are available. The umask extension reuses the same format attr name "umask" as the previous umask. Add umask2_show to determine/display the correct format for the current machine. Co-developed-by: Dapeng Mi <[email protected]> Intel-SIG: commit dce0c74 perf/x86/intel: Support PERFEVTSEL extension. CWF PMU core backporting Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
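How a per-PMU config_mask could be built from the enumerated capabilities, and then used to filter attr.config, can be sketched as follows. All constants here are assumptions for illustration: the legacy mask value, the positions of the umask2 and eq fields inside the event select register, and the EBX bit assignments are not stated in this commit log, and the function names are hypothetical.

```c
#include <stdint.h>

/* Illustrative legacy raw-event mask (assumed value). */
#define LEGACY_EVENT_MASK  0xFF84FFFFULL

/* Assumed positions of the new fields in the event select register. */
#define EVENTSEL_EQ        (1ULL << 36)
#define EVENTSEL_UMASK2    (0xFFULL << 40)

/* Assumed capability bits in CPUID.23H.0.EBX. */
#define EXT_EBX_UMASK2     (1u << 0)
#define EXT_EBX_EQ         (1u << 1)

/* Start from the legacy mask and widen it only for enumerated features. */
static uint64_t build_config_mask(uint32_t ebx)
{
	uint64_t mask = LEGACY_EVENT_MASK;

	if (ebx & EXT_EBX_UMASK2)
		mask |= EVENTSEL_UMASK2;
	if (ebx & EXT_EBX_EQ)
		mask |= EVENTSEL_EQ;
	return mask;
}

/* Drop any config bits the current PMU does not support. */
static uint64_t filter_config(uint64_t config, uint64_t config_mask)
{
	return config & config_mask;
}
```

On a machine that enumerates neither feature, the mask equals the legacy one, which matches the "no functional change" intent of the preceding config_mask commit.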
commit 149fd47 upstream. The architectural performance monitoring V6 supports a new range of counters' MSRs in the 19xxH address range. They include all the GP counter MSRs, the GP control MSRs, and the fixed counter MSRs. The step between each sibling counter is 4. Add intel_pmu_addr_offset() to calculate the correct offset. Add fixedctr in struct x86_pmu to store the address of the fixed counter 0. It can be used to calculate the rest of the fixed counters. The MSR address of the fixed counter control is not changed. Intel-SIG: commit 149fd47 perf/x86/intel: Support Perfmon MSRs aliasing. CWF PMU core backporting Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
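The address calculation described here reduces to a fixed stride per counter index. A minimal sketch, assuming a base address of 0x1900 for the first GP counter purely as an example within the 19xxH range; the macro and function names are hypothetical, and only the step of 4 comes from the commit log.

```c
#include <stdint.h>

/* From the commit log: the step between sibling counters is 4. */
#define V6_MSR_STEP   4u

/* Assumed example base address in the 19xxH range. */
#define V6_GP0_CTR    0x1900u

/* Offset of counter `index` relative to counter 0. */
static uint32_t v6_addr_offset(unsigned int index)
{
	return V6_MSR_STEP * index;
}

/* MSR address of counter `index`, given the address of counter 0. */
static uint32_t v6_counter_addr(uint32_t base, unsigned int index)
{
	return base + v6_addr_offset(index);
}
```

Storing only the address of counter 0 and deriving the rest is the same approach the commit log describes for the new fixedctr field.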
commit 48d66c89dce1e3687174608a5f5c31d5961a9916 upstream. From the PMU's perspective, Clearwater Forest is similar to the previous generation Sierra Forest. The key differences are the ARCH PEBS feature and the 3 newly added fixed counters for topdown L1 metrics events. The ARCH PEBS is supported in the following patches. This patch provides support for basic perfmon features and the 3 newly added fixed counters. Intel-SIG: commit 48d66c89dce1 perf/x86/intel: Add PMU support for Clearwater Forest. CWF PMU core backporting Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 25c623f41438fafc6f63c45e2e141d7bcff78299 upstream. CPUID archPerfmonExt (0x23) leaves are supported to enumerate CPU level's PMU capabilities on non-hybrid processors as well. This patch adds parsing of the archPerfmonExt leaves on non-hybrid processors. Architectural PEBS leverages archPerfmonExt sub-leaves 0x4 and 0x5 to enumerate the PEBS capabilities as well. This patch is a precursor of the subsequent arch-PEBS enabling patches. Intel-SIG: commit 25c623f41438 perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs. CWF PMU core backporting Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
commit 4a3fd13054a98c43dfcfcbdb93deb43c7b1b9c34 upstream. Arch-PEBS retires IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs, so intel_pmu_pebs_enable/disable() and intel_pmu_pebs_enable/disable_all() do not need to be called for arch-PEBS. To make the code cleaner, introduce static calls x86_pmu_pebs_enable/disable() and x86_pmu_pebs_enable/disable_all() instead of adding "x86_pmu.arch_pebs" check directly in these helpers. Intel-SIG: commit 4a3fd13054a9 perf/x86/intel: Introduce pairs of PEBS static calls. CWF PMU core backporting Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Dapeng Mi <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Quanxian Wang: amend commit log ] Signed-off-by: Quanxian Wang <[email protected]>
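The design choice in this commit, selecting the PEBS enable/disable routine once at init time instead of checking a flag inside each helper, can be approximated in userspace with a function pointer. This is only an analogy: kernel static calls patch the call sites directly, whereas this sketch uses an ordinary indirect call, and every name below is hypothetical.

```c
#include <stdbool.h>

/* Counts how often the legacy path ran; exists only for demonstration. */
static int legacy_enable_calls;

/* Stand-in for the legacy path that would program the PEBS enable MSR. */
static void legacy_pebs_enable(void)
{
	legacy_enable_calls++;
}

/* Stand-in for arch-PEBS, where the legacy MSRs are retired: nothing to do. */
static void pebs_nop(void)
{
}

/* Indirect call target, chosen once at init time. */
static void (*pmu_pebs_enable)(void) = pebs_nop;

/* Pick the implementation once, so callers never test an arch_pebs flag. */
static void pmu_init(bool arch_pebs)
{
	pmu_pebs_enable = arch_pebs ? pebs_nop : legacy_pebs_enable;
}
```

Callers simply invoke `pmu_pebs_enable()`; the arch-PEBS decision lives in one place, which is the cleanliness argument the commit log makes.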
2d62766 to
8addcf2
Compare
This patch depends on PR#72 (Intel-SIG: new Intel X86 CPU model definition).
It adds legacy enabling support for the PMU core part.
4a3fd13054a9,perf/x86/intel: Introduce pairs of PEBS static calls,2025-04-17 14:21:24,Dapeng Mi [email protected],v6.16-rc1,v6.16-rc1 - added
25c623f41438,perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs,2025-04-17 14:21:24,Dapeng Mi [email protected],v6.16-rc1,v6.16-rc1 - added
48d66c89dce1,perf/x86/intel: Add PMU support for Clearwater Forest,2025-04-17 14:21:23,Dapeng Mi [email protected],v6.16-rc1,v6.16-rc1
149fd47,perf/x86/intel: Support Perfmon MSRs aliasing,2024-07-04 16:00:40,Kan Liang [email protected],v6.11-rc1
dce0c74,perf/x86/intel: Support PERFEVTSEL extension,2024-07-04 16:00:40,Kan Liang [email protected],v6.11-rc1
e8fb5d6,perf/x86: Add config_mask to represent EVENTSEL bitmask,2024-07-04 16:00:39,Kan Liang [email protected],v6.11-rc1
608f697,perf/x86/intel: Support new data source for Lunar Lake,2024-07-04 16:00:38,Kan Liang [email protected],v6.11-rc1,v6.11-rc1
0902624,perf/x86/intel: Rename model-specific pebs_latency_data functions,2024-07-04 16:00:38,Kan Liang [email protected],v6.11-rc1,v6.11-rc1
a932aa0,perf/x86: Add Lunar Lake and Arrow Lake support,2024-07-04 16:00:37,Kan Liang [email protected],v6.11-rc1,v6.11-rc1
722e42e,perf/x86: Support counter mask,2024-07-04 16:00:36,Kan Liang [email protected],v6.11-rc1,v6.11-rc1
a23eb2f,perf/x86/intel: Support the PEBS event mask,2024-07-04 16:00:36,Kan Liang [email protected],v6.11-rc1,v6.11-rc1
950ecdc,perf/x86/intel: Fix broken fixed event constraints extension,2023-09-12 08:22:24,Kan Liang [email protected],v6.7-rc1 - added
97588df,perf/x86/intel: Add common intel_pmu_init_hybrid(),2023-08-29 20:59:23,Kan Liang [email protected],v6.7-rc1,v6.7-rc1
b0560bf,perf/x86/intel: Clean up the hybrid CPU type handling code,2023-08-29 20:59:23,Kan Liang [email protected],v6.7-rc1,v6.7-rc1
299a5fc,perf/x86/intel: Apply the common initialization code for ADL,2023-08-29 20:59:23,Kan Liang [email protected],v6.7-rc1,v6.7-rc1
d87d221,perf/x86/intel: Factor out the initialization code for ADL e-core,2023-08-29 20:59:22,Kan Liang [email protected],v6.7-rc1,v6.7-rc1
0ba0c03,perf/x86/intel: Factor out the initialization code for SPR,2023-08-29 20:59:22,Kan Liang [email protected],v6.7-rc1,v6.7-rc1
d4b5694,perf/x86/intel: Use the common uarch name for the shared functions,2023-08-29 20:59:22,Kan Liang [email protected],v6.7-rc1,v6.7-rc1
PMU core test result: PASS
perf stat -a sleep 1
Performance counter stats for 'system wide':
145,420,198,486 cpu-clock # 144.133 CPUs utilized
604 context-switches # 4.153 /sec
145 cpu-migrations # 0.997 /sec
92 page-faults # 0.633 /sec
14,078,739 instructions # 0.30 insn per cycle
47,187,298 cycles # 0.000 GHz
2,809,817 branches # 19.322 K/sec
226,584 branch-misses # 8.06% of all branches
[root@CS17CA101IS1502 ~]# perf record -e instructions -Iax,bx -b -c 100000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data (13 samples) ]
[root@CS17CA101IS1502 ~]# perf record -e branches -Iax,bx -b -c 10000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.045 MB perf.data (29 samples) ]