xinydev/telemetry-solution


Arm Telemetry Solution

Arm Telemetry Solution provides a standardized solution comprising a top-down performance analysis methodology, a telemetry data framework, and a command-line profiling tool. The solution uses telemetry data from Arm IP to identify performance bottlenecks and improve execution efficiency.

This repo contains Arm telemetry specifications, telemetry tools, and test suites.

  • data: Contains the telemetry specification JSON for all supported Arm products.
  • tools: Contains telemetry tools, validation tests, and other utilities for telemetry data collection and validation.

Content

Arm Topdown Methodology

Arm Topdown Methodology specifies a set of metrics and a performance analysis methodology, both based on hardware PMU events, that help identify processor bottlenecks during workload execution.

Arm Topdown methodology can be conducted in two stages:

  • Stage 1: Topdown Analysis. A hot-spot analysis stage that uses stall-related metrics to locate pipeline bottlenecks.

  • Stage 2: Micro-architecture Exploration. A deeper analysis stage that examines the bottlenecked resources further, using metric groups and metrics that measure the effectiveness of each micro-architectural resource.

Arm CPU Telemetry Solution

The Arm CPU Telemetry Solution enables collection, analysis, and representation of CPU telemetry data on Arm platforms.

  • Each supported CPU provides a Telemetry Specification defining PMU events and a metric-driven hierarchical decision tree for hotspot detection. This decision tree is Arm’s implementation of the Topdown Methodology for performance analysis.

  • Telemetry data is structured in the Arm Telemetry Framework, which standardizes events/metrics into machine-readable JSON (MRS). This supports large-scale data collection, processing, and integration with profiling tools.

  • The solution includes the Arm Topdown Tool, a simple CLI for profiling applications. It parses the MRS to collect telemetry data and deliver performance insights. The tool is supported on Linux and Windows.

For more information about the Arm CPU Telemetry Solution, see Arm® Telemetry on Arm Developer and the Arm CPU Telemetry Solution Topdown Methodology Specification.

Key chapters from this solution architecture specification are listed below:

| Chapter | Content |
| --- | --- |
| Arm Topdown Methodology | Topdown methodology and stages for performance analysis (Stage 1 and Stage 2). |
| Arm Telemetry Framework for CPUs | Arm telemetry framework and data model standardization. |
| Arm Telemetry Specification and Profiling Tools | Details on how the telemetry specification is enabled for Linux and Windows perf tools. |
| Arm Topdown Tool Example | Arm Topdown tool data collection example. |
| Linux perf data collection | Linux perf tool data collection example. |
| Windows perf data collection | Windows perf tool data collection example. |

Refer to the Arm Neoverse V1 Performance Analysis Methodology whitepaper for an example of the Arm Topdown methodology as supported by the Neoverse V1 processor, with example case studies.

Key chapters from this whitepaper are listed below:

| Chapter | Content |
| --- | --- |
| 2 | PMU event and metric cheat sheets for performance analysis. |
| 3 | Arm topdown performance analysis methodology (Neoverse V1). This chapter describes the methodology in detail with all metrics. |
| 4 | An example case study that demonstrates how to use the methodology in a code-tuning exercise. |
| Appendix B | Telemetry Specification: PMU events with concise descriptions. |
| Appendix C | Telemetry Specification: Metrics and metric groups for performance analysis, derived from PMU events. |

Note:

The Arm CPU Telemetry Solution is supported across all Neoverse and Lumex CPUs, with PMU events, metrics, and methodology defined and upstreamed in Linux perf. Support for additional Arm CPUs will be available soon.

Arm Telemetry Framework

The building blocks of the Telemetry Framework are as follows.

  • Events are hardware PMU events that count micro-architectural activity.

  • Metrics specify mathematical relations between events that help with the correlation of events for analyzing the system.

  • Metric Groups specify a group of metrics that can be analyzed together for a use case. Metric Groups can be components of a methodology.

  • Methodology specifies different performance analysis approaches common among software consumers or performance analysts.

CPU Telemetry Specifications & JSON Schema

Arm provides a standardized JSON schema to describe PMU events, derived metrics, and the methodology tree for a CPU in a single file, enabling seamless integration with tooling.

The high-level schema structure is as follows:

{
  "events": {},                  // PMU events supported by the CPU
  "metrics": {},                 // Derived metrics supported by the CPU
  "groups": {                    // Grouping of events and metrics
    "function": {},              // Event groups by CPU function
    "metrics": {}                // Metric groups for analysis/methodology
  },
  "methodologies": {
    "topdown_methodology": {}    // Stages and decision tree for Topdown analysis
  }
}
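As a rough illustration of how tooling might check a file against this outline, the sketch below parses a small document and reports any expected key that is absent. The sample entries and the `missing_keys` helper are hypothetical and not part of this repository. The `//` annotations in the outline are explanatory only; standard JSON parsers reject comments, so the sample uses plain JSON.

```python
import json

# Hypothetical minimal document following the high-level structure above.
# The entries are illustrative placeholders, not real specification data.
SPEC_TEXT = """
{
  "events": {"CPU_CYCLES": {"title": "Cycle"}},
  "metrics": {"example_metric": {"title": "Example metric"}},
  "groups": {"function": {}, "metrics": {}},
  "methodologies": {"topdown_methodology": {}}
}
"""

# Top-level keys and the nested keys expected under each of them.
REQUIRED = {
    "events": (),
    "metrics": (),
    "groups": ("function", "metrics"),
    "methodologies": ("topdown_methodology",),
}


def missing_keys(spec):
    """Return the dotted paths of expected keys that are absent from spec."""
    missing = []
    for top, children in REQUIRED.items():
        if top not in spec:
            missing.append(top)
            continue
        for child in children:
            if child not in spec[top]:
                missing.append(f"{top}.{child}")
    return missing


spec = json.loads(SPEC_TEXT)
print(missing_keys(spec))  # prints []
```

A check like this only confirms the outer shape; it says nothing about whether the individual event or metric entries are well formed.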

Event Field Definitions

| Field | Definition |
| --- | --- |
| code | Event register code for counting |
| title | Title of the event |
| description | Description of what is being counted for the event |
| accesses | Access interface (PMU/ETM) |
| architecture_defined | Architecturally defined event, included in the Arm Architecture Reference Manual |
| product_defined | Micro-architecture implementation-specific event, specified by the product architecture |
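To make the event fields more concrete, here is a minimal sketch that selects the events reachable through the PMU and converts their codes to integers. It assumes that `code` is stored as a hexadecimal string and `accesses` as a list of interface names; both are assumptions about the encoding, and the second event is invented for illustration.

```python
# Hypothetical event entries using fields from the table above.
# Assumed encoding: "code" is a hex string, "accesses" is a list.
events = {
    "CPU_CYCLES": {
        "code": "0x0011",
        "title": "Cycle",
        "accesses": ["PMU", "ETM"],
    },
    # Made-up event name and code, used only to show filtering.
    "EXAMPLE_TRACE_ONLY_EVENT": {
        "code": "0x4000",
        "title": "Illustrative trace-only event",
        "accesses": ["ETM"],
    },
}


def pmu_event_codes(events):
    """Map each PMU-accessible event name to its integer event code."""
    return {
        name: int(entry["code"], 16)
        for name, entry in events.items()
        if "PMU" in entry["accesses"]
    }


print(pmu_event_codes(events))  # prints {'CPU_CYCLES': 17}
```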

Metric Field Definitions

| Field | Definition |
| --- | --- |
| title | Title of the metric |
| formula | Formula to compute the metric |
| description | Description of the metric |
| units | Unit of the metric |
| events | Events needed to calculate the metric |
| sample_events | Events for sampling if a bottleneck is detected with this metric |
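The relationship between `formula` and `events` can be sketched as follows: given raw counts for the listed events, a tool substitutes them into the formula to obtain the metric value. The example assumes the formula is a plain arithmetic expression over event names; the exact formula syntax used in the specification files may differ. The metric entry, the counts, and the `evaluate_formula` helper are all hypothetical. The helper walks the parsed expression instead of calling `eval`, so only arithmetic on known event names is accepted.

```python
import ast
import operator

# Arithmetic operators this sketch accepts in a formula.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_formula(formula, counts):
    """Evaluate an arithmetic formula, resolving names via the counts dict."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return counts[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(formula, mode="eval"))


# Hypothetical metric entry; the formula syntax is an assumption.
metric = {
    "title": "Backend Stalled Cycles",
    "formula": "STALL_BACKEND / CPU_CYCLES * 100",
    "units": "percent of cycles",
    "events": ["STALL_BACKEND", "CPU_CYCLES"],
}

# Made-up event counts for illustration.
counts = {"STALL_BACKEND": 250, "CPU_CYCLES": 1000}

value = evaluate_formula(metric["formula"], counts)
print(value)  # prints 25.0
```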

Topdown Methodology Field Definitions

| Field | Definition |
| --- | --- |
| title | Title |
| description | Description |
| metric_grouping | Metric groups used for each stage of the methodology, given as lists |
| decision_tree | Stage 1 topdown analysis tree with root_nodes and child metrics |

Each metric in the decision tree has the following fields:
  • name: metric name
  • group: metric groups the metric belongs to
  • next_items: leaves of the node
  • sample_events: events for sampling if the bottleneck is detected at this specific metric node
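The tree described here can be traversed by starting at each root node and following `next_items`. The sketch below assumes that `root_nodes` is a list of metric names and that the child metrics are held in a list under a `metrics` key; that layout, the node names, and the `iter_nodes` helper are assumptions made for illustration. Items in `next_items` that have no node entry of their own are treated as leaves.

```python
# Hypothetical decision tree; the layout of root_nodes/metrics is assumed
# and the detail node names are invented for illustration.
decision_tree = {
    "root_nodes": ["frontend_bound", "backend_bound"],
    "metrics": [
        {
            "name": "frontend_bound",
            "group": "Topdown_L1",
            "next_items": ["example_frontend_detail"],
            "sample_events": [],
        },
        {
            "name": "backend_bound",
            "group": "Topdown_L1",
            "next_items": ["example_backend_detail"],
            "sample_events": [],
        },
    ],
}


def iter_nodes(tree):
    """Yield (depth, name) pairs in depth-first order from each root node."""
    nodes = {metric["name"]: metric for metric in tree["metrics"]}

    def visit(name, depth, ancestors):
        yield depth, name
        node = nodes.get(name)
        # Stop at leaves and guard against accidental cycles.
        if node is None or name in ancestors:
            return
        for child in node.get("next_items", []):
            yield from visit(child, depth + 1, ancestors | {name})

    for root in tree["root_nodes"]:
        yield from visit(root, 0, frozenset())


for depth, name in iter_nodes(decision_tree):
    print("  " * depth + name)
```

A profiling tool would typically combine such a traversal with measured metric values to decide which branch to explore; that selection logic is left out here because the source does not describe it.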

Tools

The tools folder contains a collection of telemetry tools used for performance analysis on Arm-based platforms. topdown_tool performs topdown analysis, ustress is a validation test suite that stresses microarchitectural CPU features, spe_parser processes Arm SPE data, and perf_json_generator generates CPU definitions for Linux perf from the Arm CPU JSON specification.

| Name | Description | Folder |
| --- | --- | --- |
| Arm Topdown Tool | Tool to support the Arm topdown methodology by collecting derived metrics based on Performance Monitoring Unit (PMU) events. | tools/topdown_tool |
| Perf JSON Generator | Tool to generate JSON files for the Linux perf tool, which enable and document Arm PMU events and metrics. | tools/perf_json_generator |
| SPE Parser | Tool to parse SPE raw data and generate a Parquet or CSV file for further processing and analysis. | tools/spe_parser |
| UStress Test | Validation test suite to stress test major CPU resources. | tools/ustress |

Support

For feedback, collaboration or support, contact [email protected].

License

This project is licensed under Apache-2.0. See LICENSE.md for more details.
