orlarey/faustcompilerbenchtool

Faust Compiler Benchmark Tools

Faust Compiler Benchmark Tools is a suite of utilities for analyzing and measuring the performance of the C++ code generated by the Faust compiler. These tools are particularly useful when developing new compilation strategies: they help verify correctness and measure the performance impact of a change.

Prerequisites

  • Faust compiler available on PATH
  • A C++ toolchain (clang++ is used by default; override with CXX)
  • Python 3 for the helper scripts; matplotlib is optional for plotting
  • Write access to /usr/local/bin and /usr/local/share when installing system-wide

Tools Overview

Assuming you have compiled foo.dsp into foo.cpp, the following tools are available:

  1. fcbenchtool foo.cpp
    Generates a binary (foo) to measure the performance of the compute method in foo.cpp.

  2. fcasmtool foo.cpp
    Produces an assembly file (foo.s) for inspecting the generated assembly code.

  3. fcanalyzetool foo.cpp
    Performs static analysis on foo.cpp to identify potential issues.

  4. fcplottool foo.cpp
    Creates a binary (foo) that, when executed, prints or plots the impulse response of foo.cpp.

  5. fccomparetool foo1.cpp foo2.cpp
    Compares the impulse responses of foo1.cpp and foo2.cpp using fcplottool.

  6. fcdebugtool foo.cpp
    Similar to fcplottool, but produces a debug-enabled binary for use with gdb.

  7. fcexplorer.py
    A Python script to explore various Faust compilation options and their impact on the generated C++ code.

  8. fcbenchgraph.py
    A Python script to benchmark multiple DSP files with different FAUST configurations and generate comparative performance graphs.

  9. fcanalyze.py
    A Python script to analyze multiple DSP files with different FAUST configurations using static analysis to detect warnings and errors.

  10. fcoptimize.py
    An automatic optimization tool that searches for the best Faust scalar compilation options for a given DSP file.


Installation

The tools can be installed with the following command:

sudo ./install.sh
  • Binaries: Installed in /usr/local/bin (including convenience symlinks fcbenchgraph, fcanalyze, fcoptimize).
  • Dependencies: Stored in /usr/local/share/fctool (headers/footers used by the wrappers).
  • Set CXX if you want a compiler other than clang++ for all wrappers.

Quickstart

# Generate C++ from Faust
faust foo.dsp -o foo.cpp

# Benchmark the generated code
fcbenchtool foo.cpp
./foo            # prints the best time over the benchmark loop

# Inspect correctness by impulse response
fcplottool foo.cpp
./foo > foo.ir   # capture impulse response for plotting/diffing

Main Tool: fcbenchtool

The primary tool, fcbenchtool, is used to benchmark the performance of C++ implementations generated by Faust. A typical use is to generate several implementations (foo1.cpp, foo2.cpp, etc.) by varying Faust compiler options and then compare their performance.

How It Works

  1. Code Wrapping:
    fcbenchtool wraps the original C++ source file between a header and footer to enable performance measurement.

  2. Optimized Compilation:
    The code is compiled with the following options:

    • -O3 (high optimization)
    • -ffast-math (faster floating-point computations)
    • -march=native (architecture-specific optimizations)
  3. Performance Measurement:
    The resulting binary measures the time (in milliseconds) to process 1 second of sound (44100 samples). The benchmark iterates until the minimal result remains stable over at least 1000 iterations. This iteration count can be customized.
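The stopping rule can be sketched in Python (a simplified model of what the generated binary does; `run_once` is a hypothetical stand-in for one call to `compute` over 44100 samples):

```python
import time

def bench(run_once, stable_iters=1000):
    """Repeat until the best (minimum) time has not improved
    for `stable_iters` consecutive iterations, then report it."""
    best = float("inf")
    since_improvement = 0
    while since_improvement < stable_iters:
        t0 = time.perf_counter()
        run_once()  # process 1 second of audio (44100 samples)
        elapsed = (time.perf_counter() - t0) * 1000.0  # milliseconds
        if elapsed < best:
            best = elapsed
            since_improvement = 0  # reset: keep iterating until stable
        else:
            since_improvement += 1
    return best
```

Reporting the minimum rather than the mean filters out scheduler noise, which is why the loop runs until the minimum stops improving.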

Workflow Example

# Step 1: Generate C++ code from a Faust file
faust foo.dsp -o foo.cpp

# Step 2: Compile with fcbenchtool
fcbenchtool foo.cpp

# Step 3: Execute the binary and analyze performance
./foo
./foo 250  # Custom iteration count

Custom Compiler and Extensions

You can specify a custom compiler using the CXX environment variable and define a custom file extension for the generated binary. For example:

faust foo.dsp -o foo.cpp
CXX=clang++-19 fcbenchtool foo.cpp .cl19
./foo.cl19

Practical use cases

  • Compare the impact of Faust flags: build foo_vec.cpp with -vec, foo_scalar.cpp without, then benchmark both binaries.
  • Validate performance regressions: run fcbenchtool on the same source before/after a code change and archive the printed timings.
  • CPU feature investigations: set CXX="clang++ -march=skylake" to force a specific target and see the effect on throughput.

Additional Tools

fcasmtool

Inspect the assembly code generated by the C++ compiler. Useful to check vectorization, instruction choices, or how a Faust option changes the emitted loops.

fcasmtool foo.cpp
# Peek at the first lines
sed -n '1,60p' foo.s
# Look for vector ops
rg "ymm|xmm" foo.s

fcanalyzetool

Run static analysis on foo.cpp using the compiler analyzer to flag potential issues (undefined behavior, suspicious constructs).

fcanalyzetool foo.cpp
# Only show warnings/errors
fcanalyzetool foo.cpp | rg -i "warning|error"

Practical use: run it on every generated variant while developing new Faust lowering passes to ensure no diagnostics are introduced.

fcplottool and fccomparetool

  • Plot/print the impulse response of foo.cpp:

    fcplottool foo.cpp
    ./foo > foo.ir   # redirect to a file for plotting
  • Compare impulse responses between two files:

    fccomparetool foo1.cpp foo2.cpp
    # Produces a side-by-side diff of impulse response samples

Typical workflow: generate foo_vec.cpp and foo_scalar.cpp, build both, run fccomparetool to confirm the new optimization keeps the DSP identical.
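The comparison amounts to a sample-by-sample diff of the two impulse responses; a minimal sketch, assuming one sample value per line in each captured .ir file (the `compare_ir` helper and tolerance are illustrative, not the tool's actual interface):

```python
def compare_ir(samples_a, samples_b, tol=1e-6):
    """Return the indices where the two impulse responses diverge
    beyond `tol`; an empty list means the DSP output is identical."""
    diffs = []
    for i, (a, b) in enumerate(zip(samples_a, samples_b)):
        if abs(a - b) > tol:
            diffs.append(i)
    return diffs
```

A nonzero tolerance matters when comparing single- vs. double-precision builds, where bit-exact equality is not expected.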

fcdebugtool

Build a debug-enabled binary for analysis with gdb or lldb. Helpful when stepping through generated code or inspecting runtime state.

fcdebugtool foo.cpp
gdb ./foo.db

fcexplorer.py

Explore various Faust compilation options to observe their impact on the generated C++ code.

fcexplorer.py -mcd "0 2 4 8" -vec "" foo.dsp...

will generate the 8 corresponding C++ files (4 values of -mcd times the presence or absence of -vec). The generated files are named foo_mcd0.cpp, foo_mcd0_vec.cpp, foo_mcd2.cpp, etc. In this example, -vec "" declares an option that may be present or absent and takes no additional argument.

Common recipes:

  • Delay tuning sweep: fcexplorer.py -mcd "0 4 8 12" -mdd "512 1024 2048" myalgo.dsp
  • Vector vs scalar: fcexplorer.py -vec "" myalgo.dsp then benchmark the produced files with fcbenchtool.
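The variant generation behind fcexplorer.py boils down to a Cartesian product over option values; a sketch of that naming scheme (the `explore` helper is illustrative, not the script's actual API; `None` models an absent flag, `""` a flag present without argument):

```python
from itertools import product

def explore(stem, options):
    """Build one (filename, flag list) pair per combination of option
    values. `options` maps each flag to its candidate values."""
    keys = list(options)
    names = []
    for combo in product(*(options[k] for k in keys)):
        suffix, flags = "", []
        for flag, value in zip(keys, combo):
            if value is None:      # option absent in this variant
                continue
            if value == "":        # option present, no argument
                suffix += f"_{flag.lstrip('-')}"
                flags.append(flag)
            else:                  # option present with an argument
                suffix += f"_{flag.lstrip('-')}{value}"
                flags += [flag, value]
        names.append((f"{stem}{suffix}.cpp", flags))
    return names

# 4 values of -mcd x (absent | present) -vec = 8 variants
variants = explore("foo", {"-mcd": ["0", "2", "4", "8"], "-vec": [None, ""]})
```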

fcbenchgraph.py

fcbenchgraph.py is a Python script designed to benchmark multiple DSP files with different FAUST parameter configurations and generate comparative performance graphs.

Features

  • Benchmark multiple .dsp files with various FAUST compiler configurations
  • Generate performance comparison matrices and statistics
  • Create visual graphs showing performance differences across configurations
  • Support for custom iteration counts and binary extensions
  • Automatic graph generation with matplotlib (optional)

Usage

fcbenchgraph.py <file_pattern> <faust_config1> [faust_config2] ... [OPTIONS]
# after install you can also call: fcbenchgraph <pattern> ...

Parameters

  • file_pattern: Glob pattern for .dsp files to benchmark (e.g., "*.dsp", "tests/**/*.dsp")
  • faust_config: One or more FAUST parameter sets to test (e.g., "-lang cpp", "-lang cpp -vec")

Options

  • --iterations N: Number of benchmark iterations (default: 1000)
  • --extension EXT: Extension for generated binaries (default: .bench)
  • --no-graph: Disable graph generation
  • --graph-output FILE: Custom graph filename (default: benchmark_YYYYMMDD_HHMMSS.png)

Examples

  1. Basic benchmarking with single configuration:

    fcbenchgraph.py "*.dsp" "-lang cpp"

    Quickly establish a baseline across a directory before trying new options.

  2. Compare multiple configurations:

    fcbenchgraph.py "tests/impulse-tests/dsp/*.dsp" "-lang cpp" "-lang cpp -vec" "-lang cpp -double"
    # Useful to quantify the benefit of vectorization on a whole suite
  3. Custom iterations and graph output:

    fcbenchgraph.py "*.dsp" "-lang cpp" "-lang rust" --iterations=500 --graph-output=my_benchmark.png

    Handy for CI runs where you want fewer iterations and a deterministic graph name.

Output

The script generates:

  1. Console output:

    • Progress information during benchmarking
    • Results matrix showing execution times in milliseconds
    • Configuration details and statistics
    • Global success rates
  2. Graph file (if matplotlib is available):

    • Visual comparison of performance across configurations
    • Line plots showing execution times for each DSP file
    • Command information and generation timestamp
    • Automatic filename with timestamp if not specified

Prerequisites

  • FAUST compiler must be installed and accessible
  • fcbenchtool must be installed (from this toolkit)
  • Python 3 with standard libraries
  • matplotlib (optional, for graph generation): pip install matplotlib

Process

For each DSP file and configuration combination:

  1. Compiles the DSP file using FAUST with specified parameters
  2. Uses fcbenchtool to create a benchmarking binary
  3. Executes the binary to measure performance
  4. Collects timing results and generates statistics
  5. Creates comparative visualizations (if matplotlib is available)
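Steps 4 and 5 reduce to aggregating a (file, configuration) → time table; a sketch of the statistics step, assuming failed runs are recorded as None (the `summarize` helper is illustrative):

```python
from statistics import mean

def summarize(results):
    """`results` maps (dsp_file, config) -> time in ms, or None on
    failure. Returns the per-config mean time and success rate."""
    stats = {}
    configs = {cfg for (_, cfg) in results}
    for cfg in configs:
        times = [t for (f, c), t in results.items()
                 if c == cfg and t is not None]
        total = sum(1 for (f, c) in results if c == cfg)
        stats[cfg] = {
            "mean_ms": mean(times) if times else None,
            "success_rate": len(times) / total if total else 0.0,
        }
    return stats
```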

fcanalyze.py

fcanalyze.py is a Python script designed to analyze multiple DSP files with different FAUST parameter configurations using static analysis. Instead of measuring performance like fcbenchgraph.py, it focuses on detecting warnings, errors, and potential issues in the generated C++ code.

Features

  • Analyze multiple .dsp files with various FAUST compiler configurations
  • Use fcanalyzetool to perform static analysis on generated C++ code
  • Generate analysis comparison matrices and statistics
  • Detect warnings, errors, and potential code issues
  • Identify the most problematic files across configurations

Usage

fcanalyze.py <file_pattern> <faust_config1> [faust_config2] ... [OPTIONS]
# or simply fcanalyze <pattern> ... after installation

Parameters

  • file_pattern: Glob pattern for .dsp files to analyze (e.g., "*.dsp", "tests/**/*.dsp")
  • faust_config: One or more FAUST parameter sets to test (e.g., "-lang cpp", "-lang cpp -vec")

Examples

  1. Basic analysis with single configuration:

    fcanalyze.py "*.dsp" "-lang cpp"
  2. Compare multiple configurations:

    fcanalyze.py "tests/impulse-tests/dsp/*.dsp" "-lang cpp" "-lang cpp -vec" "-lang cpp -double"
  3. Analyze specific file patterns:

    fcanalyze.py "examples/*.dsp" "-lang cpp" "-lang rust"

Typical uses: keep a directory warning-free during refactors, or compare how different Faust backends affect static-analysis noise.

Output

The script generates console output including:

  1. Progress information during analysis for each file and configuration

  2. Results matrix showing analysis status:

    • βœ“ CLEAN: No issues found
    • XW/YE: X warnings and Y errors found
    • FAUST_ERR: FAUST compilation failed
    • ANALYSIS_ERR: Static analysis failed
  3. Configuration details and statistics per configuration

  4. Global statistics including success rates

  5. Most problematic files section listing files with the most issues

Prerequisites

  • FAUST compiler must be installed and accessible
  • fcanalyzetool must be installed (from this toolkit)
  • Python 3 with standard libraries

Process

For each DSP file and configuration combination:

  1. Compiles the DSP file using FAUST with specified parameters
  2. Uses fcanalyzetool to perform static analysis on the generated C++ code
  3. Parses analysis output to extract warnings, errors, and other issues
  4. Collects results and generates comprehensive statistics
  5. Provides a summary of code quality across different configurations
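Step 3 amounts to counting compiler-style diagnostics in the analyzer output; a sketch of how the matrix statuses could be derived (the `classify` helper is illustrative, not the script's actual code):

```python
import re

def classify(output):
    """Map analyzer output to the status strings used in the results
    matrix: 'CLEAN' when no issues, else 'XW/YE' (X warnings, Y errors)."""
    warnings = len(re.findall(r"\bwarning:", output))
    errors = len(re.findall(r"\berror:", output))
    if warnings == 0 and errors == 0:
        return "CLEAN"
    return f"{warnings}W/{errors}E"
```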

fcoptimize.py

fcoptimize.py is an automatic optimization tool that searches for the best Faust compilation options for a given DSP file. It intelligently explores the option space and identifies the configuration that produces the fastest executable.

Features

  • Automatic exploration of Faust scalar compilation options
  • Focus on cpp and ocpp backends without vectorization/parallelism
  • Two search strategies: random and adaptive
  • Baseline comparison to measure improvements
  • JSON export of results for further analysis
  • Optional graph generation showing optimization progress
  • Configurable number of trials and benchmark iterations

Explored Options

The tool systematically explores these scalar-mode options focused on performance (using -single precision only, as -double and -quad would slow down computations):

Delay optimizations (especially important for ocpp):

  • -mcd (max copy delay): 0, 2, 4, 6, 8, 10, 12, 16, 20 (10 not supported with ocpp)
  • -udd (use dense delay): 0, 1
  • -mcl (max copy loop): 2, 4, 8, 16 (ocpp only, fixed at 4 for cpp)
  • -mdd (max dense delay): 256, 512, 1024, 2048, 4096 (ocpp only, fixed at 1024 for cpp)
  • -mca (max cache delay): 4, 8, 16, 32 (ocpp only, fixed at 8 for cpp)
  • -mdy (min density): 70, 80, 90, 95 (ocpp only, fixed at 90 for cpp)

Code generation:

  • -ss (scheduling strategy): 0=depth-first, 1=breadth-first, 2=special, 3=reverse breadth-first
  • -fsr (fixed sample rate): None (variable) or 44100 Hz
  • -cm (compute mix) β€” cpp only
  • -fm def (fast math) β€” cpp only
  • -mapp (math approximations)
  • -exp10 (exp10 optimization)
  • -it (inline tables) β€” cpp only

FIR/IIR optimizations:

  • -fir (FIR/IIR reconstruction)
  • -ff (factorize FIR/IIR coefficients)
  • -mfs (max FIR size): 256, 512, 1024, 2048
  • -fls (FIR loop size): 2, 4, 8, 16
  • -irt (IIR ring threshold): 2, 4, 8, 16

Other optimizations:

  • -ssel (simplify select2)

Note: Options marked as cpp only are automatically excluded when using --lang ocpp. The tool adapts the option space based on backend constraints.

Usage

fcoptimize.py <dsp_file> [OPTIONS]
# or fcoptimize <dsp_file> ... after installation

Parameters

  • dsp_file: Faust DSP file to optimize (required)

Options

  • --lang {cpp,ocpp}: Target language (default: cpp)
  • --strategy {random,adaptive}: Search strategy (default: random)
  • --max-trials N: Maximum configurations to try (default: 100)
  • --iterations N: Benchmark iterations per config (default: 1000)
  • --top-n N: Show top N best configurations (default: 10)
  • --save-results FILE: Save results to JSON file
  • --graph-output FILE: Generate optimization progress graph
  • --baseline CONFIG: Baseline configuration for comparison (e.g., "-lang cpp")
  • --timeout N: Timeout per benchmark in seconds (default: 60)
  • --sensitivity-analysis: Perform sensitivity analysis on best configuration

Search Strategies

Random Search (--strategy random):

  • Explores the option space uniformly
  • Good for discovering unexpected optimal configurations
  • Works well with limited trials

Adaptive Search (--strategy adaptive):

  • Phase 1 (30%): Random exploration
  • Phase 2 (70%): Mutations of best configurations found
  • Converges faster to local optima
  • Better for large trial counts
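The two-phase adaptive strategy can be sketched as follows (a simplified model: `evaluate` stands in for one fcbenchtool run and may return None on failure; the actual tool mutates several top configurations, not just the single best):

```python
import random

def mutate(config, space, n_changes=1, rng=random):
    """Return a copy of `config` with `n_changes` options resampled
    from `space` (a dict mapping option name to candidate values)."""
    new = dict(config)
    for opt in rng.sample(list(space), k=n_changes):
        new[opt] = rng.choice(space[opt])
    return new

def adaptive_search(evaluate, space, trials=100, rng=random):
    """Phase 1 (30% of trials): uniform random exploration.
    Phase 2 (70%): mutations of the best configuration found."""
    best_cfg, best_time = None, float("inf")
    explore_trials = int(trials * 0.3)
    for i in range(trials):
        if i < explore_trials or best_cfg is None:
            cfg = {opt: rng.choice(vals) for opt, vals in space.items()}
        else:
            cfg = mutate(best_cfg, space, rng=rng)
        t = evaluate(cfg)
        if t is not None and t < best_time:
            best_cfg, best_time = cfg, t
    return best_cfg, best_time
```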

Examples

  1. Quick optimization with 50 trials:

    fcoptimize.py foo.dsp --max-trials 50
  2. Deep search with adaptive strategy:

    fcoptimize.py foo.dsp --strategy adaptive --max-trials 200
  3. Optimize for ocpp with baseline and export:

    fcoptimize.py foo.dsp --lang ocpp --baseline "-lang ocpp" \
        --save-results results.json --graph-output progress.png
  4. Fast exploration for quick feedback:

    fcoptimize.py foo.dsp --max-trials 30 --iterations 500
  5. Optimization with sensitivity analysis:

    fcoptimize.py foo.dsp --max-trials 100 --sensitivity-analysis

Sensitivity Analysis

The --sensitivity-analysis option performs an additional phase after optimization to identify which compiler options have the most impact on performance. This helps understand which optimizations are most critical for a specific DSP program.

How it works:

  • Takes the best configuration found during optimization
  • For each option, tests alternative values while keeping other options fixed
  • Calculates the performance impact (percentage change) for each variation
  • Iterative local optimization: If a better configuration is found, the analysis automatically re-runs around the new point
  • Continues until convergence (no improvement found) or maximum iterations reached
  • Ranks options by their maximum impact on performance
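The per-option variation step can be sketched like this (simplified: real timings are noisy, and the actual tool also re-runs the analysis around any improvement it finds; the `sensitivity` helper is illustrative):

```python
def sensitivity(evaluate, best_cfg, best_time, space):
    """For each option, vary it alone while keeping the others fixed,
    record the worst-case percentage slowdown, and rank by impact."""
    impact = {}
    for opt, values in space.items():
        worst = 0.0
        for v in values:
            if v == best_cfg[opt]:
                continue  # skip the value already in the best config
            cfg = dict(best_cfg)
            cfg[opt] = v
            t = evaluate(cfg)
            if t is not None:
                worst = max(worst, (t - best_time) / best_time * 100.0)
        impact[opt] = worst
    # Most impactful options first
    return sorted(impact.items(), key=lambda kv: kv[1], reverse=True)
```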

Key features:

  • Automatic refinement: If the initial search missed the true optimum, sensitivity analysis finds and refines it
  • Convergence detection: Stops when reaching a local optimum where no single-option change improves performance (with 0.5% significance threshold)
  • Parameter importance analysis: Automatically categorizes parameters by their impact (CRITICAL/HIGH/MODERATE/LOW)
  • Complete iteration history: All iterations are saved in JSON for analysis

Output:

  • Console report showing:
    • Progress through iterations with improvements highlighted
    • Final sensitivity ranking (most impactful options first)
    • Parameter importance analysis with automatic categorization:
      • πŸ”΄ CRITICAL (>20% of total impact): Must be carefully optimized
      • 🟑 HIGH (10-20%): Important to tune correctly
      • 🟒 MODERATE (5-10%): Secondary priority
      • βšͺ LOW (<5%): Can be fixed to safe defaults
    • Practical recommendations for manual tuning
    • Total improvement achieved through local optimization
  • Human-readable text report: <dsp_name>_sensitivity_<lang>_<timestamp>.txt
    • Same format as console output
    • Easy to read with any text viewer (cat, less, etc.)
    • Sensitivity ranking table
    • Parameter importance analysis with categories
    • Recommendations and final configuration
  • JSON file with detailed data: <dsp_name>_sensitivity_<lang>_<timestamp>.json
    • Initial and final configurations
    • Complete iteration history
    • Sensitivity rankings at convergence
    • Parameter importance scores and categories
    • For machine processing and further analysis
  • Enhanced bar chart visualization (if matplotlib available): <dsp_name>_sensitivity_<lang>_<timestamp>.png
    • Color-coded bars by importance category
    • Visual importance indicators

Use cases:

  • Identify which options matter most for your specific DSP algorithm
  • Refine the optimization result to find true local optima
  • Understand trade-offs when manually tuning configurations
  • Prioritize which options to focus on when exploring variations
  • Determine which options can be ignored (low impact)
  • Verify that the optimization found a stable local optimum

Output

The script automatically generates files with timestamps and always saves results:

  1. Real-time progress:

    • Configuration being tested
    • Benchmark result or failure
    • Best configuration updates
  2. Final summary:

    • Top N best configurations
    • Execution times and speedup vs baseline
    • Complete Faust command for best config
  3. JSON results (always saved automatically):

    • Default filename: <dsp_name>_opt_<lang>_<strategy>_<timestamp>.json
    • Can be customized with --save-results
    • Contains all tested configurations, benchmark times, and configuration details
  4. Progress graph (always generated if matplotlib available):

    • Default filename: <dsp_name>_opt_<lang>_<strategy>_<timestamp>.png
    • Can be customized with --graph-output
    • Shows scatter plot of all trials, running minimum line, and baseline comparison

Example Output

=== RANDOM SEARCH OPTIMIZATION ===
DSP file: reverb.dsp
Language: ocpp
Max trials: 100
==================================================

Testing baseline configuration: -lang ocpp
  Baseline: 12.345ms

[1/100] Testing: -lang ocpp -double -mcd 4 -dlt 512 -ftz 2
  Result: 10.234ms βœ“ NEW BEST! (-17.1% vs baseline)

[2/100] Testing: -lang ocpp -single -mcd 8 -mdd 2048 -cm
  Result: 9.876ms βœ“ NEW BEST! (-20.0% vs baseline)

...

=== TOP 10 CONFIGURATIONS ===

#1: 8.234ms (-33.3% vs baseline)
    -lang ocpp -double -mcd 8 -mdd 2048 -dlt 256 -ftz 2

#2: 8.567ms (-30.6% vs baseline)
    -lang ocpp -single -mcd 4 -mca 16 -ftz 2 -fm def

...

======================================================================
BEST CONFIGURATION:
  Time: 8.234ms
  Speedup vs baseline: 33.3%
  Command: faust -lang ocpp -double -mcd 8 -mdd 2048 -dlt 256 -ftz 2 <file.dsp> -o <file.cpp>
======================================================================

Prerequisites

  • FAUST compiler must be installed and accessible
  • fcbenchtool must be installed (from this toolkit)
  • Python 3 with standard libraries
  • matplotlib (optional, for graph generation): pip install matplotlib

Tips

  • Start with fewer trials (30-50) to get quick feedback
  • Use --baseline to measure improvements over default settings
  • The ocpp backend often benefits more from delay optimizations (-mcd, -mdd, etc.)
  • Save results to JSON for later comparison or analysis
  • Use adaptive strategy for deep optimization (200+ trials)

Workflow Recipes

  • Validate a new Faust flag: fcexplorer.py to generate variants β†’ fccomparetool to ensure impulse responses stay identical β†’ fcbenchtool to see if the change speeds up or slows down.
  • Guard against regressions: run fcanalyze.py with your standard configs to keep warnings/errors at zero, then fcbenchgraph.py to watch for performance drops across the whole corpus.
  • Deep-dive a problematic DSP: capture the impulse response with fcplottool, open a debug build with fcdebugtool in gdb, and inspect hot loops in foo.s produced by fcasmtool.
