Faust Compiler Benchmark Tools is a suite of utilities designed to analyze and measure the performance of the C++ code generated by the Faust compiler. These tools are particularly useful when developing new compilation strategies, as they ensure correctness and evaluate performance impacts.
- Faust compiler available on `PATH`
- A C++ toolchain (`clang++` is used by default; override with `CXX`)
- Python 3 for the helper scripts; `matplotlib` is optional for plotting
- Write access to `/usr/local/bin` and `/usr/local/share` when installing system-wide
Assuming you have compiled foo.dsp into foo.cpp, the following tools are available:
- `fcbenchtool foo.cpp`: Generates a binary (`foo`) to measure the performance of the `compute` method in `foo.cpp`.
- `fcasmtool foo.cpp`: Produces an assembly file (`foo.s`) for inspecting the generated assembly code.
- `fcanalyzetool foo.cpp`: Performs static analysis on `foo.cpp` to identify potential issues.
- `fcplottool foo.cpp`: Creates a binary (`foo`) that, when executed, prints or plots the impulse response of `foo.cpp`.
- `fccomparetool foo1.cpp foo2.cpp`: Compares the impulse responses of `foo1.cpp` and `foo2.cpp` using `fcplottool`.
- `fcdebugtool foo.cpp`: Similar to `fcplottool`, but produces a debug-enabled binary for use with `gdb`.
- `fcexplorer.py`: A Python script to explore various Faust compilation options and their impact on the generated C++ code.
- `fcbenchgraph.py`: A Python script to benchmark multiple DSP files with different FAUST configurations and generate comparative performance graphs.
- `fcanalyze.py`: A Python script to analyze multiple DSP files with different FAUST configurations using static analysis to detect warnings and errors.
- `fcoptimize.py`: An automatic optimization tool that searches for the best Faust scalar compilation options for a given DSP file.
The tools can be installed with the following command:
```shell
sudo ./install.sh
```

- Binaries: Installed in `/usr/local/bin` (including convenience symlinks `fcbenchgraph`, `fcanalyze`, `fcoptimize`).
- Dependencies: Stored in `/usr/local/share/fctool` (headers/footers used by the wrappers).
- Set `CXX` if you want a compiler other than `clang++` for all wrappers.
```shell
# Generate C++ from Faust
faust foo.dsp -o foo.cpp

# Benchmark the generated code
fcbenchtool foo.cpp
./foo  # prints the best time over the benchmark loop

# Inspect correctness by impulse response
fcplottool foo.cpp
./foo > foo.ir  # capture impulse response for plotting/diffing
```

The primary tool, `fcbenchtool`, is used to benchmark the performance of C++ implementations generated by Faust. This involves generating various implementations (e.g., `foo1.cpp`, `foo2.cpp`, etc.) by varying Faust compiler options and comparing their performance.
- **Code Wrapping:** `fcbenchtool` wraps the original C++ source file between a header and a footer to enable performance measurement.
- **Optimized Compilation:** The code is compiled with the following options:
  - `-O3` (high optimization)
  - `-ffast-math` (faster floating-point computations)
  - `-march=native` (architecture-specific optimizations)
- **Performance Measurement:** The resulting binary measures the time (in milliseconds) to process 1 second of sound (44100 samples). The benchmark iterates until the minimal result remains stable over at least 1000 iterations. This iteration count can be customized.
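The stopping rule used by the measurement loop (keep re-timing until the best observed time has not improved for N consecutive iterations) can be sketched in Python; this is an illustration only, and `dsp_compute` is a stand-in for the generated `compute` method, not fcbenchtool's actual code:

```python
import time

def dsp_compute():
    # Stand-in for processing one second of sound (44100 samples)
    s = 0.0
    for i in range(10000):
        s += i * 0.5
    return s

def benchmark(stable_iterations=1000):
    """Time dsp_compute() until the best time stops improving for
    stable_iterations consecutive runs; return the best time in ms."""
    best_ms = float("inf")
    since_improvement = 0
    while since_improvement < stable_iterations:
        t0 = time.perf_counter()
        dsp_compute()
        elapsed_ms = (time.perf_counter() - t0) * 1000.0
        if elapsed_ms < best_ms:
            best_ms = elapsed_ms
            since_improvement = 0  # reset: we found a new minimum
        else:
            since_improvement += 1
    return best_ms

print(f"best: {benchmark(stable_iterations=50):.3f} ms")
```

Taking the minimum rather than the mean filters out scheduler noise, which is why the stability window matters more than the raw iteration count.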
```shell
# Step 1: Generate C++ code from a Faust file
faust foo.dsp -o foo.cpp

# Step 2: Compile with fcbenchtool
fcbenchtool foo.cpp

# Step 3: Execute the binary and analyze performance
./foo
./foo 250  # Custom iteration count
```

You can specify a custom compiler using the `CXX` environment variable and define a custom file extension for the generated binary. For example:
```shell
faust foo.dsp -o foo.cpp
CXX=clang++-19 fcbenchtool foo.cpp .cl19
sudo ./foo.cl19
```

- Compare the impact of Faust flags: build `foo_vec.cpp` with `-vec`, `foo_scalar.cpp` without, then benchmark both binaries.
- Validate performance regressions: run `fcbenchtool` on the same source before/after a code change and archive the printed timings.
- CPU feature investigations: set `CXX="clang++ -march=skylake"` to force a specific target and see the effect on throughput.
Inspect the assembly code generated by the C++ compiler. Useful to check vectorization, instruction choices, or how a Faust option changes the emitted loops.
```shell
fcasmtool foo.cpp

# Peek at the first lines
sed -n '1,60p' foo.s

# Look for vector ops
rg "ymm|xmm" foo.s
```

Run static analysis on `foo.cpp` using the compiler analyzer to flag potential issues (undefined behavior, suspicious constructs).
```shell
fcanalyzetool foo.cpp

# Only show warnings/errors
fcanalyzetool foo.cpp | rg -i "warning|error"
```

Practical use: run it on every generated variant while developing new Faust lowering passes to ensure no diagnostics are introduced.
- Plot/print the impulse response of `foo.cpp`:

  ```shell
  fcplottool foo.cpp
  ./foo > foo.ir  # redirect to a file for plotting
  ```

- Compare impulse responses between two files:

  ```shell
  fccomparetool foo1.cpp foo2.cpp  # Produces a side-by-side diff of impulse response samples
  ```

Typical workflow: generate `foo_vec.cpp` and `foo_scalar.cpp`, build both, run `fccomparetool` to confirm the new optimization keeps the DSP identical.
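If you want to diff two captured responses yourself (e.g. in a CI check), the comparison amounts to a sample-by-sample deviation measure. A minimal sketch, assuming the capture format is one float per line (this is an illustration, not fccomparetool's actual implementation):

```python
def load_ir(path):
    """Read an impulse response captured as one float per line."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

def max_deviation(ir_a, ir_b):
    """Largest absolute per-sample difference between two responses."""
    if len(ir_a) != len(ir_b):
        raise ValueError("impulse responses have different lengths")
    return max(abs(a - b) for a, b in zip(ir_a, ir_b))

# Synthetic data standing in for foo1.ir / foo2.ir captures
ir1 = [1.0, 0.5, 0.25, 0.125]
ir2 = [1.0, 0.5, 0.250001, 0.125]
print(max_deviation(ir1, ir2))  # tiny deviation, likely rounding noise
```

A small nonzero deviation usually means floating-point reassociation (e.g. from `-ffast-math`); a large one means the optimization changed the DSP semantics.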
Build a debug-enabled binary for analysis with gdb or lldb. Helpful when stepping through generated code or inspecting runtime state.
```shell
fcdebugtool foo.cpp
gdb ./foo.db
```

Explore various Faust compilation options to observe their impact on the generated C++ code.
```shell
fcexplorer.py -mcd "0 2 4 8" -vec "" foo.dsp
```

This will generate the 8 corresponding C++ files, named `foo_mcd0.cpp`, `foo_mcd0_vec.cpp`, `foo_mcd2.cpp`, etc. In the example, `-vec ""` indicates an option that can be present or absent, without additional arguments.
Common recipes:
- Delay tuning sweep: `fcexplorer.py -mcd "0 4 8 12" -mdd "512 1024 2048" myalgo.dsp`
- Vector vs scalar: `fcexplorer.py -vec "" myalgo.dsp`, then benchmark the produced files with `fcbenchtool`.
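The expansion fcexplorer.py performs is a Cartesian product over the option value lists. A sketch of that logic (the naming scheme here mirrors the `foo_mcd0_vec.cpp` examples above, but the helper itself is illustrative):

```python
from itertools import product

def expand_variants(stem, options):
    """options maps a Faust flag to its candidate values; [""] means the
    flag is either present or absent (like -vec "")."""
    axes = []
    for flag, values in options.items():
        if values == [""]:
            axes.append([None, flag])               # absent, or bare flag
        else:
            axes.append([(flag, v) for v in values])  # flag with a value
    names = []
    for combo in product(*axes):
        suffix = ""
        for part in combo:
            if part is None:
                continue
            if isinstance(part, tuple):
                suffix += f"_{part[0].lstrip('-')}{part[1]}"
            else:
                suffix += f"_{part.lstrip('-')}"
        names.append(f"{stem}{suffix}.cpp")
    return names

files = expand_variants("foo", {"-mcd": ["0", "2", "4", "8"], "-vec": [""]})
print(len(files))  # 8, e.g. foo_mcd0.cpp, foo_mcd0_vec.cpp, foo_mcd2.cpp, ...
```

Note how the file count multiplies quickly: each value list multiplies the total, and each present/absent flag doubles it.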
fcbenchgraph.py is a Python script designed to benchmark multiple DSP files with different FAUST parameter configurations and generate comparative performance graphs.
- Benchmark multiple `.dsp` files with various FAUST compiler configurations
- Generate performance comparison matrices and statistics
- Create visual graphs showing performance differences across configurations
- Support for custom iteration counts and binary extensions
- Automatic graph generation with matplotlib (optional)
```shell
fcbenchgraph.py <file_pattern> <faust_config1> [faust_config2] ... [OPTIONS]
# after install you can also call: fcbenchgraph <pattern> ...
```

- file_pattern: Glob pattern for `.dsp` files to benchmark (e.g., `"*.dsp"`, `"tests/**/*.dsp"`)
- faust_config: One or more FAUST parameter sets to test (e.g., `"-lang cpp"`, `"-lang cpp -vec"`)
- `--iterations N`: Number of benchmark iterations (default: 1000)
- `--extension EXT`: Extension for generated binaries (default: `.bench`)
- `--no-graph`: Disable graph generation
- `--graph-output FILE`: Custom graph filename (default: `benchmark_YYYYMMDD_HHMMSS.png`)
- Basic benchmarking with a single configuration:

  ```shell
  fcbenchgraph.py "*.dsp" "-lang cpp"
  ```

  Quickly establish a baseline across a directory before trying new options.

- Compare multiple configurations:

  ```shell
  fcbenchgraph.py "tests/impulse-tests/dsp/*.dsp" "-lang cpp" "-lang cpp -vec" "-lang cpp -double"
  # Useful to quantify the benefit of vectorization on a whole suite
  ```

- Custom iterations and graph output:

  ```shell
  fcbenchgraph.py "*.dsp" "-lang cpp" "-lang rust" --iterations=500 --graph-output=my_benchmark.png
  ```

  Handy for CI runs where you want fewer iterations and a deterministic graph name.
The script generates:
- Console output:
  - Progress information during benchmarking
  - Results matrix showing execution times in milliseconds
  - Configuration details and statistics
  - Global success rates
- Graph file (if matplotlib is available):
  - Visual comparison of performance across configurations
  - Line plots showing execution times for each DSP file
  - Command information and generation timestamp
  - Automatic filename with timestamp if not specified
- FAUST compiler must be installed and accessible
- `fcbenchtool` must be installed (from this toolkit)
- Python 3 with standard libraries
- matplotlib (optional, for graph generation): `pip install matplotlib`
For each DSP file and configuration combination:
- Compiles the DSP file using FAUST with specified parameters
- Uses `fcbenchtool` to create a benchmarking binary
- Executes the binary to measure performance
- Collects timing results and generates statistics
- Creates comparative visualizations (if matplotlib is available)
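The per-file, per-configuration loop can be sketched as follows; `run_benchmark` is a stub standing in for the real faust + fcbenchtool invocations, so the sketch stays self-contained:

```python
from itertools import product

def run_benchmark(dsp_file, config):
    """Stub for: faust <config> file.dsp -o file.cpp; fcbenchtool file.cpp;
    ./file. Returns a fake time in ms so the loop structure is runnable."""
    return 10.0 + len(config) * 0.1 + len(dsp_file) * 0.01

dsp_files = ["osc.dsp", "reverb.dsp"]
configs = ["-lang cpp", "-lang cpp -vec"]

# Results matrix: results[file][config] -> time in ms
results = {f: {} for f in dsp_files}
for dsp, cfg in product(dsp_files, configs):
    results[dsp][cfg] = run_benchmark(dsp, cfg)

# Per-configuration statistics across all files
for cfg in configs:
    times = [results[f][cfg] for f in dsp_files]
    print(f"{cfg}: mean {sum(times) / len(times):.3f} ms")
```

The real script additionally tracks failures per cell (a configuration may not compile every file), which is where the reported success rates come from.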
fcanalyze.py is a Python script designed to analyze multiple DSP files with different FAUST parameter configurations using static analysis. Instead of measuring performance like fcbenchgraph.py, it focuses on detecting warnings, errors, and potential issues in the generated C++ code.
- Analyze multiple `.dsp` files with various FAUST compiler configurations
- Use `fcanalyzetool` to perform static analysis on generated C++ code
- Generate analysis comparison matrices and statistics
- Detect warnings, errors, and potential code issues
- Identify the most problematic files across configurations
```shell
fcanalyze.py <file_pattern> <faust_config1> [faust_config2] ... [OPTIONS]
# or simply fcanalyze <pattern> ... after installation
```

- file_pattern: Glob pattern for `.dsp` files to analyze (e.g., `"*.dsp"`, `"tests/**/*.dsp"`)
- faust_config: One or more FAUST parameter sets to test (e.g., `"-lang cpp"`, `"-lang cpp -vec"`)
- Basic analysis with a single configuration:

  ```shell
  fcanalyze.py "*.dsp" "-lang cpp"
  ```

- Compare multiple configurations:

  ```shell
  fcanalyze.py "tests/impulse-tests/dsp/*.dsp" "-lang cpp" "-lang cpp -vec" "-lang cpp -double"
  ```

- Analyze specific file patterns:

  ```shell
  fcanalyze.py "examples/*.dsp" "-lang cpp" "-lang rust"
  ```
Typical uses: keep a directory warning-free during refactors, or compare how different Faust backends affect static-analysis noise.
The script generates console output including:
- Progress information during analysis for each file and configuration
- Results matrix showing analysis status:
  - `✓ CLEAN`: No issues found
  - `XW/YE`: X warnings and Y errors found
  - `FAUST_ERR`: FAUST compilation failed
  - `ANALYSIS_ERR`: Static analysis failed
- Configuration details and statistics per configuration
- Global statistics including success rates
- Most problematic files section listing files with the most issues
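The status cells boil down to counting compiler diagnostics in the analyzer output. A sketch of that classification, assuming clang-style `warning:` / `error:` lines (the real script's parsing may differ):

```python
import re

def classify(analyzer_output):
    """Summarize analyzer output as CLEAN or XW/YE."""
    warnings = len(re.findall(r"\bwarning:", analyzer_output))
    errors = len(re.findall(r"\berror:", analyzer_output))
    if warnings == 0 and errors == 0:
        return "CLEAN"
    return f"{warnings}W/{errors}E"

sample = """foo.cpp:12:5: warning: unused variable 'x'
foo.cpp:40:9: warning: implicit conversion
foo.cpp:88:1: error: expected ';'"""

print(classify(sample))           # 2W/1E
print(classify("no diagnostics"))  # CLEAN
```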
- FAUST compiler must be installed and accessible
- `fcanalyzetool` must be installed (from this toolkit)
- Python 3 with standard libraries
For each DSP file and configuration combination:
- Compiles the DSP file using FAUST with specified parameters
- Uses `fcanalyzetool` to perform static analysis on the generated C++ code
- Parses analysis output to extract warnings, errors, and other issues
- Collects results and generates comprehensive statistics
- Provides a summary of code quality across different configurations
fcoptimize.py is an automatic optimization tool that searches for the best Faust compilation options for a given DSP file. It intelligently explores the option space and identifies the configuration that produces the fastest executable.
- Automatic exploration of Faust scalar compilation options
- Focus on `cpp` and `ocpp` backends without vectorization/parallelism
- Two search strategies: random and adaptive
- Baseline comparison to measure improvements
- JSON export of results for further analysis
- Optional graph generation showing optimization progress
- Configurable number of trials and benchmark iterations
The tool systematically explores these scalar-mode options focused on performance (using `-single` precision only, as `-double` and `-quad` would slow down computations):
Delay optimizations (especially important for ocpp):

- `-mcd` (max copy delay): 0, 2, 4, 6, 8, 9, 12, 16, 20 (10 not supported with ocpp)
- `-udd` (use dense delay): 0, 1
- `-mcl` (max copy loop): 2, 4, 8, 16 (ocpp only, fixed at 4 for cpp)
- `-mdd` (max dense delay): 256, 512, 1024, 2048, 4096 (ocpp only, fixed at 1024 for cpp)
- `-mca` (max cache delay): 4, 8, 16, 32 (ocpp only, fixed at 8 for cpp)
- `-mdy` (min density): 70, 80, 90, 95 (ocpp only, fixed at 90 for cpp)

Code generation:

- `-ss` (scheduling strategy): 0=depth-first, 1=breadth-first, 2=special, 3=reverse breadth-first
- `-fsr` (fixed sample rate): None (variable) or 44100 Hz
- `-cm` (compute mix), cpp only
- `-fm def` (fast math), cpp only
- `-mapp` (math approximations)
- `-exp10` (exp10 optimization)
- `-it` (inline tables), cpp only

FIR/IIR optimizations:

- `-fir` (FIR/IIR reconstruction)
- `-ff` (factorize FIR/IIR coefficients)
- `-mfs` (max FIR size): 256, 512, 1024, 2048
- `-fls` (FIR loop size): 2, 4, 8, 16
- `-irt` (IIR ring threshold): 2, 4, 8, 16

Other optimizations:

- `-ssel` (simplify select2)
Note: Options marked as "cpp only" are automatically excluded when using `--lang ocpp`. The tool adapts the option space based on backend constraints.
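That backend-dependent pruning can be pictured as a filter over an option table. The sketch below is illustrative (only a few options are included, and the data structure is an assumption, not the tool's internals):

```python
# Each option lists which backends may vary it (sample of the full table)
OPTIONS = {
    "-mcd": {"values": [0, 2, 4, 8, 16], "backends": {"cpp", "ocpp"}},
    "-cm":  {"values": [""],             "backends": {"cpp"}},   # cpp only
    "-it":  {"values": [""],             "backends": {"cpp"}},   # cpp only
    "-mcl": {"values": [2, 4, 8, 16],    "backends": {"ocpp"}},
}

def option_space(lang):
    """Keep only the options that are tunable for the chosen backend."""
    return {flag: spec["values"] for flag, spec in OPTIONS.items()
            if lang in spec["backends"]}

print(sorted(option_space("ocpp")))  # ['-mcd', '-mcl']
```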
```shell
fcoptimize.py <dsp_file> [OPTIONS]
# or fcoptimize <dsp_file> ... after installation
```

- dsp_file: Faust DSP file to optimize (required)
- `--lang {cpp,ocpp}`: Target language (default: cpp)
- `--strategy {random,adaptive}`: Search strategy (default: random)
- `--max-trials N`: Maximum configurations to try (default: 100)
- `--iterations N`: Benchmark iterations per config (default: 1000)
- `--top-n N`: Show top N best configurations (default: 10)
- `--save-results FILE`: Save results to JSON file
- `--graph-output FILE`: Generate optimization progress graph
- `--baseline CONFIG`: Baseline configuration for comparison (e.g., `"-lang cpp"`)
- `--timeout N`: Timeout per benchmark in seconds (default: 60)
- `--sensitivity-analysis`: Perform sensitivity analysis on best configuration
Random Search (`--strategy random`):
- Explores the option space uniformly
- Good for discovering unexpected optimal configurations
- Works well with limited trials
Adaptive Search (`--strategy adaptive`):
- Phase 1 (30%): Random exploration
- Phase 2 (70%): Mutations of best configurations found
- Converges faster to local optima
- Better for large trial counts
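The two-phase adaptive strategy can be sketched on a toy search space; `measure` is a synthetic stand-in for running fcbenchtool on a configuration, and the option values are only a small sample:

```python
import random

SPACE = {"-mcd": [0, 2, 4, 8, 16], "-ss": [0, 1, 2, 3], "-mfs": [256, 512, 1024]}

def measure(cfg):
    # Synthetic cost in ms, standing in for a real benchmark run
    return 10.0 - 0.1 * cfg["-mcd"] + 0.3 * cfg["-ss"] + cfg["-mfs"] / 1024.0

def random_config():
    return {k: random.choice(v) for k, v in SPACE.items()}

def mutate(cfg):
    """Re-draw one option of an existing configuration."""
    new = dict(cfg)
    key = random.choice(list(SPACE))
    new[key] = random.choice(SPACE[key])
    return new

def adaptive_search(max_trials=50, explore_ratio=0.3):
    best_cfg, best_time = None, float("inf")
    for trial in range(max_trials):
        if trial < max_trials * explore_ratio or best_cfg is None:
            cfg = random_config()      # phase 1: random exploration
        else:
            cfg = mutate(best_cfg)     # phase 2: mutate the best found
        t = measure(cfg)
        if t < best_time:
            best_cfg, best_time = cfg, t
    return best_cfg, best_time

cfg, t = adaptive_search()
print(cfg, f"{t:.3f} ms")
```

The mutation phase is what makes the strategy converge faster: it spends most trials in the neighborhood of already-good configurations instead of re-sampling the whole space.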
- Quick optimization with 50 trials:

  ```shell
  fcoptimize.py foo.dsp --max-trials 50
  ```

- Deep search with adaptive strategy:

  ```shell
  fcoptimize.py foo.dsp --strategy adaptive --max-trials 200
  ```

- Optimize for ocpp with baseline and export:

  ```shell
  fcoptimize.py foo.dsp --lang ocpp --baseline "-lang ocpp" \
      --save-results results.json --graph-output progress.png
  ```

- Fast exploration for quick feedback:

  ```shell
  fcoptimize.py foo.dsp --max-trials 30 --iterations 500
  ```

- Optimization with sensitivity analysis:

  ```shell
  fcoptimize.py foo.dsp --max-trials 100 --sensitivity-analysis
  ```
The `--sensitivity-analysis` option performs an additional phase after optimization to identify which compiler options have the most impact on performance. This helps understand which optimizations are most critical for a specific DSP program.
How it works:
- Takes the best configuration found during optimization
- For each option, tests alternative values while keeping other options fixed
- Calculates the performance impact (percentage change) for each variation
- Iterative local optimization: If a better configuration is found, the analysis automatically re-runs around the new point
- Continues until convergence (no improvement found) or maximum iterations reached
- Ranks options by their maximum impact on performance
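The steps above amount to one-at-a-time sensitivity probing with re-centering on any improvement found. A sketch on a tiny synthetic space (`measure` stands in for a benchmark run; the real tool also applies a significance threshold):

```python
SPACE = {"-mcd": [0, 4, 8, 16], "-ss": [0, 1, 2, 3]}

def measure(cfg):
    # Synthetic benchmark time in ms; the real tool runs fcbenchtool here
    return 10.0 - 0.2 * cfg["-mcd"] + 0.5 * cfg["-ss"]

def sensitivity_pass(cfg):
    """Vary each option in turn; report impacts and any improvement found."""
    base = measure(cfg)
    impacts, best_cfg, best_time = {}, dict(cfg), base
    for opt, values in SPACE.items():
        worst = 0.0
        for v in values:
            if v == cfg[opt]:
                continue
            trial = dict(cfg, **{opt: v})
            t = measure(trial)
            worst = max(worst, abs(t - base) / base * 100)  # % impact
            if t < best_time:
                best_cfg, best_time = trial, t
        impacts[opt] = worst
    return impacts, best_cfg, best_time

# Iterate until no single-option change improves the configuration
cfg = {"-mcd": 0, "-ss": 3}
t = measure(cfg)
while True:
    impacts, new_cfg, new_t = sensitivity_pass(cfg)
    if new_t >= t:
        break  # converged: local optimum reached
    cfg, t = new_cfg, new_t

print(cfg, sorted(impacts, key=impacts.get, reverse=True))
```

Each pass both measures per-option impact and performs one step of local optimization, which is why a missed optimum from the search phase gets refined here.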
Key features:
- Automatic refinement: If the initial search missed the true optimum, sensitivity analysis finds and refines it
- Convergence detection: Stops when reaching a local optimum where no single-option change improves performance (with 0.5% significance threshold)
- Parameter importance analysis: Automatically categorizes parameters by their impact (CRITICAL/HIGH/MODERATE/LOW)
- Complete iteration history: All iterations are saved in JSON for analysis
Output:
- Console report showing:
  - Progress through iterations with improvements highlighted
  - Final sensitivity ranking (most impactful options first)
  - Parameter importance analysis with automatic categorization:
    - 🔴 CRITICAL (>20% of total impact): Must be carefully optimized
    - 🟡 HIGH (10-20%): Important to tune correctly
    - 🟢 MODERATE (5-10%): Secondary priority
    - ⚪ LOW (<5%): Can be fixed to safe defaults
  - Practical recommendations for manual tuning
  - Total improvement achieved through local optimization
- Human-readable text report: `<dsp_name>_sensitivity_<lang>_<timestamp>.txt`
  - Same format as console output
  - Easy to read with any text viewer (`cat`, `less`, etc.)
  - Sensitivity ranking table
  - Parameter importance analysis with categories
  - Recommendations and final configuration
- JSON file with detailed data: `<dsp_name>_sensitivity_<lang>_<timestamp>.json`
  - Initial and final configurations
  - Complete iteration history
  - Sensitivity rankings at convergence
  - Parameter importance scores and categories
  - For machine processing and further analysis
- Enhanced bar chart visualization (if matplotlib available): `<dsp_name>_sensitivity_<lang>_<timestamp>.png`
  - Color-coded bars by importance category
  - Visual importance indicators
Use cases:
- Identify which options matter most for your specific DSP algorithm
- Refine the optimization result to find true local optima
- Understand trade-offs when manually tuning configurations
- Prioritize which options to focus on when exploring variations
- Determine which options can be ignored (low impact)
- Verify that the optimization found a stable local optimum
The script automatically generates files with timestamps and always saves results:
- Real-time progress:
  - Configuration being tested
  - Benchmark result or failure
  - Best configuration updates
- Final summary:
  - Top N best configurations
  - Execution times and speedup vs baseline
  - Complete Faust command for best config
- JSON results (always saved automatically):
  - Default filename: `<dsp_name>_opt_<lang>_<strategy>_<timestamp>.json`
  - Can be customized with `--save-results`
  - Contains all tested configurations, benchmark times, and configuration details
- Progress graph (always generated if matplotlib available):
  - Default filename: `<dsp_name>_opt_<lang>_<strategy>_<timestamp>.png`
  - Can be customized with `--graph-output`
  - Shows scatter plot of all trials, running minimum line, and baseline comparison
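The saved JSON is convenient for post-processing. A sketch of ranking trials and computing speedups; the field names (`baseline_ms`, `trials`, `config`, `time_ms`) are illustrative assumptions, not the file's guaranteed schema, so check an actual results file first:

```python
import json

# Synthetic stand-in for a <dsp_name>_opt_<lang>_<strategy>_<timestamp>.json
raw = json.dumps({
    "baseline_ms": 12.345,
    "trials": [
        {"config": "-lang ocpp -mcd 8 -mdd 2048", "time_ms": 8.234},
        {"config": "-lang ocpp -mcd 4 -mca 16",   "time_ms": 8.567},
        {"config": "-lang ocpp -mcd 0",           "time_ms": 11.9},
    ],
})

data = json.loads(raw)
baseline = data["baseline_ms"]
top = sorted(data["trials"], key=lambda t: t["time_ms"])[:10]
for rank, trial in enumerate(top, 1):
    gain = (baseline - trial["time_ms"]) / baseline * 100  # % faster than baseline
    print(f"#{rank}: {trial['time_ms']:.3f}ms ({gain:.1f}% vs baseline) {trial['config']}")
```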
```text
=== RANDOM SEARCH OPTIMIZATION ===
DSP file: reverb.dsp
Language: ocpp
Max trials: 100
==================================================
Testing baseline configuration: -lang ocpp
Baseline: 12.345ms

[1/100] Testing: -lang ocpp -double -mcd 4 -dlt 512 -ftz 2
  Result: 10.234ms → NEW BEST! (-17.1% vs baseline)
[2/100] Testing: -lang ocpp -single -mcd 8 -mdd 2048 -cm
  Result: 9.876ms → NEW BEST! (-20.0% vs baseline)
...

=== TOP 10 CONFIGURATIONS ===
#1: 8.234ms (-33.3% vs baseline)
    -lang ocpp -double -mcd 8 -mdd 2048 -dlt 256 -ftz 2
#2: 8.567ms (-30.6% vs baseline)
    -lang ocpp -single -mcd 4 -mca 16 -ftz 2 -fm def
...

======================================================================
BEST CONFIGURATION:
Time: 8.234ms
Speedup vs baseline: 33.3%
Command: faust -lang ocpp -double -mcd 8 -mdd 2048 -dlt 256 -ftz 2 <file.dsp> -o <file.cpp>
======================================================================
```
- FAUST compiler must be installed and accessible
- `fcbenchtool` must be installed (from this toolkit)
- Python 3 with standard libraries
- matplotlib (optional, for graph generation): `pip install matplotlib`
- Start with fewer trials (30-50) to get quick feedback
- Use `--baseline` to measure improvements over default settings
- The `ocpp` backend often benefits more from delay optimizations (`-mcd`, `-mdd`, etc.)
- Save results to JSON for later comparison or analysis
- Use adaptive strategy for deep optimization (200+ trials)
- Validate a new Faust flag: `fcexplorer.py` to generate variants → `fccomparetool` to ensure impulse responses stay identical → `fcbenchtool` to see if the change speeds up or slows down.
- Guard against regressions: run `fcanalyze.py` with your standard configs to keep warnings/errors at zero, then `fcbenchgraph.py` to watch for performance drops across the whole corpus.
- Deep-dive a problematic DSP: capture the impulse response with `fcplottool`, open a debug build with `fcdebugtool` in `gdb`, and inspect hot loops in `foo.s` produced by `fcasmtool`.