DV-Smith: SystemVerilog/UVM Verification Gym Generator

DV-Smith is a framework that automatically converts SystemVerilog/UVM testbenches into containerized verification tasks (DV gyms), so that AI agents and automated tools can learn hardware verification and improve at it.

Inspired by SWE-smith and SWE-Gym, DV-Smith brings the same containerized task paradigm to hardware verification.

🎯 What is DV-Smith?

DV-Smith is a DV gym generator that:

Analyzes UVM repositories using AI to discover tests, sequences, and covergroups
Builds isolated verification tasks from existing testbenches
Evaluates solutions based on functional coverage, code coverage, and simulation health
Supports multiple simulators: Xcelium, Questa/ModelSim, VCS, Verilator
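
One way to picture the multi-metric evaluation is as a weighted average of the three signals named above. The sketch below is illustrative only: the `EvalResult` class, the `combined_score` function, and the weights are assumptions made for this example, not DV-Smith's actual scoring code.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical container for one evaluation run; every value is in [0, 1]."""
    functional_coverage: float
    code_coverage: float
    health: float  # e.g. 1.0 when the simulation finished without errors


# Illustrative weights only; the real weighting is not stated in this README.
WEIGHTS = {"functional_coverage": 0.6, "code_coverage": 0.3, "health": 0.1}


def combined_score(result: EvalResult, weights: dict[str, float] = WEIGHTS) -> float:
    """Return the weighted average of the three metrics."""
    total = sum(weights.values())
    weighted = sum(getattr(result, name) * w for name, w in weights.items())
    return weighted / total


if __name__ == "__main__":
    # A run with good functional coverage, middling code coverage, clean simulation.
    print(round(combined_score(EvalResult(0.8, 0.5, 1.0)), 3))
```

Dividing by the sum of the weights keeps the score in [0, 1] even if the weights are later changed and no longer add up to 1.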

Key Features

✨ Claude-Powered Analysis: Uses Claude 3.5 Sonnet to understand any UVM repository structure
🎯 Automatic Task Generation: Converts existing tests into isolated tasks with HOWTO guides
📈 Multi-Metric Evaluation: Scores solutions on coverage and health metrics
🔌 Pluggable Simulator Support: Extensible adapter system for any simulator
🧪 Comprehensive Testing: Unit tests, integration tests, and real-world benchmarks
📝 Intelligent Gym Cleaning: Uses Claude Code SDK to identify and preserve infrastructure files
🔍 AI Transparency: Complete logging of all AI calls with debugging tools (dvsmith ai-logs)

🚀 Quick Start

Prerequisites

Python 3.12+
Docker (required by Terminal-Bench)
Anthropic API key

Installation

git clone https://github.com/yourusername/dv-smith.git
cd dv-smith

# Install with uv (recommended)
uv sync
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Required: Set Anthropic API key for Claude-powered analysis
echo "ANTHROPIC_API_KEY=your-key-here" > .env

Create Your First Terminal-Bench Tasks

# 1. Ingest and analyze a UVM repository
dvsmith ingest https://github.com/mbits-mirafra/apb_avip

# 2. Build a specific Terminal-Bench task
dvsmith build coverage-apb_master_coverage

# 3. Explore generated task
ls dvsmith_workspace/terminal_bench_tasks/apb_avip/coverage-apb_master_coverage/
# You'll see: prompt.md, task.yaml, Dockerfile, tests/, solution.sh

# 4. Run Terminal-Bench to test an AI agent on the task
tb run -t coverage-apb_master_coverage \
  --dataset-path dvsmith_workspace/terminal_bench_tasks/apb_avip \
  -a claude-code --livestream

# 5. View results
./parse_agent_log.py runs/<run-id>/coverage-apb_master_coverage/.../sessions/agent.log
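
Before handing a generated task to an agent, it can help to confirm that the task directory contains the entries listed in step 3. The following is a minimal sketch; the `missing_entries` helper is hypothetical and not part of the dvsmith CLI, and it checks only for presence, not for valid contents.

```python
from pathlib import Path

# Entries named in step 3 above; a trailing slash marks a directory.
EXPECTED = ["prompt.md", "task.yaml", "Dockerfile", "tests/", "solution.sh"]


def missing_entries(task_dir: Path) -> list[str]:
    """Return the expected entries that are absent from task_dir."""
    missing = []
    for entry in EXPECTED:
        target = task_dir / entry.rstrip("/")
        present = target.is_dir() if entry.endswith("/") else target.is_file()
        if not present:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    task = Path(
        "dvsmith_workspace/terminal_bench_tasks/apb_avip/coverage-apb_master_coverage"
    )
    gaps = missing_entries(task)
    print("task directory looks complete" if not gaps else f"missing: {gaps}")
```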

Running Tasks from thomas_tasks/

The thomas_tasks/ directory contains pre-built AXI4 verification tasks that can be run directly with Terminal-Bench:

# Run a specific AXI4 task with Claude Code agent
tb run \
  --dataset-path thomas_tasks \
  --task-id axi4_blocking_32b_write_read_test \
  --agent claude-code \
  --model anthropic/claude-sonnet-4-5 \
  --livestream

# Available tasks:
# - axi4_blocking_32b_write_read_test
# - axi4_blocking_incr_burst_read_test
# - axi4_blocking_incr_burst_write_read_test
# - axi4_blocking_wrap_burst_write_read_test

These tasks are ready-to-use with Docker environments, solution scripts, and grading infrastructure.
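
To run all four tasks in sequence, the command above can be generated per task. This sketch only builds and prints the command lines, using the same flags as the example; the `tb_run_command` helper is an assumption made for illustration, and actually launching the runs (for instance with `subprocess.run`) is left to the reader.

```python
import shlex

# Task ids copied from the list above.
TASKS = [
    "axi4_blocking_32b_write_read_test",
    "axi4_blocking_incr_burst_read_test",
    "axi4_blocking_incr_burst_write_read_test",
    "axi4_blocking_wrap_burst_write_read_test",
]


def tb_run_command(
    task_id: str,
    dataset_path: str = "thomas_tasks",
    agent: str = "claude-code",
    model: str = "anthropic/claude-sonnet-4-5",
    livestream: bool = True,
) -> list[str]:
    """Build the argv list for one `tb run` invocation (hypothetical helper)."""
    cmd = [
        "tb", "run",
        "--dataset-path", dataset_path,
        "--task-id", task_id,
        "--agent", agent,
        "--model", model,
    ]
    if livestream:
        cmd.append("--livestream")
    return cmd


if __name__ == "__main__":
    for task in TASKS:
        # shlex.join quotes arguments so the line can be pasted into a shell.
        print(shlex.join(tb_run_command(task)))
```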

For complete documentation on the build command, see Build Command Documentation.

Running `dvsmith build` in Docker (Recommended for Security)

⚠️ Security Warning: dvsmith build runs AI agents that execute arbitrary bash commands on your system. For untrusted repositories, run it in Docker isolation.

# One-time: Build the Docker image
docker build -f Dockerfile.dvsmith -t dvsmith:latest .

# Run dvsmith build safely in Docker
docker run -it --rm \
  --network=none \
  -v "$(pwd)/dvsmith_workspace:/workspace" \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  dvsmith:latest build coverage-apb_master_coverage

# For ingest (needs network access to clone repos)
docker run -it --rm \
  -v "$(pwd)/dvsmith_workspace:/workspace" \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  dvsmith:latest ingest https://github.com/mbits-mirafra/apb_avip

Security benefits:

✅ No network access for build command (prevents data exfiltration)
✅ Isolated filesystem (only workspace directory is accessible)
✅ Can't modify your host system
✅ Reproducible builds

🔍 AI Transparency & Debugging

DV-Smith provides full transparency into AI operations with built-in logging and debugging tools.

Debug Logging

Enable verbose debug output to troubleshoot issues or understand what's happening:

export DVSMITH_DEBUG=1
dvsmith build apb_avip --sim xcelium

This will show:

Detailed compilation commands and simulator invocations
File operations (copying, removing, etc.)
AI query details and responses
Coverage extraction steps
Infrastructure file analysis

Debug output uses the standard Python logging system and is enabled only when DVSMITH_DEBUG is set to 1, true, or yes.
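
The environment check described above can be approximated with the standard `logging` module. This is a sketch of the idea rather than DV-Smith's own code: the function names are hypothetical, and treating the value case-insensitively (so that `TRUE` also counts) is an assumption.

```python
import logging
import os

# Values that switch debug output on, as listed above.
TRUTHY = {"1", "true", "yes"}


def debug_enabled(environ=None) -> bool:
    """Return True when DVSMITH_DEBUG is set to 1, true, or yes."""
    env = os.environ if environ is None else environ
    # Lowercasing and stripping are assumptions, not documented behaviour.
    return env.get("DVSMITH_DEBUG", "").strip().lower() in TRUTHY


def configure_logging(environ=None) -> int:
    """Configure root logging at DEBUG or INFO and return the chosen level."""
    level = logging.DEBUG if debug_enabled(environ) else logging.INFO
    logging.basicConfig(level=level, format="%(levelname)s %(name)s: %(message)s")
    return level


if __name__ == "__main__":
    configure_logging()
    logging.getLogger("example").debug("shown only when DVSMITH_DEBUG is enabled")
```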

View AI Call Logs

All AI interactions are automatically logged to ~/.dvsmith/ai_calls.jsonl:

# View recent AI calls (last 10 by default)
dvsmith ai-logs

# Show all entries
dvsmith ai-logs --all

# Show detailed view of a specific call
dvsmith ai-logs -d 5

📚 Documentation

Build Command: Complete guide to the build command and AI agent integration
Getting Started: Installation, first gym, basic workflows
Writing Agents: Create agents that solve verification tasks
Understanding Evaluation: How solutions are scored
Claude Code SDK: Using Claude Agent SDK for AI-powered analysis

📊 Benchmarks

DV-Smith has been tested on public UVM AVIPs:

| Benchmark | Tests Found | Tasks Generated | Covergroups | Simulators | Status |
| --- | --- | --- | --- | --- | --- |
| APB AVIP | 10 | 9 | 2 | questa, vcs, xcelium | ✅ |
| AXI4 AVIP | 72 | 70 | 2 | xcelium, vcs, questa | ✅ |
| I3C AVIP | 8 | 6 | 2 | questa, vcs, xcelium | ✅ |
| SPI AVIP | TBD | TBD | TBD | questa, vcs, xcelium | ⚠️ |

🧪 Testing

For debugging, set DVSMITH_DEBUG=1

# Run all tests
pytest tests/ -v

# Run specific test suites
pytest tests/test_models.py -v                  # Unit tests
pytest tests/test_coverage_parsers.py -v        # Parser tests
pytest tests/test_integration.py -v             # Integration tests

# Run with coverage
pytest tests/ --cov=dvsmith --cov-report=html

Workspace Structure

dvsmith_workspace/
├── clones/                # Cloned repositories
│   └── <bench_name>/
├── profiles/              # Repository profiles
│   └── <bench_name>.yaml
└── gyms/                  # Generated DV gyms
    └── <bench_name>/
        ├── tasks/         # Task specifications (*.md)
        ├── HOWTO.md       # Guide for adding new tests
        ├── gym_metadata.yaml
        ├── backups/       # Original test files (for reference)
        ├── work/          # Evaluation artifacts
        │   └── eval/
        │       └── <task_id>/
        │           ├── *.log
        │           └── coverage files
        ├── src/           # Source code (tests removed)
        ├── sim/           # Simulation makefiles
        └── ...            # Other repo files
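
Given the layout above, task specifications and evaluation logs can be located with a few path joins. This is a small sketch under the assumption that the tree is accurate for your workspace; both helper functions are hypothetical and exist only for this example.

```python
from pathlib import Path


def list_task_specs(workspace: Path, bench_name: str) -> list[Path]:
    """Return the task specification files (*.md) of one gym, sorted by name."""
    tasks_dir = workspace / "gyms" / bench_name / "tasks"
    return sorted(tasks_dir.glob("*.md"))


def list_eval_logs(workspace: Path, bench_name: str, task_id: str) -> list[Path]:
    """Return the *.log files written while evaluating one task."""
    eval_dir = workspace / "gyms" / bench_name / "work" / "eval" / task_id
    return sorted(eval_dir.glob("*.log"))


if __name__ == "__main__":
    workspace = Path("dvsmith_workspace")
    for spec in list_task_specs(workspace, "apb_avip"):
        # The file stem is used as the task id here, which is an assumption.
        logs = list_eval_logs(workspace, "apb_avip", spec.stem)
        print(f"{spec.name}: {len(logs)} evaluation log(s)")
```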

Task Format

Each task includes a "Getting Started" section that directs agents to read the HOWTO.md file:

## Getting Started
**IMPORTANT:** Before implementing your solution, read the `HOWTO.md` file in the gym root directory.
It contains critical information about:
- How to add tests to the package file (required for compilation)
- UVM test structure and base classes
- Common errors and how to fix them

The HOWTO.md guide is automatically generated for each gym and includes:

Step-by-step instructions for adding new UVM tests
Package file editing requirements (critical for test registration)
Common pitfalls and troubleshooting
